Data Formats and Submission for ACs, UFRs and EXPs
Knowledge Base Wiki Data Formats and Submission for ACs and UFRs
Purpose of Document
This document provides guidance to contributors who are submitting data to be included in the Knowledge Base Wiki. It is important that contributors read the content carefully before preparing and submitting their data files. This document sets out the required format for each type of data file, what metadata is required and the folder structures that should be used when submitting data. The total storage space available per Application Challenge will be subject to a limit of 500 MB.
Data Structure and Format
Directory Structure
All the data files associated with an Application Challenge should be submitted as a single appropriately named folder[1], containing three subfolders named ‘X’, ‘C’ and ‘I’. All experimental data must be contained within the directory named ‘X’, all CFD data in the directory named ‘C’ and all images in the directory named ‘I’. Any further directory structure within these three subfolders, and the data filenames, are at the discretion of the contributor providing the data.
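As a rough pre-submission check, the layout described above can be verified with a short script. This is only a sketch: the function name check_submission_folder is hypothetical, and the 500 MB limit is assumed to mean 500,000,000 bytes, which the document does not specify.

```python
from pathlib import Path

REQUIRED_SUBFOLDERS = ("X", "C", "I")  # experimental data, CFD data, images
SIZE_LIMIT_BYTES = 500 * 1000 * 1000   # assumption: 500 MB read as decimal megabytes


def check_submission_folder(root):
    """Return a list of problems found under root (an empty list means none found)."""
    root = Path(root)
    problems = []
    for name in REQUIRED_SUBFOLDERS:
        if not (root / name).is_dir():
            problems.append(f"missing subfolder '{name}'")
    # Sum the sizes of all regular files anywhere below the submission folder.
    total = sum(p.stat().st_size for p in root.rglob("*") if p.is_file())
    if total > SIZE_LIMIT_BYTES:
        problems.append(f"total size {total} bytes exceeds the 500 MB limit")
    return problems
```

Calling check_submission_folder on the top-level folder before sending it gives a quick indication of whether a subfolder has been forgotten or the size limit exceeded.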
File Type Segregation
Data stored in the Knowledge Base are separated into the following three classes:
- Graphical Data Files (independent of size)
- Small Data Files - Experimental or CFD data files that are less than about 2 MB when stored in ASCII format.
- Large Data Files - Experimental or CFD data files that would be more than about 2 MB if stored in ASCII format.
The following sections describe the file format required for each of these three classes. These formats must be adhered to when submitting data.
Graphical Data Files (independent of size)
Static images must have the extension “.gif”, “.jpg” or “.png”, depending on which format is the most suitable for a given image type.
Files containing movie clips should have extensions of either “.mpg” or “.avi” or “.mov”.
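The three file classes can be told apart mechanically, as in the sketch below. The function name classify is hypothetical, the "about 2 MB" threshold is assumed to be 2,000,000 bytes, and extensions are matched case-insensitively, which is also an assumption. The caller must supply the size the file has (or would have) in ASCII format, since that cannot be read off a binary file.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".gif", ".jpg", ".png"}
MOVIE_EXTENSIONS = {".mpg", ".avi", ".mov"}
SMALL_FILE_LIMIT = 2 * 1000 * 1000  # assumption: "about 2 MB" taken as 2,000,000 bytes


def classify(filename, ascii_size_bytes):
    """Return the class of a file from its extension and its size in ASCII format."""
    ext = Path(filename).suffix.lower()  # assumption: extension case is ignored
    if ext in IMAGE_EXTENSIONS:
        return "static image"
    if ext in MOVIE_EXTENSIONS:
        return "movie clip"
    # Anything that is not graphical is experimental or CFD data, split by size.
    if ascii_size_bytes < SMALL_FILE_LIMIT:
        return "small data file"
    return "large data file"
```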
Small Data files
‘Small’ data files have a size smaller than approximately 2 MB when stored in ASCII format. These data files must be in ASCII format and should have the extension .dat. They should also contain their own metadata at the top of the file as indicated in Appendix A below. This metadata header states the Problem Definition Parameters (PDPs) common to the entire data file and also information about the data columns presented. Following the metadata header, the file should contain only columns of tab-separated data, with the rows separated by CR/LF (Carriage Return/Line Feed) characters. The data type in each column must be either Integer, Real or String, as declared in the metadata. No metadata or any other type of comment should appear after the “%End Of Metadata” line. An example of a small data file is provided in Appendix B.
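The layout described above can be produced programmatically. The following is a minimal sketch rather than an official tool: the function name write_small_data_file and its arguments are hypothetical, and the "# name : value" form of the parameter lines is copied from the example in Appendix B rather than mandated by the specification.

```python
def write_small_data_file(path, ta, ac, parameters, columns, rows):
    """Write a .dat file: '#' parameter lines, '%' metadata, then tab-separated rows.

    parameters: dict of Problem Definition Parameters common to the whole file.
    columns: list of (name, units, type) tuples; type is Integer, Real or String.
    rows: list of sequences, one value per declared column.
    """
    lines = [f"# {name} : {value}" for name, value in parameters.items()]
    lines += [f"%TA = {ta}", f"%AC = {ac}"]
    for i, (name, units, ctype) in enumerate(columns, start=1):
        if ctype not in ("Integer", "Real", "String"):
            raise ValueError(f"column {i}: invalid type {ctype!r}")
        lines += [
            f"%Column{i}Name = {name}",
            f"%Column{i}Units = {units}",
            f"%Column{i}Type = {ctype}",
        ]
    lines.append("%End Of Metadata")
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("row length does not match the number of columns")
        lines.append("\t".join(str(value) for value in row))
    # newline="" disables newline translation, so each CR/LF is written exactly once.
    # encoding="ascii" raises an error if any non-ASCII character slips in.
    with open(path, "w", encoding="ascii", newline="") as f:
        f.write("\r\n".join(lines) + "\r\n")
```

Nothing is written after the “%End Of Metadata” line except the data rows, in line with the rule that no comments may follow the header.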
Large Data files
‘Large’ data files are larger than approximately 2 MB when stored in ASCII format. These can be structured in any format convenient to the contributor. However, instructions for reading and interpreting the data must be provided in the metadata file. The specific information required is a definition of the file format, usually a well-known keyword, and an optional version number.
The metadata associated with a large data file must be stored in a separate ASCII file residing in the same directory as the large data file and having the same name except with a “.met” extension. This metadata file shall contain all the Problem Definition Parameters (PDPs) common to the entire data file and also information about the data columns presented as prescribed in Appendix A.
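The naming rule for the companion metadata file can be sketched as follows. The helper names metadata_path and write_met_file are hypothetical, and the example writes only the thematic area, AC number and format lines described in Appendix A; a real “.met” file would also carry the PDPs and any column information. The line endings of “.met” files are not specified in this document, so the platform default is assumed.

```python
from pathlib import Path


def metadata_path(data_file):
    """Return the path of the '.met' file that sits beside a large data file."""
    # with_suffix swaps only the final extension: 'field.cgns' becomes 'field.met'.
    return Path(data_file).with_suffix(".met")


def write_met_file(data_file, ta, ac, file_format, format_version=None):
    """Write a minimal metadata file next to data_file and return its path."""
    lines = [f"%TA = {ta}", f"%AC = {ac}", f"%Format = {file_format}"]
    if format_version is not None:
        lines.append(f"%FormatVersion = {format_version}")
    lines.append("%End Of Metadata")
    met = metadata_path(data_file)
    met.write_text("\n".join(lines) + "\n", encoding="ascii")
    return met
```

Note that for a name with two extensions, such as one ending in “.tar.gz”, with_suffix replaces only the last one; whether that is the intended reading of "the same name" is left to the contributor.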
DNS CFD Data files
The data files for the DNS entries may be several hundred GB in size, so a separate large-file hosting system has been developed that stores these files on the AWS S3 service. Please see DNS Data Storage for further guidance.
Data Submission, Storage and Retrieval
Data Submission Methods
- FTP (for data sets of the order of tens of MB or less). This is done by special arrangement with the KB Wiki Editorial team. Please email admin@cado-ercoftac.org
- CD ROM, DVD or DLT (for data sets of the order of hundreds of MB)
- AWS S3 (for data sets of the order of hundreds of GB)
Inform the KB Wiki Team of FTP access via email. Contact the KB Wiki Team for delivery details if the data are being sent by post.
Data Storage
The data will be stored in the format in which they were submitted.
Data Retrieval
Data files will be downloadable in the format in which they were submitted.
Before downloading the larger data files, users are requested to inspect the associated metadata of the file to be sure it has the expected content. In this way, the load on the server can be managed. The option of retrieving only portions of large data files is currently being investigated as a future development.
Appendix A: Metadata Required
The metadata (data about data) which must be provided for each data file shall contain the following information in ASCII format:
- Values of all the defining parameters specified for the Application Challenge which do not explicitly appear as a separate column in the data file. These rows should be prefixed by a # symbol.
- The Thematic Area, using machine-readable format:
%TA = <n>
where <n> = 1, 2, 3, 4, 5 or 6.
- The Application Challenge number, using machine-readable format:
%AC = <nTA-N>
where <nTA> is the Thematic Area number (as given for %TA) and N is the AC number, e.g. 4-01 refers to the first AC in TA4, “Wind environment around an airport terminal building” by WS Atkins.
- The variables listed for each column of data using the machine-readable format, for example:
%Column1Name = VelocityX
%Column1Units = m/s
%Column1Type = Real
%Column2Name = Sensor Location on Wing
%Column2Units = mm
%Column2Type = String
etc. for each column in the file
The value assigned to %Column1Name must be a string containing no unprintable characters (apart from spaces) and, if possible, it should conform to the CGNS convention for data name identifiers. For details of CGNS conventions, see [1].
The value assigned to %Column1Units is left to the discretion of the contributor, but once again it must be a string containing no unprintable characters except spaces.
The value assigned to %Column1Type must be either “Integer”, “Real” or “String”.
Entries are case-sensitive and should follow the exact specification given.
For large data files that are not based on column data it is recommended that a format identifier and if possible a format version identifier be provided. These take the format shown in the following template:
%Format = <MY_FORMAT>
%FormatVersion = <MY_FORMAT_VERSION>
- The metadata must end with the following machine-readable command:
%End Of Metadata
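A reader for this header might look like the sketch below. It is illustrative only: the function name parse_metadata is hypothetical, and the choice to ignore “#” parameter lines and to raise an error when the end marker is missing is an assumption about how strict a reader should be. Keys are compared exactly, since entries are case-sensitive.

```python
VALID_TYPES = {"Integer", "Real", "String"}
END_MARKER = "%End Of Metadata"


def parse_metadata(text):
    """Return (metadata dict, list of column dicts) read from a metadata header."""
    meta = {}
    for line in text.splitlines():
        line = line.strip()
        if line == END_MARKER:
            break
        if line.startswith("%") and "=" in line:
            # Split on the first '=' only, so values may themselves contain '='.
            key, value = line[1:].split("=", 1)
            meta[key.strip()] = value.strip()
    else:
        # The loop finished without meeting the end marker.
        raise ValueError("missing '%End Of Metadata' line")

    columns = []
    n = 1
    while f"Column{n}Name" in meta:
        column = {
            "name": meta[f"Column{n}Name"],
            "units": meta.get(f"Column{n}Units"),
            "type": meta.get(f"Column{n}Type"),
        }
        if column["type"] not in VALID_TYPES:
            raise ValueError(f"column {n}: type must be Integer, Real or String")
        columns.append(column)
        n += 1
    return meta, columns
```

The same function can be applied to the top of a small “.dat” file or to the whole of a “.met” file, since both end with the same marker.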
Appendix B: Example of Small data ASCII files
############################################################
#
# TA1 External Aerodynamics AC41 Oscillating Delta Wing
#
# File contains ANOVA-results of the Pressure Coefficients -Cp
# of the Kulite Pressure Transducers
#
# Free stream Velocity : 40 ms-1 (± 0.06 %)
# Reynolds number : 3.1E6
# Mach number : 0.12
# Mode : Pitch
# Angle of Attack : 0.0 deg
# Oscillation Amplitude : 6 deg
############################################################
%TA = 1
%AC = 1-4110
%Column1Name = value for red. Frequ. = 0.28
%Column1Units = rad/s
%Column1Type = Real
%Column2Name = value for red. Frequ. = 0.56
%Column2Units = rad/s
%Column2Type = Real
%Column3Name = confidence interval for 95% probability
%Column3Units = Non-dimensional
%Column3Type = Real
%Column4Name = Side Of Wing
%Column4Units = Location
%Column4Type = String
%Column5Name = x/ci
%Column5Units = ratio
%Column5Type = Real
%Column6Name = eta
%Column6Units = degrees
%Column6Type = Real
%End Of Metadata
0.0931	0.0907	0.0080	Right	0.3	0.700
0.2934	0.3035	0.0010	Right	0.3	0.700
-4.4748	-9.6125	0.0600	Right	0.3	0.700
0.0798	0.0869	0.0010	Right	0.3	0.700
-17.5862	-37.9607	0.3240	Right	0.3	0.700
0.0230	0.0274	0.0000	Right	0.3	0.700
-25.8954	-50.8786	1.6030	Right	0.3	0.700
0.1060	0.1104	0.0020	Left	0.3	0.700
0.2874	0.2975	0.0010	Left	0.3	0.700
-4.5508	-9.6137	0.1160	Left	0.3	0.700
0.0779	0.0848	0.0010	Left	0.3	0.700
-18.4120	-39.0618	0.6240	Left	0.3	0.700
0.0219	0.0265	0.0010	Left	0.3	0.700
-28.0049	-54.2960	2.2470	Left	0.3	0.700
0.1006	0.1015	0.0020	Left	0.3	0.850
0.3287	0.3403	0.0010	Left	0.3	0.850
-1.4226	-3.4499	0.0370	Left	0.3	0.850
0.0818	0.0866	0.0010	Left	0.3	0.850
-0.8961	-4.9886	0.3980	Left	0.3	0.850
0.0118	0.0166	0.0000	Left	0.3	0.850
22.3086	32.6002	2.5760	Left	0.3	0.850
0.0992	0.0834	0.0050	Left	0.6	0.700
0.1712	0.1730	0.0010	Left	0.6	0.700
0.4754	1.7458	0.1840	Left	0.6	0.700
0.0437	0.0487	0.0010	Left	0.6	0.700
-34.5128	-66.4447	1.0530	Left	0.6	0.700
0.0400	0.0444	0.0010	Left	0.6	0.700
-43.8689	-84.7527	1.1160	Left	0.6	0.700
--------------------------------------------------------------------------------
[1] Folder naming: the name of the data folder will be based on the reference identifier given to the contributor(s) by the KB Wiki Team. Each contributor will be assigned a unique id based on their name.