AGS4 Data Validation

File validation

The validator needs to know which version of the AGS data format to validate against. This is normally specified in the "TRAN_AGS" field in the "TRAN" group (see Rule 14 in the table below). The validator will attempt to determine the correct version from the file, but this may not be possible if the version has been set to something like "4". The application selects version "4.1.1" if it cannot find a valid version, as that was the most commonly used version at the time of release (2026). The user can check the version in the file by selecting [file]-[Show AGS file transfer] information and set the version manually using the options on the main screen.
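The version lookup described above can be illustrated with a minimal sketch. This is not the validator's actual code: the function name detect_ags_version and the pattern used to decide whether a version string is usable are assumptions made for illustration, and only the Python standard library is used.

```python
import csv
import io
import re

DEFAULT_VERSION = "4.1.1"  # fallback version named in the text above


def detect_ags_version(text, default=DEFAULT_VERSION):
    """Return the TRAN_AGS value from AGS4 text, or the default if unusable.

    Hypothetical helper for illustration; the real validator may differ.
    """
    group = None
    headings = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # blank line between groups
        descriptor = row[0]
        if descriptor == "GROUP":
            group = row[1] if len(row) > 1 else None
            headings = []
        elif group == "TRAN" and descriptor == "HEADING":
            # Keep the descriptor column so indices line up with DATA rows.
            headings = row
        elif group == "TRAN" and descriptor == "DATA" and "TRAN_AGS" in headings:
            idx = headings.index("TRAN_AGS")
            value = row[idx].strip() if idx < len(row) else ""
            # Assumption: a usable version looks like "4.0.4" or "4.1";
            # a bare "4" is too vague to select a dictionary.
            if re.fullmatch(r"4\.\d+(\.\d+)?", value):
                return value
            return default
    return default
```

A file whose TRAN_AGS field holds "4.0.4" would yield "4.0.4" under this sketch, while a file with "4", or with no TRAN group at all, would fall back to "4.1.1".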

Note that the validation process only checks compliance with the AGS4 format rules (shown below). It does not validate the items of data. This may be carried out using bespoke data quality assessment specifications (see help).

The data validation process uses the AGS4 Python library.

Other applications that use the same library will provide the same validation results.

Note

Applications will be updated at different times. It is important to ensure that the version of the library is the same when comparing outputs. The library version is shown in the output.

File verification

It is important to be able to demonstrate that the AGS4 file that has been validated is the same one that has been issued/received.

The validator application includes the ability to check the 'fingerprint' of the file using a 'hash' code and to compare hash codes.

Use [File]-[Copy File Hash to clipboard] to copy the hash of the current file, and [File]-[Compare file hashes (verify)] to compare hashes.

The validation report includes the hash of the file that was validated.

For further information on the importance and use of file verification see https://en.wikipedia.org/wiki/File_verification
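
As a rough illustration of how a file hash can be produced and compared outside the application, the sketch below uses Python's standard hashlib module. The hash algorithm used by the validator is not stated here, so SHA-256 is an assumption, and the function names file_hash and hashes_match are hypothetical.

```python
import hashlib


def file_hash(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, read in chunks to limit memory use.

    Assumption: SHA-256 is used here only as an example algorithm.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hashes_match(first, second):
    """Compare two hex digests, ignoring case and surrounding whitespace."""
    return first.strip().lower() == second.strip().lower()
```

Hashes are only comparable when both were produced with the same algorithm from the raw bytes of the file, so a file that has been re-saved with different line endings will give a different hash.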

AGS4 Rules

Rule Reference	Rule Description
1	The data file shall be entirely composed of ASCII characters.
2	Each data file shall contain one or more data GROUPs. Each data GROUP shall comprise a number of GROUP HEADER rows and must have one or more DATA rows.
2a	Each row is located on a separate line, delimited by a new line consisting of a carriage return (ASCII character 13) and a line feed (ASCII character 10).
2b	The GROUP HEADER rows fully define the data presented within the DATA rows for that group (Rule 8). As a minimum, the GROUP HEADER rows comprise GROUP, HEADING, UNIT and TYPE rows presented in that order.
3	Each row in the data file must start with a DATA DESCRIPTOR that defines the contents of that row. The following Data Descriptors are used as described below: Each GROUP row shall be preceded by the "GROUP" Data Descriptor. Each HEADING row shall be preceded by the "HEADING" Data Descriptor. Each UNIT row shall be preceded by the "UNIT" Data Descriptor. Each TYPE row shall be preceded by the "TYPE" Data Descriptor. Each DATA row shall be preceded by the "DATA" Data Descriptor.
4	Within each GROUP, the DATA items are contained in data FIELDs. Each data FIELD contains a single data VARIABLE in each row. Each DATA row of a data file will contain one or more data FIELDs. The GROUP row contains only one DATA item, the GROUP name, in addition to the Data Descriptor (Rule 3). All other rows in the GROUP have a number of DATA items defined by the HEADING row.
5	DATA DESCRIPTORS, GROUP names, data field HEADINGs, data field UNITs, data field TYPEs, and data VARIABLEs shall be enclosed in double quotes ("..."). Any quotes within a data item must be defined with a second quote e.g. "he said ""hello""".
6	The DATA DESCRIPTORS, GROUP names, data field HEADINGs, data field UNITs, data field TYPEs, and data VARIABLEs in each line of the data file shall be separated by a comma (,). No carriage returns (ASCII character 13) or line feeds (ASCII character 10) are allowed in or between data VARIABLEs within a DATA row.
7	The order of data FIELDs in each line within a GROUP is defined at the start of each GROUP in the HEADING row. HEADINGs shall be in the order described in the AGS FORMAT DATA DICTIONARY.
8	Data VARIABLEs shall be presented in the units of measurement and type that are described by the appropriate data field UNIT and data field TYPE defined at the start of the GROUP within the GROUP HEADER rows.
9	Data HEADING and GROUP names shall be taken from the AGS FORMAT DATA DICTIONARY. In cases where there is no suitable entry, a user-defined GROUP and/or HEADING may be used in accordance with Rule 18. Any user-defined HEADINGs shall be included at the end of the HEADING row after the standard HEADINGs in the order defined in the DICT group (see Rule 18a).
10	HEADINGs are defined as KEY, REQUIRED or OTHER. KEY fields are necessary to uniquely define the data. REQUIRED fields are necessary to allow interpretation of the data file. OTHER fields are included depending on the scope of the data file and availability of data to be included.
10a	In every GROUP, certain HEADINGs are defined as KEY. There shall not be more than one row of data in each GROUP with the same combination of KEY field entries. KEY fields must appear in each GROUP, but may contain null data (see Rule 12).
10b	Some HEADINGs are marked as REQUIRED. REQUIRED fields must appear in the data GROUPs where they are indicated in the AGS FORMAT DATA DICTIONARY. These fields require data entry and cannot be null (i.e. left blank or empty).
10c	Links are made between data rows in GROUPs by the KEY fields. Every entry made in the KEY fields in any GROUP must have an equivalent entry in its PARENT GROUP. The PARENT GROUP must be included within the data file.
11	HEADINGs defined as a data TYPE of 'Record Link' (RL) can be used to link data rows to entries in GROUPs outside of the defined hierarchy (Rule 10c) or DICT group for user defined GROUPs. The GROUP name followed by the KEY FIELDs defining the cross-referenced data row, in the order presented in the AGS4 DATA DICTIONARY.
11a	Each GROUP/KEY FIELD shall be separated by a delimiter character. This single delimiter character shall be defined in TRAN_DLIM. The default being "|" (ASCII character 124).
11b	A heading of data TYPE 'Record Link' can refer to more than one combination of GROUP and KEY FIELDs. The combination shall be separated by a defined concatenation character. This single concatenation character shall be defined in TRAN_RCON. The default being "+" (ASCII character 43).
11c	Any heading of data TYPE 'Record Link' included in a data file shall cross-reference to the KEY FIELDs of data rows in the GROUP referred to by the heading contents.
12	Data does not have to be included against each HEADING unless REQUIRED (Rule 10b). The data FIELD can be null; a null entry is defined as "" (two quotes together).
13	Each data file shall contain the PROJ GROUP which shall contain only one data row and, as a minimum, shall contain data under the headings defined as REQUIRED (Rule 10b).
14	Each data file shall contain the TRAN GROUP which shall contain only one data row and, as a minimum, shall contain data under the headings defined as REQUIRED (Rule 10b).
15	Each data file shall contain the UNIT GROUP to list all units used within the data file. Every unit of measurement entered in the UNIT row of a GROUP or data entered in a FIELD where the field TYPE is defined as "PU" (for example ELRG_RUNI, GCHM_UNIT or MOND_UNIT FIELDs) shall be listed and defined in the UNIT GROUP.
16	Each data file shall contain the ABBR GROUP when abbreviations have been included in the data file. The abbreviations listed in the ABBR GROUP shall include definitions for all abbreviations entered in a FIELD where the data TYPE is defined as "PA" or any abbreviation needing definition used within any other heading data type.
16a	Where multiple abbreviations are required to fully codify a FIELD, the abbreviations shall be separated by a defined concatenation character. This single concatenation character shall be defined in TRAN_RCON. The default being "+" (ASCII character 43). Each abbreviation used in such combinations shall be listed separately in the ABBR GROUP. e.g. "CP+RC" must have entries for both "CP" and "RC" in ABBR GROUP, together with their full definition.
17	Each data file shall contain the TYPE GROUP to define the field TYPEs used within the data file. Every data type entered in the TYPE row of a GROUP shall be listed and defined in the TYPE GROUP.
18	Each data file shall contain the DICT GROUP where non-standard GROUP and HEADING names have been included in the data file.
18a	The order in which the user-defined HEADINGs are listed in the DICT GROUP shall define the order in which these HEADINGS are appended to an existing GROUP or appear in a user-defined GROUP. This order also defines the sequence in which such HEADINGS are used in a heading of data TYPE 'Record Link' (Rule 11).
19	A GROUP name shall not be more than 4 characters long and shall consist of uppercase letters and numbers only.
19a	A HEADING name shall not be more than 9 characters long and shall consist of uppercase letters, numbers or the underscore character only.
19b	HEADING names shall start with the GROUP name followed by an underscore character, e.g. "NGRP_HED1". Where a HEADING refers to an existing HEADING within another GROUP, the HEADING name added to the group shall bear the same name, e.g. "CMPG_TESN" in the "CMPT" GROUP.
20	Additional computer files (e.g. digital images) can be included within a data submission. Each such file shall be defined in a FILE GROUP. The additional files shall be transferred in a sub-folder named FILE. This FILE sub-folder shall contain additional sub-folders each named by the FILE_FSET reference. Each FILE_FSET named folder will contain the files listed in the FILE GROUP.
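
To give a feel for what checking these rules involves, the sketch below tests a handful of the simpler ones (Rules 1, 2a, 3, 19 and 19a) using only the Python standard library. It is not the AGS4 Python library and does not reproduce its output; the function name check_basic_rules and the error tuple layout are assumptions made for illustration.

```python
import csv
import re

DESCRIPTORS = {"GROUP", "HEADING", "UNIT", "TYPE", "DATA"}  # Rule 3
GROUP_NAME = re.compile(r"[A-Z0-9]{1,4}")      # Rule 19
HEADING_NAME = re.compile(r"[A-Z0-9_]{1,9}")   # Rule 19a


def check_basic_rules(raw):
    """Return a list of (rule, line_number, message) for a few simple rules.

    Hypothetical helper; 'raw' is the file content as bytes.
    """
    errors = []

    # Rule 1: the file shall be entirely ASCII.
    try:
        text = raw.decode("ascii")
    except UnicodeDecodeError as exc:
        errors.append(("1", None, f"non-ASCII byte at offset {exc.start}"))
        text = raw.decode("ascii", errors="replace")

    # Rule 2a: every line break shall be CR LF.
    parts = text.split("\n")
    for number, part in enumerate(parts[:-1], start=1):
        if not part.endswith("\r"):
            errors.append(("2a", number, "line does not end with CR LF"))

    for number, part in enumerate(parts, start=1):
        line = part.rstrip("\r")
        if not line.strip():
            continue  # blank lines separate groups
        # csv strips the quotes, so quoting (Rule 5) is not checked here.
        row = next(csv.reader([line]))
        descriptor = row[0]
        if descriptor not in DESCRIPTORS:
            errors.append(("3", number, f"unknown data descriptor {descriptor!r}"))
        elif descriptor == "GROUP":
            name = row[1] if len(row) > 1 else ""
            if not GROUP_NAME.fullmatch(name):
                errors.append(("19", number, f"invalid GROUP name {name!r}"))
        elif descriptor == "HEADING":
            for name in row[1:]:
                if not HEADING_NAME.fullmatch(name):
                    errors.append(("19a", number, f"invalid HEADING name {name!r}"))
    return errors
```

Rules that depend on the data dictionary, such as KEY and REQUIRED fields (Rule 10) or parent group links (Rule 10c), need the dictionary for the selected AGS version and are outside the scope of this sketch.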