Mechanism 1: Custom Annotations
Custom annotations are the simplest and most backward-compatible extension mechanism. A standard JSON Schema validator does not fail on keywords it does not recognise: under Draft 2020-12 it treats them as annotations, that is, named metadata attached to a schema location, and they have no effect on the validation result.
The FAIR project prefixes all of its annotation keywords with fair:. A dataset schema that uses fair:concept, fair:unit, and similar keywords therefore validates without error on any standard Draft 2020-12 validator. FAIR-aware tools (such as the fair-data-schema CLI) can then read those annotation values and act on them.
Example
```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "population": {
      "type": "integer",
      "fair:conceptRef": "https://www.wikidata.org/wiki/Q1203",
      "fair:concept": { "en": "Population", "fr": "Population" },
      "fair:unitRef": "https://example.org/vocabs/units/persons",
      "fair:licenseRef": "https://creativecommons.org/licenses/by/4.0/"
    }
  }
}
```
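As a rough illustration of what a FAIR-aware tool might do with such a schema, the sketch below collects every fair: keyword together with the schema location it is attached to. It is not the fair-data-schema CLI's implementation: the function name `collect_fair_annotations` is hypothetical, and the walk only descends into `properties` and `items`, which is an assumption made to keep the example short.

```python
import json

FAIR_PREFIX = "fair:"


def collect_fair_annotations(schema, pointer="#"):
    """Return {location: {keyword: value}} for each location carrying fair: keywords.

    Hypothetical helper: only "properties" and "items" are traversed, so
    subschemas under $defs, allOf, etc. are not visited in this sketch.
    """
    found = {}
    if not isinstance(schema, dict):
        return found

    # Keywords at this schema location that use the fair: prefix.
    own = {k: v for k, v in schema.items() if k.startswith(FAIR_PREFIX)}
    if own:
        found[pointer] = own

    # Keys of "properties" are property names, not keywords, so recurse into
    # the values only.
    for name, subschema in schema.get("properties", {}).items():
        found.update(
            collect_fair_annotations(subschema, f"{pointer}/properties/{name}")
        )
    if isinstance(schema.get("items"), dict):
        found.update(collect_fair_annotations(schema["items"], f"{pointer}/items"))
    return found


schema = json.loads("""
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "population": {
      "type": "integer",
      "fair:conceptRef": "https://www.wikidata.org/wiki/Q1203",
      "fair:concept": { "en": "Population", "fr": "Population" },
      "fair:unitRef": "https://example.org/vocabs/units/persons",
      "fair:licenseRef": "https://creativecommons.org/licenses/by/4.0/"
    }
  }
}
""")

annotations = collect_fair_annotations(schema)
for location, keywords in annotations.items():
    print(location, sorted(keywords))
```

Standard keywords such as `type` are left out of the result, so the output contains only the metadata a FAIR-aware consumer would care about.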
Available Keywords
Keywords are organized into three functional scopes: Universal, Dataset, and Property. See SPEC.md for the full list.
1. Universal Scope (Any level)
Used for basic semantic identification and resource role definition.
| Keyword | Type | Description |
|---|---|---|
| | string | Role: |
| | URI / CURIE | URI or CURIE of the semantic concept (Reference) |
| | string / object | Human-readable name of the semantic concept (Literal) |
| | string / object | Human-readable label for the property in context |
| | string / object | Rich-text description (Markdown supported) |
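Several rows above are typed string / object. In the first example on this page, the object form of fair:concept is a map keyed by language code (`{ "en": ..., "fr": ... }`). Assuming that reading holds for the other string / object keywords too, a consumer could normalise both forms with a small helper like the one below. The name `literal_text` and its fallback order are illustrative choices, not part of the specification.

```python
def literal_text(value, lang="en", fallback_lang="en"):
    """Return display text for a keyword that may be a string or a language map.

    Assumption: the object form maps language codes to strings, as in
    "fair:concept": {"en": "Population", "fr": "Population"}.
    """
    if isinstance(value, str):
        return value
    if isinstance(value, dict) and value:
        if lang in value:
            return value[lang]
        if fallback_lang in value:
            return value[fallback_lang]
        # Neither the requested nor the fallback language is present:
        # return the first entry rather than nothing.
        return next(iter(value.values()))
    return None


# The labels below are made-up sample values used only to exercise the helper.
sample_label = {"en": "Year", "fr": "Année"}
print(literal_text("Identifier"))
print(literal_text(sample_label, lang="fr"))
print(literal_text(sample_label, lang="de"))
```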
2. Dataset Scope (Root/Resource level)
Metadata describing the entire container or resource.
| Keyword | Type | Description |
|---|---|---|
| | array | Recommended. List of organizations, individuals, or AI agents associated with the resource. Supports Entity Types v1 and Entity Roles v1. |
| | string / URI | Deprecated. Use |
| | string / URI | The usage license (Literal / SPDX) |
| | object / URI | Time period covered (Structured / URI) |
| | string / URI | Geographic area (Literal / GeoNames) |
| | string / URI | Specific group bound by time/space |
3. Property Scope (Variable level)
Keywords describing the data representation of a leaf variable.
| Keyword | Type | Description |
|---|---|---|
| | string / array | The authority or code list governing values. |
| | string / URI | Unit of measurement (Literal / QUDT) |
| | string / URI | Quantity kind (Mass, Length) |
| | string / URI | Observation unit type (e.g. ‘Person’) |
| | string / URI | Broad scope or group (e.g. ‘Students’) |
| | URI / CURIE | Link to a dataset-specific variable implementation |
| | URI / CURIE | Link to a shared measurement definition |
| | object | Hierarchy of measurement references. |
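The examples on this page pair a literal keyword with a reference keyword that carries a Ref suffix, for instance fair:unit with fair:unitRef and fair:quantity with fair:quantityRef. The sketch below groups the two forms per concept. It assumes the Ref suffix convention applies to every fair: keyword, which this page does not state explicitly (SPEC.md is the authority), and `pair_literals_with_refs` is a hypothetical name.

```python
def pair_literals_with_refs(subschema):
    """Group fair: keywords into {base_keyword: {"literal": ..., "ref": ...}}.

    Assumption: a keyword ending in "Ref" is the reference form of the
    keyword with the same name minus that suffix.
    """
    pairs = {}
    for key, value in subschema.items():
        if not key.startswith("fair:"):
            continue  # standard keywords such as "type" are not FAIR metadata
        is_ref = key.endswith("Ref")
        base = key[:-len("Ref")] if is_ref else key
        entry = pairs.setdefault(base, {"literal": None, "ref": None})
        entry["ref" if is_ref else "literal"] = value
    return pairs


# Subset of the "population" property from the working example file.
population = {
    "type": "integer",
    "minimum": 0,
    "fair:conceptRef": "https://www.wikidata.org/wiki/Q1203",
    "fair:quantity": "Population",
    "fair:quantityRef": "https://www.wikidata.org/wiki/Q1203",
    "fair:unit": "person",
    "fair:unitRef": "https://www.wikidata.org/wiki/Q1203",
}

pairs = pair_literals_with_refs(population)
for base, forms in pairs.items():
    print(base, forms)
```

A keyword that appears in only one form, such as fair:conceptRef here, keeps `None` in the missing slot, which makes it easy to report properties that have a reference but no human-readable literal, or the reverse.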
Working Example File
../../../examples/mechanism-1-annotations.json
```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.org/schemas/mechanism-1-annotations",
  "title": "Mechanism 1: Custom Annotations Example",
  "description": "Demonstrates how fair: annotation keywords coexist transparently with standard JSON Schema validation. A standard Draft 2020-12 validator will validate this schema (and data against it) without errors, treating all fair: keywords as annotations.",
  "type": "object",
  "required": [
    "dataset_id",
    "year",
    "country_code",
    "population"
  ],
  "properties": {
    "dataset_id": {
      "type": "string",
      "format": "uri",
      "title": "Dataset Identifier",
      "description": "Persistent URI identifying this dataset.",
      "fair:conceptRef": "https://www.dublincore.org/specifications/dublin-core/dcmi-terms/identifier",
      "fair:concept": "Identifier",
      "fair:label": "Dataset ID"
    },
    "year": {
      "type": "integer",
      "minimum": 1900,
      "maximum": 2100,
      "title": "Reference Year",
      "fair:conceptRef": "https://www.wikidata.org/wiki/Q577",
      "fair:label": "Year"
    },
    "country_code": {
      "type": "string",
      "pattern": "^[A-Z]{2}$",
      "title": "Country Code (ISO 3166-1 alpha-2)",
      "fair:conceptRef": "https://www.wikidata.org/wiki/Q6256",
      "fair:spatialCoverageRef": "https://www.wikidata.org/wiki/Q6256",
      "fair:spatialCoverage": "Country",
      "fair:classification": [
        "https://ec.europa.eu/eurostat/ramon/nomenclatures/index.cfm?TargetUrl=LST_NOM_DTL&StrNom=CL_AREA"
      ]
    },
    "population": {
      "type": "integer",
      "minimum": 0,
      "title": "Population Count",
      "fair:conceptRef": "https://www.wikidata.org/wiki/Q1203",
      "fair:quantity": "Population",
      "fair:quantityRef": "https://www.wikidata.org/wiki/Q1203",
      "fair:unit": "person",
      "fair:unitRef": "https://www.wikidata.org/wiki/Q1203",
      "fair:label": "Population",
      "fair:description": "### Population Count\nThe total number of inhabitants...",
      "fair:temporalCoverage": {
        "description": "Census 2020 Cycle",
        "start": "2020-01-01",
        "end": "2023-12-31"
      },
      "fair:entities": [
        {
          "name": "Example Org",
          "entityRef": "https://ror.org/02y3ad647",
          "type": "Organization",
          "role": "Provider"
        }
      ],
      "fair:licenseRef": "https://creativecommons.org/licenses/by/4.0/",
      "fair:license": "CC-BY-4.0"
    }
  },
  "additionalProperties": false,
  "$comment": "Try validating {\"dataset_id\": \"https://example.org/ds/001\", \"year\": 2023, \"country_code\": \"DE\", \"population\": 84482267} against this schema using any standard validator. It will pass. The fair: keywords are ignored by the validator but readable by FAIR-aware tools."
}
```
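To make the claim in the `$comment` concrete without depending on any particular validator library, here is a deliberately tiny checker. It understands only the standard keywords this file relies on (`type`, `required`, `properties`, `minimum`, `maximum`, `pattern`, `additionalProperties`) and never looks at anything else, which mimics how a validator passes over fair: keywords. It is a teaching sketch, not a Draft 2020-12 implementation, and the schema below is a trimmed copy of the file above; a real project would use a full validator.

```python
import re

# Trimmed copy of the example schema (titles, descriptions and most fair:
# keywords omitted for brevity).
SCHEMA = {
    "type": "object",
    "required": ["dataset_id", "year", "country_code", "population"],
    "properties": {
        "dataset_id": {"type": "string", "format": "uri", "fair:label": "Dataset ID"},
        "year": {"type": "integer", "minimum": 1900, "maximum": 2100, "fair:label": "Year"},
        "country_code": {"type": "string", "pattern": "^[A-Z]{2}$", "fair:spatialCoverage": "Country"},
        "population": {"type": "integer", "minimum": 0, "fair:unit": "person"},
    },
    "additionalProperties": False,
}

TYPE_CHECKS = {
    "object": lambda v: isinstance(v, dict),
    "string": lambda v: isinstance(v, str),
    # bool is a subclass of int in Python, so exclude it explicitly.
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
}


def check(instance, schema, path="$"):
    """Return a list of error strings; an empty list means the instance passed."""
    expected = schema.get("type")
    if expected in TYPE_CHECKS and not TYPE_CHECKS[expected](instance):
        return [f"{path}: expected {expected}"]

    errors = []
    if TYPE_CHECKS["integer"](instance):
        if "minimum" in schema and instance < schema["minimum"]:
            errors.append(f"{path}: below minimum {schema['minimum']}")
        if "maximum" in schema and instance > schema["maximum"]:
            errors.append(f"{path}: above maximum {schema['maximum']}")
    if isinstance(instance, str) and "pattern" in schema:
        if re.search(schema["pattern"], instance) is None:
            errors.append(f"{path}: does not match {schema['pattern']}")
    if isinstance(instance, dict):
        properties = schema.get("properties", {})
        for name in schema.get("required", []):
            if name not in instance:
                errors.append(f"{path}: missing required property {name}")
        for name, value in instance.items():
            if name in properties:
                errors.extend(check(value, properties[name], f"{path}.{name}"))
            elif schema.get("additionalProperties") is False:
                errors.append(f"{path}: unexpected property {name}")
    return errors


# The instance suggested in the schema's $comment.
instance = {
    "dataset_id": "https://example.org/ds/001",
    "year": 2023,
    "country_code": "DE",
    "population": 84482267,
}

errors = check(instance, SCHEMA)
print("valid" if not errors else errors)
```

Changing `country_code` to a lowercase or three-letter value, or adding a property that is not declared, produces errors, while adding or removing fair: keywords in `SCHEMA` has no effect on the outcome.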