Mechanism 1: Custom Annotations

Custom annotations are the simplest and most backward-compatible extension mechanism. Standard JSON Schema validators silently ignore keywords they do not recognise, passing them through as annotations — named metadata attached to a schema location.

The FAIR project uses the fair: prefix for all annotation keywords. A dataset schema using fair:concept, fair:unit, etc. validates without error on any standard Draft 2020-12 validator. FAIR-aware tools (such as the fair-data-schema CLI) can then read and act on those annotation values.

Example

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "population": {
      "type": "integer",
      "fair:conceptRef": "https://www.wikidata.org/wiki/Q1203",
      "fair:concept": { "en": "Population", "fr": "Population" },
      "fair:unitRef": "https://example.org/vocabs/units/persons",
      "fair:licenseRef": "https://creativecommons.org/licenses/by/4.0/"
    }
  }
}
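
The reading side of this mechanism can be sketched in a few lines of Python. This is a minimal illustration of what a FAIR-aware tool might do, not the fair-data-schema CLI's actual implementation: the function name `collect_fair_annotations` is hypothetical, and the sketch only walks `properties` and `items`, which is enough for the example above.

```python
import json


def collect_fair_annotations(schema, pointer="#"):
    """Map each schema location (as a JSON Pointer fragment) to its fair: keywords."""
    found = {}
    if not isinstance(schema, dict):
        return found
    own = {key: value for key, value in schema.items() if key.startswith("fair:")}
    if own:
        found[pointer] = own
    # Recurse into the subschema locations used in this document's examples.
    for name, sub in schema.get("properties", {}).items():
        # Escape the property name as required for JSON Pointer tokens.
        token = name.replace("~", "~0").replace("/", "~1")
        found.update(collect_fair_annotations(sub, f"{pointer}/properties/{token}"))
    if isinstance(schema.get("items"), dict):
        found.update(collect_fair_annotations(schema["items"], f"{pointer}/items"))
    return found


schema = json.loads("""
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "population": {
      "type": "integer",
      "fair:conceptRef": "https://www.wikidata.org/wiki/Q1203",
      "fair:concept": { "en": "Population", "fr": "Population" },
      "fair:unitRef": "https://example.org/vocabs/units/persons",
      "fair:licenseRef": "https://creativecommons.org/licenses/by/4.0/"
    }
  }
}
""")

annotations = collect_fair_annotations(schema)
for location, keywords in annotations.items():
    print(location, sorted(keywords))
```

Standard keywords such as `type` are left out of the result, so the two concerns stay separate: the validator handles the standard keywords, and the tool handles the `fair:` ones.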

Available Keywords

Keywords are organized into three functional scopes: Universal, Dataset, and Property. See SPEC.md for the full list.

1. Universal Scope (Any level)

Used for basic semantic identification and resource role definition.

| Keyword | Type | Description |
| --- | --- | --- |
| fair:resourceType | string | Role: data-product, dataset, or variable. |
| fair:conceptRef | URI / CURIE | URI or CURIE of the semantic concept (Reference) |
| fair:concept | string / object | Human-readable name of the semantic concept (Literal) |
| fair:label | string / object | Human-readable label for the property in context |
| fair:description | string / object | Rich-text description (Markdown supported) |
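
Several keywords are typed "string / object". In the example above, the object form of fair:concept is a map from language code to text. The sketch below assumes that reading, which is inferred from the example rather than from SPEC.md. The helper name `localized` and its fallback order are hypothetical.

```python
def localized(value, lang="en", fallback="en"):
    """Pick a display string from a 'string / object' annotation value."""
    if isinstance(value, str):
        # Plain string form: no translations to choose from.
        return value
    if isinstance(value, dict):
        if lang in value:
            return value[lang]
        if fallback in value:
            return value[fallback]
        # Last resort: the first translation present, or None if the map is empty.
        return next(iter(value.values()), None)
    # Any other JSON type is not a valid label in this sketch.
    return None


plain = localized("Identifier", lang="fr")
translated = localized({"en": "Population", "fr": "Population"}, lang="fr")
fell_back = localized({"en": "Year"}, lang="de")
```

Accepting both forms in one helper means callers do not need to care which form a schema author chose.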

2. Dataset Scope (Root/Resource level)

Metadata describing the entire container or resource.

| Keyword | Type | Description |
| --- | --- | --- |
| fair:entities | array | Recommended. List of organizations, individuals, or AI agents associated with the resource. Supports Entity Types v1 and Entity Roles v1. |
| fair:provider / Ref | string / URI | Deprecated. Use fair:entities with a ‘Producer’ role instead. |
| fair:license / Ref | string / URI | The usage license (Literal / SPDX) |
| fair:temporalCoverage / Ref | object / URI | Time period covered (Structured / URI) |
| fair:spatialCoverage / Ref | string / URI | Geographic area (Literal / GeoNames) |
| fair:population / Ref | string / URI | Specific group bound by time/space |

3. Property Scope (Variable level)

Keywords describing the data representation of a leaf variable.

| Keyword | Type | Description |
| --- | --- | --- |
| fair:classification / Ref | string / array | The authority or code list governing values. |
| fair:unit / Ref | string / URI | Unit of measurement (Literal / QUDT) |
| fair:quantity / Ref | string / URI | Quantity kind (Mass, Length) |
| fair:unitType / Ref | string / URI | Observation unit type (e.g. ‘Person’) |
| fair:universe / Ref | string / URI | Broad scope or group (e.g. ‘Students’) |
| fair:instanceVariableRef | URI / CURIE | Link to a dataset-specific variable implementation |
| fair:representedVariableRef | URI / CURIE | Link to a shared measurement definition |
| fair:variableCascade | object | Hierarchy of measurement references. |
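
The three scopes can be captured as a small lookup table. The sketch below rests on two assumptions: it reads an entry such as "fair:unit / Ref" as the pair fair:unit and fair:unitRef, and it covers only the keywords listed on this page (SPEC.md has the full list). The names `SCOPES`, `with_ref`, and `scope_of` are hypothetical.

```python
def with_ref(*bases):
    """Expand each base keyword into its literal and 'Ref' variants."""
    names = set()
    for base in bases:
        names.update({base, base + "Ref"})
    return names


# Keyword sets transcribed from the three tables on this page (assumed subset of SPEC.md).
SCOPES = {
    "universal": {
        "fair:resourceType", "fair:conceptRef", "fair:concept",
        "fair:label", "fair:description",
    },
    "dataset": {"fair:entities"} | with_ref(
        "fair:provider", "fair:license", "fair:temporalCoverage",
        "fair:spatialCoverage", "fair:population",
    ),
    "property": with_ref(
        "fair:classification", "fair:unit", "fair:quantity",
        "fair:unitType", "fair:universe",
    ) | {
        "fair:instanceVariableRef", "fair:representedVariableRef",
        "fair:variableCascade",
    },
}


def scope_of(keyword):
    """Return the scope name for a fair: keyword, or None if it is not listed here."""
    for scope, names in SCOPES.items():
        if keyword in names:
            return scope
    return None


for keyword in ("fair:concept", "fair:licenseRef", "fair:unitRef", "fair:unknown"):
    print(keyword, "->", scope_of(keyword))
```

A tool could use such a lookup to report where a keyword sits relative to its tabled scope, for example a Dataset-scope keyword placed on a leaf property. Whether that placement is allowed is a question for SPEC.md, not for this sketch.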

Working Example File

../../../examples/mechanism-1-annotations.json

{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://example.org/schemas/mechanism-1-annotations",
    "title": "Mechanism 1: Custom Annotations Example",
    "description": "Demonstrates how fair: annotation keywords coexist transparently with standard JSON Schema validation. A standard Draft 2020-12 validator will validate this schema (and data against it) without errors, treating all fair: keywords as annotations.",
    "type": "object",
    "required": [
        "dataset_id",
        "year",
        "country_code",
        "population"
    ],
    "properties": {
        "dataset_id": {
            "type": "string",
            "format": "uri",
            "title": "Dataset Identifier",
            "description": "Persistent URI identifying this dataset.",
            "fair:conceptRef": "https://www.dublincore.org/specifications/dublin-core/dcmi-terms/identifier",
            "fair:concept": "Identifier",
            "fair:label": "Dataset ID"
        },
        "year": {
            "type": "integer",
            "minimum": 1900,
            "maximum": 2100,
            "title": "Reference Year",
            "fair:conceptRef": "https://www.wikidata.org/wiki/Q577",
            "fair:label": "Year"
        },
        "country_code": {
            "type": "string",
            "pattern": "^[A-Z]{2}$",
            "title": "Country Code (ISO 3166-1 alpha-2)",
            "fair:conceptRef": "https://www.wikidata.org/wiki/Q6256",
            "fair:spatialCoverageRef": "https://www.wikidata.org/wiki/Q6256",
            "fair:spatialCoverage": "Country",
            "fair:classification": [
                "https://ec.europa.eu/eurostat/ramon/nomenclatures/index.cfm?TargetUrl=LST_NOM_DTL&StrNom=CL_AREA"
            ]
        },
        "population": {
            "type": "integer",
            "minimum": 0,
            "title": "Population Count",
            "fair:conceptRef": "https://www.wikidata.org/wiki/Q1203",
            "fair:quantity": "Population",
            "fair:quantityRef": "https://www.wikidata.org/wiki/Q1203",
            "fair:unit": "person",
            "fair:unitRef": "https://www.wikidata.org/wiki/Q1203",
            "fair:label": "Population",
            "fair:description": "### Population Count\nThe total number of inhabitants...",
            "fair:temporalCoverage": {
                "description": "Census 2020 Cycle",
                "start": "2020-01-01",
                "end": "2023-12-31"
            },
            "fair:entities": [
                {
                    "name": "Example Org",
                    "entityRef": "https://ror.org/02y3ad647",
                    "type": "Organization",
                    "role": "Provider"
                }
            ],
            "fair:licenseRef": "https://creativecommons.org/licenses/by/4.0/",
            "fair:license": "CC-BY-4.0"
        }
    },
    "additionalProperties": false,
    "$comment": "Try validating {\"dataset_id\": \"https://example.org/ds/001\", \"year\": 2023, \"country_code\": \"DE\", \"population\": 84482267} against this schema using any standard validator. It will pass. The fair: keywords are ignored by the validator but readable by FAIR-aware tools."
}
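
One way to see that the fair: keywords carry no validation weight is to strip them and look at what is left. The sketch below is an illustration only: `strip_fair` is a hypothetical helper, the schema is a cut-down copy of the `year` property from the file above, and only `properties` and `items` are traversed.

```python
def strip_fair(schema):
    """Return a copy of a schema with every fair: keyword removed."""
    if not isinstance(schema, dict):
        return schema
    plain = {}
    for key, value in schema.items():
        if key.startswith("fair:"):
            continue  # annotation keyword: drop it
        if key == "properties" and isinstance(value, dict):
            # Keys in this map are property names, not keywords, so keep them all.
            plain[key] = {name: strip_fair(sub) for name, sub in value.items()}
        elif key == "items":
            plain[key] = strip_fair(value)
        else:
            plain[key] = value
    return plain


annotated = {
    "type": "object",
    "required": ["year"],
    "properties": {
        "year": {
            "type": "integer",
            "minimum": 1900,
            "maximum": 2100,
            "fair:conceptRef": "https://www.wikidata.org/wiki/Q577",
            "fair:label": "Year",
        }
    },
    "additionalProperties": False,
}

plain = strip_fair(annotated)
print(plain["properties"]["year"])
```

Every constraint (`type`, `minimum`, `maximum`, `required`, `additionalProperties`) survives the stripping, which is the part of the schema a standard validator acts on. The original dictionary is left unmodified.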