Coded Values: From Enums to FAIR Domains

This guide explains, by example, how to handle categorical data (code lists, classifications, or response domains) in JSON Schema, moving from simple technical validation to rich FAIR metadata.

See the companion schema file: ../../../examples/enum-to-fair-coded-values.json

How-to: Implement Rich Coded Values

  1. Define the Base Type: Start with type: string or type: integer.

  2. Use oneOf: Instead of enum, use oneOf to allow for annotations on each value.

  3. Add Standard Labels: Use title for the default human-readable label.

  4. Add FAIR Labels: Use fair:label for multilingual support (e.g., {"en": "Yes", "fr": "Oui"}).

  5. Link to Concepts: Use fair:conceptRef to point to a stable URI (e.g., Wikidata, SKOS).
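The five steps above can be sketched as a small Python helper that assembles one oneOf branch per code. The function name `coded_value_domain` and the layout of its `codes` argument are illustrative assumptions, not part of the FAIR dialect; only the keywords `const`, `title`, `fair:label`, and `fair:conceptRef` come from this guide.

```python
import json


def coded_value_domain(base_type, codes):
    """Build a oneOf-based coded-value schema (hypothetical helper).

    codes: list of dicts with keys "code" and "title", and optionally
    "labels" (language -> text) and "concept" (a stable URI).
    """
    branches = []
    for entry in codes:
        # Steps 2-3: one branch per code, with a default label in title.
        branch = {"const": entry["code"], "title": entry["title"]}
        if "labels" in entry:
            # Step 4: multilingual labels.
            branch["fair:label"] = dict(entry["labels"])
        if "concept" in entry:
            # Step 5: link to a stable concept URI.
            branch["fair:conceptRef"] = entry["concept"]
        branches.append(branch)
    # Step 1: the base type sits next to the oneOf list.
    return {"type": base_type, "oneOf": branches}


yes_no = coded_value_domain(
    "integer",
    [
        {"code": 1, "title": "Yes", "labels": {"en": "Yes", "fr": "Oui"}},
        {"code": 2, "title": "No", "labels": {"en": "No", "fr": "Non"}},
    ],
)
print(json.dumps(yes_no, indent=2, ensure_ascii=False))
```

Generating the branches from a table of codes keeps long code lists out of hand-edited JSON, at the cost of one more build step.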


1. The Standard enum (Validation only)

The most basic way to restrict a value in JSON Schema is the enum keyword.

{
  "type": "string",
  "enum": ["red", "green", "blue"]
}

Pros: Extremely simple; supported by every JSON Schema validator.

Cons: There is no way to associate a human-readable label or description with each code, so the codes must be self-explanatory.
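A minimal Python sketch of what this limitation means in practice: the schema can confirm that a value is allowed, but any labels have to live in a separate structure that the schema knows nothing about. The `COLOR_LABELS` table and the `is_allowed` helper are hypothetical names used only for illustration.

```python
schema = {"type": "string", "enum": ["red", "green", "blue"]}

# Hypothetical side table: nothing in the schema ties it to the enum,
# so the two can drift apart without any validator noticing.
COLOR_LABELS = {"red": "Red", "green": "Green"}


def is_allowed(value):
    """Check the value against the schema's type and enum list."""
    return isinstance(value, str) and value in schema["enum"]


print(is_allowed("blue"))        # True: the code is valid
print(COLOR_LABELS.get("blue"))  # None: no label was ever recorded for it
```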


2. The Labeled Enum Pattern (Standard JSON Schema)

To associate a label like “Yes” with a code like 1, use the oneOf + const pattern. This is standard JSON Schema and requires no extensions: any validator that supports the const keyword (introduced in draft-06) handles it.

{
  "type": "integer",
  "oneOf": [
    { "const": 1, "title": "Yes" },
    { "const": 2, "title": "No" }
  ]
}

By using the standard title keyword inside each oneOf branch, you create an unambiguous mapping between the stored value and its human-readable representation.
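As a rough illustration of that mapping, the Python sketch below looks up the label for a stored value by finding the single oneOf branch whose const matches. The helper name `label_for` is an assumption made for this example, and the sketch assumes codes are integers or strings.

```python
schema = {
    "type": "integer",
    "oneOf": [
        {"const": 1, "title": "Yes"},
        {"const": 2, "title": "No"},
    ],
}


def label_for(schema, value):
    """Return the title of the single oneOf branch whose const equals value.

    Returns None when the value matches no branch, i.e. it is invalid.
    Booleans are rejected explicitly because Python treats True == 1,
    whereas JSON Schema does not consider true an integer.
    """
    if isinstance(value, bool):
        return None
    matches = [b for b in schema["oneOf"] if b["const"] == value]
    # oneOf requires exactly one matching branch.
    return matches[0].get("title") if len(matches) == 1 else None


print(label_for(schema, 1))  # Yes
print(label_for(schema, 3))  # None
```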


3. The Shared Response Domain (DRY Principle)

In data stewardship, many variables often share the same “Response Domain” (e.g., several “Yes/No” questions in a survey). Instead of repeating the oneOf logic, you define it once in $defs and reference it using $ref.

{
  "$defs": {
    "YesNo": {
      "type": "integer",
      "oneOf": [
        { "const": 1, "title": "Yes" },
        { "const": 2, "title": "No" }
      ]
    }
  },
  "properties": {
    "satisfied": { "$ref": "#/$defs/YesNo" },
    "completed": { "$ref": "#/$defs/YesNo" }
  }
}

This ensures consistency: if you decide to change the label “Yes” to “Agree”, you only change it in one place, and it updates across all variables.
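To see the sharing at work, here is a small Python sketch that follows each property's $ref back to the single definition in $defs. `resolve_local_ref` is a hypothetical helper that handles only simple same-document pointers; a real validator library does considerably more.

```python
schema = {
    "$defs": {
        "YesNo": {
            "type": "integer",
            "oneOf": [
                {"const": 1, "title": "Yes"},
                {"const": 2, "title": "No"},
            ],
        }
    },
    "properties": {
        "satisfied": {"$ref": "#/$defs/YesNo"},
        "completed": {"$ref": "#/$defs/YesNo"},
    },
}


def resolve_local_ref(root, ref):
    """Follow a same-document reference such as '#/$defs/YesNo'.

    Only plain '#/a/b' pointers are handled; escaped characters (~0, ~1)
    and remote references are out of scope for this sketch.
    """
    if not ref.startswith("#/"):
        raise ValueError(f"unsupported reference: {ref}")
    node = root
    for part in ref[2:].split("/"):
        node = node[part]
    return node


for name, prop in schema["properties"].items():
    domain = resolve_local_ref(schema, prop["$ref"])
    codes = {b["const"]: b["title"] for b in domain["oneOf"]}
    print(name, codes)
```

Both properties resolve to the same object, so editing the title in `$defs` changes the label seen by every variable that references it.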


4. The FAIR Data Domain (Rich Metadata)

While title is sufficient for simple labels, FAIR data requires more depth: multilingual support, semantic pointers, and persistence. The FAIR Data JSON Schema dialect extends the oneOf pattern with custom keywords. A single oneOf branch then looks like this:

{
  "const": 1,
  "title": "Yes",
  "fair:label": {
    "en": "Yes",
    "fr": "Oui",
    "de": "Ja"
  },
  "fair:conceptRef": "https://www.wikidata.org/wiki/Q231043"
}

Why use FAIR extensions instead of just title?

  1. Multilingualism: Standard title is a single string. fair:label supports localized objects.

  2. Semantic Context: fair:conceptRef links the code to a global ontology (like Wikidata or SKOS), making the data machine-understandable across different languages and systems.

  3. Variable Cascade: This pattern implements a light version of the DDI “Variable Cascade.” The shared definition in $defs acts as the Represented Variable, while the local property in properties acts as the Instance Variable, allowing you to have a local label (e.g., “User’s Satisfaction”) while inheriting a global domain definition.
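The label handling described above can be sketched in Python as a lookup that prefers a localized fair:label and falls back to the standard title. The helper name `localized_label` and the choice of "en" as the fallback language are assumptions of this sketch, not rules of the FAIR dialect.

```python
def localized_label(node, lang, default_lang="en"):
    """Pick a display label for a oneOf branch or a property.

    fair:label may be a language map (as on coded values) or a plain
    string (as on a property that carries its own local label).
    """
    labels = node.get("fair:label")
    if isinstance(labels, str):
        return labels
    if isinstance(labels, dict):
        for candidate in (lang, default_lang):  # assumed fallback order
            if candidate in labels:
                return labels[candidate]
    # No FAIR label available: use the standard JSON Schema title.
    return node.get("title")


yes_branch = {
    "const": 1,
    "title": "Yes",
    "fair:label": {"en": "Yes", "fr": "Oui", "de": "Ja"},
}
print(localized_label(yes_branch, "fr"))                     # Oui
print(localized_label({"const": 2, "title": "No"}, "fr"))    # No
```

Because the fallback ends at title, the same function works on schemas that only use the standard oneOf pattern.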


5. External Semantic Mapping (SKOS)

For high-value datasets, the code list is often defined in an external authority or registry using SKOS (Simple Knowledge Organization System).

The “Hybrid” Approach

This approach does not replace the technical validation logic with semantic URIs; it anchors the two together.

In each entry, we keep:

  • const: Ensures that data files still validate against the correct codes (e.g., “FR”).

  • title: Provides a baseline human label for standard tools.

  • fair:conceptRef: Provides the “semantic bridge” to the official authority URI.

Note that at this stage, we no longer need fair:label inside the schema. Since each code is mapped to a formal URI, a FAIR-aware application can dynamically retrieve the multilingual labels directly from the authoritative source (the SKOS Concept).

{
  "fair:classification": "NUTS Classification",
  "fair:classificationRef": ["http://data.europa.eu/nuts"],
  "oneOf": [
    {
      "const": "FR",
      "title": "France",
      "fair:conceptRef": "http://data.europa.eu/nuts/code/FR"
    }
  ]
}

This mapping allows a FAIR data harvester to:

  1. Discover that the variable follows the NUTS classification.

  2. Automatically translate “FR” to “France” in any language supported by the Eurostat registry.

  3. Perform automated data integration with other datasets that also use the NUTS level 0 URIs.
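A harvester's first and third steps can be sketched in Python without any network access. The sketch below follows the keyword layout of the SkosNutsDomain definition in the full schema (`fair:classificationRef` for the scheme URIs); the helper `concept_index` and the sample `residence` records are hypothetical. Retrieving translated labels from the registry (step 2) is deliberately left out, since it depends on the registry's own API.

```python
domain = {
    "type": "string",
    "fair:classification": "NUTS Classification",
    "fair:classificationRef": ["http://data.europa.eu/nuts"],
    "oneOf": [
        {"const": "BE", "title": "Belgium",
         "fair:conceptRef": "http://data.europa.eu/nuts/code/BE"},
        {"const": "FR", "title": "France",
         "fair:conceptRef": "http://data.europa.eu/nuts/code/FR"},
    ],
}


def concept_index(domain):
    """Map each code to its concept URI, skipping branches without one."""
    return {
        b["const"]: b["fair:conceptRef"]
        for b in domain.get("oneOf", [])
        if "fair:conceptRef" in b
    }


# Step 1: discover which classification the variable follows.
schemes = domain.get("fair:classificationRef", [])
print(schemes)

# Step 3: attach the concept URI to each record so that datasets using
# the same URIs can be joined on it (hypothetical sample records).
residence = [{"id": 1, "region": "FR"}, {"id": 2, "region": "BE"}]
index = concept_index(domain)
linked = [{**row, "region_uri": index.get(row["region"])} for row in residence]
print(linked)
```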


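Summary Comparison

| Feature                | Standard enum | Standard oneOf | FAIR Dialect            | SKOS Mapping            |
|------------------------|---------------|----------------|-------------------------|-------------------------|
| Value Validation       | ✅            | ✅             | ✅                      | ✅                      |
| Human Labels           |               | ✅ (title)     | ✅ (fair:label)         | ✅ (External)           |
| Shared Definitions     |               | ✅ ($ref)      | ✅ ($ref)               | ✅ ($ref)               |
| Multilingual (i18n)    |               |                | ✅                      | ✅ (External)           |
| Semantic Mapping       |               |                | ✅                      | ✅ (skos:Concept)       |
| Authority Link         |               |                |                         | ✅ (skos:ConceptScheme) |
| Standard Compatibility | ✅            | ✅             | ✅ (ignored by default) | ✅                      |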


Full Schema Implementation

{
    "$schema": "https://highvaluedata.net/fair-data-schema/dev",
    "$id": "https://highvaluedata.net/fair-data-schema/dev/examples/enum-to-fair-coded-values",
    "title": "Evolution of Enums to FAIR Coded Values",
    "description": "This example demonstrates various ways to represent categorical data, from standard JSON Schema enums to shared FAIR-annotated response domains mapped to external SKOS ontologies.",
    "type": "object",
    "$defs": {
        "SimpleSharedDomain": {
            "title": "Simple Shared Response Domain",
            "description": "A standard JSON Schema way to associate labels with codes once and reuse them.",
            "type": "integer",
            "oneOf": [
                {
                    "const": 1,
                    "title": "Yes"
                },
                {
                    "const": 2,
                    "title": "No"
                },
                {
                    "const": 9,
                    "title": "Don't know",
                    "fair:sentinel": true
                }
            ]
        },
        "FairSharedDomain": {
            "title": "FAIR Shared Response Domain",
            "description": "An extended domain using FAIR vocabularies for multilingual support and semantic mapping.",
            "type": "integer",
            "oneOf": [
                {
                    "const": 1,
                    "title": "Yes",
                    "fair:label": {
                        "en": "Yes",
                        "fr": "Oui",
                        "de": "Ja"
                    },
                    "fair:conceptRef": "https://www.wikidata.org/wiki/Q231043"
                },
                {
                    "const": 2,
                    "title": "No",
                    "fair:label": {
                        "en": "No",
                        "fr": "Non",
                        "de": "Nein"
                    },
                    "fair:conceptRef": "https://www.wikidata.org/wiki/Q15303"
                }
            ]
        },
        "SkosNutsDomain": {
            "title": "SKOS Mapped NUTS Domain",
            "description": "A domain mapped to an external SKOS ConceptScheme (NUTS Regions).",
            "type": "string",
            "fair:classification": "NUTS Classification",
            "fair:classificationRef": [
                "http://data.europa.eu/nuts"
            ],
            "oneOf": [
                {
                    "const": "BE",
                    "title": "Belgium",
                    "fair:conceptRef": "http://data.europa.eu/nuts/code/BE"
                },
                {
                    "const": "FR",
                    "title": "France",
                    "fair:conceptRef": "http://data.europa.eu/nuts/code/FR"
                },
                {
                    "const": "DE",
                    "title": "Germany",
                    "fair:conceptRef": "http://data.europa.eu/nuts/code/DE"
                }
            ]
        }
    },
    "properties": {
        "standard_enum": {
            "title": "1. Standard Enum",
            "description": "Simplest approach. Validates against a list, but has no place for labels.",
            "type": "string",
            "enum": [
                "red",
                "green",
                "blue"
            ]
        },
        "inline_oneof": {
            "title": "2. Inline oneOf (Standard)",
            "description": "Standard JSON Schema replacement for enum that allows associating a label (title) with each value.",
            "type": "integer",
            "oneOf": [
                {
                    "const": 0,
                    "title": "Low"
                },
                {
                    "const": 5,
                    "title": "Medium"
                },
                {
                    "const": 10,
                    "title": "High"
                }
            ]
        },
        "shared_standard_1": {
            "title": "3a. Shared Domain (Usage 1)",
            "description": "This variable reuses a standard definition. If the domain changes, you only update it once in $defs.",
            "$ref": "#/$defs/SimpleSharedDomain"
        },
        "shared_standard_2": {
            "title": "3b. Shared Domain (Usage 2)",
            "description": "Another variable sharing the exact same codes and labels.",
            "$ref": "#/$defs/SimpleSharedDomain"
        },
        "fair_usage_1": {
            "title": "4a. FAIR Shared Domain (Usage 1)",
            "description": "Uses the FAIR dialect to provide multilingual labels and semantic concept references.",
            "$ref": "#/$defs/FairSharedDomain",
            "fair:label": "Current Satisfaction"
        },
        "fair_usage_2": {
            "title": "4b. FAIR Shared Domain (Usage 2)",
            "description": "The same FAIR domain used in a different context. Note how this property has its own local FAIR label while inheriting the multilingual codes.",
            "$ref": "#/$defs/FairSharedDomain",
            "fair:label": "Future Intent"
        },
        "skos_usage_1": {
            "title": "5a. SKOS External Mapping (Residence)",
            "description": "Variable mapped to an official external SKOS classification (NUTS regions).",
            "$ref": "#/$defs/SkosNutsDomain",
            "fair:label": "Region of Residence"
        },
        "skos_usage_2": {
            "title": "5b. SKOS External Mapping (Birth)",
            "description": "The same official SKOS classification reused for a different variable.",
            "$ref": "#/$defs/SkosNutsDomain",
            "fair:label": "Region of Birth"
        }
    }
}