Coded Values: From Enums to FAIR Domains
This guide explains by example how to handle categorical data (code lists, classifications, or response domains) in JSON Schema, evolving from simple technical validation to rich FAIR metadata.
See the companion schema file: ../../../examples/enum-to-fair-coded-values.json
How-to: Implement Rich Coded Values
- Define the Base Type: Start with type: string or type: integer.
- Use oneOf: Instead of enum, use oneOf to allow for annotations on each value.
- Add Standard Labels: Use title for the default human-readable label.
- Add FAIR Labels: Use fair:label for multilingual support (e.g., {"en": "Yes", "fr": "Oui"}).
- Link to Concepts: Use fair:conceptRef to point to a stable URI (e.g., Wikidata, SKOS).
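The steps above can be sketched as a small Python helper that assembles such a schema from a list of codes. The function name build_coded_domain and the tuple layout of its input are hypothetical, chosen only for illustration; the keywords type, oneOf, const, title, fair:label, and fair:conceptRef are the ones described in this guide.

```python
import json


def build_coded_domain(base_type, entries):
    """Assemble a oneOf-based coded-value schema (illustrative helper).

    entries: iterable of (code, title, labels, concept_uri). labels and
    concept_uri may be None, in which case the FAIR keyword is omitted.
    """
    branches = []
    for code, title, labels, concept_uri in entries:
        # Steps 2 and 3: one annotated branch per code, with a default label.
        branch = {"const": code, "title": title}
        if labels:
            # Step 4: multilingual labels keyed by language code.
            branch["fair:label"] = dict(labels)
        if concept_uri:
            # Step 5: pointer to a stable concept URI.
            branch["fair:conceptRef"] = concept_uri
        branches.append(branch)
    # Step 1: the base type sits next to the oneOf list.
    return {"type": base_type, "oneOf": branches}


schema = build_coded_domain(
    "integer",
    [
        (1, "Yes", {"en": "Yes", "fr": "Oui"}, "https://www.wikidata.org/wiki/Q231043"),
        (2, "No", None, None),
    ],
)
print(json.dumps(schema, indent=2, ensure_ascii=False))
```

Generating the branches from a table of codes keeps the stored value, its label, and its concept link in one place, which is the point of preferring oneOf over enum.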
1. The Standard enum (Validation only)
The most basic way to restrict a value in JSON Schema is the enum keyword.
{
  "type": "string",
  "enum": ["red", "green", "blue"]
}
Pros: Extremely simple; natively supported by all tools.
Cons: No way to associate a human-readable label or description with each code, so the codes must be self-explanatory.
2. The Labeled Enum Pattern (Standard JSON Schema)
To associate a label like “Yes” with a code like 1, we use the oneOf + const pattern. This is standard JSON Schema: it requires no extensions and works in any validator that supports the const keyword (draft-06 and later).
{
  "type": "integer",
  "oneOf": [
    { "const": 1, "title": "Yes" },
    { "const": 2, "title": "No" }
  ]
}
By using the standard title keyword inside each oneOf branch, you create an unambiguous mapping between the stored value and its human-readable representation.
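As a rough illustration of how a consumer might use this mapping, the following Python sketch extracts a code-to-label dictionary from the schema above and decodes a list of stored values. The function name code_labels and the sample data are assumptions made for this example; the sketch only reads annotations and does not perform validation.

```python
schema = {
    "type": "integer",
    "oneOf": [
        {"const": 1, "title": "Yes"},
        {"const": 2, "title": "No"},
    ],
}


def code_labels(schema):
    """Return {code: title} for every oneOf branch that declares a const."""
    return {
        branch["const"]: branch.get("title")
        for branch in schema.get("oneOf", [])
        if "const" in branch
    }


labels = code_labels(schema)

# Hypothetical stored values, decoded to their human-readable labels.
stored = [1, 2, 2, 1]
decoded = [labels[value] for value in stored]
print(decoded)  # ['Yes', 'No', 'No', 'Yes']
```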
4. The FAIR Data Domain (Rich Metadata)
While title is great for simple labels, FAIR data requires more depth: multilingual support, semantic pointers, and persistence. The FAIR Data JSON Schema dialect extends the oneOf pattern with custom keywords.
{
  "const": 1,
  "title": "Yes",
  "fair:label": {
    "en": "Yes",
    "fr": "Oui",
    "de": "Ja"
  },
  "fair:conceptRef": "https://www.wikidata.org/wiki/Q231043"
}
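To show how an application might read such an entry, here is a minimal Python sketch that looks up a label for a requested language and falls back to the standard title. The function name localized_label and the fallback policy are assumptions for illustration; this guide does not prescribe how consumers should resolve a missing language.

```python
entry = {
    "const": 1,
    "title": "Yes",
    "fair:label": {"en": "Yes", "fr": "Oui", "de": "Ja"},
    "fair:conceptRef": "https://www.wikidata.org/wiki/Q231043",
}


def localized_label(entry, lang):
    """Prefer fair:label[lang]; otherwise fall back to the plain title."""
    labels = entry.get("fair:label")
    # fair:label is only treated as multilingual when it is an object.
    if isinstance(labels, dict) and lang in labels:
        return labels[lang]
    return entry.get("title")


print(localized_label(entry, "fr"))  # Oui
print(localized_label(entry, "es"))  # Yes (no Spanish label, so title is used)
```

Keeping title alongside fair:label means tools that know nothing about the FAIR keywords still have a label to display.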
Why use FAIR extensions instead of just title?
- Multilingualism: Standard title is a single string. fair:label supports localized objects.
- Semantic Context: fair:conceptRef links the code to a global ontology (like Wikidata or SKOS), making the data machine-understandable across different languages and systems.
- Variable Cascade: This pattern implements a light version of the DDI “Variable Cascade.” The shared definition in $defs acts as the Represented Variable, while the local property in properties acts as the Instance Variable, allowing you to have a local label (e.g., “User’s Satisfaction”) while inheriting a global domain definition.
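The Variable Cascade idea can be sketched in Python as follows: the local property supplies its own label while the codes come from the shared definition it references. The schema is a shortened excerpt in the shape of this guide's examples; the helper names resolve_local_ref and describe_variable are hypothetical, and only same-document references of the form #/... are handled.

```python
schema = {
    "$defs": {
        "FairSharedDomain": {
            "type": "integer",
            "oneOf": [
                {"const": 1, "title": "Yes", "fair:label": {"en": "Yes", "fr": "Oui"}},
                {"const": 2, "title": "No", "fair:label": {"en": "No", "fr": "Non"}},
            ],
        }
    },
    "properties": {
        "fair_usage_1": {
            "$ref": "#/$defs/FairSharedDomain",
            "fair:label": "Current Satisfaction",
        }
    },
}


def resolve_local_ref(root, ref):
    """Follow a same-document JSON Pointer such as '#/$defs/Name'."""
    if not ref.startswith("#/"):
        raise ValueError("only same-document references are handled in this sketch")
    node = root
    for token in ref[2:].split("/"):
        # Undo JSON Pointer escaping (RFC 6901): ~1 is '/', ~0 is '~'.
        node = node[token.replace("~1", "/").replace("~0", "~")]
    return node


def describe_variable(root, name):
    """Combine the local (instance) label with the shared (represented) codes."""
    prop = root["properties"][name]
    domain = resolve_local_ref(root, prop["$ref"]) if "$ref" in prop else prop
    return {
        "variable_label": prop.get("fair:label", prop.get("title")),
        "codes": [branch["const"] for branch in domain.get("oneOf", [])],
    }


info = describe_variable(schema, "fair_usage_1")
print(info)  # {'variable_label': 'Current Satisfaction', 'codes': [1, 2]}
```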
5. External Semantic Mapping (SKOS)
For high-value datasets, the code list is often defined in an external authority or registry using SKOS (Simple Knowledge Organization System).
The “Hybrid” Approach
Crucially, this is a hybrid approach. We do not replace the technical validation logic with semantic URIs; we anchor them together.
In each entry, we keep:
- const: Ensures that data files still validate against the correct codes (e.g., “FR”).
- title: Provides a baseline human label for standard tools.
- fair:conceptRef: Provides the “semantic bridge” to the official authority URI.
Note that at this stage, we no longer need fair:label inside the schema. Since each code is mapped to a formal URI, a FAIR-aware application can dynamically retrieve the multilingual labels directly from the authoritative source (the SKOS Concept).
{
  "fair:classificationRef": ["http://data.europa.eu/nuts"],
  "oneOf": [
    {
      "const": "FR",
      "title": "France",
      "fair:conceptRef": "http://data.europa.eu/nuts/code/FR"
    }
  ]
}
This mapping allows a FAIR data harvester to:
- Discover that the variable follows the NUTS classification.
- Automatically translate “FR” to “France” in any language supported by the Eurostat registry.
- Perform automated data integration with other datasets that also use the NUTS level 0 URIs.
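A harvester's first step could look roughly like the Python sketch below, which reads the classification URIs and builds a code-to-concept map from a domain shaped like the SkosNutsDomain definition in the full schema. The function names harvest and annotate are hypothetical, and no network access is performed: retrieving multilingual labels from the registry is left out because its API is not described in this guide.

```python
domain = {
    "type": "string",
    "fair:classificationRef": ["http://data.europa.eu/nuts"],
    "oneOf": [
        {"const": "BE", "title": "Belgium",
         "fair:conceptRef": "http://data.europa.eu/nuts/code/BE"},
        {"const": "FR", "title": "France",
         "fair:conceptRef": "http://data.europa.eu/nuts/code/FR"},
        {"const": "DE", "title": "Germany",
         "fair:conceptRef": "http://data.europa.eu/nuts/code/DE"},
    ],
}


def harvest(domain):
    """Return (classification URIs, {code: concept URI}) for one domain."""
    schemes = list(domain.get("fair:classificationRef", []))
    concept_by_code = {
        branch["const"]: branch["fair:conceptRef"]
        for branch in domain.get("oneOf", [])
        if "const" in branch and "fair:conceptRef" in branch
    }
    return schemes, concept_by_code


def annotate(values, concept_by_code):
    """Map each stored code to its concept URI (None when unmapped)."""
    return [concept_by_code.get(value) for value in values]


schemes, concept_by_code = harvest(domain)
print(schemes)
print(annotate(["FR", "BE", "XX"], concept_by_code))
```

Two datasets whose codes resolve to the same concept URIs can then be joined on those URIs rather than on local code strings.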
Summary Comparison
| Feature | Standard enum | Standard oneOf | FAIR Dialect | SKOS Mapping |
|---|---|---|---|---|
| Value Validation | ✅ | ✅ | ✅ | ✅ |
| Human Labels | ❌ | ✅ (title) | ✅ (fair:label) | ✅ (External) |
| Shared Definitions | ❌ | ✅ ($defs) | ✅ ($defs) | ✅ ($defs) |
| Multilingual (i18n) | ❌ | ❌ | ✅ | ✅ (External) |
| Semantic Mapping | ❌ | ❌ | ✅ | ✅ |
| Authority Link | ❌ | ❌ | ❌ | ✅ |
| Standard Compatibility | ✅ | ✅ | ✅ (ignored by default) | ✅ |
Full Schema Implementation
{
  "$schema": "https://highvaluedata.net/fair-data-schema/dev",
  "$id": "https://highvaluedata.net/fair-data-schema/dev/examples/enum-to-fair-coded-values",
  "title": "Evolution of Enums to FAIR Coded Values",
  "description": "This example demonstrates various ways to represent categorical data, from standard JSON Schema enums to shared FAIR-annotated response domains mapped to external SKOS ontologies.",
  "type": "object",
  "$defs": {
    "SimpleSharedDomain": {
      "title": "Simple Shared Response Domain",
      "description": "A standard JSON Schema way to associate labels with codes once and reuse them.",
      "type": "integer",
      "oneOf": [
        {
          "const": 1,
          "title": "Yes"
        },
        {
          "const": 2,
          "title": "No"
        },
        {
          "const": 9,
          "title": "Don't know",
          "fair:sentinel": true
        }
      ]
    },
    "FairSharedDomain": {
      "title": "FAIR Shared Response Domain",
      "description": "An extended domain using FAIR vocabularies for multilingual support and semantic mapping.",
      "type": "integer",
      "oneOf": [
        {
          "const": 1,
          "title": "Yes",
          "fair:label": {
            "en": "Yes",
            "fr": "Oui",
            "de": "Ja"
          },
          "fair:conceptRef": "https://www.wikidata.org/wiki/Q231043"
        },
        {
          "const": 2,
          "title": "No",
          "fair:label": {
            "en": "No",
            "fr": "Non",
            "de": "Nein"
          },
          "fair:conceptRef": "https://www.wikidata.org/wiki/Q15303"
        }
      ]
    },
    "SkosNutsDomain": {
      "title": "SKOS Mapped NUTS Domain",
      "description": "A domain mapped to an external SKOS ConceptScheme (NUTS Regions).",
      "type": "string",
      "fair:classification": "NUTS Classification",
      "fair:classificationRef": [
        "http://data.europa.eu/nuts"
      ],
      "oneOf": [
        {
          "const": "BE",
          "title": "Belgium",
          "fair:conceptRef": "http://data.europa.eu/nuts/code/BE"
        },
        {
          "const": "FR",
          "title": "France",
          "fair:conceptRef": "http://data.europa.eu/nuts/code/FR"
        },
        {
          "const": "DE",
          "title": "Germany",
          "fair:conceptRef": "http://data.europa.eu/nuts/code/DE"
        }
      ]
    }
  },
  "properties": {
    "standard_enum": {
      "title": "1. Standard Enum",
      "description": "Simplest approach. Validates against a list, but has no place for labels.",
      "type": "string",
      "enum": [
        "red",
        "green",
        "blue"
      ]
    },
    "inline_oneof": {
      "title": "2. Inline oneOf (Standard)",
      "description": "Standard JSON Schema replacement for enum that allows associating a label (title) with each value.",
      "type": "integer",
      "oneOf": [
        {
          "const": 0,
          "title": "Low"
        },
        {
          "const": 5,
          "title": "Medium"
        },
        {
          "const": 10,
          "title": "High"
        }
      ]
    },
    "shared_standard_1": {
      "title": "3a. Shared Domain (Usage 1)",
      "description": "This variable reuses a standard definition. If the domain changes, you only update it once in $defs.",
      "$ref": "#/$defs/SimpleSharedDomain"
    },
    "shared_standard_2": {
      "title": "3b. Shared Domain (Usage 2)",
      "description": "Another variable sharing the exact same codes and labels.",
      "$ref": "#/$defs/SimpleSharedDomain"
    },
    "fair_usage_1": {
      "title": "4a. FAIR Shared Domain (Usage 1)",
      "description": "Uses the FAIR dialect to provide multilingual labels and semantic concept references.",
      "$ref": "#/$defs/FairSharedDomain",
      "fair:label": "Current Satisfaction"
    },
    "fair_usage_2": {
      "title": "4b. FAIR Shared Domain (Usage 2)",
      "description": "The same FAIR domain used in a different context. Note how this property has its own local FAIR label while inheriting the multilingual codes.",
      "$ref": "#/$defs/FairSharedDomain",
      "fair:label": "Future Intent"
    },
    "skos_usage_1": {
      "title": "5a. SKOS External Mapping (Residence)",
      "description": "Variable mapped to an official external SKOS classification (NUTS regions).",
      "$ref": "#/$defs/SkosNutsDomain",
      "fair:label": "Region of Residence"
    },
    "skos_usage_2": {
      "title": "5b. SKOS External Mapping (Birth)",
      "description": "The same official SKOS classification reused for a different variable.",
      "$ref": "#/$defs/SkosNutsDomain",
      "fair:label": "Region of Birth"
    }
  }
}