Data Products & Dataset Relationships
[!NOTE] Advanced Feature: While the primary focus of the FAIR Data JSON Schema is the description of simple, standalone datasets, this guide explores advanced patterns for complex data products and multi-resource packages.
How-to: Describe Relationships & Joins
1. Identify Datasets: Define each dataset (table) in the product.
2. Map Relations: Use fair:datasetRelations to describe how they relate (e.g., isPartOf, isContinuedBy).
3. Specify Mapping: Define sourceVariables and targetVariables for the join.
4. Define Cardinality: Specify the relationship type (e.g., one-to-many, many-to-one).
5. Add Variable Links: Link related variables across datasets using fair:variableRef.
Flat Multi-Dataset Structures
A Data Product (annotated with fair:resourceType: "data-product") acts as a container. In a flat structure, datasets are listed side-by-side in the root properties object rather than nested within each other.
Use Case: Primary & Secondary Data
A bundle containing the core data file and a secondary file for auxiliary or derived variables.
{
  "fair:resourceType": "data-product",
  "properties": {
    "primary_data": { "fair:resourceType": "dataset" },
    "secondary_data": { "fair:resourceType": "dataset" }
  }
}
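A consumer could enumerate the datasets of a flat product by scanning the root properties object. The following is a minimal Python sketch, assuming the schema has already been loaded into a dict; the helper name list_datasets is hypothetical and not part of any FAIR Data JSON Schema tooling.

```python
import json

# The flat product from the use case above, loaded as a plain dict.
product = json.loads("""
{
  "fair:resourceType": "data-product",
  "properties": {
    "primary_data": { "fair:resourceType": "dataset" },
    "secondary_data": { "fair:resourceType": "dataset" }
  }
}
""")


def list_datasets(schema):
    """Return the names of root-level properties annotated as datasets."""
    return [
        name
        for name, subschema in schema.get("properties", {}).items()
        if isinstance(subschema, dict)
        and subschema.get("fair:resourceType") == "dataset"
    ]


dataset_names = list_datasets(product)
print(dataset_names)
```

Because the structure is flat, a single pass over the root properties is enough; no recursion into nested datasets is needed.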
Describing Relationships (fair:datasetRelations)
The fair:datasetRelations keyword (Dataset Scope) allows for explicit, machine-actionable connections between resources. It aligns with DDI and Dublin Core standards.
Core Relationship Types
- isPartOf: Indicates the dataset is a component of a larger aggregate (e.g., a table in a product).
- isVersionOf: Indicates a previous or alternative version.
- isContinuedBy: Indicates chronological succession (critical for time series).
- isReferencedBy: Indicates the dataset is cited or used as a source by another.
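As a rough illustration, a tool could flag relation entries whose relationType is not one of the four core types listed above. This is only a sketch: the function name unknown_relation_types is hypothetical, the sample relations are made up, and the schema itself may permit further relation types beyond those named in this guide.

```python
# Core relation types named in this guide; the schema may allow others.
CORE_RELATION_TYPES = {"isPartOf", "isVersionOf", "isContinuedBy", "isReferencedBy"}


def unknown_relation_types(relations):
    """Return relationType values that are not among the core types above."""
    return [
        rel.get("relationType")
        for rel in relations
        if rel.get("relationType") not in CORE_RELATION_TYPES
    ]


# Made-up relation entries for illustration only.
relations = [
    {"relationType": "isPartOf", "targetRef": "#"},
    {"relationType": "isContinuedBy", "targetRef": "#/properties/release_2024_02"},
    {"relationType": "continues", "targetRef": "#"},  # deliberately not a core type
]

unknown = unknown_relation_types(relations)
print(unknown)
```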
Join Relationships (Variables & Cardinality)
Relationships can precisely define how datasets are linked at the variable level.
- sourceVariables: The linking keys in the current dataset.
- targetVariables: The corresponding keys in the target dataset.
- cardinality: Defines the nature of the join (one-to-one, many-to-one, etc.).
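To make the cardinality declaration concrete, the sketch below checks whether sample records actually satisfy a many-to-one join. It assumes a strict reading (target keys are unique and every source key has a match); the schema may define many-to-one more loosely. The function names and the sample records are hypothetical; the variable names hh_id and household_id are borrowed from the census example in this guide.

```python
def key_of(row, variables):
    """Build a tuple key from the named variables of one record."""
    return tuple(row[name] for name in variables)


def satisfies_many_to_one(source_rows, target_rows, source_vars, target_vars):
    """Check that target keys are unique and every source key matches one."""
    target_keys = [key_of(row, target_vars) for row in target_rows]
    if len(target_keys) != len(set(target_keys)):
        return False  # duplicate target keys would break the "one" side
    known = set(target_keys)
    return all(key_of(row, source_vars) in known for row in source_rows)


# Made-up sample records for illustration only.
households = [{"household_id": "H1"}, {"household_id": "H2"}]
persons = [
    {"person_id": "P1", "hh_id": "H1"},
    {"person_id": "P2", "hh_id": "H1"},
    {"person_id": "P3", "hh_id": "H2"},
]

valid = satisfies_many_to_one(persons, households, ["hh_id"], ["household_id"])
duplicated = satisfies_many_to_one(
    persons, households + [{"household_id": "H1"}], ["hh_id"], ["household_id"]
)
print(valid, duplicated)
```

Keys are built as tuples so that composite keys (several sourceVariables paired with several targetVariables) work the same way as single-column keys.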
Example 1: Flat Hierarchy (Census Join)
In this example, the persons registry is not nested inside households. Instead, it declares a relationship that joins its hh_id variable to household_id in the households dataset.
{
  "persons": {
    "fair:resourceType": "dataset",
    "fair:datasetRelations": [
      {
        "relationType": "isPartOf",
        "targetRef": "#/properties/households",
        "sourceVariables": ["hh_id"],
        "targetVariables": ["household_id"],
        "cardinality": "many-to-one",
        "description": "Each person belongs to exactly one household."
      }
    ]
  }
}
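One way to sanity-check such a relation is to resolve targetRef and confirm that the named variables are declared on both sides. The sketch below assumes targetRef is a same-document JSON Pointer fragment and that each dataset declares its variables under items.properties, as in the full flat hierarchy schema in this guide. The helpers resolve_pointer and missing_variables are hypothetical, and the resolver does not handle array indices or percent-encoded fragments.

```python
def resolve_pointer(document, ref):
    """Resolve a same-document fragment such as '#/properties/households'."""
    if not ref.startswith("#"):
        raise ValueError(f"only same-document references are handled: {ref!r}")
    node = document
    for token in ref[1:].split("/")[1:]:
        # Undo JSON Pointer escaping (~1 is '/', ~0 is '~').
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[token]
    return node


def missing_variables(dataset, variables):
    """Return the variables not declared under the dataset's items.properties."""
    declared = dataset.get("items", {}).get("properties", {})
    return [name for name in variables if name not in declared]


# Trimmed-down version of the census product, for illustration only.
product = {
    "properties": {
        "households": {
            "fair:resourceType": "dataset",
            "items": {"properties": {"household_id": {"type": "string"}}},
        },
        "persons": {
            "fair:resourceType": "dataset",
            "fair:datasetRelations": [
                {
                    "relationType": "isPartOf",
                    "targetRef": "#/properties/households",
                    "sourceVariables": ["hh_id"],
                    "targetVariables": ["household_id"],
                    "cardinality": "many-to-one",
                }
            ],
            "items": {
                "properties": {
                    "person_id": {"type": "string"},
                    "hh_id": {"type": "string"},
                }
            },
        },
    }
}

persons = product["properties"]["persons"]
relation = persons["fair:datasetRelations"][0]
target = resolve_pointer(product, relation["targetRef"])
problems = missing_variables(persons, relation["sourceVariables"]) + missing_variables(
    target, relation["targetVariables"]
)
print(problems)
```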
Example 2: Time Series & Variable Reuse
Time-series products often release new datasets monthly that share an identical structure. To ensure consistency and reduce maintenance, we use Variable Reuse via $defs and $ref.
The Pattern
1. Define shared variable shapes in a central $defs section.
2. Define a “Base Release” dataset shape that references those variables.
3. Each monthly release $refs the base shape and adds specific temporalCoverage and datasetRelations.
{
  "$defs": {
    "ValueVar": {
      "type": "number",
      "fair:unit": "USD"
    },
    "BaseRelease": {
      "properties": {
        "val": { "$ref": "#/$defs/ValueVar" }
      }
    }
  },
  "properties": {
    "release_jan": {
      "allOf": [{ "$ref": "#/$defs/BaseRelease" }],
      "fair:datasetRelations": [{ "relationType": "isContinuedBy", "targetRef": "..." }]
    }
  }
}
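To see which variables a release ends up with, a tool could expand the local $ref and allOf entries. The sketch below is a simplified illustration rather than a JSON Schema implementation: it only follows same-document references, merges properties by name for display purposes, and has no protection against circular references. The helper names are hypothetical.

```python
def resolve_local_ref(root, ref):
    """Resolve a same-document reference such as '#/$defs/BaseRelease'."""
    node = root
    for token in ref.lstrip("#").split("/")[1:]:
        node = node[token]
    return node


def effective_properties(root, subschema):
    """Collect properties from a subschema, following local $ref and allOf."""
    props = {}
    if "$ref" in subschema:
        props.update(effective_properties(root, resolve_local_ref(root, subschema["$ref"])))
    for branch in subschema.get("allOf", []):
        props.update(effective_properties(root, branch))
    for name, prop in subschema.get("properties", {}).items():
        # Expand a property that is itself a bare local reference.
        props[name] = resolve_local_ref(root, prop["$ref"]) if "$ref" in prop else prop
    return props


# The pattern from this section, with the elided targetRef left out.
schema = {
    "$defs": {
        "ValueVar": {"type": "number", "fair:unit": "USD"},
        "BaseRelease": {"properties": {"val": {"$ref": "#/$defs/ValueVar"}}},
    },
    "properties": {
        "release_jan": {"allOf": [{"$ref": "#/$defs/BaseRelease"}]},
    },
}

release_props = effective_properties(schema, schema["properties"]["release_jan"])
print(release_props)
```

Every release that references BaseRelease resolves to the same variable definitions, which is the consistency benefit the pattern is aiming for.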
Full Schema Implementations
Flat Hierarchy Product
{
  "$schema": "https://highvaluedata.net/fair-data-schema/dev",
  "$id": "https://highvaluedata.net/fair-data-schema/dev/examples/flat-hierarchy-product",
  "title": "Flat Census Data Product",
  "description": "An example of a Data Product where datasets are side-by-side (non-nested) but linked via join relationships.",
  "fair:resourceType": "data-product",
  "type": "object",
  "properties": {
    "households": {
      "title": "Household Registry",
      "fair:resourceType": "dataset",
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "household_id": {
            "type": "string",
            "fair:label": "Household Primary Key"
          },
          "region": {
            "type": "string",
            "fair:conceptRef": "https://example.org/vocabs/nuts"
          }
        }
      }
    },
    "persons": {
      "title": "Person Registry",
      "fair:resourceType": "dataset",
      "fair:datasetRelations": [
        {
          "relationType": "isPartOf",
          "targetRef": "#/properties/households",
          "sourceVariables": ["hh_id"],
          "targetVariables": ["household_id"],
          "cardinality": "many-to-one",
          "description": "Each person belongs to exactly one household."
        }
      ],
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "person_id": { "type": "string" },
          "hh_id": {
            "type": "string",
            "fair:label": "Household Foreign Key"
          },
          "age": { "type": "integer" }
        }
      }
    }
  }
}
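Because the relation names the join keys on both sides, a consumer could drive a join directly from the metadata. The following sketch attaches household fields to each person record using the relation declared above. The function join_many_to_one, the sample records, and the region codes are all hypothetical; persons without a matching household are kept unchanged, which is a design choice of this sketch rather than something the schema prescribes.

```python
def join_many_to_one(source_rows, target_rows, relation):
    """Attach the matching target record's fields to each source record."""
    src_vars = relation["sourceVariables"]
    tgt_vars = relation["targetVariables"]
    # Index the "one" side by its key so each lookup is a dict access.
    index = {tuple(row[v] for v in tgt_vars): row for row in target_rows}
    joined = []
    for row in source_rows:
        match = index.get(tuple(row[v] for v in src_vars))
        combined = dict(match or {})  # target fields first
        combined.update(row)          # source fields win on name clashes
        joined.append(combined)
    return joined


# The relation as declared on the persons dataset.
relation = {
    "relationType": "isPartOf",
    "targetRef": "#/properties/households",
    "sourceVariables": ["hh_id"],
    "targetVariables": ["household_id"],
    "cardinality": "many-to-one",
}

# Made-up sample records for illustration only.
households = [
    {"household_id": "H1", "region": "R1"},
    {"household_id": "H2", "region": "R2"},
]
persons = [
    {"person_id": "P1", "hh_id": "H1", "age": 34},
    {"person_id": "P2", "hh_id": "H2", "age": 8},
    {"person_id": "P3", "hh_id": "H9", "age": 51},  # no matching household
]

joined = join_many_to_one(persons, households, relation)
for record in joined:
    print(record)
```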
Time Series Product
{
  "$schema": "https://highvaluedata.net/fair-data-schema/dev",
  "$id": "https://highvaluedata.net/fair-data-schema/dev/examples/time-series-product",
  "title": "Monthly Economic Indicator Series",
  "description": "A time-series data product released monthly, reusing variable definitions across releases.",
  "fair:resourceType": "data-product",
  "fair:provider": "Bureau of Economic Analysis",
  "type": "object",
  "$defs": {
    "BaseRelease": {
      "type": "object",
      "fair:resourceType": "dataset",
      "required": ["date", "value"],
      "properties": {
        "date": {
          "$ref": "#/$defs/Variables/DateVar"
        },
        "value": {
          "$ref": "#/$defs/Variables/ValueVar"
        }
      }
    },
    "Variables": {
      "DateVar": {
        "type": "string",
        "format": "date",
        "title": "Reference Date",
        "fair:conceptRef": "https://www.wikidata.org/wiki/Q577"
      },
      "ValueVar": {
        "type": "number",
        "title": "Indicator Value",
        "fair:unit": "USD",
        "fair:unitRef": "https://example.org/vocabs/units/usd"
      }
    }
  },
  "properties": {
    "release_2024_01": {
      "title": "January 2024 Release",
      "allOf": [ { "$ref": "#/$defs/BaseRelease" } ],
      "fair:temporalCoverage": {
        "start": "2024-01-01",
        "end": "2024-01-31"
      },
      "fair:datasetRelations": [
        {
          "relationType": "isContinuedBy",
          "targetRef": "#/properties/release_2024_02"
        },
        {
          "relationType": "isPartOf",
          "targetRef": "#"
        }
      ]
    },
    "release_2024_02": {
      "title": "February 2024 Release",
      "allOf": [ { "$ref": "#/$defs/BaseRelease" } ],
      "fair:temporalCoverage": {
        "start": "2024-02-01",
        "end": "2024-02-29"
      },
      "fair:datasetRelations": [
        {
          "relationType": "isPartOf",
          "targetRef": "#"
        }
      ]
    }
  }
}
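The isContinuedBy links make it possible to recover the release sequence without parsing dates or property names. The sketch below walks those links. It assumes every isContinuedBy target has the form #/properties/<name>, as in the schema above; the function release_order is hypothetical, and the product dict is trimmed to the parts the walk needs, with the releases deliberately listed out of order.

```python
def release_order(product):
    """Order releases by following isContinuedBy links from each chain start."""
    successor = {}
    for name, dataset in product["properties"].items():
        for rel in dataset.get("fair:datasetRelations", []):
            if rel["relationType"] == "isContinuedBy":
                # Assumes targets look like '#/properties/<name>'.
                successor[name] = rel["targetRef"].rsplit("/", 1)[-1]
    # A chain starts at any release that no other release continues into.
    starts = [n for n in product["properties"] if n not in successor.values()]
    ordered = []
    for start in starts:
        current = start
        # Stop at the end of the chain or when a release repeats.
        while current is not None and current not in ordered:
            ordered.append(current)
            current = successor.get(current)
    return ordered


# Trimmed-down version of the time-series product, listed out of order.
product = {
    "properties": {
        "release_2024_02": {
            "fair:datasetRelations": [
                {"relationType": "isPartOf", "targetRef": "#"}
            ]
        },
        "release_2024_01": {
            "fair:datasetRelations": [
                {"relationType": "isContinuedBy", "targetRef": "#/properties/release_2024_02"},
                {"relationType": "isPartOf", "targetRef": "#"},
            ]
        },
    }
}

ordered_releases = release_order(product)
print(ordered_releases)
```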