Provenance

Support for provenance was developed as a BIDS Extension Proposal. Please see Citing BIDS on how to appropriately credit this extension when referring to it in the context of the academic literature.

Example datasets

Several example BIDS-Prov datasets have been formatted using this specification and can be used for practical guidance when curating a new dataset.

This part of the BIDS specification is aimed at describing the provenance of a BIDS dataset. This description is retrospective: it describes a set of steps that were executed in order to obtain the dataset. Note: this is different from prospective provenance, which focuses on describing workflows that may be run on a dataset. This description is based on the W3C Prov standard (see the Provenance from an RDF perspective section for more information).

Provenance information SHOULD be included in a BIDS dataset when possible. If provenance information is included, it MUST be described using the conventions detailed hereafter. Provenance information reflects the provenance of a full dataset and/or of specific files at any level of the BIDS hierarchy. Provenance information SHOULD NOT include human subject identifying data.

Provenance of a BIDS file

Provenance of a BIDS file SHOULD be stored inside its sidecar JSON.

For that purpose, any sidecar JSON file MAY include the following keys:

Key name Requirement Level Data type Description
GeneratedBy OPTIONAL string or array of strings Identifier(s) of the activity/activities responsible for the creation of the data.
Related activities MUST be described as specified in the Activities section.
SidecarGeneratedBy OPTIONAL string or array of strings Identifier(s) of the activity/activities responsible for the creation of the sidecar JSON file.
Related activities MUST be described as specified in the Activities section.
Digest OPTIONAL object Object containing digests of the provEntity. Each key in the object MUST be the name of a checksum function if present in this list: MD5; SHA1; SHA-224; SHA-256; SHA-384; SHA-512; SHA3-224; SHA3-256; SHA3-384; SHA3-512; BLAKE2B-256; BLAKE3-256; SHAKE128; SHAKE256. Otherwise, the key MAY be an arbitrary label. The corresponding value is the checksum as computed by the function identified by the key.
Type OPTIONAL string Term from a controlled vocabulary that more specifically describes the provEntity.

Example of metadata in a sidecar JSON file

{
    "GeneratedBy": "bids::prov#conversion-00f3a18f",
    "SidecarGeneratedBy": [
        "bids::prov#preparation-conversion-1xkhm1ft",
        "bids::prov#conversion-00f3a18f"
    ],
    "Digest": {
        "sha256": "66eeafb465559148e0222d4079558a8354eb09b9efabcc47cd5b8af6eed51907"
    }
}
This snippet is similar to fields described in the DICOM to Nifti conversion with heudiconv example.
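
As a hedged illustration, the Digest object of a sidecar could be computed along the following lines. This is a minimal Python sketch and not part of the specification: the helper names compute_digest and sidecar_provenance are hypothetical, and the "sha256" key simply mirrors the example above.

```python
import hashlib
from pathlib import Path


def compute_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return {algorithm: hex digest} for the file at `path`, read in chunks."""
    hasher = hashlib.new(algorithm)
    with Path(path).open("rb") as stream:
        # Read fixed-size chunks so large imaging files are not loaded at once.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            hasher.update(chunk)
    return {algorithm: hasher.hexdigest()}


def sidecar_provenance(data_file, generated_by):
    """Build the provenance keys of a sidecar JSON (hypothetical helper)."""
    return {"GeneratedBy": generated_by, "Digest": compute_digest(data_file)}
```

The returned dictionary can be merged into the existing sidecar content before it is written back with json.dump.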

Provenance of a BIDS dataset

Provenance of a BIDS dataset (raw, derivative, or study) SHOULD be stored inside its dataset_description.json file. Corresponding metadata describes the provenance of the whole dataset. The dataset_description.json file of a BIDS raw dataset or BIDS study dataset MAY include the GeneratedBy key to describe provenance. The dataset_description.json file of a BIDS derivative dataset MUST include the GeneratedBy key to describe provenance.

The GeneratedBy field MAY contain either of the following values:

Description using identifiers

This section details the way to describe provenance of a dataset in the GeneratedBy field, using identifiers.

Key name Requirement Level Data type Description
GeneratedBy RECOMMENDED for BIDS raw datasets and BIDS study datasets, REQUIRED for BIDS derivative datasets string or array of strings Identifier(s) of the activity/activities responsible for the creation of the data.
Related activities MUST be described as specified in the Activities section.

Example of GeneratedBy contents in a dataset_description.json

{
    "GeneratedBy": "bids::prov#preprocessing-xMpFqB5q"
}
This is a snippet from the fMRI preprocessing with fMRIPrep example.

Description of processes or pipelines

This section details a way to describe the provenance of a dataset, providing GeneratedBy with an array of objects representing pipelines or processes that generated the dataset.

Warning

This description can be equivalently represented using the previous section. This modeling is kept for backward-compatibility but might be removed in future BIDS releases (see BIDS 2.0).

Key name Requirement Level Data type Description
GeneratedBy RECOMMENDED for BIDS raw datasets and BIDS study datasets, REQUIRED for BIDS derivative datasets array of objects Used to specify provenance of the dataset.

Each object in the GeneratedBy array includes the following REQUIRED, RECOMMENDED and OPTIONAL keys:

Key name Requirement Level Data type Description
Name REQUIRED string Name of the pipeline or process that generated the dataset. Use "Manual" to indicate the derivatives were generated by hand, or adjusted manually after an initial run of an automated pipeline.
Version RECOMMENDED string Version of the pipeline or process that generated the dataset.
Description RECOMMENDED if Name is "Manual", OPTIONAL otherwise string Plain-text description of the pipeline or process that generated the dataset. RECOMMENDED if Name is "Manual".
CodeURL OPTIONAL string URL where the code used to generate the dataset may be found.
Container OPTIONAL object Used to specify the location and relevant attributes of software container image used to produce the dataset. Valid keys in this object include Type, Tag and URI with string values.

Example of GeneratedBy contents in a dataset_description.json

{
    "GeneratedBy": [
        {
          "Name": "reproin",
          "Version": "0.6.0",
          "Container": {
            "Type": "docker",
            "Tag": "repronim/reproin:0.6.0"
          }
        }
    ]
}
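
Because the dataset-level GeneratedBy field accepts two forms, a reader of dataset_description.json may need to tell them apart. The following Python sketch shows one possible way to do so; the function name generated_by_form and the returned labels "identifiers" and "pipelines" are assumptions made for illustration only.

```python
def generated_by_form(value):
    """Classify a dataset-level GeneratedBy value.

    Returns "identifiers" for a string or a list of strings, and
    "pipelines" for a list of objects that each carry a "Name" string.
    Raises ValueError for anything else.
    """
    if isinstance(value, str):
        return "identifiers"
    if isinstance(value, list) and value:
        if all(isinstance(item, str) for item in value):
            return "identifiers"
        if all(isinstance(item, dict) for item in value):
            for item in value:
                # Name is the only REQUIRED key of a pipeline object.
                if not isinstance(item.get("Name"), str):
                    raise ValueError("each pipeline object requires a 'Name' string")
            return "pipelines"
    raise ValueError(
        "GeneratedBy must be a string, an array of strings, or an array of objects"
    )
```

In this sketch an empty array is rejected, which is a design choice of the example rather than a rule stated above.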

Provenance files

In addition to storing provenance in sidecar JSON files (see the Provenance of a BIDS file section) or in dataset_description.json (see the Provenance of a BIDS dataset section), other provenance information MUST be stored inside provenance files.

Template:

Legend:
  • For more information about filename elements (for example, entities, suffixes, extensions), follow the links embedded in the filename template.

  • <matches> is a placeholder to denote an arbitrary (and valid) sequence of entities and labels at the beginning of the filename (only BIDS "raw").

  • <source-entities> is a placeholder to denote an arbitrary sequence of entities and labels at the beginning of the filename matching a source file from which the file derives (only BIDS-Derivatives).

  • Filename entities or directories between square brackets (for example, [_ses-<label>]) are OPTIONAL.

  • Some entities may only allow specific values, in which case those values are listed in <>, separated by |.

  • _<suffix> means that there are several (>6) valid suffixes for this filename pattern.

  • .<extension> means that there are several (>6) valid extensions for this file type.

  • [.gz] means that both the unzipped and gzipped versions of the extension are valid.

Note

The prov entity makes it possible to group related provenance files, using an arbitrary value for <label>. A subdirectory MAY be used to group provenance files sharing the same prov entity.

The following suffixes specify the contents of provenance files.

Name suffix Description
Description of activities act A JSON file containing objects describing activities in the context of provenance. (See the Activities section).
Description of provEntities ent A JSON file containing objects describing provEntities in the context of provenance. (See the ProvEntities section).
Description of environments env A JSON file containing objects describing environments in the context of provenance. (See the Environments section).
Description of software soft A JSON file containing objects describing software in the context of provenance. (See the Software section).

Example of organization for provenance files

prov/
├─ prov-preprocspm/
│  ├─ prov-preprocspm_desc-v1_act.json
│  ├─ prov-preprocspm_desc-v1_ent.json
│  ├─ prov-preprocspm_desc-v2_act.json
│  └─ prov-preprocspm_desc-v2_ent.json
├─ prov-preprocfsl_act.json
├─ prov-preprocfsl_ent.json
├─ prov-preprocfsl_env.json
├─ prov-preprocfsl_soft.json
└─ ...
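
The layout above can be explored programmatically. The sketch below, in Python, groups provenance files by prov label and suffix. The filename pattern is an assumption inferred from the example tree (prov-<label>, optional further entity-label pairs, then one of the four suffixes), and group_provenance_files is a hypothetical helper name.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed pattern, inferred from the example tree:
# prov-<label>[_<entity>-<label>...]_<suffix>.json
PROV_FILE = re.compile(
    r"^prov-(?P<label>[A-Za-z0-9]+)"
    r"(?:_[A-Za-z0-9]+-[A-Za-z0-9]+)*"
    r"_(?P<suffix>act|ent|env|soft)\.json$"
)


def group_provenance_files(prov_dir):
    """Map each prov label to {suffix: [paths]} for files under prov_dir."""
    groups = defaultdict(lambda: defaultdict(list))
    # rglob also visits optional subdirectories such as prov-preprocspm/.
    for path in sorted(Path(prov_dir).rglob("prov-*.json")):
        match = PROV_FILE.match(path.name)
        if match:
            groups[match["label"]][match["suffix"]].append(path)
    return {label: dict(by_suffix) for label, by_suffix in groups.items()}
```

Applied to the tree above, the result would contain the keys "preprocspm" and "preprocfsl", each mapping suffixes such as "act" to the matching files.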

Activities

Activities are transformations that have been applied to data.

Each file with an act suffix is a JSON file describing activities. It MUST include the following key:

Key name Requirement Level Data type Description
Activities REQUIRED array of objects Objects describing activities.

Each object in the Activities array includes the following keys:

Key name Requirement Level Data type Description
Id REQUIRED string Identifier for the activity (see the Consistency and uniqueness of identifiers section).
Label REQUIRED string Name for the activity.
Command REQUIRED string or null Command (or commands) used to run the tool, including all parameters.
Set to null to describe that the activity was performed manually.
AssociatedWith OPTIONAL string or array of strings Identifier(s) of the software package(s) used to compute the activity.
Related software MUST be described as specified in the Software section.
Used OPTIONAL string or array of strings Identifier(s) of the provEntity/provEntities or environment(s) used by the activity.
Related provEntities MUST be described as specified in the ProvEntities section.
Related environment(s) MUST be described as specified in the Environments section.
Type OPTIONAL string Term from a controlled vocabulary that more specifically describes the activity.
StartedAtTime OPTIONAL string Timestamp tracking when the activity started.
EndedAtTime OPTIONAL string Timestamp tracking when the activity ended.

Example: description of an activity in a prov/[<subdir>/]prov-<label>_act.json file

{
    "Activities": [
        {
            "Id": "bids::prov#conversion-00f3a18f",
            "Label": "Dicom to Nifti conversion",
            "Command": "dcm2niix -o . -f sub-%i/anat/sub-%i_T1w sourcedata/dicoms",
            "AssociatedWith": "bids::prov#dcm2niix-khhkm7u1",
            "Used": [
                "bids::prov#fedora-uldfv058",
                "bids::sourcedata/dicoms"
            ],
            "StartedAtTime": "2025-03-13T10:26:00",
            "EndedAtTime": "2025-03-13T10:26:05"
        }
    ]
}
This snippet is similar to Activities described in the DICOM to Nifti conversion with dcm2niix example.
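
The requirement levels in the table above lend themselves to a simple check. The following Python sketch is one possible way to verify a single activity object; it covers only the REQUIRED keys and the "string or array of strings" fields, and the names as_list and check_activity are hypothetical.

```python
REQUIRED_ACTIVITY_KEYS = ("Id", "Label", "Command")


def as_list(value):
    """Normalize a 'string or array of strings' field to a list."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    # A bare string (or a wrongly typed value) becomes a one-item list.
    return [value]


def check_activity(activity):
    """Return a list of problems found in one activity object."""
    problems = [
        f"missing required key: {key}"
        for key in REQUIRED_ACTIVITY_KEYS
        if key not in activity
    ]
    command = activity.get("Command")
    # Command may be null (None) to describe a manual activity.
    if command is not None and not isinstance(command, str):
        problems.append("Command must be a string or null")
    for key in ("AssociatedWith", "Used"):
        if not all(isinstance(item, str) for item in as_list(activity.get(key))):
            problems.append(f"{key} must be a string or an array of strings")
    return problems
```

An empty list means that no problem was detected by this sketch; it does not establish that the object is valid against the full specification.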

Software

Software are software packages that computed the activities.

Each file with a soft suffix is a JSON file describing software. It MUST include the following key:

Key name Requirement Level Data type Description
Software REQUIRED array of objects Objects describing software.

Each object in the Software array includes the following keys:

Key name Requirement Level Data type Description
Id REQUIRED string Identifier for the software package (see the Consistency and uniqueness of identifiers section).
Label REQUIRED string Name of the software package.
Version REQUIRED string Version of the software package.
AltIdentifier OPTIONAL string URI of the RRID for the software package (cf. https://rrid.site/).
ActedOnBehalfOf OPTIONAL string Identifier of another software package that was responsible for using the software package in the context of the activities associated to it.
Example: if software A launches software B to perform activity C, then B ActedOnBehalfOf A.
Related software MUST be described as specified in the Software section.

Example: description of a software package in a prov/[<subdir>/]prov-<label>_soft.json file

{
    "Software": [
        {
            "Id": "bids::prov#dcm2niix-khhkm7u1",
            "AltIdentifier": "RRID:SCR_023517",
            "Label": "dcm2niix",
            "Version": "v1.0.20220720"
        }
    ]
}
This is a snippet from the DICOM to Nifti conversion with dcm2niix example.

Environments

Environments are software environments in which activities were performed.

Each file with an env suffix is a JSON file describing environments. It MUST include the following key:

Key name Requirement Level Data type Description
Environments REQUIRED array of objects Objects describing environments.

Each object in the Environments array includes the following keys:

Key name Requirement Level Data type Description
Id REQUIRED string Identifier for the environment (see the Consistency and uniqueness of identifiers section).
Label REQUIRED string Name for the environment.
EnvVars OPTIONAL object Object containing environment variables as key-value pairs.
OperatingSystem OPTIONAL string Name of the operating system for the environment.
Dependencies OPTIONAL object Object containing names of the software dependencies as keys and their versions as values.

Example: description of an environment in a prov/[<subdir>/]prov-<label>_env.json file

{
    "Environments": [
        {
            "Id": "bids::prov#fedora-uldfv058",
            "Label": "Fedora release 36 (Thirty Six)",
            "OperatingSystem": "GNU/Linux 6.2.15-100.fc36.x86_64"
        }
    ]
}
This is a snippet from the DICOM to Nifti conversion with dcm2niix example.

ProvEntities

ProvEntities are input or output data for activities.

Note

This corresponds to Entities in W3C Prov; the prefix "Prov" is used here to disambiguate from BIDS entities.

Each file with an ent suffix is a JSON file describing provEntities.

Warning

These files SHOULD NOT describe files that are available in the dataset. See Provenance of a BIDS file for this purpose.

These files SHOULD NOT describe the current dataset. See Provenance of a BIDS dataset for this purpose.

Each file MUST include the following key:

Key name Requirement Level Data type Description
ProvEntities REQUIRED array of objects Objects describing provEntities.

Each object in the ProvEntities array includes the following keys:

Key name Requirement Level Data type Description
Id REQUIRED string Identifier for the provEntity (see the Consistency and uniqueness of identifiers section).
Label REQUIRED string Name for the provEntity.
Digest RECOMMENDED object Object containing digests of the provEntity. Each key in the object MUST be the name of a checksum function if present in this list: MD5; SHA1; SHA-224; SHA-256; SHA-384; SHA-512; SHA3-224; SHA3-256; SHA3-384; SHA3-512; BLAKE2B-256; BLAKE3-256; SHAKE128; SHAKE256. Otherwise, the key MAY be an arbitrary label. The corresponding value is the checksum as computed by the function identified by the key.
AtLocation OPTIONAL string For input files, this is the relative path to the file on disk.
GeneratedBy OPTIONAL string or array of strings Identifier(s) of the activity/activities responsible for the creation of the data.
Related activities MUST be described as specified in the Activities section.
Type OPTIONAL string Term from a controlled vocabulary that more specifically describes the provEntity.

Example: description of a provEntity in a prov/[<subdir>/]prov-<label>_ent.json file

{
    "ProvEntities": [
        {
            "Id": "bids::prov#provEntity-9rfe8szz",
            "Label": "sub-01_task-tonecounting_bold.nii",
            "AtLocation": "sub-01/func/sub-01_task-tonecounting_bold.nii",
            "GeneratedBy": "bids::prov#realign-acea8093",
            "Digest": {
                "sha256": "a4e801438b9c36df010309c94fc4ef8b07d95e7d9cb2edb8c212a5e5efc78d90"
            }
        }
    ]
}
This is a snippet from the fMRI preprocessing with SPM example.

Provenance description file

Template:

prov/
    provenance.tsv
    provenance.json

The purpose of this RECOMMENDED file is to describe properties of provenance files. It MUST contain the column provenance_label, which MUST consist of prov-<label> values identifying one row for each prov entity in the dataset, followed by an optional column containing a description for the entity. Each entity MUST be described by one and only one row.

We RECOMMEND using the columns below and, when they are used, the following values for them:

Column name Requirement Level Data type Description
provenance_label REQUIRED string An identifier of the form prov-<label>, matching a prov entity found in the dataset. There MUST be exactly one row for each prov-<label> entity.

Values in provenance_label MUST be unique.

This column must appear first in the file.
description OPTIONAL string Free-form text description of the provenance file(s).

This column may appear anywhere in the file.
Additional Columns OPTIONAL n/a Additional columns are allowed.

Throughout BIDS you can indicate missing values with n/a (for "not available").

provenance.tsv example:

provenance_label description
prov-preprocspm Provenance of preprocessing performed with SPM.
prov-preprocfsl Provenance of preprocessing performed with FSL.

It is RECOMMENDED to accompany each provenance.tsv file with a sidecar provenance.json file to describe the TSV column names and properties of their values (see also the section on tabular files).

provenance.json example:

{
    "description": {
        "Description": "Description of the provenance file(s)."
    }
}
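
The constraints on provenance.tsv (provenance_label as first column, unique values of the form prov-<label>) can be checked with a few lines of code. The Python sketch below is an illustration only: check_provenance_tsv is a hypothetical helper, and the assumption that <label> is alphanumeric follows the general BIDS convention for labels rather than a statement in this section.

```python
import csv
import io
import re

# Assumption: <label> is restricted to letters and digits.
LABEL = re.compile(r"^prov-[A-Za-z0-9]+$")


def check_provenance_tsv(text):
    """Return a list of problems found in the text of a provenance.tsv file."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = list(reader)
    if not rows or not rows[0] or rows[0][0] != "provenance_label":
        return ["first column must be provenance_label"]
    problems, seen = [], set()
    # Data rows start on line 2, right after the header line.
    for number, row in enumerate(rows[1:], start=2):
        label = row[0] if row else ""
        if not LABEL.match(label):
            problems.append(f"line {number}: {label!r} is not of the form prov-<label>")
        if label in seen:
            problems.append(f"line {number}: duplicate label {label!r}")
        seen.add(label)
    return problems
```

The sketch does not verify that each label matches provenance files actually present in the dataset; that check would require listing the prov/ directory as well.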

Consistency and uniqueness of identifiers

The following rules and conventions are provided in order to have consistent, human readable, and explicit IRIs as identifiers for JSON objects related to provenance.

Identifiers for provEntities

The identifier for a provEntity which is a BIDS file or a BIDS dataset MUST be a BIDS URI. The identifier for a provEntity which is a no longer existing BIDS file or BIDS dataset SHOULD be a BIDS URI with a fragment part.

Warning

The use of BIDS URIs may require defining the DatasetLinks object in dataset_description.json.

The identifier for a provEntity in a BIDS dataset <dataset-name> MAY have the following form, where <label> is an arbitrary value for identifying the provEntity.

bids:[<dataset-name>]:prov#provEntity-<label>

Examples of identifiers for provEntities

  • bids:ds001734:sub-002/anat/sub-002_T1w.nii - identifier for a T1w file for subject sub-002 in the ds001734 dataset;
  • bids::sub-014/func/sub-014_task-MGT_run-01_events.tsv - identifier for an events file for subject sub-014 in the current dataset;
  • bids:fmriprep:sub-001/func/sub-001_task-MGT_run-01_bold_space-MNI152NLin2009cAsym_preproc.nii.gz - identifier for a bold file for subject sub-001 in the fmriprep dataset;
  • bids::prov#provEntity-acea8093 - identifier for a file that is not available in the dataset.

Identifiers for other objects

The identifier for an activity, software, or environment described in a BIDS dataset <dataset-name> SHOULD have the following form, where <label> is a human readable name for coherently identifying the object and <uid> is a unique group of characters.

bids:[<dataset-name>]:prov#<label>-<uid>

The uniqueness of this identifier MUST be used to distinguish any activity, software, or environment that are different in any of their attributes.

Examples of identifiers for activities, environments and software

  • bids:ds001734:prov#conversion-xfMMbHK1 - a conversion activity described inside the ds001734 dataset;
  • bids::prov#fedora-uldfv058 - a Fedora based environment described inside the current dataset;
  • bids:preprocessing:prov#fmriprep-r4kzzMt8 - the fMRIPrep software described inside the preprocessing dataset.
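
Identifiers of the form bids:[<dataset-name>]:prov#<label>-<uid> can be generated and parsed mechanically. The Python sketch below is one possible approach under stated assumptions: the specification does not prescribe how <uid> is produced, so the 8-character random alphanumeric string used here only imitates the examples, and make_identifier and parse_identifier are hypothetical names.

```python
import re
import secrets
import string

_ALPHABET = string.ascii_letters + string.digits
# Matches bids:[<dataset-name>]:prov#<label>-<uid>; the uid is the part
# after the last hyphen, so labels may themselves contain hyphens.
_PROV_ID = re.compile(
    r"^bids:(?P<dataset>[^:]*):prov#(?P<label>.+)-(?P<uid>[^-]+)$"
)


def make_identifier(label, dataset="", uid_length=8):
    """Build an identifier; an empty dataset name refers to the current dataset."""
    uid = "".join(secrets.choice(_ALPHABET) for _ in range(uid_length))
    return f"bids:{dataset}:prov#{label}-{uid}"


def parse_identifier(identifier):
    """Split an identifier into dataset, label and uid, or return None."""
    match = _PROV_ID.match(identifier)
    return match.groupdict() if match else None
```

A random uid makes collisions unlikely but does not guarantee uniqueness; a tool relying on this sketch would still need to check new identifiers against those already present in the dataset.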

Provenance from an RDF perspective

Objects describing provenance as defined in this specification can be aggregated into JSON-LD files, which makes it possible to represent provenance as an RDF graph (see Resource Description Framework (RDF)).

Minimal provenance graph

flowchart BT
    B[Brain extraction] -->|wasAssociatedWith| S{FSL<br>}
    B -->|used| T1([sub-001_T1w.nii])
    B -->|used| L((Linux))
    T1p([sub-001_T1w_preproc.nii]) -->|wasGeneratedBy| B

In this example, a brain extraction algorithm was applied on a T1-weighted image:

  • sub-001_T1w.nii is the original T1-weighted image;
  • sub-001_T1w_preproc.nii is the skull-stripped image;
  • the "Brain extraction" activity was performed using the FSL software within a Linux software environment.
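
To make the mapping between the JSON objects and the graph more concrete, the following Python sketch derives the edges of such a graph from activity and provEntity objects. It is an illustration, not the aggregation procedure used by the BIDS examples: the relation names are taken from the flowchart above, and provenance_edges is a hypothetical helper.

```python
def as_list(value):
    """Normalize a 'string or array of strings' field to a list."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def provenance_edges(activities=(), entities=()):
    """Return (subject, relation, object) triples mirroring the graph above."""
    edges = []
    for activity in activities:
        for software_id in as_list(activity.get("AssociatedWith")):
            edges.append((activity["Id"], "wasAssociatedWith", software_id))
        # Used may reference both provEntities and environments.
        for used_id in as_list(activity.get("Used")):
            edges.append((activity["Id"], "used", used_id))
    for entity in entities:
        for activity_id in as_list(entity.get("GeneratedBy")):
            edges.append((entity["Id"], "wasGeneratedBy", activity_id))
    return edges
```

For the brain extraction example, this would yield one wasAssociatedWith edge, two used edges, and one wasGeneratedBy edge, matching the four arrows of the flowchart.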

Moreover, the terms defined in this specification to describe provenance are based on the W3C Prov standard. They can be resolved to IRIs using the JSON-LD context file provenance-context.json provided with this specification.

All BIDS examples related to provenance (see bids-examples, provenance section) show the aggregated version of the provenance metadata they contain. This comes as a JSON-LD file and a visualization of the graph.

Minimal examples

Provenance of a BIDS raw dataset

Example

This section shows a snippet from the Provenance of DICOM to Nifti conversion with dcm2niix example.

In this example, we explain provenance metadata of a DICOM to Nifti conversion with dcm2niix. Consider the following BIDS raw dataset:

├─ prov/
│  ├─ prov-dcm2niix_act.json 
│  ├─ prov-dcm2niix_soft.json 
│  └─ ... 
├─ sourcedata/
│  └─ dicoms/
│     └─ ... 
├─ sub-001/
│  └─ anat/
│     ├─ sub-001_T1w.json 
│     └─ sub-001_T1w.nii.gz 
└─ ... 

The prov/prov-dcm2niix_soft.json file describes dcm2niix, the software package used for the DICOM conversion. As per the Consistency and uniqueness of identifiers section, the identifier for the associated software object SHOULD start with bids:<dataset>:prov# (bids:: refers to the current dataset).

{
  "Software": [
    {
      "Id": "bids::prov#dcm2niix-khhkm7u1",
      "Label": "dcm2niix",
      ...
    }
  ]
}

The prov/prov-dcm2niix_act.json file describes the conversion activity. Note that the identifier for the previously described software package is used here to describe that the software package was used to compute this activity.

{
    "Activities": [
        {
            "Id": "bids::prov#conversion-00f3a18f",
            "Label": "Conversion",
            "AssociatedWith": "bids::prov#dcm2niix-khhkm7u1",
            ...
        }
    ]
}

Inside the sub-001/anat/sub-001_T1w.json file, the metadata field GeneratedBy indicates that the sub-001/anat/sub-001_T1w.nii.gz file was generated by the previously described activity.

{
    ...
    "GeneratedBy": "bids::prov#conversion-00f3a18f",
    ...
}
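
The chain of references in this example (sidecar, then activity, then software) can be followed in code. The Python sketch below assumes that the three JSON files have already been loaded into memory; index_by_id and trace_software are hypothetical helper names.

```python
def index_by_id(records):
    """Index a list of provenance objects by their "Id"."""
    return {record["Id"]: record for record in records}


def as_list(value):
    """Normalize a 'string or array of strings' field to a list."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def trace_software(sidecar, activities, software):
    """Return the labels of the software behind the file a sidecar describes."""
    activity_index = index_by_id(activities)
    software_index = index_by_id(software)
    labels = []
    for activity_id in as_list(sidecar.get("GeneratedBy")):
        activity = activity_index[activity_id]
        for software_id in as_list(activity.get("AssociatedWith")):
            labels.append(software_index[software_id]["Label"])
    return labels
```

With the contents of sub-001_T1w.json, prov-dcm2niix_act.json ("Activities" array) and prov-dcm2niix_soft.json ("Software" array) shown above, the function would return a list containing "dcm2niix". An identifier that is not described anywhere raises a KeyError in this sketch.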

Provenance of a BIDS derivative dataset

Example

This section shows a snippet from the Provenance of fMRI preprocessing with SPM example.

In this example, we explain provenance metadata of fMRI preprocessing steps performed with SPM. Consider the following BIDS derivative dataset:

├─ prov/
│  ├─ prov-spm_act.json 
│  ├─ prov-spm_ent.json 
│  └─ ... 
├─ sub-01/
│  ├─ anat/
│  │  ├─ c1sub-001_T1w.json 
│  │  ├─ c1sub-001_T1w.nii 
│  │  ├─ ... 
│  │  ├─ sub-001_T1w.json 
│  │  └─ sub-001_T1w.nii 
│  └─ func/
│     └─ ... 
└─ ... 

The prov/prov-spm_act.json file describes the preprocessing steps (activities) as JSON objects. Among them:

  • the bids::prov#movefile-bac3f385 activity needed a T1w file from the ds000011 dataset identified by bids:ds000011:sub-01/anat/sub-01_T1w.nii.gz;
  • the bids::prov#segment-7d5d4ac5 brain segmentation activity needed the two files listed inside the Used array.
{
    "Activities": [
        {
            "Id": "bids::prov#movefile-bac3f385",
            "Label": "Move file",
            "Used": [
                "bids:ds000011:sub-01/anat/sub-01_T1w.nii.gz"
            ],
            ...
        },
        ...
        {
            "Id": "bids::prov#segment-7d5d4ac5",
            "Label": "Segment",
            "Used": [
                "urn:c1d082a5-34ee-4282-99df-28c0ba289210",
                "bids::sub-01/anat/sub-01_T1w.nii"
            ],
            ...
        },
        ...
    ]
}

bids::sub-01/anat/sub-01_T1w.nii is a BIDS file available in the current dataset. The spm12/tpm/TPM.nii file is not inside the dataset; hence its description is stored inside prov/prov-spm_ent.json and its identifier is not a BIDS URI:

{
    "ProvEntities": [
        ...
        {
          "Id": "urn:c1d082a5-34ee-4282-99df-28c0ba289210",
          "Label": "TPM.nii",
          "AtLocation": "spm12/tpm/TPM.nii",
          ...
        },
        ...
    ]
}

Inside the sub-001/anat/c1sub-001_T1w.json file, the metadata field GeneratedBy indicates that the sub-001/anat/c1sub-001_T1w.nii file was generated by the previously described brain segmentation activity.

{
    "GeneratedBy": "bids::prov#segment-7d5d4ac5",
    ...
}

Provenance of a BIDS study dataset

Example

This section shows a snippet from the Provenance of manual annotations example.

In this example, we explain provenance metadata of brain segmentation performed by two experts on the same T1w file. Consider the following BIDS study dataset:

├─ dataset_description.json 
├─ derivatives/
│  └─ seg/
│     ├─ dataset_description.json 
│     ├─ descriptions.tsv 
│     ├─ ... 
│     ├─ prov/
│     │  ├─ provenance.tsv 
│     │  ├─ prov-seg_desc-exp1_act.json 
│     │  ├─ prov-seg_desc-exp1_soft.json 
│     │  ├─ prov-seg_desc-exp2_act.json 
│     │  ├─ prov-seg_desc-exp2_soft.json 
│     │  └─ prov-seg_ent.json 
│     └─ sub-001/
│        ├─ sub-001_space-orig_desc-exp1_dseg.json 
│        ├─ sub-001_space-orig_desc-exp1_dseg.nii.gz 
│        ├─ sub-001_space-orig_desc-exp2_dseg.json 
│        └─ sub-001_space-orig_desc-exp2_dseg.nii.gz 
├─ ... 
└─ sourcedata/
   └─ raw/
      ├─ dataset_description.json 
      └─ sub-001/
         ├─ sub-001_T1w.json 
         └─ sub-001_T1w.nii.gz 

Inside the dataset_description.json file of the seg derivative dataset, the DatasetLinks metadata field defines an alias that is needed to refer to the raw dataset using BIDS URIs.

{
    ...
    "DatasetLinks": {
        "raw": "../../sourcedata/raw"
    }
}
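
The role of DatasetLinks can be illustrated by resolving a BIDS URI to a path. The Python sketch below assumes that DatasetLinks values are relative local paths, as in this example (remote locations are not handled), and that the URI designates a file rather than a prov# fragment; resolve_bids_uri is a hypothetical helper.

```python
import posixpath


def resolve_bids_uri(uri, dataset_links, dataset_root="."):
    """Resolve a BIDS URI to a path, using DatasetLinks for named datasets."""
    # A BIDS URI has the form bids:<dataset-name>:<relative-path>.
    scheme, dataset, relative = uri.split(":", 2)
    if scheme != "bids":
        raise ValueError(f"not a BIDS URI: {uri}")
    if dataset == "":
        # An empty dataset name refers to the current dataset.
        base = dataset_root
    else:
        base = posixpath.join(dataset_root, dataset_links[dataset])
    return posixpath.normpath(posixpath.join(base, relative))
```

For instance, with dataset_root set to the location of the seg derivative dataset, the URI bids:raw:sub-001/anat/sub-001_T1w.nii.gz resolves to a path under sourcedata/raw.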

The prov/prov-seg_desc-exp1_act.json file describes the activity during which expert #1 generated the brain segmentation.

{
    "Activities": [
        {
            "Id": "bids::prov#segmentation-nO5RGsrb",
            "Label": "Semi-automatic brain segmentation",
            "Command": "itk-snap sourcedata/raw/sub-001/anat/sub-001_T1w.nii.gz",
            "AssociatedWith": [
                "bids::prov#itksnap-Lfs6FRMn"
            ],
            "Used": [
                "bids:raw:sub-001/anat/sub-001_T1w.nii.gz"
            ]
        }
    ]
}

Note that a description of the sub-001/anat/sub-001_T1w.nii.gz file is needed inside derivatives/seg/prov/prov-seg_ent.json because this data file is not in the derivatives/seg dataset.

Under the derivatives/seg dataset, the sub-001_space-orig_desc-exp1_dseg.json file describes that this activity generated the sub-001_space-orig_desc-exp1_dseg.nii.gz file.

{
    "GeneratedBy": "bids::prov#segmentation-nO5RGsrb"
}

The derivatives/seg/prov/provenance.tsv file describes the provenance files grouped under prov-seg.

provenance_label    description
prov-seg   Manual brain segmentation performed by two experts

The descriptions.tsv file describes the desc entities used for both provenance files and data files.

desc_id    description
desc-exp1   Files generated by expert #1
desc-exp2   Files generated by expert #2