Thursday, June 16, 2016

How to meet the new data sharing requirements of NIMH

The National Institute of Mental Health (NIMH) has recently mandated that data collected in all clinical trials it sponsors be uploaded to the NIMH Data Archive (NDA). Similar policies are not in place for many of its grant calls. This initiative differs from previous NIH attempts to get more data shared. In contrast to the "data management plans" that have to be included in all NIH grants, which historically remained unimplemented without any consequences for the grantees, this new policy has teeth. Folks at NDA have access to all ongoing grants and are motivated to go after researchers who are late with their data submission. Since there is nothing scarier than an angry grant officer, it's worth taking this new policy seriously!

In this brief guide I'll describe how to prepare your neuroimaging data for the NDA submission with minimal fuss.


Minimal required data

NDA requires each study to collect and share a small set of values for all subjects and scans:
  1. Name, surname, date of birth, and place of birth. This data will not be shared with the NDA, but it is required to generate unique IDs for your subjects. Those IDs are used by NDA to link participants across studies. Therefore you should collect this data and keep it in a safe place, linked to the IDs you use internally. You will use this data during the submission process to generate NDA compatible IDs.
  2. Age (in months at the time of scan) and sex (male or female). This is the minimal demographic information you will need to collect.
  3. Repetition time, echo time, flip angle, scanner manufacturer, model, and field strength, and date of acquisition for all of your neuroimaging files (T1s, fMRI, DWI etc.)
  4. bvec and bval files if you include diffusion files.
  5. Slice timing if your dataset includes fMRI data.
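
Age in months is the one value in the list above that you usually have to compute yourself. Here is a minimal sketch, assuming you have the date of birth and the scan date for each participant. The function name is my own, and it counts completed months only; check the NDA data dictionary for the exact rounding convention it expects before relying on it.

```python
from datetime import date


def age_in_months(date_of_birth: date, scan_date: date) -> int:
    """Return the number of completed months between birth and scan."""
    if scan_date < date_of_birth:
        raise ValueError("scan date precedes date of birth")
    months = (scan_date.year - date_of_birth.year) * 12
    months += scan_date.month - date_of_birth.month
    # The current month only counts once its day-of-month has been reached.
    if scan_date.day < date_of_birth.day:
        months -= 1
    return months


if __name__ == "__main__":
    # Hypothetical participant, not real data.
    print(age_in_months(date(1990, 3, 15), date(2016, 6, 16)))
```

Computing the value from stored dates at submission time, rather than typing it in by hand at the scanner, keeps it consistent with the date of acquisition you report for each scan.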

Data organization

For organizing your data after acquisition I recommend using the Brain Imaging Data Structure (BIDS). It's an intuitive file organization scheme that will make it easier to analyze your data later thanks to the growing set of tools that use it (such as mriqc, FMRIPREP or AA). You can use tools like dcm2niix (to convert from DICOM to NIFTI and extract metadata) and/or heudiconv (batch processing and sorting DICOMs from many scans). Data required by NDA can be included in your BIDS dataset in the following places:
  1. Age and sex can be included as columns in the participants.tsv file. If your data comes from a longitudinal study you can include age and sex in the _sessions.tsv files (one per subject) to specify the values for each session independently.
  2. If you use dcm2niix you should have almost all of the required metadata values and extra files (bval and bvec). Double check scanner model, field strength and flip angle.
  3. Finally, the date of acquisition can be inserted as the acq_time column in the _scans.tsv files.
When you are done organizing your data use the BIDS Validator to check if everything is ok.
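
To double check the metadata mentioned above, something like the following sketch can help. It assumes the scan parameters live in the JSON sidecars that dcm2niix writes next to each NIFTI file and that they use the standard BIDS key names listed in REQUIRED_KEYS. The function name is my own, and the sketch ignores values inherited from sidecars placed higher up in the dataset, so treat a reported gap as a prompt to look, not as a verdict.

```python
import json
from pathlib import Path

# BIDS sidecar keys assumed to cover the scan parameters NDA asks for.
REQUIRED_KEYS = (
    "RepetitionTime",
    "EchoTime",
    "FlipAngle",
    "Manufacturer",
    "ManufacturersModelName",
    "MagneticFieldStrength",
)


def missing_metadata(bids_root):
    """Map each JSON sidecar under sub-* to the required keys it lacks."""
    root = Path(bids_root)
    problems = {}
    for sidecar in sorted(root.glob("sub-*/**/*.json")):
        metadata = json.loads(sidecar.read_text())
        missing = [key for key in REQUIRED_KEYS if key not in metadata]
        if missing:
            problems[sidecar.relative_to(root).as_posix()] = missing
    return problems


if __name__ == "__main__":
    # "my_bids_dataset" is a placeholder path.
    for name, keys in missing_metadata("my_bids_dataset").items():
        print(name, "is missing", ", ".join(keys))
```

This does not replace the BIDS Validator; it only looks at the handful of fields that matter for the NDA submission.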

When the data acquisition is completed and the validator passes all of the checks, it is a good habit to make the folder with the dataset read only. This will prevent any accidental deletion or modification of the data down the road.
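
One way to do this is to strip the write permission bits from the dataset folder and everything inside it. The sketch below assumes a POSIX file system and that you own the files; the function name is my own. On Windows, or on storage with its own access controls, you will need a different mechanism.

```python
import os
import stat
from pathlib import Path

# Write permission for owner, group and everyone else.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(dataset_root):
    """Remove write permission from a dataset folder and everything in it."""
    root = Path(dataset_root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames + dirnames:
            path = Path(dirpath) / name
            if path.is_symlink():
                # Leave symlinks alone: chmod would change their target.
                continue
            path.chmod(path.stat().st_mode & ~WRITE_BITS)
    # Finally lock the top-level folder itself.
    root.chmod(root.stat().st_mode & ~WRITE_BITS)


if __name__ == "__main__":
    # "my_bids_dataset" is a placeholder path.
    make_read_only("my_bids_dataset")
```

Keep in mind that the owner of the files can always restore the write bits, so this protects against accidents, not against intent.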

Submission to NDA

Using BIDS makes the submission process very easy and requires very little manual data wrangling.
  1. Use the GUID Tool to generate GUIDs for each of your subjects. This will require providing name, date of birth and place of birth and will result in a file mapping from the IDs you use internally to GUIDs. Make sure you keep this data safe - it includes personal information!
  2. Create the NDA submission package using the BIDS2NDA tool. This command line tool takes three arguments: your BIDS dataset, the GUID mapping file and the folder where the NDA submission package will be stored.
  3. Submit the data using the NDA Validation and Submission tool. If the BIDS2NDA tool worked correctly, the NDA Validator should not return any errors and you should be able to submit your dataset without problems.
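
Before running step 2 it can be worth checking that every subject in the BIDS dataset has a GUID. The sketch below rests on several assumptions of mine: that the mapping file has one line per subject in the form "<participant_id> - <GUID>", that the participant ID may appear with or without the "sub-" prefix, and that the tool is installed as an executable called bids2nda taking its three arguments in the order listed above. Check the BIDS2NDA documentation for the actual file format and usage; the helper names are hypothetical.

```python
from pathlib import Path


def read_guid_mapping(mapping_file):
    """Parse lines assumed to look like '<participant_id> - <GUID>'."""
    mapping = {}
    for line in Path(mapping_file).read_text().splitlines():
        if not line.strip():
            continue
        participant_id, separator, guid = line.partition(" - ")
        if not separator:
            raise ValueError(f"unexpected line in mapping file: {line!r}")
        mapping[participant_id.strip()] = guid.strip()
    return mapping


def subjects_without_guid(bids_root, mapping):
    """List sub-* folders in the BIDS dataset that have no GUID entry."""
    subjects = sorted(
        path.name for path in Path(bids_root).glob("sub-*") if path.is_dir()
    )
    return [
        subject
        for subject in subjects
        # Accept IDs written with or without the "sub-" prefix (assumption).
        if subject not in mapping and subject[len("sub-"):] not in mapping
    ]


def bids2nda_command(bids_root, mapping_file, output_dir):
    """Build the assumed three-argument call without running it."""
    return ["bids2nda", str(bids_root), str(mapping_file), str(output_dir)]


if __name__ == "__main__":
    # All three paths are placeholders.
    mapping = read_guid_mapping("guid_mapping.txt")
    missing = subjects_without_guid("my_bids_dataset", mapping)
    if missing:
        print("No GUID for:", ", ".join(missing))
    else:
        command = bids2nda_command(
            "my_bids_dataset", "guid_mapping.txt", "nda_package"
        )
        print(" ".join(command))
```

The script only prints the command instead of running it, so you can review it first. Remember that the mapping file ties your internal IDs to GUIDs derived from personal information, so keep it out of the dataset folder and out of version control.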

I hope that this guide will convince you that submitting data to NDA (and thus fulfilling your grant requirements) can be a relatively straightforward process. One thing worth keeping in mind is that some of those steps require a little bit of planning (for example, remembering to collect the place of birth of each of your participants). This guide also covers only the neuroimaging data - to submit other data types (such as questionnaires or clinical assessments) you will have to use the data dictionaries provided by NDA.

The BIDS2NDA tool is still under active development - please submit an Issue on GitHub if you find a bug.