Version: 8.x

Step-by-step guide to organize and submit SPARC datasets with SODA for SPARC

Prepare and submit SPARC datasets with SODA

The typical process for submitting your SPARC dataset consists of organizing your data according to the SPARC Data Structure (SDS), adding metadata files, uploading everything to the Pennsieve data platform (where more metadata needs to be added), and finally sharing the dataset with the SPARC Curation Team, who will review it for compliance. Once the dataset is approved by the Curation Team, you will have to share it as an embargoed dataset, and it will become accessible to all members of the SPARC Consortium through Pennsieve. Once the embargo period is over (one year after initial upload or after publication of the related manuscript(s), whichever comes first), you will have to publish your dataset, and it will then become publicly accessible through the SPARC Data Portal.

We describe below the suggested workflow for preparing and submitting your SPARC datasets with SODA. All these steps are mandatory (unless marked otherwise) if you wish to satisfy the SPARC requirements.

A. Preliminary Steps

These steps only need to be completed once.

  • Download and install SODA
  • All SPARC datasets must be uploaded to the Pennsieve data platform. Get access to Pennsieve as well as the SPARC Consortium organization on Pennsieve by filling out this form. We also suggest requesting access to the SPARC Airtable sheet through the same form, as it will come in handy when you prepare your SPARC metadata files.
  • Download and install the Pennsieve agent required to upload files through SODA
  • Watch our quick video to familiarize yourself with the user interface of SODA (note: optional but recommended)
  • Read about the SPARC requirements for organizing and sharing datasets to familiarize yourself with the process (note: optional but recommended)

B. Prepare Dataset on Pennsieve

The SPARC guidelines require each dataset to have specific metadata on Pennsieve. We recommend starting with this step so that everything is set up on Pennsieve when you are ready to upload your data and metadata files (Step D). This metadata can easily be added to Pennsieve through SODA.

C. Prepare SPARC Metadata Files

The SPARC guidelines require each dataset to have specific metadata files, as described by the SPARC Data Structure (SDS). These metadata files can be conveniently prepared through SODA.

D. Organize Dataset According to the SPARC Data Structure

All SPARC datasets must be organized according to the SPARC Data Structure (SDS). Briefly, all data must be placed in the following six high-level folders: primary, source, derivative, code, protocol, and docs. Each of these folders must have a manifest metadata file that summarizes the content of the folder. Additionally, all the metadata files created during Step C must be located at the highest level of the dataset, alongside the high-level folders. SODA provides an intuitive interface for organizing your dataset according to the SDS and uploading it to Pennsieve with automatically generated manifest files.
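To make the layout rules above concrete, here is a minimal Python sketch that checks a local dataset folder against them. It is only an illustration and not part of SODA: the function name check_layout is hypothetical, and the sketch assumes that a manifest file is any file named "manifest" regardless of extension. It does not validate the top-level metadata files or the content of the manifests.

```python
from pathlib import Path

# The six high-level folder names listed in the text above.
SDS_FOLDERS = {"primary", "source", "derivative", "code", "protocol", "docs"}

# Assumption for this sketch: a manifest is a file whose name
# (without extension) is "manifest".
MANIFEST_STEM = "manifest"


def check_layout(dataset_root):
    """Return a list of human-readable problems found in the dataset layout."""
    root = Path(dataset_root)
    problems = []
    for entry in sorted(root.iterdir()):
        if not entry.is_dir():
            # Top-level files are assumed to be metadata files; not validated here.
            continue
        if entry.name not in SDS_FOLDERS:
            problems.append(f"unexpected top-level folder: {entry.name}")
            continue
        # Each recognized high-level folder needs its own manifest file.
        has_manifest = any(
            child.is_file() and child.stem == MANIFEST_STEM
            for child in entry.iterdir()
        )
        if not has_manifest:
            problems.append(f"missing manifest file in: {entry.name}")
    return problems
```

Calling check_layout on the path of a dataset folder returns an empty list when every top-level folder is one of the six allowed names and contains a manifest file. SODA generates the manifest files for you, so a check like this is mainly useful for inspecting a dataset that was organized by hand.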

E. Submit Dataset to the Curation Team for Review

Once all the previous steps have been completed, it is time to share your dataset with the SPARC Data Curation Team for review.

F. Post-Curation Steps

These steps must be completed ONLY after your dataset has been approved by the Curation Team.
