The Figshare report released yesterday, The State of Open Data, inspired me to reflect on how researchers at IUPUI are sharing their data. Below, I describe three increasingly common scenarios for data sharing and walk through the same considerations for each: what data, when, how/where, and what permissions.
What data should be shared?
Typically, authors share only the data reported in the article, especially if the research and analysis are still under way.
When will the data be shared?
Publisher data availability policies tend to require that the data are available no later than the date of publication for the associated article.
How/Where can the data be shared?
It depends. Some publishers require deposit in a particular repository, while others require that the data be made available through their own platform. I recommend that authors deposit where the publisher recommends and then submit another copy of the data to IUPUI DataWorks. As we have seen over the last decade or two, publishers merge and platforms change, and supplemental materials are often lost. Large public institutions like IU are far more stable and can provide persistent access to the data for the next 10, 25, or even 50 years.
What will others (researchers, students, educators, policy makers, citizens) be allowed to do with the data?
Some data repositories require you to choose from certain licenses during the deposit process, while others do not specify a license. Generally, there are two approaches to telling potential reusers how they can use your data: licenses and waivers. A CC-BY license is a popular choice because it maximizes what others can do with your data while ensuring that you get credit for creating it. You can also waive all rights to the data, which may be appropriate if your analyses are complete or you want. Since intellectual property policies differ by nation and region, this can be a complex decision for international researchers. For more guidance on choosing a license for your research data, check out the guide from the Digital Curation Centre in the UK (http://www.dcc.ac.uk/resources/how-guides/license-research-data).
Scenario 2: A public health researcher would like to share de-identified data gathered from surveys and electronic medical records (EMR) in a controlled way.
What data should be shared?
Since the data are already de-identified, the researcher could share the full database once the data have been analyzed and results published.
When will the data be shared?
In the absence of funder or publisher requirements, a good general practice is to release the data upon publication of associated results. If multiple publications are planned, it is usually reasonable to release the full dataset once all results are published.
How/Where can the data be shared?
Since a partner in this research project would prefer controlled release of the data, one choice is to deposit the data in our institutional data repository, IUPUI DataWorks. The data will be discoverable through rich metadata and study documentation, but the data files will not be available for automatic download. Potential reusers can use the "Request This" feature to request permission to use the data. Often, this also requires the user to sign a data use agreement. This model is very common in biomedical research, both for data that include personal health information (PHI) and for de-identified human subjects data.
What will others (researchers, students, educators, policy makers, citizens) be allowed to do with the data?
The specific permissions and restrictions are detailed in the Data Use Agreement, but there are many options to choose from.
Scenario 3: A graduate student is required by his/her program to make the data underlying his/her dissertation openly available through deposit in IUPUI DataWorks and an appropriate subject repository, if one exists.
What data should be shared?
All data discussed and reported in the dissertation should be shared, with the exception of sensitive data such as personal health information (PHI) or personally identifiable information (PII), among others.
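As a rough illustration of preparing a shareable copy, the sketch below removes columns holding direct identifiers from a CSV file before deposit. The column names are hypothetical and would need to match the actual dataset. Removing direct identifiers is only a first step and does not by itself make a dataset de-identified, so treat this as a starting point rather than a complete procedure.

```python
import csv
import io

# Hypothetical column names for illustration; adjust to the actual dataset.
SENSITIVE_COLUMNS = {"name", "email", "date_of_birth"}


def drop_sensitive_columns(csv_text, sensitive=SENSITIVE_COLUMNS):
    """Return a copy of csv_text with the listed columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Keep the original column order, minus anything flagged as sensitive.
    kept = [col for col in (reader.fieldnames or []) if col not in sensitive]

    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=kept,
        extrasaction="ignore",  # silently skip the dropped columns in each row
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()
```

For example, passing in text with the columns id, name, and score returns the same rows with only id and score.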
When will the data be shared?
The data should be made available once the student's dissertation is publicly available through IUPUI ScholarWorks.
How/Where can the data be shared?
Deposit into both IUPUI DataWorks and the appropriate subject repository can be done simultaneously (or nearly so).
What will others (researchers, students, educators, policy makers, citizens) be allowed to do with the data?
Again, this depends. See the response to Scenario 1 or check out the guide from the Digital Curation Centre in the UK (http://www.dcc.ac.uk/resources/how-guides/license-research-data).