Data Archiving and Sharing
Comprehensive policies for research data management and accessibility.
Advancing Disease Research Through Open Data
Diseases is committed to promoting research reproducibility and scientific transparency through robust data archiving and sharing practices. Our policies align with international standards and funding agency requirements, ensuring that research data supporting published findings remains accessible for verification, replication, and future discovery. Proper data management accelerates scientific progress while protecting participant privacy and researcher interests.
Data Availability Requirements
Authors submitting manuscripts to Diseases must include a Data Availability Statement describing how the underlying research data can be accessed. This statement should appear at the end of the manuscript, immediately before the reference list. The journal recognizes that data sharing practices vary across disease research disciplines and accommodates the following levels of data accessibility where legitimate constraints apply.
Open Access Data
Datasets deposited in recognized public repositories with persistent identifiers. Authors provide repository names, accession numbers, and direct URLs enabling immediate verification of reported findings.
Controlled Access Data
Sensitive data requiring access applications through institutional data governance committees. Common for patient health information, genetic data, and clinical trial records requiring privacy protections.
Upon Request Access
Data available from corresponding authors upon reasonable request. Appropriate when institutional or legal constraints prevent public deposition but data sharing remains possible for qualified researchers.
Restricted Data
Data that cannot be shared due to participant consent limitations, proprietary constraints, or national security considerations. Authors must clearly explain restrictions and describe alternative verification approaches.
Recommended Repositories
Diseases encourages authors to deposit research data in established discipline-specific or general-purpose repositories that assign persistent identifiers, ensure long-term preservation, and enable data citation. Selecting appropriate repositories ensures data remains discoverable and accessible to the global research community for years following publication.
Disease-Specific Archives
Specialized repositories including GenBank for sequence data, ClinicalTrials.gov for trial registrations and summary results, GISAID for pathogen genomic surveillance data, and disease-specific consortium databases serving particular research communities.
General Purpose Repositories
Figshare, Dryad, Zenodo, and institutional repositories accepting diverse data types. These platforms provide DOI assignment, version control, and integration with publication metadata systems.
Institutional Archives
University and research institution repositories meeting preservation standards. Acceptable when discipline repositories are unavailable and institutional systems provide persistent access guarantees.
Data Citation Standards
Proper data citation acknowledges data creators, enables discovery, and establishes clear provenance chains linking publications to underlying evidence. Authors should cite datasets as formal research outputs using structured formats that include creators, titles, repositories, identifiers, and access dates. Data citations may appear in reference lists or within Data Availability Statements depending on manuscript format.
Citation Format: Creator(s) (Publication Year). Dataset Title. Repository Name. Identifier (DOI or accession number). For example: Smith J, Jones A (2024). Clinical outcomes dataset for respiratory infection study. Figshare. doi:10.6084/m9.figshare.xxxxx
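The citation pattern above can also be assembled programmatically, for instance when building reference lists from repository metadata. The following is a minimal Python sketch under stated assumptions: the function name `format_data_citation` and its parameters are illustrative and not part of any journal tooling, and the identifier is the placeholder from the example above.

```python
def format_data_citation(creators, year, title, repository, identifier):
    """Build a string of the form: Creator(s) (Year). Title. Repository. Identifier"""
    # Join multiple creators with commas, as in the example citation.
    creator_text = ", ".join(creators)
    # Strip a trailing period from the title so the separator is not doubled.
    clean_title = title.rstrip(".")
    return f"{creator_text} ({year}). {clean_title}. {repository}. {identifier}"


# Illustrative values copied from the example citation; the DOI is a placeholder.
citation = format_data_citation(
    creators=["Smith J", "Jones A"],
    year=2024,
    title="Clinical outcomes dataset for respiratory infection study",
    repository="Figshare",
    identifier="doi:10.6084/m9.figshare.xxxxx",
)
print(citation)
```

Keeping the citation elements as separate fields makes it easier to reuse the same metadata in both the reference list and the Data Availability Statement.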
Privacy and Ethical Considerations
Disease research frequently involves sensitive patient information requiring careful handling during data sharing. Authors must ensure that shared data complies with informed consent provisions, institutional review board approvals, and applicable privacy regulations including HIPAA, GDPR, and national health data protection laws. De-identification procedures should follow established standards to prevent re-identification of research participants.
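As a rough illustration of what a basic de-identification pass might involve, the Python sketch below removes direct identifiers, replaces participant IDs with random pseudonyms, and groups ages of 90 and above into a single category. The field names, the `deidentify` function, and the choice of steps are assumptions made for this example. A pass like this is not sufficient on its own to satisfy HIPAA, GDPR, or any other regulation, and pseudonymized data may still count as personal data.

```python
import secrets

# Hypothetical field names; each study would list its own direct identifiers.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "medical_record_number"}


def deidentify(records):
    """Return (shareable_records, id_mapping) for a list of record dicts.

    The id_mapping links original participant IDs to pseudonyms and must be
    stored separately from the shared data, or destroyed.
    """
    id_mapping = {}
    shareable = []
    for record in records:
        original_id = record["participant_id"]
        # Assign one random pseudonym per participant so rows stay linkable.
        if original_id not in id_mapping:
            id_mapping[original_id] = secrets.token_hex(8)
        cleaned = {
            key: value
            for key, value in record.items()
            if key not in DIRECT_IDENTIFIERS and key != "participant_id"
        }
        cleaned["pseudo_id"] = id_mapping[original_id]
        # Group very high ages into one category to reduce re-identification risk.
        age = cleaned.get("age")
        if isinstance(age, int) and age >= 90:
            cleaned["age"] = "90+"
        shareable.append(cleaned)
    return shareable, id_mapping
```

Indirect identifiers such as dates, locations, and rare diagnoses usually need separate review, which this sketch does not attempt.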
When complete datasets cannot be shared due to privacy constraints, authors should consider sharing summary statistics, aggregated data, or synthetic datasets that preserve analytical utility while protecting participant confidentiality. The journal recognizes legitimate privacy concerns and works with authors to identify solutions that balance transparency with participant protection.
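One common way to share aggregated data is to publish group counts while masking small groups that could point to individual participants. The Python sketch below illustrates the idea. The function name `aggregate_with_suppression` and the default threshold of 5 are assumptions for this example, not journal requirements; the appropriate threshold depends on the study and on the applicable data governance rules.

```python
from collections import Counter


def aggregate_with_suppression(records, group_key, min_cell_size=5):
    """Count records per group, masking counts below min_cell_size."""
    counts = Counter(record[group_key] for record in records)
    table = {}
    for group, n in counts.items():
        # Small cells are reported as "<threshold" instead of the exact count.
        table[group] = n if n >= min_cell_size else f"<{min_cell_size}"
    return table


# Illustrative records only; the region labels are invented for the example.
example_records = [{"region": "North"}] * 6 + [{"region": "South"}] * 2
print(aggregate_with_suppression(example_records, "region"))
```

When totals are published alongside suppressed cells, a masked value can sometimes be recovered by subtraction, so additional cells may need to be suppressed as well.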
Long-Term Preservation
Research data supporting published findings should remain accessible for at least ten years following publication to enable verification and replication studies. Authors depositing data in institutional repositories should confirm the repository's preservation commitments and plan how the data would be migrated if the repository ceases operation. The journal maintains records of data availability statements and may contact authors regarding data accessibility for published articles.
Version Control
Repositories should maintain version histories when datasets are updated or corrected. Authors should cite the specific version used in published analyses rather than a generic dataset identifier whose contents may change over time.
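In addition to citing a version-specific identifier, authors may find it useful to record a checksum of each deposited file so that readers can confirm they have the exact version that was analyzed. The Python sketch below computes a SHA-256 checksum using the standard library. The function name `sha256_of_file` is an assumption for this example, and recording checksums is a suggestion here, not a stated journal requirement.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large datasets do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A reader who downloads the dataset later can run the same function and compare the result with the published checksum; a mismatch indicates that the file differs from the one used in the analysis.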
Format Standards
Data should be deposited in non-proprietary formats enabling long-term accessibility. Common formats include CSV for tabular data, XML for structured data, and open image formats such as TIFF or PNG for microscopy and imaging data.
Documentation Requirements
Comprehensive metadata and codebooks accompanying datasets enable interpretation by researchers unfamiliar with original study context. Variable definitions, measurement units, and collection procedures should be clearly documented.
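A codebook can itself be deposited as a plain CSV file alongside the data. The Python sketch below writes one row per variable and refuses to write entries that lack a definition, unit, or collection procedure. The column names, the `write_codebook` function, and the example variable are assumptions for illustration; repositories and disciplines may prescribe their own metadata schemas.

```python
import csv
import io

# Hypothetical codebook columns covering the items named in the text above.
CODEBOOK_FIELDS = ["variable", "description", "unit", "collection_procedure"]


def write_codebook(entries, handle):
    """Write codebook entries (a list of dicts) to an open text handle as CSV."""
    # Validate every entry first so a partial codebook is never written.
    for entry in entries:
        missing = [field for field in CODEBOOK_FIELDS if not entry.get(field)]
        if missing:
            name = entry.get("variable", "<unnamed>")
            raise ValueError(f"Codebook entry {name} is missing: {missing}")
    writer = csv.DictWriter(handle, fieldnames=CODEBOOK_FIELDS)
    writer.writeheader()
    writer.writerows(entries)


# Illustrative entry; the variable and its details are invented for the example.
example_entries = [
    {
        "variable": "temp_c",
        "description": "Body temperature at admission",
        "unit": "degrees Celsius",
        "collection_procedure": "Single tympanic reading by nursing staff",
    }
]
buffer = io.StringIO()
write_codebook(example_entries, buffer)
print(buffer.getvalue())
```

For variables without a physical unit, an explicit value such as "none" or "categorical" keeps the check meaningful while making the absence of a unit deliberate.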
Editor Discretion: The editorial team may request data access during peer review to verify reported findings. Failure to provide reasonable data access when requested may result in rejection of the manuscript or, for published articles, a post-publication investigation.