Shared Resources Data Management and Sharing Policies

Biostatistics and bioinformatics

Analytic data are required to be de-identified prior to our participation. Data from all analyses are stored with a unique identifying code and available to the investigators, journals, and sponsors according to data sharing policies.

Clinical pharmacology

Starting in 1993, de-identified bioanalytical data were stored on paper records and floppy disks. In 2012, we began storing these data electronically on the computers linked to the HPLC systems and/or the LC-MS/MS mass spectrometers. These data are backed up monthly onto external hard drives and in real time on centralized storage servers. Pharmacokinetic and pharmacodynamic modeling data derived from the bioanalytical data were, from 1993, kept on computer hard drives and backed up on floppy disks. Since 2012, these modeling data have been kept on larger computer hard drives and backed up monthly on external (4 TB) hard drives. Both forms of data are always provided to the investigator once the assays and analyses are completed. Retrospective access to these data is available to investigators upon request.

Genomics and molecular biology

The Genomics and Molecular Biology Shared Resource (GMBSR) generates genomics data for human, mouse, and a range of other species in the form of raw DNA and RNA sequence (FASTQ/FAST5 files) as well as SNP genotyping information. In some cases, data pre-processing such as quality control, trimming, alignment, and gene counting is performed to generate intermediate files required for downstream analysis. While the maintenance and dissemination of these data remain the responsibility of the user in accordance with the granting agency’s policies, the GMBSR is committed to supporting researchers in meeting these requirements. All data produced in the GMBSR are stored on the Dartmouth File System (DartFS) for 5 years, supported by the fees charged to the user to generate the data. DartFS is a redundant, high-performance system, with on- and off-site data replication and direct access to Dartmouth high-performance computing resources. Dartmouth researchers may request access to DartFS and purchase their own storage as needed to retain data beyond 5 years. For users outside of Dartmouth, files on DartFS are made accessible through a web interface or transmitted via FTP. The GMBSR may assist with this process, and additional support is available through Dartmouth Research Computing. In rare instances, outside users may be granted direct access to DartFS if they are sponsored by a Dartmouth faculty member. Data submission to repositories such as dbGaP, GEO, and SRA is the responsibility of individual investigators in accordance with the granting agency’s policies, and the GMBSR will not submit these data on their behalf. The GMBSR will work with its users to complete the necessary forms and provide the metadata required for database submission. The above reflects the GMBSR’s commitment to supporting users in meeting federal guidelines for data retention and sharing, and constitutes the extent of its obligation in this regard.

For reagents generated or used in these studies, Dartmouth shall share model organisms and related research resources in accordance with the NIH Grant Policy on Sharing of Unique Research Resources, including the Sharing of Biomedical Research Resources Principles and Guidelines for Recipients of NIH Grants and Contracts issued in 2003. The generated reagents will be shared with the research community pending third-party rights, via Material Transfer Agreements (MTAs) generated and monitored by Dartmouth's Technology Transfer Office. Such MTAs will have no more restrictive terms than in the Simple Letter Agreement (SLA) to non-profit institutions or the Uniform Biological Materials Transfer Agreement (UBMTA) to for-profit institutions. Constructs will be deposited with Addgene to facilitate distribution. Genetically engineered cell lines will be made available to the research community through repositories such as the Children’s Tumor Foundation toolkit. Patient Derived Xenograft (PDX) mouse models generated at the Dartmouth Cancer Center will be deposited at the Patient Derived Model Repository managed by the NCI.

For RNA, DNA sequencing, and proteomic data and analyses, all outputs, including code, data, figures, documentation, and manuscripts, will be made publicly available under an open license to the extent permitted while protecting the genetic information of individuals. Genomics data generated by the Genomics and Molecular Biology Shared Resource (GMBSR) at Dartmouth will be stored for a minimum of 5 years on the Dartmouth File System (DartFS). This storage is covered by the fees charged to the principal investigator (PI) by the GMBSR at the completion of the work. Data storage beyond 5 years becomes the responsibility of the PI. The PI will be notified by the GMBSR when the 5-year term nears expiration.
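The 5-year retention term and the expiration notice described above can be tracked with simple date arithmetic. The following is a minimal sketch, not the GMBSR's actual tooling: the function names are hypothetical, and the 90-day notice lead time is an assumption, since the policy does not state how far in advance the PI is notified.

```python
from datetime import date, timedelta

RETENTION_YEARS = 5  # retention term stated in the policy
NOTICE_DAYS = 90     # hypothetical lead time; the policy does not specify one


def retention_expiry(completed: date, years: int = RETENTION_YEARS) -> date:
    """Return the date on which the retention term ends."""
    try:
        return completed.replace(year=completed.year + years)
    except ValueError:
        # Work completed on 29 February and the target year is not a leap year.
        return completed.replace(year=completed.year + years, day=28)


def notice_date(completed: date, notice_days: int = NOTICE_DAYS) -> date:
    """Return the date on which the PI should be notified of the coming expiry."""
    return retention_expiry(completed) - timedelta(days=notice_days)


if __name__ == "__main__":
    done = date(2021, 6, 15)  # illustrative completion date
    print("Retention ends:", retention_expiry(done))
    print("Notify PI on:  ", notice_date(done))
```

Counting the term from the completion date of the work is also an assumption; a facility could equally count from the date the data were written to DartFS.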

Code will be released under a BSD 3-Clause License, a permissive open-source software license. Data will be released under the Creative Commons Public Domain Dedication (CC0, version 1.0 or later), except where such release could compromise genetic information, in which case release will be managed via dbGaP. For example, we will make expression estimates and cell type proportion estimates for all public data freely available without restriction and release raw sequencing data via dbGaP. RNA-seq data will be shared in SRA; in cases where participants have not consented to fully public sharing of genetic data (PDX models), the data will be placed in SRA behind dbGaP access control in accordance with IRB guidance. Results from analyses of public data, as well as figures, documentation, and writing, will be released under a Creative Commons Attribution License (version 4.0 or later).

In addition to the aforementioned licensing for project outputs, creators of specific project content may release any such content as CC0, at their individual discretion. The principal investigators of this project may release any project content as CC0, at their individual discretion.

In instances where upstream inputs are used that restrict the licensing of project outputs beyond the aforementioned guidelines, the most permissive licensing option possible will be applied. However, no inputs will be incorporated that prevent original software developed under this award from being released under an Open Source Initiative (opensource.org) approved license or prevent original non-code content from being released under an Open Definition (opendefinition.org/) conformant license except as required to protect individuals.

Source code will be made available on a publicly accessible version control system, such as GitHub. Prior to submission of project manuscripts to a journal, all related outputs will be made publicly available under the aforementioned licensing guidelines and deposited to persistent archives. Currently the group uses Zenodo for code repositories, figshare for datasets, and bioRxiv for preprints; however, the group may transition to an alternative if a superior option becomes available during the course of the grant period. We will distribute both processed and raw data via accepted repositories (well-annotated SRA records) to enable others to evaluate their own approaches.

Immune monitoring and flow cytometry

Users are responsible for their own data. Users should retrieve their data from instruments as soon as possible following acquisition. DartLab agrees to store user data for 1 month after acquisition. After 1 month, data will be removed from instrument computers and continued storage of data is not guaranteed. Data may be removed from MACSQuant computers after less than 1 month; these data are stored externally and may be retrieved upon request to DartLab staff. In practice, data are stored much longer than 1 month.

Irradiation, pre-clinical imaging and microscopy

Data management: Imaging data acquired on the instruments within the IPIM Microscopy Shared Resource are saved onto a hard drive on the local computer. At the end of each imaging session, facility users are required to transfer their data via a closed local network to researcher-specific, password-protected folders on the IPIM Microscopy Shared Resource “Dartmouth Cancer CenterMicroscopy” storage space on “DartFS”. DartFS is an institutional research data storage service offered by Dartmouth Research Computing, affording ample storage capacity and data back-up. Data are protected by Dartmouth credentials and are accessible from any device with an internet connection. The data generated in the facility will be maintained on DartFS Dartmouth Cancer CenterMicroscopy for 60 days, within which period users are required to transfer their data to a laboratory-specific DartFS storage space, local individual laboratory storage devices, or public repositories.
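As an illustration of the 60-day window, a user could list the files in their folder that have already passed the retention period and still need to be moved. This is a minimal sketch under stated assumptions: it uses each file's modification time as a stand-in for the acquisition date, the function name is hypothetical, and the folder path in the usage note is a placeholder rather than a real DartFS mount point.

```python
import time
from pathlib import Path

RETENTION_DAYS = 60  # retention window stated in the policy
SECONDS_PER_DAY = 86400


def files_past_retention(root, retention_days=RETENTION_DAYS, now=None):
    """Return files under root whose modification time is older than the window."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * SECONDS_PER_DAY
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.stat().st_mtime < cutoff
    )
```

For example, `files_past_retention("/path/to/user_folder")` returns a sorted list of `Path` objects; passing a smaller `retention_days` (say 45) would flag files that are approaching the limit rather than already past it.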

Data sharing: Individual laboratories are responsible for plans for data sharing, such as publication and/or submission of published and unpublished data into publicly available repositories, as specified by their sponsor. Examples of such repositories include the Image Data Resource (IDR) and the following:

NCI: The Cancer Imaging Archive (TCIA)
  Description: TCIA is a service which de-identifies and hosts a large archive of medical images of cancer accessible for public download. The data are organized as “Collections”, typically patients related by a common disease (e.g., lung cancer), image modality (MRI, CT, etc.) or research focus. DICOM is the primary file format used by TCIA for image storage. Supporting data related to the images such as patient outcomes, treatment details, genomics, pathology, and expert analyses are also provided when available.
  Submission of data: How to submit data to TCIA
  Accessing data: How to access TCIA data

NIGMS: Cell Image Library
  Description: The Cell Image Library accepts image data sets that are too large for publishers to store and provides access to the biomedical community. There are 10,000 datasets in 20TB of uploaded data as of mid-2018. The library inherits data from the Cell Centered Database at UCSD. Since its launch in 2010, the site has had 721,00 visitors and was cited by 175 research publications.
  Submission of data: How to submit data to Cell Image Library
  Accessing data: How to access Cell Image Library data

NIDDK: NIDDK Central Repository
  Description: The NIDDK Central Repository stores biosamples, genetic and other data collected in designated NIDDK-funded clinical studies. The purpose of the NIDDK Central Repository is to expand the usefulness of these studies by allowing a wider research community to access data and materials beyond the end of the study.
  Submission of data: How to submit data to NIDDK Central Repository
  Accessing data: How to access NIDDK Central Repository data

Mouse modeling

Efficacy studies:

  1. Provide scientific and technical evaluations of appropriate models (syngeneic or patient-derived models).
  2. Provide tumor cell lines and PDX tumor samples for implantation into mice.
  3. Perform treatments with requested agents as planned by the investigator(s) through discussions with this core.
  4. Provide data, including tumor growth data, necropsy studies with evaluation of metastasis in different organs, processing and handling of tumor tissues for pathological evaluations, and survival curves.
  5. Process tissues for genomic/proteomic/immunological studies, either in our laboratory or in collaboration with other Dartmouth Cancer Center Core Facilities.
  6. Provide in vivo data to individual researchers by encrypted email. Handling of genomic data (RNA, DNA, array data, etc.) will be performed in agreement with the Genomic Core Facility. Proteomic (immunohistochemistry) and immunology data (flow, ELISA, etc.) will be provided in a similar manner.

Pathology

No DMS policy available at this time.

Trace elements analysis

All stages of sample preparation and analysis are recorded electronically in Excel spreadsheets. Analytical balances are interfaced to laptops to transfer data without the risk of transcription errors. ICP-MS data are exported from Agilent MassHunter software as .csv files (raw data, sorted concentrations, and calibration statistics) and combined, along with the sample manifest, all sample preparation steps, and ICP-MS calibration preparation information, as separate worksheets in an Excel workbook. Data processing, correction for sample preparation and dilution steps, and quality control checking are conducted, and results and quality control metrics are recorded as worksheets in the workbook. The workbook and all accompanying files are saved into the TE Core directory on the Earth Sciences GEO database, which is backed up daily to the Dartmouth Central servers.
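The correction for sample preparation and dilution steps mentioned above is typically a back-calculation from the concentration measured in the digest to the concentration in the original sample. The sketch below shows one common form of that calculation; the formula, units, function name, and parameter names are illustrative assumptions and are not taken from the TE Core's workbook.

```python
def sample_concentration(
    measured_ug_per_l,
    blank_ug_per_l,
    final_volume_ml,
    sample_mass_g,
    dilution_factor=1.0,
):
    """Return the analyte concentration in the original sample, in ug/g.

    Assumed model: subtract the preparation blank, undo any post-digestion
    dilution, convert to mass of analyte in the digest, then divide by the
    mass of sample that was digested.
    """
    if sample_mass_g <= 0:
        raise ValueError("sample_mass_g must be positive")
    if final_volume_ml <= 0 or dilution_factor <= 0:
        raise ValueError("final_volume_ml and dilution_factor must be positive")
    net_ug_per_l = measured_ug_per_l - blank_ug_per_l
    # ug/L * (mL / 1000 mL per L) gives ug of analyte in the digest.
    analyte_ug = net_ug_per_l * dilution_factor * final_volume_ml / 1000.0
    return analyte_ug / sample_mass_g
```

As a worked case under these assumptions, a reading of 12.5 ug/L with a 0.5 ug/L blank, a fivefold dilution, a 10 mL final volume, and a 0.25 g sample gives 12 ug/L x 5 x 0.010 L = 0.6 ug of analyte, or 2.4 ug/g in the sample.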

In general, data generated by the TEA are ‘owned’ by the client and are not shared by the TEA resource. TEA analyses are not considered human subjects research because no personal identifiers or other individual-specific data are supplied with the samples. Aggregated unidentified data are occasionally used to demonstrate the resource’s proficiency or to inform the range of expected concentrations in a particular sample type.