The NIH’s National Cancer Institute (NCI) has launched the Genomic Data Commons (GDC), a unified data system designed to promote the free sharing of genomic and clinical data among cancer researchers. NCI said the GDC was created to centralize, standardize, and broaden access to data from NCI programs such as The Cancer Genome Atlas (TCGA) and its pediatric equivalent, Therapeutically Applicable Research to Generate Effective Treatments (TARGET).
Through the GDC, researchers will be able to integrate genetic and clinical data, such as cancer imaging and histological data, with information on the molecular profiles of tumors as well as treatment response, NCI added. The GDC went live with approximately 4.1 petabytes of data from TCGA, TARGET, and other research programs, as well as more than 14,000 anonymized patient cases.
Within the GDC, patient data will be harmonized using standardized software algorithms to enhance their accessibility and utility to researchers. The data commons will accept submissions of cancer genomic and clinical data from researchers worldwide who wish to share their data, while ensuring secure data storage and downloading, NCI said.
“With the GDC, NCI has made a major commitment to maintaining long-term storage of cancer genomic data and providing researchers with free access to these data,” NCI Acting Director Douglas Lowy, M.D., said in a statement.
The GDC is being built and managed by the University of Chicago Center for Data Intensive Science (CDIS), in collaboration with the Ontario Institute for Cancer Research, all under an NCI contract with Leidos Biomedical Research, operator of the Frederick National Laboratory for Cancer Research.
According to UChicago, the GDC will also create a foundation for future cloud-based technologies, enabling researchers to analyze large-scale datasets and perform experiments remotely, such as through NCI’s Cancer Cloud Pilots Program. CDIS said the open-source software it is developing for the GDC could become a model for data-intensive research efforts for other diseases, such as Alzheimer’s and diabetes.
“Over time, I expect the GDC will play a more and more important role in providing the data required at the scale required so that precision medicine fulfills its promise,” GDC principal investigator Robert Grossman, professor of medicine and director of CDIS, said in a university statement.
The GDC is envisioned as a core component of Vice President Joe Biden’s National Cancer Moonshot and President Obama’s Precision Medicine Initiative (PMI). NCI said the data system will benefit from the $70 million allocated to the institute to lead cancer genomics efforts through the “PMI for Oncology” portion of the $215 million initiative. Biden announced the GDC earlier this week during an appearance at the annual meeting of the American Society of Clinical Oncology.