About the ISB Cancer Genomics Cloud
The ISB-CGC provides interactive and programmatic access to TCGA data, leveraging many components of the Google Cloud Platform, including BigQuery, Compute Engine, App Engine, Cloud Datalab, and Google Genomics. Open-access clinical and biospecimen information for all TCGA patients and samples, together with the Level-3 TCGA data and the genomic reference and platform-annotation sources, is stored in BigQuery, enabling fast SQL-like queries against the entire dataset. Controlled-access DNA and RNA sequence data are available to dbGaP-authorized users in the original BAM and FASTQ file formats.
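As a rough illustration of what a SQL-like query against BigQuery might look like from Python, the sketch below counts patients in one study. The table name `my-project.tcga_example.clinical` and the columns `study` and `n_patients` are hypothetical placeholders, not actual ISB-CGC identifiers; substitute the real dataset, table, and column names from the ISB-CGC documentation. It assumes the `google-cloud-bigquery` client library is installed and that Google Cloud credentials are configured.

```python
# Hypothetical table name, for illustration only; not a real ISB-CGC table.
CLINICAL_TABLE = "my-project.tcga_example.clinical"


def build_cohort_query(table: str = CLINICAL_TABLE) -> str:
    """Return SQL that counts patients in one study.

    The study is supplied as the named query parameter @study rather than
    being interpolated into the SQL string.
    """
    return (
        "SELECT study, COUNT(*) AS n_patients\n"
        f"FROM `{table}`\n"
        "WHERE study = @study\n"
        "GROUP BY study"
    )


def run_cohort_query(project_id: str, study: str):
    """Run the query in the given Google Cloud project and return a list of tuples."""
    # Imported lazily so build_cohort_query works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("study", "STRING", study),
        ]
    )
    rows = client.query(build_cohort_query(), job_config=job_config).result()
    return [(row["study"], row["n_patients"]) for row in rows]
```

Calling `run_cohort_query("your-project-id", "BRCA")` would submit the query under your own project; both arguments here are placeholder values. Keeping the SQL construction separate from execution makes it possible to inspect the query text without any cloud access.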
The ISB-CGC aims to serve a broad range of cancer researchers: scientists and clinicians who prefer an interactive web-based application for accessing and exploring the rich TCGA dataset; computational scientists who want to write custom scripts in languages such as R or Python and access the data through APIs; and algorithm developers who want to spin up thousands of virtual machines to rapidly analyze hundreds of terabytes of sequence data. The ISB-CGC allows scientists to interactively define and compare cohorts, examine the underlying molecular data for specific genes or pathways of interest, and share insights with collaborators around the globe.