TCGA and TARGET datasets

GDC data portal

The Genomic Data Commons (GDC) data portal is a good place to start exploring the data available through The Cancer Genome Atlas (TCGA) and Therapeutically Applicable Research to Generate Effective Treatments (TARGET) programs. However, the portal's web interface offers no easy way to access and reanalyse this data programmatically. The tools and resources below should make this easier.

TCGAbiolinks is a well-documented Bioconductor package for working with GDC data. In addition to the TCGA and TARGET projects, it provides access to gene expression data from healthy individuals via the public Genotype-Tissue Expression (GTEx) project. The Bioconductor landing page for TCGAbiolinks lists the original papers introducing the package and links to the extensive documentation with walkthroughs. In addition, the TCGAWorkflow vignette shows how to use TCGAbiolinks to reanalyse TCGA data.
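
TCGAbiolinks is an R package; for readers who want to see what a programmatic GDC query looks like outside R, the sketch below builds a request against the GDC REST API in Python. It is a minimal illustration, not a replacement for TCGAbiolinks. It assumes the public endpoint https://api.gdc.cancer.gov/files and the field names cases.project.project_id, data_category, file_id, file_name and data_type; the helper names (build_filters, build_query_url, fetch_file_hits) and the example project TCGA-BRCA are chosen here for illustration and do not come from the text above.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public GDC files endpoint.
GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_filters(project_id, data_category):
    """Build a GDC-style filter: files from one project AND one data category."""
    return {
        "op": "and",
        "content": [
            {
                "op": "in",
                "content": {
                    "field": "cases.project.project_id",
                    "value": [project_id],
                },
            },
            {
                "op": "in",
                "content": {
                    "field": "data_category",
                    "value": [data_category],
                },
            },
        ],
    }


def build_query_url(project_id, data_category, size=5):
    """Return the full query URL; the filter is passed as URL-encoded JSON."""
    params = {
        "filters": json.dumps(build_filters(project_id, data_category)),
        "fields": "file_id,file_name,data_type",
        "format": "JSON",
        "size": str(size),
    }
    return GDC_FILES_ENDPOINT + "?" + urllib.parse.urlencode(params)


def fetch_file_hits(project_id, data_category, size=5):
    """Run the query over the network and return the list of matching files.

    Assumes the response JSON nests results under data -> hits.
    """
    url = build_query_url(project_id, data_category, size)
    with urllib.request.urlopen(url) as response:
        payload = json.load(response)
    return payload["data"]["hits"]
```

Calling fetch_file_hits("TCGA-BRCA", "Transcriptome Profiling") would, under these assumptions, return a short list of file records for that project. Only build_query_url runs without network access, which makes it convenient for checking a filter before sending it.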