TCGA and TARGET datasets
GDC data portal
The Genomic Data Commons (GDC) data portal is a good place to start exploring the data available via The Cancer Genome Atlas (TCGA) and Therapeutically Applicable Research to Generate Effective Treatments (TARGET) programs. However, the portal's web interface offers no easy way of programmatically accessing and reanalysing these data. We therefore list some tools and resources below that should make this easier.
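As a rough illustration of what programmatic access can look like, the sketch below builds a query URL for the public GDC REST API, which sits alongside the web portal. It is not part of the tools described on this page. The helper name `build_files_query` is hypothetical, and the chosen project, data category, and returned fields are assumptions picked for illustration. The code only constructs the URL and performs no network request.

```python
import json
from urllib.parse import urlencode

# Public GDC API endpoint for file metadata.
GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_files_query(project_id, data_category, size=10):
    """Build query parameters for the GDC /files endpoint.

    Hypothetical helper for illustration; it performs no network access.
    """
    # GDC filters are nested JSON: an "and" of two "in" conditions.
    filters = {
        "op": "and",
        "content": [
            {
                "op": "in",
                "content": {
                    "field": "cases.project.project_id",
                    "value": [project_id],
                },
            },
            {
                "op": "in",
                "content": {
                    "field": "data_category",
                    "value": [data_category],
                },
            },
        ],
    }
    return {
        "filters": json.dumps(filters),           # sent as a JSON string
        "fields": "file_id,file_name,data_type",  # columns to return (assumed choice)
        "format": "JSON",
        "size": str(size),                        # maximum number of hits
    }


# Example: file metadata for TCGA breast cancer expression data.
params = build_files_query("TCGA-BRCA", "Transcriptome Profiling")
url = GDC_FILES_ENDPOINT + "?" + urlencode(params)
print(url)
```

Opening the printed URL with any HTTP client (for example `urllib.request.urlopen`) should return matching file metadata as JSON. Swapping in a TARGET project ID such as `TARGET-AML` queries the TARGET program in the same way.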
TCGAbiolinks
TCGAbiolinks is a well-documented Bioconductor package for working with GDC data.
In addition to the TCGA and TARGET projects, TCGAbiolinks also provides access to gene expression data from healthy individuals via the public Genotype-Tissue Expression (GTEx) project.
The main Bioconductor landing page for TCGAbiolinks lists the original papers introducing its functionality and links to the extensive documentation with walkthroughs.
In addition, the TCGAWorkflow vignette shows how to use TCGAbiolinks for reanalysis of TCGA data.