You can specify the RNAsum
reference dataset of interest
(--dataset
option) by using one of the TCGA project IDs
(Project
column in the following tables).
Primary datasets
The table below summarises TCGA expression data available for 33 cancer types.
NOTE: To reduce the data processing time and the
size of the final RNAsum
HTML report, the following
datasets were restricted to include expression data from 300 patients:
BRCA
, THCA
, HNSC
,
LGG
, KIRC
, LUSC
,
LUAD
, PRAD
, STAD
and
LIHC
.
No | Project | Name | Tissue code* | Samples no.** |
---|---|---|---|---|
1 | BRCA |
Breast Invasive Carcinoma | 1 | 300 |
2 | THCA |
Thyroid Carcinoma | 1 | 300 |
3 | HNSC |
Head and Neck Squamous Cell Carcinoma | 1 | 300 |
4 | LGG |
Brain Lower Grade Glioma | 1 | 300 |
5 | KIRC |
Kidney Renal Clear Cell Carcinoma | 1 | 300 |
6 | LUSC |
Lung Squamous Cell Carcinoma | 1 | 300 |
7 | LUAD |
Lung Adenocarcinoma | 1 | 300 |
8 | PRAD |
Prostate Adenocarcinoma | 1 | 300 |
9 | STAD |
Stomach Adenocarcinoma | 1 | 300 |
10 | LIHC |
Liver Hepatocellular Carcinoma | 1 | 300 |
11 | COAD |
Colon Adenocarcinoma | 1 | 257 |
12 | KIRP |
Kidney Renal Papillary Cell Carcinoma | 1 | 252 |
13 | BLCA |
Bladder Urothelial Carcinoma | 1 | 246 |
14 | OV |
Ovarian Serous Cystadenocarcinoma | 1 | 220 |
15 | SARC |
Sarcoma | 1 | 214 |
16 | PCPG |
Pheochromocytoma and Paraganglioma | 1 | 177 |
17 | CESC |
Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma | 1 | 171 |
18 | UCEC |
Uterine Corpus Endometrial Carcinoma | 1 | 168 |
19 | PAAD |
Pancreatic Adenocarcinoma | 1 | 150 |
20 | TGCT |
Testicular Germ Cell Tumours | 1 | 149 |
21 | LAML |
Acute Myeloid Leukaemia | 3 | 145 |
22 | ESCA |
Esophageal Carcinoma | 1 | 142 |
23 | GBM |
Glioblastoma Multiforme | 1 | 141 |
24 | THYM |
Thymoma | 1 | 118 |
25 | SKCM |
Skin Cutaneous Melanoma | 1 | 100 |
26 | READ |
Rectum Adenocarcinoma | 1 | 87 |
27 | UVM |
Uveal Melanoma | 1 | 80 |
28 | ACC |
Adrenocortical Carcinoma | 1 | 78 |
29 | MESO |
Mesothelioma | 1 | 77 |
30 | KICH |
Kidney Chromophobe | 1 | 59 |
31 | UCS |
Uterine Carcinosarcoma | 1 | 56 |
32 | DLBC |
Lymphoid Neoplasm Diffuse Large B-cell Lymphoma | 1 | 47 |
33 | CHOL |
Cholangiocarcinoma | 1 | 34 |
Extended datasets
There are extended sets available for the BLCA, PAAD, and LUAD cohorts, including neuroendocrine tumours (NETs), intraductal papillary mucinous neoplasm (IPMNs), acinar cell carcinoma (ACC) samples and large-cell neuroendocrine carcinoma (LCNEC).
No | Project | Name | Tissue code* | Samples no.** |
---|---|---|---|---|
1 | LUAD-LCNEC |
LUAD dataset including LCNECs (n=14) | 1 | 314 |
2 | BLCA-NET |
BLCA dataset including NETs (n=2) | 1 | 248 |
3 | PAAD-IPMN |
Pancreatic Adenocarcinoma dataset including IPMNs (n=2) | 1 | 152 |
4 | PAAD-NET |
Pancreatic Adenocarcinoma dataset including NETs (n=8) | 1 | 158 |
5 | PAAD-ACC |
Pancreatic Adenocarcinoma dataset including ACCs (n=1) | 1 | 151 |
Pan-Cancer dataset
10 samples from each of the 33 datasets were combined to create the Pan-Cancer dataset.
No | Project | Name | Tissue code* | Samples no.** |
---|---|---|---|---|
1 | PANCAN |
Samples from all 33 cancer types, 10 samples from each | 1 and 3 (LAML samples only) |
330 |
* Tissue codes:
Tissue code | Letter code | Definition |
---|---|---|
1 | TP | Primary solid Tumour |
3 | TB | Primary Blood Derived Cancer - Peripheral Blood |
** Each dataset was cleaned based on the quality metrics provided in the Merged Sample Quality Annotations file (TSV available from here), linked to from the TCGA PanCanAtlas initiative webpage (see the TCGA-data-harmonization repository for more details, including sample inclusion criteria).