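Working backwards: TCGA survival analysis uses only tumor samples
Which samples does TCGA survival analysis use: tumor tissue, adjacent normal tissue, or blood?
I asked this question in the study group and got a patient answer from Teacher Zeng. I have written it up below for other learners' reference.
First, some background on TCGA.
One sentence from the TCGA home page: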
The Cancer Genome Atlas (TCGA), a landmark cancer genomics program, molecularly characterized over 20,000 primary cancer and matched normal samples spanning 33 cancer types.
This sentence tells us that TCGA mainly collected primary tumor tissue, together with matched normal samples.
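1. Sample collection and numbering
Let's first look at how samples are collected and numbered.
1.1 Sample collection and numbering workflow
1.2 What the barcodes mean
Image source: https://docs.gdc.cancer.gov/Encyclopedia/pages/TCGA_Barcode/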
Label | Identifier for | Value | Value Description | Possible Values |
---|---|---|---|---|
Project | Project name | TCGA | TCGA project | TCGA |
TSS | Tissue source site | 02 | GBM (brain tumor) sample from MD Anderson | See Code Tables Report |
Participant | Study participant | 0001 | The first participant from MD Anderson for GBM study | Any alpha-numeric value |
Sample | Sample type | 01 | A solid tumor | Tumor types range from 01 - 09, normal types from 10 - 19 and control samples from 20 - 29. See Code Tables Report for a complete list of sample codes |
Vial | Order of sample in a sequence of samples | C | The third vial | A to Z |
Portion | Order of portion in a sequence of 100 - 120 mg sample portions | 01 | The first portion of the sample | 01-99 |
Analyte | Molecular type of analyte for analysis | D | The analyte is a DNA sample | See Code Tables Report |
Plate | Order of plate in a sequence of 96-well plates | 0182 | The 182nd plate | 4-digit alphanumeric value |
Center | Sequencing or characterization center that will receive the aliquot for analysis | 01 | The Broad Institute GCC | See Code Tables Report |
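To make the field layout concrete, here is a minimal Python sketch that splits a full aliquot barcode into the labels from the table. The function name `parse_tcga_barcode` is my own, not part of any TCGA or GDC tool, and the example barcode `TCGA-02-0001-01C-01D-0182-01` is the one the table values describe. It assumes the full 7-field aliquot form; shorter barcodes (patient or sample level) would need separate handling.

```python
from typing import Dict


def parse_tcga_barcode(barcode: str) -> Dict[str, str]:
    """Split a full TCGA aliquot barcode into its labelled fields.

    Hypothetical helper for illustration only; assumes the 7-field form
    Project-TSS-Participant-SampleVial-PortionAnalyte-Plate-Center.
    """
    parts = barcode.split("-")
    if len(parts) != 7:
        raise ValueError(
            f"expected 7 dash-separated fields, got {len(parts)}: {barcode!r}"
        )
    project, tss, participant, sample_vial, portion_analyte, plate, center = parts
    if len(sample_vial) != 3 or len(portion_analyte) != 3:
        raise ValueError(f"unexpected field width in {barcode!r}")
    return {
        "project": project,
        "tss": tss,
        "participant": participant,
        "sample": sample_vial[:2],        # two-digit sample type code
        "vial": sample_vial[2:],          # single letter, A to Z
        "portion": portion_analyte[:2],   # two-digit portion number
        "analyte": portion_analyte[2:],   # single letter, e.g. D = DNA, R = RNA
        "plate": plate,
        "center": center,
    }


if __name__ == "__main__":
    fields = parse_tcga_barcode("TCGA-02-0001-01C-01D-0182-01")
    for label, value in fields.items():
        print(f"{label:<12}{value}")
```

The field to watch for the question in this post is `sample`: it is the first two characters of the fourth dash-separated field.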
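2. Data analysis tools
Now let's check the analysis tools to see which samples survival analysis actually uses.
Online analysis tools: http://gepia2.cancer-pku.cn/#survival or http://ualcan.path.uab.edu/analysis.html
Third-party R package: TCGAbiolinks. For details on how to use it, see its official documentation.
Taking EGFR as an example, this is the survival plot from the online tool:
With the TCGAbiolinks package, the 36 sample barcodes used in the survival analysis are: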
[1] "TCGA-3X-AAVA-01A-11R-A41I-07" "TCGA-3X-AAVE-01A-11R-A41I-07" "TCGA-ZH-A8Y5-01A-11R-A41I-07"
[4] "TCGA-ZH-A8Y6-01A-11R-A41I-07" "TCGA-W5-AA2G-01A-11R-A41I-07" "TCGA-3X-AAVB-01A-31R-A41I-07"
[7] "TCGA-ZH-A8Y8-01A-51R-A41I-07" "TCGA-ZU-A8S4-01A-11R-A41I-07" "TCGA-W5-AA2I-01A-32R-A41I-07"
[10] "TCGA-ZD-A8I3-01A-11R-A41I-07" "TCGA-W5-AA30-01A-31R-A41I-07" "TCGA-W6-AA0S-01A-11R-A41I-07"
[13] "TCGA-W5-AA2H-01A-31R-A41I-07" "TCGA-W5-AA2T-01A-12R-A41I-07" "TCGA-3X-AAV9-01A-72R-A41I-07"
[16] "TCGA-YR-A95A-01A-12R-A41I-07" "TCGA-W5-AA2X-01A-11R-A41I-07" "TCGA-W5-AA2Z-01A-11R-A41I-07"
[19] "TCGA-W5-AA31-01A-11R-A41I-07" "TCGA-ZH-A8Y4-01A-11R-A41I-07" "TCGA-W5-AA2O-01A-11R-A41I-07"
[22] "TCGA-W5-AA2W-01A-11R-A41I-07" "TCGA-W5-AA33-01A-11R-A41I-07" "TCGA-W5-AA2Q-01A-11R-A41I-07"
[25] "TCGA-W5-AA36-01A-11R-A41I-07" "TCGA-W5-AA39-01A-11R-A41I-07" "TCGA-ZH-A8Y1-01A-11R-A41I-07"
[28] "TCGA-4G-AAZO-01A-12R-A41I-07" "TCGA-ZH-A8Y2-01A-11R-A41I-07" "TCGA-W5-AA38-01A-11R-A41I-07"
[31] "TCGA-W5-AA2U-01A-11R-A41I-07" "TCGA-3X-AAVC-01A-21R-A41I-07" "TCGA-WD-A7RX-01A-12R-A41I-07"
[34] "TCGA-4G-AAZT-01A-11R-A41I-07" "TCGA-W5-AA2R-01A-11R-A41I-07" "TCGA-W5-AA34-01A-11R-A41I-07"
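Compare these with the barcode fields above: the Sample code (the two digits at the start of the fourth field, for example the 01 in 01A) is 01 in every barcode, and codes 01-09 all denote tumor samples. The trailing 07 is the Center code, not a sample type. So the survival analysis uses tumor tissue only.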
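The same check can be done programmatically instead of by eye. The sketch below is a rough illustration in Python, not what GEPIA2, UALCAN, or TCGAbiolinks do internally. The function name `sample_category` is made up for this post, and the ranges (01-09 tumor, 10-19 normal, 20-29 control) come from the barcode table. Only the first six barcodes from the listing are included to keep it short; pasting in all 36 works the same way.

```python
from collections import Counter


def sample_category(barcode: str) -> str:
    """Classify a TCGA barcode by its two-digit sample type code.

    Hypothetical helper; ranges follow the barcode table:
    01-09 tumor, 10-19 normal, 20-29 control.
    """
    # The fourth dash-separated field is sample code + vial, e.g. "01A".
    sample_code = int(barcode.split("-")[3][:2])
    if 1 <= sample_code <= 9:
        return "tumor"
    if 10 <= sample_code <= 19:
        return "normal"
    if 20 <= sample_code <= 29:
        return "control"
    return "unknown"


# First six barcodes from the TCGAbiolinks output above.
barcodes = [
    "TCGA-3X-AAVA-01A-11R-A41I-07",
    "TCGA-3X-AAVE-01A-11R-A41I-07",
    "TCGA-ZH-A8Y5-01A-11R-A41I-07",
    "TCGA-ZH-A8Y6-01A-11R-A41I-07",
    "TCGA-W5-AA2G-01A-11R-A41I-07",
    "TCGA-3X-AAVB-01A-31R-A41I-07",
]

counts = Counter(sample_category(b) for b in barcodes)

if __name__ == "__main__":
    print(dict(counts))
```

If a category other than `tumor` shows up in the counts, that barcode set includes normal or control samples and is worth a closer look before running the survival analysis.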
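If you are interested in the details of survival analysis
生信技能树 has covered the details of survival analysis many times.
Survival analysis is currently the finishing touch in research on cancer and other diseases!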