Working Backwards: TCGA Survival Analysis Uses Tumor Samples Only

Which samples does TCGA survival analysis use: tumor tissue, adjacent normal tissue, or blood?

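I asked this question in our study group and Mr. Zeng patiently answered it. I have written up the answer below for other learners to refer to.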

First, some background on TCGA.

A sentence from the TCGA homepage:

The Cancer Genome Atlas (TCGA), a landmark cancer genomics program, molecularly characterized over 20,000 primary cancer and matched normal samples spanning 33 cancer types.

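This sentence shows that TCGA mainly collected primary tumor tissue, together with matched normal samples from the same patients.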

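1. Sample collection and barcoding

Start with how samples are collected and how they are numbered.

1.1 Sample collection and barcoding workflow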

[Figure: Creating Barcodes]

1.2 What the barcodes mean

[Figure: Reading Barcodes]

Image source: https://docs.gdc.cancer.gov/Encyclopedia/pages/TCGA_Barcode/

Fields of the example aliquot barcode TCGA-02-0001-01C-01D-0182-01, listed in barcode order:

Label | Identifier for | Value | Value Description | Possible Values
Project | Project name | TCGA | TCGA project | TCGA
TSS | Tissue source site | 02 | GBM (brain tumor) sample from MD Anderson | See Code Tables Report
Participant | Study participant | 0001 | The first participant from MD Anderson for GBM study | Any alpha-numeric value
Sample | Sample type | 01 | A solid tumor | Tumor types range from 01 - 09, normal types from 10 - 19 and control samples from 20 - 29. See Code Tables Report for a complete list of sample codes
Vial | Order of sample in a sequence of samples | C | The third vial | A to Z
Portion | Order of portion in a sequence of 100 - 120 mg sample portions | 01 | The first portion of the sample | 01-99
Analyte | Molecular type of analyte for analysis | D | The analyte is a DNA sample | See Code Tables Report
Plate | Order of plate in a sequence of 96-well plates | 0182 | The 182nd plate | 4-digit alphanumeric value
Center | Sequencing or characterization center that will receive the aliquot for analysis | 01 | The Broad Institute GCC | See Code Tables Report
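To make the field layout concrete, here is a minimal Python sketch that splits a full aliquot barcode into the nine fields described above. The names `TcgaBarcode` and `parse_barcode` are made up for this illustration and are not part of any TCGA or GDC tool. The regular expression assumes the full aliquot-level format (for example TCGA-02-0001-01C-01D-0182-01, the example used in the GDC documentation); shorter patient-level or sample-level barcodes would need a looser pattern.

```python
import re
from typing import NamedTuple


class TcgaBarcode(NamedTuple):
    """Fields of a full TCGA aliquot barcode (hypothetical helper type)."""
    project: str
    tss: str
    participant: str
    sample: str
    vial: str
    portion: str
    analyte: str
    plate: str
    center: str


# Assumed layout: TCGA-TSS-Participant-SampleVial-PortionAnalyte-Plate-Center
_BARCODE_RE = re.compile(
    r"^(?P<project>TCGA)"
    r"-(?P<tss>[A-Z0-9]{2})"
    r"-(?P<participant>[A-Z0-9]{4})"
    r"-(?P<sample>\d{2})(?P<vial>[A-Z])"
    r"-(?P<portion>\d{2})(?P<analyte>[A-Z])"
    r"-(?P<plate>[A-Z0-9]{4})"
    r"-(?P<center>\d{2})$"
)


def parse_barcode(barcode: str) -> TcgaBarcode:
    """Split an aliquot barcode into its fields, or raise ValueError."""
    match = _BARCODE_RE.match(barcode)
    if match is None:
        raise ValueError(f"not a full TCGA aliquot barcode: {barcode!r}")
    return TcgaBarcode(**match.groupdict())


if __name__ == "__main__":
    parsed = parse_barcode("TCGA-02-0001-01C-01D-0182-01")
    for field, value in parsed._asdict().items():
        print(f"{field:12s}{value}")
```

For the question in this post, the only field that matters is `sample`: it tells you whether an aliquot came from tumor, normal, or control material.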

2. Data analysis tools

Next, look at actual analysis tools to see which samples their survival analysis uses.

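Online analysis tools: http://gepia2.cancer-pku.cn/#survival or http://ualcan.path.uab.edu/analysis.html

Third-party R package: TCGAbiolinks. For details on how to use it, see its official documentation.

Taking EGFR as an example, the online tools produce a survival plot:

[Figure: EGFR survival curve from the online analysis tool]

With the TCGAbiolinks package, the 36 sample barcodes that go into the survival analysis are: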

 [1] "TCGA-3X-AAVA-01A-11R-A41I-07" "TCGA-3X-AAVE-01A-11R-A41I-07" "TCGA-ZH-A8Y5-01A-11R-A41I-07"
 [4] "TCGA-ZH-A8Y6-01A-11R-A41I-07" "TCGA-W5-AA2G-01A-11R-A41I-07" "TCGA-3X-AAVB-01A-31R-A41I-07"
 [7] "TCGA-ZH-A8Y8-01A-51R-A41I-07" "TCGA-ZU-A8S4-01A-11R-A41I-07" "TCGA-W5-AA2I-01A-32R-A41I-07"
[10] "TCGA-ZD-A8I3-01A-11R-A41I-07" "TCGA-W5-AA30-01A-31R-A41I-07" "TCGA-W6-AA0S-01A-11R-A41I-07"
[13] "TCGA-W5-AA2H-01A-31R-A41I-07" "TCGA-W5-AA2T-01A-12R-A41I-07" "TCGA-3X-AAV9-01A-72R-A41I-07"
[16] "TCGA-YR-A95A-01A-12R-A41I-07" "TCGA-W5-AA2X-01A-11R-A41I-07" "TCGA-W5-AA2Z-01A-11R-A41I-07"
[19] "TCGA-W5-AA31-01A-11R-A41I-07" "TCGA-ZH-A8Y4-01A-11R-A41I-07" "TCGA-W5-AA2O-01A-11R-A41I-07"
[22] "TCGA-W5-AA2W-01A-11R-A41I-07" "TCGA-W5-AA33-01A-11R-A41I-07" "TCGA-W5-AA2Q-01A-11R-A41I-07"
[25] "TCGA-W5-AA36-01A-11R-A41I-07" "TCGA-W5-AA39-01A-11R-A41I-07" "TCGA-ZH-A8Y1-01A-11R-A41I-07"
[28] "TCGA-4G-AAZO-01A-12R-A41I-07" "TCGA-ZH-A8Y2-01A-11R-A41I-07" "TCGA-W5-AA38-01A-11R-A41I-07"
[31] "TCGA-W5-AA2U-01A-11R-A41I-07" "TCGA-3X-AAVC-01A-21R-A41I-07" "TCGA-WD-A7RX-01A-12R-A41I-07"
[34] "TCGA-4G-AAZT-01A-11R-A41I-07" "TCGA-W5-AA2R-01A-11R-A41I-07" "TCGA-W5-AA34-01A-11R-A41I-07"

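Check these against the barcode fields described above: the Sample code in every one of the 36 barcodes is 01, and codes 01 to 09 all denote tumor samples (01 is a primary solid tumor). So the survival analysis here uses tumor tissue only, not adjacent normal tissue or blood.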
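The same check can be done programmatically instead of by eye. The sketch below is a minimal Python illustration, not how TCGAbiolinks or the online tools do it internally. It takes the two-digit sample code from the fourth dash-separated field of each barcode and maps it to the ranges given in the barcode table (01 to 09 tumor, 10 to 19 normal, 20 to 29 control). The function names are made up for this example, and only the first six of the 36 barcodes are included to keep it short.

```python
from collections import Counter


def sample_type_code(barcode: str) -> str:
    """Return the two-digit sample code, e.g. '01' from '...-01A-...'."""
    return barcode.split("-")[3][:2]


def sample_category(code: str) -> str:
    """Map a sample code to the ranges listed in the barcode table."""
    number = int(code)
    if 1 <= number <= 9:
        return "tumor"
    if 10 <= number <= 19:
        return "normal"
    if 20 <= number <= 29:
        return "control"
    return "unknown"


# First six barcodes from the list above; extend with the rest as needed.
barcodes = [
    "TCGA-3X-AAVA-01A-11R-A41I-07",
    "TCGA-3X-AAVE-01A-11R-A41I-07",
    "TCGA-ZH-A8Y5-01A-11R-A41I-07",
    "TCGA-ZH-A8Y6-01A-11R-A41I-07",
    "TCGA-W5-AA2G-01A-11R-A41I-07",
    "TCGA-3X-AAVB-01A-31R-A41I-07",
]

counts = Counter(sample_category(sample_type_code(b)) for b in barcodes)
print(counts)

# True only if every barcode in the list is a tumor sample.
all_tumor = set(counts) == {"tumor"}
print("all tumor:", all_tumor)
```

If the list ever contained an adjacent-normal sample (code 11, for instance), it would show up under `normal` in the counts and `all_tumor` would be False.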
