Tumor heterogeneity as measured by MATH is not significantly associated with survival in breast cancer

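Earlier we covered a paper in Oncotarget (March 2016) that used the number of clones computed by PyClone to represent tumor heterogeneity, and found that intratumor heterogeneity significantly affects survival across different cancer types.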

The paper I want to share today, however, is from Breast Cancer Research and Treatment (February 2017), titled "Clinical and molecular relevance of mutant-allele tumor heterogeneity in breast cancer". It uses a different metric to quantify tumor heterogeneity: MATH.

Background on the MATH algorithm

I shared this previously on 生信技能树: https://www.jianshu.com/p/f9573a4a8aeb

Study workflow

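Starting from nothing more than the MAF files of tumor WES results for TCGA patients, the authors applied the MATH formula to compute a MATH value for each of 916 patients, and then related those values to point mutation and CNV information.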
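To make the calculation concrete, here is a minimal sketch of the MATH formula as it is commonly defined: 100 times the median absolute deviation (MAD) of the mutant-allele fractions, divided by their median. It is not the authors' code. The function name `math_score`, the 1.4826 scaling constant on the MAD, and the idea that each fraction comes from the alt and ref read counts in the MAF file are my assumptions.

```python
from statistics import median

def math_score(vafs):
    """Return MATH = 100 * MAD / median for one tumor's mutant-allele fractions.

    `vafs` is a list of fractions in (0, 1], assumed here to be
    alt_reads / (alt_reads + ref_reads) for each somatic mutation.
    """
    if not vafs:
        raise ValueError("need at least one mutant-allele fraction")
    med = median(vafs)
    if med == 0:
        raise ValueError("median mutant-allele fraction is zero")
    # Median absolute deviation, scaled by 1.4826 (an assumption here) so it
    # is comparable to a standard deviation for normally distributed data.
    mad = 1.4826 * median(abs(v - med) for v in vafs)
    return 100.0 * mad / med
```

A tumor whose mutations all sit at the same allele fraction gets a MATH of 0, and the value grows as the fractions spread out.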

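Looking at point mutations first: patients with TP53 mutations had significantly higher ITH than patients without, whereas CDH1 showed the opposite pattern.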

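Turning to the GISTIC2 CNV results, the same reasoning turns up factors that significantly separate ITH, but the statistical method has to be adjusted: the authors used logistic regression and generalized linear regression.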

Patients were then grouped by MATH value:

  • Cases with a MATH value lower than 33 were classified as the “low MATH” group (306 patients; 33.4%).

  • Cases with a MATH value higher than 46 were classified as the “high MATH” group (290 patients; 31.7%)

  • while the rest were defined as the “Intermediate” group (320 patients; 35.0%).
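The grouping rule quoted above can be sketched as a small function. The cutoffs 33 and 46 are taken from the paper's text; the function name `math_group` and the group labels are mine. Values exactly equal to a cutoff fall into the intermediate group, since the quote says "lower than" and "higher than".

```python
def math_group(value, low_cutoff=33.0, high_cutoff=46.0):
    """Assign a MATH value to the low, intermediate, or high group."""
    if value < low_cutoff:
        return "low"
    if value > high_cutoff:
        return "high"
    # Everything from low_cutoff to high_cutoff inclusive is intermediate.
    return "intermediate"
```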

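The GSEA results relating MATH to gene expression are also interesting. The authors split patients into high and low MATH groups, ran a differential expression analysis, and then ran GSEA on the differential results (usually ranked by logFC). My own thought is that MATH is already a numeric value, so one could compute its correlation with each gene and use that correlation as the ranking metric for GSEA.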
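The correlation-based ranking I have in mind could look roughly like this. It is only a sketch of my suggestion, not something the paper does: the function names, the choice of Pearson correlation, and the input layout (a dict from gene name to expression values in the same patient order as the MATH values) are all assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no ranking information
    return sxy / sqrt(sxx * syy)

def rank_genes_by_correlation(math_values, expression):
    """Return (gene, correlation) pairs sorted from most positive to most negative.

    `expression` maps a gene name to its expression values, listed in the
    same patient order as `math_values`.
    """
    scores = {gene: pearson(math_values, values)
              for gene, values in expression.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

The sorted list has the same shape as a logFC ranking, so it could be passed to a preranked GSEA run in its place. A rank-based correlation such as Spearman might be a more robust choice if MATH or the expression values are heavily skewed.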

The authors themselves acknowledge the limitations of their study:

  • First, imperfections in clinical data, especially in survival data held us back from further investigating the potential clinical value of MATH in breast cancer.

  • Second, while MATH provided a more quantifiable estimation of ITH and is less affected by the limitation of somatic mutation number and sequencing depth in TCGA breast cancer cohort comparing with clustering based strategies like PyClone and EXPAND, it introduced another drawback that the ITH represented by the MATH value might be confounded by the load of SCNA, a phenomenon that cannot be ignored, although the fraction of the genome altered was adjusted in calculations involving SCNAs.

In principle, additional independent datasets should be brought in for third-party validation.

Later on I will share two studies from 2018 that both build their story around this MATH value.

This is the first installment of the Literature Club 2019 notes; the table of contents is as follows:
