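Single-cell transcriptomics reveals multi-step adaptations to endocrine therapy
When your talent cannot yet carry your ambition, settle down, keep your feet on the ground, and improve with us step by step. Before I knew it, I had been sharing knowledge in the single-cell transcriptomics field for almost two years, and through this Literature Express column I have been lucky enough to gather a group of friends to move forward and grow together.
The Literature Express column gives short introductions to broaden your knowledge. Follow it every day, and I hope you get something out of it too!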
Outlook
Introduction
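The growth of estrogen receptor-positive (ER+) breast cancer (BCa) depends largely on ER signaling. Patients with ER+ breast cancer are routinely given endocrine therapy, which blocks ER signaling in various ways and thereby suppresses tumor growth.
These patients often relapse years later. By then the evolved BCa cells typically signal in an estrogen-independent manner, mostly because of newly accumulated mutations; they are called endocrine-resistant cells.
Labs usually study this with cell models. The classic lab model of endocrine-resistant cells is long-term oestrogen-deprived (LTED) cells: cells that were originally ER+ are deprived of estrogen for about a year until they can grow independently of estrogen. These cell models may carry mutations in ESR1 or CYP19A1, and these mutations may be the drivers of acquired resistance.
Many cancers (such as melanoma) already contain, in the primary tumor, a small number of drug-tolerant/quiescent cells, which are the precursors of stably resistant cells. Breast cancer relapses very late and does not show these resistance features at the genetic level, but how its transcriptome evolves during estrogen deprivation (ED) remains to be studied.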
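This study:
Aim: dissect the phenotypic heterogeneity and plasticity of ER+ BCa, and find pre-adapted cells (PA cells) and their signatures
Methods: scRNA-seq, live-cell imaging, machine learning (a classifier is used for the dissection)
Results: PA cells were indeed found in the cell models; PA cells show dormancy and mixed epithelial and mesenchymal traits; these signatures/traits are higher in clusters of circulating tumor cells (CTCs) than in single CTCs (CTC clusters have greater metastatic potential than single CTCs); PA cells are very different from LTED cells, so they are only an intermediate product of the ED process, and ED should therefore be viewed as a multi-step model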
Results
Absence of features of resistance in treatment-naive cells
Treatment-naive cells and LTED cells are very different, as shown by the following:
gene copy number alterations (CNA) (panel c)
general transcriptome (panel b)
pathway activation (panel d)
Phenotypic heterogeneity of luminal breast cancer cells
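The search for PA cells below focuses on CD44+ cells: based on previous knowledge, CD44 is a marker of high cellular plasticity. This section therefore checks the importance of this gene to set up what follows. Model: MCF7 and LTED cell lines with a GFP reporter expressed under the promoter of the CD44 gene
Based on paired samples, CD44 is upregulated in patients with endocrine-resistant disease (panels a, b). (A quick plug: our lab has spent many years collecting matched primary/metastasis samples from the same patients, and a senior colleague built a simple Shiny app for them. It is rather slow at the moment, and the server may be upgraded in a while. You are welcome to visit and send suggestions: http://157.230.50.64:3838/apps/Paired_Mets/. Unfortunately, we did not see this increase in the metastatic patients in our own dataset; our PI suggests one possible explanation is that resistance induced by aromatase inhibitor treatment and by tamoxifen treatment produces different transcriptomes.)
CD44 high cells are plastic: after dividing, most of them lose CD44; this happens in both treatment-naive and LTED cells (panels c, d)
CD44 high cells have a survival advantage over CD44- cells during ED (panels e, f, g)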
Transcriptional heterogeneity of plastic cells
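The heterogeneity of CD44 high cells shows up in the scRNA-seq data
Biggest gripe: the two libraries were compared for heterogeneity without being integrated, so batch effects cannot be separated from real biological differences. That said, even without integration the shapes do look very similar, which suggests the two batches of cells are indeed similar at a large scale and that batch effects are not pronounced
Both were compared with LTED (panel b)
CD44 high cells are more heterogeneous than CD44 low cells (quantified by average Euclidean distance) (panel c)
Inferred regulatory networks were built separately for CD44 high and CD44 low cells and compared: community 1 differs the most (panel e); hallmark gene sets were used to annotate the biological meaning of the three communities (panel f)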
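As a rough illustration of the "average Euclidean distance" measure of heterogeneity mentioned above, here is a minimal sketch. The expression vectors are made-up toy values, and the function name `mean_pairwise_distance` is my own; the paper may well compute the distance in a reduced space or after a specific normalization, which is not reproduced here.

```python
from itertools import combinations
from math import dist


def mean_pairwise_distance(cells):
    """Average Euclidean distance over all unordered pairs of cells."""
    pairs = list(combinations(cells, 2))
    if not pairs:
        raise ValueError("need at least two cells")
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy expression vectors, one tuple per cell.
cd44_high = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
cd44_low = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)]

high_score = mean_pairwise_distance(cd44_high)
low_score = mean_pairwise_distance(cd44_low)
# A larger average distance is read as a more heterogeneous population.
print(high_score > low_score)
```

A larger value means the cells are, on average, further apart in expression space, which is the sense in which the CD44 high population is called more heterogeneous.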
Single-cell transcriptomics identifies pre-adapted cells
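A population of pre-adapted cells is defined with a sophisticated method
Features of PA cells and their clinical/biological significance
transcriptomically strongly biased towards features of starved cells (misclassified by the random forest classifier) (upper panel b)
From the DEGs found for these PA cells, a new marker, CLDN1, was identified; only when CD44 expression is high do CLDN1 high cells have a survival advantage during ED (upper panel d)
Pathway-level features of the PA cell marker genes (lower panel a)
PA cell marker genes are negatively correlated with cell cycle-related genes
This part is arguably the most magical bit of the paper. Let's take a look at the sophisticated method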
Identification of pre-adapted cells
Two different strategies were employed to identify the pre-adapted cells.
The first one takes advantage of SWNE; a threshold was applied on the first component and the cells showing extreme values (>=0.75) were labelled as pre-adapted.
The second strategy leverages random forests classifiers [62]. First of all, the data sets of CD44high cells in +E2 media and starved conditions (2 days) were split into training and testing sets, using 10% and 90% of the cells, respectively (I suspect this is a typo and it should be 90% and 10%). The training set was then used to call the DEGs between the two conditions (+E2 vs starved), using the procedure described in the Differential expression analysis paragraph above. These DEGs were used as input features to train a random forest classifier, using the randomForest R package (v4.6-14; default parameters). This model was then used to test the remaining data. Those cells in the testing set labelled as +E2 that were showing a probability >50% of being classified as starved were considered pre-adapted.
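To make the selection rule in this second strategy concrete, here is a minimal sketch of the split and the final call. It does not train a random forest: `prob_starved` stands in for the classifier's predicted probability of the "starved" class, and the function names, cell IDs and values are all hypothetical. The training fraction is left as a parameter because the quoted 10%/90% split may be a typo.

```python
import random


def split_cells(cell_ids, train_fraction, seed=0):
    """Shuffle cell IDs and split them into (training, testing) lists."""
    ids = list(cell_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


def call_pre_adapted(test_ids, condition, prob_starved, cutoff=0.5):
    """Return +E2 test cells whose predicted probability of being
    'starved' exceeds the cutoff (the >50% rule in the quoted methods)."""
    return [
        cell for cell in test_ids
        if condition[cell] == "+E2" and prob_starved[cell] > cutoff
    ]


# Hypothetical toy data: true culture condition and classifier output per cell.
condition = {"c1": "+E2", "c2": "+E2", "c3": "starved", "c4": "+E2"}
prob_starved = {"c1": 0.10, "c2": 0.80, "c3": 0.95, "c4": 0.50}

pre_adapted = call_pre_adapted(["c1", "c2", "c3", "c4"], condition, prob_starved)
print(pre_adapted)
```

In this toy example only `c2` is called pre-adapted: `c3` really is a starved cell, and `c4` sits exactly at 0.5, which does not exceed the cutoff.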
AUCell [39] (R package v1.0.0) was then used to quantify the activity of the pre-adapted signatures (and of other signatures, whenever indicated in the text) in single cells. First of all, normalised data were processed using the AUCell_buildRankings function. The resulting rankings, along with the signatures of interest, were then subject to function AUCell_calcAUC (aucMaxRank set to 5% of the number of input genes). Following inspection of the resulting distributions, thresholds were then manually set to 0.37, 0.18 and 0.32 for the signatures of pre-adapted cells either based on SWNE or random forests, or for the LTED signature (defined as those genes upregulated in LTED vs MCF7, as described in the Differential expression analysis section above).
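For readers who have not used AUCell, the idea behind the score can be sketched as follows: rank the genes of each cell by expression, then measure how early the signature genes appear within the top `max_rank` positions. This is a simplified, AUCell-style score written for illustration only; it is not the package's exact computation, and the function name `auc_score` and the toy cell are my own assumptions.

```python
def auc_score(expression, signature, max_rank):
    """Simplified AUCell-style score for one cell.

    expression: dict mapping gene -> expression value
    signature:  set of signature genes
    max_rank:   number of top-ranked genes to integrate over
    """
    # Rank genes from highest to lowest expression (ties broken by name).
    ranking = sorted(expression, key=lambda g: (-expression[g], g))
    max_rank = min(max_rank, len(ranking))
    hits = 0
    auc = 0
    for gene in ranking[:max_rank]:
        if gene in signature:
            hits += 1
        auc += hits  # height of the recovery curve at this rank
    # Best case: every detectable signature gene sits at the top of the ranking.
    n_sig = len(signature & set(expression))
    max_auc = sum(min(k, n_sig) for k in range(1, max_rank + 1))
    return auc / max_auc if max_auc else 0.0


# Hypothetical toy cell with four genes.
cell = {"A": 10.0, "B": 9.0, "C": 1.0, "D": 0.0}
top_score = auc_score(cell, {"A", "B"}, max_rank=2)
bottom_score = auc_score(cell, {"C", "D"}, max_rank=2)
print(top_score, bottom_score)
```

A cell would then be called positive for a signature when its score exceeds a manually chosen threshold, such as the 0.37, 0.18 and 0.32 values quoted above.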
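Some analysis: first, the definition based on SWNE1 is somewhat of a stretch. None of the data were integrated, so in theory the meaning/composition of SWNE1 can differ a lot between datasets. That is why it was necessary for the authors to build a classifier as well, to show that PA cells resemble starved cells rather than treatment-naive cells. The first step presumably works because batch effects are small enough to allow the comparison (as noted in the previous section)
The choice of CLDN1 also comes across as rather abrupt, given how many DEGs were available as candidates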
PA features persist in acute-ET, but not in full resistance
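During short-term ED, cells gradually show higher expression of the PA signature (and of some of its marker genes) but no expression of the LTED signature (quantified with AUCell). This indicates that the cell states in LTED and in short-term ED differ, i.e. PA features exist only in the acute phase of estrogen deprivation
After 7 days of ED, the surviving CD44 low and CD44 high cells show similar transcriptomic alterations. This leads to a hypothesis: the PA signature is the bottleneck of acute endocrine therapy, against which cells are selected; CD44 low cells upregulate this signature less efficiently than CD44 high cells and therefore have a survival disadvantage (the PA signature is necessary but not sufficient for survival)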
The PA signature is enriched in clusters of CTCs
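This is the validation part of the paper: the findings are validated in another cell line and in clinical samples.
Similar PA cells were also found in T47D cells (shifting on SWNE1)
Because the PA phenotype shows EMT and polarity features, the authors hypothesize that these features play a role in metastatic progression
Comment: this point is rather contradictory. According to Fig5a, both EMT and apical junction are upregulated in PA cells, and the latter is an epithelial (Epi) feature. The argument here is that CTC clusters are more Epi-like than single cells and that PA cells have polarity (Epi) features, so CTC clusters are likely to have (and in fact do have) higher PA features; this conflicts with EMT being high in PA cells
Known: CTC clusters contribute to >85% of metastatic dissemination; single CTCs have more epithelial features
The PA signature is higher in CTCs than in healthy blood
Compared with single CTCs, CTC clusters show higher expression of the PA, EMT and cell cycle signatures
TCGA:
Luminal A expresses the PA signature at a higher level than luminal B, consistent with the former's longer relapse latency
Within luminal A, co-expression among PA genes is stronger than co-expression within random gene sets, consistent with the earlier conclusion that PA genes are regulated by the same gene regulatory network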
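The comparison of PA gene co-expression against random gene sets can be sketched as a simple resampling test. The sketch below assumes that co-expression is measured as the mean pairwise Pearson correlation and that the random sets are drawn with the same size as the signature; both are my assumptions, since the post does not say how the paper measured it. The gene names and values are hypothetical.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_coexpression(expr, genes):
    """Mean pairwise correlation among the given genes."""
    pairs = list(combinations(sorted(genes), 2))
    return sum(pearson(expr[a], expr[b]) for a, b in pairs) / len(pairs)


def empirical_p(expr, signature, n_random=200, seed=0):
    """Compare the signature with random gene sets of the same size."""
    rng = random.Random(seed)
    observed = mean_coexpression(expr, signature)
    all_genes = sorted(expr)
    null = [
        mean_coexpression(expr, rng.sample(all_genes, len(signature)))
        for _ in range(n_random)
    ]
    # Add-one correction so the empirical p-value is never exactly zero.
    p = (1 + sum(s >= observed for s in null)) / (1 + n_random)
    return observed, p


# Hypothetical toy matrix: gene -> expression across five samples.
expr = {
    "G1": [1, 2, 3, 4, 5],
    "G2": [2, 4, 6, 8, 10],
    "G3": [1, 3, 5, 7, 9],
    "G4": [5, 1, 4, 2, 3],
    "G5": [2, 5, 1, 4, 3],
    "G6": [3, 3, 1, 5, 2],
}
observed, p_value = empirical_p(expr, {"G1", "G2", "G3"})
print(observed, p_value)
```

A small empirical p-value would mean that few random gene sets are as tightly co-expressed as the signature, which is the kind of evidence the TCGA comparison relies on.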
Summary Model
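That is, endocrine resistance is a multi-step process.
After estrogen deprivation, a subset of plastic cells has a survival advantage and survives. Among them, as acute ED progresses, the transcriptome becomes increasingly enriched for pre-adaptation features; over the long course of ED that follows, these cells keep accumulating gene mutations, eventually become fully resistant cells, and give rise to metastasis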