Interpreting monocle2 pseudotime branch-point analysis results

How to map cell fate to branches?

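Reference: https://www.jianshu.com/p/9995cd707002

Pseudotime analysis produces many important outputs, but how should they be interpreted? Take the branch-point analysis result in the figure below as an example:

Figure: branch-point heatmap

In the figure, rows are genes, which is straightforward. The columns fall into three groups: Pre-branch, Cell fate 1, and Cell fate 2. What does each of these groups mean?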


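Pre-branch

To interpret the result, look at the state plot from the pseudotime analysis and ask: which cells does Pre-branch contain?

Figure: states from the pseudotime analysis

Here we want to compare state 7 with state 1, that is, to analyze branch point 3 (identify genes expressed in a branch-dependent manner). Which cells does Pre-branch contain in this case?

In fact, BEAM tries to traverse backward from the cell on the branch point all the way back to the root cell (the cell with pseudotime 0) and use all those cells as the pre-branch.
As this explanation shows, Pre-branch here contains the cells in states 2, 3, and 5.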
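
As a minimal sketch of what this selection amounts to, the base R snippet below builds a toy per-cell table and pulls out the cells in the pre-branch states. The data frame, its values, and the name `pre_branch_states` are made up for illustration. In a real monocle2 object the per-cell State and Pseudotime columns are read with pData(cds), and BEAM performs the selection internally.

```r
# Toy per-cell table standing in for pData(cds); all values are made up.
cells <- data.frame(
  cell       = paste0("cell", 1:8),
  State      = c(2, 2, 3, 5, 5, 1, 7, 4),
  Pseudotime = c(0, 1.2, 3.5, 6.1, 7.8, 9.4, 9.9, 4.2)
)

# States on the path from the root to branch point 3 (taken from the text above).
pre_branch_states <- c(2, 3, 5)

# Cells treated as pre-branch, ordered from the root outwards.
pre_branch <- cells[cells$State %in% pre_branch_states, ]
pre_branch <- pre_branch[order(pre_branch$Pseudotime), ]
print(as.character(pre_branch$cell))
```

Cells in states 1, 7, and 4 are left out: states 1 and 7 are the two fates being compared, and state 4 is not on the path from the root to branch point 3.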


'cell fate 1' and 'cell fate 2'

What exactly do cell fate 1 and cell fate 2 refer to? Take branch point 3 again:

Cell fate 1 corresponds to the state with small id (in this case, state 1) while cell fate 2 corresponds to state with bigger id (in this case, state 2)
Applying this smaller-id/larger-id rule to the two states compared here gives:

  • [x] Cell fate 1: state 1

  • [x] Cell fate 2: state 7
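
The rule can be written down as a small helper. This is only a sketch of the rule quoted above, not code from monocle; the function name `map_cell_fates` is hypothetical.

```r
# Hypothetical helper: label two branch states by the quoted rule
# (smaller state id -> Cell fate 1, larger state id -> Cell fate 2).
map_cell_fates <- function(branch_states) {
  stopifnot(length(branch_states) == 2)
  ordered <- sort(branch_states)
  c("Cell fate 1" = ordered[1], "Cell fate 2" = ordered[2])
}

# The order in which the two states are given does not matter.
fates <- map_cell_fates(c(7, 1))
print(fates)
```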

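Pre-branch in other scenarios

If we compare state 4 and state 7 instead, which cells make up Pre-branch?

This is a very good question since state 4 relates to branch point 2 while state 7 relates to branch point 3. For this test, the pre-branch will only include cells from state 2.
So in this case Pre-branch contains only the state 2 cells.

Afterword

This post only covers how to interpret branch-dependent genes; other results will be explained in later posts.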

The plot_multiple_branches_pseudotime function

plot_multiple_branches_pseudotime: creates kinetic curves to demonstrate the bifurcation of gene expression along multiple branches.
This function can be used to compare several branches in one analysis.

plot_multiple_branches_pseudotime(cds, branches, branches_name = NULL,
  min_expr = NULL, cell_size = 0.75, norm_method = c("vstExprs", "log"),
  nrow = NULL, ncol = 1, panel_order = NULL, color_by = "Branch",
  trend_formula = "~sm.ns(Pseudotime, df=3)", label_by_short_name = TRUE,
  TPM = FALSE, cores = 1)

# Example command (note: this one calls the related plot_multiple_branches_heatmap function)
plot_multiple_branches_heatmap(celltrajectory.monocle, branches = c(6,7),
  cluster_rows = TRUE, hclust_method = "ward.D2", num_clusters = 6,
  hmcols = NULL, add_annotation_row = NULL, add_annotation_col = NULL,
  show_rownames = FALSE, use_gene_short_name = TRUE,
  norm_method = c("vstExprs", "log"), scale_max = 3, scale_min = -3,
  trend_formula = "~sm.ns(Pseudotime, df=3)", return_heatmap = FALSE,
  cores = 1)

What does each heatmap column represent?

If you're looking for a deeper understanding of what the function is doing, I'd recommend digging into the source code for the function. The plot_genes_branched_heatmap function is in R/plotting.R, but it calls a nested function (buildBranchCellDataSet) that's contained in R/BEAM.R. I found it valuable to run through the code line by line and see what variables get made/changed.

But to briefly answer your question, monocle orders your cells along the trajectory, giving each cell a pseudotime value. Now, with expression values for each gene at different points in pseudotime (i.e. each cell), it uses a VGLM with splines to fit non-linear expression dynamics as a function of pseudotime. This model can then directly be used for differential expression if desired (e.g. using a likelihood ratio test against a reduced model that doesn't incorporate pseudotime). For plotting a heatmap though, there's a problem: the pseudotime values for your cells do not increase by sequential integers (i.e. 1,2,3,..,n). This is because monocle was designed recognizing that the jumps between cells along a trajectory aren't always the same distance. So if you were to make a heatmap, your column representation of pseudotime wouldn't be linear; it would depend on your sampling density along the trajectory. It could go, for example, 1,1.15,1.25,5,6,6.25,10 (see the problem?). So what the plotting function does (more specifically, a function called genSmoothCurves) is use the constructed models from before to predict gene expression of all genes along 100 evenly spaced pseudotime values spanning the range, and then makes a heatmap of those predictions rather than your scRNA-Seq measurements themselves. Each column represents one of those 100 pseudotime values.
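
To make the "100 evenly spaced pseudotime values" idea concrete, here is a minimal base R sketch for a single gene. It is a stand-in under stated assumptions, not monocle's implementation: an ordinary `lm` with a natural spline from the `splines` package replaces the VGLM fit, and the pseudotime and expression values are simulated.

```r
library(splines)  # ships with R; used here in place of monocle's VGAM-based fit

set.seed(1)
# Made-up, unevenly sampled pseudotime values and a noisy expression trend.
pseudotime <- sort(c(runif(30, 0, 2), runif(10, 5, 10)))
expression <- sin(pseudotime / 3) + rnorm(length(pseudotime), sd = 0.1)

# Fit a natural-spline model of expression as a function of pseudotime.
fit <- lm(expression ~ ns(pseudotime, df = 3))

# 100 evenly spaced pseudotime values spanning the observed range.
grid <- seq(min(pseudotime), max(pseudotime), length.out = 100)

# Predicted expression at each grid point: one heatmap row for this gene.
heatmap_row <- predict(fit, newdata = data.frame(pseudotime = grid))
```

Although the simulated cells are crowded near the start of the trajectory, `heatmap_row` has one value per evenly spaced grid point, so the heatmap columns are linear in pseudotime.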

The branched heatmap function is similar, except things are ordered differently. Those modelled values are ordered from the middle of the heatmap outwards. The left and right directions represent the modelled expression for two separate branches of the trajectory. The small region in the middle that is symmetrical represents the "progenitors" (the nomenclature used by the devs) prior to the branchpoint, and the point moving outwards where that symmetry breaks is the bifurcation point of the two independent branches. Going through the source code for this would really help make this clear.
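
The column layout described above can be sketched in base R for a single gene. This is not monocle's code: the vectors below are made-up stand-ins for the model predictions on two branches, and the 20/80 split between pre-branch and branch columns is an arbitrary assumption.

```r
# Made-up predicted expression for one gene: a shared pre-branch segment
# followed by two diverging branches (values are illustrative only).
pre_branch <- seq(0, 1, length.out = 20)
branch_a   <- c(pre_branch, seq(1, 3, length.out = 80))   # first branch
branch_b   <- c(pre_branch, seq(1, -1, length.out = 80))  # second branch

# Reverse one branch and join, so pseudotime runs from the middle outwards.
heatmap_row <- c(rev(branch_a), branch_b)

# The middle columns mirror each other because both branches share the
# same pre-branch predictions.
left_centre  <- heatmap_row[81:100]
right_centre <- heatmap_row[101:120]
print(isTRUE(all.equal(left_centre, rev(right_centre))))
```

Outside the mirrored middle section the two halves differ, which is where the bifurcation shows up in the heatmap.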

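In short, the pseudotime range is divided into 100 evenly spaced points, and each heatmap column shows the model-predicted expression at one of those pseudotime values.

References

Official explanation: How to map cell fate to branches?
plot_multiple_branches_pseudotime source code
Understanding plot_genes_branched_heatmap columns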
