Where a Microbial Community Comes From Is Ours to Determine: FEAST or source_tracker

2024-05-10 20:16:39

Microbial source tracking analysis

Preface

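My boss has had an analysis project going lately, and my own progress on it has been slow and difficult, so I have not posted here for quite a while. Today's topic is microbial source tracking, using FEAST, a tool released fairly recently.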

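Teacher Liu has written a detailed walkthrough of the paper: "Nature Methods: FEAST, a fast and accurate tool for microbial source tracking". Following that lead, I tried the tool myself today. One question: why didn't the authors wrap the tool in an R package? That would make it much easier to use, since many readers may never have cloned a project from GitHub.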

The source_tracker workflow is already described in detail on the 宏基因组 (Metagenome) WeChat public account, so I skip it here.

The focus this time: FEAST

Preparation

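Not just this time: in all my later analyses I will wrap the whole community dataset in a phyloseq object, simply because it makes microbial community data easier and more flexible to analyze. If you are new to it, you will need to install the phyloseq package, which has a great many dependencies, and get familiar with how a phyloseq object is structured and accessed.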

To make things easier, I also save the data as CSV, so that readers who have not yet worked with phyloseq can test the workflow without any hassle.
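For readers who skip phyloseq entirely, the two CSV files (written as `otus.csv` and `metadata.csv` by the script further down) can be read back with base R. The following is a minimal sketch: the helper name `load_feast_inputs` is hypothetical, and it assumes the OTU file has taxa in rows and samples in columns, with sample IDs as the row names of the metadata file.

```r
# Hypothetical helper (not part of FEAST): load the OTU table and the sample
# metadata from CSV. Assumes otus.csv has taxa in rows and samples in columns,
# and that the first column of each file holds the row names.
load_feast_inputs <- function(otu_file, meta_file) {
  otus <- read.csv(otu_file, row.names = 1, check.names = FALSE)
  metadata <- read.csv(meta_file, row.names = 1, check.names = FALSE)
  otus <- t(as.matrix(otus))  # samples in rows, taxa in columns
  # Keep only samples present in both files, in the metadata order
  shared <- intersect(rownames(metadata), rownames(otus))
  list(otus = otus[shared, , drop = FALSE],
       metadata = metadata[shared, , drop = FALSE])
}

# Toy demonstration with temporary files (made-up counts)
toy_otus <- data.frame(S1 = c(5, 0, 3), S2 = c(1, 4, 0),
                       row.names = c("OTU_1", "OTU_2", "OTU_3"))
toy_meta <- data.frame(SampleType = c("A", "B"), row.names = c("S1", "S2"))
otu_file <- tempfile(fileext = ".csv")
meta_file <- tempfile(fileext = ".csv")
write.csv(toy_otus, otu_file, quote = FALSE)
write.csv(toy_meta, meta_file, quote = FALSE)

inputs <- load_feast_inputs(otu_file, meta_file)
dim(inputs$otus)  # 2 samples x 3 taxa
```

With real data, `inputs$otus` and `inputs$metadata` would take the place of `otus` and `metadata` in the scripts in this post.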

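Building on the authors' analysis core, I put together a smooth pipeline that starts from an OTU table and a grouping file, and added modules for visualization and for saving results. I hope it proves convenient.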

Microbial source tracking analysis

FEAST offers two ways to run a source tracking analysis.

1. A single sink: source analysis for one sample. 2. Multiple sinks and multiple sources: source analysis for several samples.

First, a demonstration of source analysis with a single sink sample and its source samples.

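# rm(list = ls()) # gc()

library(phyloseq) # sample_data(), otu_table(), taxa_are_rows()
library(dplyr)    # arrange()

path = "./phyloseq_7_source_FEAST"

dir.create(path, showWarnings = FALSE)

## Load the main FEAST functions

source("./FEAST-master/FEAST_src/src.R")
ps = readRDS("./a3_DADA2_table/ps_OTU_.ps")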

# Load the grouping file (sample metadata) and the OTU table

metadata <- as(sample_data(ps), "data.frame") # plain data.frame, safe for dplyr

head(metadata)
write.csv(metadata,"metadata.csv",quote = F)

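# Load OTU table

# Extract the OTU table as a matrix with samples in rows (vegan orientation)
vegan_otu <- function(physeq){
  OTU <- otu_table(physeq)
  if(taxa_are_rows(OTU)){
    OTU <- t(OTU)
  }
  return(as(OTU,"matrix"))
}

otus <- as.data.frame(t(vegan_otu(ps))) # taxa in rows, for export

write.csv(otus,"otus.csv",quote = F)

otus <- t(as.matrix(otus)) # back to samples in rows for FEAST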
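### Separate the sink sample from the source samples below.
# Sort samples by group and put the OTU table in the same row order, so that
# indices computed from metadata point at the matching rows of otus and envs.
# Base R ordering keeps the sample names as row names.
metadata <- metadata[order(metadata$SampleType), , drop = FALSE]
otus <- otus[rownames(metadata), ]
envs <- metadata$SampleType

# Replicate number within each group (assumes 4 groups with 6 replicates each)
metadata$id = rep(1:6,4)

Ids <- na.omit(unique(metadata$id))

it = 1
train.ix <- which(metadata$SampleType%in%c("B","C","D")& metadata$id == Ids[it])

test.ix <- which(metadata$SampleType=='A' & metadata$id == Ids[it])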
# Extract the source environments and source/sink indices
num_sources <- length(train.ix) #number of sources

COVERAGE = min(rowSums(otus[c(train.ix, test.ix),])) #Can be adjusted by the user
# Rarefy the source and sink samples to the same depth

sources <- as.matrix(rarefy(otus[train.ix,], COVERAGE))

sinks <- as.matrix(rarefy(t(as.matrix(otus[test.ix,])), COVERAGE))
dim(sinks)

print(paste("Number of OTUs in the sink sample = ",length(which(sinks > 0))))

print(paste("Seq depth in the sources and sink samples = ",COVERAGE))

print(paste("The sink is:", envs[test.ix]))
# Estimate source proportions for each sink

EM_iterations = 1000 # number of EM iterations. default value
FEAST_output<-FEAST(source=sources, sinks = t(sinks), env = envs[train.ix], em_itr = EM_iterations, COVERAGE = COVERAGE)

Proportions_est <- FEAST_output$data_prop[,1]

names(Proportions_est) <- c(as.character(envs[train.ix]), "unknown")
print("Source mixing proportions")

Proportions_est

round(Proportions_est,3)
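The script rarefies sources and sink to a common depth (`COVERAGE`) before calling FEAST. As a rough illustration of what rarefaction means, here is a base R sketch that subsamples reads without replacement. The function `rarefy_counts` is hypothetical and written only for this example; it is not the `rarefy` function that the script calls.

```r
# Hypothetical illustration of rarefaction; not the FEAST implementation.
rarefy_counts <- function(counts, depth) {
  stopifnot(sum(counts) >= depth)
  # One entry per read, labelled by the index of the taxon it belongs to
  reads <- rep(seq_along(counts), times = counts)
  # Draw `depth` reads without replacement
  kept <- reads[sample.int(length(reads), depth)]
  # Count how many of the kept reads fall in each taxon
  out <- tabulate(kept, nbins = length(counts))
  names(out) <- names(counts)
  out
}

set.seed(1)
sample_counts <- c(OTU_1 = 50, OTU_2 = 30, OTU_3 = 20)  # made-up counts
rarefied <- rarefy_counts(sample_counts, depth = 40)
sum(rarefied)  # equals the requested depth
```

Because the draw is random, the per-taxon counts change from run to run unless a seed is set; only the total depth is fixed.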

In practice we usually sequence replicates, so here is a source tracking analysis based on multiple samples.

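Source tracking with multiple sinks and sources: different_sources_flag sets how sinks are matched to sources, that is, whether each sink gets its own set of source samples (flag = 1) or all sinks share the same source samples.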
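To make the effect of the flag concrete, here is a small sketch on toy metadata. The helper `pick_indices` is hypothetical; it simply mirrors the index selection used in the script, with "A" as the sink group and "B", "C", "D" as the source groups.

```r
# Hypothetical helper mirroring the index selection used in this post.
pick_indices <- function(metadata, id, different_sources_flag,
                         sink_type = "A", source_types = c("B", "C", "D")) {
  is_source <- metadata$SampleType %in% source_types
  is_sink <- metadata$SampleType == sink_type & metadata$id == id
  if (different_sources_flag == 1) {
    # Each sink is paired only with the sources that share its replicate id
    is_source <- is_source & metadata$id == id
  }
  list(train.ix = which(is_source), test.ix = which(is_sink))
}

# Toy metadata: 4 groups with 2 replicates each, already sorted by group
toy <- data.frame(SampleType = rep(c("A", "B", "C", "D"), each = 2),
                  id = rep(1:2, times = 4))

paired <- pick_indices(toy, id = 1, different_sources_flag = 1)
pooled <- pick_indices(toy, id = 1, different_sources_flag = 0)
paired$train.ix  # 3 5 7
pooled$train.ix  # 3 4 5 6 7 8
```

With the flag set to 1 the sink is compared against one sample per source group; otherwise every source sample is used for every sink.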


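## Load the main FEAST functions

library(phyloseq) # sample_data(), otu_table(), taxa_are_rows()
library(dplyr)    # arrange()

source("./FEAST-master/FEAST_src/src.R")
ps = readRDS("./a3_DADA2_table/ps_OTU_.ps")

# Load the grouping file (sample metadata) and the OTU table

metadata <- as(sample_data(ps), "data.frame") # plain data.frame, safe for dplyr

head(metadata)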

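# Load OTU table

# Extract the OTU table as a matrix with samples in rows (vegan orientation)
vegan_otu <- function(physeq){
  OTU <- otu_table(physeq)
  if(taxa_are_rows(OTU)){
    OTU <- t(OTU)
  }
  return(as(OTU,"matrix"))
}

otus <- as.data.frame(t(vegan_otu(ps))) # taxa in rows

otus <- t(as.matrix(otus)) # back to samples in rows for FEAST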
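head(metadata)
# Sort samples by group and put the OTU table in the same row order, so that
# indices computed from metadata point at the matching rows of otus and envs.
metadata <- metadata[order(metadata$SampleType), , drop = FALSE]
otus <- otus[rownames(metadata), ]

# Replicate number within each group (assumes 4 groups with 6 replicates each)
metadata$id = rep(1:6,4)

EM_iterations = 1000 #default value

different_sources_flag = 1
envs <- metadata$SampleType

Ids <- na.omit(unique(metadata$id))

Proportions_est <- list()
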
for(it in 1:length(Ids)){
# Extract the source environments and source/sink indices

if(different_sources_flag == 1){
  # Each sink uses only the sources with the same replicate id
  train.ix <- which(metadata$SampleType%in%c("B","C","D")& metadata$id == Ids[it])
  test.ix <- which(metadata$SampleType=='A' & metadata$id == Ids[it])
} else {
  # Every sink uses all source samples
  train.ix <- which(metadata$SampleType%in%c("B","C","D"))
  test.ix <- which(metadata$SampleType=='A' & metadata$id == Ids[it])
}
num_sources <- length(train.ix)

COVERAGE = min(rowSums(otus[c(train.ix, test.ix),])) #Can be adjusted by the user
# Define sources and sinks
sources <- as.matrix(rarefy(otus[train.ix,], COVERAGE))

sinks <- as.matrix(rarefy(t(as.matrix(otus[test.ix,])), COVERAGE))
print(paste("Number of OTUs in the sink sample = ",length(which(sinks > 0))))

print(paste("Seq depth in the sources and sink samples = ",COVERAGE))

print(paste("The sink is:", envs[test.ix]))
# Estimate source proportions for each sink
FEAST_output<-FEAST(source=sources, sinks = t(sinks), env = envs[train.ix], em_itr = EM_iterations, COVERAGE = COVERAGE)

Proportions_est[[it]] <- FEAST_output$data_prop[,1]
names(Proportions_est[[it]]) <- c(as.character(envs[train.ix]), "unknown")
if(length(Proportions_est[[it]]) < num_sources +1){
tmp = Proportions_est[[it]]

Proportions_est[[it]][num_sources] = NA

Proportions_est[[it]][num_sources+1] = tmp[num_sources]

}
print("Source mixing proportions")

print(Proportions_est[[it]])
}
print(Proportions_est)
went = as.data.frame(Proportions_est)

colnames(went) = paste("repeat_",unique(metadata$id),sep = "")

head(went)
filename = paste(path,"/FEAST.csv",sep = "")

write.csv(went,filename,quote = F)

Plotting: a simple set of pie charts for reference

library(RColorBrewer)
library(dplyr)
library(graphics)


head(went)
plotname = paste(path,"/FEAST.pdf",sep = "")

pdf(file = plotname,width = 12,height = 12)

par(mfrow=c((length(unique(metadata$SampleType))%/%2 +2 ),2), mar=c(1,1,1,1))

# layouts = as.character(unique(metadata$SampleType))
for (i in 1:length(colnames(went))) {
labs <- paste0(row.names(went)," \n(", round(went[,i]/sum(went[,i])*100,2), "%)")
pie(went[,i],labels=labs, init.angle=90,col = brewer.pal(nrow(went), "Reds"),

border="black",main =colnames(went)[i] )

}
dev.off()

With multiple replicates, we merge them into one pie chart

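As biologists we may well measure nine or more replicates, and showing nine pie charts would be excessive. Instead, take the mean across replicates and show a single pie chart of the mean.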

head(went)


asx = as.data.frame(rowMeans(went))
asx = as.matrix(asx)

asx_norm = t(t(asx)/colSums(asx)) #* 100 # normalization to total 100

head(asx_norm)
plotname = paste(path,"/FEAST_mean.pdf",sep = "")

pdf(file = plotname,width = 6,height = 6)

labs <- paste0(row.names(asx_norm)," \n(", round(asx_norm[,1]/sum(asx_norm[,1])*100,2), "%)")
pie(asx_norm[,1],labels=labs, init.angle=90,col = brewer.pal(nrow(went), "Reds"),

border="black",main = "mean of source tracker")

dev.off()
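One detail worth checking before averaging: the loop above can write NA into a replicate when FEAST returns fewer proportions than expected, and `rowMeans` returns NA for any row that contains one. A minimal sketch with made-up numbers, assuming it is acceptable to average over the replicates that do have a value:

```r
# Toy table: rows are sources, columns are replicates (values are made up)
toy_went <- data.frame(
  repeat_1 = c(0.50, 0.20, 0.10, 0.20),
  repeat_2 = c(0.40, NA,   0.20, 0.40),
  row.names = c("B", "C", "D", "unknown")
)

# Average each source over the replicates that have a value
mean_prop <- rowMeans(toy_went, na.rm = TRUE)

# Rescale so the mean proportions sum to 1 before plotting
mean_norm <- mean_prop / sum(mean_prop)
round(mean_norm, 3)
```

Whether dropping the NA values is appropriate depends on why they appeared, so it is worth inspecting those replicates rather than averaging blindly.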
