RNAseqQC 给你的数据来个全面的 QC 检查

引言


分享个新鲜出炉的 R 包,RNAseqQC 可以给你的 counts 数据做个全面的质量评估,非常简单,基本上一行代码就可出图。

参考文档:
https://cran.r-project.org/web/packages/RNAseqQC/vignettes/introduction.html
安装

install.packages(“RNAseqQC”)
示例

输入数据是一个 counts 矩阵和 表型 数据:
count_mat <- counts(T47D)
meta <- data.frame(colData(T47D))

count matrix; rownames must be ENSEMBL gene IDs

count_mat[head(which(rowSums(count_mat) > 0)), 1:10]

> veh_WT_rep1 veh_WT_rep2 veh_WT_rep3 veh_WT_rep4 veh_D538G_rep1

> ENSG00000223972 5 3 3 6 4

> ENSG00000278267 2 3 2 4 2

> ENSG00000227232 80 91 95 97 65

> ENSG00000243485 1 2 1 2 0

> ENSG00000237613 0 0 0 1 0

> ENSG00000239945 7 5 7 4 5

> veh_D538G_rep2 veh_D538G_rep4 veh_Y537S_rep1 veh_Y537S_rep2

> ENSG00000223972 1 6 3 5

> ENSG00000278267 2 2 2 2

> ENSG00000227232 70 73 74 65

> ENSG00000243485 2 3 1 1

> ENSG00000237613 0 0 0 0

> ENSG00000239945 5 5 1 5

> veh_Y537S_rep3

> ENSG00000223972 3

> ENSG00000278267 2

> ENSG00000227232 89

> ENSG00000243485 1

> ENSG00000237613 0

> ENSG00000239945 7

metadata of the samples, where row i corresponds to column i in the count matrix

meta

> treatment mutation replicate

> veh_WT_rep1 veh WT rep1

> veh_WT_rep2 veh WT rep2

> veh_WT_rep3 veh WT rep3

> veh_WT_rep4 veh WT rep4

> veh_D538G_rep1 veh D538G rep1

> veh_D538G_rep2 veh D538G rep2

> veh_D538G_rep4 veh D538G rep4

> veh_Y537S_rep1 veh Y537S rep1

> veh_Y537S_rep2 veh Y537S rep2

> veh_Y537S_rep3 veh Y537S rep3

> veh_Y537S_rep4 veh Y537S rep4

> veh_D538G_rep3 veh D538G rep3

> E2_WT_rep1 E2 WT rep1

> E2_WT_rep2 E2 WT rep2

> E2_WT_rep3 E2 WT rep3

> E2_WT_rep4 E2 WT rep4

> E2_Y537S_rep1 E2 Y537S rep1

> E2_Y537S_rep2 E2 Y537S rep2

> E2_Y537S_rep3 E2 Y537S rep3

> E2_Y537S_rep4 E2 Y537S rep4

> E2_D538G_rep1 E2 D538G rep1

> E2_D538G_rep2 E2 D538G rep2

> E2_D538G_rep3 E2 D538G rep3

> E2_D538G_rep4 E2 D538G rep4

然后创建一个 DESeqDataSet,然后就可以画 QC 图了:
dds <- make_dds(counts = count_mat, metadata = meta, ah_record = “AH89426”)
查看样本的所有 counts:
plot_total_counts(dds)

查看文库复杂性:
plot_library_complexity(dds)

查看基因和 counts 的关系:
plot_gene_detection(dds)

查看基因类型:
plot_biotypes(dds)

vst 标准化:
vsd <- vst(dds)
mean_sd_plot(vsd)

查看基因在染色体上的表达:
map(c(“1”, “5”, “14”), ~plot_chromosome(vsd, .x))

> [[1]]

查看生物学重复样本的相关性:

define new grouping variable

colData(vsd)$trt_mut <- paste0(colData(vsd)$treatment, “_”, colData(vsd)$mutation)

ma_plots <- plot_sample_MAs(vsd, group = “trt_mut”)
cowplot::plot_grid(plotlist = ma_plots[17:24], ncol = 2)

热图聚类:

set seed to control random annotation colors

set.seed(1)
plot_sample_clustering(vsd, anno_vars = c(“treatment”, “mutation”, “replicate”), distance = “euclidean”)

PCA:
plot_pca(vsd, PC_x = 1, PC_y = 2, color_by = “treatment”, shape_by = “mutation”)

PCA 展示可视化基因在每个组的表达:
plot_pca(vsd, PC_x = 1, PC_y = 2, color_by = “ENSG00000223972”)

PCA 和 loading 的关系:
pca_res <- plot_pca(vsd, show_plot = FALSE)
plot_loadings(pca_res, PC = 1, annotate_top_n = 5)

标出基因:
plot_loadings(pca_res, PC = 1, highlight_genes = c(“CD34”, “FLT1”, “MAPT”))

以基因类型来上色:
plot_loadings(pca_res, PC = 4, color_by = “gene_biotype”, show_plot = F)$plot +
theme(legend.position = “bottom”)

多个主成分展示:
plot_pca_scatters(vsd, n_PCs = 5, color_by = “treatment”, shape_by = “mutation”)

差异分析,然后可视化基因在不同组的表达:
dds <- estimateSizeFactors(dds)
plot_gene(“CLEC2B”, dds, x_var = “mutation”, color_by = “treatment”)

多因子差异分析:

design variables need to be factors

since we update the design of the dds

object, we have to do it manually

dds$mutation <- as.factor(dds$mutation)
dds$treatment <- as.factor(dds$treatment)
design(dds) <- ~ mutation + treatment

dds <- DESeq(dds, parallel = T)
plotDispEsts(dds)

可视化:

get testing results

de_res <- lfcShrink(dds, coef = “mutation_WT_vs_D538G”, lfcThreshold = log2(1.5), type = “normal”, parallel = TRUE)

MA plot

plot_ma(de_res, dds, annotate_top_n = 5)

标出基因:
plot_ma(de_res, dds, highlight_genes = c(“CLEC2B”, “PAGE5”, “GAPDH”))

结尾


路漫漫其修远兮,吾将上下而求索。
欢迎加入生信交流群。加我微信我也拉你进 微信群聊 老俊俊生信交流群 (微信交流群需收取 20 元入群费用,一旦交费,拒不退还!(防止骗子和便于管理)) 。QQ 群可免费加入, 记得进群按格式修改备注哦。

声明:文中观点不代表本站立场。本文传送门:https://eyangzhen.com/418468.html

(0)
联系我们
联系我们
分享本页
返回顶部