Created: 2026/6/3 | Difficulty: Advanced | Recommended scenario: Tumor transcriptomics | Estimated time: 3-5 days
Pipeline Detail
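Cancer Transcriptomics: Tumor Transcriptomics and Clinical Applications
Comprehensive Tumor RNA-seq Analysis
A comprehensive analysis pipeline for tumor bulk RNA-seq that integrates differential expression (DEG), pathway activity, immune infiltration, molecular subtyping, prognosis, fusion genes, and interpretation of candidate mechanisms.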
Metadata
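Pipeline metadata
Review the application scenario, inputs and outputs, and tool dependencies before moving on to the command details in the main text.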
Difficulty
Advanced
Scenario
Tumor transcriptomics
Estimated Time
3-5 days
Tools
DESeq2, STAR, GSVA, ssGSEA, xCell, ESTIMATE, STAR-Fusion, Arriba
Inputs
BAM, GTF, TPM
Outputs
heatmap, fusion candidates, pathway score, report
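Workflow DAG
Flowchart
The step nodes give a quick view of how the analysis flows from raw data to the final report.
STEP 1: Set up the tumor project directory
STEP 2: Clinical information and expression matrix
STEP 3: Differential expression
STEP 4: Pathway activity
STEP 5: Immune infiltration
STEP 6: Prognosis analysis
STEP 7: Subtyping / clustering
STEP 8: Fusion genes
STEP 9: Integrated report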
Protocol
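Pipeline documentation
The main text keeps its Markdown layout, code language labels, and table styles, so it can be followed and reproduced step by step.
Comprehensive Tumor RNA-seq Analysis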
1. Project directory
mkdir -p tumor_rnaseq_project/{00_clinical,01_expression,02_deg,03_pathway,04_immune,05_survival,06_subtype,07_fusion,report}
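The `mkdir` command relies on shell brace expansion, which is not available in every shell. The same layout can be created from R; this is a minimal sketch that assumes the project root lives in the current working directory, and the variable names `project_root` and `subdirs` are illustrative only.

```r
# Create the project layout from R; mirrors the mkdir command above.
project_root <- "tumor_rnaseq_project"
subdirs <- c(
  "00_clinical", "01_expression", "02_deg", "03_pathway",
  "04_immune", "05_survival", "06_subtype", "07_fusion", "report"
)

for (d in file.path(project_root, subdirs)) {
  # recursive = TRUE also creates the project root if it is missing;
  # showWarnings = FALSE keeps reruns quiet when a directory already exists.
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

# List what was created as a quick sanity check.
print(list.dirs(project_root, recursive = FALSE, full.names = FALSE))
```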
2. Example data
00_clinical/clinical_info.csv:
sample_id,group,stage,OS_time,OS_status
Tumor_1,Tumor,III,520,1
Tumor_2,Tumor,II,900,0
Normal_1,Normal,NA,NA,NA
Normal_2,Normal,NA,NA,NA
01_expression/tpm_matrix.csv:
gene_symbol,Tumor_1,Tumor_2,Normal_1,Normal_2
MKI67,50,45,5,6
PDCD1,8,10,1,1.2
EPCAM,100,120,30,28
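Most downstream steps assume that the sample IDs in the clinical table match the columns of the expression matrix. The sketch below checks that on inline copies of the two example files; in a real project the `text =` arguments would be replaced by the file paths shown above.

```r
# Inline copies of the two example files shown above.
clinical <- read.csv(text = "sample_id,group,stage,OS_time,OS_status
Tumor_1,Tumor,III,520,1
Tumor_2,Tumor,II,900,0
Normal_1,Normal,NA,NA,NA
Normal_2,Normal,NA,NA,NA", row.names = 1)

tpm <- read.csv(text = "gene_symbol,Tumor_1,Tumor_2,Normal_1,Normal_2
MKI67,50,45,5,6
PDCD1,8,10,1,1.2
EPCAM,100,120,30,28", row.names = 1, check.names = FALSE)

# Samples present in only one of the two tables.
missing_in_expr <- setdiff(rownames(clinical), colnames(tpm))
missing_in_clin <- setdiff(colnames(tpm), rownames(clinical))
stopifnot(length(missing_in_expr) == 0, length(missing_in_clin) == 0)

# Reorder the expression columns to follow the clinical table.
tpm <- tpm[, rownames(clinical), drop = FALSE]

# Basic sanity checks: unique gene symbols and no negative TPM values.
stopifnot(!anyDuplicated(rownames(tpm)), all(tpm >= 0, na.rm = TRUE))
print(table(clinical$group))
```

Failing early here is cheaper than debugging a mislabelled contrast or survival curve later.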
3. Overall flowchart
flowchart TD
A[expression + clinical metadata] --> B[DEG]
A --> C[GSVA/ssGSEA pathway score]
A --> D[immune infiltration]
A --> E[survival analysis]
A --> F[molecular subtype clustering]
A --> G[fusion detection]
B --> H[integrated tumor mechanism]
C --> H
D --> H
E --> H
F --> H
G --> H
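4. Differential expression
library(DESeq2)
# DESeq2 needs raw (unnormalized) counts; the TPM matrix is not a suitable input.
counts <- read.csv("01_expression/raw_counts.csv", row.names = 1, check.names = FALSE)
clinical <- read.csv("00_clinical/clinical_info.csv", row.names = 1)
# Keep samples present in both tables and put them in the same order.
shared <- intersect(colnames(counts), rownames(clinical))
counts <- counts[, shared, drop = FALSE]
clinical <- clinical[shared, , drop = FALSE]
# Use Normal as the reference level so fold changes read as Tumor vs Normal.
clinical$group <- factor(clinical$group, levels = c("Normal", "Tumor"))
dds <- DESeqDataSetFromMatrix(
  countData = round(as.matrix(counts)),
  colData = clinical,
  design = ~ group
)
# Keep genes with at least 10 reads in at least 3 samples.
dds <- dds[rowSums(counts(dds) >= 10) >= 3, ]
dds <- DESeq(dds)
res <- results(dds, contrast = c("group", "Tumor", "Normal"))
write.csv(as.data.frame(res), "02_deg/Tumor_vs_Normal_DESeq2.csv")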
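The results table can then be labelled by direction and drawn as a volcano plot. This is a sketch on a toy table that only mimics the `log2FoldChange` and `padj` columns DESeq2 writes out; the gene values, the cutoffs, and the output file name `volcano_sketch.pdf` are assumptions for illustration.

```r
# Toy table mimicking DESeq2 output columns; all values are invented.
res_df <- data.frame(
  gene           = c("MKI67", "PDCD1", "EPCAM", "GENE_A", "GENE_B", "GENE_C"),
  log2FoldChange = c(3.1, 2.8, 1.9, -0.2, -2.4, 1.5),
  padj           = c(1e-8, 0.003, 0.01, 0.70, 0.02, NA)
)

# Assumed cutoffs; adjust to the study design.
lfc_cutoff  <- 1
padj_cutoff <- 0.05

# Genes with padj = NA (e.g. removed by independent filtering) stay "NS".
is_sig <- !is.na(res_df$padj) & res_df$padj < padj_cutoff
res_df$direction <- "NS"
res_df$direction[is_sig & res_df$log2FoldChange >=  lfc_cutoff] <- "Up"
res_df$direction[is_sig & res_df$log2FoldChange <= -lfc_cutoff] <- "Down"
print(table(res_df$direction))

# Volcano plot coloured by direction; points with padj = NA are not drawn.
pdf("volcano_sketch.pdf", width = 5, height = 5)
plot(
  res_df$log2FoldChange, -log10(res_df$padj),
  col = c(Down = "blue", NS = "grey60", Up = "red")[res_df$direction],
  pch = 19,
  xlab = "log2 fold change (Tumor vs Normal)", ylab = "-log10(padj)"
)
abline(v = c(-lfc_cutoff, lfc_cutoff), h = -log10(padj_cutoff), lty = 2)
dev.off()
```

In a real run, `res_df` would come from `as.data.frame(res)` with the gene identifiers taken from the row names.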
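5. Pathway activity
library(GSVA)
library(msigdbr)
tpm <- read.csv("01_expression/tpm_matrix.csv", row.names = 1, check.names = FALSE)
expr_log <- log2(as.matrix(tpm) + 1)
# Hallmark gene sets as a named list (gene-set name -> gene symbols).
# The native pipe |> has no `.` placeholder, so split() is called on a data frame.
# Recent msigdbr releases deprecate `category` in favor of `collection`.
hallmark_df <- msigdbr(species = "Homo sapiens", category = "H")
hallmark <- split(hallmark_df$gene_symbol, hallmark_df$gs_name)
# GSVA >= 1.50 takes a parameter object; older releases used
# gsva(expr_log, hallmark, method = "gsva", kcdf = "Gaussian").
gsva_score <- gsva(gsvaParam(expr_log, hallmark, kcdf = "Gaussian"))
write.csv(gsva_score, "03_pathway/hallmark_gsva_scores.csv")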
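To make the idea of a per-sample pathway score concrete, the sketch below computes a combined z-score per gene set in base R. This is a deliberately simpler stand-in, not the GSVA or ssGSEA algorithm; the expression values and the gene sets `PROLIFERATION_TOY` and `OTHER_TOY` are invented for illustration.

```r
# Toy log2 expression matrix (genes x samples); values are invented.
expr_log <- matrix(
  c(5.7, 5.5, 2.6, 2.8,
    6.1, 5.9, 3.0, 3.2,
    4.0, 4.2, 4.1, 3.9,
    2.0, 2.2, 5.0, 5.1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("MKI67", "TOP2A", "ACTB", "GENE_X"),
                  c("Tumor_1", "Tumor_2", "Normal_1", "Normal_2"))
)

# Hypothetical gene sets; real analyses would use MSigDB collections.
gene_sets <- list(
  PROLIFERATION_TOY = c("MKI67", "TOP2A", "NOT_MEASURED"),
  OTHER_TOY         = c("GENE_X", "ACTB")
)

# Standardise each gene across samples (scale() works column-wise, hence t()).
z <- t(scale(t(expr_log)))

# Combined z-score: sum of member-gene z-scores / sqrt(number of genes used).
zscore_set <- function(genes, z) {
  genes <- intersect(genes, rownames(z))  # ignore genes absent from the matrix
  colSums(z[genes, , drop = FALSE]) / sqrt(length(genes))
}

# Result: one row per gene set, one column per sample.
pathway_score <- t(sapply(gene_sets, zscore_set, z = z))
print(round(pathway_score, 2))
```

The output has the same orientation as the GSVA score matrix (gene sets in rows, samples in columns), so downstream group comparisons can be prototyped on it.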
6. Immune infiltration
library(immunedeconv)
# Input is the non-log TPM matrix with gene symbols as row names.
immune_xcell <- deconvolute(as.matrix(tpm), method = "xcell", arrays = FALSE)
estimate_score <- deconvolute(as.matrix(tpm), method = "estimate")
# Each result has a cell_type column followed by one column per sample.
write.csv(immune_xcell, "04_immune/xcell_scores.csv", row.names = FALSE)
write.csv(estimate_score, "04_immune/estimate_scores.csv", row.names = FALSE)
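Once the scores are available, each cell type can be compared between Tumor and Normal. The sketch below uses a toy table in the same layout (a `cell_type` column followed by one column per sample) and a Wilcoxon rank-sum test; the scores, the four-per-group design, and the sample naming convention used to derive `group` are assumptions for illustration.

```r
# Toy table: one row per cell type, one column per sample; scores are invented.
immune <- data.frame(
  cell_type = c("Macrophage", "T cell CD8+", "B cell"),
  Tumor_1  = c(0.30, 0.05, 0.02), Tumor_2  = c(0.28, 0.07, 0.03),
  Tumor_3  = c(0.35, 0.04, 0.02), Tumor_4  = c(0.33, 0.06, 0.04),
  Normal_1 = c(0.10, 0.06, 0.03), Normal_2 = c(0.12, 0.05, 0.02),
  Normal_3 = c(0.09, 0.07, 0.04), Normal_4 = c(0.11, 0.04, 0.03)
)

# Cell types x samples matrix, with the group inferred from the sample names.
scores <- as.matrix(immune[, -1])
rownames(scores) <- immune$cell_type
group <- ifelse(grepl("^Tumor", colnames(scores)), "Tumor", "Normal")

# Wilcoxon rank-sum test per cell type; exact = FALSE uses the normal
# approximation, which avoids warnings when scores are tied.
compare_one <- function(x) {
  test <- wilcox.test(x[group == "Tumor"], x[group == "Normal"], exact = FALSE)
  c(median_tumor  = median(x[group == "Tumor"]),
    median_normal = median(x[group == "Normal"]),
    p_value       = test$p.value)
}
immune_cmp <- as.data.frame(t(apply(scores, 1, compare_one)))

# Benjamini-Hochberg correction across cell types.
immune_cmp$padj <- p.adjust(immune_cmp$p_value, method = "BH")
print(immune_cmp[order(immune_cmp$p_value), ])
```

With only a handful of samples per group the test has little power, so the medians are usually more informative than the p-values.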
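7. Prognosis analysis
library(survival)
library(survminer)
gene <- "MKI67"
# Survival is only defined for tumor samples; normals have NA in the OS fields.
surv_df <- clinical[clinical$group == "Tumor", , drop = FALSE]
surv_df$expr <- as.numeric(tpm[gene, rownames(surv_df)])
# Median split within the tumor cohort.
surv_df$risk_group <- ifelse(surv_df$expr >= median(surv_df$expr, na.rm = TRUE), "High", "Low")
fit <- survfit(Surv(OS_time, OS_status) ~ risk_group, data = surv_df)
ggsurvplot(
  fit,
  data = surv_df,
  pval = TRUE,
  risk.table = TRUE
)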
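A Cox proportional hazards model complements the Kaplan-Meier curve by estimating a hazard ratio. This is a minimal sketch, assuming a tumor-only table with the `OS_time`, `OS_status`, and `expr` columns used above; the `toy` data frame and all of its values are invented for illustration.

```r
library(survival)

# Hypothetical tumor-only clinical table; all values are invented.
toy <- data.frame(
  OS_time   = c(520, 900, 310, 1200, 150, 760, 430, 1010),
  OS_status = c(1, 0, 1, 0, 1, 1, 1, 0),   # 1 = event, 0 = censored
  expr      = c(50, 45, 30, 20, 80, 60, 25, 38)
)

# Model log2 expression as a continuous covariate instead of a median split.
toy$expr_log2 <- log2(toy$expr + 1)
cox_fit <- coxph(Surv(OS_time, OS_status) ~ expr_log2, data = toy)

# Hazard ratio with its 95% confidence interval and Wald p-value.
cox_summary <- summary(cox_fit)
cox_table <- data.frame(
  HR      = cox_summary$conf.int[, "exp(coef)"],
  lower95 = cox_summary$conf.int[, "lower .95"],
  upper95 = cox_summary$conf.int[, "upper .95"],
  p_value = cox_summary$coefficients[, "Pr(>|z|)"]
)
print(cox_table)
```

A continuous covariate avoids the information loss of a median split; clinical covariates such as stage can be added to the right-hand side of the formula when the cohort is large enough.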
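8. Subtyping / clustering
library(pheatmap)
# Top 1,000 most variable genes by MAD; head() avoids NA names when fewer genes exist.
top_var <- head(names(sort(apply(expr_log, 1, mad), decreasing = TRUE)), 1000)
mat <- expr_log[top_var, grepl("Tumor", colnames(expr_log))]
# Drop genes that are constant across tumor samples; row scaling would give NaN.
mat <- mat[apply(mat, 1, sd) > 0, , drop = FALSE]
pheatmap(
  mat,
  scale = "row",
  show_rownames = FALSE,
  filename = "06_subtype/tumor_unsupervised_clustering.pdf"
)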
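The heatmap shows the clustering but does not return subtype labels. One way to obtain them is to cut a hierarchical clustering tree, sketched below in base R on simulated data. The correlation distance, average linkage, and `k = 2` are assumptions chosen for this illustration, not settings taken from the pheatmap call.

```r
set.seed(1)  # reproducible toy data

# Toy log2 expression: 50 genes x 8 tumor samples with two simulated groups.
mat <- matrix(rnorm(50 * 8), nrow = 50,
              dimnames = list(paste0("gene", 1:50), paste0("Tumor_", 1:8)))
mat[1:25, 1:4]  <- mat[1:25, 1:4] + 3   # genes 1-25 high in samples 1-4
mat[26:50, 5:8] <- mat[26:50, 5:8] + 3  # genes 26-50 high in samples 5-8

# Correlation distance between samples, then average-linkage clustering.
sample_dist <- as.dist(1 - cor(mat, method = "pearson"))
hc <- hclust(sample_dist, method = "average")

# Cut the tree into k clusters; k = 2 is an assumption, not a recommendation.
subtype <- cutree(hc, k = 2)
print(table(subtype))

# Sample-to-subtype table that can be merged into the clinical information.
subtype_table <- data.frame(sample_id = names(subtype),
                            subtype   = paste0("C", subtype))
print(subtype_table)
```

The resulting labels can be joined to the clinical table by `sample_id` for downstream survival or immune comparisons between subtypes.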
9. Fusion gene detection
STAR-Fusion example:
STAR-Fusion --genome_lib_dir ref/ctat_genome_lib --left_fq tumor_R1.fq.gz --right_fq tumor_R2.fq.gz --CPU 16 --output_dir 07_fusion/Tumor_1
Arriba example:
arriba -x tumor.Aligned.sortedByCoord.out.bam -g ref/genes.gtf -a ref/genome.fa -o 07_fusion/Tumor_1_fusions.tsv
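Fusion callers usually report many low-support candidates, so a filtering step is typical before interpretation. The sketch below filters a toy table that uses a subset of the columns in Arriba's output (`#gene1`, `gene2`, `split_reads1`, `split_reads2`, `discordant_mates`, `confidence`); the rows are invented, and the thresholds are assumptions rather than recommended defaults.

```r
# Toy rows with a subset of Arriba's output columns; all values are invented.
arriba_txt <- "#gene1\tgene2\tsplit_reads1\tsplit_reads2\tdiscordant_mates\tconfidence
EML4\tALK\t12\t9\t7\thigh
GENE_A\tGENE_B\t1\t0\t1\tlow
GENE_C\tGENE_D\t3\t2\t2\tmedium"

# For a real run, read the file instead, e.g.
# read.delim("07_fusion/Tumor_1_fusions.tsv", check.names = FALSE)
fusions <- read.delim(text = arriba_txt, check.names = FALSE,
                      stringsAsFactors = FALSE)
names(fusions)[1] <- "gene1"  # drop the leading '#' from the header

# Total supporting reads per candidate.
fusions$support <- fusions$split_reads1 + fusions$split_reads2 +
  fusions$discordant_mates

# Assumed filter: medium/high confidence and at least 5 supporting reads.
keep <- fusions$confidence %in% c("medium", "high") & fusions$support >= 5
candidates <- fusions[keep, ]
candidates$fusion <- paste(candidates$gene1, candidates$gene2, sep = "--")

print(candidates[order(-candidates$support),
                 c("fusion", "support", "confidence")])
```

Candidates that pass the filter still need manual review, for example against known driver fusions for the tumor type.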
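10. Example of an integrated interpretation
In the Tumor group, MKI67 is upregulated, the GSVA score of the cell cycle pathway is elevated, and the high-MKI67 expression group has a worse prognosis.
Immune infiltration analysis also shows an elevated macrophage score, suggesting that this tumor subtype may combine high proliferation with an immunosuppressive phenotype.
If fusion detection identifies a driver fusion, interpret it jointly with the DEG and pathway results.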
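11. Deliverables
- DEG table and volcano plot
- Hallmark/KEGG pathway activity matrix
- Immune infiltration score matrix
- Survival curves and Cox results
- Tumor sample subtyping heatmap
- Fusion candidates
- Integrated mechanism interpretation report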