Session 04 · No-Code & Agentic AI for Life Sciences

Omics Data Analysis: RNA-seq

Differential Expression · Pipelines · Network Analysis

Md. Jubayer Hossain · Founder & CEO
DeepBio Limited · DeepBio Academy

No Code & Agentic AI for Life Sciences · Session 04 · June 2026

Pipeline Overview

Bulk RNA-seq No-Code Pipeline

FASTQ (Raw Reads) → QC (Quality Control) → ALIGN (Mapping) → COUNT (Quantification)

🛠 Tools (No-Code)

  • Galaxy: Web-based platform for NGS analysis.
  • FastQC: Visualizing read quality.
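  • STAR / Salmon: STAR aligns reads to the genome; Salmon quantifies reads directly against the transcriptome.
  • featureCounts: Creating the count matrix from aligned reads (Salmon produces its own quantification).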

📦 The Deliverable

A Count Matrix: Rows (Genes) x Columns (Samples). This is the starting point for statistical analysis.
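As a rough illustration of what the deliverable looks like in practice, the sketch below turns a featureCounts-style table into a genes x samples matrix. The gene IDs, sample names, and counts are made up, and `load_count_matrix` is a hypothetical helper, not part of Galaxy or featureCounts.

```python
import io
import pandas as pd

# Toy featureCounts-style output (tab-separated). All values are invented.
raw = (
    "# Program:featureCounts (header comment line, skipped below)\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tctrl_1.bam\tctrl_2.bam\ttumor_1.bam\ttumor_2.bam\n"
    "GENE_A\tchr1\t100\t900\t+\t800\t120\t98\t410\t388\n"
    "GENE_B\tchr1\t2000\t2600\t-\t600\t0\t0\t0\t0\n"
    "GENE_C\tchr2\t50\t1250\t+\t1200\t530\t611\t140\t95\n"
)

def load_count_matrix(handle):
    """Return a genes x samples DataFrame of raw counts."""
    table = pd.read_csv(handle, sep="\t", comment="#", index_col="Geneid")
    # Drop the annotation columns so only per-sample counts remain
    counts = table.drop(columns=["Chr", "Start", "End", "Strand", "Length"])
    # Tidy sample names: "ctrl_1.bam" -> "ctrl_1"
    counts.columns = [c[:-4] if c.endswith(".bam") else c for c in counts.columns]
    return counts

counts = load_count_matrix(io.StringIO(raw))
# Genes with zero reads in every sample carry no information for testing
expressed = counts[counts.sum(axis=1) > 0]
print(counts.shape)
print(expressed)
```

Rows are genes and columns are samples, which is the layout DESeq2 expects as input.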

Statistical Analysis

Count Matrix → DESeq2 Analysis

DESeq2 is one of the most widely used tools for identifying Differentially Expressed Genes (DEGs).

  • Normalization for sequencing depth.
  • Estimation of dispersion (variability).
  • Statistical testing (Wald test).
  • Correction for multiple testing (Benjamini-Hochberg FDR, reported as padj).
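To make the normalization step concrete, here is a minimal sketch of the median-of-ratios idea behind DESeq2's size factors. It is an illustration in NumPy with invented counts, not DESeq2 itself, and `size_factors` is a hypothetical function name.

```python
import numpy as np

def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples count array."""
    counts = np.asarray(counts, dtype=float)
    # Use only genes with nonzero counts in every sample, so the log is defined
    keep = (counts > 0).all(axis=1)
    log_counts = np.log(counts[keep])
    # Per-gene log geometric mean across samples acts as a pseudo-reference sample
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)
    # Size factor = median ratio of each sample to the reference
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))

# Toy data: sample 2 was sequenced exactly twice as deeply as sample 1
counts = np.array([
    [100, 200],
    [50, 100],
    [30, 60],
    [0, 10],
])
sf = size_factors(counts)
normalized = counts / sf
print(sf)
print(normalized)
```

After dividing by the size factors, the first three genes have equal values in both samples, so the depth difference no longer looks like a change in expression.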

📊 The DEG Table

log2FoldChange: Magnitude of change.
pvalue: Statistical significance.
padj: Adjusted p-value (FDR) — use this for filtering!
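The sketch below shows how a Benjamini-Hochberg adjustment turns raw p-values into FDR-adjusted ones. It assumes plain BH on a toy list of p-values; DESeq2 additionally applies independent filtering, which is omitted here, and `benjamini_hochberg` is a hypothetical helper.

```python
import numpy as np

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0, 1)
    return adjusted

pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
padj = benjamini_hochberg(pvals)
print(padj)
```

In this toy case, the genes with raw p-values of 0.039 and 0.041 pass a 0.05 cutoff on pvalue but not on padj, which is why filtering on padj is the safer choice.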

"A gene is usually considered a DEG if |log2FC| > 1 and padj < 0.05."
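The rule of thumb above can be applied to a DEG table in a few lines of pandas. This is a sketch on invented values; the column names follow the DESeq2 output described on this slide, while `call_degs` and the gene names are hypothetical.

```python
import pandas as pd

# Toy DEG table; all numbers are invented for illustration
deg_table = pd.DataFrame(
    {
        "log2FoldChange": [2.4, -1.8, 0.4, 3.1, -0.2],
        "pvalue": [1e-8, 2e-5, 1e-6, 0.06, 0.7],
        "padj": [5e-8, 3.3e-5, 2.5e-6, 0.075, 0.7],
    },
    index=["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"],
)

def call_degs(table, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Keep genes with |log2FC| > lfc_cutoff and padj < padj_cutoff."""
    # Comparisons against NaN are False, so genes with missing padj drop out
    passed = (table["log2FoldChange"].abs() > lfc_cutoff) & (table["padj"] < padj_cutoff)
    degs = table[passed].copy()
    degs["direction"] = ["up" if v > 0 else "down" for v in degs["log2FoldChange"]]
    return degs

degs = call_degs(deg_table)
print(degs)
```

GENE_C is significant but barely changed, and GENE_D is strongly changed but not significant after correction, so neither is called a DEG.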

Visualization

Interpreting the Volcano Plot

[Figure: Volcano plot schematic. X-axis: log2 fold change (downregulated to the left, upregulated to the right). Y-axis: -log10(p-value).]

🎯 Top Right

Highly significant, highly upregulated genes. Your primary candidates.

🎯 Top Left

Highly significant, highly downregulated genes.

💡 Pro Tip

The "most significant" gene (highest point) isn't always the "most changed" gene (furthest right/left).
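A small sketch of the tip above: the volcano plot coordinates are computed from a toy results table, and the highest point and the furthest point turn out to be different genes. The values and gene names are invented, and plotting itself is left out.

```python
import numpy as np
import pandas as pd

# Toy results; all numbers are invented for illustration
results = pd.DataFrame(
    {
        "log2FoldChange": [1.2, 4.5, -3.0, 0.1],
        "pvalue": [1e-30, 1e-4, 1e-10, 0.5],
    },
    index=["GENE_A", "GENE_B", "GENE_C", "GENE_D"],
)

# Y-axis of the volcano plot
results["neg_log10_p"] = -np.log10(results["pvalue"])

most_significant = results["neg_log10_p"].idxmax()        # highest point
most_changed = results["log2FoldChange"].abs().idxmax()   # furthest left or right
print(most_significant, most_changed)
```

Here GENE_A sits highest on the plot while GENE_B sits furthest to the right, so ranking by significance and ranking by fold change give different top candidates.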

Functional Analysis

Enrichr & STRING Analysis

🌈 Enrichr: Pathway Analysis

Input your list of DEGs. Enrichr checks for over-representation in:

  • GO Terms: Biological Process, Molecular Function, Cellular Component.
  • KEGG / Reactome: Curated metabolic and signaling pathways.
  • WikiPathways: Community-curated mechanisms.
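Over-representation can be illustrated with a hypergeometric tail probability: how likely is an overlap at least this large if the DEGs were drawn at random from the background? This is a sketch of the general idea with invented numbers; Enrichr's own scoring may differ in detail, and `enrichment_pvalue` is a hypothetical function.

```python
from math import comb

def enrichment_pvalue(n_background, n_in_pathway, n_degs, n_overlap):
    """P(overlap >= n_overlap) when n_degs genes are drawn at random."""
    total = comb(n_background, n_degs)
    upper = min(n_in_pathway, n_degs)
    tail = sum(
        comb(n_in_pathway, k) * comb(n_background - n_in_pathway, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Assumed toy numbers: 20,000 background genes, a 100-gene pathway,
# 200 DEGs, 10 of which fall in the pathway (about 1 expected by chance)
p_enriched = enrichment_pvalue(20000, 100, 200, 10)
print(p_enriched)
```

An overlap of 10 where about 1 is expected gives a very small p-value, which is what an enriched pathway looks like in the results table.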

🕸 STRING: Network Analysis

Visualizes Protein-Protein Interaction (PPI) networks.

  • Identifies Hub Genes (highly connected nodes).
  • Reveals functional clusters and physical complexes.
  • Supports cross-species interaction mapping.
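Hub genes can be illustrated by counting how many distinct partners each node has in an edge list, such as one exported from a network tool. The edges below are placeholders rather than real STRING data, and both helper functions are hypothetical.

```python
def node_degrees(edges):
    """Count distinct interaction partners for each node."""
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {node: len(partners) for node, partners in neighbours.items()}

def top_hubs(edges, n=3):
    """Return the n most connected nodes as (node, degree) pairs."""
    degrees = node_degrees(edges)
    return sorted(degrees.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

# Placeholder edge list; the last edge duplicates the first in reverse
edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_A", "GENE_E"),
    ("GENE_B", "GENE_C"),
    ("GENE_D", "GENE_E"),
    ("GENE_B", "GENE_A"),
]
hubs = top_hubs(edges, n=1)
print(hubs)
```

Degree is only one simple measure of connectivity; other centrality measures can rank nodes differently.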

Agentic Interpretation

Claude Interprets DEG Results

Agents excel at connecting raw gene lists to biological narratives.

# Prompt for Claude
Input: DEG table + Enrichr top 5 pathways.
Prompt: Act as a molecular oncologist. These DEGs are from a study on Lung Adenocarcinoma vs Healthy Tissue.
  1. Identify 3 key biomarkers.
  2. Propose a biological mechanism.
  3. Suggest 2 follow-up experiments.

🧬 Hub Gene Context

With web search or database tools enabled, Claude can look up your STRING hub genes in UniProt/PubMed to check whether they are known drug targets or prognostic markers.

🚫 Warning

Don't trust Claude's fold-change numbers if you haven't provided them. Always ground the interpretation in your actual DEG table data.
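One way to follow this warning is to build the prompt directly from the DEG table, so every fold change the model sees comes from your data. The sketch below only assembles the prompt text; the table values, pathway names, and `build_prompt` are hypothetical, and sending the prompt to Claude is left out.

```python
import pandas as pd

def build_prompt(deg_table, pathways, contrast, top_n=5):
    """Assemble a prompt that embeds the actual DEG values."""
    top = deg_table.sort_values("padj").head(top_n)
    gene_lines = [
        f"{gene}\tlog2FC={row['log2FoldChange']:+.2f}\tpadj={row['padj']:.2e}"
        for gene, row in top.iterrows()
    ]
    return (
        "Act as a molecular oncologist.\n"
        f"These DEGs are from a study on {contrast}.\n\n"
        "DEG table (use only these values; do not estimate fold changes):\n"
        + "\n".join(gene_lines)
        + "\n\nTop Enrichr pathways:\n"
        + "\n".join(f"- {name}" for name in pathways)
        + "\n\n1. Identify 3 key biomarkers.\n"
        "2. Propose a biological mechanism.\n"
        "3. Suggest 2 follow-up experiments."
    )

# Toy inputs; gene names, values, and pathway names are placeholders
deg_table = pd.DataFrame(
    {"log2FoldChange": [2.4, -1.8], "padj": [5e-8, 3.3e-5]},
    index=["GENE_A", "GENE_B"],
)
prompt = build_prompt(
    deg_table,
    pathways=["PATHWAY_1", "PATHWAY_2"],
    contrast="Lung Adenocarcinoma vs Healthy Tissue",
)
print(prompt)
```

Any number in the model's answer can then be checked against the lines embedded in the prompt.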

Summary

Session 04 Key Takeaways

  • No-code NGS: FASTQ → Galaxy → Count Matrix.
  • DESeq2: Identifies significant DEGs using Padj.
  • Volcano plots: Visualize Significance vs. Change.
  • Enrichr: Links gene lists to biological pathways.
  • STRING: Identifies hub genes in PPI networks.
  • AI Agents: Transform gene tables into scientific hypotheses.

Next Session: Deep Learning Prediction Methods — Biomarker Classification and Model Selection.

Thank You

Get in Touch

Md. Jubayer Hossain

bio.link/hossainlab

© 2026 Md. Jubayer Hossain · No Code & Agentic AI for Life Sciences — Session 04