This is for EuchroGene Members.

PyDESeq2 (EuchroGene PyDESeq2 v4.0)

Full-feature, journal-grade differential gene expression analysis built on PyDESeq2 (Muzellec et al. 2023), the Python re-implementation of DESeq2 (Love et al. 2014). The wrapper accepts a raw integer count matrix and a sample metadata table and runs the complete DESeq2 statistical workflow: size-factor estimation, dispersion shrinkage, GLM fitting, Wald testing, optional log2 fold-change shrinkage, Cook's outlier filtering, and independent filtering. Multi-factor designs and explicit Wald contrasts are exposed directly. Each run produces seven publication-quality figures (each as a 300 DPI PNG and a vector PDF), a full results table, an LFC-shrunken results table, a filtered DEG table, normalized and VST-transformed counts, an interactive HTML report with the journal-ready methods section inlined, and a complete run_manifest.json for reproducibility.

How It Works:

Input Validation: Count matrix and design CSV are validated for format, sample-name agreement, and integer counts, then staged for containerized execution.
Pre-filtering & Model Setup: Genes whose total count across all samples is below -min_count_sum are removed. A negative-binomial GLM is built with the design formula ~ <design_factors>; the reference level of the primary factor is set explicitly when -reference_level is provided and otherwise defaults to the alphabetically first level.
DESeq2 Workflow: Size factors are estimated by median-of-ratios; gene-wise dispersions are obtained by Cox–Reid adjusted profile likelihood and shrunk toward a parametric (or mean) trend; the GLM is fit and the requested Wald contrast is tested. Cook's outlier filtering and independent filtering are applied per the user's flags.
Effect-Size Shrinkage: Optional apeGLM-style log2 fold-change shrinkage produces stabilized effect sizes for ranking and visualization.
Figure & Table Generation: PCA (with 95% confidence ellipses grouped by the primary design factor), volcano plot with top-N gene labels, MA plot, dispersion-estimates plot, sample-distance heatmap, top-N DEG heatmap, and p-value histogram are rendered. Five tabular outputs (full results, shrunken results, filtered DEGs, normalized counts, VST counts) are written.
Report Generation: A self-contained interactive HTML report, a machine-readable run_manifest.json, a deg_stats.json summary, and a journal-ready methods section (inlined inside the HTML report) are produced automatically. A zip archive of the full results folder is also created.
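
To make the pre-filtering and size-factor steps above concrete, here is a minimal standard-library sketch of a row-sum pre-filter followed by median-of-ratios normalization. It is an illustration under stated assumptions, not the wrapper's or PyDESeq2's actual code: the function names prefilter and size_factors and the dict-of-lists layout are hypothetical, and the real implementation works on full count matrices.

```python
import math
from statistics import median


def prefilter(counts, min_count_sum=10):
    """Drop genes whose total count across samples is below min_count_sum."""
    return {gene: row for gene, row in counts.items() if sum(row) >= min_count_sum}


def size_factors(counts):
    """Median-of-ratios size factors, one per sample.

    counts maps gene ID -> list of raw counts, one entry per sample,
    with every list in the same sample order (hypothetical layout).
    """
    # Only genes with a positive count in every sample have a finite
    # log geometric mean, so the rest are skipped for this estimate.
    usable = [row for row in counts.values() if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("no gene has a positive count in every sample")
    n_samples = len(usable[0])

    # For each usable gene: log ratio of each sample's count to the
    # gene's geometric mean across samples.
    log_ratios = []
    for row in usable:
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        log_ratios.append([math.log(c) - log_geo_mean for c in row])

    # A sample's size factor is the median of its ratios over genes.
    return [math.exp(median(r[j] for r in log_ratios)) for j in range(n_samples)]
```

Dividing each sample's raw counts by its size factor gives normalized counts on a common scale. Taking the median over genes keeps a handful of strongly changed genes from dominating the estimate.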

Required Inputs:

Sample design CSV (-exp_design_csv): Sample metadata table where rows are samples and columns are experimental factors (e.g., condition, batch, genotype). The sample IDs must match the column names of the count table.
Count table (-count_table): Raw integer count matrix where rows are genes (first column = gene IDs) and columns are samples. Typically the output of featureCounts, HTSeq-count, STAR --quantMode GeneCounts, salmon → tximport (with estimated counts rounded to integers), or an equivalent tool.
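
Before launching a run, it can help to check the two inputs against the requirements above. The sketch below is a hedged standard-library example, not the wrapper's own validator: check_inputs is a hypothetical name, it assumes the first column of the design CSV holds the sample IDs, and the toy CSV text is made up for illustration.

```python
import csv
import io


def check_inputs(design_handle, counts_handle):
    """Return a list of problems; an empty list means the two files agree."""
    design_rows = [row for row in csv.reader(design_handle) if row]
    count_rows = [row for row in csv.reader(counts_handle) if row]

    # Assumption: the first column of the design CSV holds the sample IDs.
    design_samples = [row[0] for row in design_rows[1:]]
    # The first count-table column holds gene IDs; the rest are samples.
    count_samples = count_rows[0][1:]

    problems = []
    for sample in design_samples:
        if sample not in count_samples:
            problems.append(f"sample {sample!r} is in the design but not the count table")
    for sample in count_samples:
        if sample not in design_samples:
            problems.append(f"sample {sample!r} is in the count table but not the design")

    # Raw counts must be non-negative integers (no decimals, no blanks).
    for row in count_rows[1:]:
        gene, values = row[0], row[1:]
        for sample, value in zip(count_samples, values):
            if not value.isdecimal():
                problems.append(f"gene {gene!r}, sample {sample!r}: {value!r} is not an integer count")
    return problems


# Hypothetical toy inputs, for illustration only.
design_text = "sample,condition\ns1,control\ns2,treated\n"
counts_text = "gene,s1,s2\ngeneA,10,0\ngeneB,3,7\n"
print(check_inputs(io.StringIO(design_text), io.StringIO(counts_text)))
```

With real files, pass open file handles (for example open("design.csv", newline="")) in place of the io.StringIO objects.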

Post-Analysis:

Open DESeq2_Analysis_Report.html for an interactive summary of DEG counts, run parameters, all seven figures inline, and the journal-ready methods section ready to paste into a manuscript.
Use <exp_name>_DEGs.csv for downstream functional enrichment (GO, KEGG, Reactome, GSEA), and <exp_name>_normalized_counts.csv / <exp_name>_vst_counts.csv for single-gene follow-up or co-expression analysis.
The run_manifest.json records the wrapper version, Docker image tag, run timestamp, and every parameter value — paste it into a manuscript supplement for full reproducibility.
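
If different cutoffs are needed after a run, DEGs can be re-called from <exp_name>_results_full.csv without rerunning the pipeline. The following is a rough sketch only: call_degs is a hypothetical helper, the column names log2FoldChange and padj follow the usual DESeq2/PyDESeq2 results layout and are assumed rather than confirmed for this wrapper's CSV, and whether the wrapper compares with >= or > at the boundary is likewise an assumption.

```python
import csv
import io
import math


def call_degs(results_handle, log2fc=1.0, padj=0.05):
    """Return result rows with |log2FoldChange| >= log2fc and padj < padj."""
    degs = []
    for row in csv.DictReader(results_handle):
        try:
            lfc = float(row["log2FoldChange"])  # assumed column name
            q = float(row["padj"])              # assumed column name
        except ValueError:
            # Blank or "NA" cells, e.g. genes dropped by independent filtering.
            continue
        if math.isnan(lfc) or math.isnan(q):
            continue
        if abs(lfc) >= log2fc and q < padj:
            degs.append(row)
    return degs


# Hypothetical toy table, for illustration only.
toy_results = io.StringIO(
    "gene,log2FoldChange,padj\n"
    "geneA,2.1,0.001\n"
    "geneB,-1.7,0.03\n"
    "geneC,0.4,0.0001\n"
    "geneD,3.0,\n"
)
print([row["gene"] for row in call_degs(toy_results)])
```

Genes with a missing adjusted p-value are skipped rather than treated as significant, which matches how such genes are normally excluded from DEG lists.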

Installation

0. Install EG_tools (skip if already installed)

wget https://github.com/euchrogene/EG_tools/raw/refs/heads/main/EG_tools
sudo chmod 755 EG_tools
sudo mv EG_tools /usr/bin

1. Install PyDESeq2

sudo EG_tools install -r https://github.com/euchrogene/PyDESeq2.git -d PyDESeq2 -e PyDESeq2_v.4.0 -m "Full-feature differential gene expression analysis using PyDESeq2"

2. Display installed software

EG_tools

3. Show help contents

PyDESeq2_v.4.0

4. Uninstall

sudo EG_tools uninstall -t PyDESeq2_v.4.0 -i managene7/pydeseq2:v.1.0

Help Contents:

This pipeline is provided by EuchroGene, LLC.
Bug reports: bioinformatics@euchrogene.com

============================================================================
EuchroGene PyDESeq2 Pipeline v4.0
Docker Image: managene7/pydeseq2:v.1.0
============================================================================

DESCRIPTION:
  Full-feature differential gene expression analysis built on PyDESeq2,
  the Python re-implementation of DESeq2. Runs the complete DESeq2
  statistical workflow with multi-factor designs, explicit Wald contrasts,
  and optional log2 fold-change shrinkage. Produces journal-grade figures
  (300 DPI PNG + vector PDF), full results tables, and a self-contained
  HTML report with an inlined journal-ready methods section.

USAGE:
  PyDESeq2_v.4.0 -exp_design_csv <design.csv> -count_table <counts.csv> [OPTIONS]

REQUIRED:
  -exp_design_csv <FILE>   Sample metadata CSV (rows = samples, cols = factors)
  -count_table    <FILE>   Raw integer count matrix (rows = genes, cols = samples)

EXPERIMENT:
  -exp_name       <STR>    Experiment label used in filenames and report
                           (default: Sample)

DESIGN / CONTRAST:
  -design_factors <STR>    Comma-separated factors. The LAST factor is the
                           primary factor of interest, matching the PyDESeq2
                           convention. e.g. "batch,condition"
                           (default: condition)
  -reference_level <STR>   Reference level for the primary factor.
                           e.g. "control"
                           (default: alphabetically first level)
  -contrast       <STR>    Explicit Wald contrast "factor,test_level,ref_level".
                           Overrides -reference_level when given.
                           e.g. "condition,treated,control"

STATISTICAL THRESHOLDS:
  -log2fc         <FLOAT>  |log2FC| cutoff for DEG calling     (default: 1.0)
  -padj           <FLOAT>  Adjusted p-value (FDR) cutoff       (default: 0.05)
  -alpha          <FLOAT>  Target FDR for independent filter   (default: 0.05)

MODEL / FILTERING:
  -fit_type       <STR>    Dispersion fit: parametric | mean   (default: parametric)
  -min_replicates <INT>    Cook's filter min replicates        (default: 7)
  -cooks_filter   <BOOL>   Apply Cook's outlier filter         (default: true)
  -independent_filter <BOOL>  Apply independent filtering       (default: true)
  -shrinkage      <BOOL>   Apply LFC shrinkage to results      (default: true)
  -min_count_sum  <INT>    Pre-filter genes with row-sum < N   (default: 10)

REPORTING:
  -top_n_genes    <INT>    Top-N DEGs for heatmap              (default: 30)
  -label_top_n    <INT>    Top-N gene labels on volcano plot   (default: 15)

MISC:
  -threads        <INT>    Threads passed to PyDESeq2          (default: 4)
  -random_seed    <INT>    Random seed                         (default: 42)

EXAMPLES:

  # Minimal two-condition contrast
  PyDESeq2_v.4.0 -exp_design_csv design.csv -count_table counts.csv \
                 -exp_name stress_response

  # Multi-factor design with explicit contrast and tighter thresholds
  PyDESeq2_v.4.0 -exp_design_csv design.csv -count_table counts.csv \
                 -exp_name treated_vs_ctrl \
                 -design_factors "batch,condition" \
                 -reference_level "control" \
                 -contrast "condition,treated,control" \
                 -log2fc 1.5 -padj 0.01

  # Disable shrinkage and use the mean-dispersion trend
  PyDESeq2_v.4.0 -exp_design_csv design.csv -count_table counts.csv \
                 -exp_name pilot_run \
                 -shrinkage false -fit_type mean

OUTPUT FILES:

  <exp_name>_DEG_results/
  ├── <exp_name>_results_full.csv             Complete DESeq2 results (all genes)
  ├── <exp_name>_results_shrunk.csv           LFC-shrunken results (all genes)
  ├── <exp_name>_DEGs.csv                     Filtered DEGs (|log2FC|, padj cutoffs)
  ├── <exp_name>_normalized_counts.csv        Median-of-ratios normalized counts
  ├── <exp_name>_vst_counts.csv               Variance-stabilizing-transformed counts
  ├── <exp_name>_PCA_plot.png|pdf             PCA — clusters with 95% confidence ellipses
  ├── <exp_name>_volcano_plot.png|pdf         Volcano with top-N gene labels
  ├── <exp_name>_MA_plot.png|pdf              MA plot (mean expression vs. log2FC)
  ├── <exp_name>_dispersion_plot.png|pdf      Dispersion estimates and fitted trend
  ├── <exp_name>_sample_distance_heatmap.png|pdf   Sample-to-sample distance heatmap
  ├── <exp_name>_top<N>_DEGs_heatmap.png|pdf  Top-N DEGs z-scored expression heatmap
  ├── <exp_name>_pvalue_histogram.png|pdf     P-value distribution diagnostic
  ├── DESeq2_Analysis_Report.html             Self-contained HTML report (methods inlined)
  ├── Results_Summary.txt                     Human-readable summary
  ├── run_manifest.json                       Full parameter record (reproducibility)
  └── deg_stats.json                          Programmatic DEG counts

  A zip archive (<exp_name>_DEG_results.zip) is also created next to the folder.

SUPPORT:
  Bugs / Questions: bioinformatics@euchrogene.com

============================================================================
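
The -padj cutoff in the help text applies to FDR-adjusted p-values, which DESeq2-style workflows compute with the Benjamini-Hochberg procedure (see Citation). As a minimal sketch of that adjustment, assuming a plain list of p-values with no missing entries (bh_adjust is a hypothetical name, not part of the pipeline):

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling by m / rank and
    # enforcing monotonicity with a running minimum.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values, for illustration only.
print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

When independent filtering is enabled, only genes that pass the filter enter the adjustment, which this sketch does not model.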

Uninstall old version

sudo EG_tools uninstall -t PyDESeq2 -i managene7/rna-seq_to_tpm_deseq2:v.1.0

Uninstall v.1.0

sudo EG_tools uninstall -t PyDESeq2_v.1.0 -i managene7/rna-seq_to_tpm_deseq2:v.1.1

Uninstall v.4.0

sudo EG_tools uninstall -t PyDESeq2_v.4.0 -i managene7/pydeseq2:v.1.0

Citation

If you use this pipeline in published research, please cite:

Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology 15: 550. https://doi.org/10.1186/s13059-014-0550-8

Muzellec B, Telenczuk M, Cabeli V, Andreux M (2023) PyDESeq2: a Python package for bulk RNA-seq differential expression analysis. Bioinformatics 39(9): btad547. https://doi.org/10.1093/bioinformatics/btad547

Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society B 57(1): 289–300.

EuchroGene PyDESeq2 v4.0 (2026). EuchroGene, LLC. bioinformatics@euchrogene.com

The journal-ready methods section is inlined inside DESeq2_Analysis_Report.html and is parameterized on the actual run settings (design formula, contrast, shrinkage state, alpha, filtering flags), so the text reflects exactly what was run — copy directly from the rendered report into your manuscript.
