A scheduler based on the Temporal.io framework has been developed to optimize bioinformatics workflows. Through an easy-to-use graphical user interface, users can transparently map workflow steps to diverse execution environments, including high-performance computing (HPC) resources managed by the SLURM resource manager. Asynchronous execution of workflows is supported to optimize resource utilization even when the scheduler cannot make use of a system’s full RAM and CPU resources. Pipelines are executed using a combination of UW compute resources and the NSF Bridges-2 supercomputer.
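The idea of mapping steps to environments and running them asynchronously can be illustrated with a minimal sketch. It uses Python's standard `asyncio` library as a stand-in, not the Temporal.io SDK or the scheduler's actual code. The step names, the `STEP_TO_ENV` mapping, and the per-environment slot counts are all hypothetical, and the steps are treated as independent (a real workflow would also track dependencies between steps).

```python
import asyncio

# Hypothetical mapping of workflow steps to execution environments.
STEP_TO_ENV = {
    "align": "slurm",
    "quantify": "local",
    "deseq2": "local",
}

# Hypothetical number of steps each environment may run at once.
ENV_SLOTS = {"slurm": 2, "local": 1}


async def run_step(step, env, semaphores, log):
    """Wait for a free slot in the target environment, then 'run' the step."""
    async with semaphores[env]:
        log.append((step, env, "start"))
        # Placeholder for real work, e.g. submitting a job to SLURM.
        await asyncio.sleep(0.01)
        log.append((step, env, "done"))
    return step, env


async def run_workflow(steps, step_to_env=STEP_TO_ENV, env_slots=ENV_SLOTS):
    """Launch all steps concurrently; each environment limits its own load."""
    semaphores = {env: asyncio.Semaphore(n) for env, n in env_slots.items()}
    log = []
    results = await asyncio.gather(
        *(run_step(step, step_to_env[step], semaphores, log) for step in steps)
    )
    return results, log


results, log = asyncio.run(run_workflow(["align", "quantify", "deseq2"]))
for event in log:
    print(event)
```

In this sketch the step sent to the "slurm" environment proceeds while the two "local" steps take turns on the single local slot, so a busy environment does not block work assigned elsewhere.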
An R package for analyzing JAX lab bulk RNA-seq perturbation experiments, supporting differential expression analysis and downstream functional enrichment.
An interactive Shiny app for exploring bulk RNA-seq differential expression results across the MorPhiC project. Users can browse all perturbation experiments and filter assays (each defined as a DESeq2 analysis) by model system, perturbation strategy, or experimental condition. For every assay, the app reports DEG counts alongside the functional annotations and disease/phenotype associations of both the perturbed gene and its significantly differentially expressed genes. Cross-assay comparison is supported through a DEG overlap matrix and UpSet plots, allowing users to identify shared and assay-specific transcriptional responses across two or more perturbations.
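The cross-assay comparison described above can be sketched in a few lines. The example below is an illustration in Python rather than the app's own R/Shiny code; the assay names and gene identifiers are toy placeholders, and both function names are hypothetical. It computes a pairwise DEG overlap matrix and the exclusive intersection counts that an UpSet plot displays.

```python
def deg_overlap_matrix(deg_sets):
    """Return pairwise counts of shared DEGs between assays.

    deg_sets maps an assay name to the set of its significant DEGs.
    The diagonal holds each assay's own DEG count.
    """
    names = list(deg_sets)
    return {
        a: {b: len(deg_sets[a] & deg_sets[b]) for b in names}
        for a in names
    }


def exclusive_intersections(deg_sets):
    """Count genes by the exact combination of assays in which they are DEGs.

    This is the quantity shown as bars in an UpSet plot.
    """
    membership = {}
    for name, genes in deg_sets.items():
        for gene in genes:
            membership.setdefault(gene, set()).add(name)
    counts = {}
    for assays in membership.values():
        key = tuple(sorted(assays))
        counts[key] = counts.get(key, 0) + 1
    return counts


# Toy input: placeholder assay names and gene identifiers.
deg_sets = {
    "assay_1": {"g1", "g2", "g3"},
    "assay_2": {"g2", "g3", "g4"},
    "assay_3": {"g3", "g5"},
}

matrix = deg_overlap_matrix(deg_sets)
upset_counts = exclusive_intersections(deg_sets)
print(matrix["assay_1"]["assay_2"])  # DEGs shared by assay_1 and assay_2
print(upset_counts)
```

The overlap matrix answers "how many DEGs do these two assays share?", while the exclusive counts separate genes shared by all assays from those specific to one or a subset.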
The MorPhiC Data Integrator facilitates uniform data processing and integration with datasets generated by the NIH-funded MorPhiC program. In the dataset identification step, users can select published bulk RNA sequencing (RNA-seq) datasets generated by the MorPhiC program, specify external datasets from the Gene Expression Omnibus (GEO) repository, or upload raw RNA-seq data. Subsequently, users can select a subset of samples and indicate the experimental design by assigning samples to groups. External data are then processed using the MorPhiC-approved bulk RNA-seq analytical pipeline, and differential expression is inferred. Users also have the option to download the uniformly processed counts table.
Perturb-cNMF is a scalable framework for identifying gene programs from single-cell expression data. It uses consensus non-negative matrix factorization to break down complex cell-by-gene matrices into interpretable gene programs that capture underlying biological processes and their responses to perturbations. The pipeline supports both CPU and GPU execution and includes built-in methods to assess program quality, robustness, and condition-specific effects. It also provides tools for visualization, program selection, and automated annotation using pathway enrichment and language model–based approaches. In MorPhiC, Perturb-cNMF is used to define gene programs within scRNA-seq and scPerturb-seq data and to link perturbations to those programs, helping map the underlying regulatory pathways and identify shared program structure.
scE2G is a computational pipeline for predicting enhancer–gene regulatory links from single-cell ATAC-seq or multiome data. It integrates chromatin accessibility and, when available, gene expression to score likely regulatory interactions across the genome. The method accounts for genomic distance when linking enhancers to genes and estimates these relationships within individual cell clusters. The result is a genome-wide set of enhancer-gene link scores that can be used for downstream analysis, visualization, and interpretation of gene regulation at single-cell resolution. In MorPhiC, scE2G is applied to various chromatin datasets to help identify which perturbation effects are most likely to be biologically meaningful.
STAR Suite updates the original STAR aligner by integrating four modules — STAR-perturb, STAR-Flex, STAR-SLAM, and TranscriptVB — to provide complete internal C/C++ pipelines for bulk RNA-seq, scRNA-seq, Perturb-seq, 10x Flex, and SLAM-seq. The integration results in substantial speedups and a simplified toolchain that can be installed through pre-compiled binaries for researchers and agents. No new external dependencies are required; the suite is built entirely with the existing STAR toolchain and vendored components. This is a drop-in replacement for the STAR aligner.
Biodepot Launcher is a desktop application that facilitates installation, management, and deployment of bioinformatics workflows using the Biodepot-workflow-builder (Bwb). With this app, Bwb can be started by double-clicking an icon, eliminating the need to type cryptic startup commands into the terminal. This creates an end-to-end, easy-to-use graphical interface to manage and launch containerized workflows on the local computer or on cloud instances. Biodepot Launcher is written in React and JavaScript, and uses the Neutralinojs framework and web browser routines to allow the application to run on Linux, Windows, and Mac desktop environments.
A scalable framework for inferring causal gene regulatory networks and predicting cellular responses to unseen perturbations from single-cell CRISPR screens.
ChromBPNet is a deep learning framework for modeling chromatin accessibility at base resolution from ATAC-seq or DNase-seq data. After correcting for enzyme-specific bias, it learns genome-wide accessibility patterns and isolates sequence-driven regulatory signals, allowing it to predict how individual DNA bases influence chromatin accessibility. This can be used to predict transcription factor binding events, the effects of regulatory variants, and many other sequence-driven regulatory features. In MorPhiC, ChromBPNet is applied to chromatin accessibility datasets to predict how perturbations may affect transcription factor binding profiles and to support downstream interpretation of gene regulatory pathways.
Dynode is an integrative AI framework for learning continuous single-cell developmental dynamics, reconstructing vector fields from transcriptomic and spatial data, and enabling predictive in silico perturbation modeling for systems such as embryogenesis, cardiogenesis, and disease progression. Related methods: Dynamo (https://github.com/aristoteleo/dynamo-release) for RNA velocity vector field learning and prediction, and Spateo (https://github.com/aristoteleo/spateo-release) for 3D spatial transcriptomics reconstruction and spatiotemporal modeling.
PantheonOS is a multi-agent AI framework for genomics that uses large language model–driven agents to help automate and guide biological analysis. It connects agents to domain-specific workflows, enabling tasks like scRNA-seq analysis, gene program selection, and regulatory modeling. By coordinating multiple agents that can iteratively refine analyses, it helps improve the quality and consistency of analysis results. Overall, PantheonOS is designed to make computational genomics research more automated, adaptable, and scalable using adaptive AI systems.
