TECSAS: Transformer of Epigenetics to Chromatin Structural AnnotationS

TECSAS (Transformer of Epigenetics to Chromatin Structural AnnotationS) is a deep learning model based on the Transformer architecture designed to predict chromatin subcompartment annotations directly from epigenomic data. TECSAS leverages information from histone modifications, transcription factor binding profiles, and RNA-seq data to decode the relationship between the biochemical composition of chromatin and its 3D structural behavior.

Chromatin within the nucleus adopts complex three-dimensional structures that are crucial for gene regulation and cellular function. Recent studies have revealed the presence of distinct chromatin subcompartments beyond the traditional A/B compartments (eu- and hetero-chromatin), each exhibiting unique structural and functional properties. TECSAS achieves high accuracy in predicting subcompartment annotations and reveals the influence of long-range epigenomic context on chromatin organization.

TECSAS Overview

The framework enables:

  • Chromatin subcompartment prediction: Classification of genomic regions into subcompartments (A1, A2, B1, B2, B3) at 25-50kb resolution
  • Nuclear body association prediction: Identification of lamina-associated domains (LADs), nucleolus-associated domains (NADs), and nuclear speckle-associated domains (SPADs)
  • Transfer learning: An encoder pre-trained on a reference cell line (e.g., GM12878) can be fine-tuned for target cell lines

TECSAS processes epigenomic signal tracks at a specified genomic resolution (50 kb bins by default), normalizes the signals by z-score standardization, and uses a sliding-window context (±14 neighboring bins by default) to capture spatial dependencies. Unlike methods that rely on Hi-C contact maps, TECSAS predicts 3D genome organization directly from the epigenome, which enables analysis across diverse cell types without proximity ligation experiments.
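The preprocessing described above can be illustrated with a minimal NumPy sketch. It is not the TECSAS implementation: the function names `zscore_tracks` and `sliding_windows`, the random input data, and the zero-padding at chromosome ends are all assumptions made for illustration. Only the z-score standardization and the ±14-bin window come from the description above.

```python
import numpy as np

def zscore_tracks(signals):
    """Standardize each track (row) to zero mean and unit variance across bins."""
    mean = signals.mean(axis=1, keepdims=True)
    std = signals.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # constant tracks become all zeros instead of dividing by 0
    return (signals - mean) / std

def sliding_windows(signals, half_width=14):
    """Return an array of shape (n_bins, n_tracks, 2 * half_width + 1)."""
    n_tracks, n_bins = signals.shape
    # Assumption: pad both chromosome ends with zeros (the track mean after z-scoring)
    padded = np.pad(signals, ((0, 0), (half_width, half_width)), mode="constant")
    width = 2 * half_width + 1
    # Window i is centered on bin i of the unpadded signal
    return np.stack([padded[:, i:i + width] for i in range(n_bins)])

# Hypothetical input: 3 signal tracks already binned into 100 bins
rng = np.random.default_rng(0)
raw = rng.gamma(shape=2.0, scale=1.0, size=(3, 100))

z = zscore_tracks(raw)
windows = sliding_windows(z, half_width=14)
print(windows.shape)  # (100, 3, 29)
```

Each window holds the central bin plus 14 neighbors on either side, so a model that reads one window sees 29 bins of context per track.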


Usage & Resources

For complete examples, see the Tutorials directory.

Resources:

  • Tutorials: Step-by-step notebooks in the Tutorials/ directory
  • Reference data: Subcompartment annotations and nuclear body association labels (LADs, NADs, SPADs) in TECSAS/share/

Installation

TECSAS requires Python 3.6+ and the following dependencies:

  • PyTorch (>=1.7.0)
  • NumPy (>=1.18)
  • pyBigWig
  • requests
  • joblib
  • tqdm
  • urllib3

Install from source:

git clone https://github.com/ed29rice/TECSAS.git
cd TECSAS
pip install -e .

Install dependencies:

pip install torch numpy pyBigWig requests joblib tqdm urllib3

Note: For GPU acceleration, install a CUDA-enabled PyTorch build that matches your CUDA driver version.
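One way to confirm the environment after installation is the short check below. It is a sketch and not part of TECSAS: the helper `check_dependencies` is hypothetical, and the module names are assumed to be the standard import names of the packages listed above. It uses only the standard library to look up each module, and queries PyTorch for CUDA visibility only if PyTorch is present.

```python
import importlib.util

# Assumed import names for the dependencies listed above
REQUIRED = ["torch", "numpy", "pyBigWig", "requests", "joblib", "tqdm", "urllib3"]

def check_dependencies(names=REQUIRED):
    """Map each module name to True if it can be located, without importing it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

status = check_dependencies()
for name, found in status.items():
    print(f"{name:10s} {'ok' if found else 'MISSING'}")

# Only query CUDA when PyTorch itself is installed
if status["torch"]:
    import torch
    print("CUDA available:", torch.cuda.is_available())
```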


Citation

If you use TECSAS in your research, please cite:

Dodero-Rojas, E., Mendieta, A., Fehlis, Y., Mayala, N., Contessoto, V. G., & Onuchic, J. N. (2025). Epigenetics is all you need: A transformer to decode chromatin structural compartments from the epigenome. PLOS Computational Biology, 21(12), e1012326.
https://doi.org/10.1371/journal.pcbi.1012326

For questions, issues, or collaborations, please open an issue on GitHub or contact the developers.