Researchers at Helmholtz Munich and the Technical University of Munich (TUM) have developed Nicheformer, the first large-scale foundation model that integrates single-cell analysis with spatial transcriptomics. Trained on more than 110 million cells, it offers a new way to study how cells are organized and interact in tissues – knowledge that is crucial for understanding health and disease.
 
Missing context in single-cell data
Single-cell RNA sequencing has transformed biology by showing which genes are active in individual cells. However, this approach requires cells to be removed from their natural environment, erasing information about their position and neighbors. Spatial transcriptomics preserves this context but is technically more limited and harder to scale. Researchers have long lacked a way to study cell identity and tissue organization together.
AI model reveals hidden tissue structures
Nicheformer overcomes this barrier by learning from both dissociated and spatial data. It can “transfer” spatial context back onto cells that were previously studied in isolation – essentially reconstructing how they fit into the bigger picture of a tissue. To make this possible, the research team created SpatialCorpus-110M, one of the largest curated resources of single-cell and spatial data to date. In their study published in Nature Methods, the model consistently outperformed existing approaches and showed that spatial patterns leave measurable traces in gene expression, even when cells are dissociated. Beyond performance, the researchers also explored interpretability, revealing that the model identifies biologically meaningful patterns in its internal layers – offering a new window into how AI learns from biology.
“With Nicheformer we can now transfer spatial information onto dissociated single-cell data at scale,” says Alejandro Tejada-Lapuerta, PhD student at Helmholtz Munich and TUM and co-first author of the study together with Anna Schaar. “This opens up many possibilities to study tissue organization and cellular neighborhoods without additional experiments.”
The study connects to the emerging idea of a “Virtual Cell”, a computational representation of how cells behave and interact within their native environments. While this concept is gaining momentum across biology and AI, previous models have largely treated cells as isolated entities, without reasoning about their spatial relationships. Nicheformer is the first foundation model to learn directly from spatial organization, offering a way to reconstruct how cells sense and influence their neighbors. Beyond introducing this new capability, the researchers also present an entire suite of spatial benchmarking tasks that challenge future models to capture tissue architecture and collective cellular behavior – an essential step toward biologically realistic AI systems.
Next steps
“With Nicheformer we are taking the first steps toward building general-purpose AI models that represent cells in their natural context – the foundation of a Virtual Cell and Tissue model. Such models will transform how we study health and disease and could ultimately guide the development of new therapies.”
Prof. Fabian Theis, Director of the Computational Health Center at Helmholtz Munich and Professor at TUM
In their next project, the team aims to develop a “tissue foundation model” that also learns the physical relationships between cells. Such a model could help analyze tumor microenvironments and other complex structures in the body with direct relevance for diseases such as cancer, diabetes, and chronic inflammation.
 
Journal reference:
Tejada-Lapuerta, A., et al. (2025). Nicheformer: a foundation model for single-cell and spatial omics. Nature Methods. doi.org/10.1038/s41592-025-02814-z