MIT Researchers Unveil KATMAP to Predict Gene Splicing Dynamics

Researchers at the Massachusetts Institute of Technology (MIT) have introduced KATMAP, a framework for understanding and predicting gene splicing. Described in a paper published on November 4, 2025, in the journal Nature Biotechnology, the model could improve our understanding of how splicing factors regulate gene expression across cell types and species.

Gene splicing is a complex process in which molecular machinery cuts segments out of a gene's RNA transcript and joins the remaining segments, allowing a single gene to give rise to different proteins. This mechanism helps cells that carry the same DNA, such as heart and skin cells, perform vastly different functions. Splicing factors regulate the process by determining which segments are used, which ultimately affects protein production.

KATMAP, which stands for Knockdown Activity and Target Models from Additive regression Predictions, builds on data from experiments that disrupt the expression of splicing factors. By analyzing the effects of these disruptions, the model predicts the likely targets of each splicing factor, offering insight into the regulatory activity that governs gene expression.

Michael P. McGurk, a postdoctoral researcher in the lab of Professor Christopher Burge, emphasized the model’s ability to provide a more accurate depiction of splicing regulation than previous methods, which could only offer an average overview. KATMAP leverages RNA sequencing data from perturbation experiments, where the expression levels of splicing factors are either increased or decreased.

This approach allows researchers to observe changes in splicing patterns, helping to identify the specific targets of splicing factors. By incorporating information about binding sites—specific sequences where splicing factors are likely to interact—KATMAP distinguishes between direct targets and indirect effects.

In McGurk’s words, “In our analyses, we identify predicted targets as exons that have binding sites for this particular factor in the regions where this model thinks they need to be to impact regulation.” This capability is particularly beneficial for less-studied splicing factors, providing a clearer understanding of their regulatory roles.
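The distinction between direct targets and indirect effects can be illustrated with a toy sketch. This is not KATMAP itself, which is a regression model whose details are not given here; it is a simplified rule-based illustration, and the `Exon` class, its fields, the `classify` function, the 0.1 threshold, and the example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Exon:
    name: str
    # Hypothetical measure: change in exon inclusion after the factor is perturbed.
    splicing_change: float
    # Hypothetical flag: a binding site for the factor lies in a region
    # where binding could plausibly affect regulation.
    has_binding_site: bool


def classify(exon: Exon, min_change: float = 0.1) -> str:
    """Label an exon using a simple rule (illustrative threshold, not from the paper)."""
    if abs(exon.splicing_change) < min_change:
        return "unchanged"
    if exon.has_binding_site:
        return "candidate direct target"
    return "possible indirect effect"


# Made-up example data for illustration only.
exons = [
    Exon("exon_A", 0.25, True),
    Exon("exon_B", -0.30, False),
    Exon("exon_C", 0.02, True),
]

labels = {exon.name: classify(exon) for exon in exons}
for name, label in labels.items():
    print(f"{name}: {label}")
```

In this sketch, an exon whose splicing changes and which carries a binding site is flagged as a candidate direct target, while one that changes without a binding site is treated as a possible indirect effect. A real model would weigh the evidence statistically rather than apply hard cutoffs.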

While many predictive models operate as “black boxes,” KATMAP stands out for its interpretability. Researchers can generate hypotheses and interpret splicing patterns related to regulatory factors, gaining insights into the rationale behind predictions. McGurk noted, “I don’t just want to predict things; I want to explain and understand.”

The researchers acknowledged that KATMAP relies on simplifying assumptions. The model currently considers one splicing factor at a time, although in reality these factors often work together. In addition, target RNA sequences may be structured in a way that limits access to predicted binding sites.

Looking ahead, the Burge lab is collaborating with the Dana-Farber Cancer Institute to explore how splicing factors are altered in disease contexts. There are also plans to extend KATMAP to account for cooperative regulation among splicing factors. McGurk said, “We’re still in a very exploratory phase, but I would like to apply these models to understand splicing regulation in disease or development.”

Professor Burge, who serves as the Uncas (1923) and Helen Whitaker Professor, underscored the potential of KATMAP, stating, “We now have a tool that can learn the pattern of activity of a splicing factor from types of data that can be readily generated for any factor of interest.” As more models are developed, researchers hope to better understand how splicing factors might contribute to various pathologies by analyzing transcriptomic data.

KATMAP gives scientists a new way to explore the regulatory mechanisms that govern gene splicing. As the research develops, it may also inform therapeutic approaches to diseases linked to splicing mutations.