Computational biologists from the National University of Singapore (NUS) have uncovered how RNA splicing — a crucial process for isoform expression and protein diversity — is regulated across different cell types in the peripheral blood. This important discovery helps explain how individuals’ genetic differences contribute to their predisposition to complex diseases such as systemic lupus erythematosus (SLE) and Graves’ disease (GD).
This project was conducted as part of the Asian Immune Diversity Atlas (AIDA) consortium, which uses population-scale single-cell gene expression profiling of over one million immune cells (PBMCs) from over 600 Asian donors in five countries to understand how genes and environment make us different from each other and influence our health. The study was a research collaboration with A*STAR Genome Institute of Singapore (GIS), Samsung Genome Institute, RIKEN Center for Integrative Medical Sciences, and Nanyang Technological University. This study was published as a cover article in the journal Nature Genetics on 03 December 2024.
Alternative splicing (AS) is a fundamental regulatory mechanism in messenger RNA (mRNA) processing, and abnormal splicing is a major cause of genetic disorders. To understand the genetic regulation of splicing, previous efforts such as the Genotype-Tissue Expression (GTEx) project have primarily focused on tissue-level measurements, and these efforts have shown that different tissues have distinct patterns of splicing regulation. However, this raised an intriguing question: does disease-relevant genetic regulation of splicing occur only in one or a few cell types? The main bottleneck in answering this question has been the lack of a large, population-scale, cell-type-resolved dataset suitable for splicing analysis, together with the corresponding analytical pipelines.
Furthermore, Asian populations have been notably underrepresented in large-scale genetic studies. For instance, Asians account for only 1.3 percent of the GTEx dataset, while individuals of European descent make up 84.6 percent. A recent study (Kachuri, L., et al. (2023). Nat Genet.) showed that ancestry could be a main factor that affects the findings of genetic regulation, mostly due to differences in allele frequencies. This highlights an urgent need for genetic studies that better represent diverse ancestries.
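One way to see why allele frequency differences matter: under Hardy-Weinberg equilibrium, the variance of genotype dosage at a biallelic variant is 2p(1 - p), where p is the allele frequency. A variant that is rare in a cohort contributes little genotype variation, so its regulatory effect is hard to detect there. The short Python sketch below illustrates this with two hypothetical allele frequencies. The function name and the numbers are assumptions chosen for illustration and are not taken from the study.

```python
def dosage_variance(allele_freq):
    """Variance of genotype dosage (0, 1 or 2 copies of an allele)
    under Hardy-Weinberg equilibrium: 2 * p * (1 - p)."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    return 2 * allele_freq * (1 - allele_freq)


# Hypothetical frequencies of the same variant in two populations.
common = dosage_variance(0.30)  # variant is common in population A
rare = dosage_variance(0.01)    # variant is rare in population B

# How much more genotype variation population A offers for this variant.
ratio = common / rare

print(f"variance when common: {common:.4f}")
print(f"variance when rare:   {rare:.4f}")
print(f"ratio:                {ratio:.1f}")
```

With equal sample sizes and the same true effect, an association test would therefore have far less information to work with in the population where the variant is rare.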
To address these research gaps, a research team led by Assistant Professor Liu Boxiang, with lead authors Tian Chi, Zhang Yuntian, and Tong Yihan, from the Department of Pharmacy and Pharmaceutical Sciences at the NUS Faculty of Science utilised the AIDA single-cell RNA-seq dataset to analyse cell-type-specific splicing. This work represents the first comprehensive analysis of splicing regulation in a population-scale and genetics-coupled single-cell dataset. Asst Prof Liu holds a joint appointment with the Department of Biomedical Informatics and Precision Medicine Translational Research Programme at the NUS Yong Loo Lin School of Medicine and is an Adjunct Principal Scientist at GIS.
The Asian Immune Diversity Atlas (AIDA) single-cell RNA-seq dataset
The AIDA Data Freeze v.1 includes up to 21 immune cell subtypes for context-dependent alternative splicing and splicing quantitative trait loci (sQTL) analysis. The blood samples in this dataset were collected from a cohort of 503 healthy donors of diverse Asian ancestries, spanning East, Southeast, and South Asian populations. This diversity makes it possible to observe Asian-specific genetic regulation of splicing. For example, an sQTL of the TCHP gene was identified that may influence the risk of Graves’ disease in East Asian populations. Owing to the high average sequencing depth and the “exon painting” effect (incomplete reverse transcription along with stochastic mRNA cleavage and recapping that creates multiple 5′ ends) captured by 5′ library preparation, the AIDA scRNA-seq data preserved a substantial portion of mRNA sequences, making it particularly well-suited for splicing analysis. The full description of the AIDA dataset can be accessed from Kock, K.H. et al. (2024) bioRxiv: https://www.biorxiv.org/content/10.1101/2024.06.30.601119v1.
Cell-type-specificity in splicing regulation
This study uncovered widespread context-dependent splicing events that were often specific to a particular cell type. Notably, an ancestry-biased mRNA isoform of SPSB2, likely driven by cross-population allele frequency differences in rs11064437, was found to be absent from the canonical gene annotation. This highlighted the lack of ancestral diversity in a widely used annotation database.
Not only is splicing cell-type-specific, but its genetic regulation is also cell-type-specific. An sQTL is a genetic variant that influences how an RNA transcript is spliced. This study revealed 11,577 independent cis-sQTLs and 607 trans-sQTLs across 19 PBMC subtypes, and many of these were cell-type-specific and disease-associated.
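To make the idea of an sQTL concrete, the following minimal Python sketch regresses a splicing ratio on genotype dosage for a single variant and a single splicing event within one cell type. It is an illustration only and not the study's pipeline: the toy read counts, the use of percent spliced in (PSI) as the splicing measure, and the function names are all assumptions. Real analyses typically also adjust for covariates and correct for multiple testing.

```python
from statistics import mean


def psi(inclusion_reads, exclusion_reads):
    """Percent spliced in: fraction of informative reads supporting inclusion."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no informative reads for this splicing event")
    return inclusion_reads / total


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("genotype shows no variation across donors")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical toy data: one entry per donor, for one cell type.
# Dosage = number of copies of the alternative allele (0, 1 or 2).
dosages = [0, 0, 1, 1, 2, 2]
# (inclusion reads, exclusion reads) for one exon in each donor.
read_counts = [(90, 10), (80, 20), (60, 40), (50, 50), (30, 70), (20, 80)]

psi_values = [psi(inc, exc) for inc, exc in read_counts]

# Estimated change in PSI per additional copy of the alternative allele.
effect = ols_slope(dosages, psi_values)
print(f"estimated sQTL effect on PSI per allele: {effect:.2f}")
```

In this toy example each extra copy of the alternative allele lowers exon inclusion. Running the same test separately within each cell type is what would allow an effect to appear in one cell type but not in another.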
Implication in diseases and experimental validation
These findings provided a unique resource for identifying genetic variants and molecular mechanisms underlying complex traits and diseases. The researchers demonstrated that diseases could be linked to splicing by showing the significant contributions of cis-sQTL effects to autoimmune and inflammatory diseases. They also identified 563 putative risk genes. For example, an Asian-specific sQTL was found to disrupt the 5′ splice site of TCHP exon four, putatively modulating the risk of Graves’ disease in East Asian populations. The sQTL effect was validated using a minigene experiment in K562 cells.
Asst Prof Liu said, “Our study established a roadmap for population-scale single-cell splicing regulation analysis and provided insights into the development of splice-modifying therapeutics.” This cell-type-specific sQTL map is a milestone in human genetics and drug target discovery for complex diseases related to splicing. Meanwhile, the examples provided in the analysis strongly suggest the importance of ancestral diversity in human genetics research.
To take the research further, the team plans to leverage single-cell technology to investigate more tissues such as muscle and adipose. The ongoing research holds great promise in revealing more detailed molecular mechanisms in complex diseases at single-cell resolution.
This study is part of the international Human Cell Atlas (HCA) consortium, which is creating comprehensive reference maps of all human cells as a basis for both understanding human health and diagnosing, monitoring, and treating disease. This paper is one of a Collection of more than 40 HCA publications in Nature Portfolio journals that represent a milestone leap in our understanding of the human body. These highly complementary studies shed light on central aspects of human development, health and disease biology, and vital analytical tools and technologies, all of which will contribute to the creation of the Human Cell Atlas. As an open, scientist-led consortium, HCA is a collaborative effort of researchers, institutes, and funders worldwide, and will provide a foundation to transform and democratise global healthcare.
This project has been supported by grants from the Chan Zuckerberg Foundation (CZF2019-002446), and the Chan Zuckerberg Initiative DAF, an advised fund of Silicon Valley Community Foundation (2020-224570 and 2021-240178). This work is supported by the Ministry of Education Singapore, under its Academic Research Fund Tier 1 (FY2023; 23-0434-A0001; 22-5800-A0001), Tier 2 (MOE-T2EP30123-0015), and the Precision Medicine Translational Research Programme Core Funding under NUHSRO/2020/080/MSC/04/PM. The computational work for this article was partially performed on resources of the National Supercomputing Centre, Singapore and partially supported by NUS IT’s Cloud Credits for Research Programme.