Research Article

Transposable elements regulate thymus development and function

Institute for Research in Immunology and Cancer, Université de Montréal, Canada
Department of Medicine, Université de Montréal, Canada
Deeley Research Centre, BC Cancer, Canada
Department of Medical Genetics, University of British Columbia, Canada
Department of Computer Science and Operations Research, Université de Montréal, Canada
Fred Hutchinson Cancer Center, United States
Department of Physics, University of Washington, United States
Department of Epigenetics and Molecular Carcinogenesis, University of Texas M.D. Anderson Cancer Center, United States
Department of Biochemistry and Molecular Medicine, Université de Montréal, Canada
Department of Chemistry, Université de Montréal, Canada

Apr 18, 2024

https://doi.org/10.7554/eLife.91037.3

Open access

eLife assessment

This important study shows, based on analyses of single-cell RNA-seq data sets of thymus cells, that transposable elements (TEs) are broadly expressed in thymic stromal cells, especially in medullary thymic epithelial cells and plasmacytoid dendritic cells. The authors also show that at least some TE-derived peptides are presented by MHC-I molecules in the thymus. The study provides solid findings supporting a role of TEs in thymic T-cell selection and immune self-tolerance.

https://doi.org/10.7554/eLife.91037.3.sa0

Significance of the findings:

Important: Findings that have theoretical or practical implications beyond a single subfield


Strength of evidence:

Solid: Methods, data and analyses broadly support the claims with only minor weaknesses


During the peer-review process the editor and reviewers write an eLife Assessment that summarises the significance of the findings reported in the article (on a scale ranging from landmark to useful) and the strength of the evidence (on a scale ranging from exceptional to inadequate).

Abstract

Transposable elements (TEs) are repetitive sequences representing ~45% of the human and mouse genomes and are highly expressed by medullary thymic epithelial cells (mTECs). In this study, we investigated the role of TEs in T-cell development in the thymus by performing multiomic analyses of TEs in human and mouse thymic cells. We report that TE expression in the human thymus is high and shows extensive age- and cell lineage-related variations. TE expression correlates with multiple transcription factors in all cell types of the human thymus. Two cell types express particularly broad TE repertoires: mTECs and plasmacytoid dendritic cells (pDCs). In mTECs, transcriptomic data suggest that TEs interact with transcription factors essential for mTEC development and function (e.g., PAX1 and REL), and immunopeptidomic data showed that TEs generate MHC-I-associated peptides implicated in thymocyte education. Notably, AIRE, FEZF2, and CHD4 regulate small yet non-redundant sets of TEs in murine mTECs. Human thymic pDCs homogeneously express large numbers of TEs that likely form dsRNA, which can activate innate immune receptors, potentially explaining why thymic pDCs constitutively secrete IFN ɑ/β. This study highlights the diversity of interactions between TEs and the adaptive immune system. TEs are genetic parasites, and the two thymic cell types most affected by TEs (mTECs and pDCs) are essential to establishing central T-cell tolerance. Therefore, we propose that orchestrating TE expression in thymic cells is critical to prevent autoimmunity in vertebrates.

Introduction

Self/non-self discrimination is a fundamental requirement of life (Boehm, 2012). In jawed vertebrates, the thymus is the only site where T lymphocytes can be properly educated to distinguish self from non-self (Boehm and Swann, 2014; Suo et al., 2022). This is vividly illustrated by Oncostatin M-transgenic mice, where T-cell production occurs exclusively in the lymph nodes (Terra et al., 2005). These mice harbor normal numbers of T-cell receptor (TCR) αβ T cells but present severe autoimmunity and cannot fight infections (Blais et al., 2008). Intrathymic generation of a functional T-cell repertoire depends on choreographed interactions between the TCRs of thymocytes and peptides presented by major histocompatibility complex (MHC) molecules on various antigen-presenting cells (APCs) (Zuñiga-Pflucker et al., 1989). Positive selection depends on self-antigens presented by cortical thymic epithelial cells (cTECs) and ensures that TCRs recognize antigens in the context of the host’s MHC molecules (Breed et al., 2018; Dervović and Zúñiga-Pflücker, 2010). The establishment of central tolerance depends on two main classes of APCs located in the thymic medulla: dendritic cells (DCs) and medullary TECs (mTECs) (Lebel et al., 2020; Srinivasan et al., 2021; Cheng and Anderson, 2018). Two other APC types have a more limited contribution to central tolerance: thymic fibroblasts and B cells (Perera et al., 2016; Nitta et al., 2011). High avidity interactions between thymic APCs and autoreactive thymocytes lead to thymocyte deletion (negative selection) or generation of regulatory T cells (Treg) (Malhotra et al., 2016).

The main drivers of central tolerance, mTECs and DCs, display considerable phenotypic and functional heterogeneity. Indeed, recent single-cell RNA-seq (scRNA-seq) studies have identified several subpopulations of mTECs: immature mTEC(I) that stimulate thymocyte migration to the medulla via chemokine secretion (Lkhagvasuren et al., 2013), mTEC(II) that express high levels of MHC and are essential to tolerance induction, fully differentiated corneocyte-like mTEC(III) that foster a pro-inflammatory microenvironment (Laan et al., 2021), and finally mimetic mTECs that express peripheral tissue antigens (Michelson et al., 2022). Three different proteins whose loss of function leads to severe autoimmunity, AIRE, FEZF2, and CHD4, have been shown to drive the expression of non-redundant sets of peripheral tissue antigens in mTECs (Ramsey et al., 2002; Takaba et al., 2015; Tomofuji et al., 2020). DCs, on the other hand, are separated into three main populations. Conventional DCs 1 and 2 (cDC1 and cDC2) have an unmatched ability to present both endogenous antigens and exogenous antigens acquired via cross-presentation or cross-dressing (Ginhoux et al., 2022). Plasmacytoid DCs (pDCs) are less effective APCs than cDCs, their primary role being to produce interferon alpha (IFNɑ) (Ginhoux et al., 2022). Notably, thymic pDCs originate from intrathymic IRF8^hi precursors, and, in contrast to extrathymic pDCs, they constitutively secrete high amounts of IFNɑ (Colantonio et al., 2011; Lavaert et al., 2020; Le et al., 2020). This constitutive IFNɑ secretion by thymic pDCs regulates the late stages of thymocyte development by promoting the generation of Tregs and innate CD8 T cells (Xing et al., 2016; Hanabuchi et al., 2010; Martín Gayo et al., 2010; Martinet et al., 2015; Epeldegui et al., 2015).

Transposable elements (TEs) are repetitive sequences representing ~45% of the human and mouse genomes (Treangen and Salzberg, 2011; Deniz et al., 2019). Most TEs can be grouped into three categories: the long and short interspersed nuclear elements (LINE and SINE, respectively) and the long terminal repeats (LTRs). These broad categories are subdivided into over 800 subfamilies based on sequence homology (Bourque et al., 2018). TE expression is typically repressed in host cells to prevent deleterious integrations of TE sequences in protein-coding genes (Argueso et al., 2008). Unexpectedly, TEs were recently found to be expressed at higher levels in human mTECs than in any other MHC-expressing tissues and organs (i.e., excluding the testis) (Larouche et al., 2020; Carter et al., 2022), suggesting a role for TEs in thymopoiesis. Since some TEs are translated and generate MHC I-associated peptides (MAPs) (Larouche et al., 2020), they might induce TE-specific central tolerance (Kassiotis, 2023). Additionally, TEs provide binding sites to transcription factors (TFs) and stimulate cytokine secretion via the formation of double-stranded RNA (dsRNA) (Chuong et al., 2016; Bogdan et al., 2020; Adoue et al., 2019; Lefkopoulos et al., 2020; Lima-Junior et al., 2021). Hence, TEs could have pleiotropic effects on thymopoiesis. To evaluate the role of TEs in thymopoiesis, we adopted a multipronged strategy beginning with scRNA-seq of human thymi and culminating in mass spectrometry (MS) analyses of the MAP repertoire of mouse mTECs.

Results

LINE, LTR, and SINE expression shows extensive variations during ontogeny of the human thymus

We first profiled TE expression in various thymic cell populations during development. To do so, we quantified the expression of 809 TE subfamilies (classified according to the RepeatMasker annotations) in the scRNA-seq dataset of human thymi created by Park et al., 2020. Cells were clustered in 19 populations representing the main constituents of the thymic hematolymphoid and stromal compartments (Figure 1a, Figure 1—figure supplement 1). The expression of TE subfamilies was quantified at all developmental stages available, ranging from 7 post-conception weeks (pcw) to 40 years of age (Supplementary file 1a). Unsupervised hierarchical clustering revealed three clusters of TE subfamilies based on their pattern of expression during thymic development (Figure 1b, upper panel): (i) maximal expression at early embryonic stages persisting, albeit at lower levels, throughout ontogeny (cluster 1), (ii) an expression specific to a given timepoint (cluster 2), or (iii) a high expression at early embryonic stages that decreases rapidly at later timepoints (cluster 3). LINE and SINE subfamilies were enriched in cluster 1, whereas LTR subfamilies were significantly enriched in clusters 2 and 3 (Figure 1b, lower panel). Expression of individual LINE and SINE subfamilies was highly shared among different cell types (Figure 1d). In contrast, the LTR subfamilies' expression pattern was shared by fewer cell subsets and adopted a quasi-random distribution (Figure 1d). The pattern of expression assigned to TE subfamilies (Figure 1c, innermost track) was not affected by the proportion of cells of different developmental stages (embryonic or postnatal) (Figure 1c, outermost track, and Figure 1—figure supplement 2). This suggests that our observations do not result from a bias in the composition of the dataset. 
To gain further insights into the expression of TE subfamilies, we studied two biological processes known to regulate TE expression in other contexts: cell proliferation and expression of KRAB zinc-finger proteins (KZFP) (Brocks et al., 2018; Imbeault et al., 2017). Cell cycling scores negatively correlated with TE expression in various thymic cell subsets, particularly for LINE and SINE subfamilies shared among cell types (Figure 1—figure supplement 3 and Supplementary file 1b), whereas analysis of KZFP expression identified ZNF10 as a probable repressor of L1 subfamilies in Th17 and NK cells (Figure 1—figure supplement 4 and Supplementary file 1c). Thus, we conclude that the expression of the three main classes of TEs shows major divergences as a function of age and thymic cell types.
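The class enrichment reported for the three expression clusters can be illustrated with a minimal sketch of a one-sided Fisher's exact test, written here as a hypergeometric tail probability. This is only an illustration under stated assumptions: the function name, the cluster size, and the count of LINE subfamilies in the cluster are hypothetical, and the sidedness of the test used in the study is not specified in this section. Only the totals (809 subfamilies, 171 of them LINE) come from the text and figure legend.

```python
from math import comb

def enrichment_p(k, K, n, N):
    """One-sided Fisher's exact test (hypergeometric upper tail).

    N: total number of TE subfamilies
    K: subfamilies belonging to the class of interest (e.g., LINE)
    n: subfamilies assigned to the cluster
    k: subfamilies of that class observed in the cluster
    Returns P(X >= k) under random assignment of subfamilies to the cluster.
    """
    upper = min(K, n)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Hypothetical example: a cluster of 100 subfamilies containing 40 LINE
# subfamilies, out of 809 subfamilies of which 171 are LINE.
p_value = enrichment_p(k=40, K=171, n=100, N=809)
print(f"one-sided p = {p_value:.3g}")
```

A small p-value here would indicate that the class is over-represented in the cluster relative to its share of all subfamilies.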

Figure 1 with 4 supplements

Long interspersed nuclear elements (LINE), short interspersed nuclear elements (SINE), and long terminal repeats (LTRs) exhibit distinct expression profiles in human thymic cell populations.

(a) UMAP depicting the cell populations present in human thymi (CD4 SP, CD4 single positive thymocytes; CD8 SP, CD8 single positive thymocytes; cTEC, cortical thymic epithelial cells; DC, dendritic cells; DN, double negative thymocytes; DP, double positive thymocytes; Mono/Macro, monocytes and macrophages; mTEC, medullary thymic epithelial cells; NK, natural killer cells; NKT, natural killer T cells; pro/pre-B, pro-B and pre-B cells; Th17, T helper 17 cells; Treg, regulatory T cells; VSMC, vascular smooth muscle cell). Cells were clustered in 19 populations based on the expression of marker genes from Park et al., 2020. (b) Upper panel: heatmap of transposable element (TE) expression during thymic development, with each column representing the expression of one TE subfamily in one cell type. Unsupervised hierarchical clustering was performed, and the dendrogram was manually cut into three clusters (red dashed line). Lower panel: the class of TE subfamilies and significant enrichments in the three clusters (Fisher’s exact tests; ****p≤0.0001). pcw, post-conception week; m, month; y, year. (c) Circos plot showing the expression pattern of TE subfamilies across thymic cells. From outermost to innermost tracks: (i) proportion of cells in embryonic and postnatal samples, (ii) class of TE subfamilies, and (iii) expression pattern of TE subfamilies identified in (b). TE subfamilies are in the same order for all cell types. (d) Histograms showing the number of cell types sharing the same expression pattern for a given TE subfamily. LINE (n = 171), LTR (n = 577), and SINE (n = 60) were compared to a randomly generated distribution (n = 809) (Kolmogorov–Smirnov tests, ****p≤0.0001).

TEs form interactions with transcription factors regulating thymic development and function

TEs provide binding sites to TFs (Chuong et al., 2016; Kunarso et al., 2010; Sundaram et al., 2014), and T-cell development is driven by the coordinated timing of multiple changes in transcriptional regulators (Hosokawa and Rothenberg, 2021). We, therefore, investigated interactions between TE subfamilies and TFs during the development of the human thymus. Two criteria defined an interaction: (i) a significant and positive correlation between the expression of a TF and a TE subfamily in a given cell population, and (ii) the presence of the TF binding motif in the loci of the TE subfamily (Figure 2a). Additionally, we validated the correlations we obtained using a bootstrap procedure to ascertain their reproducibility (see ‘Materials and methods’ for details). This procedure removed weakly correlated TF-TE pairs (Figure 2b). TF-TE interactions were observed in all thymic cell populations (Figure 2c and d, Figure 2—figure supplement 1, and Supplementary file 1d). Numerous TF-TE interactions were conserved between hematolymphoid and stromal cell subsets (Figure 2e). However, the number of interactions and the complexity of the interaction networks were much higher in mTECs than in other cell populations (Figure 2c and d, Figure 2—figure supplements 1 and 2).
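The first criterion (a significant positive correlation that is reproducible under resampling) can be sketched as follows. This is a minimal illustration, not the procedure described in ‘Materials and methods’: the resampling scheme (cells drawn with replacement), the definition of the empirical p-value (share of resamples with a non-positive coefficient), and all function names are assumptions made for the example.

```python
import random

def ranks(values):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman coefficient: Pearson correlation computed on ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:  # constant vector: correlation undefined, report 0
        return 0.0
    return cov / (vx * vy) ** 0.5

def bootstrap_p(tf_expr, te_expr, n_boot=1000, seed=0):
    """Empirical p-value: share of resamples with a non-positive coefficient."""
    rng = random.Random(seed)
    n = len(tf_expr)
    non_positive = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        rho = spearman([tf_expr[i] for i in idx], [te_expr[i] for i in idx])
        if rho <= 0:
            non_positive += 1
    return (non_positive + 1) / (n_boot + 1)

# Hypothetical expression values for one TF and one TE subfamily in 8 cells.
tf = [0.1, 0.4, 0.5, 0.9, 1.2, 1.5, 2.0, 2.2]
te = [1.0, 1.1, 1.9, 2.5, 2.4, 3.8, 4.0, 4.4]
print(spearman(tf, te), bootstrap_p(tf, te, n_boot=200))
```

A TF-TE pair passing this filter would still need to satisfy the second criterion, the presence of the TF binding motif in the loci of the TE subfamily.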

Figure 2 with 3 supplements

Transposable elements (TEs) shape complex gene regulatory networks in human thymic cells.

(a) The flowchart depicts the decision tree for each TE promoter or enhancer candidate. (b) Density heatmap representing the correlation coefficient and the empirical p-value determined by bootstrap for transcription factor (TF) and TE pairs in each cell type of the dataset. The color code shows density (i.e., the occurrence of TF-TE pairs at a specific point). (c) Connectivity map of interactions between TEs and TFs in medullary thymic epithelial cells (mTECs). For visualization purposes, only TF-TE pairs with high positive correlations (Spearman correlation coefficient ≥ 0.3 and p-value adjusted for multiple comparisons with the Benjamini–Hochberg procedure ≤ 0.05) and TF binding sites in ≥1% of TE loci are shown. (d) Number of TF-TE interactions for each thymic cell population. (e) Sharing of TF-TE pairs between thymic cell types. (f) Number of promoter (top) or enhancer (bottom) TE candidates per TF in hematopoietic cells of the thymus. (g) The proportion of statistically significant peaks overlapping with TE sequences in ETS1 ChIP-seq data from NK cells. (h) Genomic tracks depicting the colocalization of ETS1 occupancy (i.e., read coverage) and TE sequences (in red) in the upstream region of two genes in ETS1 ChIP-seq data from NK cells. Statistically significant ETS1 peaks are indicated by the black rectangles.

Several TFs instrumental in thymus development and thymopoiesis interacted with TE subfamilies (Figure 2—figure supplement 2 and Supplementary file 1d). These TFs include the NFKB1 and REL subunits of the NF-κB complex and PAX1 in mTECs (Baik et al., 2016; Akiyama et al., 2008; Yamazaki et al., 2020) and JUND in thymocytes (Meixner et al., 2004). In DCs, the most notable TF-TE interactions involved interferon regulatory factors (IRF), which regulate the late stages of T-cell maturation, and TCF4, which is essential for pDC development (Xing et al., 2016; Cisse et al., 2008). This observation is consistent with evidence that TEs have shaped the evolution of IFN signaling networks (Chuong et al., 2016). Finally, we found significant interactions between CTCF and TE subfamilies in mTECs and endothelial cells, suggesting that the binding of CTCF to TE sequences affects the tridimensional structure of the chromatin in the thymic stroma (Choudhary et al., 2020). Interestingly, LINE and SINE subfamilies that occupy more genomic space interacted with higher numbers of transcription factors (Figure 2—figure supplement 3).

Using data from the ENCODE consortium for hematopoietic cells (Consortium, 2012; Luo et al., 2020), we looked at the histone marks at the TE loci identified as TF interactors by our analyses (i.e., correlated with TF expression and containing the TF binding motif). The objective was to determine if they could act as promoters or enhancers (Figure 2a and Supplementary file 1e). We found several TE promoter and enhancer candidates in all eight hematopoietic cell types analyzed, with a striking overrepresentation of LINE and SINE compared to LTR sequences (Figure 2f and Supplementary file 1f). Finally, we analyzed publicly available ChIP-seq data of ETS1, an important TF for NK cell development (Taveirne et al., 2020), to confirm its ability to bind TE sequences. Indeed, 19% of ETS1 peaks overlap with TE sequences (Figure 2g). Notably, ETS1 peaks overlapped with TE sequences (Figure 2h, in red) in the promoter regions of PRF1 and KLRD1, two genes critical for NK cells’ effector functions (Kim et al., 2014; Gunturi et al., 2004). Hence, our data suggest that TEs affect thymic development and function by providing binding sites to multiple TFs.
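The proportion of ChIP-seq peaks overlapping TE sequences is, in essence, an interval-overlap count. The sketch below shows one simple way to compute such a proportion from two lists of genomic intervals. It assumes half-open, BED-style coordinates and counts any overlap of at least one base; the function names and the example coordinates are hypothetical, and the overlap criterion used in the study may differ.

```python
from bisect import bisect_right
from collections import defaultdict

def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

def fraction_overlapping(peaks, te_loci):
    """Fraction of peaks sharing at least one base with any TE locus.

    peaks, te_loci: lists of (chrom, start, end), half-open coordinates.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in te_loci:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        merged = merge_intervals(intervals)
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    hits = 0
    for chrom, start, end in peaks:
        if chrom not in index:
            continue
        starts, ends = index[chrom]
        # First merged TE interval ending strictly after the peak start.
        i = bisect_right(ends, start)
        if i < len(starts) and starts[i] < end:
            hits += 1
    return hits / len(peaks) if peaks else 0.0

# Hypothetical coordinates, for illustration only.
peaks = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 0, 50)]
te_loci = [("chr1", 150, 160), ("chr1", 400, 500), ("chr3", 0, 100)]
print(fraction_overlapping(peaks, te_loci))
```

Merging TE intervals first keeps the lookup to one binary search per peak, which matters when the TE annotation contains millions of loci.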

TEs are highly and differentially expressed in human thymic APC subsets

We next sought to determine whether the high expression of TEs reported in mTECs (Larouche et al., 2020; Carter et al., 2022) was limited to this cell subset or was found in other thymic cell types. Since several thymic stromal cells reach maturity after birth (Bornstein et al., 2018), we selected postnatal samples for the following analyses. We computed two distinct Shannon entropy indices: one for the global diversity of TEs expressed by all cells of a given population and another for the median value of TE diversity expressed by individual cells of a population (Figure 3a). Then, we computed a linear model to represent the diversity of TEs expressed by a cell population based on the diversity of TEs expressed by individual cells (Figure 3a, blue curve). Two salient findings emerged from this analysis. First, the diversity of TEs expressed in the T-cell lineage decreases during differentiation according to the following hierarchy: DN thymocytes > DP thymocytes > SP thymocytes (Figure 3a, Figure 3—figure supplement 1). Second, among the populations of thymic APCs implicated in positive and negative selection (Figure 3a, orange dots), cTECs, mTECs, and DCs expressed broader repertoires of TEs than B cells and fibroblasts. While cTECs and DCs expressed highly diverse TE repertoires at both the population and individual cell levels, the breadth of TE expression in mTECs was found only at the population level (Figure 3a). Accordingly, intercellular heterogeneity (i.e., deviation from the linear model) was higher for mTECs than other cell populations (Figure 3b).
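The two diversity indices can be made concrete with a short sketch. It computes the Shannon entropy H = −Σ pᵢ log₂ pᵢ over TE subfamilies, once per cell (summarized by the median) and once on counts pooled across the population. The use of base-2 logarithms, the pooling by summation, and the function names are assumptions for illustration; the toy count matrices are hypothetical.

```python
from math import log2
from statistics import median

def shannon_entropy(counts):
    """Shannon entropy (bits) of a vector of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def diversity_indices(cells):
    """Return (median per-cell entropy, entropy of the pooled population).

    cells: list of per-cell count vectors, one entry per TE subfamily.
    """
    per_cell = median(shannon_entropy(cell) for cell in cells)
    pooled = [sum(column) for column in zip(*cells)]
    return per_cell, shannon_entropy(pooled)

# Mosaic population: each cell expresses a single, different TE subfamily.
mosaic = [[9, 0, 0, 0], [0, 9, 0, 0], [0, 0, 9, 0], [0, 0, 0, 9]]
# Homogeneous population: every cell expresses all four subfamilies.
homogeneous = [[2, 2, 2, 2]] * 4

print(diversity_indices(mosaic))
print(diversity_indices(homogeneous))
```

In this toy example both populations reach the same population-level diversity, but only the homogeneous one shows high diversity in individual cells. A population lying far above the linear model, as described for mTECs, resembles the mosaic case.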

Figure 3 with 3 supplements

Human plasmacytoid dendritic cells (pDCs) and mTEC(II) express diverse and distinct repertoires of transposable element (TE) sequences.

(a) Diversity of TEs expressed by thymic populations measured by Shannon entropy. The x and y axes represent the median diversity of TEs expressed by individual cells in a population and the global diversity of TEs expressed by an entire population, respectively. The equation and blue curve represent a linear model summarizing the data. Thymic antigen-presenting cell (APC) subsets are indicated in orange. (b) Difference between the observed diversity of TEs expressed by cell populations and the one expected by the linear model in (a). (c) UMAP showing the subsets of thymic APCs (aDC, activated DC; cDC1, conventional DC1; cDC2, conventional DC2). (d) Bar plot showing the number and class of differentially expressed TE subfamilies between APC subsets. (e) Frequency of expression of TE subfamilies by the different APC subsets. The distributions for pDCs and mTEC(II) are highlighted in bold. mTEC, medullary thymic epithelial cell.

We next focused on thymic APCs expressing the broadest TE repertoires: cTECs, mTECs, and DCs (Figure 3a). To this end, we annotated these APC subpopulations based on previously published lists of marker genes (Figure 3c, Figure 3—figure supplement 2; Park et al., 2020; Bautista et al., 2021). We performed differential expression analyses to determine whether some TE subfamilies were overexpressed in specific APC subsets. pDCs and mTEC(II) overexpressed a broader TE repertoire than other APCs: 32.01% of subfamilies were overexpressed in pDCs and 10.88% in mTEC(II) (Figure 3d and Supplementary file 1g). The nature of the overexpressed TEs differed between pDCs and other thymic APC subsets. Indeed, pDCs overexpressed LTRs, LINEs, and SINEs, including several Alu and L1 subfamilies (Figure 3d and Supplementary file 1g). In contrast, other thymic APCs predominantly overexpressed LTRs.

TE expression showed wildly divergent levels of intercellular heterogeneity in APC subsets. Indeed, whereas most TE subfamilies were expressed by <25% of cells of the mTEC(II) population, an important proportion of TEs were expressed by >75% of pDCs (Figure 3e). To evaluate this question further, we compared TE expression between metacells of thymic APCs; metacells are small clusters of cells with highly similar transcription profiles. This analysis revealed that overexpression of TE subfamilies was shared between pDC metacells but not mTEC(II) metacells, reinforcing the idea that TE expression adopts a mosaic pattern in the mTEC(II) population (Figure 3—figure supplement 3). We conclude that cTECs, mTECs, and DCs express broad TE repertoires. However, two subpopulations of thymic APCs clearly stand out. pDCs express an extremely diversified repertoire of LTRs, SINEs, and LINEs, showing limited intercellular heterogeneity, whereas the mTEC(II) population shows a highly heterogeneous overexpression of LTR subfamilies.

TE expression in human pDCs is associated with dsRNA structures

The high expression of a broad repertoire of TE sequences in thymic pDCs was unexpected (Figure 3d). LINE and SINE subfamilies, in particular, were highly and homogeneously expressed by thymic pDCs (Figure 4a). Constitutive IFNɑ secretion is a feature of thymic pDCs not found in extrathymic pDCs. We, therefore, hypothesized that this constitutive IFNɑ secretion by thymic pDCs might be mechanistically linked to their TE expression profile. We first assessed whether thymic and extrathymic pDCs have similar TE expression profiles by reanalyzing scRNA-seq data from human spleens published by Madissoon et al., 2019 (Figure 4—figure supplement 1a and b). This revealed that extrathymic pDCs express TE sequences at similar or lower levels than other splenic cells (Figure 4—figure supplement 1c and d). We then used pseudobulk RNA-seq methods to perform a differential expression analysis of TE subfamilies between thymic and splenic pDCs. This analysis confirmed that TE expression was globally higher in thymic than in extrathymic pDCs (Figure 4b). Since TE overexpression can lead to the formation of dsRNA (Lefkopoulos et al., 2020; Lima-Junior et al., 2021), we investigated if such structures were found in thymic pDCs. pDCs were magnetically enriched from primary human thymi following labeling with anti-CD303 antibody (a marker of pDCs). Then, pDC-enriched thymic cells were stained with an antibody against CD123 (another marker of pDCs) and the J2 antibody that stains dsRNA. The intensity of the J2 signal was more than tenfold higher in CD123⁺ relative to CD123⁻ cells (Figure 4c and d). We conclude that thymic pDCs contain large amounts of dsRNAs. To evaluate if these dsRNAs arise from TE sequences, we analyzed in thymic APC subsets the proportion of the transcriptome assigned to two groups of genomic sequences known as important sources of dsRNAs: TEs and mitochondrial genes (Sadeq et al., 2021).
Strikingly, whereas the percentage of reads from mitochondrial genes was typically lower in pDCs than in other thymic APCs, the proportion of the transcriptome originating from TEs was higher in pDCs (~22%) by several orders of magnitude (Figure 4—figure supplement 2). Finally, we performed gene set enrichment analyses to ascertain if the high expression of TEs by thymic pDCs was associated with specific gene signatures. These analyses highlighted signatures of antigen presentation, immune response, and interferon signaling in thymic pDCs (Figure 4e and Supplementary file 1h). Notably, thymic pDCs harbored moderate yet significant enrichment of gene signatures of RIG-I and MDA5-mediated IFN ɑ/β signaling compared to all other thymic APCs (Figure 4e and Supplementary file 1h). Altogether, these data support a model in which the high and ubiquitous expression of TEs in thymic pDCs would lead to the formation of dsRNAs triggering innate immune sensors, which might explain their constitutive secretion of IFN ɑ/β.

Figure 4 with 2 supplements

Transposable element (TE) expression in human plasmacytoid dendritic cells (pDCs) is associated with dsRNA formation and type I IFN signaling.

(a) Frequency of long interspersed nuclear elements (LINE), long terminal repeat (LTR), and short interspersed nuclear elements (SINE) subfamilies expression in thymic pDCs. (b) Differential expression of TE subfamilies between splenic and thymic pDCs. TE subfamilies significantly upregulated or downregulated by thymic pDCs are indicated in red and blue, respectively (Upregulated, log₂(Thymus/Spleen) ≥ 1 and adj. p≤0.05; Downregulated, log₂(Thymus/Spleen) ≤ –1 and adj. p≤0.05). (c, d) Immunostaining of dsRNAs in human thymic pDCs (CD123⁺) using the J2 antibody (n = 3). (c) One representative experiment. Three examples of CD123 and J2 colocalization are shown with white arrows. (d) J2 staining intensity in CD123⁺ and CD123⁻ cells from three human thymi (Wilcoxon rank-sum test, ****p-value≤0.0001). (e) UpSet plot showing gene sets enriched in pDCs compared to the other populations of thymic antigen-presenting cells (APCs). On the lower panel, black dots represent cell populations for which gene signatures are significantly depleted compared to pDCs. All comparisons where gene signatures were significantly enriched in pDCs are shown.

AIRE, CHD4, and FEZF2 regulate distinct sets of TE sequences in murine mTECs

The essential role of mTECs in central tolerance hinges on their ability to ectopically express tissue-restricted genes, whose expression is otherwise limited to specific epithelial lineages (Sansom et al., 2014; Pierre et al., 2017). This promiscuous gene expression is driven by AIRE, CHD4, and FEZF2 (Ramsey et al., 2002; Takaba et al., 2015; Tomofuji et al., 2020). We, therefore, investigated the contribution of these three genes to the expression of TE subfamilies in the mTEC(II) population (Figure 3d). First, we validated that mTEC(II) express AIRE, CHD4, and FEZF2 in the human scRNA-seq dataset (Figure 5a). Next, we analyzed published murine mTEC RNA-seq data to assess the regulation of TE sequences by AIRE, CHD4, and FEZF2. Differential expression analyses between knock-out (KO) and wild-type (WT) mice showed that these three factors regulate TE sequences, but the magnitude and directionality of this regulation differed (Figure 5b and Supplementary file 1i). Indeed, while CHD4 had the biggest impact on TE expression by inducing 433 TE loci and repressing 463, FEZF2’s impact was minimal, with 97 TE loci induced and 60 repressed (Figure 5b). AIRE, for its part, mainly acted as a repressor of TE sequences, with 326 loci repressed and 171 induced (Figure 5b). Interestingly, there was minimal overlap between the TE sequences regulated by AIRE, CHD4, and FEZF2, indicating that they have non-redundant roles in TE regulation (Figure 5c). Additionally, AIRE, CHD4, and FEZF2 preferentially targeted LTR and LINE elements, with significant enrichment of specific subfamilies such as MTA_Mm-int and RLTR4_Mm that are induced by Aire and Fezf2, respectively (Figure 5d and Figure 5—figure supplement 1a). While AIRE and CHD4 preferentially targeted evolutionarily young TE sequences, the age of the TE sequence did not seem to affect the regulation by FEZF2 (Figure 5—figure supplement 1b).
We also noticed that the distance between regulated TE loci was smaller than expected from distributions of randomly selected TEs (Figure 5—figure supplement 1c). This suggests that AIRE, CHD4, and FEZF2 nonrandomly affect the expression of TE sequences located in specific genomic regions. We observed no significant differences in the genomic localization of TE loci targeted by AIRE, CHD4, and FEZF2 relative to the genomic localization of all TE sequences in the murine genome: most TE loci were located in intronic and intergenic regions (Figure 5—figure supplement 1d). Enrichment for intronic TEs could not be ascribed to induction of global intron retention: the intron retention ratio was similar for TEs regulated or not by AIRE, CHD4, and FEZF2 (Figure 5—figure supplement 1e). ChIP-seq-based analysis of permissive histone marks showed that TE loci induced by AIRE, CHD4, and FEZF2 were all marked by H3K4me3 (Figure 5e). As a proof of concept, we validated that 31.42% of AIRE peaks overlap with TE sequences by reanalyzing ChIP-seq data, confirming AIRE’s potential to bind TE sequences (Figure 5f). Hence, AIRE, CHD4, and FEZF2 regulate the expression of small yet non-redundant repertoires of TE sequences associated with permissive histone marks.

Figure 5 with 1 supplement

*AIRE*, *FEZF2*, and *CHD4* regulate non-redundant sets of transposable elements (TEs) in murine medullary thymic epithelial cells (mTECs).

(a) Expression of *AIRE*, *CHD4*, and *FEZF2* in human TEC subsets. (b) Differential expression of TE loci between wild-type (WT) and *Aire*-, *Chd4*-, or *Fezf2*-knockout (KO) mice (Induced, log₂(WT/KO) ≥ 2 and adj. p≤0.05; Repressed, log₂(WT/KO) ≤ –2 and adj. p≤0.05). p-Values were corrected for multiple comparisons with the Benjamini–Hochberg procedure. The numbers of induced (red) and repressed (blue) TE loci are indicated on the volcano plots. (c) Overlap of TE loci repressed or induced by AIRE, FEZF2, and CHD4. (d) Proportion of TE classes and subfamilies in the TE loci regulated by AIRE, FEZF2, or CHD4, as well as all TE loci in the murine genome for comparison (chi-squared tests with Bonferroni correction, **adj. p≤0.01, ***adj. p≤0.001). (e) Plots for the tag density of H3K4me3 and H3K4me2 on the sequence and flanking regions (3000 base pairs) of TE loci induced by AIRE, FEZF2, and CHD4. (f) Proportion of statistically significant peaks overlapping TE sequences in AIRE ChIP-seq data from murine mTECs.
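The classification rule stated in the legend of panel (b) can be written out as a short sketch: p-values are adjusted with the Benjamini–Hochberg procedure, and a TE locus is called induced or repressed when log₂(WT/KO) is ≥ 2 or ≤ –2 with an adjusted p ≤ 0.05. The function names and the input values below are hypothetical; the upstream differential expression test that produces the raw p-values is not shown.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def classify_locus(log2_wt_over_ko, adj_p, min_abs_lfc=2.0, alpha=0.05):
    """Label a TE locus using the thresholds from the figure legend."""
    if adj_p <= alpha and log2_wt_over_ko >= min_abs_lfc:
        return "induced"    # higher in WT: the factor promotes expression
    if adj_p <= alpha and log2_wt_over_ko <= -min_abs_lfc:
        return "repressed"  # higher in KO: the factor restrains expression
    return "unchanged"

# Hypothetical log2 fold changes and raw p-values for four TE loci.
log2_fc = [3.1, -2.4, 2.2, 0.3]
raw_p = [0.001, 0.004, 0.2, 0.6]
adj_p = benjamini_hochberg(raw_p)
labels = [classify_locus(fc, p) for fc, p in zip(log2_fc, adj_p)]
print(list(zip(adj_p, labels)))
```

Because the ratio is WT over KO, a positive fold change means that expression is lost in the knockout, which is why such loci are described as induced by the factor.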

TEs are translated and presented by MHC class I molecules in murine TECs

Several TEs are translated and generate MAPs (Larouche et al., 2020). Hence, the expression of TEs in cTECs and even more in mTECs raises a fundamental question: do these TEs generate MAPs that would shape the T cell repertoire? Mass spectrometry (MS) is the only method that can faithfully identify MAPs (Shapiro and Bassani-Sternberg, 2023; Vizcaíno et al., 2020; Kubiniok et al., 2022). Despite its quintessential role in central tolerance, the MAP repertoire of mTECs has never been studied by MS because of the impossibility of obtaining sufficient mTECs for MS analyses: mTECs represent ≤1% of thymic cells, and they do not proliferate in vitro. To get enough cTECs and mTECs for MS analyses, we used transgenic mice that express cyclin D1 under the control of the keratin 5 promoter (K5D1 mice). These mice develop dramatic thymic hyperplasia, but their thymus is morphologically and functionally normal (Robles et al., 1996; Klug et al., 2000; Ohigashi et al., 2019). Primary cTECs and mTECs (two replicates of 70 × 10⁶ cells from 121 and 90 mice, respectively) were isolated from the thymi of K5D1 mice as described (Dumont-Lagacé et al., 2019). Following cell lysis and MHC I immunoprecipitation, MAPs were analyzed by liquid chromatography MS/MS (Figure 6a). To identify TE-coded MAPs, we generated a TE proteome by in silico translation of TE transcripts expressed by mTECs or cTECs, and this TE proteome was concatenated with the canonical proteome. MS analyses enabled the identification of a total of 1636 and 1714 MAPs in mTECs and cTECs, respectively. From these, we identified four TE-derived MAPs in mTECs and two in cTECs, demonstrating that TEs can be translated and presented by MHC I in the thymic cortex and medulla (Figure 6b and Supplementary file 1j). These MAPs were coded by the three major groups of TE: LINEs (n = 1), LTRs (n = 1), and SINEs (n = 4). 
Next, we evaluated whether the low number of TE MAPs identified could result from mass spectrometry detection limits (Ghosh et al., 2020; Nanaware et al., 2021). We measured the level and frequency of TE expression in two subsets of cTECs (Figure 6c, left) or mTECs (Figure 6c, right) using scRNA-seq data from Baran-Gale et al. (2020). TE subfamilies generating MAPs in cTECs or mTECs are highlighted in red in their respective plots. Strikingly, the MAP-generating TE subfamilies were highly and ubiquitously expressed in TECs. These results suggest that the contribution of TEs to the MAP repertoire of cTECs and mTECs might be significantly underestimated because of the limits of detection of MS. This is particularly true for mTECs because they express high levels of TEs (Figure 3d), but their TE profile displays considerable intercellular heterogeneity (Figure 3e, Figure 3—figure supplement 2). Nonetheless, our data provide direct evidence that TEs can generate MAPs presented by cTECs and mTECs, which can contribute to thymocyte education.
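The construction of the MS search database described in this section (in silico translation of expressed TE transcripts, concatenated with the canonical proteome) can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: translation in the three forward frames, the standard genetic code, splitting at stop codons, the minimum fragment length of eight residues, and the entry naming scheme are all choices made here for the example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_three_frames(sequence, min_len=8):
    """Translate a transcript in the three forward frames.

    Returns (frame, fragment) pairs for every stop-free fragment of at least
    min_len residues. Codons containing ambiguous bases are rendered as 'X'.
    """
    seq = sequence.upper().replace("U", "T")
    fragments = []
    for frame in range(3):
        protein = "".join(
            CODON_TABLE.get(seq[i:i + 3], "X")
            for i in range(frame, len(seq) - 2, 3)
        )
        for piece in protein.split("*"):
            if len(piece) >= min_len:
                fragments.append((frame, piece))
    return fragments


def build_search_database(canonical_proteome, te_transcripts, min_len=8):
    """Concatenate the canonical proteome with translated TE fragments.

    Both inputs are dicts mapping an identifier to a sequence. The
    'TE|id|frameN|n' key format is a hypothetical naming convention.
    """
    database = dict(canonical_proteome)
    for te_id, seq in te_transcripts.items():
        for n, (frame, piece) in enumerate(translate_three_frames(seq, min_len)):
            database[f"TE|{te_id}|frame{frame}|{n}"] = piece
    return database
```

Tagging TE-derived entries with a distinct prefix makes it straightforward to separate TE MAPs from canonical MAPs once peptides have been matched against the combined database.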

Figure 6


Murine cortical thymic epithelial cells (cTECs) and medullary thymic epithelial cells (mTECs) present transposable element (TE) MAPs.

(a) mTECs and cTECs were isolated from the thymi of K5D1 mice (n = 2). The peptide-MHC I complexes were immunoprecipitated independently for both populations, and MAPs were sequenced by MS analyses. (b) Number of long interspersed nuclear element (LINE)-, long terminal repeat (LTR)-, and short interspersed nuclear element (SINE)-derived MAPs in mTECs and cTECs from K5D1 mice. (c) Distributions of TE subfamilies in murine TEC subsets based on expression level (x-axis) and frequency of expression (y-axis).

Discussion

TEs are germline-integrated parasitic DNA elements that comprise about half of mammalian genomes. Over evolutionary timescales, TE sequences have been co-opted for host regulatory functions. Mechanistically, TEs encode proteins and noncoding RNAs that regulate gene expression at multiple levels (Bourque et al., 2018; Frank and Feschotte, 2017). Regulation of IFN signaling and triggering innate sensors are the best-characterized roles of TEs in the mammalian immune system (Kassiotis, 2023). TEs are immunogenic and can elicit adaptive immune responses implicated in autoimmune diseases (Larouche et al., 2020; Kassiotis, 2023; Gröger and Cynis, 2018; Volkman and Stetson, 2014). Pervasive TE expression in various somatic organs means that co-evolution with their host must depend on establishing immune tolerance, a concept supported by the highly diversified TE repertoire expressed in mTECs (Larouche et al., 2020). This observation provided the impetus to perform multiomic studies of TE expression in the thymus. At the whole organ level, we found that TE expression showed extensive age- and cell lineage-related variations and was negatively correlated with cell proliferation and expression of KZFPs. The negative correlation between TE expression and cell cycle scores in the thymus is consistent with recent data showing that transcriptional activity of L1s is increased in senescent cells (De Cecco et al., 2019). A potential rationale for this could be to prevent deleterious transposition events during DNA replication and cell division. On the other hand, the contribution of KZFPs to TE regulation in the thymus is likely underestimated due to their typically low expression (Huntley et al., 2006) and the detection limit of scRNA-seq. Additionally, TEs interact with multiple TFs in all thymic cell subsets. This is particularly true for the LINE and SINE subfamilies that occupy larger genomic spaces.
Notably, TEs appear to play particularly important roles in two cell types located in the thymic medulla: mTECs and pDCs.

As mTECs are the APC population crucial to central tolerance induction, their high and diverse TE expression is poised to impact the T cell repertoire’s formation profoundly. The extent and complexity of TF-TE interactions were higher in mTECs than in all other thymic cell subsets. These interactions included PAX1 and subunits of the NF-κB complex (e.g., RELB). PAX1 is essential for the development of TEC progenitors (Yamazaki et al., 2020), and RELB is essential for the development and differentiation of mTECs (Mouri et al., 2014). RelB-deficient mice have reduced thymic cellularity and markedly fewer mTECs, lack Aire expression, and suffer from autoimmunity (Akiyama et al., 2008; O’Sullivan et al., 2018). Under the influence of Aire, Fezf2, and Chd4, mTECs collectively express almost the entire exome (Sansom et al., 2014; Pierre et al., 2017). However, the expression of all genes in each mTEC would cause proteotoxic stress (Pierre et al., 2017). Hence, promiscuous expression of tissue-restricted genes in mTECs adopts a mosaic pattern: individual tissue-restricted genes are expressed in a small fraction of mTECs (Michelson et al., 2022; Klein et al., 2014). The present work shows that mTECs also express an extensive repertoire of TEs in a mosaic pattern (i.e., with considerable intercellular heterogeneity). Aire, Fezf2, and Chd4 regulate non-redundant sets of TEs and preferentially induce TE sequences associated with permissive histone marks. The immunopeptidome of thymic stromal cells is responsible for thymocyte education and represents one of the most fundamental ‘known unknowns’ in immunology. Inferences on the immunopeptidome of thymic stromal cells are based on transcriptomic data. However, (i) TCRs interact with MAPs, not transcripts, and (ii) the MAP repertoire cannot be inferred from the transcriptome (Shapiro and Bassani-Sternberg, 2023; Caron et al., 2011; Admon, 2023).
Using K5D1 mice presenting prominent thymic hyperplasia, we conducted MS searches of TE MAPs, identifying four TE MAPs in mTECs and two in cTECs. These results demonstrate that cTECs and mTECs present TE MAPs and suggest they present different TE MAPs. However, the correlation between transcriptomic and immunopeptidomic data suggests that TECs can present many more TE MAPs. Their profiling will require MS analyses of enormous numbers of TECs or the development of more sensitive MS techniques. As TE MAPs have been detected in normal and neoplastic extrathymic cells (Larouche et al., 2020; Laumont et al., 2018; Burbage et al., 2023; Shah et al., 2023), the presentation of TEs by mTECs is likely essential to central tolerance. In line with strong pleas for a collaborative Human Immunopeptidome Project (Vizcaíno et al., 2020; Shao et al., 2018), our work suggests that immunopeptidomic studies should not be limited to protein-coding genes (2% of the genome) but should also encompass non-coding sequences such as TEs.

The second population of cells exhibiting high TE expression, pDCs, are mainly seen as producers of IFN α/β and potentially as APCs (Ginhoux et al., 2022). Thymic and extrathymic pDCs are ontogenically and functionally different. They develop independently from each other from different precursor cells (Lavaert et al., 2020; Le et al., 2020; Weijer et al., 2002). IFN α/β secretion is inducible in extrathymic pDCs but constitutive in thymic pDCs (Ginhoux et al., 2022; Colantonio et al., 2011). In line with the location of pDCs in the thymic medulla, their constitutive IFN α/β secretion is instrumental in the terminal differentiation of thymocytes and the generation of Tregs and innate CD8 T cells (Xing et al., 2016; Hanabuchi et al., 2010; Martín Gayo et al., 2010; Martinet et al., 2015; Epeldegui et al., 2015). We report here that high TE expression is also a feature of thymic, but not extrathymic, pDCs. Thus, the present study provides a rationale for the constitutive IFN α/β secretion by thymic pDCs: they homogeneously express large numbers of TEs (in particular LINEs and SINEs), leading to the formation of dsRNAs that trigger RIG-I and MDA5 signaling that causes the constitutive secretion of IFN α/β. As such, our data suggest that recognition of TE-derived dsRNAs by innate immune receptors promotes a pro-inflammatory environment favorable to the establishment of central tolerance in the thymic medulla.

At first sight, the pleiotropic effects of TEs on thymic function may look surprising. It should be remembered that the integration of genetic parasites such as TEs is a source of genetic conflicts with the host. Notably, the emergence of adaptive immunity gave rise to higher-order conflicts between TEs and their vertebrate hosts (Kassiotis, 2023; Boehm et al., 2023). The crucial challenge for the immune system is developing immune tolerance towards TEs to prevent autoimmune diseases that affect up to 10% of humans (Harroud and Hafler, 2023) without allowing selfish retrotransposition events that hinder genome integrity. The resolution of these conflicts has been proposed to be a determining factor in shaping the function of the immune system (Boehm et al., 2023). Our data suggest that the thymus is the central battlefield for conflict resolution between TEs and T cells in vertebrates. Consistent with the implication of TEs in autoimmunity, more than 90% of putative causal variants associated with autoimmune diseases are in allegedly noncoding regions of the genome (Harroud and Hafler, 2023). In this context, our study illustrates the complexity of interactions between TEs and the vertebrate immune system and should provide impetus to explore them further in health and disease. We see two limitations to our study. First, as with all multiomic systems immunology studies, our work provides a roadmap for many future mechanistic studies that could not be realized at this stage. Second, our immunopeptidomic analyses of TECs prove that TECs present TE MAPs but certainly underestimate the diversity of TE MAPs presented by cTECs and mTECs.



Author details

Jean-David Larouche, Céline M Laumont, Assya Trofimov, Krystel Vincent, Leslie Hesnard, Sylvie Brochu, Caroline Côté, Juliette F Humeau, Éric Bonneil, Joel Lanoix, Chantal Durette, Patrick Gendron, Jean-Philippe Laverdure, Ellen R Richie, Sébastien Lemieux, Pierre Thibault, Claude Perreault (for correspondence)
