T-REX17 is a transiently expressed non-coding RNA essential for human endoderm formation

  1. Alexandro Landshammer
  2. Adriano Bolondi
  3. Helene Kretzmer
  4. Christian Much
  5. René Buschow
  6. Alina Rose
  7. Hua-Jun Wu
  8. Sebastian D Mackowiak
  9. Bjoern Braendl
  10. Pay Giesselmann
  11. Rosaria Tornisiello
  12. Krishna Mohan Parsi
  13. Jack Huey
  14. Thorsten Mielke
  15. David Meierhofer
  16. René Maehr
  17. Denes Hnisz
  18. Franziska Michor
  19. John L Rinn
  20. Alexander Meissner (corresponding author)
  1. Department of Genome Regulation, Max Planck Institute for Molecular Genetics, Germany
  2. Institute of Chemistry and Biochemistry, Freie Universität Berlin, Germany
  3. Department of Biochemistry, University of Colorado Boulder and BioFrontiers Institute, United States
  4. Max Planck Institute for Molecular Genetics, Microscopy Core Facility, Germany
  5. Helmholtz Institute for Metabolic, Obesity and Vascular Research, Germany
  6. Department of Data Science, Dana-Farber Cancer Institute, Department of Biostatistics, Harvard T. H. Chan School of Public Health, United States
  7. Center for Precision Medicine Multi-Omics Research, School of Basic Medical Sciences, Peking University Health Science Center and Peking University Cancer Hospital and Institute, China
  8. Program in Molecular Medicine, University of Massachusetts Medical School, United States
  9. Max Planck Institute for Molecular Genetics, Mass Spectrometry Core Facility, Germany
  10. Diabetes Center of Excellence, University of Massachusetts Medical School, United States
  11. Department of Stem Cell and Regenerative Biology, Harvard University, United States
  12. Broad Institute of MIT and Harvard, United States
  13. Department of Data Science, Dana-Farber Cancer Institute, and Department of Biostatistics, Harvard T. H. Chan School of Public Health, United States
  14. The Ludwig Center at Harvard, Boston, MA 02215, USA, and Center for Cancer Evolution, Dana-Farber Cancer Institute, United States

Abstract

Long non-coding RNAs (lncRNAs) have emerged as fundamental regulators in various biological processes, including embryonic development and cellular differentiation. Despite much progress over the past decade, the genome-wide annotation of lncRNAs remains incomplete and many known non-coding loci are still poorly characterized. Here, we report the discovery of a previously unannotated lncRNA that is transcribed 230 kb upstream of the SOX17 gene and located within the same topologically associating domain. We termed it T-REX17 (Transcript Regulating Endoderm and activated by soX17) and show that it is induced following SOX17 activation but its expression is more tightly restricted to early definitive endoderm. Loss of T-REX17 affects crucial functions independent of SOX17 and leads to an aberrant endodermal transcriptome, signaling pathway deregulation and epithelial to mesenchymal transition defects. Consequently, cells lacking the lncRNA cannot further differentiate into more mature endodermal cell types. Taken together, our study identified and characterized T-REX17 as a transiently expressed and essential non-coding regulator in early human endoderm differentiation.

Editor's evaluation

Supported by a large set of complementary experiments, the authors convincingly show that the lncRNA T-REX17 is required for human definitive endoderm differentiation. T-REX17 function is not related to the adjacent SOX17 gene that lies in the same topological domain (TAD), implying a trans-acting role. The study is important because it sheds light on the stage-specific role of lncRNAs in cell lineage induction.

https://doi.org/10.7554/eLife.83077.sa0

Introduction

To date, nearly 28,000 long non-coding RNAs (lncRNAs) have been reported in the human genome, but less than 1% (~150) have been functionally characterized (Ransohoff et al., 2018; Hon et al., 2017; Quek et al., 2015; Jiang et al., 2016). Several of those have been shown to influence cellular physiology in developmental, adult and disease contexts (Sarropoulos et al., 2019; James, 2015; Prensner et al., 2011; Castellanos-Rubio et al., 2016; Perry and Ulitsky, 2016; Lorenzi et al., 2021). Depending on their genomic location, lncRNAs can be classified into genic lncRNAs (overlapping with a protein-coding gene) or intergenic lncRNAs (lincRNAs; no overlap with a protein-coding gene) (Ransohoff et al., 2018). Together with transcription factors and epigenetic regulators (Hung et al., 2011; Jeon and Lee, 2011; Boque-Sastre et al., 2015), lncRNAs participate in complex gene-regulatory networks by fine-tuning gene expression in a precise and controlled manner (Grote and Herrmann, 2015). In particular, lncRNAs have been shown to modulate gene expression at multiple levels, including shaping chromatin structure and folding (Gupta et al., 2010), activating neighboring (Engreitz et al., 2013) and distal (Hacisuleyman et al., 2014) genes, affecting RNA splicing (Pisignano and Ladomery, 2021), or influencing nuclear compartmentalization (Caudron-Herger and Rippe, 2012; Rinn and Guttman, 2014; Quinodoz and Guttman, 2014).

More specifically, long non-coding RNAs have also been shown to fine-tune the activation and function of developmental regulators, including transcription factors responsible for maintenance of pluripotency (Sheik Mohamed et al., 2010; Ulitsky et al., 2011; Ng et al., 2012), mesoderm specification (Frank et al., 2019) and neuronal differentiation (Xi et al., 2022). Recent studies have also attributed critical roles to lncRNAs in the early stages of human development, in particular during definitive endoderm specification through cis-regulatory activity on nearby genes (Jiang et al., 2015; Yang et al., 2020). For instance, LINC00261 facilitates the activation of the proximal FOXA2 gene via association with SMAD2/3 (Jiang et al., 2015). A mechanistically similar cis-regulation of GATA6 has been attributed to lncRNA GATA6-AS1 (Yang et al., 2020), while the lncRNA DIGIT has been reported to control GSC in trans, via the formation of BRD3-dependent phase-separated condensates (Daneshvar et al., 2016; Daneshvar et al., 2020). The majority of lncRNAs exhibit highly tissue-specific expression, often more restricted than observed for protein-coding genes (Cabili et al., 2011). Signaling pathways, including TGF-β, WNT and the JUN/JNK/AP-1 pathway, represent critical cascades necessary for endoderm formation, inducing the expression of endodermal factors such as SOX17, GATA6 and C-X-C chemokine receptor 4 (CXCR4) (Li et al., 2019; Chia et al., 2019; Fisher et al., 2017). SOX17 is a member of the SOX-F group of transcription factors and its expression is necessary for the specification of definitive endoderm in vitro (Séguin et al., 2008) and in vivo (Kanai-Azuma et al., 2002). Despite being an essential and well-studied gene, much remains to be understood about the regulatory elements and nuclear organization of the larger SOX17 domain and how it functions in early endoderm development.

Results

Discovery of an unannotated non-coding transcript within the SOX17 topological domain

So far, SOX17 is the only annotated gene located within the 336 kb SOX17 loop-domain insulated by strong CTCF-boundaries (Figure 1A, top). However, upon closer inspection of multiple epigenetic modifications in pluripotent stem cells (hESCs and hiPSCs) and early definitive endoderm, we observed a potential unannotated gene locus. In particular, the combination of histone H3 lysine 4 trimethylation (H3K4me3) and histone H3 lysine 36 trimethylation (H3K36me3) in ESC-derived endoderm suggested the presence of an RNA Polymerase-II-driven transcript (Guttman et al., 2009; Bilodeau et al., 2009). Further supporting this, matched RNA sequencing data showed a 22 kb long transcribed region approximately 230 kb upstream of SOX17 (Figure 1A, bottom). These results, combined with a strong UCSC PhyloCSF sequence conservation, point to an intergenic lncRNA (lincRNA) that we subsequently termed T-REX17 (Transcript Regulating Endoderm and activated by soX17) (Figure 1A and B). Although the sequence conservation with mouse is only modest (Figure 1—figure supplement 1), we detect the presence of a distal SOX17 transcript in a number of vertebrates based on stage- and tissue-matched embryonic data (Figure 1—figure supplement 1A).

Figure 1 (with 1 supplement)
Identification of T-REX17 at the human SOX17 locus.

(A) Normalized capture Hi-C (cHi-C) contact map of the human SOX17 locus in endoderm cells (top panel) and chromatin immunoprecipitation sequencing (ChIP-seq) tracks of CTCF, H3K36me3 and H3K4me3 as well as whole genome bisulfite sequencing (WGBS) (Supplementary file 1) and RNA-seq profiles in PSCs and EN (bottom panel). T-REX17 locus (hg19, chr8:55117776–55140806) is highlighted in grey. (B) Zoomed-in view of the SOX17 distal regulatory element in EN cells comprising Assay for Transposase-Accessible Chromatin with high-throughput sequencing (ATAC-seq) profile and H3K27ac, FOXA2, GATA4 and GATA6 ChIP-seq (Supplementary file 1) profiles. ChromHMM (Ernst and Kellis, 2015; The ENCODE Project Consortium, 2012) 25-state profile is shown below the phylo100 (Murphy et al., 2001; Pollard et al., 2010) UCSC conservation track. Dashed lines indicate the two distinct regulatory elements, characterized by enriched transcription factor occupancy (eSOX17 and pT-REX17). (C) Firefly luciferase assay from either eSOX17.1 (hg19, chr8:55136923–55137557), eSOX17.2 (hg19, chr8:55137558–55138192) or both together at days 2, 3, or 5 of EN differentiation. Values are calculated as luciferase activity ratio (LAR) between firefly and renilla signal, normalized to empty vector background and day 0 baseline signal. Bars indicate mean values, error bars show standard deviation (SD) across three independent experiments. Individual data points are displayed. Raw measurements are reported in Supplementary file 1. (D) 4Cseq of PSC (black) and EN (blue) at the SOX17-locus. Normalized interaction-scores displayed as arcs and histogram-profiles utilizing the SOX17 promoter as viewpoint (VP). (E) 4Cseq interactions as a zoomed-in view at the SOX17 regulatory element and corresponding quantification. In the zoomed-in tracks, the line represents the median and the shaded areas depict 95% CI; in the quantification, the central line represents the median and error bars show SD across three independent experiments.

We began to explore the locus in more detail by investigating the overlapping distal regulatory element that appears to be a putative SOX17 enhancer (Tsankov et al., 2015). We found two distinct sites with notable transcription factor (TF) occupancy within a region of open chromatin specifically in definitive endoderm (Figure 1B). Although both sites show enriched UCSC PhyloCSF sequence conservation, they are also characterized by a distinguishable promoter and enhancer signature (ChromHMM state 22 and ChromHMM state 13, respectively) (Figure 1B; Ernst and Kellis, 2012; Ernst and Kellis, 2017).

We next assessed the activity of the putative promoter region of T-REX17 (pT-REX17) in a luciferase assay and found it to be endoderm-specific (Figure 1—figure supplement 1B). We similarly tested the activity of the putative enhancer, which was further separated into two parts based on its TF occupancy profile (eSOX17.1 and eSOX17.2) (Figure 1B and C). The entire region but also eSOX17.2 alone showed strong enhancer activity during endoderm differentiation (Figure 1C, Figure 1—figure supplement 1C).
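The normalization described in the Figure 1C legend can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the function names, the order of the two normalization steps (empty-vector background first, then day 0 baseline) and all readings are assumptions made for the example.

```python
from statistics import mean, stdev


def lar(firefly, renilla):
    """Luciferase activity ratio: firefly signal over renilla signal."""
    return firefly / renilla


def normalized_lar(construct, empty, construct_d0, empty_d0):
    """Normalize a construct's LAR to empty vector and to the day 0 baseline.

    Each argument is a (firefly, renilla) pair. The two-step order used
    here is an assumption; the legend only names the two references.
    """
    fold_over_empty = lar(*construct) / lar(*empty)
    baseline = lar(*construct_d0) / lar(*empty_d0)
    return fold_over_empty / baseline


# Hypothetical readings from three independent experiments (not real data).
experiments = [
    {"construct": (8000.0, 1000.0), "empty": (1000.0, 1000.0),
     "construct_d0": (1200.0, 1000.0), "empty_d0": (1000.0, 1000.0)},
    {"construct": (9000.0, 1100.0), "empty": (1100.0, 1000.0),
     "construct_d0": (1000.0, 1000.0), "empty_d0": (1000.0, 1000.0)},
    {"construct": (7000.0, 900.0), "empty": (900.0, 1000.0),
     "construct_d0": (1100.0, 1000.0), "empty_d0": (1000.0, 1000.0)},
]

values = [normalized_lar(**e) for e in experiments]
print(f"mean = {mean(values):.2f}, SD = {stdev(values):.2f}")
```

Reporting the mean with the sample standard deviation across the three experiments mirrors how the bars and error bars in the legend are described.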

We then further evaluated eSOX17.2 function using Cas9-induced homozygous deletions and assessed the effect of the mutation during directed endoderm differentiation (Figure 1—figure supplement 1D and E). Interestingly, we observed a delayed activation of SOX17 and overall reduced expression of the transmembrane C-X-C chemokine receptor 4 (CXCR4) (Figure 1—figure supplement 1F and G). To investigate the physical interactions at the locus, we performed Circularized Chromosome Conformation Capture sequencing (4C-seq) on pluripotent cells and early endoderm and found an enriched interaction between the SOX17 promoter and its distal enhancer (eSOX17) (Figure 1D and E; Figure 1—figure supplement 1H). Therefore, we can conclude that the topologically isolated domain of SOX17 encompasses a distal, transcribed region driven by a promoter in close proximity but otherwise independent from a functional enhancer that interacts with the SOX17 gene.

T-REX17 is a definitive endoderm-specific lncRNA

We next investigated the expression of the non-coding transcript during endoderm differentiation with time-resolved qRT-PCR and found that T-REX17 expression follows SOX17 kinetics but with an approximately 24-hour delay (Figure 2A). To explore possible regulatory links between SOX17 and T-REX17, we compared their expression across a wide range of cell and tissue types (n=44) (Figure 2B). T-REX17 appears tightly restricted to early human definitive endoderm and, importantly, uncoupled from the much broader expression of SOX17 in many other endoderm-derived tissues (Mathias et al., 2015, Thul et al., 2017; Figure 2B; Figure 2—figure supplement 1A–C). Moreover, we utilized RNA-seq data from the three pluripotent stem cell-derived germ layers to show that T-REX17 is not expressed during mesoderm and ectoderm formation (Figure 2—figure supplement 1D). Single-cell RNA-seq (scRNA-seq) data from the early human gastrulating embryo (Tyser et al., 2021) confirm the tissue specificity of T-REX17 in vivo (Figure 2—figure supplement 1E).

Figure 2 (with 1 supplement)
T-REX17 cellular and molecular characterization.

(A) Time-resolved qRT-PCR profiling SOX17 (green) and T-REX17 (orange) transcript levels during endoderm differentiation (normalized to the housekeeping gene 18S). Symbols indicate the mean and error bars indicate SD across three independent experiments. (B) Lineage tree heatmap showing SOX17 (green) and T-REX17 (orange) expression across EN derived embryonic and adult tissues as measured by RNA-seq, extracted from a curated data set of the Roadmap Epigenome Project (Roadmap Epigenomics Consortium et al., 2015; Supplementary file 1). TPM, transcripts per million. aPS, anterior primitive streak; AFE, anterior foregut endoderm; PFE, posterior foregut endoderm; MHG, mid-hindgut; PPT, Peyer’s patch tissue; S, sigmoid; T, transverse. (C) smRNA-FISH of T-REX17 in PSCs (left) and EN cells (right) counter-stained with Hoechst. Red arrowheads indicate two brighter and bigger foci present in each cell, potentially representing sites of nascent transcription. Scale bars, 10 µm. (D) Frequencies of T-REX17 smRNA-FISH foci in the nuclear (grey) or the cytoplasmic (white) compartments. n=79, number of analyzed cells. Lines of the violin plot indicate interquartile range around the median value. In the stacked barplot, error bars indicate SD around the mean value. (E) Barplots showing coding potential scores of randomly sampled lncRNA ORFs (n=257,992) (grey) versus T-REX17 ORFs (n=40) (orange). Scores are shown on the x-axis while ORF-density is plotted on the y-axis. The area under both distributions is equal, and SOX17 ORFs are shown as a coding-gene control. n, number of analyzed ORFs. (F) Schematic of T-REX17 isoform structure derived from MinION-seq reads of endoderm cDNA. Exons are shown in orange while the poly(A) is shown in white. The arrow indicates the transcriptional start site (TSS). Pie chart shows isoform reads (Ex1+2 black n=16, Ex1+3 grey n=11) and ‘sloppy spliced’ (white n=89) transcript distribution as measured by MinION-seq (Supplementary file 1).

We also investigated T-REX17 localization by single-molecule RNA fluorescence in situ hybridization (smRNA-FISH) and found it highly enriched at foci within the nuclear compartment, a characteristic feature of non-coding transcripts (median of 40 foci/cell, Figure 2C and D). Nuclear localization and association with chromatin were further confirmed by cell-fractionation experiments (Figure 2—figure supplement 1F). Next, we wanted to more closely inspect the coding potential of T-REX17 and used PhyloCSF to show that 37 of 40 predicted open reading frames (ORFs) would likely result in no functional protein (Figure 2E). This is comparable to other short ORFs (sORFs) in the human lncRNA catalog (Figure 2E; Lin et al., 2011). Notably, even the coding potential of the remaining three sORFs is about two orders of magnitude lower than for the SOX17 coding sequence (Figure 2E).

To explore the structure and splicing variants of T-REX17, we used long-read Nanopore sequencing of definitive endoderm cDNA. The two most prevalent isoforms account for 23.3% of the split-reads, while 76.7% appear inconsistently spliced, a feature which is frequently observed in lncRNAs (Mukherjee et al., 2017; Lagarde et al., 2017; Schlackow et al., 2017; Struhl, 2007; Beck et al., 2016) (termed ‘sloppy’ splicing, Figure 2F; Figure 2—figure supplement 1G). Additionally, we used 5’ and 3’ rapid amplification of cDNA ends (RACE) to determine the exact transcriptional start and end sites as well as the corresponding polyadenylation signal (Figure 2F; Figure 2—figure supplement 1H).
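The percentages above follow directly from the split-read counts given in the Figure 2F legend (16, 11 and 89 reads). The short check below reproduces that arithmetic; the dictionary keys and the helper name are illustrative labels only.

```python
# Split-read counts as stated in the Figure 2F legend.
read_counts = {"Ex1+2": 16, "Ex1+3": 11, "sloppy": 89}


def percentages(counts):
    """Convert raw read counts into percentages of the total."""
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


pct = percentages(read_counts)
defined_isoforms = pct["Ex1+2"] + pct["Ex1+3"]

print(f"two main isoforms: {defined_isoforms:.1f}%")
print(f"inconsistently spliced: {pct['sloppy']:.1f}%")
```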

Taken together, our results show that T-REX17 is specifically and transiently expressed in early definitive endoderm and creates a ‘sloppy spliced’ nuclear transcript.

T-REX17 does not regulate SOX17

To investigate the functional role of T-REX17 during endoderm formation, we first generated a cell line carrying a constitutive transcriptional repressor (dCas9-KRAB-MeCP2, Yeo et al., 2018). We then derived two cell lines from it, one harboring a control sgRNA (sgCtrl) designed by a randomization approach of human TSS regions (Gilbert et al., 2014) and the other specifically targeting the T-REX17 promoter (sgT-REX17; see Materials and methods) (Figure 3A). Immunofluorescent staining for dCas9-KRAB-MeCP2 demonstrated its homogeneous expression in the parental cell line (Figure 3—figure supplement 1A). The dCas9-mediated silencing resulted in a strong repression of T-REX17 RNA compared to the control, which we further validated by smRNA-FISH (Figure 3B and C; Figure 3—figure supplement 1B). We furthermore detected H3K9me3 enrichment around the T-REX17 promoter in sgT-REX17 cells, with a certain degree of spreading toward the enhancer eSOX17 but no apparent consequence on SOX17 regulation (Figure 3—figure supplement 1C). To assess possible effects of the T-REX17 depletion on SOX17, we performed Capture Hi-C (cHi-C) in both cell lines, but could not observe any significant interaction differences (Log2FC = 0.02 p=0.049) within the SOX17-loop domain in definitive endoderm (Figure 3—figure supplement 1D). Nevertheless, virtual 4C analysis revealed a marginal decrease in the SOX17 enhancer-promoter interaction in the absence of T-REX17 (Figure 3D). Despite this limited topological difference, loss of T-REX17 does not appear to affect SOX17 transcriptional activation and expression levels, indicating preserved enhancer functionality (Figure 3D and E). We also confirmed that unrelated genes present in neighboring domains were unaffected by the perturbation (Figure 3—figure supplement 1E).

Figure 3 (with 2 supplements)
T-REX17 regulation at the SOX17 topological domain.

(A) Schematic of T-REX17 locus regulation in the absence (top) or presence (bottom) of a targeting dCas9-KRAB-MeCP2 complex, decorating T-REX17 promoter with an H3K9me3 mark 355 bp upstream of the TSS. (B) Time-resolved qRT-PCR showing the expression of T-REX17 during EN differentiation in the presence or absence of dCas9-KRAB-MeCP2 complex targeting T-REX17 promoter (normalized to the housekeeping gene 18S). Symbols indicate the mean and error bars indicate SD across three independent experiments. Individual data points are displayed. (C) smRNA-FISH of T-REX17 in sgCtrl (left) and sgT-REX17 (right) EN cells counter-stained with Hoechst. Scale bars, 10 µm. For an extended field of view see Figure 3—figure supplement 1B. (D) Virtual 4C analysis from capture Hi-C experiments in sgCtrl and sgT-REX17 EN cells using SOX17 promoter as viewpoint, with 2 kb resolution (upper panel). SOX17 EN ChIP-seq (RPKM) and RNA-seq (CPM) profiles in the two conditions are shown in the tracks (lower panel). eSOX17 and pT-REX17 are highlighted in grey. (E) Time-resolved qRT-PCR showing the expression of SOX17 during EN differentiation in the presence or absence of dCas9-KRAB-MeCP2 complex targeting T-REX17 promoter (normalized to the housekeeping gene 18S). Symbols indicate the mean and error bars indicate SD across three independent experiments. Individual data points are displayed. (F) Heatmap showing SOX17 binding distribution genome-wide in sgCtrl and sgT-REX17 EN. The displayed peaks represent the union of the identified peaks in the two conditions (n=61,153). (G) SOX17 ChIP-seq and RNA-seq tracks at the T-REX17 locus showing SOX17 binding at the SOX17 enhancer (eSOX17) and T-REX17 promoter (pT-REX17). SOX17 binding on pT-REX17 results in T-REX17 activation, if pT-REX17 is not targeted by dCas9-KRAB-MeCP2.

Next, we performed SOX17 Chromatin Immunoprecipitation sequencing (ChIP-seq) and show that SOX17 occupancy at the SOX17 locus (including at its induced heterochromatic distal enhancer (eSOX17)) as well as genome-wide is largely unaffected by the loss of T-REX17 (Figure 3D and F; Figure 3—figure supplement 1F). Interestingly, we found SOX17 enrichment at the T-REX17 promoter (pT-REX17), potentially contributing to its activation and consistent with the timing relative to SOX17 (Figure 3D and G; Figure 2A). To further explore this relationship, we generated heterozygous (SOX17WT/∆) and homozygous (SOX17∆/∆) SOX17 knock-out cell lines (Figure 3—figure supplement 1G–I). Notably, homozygous knock-out cells fail to induce the expression of the endoderm master regulator GATA4, and show no activation of T-REX17 (Figure 3—figure supplement 1J).

In order to distinguish between the function of T-REX17 active transcription and its actual transcript (Allou and Balzano, 2021; Daneshvar et al., 2016), we generated an additional cell line by introducing a strong transcriptional termination signal downstream of an mRuby cassette into the first exon of T-REX17, hereafter T-REX17p(A)/p(A) (Figure 3—figure supplement 1K–M). qRT-PCR demonstrated that the expression of T-REX17 is abolished in T-REX17p(A)/p(A) EN cells, while the mRuby cassette is actively transcribed, indicating ongoing transcription at the locus in an endoderm-specific manner (Figure 3—figure supplement 1N). In line with our depletion experiments, SOX17 expression levels are not affected in T-REX17p(A)/p(A) EN cells (Figure 3—figure supplement 1N).

These results demonstrate that T-REX17 induction is dependent on SOX17, whereas the T-REX17 transcript and the act of transcription are dispensable for SOX17 activation as well as its genome-wide localization.

T-REX17 interacts with HNRNPU

To explore how T-REX17 is involved in endoderm regulation, we investigated whether it was associated with RNA binding proteins, a common way lncRNAs exert their functions (Hudson et al., 2014; Xue et al., 2016; Duszczyk et al., 2011; Brown et al., 2014; Chillón and Pyle, 2016). To this end, we performed RNA-pulldown followed by mass spectrometry (Figure 3—figure supplement 2A–B). Among the putative T-REX17 interactors, we identified several heterogeneous nuclear ribonucleoprotein (hnRNP) family members, including HNRNPU (Figure 3—figure supplement 2C). HNRNPU was previously reported to interact with lncRNAs to regulate various functions during development including nuclear matrix organization (Hacisuleyman et al., 2014; Alvarez-Dominguez et al., 2017), X chromosome inactivation (Hasegawa et al., 2010), RNA splicing (Xiao et al., 2012; Huelga et al., 2012), and epigenetic control of gene expression (Khyzha et al., 2019; Song et al., 2020; Puvvula et al., 2014). To validate the HNRNPU-T-REX17 interaction, we performed HNRNPU RNA immunoprecipitation (RIP) (Figure 3—figure supplement 2D and E) and found T-REX17 to be enriched to levels comparable to known RNA interactors such as XIST or NEAT1 (Figure 3—figure supplement 2F).

Although more work is required, our preliminary analysis identified known lncRNA-interacting ribonucleoproteins that may help resolve the molecular function of T-REX17.

T-REX17 is required for the differentiation toward definitive endoderm

To investigate the cellular role of T-REX17, we performed immunofluorescent staining and fluorescence-activated cell sorting (FACS) for CXCR4 in control and T-REX17-depleted cells. The latter showed a substantial reduction in the CXCR4+ cell population during differentiation, suggesting hampered differentiation potential toward endoderm (Figure 4A). However, consistent with the transcriptional data, SOX17 protein levels were not affected (Figure 4A). Both phenotypes were recapitulated in the T-REX17p(A)/p(A) EN cells (Figure 4—figure supplement 1A). As expected, based on its highly restricted expression, differentiation toward the other two germ layers (mesoderm and ectoderm) was not affected (Figure 2—figure supplement 1D; Figure 4—figure supplement 1B,D).

Figure 4 (with 2 supplements)
Endodermal defects in cells depleted of T-REX17.

(A) Immunofluorescent (IF) staining of SOX17 and CXCR4 in EN cells expressing either sgCtrl or sgT-REX17 counter-stained with DAPI (left panel). Line plot showing percentage of FACS-derived CXCR4+ cell population at given time points during endoderm differentiation (right panel). Symbols indicate mean values, while error bars show SD across three independent experiments. Individual data points are displayed. Scale bars, 10 µm. (B) Scatter plot highlighting differentially expressed genes between sgT-REX17 and sgCtrl EN cells. Significantly (Log2FC ≥1, p-value <0.05) upregulated genes (n=590) upon T-REX17 repression are shown in red while significantly (Log2FC ≤ –1, p-value <0.05) downregulated genes (n=584) are shown in blue. Three independent replicates have been performed. The complete lists of TPMs and differentially expressed genes are provided in Supplementary file 2. (C) JNK and pJNK western blots of sgCtrl and sgT-REX17 EN cells (left panel). GAPDH signals are used as loading controls above the corresponding JNK/pJNK signals. Boxplot showing relative pJNK levels during endoderm differentiation. Quantification is depicted as Log2FC of sgT-REX17 over sgCtrl (right panel) and provided in Supplementary file 1. Central line indicates the mean, error bars indicate the SD across two independent experiments. Differentiation time-course blots are shown in Figure 4—figure supplement 2A. (D) IF staining of ECAD and NCAD in EN cells expressing either sgCtrl or sgT-REX17 counter-stained with DAPI. Scale bars, 10 µm. (E) IF staining of VIM in EN cells expressing either sgCtrl or sgT-REX17 counter-stained with DAPI. Scale bars, 5 µm. (F) Bright field images of PP differentiation cultures (upper panel) followed by IF staining for PDX1 (lower panel) of either sgCtrl or sgT-REX17 cells. Scale bars, 10 µm. (G) IF staining quantification of overall (sgCtrl, n=17,657, sgT-REX17, n=5,279 analyzed cells) PDX1+ population percentages (left) or PDX1 mean fluorescence intensity distribution in PDX1+ cells (right). Bar plot error bars indicate SD around the mean value and white dots represent mean values for the individual replicates (N=10). Lines of the violin plot indicate interquartile range around the median value and white dots represent median values for the individual replicates (N=10). List of values for each cell and corresponding statistics are shown in Supplementary file 4. (H) Heatmap showing row-normalized z-scores of PP specific marker genes (Alvarez-Dominguez et al., 2020) in sgCtrl and sgT-REX17 EN cells as measured by RNA-seq at day 9 of differentiation. Columns were ordered by hierarchical clustering (represented as tree above the heatmap). Note the reduced expression of PP master transcription factor PDX1 in sgT-REX17 as compared to sgCtrl. The complete lists of TPMs and differentially expressed genes are provided in Supplementary file 2.

Next, we performed time-resolved RNA-seq in T-REX17-depleted and control cell lines on days 0, 3, and 5 of endoderm differentiation. Principal Component Analysis (PCA) revealed only marginal variance by day 3, while a more substantial transcriptional divergence was observed on day 5 (Figure 4—figure supplement 1D). Differential gene expression analysis identified 584 significantly down- and 590 significantly upregulated genes in T-REX17-depleted cells at day 5 (Figure 4B). In particular, we found pluripotency genes (e.g. POU5F1, NANOG) and endoderm/WNT-related genes (e.g. EOMES, GATA3, CXCR4, FZD5, FZD7, FZD8, DKK1, NOTUM, ROR1, CXXC4, SFRP5) to be significantly up- and downregulated, respectively (Figure 4B; Figure 4—figure supplement 1E). Time-resolved qPCR analysis over 5 days confirmed a lack of activation and expression of key endoderm markers in T-REX17-depleted cells (including CXCR4, GATA3, GATA4, KLF5, CPE, GPR, HHEX, EPSTI1, FOXA3), an aberrant transcriptional signature we also observe in T-REX17p(A)/p(A) EN cells (Martinez Barbera et al., 2000; Grapin-Botton and Constam, 2007; McLean et al., 2007; Séguin et al., 2008; Teo et al., 2011; Aksoy et al., 2014; Dettmer et al., 2020; Figure 4—figure supplement 1F–H). Interestingly, among the significantly upregulated genes in T-REX17-depleted cells, we found an enrichment of target genes (including EGR1, ATF3, PVR, DAB2, NOTCH2, MFHAS1, SPARC) (Briggs et al., 2002; Schummer et al., 2016; Florin et al., 2004; van Dam and Castellazzi, 2001; Hoffmann et al., 2008; Kockel et al., 2001) of the JUN (AP-1) pathway, which has recently been described to act as a barrier for the exit from pluripotency toward endoderm formation (Figure 4B; Figure 4—figure supplement 1E; Li et al., 2019). Phosphorylation levels of the JUN-activating upstream kinase JNK are a strong indicator of JUN pathway activation (Raivich and Behrens, 2006; Muniyappa and Das, 2008; Li et al., 2019), and we indeed observed increased relative amounts of pJNK in T-REX17-depleted cells (Figure 4C; Figure 4—figure supplement 2A). Inhibition of JNK hyperactivity (JNK Inhibitor XVI) from day 3 of definitive endoderm differentiation partially rescued the specification defect in T-REX17-depleted cells (Figure 4—figure supplement 2B and C).
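The significance cutoffs stated in the Figure 4B legend (Log2FC ≥1 or ≤ –1, p-value <0.05) can be expressed as a small classification rule. The sketch below is illustrative only: the function name, the gene labels and every numeric value are hypothetical and do not come from the study's data or pipeline.

```python
def classify_gene(log2fc, p_value, lfc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene as 'up', 'down' or 'ns' (not significant).

    Cutoffs default to those given in the Figure 4B legend.
    """
    if p_value < p_cutoff:
        if log2fc >= lfc_cutoff:
            return "up"
        if log2fc <= -lfc_cutoff:
            return "down"
    return "ns"


# Hypothetical (log2FC, p-value) pairs for placeholder genes.
results = {
    "geneA": (2.3, 0.001),    # strong, significant increase
    "geneB": (-1.8, 0.010),   # strong, significant decrease
    "geneC": (0.4, 0.002),    # significant but below the fold-change cutoff
    "geneD": (-2.1, 0.200),   # large change but not significant
}

labels = {gene: classify_gene(lfc, p) for gene, (lfc, p) in results.items()}
n_up = sum(1 for label in labels.values() if label == "up")
n_down = sum(1 for label in labels.values() if label == "down")
print(labels)
print(f"up: {n_up}, down: {n_down}")
```

Requiring both conditions at once is what separates the red and blue points from the remaining genes in a scatter plot of this kind.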

Furthermore, immunofluorescent staining for ECAD, NCAD, and VIM revealed retention of an epithelial signature in T-REX17 depleted endoderm cells (Figure 4D and E; Figure 4—figure supplement 1E; Figure 4—figure supplement 2D and E). Moreover, VIM-signal distribution within T-REX17-depleted cells was also altered, indicating a potential cellular polarization defect (Figure 4E; Figure 4—figure supplement 2E).

Finally, we evaluated if T-REX17-depleted cells have lost the potential to further differentiate into pancreatic progenitor (PP) cells (Alvarez-Dominguez et al., 2020). Immunofluorescent staining identified a very distinct PDX1+ population in the control cell population after 9 days of directed differentiation, which is notably reduced in T-REX17-depleted cells (Figure 4F and G; Figure 4—figure supplement 2F). In addition, transcriptomic analysis of differentiated control and T-REX17-depleted cells indicates a substantial gene expression difference, including the specific downregulation of pancreatic progenitor marker genes (Alvarez-Dominguez et al., 2020; Figure 4H; Figure 4—figure supplement 2G).
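The Figure 4H legend refers to row-normalized z-scores. As a reminder of what that transformation does, here is a minimal sketch that centers and scales each gene's expression values across samples. It assumes the population standard deviation is used, which the source does not specify, and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def row_zscores(values):
    """Center one gene's values on their mean and scale by their SD."""
    mu = mean(values)
    sd = pstdev(values)  # population SD; an assumption for this sketch
    if sd == 0:
        # A gene with identical values in all samples carries no contrast.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Hypothetical TPM values for one marker gene:
# three control samples followed by three depleted samples.
tpm = [40.0, 44.0, 42.0, 10.0, 12.0, 8.0]
print([round(z, 2) for z in row_zscores(tpm)])
```

Because every row is scaled independently, the heatmap colors show relative differences between conditions for each gene rather than absolute expression levels.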

Our data therefore highlight the importance of T-REX17 for the induction of definitive endoderm, which directly impacts the subsequent differentiation potential.

Discussion

Here, we describe the discovery and characterization of T-REX17 as a functionally essential lncRNA in human definitive endoderm. Most lncRNAs act locally, regulating the chromatin architecture and the expression of neighboring genes in cis (Tan et al., 2017; Wang et al., 2011; Goff et al., 2015; Engreitz et al., 2016), especially when overlapping with enhancer elements. In particular, the fine-tuned expression of several developmental transcription factors has been shown to rely on the activity of lncRNAs present within the same topological domain (Wang et al., 2011; Frank et al., 2019; Messemaker et al., 2018). Interestingly, T-REX17 appears distinct from these and other endoderm-specific lncRNAs (Jiang et al., 2015; Liao et al., 2019; Daneshvar et al., 2016) as it does not appear to regulate the adjacent SOX17 gene. The use of two orthogonal loss-of-function approaches in our work (suppression of T-REX17 activation and early termination) showed that T-REX17 transcription is dispensable for proper SOX17 regulation. It remains to be determined what the targets and regulatory mechanism of T-REX17 are. One may speculate that these could be distant loci unrelated to the SOX17 loop-domain, as we find many distinct T-REX17 puncta in the nuclear compartment of endodermal cells. Typically, local cis-acting lncRNAs mainly show accumulation at the two sites of nascent transcription (Jiang et al., 2015; Lewandowski et al., 2019; Daneshvar et al., 2016; Daneshvar et al., 2020). The observed interaction with the HNRNP complex may link it to various nuclear-related functions needed for endoderm specification. It will be interesting to see how this compares to other endodermal lncRNAs, which mainly exert their functions together with endoderm-specific transcription factors (Jiang et al., 2015; Daneshvar et al., 2020; Chen et al., 2020; Yang et al., 2020).

SOX genes encode fundamental transcription factors that have a variety of functions, including the specification of cell types and tissues during embryonic development. They are evolutionarily conserved and arose through a series of ancient genomic duplication events (Bowles et al., 2000). Interestingly, the presence of one or multiple lncRNAs has been reported at other SOX gene loci, but these lncRNAs, in contrast to T-REX17, appear to be involved in modulating the expression of the associated SOX gene in cis (Tariq et al., 2020; Barter et al., 2017; Amaral et al., 2009; Ahmad et al., 2017). This suggests that lncRNAs near paralogous genes may evolve distinct roles and regulatory mechanisms.

At a functional level, our results show that T-REX17 is essential for definitive endoderm specification and that its loss limits further downstream differentiation, as demonstrated by the pancreatic progenitor differentiation. How the different phenotypic changes associated with the loss of T-REX17 arise, such as an aberrant endodermal transcriptome, EMT-failure, JNK-hyperactivity and lack of pancreatic progeny, remains unclear. Advanced biochemical assays to simultaneously profile RNA-RNA, RNA-DNA, and RNA-protein interactions (Chu et al., 2012; Quinodoz et al., 2018; Engreitz et al., 2015) might help elucidate the mechanism of action by which T-REX17 controls endodermal transition. From a developmental perspective, the transient, highly stage-specific nature of T-REX17 makes it an intriguing regulator compared to most protein-coding genes, including endodermal transcription factors such as SOX17, FOXA2, and GATA4, which are expressed for longer and in a variety of somatic tissues. In this context, it is worth noting that the development of definitive endoderm during human gastrulation in vivo takes place within hours and the gene regulatory network (GRN) governing this transition has to be tightly controlled (Tsankov et al., 2015; Gifford et al., 2013; Chia et al., 2019), which could also involve lncRNAs such as T-REX17.

As such, our study contributes toward a more complete understanding of the multi-layered regulation of human cellular differentiation and connects it to a previously unannotated non-coding RNA.

Materials and methods

Default parameters were used, if not otherwise specified, for all software and pipelines utilized in this study.

Molecular cloning of SOX17 and eSOX17.2 knock-out constructs

For CRISPR/Cas9-mediated targeting of either SOX17 (Addgene plasmid #195494) or eSOX17.2 (Addgene plasmid #195495), we utilized our previously generated system for the simultaneous expression of two small guide RNAs (sgRNAs), 2X_pX458_pSpCas9(BB)–2A-GFP (Addgene plasmid #172221). sgRNA cloning was performed with NEBuilder HiFi DNA Assembly Master Mix (New England Biolabs, E2621S) according to the manufacturer’s instructions, using BbsI-linearized 2X_pX458 as backbone for the first sgRNA and SapI-linearized 2X_pX458 as backbone for the second sgRNA, combined with single-stranded oligonucleotides containing the sgRNA sequences as inserts (1:3 molar ratio; sequences are listed in Supplementary file 3). Bacterial transformation and Sanger sequencing were performed to verify successful cloning.

Molecular cloning of Luciferase reporter constructs

pGL4.27[luc2P/minP/Hygro] (Promega, E8451), containing a minimal CMV-promoter, for enhancer-assays or pGL4.15[luc2P/Hygro] (Promega, E6701), without any promoter, for promoter-assays was first digested using EcoRV (New England Biolabs, R3195S). Next, the full eSOX17 (Addgene plasmid # 195498), eSOX17.1 or eSOX17.2 (Addgene plasmid # 195499) genomic regions for enhancer-assays and the pSOX17 (Addgene plasmid # 195496) or pT-REX17 (Addgene plasmid # 195497) genomic regions for promoter-assays were PCR amplified with primers containing homology overhangs to the plasmid. PCR products were purified and cloned into the linearized plasmid utilizing the NEBuilder HiFi DNA Assembly Master Mix (1:3 molar ratio) according to the manufacturer’s instructions. Bacterial transformation followed by Sanger sequencing verified the successful cloning. Cloning primers are listed in Supplementary file 3.
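The 1:3 vector-to-insert molar ratio used in the assembly reactions can be translated into DNA masses with a short calculation. The following Python sketch assumes an average molar mass of 650 g/mol per base pair of double-stranded DNA, a common approximation; the function names, fragment lengths, and vector mass are hypothetical and are not taken from this study.

```python
# Approximate molar mass of one base pair of double-stranded DNA (g/mol).
# This is a widely used rule of thumb, not a value reported in the study.
BP_MASS_G_PER_MOL = 650.0


def ng_to_pmol(mass_ng: float, length_bp: int) -> float:
    """Convert a dsDNA mass (ng) into an amount (pmol)."""
    return mass_ng * 1000.0 / (length_bp * BP_MASS_G_PER_MOL)


def pmol_to_ng(amount_pmol: float, length_bp: int) -> float:
    """Convert a dsDNA amount (pmol) into a mass (ng)."""
    return amount_pmol * length_bp * BP_MASS_G_PER_MOL / 1000.0


def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   insert_to_vector: float = 3.0) -> float:
    """Mass of insert needed for a given insert:vector molar ratio."""
    vector_pmol = ng_to_pmol(vector_ng, vector_bp)
    return pmol_to_ng(vector_pmol * insert_to_vector, insert_bp)


if __name__ == "__main__":
    # Hypothetical example: 100 ng of a 5,000 bp vector and a 1,000 bp insert.
    print(round(insert_mass_ng(100.0, 5000, 1000), 1))  # about 60 ng
```

Because the molar mass cancels out when both fragments are double-stranded, the result depends only on the length ratio; for single-stranded oligonucleotide inserts a different per-nucleotide mass would have to be assumed.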

Molecular cloning of lentiviral sgRNA constructs

pU6-sgRNA EF1Alpha-puro-T2A-BFP (Gilbert et al., 2014) was digested with BstXI (New England Biolabs, R0113S) and BlpI (New England Biolabs, R0585S) and the linearized plasmid was gel extracted with the QIAquick Gel Extraction Kit (Qiagen, 28704). Subsequently, sgRNA-containing oligonucleotides (sgT-REX17 or sgCtrl; see Supplementary file 3) were cloned into the linearized backbone using NEBuilder HiFi DNA Assembly Master Mix (1:3 molar ratio) according to the manufacturer’s instructions to obtain pU6-sgT-REX17_EF1a-Puro-T2A-BFP (Addgene plasmid #195501) and pU6-sgCtrl_EF1a-Puro-T2A-BFP (Addgene plasmid #195500). Bacterial transformation and Sanger sequencing confirmed the successful cloning. pU6-sgRNA EF1Alpha-puro-T2A-BFP (Gilbert et al., 2014) was a gift from Jonathan Weissman (Addgene plasmid # 60955; http://n2t.net/addgene:60955; RRID:Addgene_60955).

Molecular cloning of SOX17 reporter knock-in constructs

pUC19 plasmid was digested with SmaI (New England Biolabs, R0141S) and the linearized plasmid was gel extracted with the QIAquick Gel Extraction Kit (Qiagen, 28704). Next, SOX17 homology arm genomic regions were PCR amplified with primers containing homology overhangs to the plasmid and to a T2A-H2B-mCitrine-loxP-hPGK-BSD-loxP selection cassette.

The left homology arm overlapped with the end of the SOX17 coding sequence and with the T2A-H2B-mCitrine cassette, which was cloned in frame with the last SOX17 amino acid. PCR products and selection cassette were purified and cloned into the linearized pUC19 to generate pUC19_T2A-H2B-mCitrine_loxP-hPGK-BSD-loxP (Addgene plasmid #195503) utilizing the NEBuilder HiFi DNA Assembly Master Mix according to the manufacturer’s instructions. Bacterial transformation followed by Sanger sequencing verified the successful cloning.

An sgRNA targeting the genomic region of integration (SOX17 C-terminus) was cloned into the BbsI-linearized pX335-U6-Chimeric_BB-CBh-hSpCas9n(D10A) (Cong et al., 2013) plasmid (Addgene plasmid #42335) to generate pX335_U6-Chimeric_BB-CBh-hSpCas9n(D10A)_SOX17_C-term_KI (Addgene plasmid #195502) using NEBuilder HiFi DNA Assembly Master Mix (1:3 molar ratio) according to the manufacturer’s instructions. pX335-U6-Chimeric_BB-CBh-hSpCas9n(D10A) was a gift from Feng Zhang (Addgene plasmid # 42335; http://n2t.net/addgene:42335; RRID:Addgene_42335). Bacterial transformation and Sanger sequencing confirmed the successful cloning. Cloning primers are listed in Supplementary file 3.

Molecular cloning of T-REX17-promoter-KI constructs

pUC19 plasmid was digested with SmaI (New England Biolabs, R0141S) and the linearized plasmid was gel extracted with the QIAquick Gel Extraction Kit (Qiagen, 28704). Next, T-REX17 homology arm genomic regions were PCR amplified with primers containing homology overhangs to the plasmid and to a mRuby-3xFLAG-NLS-3xSV40-poly(A)_loxP-mPGK-PuroR-loxP selection cassette to generate pUC19_mRuby-3xFLAG-NLS-3xSV40-poly(A)_loxP-mPGK-PuroR-loxP (Addgene plasmid #195505).

The left homology arm overlapped with the T-REX17 promoter including 30 bp of T-REX17 Exon 1, and a mRuby-3xFLAG-NLS-3xSV40-poly(A) cassette which was cloned +30 bp after T-REX17-TSS into Exon 1. The right homology arm overlapped with T-REX17 Exon 1–30 bp TSS, and a loxP-mPGK-PuroR-loxP cassette, which was cloned following the mRuby-3xFLAG-NLS-3xSV40-poly(A) cassette, originating from a synthetic oligonucleotide (GenScript Biotech). Both the mRuby-3xFLAG-NLS-3xSV40-poly(A) and the loxP-mPGK-PuroR-loxP cassette also shared homology. All PCR products were purified and cloned into the linearized plasmid utilizing the NEBuilder HiFi DNA Assembly Master Mix according to the manufacturer’s instructions. Bacterial transformation and Sanger sequencing verified the successful cloning.

For Cas9-mediated targeting of the T-REX17 promoter we utilized pSpCas9(BB)–2A-Puro (PX459) V2.0 (Ran et al., 2013), which was a gift from Feng Zhang (Addgene plasmid # 62988; http://n2t.net/addgene:62988; RRID:Addgene_62988). sgRNA cloning was performed with NEBuilder HiFi DNA Assembly Master Mix (New England Biolabs, E2621S) according to the manufacturer’s instructions using BbsI-linearized PX459, combined with single-stranded oligonucleotides containing the sgRNA sequences as inserts (1:3 molar ratio; sequences are listed in Supplementary file 3), to obtain pX459_V2.0_pSpCas9(BB)–2A-Puro_T-REX17_Ex1_KI (Addgene plasmid #195504). Bacterial transformation and Sanger sequencing were performed to verify successful cloning.

hiPS cell culture

ZIP13K2 (Tandon et al., 2018) hiPSCs were maintained in mTeSR1 (Stemcell Technologies, 85850) on pre-coated culture ware (1:100 diluted Matrigel (Corning, 354234) in KnockOut DMEM (Thermo Fisher Scientific, 10829–018)). Clump-based cell splitting was performed by incubating the cells in final 5 mM EDTA pH 8.0 (Thermo Fisher Scientific, 15575–038) in DPBS (Thermo Fisher Scientific, 14190250) for 5 min at 37 °C, 5% CO2. Single-cell splitting was performed by incubating the cells with Accutase (Sigma-Aldrich, A6964) supplemented with 10 µM Y-27632 (Tocris, 1254) for 15 min at 37 °C, 5% CO2. Cell counting was performed using single-cell suspensions diluted 1:1 in 0.4% Trypan Blue staining solution (Thermo Fisher Scientific, 15250061) on the Countess II automated cell counter (Thermo Fisher Scientific). Wash steps were performed by spinning cell suspensions at 300 x g for 5 min at room temperature (RT).

Definitive endoderm (EN) differentiation

To guarantee high reproducibility, constant media quality, and mTeSR1 compatibility, definitive endoderm differentiations were exclusively performed utilizing the STEMdiff Trilineage Endoderm Differentiation media (Stemcell Technologies, 05230). Single-cell suspensions of mTeSR1-maintained ZIP13K2 hiPSCs were seeded into the respective culture formats at the cell number recommended in the manufacturer’s instructions. Media changes using the STEMdiff Trilineage Endoderm Differentiation media were performed on a daily basis according to the manufacturer’s instructions. Cells were then collected at the required timepoints by washing the plate with DPBS before single-cell dissociation was performed with Accutase for 15 min at 37 °C, 5% CO2. Single-cell suspensions of definitive endoderm (EN) differentiated cells were utilized for further downstream analysis (qPCR, western blot, FACS etc.).

Embryoid body (EB) formation followed by ScoreCard Assay

ZIP13K2 hiPSC single-cell suspensions were prepared and counted as previously described (see hiPS cell culture). Next, 1 x 10^3 cells/well of either sgCtrl or sgT-REX17 hiPSCs were seeded on a 96-well ultra-low attachment U-bottom plate (Corning, 7007) in the respective cell culture media.

Random EB differentiation

Cells were seeded in 200 µl/well of hES-media (Final DMEM-F12 (Thermo Fisher Scientific, 11320074), 20% KSR (Thermo Fisher Scientific, 10828028), 1% Penicillin/Streptomycin, 1% NEAA (Thermo Fisher Scientific, 11140050), 0.5% GlutaMAX, HEPES (Thermo Fisher Scientific, 31330038)), supplemented with final 10 µM Y-27632. Single-cell suspensions were spun at 100 x g for 1 min at RT and further cultured for 16 hr at 37 °C, 5% CO2. The following day, 150 µl of media supernatant was carefully replaced with 150 µl of fresh hES-media (without Y-27632). Cells were further cultured for an additional 48 hr at 37 °C, 5% CO2. The same media was replaced every 48 hr until day 9. At day 9, EBs were collected and washed once in DPBS, and RNA was isolated (see RNA isolation and cDNA synthesis).
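Because only 150 µl of the 200 µl culture volume is exchanged, some Y-27632 nominally remains in the well after the first media change. The following Python sketch estimates this carry-over under the simplifying assumptions of ideal mixing and no compound degradation or uptake; the function name is hypothetical and the resulting value is an illustration, not a measurement from this study.

```python
def residual_concentration(initial_um: float, total_ul: float,
                           exchanged_ul: float, fresh_um: float = 0.0) -> float:
    """Nominal concentration (µM) after replacing part of a well volume.

    Assumes ideal mixing; `fresh_um` is the concentration in the fresh media.
    """
    if not 0.0 <= exchanged_ul <= total_ul:
        raise ValueError("exchanged volume must lie between 0 and the total volume")
    kept_ul = total_ul - exchanged_ul
    return (initial_um * kept_ul + fresh_um * exchanged_ul) / total_ul


if __name__ == "__main__":
    # 10 µM Y-27632 in 200 µl, 150 µl replaced with inhibitor-free media.
    print(residual_concentration(10.0, 200.0, 150.0))  # 2.5
```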

Undifferentiated control EBs

Cells were seeded in 200 µl/well of mTeSR1, supplemented with final 10 µM Y-27632. Single-cell suspensions were spun at 100 x g for 1 min at RT and further cultured for 16 hr at 37 °C, 5% CO2. The following day, 150 µl of media supernatant was carefully replaced with 150 µl of fresh mTeSR1 media (without Y-27632). Cells were further cultured for an additional 48 hr at 37 °C, 5% CO2. At day 3, EBs were collected and washed once in DPBS, and RNA was isolated (see RNA isolation and cDNA synthesis).

cDNA conversion and the ScoreCard assay (Thermo Fisher Scientific, A15870) were performed according to the manufacturer’s instructions.

JNK inhibition experiments

For the JNK-inhibition experiments, JNK inhibitor XVI (Selleckchem, S4901) was added to the media at a final concentration of 1 µM from day 3 of EN differentiation onward. The corresponding volume of DMSO was added to the media of the control samples.

Pancreatic progenitor (PP) differentiation

Pancreatic progenitor (PP) differentiation was performed as previously described (Alvarez-Dominguez et al., 2020) with minor changes. Briefly, single-cell suspensions of ZIP13K2 hiPSCs (see hiPS cell culture) were seeded at a density of 5 x 10^5 cells/cm^2 in mTeSR1 supplemented with 10 µM Y-27632. After 24 hr, culture medium was replaced with S1-media (Final 11.6 g/L MCDB131, Sigma Aldrich, M8537-1L; 2 mM D-(+)-Glucose, Sigma Aldrich, G7528-250G; 2.46 g/L NaHCO3, Sigma Aldrich, S5761-500G; 2% FAF-BSA, Proliant Biologicals, 68700–1; 1:50,000 of 100 x ITS-X, Thermo Fisher Scientific, 51500056; 1 x GlutaMAX, Thermo Fisher Scientific, 35050–038; 0.25 mM Vitamin C, Sigma-Aldrich, A4544-100G; 1% Pen-Strep, Thermo Fisher Scientific, 15140122) supplemented with final 100 ng/ml Activin-A (R&D Systems, 338-AC-01M) and 1.4 µg/ml CHIR99021 (Stemgent, 04-0004-10). For the following 2 days, cells were cultured in S1-media supplemented with final 100 ng/ml Activin-A. Next, cells were cultured in S2-media (Final 11.6 g/L MCDB131; 2 mM D-(+)-Glucose; 1.23 g/L NaHCO3; 2% FAF-BSA; 1:50,000 of 100 x ITS-X; 1 x GlutaMAX; 0.25 mM Vitamin C; 1% Pen-Strep) supplemented with final 50 ng/ml KGF (Peprotech, 100-19-1MG) for 48 hr. After these 48 hr, cells were cultured in S3-media (Final 11.6 g/L MCDB131; 2 mM D-(+)-Glucose; 1.23 g/L NaHCO3; 2% FAF-BSA; 1:200 of 100 x ITS-X; 1 x GlutaMAX; 0.25 mM Vitamin C; 1% Pen-Strep) supplemented with final 50 ng/ml KGF (Peprotech, 100-19-1MG), 200 nM LDN193189 (Sigma Aldrich, SML0559-5MG), 0.25 µM Sant-1 (Sigma Aldrich, S4572-5MG), 2 µM Retinoic Acid (Sigma Aldrich, R2625-50MG), 500 nM PDBU (Merck Millipore, 524390–5 MG) and 10 µM Y-27632 for 24 hr. Finally, cells were cultured in the previous S3-media composition without LDN193189 for 24 hr. Between daily media changes, cells were washed once with 1 x DPBS. Throughout the entire differentiation process, cells were cultured at 37 °C, 5% CO2 in 100 µl media/cm^2.

Luciferase reporter assays

ZIP13K2 hiPSCs (see hiPS cell culture) were treated with Accutase containing 10 µM Y-27632 for 15 min at 37 °C, 5% CO2 to obtain a single-cell suspension. Cell suspensions were counted and seeded at a density of 10^5 cells/cm^2 in mTeSR1 supplemented with final 10 µM Y-27632. Sixteen hours later, cells were co-transfected with 15 fmol pRL-TK (Promega, E2241) and 150 fmol of either pGL4.27[luc2P/minP/Hygro] empty vector or pGL4.27[luc2P/minP/Hygro] containing either eSOX17, eSOX17.1 or eSOX17.2 utilizing Lipofectamine Stem Transfection Reagent (Thermo Fisher Scientific, STEM00003) following the manufacturer’s instructions. Transfection was performed in mTeSR1 containing 10 µM Y-27632 for 16 hr at 37 °C, 5% CO2. Subsequently, endoderm differentiation was initiated (day 0) using the STEMdiff Trilineage Endoderm Differentiation media. At days 0, 2, 3, or 5 of endoderm differentiation, cells were lysed and Renilla as well as Firefly Luciferase activities were measured using the Dual-Glo Luciferase Assay System (Promega, E2920) according to the manufacturer’s instructions. Raw values (Supplementary file 1) were measured on the GloMax-Multi Detection System (Promega).
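Dual-luciferase raw values are commonly analyzed by dividing the Firefly signal by the co-transfected Renilla signal and expressing the result relative to the empty-vector control. The following Python sketch shows this conventional normalization; it is an assumption about a typical analysis, not a description of the exact procedure used in this study, and the function name and example readings are hypothetical.

```python
def relative_luciferase_activity(firefly: float, renilla: float,
                                 empty_firefly: float, empty_renilla: float) -> float:
    """Firefly/Renilla ratio of a construct relative to the empty vector."""
    if min(renilla, empty_renilla, empty_firefly) <= 0:
        raise ValueError("control and Renilla readings must be positive")
    sample_ratio = firefly / renilla               # corrects for transfection efficiency
    control_ratio = empty_firefly / empty_renilla  # baseline of the minimal promoter
    return sample_ratio / control_ratio


if __name__ == "__main__":
    # Hypothetical raw luminescence readings (arbitrary units).
    print(relative_luciferase_activity(2000.0, 100.0, 500.0, 100.0))  # 4.0
```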

Generation of SOX17 and eSOX17.2 CRISPR/Cas9 knock-out hiPSC lines

ZIP13K2 hiPSCs (see hiPS cell culture) were treated with Accutase containing final 10 µM Y-27632 for 15 min at 37 °C, 5% CO2 to obtain a single-cell suspension. Cell suspensions were counted and seeded at a density of 1–2 x 10^5 cells/cm^2 in mTeSR1 supplemented with final 10 µM Y-27632. Cells were pre-cultured for 16 hr at 37 °C, 5% CO2 prior to transfection.

Cells were then transfected with 6 µg/6-well of P2X458 using Lipofectamine Stem Transfection Reagent according to the manufacturer’s instructions. GFP+ cells were FACS-sorted 16–24 hr post-transfection with the FACSAria II or the FACSAria Fusion (Becton Dickinson) and seeded at a density of 0.5–1 x 10^3 cells/cm^2 in mTeSR1 supplemented with 10 µM Y-27632 to derive isogenic clones. Single-cell derived colonies were manually picked and split: one half was maintained in a well of a 96-well plate and the other half was used for genotyping with the Phire Animal Tissue Direct PCR Kit (Thermo Fisher Scientific, F140WH) following the manufacturer’s instructions. Genotyping primers are listed in Supplementary file 3. Edited alleles were verified by cloning PCR products into the pJET1.2 backbone (Thermo Fisher Scientific, K1232) according to the manufacturer’s instructions, followed by bacterial transformation and Sanger sequencing.

Generation of SOX17-reporter hiPS cell line

ZIP13K2 hiPSCs (see hiPS cell culture) were treated with Accutase containing final 10 µM Y-27632 for 15 min at 37 °C, 5% CO2 to obtain a single-cell suspension. Cell suspensions were counted and seeded at a density of 1–2 x 10^5 cells/cm^2 in mTeSR1 supplemented with final 10 µM Y-27632. Cells were pre-cultured for 16 hr at 37 °C, 5% CO2 prior to transfection.

The following day, cells were transfected using Lipofectamine Stem Transfection Reagent in fresh mTeSR1 supplemented with final 10 µM Y-27632 for 24 hr at 37 °C, 5% CO2. Transfection mixtures contained 3 µg of T2A-H2B-mCitrine-loxP-hPGK-BSD-loxP donor plasmid and 3 µg of PX335-SOX17 (1:1 molar ratio) per 6-well.

Two days post transfection, cells were selected with final 2 µg/ml Blasticidin-S-HCl (Thermo Fisher Scientific, A1113903) for 14 days at 37 °C, 5% CO2. For the derivation of isogenic reporter cell lines, single-cell derived colonies were manually picked and expanded. Differentiation into EN followed by FACS analysis was used to confirm clones that were activating the reporter.

Generation of T-REX17-promoter-KI hiPS cell line

ZIP13K2 SOX17-reporter (see Generation of SOX17-reporter hiPS cell line) hiPSCs (see hiPS cell culture) were treated with Accutase containing final 10 µM Y-27632 for 15 min at 37 °C, 5% CO2 to obtain a single-cell suspension. Cell suspensions were counted and seeded at a density of 1–2 x 10^5 cells/cm^2 in mTeSR1 supplemented with final 10 µM Y-27632. Cells were pre-cultured for 16 hr at 37 °C, 5% CO2 prior to transfection.

The following day, cells were transfected using Lipofectamine Stem Transfection Reagent in fresh mTeSR1 supplemented with final 10 µM Y-27632 for 24 hr at 37 °C, 5% CO2. Transfection mixtures contained 3 µg of mRuby-3xFLAG-NLS-3xSV40-poly(A)-loxP-mPGK-PuroR-loxP donor plasmid and 3 µg of PX458-T-REX17-promoter (1:1 molar ratio) per 6-well.

Two days post transfection, cells were selected with final 2 µg/ml Puromycin-Dihydrochloride (Thermo Fisher Scientific, A1113803) for 14 days at 37 °C, 5% CO2. For the derivation of isogenic reporter cell lines, single-cell derived colonies were manually picked and expanded. Differentiation into EN followed by qRT-PCR analysis was used to confirm clones that were activating the reporter.

Generation of dCas9-KRAB-MeCP2 hiPS cell line

ZIP13K2 hiPSCs (see hiPS cell culture) were treated with Accutase containing final 10 µM Y-27632 for 15 min at 37 °C, 5% CO2 to obtain a single-cell suspension. Cell suspensions were counted and seeded at a density of 1–2 x 10^5 cells/cm^2 in mTeSR1 supplemented with final 10 µM Y-27632. Cells were pre-cultured for 16 hr at 37 °C, 5% CO2 prior to transfection.

The following day, cells were transfected using Lipofectamine Stem Transfection Reagent in fresh mTeSR1 supplemented with final 10 µM Y-27632 for 24 hr at 37 °C, 5% CO2. Transfection mixtures contained 2 µg of Super PiggyBac transposase expression vector (SBI, PB210PA-1) and 4 µg dCas9-KRAB-MeCP2 (Yeo et al., 2018) (1:1 molar ratio) per 6-well. dCas9-KRAB-MeCP2 was a gift from Alejandro Chavez & George Church (Addgene plasmid # 110821; http://n2t.net/addgene:110821; RRID:Addgene_110821).

Two days post transfection, cells were selected with final 2 µg/ml Blasticidin-S-HCl (Thermo Fisher Scientific, A1113903) for 14 days at 37 °C, 5% CO2. For the derivation of isogenic CRISPRi cell lines, single-cell derived colonies were manually picked and expanded. IF stainings for Cas9 confirmed homogeneous dCas9-KRAB-MeCP2 expression in the selected clones (see Immunofluorescence staining for the detailed experimental procedure).

Production of lentiviral particles carrying sgRNAs

Lentiviral particles of specific sgRNA constructs were produced in HEK-293T cells by co-transfection of 1:1:1 molar ratios of pCMV-VSV-G plasmid (Addgene, #8454; Stewart et al., 2003; 3.5 µg) and psPAX2 plasmid (Addgene, #12260; 7 µg) in combination with sgRNA-specific variants of pU6-sgRNA EF1Alpha-puro-T2A-BFP (Gilbert et al., 2014) plasmid (Addgene, #60955; 14 µg). pCMV-VSV-G was a gift from Bob Weinberg (Addgene plasmid # 8454; http://n2t.net/addgene:8454; RRID: Addgene_8454). Prior to transfection, HEK-293T cells were grown on a 10 cm dish up to 70–80% confluency in HEK-media (KO-DMEM (Thermo Fisher Scientific, 10829018), 10% fetal bovine serum (FBS, PAN Biotech, P30-2602), 1 x GlutaMAX Supplement, 100 U/ml Penicillin-Streptomycin (Thermo Fisher Scientific, 15140122) and final 1 x, 5.5 µM β-Mercaptoethanol (Thermo Fisher Scientific, 21985023)). For each sgRNA construct, plasmid DNA mixtures and 50 µl of LipoD293 transfection reagent (SignaGen Laboratories, SL100668) were mixed in 250 µl KO-DMEM at RT. After pipette mixing, transfection particles were incubated at RT for 15 min. Each sgRNA-specific mixture was added drop-wise onto HEK-293T cultures in 10 ml HEK-media and incubated for 16 hr at 37 °C, 5% CO2. Cell culture media was replaced with 10 ml fresh HEK-media the next day, and culture supernatants (S/N) of the two subsequent days were then filtered (0.22 µm), collected and stored at 4 °C. After the second harvesting day, S/N were supplemented with 1 x PEG-it virus precipitation solution (SBI, LV810A-1) for 24 hr at 4 °C. Viral particles were finally precipitated by centrifugation at 3234 x g, 4 °C. Viral precipitates were resuspended in 200 µl mTeSR1 and either frozen at –80 °C or immediately used for lentiviral transduction of CRISPRi hiPSCs. The entire lentivirus preparation and storage were carried out under S2-safety conditions and precautions.

Lentiviral transduction of dCas9-KRAB-MeCP2 hiPSCs

Lentiviral particles were either thawed on ice (if frozen) or used fresh on the day of production. For hiPS cell transduction, clump-based hiPSC splitting was performed (see hiPS cell culture for the detailed experimental procedure) and dissociated clumps were supplemented with 10 µM Y-27632, 10 µg/ml Polybrene infection reagent (MerckMillipore, TR-1003-G) and 100 µl of lentiviral particle preparation. Cells were then plated and cultured for 16 hr at 37 °C, 5% CO2. The following day, cells were washed 10 times with DPBS and given fresh mTeSR1 supplemented with 10 µM Y-27632 for 24 hr at 37 °C, 5% CO2.

Successfully infected cells were then selected with 2 µg/ml Puromycin Dihydrochloride (Thermo Fisher Scientific, A1113803) for 14 days at 37 °C, 5% CO2. dCas9-KRAB-MeCP2 cell lines expressing sgRNAs (sgT-REX17 and sgCtrl) were grown as bulk cultures, and Tag-BFP was used as a proxy for sgRNA expression prior to differentiation into the respective endodermal derivative.

RNA isolation and cDNA synthesis

For RNA extraction, cells were lysed in 500 μl Qiazol from the miRNeasy Mini Kit (Qiagen, 217004), followed by vortexing. RNA was then extracted using the miRNeasy Mini Kit (Qiagen, 217004) and RNA concentration was measured. cDNA synthesis was performed using 1 μg total RNA for each sample using the RevertAid First Strand cDNA Synthesis Kit (Thermo Fisher Scientific, K1622), following the manufacturer’s instructions. Random hexamers were used as primers for first strand cDNA synthesis.

Quantitative PCR (qPCR)

Quantitative PCR (qPCR) was carried out on a StepOnePlus 96-well or a QuantStudio 7 Flex 384-well Real-Time PCR System (Thermo Fisher Scientific) loading 20–25 ng cDNA/well and using TaqMan Fast Advanced Master-Mix (Thermo Fisher Scientific, 4444557) with TaqMan validated probes (Thermo Fisher Scientific; Supplementary file 3) following the manufacturer’s instructions.

5’/3’ RACE PCR experiments

5’/3’ rapid amplification of cDNA ends (RACE) PCR reactions were performed utilizing the 5′/3′ RACE Kit, second generation (Sigma-Aldrich, 3353621001) according to the manufacturer’s instructions. Corresponding gene-specific (SP) primers are listed in Supplementary file 3.

RACE-PCR products were cloned into the pJET1.2 backbone followed by bacterial transformation and Sanger sequencing.

Extraction of polyA RNA for Nanopore sequencing

Isolation of poly(A)-enriched mRNA was performed using the Dynabeads mRNA DIRECT purification kit (Thermo Fisher Scientific, 61011) according to the manufacturer’s instructions with minor modifications. ZIP13K2-derived EN cells were washed once with DPBS and dissociated with Accutase for 15 min at 37 °C, 5% CO2. The enzymatic reaction was quenched by adding mTeSR1 and cells were counted using the Countess II automated cell counter. A total of 4 x 10^6 viable cells were centrifuged for 5 min at 4 °C, 300 x g. The supernatant was discarded and cells were washed with 1 ml of ice-cold DPBS and centrifuged as described above. The supernatant was completely removed and the cell pellet was carefully resuspended in 1.25 ml Lysis/Binding buffer. In order to reduce viscosity resulting from released genomic DNA, the samples were passed through a 21 gauge needle (Becton Dickinson, 304432) five times and subsequently added to the pre-washed Oligo(dT)25 beads. Hybridization of the beads/mRNA complex was carried out for 10 min on a Mini Rotator (Grant-bio) and vials were placed on a DynaMag2 magnet (Thermo Fisher Scientific, 12321D) until the beads were fully immobilized. The DNA-containing supernatant was removed and the beads were washed twice with 2 ml of Buffer A, followed by a second wash step with two times 1 ml of Buffer B. Purified RNA was eluted with 10 µl of pre-heated Elution Buffer (10 mM Tris-HCl pH 7.5) for 5 min at 80 °C and quantified with a Qubit Fluorometer (Thermo Fisher Scientific) using the RNA HS Assay Kit (Thermo Fisher Scientific, Q32852). Eluted RNA samples were immediately used for preparation of Nanopore sequencing libraries or kept at –80 °C.

Preparation of Nanopore sequencing libraries

Preparation of RNA sequencing libraries was performed following the manufacturer’s instructions (ONT, SQK-PCS109) with minor modifications. Briefly, 50 ng of freshly prepared poly(A)-enriched mRNA was subjected to reverse transcription and strand-switching reaction. A total of four PCR reactions, each containing 5 µl of reverse-transcribed cDNA, was used for the attachment of rapid primers (cPRM). Sufficient amplification of long cDNA molecules was enabled by setting the PCR extension time to 19 min, and a total of 12 cycles were used for amplification. Samples were treated with 1 µl of Exonuclease I (New England Biolabs, M0293S) and subsequently pooled for SPRI bead cleanup. Wash steps were performed using 80% ethanol solution and beads were eluted in 60 µl of 50 °C pre-heated nuclease-free water. Samples were then incubated for an additional 20 min at 50 °C. Eluted DNA was combined with 5 µl adapter mix (AMX), 25 µl ligation buffer (LNB) from ONT’s ligation sequencing kit (ONT, SQK-LSK109) and 10 µl of NEBNext Quick T4 DNA Ligase (New England Biolabs, E6056S). The ligation mix was incubated at RT for 30 min. Removal of short DNA fragments was achieved by adding 40 µl of Agencourt AMPure XP beads (Beckman Coulter, A63881) combined with two wash steps with 250 µl of long fragment buffer (LFB) included in ONT’s ligation sequencing kit. The final library was eluted in 13 µl elution buffer (EB) for 20 min at 48 °C and DNA concentration was quantified using the Qubit dsDNA BR assay kit (Thermo Fisher Scientific, Q32850). A total of 400 ng was carefully mixed with 37.5 µl sequencing buffer (SQB) and 25.5 µl of loading beads (LB) and loaded onto a primed MinION flow cell (ONT, R9.4.1 FLO-MIN106).

RNA sequencing

ZIP13K2 hiPSCs and their derived EN cultures were treated with Accutase for 15 min at 37 °C, 5% CO2 to obtain a single-cell suspension. Cells were then collected, washed with ice-cold DPBS and centrifuged at 4 °C, 300 x g for 5 min. Subsequently, 350 µl of RLT Plus buffer containing 1% β-mercaptoethanol (Thermo) was added to the cell pellets for cell lysis. After dissociation by trituration and vortexing, RNA was extracted using the RNeasy Plus Micro Kit (Qiagen) and RNA concentration and quality were measured using the Agilent RNA 6000 Pico Kit (Agilent Technologies, 5067–1513) on an Agilent 2100 Bioanalyzer. All samples analyzed had a RINe value higher than 8.0, and were subsequently used for library preparation. mRNA libraries were prepared using the KAPA Stranded RNA-Seq Kit (KapaBiosystem) according to the manufacturer’s instructions. A total of 500 ng of total RNA per sample was used as input for library preparation. For adapter ligation, dual indexes were used (NEXTFLEX Unique Dual Index Barcodes NOVA-514150) at a working concentration of 71 nM (5 µl of 1 µM stock in each 70 µl ligation reaction). Quality and concentration of the obtained libraries were measured using Agilent High Sensitivity D5000 ScreenTape (Agilent Technologies, 5067–5592) on an Agilent 4150 TapeStation. All libraries were sequenced using 100 bp paired-end sequencing (200 cycles kit) on a NovaSeq platform at a minimum of 25 million fragments/sample.
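The stated adapter working concentration follows directly from the dilution relation C1 x V1 = C2 x V2. The following Python sketch reproduces the arithmetic for the values given above (5 µl of 1 µM stock in a 70 µl reaction); the function name is hypothetical.

```python
def working_concentration_nm(stock_um: float, stock_ul: float, total_ul: float) -> float:
    """Final concentration (nM) after diluting `stock_ul` of stock into `total_ul`."""
    if stock_ul <= 0 or total_ul < stock_ul:
        raise ValueError("stock volume must be positive and not exceed the total volume")
    return stock_um * 1000.0 * stock_ul / total_ul  # µM -> nM, then C1*V1/V2


if __name__ == "__main__":
    # 5 µl of 1 µM adapter stock in a 70 µl ligation reaction.
    print(f"{working_concentration_nm(1.0, 5.0, 70.0):.1f} nM")  # 71.4 nM
```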

4C sequencing

Triplicates of either undifferentiated ZIP13K2 or ZIP13K2-derived EN cultures were collected as described previously. ZIP13K2-derived EN cultures were further quenched with MACS-buffer (Final DPBS, 2 mM EDTA (ThermoFisher Scientific), 0.5% BSA (Sigma-Aldrich)) to obtain a single-cell suspension. CXCR4+ cell populations were enriched using MicroBead Kit (Miltenyi Biotec) following the manufacturer’s instructions. Pre- and post-MACS enriched cell fractions of differentiated cultures were measured for CXCR4-APC signal on the FACS Aria II (Becton Dickinson) to confirm the purity of the cell population. Circularized Chromosome Conformation Capture (4C) library preparation of undifferentiated or differentiated CXCR4+ enriched cell populations was performed according to the Weintraub A.S. et al. protocol (Weintraub et al., 2017). Briefly, NlaIII (New England Biolabs, R0125) was used as the primary cutter and DpnII (New England Biolabs, R0543) as the secondary cutter. Touchdown PCR on 4C libraries was performed using specific primer pairs (see primer list in Supplementary file 3) for the respective viewpoints. Illumina sequencing libraries were then prepared and sequenced using 150 bp paired-end sequencing (300 cycles kit) on a HiSeq4000 platform at a minimum of 10 M fragments/sample.

Capture Hi-C sequencing

cHi-C libraries were prepared from CRISPRi sgCtrl or sgT-REX17 EN cells. 5 × 10⁶ ZIP13K2-derived EN cells were harvested and washed with ice-cold DPBS. Cell lysis, NlaIII (NEB, R0125) digestion and proximity ligation were performed according to the Franke et al. protocol (Franke et al., 2016) with minor changes. Adaptors were added to DNA fragments, which were then amplified according to Agilent Technologies instructions for Illumina sequencing. The library was hybridized to the custom-designed SureSelect probes (Agilent Technologies, 5190–4806/3253271) (see probe list in Supplementary file 3) and indexed for sequencing of 200 M fragments/sample (100 bp paired-end) following the Agilent instructions. Capture Hi-C experiments were performed as biological duplicates.

SOX17 chromatin immunoprecipitation (ChIP) sequencing

ZIP13K2-derived EN cells (5 × 10⁶/IP) were harvested and cross-linked in 1% formaldehyde (Thermo Fisher Scientific, 28908) in DPBS for 10 min at RT, followed by quenching with final 125 mM Glycine (Sigma-Aldrich, 50046) for 5 min at RT. Cross-linked cells were then centrifuged at 500 x g at 4 °C and washed twice with ice-cold DPBS. Cell lysis was performed by resuspending the pellet in 500 μl Cell Lysis Buffer (Final 10 mM Tris-HCl, pH 8.0 (Sigma Aldrich, T2694); 85 mM KCl (Sigma Aldrich, P9541); 0.5% NP40 (Sigma Aldrich, 56741); 1 x cOmplete, EDTA-free Protease Inhibitor Cocktail (Sigma Aldrich, 11873580001)) followed by 10 min incubation on ice. After the incubation, lysed cells were centrifuged at 2500 x g for 5 min at 4 °C. Supernatant was carefully removed and the extracted nuclei were then resuspended in 230 μl Nuclei Lysis Buffer (Final 10 mM Tris-HCl, pH 7.5 (Sigma Aldrich, T2319); 1% NP40; 0.5% sodium deoxycholate (Sigma Aldrich, D6750); 0.1% SDS (Thermo Fisher Scientific, AM9820); 1 x cOmplete, EDTA-free Protease Inhibitor Cocktail). Following 10 min incubation on ice, each 260 μl sample was split into two microTUBEs (Covaris, 520045) and chromatin was sonicated using a Covaris E220 Evolution with the following settings: Temperature → 4 °C; Peak power → 140; Duty factor → 5.0; Cycles/Burst → 200; Duration → 750 sec. After sonication, sheared chromatin (ranging from 200 to 600 bp) was transferred into a new 1.5 ml tube and centrifuged at max speed for 10 min at 4 °C. Supernatant was then transferred into a new tube and volume was increased to 1 ml/sample with ChIP Dilution Buffer (Final 16.7 mM Tris-HCl, pH 8.0; 1.2 mM EDTA (Sigma Aldrich, 03690); 167 mM NaCl (Sigma Aldrich); 1.1% Triton-X (Sigma Aldrich); 0.01% SDS; 1 x Protease Inhibitor). Fifty μl (5%) was then transferred into a new tube and frozen at –20 °C as INPUT. 
One μg of SOX17 antibody/10⁶ initial cells was added to the remaining 950 μl, and immunoprecipitation was carried out at 4 °C o/n on a rotator (Supplementary file 3). The next day, 50 μl of Dynabeads Protein G (Thermo Fisher Scientific, 10004D)/IP were washed twice with ice-cold ChIP Dilution Buffer and then added to each IP. IP/bead mixes were incubated for 4 hr at 4 °C on a rotator. Next, bead/chromatin complexes were washed twice with Low Salt Wash Buffer at 4 °C (Final 20 mM Tris-HCl, pH 8.0; 2 mM EDTA; 150 mM NaCl (Sigma-Aldrich, S6546); 1% Triton-X; 0.1% SDS), twice with High Salt Wash Buffer at 4 °C (Final 20 mM Tris-HCl, pH 8.0; 2 mM EDTA; 500 mM NaCl; 1% Triton-X; 0.1% SDS), twice with LiCl Wash Buffer at 4 °C (Final 10 mM Tris-HCl, pH 8.0; 1 mM EDTA; 250 mM LiCl (Sigma Aldrich, L9650); 1% sodium deoxycholate (Sigma Aldrich); 1% NP40), twice with TE pH 8.0 (Sigma Aldrich, 8890) at room temperature and finally eluted twice in 50 μl freshly prepared ChIP Elution Buffer (Final 0.5% SDS; 100 mM NaHCO3 (Sigma Aldrich, S5761)) at 65 °C for 15 min (total 100 μl final eluent). Thawed INPUTS and eluted IPs were next reverse cross-linked at 65 °C o/n after the addition of 16 μl freshly prepared Reverse Crosslinking Salt Mixture (Final 250 mM Tris-HCl, pH 6.5 (Sigma Aldrich, 20–160); 62.5 mM EDTA; 1.25 M NaCl; 5 mg/ml Proteinase K (Thermo Fisher Scientific, AM2548)). The following day, phenol:chloroform (Thermo Fisher Scientific, 15593031) extraction followed by precipitation was performed to isolate DNA. IPs and INPUTS were then quantified and NGS libraries were prepared using NEBNext Ultra II DNA Library Prep Kit for Illumina (New England Biolabs, #E7645) following the manufacturer’s instructions. Library quality and size distribution were verified using a TapeStation D5000 HS kit (Agilent Technologies, 5067–5592). Samples were sequenced with a coverage of 50 M paired-end reads (2 × 100 bp)/sample on a NovaSeq (Illumina).

GATA4/GATA6 chromatin immunoprecipitation (ChIP) sequencing

GATA4/6 ChIPs were performed in duplicates as previously described (Genga et al., 2019). Briefly, approximately 5 × 10⁶ cells were used for each IP. Cells were cross-linked with 1% formaldehyde for 10 min followed by quenching with 125 mM glycine for 4–5 min at room temperature. The cell pellet was lysed in cell lysis buffer (20 mM Tris-HCl pH 8, 85 mM KCl, 0.5% NP-40) supplemented with 1 X protease inhibitors (Roche, 11836170001) on ice for 20 min, then spun at 5000 rpm for 10 min. The nuclear pellet was resuspended in sonication buffer (10 mM Tris pH 7.5, 1% NP-40, 0.5% sodium deoxycholate, 0.1% SDS, and 1 X protease inhibitors) and incubated for 10 min at 4 °C. In order to achieve a 200–700 bp DNA fragmentation range, nuclei were sonicated using a Branson Sonifier (model 250) with the following conditions: amplitude = 15%, time interval = 3 min (total of 8–12 min) and pulse ON/OFF = 0.7 s/1.3 s. Chromatin was pre-cleared with Dynabeads Protein A (Invitrogen, 10002D) for 1 hr and incubated with antibody on a rotating wheel overnight at 4 °C. On the following day, 30–40 μl of Dynabeads Protein A was added to the chromatin for 2–3 hr. The captured immuno-complexes were washed as follows: 1 x in low-salt buffer, 1 x in high-salt buffer, 1 x in LiCl salt buffer, and 1 x in TE. The immuno-complexes were eluted in ChIP-DNA elution buffer (10 mM Tris-HCl pH 8, 100 mM NaCl, 20 mM EDTA, and 1% SDS) for 20 min. The eluted ChIP-DNA was reverse cross-linked overnight at 65 °C, followed by proteinase K (Thermo, 25530049) treatment, RNase A (Thermo, EN0531) treatment, and Phenol:Chloroform:Isoamyl alcohol extraction. The Illumina library construction steps were carried out with 5–10 ng of purified DNA. During library construction, purification was performed after every step using the QIAquick PCR purification kit (QIAGEN, 28104) or QIAquick gel extraction kit (QIAGEN, 28706). 
The library reaction steps were as follows: end-repair, 3′ end A-base addition, adaptor ligation, and PCR amplification. The amplified libraries were size-selected for 200–450 bp on a 2% agarose E-gel (Thermo, G402002) and sequenced (single-end, 75 bp) on a NextSeq500 or HiSeq2000 platform.

H3K9me3 chromatin immunoprecipitation (ChIP) qPCR

ZIP13K2-derived EN cells (2 × 10⁶/IP) were harvested, cross-linked, washed, lysed, and sonicated as described above (see SOX17 ChIP sequencing). ChIP for H3K9me3 was performed in triplicates using the High-Sensitivity ChIP Kit (abcam, ab185913) in combination with the ChIP-grade H3K9me3 antibody (ab8898, abcam) according to the manufacturer’s instructions with slight modifications. Instead of DNA column purification, phenol:chloroform extraction followed by precipitation was performed to isolate DNA (see SOX17 ChIP sequencing). Precipitated DNA was dissolved in 200 µl H2O.

qPCR reactions were set up using the 2 x PowerUp SYBR Green Master Mix (Thermo Fisher Scientific, A25777) containing final 250 nM forward/reverse primer (see Supplementary file 3). All samples were measured in technical triplicates using 4 µl of diluted input or IP sample from above per reaction per replicate. qPCRs were set up on 96-well plates (Thermo Fisher Scientific, N8010560), spun down for 1 min at 2500 x g at RT, and run on a StepOnePlus 96-well Real-Time PCR System (Thermo Fisher Scientific).

T-REX17 RNA-pulldown followed by mass spectrometry

The RNA pulldown to discover T-REX17 protein interaction partners was performed by combining published protocols (Engreitz et al., 2014; Chu et al., 2012) with some modifications. ZIP13K2-derived EN cells (60 × 10⁶) were harvested and cross-linked in 1% formaldehyde (Thermo Fisher Scientific, 28908) in DPBS for 5 min at RT, followed by quenching with final 125 mM Glycine (Sigma-Aldrich, 50046) for 5 min at RT. Cross-linked cells were then centrifuged at 500 x g at 4 °C and washed three times with ice-cold DPBS. Cells were then resuspended in 10 ml Sucrose/Glycerol buffer (1:1) (Sucrose Buffer: 0.3 M Sucrose; 1% Triton-X (Sigma Aldrich); 10 mM HEPES (Thermo Fisher Scientific, 31330038); 100 mM KOAc; 0.1 mM EGTA (Sigma Aldrich); 0.5 mM Spermidine; 0.15 mM Spermine; 1 mM DTT; 1 X proteinase inhibitor (Roche, 11836170001); 10 U/ml SUPERaseIn (Thermo Fisher Scientific, AM2694)) (Glycerol Buffer: 25% Glycerol; 10 mM HEPES; 100 mM KOAc; 0.1 mM EGTA; 1 mM EDTA (Sigma Aldrich, 03690); 0.5 mM Spermidine; 0.15 mM Spermine; 1 mM DTT; 1 X proteinase inhibitor; 10 U/ml SUPERaseIn) and dounced 20 times with a tight glass pestle (Sigma Aldrich, D9938-1SET). After douncing, lysed cells were incubated for 10 min on ice inside the Dounce homogenizer. Cells were then transferred onto a cushion of 10 ml Glycerol Buffer in a 50 ml falcon tube and centrifuged at 1000 x g for 15 min at 4 °C to recover nuclei. Supernatant was first discarded by pipetting, and the residual volume was decanted onto a clean paper towel. Extracted nuclei were then resuspended in 5 ml 3% formaldehyde and fixed again for 30 min at RT, followed by three DPBS washes. Next, nuclei were resuspended in 5 ml Nuclei Extraction Buffer (Final 50 mM HEPES, pH 7.5; 250 mM NaCl; 0.1% sodium deoxycholate (Sigma Aldrich, D6750); 0.1 mM EGTA; 0.5% N-lauroylsarcosine; 5 mM DTT; 100 U/ml SUPERaseIn) and incubated for 10 min on ice. 
Nuclei were then centrifuged at 400 x g for 5 min at 4 °C, resuspended in 530 μl Nuclei Resuspension Buffer (Final 50 mM HEPES, pH 7.5; 75 mM NaCl; 0.1% sodium deoxycholate; 0.1 mM EGTA; 0.5% N-lauroylsarcosine; 5 mM DTT; 100 U/ml SUPERaseIn) and sonicated using a Covaris E220 Evolution with the following settings: Temperature → 4 °C; Peak power → 140; Duty factor → 5.0; Cycles/Burst → 200; Duration → 15 min. After sonication, sheared chromatin was split into 3 samples (Even/Odd/LacZ, 120 μl each) and incubated with the corresponding biotinylated probe set (36 pmol of probes were added; see Supplementary file 3 for probe sequences) together with 240 μl Hybridization Buffer (Final 33 mM HEPES, pH 7.5; 808 mM NaCl; 0.33% SDS; 5 mM EDTA; 0.17% N-lauroylsarcosine; 2.5 mM DTT; 5 X Denhardt’s solution; 1 X proteinase inhibitor; 100 U/ml SUPERaseIn) overnight at RT on a rotator. 5% of the sonicated sample was frozen as INPUT. The next day, 240 μl of MyOne Streptavidin C1 beads (Thermo Fisher Scientific, 65001) were added to each pulldown after washing and resuspension in Hybridization Buffer, and incubated for 3 hr at RT on a rotator. Next, bead complexes were washed once with Wash Buffer 1 (Final 30 mM HEPES, pH 7.5; 1.5 mM EDTA; 240 mM NaCl; 0.75% N-lauroylsarcosine; 0.65% SDS; 0.7 mM EGTA; 2 M Urea), four times with Wash Buffer 2 (Final 10 mM HEPES, pH 7.5; 2 mM EDTA; 240 mM NaCl; 0.1% N-lauroylsarcosine; 0.2% SDS; 1 mM EGTA) and once with RNase H elution Buffer (Final 50 mM HEPES, pH 7.5; 1.5 mM EDTA; 75 mM NaCl; 0.125% N-lauroylsarcosine; 0.5% Triton-X; 10 mM DTT; 0.5 M Urea). In this last step, 10% of the beads from each pulldown was transferred to a new tube for RNA isolation. The remaining 90% (protein sample fraction) was eluted in RNase H elution Buffer containing 10% RNase H, 10% RNase A and 10% DNase for 30 min at RT. 
The RNA fraction was de-crosslinked together with the INPUT samples by Proteinase K (Thermo Fisher Scientific, AM2548) treatment, and RNA was extracted by Trizol purification. RNA and INPUT samples were reverse transcribed and used for qPCR to validate T-REX17 enrichment. Protein samples were run on a NuPAGE 4–12%, Bis-Tris, 1.0 mm, Mini Protein Gel and silver-stained using SilverQuest (Thermo Fisher Scientific; LC6070) following the manufacturer’s instructions. The mass spectrometry compatible SilverQuest Silver Staining Kit was used for de-staining. Gel pieces were then washed twice with 300 µL of 25 mM ammonium bicarbonate in 50% acetonitrile, shaking at 500 rpm for 10 min, followed by centrifugation at 16,000 x g for 30 s. Gel pieces were completely dried in a vacuum concentrator. In-gel digestion with trypsin and extraction of peptides were done as previously described (Kaiser et al., 2008). Dried peptides were reconstituted in 5% acetonitrile and 2% formic acid in water, briefly vortexed, and sonicated in a water bath for 30 s before injection into the nano-LC-MS. LC-MS/MS was carried out by nanoflow reverse-phase liquid chromatography (Dionex Ultimate 3000, Thermo Scientific) coupled online to a Q-Exactive HF Orbitrap mass spectrometer (Thermo Scientific), as reported previously (Gielisch and Meierhofer, 2015). Briefly, the LC separation was performed using a PicoFrit analytical column (75 μm ID × 50 cm long, 15 µm Tip ID; New Objective, Woburn, MA) packed in-house with 3 µm C18 resin (Reprosil-AQ Pur, Dr. Maisch, Ammerbuch, Germany). Peptides were eluted using a gradient from 3.8 to 38% solvent B in solvent A over 120 min at a 266 nL/min flow rate. Solvent A was 0.1% formic acid and solvent B was 79.9% acetonitrile, 20% H2O, and 0.1% formic acid. For the IP samples, a 1-hr gradient was used. Nanoelectrospray was generated by applying 3.5 kV. 
A cycle of one full Fourier transformation scan mass spectrum (300–1750 m/z, resolution of 60,000 at m/z 200, automatic gain control (AGC) target 1 × 10⁶) was followed by 12 data-dependent MS/MS scans (resolution of 30,000, AGC target 5 × 10⁵) with a normalized collision energy of 25 eV.

HNRNPU RNA immunoprecipitation (RIP) followed by qRT-PCR or western blot

ZIP13K2-derived EN cells (10 × 10⁶) were harvested and cross-linked in 0.3% formaldehyde in DPBS for 10 min at RT, followed by quenching with final 1 x Glycine solution for 5 min at RT, according to the manufacturer’s instructions for the Magna Nuclear RIP (Cross-Linked) Nuclear RNA-Binding Protein Immunoprecipitation Kit (Merck Millipore, 17–10520). Cross-linked cells were then centrifuged at 800 x g at 4 °C and washed three times with ice-cold DPBS. Supernatant-free cell pellets were subjected to cell lysis according to the kit manufacturer’s instructions. Sonication was performed in the kit-provided RIP Cross-linked Lysis Buffer using the Covaris E220 Evolution with the following settings: Temperature → 4 °C; Peak power → 140; Duty factor → 5.0; Cycles/Burst → 200; Duration → 6 min, to obtain a DNA smear of 200–1000 bp. Sonicated lysates were centrifuged at 1000 x g for 10 min at 4 °C, and supernatants were aliquoted and stored at –80 °C. DNase I treatment followed by immunoprecipitation was performed according to the kit manufacturer’s instructions, combining lysates corresponding to 10⁶ cells with 5 µg antibody per sample (see Supplementary file 3 for antibodies). After DNase I treatment, 10% of the material was kept as input for qRT-PCR and stored at –80 °C. Initial supernatants (unbound fraction w/o beads) after o/n immunoprecipitation and 10% of the material from the last wash step (IP including beads) were kept for western blot and stored at –20 °C. Inputs and IPs were then subjected to reverse crosslinking and RNA purification according to the kit manufacturer’s instructions. cDNA synthesis was carried out as described earlier (see RNA isolation and cDNA synthesis).

Quantitative PCR (qPCR) reactions were set up using the 2 x PowerUp SYBR Green Master Mix (Thermo Fisher Scientific, A25777) containing final 250 nM forward/reverse primer (see Supplementary file 3 for primers and oligos) and 20–25 ng cDNA/well. Reactions were set up in 384-well plates (Thermo Fisher Scientific, AB2384B), followed by centrifugation for 2 min at 2500 x g at RT. Reactions were carried out on a QuantStudio 7 Flex 384-well Real-Time PCR System (Thermo Fisher Scientific).

Western blot samples of unbound fractions and IPs were boiled in final 1 x Laemmli Buffer (BioRad, 1610747) containing 10% 2-Mercaptoethanol (M6250, Sigma-Aldrich) for 10 min at 95 °C, followed by cooling on ice for 5 min. Western blots were then carried out as described below (see Western blot and band quantification) using the respective antibody dilutions (see Supplementary file 3 for antibodies).

Immunofluorescence staining

For immunofluorescent stainings, cells were grown in Ibidi eight-well glass-bottom plates (Ibidi, 80827) (initial seeding, 10⁴ cells/well). On the day of analysis, cells were washed twice with DPBS and then fixed in 4% Paraformaldehyde (PFA) solution (Sigma-Aldrich, P6148-500G) for 30 min at 4 °C, and then washed three more times with DPBS. Subsequently, cells were permeabilized for 30 min in DPBS-T solution (Final 0.5% Triton-X (Sigma-Aldrich, T8787-50ML) in DPBS) and blocked for 30 min in Blocking solution (Final 10% fetal bovine serum in DPBS-T) at RT. Primary antibody incubation was performed in Blocking solution for 1 hr and 45 min at RT, after which cells were washed three times with Blocking solution. After the last washing step, samples were incubated with secondary antibodies diluted in Blocking solution for 30 min at RT. Afterwards, cells were washed three times with DPBS-T. The last DPBS-T washing step after secondary antibody incubation contained 0.02% DAPI (Roche Diagnostics, 10236276001). DAPI was incubated for 10 min at RT and washed off once with DPBS. All primary and secondary antibodies and their working concentrations are listed in Supplementary file 3.

Cell clearing

Prior to imaging, cells were cleared with RIMS (Refractive Index Matching Solution) in order to increase light penetrability. To this end, samples were first washed three times with 0.1 M phosphate buffer (0.025 M NaH2PO4, 0.075 M Na2HPO4, pH 7.4). Clearing was then performed by incubation in RIMS solution (133% w/v Histodenz (Sigma-Aldrich, D2158) in 0.02 M phosphate buffer) at 4 °C o/n.

Immunofluorescence imaging

Cells stained with antibodies were imaged with the Zeiss Celldiscoverer7 (wide-field), Zeiss LSM880 (laser-scanning microscope with Airyscan), Zeiss Observer (wide-field) or Nikon Eclipse TS2 (bench-top microscope) with appropriate filters for DAPI, Alexa Fluor 488, Alexa Fluor 568, Alexa Fluor 647, and combinations thereof.

Quantitative fluorescence microscopy

For each staining tested, a total of 49 individual positions were acquired in 3 fluorescence channels per replicate per well, with a 20 x/NA = 0.95 objective, an afocal magnification changer at 1 x, 3 × 3 camera binning, a resulting pixel size of 0.46 µm², and in constant focus stabilization mode. Analysis was then performed using the Image Analysis module running in ZEN 3.2. On average, 6928 single cells were analyzed per replicate. Cells were identified on the smoothed nuclear counterstain (DAPI) using fixed intensity thresholds, and nearby objects were separated by mild watershedding. The resulting primary objects were filtered (area 45–175 µm²) and expanded by 8 pixels (= 5.44 µm); the resulting ring served as a surrogate for the cytoplasmic compartment. Fluorescence intensities (mean and standard deviation) were quantified for each nucleus and expanded object, depending on the staining pattern profiled.
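The pixel and expansion values quoted above are mutually consistent if 0.46 is read as the pixel area in µm² and 5.44 as the ring width in µm. The Python sketch below checks that arithmetic and illustrates the nuclear area filter. This reading of the units and the helper name are assumptions, not statements from the original analysis.

```python
import math

PIXEL_AREA_UM2 = 0.46  # stated pixel size after 3 x 3 binning, read here as an area

# Edge length of one square pixel, rounded to two decimals (about 0.68 µm)
pixel_edge_um = round(math.sqrt(PIXEL_AREA_UM2), 2)

# Expanding a nucleus by 8 pixels adds a ring of this width
ring_width_um = 8 * pixel_edge_um


def passes_area_filter(area_um2, lower=45.0, upper=175.0):
    """Keep primary objects whose area lies in the stated 45-175 µm² window."""
    return lower <= area_um2 <= upper


print(pixel_edge_um, round(ring_width_um, 2))  # 0.68 5.44
```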

Single-molecule RNA fluorescent in situ hybridization

For single-molecule RNA fluorescent in situ hybridization (smRNA-FISH), cells were grown in Ibidi eight-well glass-bottom plates (Ibidi 80827) (initial seeding, 10⁴ cells/well). On the day of analysis, cells were washed twice with DPBS, fixed in 4% PFA for 10 min at RT, and washed again twice with DPBS. Cells were then incubated in 70% ethanol at 4 °C for at least 1 hr and then washed with 1 ml of Wash Buffer A (LGC Biosearch Technologies) at room temperature for 5 min. Cells were subsequently hybridized with 100 μl of Hybridization Buffer (LGC Biosearch Technologies) containing the smRNA-FISH probes at a 1:100 dilution in a humid chamber at 37 °C o/n (not more than 16 hr). The next day, cells were washed with 1 ml of Wash Buffer A at 37 °C for 30 min and stained with Wash Buffer A containing 10 μg/ml Hoechst 33342 at 37 °C for 30 min. Cells were then washed with 1 ml of Wash Buffer B (LGC Biosearch Technologies) at RT for 5 min, mounted with ProLong Gold (Thermo, P10144), and left to cure at 4 °C o/n before proceeding to image acquisition. Oligonucleotide probes were designed with the Stellaris smRNA-FISH probe designer (LGC Biosearch Technologies, version 4.2), labeled with Quasar 570 and produced by LGC Biosearch Technologies. smRNA-FISH probe sequences are listed in Supplementary file 3.

smRNA-FISH imaging

Image acquisition was performed using a DeltaVision Elite widefield microscope with an Olympus UPlanSApo 100 x /1.40-numerical aperture oil objective lens and a PCO Edge sCMOS camera. Z-stacks of 200 nm step size capturing the entire cell were acquired. Images were deconvolved with the built-in DeltaVision SoftWoRx Imaging software and maximum intensity projections were created. RNA-FISH foci were then quantified manually considering the overlap with Hoechst (nuclear fraction) and calculating T-REX17 background staining (cytoplasmic fraction) using ImageJ (Rueden et al., 2017) and Fiji (Schindelin et al., 2012).

Cell fractionation followed by RT-PCR and agarose gel band quantification

WT hiPSCs were differentiated to definitive endoderm cells (see Definitive endoderm (EN) differentiation), and cytoplasmic, nucleoplasmic and chromatin fractions were subsequently isolated using the Subcellular Protein Fractionation Kit for Cultured Cells (Thermo Fisher Scientific, 78840) according to the manufacturer’s protocol. Kit-provided buffers were supplemented with 1 U/µl SUPERaseIn RNase Inhibitor (ThermoFisher Scientific, AM2694). RNA was then isolated from each cell fraction, followed by cDNA synthesis (see RNA isolation and cDNA synthesis). Relative PCR-product band intensity was obtained from agarose gel purified PCR products using the BioRad ChemiDoc XRS+ imaging system. Band intensities of each fraction were normalized to the cytoplasmic fraction (Cyt). Relative fractions per replicate were scaled to sum to 100% before being represented as relative percentage fractions (Supplementary file 1). PCR primer sequences are listed in Supplementary file 3.
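The band-intensity normalization can be sketched in a few lines of Python. This is a minimal illustration with made-up intensities; the fraction labels "Nuc" and "Chr" and the function name are assumptions, and the published values are those in Supplementary file 1.

```python
def relative_fractions(intensities, reference="Cyt"):
    """Normalize band intensities to the reference fraction, then scale to sum to 100%."""
    reference_value = intensities[reference]
    normalized = {name: value / reference_value for name, value in intensities.items()}
    total = sum(normalized.values())
    return {name: 100.0 * value / total for name, value in normalized.items()}


# Hypothetical band intensities for one replicate (arbitrary units)
example = relative_fractions({"Cyt": 200.0, "Nuc": 300.0, "Chr": 500.0})
print(example)  # {'Cyt': 20.0, 'Nuc': 30.0, 'Chr': 50.0}
```

Because every fraction is divided by the same reference before rescaling, the final percentages equal each band's share of the summed intensities. The intermediate step only sets Cyt to 1.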

Staining for FACS analysis

Undifferentiated or differentiated ZIP13K2 cultures were treated with Accutase for 15 min at 37 °C, 5% CO2 to obtain a single-cell suspension. To quench the dissociation reaction and to wash the cells, FACS-buffer was added (Final DPBS, 5 mM EDTA (Thermo Fisher Scientific, 15575020), 10% Fetal bovine serum (FBS, PAN Biotech, P30-2602)). Next, cells were spun down at 300 x g for 5 min at 4 °C. Cells were then resuspended in FACS-buffer containing surface marker antibodies (see Supplementary file 3) and incubated for 15 min at 4 °C in the dark. For extracellular stainings (ECS) only, cells were further washed once with FACS-buffer and spun down at 300 x g before FACS analysis was performed. If additional intracellular stainings (ECS + ICS) were performed, cells were washed once with FACS-buffer, supernatants were removed and cells were fixed according to the manufacturer’s instructions using the True-Nuclear Transcription Factor Buffer Set (Biolegend, 424401). Intracellular staining was performed according to the manufacturer’s instructions before FACS analysis was carried out. ICS antibody dilutions are listed in Supplementary file 3. FACS analysis was performed on the FACSCelesta Flow Cytometer (Becton Dickinson). Raw data were analyzed using FlowJo (LLC) V10.6.2.

Western blot and band quantification

Undifferentiated or differentiated ZIP13K2 cultures were treated with Accutase for 15 min at 37 °C, 5% CO2 to obtain a single-cell suspension. Single-cell suspensions were washed once with ice-cold DPBS and spun down at 300 x g for 5 min at 4 °C. Supernatants were removed and cell lysates were generated by treatment for 30 min on ice with RIPA buffer (Thermo Fisher Scientific, 89900) supplemented with 1 x HALT protease inhibitor (Thermo Fisher Scientific, 87786). Lysates were spun down at 12,000 x g for 10 min at 4 °C, and supernatants were quantified for protein content using the Pierce BCA Protein Assay Kit (Thermo Fisher Scientific, 23227) according to the manufacturer’s instructions.

For western blot, 20 µg of total protein extract per sample was boiled in final 1 x Laemmli Buffer (BioRad, 1610747) containing 10% 2-Mercaptoethanol (M6250, Sigma-Aldrich) for 10 min at 95 °C, followed by cooling on ice for 5 min. Samples were then loaded on a NuPAGE 4–12%, Bis-Tris, 1.0 mm, Mini Protein Gel (Thermo Fisher Scientific, NP0322BOX) and run at 200 V for 30 min in 1 x NuPAGE MOPS SDS Running Buffer (Thermo Fisher Scientific, NP0001) containing 1:400 NuPAGE Antioxidant (Thermo Fisher Scientific, NP0005). Protein transfer was performed using the iBlot 2 Starter Kit, PVDF (Thermo Fisher Scientific, IB21002S) following the manufacturer’s instructions for the P0 program.

PVDF membranes containing transferred proteins were incubated in blocking buffer (1 x TBS-T (Thermo Fisher Scientific, 28360), 5% Blotting-Grade Blocker (BioRad, 1706404)) for 1 hr at RT. Incubation with primary antibody dilution (see Supplementary file 3) was performed in blocking buffer at 4 °C overnight. The following day, membranes were washed three times for 10 min at RT with 1 x TBS-T and incubated for 2 hr at RT in secondary antibody dilution in blocking buffer (Supplementary file 3). Next, membranes were washed three times for 10 min at RT with 1 x TBS-T, developed using the SuperSignal West Dura Extended Duration Substrate (Thermo Fisher Scientific, 34075) according to the manufacturer’s instructions, and imaged on the BioRad ChemiDoc XRS+ imaging system to obtain relative band intensities. JNK or pJNK band intensities were then normalized to their respective GAPDH levels before calculating relative pJNK levels (pJNK/JNK). Relative pJNK levels of EN time-course differentiations (Figure 4C, left panel; Figure 4—figure supplement 2A) were finally calculated and depicted as Log2FC(sgT-REX/sgCtrl) (Figure 4C, right panel). Raw data and calculations are provided in Supplementary file 1.
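The band quantification described above reduces to two ratios and a logarithm. The following Python sketch shows the order of operations with made-up band intensities; the function names are illustrative, and the actual values are those in Supplementary file 1.

```python
import math


def relative_pjnk(pjnk, gapdh_for_pjnk, jnk, gapdh_for_jnk):
    """GAPDH-normalize pJNK and JNK separately, then take the pJNK/JNK ratio."""
    return (pjnk / gapdh_for_pjnk) / (jnk / gapdh_for_jnk)


def log2_fold_change(knockdown, control):
    """Log2 fold change of the knockdown relative to the control."""
    return math.log2(knockdown / control)


# Hypothetical band intensities (arbitrary units) for one time point
control = relative_pjnk(pjnk=50.0, gapdh_for_pjnk=100.0, jnk=100.0, gapdh_for_jnk=100.0)
knockdown = relative_pjnk(pjnk=100.0, gapdh_for_pjnk=100.0, jnk=100.0, gapdh_for_jnk=100.0)
print(control, knockdown, log2_fold_change(knockdown, control))  # 0.5 1.0 1.0
```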

Computational analysis

Command-line processing of BAM, BED and bigWig files was done using SAMtools (v1.10) (Li et al., 2009), BEDtools (v2.25.0) (Quinlan and Hall, 2010) and UCSCtools (v4) (Kuhn et al., 2013). Unless stated otherwise, all statistics and plots were generated using R versions 3.6.0 and 3.6.1. In all boxplots, the centerline is the median; boxes, first and third quartiles; whiskers, 1.5 x inter-quartile range; data beyond the end of the whiskers are displayed as points.

Human vs. mouse T-REX17 conservation analysis

Local alignment was performed with EMBOSS Water (Madeira et al., 2022). Visualizations were created with Matplotlib (Hunter, 2007). Alignment sequences were read into Python using the Biopython library (Cock et al., 2009). The full sequence of the human T-REX17 locus was aligned to the full sequence of the mouse T-rex17 locus using Water. Aligned subsequences of 20 base pairs or more in length, including substitutions but excluding indels, were used to calculate conservation. Additionally, individual exons and the enhancer sequence were also aligned with Water. Conserved stretches were connected from the human sequence box to the mouse sequence box and visualized as lines.
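One way to extract such stretches from a pairwise alignment is to scan the two gapped strings for runs of columns without indels. The Python sketch below illustrates this under the assumption that gaps are written as "-". It is not the authors' script, and the coordinates are offsets within the aligned region rather than genomic positions.

```python
def gap_free_blocks(aligned_a, aligned_b, min_len=20):
    """Return (start_a, start_b, length) for ungapped alignment runs of at least min_len columns.

    Mismatched columns (substitutions) extend a run; any gap in either sequence ends it.
    Starts are 0-based offsets into the ungapped versions of each aligned sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    blocks = []
    pos_a = pos_b = 0            # positions in the ungapped sequences
    run_len = 0
    run_start_a = run_start_b = 0
    for char_a, char_b in zip(aligned_a, aligned_b):
        if char_a != "-" and char_b != "-":
            if run_len == 0:
                run_start_a, run_start_b = pos_a, pos_b
            run_len += 1
        else:
            if run_len >= min_len:
                blocks.append((run_start_a, run_start_b, run_len))
            run_len = 0
        if char_a != "-":
            pos_a += 1
        if char_b != "-":
            pos_b += 1
    if run_len >= min_len:
        blocks.append((run_start_a, run_start_b, run_len))
    return blocks


# Toy alignment: a 24-column run with substitutions, a 2-column gap, then a 5-column run
toy_a = "ACGT" * 6 + "--" + "ACGTA"
toy_b = "ACGT" * 5 + "TTTT" + "GG" + "ACGTA"
print(gap_free_blocks(toy_a, toy_b))  # [(0, 0, 24)]
```

Each returned block can then be drawn as a line connecting the two sequence boxes.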

4Cseq data analysis

The raw sequencing reads were trimmed using cutadapt (Martin, 2011) (--discard-untrimmed -e 0.05 -m 25) to remove primer sequences and restriction enzyme sequences. Reads not matching those sequences were removed from further analysis. The remaining reads were then mapped to the reference genome GRCh37/hg19 using bowtie2 (Langmead and Salzberg, 2012) (default parameters). An iterative mapping procedure was performed. Specifically, the full-length reads were first mapped to the genome. The unmapped reads were then trimmed by 5 nt from the 3′ end each time until they were successfully mapped to the genome or until they were shorter than 25 bp. The final mapped reads were assigned to valid fragments. The fragment counts were then normalized by RPM (reads per million) and smoothed by averaging the counts of the closest five fragments.
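The iterative mapping and the normalization steps can be sketched as follows. This Python example is illustrative only: `try_map` is a hypothetical stand-in for a call to the aligner, and the smoothing assumes that "the closest five fragments" means a window of five centered on each fragment, truncated at the ends.

```python
def iterative_map(read, try_map, step=5, min_len=25):
    """Map a read, trimming `step` nt from the 3' end after each failed attempt.

    `try_map` is a hypothetical callable returning a hit, or None if the read is unmapped.
    Trimming stops once the read would be shorter than `min_len`.
    """
    while len(read) >= min_len:
        hit = try_map(read)
        if hit is not None:
            return hit
        read = read[:-step]
    return None


def reads_per_million(counts):
    """Scale fragment counts so that they sum to one million."""
    total = sum(counts)
    return [count * 1_000_000 / total for count in counts]


def smooth(values, window=5):
    """Running mean over `window` neighbouring fragments (assumed centered window)."""
    half = window // 2
    smoothed = []
    for index in range(len(values)):
        lower = max(0, index - half)
        upper = min(len(values), index + half + 1)
        smoothed.append(sum(values[lower:upper]) / (upper - lower))
    return smoothed


print(reads_per_million([1, 1, 2]))  # [250000.0, 250000.0, 500000.0]
print(smooth([0, 0, 10, 0, 0])[2])   # 2.0
```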

Coding potential calculation

Whole-genome multiple species alignments of 46 vertebrate species with human (assembly hg19, October 2009) as a reference were retrieved from the UCSC genome browser (Kent et al., 2002). Human lincRNA annotation was obtained from Gencode (Frankish et al., 2019) (gencode.v33lift37.long_noncoding_RNAs.gtf, December 2019). All ORFs in each transcript were identified and the corresponding multiple species alignment was scored by the omega method of PhyloCSF (Lin et al., 2011). Figure 2C (left panel) shows 95% (2.5–97.5 percentile) of the 271,572 sORFs from the (Kent et al., 2002) analyzed human lincRNAs (randomly sampled from chromosomes 16, 21, 18, 11, 17, 5, 10, 19, 22, 2, 7, X, 12, 6, Y). The SOX17 CDS and all identified sORFs in T-REX17 were scored by omega PhyloCSF as shown in Figure 2C, right panel.

RNA-seq

All RNAseq samples were pre-processed using cutadapt (Martin, 2011) to remove adapters and trim low-quality bases. Reads were subsequently aligned against the human reference genome hg19 using STAR (Dobin et al., 2013) (parameters: --outSAMtype BAM SortedByCoordinate --outSAMattributes Standard --outSAMstrandField intronMotif --outSAMunmapped Within --quantMode GeneCounts). Finally, Stringtie (Pertea et al., 2015) was used for calculation of strand-specific TPMs.

Differential gene expression was calculated using DESeq2 (Love et al., 2014). Genes with an absolute log2 fold change >1 and an adjusted P-value <0.05 were termed differentially expressed. Lowly expressed genes (TPM <1 in all samples) were excluded from the analysis.
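The thresholds above can be written as a single predicate. The Python sketch below applies them to per-gene values such as those exported from DESeq2 and Stringtie. The function name is illustrative, and applying the expression filter at this point (rather than before testing) is an assumption.

```python
def is_differentially_expressed(log2_fold_change, adjusted_p, tpms,
                                lfc_cutoff=1.0, p_cutoff=0.05, min_tpm=1.0):
    """Apply the stated criteria: |log2FC| > 1, adjusted P < 0.05, not lowly expressed."""
    # Lowly expressed: TPM below the cutoff in every sample
    if all(tpm < min_tpm for tpm in tpms):
        return False
    # Genes without an adjusted P-value (None) cannot be called
    if adjusted_p is None:
        return False
    return abs(log2_fold_change) > lfc_cutoff and adjusted_p < p_cutoff


print(is_differentially_expressed(-2.3, 0.001, [12.0, 15.5, 3.1, 2.8]))  # True
print(is_differentially_expressed(4.0, 0.001, [0.2, 0.9, 0.0, 0.4]))     # False
```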

Capture Hi-C

Raw sequence reads of capture Hi-C (cHi-C) were mapped to the hg19 version of the human genome using BWA (v0.7.17-r1188) (Li and Durbin, 2009) with parameters (mem -A 1 -B 4 -E 50 -L 0). Mapped reads were further processed by HiCExplorer (v3.6) (Ramírez et al., 2018) to remove duplicated reads and reads from dangling ends, self-circles, self-ligations and same fragments. The replicates were merged to construct contact matrices of 1 kb resolution. Normalization was performed to ensure that all samples have the same number of total contacts, followed by KR correction. The relative contact difference between two cHi-C maps was calculated by subtracting one from the other using the corrected matrices.
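The depth normalization and subtraction can be illustrated on toy matrices. The Python sketch below scales each matrix to a common total and subtracts them element-wise. It deliberately omits the KR correction, uses plain nested lists instead of HiCExplorer's matrix format, and the choice of the smaller total as the common target is an assumption.

```python
def scale_to_total(matrix, target_total):
    """Rescale a contact matrix so that all entries sum to target_total."""
    total = sum(sum(row) for row in matrix)
    factor = target_total / total
    return [[value * factor for value in row] for row in matrix]


def contact_difference(matrix_a, matrix_b):
    """Element-wise difference (a minus b) of two equally sized matrices."""
    return [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(matrix_a, matrix_b)]


# Toy 2 x 2 contact maps with different sequencing depth
control_map = [[2.0, 2.0], [2.0, 2.0]]      # 8 contacts in total
knockdown_map = [[2.0, 6.0], [6.0, 2.0]]    # 16 contacts in total

common_total = min(8.0, 16.0)
control_scaled = scale_to_total(control_map, common_total)
knockdown_scaled = scale_to_total(knockdown_map, common_total)
print(contact_difference(knockdown_scaled, control_scaled))  # [[-1.0, 1.0], [1.0, -1.0]]
```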

SOX17 chromatin immunoprecipitation

The ChIP-seq data as well as the control input data were aligned to the human reference genome (hg19) using BWA mem (Heng, 2013) with default parameters. GATK (McKenna et al., 2010) was used to obtain alignment metrics and remove duplicates. Peaks were called using the MACS2 (2.1.2_dev) (Zhang et al., 2008) callpeak function with default parameters. After validation of replicate comparability and quality, replicates were merged at the read level and reprocessed together with input samples. Background-subtracted coverage files were obtained using MACS2 bdgcmp with -m FE. Peaks overlapping ENCODE blacklisted regions (hg19-blacklist.v2.bed) were removed from the analysis.
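The blacklist filter amounts to an interval-overlap test per chromosome. In practice this is typically done with BEDtools; the Python sketch below only illustrates the logic, assuming BED-style half-open coordinates and using made-up intervals.

```python
def intervals_overlap(start_a, end_a, start_b, end_b):
    """True if two half-open intervals [start, end) share at least one base."""
    return start_a < end_b and start_b < end_a


def remove_blacklisted(peaks, blacklist):
    """Drop every (chrom, start, end) peak that overlaps any blacklisted region."""
    kept = []
    for chrom, start, end in peaks:
        hit = any(chrom == b_chrom and intervals_overlap(start, end, b_start, b_end)
                  for b_chrom, b_start, b_end in blacklist)
        if not hit:
            kept.append((chrom, start, end))
    return kept


# Hypothetical peaks and blacklist entries
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
blacklist = [("chr1", 150, 160)]
print(remove_blacklisted(peaks, blacklist))  # [('chr1', 500, 600), ('chr2', 100, 200)]
```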

GATA4/6 chromatin immunoprecipitation

The FASTQ files of the GATA4/6 ChIP-seq experiments were processed using the ENCODE ChIP-seq pipeline version 1.6.1 (https://github.com/ENCODE-DCC/chip-seq-pipeline2, copy archived at swh:1:rev:ec4295c8ac68be25b25357038d82ec942ac0bf8d; Jin, 2022) with default settings and the hg19 genome. Standard ENCODE ChIP-seq reference files were used, as listed in https://storage.googleapis.com/encode-pipeline-genome-data/genome_tsv/v1/hg19_caper.tsv. Pooled fold-change bigWigs were used.

Single-cell RNAseq pipeline

Publicly available raw single-cell RNAseq data of 1195 pre-filtered cells from a gastrulating human embryo (Tyser et al., 2021) were downloaded from ArrayExpress (Athar et al., 2019) under accession code E-MTAB-9388. The GENCODE (Frankish et al., 2021) human transcriptome (GRCh37.p13) and its annotation were downloaded and supplemented with the T-REX17 entry. After building the transcriptome index, transcript abundance was quantified with Salmon v1.6.0 (Patro et al., 2017) in quasi-mapping-based mode using the --seqBias and --gcBias flags. Data were loaded as a scanpy v1.4.4 (Wolf et al., 2018) object, reproducing the clustering reported by Tyser et al. (2021). The resulting clusters were visualized in two dimensions via the scanpy UMAP representation, using default parameters (tl.umap). UMAPs are displayed in Figure 2—figure supplement 1E (upper panel).

Bulk measurements from scRNAseq pipeline

To measure T-REX17 read counts in endoderm cells, fastq files were combined into one bulk raw file. This file was processed with a bulk RNAseq pipeline comprising pre-alignment quality control with FastQC v0.11.9, adapter and low-quality base trimming using cutadapt (Martin, 2011), post-trimming QC, and read alignment against the human genome (GRCh37.p13) using STAR (Dobin et al., 2013) (parameters: --outSAMtype BAM SortedByCoordinate --chimSegmentMin 20 --outSAMstrandField intronMotif --quantMode GeneCounts). Finally, the BAM file was visualized using the Integrative Genomics Viewer (IGV) (Robinson et al., 2011). IGV tracks are displayed in Figure 2—figure supplement 1E (lower panel).

Oxford Nanopore RNA analysis

All Oxford Nanopore Technologies derived runs were processed using the Nanopype pipeline (v1.1.0) (Giesselmann et al., 2019). The basecaller Guppy (v4.0.11) was used with the r9.4.1 high-accuracy configuration. Quality filtering was disabled for any base calling. Base-called reads were aligned against the human reference genome hg19 using minimap2 (v2.10) (Li and Birol, 2018) with the Oxford Nanopore Technologies parameter preset for spliced alignments (-ax splice -uf -k14). Only unique alignments (-F 2304) are reported.
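For readers unfamiliar with SAM flags, the -F 2304 filter can be unpacked as follows. In the SAM flag field, 256 (0x100) marks secondary alignments and 2048 (0x800) marks supplementary alignments, so excluding records with either bit set leaves one primary record per aligned read. The helper name `is_reported` is hypothetical and used for illustration only.

```python
SECONDARY = 0x100      # 256: secondary alignment
SUPPLEMENTARY = 0x800  # 2048: supplementary alignment
EXCLUDE_MASK = SECONDARY | SUPPLEMENTARY  # 2304


def is_reported(flag):
    """True if a record with this SAM flag survives an '-F 2304' filter."""
    return (flag & EXCLUDE_MASK) == 0


# Example flags: 0 = primary forward, 16 = primary reverse,
# 256 = secondary, 2064 = supplementary on the reverse strand.
for flag in (0, 16, 256, 2064):
    print(flag, is_reported(flag))
```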

Oxford Nanopore RNA split-read analysis

Post-processed Nanopore split-read data (see Oxford Nanopore RNA analysis) from wild-type endoderm mRNA (see Extraction of polyA RNA for Nanopore sequencing; see Preparation of Nanopore sequencing libraries) were extracted from the junctions track of BAM files visualized using the Integrative Genomics Viewer (IGV) (Robinson et al., 2011) at the coordinates hg19, chr8:55115873–55141447. Split reads between hg19, chr8:55140806 (5’-sequence of Exon 1, see 5’/3’ RACE PCR experiments) and hg19, chr8:55125601 (3’-sequence of Exon 3, see 5’/3’ RACE PCR experiments) were assigned to isoform Ex1+2 (see Figure 1—figure supplement 1C). The full isoform Ex1+2 sequence (~2.8 kb long) can be found in Supplementary file 1.

Split reads between hg19, chr8:55140806 (5’-sequence of Exon 1, see 5’/3’ RACE PCR experiments) and hg19, chr8:55123254 (3’-sequence of Exon 3, see 5’/3’ RACE PCR experiments) were assigned to isoform Ex1+3 (see Figure 1—figure supplement 1C). All other reads were counted as “sloppy spliced” reads and, together with both isoforms, expressed in relative terms (see Figure 1—figure supplement 1C). The full isoform Ex1+3 sequence (~3 kb long) can be found in Supplementary file 1. A summary of the relative isoform quantification is displayed in Figure 2F.
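The assignment of split reads to isoforms and the relative quantification can be sketched as follows. The junction coordinates are those given in the text; everything else is an assumption made for illustration, including the representation of a split read as a pair of positions, the requirement for an exact coordinate match, the function names, and the toy read counts, which are not the counts measured in this study.

```python
from collections import Counter

EXON1_POSITION = 55140806  # hg19, chr8, shared by both isoforms
ISOFORM_BY_PARTNER = {
    55125601: "Ex1+2",
    55123254: "Ex1+3",
}


def classify(junction):
    """Assign a split read, given as a pair of positions, to an isoform."""
    high, low = max(junction), min(junction)
    if high == EXON1_POSITION:
        return ISOFORM_BY_PARTNER.get(low, "sloppy spliced")
    return "sloppy spliced"


def relative_fractions(junctions):
    """Fraction of split reads falling into each category."""
    counts = Counter(classify(junction) for junction in junctions)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical toy reads (not data from this study).
junctions = [
    (55140806, 55125601),
    (55140806, 55125601),
    (55140806, 55123254),
    (55139000, 55130000),
]
fractions = relative_fractions(junctions)
print(fractions)
```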

Mass spectrometry analysis and ranking of T-REX17 protein partners

Raw MS data were processed with MaxQuant software (v 1.6.10.43) and searched against the human proteome database UniProtKB with 75,074 entries, released in May 2020. MaxQuant database search parameters were a false discovery rate (FDR) of 0.01 for proteins and peptides, cysteine carbamidomethylation as a fixed modification, and N-terminal acetylation and methionine oxidation as variable modifications. Protein abundance in each of the three samples was quantified by calculating label-free quantitation (LFQ) values for each detected protein. Protein targets were then ranked based on the log2[(EvenLFQ + OddLFQ)/2/LacZLFQ] extrapolated values.
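The ranking score can be written out as a small Python sketch. It computes the log2 ratio between the mean of the Even and Odd LFQ values and the LacZ LFQ value, then sorts proteins by decreasing score. The function names, protein labels and LFQ values are hypothetical, and the sketch assumes that all LFQ values are positive; how missing or zero values were handled is not specified here.

```python
import math


def enrichment_score(even_lfq, odd_lfq, lacz_lfq):
    """log2 of the mean Even/Odd LFQ value divided by the LacZ LFQ value."""
    return math.log2((even_lfq + odd_lfq) / 2 / lacz_lfq)


def rank_proteins(lfq_by_protein):
    """Return (protein, score) pairs sorted from most to least enriched."""
    scores = {
        protein: enrichment_score(even, odd, lacz)
        for protein, (even, odd, lacz) in lfq_by_protein.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy LFQ values as (Even, Odd, LacZ); not data from this study.
lfq_by_protein = {
    "PROTEIN_B": (300.0, 500.0, 400.0),
    "PROTEIN_A": (800.0, 1200.0, 250.0),
    "PROTEIN_C": (100.0, 100.0, 400.0),
}
ranking = rank_proteins(lfq_by_protein)
print(ranking)
```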

Plotting

Plots were generated with GraphPad Prism 8, R 3.6.0 and R 3.6.1.

Data availability

All data presented in this study are available in the main text, methods or tables. Sequencing data have been deposited in the Gene Expression Omnibus (GEO) under accession code GSE178990. Source data files have been provided where necessary. Codes used to perform the analysis in this study are available at the following GitHub locations or as source data files in the submission documents: https://github.com/RosariaTornisiello/lncSox17.git (copy archived at swh:1:rev:3cfa5a270e27c291d0dd390173e2311406c78bd7); https://github.com/Drmirdeep/stitch_maf (copy archived at swh:1:rev:706a7b94e0be9b02a10f47680b9a693f1178e409); https://github.com/Drmirdeep/micpdp (copy archived at swh:1:rev:901d60c3b78208bef072ff0906e6ec271f1f9667); https://github.com/HeleneKretzmer/lncRNA_Sox17 (copy archived at swh:1:rev:6d5d930e306ec8ebb4420609aa30986429e474be); https://github.com/ENCODE-DCC/chip-seq-pipeline2 (copy archived at swh:1:rev:ec4295c8ac68be25b25357038d82ec942ac0bf8d).

References

Decision letter

  1. Deborah Bourc'his
    Reviewing Editor; Institut Curie, France
  2. Marianne E Bronner
    Senior Editor; California Institute of Technology, United States
  3. Pablo Navarro
    Reviewer; Institut Pasteur, France
  4. Peter J Rugg-Gunn
    Reviewer; The Babraham Institute, United Kingdom

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Discovery and characterization of LNCSOX17 as an essential regulator in human endoderm formation" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Marianne Bronner as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Pablo Navarro (Reviewer #2); Peter J Rugg-Gunn (Reviewer #3).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Comments from the three reviewers were overall positive and broadly similar. The large breadth of relevant approaches that the authors used to tackle the function of this lncRNA was greatly appreciated, even though the mechanism by which it controls endoderm differentiation remains unclear. From their individual evaluation and the common discussion that followed, it was agreed that the requested revisions should be minor (statistical analyses, and text adjustments and clarifications), with the idea of strengthening and clarifying some of the findings of the manuscript.

More specifically, the essential revisions are:

1. Provide the exact genomic coordinates of the lncRNA.

2. Perform statistical analyses (DEG analyses in particular).

3. Revise the naming of LNCSOX17: it is misleading as there is no functional connection with SOX17.

4. Use the reviewers’ comments to clarify the text, including pieces of additional information that are important to include in the text.

Please answer the reviewers’ comments in a point-by-point rebuttal letter. Additional experiments are not necessarily expected, but the reviewers felt that if one had to be done, it would be the test of the trans-acting function of LNCSOX17 through ectopic expression (with the appreciation that the length and variable splicing patterns of the lncRNA might make this difficult).

Reviewer #1 (Recommendations for the authors):

The points to be addressed before proceeding to final publication are listed below and divided in Major and Minor corrections. We hope the authors find them useful.

Corrections

Page 5: If the lncRNA has 40 foci/cell, it is unlikely that it is a cis-acting lncRNA. Indeed, does RNA-seq after depletion of LNCSOX17 lead to any changes in genes associated within the same TAD (besides SOX17 not changing)?

Page 7: it was not clear whether LncSox17 P(A)/P(A) cells (Supp Figure 3I-K) that also lose LNCSOX17 expression exhibit the same differentiation phenotype as cells depleted of the lncRNA using the dCAS9-KRAB-MeCP2 method. Could the authors confirm endoderm defects with IF (e.g. ECAD, NCAD, VIM) with this CRISPR PAS system? Do they also have the JNK signaling pathway activated in LncSox17 PAS cells? What about pancreatic progenitor marker genes?

At the moment the endoderm defects are confirmed only with single guide RNA using the dCAS9-KRAB-MeCP2 system to deplete LNCSOX17 as shown in Figure 4.

Page 6: why not target LNCSOX17 with LNA gapmers against different exons since it is a nuclear lncRNA (targeting the 5' and 3')? Although the authors mention that the dCAS9-KRAB-MeCP2 did not affect SOX17 (Supp 3c), did the authors check the expression of other neighboring genes upon dCAS9-KRAB-MeCP2? In addition, at least 2-3 guide RNAs should be used even with CRISPR to avoid any off-target effects and to ensure the phenotype is reproducible. Do stable cell lines expressing dCAS9-KRAB-MeCP2 differentiate normally before adding control guides and the guide against LNCSOX17? More importantly, does depletion of SOX17 affect endoderm differentiation? This should be stated clearly (Supp 3H).

Page 48, Figure 2C – The authors should consider adding RNA fractionation experiments to the panel to assess whether LNCSOX17 is associated to chromatin. The result will also give indications on the possible lncRNA molecular mechanisms (i.e. nucleoplasm vs chromatin enrichment).

Page 63, Supp Figure 4A – The prediction of the RNA secondary structure is superficial and it does not support the data presented later on. Moreover, although I understand that Vienna RNA has a maximum limit of nucleotides to input, I find useless that the prediction is run only on the spliced isoforms, representing only the 20% of the LNCSOX17 transcripts (the authors mention at Page 6 that the gene undergoes sloppy splicing). I suggest removing this part.

Page 8 and Page 63, Supp Figure 4D – It is not clear why the authors focus only on the hnRNP family, given the high number of putative candidates upon mass spectrometry analysis. Please clarify this point. A gene ontology analysis could help data interpretation.

Moreover, it is not clear whether the association with hnRNP proteins is somehow functional to guarantee endoderm differentiation. Thus, the phenotype of LNCSOX17 interacting proteins such as hnRNPU should be further investigated to reinforce the role of LNCSOX17 – hnRNPU in the context of endoderm differentiation. This is important since lncRNA-bound proteins identified in this manuscript are found to interact with many other lncRNAs. Also, the interaction between hnRNPU and LNCSOX17 should be confirmed with CLIP, not RIP. RIP qPCR is not an ideal method for RNA-protein interaction since it is known to lead to post-lysis reassociation of RBPs and RNAs. Please see PMID: 15388877.

Page 52, Figure 4B – RNA-seq in LNCSOX17-depleted cells at day 5 of endoderm differentiation reveals a high number of differentially expressed genes; however, Figure 2A shows that the lncRNA is already expressed 3 days after the start of endoderm differentiation. The authors should comment on this in the discussion (e.g. how the dynamics of such events are related to each other, if that is the case).

Page 9: How did the authors decide to focus on CXCR4 since it was introduced before RNA-seq analysis?

Reviewer #2 (Recommendations for the authors):

I wonder if LncSox17 could not be TCONS_00014700 (hg38:chr8:54379158-54380837; lincRNA and TUCP transcripts from UCSC) and have any functional relationship with ENSG00000286471 (GencodeV41).

Perhaps a simultaneous smFISH using both intronic and exonic probes for both Sox17 and LncSox17 would clarify this issue and perhaps reveal more subtle regulations between the two genes.

Reviewer #3 (Recommendations for the authors):

1. Figure 1A: It looks like the LNCSOX17 locus has fairly high levels of DNA methylation in PSC and then low methylation as the gene is activated in EN. This contrasts with SOX17 that is unmethylated in both cell types. LNCSOX17 can therefore perhaps provide an interesting and relatively atypical example of how a developmental gene is presumably not restricted by Polycomb-mediated processes but rather by a specific DNA demethylation event in the forming endoderm. Can the authors comment on this?

2. Figure Supplement 1A: The sequence conservation between human and mouse LNCSOX17 appears moderate in exon 1 and the enhancer region, but very low in the rest of the transcript. Also, the mouse RNA-seq track is very noisy, perhaps due to the scaling applied. From the figure provided, I found it difficult to tell whether the LNCSOX17 really was present and meaningfully expressed in these other species, and if it is, then what is the level of conservation of the transcripts and secondary structures between species?

3. I could not see a quantitative assessment of LNCSOX17 expression compared to SOX17 in human EN cells. Can the authors please show log2 RPKM values (or similar)?

4. Figure 3D: It is interesting that SOX17 still seems to be binding to the eSOX17 even in the CRISPRi cells when presumably the region is in a heterochromatinised state (Figure Supplement 3C). One prior possibility was that the transcription of LNCSOX17 was needed to keep the region accessible, engage in long distance interactions, and facilitate SOX17 occupancy. But this experiment suggests that is probably not the case. I felt this point could be made in the text.

5. Figure Supplement 3H: The deletion of SOX17 led to the failure to induce LNCSOX17, but it is unclear if that was due to failure of SOX17 to directly activate LNCSOX17 (as currently implied), or due to a general failure to make EN and therefore an indirect effect on LNCSOX17. Do the SOX17 knockout cells make EN cells?

6. The proposed model, that LNCSOX17 is not acting in cis and therefore potentially in trans, is well supported by several experiments. But one further strong piece of evidence for a trans-acting function would be if the LNCSOX17 mutant phenotype could be rescued by expressing LNCSOX17 ectopically. Have the authors tried this? With the appreciation that the long transcript and variable splicing might make this difficult.

7. I could not really follow why the transcript was called LNCSOX17. Presumably because it lies within the same TAD as SOX17? Nevertheless, I feel this name could be confusing because, as the authors show, LNCSOX17 does not have much to do with SOX17. The authors might want to consider proposing an alternative name for their transcript to avoid confusion over inferring a genetic or functional link to SOX17.

https://doi.org/10.7554/eLife.83077.sa1

Author response

Essential revisions:

1. Provide the exact genomic coordinates of the lncRNA.

We now provide this information in the legend of Figure 1.

2. Perform statistical analyses (DEG analyses in particular).

We now provide this information in the figure legend and methods section.

3. Revise the naming of LNCSOX17: it is misleading as there is no functional connection with SOX17.

We revised the name LNCSOX17 to T-REX17 (Transcript Regulating Endoderm and activated by soX17) and adapted the new naming in all text, figures and tables. Please note that we changed the name also in the point by point response to reviewer comments. Moreover, we had to adapt our title to include the “non-coding RNA” since the new name does not contain this information anymore.

4. Use the reviewers’ comments to clarify the text, including pieces of additional information that are important to include in the text.

We have done this throughout. Please, see below our point-by-point response to all reviewers’ comments.

Please answer the reviewers’ comments in a point-by-point rebuttal letter. Additional experiments are not necessarily expected, but the reviewers felt that if one had to be done, it would be the test of the trans-acting function of LNCSOX17 through ectopic expression (with the appreciation that the length and variable splicing patterns of the lncRNA might make this difficult).

We agree with this assessment and have tried this particular experiment, but ran into the largely expected difficulties. Please see below the response to reviewer 3 (point 6, page 26, lines 666-703 of this document).

Reviewer #1 (Recommendations for the authors):

The points to be addressed before proceeding to final publication are listed below and divided in Major and Minor corrections. We hope the authors find them useful.

Corrections

Page 5: If the lncRNA has 40 foci/cell, it is unlikely that it is a cis-acting lncRNA. Indeed, does RNA-seq after depletion of LNCSOX17 lead to any changes in genes associated within the same TAD (besides SOX17 not changing)?

As shown in Figure 2D and correctly pointed out by the reviewer, we observe a median of 40 foci/cell for T-REX17 in endoderm cells.

This type of foci distribution is unlikely to reflect a cis-acting function, as also acknowledged by the reviewer and noted in our discussion.

SOX17 is currently the only annotated gene present in the loop-domain as shown in our Figure 1A (more details on the TAD structure, its genetic and epigenetic landscape can also be found in one of our previous publications: PMID: 34385432).

Furthermore, our data show that T-REX17 (or its absence) does not influence SOX17 expression levels (see Figure 3 and Figure Supplement 3).

Taken together, all evidence to date argues against a cis-function within the domain.

In the revised version, we have expanded our analysis to also include genes present in the neighboring TAD and show that their expression in the absence of T-REX17 is largely unaffected.

We included a panel in Figure Supplement 3E and adjusted the text (page 7 lines 192-194) to describe this analysis:

“We also confirmed that unrelated genes present in neighboring domains were unaffected by the perturbation (Figure supplement 3E).”

Page 7: it was not clear whether LncSox17 P(A)/P(A) cells (Supp Figure 3I-K) that also lose LNCSOX17 expression exhibit the same differentiation phenotype as cells depleted of the lncRNA using the dCAS9-KRAB-MeCP2 method. Could the authors confirm endoderm defects with IF (e.g. ECAD, NCAD, VIM) with this CRISPR PAS system? Do they also have the JNK signaling pathway activated in LncSox17 PAS cells? What about pancreatic progenitor marker genes?

At the moment the endoderm defects are confirmed only with single guide RNA using the dCAS9-KRAB-MeCP2 system to deplete LNCSOX17 as shown in Figure 4.

We acknowledge that in our manuscript we characterized the cellular phenotype of T-REX17-depleted cells in greater depth only in the dCas9-KRAB-MeCP2 cell line (Figure 4). Nevertheless, the p(A)/p(A) background line (which represses T-REX17 without inhibiting locus transcription) displays a similar phenotypic outcome, which strongly supports the idea that the observed phenotype in the dCas9 line is not due to an off-target effect of dCas9-KRAB-MeCP2 but rather hints at a functional role for T-REX17.

Briefly, we observed reduced levels of CXCR4 concomitant with preserved expression of SOX17 (Figure Supplement 5A), reduced expression of a set of endoderm specific genes (Figure Supplement 5G) and upregulation of a pluripotency-associated factor like NANOG (Figure Supplement 5G).

Importantly, all these phenotypes, and specifically the endoderm-specific DEGs, are also observed in the dCas9-KRAB-MeCP2 line (Figure Supplement 5H), reinforcing the interpretation of a convergent phenotypic alteration caused by T-REX17 loss.

In the text, we refer to the phenotypes observed in both the dCas9-KRAB-MeCP2 repression and the p(A)/p(A) (pages 9-10, lines 249-252 and 261-267).

Page 6: why not target LNCSOX17 with LNA gapmers against different exons since it is a nuclear lncRNA (targeting the 5' and 3')? Although the authors mention that the dCAS9-KRAB-MeCP2 did not affect SOX17 (Supp 3c), did the authors check the expression of other neighboring genes upon dCAS9-KRAB-MeCP2? In addition, at least 2-3 guide RNAs should be used even with CRISPR to avoid any off-target effects and to ensure the phenotype is reproducible. Do stable cell lines expressing dCAS9-KRAB-MeCP2 differentiate normally before adding control guides and the guide against LNCSOX17? More importantly, does depletion of SOX17 affect endoderm differentiation? This should be stated clearly (Supp 3H).

The reviewer is correct that interference with LNA or siRNA/shRNA remains a frequently used approach to study gene function in mammalian cells. However, nuclear transcripts, such as many long non-coding RNAs (lncRNAs), may be more difficult to target in this way (PMID: 22955988, PMID: 24296535). Moreover, such an approach would require transient transfection of differentiating endoderm cells, with which we have had little success to date. As a result, we decided to focus on alternative methods such as dCas9-KRAB-MeCP2-based repression and an early transcriptional termination cassette.

Page 48, Figure 2C – The authors should consider adding RNA fractionation experiments to the panel to assess whether LNCSOX17 is associated to chromatin. The result will also give indications on the possible lncRNA molecular mechanisms (i.e. nucleoplasm vs chromatin enrichment).

As shown above, in the revised version, we have expanded our analysis to also include genes present in the neighboring TAD and show that their expression in the absence of T-REX17 is largely unaffected.

We included this panel in Figure Supplement 3E and adjusted the text (page 7 lines 192-194) to describe this analysis:

“We also confirmed that unrelated genes present in neighboring domains were unaffected by the perturbation (Figure supplement 3E).”

Page 63, Supp Figure 4A – The prediction of the RNA secondary structure is superficial and it does not support the data presented later on. Moreover, although I understand that Vienna RNA has a maximum limit of nucleotides to input, I find useless that the prediction is run only on the spliced isoforms, representing only the 20% of the LNCSOX17 transcripts (the authors mention at Page 6 that the gene undergoes sloppy splicing). I suggest removing this part.

We agree and removed this analysis from the figure, and adjusted the main text, methods and figure legend accordingly (pages 8, 44 and 66, lines 222-226, 1109-1113 and 1432-1433).

Page 8 and Page 63, Supp Figure 4D – It is not clear why the authors focus only on the hnRNP family, given the high number of putative candidates upon mass spectrometry analysis. Please clarify this point. A gene ontology analysis could help data interpretation.

The reviewer is correct and a better rationale should be provided. We also tried performing the GO analysis using the enriched candidate list, but could not find any significant terms with obvious links to endoderm development or developmental progression. Most terms point to rather general cellular functions (Author response image 1):

Author response image 1
Gene ontology (GO) analysis of the candidate T-REX17 protein interactors as measured by RNA pulldown followed by mass spectrometry.

Blue bars represent significant (FDR<0.05) biological process identified. The analysis was performed using WEB-based GEne SeT AnaLysis Toolkit (WebGESTALT).

As for the HNRNPs, these proteins have been well characterized as RNA-interacting factors, often associating in multiprotein complexes involved in a variety of cellular processes (PMID: 24463464; PMID: 28636939; PMID: 20833368; PMID: 22325991; PMID: 22574288; PMID: 31350345; PMID: 31892844; PMID: 25406515). We therefore decided to independently validate the MS results with our RIP experiment (Figure Supplement 4E,F).

We now provide the full list of genes and their ranked enrichment in Table 4 to help manual inspection and further investigation of the complete list.

Moreover, it is not clear whether the association with hnRNP proteins is somehow functional to guarantee endoderm differentiation. Thus, the phenotype of LNCSOX17 interacting proteins such as hnRNPU should be further investigated to reinforce the role of LNCSOX17 – hnRNPU in the context of endoderm differentiation. This is important since lncRNA-bound proteins identified in this manuscript are found to interact with many other lncRNAs.

We agree with the reviewer that we can’t establish a functional link between T-REX17-HNRNPU interaction and the observed phenotypes with the current available information. Therefore, we decided to rephrase the paragraph about the HNRNPU interaction to reflect this aspect (page 8, lines 221-242).

Also, the interaction between hnRNPU and LNCSOX17 should be confirmed with CLIP, not RIP. RIP qPCR is not an ideal method for RNA-protein interaction since it is known to lead to post-lysis reassociation of RBPs and RNAs. Please see PMID: 15388877.

The reviewer is correct that CLIP may be preferred in this context, though the RIP result shown in Figure Supplement 4F is convincing to us.

Page 52, Figure 4B – RNA-seq in LNCSOX17-depleted cells at day 5 of endoderm differentiation reveals a high number of differentially expressed genes; however, Figure 2A shows that the lncRNA is already expressed 3 days after the start of endoderm differentiation. The authors should comment on this in the discussion (e.g. how the dynamics of such events are related to each other, if that is the case).

As correctly stated by the reviewer, the onset of T-REX17 expression is first detectable by day 3 of endoderm differentiation in WT cells (Figure 2A; Figure 3B; Figure Supplement 5F).

Page 9: How did the authors decide to focus on CXCR4 since it was introduced before RNA-seq analysis?

We made this choice based on the literature (PMID: 16258519; PMID: 25693565; PMID: 34385432; PMID: 25843708; PMID: 27705785; PMID: 21358635; PMID: 24412311). Briefly, C-X-C chemokine receptor type 4 (CXCR4) is a chemokine receptor expressed by various cell types during development and homeostasis. During the early stages of human differentiation and exit from pluripotency, its expression is specifically confined in definitive endoderm, and for this reason this receptor has been widely adopted by the community as a marker for successful differentiation into the endoderm lineage.

Reviewer #2 (Recommendations for the authors):

I wonder if LncSox17 could not be TCONS_00014700 (hg38:chr8:54379158-54380837; lincRNA and TUCP transcripts from UCSC) and have any functional relationship with ENSG00000286471 (GencodeV41).

TCONS_00014700 on chr8:54379158-54380837 is listed in the lincRNA and TUCP transcripts dataset (PMID: 21890647) cataloged by the Rinn Lab. It is a 1.6kb long transcript present within the SOX17 loop domain, transcribed on the (+)-strand and characterized by the presence of 2 exons. It is localized in proximity of the ENSG00000286471 transcript, but far away from the T-REX17 genomic location (hg19, chr8:55117776-55140806). The difference in size (22kb vs 1.6kb), synthesis strand ((-) vs (+)), isoform configurations and genomic location (see above for T-REX17 exact genomic location) suggest that these are two independent RNA molecules:

Author response image 2

John Rinn (co-author on this manuscript) also had a closer look at this relationship and confirmed that the two transcripts do not occupy the same location. Moreover, neither ENSG00000286471 nor TCONS_00014700 are expressed in definitive endoderm where T-REX17 is active (Figure 1A, we do not see any transcription at the locus except for T-REX17 and SOX17), therefore we are confident that they are functionally independent elements.

Perhaps a simultaneous smFISH using both intronic and exonic probes for both Sox17 and LncSox17 would clarify this issue and perhaps reveal more subtle regulations between the two genes.

The experiments proposed by the reviewer could be helpful in clarifying a possible cis-regulative function for T-REX17. In Figure 1D we have shown that the two genomic locations interact in endoderm cells, since an active SOX17 enhancer is present in close proximity to the T-REX17 promoter.

Based on this analysis, we would expect that smFISH with intronic probes would highlight co-localization of the two nascent transcripts. Our current smFISH already provides some hints about the localization of the nascent and mature T-REX17 transcripts, since the site of transcription shows an enriched signal as compared to the other foci (indicated by red arrows and detailed in the Figure 2C legend).

Exon probes would in contrast have no effect on the smFISH of T-REX17 and would localize SOX17 prevalently in the cytoplasm (if using probes spanning exon-exon junctions). The foci distribution pattern of T-REX17 would strongly indicate an activity within the nuclear compartment but at different genomic locations far from its transcriptional start site (also supported by our perturbation experiments in Figure 3 showing intact chromatin, expression and SOX17 occupancy in T-REX17 depleted cells).

Reviewer #3 (Recommendations for the authors):

1. Figure 1A: It looks like the LNCSOX17 locus has fairly high levels of DNA methylation in PSC and then low methylation as the gene is activated in EN. This contrasts with SOX17 that is unmethylated in both cell types. LNCSOX17 can therefore perhaps provide an interesting and relatively atypical example of how a developmental gene is presumably not restricted by Polycomb-mediated processes but rather by a specific DNA demethylation event in the forming endoderm. Can the authors comment on this?

The reviewer raises a very interesting point regarding the epigenetic control of developmental genes within the SOX17 loop domain. SOX17, as most developmental regulators, contains CpG islands (CGIs) within its promoter which is always unmethylated irrespective of its expression status. As the reviewer suggests, its expression is rather controlled by Polycomb.

As for the T-REX17 locus, it is worth highlighting that the large and tissue specific differentially methylated region (DMR) mainly overlaps with the SOX17 active enhancer (eSOX17). Only few CpGs at the tail of the DMR overlap with the T-REX17 promoter region.

Some developmental cis-regulatory elements are known to be regulated by DNA methylation, and loss of this modification is correlated with transcription factors binding and chromatin opening (PMID: 31422875, PMID: 25693565).

2. Figure Supplement 1A: The sequence conservation between human and mouse LNCSOX17 appears moderate in exon 1 and the enhancer region, but very low in the rest of the transcript.

The presence of orthologous transcripts in other species provides initial support for a functional role of lncRNAs. The reviewer is right that most of the sequence conservation is localized in the first portion of the transcript. As lncRNAs often exert their function through their secondary structure, it may not be entirely unexpected to see limited sequence conservation (PMID: 35098341). Other parameters such as synteny and positional conservation of neighboring genes (see T-REX17 and its position as compared to SOX17 in the analyzed clades) are emerging as more appropriate analyses for identifying orthologous non-coding transcripts across species (PMID: 33563213; PMID: 31450588; PMID: 31247106).

Also, the mouse RNA-seq track is very noisy, perhaps due to the scaling applied. From the figure provided, I found it difficult to tell whether the LNCSOX17 really was present and meaningfully expressed in these other species, and if it is then what is the level of conservation of the transcripts and secondary structures between species?

We agree with the reviewer that it is not trivial to establish a direct parallel between the transcripts in various species. We are of course limited here by the scarcity of high-quality in vivo definitive endoderm RNA-seq datasets from other species, including mouse.

As the reviewer also hints in the last part of the comment, the secondary structure of the RNA might be more relevant than the actual sequence for inferring functional conservation. Following suggestions from the first reviewer (see response above, lines 219-225 of this document) and given the unreliable secondary structure predicted for T-REX17, we have now decided to remove the secondary structure analysis from our manuscript and therefore will not be able to compare it to the mouse counterpart.

To highlight the limitation of this analysis, we rephrased the main text (page 4, lines 106-109) to reflect the reviewer's comment:

“Although the sequence conservation to the mouse is only modest, we detect the presence of a distal SOX17 transcript in a number of vertebrates based on stage- and tissue-matched embryonic data (Figure supplement 1A, left).”

3. I could not see a quantitative assessment of LNCSOX17 expression compared to SOX17 in human EN cells. Can the authors please show log2 RPKM values (or similar)?

In the current version of the manuscript, we use different approaches to measure the expression of SOX17 and T-REX17 during endoderm formation. With qPCR (carried out using TaqMan probes and therefore quantitatively comparable across targets), we show the levels of the two transcripts at 24 h resolution during in vitro differentiation (Figure 2A).

Moreover, we use a dataset from the NIH Roadmap Epigenomics Mapping Consortium (PMID: 20944595) to calculate the log2TPM for the two transcripts in stem cell-derived human endoderm cells and show this result in Figure 2B.

We now re-analyzed our internal day 5 endoderm RNA-seq dataset and show in Author response image 3 the log2TPM values for both SOX17 and T-REX17.

Author response image 3

Given that this result is in line with the previous quantitative analysis of SOX17 and T-REX17 gene expression (Figure 2A; Figure 2B; Figure Supplement 3J), we include this panel for the reviewers but do not include it in the revised manuscript. Moreover, TPM values for all our time-course RNA-seq analysis can be found in Supplementary Table 2.
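As an illustration of how log2TPM values of this kind are typically derived from read counts, a minimal Python sketch follows. It is not the pipeline used for the analyses above: the read counts, transcript lengths, the `OTHER` placeholder transcript and the pseudocount of 1 are all assumptions made for the example.

```python
import math


def tpm(counts, lengths_kb):
    """Convert raw read counts to TPM, given transcript lengths in kilobases."""
    # Step 1: length-normalize each transcript (reads per kilobase).
    rpk = {gene: counts[gene] / lengths_kb[gene] for gene in counts}
    # Step 2: scale so that all values in the sample sum to one million.
    per_million = sum(rpk.values()) / 1e6
    return {gene: value / per_million for gene, value in rpk.items()}


def log2_tpm(tpm_values, pseudocount=1.0):
    """log2-transform TPM values; the pseudocount (assumed) avoids log2(0)."""
    return {gene: math.log2(v + pseudocount) for gene, v in tpm_values.items()}


# Hypothetical counts and lengths, for illustration only.
counts = {"SOX17": 4000, "T-REX17": 600, "OTHER": 95400}
lengths_kb = {"SOX17": 2.0, "T-REX17": 3.0, "OTHER": 47.7}

values = tpm(counts, lengths_kb)
for gene, value in log2_tpm(values).items():
    print(f"{gene}: log2(TPM + 1) = {value:.2f}")
```

Because TPM normalizes for transcript length before scaling to library size, values for two transcripts within the same sample can be compared directly, which is the comparison the reviewer asked for.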

4. Figure 3D: It is interesting that SOX17 still seems to be binding to the eSOX17 even in the CRISPRi cells when presumably the region is in a heterochromatinised state (Figure Supplement 3C). One prior possibility was that the transcription of LNCSOX17 was needed to keep the region accessible, engage in long distance interactions, and facilitate SOX17 occupancy. But this experiment suggests that is probably not the case. I felt this point could be made in the text.

This observation puzzled us as well for quite a long time, and motivated us to introduce the pA/pA construct to preserve local transcription while depleting T-REX17. Neither blocking T-REX17 transcription (dCas9-KRAB-MeCP2 repression) nor preventing the generation of its transcript (pA/pA) affects SOX17 expression. This indicates that eSOX17 can indeed be activated (and theoretically bound by SOX17) independently of T-REX17 function. Pioneer factor activity of SOX17 or of other locally bound factors (such as FOXA2) could plausibly explain its ability to bind eSOX17 despite the induced heterochromatinization. We have now updated the main text to specify this interesting point (page 7, lines 195-199):

“Next, we performed SOX17 Chromatin Immunoprecipitation sequencing (ChIP-seq) and show that SOX17 occupancy at the SOX17 locus (including at its induced heterochromatic distal enhancer (eSOX17)) as well as genome-wide is largely unaffected by the loss of T-REX17 (Figure 3D,F, Figure supplement 3F).”

5. Figure Supplement 3H: The deletion of SOX17 led to the failure to induce LNCSOX17, but it is unclear if that was due to failure of SOX17 to directly activate LNCSOX17 (as currently implied), or due to a general failure to make EN and therefore an indirect effect on LNCSOX17.

The reviewer raises a relevant point here. Given the failure of SOX17-KO cells to properly differentiate into endoderm (see response at lines 181-203 of this document), it is challenging to distinguish between the two proposed scenarios for T-REX17 activation (direct activation by SOX17 vs general failure in differentiation). Nonetheless, we note:

1) T-REX17 starts to be expressed after SOX17 (Figure 2A)

2) SOX17 binds T-REX17 promoter (Figure 3G)

3) SOX17-KO cells fail to activate T-REX17 (Figure Supplement 3J)

In the revised manuscript we show that SOX17-KO cells fail to properly differentiate into endoderm and explicitly highlight it (page 7, lines 202-207).

“Notably, homozygous knockout cells fail to induce GATA4 expression and show no activation of T-REX17 (Figure supplement 3J).”

Do the SOX17 knockout cells make EN cells?

Depletion of SOX17 clearly impacts endoderm differentiation.

In the revised manuscript we now show that a SOX17 homozygous KO cell line fails to induce the expression of GATA4 (and other marker genes) upon endoderm differentiation (Figure Supplement 3J).

Moreover, in the current manuscript we show that SOX17 enhancer disruption results in delayed SOX17 expression during endoderm differentiation and reduced CXCR4 levels (Figure Supplement 1F).

Based on published literature, SOX17 misexpression causes major differentiation defects, as we (PMID: 34385432) and others (PMID: 11973269, 21305474) have shown.

Overall, based on our results and the reported role of SOX17 in the literature it appears reasonable to conclude that SOX17 is necessary for proper endoderm differentiation. Following the reviewer’s suggestion, we now state this more clearly in the main text (page 7, lines 202-207).

“Notably, homozygous knockout cells fail to induce GATA4 expression and show no activation of T-REX17 (Figure supplement 3J).”

6. The proposed model, that LNCSOX17 is not acting in cis and therefore potentially in trans, is well supported by several experiments. But one further strong piece of evidence for a trans-acting function would be if the LNCSOX17 mutant phenotype could be rescued by expressing LNCSOX17 ectopically. Have the authors tried this? With the appreciation that the long transcript and variable splicing might make this difficult.

The reviewer is correct that our data imply that T-REX17 is a potential trans-acting molecule, but we did not make a major point out of that as our results mostly argue against a cis role and don’t provide direct evidence for a trans mechanism.

We tried to strengthen this point further, and cloned the full-length T-REX17 (~22kb) into a PiggyBac vector and delivered it (~31kb) into T-REX17 depleted cells. As a control, we also created an isogenic line transfected with an empty vector only (PB only) (see panel A in Author response image 4). We generated stably integrated ectopic T-REX17 expressing hiPSCs. After targeting, we derived four independent isogenic iPSC clones, one for PB only integrations and three clones for PB T-REX17 integrations (PB T-REX17 #1, #2, #3). We then verified the expression levels of T-REX17 in these clones and found that T-REX17 is already expressed in hiPSCs at levels comparable to the WT endoderm (see panel C in Author response image 4). As indicated by CXCR4-FACS at day 5 of EN-differentiation, we unfortunately could not observe rescue of CXCR4 protein levels (see panel B in Author response image 4).

There are of course many possible reasons why this complex rescue experiment did not work. For instance, the ideal experimental setup would have included the transient expression of T-REX17 starting at day 3 of EN differentiation (Figure 2A). However, transfection of adherent human EN-differentiating cells with a ~31kb construct is extremely challenging: various chemically based approaches have been tried without success, including Lipo Stem (Thermo), Lipo3000 (Thermo), LipoLTX (Thermo) and FuGENE (Promega). Other factors beyond temporal control may include expression levels, co-transcriptional regulation and localization. As a result, we cannot use this experiment to further comment on the trans-function of T-REX17 and only provide these attempts as Author response image 4.

Author response image 4
PiggyBAC (PB) integrated constitutive T-REX17 rescue.

(A) Schematic of the ectopic PB T-REX17 construct. Lines indicate randomly integrated sequence context for empty backbone (PB only) or T-REX17 (PB T-REX17). (B) FACS histograms showing percentages of CXCR4+ cells during EN differentiation of sgT-REX17 dCas9-KRAB-MeCP2 cells of PB only or PB T-REX17 clones. Clone numbers are indicated. Sample sizes are normalized to 9,000 cells/sample. Two independent experiments are displayed. (C) qRT-PCR showing NANOG, SOX17 and T-REX17 expression in PSCs and EN cells of sgT-REX17 dCas9-KRAB-MeCP2 cells integrated with either PB only or PB T-REX17. Clone numbers are indicated. Fold change is calculated relative to the 18S housekeeping gene. Bars indicate the means; two independent experiments are represented. T-REX17 wild type levels are indicated by the dashed line. Note that ectopic T-REX17 is constantly expressed throughout differentiation.
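
One common way to express a qPCR signal relative to a housekeeping gene such as 18S is the 2^-ΔCt transformation. The sketch below is illustrative only: the caption does not state the exact formula used, the function name is hypothetical, and the Ct values are invented for the example.

```python
def relative_expression(ct_target, ct_reference):
    """Expression of a target relative to a reference gene via 2^-(delta Ct).

    Assumes roughly 100% amplification efficiency for both assays.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values, for illustration only.
ct_18s = 10.0
ct_trex17_psc = 32.0  # later cycle threshold means lower abundance
ct_trex17_en = 24.0

psc = relative_expression(ct_trex17_psc, ct_18s)
en = relative_expression(ct_trex17_en, ct_18s)
print(f"EN vs PSC fold difference: {en / psc:.0f}")
```

Each cycle of difference in Ct corresponds to a two-fold difference in template under this assumption, so a lower Ct for the target at a fixed reference Ct means higher relative expression.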

7. I could not really follow why the transcript was called LNCSOX17. Presumably because it lies within the same TAD as SOX17? Nevertheless, I feel this name could be confusing because, as the authors show, LNCSOX17 does not have much to do with SOX17. The authors might want to consider proposing an alternative name for their transcript to avoid confusion over inferring a genetic or functional link to SOX17.

The name LNCSOX17 derives from the fact that the lncRNA overlaps the SOX17 enhancer (eSOX17), follows a similar gene expression pattern during endoderm formation, is localized within the same loop domain (as the only other gene), and depends on SOX17 function for its expression. Despite this series of crucial observations, we agree with the reviewer and the editors that, given that most lncRNAs overlapping with developmental enhancers regulate their cognate genes in cis, the current name might be misleading.

Therefore, we changed the transcript name to T-REX17 (Transcript Regulating Endoderm and activated by soX17) to still include some information about where it is transcribed, how it is regulated and its necessity for endoderm formation. We adapted the text, figures and tables accordingly.

https://doi.org/10.7554/eLife.83077.sa2

Article and author information

Author details

  1. Alexandro Landshammer

    1. Department of Genome Regulation, Max Planck Institute for Molecular Genetics, Berlin, Germany
    2. Institute of Chemistry and Biochemistry, Freie Universität Berlin, Berlin, Germany
    Contribution
    Conceptualization, Data curation, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing
    Contributed equally with
    Adriano Bolondi
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-5367-3303
  2. Adriano Bolondi

    1. Department of Genome Regulation, Max Planck Institute for Molecular Genetics, Berlin, Germany
    2. Institute of Chemistry and Biochemistry, Freie Universität Berlin, Berlin, Germany
    Contribution
    Conceptualization, Data curation, Formal analysis, Supervision, Validation, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing
    Contributed equally with
    Alexandro Landshammer
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-1096-9435
  3. Helene Kretzmer

    Department of Genome Regulation, Max Planck Institute for Molecular Genetics, Berlin, Germany
    Contribution
    Resources, Data curation, Formal analysis, Visualization, Methodology
    Competing interests
    No competing interests declared
  4. Christian Much

    Department of Biochemistry, University of Colorado Boulder and BioFrontiers Institute, Boulder, United States
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  5. René Buschow

    Max Planck Institute for Molecular Genetics, Microscopy Core Facility, Berlin, Germany
    Contribution
    Formal analysis
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-9800-2578
  6. Alina Rose

    Helmholtz Institute for Metabolic, Obesity and Vascular Research, Leipzig, Germany
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  7. Hua-Jun Wu

    1. Department of Data Science, Dana-Farber Cancer Institute, Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, United States
    2. Center for Precision Medicine Multi-Omics Research, School of Basic Medical Sciences, Peking University Health Science Center and Peking University Cancer Hospital and Institute, Beijing, China
    Contribution
    Formal analysis
    Competing interests
    No competing interests declared
  8. Sebastian D Mackowiak

    Department of Genome Regulation, Max Planck Institute for Molecular Genetics, Berlin, Germany
    Contribution
    Formal analysis
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-1673-5389
  9. Bjoern Braendl

    Department of Genome Regulation, Max Planck Institute for Molecular Genetics, Berlin, Germany
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  10. Pay Giesselmann

    Department of Genome Regulation, Max Planck Institute for Molecular Genetics, Berlin, Germany
    Contribution
    Formal analysis
    Competing interests
    No competing interests declared
  11. Rosaria Tornisiello

    Department of Genome Regulation, Max Planck Institute for Molecular Genetics, Berlin, Germany
    Contribution
    Formal analysis
    Competing interests
    No competing interests declared
  12. Krishna Mohan Parsi

    Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, United States
    Contribution
    Investigation
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-6002-3816
  13. Jack Huey

    Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, United States
    Contribution
    Formal analysis
    Competing interests
    No competing interests declared
  14. Thorsten Mielke

    Max Planck Institute for Molecular Genetics, Microscopy Core Facility, Berlin, Germany
    Contribution
    Supervision
    Competing interests
    No competing interests declared
  15. David Meierhofer

    Max Planck Institute for Molecular Genetics, Mass Spectrometry Core Facility, Berlin, Germany
    Contribution
    Formal analysis, Investigation, Methodology
    Competing interests
    No competing interests declared
  16. René Maehr

    1. Center for Precision Medicine Multi-Omics Research, School of Basic Medical Sciences, Peking University Health Science Center and Peking University Cancer Hospital and Institute, Beijing, China
    2. Diabetes Center of Excellence, University of Massachusetts Medical School, Worcester, United States
    Contribution
    Supervision
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-9520-3382
  17. Denes Hnisz

    Department of Genome Regulation, Max Planck Institute for Molecular Genetics, Berlin, Germany
    Contribution
    Supervision
    Competing interests
    No competing interests declared
  18. Franziska Michor

    1. Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, United States
    2. Broad Institute of MIT and Harvard, Cambridge, United States
    3. Department of Data Science, Dana-Farber Cancer Institute, and Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, United States
    4. The Ludwig Center at Harvard, Boston, MA 02215, USA, and Center for Cancer Evolution, Dana-Farber Cancer Institute, Boston, United States
    Contribution
    Supervision
    Competing interests
    No competing interests declared
  19. John L Rinn

    Department of Biochemistry, University of Colorado Boulder and BioFrontiers Institute, Boulder, United States
    Contribution
    Supervision
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-7231-7539
  20. Alexander Meissner

    1. Department of Genome Regulation, Max Planck Institute for Molecular Genetics, Berlin, Germany
    2. Institute of Chemistry and Biochemistry, Freie Universität Berlin, Berlin, Germany
    3. Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, United States
    4. Broad Institute of MIT and Harvard, Cambridge, United States
    Contribution
    Conceptualization, Supervision, Funding acquisition, Writing – review and editing
    For correspondence
    meissner@molgen.mpg.de
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-8646-7469

Funding

National Institutes of Health (P01GM099117)

  • John L Rinn
  • Alexander Meissner

National Institutes of Health (DP3K111898)

  • René Maehr
  • Alexander Meissner

Max Planck Society

  • Alexander Meissner

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication. Open access funding provided by Max Planck Society.

Acknowledgements

We are grateful for the support and feedback received from all Meissner Lab members during the project development. We give special recognition to T Aktas and IA Ilik for experimental advice, and we thank AA Hernandez for the T-REX17 naming; B Lukaszewska-McGreal for support with the mass spectrometry experiment; RD Acemel for suggestions regarding the virtual 4C analysis; I Ulitsky for advice on the conservation analysis; B Fauler for help with microscopy; D Ibrahim for fruitful feedback; and the MPIMG Seq-Core for NGS support. This work was funded by the NIH (DP3K111898 to R.M. and A.M.; P01GM099117 to J.R. and A.M.) and the Max Planck Society.

Senior Editor

  1. Marianne E Bronner, California Institute of Technology, United States

Reviewing Editor

  1. Deborah Bourc'his, Institut Curie, France

Reviewers

  1. Pablo Navarro, Institut Pasteur, France
  2. Peter J Rugg-Gunn, The Babraham Institute, United Kingdom

Version history

  1. Received: August 30, 2022
  2. Preprint posted: September 13, 2022
  3. Accepted: January 6, 2023
  4. Version of Record published: January 31, 2023 (version 1)
  5. Version of Record updated: April 17, 2023 (version 2)

Copyright

© 2023, Landshammer, Bolondi et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Alexandro Landshammer
  2. Adriano Bolondi
  3. Helene Kretzmer
  4. Christian Much
  5. René Buschow
  6. Alina Rose
  7. Hua-Jun Wu
  8. Sebastian D Mackowiak
  9. Bjoern Braendl
  10. Pay Giesselmann
  11. Rosaria Tornisiello
  12. Krishna Mohan Parsi
  13. Jack Huey
  14. Thorsten Mielke
  15. David Meierhofer
  16. René Maehr
  17. Denes Hnisz
  18. Franziska Michor
  19. John L Rinn
  20. Alexander Meissner
(2023)
T-REX17 is a transiently expressed non-coding RNA essential for human endoderm formation
eLife 12:e83077.
https://doi.org/10.7554/eLife.83077
