END-seq successfully captures the 5’ termini of human and mouse chromosomes.

A. Schematic representation of the END-seq method applied to DSBs (left) and natural chromosome ends (right). Blunted dsDNA ends are ligated to biotinylated adaptors, purified, and subsequently sequenced by NGS. Reads originating from DSBs align to either side of the break (left), while telomeric reads align only to the 5’ C-rich template (red). B. Number of reads containing at least 4 consecutive telomeric repeats corresponding to the G-rich (green) and C-rich (red) strands in HeLa and RPE1 cell lines. Telomeric reads are normalized to the total number of reads identified by END-seq and are represented as reads per million (RPM). C. Sequence logo representing the conservation (bits) of the last 6 nucleotides at the 5’ end of chromosomes in HeLa and RPE1 cells. D. Distribution of the three most frequent telomeric repeats across human cell lines (orange), mouse cells (yellow) and a canine cell line (purple). E. Schematic representation of T7-mediated randomization of the 5’ termini. F. Percentage of telomeric reads that have the indicated sequence as a 5’ end. The following sequences (CCCAAT-5’, TCCCAA-5’, and ATCCCA-5’) are grouped and labelled as “Rest”. Prior to the END-seq protocol, cells were either left untreated (NT) or treated with T7. Kullback-Leibler divergence (KL divergence) analysis was used to compare the distributions of the individual conditions. KL divergence between 0.25 and 0.375 is represented by 2 dots (••). G. Percentages of telomeric reads are displayed as described in F. Cells expressing either a non-targeting shRNA (shScr) or an shRNA targeting POT1 (shPOT1) were harvested 3 days post induction and analyzed by END-seq. KL divergence between 0.125 and 0.25 is represented by 1 dot (•).
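The read counting and RPM normalization described for panel B can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the use of a regular expression, and the fixed TTAGGG/CCCTAA repeat frame are assumptions made for the example.

```python
import re

# Assumption: a read is "telomeric" if it contains at least 4 consecutive
# copies of the repeat; the exact matching rule used in the study is not stated.
G_RICH = re.compile(r"(?:TTAGGG){4,}")
C_RICH = re.compile(r"(?:CCCTAA){4,}")


def classify_read(seq):
    """Return 'C' for a C-rich telomeric read, 'G' for a G-rich one, else None."""
    seq = seq.upper()
    if C_RICH.search(seq):
        return "C"
    if G_RICH.search(seq):
        return "G"
    return None


def telomeric_rpm(reads):
    """Count telomeric reads per strand and scale to reads per million (RPM)."""
    counts = {"G": 0, "C": 0}
    total = 0
    for seq in reads:
        total += 1
        label = classify_read(seq)
        if label is not None:
            counts[label] += 1
    if total == 0:
        return {"G": 0.0, "C": 0.0}
    # RPM = telomeric reads / total reads * 1,000,000
    return {strand: n * 1_000_000 / total for strand, n in counts.items()}
```

Here `reads` stands for any iterable of read sequences; in practice the total would be the number of reads identified by END-seq in the sample.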

POT1a is the key regulator of 5’ telomere end processing in mice, distinct from POT1b.

A. Comparison of the effect of single and double depletion of POT1a and POT1b on the percentage of telomeric reads with the indicated 5’ end sequences. The following sequences (CCCAAT-5’, TCCCAA-5’, and ATCCCA-5’) are grouped and labelled as “Rest”. Kullback-Leibler divergence (KL divergence) analysis was used to compare the distributions across conditions. KL divergence between 0.25 and 0.375 is represented by 2 dots (••). B. Sequence logo illustrating the conservation (in bits) of the last 6 nucleotides at the 5’ end of chromosomes in POT1-proficient, POT1a-depleted, POT1b-depleted, and POT1a/POT1b double-depleted mESCs.
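The KL divergence comparison and its dot annotation can be illustrated with a short sketch. The legends do not state the logarithm base, the direction of the comparison, or how the bin edges are handled, so the choices below (log base 2, treated versus control, a small pseudocount, inclusive lower edges) are assumptions, as are the function names.

```python
import math


def kl_divergence(p, q, pseudocount=1e-9):
    """KL divergence D(P || Q) in bits between two 5' end distributions.

    p and q map end sequences to fractions or counts. The pseudocount is an
    assumption added to avoid division by zero for unobserved categories.
    """
    keys = sorted(set(p) | set(q))
    p_vals = [p.get(k, 0) + pseudocount for k in keys]
    q_vals = [q.get(k, 0) + pseudocount for k in keys]
    p_tot, q_tot = sum(p_vals), sum(q_vals)
    return sum(
        (pi / p_tot) * math.log2((pi / p_tot) / (qi / q_tot))
        for pi, qi in zip(p_vals, q_vals)
    )


def kl_dots(kl):
    """Map a KL divergence to the dot annotation described in the legends."""
    if kl > 0.375:
        return "•••"
    if kl >= 0.25:
        return "••"
    if kl >= 0.125:
        return "•"
    return ""
```

For example, `kl_dots(kl_divergence(depleted, control))` would return the annotation for one condition, where `depleted` and `control` are hypothetical dictionaries of 5’ end percentages.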

ALT cells have precise 5’ termini.

A-C. Number of telomeric reads containing at least 4 consecutive telomeric repeats corresponding to the G-rich (green) and C-rich (red) strands in the ALT-positive G292, SAOS2 and U2OS cell lines. Telomeric reads are normalized to the total number of reads identified by END-seq and are represented as reads per million (RPM). D. Sequence logo representing the conservation (bits) of the last 6 nucleotides at the 5’ end of chromosomes in ALT cells. The fraction of reads with CCAATC as the 5’ end is displayed. E. Cells expressing either a non-targeting shRNA (shScr) or an shRNA targeting POT1 (shPOT1-1) were stained for 53BP1 (red) and telomeric DNA (TTAGGG, green). F. Quantification of the data shown in panel E. Graphs indicate the percentage of cells that have at least 5 telomere dysfunction-induced foci (TIFs), defined as 53BP1 foci co-localizing with telomeres. G. Percentage of telomeric reads that have the indicated sequence as a 5’ end. The following sequences (CCCAAT-5’, TCCCAA-5’, and ATCCCA-5’) are grouped and labelled as “Rest”. Cells expressing either a non-targeting shRNA (shScr) or an shRNA targeting POT1 (shPOT1-1 and shPOT1-2) were harvested 3 days post induction and analyzed by END-seq. Kullback-Leibler divergence (KL divergence) analysis was used to compare the distributions of the individual conditions. KL divergence between 0.25 and 0.375 is represented by 2 dots (••); KL divergence greater than 0.375 is represented by 3 dots (•••).
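The per-sequence percentages with the “Rest” grouping can be computed along these lines. This is a sketch under stated assumptions: the input is taken to be a list of 5’ end hexamers already written in the legend's notation (for example CCAATC-5’ given as "CCAATC"), and the function name is hypothetical.

```python
from collections import Counter

# The three end sequences that the legends group under "Rest".
REST = {"CCCAAT", "TCCCAA", "ATCCCA"}


def end_percentages(end_hexamers):
    """Return the percentage of telomeric reads for each 5' end category."""
    counts = Counter("Rest" if h in REST else h for h in end_hexamers)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {end: 100 * n / total for end, n in counts.items()}
```

The resulting dictionary has one entry per displayed category, so two conditions can be compared category by category.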

ALT telomeres are readily distinguished by sequencing due to the presence of telomeric ssDNA.

A. Schematic representation of the S1-END-seq method applied to telomeres containing either an internal ssDNA region (left) or only the natural G-rich overhang (right). S1 nuclease treatment cleaves ssDNA regions, generating a two-ended DSB (left) or a one-ended DSB (right). DNA ends are ligated to biotinylated adaptors, purified, and subsequently sequenced by NGS. Reads originating from the two-ended DSB align to either side of the break (left), resulting in C-rich (red) and G-rich (green) reads. Reads originating from the one-ended chromosome ends align only to the C-rich template (red). B. Proportion of telomeric reads (containing at least 4 consecutive telomeric repeats) corresponding to G-rich reads (green) or C-rich reads (red). C-D. Proportion of telomeric reads (containing at least 4 consecutive telomeric repeats) corresponding to G-rich reads (green) or C-rich reads (red) in BLM-proficient (U2OS) or BLM-deficient (BLM-/-) cells by END-seq (C) or S1-seq (D). E. Native telomeric FISH (ssTelo) in BLM-proficient (U2OS) or BLM-deficient (BLM-/-) cells either left untreated (control) or expressing TRF1-Fok1. F. Quantification of the data shown in panel E; cells with 5 or more ssTelo signals were scored as positive. When indicated, cells were induced to express the TRF1-Fok1 nuclease for 24 hours prior to harvesting. G-H. Proportion of telomeric reads (containing at least 4 consecutive telomeric repeats) corresponding to G-rich reads (green) or C-rich reads (red) in BLM-proficient (U2OS) or BLM-deficient (BLM-/-) cells expressing TRF1-Fok1 by END-seq (G) or S1-seq (H).