Figures and data

T. brucei repetitive elements, TALE design and target site number.
A. Distinct repetitive elements are present at various locations on T. brucei chromosomes. B. Construct designed to express the indicated TALE proteins, which bind 15 bp target sequences and are fused to 3xTy1 and YFP tags, when integrated at the β-tubulin locus. The Aldolase 5’UTR and PAD1 3’UTR regulate expression levels. A Bleomycin resistance marker gene provides Phleomycin selection (not shown). C. Predicted number of target sequences for each TALE in the Lister 427 genome.
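A prediction like the one in panel C can be approximated by counting exact matches of a 15 bp target on both strands of a genome sequence. The sketch below is illustrative only: the function names, the toy target and the toy genome are assumptions, and the legend does not state which tool or matching rules were actually used.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_target_sites(genome: str, target: str) -> int:
    """Count exact, possibly overlapping matches of target on both strands."""
    genome = genome.upper()
    target = target.upper()
    # Using a set avoids double counting when the target is palindromic.
    motifs = {target, reverse_complement(target)}
    count = 0
    for motif in motifs:
        start = genome.find(motif)
        while start != -1:
            count += 1
            # Advance by one base so overlapping matches are also found.
            start = genome.find(motif, start + 1)
    return count


# Toy data (made up for illustration): one site on each strand.
toy_target = "ACGTTGCAAGGCTTA"
toy_genome = "GGG" + toy_target + "CCCC" + reverse_complement(toy_target) + "TT"
print(count_target_sites(toy_genome, toy_target))
```

For a real genome the same function could be applied per chromosome or contig and the counts summed. Mismatch-tolerant matching would need a different approach.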

Localization and specific target sequence association of five synthetic TALE-YFP fusion proteins expressed in T. brucei compared to YFP-TRF and YFP-KKT2.
A. Bloodstream form Lister 427 T. brucei cells expressing the indicated TALE-YFP fusion proteins were fixed, and TALE-YFP protein localization was detected with anti-GFP primary antibody and Alexa Fluor 568-labeled secondary antibody (red). Nuclear and kinetoplast (mitochondrial) DNA were stained with DAPI (green). Control cells expressing telomeric YFP-TRF or the centromeric YFP-KKT2 kinetochore protein, and wild-type Lister 427 cells expressing no YFP, are also shown. Scale bar, 10 μm. B. Anti-GFP ChIP-seq analysis for 147R-TALE, 177R-TALE, 70R-TALE, TelR-TALE and ingiR-TALE demonstrates that each protein is enriched on the repeat elements it was designed to recognize: CIR147 repeats, 177bp repeats, 70bp repeats, telomeric (TTAGGG)n repeats and ingi retrotransposons, respectively. Enrichments obtained for the YFP-KKT2 kinetochore protein, the TRF telomere repeat binding protein and a No-Tag control are shown for comparison. Data are from two biological replicates.

TelR-TALE-YFP and 70R-TALE-YFP are enriched at or near telomeric T. brucei bloodstream expression sites.
A. Telomeric repeat (TTAGGG)n sequence (top) and 70bp repeat consensus sequence (bottom). The sequences that TelR-TALE and 70R-TALE were designed to bind are indicated. Deletion of TelR-TALE recognition modules following integration in T. brucei results in recognition of AGGGTTAG rather than the full 15 bp target sequence. B. Anti-GFP ChIP-seq for cells expressing TelR-TALE-YFP, YFP-TRF or 70R-TALE-YFP proteins, or 427 cells expressing no YFP-tagged protein. Anti-GFP ChIP-seq enrichment profiles are shown for telomeric bloodstream expression sites (BES) BES1 (top) and BES5 (bottom). Diagrams show the position of telomeric (TTAGGG)n repeats (black chevrons), VSG genes (blue) and upstream 70 bp repeats (green bars). Data are from two biological replicates. Y axis: Log2 values, X axis: base pairs.
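The legends do not state how the Log2 values on the Y axis were computed. One common convention for ChIP-seq profiles is the log2 ratio of library-size-normalized ChIP signal over input in each genomic bin. The sketch below shows that convention only. The function name, the counts-per-million scaling and the pseudocount are assumptions, not the authors' documented pipeline.

```python
import math


def log2_enrichment(chip_counts, input_counts, pseudocount=1.0):
    """Per-bin log2(ChIP / input) after scaling both tracks to counts per million.

    Assumed convention: a pseudocount is added to each bin before scaling so
    that empty bins do not produce a division by zero or log2(0).
    """
    if len(chip_counts) != len(input_counts):
        raise ValueError("ChIP and input tracks must cover the same bins")
    chip_total = sum(chip_counts)
    input_total = sum(input_counts)
    ratios = []
    for chip, inp in zip(chip_counts, input_counts):
        chip_cpm = (chip + pseudocount) * 1e6 / chip_total
        input_cpm = (inp + pseudocount) * 1e6 / input_total
        ratios.append(math.log2(chip_cpm / input_cpm))
    return ratios


# Toy read counts for four bins (made up for illustration).
print(log2_enrichment([120, 15, 10, 5], [40, 38, 35, 37]))
```

With two biological replicates, the per-bin values could be computed for each replicate and then averaged, although the legends do not say how replicates were combined.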

The 147R-TALE-YFP protein is enriched at a subset of centromeres containing canonical CIR147 repeats.
A. CIR147 repeat consensus sequence. The sequence that 147R-TALE-YFP was designed to bind is indicated. B. Comparison of sequences enriched in YFP-KKT2 (purple) and 147R-TALE-YFP (blue) anti-GFP ChIP-seq for chromosomes 1, 3, 4, 5 and 8. DNA from all centromeres is enriched in YFP-KKT2 anti-GFP ChIP-seq, whereas only CIR147 repeats at centromeres on chromosomes 4, 5 and 8 are enriched in 147R-TALE-YFP anti-GFP ChIP-seq. C. Split-violin plot demonstrating the relative enrichment of YFP-KKT2 (purple) and 147R-TALE-YFP (blue) over the eleven main chromosome centromere regions. Data are from two biological replicates. Y axis: Log2 values.

The 177R-TALE-YFP is enriched over 177bp repeats located on intermediate-sized and mini-chromosomes.
A. 177bp repeat consensus sequence. The sequence that 177R-TALE-YFP was designed to bind is indicated. B. Distribution of 177R-TALE-YFP, TelR-TALE-YFP, YFP-TRF and 70R-TALE-YFP at two intermediate/mini-chromosome telomeres determined by anti-GFP ChIP-seq. Anti-GFP ChIP-seq of 427 cells expressing no tagged protein is included as a control. Diagrams below the ChIP-seq profiles indicate the positions of 177bp repeats (red chevrons), 70bp repeats (green bars), VSG-encoding genes (blue) and telomere (TTAGGG)n repeats (black chevrons) within Tb427VSG-671_unitig_Tb427v12:17,836-31,606 (31kb) and Tb427VSG-647_untig_Tb427v12 (10kb). Data are from two biological replicates. Y axis: Log2 values, X axis: base pairs.

Affinity selection of TelR-TALE-YFP enriches for telomere-associated proteins and 177R-TALE-YFP protein enriches for kinetochore proteins.
Affinity selection was performed on control cells expressing YFP-TRF (A), YFP-RPA2 (C), YFP-KKT2 (E) or no YFP-tagged protein, and on cells expressing synthetic TelR-TALE-YFP (B), 70R-TALE-YFP (D) or 177R-TALE-YFP (F). Enriched proteins were identified and quantified by LC-MS/MS analysis relative to the No-YFP tag control. The data for each plot are derived from three biological replicates. Cutoff used for significance: P < 0.05 (Student’s t-test). Enrichment scores for proteins identified in each affinity selection are presented in Supplementary Tables.
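As an illustration of the significance cutoff, the sketch below computes a log2 fold change and an equal-variance Student's t statistic for one protein from three tagged and three control replicates, then compares the statistic with the two-sided critical value for P = 0.05 at 4 degrees of freedom (2.776). The function name and the toy intensities are assumptions. The legend does not specify the software used, the input transformation or any multiple-testing correction.

```python
import math
from statistics import mean, variance

# Two-sided critical value of Student's t for alpha = 0.05 and df = 4
# (3 + 3 replicates in an equal-variance two-sample test).
T_CRIT_DF4 = 2.776


def enrichment_call(tagged, control):
    """Return (log2 fold change, t statistic, significant?) for one protein.

    `tagged` and `control` each hold three log2-transformed intensities.
    """
    if len(tagged) != 3 or len(control) != 3:
        raise ValueError("this sketch hard-codes the df = 4 critical value")
    log2_fc = mean(tagged) - mean(control)
    # Pooled sample variance for the classic equal-variance Student's t-test.
    pooled_var = (2 * variance(tagged) + 2 * variance(control)) / 4
    std_err = math.sqrt(pooled_var * (1 / 3 + 1 / 3))
    if std_err == 0:
        raise ValueError("zero variance in both groups; t is undefined")
    t_stat = log2_fc / std_err
    return log2_fc, t_stat, abs(t_stat) > T_CRIT_DF4


# Toy log2 intensities (made up for illustration).
print(enrichment_call([10.0, 10.5, 9.5], [5.0, 5.5, 4.5]))
```

A volcano-style plot would apply this call to every quantified protein and plot the fold change against the significance measure.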

Synthetic 177R-TALE-YFP and YFP-KKT2 kinetochore proteins co-localize over 177bp repeats located on intermediate-sized and mini-chromosomes but not over centromeric CIR147 repeats where 147R-TALE-YFP binds.
A. Distribution of 177R-TALE, YFP-KKT2 and 147R-TALE over two intermediate/mini-chromosome telomeres determined by anti-GFP ChIP-seq. Anti-GFP ChIP-seq of T. brucei 427 cells expressing no tagged protein is included as control. Diagram below ChIP-seq profiles indicates the positions of 177bp repeats (red chevrons), 70bp repeats (green bars), VSG encoding genes (blue) and telomere (TTAGGG)n repeats (black chevrons) within Tb427VSG-671_unitig_Tb427v12:17,836-31,606 (31kb) and Tb427VSG-647_untig_Tb427v12 (10kb). B. Comparison of distribution of 177R-TALE, 147R-TALE and YFP-KKT2 over the chromosome 4 CIR147 centromere repeat array and adjacent unique sequences. Chr4:880,000-895,000 (15Kb) and Tb427VSG-671_unitig_Tb427v12:12,000-27,000 (31kb). Diagram below ChIP-seq indicates position of CIR147 repeats. C. Comparison of YFP-KKT2 kinetochore protein enrichment on 177 bp and 147 bp repeats. Data are from two biological replicates. Y axis: Log2 values, X axis: repeat type.

TALE-YFP construction and sequencing reveal a rearranged TALE domain in TelR-TALE-YFP following integration in the T. brucei genome.
A. Tetramer and trimer modules used to build plasmids designed to express each of six TALE-YFP proteins when integrated in the T. brucei genome. B. Each complete TALE plasmid construct is expected to contain four modules comprising a complete TALE domain of the same size in all constructs. C. Sequencing of five assembled TALE-YFP constructs revealed that they had the expected layout following integration at the β-tubulin locus in T. brucei. D. PCR of genomic DNA extracted from 427 control cells and 427 cells with the TelR-TALE, 177R-TALE, 70R-TALE, ingiR-TALE or NonR-TALE constructs integrated at the β-tubulin locus confirms the expected size (2 kb) for all constructs except TelR-TALE, whose PCR product is shorter than expected (1.5 kb). The positions of the Left (LP) and Right (RP) primers, common to all TALE-YFP constructs, are indicated. E. Following integration at the β-tubulin locus, sequencing revealed that the TALE DNA binding domain of TelR-TALE had rearranged, explaining the shorter TelR-TALE protein made by T. brucei 427 TelR-TALE expressing cells (Figure S2).

Synthetic TALE proteins are expressed in Lister 427 T. brucei bloodstream form cells but the TelR-TALE protein is shorter than expected.
Protein extracted from 427 cells and 427 cells with constructs designed to express 177R-TALE, 70R-TALE, 147R-TALE, ingiR-TALE, NonR-TALE and TelR-TALE fused to Ty and YFP tags integrated at the β-tubulin locus was subjected to western analysis using either: A. monoclonal mouse anti-GFP (anti-YFP) or B. anti-BB2 (anti-Ty). The TelR-TALE protein is smaller than the 177R-TALE, 70R-TALE, 147R-TALE, ingiR-TALE and NonR-TALE proteins.

Growth assays of cells expressing TelR-TALE-GFP, 177R-TALE-GFP or ingi-TALE-GFP and their cellular localisation.
A. The indicated cell cultures were seeded, and cell number was monitored and cultures diluted every two days. B. DAPI and anti-GFP staining of fixed cells expressing the indicated synthetic TALE-GFP fusion proteins. Scale bar, 10 μm.

Fields of T. brucei cells showing the cellular localization of six expressed synthetic TALE-YFP fusion proteins compared to YFP-TRF and YFP-KKT2.
Bloodstream form Lister 427 T. brucei cells expressing the indicated TALE-YFP fusion proteins were fixed, and TALE-YFP protein localization was detected with anti-GFP primary antibody and Alexa Fluor 568-labeled secondary antibody (red). Nuclear and kinetoplast (mitochondrial) DNA were stained with DAPI (green). Control cells expressing telomeric YFP-TRF or the centromeric YFP-KKT2 kinetochore protein, and wild-type Lister 427 cells expressing no YFP, are also shown. Scale bar, 10 μm. The NonR-TALE-YFP field was captured at a quarter of the size of the others.

IngiR-TALE is enriched at matching binding sites located in retrotransposons.
A. ‘Ingi’ repeat consensus sequence conserved in Ingi, RIME, SIDER and DIRE elements. The sequence that ingiR-TALE was designed to bind is indicated. B. Cross-hatched rectangle indicates the conserved region in Ingi, RIME, SIDER and DIRE retrotransposons, which are predicted to provide approximately 295, 187, 101 and 21 binding sites for the ingiR-TALE, respectively. C. Analysis of ChIP-seq data for cells expressing ingiR-TALE shows enrichment of DNA residing at or near predicted ingiR-TALE binding sites.

Overlap of proteins enriched in affinity purifications of both telomere binding protein YFP-TRF and synthetic protein TelR-TALE-YFP.
A. The Telomere Repeat binding Factor TRF binds (TTAGGG)n repeats at the ends of T. brucei chromosomes. A Venn diagram is shown comparing the number of proteins enriched in YFP-TRF versus TelR-TALE-YFP affinity purifications. B. List of known telomere-associated proteins and other proteins detected in YFP-TRF and/or TelR-TALE-YFP affinity purifications. + detected, - not detected, (-) weakly detected. Lists of all proteins detected in YFP-TRF and TelR-TALE-YFP affinity purifications are available in Tables S1 and S2, respectively. *See Reiss et al, 2018; Leal et al, 2020; Weisert et al, 2024.
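The counts in a two-set Venn diagram of this kind follow directly from set operations on the lists of enriched proteins. The sketch below uses placeholder identifiers rather than real data. The function name and the input lists are assumptions made for illustration.

```python
def venn_counts(hits_a, hits_b):
    """Return the three region sizes of a two-set Venn diagram."""
    a, b = set(hits_a), set(hits_b)
    return {
        "only_a": len(a - b),   # enriched only in the first purification
        "shared": len(a & b),   # enriched in both purifications
        "only_b": len(b - a),   # enriched only in the second purification
    }


# Placeholder protein identifiers (not real results).
bait_1_hits = ["protein_1", "protein_2", "protein_3"]
bait_2_hits = ["protein_2", "protein_3", "protein_4", "protein_5"]
print(venn_counts(bait_1_hits, bait_2_hits))
```

In practice the inputs would be the proteins passing the significance cutoff in each affinity purification, taken from the corresponding supplementary tables.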

A control TALE that binds no specific T. brucei sequence validates proteins enriched in TelR-TALE, 70R-TALE and 177R-TALE affinity purifications.
A control NonR-TALE was designed to bind the sequence GGAAGTATACCTGGC, which is not present in the T. brucei 427 genome. Affinity selection was performed on cells expressing the synthetic NonR-TALE-YFP, TelR-TALE, 70R-TALE-YFP, 177R-TALE-YFP, 147R-TALE-YFP or ingiR-TALE proteins and on control cells expressing no YFP-tagged protein. Proteins enriched with the five repeat sequence-targeted TALE-YFP proteins were identified and quantified by LC-MS/MS analysis relative to the NonR-TALE-YFP control rather than the No-YFP tag control. The data for each plot are derived from three biological replicates. Cutoff used for significance: log2(tagged/untagged) P < 0.05 (Student’s t-test). Enrichment scores for proteins identified in each affinity selection are presented in Supplementary Tables (Excel files).

Affinity selection of TelR-TALE-YFP, 70R-TALE-YFP and 177R-TALE-YFP relative to ingiR-TALE-YFP validates specificity.
Affinity selection was performed on cells expressing synthetic TelR-TALE-YFP (A), 70R-TALE-YFP (B) or 177R-TALE-YFP (C). Enriched proteins were identified and quantified by LC-MS/MS analysis relative to affinity-selected ingiR-TALE-YFP as a negative control. The data for each plot are derived from three biological replicates. Cutoff used for significance: P < 0.05 (Student’s t-test). Enrichment scores for proteins identified in each affinity selection are presented in Supplementary Tables (Excel files).

No proteins of interest are detected following affinity selection of 147R-TALE or ingiR-TALE.
Affinity selection was performed on cells expressing synthetic (A) 147R-TALE-YFP or (B) ingiR-TALE-YFP proteins and on control cells expressing no YFP-tagged protein. Enriched proteins were identified and quantified by LC-MS/MS analysis relative to the No-YFP tag control. The data for each plot are derived from three biological replicates. Cutoff used for significance: log2(tagged/untagged) P < 0.05 (Student’s t-test). Enrichment scores for proteins identified in each affinity selection are presented in Supplementary Tables (Excel files).

Overlap of proteins enriched in affinity purifications of both kinetochore protein YFP-KKT2 and synthetic protein 177R-TALE.
Kinetoplastid KineTochore (KKT) proteins are known to be enriched at all centromeres on T. brucei main chromosomes (Akiyoshi and Gull 2014). 177bp repeats are confined to intermediate-sized or mini-chromosomes. A. Venn diagram comparing proteins enriched in YFP-KKT2 versus 177R-TALE affinity purifications. A high proportion of KKT proteins, in addition to cohesin (SCC1, SCC3, SMC1 and SMC3) and condensin (SMC2) subunits, are enriched on 177 bp repeats. B. List of kinetochore, cohesin and condensin proteins detected in YFP-KKT2 and/or 177R-TALE affinity purifications. Lists of all proteins detected in 177R-TALE-YFP and YFP-KKT2 affinity purifications are available in Tables S12 and S14, respectively.