Characterization of full-length CNBP expanded alleles in myotonic dystrophy type 2 patients by Cas9-mediated enrichment and nanopore sequencing

  1. Massimiliano Alfano
  2. Luca De Antoni
  3. Federica Centofanti
  4. Virginia Veronica Visconti
  5. Simone Maestri
  6. Chiara Degli Esposti
  7. Roberto Massa
  8. Maria Rosaria D'Apice
  9. Giuseppe Novelli
  10. Massimo Delledonne
  11. Annalisa Botta (corresponding author)
  12. Marzia Rossato (corresponding author)
  1. Department of Biotechnology, University of Verona, Italy
  2. Department of Biomedicine and Prevention, Medical Genetics Section, University of Rome Tor Vergata, Italy
  3. Department of Systems Medicine (Neurology), University of Rome Tor Vergata, Italy
  4. Laboratory of Medical Genetics, Tor Vergata Hospital, Italy
  5. IRCCS Neuromed, Via Atinense, Italy
  6. Department of Pharmacology, School of Medicine, University of Nevada Reno, United States
  7. Genartis s.r.l., Via P. Mascagni, Italy
4 figures, 4 tables and 2 additional files

Figures

Figure 1 with 2 supplements
Cas9-mediated sequencing of the CNBP microsatellite.

(A) Experimental methods applied retrospectively to study the CNBP microsatellite in nine patients with confirmed myotonic dystrophy type 2 (DM2). The positions of the AluI and HaeIII restriction sites (142 bp upstream and 108 bp downstream of the CCTG array, respectively) and the (CCTG)5 probe (orange rectangle) for Southern blot hybridization are shown, along with the gRNA cleavage sites for Cas9-mediated enrichment (boundaries of blue line = 4.2 kb) and the annealing position of P4TCTG primers for quadruplet-repeat primed PCR (QP-PCR). (B) Average coverage and (C) fold enrichment obtained by Cas9-mediated enrichment in DM2 patients following singleplex (n = 4) or multiplex (n = 4) runs. Numbers above bars represent the percentage of on-target reads. (D) Total and on-target number of PASS reads generated for each DM2 patient and (E) number of reads fully spanning the CNBP microsatellite and attributable to either the normal or expanded alleles. Data are plotted as means ± standard error of the mean (SEM). WG, whole genome. Sequencing statistics of singleplex and multiplex experiments are reported in detail in Supplementary file 1.
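The two enrichment metrics named in panels B and C can be illustrated with a minimal sketch. The caption does not define them, so the definitions here are assumptions: on-target percentage is taken as on-target reads over total reads, and fold enrichment as coverage at the target divided by average whole-genome (WG) coverage. The function names and the numbers in the usage note are hypothetical and are not taken from the study.

```python
def on_target_percentage(on_target_reads: int, total_reads: int) -> float:
    """Percentage of reads that map to the target region (assumed definition)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * on_target_reads / total_reads


def fold_enrichment(target_coverage: float, genome_coverage: float) -> float:
    """Coverage at the target relative to average whole-genome coverage
    (assumed definition; the study may compute it differently)."""
    if genome_coverage <= 0:
        raise ValueError("genome_coverage must be positive")
    return target_coverage / genome_coverage
```

For example, with hypothetical values, `on_target_percentage(50, 1000)` gives 5.0 and `fold_enrichment(500.0, 0.5)` gives 1000.0.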

Figure 1—figure supplement 1
Pedigree of the myotonic dystrophy type 2 (DM2) family analyzed.

Genetic pedigree showing the relationship among the A1–A4 familial cases analyzed in this study. The DNA sample from III-5 was not available for ONT sequencing.

Figure 1—figure supplement 2
Southern blot analysis of expanded alleles in myotonic dystrophy type 2 (DM2) patients.

(A) Restriction map of the DM2 locus indicating the AluI and HaeIII restriction sites used for the digestion of genomic DNA. (B) Southern blot analysis of genomic DNA double digested with AluI and HaeIII and probed with a digoxigenin (DIG)-labeled (CCTG)5 locked nucleic acid (LNA) probe. Lane 1, CTR, healthy control sample; lane 2, DM1 sample; lanes 3–11, DM2 samples. Molecular markers are indicated on the left. The original scan of the Southern blot analysis reported in panel B and the whole figure incorporating the original uncropped scan can be found in Figure 1—figure supplement 2—source data 1 and 2, respectively.

Figure 1—figure supplement 2—source data 1

Original scan of the Southern blot analysis reported in Figure 1—figure supplement 2.

https://cdn.elifesciences.org/articles/80229/elife-80229-fig1-figsupp2-data1-v2.zip
Figure 1—figure supplement 2—source data 2

Figure 1—figure supplement 2 including the original uncropped scan of the Southern blot analysis.

(A) Restriction map of the myotonic dystrophy type 2 (DM2) locus indicating the AluI and HaeIII restriction sites used for the digestion of genomic DNA. (B) Southern blot analysis of genomic DNA double digested with AluI and HaeIII and probed with a digoxigenin (DIG)-labeled (CCTG)5 locked nucleic acid (LNA) probe. Lane 1, CTR, healthy control sample; lane 2, DM1 sample; lanes 3–11, DM2 samples. Molecular markers are indicated on the left. Asterisks (*) indicate nonspecific bands, as reported by Nakamori et al., 2009.

https://cdn.elifesciences.org/articles/80229/elife-80229-fig1-figsupp2-data2-v2.zip
Figure 2
Analysis of ONT sequencing data from normal and expanded CNBP alleles.

(A) Integrative Genomics Viewer (IGV) visualization of ONT sequencing data at the CNBP locus of a representative myotonic dystrophy type 2 (DM2) patient following Cas9-mediated enrichment. Reads generated from the normal allele feature clear cuts on both sides of the CNBP repeat, whereas those derived from the expanded allele are longer, soft-clipped and do not match the reference genome, as expected. Length distributions of reads derived from the normal alleles (B) and expanded alleles (D) of each patient. Boxes represent the interquartile range (IQR) of lengths, the horizontal line is the median, and whiskers and outliers are plotted according to Tukey’s method. (C) Correlation between the length of ONT and Sanger consensus sequences for the normal allele (n = 9). (E) Correlation between the maximum length of ONT sequences (longest complete read) and the upper edge of the Southern blot trace for the expanded allele (n = 9). Numbers on top of panels (B) and (D) indicate the coefficient of variation of normal and expanded alleles, respectively.
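The summary statistics named in this caption (coefficient of variation, and boxes with Tukey whiskers) can be sketched as follows. This is only an illustration under stated assumptions: the coefficient of variation is computed from the sample standard deviation, and quartiles use the "inclusive" method of Python's `statistics.quantiles`; the caption does not say which conventions the authors used. The function names are hypothetical.

```python
import statistics


def coefficient_of_variation(lengths):
    """CV in percent: sample standard deviation divided by the mean (assumed convention)."""
    return 100.0 * statistics.stdev(lengths) / statistics.mean(lengths)


def tukey_summary(lengths):
    """Box-plot summary following Tukey's rule.

    Whiskers extend to the most extreme values within 1.5 x IQR of the box;
    anything beyond those fences is reported as an outlier.
    """
    q1, median, q3 = statistics.quantiles(lengths, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in lengths if lower_fence <= x <= upper_fence]
    outliers = sorted(x for x in lengths if x < lower_fence or x > upper_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }
```

With the made-up read lengths `[130, 132, 134, 136, 138, 200]`, the value 200 falls above the upper fence and is reported as an outlier, while the upper whisker stops at 138.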

Figure 3 with 2 supplements
Analysis of the expanded-repeat CNBP alleles in myotonic dystrophy type 2 (DM2) patients.

Integrative Genomics Viewer (IGV) visualization (35-kbp windows) of ONT-targeted sequencing data from the expanded alleles of four representative DM2 patients. Complete reads were aligned at the 5′ end (A) and then at the 3′ end (B) in order to identify the repeat pattern that characterizes the expanded microsatellite locus. Each motif in the expanded alleles was visualized using a different color, as indicated in the key. Samples C and E contained a ‘pure’ CCTG expansion (blue) whereas samples A4 and A2 also contained the unexpected TCTG motif (red) downstream of CCTG. (C) Abundance of quadruplets identified in each patient. The y-axis shows the number of ONT reads with a certain number of repeats, whereas the x-axis shows the number of quadruplet repeats identified. ONT reads were grouped into 500 bp bins. The gray line represents the estimated kernel density of the underlying solid gray distribution of ONT reads.
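The idea behind the motif view and the binned histogram can be sketched in a few lines. This is not the MosaicViewer_CNBP code used in the study; it is a simplified illustration that assumes error-free sequence and only looks for uninterrupted runs of CCTG or TCTG. The function names `annotate_quadruplet_blocks` and `bin_lengths` are hypothetical.

```python
import re
from collections import Counter

# Maximal uninterrupted runs of either quadruplet motif.
QUADRUPLET_RUN = re.compile(r"(?:CCTG)+|(?:TCTG)+")


def annotate_quadruplet_blocks(sequence):
    """Return (motif, copy number) for each run of CCTG or TCTG in a sequence."""
    blocks = []
    for match in QUADRUPLET_RUN.finditer(sequence.upper()):
        run = match.group(0)
        blocks.append((run[:4], len(run) // 4))
    return blocks


def bin_lengths(lengths_bp, bin_size=500):
    """Count reads per bin, keyed by the lower edge of each bin in bp."""
    return Counter((length // bin_size) * bin_size for length in lengths_bp)
```

For a toy sequence such as `"CCTGCCTGCCTGTCTGTCTG"`, the first function reports a CCTG block of three copies followed by a TCTG block of two copies. Real nanopore reads contain sequencing errors, so an exact-match approach like this one would fragment long arrays; it only illustrates the layout shown in the figure.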

Figure 3—figure supplement 1
Analysis of the CNBP repeat motif for the expanded alleles in myotonic dystrophy type 2 (DM2) patients.

Integrative Genomics Viewer (IGV) visualization (53-kbp window) of ONT-targeted sequencing data from the expanded alleles of the five remaining DM2 samples. Complete reads were aligned at the 5′ end (A) and subsequently at the 3′ end (B) in order to identify the repeat pattern characterizing the expanded microsatellite locus. Each motif in the expanded alleles was visualized using a different color, as indicated in the key. All samples feature the unexpected TCTG motif of variable length downstream of the CCTG motif. (C) Abundance of quadruplets identified in each patient. The y-axis shows the number of ONT reads with a certain number of repeats, whereas the x-axis shows the number of quadruplet repeats identified. ONT reads were grouped into 500 bp bins. The gray line represents the estimated kernel density of the underlying solid gray distribution of ONT reads. (D) IGV visualization of ONT-targeted sequencing data from the expanded alleles of two representative patients (E and A3) carrying a pure CCTG expansion and a repeat with the TCTG motif, respectively.

Figure 3—figure supplement 2
Analysis of the 5′-end (TG)v repeat motif of the CNBP expanded alleles of the A1–A4 myotonic dystrophy type 2 (DM2) family members.

Integrative Genomics Viewer (IGV) visualization (~130 bp window) of ONT-targeted sequencing data from the expanded alleles of the DM2 family members. Complete reads were aligned at the 5′ end in order to identify the repeat pattern characterizing the expanded microsatellite locus. Nucleotides of the expanded repeat have been colored with A: green, C: blue, G: orange, and T: red.

Figure 4 with 1 supplement
Analysis of the TCTG motif by quadruplet-repeat primed PCR (QP-PCR) and Sanger sequencing.

(A) Representative QP-PCR profiles of genomic DNA samples from patients A2 and A4 showing the presence of the TCTG block. Upper panels show QP-PCR results using the conventional P4CCTG primer. Lower panels show QP-PCR results using primer P4TCTG. (B) Sanger sequencing of the QP-PCR products from the P4TCTG reaction confirming the presence of the TCTG sequence. (C) QP-PCR profiles of genomic DNA samples from patients C and E showing only the traditional myotonic dystrophy type 2 (DM2) motif CCTG. For each patient, the composition of CNBP expanded alleles above the QP-PCR tracks reflects the ONT sequencing data. Asterisks (*) indicate nonspecific signals, also visible in the QP-PCR profiles of DM1 and CTR samples (D).

Figure 4—figure supplement 1
Sanger sequencing of quadruplet-repeat primed PCR (QP-PCR) products showing the presence of the (TCTG)n motif.

QP-PCR was carried out using primer P4TCTG and genomic DNA from myotonic dystrophy type 2 (DM2) patients A1, A3, B, D, and F. Sanger sequencing of the QP-PCR products confirmed the presence of the (TCTG)n array at the 3′ end of the (CCTG) expansions.

Tables

Table 1
Demographic and molecular features of the myotonic dystrophy type 2 (DM2) patients.

For each patient, the table shows the ID, sex, age, age at onset, repeat length, and structure of the CNBP normal allele characterized by Sanger sequencing, and the estimated repeat length of the expanded allele determined by Southern blotting (–, not available).

Sample ID | Sex | Age | Age at onset | Normal allele: repeat length (bp) | Normal allele: repeat structure | Expanded allele: repeat length (bp)
A1 | F | 75 | 70 | 136 | (TG)24 (TCTG)7 (CCTG)5 GCTG CCTG TCTG (CCTG)7 | 40,000
A2 | M | 27 | 25 | 130 | (TG)17 (TCTG)9 (CCTG)5 GCTG CCTG TCTG (CCTG)7 | 20,000
A3 | M | 21 | – | 132 | (TG)20 (TCTG)8 (CCTG)5 GCTG CCTG TCTG (CCTG)7 | 20,323
A4 | M | 65 | 61 | 122 | (TG)19 (TCTG)9 (CCTG)12 | 32,745
B | F | 49 | 44 | 134 | (TG)21 (TCTG)7 (CCTG)6 GCTG CCTG TCTG (CCTG)7 | 40,000
C | M | 20 | – | 140 | (TG)24 (TCTG)6 (CCTG)7 GCTG CCTG TCTG (CCTG)7 | 29,027
D | M | 44 | 39 | 134 | (TG)19 (TCTG)9 (CCTG)5 GCTG CCTG TCTG (CCTG)7 | 20,000
E | F | 61 | 43 | 134 | (TG)19 (TCTG)9 (CCTG)5 GCTG CCTG TCTG (CCTG)7 | 30,000
F | M | 56 | 50 | 138 | (TG)21 (TCTG)9 (CCTG)5 GCTG CCTG TCTG (CCTG)7 | 39,000
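The repeat lengths of the normal alleles follow from the repeat structure notation: each `(MOTIF)n` block contributes the motif length times n, and each bare motif such as GCTG counts once. For patient A1, (TG)24 (TCTG)7 (CCTG)5 GCTG CCTG TCTG (CCTG)7 gives 48 + 28 + 20 + 4 + 4 + 4 + 28 = 136 bp. The sketch below reproduces that arithmetic; the parser and its function names are hypothetical helpers written for this illustration, not part of the study's software.

```python
import re

# One whitespace-separated token is either "(MOTIF)count" or a bare motif.
TOKEN = re.compile(r"\(([ACGT]+)\)(\d+)|([ACGT]+)")


def parse_structure(structure):
    """Turn '(TG)24 (TCTG)7 GCTG' into [('TG', 24), ('TCTG', 7), ('GCTG', 1)]."""
    units = []
    for token in structure.split():
        match = TOKEN.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognized token: {token!r}")
        if match.group(1):
            units.append((match.group(1), int(match.group(2))))
        else:
            units.append((match.group(3), 1))
    return units


def structure_length_bp(structure):
    """Total length in bp implied by a repeat structure string."""
    return sum(len(motif) * count for motif, count in parse_structure(structure))
```

Calling `structure_length_bp("(TG)19 (TCTG)9 (CCTG)12")` gives 122, matching the normal allele length listed for patient A4.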
Table 2
CNBP repeat analysis based on Cas9-mediated sequencing of the normal alleles.

For each patient, the table shows the characteristics of normal CNBP alleles based on the analysis of ONT sequencing data, in terms of length and structure. The table reports the percentage identity between the consensus sequences reconstructed from ONT and Sanger sequencing data; incongruences in ONT-derived sequences are highlighted in bold.

Normal allele
Sample ID | Repeat length (bp) | Repeat structure | Identity with Sanger sequence
A1 | 136 | (TG)24 (TCTG)7 (CCTG)5 GCTG CCTG TCTG (CCTG)7 | 100.0%
A2 | 131 | (TG)17 TGCTG (TCTG)8 (CCTG)5 GCTG CCTG TCTG (CCTG)7 | 99.2%
A3 | 132 | (TG)20 (TCTG)8 (CCTG)5 GCTG CCTG TCTG (CCTG)7 | 100.0%
A4 | 122 | (TG)19 (TCTG)9 (CCTG)12 | 100.0%
B | 134 | (TG)21 (TCTG)7 (CCTG)6 GCTG CCTG TCTG (CCTG)7 | 100.0%
C | 141 | (TG)24 TGCTG (TCTG)5 (CCTG)7 GCTG CCTG TCTG (CCTG)7 | 99.3%
D | 138 | (TG)21 (TCTG)9 (CCTG)5 GCTG CCTG TCTG (CCTG)7 | 97.1%
E | 134 | (TG)19 (TCTG)9 (CCTG)5 GCTG CCTG TCTG (CCTG)7 | 100.0%
F | 138 | (TG)21 (TCTG)9 (CCTG)5 GCTG CCTG TCTG (CCTG)7 | 100.0%
Table 3
CNBP repeat analysis based on Cas9-mediated sequencing of the expanded alleles.

For each patient, the table shows the characteristics of expanded CNBP alleles based on the analysis of ONT sequencing data, in terms of length and structure. The expanded (CCTG)x are colored in blue and (TCTG)y in red. Potential incongruences in ONT-derived sequences are highlighted in bold. The table also indicates the number and fraction of reads carrying the unexpected TCTG motif at the 3′ end of the CNBP microsatellite expansion.

Expanded allele
Sample ID | Repeat length (bp) (min–max) | Repeat structure | Number of reads carrying the TCTG motif
A1 | 3241–46,685 | (TG)20 (TCTG)7 (CCTG)1000–12,000 (TCTG)0–10 | 14 (45%)
A2 | 864–23,779 | (TG)18 (TCTG)7 (CCTG)1000–4500 (TCTG)0–2000 | 92 (86%)
A3 | 4429–18,983 | (TG)19 (TCTG)7 (CCTG)3000–5000 (TCTG)0–25 | 8 (73%)
A4 | 660–34,284 | (TG)18 (TCTG)7 (CCTG)250–8000 (TCTG)0–1500 | 9 (29%)
B | 344–23,358 | (TG)18 (TCTG)7 (CCTG)300–4000 (TCTG)0–400 | 6 (23%)
C | 700–31,753 | (TG)20 (TCTG)7 (CCTG)150–8000 | 0 (0%)
D | 383–19,143 | (TG)18 (TCTG)7 (CCTG)100–4000 (TCTG)0–1000 | 3 (11%)
E | 848–25,162 | (TG)18 (TCTG)6 (CCTG)200–6200 | 0 (0%)
F | 1533–32,824 | (TG)15 (TCTG)10 (CCTG)400–6000 (TCTG)0–2000 | 15 (43%)
Key resources table
Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
Gene (Homo sapiens) | CNBP | Ensembl | HGNC:13164 | Hg38
Biological sample (Homo sapiens) | Anti-coagulated peripheral blood | Policlinico Tor Vergata, Rome, Italy | Patient A1, A2, A3, A4, B, C, D, E, F |
Sequence-based reagent | Digoxigenin (DIG)-labeled locked nucleic acid (LNA) probe | Nakamori et al., 2009 | DIG-LNA probe | (CCTG)5
Sequence-based reagent | P4TCTG | This paper | PCR primers | agc gga taa caa ttt cac aca gga TCT GTC TGT CTG TCT GTC TGT
Sequence-based reagent | CL3N58_DR-[FAM] | This paper | PCR primers | GCC TAG GGG ACA AAG TGA GA
Sequence-based reagent | P3 | This paper | PCR primers | AGC GGA TAA CAA TTT CAC ACA GGA
Sequence-based reagent | crRNA1_CNBP | This paper | CRISPR RNA | CCA CCT GAT TCA CTG CGA TA
Sequence-based reagent | crRNA2_CNBP | This paper | CRISPR RNA | GGC TTC TCA TTC CAC GAC CA
Sequence-based reagent | Native barcodes | Oxford Nanopore Technologies (ONT) | EXP-NBD104 |
Commercial assay or kit | DIG-High Prime DNA Labeling and Detection Starter Kit II | Roche | Cat. No. 11585614910 |
Commercial assay or kit | Flexigene DNA Kit | Qiagen | Cat. No. 51206 |
Commercial assay or kit | BigDye Terminator v3.1 Cycle Sequencing Kit | Thermo Fisher | Cat. No. 4337458 |
Commercial assay or kit | Nanobind CBB Big DNA HMW Kit | Circulomics | SKU 102-301-900 |
Commercial assay or kit | NucleoSpin Blood L Kit | Macherey-Nagel | Item number: 740954.20 |
Commercial assay or kit | Qubit dsDNA BR Assay Kit | Thermo Fisher Scientific | Cat. No. Q32853 |
Commercial assay or kit | TapeStation DNA ScreenTape & Reagents | Agilent Technologies | Cat. No. 5067–5365; 5067–5366 |
Software, algorithm | GeneMapper Software 6 | Applied Biosystems | Cat. No. 4475074 |
Software, algorithm | CHOPCHOP | Labun et al., 2019 | chopchop.cbu.uib.no/ |
Software, algorithm | Guppy v3.4.5 | Computational Biology Research Center – AIST | |
Software, algorithm | NanoFilt v2.7.1 | De Coster et al., 2018 | |
Software, algorithm | BBMap suite v38.87 | https://sourceforge.net/projects/bbmap/ | |
Software, algorithm | Tandem Repeat Finder v4.09 | Benson, 1999 | |
Software, algorithm | Minimap2 v2.17-r941 | Li, 2018 | |
Software, algorithm | Integrative Genomics Viewer (IGV) v2.8.3 | Robinson, 2011 | |
Software, algorithm | Scripts for the generation of consensus sequences and repeat annotations for the normal allele | https://github.com/MaestSi/CharONT2 | Script Name: CharONT2 |
Software, algorithm | Scripts for the annotation of repeats and the generation of simplified reads for the expanded allele | https://github.com/MaestSi/MosaicViewer_CNBP | Script Name: MosaicViewer_CNBP v1.0.0 |
Software, algorithm | MinKNOW V20.06.5 | Oxford Nanopore Technologies | |
Commercial assay or kit | Alt-R S.p. HiFi Cas9 Nuclease v3 | IDT | Cat. No. 1081060 | Recombinant Cas9 nuclease for target excision (see M&M)
Commercial assay or kit | Alt-R CRISPR-Cas9 tracrRNA | IDT | Cat. No. 1072532 | Structural RNA for gRNA formation (see M&M)
Commercial assay or kit | AMPure XP Beads | Beckman-Coulter | Product No. A63881 | Magnetic beads for nucleic acid purification (see M&M)
Commercial assay or kit | CutSmart buffer 10× | New England BioLabs | Cat. No. B7204 | Buffer for gDNA dephosphorylation, RNP formation, and target excision (see M&M)
Commercial assay or kit | Blunt/TA Ligase Master Mix | New England BioLabs | Cat. No.: M0367S | T4 DNA ligase for native barcode ligation to dA-tailed ends (see M&M)
Commercial assay or kit | FLO-MIN106D (R9.4.1) flow cell | Oxford Nanopore Technologies | FLO-MIN106D | Flowcell for ONT sequencing (see M&M)
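Reading the primer sequences listed above, P4TCTG appears to consist of the P3 sequence as a 5′ tail (lowercase) followed by TCTG repeats (uppercase). The short check below confirms that composition from the listed sequences alone. The helper names are hypothetical, and the interpretation of the lowercase stretch as a tail is an inference from the sequences, not a statement taken from the table.

```python
def normalize(primer):
    """Remove the triplet spacing used in the table and uppercase the sequence."""
    return "".join(primer.split()).upper()


def split_tailed_primer(primer, tail, motif):
    """Return (motif copies, leftover 3' bases) for a primer built as tail + repeats."""
    primer, tail, motif = normalize(primer), normalize(tail), normalize(motif)
    if not primer.startswith(tail):
        raise ValueError("primer does not start with the given tail")
    rest = primer[len(tail):]
    copies = 0
    # Count uninterrupted motif copies directly after the tail.
    while rest.startswith(motif, copies * len(motif)):
        copies += 1
    return copies, rest[copies * len(motif):]


# Sequences copied from the key resources table.
P3 = "AGC GGA TAA CAA TTT CAC ACA GGA"
P4TCTG = "agc gga taa caa ttt cac aca gga TCT GTC TGT CTG TCT GTC TGT"
```

`split_tailed_primer(P4TCTG, P3, "TCTG")` returns `(5, "T")`: the 24-nt P3 sequence is followed by five TCTG copies and one extra T at the 3′ end.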

Additional files

Supplementary file 1

Sequencing statistics of singleplex and multiplex experiments.

The table reports the features of each Cas9-mediated sequencing experiment performed and the sequencing statistics of each ONT run. Average values are also provided for singleplex and multiplex runs separately.

https://cdn.elifesciences.org/articles/80229/elife-80229-supp1-v2.docx
MDAR checklist
https://cdn.elifesciences.org/articles/80229/elife-80229-mdarchecklist1-v2.pdf


Cite this article

Massimiliano Alfano, Luca De Antoni, Federica Centofanti, Virginia Veronica Visconti, Simone Maestri, Chiara Degli Esposti, Roberto Massa, Maria Rosaria D'Apice, Giuseppe Novelli, Massimo Delledonne, Annalisa Botta, Marzia Rossato (2022)
Characterization of full-length CNBP expanded alleles in myotonic dystrophy type 2 patients by Cas9-mediated enrichment and nanopore sequencing
eLife 11:e80229.
https://doi.org/10.7554/eLife.80229