The Aquilegia genome provides insight into adaptive radiation and reveals an extraordinarily polymorphic chromosome with a unique history

  1. Danièle L Filiault
  2. Evangeline S Ballerini
  3. Terezie Mandáková
  4. Gökçe Aköz
  5. Nathan J Derieg
  6. Jeremy Schmutz
  7. Jerry Jenkins
  8. Jane Grimwood
  9. Shengqiang Shu
  10. Richard D Hayes
  11. Uffe Hellsten
  12. Kerrie Barry
  13. Juying Yan
  14. Sirma Mihaltcheva
  15. Miroslava Karafiátová
  16. Viktoria Nizhynska
  17. Elena M Kramer
  18. Martin A Lysak
  19. Scott A Hodges (corresponding author)
  20. Magnus Nordborg (corresponding author)
  1. Vienna BioCenter, Austria
  2. University of California, United States
  3. Masaryk University, Czech Republic
  4. Vienna Graduate School of Population Genetics, Austria
  5. Joint Genome Institute, United States
  6. HudsonAlpha Institute of Biotechnology, United States
  7. Centre of the Region Haná for Biotechnological and Agricultural Research, Czech Republic
  8. Harvard University, United States
8 figures, 3 tables and 15 additional files

Figures

Figure 1 with 1 supplement
Distribution of Aquilegia species.

There are ~70 species in the genus Aquilegia, broadly distributed across temperate regions of the Northern Hemisphere (grey). The 10 Aquilegia species sequenced here were chosen as representatives …

https://doi.org/10.7554/eLife.36426.002
Figure 1—figure supplement 1
Origin of species samples used for sequencing.
https://doi.org/10.7554/eLife.36426.003
Figure 2 with 2 supplements
Polymorphism and divergence in Aquilegia.

(a) The percentage of pairwise differences within each species (estimated from individual heterozygosity) and between species (divergence). FST values between geographic regions are given on the …

https://doi.org/10.7554/eLife.36426.004
Figure 2—figure supplement 1
Polymorphism across the genome in all ten species samples.
https://doi.org/10.7554/eLife.36426.005
Figure 2—figure supplement 2
Species and chromosome trees of Aquilegia.
https://doi.org/10.7554/eLife.36426.006
Figure 3 with 3 supplements
Discordance between gene and species trees.

(a) Cloudogram of neighbor joining (NJ) trees constructed in 100 kb windows across the genome. The topology of each window-based tree is co-plotted in grey and the whole genome NJ tree shown in Figur…

https://doi.org/10.7554/eLife.36426.007
Figure 3—figure supplement 1
Proportion of significantly-varying subtrees by chromosome.
https://doi.org/10.7554/eLife.36426.008
Figure 3—figure supplement 2
P-values of proportion tests by chromosome for significantly-different trees.
https://doi.org/10.7554/eLife.36426.009
Figure 3—figure supplement 3
Subtree prevalence across chromosomes for the nine significantly-different subtrees.
https://doi.org/10.7554/eLife.36426.010
Figure 4 with 1 supplement
Sharing patterns of derived polymorphisms.

Proportion of derived variants (a) private to an individual species, (b) shared within the geographic region of origin, (c) shared across two geographic regions, and (d) shared across all three …

https://doi.org/10.7554/eLife.36426.011
Figure 4—figure supplement 1
Sharing pattern percentages by pattern type.
https://doi.org/10.7554/eLife.36426.012
Figure 5
D statistics demonstrate gene flow during Aquilegia speciation.

D statistics for tests with (a–c) all North American species, (d) both European species, (e) Asian species other than A. oxysepala, and (f) A. oxysepala as H3 species. All tests use S. adoxoides as …

https://doi.org/10.7554/eLife.36426.013
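As a rough illustration of the D statistic (ABBA-BABA test) named in the Figure 5 caption, the sketch below counts site patterns for four taxa (H1, H2, H3, outgroup) and computes D = (ABBA - BABA) / (ABBA + BABA). It assumes one haploid, biallelic sequence per taxon and treats the outgroup allele as ancestral. The function names and the toy alignment are hypothetical and are not taken from the paper, whose analysis may differ (for example, by using allele frequencies).

```python
from typing import Iterable, Tuple

Site = Tuple[str, str, str, str]  # alleles at one site for (H1, H2, H3, outgroup)


def count_patterns(sites: Iterable[Site]) -> Tuple[int, int]:
    """Count ABBA and BABA site patterns, taking the outgroup allele as ancestral."""
    abba = baba = 0
    for h1, h2, h3, out in sites:
        if h3 == out:
            # H3 carries the ancestral allele, so the site is uninformative.
            continue
        if h1 == out and h2 == h3:
            abba += 1  # H2 shares the derived allele with H3
        elif h2 == out and h1 == h3:
            baba += 1  # H1 shares the derived allele with H3
    return abba, baba


def d_statistic(abba: int, baba: int) -> float:
    """Return D = (ABBA - BABA) / (ABBA + BABA)."""
    total = abba + baba
    if total == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / total


# Hypothetical toy alignment, for illustration only.
TOY_SITES = [
    ("A", "G", "G", "A"),  # ABBA
    ("A", "G", "G", "A"),  # ABBA
    ("G", "A", "G", "A"),  # BABA
    ("A", "A", "G", "A"),  # derived allele private to H3, uninformative
    ("C", "T", "T", "C"),  # ABBA
]

abba_count, baba_count = count_patterns(TOY_SITES)
toy_d = d_statistic(abba_count, baba_count)
```

In this convention, a D value near zero is what incomplete lineage sorting alone would produce, a positive value indicates excess allele sharing between H2 and H3, and a negative value indicates excess sharing between H1 and H3.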
Figure 6 with 2 supplements
The effect of differences in coalescence time and gene flow on tree topologies.

(a) The observed proportion of informative derived variants supporting each possible Asian tree topology genome-wide and on chromosome four. Species considered include A. oxysepala (oxy), A. japonica

https://doi.org/10.7554/eLife.36426.015
Figure 6—figure supplement 1
Model output for all three gene flow scenarios.
https://doi.org/10.7554/eLife.36426.016
Figure 6—figure supplement 2
Tree topology proportions simulated under asymmetric and unidirectional models.
https://doi.org/10.7554/eLife.36426.017
Figure 7 with 2 supplements
Recombination and selection on chromosome four.

(a) Physical vs. genetic distance for all chromosomes calculated in an A. formosa x A. pubescens mapping population. High nucleotide diversity on chromosome four was also observed in parental plants of this …

https://doi.org/10.7554/eLife.36426.018
Figure 7—source data 1

(Physical and genetic distance for A. formosa x A. pubescens markers).

https://doi.org/10.7554/eLife.36426.021
Figure 7—figure supplement 1
Polymorphism in the A. formosa x A. pubescens mapping population.
https://doi.org/10.7554/eLife.36426.019
Figure 7—figure supplement 2
Distribution of gene expression values by chromosome.
https://doi.org/10.7554/eLife.36426.020
Figure 8 with 1 supplement
Cytogenetic characterization of chromosome four in Semiaquilegia and Aquilegia species.

Pachytene chromosome spreads were probed with probes corresponding to oligoCh4 (red), 35S rDNA (yellow), 5S rDNA (green) and two (peri)centromeric tandem repeats (pink). Chromosomes were …

https://doi.org/10.7554/eLife.36426.025
Figure 8—figure supplement 1
Immunodetection of anti-5mC antibody.
https://doi.org/10.7554/eLife.36426.026

Tables

Table 1
GO term enrichment on chromosome four
https://doi.org/10.7554/eLife.36426.022
| GO | Corrected P-value | Number on Chr_04 (observed) | Number on Chr_04 (expected) | Percent of Chr_04 genes | GO term |
|---|---|---|---|---|---|
| 0043531 | 5.61×10^-79 | 140 | 9 | 7.57 | ADP binding |
| 0016705 | 4.40×10^-48 | 179 | 39 | 9.68 | Oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen |
| 0004497 | 7.19×10^-46 | 158 | 32 | 8.55 | Monooxygenase activity |
| 0005506 | 2.73×10^-41 | 181 | 46 | 9.79 | Iron ion binding |
| 0020037 | 2.57×10^-37 | 186 | 53 | 10.06 | Heme binding |
| 0010333 | 1.72×10^-15 | 39 | 4 | 2.11 | Terpene synthase activity |
| 0016829 | 2.08×10^-13 | 39 | 5 | 2.11 | Lyase activity |
| 0055114 | 9.53×10^-10 | 247 | 149 | 13.36 | Oxidation-reduction process |
| 0016747 | 6.66×10^-5 | 44 | 16 | 2.38 | Transferase activity, transferring acyl groups other than amino-acyl groups |
| 0000287 | 1.23×10^-4 | 42 | 15 | 2.27 | Magnesium ion binding |
| 0008152 | 2.56×10^-4 | 137 | 83 | 7.41 | Metabolic process |
| 0006952 | 3.60×10^-4 | 32 | 10 | 1.73 | Defense response |
| 0004674 | 4.52×10^-4 | 23 | 5 | 1.24 | Protein serine/threonine kinase activity |
| 0016758 | 1.35×10^-3 | 44 | 18 | 2.38 | Transferase activity, transferring hexosyl groups |
| 0005622 | 4.14×10^-3 | 14 | 42 | 0.76 | Intracellular |
| 0008146 | 2.68×10^-2 | 9 | 1 | 0.49 | Sulfotransferase activity |
| 0016760 | 3.72×10^-2 | 12 | 2 | 0.65 | Cellulose synthase (UDP-forming) activity |
Table 2
Content of the A. coerulea v3.1 reference by chromosome
https://doi.org/10.7554/eLife.36426.023

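| | Chr 1 | Chr 2 | Chr 3 | Chr 4 | Chr 5 | Chr 6 | Chr 7 | Genome |
|---|---|---|---|---|---|---|---|---|
| Number of genes | 5041 | 4390 | 4449 | 3149 | 4786 | 3292 | 4443 | 29550 |
| Genes per Mb | 112 | 102 | 104 | 69 | 107 | 108 | 102 | 100 |
| Mean gene length (bp) | 3629 | 3641 | 3689 | 3020 | 3712 | 3620 | 3708 | 3580 |
| Percent repetitive | 38.9 | 41.1 | 39.1 | 54.2 | 39.4 | 39.3 | 40.6 | 42.0 |
| Percent genes with HIGH effect variant | 25.3 | 23.8 | 23.6 | 32.3 | 24.1 | 22.1 | 23.6 | 24.7 |
| Percent GC | 36.8 | 37.0 | 36.9 | 37.0 | 37.1 | 36.8 | 36.8 | 37.0 |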
Table 3
Population genetics parameters for Semiaquilegia by chromosome
https://doi.org/10.7554/eLife.36426.024


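All values are percent pairwise differences.

| | Chr 1 | Chr 2 | Chr 3 | Chr 4 | Chr 5 | Chr 6 | Chr 7 | Genome |
|---|---|---|---|---|---|---|---|---|
| Polymorphism within Semiaquilegia | 0.079 | 0.085 | 0.081 | 0.162 | 0.076 | 0.078 | 0.071 | 0.082 |
| Divergence between Aquilegia and Semiaquilegia | 2.46 | 2.47 | 2.47 | 2.77 | 2.48 | 2.47 | 2.47 | 2.48 |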
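To make the chromosome four signal in Table 3 easier to see, the short sketch below divides each per-chromosome value by the genome-wide value. The numbers are copied from Table 3. The helper name `ratio_to_genome` and the dictionary layout are illustrative assumptions, not part of the paper's pipeline.

```python
# Percent pairwise differences from Table 3 (chromosomes 1-7, then genome-wide).
POLYMORPHISM = {"1": 0.079, "2": 0.085, "3": 0.081, "4": 0.162,
                "5": 0.076, "6": 0.078, "7": 0.071, "genome": 0.082}
DIVERGENCE = {"1": 2.46, "2": 2.47, "3": 2.47, "4": 2.77,
              "5": 2.48, "6": 2.47, "7": 2.47, "genome": 2.48}


def ratio_to_genome(values: dict) -> dict:
    """Return each chromosome's value divided by the genome-wide value."""
    genome = values["genome"]
    return {chrom: v / genome for chrom, v in values.items() if chrom != "genome"}


poly_ratio = ratio_to_genome(POLYMORPHISM)
div_ratio = ratio_to_genome(DIVERGENCE)

# Print one line per chromosome: polymorphism ratio and divergence ratio.
for chrom in sorted(poly_ratio):
    print(f"chr{chrom}: polymorphism x{poly_ratio[chrom]:.2f}, "
          f"divergence x{div_ratio[chrom]:.2f}")
```

On these figures, chromosome four has roughly twice the genome-wide polymorphism within Semiaquilegia, while its divergence from Aquilegia is only about 12% above the genome-wide value.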

Additional files

Supplementary file 1

Genomic libraries included in the A. coerulea genome assembly and their respective assembled sequence coverage levels in the A. coerulea v3.1 release

https://doi.org/10.7554/eLife.36426.027
Supplementary file 2

Summary statistics of the output of the whole genome shotgun assembly prior to screening, removal of organelles and contaminating scaffolds and chromosome-scale pseudomolecule construction

https://doi.org/10.7554/eLife.36426.028
Supplementary file 3

Final summary assembly statistics for chromosome-scale assembly

https://doi.org/10.7554/eLife.36426.029
Supplementary file 4

Placement of the individual BAC clones and their contribution to the overall error rate

https://doi.org/10.7554/eLife.36426.030
Supplementary file 5

RNAseq data sets used for gene annotation

https://doi.org/10.7554/eLife.36426.031
Supplementary file 6

Ratio of polymorphism or divergence on each chromosome versus genome-wide for each species

https://doi.org/10.7554/eLife.36426.032
Supplementary file 7

Robustness of nucleotide diversity patterns to copy number variant detection methods

https://doi.org/10.7554/eLife.36426.033
Supplementary file 8

Repeat family prevalence and permutation results in the A. coerulea v3.1 genome release

https://doi.org/10.7554/eLife.36426.034
Supplementary file 9

K-mer based estimates of genome size and repetitive sequence proportion

https://doi.org/10.7554/eLife.36426.035
Supplementary file 10

Mean and median coverage by species

https://doi.org/10.7554/eLife.36426.036
Supplementary file 11

Proportion of sites removed by each filter - initial filtration without Semiaquilegia

https://doi.org/10.7554/eLife.36426.037
Supplementary file 12

Proportion of sites removed by each filter - final filtration with Semiaquilegia

https://doi.org/10.7554/eLife.36426.038
Supplementary file 13

Number of derived variants by species

https://doi.org/10.7554/eLife.36426.039
Supplementary file 14

Transition matrix for the Five-State Markov process

https://doi.org/10.7554/eLife.36426.040
Supplementary file 15

Transition matrix for the Eight-State Markov process

https://doi.org/10.7554/eLife.36426.041
