Generation of a new high quality reference genome for Metriaclima zebra.

Example alignment of Bionano maps created from blood isolated from a single Metriaclima zebra male to the M_zebra_UMD2a reference (top) or the newly generated M_zebra_GT3a reference (bottom). The M_zebra_GT3a reference was created using a combination of PacBio HiFi reads and Bionano molecules. A summary of the improvements in the new genome is found in Table S1.

Inversion genotypes of 31 samples based on Bionano data.

Identification of six large single or double inversions segregating within Lake Malawi cichlids.

Representative alignments of inversions or double inversions identified from blood samples obtained from 30 individuals from eleven species. For each inversion, the top shows predicted motifs from an in silico digest of the reference genome, the bottom shows motifs identified using Bionano molecules obtained from an individual of the species indicated on the left, and the grey lines indicate matching motifs based upon predicted and observed distances. Single inversions were identified on chromosomes 2, 9, 10, 13, and 20. Tandem double inversions were identified on chromosomes 11 and 20. The estimated length and percentage of the chromosome spanned by the inversion are shown on the right. The single inversion on chromosome 20 identified in Rhamphochromis longiceps has the same position as the first inversion of the double inversion on chromosome 20 identified from Mchenga conophoros. A list of all samples and their inversion genotypes is in Table 1. Note that an error in the reference genome in chromosome 2 is indicated in the first panel.

Distribution of inversions within the Lake Malawi ecogroups.

Principal component analysis (PCA) of SNVs identified by whole genome sequencing of 365 samples was used to genotype all six inversions. (A) We illustrate the approach for genotyping the chromosome 10 inversion. The PCA plot for the entire genome is shown on the left, with each sample colored by its ecogroup. Individuals with Bionano data are shown as grey Xs. The deep benthic/shallow benthic and utaka individuals that cluster together in the whole genome analysis split into multiple clusters when analyzed using only the SNVs that fall within the chromosome 10 inversion, and these clusters were assigned to inversion genotypes using samples that were genotyped using Bionano data. To make the whole genome and chromosome 10 PCA plots easier to compare, we reversed the y-axis on the right panel. The genotype of each sample can be found in Table S2. Interactive PCA plots for each inversion are included in Figures S9-S17. (B) The derived allele frequency was calculated for each inversion within the seven ecogroups. The number of species that were genotyped and the estimated number of species that live in Lake Malawi are listed in the whole-genome phylogeny (left).
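The genotyping logic in (A) and the frequency calculation in (B) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the PCA coordinates are already computed, uses a simple nearest-centroid rule to propagate the genotypes of independently genotyped reference samples (for example, those with Bionano data), and codes genotypes as the number of inverted (assumed derived) copies. The function names, sample names, ecogroup labels, and coordinates are all hypothetical toy values.

```python
from collections import defaultdict
from math import dist


def assign_genotypes(pc_coords, known):
    """Give every sample the genotype of the nearest reference centroid.

    pc_coords: {sample: (pc1, pc2)} from a PCA restricted to SNVs in one inversion.
    known:     {sample: genotype} for samples genotyped independently.
    """
    grouped = defaultdict(list)
    for sample, genotype in known.items():
        grouped[genotype].append(pc_coords[sample])
    # Centroid of the reference samples for each genotype class.
    centroids = {
        g: tuple(sum(axis) / len(points) for axis in zip(*points))
        for g, points in grouped.items()
    }
    return {
        sample: min(centroids, key=lambda g: dist(xy, centroids[g]))
        for sample, xy in pc_coords.items()
    }


def derived_allele_frequency(genotypes, ecogroups):
    """Frequency of the inverted allele per ecogroup (genotype = copies: 0, 1, 2)."""
    copies = defaultdict(int)
    chromosomes = defaultdict(int)
    for sample, genotype in genotypes.items():
        copies[ecogroups[sample]] += genotype
        chromosomes[ecogroups[sample]] += 2  # diploid: two chromosomes per sample
    return {group: copies[group] / chromosomes[group] for group in copies}


# Toy data (hypothetical): three clusters along PC1, one reference sample in each.
pc_coords = {
    "s1": (-1.0, 0.1), "s2": (-0.9, -0.1),
    "s3": (0.0, 0.0), "s4": (0.1, 0.2),
    "s5": (1.0, 0.0), "s6": (0.9, 0.1),
}
known = {"s1": 0, "s3": 1, "s5": 2}
ecogroups = {
    "s1": "group_A", "s2": "group_A",
    "s3": "group_B", "s4": "group_B",
    "s5": "group_C", "s6": "group_C",
}

genotypes = assign_genotypes(pc_coords, known)
frequencies = derived_allele_frequency(genotypes, ecogroups)
print(genotypes)
print(frequencies)
```

A nearest-centroid rule is only one way to label clusters; with real data, cluster separation should be inspected visually before any genotype is assigned.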

Evolutionary history of large inversions violates species tree.

(A) Maximum likelihood trees built from either whole genome SNVs (top) or SNVs within the chromosome 9, 11, and 20 inversions. Between 2 and 25 representative samples were selected for each ecogroup (see Table S2), and individuals that were heterozygous were excluded from this analysis. Clades were collapsed by ecogroup for visualization purposes. Full phylogenies are presented in Figures S18-S25. The whole genome tree indicates that Diplotaxodon and Rhamphochromis species split first and formed their own clade. For each of the displayed inversions, the first split was instead between samples carrying the inverted vs. normal genotypes, and the Diplotaxodon individuals formed a clade with the benthics that carry the inversion. (B) Density plots of dXY values comparing benthic species and species of the indicated ecogroup. Within the displayed inversions, the Diplotaxodon species were much closer in evolutionary distance to the benthic species than in the whole genome comparison.
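For readers unfamiliar with dXY, it is the average number of pairwise differences per site between sequences drawn from two populations. The sketch below computes it for two small sets of aligned haplotypes. It assumes equal-length, gap-free sequences and is only an illustration of the statistic; the windowing and software used for the density plots are not specified here, and the function name and sequences are hypothetical.

```python
from itertools import product


def dxy(pop_x, pop_y):
    """Mean per-site differences over all between-population haplotype pairs."""
    length = len(pop_x[0])
    if any(len(hap) != length for hap in pop_x + pop_y):
        raise ValueError("all haplotypes must be aligned to the same length")
    total_differences = sum(
        sum(a != b for a, b in zip(hap_x, hap_y))
        for hap_x, hap_y in product(pop_x, pop_y)
    )
    return total_differences / (len(pop_x) * len(pop_y) * length)


# Toy aligned haplotypes (hypothetical), 8 sites each.
population_x = ["ACGTACGT", "ACGTACGA"]
population_y = ["ACGTTCGT", "ACGATCGT"]

value = dxy(population_x, population_y)
print(value)
```

In this toy case the four between-population pairs differ at 1, 2, 2, and 3 sites, giving 8 differences over 32 compared sites, so dXY is 0.25. Lower values indicate less divergence between the two groups.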

A role for the chromosome 10 inversion in sex determination.

(A) Schematic of the two inversion states (normal and inverted) and their assignment as X or Y chromosomes. (B) Twenty-four offspring (12 male and 12 female) from each of two separate male/female pairs were genotyped for the inversion. For 47 of 48 individuals, the XY animals were male and the XX animals were female. One male was XX. (C) FST analysis comparing 24 males and 24 females identified chromosome 10 as the primary genetic region that segregated with sex. (D) Comparison of heterozygosity levels of 24 male and 24 female animals also identified chromosome 10 as an outlier.
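The reasoning behind panels (C) and (D) can be illustrated with a single idealized sex-linked site: under an XY system, males are heterozygous and females are homozygous, which raises both the male-female FST and the male heterozygosity at that site. The sketch below uses Wright's FST, (HT - HS) / HT, with equal group sizes. The estimator actually used for the figure is not stated in this legend, so the choice of estimator, the function names, and the genotypes are assumptions for illustration only.

```python
def allele_frequency(genotypes):
    """Frequency of the alternate allele; genotypes are alternate-allele counts (0, 1, 2)."""
    return sum(genotypes) / (2 * len(genotypes))


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying exactly one alternate allele."""
    return sum(g == 1 for g in genotypes) / len(genotypes)


def wright_fst(group_a, group_b):
    """Wright's FST for one biallelic site, assuming equal group sizes."""
    p_a = allele_frequency(group_a)
    p_b = allele_frequency(group_b)
    p_bar = (p_a + p_b) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity of the pooled sample
    if h_total == 0:
        return 0.0  # site is monomorphic, no differentiation to measure
    h_within = (2 * p_a * (1 - p_a) + 2 * p_b * (1 - p_b)) / 2
    return (h_total - h_within) / h_total


# Idealized sex-linked site (hypothetical): every male XY, every female XX.
males = [1] * 24
females = [0] * 24

fst = wright_fst(males, females)
male_het = observed_heterozygosity(males)
female_het = observed_heterozygosity(females)
print(fst, male_het, female_het)
```

For this idealized site FST is 1/3, with male heterozygosity of 1.0 and female heterozygosity of 0.0. An autosomal site with identical allele frequencies in both sexes would give an FST of 0.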

Model for inversions in Lake Malawi.

Model for the origin and spread of inversions in Lake Malawi overlaid on the species tree proposed in Malinsky et al. Key events in the proposed inversions' histories are indicated by numbers overlaid on the phylogeny.