Variable paralog expression underlies phenotype variation

  1. Raisa Bailon-Zambrano
  2. Juliana Sucharov
  3. Abigail Mumme-Monheit
  4. Matthew Murry
  5. Amanda Stenzel
  6. Anthony T Pulvino
  7. Jennyfer M Mitchell
  8. Kathryn L Colborn
  9. James T Nichols  Is a corresponding author
  1. Department of Craniofacial Biology, University of Colorado Anschutz Medical Campus, United States
  2. Department of Surgery, University of Colorado Anschutz Medical Campus, United States

Abstract

Human faces are variable; we look different from one another. Craniofacial disorders further increase facial variation. To understand craniofacial variation and how it can be buffered, we analyzed the zebrafish mef2ca mutant. When this transcription factor encoding gene is mutated, zebrafish develop dramatically variable craniofacial phenotypes. Years of selective breeding for low and high penetrance of mutant phenotypes produced strains that are either resilient or sensitive to the mef2ca mutation. Here, we compared gene expression between these strains, which revealed that selective breeding enriched for high and low mef2ca paralog expression in the low- and high-penetrance strains, respectively. We found that mef2ca paralog expression is variable in unselected wild-type zebrafish, motivating the hypothesis that heritable variation in paralog expression underlies mutant phenotype severity and variation. In support, mutagenizing the mef2ca paralogs, mef2aa, mef2b, mef2cb, and mef2d demonstrated modular buffering by paralogs. Specifically, some paralogs buffer severity while others buffer variability. We present a novel, mechanistic model for phenotypic variation where variable, vestigial paralog expression buffers development. These studies are a major step forward in understanding the mechanisms of facial variation, including how some genetically resilient individuals can overcome a deleterious mutation.

Editor's evaluation

In this elegant genetic study, Bailon-Zambrano et al. draw on classical genetic concepts to address the clinically pertinent question of how genetic variants in the same gene can yield wildly different phenotypes in different individuals. Specifically, this work makes significant contributions to our understanding of how a mutant phenotype may be modified by the expression levels of paralogues of the mutant gene.

https://doi.org/10.7554/eLife.79247.sa0

Introduction

Human faces are variable

Human craniofacial variation allows us – and even computers – to identify specific individuals. A recent study indicated that human facial structures are more variable than other human anatomical features and that evolution selected for increased variation in craniofacial-associated genomic regions (Sheehan and Nachman, 2014). While human genomic variation contributes to craniofacial phenotypic variation (Liu et al., 2012; Paternoster et al., 2012), facial variation persists among more genetically homogeneous populations such as Finnish people (Sheehan and Nachman, 2014) and even monozygotic twins (Kohn, 1991). Therefore, even among individuals with similar genomes, craniofacial development may remain noisy or sensitive to minor fluctuations, compared with other developmental processes.

Phenotype severity and variation can be measured in different ways. Penetrance is the frequency of a phenotype associated with a genotype. Expressivity is different degrees of the same phenotype associated with a genotype. Here, we use both penetrance and average expressivity to measure phenotype severity. We also use the distribution in expressivity to measure phenotypic variation. Furthermore, there are two types of variation. Among-individual variation can be quantified by measuring differences between individuals. Within-individual variation is measured by quantifying departures from symmetry on left versus right sides of an individual known as fluctuating asymmetry (Valen, 1962). It is unknown if among- and within-individual variation are products of the same biological mechanisms (Hallgrimsson et al., 2019). There is evidence to support that these two types of variation are associated (Breuker et al., 2006; Santos et al., 2005); however, other studies indicate that they are independent (Debat et al., 2009; Breno et al., 2011). Further work is needed to resolve this question.

Variability in human craniofacial disease

In the 1940s, Waddington observed that mutant organisms are often associated with increased phenotypic variation compared to wild types (Waddington, 1942), and more recent studies support this finding (Fish, 2016; Hallgrímsson et al., 2006; Scharloo, 1991; Parsons et al., 2008). Human genetic craniofacial disease phenotypes also appear more variable than normal human facial phenotypes (Ward et al., 2000). For example, facial clefting among monozygotic twins can be incompletely penetrant or unilateral (Trotman et al., 1993; Young et al., 2021), demonstrating both among- and within-individual variation. How the same genetic disease allele can have devastating consequences in some individuals, while other resilient individuals can overcome the deleterious mutation is not well understood (Chen et al., 2016).

One human genetic disease that presents variable craniofacial phenotypes is MEF2C haploinsufficiency syndrome. Patients heterozygous for mutations affecting the transcription factor encoding gene MEF2C show variable facial dysmorphologies (Le Meur et al., 2010; Zweier et al., 2010; Gordon et al., 2018; Tonk et al., 2011). To our knowledge, no MEF2C homozygous mutant patients have been identified, likely due to embryonic lethality. Additionally, craniofacial asymmetry is documented in some patients with this disorder (Nowakowska et al., 2010; Novara et al., 2013). Thus, both among- and within-individual variation are present in MEF2C haploinsufficiency syndrome patients. Numerous MEF2C mutant alleles cause this disorder. Therefore, a model system in which the same allele can be studied in many different individuals is needed.

Variable buffering by gene family members might underlie among-individual variation

During evolution, whole genome duplications produced multiple mef2 genes, or paralogs, in vertebrate genomes. Thus far, no mef2 paralogs are associated with craniofacial development besides mef2c. When whole genome duplications occur, the most likely outcome is the loss of one of the duplicate genes through the accumulation of deleterious mutations and eventual nonfunctionalization (Takahata and Maruyama, 1979; Watterson, 1983). However, sometimes mutations occur in gene regulatory elements partitioning gene expression domains among the new duplicates. This preserves both copies and is called subfunctionalization (Force et al., 1999; Ohno, 2013). Subfunctionalized duplicates may experience only partial loss of expression subdomains (Force et al., 1999), retaining relics of their ancestral expression pattern even if the gene is no longer required for the original function.

Compensation, or the ability of paralogs or gene family members to make up for the loss of a gene due to an overlap in function, has long been recognized as a source of genetic robustness (Wagner, 2000; Kirschner and Gerhart, 1998; Gu et al., 2003; Conant and Wagner, 2004; Kamath et al., 2003). Robustness is defined as the ability of a biological system to overcome genetic or environmental perturbation (Diss et al., 2014; de Visser et al., 2003). Buffering lessens the impact of perturbation on a system. Although buffering by paralogs has long been proposed, whether paralogs retain buffering capacity after subfunctionalization has not been sufficiently addressed.

We hypothesize that vestigial expression can buffer against loss of another paralog and that variation in vestigial expression underlies phenotypic variation. This hypothesis is supported by work in systems ranging from yeast to human cells demonstrating that paralogs contribute to genetic robustness (Diss et al., 2014; De Kegel and Ryan, 2019; Dandage and Landry, 2019; Li et al., 2010; Hsiao and Vitkup, 2008; Wagner, 2008; Dean et al., 2008). However, whether paralog expression variation underlies phenotypic variation has not been directly tested. Finally, whether variation in paralogous compensation is subject to selection is unknown.

We previously found evidence to support this hypothesis in zebrafish. mef2ca single mutants produce dramatically variable craniofacial phenotypes (Nichols et al., 2016). mef2cb is the most closely related mef2ca paralog. However, mef2cb is not required for craniofacial development, and homozygous mutants are viable and indistinguishable from wild types (Hinits et al., 2012). However, we reported that ventral cartilage defects associated with mef2ca mutation become more severe when we removed a single functional copy of mef2cb from mef2ca homozygous mutants (De Laurier et al., 2014). Our vestigial buffering hypothesis predicts that although mef2cb is no longer overtly required for craniofacial development, remaining mef2cb expression in neural crest cells might partially substitute for mef2ca loss. There is further evidence from mouse mutants that Mef2 paralogs can functionally substitute for one another (Majidi et al., 2019). Because mutations in none of the other four zebrafish paralogous mef2 genes have been reported, their function is unknown. These paralogs have human orthologs, MEF2A, MEF2B, and MEF2D. None of these genes have been associated with human craniofacial development or disease.

We developed a system for understanding craniofacial development and variability

One prominent phenotype in zebrafish mef2ca homozygous mutants is expansion of the opercle, a bone supporting the gill flap. This phenotype is remarkably variable (De Laurier et al., 2014; Kimmel et al., 2003); mef2ca homozygous mutants have many opercle shapes, and some even develop wild-type-looking opercle bones; the phenotype is incompletely penetrant (Nichols et al., 2016). With some genes, mutants encoding a premature termination codon (PTC) upregulate compensating genes through transcriptional adaptation, explaining why some mutant alleles do not produce a phenotype (Rossi et al., 2015; El-Brolosy et al., 2019). Transcriptional adaptation does not contribute to incomplete penetrance in our system (Sucharov et al., 2019). Specifically, we pharmacologically inhibited transcriptional adaptation (by inhibiting RNA decay) and did not observe changes in penetrance.

We demonstrated that the factors underlying opercle variation are heritable through selective breeding, which shifted the penetrance of the expanded bone phenotype to generate strains with consistently low or high penetrance of this phenotype (Nichols et al., 2016). Unlike traditional mutagenesis modifier screens (St Johnston, 2002; Kile and Hilton, 2005), our selective breeding paradigm likely enriched for standing genetic variation without the need for further mutagenesis. Thus, we likely bred naturally occurring genetic modifiers in our background to fixation. Similar approaches have proven successful in other systems (Vu et al., 2015; Gasch et al., 2016). mef2ca is pleiotropic, and although we only selected on opercle bone penetrance, nearly all mef2ca-associated phenotypes increased or decreased penetrance in the high- and low-penetrance strains, respectively (Sucharov et al., 2019). The only phenotype remaining fully (100%) penetrant in the low- and high-penetrance strains is a shortened symplectic cartilage, a linear, rod-shaped cartilage that functions as a jaw support structure. In unselected lines, mef2ca-associated phenotypes are only found in mef2ca homozygous mutants (Miller et al., 2007). However, in high-penetrance heterozygotes, we observed the shortened symplectic cartilage, suggesting this phenotype is highly sensitive to mef2ca loss (Sucharov et al., 2019). Penetrance is a binary measurement (a shortened symplectic is present or not), and we have not yet examined expressivity (to what extent a mutant symplectic is shortened). We do not know if the average length of the shortened symplectic or the distribution of symplectic length is different between low- and high-penetrance strains.

Here, we capitalize on the strengths of our zebrafish system to address the mechanisms underlying craniofacial variation. We quantified expressivity by taking linear measurements of the symplectic cartilage, allowing us to measure severity, among-individual variation, and within-individual variation in the zebrafish craniofacial skeleton. Combining these measurements with penetrance scoring, we compared phenotype severity and variation between selectively bred strains. We found that increased severity is associated with increased variation. In the high-penetrance strain, we found a more severe symplectic phenotype, and more among-individual variation but not within-individual variation compared to the low-penetrance strain.

What factors were selected upon from the original genetic background to increase or decrease penetrance? Work in nematodes demonstrates that expression variation in otherwise redundant genes contributes to variable penetrance in mutants (Raj et al., 2010). Therefore, we hypothesized that one or more of the five zebrafish mef2ca paralogs might be redundant with mef2ca and ameliorate the phenotype in the low-penetrance line or reciprocally intensify the phenotype in the high-penetrance line. We compared the expression of the mef2 paralogs between selectively bred strains and found that many of the paralogs are more highly expressed in the low-penetrance strain compared to the high-penetrance strain. Furthermore, we uncovered standing variation in paralog expression in unselected strains. To determine if the increased paralog expression we discovered in the low-penetrance strain buffers the mef2ca mutant phenotype, we mutagenized the mef2 paralogs. Double mutant analyses indicate that the different paralogs modularly buffer different aspects of the pleiotropic mef2ca mutant phenotype: some affect severity, while others affect variation and some affect both. These findings demonstrate that heritable, variable paralog expression is a major factor affecting phenotype severity and among-individual phenotypic variation but does not contribute to within-individual variation.

Results

Craniofacial phenotypes are more variable in mef2ca mutants compared with wild types

The phenotypic variation often associated with human genetic diseases might be partially due to individuals inheriting different mutant alleles retaining different levels of functional activity, expression domains, and/or transcriptional levels. A strength of the zebrafish system is that variation can be studied in many different individuals with the same mutant allele. Wild-type mef2ca encodes an N-terminal MADS box (MCM1, agamous, deficiens, and SRF) and an adjacent MEF2 domain (Chen et al., 2017; Figure 1A). These highly conserved domains mediate dimerization, DNA binding, and co-factor interactions (Potthoff and Olson, 2007). The mef2cab1086 mutant allele arose from a forward genetic screen and produces a PTC just downstream of the MADS box. This mutant allele variably displays phenotypes including ectopic bone near the opercle, interhyal joint fusion, dysmorphic ceratohyal, reduced Meckel’s cartilage, jaw joint fusion, and shortened symplectic (Figure 1B and C). Skeletal preparations from two full-sibling individuals illustrate both among- and within-individual variation easily observed in mef2ca mutants (Figure 1B). Comparing the two mutant individuals (B3 and B4) demonstrates among-individual variation associated with the opercle bone phenotype; one individual (B3) has phenotypically wild-type opercles, while the other individual has bilateral mutant phenotypes (B4). Meanwhile, within-individual (left-right) symplectic cartilage phenotype variation is present in one of the mef2cab1086 animals (B4); in this animal, the symplectic cartilage is longer on one side than the other.

Figure 1 with 1 supplement see all
mef2ca mutant craniofacial phenotypes are more variable than wild types.

(A) Schematic of mef2ca exonic structure. The mef2cab1086 mutant allele used in this study and regions encoding known functional domains are annotated. (B1–B4) Zebrafish heterozygous for mef2ca were pairwise intercrossed, and 6 days post fertilization (dpf), larvae were stained with Alcian blue and Alizarin red to label cartilage and bone. The individuals were then genotyped, flat mounted, and imaged. Two examples (upper and lower) are provided for each genotype. The following craniofacial skeletal elements are indicated in a wild-type individual: opercle bone (op), branchiostegal ray (br), Meckel’s (mc), ceratohyal (ch), symplectic (sy) cartilages, interhyal (ih), and jaw (jj) joints. Indicated phenotypes associated with mef2ca mutants include: ectopic bone (arrowheads), interhyal and jaw-joint fusions (^), reduced mc (double arrowhead), and a shortened sy (red arrows). Dashed outline indicates symplectic cartilage. Scale bar: 50 μm. (C) The penetrance of mef2ca mutant-associated phenotypes observed in 6 dpf homozygous mutant larvae is indicated. (D) Schematic indicating how the symplectic cartilage length was measured in this study. (E) Symplectic cartilage length was measured from 6 dpf wild type or homozygous mutant larvae. The p-value from a Welch’s t-test is indicated (****≤0.0001). F-test value testing for significant differences in variation between genotypes is indicated. (F) Symplectic cartilage length on left and right sides of 6 dpf zebrafish was measured to determine fluctuating asymmetry, or the absolute difference between left and right, for wild type or mutant larvae. The p-value from a Welch’s t-test is indicated (****≤0.0001). For box and whisker plots, the box extends from the 25th to 75th percentiles. The line in the middle of the box is plotted at the median, and the bars are minimum and maximum values. For E and F, n=22 for wild types and 44 for mutants.

Figure 1—source data 1

Penetrance data and raw symplectic cartilage length measurements.

https://cdn.elifesciences.org/articles/79247/elife-79247-fig1-data1-v2.xlsx

To quantify the variation associated with mef2ca mutants, we first scored penetrance of the various mef2ca phenotypes in an unselected AB background (Figure 1C). Based on penetrance, the symplectic cartilage is the craniofacial structure most sensitive to mef2ca loss (Sucharov et al., 2019). Moreover, we found that symplectic length correlates with opercle bone area (Figure 1—figure supplement 1), suggesting the shortened symplectic phenotype is a good proxy for the entire craniofacial complex. We developed a quantitative phenotyping assay where the symplectic length can be rapidly measured in many animals (Figure 1D). We used the total linear symplectic length (left side plus right side) to examine expressivity (Figure 1E). The average symplectic length was significantly shorter in mutants compared to wild types. The among-individual variation was significantly greater in mutants compared to wild types. When we examined within-individual variation by determining the absolute value of the difference between left- and right-symplectic cartilage length, we found that mutants have more within-individual variation than the wild type (Figure 1F).

Selective breeding reveals that severity and among-individual variation, but not within-individual variation, is heritable and segregates with penetrance

Another strength of the zebrafish system is that penetrance can be altered by selective breeding to better understand the influence of genetic background on severity and variation. An ongoing, long-term selective-breeding experiment in our laboratory derived two strains of zebrafish with consistently low and high penetrance of mef2ca-associated phenotypes in mef2cab1086 homozygous mutants (Nichols et al., 2016; Brooks and Nichols, 2017; Figure 2A). Our previous work examined the phenotypes present in homozygous mutant fish from these strains (Sucharov et al., 2019). Here, we closely examined mef2ca wild types (mef2ca+/+) from these strains. We were surprised to observe that occasionally some mef2ca-associated phenotypes, like shortened symplectic cartilages (19%, n=16) and fused interhyal joints (13%, n=16), were observed in mef2ca+/+ individuals from the high-penetrance strain (Figure 2B). In contrast, we never observed these phenotypes in mef2ca+/+ individuals from the low-penetrance strain. Importantly, these phenotypes are not likely due to general developmental delay because other skeletal structures like pharyngeal teeth are unaffected. Thus, the phenotypes we discovered in high-penetrance mef2ca+/+ individuals are specific to developmental processes associated with mef2ca function. Quantifying the severity of the shortened symplectic cartilage phenotype, we found significant differences between low- and high-penetrance strains for both mef2ca+/+ as well as mef2ca+/-, but homozygous mutants were statistically the same by this measure in the two strains (Figure 2C).

Selective breeding affects mef2ca-associated phenotype severity and variation in mef2ca wild types, heterozygotes, and homozygous mutants.

(A) Selective breeding pedigree illustrating ectopic bone phenotype penetrance inheritance. Dashed boxes indicate families used in this study. Six generations of full-sibling inbreeding produced the animals used here, a more complete pedigree extending back >10 generations can be found in our previous publication (Sucharov et al., 2019). (B) Alcian blue- and Alizarin red-stained animals from the low- and high-penetrance strains were genotyped, and mef2ca homozygous wild types were flat mounted and imaged. The following craniofacial skeletal elements are indicated in a wild-type individual from the low-penetrance strain: opercle bone (op), Meckel’s (mc), ceratohyal (ch), symplectic (sy) cartilages, interhyal (ih), and jaw (jj) joints. Phenotypes normally associated with mef2ca homozygous mutants are present in some wild types from the low-penetrance strain including: ih joint fusions (^) and shortened sy (red arrows). Bars indicating sy length are presented to illustrate the shortened symplectic phenotype present in some high-penetrance wild types but not low-penetrance wild types. A stage-appropriate complement of ankylosed of pharyngeal teeth (pt) are present, and normal sized op bones are present in the individual with shortened sy, indicating the phenotypes we discovered in high-penetrance mef2ca+/+ are not due to general delay. Dashed outline indicates symplectic cartilage. Scale bar: 50 μm (C) Symplectic cartilage length was measured from 6 days post fertilization (dpf) larvae from wild types, heterozygotes, and homozygous mutants from both the low- and high-penetrance strains. p-Values from a Dunnet’s T3 test are indicated (**≤0.01). (D) The coefficient of variation for symplectic length in all three genotypes from both strains was plotted (E) Table listing F-test values testing for significant differences in variation between strains comparing the same genotype. (F) Symplectic cartilage length on left and right sides of 6 dpf zebrafish was measured to determine fluctuating asymmetry or the absolute difference between left and right for all three genotypes from both strains. (G) Table listing the Dunnett’s test for significant differences in fluctuating asymmetry between all three genotypes from both strains. For box and whisker plots, the box extends from the 25th to 75th percentiles. The line in the middle of the box is plotted at the median, and the bars are minimum and maximum values.

Figure 2—source data 1

Symplectic cartilage length measurements from selectively bred strains.

https://cdn.elifesciences.org/articles/79247/elife-79247-fig2-data1-v2.xlsx

When we examined variation in symplectic cartilage length, we compared like genotypes between strains and observed that among-individual variation was greater in the high-penetrance strain for homozygous wild types and heterozygotes, and there was no significant difference in variation between strains for homozygous mutants (Figure 2D and E). Thus, the genetic background producing more severe phenotypes also exhibited more among-individual variation, even in wild types. Within-individual variation was not significantly different between backgrounds (Figure 2F and G).

Six mef2 paralogs in the zebrafish genome share highly conserved amino acid sequences

We sought to explore potential mechanisms underlying the differences in severity and variation between strains. Work from budding yeast demonstrates that gene duplication contributes to genetic robustness against null mutations and that the probability of compensation by gene duplicates is correlated with sequence similarity among the gene duplicates (Gu et al., 2003). Therefore, we hypothesized that mef2ca duplicates with high sequence similarity would modulate severity and variation. There are five mef2ca paralogs in the zebrafish genome. To determine which are the most similar to mef2ca and therefore more likely to compensate for its loss, we used Clustal Omega to compare the amino acid sequences of mef2 paralogs. While the C-terminal domain is divergent among different mef2 genes (Molkentin et al., 1996; Figure 3—figure supplement 1), the zebrafish mef2 paralogs each encode a MADS box and MEF2 domain which are remarkably similar in sequence across all paralogs. We found that the paralog with the most sequence similarity to mef2ca is mef2cb (Figure 3A and B). Together, these genes are the co-orthologs of mammalian Mef2c. mef2aa and mef2ab, co-orthologs of mammalian Mef2a, also share similar sequences to each other and the mef2c pair. By these analyses, mef2d and mef2b sequences are more divergent. There is only one zebrafish ortholog for Mef2d and Mef2b, presumably the other co-ortholog for each was lost. These results and interpretations are consistent with previous analyses of the evolutionary history of the mef2 family (Chen et al., 2017; Wu et al., 2011a; Martindale et al., 2004; He et al., 2019; Wu et al., 2011b). Thus, these duplicates are strong candidates for mechanistically underlying the differences we observe between strains.

Figure 3 with 1 supplement see all
Zebrafish mef2 paralogs encode highly conserved MADS box (MCM1, agamous, deficiens, and SRF) and MEF2 domains.

(A) Neighbor joining tree generated by Clustal Omega Multiple Sequence Alignment tool depicts the evolutionary relationships between the different zebrafish mef2 paralogs. The distance values (branch length) are indicated, which represent the evolutionary distance between the individual amino acid sequences and a consensus sequence. (B) mef2-encoded protein sequence alignment reveals high conservation of MADS box (green) and MEF2 (yellow) domains among all six paralogs. These domains are responsible for DNA binding, dimerization, and cofactor interactions. Transcript IDs used for alignment using the HHalign algorithm are listed in the Materials and methods.

Closely related mef2 paralogs share similar temporospatial expression dynamics

Our previous work, and that of others, demonstrates when (20–30 hpf) (hours post fertilization) and where (cranial neural crest cells) mef2ca functions during craniofacial development (De Laurier et al., 2014; Sucharov et al., 2019; Miller et al., 2007). We therefore hypothesized that mef2ca duplicates with temporospatial expression dynamics similar to mef2ca would modulate severity and variation. To examine paralog temporal expression dynamics, we performed RT-quantitative PCR (qPCR) (quantitative polymerase chain reaction) on wild-type zebrafish heads from 20 to 30 hpf (Figure 4A). We found that closely related paralogs share similar gene expression dynamics. For example, mef2ca and mef2cb have both early (~24 hpf) and late (~28 hpf) expression peaks. mef2aa and mef2ab share an early expression peak. mef2d and mef2b have a single late and early expression peak, respectively.

Figure 4 with 1 supplement see all
Wild-type gene expression studies reveal mef2 paralog expression dynamics, strain-specific expression levels, and standing paralog expression variation.

(A) Wild-type head expression of all mef2 paralogs was quantified by RT-quantitative PCR (qPCR) at 1 hr intervals from 20 to 30 hpf. Expression of each paralog was normalized to rps18. Error bars are SD. (B) Expression of each paralog in cranial neural crest cells at 24 hpf was determined by single cell RNA-sequencing on sorted cells. Seurat-based clustering subdivided the cells into four populations as described (Mitchell et al., 2021): anterior arches (aa), frontonasal (fn), posterior arches (pa), and a satellite population containing melanocyte lineage cells (m). (C) Wild-type head expression of all mef2 paralogs was quantified by RT-qPCR at 24 hpf to compare paralog expression levels between the low- and high-penetrance strains. Expression of each paralog was normalized to rps18. Asterisks indicate significant difference (*≤0.05 and **≤0.01). Error bars are SD. (D) Wild-type head expression of all mef2 paralogs was quantified by RT-qPCR at 24 hpf to compare paralog expression levels between two families from the unselected AB strain. Expression of each paralog was normalized to rps18. Asterisks indicate significant difference (*≤0.05, ***≤0.001, and ****≤0.0001). Error bars are SEM.

To examine paralog expression in the cells that give rise to the craniofacial skeleton, we examined our previously published single cell RNA-sequencing dataset to monitor mef2 paralog expression in isolated 24 hpf wild-type cranial neural crest cells (Mitchell et al., 2021). While this method does not capture all the cells orchestrating craniofacial development, it does allow us to specifically assay cranial neural crest cells, including those residing in the anterior arches, which are the precursors of the cells which form the skeletal elements affected by mef2ca loss. We found that mef2ca has the strongest expression of all the paralogs across different populations of cranial neural crest cells and that expression is strongest in the anterior arches (Figure 4B). The closely related paralog mef2cb is the next highest expressed, while other paralogs are more weakly expressed in these craniofacial progenitor cells at this stage.

mef2 paralogs are differentially expressed in the low- versus high-penetrance strains, and paralog expression varies in an unselected background

Data showing that paralog transcriptional adaptation does not account for the phenotypic differences between the low- and high-penetrance strains (Sucharov et al., 2019) do not rule out the possibility that selective breeding changed paralog expression between strains in mef2ca wild types. To examine this possibility, we used RT-qPCR to compare mef2 paralog expression between wild types from the high- and low-penetrance strains. Strikingly, we found that mef2aa, mef2ab, mef2b, mef2ca, and mef2d were all significantly differentially expressed in heads from wild types from the high-penetrance strain compared with heads from wild types from the low-penetrance strain (Figure 4C). These findings strongly suggest that one outcome of selective breeding for low- and high-mef2ca phenotype penetrance is generally increased and decreased expression, respectively, of the mef2 paralogs. We do not see overall increases in transcription in the low-penetrance strain compared with the high-penetrance strain; housekeeping genes are not significantly upregulated in the low-penetrance strain (Figure 4—figure supplement 1).

The differences in paralog expression between selectively bred strains suggest that we might be selecting upon standing variation in paralog expression present in the unselected strain. To test this, we examined paralog gene expression in heads from offspring of two separate families of unselected AB wild types. We found significant differences in mef2ab, mef2b, mef2cb, and mef2d expression between these families. Of note, the direction of paralog expression differences between families was consistent across paralogs. That is, family 1 had generally higher mef2 paralog expression compared with family 2. These data support that paralog expression variation is present in unselected lines and that our selective breeding for penetrance enriched for these expression differences.

mef2cb buffers against mef2ca loss

We hypothesized that the paralog expression differences that we discovered between strains underlie the differences in severity and variation between the low- and high-penetrance strains. To test this hypothesis, we studied mef2ca mutants in the context of mef2 paralog mutations. For mef2cb, we crossed a previously generated allele (De Laurier et al., 2014) into the mef2ca low-penetrance strain. We then maintained this stock by outcrossing to unselected AB. mef2cb is the paralog with the highest degree of sequence similarity to mef2ca and is the second highest expressed mef2 paralog in cranial neural crest cells behind mef2ca (Figures 3 and 4). We confirmed that mef2cb homozygous mutants do not have an overt craniofacial phenotype and are homozygous viable (De Laurier et al., 2014; Figure 5A, B and C), indicating that mef2cb is not required for zebrafish craniofacial development in an otherwise wild-type background. However, when we removed one functional copy of mef2ca from mef2cb homozygous mutants, mef2ca mutant-associated phenotypes developed (Figure 5B and C), phenocopying the high-penetrance strain (Sucharov et al., 2019). We find further evidence for phenocopy; mef2ca homozygous mutant phenotype penetrance increased when we removed one copy of mef2cb (Figure 5B and D). Removing both copies of mef2cb from mef2ca homozygous mutants produces severe, nonspecific defects which make larvae impossible to meaningfully study (De Laurier et al., 2014) although their craniofacial skeletons are severely affected (Figure 5—figure supplement 1). Measuring the symplectic cartilage length further demonstrates that when mef2cb is fully functional, development is buffered against partial loss of mef2ca; there is no difference in symplectic length between wild types and mef2ca heterozygotes (wild type versus mef2ca+/-;mef2cb+/+). In contrast, when mef2cb is disabled, development is sensitive to partial loss of mef2ca (wild type versus mef2ca+/-;mef2cb-/-) (Figure 5E). Removing copies of mef2cb from mef2ca wild types does not significantly change symplectic cartilage length but does significantly increase symplectic cartilage variation (Figure 5F). Thus, even in the mef2ca wild-type context, this paralog buffers against phenotypic variation. We conclude that mef2cb buffers against mef2ca-associated phenotype severity, and among-individual variation, but not within-individual variation (Figure 5D, E and Figure 5—figure supplement 2). However, our gene expression study in high- and low-penetrance strains indicated that several mef2 paralogs are differentially expressed between strains. Therefore, we examined how other paralogs might also buffer against mef2ca loss.

Figure 5 with 2 supplements see all
mef2cb function buffers against mef2ca loss.

(A) Schematic of mef2cb exonic structure, mutant allele used in this study, and regions encoding proposed functional domains are annotated. (B) Zebrafish heterozygous for both mef2cab1086 and mef2cb were pairwise intercrossed. 6 days post fertilization (dpf) larvae were stained with Alcian blue and Alizarin red to label cartilage and bone. Stained larvae were genotyped, flat mounted, and imaged. The following craniofacial skeletal elements are indicated in a wild-type individual: opercle bone (op), branchiostegal ray (br), Meckel’s (mc), ceratohyal (ch), symplectic (sy) cartilages, interhyal (ih), and jaw (jj) joints. Indicated phenotypes associated with mef2ca mutants include: ectopic bone (arrowheads), interhyal and jaw-joint fusions (^), dysmorphic ch (arrows), reduced mc (double arrowhead), and a shortened sy (red arrows). Dashed outline indicates symplectic cartilage. Scale bar: 50 μm (C and D) The penetrance of mef2ca mutant-associated phenotypes observed in 6 dpf larvae is indicated. Asterisk indicates significant difference in penetrance between the indicated genotypes by Fishers exact test. (E) Symplectic cartilage length was measured from 6 dpf larvae from the indicated genotypes. Asterisks indicate significant differences in symplectic length. The p-values from a Dunnet’s T3 test are indicated (*≤0.05, ***≤0.001, and ****≤0.0001) (F) Table listing F-test values for significant differences in variation between genotypes. For box and whisker plots, the box extends from the 25th to 75th percentiles. The line in the middle of the box is plotted at the median, and the bars are minimum and maximum values. N’s for all analyses are indicated in C and D.

mef2d buffers against mef2ca homozygous mutant severity but not variability

We detected mef2d transcripts in the anterior arch population of 24 hpf wild-type cranial neural crest cells, and mef2d expression is significantly lower in the high-penetrance strain compared with the low-penetrance strain (Figure 4C and D). To explore the function of this paralog, we generated a mef2d mutant allele (Figure 6A). For mef2d and all subsequent paralog functional experiments (mef2b and mef2aa), we generated new mutant alleles using (Clustered regularly interspaced short palindromic repeats) CRISPR/Cas9 mutagenesis in the low-penetrance strain. We then maintained these stocks by outcrossing to unselected AB. Homozygous mef2d mutants did not develop any overt skeletal phenotypes in an otherwise wild type, unselected background (Figure 6B). Intercrosses between heterozygotes produced homozygous mutant adults at the expected Mendelian frequency, indicating that mef2d is not required for craniofacial development or viability in laboratory conditions. Similarly, Mef2d mutant mice are viable and display no overt phenotypic abnormalities in standard laboratory conditions (Arnold et al., 2007; Kim et al., 2008). To test our hypothesis that reduced levels of mef2d expression associated with selective breeding for high mef2ca penetrance contribute to severity, we examined offspring from mef2ca;mef2d double heterozygous parents (Figure 6B). When we examined the penetrance of all mef2ca mutant-associated skeletal phenotypes in mef2ca homozygous mutants, we found that the frequency of ventral cartilage defects was significantly increased by removing a single functional copy of mef2d and further increased in double homozygous mutants (Figure 6C). The penetrance of other mef2ca-associated phenotypes was not significantly changed when mef2d function was removed. While symplectic length was not affected in mef2d single mutants, we found that removing both copies of mef2d from mef2ca homozygous mutants significantly shortened symplectic cartilage length. We did not observe significant changes in among-individual variation when we removed mef2d function from mef2ca homozygotes, but symplectic cartilage length is more variable in mef2ca heterozygotes when mef2d is mutated (Figure 6E).

mef2d function buffers against mef2ca loss.

(A) Schematic of mef2d exonic structure, mutant allele used in this study, and regions encoding proposed functional domains are annotated. (B) Zebrafish heterozygous for both mef2ca b1086 and mef2d were pairwise intercrossed. 6 days post fertilization (dpf) larvae were stained with Alcian blue and Alizarin red to label cartilage and bone. Stained larvae were genotyped, flat mounted, and imaged. The following craniofacial skeletal elements are indicated in a wild-type individual: opercle bone (op), branchiostegal ray (br), Meckel’s (mc), ceratohyal (ch), symplectic (sy) cartilages, interhyal (ih), and jaw (jj) joints. Indicated phenotypes associated with mef2ca mutants include: ectopic bone (arrowheads), interhyal and jaw-joint fusions (^), dysmorphic ch (arrows), reduced mc (double arrowhead), and a shortened sy (red arrows). Dashed outline indicates symplectic cartilage. Scale bar: 50 μm (C) The penetrance of mef2ca mutant-associated phenotypes observed in 6 dpf larvae is indicated. Asterisk denotes significant difference in penetrance between the indicated genotypes by Fishers exact test. (D) Symplectic cartilage length was measured from 6 dpf larvae from the indicated genotypes, and asterisk indicates significant differences in symplectic length (*≤0.05, **≤0.01, and ***≤0.001). (E) Table listing F-test values testing for significant differences in variation between genotypes. For box and whisker plots, the box extends from the 25th to 75th percentiles. The line in the middle of the box is plotted at the median, and the bars are minimum and maximum values. N’s for all analyses are indicated in C.

mef2b buffers against mef2ca homozygous mutant variation but not severity

mef2b is the most divergent mef2ca paralog by amino acid sequence and is only minimally expressed in cranial neural crest cells at 24 hpf (Figures 3 and 4). However, we did observe significantly higher mef2b expression in the low-penetrance strain compared with the high-penetrance strain (Figure 4D). We generated a mef2b mutant allele to test for a role for this gene in craniofacial development (Figure 7A). Homozygous mef2b mutants did not exhibit any overt skeletal phenotypes (Figure 7B) and intercrosses between animals heterozygous for this allele produced homozygous mutant adults at the expected Mendelian frequency. When we removed functional copies of mef2b from mef2ca homozygous mutants, we did neither observe any significant changes in mef2ca mutant-associated phenotype penetrance (Figure 7C) nor did we detect further reductions in symplectic cartilage length when functional copies of mef2b were removed from mef2ca homozygous mutants (Figure 7D). These findings indicate that mef2b function does not affect mef2ca mutant severity. However, we do observe a modest increase in among-individual variation in mef2ca homozygous mutants when mef2b is disabled (Figure 7E).

mef2b function buffers against mef2ca loss.

(A) Schematic of mef2b exonic structure, mutant allele used in this study, and regions encoding proposed functional domains are annotated. (B) Zebrafish heterozygous for both mef2ca b1086 and mef2b were pairwise intercrossed. 6 days post fertilization (dpf) larvae were stained with Alcian blue and Alizarin red to label cartilage and bone. Stained larvae were genotyped, flat mounted, and imaged. The following craniofacial skeletal elements are indicated in a wild-type individual: opercle bone (op), branchiostegal ray (br), Meckel’s (mc), ceratohyal (ch), symplectic (sy) cartilages, interhyal (ih), and jaw (jj) joints. Indicated phenotypes associated with mef2ca mutants include: ectopic bone (arrowheads), interhyal and jaw-joint fusions (^), dysmorphic ch (arrows), reduced mc (double arrowhead), and a shortened sy (red arrows). Scale bar: 50 μm. (C) The penetrance of mef2ca mutant-associated phenotypes observed in 6 dpf larvae is indicated. (D) Symplectic cartilage length was measured from 6 dpf larvae from the indicated genotypes, and asterisk indicates significant differences in symplectic length (***≤0.001 and ****≤0.0001). (E) Table listing F-test values testing for significant differences in variation between genotypes. For box and whisker plots, the box extends from the 25th to 75th percentiles. The line in the middle of the box is plotted at the median, and the bars are minimum and maximum values. N’s for all analyses are indicated in C.

mef2aa buffers against mef2ca heterozygous phenotypes but not mef2ca homozygous mutant phenotypes

mef2aa is minimally expressed in cranial neural crest cells (Figure 4). However, mef2aa expression differs between strains (Figure 4C). We generated a mef2aa mutant (Figure 8A) that does not develop any overt skeletal phenotypes when homozygous (Figure 8B). When we removed mef2aa from mef2ca homozygous mutants, we did not observe any significant changes in mef2ca-associated phenotype penetrance (Figure 8C), symplectic cartilage length (Figure 8D), or variation (Figure 8E). However, when we removed one functional copy of mef2ca from mef2aa homozygous mutants, mef2ca-associated phenotypes developed with low frequency (Figure 8B). Specifically, some mef2ca heterozygous animals developed nubbins (small lumps of cartilage) and shortened symplectic cartilages; phenotypes traditionally only associated with homozygous mutants (Piotrowski et al., 1996). Neither of these phenotypes are ever seen in unselected mef2ca heterozygotes but do develop in mef2ca heterozygotes from the high-penetrance strain (Sucharov et al., 2019). Therefore, disabling mef2aa partially phenocopies the high-penetrance strain.

mef2aa function buffers against mef2ca partial loss.

(A) Schematic of mef2aa exonic structure, mutant allele used in this study, and regions encoding proposed functional domains are annotated. (B) Zebrafish heterozygous for both mef2ca b1086 and mef2aa were pairwise intercrossed. 6 days post fertilization (dpf) larvae were stained with Alcian blue and Alizarin red to label cartilage and bone. Stained larvae were genotyped, flat mounted, and imaged. The following craniofacial skeletal elements are indicated in a wild type: opercle bone (op), branchiostegal ray (br), Meckel’s (mc), ceratohyal (ch), symplectic (sy) cartilages, interhyal (ih), and jaw (jj) joints. Indicated phenotypes associated with mef2ca mutants include: cartilage nubbin fused to the mc symphysis (brown arrowhead), ectopic bone (black arrowheads), ih and jj fusions (^), dysmorphic ch (arrows), reduced mc (double arrowhead), and a shortened sy (red arrows). Scale bar: 50 μm. (C) The penetrance of mef2ca mutant-associated phenotypes observed in 6 dpf larvae is indicated. (D) Symplectic cartilage length was measured from 6 dpf larvae from the indicated genotypes, and asterisk indicates significant differences in symplectic length (****≤0.0001). (E) Table listing F-test values testing for significant differences in variation between genotypes. For box and whisker plots, the box extends from the 25th to 75th percentiles. The line in the middle of the box is plotted at the median, and the bars are minimum and maximum values. N’s for all analyses are indicated in C.

Discussion

Vestigial paralog expression may provide developmental robustness

Vertebrate mef2 functions downstream of endothelin signaling in the developing craniofacial skeleton (Miller et al., 2007; Verzi et al., 2007). The endothelin pathway was subfunctionalized following whole genome duplications in vertebrates (Square et al., 2020). Thus, mef2 genes may have been subfunctionalized following genome duplications. Comparing mice and zebrafish further supports mef2 subfunctionalization. In mice, Mef2c is required for both heart and craniofacial development (Verzi et al., 2007; Lin et al., 1997). In zebrafish, both co-orthologs (mef2ca and mef2cb) function redundantly in the heart (Hinits et al., 2012), while craniofacial function has been subfunctionalized to just mef2ca (Figure 9). However, it is possible that the ancestral craniofacial function and expression pattern of mef2cb are partially retained following subfunctionalization, even though it is no longer required for this function. We propose that while duplicated genes can evolve new expression domains and functions, vestiges of their original expression pattern remain and can buffer against loss of another paralog. Consistently, with the exception of mef2ca, the mef2 paralog mutants we analyzed here do not have an overt single mutant skeletal phenotype but do modify mef2ca mutant phenotypes. In our system, selective breeding likely fixed existing paralog expression variation. Thus, in the low-penetrance strain, selection may replicate paralog ancestral expression, restoring near full redundancy (Figure 9).

Model for mef2 gene duplicate evolution and replication of ancestral, redundant expression via selective breeding.

In our model, an ancestral mef2 gene existed with distinct expression domains. In this example, ancestral mef2c had expression in the head and the heart. Following whole genome duplication, initially the regulatory and coding sequences would be identical. Through divergence and natural selection, factors controlling gene regulation would acquire mutations that dampen, but do not eliminate, some expression domains. In this example, mef2cb expression is dampened in the head. These gene expression changes result in subfunctionalization but with vestigial retention of the original expression. mef2cb is no longer required for craniofacial development, yet traces of the original craniofacial expression remain. The amount of vestigial expression is variable among individuals, resulting in variable buffering against mef2ca mutant phenotypes across a population. Our selective breeding selected on this variable expression resulting in higher paralog expression in the low-penetrance strain compared with the high-penetrance strain. Thus, mef2 paralog expression in the low-penetrance strain resembles the ancestral condition following whole genome duplication, when both copies’ expression profiles were highly similar, and the genes were redundant for craniofacial development.

Our experiments support a model where we selected upon pre-existing alleles that either amplify or dampen existing vestigial paralog expression variation, depending on the direction of selection. We found that expression of several paralogs was affected by selection. Previously however, we reported that the rapid response to selection in our system indicates that relatively few heritable factors contribute to penetrance (Nichols et al., 2016; Sucharov et al., 2019). To account for this paradox, we propose that a few shared genetic factors regulate paralog expression. By selecting for penetrance, expression of several paralogs was increased or decreased via selection for these shared genetic factors. In support, we found evidence of shared gene regulation in our experiments examining mef2 gene expression dynamics. The expression dynamics of paralogs closely related in sequence and phylogeny are similar, suggesting shared regulatory modules. In this model, few factors are inherited that control the expression of many paralogs. These mef2 paralog regulatory factors might be either shared mef2 enhancers (in cis) or might be in shared upstream transcriptional regulators (in trans) that simultaneously regulate several mef2 paralogs. Future studies will discern between these two, not mutually exclusive, possibilities.

Evidence for genetic assimilation

It is fascinating that a phenotype originally associated with a homozygous loss of function mutation can appear in homozygous wild types after selective breeding for high penetrance. We previously observed a similar phenomenon with heterozygotes in the high-penetrance strain and proposed that this is a form of genetic assimilation (Sucharov et al., 2019; Waddington, 1953). In Waddington’s genetic assimilation experiments (Waddington, 1942; Waddington, 1953; Waddington, 1959; Waddington, 1952; Waddington, 1956; Waddington, 1957), he studied phenotypes that were originally only present in perturbed conditions. Following selective breeding, the phenotypes arose without the perturbation. Similarly, we studied phenotypes that were originally only present in perturbed conditions, mef2ca mutants. Following selective breeding, the phenotypes arose without the perturbation, in wild types. We offer a mechanistic explanation for our findings. Analyzing craniofacial skeletons in the context of the mef2ca mutation revealed cryptic paralog expression variation. Specifically, in wild types, paralog expression variation does not impact phenotype. However, in mutants, the consequences of variable expression levels are unveiled, allowing us to select upon them. Because the paralogs seem to have shared regulatory systems, we enriched for variants producing lower expression of several paralogs, including mef2ca, by selective breeding for high penetrance. This pan-paralog expression decrease led the high-penetrance strain to be both extremely sensitive to coding mutations in mef2ca and to manifest mef2ca mutant-associated phenotypes in mef2ca wild types from this strain. The latter is likely due to decreased mef2ca expression in high-penetrance mef2ca wild types.

Decoupling buffering mechanisms

We found that paralogs modularly buffer the mef2ca homozygous mutants. For example, disabling mef2cb affects most mef2ca-related craniofacial phenotypes, while mef2d mutations primarily affect penetrance of ventral cartilage phenotypes, and only variation is affected when mef2b function is removed. Although we were unable to mutagenize mef2ab, despite significant effort, this is not likely to affect our general conclusions. Similar to work in other systems (Gu et al., 2003), the paralog with the highest expression (mef2ca) is the one associated with a single mutant phenotype, and the paralog with the most sequence similarity (mef2cb) is the one playing the largest buffering role. However, others have reported that partial gene expression overlap is also predictive of buffering capacity (Kafri et al., 2005; Kafri et al., 2006), in line with our results that even distantly related paralogs with more dissimilar expression profiles (mef2d) can modify penetrance in our system. Different types of compensation events have previously been classified as either active or passive (Diss et al., 2014). In our system, compensation is likely passive, but subject to selection, which can change robustness across generations.

Determining whether the specific portion of the phenotype that each paralog buffers is related to that paralog’s wild-type expression pattern would expand our understanding of modularity. Unfortunately, in situ gene expression protocols were not sensitive enough to detect the low expression of the individual paralogs. Moreover, the quantitative phenotyping in this study was limited to the symplectic cartilage, as a proxy for the whole system (Figure 1—figure supplement 1). However, quantitative phenotyping of other parts of the developing craniofacial complex (Sasaki et al., 2013; Kimmel et al., 2015) might reveal more information about modular buffering of severity and variation by paralogs. For example, penetrance of ceratohyal cartilage defects was affected by mef2d, and therefore quantitative variation of this structure, which would be difficult to measure, might also be affected by loss of mef2d.

Our experiments also decouple the mechanisms that buffer among- and within-individual variation. Examining selectively bred strains and the different paralog mutants demonstrates that buffering in our system only regulates among-individual variation, not within-individual variation (Figure 5—figure supplement 2). These two types of variation are buffered by different mechanisms in our system. Our results shed some light on the long-standing debate surrounding similarities and differences between mechanisms buffering these two types of variation (Hallgrimsson et al., 2019).

These studies advance our understanding of craniofacial variation. We propose that cryptic variation in vestigial paralog expression is the noise underlying variable craniofacial development. Whole genome duplications producing paralogs are not zebrafish specific or specific to this gene family. In fact, previous work demonstrates that paralogs contribute to robustness in diverse systems (Gu et al., 2003; Kamath et al., 2003; De Kegel and Ryan, 2019). Thus, even human craniofacial variation might be due to cryptic paralog expression variation and may explain how some genetically resilient humans (Chen et al., 2016) can overcome a deleterious mutant allele.

Materials and methods

Zebrafish strains and husbandry

Request a detailed protocol

All fish were maintained and staged according to established protocols (Kimmel et al., 1995; Westerfield, 1993). Selective breeding was performed as previously described (Nichols et al., 2016; Sucharov et al., 2019; Brooks and Nichols, 2017). Briefly, because the mef2ca mutation is lethal when homozygous, heterozygous full-sibling parents were intercrossed, and their homozygous mutant offspring were scored for penetrance when determining what family was to be propagated for the next generation. The mef2cab1086 and mef2cbfh288 mutant alleles have been previously described (Hinits et al., 2012; Miller et al., 2007) and were maintained by outcrossing to the unselected AB background. When we first introduced the mef2cbfh288 allele to mef2cab1086, we crossed it to the low-penetrance mef2cab1086 selectively bred strain and subsequently maintained these double heterozygotes by outcrossing to the unselected AB background.

CRISPR/Cas9-induced mutant alleles

Request a detailed protocol

We generated germline mutant alleles using CRISPR/Cas9 mutagenesis (Hwang et al., 2013) with modifications as described (Mitchell et al., 2021). Briefly, we designed sgRNAs(Single-guide RNA) within or just downstream of the MADS or MEF2 domain. The XbaI-digested pT3TS-nCas9n plasmid (Addgene plasmid #46757) was used as a template to transcribe Cas9 mRNA with the T3 mMESSAGE kit (Invitrogen). We transcribed sgRNAs (see table below) from PCR-generated templates using the MEGAscript T7 Kit (Thermo Fisher Scientific). One cell-stage embryos were injected with a mix of 200 ng/µl Cas9 mRNA and 50 ng/µl of each gene-specific sgRNA. Injected embryos were raised, and founders identified by amplifying the genomic region containing the sgRNA site and identifying banding size shifts indicating insertions and/or deletions (see genotyping assay table for primers). All new paralog mutants were originally generated in the low-penetrance strain and were subsequently maintained on the AB background for at least three generations. The following sgRNAs were used: mef2dco3013: 5’-GGACAAATACCGGAAGAGCG-3’; mef2bco2012: 5’-CACGAGAGCCGCACTAACAC-3’; mef2aaco3017: 5’-TCATGGACGACCGTTTCGGC-3’.

We generated six independent sgRNAs for mef2ab, and none of them mutagenized this locus. Precise sequences of mutant alleles are indicated below:

GeneMutant insertion (underlined) and deletion (italicised) sequence
mef2d co3013Exon3: AATACCGGAAGATCGAGGAGCTGGATATCCTC
mef2b co3012Exon3: AACCTCACGAGAGCCGCACTAACAC
mef2aaco3017Exon5: TCATGCCCCTGGACGACCGTTTCGGCAAA

Cartilage and bone staining and imaging

Request a detailed protocol

Fixed animals were stained with Alcian blue and Alizarin red as described previously (Brooks and Nichols, 2017; Walker and Kimmel, 2007). Alcian blue- and Alizarin red-stained 6 days post fertilization (dpf) skeletons were dissected and flat mounted for Nomarski imaging on a Leica DMi8 inverted microscope equipped with a Leica DMC2900 as previously described (Nichols et al., 2013).

Phenotype scoring

Request a detailed protocol

For penetrance scoring, 6 dpf Alcian blue- and Alizarin red-stained skeletons were genotyped then scored for the proportion of animals with a given genotype that exhibit a particular phenotype. In the interest of strong rigor and reproducibility, phenotypes were scored by three observers blinded to genotype. All three agreed with the number of animals in each phenotypic class, indicating that phenotype penetrance can be reproducibly identified by different observers. For symplectic cartilage measurements, whole mount Alcian blue- and Alizarin red-stained skeletons were imaged under a transmitted light dissecting scope. Images were captured with Zeiss ZEN software. This software was then used to measure the linear distance in microns from the posterior most point of the interhyal cartilage and the distal tip of the symplectic cartilage similar to a previous study (Talbot et al., 2016). For each individual, the total symplectic length was calculated by summing the measurements from the left and right side. These same measurements were also used to calculate the absolute value of the left minus the right symplectic cartilage lengths to determine developmental instability for each individual. All raw phenotype data are presented in source data tables.

Reverse transcription-quantitative polymerase chain reaction

Request a detailed protocol

Gene expression studies were performed as previously described (Sucharov et al., 2019). For the time course study from unselected AB and for comparing low- and high-penetrance wild types, live individual 24 hpf embryos from each strain had their heads removed. Decapitated bodies were genotyped to identify homozygous wild types. Heads from five to six identified homozygous wild types were pooled, and total RNA was extracted with TRI Reagent. cDNA was prepared with Superscript III from Invitrogen. qPCR experiments utilized a real-time PCR StepOnePlus system from Applied Biosystems and SYBR green. A standard curve was generated from serially diluted (1:2:10) cDNA pools, and primers with a slope of –3.3 ±0.3 were accepted. The relative quantity of target cDNA was calculated using Applied Biosystems StepOne V.2.0 software and the comparative Ct method. After surveying the expression of many housekeeping genes at multiple stages, we determined that rps18 expression was the most consistent across stages, genotypes, and strains. Target gene expression in all experiments was normalized to rps18. Reactions were performed in technical triplicate, and the results represent two to six biological replicates. The following primers were used: rps18 FW, 5’-CTGAACAGACAGAAGGACATAA-3’ and rps18 REV 5’-AGCCTCTCCAGATCTTCTC-3’, mef2ca FW, 5’-GTCCAGAATCCGAGGACAAATA-3’ and mef2ca REV 5’-GAGACAGGCATGTCGTAGTTAG-3’, mef2cb FW, 5’-AGTACGCCAGCACAGATA-3’ and mef2cb REV 5’-AGCCATTTAGACCCTTCTTTC –3’, mef2aa FW, 5’-CCACGAGAGCAGAACCAACTC-3’ and mef2aa REV 5’-GTCCATGAGGGGACTGTGAC-3’, mef2ab FW, 5’-AACCTCACGAGAGCAGAACC-3’ and mef2ab REV 5’-AGGACATATGAGGCGTCTGG-3’, mef2b FW, 5’-CCGATATGGACAAAGTGCTG-3’ and mef2b REV 5’-CCAATCCCAATCCTTTCCTT-3’, mef2d FW, 5’-TTCCAGTATGCCAGCACTGA-3’ and mef2d REV 5’-CGAATCACGGTGCTCTTTCT-3’. All qPCR numerical data and statistical analyses are reported in supplementary data table.

scRNA-seq analysis

Request a detailed protocol

We analyzed our published data set (Mitchell et al., 2021) from 24 hpf wild-type AB zebrafish cranial neural crest cells for mef2 paralog expression. Seurat’s ‘VlnPlot’ function was used to generate mef2 paralog plots indicating the distribution of cells expressing each paralog across each of the four clusters we described in the previous publication. The raw, feature-barcode matrix for this dataset can be accessed from the GEO database (accession number GSE163826).

Genotyping assays

Request a detailed protocol

mef2cab1086 was genotyped by KASP as previously described (Brooks and Nichols, 2017). PCR-based genotyping assays are as follows:

GeneallelePrimersEnzymewtmut
mef2dco3013Two separate reactions are run per sample.
First reaction: Fw Full, Rv Full, and Fw wt
Second reaction: Fw Full, Rv Full, and Rev mut
Fw Full: 5’-AAGAAAGGCTTTAACGGTTGC-3’
Rv Full: 5’-AAGAGAAGGACGGAGGTTAGA-3’
Fw wt: 5’-GACAAATACCGGAAGAGCGA-3’
Rv mut: 5’-AGAGGATATCCAGCTCCTCTTC-3’
None173 bp and 98 bp121 bp and 173 bp
mef2aaco3017Fw: 5’-TTGACCCAACGGTTTACAGA-3’
Rv: 5’-CACAAAGCCAAGCAAAAACA-3’
NlaIII291 bp and 145 bpUncut
442 bp
mef2cb fh288Fw: 5'-TCCCTGCTTCTCTCTAGGTGACATTTACATCG-3 ‘
Rv: 5'-TCGTGTGGCTCGTTGTACTC-3’
TaqaI190 bp and 10 bpUncut
200 bp
mef2b co3012Fw: 5’-CGAGATCGCTCTCATCATCTT-3’
Rv: 5’-GACATACTGGAGGTATACAGACCAAA-3’
AciI112 bp and 38 bpUncut
139 bp

Sequence alignments

Request a detailed protocol

We used the Clustal Omega tool by EMBL-EBI for multiple sequence alignment of the different paralog gene products. We obtained the following transcripts and protein products from ENSEMBL for alignments: mef2aa-206 ENSDART00000171594.2, mef2ab-201 ENSDART00000173414.2, mef2b-202 ENSDART00000166300.3, mef2ca-202 ENSDART00000099134.5, mef2cb-207 ENSDART00000183585.1, and mef2d-203 ENSDART00000132589.2ENSDART00000132589.2.

Statistical analyses

Request a detailed protocol

Penetrance scores were compared using Fisher’s exact test to determine significance. All scoring data and exact p-values are reported in source data tables. Studies with symplectic cartilage measurements are presented as box and whisker plots, and the box extends from the 25th to 75th percentiles. The line in the middle of the box is plotted at the median, and the bars are minimum and maximum values. We used a Welch’s t-test or Dunnet’s T3 test to compare total symplectic cartilage (left plus right sides) between genotypes. We used F-test to test for significant differences in variation between genotypes. For developmental instability, the absolute value of the difference between left and right symplectic cartilage lengths were grouped by genotype, and Welch’s t-test, Brown-Forsythe, or Welch’s ANOVA tests were used to determine significant differences in left-right asymmetry between genotypes. Power analyses were used to determine the number of animals to be examined for each experiment. All statistical analyses and exact p-values are reported in source data.

Data availability

All raw data are provided in supplementary data table. Sequencing dataset have been deposited in GEO. The raw, feature-barcode matrix can be accessed from the GEO database (accession number GSE163826).

The following previously published data sets were used
    1. Mitchell, et al
    (2021) NCBI Gene Expression Omnibus
    ID GSE163826. The alx3 gene shapes the zebrafish neurocranium by regulating frontonasal neural crest cell differentiation timing.

References

    1. Valen LV
    (1962) A study of fluctuating asymmetry
    Evolution; International Journal of Organic Evolution 16:125–142.
    https://doi.org/10.1111/j.1558-5646.1962.tb03206.x
    1. Wagner A
    (2008) Gene duplications, robustness and evolutionary innovations
    BioEssays : News and Reviews in Molecular, Cellular and Developmental Biology 30:367–373.
    https://doi.org/10.1002/bies.20728
  1. Book
    1. Westerfield M
    (1993)
    The Zebrafish Book: A Guide for the Laboratory Use of Zebrafish
    Eugene: University of Oregon Press.

Decision letter

  1. Tatjana Piotrowski
    Reviewing Editor; Stowers Institute for Medical Research, United States
  2. Christian R Landry
    Senior Editor; Université Laval, Canada

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Variable paralog expression underlies phenotype variation" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Christian Landry as the Senior Editor. The reviewers have opted to remain anonymous.

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

The three reviewers agree that the question addressed is of great interest to evolutionary and developmental biologists in general and to those studying the evolution of developmental mechanisms in particular. The manuscript reports substantial and interesting advances, but in its current form it is difficult to read, which led to significant frustration among the reviewers. The manuscript needs to be substantially rewritten and shortened. The writing and interpretation of the results needs to be clarified as many statements are perceived as lacking precision. One of many examples is the statement that each selected strain has similar inbredness. However, there is no evidence that this was actually tested. The authors should also ensure that their conclusions match the results and are not over interpreted, pointing out explicitly caveats where necessary. Finally, the results need to be discussed in relation to what has been published, for example how does the phenomenon studied here relate to studies of genetic modifiers and what has been learned there.

Reviewer #1 (Recommendations for the authors):

P3 abstract: 'In support, mutagenizing all mef2ca paralogs in the low penetrance strain'

But one wasn't mutated. It would be helpful to say which paralogs were mutated in the abstract just so that literature searches by people studying the paralogs would be more likely to find this paper.

P3 'Human craniofacial variation allows us – and even computers -- to identify each other.

Re: 'facial variation persists among genetically homogeneous populations1.

Text should specifically talk about identical twins here.

P4 paragraph should start with a topic sentence: 'Variation can be talked about in two different ways: penetrance is….'

P5 ‘large deletion encompassing the MEF2C locus’

Is that in homozygous or heterozygous condition?

P5 text says that mef2cb mutants are viable without craniofacial phenotypes, but do they have other phenotypes?

P6 'Because no other zebrafish mef2 gene mutants'. Reader needs to know how many mef2 genes zebrafish has and their relationship to human MEF2 genes.

P6 'nearly all mef2ca associated phenotypes changed penetrance as well34'

Did they change in the same direction? All more severe or all less severe?

P6 'the spread in symplectic length'

Define what this 'spread' means in phenotypic terms.

P6 'linear measurements of the symplectic cartilage, allowing us to measure severity,

among-individual variation, and within-individual variation in the zebrafish craniofacial skeleton'

Authors need to demonstrate that measuring the length of this single skeletal element is sufficient to capture the variation they intend to study.

P7 'upregulated in the low penetrance strain compared with the high-penetrance strain'

With respect to what internal control?

P7 'partially due to individuals inheriting different mutant alleles retaining different levels

of functional activity'

Right, for protein function, but also for different transcriptional expression domains or levels.

P7 'The C-terminal domain is more divergent among different mef2 genes38.'

This paragraph is talking about alleles of one gene but this sentence is talking about different genes, so it's a bit confusing and seems out of place.

P7 'destroys the initiating methionine35.'

Is any protein made from an internal methionine that might provide an alternative initiation site?

P8 'deletion allele found in a [mildly affected] patient23'

P8 'among-individual variation associated with the opercle bone phenotype; one individual

has phenotypically wild-type opercles while the other individual has bilateral mutant

phenotypes'

Give each of the individuals in Figure 1B a different panel number, like B1, B2 for the two wild types. Then in the sentence above call them out by name so the reader doesn't have to figure out which is which.

P8 'within-individual (left-right) opercle phenotype variation is present in one of the mef2cab631 animals (lower)

Be a bit more specific, saying that in the bottom individual (which could be called B4), the left opercle is relatively normal but the right opercle is partially duplicated (or something like that).

Figure 1D. I think it would be helpful for the reader to label all of the skeletal elements in Figure 1D and highlight the symplectic by color.

P8 'Ordering our allelic series by expressivity [of this one element] shows'

What about the other phenotypes? Do they follow the same allelic series? It's possible that different alleles could affect differentially different phenotypes.

P8 'expected to be [milder] than the full deletion'

P8 'the PTC (mef2cab1086) allele would be expected to be more mild than the full deletion

(mef2caco3008) allele if transcriptional adaptation was a factor.'

Yes, agreed.

P9 'the most severe allele is also the most variable (Figure 1F).'

Consider plotting a measure of severity on the vertical axis at the right of the figure on the same horizontal axis to make the point about the correlation.

P9 'there is no significant difference [among] mutant alleles for this type of variation '

P9 'Severity is positively correlated with variation when comparing'

Would it make more sense to say 'Variation is positively correlated with severity when comparing'? Both of course are correct but phrase below says it in this order and biologically it seems to make more sense to me that the severity is a property of the allele and that the variation is a property of the biology arising from the altered protein function.

P9 'we hypothesized that [a single] mutant allele, mef2cab1086,'

P9 'shortened symplectic cartilages and fused interhyal joints, were occasionally observed'

In nearly all of the alcian-alizarin stains, the length of the symplectic is difficult to see due to interference at the rostral end by overlap with the palatoquadrate. Consider marking it in some way in the figures.

P9 'We were surprised to observe some mef2ca-associated phenotypes, like shortened symplectic cartilages and fused interhyal joints, were occasionally observed in mef2ca+/+ individuals from the high-penetrance strain (Figure 2B)'

Reader needs to know the frequency with which these phenotypes appeared in wild types.

P9 'we found significant differences between low- and high-penetrance strains for both mef2ca+/+, as well as mef2ca+/- [but homozygous mutants were statistically the same by this measure in the two strains](Figure 2C).'

p10 'the low- and high penetrance strains are both similarly inbred'

They had similar mating schemes, but no data are presented to show that they are similarly inbred. That requires data on heterozygosity. It could be that for one of the strains, for some reason selection was for stronger heterozygosity across the genome than the other strain, that you tended to select for heterozygotes because, for example, inbreeding depresses health subtly in ways that affect the manifestation of the phenotype.

P10 'to generate a 'natural knock down' in mef2ca wild types'

Maybe it doesn't knockdown the mef2ca gene function or expression, maybe it alters the related phenotypes in some other way, in an independent pathway.

P10 'Six mef2 paralogs in the zebrafish genome share highly conserved amino acid sequences'

Has no one done this type of analysis before that the text could cite? Do the conclusions in this paragraph match the phylogenetic analysis of Mef2 genes in Chen 2017 MEF2 signaling and human diseases?

Figure 3. It would be easier for the reader to evaluate the degree of conservation if only conserved amino acids were colored and variants left uncolored.

P11 'To determine the gene expression dynamics of the mef2 paralogs during craniofacial development'

Consider changing motivation a bit to: 'To determine [whether any mef2 paralogs have expression domains that overlap mef2ca in the head skeleton], we studied the gene expression dynamics of mef2 paralogs during craniofacial development'

P11 'and that shared [and divergent] regulatory sequences might control expression'

P11 'expression in isolated wild-type cranial neural crest cells41'

Essential here to tell reader the age of the preparations for scRNA-seq to compare to ages for the other analyses.

scRNA-seq was from isolated neural crest cells, but the work has not established that this is the only cell type that is expressing each of these genes in the head skeleton; if that's true, then it should be mentioned (or maybe I missed it). And the resolution of the scRNA-seq results into anterior arches, frontonasal, posterior arches, and melanocytes seems to come from a method with much less resolution than in situ hybridization to mRNAs would provide.

The selection of crest cells and excluding endodermal and ectodermal cells, which signal to the developing cartilages, removed cells that might very well be contributing to the selected phenotypes because the selection experiments might have altered genes in the signaling pathway that are expressed in these epithelial cell types.

P11 'When we demonstrated that paralog transcriptional adaptation does not account

for the phenotypic differences between the low- and high-penetrance strains34, we did

not examine if selective breeding changed paralog expression between strains in

mef2ca wild types'

Consider change to: 'Data showing that paralog transcriptional adaptation does not account for the phenotypic differences between the low- and high-penetrance strains34 do not rule out the possibility that selective breeding changed paralog expression between strains in mef2ca wild types

P11 'cranial neural crest cells between unselected wild types and high-penetrance wild types'

Were these wild type siblings from an unselected mef2ca mutant line? Or were they from AB or some other wild type line?

Would a better comparison be to compare the two selected lines? Maybe in both the up and the down lines the paralogs are down-regulated with respect to wild types?

P11 'mef2aa, mef2ca, mef2cb and mef2d were all significantly downregulated in the high-penetrance line compared with the unselected strain.'

First, consider change to: 'mef2aa, mef2ca, mef2cb and mef2d were all significantly downregulated in wild-type siblings in the high-penetrance line compared with wild-type siblings from the unselected strain.'

It's unclear why scRNA-seq was used for this analysis instead of a more quantitative approach like the qPCR approach of Figure 4A. It seems like this experiment doesn't utilize the advantages of scRNA-seq and instead uses it for something the method is not well suited for. Visual comparison of the plots requires that the same number of cells are present in each of the two samples. Was that true?

Furthermore, for comparison to Figure 4B, it would be better to show the data in Figure 4C divided into the same four 'cell types'. And even better if shown in the context of all clusters that came out of the experiment.

P11 'all significantly downregulated in high-penetrance heads compared'

Should it read: '…downregulated in the heads of wild-type siblings from high-penetrance lines'? Or were these selected heads that were mutant with high expressivity? I'm just not sure what a 'high penetrance head' is.

P11 'paralogs. We next used RT-qPCR to compare mef2 paralog expression between highand low-penetrance wild types.'

This experiment would be more interpretable if, like the scRNA-seq experiments, wild types from an unselected line of mef2ca mutants were used. If the unselected line is not in between the two selected lines, then explanations for the results shown would be different than currently in the manuscript.

P12 'generally increased and decreased expression, respectively, of the mef2 paralogs'

This up and down conclusion can't be drawn if the unselected wild types are not in the middle. But because that control is missing, the conclusion does not seem to be justified, only that the two selected lines differ from each other. While selecting on phenotypes, not expression patterns, both selected lines could be up or both down in paralog expression from the data presented.

Figure S3. Expression of the housekeeping genes was normalized to rps18. Were the mef2 expression levels normalized to the same gene?

In addition, the conclusion from these data is that there's no difference between low and high lines for these two genes, but no statistical test is given and the standard deviations do not overlap. A statistical test would seem useful, even though the data would seem to show as the authors concluded that the low line is not higher than the high line for these two genes.

P12 'we systematically removed functional copies of mef2ca paralogs from this strain'

This is indeed the best way to address the hypothesis above. The phrase 'from this strain', however, should probably be replaced by 'the low-penetrance strain'.

P12 'is most closely related to'

In what way? In sequence? In expression patterns? In history?

P12 'is the second highest expressed MEF2 PARALOG in cranial'

P12 'mef2cb homozygous mutants do not have an overt craniofacial Phenotype'

Do they have other phenotypes?

P12 'mef2cb does not function in zebrafish craniofacial development'

Because mef2cb gene, which the sentence means because it uses gene nomenclature, is expressed, then I'd say the mef2cb gene DOES function in craniofacial development. But the Mef2cb protein apparently doesn't provide an essential unique non-redundant function because the mutant has no phenotype.

P12 'when we removed one copy of mef2cb'

Good that the reciprocal experiment was done.

P12 'Removing both copies of mef2cb from mef2ca homozygous mutants IN THE LOW-PENETRANCE STRAIN produces severe'

P12 'there is no difference in symplectic length between wild types and mef2ca heterozygotes (wild type vs. mef2ca+/-;mef2cb+/+)'

Specify genetic background.

P12 'Removing copies of mef2cb from mef2ca wild types does not significantly change'

Again, specify strain/background. Is it the low background in the whole paragraph?

P12 Did the authors also remove mef2cb activity in unselected and high-penetrance lines? If not, then it would be helpful to motivate why choosing one line and not the others.

P12 'We conclude that mef2cb buffers against mef2ca-associated phenotype severity, and

among-individual variation, but not within-individual variation IN THE LOW-PENETRANCE LINE.'

I think that's right, or were all three lines checked? It's hard to tell because the text does not always clearly specify which lines is used.

P13 'mef2d mutant allele in the low-penetrance strain (Figure 6A). Homozygous mef2d mutants did not develop any overt skeletal phenotypes in an otherwise wild-type background'

This is confusing. Is the background the low-penetrance strain, which is not the same as a wild-type background because it has been selected for likely rare alleles in the original background, or is the mef2d mutant in an 'otherwise wild-type background' as the sentence says?

P13 'mef2d, and further increased in double homozygous mutants (Figure 6C)'

This is difficult to follow. This paragraph says that the mef2d mutants were made in the low line, but then Figure 6C indicates it's the unselected strain. Which is correct?

P13 'but mef2ca heterozygotes are more variable when mef2d is disabled (Figure 6E).'

Which phenotype is examined here? The previous sentence makes me think it's only symplectic length variation, but because neither the sentence nor the figure panel nor the figure legend says, it's impossible to know.

P13. 'mef2b is the most divergent mef2ca paralog,'

By what criterion? Sequence? Length? Time at which this variant appeared in phylogeny? Expression pattern?

P14 'Both of these phenotypes are never seen in unselected'

Change to: 'Neither of these phenotypes are ever seen in unselected'

P15 'Our experiments support a model where we selected upon alleles that either amplify or dampen existing vestigial expression'

Are those selected alleles at the mef2ca locus? Or at the mef2 paralog loci? Or in trans acting genes, presumably transcription factors or their partners, that regulate mef2ca or cb or other genes?

P15 'and is what we selected upon in mef2ca mutants'

This sentence implies that the selection was on variation at the mef2ca mutant locus. Either some mapping data need to be presented (maybe I missed it) that support where the responding loci were in the genome of the selected lines, or some discussion is warranted about where the variation might be in the genome and what types of gene variants might have responded. Maybe this is later in the discussion.

P15 'For example, disabling mef2cb affects most phenotypes'

I think this means 'For example, disabling mef2cb affects most MEF2CA-RELATED CRANIOFACIAL phenotypes.

P15 'primarily affect penetrance of ventral cartilages'

Should be: 'primarily affect penetrance of ventral cartilage PHENOTYPES'.

P15 'Further, expression of all the paralogs was affected by selection, indicating that they all contribute to buffering.'

Not necessarily. It could be that selection acted on a single factor that alters the expression of all of the paralogs in the same way even though the protein of only one or two contributes to the selected phenotype; the others are just along for the ride because of shared upstream regulatory mechanisms, not because of shared downstream functions.

P15 'Closely related paralog expression dynamics are similar'

I guess closely related expression dynamics are of course similar. But I think what the text meant to say was that 'The expression dynamics of paralogs closely related in sequence and phylogeny are similar'

P15 'In this model, few enhancers are inherited that control the expression of many paralogs.'

Again, it would be helpful to reader to know whether the authors think these selected enhancers are in the mef2 genes themselves or in transcription factors that might regulate coordinately all of the mef2 paralogs.

P15 'Determining whether the aspect of the mef2ca phenotype buffered by each

paralog is related to their wild type expression pattern'

Consider: 'Determining whether the SPECIFIC PORTION of the mef2ca phenotype THAT EACH PARALOG BUFFERS is related to THAT PARALOG'S wild-type expression pattern'

P16 'phenotyping in this study was limited to the symplectic cartilage.'

Remind reader why this specific element makes the best proxy for the whole system and focusing on it is unlikely to lead to wrong conclusions even if several other phenotypes had been studied additionally.

16 'buffering in our system only regulates among-individual variation, not within-individual variation'

Tell reader why this is an important finding. Why should I care if these two types of variation are regulated by distinct or common mechanisms.

P16 'We propose that cryptic variation in vestigial paralog expression is the noise underlying variable craniofacial development'

Here, come back to the motivation in the beginning about variation in human facial structure. Sceptics might say that, well this doesn't apply to humans because the effects these authors found are mostly related to genes from a fish-specific gene duplication, the a and b copy of mef2c, and because humans just have one MEF2C gene, this is irrelevant. Text could readily disabuse sceptics of this notion and the symmetry of returning to the original motivation would make this more fun.

Reviewer #2 (Recommendations for the authors):

1) I found this manuscript difficult to read. In large part, my difficulty was due to finding vague phrases throughout that I would linger over to try to interpret. I will list some of these below, but they were sufficiently numerous as to lessen the impact of this cool work!

Page 1 (the Abstract): "Here we used the zebrafish mef2ca mutant, which produces variable phenotypes, to understand craniofacial variation." This is difficult to understand – is there one mutant or different alleles collectively produce a range of phenotypes.

Page 2: "facial variation persists among genetically homogeneous populations" Does "genetically homogeneous" mean twins or clones? If not, is this useful?

Page 2: "Thus, it is possible that high human facial variability is heritable, and under selection." Do the authors mean that variability is heritable? or that features are heritable and variability of features is under selection?

2) I question whether the study of the penetrance and variability of expression associated with the different mef2ca alleles adds to the study or diverts attention from the take-home messages. In particular, the authors claim that the data presented in Figure 1C shows that mef2caco3008 and mef2cab1086 alleles are associated with different penetrance. This just does not seem to be true from the data – or at least the authors do not indicate what criteria they use to say there is a significant difference.

3) The authors emphasize that a deletion mutation of mef2 in mammals has a mild phenotype compared with the phenotype of other mutant alleles. Then they sort of suggest that the mef2caco3008 is of particular interest because it is a deletion of the entire coding sequence of mef2ca. I think the strong implication is that the two mutations are equivalent. Nevertheless, the mef2caco3008 zebrafish deletion has a strong phenotype. And then no comment. I think this discussion in the Results section is a distraction. I think too little is known about each mutation to even compare them and suggest they are similar. This is an example of the lack of focus that diminishes the strong points of the manuscript.

4) The authors are not sufficiently careful about their use of and the meanings of the words "up-regulation" and "down-regulation" with respect to paralog expression. One might imagine that selecting for high or low penetrance would select for new mutations that modify expression of the paralogs. Or, more likely in my opinion, that different animals in the starting population have higher or lower expression of the paralogs and the breeding scheme isolates these genomes into lower penetrance or higher penetrance strains. The authors MUST make clear what they mean by these terms. It is not clear in fact whether there ever is a change in gene expression of a paralog. Selection of higher-expressing alleles or lower-expressing alleles from a starting population is not the same as selecting for their up- or down-regulation. The authors really should investigate how much standing variation exists in their populations of fish, especially the unselected starting population.

5) Throughout, the authors make conclusions that are based on assumptions that are not sufficiently supported by data. For example:

On page 6: "transcriptional adaptation is not a major factor underlying phenotypic variation in our system; the PTC (mef2cab1086) allele would be expected to be more mild than the full deletion (mef2caco3008) allele if transcriptional adaptation was a factor." I don't think we fully understand the triggers of transcription adaptation to make this conclusion – especially when it could have been measured directly!

On page 8: "Because the low- and high penetrance strains are both similarly inbred, it is unlikely that there is more genetic variation in the high-penetrance strain background accounting for the increased variation in this strain." This would not be correct if breeding for phenotypic variability turns out to be a selection for genetic variability.

As mentioned above, the authors claim the paralogs have no function in craniofacial development and yet they suggest that diminished expression of the paralogs leads to the expression of mutant phenotypes in the high penetrance strain. They need to clarify and explain.

6) I am concerned that the newly engineered mutations in the paralogs might trigger "transcriptional adaptation" as they each produce premature termination codons – early stops. This would be easy to test.

7) There are lots of places where the authors need to be clearer.

On page 7: the authors discuss the establishment of high penetrance and low penetrance strains – they need to remind us that the mutation is lethal and so they propagate these strains as crosses between heterozygotes (I think?). and they need to remind us or refer to the Methods section to tell us how many generations of selective breeding have taken place.

On page 8: "The zebrafish mef2 paralogs each encode a MADS box and MEF2 domain which are remarkably similar (Figure 3B and S1)," – similar to what?

On page 9: "early" and "late" stages of expression should be defined or clarified

8) The entire discussion of "buffering" in the Introduction, etc. is concerning to me. How is the phenotypic buffering that occurs because of vestigial expression of a paralog different than partial "redundancy"? The authors write "To our knowledge, this vestigial buffering hypothesis has not been previously proposed or tested." And yet certainly the work of Yitzhak Pilpel (Nature Genetics 2005; PNAS, 2006) and others very specifically hypothesize that paralogs function as "backup circuits". Further, how does the phenomenon studied here relate to studies of genetic modifiers and what has been learned there. The current work does not operate in isolation, and the Introduction seems to imply that.

Reviewer #3 (Recommendations for the authors):

1. A conclusion drawn from the results in Figure 4 is that selecting for high penetrance led to decreased expression of mef2 paralogs in wild-type embryos, whereas selecting for low penetrance led to increased expression. However, direct comparisons are only made between the unselected vs. high-penetrance strains, and the high-penetrance vs. low-penetrance strains. Without explicitly showing that paralogs are also upregulated in the low-penetrance strain vs. unselected, it can't be fairly concluded that selection for low penetrance worked through this mechanism.

2. The authors note that they removed functional copies of mef2 paralogs from the low penetrance mef2cab1086 strain, but then used the published fh288 allele for mef2cb. I assume they crossed this line into the low penetrance strain, but for how many generations? The penetrance of the mef2ca phenotypes for this new combination in Figure 5D seem considerably higher than they previously reported (Sucharov et al. 2019), so whether it is still technically "low penetrance" seems questionable. Similarly, the authors initially state that they made the new mef2d and mef2b alleles (co3013, co3012) on the low-penetrance strain (as suggested in the Materials and methods), but then Figure 6C/7C implies that they are assessing phenotypes on an "unselected" strain (the penetrance values listed support that this is not the original low penetrance strain). Please correct this discrepancy. Presumably, the predicted phenotypes and interpretations would be different depending on strain background, though possibly not, given my previous point.

3. Regarding the mild(er) phenotype of the b631 allele: Ensembl and UCSC genome browser list multiple transcripts for mef2ca, some of which have distinct transcriptional and translational start sites. Are any of these alternate forms present in the zebrafish neural crest, and, if so, could they be ameliorating the b631 phenotype? Would all transcripts be impacted by the b1086 mutation?

4. In the discussion on decoupling buffering mechanisms, the authors note that paralogs can share enhancer sequences dating back to before duplication, which is indisputable, but then seem to imply either that each paralog's version of the enhancer has evolved in parallel under selection for high or low penetrance or that a single (or few) enhancer(s) is regulating other paralogs in trans (?). If I am interpreting correctly, more evidence should be provided for each of these claims. This section requires further clarification.

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Variable paralog expression underlies phenotype variation" for further consideration by eLife. Your revised article has been evaluated by Christian Landry (Senior Editor) and a Reviewing Editor.

The manuscript is valuable and has been improved but there are some remaining issues that need to be addressed, as outlined below:

The clarity of the text still needs significant further improvement.

– Please make it clear throughout the intro and text that the manuscript consists of two components that are not yet clearly related.

– Discuss that the lack of understanding of how the various alleles affect expression or function of the gene's protein product, and the precise function of the gene in shaping craniofacial structures is not clear. Therefore, it is difficult to derive a mechanistic understanding of the observations and it remains to be investigated if the findings can be extrapolated from this study to other genes.

– Please correct the text according to the issues identified by the two reviewers. The writing is often unclear: terms are used that are poorly defined or incorrectly applied; interpretations are freely mixed with presentations of results; paragraphs often have multiple ideas intermingled leading to a lack of clarity and repetition of ideas in different paragraphs. There are mistakes in the use of some references or the descriptions of some past work. Also descriptions of the precise experimental approach taken should be made clearer in some cases.

Reviewer #2 (Recommendations for the authors):

The revised manuscript is still very difficult to read and needs further improvements

This work makes a substantial contribution to our understanding of how genetic background might function to modify the phenotype associated with a particular mutation. Three important take home lessons of this work are:

1) Within populations there is existing (standing) variation in the expression levels of a set of paralogous genes;

2) Control over the expression levels of these paralogues is heritable;

3) Variations in the level of expression of paralogues can modify the expressivity and penetrance associated with a mutant allele of one of the paralogues. The authors use these observations to support a very interesting model they propose in which over evolution as a paralogue loses one of its functions, perhaps by acquiring diminished expression in a particular tissue, it may retain a vestigial capacity to support that function, which may be uncovered by boosting its expression in that tissue. The data presented appear consistent with the hypothesis that the levels of expression of each paralogue may be coordinately regulated, a model that could be readily addressed here but is not directly addressed in this study.

The work is valuable. Its impact is dampened by four factors:

1) The work makes two very different points that are not yet clearly related: the first portion of the manuscript deals with a purported relation between the strength of a mutant allele causing a developmental phenotype and the amount of variability in the structures that are formed abnormally, whereas the second portion of the manuscript deals with the idea that paralogue expression may mitigate or exaggerate the expression of the phenotype associated with a single mutation.

2) The writing is often unclear: terms are used that are poorly defined or incorrectly applied; interpretations are freely mixed with presentations of results; paragraphs often have multiple ideas intermingled leading to a lack of clarity and repetition of ideas in different paragraphs.

3) There are outright mistakes in the use of some references or the descriptions of some past work.

4) Descriptions of the precise experimental approach taken should be made clearer in some cases.

Bailon-Zambrano et al. study factors that influence the expression of phenotype associated with a mutation. In one portion of the work they study a series of alleles of the zebrafish mef2ca gene, which is necessary for normal craniofacial development. They demonstrate that recessive putative loss-of-function mutant alleles of the mef2ca gene of zebrafish are associated with a range of expressivity. By focusing on one aspect of the mutant phenotype, the length of the symplectic cartilages that support the jaw, they find a correlation between the average strength of the phenotype of an allele (measured as reduction in length) and the extent of variability between mutant individuals that carry the allele. This is an interesting idea, but it is not fully explored or proven by their study. Because it is not clear how the various alleles affect expression or function of the gene's protein product, and because the precise function of the gene in shaping craniofacial structures is not clear, it is difficult to derive a mechanistic understanding of their observation and it is not possible to extrapolate from this study to other genes.

The findings that associate individual mef2ca alleles with variability in the severity of the craniofacial phenotype are difficult to interpret. The authors order the "strength" of an allele based on the overall severity of the phenotype, but the molecular reason why the different alleles have different "strengths" is completely unclear. The mildest allele has a mutation affecting the initiation codon, and so presumably it produces a protein utilizing a different initiation point. However it is not clear why the deletion of almost the entire locus leads to an intermediate phenotype, whereas a mutation introducing a premature termination causes the most severe phenotype. Perhaps the severe allele produces a product that interferes with the activity of paralogous products or interacting proteins? Who knows? In addition, since we don't know the precise cellular functions regulated by the gene, it is impossible to generalize from this case and conclude, as the authors do, that there is a general correlation between severity of an allele and the variability in phenotype that results. Finally, this reviewer is concerned about this conclusion and generalizations that may be drawn from focus on a single quantifiable character, the symplectic cartilage. Perhaps there is always a fixed amount of variation in the length of this cartilage. As stronger alleles produce shorter cartilage pieces, similar absolute variations in length may appear to be of greater significance when affecting structures of shorter average length.

In contrast with the first portion of the work, the second portion of the work makes truly valuable and mechanistic contributions to our understanding of how genetic background might modify the phenotypic expression of a particular mutation. The second portion investigates one mechanism that may contribute to the oft-observed phenomenon that a single mutation may be associated with different expressions of a phenotype in a manner that is dependent on the genetic background. The authors focus on loss-of-function of the mef2ca gene of zebrafish, which is needed for the normal development of several craniofacial structures. They find expression levels of mef2 paralogues may modify the phenotype associated with the mef2ca mutation. Three important take home lessons of this work are:

1) Within populations there is existing (standing) variation in the expression levels of a set of paralogous genes;

2) Control over the expression levels of these paralogues is heritable;

3) Variations in the level of expression of paralogues can modify the expressivity and penetrance associated with a mutant allele of one of the paralogues. The authors use these observations to support a very interesting model they propose in which over evolution as a paralogue loses one of its functions, perhaps by acquiring diminished expression in a particular tissue, it may retain a vestigial capacity to support that function, which may be uncovered by boosting its expression in that tissue.

Major issues:

1) I think the work is important, I would like to see it published, but I found the manuscript very difficult to read.

2) Perhaps one reason it is difficult to follow is that the work described here is composed of two very different studies, one describing the phenotypic effects of three different alleles and the other describing the effects of genetic background on the expression of one allele. This is reflected in the presence of two independent sections in the Introduction. Because we don't understand how the three different mutations affect generation of a protein product, or how they might affect the expression of paralogues (which are shown to be very important in the second half of this work), we cannot understand what a "strong" or "intermediate" allele is. IF the deletion completely removes gene function, then from a molecular point of view, it is a complete loss of function, and thus the premature termination mutation is a "special" type of allele that is more than simply completely loss of function. In any case, without insight into how these alleles work, I don't think it is fair to make generalizations about strengths of alleles and variability in the expression of a phenotype.

On the other hand, there are clear statements in the Introduction that could really be used as the crux of an entire paper: "whether paralog expression variation underlies phenotypic variation has not been directly tested. Finally, whether variation paralogous buffering is subject to selection is unknown." Personally, I suggest focusing on the effects of paralogues to buffer phenotype – it makes for a crisp introduction and focused story.

3) There are assertions that I believe are incorrect.

For example, lines 110-113: "Because mutations in none of the other four zebrafish paralogous mef2 genes have been reported, their function is unknown. These paralogs have human orthologs, MEF2A, MEF2B and MEF2D, that do not appear to be critical for craniofacial development46-48." Reference 46 and perhaps the others does NOT address the question of whether these genes are required for craniofacial development – ref 46 says mef2d fusion proteins are associated with cancer. These are somatic gain-of-function mutations and irrelevant to the matter at hand, which is complete whole animal loss of function.

Another example, lines 307-311: "Data showing that paralog transcriptional adaptation does not account for the phenotypic differences between the low- and high-penetrance strains34 do not rule out the possibility that selective breeding changed paralog expression between strains in mef2ca wild types." Reference 34 is the classic Force et al. paper and is not the correct reference – I think you meant to use reference 52.

Another example, lines 364-367: "We detected mef2d transcripts in the anterior arch population of 24 hpf wild-type cranial neural crest cells, and mef2d expression is significantly lower in the high-penetrance strain compared with the high-penetrance strain" I think you meant to compare high and low penetrance strains.

There are other examples of assertions that are a bit glib and that distract from the strength of the work.

4) Terminology is used loosely or incorrectly.

I find in the literature the use of between-individual as well as among-individual, and that between-individual is used more often. Personally I find between-individual easier to keep in my mind what the comparison being described is, but both are legit I guess.

In many cases I believe it would be clearer to say that there are differences in expression levels among members of a population and that these levels are heritable – rather than saying "variability is heritable" (line 42 and elsewhere), which is a difficult concept to grasp precisely.

Lines 17-18: "Craniofacial disorders further increase this variability." "this variability" is ill-defined – perhaps it is more useful to speak of "range of variability"

Line 44: "Finish people" should be "Finnish people".

Lines 63-64: "Human genetic craniofacial disease phenotypes also appear more variable than normal human facial phenotypes." What is the actual comparison that is being made? I can't reconstruct this experiment.

Lines 70-71: "Patients heterozygous for mutations affecting the transcription factor encoding gene MEF2C show variable facial dysmorphologies19-21." This is relevant to the present study if you are describing variability associated with a single mutation – if that is true, make it explicit. If you mean that there are different alleles and these different alleles generate a range of phenotypes then it is not relevant to the discussion here.

Line 187 – not a "range of alleles", but a series of alleles.

Line 225-227 and elsewhere – there are lots of places where alleles are described as variable – where the mutation or gene is conflated with the phenotype. "and the most severe mutant (mef2cab1086) allele was significantly more variable than the mildest mutant (mef2cab631) allele".

Line 266: "likely to be robust against null mutations" this is not clear.

The entire section beginning with line 284: "Closely related mef2 paralogs share similar expression dynamics" It is not clear what conclusions can be made or what conclusions you want to make about similar "dynamics". What one would like to measure is whether the paralogues are expressed in the same cells as the gene of interest and at a time when they are performing a function of interest. Since you have not explained when and where mef2ca functions, it is impossible to interpret the "dynamics" data in terms of relevant function. Please be explicit about what these experiments let you conclude and why.

Line 296-297: this paragraph is ended with a summary sentence: "Of note, previous work indicates that partially shared regulatory motifs are predictive of robustness66." Neither the meaning of this nor its relevance is clear to me. whose robustness? And in what context?

5) Some things can just be made simpler:

Lines 32-34: "We present a novel, mechanistic model for phenotypic variation where cryptically variable, vestigial paralog expression buffers development." I don't think the Reader will understand "cryptically variable" at this point of the paper (the Abstract).

Line 61 and elsewhere: "genetic mutations" could be simply "mutations".

6) Experiments and Models that need to made clearer:

A critical series of experiments involves study of mutant alleles of the paralogues that are thought to be insufficient to have a dramatic effect on the craniofacial phenotype, but are thought to modify this phenotype in the context of mef2ca or other paralogue mutations. The authors at one point talk about this experiment as simply "removing" the paralogue (line 343). At that point, I had assumed the new mutations were being generated within the strain that was being studied – that is the easiest way to interpret "remove". Instead, paralogue mutations are induced elsewhere and bred into strains for study. The rationale for the experimental design is complicated and should be explicitly set forward. I would ask the authors to make clearer within the Results section how they generated the new mutations in the paralogues and how these were introduced into various strains and what precisely was examined. The entire exposition is a bit buried in Lines 336 – 338 paralogs. "For all paralog functional experiments, we introduced the relevant paralog mutation into the low-penetrance strain, then maintained by outcrossing to unselected AB." This is not a complete sentence. The details of this experiment are really important.

The other point that needs to be better explained and clarified is their model in Figure 9. The mechanistic implications of the illustration are vague, but to me there appears to be focus on the roles of enhancers – the figure seems to imply (i) that enhancers acquire mutations that diminish expression of the paralogue and (ii) that under selection the enhancer acquires mutations that result in increased expression of the gene. I would think rather that standing variation arises prior to the selection – and this might be illustrated in the figure. Nevertheless, it seems from the data that the expression levels of all the paralogues may be coordinately regulated – and this idea seems to supported in the Discussion. If all the paralogues are expressed at low levels in the high penetrance strain and at higher levels in the low penetrance strain, then it is more likely that variation is not being driven at the level of each paralogue's enhancer but rather by some factor working in trans. That idea is really missing from Figure 9 – and since it is a summary illustration of what the authors think, getting this Figure right is important. In truth, I would have liked to see a few crosses between strains to test the idea that a single modifier working in trans is responsible for all the effects – THAT would have been both relatively easy to test and killer interesting as a result. But without such an experiment, the authors need to be very clear about alternate possible interpretations.

Reviewer #3 (Recommendations for the authors):

The authors have extensively revised and added new data to their manuscript in response to the initial round of reviews. The language is generally more precise and concise, and sections lacking clarity are improved or have been removed. I especially appreciate that they now explain the breeding scheme by which the paralog mutant alleles were established and maintained, as the phenotype prevalences are now better in line with what the reader is led to predict.

https://doi.org/10.7554/eLife.79247.sa1

Author response

Essential revisions:

The three reviewers agree that the question addressed is of great interest to evolutionary and developmental biologists in general and to those studying the evolution of developmental mechanisms in particular. The manuscript reports substantial and interesting advances, but in its current form it is difficult to read, which led to significant frustration among the reviewers. The manuscript needs to be substantially rewritten and shortened. The writing and interpretation of the results needs to be clarified as many statements are perceived as lacking precision. One of many examples is the statement that each selected strain has similar inbredness. However, there is no evidence that this was actually tested. The authors should also ensure that their conclusions match the results and are not over interpreted, pointing out explicitly caveats where necessary. Finally, the results need to be discussed in relation to what has been published, for example how does the phenomenon studied here relate to studies of genetic modifiers and what has been learned there.

We are thankful that the reviewers and editors appreciated the “substantial and interesting advances” reported in this work. Our revised manuscript contains important new data further supporting our conclusions, and advancing our model. Furthermore, the text has been extensively rewritten for clarity and precision. Thus, we have fully addressed all the reviewer critiques. See below for a point-by-point response.

Reviewer #1 (Recommendations for the authors):

P3 abstract: 'In support, mutagenizing all mef2ca paralogs in the low penetrance strain'

But one wasn't mutated. It would be helpful to say which paralogs were mutated in the abstract just so that literature searches by people studying the paralogs would be more likely to find this paper.

P3 'Human craniofacial variation allows us – and even computers -- to identify each other.

Done

Re: 'facial variation persists among genetically homogeneous populations1.

Text should specifically talk about identical twins here.

Done

P4 paragraph should start with a topic sentence: 'Variation can be talked about in two different ways: penetrance is….'

Done

P5 ‘large deletion encompassing the MEF2C locus’

Is that in homozygous or heterozygous condition?

Done

P5 text says that mef2cb mutants are viable without craniofacial phenotypes, but do they have other phenotypes?

Done

P6 'Because no other zebrafish mef2 gene mutants'. Reader needs to know how many mef2 genes zebrafish has and their relationship to human MEF2 genes.

Done

P6 'nearly all mef2ca associated phenotypes changed penetrance as well34'

Did they change in the same direction? All more severe or all less severe?

Done

P6 'the spread in symplectic length'

Define what this 'spread' means in phenotypic terms.

Done

P6 'linear measurements of the symplectic cartilage, allowing us to measure severity,

among-individual variation, and within-individual variation in the zebrafish craniofacial skeleton'

Authors need to demonstrate that measuring the length of this single skeletal element is sufficient to capture the variation they intend to study.

This is an excellent suggestion, and we now include data to indicate that shorter symplectic cartilage length is correlated with expanded opercle bone suggesting that the symplectic is a representative phenotype for the other craniofacial phenotypes affected in mef2ca mutants (Figure 1—figure supplement 2).

P7 'upregulated in the low penetrance strain compared with the high-penetrance strain'

With respect to what internal control?

Done

P7 'partially due to individuals inheriting different mutant alleles retaining different levels

of functional activity'

Right, for protein function, but also for different transcriptional expression domains or levels.

Done

P7 'The C-terminal domain is more divergent among different mef2 genes38.'

This paragraph is talking about alleles of one gene but this sentence is talking about different genes, so it's a bit confusing and seems out of place.

We deleted this statement.

P7 'destroys the initiating methionine35.'

Is any protein made from an internal methionine that might provide an alternative initiation site?

Thank you for this question, to address this we performed a new experiment. We now include antibody stains for both the mef2ca methionine allele (b631) and the deletion allele (co3008) these results further our understanding of the different alleles. (Figure 1—figure supplement 1).

P8 'deletion allele found in a [mildly affected] patient23'

Done

P8 'among-individual variation associated with the opercle bone phenotype; one individual has phenotypically wild-type opercles while the other individual has bilateral mutant phenotypes'

Give each of the individuals in Figure 1B a different panel number, like B1, B2 for the two wild types. Then in the sentence above call them out by name so the reader doesn't have to figure out which is which.

We thank the reviewer for this helpful comment, we made the suggested changes that improved clarity.

P8 'within-individual (left-right) opercle phenotype variation is present in one of the mef2cab631 animals (lower)

Be a bit more specific, saying that in the bottom individual (which could be called B4), the left opercle is relatively normal but the right opercle is partially duplicated (or something like that).

Figure 1D. I think it would be helpful for the reader to label all of the skeletal elements in Figure 1D and highlight the symplectic by color.

This helps clarity, thank you for the suggestion.

P8 'Ordering our allelic series by expressivity [of this one element] shows'

What about the other phenotypes? Do they follow the same allelic series? It's possible that different alleles could affect differentially different phenotypes.

Done

P8 'expected to be [milder] than the full deletion'

P8 'the PTC (mef2cab1086) allele would be expected to be more mild than the full deletion

(mef2caco3008) allele if transcriptional adaptation was a factor.'

Yes, agreed.

P9 'the most severe allele is also the most variable (Figure 1F).'

Consider plotting a measure of severity on the vertical axis at the right of the figure on the same horizontal axis to make the point about the correlation.

This is a good idea, but because we have already ordered by severity in this graph, the suggested change is not likely to affect the interpretation. Furthermore, we clarified in the text and figure legend that the x-axis is ordered by our previously determined severity by both penetrance and expressivity.

P9 'there is no significant difference [among] mutant alleles for this type of variation '

Done

P9 'Severity is positively correlated with variation when comparing'

Would it make more sense to say 'Variation is positively correlated with severity when comparing'? Both of course are correct but phrase below says it in this order and biologically it seems to make more sense to me that the severity is a property of the allele and that the variation is a property of the biology arising from the altered protein function.

We fully agree and changed as suggested.

P9 'we hypothesized that [a single] mutant allele, mef2cab1086,'

Done

P9 'shortened symplectic cartilages and fused interhyal joints, were occasionally observed'

In nearly all of the alcian-alizarin stains, the length of the symplectic is difficult to see due to interference at the rostral end by overlap with the palatoquadrate. Consider marking it in some way in the figures.

This is a good suggestion, we now outlined the symplectic cartilage in all figures containing Alcian-Alizarin stains.

P9 'We were surprised to observe some mef2ca-associated phenotypes, like shortened symplectic cartilages and fused interhyal joints, were occasionally observed in mef2ca+/+ individuals from the high-penetrance strain (Figure 2B)'

Done

Reader needs to know the frequency with which these phenotypes appeared in wild types.

P9 'we found significant differences between low- and high-penetrance strains for both mef2ca+/+, as well as mef2ca+/- [but homozygous mutants were statistically the same by this measure in the two strains](Figure 2C).'

Done

p10 'the low- and high penetrance strains are both similarly inbred'

They had similar mating schemes, but no data are presented to show that they are similarly inbred. That requires data on heterozygosity. It could be that for one of the strains, for some reason selection was for stronger heterozygosity across the genome than the other strain, that you tended to select for heterozygotes because, for example, inbreeding depresses health subtly in ways that affect the manifestation of the phenotype.

Agree, we removed all statements asserting that the strains are similarly inbred.

P10 'to generate a 'natural knock down' in mef2ca wild types'

Maybe it doesn't knockdown the mef2ca gene function or expression, maybe it alters the related phenotypes in some other way, in an independent pathway.

We removed this statement.

P10 'Six mef2 paralogs in the zebrafish genome share highly conserved amino acid sequences'

Has no one done this type of analysis before that the text could cite? Do the conclusions in this paragraph match the phylogenetic analysis of Mef2 genes in Chen 2017 MEF2 signaling and human diseases?

Thank you for this suggestion, we now reference several studies that performed phylogenetic analyses on mef2 paralogs and orthologs. We also point out that we arrived at similar conclusions as these other studies.

Figure 3. It would be easier for the reader to evaluate the degree of conservation if only conserved amino acids were colored and variants left uncolored.

Done

P11 'To determine the gene expression dynamics of the mef2 paralogs during craniofacial development'

Consider changing motivation a bit to: 'To determine [whether any mef2 paralogs have expression domains that overlap mef2ca in the head skeleton], we studied the gene expression dynamics of mef2 paralogs during craniofacial development'

Done

P11 'and that shared [and divergent] regulatory sequences might control expression'

Done

P11 'expression in isolated wild-type cranial neural crest cells41'

Essential here to tell reader the age of the preparations for scRNA-seq to compare to ages for the other analyses.

Done

scRNA-seq was from isolated neural crest cells, but the work has not established that this is the only cell type that is expressing each of these genes in the head skeleton; if that's true, then it should be mentioned (or maybe I missed it). And the resolution of the scRNA-seq results into anterior arches, frontonasal, posterior arches, and melanocytes seems to come from a method with much less resolution than in situ hybridization to mRNAs would provide.

We agree with this comment. In response, we clarify in the text that sorting for neural crest cells does not capture all the cells involved in craniofacial development. But does allow us to focus on the anterior arches. Furthermore, in response to this comment we removed the sorted single cell RNA sequencing experiment comparing paralog expression between the unselected and high-penetrance strains. We now only include the qPCR comparison between high and low penetrance strains. We still include single-cell RNAseq from isolated NCC data from the unselected strain to demonstrate the relative expression of the mef2 paralogs in these bone and cartilage progenitor cells.

The selection of crest cells and excluding endodermal and ectodermal cells, which signal to the developing cartilages, removed cells that might very well be contributing to the selected phenotypes because the selection experiments might have altered genes in the signaling pathway that are expressed in these epithelial cell types.

P11 'When we demonstrated that paralog transcriptional adaptation does not account

for the phenotypic differences between the low- and high-penetrance strains34, we did

not examine if selective breeding changed paralog expression between strains in

mef2ca wild types'

Consider change to: 'Data showing that paralog transcriptional adaptation does not account for the phenotypic differences between the low- and high-penetrance strains34 do not rule out the possibility that selective breeding changed paralog expression between strains in mef2ca wild types

Done

P11 'cranial neural crest cells between unselected wild types and high-penetrance wild types'

Were these wild type siblings from an unselected mef2ca mutant line? Or were they from AB or some other wild type line?

Would a better comparison be to compare the two selected lines? Maybe in both the up and the down lines the paralogs are down-regulated with respect to wild types?

We removed these data comparing unselected to high-penetrance with single cell RNA sequencing.

P11 'mef2aa, mef2ca, mef2cb and mef2d were all significantly downregulated in the high-penetrance line compared with the unselected strain.'

First, consider change to: 'mef2aa, mef2ca, mef2cb and mef2d were all significantly downregulated in wild-type siblings in the high-penetrance line compared with wild-type siblings from the unselected strain.'

Done

It's unclear why scRNA-seq was used for this analysis instead of a more quantitative approach like the qPCR approach of Figure 4A. It seems like this experiment doesn't utilize the advantages of scRNA-seq and instead uses it for something the method is not well suited for. Visual comparison of the plots requires that the same number of cells are present in each of the two samples. Was that true?

In response to this comment, we removed these analyses from the manuscript.

Furthermore, for comparison to Figure 4B, it would be better to show the data in Figure 4C divided into the same four 'cell types'. And even better if shown in the context of all clusters that came out of the experiment.

See above, these analyses were removed.

P11 'all significantly downregulated in high-penetrance heads compared'

Should it read: '…downregulated in the heads of wild-type siblings from high-penetrance lines'? Or were these selected heads that were mutant with high expressivity? I'm just not sure what a 'high penetrance head' is.

We edited this sentence for clarity.

P11 'paralogs. We next used RT-qPCR to compare mef2 paralog expression between highand low-penetrance wild types.'

This experiment would be more interpretable if, like the scRNA-seq experiments, wild types from an unselected line of mef2ca mutants were used. If the unselected line is not in between the two selected lines, then explanations for the results shown would be different than currently in the manuscript.

This is an excellent suggestion. In response, we performed this key experiment which considerably improved our model. We discovered significant differences between unselected AB strain families (Figure 4D). These important new data motivate the following interpretations: unselected AB strains exhibit standing variation in paralog expression, and this variation is likely what we selected upon. We are extremely thankful for this suggestion which considerably improved our understanding of the system.

P12 'generally increased and decreased expression, respectively, of the mef2 paralogs'

This up and down conclusion can't be drawn if the unselected wild types are not in the middle. But because that control is missing, the conclusion does not seem to be justified, only that the two selected lines differ from each other. While selecting on phenotypes, not expression patterns, both selected lines could be up or both down in paralog expression from the data presented.

We agree and changed the text to reflect our new unselected qPCR expression data (see above).

Figure S3. Expression of the housekeeping genes was normalized to rps18. Were the mef2 expression levels normalized to the same gene?

Yes, we used the same normalization control (rps18). And your attention to detail motivated us to run a t-test finding that expression is significantly higher in the high penetrance strain. We present both these facts in the figure legend. These results do not alter our conclusions.

In addition, the conclusion from these data is that there's no difference between low and high lines for these two genes, but no statistical test is given and the standard deviations do not overlap. A statistical test would seem useful, even though the data would seem to show as the authors concluded that the low line is not higher than the high line for these two genes.

Done, see above

P12 'we systematically removed functional copies of mef2ca paralogs from this strain'

This is indeed the best way to address the hypothesis above. The phrase 'from this strain', however, should probably be replaced by 'the low-penetrance strain'.

See methods, we clarify that the new CRISPR alleles were originally generated in the low-penetrance strain but then were outcrossed to the unselected AB background for several generations. We now consider these to be unselected and changed the text throughout accordingly.

P12 'is most closely related to'

In what way? In sequence? In expression patterns? In history?

Sequence similarity, we now clarify this in the text.

P12 'is the second highest expressed MEF2 PARALOG in cranial'

Done

P12 'mef2cb homozygous mutants do not have an overt craniofacial Phenotype'

Do they have other phenotypes?

We now address this in the introduction

P12 'mef2cb does not function in zebrafish craniofacial development'

Because mef2cb gene, which the sentence means because it uses gene nomenclature, is expressed, then I'd say the mef2cb gene DOES function in craniofacial development. But the Mef2cb protein apparently doesn't provide an essential unique non-redundant function because the mutant has no phenotype.

We agree and changed the wording to be more precise and specific.

P12 'when we removed one copy of mef2cb'

Good that the reciprocal experiment was done.

Thanks

P12 'Removing both copies of mef2cb from mef2ca homozygous mutants IN THE LOW-PENETRANCE STRAIN produces severe'

We changed our description of the maintenance of these lines to clarify that this is no longer true breeding low penetrance strain.

P12 'there is no difference in symplectic length between wild types and mef2ca heterozygotes (wild type vs. mef2ca+/-;mef2cb+/+)'

Specify genetic background.

We clarify these are unselected, see above.

P12 'Removing copies of mef2cb from mef2ca wild types does not significantly change'

Again, specify strain/background. Is it the low background in the whole paragraph?

P12 Did the authors also remove mef2cb activity in unselected and high-penetrance lines? If not, then it would be helpful to motivate why choosing one line and not the others.

We clarify these are unselected, see above.

P12 'We conclude that mef2cb buffers against mef2ca-associated phenotype severity, and

among-individual variation, but not within-individual variation IN THE LOW-PENETRANCE LINE.'

I think that's right, or were all three lines checked? It's hard to tell because the text does not always clearly specify which lines is used.

We clarify these are unselected, see above.

P13 'mef2d mutant allele in the low-penetrance strain (Figure 6A). Homozygous mef2d mutants did not develop any overt skeletal phenotypes in an otherwise wild-type background'

This is confusing. Is the background the low-penetrance strain, which is not the same as a wild-type background because it has been selected for likely rare alleles in the original background, or is the mef2d mutant in an 'otherwise wild-type background' as the sentence says?

See above, we clarify in the text that we consider these unselected.

P13 'mef2d, and further increased in double homozygous mutants (Figure 6C)'

This is difficult to follow. This paragraph says that the mef2d mutants were made in the low line, but then Figure 6C indicates it's the unselected strain. Which is correct?

See above, we clarify in the text that we consider these unselected.

P13 'but mef2ca heterozygotes are more variable when mef2d is disabled (Figure 6E).'

Which phenotype is examined here? The previous sentence makes me think it's only symplectic length variation, but because neither the sentence nor the figure panel nor the figure legend says, it's impossible to know.

We clarified this in the text.

P13. 'mef2b is the most divergent mef2ca paralog,'

By what criterion? Sequence? Length? Time at which this variant appeared in phylogeny? Expression pattern?

We clarify that this is by sequence analyses. This finding is consistent with several other previous studies, which we now cite.

P14 'Both of these phenotypes are never seen in unselected'

Change to: 'Neither of these phenotypes are ever seen in unselected'

Done

P15 'Our experiments support a model where we selected upon alleles that either amplify or dampen existing vestigial expression'

Are those selected alleles at the mef2ca locus? Or at the mef2 paralog loci? Or in trans acting genes, presumably transcription factors or their partners, that regulate mef2ca or cb or other genes?

Done

P15 'and is what we selected upon in mef2ca mutants'

This sentence implies that the selection was on variation at the mef2ca mutant locus. Either some mapping data need to be presented (maybe I missed it) that support where the responding loci were in the genome of the selected lines, or some discussion is warranted about where the variation might be in the genome and what types of gene variants might have responded. Maybe this is later in the discussion.

We now include a considerable discussion about where the variation might be in the genome.

P15 'For example, disabling mef2cb affects most phenotypes'

I think this means 'For example, disabling mef2cb affects most MEF2CA-RELATED CRANIOFACIAL phenotypes.

Done

P15 'primarily affect penetrance of ventral cartilages'

Should be: 'primarily affect penetrance of ventral cartilage PHENOTYPES'.

Done

P15 'Further, expression of all the paralogs was affected by selection, indicating that they all contribute to buffering.'

Not necessarily. It could be that selection acted on a single factor that alters the expression of all of the paralogs in the same way even though the protein of only one or two contributes to the selected phenotype; the others are just along for the ride because of shared upstream regulatory mechanisms, not because of shared downstream functions.

We agree and extensively edited this section.

P15 'Closely related paralog expression dynamics are similar'

I guess closely related expression dynamics are of course similar. But I think what the text meant to say was that 'The expression dynamics of paralogs closely related in sequence and phylogeny are similar'

We agree and extensively edited this section.

P15 'In this model, few enhancers are inherited that control the expression of many paralogs.'

Again, it would be helpful to reader to know whether the authors think these selected enhancers are in the mef2 genes themselves or in transcription factors that might regulate coordinately all of the mef2 paralogs.

We address this in the extensively rewritten discussion.

P15 'Determining whether the aspect of the mef2ca phenotype buffered by each

paralog is related to their wild type expression pattern'

Consider: 'Determining whether the SPECIFIC PORTION of the mef2ca phenotype THAT EACH PARALOG BUFFERS is related to THAT PARALOG'S wild-type expression pattern'

Done

P16 'phenotyping in this study was limited to the symplectic cartilage.'

Remind reader why this specific element makes the best proxy for the whole system and focusing on it is unlikely to lead to wrong conclusions even if several other phenotypes had been studied additionally.

Done

16 'buffering in our system only regulates among-individual variation, not within-individual variation'

Tell reader why this is an important finding. Why should I care if these two types of variation are regulated by distinct or common mechanisms.

Done

P16 'We propose that cryptic variation in vestigial paralog expression is the noise underlying variable craniofacial development'

Here, come back to the motivation in the beginning about variation in human facial structure. Sceptics might say that, well this doesn't apply to humans because the effects these authors found are mostly related to genes from a fish-specific gene duplication, the a and b copy of mef2c, and because humans just have one MEF2C gene, this is irrelevant. Text could readily disabuse sceptics of this notion and the symmetry of returning to the original motivation would make this more fun.

Done, excellent suggestion thank you.

Reviewer #2 (Recommendations for the authors):

Major issues:

1) I found this manuscript difficult to read. In large part, my difficulty was due to finding vague phrases throughout that I would linger over to try to interpret. I will list some of these below, but they were sufficiently numerous as to lessen the impact of this cool work!

Thank you for appreciating our work. Your comments and those of the other reviewers led us to considerably revise the manuscript striving for clarity.

Page 1 (the Abstract): "Here we used the zebrafish mef2ca mutant, which produces variable phenotypes, to understand craniofacial variation." This is difficult to understand – is there one mutant or different alleles collectively produce a range of phenotypes.

We clarified this section.

Page 2: "facial variation persists among genetically homogeneous populations" Does "genetically homogeneous" mean twins or clones? If not, is this useful?

We clarified this.

Page 2: "Thus, it is possible that high human facial variability is heritable, and under selection." Do the authors mean that variability is heritable? or that features are heritable and variability of features is under selection?

We clarified this language.

2) I question whether the study of the penetrance and variability of expression associated with the different mef2ca alleles adds to the study or diverts attention from the take-home messages. In particular, the authors claim that the data presented in Figure 1C shows that mef2caco3008 and mef2cab1086 alleles are associated with different penetrance. This just does not seem to be true from the data – or at least the authors do not indicate what criteria they use to say there is a significant difference.

By our statistical analyses, the b1086 allele has a higher penetrance of ectopic op and ih joint fusion (Figure 1C) compared with b631. In contrast penetrance of these phenotypes in the co3008 allele is not significantly different from the mildest b631 allele. Furthermore, the total sy length (Figure 1E) is significantly shorter in b1086 compared with co3008. These lines of evidence indicate that b1086 is more severe than co3008. This is clarified in the text.

3) The authors emphasize that a deletion mutation of mef2 in mammals has a mild phenotype compared with the phenotype of other mutant alleles. Then they sort of suggest that the mef2caco3008 is of particular interest because it is a deletion of the entire coding sequence of mef2ca. I think the strong implication is that the two mutations are equivalent. Nevertheless, the mef2caco3008 zebrafish deletion has a strong phenotype. And then no comment. I think this discussion in the Results section is a distraction. I think too little is known about each mutation to even compare them and suggest they are similar. This is an example of the lack of focus that diminishes the strong points of the manuscript.

We agree and changed the language describing this allele indicating that it approximates the human deletion allele. Furthermore, we include new data addressing the protein produced from these alleles which increase our understanding of each mutation.

4) The authors are not sufficiently careful about their use of and the meanings of the words "up-regulation" and "down-regulation" with respect to paralog expression. One might imagine that selecting for high or low penetrance would select for new mutations that modify expression of the paralogs. Or, more likely in my opinion, that different animals in the starting population have higher or lower expression of the paralogs and the breeding scheme isolates these genomes into lower penetrance or higher penetrance strains. The authors MUST make clear what they mean by these terms. It is not clear in fact whether there ever is a change in gene expression of a paralog. Selection of higher-expressing alleles or lower-expressing alleles from a starting population is not the same as selecting for their up- or down-regulation. The authors really should investigate how much standing variation exists in their populations of fish, especially the unselected starting population.

This reviewer proposes the excellent hypothesis that “different animals in the starting population have higher or lower expression of the paralogs and the breeding scheme isolates these genomes into lower penetrance or higher penetrance strains.” In response we directly tested this model and found that there is significant standing variation in paralog expression in the unselected strains (Figure 4D). We would like to thank this reviewer for motivating this very important experiment that dramatically improved the manuscript and clarified our model. This amended model is now clearly conveyed in the text.

5) Throughout, the authors make conclusions that are based on assumptions that are not sufficiently supported by data. For example:

On page 6: "transcriptional adaptation is not a major factor underlying phenotypic variation in our system; the PTC (mef2cab1086) allele would be expected to be more mild than the full deletion (mef2caco3008) allele if transcriptional adaptation was a factor." I don't think we fully understand the triggers of transcription adaptation to make this conclusion – especially when it could have been measured directly!

We disagree with this reviewer comment. Our previous work (Sucharov 2019) extensively tested the hypothesis that differences in transcriptional adaptation account for differences in our selectively bred strains finding that it is not a major factor in our system. Nevertheless, we qualify this statement in the text by stating that we did not directly test if the deletion allele induces transcriptional adaptation.

On page 8: "Because the low- and high penetrance strains are both similarly inbred, it is unlikely that there is more genetic variation in the high-penetrance strain background accounting for the increased variation in this strain." This would not be correct if breeding for phenotypic variability turns out to be a selection for genetic variability.

We agree with this comment and edited the text accordingly.

As mentioned above, the authors claim the paralogs have no function in craniofacial development and yet they suggest that diminished expression of the paralogs leads to the expression of mutant phenotypes in the high penetrance strain. They need to clarify and explain.

We agree with this comment and edited the text accordingly.

6) I am concerned that the newly engineered mutations in the paralogs might trigger "transcriptional adaptation" as they each produce premature termination codons – early stops. This would be easy to test.

This reviewer is correct, the new paralog mutations might trigger transcriptional adaptation. However, determining if they do is beyond the scope of this work, and not critical to the conclusions we draw. Future studies will test this hypothesis.

7) There are lots of places where the authors need to be clearer.

On page 7: the authors discuss the establishment of high penetrance and low penetrance strains – they need to remind us that the mutation is lethal and so they propagate these strains as crosses between heterozygotes (I think?). and they need to remind us or refer to the Methods section to tell us how many generations of selective breeding have taken place.

We have gone to great lengths to make this manuscript clearer, including the breeding methods description. Specifically, we added a brief description of it to the methods and figure legend of this manuscript including how many generations of selective breeding have taken place. Furthermore, this method is clearly described in three previous manuscripts (Nichols 2016, Brooks 2017, Sucharov 2019) which we reference.

On page 8: "The zebrafish mef2 paralogs each encode a MADS box and MEF2 domain which are remarkably similar (Figure 3B and S1)," – similar to what?

Done

On page 9: "early" and "late" stages of expression should be defined or clarified

Done

8) The entire discussion of "buffering" in the Introduction, etc. is concerning to me. How is the phenotypic buffering that occurs because of vestigial expression of a paralog different than partial "redundancy"? The authors write "To our knowledge, this vestigial buffering hypothesis has not been previously proposed or tested." And yet certainly the work of Yitzhak Pilpel (Nature Genetics 2005; PNAS, 2006) and others very specifically hypothesize that paralogs function as "backup circuits". Further, how does the phenomenon studied here relate to studies of genetic modifiers and what has been learned there. The current work does not operate in isolation, and the Introduction seems to imply that.

Thank you for this comment. In response, we clarified our language throughout and included numerous new citations referencing many of the important manuscripts that set the stage for our work. Our study is placed in much better context following this suggestion.

Reviewer #3 (Recommendations for the authors):

1. A conclusion drawn from the results in Figure 4 is that selecting for high penetrance led to decreased expression of mef2 paralogs in wild-type embryos, whereas selecting for low penetrance led to increased expression. However, direct comparisons are only made between the unselected vs. high-penetrance strains, and the high-penetrance vs. low-penetrance strains. Without explicitly showing that paralogs are also upregulated in the low-penetrance strain vs. unselected, it can't be fairly concluded that selection for low penetrance worked through this mechanism.

We agree, our new unselected AB paralog expression study (Figure 4D) strengthens our conclusion (see above).

2. The authors note that they removed functional copies of mef2 paralogs from the low penetrance mef2cab1086 strain, but then used the published fh288 allele for mef2cb. I assume they crossed this line into the low penetrance strain, but for how many generations? The penetrance of the mef2ca phenotypes for this new combination in Figure 5D seem considerably higher than they previously reported (Sucharov et al. 2019), so whether it is still technically "low penetrance" seems questionable. Similarly, the authors initially state that they made the new mef2d and mef2b alleles (co3013, co3012) on the low-penetrance strain (as suggested in the Materials and methods), but then Figure 6C/7C implies that they are assessing phenotypes on an "unselected" strain (the penetrance values listed support that this is not the original low penetrance strain). Please correct this discrepancy. Presumably, the predicted phenotypes and interpretations would be different depending on strain background, though possibly not, given my previous point.

We agree with these points. We now clarify that the fh288 allele was crossed to low penetrance which is, as you state, no longer low penetrance. We then maintained by outcrossing to AB. Thus, we consider it to be unselected. Similarly, the new CRISPR mutants were originally created on the low-penetrance strain, but maintained by outcrossing these alleles to unselected AB. We also consider them to be unselected. We now clearly describe our allele generation and maintenance strategy in the methods and text.

3. Regarding the mild(er) phenotype of the b631 allele: Ensembl and UCSC genome browser list multiple transcripts for mef2ca, some of which have distinct transcriptional and translational start sites. Are any of these alternate forms present in the zebrafish neural crest, and, if so, could they be ameliorating the b631 phenotype? Would all transcripts be impacted by the b1086 mutation?

Thank you for this point. We include new Mef2c protein (immunostaining) data indicating that protein is produced from the b631 allele, as this reviewer hypothesizes (Figure 1—figure supplement 1). The protein we detect is likely from the downstream in frame ATGs that the reviewer mentions. Future protein study (Western blot) will be informative. We also include new protein data with the deletion allele. Together with our previous report (Sucharov 2019), these data indicate that the two more severe alleles (b1086 and co3008) do not produce meaningful amount of Mef2ca protein but b631 does. We incorporated these new data and interpretations into the text.

4. In the discussion on decoupling buffering mechanisms, the authors note that paralogs can share enhancer sequences dating back to before duplication, which is indisputable, but then seem to imply either that each paralog's version of the enhancer has evolved in parallel under selection for high or low penetrance or that a single (or few) enhancer(s) is regulating other paralogs in trans (?). If I am interpreting correctly, more evidence should be provided for each of these claims. This section requires further clarification.

We improved the discussion related to this including that the differences between strains might be either in cis (enhancers) or trans (upstream activators) to the paralogs.

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

– Please make it clear throughout the intro and text that the manuscript consists of two components that are not yet clearly related.

We removed the first component, the allelic series study, at the suggestion of reviewer 2. We now focus on the strongest component, paralogous buffering.

– Discuss that the lack of understanding of how the various alleles affect expression or function of the gene's protein product, and the precise function of the gene in shaping craniofacial structures is not clear. Therefore, it is difficult to derive a mechanistic understanding of the observations and it remains to be investigated if the findings can be extrapolated from this study to other genes.

The various alleles have been removed.

– Please correct the text according to the issues identified by the two reviewers. The writing is often unclear: terms are used that are poorly defined or incorrectly applied; interpretations are freely mixed with presentations of results; paragraphs often have multiple ideas intermingled leading to a lack of clarity and repetition of ideas in different paragraphs. There are mistakes in the use of some references or the descriptions of some past work. Also descriptions of the precise experimental approach taken should be made clearer in some cases.

We have taken great care to define terms and apply them correctly. We saved all interpretations for the discussion and streamlined the text to remove repetitive ideas. We have double checked and corrected all references. We now include a precise description of the experimental approach as suggested.

Reviewer #2 (Recommendations for the authors):

The revised manuscript is still very difficult to read and needs further improvements

This work makes a substantial contribution to our understanding of how genetic background might function to modify the phenotype associated with a particular mutation. Three important take home lessons of this work are:

1) Within populations there is existing (standing) variation in the expression levels of a set of paralogous genes;

2) Control over the expression levels of these paralogues is heritable;

3) Variations in the level of expression of paralogues can modify the expressivity and penetrance associated with a mutant allele of one of the paralogues. The authors use these observations to support a very interesting model they propose in which over evolution as a paralogue loses one of its functions, perhaps by acquiring diminished expression in a particular tissue, it may retain a vestigial capacity to support that function, which may be uncovered by boosting its expression in that tissue. The data presented appear consistent with the hypothesis that the levels of expression of each paralogue may be coordinately regulated, a model that could be readily addressed here but is not directly addressed in this study.

We thank this reviewer for appreciating this work. We emphasize the three important take home lessons described by this reviewer by explicitly listing them in the impact statement.

See further responses to specific critiques below.

The work is valuable. Its impact is dampened by four factors:

1) The work makes two very different points that are not yet clearly related: the first portion of the manuscript deals with a purported relation between the strength of a mutant allele causing a developmental phenotype and the amount of variability in the structures that are formed abnormally, whereas the second portion of the manuscript deals with the idea that paralogue expression may mitigate or exaggerate the expression of the phenotype associated with a single mutation.

Per this reviewer’s suggestion, we removed the allelic series experiment and all interpretations purporting relation between the strength of a mutant mef2ca allele and the amount of variation.

2) The writing is often unclear: terms are used that are poorly defined or incorrectly applied; interpretations are freely mixed with presentations of results; paragraphs often have multiple ideas intermingled leading to a lack of clarity and repetition of ideas in different paragraphs.

We have carefully edited to ensure that terms are clearly defined and correctly used. We added a paragraph in which we defined some of the major terms, including robustness, compensation, and buffering. We are careful with our usage thereafter. We have taken great care to save interpretations for the discussion rather than mix with presentations of results.

3) There are outright mistakes in the use of some references or the descriptions of some past work.

We corrected these, thank you.

4) Descriptions of the precise experimental approach taken should be made clearer in some cases.

We have made clearer how different mutant lines were generated and maintained in the appropriate Results section.

Bailon-Zambrano et al. study factors that influence the expression of phenotype associated with a mutation. In one portion of the work they study a series of alleles of the zebrafish mef2ca gene, which is necessary for normal craniofacial development. They demonstrate that recessive putative loss-of-function mutant alleles of the mef2ca gene of zebrafish are associated with a range of expressivity. By focusing on one aspect of the mutant phenotype, the length of the symplectic cartilages that support the jaw, they find a correlation between the average strength of the phenotype of an allele (measured as reduction in length) and the extent of variability between mutant individuals that carry the allele. This is an interesting idea, but it is not fully explored or proven by their study. Because it is not clear how the various alleles affect expression or function of the gene's protein product, and because the precise function of the gene in shaping craniofacial structures is not clear, it is difficult to derive a mechanistic understanding of their observation and it is not possible to extrapolate from this study to other genes.

As suggested by this reviewer, we removed the first portion containing the allelic series from this study. The result is a streamlined body of work focusing on the paralogs.

The findings that associate individual mef2ca alleles with variability in the severity of the craniofacial phenotype are difficult to interpret. The authors order the "strength" of an allele based on the overall severity of the phenotype, but the molecular reason why the different alleles have different "strengths" is completely unclear. The mildest allele has a mutation affecting the initiation codon, and so presumably it produces a protein utilizing a different initiation point. However it is not clear why the deletion of almost the entire locus leads to an intermediate phenotype, whereas a mutation introducing a premature termination causes the most severe phenotype. Perhaps the severe allele produces a product that interferes with the activity of paralogous products or interacting proteins? Who knows? In addition, since we don't know the precise cellular functions regulated by the gene, it is impossible to generalize from this case and conclude, as the authors do, that there is a general correlation between severity of an allele and the variability in phenotype that results.

We removed the allelic series from this study.

Finally, this reviewer is concerned about this conclusion and generalizations that may be drawn from focus on a single quantifiable character, the symplectic cartilage. Perhaps there is always a fixed amount of variation in the length of this cartilage. As stronger alleles produce shorter cartilage pieces, similar absolute variations in length may appear to be of greater significance when affecting structures of shorter average length.

We removed the allelic series from this study. In the remaining phenotypic characterizations, we always include penetrance scoring of all mef2ca mutant-associated phenotypes to assay all mef2ca mutant-associated phenotypes.

We use the coefficient of variation to compare symplectic length variation between the high- and low-penetrance strains. We have confirmed with a statistician and coauthor (K. Colborn) that this value is appropriate for comparing variation in symplectic cartilage length because it uses a ratio of the variation and the mean, which allows you to compare variables of different scales or magnitude. Also, because the measurements are from a ratio scale, they are appropriate variables for estimating the coefficient of variation (for example, an ordinal or nominal variable would be inappropriate).

Finally, we also use the F-test to determine significant differences in variation.

In contrast with the first portion of the work, the second portion of the work makes truly valuable and mechanistic contributions to our understanding of how genetic background might modify the phenotypic expression of a particular mutation. The second portion investigates one mechanism that may contribute to the oft-observed phenomenon that a single mutation may be associated with different expressions of a phenotype in a manner that is dependent on the genetic background. The authors focus on loss-of-function of the mef2ca gene of zebrafish, which is needed for the normal development of several craniofacial structures. They find expression levels of mef2 paralogues may modify the phenotype associated with the mef2ca mutation. Three important take home lessons of this work are:

1) Within populations there is existing (standing) variation in the expression levels of a set of paralogous genes;

2) Control over the expression levels of these paralogues is heritable;

3) Variations in the level of expression of paralogues can modify the expressivity and penetrance associated with a mutant allele of one of the paralogues. The authors use these observations to support a very interesting model they propose in which over evolution as a paralogue loses one of its functions, perhaps by acquiring diminished expression in a particular tissue, it may retain a vestigial capacity to support that function, which may be uncovered by boosting its expression in that tissue.

We are pleased that the reviewer thinks that “the work makes truly valuable and mechanistic contributions”

Major issues:

1) I think the work is important, I would like to see it published, but I found the manuscript very difficult to read.

By removing the allelic series, we made the paper clearer and more concise. We have also clarified language and edited the manuscript according to this reviewer’s suggestions.

2) Perhaps one reason it is difficult to follow is that the work described here is composed of two very different studies, one describing the phenotypic effects of three different alleles and the other describing the effects of genetic background on the expression of one allele. This is reflected in the presence of two independent sections in the Introduction. Because we don't understand how the three different mutations affect generation of a protein product, or how they might affect the expression of paralogues (which are shown to be very important in the second half of this work), we cannot understand what a "strong" or "intermediate" allele is. IF the deletion completely removes gene function, then from a molecular point of view, it is a complete loss of function, and thus the premature termination mutation is a "special" type of allele that is more than simply completely loss of function. In any case, without insight into how these alleles work, I don't think it is fair to make generalizations about strengths of alleles and variability in the expression of a phenotype.

We agree and removed the allelic series component of the study.

On the other hand, there are clear statements in the Introduction that could really be used as the crux of an entire paper: "whether paralog expression variation underlies phenotypic variation has not been directly tested. Finally, whether variation paralogous buffering is subject to selection is unknown." Personally, I suggest focusing on the effects of paralogues to buffer phenotype – it makes for a crisp introduction and focused story.

As suggested, in this revised version we focus “on the effects of paralogues to buffer phenotype”.

3) There are assertions that I believe are incorrect.

For example, lines 110-113: "Because mutations in none of the other four zebrafish paralogous mef2 genes have been reported, their function is unknown. These paralogs have human orthologs, MEF2A, MEF2B and MEF2D, that do not appear to be critical for craniofacial development46-48." Reference 46 and perhaps the others does NOT address the question of whether these genes are required for craniofacial development – ref 46 says mef2d fusion proteins are associated with cancer. These are somatic gain-of-function mutations and irrelevant to the matter at hand, which is complete whole animal loss of function.

We agree that only “complete whole animal loss of function” is relevant to the matter at hand and have revised this section accordingly.

Another example, lines 307-311: "Data showing that paralog transcriptional adaptation does not account for the phenotypic differences between the low- and high-penetrance strains34 do not rule out the possibility that selective breeding changed paralog expression between strains in mef2ca wild types." Reference 34 is the classic Force et al. paper and is not the correct reference – I think you meant to use reference 52.

We have corrected this – thank you.

Another example, lines 364-367: "We detected mef2d transcripts in the anterior arch population of 24 hpf wild-type cranial neural crest cells, and mef2d expression is significantly lower in the high-penetrance strain compared with the high-penetrance strain" I think you meant to compare high and low penetrance strains.

We have corrected this – thank you.

There are other examples of assertions that are a bit glib and that distract from the strength of the work.

We have carefully edited and revised the manuscript.

4) Terminology is used loosely or incorrectly.

We now include a section defining terms and have taken great care to use them correctly.

I find in the literature the use of between-individual as well as among-individual, and that between-individual is used more often. Personally I find between-individual easier to keep in my mind what the comparison being described is, but both are legit I guess.

We’ve opted to keep among-individual.

In many cases I believe it would be clearer to say that there are differences in expression levels among members of a population and that these levels are heritable – rather than saying "variability is heritable" (line 42 and elsewhere), which is a difficult concept to grasp precisely.

We removed this statement (line 42) and clarified elsewhere that the expression levels are heritable.

Lines 17-18: "Craniofacial disorders further increase this variability." "this variability" is ill-defined – perhaps it is more useful to speak of "range of variability"

We edited this statement for clarity.

Line 44: "Finish people" should be "Finnish people".

We have corrected this, thank you.

Lines 63-64: "Human genetic craniofacial disease phenotypes also appear more variable than normal human facial phenotypes." What is the actual comparison that is being made? I can't reconstruct this experiment.

We added a citation to support this claim.

Lines 70-71: "Patients heterozygous for mutations affecting the transcription factor encoding gene MEF2C show variable facial dysmorphologies19-21." This is relevant to the present study if you are describing variability associated with a single mutation – if that is true, make it explicit. If you mean that there are different alleles and these different alleles generate a range of phenotypes then it is not relevant to the discussion here.

We clarify in this section that multiple alleles are associated with MEF2C haploinsufficiency syndrome. Therefore, there is a need for a controlled model system that can be used to explore the variation that persists when the same allele is analyzed in different individuals.

Line 187 – not a "range of alleles", but a series of alleles.

We removed the allelic series study.

Line 225-227 and elsewhere – there are lots of places where alleles are described as variable – where the mutation or gene is conflated with the phenotype. "and the most severe mutant (mef2cab1086) allele was significantly more variable than the mildest mutant (mef2cab631) allele".

We removed the allelic series study and are careful to not conflate mutation or gene with the phenotype.

Line 266: "likely to be robust against null mutations" this is not clear.

We clarified the sentence.

The entire section beginning with line 284: "Closely related mef2 paralogs share similar expression dynamics" It is not clear what conclusions can be made or what conclusions you want to make about similar "dynamics". What one would like to measure is whether the paralogues are expressed in the same cells as the gene of interest and at a time when they are performing a function of interest. Since you have not explained when and where mef2ca functions, it is impossible to interpret the "dynamics" data in terms of relevant function. Please be explicit about what these experiments let you conclude and why.

We clarified this section to indicate when and where mef2ca functions and include relevant references. We then clarify that we used qPCR on whole heads to study the temporal dynamics of the paralogs. We then state that we use single-cell RNA sequencing to examine spatial expression of the paralogs.

Line 296-297: this paragraph is ended with a summary sentence: "Of note, previous work indicates that partially shared regulatory motifs are predictive of robustness66." Neither the meaning of this nor its relevance is clear to me. whose robustness? And in what context?

We deleted this statement.

5) Some things can just be made simpler:

Lines 32-34: "We present a novel, mechanistic model for phenotypic variation where cryptically variable, vestigial paralog expression buffers development." I don't think the Reader will understand "cryptically variable" at this point of the paper (the Abstract).

We have removed this from the abstract.

Line 61 and elsewhere: "genetic mutations" could be simply "mutations".

We changed this as suggested.

6) Experiments and Models that need to made clearer:

A critical series of experiments involves study of mutant alleles of the paralogues that are thought to be insufficient to have a dramatic effect on the craniofacial phenotype, but are thought to modify this phenotype in the context of mef2ca or other paralogue mutations. The authors at one point talk about this experiment as simply "removing" the paralogue (line 343). At that point, I had assumed the new mutations were being generated within the strain that was being studied – that is the easiest way to interpret "remove". Instead, paralogue mutations are induced elsewhere and bred into strains for study. The rationale for the experimental design is complicated and should be explicitly set forward. I would ask the authors to make clearer within the Results section how they generated the new mutations in the paralogues and how these were introduced into various strains and what precisely was examined. The entire exposition is a bit buried in Lines 336 – 338 paralogs. "For all paralog functional experiments, we introduced the relevant paralog mutation into the low-penetrance strain, then maintained by outcrossing to unselected AB." This is not a complete sentence. The details of this experiment are really important.

We clarified in the Results section how each paralog was mutagenized and maintained and provide details of the experiment in the relevant section. While we agree that a more elegant approach would be to perform the experiments in true breeding low-penetrance strains, this was not feasible due to time and space constraints (for example, rederiving new low-penetrance strain after crossing in the mef2cb allele). We are careful with our language to indicate that the mef2cb allele needed to be introduced by crossing into the low-penetrance background. Further we describe how after initially introducing the CRISPR/Cas9 lesions in the low-penetrance strain they were maintained outcrossing to an unselected AB background.

The other point that needs to be better explained and clarified is their model in Figure 9. The mechanistic implications of the illustration are vague, but to me there appears to be focus on the roles of enhancers – the figure seems to imply (i) that enhancers acquire mutations that diminish expression of the paralogue and (ii) that under selection the enhancer acquires mutations that result in increased expression of the gene. I would think rather that standing variation arises prior to the selection – and this might be illustrated in the figure. Nevertheless, it seems from the data that the expression levels of all the paralogues may be coordinately regulated – and this idea seems to supported in the Discussion. If all the paralogues are expressed at low levels in the high penetrance strain and at higher levels in the low penetrance strain, then it is more likely that variation is not being driven at the level of each paralogue's enhancer but rather by some factor working in trans. That idea is really missing from Figure 9 – and since it is a summary illustration of what the authors think, getting this Figure right is important. In truth, I would have liked to see a few crosses between strains to test the idea that a single modifier working in trans is responsible for all the effects – THAT would have been both relatively easy to test and killer interesting as a result. But without such an experiment, the authors need to be very clear about alternate possible interpretations.

We agree that getting this figure right is important. We completely revised our graphical model to summarize and illustrate the conclusions from this work.

Regarding crosses between strains: We previously published the hybridization experiment this reviewer suggests (Nichols et al. Development, 2016). Briefly, we crossed low- and high-penetrance individuals to each other. In this experiment: “We found that the F1 progeny had a mean phenotype between that of the parental lines, although the distribution of penetrance in the two F1 families differed slightly and both were skewed towards higher penetrance.” We further interpreted these data (and other data in the 2016 manuscript) to indicate that penetrance is inherited as a threshold character. In this model, inheritance of continuous variables results in an outwardly discontinuous phenotypic value. Our new work here indicates that the continuous variables we predicted are paralog expression levels, and the discontinuous phenotype is penetrance (a phenotype is present or not).

Finally, in that work our data supported a model where a few genetic factors control penetrance. In other words, it is oligogenic inheritance.

Reviewer #3 (Recommendations for the authors):

The authors have extensively revised and added new data to their manuscript in response to the initial round of reviews. The language is generally more precise and concise, and sections lacking clarity are improved or have been removed. I especially appreciate that they now explain the breeding scheme by which the paralog mutant alleles were established and maintained, as the phenotype prevalences are now better in line with what the reader is led to predict.

Thank you for your comments which improved the manuscript.

https://doi.org/10.7554/eLife.79247.sa2

Article and author information

Author details

  1. Raisa Bailon-Zambrano

    Department of Craniofacial Biology, University of Colorado Anschutz Medical Campus, Aurora, United States
    Contribution
    Formal analysis, Investigation, Writing – review and editing
    Contributed equally with
    Juliana Sucharov
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-5848-3952
  2. Juliana Sucharov

    Department of Craniofacial Biology, University of Colorado Anschutz Medical Campus, Aurora, United States
    Contribution
    Conceptualization, Formal analysis, Investigation, Writing – review and editing
    Contributed equally with
    Raisa Bailon-Zambrano
    Competing interests
    No competing interests declared
  3. Abigail Mumme-Monheit

    Department of Craniofacial Biology, University of Colorado Anschutz Medical Campus, Aurora, United States
    Contribution
    Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-0090-1418
  4. Matthew Murry

    Department of Craniofacial Biology, University of Colorado Anschutz Medical Campus, Aurora, United States
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  5. Amanda Stenzel

    Department of Craniofacial Biology, University of Colorado Anschutz Medical Campus, Aurora, United States
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  6. Anthony T Pulvino

    Department of Craniofacial Biology, University of Colorado Anschutz Medical Campus, Aurora, United States
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  7. Jennyfer M Mitchell

    Department of Craniofacial Biology, University of Colorado Anschutz Medical Campus, Aurora, United States
    Contribution
    Investigation
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-4222-5235
  8. Kathryn L Colborn

    Department of Surgery, University of Colorado Anschutz Medical Campus, Aurora, United States
    Contribution
    Methodology
    Competing interests
    No competing interests declared
  9. James T Nichols

    Department of Craniofacial Biology, University of Colorado Anschutz Medical Campus, Aurora, United States
    Contribution
    Conceptualization, Formal analysis, Supervision, Funding acquisition, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing
    For correspondence
    JAMES.NICHOLS@UCDENVER.EDU
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-7263-1704

Funding

National Institute of Dental and Craniofacial Research (R01 DE029193)

  • James T Nichols

National Science Foundation (Graduate Research Fellowships Program 201569)

  • Raisa Bailon-Zambrano

National Institute of Dental and Craniofacial Research (F32 DE029995)

  • Jennyfer M Mitchell

National Science Foundation (Graduate Research Fellowships Program)

  • Abigail Mumme-Monheit

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Charles Kimmel for careful reading of this manuscript, members of the department of craniofacial biology for insightful discussions, Austin Tillery for contributions in the early stages of this study, and the University of Colorado zebrafish care staff.

Ethics

All of our work with zebrafish has been approved by the University of Colorado Institutional Animal Care and Use Committee (IACUC), Protocol # 00188. Animals were euthanized by hypothermic shock followed by 1.5% sodium hypochlorite.

Senior Editor

  1. Christian R Landry, Université Laval, Canada

Reviewing Editor

  1. Tatjana Piotrowski, Stowers Institute for Medical Research, United States

Publication history

  1. Received: April 5, 2022
  2. Preprint posted: April 28, 2022 (view preprint)
  3. Accepted: September 21, 2022
  4. Accepted Manuscript published: September 22, 2022 (version 1)
  5. Version of Record published: October 12, 2022 (version 2)

Copyright

© 2022, Bailon-Zambrano, Sucharov et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Raisa Bailon-Zambrano
  2. Juliana Sucharov
  3. Abigail Mumme-Monheit
  4. Matthew Murry
  5. Amanda Stenzel
  6. Anthony T Pulvino
  7. Jennyfer M Mitchell
  8. Kathryn L Colborn
  9. James T Nichols
(2022)
Variable paralog expression underlies phenotype variation
eLife 11:e79247.
https://doi.org/10.7554/eLife.79247

Further reading

    1. Developmental Biology
    Marianne E Emmert, Parul Aggarwal ... Roger Cornwall
    Research Article Updated

    Neonatal brachial plexus injury (NBPI) causes disabling and incurable muscle contractures that result from impaired longitudinal growth of denervated muscles. This deficit in muscle growth is driven by increased proteasome-mediated protein degradation, suggesting a dysregulation of muscle proteostasis. The myostatin (MSTN) pathway, a prominent muscle-specific regulator of proteostasis, is a putative signaling mechanism by which neonatal denervation could impair longitudinal muscle growth, and thus a potential target to prevent NBPI-induced contractures. Through a mouse model of NBPI, our present study revealed that pharmacologic inhibition of MSTN signaling induces hypertrophy, restores longitudinal growth, and prevents contractures in denervated muscles of female but not male mice, despite inducing hypertrophy of normally innervated muscles in both sexes. Additionally, the MSTN-dependent impairment of longitudinal muscle growth after NBPI in female mice is associated with perturbation of 20S proteasome activity, but not through alterations in canonical MSTN signaling pathways. These findings reveal a sex dimorphism in the regulation of neonatal longitudinal muscle growth and contractures, thereby providing insights into contracture pathophysiology, identifying a potential muscle-specific therapeutic target for contracture prevention, and underscoring the importance of sex as a biological variable in the pathophysiology of neuromuscular disorders.

    1. Developmental Biology
    2. Genetics and Genomics
    Ankit Sabharwal, Mark D Wishman ... Stephen C Ekker
    Research Advance Updated

    The clinical and largely unpredictable heterogeneity of phenotypes in patients with mitochondrial disorders demonstrates the ongoing challenges in the understanding of this semi-autonomous organelle in biology and disease. Previously, we used the gene-breaking transposon to create 1200 transgenic zebrafish strains tagging protein-coding genes (Ichino et al., 2020), including the lrpprc locus. Here, we present and characterize a new genetic revertible animal model that recapitulates components of Leigh Syndrome French Canadian Type (LSFC), a mitochondrial disorder that includes diagnostic liver dysfunction. LSFC is caused by allelic variations in the LRPPRC gene, involved in mitochondrial mRNA polyadenylation and translation. lrpprc zebrafish homozygous mutants displayed biochemical and mitochondrial phenotypes similar to clinical manifestations observed in patients, including dysfunction in lipid homeostasis. We were able to rescue these phenotypes in the disease model using a liver-specific genetic model therapy, functionally demonstrating a previously under-recognized critical role for the liver in the pathophysiology of this disease.