1. Genetics and Genomics
Download icon

Ancient mechanisms for the evolution of the bicoid homeodomain's function in fly development

  1. Qinwen Liu
  2. Pinar Onal
  3. Rhea R Datta
  4. Julia M Rogers
  5. Urs Schmidt-Ott
  6. Martha L Bulyk
  7. Stephen Small  Is a corresponding author
  8. Joseph W Thornton  Is a corresponding author
  1. University of Chicago, United States
  2. New York University, United States
  3. Harvard University, United States
  4. Brigham and Women’s Hospital and Harvard Medical School, United States
Research Article
Cite this article as: eLife 2018;7:e34594 doi: 10.7554/eLife.34594
5 figures, 1 data set and 7 additional files


Figure 1 with 1 supplement
Role of the homeodomain in functional divergence of D. melanogaster Zen and Bcd.

Offspring of transgenic bcd-deficient D. melanogaster containing two copies of various replacement transgenes are shown in cuticle preparations of larval head (A, K, U) and whole body (B, L, V); in situ hybridization experiments and antibody stains in transgenic embryos show expression of Bcd’s translational (C, M, W) and transcriptional target genes (D–J, N–T, X–DD). Replacement transgenes were Bcd from D. melanogaster (DmBcd, A–J), Zen from D. melanogaster (DmZen, K–T), and a chimera of DmZen in which the homeodomain (HD) has been replaced by the HD of DmBcd (DmZen-BcdHD, U–DD). All three transgenes contain the D. melanogaster bcd promoter (BP), which directs maternal expression, and the bcd 3’UTR sequence required for anterior localization of the transgenic RNA. In the icons for each construct, protein-coding sequences from DmBcd are light blue and DmZen are red. Triangles show AttB sites used for recombination-mediated cassette exchange. For cuticle preps, mouthhooks (MH), dorsal arm (DA), ventral arm (VA), cephalic structures (Ceph), thoracic segments (T1–T3), abdominal segments (A1–A8), and Filzkörper (Fk) are indicated when present. An anti-Cad antibody was used to detect Cad protein. Antisense probes used in in situ hybridization experiments are indicated (hb: hunchback, kni: knirps, gt: giant, otd: orthodenticle, ems: empty-spiracle, btd: buttonhead, eve: even-skipped mRNAs). For comparison to wild-type, bcd-deficient, and a reciprocal construct of Bcd containing the ZenHD, see Figure 1—figure supplement 1.

Figure 1—figure supplement 1
Phenotypes of wildtype, bcd-null, and transgenic D. melanogaster.

(A–J) Flies of genotype yw (wild-type) are not modified at the bcd locus. Shown are cuticle preparations of larval head (A) and whole body (B); embryonic expression patterns are shown for Bcd’s translational regulatory target Caudal (Cad, C) and transcriptional targets (D–J). (K–T) Flies of genotype bcdE1 carry null alleles at the bcd locus. (U–DD) Flies of genotype DmBcdZenHD are bcdE1null but carry transgenic alleles consisting of the D. melanogaster bcd locus, including the coding region (blue) but with the HD replaced by that of the D. melanogaster zen gene (red). (EE–GG) Embryonic expression of bcd and zen mRNAs in wild-type and transgenic embryos: (EE) bcd mRNA in yw flies, (FF) transgenic bcd mRNA in bcdE1 flies transformed with D. melanogaster bcd, and (GG) zen mRNA in bcdE1 flies transformed with the D. melanogaster zen.

Figure 2 with 1 supplement
Reconstruction of ancestral HD sequences.

(A) Phylogeny of Bcd, Zen, and Hox3 HD amino acid sequences from insects and other arthropods. One representative sequence is shown among sequence entries with no sequence diversity. Reconstructed ancestral nodes are labeled; the mean posterior probability across sites for each maximum likelihood ancestral sequence is shown. Branch support labels show the approximate likelihood ratio statistic. Scale bar, 0.7 amino acid changes per site. The tree was constrained to reproduce well-corroborated species relationships; for the maximum likelihood phylogeny, see Figure 2—figure supplement 1. For species names and sequences, see Supplementary file 1. (B) Amino acid sequences of AncZB, AncBcd and Bcd from D. melanogaster (Dm) HDs. For AncBcd and DmBcd, only residues that differ from the AncZB HD are shown. Of these, 11 substitutions (shown in blue) are diagnostic differences conserved among all sequences in the Bcd vs. Zen clades. Differences between AncBcd and DmBcd are labeled red. For site-specific posterior probabilities of ancestral states, see Supplementary file 2.

Figure 2—figure supplement 1
Unconstrained phylogeny.

The maximum likelihood phylogeny of the Hox3/Bcd/Zen homeodomain protein family obtained with no topological constraint. Scale bar, expected amino acid substitutions per site. Branch labels, statistical clade support shown as the approximate likelihood ratio statistic, defined as two times the difference in log-likelihoods of the ML tree and the next-best tree containing a topological arrangement around that node.

Figure 3 with 2 supplements
In vitro and in vivo functions of AncBcd and AncZB HDs.

(A, B) Protein binding microarray (PBM) analysis of the DNA specificity of AncBcd, DmBcd, and AncZB HDs. Each dot represents one DNA 6-mer, plotted according to its binding enrichment (E-scores) in PBMs performed with two different HDs; Pearson’s correlation coefficient for each comparison is shown. (C–E) Energy logos showing the site-specific sequence determinants of DNA binding by AncZB, AncBcd, and DmBcd HDs as determined from PBM data. The size of each letter is proportional to the absolute value of that base’s main effect on the energy of binding relative to the mean of all bases at that position. The y-axis is unitless after division by RT. (F) Affinity of ancestral HDs for consensus Zen and Bcd DNA motifs (ZM and BM, sequences shown) in a quantitative fluorescence electrophoretic mobility shift assay. Several noncanonical Bcd targets previously shown to bind Drosophila Bcd were also tested. Column heights and error bars show the mean and SEM for three replicates. (G–Z) Cuticle phenotypes and gene expression patterns for offspring produced by bcdE1 females carrying two copies of a rescue construct expressing Dm Bcd protein (light blue) in which the HD has been replaced by the reconstructed AncBcd HD (dark blue, panels G-P) or the reconstructed AncZB HD (purple, panels Q-Z). Individual panels and constructs are labeled as in Figure 1. For replicates and alternative ancestral reconstructions, see Figure 3—figure supplement 1, Figure 4—figure supplement 1 and Supplementary files 3 and 4. For SPR analysis, see Figure 3—figure supplement 2 and Supplementary file 4.

Figure 3—figure supplement 1
Binding site preference is robust to uncertainty about the ancestral sequence reconstruction.

Quantitative fluorescent EMSAs were performed using alt-all versions of AncZB and AncBcd HDs, which contain the state with the second highest posterior probability at all ambiguously reconstructed sites, defined as all those sites at which more than one state has posterior probability >0.20. Plots show the fraction of ZM and BM bound as the concentrations of AncZB HD-AltAll (A) or AncBcd HD-AltAll (B) were increased.

Figure 3—figure supplement 2
Bcd evolved novel binding specificity by changes in sequence-specific dissociation rates.

(A) Association and dissociation rates of reconstructed ancestral HDs to canonical Zen and Bcd motifs (ZM, BM) were measured using a surface plasmon resonance (SPR) assay. Raw binding traces (color in black) show SPR response when different concentrations of HD were applied to sensors with immobilized BM or ZM DNA elements. Binding modes with parameters fit to the experimental data are in red. (B) Isoaffinity plots show the dissociation (koff) and association (kon) rates of AncBcd (square) and AncZB (triangle) HDs on DNA elements ZM (cyan), and to BM (red-orange) obtained from SPR. Arrows indicate the evolutionary change from AncZB to AncBcd HD. Diagonal dotted lines mark isoaffinity lines of association constants (Ka, blue labels) that differ by factors of ten. (C) Effect of historical substitutions on off- and on-rates of binding to BM. Solid arrows show the effect of q50K; dotted arrows show the effect of all 30 other substitutions from the same historical interval.

Figure 4 with 1 supplement
A single amino acid substitution changes AncZB's in vitro binding preference.

(A) Effect of reversing diagnostic historical substitutions on AncBcd-HD’s affinity for consensus Bcd and Zen motifs (BM and ZM), in a fluorescent EMSA assay. Column height and error bars show mean and SEM of three replicates. Upper and lower cases denote derived and ancestral states, respectively. (B) Effect of introducing historical substitutions to the derived state at site 50 and 54 into AncZB HD. (C) Binding of DNA motifs by AncZBHD-q50K is tightly correlated with binding by AncBcdHD. Each point is a DNA 6-mer plotted by the average of E-scores for 8-mers containing that 6-mer. Pearson correlation coefficient is shown. (E) Binding of DNA motifs by AncBcdHD-K50q is tightly correlated with binding by AncZB-HD. (D, F) The PBM energy logos of AncZBHD-q50K and AncBcdHD-K50q (compare to Figure 3C). (G) Effect of historical substitutions on global binding specificities in PBMs. Each column represents an 8-mer, colored by its PBM E-score, according to the color scale shown. Each row shows PBM of a reconstructed HD protein with or without residue 50 swapped between ancestral and derived states. 8-mers were clustered using the Manhattan distance metric. Only 8-mers with E-score ≥ 0.45 for at least one HD variant are shown. Logos at the bottom of the heatmap were created from the 8-mers in the two large clusters.

Figure 4—figure supplement 1
8mer binding profiles for replicate HD constructs.

Each row represents one 8mer sequence, and each column represents one HD protein variant, colored by E-score using the heatmap scale shown at left. Rows and columns were clustered by similarity using the Manhattan distance metric. Each protein was tested in two replicate experiments, indicated by the suffix _1 or _2. Ancestral reconstructions represent the maximum likelihood ancestral sequence (unlabeled) or the ‘alt-all’ reconstruction, which contains the state with the second-highest posterior probability at all ambiguously reconstructed sites, defined as those in which more than one site has posterior probability >0.20. All HDs were tested with 15 additional flanking residues from the DmBcd protein (labeled B) or the DmZen protein (unlabeled). Only 8-mers with E-score ≥0.45 for at least one protein variant are shown.

Effect of historical substitutions on Bcd functions in vivo.

(A–J) Testing the necessity of q50K for AncBcd function. Cuticle phenotypes and Bcd-target gene expression patterns of embryos from bcdE1 females carrying two copies of a Bcd rescue construct containing AncBcd-HD (dark blue) with an amino acid reversion to the AncZB ancestral state at residue 50 (purple arrow). (K–NN) Testing the sufficiency of historical substitutions of q50K and m54R. Embryos expressing Bcd rescue constructs containing AncZB-HD (purple) with historical substitutions to the derived AncBcd state (blue arrows) at site 50 (K–T), 54 (U–DD), or both (EE–NN). Individual panels and constructs are labeled as in Figure 1. Stars indicate Bcd-dependent target gene activation induced by the indicated constructs.


Data availability

The PBM data generated is available via Dryad (doi:10.5061/dryad.pm3g4r3)

The following data sets were generated
  1. 1

Additional files

Supplementary file 1

Extant homeodomain sequences used for phylogenetic analysis and ancestral reconstruction.

For each sequence in the alignment, the genus and species name, amino acid sequence, and abbreviation used in the phylogeny are shown.

Supplementary file 2

Reconstructed ancestral sequences and site-specific marginal posterior probabilities (mpp).

At each sequence site, the three amino acid states with the highest mpp are shown for the HDs of AncZB (panel A) and AncBcd (panel B). Red indicates ambiguously reconstructed sites, defined as those that have more than one state with mpp >0.20.

Supplementary file 3

PBM binding specificity profiles are robust to the uncertainty in ancestral sequence and choice of flanking sequences.

Energy PWMs were inferred from PBMs using maximum likelihood reconstructions (no suffix) or alternative reconstructions (suffix ‘altAll’) and from constructs using 15 flanking residues derived from D. melanogaster Bcd protein (prefix 'B-’) or from D. melanogaster Zen protein (no prefix). The Predicted column shows R2 for the comparison of intensities predicted from a model fit to the PBM in the column labeled ‘Training’ against measured intensities in the PBM of the ‘Prediction’ protein. The ‘Measured’ column shows R2 for comparison of measured intensities from the training PBM against the prediction sample.

Supplementary file 4

Kinetic binding parameters inferred by surface plasmon resonance assays.

Binding by various homeodomain constructs (HD) to canonical DNA motifs BM or ZM was measured by SPR. The estimated on-rate (kon), off-rate (koff), association constrant (KA) and average residence time t1/2 are shown. Mean and SEM of three replicates is shown for each parameter.

Supplementary file 5

Frequency of phenotypes in transgenic embryos.

Frequency of observation (in percent) of various phenotypic features in cuticular preps of embryos from females of various genotypes is shown. YW, wild-type; bcd-, bcdE1/bcdE1; AncZBq50K, bcd- embryos rescued with a bcd construct containing the ancestral AncZB homeodomain with the q50K substitution; AncZB-K50R54, bcd- embryos rescued with a bcd construct containing the ancestral AncZB homeodomain with the q50K and m54R substitutions. No mutant larvae showed complete head development; partial head is defined as the observation of microscopic bright spots (sclerotized tissue) at or near the anterior tip. T1-T3: Thoracic segments, numbered anterior to posterior, A1-A4: Abdominal segments, numbered anterior to posterior. The number of larvae counted for each genotype, n, is shown.

Supplementary file 6

Protein binding microarray experimental reproducibility and binding model fitting results.

The table shows model accuracy and cross-replicate reproducibility for protein binding microarrays. Each row represents one HD tested in two replicates. Energy models (see Materials and methods) were trained on data from replicate 1, used to predict the median intensity of each possible 6-mer, and compared to the experimentally measured median intensity of DNA probes carrying that 6-mer (see Materials and methods). Column a, coefficient of determination (R2) for model-predicted intensities from PBM replicate 1 against measured intensities from replicate 1. Column b, R2 for model-predicted intensities from replicate 2 against measured intensities for replicate 2. Column c, R2 for measured intensities from PBM replicate 1 against measured intensities from PBM replicate 2. Cross-replicate statistical reproducibility is reduced because stripping and re-using protein binding microarrays reduces signal; nevertheless, the specific binding profiles are similar between replicates (see Supplementary file 6).

Transparent reporting form

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)