Sex chromosome gene expression associated with vocal learning following hormonal manipulation in female zebra finches
Figures

Song system anatomy and experimental design.
(a) Diagram of song system connectivity within the adult male zebra finch brain with major telencephalic domains indicated. Area X connects back to the lateral magnocellular nucleus of the anterior nidopallium (LMAN) through the nonvocal-specific thalamic nucleus DLM. (b) Experimental design. Animals were treated with E2 or a vehicle from hatch until sacrifice on post-hatch day 30 (PHD30). (c) Weighted gene correlation network analysis (WGCNA) assignment of genes to modules. Left: Hierarchy computed over the transcriptome-wide topological overlap matrix of gene-to-gene correlations in transcript abundance across samples. Right: Module assignment raster, rows are genes colored according to the assigned module, unassigned genes in black. (d) Module eigengene (MEG) expression heatmaps arranged by module size (left) aligned to traits of interest (bottom). Each row is an MEG, and each sample is a column. Samples are grouped according to neural circuit node in different colored subpanels. Color intensity encodes MEG expression as calculated by WGCNA. An example raster with sample category labels is provided at right.

Outlier sample detection by hierarchical clustering.
Two samples (a vehicle-treated male HVC sample and an E2-treated female robust nucleus of the arcopallium [RA] sample, in red) form single sample branches in the hierarchical clustering tree, indicative of technical outliers unlikely to fit the correlational structure of the larger dataset. Samples were removed prior to gene network construction and module detection.

Selection of soft-thresholding power for weighted gene correlation network analysis (WGCNA) model.
Soft-thresholding power (beta, x-axis) is the exponent to which each element in the gene-to-gene correlation matrix is raised during adjacency matrix calculation. (a) Scale-free fit index (y-axis) as a function of the soft-thresholding power (x-axis). Horizontal line indicates a fit of 90%. (b) Mean connectivity (degree) in the network model (y-axis) as a function of the soft-thresholding power. We selected a power of 6 as it is on the knee of both plots and above the 90% scale-free fit criteria.

Selection of minimum module size and tree cut height parameter values for weighted gene correlation network analysis (WGCNA) model.
Each plot shows the sample distance matrix that results from the parameters in the plot title. Titles also show the number of modules found in each resulting model and the percentage of genes in the finch genome assigned. Minimum module size increases across columns (left to right: 10, 25, 50, 100, 250) and tree cut height decreases down rows (top to bottom: 0.9, 0.8, 0.6, 0.4, 0.2). Red arrow indicates the selected model. Model was selected to explain as much transcriptomic variance as possible while minimizing the number of technically overfit samples.

Initial module overfitting to single samples.
(a) Module eigengene (MEG) 7 and 13 are both highly expressed only in single samples, indicated by black arrows. (b) This overfitting causes these samples to be deep outliers in the sample-sample distance matrix, distant from all samples but themselves, indicated by black arrows. (c) Removing these MEGs from the set prevents these samples from behaving as outliers in the distance matrix. (d) These overfit modules were removed prior to module lettering and statistical analysis.

Association of modules to experimental variables.
(a–g) Bubble plots showing statistical association between module eigengene (MEG) expression and variables of interest in various sample subsets. Strength of association (r²) is encoded by bubble size, significance (p) is encoded in the color scale with significant associations darkly bordered. Pearson’s correlation and Student’s t-test, alpha = 0.05. Plots show the associations between gene modules (rows) to: a–b vehicle-treated song system specializations, comparing MEG expression in song system samples from either sex to their appropriate surrounding controls; c–d E2-treated song system specialization, same comparison as a–b but within E2-treated samples; e female vocal learning capacity after E2, comparing E2-treated female song system components to all other female samples from that circuit node; f sexual dimorphism within the song system, comparing vehicle-treated male and female song system components; g sexual dimorphism within the surrounding control regions, comparing the vehicle-treated male and female surrounding control samples. Each neural circuit node is considered separately (columns). (h–k) Expression of modules with strong region-specific expression. Module A is highly expressed in Str and Area X samples, with additional differences between robust nucleus of the arcopallium (RA) and lateral intermediate arcopallium (LAI) (h). Module C is highly expressed in the arcopallium, especially RA, with some increase in HVC (i). Module F is expressed highly only in the lateral magnocellular nucleus of the anterior nidopallium (LMAN) regardless of sex or treatment (j). Module G is only highly expressed in HVC, where it differs by both sex and treatment (k). (l–o) Expression of selected MEGs by animal (top) aligned to their respective experimental variables (bottom), color indicates region. The sex chromosome enriched module E was highly expressed in all male samples and depleted in all female samples regardless of brain region or pharmacological treatment. Module J, K, and L eigengenes were each highly expressed in samples from one (J and L) or two (K) animals across all brain regions sampled.

Gene module enrichments for human convergent signature and for chromosomes.
(a) Enrichment of genes previously found to be convergently differentially expressed in the human laryngeal motor cortex and the pallial song nuclei or convergently expressed between the human vocal striatum and Area X. Bubble size linearly encodes the number of genes in each convergence module pairing. Significance was assessed using a one-tailed generally applicable gene set enrichment (GAGE) test, similar to Gene Ontology (GO) ontologies, alpha = 0.05. Significant enrichments are darkly bordered and opaque. Values to the right of the vertical black line indicate above random chance. (b) Enrichment of genes from specific chromosomes. Left, fold enrichment of modules onto zebra finch chromosomes in the newest genome assembly available; center, the portion of module-assigned transcripts from each chromosome per module; right, the number of module-assigned genes per chromosome. Each row is a chromosome, with each bubble representing the enrichment of transcripts from that chromosome in one of the gene modules defined by weighted gene correlation network analysis (WGCNA). Values to the right of the vertical black line indicate above random chance. The size of the bubbles indicates the log10 transformed number of genes in each chromosome module pairing. Significance was assessed using an FDR-corrected bootstrapped test of observed enrichment for each module chromosome pairing based on 50,000 randomizations of genes into modules. Significant enrichments are darkly bordered and opaque. a–b use the same color scale for modules.

Brain-wide signatures of sex chromosome expression.
(a) Distribution of continuous membership in module E across all module-assigned genes (top) and module E-assigned genes (bottom) based on correlation of expression to the module eigengene (Pearson’s r to MEG-E) with sex chromosomes separated. (b–c) Distribution of sex chromosome gene expression correlations to the sex difference in vehicle-treated finches. Positive correlations indicate female-biased expression, while anticorrelations indicate male-biased expression. Significance was assessed in each region using an upper-tailed Student’s correlation test for W chromosome transcripts (b) and lower-tailed for Z chromosome transcripts (c) with significant correlations in black, alpha = 0.05. (d–e) Venn diagrams intersecting the significantly sex difference correlated genes across nonvocal surround regions for the W and Z chromosomes respectively. (f) Comparison of continuous membership in module E (r2 to MEG-E, y-axis) and module G (r2 to MEG-G, x-axis) across all 12,444 module-assigned genes.

Sex chromosome gene module effects by module.
(a–b) Scatter plots of Z and W chromosome transcript abundance averages for each region. X-axis contains all annotated sex chromosome genes ordered by module assignment (raster) and then expression (y-axis). Each gene has eight measured points, each the average of one brain region in vehicle-treated birds broken out by sex (color). Two y-axis scales are presented for each chromosome to help show the lowly expressed genes. (c) Boxplots describing the distribution of the percent reads from male samples per Z chromosome gene from a. (d) Bubble plot of the underlying Z chromosome single gene data from b; X-axis is ordered as in a with module assignment encoded by bubble color. The cumulative average expression is indicated by bubble size and opacity (male avg. FPKM + female avg. FPKM) with higher expressed genes being larger and more opaque. Red lines indicate the male read percentage expected for Z chromosome genes, 66.6%. Green lines indicate equal expression between sexes. Note that the Z chromosome genes of module E are expressed on average at almost exactly the sex ratio predicted by dosage, while Z chromosome genes in other modules show some degree of compensation. (d–e) Same as b–d but for song nuclei. (f) Boxplots describing the distribution of the percent reads from male samples per W chromosome gene from b. (g) Bubble plot of the underlying W chromosome single gene data from f; X-axis is ordered as in b with module assignment encoded by bubble color. The cumulative average expression is indicated by bubble size and opacity (male avg. FPKM + female avg. FPKM) with higher expressed genes being larger and more opaque. Red lines indicate the male read percentage expected for W chromosome genes, 0%. Green lines indicate equal expression between sexes. (h–i) same as b–d but for song nuclei.

Identification of core genes in module G and their association to the Z chromosome.
(a–d) Single gene continuous membership in module G (x-axis; Pearson’s r to module eigengene [MEG] from module G) for all assigned genes vs correlation to vocal learning in masculine or masculinized HVC relative to samples from nonvocal learning females in each of the four comparisons: (a) male song system membership, comparing individual gene expression in male HVC samples of either treatment to expression in the surrounding dorsal nidopallium (DN); (b) female vocal learning capacity after E2, comparing E2-treated HVC to all other female DN or HVC samples; (c) sexual dimorphic gene expression within the song system, comparing vehicle-treated male and female song system components; (d) estradiol-responsive gene expression in female HVC, comparing E2-treated and vehicle-treated female HVC samples. Each point is a gene colored by module assignment, and the shaded area indicates gene of interest criteria for each comparison. (e–h) Blowup of shaded regions in a–d, respectively, showing genes of interest from each comparison. (i) Identification of core genes by intersecting the four gene sets of interest. (j) Enrichment of Z chromosome transcripts within the core genes. * indicates p=0.0087 by an upper-tailed hypergeometric test.

Expression of module G core genes in HVC and surrounding dorsal nidopallium.
Each of the 14 core genes shows reduced expression in female HVC relative to the male, with an increase in expression in response to E2 treatment. Bar represents mean with individual data points shown. This transcriptional response to E2 is not seen in the surrounding dorsal nidopallium (DN).

Proposed model of sexually dimorphic zebra finch vocal learning.
We propose that estradiol treatment in female zebra finches masculinizes song behavior by overcoming insufficient Z sex chromosome dosage in HVC to increase the expression of transcripts normally depleted in females. The Z chromosome genes upregulated by E2 are components in a larger proliferative genetic program which prevents HVC atrophy in males and allows for its expansion late in development. The upregulation of these genes allows for the increased specialization of the gene networks they participate in, promoting HVC development sufficiently to enable rudimentary vocal learning in females.
Additional files
-
Supplementary file 1
Gene module assignments.
- https://cdn.elifesciences.org/articles/89425/elife-89425-supp1-v1.csv
-
Supplementary file 2
Gene Ontology (GO) enrichments by module.
- https://cdn.elifesciences.org/articles/89425/elife-89425-supp2-v1.csv
-
Supplementary file 3
Sexually dimorphic sex chromosome transcripts across regions.
- https://cdn.elifesciences.org/articles/89425/elife-89425-supp3-v1.csv
-
Supplementary file 4
Z chromosome expression ratios.
- https://cdn.elifesciences.org/articles/89425/elife-89425-supp4-v1.csv
-
Supplementary file 5
W chromosome expression ratios.
- https://cdn.elifesciences.org/articles/89425/elife-89425-supp5-v1.csv
-
Supplementary file 6
Module G core VL genes.
- https://cdn.elifesciences.org/articles/89425/elife-89425-supp6-v1.csv
-
Supplementary file 7
Sex- and E2-dependent HVC specializations.
- https://cdn.elifesciences.org/articles/89425/elife-89425-supp7-v1.xlsx
-
MDAR checklist
- https://cdn.elifesciences.org/articles/89425/elife-89425-mdarchecklist1-v1.docx