Song system anatomy and experimental design.

a, Diagram of song system connectivity within the adult male zebra finch brain with major telencephalic domains indicated. Area X connects back to LMAN through the non-vocal specific thalamic nucleus DLM. b, Experimental design. Animals were treated with E2 or vehicle from hatch until sacrifice on PHD30. c, WGCNA assignment of genes to modules. Left: Hierarchy computed over the transcriptome-wide topological overlap matrix of gene-to-gene correlations in transcript abundance across samples. Right: Module assignment raster; rows are genes colored according to the assigned module, with unassigned genes in black. d, MEG expression heatmaps arranged by module size (left) aligned to traits of interest (bottom). Each row is an MEG and each column is a sample. Samples are grouped according to neural circuit node in different colored subpanels. Color intensity encodes MEG expression as calculated by WGCNA. An example raster with sample category labels is provided at right.

Association of modules to experimental variables.

a-g, Bubble plots showing statistical association between MEG expression and variables of interest in various sample subsets. Strength of association (r²) is encoded by bubble size; significance (p) is encoded in the color scale, with significant associations darkly bordered. Pearson correlation and Student's t-test, alpha=0.05. Plots show the associations between gene modules (rows) and: a-b, vehicle-treated song system specializations, comparing MEG expression in song system samples from either sex to their appropriate surrounding controls; c-d, E2-treated song system specialization, same comparison as a-b but within E2-treated samples; e, female vocal learning capacity after E2, comparing E2-treated female song system components to all other female samples from that circuit node; f, sexual dimorphism within the song system, comparing vehicle-treated male and female song system components; g, sexual dimorphism within the surrounding control regions, comparing the vehicle-treated male and female surrounding control samples. Each neural circuit node is considered separately (columns). h-k, Expression of modules with strong region-specific expression. Module A is highly expressed in Str and Area X samples, with additional differences between RA and LAI (h); Module C is highly expressed in the arcopallium, especially RA, with some increase in HVC (i); Module F is highly expressed only in LMAN regardless of sex or treatment (j); Module G is highly expressed only in HVC, where it differs by both sex and treatment (k).
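As an illustration of the statistics named in this legend, the following minimal sketch computes a Pearson correlation between an MEG and a binary trait, the r² that would set the bubble size, and the Student's t statistic for that correlation. The sample values and the function names `pearson_r` and `correlation_t` are hypothetical and are not taken from the study; converting t to a p value would additionally require a t distribution with n - 2 degrees of freedom from a statistics library.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_t(r, n):
    """Student's t statistic for a Pearson r from n samples (n - 2 df)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical values: MEG expression in six samples,
# trait coded 1 = song nucleus, 0 = surrounding control.
meg = [0.42, 0.35, 0.51, -0.40, -0.38, -0.50]
trait = [1, 1, 1, 0, 0, 0]

r = pearson_r(meg, trait)
r_squared = r * r                     # strength of association (bubble size)
t_stat = correlation_t(r, len(meg))   # compared against t with n - 2 df
```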

Module G functional enrichment analysis.

Significantly enriched GO terms within the 1:1 human orthologs of module G. Lists full GO term, GOid, and uncorrected p value calculated by GAGE.

Module E functional enrichment analysis.

Significantly enriched GO terms within the 1:1 human orthologs of module E. Lists full GO term, GOid, and uncorrected p value calculated by GAGE.

Gene module enrichments for human convergent signature and for chromosomes.

a, Enrichment of genes previously found to be convergently differentially expressed in the human laryngeal motor cortex and the pallial song nuclei or convergently expressed between the human vocal striatum and Area X. Bubble size linearly encodes the number of genes in each convergence module pairing. Significance was assessed using a one-tailed GAGE test, similar to the GO term analysis, alpha=0.05. Significant enrichments are darkly bordered and opaque. Values to the right of the vertical black line indicate enrichment above random chance. b, Enrichment of genes from specific chromosomes. Left, fold enrichment of modules onto zebra finch chromosomes in the newest genome assembly available; center, the proportion of module-assigned transcripts from each chromosome per module; right, the number of module-assigned genes per chromosome. Each row is a chromosome, with each bubble representing the enrichment of transcripts from that chromosome in one of the gene modules defined by WGCNA. Values to the right of the vertical black line indicate enrichment above random chance. The size of the bubbles indicates the log10-transformed number of genes in each chromosome module pairing. Significance was assessed using an FDR-corrected bootstrapped test of observed enrichment for each module chromosome pairing based on 50,000 randomizations of genes into modules. Significant enrichments are darkly bordered and opaque. a-b use the same color scale for modules.
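The randomization test described in b can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: module labels are shuffled across genes to build the null distribution, the FDR correction is assumed to be Benjamini-Hochberg (the legend does not name the procedure), and the gene, module, and chromosome labels are toy values. The function names and the default of 1,000 randomizations are hypothetical; the legend reports 50,000.

```python
import random

def fold_enrichment(module_of, chrom_of, module, chrom):
    """Observed over expected count of genes in both `module` and `chrom`."""
    genes = list(module_of)
    n_mod = sum(1 for g in genes if module_of[g] == module)
    n_chr = sum(1 for g in genes if chrom_of[g] == chrom)
    both = sum(1 for g in genes
               if module_of[g] == module and chrom_of[g] == chrom)
    return both / (n_mod * n_chr / len(genes))

def permutation_p(module_of, chrom_of, module, chrom, n_perm=1000, seed=0):
    """Upper-tailed p value from shuffling module labels across genes."""
    rng = random.Random(seed)
    genes = list(module_of)
    labels = [module_of[g] for g in genes]
    on_chrom = [chrom_of[g] == chrom for g in genes]

    def overlap(lbls):
        return sum(1 for l, c in zip(lbls, on_chrom) if c and l == module)

    observed = overlap(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # margins stay fixed, only the pairing changes
        if overlap(labels) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy data: 40 hypothetical genes; the 10 module "E" genes all sit on "Z".
module_of = {f"g{i}": ("E" if i < 10 else "A") for i in range(40)}
chrom_of = {f"g{i}": ("Z" if i < 10 else "1") for i in range(40)}

fold = fold_enrichment(module_of, chrom_of, "E", "Z")
p_raw = permutation_p(module_of, chrom_of, "E", "Z")
```

In a full analysis one p value would be computed per module chromosome pairing and the whole list passed through `benjamini_hochberg` before applying the significance threshold.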

Brainwide signatures of sex and micro chromosome expression.

a-d, Expression of selected module eigengenes by animal (top) aligned to their respective experimental variables (bottom); color indicates region. The sex chromosome enriched module E was highly expressed in all male samples and depleted in all female samples regardless of brain region or pharmacological treatment (a). Module J, K, and L eigengenes were each highly expressed in samples from one (J and L) or two (K) animals across all brain regions sampled. e, Distribution of continuous membership in module E across all module-assigned genes (top) and module E assigned genes (bottom) based on correlation of expression to the module eigengene (Pearson r to MEG-E), with sex chromosomes separated. f-g, Distribution of sex chromosome gene expression correlations to the sex difference in vehicle-treated finches. Positive correlations indicate female-biased expression while anticorrelations indicate male-biased expression. Significance was assessed in each region using an upper-tailed Student’s correlation test for W chromosome transcripts (f) and a lower-tailed test for Z chromosome transcripts (g), with significant correlations in black, alpha=0.05. h-i, Venn diagrams intersecting the genes significantly correlated with the sex difference across non-vocal surround regions for the W and Z chromosomes, respectively. j, Comparison of continuous membership in module E (r² to MEG-E, y-axis) and module G (r² to MEG-G, x-axis) across all

Identification of core genes in module G and their association to the Z chromosome.

a-d, Single gene continuous membership in module G (x-axis; Pearson r to MEG from module G) for all assigned genes vs correlation to vocal learning in masculine or masculinized HVC relative to samples from non-vocal learning females in each of the four comparisons: a, male song system membership, comparing individual gene expression in male HVC samples of either treatment to expression in the surrounding DN; b, female vocal learning capacity after E2, comparing E2-treated HVC to all other female DN or HVC samples; c, sexually dimorphic gene expression within the song system, comparing vehicle-treated male and female song system components; d, estradiol-responsive gene expression in female HVC, comparing E2-treated and vehicle-treated female HVC samples. Each point is a gene colored by module assignment; the shaded area indicates the gene-of-interest criteria for each comparison. e-h, Blowup of shaded regions in a-d, respectively, showing genes of interest from each comparison. i, Identification of core genes by intersecting the four gene sets of interest. j, Enrichment of Z chromosome transcripts within the core genes. * indicates p = 0.0087 by an upper-tailed hypergeometric test.
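Panels i and j combine a set intersection with an upper-tailed hypergeometric test. The sketch below shows both steps under stated assumptions: the gene names, set contents, and counts are hypothetical placeholders, so the resulting probability is not the p = 0.0087 reported above, and the function name `hypergeom_upper_tail` is introduced only for illustration.

```python
from math import comb

def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which lie on the chromosome of interest."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, upper + 1)) / total

# Hypothetical gene-of-interest sets from the four comparisons (a-d).
sets_of_interest = {
    "a": {"gene1", "gene2", "gene3"},
    "b": {"gene2", "gene3", "gene4"},
    "c": {"gene2", "gene5"},
    "d": {"gene2", "gene3", "gene6"},
}
# Core genes are those present in every comparison.
core = set.intersection(*sets_of_interest.values())

# Hypothetical counts: 10 background genes, 3 on the Z chromosome,
# 4 core genes, at least 1 of which is on the Z chromosome.
p_upper = hypergeom_upper_tail(k=1, N=10, K=3, n=4)
```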

Core genes of module G specialization to vocal learning HVC.

Putative drivers of module G specialization to vocal learning capable HVC. Intersection of the four gene-of-interest sets (Fig. 4b), separating the significant enrichment of genes from the Z sex chromosome.

Proposed model of sexually dimorphic zebra finch vocal learning.

We propose that estradiol treatment in female zebra finches masculinizes song behavior by overcoming insufficient Z sex chromosome dosage in HVC to increase the expression of transcripts normally depleted in females. The Z chromosome genes upregulated by E2 are components of a larger proliferative genetic program which prevents HVC atrophy in males and allows for its expansion late in development. The upregulation of these genes allows for the increased specialization of the gene networks they participate in, promoting HVC development sufficiently to enable rudimentary vocal learning in females.

Outlier sample detection by hierarchical clustering.

Two samples (a vehicle-treated male HVC sample and an E2-treated female RA sample, in red) form single-sample branches in the hierarchical clustering tree, indicative of technical outliers unlikely to fit the correlational structure of the larger dataset. These samples were removed prior to gene network construction and module detection. For clustering of included samples, see Fig. S5.

Selection of soft thresholding power for WGCNA model.

Soft-thresholding power (beta, x-axis) is the exponent to which each element in the gene-to-gene correlation matrix is raised during adjacency matrix calculation. a, Scale-free fit index (y-axis) as a function of the soft-thresholding power (x-axis). Horizontal line indicates a fit of 90%. b, Mean connectivity (degree) in the network model (y-axis) as a function of the soft-thresholding power. We selected a power of 6 because it lies at the knee of both plots and is above the 90% scale-free fit criterion.
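The effect of the soft-thresholding power on the quantity plotted in b can be sketched as follows. This is a minimal illustration, assuming an unsigned network in which the absolute correlation is raised to the power beta (the legend does not state whether the network is signed); the 3 x 3 correlation matrix and the function names are hypothetical. The scale-free fit index in a is not computed here.

```python
def adjacency(cor, beta):
    """Raise each absolute correlation to the power beta (unsigned network assumed)."""
    return [[abs(c) ** beta for c in row] for row in cor]

def mean_connectivity(adj):
    """Mean over genes of the summed adjacency to all other genes."""
    n = len(adj)
    degrees = [sum(adj[i][j] for j in range(n) if j != i) for i in range(n)]
    return sum(degrees) / n

# Hypothetical gene-to-gene correlation matrix for three genes.
cor = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, -0.3],
    [0.2, -0.3, 1.0],
]

# Mean connectivity shrinks as beta grows, because weak correlations
# are suppressed much more strongly than strong ones.
for beta in (1, 6, 12):
    print(beta, mean_connectivity(adjacency(cor, beta)))
```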

Selection of minimum module size and tree cut height parameter values for WGCNA model.

Each plot shows the sample distance matrix that results from the parameters in the plot title. Titles also show the number of modules found in each resulting model and the percentage of genes in the finch genome assigned. Minimum module size increases across columns (left to right: 10, 25, 50, 100, 250) and tree cut height decreases down rows (top to bottom: 0.9, 0.8, 0.6, 0.4, 0.2). Black arrow indicates selected model. Model was selected to explain as much transcriptomic variance as possible while minimizing the number of technically overfit samples.

Initial module overfitting to single samples.

a, Module eigengenes (MEGs) 7 and 13 are both highly expressed only in single samples, indicated by black arrows. b, This overfitting causes these samples to be deep outliers in the sample-sample distance matrix, distant from all samples but themselves, indicated by black arrows. c, Removing these module eigengenes from the set prevents these samples from behaving as outliers in the distance matrix (d). These overfit modules were removed prior to module lettering and statistical analysis.

Gene module enriched for chromosomes and for human convergent signature.

Left, fold enrichment of modules onto zebra finch chromosomes in the newest genome assembly available; center, the proportion of module-assigned transcripts from each chromosome per module; right, the number of module-assigned genes per chromosome. Each row is a chromosome, with each bubble representing the enrichment of transcripts from that chromosome in one of the gene modules defined by WGCNA. Values to the right of the vertical black line indicate enrichment above random chance. The size of the bubbles indicates the number of genes in each chromosome module pairing. Significance was assessed using an FDR-corrected bootstrapped test of observed enrichment for each module chromosome pairing based on 50,000 randomizations of genes into modules. Significant enrichments are darkly bordered and opaque.

Expression of module G core genes in HVC and surrounding dorsal nidopallium.

Each of the 14 core genes shows reduced expression in female HVC relative to male HVC, with an increase in expression in response to E2 treatment. Bars represent means, with individual data points shown. This transcriptional response to E2 is not seen in the surrounding DN.

Module G constituent genes.

Lists all genes assigned to module G and their continuous membership in module G (Pearson r to MEG from module G).

Sex chromosomes consistently repressed or expressed across regions.

Lists Z and W chromosome genes from Fig. 3e-f which exhibited sexually dimorphic expression in vehicle-treated, non-vocal learning related samples across brain regions.