
Acetylcholine acts on songbird premotor circuitry to invigorate vocal output

  1. Paul I Jaffe (corresponding author)
  2. Michael S Brainard (corresponding author)
  1. Departments of Physiology and Psychiatry, University of California, San Francisco, United States
  2. Center for Integrative Neuroscience, University of California, San Francisco, United States
  3. Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, United States
  4. Howard Hughes Medical Institute, University of California, San Francisco, United States
Research Article
Cite this article as: eLife 2020;9:e53288 doi: 10.7554/eLife.53288

Abstract

Acetylcholine is well understood to enhance cortical sensory responses and perceptual sensitivity in aroused or attentive states. Yet little is known about cholinergic influences on motor cortical regions. Here we use the quantifiable nature of birdsong to investigate how acetylcholine modulates the cortical (pallial) premotor nucleus HVC and shapes vocal output. We found that dialyzing the cholinergic agonist carbachol into HVC increased the pitch, amplitude, tempo and stereotypy of song, similar to the natural invigoration of song that occurs when males direct their songs to females. These carbachol-induced effects were associated with increased neural activity in HVC and occurred independently of basal ganglia circuitry. Moreover, we discovered that the normal invigoration of female-directed song was also accompanied by increased HVC activity and was attenuated by blocking muscarinic acetylcholine receptors. These results indicate that, analogous to its influence on sensory systems, acetylcholine can act directly on cortical premotor circuitry to adaptively shape behavior.

Introduction

Physiological arousal is accompanied by global changes in brain state that facilitate sensory processing and enable rapid behavioral responses (Lee and Dan, 2012). In the sensory domain, active and aroused behavioral states are associated with enhanced perceptual capabilities that can aid detection and processing of threats or other salient stimuli (Bennett et al., 2013; McGinley et al., 2015; Woods et al., 2013). Analogously, in the motor domain, greater arousal can enable more rapid, forceful, and precise movements—that is, greater vigor—which work in tandem with enhanced sensory processing to coordinate adaptive behavioral responses (Bouman et al., 2015; DiGirolamo et al., 2016; Lovett-Barron et al., 2017; McGinley et al., 2015). Acetylcholine, which figures prominently in the ascending arousal system, has been linked to the arousal-related enhancement of sensory processing by direct action on sensory cortices (Fu et al., 2014; Herrero et al., 2008; St Peters et al., 2011; Pinto et al., 2013; Reimer et al., 2016). However, while motor cortical regions receive dense cholinergic innervation from the nucleus basalis (NBM; Eckenstein et al., 1988; McKinney et al., 1983; Raghanti et al., 2008), the extent to which cholinergic signaling in cortex contributes to motor invigoration observed in aroused behavioral states remains unknown.

Previous work on the control of movement vigor has focused primarily on dopaminergic signaling in the basal ganglia, which appears to be particularly important for invigorating movements in order to obtain reward and in other motivational contexts (Berke, 2018; Dudman and Krakauer, 2016; Turner and Desmurget, 2010). Nevertheless, experimental and pathological disturbances of the cholinergic system point to a possible cholinergic contribution to the invigoration of motor output. Indeed, lesions of NBM and diseases of the cholinergic system can be accompanied by a reduction in the speed, force, and amplitude of movements (Berger-Sweeney et al., 1994; Buchman et al., 2007; Ferris and Farlow, 2013; Goldman et al., 1999), in addition to more general impairments in motor skill acquisition and motor recovery following cortical lesions (Conner et al., 2003; Conner et al., 2005; Conner et al., 2010). Moreover, stimulation of NBM can invigorate movements of rat vibrissae that are induced by stimulation of motor cortex (Berg et al., 2005). These observations raise the question of whether acetylcholine can adaptively modulate movements by direct action on cortical circuitry, and whether its influence is separable from the control of vigor by dopaminergic signaling in the basal ganglia.

Birdsong is an attractive system for evaluating cholinergic contributions to the control of motor vigor in states of elevated arousal. Song is a learned and readily quantifiable motor skill that is controlled by well-defined cortical and basal ganglia circuitry. Like speech, song is naturally produced in states of greater or lesser arousal that are associated with changes in ‘vocal vigor’. In particular, female-directed song during courtship is associated with greater pitch and tempo, altered song amplitude, and increased acoustic stereotypy compared to the undirected song that male birds sing in isolation (Cooper and Goller, 2006; James and Sakata, 2015; Kao et al., 2005; Sakata et al., 2008; Sossinka and Böhner, 1980; Suri and Rajan, 2018). Consistent with the notion that directed song reflects a state of greater physiological arousal, pre-song heart rate is faster for directed song than undirected song (Cooper and Goller, 2006). Moreover, the cortical premotor song nucleus HVC, like mammalian motor cortex, receives strong cholinergic innervation from the basal forebrain (Ryan and Arnold, 1981; Zuschratter and Scheich, 1990), and cholinergic manipulations can alter excitability of HVC neurons in vitro and modulate their responses to song playback in vivo in anesthetized birds (Shea and Margoliash, 2003; Shea et al., 2010). However, as in mammalian systems, the extent to which cholinergic action on cortical motor regions contributes to arousal-related changes in behavior has not been examined.

In this study, we use the quantifiable nature of birdsong to investigate how cholinergic signaling modulates the cortical (pallial) premotor nucleus HVC, which has similar cell types, connections, and function to mammalian motor cortical regions. Using a combination of electrophysiological recordings and targeted pharmacological manipulations in singing birds, we show that activation of cortical acetylcholine receptors leads to elevated HVC activity accompanied by an increase in the pitch, amplitude, tempo, and stereotypy of undirected song. Moreover, we demonstrate that the normal invigoration of song that occurs in the presence of a female bird is also associated with elevated HVC activity, and that this invigoration can be attenuated by blocking muscarinic acetylcholine receptors in HVC. Strikingly, song invigoration in response to increased cholinergic tone in HVC persists even when contributions of basal ganglia circuitry to song are pharmacologically blocked. Overall, our findings argue that acetylcholine can act directly on cortical premotor circuitry to adaptively shape behavioral outputs in aroused behavioral states, and indicate that the control of movement vigor in central circuits is more distributed than previously appreciated.

Results

Acetylcholine invigorates song and increases song stereotypy

To determine whether cholinergic modulation of cortical motor areas can shape behavioral output, we evaluated changes to the acoustic structure of undirected song (‘baseline song’) following localized delivery of the cholinergic agonist carbachol into HVC of adult male Bengalese finches (Lonchura striata domestica; Figure 1A, n = 8 birds; see Methods). Carbachol did not cause gross distortions of the individual acoustic elements of the song, referred to as ‘syllables’ (Figure 1B). However, quantitative analysis revealed that carbachol had a number of consistent effects on syllable structure that largely paralleled those observed during female-directed song, including increases in pitch, syllable stereotypy, tempo, and amplitude (Cooper and Goller, 2006; James and Sakata, 2015; Kao et al., 2005; Sakata et al., 2008; Sossinka and Böhner, 1980; Suri and Rajan, 2018).

Figure 1
Activation of muscarinic receptors in HVC increases motor vigor.

(A) Experiment schematic and song system. Carbachol (Carb) was microdialyzed into HVC. HVC receives a cholinergic projection from a homolog of the nucleus basalis (ACh, red). The avian song control system consists of a direct motor pathway that includes the cortical nuclei HVC (used as a proper name) and RA (the robust nucleus of the arcopallium), which projects to the brainstem premotor regions that control vocal musculature; and an Anterior Forebrain Pathway (AFP) that includes the basal ganglia homologue Area X, the thalamic nucleus DLM (the dorsolateral nucleus of the medial thalamus), and the cortical nucleus LMAN (the lateral magnocellular nucleus of the anterior nidopallium), which projects back to the motor pathway at the level of RA. (B) Example song bouts before and after carbachol. (C) Change in pitch produced by carbachol for one example syllable (one experiment). Left, time course of raw pitch values. Red points indicate data used for analysis of carbachol effects (60–180 min. following onset of dialysis). Top right, rendition-averaged spectrograms. Lower right, pitch distributions during baseline (black) and Carb (red) conditions. (D) Normalized (drug/baseline) pitch (mean ± s.e.m. increase in pitch for Carb: 1.2 ± 0.21%, n = 20 experiments; Saline: 0.34 ± 0.11%, n = 17 experiments; Carb+Atrp: 0.31 ± 0.16%, n = 10 experiments; Carb+MEC+MLA: 1.5 ± 0.24%, n = 10 experiments; Carb vs. Saline, p=0.00088, two-tailed signed-rank test, n = 22 syllables, eight birds; Carb+Atrp vs. Carb+MEC+MLA, p=0.00024, two-tailed signed-rank test, n = 14 syllables, five birds). (E) Normalized (drug/baseline) pitch c.v. (mean ± s.e.m. reduction in pitch c.v. for Carb: 13 ± 1.6%, n = 20 experiments; Saline: 1.4 ± 2.7%, n = 17 experiments; Carb+Atrp: 4.0 ± 1.7%, n = 10 experiments; Carb+MEC+MLA: 14 ± 2.9%, n = 10 experiments; Carb vs. Saline, p=0.0014, two-tailed signed-rank test, n = 22 syllables, eight birds; Carb+Atrp vs. Carb+MEC+MLA, p=0.0023, two-tailed signed-rank test, n = 14 syllables, five birds). (F) Rendition-averaged spectrograms of one example syllable sequence before and after carbachol (one experiment). (G) Normalized (drug/baseline) syllable sequence duration (mean ± s.e.m. reduction in sequence duration for Carb: 2.8 ± 0.43%, n = 20 experiments; Saline: 0.26 ± 0.20%, n = 15 experiments; Carb+Atrp: 0.25 ± 0.39%, n = 9 experiments; Carb+MEC+MLA: 2.2 ± 0.51%, n = 10 experiments; Carb vs. Saline, p=0.00024, two-tailed signed-rank test, n = 13 syllable sequences, eight birds; Carb+Atrp vs. Carb+MEC+MLA, p=0.078, two-tailed signed-rank test, n = 8 syllable sequences, five birds). (H) Amplitude envelopes (mean ± s.e.m.) for one example syllable before and after carbachol (one experiment). Amplitude envelopes were normalized to the maximum value in the carbachol condition. (I) Normalized (drug/baseline) amplitude (mean ± s.e.m. increase in amplitude for Carb: 8.3 ± 2.5%, n = 18 experiments; Saline: 0.99 ± 1.6%, n = 17 experiments; Carb+Atrp: −3.2 ± 2.4%, n = 10 experiments; Carb+MEC+MLA: 6.1 ± 3.1%, n = 9 experiments; Carb vs. Saline, p=0.012, two-tailed signed-rank test, n = 30 syllables, eight birds; Carb+Atrp vs. Carb+MEC+MLA, p=0.0033, two-tailed signed-rank test, n = 18 syllables, five birds). ***p<0.001, **p<0.01, *p<0.05. For panels D, E, G, and I, each point represents one syllable or syllable sequence averaged over experiments; Atrp = the muscarinic antagonist atropine; MEC and MLA = the nicotinic antagonists mecamylamine and methyllycaconitine.

To quantify effects on pitch, we identified syllables with well-defined harmonic structure and computed the normalized fundamental frequency of each syllable (drug/baseline) in a two-hour period during carbachol infusion relative to a baseline period prior to drug infusion (see Methods). To control for possible circadian fluctuations in behavior (Chi and Margoliash, 2001; Derégnaucourt et al., 2005; Glaze and Troyer, 2006; Wood et al., 2013), we compared the magnitude of changes following carbachol infusion to the magnitude of changes in response to control saline infusion on alternate days. Compared to saline, carbachol elicited significant increases in pitch (Figure 1C and D; increase in pitch for carbachol: 1.2 ± 0.21%, mean ± s.e.m.; saline: 0.34 ± 0.11%; carbachol vs. saline, p=8.8e-4, signed-rank test; see Figure 1—source data 1 for mixed effects model analysis of behavioral effects).
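As a rough illustration of the normalization described above, the following sketch computes the percent change in mean pitch between a baseline period and a drug period for each syllable, then summarizes across syllables as mean ± s.e.m. The pitch values, syllable labels, and function names are hypothetical, and the authors' actual pipeline (see Methods) may differ in detail.

```python
from math import sqrt
from statistics import mean, stdev

def percent_change(baseline, drug):
    """Percent change of the drug-period mean relative to the baseline mean."""
    return 100.0 * (mean(drug) / mean(baseline) - 1.0)

def mean_and_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))

# Hypothetical fundamental-frequency measurements (Hz) for two syllables.
syllables = {
    "a": {"baseline": [3000.0, 3010.0, 2990.0], "drug": [3030.0, 3040.0, 3020.0]},
    "b": {"baseline": [2000.0, 2004.0, 1996.0], "drug": [2040.0, 2044.0, 2036.0]},
}

changes = [percent_change(s["baseline"], s["drug"]) for s in syllables.values()]
avg_change, sem_change = mean_and_sem(changes)
print(f"pitch change: {avg_change:.2f} ± {sem_change:.2f}%")
```

Expressing each syllable as a ratio to its own baseline puts syllables with very different absolute pitch on a common scale before averaging.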

To quantify effects on syllable stereotypy, we measured changes to the across-rendition coefficient of variation (c.v.) of pitch as in previous studies (Hampton et al., 2009; Kao et al., 2005). Increases in stereotypy of movements have often been observed in conjunction with increases in motor vigor in other systems (Manohar et al., 2015; Summerside et al., 2018), and across-rendition stereotypy of acoustic structure is higher during directed song relative to undirected song (Kao et al., 2005; Leblois et al., 2010; Sakata et al., 2008; Stepanek and Doupe, 2010). Carbachol produced significant increases in stereotypy as measured by reduced pitch c.v. (Figure 1C and E; reduction in pitch c.v. for carbachol: 13 ± 1.6%, mean ± s.e.m.; saline: 1.4 ± 2.7%; carbachol vs. saline, p=0.0014, signed-rank test; Figure 1—source data 1).
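The stereotypy measure can be sketched in a few lines. The example below computes the across-rendition coefficient of variation of pitch (standard deviation divided by mean) for a baseline and a drug period, and the percent reduction between them. The data are hypothetical, and the use of the sample standard deviation is an assumption, since the text above does not specify the estimator.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)

def percent_reduction(before, after):
    """Percent by which `after` is smaller than `before`."""
    return 100.0 * (1.0 - after / before)

# Hypothetical pitch values (Hz) across renditions of one syllable.
baseline_pitch = [2950.0, 3000.0, 3050.0]
drug_pitch = [2975.0, 3000.0, 3025.0]

cv_baseline = coefficient_of_variation(baseline_pitch)
cv_drug = coefficient_of_variation(drug_pitch)
reduction = percent_reduction(cv_baseline, cv_drug)
print(f"pitch c.v. reduction: {reduction:.1f}%")
```

A positive reduction corresponds to greater stereotypy in the sense used here.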

To quantify effects on song tempo, we calculated changes to the mean duration of defined syllable sequences in each bird's repertoire (e.g. the sequence of three syllables illustrated in Figure 1F, see Methods). Carbachol elicited robust increases in tempo as measured by reduced durations of syllable sequences (Figure 1F and G; decrease in sequence duration for carbachol: 2.8 ± 0.43%, mean ± s.e.m.; saline: 0.26 ± 0.20%; carbachol vs. saline, p=2.4e-4, signed-rank test; Figure 1—source data 1). These increases in tempo reflected a reduction in durations of both syllables and the gaps between syllables; moreover, consistent with the effects of temperature manipulations of HVC (Long and Fee, 2008; Zhang et al., 2017), we observed a trend toward a proportionately greater effect for gaps than for syllables (Figure 1—figure supplement 1; decrease in syllable duration for carbachol: 0.93 ± 0.58%, decrease in gap duration: 3.2 ± 1.0%; syllables vs. gaps: p=0.085, rank-sum test).
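To give a feel for the paired signed-rank comparisons reported throughout, here is a minimal sketch of an exact two-tailed Wilcoxon signed-rank test in plain Python. It assumes no zero differences and no tied magnitudes, and the function name and data are hypothetical. Whether the authors used an exact test or a normal approximation is not stated here. With n untied pairs that all change in the same direction, the smallest attainable exact two-tailed p-value is 2/2^n, which for the n = 13 syllable sequences given in the Figure 1 legend is about 2.4e-4, in line with the value reported above.

```python
def signed_rank_exact_p(diffs):
    """Exact two-tailed Wilcoxon signed-rank p-value.

    Assumes all differences are nonzero and have distinct magnitudes.
    """
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    # Sum of ranks (1 = smallest magnitude) belonging to positive differences.
    w_plus = sum(rank for rank, i in enumerate(order, start=1) if diffs[i] > 0)
    total = n * (n + 1) // 2
    w = min(w_plus, total - w_plus)
    # counts[s] = number of sign assignments whose positive-rank sum equals s.
    counts = [0] * (total + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        for s in range(total, r - 1, -1):
            counts[s] += counts[s - r]
    return min(1.0, 2.0 * sum(counts[: w + 1]) / 2 ** n)

# Hypothetical paired differences: 13 pairs, all in the same direction.
p_all_same_sign = signed_rank_exact_p(list(range(1, 14)))
# Hypothetical: 7 pairs, only the smallest difference in the opposite direction.
p_one_opposite = signed_rank_exact_p([-1, 2, 3, 4, 5, 6, 7])
print(p_all_same_sign, p_one_opposite)
```

The dynamic-programming table enumerates the null distribution of the rank sum without listing all 2^n sign patterns, which keeps the sketch fast for sample sizes like those in this study.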

To quantify effects on syllable amplitude, we averaged the smoothed amplitude envelope over the middle 80% of each syllable (see Methods). We found that carbachol significantly increased song amplitude (Figure 1H and I, and Figure 1—figure supplement 1; increase in amplitude for carbachol: 8.3 ± 2.5%, mean ± s.e.m.; saline: 0.99 ± 1.6%; carbachol vs. saline, p=0.012, signed-rank test; Figure 1—source data 1). An increase in amplitude was observed not only for ‘harmonic stack’ syllables that could be analyzed for changes to pitch, but also for syllables with more varied spectrotemporal properties, including ‘sweeps’ and ‘complex’ syllables that had frequency-modulated harmonic components, and ‘noisy’ syllables that had high spectral entropy (Figure 1—figure supplement 1).

In addition to examining effects of carbachol on the acoustic structure of syllables, we also tested whether carbachol dialyzed into HVC alters the sequencing of syllables. This was motivated by prior observations that syllable sequencing is systematically altered in directed song relative to undirected song (Hampton et al., 2009; Sakata et al., 2008; Sossinka and Böhner, 1980). Consistent with the changes to sequencing observed during directed song, carbachol significantly altered transition probabilities at variable transitions in song (‘branch points’) (Figure 2A−C; n = 15 branch points, seven birds), and significantly increased the number of times that syllables were repeated (Figure 2D and E; carbachol vs. baseline, p=0.031, signed-rank test, n = 7 repeated syllables, six birds; Hampton et al., 2009; Sakata et al., 2008). Further, the effect on syllable repetitions was greater for more variable repetitions, as previously observed for temperature manipulations of HVC (Figure 2F; Zhang et al., 2017). Previous studies in Bengalese finches have also reported that syllable sequencing can be more stereotyped during directed song, as measured by a reduction in transition entropy at variable transitions in song (Hampton et al., 2009; Sakata et al., 2008). Consistent with this, we observed modest trends toward reduced transition entropy for both branch points and repeated syllables (see Methods; mean ± s.e.m. reduction in transition entropy for branch points: 7.1 ± 6.3%; repeated syllables: 5.3 ± 2.5%; carbachol vs. baseline for branch points: p=0.17, repeated syllables: p=0.078, signed-rank test). Hence, carbachol dialyzed into HVC and directed song were associated with similar changes to syllable sequencing.
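The notion of transition entropy at a branch point can be illustrated with a short sketch. It assumes transition entropy is the Shannon entropy (in bits) of the transition-probability distribution out of a syllable; the exact definition used by the authors is given in their Methods and may differ. The transition counts are hypothetical, with syllable labels borrowed from the Figure 2 example.

```python
from math import log2

def transition_entropy(counts):
    """Shannon entropy (bits) of the transition distribution given raw counts."""
    total = sum(counts.values())
    probs = [c / total for c in counts.values() if c > 0]
    return -sum(p * log2(p) for p in probs)

# Hypothetical counts of transitions out of syllable 'X' at a branch point.
baseline_counts = {"Y": 50, "Z": 50}
drug_counts = {"Y": 75, "Z": 25}

h_baseline = transition_entropy(baseline_counts)
h_drug = transition_entropy(drug_counts)
print(f"entropy: baseline {h_baseline:.3f} bits, drug {h_drug:.3f} bits")
```

Under this definition, a shift from equal to unequal transition probabilities lowers the entropy, which is what a more stereotyped sequence would look like.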

Figure 2
Microdialysis of carbachol into HVC alters song sequencing.

(A) Spectrogram of a song with a divergent branch point. Syllable ‘X’ can transition to syllable ‘Y’ or syllable ‘Z’. (B) Transition probabilities before and after either carbachol or saline dialysis for the branch point shown in panel A. (C) Change in transition probability averaged across all branch points for each bird (see Methods; mean ± s.e.m. change for Carb: 0.11 ± 0.025, Saline: 0.053 ± 0.022; Carb vs. Saline, p=0.047, two-sided signed-rank test, n = 15 branch points, seven birds, 11 experiments for Carb, 12 experiments for Saline). On a case-by-case basis, transition probabilities at 6 out of 15 branch points were significantly affected by carbachol, while only 1 out of 15 was significantly affected by saline (p<0.05, generalized likelihood ratio test for homogeneity, see Methods). (D) Song spectrogram depicting a variably repeated syllable (syllable 'J'). (E) Histogram of repeat counts before and after carbachol for the repeated syllable shown in panel D. (F) Scatter plot of repeat length c.v. vs. normalized repeat count (carbachol/baseline) after carbachol (p=0.0012, test for non-zero Pearson's correlation coefficient, corr. coeff. = 0.95; mean ± s.e.m. increase in repeat length: 7.4 ± 3.6%; mean ± s.e.m. repeat length c.v.: 0.23 ± 0.043, n = 7 repeated syllables, six birds, 12 experiments). *p<0.05.

To determine whether the effects of carbachol are mediated by muscarinic or nicotinic receptors, both of which are expressed in HVC (Asogwa et al., 2018; Ball et al., 1990; Watson et al., 1988), we tested how the effects produced by carbachol were affected by the concurrent dialysis of either muscarinic or nicotinic receptor antagonists into HVC (n = 5 birds). The effects produced by carbachol on pitch, pitch c.v., amplitude, and tempo were attenuated by the muscarinic antagonist atropine, but not by the nicotinic antagonists MEC (mecamylamine) and MLA (methyllycaconitine; Figures 1D, E, G and I; carbachol+atropine vs. carbachol+MEC+MLA, p<0.05 for all features except for tempo, signed-rank test; Figure 1—source data 1). In summary, we found that activation of muscarinic acetylcholine receptors in HVC enhances song vigor, producing a suite of behavioral changes that largely parallels those observed during directed song (Cooper and Goller, 2006; James and Sakata, 2015; Kao et al., 2005; Sakata et al., 2008; Sossinka and Böhner, 1980; Suri and Rajan, 2018).

Acetylcholine invigorates song via the cortical-brainstem motor pathway

In principle, acetylcholine could invigorate song through either of two major pathways emanating from HVC: through the direct cortical-brainstem motor pathway, via a projection from HVC to RA, or through basal ganglia circuitry, via an indirect pathway from HVC→Area X→DLM→LMAN→RA (the Anterior Forebrain Pathway or ‘AFP’; Figure 1A). Previous studies in mammalian systems have identified the basal ganglia as a key locus for the modulation of motor vigor (Panigrahi et al., 2015; Yttri and Dudman, 2016), supporting the possibility that the effects of carbachol infusion into HVC could reflect primarily an influence on AFP circuitry. Further, a number of studies in songbirds have identified the AFP as a critical site for social modulation of pitch variability that occurs during directed song (Hampton et al., 2009; Jarvis et al., 1998; Kao and Brainard, 2006; Kao et al., 2005; Leblois et al., 2010; Stepanek and Doupe, 2010; Teramitsu and White, 2006; Woolley, 2016).

To determine whether basal ganglia circuitry contributes to acoustic changes produced by carbachol, we microdialyzed carbachol into HVC while inactivating LMAN with muscimol (Figure 3A and B; n = 4 birds). LMAN is the main output nucleus of the AFP, and muscimol inactivation of LMAN disconnects the AFP from the song motor pathway (Andalman and Fee, 2009; Ölveczky et al., 2011; Stepanek and Doupe, 2010; Warren et al., 2011). As in previous studies, infusion of muscimol caused a significant decrease in pitch variability, confirming that LMAN was effectively inactivated (reduction in pitch c.v.: 23 ± 5.3%, mean ± s.e.m.; muscimol vs. baseline, p=0.0020, signed-rank test). However, even when LMAN was inactivated, infusion of carbachol into HVC caused increases in pitch, pitch stereotypy, tempo, and amplitude (Figure 3B–F). For each of these features, the increases produced by combined carbachol + LMAN inactivation were greater than for LMAN inactivation alone (p<0.05 for pitch, amplitude, and tempo; p=0.084 for pitch c.v.; signed-rank test; see also Figure 3—source data 1 for mixed effects analysis). Moreover, for each feature, the changes elicited by combined carbachol + LMAN inactivation were not significantly different from the sum of the individual effects of carbachol and LMAN inactivation (pitch: p=0.77, pitch c.v.: p=0.85, tempo: p=0.22, amplitude: p=0.71, signed-rank test; Figure 3—source data 1). These results indicate that increased cholinergic tone in HVC can modulate song via primary motor circuitry independently of input from the songbird basal ganglia, and thus provide support for a recent model proposing that basal ganglia and non-basal ganglia pathways can make independent and additive contributions to behavior (Yttri and Dudman, 2018).
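The additivity argument can be checked informally against the group means given in the Figure 3 legend. The sketch below simply sums the mean effects of carbachol alone and LMAN inactivation alone and sets the result beside the mean for the combined manipulation. This is only an arithmetic illustration on summary values: the authors' test was a signed-rank comparison across syllables, and the dictionary keys and function name are hypothetical.

```python
# Group means (percent) as reported in the Figure 3 legend.
effects = {
    "pitch increase": {"carb": 1.4, "no_lman": 0.68, "combined": 2.0},
    "pitch c.v. reduction": {"carb": 12.0, "no_lman": 23.0, "combined": 34.0},
    "sequence duration reduction": {"carb": 2.7, "no_lman": -0.55, "combined": 3.0},
    "amplitude increase": {"carb": 13.0, "no_lman": 3.8, "combined": 17.0},
}

def additive_prediction(entry):
    """Sum of the two individual mean effects."""
    return entry["carb"] + entry["no_lman"]

predictions = {name: additive_prediction(e) for name, e in effects.items()}
for name, e in effects.items():
    print(f"{name}: additive prediction {predictions[name]:.2f}%, "
          f"combined {e['combined']:.2f}%")
```

For pitch, pitch c.v., and amplitude the summed means fall close to the combined means, while the tempo values differ somewhat more, which is broadly in keeping with the p-values reported above.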

Figure 3
Carbachol invigorates song via the cortical-brainstem motor pathway.

(A) Experiment schematic. Carbachol (Carb) was microdialyzed into HVC, and LMAN was inactivated with muscimol (No LMAN), concurrently or separately. (B) Time course of pitch values for representative experiments from one bird (the same syllable is shown for each condition). For each condition, pitch is plotted as percent change relative to the average value during baseline (−120 to 0 min.). Points in color indicate data used for analysis of drug effects (90 to 150 min.). (C) Normalized (drug/baseline) pitch (mean ± s.e.m. increase in pitch for Carb: 1.4 ± 0.32%, No LMAN: 0.68 ± 0.29%, Carb+No LMAN: 2.0 ± 0.24%; Carb+No LMAN vs. No LMAN, p=0.0020, two-tailed signed-rank test, n = 10 syllables, four birds). (D) Normalized (drug/baseline) pitch c.v. (mean ± s.e.m. reduction in pitch c.v. for Carb: 12 ± 4.2%, No LMAN: 23 ± 5.3%, Carb+No LMAN: 34 ± 4.7%; Carb+No LMAN vs. No LMAN, p=0.084, two-tailed signed-rank test, n = 10 syllables, four birds). (E) Normalized (drug/baseline) syllable sequence duration (mean ± s.e.m. reduction in sequence duration for Carb: 2.7 ± 0.72%, No LMAN: −0.55 ± 0.47%, Carb+No LMAN: 3.0 ± 0.95%; Carb+No LMAN vs. No LMAN, p=0.031, two-tailed signed-rank test, n = 7 syllable sequences, four birds). (F) Normalized (drug/baseline) amplitude (mean ± s.e.m. increase in amplitude for Carb: 13 ± 3.5%, No LMAN: 3.8 ± 1.5%, Carb+No LMAN: 17 ± 4.7%; Carb+No LMAN vs. No LMAN, p=0.035, two-tailed signed-rank test, n = 14 syllables, four birds). **p<0.01, *p<0.05. For panels C–F, each point represents one syllable or syllable sequence averaged over experiments. For panels C, D, and E, n = 9 experiments for all conditions. For panel F, n = 8 experiments for Carb, nine for No LMAN, and nine for Carb + No LMAN.

Acetylcholine increases neural activity in HVC

To determine how increased cholinergic tone alters HVC activity, we recorded multi-unit neural activity in HVC of singing Bengalese finches before and after microdialysis of carbachol (Figure 4A). We focused on multi-unit activity, since stable recordings of isolated single units are difficult to maintain over the course of pharmacological manipulations as required for these experiments. The signal-to-noise ratio (SNR) of these recordings ranged from 3.2 to 6.7 (see Methods), and firing rates ranged from 50 to 560 Hz, indicating that most recordings sampled from multiple neurons simultaneously (Figure 4—figure supplement 1). Moreover, due to the higher firing rates of inhibitory interneurons relative to the excitatory projection neurons in HVC, such multi-unit activity is likely to primarily reflect interneuron activity (Hahnloser et al., 2002; Kozhevnikov and Fee, 2007; Liberti et al., 2016; Rauske et al., 2003).

Figure 4
Carbachol increases HVC multi-unit firing rates.

(A) Experiment schematic. Carbachol was microdialyzed into HVC while recording neural activity with multi-electrode arrays. (B) Example multi-unit site (one experiment; activity aligned to the onset of one syllable). Top left, spectrogram of the syllable used for alignment. Middle left, raster plot of the multi-unit site (red dashed line: onset of carbachol). Bottom left, rendition-averaged firing rates (mean ± s.e.m., smoothed with a 5 ms SD gaussian kernel). Right, example raw traces for this multi-unit site (bandpass filtered between 300 and 4000 Hz). (C) Population average firing rates aligned to syllable onsets, before and after saline. Prior to averaging across sites/syllables, mean firing rates from both baseline and saline blocks were normalized by the maximum of both conditions in a 100 ms window centered on the syllable onset (n = 118 multi-unit sites x syllables, five birds, eight experiments). (D) Population average firing rates aligned to syllable onsets, before and after carbachol. Data are normalized as in panel C (n = 202 multi-unit sites x syllables, five birds, eight experiments). (E) Percent change in firing rate after switch to carbachol relative to baseline, or after switch to saline relative to baseline (mean ± s.e.m. increase in firing rate for Carb in −30 to 0 ms window: 12 ± 1.6%; Carb 0 to 30 ms: 7.7 ± 1.6%, Saline −30 to 0 ms: 2.2 ± 2.0%, Saline 0 to 30 ms: 1.0 ± 1.9%; Carb vs. Saline in −30 to 0 ms window, p=1.3e-4, two-tailed rank-sum test; Carb vs. Saline 0 to 30 ms, p=0.0039, two-tailed rank-sum test; Carb −30 to 0 ms vs. Carb 0 to 30 ms, p=5.4e-4, two-tailed signed-rank test). (F) Percent change in Fano factor after switch to carbachol relative to baseline, or after switch to saline relative to baseline (mean ± s.e.m. change in Fano factor for Carb in −30 to 0 ms window: 1.0 ± 2.3%; Carb 0 to 30 ms: 3.3 ± 2.5%, Saline −30 to 0 ms: 4.2 ± 3.5%, Saline 0 to 30 ms: −1.8 ± 2.5%; Carb vs. Saline in −30 to 0 ms window, p=0.73, two-tailed rank-sum test; Carb vs. Saline 0 to 30 ms, p=0.29, two-tailed rank-sum test). ***p<0.001, **p<0.01, n.s., not significant.

For each recording site and syllable, we computed rendition-averaged firing rates during saline and carbachol blocks aligned to syllable onsets (Figure 4B). During control saline blocks, there was no change in average firing rates relative to baseline, indicating that multi-unit recordings remained stable over the period of drug dialysis (Figure 4C; change in firing rate in 100 ms window centered on syllable onsets: 1.1 ± 1.8%, mean ± s.e.m.; saline vs. baseline, p=0.64, signed-rank test; n = 118 multi-unit sites x syllables, five birds). In contrast, we found that carbachol significantly increased average firing rates in HVC relative to baseline (Figure 4D; increase in firing rate in 100 ms window centered on syllable onsets: 9.9 ± 1.4%, mean ± s.e.m.; carbachol vs. baseline, p=2.8e-9, signed-rank test; n = 202 multi-unit sites x syllables, five birds; see also Figure 4—source data 1 for mixed effects analysis). In general, the firing rate changes caused by carbachol were complex, varying in magnitude at different time points in song (Figure 4B and Figure 4—figure supplement 1). However, on average there was a significant increase in firing rate both preceding and following syllable onsets, with a modestly greater effect in the premotor window preceding syllable onsets (Figure 4D and E; maximum of 16% increase in firing rate at 25 ms before syllable onsets, minimum of 6.1% at 15 ms after syllable onsets).
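The windowed firing-rate comparison can be sketched as follows. The example counts spikes in a 100 ms window centered on syllable onset for each rendition, averages across renditions to obtain a rate in Hz, and reports the percent change between two blocks. Spike times and function names are hypothetical, and the sketch omits the smoothing and normalization steps described in the Figure 4 legend.

```python
def window_rate_hz(renditions_ms, start_ms, stop_ms):
    """Mean firing rate (Hz) across renditions for spikes in [start_ms, stop_ms)."""
    counts = [
        sum(1 for t in spikes if start_ms <= t < stop_ms)
        for spikes in renditions_ms
    ]
    mean_count = sum(counts) / len(counts)
    return mean_count / ((stop_ms - start_ms) / 1000.0)

def percent_change(baseline, drug):
    """Percent change of `drug` relative to `baseline`."""
    return 100.0 * (drug / baseline - 1.0)

# Hypothetical spike times (ms relative to syllable onset), one list per rendition.
baseline_renditions = [[-40, -10, 20, 30], [-30, 0, 10, 45]]
drug_renditions = [[-45, -40, -10, 20, 30], [-30, -5, 0, 10, 45]]

rate_baseline = window_rate_hz(baseline_renditions, -50, 50)
rate_drug = window_rate_hz(drug_renditions, -50, 50)
change = percent_change(rate_baseline, rate_drug)
print(f"firing rate change: {change:.1f}%")
```

Passing a narrower window, such as −30 to 0 ms or 0 to 30 ms, gives the premotor and post-onset comparisons in the same way.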

The increased acoustic stereotypy we observed during microdialysis of carbachol led us to consider whether this was caused by a corresponding reduction in neural variability in HVC. To evaluate this, we measured the Fano factor (across-rendition spike count variance/mean spike count) for each multi-unit site and syllable (Figure 4F). On average, carbachol did not produce a significant change in Fano factor (carbachol vs. saline, p>0.05 in 30 ms window preceding syllable onsets and 30 ms window after syllable onsets, rank-sum test). We also evaluated neural variability by calculating the spike count variance. We found no significant effect of carbachol relative to saline in the 30 ms window prior to syllable onsets (p=0.17, rank-sum test), and a tendency for carbachol to increase neural variability in the 30 ms window after syllable onsets (p=0.023, rank-sum test). Thus, the increased pitch stereotypy caused by carbachol cannot be explained by reduced neural variability in HVC at the multi-unit level, though we cannot rule out the possibility that this increased behavioral stereotypy arises in part from reduced variability specifically among HVC projection neurons.
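The Fano factor defined above is straightforward to compute from per-rendition spike counts. The sketch below uses hypothetical counts and assumes the sample (n − 1) variance, since the estimator is not specified in the text.

```python
from statistics import mean, variance

def fano_factor(spike_counts):
    """Across-rendition spike-count variance divided by the mean spike count."""
    return variance(spike_counts) / mean(spike_counts)

# Hypothetical spike counts per rendition in a fixed window around syllable onset.
counts_baseline = [4, 6, 5, 5]
counts_drug = [5, 7, 6, 6]

fano_baseline = fano_factor(counts_baseline)
fano_drug = fano_factor(counts_drug)
print(f"Fano factor: baseline {fano_baseline:.3f}, drug {fano_drug:.3f}")
```

In this toy example the variance is identical in both conditions, so the Fano factor falls only because the mean count rises. That dependence on the mean is one reason it can be informative to examine the raw spike count variance alongside the Fano factor, as is done above.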

HVC activity is modulated by social context

Directed song and microdialysis of carbachol into HVC were associated with similar behavioral changes, yet previous studies have found only limited evidence that activity within HVC differs between undirected and directed song (Jarvis et al., 1998; Matheson et al., 2016; Woolley et al., 2014). We therefore wondered if we could detect any changes to neural activity in HVC during directed song similar to the increases in activity caused by carbachol infusion. To address this, we recorded multi-unit neural activity in HVC during interleaved blocks of directed and undirected singing (Figure 5A; n = 5 birds). We computed rendition-averaged firing rates aligned to syllable onsets separately for directed and undirected songs. The pattern of neural modulation with respect to song features was largely conserved across social contexts: the mean ± s.e.m. correlation coefficient between directed and undirected firing rates was 0.90 ± 0.010 (e.g. Figure 5B and Figure 5—figure supplement 1; n = 151 multi-unit sites × syllables, 15 unique multi-unit sites, five birds). Nevertheless, HVC multi-unit firing rates were consistently higher during directed song relative to undirected song (Figure 5B−5D and Figure 5—figure supplement 1; increase in firing rate in 100 ms window centered on syllable onsets: 7.2 ± 1.0%, mean ± s.e.m.; directed vs. undirected, p=1.1e-15, signed-rank test; n = 151 multi-unit sites × syllables, five birds; see also Figure 5—source data 1 for mixed effects analysis). Further, the percent increase in firing rate for directed song was not significantly different from that observed after microdialysis of carbachol (p=0.90, rank-sum test). The patterns of increased firing during directed song were idiosyncratic across recording sites and syllables (Figure 5—figure supplement 1). However, as observed for carbachol, there was on average a significant increase in firing rate both preceding and following syllable onsets (Figure 5D; mean ± s.e.m. increase in −30 to 0 ms window: 7.9 ± 1.2%; 0 to 30 ms window: 7.5 ± 1.2%). Overall, these results reveal that both microdialysis of carbachol and directed song are associated with greater activity in HVC, and raise the question of whether cholinergic signaling in HVC normally contributes to social modulation of song.
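The window-based comparison described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code (their MATLAB code is in Source code 1). The function names, the toy spike times, and the representation of each rendition as a list of spike times in seconds relative to syllable onset are all assumptions made for illustration.

```python
from statistics import mean


def window_rate(renditions, start, stop):
    """Mean firing rate (spikes/s) in [start, stop), times relative to syllable onset.

    `renditions` is a list of spike-time lists, one per song rendition (hypothetical format).
    """
    counts = [sum(start <= t < stop for t in spikes) for spikes in renditions]
    return mean(counts) / (stop - start)


def percent_change(directed_rate, undirected_rate):
    """Percent change of the directed rate relative to the undirected rate."""
    return 100.0 * (directed_rate - undirected_rate) / undirected_rate


def pearson_r(x, y):
    """Pearson correlation between two equal-length firing-rate traces."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Toy spike times (seconds), invented for illustration only.
undirected = [[-0.040, -0.010, 0.020], [-0.030, 0.010]]
directed = [[-0.045, -0.020, 0.005, 0.030], [-0.010, 0.015, 0.040]]

# 100 ms window centered on syllable onset, as in the text.
change = percent_change(
    window_rate(directed, -0.05, 0.05),
    window_rate(undirected, -0.05, 0.05),
)
```

In this sketch the percent change is computed once per site and syllable; in the study, such values were then compared across the population with a signed-rank test.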

Figure 5 (with 1 supplement)
HVC activity is modulated by social context.

(A) Experiment schematic. Multi-unit activity in HVC was recorded during interleaved female-directed and undirected song sessions. (B) Example multi-unit site (one experiment; activity aligned to the onset of one syllable). Top, spectrogram of the syllable used for alignment. Middle, raster plot of the multi-unit site (activity plotted chronologically from top to bottom; spaces separate blocks of directed or undirected singing). Middle right, mean ± s.e.m. firing rates for each block of singing (firing rates were computed in a 100 ms window centered on the syllable onset). Bottom, rendition-averaged firing rates (smoothed with a 5 ms SD Gaussian kernel; bold lines show mean firing rates for all renditions, light lines show mean firing rates for each block of directed or undirected song). (C) Population average firing rates aligned to syllable onsets, for directed and undirected conditions. Prior to averaging across sites/syllables, mean firing rates from directed and undirected renditions were normalized by the maximum of both conditions in a 100 ms window centered on the syllable onset (n = 151 multi-unit sites × syllables, five birds, 10 experiments). (D) Percent change in firing rate during directed song relative to undirected song (mean ± s.e.m. increase in firing rate in −30 to 0 ms window: 7.9 ± 1.2%; 0 to 30 ms window: 7.5 ± 1.2%; directed vs. undirected in −30 to 0 ms window: p=9.7e-11, 0 to 30 ms window: p=1.4e-12; firing rate change in −30 to 0 ms window vs. 0 to 30 ms window: p=0.31, two-tailed signed-rank test). ***p<0.001.
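The smoothing and normalization steps named in the legend can be sketched as follows. This is a minimal illustration rather than the authors' implementation. The 1 ms bin width, the kernel truncation at 4 SD, the edge handling, and all function names are assumptions.

```python
import math


def gaussian_kernel(sd_ms, bin_ms=1.0, n_sd=4):
    """Discrete Gaussian kernel normalized to sum to 1 (truncated at n_sd SDs; an assumption)."""
    half = int(round(n_sd * sd_ms / bin_ms))
    weights = [math.exp(-0.5 * ((i * bin_ms) / sd_ms) ** 2) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(rate, kernel):
    """Same-length smoothing; near the edges, renormalize by the kernel mass that overlaps the data."""
    half = len(kernel) // 2
    out = []
    for i in range(len(rate)):
        acc, mass = 0.0, 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(rate):
                acc += w * rate[j]
                mass += w
        out.append(acc / mass)
    return out


def normalize_pair(directed, undirected, window):
    """Scale both traces by the largest value of either trace inside `window` (a slice of bins)."""
    peak = max(max(directed[window]), max(undirected[window]))
    return [v / peak for v in directed], [v / peak for v in undirected]


kernel = gaussian_kernel(5.0)          # 5 ms SD, as stated in the legend
flat = smooth([10.0] * 100, kernel)    # a constant trace should stay constant
d_norm, u_norm = normalize_pair([2.0, 4.0, 3.0], [1.0, 2.0, 8.0], slice(0, 2))
```

Dividing both conditions by a shared maximum keeps the directed/undirected difference visible after normalization, which would be lost if each trace were scaled to its own peak.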

Acetylcholine contributes to the social modulation of song

To test whether cholinergic signaling in HVC contributes to the social modulation of song, we quantified how the normal differences between directed and undirected song were affected by dialysis of the muscarinic antagonist atropine into HVC (Figure 6A and B). Under control conditions, as previously reported (Hampton et al., 2009; James and Sakata, 2015; Sakata et al., 2008), directed song was higher in pitch, less variable in pitch, and faster than undirected song (Figure 6D–F, saline conditions). The increase in song tempo during directed song reflected decreases in the durations of both syllables and gaps, with a proportionately larger decrease in gaps as also observed for carbachol (Figure 6—figure supplement 1; mean ± s.e.m. reduction in duration for syllables: 1.0 ± 0.49%, gaps: 4.2 ± 1.4%; syllables vs. gaps, p=0.016, rank-sum test, pre and post saline conditions combined). In contrast, for this set of birds, we did not observe robust effects of social context on syllable sequencing under control saline conditions as has been observed in some previous studies of how social context influences song (Hampton et al., 2009; Toccalino et al., 2016; Figure 6—figure supplement 1; see also Methods). To investigate whether social modulation of song is dependent on cholinergic signaling in HVC, we therefore primarily focused our analysis on pitch, pitch c.v., and song tempo—those features that were significantly modulated by social context under control conditions.

Figure 6 (with 2 supplements)
Atropine attenuates the social modulation of song.

(A) Experiment schematic. Atropine or saline was microdialyzed into HVC during interleaved female-directed and undirected song sessions. (B) Song was recorded on three separate days for each bird in the following order: Saline pre, Atropine, Saline post, with one day between sessions. See Methods for details. (C) Pitch distributions for one example syllable. (D) Normalized pitch (directed/undirected) for all syllables (mean ± s.e.m. increase in pitch for Saline pre: 0.59 ± 0.19%, Atropine: 0.32 ± 0.13%, Saline post: 0.60 ± 0.17%; Saline pre vs. Atropine, p=0.046, one-tailed signed-rank test; Saline post vs. Atropine, p=0.078, one-tailed signed-rank test). (E) Normalized pitch c.v. (directed/undirected) for all syllables (mean ± s.e.m. reduction in pitch c.v. for Saline pre: 41 ± 3.9%, Atropine: 19 ± 4.4%, Saline post: 37 ± 4.6%; Saline pre vs. Atropine, p=0.0028, one-tailed signed-rank test; Saline post vs. Atropine, p=0.012, one-tailed signed-rank test). For pitch and pitch c.v., n = 16 syllables, six birds, six experiments per condition. (F) Normalized syllable sequence duration (directed/undirected) for all syllable sequences (mean ± s.e.m. reduction in sequence duration for Saline pre: 2.4 ± 0.29%, Atropine: 1.9 ± 0.50%, Saline post: 1.9 ± 0.39%; Saline pre vs. Atropine, p=0.098, one-tailed signed-rank test; Saline post vs. Atropine, p=0.47, one-tailed signed-rank test, n = 8 syllable sequences, seven birds, seven experiments per condition). **p<0.01, *p<0.05, n.s., not significant.
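The normalized measures in panels D and E (directed/undirected ratios of pitch and of pitch c.v.) can be computed as in the sketch below. It is an illustration under stated assumptions: the use of the sample standard deviation for the c.v., the function names, and the toy pitch values are not taken from the source.

```python
from statistics import mean, stdev


def cv(values):
    """Coefficient of variation: sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def social_modulation(directed_pitch, undirected_pitch):
    """Directed/undirected ratios for one syllable, expressed as percent changes."""
    pitch_ratio = mean(directed_pitch) / mean(undirected_pitch)
    cv_ratio = cv(directed_pitch) / cv(undirected_pitch)
    return {
        "pitch_percent_increase": 100.0 * (pitch_ratio - 1.0),
        "cv_percent_reduction": 100.0 * (1.0 - cv_ratio),
    }


# Toy pitch measurements in Hz, invented for illustration only.
undirected = [990.0, 1000.0, 1010.0]
directed = [1001.0, 1006.0, 1011.0]
result = social_modulation(directed, undirected)
```

With these toy values, directed song is 0.6% higher in pitch and roughly half as variable, which is the direction of the effects reported under saline conditions.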

We found that blocking muscarinic acetylcholine receptors in HVC with atropine caused an attenuation of social modulation for each of these features, which achieved significance for pitch and pitch c.v., but not song tempo (Figure 6C–F and Figure 6—figure supplement 1; see also Figure 6—source data 1 for mixed effects analysis). The reduction in social modulation of pitch and pitch c.v. could in principle reflect an effect of atropine on directed song or on undirected song. We therefore separately analyzed the differences between saline and atropine conditions for each of these features for both directed and undirected song. We found that atropine caused a significant decrease in pitch and increase in pitch c.v. for directed song (relative to saline conditions), but had no effect on these features for undirected song (Figure 6—figure supplement 2). Thus, atropine attenuated a process that is specifically engaged during directed song, indicating that increased cholinergic signaling in HVC normally contributes to the increased pitch and reduced pitch variability of directed song.

Discussion

Cholinergic neurons project throughout the forebrain, including motor and sensory cortices, and contribute importantly to global changes in brain activity in aroused behavioral states (Buzsaki et al., 1988; Eckenstein et al., 1988; McKinney et al., 1983; Metherate et al., 1992; Raghanti et al., 2008). Nonetheless, how cholinergic signaling affects motor cortical activity and behavior remains poorly understood. Here we examined how acetylcholine affects motor behavior in the context of birdsong, leveraging the quantifiable nature of song and the well-defined neural circuitry underlying song production. We found that pharmacological enhancement of cholinergic signaling had an activating effect on HVC and concomitantly increased pitch, amplitude, tempo, and stereotypy of song. These behavioral changes did not require the songbird basal ganglia, indicating that cholinergic enhancement of song vigor occurred via a direct cortical-brainstem pathway. Moreover, the natural increases in pitch and stereotypy of song that are elicited in a courtship context were accompanied by similar increases in neural activity and were attenuated by blockade of cholinergic receptors in HVC. Thus, our findings demonstrate that acetylcholine contributes to enhanced vigor of a motor skill by direct action on cortical premotor circuitry.

Distributed circuits for the control of motor vigor

Our demonstration that cholinergic invigoration of song does not require participation of basal ganglia circuitry indicates that the neural control of motor vigor is more distributed than has typically been recognized. Prior experimental work and theoretical treatments of motor vigor have primarily focused on basal ganglia circuitry (Manohar et al., 2015; Panigrahi et al., 2015; Schmidt et al., 2008; Shadmehr et al., 2019; Yttri and Dudman, 2016). While some authors have recognized that cortical and other non-basal ganglia circuitry are likely to contribute to the control of motor vigor (Dudman and Krakauer, 2016; Yttri and Dudman, 2018), there have been limited experimental tests of this proposal. Our findings indicate that in some situations, motor invigoration can indeed occur independently of basal ganglia circuitry. This separability raises the possibility that cortical and basal ganglia circuitry contribute differentially to the control of movement vigor in distinct contexts. In particular, basal ganglia circuits have been linked to shaping the vigor of motor output in the motivational context of goal-directed behaviors—e.g. movements that are generated to obtain reward, with local dopaminergic signaling playing a key role (Berke, 2018; Dudman and Krakauer, 2016; Mazzoni et al., 2007; Schmidt et al., 2008; Turner and Desmurget, 2010). In turn, cholinergic modulation of forebrain motor circuitry may especially contribute to invigoration of behaviors that are adaptive in externally-triggered states of heightened physiological arousal, like those that involve escape from danger, prey capture or courtship. Nonetheless, in many naturalistic contexts cortical and basal ganglia circuitry may jointly control movement vigor, with coordination between these pathways mediated by bidirectional feedback between them (Bosch-Bouju et al., 2013; Yttri and Dudman, 2018). 
In mammals, projections from cholinergic neurons in the pedunculopontine nucleus to both the cholinergic basal forebrain and dopaminergic neurons in the substantia nigra also link these two pathways (Bolam et al., 1991; Lee and Dan, 2012), and much of this anatomy is likely to be conserved in songbirds (Medina and Reiner, 1994).

Neural mechanisms underlying the control of motor vigor in songbirds

The persistence of cholinergic effects on song despite the inactivation of basal ganglia circuitry indicates that invigoration of movement can be mediated by direct engagement of cortical projections to brainstem motor nuclei. In the song system, the cortical motor nucleus RA serves as an intermediary between HVC and brainstem motor regions (Figure 1A). Our findings indicate that cholinergic invigoration of song can be attributed to altered neural activity within this pathway (from HVC→RA→brainstem). RA also receives input from basal ganglia circuitry via LMAN (Figure 1A), so that it is well positioned to integrate both cortical and basal ganglia contributions to song vigor (Thompson and Johnson, 2007). Such invigoration likely includes increased activity of RA projection neurons, which has been linked to increased pitch and amplitude via an excitatory influence on the relevant syringeal and respiratory muscles of the vocal apparatus (Goller and Riede, 2013; Sober et al., 2008; Srivastava et al., 2015). Consistent with this model, pharmacological excitation of RA projection neurons (via manipulation of inhibition) produces similar behavioral changes to microdialysis of carbachol into HVC (Miller et al., 2017). We therefore predict that the increase in neural activity in HVC in response to carbachol administration results in net excitation of RA and downstream vocal musculature.

Within HVC, vocal invigoration is likely to be mediated in part by increased activity of the neurons that project directly to RA (the HVCRA neurons), which we hypothesize occurs in conjunction with increased activity of both the basal ganglia-projecting HVCX neurons and local inhibitory interneurons. While our multi-unit recordings cannot directly resolve how these three major HVC cell types are affected by acetylcholine, consideration of our findings in light of prior songbird studies supports the idea that acetylcholine increases both projection neuron and interneuron activity. Multi-unit recordings in HVC are thought to primarily sample from the inhibitory interneurons, because this population is considerably more active than the projection neurons during song and awake quiescent periods (Hahnloser et al., 2002; Kozhevnikov and Fee, 2007; Liberti et al., 2016; Rauske et al., 2003). Since interneurons provide a strong source of inhibition to projection neurons within HVC (Kosche et al., 2015; Mooney and Prather, 2005), our finding that acetylcholine increases multi-unit activity suggests that acetylcholine could decrease rather than increase projection neuron activity. However, a model in which acetylcholine suppresses HVC projection neuron activity during song is difficult to reconcile with the cellular effects of acetylcholine measured in acute slice electrophysiology experiments: muscarinic acetylcholine receptor agonists depolarize both classes of projection neurons and hyperpolarize interneurons (Shea et al., 2010). Extrapolating from these findings, if acetylcholine applied in vitro hyperpolarizes interneurons, while carbachol applied in vivo causes an increase in their firing rate, then carbachol applied in vivo must recruit a sufficiently strong source of excitatory input to interneurons to overwhelm any suppressive cellular effect that it has.
We hypothesize that this excitatory input originates primarily from the local HVC projection neurons (Kosche et al., 2015; Mooney and Prather, 2005), which are driven to greater activation by the depolarizing effect of acetylcholine. Notably, injections of carbachol into HVC of anesthetized songbirds increase spontaneous activity within RA (Shea and Margoliash, 2003), supporting the view that acetylcholine increases the activity of the HVCRA neurons in particular. In sum, our data and prior literature suggest a model in which acetylcholine increases the activity of all major cell classes within HVC in tandem. Such a coordinated increase in activity among the major excitatory and inhibitory cell classes within HVC also occurs normally during the transition from non-singing quiescent periods to active song production (Kozhevnikov and Fee, 2007), and is broadly consistent with the tight coupling of excitation and inhibition that has been observed in mammalian cortical regions (Atallah and Scanziani, 2009; Haider et al., 2006; Okun and Lampl, 2008; Wehr and Zador, 2003).

Contributions of HVC to acoustic variability

In the context of song control, our finding that acetylcholine can operate on HVC to reduce pitch variability is somewhat surprising. Most previous studies that have manipulated HVC activity have not reported a reduction in behavioral variability (Hamaguchi et al., 2016; Long and Fee, 2008; Zhang et al., 2017; but see Isola et al., 2020). In contrast, lesions and pharmacological inactivation of the AFP output nucleus LMAN reduce pitch variability substantially (Hampton et al., 2009; Kao and Brainard, 2006; Kao et al., 2005; Stepanek and Doupe, 2010). The observation that HVC projection neurons exhibit extremely low trial-to-trial variability also might suggest that HVC does not introduce substantial behavioral variability (Hahnloser et al., 2002). In contrast, our findings suggest that a significant source of behavioral variability originates within HVC. Conceivably, this variability could be harnessed in the service of reinforcement learning in much the same way that is thought to occur for variability originating from the AFP (Charlesworth et al., 2012; Kojima et al., 2018).

How might cholinergic modulation of HVC contribute to the observed reduction in behavioral variability? Most simply, acetylcholine could cause decreased rendition-by-rendition variability in HVCRA and/or HVCX projection neurons that are well-positioned to directly and indirectly control variability in RA. However, since we did not observe reduced neural variability at the multi-unit level, we also consider how changes within HVC could drive reduced behavioral variability even in the absence of reduced variability in projection neuron populations. One possibility is that acetylcholine decorrelates the activity across HVC projection neurons such that their added contributions to motor effectors ‘cancel out’ (Darshan et al., 2017; Kaufman et al., 2014; Sober et al., 2008). Alternatively, by increasing the firing rates of HVCRA projection neurons (as discussed above), variability within downstream RA projection neurons could be suppressed through a saturation mechanism, similar to that proposed to account for developmental reduction in vocal variability (Garst-Orozco et al., 2014). Additionally, at the network level, increased drive to RA could suppress intrinsic dynamics within RA that amplify external perturbations (originating from LMAN, for example; Mastrogiuseppe and Ostojic, 2018; Rajan et al., 2010). This in turn could disrupt the correlation of activity within RA that contributes to behavioral variability (Darshan et al., 2017; Sober et al., 2008).

A potential role for the cholinergic system in movement disorders

Our finding that arousing or activating stimuli can invigorate movement by mechanisms that are distinct from the action of dopamine in the basal ganglia may explain observations that certain sensory cues and emotional stimuli can have a prokinetic effect on movement disorder patients. Patients with Parkinson's disease or basal ganglia damage engage in fewer volitional movements and exhibit a reduced willingness to exert effort to obtain reward, suggesting that reduced motor vigor in these patients is primarily motivational in nature (Mazzoni et al., 2007; Schmidt et al., 2008). However, these patients can exhibit 'paradoxical' movements in situations that provoke extreme emotion (Bonanni et al., 2010; Glickstein and Stein, 1991), and can effectively modulate grip strength when explicitly instructed to do so, even while exhibiting deficits in the modulation of grip strength by reward (Schmidt et al., 2008). Similarly, Parkinson’s patients can exhibit improvements in movement vigor in response to engaging sensory stimulation (McIntosh et al., 1997; Rubinstein et al., 2002). Our results indicate that such arousal-dependent motor invigoration in these patients could be enabled by cholinergic modulation of cortical motor circuitry. Observations that dopamine-depleted mice (Panigrahi et al., 2015) and patients with Parkinson's disease (Mazzoni et al., 2007) can learn to move more vigorously may similarly depend on cholinergic signaling in cortex, consistent with the known role of acetylcholine in motor learning and associated cortical plasticity (Conner et al., 2003; Conner et al., 2005; Conner et al., 2010).

Conversely, some movement disorders that include decreased speed, force, and movement stereotypy may reflect in part disrupted cholinergic signaling in cortical motor regions. Particularly noteworthy in this respect is the slowing of gait, reduced force generation, and loss of verbal fluency that are frequently observed in patients with Alzheimer’s disease (Buchman et al., 2007; Ferris and Farlow, 2013; Goldman et al., 1999), which is principally associated with the loss of cholinergic neurons in the basal forebrain and diminished cholinergic innervation of the cortex (Francis et al., 1999). Indeed, loss of movement vigor may precede, and be predictive of, subsequent cognitive decline in Alzheimer’s and other diseases (Buchman et al., 2007). An underappreciated role of cortical cholinergic signaling in the invigoration of movements, as indicated by our findings, may both explain this link, and account for some of the ameliorative effects on movements of pro-cholinergic treatments (Ferris and Farlow, 2013).

Contributions of HVC and the cholinergic system to social modulation of song

Our results demonstrate a previously unknown contribution of HVC to the social modulation of song, contrasting with previous studies that emphasize the role of the AFP (Hampton et al., 2009; Kao and Brainard, 2006; Kao et al., 2005; Leblois et al., 2010). While previous neural recordings from HVCX projection neurons in zebra finches did not reveal conspicuous differences between social contexts (Woolley et al., 2014), the other cell classes in HVC were not examined, as was implicitly done here by recording multi-unit activity. However, consistent with our findings, one study of the expression of the immediate early gene (IEG) EGR-1 in HVC of Bengalese finches reported differences between social contexts (Matheson et al., 2016). Notably, that study reported greater expression of EGR-1 in the undirected song condition; since EGR-1 expression is often construed as a proxy for neural activity, this finding is potentially at odds with our observation of increased neural activity during directed song. This disparity could reflect differences between the IEG response measured histologically in postmortem tissue and the amount of neural activity during song, arising from the long integration time of the IEG response and/or nonlinearities in the relationship between neural firing rates and IEG expression (Wang et al., 2019). Alternatively, this discrepancy could be attributed to differences in the specific neural types that contributed to analysis of neural activity versus IEG expression levels, or to other experimental variables that may have differed between studies, such as the protocols for eliciting directed song or the regions of HVC that were sampled (Basista et al., 2014).

The contribution of the cholinergic system to the social modulation of song is also noteworthy, given that previous work has identified dopaminergic and noradrenergic signaling within song system nuclei as contributing to social modulation of song (Castelino and Ball, 2005; Glaze, 2017; Ihle et al., 2015; Leblois et al., 2010; Sasaki et al., 2006). Each of these neuromodulators (acetylcholine, dopamine, and norepinephrine) is part of the classical ‘ascending arousal system’ that responds to activating stimuli and drives changes in internal state (Lee and Dan, 2012). Hence, while our results directly demonstrate a strong influence of acetylcholine in HVC on social modulation of song, we expect that multiple neuromodulatory systems are likely to be engaged in a courtship context, orchestrating adaptive changes to song and other courtship behaviors through their collective influence on distributed brain regions.

Beyond the rapid changes in arousal elicited by changes in social context, arousal levels also vary over slower timescales in relation to circadian rhythms, and this variation in arousal may contribute to circadian changes in song and neural activity in song control regions (Chi and Margoliash, 2001; Day et al., 2009; Derégnaucourt et al., 2005; Glaze and Troyer, 2006; Liberti et al., 2016; Wood et al., 2013). Our results suggest that the ascending arousal system, including cholinergic components associated with circadian changes in wakefulness, could contribute to these circadian changes in song structure and neural activity in a similarly distributed manner.

In the context of female-directed song, cholinergic enhancement of motor vigor may function to augment the female's perception of the singer's fitness, or serve a more general function in communication. A corollary of the finding that acetylcholine robustly modulates the acoustic structure of song is that changes to song structure in principle can be decoded to make inferences about the internal state of the singer. That is, greater song vigor in terms of pitch, amplitude, tempo, and stereotypy could be interpreted by a female conspecific as a greater level of arousal and interest on the part of the male. Indeed, behavioral studies have demonstrated that female birds are attentive to the differences between undirected and female-directed song, and that they prefer the latter (Dunning et al., 2014; Woolley and Doupe, 2008). Similarly, variation in the acoustic structure of song in other settings may serve an adaptive function in communicating levels of arousal, attraction, aggression, or other aspects of internal state. For example, the quality of song in some bird species differs between affiliative interactions in a courtship setting versus territorial interactions between males, or interactions with juveniles (Chen et al., 2016; Trillo and Vehrencamp, 2005). This parallels the observation for human speech that features such as loudness, pitch, and tempo can be decoded to make inferences about the affective state of the speaker (Banse and Scherer, 1996; Fairbanks and Pronovost, 1938; Leinonen et al., 1997). An intriguing possibility is that for both song and speech, the nuances of acoustic structure that contribute importantly to social communication are driven by the combinatorial influences of neuromodulatory systems that reflect corresponding variations in internal state.

Materials and methods

Statistics

Unless noted otherwise, we used nonparametric two-sided tests for comparing two samples: the Wilcoxon rank-sum test for unpaired data and the Wilcoxon signed-rank test for paired data. Details for all statistical tests are included in the figure legends, the main text, or in supplemental tables that accompany the main figures (for linear mixed effects models; see below). For all tests, we rejected the null hypothesis if p<0.05. No statistical methods were used to predetermine sample sizes, though our sample sizes are comparable to those used in previous publications (Sakata and Brainard, 2008; Sober et al., 2008; Stepanek and Doupe, 2010). Unless noted otherwise, data collection and analysis were not performed blind to experimental conditions; however, there was minimal opportunity for subjective biases to influence outcomes as exclusion criteria and quantitative analyses were applied uniformly across experimental conditions. Details on randomization of conditions are discussed in the relevant methods section where applicable. A small number of syllables with very low sample sizes were excluded; details are given in the corresponding methods section. For pitch, tempo, and amplitude measurements, a simple heuristic was used to remove outliers (described in detail below). For amplitude analyses, we excluded a small number of experiments (3/75, combined across conditions) in which large (>25%) and sudden changes in amplitude occurred, as these were likely caused by the bird changing its orientation with respect to the recording microphone (described in detail below). Source data are provided for group summary analyses of all of the main figures in associated ‘.mat’ files. MATLAB (MathWorks) code for analyzing these data and generating summary figures is provided in Source code 1.
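For readers unfamiliar with the paired test named above, the following is a minimal standard-library sketch of an exact two-sided Wilcoxon signed-rank test. It is not the authors' code, which was written in MATLAB. It assumes no zero differences and no tied absolute differences, a simplification that full statistical packages handle more generally. The function name and the toy data are hypothetical.

```python
def signed_rank_exact(x, y):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Returns (W_plus, p_value). Assumes no zero differences and no ties in |x - y|.
    """
    diffs = [a - b for a, b in zip(x, y)]
    if any(d == 0 for d in diffs):
        raise ValueError("zero differences are not handled in this sketch")
    magnitudes = [abs(d) for d in diffs]
    if len(set(magnitudes)) != len(magnitudes):
        raise ValueError("tied absolute differences are not handled in this sketch")

    n = len(diffs)
    order = sorted(range(n), key=lambda i: magnitudes[i])
    # Sum of the ranks (1 = smallest |difference|) belonging to positive differences.
    w_plus = sum(rank for rank, i in enumerate(order, start=1) if diffs[i] > 0)

    # counts[s] = number of subsets of ranks {1..n} that sum to s (null distribution of W+).
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        for s in range(max_sum, r - 1, -1):
            counts[s] += counts[s - r]

    # The null distribution is symmetric, so double the smaller tail.
    w = min(w_plus, max_sum - w_plus)
    p = 2.0 * sum(counts[: w + 1]) / 2 ** n
    return w_plus, min(1.0, p)


# Toy paired data, invented for illustration only.
directed = [12, 15, 9, 20, 18]
undirected = [11, 12, 11, 14, 9]
w_plus, p_value = signed_rank_exact(directed, undirected)
```

The exact enumeration is practical only for small samples. For sample sizes such as the 151 site-by-syllable pairs analyzed here, statistical packages typically switch to a normal approximation.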

Mixed effects models

To supplement the primary statistical analysis reported in the main text and figure legends, we conducted an additional statistical analysis using linear mixed effects models to account for structure in the data that may violate the assumption of independence. Specifically, we accounted for correlations between measurements that were sampled from the same bird—both syllables and multi-unit recording sites—by including bird identity as a random effect in a linear mixed effects model. The details of these analyses are provided in supplemental tables that accompany the main figures. These tables are referred to in the text as source data, e.g. Figure 1—source data 1.
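One common way to write such a model is shown below. It assumes a single fixed effect of experimental condition and a random intercept for each bird. The source states only that bird identity was included as a random effect, so the exact model structure here is an assumption, and the authoritative specifications are in the supplemental tables.

```latex
y_{ij} = \beta_0 + \beta_1 x_{ij} + b_i + \varepsilon_{ij},
\qquad b_i \sim \mathcal{N}(0, \sigma_b^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here $y_{ij}$ is the $j$-th measurement (a syllable or multi-unit site) from bird $i$, $x_{ij}$ codes the condition (for example, drug vs. saline), $\beta_1$ is the effect of interest, and $b_i$ absorbs the shared offset among measurements from the same bird.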

Subjects

Data were collected from 25 adult male Bengalese finches (Lonchura striata domestica; microdialysis only: n = 15; microdialysis + electrophysiology: n = 5; electrophysiology only: n = 5). All but two birds in the study were bred in the University of California, San Francisco (UCSF) breeding facility. The ages of these birds ranged from 124 to 239 days at the start of experiments. The remaining two birds were obtained from outside sources and had adult-like song and physical characteristics. Adult female Bengalese finches (>120 days old) were used to elicit directed song. During experiments, male birds were individually housed in sound-attenuating chambers (Acoustic Systems) on a 14 hr:10 hr light:dark cycle with food and water provided ad libitum. All procedures were performed in accordance with protocols (#AN170723-02) approved by the UCSF Institutional Animal Care and Use Committee.

Song recording

Audio was recorded with custom LabVIEW software (National Instruments; digitized at 32 kHz) using an omnidirectional lavalier microphone (Countryman), or with a USB interface board (Intan Technologies; digitized at 30 kHz) using a custom-made microphone and pre-amplifier system.

In vivo microdialysis

Guide cannulae (CMA 7, CMA Microdialysis) were implanted into HVC or both HVC and LMAN using stereotaxic coordinates. For combined electrophysiology/microdialysis experiments, cannulae were implanted unilaterally in the left HVC (n = 5 birds). For all other experiments, cannulae were implanted bilaterally (HVC + LMAN: n = 4 birds; HVC only: n = 11 birds). After birds recovered from surgery, we inserted microdialysis probes into the cannulae (CMA 7; 0.24 mm diameter, 1 mm diffusion membrane, 6 kDa diffusion pore size).

Dialysis probes were connected to fluid pumps through flexible tubing. Outflow was continually monitored throughout the duration of the experiment. In some cases, we observed leakage from the dialysis tubing or diminished flow as indicated by reduced volume of the outflow. These experiments were excluded from summary analyses and dialysis probes were replaced for subsequent experiments. For experiments without combined electrophysiology, solutions were exchanged to either saline (for control experiments) or drug (carbachol, muscimol, etc.) after a two- to three-hour baseline period (flow rate held constant at 1–1.5 µL/min; solutions exchanged at the same time each day across experiments). For experiments with combined electrophysiology, the duration and time of day that solutions were exchanged varied depending on when the bird sang. For animals in which we tested multiple different conditions (e.g., carbachol vs. carbachol + atropine), each condition was repeated a variable number of times on different days in a randomized order. At least one full day of saline-only infusion was interposed between consecutive drug infusion experiments.

In a subset of birds, we conducted a series of pilot experiments to determine effective drug concentrations. For carbachol experiments, we increased the concentration of carbachol until a significant pitch effect was observed, up to a maximum concentration of 1 mM. In one case, the initial concentration of carbachol (500 µM) caused the bird to call continuously and was reduced in subsequent experiments to 250 µM. For LMAN inactivation experiments, we increased the concentration of muscimol until a significant reduction in pitch c.v. was observed, or to the highest level that did not interfere with singing. For combined microdialysis and electrophysiology experiments, no pilot experiments were conducted, and all experiments were conducted with 1 mM carbachol. Pilot experiments were not included in summary analyses. The final concentrations of drugs used in this study were as follows. Carbachol (Santa Cruz Biotechnology): 250 µM–1 mM; muscimol (Tocris): 250 µM–1 mM; mecamylamine hydrochloride (Sigma, abbreviated MEC): 400 µM; atropine sulfate (Sigma, abbreviated Atrp): 500 µM–2 mM; methyllycaconitine citrate salt (Sigma, abbreviated MLA): 100 µM. In cases where more than one concentration of antagonist was used within a given animal, we included only data from experiments with the highest concentrations. All drugs were diluted in saline.

While we did not directly confirm the extent of drug diffusion, previous studies in songbirds that used a similar dialysis procedure estimated that the radius of maximal spread of biotinylated muscimol is ≤1 mm (Warren et al., 2011). In contrast, the boundaries of the nearest major song control nuclei (NIf and RA) are ~2–3 mm away from our probes, implying that diffusion of drug to these regions is minimal.

In vivo electrophysiology

For experiments in which we combined electrophysiology with microdialysis (n = 5 birds), extracellular recordings from HVC were obtained with custom-designed tungsten electrode arrays (MicroProbes, 10 electrodes, 6 MΩ impedances, n = 1 bird), multi-site silicon electrode arrays (NeuroNexus, A4 × 4–3 mm-50-125-413-H16_21 mm, signal acquired from 12 out of 16 sites, n = 1 bird), or tungsten electrode arrays assembled in-house (FHC or MicroProbes, 5–11 electrodes per array, impedances ranging from 0.5 MΩ to 10 MΩ, n = 3 birds). Electrode arrays were positioned using a custom manual microdrive.

For experiments in which we monitored HVC activity during both directed and undirected song (n = 5 birds), extracellular recordings were obtained with tungsten electrode arrays assembled in-house (MicroProbes, 4–10 electrodes per array, impedances ranging from 0.5 MΩ to 6 MΩ). Electrode arrays were positioned using a custom manual microdrive, or remotely using a custom motorized microdrive (Faulhaber motor, n = 1 bird).

Neural data were amplified, band-pass filtered (1–7500 Hz), and digitized (30 kHz) with a commercially available head-mounted amplifier board (Intan Technologies, RHD2132 16-channel amplifier board, part #C3334) or a custom amplifier board designed in-house to reduce weight, made with the RHD2132 amplifier chip (Intan Technologies). Neural and audio data were registered with a USB interface board (RHD2000, Intan Technologies).

Female-directed song

For experiments in which we microdialyzed atropine into HVC while manipulating social context, we collected female-directed song and interleaved undirected song on three separate days, with sessions separated by one day (day 1: saline, day 3: atropine, day 5: saline; see Figure 6B). The final saline day was included to ensure that any attenuation of social modulation of song by atropine could not be attributed to habituation of courtship behavior over time or repeated exposure to females (Hampton et al., 2009; Toccalino et al., 2016). For each session, females were presented 30 or 40 min apart in a cage placed next to the male's cage (spacing between sessions was constant for a given bird), for a total of 2 min for each presentation. For a given male bird, the same sequence of females was presented in the same order across sessions, since some aspects of courtship behavior can depend on female identity (Heinig et al., 2014). The presence of a courtship dance was used to confirm that males sang female-directed song (puffing of feathers, orientation toward female, and hopping from side-to-side). For experiments in which we recorded neural activity in HVC while manipulating social context, females were presented 15–20 min apart in a cage placed next to the male's cage, or were introduced directly into the male's cage.

Analysis of song features

Definition of analysis windows

To quantify the behavioral effects of carbachol, we analyzed songs recorded in a two-hour window beginning one hour after the estimated time of drug delivery into the brain (drug analysis window). This same choice of analysis window was used for all animals and experiments, and for all reported behavioral features. We selected this time window primarily to ensure that carbachol would have time to diffuse throughout HVC. However, this time window also avoids transient behavioral effects that were observed following carbachol infusion. Such transient effects typically included a gradual onset of reported behavioral changes, consistent with prior studies indicating that it can take tens of minutes for dialyzed drugs to diffuse throughout targeted structures. We also occasionally observed more idiosyncratic patterns of behavioral change following onset of drug infusion (see Figure 3B, for example), but because these were inconsistent across experiments and transient, we did not analyze them further, instead focusing on the relatively robust, consistent, and stable effects that persisted throughout our analysis window. Baseline measurements were obtained from songs recorded in a one to two-hour window immediately prior to the exchange of dialysis solutions (baseline analysis window).

For experiments in which we probed the requirement of LMAN for the behavioral effects of carbachol, we estimated the onset of drug effects for the carbachol and muscimol conditions by visual inspection of the raw pitch time course from select syllables. We then defined the drug analysis window as a one-hour window beginning at the maximum of the estimated carbachol and muscimol onsets (held constant for a given animal across conditions). This procedure ensured that both drugs would be active during the combined carbachol + muscimol condition. The baseline analysis window was defined as described above. For social context experiments, we analyzed all recorded directed songs and a random subset of interleaved undirected songs.

Outlier and experiment inclusion criteria

Unless otherwise noted, we excluded pitch, amplitude, and tempo measurements that exceeded four times the median absolute deviation from the median, repeating this process three times. This procedure functions as a simple heuristic for culling erroneous measurements resulting from incorrect segmentation of the amplitude envelope. For a given experiment, syllables, syllable sequences, or branch points with fewer than 15 recorded renditions in any condition (e.g. drug or baseline periods, undirected or directed songs) were not included in summary analyses. This criterion was applied after outlier removal.
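The outlier rule above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `remove_outliers` is hypothetical, the median absolute deviation is assumed to be unscaled, and the early exit when the deviation is zero is an added safeguard that the text does not describe.

```python
from statistics import median

def remove_outliers(values, n_mads=4.0, n_passes=3):
    """Drop values more than n_mads * MAD from the median, repeated n_passes times."""
    kept = list(values)
    for _ in range(n_passes):
        if not kept:
            break
        med = median(kept)
        # Unscaled median absolute deviation (assumption: no 1.4826 scaling).
        mad = median([abs(v - med) for v in kept])
        if mad == 0:
            # Assumed safeguard: a zero MAD would otherwise reject every
            # value that differs from the median.
            break
        kept = [v for v in kept if abs(v - med) <= n_mads * mad]
    return kept

pitches_hz = [100, 101, 99, 102, 98, 100, 500]
cleaned = remove_outliers(pitches_hz)
```

In this toy example the 500 Hz measurement, which would correspond to a mis-segmented rendition, is removed on the first pass and the remaining values are unchanged by later passes.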

Pitch and pitch variability

Analysis of fundamental frequency (pitch) was carried out on a random subset of ‘harmonic stack’ syllables (i.e. syllables with clear harmonic structure that was relatively constant over the duration of the syllable). We focused on such harmonic stacks, as in many previous studies (Kao and Brainard, 2006; Kao et al., 2005; Kojima et al., 2018; Stepanek and Doupe, 2010), because it is difficult to consistently quantify the fundamental frequency of syllables that have entropic spectral structure (‘noisy syllables’) or frequency modulated components (‘sweeps’). For sweeps in particular, the value of fundamental frequency changes rapidly over the course of the syllable so that the value measured for a particular rendition is sensitive to where within the syllable the measurement is made. Changes in song tempo (and syllable duration) exacerbate this problem, and can confound pitch measurements when comparing conditions across which song tempo varies. For quantitative pitch measurements, we therefore used harmonic stack syllables, for which we confirmed that estimates of pitch are robust to variation in the location within the syllable at which measurements are made.

To quantify the pitch of a given syllable rendition, raw audio data were bandpass filtered between 500 and 10,000 Hz, and a spectrogram was computed using a Gaussian-windowed (SD = 1–3 ms) short-time Fourier transform (window size = 1024 samples; overlap = 1020 samples). A pitch contour was calculated from the spectrogram by finding the maximum power in a small frequency range around the first harmonic in each time bin, followed by parabolic interpolation of the resulting time series. The pitch of the syllable was then determined by averaging the pitch over a fixed portion of the syllable (relative to syllable onset) that had a constant frequency component. The coefficient of variation of pitch (pitch c.v.) was computed as the across-rendition standard deviation divided by the mean.
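The last steps of this procedure can be illustrated with a short sketch. It assumes the spectrogram has already been computed, omits the parabolic interpolation step, and uses the sample standard deviation for pitch c.v. (the text does not say whether the sample or population form was used). All function names are hypothetical.

```python
from statistics import mean, stdev

def peak_frequency(freqs_hz, power, f_lo, f_hi):
    """Frequency of maximum power within [f_lo, f_hi] for one time bin."""
    band = [(p, f) for f, p in zip(freqs_hz, power) if f_lo <= f <= f_hi]
    return max(band)[1]

def syllable_pitch(contour_hz, start_bin, stop_bin):
    """Average the pitch contour over a fixed portion of the syllable."""
    return mean(contour_hz[start_bin:stop_bin])

def pitch_cv(pitches_hz):
    """Across-rendition standard deviation divided by the mean."""
    return stdev(pitches_hz) / mean(pitches_hz)

# One time bin of a toy spectrogram: search around a ~1 kHz first harmonic.
f0 = peak_frequency([900, 1000, 1100, 2000], [1, 5, 2, 9], 900, 1100)
# Three renditions of the same syllable.
cv = pitch_cv([1000.0, 1010.0, 990.0])
```

Restricting the search band in `peak_frequency` keeps the stronger second harmonic at 2000 Hz from being mistaken for the fundamental.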

For social context experiments, the accuracy of each pitch calculation was confirmed by visual inspection of the pitch contour overlaid on the syllable spectrogram, and inaccurate pitch calculations (due to incorrect segmentation, for example) were excluded from summary analyses. Exclusions were performed blind to social context condition (female-directed vs. undirected). No additional outlier removal was performed.

Song tempo

We calculated the duration of one or two stereotyped syllable sequences from each bird's repertoire (i.e. sequences without sequence variability; range: 2–4 syllables). Sequence duration was determined from the onset of the first syllable in the sequence to the onset of the last syllable in the sequence. Syllable onsets were determined by an amplitude threshold and were used for tempo measurements because they were more sharply defined than syllable offsets. We also measured the durations of syllables and gaps (using an amplitude threshold). Only gaps that occurred in stereotyped syllable sequences were included in summary analyses.
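As a rough illustration of these tempo measures, the sketch below derives sequence, syllable, and gap durations from threshold-crossing times. It assumes onsets and offsets have already been extracted by an amplitude threshold and are given in milliseconds; the function name is hypothetical.

```python
def segment_durations(onsets_ms, offsets_ms):
    """Return (syllable durations, gap durations, sequence duration).

    Sequence duration runs from the onset of the first syllable to the
    onset of the last syllable, as described in the text.
    """
    syllables = [off - on for on, off in zip(onsets_ms, offsets_ms)]
    gaps = [nxt - off for off, nxt in zip(offsets_ms, onsets_ms[1:])]
    sequence = onsets_ms[-1] - onsets_ms[0]
    return syllables, gaps, sequence

# A stereotyped three-syllable sequence.
syllables, gaps, sequence = segment_durations([0, 100, 220], [60, 170, 300])
```

Because the sequence measure uses onsets only, it does not depend on the less sharply defined offset of the final syllable.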

Amplitude

Amplitude for a given syllable was calculated by averaging the smoothed amplitude envelope over the middle 80% of the syllable. Amplitude envelopes were calculated by bandpass filtering the raw audio signal between 500 and 10,000 Hz (80th order linear-phase FIR filter), computing the root-mean-square, and smoothing with a sliding 2.5 ms rectangular window. In a small number of experiments, we observed large (>25%) and sudden changes in amplitude that were likely caused by the bird changing its orientation with respect to the recording microphone. We excluded these experiments if the mean amplitude for each syllable changed by more than 25% (in the same direction) in the drug period relative to the baseline period (3/75 experiments excluded, combined across conditions).
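One possible reading of the amplitude calculation is sketched below. It assumes the audio has already been bandpass filtered, interprets "computing the root-mean-square and smoothing" as a root-mean-square taken in a centered sliding rectangular window, and leaves the window length in samples as a parameter because it depends on the audio sampling rate. The function names and the handling of window edges are assumptions.

```python
import math

def sliding_rms(samples, window):
    """Root-mean-square in a centered sliding rectangular window (in samples)."""
    half = window // 2
    envelope = []
    for i in range(len(samples)):
        # Edges use a truncated window (assumption).
        segment = samples[max(0, i - half): i + half + 1]
        envelope.append(math.sqrt(sum(s * s for s in segment) / len(segment)))
    return envelope

def middle_80_mean(envelope):
    """Mean of the envelope over the middle 80% of the syllable."""
    n = len(envelope)
    lo, hi = int(0.1 * n), int(round(0.9 * n))
    middle = envelope[lo:hi]
    return sum(middle) / len(middle)

envelope = sliding_rms([1.0, -1.0, 1.0, -1.0], window=3)
amplitude = middle_80_mean(list(range(10)))
```

Averaging over the middle 80% reduces the influence of small errors in the detected syllable onset and offset on the amplitude estimate.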

For analyses in which we measured changes in amplitude for different types of syllables (Figure 1—figure supplement 1), we defined the following four types of syllables: ‘harmonic stacks’ and ‘sweeps’ were defined, respectively, by constant or frequency-modulated (FM) harmonics for the majority of the syllable’s duration; ‘complex’ syllables contained both an FM sweep and either a constant harmonic component or a high-entropy component; and ‘noisy’ syllables did not fit into these other three categories and were typically characterized by high spectral entropy. To increase the sample size of non-harmonic syllables, we labelled an additional n = 21 syllables in a subset of experiments; these syllables are not included in the main amplitude group summary analysis (Figure 1I).

Repeated syllables

Syllables that repeated a variable number of times were classified as repeated syllables, with the following exception: syllables that repeated only once or twice were considered as branch points (see below). We also did not consider high entropy syllables that sometimes separate motifs in Bengalese finch song as repeated syllables, since these are difficult to distinguish from introductory notes. Syllables separated by a gap of more than 200 ms were not considered a part of the same repeat sequence. The primary cohort of animals used in this study had few repeated syllables and so we included an additional cohort of animals that did not have paired saline control experiments (n = 4 additional animals). Prior to statistical comparisons and calculation of normalized repeat length, we pooled syllable renditions across experiments. Repeat length c.v. was calculated from the pooled repeat length distributions from the baseline period of all carbachol experiments. For the social context experiments, we assessed whether atropine attenuated the social modulation of two repeated syllables that exhibited a significant increase in repeat length during directed song relative to undirected song (directed vs. undirected, p<0.05 for both saline pre and saline post, one-tailed rank-sum test; n = 2 birds, two experiments per condition). Three other repeated syllables did not exhibit significant social modulation and therefore were not analyzed for attenuation by atropine (directed vs. undirected, p>0.05 for saline pre and saline post, one-tailed rank-sum test).
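The 200 ms gap rule for delimiting repeat sequences can be sketched as follows. This is an illustrative reading of the rule, assuming each song is a time-ordered list of (label, onset, offset) tuples in milliseconds; the function name and data layout are hypothetical.

```python
def repeat_lengths(syllables, target, max_gap_ms=200.0):
    """Lengths of consecutive runs of `target`, split at gaps > max_gap_ms."""
    lengths = []
    run = 0
    prev_offset = None
    for label, onset, offset in syllables:
        if label == target:
            # run > 0 means the previous syllable was also the target.
            if run > 0 and onset - prev_offset <= max_gap_ms:
                run += 1
            else:
                if run:
                    lengths.append(run)
                run = 1
        else:
            if run:
                lengths.append(run)
            run = 0
        prev_offset = offset
    if run:
        lengths.append(run)
    return lengths

song = [("a", 0, 50), ("b", 100, 150), ("b", 200, 250), ("b", 300, 350),
        ("c", 400, 450), ("b", 500, 550), ("b", 900, 950)]
lengths = repeat_lengths(song, "b")
```

In the example, the last two renditions of "b" are separated by 350 ms and are therefore counted as two separate runs of length one.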

Branch points

A syllable that transitions probabilistically to two or more syllables is a branch point (specifically, a divergent branch point). Similar to previous studies (Zhang et al., 2017), we treated sequences of repeated syllables as a single song element. Syllables separated by a gap of more than 200 ms were not included in the calculation of transition probabilities.

To determine if transition probabilities at a given branch point were significantly different in the baseline and drug windows, we employed a generalized likelihood ratio test for homogeneity of transition probabilities. Specifically, we tested the null hypothesis $H_0: p_i = q_i$ for all $n$ possible transitions, where $p_i$ denotes the probability of transition $i$ in the baseline period, and $q_i$ denotes the probability of transition $i$ in the drug period. The test statistic is the likelihood ratio $L(M_{sub})/L(M_{full})$, where $M_{full}$ denotes two independent and unconstrained multinomial distributions with parameters estimated separately for the baseline and drug periods, and $M_{sub}$ denotes a single multinomial distribution with parameters estimated from the combined baseline and drug periods. Intuitively, this ratio captures how well a single multinomial model describes the data relative to two separate multinomial models split by baseline and drug periods, thereby adjudicating the hypothesis that transition probabilities have changed.
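A minimal sketch of this likelihood ratio, computed from transition counts, is given below. The function names are hypothetical. The sketch returns the statistic in its usual $-2\log$ form together with the degrees of freedom; comparing it to a chi-square distribution with $n-1$ degrees of freedom is the standard asymptotic approach and is an assumption here, since the text does not state how p-values were obtained.

```python
import math

def max_log_likelihood(counts):
    """Multinomial log-likelihood at its maximum-likelihood estimate.

    The multinomial coefficient is omitted because it cancels in the ratio.
    """
    total = sum(counts)
    return sum(c * math.log(c / total) for c in counts if c > 0)

def homogeneity_lrt(baseline_counts, drug_counts):
    """Return (-2 * log likelihood ratio, degrees of freedom)."""
    pooled = [b + d for b, d in zip(baseline_counts, drug_counts)]
    # M_full: separate transition probabilities for each period.
    ll_full = max_log_likelihood(baseline_counts) + max_log_likelihood(drug_counts)
    # M_sub: one set of transition probabilities shared by both periods.
    ll_sub = max_log_likelihood(pooled)
    statistic = -2.0 * (ll_sub - ll_full)
    return statistic, len(pooled) - 1

# Branch point with two possible transitions, counted in each period.
statistic, df = homogeneity_lrt([50, 50], [90, 10])
```

When the two periods have identical transition proportions the statistic is zero, and it grows as the proportions diverge.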

Systematic differences in sample sizes between carbachol and saline experiments could confound the interpretation that carbachol affects sequencing, as assessed by analyzing the proportion of branch points with significant changes in transition probabilities. However, the number of transitions from the combined baseline and drug periods did not differ between carbachol and saline experiments (p=0.89, two-tailed signed-rank test, n = 15 branch points, seven birds).

For a given branch point, the magnitude of change in transition probability was calculated as the summed change in transition probability for the first n-1 of n possible transition types. Prior to calculating this statistic, we pooled data across experiments. Transition entropy at variable transitions in song was calculated as $-\sum_{i=1}^{n} p_i \log_2(p_i)$, where the sum ranges over the n possible transitions. For the social context experiments, we did not observe a significant decrease in transition entropy for directed song relative to undirected song, rendering a comparison to atropine irrelevant (mean ± s.e.m. decrease in transition entropy for Saline pre: 3.5 ± 5.6%, Saline post: 0.75 ± 17%; undirected vs. directed for Saline pre: p=0.41, Saline post: p=0.15, one-tailed signed-rank test).
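The transition entropy defined above can be computed from transition counts as in the following sketch. The function name is hypothetical, and transitions with zero count are taken to contribute zero entropy, which is the usual convention.

```python
import math

def transition_entropy(counts):
    """Entropy in bits of the transition distribution at a branch point."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return sum(-p * math.log2(p) for p in probs)

# Two equally likely transitions give the maximum of 1 bit.
h_variable = transition_entropy([50, 50])
# A fully stereotyped transition gives 0 bits.
h_fixed = transition_entropy([100, 0])
```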

Neural analyses

Spike sorting

Multi-unit activity was extracted using Wave_clus (Quiroga et al., 2004). Briefly, raw voltage traces from all recorded song files were concatenated and bandpass filtered between 300 and 4000 Hz. Negative-going events with amplitudes greater than 3.5 times and less than 50 times the estimated noise level were considered spikes, with a minimum refractory period between events of 0.2 ms. Wave_clus estimates noise as the median absolute deviation of the filtered voltage trace, divided by 0.6745 (Donoho and Johnstone, 1994; Quiroga et al., 2004). This noise estimator mitigates the upward bias that would be introduced by instead using the standard deviation of the signal, since the signal contains a small fraction of large events (corresponding to spikes).
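The thresholding logic described here can be sketched in a few lines. This is a simplified illustration and not the Wave_clus implementation: it assumes the trace is already bandpass filtered, uses the median absolute deviation about the median as stated in the text, and takes the trough within one refractory window as the event time. The function names and the trough-selection rule are assumptions.

```python
from statistics import median

def estimate_noise(filtered):
    """Median absolute deviation of the trace divided by 0.6745."""
    med = median(filtered)
    return median([abs(v - med) for v in filtered]) / 0.6745

def detect_spikes(filtered, fs_hz, lo=3.5, hi=50.0, refractory_ms=0.2):
    """Indices of negative-going events between lo and hi times the noise level."""
    noise = estimate_noise(filtered)
    refractory = max(1, round(refractory_ms * 1e-3 * fs_hz))
    spikes = []
    i = 0
    while i < len(filtered):
        if filtered[i] < -lo * noise:
            window = filtered[i:i + refractory]
            trough = min(window)
            # Events above the upper bound are treated as artifacts.
            if -trough < hi * noise:
                spikes.append(i + window.index(trough))
            i += refractory  # enforce the minimum interval between events
        else:
            i += 1
    return spikes

trace = [1.0, -1.0] * 50
trace[50] = -10.0  # one large negative-going event
spike_indices = detect_spikes(trace, fs_hz=30000)
```

Because the noise estimate is based on medians, the single large event in the toy trace barely changes the detection threshold, which is the property the text attributes to this estimator.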

Signal-to-noise ratio of multi-unit data

We estimated the signal-to-noise ratio (SNR) of our multi-unit recording sites as the mean peak height of all detected events, divided by the estimated noise level as defined above for spike sorting. The SNR was calculated separately for all recording sites and syllables. Since the estimated noise level for data in the vicinity of a given syllable could differ from that obtained from the entire recording (as was done for spike detection), a small fraction of our site/syllables had SNRs below the 3.5x spike detection threshold.

Analysis of firing rates and neural variability

Rendition-averaged firing rates before and after carbachol/saline, and for directed and undirected song, were calculated by aligning spike trains to syllable onsets and convolving with a 5 ms SD Gaussian kernel. Firing rate differences elicited by carbachol or directed song were calculated by averaging smoothed firing rates over three separate time windows: a 100 ms window centered on syllable onsets, a 30 ms window just prior to syllable onsets, and a 30 ms window just after syllable onsets. The Fano factor was calculated as the across-rendition spike count variance divided by the mean spike count in these same time windows. For all analyses, multi-unit site/syllable pairs with fewer than 10 renditions or firing rates less than 50 Hz in either the baseline/undirected or drug/directed periods were excluded. The minimum firing rate criterion was applied to a 100 ms window centered at the onset of the syllable.
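The Fano factor calculation can be illustrated as follows. The sketch assumes spike times are expressed in milliseconds relative to syllable onset, with one list per rendition, and uses the sample variance (the text does not specify sample versus population variance). The function names are hypothetical.

```python
from statistics import mean, variance

def spike_count(spike_times_ms, start_ms, stop_ms):
    """Number of spikes falling in [start_ms, stop_ms)."""
    return sum(start_ms <= t < stop_ms for t in spike_times_ms)

def fano_factor(renditions, start_ms, stop_ms):
    """Across-rendition spike count variance divided by the mean count."""
    counts = [spike_count(r, start_ms, stop_ms) for r in renditions]
    return variance(counts) / mean(counts)

# Three renditions aligned to syllable onset (t = 0 ms).
renditions = [
    [-10, 10, 60],
    [-40, -10, 10, 40],
    [-45, -30, -10, 10, 30, 45, 70],
]
# 100 ms window centered on syllable onset.
ff = fano_factor(renditions, -50, 50)
```

Spikes at 60 and 70 ms fall outside the window and are ignored, so the counts are 2, 4, and 6.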

Localization of microdialysis probes and recording electrodes

We collected post-mortem histology at the conclusion of experiments to confirm the placement of microdialysis probes and recording electrodes. HVC was visualized by fluorescent staining for Parvalbumin (Swant Cat# 235, RRID:AB_10000343; monoclonal ab raised in mice, 1:10000); LMAN was visualized by fluorescent staining for calcitonin gene-related peptide (Sigma-Aldrich Cat# C8198, RRID:AB_259091; polyclonal ab raised in rabbits, 1:5000 to 1:10000). The location of microdialysis probes was indicated by tissue damage within or adjacent to HVC or LMAN. Placement of recording electrodes was confirmed by tracks left by the electrodes and/or small electrolytic marker lesions. Dialysis probe placement could not be confirmed in 4/20 birds that died without being perfused for histology. However, we observed behavioral effects in these animals in response to drug microdialysis (i.e. increased pitch/amplitude/tempo and reduced pitch c.v. for microdialysis of carbachol into HVC, and reduced pitch c.v. for microdialysis of muscimol into LMAN), providing an independent confirmation of correct probe placement.

Decision letter

  1. Megan R Carey
    Reviewing Editor; Champalimaud Foundation, Portugal
  2. Ronald L Calabrese
    Senior Editor; Emory University, United States
  3. Stephen D Shea
    Reviewer; Cold Spring Harbor Laboratory, United States
  4. Joshua Tate Dudman
    Reviewer; Janelia Research Campus, Howard Hughes Medical Institute, United States
  5. Sarah C Woolley
    Reviewer

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

Although the cholinergic system is well known to modulate sensory processing, its effects on motor systems are less well studied. The reviewers were enthusiastic about this paper's demonstration of cholinergic modulation of motor behavior within the birdsong premotor nucleus HVC. Several aspects of the cholinergic modulation of motor outputs described here have typically been investigated within the context of the dopaminergic system. In particular, the findings suggest that cholinergic basal forebrain inputs to HVC may contribute to the well-described modulation of song in the presence of a female bird (directed song). Together, these findings raise intriguing possibilities about potential contributions of cholinergic signaling to modulation of motor outputs, across systems.

Decision letter after peer review:

Thank you for submitting your article "Acetylcholine acts on songbird premotor circuitry to invigorate vocal output" for consideration by eLife. Your article has been reviewed by Ronald Calabrese as the Senior Editor, a Reviewing Editor, and three reviewers. The following individuals involved in review of your submission have agreed to reveal their identity: Stephen D Shea (Reviewer #1); Joshua Tate Dudman (Reviewer #2); Sarah C Woolley (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

The cholinergic system is well known to modulate sensory processing. The paper by Jaffe and colleagues describes cholinergic modulation of motor behavior within the birdsong premotor nucleus HVC. Their findings further suggest that cholinergic basal forebrain inputs to HVC may contribute to the well-described modulation of song in the presence of a female bird (directed song). Several aspects of the cholinergic modulation of motor outputs that they describe have typically been investigated within the context of the dopaminergic system. Together, these findings raise intriguing possibilities about potential contributions of cholinergic signaling to modulation of motor outputs, across systems.

All reviewers were in agreement that this work is well-executed and interesting. There was also broad agreement on the suggestions for improvement, which are summarized here.

Essential revisions:

1) Several reviewers commented on the limits of multiunit recordings in general and more specifically in HVC because of the disparity in firing rates among cell types. Please provide better characterization of the nature of the recordings and clearer expression of some of the caveats associated with this approach. Discussing the results in light of what we know about the cell types and connectivity in the local network (interneurons vs. RA-projecting neurons, etc.) would also be helpful.

2) All of the reviewers would like to see a more thorough treatment of the effects on syllable sequencing, for both the carbachol and atropine experiments, particularly with regard to branch points. Also of interest is HVC activity at these branch points, if available.

3) Please provide a finer grain analysis of the durations of individual syllables and gaps in directed vs. undirected song as opposed to the overall length of predominant motifs.

We expect that these points, as well as the reviewers' individual concerns (appended below), can be addressed without new experiments, and with a moderate amount of further analysis.

Reviewer #1:

In this paper, "Acetylcholine acts on songbird premotor circuitry to invigorate vocal output," Jaffe et al. report several pieces of evidence that suggest that a previously described input to the song premotor nucleus HVC from the cholinergic basal forebrain regulates song motor behavior. The experiments are skillfully and rigorously conducted, and there are several novel aspects to the conclusions. First, as the authors correctly note, cholinergic release in the cortex or other forebrain targets has long been known to modulate sensory activity. However, the contribution of these circuits to motor performance is more poorly understood. Second, it has also long been known that there are significant differences in the structure of songs performed in the presence of a female ('directed song') as compared to songs performed while the male is alone ('undirected song'). Over the past 15 years or so, a number of features of the circuitry underlying this phenomenon have been discovered. For the most part, these studies have focused on the dopamine system and the so-called 'anterior forebrain pathway,' a basal ganglia-thalamocortical loop that has been implicated in changes to song in both juveniles and adults. The novelty of this paper primarily lies in its identification of cholinergic pathways to HVC as a separate neurochemical arousal pathway that makes an overlapping contribution to social modulation of song via a distinct song system target. With this study, Jaffe et al. expand our understanding of the mechanisms of contextual regulation of vocal behavior, and they reveal an unappreciated function of the cholinergic system, which is understudied in songbirds. I have a number of more specific comments listed below.

1) Acetylcholine is of course involved in arousal, and the authors have good evidence that this pathway is activated during the arousal that accompanies directed song. Nevertheless, the cholinergic system is also a major participant in circadian patterns of arousal and alertness. In light of the sizable body of data that show circadian variability in song behavior and neural activity in the song system, some speculation in the discussion about how their data might interface with that line of work seems warranted.

2) I recommend the authors add citations of these relevant papers to their Introduction: Neuron 46:173; Cerebral Cortex 20:2739.

3) I understand why the authors limited their pitch analysis to syllables with defined harmonic structure that were also constant in pitch, but the authors might consider analyzing a subset of different syllables that don't share these attributes to see if an increase in pitch is global to song or if noisier or modulated syllables are affected differently. For example, syllables that contain substantial spectrotemporal modulation may show an increase in the span of that spectrotemporal modulation rather than a uniform upward shift in pitch.

4) I'm curious why the authors did not separately analyze the duration of each syllable and silence between them, instead choosing to analyze the overall duration of common groups of syllables. As far as I know, this is not the approach taken by past studies of contextual modulation of song behavior.

5) For me, one of the most interesting results was the experiment depicted in Supplementary Figure 1. The authors' finding that a manipulation of HVC alone affected song sequencing is quite noteworthy. As far as I understand, the neural mechanisms of song sequencing, particularly in species with significant sequence variability like Bengalese finches, remain controversial. This result seems to support the idea that song sequencing is entirely contained within the HVC circuit. Therefore, I'm not sure these data deserve to be relegated to a supplementary figure. That said, if the authors were to move it to a main figure, they should probably do a more comprehensive analysis of sequence changes. I think the focus here is on the most common sequences, which, as in point #1 above, may be affected differently than rarer sequences.

6) I noticed something in Figure 2B that may be interesting if it is consistently observed. In these examples, the carbachol experiment in which LMAN is intact shows a larger pitch shift in earlier songs that relaxes partially in later songs. This is not seen in the example they show in which LMAN has been inactivated. Was this something they observed consistently? Because it might suggest that after many songs performed in the presence of carbachol, the bird begins to correct or compensate for the pitch shift, and that this may be LMAN-dependent.

7) I'm not sure that the authors' assertion that HVC firing rates are specifically elevated just before the syllable onset is very well justified by the data. Although there is a difference in Figure 3E between firing rates just before and just after the start of the syllable, there is still a large increase in both in the presence of carbachol. Also, why did they not perform this temporally limited analysis on the data in Figure 4?

8) There are at least three major classes of neurons in HVC based on their projection targets: RA-projecting, X-projecting, and interneurons. The projection neurons fire much less frequently than the interneurons, so their multiunit recording data is almost certainly overwhelmingly dominated by the activity of interneurons. This is an important point that, in my opinion, they have to address because (a) the activity the authors see likely doesn't reflect much of the activity of the pathway they think is mediating the effects of carbachol, and (b) the interneurons are GABAergic and probably target RA-projecting neurons with synaptic inhibition. This leads to potentially a different interpretation of the changes in HVC firing rate that they see in carbachol infusion and directed song.

Reviewer #2:

Here the authors describe a set of experiments to assess the direct contributions of cholinergic modulation of activity in a premotor nucleus on the execution of bird song – specifically the vigor with which the song is performed. This touches upon longstanding ideas that cholinergic modulation of cortical circuits emanating from the basal forebrain play a critical role in modulating activity. This modulation has often been associated with generalized increases in processing such as attention, arousal, sensory gain etc. It is relatively less well studied what contribution such modulation may play on the details with which actions are executed. The latter has been more closely associated with basal ganglia function in the literature. Moreover, as far as I could tell this question is largely unexplored specifically in the bird. I found the experiments well done and compelling. The authors provide clear evidence that muscarinic modulation of HVC activity can alter the vigor of song, that this parallels and potentially mediates some portion of the social modulation of song performance, and occurs as an independent, additive modulation with modulation by basal ganglia circuits. I think the paper merits publication. Below I detail some points that I think could be refined in a revised version of the manuscript and one experimental/analysis question that I think would benefit the paper if addressed.

A comment on analysis:

"Thus, the increased pitch stereotypy caused by carbachol cannot be explained by reduced neural variability in HVC."

The authors do not really return to this point in the Discussion section, but I was interested to understand why/how this dissociation might be relevant to theories about syllable production and variability in pitch that is exploited in many reinforcement paradigms.

The transition probability effect at branch syllables was not, as far as I could find, addressed in light of recordings from HVC. It seemed however that since HVC activity sequences are thought to determine transition probabilities this would be quite interesting to explore. Was it not possible to observe HVC activity specifically around branch syllables for sampling reasons? Or were there other reasons this was not explored in more detail? Does social modulation also modulate transition probabilities?

A general point:

At least in mammals, ACh also acts within BG via intrinsic cholinergic neurons in striatum and globus pallidus as well as projections from midbrain (PPN for example) onto midbrain dopamine neurons. In the design of the current experiments the coordination of ACh signaling mediated by these pathways might be lost. It would be useful to discuss the point that while the ACh-mediated vigor mechanisms can be dissociated from putative dopamine-dependent mechanisms using local infusion of carbachol, the two mechanisms may in fact be coordinated in other conditions. Perhaps consistent with this, in Figure 5 it would appear that HVC-infused atropine mediates only part of the social modulation of song, consistent with the possibility that mAChR function in other (possibly BG-related) brain areas could contribute to further effects. This might make additional infusions of atropine in the context of directed song interesting, to explore whether cholinergic modulation more broadly than in HVC is necessary to account for the full extent of social modulation.

Comments on text/interpretation:

"the extent to which acetylcholine contributes to enhanced motor vigor observed in aroused behavioral states remains unknown."

Maybe modify this point a bit. There is certainly a strong association between cholinergic activity and aroused behavioral states observed in mammals (Jones, 2004), and there have long been data on how mAChR antagonists affect aroused behavioral responses (such as conditioned responses, e.g. Longo, 1966).

"Moreover, for each feature, the changes elicited by combined carbachol + LMAN inactivation were not significantly different from the sum of the individual effects of carbachol and LMAN inactivation (pitch: p = 0.77, pitch c.v.: p = 0.85, tempo: p = 0.22, amplitude: p = 0.71, signed-rank test). These results indicate that increased cholinergic tone in HVC can modulate song via primary motor circuitry independently of input from the songbird basal ganglia."

I thought this result was a particularly key demonstration that these two pathways appear to linearly sum. I would note that a linear sum of BG-independent and BG-dependent mechanisms has been proposed previously (Yttri and Dudman, 2018) – although this provides the clearest direct evidence for ~linear combination of effects to date.

“Prior work has identified basal ganglia circuitry as an important locus for the control of motor vigor (Panigrahi et al., 2015; Schmidt et al., 2008; Yttri and Dudman, 2016). In principle, cortical and basal ganglia circuitry could jointly control movement vigor, with coordination between these pathways mediated by bidirectional feedback between them (Bosch-Bouju et al., 2013). However, our findings indicate that in some situations, modulation of motor vigor can occur independently of basal ganglia circuitry.”

The Discussion section as written implies that previous treatments have suggested that the basal ganglia (BG) might be an exclusive determinant of motor vigor. However, I would just note that an argument for BG-independent pathways for the control of movement vigor has been explicitly presented previously – as noted in Dudman and Krakauer (2016), cited in this manuscript: "It should be stressed that just because the basal ganglia can influence vigor this does not imply the converse: that vigor parameters are always under the obligate control of the basal ganglia." This is also explicit in normal treatments of this model of vigor, in the equations in Yttri and Dudman (2018), in which the BG-independent component of vigor sums with (is independent of) the BG contribution.

Nonetheless, I think the authors make a very important point here: an explicit modulatory influence that functions in cortex independent of BG to control vigor is a very valuable addition/demonstration. The authors might be interested to consider other phenomena that are similarly hard to reconcile with an exclusive reinforcement mechanism, like verbally-instructed changes in movement vigor in normal subjects ("Move faster!"), or the learned changes in vigor that persist in dopamine-depleted animals (Panigrahi et al., 2015) and patients (Mazzoni et al., 2007; Baraduc et al., 2013), or, as the authors do discuss, paradoxical kinesia. (These were the observations that led to the proposal of BG-independent circuits for motor vigor in previous reviews.) Moreover, similar to the bird circuitry highlighted in the discussion of this manuscript, in the mammal BG output also converges on subcortical targets of descending motor cortical projections, making these parallels perhaps even closer.

Reviewer #3:

In this manuscript, Jaffe and Brainard investigate the role of acetylcholine in motor invigoration. In general, I found this to be a thorough manuscript addressing an interesting question of how acetylcholine modulates motor output. That said, my enthusiasm is tempered by a handful of issues regarding the approach and aspects of the data that were not included or not sufficiently described.

1) Historically, HVC has been thought to be important for tempo and sequencing while the AFP contributes to syllable structure. This makes Bengalese finches an interesting model to study HVC because they have greater sequence variability than zebra finches (a focus that Brainard has taken advantage of in the past). In particular, his lab has found that during directed singing, sequence entropy decreases and syllable repeats increase, and this has not been found to be affected by lesions of LMAN, thus hinting that HVC may be significant in this modulation. Here, they report that ACh manipulation in HVC affects sequence transitions (though there is little detail on this change, including whether there is a change in entropy) and also increases repeats (similar to the effect of directed singing). However, there is no further discussion of sequence or repeats through the rest of the paper. I would like to see the effects on sequence and repeats throughout. It would be especially interesting to know whether atropine affects the decrease in entropy and increase in repeats previously reported for directed singing. Presumably, this would not require additional experiments as these data would be available in the data already collected.

2) I'd like more raw data and interpretation of the multi-unit recordings. It's unclear what these sites look like (how multi-unit is multi-unit?). Moreover, I find the rasters to be a poor way to convey the effects. In the one in Figure 3, there is so much black that you notice the decrease in white-space more than the increase in black, which gives the impression of a decrease rather than an increase. In Figure 4 I can't see the difference in the rasters. For interpretation, is the idea that there are just more neurons firing during directed singing, or that there is an increase in the rate of individual neurons, or both? Work using IEG expression in Bengalese finches indicates that there are more EGR1-expressing cells during undirected singing than directed singing, which would appear to be at odds with the current multi-unit result, but this is not discussed.

3) There are a number of instances in which there are many syllables and/or recording sites in a single bird, but there is no indication that bird ID has been included as a random variable in the statistical models. It is actually quite important to include bird ID as a variable in the model to account for the fact that many measurements have been made in the same individual (or to indicate more explicitly if it has already been included).

https://doi.org/10.7554/eLife.53288.sa1

Author response

Essential revisions:

1) Several reviewers commented on the limits of multiunit recordings in general and more specifically in HVC because of the disparity in firing rates among cell types. Please provide better characterization of the nature of the recordings and clearer expression of some of the caveats associated with this approach. Discussing the results in light of what we know about the cell types and connectivity in the local network (interneurons vs. RA-projecting neurons, etc.) would also be helpful.

The revised manuscript now includes substantial additional characterization of our multi-unit recordings, including examples of raw data and a quantification of the signal-to-noise ratio and firing rates for each recording site (Figure 4—figure supplement 1). Additionally, our revision now contains a more extended and nuanced discussion of how the different cell types within HVC may be affected by acetylcholine and contribute to the observed behavioral effects, incorporating previous literature pertaining to the connectivity within HVC and cellular effects of acetylcholine as assessed in acute slice electrophysiology experiments (Shea et al., 2010; see subsection “Neural mechanisms underlying the control of motor vigor in songbirds”). More detail on the particulars of these revisions is provided in our responses to Reviewer #1, point #8 and Reviewer #3, point #2.

2) All of the reviewers would like to see a more thorough treatment of the effects on syllable sequencing, for both the carbachol and atropine experiments, particularly with regard to branch points. Also, of interest is HVC activity at these branch points, if available.

As requested, we now provide a more thorough characterization of the syllable sequencing effects for both the carbachol and atropine experiments. Our new Figure 2 and associated text present data showing that carbachol has a significant effect on sequence probabilities for both branch points (Figure 2A-C) and repeated syllables (Figure 2D-F). Our new Figure 6—figure supplement 1 and associated text present new analysis addressing how atropine influences sequencing for the limited set of branch points and repeated syllables present in the relevant dataset (i.e. songs from the subset of birds singing both directed and undirected song with or without atropine dialyzed into HVC). For this dataset, the effects of directed song on sequencing were weak even under control saline conditions (as has sometimes been observed previously; see for example Hampton et al., 2009 Figure 5C; Toccalino et al., 2016 Figure 3). We did not observe a significant attenuation of the effects of social context on sequencing in these birds with atropine, but the conclusions that can be drawn from this are clearly limited given the weak modulation at baseline. We have presented the relevant data in Figure 6—figure supplement 1 and discussed these caveats in the Results section and in the Materials and methods section (see also our responses to Reviewer #1, point #5 and Reviewer #3, point #1).

We agree with Reviewer #2 that it would be interesting to examine how HVC activity around ‘branch points’ in song is altered by carbachol and directed song. However, we have not attempted to add such a description, as we think it is beyond the scope of what we could reasonably incorporate into the current manuscript. This is largely because there is currently little understanding of how activity in HVC relates to variable transitions even in the absence of any manipulations. Hence, extensive additional data, analysis, and description would be required to appropriately contextualize any findings regarding the effects of carbachol or social context (see also our response to Reviewer #2 for more commentary on this point).

3) Please provide a finer grain analysis of the durations of individual syllables and gaps in directed vs. undirected song as opposed to the overall length of predominant motifs.

The revised manuscript now includes an analysis of syllable and gap durations as requested. We found that both syllables and gaps were shortened by carbachol and directed song, but that the effects were larger for gaps than for syllables (Figure 1—figure supplement 1 and Figure 6—figure supplement 1; see also our response to Reviewer #1, point #4 for more detail).

Reviewer #1:

[…]

1) Acetylcholine is of course involved in arousal, and the authors have good evidence that this pathway is activated during the arousal that accompanies directed song. Nevertheless, the cholinergic system is also a major participant in circadian patterns of arousal and alertness. In light of the sizable body of data that show circadian variability in song behavior and neural activity in the song system, some speculation in the discussion about how their data might interface with that line of work seems warranted.

This is a great point, and we agree that some discussion of circadian changes in songbird behavior and neural activity is warranted. We now comment on these circadian changes in the Discussion section, in particular with respect to the ascending arousal system. More specifically we note that some of the previously described circadian variation in neural activity and song could reflect changing activity in the ascending arousal system, including cholinergic components that have been linked to changes in wakefulness and alertness (see subsection “Contributions of HVC and the cholinergic system to social modulation of song”).

2) I recommend the authors add citations of these relevant papers to their Introduction: Neuron 46:173; Cerebral Cortex 20:2739.

Thanks – we agree that these papers pertaining to the role of the cholinergic system in motor skill learning are relevant, and we now reference them in the second paragraph of the Introduction.

3) I understand why the authors limited their pitch analysis to syllables with defined harmonic structure that were also constant in pitch, but the authors might consider analyzing a subset of different syllables that don't share these attributes to see if an increase in pitch is global to song or if noisier or modulated syllables are affected differently. For example, syllables that contain substantial spectrotemporal modulation may show an increase in the span of that spectrotemporal modulation rather than a uniform upward shift in pitch.

We found that it was difficult to assess changes in pitch (fundamental frequency) for syllables with frequency-modulated harmonic components (“sweeps”), due to the increase in tempo produced by carbachol. Pitch for a given syllable was analyzed by averaging the pitch contour over a short time window in which the fundamental frequency is relatively constant; for harmonic stacks, with relatively constant fundamental frequency, pitch measurements are robust to variation in the timing of the measurement window. In contrast, for frequency-modulated sweeps, small variation in the timing of the measurement window can result in inaccuracies in estimation of pitch. For example, for a downward sweeping syllable, a slightly later measurement window within the syllable results in a lower estimate of pitch. This is especially problematic for sweep syllables in which there has been an increase in tempo, such that differences in pitch between two renditions will vary depending on whether a measurement window is aligned relative to the onset or offset of the syllable. Indeed, we found that subtle differences in measurement procedures for sweep syllables could result in variation in pitch measurements that was large compared to the small systematic shifts in pitch that were observed for harmonic stacks (~1.5%). Nonetheless, we agree that it is worthwhile to investigate whether the behavioral effects of carbachol are observed for other syllable types in the bird’s repertoire. To investigate this, we focused on amplitude changes caused by carbachol, as this is a syllable feature that was strongly modulated by carbachol (~8%), and because amplitude can readily be measured for all syllable types. We found that carbachol induced comparable increases in amplitude for syllables of all types (harmonic stacks, sweeps, complex syllables, and noisy syllables). We present this analysis in Figure 1—figure supplement 1C and the associated text (see subsections “Pitch and pitch variability” and “Amplitude”).
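The sensitivity of windowed pitch estimates to window timing can be illustrated with a toy calculation. The sketch below is not the analysis code used in the manuscript: the contours, the window width, and the helper `mean_pitch` are all hypothetical, chosen only to show why a flat harmonic stack is robust to a shifted window while a downward sweep is not.

```python
# Hypothetical illustration, not the manuscript's analysis code: a shifted
# measurement window biases the pitch estimate for a frequency-modulated
# sweep but leaves the estimate for a flat harmonic stack unchanged.

def mean_pitch(contour, start, width):
    """Average a pitch contour (one value in Hz per sample) over [start, start + width)."""
    window = contour[start:start + width]
    return sum(window) / len(window)

n_samples = 100
# Assumed toy contours (not measured data).
stack = [600.0] * n_samples                          # constant 600 Hz
sweep = [700.0 - 2.0 * i for i in range(n_samples)]  # falls by 2 Hz per sample

width = 10
for name, contour in (("stack", stack), ("sweep", sweep)):
    early = mean_pitch(contour, 40, width)
    late = mean_pitch(contour, 45, width)  # same window, 5 samples later
    change_pct = 100.0 * (late - early) / early
    print(f"{name}: early={early:.1f} Hz, late={late:.1f} Hz, change={change_pct:.2f}%")
```

In this toy case the five-sample shift alone lowers the sweep estimate by about 1.6%, which is comparable in size to the ~1.5% systematic shift reported for harmonic stacks, whereas the stack estimate does not change at all.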

4) I'm curious why the authors did not separately analyze the duration of each syllable and silence between them, instead choosing to analyze the overall duration of common groups of syllables. As far as I know, this is not the approach taken by past studies of contextual modulation of song behavior.

In our initial submission, we analyzed the duration of syllable sequences (from syllable onset to syllable onset) because the shape of the amplitude envelope at syllable onsets is less variable than for offsets, yielding more accurate estimates of tempo. Previous studies of social modulation of song have also assessed changes in song tempo by measuring sequence/motif durations, though some have measured sequences from syllable onset to syllable offset (Aronov and Fee, 2012; Cooper and Goller, 2006; Sakata et al., 2008). In our revised manuscript, we have added an analysis of syllable and gap durations for both the carbachol experiments (Figure 1—figure supplement 1) and the social context experiments (Figure 6—figure supplement 1). In both cases, syllables and gaps were each shortened, with a larger effect on gap durations, which as we now note in the main text parallels a similar differential effect on gaps versus syllables caused by cooling of HVC.
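The relationship between the onset-to-onset sequence measure and the separate syllable and gap durations can be sketched as follows. This is a minimal illustration under assumed inputs: the function name `syllable_and_gap_durations` and the example times are hypothetical, and the sketch assumes that syllable onsets and offsets have already been extracted from the amplitude envelope.

```python
# Hypothetical sketch: decompose a syllable sequence into syllable durations,
# inter-syllable gap durations, and the onset-to-onset sequence duration.

def syllable_and_gap_durations(onsets, offsets):
    """Return (syllable durations, gap durations, onset-to-onset sequence duration)."""
    if len(onsets) != len(offsets):
        raise ValueError("onsets and offsets must have the same length")
    # Each syllable runs from its own onset to its own offset.
    syllables = [off - on for on, off in zip(onsets, offsets)]
    # Each gap runs from one syllable's offset to the next syllable's onset.
    gaps = [next_on - off for off, next_on in zip(offsets, onsets[1:])]
    # Sequence duration from first onset to last onset (no offsets involved).
    sequence = onsets[-1] - onsets[0]
    return syllables, gaps, sequence

# Assumed example times in milliseconds (not measured data).
onsets = [0, 100, 220]
offsets = [60, 170, 300]
syllables, gaps, sequence = syllable_and_gap_durations(onsets, offsets)
print(syllables, gaps, sequence)
```

Because the onset-to-onset duration equals the sum of all gaps plus all syllables except the last, a shortening of that measure does not by itself show whether syllables or gaps changed, which is why the separate breakdown is informative.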

5) For me, one of the most interesting results was the experiment depicted Figure “. The authors' finding that a manipulation of HVC alone affected song sequencing is quite noteworthy. As far as I understand, the neural mechanisms of song sequencing, particularly in species with significant sequence variability like Bengalese finches, remain controversial. This result seems to support the idea that song sequencing is entirely contained within the HVC circuit. Therefore, I'm not sure these data deserve to be relegated to a supplementary figure. That said, if the authors were to move it to a main figure, they should probably do a more comprehensive analysis of sequence changes. I think here the focus on the most common sequences, which like in point #1 above, may be affected differently than rarer sequences.

We agree that the sequencing results are interesting and merit further elaboration. As suggested, we now report sequencing results in a main figure (Figure 2), and provide additional characterization of these effects. Our results show that carbachol significantly alters the probabilities of different transitions at branch points (Figure 2A-C) and significantly alters syllable repetitions (Figure 2D-F). We also now report in the Results section that these sequencing changes are associated with a trend towards reduction in “transition entropy” of the sort that has been observed for directed song. As we note in our response to Reviewer #3 (point #1), we also assessed potential contributions of acetylcholine to social modulation of song sequencing (Figure 6—figure supplement 1).
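For readers unfamiliar with the measure, one common way to quantify variability at a single branch point is the Shannon entropy of the distribution over following syllables. The sketch below assumes that definition; the exact formulation used in the manuscript may differ (for example, in how branch points are pooled), and the syllable labels and counts are hypothetical.

```python
import math
from collections import Counter

# Hypothetical sketch: Shannon entropy (in bits) of the syllables that follow
# a given branch-point syllable. Lower values mean more stereotyped sequencing.

def transition_entropy(next_syllables):
    """Entropy of the empirical distribution over following syllables."""
    counts = Counter(next_syllables)
    total = sum(counts.values())
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

# Assumed counts (not measured data): transitions observed after syllable 'a'.
saline = ["b"] * 50 + ["c"] * 50      # both transitions equally likely
carbachol = ["b"] * 75 + ["c"] * 25   # one transition favored

print(f"saline:    {transition_entropy(saline):.3f} bits")
print(f"carbachol: {transition_entropy(carbachol):.3f} bits")
```

With two equally likely transitions the entropy is 1 bit, and it falls as one transition comes to dominate, which is the sense in which a shift in transition probabilities can reduce transition entropy.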

Regarding the broader question of the role of HVC in syllable sequencing in the Bengalese finch, while our data indicate that carbachol dialyzed into HVC can significantly alter syllable sequencing (consistent with prior data from Zhang et al., 2017 on the effects of cooling Bengalese finch HVC), they do not rule out the possibility (likely in our view) that syllable sequencing is additionally influenced by activity elsewhere in the song system, including via recurrent projections from the brainstem back to HVC.

6) I noticed something in Figure 2B that may be interesting if it is consistently observed. In these examples, the carbachol experiment in which LMAN is intact shows a larger pitch shift in earlier songs that relaxes partially in later songs. This is not seen in the example they show in which LMAN has been inactivated. Was this something they observed consistently? Because it might suggest that after many songs performed in the presence of carbachol, the bird begins to correct or compensate for the pitch shift, and that this may be LMAN-dependent.

Thanks for pointing this out – this is a thoughtful observation that we agree merits further investigation, so we took a look. We did occasionally observe transient behavioral effects following carbachol infusion – lasting on the order of tens of minutes – prior to a relatively stable behavioral response that persisted for the remainder of the experiment (such as the example you refer to in Figure 3B, previously Figure 2B). However, these transient effects were not observed consistently, even within birds in which we conducted multiple experiments (see the example experiment in Figure 1C). Addressing your question more directly, we occasionally observed this pattern of strong early increase followed by partial relaxation in the carbachol + LMAN inactivation experiments. Thus, it seems that this transient response is not strictly LMAN dependent. We chose to analyze behavioral effects in a time window beginning one hour after drug onset, primarily to ensure that carbachol would have time to diffuse throughout HVC. However, this time window also corresponds to a point at which other idiosyncratic transient effects have settled. We now make note of these points in “Definition of analysis windows” under the subsection “Analysis of song features” in Materials and methods.

7) I'm not sure that the authors' assertion that HVC firing rates are specifically elevated just before the syllable onset is very well justified by the data. Although there is a difference in Figure 3E between firing rates just before and just after the start of the syllable, there is still a large increase in both in the presence of carbachol. Also, why did they not perform this temporally limited analysis on the data in Figure 4?

Thanks for catching our confusing wording. We did not mean to imply that the increase in firing rates was restricted to the period just prior to syllable onsets, and we have revised our wording in the main text to make this clear. We now specifically note that carbachol caused an increase in firing rates in windows both preceding and following syllable onset, with a modestly larger effect in the premotor window. We now also report the results of this temporally-limited analysis for the directed and undirected song HVC recordings (see Figure 5D and associated text of the Results section). Relative to undirected song, directed song was associated with a significant increase in firing rate both preceding and following syllable onsets. The magnitude of this increase did not differ significantly between the pre- and post-syllable onset windows.

8) There are at least three major classes of neurons in HVC based on their projection targets: RA-projecting, X-projecting, and interneurons. The projection neurons fire much less frequently than the interneurons, so their multiunit recording data is almost certainly overwhelmingly dominated by the activity of interneurons. This is an important point that in my opinion, they have to address, both because the activity the authors see likely doesn't reflect much of the activity of the pathway they think is mediating the effects of carbachol, and also because the interneurons are GABAergic and probably target RA-projecting neurons with synaptic inhibition. This leads to potentially a different interpretation of the changes in HVC firing rate that they see in carbachol infusion and directed song.

We agree that multi-unit activity in HVC is likely to be dominated by inhibitory interneurons, and have added further discussion regarding possible ways in which cholinergic modulation of HVC activity could contribute to the observed behavioral changes (see the subsection “Neural mechanisms underlying the control of motor vigor in songbirds”). In this section we note that the reasonable intuition that an increase in the firing rate of interneurons should result in a decrease in the firing rate of projection neurons is likely too simple. For example, we note that both projection neurons and interneurons increase their activity during song relative to non-singing quiescent periods (Kozhevnikov and Fee, 2007). Additionally, we now discuss the possible circuit level effects of acetylcholine on HVC during singing more explicitly in relation to the previously reported cellular effects of acetylcholine in acute HVC slice recordings (Shea et al., 2010).

In particular, we note that muscarinic agonists applied in vitro tend to hyperpolarize interneurons. If interneuron firing rates are nevertheless increased by cholinergic agonists applied in vivo, one possibility is that any hyperpolarization at a cellular level is overcome by increased excitatory input to interneurons. We hypothesize that such increased excitatory input could derive from HVC projection neurons, which are depolarized by muscarinic agonists in vitro (Shea et al., 2010), and which synapse on HVC interneurons. In the Discussion section noted above, we now provide a more extended commentary on the implications of our findings in light of these prior studies.

Reviewer #2:

[…]

A comment on analysis:

"Thus, the increased pitch stereotypy caused by carbachol cannot be explained by reduced neural variability in HVC."

The authors do not really return to this point in the Discussion section, but I was interested to understand why/how this dissociation might be relevant to theories about syllable production and variability in pitch that is exploited in many reinforcement paradigms.

In response to a comment from Reviewer #3, we have tempered our wording and conclusions on this point to reflect that while variability was not reduced in our multi-unit recordings, this does not rule out the possibility of a reduction in the variability of projection neurons. Nonetheless, we agree that the reduction in behavioral variability following a manipulation of HVC is noteworthy in its own right. Most prior work emphasizes the contribution of the AFP to acoustic variability (e.g. Hampton et al., 2009; Kao et al., 2005; Leblois et al., 2010; Stepanek and Doupe, 2010). Our findings suggest that a significant source of behavioral variability originates within HVC. Conceivably, this variability could be harnessed in the service of reinforcement learning in much the same way that is thought to occur for variability originating from the AFP (Charlesworth et al., 2012). Our revised manuscript now includes a brief discussion of this point, as well as a discussion of potential neural mechanisms underlying the reduction in behavioral variability (see the subsection “Contributions of HVC to acoustic variability”).

The transition probability effect at branch syllables was not, as far as I could find, addressed in light of recordings from HVC. It seemed however that since HVC activity sequences are thought to determine transition probabilities this would be quite interesting to explore. Was it not possible to observe HVC activity specifically around branch syllables for sampling reasons? Or were there other reasons this was not explored in more detail? Does social modulation also modulate transition probabilities?

We agree that HVC neural activity around branch points could be quite revealing with respect to the neural mechanisms underlying the effects of carbachol on song sequencing. In response to this suggestion, we considered the possibility of adding analysis of neural activity at branch points and how this activity is influenced by experimental conditions (i.e. carbachol or directed song). However, we concluded that the addition of neural data that would be required to describe and contextualize any findings is beyond the scope of what we could reasonably incorporate in an expanded manuscript. This is in part because addressing how neural activity at branch points is affected by various manipulations would require an initial description of normal or “baseline” activity at branch points, which itself is a complex enterprise that we think will ultimately merit its own story. For example, a baseline description of sequencing under saline conditions would need to address questions such as: When and how do activity patterns for different possible transitions diverge with respect to syllable boundaries? How heterogeneous are these dynamics across different recording sites and different branch points? And are these dynamics related in any systematic way to transition probabilities? In short, a characterization of how carbachol and directed song affect HVC activity around branch points would require extensive characterization of this activity in the absence of any manipulations.

Regarding whether social context modulates transition probabilities, previous studies have shown that social context can modulate transition probabilities for both “branch points” and repeated syllables in Bengalese finch song (see for example Sakata et al., 2008), though such effects have not always been observed, indicating that they are less robust than effects on syllable structure (e.g. Hampton et al., 2009; Toccalino et al., 2016). Our revised manuscript now provides a more thorough characterization of experimental influences on syllable sequencing for both carbachol and social context experiments (Figure 2 and Figure 6—figure supplement 1). See also our responses to Reviewer #1, point #5 and Reviewer #3, point #1 for more detail.

A general point:

At least in mammals, ACh also acts within BG via intrinsic cholinergic neurons in striatum and globus pallidus, as well as via projections from midbrain (PPN, for example) onto midbrain dopamine neurons. In the design of the current experiments, the coordination of ACh signaling mediated by these pathways might be lost. It would be useful to discuss the point that while the ACh-mediated vigor mechanisms can be dissociated from putative dopamine-dependent mechanisms using local infusion of carbachol, the two mechanisms may in fact be coordinated in other conditions. Perhaps consistent with this, in Figure 5 it would appear that HVC-infused atropine attenuates only part of the social modulation of song, consistent with the possibility that mAChR function in other (possibly BG-related) brain areas could contribute to further effects. This might make additional infusions of atropine in the context of directed song interesting, to explore whether cholinergic modulation more broadly than in HVC is necessary to account for the full extent of social modulation.

Thanks – we have revised the Discussion section to emphasize that these two pathways may indeed be coordinated in many situations, making note of some of the subcortical anatomy that could serve as an anatomical substrate (see the subsection “Distributed circuits for the control of motor vigor”). We agree that it would be interesting to explore whether cholinergic signaling in other brain regions contributes to social modulation of song and expect that future studies will investigate this topic.

Comments on text/interpretation:

"the extent to which acetylcholine contributes to enhanced motor vigor observed in aroused behavioral states remains unknown."

Maybe modify this point a bit. There is certainly a strong association between cholinergic activity and aroused behavioral states observed in mammals (Jones, 2004), and there have long been data on how mAChR antagonists affect aroused behavioral responses (such as conditioned responses, e.g. Longo, 1966).

Sorry for the confusion – we intended to refer in particular to cholinergic signaling in motor cortical regions. We have revised the sentence referenced here to make this more explicit: "However, while motor cortical regions receive dense cholinergic innervation from the nucleus basalis (NBM; Eckenstein et al., 1988; McKinney et al., 1983; Raghanti et al., 2008), the extent to which cholinergic signaling in cortex contributes to motor invigoration observed in aroused behavioral states remains unknown."

"Moreover, for each feature, the changes elicited by combined carbachol + LMAN inactivation were not significantly different from the sum of the individual effects of carbachol and LMAN inactivation (pitch: p = 0.77, pitch c.v.: p = 0.85, tempo: p = 0.22, amplitude: p = 0.71, signed-rank test). These results indicate that increased cholinergic tone in HVC can modulate song via primary motor circuitry independently of input from the songbird basal ganglia."

I thought this result was a particularly key demonstration that these two pathways appear to linearly sum. I would note that a linear sum of BG-independent and BG-dependent mechanisms has been proposed previously (Yttri and Dudman, 2018) – although this provides the clearest direct evidence for ~linear combination of effects to date.

Thanks for pointing this out – we now make reference to the model described in the Yttri and Dudman, 2018 paper in the Results section for these experiments, and also in the Discussion section as an example of prior work that has recognized contributions of cortical circuits to the control of vigor (see the subsection “Distributed circuits for the control of motor vigor”).
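The logic of the additivity comparison quoted above can be sketched in a few lines: for each measured unit, subtract the sum of the two individual effects from the combined effect, then ask whether those differences are centered on zero with a signed-rank test. The sketch below is an assumption-laden illustration, not the authors' code. It uses a small exact two-sided Wilcoxon signed-rank test written from scratch (practical only for small samples), and every number in it is hypothetical.

```python
from itertools import product

# Hypothetical sketch of the additivity check: compare the combined effect with
# the sum of the individual effects using an exact two-sided signed-rank test.

def signed_rank_p(differences):
    """Exact two-sided Wilcoxon signed-rank p-value (zero differences are dropped)."""
    d = [x for x in differences if x != 0]
    n = len(d)
    if n == 0:
        return 1.0
    # Rank the absolute differences, assigning average ranks to ties.
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    expected = sum(ranks) / 2
    observed = abs(w_plus - expected)
    # Under the null, each sign is equally likely: enumerate all 2**n assignments.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - expected) >= observed - 1e-12:
            extreme += 1
    return extreme / 2 ** n

# Assumed per-syllable effects in percent change (not measured data).
carbachol = [1.5, 2.0, 1.0, 2.5, 1.75, 1.25]
lman_inactivation = [-0.5, -0.25, -0.75, -0.5, -0.25, -1.0]
combined = [1.25, 1.5, 0.5, 1.5, 1.75, 0.75]

deviation_from_sum = [c - (a + b) for a, b, c in zip(carbachol, lman_inactivation, combined)]
p_value = signed_rank_p(deviation_from_sum)
print(f"p = {p_value:.3f}")
```

A large p-value here means only that the data do not reject additivity; it does not demonstrate that the effects sum exactly, which is why the result is phrased as "not significantly different from the sum".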

"Prior work has identified basal ganglia circuitry as an important locus for the control of motor vigor (Panigrahi et al., 2015; Schmidt et al., 2008; Yttri and Dudman, 2016). In principle, cortical and basal ganglia circuitry could jointly control movement vigor, with coordination between these pathways mediated by bidirectional feedback between them (Bosch-Bouju et al., 2013). However, our findings indicate that in some situations, modulation of motor vigor can occur independently of basal ganglia circuitry."

The Discussion section as written implies that previous treatments have suggested that basal ganglia (BG) might be an exclusive determinant of motor vigor. However, I would just note that an argument for BG-independent pathways for control of movement vigor has been explicitly presented previously – as noted in Dudman and Krakauer (2016), cited in this manuscript: "It should be stressed that just because the basal ganglia can influence vigor this does not imply the converse: that vigor parameters are always under the obligate control of the basal ganglia." This is also explicit in normal treatments of this model of vigor, in the equations in Yttri and Dudman (2018), in which a BG-independent component of vigor sums with (is independent of) the BG contribution.

Nonetheless, I think the authors make a very important point here: an explicit modulatory influence that functions in cortex independently of BG to control vigor is a very valuable addition/demonstration. The authors might be interested to consider other phenomena that are similarly hard to reconcile with an exclusive reinforcement mechanism, like verbally-instructed changes in movement vigor in normal subjects ("Move faster!"), or the learned changes in vigor that persist in dopamine-depleted animals (Panigrahi et al., 2015) and patients (Mazzoni et al., 2007; Baraduc et al., 2013), or, as the authors do discuss, paradoxical kinesia. (These were the observations that led to the proposal of BG-independent circuits for motor vigor in previous reviews.) Moreover, similar to the bird circuitry highlighted in the discussion of this manuscript, in the mammal BG output also converges on subcortical targets of descending motor cortical projections, making these parallels perhaps even closer.

Thanks for bringing this work to our attention – we agree that much of this is relevant, and have incorporated in particular the possibility that cholinergic signaling in cortex could contribute to learned changes in motor vigor. As you point out, verbally-instructed changes in vigor and paradoxical kinesia may also rely on non-basal ganglia circuitry, and these are points that we note in the Discussion section (see the subsection “A potential role for the cholinergic system in movement disorders”).

Reviewer #3:

[…]

1) Historically, HVC has been thought to be important for tempo and sequencing while the AFP contributes to syllable structure. This makes Bengalese finches an interesting model to study HVC because they have greater sequence variability than zebra finches (a focus that Brainard has taken advantage of in the past). In particular, his lab has found that during directed singing, sequence entropy decreases and syllable repeats increase, and this has not been found to be affected by lesions of LMAN, thus hinting that HVC may be significant in this modulation. Here, they report that ACh manipulation in HVC affects sequence transitions (though there is little detail on this change, including whether there is a change in entropy) and also increases repeats (similar to the effect of directed singing). However, there is no further discussion of sequence or repeats through the rest of the paper. I would like to see the effects on sequence and repeats throughout. It would be especially interesting to know whether atropine affects the decrease in entropy and increase in repeats previously reported for directed singing. Presumably, this would not require additional experiments, as these measures could be computed from the data already collected.

As requested, we have now analyzed how carbachol affects transition entropy; we observed a trend toward reduced sequence entropy similar to directed song, though this was not statistically significant (see the text of the Results section for these experiments). Following a suggestion from Reviewer #1 (point #5), the sequencing effects from the carbachol experiments are now reported in a main figure (Figure 2).

We found that it was difficult to definitively determine whether or not changes in sequencing observed during directed song are dependent on cholinergic signaling in HVC, due to the small number of repeated syllables and branch points in the dataset, the limited number of songs we had available to estimate transition probabilities, and possibly instabilities in sequencing effects observed across repeated sessions of directed singing (as reported in Hampton et al., 2009). In particular, the effects of social context on sequencing were not very robust in the subset of birds tested with manipulations of social context and concurrent dialysis of atropine into HVC. Only two out of five repeated syllables in these birds exhibited a significant increase in syllable repetitions during directed song, and this increase was not attenuated by atropine (Figure 6—figure supplement 1C). We did not observe a significant decrease in transition entropy during directed song for these birds, so that there was not a meaningful opportunity to look for attenuation by atropine. Similarly, we did not observe an attenuation of the change in branch point transition probabilities with atropine (Figure 6—figure supplement 1D).
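For readers unfamiliar with the measure, the following is a minimal sketch of how transition entropy at a branch point could be computed. It assumes the standard Shannon entropy over next-syllable probabilities; the exact definition used in the manuscript may differ, and the function names and the single-character song encoding are hypothetical.

```python
import math
from collections import Counter


def branch_point_transitions(songs, syllable):
    """Collect the syllable that follows each occurrence of `syllable`.

    `songs` is a list of strings, one character per syllable
    (a hypothetical encoding chosen for this illustration).
    """
    following = []
    for song in songs:
        for current, nxt in zip(song, song[1:]):
            if current == syllable:
                following.append(nxt)
    return following


def transition_entropy(transitions):
    """Shannon entropy (in bits) of the observed next-syllable distribution."""
    counts = Counter(transitions)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no transitions observed for this syllable")
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Toy example: syllable 'b' is followed by 'c' three times and 'd' once.
songs = ["abcabd", "abcabc"]
entropy_bits = transition_entropy(branch_point_transitions(songs, "b"))
```

With few songs per condition, the estimated probabilities (and hence the entropy) are noisy, which is consistent with the limitation described above.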

2) I'd like more raw data and interpretation of the multi-unit recordings. It's unclear what these sites look like (how multi-unit is multi-unit?). Moreover, I find the rasters to be a poor way to convey the effects. In the one in Figure 3, there is so much black that you notice the decrease in white-space more than the increase in black, which gives the impression of a decrease rather than an increase. In Figure 4 I can't see the difference in the rasters. For interpretation, is the idea that there are just more neurons firing during directed singing, or that there is an increase in the rate of individual neurons, or both? Work using IEG expression in Bengalese finches indicates that there are more EGR1-expressing cells during undirected singing than during directed singing, which would appear to be at odds with the current multi-unit result, but this is not discussed.

As requested, the revised manuscript now includes substantial additional characterization of our multi-unit neural recordings and more interpretation in the Discussion section (see also our response to Reviewer #1, point #8). We now show a number of examples of the raw data from these recordings (Figure 4 and Figure 4—figure supplement 1). Additionally, we now report the distribution of firing rates (Figure 4—figure supplement 1A) and signal-to-noise ratios (Figure 4—figure supplement 1B and 1C) for all recording sites, which speak to the question of how 'multi-unit' the recordings are.

We agree that it was difficult to see firing rate changes induced by carbachol or social context in raster plots due in part to the density of points. We have enlarged the raster plots in Figure 4 (previously Figure 3) to decrease the density of points so that the modulation of activity might be more apparent. Nevertheless, we agree that the effects of carbachol and of social context are more apparent in the averaged firing rates presented in these figures. We still think that the raster plots are useful, in part to give a visual impression of the raw data with respect to song-locked modulation of activity and the overall stability of the pattern of neural firing, and have therefore retained them in the figures. For example, the raster plots illustrate that HVC activity is strongly modulated during song and that the pattern of this modulation is remarkably consistent from trial to trial, a typical observation for HVC activity that otherwise may not be apparent to those outside of the birdsong field.

We agree with the implicit comment regarding limitations on the interpretation of changes to multi-unit firing rates; it is difficult to make strong inferences from the observed increases in multi-unit activity about underlying changes to the activity of individual neurons – this includes the question of whether any formerly silent neurons are recruited, as well as the question raised by Reviewer #1 of how the different neural subtypes within HVC are affected. In our revised Discussion section we address the issue of how changes to HVC activity across different neural populations could contribute to the observed behavioral changes, but are careful to note that our models are speculative (see also our responses to Reviewer #1, point #8).

We have expanded the discussion of how our results might relate to the previous observation that IEG expression is modulated by social context (Matheson et al., 2016). We first note that the two studies are consistent in indicating that HVC activity is modulated by social context. We then discuss various differences between neural recordings and IEG expression that could reconcile our observation of an increase in multi-unit activity during directed song with an observed decrease in IEG expression during directed song. These include potential differences arising from the long integration time of the IEG response, the nonlinear relationship between neural activity and IEG expression, the specific neural types examined, and other aspects of experimental design (see the subsection “Contributions of HVC and the cholinergic system to social modulation of song”).

3) There are a number of instances in which there are many syllables and/or recording sites in a single bird, but there is no indication that bird identity has been included as a random variable in the statistical models. It is actually quite important to include bird ID as a variable in the model to account for the fact that many measurements have been made in the same individual (or to indicate more explicitly if it has already been included).

We agree. In particular, we note that while we (and many other birdsong publications) implicitly assume that all syllables can be treated as independent measurements, this is not formally correct, since different syllables sampled from the same bird may exhibit correlated effects, resulting in a form of pseudoreplication that could inflate estimates of statistical significance. As suggested, we can account for this type of structured data by including the identity of each bird as a random effect in a linear mixed effects model. Our revised manuscript now includes statistical analysis based on mixed effects models to supplement the original statistical tests. The significance tests of our main results are essentially unchanged. Details are provided in the Materials and methods section and in supplementary files that accompany the main figures (see the subsection “Mixed effects models” under “Statistics”).
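As an illustration only, a generic random-intercept model of the kind described above can be written as follows. The symbols are assumptions introduced here, not the authors' notation; the specification actually used is given in the Materials and methods section.

```latex
% Generic random-intercept model (illustrative; symbols are assumptions)
% i indexes birds, j indexes syllables (or recording sites) within bird i
y_{ij} = \beta_0 + \beta_1 x_{ij} + b_i + \varepsilon_{ij},
\qquad b_i \sim \mathcal{N}(0, \sigma_b^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here $y_{ij}$ is the measured change in a song feature, $x_{ij}$ indicates the experimental condition, $\beta_1$ is the fixed effect of interest, and $b_i$ is a bird-specific random intercept that absorbs correlations among measurements taken from the same bird.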

https://doi.org/10.7554/eLife.53288.sa2

Article and author information

Author details

  1. Paul I Jaffe

    1. Departments of Physiology and Psychiatry, University of California, San Francisco, San Francisco, United States
    2. Center for Integrative Neuroscience, University of California, San Francisco, San Francisco, United States
    3. Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, San Francisco, United States
    Contribution
    Conceptualization, Software, Formal analysis, Investigation, Methodology, Writing - original draft, Writing - review and editing
    For correspondence
    pauljaffe7@gmail.com
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-0680-3923
  2. Michael S Brainard

    1. Departments of Physiology and Psychiatry, University of California, San Francisco, San Francisco, United States
    2. Center for Integrative Neuroscience, University of California, San Francisco, San Francisco, United States
    3. Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, San Francisco, United States
    4. Howard Hughes Medical Institute, University of California, San Francisco, San Francisco, United States
    Contribution
    Conceptualization, Resources, Methodology, Writing - review and editing
    For correspondence
    msb@phy.ucsf.edu
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-9425-9907

Funding

Howard Hughes Medical Institute

  • Michael S Brainard

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Josh Berke, Alla Karpova, Lucas Tian, and members of the Brainard lab for helpful discussion and comments on the manuscript. We also thank Hamish Mehaffey for technical assistance and for designing the lightweight headstages used for extracellular recordings. This work was supported by the Howard Hughes Medical Institute.

Ethics

Animal experimentation: All procedures were performed in accordance with protocols (#AN170723-02) approved by the UCSF Institutional Animal Care Use Committee.

Senior Editor

  1. Ronald L Calabrese, Emory University, United States

Reviewing Editor

  1. Megan R Carey, Champalimaud Foundation, Portugal

Reviewers

  1. Stephen D Shea, Cold Spring Harbor Laboratory, United States
  2. Joshua Tate Dudman, Janelia Research Campus, Howard Hughes Medical Institute, United States
  3. Sarah C Woolley

Publication history

  1. Received: November 2, 2019
  2. Accepted: April 1, 2020
  3. Version of Record published: May 19, 2020 (version 1)

Copyright

© 2020, Jaffe and Brainard

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 305
    Page views
  • 44
    Downloads
  • 0
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
