
A small, computationally flexible network produces the phenotypic diversity of song recognition in crickets

  1. Jan Clemens (corresponding author)
  2. Stefan Schöneich
  3. Konstantinos Kostarakos
  4. R Matthias Hennig
  5. Berthold Hedwig
  1. European Neuroscience Institute Göttingen – A Joint Initiative of the University Medical Center Göttingen and the Max-Planck Society, Germany
  2. BCCN Göttingen, Germany
  3. University of Cambridge, Department of Zoology, United Kingdom
  4. Friedrich-Schiller-University Jena, Institute for Zoology and Evolutionary Research, Germany
  5. Institute of Biology, University of Graz, Austria
  6. Humboldt-Universität zu Berlin, Department of Biology, Germany
Research Article
Cite this article as: eLife 2021;10:e61475 doi: 10.7554/eLife.61475

Abstract

How neural networks evolved to generate the diversity of species-specific communication signals is unknown. For receivers of the signals, one hypothesis is that novel recognition phenotypes arise from parameter variation in computationally flexible feature detection networks. We test this hypothesis in crickets, where males generate and females recognize the mating songs with a species-specific pulse pattern, by investigating whether the song recognition network in the cricket brain has the computational flexibility to recognize different temporal features. Using electrophysiological recordings from the network that recognizes crucial properties of the pulse pattern on the short timescale in the cricket Gryllus bimaculatus, we built a computational model that reproduces the neuronal and behavioral tuning of that species. An analysis of the model’s parameter space reveals that the network can provide all recognition phenotypes for pulse duration and pause known in crickets and even other insects. Phenotypic diversity in the model is consistent with known preference types in crickets and other insects, and arises from computations that likely evolved to increase energy efficiency and robustness of pattern recognition. The model’s parameter to phenotype mapping is degenerate – different network parameters can create similar changes in the phenotype – which likely supports evolutionary plasticity. Our study suggests that computationally flexible networks underlie the diverse pattern recognition phenotypes, and we reveal network properties that constrain and support behavioral diversity.

Editor's evaluation

Clemens et al. present a computational model of the cricket song recognition network, which they show is capable of reasonably reproducing neural activity and song selectivity in G. bimaculatus. They then explore the parameter space of this network and find that varying parameters of model cells enable it to produce a range of selectivities for the period, pulse duration, duty cycle, or pause duration of input song. They then identify the network parameters that most affect song selectivity and investigate the relationship between several subsets of parameters and song preference. This is a fascinating exploration of the computational flexibility of a small neural circuit; it is well researched and written and was enjoyable to read.

https://doi.org/10.7554/eLife.61475.sa0

Introduction

Many behaviors are driven by the recognition and evaluation of sensory stimuli. For instance, hunting requires the detection and tracking of prey; communication requires the recognition of the sounds, pheromones, or visual displays that serve as signals. The diversity of animal behavior across taxa betrays the capacity of sensory systems to evolve and adapt to reliably and specifically recognize a wide variety of signals. Neural evolution is well understood when behaviors are driven by signals recognized through the response specificity of primary afferent neurons, where the change in a single amino acid can change the tuning of a specific behavior (Prieto-Godino et al., 2017; Ramdya and Benton, 2010). However, many behaviors are driven by complex temporal and spatial signal patterns, whose recognition is based on the processing and comparison of neural activity across time and space, where changes in many parameters define the tuning of the system. For these behaviors, unraveling the underlying neural computation is challenging since it requires a mapping from circuit parameters to recognition phenotype.

One prominent behavior involving the recognition of complex temporal patterns is acoustic communication. Many animals – monkeys, mice, bats, birds, frogs, crickets, grasshoppers, katydids, fruit flies – produce species-specific songs to attract and woo conspecifics of the other sex (Baker et al., 2019; Bradbury and Vehrencamp, 2011; Kostarakos and Hedwig, 2014; Neunuebel et al., 2015; Schöneich et al., 2015). During the evolution of acoustic communication, the structure of songs as well as behavioral preferences can evolve rapidly during speciation events (Blankers et al., 2015; Mendelson and Shaw, 2005), giving rise to the large diversity of species-specific songs. Since the evolution of song is mainly driven by the female (Gray and Cade, 2000), the females’ song recognition must be selective and modifiable in order to drive the evolution of distinct, species-specific song patterns in males (Wagner, 2007). But how are these changes implemented at the level of the pattern recognition networks? While electrophysiological experiments can demonstrate the principles of their operations at a given time, they are limited in revealing the functional contribution of cellular and synaptic parameters in a network-wide systematic analysis. To overcome these limitations, the ability of biological networks to generate different recognition phenotypes can be investigated using computational modeling.

Here, we examine the computational capacity of the brain network that recognizes the pulse pattern in the Mediterranean field cricket, Gryllus bimaculatus. Cricket song consists of a sinusoidal carrier frequency, modulated in amplitude with temporal structure on short (<100 ms) and long (>100 ms) timescales (Figure 1A). On the short timescale, the song consists of trains of sound pulses with a species-specific pulse duration and pulse pause. On the long timescale, the pattern is more variable, and pulse trains are either continuous (trills) or grouped into chirps interrupted by a longer chirp pause. The pulse pattern on the short timescale – and the female’s behavioral preference for it – is compactly described in a two-dimensional parameter space spanned by pulse duration and pause (Figure 1B and C). The diverse song preferences have been extensively mapped in more than 20 species (e.g., Bailey et al., 2017; Cros and Hedwig, 2014; Gray et al., 2016; Hennig, 2003; Hennig, 2009; Hennig et al., 2016; Rothbart and Hennig, 2012). This revealed three principal types of preference, defined by selectivity for specific features of the pulse pattern (Figure 1B): pulse duration, pulse period (duration plus pause), and pulse duty cycle (duration divided by period, corresponds to signal energy) (Hennig et al., 2014). Intermediates between these types are not known. A fourth type of selectivity – for pulse pause – has not been reported in crickets and is only known from katydids (Schul, 1998).
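The relations between the four pulse features defined above can be written out as a minimal Python sketch. The function name `pulse_features` and the example durations are illustrative assumptions, not part of the original study; only the definitions of period (duration plus pause) and duty cycle (duration divided by period) come from the text.

```python
def pulse_features(duration_ms, pause_ms):
    """Return (pulse period in ms, pulse duty cycle) for one pulse pattern."""
    period_ms = duration_ms + pause_ms      # period = duration + pause
    duty_cycle = duration_ms / period_ms    # duty cycle = duration / period
    return period_ms, duty_cycle


# Illustrative stimulus series (duration, pause) that each hold one feature
# constant, one per principal response type. The values are hypothetical.
constant_duration = [(15, p) for p in (10, 20, 30)]     # duration fixed at 15 ms
constant_pause = [(d, 15) for d in (10, 20, 30)]        # pause fixed at 15 ms
constant_period = [(d, 40 - d) for d in (10, 20, 30)]   # period fixed at 40 ms
constant_duty = [(d, d) for d in (10, 20, 30)]          # duty cycle fixed at 0.5
```

In the two-dimensional space spanned by duration and pause, the constant-period series lies on an anti-diagonal and the constant-duty-cycle series on a diagonal through the origin.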

Song structure and song preference in crickets, and the song recognition network of Gryllus bimaculatus.

(A) Parameters of the temporal pattern of cricket song. Short sound pulses are interleaved by pulse pauses. The pulse period is given by the sum of pulse duration and pulse pause. The pulse duty cycle corresponds to signal energy and is given by the ratio between pulse duration and period. In many species, pulses are grouped in chirps and interleaved by a chirp pause, while other species produce continuous pulse trains, called trills. (B) The behavioral tuning for pulse patterns can be characterized using response fields, which mark the set of behaviorally preferred pulse parameters in a two-dimensional diagram spanned by pulse duration and pause duration. Shown are schematics derived from behavioral data from particular species illustrating the four principal response types known from crickets and other insects. Traces below each response field show typical song patterns for each species. Response types can be defined based on tolerance (black lines) and selectivity (double headed arrows) for particular stimulus parameters, leading to specific orientations of the response field (see left inset): period tuning (purple, G. bimaculatus) is defined by selectivity for pulse period and tolerance for pulse duty cycle, giving an orientation of the response field of –45°. Duration tuning (lilac, G13, Gray et al., 2016) leads to vertically oriented response fields. Duty cycle tuning (cyan, Gryllus lineaticeps, Hennig et al., 2016) leads to diagonally oriented response fields. Pause tuning (green) with horizontal response fields is not known from crickets but has been reported in the katydid Tettigonia viridissima (Schul, 1998). The response field given represents a hypothetical cricket species. (C) Example stimulus series illustrating the stimulus features each response type in (B) is selective for. Vertical black lines mark the feature that is constant for each stimulus series. For duty cycle (cyan), the ratio between pulse duration and period is constant. 
(D) Song recognition network in the brain of G. bimaculatus. The network consists of five neurons, each with a specific computational role, which are connected in a feed-forward manner using excitation (pointed arrowheads) and inhibition (round arrowheads). The excitatory ascending neuron 1 (AN1) relays information from auditory receptors in the prothorax to the brain. The inhibitory local neuron 2 (LN2) inverts the sign of forwarded responses. LN2 inhibits the non-spiking LN5 neuron, which produces a post-inhibitory rebound. LN3 acts as a coincidence detector for excitatory input from AN1 and LN5. Input delays are tuned such that LN3 is maximally driven by the conspecific pulse train with a pulse period of 30–40 ms. LN4 integrates excitatory input from LN3 and inhibitory input from LN2 and further sharpens the output of the network. (E) Tuning for pulse period in LN4 (purple) matches the phonotactic behavior (gray) of G. bimaculatus females (D, E adapted from Figures 5A and 6A; Schöneich et al., 2015).

© 2015, Schöneich et al. Figure 1D and E are adapted from Figures 5A and 6A of Schöneich et al., 2015.

Repetitive patterns of short pulses that are organized in groups on a longer timescale are a common feature of acoustic signaling in insects, fish, and frogs (Baker et al., 2019; Carlson and Gallant, 2013; Gerhardt and Huber, 2002), and the processing and evaluation of these pulse patterns is therefore common to song recognition systems across species. Moreover, circuits analyzing temporal patterns of amplitude modulations are likely building blocks for recognizing the more complex acoustic communication signals found in vertebrates like songbirds or mammals (Aubie et al., 2012; Coffey et al., 2019; Comins and Gentner, 2014; Gentner, 2008; Neunuebel et al., 2015) including human language (Oganian and Chang, 2019; Neophytou and Oviedo, 2020). Insights from insects where assumptions on physiologically relevant parameters like synaptic strengths, delays, and membrane properties of individual neurons can be made and systematically tested are therefore relevant for studies of pattern recognition systems and the evolution of acoustic communication systems in general.

While little is known about the neural substrates that recognize song on the long timescale of chirps, the neuronal circuit that computes the behavioral preference for the pulse pattern on the short timescale has been revealed in the cricket Gryllus bimaculatus (Kostarakos and Hedwig, 2012; Schöneich et al., 2015). In this species, the selectivity for a narrow range of pulse periods is created in a network of five neurons and six synaptic connections by combining a delay line with a coincidence detector (Figure 1D). The ascending auditory neuron 1 (AN1) is tuned to the carrier frequency of the male calling song and provides the input to a small, four-cell network in the cricket brain. Driven by AN1, the local neuron 2 (LN2) inhibits the non-spiking neuron LN5, which functions as delay line and produces a post-inhibitory rebound depolarization driven by the end of each sound pulse. The coincidence detector neuron LN3 receives direct excitatory input from AN1 and a delayed excitatory input driven by the rebound of LN5; it fires strongly only if the rebound from LN5 coincides with the onset of the AN1 response to the next syllable. Lastly, the feature detector neuron LN4 receives excitatory input from LN3 and inhibitory input from LN2, which sharpens its selectivity by further reducing responses to pulse patterns that do not produce coincident inputs to LN3. LN4’s selectivity for pulse patterns closely matches the phonotactic behavior of the females (Figure 1E).

We here asked whether the network that recognizes features of the pulse pattern on the short timescale in G. bimaculatus (Figure 1D) has the capacity to produce the diversity of recognition phenotypes for pulse duration and pause known from crickets and other insects (Figure 1B), and what circuit properties support and constrain this capacity. Based on electrophysiological recordings (Kostarakos and Hedwig, 2012; Schöneich et al., 2015), we fitted a computational model that reproduces the response dynamics and the tuning of the neurons in the network. By exploring the network properties over a wide range of physiological parameters, we show that the network of G. bimaculatus can be modified to produce all types of preference functions for pulse duration and pause known from crickets and other insect species. The phenotypic diversity generated by the network is shaped by two computations – adaptation and inhibition – that reduce responses and point to fundamental properties of neuronal networks underlying temporal pattern recognition.

Results

A computational model of the song recognition network in G. bimaculatus

We tested whether the delay line and coincidence detector network of the cricket G. bimaculatus (Figure 1D) can be modified to produce the known diversity of preference functions for pulse duration and pause in cricket calling songs (Figure 1B). This network was previously inferred from the anatomical overlap together with the dynamics and the timing of responses of individually recorded neurons to a diverse set of pulse patterns (Kostarakos and Hedwig, 2012; Schöneich et al., 2015). Because electrophysiology is challenging in this system, dual-electrode recordings that would prove the existence of the inferred connections do not presently exist. We treat the neurons in the network as cell types, each of which may comprise multiple cells per hemisphere with highly consistent properties across individuals (Schöneich, 2020). We fitted a computational model based on intracellularly recorded responses of the network’s neurons to pulse trains. Our goal was to obtain a model that captures the computational capacity of the network without tying it to a specific biophysical implementation, and we reproduced the responses of individual neurons using a phenomenological model based on four elementary computations (Figure 2A): (1) filtering, (2) nonlinear transfer functions (nonlinearities), (3) adaptation, and (4) linear transmission with a delay. Nevertheless, all model components have straightforward biophysical correlates (see Discussion), which allows us to propose biophysical parameters that tune the network in specific implementations. The computational steps – for instance, whether a neuron had an integrating or a differentiating filter (Figure 2—figure supplement 2) – were selected such that each neuron’s response dynamics could be reproduced (Figure 2A, Figure 2—figure supplement 2, Table 1). 
The model parameters were first manually tuned to initial values that reproduced the key properties of each neuron’s response and then numerically optimized to fit each neuron’s response dynamics and tuning more precisely (Figure 2B–H). To simplify fitting, we exploited the feed-forward nature of the network: we first optimized the parameters of the input neuron to the network, AN1. We then proceeded downstream through the network, fitting the parameters of each neuron while keeping all parameters of its upstream partners fixed. Electrophysiological data used for fitting were the time-varying firing rate for the spiking neurons AN1, LN2, LN3, and LN4, and the membrane voltage for the non-spiking LN5 neuron, all in response to periodic pulse trains with different pulse durations and pauses (Kostarakos and Hedwig, 2012; Schöneich et al., 2015). The model parameters, the data used for fitting, and the fitting procedure are described in detail in Materials and methods, Figure 2—figure supplement 1, Figure 2—figure supplement 2, and Table 1.
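The four elementary computations named above (filtering, nonlinearity, adaptation, and transmission with a delay) can be illustrated with a minimal Python sketch. This is not the fitted model: the sampling step, the function names, the two-tap filter, and the exact form of the divisive normalization (input divided by an offset plus a weighted leaky average of the input) are assumptions made for illustration only.

```python
DT_MS = 1.0  # sampling step in ms (assumed for this sketch)


def linear_filter(x, kernel):
    """Causal convolution of the signal x with a filter kernel."""
    return [sum(kernel[k] * x[t - k] for k in range(len(kernel)) if t - k >= 0)
            for t in range(len(x))]


def rectify(x, threshold=0.0, gain=1.0):
    """Rectifying nonlinearity: zero below threshold, linear above it."""
    return [gain * max(v - threshold, 0.0) for v in x]


def adapt(x, tau_ms, w, x0=1.0):
    """Divisive normalization by a leaky running average of the input
    (assumed form: y = x / (x0 + w * average))."""
    alpha = DT_MS / tau_ms
    avg, out = 0.0, []
    for v in x:
        out.append(v / (x0 + w * avg))
        avg += alpha * (v - avg)  # leaky integration with timescale tau_ms
    return out


def delay(x, delay_ms):
    """Shift the signal by delay_ms, padding the start with zeros."""
    n = int(round(delay_ms / DT_MS))
    return ([0.0] * n + list(x))[:len(x)]


# Chain the four steps for one hypothetical model neuron.
stimulus = [1.0] * 20 + [0.0] * 20  # one 20 ms pulse followed by a 20 ms pause
response = rectify(adapt(delay(linear_filter(stimulus, [0.5, 0.5]), 5.0),
                         tau_ms=40.0, w=0.3))
```

Chaining simple stages in this way mirrors the phenomenological approach described above, in which each neuron is captured by a sequence of computations rather than by a specific biophysical implementation.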

Figure 2 with 4 supplements
A computational model reproduces the responses of the song recognition network.

(A) The model of the song recognition network (Figure 1D) combines four elementary computations: linear filtering (yellow), static nonlinearities (blue), adaptation (red), and synaptic transmission (black lines, pointed arrowheads: excitatory inputs; round arrowheads: inhibitory inputs). Multiple inputs to a cell are summed (green). Pictograms inside of each box depict the shapes of the filters and of the nonlinearities. Y scales are omitted for clarity. See Figure 2—figure supplement 1, Figure 2—figure supplement 2, and Materials and methods for details. (B–D) Tuning for period (B), duty cycle (C), and pause (D) in the data (red, each line is a trial-averaged recording from one individual) and in the model (black) for the five neurons in the network. Thicker red lines depict the recording used for fitting. Stimulus schematics are shown on the top of each plot. Tuning is given by the firing rate (AN1, LN2–4) or integral rebound voltage (LN5) for each chirp and was normalized to peak at 1.0 across the whole set of tuning curves shown in (B–D). A duration tuning curve is not shown since it is not contained in the electrophysiological data. See Figure 3 for duration tuning generated by the model. Number of individual recordings is 8/4/3/6/4 for AN1/LN2/5/3/4. (E) Firing rate (Hz) or membrane voltage (mV) traces from the recording used for fitting (red) and from the model (black). Stimuli (top) are pulse trains with different pulse periods. (F) Firing rate of LN2 in Hz (shaded area) and membrane voltage traces of LN5 in mV (line) in the recording used for fitting (top, red) and in the model (bottom, black) for short (18 ms), intermediate (34 ms), and long (58 ms) periods. The model reproduces the response timing of LN5 and LN2. Responses overlap for short and intermediate but not for long periods. (G) Goodness of fit for the response dynamics of all neurons at different timescales, quantified as the r² between the traces in the data and the model. 
Fits for AN1M, LN2M, and LN5M are good across all timescales. The fits for LN3M and in particular for LN4M increase with timescale (>10 ms) due to the sparse and variable spiking of these neurons (see E). (H) Goodness of fit for the tuning curves, quantified as 1 minus the root mean-square error between the tuning curves from the data and the model (compare black lines and thick red lines in B–D). The curves from the data and the model were normalized by the peak of the curve from the data, to make the measure independent of response scale. Performance is high for all neurons. The weaker match for LN5M stems from larger model responses for stimuli with long periods or pauses (see B, D).
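The tuning-curve measure described in (H) can be sketched in a few lines of Python. The function name `tuning_fit` is hypothetical; the computation follows the caption (both curves normalized by the peak of the data curve, then 1 minus the root mean-square error), but any further details of the original analysis are not reproduced here.

```python
import math


def tuning_fit(data_curve, model_curve):
    """Goodness of fit for tuning curves: 1 - RMSE after normalizing both
    curves by the peak of the data curve."""
    peak = max(data_curve)
    data_norm = [v / peak for v in data_curve]
    model_norm = [v / peak for v in model_curve]
    squared_errors = [(d - m) ** 2 for d, m in zip(data_norm, model_norm)]
    rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
    return 1.0 - rmse


# Hypothetical tuning curves (e.g., firing rate in Hz for three pulse periods).
score = tuning_fit([0.0, 2.0, 4.0], [0.0, 2.0, 2.0])
```

Normalizing by the peak of the data curve makes the score independent of the absolute response scale, so values for spiking neurons (Hz) and for the non-spiking LN5 (mV) can be compared.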

Table 1
Model parameters.

See Figure 2—figure supplement 1 and Figure 2—figure supplement 2 for an illustration and Materials and methods for a definition of all parameters. * marks parameters that were fixed during training (9/55). A second marker denotes parameters that were fixed during parameter and sensitivity analyses (10/55, Figures 4–6).

Cell | Component | Parameters
AN1M | Filter excitatory lobe | (Gaussian) width α = 0.0005, duration = 9.88 ms, input delay = 7.41 ms
AN1M | Filter inhibitory lobe | (Gaussian) width α = 2.32, gain γ = 0.06, duration N = 184 ms
AN1M | Nonlinearity | (Sigmoidal) slope = 1.5, shift = 1.5, gain = 5, baseline = −0.5
AN1M | Adaptation | (Divisive normalization) timescale τ = 3760 ms, strength w = 2.82, offset x0 = 1*
AN1M | Output gain | Gain = 12.8
LN2M | Input from AN1M | Delay = 0 ms, gain = 0.19
LN2M | Filter excitatory lobe | (Gaussian) width α = 1.07, duration N = 14.2 ms, gain = 0.272
LN2M | Filter inhibitory lobe | (Exponential) decay γ = 5.98 ms, duration N = 1000 ms*
LN2M | Output nonlinearity | (Rectifying) threshold = 0*, gain = 1.33
LN5M | Input from LN2M | Delay = 8.39 ms, gain = −0.005
LN5M | Postsynaptic filter | (Differentiated Gaussian) duration N = 5.0 ms, width α = 3.5*, gain of the excitatory lobe = 1.15
LN5M | Postsynaptic nonlinearity | (Rectifying) threshold = 0*, gain = 1*
LN5M | Rebound filter exc. lobe | (Exponential) decay γ = 3.54 ms, duration N = 20.7 ms, gain = 915
LN5M | Rebound filter inh. lobe | (Exponential) decay γ = 30.3 ms, duration N = 500 ms*, gain = 1718
LN5M | Output nonlinearity | (Rectifying) threshold = 0*, gain = 3.82
LN3M | Input from AN1M | Delay = 7.33 ms, gain = 32.1
LN3M | Input from LN5M | Delay = 3.16 ms, gain = 3.78
LN3M | Postsyn. nonlinearity | (Rectifying) threshold = 0.26, gain = 0.014
LN3M | Adaptation | (Divisive normalization) timescale τ = 39.4 ms, strength w = 0.283, offset x0 = 1*
LN3M | Output nonlinearity | (Rectifying) threshold = 2.33, gain = 7.68
LN4M | Input from LN2M | Delay = 17 ms, gain = −1205
LN4M | Input from LN3M | Delay = 4.87 ms, gain = 401
LN4M | Output nonlinearity | (Rectifying) threshold = 738, gain = 0.0052

The model faithfully reproduces the neural responses

The fitted model closely reproduced the responses of the network neurons to stimuli from the electrophysiological data set (Figure 2B–H). To quantify model performance, we assessed the match in the dynamics and in the tuning between the neuronal and the model responses. First, we computed the squared correlation coefficient (r²) between the recorded and the modeled responses (Figure 2G, Figure 2—figure supplement 3). We performed this correlation analysis at different timescales of the traces by low-pass filtering responses and predictions with filters of different durations. At shorter timescales, the measure is sensitive to the precise response timing, whereas at longer timescales it reflects the match in the coarse features of the firing rate or voltage dynamics. The r² value is high across all timescales for the model neurons (indexed with ‘M’) AN1M, LN2M, LN3M, and LN5M, which respond to a pulse in the biological network with multiple spikes or sustained membrane voltage deflections. By contrast, LN4 produces only a few, irregularly timed spikes during a chirp, and therefore r² is highest for timescales exceeding the duration of a typical pulse (15 ms) (Figure 2—figure supplement 3). Second, we calculated the match between the tuning curves derived from the experimental data and the model (Figure 2H). The model excellently reproduced the tuning curves of AN1M, LN2M, LN3M, and LN4M. Performance is lower for LN5M since the model produced overly strong rebound responses for patterns with long pulse periods and pauses (Figure 2B and D). In the electrophysiological data, the rebound amplitude is also variable across individuals. This may reflect interindividual variability, but it could also be an experimental artifact due to the challenges of recording from the tiny branches of this very small neuron. Despite this variability in LN5, the tuning of the downstream neurons LN3 and LN4 is not variable (Figure 2B–D). 
This indicates that the biological network is robust to small changes in rebound amplitude and that it primarily relies on rebound timing. Our model captures this (Figure 2E–G): the response dynamics and tuning of the downstream neurons LN3M and LN4M are well reproduced despite the discrepancy in LN5M rebound amplitude. Moreover, we find that altering rebound amplitude within the range of the discrepancy only weakly affects model output (Figure 2—figure supplement 4). We also cannot exclude that there is a population of multiple LN5-type neurons in each hemisphere and that variability between the individual LN5 neurons reflected in our recordings is averaged in their summed input to LN3. Overall, this shows that despite this small discrepancy between the data and the model, our model captures the computations of the biological network well.
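The timescale-dependent correlation analysis can be illustrated with a short Python sketch. The boxcar (moving-average) low-pass filter, the function names, and the toy spike trains are assumptions for illustration; the text states only that responses and predictions were low-pass filtered with filters of different durations before computing r².

```python
def smooth(x, width):
    """Low-pass filter: trailing moving average over `width` samples (assumed)."""
    out = []
    for t in range(len(x)):
        window = x[max(0, t - width + 1): t + 1]
        out.append(sum(window) / len(window))
    return out


def r_squared(a, b):
    """Squared Pearson correlation between two traces of equal length.
    Both traces must vary, otherwise the variance terms are zero."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov * cov / (var_a * var_b)


def r2_by_timescale(data, model, widths):
    """r² between data and model after smoothing both at each timescale."""
    return {w: r_squared(smooth(data, w), smooth(model, w)) for w in widths}


# Toy example: sparse spikes that the model predicts 2 samples too late.
data = [0.0] * 50
model = [0.0] * 50
for t in (10, 30):
    data[t] = 1.0
    model[t + 2] = 1.0
scores = r2_by_timescale(data, model, widths=[1, 10])
```

In this toy case the score is near zero at the finest timescale, where the spikes never coincide, and much higher after smoothing over 10 samples. This is the same qualitative effect described above for the sparse and irregular spiking of LN4.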

To further assess the model’s performance, we examined each model neuron’s responses over a wide range of pulse and pause durations that covered the range of song parameters found across cricket species (Weissman and Gray, 2019). There exist no electrophysiological data for such a wide range of stimuli, but the behavioral data from G. bimaculatus indicate that the neural responses should change smoothly with the song parameters (Grobe et al., 2012; Hennig et al., 2014; Kostarakos and Hedwig, 2012). The responses of all neurons in the model – presented as two-dimensional response fields that depict the response rate for each combination of pulse duration and pause in the set of test stimuli – are consistent with this prediction (Figure 3A). Discontinuities in the responses with a stimulus parameter stem from the discrete nature of the stimulus because the number of pulses per train changes with pulse duration and pause (Figure 3—figure supplement 1A and B). The response fields illustrate the gradual transformation of tuning in the network: LN2M at the beginning of the network responds best to stimuli with large duty cycles, that is, stimuli with long pulse durations and short pauses. Following the network from LN5M over LN3M to LN4M, the responses to large duty cycle stimuli attenuate and the pulse period tuning becomes more and more prominent, with LN4M ultimately being selective for a narrow range of pulse periods.

Figure 3 with 2 supplements
Model responses to novel pulse train stimuli.

Responses of the model neurons for stimuli with different combinations of pulse and pause durations (1–80 ms, 1600 stimuli per response field, color code corresponds to response magnitude). Each response field depicts the firing rate (for AN1M, LN2M–4M) or the voltage of the rebound (for LN5M) of model neurons. Pulse trains had a fixed duration of 140 ms and were interleaved by a pause of 200 ms, mimicking the chirp structure of G. bimaculatus calling song. Anti-diagonal step-like patterns in the response fields arise from changes in the number of pulses per train (Figure 3—figure supplement 1A and B). Although the data set used for fitting did not include stimuli with long pulse durations, the model predicts the weak response known from the behavior for these stimuli. Solid black lines indicate stimuli with 15 ms pause duration (horizontal), 15 ms pulse duration (vertical), 30 ms pulse period (anti-diagonal), and 0.5 pulse duty cycle (diagonal). Dots in the leftmost panel mark the stimuli used for fitting.
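The origin of the anti-diagonal steps can be illustrated with a minimal Python sketch. The 140 ms train duration comes from the caption; the 2 ms grid spacing and the rule for counting pulses (number of pulse onsets that fall within the train) are assumptions, since the exact stimulus construction is given elsewhere in the study.

```python
import math

TRAIN_MS = 140  # pulse train duration stated in the caption


def pulses_per_train(duration_ms, pause_ms, train_ms=TRAIN_MS):
    """Number of pulse onsets that fall inside one train (assumed counting rule)."""
    period_ms = duration_ms + pause_ms
    return math.ceil(train_ms / period_ms)


# Assumed grid: 40 x 40 values in 2 ms steps yields 1600 stimuli per field.
values = list(range(2, 81, 2))
grid = [(duration, pause) for duration in values for pause in values]
counts = {stim: pulses_per_train(*stim) for stim in grid}
```

Because the pulse count depends only on the period, it is constant along lines of equal duration plus pause (anti-diagonals) and jumps between them, which is consistent with the step-like patterns noted in the caption.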

The song of G. bimaculatus is produced with a chirp pattern (Figure 1A) and female preference for it is broad relative to the tuning for pulse duration and pause (Grobe et al., 2012). Consistent with that, the model’s response field is robust to small changes in chirp duration, such as the addition or removal of a single pulse, that are typical of the natural song of this species (Figure 3—figure supplement 2). This confirms that our results on pulse duration and pause are robust to the small variations on the longer timescale of chirps observed in natural song. Future studies will examine to what extent the network that recognizes song on the short timescale of pulses also contributes to female preference on the longer timescale of chirps.

In summary, our model reproduces the characteristic response features of each neuron type in the biological network. Using this model of the song recognition mechanism in G. bimaculatus, we can now test whether the network has the capacity to produce the behavioral preferences for pulse duration and pause known from other cricket species and identify the parameters that determine the network’s preference.

The network can be tuned to produce all known preferences for pulse duration and pause in crickets

To determine the diversity of behavioral preferences for pulse duration and pause that the network can produce, we created different model variants by altering all model parameters – for instance, the weight and delay of inputs or the amplitude or duration of filters. The model variants were generated by randomly picking values for each of the 45 parameters from an interval around the parameter values of the fit to G. bimaculatus (see Materials and methods for details and Table 1). Biophysical parameters of a neuron type can vary 10-fold even within a species (Goaillard et al., 2009; Schulz et al., 2007), and we therefore chose an interval of 1/10 to 10 times the values from the fit to G. bimaculatus. Initial experiments with a wider range (1/100 to 100-fold) yielded qualitatively similar results but with a larger fraction of untuned or unresponsive models. Delay parameters were randomly picked from an interval between 1 and 21 ms. Delay parameters correspond to the delay added to a neuron’s inputs and were optimized during fitting to match the timing of the responses of the neuron’s outputs. They therefore account not only for axonal transduction and synaptic transmission delays but also for delays arising from low-pass filtering or integration of inputs to the spiking threshold (Creutzig et al., 2010; Zhou et al., 2019), justifying the extended range of values. To translate responses of the output neuron of the network – LN4M – into phonotaxis, we used a simple model: the firing rate of LN4 is strongly correlated with the female phonotaxis in G. bimaculatus (Figure 1E, Schöneich et al., 2015), and we therefore took LN4M’s firing rate averaged over a chirp to predict phonotaxis from the model responses. Integrative processes over timescales exceeding the chirp are known to affect behavior in crickets and other insects (Poulet and Hedwig, 2005, see also Meckenhäuser et al., 2014; Clemens et al., 2014; DasGupta et al., 2014). 
We omit them here since they do not crucially affect responses for the simple, repetitive stimuli typical for pulse trains produced by crickets. The preference properties of the network models with randomized parameter sets were characterized for a two-dimensional stimulus space using pulse trains with all combinations of pulse durations and pauses between 1 and 80 ms (Figure 4A). We generated 5 million model variants; 9% of these (450,000) were responsive and selective and were used for all further analyses. This low proportion arises because many parameter combinations produce constant output, for instance, if the firing threshold in AN1 is too low or too high.
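The randomization of parameters can be sketched in Python as follows. The intervals (1/10 to 10 times the fitted value, and 1 to 21 ms for delays) come from the text; the sampling distributions (log-uniform for scaled parameters, uniform for delays), the function name, and the parameter names are assumptions made for illustration. The two example values are taken from the LN3M rows of Table 1.

```python
import random


def sample_variant(fitted, delay_names, rng, fold=10.0, delay_range=(1.0, 21.0)):
    """Draw one model variant around the fitted parameter values."""
    variant = {}
    for name, value in fitted.items():
        if name in delay_names:
            # Delays are drawn from a fixed interval in ms.
            variant[name] = rng.uniform(*delay_range)
        else:
            # Scale by a factor between 1/fold and fold
            # (log-uniform sampling is an assumption of this sketch).
            factor = fold ** rng.uniform(-1.0, 1.0)
            variant[name] = value * factor
    return variant


# Hypothetical parameter names with fitted values from Table 1 (LN3M).
fitted = {"LN3M_gain_from_AN1M": 32.1, "LN3M_delay_from_AN1M": 7.33}
rng = random.Random(0)
variants = [sample_variant(fitted, {"LN3M_delay_from_AN1M"}, rng)
            for _ in range(1000)]
```

Scaling by a multiplicative factor preserves the sign of a parameter, so inhibitory (negative) gains remain inhibitory in every variant.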

Figure 4 with 3 supplements
The network generates the diversity of response profiles known from crickets and other insects.

(A) Response field generated from a model variant with randomized parameters. Response magnitude coded in gray colors (see color bar). Tuning was characterized in terms of the preferred pulse and pause durations (red dot) and as tolerant (blue) or selective (purple) directions in the stimulus field. This example is period tuned (purple contour marks the 75% response level) and the set of preferred stimuli is oriented at –45° (see inset in D for a definition of the angles), corresponding to selectivity for period (purple) and tolerance for duty cycle (cyan). (B) Distribution of preferred pulse and pause parameters for all model variants generated from randomized parameter combinations (coded in gray colors, see color bar). Anti-diagonal patterns arise from the discrete nature of pulse trains (Figure 3—figure supplement 1A and B). Models that prefer pauses of 0 ms correspond to models that prefer constant tones. Enrichment of models that prefer the maximally tested pause of 80 ms indicates that the network can generate preference for longer pauses than tested. Preferences cover the stimulus field. (C) Distribution of preferred pulse parameters (left to right: pulse period, pulse duration, duty cycle, and pause). Gray histograms correspond to the distributions expected from uniform sampling of stimulus space – deviations from this distribution indicate response biases. The network is biased to produce preferences for short pulse periods, short pulse durations, and low duty cycles. Peaks in the histograms arise from the discrete nature of pulse trains (Figure 3—figure supplement 1A and B) or from boundary effects (see B). (D) Distribution of the orientation of the response fields (see A) for model variants that are well fitted by an ellipsoid, have a single peak, and are asymmetrical. Colored lines indicate the range of angles (±10°) that correspond to the four principal response types (see inset and Figure 1B). 
The network can produce response fields at all angles, including the four principal types of tuning for period, duration, duty cycle, and pause. Response fields with small angles around 0°, corresponding to duration tuning, occur most often. (E) Examples of tuning profiles for pulse period, duration, duty cycle, and pause. Profiles for all tuning types cover the examined stimulus space. (F) To identify model parameters useful for controlling network tuning, we modified each model parameter between 1/100 and 100-fold and calculated the change in the response field. The sensitivity score quantifies how much changing a parameter’s value changes the response field. Examples shown are the parameters with the highest (LN5M postsynaptic filter duration, black) and lowest non-zero sensitivity (LN5M rebound filter excitatory decay, gray) (see H). (G) Average sensitivity scores by neuron. LN5M has the highest score and thus most strongly shapes network tuning, consistent with the rebound and coincidence detection being the core computational motif of the network. (H) Model parameters ranked by sensitivity score. Parameters that induce no or only a single step-like change in the response field were excluded. Color indicates cell type (same as in G). Parameters of LN5M (bright orange) and LN3M (dark orange) rank high, demonstrating the importance of the rebound and coincidence detection for shaping model tuning. The model schematic (inset) highlights the most important types of parameters.

As a first step towards characterizing the types of tuning the network can produce, we assessed the preferred pulse duration and pause for each of the 450,000 selective model variants (Figure 4A). We find that preferences cover the full range of pulse and pause combinations tested (Figure 4B). However, the model variants do not cover the preference space uniformly but are biased to prefer patterns with short pulse durations, short periods, and low duty cycles (Figure 4C). Peaks at pauses of 0 ms arise from duty cycle-tuned models with a preference for unmodulated stimuli, and peaks at pauses of 80 ms arise from models preferring pauses beyond the range tested here. In conclusion, the network can produce diverse recognition phenotypes, but this diversity is biased towards specific stimulus patterns.

The preferred pulse parameters – duration, pause, and their combinations period and duty cycle – only incompletely describe a network’s recognition phenotype. In the next step, we focused a more exhaustive description of the response fields on aspects that have been well described in behavioral analyses. This allowed us to assess the match between the diversity of response fields in the model with the known biological diversity. Behavioral analyses in crickets and other insects (Deutsch et al., 2019; Hennig et al., 2014; Schul, 1998) typically find oriented response fields with a single peak in the two-dimensional parameter space spanned by pulse duration and pause (Figure 4A). The vast majority of these fields have an elongated major axis defining stimulus parameters the female is most tolerant for, and a shorter minor axis defining parameters the female is most selective for. Multi-peaked response fields have been associated with a resonant recognition mechanism and have so far only been reported in katydids (Bush and Schul, 2005; Webb et al., 2007), not in crickets. The orientations of response fields measured in more than 20 cricket species cluster around four angles (Figure 1B) forming four principal types of tuning. Intermediate types of phonotactic tuning may exist but have not been described yet. Specifically, duration tuning is defined as selectivity for pulse duration and tolerance for pause (Figure 1C, lilac) (Teleogryllus commodus, Gray et al., 2016; Hennig, 2003, see also Deutsch et al., 2019). This corresponds to the response field’s major axis being parallel to the pause axis (defined as an orientation θ of 0°, see inset in Figure 4D). By contrast, pause tuning (Figure 1C, green) corresponds to an orientation θ of 90°, with the response field’s major axis extending parallel to the pulse duration axis. This type of tuning is not known in crickets, only in katydids (Schul, 1998). 
Pulse period and duty cycle tuning correspond to response fields with anti-diagonal and diagonal orientations, respectively. Period tuning (Figure 1C, purple) is given by an anti-diagonal orientation (θ = –45°), indicating selectivity for pulse period and tolerance for duty cycle (G. bimaculatus, Hennig, 2003; Hennig, 2009; Rothbart and Hennig, 2012). Last, duty cycle tuning (Figure 1C, cyan) is given by diagonal alignment (θ = 45°) and selectivity for duty cycle but tolerance for period (Gryllus lineaticeps, Hennig et al., 2016).
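The relations between the pulse parameters and the four principal tuning types can be made concrete with a short sketch. The function names, the wrapping of orientations to the interval (–90°, 90°], and the default tolerance of ±10° (borrowed from the angle ranges in Figure 4D) are illustrative assumptions and not code from the study.

```python
from typing import Optional

# Orientation (degrees) of the response field's major axis for each principal type,
# following the definitions in the text (0 deg = major axis parallel to the pause axis).
PRINCIPAL_TYPES = {"duration": 0.0, "duty cycle": 45.0, "pause": 90.0, "period": -45.0}


def pulse_period(duration_ms: float, pause_ms: float) -> float:
    """Pulse period is the sum of pulse duration and pause."""
    return duration_ms + pause_ms


def duty_cycle(duration_ms: float, pause_ms: float) -> float:
    """Duty cycle is the fraction of the pulse period occupied by the pulse."""
    return duration_ms / pulse_period(duration_ms, pause_ms)


def wrap_orientation(theta_deg: float) -> float:
    """Assumption: an axis orientation is defined modulo 180 deg; map it to (-90, 90]."""
    wrapped = (theta_deg + 90.0) % 180.0 - 90.0
    return 90.0 if wrapped == -90.0 else wrapped


def tuning_type(theta_deg: float, tol_deg: float = 10.0) -> Optional[str]:
    """Return the principal type within +/- tol_deg, or None for intermediate fields."""
    for name, center in PRINCIPAL_TYPES.items():
        # Angular distance between two axis orientations (period 180 deg).
        if abs(wrap_orientation(theta_deg - center)) <= tol_deg:
            return name
    return None
```

For example, a pattern with 20 ms pulses and 20 ms pauses has a period of 40 ms and a duty cycle of 0.5, and a response field oriented at –45° is classified as period tuned under these assumptions.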

We first examined to what extent the model produced the single-peaked, asymmetrical response fields typical for crickets. We find that most response fields (80%) produced by the selective model variants were well described by a single ellipse (Figure 4—figure supplement 2A and B, see Materials and methods for details). Of these, 83% were asymmetrical (major axis >1.25× longer than minor axis) and 17% were symmetrical (Figure 4—figure supplement 2C). A further 12% of all models produced multi-peaked response fields (Figure 4—figure supplement 2D), which are only known from katydids (Webb et al., 2007). The remaining 8% of the response fields were not well described by ellipses and/or did not have multiple distinct peaks. Thus, while the model produces more diverse responses – including complex, multi-peaked ones – most responses do match those typical for crickets.

We next assessed the orientation of the single-peaked, asymmetrical response fields to test to what extent they fall into four principal types (Figure 4D). We find response fields with any orientation, again demonstrating that the network can produce more diverse response fields than has been reported in crickets. However, the orientations are unevenly distributed and are enriched for the principal types known from crickets: 36% of the response fields have an orientation of 0 ± 10°, which corresponds to duration tuning (expectation from uniform distribution: 20°/360° = 5.6%). Duty cycle tuning (45 ± 10°) and period tuning (–45 ± 10°) are also enriched, with 17% and 12%, respectively. Notably, pause tuning (90 ± 10°) is not known in crickets and is the only principal type that is rarer than expected from a uniform distribution of orientations (2.0% vs. 5.6% expected). The rarity of pause tuning is consistent with the bias to prefer short pulse durations observed above (Figure 4C) since orientations around 90° require response fields that extend parallel to the duration axis. Note that these trends do not depend critically on the ranges of angles chosen for specifying the different response types.
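The comparison with a uniform distribution of orientations is simple arithmetic and can be checked directly. The sketch below uses only the percentages quoted in the paragraph above; the variable names and the notion of "fold enrichment" as observed divided by expected are illustrative choices.

```python
# Width of each angular window (+/- 10 deg) and the full range used in the text.
window_deg = 20.0
full_range_deg = 360.0
expected_pct = 100.0 * window_deg / full_range_deg  # chance level for a uniform distribution

# Observed percentages of single-peaked, asymmetrical response fields, as quoted in the text.
observed_pct = {"duration": 36.0, "duty cycle": 17.0, "period": 12.0, "pause": 2.0}

# Fold enrichment relative to chance: values above 1 are enriched, below 1 are depleted.
enrichment = {name: pct / expected_pct for name, pct in observed_pct.items()}

for name, fold in enrichment.items():
    print(f"{name}: {fold:.1f}x the uniform expectation")
```

With these numbers, duration, duty cycle, and period tuning all lie above the uniform expectation, and pause tuning is the only type that falls below it.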

Overall, the response fields generated by the model are roughly consistent with the behavioral diversity in crickets: most response fields form a single, elongated ellipse, similar to the behaviorally measured response fields. Duration, duty cycle, and period tuning are frequent in the models and in crickets, pause tuning is rare in the models and absent in crickets. Interestingly, the network tends to create a larger diversity of response fields than is known from crickets, for instance, fields that are symmetrical, multi-peaked, or have intermediate orientations. This suggests that biases in the network – like the rarity of pause tuning – constrain the distribution of preferences that evolution can select from, and that additional factors – like robustness to noise or temperature – then determine the ultimate distribution of phenotypes. Our analysis of different model variants suggests that this song recognition network can produce all known preference types for pulse duration and pause over the range of stimulus parameters relevant for crickets. This phenotypic flexibility implies that the network may form the basis for the diversity of song recognition phenotypes. We therefore sought to identify model parameters that support that diversity, that is, parameters that change the preference for pulse period or that switch the preference from one type to another. We also looked for parameters that constrain the diversity, for instance, parameters that induce a bias towards low duty cycles (Figure 4B–D).

Post-inhibitory rebound properties and coincidence timing are key parameters that shape preferences

To determine key parameters that control the network tuning and to identify the computational steps that induce the preference bias, we systematically examined the effect of changing individual model parameters on the response fields. We swept each parameter individually in 21 log-spaced steps over an interval of 1/100 to 100-fold around the value from the original fit (Figure 4F). We then calculated a sensitivity score for each parameter as the average change in the response field of the network’s output neuron, LN4M, over the parameter sweep (see Materials and methods). Parameters that when changed produced mostly unselective or unresponsive models were excluded from subsequent analyses, as were parameters that only induced one or two sudden changes in the response fields. For instance, parameters that control the firing threshold of AN1 were excluded because they turn the input to the network on or off – this produces a large, step-like change in the response field and many unresponsive models. Our sensitivity analysis thereby focuses on parameters suitable for controlling the network’s tuning, that is, whose change induces smooth shifts in the model responses while retaining responsiveness and selectivity.
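The sweep described above can be sketched in a few lines. The 21 log-spaced factors between 1/100 and 100 follow the text; everything else is an assumption made for illustration: the score is taken here as the mean absolute change between response fields at neighboring sweep steps (the exact definition is given in Materials and methods), and `toy_model` is a hypothetical stand-in for the network, not the fitted model.

```python
import math
from typing import Callable, List

Field = List[List[float]]  # response magnitudes on a (pulse duration x pause) grid


def log_spaced_factors(n_steps: int = 21, max_fold: float = 100.0) -> List[float]:
    """Multiplicative factors from 1/max_fold to max_fold, evenly spaced on a log scale."""
    log_max = math.log10(max_fold)
    return [10.0 ** (-log_max + 2.0 * log_max * i / (n_steps - 1)) for i in range(n_steps)]


def mean_abs_difference(a: Field, b: Field) -> float:
    """Average absolute difference between two response fields of equal shape."""
    diffs = [abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def sensitivity_score(model: Callable[[float], Field], fitted_value: float) -> float:
    """Assumed score: mean change between response fields at neighboring sweep steps."""
    fields = [model(fitted_value * factor) for factor in log_spaced_factors()]
    changes = [mean_abs_difference(f0, f1) for f0, f1 in zip(fields, fields[1:])]
    return sum(changes) / len(changes)


def toy_model(preferred_period_ms: float) -> Field:
    """Hypothetical stand-in for the network: Gaussian tuning for pulse period."""
    grid = range(5, 85, 5)  # pulse durations and pauses in ms
    return [
        [math.exp(-((d + p - preferred_period_ms) ** 2) / (2.0 * 10.0 ** 2)) for p in grid]
        for d in grid
    ]


factors = log_spaced_factors()
# A parameter that shifts the tuning yields a positive score ...
score_tuned = sensitivity_score(toy_model, fitted_value=40.0)
# ... whereas a parameter without any effect on the response field scores zero.
score_flat = sensitivity_score(lambda value: toy_model(40.0), fitted_value=40.0)
```

Ranking parameters by such a score would require a separate filter for sweeps that mostly yield unresponsive or unselective models, which this sketch does not attempt.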

The topology of the pattern recognition network is defined by five neurons (Figure 1D). As a first step, we sought to evaluate the importance of each neuron for controlling the network tuning by averaging the sensitivity scores for the parameters of each neuron (Figure 4G). This revealed that the network tuning can be least controlled through the parameters of AN1M, the input neuron of the network, and best controlled through the parameters of the non-spiking neuron LN5M, which generates the delayed post-inhibitory rebound. AN1M is unsuitable for control because changes in most parameters of AN1M will not induce a gradual change in tuning but will quickly produce no or unselective responses in the network. The importance of LN5M is consistent with the idea that the rebound and coincidence detection form the core computations of the network (Schöneich et al., 2015) since the dynamics of the rebound response directly influence what stimulus features result in simultaneous inputs to the coincidence-detector neuron LN3M.

To identify important parameters for tuning the network, we ranked them by their sensitivity score (Figure 4H). In line with the above analysis for each neuron, the top-ranked parameters directly affect the timing and the amplitude of inputs to LN3M. Among these are the delays and gains for the connections upstream of LN3M (AN1M→LN3M, LN2M→LN5M, LN5M→LN3M) but also the gain of the excitatory lobe of the LN2M filter (see Figure 2—figure supplement 1, Figure 2—figure supplement 2). Another group of important parameters affects the dynamics and amplitude of the rebound response in LN5M: first, the duration of the ‘postsynaptic’ filter of LN5M (Figure 2—figure supplement 2), which is required to reproduce the adapting and saturating dynamics of the inputs to LN5M, visible as negative voltage components in recordings of LN5 (Schöneich et al., 2015). Second, the gain of the inhibitory and the duration of the excitatory lobe of the rebound filter that produces the post-inhibitory rebound (Figure 2—figure supplement 1C, Figure 2—figure supplement 2). A modified sensitivity analysis, in which we changed combinations of two parameters at a time, produced a similar parameter ranking, confirming the robustness of these results (Figure 4—figure supplement 1).

Our sensitivity analysis revealed key parameters that change the tuning of the network, but did not address their specific effect, for instance, on the preference for specific pulse durations or periods (Figure 4E). This is however crucial for understanding which network parameters need to be modified to produce a specific phenotype and where the bias for stimuli with low duty cycles (short pulses and long pauses) arises (Figure 4B–D). We therefore examined the specific effects of some of the top-ranked parameters (Figure 4H) on the tuning of the network.

Relative timing of inputs to the coincidence detector controls pulse period preference

Four of the important parameters identified in our sensitivity analysis affect the timing of inputs to the coincidence-detector neuron LN3. Three of these parameters are the delays of the AN1M→LN3M, the LN2M→LN5M, and the LN5M→LN3M connections. The fourth parameter – the duration of the filter that shapes input adaptation in LN5M – also affects the input delays to LN3M (Figure 5—figure supplement 1). The delay between the spikes from AN1M and the rebound from LN5M in LN3M strongly affects network tuning since it determines which pulse train parameters produce coincident inputs required for driving spikes in LN3M (Figure 5A). Increasing this delay – for instance by delaying the rebound from LN5M – increases the preferred pulse period in LN3M (Figure 5B). This delay was hypothesized to be the core parameter that tunes G. bimaculatus to a pulse period of 30–40 ms (Schöneich et al., 2015), and our sensitivity analysis identifies this parameter as crucial for shaping the network’s tuning.
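The link between input delay and preferred period can be illustrated with a textbook delay-line coincidence detector. This is a deliberately reduced toy and not the fitted network: it assumes that the delayed input is a copy of the response to the previous pulse, that both inputs are Gaussian bumps, and that only the immediately preceding pulse matters. In the actual model, the effective delay depends on the pulse pattern. All names and parameter values are hypothetical.

```python
import math
from typing import List


def pulse_onsets(period_ms: float, train_ms: float = 600.0) -> List[float]:
    """Onset times of all pulses that start within one pulse train."""
    n_pulses = int(train_ms // period_ms)
    return [i * period_ms for i in range(n_pulses)]


def coincidence_response(period_ms: float, delay_ms: float, width_ms: float = 5.0) -> float:
    """Mean overlap between each pulse response and the delayed response to the previous pulse.

    Assumption: the overlap of the two inputs falls off as a Gaussian of their
    timing mismatch, with a width set by width_ms.
    """
    onsets = pulse_onsets(period_ms)
    overlaps = []
    for previous, current in zip(onsets, onsets[1:]):
        delayed_input = previous + delay_ms  # delayed, rebound-like input
        direct_input = current               # direct input driven by the next pulse
        mismatch = direct_input - delayed_input
        overlaps.append(math.exp(-(mismatch ** 2) / (2.0 * width_ms ** 2)))
    return sum(overlaps) / len(overlaps) if overlaps else 0.0


def preferred_period(delay_ms: float) -> float:
    """Pulse period (ms) that maximizes the toy coincidence response."""
    candidates = [float(p) for p in range(10, 85, 5)]
    return max(candidates, key=lambda p: coincidence_response(p, delay_ms))


for delay in (20.0, 30.0, 40.0):
    print(f"delay {delay:.0f} ms -> preferred period {preferred_period(delay):.0f} ms")
```

In this toy, the preferred period simply tracks the delay, which captures the qualitative trend reported for LN3M but none of the stimulus dependence of the full model.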

Figure 5 with 3 supplements
Input delays and response suppression control period and duty cycle preference.

(A) Inputs from LN5M and AN1M (orange and green shaded areas) to LN3M and output of LN3M (red line) for three input delays from LN5M to LN3M (‘rebound delay’ Δ). The rebound delay is defined as the delay added to the output of LN5M in the model. The effective delay between the AN1M and LN5M inputs to LN3M depends on the pulse pattern (black, top, pulse duration 20 ms, pause 18 ms). An intermediate delay of 11 ms produces the most overlap between the AN1M and LN5M inputs for that stimulus and hence the strongest responses in LN3M. Vertical black line marks an AN1M response peak, arrows point from the nearest LN5M response peak. (B) Preferred periods for LN3M (red), LN4M in an intact model (purple), and LN4M in a model without inhibition from LN2M to LN4M (blue) as a function of the rebound delay. The preferred period increases with rebound delay for all three cases. (C) Response fields for LN3M (left), LN4M in an intact network (middle), and for LN4M in a model without inhibition in LN4M from LN2M (right) (color coded, see color bar). The rebound delay was set to 21 ms, which increases the preferred period in both LN3M and LN4M to 50 ms (left, compare B). However, increasing the delay also decreases the preferred duty cycle in LN4M (middle). Removing the inhibition from LN2M in LN4M abolishes the change in duty cycle preference (right). Anti-diagonal lines mark the preferred period of 50 ms for each response field, and lines starting at the origin mark the preferred duty cycle. (D) Same as (B) but for the preferred duty cycle. With increasing delay, the preferred duty cycle for LN4M approaches 0.25 but is stable for LN3M and for LN4M without inhibition (Figure 5—figure supplement 2). (E) Inputs to LN4M (middle, green: inhibition from LN2M; red: excitation from LN3M) and output of LN4M (bottom, purple) for the intact network in (C) and for three different stimulus sequences with a pulse period of 54 ms and increasing duty cycles (top, black). 
Responses are shown for the second pulse in a train. Excitatory input from LN3M is weaker and overlaps more with the inhibition for high duty cycles (compare amplitude and latency of response peaks in LN3M), leading to a reduction in LN4M responses with increasing duty cycle. Y-scales are identical for all three panels and were omitted for clarity. (F) Dependence of peak amplitude (top) and peak latency (time from pulse onset to response peak, middle) of inputs to LN4M (red: excitation from LN3M; green: inhibition from LN2M) on pulse duty cycle for the intact network in (C). Weaker and later excitation suppresses LN4M responses for pulse trains with high duty cycles (bottom, purple). (G) Four sources of suppression in the network: the inhibitory lobe in the filter of AN1M (green), adaptation in AN1M (cyan) and LN3M (red), and inhibition in LN4M from LN2M (purple). Shown are responses to a pulse pattern (top black, 20 ms pulse duration and 20 ms pause) when the source of suppression is present (dashed lines) or absent (solid lines). Removing suppression produces stronger or more sustained responses. ‘A’ and ‘I’ refer to adaptation and inhibition, respectively. (H) Response fields (color coded, see color bar) for the network output (LN4M) after removing different sources of suppression. The presence or absence of different sources of suppression is marked with a ‘+’ and a ‘–’, respectively. Removing suppression in the network increases the preferred duty cycle. Lines mark the preferred pulse duty cycle, and black dots indicate the preferred pulse duration and pause. (I) Preferred duty cycle in LN4M as a function of the number of sources of adaptation present in the model. Black dots show the preferred duty cycle of individual model variants, the purple line shows the average over models for a given number of adaptation sources. Adaptation decreases the preferred duty cycle (Pearson’s r = 0.78, p = 3 × 10⁻⁴). See Figure 2—figure supplement 1 for details. 
The pulse trains for all simulations in this figure had a duration of 600 ms and were interleaved by chirp pauses of 200 ms to ensure that trains contained enough pulses even for long pulse durations and pauses. Rebound delay set to 21 ms in (C) and (E–I) to make changes in the duty cycle preference more apparent.

Interestingly, changing the rebound delay has differential effects in LN3M and in the output neuron of the network, LN4M. In LN3M, increasing the rebound delay changes both duration and pause preference and increases the preferred pulse period without changing the duty cycle preference (Figure 5C and D). However, in LN4M, a longer rebound delay only affects pause preference, but not duration preference, and thereby reduces the preferred duty cycle from 0.50 to 0.25 (Figure 5D, Figure 5—figure supplement 2). This reduction of the preferred duty cycle is a correlate of the low duty cycle bias observed in the network (Figure 4C). We therefore investigated the origin of this effect more closely.

LN4M receives excitatory input from LN3M and inhibitory input from LN2M and applies a threshold to the summed inputs (see Figure 2A, Figure 2—figure supplement 2). To determine which computation in LN4M reduces the preferred duty cycle, we removed the inhibition from LN2M and the threshold in LN4M’s output nonlinearity. While the threshold has only minor effects on tuning, removing the inhibition is sufficient to restore the preference for intermediate duty cycles in LN4M (Figure 5B–D, blue). This implies that the inhibition from LN2M suppresses the responses to high duty cycles in LN4M. We find that changes in the strength and in the timing of excitatory inputs from LN3M to LN4M contribute to this suppression (Figure 5E and F). First, the excitatory inputs from LN3M weaken with increasing duty cycle, leading to a relatively stronger impact of the inhibition from LN2M on the responses of LN4M (Figure 5F). Second, the excitatory inputs from LN3M arrive later with increasing duty cycle, resulting in a more complete overlap with the inhibition from LN2M and therefore in a more effective suppression of LN4M spiking responses (Figure 5E and F).
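The two effects described above, weaker and later excitation, can be reproduced with a minimal sum-and-threshold unit. The sketch below assumes rectangular inputs with made-up amplitudes and timings and keeps the inhibition fixed across conditions; it only illustrates the computation attributed to LN4M (summation of excitation and inhibition followed by a threshold) and is not the fitted model.

```python
from typing import List, Sequence


def ln4_output(excitation: Sequence[float], inhibition: Sequence[float],
               threshold: float = 0.0) -> List[float]:
    """Sum excitation and inhibition per time step, then apply a threshold-linear output."""
    return [max(0.0, e - i - threshold) for e, i in zip(excitation, inhibition)]


def rectangular_input(n_steps: int, start: int, length: int, amplitude: float) -> List[float]:
    """Rectangular input that starts at `start` and lasts `length` time steps."""
    return [amplitude if start <= t < start + length else 0.0 for t in range(n_steps)]


n = 40
inhibition = rectangular_input(n, start=10, length=20, amplitude=0.6)

# Low duty cycle (hypothetical numbers): strong excitation that leads the inhibition.
early_strong = rectangular_input(n, start=5, length=10, amplitude=1.0)
# High duty cycle (hypothetical numbers): weaker excitation that falls inside the inhibition.
late_weak = rectangular_input(n, start=12, length=10, amplitude=0.7)

response_low_dc = sum(ln4_output(early_strong, inhibition))
response_high_dc = sum(ln4_output(late_weak, inhibition))
# Removing the inhibition restores the response to the weak, late excitation.
response_no_inhibition = sum(ln4_output(late_weak, [0.0] * n))
```

Under these assumptions, the late and weak excitation is almost fully cancelled by the inhibition, whereas the same excitation drives a clear response once the inhibition is removed.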

These results demonstrate that response suppression by inhibition in LN4M becomes more effective at high duty cycles and is one source of the network’s bias towards low duty cycle preferences (Figure 4C). We reasoned that other sources of response suppression, like inhibition or response adaptation elsewhere in the network, could further contribute to this bias.

Mechanisms of response suppression control duty cycle preference

Four additional computational steps in the network could contribute to the bias against high duty cycles (Figure 5G): first, the broad inhibitory lobe of the filter in AN1M (Figure 2A, Figure 2—figure supplement 1, Figure 2—figure supplement 2) reduces responses to subsequent pulses in a train (Figure 5G, green, Figure 2—figure supplement 1A and B) because its effect accumulates over multiple pulses. Importantly, this suppression grows with the integral of the stimulus over the duration of the filter lobe and hence with the pulse duration and duty cycle. In AN1M, this leads to shorter pulse responses due to thresholding and saturation by the output nonlinearity (Figure 2—figure supplement 1E, Figure 2—figure supplement 2). Second and third, the adaptation in AN1M and LN3M accumulates during and across pulses and reduces these neurons’ responses (Figure 5G, cyan and red). This effect is again most prominent for pulse patterns with high duty cycles (long pulses, short pauses) since adaptation will be strongest during long pulses and recovery prevented during short pauses. Last, as discussed above, the inhibition from LN2M also suppresses responses in LN4M most strongly for stimuli with high duty cycle (Figure 5G, purple).

To examine how the different sources of suppression shape the model’s tuning, we removed one or more of these computational steps: we set the inhibitory lobe of the AN1M filter to zero, we removed the adaptation from AN1M and LN3M, and we removed the inhibition forwarded from LN2M to LN4M (Figure 5G). To accentuate the effects of these manipulations, we increased the delay of the LN5M→LN3M connection, which led to a preference for longer pulse periods (50 ms) and for low duty cycles (0.25) in LN4M when all sources of suppression were present (Figure 5H, left). Consistent with the prediction that different sources of suppression in the network reduce responses for stimuli with high duty cycles, the network’s preferred duty cycle tended to increase when suppression was removed (Figure 5H and I, Figure 5—figure supplement 1). Removing some sources of suppression tended to induce more sustained responses during a pulse train (Figure 5G) and to increase the preferred duty cycle from 0.25 to 0.50 (Figure 5H and I). Removing all four sources of suppression abolished period tuning and produced a preference for constant tone stimuli with a duty cycle of 1.0 (Figure 5H, right). Different sources of suppression sometimes interacted in unexpected ways. For instance, removing the inhibitory lobe and adaptation in AN1M decreased rather than increased duty cycle preference because AN1M produced stronger responses when adaptation was absent, which in turn induced stronger adaptation downstream in the network.
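How a single source of suppression lowers the preferred duty cycle can be shown with a minimal adapting unit. The sketch assumes a threshold-linear unit with subtractive, input-driven adaptation implemented as a leaky integrator; the time constant, the adaptation strength, and the fixed pulse period of 40 ms are hypothetical choices, and the unit is not one of the model neurons. The train duration of 600 ms follows the simulations in Figure 5.

```python
from typing import List


def pulse_train(duration_ms: int, pause_ms: int, train_ms: int = 600) -> List[float]:
    """Binary pulse train sampled at 1 ms: 1.0 during pulses, 0.0 during pauses."""
    period = duration_ms + pause_ms
    return [1.0 if (t % period) < duration_ms else 0.0 for t in range(train_ms)]


def adapted_response(stimulus: List[float], strength: float, tau_ms: float = 50.0) -> float:
    """Summed response of a threshold-linear unit with subtractive adaptation."""
    adaptation = 0.0
    total = 0.0
    for s in stimulus:
        total += max(0.0, s - strength * adaptation)
        # Leaky integrator: builds up during pulses and recovers during pauses.
        adaptation += (s - adaptation) / tau_ms
    return total


def preferred_duty_cycle(strength: float, period_ms: int = 40) -> float:
    """Duty cycle with the largest summed response at a fixed pulse period."""
    durations = range(10, period_ms + 1, 10)  # a pause of 0 ms is a constant tone
    best = max(durations,
               key=lambda d: adapted_response(pulse_train(d, period_ms - d), strength))
    return best / period_ms


without_suppression = preferred_duty_cycle(strength=0.0)
with_suppression = preferred_duty_cycle(strength=1.0)
```

Without adaptation, this toy unit responds best to a constant tone (duty cycle 1.0); with adaptation, long pulses and short pauses are penalized and the preference moves to an intermediate duty cycle, in qualitative agreement with the trend described above.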

Overall, these results identify mechanisms of response suppression by adaptation and inhibition as a cause for the network preferring small duty cycles (short pulses and long pauses). They demonstrate how specific implementation details of a recognition mechanism constrain phenotypic diversity, but also reveal how different model parameters can be used to create phenotypic diversity, like changing the preferred duty cycle (Figure 5D, H, and I). As a last step in our analysis, we examined a parameter that switches the preference type from period to duration tuning via changes in the rebound dynamics.

Changes in rebound dynamics can switch the preference type by engaging suppression

Switches in preference type occur regularly even among closely related cricket species (Bailey et al., 2017; Hennig, 2003; Hennig et al., 2016), and we therefore looked for a model parameter that induced such a switch. We found that increasing the gain of the inhibitory lobe of the rebound filter in LN5M (‘inhibitory lobe’ in short) (Figure 6A, blue) switched the preference from period tuning (Figure 6B, bottom) to duration tuning (Figure 6B, top), characterized by high tolerance for pause and high selectivity for duration (Figure 1B). Increasing the inhibitory lobe parameter amplifies but also prolongs the rebound (Figure 6A, blue), and we examined which of these two changes creates the switch from period to duration tuning. Other model parameters – like the gain of the input from LN2M to LN5M or from LN5M to LN3M (Figure 6A and B, ‘input gain,’ red) or the LN5M output gain – only amplify the rebound and retain period tuning. This indicates that the amplification of the rebound is insufficient, and that the prolongation of the rebound is necessary to cause the preference switch. In the original model, rebounds are short because the inhibition of LN5M triggered by LN2M activity always cuts off and suppresses rebound responses (Figure 6A, black, see also Figure 2E and F), and this happens even when the rebound is amplified (Figure 6A, red). By contrast, in the model with an increased inhibitory lobe of the LN5M rebound filter, the rebound persists during the LN2M inhibition (Figure 6A, blue). The prolonged rebound drives stronger adaptation downstream in LN3M (Figure 6—figure supplement 1), in particular for pulse patterns with short pauses, because shorter pauses prevent recovery from adaptation in LN3M. This response suppression for short pauses abolishes the preference for intermediate pause durations necessary for period tuning and switches the preference type to duration tuning (Figure 6C). 
This last analysis highlights the dual role of suppression in shaping the recognition phenotype of the network: suppression constrains phenotypic diversity by reducing responses to patterns with high duty cycles (Figure 4C and D, Figure 5H and I), but it also contributes to phenotypic diversity by adjusting the network’s preferred duty cycle (Figure 5H and I) or by switching the preference type (Figure 6).

Figure 6 with 1 supplement
Changes in LN5M rebound dynamics induce a switch in response type.

(A) Increasing the amplitude of the inhibitory lobe of the filter in LN5M that generates the rebound (middle, blue, ‘inhibitory lobe’) increases the rebound’s amplitude and duration. By contrast, the gain of the input from LN5M to LN3M (bottom, red, ‘input gain’) scales the input from the LN5M rebound without prolonging it. Pictograms on the right show the parameters for the original model (black) and for a model with a 10-fold increase in the respective parameter value (blue and red). Traces show the rebound inputs from LN5M to LN3M for the pulse pattern shown on top (black traces, 20 ms pulse duration and 20 ms pause). (B) Response fields of LN4M for the original model (bottom), and for models with increased inhibitory lobe (top) or input gain (middle). Response magnitudes are color coded (see color bar, scaled to span the range of response magnitudes). Amplifying and prolonging the rebound by increasing the inhibitory lobe (top) produces duration tuning, while only amplifying the rebound via the input gain retains period tuning (middle). Vertical lines correspond to the stimuli for which pause tuning curves are shown in (C). (C) Pause tuning curves for LN4M at a pulse duration of 20 ms (see lines in B) reveal differential effects of the parameters on pause tuning. Amplifying and prolonging the rebound by increasing the inhibitory filter lobe (blue) produces high tolerance for pause duration, in this case high-pass tuning, which is required to obtain duration tuning. By contrast, only amplifying the rebound via the input gain (red) retains the preference for intermediate pauses characteristic for period tuning. The pulse trains had a duration of 600 ms and were interleaved by chirp pauses of 200 ms for all simulations to ensure that the stimuli in the response fields contained enough pulses even for long pulse durations and pauses.

Discussion

How diversity in intraspecific communication systems is shaped by neural networks in the sender and in the receiver is an open question. Here, we asked whether the song recognition network in G. bimaculatus (Schöneich et al., 2015) has the potential to generate the diversity of song recognition phenotypes known from crickets. In particular, we tested whether the delay line and coincidence-detector network in G. bimaculatus can be considered a ‘mother network’ for recognizing the species-specific song pattern in different cricket species (Figure 1; Hennig et al., 2014). A model of the neuronal network reproduced the neurophysiological and behavioral data using simple, elementary computations like filtering, nonlinear transfer functions (nonlinearities), adaptation, and linear transmission with delays (Figures 2 and 3). Examining the model’s responses over a wide range of parameter values revealed that the network can generate all types of song preferences known from different crickets and even other insects (Figure 4A–E). We then identified key parameters that either support or constrain the phenotypic diversity the network can produce, providing insight into how the network can evolve to become selective for different song parameters (Figures 4F–H, 5, and 6).

The delay line and coincidence-detector network can produce the full diversity of preferences for pulse duration and pause in crickets

Four principal preference types have been identified in crickets and other insects (Figure 1B): preference for pulse duration (Deutsch et al., 2019; Gray et al., 2016; Hennig, 2003), pulse pause (Schul, 1998), pulse period (Hennig, 2003; Hennig, 2009; Rothbart and Hennig, 2012; Rothbart et al., 2012), and duty cycle (Hennig et al., 2016). Variants of our network model produce all four of these preference types for the range of song parameter values relevant for crickets (Figure 4A–E).

While the network model analyzed here is derived from recordings in one species (G. bimaculatus), the delay line and coincidence-detector network is likely shared among closely related cricket species. The phylogenetic position of G. bimaculatus close to the base of the phylogenetic tree from which many other species emerged is consistent with this idea (Gray and Cade, 2000). Our finding that this network can produce all known preferences for pulse and pause supports this idea and suggests that it forms a common substrate – a ‘mother network’ – for the diversity of song recognition phenotypes in crickets. How can the ‘mother network’ hypothesis be tested? Behavioral tests can provide insight into whether other species use the coincidence-detection algorithm found in G. bimaculatus (Hedwig and Sarmiento-Ponce, 2017). These experiments can, for instance, test the prediction that the duration of the last pulse in a chirp only weakly impacts network responses. Species that violate this prediction are unlikely to recognize song by the same coincidence mechanism. However, the ‘mother network’ hypothesis does not imply that all crickets implement a coincidence-detection algorithm, just that they reuse the same neurons with largely conserved response properties. In fact, our analyses have shown that coincidence detection can be circumvented through changes in key parameters to produce a different preference type (Figure 6, Figure 4—figure supplement 1). That is why further electrophysiological experiments in G. bimaculatus are crucial to reveal the precise biophysical mechanisms that tune the network and ultimately link changes in gene expression, for instance, of specific ion channels, to changes in network tuning. Importantly, these experiments need to be extended to other species by identifying and characterizing homologues of the neurons in the network. 
Recordings in other species are challenging but feasible since homologous neurons are expected to be found in similar locations in the brain. Based on the known behavioral tuning, our model produces testable predictions for what key properties of these neurons may look like in any given species (see below).

Future studies will also show whether the network can explain more complex preference functions known from some crickets and also other insects. For instance, preference types that betray resonant cellular or network properties are known from katydids (Bush and Schul, 2005; Webb et al., 2007), and we find that the network can produce these multi-peaked response fields (Figure 4—figure supplement 2D), but it remains to be seen whether similar preference types exist in crickets. Several species of crickets produce complex songs that are composed of multiple types of pulse trains, and it is unclear whether the current network can reproduce the known behavioral preference for such complex songs (Bailey et al., 2017; Cros and Hedwig, 2014; Hennig and Weber, 1997).

In addition, we have not yet explored the network’s ability to reproduce the behavioral selectivity for parameters on the longer timescale of chirps (Figure 1A; Grobe et al., 2012; Blankers et al., 2016; Hennig et al., 2016). It is likely that the network can explain some properties of the selectivity for chirp known from crickets. For instance, that a minimum of two pulses is required to produce coincidence in the network could at least partly explain the existence of a minimal chirp duration for G. bimaculatus (Grobe et al., 2012). Likewise, suppression in the network reduces responses to long chirps, which could explain the reduced behavioral preference for long chirps. However, the current electrophysiological data do not sufficiently constrain responses at these long timescales and studies are needed to address this issue more comprehensively.

Lastly, our model does not address the behaviorally well-documented inter-individual variability in phonotaxis (Grobe et al., 2012; Meckenhäuser et al., 2014), which likely arises at multiple levels: at the level of song pattern recognition (inter-individual differences in the network parameters), at the level of phonotaxis behavior (biases and noise in localizing the sound), at the motivational level (low or high motivation leads to more or less selective responses), and at the motor level (variability from motor noise). Identifying the contribution of these different levels is challenging since the full characterization of the behavioral phenotype in terms of the response fields cannot be obtained reliably at the individual level because the stimulus space is too large. Therefore, our model of song pattern recognition does not explicitly consider the inter-individual variability but is meant to represent the behavior of an average female.

How to tune a pulse pattern detector?

Our sensitivity analysis of the model identified three classes of parameters that define the model’s tuning (Figure 4H). First, parameters that control the relative timing of inputs to the coincidence-detector LN3M set the network’s preferred pulse period. These include input delays in all upstream neurons (Figure 5A and B) but also passive and active membrane properties that delay the rebound responses in LN5M (Figure 5—figure supplement 1). Second, parameters that lead to a stronger and more sustained rebound in LN5M can shift the preference from pulse period to pulse duration tuning (Figure 6). Lastly, sources of response suppression, like inhibition or adaptation, reduce responses to long pulses and high duty cycles (Figure 5G–I). These three classes of parameters account for changes within and transitions across the principal types of song preference in crickets. The model thus provides testable hypotheses for how response properties in the neuronal network may have evolved to compute the preference functions of different species. For instance, species that prefer different pulse periods than G. bimaculatus like Teleogryllus leo (Rothbart and Hennig, 2012), Gryllus locorojo (Rothbart et al., 2012), Gryllus firmus (Gray et al., 2016), or Teleogryllus oceanicus (Hennig, 2003) could differ in the delays of inputs to LN3 (Figure 5B). Duty cycle-tuned species like G. lineaticeps or G15 (Hennig et al., 2016) may exhibit weaker suppression throughout the network, for instance, reduced adaptation in LN3 (Figure 5G–I). Lastly, species with duration tuning such as T. commodus (Hennig, 2003) or G13 (Gray et al., 2016) could exhibit longer and stronger rebound responses in LN5 (Figure 6).

How can these changes be implemented in a biological network? Although our phenomenological model is independent of a specific biophysical implementation, all model components have straightforward biophysical correlates. We can therefore propose biophysical parameters that tune specific aspects of the network in a given implementation. To illustrate this point, we will briefly provide examples of how the four elementary computations of the model – filtering, adaptation, nonlinear transfer functions (nonlinearities), and linear transmission with a delay – can be implemented and tuned. First, filters are shaped by active and passive properties of the membrane: individual filter lobes act as low-pass filters that dampen responses to fast inputs and arise from integrating properties of the passive membrane like capacitive currents (Dewell and Gabbiani, 2019; Azevedo and Wilson, 2017). Increasing the membrane capacitance therefore leads to stronger low-pass filtering. An added negative (inhibitory) lobe makes the stimulus transformation differentiating and can arise from conductances that hyperpolarize the membrane, like potassium or chloride channels (Nagel and Wilson, 2011; Slee et al., 2005; Lundstrom et al., 2008). Increasing the potassium conductance in Drosophila olfactory receptor neurons makes their responses more differentiating (Nagel and Wilson, 2011), while reducing the conductance of delayed-rectifier potassium channels in the auditory brainstem makes responses less differentiating (Slee et al., 2005; Lundstrom et al., 2008). The filter in LN5M that produces the post-inhibitory rebound arises from hyperpolarization-activated cation currents like Ih (mediated by HCN non-selective cation channels) and It (mediated by T-type calcium channels), which control the PIR’s amplitude and latency, respectively (Pape, 1996; Engbers et al., 2011). Second, adaptation is implemented in the model either via inhibitory filter lobes or via divisive normalization. 
Biophysically, adaptation can arise from synaptic depression (Tsodyks et al., 1998; Fortune and Rose, 2001), or from subthreshold or spike-frequency adaptation (Farkhooi et al., 2013; Nagel and Wilson, 2011; Benda and Herz, 2003; Benda and Hennig, 2008). Spike-frequency adaptation can arise from inactivating sodium currents, voltage-gated potassium currents (M-type currents), or calcium-gated potassium currents (AHP currents) (Slee et al., 2005; Heidenreich et al., 2011). Increasing the expression level of these channels controls the strength of adaptation, while the kinetics specific to each channel type control adaptation speed. AHP currents can last seconds if spiking leads to a long-lasting increase in the intracellular calcium concentration, giving rise to long inhibitory filter lobes or adaptation time constants in the model (Cordoba-Rodriguez et al., 1999). Third, nonlinearities translate the integrated synaptic input to firing output. The nonlinearity’s threshold is governed by the density of sodium channels at the spike-initiating zone while the steepness and saturation of the nonlinearity depend on the inactivation kinetics of sodium channels or the spiking dynamics controlled by the Na/K ratio (Prescott et al., 2008; Lundstrom et al., 2008). Lastly, transmission delays arise from axonal conduction and synaptic delays but also other mechanisms, like low-pass filtering of the membrane voltage at the pre- and postsynapse (Creutzig et al., 2010; Zhou et al., 2019), latencies arising from integration of inputs to the spiking threshold, or from spike generation (Izhikevich, 2004; Figure 5—figure supplement 1). Synaptic weights are set by the number of synaptic boutons between two neurons (Peters’ rule, see Rees et al., 2017) and the amount of neurotransmitter that can be released at the presynapse (vesicle number and loading) and absorbed at the postsynapse (number of transmitter receptors). 
These examples demonstrate that our phenomenological model has a straightforward, physiologically plausible implementation and can propose experimentally testable hypotheses for transitions between types of behavioral preferences.

The three computations that define model tuning – response suppression, post-inhibitory rebounds, and coincidence detection – occur across species and modalities. Delay lines are prominent in binaural spatial processing (Schnupp and Carr, 2009) but have also been implicated in visual motion detection (Borst and Helmstaedter, 2015) or pulse duration selectivity in vertebrate auditory systems (Aubie et al., 2012; Buonomano, 2000). Suppression is known to act as a high-pass filter for pulse repetition rates (Baker and Carlson, 2014; Benda and Herz, 2003; Fortune and Rose, 2001) that in our case biases the network towards responding to rapidly changing patterns, like those with short pulses (Figure 4C). Finally, post-inhibitory rebounds have been implicated in temporal processing in different species like honeybees (Ai et al., 2018), fish (Large and Crawford, 2002), frogs (Rose et al., 2015), or mammals (Felix et al., 2011; Kopp-Scheinpflug et al., 2018). The computations found to control the song preference in G. bimaculatus could therefore also govern pattern recognition in other acoustically communicating animals. For instance, different bird species produce songs with silent gaps of species-specific durations between syllables, and auditory neurons in the bird’s brain are sensitive to these gaps (Araki et al., 2016). Our analyses suggest that the gap preference can be shifted from longer gaps (low duty cycle: Java sparrow, Bengalese finch) to shorter gaps (high duty cycle: starling, zebra finch) by reducing suppression or adaptation in the network. This could be implemented, for example, by reducing postsynaptic GABA receptors or by lowering the expression levels of voltage-gated potassium channels.

We here focused on modifying the magnitude of parameters, corresponding, for instance, to the expression levels of neurotransmitters or ion channels. Neuronal networks, however, can also evolve to produce novel phenotypes by changing their topology, through a recruitment of novel neurons, a gain or loss of synapses, or switches in synapse valence from excitatory to inhibitory as has been shown in motor networks in Caenorhabditis elegans (Hong et al., 2019) and snails (Katz, 2011). In addition, we have not considered neuromodulators, which can rapidly alter network tuning (Bargmann, 2012; Marder, 2012; Marder et al., 2014), and which likely play a functional role in the phonotactic response (Poulet and Hedwig, 2005).

Algorithmic details specify constraints

Previous studies revealed that Gabor filters can produce the full diversity of song preference functions found in insects (Clemens and Hennig, 2013; Clemens and Ronacher, 2013; Hennig et al., 2014). However, the computation giving rise to Gabor filters can be implemented with multiple algorithms, each subject to specific constraints. For instance, the period tuning found in G. bimaculatus can be produced by the now known delay line and coincidence detection mechanism (Schöneich et al., 2015), but also by the interplay between precisely timed excitation and inhibition (Aubie et al., 2012; Rau et al., 2015), by cell-intrinsic properties like resonant conductances (Azevedo and Wilson, 2017; Rau et al., 2015), or by a combination of synaptic depression and facilitation (Fortune and Rose, 2001). By considering the implementation of the pattern recognition algorithm in a particular species, we revealed a bias in the diversity of phenotypes that this specific implementation can produce: several sources of suppression induce a bias towards preference for low duty cycle stimuli (Figures 4B–D and 5G–I). This highlights the importance of studying nervous system function and evolution beyond the computational level at the level of algorithms and implementations (Marr, 1982).

Functional tradeoffs limit behavioral diversity

The low duty cycle bias present in the recognition mechanism of G. bimaculatus has several implications for the evolution of song preference in crickets and elsewhere: perceptual biases that have evolved in contexts like food or predator detection are known to shape sexual selection (Guilford and Dawkins, 1993; Ter Hofstede et al., 2015; Phelps and Ryan, 1998; Ryan and Cummings, 2013). In the case of song recognition in crickets, suppression (adaptation, inhibition, onset accentuation, Figure 5G) reduces neuronal responses to long-lasting tones and likely evolved to save metabolic energy (Niven, 2016) or to make song recognition more robust to changes in overall song intensity (Benda and Hennig, 2008; Hildebrandt et al., 2011; Schöneich et al., 2015). As a side effect, adaptation now biases the song recognition mechanism towards preferring pulse trains with low duty cycles (Figure 5H and I), which is consistent with the apparent absence of pause tuning in crickets (Hennig et al., 2014). Interestingly, pause tuning is known from katydids (Schul, 1998), suggesting that their song recognition system is not subject to the low duty cycle bias. Katydids may have avoided the low duty cycle bias either by using a delay line and coincidence detection network like that found in G. bimaculatus but with weaker suppression (Figure 5G–I) or by using a different network design that is subject to different constraints (Bush and Schul, 2005). Thus, computations that increase energy efficiency and robustness can constrain the phenotypic diversity of a whole species group.

From evolutionary pattern to process

How can a diversity of neural networks evolve to drive the diversification of species-specific communication signals? Our modeling study of the song recognition network in the cricket brain provides first evidence that the underlying neuronal network is computationally flexible: by adapting physiological parameters, the network can produce all preference types described in crickets (Figure 4B–E). The computational flexibility of the recognition mechanism may explain the species richness as well as the speed of evolution in a particular taxon like crickets (Alexander, 1962; Blankers et al., 2015; Desutter-Grandcolas and Robillard, 2003; Oh and Shaw, 2013; Otte, 1992): female preferences drift around with little constraint in signal space, maybe pushed by abiotic (environmental noise selects against preferences for very short pauses) and biotic factors (avoid overlap with heterospecifics, Amezquita et al., 2011). The male song evolution follows changes in the female’s preference since only males that sing attractive song will reproduce. In this scenario, a female network that has the capacity to produce many different preference types supports the divergence of the communication system. However, this co-evolution of song preference and song structure requires male song production networks to be as flexible as the female song recognition networks (Jacob and Hedwig, 2020; Schöneich, 2020). There is also accumulating evidence for genetic coupling between the networks that produce and recognize the song pattern, which may ensure that sender and receiver stay tuned during evolution of song pattern (Xu and Shaw, 2019; Schöneich, 2020; Xu and Shaw, 2021).

Computational flexibility also needs evolvability – the ability to generate specific and robust phenotypes during evolution (Blankers et al., 2015). The ‘mechanistic degeneracy’ of computation in biological neural networks – the fact that specific network outputs can be obtained by tuning disparate parameters – is thought to be a prerequisite for evolvability (Wagner, 2011, see also Hasson et al., 2020; Leonardo, 2005; Schneider et al., 2021). The degeneracy concept has been demonstrated extensively for motor networks (Gutierrez et al., 2013; Prinz et al., 2004), and here we also found evidence of degeneracy in the cricket song recognition network. For instance, the preferred pulse period (Figure 5A and B, Figure 5—figure supplement 1) or the pulse duty cycle (Figure 5G–I) can be set by multiple parameters. The hypothesis that high-dimensional and degenerate biological parameter spaces are more evolvable could be tested by assessing the computational flexibility of a minimal model of the network that produces the observed behavior with as few parameters as possible.

If a neural network is flexible, how can it maintain a robust phenotype within a species? We find that many parameters have only weak impacts on the network’s preference, leading to a relatively large fitness-neutral parameter set and a robust phenotype (Prinz et al., 2004). For instance, we observed that the rebound amplitude in LN5 tends to be variable across electrophysiological recordings in different individuals but that the network model’s output is relatively robust to these changes (Figure 2B, Figure 2—figure supplement 4). Other parameters crucially affect the recognition phenotype and support evolvability. These fitness-critical parameters likely drive changes in the recognition phenotype during speciation events to ensure species discrimination (Amezquita et al., 2011; Schmidt et al., 2011). Future studies will employ novel computational methods for characterizing the high-dimensional parameter space (Bittner et al., 2021; Gonçalves et al., 2020) to more comprehensively characterize the fitness-neutral parameter set leading to interindividual variability and to provide insight into the robustness and evolvability of pattern recognition networks.

Materials and methods

Electrophysiological data

The data used for fitting the model come from intracellular recordings with sharp microelectrodes of AN1, LN2, LN3, LN4, and LN5 and are published in Kostarakos and Hedwig, 2012 and Schöneich et al., 2015. They include (1) 12 stimuli with a pulse duty cycle of ~0.5 and periods ranging from 10 to 98 ms (Figure 2B), (2) 10 stimuli with a pulse period of 40 ms and duty cycles ranging between 0.1 and 1.0 (Figure 2C), and (3) 12 stimuli with a pulse duration of 20 ms and pauses varying between 6 and 90 ms (Figure 2D). During the electrophysiological experiments, each pulse train was presented interleaved by a chirp pause of 230 ms.

Quantification of tuning in the recordings and the model responses

In the recordings, spikes were detected using custom routines that found peaks in the voltage traces robust to changes in baseline. The accuracy of spike detection was checked by visual inspection of the voltage traces. From the spike times, average firing rates were calculated by dividing the number of spikes produced during each chirp (pulse train and chirp pause) by the chirp period (Figure 2B–D). For the non-spiking LN5, the response corresponds to the voltage of the rebound response. Voltage rates were obtained by first calculating a baseline voltage as the average voltage in the 25 ms preceding a given pulse train, and then integrating the supra-threshold components of the voltage. This integral voltage was then divided by the chirp period to get a rate. Note that the tuning curves for LN5 are not very accurate because the rebound responses have a low amplitude and the baseline voltage fluctuated considerably during recordings of LN5.

For the model, tuning curves and response fields were calculated as for the recordings – by integrating the firing rate outputs of the spiking neurons in the model or the rebound voltage for the non-spiking LN5 and dividing the resulting values by the chirp period.
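The two response measures described in this section can be illustrated with a minimal sketch. It assumes a sampling interval of 1 ms and reads the "supra-threshold components" of the LN5 voltage as the parts of the trace above the pre-stimulus baseline; the function names and the toy trace are hypothetical and not taken from the published analysis code.

```python
def firing_rate(spike_times_ms, chirp_start_ms, chirp_period_ms):
    """Spikes per second within one chirp period (pulse train plus chirp pause)."""
    chirp_end_ms = chirp_start_ms + chirp_period_ms
    n_spikes = sum(1 for t in spike_times_ms if chirp_start_ms <= t < chirp_end_ms)
    return n_spikes / (chirp_period_ms / 1000.0)


def voltage_rate(voltage_mv, train_start, chirp_period, baseline_window=25):
    """Integrated depolarization above baseline per chirp period (1 sample = 1 ms)."""
    if train_start < baseline_window:
        raise ValueError("need at least `baseline_window` samples before the pulse train")
    # Baseline: mean voltage in the window preceding the pulse train.
    pre = voltage_mv[train_start - baseline_window:train_start]
    baseline = sum(pre) / len(pre)
    # Integrate only the parts of the trace that exceed the baseline (assumption).
    segment = voltage_mv[train_start:train_start + chirp_period]
    integral = sum(max(v - baseline, 0.0) for v in segment)
    return integral / chirp_period


# Toy trace: 25 ms baseline, 10 ms depolarized by 2 mV, 10 ms hyperpolarized by 2 mV.
trace = [-60.0] * 25 + [-58.0] * 10 + [-62.0] * 10
rate = voltage_rate(trace, train_start=25, chirp_period=20)
```

Only the depolarized samples contribute to the integral, so hyperpolarization caused by the inhibitory input does not cancel the rebound in this sketch.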

Model inputs and simulation

We built a rate-based, phenomenological model of the song recognition network in G. bimaculatus. Sound inputs were presented as the time-varying amplitude of pulse trains with the specified pulse and chirp structure. Model responses were simulated with a temporal resolution of 1 ms (sampling rate of 1 kHz) and tested as if pulse trains and chirp pauses were repeated endlessly.
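A stimulus of this kind can be sketched as a binary amplitude envelope sampled once per millisecond. The function name, the unit amplitude, and the 140 ms chirp duration are illustrative assumptions; the 230 ms chirp pause follows the recordings described above.

```python
def pulse_train(pulse_dur_ms, pause_ms, chirp_dur_ms, chirp_pause_ms, n_chirps=1):
    """Amplitude envelope (one sample per ms) of a chirped pulse train."""
    period = pulse_dur_ms + pause_ms
    # Within a chirp, the envelope is 1 during each pulse and 0 during each pause.
    chirp = [1.0 if (t % period) < pulse_dur_ms else 0.0 for t in range(chirp_dur_ms)]
    silence = [0.0] * chirp_pause_ms
    # Repeating the chirp period approximates an endlessly repeated song.
    return (chirp + silence) * n_chirps


# 20 ms pulses and 20 ms pauses give a 40 ms pulse period and a duty cycle of 0.5.
stim = pulse_train(pulse_dur_ms=20, pause_ms=20, chirp_dur_ms=140, chirp_pause_ms=230)
duty_cycle = 20 / (20 + 20)
```

Increasing `n_chirps` and discarding the first repetitions is one simple way to approximate the steady-state response to an endlessly repeated song.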

Elementary computations

The model was built from four elementary neural computations. In the following, we will first define the elementary computations and then describe how they were combined to model each neuron in the network. Neurons in the model are referred to with a subscript M.

Filtering

Filtering was implemented via $y(t)=\sum_{\tau} h(\tau)\,x(t-\tau)$, where x(t) and y(t) are the stimulus and response at time t, respectively, and τ is a temporal delay. The causal filter h(τ) was constructed from discrete Gaussian and exponential kernels. A discrete Gaussian kernel was defined as $h_g=e^{-(\tau-N/2)^2/(4\sigma^2)}$ with support N, where $0\le\tau\le N$, and width α such that $\sigma=(N-1)/(2\alpha)$. Exponential kernels were defined as $h_e=e^{-\tau/\gamma}/\gamma$, with support N, where $0\le\tau\le N$, and decay γ. Note that throughout the article we use ‘filter duration’ as a non-technical synonym for ‘support,’ which specifies the time over which the filter is defined. Gaussian (Figure 2—figure supplement 1A) and exponential filters are low-pass or integrating filters, which smooth the stimulus (Figure 2—figure supplement 1B). The filters were not normalized. Bandpass filters were implemented as biphasic filters by either differentiating a Gaussian filter, $\Delta h(\tau)=h(\tau)-h(\tau-1)$ (Figure 2—figure supplement 1A), or by subtracting two Gaussian and/or exponential filters to form the positive (excitatory) and negative (inhibitory) filter lobes. Parameterizing biphasic filters as combinations of Gaussians and/or exponentials provides greater and more direct control over the timing and the gain of the filters’ excitatory and inhibitory lobes. Differentiating filters respond most strongly to changes in the stimulus, for instance, to the onsets or offsets of pulses (Figure 2—figure supplement 1B). The support parameter was initialized to be sufficiently long and was typically fixed during model fitting (Table 1).
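The kernel definitions and the causal filtering step can be sketched as follows. This is a minimal illustration under stated assumptions, not the published MATLAB code: the kernels are sampled at τ = 0, …, N, the stimulus is treated as zero before t = 0, h(−1) is taken as 0 when differentiating, and all function names and parameter values are hypothetical.

```python
import math


def gaussian_kernel(N, alpha):
    """Unnormalized discrete Gaussian with support N and width parameter alpha."""
    sigma = (N - 1) / (2 * alpha)
    return [math.exp(-((tau - N / 2) ** 2) / (4 * sigma ** 2)) for tau in range(N + 1)]


def exponential_kernel(N, gamma):
    """Discrete exponential kernel with support N and decay gamma."""
    return [math.exp(-tau / gamma) / gamma for tau in range(N + 1)]


def differentiate(h):
    """Biphasic kernel: delta_h(tau) = h(tau) - h(tau - 1), with h(-1) taken as 0."""
    return [h[0]] + [h[i] - h[i - 1] for i in range(1, len(h))]


def causal_filter(x, h):
    """y(t) = sum over tau of h(tau) * x(t - tau), using only past and present input."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for tau in range(min(t, len(h) - 1) + 1):
            acc += h[tau] * x[t - tau]
        y.append(acc)
    return y


# A step stimulus: the differentiating filter responds most strongly near the onset.
step = [0.0] * 10 + [1.0] * 40
lowpass = causal_filter(step, gaussian_kernel(10, 2.5))
onset = causal_filter(step, differentiate(gaussian_kernel(10, 2.5)))
```

With these example values the low-pass output rises and then plateaus during the step, whereas the differentiated kernel peaks shortly after the onset and then settles to a smaller sustained value.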

Nonlinear transfer functions

Nonlinear transfer functions (nonlinearities) transform inputs to implement thresholding or saturation. The model employs two types of nonlinearities: a rectifying nonlinearity thresholds the input x at the threshold value x0 and scales it with a gain β:

$$y=\begin{cases} x\beta & \text{if } x>x_0\\ 0 & \text{otherwise}\end{cases}$$

(Figure 2—figure supplement 1D). In many cases, the threshold parameter was used to constrain signals to be purely positive or negative, and it was therefore often fixed to 0. A sigmoidal nonlinearity combines a soft threshold with saturation: $y=y_0+y_{\max}/(1+e^{-ax-b})$ with gain a, shift b, and minimal and maximal output $y_0$ and $y_{\max}$, respectively (Figure 2—figure supplement 1E).
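Both transfer functions are short enough to write out directly. The sketch below assumes that the rectifier passes βx for inputs above the threshold x0 and that the sigmoid exponent is −ax − b; these readings of the formulas, like the function names and example values, are assumptions.

```python
import math


def rectify(x, x0=0.0, beta=1.0):
    """Rectifying nonlinearity: scale inputs above the threshold x0, silence the rest."""
    return beta * x if x > x0 else 0.0


def sigmoid(x, a, b, y0, ymax):
    """Sigmoidal nonlinearity: soft threshold (gain a, shift b) with saturation."""
    return y0 + ymax / (1.0 + math.exp(-a * x - b))


# Example: a rectifier with the threshold fixed to 0 keeps only positive inputs.
positive_part = [rectify(v) for v in (-2.0, -0.5, 0.0, 0.5, 2.0)]
```

With the threshold fixed to 0, the rectifier restricts a signal to its positive part, which is how firing rates are kept non-negative in several of the model neurons.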

Linear transmission with delay

Neuronal connections transmit neural activity linearly with delay Δ and gain α. Multiple inputs to a neuron are added. The gain α is negative for inhibitory and positive for excitatory synapses and controls the input strength. The delay parameter Δ corresponds to the delay that needs to be added to the input of a neuron to reproduce the timing of the output of that neuron. It includes axonal conduction or synaptic transmission delays but also other delay mechanisms, like low-pass filtering of the membrane voltage at the pre- and postsynapse (Creutzig et al., 2010; Zhou et al., 2019) or latencies arising from integration of inputs to the spiking threshold or from spike generation.
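Linear transmission amounts to shifting and scaling a signal and then summing all inputs to a neuron. A minimal sketch, assuming delays are whole numbers of samples (1 ms each) and that the signal is zero before stimulus onset; the function names and values are hypothetical.

```python
def transmit(x, delay, gain):
    """Delay x by `delay` samples (zero padded at the start) and scale it by `gain`."""
    delayed = ([0.0] * delay + list(x))[:len(x)]
    return [gain * v for v in delayed]


def sum_inputs(*inputs):
    """Add several equally long input signals sample by sample."""
    return [sum(values) for values in zip(*inputs)]


presynaptic = [1.0, 2.0, 3.0, 0.0]
excitation = transmit(presynaptic, delay=0, gain=1.0)   # positive gain: excitatory
inhibition = transmit(presynaptic, delay=1, gain=-0.5)  # negative gain: inhibitory
net_input = sum_inputs(excitation, inhibition)
```

In this toy case the delayed inhibitory copy subtracts half of the previous sample from each excitatory sample.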

Adaptation

Adaptation in AN1M and LN2M is implemented using differentiating filters (Figure 2—figure supplement 1A and B). Such filters produce adaptation via their broad inhibitory lobe, which suppresses subsequent responses (Nagel and Wilson, 2011). For AN1M and LN3M, adaptation effects were better captured using divisive normalization (Figure 2—figure supplement 1F). The input to the adaptation stage, x, is low-pass filtered using an exponential filter to generate an adaptation signal $x_{\mathrm{ada}}$ that divides the input: $y=x/(x_0+w\,x_{\mathrm{ada}})$. Here, $x_0$ and w control the strength of adaptation while the width of the exponential filter, γ, controls the timescale of adaptation. Both implementations of adaptation – differentiating filters and divisive normalization – can produce similar adaptation time courses. However, since adaptation with differentiating filters acts subtractively, the negative response components accumulate over time to a negative filter output (Figure 2—figure supplement 1F). For modeling adaptation on longer timescales, this can completely suppress responses to subsequent stimuli. This complete suppression does not occur with divisive normalization since it only scales the response.
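Divisive normalization can be sketched in a few lines. The example assumes a 1 ms time step, an exponential kernel of the form e^(−τ/γ)/γ truncated at a finite support, and an adaptation signal that includes the current sample; the function name and parameter values are hypothetical.

```python
import math


def divisive_adaptation(x, x0, w, gamma, support):
    """y = x / (x0 + w * x_ada), where x_ada is the exponentially low-pass filtered input."""
    kernel = [math.exp(-tau / gamma) / gamma for tau in range(support + 1)]
    y = []
    for t in range(len(x)):
        # Adaptation signal: causal low-pass filtered copy of the input.
        x_ada = sum(kernel[tau] * x[t - tau] for tau in range(min(t, support) + 1))
        y.append(x[t] / (x0 + w * x_ada))
    return y


# A sustained input is scaled down over time but never driven below zero.
sustained = [1.0] * 200
adapted = divisive_adaptation(sustained, x0=1.0, w=1.0, gamma=10.0, support=100)
```

Because the adaptation signal only rescales the input, a non-negative input yields a non-negative output, in contrast to the subtractive effect of a differentiating filter.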

Model neurons

The elementary computations described above were combined to reproduce the response of each neuron in the network – the firing rate patterns in the case of spiking neurons and the pattern of the rebound responses in the non-spiking LN5M (Figure 2E). All filters and nonlinearities are plotted in Figure 2A, and all parameter names and values are listed in Table 1 and graphically defined in Figure 2—figure supplement 1 and Figure 2—figure supplement 2. The model was implemented in MATLAB. Code for running the model is available at https://github.com/janclemenslab/cricketnet (Clemens, 2021; copy archived at swh:1:rev:73ff788143c6afa3adbbb3e0f61f600a13052352).

AN1M (relay)

AN1 is the input neuron of the network. It faithfully copies the pulse structure and adapts weakly (Figure 2E). These response properties were reproduced using a differentiating linear filter (Figure 2—figure supplement 1): the filter’s narrow and strong excitatory lobe captures the fast and faithful stimulus responses, while the broad inhibitory lobe captures the delayed suppressive effect of past stimulus epochs. AN1M’s filter was generated by concatenating two Gaussians as the excitatory and inhibitory lobes. AN1M has a sigmoidal nonlinearity that saturates responses to longer pulses. To capture the decrease of the response across pulses, we additionally included a divisive normalization stage.

LN2M (inhibition)

LN2 responds reliably to individual pulses, similar to AN1, but exhibits stronger adaptation during the pulse. This was reproduced using a differentiating filter, composed of a truncated Gaussian function as the excitatory lobe and an exponential function as the inhibitory lobe. A rectifying nonlinearity restricts outputs to be positive.

LN5M (post-inhibitory rebound)

LN5 is a non-spiking neuron. It receives inhibitory input coupled to the spike activity of LN2 and generates a rebound depolarization when the inhibition terminates (Figure 2E). The inhibitory input from LN2 over a pulse saturates in LN5, which is modeled using a differentiating input filter followed by a rectifying nonlinearity that restricts the inputs to be negative. The rebound is then produced using a differentiating filter with a narrow excitatory lobe, which further low-pass filters the inhibitory input, and a broad inhibitory lobe, which generates the broad positive voltage deflections at the offset of negative voltage deviations. See Figure 2—figure supplement 1C for an illustration of the principle by which the rebound is produced in the model. The filter was generated by concatenating two exponential filters as negative and excitatory lobes.

LN3M (coincidence detection)

LN3 responds strongly when two excitatory synaptic inputs arrive at the same time: a short-latency input from AN1 and a delayed input from LN5. In the neurophysiological data, the input from AN1 alone is often sufficient to drive spiking (e.g., to the first pulse of a train), suggesting that LN3 does not require two coincident inputs to spike. We therefore define LN3M as a linear adder (with a threshold) that receives stronger input from AN1M than from LN5M. From the LN5M responses, only the positive components corresponding to the rebound were passed on as the input to LN3M; these were separated from the negative components of the LN5M responses using a rectifying nonlinearity (Figure 2—figure supplement 1D). The sum of the two inputs is thresholded using a rectifying nonlinearity before being passed into an adaptation stage with divisive normalization. The adaptation after summation reduces LN3M firing over a pulse train, and a final rectifying nonlinearity ensures positive firing rates by cutting off all negative responses that can occur in the model, for instance, due to filtering (Figure 2—figure supplement 1B).
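The summation stage of LN3M can be illustrated with a toy example. The sketch covers only the rectification of the LN5M input, the weighted sum, and the threshold; the adaptation stage and the final rectifier are omitted, and the weights, threshold, and input values are hypothetical rather than fitted parameters.

```python
def ln3_drive(an1, ln5, w_an1=1.0, w_ln5=0.5, threshold=0.0):
    """Thresholded sum of the AN1 input and the rebound (positive) part of LN5."""
    out = []
    for a, v in zip(an1, ln5):
        rebound = v if v > 0.0 else 0.0          # discard the inhibitory component
        total = w_an1 * a + w_ln5 * rebound      # stronger weight on the AN1 input
        out.append(total if total > threshold else 0.0)
    return out


an1 = [1.0, 0.0, 1.0, 0.0]    # pulse onsets relayed by AN1
ln5 = [-1.0, 0.0, 1.0, 1.0]   # inhibition first, rebound later
drive = ln3_drive(an1, ln5)
```

In this toy example, the AN1 input alone produces a response, as at the first pulse of a train, and the response is largest when the AN1 input coincides with the LN5 rebound.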

LN4M (feature detection)

Finally, LN4M adds excitatory input from LN3M and inhibitory input from LN2M, which sharpens its selectivity for pulse patterns as compared to LN3M. A rectifying nonlinearity restricts the output firing rate to be positive.

Prediction of phonotactic behavior from network responses

For regular pulse trains, the firing rate of LN4 (number of spikes per chirp/chirp duration) predicts phonotactic behavior very well (Schöneich et al., 2015; Figure 1E). We therefore used the firing rate of LN4M to predict behavioral responses from network output.

Model fitting

Model parameters were optimized in two phases. First, the model parameters and structure were initialized by hand. This initialization established which computational steps were necessary to reproduce the key response features of the neurons in the network – their response dynamics and tuning – and determined initial parameter values. In the second phase, we used a genetic algorithm to tune the network parameters to optimally fit the data; see Clemens et al., 2014. To simplify fitting, we exploited the feed-forward topology of the network: instead of fitting all parameters simultaneously, we adopted a stepwise procedure, in which the model neurons were fitted in order of their position in the network. We started by fitting the parameters of the first neuron in the network, AN1M, to reproduce the firing rate traces of AN1. In a second step, we held the parameters of AN1M constant and fitted the parameters of LN2M. We then held the parameters of AN1M and LN2M constant and fitted the temporal dynamics of the rebound response in LN5M but not the inhibitory component, since the precise magnitude and dynamics of inhibitory components were highly variable across recordings and since the rebound constitutes the effective output of LN5. We then fitted the parameters of LN3M and finally of LN4M, again holding those of the upstream neurons constant. For these last two neurons, firing was very sparse and irregularly timed across trials, which complicated fitting. We therefore used information from the firing rate dynamics and from the tuning curves for optimizing the parameters of LN3M and LN4M by using mixed error functions: $E=E_{\mathrm{MSE}}^{\mathrm{traces}}+w\,E_{\mathrm{MAE}}^{\mathrm{tuning}}$, where $E_{\mathrm{MSE}}^{\mathrm{traces}}$ is the mean square error between the predicted and actual traces, $E_{\mathrm{MAE}}^{\mathrm{tuning}}$ is the mean absolute error between the predicted and the actual tuning curves, and w was set to 0.1 for LN3M and to 1.0 for the even sparser activity of LN4M. For the other neurons in the network (AN1M, LN2M, LN5M), only $E_{\mathrm{MSE}}^{\mathrm{traces}}$ was used (equivalent to w = 0).
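The mixed error used for the two sparsely firing neurons can be written compactly. The sketch below assumes that traces and tuning curves are equally long lists of numbers; the function names and toy values are hypothetical, and the genetic algorithm that minimizes the error is not shown.

```python
def mean_squared_error(pred, actual):
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual)


def mean_absolute_error(pred, actual):
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(actual)


def mixed_error(pred_trace, trace, pred_tuning, tuning, w):
    """Trace error (MSE) plus w times tuning-curve error (MAE); w = 0 uses traces only."""
    return (mean_squared_error(pred_trace, trace)
            + w * mean_absolute_error(pred_tuning, tuning))


# Toy values: w = 0.1 as for LN3M, w = 1.0 as for LN4M, w = 0 for the other neurons.
err_ln3 = mixed_error([1.0, 2.0], [1.0, 4.0], [0.5], [1.0], w=0.1)
err_trace_only = mixed_error([1.0, 2.0], [1.0, 4.0], [0.5], [1.0], w=0.0)
```

A larger w gives the tuning curve more influence, which helps when sparse and irregularly timed firing makes the trace error alone an unreliable guide.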

The initial parameters for fitting at each step were drawn from an exponential distribution ranging between 0.1 and 10 around the hand-fitted parameters. The exponential distribution was chosen to have an even number of parameters above and below the hand-fitted values, that is, a uniform distribution after log scaling. While the fitting algorithm is stochastic by nature, it repeatedly found similar optimal parameter sets on independent initializations (not shown).
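One way to obtain starting values that are uniform after log scaling is to multiply each hand-fitted value by a factor of 10^u, with u drawn uniformly from [−1, 1]. The sketch below illustrates this reading; the multiplicative interpretation of the 0.1 to 10 range, the function name, and the parameter names and values are assumptions.

```python
import math
import random


def draw_initial_parameters(hand_fitted, rng, low=0.1, high=10.0):
    """Scale each hand-fitted value by a factor that is uniform on a log scale."""
    log_low, log_high = math.log10(low), math.log10(high)
    return {name: value * 10.0 ** rng.uniform(log_low, log_high)
            for name, value in hand_fitted.items()}


rng = random.Random(1)                          # seeded for reproducibility
hand_fitted = {"delay_ms": 8.0, "gain": 0.5}    # hypothetical parameter names and values
population = [draw_initial_parameters(hand_fitted, rng) for _ in range(200)]
```

On average, about half of the drawn values fall above and half below each hand-fitted value, which matches the stated motivation for the choice of distribution.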

The model had a total of 55 parameters. Of these, nine parameters (marked by ‘*’ in Table 1) were fixed to the initialized values during fitting to facilitate convergence. For instance, the gain of the input to a neuron is equally set by the output gain of the presynaptic neuron’s nonlinearity, β, or by the synaptic input gain, α, and therefore only one of these parameters needs to be optimized. Similarly, the support parameter of an exponential filter sets an upper bound for the timescale of that filter, which in turn is given by the filter’s time constant; the support was therefore typically set to a sufficiently high value and fixed. Lastly, some of the thresholds were fixed to zero since their sole role was to restrict firing rates to be positive.

Evaluation of model performance

After the fitting procedure, the model performance was evaluated based on the tuning curves and the response traces. The fit for the tuning curves is given by $1 - E_{\mathrm{RMSE}}(a_n, b_n)$ (Figure 2H), where $E_{\mathrm{RMSE}}$ is the root-mean-squared error and $a_n$ and $b_n$ are the tuning curves from the experimental data and the model, each normalized by the maximum of the tuning curve from the data: $a_n = a/\max(a)$, $b_n = b/\max(a)$. This measure of fit is sensitive to the addition of a constant offset and to a mismatch in the scale of responses between the model and the data, but is independent of the overall scale of the responses in the data. The fit for the response traces is given by the squared Pearson correlation coefficient:

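$$r^2 = \left( \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \right)^2$$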

where $x_i$ and $y_i$ are the samples of the response traces from the model and the data, respectively, and $\bar{x}$ and $\bar{y}$ are the corresponding means. The $r^2$ was calculated for different timescales (Figure 2G) by low-pass filtering both the prediction and the response with rectangular windows with durations ranging between 1 and 25 ms before computing the $r^2$. At short timescales, this measure is sensitive to fine details in the firing rate dynamics, while at longer timescales, the measure reflects the match in coarse features of the firing dynamics, for example, the spike counts per pulse.
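Both performance measures can be sketched as follows. The function names and toy traces are assumptions made for illustration. The window width is given in samples rather than milliseconds, and only the fully overlapping part of the rectangular window is kept, which may differ from the edge handling used for the paper.

```python
import math


def tuning_fit(data_curve, model_curve):
    """1 - RMSE between tuning curves, both normalized by the maximum of the data curve."""
    peak = max(data_curve)
    a_n = [a / peak for a in data_curve]
    b_n = [b / peak for b in model_curve]
    mse = sum((a - b) ** 2 for a, b in zip(a_n, b_n)) / len(a_n)
    return 1.0 - math.sqrt(mse)


def boxcar_smooth(trace, width):
    """Low-pass filter with a rectangular window of `width` samples (valid part only)."""
    return [sum(trace[i:i + width]) / width for i in range(len(trace) - width + 1)]


def r_squared(x, y):
    """Squared Pearson correlation coefficient between two traces."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov ** 2 / (var_x * var_y)


def r_squared_at_timescale(model_trace, data_trace, width):
    """r^2 after smoothing both traces with the same rectangular window."""
    return r_squared(boxcar_smooth(model_trace, width), boxcar_smooth(data_trace, width))


# Toy traces (not from the paper): same coarse pattern, different fine timing.
data_trace = [0, 3, 1, 0, 0, 3, 1, 0, 0, 3, 1, 0]
model_trace = [0, 1, 3, 0, 0, 1, 3, 0, 0, 1, 3, 0]
for width in (1, 2):
    print(width, r_squared_at_timescale(model_trace, data_trace, width))
```

In this toy case the $r^2$ is higher for the wider window, because smoothing removes the mismatch in fine timing while keeping the shared coarse pattern.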

Generation of model variants for the analysis of the parameter space

To determine the range of preference types the network can produce (Figure 4A–E), we generated models with random parameter sets taken from a range around that obtained from the fit to G. bimaculatus. For that, we chose 45 of the 55 model parameters. The 10 parameters fixed for this analysis (marked by ‘†’ in Table 1) had also been fixed for fitting (see above), but we allowed more of the parameters of AN1’s filter and output nonlinearity to change. All 45 parameters except the synaptic delay parameters were taken from a hypercube spanning the range between 1/10- and 10-fold of the original parameter set, spaced logarithmically such that the fraction of random parameter values below and above the original value was similar. Delay parameters in the model (see Table 1) were allowed to range uniformly between 1 and 21 ms, irrespective of the original parameter value. The delay parameters correspond to the delay added to the inputs to a neuron required to produce the desired timing of the output of that neuron and include axonal conduction and synaptic transmission delays, delays induced by low-pass filtering at the pre- and postsynapse, and delays from the integration of inputs to the spiking threshold and from spike generation. To ensure uniform sampling from the 45-dimensional parameter space, we used a quasi-random sampling scheme based on the Sobol set. We used the sobolset function in MATLAB with the following parameters: skip 1e3, leap 1e2, scramble MatousekAffineOwen. Using this approach, we generated 5 million different model variants. Of these, 9% (450,000 models) were responsive and selective, that is, they responded to at least one pulse train pattern and did not produce the same response for all patterns tested. Initial tests with extended parameter ranges yielded qualitatively similar phenotypic variability but produced many more unresponsive or unselective models.
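The mapping from a quasi-random point in the unit hypercube to a parameter set can be sketched as below. Note the substitution: the study used MATLAB's scrambled Sobol set, whereas this sketch uses a Halton sequence, a different low-discrepancy sequence, only because it fits in a few lines of pure Python. The three-parameter model and all names are hypothetical.

```python
def halton(index, base):
    """index-th element (index >= 1) of the van der Corput sequence in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def sample_variant(index, original, is_delay, bases):
    """Map one quasi-random point in the unit hypercube to a model parameter set."""
    variant = []
    for value, delay, base in zip(original, is_delay, bases):
        u = halton(index, base)  # quasi-random number in [0, 1)
        if delay:
            variant.append(1.0 + 20.0 * u)  # uniform between 1 and 21 ms
        else:
            variant.append(value * 10.0 ** (2.0 * u - 1.0))  # log-uniform, 1/10- to 10-fold
    return variant


# Hypothetical three-parameter model: two gains and one delay (ms).
original = [0.5, 4.0, 8.0]
is_delay = [False, False, True]
bases = [2, 3, 5]  # one distinct prime per dimension
variants = [sample_variant(i, original, is_delay, bases) for i in range(1, 1001)]
print(variants[0])
```

Non-delay parameters are scaled in log space so that values below and above the original are equally likely, while delays ignore the original value, as described above. For a 45-dimensional space, a scrambled Sobol set is the better choice, since plain Halton sequences degrade in high dimensions.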

Sensitivity analysis

For the sensitivity analysis (Figure 4F–H, Figure 4—figure supplement 1), we either changed single parameters or systematically varied pairs of parameters on a grid. We used the same set of 45 parameters as in the analysis of the parameter space above (see Table 1). Single-parameter sweeps were generated as 21 logarithmically spaced values between 1/100- and 100-fold of the original value, except for delay parameters, which were generated as 21 values ranging between 1 and 41 ms. While these parameter ranges may appear rather large, in particular for the input delay parameters, they were chosen to facilitate the detection of key parameters controlling the network’s tuning. Their specific and meaningful effects on the model tuning were confirmed and further analyzed within the more restricted parameter range also chosen for the analysis of the network’s phenotypic flexibility above.
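The sweep values can be generated as in the following sketch. The function names and the example value are hypothetical, and the linear spacing of the delay values is an assumption, since the text only gives their number and range.

```python
import math


def log_sweep(original, n_values=21, max_fold=100.0):
    """n_values logarithmically spaced values from original/max_fold to original*max_fold."""
    span = math.log10(max_fold)
    step = 2.0 * span / (n_values - 1)
    return [original * 10.0 ** (-span + i * step) for i in range(n_values)]


def delay_sweep(n_values=21, first_ms=1.0, last_ms=41.0):
    """n_values delays from first_ms to last_ms; linear spacing is an assumption."""
    step = (last_ms - first_ms) / (n_values - 1)
    return [first_ms + i * step for i in range(n_values)]


gain_values = log_sweep(3.0)  # 3.0 is a hypothetical original parameter value
delay_values = delay_sweep()
# Pairwise analysis: every combination of two single-parameter sweeps.
grid = [(g, d) for g in gain_values for d in delay_values]
print(len(gain_values), len(delay_values), len(grid))  # prints: 21 21 441
```

With an odd number of values, the middle entry of each logarithmic sweep is the original parameter value itself.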

For each model, we calculated the response fields for pulse durations and pauses between 1 and 80 ms (2 ms spacing, 1600 stimuli per response field), with a pulse train duration of 600 ms and a chirp pause of 200 ms. To quantify the model’s sensitivity to changes in each parameter, we first calculated the correlation distance (1 – Pearson’s correlation coefficient) between the response fields from the original model and each modified model in the parameter sweep (Figure 4F). By using the correlation coefficient, our sensitivity analysis is robust to trivial changes in the response field like scaling or the addition of a constant to all responses. The average correlation distance over a parameter sweep is then taken as a measure of how much changing a parameter affects the model output.
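The sensitivity measure can be sketched as follows, with response fields flattened to lists. The function names and toy fields are hypothetical. Fields with constant output have zero variance and would need separate handling, which this sketch omits.

```python
import math


def correlation_distance(field_a, field_b):
    """1 - Pearson's r between two response fields given as flat lists of equal length."""
    n = len(field_a)
    mean_a, mean_b = sum(field_a) / n, sum(field_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(field_a, field_b))
    var_a = sum((a - mean_a) ** 2 for a in field_a)
    var_b = sum((b - mean_b) ** 2 for b in field_b)
    return 1.0 - cov / math.sqrt(var_a * var_b)


def sensitivity(original_field, swept_fields):
    """Average correlation distance between the original field and each field in a sweep."""
    distances = [correlation_distance(original_field, f) for f in swept_fields]
    return sum(distances) / len(distances)


original_field = [0.0, 1.0, 2.0, 1.0]
scaled_and_shifted = [1.0, 3.0, 5.0, 3.0]  # 2 * original + 1: a trivial change
inverted = [2.0, 1.0, 0.0, 1.0]            # 2 - original: tuning is reversed
print(correlation_distance(original_field, scaled_and_shifted))
print(correlation_distance(original_field, inverted))
print(sensitivity(original_field, [scaled_and_shifted, inverted]))
```

The scaled and shifted field has a distance of zero from the original, which illustrates why the measure ignores trivial changes, while the inverted field reaches the maximal distance of 2.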

For some parameters, models produced constant output (e.g., all zeros) over most of the parameter sweep with only one or two step-like changes in the response fields, leading to artificially high sensitivity values. We found that we could reliably exclude such parameters by calculating the median difference in the correlation distance between consecutive parameter values over the parameter sweep and requiring this quantity to be larger than 0.005:

$$\underset{n<N}{\operatorname{median}}\; \left| D\left(R(0), R(n)\right) - D\left(R(0), R(n+1)\right) \right| > 0.005$$

where |·| is the absolute value, n indexes the N parameter values in the sweep, and D(R(0),R(n)) denotes the correlation distance (1 – Pearson’s correlation coefficient) between the response field from the original model, R(0), and from a model with the modified parameter, R(n). This approach also excluded parameters whose change produced largely untuned models over the sweep. We ensured that none of our results crucially depended on these criteria. For instance, even without the above constraint, the parameters of LN5M and LN3M were still among the top ranked.
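A minimal sketch of this exclusion criterion, applied to the list of correlation distances from one parameter sweep; the function name and the two toy sweeps are hypothetical.

```python
import statistics


def passes_step_criterion(distances, threshold=0.005):
    """True if the median absolute change between consecutive sweep distances exceeds threshold."""
    steps = [abs(d0 - d1) for d0, d1 in zip(distances, distances[1:])]
    return statistics.median(steps) > threshold


step_like = [0.0] * 10 + [1.0] * 11      # one abrupt jump, flat otherwise
graded = [0.02 * i for i in range(21)]   # distance changes smoothly over the sweep
print(passes_step_criterion(step_like), passes_step_criterion(graded))  # prints: False True
```

The step-like sweep has a high average distance but a median step of zero, so it is excluded, whereas the graded sweep passes.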

For the sensitivity analysis over parameter pairs (Figure 4—figure supplement 1), value grids were generated using the same value ranges as for single parameters for each parameter in the pair, resulting in 21 × 21 = 441 model variants for each of the 1035 unique, unordered parameter pairs.

Characterization of response fields

The response fields known in crickets typically have a roughly ellipsoid shape with a single peak. To assess the extent to which the response fields produced by randomizing the model parameters match these properties, we determined the fraction of fields with a single dominant peak (vs. multiple dominant peaks) and the match of the response fields with a best-fitting ellipse. Dominant peaks were defined as local maxima in the response field exceeding 0.5 of the global maximum value and separated by a trough that is smaller than 0.75 of the value of the lower of the two peaks. This measure is robust to the existence of either small or poorly separated local maxima. Ellipses were fitted to the binarized fields to be robust to the steps in the response fields caused by the stimulus structure (Figure 3—figure supplement 1). Binarization was performed by thresholding each response field at 50% of its maximum value. We then fitted a two-dimensional Gaussian to the superthreshold coordinate values using MATLAB’s fitgmdist and created a best-fitted prediction by setting all pixels inside the 99.9% probability threshold to 1.0. Results are robust to the specific choice of this value. The match between the binarized response field and the binarized ellipsoid was estimated via the Jaccard similarity, $S_J$, between the two binary images. The Jaccard similarity corresponds to the fraction of non-zero pixels that are shared between the binarized response field and the binarized ellipsoid, that is, the intersection over union: $S_J = |P \cap T| / |P \cup T|$, where P and T are the sets of non-zero pixels in the two binary images, | … | denotes the set size, and ∩ and ∪ correspond to intersection and union, respectively. Using alternative error measures, like the Hamming distance, produced similar results.
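The binarization and the Jaccard similarity can be sketched as below. The Gaussian fit with fitgmdist is not reproduced; the ellipse mask is written by hand for illustration. The function names, the strict "greater than" threshold comparison, and the toy field are assumptions.

```python
def binarize(field, fraction=0.5):
    """Set of (row, column) pixels whose value exceeds `fraction` of the field maximum."""
    peak = max(max(row) for row in field)
    return {(r, c) for r, row in enumerate(field)
            for c, value in enumerate(row) if value > fraction * peak}


def jaccard_similarity(pixels_a, pixels_b):
    """Intersection over union of two sets of non-zero pixels."""
    union = pixels_a | pixels_b
    return len(pixels_a & pixels_b) / len(union) if union else 1.0


# Toy response field (not from the paper) and a hand-made stand-in for the ellipse mask.
field = [[0.0, 1.0, 0.0],
         [1.0, 4.0, 3.0],
         [0.0, 3.0, 0.0]]
field_pixels = binarize(field)
ellipse_pixels = {(1, 1), (1, 2), (0, 1)}
print(jaccard_similarity(field_pixels, ellipse_pixels))  # prints: 0.5
```

Two of the four pixels in the union are shared here, giving a similarity of 0.5; identical masks give 1.0 and disjoint masks give 0.0.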

Deviations from a perfect ellipsoid, like more rectangular shapes or asymmetrical shapes, can reduce the match (Figure 4—figure supplement 2). Manual inspection of response fields revealed that most models with a Jaccard similarity $S_J > 0.5$ were well described by an ellipse, and we therefore chose this value of $S_J$ as a threshold for considering a response field well fitted by an ellipsoid. For the asymmetry index, we extracted the length of the major and minor axis of the best-fitted ellipse as the eigenvalues of the covariance matrix and took their ratio.
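As an illustration of the asymmetry index, the sketch below computes the covariance of the superthreshold pixel coordinates directly and takes the ratio of its two eigenvalues in closed form. It follows the description above in using the eigenvalues themselves. The function name and the toy pixel set are hypothetical, and a set of perfectly collinear pixels would give a zero minor eigenvalue, which is not handled here.

```python
import math


def asymmetry_index(pixels):
    """Ratio of the larger to the smaller eigenvalue of the 2x2 covariance of pixel coordinates."""
    n = len(pixels)
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pixels) / n
    syy = sum((y - mean_y) ** 2 for _, y in pixels) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pixels) / n
    # Eigenvalues of [[sxx, sxy], [sxy, syy]] are centre +/- radius.
    centre = (sxx + syy) / 2.0
    radius = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    return (centre + radius) / (centre - radius)


# Toy binarized field: a filled block of 5 x 3 pixels, elongated along x.
block = [(x, y) for x in range(5) for y in range(3)]
square = [(x, y) for x in range(3) for y in range(3)]
print(asymmetry_index(block), asymmetry_index(square))
```

A symmetric field gives an index of 1, and more elongated fields give larger values.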

Identification of preference types

Response fields were assigned to one of the four principal types in Figure 1B based on two inclusion criteria: first, the angle of the main axis had to fall within ±10° of that of the prototypical angle (–45° for period, 0° for duration, 45° for duty cycle, 90° for pause). Second, the response field had to be sufficiently selective for the designated stimulus parameter and tolerant for the orthogonal stimulus parameter. For instance, a response field was determined to be pause tuned if the angle was ~90 ± 10°, and if it was selective for pause and tolerant for duration. We ensured the validity of these criteria through visual inspection of many response fields (see examples in Figure 4E). The orientation angle was calculated by fitting a line to the ‘ridge of the response field’ (the set of pause and duration values that elicited maximal responses). We first identified well-responded stimuli as those with response values exceeding 50% of the maximal response value for that field. To make the fits more robust, we determined whether the set of well-responded stimuli extended more along the pause or the duration axis and selected pause and duration values for the fit as follows: if the response field was most extended along the pause axis, then we identified the preferred duration at each pause value for which the response field was above the 50% threshold. For a response field that extended more along the duration axis, we identified the preferred pause at each duration value exceeding the 50% threshold. We then fitted a line to the resulting set of duration and pause values and took the inverse tangent of that line’s slope as the orientation angle.
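The ridge fit can be sketched as follows. Several details are assumptions: the field is stored as `field[i][j]` for the stimulus at `(x_values[j], y_values[i])`, the assignment of pause and duration to the x and y axes is left to the caller, the line is fitted by ordinary least squares, and an angle obtained from a fit along the y axis is converted with 90° minus the fitted angle, so angles are meaningful modulo 180°. All names are hypothetical, and a field with a single superthreshold pixel is not handled.

```python
import math


def fit_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def orientation_angle(field, x_values, y_values, fraction=0.5):
    """Angle (degrees) of the ridge; field[i][j] is the response at (x_values[j], y_values[i])."""
    threshold = fraction * max(max(row) for row in field)
    above = [(i, j) for i, row in enumerate(field)
             for j, value in enumerate(row) if value > threshold]
    rows = sorted({i for i, _ in above})
    cols = sorted({j for _, j in above})
    x_extent = x_values[cols[-1]] - x_values[cols[0]]
    y_extent = y_values[rows[-1]] - y_values[rows[0]]
    if x_extent >= y_extent:
        # Field extends more along x: preferred y at each well-responded x value.
        xs = [x_values[j] for j in cols]
        ys = [y_values[max(range(len(field)), key=lambda i: field[i][j])] for j in cols]
        return math.degrees(math.atan(fit_slope(xs, ys)))
    # Field extends more along y: preferred x at each well-responded y value.
    ys = [y_values[i] for i in rows]
    xs = [x_values[max(range(len(field[i])), key=lambda j: field[i][j])] for i in rows]
    return 90.0 - math.degrees(math.atan(fit_slope(ys, xs)))


values = [10.0, 20.0, 30.0, 40.0]  # toy stimulus values in ms
diagonal = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
anti_diagonal = [[1.0 if i + j == 3 else 0.0 for j in range(4)] for i in range(4)]
vertical = [[1.0 if j == 1 else 0.0 for j in range(4)] for i in range(4)]
for name, f in (("diagonal", diagonal), ("anti-diagonal", anti_diagonal), ("vertical", vertical)):
    print(name, orientation_angle(f, values, values))
```

Fitting along the more extended axis avoids the infinite slope that a vertical ridge would otherwise produce.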

Data availability

The data used for fitting the model can be found at https://data.goettingen-research-online.de/dataverse/cricketnet. Code for running the model can be found at https://github.com/janclemenslab/cricketnet.

References

  1. Bradbury JW, Vehrencamp SL (2011) Principles of Animal Communication. Sinauer Associates Incorporated.
  2. Buonomano DV (2000) Decoding temporal information: A model based on short-term synaptic plasticity. The Journal of Neuroscience 20:1129–1141.
  3. Gerhardt CH, Huber F (2002) Acoustic Communication in Insects and Anurans. University of Chicago Press.
  4. Hennig RM (2003) Acoustic feature extraction by cross-correlation in crickets? Journal of Comparative Physiology. A, Neuroethology, Sensory, Neural, and Behavioral Physiology 189:589–598. https://doi.org/10.1007/s00359-003-0438-7
  5. Hennig RM, Blankers T, Gray DA (2016) Divergence in male cricket song and female preference functions in three allopatric sister species. Journal of Comparative Physiology. A, Neuroethology, Sensory, Neural, and Behavioral Physiology 202:347–360. https://doi.org/10.1007/s00359-016-1083-2
  6. Katz PS (2011) Neural mechanisms underlying the evolvability of behaviour. Philosophical Transactions of the Royal Society B 366:2086–2099. https://doi.org/10.1098/rstb.2010.0336
  7. Leonardo A (2005) Degenerate coding in neural systems. Journal of Comparative Physiology. A, Neuroethology, Sensory, Neural, and Behavioral Physiology 191:995–1010. https://doi.org/10.1007/s00359-005-0026-0
  8. Marr D (1982) Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. Henry Holt & Company.
  9. Otte D (1992) Evolution of cricket songs. Journal of Orthoptera Research 1(1):25. https://doi.org/10.2307/3503559
  10. Ryan MJ, Cummings ME (2013) Perceptual Biases and Mate Choice. Annual Review of Ecology, Evolution, and Systematics 44:437–459. https://doi.org/10.1146/annurev-ecolsys-110512-135901
  11. Webb B, Wessnitzer J, Bush SL, Schul J, Buchli J, Ijspeert A (2007) Resonant neurons and bushcricket behaviour. Journal of Comparative Physiology. A, Neuroethology, Sensory, Neural, and Behavioral Physiology 193:285–288. https://doi.org/10.1007/s00359-006-0199-1

Decision letter

  1. Ronald L Calabrese
    Senior and Reviewing Editor; Emory University, United States
  2. Ann Kennedy
    Reviewer; Northwestern University, United States
  3. Barbara Webb
    Reviewer; University of Edinburgh, United Kingdom
  4. Martin Nawrot
    Reviewer

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Decision letter after peer review:

Thank you for submitting your article "A small, computationally flexible network produces the phenotypic diversity of song recognition in crickets" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by Ronald Calabrese as the Senior and Reviewing Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Ann Kennedy (Reviewer #1); Barbara Webb (Reviewer #2); Martin Paul Nawrot (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

Clemens et al. present a computational model of the cricket song recognition network, which they show is capable of reasonably reproducing neural activity and song selectivity in G. bimaculatus. They then explore the parameter space of this network and find that varying parameters of model cells enables it to produce a range of selectivities for the period, pulse duration, duty cycle, or pause duration of input song. They then identify the network parameters that most affect song selectivity and investigate the relationship between several subsets of parameters and song preference. This is a fascinating exploration of the computational flexibility of a small neural circuit; it is well researched and written and was enjoyable to read.

Essential Revisions:

There were several concerns that must be addressed before the paper can be accepted, including justification/discussion of the generality (beyond crickets) of the insights gained.

1. Explanation of the core principle of function is not clear, particularly as it relates to whether the filter properties in the model have some plausible biophysical counterpart. The authors must present a knowledgeable discussion of the biophysical mechanisms that could generate the described filter functions particularly the seemingly long delays in the linear filter shapes.

2. The use of LN2 output instead of AN1 output for neurons downstream of AN1 seems like an unnecessary kluge and could have affected the results. "For simplicity and since AN1 and LN2 produce similar responses, we used the output of LN2_M in lieu of AN1_M responses for all neurons postsynaptic to AN1_M". How is it simpler? It sounds like a mistake in constructing the model. It could strongly affect the results, e.g. by making the excitatory and inhibitory inputs to LN4 more similar, as both are derived from LN2, instead of excitation from AN1 and inhibition from LN2. This should be fixed and the model reassessed.

Along the same lines the authors should discuss how they would translate the network output into behavioral output. Currently they seemingly compare behavioral response fields with their network output response field – there was considerable confusion about Figure 1B – but this has not been fully explained.

3. Several of the results in this paper, such as the anti-diagonal bands in Figure 3 and Figure 4b, seem to be a consequence of the fact that all simulated pulse trains had a fixed duration of 140ms. Is it reasonable to assume that pulse trains would have such a precisely fixed duration? Is LN4 also selective for pulse trains of a specific duration? If there isn't data on this, it would be an interesting thing to test with the model: if the input pulse train has one extra pulse or one fewer pulse, how are model LN4 responses affected? Does the model predict a preferred pulse train duration, or could it be the case that (for instance) longer pulse trains always lead to stronger LN4 activation?

Along the same lines, a cricket song consists of pulse-pause patterns, which are grouped within chirps that are separated by chirp pauses. The present paper ignores this and refers to the pulse-pause pattern filtering as 'song recognition'. This concern relates how the model would perform for an extra pulse.

4. One of the biggest discrepancies between data and model is the LN5 response at longer periods/pause durations (Figure 2b). Is it possible for the authors to comment on how this discrepancy might impact their other findings in the paper?

5. The distribution of model tolerances in Figure 4d seems surprising. It seems to suggest that the four categories of tolerances (period/duration/duty cycle/pause) do not correspond to distinct categories of models, but rather that the tolerance axis can take essentially any orientation. Is this true of other cricket species as well? If so, might it be that dividing tolerant axes into these four categories is misleading, in that it imposes discrete categories onto what is really a continuously varying signal?

6. We wonder how reasonable it is to describe all models as having a narrow "selective" axis and a broad "tolerant" axis as depicted in 4A. Among generated models where LN4m was responsive, was the LN4m response field always reasonably fit by a single ellipse, or did any models have more complex response fields?

7. The authors should discuss the previously described behavioral inter-individual variability within the species G. bimaculatus in relation to the inter-species variability, which they cover with their parameter distributions.

These major points are further explained in the individual reviews.

Reviewer #1:

Clemens et al. present a computational model of the cricket song recognition network, which they show is capable of reasonably reproducing neural activity and song selectivity in G. bimaculatus. They then explore the parameter space of this network, and find that varying parameters of model cells enables it to produce a tremendous range of selectivities for the period, pulse duration, duty cycle, or pause duration of input song. They then identify the network parameters that most affect song selectivity, and investigate the relationship between several subsets of parameters and song preference. This is a fantastic exploration of the computational flexibility of a small neural circuit; it is very well researched and written, and was enjoyable to read. Although I had a few questions about the paper contents (see below), I believe that all of these can be addressed by the authors, upon which I would warmly recommend this paper for publication in eLife.

1) Several of the results in this paper, such as the anti-diagonal bands in Figure 3 and Figure 4b, seem to be a consequence of the fact that all simulated pulse trains had a fixed duration of 140ms. Is it reasonable to assume that pulse trains would have such a precisely fixed duration? Is LN4 also selective for pulse trains of a specific duration? If there isn't data on this, it would be an interesting thing to test with the model: if the input pulse train has one extra pulse or one fewer pulse, how are model LN4 responses affected? Does the model predict a preferred pulse train duration, or could it be the case that (for instance) longer pulse trains always lead to stronger LN4 activation?

2) Because of the number of layers and nonlinearities in this model, it is hard to picture what is happening under the hood to give rise to preferences for a particular period, duration, duty cycle, or pause in Figure 4. I found Figure 3 to be very helpful for the example of G. bimaculatus- would it be possible to generate similar plots for some of the models from Figure 4e, for comparison?

3) I was a bit surprised by the distribution of model tolerances in Figure 4d. This seems to suggest that the four categories of tolerances (period/duration/duty cycle/pause) do not correspond to distinct categories of models, but rather that the tolerance axis can take essentially any orientation. Is this true of cricket species as well? If so, might it be that dividing tolerant axes into these four categories is misleading, in that it imposes discrete categories onto what is really a continuously varying value?

4) On a related note, I found myself wondering how reasonable it is to describe all models as having a narrow "selective" axis and a broad "tolerant" axis as depicted in 4A. Among generated models where LN4m was responsive, was the LN4m response field always reasonably fit by a single ellipse, or did any models have more complex response fields? In addition to the orientation and preference of the selective axis, is there anything to be learned from looking at the width of the tolerant/selective axes, or the preference range of the tolerant axis?

5) One of the biggest discrepancies between data and model is the LN5 response at longer periods/pause durations (Figure 2b). Is it possible for the authors to comment on how this discrepancy might impact their other findings in the paper?

Reviewer #2:

This paper contributes an interesting study of how parameter variation in a five-neuron network, closely based on identified neurons in the cricket, can establish different temporal tuning properties. The main application is to cricket song recognition; although the paper argues for more general insight into temporal recognition circuits, this is somewhat limited. Similarly, the argument for evolutionary relevance, as explaining how the diversity of cricket song might arise, would be more strongly supported either by showing potential 'pathways' of divergence (ideally through co-evolution models of production and recognition) or a clearer link from the model components to plausible biophysical mechanisms that could produce the relevant properties (e.g. specific filter shapes, especially where these are assumed to be comprised of multiple components within one neuron). As such, though the work is sound, it is of somewhat narrow interest.

Main contributions: the context of the work is that evolution of sensory preferences has been well explored at receptor level, but not yet for more complex stimulus properties, for which more sophisticated neural processing is needed to identify the preferred property in the signal. The main result of the paper is that different song preferences, as found across cricket species, can be obtained for different parameter settings within the same circuit, with biases in the frequency of preference types that match phenotypic diversity. The specific parameters/neural properties that produce qualitative differences in tuning (preference for period, duration or duty-cycle) are examined in more detail to provide some mechanistic insight into the circuit. This is approached in a very thorough manner, e.g., looking at each neuron's contribution and carrying out a full exploration followed by a sensitivity analysis to focus on the most important properties, and as such is also of interest from a purely methodological point of view in neural modelling.

Substantive concerns:

1) As someone familiar both with cricket song recognition research and neural modelling, I had to work very hard to understand the circuit function from the presented description. The paper seems to assume the reader has very close familiarity with the papers by Kostarakos and Hedwig, 2012 and Schöneich et al., 2015 rather than giving a sufficiently clear account. E.g. in the introduction, the key concept is described as coincidence detection of delayed original input (AN1) and a "post-inhibitory rebound driven by the end of each sound pulse (LN5)". On the face of it, this seems to be a mechanism for pulse duration tuning, not period tuning, and it is unclear why "feature detector neuron LN4 integrates excitatory input from LN3 and inhibitory input from LN2, …sharpens its selectivity." From close inspection of Figure 2, the mechanism for period selectivity appears to be 1) the timing of the rebound from LN5 from one syllable coincides with the onset of the next syllable 2) the response per syllable without this input decays for repeated syllables, and is further reduced in LN4 by inhibition with the same pattern as AN1. To some extent these phenomena are discussed later in the paper with reference to the effects of specific parameters, e.g. to increase the duration of the rebound from LN5; but it is difficult for the reader to follow without having the initial conceptual understanding of the original model.

2) The model seems relatively complex (multiple, somewhat arbitrarily chosen filters for each neuron, many parameters) and there is no discussion of whether it could be simplified while retaining the flexibility to be tuned to different song properties. Nor does the reader gain much insight into whether the parameters causing particular effects are plausible, or what might be the biophysical basis (this is discussed only for time delay variables) that could be subject to genetic modification.

3) The argument for insight into evolution from close examination of this network is not very convincing. Why would the existing network in one species be the "mother network" for other species? In the introduction, it is argued that "song recognition networks must be selective and modifiable to adapt to changing signal patterns" but the evolutionary drive seems more likely to be the opposite – the song should adapt to the recognition. Discussion of the co-evolution of production and recognition is very limited.

Reviewer #3:

This model study nicely and exemplarily describes how, in a sensory system with highly limited neuronal resources, a small generic network with 5 neuron types can flexibly generate a variety of tuning properties, allowing for species-specific auditory mate-recognition. Building on their previous body of works, the authors here employ a phenomenological (i.e. non-mechanistic) rate-based feed-forward circuit model, fitted to accommodate known single-neuron input/output features. The model faithfully predicts the animals' (average) response behavior to parametrically controlled sensory stimuli. Targeted parameter modifications can tune the network for different auditory pulse patterns. The authors argue that such a flexible generic network motif could allow for evolutionarily fast species separation.

1) The authors state "The neuronal circuit … has been revealed p.4 …". However, is there clear anatomic evidence for the explicit network wiring of 5 neurons and 6 connections? Does each of these neurons exist only once per hemisphere of any individual or are the authors referring to neuron types? Clear evidence should be referenced or missing evidence should be critically discussed.

2) The authors present a purely phenomenological model. How are these computations implemented biophysically? Which synaptic, cellular and network mechanisms are involved? Discussion of possible mechanisms and references to relevant works seems mandatory, in particular with respect to the (long) delays / rebound delay in the causal filters and the divisive normalization. Discussing adaptation for type AN1 and LN1 by means of either SFA (Nagel and Wilson, 2011, Nat Neurosci 14(2); Farkhooi et al., 2013, PLoS CB 9(10); Benda et al., 2003,2008) or short-term depressing synapses is straightforward. The phenomenological rate model has its own value. However, the argument that the authors did not aim at a biophysical implementation because ion channels and conductances are not known is not a good argument, this would have prevented 95% of published model studies.

3) The authors repeatedly refer to "song-recognition". However, the authors only investigate pulse sequences (Figure 1B) neglecting the impact of chirp tuning (e.g. Grobe et al., 2012, JEB 2015; Meckenhäuser et al., 2013, PloS one, 8; Clemens and Hennig, 2013). This needs to be discussed.

4) The authors argue their model "has the capacity to reproduce the behavioral preferences" (p.8) with reference to behavioral tuning in Figure 1B. They should make explicit that this refers to the congruence of the LN4 response field (Figure 3) and behavioral response field (Figure 1B) for G. bimaculatus; no attempt was made to explicitly model behavioral output. How could behavior be generated? What is the typical delay between song onset and behavioral response? What is known about the behavioral decision circuit? Please discuss possible mechanisms of behavioral decision making such as the previously suggested drift diffusion models (Hennig et al., 2014, Front. Physiol 5; Clemens et al., 2014, PNAS 11; Meckenhäuser et al., 2014, Front. Sys. Neurosci 8) and possible others?

5) Inter vs. intra-species variability. To my knowledge there is large inter-individual variability in female G. bimaculatus behavior (Grobe et al., 2012, Meckenhäuser et al., 2013) and the authors state tenfold neuron parameter variation within species. However, they model only an average animal and do not mention behavioral variability at all. How could parameter variation for inter-individual variability in their network differ from inter-species variability? This should be discussed.

6) What is the critical test for the existence of a 'multi-purpose' circuit across species? Do the authors expect the same basic network topology across species and did they attempt to identify it (anatomically) in any species other than G. bimaculatus? Can they think of other methods of validation? This should be addressed in the Discussion.

7) Authors briefly mention various song preference patterns across species. It would be great to see specific examples of song patterns for a few species along with matching model tuning, e.g. in a supplemental figure, possibly together with behavioral data / response diagrams.

8) The core result on multi-species covered in Figure 4 and text indicates all four "principal types" of response fields can be achieved by the model with reference to Figure 1B. Where does Figure 1B come from? Are we looking at sketches? Or behavioral response fields measured in crickets? Please make this clear and give references to the underlying data. Clemens and Hennig 2013 and Hennig et al., 2014 do not show these graphs. Ideally, the authors can reproduce exemplary experimental data from diverse cricket species for illustration.

https://doi.org/10.7554/eLife.61475.sa1

Author response

Essential revisions:

There were several concerns that must be addressed before the paper can be accepted, including justification/discussion of the generality (beyond crickets) of the insights gained.

Beyond crickets many animal groups produce regular series of sound pulses directed at a receiver. If analyzed on the short time scale, there are four axes (pulse rate/period, pulse duration, pulse pause, duty cycle). It is a strength of the model that it identifies the main computations (forward excitation, suppression, rebound depolarization, coincidence detection, summation) needed to extract relevant information along these axes independent of the specific biophysical mechanism. In other words, the same computation may have more than one biophysical implementation.

Because of this generality (in terms of signal structure and modelling approach), our computational analyses produce hypotheses for how song preference may change from one type to another in different species and we now provide an example for that in the discussion. Since each model element has a straightforward biophysical implementation, we can at the same time propose specific changes in biophysical parameters that will lead to the desired changes in behavior.

We made the following edits to clarify these points:

Introduction:

“Repetitive patterns of short pulses that are organized in groups on a longer time scale are a common feature of acoustic signaling in insects, fish, and frogs (Baker et al., 2019; Carlson and Gallant, 2013; Gerhardt and Huber, 2002) and the processing and evaluation of these pulse patterns is therefore common to song recognition systems across species. […] Insights from insects where assumptions on physiological relevant parameters like synaptic strengths, delays and membrane properties of individual neurons can be made and systematically tested are therefore relevant for studies of pattern recognition systems and the evolution of acoustic communication systems in general.”

Discussion:

“The three computations that define model tuning – response suppression, post-inhibitory rebounds, and coincidence detection – occur across species and modalities. […] This could be implemented for example by reducing postsynaptic GABA receptors or by lowering the expression levels of voltage-gated potassium channels.”

1. Explanation of the core principle of function is not clear, particularly as it relates to whether the filter properties in the model have some plausible biophysical counterpart. The authors must present a knowledgeable discussion of the biophysical mechanisms that could generate the described filter functions particularly the seemingly long delays in the linear filter shapes.

We now explain the working principle in more detail in the introduction, based on the reviewers’ suggestions:

“In this species, the selectivity for a narrow range of pulse periods is created in a network of 5 neurons and 6 synaptic connections by combining a delay-line with a coincidence-detector (Figure 1D). […] Lastly, the feature detector neuron LN4 receives excitatory input from LN3 and inhibitory input from LN2, which sharpens its selectivity by further reducing responses to pulse patterns that do not produce coincident inputs to LN3. LN4’s selectivity for pulse patterns closely matches the phonotactic behavior of the females (Figure 1E).”
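The delay-line and coincidence-detector principle described above can be illustrated with a deliberately minimal toy. This is not the authors' fitted model, which additionally includes filters, nonlinearities, adaptation, and post-inhibitory rebound. The function names, the 40 ms delay, the 20 ms pulse duration, and the 1 ms time step are illustrative assumptions; only the 140 ms train duration is taken from the reviews. The sketch multiplies a binary pulse train with a delayed copy of itself and sums the overlap, so that the response is largest when the pulse period matches the delay.

```python
import numpy as np

def pulse_train(duration_ms, pause_ms, total_ms=140.0, dt_ms=1.0):
    """Binary pulse train: 1 during pulses, 0 during pauses."""
    t = np.arange(0.0, total_ms, dt_ms)
    period_ms = duration_ms + pause_ms
    return ((t % period_ms) < duration_ms).astype(float)

def coincidence_response(train, delay_ms, dt_ms=1.0):
    """Overlap (in ms) between the train and a copy delayed by delay_ms."""
    shift = int(round(delay_ms / dt_ms))
    delayed = np.zeros_like(train)
    if shift < train.size:
        delayed[shift:] = train[:train.size - shift]
    # Coincidence detection is modelled here as a simple product (toy assumption).
    return float(np.sum(train * delayed) * dt_ms)

# Hypothetical parameters: fixed pulse duration, internal delay of 40 ms.
DELAY_MS = 40.0
DURATION_MS = 20.0
responses = {
    pause_ms: coincidence_response(pulse_train(DURATION_MS, pause_ms), DELAY_MS)
    for pause_ms in (20.0, 30.0, 40.0, 60.0)
}
for pause_ms, resp in responses.items():
    print(f"period {DURATION_MS + pause_ms:.0f} ms -> overlap {resp:.0f} ms")
```

In this toy, the pattern whose period equals the delay produces the largest overlap among the tested pauses, and patterns with much longer periods produce none. The toy has no suppression or adaptation, so it says nothing about how the full network handles short pauses.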

We also added a new section to the Discussion on the biophysical mechanisms generating each model element:

“How can these changes be implemented in a biological network? Although our phenomenological model is independent of a specific biophysical implementation, all model components have straightforward biophysical correlates. […] These examples demonstrate that our phenomenological model has a straightforward, physiologically plausible implementation and therefore can propose experimentally testable hypotheses for transitions between types of behavioral preferences.”

2. The use of LN2 output instead of AN1 output for neurons downstream of AN1 seems like an unnecessary kluge and could have affected the results. "For simplicity and since AN1 and LN2 produce similar responses, we used the output of LN2_M in lieu of AN1_M responses for all neurons postsynaptic to AN1_M". How is it simpler? It sounds like a mistake in constructing the model. It could strongly affect the results, e.g. by making the excitatory and inhibitory inputs to LN4 more similar, as both are derived from LN2, instead of excitation from AN1 and inhibition from LN2. This should be fixed and the model reassessed.

We agree that our simplifying assumption that AN1 and LN2 provide similar inputs to the other neurons in the network could have affected our results, in particular since we removed one degree of freedom, which could restrict the distribution of responses produced by the network. We had done the simplification because we lacked data from AN1.

To address the concern, we now added AN1 recordings, refitted the model, and repeated all analyses (including the generation of millions of responses from models with randomized parameters). To replicate the adaptation across pulses in a train observable in the AN1 responses, we added an explicit divisive normalization step to AN1. We find that nearly all results obtained are in line with the original model. Importantly, the key results on the biases of the network and the role of delays and inhibition/adaptation can be replicated with the adjusted model. This highlights the robustness of our conclusions.
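As an illustration of how a divisive normalization step can produce adaptation across pulses in a train, the following is a generic sketch. The exact form used for AN1 is not given in this response, so the leaky-integrator pool, the function name, and all parameter values (`tau_ms`, `sigma`, `weight`) are assumptions made for illustration only.

```python
import numpy as np

def divisive_normalization(x, tau_ms=50.0, sigma=1.0, weight=1.0, dt_ms=1.0):
    """Divide the input by a constant plus a running average of past input.

    Generic textbook-style form; not the authors' fitted AN1 model.
    """
    alpha = dt_ms / tau_ms
    pool = 0.0  # leaky integrator of past input
    out = np.empty_like(x, dtype=float)
    for i, xi in enumerate(x):
        out[i] = xi / (sigma + weight * pool)
        pool += alpha * (xi - pool)
    return out

# Hypothetical stimulus: 20 ms pulses, 20 ms pauses, 140 ms in total.
t = np.arange(0.0, 140.0, 1.0)
stimulus = ((t % 40.0) < 20.0).astype(float)
response = divisive_normalization(stimulus)

first_pulse_peak = response[:20].max()
last_pulse_peak = response[120:140].max()
print(first_pulse_peak, last_pulse_peak)
```

Because the normalization pool has not fully decayed when later pulses arrive, the response to the last pulse is weaker than to the first, which is the qualitative signature of adaptation across a pulse train.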

Fitting the new model changed some results in minor ways, without affecting our conclusions:

– The proportion of responsive and sensitive models dropped from ~⅓ to ~1/11. This is because AN1 forms the network’s input and new parameters added to model AN1 lead to more parameter combinations. As a result, very low or very high AN1 responses will lead to no responses or the same response to all patterns, respectively.

– The sensitivity of the model to changes in AN1 parameters is now reduced (Figure 4G). This is likely because of the addition of new parameters to fit the AN1 response dynamics – most of these parameters do not affect tuning.

– The ranking of parameters in the sensitivity analysis is different (synaptic delays are slightly less important but still rank highly). However, the general trends remain (Figure 4H) – parameters that were important in the old model are still important in the revised version. The correlation of sensitivity ranking when changing pairs of parameters vs individual parameters is slightly weaker but still substantial (Figure 4—figure supplement 1, before: correlation 0.95, now 0.88).

– The decrease in the preferred duty cycle in LN4M with rebound delay was linear before. This curve has now a weakly decreasing slope, likely because of the adaptation added in AN1 (Figure 5D). In addition, the effect of duty cycle on the amplitude and latency of inputs to LN4M is now slightly different but the trend of later and weaker excitation onto LN4M with increasing duty cycle remains (Figure 5F).

– Trends in adaptation remain but the addition of one more source of adaptation in AN1 makes the relationship between the number of adaptation/inhibition sources in the model and the preferred duty cycle more complex. However, the general trend, that more adaptation leads to lower preferred duty cycles, remains. We now show a simplified summary plot for this result in Figure 5I and moved the detailed plot to Figure 5—figure supplement 3. In addition, small details in the response dynamics of the traces for LN3M and LN4M in Figure 5G changed.

Overall, all these changes concern small details but do not affect any of the conclusions drawn from the model.

Along the same lines the authors should discuss how they would translate the network output into behavioral output. Currently they seemingly compare behavioral response fields with their network output response field – there was considerable confusion about Figure 1B – but this has not been fully explained.

We apologize for not making explicit the connection between model output and behavioral response fields. For the regular patterns used in this study and found in natural cricket songs, the translation from LN4 output to phonotaxis is rather simple – the firing rate over the chirp of LN4 is proportional to the phonotaxis value in Gryllus bimaculatus. Integrative processes over timescales that exceed the duration of the chirp shape insect decision making (Meckenhäuser 2014, DasGupta 2014) and these timescales can be characterized using artificial stimuli with conflicting information (Poulet and Hedwig 2005, Clemens 2014). We did not explicitly include them in the model, since the focus of our study was on pulse pattern recognition and these longer integration timescales do not change the responses to the regular pulse trains used in this study.

We now make explicit that we use the LN4M firing rate over a chirp to predict phonotaxis in the Methods (P23, L814):

“Prediction of phonotactic behavior from network responses

For regular pulse trains, the firing rate of LN4 (number of spikes per chirp/chirp duration) predicts phonotactic behavior very well (Schöneich et al., 2015) (Figure 1E). We therefore use the firing rate of LN4M to predict behavioral response fields from network output.”

And also discuss the existence of longer integration timescales in cricket phonotaxis in Results:

“To translate responses of the output neuron of the network – LN4M – into phonotaxis, we used a simple model: The firing rate of LN4 is strongly correlated with the female phonotaxis in G. bimaculatus (Figure 1E, Schöneich et al., (2015)) and we therefore took LN4M’s firing rate averaged over a chirp to predict phonotaxis from the model responses. Integrative processes over timescales exceeding the chirp are known to affect behavior in crickets and other insects (Poulet and Hedwig (2005), see also Meckenhäuser et al., (2014); Clemens et al., (2014); DasGupta et al., (2014)). We omit them here since they do not crucially affect responses for the simple, repetitive stimuli typical for pulse trains produced by crickets.”
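The readout described in these passages (number of spikes per chirp divided by chirp duration) can be sketched as follows. The function name, the choice of counting window (exactly the chirp, with no latency offset), and the example spike times are hypothetical and serve only to make the definition concrete.

```python
def chirp_firing_rate(spike_times_ms, chirp_onset_ms, chirp_duration_ms):
    """Firing rate in spikes/s: spikes within the chirp / chirp duration.

    Assumption: the counting window coincides with the chirp itself.
    """
    chirp_end_ms = chirp_onset_ms + chirp_duration_ms
    n_spikes = sum(chirp_onset_ms <= t < chirp_end_ms for t in spike_times_ms)
    return n_spikes / (chirp_duration_ms / 1000.0)

# Hypothetical spike times (ms); the last spike falls outside the 140 ms chirp.
example_spikes_ms = [5.0, 30.0, 55.0, 100.0, 150.0]
rate_hz = chirp_firing_rate(example_spikes_ms, chirp_onset_ms=0.0, chirp_duration_ms=140.0)
print(f"{rate_hz:.1f} spikes/s")
```

Evaluating this rate for every combination of pulse duration and pause would yield a predicted response field of the kind compared with behavior in the text.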

Regarding Figure 1B: Initially, the panel showed manually generated response fields. We have now replaced three of them with schematics derived from behavioral data of three different cricket species. The fourth, pause tuning, is not known in crickets, so we kept the manually generated field. This is now clarified in the figure’s caption.

3. Several of the results in this paper, such as the anti-diagonal bands in Figure 3 and Figure 4b, seem to be a consequence of the fact that all simulated pulse trains had a fixed duration of 140ms. Is it reasonable to assume that pulse trains would have such a precisely fixed duration? Is LN4 also selective for pulse trains of a specific duration? If there isn't data on this, it would be an interesting thing to test with the model: if the input pulse train has one extra pulse or one fewer pulse, how are model LN4 responses affected? Does the model predict a preferred pulse train duration, or could it be the case that (for instance) longer pulse trains always lead to stronger LN4 activation?

Along the same lines, a cricket song consists of pulse-pause patterns, which are grouped within chirps that are separated by chirp pauses. The present paper ignores this and refers to the pulse-pause pattern filtering as 'song recognition'. This concern relates how the model would perform for an extra pulse.

This question relates to the selectivity of G. bimaculatus for the chirp. G. bimaculatus males produce song with a relatively fixed number of pulses per chirp – variation is small and amounts to the addition or removal of individual pulses (Schöneich and Hedwig 2012, Jacob and Hedwig 2016). G. bimaculatus females do show a chirp selectivity, but with a rather wide tuning and large inter-individual variability (Grobe et al., 2012). Accepted chirp durations range from 120 ms to 300 ms and thus the addition or removal of a single pulse does not make much of a difference to the phonotactic response. We confirm that our model is also robust to addition/removal of a single pulse in a new Figure 3—figure supplement 2.

Although it remains an interesting question whether the chirp selectivity known from behavioral studies could be part of the present model network, there are no electrophysiological data available to test this question. Such data would require extensive electrophysiological recordings with tests of combinations of different chirp durations and chirp pauses for several combinations of pulse-pause patterns as in Grobe et al., (2012). The Gryllus songs are in most cases songs with a chirped structure and, as our manuscript focuses on fast time scale recognition of the pulse-pause pattern, we do indeed not assess all aspects of ‘song recognition’. The focus of our analyses on the short timescale of pulse duration and pause is now made explicit throughout the manuscript. For example in the abstract (P1, L22):

“Using electrophysiological recordings from the network that recognizes crucial properties of the pulse pattern on the short timescale in the cricket Gryllus bimaculatus, we built a computational model that reproduces the neuronal and behavioral tuning of that species. An analysis of the model’s parameter space reveals that the network can provide all recognition phenotypes for pulse duration and pause known in crickets and even other insects.”

In the introduction (P4, L107):

“We here asked whether the network that recognizes features of the pulse pattern on the short timescale in G. bimaculatus (Figure 1D) has the capacity to produce the diversity of recognition phenotypes for pulse duration and pause known from crickets and other insects (Figure 1B), and what circuit properties support and constrain this capacity. [..] By exploring the network properties over a wide range of physiological parameters, we show that the network of G. bimaculatus can be modified to produce all types of preference functions for pulse duration and pause known from crickets and other insect species.”

Results (examples):

– “We tested whether the delay-line and coincidence-detector network of the cricket G. bimaculatus (Figure 1D) can be modified to produce the known diversity of preference functions for pulse duration and pause in cricket calling songs (Figure 1B).” (P4, L120)

– Renamed section title from “The network can be tuned to produce all known song preferences of crickets” to “The network can be tuned to produce all known preferences for pulse duration and pause in crickets.” (P7, L212)

– We also used more precise wording when summarizing our results, for instance: “Our analysis of different model variants suggests that this song recognition network can produce all known preference types for pulse duration and pause over the range of stimulus parameters relevant for crickets.” (P9, L305)

We now also address the preference for chirp parameters as an open question and discuss possible aspects of chirp preference the network could contribute to (Discussion, P16, L515):

“In addition, we have not yet explored the network’s ability to reproduce the behavioral selectivity for parameters on the longer timescale of chirps (Figure 1A) (Grobe et al., 2012; Blankers et al., 2016; Hennig et al., 2016). It is likely that the network can explain some properties of the selectivity for chirp known from crickets. For instance, that a minimum of two pulses is required to produce coincidence in the network could at least partly explain the existence of a minimal chirp duration for G. bimaculatus (Grobe et al., 2012). Likewise, suppression in the network reduces responses to long chirps which could explain the reduced behavioral preference for long chirps. However, the current electrophysiological data do not sufficiently constrain responses at these long timescales and studies are needed to address this issue more comprehensively.”

4. One of the biggest discrepancies between data and model is the LN5 response at longer periods/pause durations (Figure 2b). Is it possible for the authors to comment on how this discrepancy might impact their other findings in the paper?

5. The distribution of model tolerances in Figure 4d seems surprising. It seems to suggest that the four categories of tolerances (period/duration/duty cycle/pause) do not correspond to distinct categories of models, but rather that the tolerance axis can take essentially any orientation. Is this true of other cricket species as well? If so, might it be that dividing tolerant axes into these four categories is misleading, in that it imposes discrete categories onto what is really a continuously varying signal?

It is true that the network can produce response fields of essentially any orientation. However, this is not true for cricket species, which cluster around the orientations corresponding to the four principal types. The division of the data into four tolerant axes is thus motivated by the diversity of the known behavioral preferences. Focusing on these types allows us to compare the biological diversity with that produced by the model. The fact that the model produces preferences at orientations not found in nature hints at additional selection pressures, like temperature robustness or environmental noise, that lead to the selection of the four types. However, we cannot rule out the existence of intermediate response types in so far uncharacterized species.

Based on this and the next point in Essential Revisions, we’ve modified our analysis and re-wrote this section. We now make our motivations more explicit and support it with references to behavioral studies that have characterized the response fields in various species of crickets and other insect species (Results):

“The preferred pulse parameters – duration, pause, and their combinations period, and duty cycle – only incompletely describe a network’s recognition phenotype. […] Last, duty-cycle tuning (Figure 1C, cyan) is given by diagonal alignment (θ=45°) and selectivity for duty cycle but tolerance for period (G. lineaticeps, Hennig et al., (2016)).”
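The relationship between the four pulse parameters named in this passage can be made concrete with a short sketch. It assumes the standard definitions (period = duration + pause, duty cycle = duration / period); the class name and the example values are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PulsePattern:
    duration_ms: float  # pulse duration
    pause_ms: float     # silent pause between pulses

    @property
    def period_ms(self) -> float:
        # One cycle consists of a pulse followed by a pause.
        return self.duration_ms + self.pause_ms

    @property
    def rate_hz(self) -> float:
        return 1000.0 / self.period_ms

    @property
    def duty_cycle(self) -> float:
        # Fraction of each period occupied by the pulse.
        return self.duration_ms / self.period_ms

# Hypothetical example pattern: 20 ms pulses separated by 20 ms pauses.
pattern = PulsePattern(duration_ms=20.0, pause_ms=20.0)
print(pattern.period_ms, pattern.rate_hz, pattern.duty_cycle)  # 40.0 25.0 0.5
```

Only two of the four quantities are independent, which is why a response field can be drawn in the two-dimensional plane spanned by duration and pause, with period and duty cycle varying along its diagonals.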

We changed the histogram showing the distribution of orientations of the response fields from a log to a linear scale, to highlight the non-uniform distribution of response types (Figure 4D):

This shows that duration tuning is strongly over-represented, period and duty-cycle tuning are weakly enriched, and pause tuning is much rarer than expected. This unequal distribution of the orientations reflects the unequal distribution of preference types found in crickets. For instance, we know of no cricket species with pause tuning. This is now made clearer in a re-written section in Results (P9, L285):

“We find response fields with any orientation, again demonstrating that the network can produce more diverse response fields than has been reported in crickets. However, the orientations are unevenly distributed and are enriched for the principal types known from crickets: 36% of the response fields have an orientation of 0±10° which corresponds to duration tuning (expectation from uniform distribution: 20°/360° = 5.6%). Duty cycle tuning (45 ±10°) and period tuning (-45 ±10°) are also enriched, with 17% and 12%, respectively. Notably, pause tuning (90 ±10°) is not known in crickets and is the only principal type that is rarer than expected from a uniform distribution of orientations (2.0% vs. 2.8% expected). The rarity of pause tuning is consistent with the bias to prefer short pulse durations observed above (Figure 4C), since orientations around 90° require response fields that extend parallel to the pulse duration axis. Note that these trends do not depend critically on the ranges of angles chosen for specifying the different response types.”
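The enrichment argument in this passage is simple arithmetic, sketched here for the three enriched types. The sketch applies the 20°/360° expectation quoted for duration tuning to duty-cycle and period tuning as well, which is an assumption; the pause-tuning comparison (2.0% vs. 2.8%) is taken as reported and not recomputed. Function and variable names are hypothetical.

```python
def expected_uniform_fraction(half_width_deg, full_range_deg=360.0):
    """Fraction of orientations falling in a +/- half_width window if uniform."""
    return 2.0 * half_width_deg / full_range_deg

# Observed fractions as quoted in the text.
observed = {
    "duration (0 deg)": 0.36,
    "duty cycle (45 deg)": 0.17,
    "period (-45 deg)": 0.12,
}

expected = expected_uniform_fraction(10.0)  # 20/360, about 5.6%
enrichment = {name: frac / expected for name, frac in observed.items()}

for name, factor in enrichment.items():
    print(f"{name}: observed {observed[name]:.0%}, "
          f"expected {expected:.1%}, enrichment {factor:.1f}x")
```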

We also address possible implications of the fact that the model produces intermediate orientations which are not known from crickets (Results, P9, L300):

“Interestingly, the network tends to create a larger diversity of response fields than is known from crickets, for instance, fields that are symmetrical, multi-peaked, or have intermediate orientations. This suggests that biases in the network – like the rarity of pause tuning – constrain the distribution of preferences that evolution can select from, and that additional factors – like robustness to noise or temperature – then determine the ultimate distribution of phenotypes.”

6. We wonder how reasonable it is to describe all models as having a narrow "selective" axis and a broad "tolerant" axis as depicted in 4A. Among generated models where LN4m was responsive, was the LN4m response field always reasonably fit by a single ellipse, or did any models have more complex response fields?

To show that the response fields of most models are well described by single ellipsoids, we now characterize the response fields more systematically using three new analyses:

1. We determined whether a given response field was well described by an ellipsoid. We compute the overlap between each response field, binarized by thresholding at 50% of the field’s maximum, and a best-fitting ellipse. The overlap, given by the Jaccard similarity index, is high for most models: the median Jaccard similarity is 0.83, and 80% of all response fields have a Jaccard similarity >0.5. Manual inspection of response fields revealed that most models with a Jaccard similarity >0.5 are well described by an ellipse. We show a histogram of similarity values and examples of a high and a low overlap model in the new Figure 4—figure supplement 2A, B.

2. From the fitted ellipses, we computed an asymmetry index as the ratio between the length of the ellipses’ major and the minor axes. We find that most of the fields fitted well by a single ellipse (Jaccard similarity >0.5) are also asymmetrical (83% with an asymmetry index >1.25). The distribution of asymmetry indices is shown in a new Figure 4—figure supplement 2C.

3. We detected the number of distinct, well-separated peaks for each response field. “Dominant peaks were defined as local maxima in the response field exceeding 0.5 of the global maximum value and separated by a trough that is smaller than 0.75 of the value of the lower of the two peaks.” (Methods, P25, L929). This analysis revealed that the vast majority of fields (88%) has a single peak (Figure 4—figure supplement 2A). We now show examples from the 12% of the response fields with multiple peaks (new Figure 4—figure supplement 2D).
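The overlap measure from the first analysis above can be sketched as follows. The 50% threshold and the Jaccard index are taken from the text; how the best-fitting ellipse is obtained is not specified here, so the sketch substitutes a moment-based ellipse (centroid and covariance of the thresholded pixels, with a squared Mahalanobis radius of 4, which recovers the outline of a uniformly filled ellipse). That fitting method, the function names, and the synthetic test field are assumptions.

```python
import numpy as np

def binarize(field, frac=0.5):
    """Threshold a response field at a fraction of its maximum."""
    return field >= frac * field.max()

def moment_ellipse_mask(mask):
    """Ellipse with the same centroid and covariance as the mask pixels.

    Assumes the mask is not degenerate (pixels must span two dimensions).
    """
    rows, cols = np.nonzero(mask)
    pts = np.stack([rows, cols], axis=1).astype(float)
    center = pts.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(pts, rowvar=False))
    rr, cc = np.indices(mask.shape)
    d = np.stack([rr - center[0], cc - center[1]], axis=-1)
    mahal_sq = np.einsum("...i,ij,...j->...", d, inv_cov, d)
    return mahal_sq <= 4.0

def jaccard(a, b):
    """Intersection over union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0

def gaussian_blob(shape, center, sigmas):
    rr, cc = np.indices(shape)
    return np.exp(-((rr - center[0]) ** 2 / (2 * sigmas[0] ** 2)
                    + (cc - center[1]) ** 2 / (2 * sigmas[1] ** 2)))

# Synthetic fields: one elongated peak vs. two well-separated peaks.
single = gaussian_blob((40, 40), (20, 15), (8, 3))
double = gaussian_blob((40, 40), (10, 10), (3, 3)) + gaussian_blob((40, 40), (30, 30), (3, 3))

score_single = jaccard(binarize(single), moment_ellipse_mask(binarize(single)))
score_double = jaccard(binarize(double), moment_ellipse_mask(binarize(double)))
print(score_single, score_double)
```

A single elongated peak yields a high overlap, whereas a two-peaked field is poorly covered by one ellipse and scores much lower, which is the distinction the threshold of 0.5 in the text is meant to capture.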

The results of these analyses are covered in a new Figure 4—figure supplement 2 and in a new section in Results (P9, L275):

“We first examined to what extent the model produced the single-peaked, asymmetrical response fields typical for crickets. We find that most response fields (80%) produced by the selective model variants were well described by a single ellipse (Figure 4—figure supplement 2A, B, see Methods for details). Of these, 83% were asymmetrical (major axis >1.25x longer than minor axis), 17% were symmetrical (Figure 4—figure supplement 2C). 12% of all models produce multi-peaked response fields (Figure 4—figure supplement 2D), which are only known from katydids (Webb et al., 2007). The remaining 8% of the response fields were not well described by ellipses and/or did not have multiple distinct peaks. Thus, while the model produces more diverse responses – including complex, multi-peaked ones – most responses do match those typical for crickets.”

The analysis is also described in a new section “Characterization of response fields” in Methods:

“Characterization of response fields

The response fields known in crickets all have a single maximum and a roughly ellipsoid shape. […] For the asymmetry index, we extracted the length of the major and minor axis of the best-fitted ellipse as the eigenvalues of the covariance matrix and took their ratio.”
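A minimal sketch of the asymmetry index described in this Methods excerpt is given below. The use of the covariance eigenvalues and the 1.25 threshold follow the text. Taking the square root of the eigenvalues, so that the ratio refers to axis lengths rather than variances, is an assumption of this sketch, as are the function name and the rectangular test masks.

```python
import numpy as np

def asymmetry_index(mask):
    """Ratio of major to minor axis length of a binary response field.

    Axis lengths are taken as proportional to the square roots of the
    eigenvalues of the pixel-coordinate covariance matrix (assumption).
    """
    rows, cols = np.nonzero(mask)
    cov = np.cov(np.stack([rows, cols]).astype(float))
    eigvals = np.linalg.eigvalsh(cov)  # sorted in ascending order
    minor, major = np.sqrt(eigvals)
    return major / minor

# Hypothetical binarized fields: an elongated and a square region.
elongated = np.zeros((30, 30), dtype=bool)
elongated[10:14, 5:21] = True   # 4 x 16 pixels
square = np.zeros((30, 30), dtype=bool)
square[10:18, 10:18] = True     # 8 x 8 pixels

index_elongated = asymmetry_index(elongated)
index_square = asymmetry_index(square)
print(index_elongated > 1.25, index_square > 1.25)
```

With the threshold of 1.25 quoted in the text, the elongated field would count as asymmetrical and the square one as symmetrical.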

Taken together, the changes to how we present our results based on #5 and #6 in Essential Revisions provide a more complete and unbiased assessment of the diversity of responses produced by the model.

7. The authors should discuss the previously described behavioral inter-individual variability within the species G. bimaculatus in relation to the inter-species variability, which they cover with their parameter distributions.

The behaviorally well-documented interindividual variability in phonotaxis behaviors (Grobe et al., 2012; Meckenhäuser et al., 2013) likely arises at multiple levels: At the level of song pattern recognition (interindividual differences in the network parameters), at the level of phonotaxis behavior (biases and noise in localizing the sound), at the motivational level (low or high motivation leads to more or less selective responses), and at the motor level (variability from motor noise). Identifying the contribution of variability at the level of song pattern recognition is challenging, since the full characterization of the behavioral phenotype in terms of the response fields cannot be obtained reliably at the individual level (the stimulus space is too large); this would require multiple testing of the same individuals with the same large set of test patterns. Therefore, our model of song pattern recognition does not explicitly consider the interindividual variability but is meant to represent the behavior of an average female.

Regarding the relation between intra vs inter species variability in network parameters: Our sensitivity analysis (Figure 4H) identified many parameters with little impact on the model tuning, which form one or multiple fitness-neutral subregions in parameter space. In the biological population, these parameters likely vary between individuals, because they are subject to weak selection. By contrast, parameters with a high sensitivity score that affect crucial aspects of the recognition phenotype undergo stronger selection and are therefore likely less variable across individuals. These fitness-critical parameters are likely the ones driving changes in the recognition phenotype during a speciation event. Our current approach of assessing parameters individually (Figure 4G, H) or in pairs (Figure 4—figure supplement 1) is insufficient to capture the relationship of the fitness-neutral and the fitness-critical parameter subspaces. Novel tools (for instance, Bittner et al., (2019) and Gonçalves et al., (2019)) will be employed to more exhaustively characterize the high-dimensional parameter to phenotype map with respect to inter and intra specific variability.

This point is now discussed in the discussion:

“We find that many parameters have only weak impacts on the network’s preference, leading to a relatively large fitness-neutral parameter set and a robust phenotype (Prinz et al., 2004). For instance, we observed that the rebound amplitude in LN5 tends to be variable across electrophysiological recordings in different individuals but that the network model’s output is relatively robust to these changes (Figure 2B, Figure 2—figure supplement 4). Other parameters crucially affect the recognition phenotype and support evolvability. These fitness-critical parameters likely drive changes in the recognition phenotype during speciation events to ensure species discrimination (Amézquita et al., 2011; Schmidt et al., 2011). Future studies will employ novel computational methods for characterizing the high-dimensional parameter space (Bittner et al., 2021; Gonçalves et al., 2020) to more comprehensively characterize the fitness-neutral parameter set leading to interindividual variability and to provide insight into the robustness and evolvability of pattern recognition networks.”

These major points are further explained in the individual reviews

Reviewer #1 (Recommendations for the authors):

Clemens et al. present a computational model of the cricket song recognition network, which they show is capable of reasonably reproducing neural activity and song selectivity in G. bimaculatus. They then explore the parameter space of this network, and find that varying parameters of model cells enable it to produce a tremendous range of selectivities for the period, pulse duration, duty cycle, or pause duration of input song. They then identify the network parameters that most affect song selectivity, and investigate the relationship between several subsets of parameters and song preference. This is a fantastic exploration of the computational flexibility of a small neural circuit; it is very well researched and written, and was enjoyable to read. Although I had a few questions about the paper contents (see below), I believe that all of these can be addressed by the authors, upon which I would warmly recommend this paper for publication in eLife.

Major comments:

1) Several of the results in this paper, such as the anti-diagonal bands in Figure 3 and Figure 4b, seem to be a consequence of the fact that all simulated pulse trains had a fixed duration of 140ms. Is it reasonable to assume that pulse trains would have such a precisely fixed duration? Is LN4 also selective for pulse trains of a specific duration? If there isn't data on this, it would be an interesting thing to test with the model: if the input pulse train has one extra pulse or one fewer pulse, how are model LN4 responses affected? Does the model predict a preferred pulse train duration, or could it be the case that (for instance) longer pulse trains always lead to stronger LN4 activation?

This is now addressed in detail in ‘Essential Revisions #3’.

Within a species, pulse trains (chirps) have a highly consistent duration – differences between chirps lie mainly in the addition or removal of a single pulse. Behavioral data show that for the chirp durations used here, female phonotaxis is robust to the addition or removal of a single pulse (Grobe et al., 2012). We performed a new analysis to show that this robustness to the addition or removal of a single pulse in a chirp is reproduced by the model (new Figure 3—figure supplement 2).

While the tuning for chirp parameters (chirp duration, chirp pause) is much broader than that for pulse parameters, crickets do indeed exhibit species-specific tuning for the chirp. However, the focus of this manuscript is on the tuning for pulse parameters, which we now make explicit throughout the text (see Essential Rev #3). Some aspects of the known behavioral tuning for chirp parameters are likely explained by the network but some aspects likely require additional mechanisms not captured by the pulse-pattern recognition network. This is now discussed (P16, L515, reproduced in Essential Rev #3).
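The link between a fixed chirp duration and the number of pulses per chirp can be illustrated with a short calculation. This is a minimal sketch, not the stimulus code used in the study: it assumes that each chirp starts with a pulse and that only complete pulses are played, and the function name `pulses_per_chirp` is hypothetical. The default of 140 ms is the chirp duration mentioned by the reviewer.

```python
def pulses_per_chirp(pulse_duration_ms, pause_ms, chirp_duration_ms=140.0):
    """Count the complete pulses that fit into one chirp.

    Assumptions (for illustration only): the chirp starts with a pulse,
    pulse k occupies [k * period, k * period + pulse_duration], and a
    pulse is only counted if it ends within the chirp.
    """
    if pulse_duration_ms <= 0 or pulse_duration_ms > chirp_duration_ms:
        return 0
    period_ms = pulse_duration_ms + pause_ms
    # Largest k with k * period + duration <= chirp duration, plus the first pulse.
    return int((chirp_duration_ms - pulse_duration_ms) // period_ms) + 1


# Two patterns with the same 40 ms period contain the same number of pulses,
# whereas a longer period (70 ms) leaves room for fewer pulses.
print(pulses_per_chirp(20, 20))  # period 40 ms
print(pulses_per_chirp(10, 30))  # period 40 ms
print(pulses_per_chirp(20, 50))  # period 70 ms
```

Under these assumptions the pulse count depends mainly on the pulse period and changes in steps as the period grows, so patterns on the same line of constant period tend to contain the same number of pulses.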

2) Because of the number of layers and nonlinearities in this model, it is hard to picture what is happening under the hood to give rise to preferences for a particular period, duration, duty cycle, or pause in Figure 4. I found Figure 3 to be very helpful for the example of G. bimaculatus- would it be possible to generate similar plots for some of the models from Figure 4e, for comparison?

We now provide the response fields for all neurons in the network for an example from each response type in Figure 4E in a new Figure 4—figure supplement 3 in the supplement.

3) I was a bit surprised by the distribution of model tolerances in Figure 4d. This seems to suggest that the four categories of tolerances (period/duration/duty cycle/pause) do not correspond to distinct categories of models, but rather that the tolerance axis can take essentially any orientation. Is this true of cricket species as well? If so, might it be that dividing tolerant axes into these four categories is misleading, in that it imposes discrete categories onto what is really a continuously varying value?

This is now addressed in detail in ‘Essential Revisions #5’.

Known response fields from crickets fall into three categories – duration, period and duty cycle tuning. Pause tuning is not known. Intermediate types are not yet known. We have edited the text to clarify this. We have rewritten the relevant section in Results with new analyses and figures (Figure 4—figure supplement 2, 3) to provide a more nuanced assessment of the diversity of responses produced by the model.
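For orientation, the four preference types can be related to lines of constant feature value in the two-dimensional space spanned by pulse duration and pause. The symbols below ($d$ for pulse duration, $p$ for pause, $T$ for period, $DC$ for duty cycle, and the constants with subscript 0) are introduced here for illustration only.

```latex
\begin{aligned}
T  &= d + p, & DC &= \frac{d}{d + p} \\[4pt]
\text{constant duration:}   \quad & d = d_0 && \text{(line parallel to the pause axis)} \\
\text{constant pause:}      \quad & p = p_0 && \text{(line parallel to the duration axis)} \\
\text{constant period:}     \quad & p = T_0 - d && \text{(anti-diagonal, slope } -1\text{)} \\
\text{constant duty cycle:} \quad & p = \frac{1 - DC_0}{DC_0}\, d && \text{(line through the origin)}
\end{aligned}
```

A response field that is selective for one of these features and tolerant to changes in the others would be elongated along the corresponding line, and intermediate orientations correspond to mixtures of these cases.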

4) On a related note, I found myself wondering how reasonable it is to describe all models as having a narrow "selective" axis and a broad "tolerant" axis as depicted in 4A. Among generated models where LN4m was responsive, was the LN4m response field always reasonably fit by a single ellipse, or did any models have more complex response fields? In addition to the orientation and preference of the selective axis, is there anything to be learned from looking at the width of the tolerant/selective axes, or the preference range of the tolerant axis?

This is now addressed in detail in ‘Essential Revisions #6’.

New analyses now more comprehensively describe the general shape of the response fields (new text in Results (P9, L275) and new Figure 4—figure supplement 2).

5) One of the biggest discrepancies between data and model is the LN5 response at longer periods/pause durations (Figure 2b). Is it possible for the authors to comment on how this discrepancy might impact their other findings in the paper?

This is now addressed in detail in ‘Essential Revisions #4’.

We now show that this discrepancy has only minor impact on the model responses (new text in Results, new Figure 2—figure supplement 4).

Reviewer #2 (Recommendations for the authors):

This paper contributes an interesting study of how parameter variation in a five-neuron network, closely based on identified neurons in the cricket, can establish different temporal tuning properties. The main application is to cricket song recognition; although the paper argues for more general insight into temporal recognition circuits, this is somewhat limited. Similarly, the argument for evolutionary relevance, as explaining how the diversity of cricket song might arise, would be more strongly supported either by showing potential 'pathways' of divergence (ideally through co-evolution models of production and recognition) or a clearer link from the model components to plausible biophysical mechanisms that could produce the relevant properties (e.g. specific filter shapes, especially where these are assumed to be comprised of multiple components within one neuron). As such, though the work is sound, it is of somewhat narrow interest.

Main contributions: the context of the work is that evolution of sensory preferences has been well explored at receptor level, but not yet for more complex stimulus properties, for which more sophisticated neural processing is needed to identify the preferred property in the signal. The main result of the paper is that different song preferences, as found across cricket species, can be obtained for different parameter settings within the same circuit, with biases in the frequency of preference types that match phenotypic diversity. The specific parameters/neural properties that produce qualitative differences in tuning (preference for period, duration or duty-cycle) are examined in more detail to provide some mechanistic insight into the circuit. This is approached in a very thorough manner, e.g., looking at each neuron's contribution and carrying out a full exploration followed by a sensitivity analysis to focus on the most important properties, and as such is also of interest from a purely methodological point of view in neural modelling.

Substantive concerns:

1) As someone familiar both with cricket song recognition research and neural modelling, I had to work very hard to understand the circuit function from the presented description. The paper seems to assume the reader has very close familiarity with the papers by Kostarakos and Hedwig, 2012 and Schöneich et al., 2015 rather than giving a sufficiently clear account. E.g. in the introduction, the key concept is described as coincidence detection of delayed original input (AN1) and a "post-inhibitory rebound driven by the end of each sound pulse (LN5)". On the face of it, this seems to be a mechanism for pulse duration tuning, not period tuning, and it is unclear why "feature detector neuron LN4 integrates excitatory input from LN3 and inhibitory input from LN2, …sharpens its selectivity." From close inspection of Figure 2, the mechanism for period selectivity appears to be 1) the timing of the rebound from LN5 from one syllable coincides with the onset of the next syllable 2) the response per syllable without this input decays for repeated syllables, and is further reduced in LN4 by inhibition with the same pattern as AN1. To some extent these phenomena are discussed later in the paper with reference to the effects of specific parameters, e.g. to increase the duration of the rebound from LN5; but it is difficult for the reader to follow without having the initial conceptual understanding of the original model.

This is now addressed in detail in ‘Essential Revisions #1’.

We explain the working principle in more detail in the introduction, based on the reviewers’ suggestions (P4, L94).
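The coincidence principle described by the reviewer can be sketched as a toy simulation. This is a deliberately reduced illustration under stated assumptions, not the fitted network model: the rebound is represented as a fixed-latency, fixed-width window after each pulse offset, the direct pathway as a one-sample response at each pulse onset, and all names and parameter values (`pulse_train`, `coincidence_score`, a 20 ms latency, a 5 ms window) are hypothetical.

```python
import numpy as np


def pulse_train(duration_ms, pause_ms, chirp_ms=140, dt_ms=1):
    """Binary pulse train; a pulse at the end of the chirp may be truncated."""
    t = np.arange(int(chirp_ms / dt_ms)) * dt_ms
    return ((t % (duration_ms + pause_ms)) < duration_ms).astype(float)


def coincidence_score(stim, rebound_latency_ms=20, rebound_width_ms=5, dt_ms=1):
    """Count pulse onsets that fall into a rebound window of a previous pulse.

    Toy assumptions: each pulse offset triggers a rebound that starts after a
    fixed latency and lasts for a fixed width; each pulse onset produces a
    one-sample response in the direct pathway.
    """
    onsets = np.flatnonzero(np.diff(stim, prepend=0.0) > 0)
    offsets = np.flatnonzero(np.diff(stim) < 0) + 1  # first silent sample

    rebound = np.zeros_like(stim)
    for k in offsets:
        start = k + int(rebound_latency_ms / dt_ms)
        rebound[start:start + int(rebound_width_ms / dt_ms)] = 1.0

    onset_response = np.zeros_like(stim)
    onset_response[onsets] = 1.0

    # Coincidence detection: respond only where both inputs are active.
    return float(np.sum(onset_response * rebound))


for pause in (10, 20, 40):
    print(pause, coincidence_score(pulse_train(20, pause)))
```

In this reduced version, a rebound with fixed latency after pulse offset yields the highest score when the pause matches the latency. The tuning of the full model additionally depends on filter shapes, adaptation, and inhibition, none of which are captured by this sketch.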

2) The model seems relatively complex (multiple, somewhat arbitrarily chosen filters for each neuron, many parameters) and there is no discussion of whether it could be simplified while retaining the flexibility to be tuned to different song properties. Nor does the reader gain much insight into whether the parameters causing particular effects are plausible, or what might be the biophysical basis (this is discussed only for time delay variables) that could be subject to genetic modification.

We agree that the number of parameters is large. This stems from our goal to reproduce the crucial aspects of the dynamics of all neurons faithfully. We believe that a high-dimensional and degenerate parameter space is something that our model shares with the biological network and it is key to the network’s computational flexibility. Our sensitivity analyses identified key parameters that shape crucial aspects of the model tuning (Figures 4, 5, 6). Finding a minimally parameterized circuit model that trades fidelity for simplicity while retaining computational capacity would be a fascinating topic for a future study!

These points are now addressed in Discussion:

“Computational flexibility also needs evolvability – the ability to generate specific and robust phenotypes during evolution (Blankers et al., 2015). The “mechanistic degeneracy” of computation in biological neural networks – the fact that specific network outputs can be obtained by tuning disparate parameters – is thought to be a prerequisite for evolvability (Wagner (2011), see also Hasson et al., (2020); Leonardo (2005); Schneider et al., (2021)). The degeneracy concept has been demonstrated extensively for motor networks (Gutierrez et al., 2013; Prinz et al., 2004), and here we also found evidence of degeneracy in the cricket song recognition network. For instance, the preferred pulse period (Figure 5A, B, Figure 5—figure supplement 1) or the pulse duty cycle (Figure 5G-I) can be set by multiple parameters. The hypothesis that high-dimensional and degenerate biological parameter spaces are more evolvable could be tested by assessing the computational flexibility of a minimal model of the network that produces the observed behavior with as few parameters as possible.”

To address the issue of the plausibility of our parameter values, we now discuss the possible biophysical bases for all parameters in the Discussion (P17, L543) (see ‘Essential Revisions #1’).

3) The argument for insight into evolution from close examination of this network is not very convincing. Why would the existing network in one species be the "mother network" for other species?

It is likely that large parts of the network are retained in closely related species and therefore that the network from G. bimaculatus resembles that found in closely related cricket species. This hypothesis is supported post-hoc by our results, which show that this network can produce all known behavioral preferences for pulse pause and duration in the species group. In addition, a recent phylogenetic analysis suggests that G. bimaculatus is close to the base of the phylogenetic tree from which many other species emerged (Gray et al., 2020, Page 17). Although this does not prove that the network of G. bimaculatus is a mother network for other species, phylogeny is in principle consistent with this idea.

This is now explained in the discussion:

“While the network model analyzed here is derived from recordings in one species (G. bimaculatus), the delay-line and coincidence-detector network is likely shared within the closely related cricket species. The phylogenetic position of G. bimaculatus close to the base of the phylogenetic tree from which many other species emerged is consistent with this idea (Gray et al., 2020). Our finding that this network can produce all known preferences for pulse and pause supports this idea and suggests that it forms a common substrate – a “mother network” – for the diversity of song recognition phenotypes in crickets.”

In the introduction, it is argued that “song recognition networks must be selective and modifiable to adapt to changing signal patterns” but the evolutionary drive seems more likely to be the opposite – the song should adapt to the recognition. Discussion of the co-evolution of production and recognition is very limited.

This is true. The most likely scenario for the co-evolution of song preference and structure is the following: Female preferences drift around in signal space, maybe pushed by abiotic (environmental noise selects against preferences for very short pauses) and biotic factors (avoid overlap with heterospecifics (Amézquita et al., 2011)). Males then “adjust” their song to attract females. In this scenario, a female network that has the capacity to produce many different preference types supports the ability of the communication system to diverge. Our finding that the network can produce all preference types is consistent with this hypothesis.

This is now clarified in Introduction:

“Since the evolution of song is largely driven by the female (Gray and Cade, 2000), the females’ song recognition must be selective and modifiable in order to drive the evolution of distinct, species-specific song patterns in males (Wagner, 2008).”

And in Discussion:

“The computational flexibility of the recognition mechanism may explain the species richness as well as the speed of evolution in a particular taxon like crickets (Alexander, 1962; Blankers et al., 2015; Desutter-Grandcolas and Robillard, 2003; Oh and Shaw, 2013; Otte, 1992): Female preferences drift around with little constraint in signal space, maybe pushed by abiotic (environmental noise selects against preferences for very short pauses) and biotic factors (avoid overlap with heterospecifics, Amézquita et al. (2011)). The male song evolution follows female evolution since only males that sing attractive songs will reproduce. In this scenario, a female network that has the capacity to produce many different preference types supports the ability of the communication system to diverge. However, this co-evolution of song preference and song structure requires male song production networks to be as flexible as the female song recognition networks (Jacob and Hedwig, 2020; Schöneich, 2020). There is also accumulating evidence for genetic coupling between the networks that produce and recognize the song pattern, which may ensure that sender and receiver stay tuned during evolution of song pattern (Xu and Shaw, 2009; Schöneich, 2020; Xu and Shaw, 2021).”

Reviewer #3 (Recommendations for the authors):

This model study nicely and exemplarily describes how, in a sensory system with highly limited neuronal resources, a small generic network with 5 neuron types can flexibly generate a variety of tuning properties, allowing for species-specific auditory mate-recognition. Building on their previous body of works, the authors here employ a phenomenological (i.e. non-mechanistic) rate-based feed-forward circuit model, fitted to accommodate known single-neuron input/output features. The model faithfully predicts the animals' (average) response behavior to parametrically controlled sensory stimuli. Targeted parameter modifications can tune the network for different auditory pulse patterns. The authors argue that such a flexible generic network motif could allow for evolutionarily fast species separation.

1) The authors state "The neuronal circuit … has been revealed p.4 …". However, is there clear anatomic evidence for the explicit network wiring of 5 neurons and 6 connections? Does each of these neurons exist only once per hemisphere of any individual or are the authors referring to neuron types? Clear evidence should be referenced or missing evidence should be critically discussed.

This is correct – the network is not based on a complete wiring diagram of the song recognition circuit in G. bimaculatus – it was reconstructed from anatomical data and single-cell recordings. We now refer to the different neurons in the network as cell types, of which multiple copies could exist per hemisphere.

This is clarified in Results:

“This network was previously inferred from the anatomical overlap together with the dynamics and the timing of responses of individually recorded neurons to a diverse set of pulse patterns (Kostarakos and Hedwig, 2012; Schöneich et al., 2015). Given that electrophysiology is challenging in this system, dual-electrode recordings to prove the existence of the inferred connections do not exist presently. We consider the neurons in the network cell types, that may also comprise multiple cells per hemisphere with highly consistent properties across individuals (Schöneich, 2020).”

2) The authors present a purely phenomenological model. How are these computations implemented biophysically? Which synaptic, cellular and network mechanisms are involved? Discussion of possible mechanisms and references to relevant works seems mandatory, in particular with respect to the (long) delays / rebound delay in the causal filters and the divisive normalization. Discussing adaptation for type AN1 and LN1 by means of either SFA (Nagel and Wilson, 2011, Nat Neurosci 14(2); Farkhooi et al., 2013, PLoS CB 9(10); Benda et al., 2003,2008) or short-term depressing synapses is straightforward. The phenomenological rate model has its own value. However, the argument that the authors did not aim at a biophysical implementation because ion channels and conductances are not known is not a good argument, this would have prevented 95% of published model studies.

This is now addressed in detail in ‘Essential Revisions #1’.

We added a detailed discussion of the biophysical implementation of all model elements to the discussion. We also now justify our phenomenological modelling approach more clearly (Results):

“We consider the neurons in the network cell types, that may also comprise multiple cells per hemisphere with highly consistent properties across individuals (Schöneich, 2020). We fitted a computational model based on intracellularly recorded responses of the network’s neurons to pulse trains. Since our goal was to obtain a model that captures the computational capacity of the network without tying it to a specific biophysical implementation, we reproduced the responses of individual neurons using a phenomenological model based on four elementary computations (Figure 2A): 1. filtering, 2. nonlinear transfer functions (nonlinearities), 3. adaptation, and 4. linear transmission with a delay. This phenomenological model allows us to assess the network’s computational capacity independent of a specific biophysical implementation. However, all model components have straightforward biophysical correlates (see Discussion), which allows us to propose biophysical parameters that tune the network in specific implementations.”
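The four elementary computations named in this passage can be chained into a single model neuron as in the following sketch. It is an illustration under assumptions, not the fitted model: the rectifying nonlinearity, the implementation of adaptation as divisive normalization by a low-pass-filtered copy of the rate, and all names and parameter values (`model_neuron`, `adapt_tau`, `adapt_strength`, and so on) are hypothetical choices made for this example.

```python
import numpy as np


def model_neuron(x, kernel, threshold=0.0, gain=1.0,
                 adapt_tau=50.0, adapt_strength=1.0, delay=0):
    """Toy rate neuron: filter -> nonlinearity -> adaptation -> delay.

    x and kernel are 1-D arrays sampled at the same time step; adapt_tau
    and delay are given in samples. All choices are illustrative.
    """
    x = np.asarray(x, dtype=float)

    # 1. Filtering: causal convolution with a linear kernel.
    filtered = np.convolve(x, kernel)[:len(x)]

    # 2. Nonlinearity: threshold-linear (rectifying) transfer function.
    rate = gain * np.maximum(filtered - threshold, 0.0)

    # 3. Adaptation: divide the rate by a running average of its own past.
    adapted = np.zeros_like(rate)
    running = 0.0
    for t in range(len(rate)):
        adapted[t] = rate[t] / (1.0 + adapt_strength * running)
        running += (rate[t] - running) / adapt_tau

    # 4. Linear transmission with a delay: shift the output in time.
    return np.concatenate([np.zeros(delay), adapted[:len(adapted) - delay]])


# Response to a sustained input: silent during the delay, strong at
# onset, then declining as adaptation builds up.
response = model_neuron(np.ones(200), kernel=np.array([1.0]), delay=5)
print(response[:8].round(3), response[-1].round(3))
```

Connecting several such units with excitatory or inhibitory weights would give a feed-forward network in the spirit of the description above; the parameters of each stage are the kind of quantities that a parameter-space analysis would vary.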

3) The authors repeatedly refer to "song-recognition". However, the authors only investigate pulse sequences (Figure 1B) neglecting the impact of chirp tuning (e.g. Grobe et al., 2012, JEB 2015; Meckenhäuser et al., 2013, PloS one, 8; Clemens and Hennig, 2013). This needs to be discussed.

This is now addressed in detail in ‘Essential Revisions #3’.

This is correct: we focus on pulse sequences because selectivity for this aspect of the song is typically high while the tuning for chirp parameters tends to be broader. In addition, the network we model has been shown to explain pulse pattern recognition and no data exist that probe the network’s selectivity for long time scales. However, the network likely contributes to the preference for long timescales. This is now discussed (P16, L515) and we now make our focus on the recognition of pulse duration and pause explicit throughout the manuscript in abstract, intro, etc. (see ‘Essential Revisions #1’, reply P3-5).

4) The authors argue their model "has the capacity to reproduce the behavioral preferences" (p.8) with reference to behavioral tuning in Figure 1B. They should make explicit that this refers to the congruence of the LN4 response field (Figure 3) and behavioral response field (Figure 1B) for G. bimaculatus; no attempt was made to explicitly model behavioral output. How could behavior be generated? What is the typical delay between song onset and behavioral response? What is known about the behavioral decision circuit? Please discuss possible mechanisms of behavioral decision making such as the previously suggested drift diffusion models (Hennig et al., 2014, Front. Physiol 5; Clemens et al., 2014, PNAS 11; Meckenhäuser et al., 2014, Front. Sys. Neurosci 8) and possible others?

This is now addressed in detail in ‘Essential Revisions #2’.

For the regular patterns used in this study and found in natural cricket songs, the translation from LN4 output to phonotaxis is rather simple and we now make explicit that we use the firing rate of LN4 over a chirp as a predictor of phonotaxis (Results P8 L228, Methods P23, L817). We agree that integrative properties that exceed the duration of the pulse exist in insect decision making (Meckenhäuser, Dasgupta 2014) and that they can be revealed using artificial stimuli with conflicting information (Poulet 2005, Clemens 2014). However, these processes do not strongly affect responses to the repetitive stimuli used here. We discuss the existence of integrative processes in Results (P8, L231).

5) Inter vs. intra-species variability. To my knowledge there is large inter-individual variability in female G. bimaculatus behavior (Grobe et al., 2012, Meckenhäuser et al., 2013) and the authors state tenfold neuron parameter variation within species. However, they model only an average animal and do not mention behavioral variability at all. How could parameter variation for inter-individual variability in their network differ from inter-species variability? This should be discussed.

See Essential Revisions #7.

6) What is the critical test for the existence of a 'multi-purpose' circuit across species? Do the authors expect the same basic network topology across species and did they attempt to identify it (anatomically) in any species other than G. bimaculatus? Can they think of other methods of validation? This should be addressed in the Discussion.

We have not done recordings or anatomical analyses in other species. We now propose behavioral and electrophysiological tests of our hypotheses in a new section in the discussion:

“How can the mother network hypothesis be tested? Behavioral tests can provide insight into whether other species use the coincidence detection algorithm found in G. bimaculatus (Hedwig and Sarmiento-Ponce, 2017). These experiments can for instance test the prediction that the duration of the last pulse in a chirp only weakly impacts network responses. Species that violate this prediction are unlikely to recognize song by the same coincidence mechanism. However, the “mother network” hypothesis does not imply that all crickets implement a coincidence detection algorithm, just that they reuse the same neurons with largely conserved response properties. In fact, our analyses have shown that through changes in key parameters, coincidence detection can be circumvented to produce a different preference type (Figure 6, Figure 4—figure supplement 1). That is why further electrophysiological experiments in G. bimaculatus are crucial to reveal the precise biophysical mechanisms that tune the network and ultimately link changes in gene expression, for instance, of specific ion channels, to changes in network tuning. Importantly, these experiments need to be extended to other species, by identifying and characterizing homologues of the neurons in the network. Recordings in other species are challenging but feasible, since homologous neurons are expected to be found in similar locations in the brain. Based on the known behavioral tuning, our model produces testable predictions for what key properties of these neurons may look like in any given species (see below).”

7) Authors briefly mention various song preference patterns across species. It would be great to see specific examples of song patterns for a few species along with matching model tuning, e.g. in a supplemental figure, possibly together with behavioral data / response diagrams.

We agree that an interesting next step would be to compare the model fitted to different species. We now show example response fields from three different species of crickets in Figure 1B and our analyses of the randomized models show that the network can produce preferences for the pulse durations and pauses found in crickets (Figures 4B, C). However, as mentioned in Results, the preferred pulse-pause combination of the network is insufficient to describe the phenotype. Fitting the networks to the full response fields known from other species would be a large amount of work and is beyond the scope of the current, already quite long paper.

8) The core result on multi-species covered in Figure 4 and text indicates all four "principal types" of response fields can be achieved by the model with reference to Figure 1B. Where does Figure 1B come from? Are we looking at sketches? Or behavioral response fields measured in crickets? Please make this clear and give references to the underlying data. Clemens and Hennig 2013 and Hennig et al., 2014 do not show these graphs. Ideally, the authors can reproduce exemplary experimental data from diverse cricket species for illustration.

This is now addressed in ‘Essential Revisions #2’.

Figure 1B now shows response fields measured from crickets.

https://doi.org/10.7554/eLife.61475.sa2

Article and author information

Author details

  1. Jan Clemens

    1. European Neuroscience Institute Göttingen – A Joint Initiative of the University Medical Center Göttingen and the Max-Planck Society, Göttingen, Germany
    2. BCCN Göttingen, Göttingen, Germany
    Contribution
    Conceptualization, Formal analysis, Writing – original draft
    For correspondence
    clemensjan@gmail.com
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-4200-8097
  2. Stefan Schöneich

    1. University of Cambridge, Department of Zoology, Cambridge, United Kingdom
    2. Friedrich-Schiller-University Jena, Institute for Zoology and Evolutionary Research, Jena, Germany
    Contribution
    Data curation, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-4503-5111
  3. Konstantinos Kostarakos

    1. University of Cambridge, Department of Zoology, Cambridge, United Kingdom
    2. Institute of Biology, University of Graz, Universitätsplatz, Austria
    Contribution
    Data curation, Writing – review and editing
    Competing interests
    No competing interests declared
  4. R Matthias Hennig

    Humboldt-Universität zu Berlin, Department of Biology, Philippstrasse, Germany
    Contribution
    Conceptualization, Supervision, Writing – review and editing
    Competing interests
    No competing interests declared
  5. Berthold Hedwig

    University of Cambridge, Department of Zoology, Cambridge, United Kingdom
    Contribution
    Conceptualization, Supervision, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-1132-0056

Funding

Biotechnology and Biological Sciences Research Council (BB/J01835X/1)

  • Berthold Hedwig
  • Konstantinos Kostarakos

Royal Society (Newton International Fellowship)

  • Konstantinos Kostarakos

Leibniz-Gemeinschaft (SAW 2012-MfN-3)

  • R Matthias Hennig

Deutsche Forschungsgemeinschaft (HE 2812/4-1)

  • R Matthias Hennig

Deutsche Forschungsgemeinschaft (HE 2812/5-1)

  • R Matthias Hennig

Deutsche Forschungsgemeinschaft (CL 596/1-1)

  • Jan Clemens

Deutsche Forschungsgemeinschaft (CL 596/2-1)

  • Jan Clemens

Deutsche Forschungsgemeinschaft (SCHO 1822/3-1)

  • Stefan Schöneich

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Biswa Sengupta for discussions during an early phase of the project.

Senior and Reviewing Editor

  1. Ronald L Calabrese, Emory University, United States

Reviewers

  1. Ann Kennedy, Northwestern University, United States
  2. Barbara Webb, University of Edinburgh, United Kingdom
  3. Martin Nawrot

Publication history

  1. Preprint posted: July 27, 2020 (view preprint)
  2. Received: July 27, 2020
  3. Accepted: November 3, 2021
  4. Accepted Manuscript published: November 11, 2021 (version 1)
  5. Version of Record published: December 1, 2021 (version 2)

Copyright

© 2021, Clemens et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

