Representation of male features in the female mouse Accessory Olfactory Bulb, and their stability during the estrus cycle

  1. Department of Medical Neurobiology, Institute for Medical Research Israel Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel

Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, public reviews, and a provisional response from the authors.


Editors

  • Reviewing Editor
    Tali Kimchi
    Weizmann Institute of Science, Rehovot, Israel
  • Senior Editor
    Kate Wassum
    University of California, Los Angeles, Los Angeles, United States of America

Reviewer #1 (Public review):

Summary:

In this detailed study, Cohen and Ben-Shaul characterized the AOB cell responses to various conspecific urine samples in female mice across the estrous cycle. The authors found that AOB cell responses vary with the strains and sexes of the samples. Between estrous and non-estrous females, no clear or consistent difference in responses was found. The cell response patterns, as measured by the distance between pairs of stimuli, are largely stable. When some changes do occur, they are not consistent across strains or male status. The authors concluded that AOB detects the signals without interpreting them. Overall, this study will provide useful information for scientists in the field of olfaction.

Strengths:

The study uses electrophysiological recording to characterize the responses of AOB cells to various urines in female mice. AOB recording is not trivial, as it requires activation of the VNO pump. The team uses a unique preparation to activate the VNO pump with electrical stimulation, allowing them to record AOB cell responses to urines in anesthetized animals. The study comprehensively describes the AOB cell responses to social stimuli and how the responses vary (or not) with features of the urine source and the reproductive state of the recorded females. The dataset could be a valuable resource for scientists in the field of olfaction.

Weaknesses:

(1) The figures could be better labeled.

(2) For Figure 2E, please plot the error bar. Are there any statistics performed to compare the mean responses?

(3) For Figure 2D, it will be more informative to plot the percentage of responsive units.

(4) Could the similarity in response be explained by the similarity in urine composition? The study will be significantly strengthened by understanding the "distance" of chemical composition in different urine.

(5) If it is not possible for the authors to obtain these data first-hand, published data on MUPs and chemicals found in these urines may provide some clues.

(6) It is not very clear to me whether the female overrepresentation is because there are truly more AOB cells that respond to females than males or because there are only two female samples but 9 male samples.

(7) If the authors only select two male samples, let's say ICR Naïve and ICR DOM, combine them with responses to two female samples, and do the same analysis as in Figure 3, will the female response still be overrepresented?

(8) In Figure 4B and 4C, the pairwise distance during non-estrus is generally higher than that during estrus, although they are highly correlated. Does it mean that the cells respond to different urines more distinctively during diestrus than in estrus?

(9) The correlation analysis is not entirely intuitive when just looking at the figures. Some sample heatmaps showing the response differences between estrous states will be helpful.

Reviewer #2 (Public review):

Summary:

Many aspects of the study are carefully done, and in the grand scheme this is a solid contribution. I have no "big-picture" concerns about the approach or methodology. However, in numerous places the manuscript is unnecessarily vague, ambiguous, or confusing. Tightening up the presentation will magnify their impact.

Strengths:

(1) The study includes urine donors from males of three strains each with three social states, as well as females in two states. This diversity significantly enhances their ability to interpret their results.

(2) Several distinct analyses are used to explore the question of whether AOB MCs are biased towards specific states or different between estrus and non-estrus females. The results of these different analyses are self-reinforcing about the main conclusions of the study.

(3) The presentation maintains a neutral perspective throughout while touching on topics of widespread interest.

Weaknesses:

(1) Introduction:
The discussion of the role of the VNS and preferences for different male stimuli should perhaps include Wysocki and Lepri 1991

(2) Results:
a) Given the 20s gap between them, the distinction between sample application and sympathetic nerve trunk stimulation needs to be made crystal clear; in many places, "stimulus application" is used in places where this reviewer suspects they actually mean sympathetic nerve trunk stimulation.
b) There appears to be a mismatch between the discussion of Figure 3 and its contents. Specifically, there is an example of an "adjusted" pattern in 3A, not 3B.
c) The discussion of patterns neglects to mention whether it's possible for a neuron to belong to more than one pattern. For example, it would seem possible for a neuron to simultaneously fit the "ICR pattern" and the "dominant adjusted pattern" if, e.g., all ICR responses are stronger than all others, but if simultaneously within each strain the dominant male causes the largest response.

(3) Discussion:
a) The discussion of chemical specificity in urine focuses on volatiles and MUPs (citation #47), but many important molecules for the VNS are small, nonvolatile ligands. For such molecules, the corresponding study is Fu et al 2015.
b) "Following our line of reasoning, this scarcity may represent an optimal allocation of resources to separate dominant from naïve males": 1 unit out of 215 is roughly consistent with a single receptor. Surely little would be lost if there could be more computational capacity devoted to this important axis than that? It seems more likely that dominance is computed from multiple neuronal types with mixed encoding.

(4) Methods:
a) Male status, "were unambiguous in most cases": is it possible to put numerical estimates on this? 55% and 99% are both "most," yet they differ substantially in interpretive uncertainty.
b) Surgical procedures and electrode positioning: important details of probes are missing (electrode recording area, spacing, etc).
c) Stimulus presentation procedure: Are stimuli manually pipetted or delivered by apparatus with precise timing?
d) Data analysis, "we applied more permissive criteria involving response magnitude": it's not clear whether this is what's spelled out in the next paragraph, or whether that's left unspecified. In either case, the next paragraph appears to be about establishing a noise floor on pattern membership, not a "permissive criterion."
e) Data analysis, method for assessing significance: there's a lot to like about the use of pooling to estimate the baseline and the use of an ANOVA-like test to assess unit responsiveness.
But:
i) for a specific stimulus, at 4 trials (the minimum specified in "Stimulus presentation procedure") kruskalwallis is questionable. They state that most trials use 5, however, and that should be okay.
ii) the methods statement suggests they are running kruskalwallis individually for each neuron/stimulus, rather than once per neuron across all stimuli. With 11 stimuli, there is a substantial chance of a false-positive if they used p < 0.05 to assess significance. (The actual threshold was unstated.) Were there any multiple comparison corrections performed? Or did they run kruskalwallis on the neuron, and then if significant assess individual stimuli? (Which is a form of multiple-comparisons correction.)

Author response:

Public Reviews:

Reviewer #1 (Public review):

Weaknesses:

(1) The figures could be better labeled.

Figures will be revised to provide more detailed labeling.

(2) For Figure 2E, please plot the error bar. Are there any statistics performed to compare the mean responses?

We did not perform statistical comparisons (between the mean rates across the population). We will add this analysis and the corresponding error bars.

(3) For Figure 2D, it will be more informative to plot the percentage of responsive units.

We will do so.

(4) Could the similarity in response be explained by the similarity in urine composition? The study will be significantly strengthened by understanding the "distance" of chemical composition in different urine.

We agree. As we wrote in the Discussion: “Ultimately, lacking knowledge of the chemical space associated with each of the stimuli, this and all the other ideas developed here remain speculative.”

A better understanding of chemical distance is an important aspect that we aim to include in our future studies. However, this is far from trivial, because what matters is not chemical distance per se (which is itself hard to define), but rather the “projection” of chemical space onto the array of vomeronasal receptor neurons. That is, knowing the chemical composition of the stimuli without knowing which molecules are vomeronasal system ligands will provide only a partial picture. Despite these limitations, this is an important analysis, which we would have performed had we had access to these data.

(5) If it is not possible for the authors to obtain these data first-hand, published data on MUPs and chemicals found in these urines may provide some clues.

Measurements of some classes of molecules may be available for some of the stimuli that we used here, but not for all. We are not aware of any single dataset that contains this information for any type of molecule (e.g., MUPs) across the entire stimulus set that we used. More generally, pooling results from different studies has limited validity because of the biological and technical variability across studies. To reliably interpret our current recordings, it would be necessary to measure the urinary content of the very same samples that were used for stimulation. Unfortunately, we are not able to conduct this analysis at this stage.

(6) It is not very clear to me whether the female overrepresentation is because there are truly more AOB cells that respond to females than males or because there are only two female samples but 9 male samples.

It is true that the number of neurons fulfilling each pattern depends on the number of individual stimuli that define it. However, our measure of “over-representation” aims to overcome this bias by using bootstrapping to test whether the observed number of neurons fulfilling a pattern is larger than expected by chance. More generally, we note that a higher frequency of responses to female than to male stimuli has been reported in other studies, by others and by us, including when the numbers of male and female stimuli are matched (e.g., Bansal et al., BMC Biol 2021; Ben-Shaul et al., PNAS 2010; Hendrickson et al., JNS 2008).
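To illustrate the kind of chance-level comparison described in this response, here is a minimal sketch of a label-shuffling test. It is not the authors' procedure: the pattern definition (every target stimulus evokes a stronger response than every other stimulus), the function names, and the toy data are all assumptions made for illustration. Shuffling responses across stimulus labels within each unit gives a null count that already reflects how many stimuli define the pattern, so a pattern defined by 2 of 11 stimuli is compared against its own chance level.

```python
import random

def fits_pattern(responses, target_idx):
    """True if every target stimulus evokes a stronger response than all others.
    (Hypothetical pattern definition, for illustration only.)"""
    target = [responses[i] for i in target_idx]
    others = [r for i, r in enumerate(responses) if i not in target_idx]
    return min(target) > max(others)

def overrepresentation_p(units, target_idx, n_shuffles=2000, seed=0):
    """Compare the observed count of pattern-fitting units with counts obtained
    after shuffling each unit's responses across stimulus labels."""
    rng = random.Random(seed)
    observed = sum(fits_pattern(u, target_idx) for u in units)
    at_least = 0
    for _ in range(n_shuffles):
        count = 0
        for u in units:
            shuffled = u[:]
            rng.shuffle(shuffled)
            count += fits_pattern(shuffled, target_idx)
        at_least += count >= observed
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (at_least + 1) / (n_shuffles + 1)

# Toy data (assumed): 40 units, 11 stimuli; indices 9 and 10 stand for female urine.
data_rng = random.Random(1)
units = [[data_rng.random() for _ in range(11)] for _ in range(40)]
for u in units[:10]:          # give 10 units a clear female preference
    u[9] += 2.0
    u[10] += 2.0

female_idx = (9, 10)
observed, p_value = overrepresentation_p(units, female_idx)
print(observed, p_value)
```

In this toy example a shuffled unit fits the two-stimulus pattern with probability 1/55, so ten or more fitting units out of 40 lies far outside the shuffled distribution.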

(7) If the authors only select two male samples, let's say ICR Naïve and ICR DOM, combine them with responses to two female samples, and do the same analysis as in Figure 3, will the female response still be overrepresented?

We believe that the answer is yes, but we can and will perform this analysis to check.

(8) In Figure 4B and 4C, the pairwise distance during non-estrus is generally higher than that during estrus, although they are highly correlated. Does it mean that the cells respond to different urines more distinctively during diestrus than in estrus?

This is an important observation. For the Euclidean distance there might be a simple explanation as the distance depends on the number of units (and there are more units recorded in non-estrus females). However, this simple explanation does not hold for the correlation distance. A higher distance implies higher discrimination during the non-estrus stage, but our other analyses of sparseness and the selectivity indices do not support this idea. We note that absolute values of distance measures should generally be interpreted cautiously, as they may depend on multiple factors including sample size. Also, a small number of non-selective units could increase the correlation in responses among stimuli, and thus globally shift the distances. For these reasons, we focus on comparisons, rather than the absolute values of the correlation distances. In the revised manuscript, we will note and discuss this important observation.
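The difference between the two distance measures noted above can be made concrete with a small sketch. The response values below are invented for illustration and are not data from the study. Duplicating a population of units doubles the number of dimensions without changing the response structure: the Euclidean distance between two stimuli then grows by a factor of √2, whereas the correlation distance (one minus the Pearson correlation) is unchanged.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two population response vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def correlation_distance(a, b):
    """One minus the Pearson correlation between two response vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1 - cov / math.sqrt(var_a * var_b)

# Invented responses of five units to two stimuli.
stim_a = [1.0, 4.0, 2.0, 0.5, 3.0]
stim_b = [2.0, 3.0, 0.0, 1.5, 5.0]

# Same response structure, twice as many units.
big_a, big_b = stim_a * 2, stim_b * 2

euclid_ratio = euclidean(big_a, big_b) / euclidean(stim_a, stim_b)
corr_small = correlation_distance(stim_a, stim_b)
corr_big = correlation_distance(big_a, big_b)
print(euclid_ratio, corr_small, corr_big)
```

This only demonstrates the dependence on unit count; it does not address the other factors mentioned in the response, such as non-selective units shifting correlations globally.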

(9) The correlation analysis is not entirely intuitive when just looking at the figures. Some sample heatmaps showing the response differences between estrous states will be helpful.

If we understand correctly, the idea is to show the correlation matrices from which the values in 4B and 4C are taken. We can and will do this, probably as a supplementary figure.

Reviewer #2 (Public review):

Summary:

Many aspects of the study are carefully done, and in the grand scheme this is a solid contribution. I have no "big-picture" concerns about the approach or methodology. However, in numerous places the manuscript is unnecessarily vague, ambiguous, or confusing. Tightening up the presentation will magnify their impact.

We will revise the text with the aim of tightening the presentation.

Weaknesses:

(1) Introduction:

The discussion of the role of the VNS and preferences for different male stimuli should perhaps include Wysocki and Lepri 1991

Agreed. We will refer to this work in our discussion.

(2) Results:

a) Given the 20s gap between them, the distinction between sample application and sympathetic nerve trunk stimulation needs to be made crystal clear; in many places, "stimulus application" is used in places where this reviewer suspects they actually mean sympathetic nerve trunk stimulation.

In this study, we considered both responses that are triggered by sympathetic trunk activation and those that occur (as happens in some preparations) immediately following stimulus application, prior to nerve trunk stimulation. An example of the latter is provided by the second unit shown in Figure 1D (as also indicated in the figure legend). In our revision, we will clarify this point further.

b) There appears to be a mismatch between the discussion of Figure 3 and its contents. Specifically, there is an example of an "adjusted" pattern in 3A, not 3B.

True. Thanks for catching this error. We will correct this.

c) The discussion of patterns neglects to mention whether it's possible for a neuron to belong to more than one pattern. For example, it would seem possible for a neuron to simultaneously fit the "ICR pattern" and the "dominant adjusted pattern" if, e.g., all ICR responses are stronger than all others, but if simultaneously within each strain the dominant male causes the largest response.

This is true. In the legend to Figure 3B, we write: “A neuron may fulfill more than one pattern and thus may appear in more than one row.” We will discuss this point in the main text as well.

(3) Discussion:

a) The discussion of chemical specificity in urine focuses on volatiles and MUPs (citation #47), but many important molecules for the VNS are small, nonvolatile ligands. For such molecules, the corresponding study is Fu et al 2015.

We fully agree. We will expand our discussion and refer to Fu et al.

b) "Following our line of reasoning, this scarcity may represent an optimal allocation of resources to separate dominant from naïve males": 1 unit out of 215 is roughly consistent with a single receptor. Surely little would be lost if there could be more computational capacity devoted to this important axis than that? It seems more likely that dominance is computed from multiple neuronal types with mixed encoding.

We agree, and we are not claiming that dominance, or any other feature, is derived using dedicated feature-selective neurons. Our discussion of resource allocation is inevitably speculative. Our main point in this context is that a lack of overrepresentation does not imply that a feature is unimportant. We will revise our discussion to clarify our view of this issue.

(4) Methods:

a) Male status, "were unambiguous in most cases": is it possible to put numerical estimates on this? 55% and 99% are both "most," yet they differ substantially in interpretive uncertainty.

This sentence is misleading and irrelevant. Ambiguous cases were not considered dominant for the purpose of urine collection. We classified mice as dominant only if they “won” the tube test and exhibited dominant behavior during the subsequent observation period in the cage. We will correct the wording in the revised manuscript.

b) Surgical procedures and electrode positioning: important details of probes are missing (electrode recording area, spacing, etc).

True. We will add these details.

c) Stimulus presentation procedure: Are stimuli manually pipetted or delivered by apparatus with precise timing?

They are delivered manually. We will clarify this as well.

d) Data analysis, "we applied more permissive criteria involving response magnitude": it's not clear whether this is what's spelled out in the next paragraph, or whether that's left unspecified. In either case, the next paragraph appears to be about establishing a noise floor on pattern membership, not a "permissive criterion."

True, the next paragraph is not the explanation for the more permissive criteria. The more permissive criteria involving response magnitude are actually those described in Figure 3A and 3B. The sentence that was quoted above merely states that before applying those criteria, we had also searched for patterns defined by binary designation of neurons as responsive, or not responsive, to each of the stimuli (this is directly related to the next comment below). Using those binary definitions, we obtained a very small number of neurons for each pattern and thus decided to apply the approach actually used and described in the manuscript.

e) Data analysis, method for assessing significance: there's a lot to like about the use of pooling to estimate the baseline and the use of an ANOVA-like test to assess unit responsiveness.

But:

i) for a specific stimulus, at 4 trials (the minimum specified in "Stimulus presentation procedure") kruskalwallis is questionable. They state that most trials use 5, however, and that should be okay.

The number of cases with 4 trials is truly a minority, and we will provide the exact numbers in our revision.

ii) the methods statement suggests they are running kruskalwallis individually for each neuron/stimulus, rather than once per neuron across all stimuli. With 11 stimuli, there is a substantial chance of a false-positive if they used p < 0.05 to assess significance. (The actual threshold was unstated.) Were there any multiple comparison corrections performed? Or did they run kruskalwallis on the neuron, and then if significant assess individual stimuli? (Which is a form of multiple-comparisons correction.)

First, we indeed failed to mention that our significance criterion was p < 0.05. We will correct that in our revision. We did not apply any multiple-comparison corrections. We consider each neuron-stimulus pair as an independent entity, and we are aware that this leads to a higher false-positive rate. On the other hand, applying such corrections would be problematic, because we do not always use the same number of stimuli in different studies, so corrected thresholds would lead to different response criteria across studies. Notably, most, if not all, of our conclusions involve comparisons across conditions, and for this purpose we think that our procedure is valid. We do not attach any special meaning to the significance threshold, but rather think of it as a basic criterion that allows us to exclude non-responsive neurons and to compare frequencies of neurons that fulfill this criterion.
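For scale, the false-positive concern raised in point (ii) can be worked out with a short calculation. This sketch assumes the 11 per-stimulus tests are independent and that the neuron is truly non-responsive to every stimulus. Neither assumption is stated in the manuscript, and the function names are illustrative.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests

alpha = 0.05
n_stimuli = 11

fwer = familywise_error_rate(alpha, n_stimuli)
per_test = bonferroni_threshold(alpha, n_stimuli)

print(f"P(at least one false positive over {n_stimuli} tests) = {fwer:.3f}")
print(f"Bonferroni per-test threshold = {per_test:.4f}")
```

Under these assumptions, a non-responsive neuron has roughly a 43% chance of passing the p < 0.05 criterion for at least one of the 11 stimuli. As the response above notes, a correction that depends on the number of stimuli would make the response criterion differ between studies with different stimulus sets.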
