Author response:
The following is the authors’ response to the original reviews
Public Reviews:
Reviewer #1 (Public Review):
Summary
This paper summarises responses from a survey completed by around 5,000 academics on their manuscript submission behaviours. The authors find several interesting stylised facts, including (but not limited to):
- Women are less likely to submit their papers to highly influential journals (*e.g.*, Nature, Science and PNAS).
- Women are more likely to cite the demands of co-authors as a reason why they didn't submit to highly influential journals.
- Women are also more likely to say that they were advised not to submit to highly influential journals.
Recommendation
This paper highlights an important point, namely that the submission behaviours of men and women scientists may not be the same (either due to preferences that vary by gender, selection effects that arise earlier in scientists' careers, or social factors that affect men and women differently and also influence submission patterns). As a result, simply observing gender differences in acceptance rates---or a lack thereof---should not be automatically interpreted as evidence for or against discrimination (broadly defined) in the peer review process. I do, however, make a few suggestions below that the authors may (or may not) wish to address.
We thank the reviewer for this comment and for the following suggestions, which we take into account in our revision of the manuscript.
Major comments
What do you mean by bias?
In the second paragraph of the introduction, it is claimed that "if no biases were present in the case of peer review, then 'we should expect the rate with which members of less powerful social groups enjoy successful peer review outcomes to be proportionate to their representation in submission rates." There are a couple of issues with this statement.
- First, the authors are implicitly making a normative assumption that manuscript submission and acceptance rates *should* be equalised across groups. This may very well be the case, but there can also be important reasons why not -- e.g., if men are more likely to submit their less ground-breaking work, then one might reasonably expect that they experience higher rejection rates compared to women, conditional on submission.
We do make that normative assumption: unless we believe that men’s papers are intrinsically better than women’s papers, the acceptance rate should be the same. But the referee is right: we have no way of controlling for the intrinsic quality of the work of men and women. That said, our manuscript does not show that there is a different acceptance rate for men and women; it shows that women are less likely to submit papers to the most influential journals, controlling for each respondent's most cited paper in an attempt to account for the intrinsic quality of the manuscripts.
- Second, I assume by "bias", the authors are taking a broad definition, i.e., they are not only including factors that specifically relate to gender but also factors that are themselves independent of gender but nevertheless disproportionately are associated with one gender or another (e.g., perhaps women are more likely to write on certain topics and those topics are rated more poorly by (more prevalent) male referees; alternatively, referees may be more likely to accept articles by authors they've met before, most referees are men and men are more likely to have met a given author if he's male instead of female). If that is the case, I would define more clearly what you mean by bias. (And if that isn't the case, then I would encourage the authors to consider a broader definition of "bias"!)
Yes, the referee is right that we are taking a broad definition of bias. We provide a definition of bias on page 3, line 92. This definition is focused on differential evaluation that leads to differential outcomes. We also hedge our discussion (e.g., page 3, line 104) to acknowledge that observed disparities may only be an indicator of potential bias, as many other things could explain a disparity. In short, disparities are a necessary but insufficient indicator of bias. We add a line in the introduction to reinforce this. The only other reference to the term bias comes on page 10, line 276, where we add a reference to Lee to contextualize.
Identifying policy interventions is not a major contribution of this paper
In my opinion, the survey evidence reported here isn't really strong enough to support definitive policy interventions to address the issue and, indeed, providing policy advice is not a major -- or even minor -- contribution of your paper, so I would not mention policy interventions in the abstract. (Basically, I would hope that someone interested in policy interventions would consult another paper that much more thoughtfully and comprehensively discusses the costs and benefits of various interventions!)
We thank the referee for this comment. While we agree that our results do not lead to definitive policy interventions, we believe that our findings point to a phenomenon that should be addressed through policy interventions. Given that some interventions are proposed in our conclusion, we feel that stating this in the abstract is appropriate.
Minor comments
- What is the rationale for conditioning on academic rank and does this have explanatory power on its own---i.e., does it at least superficially potentially explain part of the gender gap in intention to submit?
The referee is right: academic rank was added to control for the career age of researchers, with the assumption that this variable would influence submission behavior. However, the rank information we collected reflects the time at which each respondent took the survey, which could differ from the rank they held at the time of the submission behaviors described in the survey. That is why we did not consider rank an independent variable of interest. We do agree with the reviewer, however, that it could be related to submission behaviors in some cases. Our initial analysis shows that academic rank is not a significant predictor of whether researchers submitted to SNP, but it does contribute significantly to the SNP acceptance rates and desk rejection rates of individuals in the Medical Sciences.
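For concreteness, a minimal sketch of this kind of check is shown below in Python (our actual analyses were run in Stata); all column names and the file name are hypothetical placeholders, not our variable names.

```python
# A minimal sketch, not the authors' Stata pipeline: checking whether
# academic rank predicts ever submitting to Science/Nature/PNAS.
# Column names (submitted_snp, rank, gender, discipline) and the file
# name are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("survey_responses.csv")  # hypothetical de-identified file
model = smf.logit("submitted_snp ~ C(rank) + gender + C(discipline)", data=df)
result = model.fit()
print(result.summary())  # inspect coefficients and p-values on the rank terms
```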
Reviewer #2 (Public Review):
Summary:
In this manuscript, Basson et al. study the representation of women in "high-impact" journals through the lens of gendered submission behavior. This work is clear and thorough, and it provides new insights into gender disparities in submissions, such as that women were more likely to avoid submitting to one of these journals based on advice from a colleague/mentor. The results have broad implications for all academic communities and may help toward reducing gender disparities in "high-impact" journal submissions. I enjoyed reading this article, and I have several recommendations regarding the methodology/reporting details that could help to enhance this work.
We thank the referee for their comments.
Strengths:
This is an important area of investigation that is often overlooked in the study of gender bias in publishing. Several strengths of the paper include:
(1) A comprehensive survey of thousands of academics. It is admirable that the authors retroactively reached out to other researchers and collected an extensive amount of data.
(2) Overall, the modeling procedures appear thorough, and many different questions are modeled.
(3) There are interesting new results, as well as a thoughtful discussion. This work will likely spark further investigation into gender bias in submission behavior, particularly regarding the possible gendered effect of mentorship on article submission.
Thank you for those comments.
Weaknesses:
(1) The GitHub page should be further clarified. A detailed description of how to run the analysis and the location of the data would be helpful. For example, although the paper says that "Aggregated and de-identified data by gender, discipline, and rank for analyses are available on GitHub," I was unable to find such data.
We added the link to the GitHub page, as well as more details on how to run the statistical analysis. Unfortunately, our IRB approval does not allow for the sharing of the raw data.
(2) Why is desk rejection rate defined as "the number of manuscripts that did not go out for peer review divided by the number of manuscripts rejected for each survey respondent"? For example, in your Grossman 2020 reference, it appears that manuscripts are categorized as "reviewed" or "desk-rejected" (Grossman Figure 2). If there are gender differences in the denominator, then this could affect the results.
We thank the referee for pointing this out. What the referee proposes is in fact how we calculated the desk rejection rate; the definition given in the manuscript was a mistake in the write-up. We have corrected the manuscript.
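For clarity, the corrected definition (matching the referee's reading of Grossman 2020, with all submitted manuscripts in the denominator) is:

$$
\text{desk rejection rate} = \frac{\text{manuscripts desk-rejected (not sent out for review)}}{\text{manuscripts submitted}},
$$

rather than dividing by the number of rejected manuscripts, as the original wording erroneously stated.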
(3) Have you considered correcting for multiple comparisons? Alternatively, you could consider reporting P-values and effect sizes in the main text. Otherwise, sometimes the conclusions can be misleading. For example, in Figure 3 (and Table S28), the effect is described as significant in Social Sciences (p=0.04) but not in Medical Sciences (p=0.07).
We highly appreciate the suggestion. We have added odds ratios and p-values to the main manuscript.
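For readers interested in the multiple-comparisons route the referee mentions, the sketch below illustrates one standard option, Benjamini-Hochberg false discovery rate control; the p-values shown are hypothetical placeholders, and our own analyses were run in Stata rather than Python.

```python
# Illustrative sketch only: adjusting a family of per-discipline p-values
# with the Benjamini-Hochberg procedure. The values below are hypothetical.
from statsmodels.stats.multitest import multipletests

pvals = [0.04, 0.07, 0.01, 0.30]  # e.g., one test per discipline (hypothetical)
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
for p, pa, r in zip(pvals, p_adj, reject):
    print(f"raw p={p:.3f}  BH-adjusted p={pa:.3f}  significant={r}")
```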
(4) More detail about the models could be included. It may be helpful to include this in each table caption so that it is clear what all the terms of the model were. For instance, I was wondering if journal or discipline are included in the models.
We appreciate the suggestion. We’ve added model details to the figure and table captions in the manuscript and the supplemental materials.
Reviewer #3 (Public Review):
Summary:
This is a strong manuscript by Basson and colleagues which contributes to our understanding of gender disparities in scientific publishing. The authors examine attitudes and behaviors related to manuscript submission in influential journals (specifically, Science, Nature and PNAS). The authors rightly note that much attention has been paid to gender disparities in work that is already published, but this fails to capture the unseen hurdles that occur prior to publication (which include decisions about where to publish, desk rejections, revisions and resubmissions, etc.). They conducted a survey study to address some of these components and their results are interesting:
They find that women are less likely to submit their manuscript to Science, Nature or PNAS. While both men and women feel their work would be better suited for more specialized journals, women were more likely to think their work was 'less novel or groundbreaking.'
A smaller proportion of respondents indicated that they were actively discouraged from submitting their manuscripts to these journals. In this instance, women were more likely to receive this advice than men.
Lastly, the authors also looked at self-reported acceptance and rejection rates and found that there were no gender differences in acceptance or rejection rates.
These data are helpful in developing strategies to mitigate gender disparities in influential journals.
We thank the referee for their comments.
Comments:
The methods the authors used are appropriate for this study. The low response rate is common for this type of recruitment strategy. The authors provide a thoughtful interpretation of their data in the Discussion.
We thank the referee for their comments.
Reviewer #4 (Public Review):
This manuscript covers an important topic of gender biases in the authorship of scientific publications. Specifically, it investigates potential mechanisms behind these biases, using a solid approach, based on a survey of researchers.
Main strengths
The topic of the MS is very relevant given that across sciences/academia representation of genders is uneven, and identified as concerning. To change this, we need to have evidence on what mechanisms cause this pattern. Given that promotion and merit in academia are still largely based on the number of publications and impact factor, one part of the gap likely originates from differences in publication rates of women compared to men.
Women are underrepresented compared to men in journals with high impact factors. While previous work has detected this gap, as well as some potential mechanisms, the current MS provides strong evidence, based on a survey of close to 5000 authors, that this gap might be due to lower submission rates of women compared to men, rather than rejection rates. The data analysis is appropriate to address the main research aims. The results interestingly show that there is no gender bias in rejection rates (desk rejection or overall) in three high-impact journals (Science, Nature, PNAS). However, submission rates are lower for women compared to men, indicating that gender biases might act through this pathway. The survey also showed that women are more likely to rate their work as not groundbreaking, and to be advised not to submit to prestigious journals.
With these results, the MS has the potential to inform actions to reduce gender bias in publishing, and actions to include other forms of measuring scientific impact and merit.
We thank the referee for their comments.
Main weakness and suggestions for improvement
(1) The main message/further actions: I feel that the MS fails to sufficiently emphasise the need for a different evaluation system for researchers (and their research). While we might act to support women to submit more to high-impact journals, we could also (and several initiatives do this) consider a broader spectrum of merits (e.g. see https://coara.eu/). Thus, I suggest more space to discuss this route in the Discussion. Also, I would suggest replacing the terms that imply that prestigious journals have a better quality of research or the highest scientific impact (line 40: journals of the highest scientific impact) with terms that state what we actually know (i.e. that they have the highest impact factor). I think this could broaden the impact of the MS.
We agree with the referee. We changed the wording on impact, and added a few lines on this in the discussion.
(2) Methods: while methods are all sound, in places it is difficult to understand what has been done or measured. For example, only quite late (as far as I can find, it's in the supplement) we learn the type of authorship considered in the MS is the corresponding authorship. This information should be clear from the very start (including the Abstract).
We performed the suggested edits.
Second, I am unclear about the question on the perceived quality of research work. Was this quality defined for researchers, as quality can mean different things (e.g. how robust their set-up was, how important their research question was)? If researchers have different definitions of what quality means, this can cause additional heterogeneity in responses. Given that the survey cannot be repeated now, maybe this can be discussed as a limitation.
We agree that quality can mean different things to different researchers; it probably varies by discipline, but also by gender. But that was precisely the point: whether men and women considered their “best work” to be published in a higher-impact venue. While there may be heterogeneity in those perceptions, the fact that (1) men and women rate their research at the same level and (2) we control for disciplinary differences should mitigate some of that.
I was surprised to see that discipline was considered as a moderator for some of the analyses but not for the main analysis on the acceptance and rejection rates.
We appreciate the attention to detail. In our analysis of acceptance and rejection rates, we conducted separate regression analyses for each discipline to capture any field-specific patterns that might otherwise be obscured.
We added more details on this to clarify.
I was also surprised not to see publication charges among the reasons asked about for not submitting to the selected journals. Low- and middle-income countries often have more women in science but are also less likely to support high publication charges.
That is a good point. However, both Science and Nature have subscription options, which do not require any APCs.
Finally, academic rank was asked of respondents but was not taken as a moderator.
Academic rank is included in the regression as a control variable (Figure 1).
Reviewer #2 (Recommendations For The Authors):
In addition to the points in the "Weaknesses" section of my Public Review above, I have several suggestions to improve this work.
(1) Can you please indicate what the error bars mean in each plot? I am assuming that they are 95% confidence intervals.
We appreciate the attention to detail. Yes, they are 95% confidence intervals. We’ve clarified this in the captions of the corresponding figures.
(2) Can you provide a more detailed explanation for why the 7 journals were separated? I see that on page 3 of the supporting information you write that "Due to limited responses, analysis per journal was not always viable. The results pertaining to the journals were aggregated, with new categories based on the shared similarities in disciplinary foci of the journals and their prestige." Specifically, why did you divide the data into (somewhat arbitrary) categories as opposed to using all the data and including a journal term in your model?
The survey covered 7 journals:
• Science, Nature, and PNAS (S.N.P.)
• Nature Communications and Science Advances (NC.SA.)
• NEJM and Cell (NEJM.C.)
We believe that the first three are a class of their own: they cover all fields (while NEJM and Cell are limited to the (bio)medical sciences) and have much higher symbolic capital than Nature Communications and Science Advances (which receive cascaded papers from Nature and Science, respectively). We believe that the factors leading to submission to S.N.P. differ substantially from those leading to submission to the other groups of journals, which is why we separated the analysis in that manner.
(3) You included random effects for linear regression but not for logistic regression. Please justify this choice or include additional logistic regression models with random effects.
We used mixed-effects models for the linear regressions (where the number of submissions, acceptance rate, or rejection rate is the dependent variable). As mentioned in response to an earlier comment, we tested rank as a control variable and found it had a potential impact on the variables analyzed with linear regressions in some disciplines. We therefore introduced it as a random effect in all the linear regression models; a minimal sketch of this specification appears below.
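The following is an illustrative Python sketch of that specification, not our actual Stata pipeline; column names and the file name are hypothetical, and in our analyses each discipline was modeled separately.

```python
# A minimal sketch: a linear mixed model for self-reported acceptance rate
# with gender as a fixed effect and a random intercept for academic rank.
# Column names (acceptance_rate, gender, rank) and the file name are
# hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("survey_responses.csv")  # hypothetical de-identified file
model = smf.mixedlm("acceptance_rate ~ gender", data=df, groups=df["rank"])
result = model.fit()
print(result.summary())
```

Treating rank as a grouping factor lets its baseline effect vary across ranks without estimating it as a fixed coefficient of interest.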
Reviewer #3 (Recommendations For The Authors):
The limitations of this work are currently described in the Supplement. It may be helpful to bring several of these items into the Discussion so that they can be addressed more prominently.
We added this content to the Discussion.
Reviewer #4 (Recommendations For The Authors):
(1) Line 40: add 'as leading authors of papers published in' before 'journals'.
Done
(2) Explain what the direction of the 'relationship between' on line 62 is.
Added
(3) Lines 101-102: this is a bit unclear. Please provide some more info, including what these studies found.
Added
(4) Is 'sociodemographic' the best term in line 120?
Yes, we believe so.
(5) Results would benefit from a short intro with the info on the number of respondents, also by gender.
Those numbers are present at the end of the intro (and at the end of the methods). We nonetheless added the breakdown by gender.
(6) Line 134: add how many women and men submitted to Science, Nature, and PNAS.
Added. In all disciplines combined, 552 women and 1,583 men ever submitted to these three elite journals. More details can be found in SI Table 9.
(7) Add 'Self-' before 'reported', line 141.
Added
(8) Add sample sizes to Figs 1 and 2
Those are provided in the appendix.
(9) Line 168 - unclear if this is ever or as their first choice
We do not distinguish between the two; the question captures whether they considered it at all.
(10) Add sample size in line 177
Added. 480 women and 1,404 men across all disciplines reported desk rejections by S.N.P. journals.
(11) I would like to see some discussion of the fact that the most highly cited paper will also tend to be a paper that the authors submitted earlier in their careers, given that citations pile up over time.
Those papers are actually quite evenly distributed over time. We modified the supplementary materials accordingly.
(12) Data availability: be clear that the supporting info contains only summary data. Also, while the Data Availability Statement refers to de-identified data on GitHub, the GitHub page only contains the code, and the note that 'The STAT code used for our analyses is shared. We are unable to share the survey response details publicly per IRB protocols.' Why were de-identified data not shared? This is extremely important to allow for the reproducibility of the MS results. I would also suggest sharing data in a trusted repository (e.g. Dryad, ZENODO...) rather than on GitHub, as per current recommendations on best practices for data sharing.
Thank you for your careful reading and for highlighting the importance of clear data availability. We will revise our Data Availability Statement to explicitly state that the supporting information contains only summary data and that the complete analysis code is available on GitHub.
We understand the importance of sharing de-identified data for reproducibility. However, our IRB strictly prohibits the sharing of any individual-level data, including de-identified files, to protect participant confidentiality. Consequently, the summary data included in the supporting information, together with the provided code, are intended to facilitate the verification of our core findings. Our previous statement regarding “de-identified” data sharing was inaccurate and has thus been removed. We apologize for the confusion.
In light of your suggestion, we are also exploring depositing the summary data and code in a trusted repository (e.g., Dryad or Zenodo) to further align with current best practices for data sharing.