## Peer review process

**Revised:** This Reviewed Preprint has been revised by the authors in response to the previous round of peer review; the eLife assessment and the public reviews have been updated where necessary by the editors and peer reviewers.

## Editors

- Reviewing Editor: Bernhard Schmid, University of Zurich, Zurich, Switzerland
- Senior Editor: Meredith Schuman, University of Zurich, Zürich, Switzerland

**Reviewer #1 (Public Review):**

Shoemaker and Grilli analyze publicly available sequencing data to quantify how the microbial diversity of ecosystems changes with the taxonomic scale considered (e.g., diversity of genera vs diversity of families). This study builds directly on Grilli's 2020 paper, which used the same data to show that, for many different microbial species, the distribution of a species' abundance across sampling sites belongs to a simple one-parameter family of gamma distributions. In this work, they show that the gamma distribution also describes the distribution of abundances of higher taxonomic levels. The distribution now requires two parameters, but the second parameter can be approximately derived by treating the distributions of lower-level taxonomic units as being independent. The difference between the species-level result and the result at higher taxonomic levels suggests that in some sense microbial species are ecologically meaningful units.
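
As an illustration of the independence argument, the sketch below moment-matches a gamma distribution to the summed abundance of several independent gamma-distributed lower-level taxa. It is only a minimal sketch under stated assumptions, not the authors' procedure: the function names, the parameterization by mean and squared coefficient of variation, and the example values are all hypothetical.

```python
import random

def gamma_moment_match(means, cvs_squared):
    """Shape and scale of the gamma that matches the mean and variance of a
    sum of independent components (hypothetical helper, assumed here)."""
    total_mean = sum(means)
    # Under independence, the variance of the sum is the sum of the variances.
    total_var = sum(cv2 * m * m for m, cv2 in zip(means, cvs_squared))
    shape = total_mean ** 2 / total_var
    scale = total_var / total_mean
    return shape, scale

def simulate_sum(means, cvs_squared, n_sites, rng):
    """Simulate the coarse-grained abundance at n_sites sites by summing
    independent gamma draws, one per lower-level taxon."""
    totals = []
    for _ in range(n_sites):
        total = 0.0
        for m, cv2 in zip(means, cvs_squared):
            # A gamma with shape k and scale theta has mean k*theta and squared CV 1/k.
            k = 1.0 / cv2
            theta = m * cv2
            total += rng.gammavariate(k, theta)
        totals.append(total)
    return totals

# Illustrative (made-up) values: three taxa sharing one squared CV.
means = [1.0, 2.0, 0.5]
cvs_squared = [0.5, 0.5, 0.5]
shape, scale = gamma_moment_match(means, cvs_squared)
totals = simulate_sum(means, cvs_squared, 20000, random.Random(0))
print("matched shape, scale:", shape, scale)
print("simulated mean:", sum(totals) / len(totals))
```

Comparing the simulated totals with the matched gamma gives a quick check of how well the independence approximation holds for a given set of component parameters.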

While the higher-level taxon abundance distributions can be well-approximated assuming independence of the constituent species, this approach substantially underestimates variation in community richness and diversity among sampling sites. Much of this extra variability appears to be driven by variability in sample size across sites. It is not clear to me how much this variation in sample size is itself due to variation in sampling effort versus variation in overall microbial densities. This variation in sample size also produces correlations between taxon richness at lower and higher taxonomic levels. For instance, sites with large samples are likely to have both many species within a genus and many genera. The authors also consider taxon diversity (Shannon index, i.e. entropy), which is constructed from frequencies and is therefore less sensitive to sample size. In this case, correlations between diversity across taxonomic scales instead appear to depend on the idiosyncratic correlations among species abundances.
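
The contrast between richness and Shannon diversity can be illustrated with a toy resampling experiment. The sketch below draws reads from one fixed, made-up community at several sampling depths and reports both statistics; the community (200 taxa with weights proportional to 1/rank), the depths, and the function names are assumptions chosen for illustration, not values from the paper.

```python
import math
import random
from collections import Counter

def sample_counts(weights, depth, rng):
    """Draw `depth` reads from taxa with the given relative weights."""
    taxa = range(len(weights))
    return Counter(rng.choices(taxa, weights=weights, k=depth))

def richness(counts):
    """Number of taxa observed at least once."""
    return sum(1 for c in counts.values() if c > 0)

def shannon(counts):
    """Plug-in Shannon index (natural log) computed from observed frequencies."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c > 0)

rng = random.Random(1)
# Hypothetical community: 200 taxa, weight proportional to 1/rank.
weights = [1.0 / (rank + 1) for rank in range(200)]
for depth in (100, 1000, 10000):
    counts = sample_counts(weights, depth, rng)
    print(depth, richness(counts), round(shannon(counts), 3))
```

In this toy setting the observed richness keeps growing as rare taxa are picked up at larger depths, while the Shannon index changes proportionally much less, because it is dominated by the frequencies of the abundant taxa.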

This paper's results are presented in a fairly terse manner, even when they are describing summary statistics that require a lot of thought to interpret. I don't think it would make sense to try to understand it without having first worked through the 2020 paper. But everyone interested in a general understanding of microbial ecology should read the 2020 paper, and once one has done that, this paper is worth reading as well simply for seeing how the major pattern in that paper shifts as one moves up in taxonomic scale.

**Reviewer #3 (Public Review):**

Summary

In this research advance, the authors purport to show that the unified neutral theory of biodiversity (UNTB) is not a suitable null model for exploring the relationship between macroecological quantities, and additionally that the stochastic logistic growth model (SLM) is a viable replacement. They do this by citing other studies where UNTB was unable to capture individual macroecological quantities, and then demonstrating the SLM's strength at predicting the same diversity metrics. They extend this analysis to show the SLM's modeling capability at multiple scales of coarse-graining, in addition to its failures at predicting these metrics' variances. Finally, the authors conduct an analysis similar to that of Madi et al. (2020) by investigating the relationship between diversity measures within a group and across coarse-grained groups (e.g., diversity of genera within one family compared to the diversity of families). The authors show that choosing the SLM as a null model reveals some previously reported relationships to be no longer "novel", in the sense that the patterns can be adequately captured by the null model. The authors also show that relationships not captured by the null model can be recovered by adding correlations, suggesting interactions are the driving force behind them.

Strengths

1. The authors make a strong argument that UNTB is not a good null model of macroecological observables and especially of the relationships between them. They convincingly argue that the SLM is a better null model, since the gamma distribution it predicts is a better description of the empirical Abundance Fluctuation Distributions (AFDs).

2. The authors show that the gamma distribution predicted by the SLM is a good fit for the AFDs at many different scales of coarse-graining, not just the OTU level as was previously demonstrated. They also show that the same distribution predicts the mean diversity and richness at all scales of coarse-graining.

3. The authors convincingly demonstrate how the SLM can be used to test the relevance of interactions to macroecological relationships.

Weaknesses

This reviewer's concerns were convincingly addressed by the revisions.

Overall Impact

The authors present a convincing argument for the use of SLM as a better non-interacting null model for macroecological quantities and relationships.