Abstract
Background:
Which virological factors mediate overdispersion in the transmissibility of emerging viruses remains a longstanding question in infectious disease epidemiology.
Methods:
Here, we use systematic review to develop a comprehensive dataset of respiratory viral loads (rVLs) of SARS-CoV-2, SARS-CoV-1 and influenza A(H1N1)pdm09. We then comparatively meta-analyze the data and model individual infectiousness by shedding viable virus via respiratory droplets and aerosols.
Results:
The analyses indicate heterogeneity in rVL as an intrinsic virological factor facilitating greater overdispersion for SARS-CoV-2 in the COVID-19 pandemic than A(H1N1)pdm09 in the 2009 influenza pandemic. For COVID-19, case heterogeneity remains broad throughout the infectious period, including for pediatric and asymptomatic infections. Hence, many COVID-19 cases inherently present minimal transmission risk, whereas highly infectious individuals shed tens to thousands of SARS-CoV-2 virions/min via droplets and aerosols while breathing, talking and singing. Coughing increases the contagiousness, especially in close contact, of symptomatic cases relative to asymptomatic ones. Infectiousness tends to be elevated between 1 and 5 days post-symptom onset.
Conclusions:
Intrinsic case variation in rVL facilitates overdispersion in the transmissibility of emerging respiratory viruses. Our findings present considerations for disease control in the COVID-19 pandemic as well as future outbreaks of novel viruses.
Funding:
Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant program, NSERC Senior Industrial Research Chair program and the Toronto COVID-19 Action Fund.
eLife digest
To understand how viruses spread, scientists look at two things. One is how many other people, on average, each infected person spreads the virus to. The other is how much variability there is in the number of people each person with the virus infects. Some viruses, like the 2009 influenza H1N1, a new strain of influenza that caused a pandemic beginning in 2009, spread fairly uniformly, with many people with the virus infecting around two other people. Other viruses, like SARS-CoV-2, the one that causes COVID-19, are more variable. About 10 to 20% of people with COVID-19 cause 80% of subsequent infections, which may lead to so-called superspreading events, while 60–75% of people with COVID-19 infect no one else.
Learning more about these differences can help public health officials create better ways to curb the spread of the virus. Chen et al. show that differences in the concentration of virus particles in the respiratory tract may help to explain why superspreaders play such a big role in transmitting SARS-CoV-2, but not the 2009 influenza H1N1 virus. Chen et al. reviewed and extracted data from studies that measured how much virus is present in people infected with either SARS-CoV-2, a similar virus called SARS-CoV-1 that caused the SARS outbreak in 2003, or with 2009 influenza H1N1.
Chen et al. found that as the variability in the concentration of the virus in the airways increased, so did the variability in the number of people each person with the virus infects. Chen et al. further used mathematical models to estimate how many virus particles individuals with each infection would expel via droplets or aerosols, based on the differences in virus concentrations from their analyses. The models showed that most people with COVID-19 infect no one because they expel little, if any, infectious SARS-CoV-2 when they talk, breathe, sing or cough. Highly infectious individuals, on the other hand, have high concentrations of the virus in their airways, particularly in the first few days after developing symptoms, and can expel tens to thousands of infectious virus particles per minute. By contrast, a greater proportion of people with 2009 influenza H1N1 were potentially infectious but tended to expel relatively little infectious virus when they talk, sing, breathe or cough.
These results help explain why superspreaders play such a key role in the ongoing pandemic. This information suggests that to stop this virus from spreading it is important to limit crowd sizes, shorten the duration of visits or gatherings, maintain social distancing, talk in low volumes around others, wear masks, and hold gatherings in well-ventilated settings. In addition, contact tracing can prioritize the contacts of people with high concentrations of virus in their airways.
Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has spread globally, causing the coronavirus disease 2019 (COVID-19) pandemic with more than 129.2 million infections and 2.8 million deaths (as of 1 April 2021) (Dong et al., 2020). For respiratory virus transmission, airway epithelial cells shed virions to the extracellular fluid before atomization (from breathing, talking, singing, coughing and aerosol-generating procedures) partitions them into a polydisperse mixture of particles that are expelled to the ambient environment. Aerosols (≤100 μm) can be inhaled nasally, whereas droplets (>100 μm) tend to be excluded (Prather et al., 2020; Roy and Milton, 2004). For direct transmission, droplets must be sprayed ballistically onto susceptible tissue (Liu et al., 2017a). Hence, droplets predominantly deposit on nearby surfaces, potentiating indirect transmission. Aerosols can be further categorized based on typical travel characteristics: short-range aerosols (50–100 μm) tend to settle within 2 m; long-range ones (10–50 μm) often travel beyond 2 m based on emission force; and buoyant aerosols (≤10 μm) remain suspended and travel based on airflow profiles for minutes to many hours (Liu et al., 2017a; Wei and Li, 2015). Although proximity has been associated with infection risk for COVID-19 (Chu et al., 2020), studies have also suggested that long-range airborne transmission occurs conditionally (Hamner et al., 2020; Lu et al., 2020a; Park et al., 2020).
While the basic reproductive number has been estimated to be 2.0–3.6 (Hao et al., 2020; Li et al., 2020a), transmissibility of SARS-CoV-2 is highly overdispersed (dispersion parameter k, 0.10–0.58), with numerous instances of superspreading (Hamner et al., 2020; Lu et al., 2020a; Park et al., 2020) and few cases (10–20%) causing many secondary infections (80%) (Bi et al., 2020; Endo et al., 2020; Laxminarayan et al., 2020). Similarly, few cases drive the transmission of SARS-CoV-1 (k, 0.16–0.17) (Lloyd-Smith et al., 2005), whereas influenza A(H1N1)pdm09 transmits more homogeneously (k, 7.4–14.4) (Brugger and Althaus, 2020; Roberts and Nishiura, 2011), despite both viruses spreading by contact, droplets and aerosols (Cowling et al., 2013; Yu et al., 2004). Although understanding the determinants of viral overdispersion is crucial towards characterizing transmissibility and developing effective strategies to limit infection (Lee et al., 2020), mechanistic associations for k remain unclear. As an empirical estimate, k depends on myriad extrinsic (behavioral, environmental and intervention) and host factors. Nonetheless, k remains similar across distinct outbreaks for a virus (Lloyd-Smith et al., 2005), suggesting that intrinsic virological factors mediate virus overdispersion.
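To make the role of k concrete, the following minimal sketch assumes that the number of secondary infections per case follows a negative binomial distribution with mean R0 and dispersion k, a common parameterization for overdispersed transmission. The value R0 = 2.5 is an illustrative assumption within the range cited above, and the function names are hypothetical.

```python
import math

def nb_pmf(n, r0, k):
    """P(a case causes exactly n secondary infections); negative binomial, mean r0, dispersion k."""
    log_p = (math.lgamma(n + k) - math.lgamma(k) - math.lgamma(n + 1)
             + k * math.log(k / (k + r0)) + n * math.log(r0 / (k + r0)))
    return math.exp(log_p)

def share_from_top_cases(r0, k, top=0.20, n_max=2000):
    """Share of all secondary infections caused by the most infectious `top` fraction of cases."""
    covered, infections = 0.0, 0.0
    for n in range(n_max, 0, -1):       # walk from the largest offspring count downwards
        p = nb_pmf(n, r0, k)
        take = min(p, top - covered)    # count cases only until the `top` fraction is filled
        infections += n * take
        covered += take
        if covered >= top:
            break
    return infections / r0

R0 = 2.5  # illustrative assumption, not an estimate from this study
for label, k in [("k = 0.10", 0.10), ("k = 7.4", 7.4)]:
    p_zero = nb_pmf(0, R0, k)           # probability that a case infects no one
    print(label, round(p_zero, 3), round(share_from_top_cases(R0, k), 3))
```

Under these assumptions, roughly 72% of cases cause no secondary infections when k = 0.10, compared with roughly 12% when k = 7.4, which illustrates why a small k implies that a minority of cases drives most transmission.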
Here, we investigated how intrinsic case variation in respiratory viral loads (rVLs) facilitates overdispersion in SARS-CoV-2 transmissibility. By systematic review, we developed a comprehensive dataset of rVLs from cases of COVID-19, SARS and A(H1N1)pdm09. Using comparative meta-analyses, we found that heterogeneity in rVL was associated with overdispersion among these emerging infections. To assess potential sources of case heterogeneity, we analyzed SARS-CoV-2 rVLs across age and symptomatology subgroups as well as disease course. To interpret the influence of heterogeneity in rVL on individual infectiousness, we modeled likelihoods of shedding viable virus via respiratory droplets and aerosols.
Results
Systematic review
We conducted a systematic review on virus quantitation in respiratory specimens taken during the infectious periods of SARS-CoV-2 (−3 to 10 days from symptom onset [DFSO]) (Arons et al., 2020; He et al., 2020; Wölfel et al., 2020), SARS-CoV-1 (0–20 DFSO) (Pitzer et al., 2007) and A(H1N1)pdm09 (−2 to 9 DFSO) (Ip et al., 2017) (Materials and methods). The systematic search (Figure 1—source data 1, Figure 1—source data 2, Figure 1—source data 3, Figure 1—source data 4, Figure 1—source data 5) identified 4274 results. After screening and full-text review, 64 studies met the inclusion criteria and contributed to the systematic dataset (Figure 1) (N = 9631 total specimens), which included adult (N = 5033) and pediatric (N = 1608) cases from 15 countries and specimen measurements for asymptomatic (N = 2387), presymptomatic (N = 28) and symptomatic (N = 7161) infections. According to a hybrid Joanna Briggs Institute critical appraisal checklist, risk of bias was low for most contributing studies (Appendix 1—table 1).

Figure 1—source data 1
 https://cdn.elifesciences.org/articles/65774/elife65774fig1data1v2.docx

Figure 1—source data 2
 https://cdn.elifesciences.org/articles/65774/elife65774fig1data2v2.docx

Figure 1—source data 3
 https://cdn.elifesciences.org/articles/65774/elife65774fig1data3v2.docx

Figure 1—source data 4
 https://cdn.elifesciences.org/articles/65774/elife65774fig1data4v2.docx

Figure 1—source data 5
 https://cdn.elifesciences.org/articles/65774/elife65774fig1data5v2.docx
Association of overdispersion with heterogeneity in rVL
We hypothesized that individual case variation in rVL facilitates the distinctions in k among COVID-19, SARS and A(H1N1)pdm09. For each study in the systematic dataset, we used specimen measurements (viral RNA concentration in a respiratory specimen) to estimate rVLs (viral RNA concentration in the respiratory tract) (Materials and methods). To investigate the relationship between k and heterogeneity in rVL, we performed a meta-regression using each contributing study (Figure 2—figure supplement 1), which showed a weak, negative association between the two variables (meta-regression slope t-test: p=0.038, Pearson’s r = −0.26).
Using contributing studies with low risk of bias, meta-regression (Figure 2) showed a strong, negative association between k and heterogeneity in rVL for these three viruses (meta-regression slope t-test: p<0.001, Pearson’s r = −0.73). In this case, each unit increase (one log_{10} copies/ml) in the standard deviation (SD) of rVL was associated with a change in log(k) of −1.41 (95% confidence interval [CI]: −1.78 to −1.03), suggesting that broader heterogeneity in rVL facilitates greater overdispersion in the transmissibility of SARS-CoV-2 than of A(H1N1)pdm09. To investigate mechanistic aspects of this association, we conducted a series of analyses on rVL and then modeled the influence of heterogeneity in rVL on individual infectiousness.
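As a rough reading of the regression slope, the sketch below applies the reported value of −1.41 to a difference in SD. It plugs in the overall SDs reported later in the Results for A(H1N1)pdm09 (1.45) and SARS-CoV-2 (2.04), whereas the regression itself was fitted to study-level SDs, so this is illustrative only. The function name is hypothetical, and no assumption is made here about the base of log(k).

```python
def change_in_log_k(sd_from, sd_to, slope=-1.41):
    """Predicted change in log(k) when the SD of rVL moves from sd_from to sd_to."""
    return slope * (sd_to - sd_from)

# Overall SDs (log10 copies/ml): A(H1N1)pdm09 1.45, SARS-CoV-2 2.04.
print(round(change_in_log_k(1.45, 2.04), 2))  # -0.83
```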
Meta-analysis and subgroup analyses of rVL
We first compared rVLs among the emerging infections. We performed a random-effects meta-analysis (Figure 2—figure supplement 2), which approximated the expected rVL when encountering a COVID-19, SARS or A(H1N1)pdm09 case during the infectious period. This showed that the expected rVL of SARS-CoV-2 was comparable to that of SARS-CoV-1 (one-sided Welch’s t-test: p=0.111) but lower than that of A(H1N1)pdm09 (p=0.040).
We also performed random-effects subgroup analyses for COVID-19 (Figure 3), which showed that expected SARS-CoV-2 rVLs were similar between pediatric and adult cases (p=0.476) and comparable between symptomatic/presymptomatic and asymptomatic infections (p=0.090). Since these meta-analyses had significant between-study heterogeneity among the mean estimates (Cochran’s Q test: p<0.001 for each meta-analysis), we conducted risk-of-bias sensitivity analyses; meta-analyses of low-risk-of-bias studies continued to show significant heterogeneity (Figure 3—figure supplements 1–5).
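For readers unfamiliar with random-effects pooling, the following sketch shows one common estimator (DerSimonian–Laird) together with Cochran's Q. This section does not state which estimator was used, so the choice here is an assumption, and the function name and inputs are hypothetical.

```python
import math

def random_effects_pool(means, variances):
    """Pool study means with DerSimonian-Laird weights.

    means: per-study mean rVL; variances: squared standard errors of those means.
    Returns (pooled mean, standard error, between-study variance tau^2, Cochran's Q).
    """
    w = [1.0 / v for v in variances]                       # inverse-variance weights
    fixed = sum(wi * m for wi, m in zip(w, means)) / sum(w)
    q = sum(wi * (m - fixed) ** 2 for wi, m in zip(w, means))
    df = len(means) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0        # truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]         # random-effects weights
    pooled = sum(wi * m for wi, m in zip(w_star, means)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2, q
```

A large Q relative to its degrees of freedom indicates between-study heterogeneity, in which case tau^2 is positive and the pooled estimate weights studies more evenly than a fixed-effect analysis would.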
Distributions of rVL
We next analyzed rVL distributions. For all three viruses, rVLs best conformed to Weibull distributions (Figure 4—figure supplement 1), and we fitted the entirety of individual sample data for each virus in the systematic dataset (Figure 4A, Figure 4—figure supplement 1N). While COVID-19 and SARS cases tended to shed less virus than those with A(H1N1)pdm09 (Figure 2—figure supplement 2), broad heterogeneity in SARS-CoV-2 and SARS-CoV-1 rVLs inverted this relationship for highly infectious individuals (Figure 4A, Figure 4—figure supplement 2A–C). At the 90th case percentile (cp) throughout the infectious period, the estimated rVL was 8.91 (95% CI: 8.83–9.00) log_{10} copies/ml for SARS-CoV-2, whereas it was 8.62 (8.47–8.76) log_{10} copies/ml for A(H1N1)pdm09 (Figure 4—figure supplement 3). The SD of the overall rVL distribution for SARS-CoV-2 was 2.04 log_{10} copies/ml, while it was 1.45 log_{10} copies/ml for A(H1N1)pdm09, showing that heterogeneity in rVL was indeed broader for SARS-CoV-2.
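Case percentiles and SDs such as those above follow directly from a fitted Weibull distribution. The sketch below uses the standard two-parameter Weibull formulas; the fitted shape and scale are not reported in this section, so the parameter values shown are placeholders rather than the study's estimates.

```python
import math

def weibull_quantile(p, shape, scale):
    """rVL (log10 copies/ml) at case percentile p, where 0 < p < 1."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)

def weibull_sd(shape, scale):
    """Standard deviation of a two-parameter Weibull distribution."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    return scale * math.sqrt(g2 - g1 ** 2)

# Placeholder parameters for illustration only (not fitted values from the study).
shape, scale = 3.5, 7.0
print(weibull_quantile(0.90, shape, scale), weibull_sd(shape, scale))
```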
To assess potential sources of heterogeneity in SARS-CoV-2 rVL, we compared rVL distributions among COVID-19 subgroups. In addition to comparable mean estimates (Figure 3), adult, pediatric, symptomatic/presymptomatic and asymptomatic COVID-19 cases showed similar rVL distributions (Figure 4B, C), with SDs of 2.03, 2.06, 2.00 and 2.01 log_{10} copies/ml, respectively. Thus, age and symptomatology minimally influenced case variation in SARS-CoV-2 rVL during the infectious period.
SARS-CoV-2 kinetics during respiratory infection
To analyze the influences of disease course, we delineated individual SARS-CoV-2 rVLs by DFSO and fitted the mean estimates to a mechanistic model for respiratory virus kinetics (Figure 4D and Materials and methods). The outputs indicated that, on average, each productively infected cell in the airway epithelium shed SARS-CoV-2 at 1.33 (95% CI: 0.74–1.93) copies/ml day^{−1} and infected up to 9.25 susceptible cells (Figure 4—figure supplement 4). The turnover rate for infected epithelial cells was 0.71 (0.26–1.15) days^{−1}, while the half-life of SARS-CoV-2 RNA before clearance from the respiratory tract was 0.21 (0.11–2.75) days. By extrapolating the model to an initial rVL of 0 log_{10} copies/ml, the estimated incubation period was 5.38 days, which agrees with epidemiological findings (Li et al., 2020a). Conversely, the expected duration of shedding was 25.1 DFSO. Thus, SARS-CoV-2 rVL increased exponentially after infection, peaked around 1 DFSO along with the proportion of infected epithelial cells (Figure 4—figure supplement 5) and then diminished exponentially.
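The kinetic parameters above can be related through simple conversions. The sketch below assumes first-order (exponential) loss for both viral RNA and infected cells, which is a simplifying assumption on our part; the function names are hypothetical.

```python
import math

def rate_from_half_life(half_life_days):
    """First-order clearance rate (per day) implied by a half-life."""
    return math.log(2.0) / half_life_days

def mean_lifetime(turnover_rate_per_day):
    """Mean lifetime (days) implied by a first-order turnover rate."""
    return 1.0 / turnover_rate_per_day

print(round(rate_from_half_life(0.21), 2))  # RNA clearance rate, per day
print(round(mean_lifetime(0.71), 2))        # mean lifetime of an infected cell, days
```

Under this assumption, the point estimates correspond to a clearance rate of about 3.3 per day and a mean infected-cell lifetime of about 1.4 days.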
To evaluate case heterogeneity across the infectious period, we fitted distributions for each DFSO (Figure 4E), which showed that high SARS-CoV-2 rVLs also increased from the presymptomatic period, peaked at 1 DFSO and then decreased towards the end of the first week of illness. For the 90th cp at 1 DFSO, the rVL was 9.84 (95% CI: 9.17–10.56) log_{10} copies/ml, an order of magnitude greater than the overall 90th cp estimate. High rVLs between 1 and 5 DFSO were elevated above the expected values from the overall rVL distribution (Figure 4—figure supplement 3). At −1 DFSO, the 90th cp rVL was 8.30 (6.88–10.02) log_{10} copies/ml, while it was 7.93 (7.35–8.56) log_{10} copies/ml at 10 DFSO. Moreover, heterogeneity in rVL remained broad across the infectious period, with SDs of 1.83–2.44 log_{10} copies/ml between −1 and 10 DFSO (Figure 4—figure supplement 2H–S).
Likelihood that droplets and aerosols contain virions
Towards analyzing the influence of heterogeneity in rVL on individual infectiousness, we first modeled the likelihood of respiratory particles containing viable SARS-CoV-2. Since rVL is an intensive quantity, the volume fraction of virions is low and viral partitioning coincides with atomization, we used Poisson statistics to model likelihood profiles. To calculate an unbiased estimator of partitioning (the expected number of viable copies per particle), our method multiplied rVL estimates with particle volumes during atomization and an assumed viability proportion of 0.1% in equilibrated particles (Materials and methods).
When expelled by the mean COVID-19 case during the infectious period, respiratory particles showed low likelihoods of carrying viable SARS-CoV-2 (Figure 5—figure supplement 1). Aerosols (equilibrium aerodynamic diameter [d_{a}] ≤ 100 µm) were ≤3.16% (95% CI: 2.61–3.71%) likely to contain a virion. Droplets also had low likelihoods: at d_{a} = 200 µm, they were 22.3% (21.4–23.2%), 3.36% (3.03–3.69%) and 0.34% (0.29–0.39%) likely to contain one, two or three virions, respectively.
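The Poisson partitioning described above can be sketched as follows. The expected number of viable copies per particle is the product of rVL, particle volume at atomization and the viability proportion; the spherical-volume formula, the parameter names and the Poisson mean of 0.30 used in the check are our assumptions. The value 0.30 is inferred here from the ratio of the reported likelihoods for two and one virions, not taken from the study.

```python
import math

def expected_viable_copies(rvl_log10, diameter_at_atomization_um, viability=0.001):
    """Poisson mean: rVL (copies/ml) x particle volume (ml) x viability proportion."""
    volume_ml = math.pi / 6.0 * diameter_at_atomization_um ** 3 * 1e-12  # 1 um^3 = 1e-12 ml
    return 10.0 ** rvl_log10 * volume_ml * viability

def poisson_pmf(n, lam):
    """Probability that a particle contains exactly n viable copies."""
    return math.exp(-lam) * lam ** n / math.factorial(n)

def prob_at_least_one(lam):
    """Probability that a particle contains one or more viable copies."""
    return 1.0 - math.exp(-lam)

# Consistency check: a single Poisson mean near 0.30 (inferred, see lead-in)
# reproduces likelihoods close to the reported 22.3%, 3.36% and 0.34%.
for n in (1, 2, 3):
    print(n, round(100.0 * poisson_pmf(n, 0.301), 2))
```

Note that the particle diameter entering the volume term is the diameter at atomization, which differs from the equilibrium aerodynamic diameter d_{a} used to label particle sizes in the text.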
COVID-19 cases with high rVLs, however, expelled particles with considerably greater likelihoods of carrying viable copies (Figure 5A, B, Figure 5—figure supplement 1D, E). For the 80th cp during the infectious period, aerosols (d_{a} ≤ 100 µm) were ≤87.9% (95% CI: 87.2–88.5%) likely to carry at least one SARS-CoV-2 virion. For the 90th cp, larger aerosols tended to contain multiple virions (Figure 5—figure supplement 1E). At 1 DFSO, these estimates were greatest, and ≤98.8% (98.1–99.4%) of buoyant aerosols (d_{a} ≤ 10 µm) contained at least one viable copy of SARS-CoV-2 for the 98th cp. When expelled by high cps, droplets (d_{a} > 100 µm) tended to contain tens to thousands of SARS-CoV-2 virions (Figure 5B, Figure 5—figure supplement 1E).
Shedding SARS-CoV-2 via respiratory droplets and aerosols
Using the partitioning estimates in conjunction with published profiles of the particles expelled by respiratory activities (Figure 5—figure supplement 2), we next modeled the rates at which talking, singing, breathing and coughing shed viable SARS-CoV-2 across d_{a} (Figure 5C–F). Singing shed virions more rapidly than talking based on the increased emission of aerosols. Voice amplitude, however, had a significant effect on aerosol production, and talking loudly emitted aerosols at similar rates to singing (Figure 5—figure supplement 2E). Based on the generation of larger aerosols and droplets, talking and singing shed virions significantly more rapidly than breathing (Figure 5C–E). Each cough shed similar quantities of virions as in a minute of talking (Figure 5C, F).
Each of these respiratory activities expelled aerosols at greater rates than droplets, but particle size correlated with the likelihood of containing virions according to our model. Talking, singing and coughing expelled virions at comparable proportions via droplets (55.6–59.4%) and aerosols (40.6–44.4%), whereas breathing did so predominantly within aerosols (Figure 5G). Moreover, short-range aerosols mediated most of the virions (79.2–81.9%) shed via aerosols while talking normally and coughing. In comparison, while singing, or talking loudly, buoyant (14.5%) and long-range (17.5%) aerosols carried a larger proportion of the virions shed via aerosols (Figure 5G).
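The way emission rates and partitioning combine into shedding rates can be sketched as a sum over particle-size bins. The bin values below are placeholders chosen only to exercise the code, not emission profiles or results from the study, and the function names are hypothetical.

```python
def shedding_rate(size_bins):
    """Viable copies shed per minute, summed over particle-size bins.

    size_bins: list of (particles_per_min, expected_viable_copies_per_particle).
    """
    return sum(rate * copies for rate, copies in size_bins)

def route_shares(rate_by_route):
    """Fraction of total shed virions carried by each route."""
    total = sum(rate_by_route.values())
    return {route: rate / total for route, rate in rate_by_route.items()}

# Placeholder bins (NOT study values): many small aerosols with few copies each,
# and a small number of droplets with more copies each.
aerosol_bins = [(100.0, 0.01), (20.0, 0.05)]
droplet_bins = [(2.0, 1.5)]
rates = {"aerosols": shedding_rate(aerosol_bins), "droplets": shedding_rate(droplet_bins)}
print(route_shares(rates))
```

The placeholder numbers show how an activity can emit far more aerosols than droplets while droplets still carry a comparable or larger share of virions, because larger particles have a higher expected number of copies.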
Influence of heterogeneity in rVL on individual infectiousness
To interpret how heterogeneity in rVL influences individual infectiousness, we modeled total SARS-CoV-2 shedding rates (over all particle sizes) for each respiratory activity (Figure 5H, Figure 5—figure supplement 3). Between the 1st and the 99th cps, the estimates for a respiratory activity spanned ≥8.48 orders of magnitude on each DFSO; cumulatively from −1 to 10 DFSO, they spanned 11.0 orders of magnitude. Hence, many COVID-19 cases inherently presented minimal transmission risk, whereas highly infectious individuals shed considerable quantities of SARS-CoV-2. For the 98th cp at 1 DFSO, singing expelled 313 (95% CI: 37.5–3158) virions/min to the ambient environment, talking emitted 293 (35.1–2664) virions/min, breathing exhaled 1.54 (0.18–15.5) virions/min and coughing discharged 249 (29.8–25111) virions/cough; these estimates were approximately two orders of magnitude greater than those for the 85th cp. For the 98th cp at −1 DFSO, singing shed 14.5 (0.15–4515) virions/min and breathing exhaled 7.13 × 10^{−2} (7.20 × 10^{−4}–220.2) virions/min. The estimates at 9–10 DFSO were similar to these presymptomatic ones (Figure 5H, Figure 5—figure supplement 3B). As indicated by comparable mean rVLs (Figure 3) and heterogeneities in rVL (Figure 4B, C), adult, pediatric, symptomatic/presymptomatic and asymptomatic COVID-19 subgroups presented similar distributions for shedding virions through these activities.
We also compared the influence of case variation on individual infectiousness between A(H1N1)pdm09 and COVID-19. Aerosol spread accounted for approximately half of A(H1N1)pdm09 transmission events (Cowling et al., 2013), and the 50% human infectious dose for aerosolized influenza A virus is approximately 1–3 virions in the absence of neutralizing antibodies (Fabian et al., 2008). Based on the model, 62.9% of A(H1N1)pdm09 cases were infectious (shed ≥1 virion) via aerosols within 24 hr of talking loudly or singing (Figure 5—figure supplement 4A), and the estimate was 58.6% within 24 hr of talking normally and 22.3% within 24 hr of breathing. In comparison, 48.0% of COVID-19 cases shed ≥1 virion via aerosols in 24 hr of talking loudly or singing (Figure 5—figure supplement 4C). Notably, only 61.4% of COVID-19 cases shed ≥1 virion via either droplets or aerosols in 24 hr of talking loudly or singing (Figure 5—figure supplement 4D). While the human infectious dose of SARS-CoV-2 by any exposure route remains unelucidated, it must be at least one viable copy. Thus, at least 38.6% of COVID-19 cases were expected to present negligible risk to spread SARS-CoV-2 through either droplets or aerosols in 24 hr. The proportion of potentially infectious cases further decreased as the threshold increased: 55.8, 42.5 and 25.0% of COVID-19 cases were expected to shed ≥2, ≥10 and ≥100 virions, respectively, in 24 hr of talking loudly or singing during the infectious period.
While these analyses indicated that a greater proportion of A(H1N1)pdm09 cases were inherently infectious, 18.8% of COVID-19 cases shed virions more rapidly than those infected with A(H1N1)pdm09 (Figure 4A). At the 98th cp for A(H1N1)pdm09, singing expelled 4.38 (2.85–6.78) virions/min and breathing exhaled 2.15 × 10^{−2} (1.40 × 10^{−2}–30.34 × 10^{−2}) virions/min. Highly infectious COVID-19 cases expelled virions at rates that were up to 1–2 orders of magnitude greater than their A(H1N1)pdm09 counterparts (Figure 5H, Figure 5—figure supplement 5).
Discussion
This study provided systematic analyses of several factors characterizing SARS-CoV-2 transmissibility. First, our results indicate that broader heterogeneity in rVL facilitates greater overdispersion for SARS-CoV-2 than A(H1N1)pdm09. They suggest that many COVID-19 cases infect no one (Bi et al., 2020; Endo et al., 2020; Laxminarayan et al., 2020) because they inherently present minimal transmission risk via respiratory droplets or aerosols, although behavioral and environmental factors may further influence risk. Meanwhile, highly infectious cases can shed tens to thousands of SARS-CoV-2 virions/min, especially between 1 and 5 DFSO, potentiating superspreading events. The model estimates, when corrected to copies rather than virions, align with recent clinical findings for exhalation rates of SARS-CoV-2 (Ma et al., 2020). In comparison, a greater proportion of A(H1N1)pdm09 cases are infectious but shed virions at low rates, which concurs with more uniform transmission and few superspreading events observed during the 2009 H1N1 pandemic (Brugger and Althaus, 2020; Roberts and Nishiura, 2011). Moreover, our analyses suggest that heterogeneity in rVL may be generally associated with overdispersion for viral respiratory infections. In this case, rVL distribution can serve as an early correlate for transmission patterns, including superspreading, during outbreaks of novel respiratory viruses. When considered jointly with contact-tracing studies, this provides epidemiological triangulation on k: heterogeneity in rVL indirectly estimates k via an association, whereas contact tracing empirically characterizes transmission chains to estimate k but is limited by incomplete or incorrect recall of contact events by cases. When transmission is highly overdispersed, targeted interventions may disproportionately mitigate infection (Lee et al., 2020), with models showing that focused control efforts on the most infectious cases outperform random control policies (Lloyd-Smith et al., 2005).
Second, we analyzed SARS-CoV-2 kinetics during respiratory infection. While heterogeneity remains broad throughout the infectious period, rVL tends to peak at 1 DFSO and be elevated for 1–5 DFSO, coinciding with the period of highest attack rates observed among close contacts (Cheng et al., 2020). These results indicate that transmission risk tends to be greatest near, and soon after, illness rather than in the presymptomatic period, which concurs with large tracing studies (6.4–12.6% of secondary infections from presymptomatic transmission) (Du et al., 2020; Wei et al., 2020) rather than early temporal models (~44%) (He et al., 2020). Furthermore, our kinetic analysis suggests that, on average, SARS-CoV-2 reaches diagnostic concentrations 1.54–3.17 days after respiratory infection (−3.84 to −2.21 DFSO), assuming assay detection limits of 1–3 log_{10} copies/ml, respectively, for nasopharyngeal swabs immersed in 1 ml of transport media.
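The DFSO values quoted for diagnostic detection follow from the estimated incubation period of 5.38 days reported in the Results. A minimal sketch of the conversion, with a hypothetical function name:

```python
INCUBATION_DAYS = 5.38  # estimated incubation period from the kinetic model

def days_after_infection_to_dfso(days_after_infection):
    """Convert time since respiratory infection to days from symptom onset (DFSO)."""
    return days_after_infection - INCUBATION_DAYS

for detection_limit_log10, days in [(1, 1.54), (3, 3.17)]:
    print(detection_limit_log10, round(days_after_infection_to_dfso(days), 2))
```

The two conversions give −3.84 and −2.21 DFSO, matching the range stated above for detection limits of 1 and 3 log_{10} copies/ml.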
Third, we assessed the relative infectiousness of COVID-19 subgroups. As a common symptom of COVID-19 (Guan et al., 2020), coughing sheds considerable numbers of virions via droplets and short-range aerosols. Thus, symptomatic infections tend to be more contagious than asymptomatic ones, providing one reason as to why asymptomatic cases transmit SARS-CoV-2 at lower relative rates (Li et al., 2020b), especially in close contact (Luo et al., 2020), despite similar rVLs and increased contact patterns. Accordingly, children (48–54% of symptomatic cases present with cough) (Lu et al., 2020b; Team and CDC COVID-19 Response Team, 2020) may be less contagious than adults (68–80%) (Guan et al., 2020; Team and CDC COVID-19 Response Team, 2020) based on tendencies of symptomatology rather than rVL. Conversely, coughing sheds few virions via smaller aerosols. While singing and talking loudly, highly infectious cases can shed tens to hundreds of SARS-CoV-2 virions/min via long-range and buoyant aerosols.
Our study has limitations. The systematic search found a limited number of studies reporting quantitative specimen measurements from the presymptomatic period, meaning that these estimates may be sensitive to sampling bias. Although additional studies have reported semiquantitative metrics (cycle thresholds), these data were excluded because they cannot be compared on an absolute scale due to batch effects (Han et al., 2021), limiting use in compound analyses. In addition, our models considered virion partitioning during atomization to be a Poisson process, which stochastically associates partitioning with particle volume. Partitioning mechanisms associated with surface area, perhaps such as film bursting (Bird et al., 2010; Johnson and Morawska, 2009), may enrich the quantities of virions in smaller aerosols, based on their surface area-to-volume ratio. As severe COVID-19 is associated with high, persistent SARS-CoV-2 shedding in the lower respiratory tract (Chen et al., 2021) and small particles are typically generated there (Johnson et al., 2011), severe cases may also expel higher quantities of virions via smaller aerosols.
Furthermore, this study considered population-level estimates of the infectious periods, viability proportions and profiles for respiratory particles, which omit individual or environmental variation. Studies differ in their measurements of the emission rates and size distributions of the particles expelled during respiratory activities (Johnson et al., 2011; Schijven et al., 2020). Their characterization methods may prompt these differences, or they may be due to individual variation, including from distinctions in respiratory capacity, especially for young children, and phonetic tendencies (Asadi et al., 2020). Some patients shed SARS-CoV-2 with diminishing viability soon after symptom onset (Wölfel et al., 2020), whereas others produce replication-competent virus for weeks (van Kampen et al., 2021). The proportion of viable SARS-CoV-2 in respiratory particles, and how case characteristics or environmental factors influence it, remains under investigation (Fears et al., 2020; Lednicky et al., 2020; Morris et al., 2020). Cumulatively, these sources of variation may influence the shedding model estimates, further increasing heterogeneity in individual infectiousness.
Taken together, our findings provide a potential path forward for disease control. While talking, singing and coughing, our models indicate that SARS-CoV-2 is shed via droplets (55.6–59.4% of shed virions), short-range aerosols (30.1–34.9%), long-range aerosols (7.7–8.3%) and buoyant aerosols (0.01–6.5%). Transmission, however, requires exposure. For direct transmission, droplets tend to be sprayed ballistically onto susceptible tissue. Aerosols can be inhaled, may penetrate more deeply into the lungs and more easily facilitate superspreading events. However, with short durations of stay in well-ventilated areas, the exposure risk for both droplets and aerosols remains correlated with proximity to infectious cases (Liu et al., 2017a; Prather et al., 2020). Strategies to abate infection should limit crowd numbers and duration of stay while reinforcing distancing, low-voice amplitudes and widespread mask usage; well-ventilated settings can be recognized as lower-risk venues. Coughing can shed considerable quantities of virions, while rVL tends to peak at 1 DFSO and can be high throughout the infectious period. Thus, immediate, sustained self-isolation upon illness is crucial to curb transmission from symptomatic cases. Collectively, our analyses highlight the role of cases with high rVLs in propelling the COVID-19 pandemic. While diagnosing COVID-19, qRT-PCR can also triage contact tracing, prioritizing these patients: for nasopharyngeal swabs immersed in 1 ml of transport media, ≥7.14 (95% CI: 7.07–7.22) log_{10} copies/ml corresponds to the top 20% of COVID-19 cases for variants before August 2020. Doing so may identify asymptomatic and presymptomatic infections more efficiently, a key step towards mitigation and elimination as the pandemic continues.
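The triage rule suggested above can be expressed as a simple threshold check. This is a minimal sketch under the stated conditions (nasopharyngeal swab immersed in 1 ml of transport media, variants before August 2020); the function and constant names are hypothetical, and the threshold would need re-estimation for other specimen types or variants.

```python
import math

TOP_20_PERCENT_LOG10 = 7.14  # log10 copies/ml, point estimate from this study

def prioritize_for_tracing(copies_per_ml, threshold_log10=TOP_20_PERCENT_LOG10):
    """Return True if a quantitative specimen measurement falls in the top 20% of cases."""
    if copies_per_ml <= 0:
        return False  # no quantifiable virus
    return math.log10(copies_per_ml) >= threshold_log10

print(prioritize_for_tracing(1e8), prioritize_for_tracing(1e6))
```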
Materials and methods
Search strategy, selection criteria and data collection
We undertook a systematic review and prospectively submitted the protocol for registration on PROSPERO (registration number, CRD42020204637). Other than the title of this study, we have followed PRISMA reporting guidelines (Moher et al., 2009). The systematic review was conducted according to Cochrane methods guidance (Higgins et al., 2019).
The search included papers that (i) reported positive, quantitative measurements (copies/ml or an equivalent metric) of SARS-CoV-2, SARS-CoV-1 or A(H1N1)pdm09 in human respiratory specimens (endotracheal aspirate [ETA], nasopharyngeal aspirate [NPA], nasopharyngeal swab [NPS], oropharyngeal swab [OPS], posterior oropharyngeal saliva [POS] and sputum [Spu]) from COVID-19, SARS or A(H1N1)pdm09 cases; (ii) reported data that could be extracted from the estimated infectious periods of SARS-CoV-2 (defined as −3 to +10 DFSO for symptomatic cases and 0 to +10 days from the day of laboratory diagnosis for asymptomatic cases), SARS-CoV-1 (defined as 0 to +20 DFSO or the equivalent asymptomatic period) or A(H1N1)pdm09 (defined as −2 to +9 DFSO for symptomatic cases and 0 days to +9 days from the day of laboratory diagnosis for asymptomatic cases); and (iii) reported data for two or more cases with laboratory-confirmed COVID-19, SARS or A(H1N1)pdm09 based on World Health Organization (WHO) case definitions. Quantitative specimen measurements were considered after RNA extraction for diagnostic sequences of SARS-CoV-2 (Orf1b, N, RdRp and E genes), SARS-CoV-1 (Orf1b, N and RdRp genes) and A(H1N1)pdm09 (HA and M genes).
Studies were excluded, in the following order, if they (i) studied an ineligible disease; (ii) had an ineligible study design, including those that were reviews of evidence (e.g., scoping, systematic or narrative), did not include primary clinical human data, reported data for fewer than two cases due to an increased risk of selection bias, were incomplete (e.g., ongoing clinical trials), did not report an RNA extraction step before measurement or were studies of environmental samples; (iii) reported an ineligible metric for specimen concentration (e.g., qualitative RT-PCR or cycle threshold [Ct] values without calibration included in the study); (iv) reported quantitative measurements from an ineligible specimen type (e.g., blood specimens, pooled specimens or self-collected POS or Spu patient specimens in the absence of a healthcare professional); (v) reported an ineligible sampling period (consisted entirely of data that could not be extracted from within the infectious period); or (vi) were duplicates of an included study (e.g., preprinted version of a published paper or duplicates not identified by Covidence). We included data from control groups receiving standard of care in interventional studies but excluded data from the intervention group. Patients in the intervention group are, by definition, systematically different from general case populations because they receive therapies not being widely used for treatment, which may influence virus concentrations. Interventional studies examining the comparative effectiveness of two or more treatments were excluded for the same reason. Studies exclusively reporting semiquantitative measurements (e.g., Ct values) of specimen concentration were excluded as these measurements are sensitive to batch and instrument variation and, without proper calibration, cannot be compared on an absolute scale across studies (Han et al., 2021).
We searched, without the use of filters or language restrictions, the following sources: MEDLINE (via Ovid, 1946 to 7 August 2020), EMBASE (via Ovid, 1974 to 7 August 2020), Cochrane Central Register of Controlled Trials (via Ovid, 1991 to 7 August 2020), Web of Science Core Collection (including Science Citation Index Expanded, 1900 to 7 August 2020; Social Sciences Citation Index, 1900 to 7 August 2020; Arts & Humanities Citation Index, 1975 to 7 August 2020; Conference Proceedings Citation Index – Science, 1990 to 7 August 2020; Conference Proceedings Citation Index – Social Sciences & Humanities, 1990 to 7 August 2020; and Emerging Sources Citation Index, 2015 to 7 August 2020), as well as medRxiv and bioRxiv (both searched through Google Scholar via the Publish or Perish program, to 7 August 2020). We also gathered studies by searching through the reference lists of review articles identified by the database search, by searching through the reference lists of included articles, through expert recommendation (by Eric J. Topol and Akiko Iwasaki on Twitter) and by handsearching through journals (Nature, Nat. Med., Science, NEJM, Lancet, Lancet Infect. Dis., JAMA, JAMA Intern. Med. and BMJ). A comprehensive search was developed by a librarian, which included subject headings and keywords. The search strategy had three main concepts (disease, specimen type and outcome), and each concept was combined using the appropriate Boolean operators. The search was tested against a sample set of known articles that were preidentified. The linebyline search strategies for all databases are included in Figure 1—source data 1, Figure 1—source data 2, Figure 1—source data 3, Figure 1—source data 4, Figure 1—source data 5. The search results were exported from each database and uploaded to the Covidence online system for deduplication and screening.
Two authors independently screened titles and abstracts, reviewed full texts, collected data and assessed risk of bias via Covidence and a hybrid critical appraisal checklist based on the Joanna Briggs Institute (JBI) tools for case series, analytical cross-sectional studies and prevalence studies (Moola et al., 2020; Munn et al., 2019; Munn et al., 2015). To evaluate the sample size in a study, we used the following calculation:

$${n}^{*} = \frac{{z}^{2}{\sigma}^{2}}{{d}^{2}} \tag{1}$$
where ${n}^{*}$ is the sample size threshold, $z$ is the z-score for the level of confidence (95%), $\sigma $ is the standard deviation (assumed to be 3 log_{10} copies/ml, one quarter of the full range of rVLs) and $d$ is the marginal error (assumed to be 1 log_{10} copies/ml, based on the minimum detection limit for qRT-PCR across studies) (Johnston et al., 2019). The hybrid JBI critical appraisal checklist is shown in the Appendix. Studies were considered to have low risk of bias if they met the majority of the items, indicating that the estimates were likely to be correct for the target population. Inconsistencies were resolved by discussion and consensus.
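As a worked illustration of the sample size threshold defined above, the following minimal Python sketch evaluates it with the stated values of $\sigma$ and $d$. The value z = 1.96 (two-sided 95% confidence) and the rounding up to a whole number of samples are assumptions made for this example, and the function name is hypothetical.

```python
import math


def sample_size_threshold(z: float, sigma: float, d: float) -> float:
    """Sample size threshold n* = (z * sigma / d)^2 for estimating a mean."""
    return (z * sigma / d) ** 2


Z_95 = 1.96   # assumed z-score for two-sided 95% confidence
SIGMA = 3.0   # log10 copies/ml, as stated in the text
MARGIN = 1.0  # log10 copies/ml, as stated in the text

n_star = sample_size_threshold(Z_95, SIGMA, MARGIN)
# Rounding up to whole samples is an assumption of this sketch.
n_required = math.ceil(n_star)
print(f"n* = {n_star:.2f}, rounded up to {n_required}")
```

Under these assumptions, a study would need roughly 35 samples to meet the threshold.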
The search found 29 studies for COVID19 (Argyropoulos et al., 2020; Baggio et al., 2020; Fajnzylber et al., 2020; Han et al., 2020a; Han et al., 2020b; Hung et al., 2020; Hurst et al., 2020; Iwasaki et al., 2020; Kawasuji et al., 2020; L'Huillier et al., 2020; Lavezzo et al., 2020; Lennon et al., 2020; Lucas et al., 2020; Mitjà et al., 2020; Pan et al., 2020; Peng et al., 2020; Perera et al., 2020; Shi et al., 2020; Shrestha et al., 2020; To et al., 2020; van Kampen et al., 2021; Vetter et al., 2020; Wölfel et al., 2020; Wyllie et al., 2020; Xu et al., 2020; Yonker et al., 2020; Zhang et al., 2020a; Zheng et al., 2020; Zou et al., 2020), 8 studies for SARS (Chen et al., 2006; Cheng et al., 2004; Chu et al., 2005; Chu et al., 2004; Hung et al., 2004; Peiris et al., 2003; Poon et al., 2004; Poon et al., 2003) and 27 studies for A(H1N1)pdm09 (Chan et al., 2011; Cheng et al., 2010; Cowling et al., 2010; Duchamp et al., 2010; Esposito et al., 2011; Hung et al., 2010; Ip et al., 2016; Ito et al., 2012; Killingley et al., 2010; Launes et al., 2012; Lee et al., 2011a; Lee et al., 2011b; Li et al., 2010a; Li et al., 2010b; Loeb et al., 2012; Lu et al., 2012; Meschi et al., 2011; Ngaosuwankul et al., 2010; Rath et al., 2012; Rodrigues Guimarães Alves et al., 2020; Suess et al., 2010; Thai et al., 2014; To et al., 2010a; To et al., 2010b; Watanabe et al., 2011; Wu et al., 2012; Yang et al., 2011), and data were collected from each study. For preprinted studies that were published as journal articles before the revised submission of this manuscript, we included the citation for the journal article. Descriptive statistics on quantitative specimen measurements were collected from confirmed cases directly if reported numerically or using WebPlotDigitizer 4.3 (https://apps.automeris.io/wpd/) if reported graphically. Individual specimen measurements were collected directly if reported numerically or, when the data were clearly represented, using the tool if reported graphically. 
We also collected the relevant numbers of cases, types of cases, reported treatments, volumes of transport media, numbers of specimens and DFSO (for symptomatic cases) or day relative to initial laboratory diagnosis (for asymptomatic cases) on which each specimen was taken. Hospitalized cases were defined as those being tested in a hospital setting and then admitted. Nonadmitted cases were defined as those being tested in a hospital setting but not admitted. Community cases were defined as those being tested in a community setting. Symptomatic, presymptomatic and asymptomatic infections were defined as in the study. Because paucisymptomatic infections were rarely described in contributing studies, they were included with symptomatic ones when described. Pediatric cases were defined as those below 18 years of age or as defined in the study. Adult cases were defined as those 18 years of age or older or as defined in the study.
Calculation of rVLs from specimen measurements
In this study, viral concentrations in respiratory specimens were denoted as specimen measurements, whereas viral concentrations in the respiratory tract were denoted as rVLs. To determine rVLs, each collected quantitative specimen measurement was converted to rVL based on the dilution factor. For example, measurements from swabbed specimens (NPS and OPS) typically report the RNA concentration in viral transport media. Based on the expected uptake volume for swabs (0.128 ± 0.031 ml, mean ± SD) (Warnke et al., 2014) or reported collection volume for expulsed fluid in the study (e.g., 0.5–1 ml) along with the reported volume of transport media in the study (e.g., 1 ml), we calculated the dilution factor for each respiratory specimen to estimate the rVL. If the diluent volume was not reported, then the dilution factor was calculated assuming a volume of 1 ml (NPS and OPS), 2 ml (POS and ETA) or 3 ml (NPA) of transport media (Lavezzo et al., 2020; Poon et al., 2004; To et al., 2020). Unless dilution was reported for Spu specimens, we used the specimen measurement as the rVL (Wölfel et al., 2020). The nonreporting of diluent volume was noted as an element increasing risk of bias in the hybrid JBI critical appraisal checklist. Specimen measurements (based on instrumentation, calibration, procedures and reagents) are not standardized and, as DFSO is typically based on patient recall, there is also inherent uncertainty in these values. While the above procedures (including only quantitative measurements after extraction as an inclusion criterion, considering assay detection limits and correcting for specimen dilution) have considered many of these factors, nonstandardization remains an inherent limitation in the variability of specimen measurements.
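The dilution correction described above can be sketched as follows. The text does not state the formula for the dilution factor, so this example assumes it is the total volume (specimen plus transport media) divided by the specimen volume; the function names and the dictionary of default media volumes are hypothetical, with the numerical values taken from the paragraph above.

```python
import math

# Expected uptake volume for swabs (mean), from the text.
SWAB_UPTAKE_ML = 0.128
# Default transport media volumes used when the diluent volume was not reported.
DEFAULT_MEDIA_ML = {"NPS": 1.0, "OPS": 1.0, "POS": 2.0, "ETA": 2.0, "NPA": 3.0}


def dilution_factor(specimen_ml: float, media_ml: float) -> float:
    """Assumed form: total volume divided by the collected specimen volume."""
    return (specimen_ml + media_ml) / specimen_ml


def specimen_to_rvl(log10_specimen: float, specimen_ml: float, media_ml: float) -> float:
    """Convert a specimen measurement (log10 copies/ml) to rVL (log10 copies/ml)."""
    return log10_specimen + math.log10(dilution_factor(specimen_ml, media_ml))


# Example: an NPS measurement of 7.14 log10 copies/ml in 1 ml of transport media.
rvl = specimen_to_rvl(7.14, SWAB_UPTAKE_ML, DEFAULT_MEDIA_ML["NPS"])
print(f"estimated rVL = {rvl:.2f} log10 copies/ml")
```

Under this assumption, a swab in 1 ml of media is diluted about 8.8-fold, which adds just under 1 log_{10} unit to the specimen measurement.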
Meta-regression of k and heterogeneity in rVL
To assess the relationship between k and heterogeneity in rVL, we performed a univariate meta-regression ($\log k = a \cdot \mathrm{SD} + b$, where $a$ is the slope for association and $b$ is the intercept) between pooled estimates of k (based on studies describing community transmission) for COVID-19 (k = 0.409) (Adam et al., 2020; Tariq et al., 2020; Zhang et al., 2020b; Laxminarayan et al., 2020; Bi et al., 2020; Endo et al., 2020; Riou and Althaus, 2020), SARS (k = 0.165) (Lloyd-Smith et al., 2005) and A(H1N1)pdm09 (k = 8.155) (Brugger and Althaus, 2020; Roberts and Nishiura, 2011) and the SD of the rVLs in contributing studies. Since SD was the metric, we used a fixed-effects model. For weighting in the meta-regression, we used the proportion of rVL samples from each study relative to the entire systematic dataset (${W}_{i} = {n}_{i}/{n}_{\mathrm{total}}$). All calculations were performed in units of log_{10} copies/ml. As the meta-regression used pooled estimates of k for each infection, it assumed that there was no correlated bias to k across contributing studies. The limit of detection for qRT-PCR instruments used in the included studies did not significantly affect the analysis of heterogeneity in rVL as these limits tended to be below the values found for specimens with low virus concentrations. The meta-regression was conducted using all contributing studies and showed a weak association. Meta-regression was also conducted using studies that had low risk of bias according to the hybrid JBI critical appraisal checklist and showed a strong association. The p-value for association was obtained using the meta-regression slope t-test for $a$, the effect estimate.
While there is intrinsic measurement error in virus quantitation, based on the systematic review protocol and study design (as described above), this error should similarly increase heterogeneity in rVL for each virus, and the difference in heterogeneity in rVL between viruses should arise from the viruses.
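A weighted linear fit of this kind can be sketched in plain Python as below. The pooled k values are those quoted above; the per-study SDs and sample counts are hypothetical placeholders rather than values from the systematic dataset, the base-10 logarithm is an assumption (the text writes only "log"), and the slope t-test is not reproduced here.

```python
import math


def weighted_linear_fit(x, y, w):
    """Weighted least-squares fit of y = a*x + b; returns (a, b)."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    a = s_xy / s_xx
    return a, y_bar - a * x_bar


# Pooled dispersion parameters quoted in the text.
POOLED_K = {"COVID-19": 0.409, "SARS": 0.165, "A(H1N1)pdm09": 8.155}

# Hypothetical studies: (disease, SD of rVL in log10 copies/ml, number of samples).
studies = [
    ("COVID-19", 2.0, 120),
    ("COVID-19", 2.2, 80),
    ("SARS", 2.4, 40),
    ("A(H1N1)pdm09", 1.2, 100),
    ("A(H1N1)pdm09", 1.4, 60),
]

n_total = sum(n for _, _, n in studies)
sd = [s for _, s, _ in studies]
log_k = [math.log10(POOLED_K[d]) for d, _, _ in studies]  # base 10 is assumed
weights = [n / n_total for _, _, n in studies]            # W_i = n_i / n_total

slope, intercept = weighted_linear_fit(sd, log_k, weights)
print(f"slope a = {slope:.3f}, intercept b = {intercept:.3f}")
```

With these placeholder inputs the slope is negative, meaning greater heterogeneity in rVL accompanies a smaller k (greater overdispersion).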
Meta-analysis of rVLs
Based on the search design and composition of contributing studies, the meta-analysis overall estimates were the expected SARS-CoV-2, SARS-CoV-1 and A(H1N1)pdm09 rVL when encountering a COVID-19, SARS or A(H1N1)pdm09 case, respectively, during their infectious period. Pooled estimates and 95% CIs for the expected rVL of each virus across their infectious period were calculated using a random-effects meta-analysis (DerSimonian and Laird method). For studies reporting summary statistics in medians and interquartile or total ranges, we derived estimates of the mean and variance and calculated the 95% CIs (Wan et al., 2014). All calculations were performed in units of log_{10} copies/ml. Between-study heterogeneity in meta-analysis was assessed using Cochran’s Q test and the I^{2} and τ^{2} statistics. If significant between-study heterogeneity in meta-analysis was encountered, sensitivity analysis based on the risk of bias of contributing studies was performed. The meta-analyses were conducted using STATA 14.2 (StataCorp LLC, College Station, TX, USA).
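For readers unfamiliar with the DerSimonian and Laird estimator, the following Python sketch shows the standard calculation of the pooled estimate, Cochran's Q, τ^{2} and I^{2} from study means and their variances. It is an illustration only and not the STATA procedure used in this study; the function name and the example inputs are hypothetical.

```python
import math


def dersimonian_laird(means, variances, z=1.96):
    """Random-effects pooling (DerSimonian-Laird) of study means.

    `variances` are the within-study variances of the means (squared SEs).
    """
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    mu_fixed = sum(wi * yi for wi, yi in zip(w, means)) / sum_w
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, means))  # Cochran's Q
    df = len(means) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_star, means)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"pooled": mu, "ci": (mu - z * se, mu + z * se),
            "Q": q, "tau2": tau2, "I2": i2}


# Hypothetical study-level rVL means (log10 copies/ml) and variances of the means.
result = dersimonian_laird([6.9, 7.4, 5.8, 6.5], [0.04, 0.09, 0.06, 0.02])
print(result)
```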
Age and symptomatology subgroup analyses of SARS-CoV-2 rVLs
The overall estimate for each subgroup was the expected rVL when encountering a case of that subgroup during the infectious period. Studies reporting data exclusively from a subgroup of interest were directly included in the analysis after rVL estimations. For studies in which data for these subgroups constituted only part of the dataset, rVLs from the subgroup were extracted to calculate the mean, variance and 95% CIs. Random-effects meta-analysis was performed as described above. For meta-analyses of pediatric and asymptomatic COVID-19 cases, contributing studies had low risk of bias, and no risk-of-bias sensitivity analyses were performed for these subgroups.
Distributions of rVL
We pooled the entirety of individual sample data in the systematic dataset by disease, COVID-19 subgroups and DFSO. For analyses of SARS-CoV-2 dynamics across disease course, we included estimated rVLs from negative qRT-PCR measurements of respiratory specimens for cases that had previously been quantitatively confirmed to have COVID-19. These rVLs were estimated based on the reported assay detection limit in the respective study. Probability plots and modified Kolmogorov–Smirnov tests used the Blom scoring method and were used to determine the suitability of normal, lognormal, gamma and Weibull distributions to describe the distribution of rVLs for SARS-CoV-2, SARS-CoV-1 and A(H1N1)pdm09. For each virus, the data best conformed to Weibull distributions, which are described by the probability density function

$$f\left(\upsilon \mid \alpha ,\beta \right) = \frac{\alpha }{\beta }{\left(\frac{\upsilon }{\beta }\right)}^{\alpha -1}{e}^{-{\left(\upsilon /\beta \right)}^{\alpha }} \tag{2}$$
where $\alpha $ is the shape factor, $\beta $ is the scale factor and $\upsilon $ is rVL ($\upsilon \ge 0$ log_{10} copies/ml). Weibull distributions were fitted on the entirety of collected individual sample data for the respective category. Since individual specimen measurements could not be collected from all studies, there was a small bias on the mean estimate for each fitted distribution. Thus, for the curves shown in Figure 4B, C, the mean of the Weibull distributions summarized in Figure 4—figure supplement 2 was adjusted to be the subgroup metaanalysis estimate for correction; the SD and distribution around that mean remained consistent.
For each Weibull distribution, the value of the rVL at the $x$th percentile was determined using the quantile function,

$${\upsilon }_{x} = \beta {\left[-\mathrm{ln}\left(1-x\right)\right]}^{1/\alpha } \tag{3}$$
For cp curves, we used Equation (3) to determine rVLs from the 1st cp to the 99th cp (step size, 1%). Curve fitting to Equation (2) and calculation of Equation (3) and its 95% CI was performed using the Distribution Fitter application in Matlab R2019b (MathWorks, Inc, Natick, MA, USA).
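The construction of a cp curve from the Weibull quantile function can be sketched as follows. This is not the MATLAB workflow used in the study: the shape and scale values are hypothetical placeholders rather than fitted parameters, and the percentile is passed as a proportion between 0 and 1, which is an assumption about how $x$ is expressed.

```python
import math


def weibull_quantile(x: float, alpha: float, beta: float) -> float:
    """rVL (log10 copies/ml) at proportion x for shape alpha and scale beta."""
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    return beta * (-math.log(1.0 - x)) ** (1.0 / alpha)


# Hypothetical Weibull parameters, for illustration only.
ALPHA, BETA = 3.5, 7.0

# rVLs from the 1st to the 99th case percentile in steps of 1%.
cp_curve = [(cp, weibull_quantile(cp / 100.0, ALPHA, BETA)) for cp in range(1, 100)]

for cp, rvl in cp_curve[::14]:
    print(f"cp {cp:2d}: {rvl:.2f} log10 copies/ml")
```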
Viral kinetics
To model SARS-CoV-2 kinetics during respiratory infection, we used a mechanistic epithelial cell-limited model for the respiratory tract (Baccam et al., 2006), based on the system of differential equations:

$$\frac{dT}{dt} = -\beta TV \tag{4}$$

$$\frac{dI}{dt} = \beta TV - \delta I \tag{5}$$

$$\frac{dV}{dt} = pI - cV \tag{6}$$
where $T$ is the number of uninfected target cells, $I$ is the number of productively infected cells, $V$ is the rVL, $\beta $ is the infection rate constant, $p$ is the rate at which airway epithelial cells shed virus to the extracellular fluid, $c$ is the clearance rate of virus and $\delta $ is the clearance rate of productively infected cells. Using these parameters, the viral half-life in the respiratory tract (${t}_{1/2}=\mathrm{ln}2/c$) and the half-life of productively infected cells (${t}_{1/2}=\mathrm{ln}2/\delta $) could be estimated. Moreover, the cellular basic reproductive number (the expected number of secondary infected cells from a single productively infected cell placed in a population of susceptible cells) was calculated by

$${R}_{0,c} = \frac{p\beta {T}_{0}}{c\delta } \tag{7}$$
For initial parameterization, Equations (4)–(6) were simplified according to a quasi-steady state approximation (Ikeda et al., 2016) to

$$\frac{dT}{dt} = -\beta TV \tag{8}$$

$$\frac{dV}{dt} = rTV - \delta V \tag{9}$$
where $r=p\beta /c$, for a form with greater numerical stability. The system of differential equations was fitted on the mean estimates of SARS-CoV-2 rVL between 2 and 10 DFSO using the entirety of individual sample data in units of copies/ml. Numerical analysis was implemented using the Fit ODE app in OriginPro 2019b (OriginLab Corporation, Northampton, MA, USA) via the Runge–Kutta method and initial parameters ${V}_{0}$, ${I}_{0}$ and ${T}_{0}$ of 4 copies/ml, 0 cells and 5 × 10^{7} cells, respectively, for the range –5 to 10 DFSO. The analysis was first performed with Equations (8) and (9). These output parameters were then used to initialize final analysis using Equations (4)–(6), where the estimates for $\beta $ and $\delta $ were input as fixed and variable parameters, respectively. The fitted line and its coefficient of determination (r^{2}) were presented. The estimated half-life of SARS-CoV-2 RNA has a skewed 95% CI (Figure 4—figure supplement 4). As $c$ is in the denominator of the equation for half-life (${t}_{1/2}=\mathrm{ln}2/c$), ${t}_{1/2}$ is sensitive to c below 1, which is the case for its lower 95% CI (Figure 4—figure supplement 4) and the source of the skew.
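To illustrate how a target cell-limited model of this form behaves, the sketch below integrates it with a hand-written classical fourth-order Runge–Kutta scheme in plain Python. It is not the OriginPro fitting procedure: the rate constants are hypothetical values chosen only to give a plausible trajectory, the step size is an assumption, and only the initial conditions and time range are taken from the text. The model equations are those of Baccam et al., 2006 as described by the parameter definitions above.

```python
def derivatives(state, beta, delta, p, c):
    """Right-hand side of the target cell-limited model for (T, I, V)."""
    t_cells, i_cells, v = state
    infection = beta * t_cells * v
    return (-infection, infection - delta * i_cells, p * i_cells - c * v)


def rk4_step(state, h, params):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))

    k1 = derivatives(state, *params)
    k2 = derivatives(shifted(k1, h / 2), *params)
    k3 = derivatives(shifted(k2, h / 2), *params)
    k4 = derivatives(shifted(k3, h), *params)
    return tuple(
        s + h / 6 * (a + 2 * b + 2 * c_ + d)
        for s, a, b, c_, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state0, params, t_start=-5.0, t_end=10.0, h=0.01):
    """Integrate from t_start to t_end (DFSO) with fixed step h (days)."""
    steps = round((t_end - t_start) / h)
    times, states = [t_start], [state0]
    for n in range(1, steps + 1):
        states.append(rk4_step(states[-1], h, params))
        times.append(t_start + n * h)
    return times, states


# Hypothetical rate constants (beta, delta, p, c); NOT the fitted estimates.
PARAMS = (5e-8, 1.0, 10.0, 5.0)
# Initial conditions from the text: T0 = 5e7 cells, I0 = 0 cells, V0 = 4 copies/ml.
STATE0 = (5e7, 0.0, 4.0)

times, states = simulate(STATE0, PARAMS)
beta, delta, p, c = PARAMS
r0_cell = p * beta * STATE0[0] / (c * delta)  # cellular basic reproductive number
peak_v = max(s[2] for s in states)
print(f"R0,c = {r0_cell:.1f}, peak V = {peak_v:.3g} copies/ml")
```

A fixed step is adequate here because the hypothetical clearance rates are small relative to 1/h; stiffer parameter sets would call for a smaller step or an adaptive solver.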
To estimate the average incubation period, we extrapolated the kinetic model to 0 log_{10} copies/ml pre-symptom onset. To estimate the average duration of shedding, we extrapolated the model to 0 log_{10} copies/ml post-symptom onset. Unlike in experimental studies, this estimate for duration of shedding was not defined by assay detection limits. To estimate the average DFSO on which SARS-CoV-2 concentration reached diagnostic levels, we extrapolated the model pre-symptom onset to the equivalent of 1 and 3 log_{10} copies/ml (chosen as example assay detection limits) in specimen concentration for NPSs immersed in 1 ml of transport media, as described by the dilution factor estimation above. The average time from respiratory infection to reach diagnostic levels was then calculated by subtracting these values from the estimated average incubation period. The extrapolated time for SARS-CoV-2 to reach diagnostic concentrations in the respiratory tract should be validated in tracing studies, in which contacts are prospectively subjected to daily sampling.
Likelihood of respiratory particles containing virions
To calculate an unbiased estimator for viral partitioning (the expected number of viable copies in an expelled particle at a given size), we multiplied rVLs with the volume equation for spherical particles during atomization and the estimated viability proportion, according to the following equation:

$$\lambda = \frac{\pi \rho {v}_{p}\gamma \upsilon {d}^{3}}{6} \tag{10}$$
where $\lambda $ is the expectation value, $\rho $ is the material density of the respiratory particle (997 kg/m^{3}), ${v}_{p}$ is the volumetric conversion factor (1 ml/g), $\gamma $ is the viability proportion, $\upsilon $ is the rVL and $d$ is the hydrated diameter of the particle during atomization.
The model assumed $\gamma $ was 0.1% as a population-level estimate. For influenza, approximately 0.1% of copies in particles expelled from the respiratory tract represent viable virus (Yan et al., 2018), which is equivalent to one viable copy in 3 log_{10} copies/ml for rVL or, after dilution in transport media, roughly one in 4 log_{10} copies/ml for specimen concentration. Respiratory specimens taken from influenza cases show positive cultures for specimen concentrations down to 4 log_{10} copies/ml (Lau et al., 2010). Likewise, for COVID-19 cases, recent reports also show culture-positive respiratory specimens with SARS-CoV-2 concentrations down to 4 log_{10} copies/ml (Wölfel et al., 2020), including from pediatric (L'Huillier et al., 2020) and asymptomatic (Arons et al., 2020) cases. Moreover, replication-competent SARS-CoV-2 has been found in respiratory specimens taken throughout the respiratory tract (mouth, nasopharynx, oropharynx and lower respiratory tract) (Jeong et al., 2020; Wölfel et al., 2020). Taken together, these considerations suggested that the assumption for viability proportion (0.1%) was suitable to model the likelihood of respiratory particles containing viable SARS-CoV-2. In accordance with the discussion above, the model did not differentiate this population-level viability estimate based on age, symptomatology or sites of atomization. Given the residence time of expelled particles before assessment (~5 s) (Yan et al., 2018), we took the viability proportion to be that of equilibrated particles.
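The partitioning estimate can be illustrated numerically as below. The sketch assumes that rVL is supplied in log_{10} copies/ml and converted to copies/ml, that the particle diameter is given in μm, and that the product of material density (997 kg/m^{3}, i.e. 0.997 g/cm^{3}) and the volumetric conversion factor (1 ml/g) acts as a dimensionless factor on the spherical volume. These unit conventions and the function name are assumptions of the example.

```python
import math


def expected_viable_copies(
    log10_rvl: float,
    diameter_um: float,
    viability: float = 0.001,          # gamma, 0.1% as assumed in the text
    density_g_per_cm3: float = 0.997,  # rho, 997 kg/m^3
    v_p_ml_per_g: float = 1.0,         # volumetric conversion factor
) -> float:
    """Expected viable copies (lambda) in one particle of hydrated diameter d."""
    diameter_cm = diameter_um * 1e-4                  # 1 um = 1e-4 cm
    volume_cm3 = math.pi / 6.0 * diameter_cm ** 3     # volume of a sphere
    fluid_ml = density_g_per_cm3 * v_p_ml_per_g * volume_cm3
    return viability * 10.0 ** log10_rvl * fluid_ml


# Example: a 100-um particle from a case with an rVL of 10 log10 copies/ml.
lam = expected_viable_copies(10.0, 100.0)
print(f"lambda = {lam:.2f} viable copies per particle")
```

Because the expectation scales with the cube of the diameter, a 10-fold smaller particle carries a 1000-fold smaller expected number of viable copies at the same rVL.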
Likelihood profiles were determined using Poisson statistics, as described by the probability mass function

$$P\left(k\right) = \frac{{\lambda }^{k}{e}^{-\lambda }}{k!} \tag{11}$$
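where $k$ is the number of virions partitioned within the particle. For $\lambda $, 95% CIs were determined using the variance of its rVL estimate. To determine 95% CIs for likelihood profiles from the probability mass function, we used the delta method, which specifies

$$\mathrm{Var}\left[g\left(\mathit{\theta}\right)\right] \approx {\sigma}^{2}\,\dot{g}{\left(\mathit{\theta}\right)}^{\prime}\,\mathbf{D}\,\dot{g}\left(\mathit{\theta}\right) \tag{12}$$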
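where ${\sigma}^{2}\mathbf{D}$ is the covariance matrix of $\mathit{\theta}$ and $\dot{g}\left(\mathit{\theta}\right)$ is the gradient of $g\left(\mathit{\theta}\right)$. For the univariate Poisson distribution, ${\sigma}^{2}\mathbf{D}=\lambda $ and

$$\dot{g}\left(\lambda \right) = \frac{\partial }{\partial \lambda }\left(\frac{{\lambda }^{k}{e}^{-\lambda }}{k!}\right) = \frac{{\lambda }^{k-1}{e}^{-\lambda }\left(k-\lambda \right)}{k!} \tag{13}$$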
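The Poisson likelihood and a delta-method interval can be sketched as follows. The gradient is the analytical derivative of the Poisson probability mass function with respect to $\lambda$; the variance of $\lambda$ is left as an argument so that the caller can supply the value appropriate to their setting (the text sets it to $\lambda$ in the univariate case). Clipping the interval to [0, 1] and the function names are assumptions of this example.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a particle contains exactly k virions."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def poisson_pmf_gradient(k: int, lam: float) -> float:
    """Derivative of the Poisson pmf with respect to lambda."""
    if k == 0:
        return -math.exp(-lam)
    return lam ** (k - 1) * math.exp(-lam) * (k - lam) / math.factorial(k)


def delta_method_ci(k: int, lam: float, var_lam: float, z: float = 1.96):
    """Approximate 95% CI for P(k) via the univariate delta method."""
    estimate = poisson_pmf(k, lam)
    se = abs(poisson_pmf_gradient(k, lam)) * math.sqrt(var_lam)
    # Clipping to [0, 1] is an assumption of this sketch.
    return max(0.0, estimate - z * se), min(1.0, estimate + z * se)


def prob_at_least_one(lam: float) -> float:
    """Probability that a particle contains one or more virions."""
    return 1.0 - math.exp(-lam)


lam = 0.05  # hypothetical expectation value for one particle
low, high = delta_method_ci(1, lam, var_lam=lam)
print(f"P(1) = {poisson_pmf(1, lam):.4f} (95% CI {low:.4f} to {high:.4f})")
print(f"P(>=1) = {prob_at_least_one(lam):.4f}")
```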
Rate profiles of particles expelled by respiratory activities
Distributions from the literature were used to determine the rate profiles of particles expelled during respiratory activities. For breathing, talking and coughing, we used data from Johnson et al., 2011. For singing, we used data from Morawska et al., 2009 for smaller aerosols (d_{a} < 20 μm) and used the profiles from talking for larger aerosols and droplets based on the oral cavity mechanism from Johnson et al., 2011. Rate profiles (particles/min or particles/cough) were calculated based on the corrected normalized concentration (dC_{n}/dlogD_{p}, in units of particles/cm^{3}) at each discrete particle size, normalization (32 size channels per decade) for the aerodynamic particle sizer used, unit conversion (cm^{3} to l) and the sample flow rate (1 l/min). For coughing, the calculation assumed that participants coughed 10 times in the 30-s sampling interval. To determine the corrected normalized concentrations for breathing, we used a particle dilution factor of 4 and evaporation factor of 0.5, consistent with the other respiratory activities in Johnson et al., 2011. Breathing was taken to expel negligible quantities of larger respiratory particles based on the bronchiolar fluid film burst mechanism (Johnson et al., 2011). To account for intermittent breathing while talking and singing, the rate profiles for these activities included the contribution of aerosols expelled by breathing. We compared these rate profiles with those collected from talking loudly and talking quietly from Asadi et al., 2020. In our models, we took the diameter of dehydrated respiratory particles to be 0.3 times the initial size when atomized in the respiratory tract (Johnson et al., 2011; Lieber et al., 2021; Liu et al., 2017b).
Equilibrium aerodynamic diameter was calculated by ${d}_{a}={d}_{p}{\left(\rho /{\rho}_{0}\right)}^{1/2}$, where ${d}_{p}$ is the dehydrated diameter, $\rho $ is the material density of the respiratory particle and ${\rho}_{0}$ is the reference material density (1 g/cm^{3}). Curves based on discrete particle measurements were connected using the nonparametric Akima spline function.
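The conversions described above can be sketched in Python as follows. The example interprets the normalization step as multiplying dC_{n}/dlogD_{p} by the channel width of 1/32 decade, which is an assumption about how the calculation was carried out; the function names and input values are hypothetical, and the constants are those stated in the text.

```python
import math

CHANNELS_PER_DECADE = 32        # size channels per decade of the particle sizer
CM3_PER_L = 1000.0              # unit conversion, cm^3 to l
SAMPLE_FLOW_L_PER_MIN = 1.0     # sample flow rate
COUGHS_PER_INTERVAL = 10        # coughs assumed per sampling interval
INTERVAL_MIN = 0.5              # 30-s sampling interval
DEHYDRATION_RATIO = 0.3         # dehydrated diameter / hydrated diameter


def particles_per_min(dcn_dlogdp: float) -> float:
    """Emission rate in one size channel from dCn/dlogDp (particles/cm^3)."""
    channel_concentration = dcn_dlogdp / CHANNELS_PER_DECADE
    return channel_concentration * CM3_PER_L * SAMPLE_FLOW_L_PER_MIN


def particles_per_cough(dcn_dlogdp: float) -> float:
    """Particles per cough, assuming 10 coughs in the 30-s interval."""
    return particles_per_min(dcn_dlogdp) * INTERVAL_MIN / COUGHS_PER_INTERVAL


def aerodynamic_diameter(hydrated_um: float, density: float = 0.997,
                         reference_density: float = 1.0) -> float:
    """Equilibrium aerodynamic diameter (um) from the hydrated diameter (um)."""
    dehydrated_um = DEHYDRATION_RATIO * hydrated_um
    return dehydrated_um * math.sqrt(density / reference_density)


print(particles_per_min(3.2), particles_per_cough(3.2), aerodynamic_diameter(100.0))
```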
Shedding virions via respiratory droplets and aerosols
To model the respiratory shedding rate across particle size, rVL estimates and the hydrated diameters of particles expelled by a respiratory activity were input into Equation (10), and the output was then multiplied by the rate profile of the activity (talking, singing, breathing or coughing). To assess the relative contribution of aerosols and droplets to mediating respiratory viral shedding for a given respiratory activity, we calculated the proportion of the cumulative hydrated volumetric rate contributed by buoyant aerosols (d_{a} ≤ 10 μm), long-range aerosols (10 μm < d_{a} ≤ 50 μm), short-range aerosols (50 μm < d_{a} ≤ 100 μm) and droplets (d_{a} > 100 μm) for that respiratory activity. Since the Poisson mean was proportional to cumulative volumetric rate, this estimate of the relative contribution of aerosols and droplets to respiratory viral shedding was consistent among viruses and cps in the model.
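The partition of volumetric rate by size class can be sketched as below. The example takes droplets to be particles with d_{a} > 100 μm so that the four classes are contiguous, and it uses a small hypothetical rate profile rather than the literature distributions; the function names are also hypothetical.

```python
import math


def size_class(d_a_um: float) -> str:
    """Classify a particle by its equilibrium aerodynamic diameter (um)."""
    if d_a_um <= 10.0:
        return "buoyant aerosols"
    if d_a_um <= 50.0:
        return "long-range aerosols"
    if d_a_um <= 100.0:
        return "short-range aerosols"
    return "droplets"


def volumetric_proportions(profile):
    """Share of the cumulative hydrated volumetric rate in each size class.

    `profile` holds (hydrated diameter in um, aerodynamic diameter in um,
    emission rate in particles/min) for each discrete particle size.
    """
    totals = {}
    for hydrated_um, d_a_um, rate in profile:
        volume_rate = rate * math.pi / 6.0 * hydrated_um ** 3  # um^3/min
        label = size_class(d_a_um)
        totals[label] = totals.get(label, 0.0) + volume_rate
    grand_total = sum(totals.values())
    return {label: value / grand_total for label, value in totals.items()}


# Hypothetical rate profile, for illustration only.
profile = [(10.0, 3.0, 100.0), (100.0, 30.0, 1.0), (400.0, 120.0, 1.0)]
proportions = volumetric_proportions(profile)
for label, share in proportions.items():
    print(f"{label}: {100 * share:.2f}%")
```

The example shows why a few large particles can dominate the shed volume even when small aerosols are far more numerous.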
To determine the total respiratory shedding rate for a given respiratory activity across cp, we determined the cumulative hydrated volumetric rate (by summing the hydrated volumetric rates across particle sizes for that respiratory activity) of particle atomization and input it into Equation (10). Using rVLs and their variances as determined by the Weibull quantile functions, we then calculated the Poisson means and their 95% CIs at the different cps.
To assess the influence of heterogeneity in rVL on individual infectiousness, we first considered transmission of A(H1N1)pdm09 via aerosols (Cowling et al., 2013). The 50% human infectious dose (HID_{50}) of aerosolized A(H1N1)pdm09 was taken to be 1–3 virions (Fabian et al., 2008). To determine the expected time required for an A(H1N1)pdm09 case to shed one virion via aerosols, we took the reciprocal of the Poisson means and their 95% CIs at the different cps of the estimated shedding rates. The expected time required for a COVID-19 case to shed one virion via aerosols or one virion via droplets or aerosols was determined in the same manner.
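The reciprocal relationship between shedding rate and the expected time to shed one virion can be illustrated as follows. The cumulative hydrated volumetric rate used here is a hypothetical placeholder, rVL is assumed to be supplied in log_{10} copies/ml, and the density and conversion factors are combined into a single dimensionless factor of 0.997; these are assumptions of the example rather than values from the model outputs.

```python
def shedding_rate(log10_rvl: float, volumetric_rate_ml_per_min: float,
                  viability: float = 0.001, density_factor: float = 0.997) -> float:
    """Poisson mean shedding rate (viable virions/min) for an activity."""
    return density_factor * viability * 10.0 ** log10_rvl * volumetric_rate_ml_per_min


def expected_minutes_per_virion(rate_per_min: float) -> float:
    """Expected time (min) to shed one virion: reciprocal of the Poisson mean."""
    return 1.0 / rate_per_min


# Hypothetical cumulative hydrated volumetric rate of aerosols (ml/min).
AEROSOL_VOLUME_RATE = 1e-6

for log10_rvl in (6.0, 8.0, 10.0):
    rate = shedding_rate(log10_rvl, AEROSOL_VOLUME_RATE)
    minutes = expected_minutes_per_virion(rate)
    print(f"rVL {log10_rvl:.0f}: {rate:.3g} virions/min, {minutes:.3g} min per virion")
```

Each 1 log_{10} increase in rVL shortens the expected time to shed one virion 10-fold, which is how heterogeneity in rVL translates into heterogeneity in individual infectiousness in this framework.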
Data availability
The systematic dataset and model outputs from this study were uploaded to Zenodo (https://zenodo.org/record/4658971). The code generated during this study is available at GitHub (https://github.com/paulzchen/sars2heterogeneity; Chen, 2020; copy archived at swh:1:rev:06649ccfb6e92918b439332314ebf330abfa3d16). The systematic review protocol was prospectively registered on PROSPERO (registration number, CRD42020204637).
Appendix 1
Data availability

Zenodo. Heterogeneity in transmissibility and shedding SARS-CoV-2 via droplets and aerosols. https://doi.org/10.5281/zenodo/4658971
References

Association of initial viral load in severe acute respiratory syndrome coronavirus 2 (SARSCoV2) Patients with outcome and symptomsThe American Journal of Pathology 190:1881–1887.https://doi.org/10.1016/j.ajpath.2020.07.001

Presymptomatic SARSCoV2 infections and transmission in a skilled nursing facilityNew England Journal of Medicine 382:2081–2090.https://doi.org/10.1056/NEJMoa2008457

Kinetics of influenza A virus infection in humansJournal of Virology 80:7590–7599.https://doi.org/10.1128/JVI.0162305

SARSCoV2 viral load in the upper respiratory tract of children and adults with early acute COVID19Clinical Infectious Diseases 6:ciaa1157.https://doi.org/10.1093/cid/ciaa1157

Nasopharyngeal shedding of severe acute respiratory syndromeassociated coronavirus is associated with genetic polymorphismsClinical Infectious Diseases 42:1561–1569.https://doi.org/10.1086/503843

Viral replication in the nasopharynx is associated with diarrhea in patients with severe acute respiratory syndromeClinical Infectious Diseases 38:467–475.https://doi.org/10.1086/382681

Viral load distribution in SARS outbreakEmerging Infectious Diseases 11:1882–1886.https://doi.org/10.3201/eid1112.040949

Comparative epidemiology of pandemic and seasonal influenza A in householdsNew England Journal of Medicine 362:2175–2184.https://doi.org/10.1056/NEJMoa0911530

Aerosol transmission is an important mode of influenza A virus spreadNature Communications 4:1935.https://doi.org/10.1038/ncomms2922

An interactive webbased dashboard to track COVID19 in real timeThe Lancet Infectious Diseases 20:533–534.https://doi.org/10.1016/S14733099(20)301201

Serial interval of COVID19 among publicly reported confirmed casesEmerging Infectious Diseases 26:1341–1343.https://doi.org/10.3201/eid2606.200357

Pandemic A(H1N1)2009 influenza virus detection by real time RTPCR: is viral quantification useful?Clinical Microbiology and Infection 16:317–321.https://doi.org/10.1111/j.14690691.2010.03169.x

Clinical characteristics of coronavirus disease 2019 in ChinaNew England Journal of Medicine 382:1708–1720.https://doi.org/10.1056/NEJMoa2002032

High SARSCoV2 attack rate following exposure at a choir practice  Skagit county, Washington, march 2020MMWR. Morbidity and Mortality Weekly Report 69:606–610.https://doi.org/10.15585/mmwr.mm6919e6

Sequential analysis of viral load in a neonate and her mother infected with severe acute respiratory syndrome coronavirus 2Clinical Infectious Diseases 71:2236–2239.https://doi.org/10.1093/cid/ciaa447

Viral RNA load in mildly symptomatic and asymptomatic children with COVID19, Seoul, South KoreaEmerging Infectious Diseases 26:2497–2499.https://doi.org/10.3201/eid2610.202449

RTPCR for SARSCoV2: quantitative versus qualitativeThe Lancet Infectious Diseases 21:165.https://doi.org/10.1016/S14733099(20)304242

BookCochrane Handbook for Systematic Reviews of Interventions, Cochrane Book Series (2nd ed)Chichester, UK: John Wiley & Sons.

Viral loads in clinical specimens and SARS manifestationsEmerging Infectious Diseases 10:1550–1557.https://doi.org/10.3201/eid1009.040058

Viral shedding and transmission potential of asymptomatic and paucisymptomatic influenza virus infections in the communityClinical Infectious Diseases 64:736–742.https://doi.org/10.1093/cid/ciw841

Comparison of SARSCoV2 detection in nasopharyngeal swab and salivaJournal of Infection 81:e145–e147.https://doi.org/10.1016/j.jinf.2020.05.071

Viable SARSCoV2 in various specimens from COVID19 patientsClinical Microbiology and Infection 26:1520–1524.https://doi.org/10.1016/j.cmi.2020.07.020

Modality of human expired aerosol size distributionsJournal of Aerosol Science 42:839–851.https://doi.org/10.1016/j.jaerosci.2011.07.009

The mechanism of breath aerosol formationJournal of Aerosol Medicine and Pulmonary Drug Delivery 22:229–237.https://doi.org/10.1089/jamp.2008.0720

Methods of sample size calculation in descriptive retrospective burden of illness studiesBMC Medical Research Methodology 19:9.https://doi.org/10.1186/s1287401806579

Virus shedding and environmental deposition of novel A (H1N1) pandemic influenza virus: interim findingsHealth Technology Assessment 14:237–354.https://doi.org/10.3310/hta1446004

CultureCompetent SARSCoV2 in nasopharynx of symptomatic neonates, children, and adolescentsEmerging Infectious Diseases 26:2494–2497.https://doi.org/10.3201/eid2610.202403

Viral shedding and clinical illness in naturally acquired influenza virus infectionsThe Journal of Infectious Diseases 201:1509–1516.https://doi.org/10.1086/652241

Viral load at diagnosis and influenza A H1N1 (2009) disease severity in children. Influenza and Other Respiratory Viruses 6:e89–e92. https://doi.org/10.1111/j.17502659.2012.00383.x

Viable SARSCoV2 in the air of a hospital room with COVID19 patients. International Journal of Infectious Diseases 100:476–482. https://doi.org/10.1016/j.ijid.2020.09.025

Comparison of pandemic (H1N1) 2009 and seasonal influenza viral loads, Singapore. Emerging Infectious Diseases 17:287–290. https://doi.org/10.3201/eid1702.100282

Correlation of pandemic (H1N1) 2009 viral load with disease severity and prolonged viral shedding in children. Emerging Infectious Diseases 16:1265–1272. https://doi.org/10.3201/eid1608.091918

Early transmission dynamics in Wuhan, China, of novel CoronavirusInfected pneumonia. New England Journal of Medicine 382:1199–1207. https://doi.org/10.1056/NEJMoa2001316

Longitudinal study of influenza molecular viral shedding in Hutterite communities. Journal of Infectious Diseases 206:1078–1084. https://doi.org/10.1093/infdis/jis450

COVID19 outbreak associated with air conditioning in restaurant, Guangzhou, China, 2020. Emerging Infectious Diseases 26:1628–1631. https://doi.org/10.3201/eid2607.200764

SARSCoV2 infection in children. New England Journal of Medicine 382:1663–1665. https://doi.org/10.1056/NEJMc2005073

COVID19 patients in earlier stages exhaled millions of SARSCoV2 per hour. Clinical Infectious Diseases 28:ciaa1283. https://doi.org/10.1093/cid/ciaa1283

Duration of viral shedding in hospitalized patients infected with pandemic H1N1. BMC Infectious Diseases 11:140. https://doi.org/10.1186/1471233411140

Hydroxychloroquine for early treatment of adults with mild Covid19: a randomizedcontrolled trial. Clinical Infectious Diseases 16:ciaa1009. https://doi.org/10.1093/cid/ciaa1009

Book: Chapter 7: systematic reviews of etiology and risk. In: Aromataris E, Munn Z, editors. Joanna Briggs Institute Reviewer's Manual. The Joanna Briggs Institute. pp. 5–19.

Methodological guidance for systematic reviews of observational epidemiological studies reporting prevalence and cumulative incidence data. International Journal of EvidenceBased Healthcare 13:147–153. https://doi.org/10.1097/XEB.0000000000000054

Methodological quality of case series studies: an introduction to the JBI critical appraisal tool. JBI Database of Systematic Reviews and Implementation Reports 23:e00099. https://doi.org/10.11124/JBISRIRD1900099

Viral load of SARSCoV2 in clinical samples. The Lancet Infectious Diseases 20:411–412. https://doi.org/10.1016/S14733099(20)301134

Coronavirus disease outbreak in call center, South Korea. Emerging Infectious Diseases 26:1666–1670. https://doi.org/10.3201/eid2608.201274

SARSCoV2 can be detected in urine, blood, anal swabs, and oropharyngeal swabs specimens. Journal of Medical Virology 92:1676–1680. https://doi.org/10.1002/jmv.25936

SARSCoV2 virus culture and subgenomic RNA for respiratory specimens from patients with mild coronavirus disease. Emerging Infectious Diseases 26:2701–2704. https://doi.org/10.3201/eid2611.203219

Estimating variability in the transmission of severe acute respiratory syndrome to household contacts in Hong Kong, China. American Journal of Epidemiology 166:355–363. https://doi.org/10.1093/aje/kwm082

Early diagnosis of SARS coronavirus infection by real time RTPCR. Journal of Clinical Virology 28:233–238. https://doi.org/10.1016/j.jcv.2003.08.004

Influenza A(H1N1)pdm09 infection and viral load analysis in patients with different clinical presentations. Memórias Do Instituto Oswaldo Cruz 115:e200009. https://doi.org/10.1590/007402760200009

Airborne transmission of communicable infection — The Elusive Pathway. New England Journal of Medicine 350:1710–1712. https://doi.org/10.1056/NEJMp048051

Distribution of Transmission Potential During Nonsevere COVID19 Illness. Clinical Infectious Diseases 71:2927–2932. https://doi.org/10.1093/cid/ciaa886

Shedding and transmission of novel influenza virus A/H1N1 infection in households — Germany, 2009. American Journal of Epidemiology 171:1157–1164. https://doi.org/10.1093/aje/kwq071

Coronavirus disease 2019 in children — United States, February 12–April 2, 2020. MMWR. Morbidity and Mortality Weekly Report 69:422–426. https://doi.org/10.15585/mmwr.mm6914e4

Viral load in patients infected with pandemic H1N1 2009 influenza A virus. Journal of Medical Virology 82:1–7. https://doi.org/10.1002/jmv.21664

Delayed clearance of viral load and marked cytokine activation in severe cases of pandemic H1N1 2009 influenza virus infection. Clinical Infectious Diseases: An Official Publication of the Infectious Diseases Society of America 50:850–859. https://doi.org/10.1086/650581

Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range. BMC Medical Research Methodology 14:135. https://doi.org/10.1186/1471228814135

Viral load and rapid diagnostic test in patients with pandemic H1N1 2009. Pediatrics International 53:1097–1099. https://doi.org/10.1111/j.1442200X.2011.03489.x

Presymptomatic Transmission of SARSCoV2 — Singapore, January 23–March 16, 2020. MMWR. Morbidity and Mortality Weekly Report 69:411–415. https://doi.org/10.15585/mmwr.mm6914e1

Enhanced spread of expiratory droplets by turbulence in a cough jet. Building and Environment 93:86–96. https://doi.org/10.1016/j.buildenv.2015.06.018

Severity of pandemic H1N1 2009 influenza virus infection may not be directly correlated with initial viral load in upper respiratory tract. Influenza and Other Respiratory Viruses 6:367–373. https://doi.org/10.1111/j.17502659.2011.00300.x

Saliva or Nasopharyngeal Swab Specimens for Detection of SARSCoV2. New England Journal of Medicine 383:1283–1286. https://doi.org/10.1056/NEJMc2016359

Evidence of Airborne Transmission of the Severe Acute Respiratory Syndrome Virus. New England Journal of Medicine 350:1731–1739. https://doi.org/10.1056/NEJMoa032867

Evaluating Transmission Heterogeneity and SuperSpreading Event of COVID19 in a Metropolis of China. International Journal of Environmental Research and Public Health 17:3705. https://doi.org/10.3390/ijerph17103705

SARSCoV2 Viral Load in Upper Respiratory Specimens of Infected Patients. New England Journal of Medicine 382:1177–1179. https://doi.org/10.1056/NEJMc2001737
Decision letter

Jos WM van der Meer, Senior and Reviewing Editor; Radboud University Medical Centre, Netherlands

Lucie Vermeulen, Reviewer
In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.
Acceptance summary:
The authors performed a systematic literature review and metaanalysis to develop a dataset of respiratory viral loads (rVLs) for three viruses (SARSCoV2, SARSCoV1 and influenza A(H1N1)pdm09). Furthermore, the kinetics of viral shedding over time during a respiratory infection are studied, and a model is developed for infectiousness via shedding of viable virus in aerosols and droplets. The study appears robust and comprehensive, and the results are valuable and contribute to the scientific knowledge in this field.
Decision letter after peer review:
Thank you for submitting your article "Heterogeneity in transmissibility and shedding SARSCoV2 via droplets and aerosols" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Senior/Reviewing Editor. The following individual involved in review of your submission has agreed to reveal their identity: Lucie Vermeulen (Reviewer #1).
The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.
Summary:
This is a very interesting study on an important subject. It uses a combination of approaches (systematic review, metaregression, mathematical modelling) to study the association between the variability of respiratory viral loads (rVL) and heterogeneity in transmission rates. The authors argue that variability of rVL is a main determinant of the high heterogeneity of transmission rates and translate the rVL distribution into transmission probabilities for different transmission modes (droplets, aerosols; breathing, speaking, singing). These conclusions are interesting and potentially relevant for public health. The combination of rVL data from >60 studies represents an impressive amount of work which may also be useful for future research.
The paper does not stop at a descriptive summary of these data but uses several modelling approaches (metaregression, "translation" into transmission probabilities, dynamical modelling) to interpret these data. The evidence provided by these analyses is more tentative than presented by the authors in this version of the manuscript.
Essential revisions:
1. The metaregression is based on only three viral species and hence it is unclear how generalisable the observed association between rVL and transmission heterogeneity is. In the best case the data show that the three virus species exhibit significantly different rVLvariation, which coincides with their different k values at the epidemiological level. However, this latter association is essentially based on only three data points (i.e. the three viral species). The current metaregression approach (applying a simple linear regression but essentially ignoring the fact that all studies stem from only three virus species; i.e., ignoring the hierarchical nature of the data) provides p values which strongly exaggerate the degree of evidence.
2. One major potential confounder is the very strong dependence of rVL on infection time. The authors consider this in the section "SARSCoV 2 kinetics during respiratory infection" where they show also a substantial variation of rVL across different strata of days from symptom onset (DFSO). However, it is unclear to what extent this affects the previous analyses (e.g., the metaregression models). More fundamentally, the crosssectional nature of the rVL data leads almost by necessity to an overestimation of the variability in the transmission potential of infections and in a very strong dependency of the rVL variation on the distribution of sampling time. Even a stratification on DFSO can only partly address these problems, firstly because DFSO is in most cases associated with a substantial uncertainty which in the case of a highly dynamic infection will translate into an even larger variation of rVLs. Moreover, even if DFSO were an exact measure of infection time, different infections do not need to be synchronous (e.g. because of stochastic effects or variation of the processes corresponding to the model parameters across individuals) such that different individuals will have their peak rVL at different time points. Taken together this implies that a reliable measure of heterogeneity would require determining something like the area under the curve of rVL (which of course is very challenging).
3. This limitation also strongly affects the practical, public health relevance of the findings. For example, the authors state that "Our analyses suggest that heterogeneity in rVL may be generally associated with overdispersion for viral respiratory infections. In this case, rVL distribution can serve as an early correlate for transmission patterns, including superspreading, during outbreaks of novel respiratory viruses, providing insight for disease control before largescale epidemiological studies empirically characterise k ". This potential application assumes that the timing of rVL measurements is known early in a pandemic and that it can be controlled for, which requires a detailed knowledge of the within patient dynamics of the virus. I would assume that achieving this knowledge would take at least as long as estimating k in epidemiological studies. Thus it may be more appropriate to think about the two approaches in characterising heterogeneity as complementary (in the context of epidemiological triangulation; i.e. both approaches having their weaknesses and biases but which can be overcome in a joint consideration; generally, I think that attempting to achieve such a triangulation is one of the main strengths of the present study, despite its limitations).
4. The variation of rVL might also be strongly driven by the sampling method/procedure (even the same method will give very different results across healthcare workers), which implies the same problems as (2) – i.e. overestimation of rVL variability and potential confounding.
5. The authors note that "Talking, singing and coughing expelled virions at greater proportions via droplets (80.6–86.0%) than aerosols (14.0–19.4%)." It should be noted that although more virions are expelled via droplets than aerosols according to the findings of this study, exposure to droplets and aerosols is not equal and this could affect the probability of transmission via these routes. For example, if social distancing and masking are observed then it is possible that larger droplets are more easily captured by masks or fall on the ground quickly and do not reach a susceptible individual, while aerosols do. Furthermore, smaller droplets and aerosols can penetrate more deeply into the lungs. It is as of yet unclear whether this would influence the probability of becoming (more severely) infected. This may also differ per virus. A discussion on these issues is relevant.
6. The authors also note that their results "support aerosol spread as a transmission mode for SARSCoV2, including for conditional superspreading by highly infectious cases. However, with short durations of stay in wellventilated areas, the exposure risk for aerosols, including longrange and buoyant ones, remains correlated with proximity to infectious cases."
7. A methodological note on the modelling that may affect the results (but likely do not impact the conclusions strongly) is the following.
The authors take a value of 0.1% for the fraction of SARSCoV2 RNA copies that represents viable virus (parameter 𝛾). This value is quite uncertain. More literature (not yet peerreviewed) exists on the fraction of SARSCoV2 RNA copies that is infectious virus, providing different values. Van Kampen et al. (2020) only found a cytopathic effect on Vero cells if the swab sample from patients contained more than 7 log_{10} RNA copies/mL. Fears et al. (2020) find an average of 0.003 (range 0.0008 – 0.02) CCID50/RNA copy. However, Lednicky et al. (2020) sampled SARSCoV2containing aerosols in a room with COVID19 patients with air samplers using a water vapor condensation mechanism and as such collect virus particles without damaging them, and found an average of 0.6 CCID50/RNA copy, much higher. The model is likely sensitive to this parameter, and this could influence the result. These issues should be taken into account in the revision.
8. Another methodological point is that more datasets in literature are available on the emission rates and size distribution of particles during breathing, speaking, coughing etc., than are currently used to base the model on. Schijven et al. (2020) compared seven datasets and found that they sometimes differ quite strongly. For example, the median volume of aerosol particles produced for coughing differs over two orders of magnitude when comparing two data sets. It is unclear what this difference represents, it might have to do with the sampling method. Furthermore, observed size distributions also differ in literature, with peak particle emission rates at different sizes. The authors should be aware that the choice of particle emission data for their study can impact their results strongly, and including discussion on the choice of data set and the implications on the results is warranted.
9. Taking 0.5 for the evaporation diameter factor is probably too large. Liu et al. (2017) find a value around one third for respiratory droplets from coughing, and the recent study on the evaporation of saliva droplets and aerosols by Lieber et al. (2021) finds a value of 0.2 (for a range of temperature of 20–29 degrees C, and range of relative humidity of 6 – 65%). This likely matters for the result, as the difference between 0.5 and 0.2 leads to a factor ~15 change in droplet volume.
10. The kinetic model assumes that viral replication is controlled by the reduction of target cells over the course of the infection, but it neglects the effect of the immune system. This seems a rather strong assumption. What is the evidence for this in the case of SARSCoV2? Also, it would be good if the authors could comment on the identifiability of the model parameters especially the high uncertainty of the halflife of SARSCoV2 in the respiratory tract (2.6266hours) suggests that this might be a problem.
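The factor of ~15 quoted in comment 9 follows from the cubic dependence of volume on diameter. The sketch below is illustrative only: it assumes spherical particles, takes the evaporation diameter factor to be the dehydrated diameter divided by the hydrated diameter, and uses hypothetical helper names (sphere_volume, hydrated_volume_ratio) that do not come from the manuscript.

```python
import math


def sphere_volume(diameter: float) -> float:
    """Volume of a sphere with the given diameter (spherical particles assumed)."""
    return math.pi * diameter ** 3 / 6.0


def hydrated_volume_ratio(factor_old: float, factor_new: float,
                          dry_diameter: float = 1.0) -> float:
    """Ratio of hydrated volumes implied by two evaporation diameter factors.

    The factor is taken here as dehydrated diameter / hydrated diameter, so the
    hydrated diameter is dry_diameter / factor. This definition is an
    assumption made for illustration.
    """
    v_old = sphere_volume(dry_diameter / factor_old)
    v_new = sphere_volume(dry_diameter / factor_new)
    return v_new / v_old


# Same dehydrated particle, two candidate factors from comment 9.
ratio = hydrated_volume_ratio(0.5, 0.2)
print(f"hydrated volume ratio: {ratio:.1f}")  # equals (0.5 / 0.2) ** 3
```

The ratio does not depend on the dehydrated diameter chosen, so the same scaling applies across the whole particle size distribution.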
Additional points:
11. Line 453–454: "log 𝑘 = 𝑎(𝑆𝐷) + 𝑏, where 𝑎 is the slope for association and 𝑏 is the intercept". This appears to be a strange notation for this equation, isn't "a*SD + b" more logical?
12. Line 544–545: "To estimate the average duration of shedding, we extrapolated the model to 0 log_{10} copies/ml postsymptom onset." If the tail of the model is very long, it might take a very long time to reach 0 log_{10}. Is this the case? And if yes, is perhaps a 95% decrease compared to the maximum a better measure for the duration of shedding?
13. Line 560: Is this unit correct? "𝜌 is the material density of the respiratory particle (997 g/m3)" Shouldn't this density be in kg/m3?
14. Line 563 – 571: The estimate for 𝛾 for SARSCoV2 could turn out to be higher, if Lednicky et al. (2020) are to be believed. In any case, this warrants some further discussion in the paper, as the results are probably quite sensitive to this parameter!
15. Line 609 – 610: "𝜌 is the material density of the respiratory particle (taken to be 1 g/cm3 based on the composition of dehydrated respiratory particles)". What is the reference for this statement? Zhang et al. (2011) find densities between 1.25 and 1.62 g/mL (Table 2). It seems logical that the drying process increases the density somewhat as compared to the density of water, as for instance the heavier salts do not evaporate.
16. Figure 2: this figure is somewhat unclear, maybe I do not understand it correctly. Why does one study have multiple standard deviations? And as there are only three values of k, regression seems to be an odd choice. A comparison between different groups of k seems more appropriate?
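The unit question raised in comments 13 and 15 can be checked with a one-line conversion. The worked equation below is a sketch using only standard SI relations (1 kg = 10^3 g, 1 m^3 = 10^6 cm^3); it shows that 997 kg/m^3 agrees with the value of roughly 1 g/cm^3 quoted in comment 15, whereas 997 g/m^3 would be about a thousand times too small.

```latex
\begin{align*}
997\ \mathrm{kg/m^3} &= 997 \times \frac{10^{3}\ \mathrm{g}}{10^{6}\ \mathrm{cm^3}}
                      = 0.997\ \mathrm{g/cm^3} \approx 1\ \mathrm{g/cm^3}, \\
997\ \mathrm{g/m^3}  &= \frac{997\ \mathrm{g}}{10^{6}\ \mathrm{cm^3}}
                      = 9.97 \times 10^{-4}\ \mathrm{g/cm^3}.
\end{align*}
```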
References:
Fears AC, Klimstra WB, Duprex P, Hartman A, Weaver SC, Plante KS, et al. 2020. Comparative dynamic aerosol efficiencies of three emergent coronaviruses and the unusual persistence of sarscov2 in aerosol suspensions. medRxiv:2020.2004.2013.20063784.
Lednicky JA, Lauzardo M, Fan ZH, Jutla AS, Tilly TB, Gangwar M, et al. 2020. Viable sarscov2 in the air of a hospital room with covid19 patients. medRxiv:2020.2008.2003.20167395.
Lieber, C., Melekidis, S., Koch, R., Bauer, H.J., 2021. Insights into the evaporation characteristics of saliva droplets and aerosols: Levitation experiments and numerical modeling. Journal of Aerosol Science 154, 105760.
Liu, L., Wei, J., Li, Y., Ooi, A., 2017. Evaporation and dispersion of respiratory droplets from coughing. Indoor Air 27, 179–190.
Schijven, J.F., Vermeulen, L.C., Swart, A., Meijer, A., Duizer, E., de RodaHusman, A.M., 2020. Exposure assessment for airborne transmission of SARSCoV2 via breathing, speaking, coughing and sneezing. medRxiv, 2020.2007.2002.20144832.
van Kampen JJA, van de Vijver DAMC, Fraaij PLA, Haagmans BL, Lamers MM, Okba N, et al. 2020. Shedding of infectious virus in hospitalized patients with coronavirus disease2019 (covid19): Duration and key determinants. medRxiv: 2020.2006.2008. 20125310.
Zhang, T., 2011. Study on Surface Tension and Evaporation Rate of Human Saliva, Saline, and Water Droplets. West Virginia University.
https://doi.org/10.7554/eLife.65774.sa1
Author response
Essential revisions:
1. The metaregression is based on only three viral species and hence it is unclear how generalisable the observed association between rVL and transmission heterogeneity is. In the best case the data show that the three virus species exhibit significantly different rVLvariation, which coincides with their different k values at the epidemiological level. However, this latter association is essentially based on only three data points (i.e. the three viral species). The current metaregression approach (applying a simple linear regression but essentially ignoring the fact that all studies stem from only three virus species; i.e., ignoring the hierarchical nature of the data) provides p values which strongly exaggerate the degree of evidence.
We have revised the reporting of the described analysis to address the points raised by the reviewers. We have edited the metaregression result to specifically mention it refers to analysis of these three viruses. In addition, rather than report the specific P value from our metaregression (which was orders of magnitude lower), we now report “P < 0.001”, which is revised in Figure 2 as well as the text. The revised text says, “metaregression (Figure 2) showed a strong, negative association between k and heterogeneity in rVL for these three viruses (metaregression slope t test: P < 0.001, Pearson’s r = −0.73).” (page 6, lines 123–125). These revisions have softened the described degree of evidence and report the results of our metaregression as specifically based on the three viruses (SARSCoV2, SARSCoV1 and A(H1N1)pdm09).
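To make the quantities in this exchange concrete, the sketch below fits the linear form log k = a·SD + b by ordinary least squares and reports the slope, Pearson's r and the slope t statistic. It is a minimal illustration, not the authors' analysis code: the function name fit_line, the base-10 logarithm, and every number in the placeholder dataset are assumptions invented for the example and are not values from the study.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = a*x + b.

    Returns (slope, intercept, pearson_r, t_statistic), where the t statistic
    tests slope = 0 with n - 2 degrees of freedom.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    t_stat = r * math.sqrt((n - 2) / (1.0 - r ** 2))
    return slope, intercept, r, t_stat


# Placeholder data (NOT from the study): per-study SDs of rVL for three
# hypothetical viruses, each virus carrying a single dispersion parameter k.
study_sd = [1.1, 1.3, 1.2, 1.8, 2.0, 1.9, 2.0, 2.2, 2.1]
virus_k = [5.0, 5.0, 5.0, 0.5, 0.5, 0.5, 0.1, 0.1, 0.1]
log_k = [math.log10(k) for k in virus_k]

slope, intercept, r, t_stat = fit_line(study_sd, log_k)
print(f"slope={slope:.2f}, r={r:.2f}, t={t_stat:.2f}, df={len(study_sd) - 2}")
```

Note that the naive degrees of freedom count every study as independent, although k takes only three distinct values. This is the hierarchical structure the reviewers refer to in comment 1.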
2. One major potential confounder is the very strong dependence of rVL on infection time. The authors consider this in the section "SARSCoV 2 kinetics during respiratory infection" where they show also a substantial variation of rVL across different strata of days from symptom onset (DFSO). However, it is unclear to what extent this affects the previous analyses (e.g., the metaregression models). More fundamentally, the crosssectional nature of the rVL data leads almost by necessity to an overestimation of the variability in the transmission potential of infections and in a very strong dependency of the rVL variation on the distribution of sampling time. Even a stratification on DFSO can only partly address these problems, firstly because DFSO is in most cases associated with a substantial uncertainty which in the case of a highly dynamic infection will translate into an even larger variation of rVLs. Moreover, even if DFSO were an exact measure of infection time, different infections do not need to be synchronous (e.g. because of stochastic effects or variation of the processes corresponding to the model parameters across individuals) such that different individuals will have their peak rVL at different time points. Taken together this implies that a reliable measure of heterogeneity would require determining something like the area under the curve of rVL (which of course is very challenging).
Our systematic review and data collection were designed specifically to develop a dataset of rVLs from the estimated infectious period of each virus. As noted by the reviewers, the crosssectional nature of data such as virus quantitation means that some DFSO may be more prevalent than others in the dataset of each study. However, when considered together, the studies identified by the systematic review tended to span the estimated infectious period for each of the three viruses. Since the metaanalyses (before the “SARSCoV2 kinetics during the respiratory infection” section) are conducted cumulatively on the identified studies, they “approximated the expected rVL when encountering a COVID19, SARS or A(H1N1)pdm09 case during the infectious period” (page 7, lines 138–139). For the metaregression, each study provides an estimate of the SD of the rVLs in its analyzed period. As described above, the metaregression should therefore approximate the SD of the rVLs throughout the infectious period. As a limitation of the study, we note that fewer studies had data from the presymptomatic period: “The systematic search found a limited number of studies reporting quantitative specimen measurements from the presymptomatic period, meaning these estimates may be sensitive to sampling bias.” (page 15, lines 399–401).
As the reviewers mentioned, there is uncertainty in DFSO. For example, DFSO is often based on patient recall, which is uncertain. Even if DFSO were certain, cases can have their peak rVLs on different DFSO, as the reviewers also mentioned. To discuss the former point as a potential confounder, we have written, “Specimen measurements (based on instrumentation, calibration, procedures and reagents) are not standardized and, as DFSO is typically based on patient recall, there is also inherent uncertainty in these values. While the above procedures (including only quantitative measurements after extraction as an inclusion criterion, considering assay detection limits and correcting for specimen dilution) have considered many of these factors, nonstandardization remains an inherent limitation in the variability of specimen measurements.” (page 23, lines 588–593). For the latter point, if rVL peaks on different DFSO for different individuals, then this variability should influence, and be shown in, our aggregate analyses and estimates. Stratification by DFSO also accounts for this. For example, when encountering a case on 5 DFSO, we wish to know the expected distribution of rVLs that the case may have. Our analyses consider the distribution of rVLs on that DFSO, and thus this approach assesses rVL variability on each DFSO across cases.
3. This limitation also strongly affects the practical, public health relevance of the findings. For example, the authors state that "Our analyses suggest that heterogeneity in rVL may be generally associated with overdispersion for viral respiratory infections. In this case, rVL distribution can serve as an early correlate for transmission patterns, including superspreading, during outbreaks of novel respiratory viruses, providing insight for disease control before largescale epidemiological studies empirically characterise k ". This potential application assumes that the timing of rVL measurements is known early in a pandemic and that it can be controlled for, which requires a detailed knowledge of the within patient dynamics of the virus. I would assume that achieving this knowledge would take at least as long as estimating k in epidemiological studies. Thus it may be more appropriate to think about the two approaches in characterising heterogeneity as complementary (in the context of epidemiological triangulation; i.e. both approaches having their weaknesses and biases but which can be overcome in a joint consideration; generally, I think that attempting to achieve such a triangulation is one of the main strengths of the present study, despite its limitations).
We have revised our manuscript to discuss the updated view, for which we express appreciation to the reviewers. We no longer describe rVL as being characterized before epidemiological studies empirically characterize k, but now write: “In this case, rVL distribution can serve as an early correlate for transmission patterns, including superspreading, during outbreaks of novel respiratory viruses. When considered jointly with contacttracing studies, this provides epidemiological triangulation on k: heterogeneity in rVL indirectly estimates k via an association, whereas contact tracing empirically characterizes transmission chains to estimate k but is limited by incomplete or incorrect recall of contact events by cases.” (page 14, lines 361–366).
4. The variation of rVL might also be strongly driven by the sampling method/procedure (even the same method will give very different results across healthcare workers), which implies the same problems as (2) – i.e. overestimation of rVL variability and potential confounding.
The inclusion and exclusion criteria used in our systematic review protocol mean that we considered a specific set of respiratory specimens from which virus quantitation was performed in a similar manner. We also accounted for variation between studies and specimen types in their processing (e.g., volume of viral transport media used) in the estimation of rVLs. Further assessment of these potential risks of bias was included through the JBI critical appraisal checklist.
We agree with the reviewers that there is increased variation in rVLs based on sampling, which is an intrinsic measurement error associated with virus quantitation. We do note this limitation with “Specimen measurements (based on instrumentation, calibration, procedures and reagents) are not standardized and, as DFSO is typically based on patient recall, there is also inherent uncertainty in these values. While the above procedures (including only quantitative measurements after extraction as an inclusion criterion, considering assay detection limits and correcting for specimen dilution) have considered many of these factors, nonstandardization remains an inherent limitation in the variability of specimen measurements” (page 23, lines 588–593).
Each viral load for each virus was measured in a comparable manner. Thus, the intrinsic measurement error should be similar for each virus. In other words, each virus will have a similar increase in variability from measurement error. Thus, this error should not drive the observed differences in rVL heterogeneity, and the difference in heterogeneity in rVL between viruses should arise from the viruses. We have also revised the manuscript to describe this point, “While there is intrinsic measurement error in virus quantitation, based on the systematic review protocol and study design (as described above), this error should similarly increase heterogeneity in rVL for each virus, and the difference in heterogeneity in rVL between viruses should arise from the viruses.” (page 24, lines 614–617).
5. The authors note that "Talking, singing and coughing expelled virions at greater proportions via droplets (80.6–86.0%) than aerosols (14.0–19.4%)." It should be noted that although more virions are expelled via droplets than aerosols according to the findings of this study, exposure to droplets and aerosols is not equal and this could affect the probability of transmission via these routes. For example, if social distancing and masking are observed then it is possible that larger droplets are more easily captured by masks or fall on the ground quickly and do not reach a susceptible individual, while aerosols do. Furthermore, smaller droplets and aerosols can penetrate more deeply into the lungs. It is as of yet unclear whether this would influence the probability of becoming (more severely) infected. This may also differ per virus. A discussion on these issues is relevant.
6. The authors also note that their results "support aerosol spread as a transmission mode for SARSCoV2, including for conditional superspreading by highly infectious cases. However, with short durations of stay in wellventilated areas, the exposure risk for aerosols, including longrange and buoyant ones, remains correlated with proximity to infectious cases."
As these two comments are related, we address them together. Based on the revised evaporation diameter factor from Reviewer Comment 9, our model now estimates that “Talking, singing and coughing expelled virions at comparable proportions via droplets (55.6–59.4%) and aerosols (40.6–44.4%)” (page 11, lines 276–278).
The discussion mentioned by the reviewers, however, remains relevant. We have provided a more nuanced revision on abating droplet and aerosol transmission, as exposure to both droplets and aerosols is correlated with proximity. We have revised the manuscript based on reviewer comments 5 and 6, with this nuanced perspective in mind, and it now reads: “While talking, singing and coughing, our models indicate that SARSCoV2 is carried by droplets (55.6–59.4% of shed virions), shortrange aerosols (30.1–34.9%), longrange aerosols (7.7–8.3%) and buoyant aerosols (0.01–6.5%). Transmission, however, requires exposure. For direct transmission, droplets tend to be sprayed ballistically onto susceptible tissue, whereas aerosols can be inhaled, may penetrate more deeply into the lungs and more easily facilitate superspreading events. However, with short durations of stay in wellventilated areas, the exposure risk for both droplets and aerosols remains correlated with proximity to infectious cases (Liu, Li, et al., 2017; Prather et al., 2020). Strategies to abate infection should limit crowd numbers and duration of stay while reinforcing distancing, lowvoice amplitudes and widespread mask usage; wellventilated settings can be recognized as lowerrisk venues.” (pages 16–17, lines 430–453).
7. A methodological note on the modelling that may affect the results (but likely does not strongly impact the conclusions) is the following.
The authors take a value of 0.1% for the fraction of SARS-CoV-2 RNA copies that represents viable virus (parameter γ). This value is quite uncertain. More literature (not yet peer-reviewed) exists on the fraction of SARS-CoV-2 RNA copies that is infectious virus, providing different values. Van Kampen et al. (2020) only found a cytopathic effect on Vero cells if the swab sample from patients contained more than 7 log_{10} RNA copies/mL. Fears et al. (2020) find an average of 0.003 (range 0.0008 – 0.02) CCID50/RNA copy. However, Lednicky et al. (2020) sampled SARS-CoV-2-containing aerosols in a room with COVID-19 patients using air samplers with a water vapor condensation mechanism, which collect virus particles without damaging them, and found a much higher average of 0.6 CCID50/RNA copy. The model is likely sensitive to this parameter, and this could influence the result. These issues should be taken into account in the revision.
A 𝛾 of 0.1% is “equivalent to one viable copy in 3 log_{10} copies/ml for rVL or, after dilution in transport media, roughly one in 4 log_{10} copies/ml for specimen concentration” (page 29, lines 724–725). For both influenza A and SARS-CoV-2, culture-positive virus has been found in the respiratory specimens considered in our study down to 4 log_{10} copies/ml, “including from pediatric (L'Huillier et al., 2020) and asymptomatic (Arons et al., 2020) [COVID-19] cases” (page 29, lines 729–730). Thus, we took 𝛾 to be “0.1% as a population-level estimate” (page 29, line 722) in our model. As the reviewer mentioned, despite the discussion above, there is still uncertainty in the estimate of 𝛾. We have revised the limitations section in our manuscript to convey this, which is quoted in the response to the next reviewer comment (#8), as we combined it with the added discussion from those reviewer comments.
As research on this topic develops and methodological advances continue to improve the characterization of 𝛾, we hope that the models introduced in our study can be used as an initial basis towards even more accurate estimations of the rate, and extent, to which respiratory activities shed infectious virus via droplets and aerosols.
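To make the scale of 𝛾 concrete, the following minimal sketch converts a viability proportion into an expected number of viable copies at a given rVL. The function name, the dictionary and the example rVL are illustrative assumptions and do not come from the manuscript. The 𝛾 values are the 0.1% used in the model and the averages cited by the reviewers; reading a CCID50/RNA copy ratio directly as a viability proportion is a simplification made here for illustration only.

```python
def viable_copies_per_ml(log10_rvl: float, gamma: float) -> float:
    """Expected viable copies/ml for an rVL given in log10 copies/ml."""
    return gamma * 10.0 ** log10_rvl


# gamma = 0.1% is the population-level value used in the model; the other two
# are the averages cited by the reviewers (read here, as an assumption, as
# viability proportions).
GAMMAS = {
    "model, 0.1%": 1e-3,
    "Fears et al. (2020), 0.003": 3e-3,
    "Lednicky et al. (2020), 0.6": 0.6,
}

if __name__ == "__main__":
    for label, gamma in GAMMAS.items():
        # Example rVL of 3 log10 copies/ml, chosen for illustration only.
        print(f"{label}: {viable_copies_per_ml(3.0, gamma):.3g} viable copies/ml")
```

With 𝛾 = 0.1%, an rVL of 3 log_{10} copies/ml corresponds to about one viable copy per ml, which matches the equivalence quoted above. The larger 𝛾 values scale that count proportionally, which is why the shedding estimates are sensitive to this parameter.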
8. Another methodological point is that more datasets are available in the literature on the emission rates and size distribution of particles during breathing, speaking, coughing etc., than are currently used as the basis for the model. Schijven et al. (2020) compared seven datasets and found that they sometimes differ quite strongly. For example, the median volume of aerosol particles produced by coughing differs by over two orders of magnitude when comparing two datasets. It is unclear what this difference represents; it might have to do with the sampling method. Furthermore, observed size distributions also differ in the literature, with peak particle emission rates at different sizes. The authors should be aware that the choice of particle emission data for their study can impact their results strongly, and including discussion on the choice of dataset and the implications for the results is warranted.
We have added discussion on this topic and its implications (please note that we combined this with added discussion from the above reviewer comment, #7): “Furthermore, this study considered population-level estimates of the infectious periods, viability proportions and profiles for respiratory particles, which omit individual or environmental variation. […] Cumulatively, these sources of variation may influence the shedding model estimates, further increasing heterogeneity in individual infectiousness.” (page 16, lines 417–429).
9. Taking 0.5 for the evaporation diameter factor is probably too large. Liu et al. (2017) find a value around one third for respiratory droplets from coughing, and the recent study on the evaporation of saliva droplets and aerosols by Lieber et al. (2021) finds a value of 0.2 (for a temperature range of 20–29 degrees C and a relative humidity range of 6–65%). This likely matters for the result, as the difference between 0.5 and 0.2 leads to a factor ~15 change in droplet volume.
We have adjusted the evaporation diameter factor to “0.3”, as an approximation between the factors of 0.5 (Johnson et al., 2011), 0.32 (Liu et al., 2017) and 0.2 (Lieber et al., 2021), as described in the Methods: “In our models, we took the diameter of dehydrated respiratory particles to be 0.3 times the initial size when atomized in the respiratory tract (Johnson et al., 2011; Lieber, Melekidis, Koch, and Bauer, 2021; Liu, Wei, Li, and Ooi, 2017).” (page 31, lines 767–769).
This revision led to updates in Figure 5, Figure 5—Figure supplement 1 and Figure 5—Figure supplement 2, as well as in the reporting of the model results based on particle size (see edits throughout page 10, lines 207–216).
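The reviewers' point about volume follows from cubic scaling, which the minimal sketch below illustrates for spherical particles. It assumes a fixed dehydrated diameter and asks how the inferred hydrated volume changes when the evaporation diameter factor is revised. The function names are hypothetical, and exactly how the shedding model applies the factor is not reproduced here.

```python
def hydrated_to_dehydrated_volume_ratio(evaporation_factor: float) -> float:
    """Ratio of hydrated to dehydrated volume for a spherical particle.

    Volume scales with diameter cubed, and the dehydrated diameter is
    evaporation_factor times the hydrated diameter.
    """
    return (1.0 / evaporation_factor) ** 3


def volume_change_between(factor_old: float, factor_new: float) -> float:
    """Fold change in inferred hydrated volume when the factor is revised."""
    return (hydrated_to_dehydrated_volume_ratio(factor_new)
            / hydrated_to_dehydrated_volume_ratio(factor_old))


# Reviewer's comparison: 0.5 versus 0.2, i.e. (0.5 / 0.2) ** 3, about 15.6.
reviewer_case = volume_change_between(0.5, 0.2)
# Revised factor: 0.5 versus 0.3, i.e. (0.5 / 0.3) ** 3, about 4.6.
revised_case = volume_change_between(0.5, 0.3)
```

Under these assumptions, moving from 0.5 to 0.2 changes the inferred volume roughly 15.6-fold, consistent with the "factor ~15" in the comment, whereas the adopted value of 0.3 corresponds to a change of roughly 4.6-fold.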
10. The kinetic model assumes that viral replication is controlled by the reduction of target cells over the course of the infection, but it neglects the effect of the immune system. This seems a rather strong assumption. What is the evidence for this in the case of SARS-CoV-2? Also, it would be good if the authors could comment on the identifiability of the model parameters; in particular, the high uncertainty of the half-life of SARS-CoV-2 in the respiratory tract (2.62–66 hours) suggests that this might be a problem.
Our kinetic model (Equations 4–6, page 26, lines 671–673) is dynamical and involves one independent variable (DFSO represented by t); one explicit dependent variable (rVL represented by V); two implicit dependent variables (the number of uninfected target cells represented by T, and the number of productively infected cells represented by I); and four fitted parameters: β (infection rate constant), p (cellular shedding rate of virus), c (clearance rate of virus) and δ (clearance rate of infected epithelial cells).
Thus, this model does account for the effect of the immune system. It accounts for viral RNA cleared by any mechanism (via c) and infected cells cleared by any mechanism (via δ), including by the immune system for both, as described by the system of equations (Equations 4–6, page 26, lines 671–673).
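The structure described above can be sketched numerically. Equations 4–6 are not reproduced in this response, so the sketch assumes the standard target-cell-limited form that matches the variables and parameters listed: dT/dt = −βTV, dI/dt = βTV − δI and dV/dt = pI − cV. It uses plain forward Euler steps rather than the fitting procedure of the study. Apart from c = 3.30 days^{-1}, all parameter values and initial conditions are hypothetical and chosen only to give a visible rise and decline, and time runs from an arbitrary start rather than being aligned to DFSO.

```python
def simulate_target_cell_model(beta, p, c, delta, T0, I0, V0, days, dt=0.001):
    """Integrate the assumed target-cell-limited model with forward Euler steps.

    T: uninfected target cells, I: productively infected cells, V: viral load.
    Rates are per day; dt is the step size in days.
    """
    T, I, V = T0, I0, V0
    times, targets, viral_loads = [0.0], [T], [V]
    for step in range(1, int(round(days / dt)) + 1):
        dT = -beta * T * V             # infection depletes target cells
        dI = beta * T * V - delta * I  # infected cells cleared at rate delta
        dV = p * I - c * V             # virus shed at rate p, cleared at rate c
        T, I, V = T + dt * dT, I + dt * dI, V + dt * dV
        times.append(step * dt)
        targets.append(T)
        viral_loads.append(V)
    return times, targets, viral_loads


# Hypothetical parameters (only c is taken from the fitted estimate).
times, targets, viral_loads = simulate_target_cell_model(
    beta=1e-6, p=10.0, c=3.30, delta=1.0,
    T0=1e7, I0=0.0, V0=1.0, days=30.0,
)
peak_index = max(range(len(viral_loads)), key=viral_loads.__getitem__)
peak_day = times[peak_index]
```

In this form, any process that removes virus contributes to c and any process that removes infected cells contributes to δ, which is the sense in which immune clearance is represented without being modelled as a separate compartment.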
We have revised the manuscript's description of the SARS-CoV-2 half-life because, as the reviewers mentioned, the term was confusing as described in the body. Virus half-life in the respiratory tract was calculated by $t_{1/2} = \ln 2 / c$ (page 27, line 678). As the fitted estimate of c was 3.30 (0.25–6.34) days^{-1}, the lower bound of the 95% CI is below 1. The half-life equation has c in the denominator, meaning t_{1/2} is particularly sensitive to values of c below 1. To reduce confusion over this, we have reported the half-life in days, consistent with the time unit of c: “the half-life of SARS-CoV-2 RNA before clearance from the respiratory tract was 0.21 (0.11–2.75) days” (page 8, lines 177–178) and both c and t_{1/2} are included in Figure 4—Figure Supplement 4. We have included discussion on this: “The estimated half-life of SARS-CoV-2 RNA has a skewed 95% CI (Figure 4—Figure Supplement 4). As $c$ is in the denominator of the equation for half-life ($t_{1/2} = \ln 2 / c$), $t_{1/2}$ is sensitive to c below one, which is the case for its lower 95% CI (Figure 4—Figure Supplement 4) and the source of the skew.” (page 28, lines 697–700).
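The skew can be reproduced with a few lines of arithmetic. The minimal sketch below applies the half-life formula to the rounded point estimate and 95% CI bounds of c quoted above. The function and variable names are illustrative, and because the inputs are rounded, the results may differ slightly from the values reported in the manuscript.

```python
import math


def half_life_days(clearance_rate_per_day: float) -> float:
    """Half-life in days for first-order clearance: t_1/2 = ln(2) / c."""
    return math.log(2) / clearance_rate_per_day


c_estimate, c_lower, c_upper = 3.30, 0.25, 6.34  # days^-1, as quoted above

half_life = half_life_days(c_estimate)  # about 0.21 days
longest = half_life_days(c_lower)       # lower bound of c gives the longest half-life
shortest = half_life_days(c_upper)      # upper bound of c gives the shortest half-life

# The interval is asymmetric around the point estimate because of the 1/c form.
upper_gap = longest - half_life
lower_gap = half_life - shortest
```

The gap above the point estimate is far larger than the gap below it, since a value of c below 1 inflates ln 2 / c. Multiplying by 24 converts the bounds to hours, giving a range of roughly 2.6 to 66 hours.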
Additional points:
11. Lines 453–454: "log k = a(SD) + b, where a is the slope for association and b is the intercept". This appears to be a strange notation for this equation, isn't "a*SD + b" more logical?
This equation has been modified to "a*SD + b" (page 23, line 596).
12. Lines 544–545: "To estimate the average duration of shedding, we extrapolated the model to 0 log_{10} copies/ml post-symptom onset." If the tail of the model is very long, it might take a very long time to reach 0 log_{10}. Is this the case? And if yes, is perhaps a 95% decrease compared to the maximum a better measure for the duration of shedding?
The tail of the model was nearly linear (Figure 4D), so this extrapolation should not be affected by an extended tail.
13. Line 560: Is this unit correct? "ρ is the material density of the respiratory particle (997 g/m3)" Shouldn't this density be in kg/m3?
We thank the reviewers for spotting this typo. The material density has been corrected to 997 kg/m^{3} (page 29, line 719).
14. Line 563 – 571: The estimate for γ for SARS-CoV-2 could turn out to be higher, if Lednicky et al. (2020) are to be believed. In any case, this warrants some further discussion in the paper, as the results are probably quite sensitive to this parameter!
We have addressed this in our response to reviewer comment 7, as both questions focus on the estimate for 𝛾. Please refer to that response.
15. Line 609 – 610: "𝜌 is the material density of the respiratory particle (taken to be 1 g/cm3 based on the composition of dehydrated respiratory particles)". What is the reference for this statement? Zhang et al. (2011) find densities between 1.25 and 1.62 g/mL (Table 2). It seems logical that the drying process increases the density somewhat as compared to the density of water, as for instance the heavier salts do not evaporate.
Because the respiratory particles are atomized from the extracellular fluid in the respiratory tract, we considered the particles to be hydrated during atomization. The model approximated particle material density based on the density of water while hydrated.
16. Figure 2: this figure is somewhat unclear, maybe I do not understand it correctly. Why does one study have multiple standard deviations? And as there are only three values of k, regression seems to be an odd choice. A comparison between different groups of k seems more appropriate?
Each dot in this figure represents a separate study. We took the value of k for each virus based on our pooled estimates of k from the literature. Then, we used the SD in each study as an estimate of the heterogeneity in rVL for the respective virus during the estimated infectious period. Thus, we performed a meta-regression based on the SD from each study identified in our systematic review (Figure 2—Figure supplement 1) or on the SD from each study that was assessed as having low risk of bias (Figure 2).
https://doi.org/10.7554/eLife.65774.sa2
Article and author information
Author details
Funding
Natural Sciences and Engineering Research Council of Canada (Vanier Scholarship 608544)
 Paul Z Chen
Canadian Institutes of Health Research (Canadian COVID-19 Rapid Research Fund OV4170360)
 David N Fisman
Natural Sciences and Engineering Research Council of Canada (Senior Industrial Research Chair)
 Frank X Gu
Toronto COVID-19 Action Fund
 Frank X Gu
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We thank T Alba (Toronto) for discussion on statistical methods. We thank J Jimenez (Colorado) for discussion on the characteristics of aerosols and droplets. We thank E Lavezzo and A Chrisanti (Padova) and A Wyllie, A Ko and N Grubaugh (Yale) for responses to data inquiries. PZC was supported by the NSERC Vanier Scholarship (608544). DNF was supported by the Canadian Institutes of Health Research (Canadian COVID-19 Rapid Research Fund, OV4170360). FXG was supported by the NSERC Senior Industrial Research Chair program, NSERC Discovery Grant program and the Toronto COVID-19 Action Fund.
Senior and Reviewing Editor
 Jos WM van der Meer, Radboud University Medical Centre, Netherlands
Reviewer
 Lucie Vermeulen
Publication history
 Received: December 15, 2020
 Accepted: April 15, 2021
 Accepted Manuscript published: April 16, 2021 (version 1)
 Version of Record published: May 21, 2021 (version 2)
Copyright
© 2021, Chen et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.