A modelling approach to estimate the transmissibility of SARS-CoV-2 during periods of high, low, and zero case incidence
Abstract
Against a backdrop of widespread global transmission, a number of countries have successfully brought large outbreaks of COVID-19 under control and maintained near-elimination status. A key element of epidemic response is the tracking of disease transmissibility in near real-time. During major outbreaks, the effective reproduction number can be estimated from a time-series of case, hospitalisation or death counts. In low or zero incidence settings, knowing the potential for the virus to spread is a response priority. Absence of case data means that this potential cannot be estimated directly. We present a semi-mechanistic modelling framework that draws on time-series of both behavioural data and case data (when disease activity is present) to estimate the transmissibility of SARS-CoV-2 from periods of high to low (or zero) case incidence, with a coherent transition in interpretation across the changing epidemiological situations. Of note, during periods of epidemic activity, our analysis recovers the effective reproduction number, while during periods of low (or zero) case incidence, it provides an estimate of transmission risk. This enables tracking and planning of progress towards the control of large outbreaks, maintenance of virus suppression, and monitoring the risk posed by re-introduction of the virus. We demonstrate the value of our methods by reporting on their use throughout 2020 in Australia, where they have become a central component of the national COVID-19 response.
Editor's evaluation
This paper is interesting, timely, and important because it presents a way to understand the transmission potential of a virus even when there are very few local cases. This has a high public health communication and preparedness value. The paper is clearly written, and the results fit with the known epidemiology of outbreaks that occurred in Australia in 2020. The results are convincing and likely to be of broad interest within and outside the field of epidemiological modelling.
https://doi.org/10.7554/eLife.78089.sa0
Introduction
The first 12 months of the COVID-19 pandemic led to overwhelmed health systems and enormous social disruption across the globe. Government strategy and public responses to COVID-19 were highly variable. Prior to the global circulation of the Delta and Omicron variants, a small number of jurisdictions had achieved extended periods of elimination through 2020 and into early 2021, including Taiwan, Thailand, New Zealand and Australia (Rajatanavin et al., 2021; Summers et al., 2020; Golding et al., 2021). Meanwhile, parts of Europe and the Americas were heavily impacted by COVID-19 (The Lancet, 2020; Remuzzi and Remuzzi, 2020), with health systems overwhelmed by multiple explosive outbreaks. The Delta and Omicron variants, with their increased transmissibility, have led to epidemic activity, now likely to be sustained, in a number of previously low-prevalence settings (Australian Government Department of Health, 2021; Tiberghien, 2021; New Zealand Government Ministry of Health, 2022; Mallapaty, 2022).
A key element of epidemic response is the close monitoring of the speed of disease spread, via estimation of the effective reproduction number (${R}_{\mathrm{eff}}$), defined as the average number of new infections caused by an infected individual over their entire infectious period, in the presence of public health interventions and where no assumption of 100% susceptibility is made. Methods are well established for near real-time estimation of this critical value and estimates are routinely assessed by decision-makers through the course of an epidemic (Gostic et al., 2020; Cori et al., 2013; Thompson et al., 2019; Abbott et al., 2020b; White and Pagano, 2008). When ${R}_{\mathrm{eff}}$ is above 1, the epidemic is estimated to be growing. If control measures, population immunity, or other factors can bring ${R}_{\mathrm{eff}}$ below 1, then the epidemic is estimated to be in decline. Accurate and timely estimation of ${R}_{\mathrm{eff}}$, and the timely adjustment of interventions in response to it, is critical for the sustainable and successful management of COVID-19.
However, when incident cases are driven to very low levels (as occurred in Australia following the first wave of COVID-19 from February to April 2020), established methods for estimating ${R}_{\mathrm{eff}}$ are no longer informative. Yet the virus remained a threat, as evidenced by multiple instances of re-introduction and subsequent additional waves in Australia throughout 2020 and early 2021. Independent of whether local (and temporary) elimination was achieved, knowledge of SARS-CoV-2’s potential transmissibility and the risk of resurgence was a response priority.
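To illustrate why case-based estimators stop being informative at zero incidence, the following is a minimal sketch of a renewal-equation ratio estimator. It is a simplified point estimate in the spirit of the established methods cited above, not a reimplementation of any of them; the function name `naive_reff` and the generation-interval weights `GI` are assumptions made for illustration only.

```python
from typing import List, Optional


def naive_reff(incidence: List[float], gi_weights: List[float], t: int) -> Optional[float]:
    """Ratio of cases on day t to the infection pressure from earlier cases.

    gi_weights[s - 1] is the assumed probability that the generation
    interval is s days. Returns None when earlier cases exert no infection
    pressure, in which case the ratio is undefined.
    """
    pressure = sum(
        w * incidence[t - s]
        for s, w in enumerate(gi_weights, start=1)
        if t - s >= 0
    )
    if pressure == 0:
        return None  # no recent cases: the data say nothing about transmissibility
    return incidence[t] / pressure


# Hypothetical generation-interval weights (sum to 1), for illustration only.
GI = [0.1, 0.2, 0.3, 0.2, 0.1, 0.1]

growing = [1, 2, 3, 5, 8, 12, 18, 27]  # made-up daily case counts
no_cases = [0] * 8                     # a period of zero incidence

print(naive_reff(growing, GI, 7))   # a value above 1: the outbreak is growing
print(naive_reff(no_cases, GI, 7))  # None: undefined without cases
```

With no recent cases the denominator vanishes, so an estimator of this kind returns nothing useful even though the virus could still spread if re-introduced.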
Here, by making use of social and behavioural data, we demonstrate a novel method for estimating the ability of the virus to spread in a population, which is informative even when case incidence is very low or zero. In the absence of cases, our method estimates the ability of the virus, if it were present, to spread in a population, which we define as the ‘transmission potential’. We use the word ‘potential’ to distinguish this quantity from an estimate of actual transmission. When the virus is present, our method recovers the effective reproduction number and, additionally, the deviation between ${R}_{\mathrm{eff}}$ and the transmission potential. Applying this method in real-time provides an estimate of the transmissibility of SARS-CoV-2 in periods of high, low, and even zero case incidence, with a coherent and seamless transition in interpretation across the changing epidemiological situations.
Our innovative methods and workflows address a major challenge in epidemic situational awareness: assessing epidemic risk when case numbers are driven to low levels or (temporary) elimination is achieved, as frequently occurred in Australia through 2020–21 (Golding et al., 2021). We have routinely applied this method to all Australian states and territories and reported the outputs to peak national decision-making committees on a weekly basis since early May 2020. The concepts of transmission potential and ${R}_{\mathrm{eff}}$ have been incorporated into key instruments of government, including Australia’s national COVID-19 surveillance plan (Department of Health and Aged Care, 2021a). The transmission potential and ${R}_{\mathrm{eff}}$ are reported to the public through the Australian Government’s weekly Common Operating Picture (Department of Health and Aged Care, 2021b). While not addressed in this article, our methods have recently been updated to include consideration of variants of concern (Golding et al., 2021) and the effect of vaccination (Department of Health and Aged Care, 2021b) on reducing the ability of the virus to spread in the population.
Model
In this section, we describe a novel method for estimating temporal trends in the transmissibility of SARS-CoV-2.
The effective reproduction number is the product of the number of contacts an infectious person makes and the per-contact probability of infection (the latter of which depends on the nature and duration of contact) (Anderson and May, 1991). Both quantities are impacted by changes in behaviour, which are in turn driven by changes in policy, such as stay-at-home orders and hand-washing advice, and the population’s perception and evaluation of risk, among other factors. The new techniques introduced here provide an estimate for how observed changes in rates of social contact and the per-contact probability of infection translate to changes in the ability of the virus to spread.
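The product structure described above means that proportional changes in contact rate and in per-contact infection probability multiply. The following is a minimal sketch of that arithmetic only; the function name, the baseline value and the two ratios are hypothetical, and the sketch is not the model used in this work.

```python
def transmission_scaling(contact_ratio: float, per_contact_ratio: float) -> float:
    """Multiplicative change in the reproduction number when the contact rate
    and the per-contact infection probability change relative to a baseline."""
    return contact_ratio * per_contact_ratio


baseline_r = 2.5  # hypothetical baseline reproduction number

# Hypothetical behaviour change: contacts fall to 40% of baseline and the
# per-contact probability of infection falls to 80% of baseline.
r_now = baseline_r * transmission_scaling(0.4, 0.8)

print(round(r_now, 2))  # 0.8
```

In this made-up example neither change alone brings the reproduction number below 1 (2.5 × 0.4 = 1.0 and 2.5 × 0.8 = 2.0), but their combination does.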
We estimate the time-varying ability of SARS-CoV-2 to spread in a population using a novel semi-mechanistic model informed by data on cases, population behaviours and health system effectiveness (see Materials and methods). We separately model transmission from locally acquired cases (local-to-local transmission) and from overseas acquired cases (import-to-local transmission). We model local-to-local transmission (${R}_{\mathrm{eff}}$) using two components (Figure 1): the average population-level trend in ${R}_{\mathrm{eff}}$ driven by interventions that primarily target transmission from local cases, specifically changes in physical distancing behaviour and case-targeted measures (Component 1, the ‘transmission potential’ or TP); and short-term fluctuations in ${R}_{\mathrm{eff}}$ to capture stochastic dynamics of transmission, such as clusters of cases and short periods of lower-than-expected transmission (Component 2, the ‘deviation’ between TP and ${R}_{\mathrm{eff}}$). During periods of low or zero transmission, TP provides an evaluation of the ability of the virus to spread, informing risk assessments and supporting public health planning and response (Doherty Institute, 2021).
To estimate Component 1, we use three submodels (Figure 1, labelled a, b and c). We distinguish between two types of physical distancing behaviour:
Macro-distancing, defined as the reduction in the average rate of non-household contacts, and assessed through weekly nationwide surveys of the daily number of non-household contacts; and
Precautionary micro-behaviour, defined as the reduction in transmission probability per non-household contact, and assessed through weekly nationwide surveys from which we estimate the proportion of the population reporting always keeping 1.5 m physical distance from non-household contacts. Note that for Australian reporting purposes, we used the term ‘micro-distancing behaviour’.
The modelling framework uses adherence to the 1.5 m rule as a proxy for all behaviours (other than those reducing the number of contacts) that may influence transmission, and so is intended to capture the use of masks, preference for outdoor gatherings, and hand hygiene, among other factors. The 1.5 m rule was a suitable proxy because it was consistent public health advice throughout the analysis period and time-series data were available to track adherence to this metric over time.
By synthesising data from these surveys and numerous population mobility data streams made available by technology company Google, we infer temporal trends in macro-distancing and precautionary micro-behaviour (sub-models a and b). Furthermore, using data on the number of days from symptom onset to case notification for cases, we estimate the proportion of cases that are detected (and thus advised to isolate) by each day post-infection. By quantifying the temporal change in the probability density for the time-to-detection (sub-model c), the model estimates how earlier isolation of cases (due to improvements in contact tracing, expanded access to testing, more inclusive case definitions, and other factors impacting detection rates) reduces the ability of SARS-CoV-2 to spread.
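The effect of earlier case detection can be sketched as the share of a case's onward transmission that remains once detected cases isolate. The sketch below assumes that isolation fully stops transmission from the day of detection and uses made-up distributions; the function name `surveillance_multiplier` and all input values are hypothetical and are not taken from sub-model c.

```python
from typing import Sequence


def surveillance_multiplier(gi_weights: Sequence[float],
                            detected_by_day: Sequence[float]) -> float:
    """Fraction of onward transmission that remains when detected cases isolate.

    gi_weights[d]      : assumed share of a case's transmission occurring on
                         day d post-infection (sums to 1)
    detected_by_day[d] : assumed cumulative proportion of cases detected by
                         day d post-infection
    """
    if len(gi_weights) != len(detected_by_day):
        raise ValueError("inputs must cover the same days")
    # Transmission on day d only occurs for cases not yet detected by day d.
    return sum(w * (1.0 - d) for w, d in zip(gi_weights, detected_by_day))


# Hypothetical timing of transmission over days 0..7 post-infection.
gi = [0.0, 0.05, 0.15, 0.25, 0.25, 0.15, 0.10, 0.05]

# Hypothetical cumulative detection curves: slow versus fast case finding.
slow = [0.0, 0.0, 0.0, 0.05, 0.1, 0.2, 0.4, 0.6]
fast = [0.0, 0.0, 0.1, 0.3, 0.5, 0.7, 0.85, 0.95]

print(surveillance_multiplier(gi, slow))
print(surveillance_multiplier(gi, fast))
```

Under these assumptions the faster detection curve leaves a smaller share of transmission, which is the qualitative effect that the time-to-detection sub-model is described as capturing.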
Transmission potential (Component 1) reflects the average potential for the virus to spread at the population level. During times of disease activity, Component 2 measures how transmission within the sub-populations that have the most active cases at a given point in time differs compared to that expected from the population-wide TP. The combination of Components 1 and 2 recovers the estimated ${R}_{\mathrm{eff}}$ (see Equation (10) in Materials and Methods), as per established methods (Cori et al., 2013; Thompson et al., 2019; Abbott et al., 2020b). When Component 2, the deviation between TP and ${R}_{\mathrm{eff}}$, is positively biased (${R}_{\mathrm{eff}}$ > TP), it may indicate that transmission is concentrated in populations with higher-than-average levels of mixing, such as healthcare workers or meat processing workers. If negatively biased (${R}_{\mathrm{eff}}$ < TP), it reflects suppressed transmission compared to expectation. This may be due to an effective public health response actively suppressing transmission (e.g. through test, trace, isolation and quarantine), or other factors such as local depletion of susceptible individuals, and/or the virus circulating in a sub-population with fewer-than-average social contacts.
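The relationship between the two components can be sketched as follows. Equation (10) is not reproduced in this section, so the sketch assumes that the deviation is additive on the log scale, which is consistent with the Results describing Component 2 as lying above, below, or around zero. The function names and the numbers are hypothetical.

```python
import math


def reff_from_components(tp: float, deviation: float) -> float:
    """Combine transmission potential with a log-scale deviation
    (assumed form: R_eff = TP * exp(deviation))."""
    return tp * math.exp(deviation)


def deviation_from(tp: float, reff: float) -> float:
    """Log-scale deviation implied by a pair of TP and R_eff values."""
    return math.log(reff / tp)


tp = 1.1  # hypothetical population-wide transmission potential

# Positive deviation: active cases transmit more than the population average.
print(reff_from_components(tp, 0.3) > tp)   # True

# Negative deviation: transmission among active cases is suppressed.
print(reff_from_components(tp, -0.3) < tp)  # True

# Zero deviation: R_eff coincides with TP.
print(reff_from_components(tp, 0.0) == tp)  # True
```

Under this assumed form, a sustained positive deviation corresponds to the ${R}_{\mathrm{eff}}$ > TP case described above, and a negative deviation to the ${R}_{\mathrm{eff}}$ < TP case.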
Results
To demonstrate the utility of our method for assessing epidemic activity and risk, we report on its application to Australian data on cases, population behaviour and health system effectiveness from the first 12 months of the COVID-19 pandemic. We focus on the period from early March 2020 to late January 2021, prior to the emergence of variants of concern in Australia (first Alpha, then Delta, and recently Omicron) and the vaccination roll-out (refer to our recent technical report for details on our approach to variants of concern; Golding et al., 2021). We describe our results in the context of the COVID-19 epidemiology and public health response in Australia during this period, noting that the methods were developed and applied during the pandemic and contributed to government response efforts. We report retrospective estimates (using data as of 24 January 2021 and our model as of September 2021). Where relevant, we also report estimates made at the time of analysis in 2020, which may differ as a result of updates to the case data and methodological improvements to our model over time, as well as minor statistical variation and smoothing.
Across its eight states and territories, Australia has managed a number of distinct phases of the pandemic: from an initial wave of importations (February–April 2020), to sustained periods of zero local case incidence (April–June 2020 and October–December 2020), to widespread community transmission (June–October 2020). Like elsewhere in the world, key interventions have included quarantine of overseas arrivals, restrictions on mobility and gathering sizes, advice on personal hygiene, and case-targeted interventions. The specific measures, and the level of control of SARS-CoV-2 transmission, have varied between states and over time, according to changing epidemiology and response objectives, among other factors. The model has proven informative across vastly different and rapidly changing phases of the pandemic.
To highlight these different epidemiological situations and the insights gained from our model-based analysis, we draw on exemplar events from the Australian epidemic when describing our results below. In Table 1, we summarise the key types of information provided by estimated quantities under different epidemiological situations. Further, in Figure 2—figure supplements 1–3, we provide time-series estimates of each metric and model sub-component from early March 2020 to late January 2021 for each Australian state and territory.
Initial wave of importations
Australia took an early and precautionary approach to managing the risk of importation of SARS-CoV-2. On 1 February 2020, when China was the only country reporting uncontained transmission, Australia restricted all travel from mainland China to Australia. Only Australian citizens and residents were permitted entry from mainland China. These individuals were advised to self-quarantine for 14 days from their date of arrival. From 20 March 2020, Australia closed its borders to all foreign nationals, and from 27 March, shifted to mandatory state-managed quarantine for returned citizens and residents, with weekly quotas on the number of arrivals. These policies remained in place at the time of writing.
During the first half of March 2020, that is, prior to the border closure, daily case incidence increased sharply. Although more than two-thirds of these cases had acquired their infection overseas, pockets of local transmission were reported in Australia’s largest cities of Sydney (New South Wales) and Melbourne (Victoria) (Australian Government Department of Health, 2020a; Figure 2A and E). From 16 March 2020, state governments progressively implemented, in rapid succession, a range of physical distancing measures to reduce and prevent community transmission. These measures were part of a coordinated national response strategy. By 31 March, Australians were strongly advised to leave their homes only for limited essential activities and public gatherings were limited to two people (known as ‘stay-at-home’ restrictions). Health authorities also advised individuals to keep 1.5 m distance from non-household members from mid-March (Price et al., 2020).
Through the second half of March 2020, we estimate that transmission potential across states and territories decreased substantially and rapidly from well above 1 to just below 1 (Figure 2C and G). This reflected a marked increase in macro-distancing and precautionary micro-behaviour (Figure 3B, C, F and G) and a decrease in time-to-case-detection (Figure 3D and H). Our method, with its ability to distinguish between import-to-local and local-to-local transmission, estimates that the local ${R}_{\mathrm{eff}}$ dropped below 1 on 22 March (upper confidence intervals) in both Victoria and New South Wales, prior to the activation of stay-at-home restrictions on 30 March (Figure 2B and F). Physical distancing measures were implemented proactively (prior to the establishment of widespread community transmission), suggesting that the effect of these measures, in combination with border measures and case-targeted interventions, led to the definitive control of a first epidemic wave.
Successful suppression, reopening of society
By early April 2020, local case incidence had been driven to very low levels in all Australian states and territories. Substantial numbers of infections continued to be detected in quarantined international arrivals. However, no breaches of quarantine of significant consequence were reported until late May in the state of Victoria (Lane et al., 2021).
Despite physical distancing measures remaining in place through April, levels of macro-distancing and precautionary micro-behaviour steadily waned following peak levels of adherence in the first week of April (Figure 3B, C, F, and G). This resulted in a steady increase in estimated transmission potential, although it remained below 1, suggesting that the establishment of community transmission was unlikely throughout this period (Figure 2C and G).
From May through to December 2020, the epidemiology of COVID-19 across Australia was characterised by sustained periods of zero case incidence and intermittent, localised outbreaks (with the exception of the state of Victoria, see below). With the gradual easing of restrictions from May, levels of macro-distancing and precautionary micro-behaviour continued to decrease. Accordingly, transmission potential steadily increased and by early June it had exceeded 1 in most states and territories (Figure 2), suggesting that conditions were suitable to sustain onward transmission if there were an undetected importation event or a breakdown in infection control for managed active cases or identified importations.
During the period from late June to mid-October 2020, Australia’s most populous state of New South Wales effectively controlled a series of localised outbreaks (the largest of which involved hundreds of cases). This was achieved during a period where society remained relatively open, though some restrictions on population movement and social gatherings were in place. For example, household and public gatherings were limited to 20 people. Throughout this period, as estimated at the time and now in this retrospective analysis, state-level transmission potential hovered just above 1 (Figure 2C), indicating that levels of population mixing were sufficient to allow escalation of epidemic activity in the general population in the absence of active public health measures to control outbreaks.
We estimate that ${R}_{\mathrm{eff}}$ oscillated around 1 throughout this period (Figure 2B). It increased to above 1 at the onset of each incursion and subsequently dropped below 1 as each cluster was contained, with no discernible change in state-level transmission potential (model Component 1) in response to each cluster. These oscillations (strong positive and then negative deviations from the transmission potential) are captured by model Component 2 and are clearly evident in the time-series (Figure 2D). Each of the positive deviations from the transmission potential is consistent with heightened transmission among clusters of cases. Each of the subsequent negative deviations from the transmission potential indicates that the number of offspring from each case of the cluster was fewer than expected given the transmission potential and estimated levels of population mixing. We interpret (and interpreted at the time) this as likely reflecting a strong public health response (i.e. early detection and isolation of cases associated with the cluster as a result of contact tracing and quarantine). This was consistent with weekly reporting on the performance of contact tracing systems in New South Wales, with 100% of cases interviewed within 24 hr of notification and 100% of close contacts, identified by the case, contacted by public health officials within 48 hr of case notification, from early July through to late October (Department of Health and Aged Care, 2021b).
In mid-November 2020, a sustained period of very low case incidence (i.e. zero local cases on all but 10 days in the previous 6 months) in the state of South Australia was disrupted by a breach of mandatory quarantine which led to a cluster of more than 20 cases. At the time, society was largely open with only minimal social restrictions in place. We estimate transmission potential to have been 1.71 [95% CrI: 1.47–2.01] as of 14 November in the retrospective analysis (cf. 1.27 [95% CrI: 1.14–1.41] at the time) (Figure 2), suggesting that the risk of establishing an epidemic was reasonably high (relative to the chance of stochastic extinction), and that once established, transmission would be rapid. Supported by our real-time analysis, authorities imposed a strict 3-day lockdown across the entire state to enable contact tracers to comprehensively identify and quarantine primary and secondary contacts of cases. Estimated transmission potential declined dramatically around the time of activation of restrictions, and quickly rebounded when restrictions were eased three days later (Figure 2). The incursion was rapidly contained, as a result of changes to transmission potential (driven by social restrictions), an effective public health response (i.e. active case finding and management) and plausibly some favourable stochastic fluctuations, with South Australia returning to zero local case incidence from mid-December 2020.
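The trade-off between epidemic establishment and stochastic extinction mentioned above can be illustrated with a simple branching-process calculation. This is a sketch under a strong assumption that is not part of the analysis reported here: each case is taken to generate a Poisson-distributed number of secondary cases with mean equal to the transmission potential. The function name `extinction_probability` is hypothetical.

```python
import math


def extinction_probability(r: float, tol: float = 1e-12, max_iter: int = 10_000) -> float:
    """Probability that a chain started by one case dies out, assuming a
    Poisson offspring distribution with mean r.

    Solves q = exp(r * (q - 1)) by fixed-point iteration from q = 0, which
    converges to the smallest non-negative root.
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = math.exp(r * (q - 1.0))
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


# Point estimates of transmission potential quoted in the text for
# South Australia on 14 November 2020 (retrospective and real-time).
for tp in (1.71, 1.27):
    q = extinction_probability(tp)
    print(f"TP = {tp}: P(extinction) = {q:.2f}, P(establishment) = {1 - q:.2f}")
```

Under these assumptions, a single undetected case would establish ongoing transmission with probability of roughly 0.7 at a transmission potential of 1.71 and roughly 0.4 at 1.27. A more dispersed offspring distribution with the same mean would give a higher extinction probability, so these figures are illustrative only.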
Resurgence of epidemic activity in one large state
In late May 2020, a breach of mandatory quarantine seeded a second epidemic wave in Australia’s second most populous state of Victoria (approximately 6.7 million people). At the time that the epidemic was seeded, many first wave restrictions were still in place. For example, gatherings within households, outdoor spaces, and dining venues were capped at 20 people, and working from home was strongly advised. Transmission potential is estimated to have been 1.07 [95% CrI: 0.88–1.22] at 25 May 2020, suggesting that levels of physical distancing may have been insufficient to prevent escalation of epidemic activity in the general population (Figure 2G).
Furthermore, from the earliest stages of the epidemic, our model estimated a strong positive deviation from the transmission potential (Component 2 positively biased, Figure 2H), corresponding to an estimate for the ${R}_{\mathrm{eff}}$ > 1 (95% chance of ${R}_{\mathrm{eff}}$ exceeding 1 by 1 June 2020 in the retrospective analysis) reflecting heightened transmission. Demographic and socio-economic assessments of the outbreak (Australian Institute of Health and Welfare, 2021; Australian Government Department of Health, 2020b; Wild et al., 2021) showed that early affected areas had higher than average household sizes and a large proportion of essential and casualised workers who were unable to work from home. Thus our model findings concurred with the observed epidemiological characteristics (that the virus was predominantly spreading in sub-sections of the population with higher-than-average rates of social contact) and supported public health decision making at the time.
By 1 July 2020, there were more than 600 active cases and 129 newly reported cases with an estimated ${R}_{\mathrm{eff}}$ of 1.33 [95% CrI: 1.25–1.41] (Figure 2F). From 9 July 2020, stay-at-home policies (denoted Stage 3 restrictions) were reinstated across metropolitan Melbourne. Despite these policies, the epidemic continued to grow through July, reaching a peak of 446 daily cases by date of symptom onset on 24 July 2020. More severe stay-at-home restrictions (denoted Stage 4) were enacted in metropolitan Melbourne on 2 August, including a night-time curfew, restrictions on movement more than 5 km from a person’s residence, and stricter definitions of essential workers and businesses including enforcement of a work permit requirement.
During the periods of Stage 3 and 4 restrictions, we observed strong increases in macro-distancing and precautionary micro-behaviour, which were reflected in a decrease in state-level transmission potential from around 1 in early June to a minimum of 0.72 [95% CrI: 0.62–0.86] on 23 August 2020 (Figure 2G), two weeks after the implementation of Stage 4 restrictions.
Following an initial sharp rise in the ${R}_{\mathrm{eff}}$ from well below 1 in mid-May to a peak of 1.61 [95% CrI: 1.46–1.79] at 14 June 2020, the ${R}_{\mathrm{eff}}$ steadily decreased over the next eight weeks (Figure 2F). We estimate that ${R}_{\mathrm{eff}}$ fell below the critical threshold of 1 on 25 July, approximately one week prior to the implementation of Stage 4 restrictions. With Stage 4 restrictions in place, ${R}_{\mathrm{eff}}$ settled between 0.8 and 1 for another eight weeks.
While both transmission potential and ${R}_{\mathrm{eff}}$ declined over this period, we estimated ${R}_{\mathrm{eff}}$ to be consistently higher than transmission potential (i.e. there was a strong positive deviation in Component 2) reflecting persistent transmission in sub-sections of the population with higher-than-average rates of social contact. This was consistent with other epidemiological assessments of the outbreak which suggested that transmission was concentrated in populations that were less able to physically distance (e.g. healthcare workers, residents of aged care facilities, meat workers, public housing residents) (Australian Institute of Health and Welfare, 2021; Australian Government Department of Health, 2020b; Wild et al., 2021). A substantial proportion of cases were in healthcare workers and aged care facilities, particularly during the tail of the epidemic. Each of these settings required specifically targeted interventions to bring transmission under control, which were distinct from the impacts of population-level measures. This may partly explain why transmission persisted for many weeks when severe stay-at-home restrictions were active, since these measures primarily target transmission in the broader community and are logically less effective at controlling transmission in essential workplaces and institutional settings.
Definitive control of the epidemic was achieved by early November 2020, when zero local case incidence was reported in Victoria for the first time since April 2020.
The pattern in Component 2 for Victoria, where it deviated strongly above zero in the earliest stages of the epidemic, persisted above zero for many months, and returned to around zero once the epidemic was definitively contained, is in contrast to the oscillations seen in New South Wales from June to October.
Discussion
We have presented a novel semimechanistic modelling framework for assessing transmissibility of SARSCoV2 from periods of high to low — or zero — case incidence, with a seamless and coherent transition in interpretation across the changing epidemiological situations. Using timeseries data on cases and population behaviours, our model computes three metrics within a single framework: the effective reproduction number for active cases (${R}_{\mathrm{eff}}$), the populationwide transmission potential (TP), and the deviation between ${R}_{\mathrm{eff}}$ and TP (C2). Our model has been applied (in realtime) to Australian data throughout the pandemic and continues to support the public health response. Here, our analysis of the first 12 months of the pandemic has demonstrated how these quantities enable the tracking and planning of progress towards the control of large outbreaks (as seen in Victoria), maintenance of virus suppression (as seen in New South Wales), and monitoring the risk posed by reintroduction of the virus (as seen in South Australia).
Our approach addresses a major challenge in epidemic situational awareness by enabling assessment of epidemic risk — via the TP — when cases are driven to low levels or (temporary) elimination is achieved. During periods of viral transmission, the model also provides new insight into epidemic dynamics via the deviation between ${R}_{\mathrm{eff}}$ and TP (C2). Further, the TP provides nearrealtime assessment of trends in population macrodistancing and precautionary microbehaviours that fluctuate in response to changing social restrictions, risk perception, and other factors such as school holidays. In combination, knowledge gained from ${R}_{\mathrm{eff}}$, TP and C2 enables policymakers to monitor the relative impacts of communitywide social restrictions and consider the need for more targeted response measures (Department of Health and Aged Care, 2021a).
Social and behavioural data have been used extensively in other countries to support COVID19 situational assessment (Rajatanavin et al., 2021; Jarvis et al., 2020; Coletti et al., 2020; Atchison et al., 2021; Czeisler et al., 2020; Leung et al., 2021). In the UK, the CoMix study Jarvis et al., 2020 has been collecting contact data on a fortnightly basis since March 2020 and reporting “${R}_{c}$" (the basic reproduction number under control measures), to the UK government’s Scientific Pandemic Influenza Group on Modelling, Operational subgroup (SPIMO). Conceptually, CoMix’s ${R}_{c}$ is akin to our TP. However, by synthesising behavioural data from multiple sources, accounting for both micro and macrodistancing behaviours (thus estimating ‘effective’ contacts), and incorporating the effect of case surveillance, our approach is likely to capture a more complete picture of the populationwide potential for virus transmission. Further, by estimating TP and ${R}_{\mathrm{eff}}$ within the same modelling framework (and thus computing C2), our analysis provides a richer and more coherent epidemiological interpretation than that offered through independent measurement and reporting of each metric. Our case studies demonstrate how this richness has supported (and continues to support) the Australian COVID19 response.
Despite its demonstrated impact, there are limitations to our approach. Firstly, it relies on data from frequent, populationwide surveys. In Australia, these data are collected for government and made available to our analysis team by a market research company which has access to an established ‘panel’ of individuals who have agreed to take part in surveys of public opinion. Researchers and governments in many other countries have used such companies for rapid data collection to support pandemic response (Jarvis et al., 2020; Atchison et al., 2021). However, these survey platforms are not readily available in all settings. Further, the sampling strategy did not allow for surveying individuals without internet access, low literacy or limited English language skills, or communication or cognitive difficulties. Further, individuals under 18 years of age were not represented in our surveys. Nor were these survey results available for the prepandemic period, limiting our ability to estimate what a true behavioural baseline would be for the Australian population.
The requirement for specific data streams is a limitation of our approach as routinely applied in Australia in 2020, where it was developed to address situationspecific policy questions and synthesise available data relating to the transmission process. However, the framework is modular and could be adjusted to incorporate or remove timeseries of relevant quantities (e.g. nonhousehold contact rates, adherence to precautionary microbehaviour, effectiveness of surveillance), according to data availability, epidemiological relevance, and policy needs. For its use in Australia in 2020, nonhousehold contact rates (capturing the main effects of stayathome measures) and precautionary microbehaviour were considered the most important (and measurable) drivers of epidemic dynamics. In other times and places (or for other diseases), different factors may be more important for monitoring epidemic dynamics, and the variables that are quantified should be chosen accordingly.
While the patterns of TP, ${R}_{\mathrm{eff}}$ and C2 observed over time in Australia are consistent with “in field” epidemiological assessments, and while the methods have demonstrated impact in supporting decision making, a direct quantification of the validity of the TP is not straightforward. For example, whether selfreported adherence to the 1.5 m rule is a reliable covariate for change in the per contact probability of transmission over time is difficult to assess. If transmission were to become widespread in Australia, and cases therefore became more representative of the general population rather than specific subsets, ${R}_{\mathrm{eff}}$ and TP estimates would be expected to converge. However, in the absence of such a natural experiment, no ground truth for this unobserved parameter exists with which to quantitatively validate the model calibration. During the Victorian second wave, ${R}_{\mathrm{eff}}$ > TP is consistent with virus spread in subpopulations with higherthanpopulationaverage rates of social contact, an interpretation supported by other epidemiological assessments. Nevertheless, we cannot rule out that the modelled TP was systematically underestimating the ‘true’ TP over this period.
In Australia, our methods are embedded not only in state and national situational assessment (Department of Health and Aged Care, 2021b) but also in national response planning. Since the model incorporates a mechanistic understanding of the impacts of physical distancing behaviour on both household and nonhousehold transmission, it can be used to predict the impact of interventions on actual and potential transmission (Doherty Institute, 2021).
Unlike other approaches that make assumptions about impacts of different interventions on behaviour, we directly measure and account for behavioural responses, providing a much more proximal way of assessing the effects of interventions (Flaxman et al., 2020). Further, while detailed data on the demographics and transmission settings for cases in Australia are unavailable, our method considers deviation (the C2) from the regional average (the TP). It is therefore less susceptible to conflation between an epidemic stochastically moving between settings of different transmissibility, and changes in populationwide transmission potential.
While not addressed in this article, our semimechanistic model structure enables us to perform independent estimates of the relative transmissibility of variants compared to ancestral strains. In doing so, we account for variability in the types of contacts made when low restrictions are applied (Golding et al., 2021). We are able to estimate differences between variants in the probability of transmission per unit of contacttime, for example from detailed attack rate data from overseas. These probabilities can then be combined with our estimates from Australian case data to adjust our estimates of TP under different levels of restrictions for current and emerging variants. We have also updated our modelling framework to account for the effects of vaccination on the TP (reported in the Australian Government’s Common Operating Picture from 27 August 2021; Department of Health and Aged Care, 2021b). This enables us to consider the effect of varying levels of population vaccination coverage, agebased vaccination prioritisation strategies, and levels of restrictions on the ability of the Delta variant (and future possible variants) to spread in the population. These analyses underpin the 2021 Australian national COVID19 reopening plan (Doherty Institute, 2021) and will be reported elsewhere. These various additions and the component models of our framework (Figure 1) provide a suite of interoperable modules that could be used to apply the TP modelling framework to future epidemic diseases and other settings. Broader application and uptake of these methods would be aided by the development of robust research software that allows the analyst to choose which modules are used, to match the data streams available. The development of such software, a detailed description of data inputs, and an analysis of the value of each data stream will be the focus of future work.
Our novel methods provide new insight into epidemic dynamics in both low and high incidence settings. The analyses have become an indispensable tool supporting the Australian COVID19 response, through both situational assessment and strategic planning processes.
Methods
Model overview
We estimate the timevarying ability of SARSCoV2 to spread in a population using a novel semimechanistic model informed by data on cases, population behaviours and health system effectiveness. We separately model transmission from locally acquired cases (localtolocal transmission) and from overseas acquired cases (importtolocal transmission). We model localtolocal transmission (${R}_{\mathrm{eff}}$) using two components:
The average populationlevel trend in transmissibility driven by interventions that primarily target transmission from local cases, specifically changes in physical distancing behaviour and case targeted measures (Component 1); and
Shortterm fluctuations in ${R}_{\mathrm{eff}}$ to capture stochastic dynamics of transmission, such as clusters of cases and short periods of lowerthanexpected transmission, and other factors influencing ${R}_{\mathrm{eff}}$ that are otherwise unaccounted for by the model (Component 2).
During times of disease activity, Components 1 and 2 are combined to provide an estimate of the local ${R}_{\mathrm{eff}}$ as traditionally measured. In the absence of disease activity, Component 1 is interpreted as the potential for the virus, if it were present, to establish and maintain community transmission (>1) or otherwise (<1).
Case data
We used linelists of reported cases for each Australian state and territory extracted from the Australian National Notifiable Diseases Surveillance System (NNDSS). The linelists contain the date when the individual first exhibited symptoms, date when the case notification was received by the jurisdictional health department and where the infection was acquired (i.e. overseas or locally).
Modelling the impact of physical distancing
Overview
To investigate the impact of distancing measures on SARSCoV2 transmission, we distinguish between two types of distancing behaviour: (1) macrodistancing, that is, reduction in the rate of nonhousehold contacts; and (2) precautionary microbehaviour, that is, reduction in transmission probability per nonhousehold contact.
We used data from nationwide surveys to estimate trends in specific macrodistancing (average daily number of nonhousehold contacts) and precautionary microbehaviour (proportion of the population always keeping 1.5 m physical distance from nonhousehold contacts) behaviours over time. We used these survey data to infer statelevel trends in macrodistancing and precautionary microbehaviour over time, with additional information drawn from trends in mobility data.
Estimating changes in macrodistancing behaviour
To estimate trends in macrodistancing behaviour, we used data from: two waves of a national survey conducted in early April and early May 2020 by the University of Melbourne; and weekly waves of a national survey conducted by the Australian government from late May 2020. Respondents were asked to report the number of individuals that they had contact with outside of their household in the previous 24 hr. Note that the first wave of the University of Melbourne survey was fielded four days after Australia’s most intensive physical distancing measures were recommended nationally on 29 March 2020.
Given these data, we used a statistical model to infer a continuous trend in macrodistancing behaviour over time. This model assumed that the daily number of nonhousehold contacts is proportional to a weighted average of time spent at different types of location, as measured by Google mobility data. The five types of places are: parks and public spaces; residential properties; retail and recreation; public transport stations; and workplaces. We fit a statistical model that infers the proportion of nonhousehold contacts occurring in each of these types of places from:
A survey of locationspecific contact rates preCOVID19 (Rolls et al., 2015); and
A separate statistical model fit to the national average numbers of nonhousehold contacts from a preCOVID19 contact survey and contact surveys fielded postimplementation of COVID19 restrictions.
Waning in macrodistancing behaviour is therefore driven by Google mobility data (calibrated to survey data on nonhousehold contact rates) on increasing time spent in each of the different types of locations since the peak of macrodistancing behaviour.
Estimating changes in precautionary microbehaviour
To estimate trends in precautionary microbehaviour, we used data from weekly national surveys (first wave from 27 to 30 March 2020) to assess changes in behaviour in response to COVID19 public health measures. Respondents were asked to respond to the question: ‘Are you staying 1.5 m away from people who are not members of your household’ on a five point scale with response options ‘No’, ‘Rarely’, ‘Sometimes’, ‘Often’ and ‘Always’.
These behavioural survey data were used in a statistical model to infer the trend in precautionary microbehaviour over time. Precautionary microbehaviour was assumed to be nonexistent prior to the first epidemic wave of COVID19, and the increase in precautionary microbehaviour to its peak was assumed to follow the same trend as macrodistancing behaviour, implying that the population simultaneously adopted both macrodistancing and precautionary microbehaviours around the times that restrictions were implemented.
Incorporating estimated changes in behaviour in the model of transmission potential
These statelevel macrodistancing and precautionary microbehaviour trends were then used in the model of transmission potential to inform the reduction in nonhousehold transmission rates. Since the macrodistancing trend is calibrated against the number of nonhousehold contacts, the rate of nonhousehold transmission scales directly with this inferred trend. The probability of transmission per nonhousehold contact is assumed to be proportional to the fraction of survey participants who report that they always maintain 1.5 m physical distance from nonhousehold contacts. The constant of proportionality is estimated in the model of transmission potential.
The estimated rate of waning of precautionary microbehaviour is sensitive to the metric used. If a different metric of precautionary microbehaviour (e.g. the fraction of respondents practicing good hand hygiene) were used, this might change the inferred rate of waning of precautionary microbehaviour, and therefore the estimated rate of increase in transmission potential.
Modelling the impact of quarantine of overseas arrivals
We model the impact of quarantine of overseas arrivals via a ‘step function’ reflecting three different quarantine policies: selfquarantine of overseas arrivals from specific countries prior to 15 March 2020; selfquarantine of all overseas arrivals from 15 March up to 27 March 2020; and mandatory quarantine of all overseas arrivals after 27 March 2020 (Figure 2). We make no prior assumptions about the effectiveness of quarantine at reducing ${R}_{\mathrm{eff}}$ for imported cases, except that each successive change in policy increased that effectiveness. Note that this part of the model is intended to capture broad changes in the contribution of importation to case numbers, and is not intended to provide reliable inferences about the relative contributions of different border quarantine policies to disease importation.
Accounting for the impact of interstateacquired infections
Each of Australia’s eight states and territories was modelled as a separate epidemic, with no travel assumed between jurisdictions and interstateacquired cases handled as ‘imported cases’ within the modelling framework. We believe that these modelling decisions were reasonable for the Australian context given Australia’s unique geography (the majority of Australians live in a handful of major cities, with comparatively little movement between them), and the imposition of interstate travel restrictions during periods of COVID19 transmission over the analysis period. Furthermore, the number of interstate importations in Australia was small and well documented in the data. Unlike overseasacquired cases, interstateacquired cases are assumed to contribute to onward local transmission since they were not required to quarantine.
Model limitations
While we had access to data on whether cases are locally acquired or overseas acquired, no data were available on whether each of the locally acquired cases were infected by an imported case or by another locally acquired case. These data would allow us to disentangle the two transmission rates. Without these data, we can separate the denominators (number of infectious cases), but not the numerators (number of newly infected cases) in each group at each point in time. With access to such data, our method could provide more precise estimates of ${R}_{\mathrm{eff}}$.
Model description
We developed a semimechanistic Bayesian statistical model to estimate ${R}_{\mathrm{eff}}$, or $R(t)$ hereafter, the effective rate of transmission of SARSCoV2 over time, whilst simultaneously quantifying the impacts on $R(t)$ of a range of policy measures introduced at national and regional levels in Australia.
Observation model
A straightforward observation model to relate case counts to the rate of transmission is to assume that the number of new locally acquired cases ${N}_{i}^{L}(t)$ at time $t$ in region $i$ is (conditional on its expectation) Poissondistributed with mean ${\lambda}_{i}(t)$ given by the product of the total infectiousness of infected individuals ${I}_{i}(t)$ and the timevarying reproduction number ${R}_{i}(t)$: ${N}_{i}^{L}(t)\sim \mathrm{Poisson}({\lambda}_{i}(t))$ with ${\lambda}_{i}(t)={I}_{i}(t){R}_{i}(t)$,
where the total infectiousness, ${I}_{i}(t)$, is the sum of all active infections ${N}_{i}({t}^{\prime})$ (both locallyacquired ${N}_{i}^{L}({t}^{\prime})$ and overseasacquired ${N}_{i}^{O}({t}^{\prime})$) initiated at times ${t}^{\prime}$ prior to $t$, each weighted by an infectivity function $g({t}^{\prime})$ giving the proportion of new infections that occur ${t}^{\prime}$ days postinfection. The function $g({t}^{\prime})$ is the probability of an infectorinfectee pair occurring ${t}^{\prime}$ days after the infector’s exposure, that is, a discretisation of the probability distribution function corresponding to the generation interval.
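As a rough illustration of how total infectiousness and the expected case count fit together, the sketch below convolves a daily series of new infections with a discretised generation interval. The function names, the toy generation interval and the indexing convention (infections from day $t-{t}^{\prime}$ weighted by $g({t}^{\prime})$, with lags starting at one day) are assumptions made for illustration and are not taken from the authors' implementation.

```python
def total_infectiousness(new_infections, gen_interval):
    """I(t): sum over lags t' >= 1 of g(t') * N(t - t') (assumed indexing)."""
    infectiousness = []
    for t in range(len(new_infections)):
        total = 0.0
        for lag, weight in enumerate(gen_interval, start=1):
            if t - lag >= 0:
                total += weight * new_infections[t - lag]
        infectiousness.append(total)
    return infectiousness


def expected_local_cases(infectiousness, reproduction_number):
    """Poisson mean lambda(t) = I(t) * R(t)."""
    return [i * r for i, r in zip(infectiousness, reproduction_number)]


# Hypothetical daily counts of locally and overseas acquired infections.
local = [2, 0, 1, 3, 0]
overseas = [1, 1, 0, 0, 0]
all_infections = [l + o for l, o in zip(local, overseas)]

# Toy generation interval over lags of 1 to 4 days (sums to 1).
g = [0.1, 0.4, 0.3, 0.2]

infectiousness = total_infectiousness(all_infections, g)
lam = expected_local_cases(infectiousness, [1.5] * len(infectiousness))
```

Here both local and overseas acquired infections contribute to infectiousness, while only locally acquired cases would appear on the left-hand side of the Poisson observation model.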
This observation model forms the basis of the maximumlikelihood method proposed by White and Pagano, 2008 and the variations of that method by Cori et al., 2013, Thompson et al., 2019 and Abbott et al., 2020a, 2020b that have previously been used to estimate timevarying SARSCoV2 reproduction numbers in Australia (Price et al., 2020).
We extend this model to consider separate reproduction numbers for two groups of infectious cases, in order to model the effects of different interventions targeted at each group: those with locally acquired cases ${I}_{i}^{L}(t)$, and those with overseas acquired cases ${I}_{i}^{O}(t)$, with corresponding reproduction numbers ${R}_{i}^{L}(t)$ and ${R}_{i}^{O}(t)$. These respectively are the rates of transmission from locally acquired cases to locals, and from imported cases to locals. We also model daily case counts as arising from a Negative Binomial distribution rather than a Poisson distribution to account for potential clustering of new infections on the same day, and use a state and timevarying generation interval distribution ${g}_{i}({t}^{\prime},t)$ (detailed in the Surveillance effect model section):
where the negative binomial distribution is parameterised in terms of its mean ${\mu}_{i}(t)$ and dispersion parameter $r$. In the commonly used probability and dispersion parameterisation with probability $\psi $ the mean is given by $\mu =\psi r/(1-\psi )$.
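To make the link between the two parameterisations concrete, the following sketch converts a mean and dispersion into the probability $\psi$ and checks numerically that the resulting distribution has the intended mean. The function names and the example values are hypothetical; the probability mass function is the standard negative binomial form for which $\mu =\psi r/(1-\psi )$.

```python
import math


def prob_from_mean(mean, size):
    """Probability psi such that mean = psi * size / (1 - psi)."""
    return mean / (mean + size)


def neg_binomial_pmf(k, mean, size):
    """P(N = k) for a negative binomial with the given mean and dispersion."""
    psi = prob_from_mean(mean, size)
    log_coef = math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
    return math.exp(log_coef + k * math.log(psi) + size * math.log1p(-psi))


# Hypothetical values for the mean and dispersion parameter r.
mean, size = 4.0, 2.5
psi = prob_from_mean(mean, size)

# Numerical checks over a range wide enough that the tail is negligible.
total_mass = sum(neg_binomial_pmf(k, mean, size) for k in range(2000))
numeric_mean = sum(k * neg_binomial_pmf(k, mean, size) for k in range(2000))
```

Smaller values of the dispersion parameter give a heavier tail for the same mean, which is how this distribution accommodates clustering of new infections on the same day.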
Note that if data were available on whether the source of infection for each locally acquired case was another locallyacquired case or an overseasacquired case, we could split this into two separate analyses using the observation model above; one for each transmission source. In the absence of such data, the fractions of all transmission attributed to sources of each type are implicitly inferred by the model, with an associated increase in parameter uncertainty.
We provide the model with additional information on the rate of importtolocal transmission by adding a further likelihood term to the model for known events of importtolocal transmission since the implementation of mandatory hotel quarantine:
where $K$ is the total number of known events of transmission from overseasacquired cases occurring within Australia from ${\tau}_{2}$ = 2020-03-28 to ${\tau}_{3}$ = 2020-12-31. These events are largely transmission events within hotel quarantine facilities, some of which led to outbreaks of localtolocal transmission. Prior to this period, importtolocal transmission events cannot be reliably distinguished from localtolocal transmission events.
When estimating ${R}_{\mathrm{eff}}$ from recent case count data, care must be taken to account for underreporting of recent cases (those which have yet to be detected), because failing to account for this underreporting can lead to estimates of ${R}_{\mathrm{eff}}$ that are biased downwards. We correct for this righttruncation effect by first estimating the fraction of locallyacquired cases on each date that we would expect to have detected by the time the model is run (detection probability), and correcting both the infectiousness terms ${I}_{i}^{L}(t)$, and the observed number of new cases ${N}_{i}^{L}(t)$. We calculate the detection probability for each day in the past from the empirical cumulative distribution function of delays from assumed date of infection to date of detection over a recent period (see Surveillance effect model). We correct the infectiousness estimates ${I}_{i}^{L}(t)$ by dividing the number of newly infected cases on each day ${N}_{i}^{L}(t)$ by this detection probability — to obtain the expected number of new infections per day — before summing across infectiousness. We correct the observed number of new infections by a modification to the negative binomial likelihood; multiplying the expected number of cases by the detection probability to obtain the expected number of cases observed in the (uncorrected) time series of locallyacquired cases.
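The two corrections described above can be sketched as follows. This is a minimal illustration only: the delay data, the case series and the function names are hypothetical, and a single fixed set of delays is used rather than a time-varying one.

```python
def detection_probability(delays, days_elapsed):
    """Empirical CDF: share of observed delays no longer than days_elapsed."""
    return sum(1 for d in delays if d <= days_elapsed) / len(delays)


def corrected_infections(observed_cases, delays):
    """Divide each day's count by the probability of detection so far."""
    last_day = len(observed_cases) - 1
    corrected = []
    for day, count in enumerate(observed_cases):
        p_detect = detection_probability(delays, last_day - day)
        # Undefined when nothing infected that day could be detected yet.
        corrected.append(count / p_detect if p_detect > 0 else float("nan"))
    return corrected


def expected_observed(expected_cases, p_detect):
    """Likelihood-side correction: expected cases seen in the raw series."""
    return expected_cases * p_detect


# Hypothetical infection-to-detection delays (days) from a recent period.
recent_delays = [1, 2, 2, 3, 3, 4, 5, 6]
# Hypothetical cases by assumed infection date; the last entry is today.
observed = [5, 6, 4, 5, 3, 3, 2, 1]

expected_infections = corrected_infections(observed, recent_delays)
```

Counts from the most recent days are scaled up the most, since few of the infections acquired on those days will have been detected by the time the model is run.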
Reproduction rate models
We model the onward reproduction numbers for overseasacquired and locallyacquired cases in a semimechanistic way. Reproduction numbers for localtolocal transmission are modelled as a combination of a deterministic model of the populationwide transmission potential for that type of case, and a correlated time series of random effects to represent stochastic fluctuations in transmission in each state over time. Importtolocal transmission is modelled in a mechanistic way:
For both locally acquired and overseasacquired infections, the effective reproduction number depends on the transmission potential ${R}_{i}^{\ast}(t)$, which is given by a deterministic epidemiological model of populationwide transmission potential that considers the effects of distancing behaviours. The correlated time series of random effects ${\epsilon}_{i}(t)$ represents stochastic fluctuations in these localtolocal reproduction numbers in each state over time, for example due to clusters of transmission in subpopulations with higher or lower reproduction numbers than the general population. We consider that the transmission potential ${R}_{i}^{*}(t)$ is the average of individual reproduction numbers over the entire state population, whereas the effective reproduction number ${R}_{i}^{L}(t)$ is the average of individual reproduction numbers among a (nonrandom) sample of individuals: those that make up the active cases at that point in time. We therefore expect that the longterm average of ${R}_{i}^{L}(t)$ will equate to ${R}_{i}^{*}(t)$. The relationship between these two is therefore defined such that the hierarchical distribution over ${R}_{i}^{L}(t)$ is marginally (with respect to time) a lognormal distribution with mean ${R}_{i}^{*}(t)$. The parameter ${\sigma}^{2}$ is the marginal variance of the ${\epsilon}_{i}$, as defined in the kernel function of the Gaussian process.
Note that in this model the random effects term ${\epsilon}_{i}$ and its variance term ${\sigma}^{2}$ are intended to have a mechanistic interpretation as the stochasticity due to random sampling (of people currently infected from the total population). They are not incorporated to account for error in specification of the transmission potential in the way that temporal random effects are commonly used in statistical modelling. Consequently, small variance in the timeseries plots of ${\epsilon}_{i}$ is not indicative of good fit, but of a large number of infections; as the size of the sample increases, the variance of the mean decreases.
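One standard way to obtain a lognormal quantity with a prescribed mean is to subtract half the variance on the log scale. The sketch below assumes that form, ${R}^{L}={R}^{*}\mathrm{exp}(\epsilon -{\sigma}^{2}/2)$, and checks by simulation that the average recovers the transmission potential. Independent normal draws stand in for the temporally correlated Gaussian process, and all names and values are hypothetical.

```python
import math
import random


def local_reproduction_number(transmission_potential, epsilon, sigma):
    """R^L = R* x exp(epsilon - sigma^2 / 2), so that E[R^L] = R*."""
    return transmission_potential * math.exp(epsilon - sigma ** 2 / 2)


random.seed(1)
sigma = 0.5  # hypothetical marginal standard deviation of epsilon
tp = 1.2     # hypothetical transmission potential

# Independent draws are a simplification: the model described in the text
# uses a correlated time series for epsilon.
draws = [
    local_reproduction_number(tp, random.gauss(0.0, sigma), sigma)
    for _ in range(200_000)
]
sample_mean = sum(draws) / len(draws)
```

Without the $-{\sigma}^{2}/2$ term, the average of ${R}^{L}$ would exceed the transmission potential by a factor of $\mathrm{exp}({\sigma}^{2}/2)$.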
For overseasacquired cases the populationwide transmission rate at time $t$, ${R}_{i}^{*}(0)Q(t)$, is the baseline rate of transmission (${R}_{i}^{*}(0)={R}_{0}$; localtolocal transmission potential in the absence of distancing behaviour or other mitigation) multiplied by a quarantine effect model, $Q(t)$, that encodes the efficacy of the three different overseas quarantine policies implemented in Australia (described below).
We model ${R}_{i}^{*}(t)$, the populationwide rate of localtolocal transmission at time $t$, as the sum of two components: the rate of transmission to members of the same household, and to members of other households. Each of these components is computed as the product of the number of contacts, and the probability of transmission per contact. The transmission probability is in turn modelled as a binomial process considering the duration of contact with each person and the probability of transmission per unit time of contact. This mechanistic consideration of the contact process enables us to separately quantify how macrodistancing and precautionary microbehaviours impact on transmission, and to make use of various ancillary measures of both forms of distancing:
where: $s(t)$ is the effect of surveillance on transmission, due to the detection and isolation of cases (detailed below); $H{C}_{0}$ and $N{C}_{0}$ are the baseline (i.e. before adoption of distancing behaviours) daily rates of contact with, respectively, people who are, and are not, members of the same household; $H{D}_{0}$ and $N{D}_{0}$ are the baseline average total daily duration of contacts with household and nonhousehold members (measured in hours); $d$ is the average duration of infectiousness in days; $p$ is the probability of transmitting the disease per hour of contact, and; ${h}_{i}(t)$, ${\delta}_{i}(t)$, ${\gamma}_{i}(t)$ are timevarying indices of change relative to baseline of the duration of household contacts, the number of nonhousehold contacts, and the transmission probability per nonhousehold contact, respectively (modifying both the duration and transmission probability per unit time for nonhousehold contacts).
The first component in Equation (12) is the rate of household transmission, and the second is the rate of nonhousehold transmission. Note that the duration of infectiousness $d$ is considered differently in each of these components. For household members, the daily number of household contacts is typically close to the total number of household members, hence the expected number of household transmissions asymptotically approaches the household size; so the number of days of infectiousness contributes to the probability of transmission to each of those household members. This is unlikely to be the case for nonhousehold members, where each day’s nonhousehold contacts may overlap, but are unlikely to be from a small finite pool. This assumption would be unnecessary if contact data were collected on a similar timescale to the duration of infectiousness, though issues with participant recall in contact surveys mean that such data are unavailable. Note that this model does not have a household network structure, nor account for depletion of susceptible individuals within a household.
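The structure of the two components can be sketched in code. The exact form of Equation (12) is not reproduced here; the function below is one plausible reading of the description, in which the duration of infectiousness enters the exponent of the household term but multiplies the nonhousehold term. All parameter values are hypothetical.

```python
def transmission_potential(s, hc0, hd0, nc0, nd0, d, p, h, delta, gamma):
    """Assumed form: s * (household term + non-household term)."""
    # Household: the same hc0 contacts are each exposed for hd0 * h hours
    # per day across d days, so d enters the per-contact probability.
    household = hc0 * (1 - (1 - p) ** (hd0 * h * d))
    # Non-household: nc0 * delta contacts on each of d days, each exposed
    # for nd0 hours, with the per-contact probability scaled by gamma.
    non_household = nc0 * delta * d * (1 - (1 - p) ** nd0) * gamma
    return s * (household + non_household)


# Hypothetical baseline parameters (not estimates from the paper).
params = dict(s=1.0, hc0=2.0, hd0=8.0, nc0=10.0, nd0=1.0, d=5.0, p=0.02)

baseline = transmission_potential(h=1.0, delta=1.0, gamma=1.0, **params)
distanced = transmission_potential(h=1.2, delta=0.3, gamma=0.6, **params)
household_only = transmission_potential(h=1.0, delta=0.0, gamma=1.0, **params)
```

In this reading the household term saturates at the number of household contacts however long the infectious period, whereas the nonhousehold term grows linearly with it.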
The parameters $H{C}_{0}$, $H{D}_{0}$, and $N{D}_{0}$ are all estimated from a contact survey conducted in Melbourne in 2015 (Rolls et al., 2015). $N{C}_{0}$ is computed from an estimate of the total number of contacts per day for adults from Prem et al., 2017, minus the estimated rate of household contacts. Whilst Rolls et al., 2015 also provides an estimate of the rate of nonhousehold contacts, the method of data collection (a combination of ‘individual’ and ‘group’ contacts) makes it less comparable with contemporary survey data than the estimate of Prem et al., 2017.
The expected duration of infectiousness $d$ is computed as the mean of the nontimevarying discrete generation interval distribution: $d={\sum}_{{t}^{\prime}}{t}^{\prime}\,{g}^{*}({t}^{\prime})$,
and change in the duration of household contacts over time ${h}_{i}(t)$ is assumed to be equivalent to change in time spent in residential locations in region $i$, as estimated by the mobility model for the data stream Google: time at residential. In other words, the total duration of time in contact with household members is assumed to be directly proportional to the amount of time spent at home. Unlike the effect on nonhousehold transmission, an increase in macrodistancing is expected to slightly increase household transmission due to this increased contact duration.
The timevarying parameters ${\delta}_{i}(t)$ and ${\gamma}_{i}(t)$ respectively represent macrodistancing and precautionary microbehaviour: behavioural changes that reduce mixing with nonhousehold members, and the probability of transmission for each nonhousehold contact. We model each of these components, informed by population mobility estimates from the mobility model and calibrated against data from nationwide surveys of contact behaviour.
Surveillance effect model
Disease surveillance (both screening of people with COVIDlike symptoms and performing contact tracing) can improve COVID19 control by placing cases in isolation so that they are less likely to transmit the pathogen to other people. Improvements in disease surveillance can therefore lead to a reduction in transmission potential by isolating cases more quickly, and reducing the time they are infectious but not isolated. Such an improvement changes two quantities: the population average transmission potential ${R}^{*}(t)$ is reduced by a factor ${s}_{i}(t)$; and the generation interval distribution $g(t,{t}^{\prime})$ is shortened, as any transmission events are more likely to occur prior to isolation.
We model both of these functions using a region and timevarying estimate of the survival function (one minus the cumulative distribution function) ${f}_{i}(t,{t}^{\prime})$ of the discrete probability distribution over times from infection to detection:
where ${g}^{*}({t}^{\prime})$ is the baseline generation interval distribution, representing times to infection in the absence of detection and isolation of cases; ${s}_{i}(t)$ is a normalising factor, and also the effect of surveillance on transmission; and ${f}_{i}(t,{t}^{\prime})$ is the region and timevarying survival function of the time from infection to isolation, evaluated at ${t}^{\prime}$. In states/territories and at times when cases are rapidly found and placed in isolation, the distribution of times to isolation underlying ${f}_{i}(t,{t}^{\prime})$ has most of its mass on small delays, average generation intervals are shortened, and the surveillance effect ${s}_{i}(t)$ tends toward 0 (a reduction in transmission). At times when cases are not found and isolated until after most of their infectious period has passed, this distribution has most of its mass on large delays, generation intervals are longer on average, and ${s}_{i}(t)$ tends toward 1 (no reduction in transmission).
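A small numerical sketch may help to show how the surveillance effect and the shortened generation interval arise together. It assumes that the baseline generation interval is multiplied elementwise by the probability of not yet being isolated and then renormalised, with the normalising constant taken as ${s}_{i}(t)$. The distributions below are toy values, not estimates.

```python
def surveillance_effect(baseline_gi, survival):
    """Return s and the shortened generation interval g = g* x f / s."""
    weighted = [g * f for g, f in zip(baseline_gi, survival)]
    s = sum(weighted)
    return s, [w / s for w in weighted]


def mean_interval(gen_interval):
    """Mean of a discrete distribution over days 1, 2, ..."""
    return sum(day * g for day, g in enumerate(gen_interval, start=1))


# Toy baseline generation interval over days 1 to 6 (sums to 1).
g_star = [0.1, 0.2, 0.3, 0.2, 0.1, 0.1]

# Probability a case is still not isolated on each day since infection.
slow_isolation = [1.0, 1.0, 1.0, 1.0, 0.9, 0.8]
fast_isolation = [1.0, 0.6, 0.3, 0.1, 0.0, 0.0]

s_slow, g_slow = surveillance_effect(g_star, slow_isolation)
s_fast, g_fast = surveillance_effect(g_star, fast_isolation)
```

With slow isolation the factor stays close to 1 and the generation interval is nearly unchanged; with fast isolation the factor drops to about a third and the mean generation interval shortens from 3.3 to roughly 2.1 days.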
We model the region and timevarying distributions ${f}_{i}(t,{t}^{\prime})$ via a timeseries of empirical distribution functions computed from all infectiontoisolation periods observed within an adaptive moving window around each time $t$. Since dates of infection and isolation are not routinely recorded in the dataset analysed, we use 5 days prior to the date of symptom onset as the assumed date of infection, and the date of case notification as the assumed date of isolation. This will overestimate the time to isolation and therefore underestimate the effect of surveillance when a significant proportion of cases are placed into isolation prior to testing positive, for example during the tail of an outbreak being successfully controlled by contact tracing.
For a given date and state/territory, the empirical distribution of delays from symptom onset to notification is computed from cases with symptom onset falling within a time window around that date, with the window selected to be the smallest that will yield at least 500 observations; but constrained to between one and eight weeks.
Where a state/territory does not have sufficient cases to reliably estimate this distribution in an eight week period, a national estimate is used instead. Specifically, if fewer than 100 cases, the national estimate is used, if more than 500 the state estimate is used, and if between 100 and 500 the distribution is a weighted average of state and national estimates.
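The switch between state and national estimates can be sketched as below. The text does not specify how the weights vary between 100 and 500 cases, so the linear interpolation used here is an assumption, as are the function names and the toy distributions.

```python
def state_weight(n_state_cases, lower=100, upper=500):
    """Weight on the state estimate: 0 at or below lower, 1 at or above upper.

    Linear interpolation in between is an assumed choice.
    """
    fraction = (n_state_cases - lower) / (upper - lower)
    return min(1.0, max(0.0, fraction))


def blended_distribution(state_cdf, national_cdf, n_state_cases):
    """Weighted average of state and national delay distributions."""
    w = state_weight(n_state_cases)
    return [w * s + (1 - w) * n for s, n in zip(state_cdf, national_cdf)]


# Toy cumulative distributions of delay (days 0, 1, 2, 3).
state_cdf = [0.2, 0.5, 0.8, 1.0]
national_cdf = [0.4, 0.7, 0.9, 1.0]

blended = blended_distribution(state_cdf, national_cdf, n_state_cases=300)
```

A weighted average of two cumulative distribution functions is itself a valid cumulative distribution function, so the blended estimate can be used in the same way as either input.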
The national estimate is obtained via the same method but with no upper limit on the window size and excluding data from Victoria since 14 June, since the situation during the Victorian outbreak after this time is not likely to be representative of surveillance in states with few cases.
Macrodistancing model
The populationwide average daily number of nonhousehold contacts at a given time can be directly estimated using a contact survey. We therefore used data from a series of contact surveys commencing immediately after the introduction of distancing restrictions to estimate ${\delta}_{i}(t)$ independently of case data. To infer a continuous trend of ${\delta}_{i}(t)$, we model the numbers of nonhousehold contacts at a given time as a function of mobility metrics considered in the mobility model. We model the log of the average number of contacts on each day as a linear model of the log of the ratio to baseline of five Google metrics of time spent at different types of location: residential, transit stations, parks, workplaces, and retail and recreation:
where $\omega $ is the vector of 5 coefficients, $\mathbf{m}$ is a vector of length 5 containing ones, except for the element corresponding to time at residential locations, which has value −1, and ⊙ indicates the elementwise product. This constrains the direction of the effect of increasing time spent at each of these locations to be positive (more contacts), except for time at residential, which we constrain to be negative. The intercept of the linear model (average daily contacts at baseline) is given a prior formed from the daily number of nonhousehold contacts in a preCOVID19 contact survey (Rolls et al., 2015). Since our aim is to capture general trends in mobility rather than daily effects, we model the weekly average of the daily number of contacts, by using smoothed estimates of the Google mobility metrics.
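The sign constraint can be illustrated with a short sketch of a log-linear model of this kind. The coefficient values, mobility ratios and function name are hypothetical; the coefficients are taken to be nonnegative so that the sign vector alone sets the direction of each effect.

```python
import math

LOCATIONS = ["residential", "transit stations", "parks", "workplaces",
             "retail and recreation"]

# Sign vector m: +1 everywhere except -1 for time at residential.
SIGNS = {loc: (-1.0 if loc == "residential" else 1.0) for loc in LOCATIONS}


def expected_daily_contacts(nc0, omega, mobility_ratio):
    """nc0 * exp(sum_j omega_j * m_j * log(ratio_j)), with omega_j >= 0."""
    log_change = sum(
        omega[loc] * SIGNS[loc] * math.log(mobility_ratio[loc])
        for loc in LOCATIONS
    )
    return nc0 * math.exp(log_change)


# Hypothetical nonnegative coefficients and baseline contact rate.
omega = {loc: 0.3 for loc in LOCATIONS}
nc0 = 10.0

at_baseline = {loc: 1.0 for loc in LOCATIONS}
restricted = {"residential": 1.2, "transit stations": 0.4, "parks": 0.8,
              "workplaces": 0.5, "retail and recreation": 0.6}

contacts_baseline = expected_daily_contacts(nc0, omega, at_baseline)
contacts_restricted = expected_daily_contacts(nc0, omega, restricted)
```

When every mobility metric sits at its baseline the model returns the baseline contact rate, and more time at home or less time elsewhere can only lower the expected number of contacts.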
Whilst we aim to model weekly rather than daily variation in contact rates, when fitting the model to survey data we account for variation among responses by day of the week by modelling the fraction of the weekly number of contacts falling on each day of the week (the length-seven vector in each state and time ${\mathbf{D}}_{i}(t)$) and using this to adjust the expected number of contacts for each respondent based on the day of the week they completed the survey. To account for how the weekly distribution of contacts has changed over time as a function of mixing restrictions (e.g. a lower proportion of contacts on weekdays during periods when stay-at-home orders were in place), we model the weekly distribution of contacts itself as a function of deviation in the weekly average of the daily number of contacts, with length-seven vector parameters $\alpha $ and $\theta $. We use the softmax (normalised exponential) function to transform this distribution to sum to one, then multiply the resulting proportion by 7 to re-weight the weekly average daily contact rate to the relevant day of the week.
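The softmax re-weighting step can be sketched as below. The linear predictor used here (an intercept vector plus a slope vector multiplied by the log change in contacts) is an assumed form chosen for illustration; the text states only that the weekly distribution depends on the deviation in contacts through $\alpha $ and $\theta $.

```python
import math


def softmax(values):
    """Normalised exponential; subtracts the maximum for numerical stability."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def day_of_week_weights(alpha, theta, log_delta):
    """Length-seven multipliers applied to the weekly average daily contact rate.

    alpha, theta: length-seven parameter vectors.
    log_delta: deviation in weekly average contacts (assumed on the log scale).
    """
    linear = [a + th * log_delta for a, th in zip(alpha, theta)]
    # Proportions sum to one; multiplying by 7 makes the weights average to one.
    return [7.0 * p for p in softmax(linear)]
```

Multiplying by 7 means that a day carrying exactly one seventh of the weekly contacts leaves the weekly average daily rate unchanged.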
Combining the baseline average daily contact rate $N{C}_{0}$, the mobility-driven modelled change in contact rates over time ${\delta}_{i}(t)$, and the time-varying day-of-the-week effects ${\mathbf{D}}_{i}(t)$, we obtain an expected number of daily contacts for each survey response $N{C}_{k}$:
where $i[k]$, $t[k]$, and $d[k]$ respectively indicate the state, time, and day of the week on which respondent $k$ filled in the survey.
We model the number of contacts from each survey respondent as a draw from an interval-censored discrete lognormal distribution. This choice of distribution enables us to account for the ad hoc rounding of reported numbers of contacts (responses larger than 10 tend to be ‘heaped’ on multiples of 10 and 100), whilst also accounting for the heavy upper tail in numbers of reported contacts. The support of this distribution is the integers from 0 to 10 inclusive, and the intervals 11–20, 21–50, and 51–999. Reported daily contact rates ≥ 1000 are excluded as these are considered implausible for our definition of a contact. The probability mass function of this distribution is the integral across these ranges of a lognormal distribution with parameters ${\mu}_{k}$ and $\tau $, parameterised such that the mean of the distribution is $N{C}_{k}$:
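A probability mass function of this kind can be sketched as follows. Several details are assumptions made for illustration: each integer n from 0 to 10 is mapped to the continuous range [n, n+1), the final interval is taken to start at 51 so that the ranges do not overlap, $\tau $ is treated as the standard deviation on the log scale, and the mass is renormalised over values below 1000. The mean parameterisation uses the standard lognormal identity that the mean equals exp(μ + τ²/2).

```python
import math


def lognormal_cdf(x, mu, tau):
    """CDF of a lognormal with log-scale mean mu and log-scale sd tau."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (tau * math.sqrt(2.0))))


# Integers 0..10, then the censoring intervals 11-20, 21-50 and 51-999,
# each expressed as a half-open continuous range [lower, upper).
BINS = [(n, n + 1) for n in range(11)] + [(11, 21), (21, 51), (51, 1000)]


def contact_pmf(mean_contacts, tau):
    """Probability of each response category given the expected contacts."""
    # Choose mu so that the (untruncated) lognormal mean is mean_contacts.
    mu = math.log(mean_contacts) - 0.5 * tau ** 2
    total = lognormal_cdf(1000, mu, tau)  # exclude implausible responses
    return [
        (lognormal_cdf(hi, mu, tau) - lognormal_cdf(lo, mu, tau)) / total
        for lo, hi in BINS
    ]
```

Integrating over each interval, rather than evaluating a density at the reported value, is what allows heaped responses such as 20 or 50 to contribute without being treated as exact counts.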
We incorporate mobility data into transmission potential in a two-stage process. In the first stage, non-household contact rates are modelled using mobility and survey data. The posterior mean of the modelled non-household contact rate in each jurisdiction over time is then incorporated in the transmission potential model as a fixed (i.e. ‘data’) time series without propagation of posterior uncertainty. Uncertainty in the macrodistancing model could be propagated through to the TP model by estimating both parts in a single joint model. However, this would be computationally very burdensome, and long run times would reduce the utility of the transmission potential model for routine situational assessment. Moreover, because uncertainty in both the macrodistancing and transmission potential time series is homoscedastic (the posterior variance is more or less constant over time in each state), propagation of the uncertainty in the macrodistancing model is unlikely to have a material effect on estimation of the TP time series.
Precautionary microbehaviour model
Unlike with macrodistancing behaviour and contact rates, there is no simple mathematical framework linking change in precautionary microbehaviours to changes in non-household transmission probabilities. We must therefore estimate the effect of precautionary microbehaviour on transmission via case data. We implicitly assume that any reduction in local-to-local transmission potential that is not explained by changes to the numbers of non-household contacts, the duration of household contacts, or improved disease surveillance is explained by the effect of precautionary microbehaviour on non-household transmission probabilities.
Whilst it is not necessary to use ancillary data to estimate the effect that precautionary microbehaviour has at its peak, we use behavioural survey data to estimate the temporal trend in precautionary microbehaviour, in order to estimate to what extent adoption of that behaviour has waned and how that has affected transmission potential.
We therefore model ${\gamma}_{t}$ (a time-varying index of change relative to baseline of transmission probability per non-household contact, see Equation (12)), as a function of the proportion of the population adhering to precautionary microbehaviours. We consider adherence to the ‘1.5 m rule’ as indicative of this broader suite of behaviours due to the availability of data on this behaviour in a series of weekly behavioural surveys beginning prior to the last distancing restriction being implemented (Department of the Prime Minister and Cabinet, 2020). We consider the number ${m}_{i,t}^{+}$ of respondents in region $i$ on the survey wave commencing at time $t$ replying that they ‘always’ keep 1.5 m distance from non-household members, as a binomial sample with sample size ${m}_{i,t}$. We use a generalised additive model to estimate ${c}_{i}(t)$, the proportion of the population in region $i$ responding that they always comply, as a function of the intervention stage, smoothed over time. Intervention stages are defined as periods of a continuous state of stay-at-home order, and this state thus switches each time a stay-at-home order is started, ended, or significantly changed. This state switching allows the model to react to sudden changes in compliance behaviour when orders are made or rescinded. We assume that the temporal pattern in the initial rate of adoption of the behaviour is the same as for macrodistancing behaviours, that is, the adoption curve estimated from the mobility model. In other words, we assume that all macrodistancing and precautionary microbehaviours were adopted simultaneously around the time the first population-wide restrictions were put in place in March and April 2020. However, we do not assume that these behaviours peaked at the same time or subsequently followed the same temporal trend. The model for the proportion complying with this behaviour is therefore:
where ${\zeta}_{i,j}$ is intervention state $j$ in region $i$, and $s$ is a smoothing function over time $t$.
Given ${c}_{i}(t)$, we model ${\gamma}_{i}(t)$ as a function of the degree of precautionary microbehaviour relative to the peak:
where ${\kappa}_{i}$ is the peak of compliance, or maximum of ${c}_{i}(t)$, and $\beta $ is inferred from case data in the main ${R}_{\mathrm{eff}}$ model.
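One simple form consistent with this description is a linear reduction in per-contact transmission probability that reaches its maximum, $\beta $, when compliance is at its peak. The sketch below assumes that linear form for illustration; the parameter values in any call are hypothetical.

```python
def microbehaviour_index(compliance, peak_compliance, beta):
    """Index of per-contact transmission probability relative to baseline.

    Assumed form: 1 - beta * (compliance / peak_compliance), so the index
    is 1 with no compliance and 1 - beta at peak compliance.
    """
    if not 0.0 < peak_compliance <= 1.0:
        raise ValueError("peak compliance must lie in (0, 1]")
    if not 0.0 <= compliance <= peak_compliance:
        raise ValueError("compliance must lie between 0 and its peak")
    return 1.0 - beta * (compliance / peak_compliance)
```

Scaling by the peak separates the size of the effect at peak adherence, which is informed by case data through $\beta $, from its waning over time, which is informed by the survey trend.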
Overseas quarantine model
We model the effect of overseas quarantine $Q(t)$ via a monotone decreasing step function with values constrained to the unit interval, and with steps at the known dates ${\tau}_{1}$ and ${\tau}_{2}$ of changes in quarantine policy:
where ${q}_{1}>{q}_{2}>{q}_{3}$ and all parameters are constrained to the unit interval.
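A monotone decreasing step function of this kind can be sketched as below. The dates and values used in any call are hypothetical, and the choice that each new value applies from the change date onwards (rather than from the following day) is an assumption.

```python
from datetime import date


def quarantine_effect(t, tau1, tau2, q1, q2, q3):
    """Step function Q(t) with steps at the policy change dates tau1 < tau2."""
    if not tau1 < tau2:
        raise ValueError("change dates must be ordered")
    if not 1.0 >= q1 > q2 > q3 >= 0.0:
        raise ValueError("values must be decreasing within the unit interval")
    if t < tau1:
        return q1
    if t < tau2:
        return q2
    return q3
```

For example, `quarantine_effect(date(2020, 4, 1), date(2020, 3, 15), date(2020, 3, 28), 0.9, 0.4, 0.05)` falls in the final step and returns the hypothetical value 0.05.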
Error models
The correlated time series of deviance between transmission potential and the effective reproduction number for local-to-local transmission in each region ${\epsilon}_{i}(t)$ is modelled as a zero-mean Gaussian process (GP) with covariance structure reflecting temporal correlation in errors within each region, but independent between regions. We use a Matern 5/2 covariance function $k$, enabling a mixture of relatively smooth trends and local ‘roughness’ to represent the sudden rapid growth of cases that can occur with a high-transmission cluster. Kernel parameters $\sigma $ and $l$ are the same across regions:
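For reference, the standard Matern 5/2 kernel can be written as a short function. The sketch below treats $\sigma $ as the marginal standard deviation (so that the variance at zero lag is σ²), which is an assumption about the parameterisation; the numerical values in any call are hypothetical.

```python
import math


def matern52(t1, t2, sigma, lengthscale):
    """Matern 5/2 covariance between two time points.

    k(r) = sigma^2 * (1 + sqrt(5) r / l + 5 r^2 / (3 l^2)) * exp(-sqrt(5) r / l)
    """
    r = abs(t1 - t2) / lengthscale
    s = math.sqrt(5.0) * r
    return sigma ** 2 * (1.0 + s + 5.0 * r * r / 3.0) * math.exp(-s)


def covariance_matrix(times, sigma, lengthscale):
    """Within-region covariance matrix; regions are modelled independently."""
    return [[matern52(a, b, sigma, lengthscale) for b in times] for a in times]
```

Sample paths from a Matern 5/2 process are twice differentiable, so they are smoother than those of a Matern 3/2 process but rougher than those of a squared-exponential kernel, which matches the stated aim of allowing both smooth trends and abrupt local changes.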
Components of local transmission potential
We model the rate of transmission from locally acquired cases as a combination of the time-varying mechanistic model of transmission rates ${R}_{i}^{*}(t)$, and a temporally correlated error term ${e}^{{\epsilon}_{i}(t)}$. This structure enables inference of mechanistically interpretable parameters whilst also ensuring that statistical properties of the observed data are represented by the model. Moreover, these two parts of the model can also be interpreted in epidemiological terms as two different components of transmission rates:
Component 1 (TP) – transmission rates averaged over the whole state population, representing how macrodistancing, precautionary microbehaviours, and other factors affect the potential for widespread community transmission (${R}_{i}^{*}(t)$), and
Component 2 (C2) – the degree to which the transmission rates of the population of current active cases deviate from the average statewide transmission rate (${e}^{{\epsilon}_{i}(t)}$).
Component 2 reflects the fact that the population of current active cases in each state at a given time will not be representative of the statewide population, and its transmission rate may be either higher (e.g. when cases arise from a cluster in a high-transmission environment) or lower (e.g. when clusters are brought under control and cases placed in isolation) than the statewide average.
Component 1 (TP) can therefore be interpreted as the expected rate of transmission if cases were widespread (population-representative) in the community. The product of Components 1 and 2 (${R}_{\mathrm{eff}}$) can be interpreted as the rate of transmission in the subpopulation making up active cases at a given time.
Where a state has active cases in one or more clusters, the combination of these components gives the apparent rate of transmission in those clusters (${R}_{\mathrm{eff}}$), given by Equation 10. This reflects the interpretation that TP captures the population mean of a distribution over individual-level reproduction numbers, and ${R}_{\mathrm{eff}}$ is the mean of a (non-random) sample from that distribution: the population comprising cases at that point in time. While not used in the public health context in Australia, the epidemiological interpretation of the ${R}_{\mathrm{eff}}$ when a state has no active cases is the rate of spread expected if an index case were to occur in a random subpopulation. Because the amplitude of this error term is learned from the data, this is informative as to the range of plausible rates of spread that might be expected from a case being introduced into a random subpopulation. However, the mean of this distribution, TP, may play a similar role and has proven to be a more interpretable quantity for end users of this model.
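The relationship between the two components can be illustrated numerically. The sketch below multiplies TP by the exponentiated error term and, as an assumption for illustration, uses the marginal standard deviation of the Gaussian process to give a central 95% range for the rate of spread from a case introduced into a random subpopulation. The function names and any values passed to them are hypothetical.

```python
import math


def effective_reproduction_number(tp, epsilon):
    """Product of Component 1 (TP) and Component 2 (exp of the error term)."""
    return tp * math.exp(epsilon)


def plausible_range(tp, gp_sd, z=1.96):
    """Central 95% range implied by a zero-mean Gaussian error with sd gp_sd."""
    return tp * math.exp(-z * gp_sd), tp * math.exp(z * gp_sd)
```

When the error term is zero the two quantities coincide; a positive error term corresponds to active cases transmitting more than the statewide average, and a negative one to less.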
Parameter values and prior distributions
The parameters of the generation interval distribution are the posterior mean parameter estimates corresponding to a lognormal distribution over the serial interval estimated by Nishiura et al., 2020. The shape of the generation interval distribution for SARS-CoV-2 in comparable populations is not well understood, and a number of alternative distributions have been suggested by other analyses. A sensitivity analysis performed by running the model with alternative generation interval distributions (not presented here) showed that parameter estimates were fairly consistent between these scenarios, and the main findings were unaffected. A full, formal analysis of sensitivity to this and other assumptions will be presented in a future publication.
No ancillary data are available to inform $p$, the probability of transmission per hour of contact in the absence of distancing behaviour. However, at $t=0$, holding $H{C}_{0}$, $N{C}_{0}$, $H{D}_{0}$, and $N{D}_{0}$ constant, there is a deterministic relationship between $p$ and ${R}_{i}^{*}(0)$ (the basic reproduction number, which is the same for all states). The parameter $p$ is therefore identifiable from transmission rates at the beginning of the first epidemic wave in Australia. We define a prior on $p$ that corresponds to a prior over ${R}_{i}^{*}(0)$ matching the averages of the posterior means and 95% credible intervals for 11 European countries as estimated by Flaxman et al., 2020 in a sensitivity analysis where the mean generation interval was 5 days, similar to the serial interval distribution assumed here. This corresponds to a prior mean of 2.79 and a standard deviation of 1.70 for ${R}_{i}^{*}(0)$. This prior distribution over $p$ was determined by a Monte Carlo moment-matching algorithm, integrating over the prior values for $H{C}_{0}$, $N{C}_{0}$, $H{D}_{0}$, and $N{D}_{0}$.
Model fitting
We fitted (separate) models of ${c}_{i}(t)$ and $N{C}_{0}{\delta}_{i}(t)$ to survey data alone in order to infer trends in those parameters as informed by survey data. These are shown in Figure 3. We used the posterior means of each of these model outputs as inputs into the ${R}_{\mathrm{eff}}$ model. The posterior variance of each of these quantities is largely consistent over time and between states, and the absolute effect of each is scaled by other parameters (e.g. $\beta $), meaning that uncertainty in these quantities is largely not identifiable from uncertainty in other scaling parameters. As a consequence, propagation of uncertainty in these parameters into the ${R}_{\mathrm{eff}}$ model (as was performed in a previous iteration of the model) has little impact on estimates of ${R}_{\mathrm{eff}}$ and transmission potential, so is avoided for computational brevity.
Inference was performed by Hamiltonian Monte Carlo using the R packages greta and greta.gp (Golding, 2019; Golding, 2020). Posterior samples of model parameters were generated by 10 independent chains of a Hamiltonian Monte Carlo sampler, each run for 1000 iterations after an initial, discarded, ‘warm-up’ period (1000 iterations per chain) during which the sampler step size and diagonal mass matrix were tuned, and the regions of highest density located. Convergence was assessed by visual inspection of chains, and by ensuring that the potential scale reduction factor for all parameters had values less than 1.1 and that there were at least 1000 effective samples for each parameter.
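The potential scale reduction factor used in this check can be illustrated with the classic Gelman–Rubin calculation for a single scalar parameter. This is a minimal sketch of the statistic itself, not the routine used in the analysis, which may differ in detail (for example by splitting chains or applying a degrees-of-freedom correction).

```python
from statistics import mean, variance


def potential_scale_reduction(chains):
    """Classic Gelman-Rubin statistic for one parameter.

    chains: list of equal-length lists of posterior draws, one per chain.
    """
    n = len(chains[0])
    if len(chains) < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length")
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])   # W: mean within-chain variance
    between = n * variance(chain_means)            # B: between-chain variance
    pooled = (n - 1) / n * within + between / n    # pooled variance estimate
    return (pooled / within) ** 0.5
```

Values close to one indicate that the chains agree with each other about the parameter's distribution; chains stuck in different regions inflate the between-chain term and push the statistic above the 1.1 threshold.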
Visual posterior predictive checks were performed to ensure that the observed data were consistent with the posterior predictive density over all cases (and survey results), and over time-varying case predictions within each state.
Code availability
Model code for performing the analyses and generating the figures is available at: https://github.com/goldingn/covid19_australia_interventions (copy archived at swh:1:rev:9fe78353a2ee6ab9c3b9ed35c1feea6935af769a; Golding, 2023).
Data availability
Datasets analysed and generated during this study are available at the following link: https://doi.org/10.26188/19517986.v1. For estimates of the time-varying effective reproduction number and transmission potential (Figure 2), the complete line-listed data within the Australian national COVID-19 database are not publicly available. However, we provide the cases per day by notification date and state (Data files 1 and 2). When these are supplemented with the estimated distribution of the delay from symptom onset to notification as in Figure 3D and H (provided in Data files 3 and 4) and with Data files 5–10, analyses of the time-varying effective reproduction number and transmission potential can be performed. Data files 5–10 contain the numerical data, output from each of the model components, used to generate Figure 3. For access to the raw data, a request must be submitted via NNDSS.datarequests@health.gov.au, which will be assessed by a data committee. Model code for performing the analyses and generating the figures is available at: https://github.com/goldingn/covid19_australia_interventions (copy archived at swh:1:rev:9fe78353a2ee6ab9c3b9ed35c1feea6935af769a).

figshare. Data files to support manuscript: A modelling approach to estimate the transmissibility of SARS-CoV-2 during periods of high, low and zero case incidence. https://doi.org/10.26188/19517986.v1
References

COVID-19, Australia: epidemiology report 12. Communicable Diseases Intelligence 24:44. https://doi.org/10.33321/cdi.2020.44.36

COVID-19, Australia: epidemiology report 17. Communicable Diseases Intelligence 24:44. https://doi.org/10.33321/cdi.2020.44.51

COVID-19, Australia: epidemiology report 47. Communicable Diseases Intelligence 45:47.

A new framework and software to estimate time-varying reproduction numbers during epidemics. American Journal of Epidemiology 178:1505–1512. https://doi.org/10.1093/aje/kwt133

Public attitudes, behaviors, and beliefs related to COVID-19, stay-at-home orders, nonessential business closures, and public health guidance: United States, New York City, and Los Angeles, May 5–12, 2020. MMWR. Morbidity and Mortality Weekly Report 69:751–758. https://doi.org/10.15585/mmwr.mm6924e1

Greta: simple and scalable statistical modelling in R. Journal of Open Source Software 4:1601. https://doi.org/10.21105/joss.01601

Software: Covid19_australia_interventions, version swh:1:rev:9fe78353a2ee6ab9c3b9ed35c1feea6935af769a. Software Heritage.

Practical considerations for measuring the effective reproductive number, Rt. PLOS Computational Biology 16:e1008409. https://doi.org/10.1371/journal.pcbi.1008409

Serial interval of novel coronavirus (COVID-19) infections. International Journal of Infectious Diseases 93:284–286. https://doi.org/10.1016/j.ijid.2020.02.060

Projecting social contact matrices in 152 countries using contact surveys and demographic data. PLOS Computational Biology 13:e1005697. https://doi.org/10.1371/journal.pcbi.1005697

Potential lessons from the Taiwan and New Zealand health responses to the COVID-19 pandemic. The Lancet Regional Health. Western Pacific 4:100044. https://doi.org/10.1016/j.lanwpc.2020.100044

Report: Commentary: The Delta variant has upended the East Asia COVID-19 model. Channel News Asia.
Article and author information
Funding
Australian Government
 Nick Golding
Australian Research Council (DE180100635)
 Nick Golding
National Health and Medical Research Council (GNT1170960)
 Jodie McVernon
National Health and Medical Research Council (GNT1117140)
 Jodie McVernon
National Health and Medical Research Council (2021/GNT2010051)
 Freya M Shearer
World Health Organization
 Nick Golding
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
Our analyses use surveillance data reported through the Communicable Diseases Network Australia (CDNA) as part of the nationally coordinated response to COVID-19. We thank public health staff from incident emergency operations centres in state and territory health departments, and the Australian Government Department of Health, along with state and territory public health laboratories. We thank members of CDNA for their feedback and perspectives on the results of the analyses. This work was directly funded by the Australian Government Department of Health Office of Health Protection. Additional support was provided by: the Australian Research Council (NG DECRA fellowship DE180100635); the National Health and Medical Research Council of Australia through its Centres of Research Excellence (SPECTRUM, GNT1170960) and Investigator Grant Schemes (JMcV Principal Research Fellowship, GNT1117140; FMS Emerging Leader Fellowship, 2021/GNT2010051); and through a research agreement with the World Health Organisation (Health Emergency Information & Risk Assessment, Health Emergencies Programme).
Ethics
The study was undertaken as urgent public health action to support Australia’s COVID-19 pandemic response. The study used data from the Australian National Notifiable Disease Surveillance System (NNDSS) provided to the Australian Government Department of Health under the National Health Security Agreement for the purposes of national communicable disease surveillance. Data from the NNDSS were supplied after de-identification to the investigator team for the purposes of provision of epidemiological advice to government. Contractual obligations established strict data protection protocols agreed between the University of Melbourne and sub-contractors and the Australian Government Department of Health, with oversight and approval for use in supporting Australia’s pandemic response and for publication provided by the data custodians represented by the Communicable Diseases Network of Australia. The ethics of the use of these data for these purposes, including publication, was agreed by the Department of Health with the Communicable Diseases Network of Australia.
Version history
 Preprint posted: November 29, 2021
 Received: February 22, 2022
 Accepted: January 16, 2023
 Accepted Manuscript published: January 20, 2023 (version 1)
 Version of Record published: March 8, 2023 (version 2)
Copyright
© 2023, Golding et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.