Predictors of human-infective RNA virus discovery in the United States, China, and Africa, an ecological study

  1. Feifei Zhang (corresponding author)
  2. Margo Chase-Topping
  3. Chuan-Guo Guo
  4. Mark EJ Woolhouse
  1. Usher Institute, University of Edinburgh, United Kingdom
  2. Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, United Kingdom
  3. Department of Medicine, Li Ka Shing Faculty of Medicine, University of Hong Kong, China

Abstract

Background:

The variation in the pathogen type as well as the spatial heterogeneity of predictors make the generality of any associations with pathogen discovery debatable. Our previous work confirmed that the association of a group of predictors differed across different types of RNA viruses, yet there have been no previous comparisons of the specific predictors for RNA virus discovery in different regions. The aim of the current study was to close the gap by investigating whether predictors of discovery rates within three regions—the United States, China, and Africa—differ from one another and from those at the global level.

Methods:

Based on a comprehensive list of human-infective RNA viruses, we collated published data on first discovery of each species in each region. We used a Poisson boosted regression tree (BRT) model to examine the relationship between virus discovery and 33 predictors representing climate, socio-economics, land use, and biodiversity across each region separately. The discovery probability in the three regions in 2010–2019 was mapped using the fitted models and historical predictors.

Results:

The numbers of human-infective virus species discovered in the United States, China, and Africa up to 2019 were 95, 80, and 107 respectively, with China lagging behind the other two regions. In each region, discoveries were clustered in hotspots. BRT modelling suggested that in all three regions RNA virus discovery was better predicted by land use and socio-economic variables than by climatic variables and biodiversity, although the relative importance of these predictors varied by region. Maps of virus discovery probability in 2010–2019 indicated several new hotspots outside historical high-risk areas. Most new virus species since 2010 in each region (6/6 in the United States, 19/19 in China, 12/19 in Africa) were discovered in high-risk areas as predicted by our model.

Conclusions:

The drivers of spatiotemporal variation in virus discovery rates vary in different regions of the world. Within regions virus discovery is driven mainly by land-use and socio-economic variables; climate and biodiversity variables are consistently less important predictors than at a global scale. Potential new discovery hotspots in 2010–2019 are identified. Results from the study could guide active surveillance for new human-infective viruses in local high-risk areas.

Funding:

FFZ is funded by the Darwin Trust of Edinburgh (https://darwintrust.bio.ed.ac.uk/). MEJW has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 874735 (VEO) (https://www.veo-europe.eu/).

Editor's evaluation

This study will be of interest to readers in the field of virus discovery. This study attempts to identify predictors of human-infective RNA virus discovery and predict high risk areas in a recent period in the United States, China, and Africa using an ecological modeling framework. The study has potential to inform future discovery efforts for human-infective viruses.

https://doi.org/10.7554/eLife.72123.sa0

Introduction

RNA viruses are the primary cause of emerging infectious diseases with epidemic potential, given that they have a high rate of evolution and a high capacity to adapt to new hosts (Woolhouse et al., 2016). In recent decades, infectious diseases caused by severe acute respiratory syndrome coronavirus (SARS-CoV), Middle East respiratory syndrome coronavirus (MERS-CoV), Bundibugyo Ebola virus and SARS-CoV-2 have presented major threats to the health and welfare of humans (Albariño et al., 2013; Ksiazek et al., 2003; Mackay and Arden, 2015; World Health Organisation, 2020). Detecting formerly unknown human-infective RNA viruses at the earliest stage after their emergence is essential for controlling the infections they cause. Early detection requires not only advanced diagnostic techniques (Lipkin and Firth, 2013) but, more importantly, an idea of where to look for new viruses (so-called hotspots) (Morse, 2012).

Socio-economic, environmental, and ecological factors related to both virus natural history and research effort have been found to affect the discovery of emerging RNA viruses (Jones et al., 2008; Morse, 2012; Rosenberg, 2015; Zhang et al., 2020). However, these factors are highly spatially heterogeneous, making the generality of any associations with discovery debatable. For example, the United States, China, and Africa have experienced different rates of socio-economic, environmental, and ecological change in the last one hundred years. The United States has always had better resources to discover new viruses. For instance, the Rockefeller Foundation, a U.S. foundation, supported the discovery of 23 arboviruses in Latin America, Africa, and India in 1951–1969 (Rosenberg et al., 2013). China has seen urban land coverage more than double and GDP per capita increase sevenfold since the 1980s (Ritchie, 2018; Roser, 2013). Nine out of 223 human-infective RNA viruses were originally discovered in China, all after 1982 (Zhang et al., 2020). In contrast, effective surveillance is challenging in less developed regions such as large parts of Africa, given resource constraints (Petti et al., 2006).

There have been no previous comparisons of the specific predictors for RNA virus discovery in different regions. In this study, we applied a methodology similar to that of our previous study of global patterns of discovery of human-infective RNA viruses (Zhang et al., 2020) to investigate whether predictors of discovery rates within three regions (the United States, China, and Africa) differ from one another and from those at the global level, using three new virus discovery data sets. We also mapped discovery probability in the three regions in 2010–2019 using the fitted models and historical predictors. According to findings from our previous study (Zhang et al., 2020), the main predictors for virus discovery at the global scale were GDP-related. This suggests that the patterns of virus discovery we identified may have been largely driven by research effort rather than the underlying biology. In this study, by focusing on more restricted and homogeneous regions where research effort is less variable, we expected to identify predictors more closely associated with virus biology.

Materials and methods

Data sets of human-infective RNA viruses in three regions

We performed an ecological study in which the unit of interest is each human-infective RNA virus species. With reference to a full list of human-infective RNA virus species (Zhang et al., 2020), we geocoded the first report of each species in humans in the United States, China, and Africa separately. The latest version of the list, as of 31 December 2019, included 223 species (Appendix 1—table 1), with Human torovirus abolished and a new species, Heartland banyangvirus, added by the International Committee on Taxonomy of Viruses (ICTV) in 2018 (International Committee on Taxonomy of Viruses, 2018). Data used in this study were not subsets of our previous global analysis; information on discovery locations and discovery dates for each virus species was re-collated for each specific geographical region.

We used the same search terms, databases, and inclusion and exclusion criteria as for our global data set (Woolhouse and Brierley, 2018). In each region, we established whether or not each virus species had been discovered in humans according to peer-reviewed literature. Reference databases included PubMed, Web of Science, Google Scholar, and Scopus. Two Chinese databases [i.e. China National Knowledge Infrastructure (CNKI) and Wanfang Data] were also searched when collecting data for China. Reference lists of relevant studies and reviews were also checked manually to find potential earlier discovery papers. The following key words were used for the retrieval: virus full name or abbreviations or virus synonyms; and human* or person* or case* or patient* or worker* or infection* or disease* or outbreak* or epidemic*; and region name (Chin* or Taiwan or Hong Kong or Macau; United States or US or USA or America*; Africa* or all African country names). Virus synonyms and abbreviations include early names used in the discovery paper and all subtypes provided by the ICTV online report (International Committee on Taxonomy of Viruses, 2018). Evidence from peer-reviewed literature was included if it met the following criteria: (a) diagnostic methods for RNA virus infection in humans were clearly described, through either viral isolation or serological methods; (b) the specific virus species name, or subtypes falling under that species, were clearly provided; (c) the infection was natural, iatrogenic, or occupational.
Evidence was excluded if it met any of the following criteria: (a) the species was uncertain due to cross-reactivity with related viruses; (b) diagnostic methods for virus infection were not specified; (c) only clinical symptoms or pathogenicity were described, which we did not consider evidence of human infection with a particular virus species; (d) the report described ‘[virus name]-like’ or ‘potential [virus name]’ infections; (e) the infection was intentional, including experimental inoculation or in vitro infection; (f) the source was non-peer-reviewed literature, including media reports, theses, or unpublished data. Literature selection was performed by two individuals independently and discrepancies were resolved by discussion with a third individual.
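The three-part key word strategy described above (virus terms AND infection terms AND region terms) can be sketched as a small query builder. This is an illustrative Python sketch only: the helper name build_query, the OR/AND syntax, and the example virus synonyms are assumptions, and each database has its own query syntax that would need adapting.

```python
# Hypothetical helper: join alternatives within a group with OR,
# and join the three groups with AND.
def build_query(virus_terms, infection_terms, region_terms):
    def any_of(terms):
        # Quote multi-word terms so they are searched as phrases.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        return "(" + " OR ".join(quoted) + ")"
    groups = (virus_terms, infection_terms, region_terms)
    return " AND ".join(any_of(g) for g in groups)

# Infection and region terms are taken from the search strategy in the text.
INFECTION_TERMS = ["human*", "person*", "case*", "patient*", "worker*",
                   "infection*", "disease*", "outbreak*", "epidemic*"]
CHINA_TERMS = ["Chin*", "Taiwan", "Hong Kong", "Macau"]

# Example virus names and abbreviation are illustrative assumptions.
query = build_query(["Heartland banyangvirus", "Heartland virus", "HRTV"],
                    INFECTION_TERMS, CHINA_TERMS)
print(query)
```

Keeping the term groups as separate lists makes it easy to reuse the same infection terms for all three regions and to swap in the region-specific list.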

We defined discovery location as where the initial human was exposed to or infected with the virus, as suggested in the first report of human infection in the peer-reviewed literature. All locations were geolocated as precisely as possible using methods from our previous paper (Zhang et al., 2020). For each region, a polygon was created for those locations at administrative level 3 (county for the United States; city for China; for Africa, it varies between countries) and above. Details of the data types in the virus discovery database for the three regions are summarised in Appendix 1—table 2. Although the majority of discovery locations in the United States and Africa involved point data and in China the majority involved polygon data at province level, the average number of grid cells per virus was similar in the three regions. A bootstrap resampling procedure was developed for polygon data covering more than one grid cell (details below). Discovery date of human infection was defined as the publication year in the scientific literature.

Spatial covariates

As for our global analysis (Zhang et al., 2020), a suite of global gridded climatic, socio-economic, land use, and biodiversity variables (n=33) postulated to affect the spatial distribution of RNA virus discovery were compiled, each at a resolution of 0.5°/30" (except university count, which has a resolution at country level for Africa and at state/province level for the United States and China). Of these, GDP, GDP growth, and university count were included to adjust for discovery effort, as they could partially explain the infrastructure and technology available for virus research (Zhang et al., 2020). We reviewed and tested strategies that researchers have previously used to adjust for discovery bias, including the frequency with which a country is listed as the address of authors in scientific papers and the frequency of publications for each pathogen in scientific databases (Jones et al., 2008; Olival et al., 2017), but the results were not encouraging, as the frequency of published papers from virus-related scientific journals is only weakly linked to the published count of novel human-infective RNA viruses (Appendix 2, Appendix 3—figure 1).

Data for the United States, China, and Africa were extracted by restricting the coordinates to each region. The definition, original resolution, and source of each variable were the same as in our previous paper (Zhang et al., 2020). All predictors were aggregated from their original spatial resolution to 1°×1° resolution; data for climatic variables, population, GDP, and land use without full temporal coverage were extrapolated back to 1901; in both cases we followed methods from our previous paper (Zhang et al., 2020).
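As an illustration of the aggregation step, the following Python sketch averages non-overlapping 2×2 blocks of a 0.5° grid to obtain a 1° grid. The function name aggregate_to_coarse and the use of a block mean are assumptions made for illustration; the text does not state the aggregation rule, and totals such as population would more naturally be summed than averaged.

```python
def aggregate_to_coarse(grid, factor=2):
    """Average non-overlapping factor x factor blocks of a 2-D grid.

    Missing cells are encoded as None and ignored; a block with no
    valid cells yields None.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, n_rows, factor):
        row = []
        for c in range(0, n_cols, factor):
            block = [grid[i][j]
                     for i in range(r, min(r + factor, n_rows))
                     for j in range(c, min(c + factor, n_cols))
                     if grid[i][j] is not None]
            row.append(sum(block) / len(block) if block else None)
        coarse.append(row)
    return coarse

# Toy 0.5-degree grid (2 rows x 4 columns) with one missing cell.
fine = [[1.0, 3.0, 10.0, None],
        [5.0, 7.0, 20.0, 30.0]]
coarse = aggregate_to_coarse(fine)
print(coarse)
```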

Boosted regression trees modelling

We used a Poisson boosted regression trees (BRT) model to examine the relationship between discovery of RNA viruses and 33 predictors for each 1° grid cell across each region separately, following code from our previous study (Zhang et al., 2020) and one previous paper (Allen et al., 2017). As a tree-based machine learning method, the BRT model can automatically capture complex relationships and interactions between variables, and can also account well for spatial autocorrelation within the data (Crase et al., 2012). We compared Moran’s I values of the raw virus data and the model residuals to estimate the ability of the BRT model to account for spatial autocorrelation (Cliff and Ord, 1981). To minimise the effect of spatial uncertainty in the virus discovery data, we performed bootstrap resampling 1000 times for those discovery locations reported as polygons. We assumed each grid cell in the polygon had an equal chance of being selected, and for each virus record we selected one grid cell randomly from the polygon for each subsample. Each subsample had a presence-to-absence ratio of 1:2; that is, for each grid cell with virus discovery, two grid cells with no discovery were randomly selected from ‘virus discovery free’ areas at all time points within the region. Taking the United States as an example, each subsample included 95 grid cells with virus discovery and 190 with no virus discovery. We then matched the virus data with all predictors by geographical coordinates and decade (using the nearest decade for time-varying predictors).
We assumed that the virus count in any given grid cell in each decade followed a Poisson distribution, and we calculated the virus discovery count in each grid cell by decade as the response variable. We also performed further sensitivity analyses by (i) matching virus discovery data and time-varying covariate data by year and (ii) testing for lag effects by matching virus discovery in year t with predictors in years t-1 to t-5 (Appendix 4).
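The resampling scheme described above can be sketched as follows. This is a minimal Python illustration, not the authors' R code: the function draw_subsample, the record layout (virus, decade, candidate cells), and the toy coordinates are all hypothetical. A point record is treated as a polygon with a single candidate cell.

```python
import random

def draw_subsample(discoveries, absence_pool, ratio=2, rng=random):
    """Draw one bootstrap subsample.

    discoveries: list of (virus, decade, candidate_cells); every cell of a
    polygon has an equal chance of being chosen.
    absence_pool: list of (cell, decade) pairs with no recorded discovery.
    Returns (presences, absences) with `ratio` absences per presence.
    """
    presences = [(virus, decade, rng.choice(cells))
                 for virus, decade, cells in discoveries]
    absences = rng.sample(absence_pool, ratio * len(presences))
    return presences, absences

rng = random.Random(42)  # fixed seed so the sketch is reproducible
discoveries = [
    ("virus A", 1950, [(38.5, -77.5)]),                                  # point record
    ("virus B", 1980, [(40.5, -74.5), (41.5, -74.5), (42.5, -75.5)]),    # polygon record
]
# Hypothetical discovery-free cells, crossed with two decades.
absence_pool = [((lat + 0.5, -100.5), decade)
                for lat in range(30, 40) for decade in (1950, 1980)]

replicates = [draw_subsample(discoveries, absence_pool, rng=rng)
              for _ in range(1000)]
print(replicates[0])
```

With 95 discovery records, as in the United States example, each replicate would contain 95 presences and 190 absences.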

All BRT models were fitted in R v. 3.6.3, using the packages dismo and gbm. BRT models require the user to balance three parameters: tree complexity, learning rate, and bag fraction. Tree complexity reflects the order of interaction in a tree; learning rate shrinks the contribution of each tree to the growing model; bag fraction specifies the proportion of data drawn from the full training data at each step. We set these parameters as recommended by Elith et al. (2008) and made sure each resampling model contained at least 1000 trees. The final optimal number of trees in each model was identified using a 10-fold cross-validation stagewise function (Elith et al., 2008). The three parameter values of the optimal model, as well as the mean optimal number of trees across 1000 replicate models for all three regions, are summarised in Appendix 1—table 3.
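To make the roles of learning rate and bag fraction concrete, here is a toy Poisson boosting loop in pure Python. It is a sketch under strong simplifying assumptions and is not the dismo/gbm implementation used in the study: tree complexity is fixed at 1 (single-split stumps), leaf values are a single Newton step, the parameter values are arbitrary, and the data are synthetic. All function names are hypothetical.

```python
import math
import random

def fit_stump(rows, residuals, mus):
    """Find the single split that best fits the residuals (least squares).
    Leaf values are one Newton step for the Poisson log link."""
    best = None
    for f in range(len(rows[0])):
        values = sorted(set(row[f] for row in rows))
        for lo, hi in zip(values, values[1:]):
            cut = (lo + hi) / 2.0
            sides = ([i for i, r in enumerate(rows) if r[f] <= cut],
                     [i for i, r in enumerate(rows) if r[f] > cut])
            sse = 0.0
            for idx in sides:
                mean = sum(residuals[i] for i in idx) / len(idx)
                sse += sum((residuals[i] - mean) ** 2 for i in idx)
            if best is None or sse < best[0]:
                leaves = [sum(residuals[i] for i in idx) / sum(mus[i] for i in idx)
                          for idx in sides]
                best = (sse, f, cut, leaves[0], leaves[1])
    return best[1:]

def poisson_boost(xs, ys, n_trees=200, learning_rate=0.05, bag_fraction=0.75, seed=1):
    rng = random.Random(seed)
    intercept = math.log(sum(ys) / len(ys))      # start from the log mean count
    scores = [intercept] * len(xs)
    trees = []
    for _ in range(n_trees):
        # Bag fraction: each tree sees a random subset of the training data.
        bag = rng.sample(range(len(xs)), int(bag_fraction * len(xs)))
        mus = [math.exp(scores[i]) for i in bag]
        residuals = [ys[i] - m for i, m in zip(bag, mus)]
        f, cut, left, right = fit_stump([xs[i] for i in bag], residuals, mus)
        # Learning rate: shrink each tree's contribution before adding it.
        left, right = learning_rate * left, learning_rate * right
        trees.append((f, cut, left, right))
        for i, row in enumerate(xs):
            scores[i] += left if row[f] <= cut else right
    return intercept, trees

def predict_count(model, row):
    intercept, trees = model
    return math.exp(intercept + sum(l if row[f] <= cut else r for f, cut, l, r in trees))

def poisson_deviance(ys, mus):
    total = 0.0
    for y, mu in zip(ys, mus):
        total += 2.0 * ((y * math.log(y / mu) if y > 0 else 0.0) - (y - mu))
    return total / len(ys)

# Synthetic data: counts depend on the first predictor only.
xs = [[i / 40.0, ((i * 7) % 40) / 40.0] for i in range(40)]
ys = [0 if x[0] < 0.5 else 1 if x[0] < 0.8 else 3 for x in xs]

model = poisson_boost(xs, ys)
null_dev = poisson_deviance(ys, [sum(ys) / len(ys)] * len(ys))
fit_dev = poisson_deviance(ys, [predict_count(model, x) for x in xs])
print(null_dev, fit_dev)
```

A smaller learning rate needs more trees to reach the same fit, which is why the number of trees is tuned jointly with the other parameters.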

From the 1000 replicate BRT models, we plotted relative contributions and partial dependence with 95% quantiles. We defined variables with a relative contribution greater than the mean (3.03%, i.e. 100% divided by 33 predictors) as influential predictors in all three regions (Shearer et al., 2018). The partial dependence plots depict the influence of each variable on the response while controlling for the average effects of all the other variables in the model. The map of virus discovery probability across each region in 2010–2019 was derived from the means of the predictions of 1000 replicate models, using values of the 33 predictors in 2015. To show discovery hotspots, we converted the prediction map of virus count to a map of probability.
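The influential-predictor rule and the count-to-probability conversion can be illustrated with a short Python sketch. The contribution values are those reported for the United States in the Results, plus one clearly hypothetical minor predictor. The conversion formula is an assumption: the text does not state how counts were converted, and the function below simply uses the Poisson probability of at least one discovery given an expected count.

```python
import math

N_PREDICTORS = 33
threshold = 100.0 / N_PREDICTORS   # mean relative contribution, in percent

# Relative contributions (%) reported for the United States in the Results,
# plus one hypothetical entry below the threshold for illustration.
us_contributions = {
    "urbanized land": 35.8,
    "GDP growth": 10.0,
    "urbanization of cropland": 8.0,
    "GDP": 5.7,
    "diurnal temperature change": 4.9,
    "growth of urbanized land": 4.1,
    "hypothetical minor predictor": 1.2,
}

influential = sorted(
    (name for name, share in us_contributions.items() if share > threshold),
    key=us_contributions.get, reverse=True)

def discovery_probability(expected_count):
    """Assumed conversion: P(at least one discovery) under a Poisson model."""
    return 1.0 - math.exp(-expected_count)

print(round(threshold, 2), influential)
print(discovery_probability(0.5))
```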

Two statistics were calculated to evaluate the model’s predictive performance: (a) the deviance of the bootstrap model (Elith et al., 2008), and (b) the intraclass correlation coefficient (ICC) calculated from 50 rounds of 10-fold cross-validation, following methods from our previous paper (Zhang et al., 2020). For the 10-fold cross-validation, we selected 50 data sets randomly from the 1000 bootstrapped subsamples. We took the first data set and partitioned it into 10 subsets. For each round of 10-fold cross-validation, each unique combination of nine subsets constituted a training set used to fit a model, and the remaining subset was used as the test set to evaluate the predictive performance of that model. We repeated the same process for the remaining 49 data sets. One ICC was calculated from each round of validation, and the median with 95% quantiles across all 50 rounds was calculated. The ICC varies between 0 and 1, with an ICC of less than 0.40 representing a poor model, 0.40–0.59 a fair model, 0.60–0.74 a good model, and 0.75–1 an excellent model (Cicchetti, 1994).
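The fold construction and the Cicchetti (1994) interpretation bands can be sketched in Python as below. The function names and the random seed are hypothetical, and the sketch deliberately stops short of computing the ICC itself, because the text does not specify which form of the ICC was used.

```python
import random

def ten_fold_indices(n, k=10, seed=0):
    """Shuffle row indices 0..n-1 and deal them into k roughly equal folds."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]

def train_test_splits(folds):
    """Yield (train, test) index lists: nine folds to fit, one held out."""
    for held_out in range(len(folds)):
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, folds[held_out]

def classify_icc(icc):
    """Interpretation bands from Cicchetti (1994), as quoted in the text."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"

# One United States subsample has 95 presences + 190 absences = 285 rows.
folds = ten_fold_indices(285)
splits = list(train_test_splits(folds))
print([len(test) for _, test in splits])
print(classify_icc(0.62))
```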

We performed exploratory subgroup analyses distinguishing viruses first discovered in each region from those that had already been discovered elsewhere in the world. We used the same BRT modelling approach as described above, and the relative contribution of each predictor was calculated for each subgroup. We were unable to perform the subgroup analysis for China because only nine human-infective RNA viruses were first discovered there, and the BRT model cannot be fitted to a sample as small as nine.

R software, version 3.6.3 (R Foundation for Statistical Computing, Vienna, Austria) was used for all statistical analyses. All maps were visualised by using ArcGIS Desktop 10.5.1 (Environmental Systems Research Institute).

Results

The numbers of human-infective virus species discovered in the United States, China, and Africa up to October 2019 were 95, 80, and 107, respectively (Appendix 1—table 1). Most first discoveries have been in the eastern United States (especially in areas around Maryland, Washington, D.C., and New York), eastern China (developed cities including Beijing, Hong Kong, Shanghai, and Guangzhou), and southern and central Africa (Pretoria and Johannesburg, South Africa; Borno State and Ibadan, Nigeria) (Figure 1). A total of 60 virus species were reported in all three regions, and 27, 12, and 37 species were found only in the United States, China, and Africa, respectively (Figure 2). In all three regions, smaller proportions of viruses were vector-borne [United States: 23.2% (22/95); China: 21.3% (17/80); Africa: 27.1% (29/107)] and strictly zoonotic [United States: 30.5% (29/95); China: 16.3% (13/80); Africa: 33.6% (36/107)], compared with the larger proportions of both virus types at the global scale [vector-borne: 41.7% (93/223) and strictly zoonotic: 58.7% (131/223)] (Figure 2). Vector-borne and strictly zoonotic viruses were also under-represented among the 60 shared species [11.7% (7/60) and 6.7% (4/60), respectively; Figure 2].

Figure 1. Spatial distribution of human-infective RNA virus discovery in three regions, 1901–2019.

(A) United States. (B) China. (C) Africa. Red dots represent discovery points or centroids of polygons, with the size representing the cumulative virus species count.

Figure 2. Shared human-infective RNA virus species count in three regions.

Below each species count, the ratios of vector-borne (V) to non-vector-borne (N) viruses and of strictly zoonotic (Z) to human transmissible (T) viruses are shown.

The discovery curves for the United States and Africa follow a broadly similar pattern, with China lagging behind these two regions (Figure 3). The median time lag between the original discovery year of each virus in the world and its discovery year in each region was 0 [interquartile range (IQR): 2.5], 12 (IQR: 29.5), and 2 (IQR: 10.5) years in the United States, China, and Africa, respectively (Appendix 3—figure 2). In China, the time lag was noticeably shorter for viruses discovered after 1975 [before 1975: a median lag of 30.5 (IQR: 30.5) years; after 1975: 2.5 (IQR: 7) years; Wilcoxon rank sum test, p<0.001].

Figure 3. Discovery curve of human-infective RNA virus species in three regions and the world.

In the United States, six variables including three predictors related to land use [urbanized land: relative contribution of 35.8%, urbanization of cropland (i.e. the percentage of land area change from cropland to urban land): 8.0%, growth of urbanized land: 4.1%], two socio-economic variables (GDP growth: 10.0%; GDP: 5.7%), and one climatic variable (diurnal temperature change: 4.9%) were identified as important predictors for discriminating between locations with and without virus discovery (Figure 4A). The partial dependence plots shown in Appendix 3—figure 3 suggested non-linear relationships between the probability of virus discovery and most predictors. All important predictors presented a positive trend over narrow ranges at lower values.

Figure 4. Relative contribution of predictors to human-infective RNA virus discovery in three regions.

(A) United States. (B) China. (C) Africa. The boxplots show the median (black bar) and interquartile range (box) of the relative contribution across 1000 replicate boosted regression tree models, with whiskers indicating minimum and maximum and black dots indicating outliers.

In China, twelve variables including four socio-economic variables (GDP: 12.7%, university count: 7.5%, GDP growth: 4.6%, population growth: 4.4%), five predictors involving land use [pasture: 8.3%, urbanized land: 8.1%, vegetation: 5.8%, cropland: 5.3%, urbanization of secondary land (the percentage of land area change from secondary land to urban land; secondary land is natural vegetation that is recovering from previous human disturbance): 3.3%], and three climatic variables (maximum precipitation: 4.5%, precipitation change: 3.8%, diurnal temperature range: 3.3%) were identified as important predictors for discriminating between locations with and without virus discovery (Figure 4B). GDP, urbanized land, university count, vegetation, GDP growth, maximum precipitation, population growth, and urbanization of secondary land presented a positive trend over narrow ranges at lower levels; pasture, cropland, precipitation change, and diurnal temperature range had non-monotonic or negative effects, with the highest risks at lower values (Appendix 3—figure 4).

In Africa, ten variables including two socio-economic variables (GDP growth: 21.2%, GDP: 13.0%), seven predictors related to land use (urbanized land: 9.4%, growth of cropland area: 5.6%, urbanization of cropland: 5.5%, growth of urbanized land: 5.1%, urbanization of pasture: 3.8%, vegetation: 3.7%, cropland: 3.2%), and one biodiversity variable (mammal species richness: 3.1%) were identified as important predictors for discriminating between locations with and without virus discovery (Figure 4C). All important predictors presented a positive trend over narrow ranges at lower positive values, except mammal species richness, which presented a positive trend over a large range (Appendix 3—figure 5).

Our BRT models reduced Moran’s I value below 0.15 in all three regions (Appendix 3—figure 6), suggesting that BRT models with 33 predictors have adequately accounted for spatial autocorrelations in the raw virus data in all three regions. The model validation statistics for each region are shown in Appendix 1—table 4. Combining these measures, our BRT model predictions range from fair to good (Cicchetti, 1994). In our sensitivity analyses based on data matched by year (Appendix 3—figure 7) and 1–5 year lag (results of 1 year lag shown in Appendix 3—figure 8), though there were several changes of relative contribution, the top predictors were broadly consistent with our main model based on data matched by decade (Figure 4).
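For readers unfamiliar with Moran’s I, the following Python sketch computes it for a toy row of six grid cells. The binary neighbour weights (adjacent cells only) and the toy values are assumptions for illustration; the text does not state which spatial weights were used. Values near +1 indicate clustering, values near 0 indicate little spatial structure, and negative values indicate a checkerboard-like pattern.

```python
def morans_i(values, weights):
    """Moran's I = (n / sum of weights) * sum_ij w_ij d_i d_j / sum_i d_i^2,
    where d_i is the deviation of cell i from the mean."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)

# Six cells in a row; each cell is a neighbour of the cells next to it.
n_cells = 6
weights = [[1 if abs(i - j) == 1 else 0 for j in range(n_cells)]
           for i in range(n_cells)]

clustered = [3, 3, 3, 0, 0, 0]      # discoveries concentrated at one end
alternating = [3, 0, 3, 0, 3, 0]    # discoveries spread evenly

print(morans_i(clustered, weights))
print(morans_i(alternating, weights))
```

Applied to model residuals rather than raw counts, a value close to zero suggests that the predictors have absorbed most of the spatial clustering.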

In comparison with the whole world, human-infective RNA virus discovery was more associated with land use and socio-economic variables than climatic variables and biodiversity in all three regions (Figure 5). The comparison of four groups of predictors between three regions showed that: the greatest contribution of climatic variables to the discovery of human-infective RNA viruses was in China; the greatest contribution of land use was in the United States; the greatest contribution of socio-economic variables and biodiversity was in Africa and least in the United States.

Figure 5. Cumulative relative contribution of predictors to human-infective RNA virus discovery by group in each model of different regions.

The relative contributions of all explanatory factors sum to 100% in each model, and each colour represents the cumulative relative contribution of all explanatory factors within each group.

We mapped human-infective RNA virus discovery probability in 2010–2019 for the three regions, based on the fitted BRT models and values of all 33 predictors in 2015 (Appendix 3—figure 9 to Appendix 3—figure 11). Outside contemporary risk areas where human-infective RNA viruses were previously discovered in the United States (Figure 1A), we predicted high probabilities of virus discovery across southern Michigan, central North Carolina, central Oklahoma, southern Nevada, and north-eastern Utah (Figure 6A). Outside contemporary risk areas where human-infective RNA viruses were previously discovered in China (Figure 1B), we predicted high probabilities of virus discovery across other areas of eastern China as well as two western areas, south-central Shaanxi and north-eastern Sichuan (Figure 6B). Outside contemporary risk areas where human-infective RNA viruses were previously discovered in Africa (Figure 1C), we predicted high probabilities of virus discovery across northern Morocco, northern Algeria, northern Libya, south-eastern Sudan, central Ethiopia and western Democratic Republic of the Congo (Figure 6C). Most new virus species since 2010 in each region (6/6 in the United States, 19/19 in China, 12/19 in Africa) were discovered in high-risk areas (above the 85th percentile of predicted probability across each region) as predicted by our model. Of the 37 viruses (United States: 6; China: 19; Africa: 12) discovered in high-risk areas in 2010–2019, 13 (United States: 2; China: 7; Africa: 4) were discovered in potential new hotspots where no virus had been discovered before 2010.
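The high-risk classification can be illustrated with a short Python sketch that thresholds predicted probabilities at the 85th percentile and counts how many discovery sites fall at or above the cut-off. The function names, the linear-interpolation percentile, the "at or above" rule, and the toy probabilities are assumptions; the text only states that high-risk areas correspond to the 85th percentile of predicted probability in each region.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q between 0 and 100)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)

def share_in_high_risk(predicted, discovery_cells, q=85):
    """Count discovery cells whose predicted probability reaches the cut-off.

    predicted: dict mapping cell id -> predicted discovery probability.
    discovery_cells: cell ids where new viruses were actually discovered.
    """
    cutoff = percentile(list(predicted.values()), q)
    hits = sum(1 for cell in discovery_cells if predicted[cell] >= cutoff)
    return hits, len(discovery_cells), cutoff

# Toy region of 20 cells with probabilities 0.00, 0.05, ..., 0.95.
predicted = {cell: cell / 20.0 for cell in range(20)}
hits, total, cutoff = share_in_high_risk(predicted, [19, 18, 17, 5])
print(hits, total, cutoff)
```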

Figure 6. Predicted probability of human-infective RNA virus discovery in three regions in 2010–2019.

(A) United States. (B) China. (C) Africa. The triangles represent the actual discovery sites from 2010 to 2019, and the background colour represents the predicted discovery probability.

In our subgroup analysis distinguishing viruses first discovered in each region from those that had already been discovered elsewhere in the world, discoveries of human-infective RNA viruses first discovered in either the United States or Africa were better predicted by climatic and biodiversity variables, while discoveries of viruses that had already been discovered elsewhere in the world were better predicted by socio-economic variables (Appendix 3—figure 12).

Discussion

To our knowledge, this analysis represents the first investigation of human-infective RNA virus discovery in three large regions of the world which have experienced distinct socio-economic, ecological and environmental changes over the last 100 years. In total, 95 human-infective RNA virus species had been found in the United States; 80 in China; 107 in Africa. The discovery maps of human-infective RNA virus in the three regions indicated areas with historically high discovery counts: eastern and western United States, eastern China, and central and southern Africa. BRT modelling suggested that the relative contribution of 33 predictors to human-infective RNA virus discovery varied across three regions, though climatic and biodiversity variables were consistently less important in all three regions than at a global scale. We mapped the probability of human-infective RNA virus discovery in 2010–2019 which would continue to be high in historical hotspots but, in addition, we identified several new hotspots in central-eastern and southwestern United States, eastern and western China, and northern Africa. These results offer a tool for public health practitioners and policymakers to better understand local patterns of virus discovery and to invest efficiently in surveillance systems at the local level.

In recent decades, factors that drive pathogen discovery have been comprehensively studied (e.g. Morse, 2012). In general, evidence has come from three forms of analysis: analysis of single emergence events such as SARS, AIDS, and Ebola (Parrish et al., 2008); quantification of the spillover (or host switching/cross-host transmission) risk using traits of both hosts and viruses (Kreuder Johnson et al., 2015; Olival et al., 2017; Pulliam and Dushoff, 2009); and records of first emergence/discovery events in humans globally over time (Allen et al., 2017; Jones et al., 2008; Zhang et al., 2020). Of these, the last form of analysis has linked the distribution of emerging infectious diseases across the globe to ecological, environmental, and socio-economic factors, predicted the high-risk areas for discovery of emerging zoonoses, and helped identify priority regions for investment in surveillance systems for new human viruses (Allen et al., 2017; Jones et al., 2008; Zhang et al., 2020). In addition to these analyses, our current regional analyses identified more precise hotspots for virus discovery in three large regions of the world. Because zoonotic viruses are responsible for most historical endemic and epidemic diseases, several projects such as the Global Virome Project (GVP), the PREDICT project, and the Vietnam Initiative on Zoonotic Infections (VIZIONS) were launched to construct a comprehensive data set of unknown viruses with epidemic potential from specific animals likely to harbour high-risk viruses, humans having a high contact rate with animals, and animal-human interfaces with high spill-over probability (Carroll et al., 2018; Morse, 2012; Rabaa, 2015). Our hotspot analyses indicate priority regions where these projects could focus surveillance for new viruses.

In all three regions, GDP and/or GDP growth were identified as important predictors for virus discovery. This is consistent with our previous finding that GDP and GDP growth play a major role in discovering viruses (Zhang et al., 2020). In general, sufficient economic, human and material resources, the availability of advanced infrastructure and technology, and greater research capabilities in relatively higher-income areas enable virus discovery (Rosenberg et al., 2013). That this effect applied both within one continent and within single countries such as the United States and China suggests that most virus discoveries were likely passive, that is, the viruses were detected when they arrived in a location with the resources to detect them. This is plausible because in all regions in our study, human-transmissible viruses accounted for the larger proportion, and our previous analysis suggested richer areas were more likely to first capture transmissible viruses (e.g. Influenza virus, Rhinovirus, Rabies lyssavirus, Measles morbillivirus, Mumps orthorubulavirus, Rubella virus, and Norwalk virus) capable of spreading to multiple areas (Zhang et al., 2020). Temporally, in China the rate of discovery increased after economic growth accelerated in the 1980s (Figure 3). We note from publications describing first virus discoveries that most historical virus discoveries in Africa received support from the United States and Europe, and this may explain why Africa saw an increased number of virus discoveries after 1950, 30 years earlier than China (Figure 3). Notably, in contrast to Africa, university count was found to be associated with virus discovery in China, suggesting that virus discovery is likely a significant area of research in Chinese universities. Our model also suggested that socio-economic factors overall contributed less in the United States than in the other two regions. A possible explanation is that the socio-economic level across the whole United States is relatively high and homogeneous.

Predictors other than GDP and university count are likely to be linked to virus natural history. In all three regions, the area of urban land and further urbanization made a large contribution to virus discovery. This reinforces previous studies showing that urbanization is linked to the detection of new human pathogens through denser urban populations, increased human-wildlife contact rates, spill-over of human infection from enzootic cycles, and contamination of the urban environment with microbial agents (Hassell et al., 2017; Olival et al., 2017; Weaver, 2013). In the United States, land use contributed more to virus discovery than in the other regions: urbanized land, urbanization of cropland, and growth of urbanized land alone had a relative contribution of 47.9%. It is possible that land use change in the US is driving both the emergence of novel viruses and their discovery, as has been suggested for Heartland virus (Mansfield et al., 2017; Savage et al., 2013) and several hantaviruses (Hassell et al., 2017).

Climate had less influence on human-infective RNA virus discovery in all three regions than other predictors, in contrast to virus discovery at a global scale (Zhang et al., 2020). The underlying reason may be that the proportion of vector-borne viruses, whose distribution and abundance are strongly associated with the impact of climate on vector populations (Li et al., 2014), was lower in all three regions (United States: 23.2%; China: 21.3%; Africa: 27.1%) than in the world as a whole (41.7%) (Figure 3). Vector-borne viruses tend to have more restricted global ranges, so are less likely to appear in a study of any one region (Zhang et al., 2020).

In addition, the smaller proportion of strictly zoonotic viruses in the three regions (United States: 30.5%; China: 16.3%; Africa: 33.6%) than in the world as a whole (58.7%) (Figure 2) meant that biodiversity contributed less to virus discovery in the three regions than globally (Zhang et al., 2020). Exposure to a higher density of mammals played a slightly larger role in virus discovery in Africa than in China and the United States (Appendix 3—figure 9 to Appendix 3—figure 11).

Our discovery probability maps for 2010–2019 in the three regions captured most historical hotspots, though several small new areas in the central-eastern and southwestern United States, eastern and western China, and northern Africa were also predicted to make a greater contribution to virus discovery (Figure 6). Our model has good predictive ability, given that 84% (37/44) of new virus species in 2010–2019 were discovered in the high-risk areas we defined using the 85th percentile of discovery probability within each region. Further, 35% (13/37) of the viruses discovered in high-risk areas since 2010 were discovered in potential new hotspots where there had not been any virus discoveries in the past.
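The hit-rate calculation behind these percentages can be sketched as follows. This is a minimal illustration in Python, not the authors' code: the nearest-rank percentile rule, the "at or above the threshold" convention, and the toy grid-cell values are all assumptions made for the example. Only the regional counts (6/6, 19/19, 12/19) and the 13/37 figure come from the paper.

```python
from math import ceil

def percentile_threshold(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of values <= it."""
    ordered = sorted(values)
    rank = max(1, ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def hit_rate(cell_probability, discovery_cells, pct=85):
    """Share of discoveries in cells at or above the pct-th percentile (assumed rule)."""
    threshold = percentile_threshold(cell_probability.values(), pct)
    hits = sum(1 for cell in discovery_cells if cell_probability[cell] >= threshold)
    return hits / len(discovery_cells)

# Counts reported in the paper: (discoveries in high-risk areas, all new discoveries)
reported = {"United States": (6, 6), "China": (19, 19), "Africa": (12, 19)}
hits = sum(h for h, _ in reported.values())
total = sum(t for _, t in reported.values())
overall = hits / total        # 37 / 44
new_hotspot_share = 13 / 37   # discoveries in previously undiscovered hotspots

# Hypothetical toy region: 20 grid cells with made-up probabilities 0.01 .. 0.20
toy_probability = {cell: (cell + 1) / 100 for cell in range(20)}
toy_discoveries = [19, 18, 17, 5]  # hypothetical cells where new viruses were found
toy_rate = hit_rate(toy_probability, toy_discoveries)

print(f"{hits}/{total} = {overall:.0%}; new hotspots: {new_hotspot_share:.0%}")
```

Pooling the three regional counts reproduces the reported 37/44, and the toy region shows how a per-region threshold turns a probability surface into a set of high-risk cells.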

Our subgroup analyses, which distinguished viruses first discovered in a region from those that had already been discovered elsewhere in the world, suggested that in both the United States and Africa, discoveries of viruses first discovered in the region were more likely to be associated with climatic and biodiversity variables, while discoveries of viruses that had already been discovered elsewhere were more likely to be associated with socio-economic variables. This is plausible, again because after a novel virus has been discovered elsewhere in the world, it is usually areas with a higher socio-economic level that first capture the virus in the local region.

This study had limitations. First, one common problem for data collected from literature review is the time lag between virus discovery and publication, in which case the virus data are likely to be matched to covariates from later decades. Second, we acknowledge that we may not have identified the earliest report for some well-known viruses such as yellow fever virus and measles virus, especially in the post-vaccination era. Third, we were unable to identify robust and comprehensive data for all three regions on virus discovery effort (e.g. government transparency, laboratory infrastructure and technology), although we interpret GDP and university count as indirect measures of the resources available for this activity. Previous studies have tried to use bibliographic data to correct for discovery effort (Jones et al., 2008; Olival et al., 2017). However, this strategy worked less well for our data, as the frequency of published papers from virus-related scientific journals has only a weak link to publications on novel human-infective RNA viruses (Appendix 3—figure 1).

This study adds to our previous study (Zhang et al., 2020) in several ways. First, we constructed, for the first time, data sets of human-infective RNA virus discovery reflecting the viral richness in three broad regions of the world. Second, we reduced the heterogeneity of the predictors, including those reflecting research effort, by focusing on regions. Research effort is less variable within restricted regions and therefore has less effect on virus detection. This implies that our predicted hotspots are closer to the geographic distribution of viruses in nature. Third, the predicted hotspots derived from regional analysis are more precise than those at a global scale; for example, specific areas in the United States and China were identified as hotspots in the regional analysis, rather than the whole eastern area as in the global analysis. This helps target areas for future surveillance.

In conclusion, a heterogeneous pattern of virus discovery-driver relationships was identified across three regions and the globe. Within regions, virus discovery is driven more by land use and socio-economic variables; climate and biodiversity variables are consistently less important predictors than at a global scale. Our maps predicted with good accuracy that in 2010–2019 areas where human-infective RNA viruses had previously been discovered would continue to be discovery hotspots in the three regions and that, in addition, several new areas in each region would make a large contribution to virus discovery. Results from the study could guide active surveillance for new human-infective viruses in high-risk areas.

Appendix 1

Appendix 1—table 1
Summary of the human-infective RNA virus data sets in the United States, Africa, and China.
SpeciesOriginal discovery yearUnited StatesChinaAfrica
Reported?Discovery yearlocationLatLonReported?Discovery yearlocationLatLonReported?Discovery yearlocationLatLon
Argentinian mammarenavirus1958NoNoNo
Brazilian mammarenavirus1994Yes Barry et al., 19951995New Haven, Connecticut41.31--72.93NoNo
Cali mammarenavirus1971Yes Buchmeier et al., 19741974Houston, Texas29.76--95.37NoNo
Chapare mammarenavirus2008NoNoNo
Guanarito mammarenavirus1991NoNoNo
Lassa mammarenavirus1970Yes Buckley and Casals, 19701970New Haven, Connecticut41.31--72.93NoYes Buckley and Casals, 19701970Lassa, Borno State, Nigeria10.6913.27
Lujo mammarenavirus2009NoNoYes Briese et al., 20092009Lusaka, Zambia--15.3928.32
Lymphocytic choriomeningitis mammarenavirus1934Yes Armstrong and Lillie, 19341934St. Louis county, Missouri38.61--90.41NoNo
Machupo mammarenavirus1964NoNoNo
Mobala mammarenavirus1985NoNoYes Georges et al., 19851985Bouboui and Gomoka village, Boali town, Central African Republic4.8918.14
Whitewater Arroyo mammarenavirus2000Yes Enserink, 20002000Alameda County, California37.60--121.72NoNo
Mamastrovirus 11975Yes Oshiro et al., 19811981Martin County, California40.22--123.10Yes Xu et al., 19811981Guangzhou, Guangdong23.13113.26Yes Dowling and Wynne, 19811981Lebowa, South Africa--23.529.5
Mamastrovirus 62008Yes Finkbeiner et al., 2009c2009St. Louis, Missouri38.63--90.20Yes Chu et al., 20102010Hong Kong22.40114.11Yes Kapoor et al., 20092009Maiduguri, Borno State, Nigeria11.8313.15
Mamastrovirus 82009Yes Finkbeiner et al., 2009a2009St. Louis, Missouri38.63--90.20Yes Wang et al., 20132013Nanjing, Jiangsu and Lanzhou, Gansu31.95118.78Yes Kapoor et al., 20092009Maiduguri, Borno State, Nigeria11.8313.15
Mamastrovirus 92009Yes Finkbeiner et al., 2009b2009Accomack and Northampton Counties, Virginia37.71--75.81Yes Tao et al., 20192019Jinan, Shandong36.68117.11Yes Kapoor et al., 20092009Maiduguri, Borno State, Nigeria11.8313.15
Mammalian 1 orthobornavirus1985Yes Rott et al., 19851985Philadelphia, Pennsylvania39.95--75.17Yes Chen et al., 19991999Taiwan23.70120.96Yes Bode et al., 19921992Rural area of East Africa--1.2834.53
Mammalian 2 orthobornavirus2015NoNoNo
Norwalk virus1972Yes Kapikian et al., 19721972Norwalk, Ohio41.24--82.62Yes Fang et al., 19951995Henan33.88113.48Yes Taylor et al., 19931993Pretoria, Gauteng province, South Africa--25.7528.23
Sapporo virus1980Yes Nakata et al., 19881988Houston, Texas29.76--95.37Yes Nakata et al., 19881988Shanghai31.23121.47Yes Wolfaardt et al., 19971997Pretoria, Gauteng province, South Africa--25.7528.23
Vesicular exanthema of swine virus1998Yes Smith et al., 19981998Corvallis, Oregon44.56--123.26NoNo
Alphacoronavirus 12007NoNoNo
Human coronavirus 229E1966Yes Hamre and Procknow, 19661966Chicago, Illinois41.88--87.63Yes Virus Research Group of Kun Number 323 Unit, The Chinese People’s Liberation Army, 19751975Kunming, Yunnan25.07102.68Yes Hays and Myint, 19981998Kumasi, Ghana6.70--1.62
Human coronavirus NL632004Yes Esper et al., 20052005New Haven, Connecticut41.31--72.93Yes Chan et al., 20052005Hong Kong22.40114.11Yes Smuts et al., 20082008Cape Town, Western Cape Province, South Africa--33.9018.57
Betacoronavirus 11967Yes McIntosh et al., 19671967Bethesda, Maryland38.98--77.09Yes Chan et al., 20052005Hong Kong22.40114.11Yes Venter et al., 20112011Pretoria, Gauteng province, South Africa--25.7528.23
Human coronavirus HKU12005Yes Esper et al., 20062006New Haven, Connecticut41.31--72.92Yes Woo et al., 20052005Hong Kong22.40114.11Yes Venter et al., 20112011Pretoria, Gauteng province, South Africa--25.7528.23
Middle East respiratory syndrome-related coronavirus2012Yes* Bialek et al., 20142014Lake county, Indiana41.45--87.37Yes* Gao and Song, 20152015Huizhou, Guangdong23.09114.40Yes* Abroug et al., 20142014Monastir, Tunisia35.7910.82
Severe acute respiratory syndrome-related coronavirus2003Yes* Charles M, 20032003Atlanta, Georgia33.75--84.39Yes Peiris et al., 2003a2003Hong Kong22.40114.11Yes Chiu et al., 20042004Pretoria, Gauteng province, South Africa--25.7528.23
Human torovirus (been abolished)1984NoNoNo
Bundibugyo ebolavirus2008NoNoYes Smuts et al., 20082008Bundibugyo and Kikyo town, Bundibugyo District, Western Uganda0.7130.06
Reston ebolavirus1991Yes Miranda et al., 19911991Reston, Fairfax County, Virginia38.96--77.35NoNo
Sudan ebolavirus1977NoNoYes Bowen et al., 19771977Maridi, South Sudan4.9129.45
Tai Forest ebolavirus1995NoNoYes Le Guenno et al., 19951995Abidjan, Côte d’Ivoire5.36--4.01
Zaire ebolavirus1977NoNoYes Johnson et al., 19771977Yambuku village, Democratic Republic of the Congo2.8322.22
Marburg marburgvirus1968Yes* Centers for Disease Control and Prevention, 20092009Denver county, Colorado39.55--105.78NoYes Gear et al., 19751975Johannesburg, South Africa--26.2027.90
Aroa virus1971NoNoNo
Bagaza virus2009NoNoNo
Banzi virus1959NoNoYes Smithburn et al., 19591959Maponde's Kraal(Usutu river), South Africa--26.5231.67
Cacipacore virus2011NoNoNo
Dengue virus1907Yes Lavinder and Francis, 19141914Savannah, Georgia32.02--81.12Yes Clarke et al., 19671967Southwest Taiwan23.06120.59Yes Edington, 19271927Durban, KwaZulu-Natal Province, South Africa--29.8631.02
Edge Hill virus1985NoNoNo
Gadgets Gully virus1991NoNoNo
Ilheus virus1947NoNoNo
Japanese encephalitis virus1933Yes* Perex-Pina and Merikangas, 19531953Waltham, Massachusetts42.38--71.24Yes Yen, 19411941Beijing40.01116.41Yes Simon-Loriere et al., 20172017Cunene, Angola--16.2815.28
Kokobera virus1964NoNoNo
Kyasanur forest disease virus1957NoYes Wang et al., 20092009Hengduanshan Mountain, Yunnan27.5099.00Yes Andayi et al., 20142014Djibouti, Republic of Djibouti11.5743.15
Langat virus1956NoNoNo
Louping ill virus1934Yes Rivers and Schwentker, 19341934New York40.71--74.01NoNo
Murray Valley encephalitis virus1952NoNoNo
Ntaya virus1952NoNoYes Smithburn, 19521952Bwamba county, Uganda0.7530.02
Omsk hemorrhagic fever virus1948NoNoNo
Powassan virus1959Yes Goldfield et al., 19731973Middlesex County, New Jersey40.54--74.37NoNo
Rio Bravo virus1962Yes Suklin et al., 19621962Dallas city, Texas32.78--96.80NoNo
Saint Louis encephalitis virus1933Yes Webster and Fite, 20091933St. Louis City, Missouri38.63--90.20NoNo
Tembusu virus1975NoYes Tang et al., 20132013Shandong36.40118.77No
Tick-borne encephalitis virus1938Yes* Cruse et al., 19791979Cleveland, Ohio41.51--81.69Yes Wang and Zhao, 19561956Bali village, Wuchang, Heilongjiang44.91127.16No
Uganda S virus1952NoNoYes Dick and Haddow, 19521952Bwamba county, Uganda0.7530.02
Usutu virus2009NoNoNo
Wesselsbron virus1957NoNoYes Smithburn et al., 19571957Lake Simbu region, Maputaland, KwaZulu-Natal, South Africa--27.3632.32
West Nile virus1940Yes Nash et al., 20012001New York40.71--74.01Yes Li et al., 20132013Jiashi County, Xinjiang39.5877.18Yes Smithburn et al., 19401940Omogo, West Nile district, Uganda0.4233.21
Yellow fever virus1901Yes Guiteras, 19041904Laredo, Texas27.51--99.51Yes* Chen and Lu, 20162016Beijing40.01116.41Yes Stokes et al., 19281928Larteh, Ghana5.94--0.07
Zika virus1952Yes* Foy et al., 20112011Northern Colorado39.55--105.78Yes* Sun et al., 20162016Gan County, Ganzhou city, Jiangxi25.86115.02Yes Dick, 19521952Zika, Uganda0.1232.53
Hepacivirus C1989Yes Choo et al., 19891989Emeryville, California37.83--122.29Yes Xu et al., 1990a1990Qidong county, Jiangsu31.88121.72Yes Kew et al., 19901990Johannesburg, South Africa--26.2027.90
Pegivirus C1995Yes Simons et al., 19951995Chapel Hill, North Carolina; Rochester, Minnesota; Dallas, Texas35.91--79.06Yes Wang et al., 19961996Beijing40.01116.41Yes Simons et al., 19951995Cairo, Egypt30.0431.24
Pegivirus H2015Yes Kapoor et al., 20152015New York city, New York40.71--74.01Yes Wang et al., 20182018Guangzhou, Guangdong23.13113.26Yes Rodgers et al., 20192019Ebolowa, Cameroon2.9211.15
Pestivirus A1988Yes Yolken et al., 19891989Whiteriver, Arizona33.83--109.97NoYes Giangaspero et al., 19881988Zambia--13.1327.85
Andes orthohantavirus1996NoNoNo
Bayou orthohantavirus1995Yes Morzunov et al., 19951995Louisiana30.98--91.96NoNo
Black creek canal orthohantavirus1995Yes Ravkov et al., 19951995Miami-Dade County, Florida25.76--80.34NoNo
Choclo orthohantavirus2000NoNoNo
Dobrava-Belgrade orthohantavirus1992NoNoNo
Hantaan orthohantavirus1978NoYes Lee et al., 19801980Zhejiang29.14119.79No
Laguna Negra orthohantavirus1997NoNoNo
Puumala orthohantavirus1980NoNoNo
Sangassou orthohantavirus2010NoNoYes Klempa et al., 20102010Sangassou village, Macenta district, Forest Guinea8.24--9.32
Seoul orthohantavirus1982Yes Forthal et al., 19871987Mississippi32.57--89.88Yes Song et al., 19821982Jiangsu33.14119.79Yes Tomori et al., 19861986Jos, Nigeria9.908.86
Sin Nombre orthohantavirus1993Yes Nichol et al., 19931993New Mexico34.52--105.87NoNo
Thailand orthohantavirus2006NoNoNo
Thottopalayam thottimvirus2007NoNoNo
Tula orthohantavirus1996NoNoNo
Orthohepevirus A1983Yes* De Cock et al., 19871987Los Angeles County, California34.05--118.24Yes Huang et al., 19891989Kashi county, Kashi city, Xinjiang39.4675.99Yes Belabbes et al., 19851985Medea town, Algeria36.262.75
Orthohepevirus C2018NoYes Sridhar et al., 20182018Hong Kong22.40114.11No
Crimean-Congo haemorrhagic fever orthonairovirus1967NoYes Yen et al., 19851985Bachu, southern Xinjiang39.7978.55Yes Simpson et al., 19671967Kisangani, Tshopo province, Democratic Republic of the Congo0.5325.19
Dugbe orthonairovirus1969NoNoYes Causey et al., 19691969Ibadan, Nigeria7.353.88
Nairobi sheep disease orthonairovirus1969NoNoYes Morrill et al., 19911991Mombasa; Malindi; and Kilifi, Coast Province, Kenya--3.3439.57
Thiafora orthonairovirus1989NoNoNo
Influenza A virus1933Yes Francis and Magill, 19351935Philadelphia, Pennsylvania39.95--75.17Yes Chang and Chiang, 19501950Beijing40.01116.41Yes Isaacs and Andrews, 19511951Johannesburg, South Africa and Cape Town, South Africa--26.2027.90
Influenza B virus1940Yes Francis, 19401940Irvington village, Greenburgh town, Westchester County, New York41.03--73.87Yes Wen and Chu, 19571957Beijing40.01116.41Yes Montefiore et al., 19701970Arusha, Arusha Region, Tanzania--3.3736.69
Influenza C virus1950Yes Francis et al., 19501950Ann Arbor city, Michigan42.28--83.74Yes Zhang, 19571957Beijing40.01116.41Yes Joosting et al., 19681968Johannesburg, South Africa--26.2027.90
Dhori thogotovirus1985NoNoNo
Thogoto thogotovirus1969NoNoYes Causey et al., 19691969Ibadan, Nigeria7.353.88
Avian orthoavulavirus 11943Yes Burnet, 19431943Washington, D. C.38.91--77.04NoNo
Hendra henipavirus1995NoNoNo
Nipah henipavirus1999NoNoNo
Canine morbillivirus1955Yes Karzon, 19551955Buffalo, New York42.89--78.88NoNo
Measles morbillivirus1911Yes Goldberger and Anderson, 19111911Washington, D. C.38.91--77.04Yes Tang et al., 19581958Beijing40.01116.41Yes Baylet et al., 19631963Dakar, Senegal14.72--17.47
Human respirovirus 11958Yes Chanock et al., 19581958Washington, D. C.38.91--77.04Yes Chen et al., 19641964Zhejiang29.14119.79Yes Taylor-Robinson and Tyrrell, 19631963Cape Town, Western Cape Province, South Africa--33.9018.57
Human respirovirus 31958Yes Chanock et al., 19581958Washington, D. C.38.91--77.04Yes Yu et al., 19871987Guangzhou, Guangdong23.13113.26Yes Taylor-Robinson and Tyrrell, 19631963Cape Town, Western Cape Province, South Africa--33.9018.57
Achimota pararubulavirus 22013NoNoYes Baker et al., 20132013Volta, Ghana6.050.37
Human orthorubulavirus 21956Yes Chanock, 19561956Cincinnati, Ohio39.10--84.51Yes Pathogen biology research group, Jiangsu new medical college, 19751975Nanjing, Jiangsu31.95118.78Yes Balestrieri et al., 19671967Accra, Ghana5.60--0.19
Human orthorubulavirus 41960Yes Johnson et al., 19601960Bethesda, Maryland38.98--77.09Yes Lau et al., 20052005Hong Kong22.40114.11Yes Niang et al., 20102010Ndiop village, Sine Saloum region, Senegal15.18--16.74
Mammalian orthorubulavirus 51959Yes Schultz and Habel, 19591959Stanford, California37.42--122.17NoNo
Menangle pararubulavirus1998NoNoNo
Mumps orthorubulavirus1934Yes Johnson and Goodpasture, 19341934Nashville, Tennessee36.16--86.78Yes Wang et al., 19581958Beijing40.01116.41Yes Bayer and Gear, 19551955Johannesburg, South Africa--26.2027.90
Simian orthorubulavirus1968NoNoNo
Sosuga pararubulavirus2014NoNoYes Albariño et al., 20142014ˉ3.7632.82
Tioman pararubulavirus2007NoNoNo
Bunyamwera orthobunyavirus1946Yes Work, 19641964Southern Florida26.92--81.21NoYes Smithburn et al., 19461946Bwamba County, Uganda0.7530.02
Bwamba orthobunyavirus1941NoNoYes Smithburn et al., 19411941Bwamba county, Western Province of Uganda0.7530.02
California encephalitis orthobunyavirus1952Yes Hammon and Reeves, 19521952Kern county, California35.49--118.86Yes Gu et al., 19841984Longhua, Shanghai31.22121.43Yes Bardos and Sefcovicova, 19611961Uganda1.3732.29
Caraparu orthobunyavirus1961NoNoNo
Catu orthobunyavirus1961NoNoNo
Guama orthobunyavirus1961NoNoNo
Guaroa orthobunyavirus1959NoNoNo
Kairi orthobunyavirus1967NoNoNo
Madrid orthobunyavirus1964NoNoNo
Marituba orthobunyavirus1961NoNoNo
Nyando orthobunyavirus1965NoNoYes Williams et al., 19651965Kisumu, Kenya--0.0934.77
Oriboca orthobunyavirus1961NoNoNo
Oropouche orthobunyavirus1961NoNoNo
Patois orthobunyavirus1972NoNoNo
Shuni orthobunyavirus1975NoNoYes Moore et al., 19751975Ibadan, Nigeria7.383.95
Tacaiuma orthobunyavirus1967NoNoNo
Wyeomyia orthobunyavirus1965NoNoNo
Candiru phlebovirus1983NoNoNo
Punta Toro phlebovirus1970NoNoNo
Rift Valley fever phlebovirus1931NoYes* Liu et al., 20162016Beijing40.01116.41Yes Daubney et al., 19311931Rift Valley of Kenya Colony--0.2836.07
Sandfly fever Naples phlebovirus1944NoNoYes Sabin, 19511951Cairo, Egypt30.0431.24
Heartland banyangvirus2012Yes McMullan et al., 20122012Andrew and Nodaway Counties, Missouri39.82--94.59NoNo
Huaiyangshan banyangvirus2011NoYes Zhang et al., 20112011Huaiyangshan31.37115.39No
Uukuniemi phlebovirus1970NoNoNo
Human picobirnavirus1988Yes Grohmann et al., 19931993Atlanta, Georgia33.75--84.39Yes Rosen et al., 20002000Lulong County, Hebei39.94116.94No
Equine rhinitis A virus1962NoNoNo
Foot-and-mouth disease virus1965NoYes Luo et al., 19991999Guangzhou23.13113.26Yes Donia and Youssef, 20022002Alexandria Governorate, Egypt30.7429.74
Cardiovirus A1947Yes Jonkers, 19611961New Orleans, Louisiana29.95--90.07Yes Feng et al., 20152015Changchun, Jilin43.87125.34Yes Dick and Best, 19481948Entebbe, Uganda0.0532.46
Cardiovirus B1963Yes Jones et al., 20072007San Diego, California32.72--117.16Yes Cheng et al., 2009a2009Lanzhou, Gansu36.06103.79Yes Zoll et al., 20092009Cameroon5.0312.40
Cosavirus A2008NoYes Dai et al., 20102010Shanghai31.23121.47Yes Kapusinszky et al., 20122012Maiduguri, Borno State, Nigeria11.8313.15
Cosavirus B2008NoYes Yang et al., 20162016Zhenjiang, Jiangsu32.19119.43No
Cosavirus D2008NoNoYes Kapusinszky et al., 20122012Maiduguri, Borno State, Nigeria11.8313.15
Cosavirus E2008NoNoYes Kapusinszky et al., 20122012Maiduguri, Borno State, Nigeria11.8313.15
Cosavirus F2012NoNoNo
Enterovirus A1949Yes Sickles and Dalldorf, 19491949New York43.30--74.22Yes Xiao et al., 19851985Tianjin39.34117.36Yes Bayer and Gear, 19551955Johannesburg, South Africa--26.2027.90
Enterovirus B1949Yes Sickles and Dalldorf, 19491949Wilmington39.74--75.54Yes Wu et al., 19601960Fuzhou, Fujian26.07119.30Yes Patz et al., 19531953Middelburg, Transvaal, South Africa--25.7729.46
Enterovirus C1909Yes Flexner and Lewis, 19091909New York city, New York40.71--74.01Yes Yen and Hsü, 19411941Beijing39.90116.41Yes Hudson and Lennette, 19331933Monrovia, Liberia6.29--10.76
Enterovirus D1967Yes Schieble et al., 19671967Berkeley, California37.87--122.27Yes Shanghai Eye and Skin Disease Prevention and Treatment Institute, 19791979Shanghai31.23121.47Yes Mirkovic et al., 19731973Morocco31.79--7.09
Enterovirus E1961Yes Moscovivci et al., 19611961Denver, Colorado39.74--104.99NoNo
Enterovirus H1965NoNoNo
Rhinovirus A1953Yes Price, 19561956Baltimore, Maryland39.29--76.61Yes Guangzhou Institute of Medicine and Health, 19751975Guangzhou, Guangdong23.13113.26Yes Taylor-Robinson, 19631963Cape Town, Western Cape Province, South Africa--33.9018.57
Rhinovirus B1960Yes Hamre and Procknow, 19611961Chicago, Illinois41.88--87.63Yes Xiang et al., 20082008Beijing40.01116.41Yes Briese et al., 20082008Pretoria, Gauteng province, South Africa--25.7528.23
Rhinovirus C2006Yes Lamson et al., 20062006New York city, New York40.71--74.01Yes Lau et al., 20072007Hong Kong22.40114.11Yes Briese et al., 20082008Pretoria, Gauteng province, South Africa--25.7528.23
Erbovirus A2005NoNoNo
Hepatovirus A1973Yes Feinstone et al., 19731973Bethesda, Maryland38.98--77.09Yes Microbiology Research Group of Shanghai First Medical College and Laboratory of Shanghai Sixth People’s Hospital, 19781978Shanghai31.23121.47Yes Szmuness et al., 19771977Dakar, Senegal14.72--17.47
Aichivirus A1991Yes Chhabra et al., 20132013Cincinnati, Ohio39.10--84.51Yes Yang et al., 20092009Shanghai31.23121.47Yes Sdiri-Loulizi et al., 20082008Monastir, Tunisia35.7710.82
Parechovirus A1958Yes Ramoz-alverz and Sabin, 19581958Cincinnati, Ohio39.10--84.51Yes Shan et al., 20092009Shanghai31.23121.47Yes Kapusinszky et al., 20122012Ouagadougou, Burkina Faso12.24--1.56
Parechovirus B2003NoNoNo
Salivirus A2009Yes Greninger et al., 20092009Northern California38.84--120.90Yes Shan et al., 20102010Shanghai31.23121.47Yes Li et al., 20092009Maiduguri, Borno State, Nigeria11.8313.15
Avian metapneumovirus2011Yes Kayali et al., 20112011Memphis, Tennessee35.15--90.05NoNo
Human metapneumovirus2001Yes Falsey et al., 20032003Rochester, New York43.16--77.61Yes Peiris et al., 2003b2003Hong Kong22.40114.11Yes Madhi et al., 20032003Johannesburg, South Africa--26.2027.90
Human orthopneumovirus1957Yes Chanock et al., 19571957Baltimore, Maryland39.29--76.61Yes Kun Number 323 Unit, the Chinese People’s Liberation Army, 19751975Kunming, Yunnan25.07102.68Yes Doggett, 19651965Cape Town, Western Cape Province, South Africa--33.9018.57
Colorado tick fever virus1946Yes Florio et al., 19461946Denver, Colorado39.74--104.99Yes Yang et al., 19961996Nanjing, Jiangsu31.95118.78No
Eyach virus1980NoNoNo
Corriparta virus1967NoNoNo
Great Island virus1963NoNoNo
Lebombo virus1975NoNoYes Moore et al., 19751975Ibadan, Nigeria7.383.95
Orungo virus1976NoNoYes Tomori et al., 19761976Ibadan, Nigeria7.383.95
Mammalian orthoreovirus1954Yes Ramos-Alvarez and Sabin, 19541954Cincinnati, Ohio39.10--84.51Yes Zhao et al., 19951995Xuzhou, Jiangsu34.26117.19Yes Malherbe et al., 19631963Johannesburg, South Africa--26.2027.90
Nelson Bay orthoreovirus2007NoYes* Cheng et al., 2009b2009Hong Kong22.40114.11No
Rotavirus A1973Yes Kapikian et al., 19761976Washington, D. C.38.90--77.04Yes PaPa et al., 19791979Beijing40.01116.41Yes Tomori et al., 19761976Johannesburg, South Africa--26.2027.90
Rotavirus B1984Yes Eiden et al., 19851985Baltimore, Maryland39.29--76.61Yes Hung et al., 19841984Jinzhou, Liaoning41.10121.13Yes Nakata et al., 19871987Kenya--0.0237.91
Rotavirus C1986Yes Jiang et al., 19951995Providence, Rhode Island41.82--71.41Yes Qiao et al., 19991999Beijing40.01116.41Yes Sebata and Steele, 19991999Pretoria, Gauteng province, South Africa--25.7528.23
Rotavirus H1987NoYes Wang et al., 19871987Huaihua, Hunan Province27.55109.96No
Banna virus1990NoYes Xu et al., 1990b1990Xishuangbanna, Yunnan Province21.90100.80No
Primate T-lymphotropic virus 11980Yes Poiesz et al., 19801980Bethesda, Maryland38.98--77.09Yes Hung et al., 19841984Shenyang, Liaoning41.80123.38Yes Williams et al., 19841984Ibadan, Nigeria7.383.95
Primate T-lymphotropic virus 21982Yes Kalyanaraman et al., 19821982Seattle, Washington47.61--122.33Yes Ma et al., 20132013Henan and Hubei32.21112.96Yes Delaporte et al., 19911991Franceville, Gabon--1.6313.60
Primate T-lymphotropic virus 32005NoNoYes Calattini et al., 20052005Océan department, South Province, Cameroon2.5010.50
Human immunodeficiency virus 11983Yes Safai et al., 19841984Washington, D. C.38.90--77.04Yes Chang et al., 19861986Hong Kong22.40114.11Yes Brun-Vézinet et al., 19841984Kisangani, Tshopo province, Democratic Republic of the Congo0.5325.19
Human immunodeficiency virus 21986Yes* Centers for Disease Control, 19881988New Jersey40.06--74.41Yes* Yan et al., 20002000Fuzhou, Fujian26.07119.30Yes Kanki et al., 19861986Dakar, Senegal14.72--17.47
Simian immunodeficiency virus1992Yes Khabbaz et al., 19921992Atlanta, Georgia33.75--84.39NoYes Calattini et al., 20052005Cameroon7.3712.35
Central chimpanzee simian foamy virus2012NoNoYes Rua et al., 20122012Near Dja Nature Reserves, Southern Cameroon4.5013.50
Eastern chimpanzee simian foamy virus1971NoNoYes Achong et al., 19711971Kenya--0.0237.91
Grivet simian foamy virus1997NoNoNo
Guenon simian foamy virus2012NoNoYes Rua et al., 20122012Near Lolodorf, Southern Cameroon3.2310.73
Taiwanese macaque simian foamy virus2002NoYes Huang et al., 20122012Yunnan25.18101.86No
Australian bat lyssavirus1998NoNoNo
Duvenhage lyssavirus1971NoNoYes Meredith et al., 19711971Pretoria, Gauteng province, South Africa--25.7528.23
European bat 1 lyssavirus1989NoNoNo
European bat 2 lyssavirus1986NoNoNo
Irkut lyssavirus2013NoYes Liu et al., 20132013Tonghua county, Jilin41.68125.76No
Mokola lyssavirus1972NoNoYes Familusi et al., 19721972Ibadan, Nigeria7.383.95
Rabies lyssavirus1903Yes Black and Powers, 19101910Southern California34.57--116.76Yes Wu, 19811981Beijing40.01116.41Yes Wilhelm and Alexis, 19331933Carolina, Mpumalanga, South Africa--26.0730.12
Bas-Congo tibrovirus2012NoNoYes Grard et al., 20122012Mangala village, Boma Bungu Health Zone, Democratic Republic of Congo (DRC)--4.0421.76
Ekpoma 1 tibrovirus2015NoNoYes Stremlau et al., 20152015Irrua, Edo State, Nigeria6.746.22
Ekpoma 2 tibrovirus2015NoNoYes Stremlau et al., 20152015Irrua, Edo State, Nigeria6.746.22
Alagoas vesiculovirus1967NoNoNo
Chandipura vesiculovirus1967NoNoNo
Cocal vesiculovirus1964NoNoNo
Indiana vesiculovirus1958Yes Patterson et al., 19581958Beltsville, Prince George's County, Maryland39.05--76.90NoNo
Isfahan vesiculovirus1977NoNoNo
Maraba vesiculovirus1984NoNoNo
New Jersey vesiculovirus1950Yes Hanson et al., 19501950Madison, Wisconsin43.07--89.40NoNo
Piry vesiculovirus1974NoNoNo
Barmah Forest virus1986NoNoNo
Chikungunya virus1956Yes* Centers for Disease Control and Prevention, 20062006Minnesota46.44--93.36Yes Clarke et al., 19671967Southwest Taiwan23.06120.59Yes Ross, 19561956Newala district, Tanzania--10.6439.24
Eastern equine encephalitis virus1938Yes Howitt, 19381938Southwestern Massachusetts42.19--73.09NoNo
Everglades virus1970Yes Ehrenkranz et al., 19701970Homestead, Florida25.47--80.48NoNo
Getah virus1966NoYes Li et al., 19921992Baoting County, Hainan18.98109.83No
Highlands J virus2000Yes Meehan et al., 20002000Florida27.66--81.52NoNo
Madariaga virus1972NoNoNo
Mayaro virus1957Yes* Tesh et al., 19991999Ohio40.42--82.91NoNo
Mosso das Pedras virus2013NoNoNo
Mucambo virus1965NoNoNo
Ndumu virus1961NoNoYes Kokernot et al., 19611961Ndumu, Maputaland, KwaZulu-Natal, South Africa--26.9332.26
Onyong-nyong virus1961NoNoYes Williams and Woodall, 19611961Entebbe, Uganda0.0532.46
Pixuna virus1991NoNoNo
Rio Negro virus1993NoNoNo
Ross River virus1972NoYes Xu et al., 19991999Hainan19.16109.94No
Semliki Forest virus1979NoNoYes Mathiot et al., 19901990Bangui, Central Africa4.3618.58
Sindbis virus1955NoNoYes Taylor et al., 19551955Cairo, Egypt30.0431.24
Tonate virus1976NoNoNo
Una virus1963NoNoNo
Venezuelan equine encephalitis virus1943Yes Casals et al., 19431943New York40.71--74.01NoNo
Western equine encephalitis virus1938Yes Howitt, 19381938Fresno, California36.75--119.77NoNo
Whataroa virus1964NoNoNo
Rubella virus1942Yes Habel, 19421942Washington, D. C.38.91--77.04Yes He et al., 19791979Hangzhou, Zhejiang29.87119.33Yes Selzer, 19631963Cape Town, Western Cape Province, South Africa--33.9018.57
Hepatitis delta virus1977Yes Rizzetto et al., 19791979New Jersey40.06--74.41Yes Rizzetto et al., 19801980Taipei, Taiwan24.96121.51Yes Crocchiolo et al., 19841984Harare, Zimbabwe--17.8331.03
Notes: Yes denotes that the virus has been discovered in the region; * denotes that the virus has been discovered in the region but was imported from other regions; No denotes that the virus species has never been discovered in the region. Lat and Lon denote the coordinates of discovery points or the centroids of polygons.

Appendix 1—table 2
Resolution and covered grid cells for virus discovery data.
Region | Measure | Polygon data: country level | Polygon data: state/province level | Polygon data: city/county level | Point data | Total
United States | Virus counts | NA | 14 (14.7%) | 11 (11.6%) | 70 (73.7%) | 95
United States | Gridded cell counts | NA | 189 | 12 | 72* | 273
China | Virus counts | NA | 22 (27.5%) | 47 (58.7%) | 11 (13.8%) | 80
China | Gridded cell counts | NA | 161 | 70 | 12* | 243
Africa | Virus counts | 7 (6.5%) | 5 (4.7%) | 15 (14.0%) | 80 (74.8%) | 107
Africa | Gridded cell counts | 307 | 22 | 17 | 80 | 426

* Grid cell counts here include viruses first detected at multiple points in the literature. NA, not applicable.

Appendix 1—table 3
Model parameters.
Model | Tree complexity | Learning rate | Bag fraction | No. of trees
United States | 2 | 0.0020 | 0.5 | 1430
China | 2 | 0.0035 | 0.5 | 1473
Africa | 2 | 0.0030 | 0.5 | 1446
Appendix 1—table 4
Model validation statistics for analyses in three regions.
Model | % of deviance explained (95% quantiles) | ICC (95% quantiles)
United States | 50.5% (44.3%–56.8%) | 0.66 (0.60–0.70)
China | 42.0% (32.4%–50.8%) | 0.52 (0.41–0.60)
Africa | 42.4% (34.2%–50.0%) | 0.51 (0.44–0.62)

ICC, intraclass correlation coefficient.
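
For readers unfamiliar with the "% of deviance explained" statistic, the sketch below shows one common way to compute it for count data under a Poisson model. It is an illustration only: the formula is the standard Poisson deviance, the choice of a mean-only null model is an assumption, and the counts and fitted values are hypothetical rather than taken from the study.

```python
from math import log

def poisson_deviance(observed, predicted):
    """Poisson deviance: 2 * sum(y * ln(y / mu) - (y - mu)), taking y * ln(y / mu) = 0 when y = 0."""
    total = 0.0
    for y, mu in zip(observed, predicted):
        term = y * log(y / mu) if y > 0 else 0.0
        total += term - (y - mu)
    return 2.0 * total

def deviance_explained(observed, predicted):
    """Fraction of the null (mean-only) deviance removed by the model's predictions."""
    mean = sum(observed) / len(observed)
    null = poisson_deviance(observed, [mean] * len(observed))
    return 1.0 - poisson_deviance(observed, predicted) / null

# Hypothetical virus counts per grid cell and hypothetical model predictions
counts = [0, 0, 1, 0, 3, 0, 2, 0]
fitted = [0.2, 0.1, 0.8, 0.3, 2.4, 0.2, 1.7, 0.3]
share = deviance_explained(counts, fitted)
print(f"deviance explained: {share:.1%}")
```

A value of 1 means the predictions match the observed counts exactly, and 0 means the model does no better than predicting the overall mean in every cell.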

Appendix 2

We considered using bibliographic data to adjust for discovery effort, but rejected this strategy after some exploratory tests. Jones et al., 2008 estimated the discovery effort for emerging infectious diseases (EID) by calculating the number of papers published by each country (denoted by the address of every author) in the Journal of Infectious Diseases (JID) since 1973. The hypothesis is that countries publishing more papers in JID are likely to discover more EID events. We tested whether this method worked for our analysis by plotting the relationship between published human-infective RNA virus count and total number of papers from all journals that published on human-infective RNA viruses in Web of Science (as of 21 Feb 2018). Both the total number of papers (Appendix 3—figure 1A) and the total number of papers on viruses (Appendix 3—figure 1B) were only weakly linked to the published human virus count in our database, though the total number of papers did have a positive relationship with the number of papers on viruses (Appendix 3—figure 1C). We also noted that papers in JID (highlighted in blue in Appendix 3—figure 1) may not be able to fully explain the discovery effort for newly discovered viruses. Olival et al., 2017 adjusted for discovery effort by counting the publications for each of the 586 virus species they studied, using a keyword search by virus name in PubMed and Web of Science. We found that the results using this method were similar to those using the method of Jones et al., 2008. Allen et al., 2017 derived a different index for discovery bias, based on the spatial distribution of place names in peer-reviewed biomedical literature. The disadvantage of this method is that it may not represent the discovery effort, as many place names are not related to zoonotic viruses.

Appendix 3

Appendix 3—figure 1
Relationship between the published human-infective RNA virus count and the total number of papers from the journals in Web of Science that published on human-infective RNA viruses.

(A) Total number of papers vs. published human virus count; (B) Total number of papers on viruses vs. published human virus count; (C) Total number of papers vs. total number of papers on viruses; (D) Percent of papers on viruses in each journal. Journal of Infectious Diseases (JID) is highlighted in blue.

Appendix 3—figure 2
Time lag of human-infective RNA virus discovery between the three regions and the world.

(A) United States. (B) China. (C) Africa. The blue dots represent the original discovery year of each virus in the world; the red dots represent the discovery year of each virus in three regions; and the segments between them represent the time lag.

Appendix 3—figure 3
Partial dependence plots showing the influence on human-infective RNA virus discovery for all predictors in the United States.

Partial dependence plots show the effect of an individual predictor over its range on the response after factoring out other predictors. Fitted lines represent the median (black) and 95% quantiles (coloured) based on 1000 replicated boosted regression tree models. Y axes are centred around the mean without scaling. X axes show the range of sampled values of predictors.
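
The summary lines described in this caption can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes each replicate model yields one partial dependence curve evaluated on a shared grid of predictor values, and that each curve is centred on its own mean before summarising. The function names are hypothetical.

```python
import statistics


def quantile(values, q):
    """Linear-interpolation quantile of a sequence, with q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] * (1 - frac) + ordered[upper] * frac


def summarise_curves(curves):
    """Return (2.5% quantile, median, 97.5% quantile) at each grid point.

    curves: one list of fitted values per replicate model, all on the
    same grid. Assumption: centring subtracts each curve's own mean
    and applies no scaling.
    """
    centred = [[v - statistics.fmean(curve) for v in curve] for curve in curves]
    summary = []
    for point_values in zip(*centred):
        summary.append(
            (
                quantile(point_values, 0.025),
                statistics.median(point_values),
                quantile(point_values, 0.975),
            )
        )
    return summary
```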

Appendix 3—figure 4
Partial dependence plots showing the influence on human-infective RNA virus discovery for predictors in China.

Partial dependence plots show the effect of an individual predictor over its range on the response after factoring out other predictors. Fitted lines represent the median (black) and 95% quantiles (coloured) based on 1000 replicated boosted regression tree models. Y axes are centred around the mean without scaling. X axes show the range of sampled values of predictors.

Appendix 3—figure 5
Partial dependence plots showing the influence on human-infective RNA virus discovery for all predictors in Africa.

Partial dependence plots show the effect of an individual predictor over its range on the response after factoring out other predictors. Fitted lines represent the median (black) and 95% quantiles (coloured) based on 1000 replicated boosted regression tree models. Y axes are centred around the mean without scaling. X axes show the range of sampled values of predictors.

Appendix 3—figure 6
Moran’s I across different spherical distances.

(A) United States; (B) China; (C) Africa. The solid line and dots represent the median Moran’s I value, and the grey area represents its 95% quantiles, generated from 1000 samples (Blue: Raw virus data) or replicate boosted regression tree (BRT) models (Red: Model residuals). We used fixed spherical distances as the neighbourhood weights. As there is no general consensus on selecting cut-off values, we chose spherical distances ranging from one to fifteen times the width of a 1° grid cell at the equator, i.e. 110 km to 1650 km, considering the area of the three regions. Our BRT models reduced Moran’s I from a range of 0.19–0.50 for the raw virus data to 0.009–0.04 for the model residuals in the United States (A), from 0.11–0.45 to –0.01–0.09 in China (B), and from 0.05–0.31 to –0.004–0.15 in Africa (C), suggesting that BRT models with 33 predictors adequately accounted for spatial autocorrelation in the raw virus data in all three regions.
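
As an illustration of the statistic in this caption, the sketch below computes Moran’s I for values at grid-cell centroids, treating two cells as neighbours when their great-circle distance is within a fixed cut-off. Binary (0/1) weights, the haversine formula, and an Earth radius of 6371 km are assumptions made here for simplicity; the weighting scheme in the published analysis may differ.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def morans_i(coords, values, cutoff_km):
    """Moran's I with binary distance-band weights.

    coords: list of (lat, lon) in degrees; values: one number per point.
    Pairs within cutoff_km of each other get weight 1, all others 0.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and haversine_km(*coords[i], *coords[j]) <= cutoff_km:
                cross += dev[i] * dev[j]
                weight_sum += 1.0
    variance_sum = sum(d * d for d in dev)
    if weight_sum == 0 or variance_sum == 0:
        raise ValueError("Moran's I is undefined without neighbours or variation")
    return (n / weight_sum) * (cross / variance_sum)
```

Evaluating the function over a range of cut-offs (for example 110 km to 1650 km) on raw counts and on model residuals would give curves comparable in form to those plotted.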

Appendix 3—figure 7
Relative contribution of predictors to human-infective RNA virus discovery in three regions.

Virus discovery data were matched to time-varying covariate data by year. (A) United States. (B) China. (C) Africa. The boxplots show the median (black bar) and interquartile range (box) of the relative contribution across 1000 replicate boosted regression tree models, with whiskers indicating minimum and maximum and black dots indicating outliers.

Appendix 3—figure 8
Relative contribution of predictors to human-infective RNA virus discovery in three regions.

Virus discovery data at year t were matched to time-varying covariate data at year t-1. (A) United States. (B) China. (C) Africa. The boxplots show the median (black bar) and interquartile range (box) of the relative contribution across 1000 replicate boosted regression tree models, with whiskers indicating minimum and maximum and black dots indicating outliers.

Appendix 3—figure 9
Distribution maps for 32 predictors in 2015 in the United States.

The values of these explanatory variables and latitude in each grid cell were used to predict the virus discovery in the corresponding grid cell in the United States in 2010–2019. Explanatory variables were log-transformed where necessary for better visualization; this does not mean that they entered the model as log-transformed values.

Appendix 3—figure 10
Distribution maps for 32 predictors in 2015 in China.

The values of these explanatory variables and latitude in each grid cell were used to predict the virus discovery in the corresponding grid cell in China in 2010–2019. Explanatory variables were log-transformed where necessary for better visualization; this does not mean that they entered the model as log-transformed values.

Appendix 3—figure 11
Distribution maps for 32 predictors in 2015 in Africa.

The values of these explanatory variables and latitude in each grid cell were used to predict the virus discovery in the corresponding grid cell in Africa in 2010–2019. Explanatory variables were log-transformed where necessary for better visualization; this does not mean that they entered the model as log-transformed values.

Appendix 3—figure 12
Cumulative relative contribution of predictors to human-infective RNA virus discovery by group in each model of subgroups.

Subgroup 1 represents viruses first discovered in the region (United States or Africa); Subgroup 2 represents viruses first discovered elsewhere in the world. In the United States, the virus counts of Subgroup 1 and Subgroup 2 were 52 and 43, respectively. In Africa, the virus counts of Subgroup 1 and Subgroup 2 were 39 and 68, respectively. The relative contributions of all explanatory factors sum to 100% in each model, and each colour represents the cumulative relative contribution of all explanatory factors within each group.

Appendix 4

As covariates may vary within a decade and their effects on virus discovery were likely not immediate, we performed two further sensitivity analyses by (i) matching virus discovery data and time-varying covariate data by year and (ii) testing for lag effects by matching virus discovery at year t with predictors at years t-1 to t-5. We collected yearly data for climatic variables and land use from the same sources used in the main analysis. Yearly population data at grid level before 1970 and GDP data before 1980 are not available, so we extrapolated them back to 1901 using the yearly growth rate at country level (Source: Our World in Data). For population, the WorldPop Project provides yearly gridded data for 2000-2020 (https://www.arcgis.com/home/item.html?id=56eb0f050c61434782f008a08331d23a), and we used the growth rate by grid to extrapolate values after 2000.
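
The two data-preparation steps described above can be sketched as follows. This is an illustration under stated assumptions, not the published R code: the dictionary layout, the function names, and the backward formula value(y-1) = value(y) / (1 + growth(y)) are all hypothetical choices made for the example.

```python
def match_with_lag(discoveries, covariates, lag):
    """Pair discovery counts at year t with covariates at year t - lag.

    discoveries: dict mapping (cell, year) -> discovery count
    covariates:  dict mapping (cell, year) -> dict of predictor values
    Records without covariates for the lagged year are skipped.
    """
    matched = []
    for (cell, year), count in sorted(discoveries.items()):
        lagged_key = (cell, year - lag)
        if lagged_key in covariates:
            row = {"cell": cell, "year": year, "count": count}
            row.update(covariates[lagged_key])
            matched.append(row)
    return matched


def backcast(value_at_start, start_year, growth_rates, first_year):
    """Extrapolate a series backwards from start_year to first_year.

    growth_rates: dict mapping year -> fractional growth from year - 1
    to year (assumed to be the country-level rate applied to the cell).
    """
    series = {start_year: value_at_start}
    for year in range(start_year, first_year, -1):
        series[year - 1] = series[year] / (1.0 + growth_rates[year])
    return series
```

Calling match_with_lag with lag set to each of 1 to 5 would produce the five lagged data sets, and lag set to 0 corresponds to matching by year.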

Data availability

The authors confirm that all data or the data sources are provided in the paper and its Supplementary Materials. The final datasets and code used for the analyses are available via figshare at https://doi.org/10.6084/m9.figshare.15101979.

The following data sets were generated
    1. Zhang F
    (2021) figshare
    Supporting data and R scripts for: Predictors of human RNA virus discovery in the United States, China and Africa.
    https://doi.org/10.6084/m9.figshare.15101979
The following previously published data sets were used
    1. Woolhouse MEJ
    2. Brierley L
    (2017) Edinburgh DataShare
    Epidemiological characteristics of human-infective RNA viruses.
    https://doi.org/10.7488/ds/2265

Article and author information

Author details

  1. Feifei Zhang

    Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
    Contribution
    Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Visualization, Writing - original draft, Writing – review and editing
    For correspondence
    Feifei.Zhang@ed.ac.uk
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-3718-243X
  2. Margo Chase-Topping

    1. Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
    2. Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, United Kingdom
    Contribution
    Methodology, Supervision, Writing – review and editing
    Competing interests
    No competing interests declared
  3. Chuan-Guo Guo

    Department of Medicine, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China
    Contribution
    Data curation, Methodology, Software, Validation, Writing – review and editing
    Competing interests
    No competing interests declared
  4. Mark EJ Woolhouse

    Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
    Contribution
    Conceptualization, Funding acquisition, Methodology, Project administration, Supervision, Writing - original draft, Writing – review and editing
    Competing interests
    No competing interests declared

Funding

Darwin Trust of Edinburgh

  • Feifei Zhang

European Union's Horizon 2020 research and innovation programme (874735)

  • Mark EJ Woolhouse

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Liam Brierley (University of Liverpool, UK) for validating the data sets. We would like to thank all reviewers and editors (Benn Sartorius, Ben Cooper, George Perry etc.) for their constructive comments and suggestions.

Copyright

© 2022, Zhang et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Feifei Zhang
  2. Margo Chase-Topping
  3. Chuan-Guo Guo
  4. Mark EJ Woolhouse
(2022)
Predictors of human-infective RNA virus discovery in the United States, China, and Africa, an ecological study
eLife 11:e72123.
https://doi.org/10.7554/eLife.72123
