Novel risk loci for COVID-19 hospitalization among admixed American populations

  1. Silvia Diz-de Almeida
  2. Raquel Cruz
  3. Andre D Luchessi
  4. José M Lorenzo-Salazar
  5. Miguel López de Heredia
  6. Inés Quintela
  7. Rafaela González-Montelongo
  8. Vivian N Silbiger
  9. Marta Sevilla Porras
  10. Jair Antonio Tenorio Castaño
  11. Julian Nevado
  12. Jose María Aguado
  13. Carlos Aguilar
  14. Sergio Aguilera-Albesa
  15. Virginia Almadana
  16. Berta Almoguera
  17. Nuria Alvarez
  18. Álvaro Andreu-Bernabeu
  19. Eunate Arana-Arri
  20. Celso Arango
  21. María J Arranz
  22. Maria-Jesus Artiga
  23. Raúl C Baptista-Rosas
  24. María Barreda-Sánchez
  25. Moncef Belhassen-Garcia
  26. Joao F Bezerra
  27. Marcos AC Bezerra
  28. Lucía Boix-Palop
  29. María Brion
  30. Ramón Brugada
  31. Matilde Bustos
  32. Enrique J Calderón
  33. Cristina Carbonell
  34. Luis Castano
  35. Jose E Castelao
  36. Rosa Conde-Vicente
  37. M Lourdes Cordero-Lorenzana
  38. Jose L Cortes-Sanchez
  39. Marta Corton
  40. M Teresa Darnaude
  41. Alba De Martino-Rodríguez
  42. Victor del Campo-Pérez
  43. Aranzazu Diaz de Bustamante
  44. Elena Domínguez-Garrido
  45. Rocío Eirós
  46. María Carmen Fariñas
  47. María J Fernandez-Nestosa
  48. Uxía Fernández-Robelo
  49. Amanda Fernández-Rodríguez
  50. Tania Fernández-Villa
  51. Manuela Gago-Domínguez
  52. Belén Gil-Fournier
  53. Javier Gómez-Arrue
  54. Beatriz González Álvarez
  55. Fernan Gonzalez Bernaldo de Quirós
  56. Anna González-Neira
  57. Javier González-Peñas
  58. Juan F Gutiérrez-Bautista
  59. María José Herrero
  60. Antonio Herrero-Gonzalez
  61. María A Jimenez-Sousa
  62. María Claudia Lattig
  63. Anabel Liger Borja
  64. Rosario Lopez-Rodriguez
  65. Esther Mancebo
  66. Caridad Martín-López
  67. Vicente Martín
  68. Oscar Martinez-Nieto
  69. Iciar Martinez-Lopez
  70. Michel F Martinez-Resendez
  71. Ángel Martinez-Perez
  72. Juliana F Mazzeu
  73. Eleuterio Merayo Macías
  74. Pablo Minguez
  75. Victor Moreno Cuerda
  76. Silviene F Oliveira
  77. Eva Ortega-Paino
  78. Mara Parellada
  79. Estela Paz-Artal
  80. Ney PC Santos
  81. Patricia Pérez-Matute
  82. Patricia Perez
  83. M Elena Pérez-Tomás
  84. Teresa Perucho
  85. Mel Lina Pinsach-Abuin
  86. Guillermo Pita
  87. Ericka N Pompa-Mera
  88. Gloria L Porras-Hurtado
  89. Aurora Pujol
  90. Soraya Ramiro León
  91. Salvador Resino
  92. Marianne R Fernandes
  93. Emilio Rodríguez-Ruiz
  94. Fernando Rodriguez-Artalejo
  95. José A Rodriguez-Garcia
  96. Francisco Ruiz-Cabello
  97. Javier Ruiz-Hornillos
  98. Pablo Ryan
  99. José Manuel Soria
  100. Juan Carlos Souto
  101. Eduardo Tamayo
  102. Alvaro Tamayo-Velasco
  103. Juan Carlos Taracido-Fernandez
  104. Alejandro Teper
  105. Lilian Torres-Tobar
  106. Miguel Urioste
  107. Juan Valencia-Ramos
  108. Zuleima Yáñez
  109. Ruth Zarate
  110. Itziar de Rojas
  111. Agustín Ruiz
  112. Pascual Sánchez
  113. Luis Miguel Real
  114. SCOURGE Cohort Group
  115. Encarna Guillen-Navarro
  116. Carmen Ayuso
  117. Esteban Parra
  118. José A Riancho
  119. Augusto Rojas-Martinez
  120. Carlos Flores
  121. Pablo Lapunzina
  122. Ángel Carracedo
  1. Centro Singular de Investigación en Medicina Molecular y Enfermedades Crónicas (CIMUS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
  2. CIBERER, ISCIII, Madrid, Spain
  3. Pediatric Neurology Unit, Department of Pediatrics, Navarra Health Service Hospital, Pamplona, Spain
  4. ERN-ITHACA-European Reference Network
  5. Universidade Federal do Rio Grande do Norte, Departamento de Analises Clinicas e Toxicologicas, Natal, Brazil
  6. Genomics Division, Instituto Tecnológico y de Energías Renovables, Santa Cruz de Tenerife, Spain
  7. Fundación Pública Galega de Medicina Xenómica, Sistema Galego de Saúde (SERGAS) Santiago de Compostela, Spain
  8. Instituto de Genética Médica y Molecular (INGEMM), Hospital Universitario La Paz-IDIPAZ, Madrid, Spain
  9. Unit of Infectious Diseases, Hospital Universitario 12 de Octubre, Instituto de Investigación Sanitaria Hospital 12 de Octubre (imas12), Madrid, Spain
  10. Spanish Network for Research in Infectious Diseases (REIPI RD16/0016/0002), Instituto de Salud Carlos III, Madrid, Spain
  11. School of Medicine, Universidad Complutense, Madrid, Spain
  12. CIBERINFEC, ISCIII, Madrid, Spain
  13. Hospital General Santa Bárbara de Soria, Soria, Spain
  14. Navarra Health Service, NavarraBioMed Research Group, Pamplona, Spain
  15. Hospital Universitario Virgen Macarena, Neumología, Seville, Spain
  16. Department of Genetics & Genomics, Instituto de Investigación Sanitaria-Fundación Jiménez Díaz University Hospital - Universidad Autónoma de Madrid (IIS-FJD, UAM), Madrid, Spain
  17. Spanish National Cancer Research Centre, Human Genotyping-CEGEN Unit, Madrid, Spain
  18. Department of Child and Adolescent Psychiatry, Institute of Psychiatry and Mental Health, Hospital General Universitario Gregorio Marañón (IiSGM), Madrid, Spain
  19. Biocruces Bizkai HRI, Bizkaia, Spain
  20. Cruces University Hospital, Osakidetza, Bizkaia, Spain
  21. Centre for Biomedical Network Research on Mental Health (CIBERSAM), Instituto de Salud Carlos III, Madrid, Spain
  22. Fundació Docència I Recerca Mutua Terrassa, Barcelona, Spain
  23. Spanish National Cancer Research Center, CNIO Biobank, Madrid, Spain
  24. Hospital General de Occidente, Zapopan Jalisco, Mexico
  25. Centro Universitario de Tonalá, Universidad de Guadalajara, Tonalá Jalisco, Mexico
  26. Centro de Investigación Multidisciplinario en Salud, Universidad de Guadalajara, Tonalá Jalisco, Mexico
  27. Instituto Murciano de Investigación Biosanitaria (IMIB-Arrixaca), Murcia, Spain
  28. Universidad Católica San Antonio de Murcia (UCAM), Murcia, Spain
  29. Hospital Universitario de Salamanca-IBSAL, Servicio de Medicina Interna-Unidad de Enfermedades Infecciosas, Salamanca, Spain
  30. Universidad de Salamanca, Salamanca, Spain
  31. Escola Tecnica de Saúde, Laboratorio de Vigilancia Molecular Aplicada, Brazil
  32. Federal University of Pernambuco, Genetics Postgraduate Program, Recife, PE, Brazil
  33. Hospital Universitario Mutua Terrassa, Barcelona, Spain
  34. Instituto de Investigación Sanitaria de Santiago (IDIS), Xenética Cardiovascular, Santiago de Compostela, Spain
  35. CIBERCV, ISCIII, Madrid, Spain
  36. Cardiovascular Genetics Center, Institut d’Investigació Biomèdica Girona (IDIBGI), Girona, Spain
  37. Medical Science Department, School of Medicine, University of Girona, Girona, Spain
  38. Hospital Josep Trueta, Cardiology Service, Girona, Spain
  39. Institute of Biomedicine of Seville (IBiS), Consejo Superior de Investigaciones Científicas (CSIC)- University of Seville- Virgen del Rocio University Hospital, Seville, Spain
  40. Departamento de Medicina, Hospital Universitario Virgen del Rocío, Universidad de Sevilla, Seville, Spain
  41. CIBERESP, ISCIII, Madrid, Spain
  42. Hospital Universitario de Salamanca-IBSAL, Servicio de Medicina Interna, Salamanca, Spain
  43. Osakidetza, Cruces University Hospital, Bizkaia, Spain
  44. Centre for Biomedical Network Research on Diabetes and Metabolic Associated Diseases (CIBERDEM), Instituto de Salud Carlos III, Madrid, Spain
  45. University of Pais Vasco, UPV/EHU, Bizkaia, Spain
  46. Oncology and Genetics Unit, Instituto de Investigacion Sanitaria Galicia Sur, Xerencia de Xestion Integrada de Vigo-Servizo Galego de Saúde, Vigo, Spain
  47. Hospital Universitario Río Hortega, Valladolid, Spain
  48. Servicio de Medicina intensiva, Complejo Hospitalario Universitario de A Coruña (CHUAC), Sistema Galego de Saúde (SERGAS), A Coruña, Spain
  49. Tecnológico de Monterrey, Monterrey, Mexico
  50. Otto von Guericke University, Departament of Microgravity and Translational Regenerative Medicine, Magdeburg, Germany
  51. Hospital Universitario Mostoles, Unidad de Genética, Madrid, Spain
  52. Instituto Aragonés de Ciencias de la Salud (IACS), Zaragoza, Spain
  53. Instituto Investigación Sanitaria Aragón (IIS-Aragon), Zaragoza, Spain
  54. Preventive Medicine Department, Instituto de Investigacion Sanitaria Galicia Sur, Xerencia de Xestion Integrada de Vigo-Servizo Galego de Saúde, Vigo, Spain
  55. Unidad Diagnóstico Molecular. Fundación Rioja Salud, La Rioja, Spain
  56. Hospital Universitario de Salamanca-IBSAL, Servicio de Cardiología, Salamanca, Spain
  57. IDIVAL, Cantabria, Spain
  58. Universidad de Cantabria, Cantabria, Spain
  59. Hospital U M Valdecilla, Cantabria, Spain
  60. Universidad Nacional de Asunción, Facultad de Politécnica, Paraguay
  61. Urgencias Hospitalarias, Complejo Hospitalario Universitario de A Coruña (CHUAC), Sistema Galego de Saúde (SERGAS), A Coruña, Spain
  62. Unidad de Infección Viral e Inmunidad, Centro Nacional de Microbiología (CNM), Instituto de Salud Carlos III (ISCIII), Madrid, Spain
  63. Grupo de Investigación en Interacciones Gen-Ambiente y Salud (GIIGAS) - Instituto de Biomedicina (IBIOMED), Universidad de León, León, Spain
  64. IDIS
  65. Hospital Universitario de Getafe, Servicio de Genética, Madrid, Spain
  66. Ministerio de Salud Ciudad de Buenos Aires, Buenos Aires, Argentina
  67. Hospital Universitario Virgen de las Nieves, Servicio de Análisis Clínicos e Inmunología, Granada, Spain
  68. IIS La Fe, Plataforma de Farmacogenética, Valencia, Spain
  69. Universidad de Valencia, Departamento de Farmacología, Valencia, Spain
  70. Data Analysis Department, Instituto de Investigación Sanitaria-Fundación Jiménez Díaz University Hospital - Universidad Autónoma de Madrid (IIS-FJD, UAM), Madrid, Spain
  71. Universidad de los Andes, Facultad de Ciencias, Bogotá, Colombia
  72. SIGEN Alianza Universidad de los Andes - Fundación Santa Fe de Bogotá, Bogotá, Colombia
  73. Hospital General de Segovia, Medicina Intensiva, Segovia, Spain
  74. Facultad de Farmacia, Universidad San Pablo-CEU, CEU Universities, Urbanización Montepríncipe, Boadilla del Monte, España
  75. Hospital Universitario 12 de Octubre, Department of Immunology, Madrid, Spain
  76. Instituto de Investigación Sanitaria Hospital 12 de Octubre (imas12), Transplant Immunology and Immunodeficiencies Group, Madrid, Spain
  77. Fundación Santa Fe de Bogota, Departamento Patologia y Laboratorios, Bogotá, Colombia
  78. Unidad de Genética y Genómica Islas Baleares, Islas Baleares, Spain
  79. Hospital Universitario Son Espases, Unidad de Diagnóstico Molecular y Genética Clínica, Islas Baleares, Spain
  80. Genomics of Complex Diseases Unit, Research Institute of Hospital de la Santa Creu i Sant Pau, IIB Sant Pau, Barcelona, Spain
  81. Universidade de Brasília, Faculdade de Medicina, Brazil
  82. Programa de Pós-Graduação em Ciências Médicas (UnB), Brazil
  83. Programa de Pós-Graduação em Ciencias da Saude (UnB), Brazil
  84. Hospital El Bierzo, Unidad Cuidados Intensivos, León, Spain
  85. Hospital Universitario Mostoles, Medicina Interna, Madrid, Spain
  86. Universidad Francisco de Vitoria, Madrid, Spain
  87. Departamento de Genética e Morfologia, Instituto de Ciências Biológicas, Universidade de Brasília, Brazil
  88. Programa de Pós-Graduação em Biologia Animal (UnB), Brazil
  89. Programa de Pós-Graduação Profissional em Ensino de Biologia (UnB), Brazil
  90. Universidad Complutense de Madrid, Department of Immunology, Ophthalmology and ENT, Madrid, Spain
  91. Universidade Federal do Pará, Núcleo de Pesquisas em Oncologia, Belém, Pará, Brazil
  92. Infectious Diseases, Microbiota and Metabolism Unit, CSIC Associated Unit, Center for Biomedical Research of La Rioja (CIBIR), Logroño, Spain
  93. Inditex, A Coruña, Spain
  94. GENYCA, Madrid, Spain
  95. Instituto Mexicano del Seguro Social (IMSS), Centro Médico Nacional Siglo XXI, Unidad de Investigación Médica en Enfermedades Infecciosas y Parasitarias, Mexico City, Mexico
  96. Instituto Mexicano del Seguro Social (IMSS), Centro Médico Nacional La Raza. Hospital de Infectología, Mexico City, Mexico
  97. Clinica Comfamiliar Risaralda, Pereira, Colombia
  98. Bellvitge Biomedical Research Institute (IDIBELL), Neurometabolic Diseases Laboratory, L’Hospitalet de Llobregat, Spain
  99. Catalan Institution of Research and Advanced Studies (ICREA), Barcelona, Spain
  100. Hospital Ophir Loyola, Departamento de Ensino e Pesquisa, Belém, Pará, Brazil
  101. Unidad de Cuidados Intensivos, Hospital Clínico Universitario de Santiago (CHUS), Sistema Galego de Saúde (SERGAS), Santiago de Compostela, Spain
  102. Department of Preventive Medicine and Public Health, School of Medicine, Universidad Autónoma de Madrid, Madrid, Spain
  103. IdiPaz (Instituto de Investigación Sanitaria Hospital Universitario La Paz), Madrid, Spain
  104. IMDEA-Food Institute, CEI UAM+CSIC, Madrid, Spain
  105. Complejo Asistencial Universitario de León, León, Spain
  106. Instituto de Investigación Biosanitaria de Granada (ibs GRANADA), Granada, Spain
  107. Universidad de Granada, Departamento Bioquímica, Biología Molecular e Inmunología III, Granada, Spain
  108. Hospital Infanta Elena, Allergy Unit, Valdemoro, Madrid, Spain
  109. Instituto de Investigación Sanitaria-Fundación Jiménez Díaz University Hospital - Universidad Autónoma de Madrid (IIS-FJD, UAM), Madrid, Spain
  110. Faculty of Medicine, Universidad Francisco de Vitoria, Madrid, Spain
  111. Hospital Universitario Infanta Leonor, Madrid, Spain
  112. Complutense University of Madrid, Madrid, Spain
  113. Gregorio Marañón Health Research Institute (IiSGM), Madrid, Spain
  114. Haemostasis and Thrombosis Unit, Hospital de la Santa Creu i Sant Pau, IIB Sant Pau, Barcelona, Spain
  115. Hospital Clinico Universitario de Valladolid, Servicio de Anestesiologia y Reanimación, Valladolid, Spain
  116. Universidad de Valladolid, Departamento de Cirugía, Valladolid, Spain
  117. Hospital Clinico Universitario de Valladolid, Servicio de Hematologia y Hemoterapia, Valladolid, Spain
  118. Hospital de Niños Ricardo Gutierrez, Buenos Aires, Argentina
  119. Fundación Universitaria de Ciencias de la Salud
  120. Spanish National Cancer Research Centre, Familial Cancer Clinical Unit, Madrid, Spain
  121. University Hospital of Burgos, Burgos, Spain
  122. Universidad Simón Bolívar, Facultad de Ciencias de la Salud, Barranquilla, Colombia
  123. Centro para el Desarrollo de la Investigación Científica, Paraguay
  124. Research Center and Memory clinic, ACE Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Spain
  125. Centre for Biomedical Network Research on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
  126. CIEN Foundation/Queen Sofia Foundation Alzheimer Center, Madrid, Spain
  127. Hospital Universitario de Valme, Unidad Clínica de Enfermedades Infecciosas y Microbiología, Sevilla, Spain
  128. Sección Genética Médica - Servicio de Pediatría, Hospital Clínico Universitario Virgen de la Arrixaca, Servicio Murciano de Salud, Murcia, Spain
  129. Departamento Cirugía, Pediatría, Obstetricia y Ginecología, Facultad de Medicina, Universidad de Murcia (UMU), Murcia, Spain
  130. Grupo Clínico Vinculado, Centre for Biomedical Network Research on Rare Diseases (CIBERER), Instituto de Salud Carlos III, Madrid, Spain
  131. Department of Anthropology, University of Toronto at Mississauga, Mississauga, Ontario, Canada
  132. Tecnologico de Monterrey, Escuela de Medicina y Ciencias de la Salud, Monterrey, Mexico
  133. Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Instituto de Investigación Sanitaria de Canarias, Santa Cruz de Tenerife, Spain
  134. Centre for Biomedical Network Research on Respiratory Diseases (CIBERES), Instituto de Salud Carlos III, Madrid, Spain
  135. Department of Clinical Sciences, University Fernando Pessoa Canarias, Las Palmas de Gran Canaria, Spain

Peer review process

Revised: This Reviewed Preprint has been revised by the authors in response to the previous round of peer review; the eLife assessment and the public reviews have been updated where necessary by the editors and peer reviewers.

Editors

  • Reviewing Editor
    Siming Zhao
    Dartmouth College, Lebanon, United States of America
  • Senior Editor
    Murim Choi
    Seoul National University, Seoul, Korea, the Republic of

Reviewer #1 (Public Review):

Summary:

This paper conducted a GWAS meta-analysis for COVID-19 hospitalization among admixed American populations. The authors identified four genome-wide significant associations, including two novel loci (BAZ2B and DDIAS), and an additional risk locus near CREBBP using cross-ancestry meta-analysis. They utilized multiple strategies to prioritize risk variants and target genes. Finally, they constructed and assessed a polygenic risk score model with 49 variants associated with critical COVID-19 conditions.

Strengths:

Given that most of the previous studies were done in European ancestries, this study provides unique findings about the genetics of COVID-19 in admixed American populations. The GWAS data would be a valuable resource for the community. The authors conducted comprehensive analyses using multiple different strategies, including Bayesian fine mapping, colocalization, TWAS, etc., to prioritize risk variants and target genes. The polygenic risk score (PGS) result demonstrated the ability of the cross-population PGS model for COVID-19 risk stratification.

Weaknesses:

(1) One of the major limitations of this study is that the GWAS sample size is relatively small, which limits its power.
(2) Lack of replication cohort.
(3) Colocalization and TWAS used eQTL data from GTEx data, which are mainly from European ancestries.

Comments on latest version:

The authors addressed most of my concerns.

Reviewer #2 (Public Review):

This is a genome-wide association study of COVID-19 in individuals of admixed American ancestry (AMR) recruited from Brazil, Colombia, Ecuador, Mexico, Paraguay and Spain. After quality control and admixture analysis, a total of 3,512 individuals were interrogated for 10,671,028 genetic variants (genotyped + imputed). The genetic association results for these cohorts were meta-analyzed with the results from The Host Genetics Initiative (HGI), involving 3,077 cases and 66,686 controls. The authors found two novel genetic loci associated with COVID-19 at 2q24.2 (rs13003835) and 11q14.1 (rs77599934), and other two independent signals at 3p21.31 (rs35731912) and 6p21.1 (rs2477820) already reported as associated with COVID-19 in previous GWASs. Additional meta-analysis with other HGI studies also suggested risk variants near CREBBP, ZBTB7A and CASC20 genes.

Strengths:

These findings rely on state-of-the-art methods in the field of Statistical Genomics and help to address the issue of a low number of GWASs in non-European populations, ultimately contributing to reducing health inequalities across the globe.

Weaknesses:

There is no replication cohort, as acknowledged by the authors (page 29, line 587) and no experimental validation to assess the biological effect of putative causal variants/genes. Thus, the study provides good evidence of association, rather than causation, between the genetic variants and COVID-19.

Comments on latest version:

The issues identified in the first round of review were well addressed by the authors in the revised version of the manuscript.

Reviewer #3 (Public Review):

Summary:

In the context of the SCOURGE consortium's research, the authors conduct a GWAS meta-analysis on 4,702 hospitalized individuals of admixed American descent suffering from COVID-19. This study identified four significant genetic associations, including two loci initially discovered in Latin American cohorts. Furthermore, a trans-ethnic meta-analysis highlighted an additional novel risk locus in the CREBBP gene, underscoring the critical role of genetic diversity in understanding the pathogenesis of COVID-19.

Strengths:

(1) The study identified two novel severe COVID-19 loci (BAZ2B and DDIAS) by the largest GWAS meta-analysis for COVID-19 hospitalization in admixed Americans.
(2) With a trans-ethnic meta-analysis, an additional risk locus near CREBBP was identified.

Weaknesses:

(1) The GWAS power is limited due to the relatively small number of cases.

(2) There is no replication study for the novel severe COVID-19 loci, which may lead to false positive findings.

(3) The variants selected for the PGS appear arbitrary and may not leverage the GWAS findings.

(4) The TWAS models were predominantly trained on European samples, and there is no replication study for the findings as well.

Author response:

The following is the authors’ response to the original reviews.

Public Reviews:

Reviewer #1 (Public Review):

Summary:

This paper conducted a GWAS meta-analysis for COVID-19 hospitalization among admixed American populations. The authors identified four genome-wide significant associations, including two novel loci (BAZ2B and DDIAS), and an additional risk locus near CREBBP using cross-ancestry meta-analysis. They utilized multiple strategies to prioritize risk variants and target genes. Finally, they constructed and assessed a polygenic risk score model with 49 variants associated with critical COVID-19 conditions.

Strengths:

Given that most of the previous studies were done in European ancestries, this study provides unique findings about the genetics of COVID-19 in admixed American populations. The GWAS data would be a valuable resource for the community. The authors conducted comprehensive analyses using multiple different strategies, including Bayesian fine mapping, colocalization, TWAS, etc., to prioritize risk variants and target genes. The polygenic risk score (PGS) result demonstrated the ability of the cross-population PGS model for COVID-19 risk stratification.

Thank you very much for the positive comments and the willingness to revise this manuscript.

Weaknesses:

(1) One of the major limitations of this study is that the GWAS sample size is relatively small, which limits its power.

(2) The fine mapping section is unclear and there is a lack of information. The authors assumed one causal signal per locus, and only provided credible sets, but did not provide posterior inclusion probabilities (PIP) for the variants to be causal.

(3) Colocalization and TWAS used eQTL data from GTEx data, which are mainly from European ancestries. It is unclear how much impact the ancestry mismatch would have on the result. The readers should be cautious when interpreting the results and designing follow-up studies.

We agree that the sample size is relatively small. Despite that, it was sufficient to reveal novel risk loci, supporting the robustness of the main findings. We have indicated this limitation at the end of the Discussion section.

Thank you for raising this point. As suggested, we have also used SuSiE, which allows more than one causal signal per locus. However, in this case the results were not different from those obtained with the original Bayesian colocalization performed with corrcoverage. Regarding the PIP, at the fine-mapping stage we are inclined to put more weight on the functional annotations of the variants in the credible set than on their statistical contributions to the signal. For this reason, we prefer not to put weight on the PIP of the variants but to prioritize variants that were enriched in functional annotations.
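
To make the terms in this exchange concrete, the following is a minimal sketch of how posterior inclusion probabilities (PIPs) and a 95% credible set can be derived from summary statistics when a single causal variant per locus is assumed. It uses Wakefield approximate Bayes factors and is not the authors' corrcoverage or SuSiE pipeline; the function names, the toy effect sizes, and the prior variance of 0.04 are illustrative assumptions.

```python
import math

def approx_bayes_factors(betas, ses, prior_var=0.04):
    """Wakefield approximate Bayes factors in favour of association.

    prior_var is an assumed prior variance of the true effect (log-odds scale).
    """
    abfs = []
    for beta, se in zip(betas, ses):
        v = se ** 2
        r = prior_var / (prior_var + v)
        z = beta / se
        abfs.append(math.sqrt(1.0 - r) * math.exp(0.5 * r * z * z))
    return abfs

def posterior_inclusion_probs(abfs):
    """Under one causal variant per locus, the PIP is the normalised Bayes factor."""
    total = sum(abfs)
    return [a / total for a in abfs]

def credible_set(pips, coverage=0.95):
    """Smallest set of variant indices whose cumulative PIP reaches the coverage."""
    order = sorted(range(len(pips)), key=lambda i: pips[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += pips[i]
        if cumulative >= coverage:
            break
    return chosen

# Toy summary statistics for five hypothetical variants at one locus
betas = [0.30, 0.28, 0.05, 0.02, 0.10]
ses = [0.05, 0.05, 0.05, 0.05, 0.05]
pips = posterior_inclusion_probs(approx_bayes_factors(betas, ses))
cs95 = credible_set(pips)
```

In this toy locus the two strongest variants together reach the 95% threshold, so the credible set lists both while the PIPs show how unevenly the posterior mass is split between them. That distinction is what the reviewer asked to see reported.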

This is a good point regarding the lack of diversity in GTEx data. We have also used data from AMR populations (GALA II-SAGE models), although these were only available for blood tissue. Regarding the ancestry mismatch between datasets, several studies have attempted to explore the impact. Gay et al. (PMID: 32912333) studied local ancestry effects on eQTLs from the GTEx consortium and concluded that adjustment of eQTLs by local ancestry only yields modest improvement over using global ancestry (as done in GTEx). Moreover, the colocalization results obtained when adjusting by local ancestry and by global ancestry were not significantly different. In addition, Mogil et al. (PMID: 30096133) observed that genes with higher heritability share genetic architecture between populations. Nevertheless, both studies showed decreased power and poorer predictive performance for gene expression because of reduced diversity in eQTL analyses. As a consequence of the ancestry mismatch, we now warn readers that this may compromise signal detection (Discussion, lines 531-533).

Reviewer #2 (Public Review):

This is a genome-wide association study of COVID-19 in individuals of admixed American ancestry (AMR) recruited from Brazil, Colombia, Ecuador, Mexico, Paraguay, and Spain. After quality control and admixture analysis, a total of 3,512 individuals were interrogated for 10,671,028 genetic variants (genotyped + imputed). The genetic association results for these cohorts were meta-analyzed with the results from The Host Genetics Initiative (HGI), involving 3,077 cases and 66,686 controls. The authors found two novel genetic loci associated with COVID-19 at 2q24.2 (rs13003835) and 11q14.1 (rs77599934), and other two independent signals at 3p21.31 (rs35731912) and 6p21.1 (rs2477820) already reported as associated with COVID-19 in previous GWASs. Additional meta-analysis with other HGI studies also suggested risk variants near CREBBP, ZBTB7A, and CASC20 genes.

Strengths:

These findings rely on state-of-the-art methods in the field of Statistical Genomics and help to address the issue of a low number of GWASs in non-European populations, ultimately contributing to reducing health inequalities across the globe.

Thank you very much for the positive comments and the willingness to revise this manuscript.

Weaknesses:

There is no replication cohort, as acknowledged by the authors (page 29, line 587), and no experimental validation to assess the biological effect of putative causal variants/genes. Thus, the study provides good evidence of association, rather than causation, between the genetic variants and COVID-19. Lastly, I consider it crucial to report the results for the SCOURGE Latin American GWAS, in addition to its meta-analysis with HGI results, since HGI data has a different phenotype scheme (Hospitalized COVID vs Population) compared to SCOURGE (Hospitalized COVID vs Non-hospitalized COVID).

We essentially agree with the reviewer that one of the main limitations of the study is the lack of a replication stage, because all available datasets were used in a one-stage analysis. To contribute to the interpretation of the findings in the absence of a replication stage, we have now assessed the replicability of the novel loci using the Meta-Analysis Model-based Assessment of replicability (MAMBA) approach (PMID: 33785739) and included the posterior probabilities of replication in Table 2. We also further explored the potential replicability of signals in other populations. We agree that the results should be interpreted in terms of associations given the lack of functional validation of the main findings, so we have slightly modified the Discussion.

As suggested, the SCOURGE Latin American GWAS summary is now accessible by direct request to the Consortium GitHub repository (https://github.com/CIBERER/Scourge-COVID19) (lines 797-799). We have also included the results from the SCOURGE GWAS analysis for the replication of the 40 lead variants in Supplementary Table 12. Results from the SCOURGE GWAS for the lead variants in the AMR meta-analysis with HGI were already included in Supplementary Table 2. Note that we have not been able to conduct the meta-analysis with the same hospitalization scheme as in the HGI study, since the population-specific results for those analyses were not publicly released. However, sensitivity analyses included within the supplementary material from the COVID-19 Host Genetics Initiative (2021) stated that there were no significant differences in effects (odds ratios) between analyses using population controls or just non-hospitalized COVID-19 patients.
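
As a rough illustration of how per-study effects are combined in a GWAS meta-analysis such as the one discussed here, the sketch below pools log odds ratios with inverse-variance weights under a fixed-effect model. The response does not state which meta-analysis model or software was used, so this is an assumption for illustration only, and the study values are invented.

```python
import math

def fixed_effect_meta(estimates):
    """Inverse-variance weighted pooling.

    estimates: list of (log_odds_ratio, standard_error), one pair per study.
    Returns the pooled log odds ratio and its standard error.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Two hypothetical studies reporting the same variant
studies = [(0.20, 0.10), (0.30, 0.10)]
pooled_beta, pooled_se = fixed_effect_meta(studies)
pooled_or = math.exp(pooled_beta)
```

Because the weights depend only on the standard errors, studies that define controls differently can still be pooled. Whether that is appropriate rests on the assumption, supported here by the cited HGI sensitivity analyses, that the two control definitions estimate comparable effects.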

Reviewer #3 (Public Review):

Summary:

In the context of the SCOURGE consortium's research, the authors conduct a GWAS meta-analysis on 4,702 hospitalized individuals of admixed American descent suffering from COVID-19. This study identified four significant genetic associations, including two loci initially discovered in Latin American cohorts. Furthermore, a trans-ethnic meta-analysis highlighted an additional novel risk locus in the CREBBP gene, underscoring the critical role of genetic diversity in understanding the pathogenesis of COVID-19.

Strengths:

(1) The study identified two novel severe COVID-19 loci (BAZ2B and DDIAS) by the largest GWAS meta-analysis for COVID-19 hospitalization in admixed Americans.

(2) With a trans-ethnic meta-analysis, an additional risk locus near CREBBP was identified.

Thank you very much for the positive comments and the willingness to revise this manuscript.

Weaknesses:

(1) The GWAS power is limited due to the relatively small number of cases.

(2) There is no replication study for the novel severe COVID-19 loci, which may lead to false positive findings.

We agree that the sample size is relatively small. Despite that, it was sufficient to reveal novel risk loci, supporting the robustness of the main findings. We have indicated this limitation at the end of the Discussion section.

Regarding the lack of a replication study, we now assessed the replicability of the novel loci using the Meta-Analysis Model-based Assessment of replicability (MAMBA) approach (PMID: 33785739). We have included the posterior probabilities of replication in Table 2.

(3) Significant differences exist in the ages between cases and controls, which could potentially introduce biased confounders. I'm curious about how the authors treated age as a covariate. For instance, did they use ten-year intervals? This needs clarification for reproducibility.

Thank you for raising this point. Age was included as a continuous variable. This is now indicated in line 667 (within Material and Methods).

(4) "Those in the top PGS decile exhibited a 5.90-fold (95% CI=3.29-10.60, p=2.79×10⁻⁹) greater risk compared to individuals in the lowest decile". I would recommend comparing with the 40-60% PGS decile rather than the lowest decile, as the lowest PGS decile does not represent 'normal controls'.

Thank you. In the revised version, the PGS categories were compared following the recommendation (lines 461-463).
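
The recommended comparison can be sketched as follows: individuals are ranked by PGS, the top decile is taken as the target group, and the 40-60% band serves as the reference. This is an unadjusted two-by-two odds ratio with a Wald confidence interval, shown only to illustrate the choice of reference group; the manuscript's estimates presumably come from covariate-adjusted models, and the function name and toy data are assumptions.

```python
import math

def odds_ratio_vs_reference(scores, is_case, target=(0.9, 1.0), reference=(0.4, 0.6)):
    """Odds ratio of being a case in the target band versus the reference band.

    Bands are given as (lower, upper) rank fractions of the score distribution.
    Returns (odds_ratio, ci_lower, ci_upper) using a 95% Wald interval.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    counts = {"target": [0, 0], "reference": [0, 0]}  # [cases, controls]
    for rank, i in enumerate(order):
        frac = rank / n
        for name, (lo, hi) in (("target", target), ("reference", reference)):
            if lo <= frac < hi:
                counts[name][0 if is_case[i] else 1] += 1
    a, b = counts["target"]
    c, d = counts["reference"]
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se)

# Toy cohort of 100 people: the top decile holds 8 cases and 2 controls,
# and the 40-60% band holds 8 cases and 12 controls.
scores = [float(i) for i in range(100)]
is_case = [(40 <= i < 48) or (90 <= i < 98) for i in range(100)]
odds_ratio, ci_low, ci_high = odds_ratio_vs_reference(scores, is_case)
```

Using the middle band as the reference expresses risk relative to a typical individual rather than to the most protected tail, which is the point of the reviewer's comment.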

(5) In the field of PGS, it's common to require an independent dataset for training and testing the PGS model. Here, there seems to be an overfitting issue due to using the same subjects for both training and testing the variants.

We are sorry for the misunderstanding. In fact, we followed the standard practice to avoid overfitting of the PGS model and used different training and testing datasets. The training data (GWAS) was the HGI-B2 ALL meta-analysis, in which our AMR GWAS was not included. The PRS model was then tested in the SCOURGE AMR cohort. However, it is true that we did test the combination of the PRS with the newly discovered variants in the SCOURGE cohort. To avoid potential overfitting from adding the new loci, we have excluded from the manuscript the results that included the newly discovered variants.
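
The separation described in this response can be sketched in a few lines: the weights come from an external discovery GWAS, and the score is computed only in the independent target cohort. The variant identifiers, weights, and dosages below are hypothetical and do not correspond to the variants in Supplementary Table 14.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele dosages for one individual.

    dosages: variant id -> allele dosage (0 to 2) in the target cohort.
    weights: variant id -> log odds ratio estimated in an external GWAS.
    Variants missing from the individual's dosages are skipped.
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)

# Hypothetical weights from a discovery GWAS that excludes the target cohort
external_weights = {"var1": 0.20, "var2": -0.10, "var3": 0.05}

# Hypothetical genotype dosages for two individuals in the target cohort
target_cohort = {
    "ind1": {"var1": 2, "var2": 0, "var3": 1},
    "ind2": {"var1": 0, "var2": 2, "var3": 1},
}

pgs = {ind: polygenic_score(d, external_weights) for ind, d in target_cohort.items()}
```

No quantity in this calculation is estimated from the target cohort itself, which is why scoring with external weights avoids the overfitting concern. Adding variants discovered in the same cohort would break that separation.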

(6) The variants selected for the PGS appear arbitrary and may not leverage the GWAS findings without an independent training dataset.

Again, we are sorry for the misunderstanding. The PGS model was built with 43 variants associated with hospitalization or severity within the HGI v7 results and 7 that were discovered by the GenOMICC consortium in their latest study and were not in the latest HGI release. The variants are listed in Supplementary Table 14, where we have now annotated the discovery GWAS.

(7) The TWAS models were predominantly trained on European samples, and there is no replication study for the findings as well.

This is a good point regarding the lack of diversity in GTEx data. We have also used data from AMR populations (GALA II-SAGE models), although these were only available for blood tissue. Regarding the ancestry mismatch between datasets, several studies have attempted to explore its impact. Gay et al. (PMID: 32912333) studied local ancestry effects on eQTLs from the GTEx consortium and concluded that adjusting eQTLs for local ancestry yields only a modest improvement over using global ancestry (as done in GTEx). Moreover, the colocalization results obtained when adjusting for local ancestry and for global ancestry were not significantly different. In addition, Mogil et al. (PMID: 30096133) observed that genes with higher heritability share genetic architecture between populations. Nevertheless, both studies showed decreased power and poorer predictive performance for gene expression because of reduced diversity in eQTL analyses. As a consequence of the ancestry mismatch, we now warn readers that this may compromise signal detection (Discussion, lines 531-533).

Recommendations for the authors:

Reviewer #1 (Recommendations For The Authors):

(1) The authors mentioned the fine mapping method did not converge for the locus in chr 11. I would consider trying a different fine-mapping method (such as SuSiE or FINEMAP). It would be helpful to provide posterior inclusion probabilities (PIP) for the variants in fine mapping results and plot the PIP values in the regional association plots.

As suggested, we have also used SuSiE, which allows more than one causal signal per locus to be assumed. However, in this case the results were not different from those obtained with the original Bayesian colocalization performed with corrcoverage. SuSiE's fine-mapping for chromosome 11 prioritized a single variant, which is likely due to its low frequency. Thus, we have maintained the fine-mapping as originally described in the previous version of the manuscript but have now included the credible set in Supplementary Table 6.

Regarding the PIP, at the fine-mapping stage we are inclined to put more weight on the functional annotations of the variants in the credible set than on their statistical contributions to the signal. This is why we prefer not to rely on the PIP of the variants and instead prioritize variants that were enriched in functional annotations.

(2) Please provide more detailed information about the VEP and V2G analysis and how to interpret those results. My understanding of V2G is that it includes different sources of information (such as molecular QTLs and chromatin interactions from different tissues/cell types, etc.). It is unclear what sources of information and weight settings were used in the V2G model.

Thank you for raising this point. As suggested, we have clarified the basis for VEP and V2G and the interpretation of their results (lines 732-743).

(3) The authors identified multiple genes with different strategies, e.g. FUMA, V2G, COLOC, TWAS, etc. How many genes were found/supported by evidence provided by multiple methods? It could be helpful to have a table summarizing the risk genes found by different strategies, and the evidence supporting the genes. e.g. which genes are found by which methods, and the biological functions of the genes, etc.

Thank you for raising this point. As suggested, we have now added a new figure (Figure 5) to summarize the findings from the multiple methods used.

(4) It would be helpful to make the code/scripts available for reproducibility.

As suggested, the SCOURGE Latin American GWAS summary statistics and the analysis scripts (https://github.com/CIBERER/Scourge-COVID19/tree/main/scripts/novel-risk-hosp-AMR-2024) are now accessible in the Consortium GitHub repository (https://github.com/CIBERER/Scourge-COVID19) (lines 806-807).

(5) The fonts in some of the figures (e.g. Figure 2) are hard to read.

Thank you. We have now included the figures as SVG files.

Reviewer #2 (Recommendations For The Authors):

- The abstract lacks a conclusion sentence.

Thank you. As suggested, we have included two additional sentences with broad conclusions from the study. We preferred to avoid relying on conclusions related to known or new biological links of the prioritized genes given the lack of functional validation of main findings.

- Regarding the association analysis (page 27, line 677), I wonder if some of the 10 principal components (PCs) are capturing information about the recruitment areas (countries). It may be relevant to test for multicollinearity among these variables.

Since we acknowledge that some, but not all, of the categories might be correlated with a given PC, we calculated generalized variance inflation factor (GVIF) values for the main variables in order to assess the categorical variable as a single entity. The scaled GVIF^(1/(2*Df)) value for the categorical variable is 1.52. Squaring this value gives 2.31, to which the usual rule-of-thumb thresholds for VIF values can be applied.
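To make the quantity concrete, the following Python sketch computes a GVIF from the correlation matrix of the predictors, using the standard definition det(R11) * det(R22) / det(R), and then applies the 1/(2*Df) scaling. It is an illustration under stated assumptions, not the code used for the reported values: the function names are hypothetical, and the categorical variable is assumed to be dummy-coded with one reference level dropped.

```python
import numpy as np

def gvif(X, block_cols):
    """Generalized VIF for a block of columns of the design matrix X.

    X: array of shape (n_samples, n_predictors), without the intercept
    block_cols: column indices for the term of interest, e.g. the dummy
                columns of a categorical variable (reference level dropped)
    """
    X = np.asarray(X, dtype=float)
    R = np.corrcoef(X, rowvar=False)            # correlation of all predictors
    block = np.asarray(block_cols)
    rest = np.setdiff1d(np.arange(X.shape[1]), block)
    R11 = R[np.ix_(block, block)]               # correlations within the term
    R22 = R[np.ix_(rest, rest)]                 # correlations among the others
    return np.linalg.det(R11) * np.linalg.det(R22) / np.linalg.det(R)

def scaled_gvif(gvif_value, df):
    """GVIF^(1/(2*Df)); its square is comparable to an ordinary VIF."""
    return gvif_value ** (1.0 / (2.0 * df))
```

For a term with a single degree of freedom the GVIF reduces to the ordinary VIF, which is why the squared scaled value (here 1.52 squared, about 2.31) can be read against the usual VIF thresholds.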

- Still on the topic of association analysis, did the authors adjust the logistic model for comorbidities variables from Table 1? Given these comorbidities also have a genetic component and their distribution differs between non-hospitalized vs hospitalized, I am concerned that comorbidities might be confounding the association between genetic variants and COVID.

We did not adjust for comorbidities since the HGI studies were not adjusted either and we aimed to be as aligned as possible with HGI. However, as suggested, we have now tested the association between each of the comorbidities in Table 1 and each of the variants in Table 2, using the comorbidities as dependent variables and adjusting for the main covariables (age, sex, PCs and country of recruitment). None of the variants were significantly associated with the comorbidities (line 333).

- If I understood correctly, the 49 genetic variants used to develop the polygenic risk score model (PRS) were based on the HGI total sample size (data release 7), which is predominantly of European ancestry. I am concerned about the prediction accuracy in the AMR population (PRS transferability issue).

We searched the literature for other PRS in order to compare the associated OR in our cohort with ORs calculated in European populations. Horowitz et al. (2022) reported an OR of 1.38 for the top 10% with respect to hospitalization risk in European individuals using a GRS with 12 variants.

We acknowledge that this might be an issue, and it is now explained in the Discussion of the revised version (lines 561-568). However, as this is the first time a PRS for COVID-19 has been applied to a relatively large AMR cohort, we believe that this analysis will be of value for further analyses of PRS transferability, providing a source for comparison in future studies.

- On page 23, line 579, the authors acknowledge their "GWAS is underpowered". This sentence requires a sample/power calculation, otherwise, I suggest using "is likely underpowered".

Thanks for the input. We have modified the sentence as suggested.

Reviewer #3 (Recommendations For The Authors):

I wonder if the authors have an approximate date when the GWAS summary statistic will be available. I reviewed some manuscripts in the past, and the authors claimed they would deposit the data soon, but in fact it would not happen until 2 years later.

The summary statistics are already available from the SCOURGE Consortium repository https://github.com/CIBERER/Scourge-COVID19 (lines 806-807).
