Evaluation of in silico predictors on short nucleotide variants in HBA1, HBA2, and HBB associated with haemoglobinopathies

  1. Stella Tamana
  2. Maria Xenophontos
  3. Anna Minaidou
  4. Coralea Stephanou
  5. Cornelis L Harteveld
  6. Celeste Bento
  7. Joanne Traeger-Synodinos
  8. Irene Fylaktou
  9. Norafiza Mohd Yasin
  10. Faidatul Syazlin Abdul Hamid
  11. Ezalia Esa
  12. Hashim Halim-Fikri
  13. Bin Alwi Zilfalil
  14. Andrea C Kakouri
  15. ClinGen Hemoglobinopathy Variant Curation Expert Panel
  16. Marina Kleanthous
  17. Petros Kountouris (corresponding author)
  1. Molecular Genetics Thalassaemia Department, The Cyprus Institute of Neurology and Genetics, Cyprus
  2. Leiden University Medical Center, Netherlands
  3. Centro Hospitalar e Universitário de Coimbra, Portugal
  4. Laboratory of Medical Genetics, National and Kapodistrian University of Athens, Greece
  5. Division of Endocrinology, Metabolism and Diabetes, First Department of Pediatrics, National and Kapodistrian University of Athens, Greece
  6. Haematology Unit, Cancer Research Centre, Institute for Medical Research, National Institutes of Health (NIH), Ministry of Health Malaysia, Malaysia
  7. Malaysian Node of the Human Variome Project, School of Medical Sciences, Health Campus, Universiti Sains Malaysia, Malaysia
  8. Human Genome Centre, School of Medical Sciences, Health Campus, Universiti Sains Malaysia, Malaysia

Peer review process

This article was accepted for publication as part of eLife's original publishing model.

History

  1. Version of Record published
  2. Accepted Manuscript published
  3. Accepted
  4. Received
  5. Preprint posted

Decision letter

  1. Robert Baiocchi
    Reviewing Editor; The Ohio State University, United States
  2. Mone Zaidi
    Senior Editor; Icahn School of Medicine at Mount Sinai, United States
  3. Mohammad Hamid
    Reviewer; Pasteur Institute of Iran, Islamic Republic of Iran

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Evaluation of in silico predictors on short nucleotide variants in HBA1, HBA2 and HBB associated with haemoglobinopathies" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Mone Zaidi as the Senior Editor. The following individual involved in the review of your submission has agreed to reveal their identity: Mohammad Hamid (Reviewer #2).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

Reviewer #1 identified multiple issues that need to be addressed by the authors. Please provide a point-by-point statement in response to these comments. This may require substantial revision of the manuscript.

Reviewer #1 (Recommendations for the authors):

– Both the lists of annotations for the dataset of variants should be provided and the authors must also provide a comparison of the original database annotations and their revised annotations in the form of a figure panel or table. This will help determine whether the observed low specificity for in silico predictions was due to the revised annotations.

– All the classification benchmarks and parameters must be explored and presented in the results for the improved approach with separate pathogenic and benign thresholds in Table 2: The addition of accuracy, sensitivity, specificity and MCC will enable comparison with classification using the same pathogenic and benign thresholds in Table 1. This data is present in supplementary file 3 but the binary classification metrics for the tools and thresholds shown in table 2 should be displayed alongside.

– The authors must discuss why the performance of certain tools was better or worse than others to help other researchers not familiar with these studies obtain a better understanding of the tools. When the improved approach was applied, certain tools performed better than others for certain classes of variants. The reasons for this must be clearly explained and if not known, then an attempt must be made to determine them. This is in the interest of selecting appropriate tools for in silico prediction by other groups based on some knowledge of the underlying functioning of these classifiers.

– Concordance or discordance among the tools after setting separate thresholds (shown in Figure 3B and 3C) would be understood easier if presented as in supplementary figure 2. Why is there more concordance for HBB and less for HBA variants initially? And why is there low concordance after the improved approach? Is it also low for HBB or does it show the same pattern as before? Authors must discuss the likely reasons.

– If possible the authors must evaluate metapredictors separately since some of them like CADD take as input scores from other in silico tools also used in this comparison. Did metapredictors perform better in general and after the improvement?

– Please explain the rationale for the proposed improvement by setting separate decision thresholds for pathogenic and benign classification. Why focus on the likelihood ratio instead of MCC or the balance of accuracy, sensitivity and specificity? Why is increasing specificity at the expense of reduced sensitivity better in this case according to the authors' judgement?

Reviewer #2 (Recommendations for the authors):

Although this study was done in a small cohort of patients, I suggested that the paper be accepted as an original article.

https://doi.org/10.7554/eLife.79713.sa1

Author response

Reviewer #1 (Recommendations for the authors):

– Both the lists of annotations for the dataset of variants should be provided and the authors must also provide a comparison of the original database annotations and their revised annotations in the form of a figure panel or table. This will help determine whether the observed low specificity for in silico predictions was due to the revised annotations.

We have now addressed this point. Please check our response in “Comment 2”.

– All the classification benchmarks and parameters must be explored and presented in the results for the improved approach with separate pathogenic and benign thresholds in Table 2: The addition of accuracy, sensitivity, specificity and MCC will enable comparison with classification using the same pathogenic and benign thresholds in Table 1. This data is present in supplementary file 3 but the binary classification metrics for the tools and thresholds shown in table 2 should be displayed alongside.

We have now added sensitivity at the pathogenic threshold and specificity at the benign threshold to Table 2. However, in the analysis where we trichotomise the problem, we do not consider MCC, specificity at the pathogenic threshold, or sensitivity at the benign threshold to be informative; reporting them would be rather misleading. All these metrics for the two independent binary predictors (pathogenic and benign) are available in Supplementary File 3. Please also check our response to “Comment 3”.
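To make the distinction between the metrics concrete, the following minimal Python sketch shows how a single predictor score can be trichotomised with separate benign and pathogenic thresholds, and how sensitivity at the pathogenic threshold and specificity at the benign threshold are then obtained. It is an illustration only, not the code used in the study: the function names, scores, labels and thresholds are hypothetical, and it assumes that higher scores indicate pathogenicity.

```python
from math import sqrt


def trichotomise(score, benign_max, pathogenic_min):
    """Assign a call using separate benign and pathogenic thresholds.

    Assumes higher scores are more pathogenic and benign_max < pathogenic_min.
    """
    if score >= pathogenic_min:
        return "pathogenic"
    if score <= benign_max:
        return "benign"
    return "uncertain"


def binary_metrics(tp, fp, tn, fn):
    """Standard binary metrics for a single-threshold classifier."""
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


def two_threshold_summary(scores, labels, benign_max, pathogenic_min):
    """Metrics for the trichotomised setting.

    labels holds the curated class of each variant: "pathogenic" or "benign".
    """
    calls = [trichotomise(s, benign_max, pathogenic_min) for s in scores]
    n_path = labels.count("pathogenic")
    n_ben = labels.count("benign")
    tp = sum(1 for c, l in zip(calls, labels) if c == l == "pathogenic")
    tn = sum(1 for c, l in zip(calls, labels) if c == l == "benign")
    return {
        "sensitivity_at_pathogenic_threshold": tp / n_path if n_path else float("nan"),
        "specificity_at_benign_threshold": tn / n_ben if n_ben else float("nan"),
        "uncertain_fraction": calls.count("uncertain") / len(calls),
    }


# Made-up example data, for illustration only.
scores = [0.95, 0.80, 0.55, 0.40, 0.10, 0.05]
labels = ["pathogenic", "pathogenic", "pathogenic", "benign", "benign", "benign"]
summary = two_threshold_summary(scores, labels, benign_max=0.2, pathogenic_min=0.7)
print(summary)
```

In this setting, variants that fall between the two thresholds receive no call, so metrics that assume every variant is classified, such as MCC, no longer describe the predictor as a whole.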

– The authors must discuss why the performance of certain tools was better or worse than others to help other researchers not familiar with these studies obtain a better understanding of the tools. When the improved approach was applied, certain tools performed better than others for certain classes of variants. The reasons for this must be clearly explained and if not known, then an attempt must be made to determine them. This is in the interest of selecting appropriate tools for in silico prediction by other groups based on some knowledge of the underlying functioning of these classifiers.

We have now addressed this point.

– Concordance or discordance among the tools after setting separate thresholds (shown in Figure 3B and 3C) would be understood easier if presented as in supplementary figure 2. Why is there more concordance for HBB and less for HBA variants initially? And why is there low concordance after the improved approach? Is it also low for HBB or does it show the same pattern as before? Authors must discuss the likely reasons.

We have now addressed this point.

– If possible the authors must evaluate metapredictors separately since some of them like CADD take as input scores from other in silico tools also used in this comparison. Did metapredictors perform better in general and after the improvement?

We followed the same analysis methodology for all tools included in the study, including metapredictors. We highlight that metapredictors are superior in predicting the pathogenicity of globin gene variants, and we discuss possible reasons for this in a newly added paragraph in the Discussion (pg. 19; lines 417-426).

– Please explain the rationale for the proposed improvement by setting separate decision thresholds for pathogenic and benign classification. Why focus on the likelihood ratio instead of MCC or the balance of accuracy, sensitivity and specificity? Why is increasing specificity at the expense of reduced sensitivity better in this case according to the authors' judgement?

One of the main objectives of this study is to provide evidence for the use of in silico predictors under the Bayesian ACMG/AMP framework. This framework specifies a likelihood ratio (LR) threshold for each evidence strength level. Selecting decision thresholds that reach these LR values can decrease overall sensitivity or specificity, but it increases the confidence that pathogenic or benign calls are correct. Please also see our response to “Comment 4”.
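As an illustration of this trade-off, the minimal Python sketch below computes positive and negative likelihood ratios from confusion-matrix counts and maps them to an evidence strength. It is not the authors' pipeline: the helper names and the counts are hypothetical. The cut-offs (2.08, 4.33, 18.7, 350) are the odds of pathogenicity from the published Bayesian adaptation of the ACMG/AMP framework (Tavtigian et al., 2018); they are included as an assumption, since this response does not state them, and benign cut-offs are taken as their reciprocals.

```python
# ASSUMPTION: odds-of-pathogenicity cut-offs from the Bayesian adaptation of
# the ACMG/AMP framework (Tavtigian et al., 2018); not stated in this letter.
STRENGTH_CUTOFFS = [
    ("very strong", 350.0),
    ("strong", 18.7),
    ("moderate", 4.33),
    ("supporting", 2.08),
]


def likelihood_ratios(tp, fp, tn, fn):
    """Return (LR+, LR-) from confusion-matrix counts.

    Assumes both classes are present, i.e. tp + fn > 0 and tn + fp > 0.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (fp + tn)
    false_negative_rate = fn / (tp + fn)
    lr_pos = sensitivity / false_positive_rate if fp else float("inf")
    lr_neg = false_negative_rate / specificity if tn else float("inf")
    return lr_pos, lr_neg


def pathogenic_strength(lr_pos):
    """Strongest pathogenic evidence level whose cut-off LR+ reaches."""
    for name, cutoff in STRENGTH_CUTOFFS:
        if lr_pos >= cutoff:
            return name
    return None


def benign_strength(lr_neg):
    """Strongest benign evidence level, using reciprocal cut-offs on LR-."""
    for name, cutoff in STRENGTH_CUTOFFS:
        if lr_neg <= 1.0 / cutoff:
            return name
    return None


# Made-up counts: a strict pathogenic threshold versus a lenient one.
strict_lr_pos, _ = likelihood_ratios(tp=45, fp=5, tn=95, fn=55)
lenient_lr_pos, _ = likelihood_ratios(tp=90, fp=40, tn=60, fn=10)
print(strict_lr_pos, pathogenic_strength(strict_lr_pos))
print(lenient_lr_pos, pathogenic_strength(lenient_lr_pos))
```

With these invented counts, the strict threshold has lower sensitivity (0.45 versus 0.90) but a higher LR+ and therefore a stronger evidence level, which is the behaviour the response describes.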

Reviewer #2 (Recommendations for the authors):

Although this study was done in a small cohort of patients, I suggested that the paper be accepted as an original article.

We would like to thank the reviewer for the positive evaluation.

https://doi.org/10.7554/eLife.79713.sa2

Cite this article

Stella Tamana, Maria Xenophontos, Anna Minaidou, Coralea Stephanou, Cornelis L Harteveld, Celeste Bento, Joanne Traeger-Synodinos, Irene Fylaktou, Norafiza Mohd Yasin, Faidatul Syazlin Abdul Hamid, Ezalia Esa, Hashim Halim-Fikri, Bin Alwi Zilfalil, Andrea C Kakouri, ClinGen Hemoglobinopathy Variant Curation Expert Panel, Marina Kleanthous, Petros Kountouris (2022) Evaluation of in silico predictors on short nucleotide variants in HBA1, HBA2, and HBB associated with haemoglobinopathies. eLife 11:e79713. https://doi.org/10.7554/eLife.79713
