Meta-Research: A comprehensive review of randomized clinical trials in three medical journals reveals 396 medical reversals
Abstract
The ability to identify medical reversals and other low-value medical practices is an essential prerequisite for efforts to reduce spending on such practices. Through an analysis of more than 3000 randomized controlled trials (RCTs) published in three leading medical journals (the Journal of the American Medical Association, the Lancet, and the New England Journal of Medicine), we have identified 396 medical reversals. Most of the studies (92%) were conducted on populations in high-income countries, cardiovascular disease was the most common medical category (20%), and medication was the most common type of intervention (33%).
https://doi.org/10.7554/eLife.45183.001
Introduction
Low-value medical practices are medical practices that are either ineffective or that cost more than other options but only offer similar effectiveness (Prasad et al., 2013; Prasad et al., 2011; Schpero, 2014). Such practices can result in physical and emotional harm, undermine public trust in medicine, and have both an opportunity cost (Korenstein et al., 2018) and a financial cost (Reid et al., 2016; Beaudin-Seiler, 2016). Identifying and eliminating low-value medical practices will, therefore, reduce costs and improve care.
Medical reversals are a subset of low-value medical practices and are defined as practices that have been found, through randomized controlled trials, to be no better than a prior or lesser standard of care (Prasad et al., 2013; Prasad et al., 2011). It can, however, be difficult to identify medical reversals. For example, Cochrane reviews provide high-quality evidence on medical practices (Garner et al., 2013), but each review focuses on only one practice and many practices have not been reviewed by Cochrane. The Choosing Wisely initiative in the US maintains a list of low-value medical practices, but it relies on medical organizations to report such practices and often includes only those practices where there is a high degree of consensus (Beaudin-Seiler, 2016).
Here we report how a systematic search of randomized controlled trials in three leading medical journals – the Journal of the American Medical Association (JAMA), the Lancet, and the New England Journal of Medicine (NEJM) – identified 396 medical reversals. It is our hope that, by building on previous efforts in this area (Prasad et al., 2013), this list will help others to eliminate the use of these practices.
Results
We reviewed JAMA and the Lancet between 2003 and 2017, and NEJM between 2011 and 2017, and identified a total of 7036 original articles (Figure 1; 2911 in JAMA, 2624 in the Lancet, and 1501 in NEJM). There were 3017 articles reporting the results of randomized controlled trials regarding a medical practice, and these articles were further coded for novelty/establishment and for whether the outcomes were positive, negative, or inconclusive. After excluding studies that were novel (n = 1373) or established with positive or inconclusive outcomes (n = 1229), there were 415 (14%) studies identified as tentative medical reversals. After a search of systematic reviews to refute these tentative reversals, 19 were excluded, leaving a total of 396 medical reversals (6% of all original articles and 13% of all randomized trials).
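The selection arithmetic above can be re-derived in a few lines. The following is a minimal sketch in Python (the analysis itself was done in Excel and R); the counts come from the text, while the variable names and the `pct` helper are illustrative assumptions.

```python
# Counts reported in the Results; variable names are illustrative only.
articles_by_journal = {"JAMA": 2911, "Lancet": 2624, "NEJM": 1501}
total_articles = sum(articles_by_journal.values())

rcts = 3017                       # RCTs regarding a medical practice
novel = 1373                      # excluded: novel practices
positive_or_inconclusive = 1229   # excluded: established, but not negative
refuted_by_review = 19            # excluded: refuted by a systematic review

tentative = rcts - novel - positive_or_inconclusive
reversals = tentative - refuted_by_review


def pct(part, whole):
    """Percentage of `whole`, rounded to the nearest integer."""
    return round(100 * part / whole)


print(total_articles, tentative, reversals)
print(pct(tentative, rcts), pct(reversals, total_articles), pct(reversals, rcts))
```

Under these assumptions the script reproduces the reported totals (7036 articles, 415 tentative reversals, 396 reversals) and the rounded percentages (14%, 6%, and 13%).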
Many of these 396 reversals had been the subject of systematic reviews: in 209 cases (53%) the systematic review confirmed that the medical practice in question was indeed a medical reversal; in 109 cases (28%) the results of the systematic review were inconclusive; and for 78 cases (20%) there was no systematic review. Of the reversals, 154 (39%) were found in JAMA, 129 (33%) in NEJM, and 113 (29%) in the Lancet.
Reversal study characteristics are described in Table 1. Most studies (92%, n = 366) were conducted on populations in high-income countries, whereas 8% (n = 30) were done in low- or middle-income countries, including, but not limited to, China, India, Malaysia, Ghana, Tanzania, and Ethiopia. Cardiovascular disease was the most common medical category (20%, n = 80), followed by public health/preventive medicine (12%, n = 48), and critical care (11%, n = 45). Regarding the type of intervention, medication was the most common (33%, n = 129), followed by a procedure (20%, n = 81), vitamin/supplement (13%, n = 53), device (9%, n = 35) and system intervention (8%, n = 30). The breakdown of funding categories was as follows (Supplementary file 1): 253 (63.9%) were from non-industry sources only; 88 (22.2%) were from a combination of industry and non-industry sources; 36 (9.1%) were from industry-only sources; and 3 (0.8%) were from non-industry sources plus an insurance company (n = 2) or a development bank (n = 1). There were 16 (4.0%) studies for which we could not find the source of funding.
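As a consistency check on the tabulations above, the category counts can be summed and converted to percentages. This is an illustrative Python sketch, not the authors' analysis code; the dictionary labels and the `shares` helper are assumptions, and only the counts are taken from the text.

```python
TOTAL_REVERSALS = 396

# Counts as reported in the Results; the labels are shorthand, not official terms.
review_status = {
    "confirmed by review": 209,
    "review inconclusive": 109,
    "no review": 78,
}
funding = {
    "non-industry only": 253,
    "industry and non-industry": 88,
    "industry only": 36,
    "non-industry plus insurer or development bank": 3,
    "not found": 16,
}


def shares(counts, total, digits=0):
    """Return each count as a percentage of `total`, rounded to `digits` places."""
    return {label: round(100 * n / total, digits) for label, n in counts.items()}


# Every reversal should be counted exactly once in each breakdown.
for table in (review_status, funding):
    assert sum(table.values()) == TOTAL_REVERSALS

review_pct = shares(review_status, TOTAL_REVERSALS)
funding_pct = shares(funding, TOTAL_REVERSALS, digits=1)
print(review_pct)
print(funding_pct)
```

Both breakdowns sum to 396. The rounded systematic-review shares (53, 28, and 20) add up to 101 rather than 100, which is an artifact of rounding each share independently.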
Table 2 summarizes 20 selected medical reversals. These examples were chosen to represent various types of practices in various medical disciplines across the full period covered by the analysis. Supplementary file 2 contains a full list of reversal summaries. Figure 2 shows the percentage of articles in each journal, by medical specialty.
Discussion
Here we present a broad and extensive list of established medical practices found to be ineffective in randomized controlled trials. This list represents practices from all disciplines of medical care. These practices add to a previously reported list of 146 medical reversals published during the years 2001–2010 (Prasad et al., 2013).
Efforts to identify low-value practices are numerous. In the US, the Choosing Wisely initiative began by asking members of each medical specialty to provide a list of the top five diagnostic tests or treatments that are expensive and have evidence showing a lack of benefit (Schpero, 2014); similar initiatives have been implemented in other countries (de Vries et al., 2016). Some have performed systematic searches of scientific databases using key words (de Vries et al., 2016). Others have used a multiplatform approach, consisting of searching the peer-reviewed literature, insurance and health organization databases, and opportunistic samplings of knowledgeable experts in the field (Elshaug et al., 2012). Each of these ways to identify medical reversals or low-value practices has advantages and disadvantages, but identifying these practices can be challenging because of their heterogeneity, the lack of established methods to identify these practices, the difficulty in applying them to the correct population or subpopulation, and the obstacle of prioritizing which practices are more or less low-value (Elshaug et al., 2013).
Prior work by Schwartz and colleagues approximated the financial costs of 26 low-value services that are more commonly used in the older adult population (Schwartz et al., 2014). They estimated that spending for these services in the Medicare population was between $1.9 and $8.5 billion during 2008–2009, which was between 0.6% and 2.7% of Medicare Parts A and B spending. In their analysis, at least 25% of Medicare beneficiaries received low-value services during 2008–2009. These results are especially notable considering the authors only used the 26 most commonly used low-value services. In contrast, the ubiquity of medical reversals has been previously reported upon in the NEJM, where 146 practices were identified as medical reversals over a decade (Prasad et al., 2013). Here, we hope to add to the prior efforts of others in providing a larger and more comprehensive list (396 practices in total) for clinicians and researchers to guide practice as they care for patients more effectively and more economically.
We found reversals in a variety of medical sub-fields and types of devices, procedures, or practices. These reversals had been practiced and tested in high-income as well as low- to middle-income countries, although the highest percentage of reversals was in high-income countries, likely because most randomized trials are performed in this setting. In countries like the US, where there was a 20% increase in spending between 2013 and 2015, and drug prices alone surpassed the increase in aggregate health care spending (Kesselheim et al., 2016), the identification and disuse of costly and ineffective (or possibly harmful) medications and practices are especially important. For example, bevacizumab (Avastin) was approved in 2008 by the Food and Drug Administration (FDA) in the US for metastatic breast cancer under the accelerated approval program, but was later shown to not improve overall survival (Vitry et al., 2015), even though the cost to each patient was $88,000 per year (Selyukh, 2011). Consequently, the FDA approval for that indication was withdrawn in November 2011 (Vitry et al., 2015).
Reversals were not limited to practices performed only by physicians or other health care providers. Many reversals involved practices where the physician was a ‘gatekeeper’ to access, but some were practices that patients could access on their own, such as behavioral practices (e.g., cognitive behavioral therapy or mindfulness interventions), complementary or non-traditional practices (e.g., acupuncture), dietary supplements (e.g., omega-3 fatty acids or vitamin A supplementation), community practices (e.g., programs to prevent teenage pregnancy or self-poisoning), or wearable technology. Wearable technology has become especially popular among people who are interested in tracking their physical activity in an effort to lose weight. A study on the use of wearable technology, however, found that weight loss was significantly less among the group that had access to wearable technology, compared to the group that did not (Jakicic et al., 2016). With the increasing availability of healthcare interventions that are readily accessible to everyone without a prescription, there needs to be greater discussion between patients and physicians on whether these interventions work, as well as discussion on the regulation of these interventions.
Overall, 13% of all randomized trials were medical reversals; this is slightly higher than a previous report based on an analysis of just one journal. The share of the 396 reversals contributed by each journal varied, from 29% (113/396) for the Lancet to 39% (154/396) for JAMA.
Finally, reversals highlight the importance of independent, governmental and non-conflicted funding of clinical research. The majority of reversal studies we found were funded by such sources (63.9%), with a minority funded solely by industry (9.1%). By comparison, industry-funded research represented between 35% and 49% of trials registered on ClinicalTrials.gov during the years 2006 through 2014 (Ehrhardt et al., 2015).
Strengths and limitations
There are several strengths and limitations to this paper. First, we looked at just three journals (each of which has a high impact factor). Results may not be broadly generalizable to all journals or fields, and reversals in our list could be affected by the editors’ decision to publish or not publish a given article. Second, documented evidence of the use of a newer practice was sometimes easier to find because it had come about during a time when there was more internet use. Conversely, documented evidence of an older practice was sometimes easier to find because there had been more historical commentary about its use. Because of this, newer or more recent practices may be more or less likely to be categorized as established than older or less recent practices. Third, others may categorize results differently, depending on background expertise of the investigators.
To help overcome this limitation, physicians in the clinical setting from a range of backgrounds were invited to review and comment on practices identified as reversals. Our dataset is presented in full in Supplementary file 2. It is inevitable that others may feel differently and choose to reclassify some of our examples. We hope our work may serve to enhance and expand upon other efforts to identify and disincentivize low-value practices. Fourth, we relied on the study authors’ point of view on whether the results were positive or negative, and there may be reasonable differences of opinion regarding the interpretation of some studies. Fifth, we did not evaluate the quality of the meta-analysis used to confirm or refute the medical reversal. However, we tried to find the most recent review that was published in either Cochrane or a medical journal (for that specialty) to confirm or refute the reversal. Finally, our definition of established may be broad in that we did not limit established practices to only those that were in widespread use, in part because once a practice has been adopted, even intermittently, it is difficult to get physicians and patients to abandon it. We did, however, maintain that, to count as established, a practice needed to be codified into guidelines or be one for which we could prove use outside of a clinical trial or clinical protocol. Additionally, multiple physicians reviewed each practice to confirm that these practices were indeed reversals.
Our primary research objective was to compile a comprehensive review of medical reversals for the benefit of both medical professionals and lay persons. This type of work is fundamentally descriptive and does not seek to test a binary hypothesis. Nevertheless, there are a number of concepts and lessons that may be drawn from the results. The breadth of reversals across the various fields of medicine emphasizes the importance of conducting randomized trials for both novel and established practices. While it may be impractical, if not impossible, to test every medical practice in a randomized setting, there are many testable practices that are adopted based on nonrandomized data or biological plausibility. There is a danger in expediting treatments into practice without data proving their efficacy. Once an ineffective practice is established, it is difficult to convince practitioners to abandon its use; eliminating a reversal from standard practice occurs slowly and with resistance (Prasad et al., 2012; Tatsioni et al., 2007). By aiming to test novel treatments before they are widespread, we can reduce the number of reversals in practice and prevent harms to patients and to the reputation of the medical field. We hope these findings propel medical professionals to critically evaluate their own practices and, going forward, demand high-quality research before adopting a practice, especially for practices that are costlier and/or more aggressive than the standard of care.
Conclusions
We have identified 396 medical reversals spanning different types of medical disciplines, types of interventions, and populations. The de-adoption of these and other low-value medical practices will lead to cost savings and improvements in medical care.
Methods
Aim of study
We sought to compile a list of medical reversals that appeared in three leading general medical journals during a 15-year period.
Search strategy
We used methods similar to our prior survey of 10 years of publications in one high-impact journal (Prasad et al., 2013). We reviewed all articles under the headings ‘Original Investigation’, ‘Preliminary Communications’, ‘Caring for the Critically-Ill Patient’, ‘Brief Reports’, ‘Clinical investigations’, ‘Toward Optimal Laboratory Use’, and ‘Original Contribution’ in JAMA and all articles under the heading ‘Articles’ in the Lancet from 2003 to 2017. We reviewed all articles under the heading ‘Original Articles’ in NEJM from 2011 to 2017. The years 2001 to 2010 of the NEJM were previously reviewed and reported (Prasad et al., 2013; Prasad et al., 2011). We chose the three general medical journals with the highest 5-year Hirsch index for medical journals (https://jcr.incites.thomsonreuters.com/JCRJournalHomeAction.action). This study was conducted from March 1, 2017 through November 11, 2018.
Article inclusion
We identified all randomized trials of a clinical practice, or, in other words, any investigation that assessed screening, diagnostic testing, medication(s), procedure(s), surgery, medical device, treatment algorithms, or any change in health care provision systems. We excluded randomized controlled trials (RCTs) that did not concern a medical practice (e.g., an RCT that tested a biological question, such as the effect of testosterone on muscle mass) or that were individual-level patient meta-analyses.
We then excluded trials of novel practices, defined as practices only used in the confines of clinical trials. Established practices were included and defined as those used regularly outside of research trials. This could include off-label use or use outside of the US.
Next, we excluded trials that reached positive or inconclusive results. An article was considered positive if the trial met its primary endpoint and negative if it failed to meet the primary outcome or if the study measured a hard endpoint (quality of life, mortality, etc.) and failed to show statistical superiority over a prior or lesser standard of practice in the control arm. For non-inferiority or equivalence studies, meeting the pre-specified margin would be considered positive. For studies comparing two established interventions, the more expensive intervention needed to show benefit to be considered positive. Studies were deemed inconclusive if they demonstrated neither clear benefit nor harm (e.g., improved overall survival but no improvement in functional capacity in patients who have had a stroke) or the study was stopped early for reasons other than futility or adverse events.
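The outcome rules above can be sketched as a small decision function. This is only one possible reading, written in Python for illustration: the field names are hypothetical, each boolean stands for a judgment that reviewers made by hand, and the order in which the rules are checked is an assumption that the text does not specify.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    INCONCLUSIVE = "inconclusive"


@dataclass
class TrialResult:
    # Hypothetical fields. For a non-inferiority or equivalence design,
    # met_primary_endpoint means the pre-specified margin was met; for a
    # comparison of two established interventions, it means the more
    # expensive intervention showed benefit.
    met_primary_endpoint: bool
    noninferiority_or_equivalence: bool = False
    hard_endpoint_measured: bool = False         # e.g. mortality, quality of life
    superior_on_hard_endpoint: bool = False
    no_clear_benefit_or_harm: bool = False
    stopped_early_for_other_reason: bool = False  # not futility or adverse events


def classify(trial: TrialResult) -> Outcome:
    """Apply the coding rules in an assumed order of precedence."""
    if trial.no_clear_benefit_or_harm or trial.stopped_early_for_other_reason:
        return Outcome.INCONCLUSIVE
    if trial.noninferiority_or_equivalence:
        return Outcome.POSITIVE if trial.met_primary_endpoint else Outcome.NEGATIVE
    if trial.hard_endpoint_measured and not trial.superior_on_hard_endpoint:
        return Outcome.NEGATIVE
    return Outcome.POSITIVE if trial.met_primary_endpoint else Outcome.NEGATIVE


print(classify(TrialResult(met_primary_endpoint=False)).value)
```

In this sketch only trials classified as negative would proceed to the systematic-review step; positive and inconclusive trials are excluded, as described above.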
For each tentative reversal in our dataset, we performed a two-part search to find a systematic review. Meta-analyses and/or systematic reviews (MA/SRs) were sought for each RCT designated as a ‘reversal’ to determine whether the established practice was found to be ineffective across multiple studies. MA/SRs were found by searching, in this order: review articles that cited the trial in PubMed; review articles that cited the trial in Google Scholar; and then search terms in Google Scholar. In some cases, MA/SRs were found using the journal website under ‘citing articles’. Because of their high-quality review process, Cochrane reviews were the first choice for reviews on the article’s subject, but if there was no Cochrane review, a meta-analysis from another high-quality journal was used. More recent meta-analyses were prioritized over older meta-analyses on the same topic, and meta-analyses that population-weighted their analyses were prioritized over ones that did not. MA/SRs were categorized as 1) confirming reversal, 2) refuting reversal, 3) insufficient data on reversal, or 4) no MA/SR found. MA/SRs needed to include the RCT in order to be considered as a confirmation of a reversal, and the conclusions needed to be based on results from RCTs only (not on observational or nonrandomized studies). Articles with MA/SRs refuting the reversal were excluded from the final analysis. A table of all confirmed reversals can be found in Supplementary file 2.
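The review-selection preferences and the four categories above can be illustrated with a short Python sketch. The `Review` record and its fields are hypothetical, the relative ranking of the three preferences (Cochrane first, then population weighting, then recency) is an assumption, and so is the choice to treat a review that does not meet the confirmation criteria as providing insufficient data.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Review:
    # Hypothetical record for one meta-analysis or systematic review (MA/SR).
    year: int
    is_cochrane: bool
    population_weighted: bool
    includes_index_rct: bool   # does the review include the RCT in question?
    rcts_only: bool            # conclusions based on randomized trials only?
    conclusion: str            # "ineffective", "effective", or "unclear"


def pick_review(reviews: Sequence[Review]) -> Optional[Review]:
    """Choose the preferred review; the ranking order is an assumption."""
    if not reviews:
        return None
    return max(reviews, key=lambda r: (r.is_cochrane, r.population_weighted, r.year))


def categorize(review: Optional[Review]) -> str:
    """Map the chosen review onto the four categories used in the text."""
    if review is None:
        return "no MA/SR found"
    if review.conclusion == "ineffective" and review.includes_index_rct and review.rcts_only:
        return "confirming reversal"
    if review.conclusion == "effective":
        return "refuting reversal"
    return "insufficient data on reversal"


candidates = [
    Review(2016, False, True, True, True, "unclear"),
    Review(2014, True, False, True, True, "ineffective"),
]
print(categorize(pick_review(candidates)))
```

Trials whose chosen review falls in the refuting category would be dropped from the final list, which corresponds to the 19 exclusions reported in the Results.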
For all steps of study selection, two reviewers (DH, AH, TC, JG) independently examined information for each article. When there were differences in opinion between the two reviewers, adjudication first involved discussion between the two readers to see whether agreement could be reached. If disagreement persisted, a third reviewer (VP) adjudicated the discrepancy. Figure 1 shows our study selection strategy.
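The three-stage resolution of disagreements described above can be expressed as a small function. This is an illustrative Python sketch only; `discuss` and `third_reviewer` are hypothetical callables standing in for human judgment, not part of any tool the authors report using.

```python
from typing import Callable, Optional


def adjudicate(
    first: str,
    second: str,
    discuss: Callable[[str, str], Optional[str]],
    third_reviewer: Callable[[str, str], str],
) -> str:
    """Return the final code for one article.

    first, second   -- codes assigned independently by the two reviewers
    discuss         -- returns the agreed code, or None if disagreement persists
    third_reviewer  -- makes the final call when discussion fails
    """
    if first == second:
        return first
    agreed = discuss(first, second)
    if agreed is not None:
        return agreed
    return third_reviewer(first, second)


# Example: the reviewers disagree, discussion fails, and the third reviewer decides.
final = adjudicate(
    "established",
    "novel",
    discuss=lambda a, b: None,
    third_reviewer=lambda a, b: "established",
)
print(final)
```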
Data abstraction and coding
Articles were coded by discipline (public health/general preventive medicine, psychiatry, neurology/neurosurgery, radiation oncology, surgery, urology, allergy and immunology, anesthesiology, dermatology, pediatrics, obstetrics and gynecology, ophthalmology, orthopedic surgery, cardiovascular disease, critical care medicine, endocrinology, diabetes, and metabolism, gastroenterology/hepatology, hematology, infectious disease, medical oncology, nephrology, pulmonary disease, or rheumatology) with the option of a secondary discipline, if it could be categorized as more than one, whether the study was done in a high-income country or a low- to middle-income country (International Statistical Institute, 2018), and the type of intervention (medication, procedure, device, screening test, over-the-counter medication, vitamins/supplements/food, behavioral therapy, treatment algorithm, diagnostic instruments, system intervention/quality and performance measure, or optimization). We also abstracted the funding source(s) and categorized the data as industry only, non-industry only, a combination of industry and non-industry sources, or a combination of non-industry and either an insurance company or banking institution. Intervention materials provided by an industry source qualified as having funding support from industry sources.
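The funding rule above, including the provision that industry-supplied intervention materials count as industry support, can be sketched as follows. This is a hypothetical Python illustration: the parameter names are invented, the "not found" label is borrowed from the Results, and the handling of combinations the text does not mention (for example industry plus an insurer) is an assumption.

```python
def funding_category(
    industry_grant: bool,
    non_industry: bool,
    insurer_or_bank: bool = False,
    industry_supplied_materials: bool = False,
) -> str:
    """Assign one funding category to a study (illustrative sketch)."""
    # Materials donated by a company count as industry funding support.
    industry = industry_grant or industry_supplied_materials
    if industry and non_industry:
        return "industry and non-industry"
    if industry:
        return "industry only"
    if non_industry and insurer_or_bank:
        return "non-industry plus insurer or bank"
    if non_industry:
        return "non-industry only"
    return "not found"


# A publicly funded trial that used a study drug donated by a manufacturer.
print(funding_category(industry_grant=False, non_industry=True,
                       industry_supplied_materials=True))
```

Under this reading, a publicly funded trial that used a donated study drug is coded as a combination of industry and non-industry sources rather than as non-industry only.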
For all coding, two reviewers (DH, AH, TC, JG) independently extracted information for each article. The aforementioned procedure to resolve disagreement was used.
Four physicians (AC, MH, CL, DM) reviewed all reversals, systematic reviews, and documentation to confirm that the practice was a reversal. Further discrepancies were adjudicated by VP. Thus, our process involved iterative assessment and documentation of practices by a group of researchers and physicians.
Data analysis
Data are presented using descriptive statistics. Analyses were conducted using Microsoft Excel and R, package Tidyverse (Wickham, 2017). This study was not submitted for Institutional Review Board approval because it involved publicly available data and did not involve individual patient data. All abstracted data are included in the manuscript and supporting files.
Data availability
Data were obtained from publicly available data and are included in the manuscript and supporting files.
References
- Is treatment with vaginal pessaries an option in patients with a sonographically detected short cervix? Journal of Perinatal Medicine 31:122–133. https://doi.org/10.1515/JPM.2003.017
- Menopausal hormone therapy and mortality: a systematic review and meta-analysis. The Journal of Clinical Endocrinology & Metabolism 100:4021–4028. https://doi.org/10.1210/jc.2015-2238
- Hospital-based employment of orthopaedic surgeons - Passing trend or new paradigm?: AOA critical issues. The Journal of Bone and Joint Surgery-American Volume 94:1–5. https://doi.org/10.2106/JBJS.J.01618
- Use of postmenopausal hormone replacement therapy: estimates from a nationally representative cohort study. American Journal of Epidemiology 145:536–545. https://doi.org/10.1093/oxfordjournals.aje.a009142
- Mechanical versus manual chest compressions for cardiac arrest. Cochrane Database of Systematic Reviews 2:CD007260. https://doi.org/10.1002/14651858.CD007260.pub3
- Treatment of posterior uveitis with a fluocinolone acetonide implant: three-year clinical trial results. Archives of Ophthalmology 126:1191–1201. https://doi.org/10.1001/archopht.126.9.1191
- Effectiveness and acceptability of a newly designed hip protector: a pilot study. Archives of Gerontology and Geriatrics 30:25–34. https://doi.org/10.1016/S0167-4943(99)00048-5
- Book: Pain Management Injection Therapies for Low Back Pain. Rockville, MD: Agency for Healthcare Research and Quality.
- Off-pump coronary artery bypass grafting decreases risk-adjusted mortality and morbidity. The Annals of Thoracic Surgery 72:1282–1289. https://doi.org/10.1016/S0003-4975(01)03006-5
- A randomized trial of nicotine-replacement therapy patches in pregnancy. New England Journal of Medicine 366:808–818. https://doi.org/10.1056/NEJMoa1109582
- Pharmacological interventions for promoting smoking cessation during pregnancy. Cochrane Database of Systematic Reviews 9:CD010078. https://doi.org/10.1002/14651858.CD010078
- Infection control and prevention measures to reduce the spread of vancomycin-resistant enterococci in hospitalized patients: a systematic review and meta-analysis. Journal of Antimicrobial Chemotherapy 69:1185–1192. https://doi.org/10.1093/jac/dkt525
- Are low-value care measures up to the task? A systematic review of the literature. BMC Health Services Research 16:405. https://doi.org/10.1186/s12913-016-1656-3
- Over 150 potentially low-value health care practices: an Australian study. The Medical Journal of Australia 197:556–560. https://doi.org/10.5694/mja12.11083
- The history and historical treatments of deep vein thrombosis. Journal of Thrombosis and Haemostasis 11:402–411. https://doi.org/10.1111/jth.12127
- Reducing ineffective practice: challenges in identifying low-value health care using Cochrane systematic reviews. Journal of Health Services Research & Policy 18:6–12. https://doi.org/10.1258/jhsrp.2012.012044
- Off- versus on-pump coronary surgery and the effect of follow-up length and surgeons' experience: a meta-analysis. Journal of the American Heart Association 7:e010034. https://doi.org/10.1161/JAHA.118.010034
- Vitamin A supplementation for the prevention of morbidity and mortality in infants six months of age or less. Cochrane Database of Systematic Reviews 10:CD007480. https://doi.org/10.1002/14651858.CD007480.pub2
- Screening for breast cancer with mammography. Cochrane Database of Systematic Reviews 6:CD001877. https://doi.org/10.1002/14651858.CD001877.pub5
- Current status of off-pump coronary-artery bypass. New England Journal of Medicine 366:1541–1543. https://doi.org/10.1056/NEJMe1203194
- Multiple-micronutrient supplementation for women during pregnancy. Cochrane Database of Systematic Reviews 3:CD004905. https://doi.org/10.1002/14651858.CD004905.pub5
- Epidural steroid injections for lumbar spinal stenosis. Current Reviews in Musculoskeletal Medicine 1:32–38. https://doi.org/10.1007/s12178-007-9003-2
- Vitamin A supplementation for the prevention of morbidity and mortality in infants one to six months of age. Cochrane Database of Systematic Reviews 28:CD007480. https://doi.org/10.1002/14651858.CD007480.pub3
- Standardization of uveitis nomenclature for reporting clinical data. Results of the first international workshop. American Journal of Ophthalmology 140:509–516. https://doi.org/10.1016/j.ajo.2005.03.057
- Surgery versus physical therapy for a meniscal tear and osteoarthritis. New England Journal of Medicine 368:1675–1684. https://doi.org/10.1056/NEJMoa1301408
- Off-pump or on-pump coronary-artery bypass grafting at 30 days. New England Journal of Medicine 366:1489–1497. https://doi.org/10.1056/NEJMoa1200388
- Interventions for promoting smoking cessation during pregnancy. Cochrane Database of Systematic Reviews 3:CD001055. https://doi.org/10.1002/14651858.CD001055.pub3
- Compliance with routine use of gowns by healthcare workers (HCWs) and non-HCW visitors on entry into the rooms of patients under contact precautions. Infection Control & Hospital Epidemiology 28:337–340. https://doi.org/10.1086/510811
- Premature rupture of the membranes: neonatal consequences. Seminars in Perinatology 20:375–380. https://doi.org/10.1016/S0146-0005(96)80004-8
- Fluoropyrimidine-HAI (hepatic arterial infusion) versus systemic chemotherapy (SCT) for unresectable liver metastases from colorectal cancer. Cochrane Database of Systematic Reviews 3:CD007823. https://doi.org/10.1002/14651858.CD007823.pub2
- Off-pump versus on-pump coronary artery bypass grafting for ischaemic heart disease. Cochrane Database of Systematic Reviews 3:CD007224. https://doi.org/10.1002/14651858.CD007224.pub2
- The urgent need for evidence in arthroscopic meniscal surgery: a systematic review of the evidence for operative management of meniscal tears. The American Journal of Sports Medicine 45:965–973. https://doi.org/10.1177/0363546516650180
- Zopiclone and zaleplon vs benzodiazepines in the treatment of insomnia: Canadian consensus statement. Human Psychopharmacology: Clinical and Experimental 18:29–38. https://doi.org/10.1002/hup.445
- Physical methods for preventing deep vein thrombosis in stroke. Cochrane Database of Systematic Reviews 8:CD001922. https://doi.org/10.1002/14651858.CD001922.pub3
- Book: Dementia: A NICE-SCIE Guideline on Supporting People with Dementia and Their Carers in Health and Social Care. Leicester, UK: British Psychological Society.
- Efficacy of second generation antidepressants in late-life depression: a meta-analysis of the evidence. The American Journal of Geriatric Psychiatry 16:558–567. https://doi.org/10.1097/01.JGP.0000308883.64832.ed
- Efficacy of antidepressants for depression in Alzheimer's disease: systematic review and meta-analysis. Journal of Alzheimer's Disease 58:725–733. https://doi.org/10.3233/JAD-161247
- Hip protectors for preventing hip fractures in the elderly. Cochrane Database of Systematic Reviews 3:CD001255. https://doi.org/10.1002/14651858.CD001255.pub3
- The frequency of medical reversal. Archives of Internal Medicine 171:1675–1676. https://doi.org/10.1001/archinternmed.2011.295
- A decade of reversal: an analysis of 146 contradicted medical practices. Mayo Clinic Proceedings 88:790–798. https://doi.org/10.1016/j.mayocp.2013.05.012
- Pulmonary artery catheters for adult patients in intensive care. Cochrane Database of Systematic Reviews 2:CD003408. https://doi.org/10.1002/14651858.CD003408.pub3
- Low-value health care services in a commercially insured population. JAMA Internal Medicine 176:1567–1571. https://doi.org/10.1001/jamainternmed.2016.5031
- Recommendations for vitamin A supplementation. The Journal of Nutrition 132:2902S–2906. https://doi.org/10.1093/jn/132.9.2902S
- Cervical pessary for preventing preterm birth in singleton pregnancies with short cervical length: a systematic review and meta-analysis. Journal of Ultrasound in Medicine 36:1535–1543. https://doi.org/10.7863/ultra.16.08054
- Hip protectors for preventing hip fractures in older people. Cochrane Database of Systematic Reviews 3:CD001255. https://doi.org/10.1002/14651858.CD001255.pub5
- Measuring low-value care in Medicare. JAMA Internal Medicine 174:1067–1076. https://doi.org/10.1001/jamainternmed.2014.1541
- On-pump versus off-pump coronary-artery bypass surgery. New England Journal of Medicine 361:1827–1837. https://doi.org/10.1056/NEJMoa0902905
- Prevalence, burden, and treatment of insomnia in primary care. The American Journal of Psychiatry 154:1417. https://doi.org/10.1176/ajp.154.10.1417
- Using computer, mobile and wearable technology enhanced interventions to reduce sedentary behaviour: a systematic review and meta-analysis. International Journal of Behavioral Nutrition and Physical Activity 14:105. https://doi.org/10.1186/s12966-017-0561-4
- Meta-analysis comparing ≥10-year mortality of off-pump versus on-pump coronary artery bypass grafting. The American Journal of Cardiology 120:1933–1938. https://doi.org/10.1016/j.amjcard.2017.08.007
- Acceptance and compliance with external hip protectors: a systematic review of the literature. Osteoporosis International 13:917–924. https://doi.org/10.1007/s001980200128
- Regulatory withdrawal of medicines marketed with uncertain benefits: the Bevacizumab case study. Journal of Pharmaceutical Policy and Practice 8:25. https://doi.org/10.1186/s40545-015-0046-2
- Tidyverse. Tidyverse, R package version 1.2.1, https://www.tidyverse.org/.
Decision letter
- Eduardo Franco, Senior and Reviewing Editor; McGill University, Canada
- Adam Elshaug, Reviewer
In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.
Thank you for submitting your article "A Comprehensive Analysis of Recent Medical Reversals that includes 396 Low-Value Practices in the Biomedical Literature" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by Eduardo Franco acting as Reviewing Editor and Senior Editor.
As standard practice in eLife, the reviewers have discussed their critiques with one another in an online panel moderated by the Reviewing Editor. After consensus was reached, the Reviewing Editor has drafted this decision to help you prepare a revised submission.
Summary:
The authors aimed to identify low-value care or medical reversals from randomized controlled trials (RCTs) published in three top-tier medical journals from 2003 to 2017. RCTs were included if they tested an established practice and the trial had negative results. The authors classified the type of intervention and discipline, and searched for a systematic review on the practice. Identification of medical reversals across disciplines is an enormous task and the authors have done well to carry this out systematically, and created a valuable resource (the list of 396 medical services) for de-adoption efforts. This paper partly overlaps with, but greatly extends, the group's previous work building a sample of medical reversals.
Essential revisions:
Main issue: All reviewers raised concerns about lack of clarity on the aims of your work. Was your goal to identify practices that could be targeted for de-implementation? If that is the main purpose, this could be stated more clearly and the Discussion could better explain next steps. For example, would there be an additional layer of review before de-implementation campaigns? Are there additional questions that would need to be addressed to advance some of these treatments into a concerted de-implementation plan? The manuscript ended a bit abruptly and there might be a missed opportunity to discuss issues on evidence and whether/how these "396 medical reversals" indicate a problem.
1) The title could be more specific in referring to JAMA, Lancet, NEJM. It is a comprehensive analysis of those three journals, not 'the biomedical literature' which of course is far broader. The appropriate title can be discussed with the Features Editor.
2) General: Why was the BMJ not included as the fourth highest general medical journal? The BMJ has a strong track record of publishing work that aligns with the reversal agenda. Also, NEJM and JAMA tend to be very US-centric but both Lancet and BMJ (albeit UK based) tend to be more outward looking to the rest of the world. Given the degree of publication bias that works against reversals coming to light, the BMJ might offer high marginal yield and the fact that it was excluded may have been a missed opportunity. Please elaborate in the Discussion on the choice of journals.
3) General: Is it possible that the definition of reversals was overly permissive? What might be more compelling is if the authors can document that interventions had been recommended in clinical practice guidelines at some point before the trial result. Or, at least, it would be nice if evidence were presented that the intervention was in widespread use. It is possible that some interventions that "reversed" were not actually in widespread use and/or recommended.
4) Abstract: "This may serve as a starting point for de-adoption efforts and the study of the utilization of these practices." This may be one starting point but with the proliferation of other lists (e.g., Choosing Wisely [CW]) this is a parallel or adjunct starting point. While this is likely a more robust starting point, the reality is CW lists are front and center right now so are attracting more attention than they probably warrant from an evidentiary basis.
5) Introduction: Readers would benefit from a definition of reversal as used in this study, and in particular if the focus is on interventions in toto, or their application by/to patient subgroups (e.g., accounting for heterogeneity of treatment effect), or both. This might be gleaned from Table 2 but it needs addressing upfront in main text (probably Introduction).
6) Introduction: "…and identifying these practices is a starting point in for reducing cost and improving care." Again, with this framing you are missing an opportunity for this paper. There are so many lists of LVC now that identification is less of a challenge, prioritization from the lists perhaps more so, particularly for policymakers. The robust method you have undertaken and results offer as much if not more of a prioritization opportunity, than an identification opportunity. In short, taken in isolation it is identification, but viewed through the bigger picture it offers a prioritization angle, particularly given CW approaches by individual societies have used less formal methods.
7) Introduction, paragraph two: This paragraph is a bit confusing. It may be difficult to name (or identify?) these practices for many reasons beyond issues with the scope of the Cochrane review database. This should be expanded on.
8) Introduction, paragraph three: This paragraph hints at some of the good reasons to go 'beyond' the CW lists to identify medical reversals, but these probably aren't obvious to readers unfamiliar with the campaign, and I think it is really important to justify why a broader search strategy is necessary to identify low-value practices.
9) Introduction, paragraphs five and six: Referencing and focusing on the earlier study on the identified NEJM RCTs would make more sense here, rather than focus so much on the Schwartz 2014 study since the current article has very different aims/results compared to this paper.
10) Methods: For the second step (searching for systematic reviews), how did the authors handle the quality of SR/MAs as well as instances where SR/MAs came to conflicting conclusions?
11) Results: The aim of the article was to compile a list of medical reversals. This is a challenge for the Results section, because there are many possible descriptive statistics from this list that the authors could choose to present (Table 1) – none of which is specific to the aim of the article. The Results (and/or Methods) section could improve with justification of why specific results were included.
12) Results: The paper does not appear to have been driven by a particular hypothesis, which makes it feel more like a gallery or reference work than a research report. It would have helped to give context if hypotheses were articulated and perhaps some relationships tested. For example, one wondered: who funded the "reversal" studies? How large were the reversals – did they tend to barely pull SR estimates over the null, or did they exert decisive effects?
Related to the above, it is not immediately clear to the reader what the message is here. Clearly, it is sometimes important to run randomized trials, since there can be residual uncertainty about efficacy. But there is no point in running trials if something is already proven. So the manuscript leaves lots of questions unanswered: are too many treatments reversed? Is there not enough research aimed at reversals? Does too much time pass between uptake and reversal? What types of interventions are at greater risk of an extended period of "reversal latency?" A better concluding paragraph would help pull the piece together. Right now it sort of ends abruptly.
13) Table 2 (and also the supplementary file 2 with all reversals) would be improved if they clearly stated the authors' classification of the medical discipline and intervention type. The authors stated this as a potential limitation and potential disagreement/debate with readers in the Discussion, so including these in Table 2 would at least give some insight into the authors' logic.
14) The authors also state that they hope the full list of reversals would be a resource for "de-adoption efforts", so why not include the medical discipline in the supplementary file. This seems like an easy step to make it much more useful to specialists and policy-makers interested in disinvestment in their area.
15) It is also not clear how the supplementary file is organised. Why is there a split between the tables and references after intervention 57, 81 and so on? It would be helpful to have some headings or descriptions here. PubMed IDs (PMID) should be added to every entry for ready reference.
16) Discussion, paragraph three: Makes a really good point about the patient 'accessed' low-value care/reversals, which hasn't been discussed much in the literature before. Adding something about the need for direction/further research here would be useful to highlight this point more (such as how the public accesses or could be more aware of this evidence).
17) Strengths and limitations: "Studies were deemed inconclusive if they demonstrated neither clear benefit nor harm." The qualifier 'inconclusive' may not be accounting for poor cost-effectiveness (e.g., inconclusive but at a cost, possibly more than a comparator). And harm is poorly measured in RCTs, certainly in relation to the trial evaluation and definitely in relation to its real world use, i.e., harms are incomprehensively measured, particularly those downstream harms (and costs). An 'inconclusive' label should place the burden of proof back on the intervention champions/sponsors to prove safety and effectiveness, rather than let it be perceived as benign. Also the Methods section does not speak to trials that involved measures of equivalence or non-inferiority. If an intervention was found clinically equivalent/non-inferior did it evade further scrutiny in this study? An obvious concern here is marginal cost-effectiveness (i.e., EQ/NI but more expensive) or questions about sufficient measuring and reporting of harm. Please clarify.
18) "…and the conclusions needed to be based on results from RCTs only (not on observational or nonrandomized studies). Articles with MA/SRs refuting the reversal were excluded from the final analysis." But all RCTs are not created equal. Did you carry out quality appraisal and critique? Were RCTs with higher degrees of design rigour (adequate blinding, sham arms, use of appropriate comparator/s etc) offered more weight than those of less rigour? If not then might this mean your methods are quite conservative and therefore specific? (i.e., some of your excluded studies might be false negatives?) The last sentence in 'limitations' indicates no quality appraisal was conducted. If so this heightens the risk of false negatives, and this would be a major limitation of the study.
https://doi.org/10.7554/eLife.45183.011
Author response
Essential revisions:
Main issue: All reviewers raised concerns about lack of clarity on the aims of your work. Was your goal to identify practices that could be targeted for de-implementation? If that is the main purpose, this could be stated more clearly and the Discussion could better explain next steps. For example, would there be an additional layer of review before de-implementation campaigns? Are there additional questions that would need to be addressed to advance some of these treatments into a concerted de-implementation plan? The manuscript ended a bit abruptly and there might be a missed opportunity to discuss issues on evidence and whether/how these "396 medical reversals" indicate a problem.
Our main goal is to identify these practices so that the resulting list can assist de-implementation efforts. We have added a statement of this goal to the Introduction. We have also added to the Conclusion by expanding on why the identification of these 396 medical practices is important.
1) The title could be more specific in referring to JAMA, Lancet, NEJM. It is a comprehensive analysis of those three journals, not 'the biomedical literature' which of course is far broader. The appropriate title can be discussed with the Features Editor.
We have rephrased the wording in the title as has been suggested by the editorial team at eLife but are happy to consider others.
2) General: Why was the BMJ not included as the fourth highest general medical journal? The BMJ has a strong track record of publishing work that aligns with the reversal agenda. Also, NEJM and JAMA tend to be very US-centric but both Lancet and BMJ (albeit UK based) tend to be more outward looking to the rest of the world. Given the degree of publication bias that works against reversals coming to light, the BMJ might offer high marginal yield and the fact that it was excluded may have been a missed opportunity. Please elaborate in the Discussion on the choice of journals.
We thank the reviewer for their suggestion and acknowledge that a review of BMJ would have been within the scope of our review. The top three medical journals (NEJM, Lancet, and JAMA) were decided upon in a priori discussion with the funder. As such, the funding for this specific project covered time and personnel to review these journals only. This amounted to hundreds of thousands of dollars of expenditure and 7000 person-hours.
3) General: Is it possible that the definition of reversals was overly permissive? What might be more compelling is if the authors can document that interventions had been recommended in clinical practice guidelines at some point before the trial result. Or, at least, it would be nice if evidence were presented that the intervention was in widespread use. It is possible that some interventions that "reversed" were not actually in widespread use and/or recommended.
This is a fair point. While there is the possibility that our coding of ‘established’ is overly permissive, we attempted to minimize this. We did so by defining an ‘existing practice’ to be one that is codified into practice by guidelines, or one for which we could prove use outside of a clinical trial or clinical protocol. Of course, this definition is broad, and the degree to which a practice is used may range from some use to near total use. The solution to this would be to perform an administrative database search for every practice we considered. While this was not in the scope of the current work, it will be done for selected reversals in years to come. We have added this as a limitation to our study.
4) Abstract: "This may serve as a starting point for de-adoption efforts and the study of the utilization of these practices." This may be one starting point but with the proliferation of other lists (e.g., Choosing Wisely [CW]) this is a parallel or adjunct starting point. While this is likely a more robust starting point, the reality is CW lists are front and center right now so are attracting more attention than they probably warrant from an evidentiary basis.
You are correct to note this is an adjunct starting point. We have reworded the conclusion here to indicate that these results enhance and expand upon previous efforts (in the Abstract and in the conclusion).
5) Introduction: Readers would benefit from a definition of reversal as used in this study, and in particular if the focus is on interventions in toto, or their application by/to patient subgroups (e.g., accounting for heterogeneity of treatment effect), or both. This might be gleaned from Table 2 but it needs addressing upfront in main text (probably Introduction).
Thank you. Great point. We have clarified the definition for medical reversal in the Introduction.
6) Introduction: "…and identifying these practices is a starting point in for reducing cost and improving care." Again, with this framing you are missing an opportunity for this paper. There are so many lists of LVC now that identification is less of a challenge, prioritization from the lists perhaps more so, particularly for policymakers. The robust method you have undertaken and results offer as much if not more of a prioritization opportunity, than an identification opportunity. In short, taken in isolation it is identification, but viewed through the bigger picture it offers a prioritization angle, particularly given CW approaches by individual societies have used less formal methods.
We have added to the Introduction that our broad method may permit others to prioritize de-implementation efforts. These efforts may concern higher cost or more invasive practices, which may arguably be tackled first.
7) Introduction, paragraph two: This paragraph is a bit confusing. It may be difficult to name (or identify?) these practices for many reasons beyond issues with the scope of the Cochrane review database. This should be expanded on.
We have added several sentences and reworded some to better describe the pros and cons of other methods for identifying low-value practices.
8) Introduction, paragraph three: This paragraph hints at some of the good reasons to go 'beyond' the CW lists to identify medical reversals, but these probably aren't obvious to readers unfamiliar with the campaign, and I think it is really important to justify why a broader search strategy is necessary to identify low-value practices.
We have added an additional sentence to describe why we conducted this study in the way that we did.
9) Introduction, paragraphs five and six: Referencing and focusing on the earlier study on the identified NEJM RCTs would make more sense here, rather than focus so much on the Schwartz 2014 study since the current article has very different aims/results compared to this paper.
The reviewer is correct in that our study has different aims than the Schwartz 2014 study. Our intent was to highlight the financial ramifications of low-value medical practices. Since the Schwartz paper only looked at the 26 most common practices, the financial outcomes of 396 practices could be even greater. However, because a financial analysis was out of the scope of our study and we were not able to calculate the total costs, we discussed the work by Schwartz and colleagues. We did augment this part of the Discussion by bringing in prior medical reversal findings by Prasad and colleagues (2013).
10) Methods: For the second step (searching for systematic reviews), how did the authors handle the quality of SR/MAs as well as instances where SR/MAs came to conflicting conclusions?
We did not do a formal analysis of the quality of the systematic reviews. Our preference was to use Cochrane Reviews because they are recognized for their high-quality methods in conducting systematic reviews and meta-analyses. Of the reversals that were confirmed by a review, 147 (70%) were confirmed by a Cochrane review. In cases where there was no Cochrane review and there were multiple reviews, we used ones that were newer and ones that population-weighted their analysis. More detail has been added to the Methods section.
11) Results: The aim of the article was to compile a list of medical reversals. This is a challenge for the Results section, because there are many possible descriptive statistics from this list that the authors could choose to present (Table 1) – none of which is specific to the aim of the article. The Results (and/or Methods) section could improve with justification of why specific results were included.
We have chosen a few ways to describe these data, but to be clear, this is primarily a descriptive and not a hypothesis-driven work. We plan to publish our supplement on a website that will permit users to sort the reversals as they wish and that will include a search engine. We are working with a web developer to make this happen.
12) Results: The paper does not appear to have been driven by a particular hypothesis, which makes it feel more like a gallery or reference work than a research report. It would have helped to give context if hypotheses were articulated and perhaps some relationships tested. For example, one wondered: who funded the "reversal" studies? How large were the reversals – did they tend to barely pull SR estimates over the null, or did they exert decisive effects?
This work is primarily descriptive rather than hypothesis-driven. We appreciate the suggestion of adding the funding source to the reported data. We have added these data to the Results section, with additional detail given in the Methods.
Related to the above, it is not immediately clear to the reader what the message is here. Clearly, it is sometimes important to run randomized trials, since there can be residual uncertainty about efficacy. But there is no point in running trials if something is already proven. So the manuscript leaves lots of questions unanswered: are too many treatments reversed? Is there not enough research aimed at reversals? Does too much time pass between uptake and reversal? What types of interventions are at greater risk of an extended period of "reversal latency?" A better concluding paragraph would help pull the piece together. Right now it sort of ends abruptly.
13) Table 2 (and also the supplementary file 2 with all reversals) would be improved if they clearly stated the authors' classification of the medical discipline and intervention type. The authors stated this as a potential limitation and potential disagreement/debate with readers in the Discussion, so including these in Table 2 would at least give some insight into the authors' logic.
We have added a column to Table 2 giving the primary medical discipline for each medical practice reversal.
14) The authors also state that they hope the full list of reversals would be a resource for "de-adoption efforts", so why not include the medical discipline in the supplementary file. This seems like an easy step to make it much more useful to specialists and policy-makers interested in disinvestment in their area.
We have added a column that specifies the primary medical discipline for each medical reversal. We will also be including the results of our analysis on a medical reversal website, which we are currently designing. A link to this website will be shared with eLife.
15) It is also not clear how the supplementary file is organised. Why is there a split between the tables and references after intervention 57, 81 and so on? It would be helpful to have some headings or descriptions here. PubMed IDs (PMID) should be added to every entry for ready reference.
The supplemental file is arranged in reverse chronological order, by journal. As for the breaks in the table(s), this was done for purely practical reasons – we started a new table when Endnote (the reference software we used) was about to crash. Access to a more powerful computer with specialized RAM capability would solve this issue by allowing us to create a single table with one set of references. The website will have links to all articles that we reference in our write-ups.
16) Discussion, paragraph three: Makes a really good point about the patient 'accessed' low-value care/reversals, which hasn't been discussed much in the literature before. Adding something about the need for direction/further research here would be useful to highlight this point more (such as how the public accesses or could be more aware of this evidence).
We thank the reviewer for this comment and have added a sentence on the need for discussion regarding these interventions and their use in everyday practice.
17) Strengths and limitations: "Studies were deemed inconclusive if they demonstrated neither clear benefit nor harm." The qualifier 'inconclusive' may not be accounting for poor cost-effectiveness (e.g., inconclusive but at a cost, possibly more than a comparator). And harm is poorly measured in RCTs, certainly in relation to the trial evaluation and definitely in relation to its real world use, i.e., harms are incomprehensively measured, particularly those downstream harms (and costs). An 'inconclusive' label should place the burden of proof back on the intervention champions/sponsors to prove safety and effectiveness, rather than let it be perceived as benign. Also the Methods section does not speak to trials that involved measures of equivalence or non-inferiority. If an intervention was found clinically equivalent/non-inferior did it evade further scrutiny in this study? An obvious concern here is marginal cost-effectiveness (i.e., EQ/NI but more expensive) or questions about sufficient measuring and reporting of harm. Please clarify.
These are excellent points by the reviewer. The inconclusive label considers cost-effectiveness only insofar as the practice works or does not work, although our coding did not explicitly use cost to judge these practices. Reversals are practices that did not improve important outcomes, and so we were most interested in the effectiveness of an intervention rather than its cost. Some entities did not make our list if they offered real benefits, even at tremendous marginal cost. In other words, there may be non-reversals that are not cost effective. However, in many of the studies where there were notable cost differentials between the two treatments, the more expensive intervention was also the novel intervention. There were a few studies that compared two established interventions, where one was the more expensive of the two. In this case, the newer (albeit established) and more expensive intervention would have had to show greater benefit for the results to be considered positive. Also, we interpreted our results in the context in which the study was framed. If the study was a noninferiority or equivalence study and the intervention met the noninferiority margin, it was considered positive. We have added a few sentences to the Methods section for clarification. As for measuring harm, we did find that some interventions were harmful in a statistically significant way; these studies were coded as negative because they did not meet their primary endpoint. The studies that were coded as inconclusive were ones that may have met their primary endpoint, but in which other endpoints were as meaningful to the patient as the primary endpoint, or more so. Such is the case with alteplase for survivors of stroke, where it did prolong overall survival but did not improve functional scores. In this case, there is debate about which outcome is the better one on which to make a conclusive decision about benefits.
18) "…and the conclusions needed to be based on results from RCTs only (not on observational or nonrandomized studies). Articles with MA/SRs refuting the reversal were excluded from the final analysis." But all RCTs are not created equal. Did you carry out quality appraisal and critique? Were RCTs with higher degrees of design rigour (adequate blinding, sham arms, use of appropriate comparator/s etc) offered more weight than those of less rigour? If not then might this mean your methods are quite conservative and therefore specific? (i.e., some of your excluded studies might be false negatives?) The last sentence in 'limitations' indicates no quality appraisal was conducted. If so this heightens the risk of false negatives, and this would be a major limitation of the study.
We have added more detail of how the MA/SR were selected. We did not perform a formal evaluation of the quality of SR/MA. Our preference was to use a Cochrane Review because the quality of their reviews is well established, and in most cases, we were able to do so. In a number of instances, there was only one SR/MA to confirm or refute the reversal. In instances where there were multiple reviews, our methods were to use a Cochrane review first, then another review from a high-quality journal, with newer reviews having priority over older reviews, and reviews that population-weighted their analysis over ones that did not. As for the quality of the individual RCTs, we deferred to the SR/MA to report that.
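The order of preference described in this response can be illustrated with a short sketch. It is only an illustration under stated assumptions, not the procedure as it was actually carried out (the selection was done by the reviewers, not by software): the `Review` fields and the `select_review` helper are hypothetical names, and ranking recency ahead of population weighting is an assumption, since the text does not say which of the two takes precedence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    """Hypothetical record for one candidate systematic review or meta-analysis."""
    title: str
    is_cochrane: bool
    high_quality_journal: bool
    year: int
    population_weighted: bool


def preference_key(review):
    # Tuples compare lexicographically, so earlier fields dominate later ones.
    # Assumed order: Cochrane, then high-quality journal, then newer,
    # then population-weighted analysis.
    return (
        review.is_cochrane,
        review.high_quality_journal,
        review.year,
        review.population_weighted,
    )


def select_review(candidates):
    """Return the most preferred candidate, or None if there are no candidates."""
    if not candidates:
        return None
    return max(candidates, key=preference_key)


# Illustrative, made-up candidates (not reviews from the study).
candidates = [
    Review("Journal review, 2016", False, True, 2016, True),
    Review("Cochrane review, 2010", True, False, 2010, False),
    Review("Journal review, 2012", False, True, 2012, False),
]
chosen = select_review(candidates)
print(chosen.title)
```

With these made-up candidates the Cochrane review is chosen even though it is the oldest, which mirrors the stated preference for Cochrane reviews over all other sources.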
https://doi.org/10.7554/eLife.45183.012
Article and author information
Funding
Laura and John Arnold Foundation
- Vinay Prasad
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Copyright
© 2019, Herrera-Perez et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.