
Inequalities in the distribution of National Institutes of Health research project grant funding

  1. Michael S Lauer (corresponding author)
  2. Deepshikha Roychowdhury
  1. National Institutes of Health, Office of the Director, United States
  2. NIH Office of Extramural Research, United States
Research Article
Cite this article as: eLife 2021;10:e71712 doi: 10.7554/eLife.71712

Abstract

Previous reports have described worsening inequalities of National Institutes of Health (NIH) funding. We analyzed Research Project Grant data through the end of Fiscal Year 2020, confirming worsening inequalities beginning at the time of the NIH budget doubling (1998–2003), while finding that trends in recent years have reversed for both investigators and institutions, but only to a modest degree. We also find that career-stage trends have stabilized, with equivalent proportions of early-, mid-, and late-career investigators funded from 2017 to 2020. The fraction of women among funded PIs continues to increase, but they are still not at parity. Analyses of funding inequalities show that inequalities for investigators, and to a lesser degree for institutions, have consistently been greater within groups (i.e. within groups by career stage, gender, race, and degree) than between groups.

Introduction

Over the past few years, there has been increasing interest (Peifer, 2017) in how the National Institutes of Health (NIH) funding support is distributed, with concern voiced by some that there may be excess concentration of support given to men and to the most well-funded late-career investigators. In a report (National Institutes of Health, 2019) issued by an NIH Working Group to the Advisory Committee to the Director (ACD), it was noted that "In biomedical science, power stems from who has access to awards. The Working Group heard repeatedly that the concentration of funding in a relatively small number of investigators (who are overwhelmingly white, cisgender, straight men) incentivizes universities to protect researchers bringing in high levels of grant funding".

Recently published literature has raised concerns regarding how NIH distributes funding support. One report (Katz and Matter, 2020) which focused on all ‘R’ grants found increasing inequality of funding support over 30 years (1985–2015). A research letter (Oliveira et al., 2019) found lower levels of support for grants in which women were identified as Principal Investigators. Other reports have documented disproportionate aging of the research workforce (Blau and Weinberg, 2017) and stresses particular to mid-career investigators (Charette et al., 2016); these reports are concerning given evidence that there is no correlation between research stage and scientific impact (Sinatra et al., 2016).

In 2017, the NIH considered imposing a cap (Lauer, 2017a) on individual-investigator research support through use of a ‘Grant Support Index or GSI’ (Lauer, 2017b) which classified grants according to mechanism (e.g. R01, P01, U54) rather than according to dollars. The GSI set a value of 7 for R01 grants, with lower values for ‘smaller’ mechanisms like R03 or R21 and greater values for mechanisms like P01 or U54. The proposed cap was set at 21, meaning that on average no investigator could be designated as PI on more than the equivalent of three R01 grants. The proposed cap was highly controversial (Kaiser, 2017) and was dropped in favor of a different approach (Lauer et al., 2017) that targeted funds directly toward early career investigators.

Here, we present updated data on distribution of NIH support for principal investigators (‘PIs’, keeping in mind that NIH issues awards to institutions [Lauer, 2018], not to individual scientists) with particular attention to career stage, gender, race, and degree. We focus on research project grants (‘RPGs’) as these comprise close to 80% of all NIH extramural research funding; we can also assess patterns that are independent of already well-known disparities for small business and non-RPG research grants.

Results

Distribution of funding to RPG PIs over time

Figure 1 shows different measures of funding distribution to RPG PI’s between fiscal years 1985 to 2020. These measures reflect different approaches that economists use to assess income inequality; here we use RPG funding as the analogue of income. We use three different measures:

  • Proportion of funds going to the top 1%, or centile, as well as to the top 10%, or decile, (Saez and Zucman, 2020) in contrast to the proportion to the bottom 50% (Panel A) or considered alone (Panel C).

  • Standard deviation of the log of funding (Hoffmann et al., 2020), a measure that accounts for the well-documented skewness in funding and that is particularly sensitive to low and intermediate levels of funding (Panel B).

  • The Theil T index (Conceição and Ferreira, 2000), a measure that is more sensitive to higher levels of funding (Panel D). Unlike other measures of inequality, the Theil Index is not intuitive. However, it can be used to parse group data, allowing us to parse inequality into within group and between group components; for example, we can see whether there is a greater degree of inequality between men and women as opposed to within cohorts of men and women.
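As a rough illustration of the first two measures, the following minimal sketch computes top and bottom funding shares and the standard deviation of log funding. The function names and the toy funding values are hypothetical and are not taken from the study data; the use of the sample (n − 1) standard deviation is an assumption, since the text does not specify it.

```python
import math

def top_share(funding, fraction):
    """Share of total funding held by the top `fraction` of recipients."""
    ordered = sorted(funding, reverse=True)
    k = max(1, round(len(ordered) * fraction))  # at least one recipient
    return sum(ordered[:k]) / sum(ordered)

def bottom_share(funding, fraction):
    """Share of total funding held by the bottom `fraction` of recipients."""
    ordered = sorted(funding)
    k = max(1, round(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)

def sd_log(funding):
    """Sample standard deviation of log funding (all values must be > 0)."""
    logs = [math.log(x) for x in funding]
    mean = sum(logs) / len(logs)
    return math.sqrt(sum((v - mean) ** 2 for v in logs) / (len(logs) - 1))

# Hypothetical funding levels in $ million for ten investigators.
toy_funding = [0.2, 0.3, 0.4, 0.4, 0.5, 0.6, 0.8, 1.0, 2.0, 3.8]

print("top decile share:", top_share(toy_funding, 0.10))
print("bottom half share:", bottom_share(toy_funding, 0.50))
print("SD of log funding:", sd_log(toy_funding))
```

Because taking logarithms compresses large values, the standard deviation of log funding responds mostly to differences at the low and intermediate end, which is why it is paired here with the top-share measures.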

Figure 1
Distribution of Research Project Grant (RPG) Principal Investigator (PI) Funding, Fiscal Years 1985–2020.

Panel A: Percent of RPG funds distributed to the top centile, top decile, and bottom half of investigators. Panel B: Standard deviation of the log of funding, a measure that focuses primarily on lower and intermediate levels of funding. Panel C: Percent of RPG funds distributed solely to the top centile of investigators. Panel D: Theil T index, a measure more sensitive to the highest funding levels, which hence has a similar appearance to the percent of funds distributed to the top centile. The vertical dotted lines refer to the beginning and end of the NIH doubling and the year of budget sequestration (2013).

All three measures indicate greater inequalities in funding from the early 1990s through 2006, corresponding to the NIH doubling and its aftermath; a plateau from 2006 to 2013; a rapid rise after 2013 (the year of sequestration) to 2017; and a decline approaching 2013 levels from 2018 to 2020. The inequalities are more striking among the most highly funded investigators (Panels C and D), where increases are noted with the NIH doubling (1998 to 2003) and in the first few years after the 2013 budget sequestration. The top 1% of investigators received 8% of RPG funds in 1998; in recent years, they received close to 10% of funds. While this may not seem like much, we should keep in mind that a difference of 2% of RPG funds means that a small group of ~300 investigators are receiving in 2020 approximately $420 million (inflation-adjusted) more than they would have received by 1998 standards. Given that the average RPG costs about $500,000, this difference is the equivalent of 800 grants. Inequalities among investigators receiving low to intermediate levels of funding followed a somewhat different trajectory, decreasing during the NIH doubling while increasing after 2013.
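The arithmetic in the preceding paragraph can be retraced with a short sketch. The three inputs are the rounded figures quoted in the text; the total RPG budget is not stated in the text and is only implied by them, so the derived value should be read as an approximation. Variable names are illustrative.

```python
# Rounded figures quoted in the text.
extra_funding = 420e6      # approx. additional inflation-adjusted dollars to the top 1%
share_difference = 0.02    # roughly 10% of RPG funds now versus 8% in 1998
average_rpg_cost = 500e3   # approx. cost of an average RPG

# Implied (not stated) total RPG funding consistent with the two figures above.
implied_total_rpg_funding = extra_funding / share_difference

# Number of average-sized grants the difference could support.
grant_equivalents = extra_funding / average_rpg_cost

print(f"implied total RPG funding: ${implied_total_rpg_funding / 1e9:.0f} billion")
print(f"grant equivalents: {grant_equivalents:.0f}")
```

With these rounded inputs the quotient is 840 grants, in line with the text's figure of roughly 800.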

Characteristics of the most highly funded RPG principal investigators

Table 1 shows characteristics of 34,936 principal investigators funded in fiscal year 2020 according to whether or not they were among the top funded centile. We defined proxies for career stage according to age, with values of ‘early’ (age < 46), ‘middle’ (age 46 to 58), and ‘late’ (age > 58). Compared to the bottom 99%, the top 1% of investigators were in later career stages and more likely to be white, non-Hispanic, and to hold an MD degree (either alone or with a PhD). The difference in funding levels is striking, with top 1% investigators receiving a median of $4.8 million compared to $0.4 million for all others; they were also much more likely to be supported on multiple RPG grants.
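The age-based proxy for career stage described above can be written as a small function. This is a minimal sketch using the cut-offs given in the text; the function name is hypothetical, and integer ages are assumed.

```python
def career_stage(age):
    """Map a PI's age to the career-stage proxy used in the text."""
    if age < 46:
        return "early"
    if age <= 58:
        return "middle"
    return "late"

for age in (35, 46, 58, 59):
    print(age, career_stage(age))
```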

Table 1
Investigator characteristics according to centile of funding in fiscal year 2020.

Values shown in parentheses are percentages for categorical variables and IQR for continuous variables. IQR = inter-quartile range. ND = not displayed due to small cell size.

| Characteristic | Category | Top 1% | Bottom 99% |
| --- | --- | --- | --- |
| Total N (%) | | 349 (1.0) | 34587 (99.0) |
| Career Stage | Early | 30 (8.6) | 10567 (30.6) |
| | Middle | 128 (36.7) | 12936 (37.4) |
| | Late | 162 (46.4) | 8273 (23.9) |
| Gender | Female | 102 (29.2) | 11858 (34.3) |
| | Male | 241 (69.1) | 21695 (62.7) |
| Race | White | 277 (79.4) | 23264 (67.3) |
| | Asian | 42 (12.0) | 7523 (21.8) |
| | Black or African-American | ND | 639 (1.8) |
| | More than One Race | ND | 418 (1.2) |
| Ethnicity | Hispanic | 12 (3.4) | 1622 (4.7) |
| | Not Hispanic | 306 (87.7) | 29513 (85.3) |
| Degree | PhD | 166 (47.6) | 24620 (71.2) |
| | MD | 116 (33.2) | 5238 (15.1) |
| | MD-PhD | 60 (17.2) | 3572 (10.3) |
| | Other | ND | 1157 (3.3) |
| Funding in $Million | Median (IQR) | 4.8 (4.0 to 6.5) | 0.4 (0.3 to 0.7) |
| Number of RPG Awards | One | 69 (19.8) | 23268 (67.3) |
| | Two | 86 (24.6) | 7571 (21.9) |
| | Three | 52 (14.9) | 2540 (7.3) |
| | Four | 60 (17.2) | 847 (2.4) |
| | Five or More | 82 (23.5) | 361 (1.0) |

Table 2 shows corresponding characteristics of 19,221 principal investigators funded in fiscal year 1995, before the beginning of the NIH doubling. In contrast to 2020, career stage and race differences were less marked, but gender differences were more so. During both eras (before the doubling and in most recent times) top centile investigators were much more likely to hold an MD degree. Consistent with prior literature (Blau and Weinberg, 2017), the age range of all NIH funded investigators is skewing older over time. Another noteworthy difference between FY2020 and FY1995 is that much greater proportions of investigators were supported on multiple – 3, 4, or 5 or more – grants in FY2020 than in FY1995.

Table 2
Investigator characteristics according to centile of funding in fiscal year 1995.

Data on ethnicity are not provided due to high rates of missingness (more than one-third). Dollar values are inflation-adjusted to a FY2019 reference standard. Values shown in parentheses are percentages for categorical variables and IQR for continuous variables. IQR = inter-quartile range. ND = not displayed due to small cell size.

| Characteristic | Category | Top 1% | Bottom 99% |
| --- | --- | --- | --- |
| Total N (%) | | 192 (1.0) | 19029 (99.0) |
| Career Stage | Early | 37 (19.3) | 8757 (46.0) |
| | Middle | 111 (57.8) | 7305 (38.4) |
| | Late | 33 (17.2) | 1852 (9.7) |
| Gender | Female | 23 (12.0) | 4266 (22.4) |
| | Male | 165 (85.9) | 14439 (75.9) |
| Race | White | 167 (87.0) | 16121 (84.7) |
| | Asian | 14 (7.3) | 1525 (8.0) |
| | Black or African-American | ND | 164 (0.9) |
| | More than One Race | ND | 89 (0.5) |
| Degree | PhD | 72 (37.5) | 13418 (70.5) |
| | MD | 81 (42.2) | 3740 (19.7) |
| | MD-PhD | 38 (19.8) | 1703 (8.9) |
| | Other | ND | 168 (0.9) |
| Funding in $Million | Median (IQR) | 4.5 (4.0 to 5.7) | 0.4 (0.3 to 0.7) |
| Number of RPG Awards | One | 46 (24.0) | 14894 (78.3) |
| | Two | 61 (31.8) | 3391 (17.8) |
| | Three | 57 (29.7) | 617 (3.2) |
| | Four | 22 (11.5) | 115 (0.6) |
| | Five or More | ND | 12 (0.1) |

Inequalities between and within groups

Figure 2 shows secular changes in composition of the RPG PI workforce between FY1985 and FY2020. Over time, there have been increases in the proportion of late career, female, and Asian investigators. Middle career investigators have comprised a lower proportion of the workforce since the mid-2000s. Over the past 4–5 years, the proportions of PIs at different career stages have stabilized. The proportion of late-career investigators is no longer rising while that of mid-career investigators is no longer falling. This stabilization has occurred at the same time as NIH implementation of its Next Generation Researchers Initiative (Lauer et al., 2017). The fraction of women among funded PIs continues to increase, but they are still not at parity. The proportion of MD-only degree holders has fallen, while the proportion of MD-PhD degree holders has increased. Figure 3 uses box plots to show the FY2020 distribution of funding to RPG PIs according to career stage, gender, race, and degree. Late career investigators, men, whites, and those holding MD degrees are better funded. Nonetheless, one notes that there appears to be greater variability within groups than between groups.

Figure 2
Secular changes in the composition of the RPG PI Workforce from fiscal year 1985 to fiscal year 2020.

Race data are shown from 1995 on due to high proportions of unknown values beforehand. Each plot shows the percentage of RPG PIs according to different groupings. All percentages add up to 100. Panel A: Career Stage. Panel B: Gender. Panel C: Race. Panel D: Degree.

Figure 3
Box plots showing the distribution of funding in FY2020 according to PI groups.

Diamonds refer to means; the higher means compared to medians reflect highly skewed distributions. Outliers are not displayed. Panel A: Career Stage. Panel B: Gender. Panel C: Race. Panel D: Degree. For all groups, variability appears to be greater within groups than between groups. AA = African-American.

Table 3 shows FY2020 characteristics according to career stage. Late-career investigators were more likely to be white males, to hold MD degrees, and to be designated as PI on a larger number of grants. Table 4 shows FY2020 investigator characteristics according to gender. Women were younger, more likely to hold a PhD degree, and less likely to be principal investigators of 2 or more RPG grants. Table 5 shows corresponding race data. Black or African-American investigators were younger, more likely to be women, and more likely to hold MD degrees. They were also much more likely to serve as PI on only one RPG grant.

Table 3
Investigator characteristics according to career stage in fiscal year 2020.

Values shown in parentheses are percentages for categorical variables and IQR for continuous variables. IQR = inter-quartile range.

| Characteristic | Category | Early | Middle | Late |
| --- | --- | --- | --- | --- |
| Total N (%) | | 10597 (30.3) | 13064 (37.4) | 8435 (24.1) |
| Gender | Female | 4241 (40.0) | 4505 (34.5) | 2267 (26.9) |
| | Male | 6145 (58.0) | 8464 (64.8) | 6128 (72.6) |
| Race | White | 6855 (64.7) | 8509 (65.1) | 6990 (82.9) |
| | Asian | 2515 (23.7) | 3440 (26.3) | 955 (11.3) |
| | Black or African-American | 270 (2.5) | 232 (1.8) | 84 (1.0) |
| | More than One Race | 209 (2.0) | 153 (1.2) | 42 (0.5) |
| Degree | PhD | 8643 (81.6) | 9115 (69.8) | 5355 (63.5) |
| | MD | 1067 (10.1) | 2006 (15.4) | 1935 (22.9) |
| | MD-PhD | 651 (6.1) | 1714 (13.1) | 998 (11.8) |
| | Other | 236 (2.2) | 229 (1.8) | 147 (1.7) |
| Funding in $Million | Median (IQR) | 0.4 (0.2 to 0.6) | 0.4 (0.3 to 0.8) | 0.5 (0.3 to 0.9) |
| Funding Percentile Rank | Median (IQR) | 55.3 (31.7 to 78.2) | 47.2 (23.2 to 72.2) | 42.5 (18.6 to 71.7) |
| Number of RPG Awards | One | 7704 (72.7) | 8149 (62.4) | 5377 (63.7) |
| | Two | 2047 (19.3) | 3163 (24.2) | 1957 (23.2) |
| | Three | 584 (5.5) | 1155 (8.8) | 701 (8.3) |
| | Four | 183 (1.7) | 384 (2.9) | 268 (3.2) |
| | Five or More | 79 (0.7) | 213 (1.6) | 132 (1.6) |

Table 4
Investigator characteristics according to gender in fiscal year 2020.

Values shown in parentheses are percentages for categorical variables and IQR for continuous variables. IQR = inter-quartile range.

| Characteristic | Category | Women | Men |
| --- | --- | --- | --- |
| Total N (%) | | 11960 (34.2) | 21936 (62.8) |
| Career Stage | Early | 4241 (35.5) | 6145 (28.0) |
| | Middle | 4505 (37.7) | 8464 (38.6) |
| | Late | 2267 (19.0) | 6128 (27.9) |
| Race | White | 8528 (71.3) | 14876 (67.8) |
| | Asian | 2405 (20.1) | 5106 (23.3) |
| | Black or African-American | 296 (2.5) | 342 (1.6) |
| | More than One Race | 189 (1.6) | 230 (1.0) |
| Degree | PhD | 9093 (76.0) | 15278 (69.6) |
| | MD | 1734 (14.5) | 3540 (16.1) |
| | MD-PhD | 867 (7.2) | 2732 (12.5) |
| | Other | 266 (2.2) | 386 (1.8) |
| Funding in $Million | Median (IQR) | 0.4 (0.3 to 0.7) | 0.4 (0.3 to 0.8) |
| Funding Percentile Rank | Median (IQR) | 51.3 (27.1 to 76.7) | 48.0 (23.1 to 72.7) |
| Number of RPG Awards | One | 8409 (70.3) | 14002 (63.8) |
| | Two | 2512 (21.0) | 5066 (23.1) |
| | Three | 732 (6.1) | 1833 (8.4) |
| | Four | 212 (1.8) | 688 (3.1) |
| | Five or More | 95 (0.8) | 347 (1.6) |

Table 5
Investigator characteristics according to race in fiscal year 2020.

Values shown in parentheses are percentages for categorical variables and IQR for continuous variables. IQR = inter-quartile range. ND = not displayed due to small cell size.

| Characteristic | Category | White | Asian | Black or African-American |
| --- | --- | --- | --- | --- |
| Total N (%) | | 23541 (67.4) | 7565 (21.7) | 643 (1.8) |
| Career Stage | Early | 6855 (29.1) | 2515 (33.2) | 270 (42.0) |
| | Middle | 8509 (36.1) | 3440 (45.5) | 232 (36.1) |
| | Late | 6990 (29.7) | 955 (12.6) | 84 (13.1) |
| Gender | Female | 8528 (36.2) | 2405 (31.8) | 296 (46.0) |
| | Male | 14876 (63.2) | 5106 (67.5) | 342 (53.2) |
| Degree | PhD | 17086 (72.6) | 5398 (71.4) | 406 (63.1) |
| | MD | 3831 (16.3) | 976 (12.9) | 141 (21.9) |
| | MD-PhD | 2211 (9.4) | 1094 (14.5) | 73 (11.4) |
| | Other | 413 (1.8) | 97 (1.3) | 23 (3.6) |
| Funding in $Million | Median (IQR) | 0.4 (0.3 to 0.8) | 0.4 (0.3 to 0.7) | 0.4 (0.2 to 0.6) |
| Funding Percentile Rank | Median (IQR) | 47.8 (23.6 to 73.6) | 51.3 (26.3 to 73.5) | 60.7 (32.1 to 83.8) |
| Number of RPG Awards | One | 15506 (65.9) | 4914 (65.0) | 504 (78.4) |
| | Two | 5344 (22.7) | 1688 (22.3) | 103 (16.0) |
| | Three | 1784 (7.6) | 616 (8.1) | 25 (3.9) |
| | Four | 611 (2.6) | 230 (3.0) | ND |
| | Five or More | 296 (1.3) | 117 (1.5) | ND |

The Theil T index enables us to formally assess between-group and within-group contributions to inequality. Figure 4 shows that for all groupings, within-group differences contribute more to inequality than between-group differences. The small between-group differences are shown in Figure 5. Late stage investigators, men, whites, and investigators with MD degrees contribute ‘positive elements’ because they on average receive higher levels of funding. Nonetheless, the absolute values of these elements, as compared to the total Theil index, are small.

Figure 4
Components of Theil index, showing between-group and within-group contributions to overall inequality over time.

Panel A: Career Stage. Panel B: Gender. Panel C: Race. Panel D: Degree. For all groups, within-group differences contribute more to inequality than between-group differences.

Figure 5
Theil Elements in different groups over time.

Panel A: Career stage. Panel B: Gender. Panel C: Race. Panel D: Degree. Values above the zero line indicate that groups received above average funding, while values below zero indicate below average funding. Thus, as in Panel A, late stage investigators received above average funding and early stage investigators received below average funding. Middle career investigators initially received above average funding, but in recent years have received funding close to average, contributing little to inequality. AA = African-American.

Organizational inequalities

In additional analyses, we look at RPG funding inequalities among organizations. Figure 6 shows data analogous to those in Figure 1. Because the absolute number of organizations is much smaller than the number of PIs (e.g. in 2020 there were 1097 unique organizations receiving RPG funding), we focus on the top decile (10%) rather than the top centile. The top 10% of organizations have been receiving approximately 70% of RPG funding, while the bottom half have received well under 5%. As with PIs, inequalities increased after the doubling, but patterns in more recent years have differed. Inequalities decreased in the late 2000s (perhaps coincident with the 2008 financial crash), but have increased slightly in more recent years.

Figure 6
Distribution of Research Project Grant (RPG) Organization Funding, Fiscal Years 1985–2020.

Panel A: Percent of RPG funds distributed to the top decile and bottom half of organizations. Panel B: Standard deviation of the log of funding, a measure that focuses primarily on lower and intermediate levels. Panel C: Percent of RPG funds distributed solely to the top decile of organizations. Panel D: Theil T index, a measure more sensitive to the highest funding levels, which hence has a similar appearance to the percent of funds distributed to the top decile. The vertical dotted lines in Panels B, C, and D refer to the beginning and end of the NIH doubling and the year of budget sequestration (2013).

Figure 7 shows distribution of RPG funding in Fiscal Year 2020 according to organization type. Because the distributions are highly skewed (even more so than with PIs), we show log-transformed values (Panel A). There are marked differences between groups – medical schools are receiving higher levels of funding than other institutions. We confirm this by calculating Theil indices, which show that organizational inequalities stem from both between-group and within-group variability (Panel B). The Theil elements plot (Panel C), consistent with Panel A, shows that medical schools, and to a lesser extent hospitals, are groups that receive higher levels of funding. Figure 8 shows corresponding data according to organization region. Funding inequalities were greater within regions than between regions. Figure 9 shows similarly that, for domestic institutions, within-state inequalities contribute more to overall inequality than between-state inequalities.

Figure 7
RPG funding distribution and inequalities according to organization type.

Panel A: Box plots showing distributions of log-transformed RPG funding in FY2020. Panel B: Theil index components plot, showing that both between-group and within-group inequalities contribute to overall inequality. Panel C: Theil elements plot. Values above the zero line indicate that groups received above average funding, while values below zero indicate below average funding. Medical schools and hospitals received above average funding.

Figure 8
RPG funding distribution and inequalities according to organization region.

Panel A: Box plots showing distributions of log-transformed RPG funding in FY2020. Panel B: Theil index components plot, showing that within-group inequalities primarily contribute to overall inequality. Panel C: Theil elements plot. Values above the zero line indicate that groups received above average funding, while values below zero indicate below average funding. Foreign organizations received below average funding.

Figure 9
RPG funding distribution and inequalities according to organization state within the United States.

The panel shows a Theil index components plot, showing that within-state inequalities contribute more to overall inequality than between-state inequalities.

Perspective: Income inequality in the United States and Europe – Population Data

To put these NIH-specific data into perspective, we present high-level income inequality data for the general populations of the United States and the European Union. We show data from the World Inequality Database (Saez, 2021), which was developed by Emmanuel Saez and colleagues.

Figure 10 shows percent of annual income going to the top centile (Panel A) and the bottom half (Panel B) of the populations of the United States and Europe from 1980 to 2020. We focus on income, instead of wealth, since income for most people comes from remuneration for work and therefore would be analogous to RPG funding awarded in anticipation of scientific work. At all times, income inequality has been greater in the United States. Changes in inequality have also been greater in the United States. From 1995 to 2019, the proportion of income going to the top centile of the United States population has increased from 14.3% to 18.7%, a relative increase of 31%. During the same time, the proportion of RPG funding going to the top centile of RPG PIs has increased from 8.3% to 10.8% (Figure 1, Panel C), a relative increase of 30%. Although the US population and NIH-funded PIs have experienced different events – e.g., the 2000 recession and the 2008 financial crash for the US population; the NIH doubling, the 2006 payline crash, the 2013 sequestration, and the recent string of budget increases for NIH-funded PIs – the overall relative changes in inequality at the top are remarkably similar.
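The relative increases quoted above follow from the start and end shares given in the text. A minimal sketch of the calculation (the function name is illustrative):

```python
def relative_increase(start, end):
    """Relative change from `start` to `end`, as a fraction of `start`."""
    return (end - start) / start

# Shares in percent, 1995 versus 2019, as quoted in the text.
us_income_top_centile = relative_increase(14.3, 18.7)
rpg_top_centile = relative_increase(8.3, 10.8)

print(f"US income, top centile: {us_income_top_centile:.1%}")
print(f"RPG funding, top centile: {rpg_top_centile:.1%}")
```

Note that these are relative changes in the share; the absolute changes are 4.4 and 2.5 percentage points respectively.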

Figure 10
United States and European Union income inequality measures from the World Inequality Database.

Panel A: Percent of income going to the top centile of the population. Panel B: Percent of income going to the bottom half.

Discussion

Inequalities in funding of RPG PIs have increased since the NIH doubling, with further increases since sequestration in 2013 (Figure 1). Over the past few years, a time of substantial and sustained budget increases for NIH and a time of focus on early career investigators, there has been a decrease in the degree of inequality, but not quite back to the level of 2013. The RPG funding inequalities primarily reflect changes ‘at the top,’ meaning among the most highly funded investigators (Figure 1, Panels C and D). The top 1%’s share of RPG funding has increased from 8% before the doubling to nearly 10% now (Figure 1, Panel C); this difference translates into ~$400 million, or the equivalent of 800 RPG awards. Since sequestration, the top 1% has received an increased share of funding, while the bottom 50% has received less. During the NIH doubling, both the top centile and the lower half saw increases in the proportion of funding they received (Figure 1, Panels A and B).

The composition of the RPG PI workforce has evolved over time, with greater proportions of investigators who are late career, women, and Asian, and lesser proportions of MD-only degree holders (Figure 2). Despite steady increases in the proportion of women investigators, they are still well below parity (Figure 2, Panel B). Among the groups studied, more funding goes to late career investigators, as well as to men, whites, and holders of MD degrees. Nonetheless, there is greater inequality within groups than between groups (Figures 3–5). One might argue that it may be reasonable for researchers to receive more funding at later career stages as they may have larger networks and are more experienced at posing research questions. Thus, some inequality may be considered ‘acceptable.’ But there is not funding parity for gender or race for researchers in the workforce, which are unacceptable inequalities. Over the past few years NIH has launched high-profile initiatives to enhance the diversity of the biomedical research workforce (National Institutes of Health, 2021b; ORWH, 2021).

Materials and methods

From the NIH IMPAC II database, we obtained PI-specific data on inflation-adjusted total-cost funding of Research Project Grants (RPGs), defined as those grants with activity codes of DP1, DP2, DP3, DP4, DP5, P01, PN1, PM1, R00, R01, R03, R15, R21, R22, R23, R29, R33, R34, R35, R36, R37, R61, R50, R55, R56, RC1, RC2, RC3, RC4, RF1, RL1, RL2, RL9, RM1, UA5, UC1, UC2, UC3, UC4, UC7, UF1, UG3, UH2, UH3, UH5, UM1, UM2, U01, U19, and U34. Not all of these activity codes were used by NIH every year. For FY 2009 and 2010, we excluded awards made under the American Recovery and Reinvestment Act of 2009 (ARRA) and all ARRA solicited applications and awards. For FY2020, we excluded awards issued using supplemental Coronavirus (COVID-19) appropriations. Inflation-adjustments were referenced to FY2019 using the Biomedical Research and Development Price Index (National Institutes of Health, 2021a).
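The selection and inflation-adjustment steps can be sketched as follows. The activity codes are those listed above; the function names are hypothetical, and the index values in the example are placeholders chosen for illustration only. They are not actual Biomedical Research and Development Price Index figures, and this is not the authors' R code.

```python
# Activity codes listed in the text as defining Research Project Grants (RPGs).
RPG_ACTIVITY_CODES = frozenset("""
DP1 DP2 DP3 DP4 DP5 P01 PN1 PM1 R00 R01 R03 R15 R21 R22 R23 R29 R33 R34 R35
R36 R37 R61 R50 R55 R56 RC1 RC2 RC3 RC4 RF1 RL1 RL2 RL9 RM1 UA5 UC1 UC2 UC3
UC4 UC7 UF1 UG3 UH2 UH3 UH5 UM1 UM2 U01 U19 U34
""".split())

def is_rpg(activity_code):
    """True if the activity code is in the RPG definition used in the text."""
    return activity_code.upper() in RPG_ACTIVITY_CODES

def to_reference_dollars(nominal, index_award_year, index_reference_year):
    """Standard price-index adjustment: rescale nominal dollars to reference-year dollars."""
    return nominal * index_reference_year / index_award_year

# Placeholder index values for illustration only; NOT actual BRDPI figures.
HYPOTHETICAL_INDEX = {1995: 50.0, 2019: 100.0}

award_1995 = 250_000.0  # hypothetical nominal award amount
adjusted = to_reference_dollars(
    award_1995, HYPOTHETICAL_INDEX[1995], HYPOTHETICAL_INDEX[2019]
)
print(is_rpg("R01"), is_rpg("U54"), adjusted)
```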

We measured inequality by three approaches: proportion of funds going to the top 1%, or centile (Saez and Zucman, 2020); standard deviation of the log of funding (Hoffmann et al., 2020), a measure that accounts for the well-documented skewness in funding and that is particularly sensitive to low and intermediate levels of funding; and the Theil T index (Conceição and Ferreira, 2000), a measure that is more sensitive to higher levels of funding and that can be exploited to explore contributions of different groups to overall inequality.

For individual level data (say of individual PIs), the Theil Index (T) of funding inequality is mathematically represented as:

(1) $T = \sum_{p=1}^{n} \left( \frac{1}{n} \cdot \frac{y_p}{\mu_y} \cdot \ln \frac{y_p}{\mu_y} \right)$

where $n$ is the number of individual PIs, $y_p$ is the funding of PI $p$, and $\mu_y$ is the population mean funding. The final logarithmic term takes on a value greater than 0 if the individual investigator $p$’s funding is greater than the population mean $\mu_y$ and less than 0 if the individual investigator’s funding is less than the population mean. We can think of the three terms as: $\frac{1}{n}$ as the investigator’s proportion of the population; $\frac{y_p}{\mu_y}$ as the magnitude of deviance compared to the overall population; and $\ln \frac{y_p}{\mu_y}$ as the direction of deviance.
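Equation 1 translates directly into a few lines of code. The sketch below is illustrative only (the function name and toy values are hypothetical, not the authors' R implementation) and assumes every funding value is strictly positive, since the logarithm is undefined at zero.

```python
import math

def theil_t(funding):
    """Theil T index of a list of strictly positive funding values (Equation 1)."""
    n = len(funding)
    mean = sum(funding) / n
    return sum((1 / n) * (y / mean) * math.log(y / mean) for y in funding)

# Perfect equality gives an index of zero.
print(theil_t([1.0, 1.0, 1.0, 1.0]))

# Concentrating the same total in fewer hands raises the index.
print(theil_t([0.5, 0.5, 0.5, 2.5]))
```

Investigators funded above the mean contribute positive terms and those below the mean contribute negative terms; because above-mean terms carry the larger weight $y_p / \mu_y$, the sum is never negative.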

For grouped data (e.g. data grouped by career stage or gender or other characteristics), we can present the Theil Index T as a weighted average of inequality within each group plus inequality between those groups. That is:

(2) $T = T_g + T_g^{w}$

where $T_g$ is the between-group component and $T_g^{w}$ is the within-group component.

The between-group component of the Theil Index (Tg) is mathematically represented in a form similar to the overall Theil Index (Equation 1), namely:

(3) $T_g = \sum_{i=1}^{m} \left( \frac{p_i}{P} \cdot \frac{y_i}{\mu} \cdot \ln \frac{y_i}{\mu} \right)$

where $i$ indexes the $m$ groups (e.g. early, middle, and late career investigators), $p_i$ is the number of individuals in group $i$, $P$ is the total population, $y_i$ is the average funding of group $i$, and $\mu$ is the average funding across the entire population. The expression within the parentheses is called the ‘Theil element,’ which is positive (or negative) if the group’s average funding is above (or below) the population average and zero if the averages are equal. The Theil elements represent the contribution of each group to total inequality between the groups.
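A minimal sketch of the decomposition in Equations 2 and 3 follows. The text does not spell out the within-group component; the code assumes the standard Theil T decomposition, in which each group's own Theil index is weighted by that group's share of total funding. Function names, group labels, and toy values are hypothetical.

```python
import math
from collections import defaultdict

def theil_t(values):
    """Theil T index of strictly positive values (Equation 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((1 / n) * (y / mean) * math.log(y / mean) for y in values)

def theil_decomposition(funding, groups):
    """Return (between, within, elements) for funding values labelled by group."""
    total_n = len(funding)
    mu = sum(funding) / total_n
    by_group = defaultdict(list)
    for y, g in zip(funding, groups):
        by_group[g].append(y)

    elements = {}
    within = 0.0
    for g, ys in by_group.items():
        pop_share = len(ys) / total_n          # p_i / P
        rel_mean = (sum(ys) / len(ys)) / mu    # y_i / mu
        # Theil element for group g (the summand of Equation 3).
        elements[g] = pop_share * rel_mean * math.log(rel_mean)
        # Assumed standard weighting: funding share times within-group Theil index.
        within += pop_share * rel_mean * theil_t(ys)
    between = sum(elements.values())
    return between, within, elements

toy_funding = [1.0, 3.0, 2.0, 6.0]
toy_groups = ["a", "a", "b", "b"]
between, within, elements = theil_decomposition(toy_funding, toy_groups)
print(between, within, elements)
print(between + within, theil_t(toy_funding))  # the two totals should agree
```

In the toy data, group "b" has above-average funding and so contributes a positive element, while group "a" contributes a negative one; the within-group component is nonetheless the larger share of the total, mirroring the pattern reported for investigators.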

Unlike other measures of inequality (e.g. proportion of funding going to the top centile or standard deviation of log funding), the Theil Index is not intuitive. However, it can be used to parse group data, allowing us to parse inequality into within-group and between-group components.

Data availability

Source data have been provided in R format. R markdown source code corresponds with all numbers, tables, and figures.

References

  1. Conceição P, Ferreira P (2000) The Young Person’s Guide to the Theil Index: Suggesting Intuitive Interpretations and Exploring Analytical Applications.
  2. Lauer MS (2017a) Implementing Limits on Grant Support to Strengthen the Biomedical Research Workforce – NIH Extramural Nexus.
  3. ORWH (2021) Career Development & Interprofessional Education Office of Research on Women’s Health.

Decision letter

  1. Carlos Isales
    Reviewing Editor; Medical College of Georgia at Augusta University, United States
  2. Mone Zaidi
    Senior Editor; Icahn School of Medicine at Mount Sinai, United States
  3. Mark Peifer
    Reviewer; University of North Carolina at Chapel Hill, United States

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

This article by Lauer et al. provides new insights into the important problem of inequalities in grant distribution in awards from the NIH. The article highlights that differences in funding rates within a group in the areas of race and gender persist, despite active efforts over the last several years to close these gaps. It is critical to address these inequalities if there is to be continued growth in the pace of scientific discovery.

Decision letter after peer review:

Thank you for submitting your article "Inequalities in the Distribution of National Institutes of Health Research Project Grant Funding" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Senior Editor and Mone Zaidi as the Deputy Editor. The following individual involved in review of your submission has agreed to reveal their identity: Mark Peifer (Reviewer #1).

The reviewers have discussed their reviews with one another, and the Senior Editor has drafted this to help you prepare a revised submission.

Essential revisions:

1. Revised wording/clarifications to make sure conclusions better match the data.

a. Abstract. First, stating the trend toward reversing inequality has "reversed", while technically accurate, implies return to the start of the analysis. In fact, there has been only a modest reversal, and inequality remains very strong. This sentence should be more nuanced, as should the following statement about career stages. Second the statement about women is very confusing. They state "Women continue to constitute a greater proportion of funded principal investigators, though not at parity." When I first read this, I was thinking – no chance that women are "a greater proportion", i.e., a majority of funded PIs. It would be clearer to state: "The fraction of women among funded PIs continues to increase, but they are still not at parity".

b. Figure 1 should have the Y-axis go to zero, to make it clearer just how much funding is held by the top 1%.

c. P. 4. The authors should note that the age range of ALL NIH funded investigators is skewing older over time – this has been analyzed by others but is worth a mention.

d. The data in Figure 2 is very interesting, and deserves a more complete description, especially as it brings these analyses up to the current date.

e. P 10, Figure 3, the choice of data presentation and the way it was discussed significantly minimized the differences by career stage, race and gender. I imagine the majority of investigators in all groups are funded by a single grant at or near the modular level, and this is leveling differences – the mean/median difference supports this. I suspect that, and the authors have data to explore whether, the most highly funded individuals are very different between groups. What would the top decile look like for each group – their mean values suggest substantial differences? I'd like to see more in-depth analysis here. I think the authors have dug deeper, as they state "Women were younger, more likely to hold a PhD degree, and less likely to be principal investigators of more than the equivalent of 4 R01 grants.", but how one would assess this last factor from the data presented is not clear.

f. The analysis of different institutions was stunning (Figures 6 and 7), and deserves some mention in the abstract and some more text talking about the nuances. I also wondered if the authors have analyzed by region of the country, as this is also an area that has attracted attention (e.g. analyses of Walls)?

2. Additional analyses

a. The analysis in Figure 1 was exceptionally interesting. A similar comparison of the top 10% to the bottom 50% is worth adding. Also, in discussing the "bottom 50%", they should begin by noting the fraction of NIH-funded PIs who have a single grant (my memory suggests ~75%) and that this combined with the modular budget means most in the bottom half are quite similar in funding levels.

3. A clearer and more extensive description of some of the methods used

a. Figure 1. The authors should provide a clearer explanation of the nature of the data in panels B and D, and how to interpret it. This is true when they use the same measures later.

b. Table 1. Make clear that for most measures the parenthetical values are percentages. Then explain what the parenthetical values represent in Age and Funding – are they ranges? Define IQR.

c. Table 2. Same Adjustments as in Table 1. Additionally, are the funding levels in some sort of inflation adjusted dollars – otherwise they seem very odd.

4. The fact that there is much greater inequality within groups than between groups is interesting and some explicit discussion of what this means would be helpful. It seems that researchers receiving more funding at later career stages may be reasonable as they on average (with speculation) may have larger networks and know the research questions that could be asked better. Thus, some inequality here may be considered acceptable. But there is not funding parity for gender or race for researchers in the workforce, which are unacceptable inequalities, although these may not be as striking as the composition of the workforce and distribution of funds to the most funded investigators. How do you interpret these results collectively – which inequalities require the highest priority to address?

5. Page 13: Regarding the graphs of income inequality overall in the US and Europe: this context was appreciated, but additional interpretation in the discussion would be useful. Funding doesn't accumulate exactly like wealth can from investments, so why do you think inequalities in income and research funding are so closely associated? Could it be a coincidence?

6. Per the authors' statement, data and code should be publicly deposited upon acceptance or earlier.

https://doi.org/10.7554/eLife.71712.sa1

Author response

Essential revisions:

1. Revised wording/clarifications to make sure conclusions better match the data.

a. Abstract. First, stating the trend toward reversing inequality has "reversed", while technically accurate, implies return to the start of the analysis. In fact, there has been only a modest reversal, and inequality remains very strong. This sentence should be more nuanced, as should the following statement about career stages. Second, the statement about women is very confusing. They state "Women continue to constitute a greater proportion of funded principal investigators, though not at parity." When I first read this, I was thinking – no chance that women are "a greater proportion", i.e., a majority of funded PIs. It would be clearer to state: "The fraction of women among funded PIs continues to increase, but they are still not at parity".

We agree with the reviewer. We have revised the abstract accordingly.

b. Figure 1 should have the Y-axis go to zero, to make it clearer just how much funding is held by the top 1%.

We respectfully disagree with the reviewer and point to the renowned graphics expert Edward Tufte. Professor Tufte wrote, “In general, in a time-series, use a baseline that shows the data not the zero point. If the zero point reasonably occurs in plotting the data, fine. But don't spend a lot of empty vertical space trying to reach down to the zero point at the cost of hiding what is going on in the data line itself. (The book, How to Lie with Statistics, is wrong on this point.) For examples, all over the place, of absent zero points in time-series, take a look at any major scientific research publication. The scientists want to show their data, not zero. The urge to contextualize the data is a good one, but context does not come from empty vertical space reaching down to zero, a number which does not even occur in a good many data sets. Instead, for context, show more data horizontally!” (See https://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=00003q).

Nonetheless, we did revise Figure 1, Panel A according to the reviewer’s recommendation.

c. P. 4. The authors should note that the age range of ALL NIH funded investigators is skewing older over time – this has been analyzed by others but is worth a mention.

We agree with the reviewer and included text accordingly (lines 86-88).

d. The data in Figure 2 is very interesting, and deserves a more complete description, especially as it brings these analyses up to the current date.

We agree with the reviewer and have added text (lines 95-100).

e. P 10, Figure 3, the choice of data presentation and the way it was discussed significantly minimized the differences by career stage, race and gender. I imagine the majority of investigators in all groups are funded by a single grant at or near the modular level, and this is leveling differences – the mean/median difference supports this. I suspect that, and the authors have data to explore whether, the most highly funded individuals are very different between groups. What would the top decile look like for each group – their mean values suggest substantial differences? I'd like to see more in-depth analysis here. I think the authors have dug deeper, as they state "Women were younger, more likely to hold a PhD degree, and less likely to be principal investigators of more than the equivalent of 4 R01 grants.", but how one would assess this last factor from the data presented is not clear.

We agree with the reviewer and have added analyses of the number of RPGs per investigator (see Tables 1-5, lines 88-90 and 105-111).

f. The analysis of different institutions was stunning (Figures 6 and 7), and deserves some mention in the abstract and some more text talking about the nuances. I also wondered if the authors have analyzed by region of the country, as this is also an area that has attracted attention (e.g. analyses of Walls)?

We agree with the reviewer and added new Figures 8 (for region) and 9 (for states).

2. Additional analyses

a. The analysis in Figure 1 was exceptionally interesting. A similar comparison of the top 10% to the bottom 50% is worth adding. Also, in discussing the "bottom 50%", they should begin by noting the fraction of NIH-funded PIs who have a single grant (my memory suggests ~75%) and that this combined with the modular budget means most in the bottom half are quite similar in funding levels.

We agree with the reviewer. We added data on the top decile in Figure 1, Panel A, and on the number of grants per investigator in Tables 1-5.
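As an illustration of the comparison discussed above, the share of total funding held by the top decile and by the bottom half of investigators can be sketched as follows. This is a minimal sketch and not the authors' code: the function name `share_of_total` and the funding values are hypothetical, and the amounts are arbitrary units rather than NIH data.

```python
def share_of_total(funding, lower_pct, upper_pct):
    """Fraction of total funding held by PIs ranked between two percentiles.

    Percentiles are measured from the bottom of the distribution, so
    (90, 100) is the top decile and (0, 50) is the bottom half.
    """
    ranked = sorted(funding)
    n = len(ranked)
    lo = round(n * lower_pct / 100)
    hi = round(n * upper_pct / 100)
    return sum(ranked[lo:hi]) / sum(ranked)


# Hypothetical funding amounts (arbitrary units), not NIH data.
funding = [1, 1, 1, 1, 1, 1, 2, 2, 3, 7]
top_decile = share_of_total(funding, 90, 100)
bottom_half = share_of_total(funding, 0, 50)
```

In this toy distribution most investigators hold similar amounts, which mirrors the reviewer's point that investigators in the bottom half tend to be close to one another in funding level.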

3. A clearer and more extensive description of some of the methods used

a. Figure 1. The authors should provide a clearer explanation of the nature of the data in panels B and D, and how to interpret it. This is true when they use the same measures later.

We agree with the reviewer. We have added text (lines 58-61, 72-73).

b. Table 1. Make clear that for most measures the parenthetical values are percentages. Then explain what the parenthetical values represent in Age and Funding – are they ranges? Define IQR.

We agree with the reviewer and made these changes.

c. Table 2. Same Adjustments as in Table 1. Additionally, are the funding levels in some sort of inflation adjusted dollars – otherwise they seem very odd.

We agree with the reviewer and made these changes. We note that dollar figures are inflation-adjusted. (See also lines 187-188).

4. The fact that there is much greater inequality within groups than between groups is interesting and some explicit discussion of what this means would be helpful. It seems that researchers receiving more funding at later career stages may be reasonable as they on average (with speculation) may have larger networks and know the research questions that could be asked better. Thus, some inequality here may be considered acceptable. But there is not funding parity for gender or race for researchers in the workforce, which are unacceptable inequalities, although these may not be as striking as the composition of the workforce and distribution of funds to the most funded investigators. How do you interpret these results collectively – which inequalities require the highest priority to address?

We agree with the reviewer. Without wading into policy pronouncements (which would be inappropriate for this venue), we add some discussion (lines 171-177).
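For readers unfamiliar with how inequality can be split into within-group and between-group parts, the sketch below uses the Theil T index, one commonly used decomposable measure. It is an assumption that this matches the measure used in the article; the function names, group labels, and values are hypothetical, and all values must be positive.

```python
import math


def theil_t(values):
    """Theil T index of a list of positive values."""
    mu = sum(values) / len(values)
    return sum((x / mu) * math.log(x / mu) for x in values) / len(values)


def theil_decomposition(groups):
    """Split the overall Theil T index into (within, between) components.

    groups maps a group label to a list of positive funding values.
    """
    all_values = [x for vals in groups.values() for x in vals]
    n = len(all_values)
    mu = sum(all_values) / n
    within = 0.0
    between = 0.0
    for vals in groups.values():
        mu_g = sum(vals) / len(vals)
        # Weight equals the group's share of total funding.
        weight = (len(vals) / n) * (mu_g / mu)
        within += weight * theil_t(vals)
        between += weight * math.log(mu_g / mu)
    return within, between


# Hypothetical groups and values (arbitrary units), not NIH data.
groups = {"group_a": [1, 1, 2, 8], "group_b": [1, 2, 3, 10]}
within, between = theil_decomposition(groups)
overall = theil_t([x for vals in groups.values() for x in vals])
```

The two components sum to the overall index, so a within-group component that dominates the total indicates that most inequality arises among members of the same group rather than from differences in group averages.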

5. Page 13: Regarding the graphs of income inequality overall in the US and Europe: this context was appreciated, but additional interpretation in the discussion would be useful. Funding doesn't accumulate exactly like wealth can from investments, so why do you think inequalities in income and research funding are so closely associated? Could it be a coincidence?

We agree with the reviewer and added appropriate text (lines 142-144).

6. Per the authors' statement, data and code should be publicly deposited upon acceptance or earlier.

We agree with the reviewer. We will deposit de-identified data and code upon acceptance.

https://doi.org/10.7554/eLife.71712.sa2

Article and author information

Author details

  1. Michael S Lauer

    National Institutes of Health, Office of the Director, Bethesda, United States
    Contribution
    Conceptualization, Formal analysis, Supervision, Methodology, Writing - original draft
    For correspondence
    Michael.Lauer@nih.gov
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-9217-8177
  2. Deepshikha Roychowdhury

    NIH Office of Extramural Research, Bethesda, United States
    Contribution
    Conceptualization, Data curation, Formal analysis, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared

Funding

National Institutes of Health

  • Michael S Lauer
  • Deepshikha Roychowdhury

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Senior Editor

  1. Mone Zaidi, Icahn School of Medicine at Mount Sinai, United States

Reviewing Editor

  1. Carlos Isales, Medical College of Georgia at Augusta University, United States

Reviewer

  1. Mark Peifer, University of North Carolina at Chapel Hill, United States

Publication history

  1. Preprint posted: June 24, 2021
  2. Received: June 28, 2021
  3. Accepted: August 30, 2021
  4. Accepted Manuscript published: September 3, 2021 (version 1)
  5. Version of Record published: September 17, 2021 (version 2)

Copyright

This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

Metrics

  • Page views: 2,589
  • Downloads: 146
  • Citations: 0

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
