A summary of the four independent datasets used in the study showing: country, time period of the data, number of disciplines, research evaluation method, number of individuals or applicants by gender, number of individual data points, output variable and mean expected output for each gender across the whole dataset.

Maximal candidate model and best fit model to predict Score for PBRF and logit(P(Success)) for each dataset. Wilkinson notation is used to indicate interaction terms.
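
As a reading aid for the notation in this caption, the sketch below expands a Wilkinson crossing term into its main effects and interactions and defines the logit transform used as the response for the funding datasets. The variable names `gender` and `prop_men` are hypothetical placeholders, not the names used in the fitted models, and the helper `expand_crossing` is purely illustrative.

```python
import math
from itertools import combinations


def expand_crossing(term):
    """Expand a Wilkinson crossing such as 'a * b' into 'a', 'b', 'a:b'."""
    factors = [f.strip() for f in term.split("*")]
    expanded = []
    # order 1 gives main effects, order 2 gives two-way interactions, and so on
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            expanded.append(":".join(combo))
    return expanded


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical predictor names, chosen only for illustration.
terms = expand_crossing("gender * prop_men")
print(" + ".join(terms))
```

In this notation `a * b` is shorthand for both main effects plus their interaction `a:b`, so a model written with `*` always contains the lower-order terms as well.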

Male-dominated disciplines have higher expected research scores than female-dominated disciplines in all three PBRF assessments.

Points – the mean raw score of individuals in each of the disciplines (full raw data cannot be shown for privacy reasons), against the proportion of men in the discipline (blue – men, red – women). Lines – expected score from individual-level analysis (N = 4135, 6586, 7467 respectively) adjusting for age, institute, gender, and proportion of men in the discipline, and shown for a 50-year-old man (blue) and woman (red) at one university in Aotearoa NZ.

Research score is strongly correlated with the number of research outputs, which is in turn correlated with the gender balance of the discipline, but even after adjusting for a typical number of research outputs, female-dominated disciplines have lower scores.

PBRF data using individuals at UC (N = 384). A) Score against number of research outputs over the period of the PBRF assessment. Trend line shows the expected score for a 50-year-old. Error bars show the median and interquartile range for men and women for score (vertical) and number of outputs (horizontal). B) Mean number of outputs by gender for each discipline against gender balance of the discipline. Trend line from individual-level analysis (N = 384); gender was not significant. C) Expected score of a 50-year-old man and woman with the expected number of outputs for a discipline of that gender balance (25%, 50%, 75% men respectively).

Researchers in male-dominated disciplines have a higher chance of funding success.

A) ARC (2010-2019): Points – funding success rates by gender in 20 disciplines over 10 years against proportion of men in the discipline in 2018. Lines – expected success rates of men and women in 2010 (solid lines) and 2019 (dashed lines). B) CIHR (2011-2013), C) CIHR (2014-2016): Points – funding success rates by gender for each discipline against proportion of men in the discipline (estimated from application numbers); error bars are binomial 95% CIs. Lines – expected success rate from combined analysis of all grant types in both time periods. D) EIGE (2020): Points – funding success rates by gender in 8 disciplines and 27 countries against proportion of men in the discipline in that country. Lines – expected success rates of men and women.
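
For readers who want to reproduce error bars like those in panels B and C, the following is a minimal sketch of a binomial 95% confidence interval for a success rate. The caption does not state which interval was used; the Wilson score interval here is an assumption, and the counts in the example are made up for illustration rather than taken from the CIHR data.

```python
import math


def wilson_interval(successes, trials, z=1.959964):
    """Wilson score interval for a binomial proportion.

    z defaults to the standard normal quantile for a two-sided 95% interval.
    The choice of the Wilson method is an assumption, not the source's stated method.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1.0 + z * z / trials
    centre = (p_hat + z * z / (2.0 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1.0 - p_hat) / trials + z * z / (4.0 * trials * trials))
    ) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot at the boundaries.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical counts: 20 funded applications out of 100 submitted.
low, high = wilson_interval(20, 100)
print(f"success rate 0.20, 95% CI [{low:.3f}, {high:.3f}]")
```

The Wilson interval stays within 0 and 1 and behaves sensibly for small disciplines with few applications, which is why it is used in this sketch instead of the simpler normal approximation.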