Workshop Intervention construct

Exploration of the perceived barriers and benefits of inclusion of males and females in in vivo research from study 1.

Survey data were collected in study 1 from 39 participants in the baseline group, 51 in the interested group and 15 in the intervention group. A: Barrier comparison between treatment groups. B: Benefit comparison between treatment groups. C: Proportion of times each barrier was selected in the general population (baseline and interested groups combined). D: Proportion of times each benefit was selected in the general population (baseline and interested groups combined). A Pearson’s chi-squared test was used to compare proportions between the treatment groups. Statistical significance is highlighted with a horizontal bar and flagged with one star (*) if the p-value is less than 0.05, two stars (**) if less than 0.01, and three stars (***) if less than 0.001.
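As an illustration of the comparison and flagging convention described in this legend, the following is a minimal sketch of a Pearson's chi-squared test on a table of counts, followed by the star flag. It is not the authors' analysis code. The function names and the counts in the example are hypothetical; only the group sizes (39, 51 and 15) come from the legend, and testing all three groups in one table is an assumption. The p-value uses closed forms that hold only for 1 or 2 degrees of freedom.

```python
import math


def pearson_chi_squared(table):
    """Return (statistic, dof, p_value) for a table of observed counts.

    `table` is a list of rows. Every row and column total must be
    non-zero, otherwise an expected count of zero causes a division error.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    # Closed-form upper tail of the chi-squared distribution.
    if dof == 1:
        p_value = math.erfc(math.sqrt(statistic / 2))
    elif dof == 2:
        p_value = math.exp(-statistic / 2)
    else:
        raise NotImplementedError("sketch covers 1 or 2 degrees of freedom")
    return statistic, dof, p_value


def significance_stars(p_value):
    """Map a p-value to the star flag used in the legends."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# Hypothetical counts for one barrier: rows are selected / not selected,
# columns are baseline (N=39), interested (N=51), intervention (N=15).
example_table = [[20, 25, 3], [19, 26, 12]]
statistic, dof, p_value = pearson_chi_squared(example_table)
print(dof, round(statistic, 3), significance_stars(p_value))
```

With more than three groups or more than two response categories, the tail probability would need a general chi-squared distribution function from a statistics library.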

Exploration of the perceived barriers and benefits on the topic of inclusion of females and males in in vivo research for study 2.

Data were collected in study 2 from 29 participants who completed the barrier question in the pre-survey and 28 in the post-survey. A: Barriers associated with misconceptions that the intervention was looking to address. B: Other barriers. C: Pre- versus post-comparison of the benefits of inclusion. D: Comparison of the benefits selected in the study 1 general population (baseline plus interested groups) versus the study 2 pre-testing group. E: Comparison of the barriers selected in the study 1 general population (baseline plus interested groups) versus the study 2 pre-testing group. McNemar’s test of association was used to compare the proportions between the pre- and post-intervention data in study 2. A chi-squared test was used to compare proportions between the two studies. Statistical significance is highlighted with a horizontal bar and flagged with one star (*) if the p-value is less than 0.05, two stars (**) if less than 0.01, and three stars (***) if less than 0.001.
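The paired pre/post comparison can be sketched as follows. This is a minimal illustration of McNemar's test in its exact (binomial) form, which depends only on the participants whose answer changed between surveys. Whether the authors used the exact or the chi-squared variant is not stated, so the exact form is an assumption, and the function name and counts are hypothetical.

```python
from math import comb


def mcnemar_exact(pre_only, post_only):
    """Two-sided exact McNemar p-value from the two discordant counts.

    pre_only:  participants who selected the item before but not after.
    post_only: participants who selected the item after but not before.
    Participants with the same answer in both surveys do not contribute.
    """
    n = pre_only + post_only
    if n == 0:
        return 1.0  # no one changed their answer, so no evidence of a shift
    k = min(pre_only, post_only)
    # Under the null hypothesis each change is equally likely in either
    # direction, so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical example: 9 participants dropped a barrier after the
# intervention and 1 participant newly selected it.
p_value = mcnemar_exact(pre_only=9, post_only=1)
print(p_value)
```

Because the test is paired, only participants with both a pre- and a post-survey response can be included.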

Exploration of intent for significant and critical predictor variables for study 1 data.

A full model, with all potential predictors and demographics, was fitted to explore the variation in intent and assess for evidence of predictive behaviour. The baseline group was set as the reference group. If main effects were significant, the variation was explored with Tukey post hoc testing. For graphs A, B and C, the grey area indicates the 95% confidence interval for the fitted linear relationship (blue). D: Data are presented as the model-estimated mean (least squares mean) with a standard error bar estimated from the model. The horizontal bar represents the planned comparison, with * representing statistical significance (p < 0.05). E: Violin plot showing the distribution of intent as a function of the ability to influence the design. The red box indicates the mean.

Statistical model output for the full model for study 1 data, exploring the predictors’ ability to explain variation in average intent.

Beh_control represents the average behavioural control score, Soc_norm the average social norm score, Year_Work the number of years the participants have worked in animal research, Type_Work the type of research conducted by the participant, Education the highest level of education obtained, Stats_Training the level of statistical training received, Factorial_Fam how familiar the participants were with factorial experimental design, Factorial_Incor how often the participants incorporated males and females into their experiments while studying an intervention, attitude the average attitude score, and Ability_Influence how often the participants were involved in or could influence the planning of experiments involving animals. Nparm stands for the number of parameters, Df represents the degrees of freedom and Prob > F represents the p-value associated with the F ratio. Statistical significance is shown as * for p < 0.05, ** for p < 0.01 and *** for p < 0.001.

Exploration of intent for significant and critical predictor variables for study 2 data.

A full model, with all potential predictors and demographics, was fitted to explore the variation in intent and assess for evidence of predictive behaviour. For graphs A, B, C and E, the grey area indicates the 95% confidence interval for the fitted linear relationship (blue). Graph D displays the density and the individual participant pre/post Box-Cox transformed intent values.

Statistical model output for the full model for study 2 data, exploring the predictors’ ability to explain variation in average intent.

Beh_control represents the average behavioural control score, Soc_norm the average social norm score, Year_Work the number of years the participants have worked in animal research, Type_Work the type of research conducted by the participant, Education the highest level of education obtained, Stats_Training the level of statistical training received, Factorial_Fam how familiar the participants were with factorial experimental design, Factorial_Incor how often the participants incorporated males and females into their experiments while studying an intervention, attitude the average attitude score, and Ability_Influence how often the participants were involved in or could influence the planning of experiments involving animals. Nparm stands for the number of parameters, Df represents the degrees of freedom and Prob > F represents the p-value associated with the F ratio. Statistical significance is flagged with one star (*) if the p-value is less than 0.05, two stars (**) if less than 0.01, and three stars (***) if less than 0.001.

Exploration of intervention impact on the proportion of correctly answered knowledge questions.

A: Study 1: Impact of treatment group on the cumulative knowledge score. Statistical significance was assessed with a Poisson regression (baseline: N=39, interested: N=51 and intervention: N=15). B: Study 1: Proportion of correct answers for each question (baseline: N=39, interested: N=51 and intervention: N=15). Statistical significance was assessed with a Pearson’s chi-squared test. C: Study 2: Impact of intervention on the cumulative knowledge score. Statistical significance was assessed with a paired t-test (N=26 with pre and post data available). D: Study 2: Proportion of correct answers for each question (pre: N=29, post: N=28). Statistical significance was assessed with McNemar’s test of association. Statistical significance is shown as * for p < 0.05, ** for p < 0.01 and *** for p < 0.001.

Summary of demographic and potential predictors between groups for all survey 1 contributors who met the inclusion criteria.

The demographic information included is for the full 105 participants who met the inclusion criteria. Of these 105 participants, 7 left the question about age blank. As age was included as a predictor in the analysis of intent, the missing values were managed with listwise deletion, assuming missing at random, which reduced the dataset to 98 participants (N=35 baseline, N=48 interested and N=15 intervention). To test for a statistically significant difference (association) between the treatment groups, a Pearson’s chi-squared test was used for categorical variables, ordered logistic regression for ordinal variables and ANOVA for continuous variables. Institute type was collected as a demographic; however, the resulting sample was predominantly academic, so this variable was removed from downstream analysis because its effect on the outcome of interest could not be assessed. The abbreviated name in brackets, within the demographic column, indicates the term used within the statistical model and associated output.
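Listwise deletion, as described in this legend, drops every participant who is missing a value for any variable used in the model. A minimal sketch follows; the record layout, field names and values are hypothetical and are not taken from the study data.

```python
def listwise_delete(records, required_fields):
    """Keep only records with a non-missing value for every required field.

    Missing values are represented by None in this sketch.
    """
    return [
        record
        for record in records
        if all(record.get(field) is not None for field in required_fields)
    ]


# Hypothetical survey records: the second participant left age blank.
survey_records = [
    {"id": "P01", "group": "baseline", "age": 34, "intent": 5.5},
    {"id": "P02", "group": "interested", "age": None, "intent": 6.0},
    {"id": "P03", "group": "intervention", "age": 41, "intent": 6.5},
]

complete_cases = listwise_delete(survey_records, ["age", "intent"])
print([record["id"] for record in complete_cases])  # prints ['P01', 'P03']
```

The complete-case estimates are unbiased only if the missingness assumption stated in the legend holds.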

Summary of demographic and potential predictors for all participants who met the inclusion criteria for Study 2.

The demographic information is for the full 31 participants who met the inclusion criteria in study 2. While 31 unique individuals met the criteria, an individual may not have responded to both surveys (N=29 in the pre-survey and N=28 in the post-survey). For instance, 2 participants did not meet the inclusion criteria for the pre-survey but met the criteria for the post-intervention survey. For the intention analysis, some demographic data were missing. To reduce survey burden, the post-survey included only one demographic question (the participant’s age), which supported alignment of the data in case duplicate initials were used as an identifier. Two respondents did not provide their age, and these records were managed with listwise deletion, assuming missing at random. The abbreviated name in brackets, within the demographic column, indicates the term used within the statistical model and associated output. For McNemar’s paired analysis we conducted listwise deletion, assuming missing at random, which reduced the dataset to 26 participants.

Cumulative knowledge score for study 1.

Pre- and post-cumulative knowledge scores for study 2.