Exploration of the perceived barriers and benefits of inclusion of males and females in in vivo research from study 1.

Evaluation of survey data collected in study 1, with 39 participants in the baseline group, 51 in the interested group and 15 in the intervention group. A: Percentage of study participants that selected each barrier, displayed by treatment group. B: Percentage of study participants that selected each benefit, displayed by treatment group. C: Percentage of study participants that selected each barrier in the general population (baseline and interested groups combined). D: Percentage of study participants that selected each benefit in the general population (baseline and interested groups combined). A Pearson’s chi-squared test was used to compare the proportion of participants selecting each barrier/benefit between the treatment groups. Statistical significance is highlighted with a horizontal bar and flagged with one star (*) if the p-value is less than 0.05, two stars (**) if less than 0.01, and three stars (***) if less than 0.001.
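As an illustration only, the star convention described in this caption can be written as a small helper. The function name `significance_stars` is hypothetical and is not part of the authors' analysis code.

```python
def significance_stars(p_value: float) -> str:
    """Map a p-value to the star flags used in the figure captions.

    Thresholds follow the caption text: * for p < 0.05,
    ** for p < 0.01 and *** for p < 0.001.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""  # not significant at the 0.05 level: no flag
```

The thresholds are strict inequalities, so a p-value of exactly 0.05 receives no flag.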

Exploration of the perceived barriers and benefits on the topic of inclusion of females and males in in vivo research for study 2.

Evaluation of data collected from study 2, with 29 participants completing the barrier question pre-intervention and 28 post-intervention. A: Percentage of study participants that selected each barrier, pre- and post-intervention, for the barriers targeted by the intervention. B: Percentage of study participants that selected all other barriers, displayed pre- and post-intervention. C: Percentage of study participants that selected each benefit, displayed pre- and post-intervention. D: Comparison of the percentage of study participants selecting benefits in the study 1 general population (baseline plus interested groups) versus the study 2 pre-intervention group. E: Comparison of the percentage of study participants selecting barriers in the study 1 general population (baseline plus interested groups) versus the study 2 pre-intervention group. A McNemar’s test of association was used to compare the proportions between the pre- and post-intervention data in study 2. A chi-squared test was used to compare proportions between the two studies. Statistical significance is highlighted with a horizontal bar and flagged with one star (*) if the p-value is less than 0.05, two stars (**) if less than 0.01, and three stars (***) if less than 0.001.

Exploration of intent for significant and critical predictor variables for study 1 data

A full model, with all potential predictors and demographics, was fitted to explore the variation in intent and assess for evidence of predictive behaviour. The baseline group was set as the reference group. If main effects were significant, the variation by treatment group was explored with Tukey post hoc testing. A: Relationship between intention and attitude. B: Relationship between intention and behavioural control. C: Relationship between intention and social norm. For panels A, B and C, the grey area indicates the 95% confidence interval for the fitted linear relationship (blue) and the text indicates the statistical significance of the relationship. D: Model-estimated means (least squares means) for each treatment group, with a standard error bar estimated from the model. Vertical bars represent the planned comparison between groups, with * representing statistical significance (p < 0.05). E: Violin plot showing the distribution of intent as a function of the ability to influence the design. Points indicate individual study participants, and the red box indicates the calculated mean for each group. The text indicates the statistical assessment for that variable.

Workshop Intervention construct

Statistical model output for the full model for study 1 data, exploring the predictors’ ability to explain variation in average intent.

Where Beh_control represents the average behavioural control score, Soc_norm the average social norm score, Year_Work the number of years the participants have worked in animal research, Type_Work the type of research conducted by the participant, Education the highest level of education obtained, Stats_Training the level of statistical training received, Factorial_Fam how familiar the participants were with factorial experimental design, Factorial_Incor how often the participants incorporated males and females into their experiments while studying an intervention, attitude the average attitude score, and Ability_Influence how often the participants were involved in or could influence the planning of experiments involving animals. Nparm stands for the number of parameters, Df represents the degrees of freedom and Prob > F represents the p-value associated with the F ratio. Statistical significance is shown as * for p < 0.05, ** for p < 0.01 and *** for p < 0.001.

Exploration of intent for significant and critical predictor variables for study 2 data

A full model, with all potential predictors and demographics, was fitted to explore the variation in intent and assess for evidence of predictive behaviour. A: Relationship between intention and attitude. B: Relationship between intention and behavioural control. C: Relationship between intention and social norm. D: Exploration of intention pre- and post-intervention, where a line links each individual participant’s scores and the shaded area indicates the density of the Box-Cox transformed measure of intent. E: Relationship between intention and age. For panels A, B, C and E, the grey area indicates the 95% confidence interval for the fitted linear relationship (blue) for Box-Cox transformed intent (y-axis) against the key predictors in the TPB model (x-axis; A, B, C) and the significant predictor “age” (E). For panels A, B and C, the text indicates the statistical significance of the relationship.

Statistical model output for the full model for study 2 data, exploring the predictors’ ability to explain variation in average intent.

Where Beh_control represents the average behavioural control score, Soc_norm the average social norm score, Year_Work the number of years the participants have worked in animal research, Type_Work the type of research conducted by the participant, Education the highest level of education obtained, Stats_Training the level of statistical training received, Factorial_Fam how familiar the participants were with factorial experimental design, Factorial_Incor how often the participants incorporated males and females into their experiments while studying an intervention, attitude the average attitude score, and Ability_Influence how often the participants were involved in or could influence the planning of experiments involving animals. Nparm stands for the number of parameters, Df represents the degrees of freedom and Prob > F represents the p-value associated with the F ratio. Statistical significance is flagged with one star (*) if the p-value is less than 0.05, two stars (**) if less than 0.01, and three stars (***) if less than 0.001.

Exploration of intervention impact on the proportion of correctly answered knowledge questions.

A: Study 1: Cumulative knowledge score (cumulative questions answered correctly) displayed by treatment group (baseline: N=39, interested: N=51 and intervention: N=15). Statistical significance assessed with a Poisson regression. B: Study 1: Percentage of correct answers for each question, displayed by treatment group (baseline: N=39, interested: N=51 and intervention: N=15). Statistical significance assessed with a Pearson’s chi-squared test. C: Study 2: Impact of the intervention on the cumulative knowledge score. A line links each individual participant’s scores, and the shaded area indicates the density of the Box-Cox transformed measure of intent. Statistical significance assessed with a paired t-test (N=26 with pre and post data available). D: Study 2: Percentage of correct answers for each question, displayed pre- and post-intervention (pre: N=29, post: N=28). Statistical significance assessed with McNemar’s test of association. Statistical significance is shown as * for p < 0.05, ** for p < 0.01 and *** for p < 0.001.

Summary of demographic and potential predictors between groups for all survey 1 contributors who met the inclusion criteria.

The demographic information included is for the full 105 participants that met the inclusion criteria. Of these 105 participants, 7 left the question about age blank. As age was included as a predictor in the analysis of intent, the missing values were managed with listwise deletion, assuming missing at random, reducing the dataset size to 98 participants (N=35 baseline, N=48 interested and N=15 intervention). To test for a statistically significant difference (association) between the treatment groups, a Pearson’s chi-square test was used for categorical variables, ordered logistic regression for nominal variables and an ANOVA test for continuous variables. Institute type was collected as a demographic variable, but the resulting sample was predominantly academic, so this variable was removed from downstream analysis because its effect on the outcome of interest could not be assessed. The abbreviated name in brackets, within the demographic column, indicates the term used within the statistical model and associated output.

Summary of demographic and potential predictors for all participants who met the inclusion criteria for Study 2.

The demographic information is for the full 31 participants that met the inclusion criteria in study 2. While 31 unique individuals met the criteria, an individual may not have responded to both surveys (N=29 in the pre-survey and N=28 in the post-survey). For instance, 2 participants did not meet the inclusion criteria for the pre-survey but met the criteria for the post-intervention survey. For the intention analysis, some demographic data were missing. To reduce survey burden, the post-survey included only one demographic question (the participant’s age), to support alignment of the data in case duplicate initials were used as an identifier. Two respondents did not provide their age and were managed with listwise deletion, assuming missing at random. The abbreviated name in brackets, within the demographic column, indicates the term used within the statistical model and associated output. For McNemar’s paired analysis, we conducted listwise deletion, assuming missing at random, reducing the dataset size to 26.

Cumulative knowledge score for study 1.

Pre- and post-cumulative knowledge scores for study 2.

Pearson’s correlation coefficient analysis between continuous variables.

Where Year_Work represents the number of years the participants have worked in animal research, Education represents the highest level of education obtained, Type_Work represents the type of research conducted by the participant, Training represents the level of statistical training received, Factorial_Fam represents how familiar the participants were with factorial experimental design, Factorial_Incor represents how often the participants incorporated males and females into their experiments while studying an intervention, attitude represents the average attitude score, Beh_control represents the average behavioural control score, Soc_norm represents the average social norm score and Ability_Influence represents how often the participants were involved in or could influence the planning of experiments involving animals. Only variables that were used in the final statistical model were compared for correlations.
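For readers unfamiliar with the statistic, the following is a minimal sketch of Pearson’s correlation coefficient between two variables. The function name `pearson_r` and the argument names are hypothetical; this is not the authors’ analysis code, which was written in SAS.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of squares and cross-products of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)
```

Applied to every pair of model variables (for example, attitude against Soc_norm), this yields the entries of a correlation matrix.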

For the perceived barriers, the Pearson’s chi-square test of association between treatment groups in study 1

This survey question provided several pre-defined options and the ability to enter a free-text option. Participants were asked to choose all that applied. Exploration of the free text grouped the barriers into three additional categories: convention, logistic, and none or no barriers. To test for a statistically significant difference (association) between the treatment groups, a Pearson’s chi-square test was applied for all options where the total N > 10.

For the perceived benefits, the Pearson’s chi-square test of association between treatment groups in study 1

This question provided several pre-defined options and the ability to enter a free-text option. Participants were asked to choose all that applied. No free-text advantages were provided by survey takers for this question. To test for a statistically significant difference (association) between the treatment groups, a Pearson’s chi-square test of association was applied for all options where the total N > 10.
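As a hedged sketch of the test named above, the following computes a Pearson’s chi-square statistic for one option across the three treatment groups (a 3 x 2 table of selected versus not selected). The function name `chi_square_three_groups` is hypothetical, and the code assumes no continuity correction. With three groups and two outcomes there are 2 degrees of freedom, for which the chi-square upper-tail probability has the closed form exp(-x/2).

```python
import math


def chi_square_three_groups(selected, totals):
    """Pearson chi-square test for a 3 x 2 table (group x selected / not selected).

    selected: number of participants in each group who chose the option.
    totals:   number of participants in each group.
    Returns (statistic, p_value), with df = (3 - 1) * (2 - 1) = 2.
    """
    if len(selected) != 3 or len(totals) != 3:
        raise ValueError("this sketch handles exactly three groups")
    observed = [[s, n - s] for s, n in zip(selected, totals)]
    grand_total = sum(totals)
    column_totals = [sum(row[j] for row in observed) for j in range(2)]
    if 0 in column_totals:
        raise ValueError("test is undefined when every or no participant selected the option")
    statistic = 0.0
    for row, n in zip(observed, totals):
        for j in range(2):
            expected = n * column_totals[j] / grand_total
            statistic += (row[j] - expected) ** 2 / expected
    # For df = 2 the chi-square survival function is exactly exp(-x / 2).
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value
```

For the study 1 group sizes, the call would take the form `chi_square_three_groups(counts, [39, 51, 15])`, where `counts` holds the per-group selection counts from the survey data.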

For the perceived barriers, the McNemar’s test of association between treatment groups in study 2.

This question provided several pre-defined options and the ability to enter a free-text option. Participants were asked to choose all that applied. No free-text advantages were provided by survey takers for this question. To test for a statistically significant difference (association) between the pre- and post-intervention measures, a McNemar’s test was applied for all options where the total N > 10.

For the perceived benefits, the McNemar’s test of association between treatment groups in study 2.

This survey question provided several pre-defined options and the ability to enter a free-text option. Participants were asked to choose all that applied. No free-text responses were submitted. To test for a statistically significant difference (association) between the pre and post measures, a McNemar’s test was applied for all options where the total N > 10. This test accounts for the repeated-measures nature of the data, which required the listwise deletion of individuals with missing data.
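The paired logic can be sketched as follows, assuming an exact (binomial) two-sided McNemar’s test; the source does not state whether the exact or the chi-square form was used, so this choice is an assumption. The function name `mcnemar_exact` and the use of `None` for a missing response are hypothetical conventions for this illustration.

```python
import math


def mcnemar_exact(pre, post):
    """Exact two-sided McNemar test on paired yes/no responses.

    pre, post: sequences of True/False, with None marking a missing response.
    Pairs with a missing value are dropped first (listwise deletion).
    Returns (number_of_complete_pairs, p_value).
    """
    pairs = [(a, b) for a, b in zip(pre, post) if a is not None and b is not None]
    # Only discordant pairs carry information about a pre/post change.
    pre_only = sum(1 for a, b in pairs if a and not b)
    post_only = sum(1 for a, b in pairs if b and not a)
    n_discordant = pre_only + post_only
    if n_discordant == 0:
        return len(pairs), 1.0
    k = min(pre_only, post_only)
    # Under the null hypothesis, discordant pairs split as Binomial(n, 0.5).
    tail = sum(math.comb(n_discordant, i) for i in range(k + 1)) / 2 ** n_discordant
    return len(pairs), min(1.0, 2.0 * tail)
```

Pairs in which the participant gave the same answer before and after the intervention do not affect the p-value; they only count towards the number of complete pairs.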

For the knowledge questions, a Pearson’s chi-squared test to assess association between treatment groups of the proportion of participants who answered the question correctly in study 1.

For the knowledge questions, a McNemar’s test to assess association between pre and post answers for each knowledge question in study 2.

Intention data and SAS analysis code

The following provides the data and SAS code used to analyze the study 1 and study 2 intention data, to support transparency of the data and analysis. The information below can be copied and pasted directly into SAS.

Poisson regression analysis of cumulative knowledge score for study 1.