Impact of donor overlap between tissues in age-associated trends
A – Heatmap of the percentage of donors shared between pairs of tissues. Each square shows the percentage of all samples of the tissue on the x-axis (Tissue 1) that come from donors shared with the tissue on the y-axis (Tissue 2). B – Relative contributions of different sources to the dataset’s variance. (Left panel) Tissue accounts for approximately 90% of the total variance and donor for around 10%; age has a minimal impact (1%), likely due to the relative subtlety of its effects on gene expression and to the tissue specificity of ageing dynamics. (Right panel) Removing the donor variable does not transfer variance to age, suggesting limited confounding between the two variables. C – Impact of the relative proportion of common donors on gene expression correlation between tissue pairs. Sub-panels A, B, and C show the tissue pairs with the highest (Muscle Skeletal / Kidney Cortex), median (Pancreas / Heart Left Ventricle), and lowest (Small Intestine / Brain Amygdala) percentages of common donors, respectively. The left panels show gene-by-gene Pearson’s correlations of gene expression between the two tissues, comparing the scenarios with (x-axis) and without (y-axis) the removal of common donors. The right panels show the same comparisons, but with random downsampling (y-axis) of both tissues in the proportions defined for common donor removal. In the depicted examples, removing common donors and random downsampling yield comparable outcomes. D – Comparison of the impacts of removing common donor samples and of random downsampling across tissue pairs. Common donors were removed by identifying and discarding samples from shared donors while maintaining the original sample size imbalance between tissues. 
As this process inherently involves downsampling, which may itself affect the results, we performed an additional control by randomly removing samples from both tissues in the proportions defined for the removal of common donors. The heatmap is coloured according to whether the removal of common donors has a greater (red) or lesser (blue) impact than random downsampling. The values depicted in the heatmap, denoted as the Impact of Common Donors (ICD), are computed for each tissue pair in two steps. First, the absolute difference in Pearson’s correlation for each gene’s mean expression within each age window from the ShARP-LM pipeline is determined between the original data and the subset of data without common donors (DiffWoCD) or with random downsampling (DiffRD). Subsequently, the medians of DiffWoCD and DiffRD are computed, and the difference between these median values gives the ICD for the tissue pair. Because correlation is symmetric (i.e., the results for tissue 1 vs tissue 2 mirror those for tissue 2 vs tissue 1), the resulting matrix is triangular. Grey tiles denote NA values, i.e., tissue pairs for which the comparison is not meaningful, namely self-self comparisons and comparisons between sex-specific tissues. Top right inset: density plot (“smoothed” histogram) of all ICD values. E – Scatter plot relating the Impact of Common Donors (ICD, values in heatmap D) to the respective percentage of common donors (values in heatmap A).
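
The ICD computation described for panel D can be sketched for a single tissue pair as follows. This is a minimal illustration and not the authors' code. It assumes that each tissue is summarised as a genes × age-windows array of mean expression, that the per-gene Pearson's correlation is taken between the two tissues across age windows, that the medians are taken across genes, and that the sign convention is median(DiffWoCD) minus median(DiffRD), so that positive values correspond to a greater impact of common donor removal. The function names `genewise_pearson` and `impact_of_common_donors` are hypothetical.

```python
import numpy as np


def genewise_pearson(x, y):
    """Row-wise Pearson correlation between two (genes x age windows) arrays.

    Assumption: row i of `x` and row i of `y` hold the same gene's mean
    expression per age window in tissue 1 and tissue 2, respectively.
    Genes with a constant profile in either tissue yield NaN.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean(axis=1, keepdims=True)  # centre each gene's profile
    yc = y - y.mean(axis=1, keepdims=True)
    num = (xc * yc).sum(axis=1)
    den = np.sqrt((xc ** 2).sum(axis=1) * (yc ** 2).sum(axis=1))
    return num / den


def impact_of_common_donors(r_orig, r_wocd, r_rd):
    """ICD for one tissue pair from three per-gene correlation vectors.

    r_orig: correlations from the original data
    r_wocd: correlations after removing common donors
    r_rd:   correlations after random downsampling
    Assumed sign convention: median(DiffWoCD) - median(DiffRD).
    """
    diff_wocd = np.abs(np.asarray(r_orig) - np.asarray(r_wocd))
    diff_rd = np.abs(np.asarray(r_orig) - np.asarray(r_rd))
    return float(np.median(diff_wocd) - np.median(diff_rd))


# Toy usage with made-up per-gene correlations (three genes).
icd = impact_of_common_donors(
    r_orig=[0.9, 0.5, -0.2],
    r_wocd=[0.7, 0.5, 0.1],
    r_rd=[0.8, 0.4, -0.2],
)
```

Repeating this for every tissue pair would fill one triangle of the heatmap, with self-self and sex-specific pairs left as NA.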