Plant Biology

Dissecting oligogenic and polygenic indirect genetic effects through the lens of neighbor genotypic identity

Yasuhiro Sato author has email address
Kosuke Hamazaki author has email address

Faculty of Environmental Earth Science, Hokkaido University, Sapporo, Japan
Institute for Advanced Academic Research (IAAR), Chiba University, Chiba, Japan

https://doi.org/10.7554/eLife.111650.1

Open access
Copyright information

Figures and data

Overview of the proposed mixed model incorporating oligogenic and polygenic indirect genetic effects.
Panel (a) displays the locus-level equation (upper) and its mixed-model expression (lower) for direct and indirect genetic effects (DGEs and IGEs) with their interactions (DGE×IGE). Blue and red terms highlight extended components relevant to IGE and DGE×IGE, respectively. Plain (black) terms indicate DGE components (i.e., a standard GWAS/GP model). The model parameters are as follows; x_q,i ∈ {AA, Aa, aa} = {+1, 0, −1} whereby or 0; : mean allelic similarity between focal and neighboring individuals; : allele frequency bias near focal individuals; β_q,1: SNP-wise DGE; β_q,2: SNP-wise IGE based on neighbor allelic similarity; and β_q,12: SNP-wise DGE×IGE based on the interaction term between β_q,1 and β_q,2. These fixed-effect coefficients can be used for GWAS of DGE and IGE. Panel (b) shows variance-covariance structure between the random effects u₁ and u₂ based on a multi-variate normal (MVN) distribution. The bottom illustration outlines how to incorporate the local (neighboring) and global spatial genetic structure involving IGEs. The bottom right and left respectively represent neighboring space for individual k and l, which constitute the k and l-th element of the four variance-covariance matrices as K₁: Kinship matrix representing individual genetic similarity; K₂: Kinship matrix weighted by neighboring allelic similarity; and K₁₂ & K₂₁: Kinship matrix weighted by neighboring allelic frequency, with the (co)variance-component parameters : DGE genetic variance (>0); : IGE genetic variance (>0); ρ₁₂: DGE-IGE covariance (>0 or <0). See also “Model development” in the main text.

Overview of the proposed mixed model incorporating oligogenic and polygenic indirect genetic effects.
Panel (a) displays the locus-level equation (upper) and its mixed-model expression (lower) for direct and indirect genetic effects (DGEs and IGEs) with their interactions (DGE×IGE). Blue and red terms highlight extended components relevant to IGE and DGE×IGE, respectively. Plain (black) terms indicate DGE components (i.e., a standard GWAS/GP model). The model parameters are as follows; x_q,i ∈ {AA, Aa, aa} = {+1, 0, −1} whereby or 0; : mean allelic similarity between focal and neighboring individuals; : allele frequency bias near focal individuals; β_q,1: SNP-wise DGE; β_q,2: SNP-wise IGE based on neighbor allelic similarity; and β_q,12: SNP-wise DGE×IGE based on the interaction term between β_q,1 and β_q,2. These fixed-effect coefficients can be used for GWAS of DGE and IGE. Panel (b) shows variance-covariance structure between the random effects u₁ and u₂ based on a multi-variate normal (MVN) distribution. The bottom illustration outlines how to incorporate the local (neighboring) and global spatial genetic structure involving IGEs. The bottom right and left respectively represent neighboring space for individual k and l, which constitute the k and l-th element of the four variance-covariance matrices as K₁: Kinship matrix representing individual genetic similarity; K₂: Kinship matrix weighted by neighboring allelic similarity; and K₁₂ & K₂₁: Kinship matrix weighted by neighboring allelic frequency, with the (co)variance-component parameters : DGE genetic variance (>0); : IGE genetic variance (>0); ρ₁₂: DGE-IGE covariance (>0 or <0). See also “Model development” in the main text.

Benchmark simulations under the scenarios of positive or negative covariance between DGE and IGE.
The upper panels (a-f) show the results of positive covariance (ρ = 0.6) while the lower panels (g-l) show those of negative covariance (ρ = −0.6). (a and g) Estimated covariance between polygenic DGE and IGE. Horizontal dashed lines indicate the true ρ₁₂ value. (b and h) Estimated weight of IGE relative to that of DGE . Models with or without the covariance (cov or noncov) were fitted to simulated phenotypes. Horizontal dashed lines indicate the true w₂ value. (c-d and i-j) Genomic prediction based on the models with or without the covariance (cov or noncov). The model performance was evaluated using the coefficient of determination (R²: c and i) and mean absolute error (MAE: d and j) between the predicted phenotypic values ŷ and the observed phenotypic values of the test dataset y. (e-f and k-l) GWAS performance for the models with or without the covariance (cov or noncov) to detect causative IGE SNPs. The model performance was evaluated using the area under the ROC curve (AUC: e and k) and sensitivity at the stringent threshold at false discovery rate (FDR) = 0.05 (f nad l). Box plots display the median and quartiles among 30 independent iterations, with whiskers extending to 1.5 × inter-quartiles.

Benchmark simulations under the scenarios of positive or negative covariance between DGE and IGE.
The upper panels (a-f) show the results of positive covariance (ρ = 0.6) while the lower panels (g-l) show those of negative covariance (ρ = −0.6). (a and g) Estimated covariance between polygenic DGE and IGE. Horizontal dashed lines indicate the true ρ₁₂ value. (b and h) Estimated weight of IGE relative to that of DGE . Models with or without the covariance (cov or noncov) were fitted to simulated phenotypes. Horizontal dashed lines indicate the true w₂ value. (c-d and i-j) Genomic prediction based on the models with or without the covariance (cov or noncov). The model performance was evaluated using the coefficient of determination (R²: c and i) and mean absolute error (MAE: d and j) between the predicted phenotypic values ŷ and the observed phenotypic values of the test dataset y. (e-f and k-l) GWAS performance for the models with or without the covariance (cov or noncov) to detect causative IGE SNPs. The model performance was evaluated using the area under the ROC curve (AUC: e and k) and sensitivity at the stringent threshold at false discovery rate (FDR) = 0.05 (f nad l). Box plots display the median and quartiles among 30 independent iterations, with whiskers extending to 1.5 × inter-quartiles.

Proportion of phenotypic variation explained (PVE) by direct genetic effects (DGEs), indirect genetic effects (IGEs), and their covariance (cov) in aspen.
Blue (upper), green (middle), and red (lower) bars respectively indicate PVE by IGEs, DGEs, and covariance for each trait (see PVE_IGE, PVE_DGE, and PVE_cov calculations in the main text subsection “Model development”). The left and right panels present results from data collected in 2014 and 2015, respectively. Trait abbreviations: AvgBA, average basal area; AvgSize, average tree volume; GrowthYEAR1_YEAR2, growth of AvgSize from YEAR1 to YEAR2; BAI, basal area increment; AvgEFN, average extra-floral nectary; SDEFN, standard deviation of extra-floral nectary; TotEFN, total extra-floral nectary; SLA, specific leaf area; LA, leaf area; BB, first bud break; BS, terminal bud set; GS, growing season (i.e., BB to BS); CT, condensed tannin; PG, phenolic glycoside; TotalDefChem, total defense chemicals; CN, carbon-nitrogen ratio; and N, nitrogen.

Indirect genetic effects of conspecific neighbors on apple traits across the six sites (Belgium, BEL; France, FRA; Italy, ITA; Poland, POL; Spain, ESP; and Switzerland, CHE) and three years (2018, 2019, and 2020).
(a-f) Variation partitioning of the flowering intensity (a-b), trunk diameter (c-d), and trunk increment (e-f). The left panels (a, c and e) show the proportion of phenotypic variation explained (PVE) by direct genetic effects (DGEs), indirect genetic effects (IGEs), and their covariance (cov) following the aspen example (see Fig. 3). The right panels (b, d, and f) display PVE by the covariance against the mean trunk diameter per trial. A single data point corresponds to each field trial. Red circles highlights trials with significant . Solid and dashed lines indicate significant and non-significant trends detected by linear mixed models that include the sites and years as random effects. (g) GWAS manhattan plot of IGEs on the trunk increment at the Belgium site in 2019 (BEL2019). The y-axis [-log₁₀(p₂)] indicates the association score of β_q,2, which is plotted against 17 chromosomes of apple. The horizontal dashed line indicates the genome-wide threshold at FDR = 0.05.

Indirect genetic effects of conspecific neighbors on apple traits across the six sites (Belgium, BEL; France, FRA; Italy, ITA; Poland, POL; Spain, ESP; and Switzerland, CHE) and three years (2018, 2019, and 2020).
(a-f) Variation partitioning of the flowering intensity (a-b), trunk diameter (c-d), and trunk increment (e-f). The left panels (a, c and e) show the proportion of phenotypic variation explained (PVE) by direct genetic effects (DGEs), indirect genetic effects (IGEs), and their covariance (cov) following the aspen example (see Fig. 3). The right panels (b, d, and f) display PVE by the covariance against the mean trunk diameter per trial. A single data point corresponds to each field trial. Red circles highlights trials with significant . Solid and dashed lines indicate significant and non-significant trends detected by linear mixed models that include the sites and years as random effects. (g) GWAS manhattan plot of IGEs on the trunk increment at the Belgium site in 2019 (BEL2019). The y-axis [-log₁₀(p₂)] indicates the association score of β_q,2, which is plotted against 17 chromosomes of apple. The horizontal dashed line indicates the genome-wide threshold at FDR = 0.05.

Proportion of phenotypic variation explained (PVE) by direct genetic effects (DGEs), indirect genetic effects (IGEs), and their covariance (cov) in grape.
Blue (upper), green (middle), and red (lower) bars respectively indicate PVE by IGEs, DGEs, and covariance for each trait (see also Fig. 3). Trait abbreviations: “nb.clusters”, the number of clusters; “datever”, date of veraison; “nb.wshoots”, the number of woody shoots, “d13C”, δ¹³C; “glufru.norm”, the sum of glucose and fructose; “gluvrai.norm”, glucose; “fruvrai.norm”, fructose; “tartrate.norm”, tartaric acid; “malate.norm”, malic acid; “shik.norm”, shikimic acid; and “citr.norm”, citric acid.

Ising model-based simulation of spatial patterns shaped by the nearest neighbor interactions.
The lattice space with 40 x 40 cells shows the outcome of 100-times update of the genotype value after a random configuration of two genotypes (black and white cells) following the fixed effects of Eq. S1.1, i.e., (a) The case of β_q,2 = −0.2, β_q,1 = 0, β₀ = 0. (b) The case of positive β_q,2 = 0.2, β_q,1 = 0, β₀ = 0.

Ising model-based simulation of spatial patterns shaped by the nearest neighbor interactions.
The lattice space with 40 x 40 cells shows the outcome of 100-times update of the genotype value after a random configuration of two genotypes (black and white cells) following the fixed effects of Eq. S1.1, i.e., (a) The case of β_q,2 = −0.2, β_q,1 = 0, β₀ = 0. (b) The case of positive β_q,2 = 0.2, β_q,1 = 0, β₀ = 0.

Numerical examples for the theoretical expectation between the phenotypic value and the neighboring frequency of A alleles f_q,A at q-th SNP locus within a local space k.
Black and grey lines indicate the phenotypic values of A and a alleles, respectively. Dashed curves represent a group mean defined by the weighted mean of the phenotypic values between A and a alleles. Panels (a-d) exemplify respective cases of positive and/or negative β_q,n(= β_q,12) and β_q,s×n(= β_q,2) with zero or non-zero β_q,s(= β_q,1) and β₀ = 1.0.

Numerical examples for the theoretical expectation between the phenotypic value and the neighboring frequency of A alleles f_q,A at q-th SNP locus within a local space k.
Black and grey lines indicate the phenotypic values of A and a alleles, respectively. Dashed curves represent a group mean defined by the weighted mean of the phenotypic values between A and a alleles. Panels (a-d) exemplify respective cases of positive and/or negative β_q,n(= β_q,12) and β_q,s×n(= β_q,2) with zero or non-zero β_q,s(= β_q,1) and β₀ = 1.0.

Variance components and genomic prediction of simulated data under the scenarios of moderate positive (ρ = 0.3: a-d), null (ρ = 0: e-h), or moderate nagative (ρ = −0.3: i-l) covariance between DGE and IGE.
(a, e and i) Estimated covariance between polygenic DGE and IGE. Horizontal dashed lines indicate the true ρ₁₂ value. (b, f, and j) Estimated weight of IGE relative to that of DGE . Models with or without the covariance (cov or noncov) were fitted to simulated phenotypes. Horizontal dashed lines indicate the true w₂ value. (c-d, g-h and k-l) Genomic prediction based on the models with or without the covariance (cov or noncov). The model performance was evaluated using the coefficient of determination (R²: c, g and k) and mean absolute error (MAE: d, h and l) between the predicted phenotypic values ŷ and the observed trait values of the test dataset y. Box plots display the median and quartiles among 30 independent iterations, with whiskers extending to 1.5 × inter-quartiles.

Variance components and genomic prediction of simulated data under the scenarios of moderate positive (ρ = 0.3: a-d), null (ρ = 0: e-h), or moderate nagative (ρ = −0.3: i-l) covariance between DGE and IGE.
(a, e and i) Estimated covariance between polygenic DGE and IGE. Horizontal dashed lines indicate the true ρ₁₂ value. (b, f, and j) Estimated weight of IGE relative to that of DGE . Models with or without the covariance (cov or noncov) were fitted to simulated phenotypes. Horizontal dashed lines indicate the true w₂ value. (c-d, g-h and k-l) Genomic prediction based on the models with or without the covariance (cov or noncov). The model performance was evaluated using the coefficient of determination (R²: c, g and k) and mean absolute error (MAE: d, h and l) between the predicted phenotypic values ŷ and the observed trait values of the test dataset y. Box plots display the median and quartiles among 30 independent iterations, with whiskers extending to 1.5 × inter-quartiles.

Coefficient of determination (R²) for the estimated phenotypic values ŷ between the models with or without the covariance parameter ρ₁₂.
Box plots are shown for the simulation scenarios with varying covariance between DGE and IGE (ρ = −0.6, −0.3,0,0.3,0.6). The high correlations (R² > 0.9) indicate the very similar ŷ between the two models.

Genomic prediction of the basal area increment from 2014 to 2015 (BAI2014_2015) in aspen.
The coefficient of determination (R²: a and b) and mean absolute error (MAE: c and d) of predicted phenotypic values ŷ among 10-fold cross-validations are shown for the data collected in 2014 (left) and 2015 (right).

Proportion of phenotypic variation explained (PVE) by indirect genetic effects (IGEs), direct genetic effects (DGEs), and their covariance (cov) in 20 apple traits other than the flowering intensity, trunk diameter, and trunk increment given in the main text.
Blue (upper), green (middle), and red (lower) bars respectively indicate PVE by IGEs, DGEs, and their covariance for each trait. Blank bars indicate the absence of data.

GWAS Manhattan (a) and quantile-quantile (b) plots for the citric acid (“citr.norm” in Fig. 5) in grape.
(a) The association score of −log₁₀(p) plotted against the 19 chromosomes of grape (where chromosome 100 means unanchored chromosomes). Horizontal dashed line indicates the genome-wide significance threshold at FDR = 0.05. (b) The observed association scores −log₁₀(p) plotted against randomly expected values. Trend line and shade indicate y = x and its 95% confidence interval.

Sign up for email alerts