Quantitative trait and transcriptome analysis of genetic complexity underpinning cardiac interatrial septation in mice using an advanced intercross line

  1. Mahdi Moradi Marjaneh  Is a corresponding author
  2. Edwin P Kirk
  3. Ralph Patrick
  4. Dimuthu Alankarage
  5. David T Humphreys
  6. Gonzalo Del Monte-Nieto
  7. Paola Cornejo-Paramo
  8. Vaibhao Janbandhu
  9. Tram B Doan
  10. Sally L Dunwoodie
  11. Emily S Wong
  12. Chris Moran
  13. Ian CA Martin
  14. Peter C Thomson
  15. Richard P Harvey  Is a corresponding author
  1. Victor Chang Cardiac Research Institute, Australia
  2. School of Clinical Medicine, Faculty of Medicine and Health, UNSW Sydney, Australia
  3. Department of Infectious Disease, Faculty of Medicine, Imperial College London, United Kingdom
  4. Centre for Clinical Genetics, Sydney Children’s Hospital, Australia
  5. Randwick Genomics Laboratory, NSW Health Pathology, Australia
  6. School of Biotechnology and Biomolecular Sciences, Faculty of Science, UNSW Sydney, Australia
  7. Sydney School of Veterinary Science, University of Sydney, Australia

Abstract

Unlike single-gene mutations leading to Mendelian conditions, common human diseases are likely to be emergent phenomena arising from multilayer, multiscale, and highly interconnected interactions. Atrial and ventricular septal defects are the most common forms of cardiac congenital anomalies in humans. Atrial septal defects (ASD) show an open communication between the left and right atria postnatally, potentially resulting in serious hemodynamic consequences if untreated. A milder form of atrial septal defect, patent foramen ovale (PFO), exists in about one-quarter of the human population, strongly associated with ischaemic stroke and migraine. The anatomic liabilities and genetic and molecular basis of atrial septal defects remain unclear. Here, we advance our previous analysis of atrial septal variation through quantitative trait locus (QTL) mapping of an advanced intercross line (AIL) established between the inbred QSi5 and 129T2/SvEms mouse strains, that show extremes of septal phenotypes. Analysis resolved 37 unique septal QTL with high overlap between QTL for distinct septal traits and PFO as a binary trait. Whole genome sequencing of parental strains and filtering identified predicted functional variants, including in known human congenital heart disease genes. Transcriptome analysis of developing septa revealed downregulation of networks involving ribosome, nucleosome, mitochondrial, and extracellular matrix biosynthesis in the 129T2/SvEms strain, potentially reflecting an essential role for growth and cellular maturation in septal development. Analysis of variant architecture across different gene features, including enhancers and promoters, provided evidence for the involvement of non-coding as well as protein-coding variants. Our study provides the first high-resolution picture of genetic complexity and network liability underlying common congenital heart disease, with relevance to human ASD and PFO.

Editor's evaluation

Overall, this is a comprehensive study that will provide a useful reference for the field. It will be a useful tool for hypothesis generation, which could lead to research on therapies that target atrial septal or common congenital heart disease.

https://doi.org/10.7554/eLife.83606.sa0

Introduction

Congenital heart disease (CHD) is common with a neonatal incidence of 0.8–1%, placing an enormous burden on affected patients, families, and healthcare systems (van der Linde et al., 2011). High throughput sequencing has identified abundant examples of monogenic syndromic and non-syndromic forms of CHD; however, the majority represent complex multifactorial conditions of unknown etiology, with a significant role for polygenic disease, de novo mutations, and gene-environment interactions (Sifrim et al., 2016). Genome-wide association studies (GWAS) have identified a small number of common CHD risk alleles, likely those of the largest effect (Lahm et al., 2021); however, for the majority of clinical subcategories, none have been identified thus far.

Development of the four-chambered mammalian heart leads to the permanent separation of the systemic and pulmonary circulations through septation of common atrial and ventricular chambers. Septation co-evolved with air-breathing and involves the convergence of myogenic and non-myogenic tissues at the valvuloseptal apparatus of the atrioventricular (AV) junction (Moorman and Christoffels, 2003). This process is genetically vulnerable as septation defects occur commonly in the human CHD spectrum (Gruber and Epstein, 2004).

During fetal life, the inter-atrial septum functions initially as a one-way ‘flap valve’ to help divert systemic blood away from the pulmonary circulation (Figure 1A). Regulatory networks generated at the boundaries between different cardiac progenitor fields define the programs for septal development (De Bono et al., 2018; Rana et al., 2014; Steimle et al., 2018). Initially, a muscular septum primum grows inwards from the atrial roof (Anderson et al., 2003) associated with a mesenchymal cap at its leading edge. The septum primum cap and an additional mesenchyme at its base called the dorsal mesenchymal protrusion (Burns et al., 2016; Webb et al., 1998), then fuse with mesenchyme at the AV complex (Deepe et al., 2020). Before its closure, the upper edge of the septum primum becomes fenestrated by apoptosis, forming a left-right communication termed the ostium secundum (Anderson et al., 2003; Moore and Persaud, 1998). A septum secundum forms as an infolding of the dorsal atrial wall in humans (Anderson et al., 2003) or as a specific muscular ridge in mice (Briggs et al., 2012), leaving an additional, prominent and offset interatrial communication termed the foramen ovale (Burns et al., 2016), completing the flap valve apparatus (Figure 1A).

Figure 1 with 1 supplement see all
Atrial septal morphology in normal adult and congenital heart disease.

(A) Schematic of human fetal heart highlighting inter-atrial septal structures. In the fetal heart, the septum primum has not yet fused with the septum secundum, as in adult stages; thus, the foramen ovale is patent. The foramen ovale and ostium secundum are offset, creating a flap valve. Arrow shows the direction of the blood shunt, allowing blood to bypass the pulmonary circuit. (B) Schematic of atrioventricular septal complex in the adult heart, and dysmorphology associated with the more common secundum type of atrial septal defect (ASD), and patent foramen ovale (PFO). Red arrow in the normal heart indicates the viewpoint for panel C. Double-headed black arrows indicate blood flow in pathological settings. (C) Light micrograph of the interatrial septum of an adult mouse heart as seen after removal of the left atrial appendage. The annulus of the mitral valve is towards the lower left of this panel. Note that the foramen ovale is covered by the membranous atrial septum primum, which is foremost in the left viewpoint (see red arrow in panel B). Septal landmarks are shown along with quantitative atrial septal traits measured in F2 and F14 studies. The crescent corresponds to the upper edge (see A) of the atrial septum primum. Scale bar, 100 µm. AV: atrioventricular; CRW: crescent width; DMP: dorsal mesenchymal protrusion; FO: foramen ovale; FOW: foramen ovale width; FVL: flap valve length; IVS: interventricular septum; LA: left atrium; LV: left ventricle; MC: mesenchymal cap; RA: right atrium; RV: right ventricle; SP: septum primum; SS: septum secundum.

© 2001, Elsevier. Figure 1A has been adapted from Figure 7-16 from Larsen, 1997, with permission from Elsevier. It is not covered by the CC-BY 4.0 license and further reproduction of this panel would need permission from the copyright holder.

© 2001, Elsevier. Figure 1C has been adapted from Figure 1 from Kirk et al., 2006.

When the lungs become activated at birth, left atrial pressure increases and the flap valve normally closes permanently by fusion of the septum primum to the septum secundum (Figure 1B, left panel). However, fusion is incomplete in about one-quarter of the human population, leading to the condition termed patent foramen ovale (PFO; Figure 1B, right panel) (Hagen et al., 1984). For larger PFO, there is a strong probability of a hemodynamically significant inter-atrial communication associated with a higher risk of cryptogenic (unexplained) stroke, likely due to the passage of venous thrombi across the patent septum to the systemic circulation (Lechat et al., 1988; Webster et al., 1988). Larger PFOs are also associated with atrial septal aneurysm (ASA) (Hagen et al., 1984; Homma et al., 2003), as well as migraine with aura, clinical hypoxemia, and decompression illness in divers (Shnaider et al., 2004; Torti et al., 2004; Wilmshurst et al., 2000).

Atrial septal defect (ASD) is a less prevalent but more severe abnormality of the atrial septum (Feldt et al., 1971; Figure 1B, middle panel). There is likely a genetic link and anatomical continuum between the more common secundum form of ASD (ASDII) with PFO and ASA (Kirk et al., 2006; Kirk et al., 2007; Posch et al., 2010). ASDII arises from abnormalities of the septum primum and/or secundum, and presents as a frank and permanent interatrial corridor (Zipes, 2005), and, if untreated, the left-to-right blood shunt present postnatally can cause pulmonary hypertension and Eisenmenger syndrome, a life-threatening complication (Beghetti and Galiè, 2009).

Analysis of QTL has emerged as an approach to understand the genetic complexity underpinning both quantitative and complex (non-Mendelian) binary traits (Ma et al., 2020; Shirai and Okada, 2021). Our previous study of inbred mouse strains revealed significant variation in atrial septal anatomy correlating with PFO and ASA (Biben et al., 2000), with a genetic background as a major determinant. We established quantitative parameters of septal status including length of the septum primum (flap valve length; FVL), orthogonal width of the foramen ovale (FOW), and width of the open corridor in PFO (crescent width; CRW) (Figure 1B and C). We found that mean FVL was strongly negatively correlated with the prevalence of PFO across a variety of genetic backgrounds (Biben et al., 2000) and, collectively, short FVL, large FOW, and large CRW were all strongly associated with PFO risk (Kirk et al., 2006). All traits were influenced by mutation of the cardiac homeodomain transcriptional master-regulator NKX2-5, pathogenic variants in which are strongly causative for ASDII in humans (Biben et al., 2000; Schott et al., 1998). We performed a QTL analysis with an F2 design using QSi5 and 129T2/SvEms strains, representing mice with extremes of atrial septal phenotypes, identifying seven significant (logarithm of odds (LOD) >4.3) and six suggestive (LOD >2.8) QTL affecting quantitative septal traits (Kirk et al., 2006), indicating a complex genetic basis for atrial septation defects in inbred mice. The F2 design used in our previous study has the power to detect the most significant QTL underlying complex traits, but confidence intervals are usually large, and each may reflect the effects of multiple loci and can contain hundreds of potential candidate genes (Darvasi et al., 1993). Therefore, increasing recombination has become the focus of fine mapping approaches including the use of advanced intercross lines (AIL), generated by intercrossing inbred strains with extreme phenotypes across 10 or more generations to increase chromosomal recombination (Darvasi and Soller, 1995). A similar rationale underlies the establishment of multi-parental recombinant inbred strains, which capture >90% of common genetic diversity among mouse species, used recently to study complex cardiovascular disease traits in adults (Salimova et al., 2019).

Here, we advance our understanding of genetic complexity underlying atrial septal variation in inbred QSi5 and 129T2/SvEms mouse strains. We used AIL to confirm and fine-map significant QTL identified for FVL and FOW in the previous F2 study. We also sequenced the genomes of the AIL parental strains and integrated variant analysis with transcriptome data from dissected atrial septa at different developmental stages in parental strains. Our results provide the first high-resolution picture of genetic complexity underpinning atrial septal variation in the mouse model, allowing the identification of QTL with high impact, candidate genes, and gene regulatory network perturbations that may have relevance to human PFO and ASD.

Results

Selection of AIL mice - atrial septal phenotypes

In our previous F2 study (Kirk et al., 2006), adult QSi5 and 129T2/SvEms parental mice and F2 mice were scored for PFO as a binary trait, and three quantitative anatomical parameters of the inter-atrial septum (FVL, FOW, and CRW) that were found to be associated with PFO (see Methods, Figure 1C; Biben et al., 2000; Kirk et al., 2006). The prevalence of PFO in parental strains was 4.5% and 80%, respectively. Beginning the pedigree from a single breeding pair, we randomly intercrossed F2 mice selected for extremes of phenotype for a further 12 generations to generate F14 (AIL) mice (see details in Methods). As in the original F2 study, the most relevant quantitative septal parameter in the F14 study was FVL (Table 1; Figure 1—figure supplement 1) - it showed the greatest difference in mean length up to a maximum of 2.5-fold between different inbred strains (Biben et al., 2000), and a difference of ~twofold and 4.8 standard deviations (SD) between parental strains for this study. FVL also showed the strongest (negative) correlation with PFO prevalence among a number of inbred strains (r=–0.97) (Biben et al., 2000), and in both F2 (p<0.001) and F14 (p<0.001) generations (Kirk et al., 2006).

Table 1
Phenotypic characteristics of parental strains and F2 mice extracted from Kirk et al., 2006 compared to F14 mice.
QSi5129T2/SvEmsF2F14
N66751437*933
PFO (%)4.5801734
FVL ± SD (mm)1.13 ± 0.110.60 ± 0.111.0 ± 0.191.01 ± 0.16
FOW ± SD (mm)0.21 ± 0.060.24 ± 0.060.21 ± 0.070.24 ± 0.07
CRW ± SD (mm)0.51 ± 0.130.44 ± 0.120.41 ± 0.120.54 ± 0.15
Body Weight ± SD (g)29.4 ± 2.7717.5 ± 2.126.6 ± 3.325.8 ± 2.9
Heart Weight ± SD (g)0.21 ± 0.020.14 ± 0.020.21 ± 0.030.18 ± 0.03
  1. *

    Data were incomplete for some mice.

In our original study in which quantitative septal parameters were defined (Biben et al., 2000), we also determined the width of the patent corridor between the septum primum and septum secundum in cases of PFO, measured at the edge of the ostium secundum of the septum primum remnant (Figure 1C), which forms a prominent crescent-shaped ridge. However, since this parameter was constrained to cases of PFO, for QTL analysis we considered only CRW (crescent width), defined as the length of the prominent crescent irrespective of the presence of PFO (Kirk et al., 2006). Whereas there was a strong statistical effect of PFO on CRW in the F2 study (p<0.001) (Kirk et al., 2006), this was lost in the F14 cohort (p=0.069) (Supplementary file 1). An additional observation was that mean CRW was positively associated with the risk of PFO in the F2 cohort, whereas in the F14 study longer CRW was associated with lower PFO prevalence (negative association). For these reasons and because CRW remains an ill-defined anatomical parameter, we did not consider this trait in selecting mice with extremes of phenotype in either the F2 or F14 study. However, post-hoc analysis for CRW QTL in the F2 study revealed a significant QTL on MMU7 (LOD = 4.58) and a suggestive one on MMU3 (LOD = 3.49) (Kirk et al., 2006). We, therefore, performed a similar post-hoc analysis in this F14 study (see below).

FOW showed more subtle differences among individuals within the parental strains, substantially overlapping within one SD (Table 1), and the correlation with FVL was weak in the F2 cohort (r=–0.087; p=0.001) (Kirk et al., 2006). Nonetheless, variation in FOW is reflective of up to a twofold difference in foramen ovale area (Biben et al., 2000). Therefore, FOW was taken into account in selecting mice of extreme phenotypes for inclusion in the AIL study and was indeed associated with both PFO (p<0.001) (Supplementary file 1) and FVL (r=–0.284; p<0.001) in F14 mice (Supplementary file 2).

In the AIL study, we sought to confirm and fine-map significant QTL found in the F2 study with LOD scores above the threshold for significance of 4.3. This included three QTLs for FVL and three for FOW. We also included a suggestive QTL (2.8<LOD<4.3) for FOW located on MMU9 (LOD = 3.43), since its peak covered the T-box transcription factor gene Tbx20, variants in the human orthologue of which are known to cause familial septal defects and severe PFO (Kirk et al., 2007).

Selection of AIL mice - heart weight and body weight phenotypes

Given that QSi5 mice were originally selected for their high fecundity and growth, it was evident that the quantitative septal parameters under study might be influenced by heart size and mass (Table 1; Kirk et al., 2006), and indeed both FVL and FOW were significantly correlated with HW in both F2 and F14 cohorts, albeit that the effects were small (Supplementary file 2; Kirk et al., 2006) and HW had no influence on the likelihood of PFO. In the F2 study, therefore, we did not normalize septal data for HW so as not to mask important QTL (Kirk et al., 2006). However, the possibility that FVL and FOW QTL could be explained by variation in HW or BW has not been formally excluded. Thus, prior to the analysis of AIL mice, we performed a retrospective linkage analysis for HW and BW on F2 data. The HW of F2 mice was initially adjusted for factors with significant effects (age, sex, and BW) and we used the same LOD score criteria as in the F2 study (4.3 for significant and 2.8 for suggestive linkage) (Lander and Kruglyak, 1995). We discovered a suggestive QTL (LOD = 3.4) for normalized HW on MMU7, and another on the same chromosome that fell just short of suggestive (Figure 2A). The suggestive HW QTL overlapped with previously determined QTL for BW on MMU7, however, did not represent a BW QTL in our study. We also found a significant QTL for HW normalized for age, sex and BW on MMU11 (LOD = 8.5) (Figure 2B). This QTL overlapped one for BW normalized for age and sex (LOD = 14.2) (data not shown). Thus, we included normalized HW as a parameter in selection of F14 mice with extremes of phenotype for further QTL analysis (see Methods), and selected markers for fine mapping of the significant normalized HW QTL on MMU11.

Figure 2 with 1 supplement see all
Linkage analysis for heart weight (HW).

Suggestive and significant quantitative trait loci (QTL) identified on MMU7 (A) and MMU11 (B) by retrospective analysis of HW in the F2 study, the latter being fine-mapped by the advance intercross line (AIL) study. Y-axes represent logarithm of odds (LOD) scores for HW adjusted for age, sex, and body weight (BW).

Linkage results for atrial septal morphology

We set a LOD score of 2 as a cut-off for significant linkage based on the density of chosen markers (~2 cM) and the size of the genomic regions covered (Lander and Botstein, 1989). From the septal morphology QTL identified in the F2 study (six significant and one suggestive), at least six QTLs were confirmed and significantly narrowed using AIL data. Five F2 QTLs resolved into multiple peaks and several new QTLs were also discovered. Furthermore, the overlap between QTL for different traits was increased. Supplementary file 3 describes QTL for FVL, FOW, and CRW identified by the AIL study, and Figure 3 shows the examples from each relevant chromosome comparing F2 and AIL mapping results. Overall, 37 QTLs were significant in the AIL study. As in the F2 study, the direction of QTL varied, and were classified here as ‘normal’ if they contribute to the negative septal features documented for the 129T2/SvEms strain (shorter FVL; longer FOW; shorter CRW) and ‘cryptic’ if protective against septal defects (i.e. contributing to the QSi5 phenotype of longer FVL; shorter FOW; longer CRW) (Supplementary file 3).

Figure 3 with 1 supplement see all
Comparison of linkage results from F2 (red line) and advanced intercross line (AIL) (blue line) populations.

Y-axis represent logarithm of odds (LOD) scores and x-axis represent genetic map positions. A 1-LOD drop-off for each quantitative trait locus (QTL) is shown on the x-axis representing the confidence interval of the QTL.

Linkage analysis of PFO as a binary trait

Although in general, the linkage analysis of continuously distributed traits is more sensitive and informative than the analysis of binary traits, the analysis of PFO as a binary trait was still of importance in our study. Therefore, we also performed direct linkage analysis for the presence or absence of PFO in the F14 mice as a binary trait using a logistic regression model that we previously developed (Moradi Marjaneh et al., 2012). This analysis confirmed most of the FVL and FOW QTL. The AIL results for each chromosome for quantitative (FVL, FOW, and CRW) and binary (PFO) atrial septal traits are compared in Figure 4. Results from each chromosome are summarized as follows:

MMU1

Comparison of advanced intercross line (AIL) results for patent foramen ovale (PFO), flap valve length (FVL), foramen ovale (FOW), and crescent width (CRW).

We previously identified a significant QTL for FOW with a peak at 30.8 cM extending 26.1 cM (inclusive of 1-LOD on each side of the peak = 1-LOD drop-off) on MMU1 (Figure 3A). The linkage analysis of AIL data narrowed this region to 7.5 cM, including two adjacent peaks with LOD scores of 3.1 and 4.4 overlapping in the 1-LOD drop-off. To assess whether the two peaks were distinct, we re-ran the linkage analysis including the marker closest to the distal peak as a fixed term (Figure 3—figure supplement 1A). While this led to a significant reduction in the LOD score of the distal peak, the proximal peak did not change and was still significant, indicating that the peaks represent two independent QTLs. Unlike the F2 study, the AIL study also showed strong evidence of linkage across this region for FVL (including two FVL QTL with LOD scores of 6.0 and 2.3, respectively) (Figure 4A). Analysis of the AIL also revealed an additional QTL for FVL on MMU1 located at 9 cM, with a peak LOD of 2.2, which is clearly separate from the original QTL (Figure 4A). The QTL map from the analysis of PFO as a binary trait followed the pattern produced by FOW and FVL data (Figure 4A). In particular, its peak at 41.6 cM (LOD = 4) strongly supported highly significant FOW and FVL QTL in that vicinity. The 1-LOD drop-off of this peak covered approximately 2.4 cM, a very substantial refinement on the F2 study.

MMU2

The broad F2 QTL peak for FOW (peak at 67.5 cM; Figure 3B) was narrowed to a sharp peak at 72 cM refining the QTL genomic region from 18.2 cM (F2) to 3.6 cM (AIL) (Figure 3B and Figure 4B). We also observed a new highly significant QTL at the distal end of this chromosome (~100 cM) affecting both FOW and FVL (LOD scores 5.79 and 5.44, respectively) (Figure 3B and Figure 4B). The binary analysis of PFO supported QTL for both FOW and FVL. While the F2 study was not able to detect QTL for CRW on this chromosome, linkage analysis of the AIL data revealed two possibly distinct QTL close to each other at 41.25 cM and 44 cM (Figure 4B and Figure 3—figure supplement 1B) without pleiotropic effects on the other quantitative traits.

MMU4

The very broad F2 FOW QTL spanning 23.8 cM of the chromosome (peaking at 25.4 cM; Figure 3C) resolved in the AIL as two FOW QTL (6 cM, LOD = 2.2; 44 cM, LOD = 6.0), leaving the mid portion including the 1-LOD drop-off of the original peak unlinked (Figure 3C). Given the sparsity of markers across this region in the F2 study, this outcome likely represents a resolution of the broad F2 QTL region into two widely spaced QTLs. The AIL data also revealed three QTLs for FVL, of which the distal one was located at the same position as the distal FOW QTL (Figure 4C). The analysis of PFO as a binary trait generated a similar pattern to the FVL and FOW results. A new QTL for CRW was detected at 32.25 cM which, as for the CRW QTL on MMU2, did not affect the other traits.

MMU8

The region at the end of this chromosome was found to be linked to FVL in the F2 study (Figure 3D). It peaked at 62.5 cM and covered 12.2 cM of the chromosome. On analysis of the AIL data, this QTL resolved into two QTLs at 52.25 cM and 66 cM (Figure 3D). A highly significant QTL was also detected for FOW at 63.7 cM (Figure 4D).

MMU9

The F2 study revealed a suggestive QTL for FOW peaking at 20.7 cM (LOD = 3.4) and extending 17.3 cM on MMU9 (Figure 3E). The AIL resolved this into at least four separate QTLs with peaks at 9.25, 19.25, 24, and 28.25 cM (Figure 3E). Tbx20, a cardiac transcription factor gene, is located at 10.25 cM, within the 1-LOD drop-off of the first QTL, and very close to its peak at 9.25 cM. We also detected a new QTL for FOW, peaking at 35.5 cM with a maximum LOD score of 4.6. Analysis of the AIL also identified two peaks for FVL at 19.5 cM and 23.75 cM with significant LOD scores (Figure 4E), both overlapping with significant peaks for FOW. The analysis of PFO as a binary trait resulted in a strikingly similar pattern to the FOW, significantly supporting all four of the FOW QTL, including the QTL coinciding with the Tbx20 gene.

MMU11

The F2 study did not show a linkage of any region of MMU11 to the atrial septal traits. However, this chromosome was included in the AIL study to fine-map the HW QTL detected by the analysis of the F2 data (see below). Using the chosen markers for the HW QTL we also discovered at least two new QTL underlying FVL, with a peak LOD score of 4.8 at 9.5 cM (Figure 4F).

MMU13

The AIL narrowed down the broad genomic region of the F2 QTL for FVL (Figure 3F) from 19.4 cM to 8.7 cM (Figure 4G). It is notable that the AIL data resulted in a shifting of the QTL peak from 15.3 cM (F2) toward the telomere of the chromosome including two close peaks at 4.9 and 7 cM (AIL) which were determined to be significantly distinct using the fixing method (see Methods) (Figure 3—figure supplement 1C).

MMU19

The broad F2 QTL for FVL at 10.2 cM resolved into three separate QTLs with the highest LOD score at 16 cM (Figure 3G). The fixing method showed these peaks represented three separate QTLs (Figure 3—figure supplement 1D). We also observed a linkage between this region and FOW, with the highest LOD score of 3.9 at 19.75 cM (Figure 4H). Binary analysis of PFO showed a similar pattern to that seen for FVL and FOW, strongly supporting the QTL located between 13 cM and 23 cM.

Linkage analysis for heart weight

As noted above, our retrospective analysis of F2 data revealed a significant QTL for normalized HW at the proximal end of MMU11, with the highest LOD score of 8.5, and a suggestive QTL on MMU7 peaking at 42.1 cM (Figure 2). This is the first report of QTL affecting HW on MMU11, although numerous HW QTL on other chromosomes has been detected previously using different cohorts and study designs (Reed et al., 2008; Rocha et al., 2004; Sugiyama et al., 2002). An extra nine markers covering the MMU11 HW QTL region were selected for this AIL study (see Methods). We also performed linkage analysis for normalized HW on other chromosomal regions for which markers had been previously chosen for analysis of atrial septal morphology. MMU2, MMU4, MMU9, MMU11, and MMU13 all showed significant evidence of linkage to HW Figure 2—figure supplement 1; of these, only MMU2 has previously been linked with HW, albeit on different genetic backgrounds (Rocha et al., 2004). On MMU11, we observed evidence of linkage distally with a LOD score of 2.3 peaking at 15 cM, thus confirming the original F2 QTL (Figure 2B).

Comparison of linkage results for HW and atrial septal morphology, however, showed that only the HW QTL on MMU2 and MMU9 had a potential overlap within its 1-LOD drop-off with QTL for atrial septal parameters. Thus, the chromosomal locations of QTL for HW were largely different from those defining atrial septal parameters.

Contribution of QTL to phenotypic differences

The percent attributable phenotypic variance for each QTL is shown in Supplementary file 3. As in the F2 study (Kirk et al., 2006), many AIL QTL were of relatively large effect - 18/37 individually contributed ~23–70% of the difference between parental means. Interestingly, the majority of these (15/18) were for FOW, with seven being normal and eight being cryptic QTL. Normal and cryptic QTL are anticipated to interact additively or in more complex ways in parental strains and individual mice, which may show phenotypes more extreme than parental strains (Kirk et al., 2006). The effect sizes of FVL QTL were somewhat smaller, individually 0.3–9% of the difference between parental means, with the majority (12/15) being normal QTL. Whereas it is difficult to precisely quantify the effects of individuals or combinations of QTL, our data suggest that many of the QTL detected are of large or moderate effect size and, therefore, collectively contribute significantly to quantitative trait variation between parent strains.

Whole genome sequencing of parent strains and SNP density analysis

The whole genomes of the AIL parental strains were sequenced and mapped to the C57Bl/6J mouse reference genome (NCBI38/mm10). Genomic analysis of the various inbred mouse strains used in research reveals a mosaic pattern of haplotypes attributable to the genome of founder lines. Accordingly, pairwise comparisons of inbred strain genomes show long blocks of either low or high SNP density representing shared and divergent ancestry, respectively (Frazer et al., 2007; Wade et al., 2002). Regions with a high rate of polymorphism between parental strains (spanning nearly one-third of the genome but harboring more than 95% of the genetic variation) can significantly reduce the regions of interest underlying QTL (Wade et al., 2002), albeit that regions with low or high polymorphism may also carry variants that have arisen since the establishment of parent strains (Bloom et al., 2019). Although these would be rare, they could in, principle, also contribute to septal variation.

High-quality homozygote variants between the AIL parental strains identified by whole genome sequencing were used to compute variant density across the genome (Figure 5—figure supplement 1; see Methods). To assess the distribution of variant densities, we binned the genome into non-overlapping intervals and counted the number of variants in each bin (Figure 5—figure supplement 2). The analysis was repeated using different bin sizes ranging from 50 Kb to 1.1 Mb. The distribution of variant densities across the genome showed a bimodal pattern, confirming two categories of genomic regions with low and high variant density regions. Binning at 600 Kb intervals showed the clearest distinction between the two peaks of distribution and the boundary between the two peaks, representing a cut-off of 1000 variants per 600 Kb interval, was chosen to classify genomic segments as low or high variant density regions. Intersecting with high variant density regions narrowed the total size of the QTL regions from 264.1Mb to 109.5 Mb (41.5%) (Figure 5). It is noteworthy that among the high variant density regions called there were blocks of very high variant density, presumably also derivative of strain ancestry.

Figure 5 with 2 supplements see all
Segmentation of the quantitative trait loci (QTL) according to high variant density regions.

The QTL regions for each chromosome are illustrated as black blocks. The red histograms and y-axeis represent the variant density defined as the number of 129T vs QSi5 variants per 600 Kb genomics intervals. High variant density regions (density ≥ 1000) are represented as pink blocks. The x-axis for each chromosome represents the genomic position (Mb).

Filtering for high-impact variants and candidate genes

There were a total of 6,769,034 high-quality variants between the AIL parental strains identified by whole genome sequencing (using C57Bl/6J as reference). We first filtered for protein-coding variants in genes expressed in the developing atrial septum (see below) and which lay under QTL regions (within the 1-LOD drop-off support interval but without filtering for high SNP density regions), then analyzed through the Ensembl Variant Effect Predictor (VEP) tool (McLaren et al., 2016) for pathogenicity impact. We filtered variants for high impact (stop-gain, stop-loss, start-loss, splicing defect, frameshift), missense variants predicted deleterious by SIFT (Ng and Henikoff, 2003), and non-high impact (e.g. non-frameshift) deletions. This resulted in 45 variants spread over 28 genes for 129T2/SvEms and 47 variants across 35 genes for QSi5 (Supplementary file 4; Figure 6A and B), noting that two genes had unique variants in each parent strain (total number of genes carrying predicted deleterious variants = 61). Virtually all of these were in high SNP density regions (84/92; 91.3%; Supplementary file 4) and many genes (26%) contained multiple variants (Figure 6A and B).

Identification of predicted-pathogenic variants.

(A, B) Number of variants per gene for predicted pathogenic genes under quantitative trait loci (QTL) according to classification (high-impact, non-frameshift deletion, or predicted-deleterious missense variant) for (A) 129T2 and (B) QSi5. (C, D) Overlap of genes containing predicted-pathogenic variants with (1) genes in a high confidence/emerging congenital heart disease (CHD) list, (2) genes with a heart development cardiovascular function from mouse genome informatics (MGI), and (3) all genes under QTL intervals that contain protein-modifying variants. Overlaps are shown for predicted pathogenic variants in (C) 129T2 and (D) QSi5.

For the Tbx20 gene on MMU9, a total of 321 variants were detected (Supplementary file 5); however, only three were exonic and these were synonymous. No defects in splice motifs or cryptic splice donor/acceptor sites were detected using the VEP tool. We reran the analysis using Spliceogen (Monger et al., 2019) and again no splicing defects were detected. Spliceogen provides rankings of possible cryptic splice sites (Supplementary file 5), however, these do not reflect actual probabilities. Therefore, we analyzed splice junctions detected in RNA-seq data from the developing septum (see below) but did not find splice site differences for Tbx20 between parental strains. We also mapped Tbx20 transcript isoforms with Salmon (Patro et al., 2017) and performed differential expression analysis, again yielding no differences in isoform expression between strains (Supplementary file 5).

To create a high-confidence QTL gene list, we annotated genes lying under QTL according to prior association with cardiovascular phenotypes (Mouse Genome Informatics; MGI) or link to human CHD from a curated list of high-confidence and emerging CHD genes (Alankarage et al., 2019). Eight genes (Cybrd1, Dst, Fxn, Lrp2, Mcam, Pgbd1, Sik3, Smad6) intersected with one or both lists (Figure 6C and D). We highlight below several high-confidence genes, as well as Cep164, an example of a gene in which seven predicted-deleterious variants were detected:

Cep164

129T2/SvEms strain; MMU9; under a cryptic QTL for FOW (effect size 65%). Cep164 contained seven predicted-deleterious missense coding variants and one in-frame three base pair (bp) deletion (Figure 6A). Cep164 encodes a docking protein associated with mother centrioles and is essential for the formation of primary and multi-cilia, therefore, affecting cellular processes such as fluid sensing and inter-cellular signaling (Schmidt et al., 2012). Defects in ciliogenesis are well known to cause CHD (Audain et al., 2021; Li et al., 2015). Three Cep164 variants overlap regions predicted to encode coiled-coil domains in the centre of the protein and one variant may overlap a predicted DNA/RNA-binding domain (PredictProtein Bernhofer et al., 2021). Cep164 knockout mice are embryonic lethal showing multiple severe defects including heart developmental arrest at the looping stage (Siller et al., 2017).

Dst

129T2/SvEms strain; MMU1; under a cryptic QTL for FVL (effect size 5.68%). Dst1 carried three missense mutations predicted to be deleterious. Dst1 encodes diverse isoforms of Dystonin, members of the plakin family of cytoskeletal docking and tumor suppressor proteins implicated in cell growth, adhesion, cytoskeleton organization, intracellular transport, and migration. They may function in part through an ability to antagonize the activation of the Hippo pathway effector YAP (Jain et al., 2019). Mutations in humans and mice cause fragile skin (epidermolysis bullosa), severe neurological conditions, as well as skeletal muscle instability (Horie et al., 2017; Künzli et al., 2016). Isoform BPAG1b is expressed in cardiac and skeletal muscles colocalizing with Z-bands, sarcolemma, and cardiac intercalated discs, and interacts with α-actinin2. Knockout strains show diminished heart size and atrophy before weaning and altered hypertrophy/wall stress markers (Horie et al., 2017). Dst is classified as an emerging CHD gene.

Sik3

129T2/SvEms strain; MMU9; under a cryptic QTL for FOW (effect size 65%). Sik3 carries a non-synonymous SNP predicted to affect SIK3 function. Sik3 encodes a serine/threonine kinase of the AMP Kinase (AMPK) family whose activity is inhibited by cyclic AMP (cAMP) via phosphorylation at multiple sites by a cAMP kinase (PKA). SIK3 functions in a variety of biological processes (Wein et al., 2018) including through phosphorylation and inhibition of cAMP-regulated transcriptional coactivators (CRTCs) and class IIa histone deacetylases (HDACs), the latter functioning as inhibitors of MEF2 transcription factors that play key roles in heart development (Darling and Cohen, 2021). Sik3 overlaps with the MGI list with a homozygous targeted EUCOMM mutation showing increased heart weight.

Lrp2

QSi5 strain; MMU2; under a normal QTL for CRW (effect size 30%). Lrp2, encoding a low-density lipoprotein receptor-related protein, carried two missense variants predicted to be deleterious. Lrp2 is expressed in the SHF progenitor pool and developing outflow tract (OFT) and is known to act in multiple morphogenetic pathways. Variants in human LRP2 have been associated with hypoplastic left heart (Theis et al., 2020), whereas mutant mice show premature differentiation and depletion of anterior SHF progenitors in the fetus, leading to a shortened OFT, truncus arteriosus, and ventricular wall and septal defects (Christ et al., 2020).

Smad6

QSi5 strain; MMU9; under a cryptic QTL for FOW (effect size 58%). Smad6, encoding an inhibitory member of the mothers against decapentaplegic transcription factor (SMAD) family, carried the non-synonymous SNP NM_008542 c.845C>G (p.Arg282Pro), predicted to be deleterious. We chose this variant for further analysis. SMAD6 is an inhibitor of BMP signaling acting at multiple levels, including inhibition of the binding of receptor-activated (R) SMADs 1, 5, and 8 (the effector transcription factors of the BMP pathway) to BMP receptors, where they are normally activated by phosphorylation; inhibition of binding of R-SMADs to their co-SMAD (SMAD4); and direct transcriptional repression of targets genes of R-SMADs by recruiting transcriptional repressors such as CtBP, HDAC-1/3 and HOXC8/9 (Bai and Cao, 2002; Bai et al., 2000; Lin et al., 2003; Figure 7A). BMP signaling has been implicated in the cardiac specification and development of cardiac chambers, valves, septa, and outflow tract (Wang et al., 2011), and SMAD6 mutations in humans are associated with syndromic and non-syndromic congenital malformations, including cardiac outflow tract defects, as well as patent foramen ovale and atrial and ventricular septal defects (Calpena et al., 2020; Galvin et al., 2000; Tan et al., 2012). The R282P variant lies within the regulatory linker region joining the MH1 DNA-binding domain and MH2 transactivation/protein:protein interaction domain (Figure 7B), and interrupts the conserved PPXY motif known to bind the E3 ubiquitin ligase SMURF1, predicted to poly-ubiquitinate SMAD6 triggering proteosome-mediated degradation (Sangadala et al., 2007). SMURF1 ubiquitinates other targets including SMAD1, and can also act to exclude SMAD1 (and possibly SMAD6) from the nucleus (Sapkota et al., 2007). The R282P variant also lies close or immediately adjacent to serine residues that are modifiable by phosphorylation (Bian et al., 2014; Christensen et al., 2010; Mertins et al., 2016). In SMAD1, SMURF1 binding and subsequent ubiquitination is dependent upon MAPK-mediated phosphorylation of linker serine/threonine residues, which prime phosphorylation by kinase GSK-3β, facilitating poly-ubiquitination (Sapkota et al., 2007). Thus, the R282P SMAD6 variant potentially disrupts the complex regulatory network regulating SMAD6 stability and activity.

Functional analysis of SMAD6 variant R282P.

(A) Schematic of BMP signaling pathway highlighting roles of SMAD6. Prepared with Biorender. (B) Schematic of hSMAD6 showing the three main functional domains; the conserved DNA-binding MH1 domain (172-270aa), regulatory linker region (271-328aa), and the C-terminal transactivation/cofactor interaction MH2 domain (329-496aa). The position of G471 within the MH2 domain is indicated – mutation of this residue (G471S). Insert shows the amino acid sequence of the linker region containing PPXY (SMURF1 binding) and PLDLS (CtBP binding) domains, with the position of hSMAD R281P and Y279A variants indicated. Known serine phosphorylation sites are marked * above the sequence. (C–E). Functional analysis of hSMAD6 variants. Activity of hSMAD6-G471S (C), hSMAD6-Y279A (D), and hSMAD6-R281P (E), were assessed relative to wild-type (WT) hSMAD6 on a BMP-signaling responsive (BRE)-luc promoter in untreated and BMP4-treated cells. n=6–9; differences between WT and variant were assessed by unpaired t-test; data presented as mean ± SD. Asterisks denote significance, ****p<0.0001, ns = 0.3332. (F,G) Ubiquitination of WT hSMAD6 and variants assessed in MG132 or DMSO (control)-treated HEK293T cells. WT FLAG-ShMAD6 and variants were immunoprecipitated with anti-FLAG antibody with lysates subject to western blotting with anti-ubiquitin antibody (F). Total ubiquitin, FLAG-hSMAD6, and protein indicated. (G) Quantification of FLAG-hSMAD6 ubiquitination. n=3; difference between WT and variants assessed by one-way ANOVA with Tukey’s multiple comparisons; data presented as mean ± SD. Asterisks denote significance, ***p=0.0001.

We chose to follow up on this SMAD6 variant. First, we analyzed the activity of WT and mutant forms of SMAD6 in HEK293T cells after transfection of human (h) cDNAs and a luciferase reporter construct reading out SMAD-dependent BMP-signaling (Figure 7C–E). WT hSMAD6 repressed BMP signaling by ~98% at baseline and ~92% after stimulation of cells with BMP4. Repression was diminished significantly by the hSMAD6 mutation G471S, which abolishes the interaction between SMAD6 and SMAD1 (Figure 7C). Then, to gauge the impact of the loss of interaction with SMURF1, we replaced tyrosine 279 of the hSMAD6 PPXY domain, predicted to be essential for SMURF1 binding, with alanine (Sangadala et al., 2007). Relative to WT, the Y279A variant slightly reduced SMAD6 repression at baseline, and the change was greater after BMP4 stimulation (Figure 7D). The R281P hSMAD6 variant (equivalent to R282P in mouse), also showed a slight reduction in repression similar to Y278A at baseline, however, the reduction was eliminated after BMP4 stimulation (Figure 7E), suggesting an alternative mechanism to simply loss of SMURF1 binding. To take a different approach, we tested the poly-ubiquitination status of the above hSMAD6 variants using co-immunoprecipitation (Figure 7F). After transfection of FLAG-tagged hSMAD6 vectors into HEK293T cells and treatment with proteosome inhibitor MG132 or DMSO control, lysates were precipitated using a FLAG antibody and ubiquitination of hSMAD6 detected by western blotting. In the presence of MG132, poly-ubiquitinated hSMAD6 WT, R281P, and Y279A were readily detected (Figure 7F). Ubiquitination of G471S was lower, perhaps because ubiquitination depends on interaction with an R-SMAD. In DMSO, poly-ubiquitination of WT, Y279A, and G471S hSMAD6 were barely detected; however, poly-ubiquitination of R281P occurred at levels ~10 fold higher than WT, comparable to that seen in the presence of MG132 (Figure 7F and G). Our findings suggest a defect in the hSMAD6 R281P-SMURF1 binding/ubiquitination pathway leading to constitutive poly-ubiquitination.

Because the Smad6 variant lies under a cryptic QTL of high effect (58%) and is seen in the QSi5 strain (with robust septal qualities), it is an excellent candidate for being protective against atrial septal defects. We confirmed Smad6 expression in the developing interatrial septum (see below). Other highlighted variants have not yet been validated.

Transcriptome analysis of the developing atrial septum

To explore changes in the gene regulatory networks affecting atrial septal traits, we profiled the transcriptome of developing septa dissected from AIL parental strains at embryonic stages (E) 12.5, E14.5, and E16.5 (see Methods). At the earliest time point, E12.5, the thin septum primum has already fused with the AV septum via its mesenchymal cap, and the ostium secundum has been created by cell death; thus, transcriptome changes across the three stages analyzed may reflect prior changes during septum primum formation, as well as the formation of the septum secundum, septal growth, and remodeling. RNA was extracted and pooled into two biological replicates per mouse strain per time point (three septa/pool; total 12 samples) and cDNA libraries were generated and sequenced on an Illumina platform.

Principal component analysis (PCA) performed on the top 500 variable protein-coding genes showed that sample replicates clustered closely together, whereas samples from the different embryonic time points segregated along the first principal component (PC1) axis and those from the different inbred strains segregated along the PC2 axis (Figure 8A). We calculated differentially expressed protein-coding genes (DEGs; protein coding) for each time point separately (using the DESeq2 R package Love et al., 2014) using significance thresholds of padj <0.05 and log2 fold-change >0.5. E12.5 had the largest number of differentially expressed genes (990), followed by E16.5 (881) and E14.5 (762) (Figure 8B and C; Supplementary file 6), and at all time points there were more genes downregulated in 129T2/SvEms compared to QSi5 (Figure 8C). The majority of DEGs were unique to a time point; however, 453 genes were downregulated in at least 2-time points. Almost 30% (18/61) of genes with predicted-pathogenic variants showed differential expression at at least 1-time point (Supplementary file 4).

Figure 8 with 2 supplements see all
Transcriptome analysis of atrial septum.

(A) Principal component analysis (PCA) plot of gene expression profiles from atrial septum RNA-seq libraries including two replicates per time point (E12.5, E14.5, and E16.5) per mouse strain (QSi5 and 129T2/SvEms) for protein-coding genes. (B) Venn diagram with the number of differentially expressed genes between QSi5 and 129T2/SvEms at different time points. (C) MA plots of differentially expressed genes for each time point. (D) STRING network of E14.5 differentially expressed genes and genes with predicted-pathogenic variants from QSi5 or 129T2/SvEms. Genes are colored according to whether they are upregulated or downregulated in 129T2/SvEms or contain a variant that is predicted-pathogenic and differentially expressed.

We investigated the biological significance of DEGs at each time point separately using protein-protein association connections obtained from the STRING database (Szklarczyk et al., 2017), filtering for high-confidence connections. Overall network significance was calculated by comparing network metrics with those from STRING networks of randomly selected genes across 100,000 permutations. To discover potential drivers of network perturbations, we also included in the STRING analysis protein-coding DEGs which were located under QTL (1-LOD drop-off) and in which variants predicted to be deleterious had been identified. Using this stringent approach, the network at E14.5 was significant for both the number of edges (observed = 921; expected = 419.8 ± 56.5; p<1 × 10–5) and average clustering coefficient (observed = 0.40; expected = 0.27 ± 0.03; p=8 × 10–5) (Figure 8D). The E14.5 network was driven predominantly by genes downregulated in 129T2/SvEms compared to QSi5 septa, i.e., the strain with the highest prevalence of PFO and most disadvantageous septal traits (shortest FVL, largest FOW). Highly connected sub-networks contained genes involved in biosynthetic and signaling pathways and macromolecular machines, including nucleosomes, ribosomes, mitochondria, and extracellular matrix (ECM) (Figure 8D). This signature was supported by a significant over-representation of gene ontology (GO) biological process terms among DEGs for nucleosome assembly, chromatin assembly, translation, ATP metabolic process, and extracellular matrix organization (Supplementary file 7; Fisher’s exact test; false discovery rate <0.05). Whereas networks for E12.5 and E16.5 alone, and those for DEGs common across all time points, were not significant overall, both E12.5 and E16.5 networks contained ribosomal and ECM protein sub-networks (Figure 8—figure supplement 1 and Figure 8—figure supplement 2).

Within the E14.5 network, Ubiquitin C (Ubc) was a prominent downregulated hub gene connecting directly to histones (specifically H2B) and ribosomal genes, as well as others, dispersed across the network, most of which were also downregulated (Figure 8D). In addition to its well-characterized role in protein homeostasis, Ubc is also involved in transcription and genome integrity (Mark and Rape, 2021; Mattiroli and Penengo, 2021; Qu et al., 2021). As such it might be expected to be functionally connected to a large number of genes even in random networks; however, the number of connections between Ubc and other genes in the septal network was far greater than expected by chance (Fisher’s exact test, padj = 6.97 × 10–5). Furthermore, analysis of network topology in Cytoscape showed that Ubc had the highest measure of stress centrality and betweenness centrality, key indicators of network hubs (Shannon et al., 2003).

Other genes with high centrality scores were Akt1 and Prkaca, encoding AKT Serine/Threonine kinase/Protein Kinase B (AKT/PKB) and Protein Kinase A (PKA), respectively, which were downregulated in 129T2/SvEms compared to QSi5 septa, and had dispersed downregulated connections across the network. These connections were over-represented in their respective known and predicted phosphorylation targets (sourced from PhosphoSitePlus Hornbeck et al., 2012 and PhosphoPICK Patrick et al., 2015) - 12/31 for ATK, 9/17 for PKA (P<1 × 10–4 for both kinases; Fisher’s exact test) (Figure 9). We further analyzed known and predicted kinase targets among all genes in the network, which resulted in an additional 35 phosphorylation connections for PKA and nine for AKT. Interestingly, along with hubs for Ubc, collagen/ECM, and translation, an additional hub centered on Src, encoding a proto-oncogene tyrosine kinase involved in embryonic development and cell growth, was evident at E12.5, but not at later stages (Figure 8—figure supplement 1). Across all stages, there were very few DEGs in which predicted-pathogenic coding variants were identified, and none appeared as hub genes, placing network genes downstream of functional variants.

Network analysis of protein kinase A (PKA) and AKT1 phosphorylation targets.

Integrated network analysis of PKA and AKT1 STRING connections and phosphorylation substrate targets from PhosphoSitePlus and PhosphoPICK predictions.

Differentially expressed QTL genes are enriched for genomic variants

Both protein-coding and cis-regulatory variants are likely causally linked to network perturbations underlying quantitative traits, as highlighted by GWAS on human disease (Zhang et al., 2014). To explore the relationship between QTL variants and DEGs further, we assessed the enrichment of variants (129T2/SvEms vs QSi5) within and around DEGs underlying QTL relative to genomic features (enhancers, promoters, 5′UTRs, exons, introns, and 3′UTRs), comparing to variant distribution in non-DEGs. Enhancers were defined as regions showing enhancer histone (H) marks (H3K4me1+ and/or H3K27ac+; H3K4me3) within regions covering 50 Kb-250 Kb upstream and downstream of the transcriptional start site (TSS) for genes expressed in mouse embryonic stem cell-derived cardiac progenitor cells and cardiomyocytes, and whole E14.5 fetal hearts (Wamstad et al., 2012). We selected DEGs as protein-coding genes with an adjusted differential expression p-value (padj) of <0.05 and absolute log2 fold-change of >0.5 in at least one atrial septal time point. Of 2168 such genes, 214 were under QTL (1-LOD drop-off) (Supplementary file 8). For non-DEGs, we selected 2168 protein-coding genes that showed the highest padj value after merging data from all three time points, of which 188 were under QTL (Supplementary file 8).

After permutation testing, we observed a significant enrichment of variants relative to expected in enhancers, promoters, 5′UTRs, and introns for DEGs underlying QTL (Figure 10). Non-DEGs showed significant depletion of variants in exons. DEGs and non-DEGs showed similar normalized variant count distributions across features (Figure 10—figure supplement 1), indicating that DEGs were not substantially skewed toward genes with high variant density. Likewise, variants were not focused on any specific gene class (Supplementary file 8). The over-representation of variants was consistent when candidate cardiac enhancers near DEGs were considered even up to 250 Kb from the TSS (Figure 10—figure supplement 2A), noting however that most enhancers were found within the 100 Kb window (Figure 10—figure supplement 2B). Interestingly, DEGs had approximately 60–80% more enhancers than non-DEGs, consistent with their higher levels of expression, irrespective of septal stage or strain (Figure 10—figure supplement 2C–E).

Figure 10 with 2 supplements see all
Enrichment of 129T2/SvEms vs QSi5 variants in genomic features of differentially expressed genes (DEGs) under quantitative trait loci (QTL).

The number of variants directly overlapping a feature is shown in blue, and gray bars represent the expected values based on the mean overlap from 1000 randomly generated interval sets. Error bars show the 95% confidence intervals of the mean. The significance of the enrichment is expressed as p-values, calculated by dividing the number of random samples showing equal or greater overlap than the observed by the total number of permutations (*p< 0.05).

Discussion

QTL analysis has emerged as an approach to understand the genetic complexity underpinning both quantitative and complex (non-Mendelian) binary traits. PFO and ASD are examples of complex binary traits of medical significance. One model for complex binary traits assumes an underlying continuous but unobservable variable (termed liability) with a threshold above which an individual expresses a phenotype (Falconer, 1965). Quantitative parameters act as proxies for the assumed liabilities and significantly increase the power of QTL detection. Comparing F2 and AIL designs with identical parameters including sample size and marker density, the predicted confidence intervals of AIL QTL are t/2 times smaller than those of F2 QTL, where t is the number of AIL generations (Darvasi and Soller, 1995). This indicates that AIL is a powerful method for precise localization of QTL and also separation of linked QTL identified by an F2 design.

We previously mapped QTL underlying quantitative parameters of the atrial septum using an F2 intercross design and here applied the AIL approach for confirmation and fine mapping. The cascade breeding program generating the F14 AIL resource from 48 breeding pairs per generation came close to the practical optimum (100 animals produced per generation) (Darvasi and Soller, 1995). From seven F2 QTLs included in our AIL study, at least six were confirmed and substantially narrowed, whereas five F2 QTLs resolved into multiple peaks, and additional QTLs were discovered. Overall, 37 significant atrial septal QTLs were documented.

Among the three quantitative parameters studied here, FVL and FOW showed a similar pattern of linkage on most of the chromosomes. This is a new finding that was not evident in the F2 study. Furthermore, independent analysis of PFO as a binary trait strongly supported results for FVL and/or FOW on most chromosomes. As noted, of the traits analyzed, FVL has a larger variation between parental strains (Table 1; Kirk et al., 2006) and shows a stronger negative correlation to PFO prevalence among several inbred and mutant strains (Biben et al., 2000), as well as in both F2 and F14 mice. Collectively, these data suggest that many QTLs affect the formation of the primary and secondary atrial septa in common, and that FVL is a robust indicator for atrial septal morphology and risk of PFO.

CRW did not show a significant correlation to PFO in the AIL data and was, therefore, not found to be a satisfactory surrogate for the size of the open corridor in cases of PFO (Biben et al., 2000). Nonetheless, we found five significant QTLs for CRW, only one of which overlapped with QTL for the other traits, suggesting that CRW is determined largely by different genetic elements to those governing FVL and FOW. As CRW is measured along the lower boundary of the apoptotic domain that generates the ostium secundum in septal development, it may relate to tissue remodeling subsequent to the apoptotic process.

As septal parameters may be influenced by heart size, we performed a linkage analysis for normalized HW across MMU11, where a significant QTL was discovered retrospectively in the F2 data, and indeed across all other chromosomal regions for which markers were selected. We confirmed and refined the position of the HW QTL on MMU11, and detected HW QTL on MMU2, MMU4, MMU9, and MMU13. Of these, only MMU2 has previously been linked with HW (Rocha et al., 2004). Inter-trait correlation and linkage results showed that HW influences quantitative septal parameters in both F2 and F14 studies, but only in a relatively minor way (Supplementary file 2). Importantly, HW and atrial septal morphology QTL showed limited overlap. We propose that HW is mostly determined by ventricular chamber growth as an independent parameter to the development of ‘primary’ myocardial components of the early heart tube, which have a lower proliferative index (Moorman and Christoffels, 2003), and that of the mesenchymal and cushion components, that contribute to the formation of the inter-atrial septum.

Overall, we conclude that septal morphological variation and risk of PFO have a complex genetic basis in inbred strains of mice. Given that the analysis was restricted to the previously found QTL and the selected markers covered only a limited part of the genome, it is likely that the genetic complexity underpinning septal defects in the two inbred strains under study is even greater than revealed here. It is noteworthy that the two inbred lines in this study were previously selected for independent traits. For example, the QSi5 strain, used here because of its low PFO risk (Kirk et al., 2006), was bred for numerous traits related to high fecundity (Holt et al., 2004) and whereas this has facilitated the generation of >1000 mice required for the AIL study, the genetic diversity that contributes to variation in atrial septal morphology may be limited by the prior genetic selection. High genetic complexity underpinning variation in atrial septal morphology may be expected, given the diverse lineage origins, tissue contributions, and morphogenetic networks contributing to septal structure (Anderson et al., 2003; De Bono et al., 2018; Deepe et al., 2020; Rana et al., 2014; Steimle et al., 2018). Extrapolating to the diverse outbred human population, we might conclude that potentially many hundreds of variants contribute to atrial septal dysmorphology, consistent with estimates of the number of inherited and de novo mutations discovered to date contributing to CHD more broadly (Jin et al., 2017).

To create a list of protein-coding genes that could potentially contribute to differences in atrial septal morphology in inbred strains of mice, we sequenced the whole genomes of parent strains and then filtered for genes that lie under QTL (1-LOD drop-off), were expressed in the developing inter-atrial septal septum, and carried variants predicted to be pathogenic. The total list comprised 61 genes carrying 92 variants, resulting in an average of ~2.5 candidate protein-coding variants per QTL. Most of these genes lay within regions of high SNP density and 26% of them carried multiple variants. We further filtered for genes that overlapped the MGI list of mouse genes associated with cardiovascular phenotypes, and a curated known and emerging human CHD gene list (Alankarage et al., 2019). Eight genes (Cybrd1, Dst, Fxn, Lrp2, Mcam, Pgbd1, Sik3, Smad6) overlapped with curated lists. The detailed role of most of these genes in septal development has not been elucidated. Our follow-up study of the SMAD6 linker region variant R282P demonstrated a hyper-ubiquitinated state for this critical regulator of BMP signaling. The full mechanism and in vivo significance remain to be determined, noting that hyper-ubiquitination, in addition to affecting SMAD6 stability in vivo, could inhibit additional regulatory processes or interactions occurring across the linker region (Sapkota et al., 2007; Xu et al., 2016).

Extending genome analyses, we determined the transcriptome of atrial septal regions of both strains at three developmental time points and calculated DEGs. By far the most significant STRING protein network for DEGs was at E14.5, with subnetworks with high connectivity associated with growth-related macromolecular cellular structures including nucleosomes, ribosomes, mitochondria, and ECM. Virtually all associated genes were downregulated in the 129T2/SvEms strain (showing less robust septal characteristics). Our findings suggest that there is a general septal growth and maturation deficit involving multiple septal components in the 129T2/SvEms strain at a relatively late stage (E14.5) of septal development. This idea is supported by the correlations between FVL and FOW (reflecting the development of the primum and secundum septa, respectively) (Kirk et al., 2006) and the strong concordance between QTL for individual septal traits and PFO discovered in this work. We hypothesize that differences in quantitative septal traits and PFO prevalence that we documented previously in an inbred, hybrid, and mutant strains of mice (Kirk et al., 2006), relate in part to degrees of septal growth and maturation, and that higher septal growth and maturation facilitates better fusion of the septum primum and secundum. Careful elucidation of the differential growth parameters of septal components in parent strains across time may provide further support for this model.

STRING networks identified three potential hub drivers of this depressed biosynthetic state in the 129T2/SvEms strain - genes for Ubiquitin c, and kinases AKT and PKA. Many genes encoding targets of these kinases were also up- or downregulated, potentially a network adaptation. We propose that variants in the 129T2/SvEms parent strain depress growth-related signaling pathways (including BMP, AKT, PKA, and SRC) and associated ubiquitin-related quality control processes, impacting septal growth and morphogenesis independently of chamber growth. In this light, it is interesting that CHD and cancer risk genes have recently been correlated (Morton et al., 2021). This could be a focus area for further investigation.

Our analysis of variant architecture under QTL (Figure 10) showed that variants (129T2/SvEms vs QSi5) were over-represented in differentially expressed atrial septum-expressed genes located under QTL across multiple gene features including enhancers, promoters, 5′-UTRs and introns. These data suggest a direct mechanistic link between the presence of non-coding (including cis-regulatory) as well as coding region variants in the differential expression of septal genes underlying QTL. This would represent a significant departure from a reductionist model whereby a single variant accounts for the impact of each QTL. Cis-regulatory variants are, therefore, candidate drivers of network perturbations revealed by STRING analysis.

Our results are consistent with Fisher’s 'infinitesimal' model whereby many loci (theoretically an infinite number) could each contribute to a small part of the liability for common disease (Fisher, 1919; Norton and Pearson, 1976), and also GWAS analysis of human disease that most often detect common risk loci of small effect. However, as in our previous study (Kirk et al., 2006), individual septal QTL can have significant effect sizes, demonstrating the power of the AIL approach to discern alleles of major impact. Indeed, all five predicted pathogenic coding variants highlighted in this study, including the Smad6 variant, lay under QTL of moderate to high effect size (5.68–65%). It is possible that a limited number of high-impact variants are responsible for septal defects in certain AIL mice. This type of polygenic inheritance has been demonstrated recently in a human cardiomyopathy family (Gifford et al., 2019). Such variants may prove to have causative or modifying effects on atrial septal defects in humans, as exemplified by SMAD6.

Several findings speak to the complex genetic landscape of PFO. It is interesting that twelve out of the 37 QTL detected were cryptic QTL, with many being of moderate to large effect size (~5–70%; Supplementary file 3). Thus, cryptic QTL is frequently detected and individually appear to provide strong buffering effects against septal defects. Cryptic QTL has been detected commonly in other animal and plant QTL studies (Rieseberg et al., 1999). The mechanisms of their action would be interesting to identify as they may reveal pathways that protect against CHD more broadly.

A further complexity is that, of five genes carrying predicted pathogenic variants highlighted in this study, four were found in the unexpected strain. For example, Cep164 lies under a cryptic QTL (protective), whereas predicted pathogenic variants in Cep164 were found in the 129T2/SvEms strain (showing worse septal status). The inverse can also apply – for example, Lrp2 lies under a normal QTL, but variants were found in the QSi5 strain. Such findings suggest highly complex interactions within and between individual QTLs. As noted, 26% of predicted pathogenic genes, some known to be involved in heart development, carried multiple predicted pathogenic variants (Cep164 carried seven). It would be reasonable to imagine that these alleles are null or have major impacts on gene function. Genetic compensation may also buffer the impact of such variants (El-Brolosy et al., 2019).

Our results provide the first high-resolution picture of genetic complexity and multilayered network liabilities underpinning atrial septal variation in a mouse model. Elucidating the impacts of genetic variants under QTL at the individual gene level would be a daunting task; however, understanding higher level network control of septal development underpinning observed defects in septal macromolecular synthesis pathways is an important cardiac developmental systems genetics problem, which may have relevance to other congenital dysmorphologies. Our quantitative model offers opportunities to study the interface between genetic, environmental, and epigenetic inputs to common CHD and potentially other diseases in greater detail.

Materials and methods

Key resources table
Reagent type (species) or resourceDesignationSource or referenceIdentifiersAdditional information
Strain, strain background (M. musculus)129 T2/SvEmsThe Jackson Laboratory Biben et al., 2000Stock no. 002064http://www.informatics.jax.org/mgihome/nomen/strain_129.shtml
Strain, strain background (M. musculus)QSi5The Jackson LaboratoryStock no. 027001
AntibodySmad6 (mouse monoclona)Santa Cruz BiotechnologySC-25321; RRID: AB_627906WB (1:100)
AntibodyFLAG-M2 (rabbit monoclona)Cell Signaling TechnologyCST:14793; RRID: AB_2572291IP (1:150) WB (1:1000)
AntibodyUbiquitin (mouse monoclona)Santa Cruz BiotechnologySC-166553; RRID: AB_2241297WB (1:1000)
Cell line (H. sapiens)HEK293TATCCRRID: CVCL_0063
Transfected construct (H. sapiens)PCS2-SMAD6ADDGENERRID:Addgene_14960
Transfected construct (M. musculus)BRE-lucADDGENERRID:Addgene_45126
Peptide, recombinant proteinBMP4GibcoGibco:PHC9534;100 ng/mL
Commercial assay or kitDual luciferase assay kitPromegaPromega:E1980
Commercial assay or kitKAPA hifiRocheRoche:KK2101
Commercial assay or kitWesternBreeze Chemiluminescent kit-Anti-rabbitThermo Fisher ScientificThermo Fisher:WB7106
Software/ algorithmImageJNIHRRID:SCR_003070
Software/ algorithmGraphpad PrismDotmaticsRRID:SCR_002798
OtherRecombinant protein G-Sepharose 4BThermo Fisher ScientificThermo Fisher:101241
Software/algorithmRhttps://cran.r-project.orgVersion 4.2.2
Software/algorithmGenomicRanges (R package)Lawrence et al., 2013Version 1.42.0
Software/algorithmGenomicFeatures (R package)Lawrence et al., 2013Version 1.42.3
Software/algorithmliftOverKent tools via UCSCversion 402
SoftwareTrimmomaticBolger et al., 2014v0.35
Software/AlgorithmSTAR alignerDobin et al., 2013v2.5.1
Software/AlgorithmSalmonPatro et al., 2017
Software/AlgorithmDESeq2Love et al., 2014v1.14.1
Software/algorithmPANTHERMi et al., 2017Web service
Software/algorithmBWA-MEMLi and Durbin, 2010v0.7.15
Software/algorithmGATKvan der Auwera and O’Connor, 2020v3.5
Software/algorithmVCFtoolsDanecek et al., 2011v0.1.14
Software/algorithmAnnovarWang et al., 2010
Software/algorithmCytoscapeShannon et al., 2003RRID: SCR_003032V3.9.1
Software/algorithmPythonhttps://www.python.org
Software/algorithmNetworkxHagberg et al., 2008V3.0
OtherSTRINGSzklarczyk et al., 2017v10
Software/algorithmSpliceogenMonger et al., 2019
Software/algorithmVariant Effect PredictorMcLaren et al., 2016
Software/algorithmSIFTNg and Henikoff, 2003
Software/algorithmfgseahttps://doi.org/10.1101/060012v1.22.0
OtherMouse Molecular Signatures Databasehttps://www.gsea-msigdb.org/version m5.all.v2023.1.Mm

QTL fine mapping using advanced intercross line

Mice and advanced intercross line

Request a detailed protocol

129T2/SvEms mice (sub-strain 129T2/SvEms-+Ter?) were obtained from The Jackson Laboratory and were a derivative of the Ter subline family of 129 strains (Biben et al., 2000). QSi5 is an inbred mouse strain that was bred from outbred Quackenbush Swiss mice (Holt et al., 2004) at the University of Sydney, Sydney NSW, and was maintained as a breeding colony once established. QSi5 mice are also available from the Jackson Laboratory (strain 027001). As parental strains in our F2 study (Kirk et al., 2006), 129T2/SvEms and QSi5, they were selected based on having extreme values for mean FVL and prevalence of PFO, two indicators of atrial septal morphology which are strongly negatively correlated (Biben et al., 2000). 129T2/SvEms had the highest prevalence of PFO (75%) and shortest FVL (mean = 0.6 mm) among inbred strains and multiple crosses (Biben et al., 2000). QSi5 had the longest FVL (mean = 1.13 mm) and among the lowest prevalence of PFO (4.5%).

We randomly inter-crossed the original F2 mice for 12 further generations. The F2 mice were bred to produce 48 male and female pairs which were then stocked in 48 separate cages. We inter-crossed the mice in each cage to generate F3 mice. For the F3 × F3 and subsequent crosses, a cascading scheme was used in which a female mouse from one cage would be mated with a male mouse from the next cage. This breeding design, in which each pair contributes exactly two offspring (one male and one female) to the next generation doubles the effective population size and reduces the random changes in allele frequency due to random genetic drift over the 10 generations. Animals were bred and housed under Animal Care and Research Ethics approvals N00/4-2003/1/3745, N00/4-2003/2/3745 and N00/4-2003/3/3745 from the University of Sydney.

Dissection and measurements

Request a detailed protocol

In total, 1003 AIL F14 mice were dissected. Of these, 933 had complete phenotypic data (475 males and 458 females). Phenotyping of each mouse including initial and fine dissections, determination of PFO status, and measurement of septal features, were all performed on the same day. As in the F2 study (Kirk et al., 2006), the thoracic organs including the heart, lungs, and mediastinum were initially dissected en bloc and stored in PBS. A tail biopsy was also taken from each mouse and snap-frozen for DNA extraction.

The following steps were performed under a Leica MZ8 dissecting microscope. The mediastinal organs were removed to expose the atria. Subsequently, the left atrium was opened to expose the atrial septum. We detected PFO by pressurization of the right atrium. The right-to-left passage of blood (or injected Orange G dye) across the inter-atrial septum indicated the presence of PFO. Measurement of septal features including FVL, FOW, and CRW was performed using an eyepiece graticule. FVL was defined formally, as in our previous study (Kirk et al., 2006), as the length of the flap valve from the edge of the crescent (proximal rim of the ostium secundum) to the distal rim of the fossa ovalis. The maximum width of the foramen ovale (foramen ovale width; FOW) was measured perpendicular to the FVL. Crescent width (CRW) was defined as the maximum width of the prominent crescent-shaped ridge, representing the proximal rim of the ostium secundum and edge of the flap valve as previously described (Kirk et al., 2006).

Normalization

Request a detailed protocol

A general linear model (PASW Statistics 18) was used to analyze the effect of various covariates on the traits of interest (FVL, FOW, and heart weight; HW) considering a p-value of <0.05 as significant (Supplementary files 9-13). FVL and FOW were significantly affected only by HW (p=0.002 and p=0.028, respectively) and the effect of other covariates including sex, age, body weight (BW), and coat color was not significant. However, we did not adjust for HW since QTL relevant to HW may also influence atrial septal morphology. On the other hand, HW was significantly affected by sex (p<0.001), age (p=0.032), BW (p<0.001), and coat color (p=0.042). Therefore, prior to sample selection and further analysis, HW was normalized for age, sex, and BW, although not for coat color so as to avoid missing QTL linked to coat color genes.

Sample selection

Request a detailed protocol

Selective genotyping of extreme phenotypes is an efficient method to increase the power of QTL mapping (Lander and Botstein, 1989). However, the benefits of this method decline with an increasing number of uncorrelated or weakly correlated traits. The focus of our study was to fine-map QTL underlying the main highly correlated traits (FVL and FOW) and a peripheral trait of HW. We did not consider CRW as a basis for sample selection since it was a less defined anatomical structure and, unlike FVL and FOW, it was not associated with PFO (see Results). For each trait (FVL, FOW, and HW), we selected approximately 100 F14 animals with extreme phenotypes. Given the overlap between the extreme phenotypes from different traits, 237 mice were selected by this method. To compensate for biases in selective genotyping in this study, the selection of extreme phenotypes was combined with a degree of random selection. Therefore, 163 mice were also selected randomly giving a total sample of 400 mice. The same number of males and females were selected from the breeding cages. Thus, the selected mice gave as equal as possible representation of males, females, and breeding cages.

Marker selection

Request a detailed protocol

We searched genotype data from the Mouse HapMap project for potential informative single nucleotide polymorphisms (SNPs) between 129T2/SvEms and QSi5 strains. In total, a set of 135 markers with an average interval of 2 centimorgans (cM) was selected to genotype genomic regions including the significant QTL from the F2 study (three QTL for FVL and three for FOW) and one QTL for HW (Supplementary file 14). In addition, a suggestive QTL for FOW on MMU9 (LOD = 3.43) was included in the AIL study as its peak covered the known ASD/VSD gene Tbx20 (Kirk et al., 2006). Fifteen extra markers were chosen to cover the whole peak of this QTL. Subsequently, a total of 150 markers were genotyped in the selected mice (Supplementary file 14).

Genotyping

Request a detailed protocol

Genomic DNA was isolated from mouse tails using a Macherey-Nagel Nucleospin kit according to the manufacturer’s guidelines. Prior to genotyping, we verified the informativeness of the markers for parental strains. Subsequently, the markers were genotyped in the selected mice using iPLEX MassArray assay following the manufacturer’s protocol.

Linkage analysis

Request a detailed protocol

We performed interval mapping linkage analysis for the quantitative traits at 0.25 cM intervals using a maximum likelihood method implemented in R software (see code availability statement below). The method used was initially developed for an F2 design, but modified for an AIL using the methods described by Darvasi and Soller, 1995. Given the density of selected markers and the size of the genomic region covered by markers, a LOD score of 2 was set as the threshold level of significance (Lander and Botstein, 1989).

We used the 1-LOD drop-off to estimate the confidence interval of each QTL (Lander and Botstein, 1989). However, in some cases, two significant peaks were located close together and overlapped in the 1-LOD drop-off intervals. To determine whether these peaks reflected the same underlying QTL or identified the presence of independent QTL, we re-ran the linkage analysis using a model in which the marker closest to the higher peak was included as a fixed term (Figure 3—figure supplement 1), effectively a simplified form of composite interval mapping but applied in a maximum likelihood framework (Kearsey and Hyne, 1994; Wu and Li, 1994). Disappearance of the lower peak would indicate that two peaks represented a single QTL. On the other hand, if the lower peak remained significant, two peaks would represent two separate QTLs.

We have previously developed a QTL program in R to perform linkage analysis for PFO as a binary trait (presence or absence) (Moradi Marjaneh et al., 2012). For binary analysis of AIL data, the linear model used for the quantitative analysis was replaced by a generalized linear model in the form of a logistic regression model, as described previously (Moradi Marjaneh et al., 2012).

Additive and dominance effects given in Supplementary file 3 were used to calculate QTL effect size information. The additive effect for each QTL (a.qtl) was defined as the effect of one copy of the Q (QSi5) allele and calculated as (mean.QQ - mean.qq)/2 where mean.QQ is the phenotypic mean for genotype QQ and mean.qq is the phenotypic mean for genotype qq (129T2/SvEms allele). The dominance effect (d.qtl) for each QTL was defined as the difference between the mean.Qq (phenotypic mean for genotype Qq) and the midpoint between the qq and QQ means. Therefore, if d.qtl >0, allele Q is (partially) dominant as Qq is moved towards QQ and if d.qtl <0, allele q is (partially) dominant as Qq is moved towards qq. Attributable phenotypic variance for each QTL is the additive effect of the QTL, expressed as a percentage of the difference between parental means for that trait.

Conversion of genetic maps

Request a detailed protocol

We previously defined the genetic positions of the F2 markers using an older mouse genetic map developed by the Whitehead Institute and the Massachusetts Institute of Technology (MIT) (Dietrich et al., 1996). Thus, we converted the genetic position of F2 markers into those for the current Mouse Genome Informatics (MGI) map developed at the Jackson Laboratory (Bult et al., 2008) using mouse maps introduced by Cox et al., 2009. For inter-marker intervals, a linear interpolation was used to convert old genetic positions to new ones.

Whole genome sequencing of the AIL parental strains

Sequencing

Request a detailed protocol

Genomic DNA was extracted from tail biopsy specimens of QSi5 and 129T2/SvEms mice using the phenol-chloroform extraction method. Library preparation and sequencing were performed at the Kinghorn Centre for Clinical Genomics. 200 ng of DNA was mechanically fragmented using LE220 Covaris (Covaris, Woburn, USA) to approximately 450 bp inserts followed by library preparation using the Seqlab TruSeq Nano DNA HT kit (20000903, Illumina, San Diego, USA). Library preparation was performed according to the manufacturer’s instructions which included end repair, library size selection, 3′ end polyadenylation, adaptor ligation, purification of ligated fragments, enrichment of ligated fragments by PCR, and a final purification of amplified DNA library. Library fragment sizes were reviewed using the LabChip GX (Perkin Elmer, Waltham, USA) to evaluate library size and adapter dimer presence before the library concentration was determined via quantitative real-time PCR using the KAPA library quantification kits (KK4824, Roche, Basel, Switzerland) on the QuantStudio7 or ViiA7 (Thermo Fisher Scientific, Waltham, USA). Normalized DNA libraries were clustered on Illumina cBot and then sequenced using Illumina HiSeq X Ten platform using HiSeq X Ten Reagent Kit v2.5 kits (FC-501–2501, Illumina). Paired-end sequencing was performed using the 2 × 150 bp chemistry to achieve an average output of approximately >120 Gb of data per library.

Bioinformatic analysis

Request a detailed protocol

The sequencing reads were mapped to the mouse reference genome (NCBI38/mm10) using Burrows-Wheeler Aligner (BWA-mem v0.7.15) (Li and Durbin, 2010). Variants were called using the genome analysis toolkit (GATK; v3.5) pipeline following best practice guidelines (van der Auwera and O’Connor, 2020). This involved marking duplicate reads with the Picard toolkit (Broad Institute GitHub Repository: http://broadinstitute.github.io/picard/) (RRID:SCR_006525 version 2.21.3), indel realignment with GATK using known SNPs and indels from C57BL/6 dbSNP142 (ftp://ftp-mouse.sanger.ac.uk/current_indels/strain_specific_vcfs/), Base quality score recalibration and final variant calling was performed using the GATK Haplotype Caller. Variant call format (VCF) files were merged using VCFtools (v0.1.14) (Danecek et al., 2011) and annotated with ANNOVAR (Wang et al., 2010). A set of high-quality variants was identified by filtering variants for those with a minimum sequencing depth of eight and a maximum estimated false discovery rate (FDR) of 10%. Variants were filtered to only consider those that were homozygous in one mouse line with either the reference or a heterozygous call in the other line. Zygosity information from each mouse line was extracted using VCFtools and custom scripts.

To analyze protein-coding variants, variants were filtered for feature annotations ‘exonic,’ ‘exonic;splicing,’ or ‘splicing,’ with variants annotated with effect class ‘synonymous SNV’ filtered out. Variants were filtered for genes expressed in the septal time course RNA-seq (see below), where a gene was defined as expressed if it had a counts-per million (CPM) >1 in at least two samples. They were then converted to mm10/GRCm39 coordinates and submitted to the Ensembl Variant Effect Predictor (VEP) tool (McLaren et al., 2016). Based on VEP annotations, variants were classified as high-impact (VEP impact classification of ‘HIGH’), deleterious missense (SIFT (Ng and Henikoff, 2003) ‘deleterious’ prediction), or non-high-impact deletions. High-impact variants were manually inspected in IGV and filtered out if the ‘HIGH’ impact classification was deemed inaccurate (for example, cases where an adjacent variant negated a predicted stop gain or a substitution was incorrectly called as a deletion). Finally, the list of high-confidence potential-pathogenic variants was filtered for variants falling within QTL coordinates (LOD >1).

Transcriptome analysis of cardiac interatrial septum in mice

Microdissection of the atrial septum region was performed on embryos from mouse strains (QSi5, 129T2/SvEms) at E12.5, E14.5, and E16.5 (six septa per mouse strain per time point). Embryos were dissected from the uterus in ice-cold PBS under a dissecting scope (Leica MZ8) using micro-tweezers. Once the heart was harvested from the embryo and the pericardial membrane removed, the atrial and ventricular regions were separated. Then, the atrial appendages were removed from the atrial part and the remaining mesenchymal tissue from the atrioventricular canal was further trimmed. The remaining tissue, containing the atrial septum, was stored at –80 °C in RNAlater solution (Ambion). Total RNA was purified using miRNeasy Micro Kit (Qiagen). Two technical replicates per mouse strain per time point were created by pooling RNA from three dissected septa each.

RNA sequencing and data analysis

Request a detailed protocol

RNA libraries were made using TruSeq stranded RNA Library Preparation Kit (Illumina) and sequenced on a HiSeq2500 Illumina sequencer to a depth of ~23 million paired-end reads per sample. Sequencing reads were trimmed to remove poor quality sequence and adaptors using Trimmomatic (v0.35) (Bolger et al., 2014) using parameters LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:33. Sequencing reads were aligned against the mouse reference genome (GRCm38/mm10) using RNA STAR (v2.5.1) (Dobin et al., 2013). Gene counts were assembled using SummariseOverlaps (v1.30.0) (Lawrence et al., 2013) against Gencode release M4 and filtering to only include those whose expression was at least 1 count per million (CPM) in at least two samples. Differential expression analysis of RNA-seq data was performed using DESeq2 (v. 1.14.1) (Love et al., 2014). For differential expression analysis, each time point was analyzed individually. Differentially expressed genes were calculated between 129T2/SvEms and QSi5 using the DESeq function with default parameters, with an adjusted p-value cut-off of 0.05 and absolute log2 fold-change difference of 0.5 used as thresholds for significance. GO term analyses of differentially expressed genes were performed using the PANTHER web service (Mi et al., 2017) with a false discovery rate cut-off of 0.05 used to assess significance. Transcript isoform quantifications were performed on the Fastq files using Salmon (Patro et al., 2017), against the Gencode mm10 transcript reference (vM10) with default parameters. Transcript abundances were read into DESeq2 using the tximport package (Soneson et al., 2015). Transcripts were filtered and retained for differential expression analysis if they had a CPM >1 in at least four samples across the strains and time points. DESeq2 testing of differences between strains while accounting for time point was performed using a design of ‘~strain + time point’.

Network analysis

Request a detailed protocol

Network analysis of DEGs was performed using protein-protein association connections obtained from the STRING (Szklarczyk et al., 2017) version 10 database, considering connections with a combined score greater than 700. Permutation testing (100,000 permutations) was applied to determine whether the size and complexity of the networks (determined by the number of edges and clustering coefficient, respectively) were greater than expected by chance. For each permutation, a random set of genes was selected from the set of expressed genes in the RNA-seq for the relevant time point, and network metrics re-calculated. Edge counts and clustering coefficients were calculated using NetworkX (Hagberg et al., 2008). Empirical p-values were then determined as the proportion of random networks achieving network metrics equal to, or higher, than the DEG networks.

Splicing analysis of Tbx20

Request a detailed protocol

Tbx20 variants were examined for splicing defects using Spliceogen (Monger et al., 2019) with default parameters. To evaluate potential splice differences between the strains from the RNA-seq we used the splice junction (SJ) files produced by the STAR 2-pass alignment (Dobin et al., 2013). For each time point, we identified junctions overlapping the Tbx20 gene (matched for strand) that were supported by over 10 uniquely mapped reads in both strains. The junction coordinates were then compared between the strains using the GenomicRanges findOverlaps function (Lawrence et al., 2013), with the parameter type set to ‘equal’.

Analysis of SMAD6 variant

Plasmids

Pcs2-SMAD6 (#14960) and BRE-luc (#45126) were purchased from Addgene. FLAG tag was added to SMAD6 using the following primers:

  • Forward (F): 5′GATCGACTACAAGGACGACGATGACAAGG 3′;

  • Reverse (R): 5′GATCCCTTGTCATCGTCGTCCTTGTAGTC 3′.

  • Mutagenesis primers: R281P - F 5′CTCCCTACTCTCCGCTGTCTCCTCG 3′;

  • R 5′CGAGGAGACAGCGGAGAGTAGGGAG 3′;

  • Y279A - F 5′CTCCGCCACCTCCCGCATCTCGGCTGTCTC 3′;

  • R 5′GAGACAGCCGAGATGCGGGAGGTGGCGGAG 3′;

  • G471S - F 5′CATCAGCTTCGCCAAGAGCTGGGGGCCCTG 3′;

  • R 5′CAGGGCCCCCAGCTCTTGGCGAAGCTGATG 3′.

Cell lines

Request a detailed protocol

HEK293T cells (ATCC; https://www.atcc.org; short non-tandem repeat authentication January 19, 2023, 100% match to reference; mycoplasma-negative) were used.

Cell culture

Request a detailed protocol

HEK293T cells were maintained in a DMEM medium containing 10% FCS in a humidified incubator at 37 °C at 10% CO2. 80,000 cells were seeded in 12-well plates for luciferase assays and 100,000 cells were seeded in six-well plates for protein extraction. BMP4 ligand (Gibco, #PHC9534) was added to cells at 100 ng/ml. Transfections were carried out using Lipofectamine 3000 (Thermo Fisher Scientific) according to the manufacturer’s instructions.

Luciferase assay

Request a detailed protocol

HEK293T cells were transfected with 100 ng of luciferase reporter constructs, 150 ng of expression vectors or empty vector and 2.5 ng of the TK-Renilla. Cells were treated with BMP4 overnight on the day of transfection. Assays were performed 24 hr after transfection. Dual-luciferase assays were performed as per the manufacturer’s instructions (Promega, #E1980). Firefly luciferase activity was normalized to Renilla luciferase activity.

Co-IP and western blot

Request a detailed protocol

HEK293T cells were transfected with 1 µg of hSMAD6-WT, hSMAD6-R281P, hSMAD6-Y279A, or hSMAD6-G471S. Cells were treated with either DMSO or MG132 (Sigma Aldrich, #M7449) 8 hr prior to protein extraction. Cells were lysed 48 hr after transfection using whole cell extract buffer (20 mM HEPES, 420 mM NaCl, 0.5% NP-40, 25% glycerol, 0.2 mM EDTA, 1.5 mM MgCl2, 1 mM PMSF, and protease inhibitors). Cells were washed in PBS and lysed with 250 µl WCE buffer for 10 min on ice. Lysed cells were scraped off the six-well plates and homogenized with a 25 Gauge needle 10 times. The lysates were centrifuged for 30 min at 4 °C. The supernatant was precleared with Protein G (Thermo Fisher Scientific) for 1 hr at 4 °C, then incubated with anti-FLAG M2 antibody (Sigma Aldrich, #F1804) at 1:150 dilution or an equal amount of mouse IgG overnight at 4 °C. The lysates were incubated with Protein G beads for 2 hr at 4 °C, then washed in WCE buffer four times. Protein was eluted in 4 x sample buffer (Biorad) for 5 min at 95 °C and loaded onto TGX stain-free precast gels (Biorad). Western blots were carried out using the following antibodies: anti-ubiquitin (1:1000, Santa Cruz Biotechnology, sc-166553), anti-FLAG (1:1000, Cell Signaling Technology, #14793), anti-SMAD6 (1:100, Santa Cruz Biotechnology, sc-25321).

Variant enrichment analyses

Request a detailed protocol

We examined the overlap of high-quality homozygous 129T2/SvEms vs QSi5 variants and genomic features (enhancers, promoters, 5'UTRs, exons, introns, and 3'UTRs) of DEGs and non-DEGs located under QTL 1-LOD drop-off regions. Promoters were defined as 1000 bp upstream and 100 bp downstream of a known transcription start site and enhancers were defined as H3K4me1+ and/or H3K27ac+, and H3K4me3 regions within increasing bin sizes of 50-250 Kb upstream and downstream of the transcriptional start site in ES cell-derived cardiac progenitors and cardiomyocytes, and E14.5 fetal hearts (Wamstad et al., 2012). Accession numbers for histone mark data were: H3K4me1 – ENCSR000CDL; H3K27ac – ENCSR000CDK; HK4me3 – ENCSR000CDM. Variant enrichment within each feature was defined as the proportion of variants that overlapped the feature, with normalization for the proportion of coverage of the feature within the QTL 1-LOD drop-off. To test the significance of the enrichment, background controls were generated by permuting each genomic feature of DEGs and non-DEGs under QTL, respectively, by randomly selecting matched size blocks (according to the tested property) under QTL regions. This process was repeated 1000 times using bedtools shuffle (version 2.25.0). For each genomic feature, the DEG/non-DEG regions were excluded from the pool of regions used to generate random interval sets. The overlap of the variants and genomic features of DEGs and non-DEGs under QTL was compared to the mean overlap for 1000 random interval sets and empirical p-values were calculated by dividing the number of random interval sets showing equal or greater overlap than the observed by 1000.

Code availability

Request a detailed protocol

The codes used to perform linkage analysis are available at https://github.com/MahdiMoradiMarjaneh/AIL (copy archived at Marjaneh, 2023).

Data availability

Sequencing data have been deposited in the ArrayExpress database at EMBL-EBI (https://www.ebi.ac.uk/arrayexpress) under accession codes E-MTAB-11161 (DNA-seq) and E-MTAB-10929 (RNA-seq).

The following data sets were generated

References

    1. Feldt RH
    2. Avasthey P
    3. Yoshimasu F
    4. Kurland LT
    5. Titus JL
    (1971)
    Incidence of congenital heart disease in children born to residents of Olmsted County
    Mayo Clinic Proceedings 46:794–799.
  1. Conference
    1. Hagberg AA
    2. Schult DA
    3. Swart PJ
    (2008)
    Exploring network structure, dynamics, and function using NetworkX
    7th Python in Science Conference (SciPy2008).
    1. Kearsey MJ
    2. Hyne V
    (1994) QTL analysis: a simple 'marker-regression' approach
    TAG. Theoretical and Applied Genetics. Theoretische Und Angewandte Genetik 89:698–702.
    https://doi.org/10.1007/BF00223708
  2. Book
    1. Larsen WJ
    (1997)
    Human Embryology (2nd edn)
    United Kingdom: Churchill Livingstone.
    1. Moore KL
    2. Persaud TVN
    (1998)
    The Developing Human
    349–362, Clinically oriented embryology, The Developing Human, Philadelphia, PA, WB Saunders and Company.
  3. Book
    1. van der Auwera GA
    2. O’Connor BD
    (2020)
    Genomics in the Cloud: Using Docker, GATK, and WDL in Terra (1st edn)
    USA: O’Reilly Media, Inc.
  4. Book
    1. Zipes DP
    (2005)
    Braunwald’s Heart Disease: A Textbook of Cardiovascular Medicine
    Elsevier.

Decision letter

  1. Aldons J Lusis
    Reviewing Editor; University of California, Los Angeles, United States
  2. David E James
    Senior Editor; University of Sydney, Australia

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Quantitative trait and transcriptome analysis of genetic complexity underpinning cardiac interatrial septation in mice using an advanced intercross line" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and David James as the Senior Editor. The reviewers have opted to remain anonymous.

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

All three reviewers judged this to be a comprehensive, well-designed study that will enhance our understanding of the genetics of septal defects. The reviewers' comments related primarily to the presentation of the results, in several cases requesting a further discussion of the limitations of the findings.

Reviewer 1 suggests that functional validation would greatly strengthen the manuscript. While I agree, the present results are very substantial and do provide a better overall understanding of the complexity of the trait in addition to candidates and hypotheses for future studies.

Reviewer 2 had several criticisms. I felt that points 2, 3, and 10 should be addressed, and I would be interested to know how the authors feel about all of the points. Some of the suggestions, such as the identification of rare variants that have arisen since during the development of the advanced intercross lines (point 5) would be very difficult, but the authors might expand on the point in the discussion.

Reviewer 3 also had several comments. Points 1 and 2 should be addressed. Points 3 – 5 deal with additional transcriptomic and functional analyses. I think the suggestions are excellent and the analyses would strengthen the paper, but given the extensive analyses reported I do not want to insist on additional experiments. The authors should, however, discuss these points in terms of limitations.

Reviewer #1 (Recommendations for the authors):

A limitation of the study in its current form is that it does not develop an experimental example of using the data set on gene regulation network analysis in Figure 7-8 to make predictions and then test them. For example, the author discusses changes to ubiquitin C, a prominent down-regulated hub gene. However, the functional significance of these predicted interactions is not experimentally tested. To demonstrate that one or more of these genes (Znrf3, Ube4a, Akt1, and Prkaca) are directly involved in the developing atrial septum would significantly increase enthusiasm for the paper.

Reviewer #2 (Recommendations for the authors):

1. The AIL analysis was limited to QTL previously found in F2 analysis, where the selected markers covered only a limited part of the genome. WGS and RNA seq were utilized to validate QTL genes of known role in CHDs, missing an important opportunity for identifying new QTL and novel gene discovery in PFO/CHDs.

2. Authors stated that "the possibility that FVL- and FOW- QTL could be explained by variation in HW or BW has not been formally excluded. Markers of normalized heart weight were included in AIL, like MMU11". However, in [Table 1] HW and BW were listed as independent traits. The manuscript would benefit from using HW/BW ratio and examining HW/BW association with PFO parameters, instead of using HW and BW separately as independent traits.

Furthermore, SEX and Age were not included. Adding these co-variants in their comparison between F2 and F14 in Table 1 would be important.

3. Page 13. Line 231: "The AIL results for each chromosome of quantitative traits and binary PFO are compared in figure 3". The cited figure should be [figure 4] instead of [figure 3], or [both Figure 3 and figure 4].

4. page 16, lines 277-280. "MMU9 QTL is a suggestive QTL [LOD=3.4] that was identified in F2 study and resolved into 4 QTL in the current F14 study [Figure 3 E and Figure 4 E], with significant overlapping FOW, FVL, and PFO peaks. Importantly, TBX20, a cardiac transcription factor gene involved in septal defects, is located within the 1-LOD region of the first QTL". However, only synonymous variants were found in TBX20 (based on WGS analysis) (page 20, line 368).

Given the importance of TBX20 in PFO and the suggestive data, it is worthy to list TBX20 variants and examine/discuss potential miss-classification of these variants, or potential cryptic splice site variants in TBX20 that could be missed on initial pipeline analysis.

5. Page 19. Line 340-341: "regions with low SNP density may contain rare variants that have arisen since establishment of the parent strains".

I. Why were these potentially rare variants dismissed in the current manuscript?

II. Since the prevalence of PFO in F14 increased by ~ 2 folds, compared to F2, it would be important to include the rare variants (Homozygous or heterozygous) that might have arisen since the establishment of parental strains, and ca potentially have large effects to explain the increased prevalence at F14.

6. How did the filtered deleterious variant segregate with extreme PFO phenotypes? Can they explain the higher prevalence of PFO in 129T2/SvEms (high PFO) compared to the QSi5 (low PFO) mouse strain?

7. Majority of the highlighted genes, including LRP2, SMAD6, and SIK3, are affected with 1 or 2 deleterious variants, suggesting a potential monogenic effect caused by rare variants.

I would recommend listing the minor allele frequency (MAF) of these variants and other characteristics of the variants (variant type, Sift Score, Splice prediction) in a new summary Table.

8. SMAD6 variant. The gene was identified in the QSI5-QTL (this strain exhibits low PFO). Based on the functional evidence (cell culture), this variant is damaging and may alter SMAD6 function in several ways.

I. Based on the data and discussion, it is difficult to tell whether the SMAD6 variant is contributing or protective in the PFO phenotype.

II. Again, this seems to be a monogenetic effect.

III. How does SMAD6, or its downstream effectors, contribute to the interactome network of E14.5? is it expressed in the Atrial septum during development? Does the variant influence network interactions?

9. Page 27. figure 7. PCAs for coding vs noncoding. The noncoding transcript showed a different pattern of variation (strain vs developmental stage). Since they were not included in the downstream analysis, I would suggest removing Figure 7B from the current manuscript.

10. Transcriptome analysis/Network analysis [fig7.E, Figure 8]. Adding genes affected with variants to network analysis of the DEGs may confound network structure, as no evidence was shown to support the impact of variants on their gene expression. Instead, I would suggest the following:

I. Discuss why E14 revealed the highest # of DEGs and more significant networks.

II. What was the expression status of genes with pathogenic variants that were expressed at each time point?

III. The finding that DEGs in QTL regions were enriched in pathogenic variants led to the speculation of the functional impact of some QTL on septal traits will involve collective effects of multiple variants (polygenic effect). However, the role of the list of genes carrying pathogenic variants in septum morphology has not been examined mechanistically, including smad6, in vivo. Therefore the evidence does not support the conclusion.

IV. Page 36. Line 645-654. Variants overlapping with the noncoding element.

What is the significance of this finding? Any further evidence of deleterious impact or causality?

Reviewer #3 (Recommendations for the authors):

1. In Figure 6, the authors filter a large number of identified variants between the AIL parental strains using a logical pipeline. The authors ultimately intersected potentially pathogenic variants under septal QTL regions with genes according to prior association with cardiovascular phenotypes (Mouse Genome Informatics; MGI) or link to human CHD from a curated list of high-confidence and emerging CHD genes. Were CHD or MGI genes enriched under QTL regions?

2. The examples of high-impact variants are highly subjective other than smad6, which the authors validate. The caveats underlying the description of unvalidated specific variants should be more explicitly noted, as few if any of them have been directly implicated in atrial septation.

3. Figure 8 – RNA-seq is performed beginning at E12.5, after the formation of the PAS, MC, and DMP. Thus, it is probably most appropriate to limit the implication of genes differentially expressed between the two mouse strains at these later time points to the formation of the septum secundum. While the formation of the SS is critical for FO closure, the limited applicability of the RNA-seq data for descriptions of atrial septation should be mentioned. Alternatively, the authors could perform additional RNA-seq experiments beginning around E9.5 in the pSHF and common atrium to include genes associated with all atrial septum structures as they form, or obtain and analyze previously published RNA-seq data from these regions.

4. The authors' analysis of the atrial septum transcriptome data jumps quickly to considering stain-specific DEGs. However, this is the first transcriptional profiling of the later stages of the atrial septum, as far as I know. A more considered baseline analysis should be performed before jumping into the strain-specific analysis. What does the transcriptome alone say about atrial septation? How does the atrial septum transcriptome compare to the atrial free wall or other regions of the heart (by comparison with RNA-seq from the authors or others)? How do the pathways that are septum-enriched compare to those uncovered in the stain-specific septum analysis?

5. Figures 7 & 8 – the authors implicate the BMP, ubiquitination, Akt, and PKA pathways in PFO based on gene expression differences between 129T2/SvEms and QSi5 mouse lines from E12.5-E16.5. These findings would be greatly strengthened if the authors could demonstrate functional differences in the activities of these pathways in atrial septum tissues in vivo at these time points. This would not be necessary for the publication of this already detailed manuscript and could be discussed as an avenue for follow-up.

6. The observation that differentially expressed QTL genes are enriched for genomic variants is an important observation. The authors used a very conservative relationship between candidate regulatory regions and genes, described as "within 50kb upstream of the transcriptional start site of relevant genes". Given that enhancers demonstrate no directional bias, their capture could be increased to include regions both upstream and downstream of a gene's TSS. Also, 50kb is quite conservative, and their capture may be increased by enlarging the considered regions to 100 or 250 kb. Given the conclusion that there is a "strong mechanistic link between variant architecture and differential expression of genes under QTL", the authors should present the gene sets and pathways that are enriched in this analysis.

7. Figure 10 – The authors could consider an analysis of variant conservation between human and mouse to this figure to interrogate the relevance of identified loci for human ASD/PFO. For variants in and around DEGs underlying QTL, what proportion in coding regions are conserved? What proportion in non-coding regions?

https://doi.org/10.7554/eLife.83606.sa1

Author response

Essential revisions:

Reviewer #1 (Recommendations for the authors):

A limitation of the study in its current form is that it does not develop an experimental example of using the data set on gene regulation network analysis in Figure 7-8 to make predictions and then test them. For example, the author discusses changes to ubiquitin C, a prominent down-regulated hub gene. However, the functional significance of these predicted interactions is not experimentally tested. To demonstrate that one or more of these genes (Znrf3, Ube4a, Akt1, and Prkaca) are directly involved in the developing atrial septum would significantly increase enthusiasm for the paper.

Thank you for this comment. Beyond our in vitro analysis of the Smad6 variant, we agree that additional testing of DEGs and genes carrying predicted pathogenic variants would be desirable to determine their unique contribution to atrial septal development in mouse and whether they contribute to septal defects in humans. However, I am sure the reviewers appreciate, this would require analysis of possibly many individual variants or groups of variants genetically (for example using CRISPR/Cas9). Furthermore, whereas it seems evident that biosynthetic processes affecting mitochondria, ribosomes, ECM, chromatin and ubiquitin-related processes underpin differences in atrial septal development between founder strains (a primary finding of this paper), it is not immediately obvious which individual genes are the hub driver genes that would be the most appropriate candidates for genetic manipulation. Furthermore, whereas downregulation of ubiquitin or kinases AKT and PKA, and a host of their target genes, may contribute to the defects, manipulating these genes will likely have broad and general effects on atrial cell viability, proliferation and/or interactions, and the complex interactions between variant pathways would be lost. Unravelling higher level network control would certainly be a worthy problem in cardiac system genetics, potentially relevant to other diseases involving disrupted morphogenesis; however, as noted by the handing editor, such studies are well beyond the scope of this current paper. We have added a note to the last paragraph of the Discussion (line 756) to acknowledge these issues:

“Our results provide the first high-resolution picture of genetic complexity and multilayered network liabilities underpinning atrial septal variation in a mouse model. Elucidating the impacts of genetic variants under QTL at the individual gene level would be a daunting task; however, understanding higher level network control of septal development underpinning observed defects in septal macromolecular synthesis pathways is an important cardiac developmental systems genetics problem, which may have relevance to other congenital dysmorphologies.”

Reviewer #2 (Recommendations for the authors):

1. The AIL analysis was limited to QTL previously found in F2 analysis, where the selected markers covered only a limited part of the genome. WGS and RNA seq were utilized to validate QTL genes of known role in CHDs, missing an important opportunity for identifying new QTL and novel gene discovery in PFO/CHDs.

We agree with this comment and had previously address it in the Discussion (line 662). The approach taken was partly dictated by efficiencies of time and funding. The lost opportunity notwithstanding, we would argue that the approach led to important discoveries underpinning atrial septal defects, such as the strong concordance of PFO and atrial septal traits (not evident in the F2 study), the compelling protein network associated with DEGs suggesting defects in biosynthesis of macromolecular structures, and findings related to variant distribution and gene architecture.

2. Authors stated that "the possibility that FVL- and FOW- QTL could be explained by variation in HW or BW has not been formally excluded. Markers of normalized heart weight were included in AIL, like MMU11". However, in [Table 1] HW and BW were listed as independent traits. The manuscript would benefit from using HW/BW ratio and examining HW/BW association with PFO parameters, instead of using HW and BW separately as independent traits.

Furthermore, SEX and Age were not included. Adding these co-variants in their comparison between F2 and F14 in Table 1 would be important.

Thank you for these interesting comments. Please note that in our previous paper on F2 mice (Kirk et al., 2006), QTL for atrial septal traits persisted after normalisation for HW, BW, sex, age, day of dissection, and coat colour. Whereas these parameters contributed to septal traits in both F2 and F14 mice, the effects were small and HW had no impact on PFO. Specifically, in the current study, the correlation coefficients (r) for comparisons between HW and BW, and septal traits, were very low (0.005 – 0.13) (Supplementary File 2). Note also that our analysis of HW QTL in F2 and F14 mice was determined after normalisation for age and sex. Importantly, our mapping results showed that HW and BW QTL were largely different from QTL for septal traits. Thus, with the caveat that mapping was not genome-wide in the F14 study, we can be very confident that QTL septal traits are not a function of variation in HW or BW. This is a very important result and, as we mention in the Discussion, HW QTL likely relates to ventricular growth as an independent parameter to development of the “primary” myocardial component of the heart from which the atrial septum derives.

We did not analyse the relationship between HW/BW ratio and PFO, nor use HW/BW ratio in QTL mapping. With respect, we do not think this effort is justified for several reasons. First, as noted above, there is only a weak contribution of HW and BW to septal traits. Furthermore, HW and BW are generally well correlated in mammals, except in pathology where there is adaptive hypertrophy and/or inflammation. The correlation coefficient between HW and BW in F14 mice was 0.64 (p<0.001; Supplementary File 2). Thus, for HW/BW ratios, these parameters will tend to cancel each other out.

To explain our initial interest in HW and BW further, QSi5 mice, which show a low prevalence of PFO (4.6%), were originally bred because they were highly fecund and showed rapid grow. Thus, they had considerably higher BW and HW than 129T2 mice. It seemed possible that genes (and even environmental factors) influencing overall growth rate would affect the size of the heart and that this would create “noise” in the specific septal traits such as FVL and FOW, which correlate with PFO. Thus, HW and BW were included in analyses of the septal traits not because we were interested in them per se, but because we wished to quantify their influences, hoping to formally exclude them as a trivial source of septal variation. Specifically, we wanted normalised measures of FVL, FOW etc independent of variation in HW and BW. After retrospective analysis of QTL for HW in F2 data revealed a number of HW (and BW) QTL, it was simple enough also to include HW as a variable in selection of F14 mice for QTL analysis of septal traits. This analysis of HW QTL allowed us to show that they largely do not overlap with FVL and FOW QTL, providing much stronger evidence that HW is not a significant confounder in our analysis of the genetic basis of atrial septal defects in 129T2 mice.

Finally, our quantitative genetics team (CM, ICAM, PCT) point out that most statisticians would shy away from using ratios rather than adjusted values in QTL analyses. That is, there is no real certainty that having a term β×HW/BW in the model would capture variation any better than β1×HW + β2×BW, particularly because the point is to adjust for the source of variation in estimating QTL effects, as noted above, not to explore the underlying functional form of any relationship. There is a long history of objections by statisticians to the use of ratios going back to Pearson (1897), with Kronmal 1993 (https://doi.org/10.2307/2983064) and Curran-Everett 2013 (https://doi.org/10.1152/advan.00053.2013) providing theoretical justifications.

In summary, whereas differential growth between mouse strains is clearly an interesting topic, and may have translational relevance in animal breeding, this is not the topic of this paper and would represent an unhelpful distraction from the primary focus of this work on septal traits. We hope this is acceptable to the reviewers. We have, however, clarified further specifically why HW and BW were of interest in the Results section “Selection of AIL mice – heart weight and body weight phenotypes” (line 198):

“Given that QSi5 mice were originally selected for their high fecundity and growth, it was evident that the quantitative septal parameters under study might be influenced by heart size and mass (Table 1) (Kirk et al., 2006), and indeed both FVL and FOW were significantly correlated with HW in both F2 and F14 cohorts, albeit that the effects were small (Supplementary File 2) (Kirk et al., 2006) and HW had no influence the likelihood of PFO.”

3. Page 13. Line 231: “The AIL results for each chromosome of quantitative traits and binary PFO are compared in figure 3”. The cited figure should be [figure 4] instead of [figure 3], or [both Figure 3 and figure 4].

Thank you, Figure 3 has been corrected to Figure.

4. page 16, lines 277-280. "MMU9 QTL is a suggestive QTL [LOD=3.4] that was identified in F2 study and resolved into 4 QTL in the current F14 study [Figure 3 E and Figure 4 E], with significant overlapping FOW, FVL, and PFO peaks. Importantly, TBX20, a cardiac transcription factor gene involved in septal defects, is located within the 1-LOD region of the first QTL". However, only synonymous variants were found in TBX20 (based on WGS analysis) (page 20, line 368).

Given the importance of TBX20 in PFO and the suggestive data, it is worthy to list TBX20 variants and examine/discuss potential miss-classification of these variants, or potential cryptic splice site variants in TBX20 that could be missed on initial pipeline analysis.

Thank you for this suggestion and we agree. We have modified the text first to be more specific about the number of variants detected across the Tbx20 gene, which we now list in a new Supplementary File 5. There were 321 variants in total – only 3 were exonic variants and these were synonymous as previously indicated. The rest were intronic; however, the VEP tool that we used for variant classification detected no variants overlapping known splice motifs and no cryptic splice motifs. We reran the analysis using Spliceogen, a recently described tool pipeline that integrates some of the best performing models for splice motif prediction and achieved the same results. Again, no splicing defects were detected. Spliceogen provides a ranking for cryptic donor/acceptor sites, however, these do not reflect real probabilities. We therefore analysed splice junctions detected in the RNA-seq but did not find confident differences between the strains. In addition, we used Salmon to map Tbx20 isoforms and perform differential expression analysis, but this approach also did not reveal differences in Tbx20 isoforms between strains. For future reference, all the above outputs are given in Supplementary File 5. We have updated the methods and inserted the following in the manuscript (line 383):

“For the Tbx20 gene on MMU9, a total of 321 variants were detected (Supplementary File 5); however, only three were exonic and these were synonymous. No defects in splice motifs or cryptic splice donor/acceptor sites were detected using the VEP tool. We reran the analysis using Spliceogen (Monger et al., 2019) and again no splicing defects were detected. Spliceogen provides rankings of possible cryptic splice sites (Supplementary File 5), however, these do not reflect actual probabilities. Therefore, we analysed splice junctions detected in RNA-seq data from the developing septum (see below) but did not find splice site differences for Tbx20 between parental strains. We also mapped Tbx20 transcript isoforms with Salmon (Patro et al., 2017) and performed differential expression analysis, again yielding no differences in isoform expression between strains (Supplementary File 5).”

5. Page 19. Line 340-341: "regions with low SNP density may contain rare variants that have arisen since establishment of the parent strains".

I. Why were these potentially rare variants dismissed in the current manuscript?

II. Since the prevalence of PFO in F14 increased by ~ 2 folds, compared to F2, it would be important to include the rare variants (Homozygous or heterozygous) that might have arisen since the establishment of parental strains, and ca potentially have large effects to explain the increased prevalence at F14.

Thank you for these comments. We think that use of the term “rare variants” when referring to variants that might have arisen in low SNP density regions since the establishment of the parental strains is confusing because it is generally used to refer to variants with low MAF in humans. However, they will be “rare” in the sense that if they did arise in the germline, they would tend to be lost by genetic drift. We neglected to acknowledge that they might also arise in regions of high SNP density as well. We have corrected this by referring to them as simply “variants” (line 352) and we have also modified the text to acknowledged that variants that have arisen since establishment of parental strains could in principle also contribute to septal defects:

“Regions with a high rate of polymorphism between parental strains (spanning nearly one-third of the genome but harboring more than 95% of the genetic variation) can significantly reduce the regions of interest underlying QTL (Wade et al., 2002), albeit that regions with high or low polymorphism may also carry variants that have arisen since the establishment of parent strains (Bloom et al., 2019). Although these would be rare, they could, in principle, also contribute to septal variation.”

Please note that in filtering for pathogenic coding variants, variants lying within regions of low SNP density were not excluded. Predicted pathogenic variants shown in Figures 6A,B are annotated with respect to their presence in high (starred) or low SNP density regions. We have now indicated in the text also the number and percent of filtered variants which lie in high density SNP regions (84/92; 91.3%; line 380) and the status of each variant is now also indicated in Supplementary File 4.

The 2-fold increase in PFO prevalence in F14 mice compared to F2 mice is interesting; however, the most obvious explanation for this is genetic drift during the cascade breeding of F14 mice. We think the increased PFO prevalence is unlikely to relate specifically to variants that have arisen since establishment of the parental strains for the reasons above and we have not attempted to focus specifically on such variants – we don’t believe that there is any a priori reason for doing this. As noted, predicted pathogenic variants in high (starred) and low density SNP regions are shown in Figure 6A,B.

6. How did the filtered deleterious variant segregate with extreme PFO phenotypes? Can they explain the higher prevalence of PFO in 129T2/SvEms (high PFO) compared to the QSi5 (low PFO) mouse strain?

Thank you for this comment. It is in fact difficult to say whether filtered variants “explain” the extreme PFO phenotypes, noting that cis-regulatory variants will likely be extremely important also, as we highlight later in the manuscript. Nonetheless, reflecting on the Reviewer’s comment, we reexplored the effect sizes for QTL under which the high-confidence pathogenic variants highlighted in the text lay (i.e. those predicted pathogenic and overlapping curated MGI and CHD lists). These variants were selected based principally on due diligence of the predicted impact of the variants (or multiple variants) and/or the involvement of the relevant gene in heart development and/or disease. We discovered that QTL for most highlighted variants had large effect sizes (Cep164: 65%; Dst: 5.68%; Sik3: 65%; Lrp2: 30%; Smad6: 58%). However, we noted that all bar one of these variants were associated with cryptic QTL (i.e. protective of septal defects), including the Smad6 variant. We also found that variants were often found in the unexpected strain give the direction (normal or cryptic) of the QTL. This highlights that there are highly complex interactions between alleles within and between QTL in parental strains and in individual AIL mice. At this stage, we prefer not to make assumption about the network model, which contains multiple interacting elements, albeit a default model where one variant is responsible for each QTL, seems naïve. We feel that trying to segregate highlighted variants with extremes of phenotypes will likely lead us for the moment down an unproductive path. To provide more clarity in the text and follow on from the new findings, we have indicated in the Results section the effect size of the QTL under which the highlighted variants lay (pages 18-20) and added a section in the Discussion about possible implications of our findings for the genetic model (pages 33-34).

“Our results are consistent with Fisher's 'infinitesimal' model whereby many loci (theoretically an infinite number) could each contribute to a small part the liability for common disease (Fisher, 1918), and also GWAS analysis of human disease that most often detect common risk loci of small effect. However, as in our previous study (Kirk et al., 2006), individual septal QTL can have significant effect sizes, demonstrating the power of the AIL approach to discern alleles of major impact. Indeed, all five predicted pathogenic coding variants highlighted in this study, including the Smad6 variant, lay under QTL of moderate to high effect size (5.68-65%). It is possible that a limited number of high impact variants are responsible for septal defects in certain AIL mice. This type of polygenic inheritance has been demonstrated recently in a human cardiomyopathy family (Gifford et al., 2019). Such variants may prove to have causative or modifying effects on atrial septal defects in humans, as exemplified by SMAD1.

Several findings speak to the complex genetic landscape of PFO. It is interesting that twelve out of the 37 QTL detected were cryptic QTL, with many being of moderate to large effect size (~5-70%; Supplementary File 3). Thus, cryptic QTL are frequently detected and individually appear to provide strong buffering effects against septal defects. Cryptic QTL have been detected commonly in other animal and plant QTL studies (Rieseberg et al., 1999). The mechanisms of their action would be interesting to identify as they may reveal pathways that protect against CHD more broadly.

A further complexity is that, of five genes carrying predicted pathogenic variants highlighted in this study, four were found in the unexpected strain. For example, Cep164 lies under a cryptic QTL (protective), whereas predicted pathogenic variants in Cep164 were found in the 129T2/SvEms strain (showing worse septal status). The inverse can also apply – for example, Lrp2 lies under a normal QTL, but variants were found in the QSi5 strain. Such findings suggest highly complex interactions within and between individual QTL. As noted, 26% of predicted pathogenic genes, some known to be involved in heart development, carried multiple predicted pathogenic variants (Cep164 carried seven). It would be reasonable to imagine that these alleles are null or have major impacts on gene function. Genetic compensation may also buffer the impact of such variants (El-Brolosy et al., 2019).”

7. Majority of the highlighted genes, including LRP2, SMAD6, and SIK3, are affected with 1 or 2 deleterious variants, suggesting a potential monogenic effect caused by rare variants.

I would recommend listing the minor allele frequency (MAF) of these variants and other characteristics of the variants (variant type, Sift Score, Splice prediction) in a new summary Table.

We agree that the function of genes predicted as pathogenic and carrying multiple variants may be severely compromised. See response to Comment 6 above. Please note that the general classification of variants (High Impact; Deleterious [SIFT]; non-frameshift deletion) is recorded in Supplementary File 4.

8. SMAD6 variant. The gene was identified in the QSI5-QTL (this strain exhibits low PFO). Based on the functional evidence (cell culture), this variant is damaging and may alter SMAD6 function in several ways.

I. Based on the data and discussion, it is difficult to tell whether the SMAD6 variant is contributing or protective in the PFO phenotype.

II. Again, this seems to be a monogenetic effect.

III. How does SMAD6, or its downstream effectors, contribute to the interactome network of E14.5? is it expressed in the Atrial septum during development? Does the variant influence network interactions?

Smad6 is expressed in the developing atrial septum and the Smad6 variant is an excellent candidate for being protective against septal defects (i.e. underlies a cryptic QTL and appears in the QSi5 strain with robust septal qualities). We clarify this at the end of the Smad6 section on page 23:

“Because the Smad6 variant lies under a cryptic QTL of high effect (58%) and is seen in the QSi5 strain (with robust septal qualities), it is an excellent candidate for being protective against atrial septal defects. We confirmed Smad6 expression in the developing interatrial septum (see below).”

It is unclear precisely how the variant will affect SMAD6 function (and therefore control of the BMP pathway) in vivo, noting that it still has repressive activity on the BMP pathway assay in vitro. However, it is possible that it may be an allele of strong effect (we avoid the term “monogenic” as it would be difficult to claim this, even for alleles of strong effect). We currently don’t know how it affects the differential expression of septal genes between strains and associated protein:protein interaction networks.

9. Page 27. figure 7. PCAs for coding vs noncoding. The noncoding transcript showed a different pattern of variation (strain vs developmental stage). Since they were not included in the downstream analysis, I would suggest removing Figure 7B from the current manuscript.

We believe the reviewer is referring to PCA plot using non-coding genes in Figure 8B. As suggested, this has been removed and numbering of other Figure 8 panels and text adjusted accordingly.

10. Transcriptome analysis/Network analysis [fig7.E, Figure 8]. Adding genes affected with variants to network analysis of the DEGs may confound network structure, as no evidence was shown to support the impact of variants on their gene expression. Instead, I would suggest the following:

We agree that the presence of pathogenic protein coding variant (red nodes) in the E14.5 STRING network is not helpful, so have reconfigured the networks without them and adjusted the text and legend accordingly. Deleting the red nodes did not have a major effect on the networks (new Figure 8 and Figure 8—figure supplements 1 and 2). We also inserted a note at the end of this section on page 25 highlighting the signalling kinase SRC, which forms a strong hub at E12.5, and also that there were very few DEGs that contained predicted pathogenic coding variants:

“Interestingly, along with hubs for Ubc, collagen/ECM and translation, an additional hub centred on Src, encoding a proto-oncogene tyrosine kinase involved in embryonic development and cell growth, was also evident at E12.5, but not at later stages (Figure 8—figure supplement 1). Across all stages, there were very few DEGs that were predicted to carry pathogenic coding variants, and none appeared as hub genes, placing network genes downstream of functional variants.”

I. Discuss why E14 revealed the highest # of DEGs and more significant networks.

Thank you for challenging us with this question. It is possible that the most significant STRING network of DEGs between strains occurs at E14.5 (subnetworks for macromolecular synthesis of mitochondria, chromatin, ribosomes and ECM) because changes relate to growth and maturation of septal components at a relatively later stage of septal development rather than to specific (earlier) steps of septal component specification. This notion would be supported by the fact that there is strong overlap between QTL for FVL, FOW and PFO, and other findings. We have made a comment to this effect in the Discussion on page 32 (line 694):

“By far the most significant STRING protein network for DEGs was at E14.5, and subnetworks with high connectivity relating to macromolecular cellular structures including nucleosomes, ribosomes, mitochondria and ECM, were detected. Virtually all associated genes were downregulated in the 129T2/SvEms strain (showing less robust septal characteristics). Our findings suggest that there is a general septal growth and maturation deficit involving multiple septal components in the 129T2/SvEms strain at a relatively late stage (E14.5) of septal development. This idea is supported by the correlations between FVL and FOW (reflecting development of the primum and secundum septa, respectively) (Kirk et al., 2006) and the strong concordance between QTL for individual septal traits and PFO discovered in this work. We hypothesise that differences in quantitative septal traits and PFO prevalence that we documented previously in inbred, hybrid and mutant strain of mice (Kirk et al., 2006), relate in part to degrees of septal growth and maturation, and that higher septal growth and maturation facilitates better fusion of the septum primum and secundum. Careful elucidation of the differential growth parameters of septal components in parent strains across time may provide further support for this model.”

We think this is an important hypothesis that integrates diverse findings and so we have incorporated a brief reference to it in the Abstract.

“Transcriptome analysis of developing septa revealed downregulation of networks involving ribosome, nucleosome, mitochondrial and extracellular matrix biosynthesis in the 129T2/SvEms strain, potentially reflecting an essential role for growth and cellular maturation in septal development.”

II. What was the expression status of genes with pathogenic variants that were expressed at each time point?

We have added a new tab in Supplementary File 4 indicating raw expression figures for all predicted pathogenic genes at each stage of atrial septal development analysed, with a column indicating whether the gene was differentially expressed. Around 30% of genes were differentially expressed in at least one time point and we have added a note to this effect on page.

III. The finding that DEGs in QTL regions were enriched in pathogenic variants led to the speculation of the functional impact of some QTL on septal traits will involve collective effects of multiple variants (polygenic effect). However, the role of the list of genes carrying pathogenic variants in septum morphology has not been examined mechanistically, including smad6, in vivo. Therefore the evidence does not support the conclusion.

IV. Page 36. Line 645-654. Variants overlapping with the noncoding element.

What is the significance of this finding? Any further evidence of deleterious impact or causality?

We were confused by statement under III that DEGs under QTL are not enriched in pathogenic variants. We think the Reviewer was referring to DEGs under QTL that were over-represented in variants (129T2 vs QSi5) which lie in non-coding features. We have not tested the “polygenic” hypothesis mechanistically. We address the substantial challenges in doing so in our response to Reviewer 1.

Reviewer #3 (Recommendations for the authors):

1. In Figure 6, the authors filter a large number of identified variants between the AIL parental strains using a logical pipeline. The authors ultimately intersected potentially pathogenic variants under septal QTL regions with genes according to prior association with cardiovascular phenotypes (Mouse Genome Informatics; MGI) or link to human CHD from a curated list of high-confidence and emerging CHD genes. Were CHD or MGI genes enriched under QTL regions?

We in fact made an error in our reporting in the Results section on how many of the filtered (predicted pathogenic) variants overlapped with the curated MGI and CHD lists of pathogenic genes. We have now corrected this and, as shown in the Venn diagrams of Figure 6C,D, eight predicted pathogenic genes overlapped the MGI/CDH lists. We list these in the text on page and again in the Discussion on page. As requested, we tested whether curated genes were overrepresented under QTL (relative to non-QTL, atrial-expressed genes), and they were not. This does not diminish their potential involvement in atrial septal traits; however, this has not been tested.

2. The examples of high-impact variants are highly subjective other than smad6, which the authors validate. The caveats underlying the description of unvalidated specific variants should be more explicitly noted, as few if any of them have been directly implicated in atrial septation.

We understand the Reviewer’s comment and have amended to text at the end of the section (page 490) to indicate that variants other than Smad6 have not been validated. We also go further into the complexity of the highlighted variant’s QTL effect size, direction (normal versus cryptic) and strain of origin in the Discussion, and implications for the model (see response to Reviewer #2 comment 6).

3. Figure 8 – RNA-seq is performed beginning at E12.5, after the formation of the PAS, MC, and DMP. Thus, it is probably most appropriate to limit the implication of genes differentially expressed between the two mouse strains at these later time points to the formation of the septum secundum. While the formation of the SS is critical for FO closure, the limited applicability of the RNA-seq data for descriptions of atrial septation should be mentioned. Alternatively, the authors could perform additional RNA-seq experiments beginning around E9.5 in the pSHF and common atrium to include genes associated with all atrial septum structures as they form, or obtain and analyze previously published RNA-seq data from these regions.

Thank you for raising this. First, we have corrected out statement in the Introduction, where we had indicated that the septum secundum was an infolding of the dorsal atrial myocardium (as was described in human hearts). We realise now that the situation in mouse is different and that the septum secundum is a true septum (muscular ridge). We have clarified this in the Introduction on line 91:

“A septum secundum forms as an infolding of the dorsal atrial wall in humans (Anderson et al., 2003) or as specific muscular ridge in mice (Briggs et al., 2012), leaving an additional, prominent and offset interatrial communication termed the foramen ovale (Burns et al., 2016), completing the flap valve apparatus (Figure 1A).”

The Reviewer is correct in stating that at the earliest timepoint analysed (E12.5), the septum primum mesenchymal cap has already fused with the atrioventricular canal cushion, and the ostium secundum has already formed (Briggs et al., 2012). We have clarified this in the text on line 497:

“At the earliest time point, E12.5, the thin septum primum has already fused with the AV septum via its mesenchymal cap, and the ostium secundum has been created by cell death; thus, transcriptome changes across the three stages analysed may reflect prior changes during septum primum formation, as well as formation of the septum secundum, septal growth and remodelling.”

4. The authors' analysis of the atrial septum transcriptome data jumps quickly to considering stain-specific DEGs. However, this is the first transcriptional profiling of the later stages of the atrial septum, as far as I know. A more considered baseline analysis should be performed before jumping into the strain-specific analysis. What does the transcriptome alone say about atrial septation? How does the atrial septum transcriptome compare to the atrial free wall or other regions of the heart (by comparison with RNA-seq from the authors or others)? How do the pathways that are septum-enriched compare to those uncovered in the stain-specific septum analysis?

Thanks for this comment and we agree that deeper analysis of this data set could be fruitful. However, we have not undertaken such analyses so as not to distract from the main points of this already dense paper.

5. Figures 7 & 8 – the authors implicate the BMP, ubiquitination, Akt, and PKA pathways in PFO based on gene expression differences between 129T2/SvEms and QSi5 mouse lines from E12.5-E16.5. These findings would be greatly strengthened if the authors could demonstrate functional differences in the activities of these pathways in atrial septum tissues in vivo at these time points. This would not be necessary for the publication of this already detailed manuscript and could be discussed as an avenue for follow-up.

We have made a comment to this effect in the Discussion on line 711:

“We propose that variants in the 129T2/SvEms parent strain depress growth-related signalling pathways (including BMP, AKT, PKA and SRC) and associated ubiquitin-related quality control processes, impacting septal growth and morphogenesis independently of chamber growth. In this light, it is interesting that CHD and cancer risk genes have recently been correlated (Morton et al., 2021). This could be a focus area for further investigation.”

6. The observation that differentially expressed QTL genes are enriched for genomic variants is an important observation. The authors used a very conservative relationship between candidate regulatory regions and genes, described as "within 50kb upstream of the transcriptional start site of relevant genes". Given that enhancers demonstrate no directional bias, their capture could be increased to include regions both upstream and downstream of a gene's TSS. Also, 50kb is quite conservative, and their capture may be increased by enlarging the considered regions to 100 or 250 kb. Given the conclusion that there is a "strong mechanistic link between variant architecture and differential expression of genes under QTL", the authors should present the gene sets and pathways that are enriched in this analysis.

We have now expanded the analysis to include enhancer bin sizes of 50Kb (as used previously), 100Kb, 200Kb and 250Kb. For this new analysis, the definition of a bin was modified to include enhancers detected within regions (e.g. 50Kb) upstream and downstream of the transcriptional start site. The over-representation of variants in enhancers of DEGs under QTL fell within the p<0.05 threshold for all bin sizes. Please note that we detected an error in the previous version of the manuscript with respect to the labelling of one of the histone mark files we referenced (Wamstad et al., 2012) – the correct accession number is now indicated in Materials and methods. Results are presented in Figure 10—figure supplement 2 and we wrote into the text as follows (line 579):

“After permutation testing, we observed a significant enrichment of variants relative to expected in enhancers, promoters, 5UTRs and introns for DEGs underlying QTL (Figure 10). Non-DEGs showed significant depletion of variants in exons. DEGs and non-DEGs showed similar normalised variant count distributions across features (Figure 10—figure supplement 1), indicating that DEGs were not substantially skewed towards genes with high variant density. Likewise, variants were not focused on any specific gene class (Supplementary File 8). The over-representation of variants was consistent when candidate cardiac enhancers near DEGs were considered even up to 250Kb from the TSS (Figure 10—figure supplement 2A), noting however that most enhancers were found within the 100Kb window (Figure 10—figure supplement 2B). Interestingly, DEGs had approximately 60-80% more enhancers than non-DEGs, consistent with their higher levels of expression, irrespective of septal stage or strain (Figure 10—figure supplement 2C-E).”

With regards to “presenting the gene sets and pathways that are enriched in this analysis”, thank you for this suggestion. This is of course difficult because of the statistical nature of the tests. However, we ranked DEGs under QTL based on the variant density or binary variant status across enhancers, promoters, 5’UTRs and/or exons, and applied fgsea to the ranked gene lists to identify whether there was enrichment of GO terms. However, there were no significant GO terms identified. We did not proceed further with this approach.

7. Figure 10 – The authors could consider an analysis of variant conservation between human and mouse to this figure to interrogate the relevance of identified loci for human ASD/PFO. For variants in and around DEGs underlying QTL, what proportion in coding regions are conserved? What proportion in non-coding regions?

With respect, this analysis has a statistical basis, and until more specific causative coding and cis-regulatory variants can be mapped, this would be a distraction from the main discovery in this section.

https://doi.org/10.7554/eLife.83606.sa2

Article and author information

Author details

  1. Mahdi Moradi Marjaneh

    1. Victor Chang Cardiac Research Institute, Sydney, Australia
    2. School of Clinical Medicine, Faculty of Medicine and Health, UNSW Sydney, Sydney, Australia
    3. Department of Infectious Disease, Faculty of Medicine, Imperial College London, London, United Kingdom
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing
    Contributed equally with
    Edwin P Kirk and Ralph Patrick
    For correspondence
    m.moradi@imperial.ac.uk
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-9412-9029
  2. Edwin P Kirk

    1. Victor Chang Cardiac Research Institute, Sydney, Australia
    2. Centre for Clinical Genetics, Sydney Children’s Hospital, Sydney, Australia
    3. Randwick Genomics Laboratory, NSW Health Pathology, Sydney, Australia
    Contribution
    Conceptualization, Data curation, Formal analysis, Supervision, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing
    Contributed equally with
    Mahdi Moradi Marjaneh and Ralph Patrick
    Competing interests
    No competing interests declared
  3. Ralph Patrick

    Victor Chang Cardiac Research Institute, Sydney, Australia
    Contribution
    Conceptualization, Data curation, Formal analysis, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing
    Contributed equally with
    Mahdi Moradi Marjaneh and Edwin P Kirk
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-0956-1026
  4. Dimuthu Alankarage

    Victor Chang Cardiac Research Institute, Sydney, Australia
    Contribution
    Conceptualization, Investigation, Visualization, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  5. David T Humphreys

    1. Victor Chang Cardiac Research Institute, Sydney, Australia
    2. School of Clinical Medicine, Faculty of Medicine and Health, UNSW Sydney, Sydney, Australia
    Contribution
    Conceptualization, Data curation, Formal analysis, Investigation, Visualization, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-4140-0089
  6. Gonzalo Del Monte-Nieto

    1. Victor Chang Cardiac Research Institute, Sydney, Australia
    2. School of Clinical Medicine, Faculty of Medicine and Health, UNSW Sydney, Sydney, Australia
    Contribution
    Investigation, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  7. Paola Cornejo-Paramo

    Victor Chang Cardiac Research Institute, Sydney, Australia
    Contribution
    Formal analysis, Investigation, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  8. Vaibhao Janbandhu

    1. Victor Chang Cardiac Research Institute, Sydney, Australia
    2. School of Clinical Medicine, Faculty of Medicine and Health, UNSW Sydney, Sydney, Australia
    Contribution
    Data curation, Investigation, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-3837-1865
  9. Tram B Doan

    Victor Chang Cardiac Research Institute, Sydney, Australia
    Contribution
    Data curation, Supervision, Investigation, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  10. Sally L Dunwoodie

    1. Victor Chang Cardiac Research Institute, Sydney, Australia
    2. School of Clinical Medicine, Faculty of Medicine and Health, UNSW Sydney, Sydney, Australia
    Contribution
    Supervision, Investigation, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-2069-7349
  11. Emily S Wong

    1. Victor Chang Cardiac Research Institute, Sydney, Australia
    2. School of Biotechnology and Biomolecular Sciences, Faculty of Science, UNSW Sydney, Sydney, Australia
    Contribution
    Conceptualization, Supervision, Investigation, Methodology, Project administration, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-0315-2942
  12. Chris Moran

    Sydney School of Veterinary Science, University of Sydney, Sydney, Australia
    Contribution
    Conceptualization, Supervision, Investigation, Methodology, Project administration, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-4550-5101
  13. Ian CA Martin

    Sydney School of Veterinary Science, University of Sydney, Sydney, Australia
    Contribution
    Conceptualization, Supervision, Investigation, Methodology, Project administration, Writing – review and editing
    Competing interests
    No competing interests declared
  14. Peter C Thomson

    Sydney School of Veterinary Science, University of Sydney, Sydney, Australia
    Contribution
    Conceptualization, Supervision, Funding acquisition, Investigation, Methodology, Writing – original draft, Project administration, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-4428-444X
  15. Richard P Harvey

    1. Victor Chang Cardiac Research Institute, Sydney, Australia
    2. School of Clinical Medicine, Faculty of Medicine and Health, UNSW Sydney, Sydney, Australia
    3. School of Biotechnology and Biomolecular Sciences, Faculty of Science, UNSW Sydney, Sydney, Australia
    Contribution
    Conceptualization, Formal analysis, Supervision, Funding acquisition, Investigation, Methodology, Writing – original draft, Project administration, Writing – review and editing
    For correspondence
    r.harvey@victorchang.edu.au
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-9950-9792

Funding

National Health and Medical Research Council (0573705)

  • Richard P Harvey

National Health and Medical Research Council (1118576)

  • Richard P Harvey

National Health and Medical Research Council (2008743)

  • Richard P Harvey

National Heart, Lung, and Blood Institute (1RO1HL68885-01)

  • Richard P Harvey

National Heart Foundation of Australia (G06S2575)

  • Richard P Harvey

National Heart Foundation of Australia (G0050738)

  • Richard P Harvey

National Health and Medical Research Council (354400)

  • Richard P Harvey

National Health and Medical Research Council (0573732)

  • Richard P Harvey

National Health and Medical Research Council (1074386)

  • Richard P Harvey

New South Wales Government

  • Richard P Harvey

University of New South Wales

  • Paola Cornejo-Paramo

University of New South Wales (ID3263695)

  • Mahdi Moradi Marjaneh

Imperial College Healthcare NHS Trust

  • Mahdi Moradi Marjaneh

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

MMM held a University of New South Wales (UNSW; Sydney, Australia) International Postgraduate Award (ID3263695) and was supported in part by the NIHR Biomedical Research Centre of Imperial College Healthcare NHS Trust (London, UK). EPK held a National Heart Foundation Clinical Fellowship (Australia). The work was supported by the National Institute of Heart Lung and Blood (USA; 1RO1HL68885-01), National Heart Foundation of Australia (G06S2575; G0050738), National Health and Medical Research Council (NHMRC, Australia) (354400, 0573732; 1074386), the New South Wales Government Ministry of Health 20:20 campaign and the Victor Chang Cardiac Research Institute Innovation Centre (funded by the New South Wales Government Ministry of Health). RPH was supported by NHMRC Australia Fellowship (0573705), Senior Principal Research Fellowship (1118576), and Investigator Grant (2008743). PCP was supported by a UNSW Sydney International Postgraduate Award (UIPA) Ph.D. Scholarship.

Ethics

Animals were bred and housed under Animal Care and Research Ethics approvals N00/4-2003/1/3745, N00/4-2003/2/3745, and N00/4-2003/3/3745 from the University of Sydney.

Senior Editor

  1. David E James, University of Sydney, Australia

Reviewing Editor

  1. Aldons J Lusis, University of California, Los Angeles, United States

Version history

  1. Received: September 21, 2022
  2. Preprint posted: November 1, 2022 (view preprint)
  3. Accepted: June 2, 2023
  4. Accepted Manuscript published: June 5, 2023 (version 1)
  5. Version of Record published: June 21, 2023 (version 2)
  6. Version of Record updated: June 22, 2023 (version 3)

Copyright

© 2023, Moradi Marjaneh, Kirk, Patrick et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 525
    Page views
  • 71
    Downloads
  • 1
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Mahdi Moradi Marjaneh
  2. Edwin P Kirk
  3. Ralph Patrick
  4. Dimuthu Alankarage
  5. David T Humphreys
  6. Gonzalo Del Monte-Nieto
  7. Paola Cornejo-Paramo
  8. Vaibhao Janbandhu
  9. Tram B Doan
  10. Sally L Dunwoodie
  11. Emily S Wong
  12. Chris Moran
  13. Ian CA Martin
  14. Peter C Thomson
  15. Richard P Harvey
(2023)
Quantitative trait and transcriptome analysis of genetic complexity underpinning cardiac interatrial septation in mice using an advanced intercross line
eLife 12:e83606.
https://doi.org/10.7554/eLife.83606

Share this article

https://doi.org/10.7554/eLife.83606

Further reading

    1. Developmental Biology
    2. Neuroscience
    Athina Keramidioti, Sandra Schneid ... Charles N David
    Research Article

    The Hydra nervous system is the paradigm of a ‘simple nerve net’. Nerve cells in Hydra, as in many cnidarian polyps, are organized in a nerve net extending throughout the body column. This nerve net is required for control of spontaneous behavior: elimination of nerve cells leads to polyps that do not move and are incapable of capturing and ingesting prey (Campbell, 1976). We have re-examined the structure of the Hydra nerve net by immunostaining fixed polyps with a novel antibody that stains all nerve cells in Hydra. Confocal imaging shows that there are two distinct nerve nets, one in the ectoderm and one in the endoderm, with the unexpected absence of nerve cells in the endoderm of the tentacles. The nerve nets in the ectoderm and endoderm do not contact each other. High-resolution TEM (transmission electron microscopy) and serial block face SEM (scanning electron microscopy) show that the nerve nets consist of bundles of parallel overlapping neurites. Results from transgenic lines show that neurite bundles include different neural circuits and hence that neurites in bundles require circuit-specific recognition. Nerve cell-specific innexins indicate that gap junctions can provide this specificity. The occurrence of bundles of neurites supports a model for continuous growth and differentiation of the nerve net by lateral addition of new nerve cells to the existing net. This model was confirmed by tracking newly differentiated nerve cells.

    1. Developmental Biology
    Marta Grzonka, Hisham Bazzi
    Research Article

    SAS‑6 (SASS6) is essential for centriole formation in human cells and other organisms but its function in mouse is unclear. Here, we report that Sass6‑mutant mouse embryos lack centrioles, activate the mitotic surveillance cell death pathway and arrest at mid‑gestation. In contrast, SAS‑6 is not required for centriole formation in mouse embryonic stem cells (mESCs), but is essential to maintain centriole architecture. Of note, centrioles appeared after just one day of culture of Sass6‑mutant blastocysts, from which mESCs are derived. Conversely, the number of cells with centrosomes is drastically decreased upon the exit from a mESC pluripotent state. At the mechanistic level, the activity of the master kinase in centriole formation, PLK4, associated with increased centriolar and centrosomal protein levels, endow mESCs with the robustness in using SAS‑6‑independent centriole-duplication pathways. Collectively, our data suggest a differential requirement for mouse SAS‑6 in centriole formation or integrity depending on PLK4 and centrosome composition.