An in silico FSHD muscle fiber for modeling DUX4 dynamics and predicting the impact of therapy

  1. Matthew V Cowley
  2. Johanna Pruller
  3. Massimo Ganassi
  4. Peter S Zammit
  5. Christopher RS Banerji  Is a corresponding author
  1. Centre for Sustainable and Circular Technologies, Department of Chemistry, University of Bath, United Kingdom
  2. King's College London, Randall Centre for Cell and Molecular Biophysics, New Hunt's House, Guy's Campus, United Kingdom
  3. The Alan Turing Institute, British Library, United Kingdom

Abstract

Facioscapulohumeral muscular dystrophy (FSHD) is an incurable myopathy linked to the over-expression of the myotoxic transcription factor DUX4. Targeting DUX4 is the leading therapeutic approach, however, it is only detectable in 0.1–3.8% of FSHD myonuclei. How rare DUX4 drives FSHD and the optimal anti-DUX4 strategy are unclear. We combine stochastic gene expression with compartment models of cell states, building a simulation of DUX4 expression and consequences in FSHD muscle fibers. Investigating iDUX4 myoblasts, scRNAseq, and snRNAseq of FSHD muscle we estimate parameters including DUX4 mRNA degradation, transcription and translation rates, and DUX4 target gene activation rates. Our model accurately recreates the distribution of DUX4 and targets gene-positive cells seen in scRNAseq of FSHD myocytes. Importantly, we show DUX4 drives significant cell death despite expression in only 0.8% of live cells. Comparing scRNAseq of unfused FSHD myocytes to snRNAseq of fused FSHD myonuclei, we find evidence of DUX4 protein syncytial diffusion and estimate its rate via genetic algorithms. We package our model into freely available tools, to rapidly investigate the consequences of anti-DUX4 therapy.

Editor's evaluation

To provide a logical answer to the over-expressed DUX4 in FSHD, the authors took a sophisticated mathematical modeling approach and applied it to empirical data. The approach successfully predicts behaviors of proteins and cells, thereby suggests a model for pathogenicity. The result poses a potential to be expanded to understand molecular dynamics of other mutation-mediated rare diseases.

https://doi.org/10.7554/eLife.88345.sa0

Introduction

FSHD is a prevalent (~12/100,000 Deenen et al., 2014), incurable, inherited skeletal myopathy. The condition is characterized by progressive fatty replacement and fibrosis of specific muscle groups driving weakness and wasting (Banerji and Zammit, 2021), which is accelerated by inflammation (Dahlqvist et al., 2020). FSHD is highly heterogeneous, with both the rate of progression and order of muscle involvement varying dramatically from person to person, even between monozygotic twins (Tawil et al., 1993). Around 75% of patients exhibit a descending phenotype with weakness beginning in the facial muscles, before progressing to the shoulder girdle and latterly lower limb, while the remaining 25% exhibit a range of ‘atypical’ phenotypes, including facial sparing and lower limb predominant (Banerji et al., 2020a). Despite this clinical range, FSHD is associated with significant morbidity and socioeconomic costs (Schepelmann et al., 2010).

Genetically FSHD comprises two distinct subtypes: FSHD1 (OMIM: 158900, 95% of cases) and FSHD2 (OMIM: 158901, 5% of cases). Both subtypes bear a unifying epigenetic feature: derepression of the D4Z4 macrosatellite at chromosome 4q35. In FSHD1 this is due to truncation of the D4Z4 macrosatellite from the typical >100 to 10–1 units (Lemmers et al., 2010). In FSHD2, derepression is due to mutation in a chromatin modifier, typically SMCHD1 (Lemmers et al., 2012) but rarely DNMT3B (van den Boogaard et al., 2016) or LRIF1 (Hamanaka et al., 2020). In addition to D4Z4 epigenetic derepression, FSHD patients also carry certain permissive 4qA haplotypes distal to the last D4Z4 repeat encoding a polyadenylation signal (Lemmers et al., 2010).

Each 3.3 kb D4Z4 repeat encodes the transcription factor DUX4, which plays a role in zygotic genome activation, after which it is silenced in somatic tissues (De Iaco et al., 2017). In FSHD, however, epigenetic derepression of the D4Z4 region allows inappropriate transcription of DUX4 from the most distal D4Z4 unit, with transcripts stabilized by splicing to the polyadenylation signal in 4qA haplotypes, allowing translation. Mis-expression of DUX4 protein is thus believed to underlie FSHD pathogenesis and DUX4 inhibition is currently the dominant approach to FSHD therapy (Tawil, 2020; Le Gall et al., 2020).

However, DUX4 is extremely difficult to detect in FSHD patient muscle, with the vast majority of transcript and protein level studies failing to detect DUX4 in FSHD muscle biospies (Banerji and Zammit, 2021). When DUX4 is detected in FSHD patient muscle, it is at very low levels, requiring highly sensitive techniques such as nested RT-qPCR for transcripts (Jones et al., 2012) and proximity ligation assays for protein (Beermann et al., 2022). Investigation of FSHD patient-derived myoblasts has confirmed this very low level of DUX4 expression (Banerji and Zammit, 2021). Single-cell and single nuclear transcriptomic studies find only 0.5–3.8% of in vitro differentiated FSHD myonuclei express DUX4 transcript (Jiang et al., 2020; van den Heuvel et al., 2019). Immunolabelling studies only detect DUX4 protein in between 0.1–5% of FSHD myonuclei (Snider et al., 2010; Rickard et al., 2015).

As DUX4 is a transcription factor, it has been proposed that DUX4 target genes may represent a key driver of FSHD pathology. However, multiple meta-analyses have found DUX4 target gene expression to be a poor biomarker of FSHD muscle, except in the context of significant inflammation (Banerji et al., 2017; Banerji and Zammit, 2019), where it may be confounded by immune cell gene expression (Banerji et al., 2020b). Importantly, a recent phase 2b clinical trial of the DUX4 inhibitor losmapimod failed to reach its primary endpoint of reduced DUX4 target gene expression in patient muscle, despite improvement in functional outcomes (Jagannathan et al., 2022). Given the challenge of detecting DUX4 in muscle biopsies (Banerji and Zammit, 2021) it is unsurprising that no data was published relating to DUX4 expression changes during the losmapimod clinical trial. An understanding as to why losmapimod did not alter the expression of DUX4 targets in patient muscle biopsies is also lacking, but hypotheses include heterogeneity in muscle sampling, low baseline levels of DUX4 targets and thus limited dynamic range, slow reversibility of DUX4-induced epigenetic changes on target gene promoters and losmapimod having limited impact on DUX4 transcriptional activity in vivo.

How can such a rare expression of DUX4 drive a pathology as significant as FSHD? DUX4 expression in FSHD patient myoblasts follows a burst-like pattern, and cells expressing DUX4 have significantly shortened lifespans, suggesting a gradual attrition of cells over time (Rickard et al., 2015). A mouse model in which DUX4 expression is induced in a rare burst-like manner bears striking histological and transcriptomic similarities to FSHD patient muscle (Bosnakovski et al., 2017; Bosnakovski et al., 2020). DUX4 expression in FSHD patient-derived, multi-nucleated myotubes also displays a gradient across myonuclei, suggesting that DUX4 may ‘diffuse’ either actively or passively from an origin nucleus to neighboring nuclei, bypassing their need to wait for the rare DUX4 burst (Rickard et al., 2015; Tassin et al., 2013).

A deeper understanding of DUX4 dynamics and how it drives FSHD pathology is essential to move toward anti-DUX4 therapy. Not only would this explain DUX4 target gene expression as a suboptimal monitoring tool, but would enable optimal therapeutic design, through in silico perturbation of the parameters underlying DUX4 expression and toxicity.

Despite considerable discussion of ‘rare-bursts’ and ‘diffusion’ no attempt has been made to understand DUX4 expression through stochastic processes or differential equations; the natural setting to place these dynamics. Here, we combine ordinary differentiation equation models with stochastic gene expression models to construct a tuneable in silico simulation of DUX4 regulation in FSHD cell culture, both in unfused myocytes and syncytial multinucleated myotubes.

Through analysis of iDUX4 myoblasts, scRNAseq and snRNAseq of FSHD differentiated myoblasts we derive experimental estimates for the parameters of our model, including DUX4 mRNA degradation, transcription and translation rates, and DUX4 target gene activation rates. Simulation of our model provides a striking fit to DUX4 and DUX4 target gene expressing cell proportions seen in scRNAseq of FSHD myocytes. Importantly, our model predicts that DUX4 drives significant cell death, despite expression being limited to <1% of live cells. By comparing scRNAseq of unfused FSHD myonuclei to snRNAseq of multinucleated FSHD myotubes, we find evidence of DUX4 protein syncytial diffusion. We extend our model to examine DUX4 spreading between adjacent myonuclei and project our simulation onto the surface of a muscle fiber, in a spatially relevant model. Employing genetic algorithms to fit our spatial model to snRNAseq of FSHD syncytial myotubes, we provide an estimate for the syncytial diffusion rate of DUX4 protein.

We package our model into three freely available user interfaces, presenting an in silico toolkit to assess the impact of specific anti-DUX4 therapies on FSHD cell culture in a rapid, cost-effective, and unbiased manner.

Results

Compartment and promoter-switching models of DUX4 expression

Here, we consider two models of DUX4 expression in FSHD myocytes, a deterministic model of FSHD cell states we call the compartment model, and a stochastic model of gene expression we call the promoter switching model.

We first describe the compartment model. FSHD single myocytes can express DUX4 and, therefore, DUX4 target genes (Rickard et al., 2015; Kowaljow et al., 2007; Heher et al., 2022). We have previously demonstrated, via a library of DUX4 expression constructs, that DUX4 target gene activation is also necessary for DUX4-induced cell death (Knopp et al., 2016). We thus propose that an FSHD single myocyte at a given time t, occupies one of the following five states/compartments, defined by its transcriptomic distribution:

  1. St – a susceptible state where the cell expresses no DUX4 mRNA and no DUX4 target mRNA (DUX4 -ve/Target gene -ve: DUX4 mRNA naive cell)

  2. Et – an exposed state where the cell expresses DUX4 mRNA but no DUX4 target mRNA (DUX4 +ve/Target gene -ve: DUX4 transcribed but not translated)

  3. It – an infected state where the cell expresses both DUX4 mRNA and DUX4 target mRNA (DUX4 +ve/Target gene +ve: DUX4 transcribed and translated)

  4. Rt – a resigned state where the cell expresses no DUX4 mRNA but does express DUX4 target mRNA (DUX4 -ve/Target gene +ve: i.e., a historically DUX4 mRNA expressing cell)

  5. Dt – a dead state

We propose a model in which the single FSHD myocyte can transition through these five states according to five parameters:

  1. VD – the average transcription rate of DUX4 in a single cell

  2. d0 – the average degradation rate of DUX4 mRNA

  3. TD – the average translation rate of DUX4 from mRNA to active protein

  4. VT – the average transcription rate of DUX4 target genes in the presence of DUX4

  5. Dr – the average death rate of DUX4 target positive cells

We allow cells to transition from DUX4 mRNA negative states St and Rt to DUX4 mRNA positive states Et and It, respectively at the rate of transcription of DUX4, VD , with transition in the reverse direction occurring at the degradation rate of DUX4 mRNA, d0. Transition from state Et to state It requires DUX4 mRNA to be translated to active protein and DUX4 target mRNA to be expressed and thus occurs at the rate TDVT. Lastly, DUX4 target gene +ve states It and Rt transition to the dead cell state Dt at rate Dr. Our compartment model can be represented schematically (Figure 1A) or equivalently as a system of ordinary differential equations (ODEs) (Figure 1B).

Overview of models.

(A) Schematic of the compartment model describing the transition between five Facioscapulohumeral muscular dystrophy (FSHD) states according to five underlying parameters. (B) Ordinary differential equations describing the compartment model. (C) Schematic of the promoter-switching model of gene expression.

There are important assumptions in our compartment model:

  1. Cells are assumed not to proliferate over the evolution of the model. We restrict applications to differentiating cells that have exited the cell cycle.

  2. The death rate of DUX4 target gene -ve cells is negligible in comparison to the death rate of target gene +ve cells. This assumption is justifiable given published data on the death rate of DUX4 target gene-positive cells (Rickard et al., 2015).

  3. The transition from DUX4 target gene -ve cell to DUX4 target gene +ve cell states is irreversible, i.e., we assume that the volume of target transcripts induced by DUX4 is sufficiently large, such that the rate of their degradation to zero is negligible in comparison to the death rate of target positive cells. This assumption is justified given that DUX4 is a potent transcriptional activator and pioneer factor (Knopp et al., 2016; Choi et al., 2016).

In what follows, we derive experimental estimates for the 5 parameters underlying the compartment model.

A further preliminary is the promoter-switching model. This model is precisely the two-stage telegraph process, which has a long history of study in stochastic gene expression (Shahrezaei and Swain, 2008; Vu et al., 2016; Kim and Marioni, 2013). Under this model, we assume that the promoter of a gene n can occupy one of two states: active and inactive, and that transition from active (a) to inactive (i) state occurs at a rate kin , with the reverse transition occurring at a rate kan. We further assume that an active promoter can transcribe a single mRNA at a rate v0n , which degrades at a rate δn . The model is represented schematically in Figure 1C. This model of gene expression has been shown to follow a Poisson-Beta distribution (Vu et al., 2016; Kim and Marioni, 2013), where the promoter state is determined by a Beta-distributed variable p~Betakan/δn,kin/δn , and the mRNA copy number distribution conditional on the promoter state follows a Poisson distribution m|p~Poissonpv0n/δn. Under this interpretation maximum likelihood estimates (MLEs) for the normalized underlying parameters kan/δn, kin/δn and v0n/δn can be approximated from single-cell transcriptomic data (Vu et al., 2016; Kim and Marioni, 2013).

We assume that the proportion of time the promoter is in the active state, multiplied by the transcription rate of the active promoter in our promoter-switching model kanv0nkan+kin, is a reasonable proxy for the average rate of transcription of a gene in our compartmental model. We employ this assumption to estimate the average transcription rates VD and VT for the compartment model.

Estimating the kinetics of DUX4 mRNA

Our compartment model contains two parameters governing the kinetics of DUX4 mRNA: the degradation rate d0 and the average transcription rate VD .

To estimate the degradation rate d0 we employed human immortalized LHCN-M2 myoblasts expressing DUX4 under the control of a doxycycline-inducible promoter (iDUX4 myoblasts) (Choi et al., 2016). DUX4 expression was induced to a level and duration we have found sufficient to drive robust DUX4 mRNA expression, but only weak activation of DUX4 target genes and no widespread apoptosis (Ganassi et al., 2022). After induction, myoblasts were washed and supplemented with fresh medium without doxycycline. Samples were harvested for RNA extraction in triplicate immediately after washing and then at 1, 2, 3, 4, 5, 6, 8, and 10 hr post-wash. RT-qPCR was performed employing a standard curve to quantify the DUX4 mRNA copy number over our time course, to monitor DUX4 mRNA degradation in the absence of induction (Figure 2A).

Estimation of DUX4 mRNA degradation rate d0 and average transcription rate (VD).

(A) Schematic of the experiment performed to estimate the DUX4 mRNA degradation rate d0. iDUX4 myoblasts were induced to express DUX4 with 250 ng/ml of doxycycline for 7 hr before doxycycline was washed away. DUX4 mRNA was quantified at multiple time points post-wash using RT-qPCR. (B) Bar chart displays the RT-qPCR of ln(DUX4 copy number) at post-wash times 0, 1, 2, 3, 4, 5, 6, 8, and 10 hr. Bar represents the average of triplicates and the standard error of the mean is displayed. A line of best fit of ln(DUX4 copy number) against time is displayed alongside the corresponding linear regression p-value (bold) and the slope of the line corresponds to d0 (red). (C) Schematic of the estimation of average DUX4 transcription rate VD from scRNAseq data of 5133 Facioscapulohumeral muscular dystrophy (FSHD) myocytes. The maximum likelihood estimates (MLEs) for the underlying parameters of DUX4 transcription under the promoter-switching model are estimated via gradient descent and combined to estimate the average DUX4 transcription rate (red text).

As anticipated DUX4 mRNA levels decayed exponentially over time in the absence of doxycycline (linear regression of ln(DUX4 mRNA) vs time, p=4.1 × 10–4, Figure 2B), allowing us to calculate the degradation rate of DUX4 mRNA, d0=0.246/hr. This estimate suggests that the half-life of DUX4 mRNA is approximately 2.8 hr, not atypical for a transcription factor, and on the faster end of the mRNA degradation distribution (Yang et al., 2003).

To estimate the mean transcription rate of DUX4, VD , we applied the promoter-switching model presented above. We considered the scRNAseq dataset of FSHD single myocytes produced by van den Heuvel et al., 2019. This dataset comprises 7047 single myocytes, which were differentiated for 3 days in the presence of EGTA to inhibit fusion. 5133/7047 (73%) single myocytes were derived from four FSHD patients (two FSHD1 and two FSHD2). Patient demographics, genotypes, and DUX4 +/-ve cell counts are displayed in Supplementary file 1. DUX4 mRNA expression was detected in 27/5133 (0.53%) single FSHD myocytes but not in control myocytes (van den Heuvel et al., 2019; Banerji and Zammit, 2019). Considering the 5133 FSHD myocytes together we implemented a gradient descent algorithm to approximate the MLEs of the normalized parameters kaDUX4/δDUX4, kiDUX4/δDUX4 and v0DUX4/δDUX4 from the Poisson-Beta interpretation of the promoter-switching model of DUX4 expression (Figure 2C).

We note that δDUX4=d0, which we have already estimated, this enabled us to renormalize these parameters and compute the average DUX4 transcription rate, VD:=kaDUX4v0DUX4kaDUX4+kiDUX4=0.00211/hr.

As the total number of DUX4 positive cells is low, we pooled data from the four FSHD patients to allow robust estimation of the average DUX4 transcription rate for this patient cohort. We note, however, that distinct FSHD genotypes likely underlie different DUX4 transcription rates. The majority of DUX4 +ve cells (23/27) were found in two FSHD patients for this scRNAseq data. For patient FSHD1.1 19/2226 (0.85%) cells were DUX4 +ve and for patient FSHD2.1 5/1283 (0.39%) cells were DUX4 +ve (Supplementary file 1). To investigate variability in VD , we derived individual estimates for these two ‘higher’ DUX4 expressing patients, for FSHD1.1 VD=0.00373/hr, while for FSHD2.1 VD=0.000960/hr. The pooled estimate of VD is thus comparable in order of magnitude with that of individual patients. To fully utilize the available data and prevent limiting our model to only ‘higher’ DUX4 expressing patients, we employ the pooled estimate of VD=0.00211/hr for the remainder of our calculations.

Estimating kinetics of DUX4 target activation

We next estimated the average transcription rate of the DUX4 target genes in the presence of DUX4 mRNA, VT. We focused on eight DUX4 target genes: ZSCAN4, TRIM43, RFPL1, RFPL2, RFPL4B, PRAMEF1, PRAMEF2, and PRAMEF12 that have been identified as direct DUX4 targets via ChIP-seq (Young et al., 2013). We have shown that these eight genes are the only features consistently up-regulated in human myoblasts expressing DUX4, across four independent studies (Banerji and Zammit, 2021; Rickard et al., 2015; Choi et al., 2016; Young et al., 2013; Geng et al., 2012).

Examining mRNA levels of these eight DUX4 target genes in scRNAseq (van den Heuvel et al., 2019) and snRNAseq (Jiang et al., 2020) studies of FSHD and control differentiated myoblasts, expression was restricted to FSHD cells/nuclei and never observed in controls. This pattern of expression mirrors that of DUX4 mRNA and suggests that these targets are highly specific and are unlikely to be activated in the absence of DUX4.

To confirm our hypothesis, we applied the promoter-switching model. We returned to the scRNAseq data of 5133 FSHD myocytes generated by van den Heuvel et al., 2019 and divided the data into the 27 DUX4 +ve cells and 5106 DUX4 -ve cells. For each of our eight DUX4 target genes, we implemented our gradient descent algorithm to compute the MLEs of the normalized parameters kan/δn, kin/δn, and v0n/δn, underlying a promoter-switching model for the given gene across the 27 DUX4 +ve FSHD myocytes and the 5106 DUX4 -ve FSHD myocytes separately (Figure 3A).

Estimation of promoter-switching model parameters for DUX4 target genes and estimation of VT.

(A) Schematic of estimation of promoter-switching model parameters for eight DUX4 target genes, across 5106 DUX4 -ve Facioscapulohumeral muscular dystrophy (FSHD) single myocytes and 27 DUX4 +ve FSHD single myocytes from van den Heuvel et al., 2019. Violin plots display (B) the proportion of time in active promoter state (C) normalized active promoter transcription rate and (D) the mean mRNA copy number, for the eight DUX4 target genes in the 5106 DUX4 -ve and 27 DUX4 +ve FSHD myocytes separately, p-values correspond to two-tailed paired Wilcoxon tests. (E) Violin plot displays the variance of mRNA copy number for the eight DUX4 target genes calculated from the 27 DUX4 +ve FSHD myocytes (red) and calculated assuming DUX4 up-regulates targets only via the increase in (blue) the normalized transcription rate v0nδn or (green) the ratio of active to inactive promoter transition rates kankin. Paired two-tailed Wilcoxon p-values are displayed comparing adjacent distributions. (F) Schematic displaying how the change in parameters underlying the promoter-switching models for the eight DUX4 target genes in the presence of DUX4 leads to stable target gene activation. (G) Schematic of the estimation of the average DUX4 target gene transcription rate VT, from scRNAseq data of 27 DUX4 +ve FSHD myocytes.

Figure 3—source data 1

Source data for Figure 3 provides the parameters of the promoter switching model for each of the 8 DUX4 targets, derived from 5106 DUX4 -ve FSHD single cells and 27 DUX4 +vs FSHD single cells seperately.

https://cdn.elifesciences.org/articles/88345/elife-88345-fig3-data1-v2.csv

In the presence of DUX4 mRNA, the proportion of time the promoters of the eight DUX4 target genes remained in the active state, kankan+kin, significantly increased (paired Wilcoxon p=7.8 × 10–3, Figure 3B), as expected. Curiously, however, the normalized rate of transcription of the DUX4 target genes from the active promoter, v0n/δn, significantly decreased in the presence of DUX4 (paired Wilcoxon p=0.039, Figure 3C).

To investigate further, we considered the moments of the distribution of mRNA copy number, m, under the promoter-switching model. It can be shown that the mean mRNA copy number satisfies (Vu et al., 2016):

Em=kanv0n/δnkan+kin.

The changes in parameters we calculate for the DUX4 target genes confirmed that in the presence of DUX4 mRNA, the mean expression of all 8 DUX4 targets increases (paired Wilcoxon p=0.039, Figure 3D), i.e., the drop in v0n/δn is over-compensated for by the rise in kankan+kin. However, it is curious that v0n/δn should drop at all.

It can be shown that the variance of mRNA copy number m, satisfies (Vu et al., 2016):

Varm=Em1+kinv0n/(δn)^2(1+kan/δn+kin/δn)(kan/δn+kin/δn).

This expression ensures that as the mean mRNA copy number rises, so too must the variance, however, the level to which the variance rises is controlled by a term that is monotonic increasing in v0n/δn and depends on the promoter state parameters in a more complex way.

We postulated the pattern of DUX4 target gene parameter changes we observe in the presence of DUX4, have the effect of raising mean target mRNA expression, while suppressing target mRNA variance, i.e., a controlled activation of target genes. To investigate this, we considered two hypothetical scenarios in which the mean expression of target mRNAs, Em in the absence of DUX4, is increased to the same level we observe in the presence of DUX4. In the first scenario, we considered only increasing v0n / δn to achieve the rise in Em , in the second we considered only increasing the ratio of active to inactive promotor kankin. Both hypothetical scenarios resulted in a significantly higher variance of target mRNA than observed in our data, with the pure rise in v0n/δn driving the most dramatic increase in target mRNA variance (Figure 3E).

Taken together these results suggest that under our promoter-switching model, increasing the expression of a gene comes at the cost of increasing the variance of its expression, and that the greatest contributor to this variance comes from the normalized active promoter transcription rate v0n / δn . The parameter changes we observe in DUX4 target genes, suggest a management of this trade-off by DUX4, which increases the mean expression of target mRNA through a large increase in the proportion of time the promoter is active, kankan+kin, while offsetting the resulting rise in variance through a modest decrease in normalized active promotor transcription v0n/δn (Figure 3F).

We formulate the mean transcription rate of at least 1 of the 8 DUX4 targets from our compartment model, VT, as the sum of the mean transcription rates of all eight target genes in the presence of DUX4 mRNA, i.e.,:

VT j=18kajv0jkaj+kij

where j indexes the eight DUX4 target genes, and the promoter-switching model parameters are estimated from the 27 DUX4 expressing FSHD single myocytes. Our promoter-switching model scRNAseq-derived MLEs are normalized parameters kan/δn, kin/δn and v0n/δn. We must, therefore, estimate δn for each target gene to compute VT . As there is a range of target genes, we approximate δn for each by the median mRNA degradation rate observed in an analysis of 5245 genes (Yang et al., 2003), and set δn=0.0693 resulting in VT=6.41/hr (Figure 3G).

As with our calculation of VD data was pooled across four FSHD patients to calculate VT . We do not anticipate patient genotype to impact the average DUX4 target transcription rate, independently of its impact on the DUX4 transcription rate. However, to confirm our findings on the impact of DUX4 on target gene promoter dynamics, in a patient-specific setting, we attempted calculation of the parameters underlying the promoter switching model for the eight DUX4 target genes in DUX4 +ve and DUX4-ve cells, for patients FSHD1.1 and FSHD2.1 separately. Due to the limited number of cells for each patient, personalised estimates for all eight target genes could not be obtained. However, where patient-specific estimates have obtained the direction of parameter differences in target genes, between DUX4 +ve and DUX4 -ve cells were in line with those of pooled estimates across four FSHD patients (Supplementary file 2).

The remaining two parameters of our compartment model were calculated from published data. For the translation rate of DUX4 mRNA to active protein TD , we considered our analysis of iDUX4 myoblasts (Ganassi et al., 2022). We induced DUX4 expression with 250 ng/ml doxycycline and performed RT-qPCR to assess the expression of DUX4, and 3/8 of our DUX4 target genes ZSCAN4, TRIM43, and PRAMEF1, at 7 hr, 16 hr, and 24 hr of induction. DUX4 mRNA levels peaked at 7 hr, while the expression of DUX4 target genes peaked between 16 and 24 hr (Ganassi et al., 2022). This suggests a delay between DUX4 mRNA production and the presence of active DUX4 protein of between 9 and 17 hr, on average 13 hr. We thus estimate TD=113/hr.

For the death rate of DUX4 target positive cells, Dr , we consider the data of Rickard et al., 2015, in which differentiating FSHD myoblasts containing a DUX4-activated GFP reporter were imaged every 15 min for 120 hr. Following activation of the DUX4 reporter, cells died ~20.2 hr later (Rickard et al., 2015). We thus estimate Dr=120.2/hr.

Compartment model simulation

Having defined experimental estimates for parameters underlying the compartment model, we simulated the model forward in time to observe how an initial distribution of cells progresses through the five compartments.

To provide a ground truth we considered the scRNAseq data of van den Heuvel et al., 2019. Defining a DUX4 target gene +ve cell as expressing at least one of our eight DUX4 target genes, we assign the 5133 FSHD myocytes to 1 of the 4 live cell states of our compartment model: S(3 days)=4956 (96.6%),E(3 days)=14(0.273%), I3 days=13 (0.253%), R3 days=150 (2.92%).

If we assume that at the start of differentiation, all FSHD myocytes occupy state S0 being DUX4 negative and DUX4 target gene negative, simulating our model we estimated that 7% of starting cells will have died over 3 days. To replicate the starting conditions of the scRNAseq data we thus set S0=5133(1+0.07)=5488, E0=I0=R0=D0=0. Simulating our model over 3 days from this starting condition, we obtained cell state proportions statistically indistinguishable from the experimental scRNAseq data: S(3 days)=4953 (96.5%),E(3 days)=14(0.253%), I3 days=25 (0.487%), R3 days=116 (2.26%) (Chi-Squared p=0.99, Figure 4A).

Simulation of the compartment model and comparison with scRNAseq data of unfused Facioscapulohumeral muscular dystrophy (FSHD) myocytes.

(A) Bar chart displays the number of cells in each live cell compartment (blue) in our model following simulation over 3 days using experimentally estimated parameters and (red) in 5133 single FSHD myocytes. A Chi-squared goodness of fit p-value tests the alternate hypothesis that the two distributions are different. (B) Line plot displaying how the number of cells in each of the five compartments of the model changes over 100 days from a starting state of 5133 (1+0.07) cells. The percentage of cells dead after 10 days is displayed alongside the maximum percentage of live cells which are DUX4 +ve (E+I)max and DUX4 target gene +ve (R+I)max.

We next simulated our model forward over 100 days to observe how the proportion of cells in each state changed (Figure 4B). As expected, the number of cells in the DUX4 naive, susceptible state S(t) gradually decreased, while the number of dead cells due to DUX4 D(t) gradually rose, so that after 10 days 26.3% of cells had died as a consequence of DUX4 expression. Remarkably despite cell death in our model only being attributable to DUX4 expression, the dynamics predict that this is achieved while keeping the number of DUX4 mRNA and DUX4 target mRNA positive cells extremely low. DUX4 positive cells never rose to more than 0.8% of the live cell population and DUX4 target positive cells never more than 2.9%.

Our compartment model with experimentally derived parameters thus provides an excellent fit to real-world data and gives a mechanism for how extremely low levels of DUX4 and target gene expression can drive significant cell death in FSHD myocytes.

Modeling DUX4 expression in FSHD myotubes allows syncytial diffusion

Having considered scRNAseq data of unfused FSHD myocytes differentiated for 3 days, in which DUX4 expressed in a given nucleus cannot interact with other nuclei, we next considered syncytial diffusion of DUX4. Jiang et al., 2020 published a single myonuclear RNAseq (snRNAseq) of FSHD2 myoblasts differentiated over 3 days to form multinucleated myotubes. The data describes 139 FSHD2 myonuclei and 76 control myonuclei. As with the scRNAseq data van den Heuvel et al., 2019, none of the control myonuclei express DUX4 nor any of the 8 DUX4 target genes. For FSHD myonuclei, 3/139 (2.2%) express DUX4. The number of myonuclei in each of the live ‘cell’ compartments of our model is: S(3 days)=58 (41.7%),E(3 days)=0(0%), I3 days=3 (2.2%), R3 days=78 (56.1%).

Comparing the proportions of myonuclei (cells) assigned to the four live states of our compartment model in the unfused FSHD scRNAseq and syncytial FSHD snRNAseq datasets, we found significant differences in the distributions (Chi-Squared p<2.2 × 10–16, Figure 5A). The unfused FSHD myocytes demonstrated a higher proportion of DUX4 -ve, DUX4 target gene -ve susceptible (S) cells, while the syncytial FSHD myonuclei demonstrated a greater proportion of DUX4 -ve, DUX4 target gene +ve resigned (R) cells.

Derivation and simulation of the syncytial compartment model and comparison with snRNAseq data of fused Facioscapulohumeral muscular dystrophy (FSHD) myonuclei.

(A) Bar chart displays the proportion of cells in each live compartment of our model in (pink) 5133 unfused FSHD single myocytes from van den Heuvel et al., 2019 and (dark red) 139 fused FSHD single myonuclei of Jiang et al., 2020. A Chi-squared goodness of fit p-value tests the alternate hypothesis that the two distributions are different. (B) Schematic of the syncytial compartment model, allowing DUX4 protein permissive states I and R, to ‘infect’ non-permissive states S and E. (C) Ordinary differential equations describing the syncytial compartment model. (D) Schematic of the genetic algorithm employed to estimate the DUX4 diffusion rate Δ and the starting population size n, for the syncytial compartment model from 139 fused FSHD single myonuclei. (E) Bar chart displays the number of cells in each live cell compartment (blue) in our syncytial compartment model following simulation over 2 days using experimentally estimated parameters and (red) in 139 fused single FSHD myonuclei. A Chi-squared goodness of fit p-value tests the alternate hypothesis that the two distributions are different. (F) Line plot displaying how the number of cells in each of the five compartments of the syncytial compartment model changes over 100 days from a starting state of n cells. The percentage of cells dead after 10 days is displayed alongside the maximum percentage of live cells which are DUX4 +ve (E+I)max and DUX4 target gene +ve (R+I)max.

We investigated whether allowing diffusion of DUX4 protein between myonuclei in our model could explain this difference in proportions. Two states in our compartment model express DUX4 target genes and thus have evidence for DUX4 protein: It and Rt , while states St, and Et do not. We updated our model to the syncytial compartment model, where we allow the DUX4 protein compatible states to ‘infect’ the DUX4 protein incompatible states at a rate Δ (Figure 5B and C).

The term ‘infect’ is only an analogy and our model is not based on a direct mapping of e.g., viral infection to our setting. In our case, the ‘infectious agent’ DUX4 causes harm but does not replicate. DUX4 protein is a transcription factor and thus activates the expression of target genes. In our model, DUX4 can be seen as an ‘infectious agent’, and expression of DUX4 target genes can be seen as the ‘infection’, which leads to cell death. DUX4 can either stay in the ‘infected’ cell until it dies or diffuse to another cell to spread the ‘infection’.

We next employed a genetic algorithm to fit our syncytial compartment model to the snRNAseq data (Jiang et al., 2020). As differentiating FSHD myoblasts take approximately 24 hr to initiate fusion (Banerji et al., 2019), and DUX4 expressing cells show defects in fusion (Knopp et al., 2016), we assumed that day 3 of differentiation represents day 2 of syncytial myonuclei and simulated our model over 48 hr, from a starting state of S0=n, E0=I0=R0=0. We optimized two parameters via our genetic algorithm: the DUX4 syncytial diffusion rate Δ, and the number of starting myonuclei in the susceptible state n (Figure 5D).

The algorithm converged on a solution of n=217 susceptible myonuclei and Δ = 7.46 × 10–4 / hr. Simulating the syncytial compartment model over 2 days employing these parameters resulted in a statistically indistinguishable approximation to the snRNAseq data (Jiang et al., 2020): S(3 days)=62 (44.6%),E(3 days)=0(0%), I3 days=1 (0.720%), R3 days=76 (54.7%), (Chi-Squared p=0.99, Figure 5E). Thus differences in proportions of cell states in fused versus unfused myocytes can be explained by the addition of a diffusion term for DUX4 protein, within the limits of the assumptions of our model.

Simulating the syncytial compartment model over 100 days, we found a much faster death rate than the unfused myocyte model, with 97% of myonuclei dead by day 10. DUX4 mRNA-expressing cells remain low in proportion, never exceeding 0.8% of total live nuclei, however, DUX4 target gene mRNA expressing cells can now comprise a significant proportion of living nuclei, up to 70% at 73 hr, though on average comprise only 5.6% of living nuclei (Figure 5F).

A cellular automaton model of FSHD myotubes

A limitation of our syncytial compartment model is the assumption that any DUX4 protein-positive myonucleus can ‘infect’ any DUX4 protein-negative myonucleus. In practice, DUX4 can likely only spread between adjacent myonuclei in a short-range interaction (Rickard et al., 2015; Tassin et al., 2013). To overcome this limitation, we re-cast our compartment model as a cellular automaton.

In this grid-based model, squares on the grid represent myonuclei, by introducing an appropriate boundary condition, the grid is topologically equivalent to the surface of a cylinder and thus can be considered to represent myonuclei residing on the surface of a myofiber (Figure 6A).

Derivation and simulation of the cellular automaton model and comparison with snRNAseq data of fused Facioscapulohumeral muscular dystrophy (FSHD) myonuclei.

(A) Schematic demonstrating how with suitable boundary conditions a grid can be topologically equivalent to a cylinder, and thus encapsulate dynamics on the surface of a myofiber. (B) Schematic of the cellular automaton model of DUX4 expression in a syncytium with grid squares representing single myonuclei updating in 1 hr time-steps via a two-stage process. (C) Schematic of the genetic algorithm employed to estimate the DUX4 diffusion rate Δ, myotube length l, and myotube circumference c for the cellular automaton model from 139 fused FSHD single myonuclei. (D) Heatmap displays the clustering solution of the 100 optimal parameter regimes l,c and Δ produced by fitting 100 genetic algorithms to 139 fused FSHD myonuclei. The highlighted cluster represents a parameter regime in which myotubes are long and thin, the average parameter values of this cluster are displayed in red. (E) Bar chart displays the average proportion of cells in each live cell compartment (blue) in our cellular automaton model following simulation over 2 days using experimentally estimated parameters, alongside standard error of the mean, and (red) in 139 fused single FSHD myonuclei. A Chi-squared goodness of fit p-value tests the alternate hypothesis that the two distributions are different.

We evolved our cellular automaton forwards in time in two steps. First, the non-syncytial compartment model was applied to each myonucleus to update the internal state of each nucleus stochastically according to the experimentally derived transition rates and the time period that has elapsed. In the second step, myonuclei in DUX4 protein-compatible states It and Rt can ‘infect’ any of the eight neighbouring myonuclei in the grid which are in states St or Et at a stochastic rate Δ (Figure 6B). Our cellular automaton model, thus retains all features of the compartment model, but facilitates a more realistic topology for a muscle fiber and prevents DUX4 protein from engaging in long-range non-physiological interactions.

We next employed a genetic algorithm to fit the cellular automaton to the syncytial snRNAseq data (Jiang et al., 2020). We again assumed that day 3 of differentiation corresponds to day 2 of syncytial myotubes and simulate our model over 48 hr, in time steps of 1 hr, from a starting state of S0=n, E0=I0=R0=0. However, we now optimized three parameters: the DUX4 local diffusion rate Δ, the myotube nuclear length l and the myotube nuclear circumference c, where n=lc.

As the cellular automaton model is stochastic due to the discrete time steps employed, one cannot expect the same set of parameters to yield the same distribution of cell states after 48 hr of simulation. Thus the solution of a genetic algorithm may represent a parameter regime optimal for replicating the snRNAseq data, or may represent an unlikely realization of a parameter regime sub-optimal for replicating the data. To overcome this limitation we implemented 100 genetic algorithms to obtain a ‘family’ of parameter values Δ and c, which underlie potentially optimal solutions (Figure 6C).

We performed a clustering analysis of the 100 optimal solutions, employing the parameter values Δ, l and c as a feature space. This revealed that a cluster of 14 solutions has significantly greater myotube length than circumference (Wilcoxon p=5.0 × 10–8). This implies that for these 14 solutions, the genetic algorithm has converged to the general structure of a muscle fiber cylinder (long and thin). We focused on the mean parameter values of this cluster as the most physiologically realistic solution: Δ=0.0402/hr , l=44 myonuclei, and c = 9 myonuclei (Figure 6D).

Simulating the cellular automaton model 100 times with this parameter range over 48 hr resulted on average in a statistically indistinguishable approximation of the snRNAseq syncytial FSHD myonuclei state distribution (Chi-squared p=0.98, Figure 6E).

Thus after restricting DUX4 protein from spreading to only neighboring nuclei, we were still able to explain the difference between unfused and fused cell state distributions purely with a diffusion term for DUX4 protein.

A toolbox for assessing the impact of therapies on DUX4-mediated myotoxicity

Anti-DUX4 therapy for FSHD can target any aspect of DUX4 expression. To understand the impact of a given therapeutic strategy one requires two key pieces of information:

  1. The true value of the parameters underlying DUX4 expression.

  2. How a proposed change in these parameters by a therapeutic will impact cell death.

Here, we have derived experimental estimates for the parameters underlying DUX4 expression which can be modified by anti-DUX4 therapy, while our compartmental and cellular automaton models provide the framework for investigating how parameter changes will impact cell death.

To facilitate other investigators using our models to guide anti-DUX4 therapy development, we have packaged them into graphical user interfaces to provide three user-friendly tools.

The first tool allows investigators to visualize the compartment model and syncytial version, for a range of parameter choices. Sliders allow the investigator to perturb the six parameters of our models: VD , d0, VT, TD, Dr, and Δ, and outputs the temporal evolution of the models over as many hours as required, from any starting number of susceptible cells/nuclei, as well as histograms comparing the perturbed model to the experimental data of van den Heuvel et al., 2019 and Jiang et al., 2020.

The second tool implements the cellular automaton model of DUX4 in a syncytium. As with the first tool users can employ sliders to perturb the 6 parameters of the model VD , d0, VT, TD, Dr, and Δ, as well as the dimensions of the myofiber grid l and c, and the number of hours over which the automaton should be simulated. The automaton can then be started and the nuclear state updates will be dynamically played on a grid, while a histogram dynamically updates the proportions of nuclei in each state.

The final tool allows investigators to compare dead cell proportions over time for a chosen parameter regime ((VD) , d0, VT, TD, Dr, and , l and c), with cell death under our experimentally derived parameters. Users can again select their parameter regime and dynamically view changes in dead cell proportions under the single cell compartment model, to simulate realizations of the cellular automaton model and perform comparisons of cell death via Cox proportional hazard models.

The tools are hosted at the following web addresses:

  1. Compartment Models: https://crsbanerji.shinyapps.io/compartment_models/

  2. Cellular Automaton: https://crsbanerji.shinyapps.io/ca_shiny/

  3. Survival Analysis: https://crsbanerji.shinyapps.io/survival_sim/

The three tools can be used to understand how anti-DUX4 therapies aimed at different model parameters alter the proportion of cells in each compartment of our model at given times. This can guide the optimal therapeutic approaches, as well as optimal time points to assay to validate given therapies using cell culture approaches.

Discussion

Anti-DUX4 therapy is the leading candidate for an FSHD treatment, with several compounds currently in clinical trials (Tawil, 2020; Jagannathan et al., 2022; Le Gall et al., 2020). However, DUX4 expression in FSHD muscle demonstrates a complex dynamic (Banerji and Zammit, 2021; Banerji and Zammit, 2019). Understanding this complex dynamic is essential to the construction of optimal therapy, as it is currently unclear which stage of the DUX4 ‘central-dogma’ one should target to have the most significant impact on pathology. This has led to diverse strategies, targeting epigenetic regulation of DUX4 (Lemmers et al., 2012; Block et al., 2013), DUX4 mRNA (Wallace et al., 2012; Ciszewski et al., 2020), DUX4 protein (Klingler et al., 2020), and DUX4 downstream effects (Heher et al., 2022).

Here, we present a mathematical model of DUX4 expression in differentiated FSHD myoblasts, based on ordinary differential equations and stochastic gene expression. By analyzing human myoblasts expressing inducible DUX4 as well as scRNAseq and snRNAseq of FSHD patient myocytes and myotubes, we compute experimental estimates for the parameters underlying our model. These include the first estimates of DUX4 transcription, translation, and mRNA degradation rates. Simulating our model with experimentally derived parameters we find that it accurately predicts the proportion of DUX4 +ve/-ve and DUX4 target gene +ve/-ve cells observed in actual scRNAseq of FSHD patient myocytes. We package our model into graphical user interface tools to allow investigators to rapidly observe the impact of any given anti-DUX4 therapy on cell viability.

As with all models, ours are subject to assumptions, such as combining the multi-step processes of transcription and promoter activation into single steps. We further assumed that cells expressing our 8 DUX4 target genes are historic DUX4-expressing cells. We selected these eight genes as they are confirmed DUX4 targets by ChIP-seq and have been identified as upregulated in every transcriptomic analysis of DUX4 over-expressing human myoblasts (Banerji and Zammit, 2021). Tissue expression patterns of these eight genes are not well characterized outside of the FSHD context, however, their expression has been reported during zygotic genome activation (Taubenschmid-Stowers et al., 2022) and in testicular tissue (Taubenschmid-Stowers et al., 2022; Ishiguro et al., 2017), both settings where DUX4 is physiologically expressed. We found further evidence for the expression of these genes indicating historic DUX4 in this study. First, we never find a single transcript for any of these eight target genes in 1914 single myocytes and 77 single nuclei from control individuals, suggesting that their expression requires an FSHD genotype (and thus likely DUX4 expression). Second, our investigation of these genes in scRNAseq data of FSHD myocytes demonstrated that in the presence of DUX4, the expression of all eight genes significantly increases, indicating activation by DUX4. While this is evidence that the eight targets are specific to DUX4, it is possible that they may also be induced at a much lower level, by some other factor related to the FSHD genotype and thus a small proportion of DUX4 target gene +ve cells in this study may not be historically DUX4 +ve.

We also limited our model to differentiated muscle culture, when DUX4 expression is more robust. DUX4 mRNA is typically detected in a small proportion (0.5–3.8%) of FSHD myoblasts in vitro (Jiang et al., 2020; van den Heuvel et al., 2019; Banerji and Zammit, 2019), and protein expression is similarly rare (0.1%) (Snider et al., 2010). The situation is qualitatively different in FSHD muscle biopsies, where DUX4 mRNA detection is highly variable and DUX4 protein detection is extremely rare. DUX4 target gene expression is detected in inflamed FSHD muscle biopsies but less robustly in non-inflamed muscle, implying that a systemic component may also activate DUX4 expression in vivo (Banerji and Zammit, 2021; Banerji, 2020; Banerji et al., 2020b). Despite the added complexity of DUX4 expression in the muscle biopsy setting, all anti-DUX4 therapies currently under consideration were first investigated in myogenic cells in vitro. Thus, it is encouraging that we can simulate DUX4 and target gene expression single-cell distributions which are indistinguishable from those observed in FSHD patient muscle cell culture. Through theoretical investigation, we can also explore processes such as DUX4 protein syncytial diffusion and DUX4 target gene activation, which may be less accessible to experimentation.

FSHD is a rare disease and public datasets describing scRNAseq and snRNAseq of patient-derived primary myocytes are currently limited to those used in this study. Consequently, we have pooled data describing 4 individual patients of different FSHD genotypes to estimate two of the parameters of our model: VT and VD . DUX4 expression levels depend on D4Z4 repeat length and methylation status, and can differ between cell lines isolated from different FSHD patients (Jones et al., 2012; Homma et al., 2012), as well as between genetically identical cell lines isolated from the same mosaic FSHD patient (Krom et al., 2012). We computed patient-specific estimates of VD for 2/4 patients with sufficient data and found these comparable to the pooled estimate. By pooling patients, we obtained parameter estimates that may not be true for an individual but represented the ‘average’ parameter values across these four FSHD patients. Our in silico model of DUX4 expression is proposed as a pre- in vitro screening tool to guide anti-DUX4 therapy for the general population. However, as more data on FSHD is generated our models will evolve. In particular, as higher volumes of scRNAseq data describing FSHD patients with different genotypes become available, the model can be updated to facilitate genotype-stratified and even personalized anti-DUX4 therapy design. Moreover, new data describing scRNAseq of proliferating FSHD primary myoblasts, alongside quantification of how DUX4 and DUX4 target gene expression impacts myoblast proliferation rates, would allow natural extension of our model to understand DUX4 expression during FSHD myoblast proliferation.

DUX4 is myotoxic (Kowaljow et al., 2007), but how its rare expression drives and sustains significant pathology is unclear. DUX4 has been proposed to undergo a rare burst-like expression dynamic (Bosnakovski et al., 2017; Tassin et al., 2013), though no attempt has previously been made to understand DUX4 expression as a stochastic process. Here, we model DUX4 and its target gene expression directly as stochastic processes via our promoter-switching model and estimate the underlying parameters. Our cell-biology informed model predicts that in single FSHD myocytes/myotubes, DUX4 can drive significant cell death, eventually depleting the entire population, while being expressed in only 0.8% of live cells. This level matches well with the proportion of DUX4-positive cells seen in published studies (Banerji and Zammit, 2021; van den Heuvel et al., 2019; Snider et al., 2010) and confirms that a burst-like mechanism of DUX4 expression can drive significant pathology while making DUX4 difficult to detect.

DUX4 has also been proposed to spread through the myofiber syncytium from its originator nucleus to ‘infect’ DUX4 naïve nuclei, bypassing the need for a rare expression burst and accelerating pathology (Tassin et al., 2013; Barro et al., 2010). The proportion of DUX4 target positive cells is greater in snRNAseq of syncytial FSHD myotubes than scRNAseq of unfused myocytes, while the proportion of DUX4 positive cells is comparable, supporting a syncytial diffusion mechanism. We model DUX4 in FSHD myotubes as an infectious agent, able to spread from one nucleus to another (but not replicate) by adapting epidemiological compartment models, which we package as a cellular automaton to provide a realistic myotube surface on which to monitor DUX4 expression. Incorporation of a diffusion term is sufficient to account for the difference in DUX4 target gene +ve/-ve nuclear proportions, between syncytial FSHD myotubes and unfused FSHD myocytes, while maintaining comparable proportions of DUX4 +ve cells. We thus demonstrate that the theoretical hypothesis of DUX4 syncytial diffusion is compatible with biological data, and provide an estimate for the DUX4 diffusion rate.

Our model predicts <2.9% of single FSHD myocytes will be DUX4 target gene mRNA positive at any given time, despite significant DUX4-driven cell death. While in FSHD myotube nuclei this proportion can transiently spike at 70%, on average it remains much lower at only ~5.6%. This suggests that in FSHD muscle, expression of DUX4 target genes at any given time typically remains restricted to a small number of myonuclei. This may explain why DUX4 target gene expression is an inconsistent biomarker of FSHD muscle (Banerji et al., 2017; Banerji and Zammit, 2019). Moreover, our model predicts that DUX4 target gene expression does not typically increase over time, supporting the finding that DUX4 target gene expression in FSHD muscle has not been validated as a marker of disease progression (Banerji, 2020; Wong et al., 2020). Lastly, due to the typically low levels observed, it may be difficult to detect changes in DUX4 target gene activation following anti-DUX4 therapies. This may have contributed to the recent phase 2b trial of losmapimod in FSHD failing to reach its primary outcome measure of suppressing DUX4 target gene expression in patient muscle biopsies (Jagannathan et al., 2022).

By modeling DUX4 target gene expression as stochastic processes, we found that DUX4 increases the proportion of time target gene promoters are in an active configuration, while curiously decreasing the normalized active promoter transcription rate. Through analytic investigation of our model, we found the rise in active promotor time drove increased DUX4 target gene expression, while the drop in normalized active promotor transcription rate dampened the noise associated with this higher expression level. This suggests that DUX4 modifies the expression of its target genes to orchestrate a precise, stable activation. It has been shown that DUX4 is a pioneer factor and increases target gene promoter activation via C-terminal recruitment of p300 to drive H3K27 acetylation (Choi et al., 2016). How DUX4 may limit normalized active promoter transcription rates of its target genes is unclear, but could involve interaction with the transcriptional complex, or feeding back to decrease the stability of target mRNA.

We thus present our model of DUX4 expression as a theoretical setting to understand the complex dynamics of this important disease gene and as an open source, in silico platform to rapidly and cheaply pre-screen anti-DUX4 therapy for FSHD.

Methods

Key resources table
Reagent type
(species) or
resource
DesignationSource or referenceIdentifiersAdditional information
Cell line (human)LHCN-M2-iDUX410.1038/s41598-018-35150-8Immortalized doxycycline-inducible
DUX4 expressing human
myoblast cell line
Sequence-based
reagent
DUX4 Fwd10.1038/s41598-018-35150-8qPCR pimerACCTCTCCTAGAAACGGAGGC
Sequence-based
reagent
DUX4 Rev10.1038/s41598-018-35150-8qPCR primerCAGCAGAGCCCGGTATTCTTC
Commercial
assay or kit
Quantitect reverse
transcription kit
Qiagen, 205311RT-kit
Commercial
assay or kit
Takyon SYBR
Green qPCR Mastermix
UF-NSMT-B101qPCR Mastermix

Cell culture

DUX4 inducible immortalized human myoblasts LHCN-M2 were cultured in Promocell growth media (C-23060) supplemented with 15% FBS and 1:1000 Gentamycin (Sigma), in a humidified incubator at 5% CO2. Cells were found negative for mycoplasma contamination (MycoAlert PLUS Mycoplasma, Lonza).

RT-qPCR

iDUX4 myoblasts were induced to express DUX4 with 250 ng/ml doxycycline for 7 hr in six-well plates, before washing with PBS and the addition of fresh medium without doxycycline. Total mRNA was isolated using the RNeasy Kit (Qiagen, 74–104) according to the manufacturer’s instructions in triplicate immediately after washing and then at 1, 2, 3, 4, 5, 6, 8, and 10 hr post-wash. After reverse transcription with the Quantitect reverse transcription kit (Qiagen, 205311), SYBR green qPCR was performed (Takyon, UF-NSMT-B101) using a Viia7machine (ThermoFisher). DUX4 primers were: Fwd ACCTCTCCTAGAAACGGAGGC, Rev CAGCAGAGCCCGGTATTCTTC.

Standard curve to compute d0

The length of the DUX4 construct used in our standard curve was 8938 bp and the number of copies per ng was calculated via the following formula:

copy number/ng := NA8938660×10-9=1.02×109

Where NA=6.022×1023 mol–1 is Avogadro’s number and 660 g/mol is the molecular weight of a single DNA base pair. Linear regression of the standard curve confirmed that DUX4 threshold cycles (Cts) are a linear function of log10(copy number) (R2=0.994), with a slope of 30.4 and intercept of –2.70. These values were employed to convert DUX4 Cts from our iDUX4 assay to DUX4 copy number via the following formula:

copy number= 10Ct-30.4-2.70

In our iDUX4 assay, we sample RNA immediately after washing off doxycycline and at multiple time points post-wash. We assume that no DUX4 can be transcribed post-wash and model the DUX4 copy number as an exponential decay:

DUX4 copy number=Ae- d0t;
lnDUX4 copy number=lnA- d0t

Linear regression of lnDUX4 copy number against time post wash (t) yielded a slope of 0.246, giving our estimate of d0 , the degradation rate of DUX4.

scRNAseq and snRNAseq data

Normalized read counts for scRNAseq of 7047 FSHD and controls single myocytes produced by van den Heuvel et al., 2019 were downloaded from the GEO data base accession GSE122873. This dataset contains 5133 single myocytes derived from four FSHD patients (two FSHD1 and two FSHD2) and 1914 single myocytes from two control individuals. Myocytes were differentiated for 3 days in the presence of EGTA to prevent fusion.

Normalized read counts for snRNAseq of 317 FSHD and controls single myonuclei produced by Jiang et al., 2020 were downloaded from the GEO database accession GSE143492. This data comprised 47 FSHD2 nuclei at day 0 of differentiation and 139 FSHD2 nuclei at day 3, from two seperate patients, alongside 54 control nuclei at day 0 and 77 control nuclei at day 3 from two separate individuals. For comparison with the scRNAseq data, we focused on myonuclei from day 3 of differentiation.

All myocytes and myonuclei nuclei were assigned to 1 of 4 live cell compartments via the following criteria:

  1. S – no detectable normalized reads for DUX4, ZSCAN4, TRIM43, RFPL1, RFPL2, RFPL4B, PRAMEF1, PRAMEF2, and PRAMEF12.

  2. E – at least one detectable normalized DUX4 read, and no detectible normalized reads for ZSCAN4, TRIM43, RFPL1, RFPL2, RFPL4B, PRAMEF1, PRAMEF2, and PRAMEF12.

  3. I – at least one detectable normalized DUX4 read, and at least one detectible normalized read for any of ZSCAN4, TRIM43, RFPL1, RFPL2, RFPL4B, PRAMEF1, PRAMEF2, and PRAMEF12.

  4. R – no detectable normalized reads for DUX4, and at least one detectible normalized read for any of ZSCAN4, TRIM43, RFPL1, RFPL2, RFPL4B, PRAMEF1, PRAMEF2, and PRAMEF12.

All control myocytes/myonuclei were in the S compartment. The distribution of the four live cell compartments in the 5133 FSHD single myocytes and the 139 days three FSHD single myonuclei were compared via a Chi-squared test, and significance was assessed at the 5% level.

Estimation of MLEs of promoter-switching model parameters from scRNAseq data

To calculate the transcription rate for DUX4 and DUX4 targets we adopt the approach employed by Larsson et al. and Trung et al. where a maximum likelihood estimation (MLE) methodology is used to infer the parameters for the two-state model of stochastic gene expression (Vu et al., 2016; Larsson et al., 2019). This is implemented in the poisbeta python package available at https://github.com/aksarkar/poisbeta. The procedure takes the Poisson-Beta distribution discussed in the results:

p~Betakan/δn,kin/δn
m|p~Poissonpv0n/δn

and minimizes the negative log-likelihood for each gene

argmin(θ in Θ)i=1NlnP(mi|θ),

where p is a Beta-distributed variable describing the promotor state, N is the number of cells, mi is the number of mRNA transcripts for cell i=1,,N, and θ = [v0n/δn,kan/δn, kin/δn], is the parameter space. From here on we will not include the normalizing δn term for brevity.

As a first estimate of θ, we use the method of moments derived by Peccoud and Ycart for the two-state promotor model (Peccoud and Ycart, 1995)

v0n=-r1r2+2r1r3-r2r3r1-2r2+r3,
kan=2r2-r1r1-r3r3-r2r1r2-2r1r3+r2r3r1-2r2+r3,
kin=2r1r3-r2r1r2-2r1r3+r2r3,

where

r1=e1,
r2=e2e1,
r3=e3e2,

and

e1=1Ni=1Nmi,
e2=1Ni=1Nmi(mi-1),
e3=1Ni=1Nmi(mi-1)(mi-2).

θ is then optimized via the Nelder-Mead method using the Gauss-Jacobi quadrature to evaluate the likelihood function

l=1Beta(kan,kin)01Poisson(mi;v0npi)pikan-1(1-pi)kin-1 dpi
l=12kan+kin-1Beta(kan,kin)-11Poissonmi;v0n1+ti2(1+ti)kan-1(1-pi)kin-1 dti
l12kan+kin-1Beta(kan,kin)k=1KwkPoisson(mi;v0nyk)

where pi=(ti+1)/2 to substitute the beta-distributed parameter, K is the number of points integrated over, wk is the weight of the Jacobi polynomial of degree k, and yk is the root. The optimizer, quadrature method, beta distribution, and Poisson distribution are implemented in the scipy package available from https://scipy.org/install/.

Statistical comparison of promoter-switching model parameters

The three normalized parameters of the promoter-switching model v0n/δn,kan/δn, kin/δn , were computed as above for each of the 8 DUX4 target genes ZSCAN4, TRIM43, RFPL1, RFPL2, RFPL4B, PRAMEF1, PRAMEF2, and PRAMEF12, in the 27/5133 FSHD single myocytes with detectable DUX4 mRNA and separately in the 5106 FSHD single myocytes without detectable DUX4, in the dataset of van den Heuvel et al., 2019.

The normalized active promoter transcription rate v0n/δn , the proportion of time spent with a promoter active kankan+kin and the mean mRNA expression level E[m] for the eight DUX4 target genes were compared between DUX4 positive and DUX4 negative myocytes via paired two-tailed Wilcoxon tests, significance was assessed at the 5% level.

Hypothetical scenarios for raising Em under the promoter-switching model

We considered two hypothetical scenarios for raising Em under the promotor-switching model to observe the impact on Varm . In the first v0nδn is increased while keeping other parameters constant. In the second kankin is increased keeping other parameters constant. Simple algebraic manipulation allows analytical solutions for the raised parameters and thus Varm .

In the case of v0n/δn , we assume that v0nδnxv0nδn for some x>1, the rise in E[m] in this case is simply:

Emxv0nδn-Emv0nδn= kanv0nδnkan+kin(x-1)

We can thus choose x, to achieve the rise in Em seen under DUX4 expression d(μ) as:

x=1+dμkan+kinkanv0nδn

Which can be substituted into the formula for Varm to compute Varmxv0nδn, giving the expected variance of the mRNA copy distribution if Em is increased by d(μ), solely by increasing v0nδnxv0nδn for some x>1.

In the case of kankin , we assume that kankinx kankin , we further assume that the rise in active transition rate is mirrored by a drop in the inactivation rate, i.e., the two parameters are not mutually exclusive, thus: kanykan and kin1ykin , where y2=x. The rise in E[m] in this case is:

Emx kankin-Emkankin=v0nkanδnxxkan+kin-1kan+kin

We can thus choose x, to achieve the rise in Em seen under DUX4 expression d(μ) as:

x=kinEmkankin+dμkinEmkankin+dμkan

Which can be substituted into the formula for Varm to compute Varmxkankin.

For each of the 8 DUX4 target genes we considered the parameters of the promoter switching model v0n/δn,kan/δn, kin/δn , computed over the 5106 DUX4 -ve FSHD single myocytes from the dataset of van den Heuvel et al., as the starting parameters to input into the above formulae for x. Comparing the mean expression of each target in the 27 DUX4 +ve myocytes and the 5106 DUX4 -ve myocytes we computed d(μ). For each target we thus computed Varmxv0nδn and Varmxkankin, the target mRNA copy number variances under our two hypothetical scenarios. These variances were compared with the true mRNA copy number variance in the 27 DUX4 +ve myocytes, as well as with each other via two-tailed paired Wilcoxon tests, significance was assessed at the 5% level.

Compartment model simulation

After the estimation of the five parameters VD , d0, VT, TD, and Dr , the ODEs underlying the compartment model (Figure 1B) were simulated from an initial state of all cells in state S(0), over a period of 100 days in timesteps of 1 hr using the deSolve package in R (Soetaert et al., 2010). For comparison with the 5133 single FSHD myocytes differentiated for 3 days described by van den Heuvel et al., 2019, an initial population sized 5133 was simulated for 3 days, after which 6.901% of starting cells were in the dead cell state. In order to replicate the live cell population of 5133 after 3 days we thus employed a starting population of 5133 (1+0.06901) cells in state S(0). After simulation for 3 days the proportion of cells in each live cell state S3 days, E3 days, I(3 days), and R3 days was compared between the model simulation and the true scRNAseq data, via a Chi-squared test.

For the syncytial compartment model, simulation was performed as above with VD , d0, VT, TD, and Dr unchanged. The additional parameter Δ was estimated via the genetic algorithm (below) alongside the starting population n of cells in state S. After simulation for 2 days (as there is assumed no syncytium for the first 24 hr of fusion) the proportion of cells in each live cell state was compared between the model simulation and the true snRNAseq data of Jiang et al., 2020 via a Chi-Squared test. Significance was assessed at the 5% level.

Cellular automaton model

The cellular automaton model was evolved on an l ×c rectangular grid describing the length and circumference of the myotube. Individual cells on the grid correspond to single myonuclei in a syncytium. Each cell was evolved stochastically over a discrete time-step of 1 hr. In each time-step 2 independent actions were performed.

First, the state of each cell was evolved according to the non-syncytial compartment model. Transition from connected states was modeled as an exponential process with a rate parameter equal to the transition rate (e.g. transition from S to E occurred according to exp(VD)). An event occurring or not under this exponential distribution within 1 hr was simulated during each time step. If the event occurred a transition in cell state occurred, otherwise the cell state remained the same. In the situation where a cell state could transition in two possible directions (e.g. state E can transition to S at a rate d0 and to I at a rate TDVT) competing exponential clocks were set up. Using the property that minexpA,expB~exp(A+B), we simulated a transition occurring in either direction within 1 hr. If a transition occurred, we simulated the exponential clocks describing transition in either direction independently, and transitioned the cell state according to whichever experienced an event faster. If no transition occurred the cell state remained the same.

Once every cell was evolved according to the non-syncytial compartment model we updated the model according to the diffusion of DUX4. Cells in states I or R at the start of the time-step can interact with cells in states S or E if they are among their eight immediate neighbors (Figure 6B). Each interaction was again modeled as an exponential distribution with rate Δ, which was simulated for an event occurring within 1 hr. If the event occurred then the state of the neighboring cell was updated SI or ER, otherwise the state of the neighboring cell remained the same.

We implement a boundary condition so that cells at position 1 on the circumference dimension are able to interact with cells in position c, to allow our grid topological equivalence to a cylinder (Figure 6A).

The cellular automaton model was evolved from a starting distribution of all cells in state S over 48 hr time-steps, employing the parameters VD , d0, VT, TD and Dr estimated from the non-syncytial compartment model and , l and c estimated from the genetic algorithm and clustering analysis (see below). Due to the stochastic nature of the exponential clocks involved, 100 simulations were performed to obtain an average behavior of cell type proportions under our model. The proportion of cells in each of the live cell states obtained from the average behavior of our model was compared to the true snRNAseq data of Jiang et al., 2020 via a Chi-Squared test. Significance was assessed at the 5% level.

Genetic algorithms

We employed genetic algorithms twice, firstly in the syncytial compartment model to fit the number of myonuclei n in the starting state S and the diffusion parameter Δ. Second in the cellular automaton model to again fit Δ, and the length l and circumference c of the automaton grid, and thus again the number of cells n=l×c in the starting state S.

Both algorithms employed the same fitness function aimed at minimizing the difference in the proportion of cells in the live states between the simulated model after 2 days and the snRNAseq of Jiang et al., 2020:

-Ssim2daysnlive59139+Esim2daysnliveε139+Isim2daysnlive3139+Rsim2daysnlive78139

Where nliven-Dsim2days and ε1 is chosen to prevent singularity while rewarding models which obtain values of Esim2daysnlive close to 0. This fitness function was chosen as it outperformed conventional functions based on Minkowski and Euclidean distances. A value of ε=0.1, proved sufficient to generate models with live cell state proportions indistinguishable from the snRNAseq data via the Chi-squared test.

Genetic algorithms were implemented using the GA package in R (Scrucca, 2013) using default parameters: starting population of 50 models, elitism of 0.05, mutation probability 0.1, crossover probability 0.8, and an optimal model was selected if maximum fitness was stable for 10 generations.

The cellular automaton model evolves stochastically and thus an optimal fitness function for a population could represent an optimal parameter regime, or an unlikely simulation of a sub-optimal parameter regime. We thus ran 100 genetic algorithms on the cellular automaton to generate 100 optimal parameter regimes and examined their structure via clustering analysis (below).

Clustering analysis

Clustering analysis was performed on the 3 × 100 feature space of parameter regimes , l and c found optimal in the 100 genetic algorithms performed on the cellular automaton model. The ConsensusClusterPlus package in R (Wilkerson and Hayes, 2010) implemented K-medoids clustering using a Euclidean distance metric, and consensus-CDF cluster stability plots ascertained the optimal number of clusters in the parameter feature space as 5.

Two-tailed paired Wilcoxon tests comparing the distribution of l and c values in each cluster, demonstrated that only one parameter cluster output significantly long and thin myotubes, in line with the microanatomy of muscle (Figure 6D). The average values of , l, and c from this cluster were thus considered the optimal parameter regime for the cellular automaton model.

Shiny tools

Three GUI tools were written using the shiny package in R (Chang et al., 2014). The first tool outputs the compartment models for user parameter inputs and is implemented using the deSolve package (Soetaert et al., 2010). The second tool outputs a dynamic realization of the stochastic cellular automaton model for user parameter inputs, the automaton is displayed using the lattice package in R (Sarkar & Deepayan, 2008). The third tool displays the proportion of dead cells under the compartment model comparing our experimentally derived parameter regime to a user-selected parameter regime. In addition, the third tool simulates two realizations of the cellular automaton model, one using our experimentally derived parameter regime and another using the user-selected regime (l and c are kept the same for both simulations to allow comparable starting populations). Survival analysis using Cox-proportional hazard models is implemented via the survival package in R (Therneau and Grambsch, 2018), to compare cell death rates in the two realizations, with p-values displayed on a corresponding Kaplan-Meier plot.

The three tools can be accessed at:

  1. Compartment Models: https://crsbanerji.shinyapps.io/compartment_models/

  2. Cellular Automaton: https://crsbanerji.shinyapps.io/ca_shiny/

  3. Survival Analysis: https://crsbanerji.shinyapps.io/survival_sim/

Data availability

All data generated or analysed during this study are publicly available or included in the manuscript, all code employed is published as part of our shiny app at 3 public domain URLs listed in the manuscript, and available at GitHub: https://github.com/MVCowley/in-silico-FSHD-muscle-fiber-tools, copy archived at Cowley, 2023.

The following previously published data sets were used
    1. van den Heuvel A
    2. Mahfouz A
    3. Kloet SL
    4. Balog J
    5. van Engelen BGM
    6. Tawil R
    7. Tapscott SJ
    8. van der Maarel SM
    (2018) NCBI Gene Expression Omnibus
    ID GSE122873. Single-cell RNA sequencing in patient-derived primary myocytes for facioscapulohumeral muscular dystrophy.
    1. Jiang S
    2. Williams K
    3. Kong X
    4. Zeng W
    5. Nguyen NV
    6. Ma X
    7. Tawil R
    8. Yokomori K
    9. Mortazavi A
    (2020) NCBI Gene Expression Omnibus
    ID GSE143492. Single-nucleus RNA-seq identifies divergent populations of FSHD2 myotube nucle.

References

  1. Software
    1. Sarkar & Deepayan
    (2008)
    Lattice: multivariate data visualization with R
    R Package Version 0.20-38.
    1. Tawil A
    (2020)
    Design of a phase 2, randomized, double-blind, placebo-controlled, 24-week, parallel-group study of the efficacy and safety of Losmapimod in treating subjects with Facioscapulohumeral muscular dystrophy (FSHD): Redux4 (1592)
    Neurology 94:1592.
  2. Book
    1. Therneau TM
    2. Grambsch PM
    (2018)
    Modeling Survival Data: Extending the Cox Model
    Springer.

Decision letter

  1. Murim Choi
    Senior and Reviewing Editor; Seoul National University, Republic of Korea

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

[Editors' note: this paper was reviewed by Review Commons.]

https://doi.org/10.7554/eLife.88345.sa1

Author response

Reviewer #1 (Evidence, reproducibility and clarity (Required)):

Banerji and colleagues measure DUX4 and target gene expression over a time course in doxycycline-inducible myoblasts to estimate kinetic parameters and rates underlying the transition of non-expressing cells through DUX4-expressing cells to cell death. They then use these parameters to model the rate of appearance of DUX4+ cells, DUX4 target gene expression, etc. in cells from FSHD patients, and derive a model that predicts that over 100 days around one fourth of cells die while less than 1% of cells express DUX4 or its target genes at any given time. This is somewhat similar to what is seen in FSHD patients, where DUX4 expression is infrequent in cultured cells, while patients eventually have substantial muscle loss. The experiments are well-designed and explained clearly.

We thank the reviewer for their kind comments on our manuscript and for acknowledging the similarity between our model simulation and both cell biological and clinical features of FSHD.

Reviewer #1 (Significance (Required)):

The primary significance of this study is that the field has a sense that the damage seen in patient muscle is not congruent with the low expression of DUX4 in patients, and the model showing many cells dying with only a few cells expressing DUX4 at any given time suggests that overall damage can be greater than that observed in any particular snapshot.

However, it is important not to conflate low frequencies of DUX4+ nuclei in cultured myoblasts with "DUX4 being difficult to detect" as in p 12, Discussion. DUX4 is difficult to detect, indeed basically not detected, in muscle biopsy specimens, but in cells in vitro, DUX4 is fairly easy to detect, albeit in quite low numbers of cells. Since the study evaluates cells in vitro, it is important to make clear that the situation in vivo is qualitatively different from that seen in vitro, namely DUX4 not being detected, and the authors should clarify this importance difference.

We appreciate this important point that DUX4 detection is extremely challenging in FSHD patient muscle biopsies, compared to in vitro cell culture of FSHD myoblasts and myotubes, a topic we considered at length in our recent review (Banerji and Zammit 2021, https://doi.org/10.15252/emmm.202013695).

Detection of DUX4 mRNA in muscle biopsies is highly variable requiring nested RT-qPCR, and protein detection is even less robust, though this has been achieved via western blot in less affected FSHD muscle (Tassin et al., 2013, DOI: 10.1111/j.1582-4934.2012.01647.x) and using a proximity ligation assay (Beermann et al. 2022, doi: 10.1186/s13104-022-06054-8). DUX4 target gene expression is detected in inflamed FSHD muscle but less robustly in non-inflamed muscle, indicating the recent presence of DUX4 protein, and also implying that a systemic component may activate DUX4 expression in vivo.

Conversely in cell culture of FSHD muscle cells, DUX4 mRNA is typically found in 0.5-3.8% of myonuclei (van den Heuvel et al., 2019, https://doi.org/10.1093/hmg/ddy400) and protein in around 1/1000 myonuclei (Snider et al., 2010, https://doi.org/10.1371/journal.pgen.1001181). Arguably these levels are still very low and DUX4 is still ‘difficult to detect’ in vitro, for example DUX4 expression is never seen above the limit of detection in bulk RNA-seq of FSHD patient cultured myoblasts. However, we fully appreciate that DUX4 is ‘easier to detect’ in vitro versus in vivo.

Regardless, anti-DUX4 therapy is currently the most heavily funded approach to FSHD treatment, and all anti-DUX4 therapies currently under consideration were first investigated in the myoblast in vitro setting. Thus, the aim of our model is to provide an additional level of in silico screening for anti-DUX4 therapy, before progressing to in vitro investigation.

Full consideration of DUX4 expression in vivo (while an exciting prospect), would require significantly more data than is currently available (relating to inflammation, vascularity and other systemic factors) and a highly complex model which is beyond the remit of this investigation.

We will clarify in the manuscript that our model applies in vitro, highlighting differences with in vivo and emphasising that the model is limited for understanding DUX4 expression in FSHD muscle biopsies.

Changes to manuscript:

Page 13, Line 443: We have removed the statement: ‘with DUX4 mRNA, protein and target gene accumulation all difficult to detect’.

Page 13, Lines 473-481: We have added the following to the Discussion:

‘We also limited our model to differentiated muscle culture, when DUX4 expression is more robust. DUX4 mRNA is typically detected in a small proportion (0.5-3.8%) of FSHD myoblasts in vitro16,17,21, and protein expression is similarly rare (0.1%)18. The situation is qualitatively different in FSHD muscle biopsies, where DUX4 mRNA detection is highly variable and DUX4 protein detection is extremely rare. DUX4 target gene expression is detected in inflamed FSHD muscle biopsies but less robustly in non-inflamed muscle, implying that a systemic component may also activate DUX4 expression in vivo2,48,49. Despite the added complexity of DUX4 expression in the muscle biopsy setting, all anti-DUX4 therapies currently under consideration were first investigated in myogenic cells in vitro.’

A second reason for caution in extrapolating correlates from the in vitro model to the disease process in muscle tissue is that in vivo there is a continual source of replacement cells, as the authors have shown in a previous study. Have the authors attempted to model a situation in which new cells are provided into the system at some rate, related to the amount of death occurring at different times? Although the authors mention that the static cell number is a limitation of the model, it would be valuable to revisit or explore this idea in the Discussion section, if only to provide the reader with a more pragmatic perspective.

We are focused specifically on modelling the in vitro differentiated myogenic cell setting to provide an in silico pre-screening tool for assessing anti-DUX4 therapy, rather than attempting to model the full pathology in vivo, which given data limitations, is beyond the scope of our study. As the reviewer notes, we highlight this limitation relating to proliferation when we introduce the model in the results stating: ‘Cells are assumed not to proliferate over the evolution of the model. We restrict applications to differentiating cells which have exited the cell cycle.’

As the reviewer likely appreciates, DUX4 expression in proliferating cells differs significantly from differentiated cells. The single nuclear and single cell RNA-seq data we employ to derive the parameters underlying DUX4 and DUX4 target transcription rates are from non-proliferating myocytes differentiated for 3 days. Hence it is not appropriate to include proliferation in this model, as the transcription rates we use will not be accurate for proliferating cells.

Introducing proliferation will also require us to estimate the proliferation rate of FSHD myoblasts, and how it is impacted by DUX4 expression, as well as how DUX4/target transcription rates change as the proliferating cells differentiate. Though this is an interesting application there is not sufficient data available to produce this wider scale model at present.

We thus restrict application of our model to the non-proliferating differentiated setting, which can be easily accessed in vitro to screen anti-DUX4 therapy.

We appreciate the observation that in vivo, there will be addition of new cells during muscle regeneration, and will include this in discussion in the updated manuscript. We will also expand on our discussion of what data would be required to include proliferation in our model in the revised manuscript.

Changes to manuscript:

Page 14, Lines 500-503: We have added the following to the Discussion:

‘Moreover, new data describing scRNAseq of proliferating FSHD primary myoblasts, alongside quantification of how DUX4 and DUX4 target gene expression impacts myoblast proliferation rates, would allow natural extension of our model to understand DUX4 expression during FSHD myoblast proliferation.’

The second part of the paper models presence of DUX4 in nuclei based on diffusion from expressing to non-expressing nuclei, and characterizes this as the activity of "an infectious agent, able to spread from one nucleus to another by adapting epidemiological compartment models". The relationship to an infection process is probably not the ideal way to characterize this process, because an infection implies the setting up of new sites of production of the agent, DUX4, where what is really happening is that DUX4 diffusing into these other nuclei isn't leading to more DUX4 production, it is just diffusing into nearby nuclei and accumulating there. Unless I am misunderstanding, the authors are simply showing that a larger number of nuclei will be positive in a system in which cells are fusion products having many nuclei than in a system in which all nuclei are isolated within their own cells.

The term ‘infectious agent’ is only an analogy and implies some preconceptions. The reviewer interprets our infectious agent model as: ‘an infection implies the setting up of new sites of production of the agent, DUX4, where what is really happening is that DUX4 diffusing into these other nuclei isn't leading to more DUX4 production, it is just diffusing into nearby nuclei and accumulating there. Our bespoke model is not based on a direct mapping of e.g., viral infection to our setting. In our case, the ‘infectious agent’ DUX4 causes harm but does not replicate. DUX4 protein is a transcription factor and thus activates the expression of target genes (not including DUX4). In our model, DUX4 can be seen as the ‘infectious agent’, and expression of DUX4 target genes can be seen as the ‘infection’, which leads to cell death. DUX4 can either stay in the ‘infected’ cell until it dies or diffuse to another cell to spread the ‘infection’.

The reviewer interprets the results of our model as ‘simply showing that a larger number of nuclei will be [DUX4] positive in a system in which cells are fusion products having many nuclei than in a system in which all nuclei are isolated within their own cells’.

In fact, we make several observations: firstly we compare single cell RNA-seq of unfused myocytes to single nuclear RNA-seq of fused muti-nucleated myotubes and find that the latter has a greater proportion of cells expressing DUX4 target genes (i.e. more infection). The proportion of DUX4 positive cells (i.e. the amount of infectious agent produced) is similar in the two settings. We posit that this difference in DUX4 target gene positive cells (infection) may be due to DUX4 protein diffusing between cells in the myotube syncytium and activating the targets in neighbouring nuclei (i.e., greater mobility of the infectious agent/loss of quarantine). Note that there are many other possibilities, such as fusion causing an increase in DUX4 transcription/mRNA stability etc.

We find that by introducing a DUX4 protein diffusion term (loss of quarantine) into our model we can completely explain the difference between DUX4 target gene expression (infection) in real data of unfused versus fused myonuclei, despite identical levels of DUX4 (infectious agent). This is a novel finding providing evidence for DUX4 protein diffusion in syncytial myonuclei, which represents an often overlooked therapeutic target/consideration in FSHD. We also provide the first explicit quantification of this diffusion rate via our model as well as a framework for investigators to understand how modifying this parameter can impact DUX4 induced myotoxicity.

We will better define our ‘infectious agent’ analogy and clarify our interpretation in the updated manuscript.

Changes to Manuscript:

Page 9, Lines 338-343: We add the following clarification when we introduce our infection analogy model in the Results:

The term ‘infect’ is only an analogy and our model is not based on a direct mapping of e.g., viral infection to our setting. In our case, the ‘infectious agent’ DUX4 causes harm but does not replicate. DUX4 protein is a transcription factor and thus activates the expression of target genes. In our model, DUX4 can be seen as an ‘infectious agent’, and expression of DUX4 target genes can be seen as the ‘infection’, which leads to cell death. DUX4 can either stay in the ‘infected’ cell until it dies or diffuse to another cell to spread the ‘infection’.’

Page 14, Lines 514-525: We have edited our interpretation of the ‘infectious agent’ model results in the Discussion:

‘DUX4 has also been proposed to spread through the myofibre syncytium from its originator nucleus to ‘infect’ DUX4 naïve nuclei, bypassing the need for a rare expression burst and accelerating pathology26,48. The proportion of DUX4 target positive cells is greater in snRNAseq of syncytial FSHD myotubes than scRNAseq of unfused myocytes, while the proportion of DUX4 positive cells is comparable, supporting a syncytial diffusion mechanism. We model

DUX4 in FSHD myotubes as an infectious agent, able to spread from one nucleus to another (but not replicate) by adapting epidemiological compartment models, which we package as a cellular automaton to provide a realistic myotube surface on which to monitor DUX4 expression. Incorporation of a diffusion term is sufficient to account for the difference in DUX4 target gene +ve/-ve nuclear proportions, between syncytial FSHD myotubes and unfused FSHD myocytes, while maintaining comparable proportions of DUX4 +ve cells. We thus demonstrate that the theoretical hypothesis of DUX4 syncytial diffusion is compatible with biological data, and provide an estimate for the DUX4 diffusion rate.’

Reviewer #2 (Evidence, reproducibility and clarity (Required)):

An in silico FSHD muscle fibre for modelling DUX4 dynamics

Summary

This paper studies and mathematically models the stochastic process of presence or absence of DUX4 expression in individual myocytes in FSHD, which underlies the very small proportion of individual cells which do express DUX4. The paper is valuable in providing a mathematical basis for modelling this, and hence a basis by which to study the effects of other factors or potential therapeutic interventions which may influence the overall proportion of cells which do express.

We thank the reviewer for the kind comments on our manuscript and recognition of its value to design/screening of potential therapeutics for FSHD.

To understand the paper fully, requires a familiarity with mathematical and statistical reasoning which will be beyond many potential readers, including this reviewer, but I hope that the following specific comments may still be helpful.

We appreciate the reviewer’s candour regarding their understanding of the more advanced mathematical aspects the paper. Interdisciplinary work such as ours may introduce concepts which are unfamiliar to some readers.

As the reviewer will appreciate, understanding a complex process like DUX4 expression requires new and more sophisticated approaches, having largely eluded the conventional approaches to date. The main mathematical tools employed in our work derive from the wellestablished stochastic gene expression and epidemiological compartment models. We have endeavoured to make the mathematics self-contained to facilitate understanding and will edit to further improve accessibility.

Changes to Manuscript:

Multiple edits have been made in response to the helpful comments of both reviewers which has improved accessibility of the more technical aspects of the manuscript.

Major Comments

1. A potential major concern is that the modelling seems to be based largely on combined data from 27 DUX-4 positive myocytes out of 5133 myocytes from 4 FSHD patients described by Van Den Heuvel et al., 2019 (ref 17), but with 2 being FSHD1 and 2 FSHD2, and without the 2 patients within each FSHD type being matched for D4Z4 fragment length (RU = 'residual units'), which can be expected to be a major influencing factor on the threshold for expression in any one myocyte upon which the stochastic process acts. Thus, one can expect lower RU number to show a greater proportion of expressing cells within FSHD1, and to some extent similarly within FSHD2 (although in FSHD2, different SMCHD1 mutations will also have different 'strengths'). The 4 patients were : 2x FSHD1 (3RU ; 6RU); 2x FSHD2 (12RU ; 18 RU). In the paper of Den Heuvel et al. it is evident that the DUX4-positive cell count differs between these 4 cell lines very much as might be anticipated from their D4Z4 RU number. Thus (Figure 2 in that paper) patient 1.1 (3RU) has 12 positive cells, whereas patient 1.2 (6RU) has only 2 positive cells. Patient 2.1 (SMCHD1 exon 37 mutation, and 12 RU) has 11 positive cells, whereas Patient 2.2 (SMCHD1 exon 10 mutation, and 18 RU) has only 2 positive cells.

Therefore, if the validity and accuracy of the modelling here would be affected by genetic heterogeneity, the authors should present their calculations and analyses based only on the 2 patients who account for 23/27 of the positive cells, and furthermore, that the calculations and modelling are performed and presented for those 2 patients separately.

The reviewer raises the important point that 2/5 of the parameters of our model were derived from the single cell RNA-seq data set of van den Heuvel et al., which comprises 4 FSHD patients of various genotype. The reviewer remarks that the number of DUX4 positive cells reported for each patient in the van den Heuvel data, differs according to genotype, in line with the known inverse association between DUX4 expression and D4Z4 short allele length. Figure 2E in van den Heuvel et al., to which the reviewer refers, does not present absolute DUX4+ cell counts, but rather ‘DUX4 affected’ cell counts, based on the expression of 67 DUX4 target genes. The numbers in that figure thus differ from ours. We caution that the number of ‘DUX4 affected’ cell counts must be normalised by the total number of cells assayed per patient, to obtain a ‘DUX4 affected’ proportion for each patient. However, we appreciate the reviewers point.

A concern is raised that we have pooled all 4 FSHD patients to derive our parameter estimates, essentially averaging over genotypes. The two parameters we derived from this data are ‘the average DUX4 transcription rate, and ‘the average DUX4 target gene transcription rate.

The average DUX4 target gene transcription rate was derived from consideration of DUX4 target gene expression in the 27 FSHD cells already expressing DUX4 mRNA. Since the capacity of DUX4 to activate its target genes is not known to depend on genotype, we feel it is appropriate to pool the patients to estimate this parameter.

For the average DUX4 transcription rate however, we do appreciate the reviewer’s concern that patient genotype may impact the estimate. The reason for pooling all patients is partly statistical. As there are so few cells expressing DUX4 in each patient, the gradient descent algorithm we employed to estimate the 3 rate parameters for the promoter switching model of DUX4, may fail to converge for every patient separately. By pooling patients, we obtained an estimate that may not be true for an individual but represented the ‘average’ transcription rate of DUX4 in these 4 FSHD patients, hence our careful use of wording when defining the average DUX4 transcription rate parameter.

Our in silico model of DUX4 expression is proposed as a pre- in vitro screening tool to guide anti-DUX4 therapy for the general population, not a digital twin of a single FSHD patient, and so an average DUX4 transcription rate derived from pooling available data is optimal.

This average measure is not irrelevant in light of genetic variation and in fact may be more informative for anti-DUX4 therapeutics intended for a general population, but less informative for genotype stratified therapeutic design. Genotypic stratification by D4Z4 repeat length was not performed in recent FSHD clinical trials for anti-DUX4 therapy (e.g., losmapimod: NCT04003974), which recruited patients with all pathogenic D4Z4 repeat lengths, motivating an average over genotype design in the first instance.

We will elaborate in detail on the reasons for patient pooling throughout the manuscript.

We will attempt the analysis the reviewer suggests i.e. ‘the authors should present their calculations and analyses based only on the 2 patients who account for 23/27 of the positive cells…separately’. However, this is conditional on algorithm convergence given the smaller number of cells, and we must caveat that in doing so we will have eliminated 2 patients purely on the basis of low DUX4 expression and so will over-estimate the average DUX4 transcription rate for FSHD patients in general.

New analysis:

We are pleased to report that our gradient descent algorithm converged enabling estimation of the DUX4 transcription rate VD for the 2 patients with the largest number of DUX4+ve cells, as the reviewer requested. Our pooled estimate of VD based on all 4 patients was 0.0021/hour. The estimate from patient FSHD1.1 described by van den Heuvel et al., was 0.00374/hour and from patient FSHD2.1 was 0.00096/hour. We note that the average value for these two patients is 0.00235/hour, comparable but slightly higher than our pooled estimate across all 4 patients. This is in line with our expectation that eliminating 2/4 patients on the basis of low DUX4 expression may contribute to an over-estimate of the true average DUX4 transcription rate.

Personalised estimates of the DUX4 target gene transcription rate VT could not be computed as this required us to estimate parameters of the promoter switching model for the 8 DUX4 target genes, from consideration of only 19 DUX4 +ve cells for patient FSHD1.1 and 5 DUX4+ve cells for patient FSHD2.1. For 3/8 genes for patient FSHD1.1 and 1/8 genes for patient FSHD2.1 our gradient descent algorithm failed to converge to MLEs of the parameters, preventing calculation of VT. We explore the parameters which did converge in response to point 9 below.

Given this and the fact that we are constructing a general model rather than a personalised model (for which there is insufficient data), we believe it is most appropriate to present all subsequent calculations in the manuscript using the pooled estimates of both VD and VT.

However, for the editor’s interest, we also ran our compartment model employing the values of VD separately for patients FSHD1.1 and FSHD2.1, with the pooled value VT. In Author response image 1 and 2 we have recreated figure 4A for each patient, comparing model simulated cell proportions to the true cell proportions for each patient in the scRNAseq data. We see that for each patient the simulated proportions are statistically indistinguishable from the data as assessed by Chi Squared test, as for the pooled model:

Author response image 1
Model simulation with VD derived for patient FSHD1.

1, using appropriate starting cell numbers for this patient: Model vs Data Chi Squared p=0.99.

Author response image 2
Model simulation with VD derived for patient FSHD2.

1, using appropriate starting cell numbers for this patient: Model vs Data Chi Squared p=0.99.

Though encouraging, these models are not fully personalised as they use a pooled VT, to avoid the risk that readers interpret this model as fully personalised, we choose not to include these data in the published manuscript and focus instead on the general model derived from only pooled estimates.

Changes to Manuscript:

Page 6, Lines 196-206: We have updated the Results to include:

‘As the total number of DUX4 positive cells is low, we pooled data from the 4 FSHD patients to allow robust estimation of the average DUX4 transcription rate for this patient cohort. We note, however, that distinct FSHD genotypes likely underlie different DUX4 transcription rates. The majority of DUX4 +ve cells (23/27) were found in two FSHD patients for this scRNAseq data. For patient FSHD1.1 19/2226 (0.85%) cells were DUX4 +ve and for patient FSHD2.1 5/1283 (0.39%) cells were DUX4 +ve (Supplementary File 1). To investigate variability in VD, we derived individual estimates for these two ‘higher’ DUX4 expressing patients, for FSHD1.1 VD = 0.00373/hour, while for FSHD2.1 VD = 0.000960/hour. The pooled estimate of VD is thus comparable in order of magnitude with that of individual patients. To fully utilise the available data and prevent limiting our model to only ‘higher’ DUX4 expressing patients, we employ the pooled estimate of VD = 0.00211/hour for the remainder of our calculations.’

Page 14, Lines 486-499: We have updated the Discussion to include:

‘FSHD is a rare disease and public data sets describing scRNAseq and snRNAseq of patient derived primary myocytes are currently limited to those used in this study. Consequently, we have pooled data describing 4 individual patients of different FSHD genotype to estimate two of the parameters of our model: VT and VD. DUX4 expression levels depend on D4Z4 repeat length and methylation status, and can differ between cell lines isolated from different FSHD patients14,50, as well as between genetically identical cell lines isolated from the same mosaic FSHD patient51. We computed patient specific estimates of VD for 2/4 patients with sufficient data and found these comparable to the pooled estimate. By pooling patients, we obtained parameter estimates that may not be true for an individual but represented the ‘average’ parameter values across these 4 FSHD patients. Our in silico model of DUX4 expression is proposed as a pre- in vitro screening tool to guide anti-DUX4 therapy for the general population. However, as more data on FSHD is generated our models will evolve. In particular, as higher volumes of scRNAseq data describing FSHD patients with different genotypes become available, the model can be updated to facilitate genotype stratified and even personalised antiDUX4 therapy design.’

In order to be of value for assessing potential interventions more widely in FSHD, the authors would then need, in further publications, to extend their analysis similarly to genetically homogeneous myocyte sets with other RU number.

We agree with the reviewer that for genotypically stratified therapeutic design, based on D4Z4 short allele length, personalised or genotype specific estimates of the DUX4 transcription rate should be calculated. However, as the reviewer clearly appreciates, developing such FSHD digital twins, though exciting, would require a much larger library of single cell RNA-seq datasets describing FSHD patients of various genotypes than is currently available.

We will explore this in an updated discussion.

Changes to Manuscript:

See relevant changes to discussion in response to above point.

Specific points.

2. Introduction 4th paragraph: – 'Single cell and single nuclear transcriptomic studies find only 0.5-3.8% of in vitro differentiated FSHD myonuclei express DUX4 transcript16,17. Immunolabelling studies only detect DUX4 protein in between 0.1-5% of FSHD myonuclei18,19.' The authors should comment whether there is any evidence from the data in these references, or from other publications, regarding differences in these proportions by FSHD diagnosis (I or II) or by D4Z4 residual number?.

We will expand discussion in the updated manuscript of the result (explained by the reviewer) that the number of DUX4 affected cells found in the single cell RNA-seq dataset of van den Heuvel may correlate with genotype. We will contrast this with the observation that immortalised isogenic FSHD myoblast clones isolated from a single FSHD patient can also have a very wide range of DUX4 expression (e.g., Krom et al., doi: 10.1016/j.ajpath.2012.07.007). This will highlight the mix of poorly understood genetic and epigenetic factors impacting DUX4 expression and further motivate an average model in the first instance.

Changes to Manuscript:

See relevant changes to discussion in response to the first point raised by the reviewer, in particular:

‘DUX4 expression levels depend on D4Z4 repeat length and methylation status, and can differ between cell lines isolated from different FSHD patients14,50, as well as between genetically identical cell lines isolated from the same mosaic FSHD patient51.’

3. Introduction 5th paragraph: – Importantly, a recent phase 2b clinical trial of the DUX4 inhibitor losmapimod failed to reach its primary endpoint of reduced DUX4 target gene expression in patient muscle, despite improvement in functional outcomes23. Please state here whether or not there was any evidence that DUX4 expression itself was reduced in this trial. It would help if the authors could also add here an interpretive comment as to whether this indicates that the current presumed target genes for DUX4 are not in fact the ones important for FSHD, or whether their expression might be retained from compensatory upregulation by other upstream regulators. The authors should expand briefly also in the introduction on the evidence for toxicity of target gene expression.

We are very happy to include a discussion of these points, which we explore in considerable detail in our recent review of DUX4 in FSHD (Banerji and Zammit 2021, https://doi.org/10.15252/emmm.202013695). Briefly, the losmapimod trial did not publish data showing they measured DUX4, as Reviewer #1 pointed out DUX4 is highly difficult to detect in patient muscle biopsies and dynamic change following anti-DUX4 therapy has certainly never been observed. Expression of four DUX4 target genes was intended as a surrogate measure of DUX4 activity, reasons for its lack of change in this trial is very much a topic of open debate, which we will outline in an updated manuscript.

Changes to Manuscript:

Page 2, Lines 64-70: We include the following statement in the Introduction:

‘Given the challenge of detecting DUX4 in muscle biopsies4 it is unsurprising that no data was published relating to DUX4 expression changes during the losmapimod clinical trial. An understanding as to why losmapimod did not alter expression of DUX4 targets in patient muscle biopsies is also lacking, but hypotheses include heterogeneity in muscle sampling, low baseline levels of DUX4 targets and thus limited dynamic range, slow reversibility of DUX4 induced epigenetic changes on target gene promoters and losmapimod having limited impact on DUX4 transcriptional activity in vivo.’

4. Results 2nd paragraph: – We first describe the compartment model. FSHD single myocytes can express DUX4 and therefore DUX4 target genes; DUX4 target gene expression leads to cell death19,27,28. Please indicate whether the evidence in these studies is independent of DUX4. Ie. Is there independent evidence that it is definitely the target gene expression, rather than DUX4 expression per se, that leads to cell death, especially if losmapimod improves function by inhibiting DUX4, but has no impact on these target genes.

This is an important point raised by the reviewer and one that we have rigorously investigated (Knopp et al., 2016, https://doi.org/10.1242/jcs.180372).

The c-terminus of DUX4 is a potent transcriptional activator and is required for DUX4 target gene activation, while an inverted centromeric D4Z4 repeat unit encodes a version of DUX4 lacking the c-terminus, named DUX4c. We constructed a library of DUX4 expression constructs encoding: DUX4, DUX4c, tMALDUX4 (DUX4 with the c-terminus removed), DUX4-VP16 (DUX4 with the c-terminus replaced by the VP16 transactivation domain) and DUX4-ERD (DUX4 with the c-terminus replaced by the engrailed repressor domain). In a series of experiments outlined in Knopp et al., we clearly demonstrated that only DUX4 and DUX4-VP16 could activate DUX4 targets in C2C12 murine myoblasts and that only these constructs could induce apoptosis. Our prior work provides this clear link between DUX4 target gene expression and cell death. We will include discussion of this point in the updated manuscript.

Changes to Manuscript:

Page 4, Lines 109-110:We include the following statement in the Results:

‘We have previously demonstrated, via a library of DUX4 expression constructs, that DUX4 target gene activation is also necessary for DUX4-induced cell death31.’

5. Results 2nd paragraph Item 4 : – R(t) – a resigned state where the cell expresses no DUX4 mRNA but does express DUX4 target mRNA (DUX4 -ve/Target gene +ve: i.e. a historically DUX4 mRNA expressing cell) Since the mRNA is from the target genes, why should it be produced only in response to DUX4 expression ? ie. Please indicate the evidence to say that these must be 'historically DUX4 mRNA expressing cells', rather than target gene expression in response to other factors, or even autonomously.

The reviewer raises the point that the 8 DUX4 target genes we investigate may be expressed in a DUX4 independent manner, making DUX4 target gene +ve, DUX4 -ve cells, potentially not historic DUX4 expressing cells. We selected these 8 genes as they are confirmed DUX4 targets by ChIP-seq and have been identified as upregulated by DUX4 in every transcriptomic analysis of DUX4 over-expressing human myoblasts (Banerji and Zammit 2021, https://doi.org/10.15252/emmm.202013695). We provide further evidence for these genes indicating historic DUX4 in these cells: first, we never find a single transcript for any of these 8 target genes in 1914 single myocytes and 77 single nuclei from control individuals, suggesting that their expression requires an FSHD genotype (and thus likely DUX4 expression). Second, our investigation of these genes in the single cell RNA-seq data set of van den Heuvel demonstrated that in the presence of DUX4, expression of all 8 genes significantly increase (Figure 3D), indicating activation by DUX4.

While this is evidence that the 8 targets are specific to DUX4, we accept that it is not 100% conclusive and these genes may also be induced by some other factor related to the FSHD genotype. We will thus highlight this possibility of other modes of expression in the discussion.

Changes to Manuscript:

Page 13, Lines 458-472: We include the following statement in the Discussion:

‘We further assumed that cells expressing our 8 DUX4 target genes are historic DUX4 expressing cells. We selected these 8 genes as they are confirmed DUX4 targets by ChIP-seq and have been identified as upregulated in every transcriptomic analysis of DUX4 overexpressing human myoblasts4. Tissue expression patterns of these 8 genes are not well characterised outside of the FSHD context, however their expression has been reported during zygotic genome activation46 and in testicular tissue1,2, both settings where DUX4 is physiologically expressed. We found further evidence for expression of these genes indicating historic DUX4 in this study. First, we never find a single transcript for any of these 8 target genes in 1914 single myocytes and 77 single nuclei from control individuals, suggesting that their expression requires an FSHD genotype (and thus likely DUX4 expression). Second, our investigation of these genes in scRNAseq data of FSHD myocytes demonstrated that in the presence of DUX4, expression of all 8 genes significantly increases, indicating activation by DUX4. While this is evidence that the 8 targets are specific to DUX4, it is possible that they may also be induced at a much lower level, by some other factor related to the FSHD genotype and thus a small proportion of DUX4 target gene +ve cells in this study may not be historically DUX4 +ve.’

6. Results – Estimating the kinetics of DUX4 mRNA – 2nd paragraph: – DUX4 expression was induced with 250 ng/ml doxycycline for 7 hours.…. Might some of the laboratory detail here be better placed in the 'Methods' section ?

We will move some methodological detail to the methods as suggested.

Changes to Manuscript:

Details on concentrations of doxycycline and duration of induction have been moved from Results page 5 to Materials and methods page 16.

7. Results – Estimating the kinetics of DUX4 mRNA – 4th paragraph: – '…from 4 FSHD patients (2 FSHD1 and 2 FSHD2).' Please give additional information about these patients – ie. Age, sex, severity (or age-onset), and genetic results – perhaps best done as a Table in this paper rather than only by reference back to the Van den Heuvel et al. paper (see main comment 1 above).

We will provide this information in a table as the reviewer suggests.

Changes to Manuscript:

We now include a supplementary table: Supplementary File 1 detailing for each FSHD patient described by van den Heuvel et al., age, sex, genetic results as well as DUX4 +ve/-ve cell counts and where appropriate patient specific estimates of VD. We note that details of patient severity were not recorded by van den Heuvel et al.

Page 6, Lines 188-189: This table is referenced in the Results:

‘Patient demographics, genotypes and DUX4+/-ve cell counts are displayed in Supplementary File 1.’

A legend for this table is included on page 24 lines 870-873:

‘Supplementary File 1: Demographics of FSHD patients assayed in scRNAseq data.

For each FSHD patient described in the scRNAseq data set of van den Heuvel et al., we list sex, diagnosis, genotype, number of DUX4 +ve/-ve cells and % of DUX4+ve cells. For patients FSHD1.1 and FSHD2.1 where sufficient data was available, patient specific estimates of VD are also given.’

8. Results – Estimating kinetics of DUX4 target activation – 2nd paragraph: – '8 DUX4 target genes……expression was restricted to FSHD cells/nuclei and never observed in controls.' So are these genes normally only expressed in the zygote , rather than post-birth (except perhaps the testis)? The authors should comment on this.

The 8 DUX4 target genes are not well characterised in terms of their expression pattern in other tissues. We are confident that in our data sets they are not expressed in healthy myocytes and myotubes, but it is difficult to expand confidently on other tissues due to lack of data. We will include comment on what is known about their expression pattern during zygotic genome activation and in testicular tissue, but this cannot be an exhaustive description of their expression. For example, DUX4 has been detected in keratinocytes, osteoblasts and lymphocytes, but the target genes have not been characterised in such settings. We still have much to learn about the expression of these genes outside of the muscle setting.

Changes to Manuscript:

See change in Discussion on page 13, outlined in response to point 5 above, in particular the statement:

‘Tissue expression patterns of these 8 genes are not well characterised outside of the FSHD context, however their expression has been reported during zygotic genome activation46 and in testicular tissue1,2, both settings where DUX4 is also expressed.’

9. Results – Estimating kinetics of DUX4 target activation – 4nd paragraph: – In the presence of DUX4 mRNA the proportion of time the promoters of the 8 DUX4 target genes remained in the active state,….significantly increased. It would be interesting to know if any of these temporal dynamic descriptors of DUX4 target promoters differ consistently between sample II-2 and II-1 and in the same direction between I-2 and I-1. ie. Might the same factors (eg. D4Z4 copy number) which may affect presence/absence of DUX4 also affect other cellular measures of DUX 4 ?

The reviewer raises an interesting point about genetic modifiers of DUX4 transcriptional

activation. As discussed in response to point 1, the DUX4 target gene transcription rate is derived from the 27 cells expressing DUX4. As there are so few cells expressing DUX4 in each patient (sometimes <3), the gradient descent algorithm we employ to estimate the 3 rate parameters for the promoter switching model of each DUX4 target, will fail to converge for every patient separately.

We will attempt analysis of the two patients which provide the majority of DUX4 positive cells separately, as the reviewer requested in point 1. If successful, we will investigate whether the derived DUX4 target gene expression parameters differ between these two patients as the reviewer suggests. While this limited data will not permit anything definitive about impact of D4Z4 repeat length on DUX4 target gene expression, it may highlight the existence of interindividual differences in such parameters.

New Analysis

We have attempted the analysis suggested by the reviewer: comparing the promoter switching model parameters for the 8 DUX4 target genes, between DUX4+ve and DUX4-ve cells separately for patients FSHD1.1 and FSHD2.1. Unfortunately, as remarked in our response to point 1 above, insufficient data prevented parameter estimation for 3/8 of the DUX4 target genes (PRAMEF12, PRAMEF2 and RFPL2) for patient FSHD1.1 and 1/8 DUX4 target genes (RFPL2) for patient FSHD2.1. Consequently, our comparison between patients is limited to the genes for which we could derive parameter estimates.

Despite this limitation, for both patients FSHD1.1 and FSHD2.1 parameter differences observed between DUX4 +ve and DUX4-ve cells were consistent with those seen for all cells pooled.

In particular, as with the pooled estimates, the proportion of time the promoter of the target genes remained in the active state was higher in DUX4 +ve compared to DUX4 -ve cells for each patient separately. This achieved significance for patient FSHD2.1 (paired Wilcoxon p=0.0156, n=7 genes), with borderline significance for patient FSHD1.1 (paired Wilcoxon p=0.0625, n=5 genes). The lower significance for FSHD1.1 may be related to the smaller number of genes for which we could derive parameter values.

Again, as with the pooled estimates, the transcription rate of the target genes again was higher in the DUX4-ve cells compared to DUX4+ve. This trend approached significance for FSHD2.1 (paired Wilcoxon p=0.0781, n=7 genes), but was not significant for FSHD1.1 (paired Wilcoxon p=0.35, n=5 genes).

Lastly as with the pooled estimates, the mean expression of the target genes was higher in DUX4+ve cells compared to DUX4-ve. This was significant for FSHD2.1 (paired Wilcoxon p=0.0156, n=7 genes) and approached significance for FSHD1.1 (paired Wilcoxon p=0.0625, n=5 genes).

We include these data as supplementary information to the updated manuscript.

Changes to Manuscript:

Page 8, Lines 267-276: The following has been added to the Results:

‘As with our calculation of VD data was pooled across 4 FSHD patients to calculate VT. We do not anticipate patient genotype to impact the average DUX4 target transcription rate, independently of its impact on the DUX4 transcription rate. However, to confirm our findings on the impact of DUX4 on target gene promoter dynamics, in a patient specific setting, we attempted calculation of the parameters underlying the promoter switching model for the 8 DUX4 target genes in DUX4+ve and DUX4-ve cells, for patients FSHD1.1 and FSHD2.1 separately. Due to the limited number of cells for each patient, personalised estimates for all 8 target genes could not be obtained. However, where patient specific estimates were obtained the direction of parameter differences in target genes, between DUX4 +ve and DUX4 -ve cells were in line with those of pooled estimates across 4 FSHD patients (Supplementary File S2).’

Page 24, Lines 874-878: A supplementary table describing the target gene parameter estimates is also provided to the manuscript alongside a legend:

‘Supplementary File 2: Promoter switching model parameter estimates for DUX4 target genes in patients FSHD1.1 and FSHD2.1.

For each of the 8 DUX4 target genes considered the three parameters underlying the promoter switching model (kann,kinn and v0nn) derived from DUX4+ve and DUX4-ve cells separately for patients FSHD1.1 and FSHD2.1 separately are presented.’

10. Results – Compartment model simulation – 4th paragraph: – '…so that after 10 days 26.3% of cells had died.' The authors need to give data here also for controls. Ie. What proportion of cells die after 10 days in controls ?

As mentioned in the introduction of our model, we only model DUX4 dependent cell death. As control cells do not express DUX4 they cannot experience cell death by this mechanism. The reviewer can therefore interpret this result as excess death due to DUX4, i.e., the 26.3% of FSHD cells die which would not have died in control cells. We will clarify this point in an updated manuscript.

Changes to Manuscript:

Page 9, Lines 307-309: We have made the following change to the Results:

‘As expected, the number of cells in the DUX4 naive, susceptible state S(t) gradually decreased, while the number of dead cells due to DUX4 D(t) gradually rose, so that after 10 days 26.3% of cells had died as a consequence of DUX4 expression.’

11. Discussion – 1st paragraph: – ' However, DUX4 expression in FSHD muscle demonstrates a complex dynamic with DUX4 mRNA, protein and target gene accumulation all difficult to detect2,21. Understanding this complex dynamic is essential to the construction of optimal therapy' So, it presumably follows that comparison by known genetic or demographic modifiers of disease severity (such as D4Z4 copy number could be of fundamental importance in interpreting this dynamic and hence the results of clinical trials) (see Main point 1, and point 7).

This point appears to generally be a reiteration of the reviewer’s initial concerns on genotypic variation in the van den Heuvel et al., data set. We have addressed these concerns in response to points 1, 2 and 9.

Changes to Manuscript:

Please see changes in response to points 1,2 and 9 above.

12. Discussion – 6th paragraph: – 'Our model predicts <2.9% of single FSHD myocytes will be DUX4 target gene positive at any given time…' Should this be: '…DUX4 target gene-mRNA positive'. Also, does this 2.9%-figure differ between individual patients, and therefore might generally differ between FSHD1 and 2, and between different D4Z4 genetic categories ?; as this will matter for different individual patients.

We will update the manuscript to amend to ‘DUX4 target gene mRNA positive’. The reviewer again raises the concern of genotypic variability, we will of course discuss this and perform additional analysis as detailed above (points 1, 2, 9 and 11).

Changes to Manuscript:

Page 14, Line 526: We have made the following change to the Discussion:

‘Our model predicts <2.9% of single FSHD myocytes will be DUX4 target gene mRNA positive at any given time, despite significant DUX4 driven cell death.’

Please see additional changes in response to points 1,2 and 9 above, that also pertain to 11 and 12.

Reviewer #2 (Significance (Required)):

Please see comments in the 'Evidence, reproducibility and clarity' box above.

This paper is valuable in providing a mathematical basis for modelling the stochastic process of presence or absence of DUX4 expression in individual myocytes in FSHD (facioscapulohumeral muscular dystrophy), and hence a basis by which to study the effects of other factors or potential therapeutic interventions which may influence the overall proportion of muscle cells which do express DUX4, and hence succumb to its toxicity.

This is a specialist field and the nature of presentation of the mathematical modelling in the paper will restrict accessibility to only a very few researchers in this field.

We appreciate that introducing concepts from different fields such as mathematics can present a challenge to some readers. However, new methods and interdisciplinary science are essential to understand complex disorders and this is even more so in complex rare diseases such as FSHD, where diversity in expertise is important. ‘Stochastic’ has been used to describe DUX4’s expression pattern for many years and is commonly used in many FSHD papers and conferences. We have introduced concepts from the well established field of ‘stochastic gene expression’ to understand DUX4 expression in FSHD. The methods are different from conventional approaches used in FSHD, but are arguably the most natural framework in which to understand DUX4.

We have provided user-friendly programs that can be used to estimate the effects on DUX4 parameters of multiple variables to make the paper useful, even if some of the more advanced mathematics is challenging. We will also further edit to increase accessibility.

Changes to Manuscript:

Multiple edits have been made in response to the helpful comments of both reviewers which has improved accessibility of the more technical aspects of the manuscript.

There appears to be a possible major problem in the way that the paper has combined data from patients with different genetic forms of FSHD, and different likely genetic severities. It would seem better to analyse the data separately for each of these to allow a more homogenous platform on which to construct a model.

That we pooled data from 4 FSHD patients, to derive estimates of 2 of the 5 parameters of our model is the major (non-editorial) comment about our study, and is raised in various versions in points 1, 2, 9, 11 and 12 of reviewer 2’s response. We have explained above that the reason for pooling is partly statistical, due to dataset size limitation. Considering each patient separately may prevent convergence of a gradient descent algorithm for parameter estimation. We again emphasise that pooling 4 FSHD patients, while less personalised, does not make our estimate of the average DUX4 transcription rate in FSHD patients invalid. It is simply an average level across our 4 patients, perhaps higher than for some patients but lower than for others.

As commented, we will repeat the analysis separately for the 2 patients who have larger numbers of DUX4 positive cells and report the findings, with the caveat that these estimates will, by definition, be over-estimates of the true value of the average DUX4 transcription rate for the FSHD population.

Changes to Manuscript:

Please see changes made in response to changes 1, 2 and 9 that also pertain to 11 and 12.

https://doi.org/10.7554/eLife.88345.sa2

Article and author information

Author details

  1. Matthew V Cowley

    Centre for Sustainable and Circular Technologies, Department of Chemistry, University of Bath, Bath, United Kingdom
    Contribution
    Formal analysis, Investigation, Methodology, Writing – review and editing
    Contributed equally with
    Johanna Pruller
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-5258-8024
  2. Johanna Pruller

    King's College London, Randall Centre for Cell and Molecular Biophysics, New Hunt's House, Guy's Campus, London, United Kingdom
    Contribution
    Conceptualization, Formal analysis, Investigation, Methodology, Writing – review and editing
    Contributed equally with
    Matthew V Cowley
    Competing interests
    No competing interests declared
  3. Massimo Ganassi

    King's College London, Randall Centre for Cell and Molecular Biophysics, New Hunt's House, Guy's Campus, London, United Kingdom
    Contribution
    Conceptualization, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-3163-9707
  4. Peter S Zammit

    King's College London, Randall Centre for Cell and Molecular Biophysics, New Hunt's House, Guy's Campus, London, United Kingdom
    Contribution
    Conceptualization, Supervision, Funding acquisition, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-9562-3072
  5. Christopher RS Banerji

    1. King's College London, Randall Centre for Cell and Molecular Biophysics, New Hunt's House, Guy's Campus, London, United Kingdom
    2. The Alan Turing Institute, British Library, London, United Kingdom
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Supervision, Funding acquisition, Investigation, Methodology, Writing – original draft, Project administration, Writing – review and editing
    For correspondence
    cbanerji@turing.ac.uk
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-4373-7657

Funding

EPSRC Centre for Doctoral Training in Sustainable Chemical Technologies (EP/L016354/1)

  • Matthew V Cowley

Friends of FSH Research

  • Matthew V Cowley

Muscular Dystrophy UK (19GRO-PG12-0493)

  • Johanna Pruller

FSHD Society (FSHD-Winter2021-4491649104)

  • Johanna Pruller

Medical Research Council (MR/S002472/1)

  • Massimo Ganassi

Association Francaise contre les Myopathies

  • Peter S Zammit

SOLVE FSHD

  • Massimo Ganassi

The Turing-Roche Partnership

  • Christopher RS Banerji

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

CSRB was supported by the Turing-Roche Partnership. MVC was supported by the EPSRC Centre for Doctoral Training in Sustainable Chemical Technologies (EP/L016354/1) and Friends of FSH research (Project: An in-silico approach to understanding DUX4 expression). JP was supported by Muscular Dystrophy UK (19GRO-PG12-0493) and is currently by the FSHD Society (FSHD-Winter2021-4491649104). MG was supported by the Medical Research Council (MR/S002472/1) and now by SOLVE FSHD. The Zammit lab was also generously supported by Association Française contre les Myopathies.

Senior and Reviewing Editor

  1. Murim Choi, Seoul National University, Republic of Korea

Version history

  1. Preprint posted: December 12, 2022 (view preprint)
  2. Received: April 13, 2023
  3. Accepted: May 14, 2023
  4. Accepted Manuscript published: May 15, 2023 (version 1)
  5. Version of Record published: June 22, 2023 (version 2)

Copyright

© 2023, Cowley, Pruller et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 728
    Page views
  • 114
    Downloads
  • 1
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Matthew V Cowley
  2. Johanna Pruller
  3. Massimo Ganassi
  4. Peter S Zammit
  5. Christopher RS Banerji
(2023)
An in silico FSHD muscle fiber for modeling DUX4 dynamics and predicting the impact of therapy
eLife 12:e88345.
https://doi.org/10.7554/eLife.88345

Further reading

    1. Cell Biology
    2. Developmental Biology
    Simon Schneider, Andjela Kovacevic ... Hubert Schorle
    Research Article

    Cylicins are testis-specific proteins, which are exclusively expressed during spermiogenesis. In mice and humans, two Cylicins, the gonosomal X-linked Cylicin 1 (Cylc1/CYLC1) and the autosomal Cylicin 2 (Cylc2/CYLC2) genes, have been identified. Cylicins are cytoskeletal proteins with an overall positive charge due to lysine-rich repeats. While Cylicins have been localized in the acrosomal region of round spermatids, they resemble a major component of the calyx within the perinuclear theca at the posterior part of mature sperm nuclei. However, the role of Cylicins during spermiogenesis has not yet been investigated. Here, we applied CRISPR/Cas9-mediated gene editing in zygotes to establish Cylc1- and Cylc2-deficient mouse lines as a model to study the function of these proteins. Cylc1 deficiency resulted in male subfertility, whereas Cylc2-/-, Cylc1-/yCylc2+/-, and Cylc1-/yCylc2-/- males were infertile. Phenotypical characterization revealed that loss of Cylicins prevents proper calyx assembly during spermiogenesis. This results in decreased epididymal sperm counts, impaired shedding of excess cytoplasm, and severe structural malformations, ultimately resulting in impaired sperm motility. Furthermore, exome sequencing identified an infertile man with a hemizygous variant in CYLC1 and a heterozygous variant in CYLC2, displaying morphological abnormalities of the sperm including the absence of the acrosome. Thus, our study highlights the relevance and importance of Cylicins for spermiogenic remodeling and male fertility in human and mouse, and provides the basis for further studies on unraveling the complex molecular interactions between perinuclear theca proteins required during spermiogenesis.

    1. Cell Biology
    2. Structural Biology and Molecular Biophysics
    Bronwyn A Lucas, Benjamin A Himes, Nikolaus Grigorieff
    Research Advance

    Previously we showed that 2D template matching (2DTM) can be used to localize macromolecular complexes in images recorded by cryogenic electron microscopy (cryo-EM) with high precision, even in the presence of noise and cellular background (Lucas et al., 2021; Lucas et al., 2022). Here, we show that once localized, these particles may be averaged together to generate high-resolution 3D reconstructions. However, regions included in the template may suffer from template bias, leading to inflated resolution estimates and making the interpretation of high-resolution features unreliable. We evaluate conditions that minimize template bias while retaining the benefits of high-precision localization, and we show that molecular features not present in the template can be reconstructed at high resolution from targets found by 2DTM, extending prior work at low-resolution. Moreover, we present a quantitative metric for template bias to aid the interpretation of 3D reconstructions calculated with particles localized using high-resolution templates and fine angular sampling.