Identification of five repeats within the mouse DUX C-terminal region, and truncation analysis of transcriptional activity.

a. (Left) Mouse DUX aligned to itself, revealing five repeats. (Right) Diagram of mouse DUX and human DUX4.

b. Amino acid alignment of mouse DUX C-terminal repeats and comparison to human DUX4. A 14-amino-acid tail follows both the terminal mouse C5 repeat and the sole DUX4 repeat.

c. Phylogenetic analysis showing amino acid alignments of mouse DUX C-terminal repeats vs. the C-terminus of bovine DUXC, the C-terminus of rat DUX4, and the C-terminus of human DUX4. Line length unit is substitutions per base. Purple shading denotes mouse DUX C-terminal repeats.

d. (Left) Schematic of constructs used for flow cytometry: mCherry-tagged full-length DUX (FL) and homeodomains with C1, C12, C123, C1234, or C12345. (Right) Flow cytometry data for MERVL::GFP reporter given mCherry expression in Dux-/- mESCs following 18hr expression of indicated constructs. *p-value < 0.05, Student’s t-test (n=3 biological replicates).

Structure-function analysis of DUX by 2C-like cell conversion.

a. (Top) Schematic of constructs used for flow cytometry: mCherry-tagged FL, ΔC12345+14aa, C1+14aa, C2+14aa, C3+14aa, C4+14aa, and C5+14aa. (Bottom) Flow cytometry for MERVL::GFP reporter given mCherry expression in Dux-/- mESCs following 18hr expression of indicated constructs. *p-value < 0.05, Student’s t-test (n=3 biological replicates). Constructs with a single black line have a GGGGS2 linker; constructs with two black lines have GGGGS2 and GAGAS2 linkers, respectively.

b. (Top) Schematic of constructs used for flow cytometry: mCherry-tagged FL, C3+14aa, C3Δ14aa, C5+14aa, C5Δ14aa. (Bottom) Flow cytometry for MERVL::GFP reporter given mCherry expression in Dux-/- mESCs following 18hr expression of indicated constructs. *p-value < 0.05, Student’s t-test (n=3 biological replicates).

c. (Top) DUX domain chimera construct design for C1-C3 fusions with (top) schematic of constructs and (bottom) specific chimera cut-offs for C1-C3 a,b,c (pink to blue) and C3-C1 a,b,c (blue to pink) constructs. (Bottom) Schematic of DUX domain chimera constructs used for flow cytometry: mCherry-tagged full-length DUX (FL), C1C3a, C1C3b, C1C3c, C3C1a, C3C1b, C3C1c. Symbols adjacent to alignments: the red circle denotes transcriptionally inactive repeats and the green circle denotes transcriptionally active repeats. (Right) Flow cytometry for MERVL::GFP reporter given mCherry expression in Dux-/- mESCs following 18hr expression of indicated constructs. *p-value < 0.05, Student’s t-test (n=3 biological replicates).

d. Flow cytometry for MERVL::GFP reporter given mCherry expression in Dux-/- mESCs following 18hr expression of point mutation constructs: (Top) FL-DUX, C3+14aa, C3 D438E+14aa, C3 D438G+14aa, and C3 F443L+14aa. (Bottom) FL-DUX, C1+14aa, C3+14aa, C1DPLELF+14aa (substitution of C3 amino acids into C1+14aa), and C3GPLELL+14aa (substitution of C2 and C4 6aa sequence into C3+14aa). *p-value < 0.05, Student’s t-test (n=3 biological replicates).

e. Principal component analysis (PCA) of RNA-seq data from 18hr expression of DUX domain chimera constructs, full-length DUX, and mCherry alone (n=2). Colored squares beside sample legends denote the inability to activate MERVL::GFP reporter (red) or ability to activate MERVL::GFP reporter (green).

f. Heat map of RNA-seq at DUX target genes (n=456) from 18hr expression of DUX domain chimera constructs, full-length DUX, and mCherry alone (n=2). Colored rectangles above samples denote inability to activate MERVL::GFP reporter (red) or ability to activate MERVL::GFP reporter (green).

Chromatin modifications at DUX binding sites require an active DUX repeat.

a. (Left) Schematic of mCherry-tagged DUX domain constructs in clonal Dux-/- mESCs used for CUT&Tag experiments. (Right) Anti-mCherry Western blot for indicated constructs after 18hr expression. Black arrows indicate the construct of interest.

b. mCherry CUT&Tag class average map centered at DUX binding sites after 12-hour expression of Dux constructs illustrated in Figure 2a (n=2 biological replicates).

c. mCherry CUT&Tag heatmap centered at DUX binding sites after 12-hour expression of Dux constructs illustrated in Figure 2a (n=2 biological replicates).

d. H3K9ac CUT&Tag class average map centered at DUX binding sites after 12-hour expression of Dux constructs illustrated in Figure 2a (n=2 biological replicates).

e. H3K9ac CUT&Tag heatmap centered at DUX binding sites after 12-hour expression of Dux constructs illustrated in Figure 2a (n=2 biological replicates).

BioID for full-length DUX reveals interaction with proteins involved in chromatin de-repression.

a. Experimental concept for BioID (proximity labeling assay) for BirA*-DUX to identify candidate DUX interacting proteins.

b. Experimental set-up for full-length DUX BioID. The control construct, BirA*-only, was transfected into 2C-like mESCs, and BirA*-DUX was transfected into mESCs (n=2 biological replicates).

c. Volcano plot from DESeq2 analysis of BirA*-only vs. BirA*-DUX, with log2FoldChange on the x-axis and −log(padj) on the y-axis (n=2 biological replicates).

d. Interactome map of proteins enriched for interaction with full-length DUX vs. BirA*-only, grouped by cellular component and protein complex (n=2 biological replicates). Full dataset available in Supplementary Table 1.

e. (Left) Experimental set-up for BioID-like experiment in HEK293T cells with BirA*-DUX and FLAG-tagged candidates SMARCC1 and ZSCAN4D. (Right) Co-immunoprecipitation of FLAG and Western blot after 18-hour expression of transiently transfected FLAG-SMARCC1 or FLAG-ZSCAN4D and BirA*-DUX in HEK293T cells (n=3 biological replicates).

BioID for active C-terminal DUX repeat reveals interactors, including SMARCC1.

a. Experimental constructs for BioID of individual DUX repeats. Constructs consist of N-terminal BirA* linked to either mCherry-HDs-C1+14aa or mCherry-HDs-C3+14aa.

b. Volcano plot from DESeq2 analysis of BirA*mC-C1+14aa vs. BirA*mC-C3+14aa, with log2FoldChange on the x-axis and −log(padj) on the y-axis (n=3 biological replicates).

c. Interactome map of proteins enriched with BirA*mC-C3+14aa vs. BirA*mC-C1+14aa, split into specific cellular components and protein complexes (n=2 biological replicates). Full dataset available in Table 2.

d. Co-immunoprecipitation of endogenous SMARCC1 after 12hr expression of DUX domain constructs in the stable clonal cell lines illustrated in Figure 2a (n=3 biological replicates).

e. SMARCC1 CUT&Tag heatmap at DUX binding sites after 12-hour expression of Dux constructs illustrated in Figure 2a (n=2 biological replicates).

f. ATAC-seq heatmap at DUX binding sites after 12-hour expression of Dux constructs illustrated in Figure 2a (n=2 biological replicates, except full-length DUX, which has 1 biological replicate).

Model of DUX protein mechanism of activation based on C-terminal domain combinations.

a. Summary table of DUX domain constructs (illustrated in Figure 2a) and their transcription, recruitment, chromatin modification, and protein interaction activities.

b. Model for the contributions of DUX domains to DUX interactions and activity.

DUX orthologs do not contain a repeat structure similar to mouse DUX.

a) Human DUX4 aligned to itself.

b) Rat DUX4 aligned to itself.

c) Bovine DUXC aligned to itself.

d) Table with pairwise comparison of percent identity comparing mouse DUX C-terminal repeats, the human DUX4 C-terminus, the rat DUX4 C-terminus, and the bovine DUXC C-terminus.

e) Experimental design for expression of mCherry-tagged DUX domain constructs and flow cytometry of %GFP-positive cells given mCherry tag expression

f) Western blot for mCherry tag after 18hr expression of C1, C12, C123, C1234, C12345

DUX homeodomains are required for transcriptional activity.

a) (Left) Schematic of constructs used for flow cytometry: mCherry-tagged full-length DUX (FL), ΔC12345+14aa, C12345Δ14aa, C1+14aa, C12+14aa, C123+14aa, C1234+14aa, C5+14aa, C45+14aa, C345+14aa, C2345+14aa, ΔHD1, ΔHD2, and Dux-/- alone. (Right) Flow cytometry for MERVL::GFP reporter given mCherry expression in Dux-/- mESCs following 18hr expression of indicated constructs. *p-value < 0.05, Student’s t-test (n=3 biological replicates).

b) Schematic of C1C3 chimera DUX constructs showing amino acid cut-offs for C1C3a, C1C3b, C1C3c, C3C1a, C3C1b, and C3C1c aligned to the five C-terminal repeats

c) Heat map of RNA-seq at DUX target genes (n=456) (22) from 18hr expression of DUX domain constructs FL, C12345Δ14aa, C12345+14aa, C12345Δ14aa (n=2). Colored rectangles above samples denote inability to activate MERVL::GFP reporter (red) or ability to activate MERVL::GFP reporter (green).

d) Phylogenetic analysis of the five mouse DUX C-terminal repeats. Line length unit is substitutions per base.

DUX CUT&Tag data strongly resembles DUX ChIP-seq, and DUX binding sites gain H3K9ac upon DUX expression.

a. Profile plot of mCherry-FL DUX CUT&Tag centered at DUX binding sites after 12-hour expression vs. HA-DUX ChIP-seq (n=2 biological replicates) (12)

b. IGV screenshots at TCSTV3, ZSCAN4F, USP17LB, and ZSCAN4C (DUX target genes) for H3K9ac CUT&Tag in mESCs and after 12hr mCherry-DUX expression, and mCherry CUT&Tag for mCherry tag alone and mCherry-DUX (n=2 biological replicates)

The proximity-labeling assay BioID identifies DUX protein interactors.

a) Replicate structure from BioID DESeq2 analysis of BirA*-only and BirA*-DUX duplicates for mass spectrometry datasets.

b) Venn diagram of proteins significantly enriched for BirA*-DUX interaction by DESeq2 and SAINTexpress

c) Streptavidin pulldown and Western blot after 18-hour BioID induction in clonal stable BirA*-DUX mESC line expressing FLAG-tagged SMARCC1, CHAF1A, ZSCAN4D, or KDM4C with blots for DUX and FLAG-tag (n=3 biological replicates). KDM4C was a BioID hit, but does not interact with DUX, indicating a possible false positive.

d) Streptavidin pulldown and Western blot after 18-hour BioID induction in clonal stable BirA*-DUX mESC line, with blot for N-terminal myc-BirA*-DUX, P300, and Avidin-HRP (n=2 biological replicates)

e) Replicate structure from BioID individual-repeat DESeq2 analysis of BirA*-C1 and BirA*-C3 triplicates for mass spectrometry (n=3 biological replicates)

f) Principal component analysis from DESeq2 analysis of BirA*mC-C1+14aa and BirA*mC-C3+14aa with PC1: 54% variance and PC2: 17% variance