DUX4 is frequently expressed in diverse metastatic cancers.

(A) Matched The Cancer Genome Atlas (gray, TCGA) and advanced metastatic (orange) cancer datasets analyzed in our study.

(B) The proportion of DUX4-expressing cancers in TCGA (purple shading) and metastatic (red shading) cancers. The blue line indicates the median over TCGA cancer cohorts. The 95% confidence intervals were estimated via a two-sided proportion test.

(C) DUX4 expression values (TPM, transcripts per million) in TCGA (purple shading) and advanced metastatic (red shading) cancer cohorts analyzed in our study.

(D) Representative RNA-seq coverage plots from primary and metastatic cancers for reads mapping to the DUX4 cDNA. Open reading frame (ORF, black rectangle); UTR (untranslated region, gray line); Homeobox domains (yellow rectangles).

The DUX4 transcript is likely polyadenylated.

(A) As in (Figure 1A), but The Cancer Genome Atlas (TCGA) cancer cohorts without matched advanced metastatic counterparts analyzed in our study are shown.

(B) A comparison of DUX4 expression values (TPM, transcripts per million) measured from sequencing libraries prepared via poly(A) capture or hybrid capture.

(C) As in (B), but a heatmap where patient samples (columns) were stratified according to the indicated categories of DUX4 expression.

DUX4 expression in advanced cancers is associated with signatures of host anti-tumor immunity inhibition.

(A) Gene Ontology (GO) enrichment network analysis of DUX4-downregulated genes. Differentially expressed genes were identified from the comparison of advanced urothelial carcinoma tumors with high (> 1 TPM) vs. low (≤ 1 TPM) DUX4 expression. The nodes and node sizes correspond to significantly enriched GO terms (Benjamini-Hochberg-adjusted p-value < 0.05) and the number of DUX4-downregulated genes in each, respectively. The edges connecting nodes indicate shared genes.

(B) Downregulated (blue) and upregulated (red) anti-tumor immunity genes in tumors with DUX4-positive (> 1 TPM) vs. -negative (≤ 1 TPM) advanced urothelial carcinomas.

(C) Effector CD8+ T cell score, defined as the mean of the z-score normalized gene expression values in the signature (Mariathasan et al., 2018) for DUX4+/- tumors. The p-value was estimated via a Mann-Whitney U test.

(D) CXCL9 expression for DUX4+/- tumors. The p-value was estimated via a Mann-Whitney U test.

(E) As in (D), but illustrating CXCL10 expression.

(F) Proportion of immune phenotypes in DUX4+/- cancers. The phenotypes were based on the CD8+ T cell abundance and degree of tumor infiltration determined by anti-CD8 staining of tumor FFPE sections in the original study (Mariathasan et al., 2018). The p-value was estimated via a multinomial proportion test.

(G) PD-L1-expression on tumor cells stratified by DUX4 expression status measured by immunohistochemistry in the original study. The samples were categorized based on the percentage of PD-L1-positive tumor cells. The p-value was estimated via a multinomial proportion test.

(H) As in (G), but PD-L1 staining on tumor-infiltrating immune cells (lymphocytes, macrophages, and dendritic cells) is represented.

DUX4-positivity is correlated with an embryonic gene expression signature, downregulation of interferon-gamma signaling, and exclusion of diverse immune cell types.

(A) As in (Figure 2A), but the Gene Ontology (GO) enrichment network analysis corresponding to DUX4-upregulated genes, compared against the set of coding genes, is shown.

(B) Interferon-gamma (IFN-γ) signature score (Ayers et al., 2017). The p-value was estimated via a Mann-Whitney U test.

(C) CD8 T cell score from (Danaher et al., 2017). The p-value was estimated via a Mann-Whitney U test.

(D) As in (C), showing the other immune cell signatures available in (Danaher et al., 2017).

(E) Chemokine signature score (Coppola et al., 2011). The p-value was estimated via a Mann-Whitney U test.

(F) DUX4 expression (TPM) in inflamed, immune excluded, and immune desert tumors. The phenotypes are based on CD8+ T cell abundance and degree of tumor infiltration determined by anti-CD8 staining of tumor FFPE sections in the original study (Mariathasan et al., 2018). The p-values were estimated via the Mann-Whitney U test.

(G) DUX4 expression (TPM) in advanced urothelial carcinoma tumors. The percentage of tumor cells with positive PD-L1 staining are indicated on the x-axis. The p-values were estimated via the Mann-Whitney U test.

(H) As in (G), but showing the percentage of tumor-infiltrating immune cells (lymphocytes, macrophages, and dendritic cells) with positive PD-L1 staining on the x-axis.

DUX4-positivity is associated with decreased response to immune checkpoint inhibition.

(A) The proportion of clinical response classifications (RECIST, Response Evaluation Criteria in Solid Tumors) in DUX4-positive (DUX4+, > 1 TPM) or -negative (DUX4-, ≤ 1 TPM) advanced urothelial carcinoma patients. RECIST categories were assigned in the original study (Mariathasan et al., 2018). The p-value was estimated via a multinomial proportion test.

(B) Kaplan-Meier (KM) estimates of overall survival for the patients in (A) stratified by tumor mutational burden (TMB, number of missense mutations). The estimated survival functions (solid lines), censored events (crosses), and 95% confidence intervals (transparent ribbons) for the patients in the top and bottom TMB quartiles are plotted. The p-value was estimated via a log-rank test.

(C) As in (B), but patients are stratified by DUX4 expression. To control for possible confounding by TMB, the quartile of patients with the lowest TMB was excluded.

DUX4 expression status stratifies patients according to survival.

(A) Kaplan-Meier (KM) estimates of overall survival (solid lines), 95% confidence intervals (transparent ribbons), and censored events (crosses) for ICI-treated advanced urothelial carcinoma patients stratified by DUX4 expression status. The p-value was estimated via a log-rank test.

DUX4 expression status affects clinical response after controlling for other genetic and clinical variables.

(A) Hazard ratios (HR) and 95% confidence intervals for the variables included in univariate (left) or multivariate (right) Cox Proportional Hazards (PH) regression. For categorical variables, the reference groups are indicated by points at HR = 1. Statistically significant predictors that are associated with increased (orange) or decreased (blue) risk in both the univariate and multivariate contexts are highlighted. ECOG (Eastern Cooperative Oncology Group); BCG (Bacillus Calmette-Guerin).

(B) Bayesian information criterion (BIC) measurements for goodness of fit for the full (TMB, Clinical, DUX4 expression) vs. reduced Cox PH models, where lower values indicate better fit. The bootstrapped BIC mean and the 95% confidence interval are illustrated. Clinical (ECOG Performance Status and Platinum treatment history).

(C) Kaplan-Meier (KM) estimates of overall survival, 95% confidence interval (transparent ribbon), and censored events (crosses) for low-risk (solid gray line) and high-risk (solid orange line) patients in the training (left) and test (right) sets. Risk group assignments were based on risk scores estimated by the full Cox PH model. p-values were estimated via a log-rank test.

Cox Proportional Hazards regression models containing DUX4 expression status as a predictor have a better fit to the data.

(A) Akaike information criterion (AIC) measurements for goodness of fit for the full (TMB, Clinical, DUX4 expression) vs. reduced Cox PH models, where lower values indicate better fit. The bootstrapped AIC mean and the 95% confidence interval are illustrated. Clinical (ECOG Performance Status and Platinum treatment history).

(B) Harrell’s concordance indices (C-index) for the full (TMB, Clinical, DUX4 expression) vs. reduced Cox PH models, where high values indicate better model performance. The bootstrapped C-index mean and the 95% confidence interval are illustrated.

(C) Kaplan-Meier (KM) estimates of overall survival, 95% confidence interval (transparent ribbon), and censored events (crosses) for low-risk (solid gray line) and high-risk (solid orange line) patients in the training (left) and test (right) sets. Risk group assignments were based on risk scores estimated by the Cox PH model with only TMB as a predictor. p-values were estimated via a log-rank test.

(D) As in (C), but the risk group assignments were based on risk scores estimated by the Cox PH model with TMB, ECOG Performance Status, and Platinum treatment history as predictors.

(E) Time-dependent Brier scores for the full and reduced Cox PH models applied on the training (left) and test (right) sets. The Continuous Ranked Probability Scores (CRPS), defined as the integrated Brier score divided by time, are shown in parentheses. Reference refers to the Kaplan-Meier prediction model. A Brier score = 0.25 indicates random guessing (gray dashed line).

Cox Proportional Hazards Regression for Overall Survival

Cox Proportional Hazards Regression for Overall Survival (TGFB1 expression included)

Likelihood ration rest

DUX4 expression is associated with decreased overall survival in the context of immune checkpoint inhibition.

(A) Random Survival Forest (RSF) predicted overall survival for patients with either DUX4-positive or -negative tumors in the training and test sets. Out-of-bag (OOB) survival predictions are shown for the patients in the training set. Survival predictions for individual patients (thin lines) and the median survival function across the cohort (thick line) are represented. DUX4-(< 0.25 TPM); DUX4+ (> 1 TPM).

(B) Training (OOB) and test set survival probability predictions for patients with DUX4+/- tumors at 12 and 18 months. The p-values were estimated using a two-sided Mann-Whitney U test.

(C) Survival probability for patients in the training and test sets. Survival functions corresponding to the median RSF prediction (solid orange line) and the Kaplan-Meier estimate (dashed gray) are displayed.

(D) Feature importance for variables used in the RSF model. The average absolute estimated Shapley values (solid lines) are shown, associated with predicting survival probability at particular times. The 95% confidence interval of the mean (transparent ribbon) is plotted.

(E) Surface plot showing adjusted (marginal) survival probability, measured via partial dependence, as a function of tumor mutational burden (TMB, number of missense mutations) and time. Each point on the surface corresponds to the mean survival prediction (at the respective timepoint) after TMB is fixed to the respective value for all patients.

(F) Partial plot showing adjusted survival probability as a function of DUX4 expression status. The median survival probability (solid lines) and the 95% confidence interval (transparent ribbon) after DUX4 expression status is fixed to the indicated value for all patients are plotted.

A Random Survival Forest model quantifies the effect of DUX4 status on overall survival probability in the context of immune checkpoint inhibition.

(A) Error (1 – Harrell’s concordance index) as a function of the number of trees in the Random Survival Forest (RSF) model. The training out-of-bag error (OOB error, solid gray line) and the test error (solid orange line) are shown. 1500 trees were used in the final model.

(B) Time-dependent Brier scores for the RSF model estimated from the training (solid turquoise line) or test (solid teal line) sets. OOB survival predictions were used to calculate the Brier score for the training set. The Continuous Ranked Probability Scores (CRPS), defined as the integrated Brier score divided by time, for both sets are shown. A Brier score = 0.25 indicates random guessing (gray dashed line).

(C) Time-dependent ROC analyses. The AUCC/D (solid orange line) and the 95% confidence interval (transparent orange ribbon) are shown over the observation time for the training and test sets. The OOB mortality predictions were used to calculate the training set AUCC/D.

(D) The Receiver Operating Characteristic (ROC) curves for the RSF model at 12 (solid turquoise line) and 18 (solid teal line) months. The Cumulative/Dynamic Area Under the ROC Curve (AUCC/D) and 95% confidence interval are specified. The OOB mortality predictions were used to calculate the training set AUCC/D.

(E) RSF feature importance. The mean absolute Shapley values and the 95% confidence intervals are shown (left). The standardized permutation importance and confidence regions estimated via delete-d jackknife subsampling are plotted (middle); noise variables are indicated by permutation importance measures ≤ 0. The minimal depth measures are shown (right); variables with values exceeding the depth threshold (gray dashed line) are designated as noise variables.

(F) Shapley dependence plot illustrating the relationship between tumor mutational burden (TMB, number of missense mutations) and mortality. Each point corresponds to a single patient.

(G) As in (F), but showing DUX4 expression status.

(H) Partial plot illustrating the marginal effect of TMB on mortality. Each point corresponds to the average RSF mortality predictions when TMB is fixed to the indicated value for all patients. The transparent ribbon corresponds to the 95% confidence interval.(I) Partial plot illustrating the marginal effect of DUX4 expression status. The points correspond to the RSF prediction for mortality for each patient when DUX4 expression status is fixed to the indicated value for the entire cohort.