Identification of females with non-mosaic X-inactivation

(A) Patterns of X-chromosome inactivation (XCI) in women, resulting in mosaic XCI (right female) or non-mosaic XCI (nmXCI). Genetic variants can give rise to nmXCI females by (i) directly determining which X-chromosome can be inactivated (primary skewing, left female) or (ii) imparting a selective advantage to a small number of cells (secondary skewing, middle female). Xa, active chr X; Xi, inactive chr X; Xm, maternal chr X; Xp, paternal chr X. (B) Single-tissue median allelic expression (AE) and standard error of all non-PAR genes on chromosome X (chr X) not previously classified as variable, in all 285 women in GTEx. (C) Allelic expression per tissue of non-PAR chr X genes not previously classified as variable in mosaic females (median allelic expression < 0.475) and in the three females identified as non-mosaic, nmXCI-1, nmXCI-2 and UPIC (median allelic expression > 0.475). Boxplots indicate the median and the 25th and 75th percentiles. (D) Copy number as log2 ratio of chromosome 17 (chr 17) for the nmXCI females UPIC, nmXCI-1 and nmXCI-2. Trisomy 17p in UPIC is highlighted.
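The median-based separation of mosaic and non-mosaic females in panel C can be illustrated with a minimal sketch. It assumes that allelic expression is defined as |0.5 − reference allele ratio| (0 for balanced expression, 0.5 for fully monoallelic expression); that definition, the function names and the read counts are illustrative assumptions, and only the 0.475 cutoff is taken from the legend.

```python
from statistics import median

NMXCI_THRESHOLD = 0.475  # cutoff stated in panel C


def allelic_expression(ref_reads: int, alt_reads: int) -> float:
    """Assumed AE definition: |0.5 - reference allele ratio|."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads covering this gene")
    return abs(0.5 - ref_reads / total)


def classify_female(gene_counts):
    """Classify one tissue sample from (ref_reads, alt_reads) pairs,
    one pair per non-PAR chr X gene.

    A median exactly equal to the cutoff is not covered by the legend;
    this sketch treats it as mosaic.
    """
    med = median(allelic_expression(r, a) for r, a in gene_counts)
    label = "non-mosaic" if med > NMXCI_THRESHOLD else "mosaic"
    return label, med


# Hypothetical read counts, for illustration only.
balanced = [(52, 48), (30, 45), (60, 40)]
skewed = [(100, 0), (0, 80), (99, 1)]
print(classify_female(balanced)[0])
print(classify_female(skewed)[0])
```

Taking the median across genes keeps a handful of escape genes, which are expressed from both alleles, from masking an otherwise completely skewed sample.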

Characterization of two novel human nmXCI females

(A) Overlap of genic heterozygous SNPs (hetSNPs) (upper) and genes with hetSNPs (lower) across the two novel nmXCI females (nmXCI-1 and nmXCI-2). Genic hetSNPs were identified using both WES and WGS for each individual. (B) Distribution of assessed genes across the X-chromosome. Genes located in pseudoautosomal region 1 (PAR1) and PAR2 are highlighted in green. (C) Allelic expression per tissue for well-characterised PAR (AKAP17A), escape (PUDP, DDX3X), inactive (APOOL) and variable (PRKX) genes (8). Boxplots indicate the median and the 25th and 75th percentiles. (D) Spearman correlation of allelic expression values obtained with our analysis approach (Gylemo) and with the Tukiainen et al. analysis pipeline for female UPIC. (E) Overlap of genic hetSNPs (left) and genes (right) identified by our analysis (Gylemo) and by the Tukiainen et al. analysis pipeline in female UPIC.

An extended landscape of X-inactivation in humans

(A) Overlap of genic heterozygous SNPs (hetSNPs) (upper) and genes with hetSNPs (lower) across the three females (nmXCI-1, nmXCI-2 and UPIC) with non-mosaic X-inactivation (nmXCI). (B) Tissues covered in each nmXCI female. Tissues not covered in UPIC are indicated in bold. (C) Examples of genes classified by the binomial test alone and after manual curation. Allelic expression across tissues is shown, with XCI status based on the binomial test indicated as inactive (grey circle) or escape (red triangle). (D) Allelic expression of X-linked gene categories (lower) and number of genes included in each category (upper). Genes classified as escape and inactive are separated based on whether allelic expression was determined across multiple tissues or in a single sample (data for one single tissue). Boxplots indicate the median and the 25th and 75th percentiles. (E) Heatmap showing the allelic expression of all genes that show constitutive or variable escape from XCI. Black asterisks within the tiles indicate significant expression from the inactive X-chromosome (i.e. escape, FDR-corrected binomial q-value < 0.01). The ‘consensus call’ tile is the XCI status assigned to each gene across tissues and individuals; genes classified as variable include both intra- and interindividual variation. Red asterisks indicate genes for which XCI status was manually curated. Grey tiles indicate missing data. (B, E) Tissue abbreviations can be found in Table S7.
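The escape calls marked by asterisks rest on a binomial test followed by FDR correction at q < 0.01. The sketch below shows one way such a call could be made. It is not the authors' pipeline: the null Xi fraction `p0 = 0.025`, the one-sided direction of the test, the use of Benjamini-Hochberg as the FDR procedure, the function names and the read counts are all assumptions made for illustration.

```python
from math import exp, lgamma, log


def binom_sf(k: int, n: int, p: float) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1.

    Terms are computed in log space so large read counts do not overflow.
    """
    log_p, log_q = log(p), log(1.0 - p)
    total = 0.0
    for i in range(k, n + 1):
        log_term = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                    + i * log_p + (n - i) * log_q)
        total += exp(log_term)
    return min(total, 1.0)


def bh_adjust(pvalues):
    """Benjamini-Hochberg q-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def call_escape(counts, p0=0.025, q_cutoff=0.01):
    """counts: (xi_reads, total_reads) per gene and tissue.

    p0 is a hypothetical upper bound on the Xi read fraction expected
    for a fully inactivated gene; it is not taken from the legend.
    """
    pvals = [binom_sf(xi, n, p0) for xi, n in counts]
    qvals = bh_adjust(pvals)
    return ["escape" if q < q_cutoff else "inactive" for q in qvals]


# Hypothetical read counts, for illustration only.
print(call_escape([(30, 100), (1, 100), (0, 50)]))
```

In a non-mosaic sample the less-expressed allele can be attributed to the Xi, which is why a significant excess of minor-allele reads over the null fraction can be read as escape.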

Classification and novel XCI assessment of X-linked genes

(A) Allelic expression across all available tissues in all three nmXCI females for genes that were covered in only one tissue in the previous assessment based on UPIC alone (8). (B) Alluvial plot showing the classification of escape status of X-linked genes based on our analysis (Gylemo) compared to a previous assessment based on UPIC alone (Tukiainen) (8). Genes classified as escape and inactive are separated based on whether allelic expression was determined across multiple tissues or in only one sample (data for one single tissue).

(A) Expression, cis-binding and spreading of the long non-coding RNA XIST initiates the epigenetic process of remodelling an active X-chromosome (chr X) into an inactive X-chromosome (Xi). After initiation of X-inactivation, activating histone marks (histone 3 lysine 27 acetylation; H3K27ac) are removed, while silencing histone modifications (trimethylation of lysine 27 on histone 3, H3K27me3, and ubiquitination of lysine 119 on histone 2A, H2AK119Ub) and DNA methylation are deposited on the future inactive X-chromosome. This process leads to transcriptional silencing of all genes on the Xi except for genes residing in the pseudoautosomal regions (PARs, green) and a subset of non-PAR genes that ‘escape’ X-inactivation (red) and are continually expressed from the Xi. (B) Copy number as log2 ratio, using 500 kb bins across the whole genome, for the nmXCI females UPIC, nmXCI-1 and nmXCI-2. Trisomy 17p in UPIC is highlighted. (C) Gene expression from 17p and 17q across all tissues in nmXCI-2 and UPIC. TPM: transcripts per million. 17p: chromosome 17, p arm. 17q: chromosome 17, q arm. Tissue abbreviations can be found in Table S7.
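As a rough illustration of how a binned log2 copy-number ratio such as the one in panel B can be derived, the sketch below counts reads in 500 kb bins and compares a sample with a reference after normalising for total depth. The legend does not describe the actual computation, so the normalisation, the choice of reference, the function names and the example counts are assumptions.

```python
from collections import Counter
from math import log2

BIN_SIZE = 500_000  # 500 kb bins, as in panel B


def bin_counts(read_starts):
    """Count reads per bin from an iterable of (chrom, position) tuples."""
    return Counter((chrom, pos // BIN_SIZE) for chrom, pos in read_starts)


def log2_ratios(sample_bins, reference_bins):
    """log2 of depth-normalised sample/reference counts per bin.

    Bins with zero reads in either input are skipped.
    """
    s_total = sum(sample_bins.values())
    r_total = sum(reference_bins.values())
    ratios = {}
    for key, r in reference_bins.items():
        s = sample_bins.get(key, 0)
        if s > 0 and r > 0:
            ratios[key] = log2((s / s_total) / (r / r_total))
    return ratios


# Hypothetical bin counts: ten bins, one of them with a 1.5-fold gain.
reference = Counter({("chr1", i): 100 for i in range(9)})
reference[("chr17", 0)] = 100
sample = Counter(reference)
sample[("chr17", 0)] = 150
ratios = log2_ratios(sample, reference)
print(round(ratios[("chr17", 0)], 3), round(ratios[("chr1", 0)], 3))
```

For a single extra copy in an otherwise diploid genome the expected ratio is log2(3/2) ≈ 0.58. In this toy example the gained bin is a tenth of the genome, so total-depth normalisation pulls its value slightly below that.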

(A) Spearman correlation of allelic expression values obtained with our analysis approach (Gylemo) and with the Tukiainen et al. analysis pipeline for female UPIC in all available tissues. (B) List of genes that were included in Tukiainen’s analysis of allelic expression in UPIC but excluded from our analysis, with the reason for exclusion.
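For readers unfamiliar with the statistic in panel A, Spearman correlation is the Pearson correlation of the ranks of two paired measurements, with tied values sharing their average rank. The sketch below is a plain implementation of that definition; the allelic expression values are invented for illustration and do not come from either pipeline.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with >= 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical per-gene allelic expression from two pipelines.
pipeline_a = [0.02, 0.10, 0.31, 0.45, 0.50]
pipeline_b = [0.03, 0.08, 0.35, 0.41, 0.49]
print(spearman(pipeline_a, pipeline_b))
```

Because it operates on ranks, the statistic measures whether two pipelines order genes in the same way, even when their absolute allelic expression values differ.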

Read counts for minor and major alleles for all genes whose XCI status was manually curated. The manual curation result and the reason for manual curation are stated above each gene and summarized in the table at the bottom right of the figure. Asterisks indicate significant escape based on the binomial test (FDR-corrected binomial q-value < 0.01).

Heatmap showing the allelic expression of all X-linked genes assayed in this study. Black asterisks within the tiles indicate significant expression from the inactive X-chromosome (i.e. escape, FDR-corrected binomial q-value < 0.01). The ‘consensus call’ tile is the XCI status assigned to each gene across tissues; genes classified as variable include both intra- and interindividual variation. Red asterisks before gene names indicate genes for which XCI status was assigned by manual curation. Grey tiles indicate missing data.

(A) Spearman correlation of allelic expression between nmXCI females. (B, C) Alluvial plots comparing the direct classification of escape status from our analysis (Gylemo) with consensus calls for X-inactivation of X-linked genes that are based on data from multiple previous studies employing indirect measures of escape, as reported by (B) Tukiainen et al. (8) (in their Suppl. Table 1, Combined XCI status) and (C) Balaton et al. (31) (in their Table S1, Balaton consensus calls).