Schematic representation of the compositional structure of a gene and alternative splicing.

The selective combination of exons and introns in a gene of 33 nucleotides gives rise to three distinct mRNA isoforms: mRNA1 (16 nucleotides), mRNA2 (9 nucleotides), and mRNA3 (19 nucleotides). When these CDSs are mapped onto the genome, the coding DNA, defined as the DNA sequences that are transcribed into an mRNA, is found to comprise 25 nucleotides. The alternative splicing ratio (ASR) is then computed as the ratio of the total number of nucleotides in the mRNA isoforms to the number of nucleotides composing the coding DNA: ASR = (16 + 9 + 19)∕25 = 1.76.
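The arithmetic in this example can be reproduced with a short Python sketch. The exon coordinates below are hypothetical: they are chosen only so that the gene length (33), the isoform lengths (16, 9, 19), and the coding DNA length (25) match the caption, and the function names are illustrative rather than taken from the paper.

```python
# Hypothetical half-open [start, end) exon coordinates on a 33-nt gene,
# chosen only to reproduce the lengths quoted in the caption.
E1, E2, E3, E4 = (0, 7), (9, 18), (21, 24), (27, 33)
ISOFORMS = {
    "mRNA1": [E1, E2],      # 7 + 9 = 16 nt
    "mRNA2": [E3, E4],      # 3 + 6 = 9 nt
    "mRNA3": [E1, E2, E3],  # 7 + 9 + 3 = 19 nt
}


def union_length(intervals):
    """Number of positions covered by at least one half-open interval."""
    covered = 0
    current_end = 0
    for start, end in sorted(intervals):
        start = max(start, current_end)  # skip the part already counted
        if end > start:
            covered += end - start
            current_end = end
    return covered


def alternative_splicing_ratio(isoforms):
    """Total isoform nucleotides divided by the coding DNA footprint."""
    isoform_total = sum(
        end - start for exons in isoforms.values() for start, end in exons
    )
    all_exons = [exon for exons in isoforms.values() for exon in exons]
    return isoform_total / union_length(all_exons)


asr = alternative_splicing_ratio(ISOFORMS)
print(round(asr, 2))  # 1.76
```

Exons shared between isoforms are counted once in the denominator and once per isoform in the numerator, which is why the ratio exceeds 1 whenever isoforms overlap.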

Comparative analysis of (A) alternative splicing ratio (ASR) and (B) normalized alternative splicing ratio (ASR*) distributions across taxonomic groups, including mammals, birds, fish, arthropods, plants, fungi, unicellular eukaryotes, bacteria, and archaea.

Box plots represent the median (horizontal line), interquartile range (IQR, box), and whiskers extending to 1.5× IQR. A yellow diamond-shaped point represents the mean, and outliers are shown as individual red points. Colors in the box plots correspond to taxonomic classifications.
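As a rough illustration of the quantities drawn in each box plot, the sketch below computes them for one group of values. It assumes the common Tukey convention (whiskers end at the most extreme data point within 1.5× IQR of the box) and the "inclusive" quartile method of Python's `statistics` module; the plotting software used for the figure may compute quartiles slightly differently.

```python
import statistics


def box_plot_summary(values):
    """Median, quartiles, mean, whisker ends, and outliers for one group."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "mean": statistics.fmean(data),  # drawn as the diamond marker
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(summary["outliers"])  # [100]
```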

(A-I) Phylogenetic Generalized Least Squares (PGLS) regression between the proportion of coding relative to gene content and the normalized ASR across different taxonomic groups. Each panel represents a distinct taxonomic group. The regression lines represent the estimated evolutionary relationship between the two variables while accounting for phylogenetic dependence. (J) Global relationship across all taxonomic groups.

(A-I) Phylogenetic Generalized Least Squares (PGLS) regression between the proportion of gene content relative to genome size and the normalized ASR across different taxonomic groups. Each panel represents a distinct taxonomic group. The regression lines represent the estimated evolutionary relationship between the two variables while accounting for phylogenetic dependence. (J) Global relationship across all taxonomic groups.

The normalized alternative splicing ratio (ASR*) is represented as a color gradient across different genomic profiles.

The x-axis spans the gene-to-genome proportion, while the y-axis spans the coding-to-gene proportion.

Organisms were classified into taxonomic groups based on the hierarchical structure provided by the NCBI Taxonomy Database, using annotation data from NCBI that meet conditions (i) and (ii) described in Section NCBI RefSeq Dataset.

Pairwise comparisons of ASR and ASR* across taxonomic groups, showing differences in statistical measures (mean and median).

Statistical significance was assessed using Monte Carlo permutation tests, and adjusted p-values were calculated using the Bonferroni correction (see Methods). Colors represent significance levels: red for p ≤ 0.001 (***), green for 0.001 < p ≤ 0.05 (**), and blue for 0.05 < p ≤ 0.1 (*). Values with p > 0.1 have no color.
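A minimal sketch of a two-group Monte Carlo permutation test followed by a Bonferroni adjustment is given below. It is an illustration under stated assumptions, not the procedure from the Methods: the two-sided absolute difference, the number of permutations, the `+1` correction in the p-value, and the function names are all choices made here.

```python
import random
from statistics import fmean


def permutation_p_value(group_a, group_b, statistic=fmean,
                        n_permutations=9999, seed=0):
    """Two-sided Monte Carlo permutation p-value for a difference in `statistic`."""
    rng = random.Random(seed)
    observed = abs(statistic(group_a) - statistic(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabeling of the pooled observations
        diff = abs(statistic(pooled[:n_a]) - statistic(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # The +1 terms count the observed labeling as one of the permutations.
    return (extreme + 1) / (n_permutations + 1)


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

Passing `statistic=statistics.median` would give the corresponding comparison of medians.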

Pairwise comparisons of genomic variables across taxonomic groups, showing differences in statistical measures (mean and median) for gene content relative to genome size, coding size relative to gene content, and coding relative to genome size.

Statistical significance was assessed using Monte Carlo permutation tests, and adjusted p-values were calculated using the Bonferroni correction (see Methods). Colors represent significance levels: red for p ≤ 0.001 (***), green for 0.001 < p ≤ 0.05 (**), and blue for 0.05 < p ≤ 0.1 (*). Values with p > 0.1 have no color.
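The color and asterisk coding described above amounts to a simple threshold lookup on the adjusted p-value. The helper below is a hypothetical illustration of that mapping; its name and return format are not from the paper.

```python
def significance_label(p_adjusted):
    """Map an adjusted p-value to the (asterisks, color) coding of the tables."""
    if p_adjusted <= 0.001:
        return ("***", "red")
    if p_adjusted <= 0.05:
        return ("**", "green")
    if p_adjusted <= 0.1:
        return ("*", "blue")
    return ("", None)  # p > 0.1: no asterisks, no color


print(significance_label(0.03))  # ('**', 'green')
```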

Summary statistics for the percentage of gene content relative to genome size (Gene Content / Genome Size (%)), the percentage of coding relative to gene size (Coding Size / Gene Content (%)), the percentage of coding relative to genome size (Coding Size / Genome Size (%)), the alternative splicing ratio (ASR), and the normalized alternative splicing ratio (ASR*) across different taxonomic groups.

The table includes the mean, the interpercentile range between the 5th and 95th percentiles ([Q0.05, Q0.95]), and the standard deviation (σ) for each group.
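For one variable within one taxonomic group, these three summaries could be computed as in the sketch below. The percentile interpolation method and the use of the sample (n − 1) standard deviation are assumptions; the paper may use different conventions.

```python
import statistics


def summarize(values):
    """Mean, 5th and 95th percentiles, and sample standard deviation."""
    # n=20 yields the 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(values, n=20, method="inclusive")
    return {
        "mean": statistics.fmean(values),
        "q05": cuts[0],
        "q95": cuts[-1],
        "sd": statistics.stdev(values),
    }
```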

A phylogenetic generalized least squares (PGLS) model of the form Y ~ βX was fitted to the data, where β is the regression coefficient estimating the effect of X on Y.

The table is structured in two parts. The first part presents the results of the model fitted to ASR vs. genome size, ASR vs. gene content, ASR vs. coding content, ASR vs. the percentage of gene content within genome size, ASR vs. the percentage of coding within genes, and ASR vs. the percentage of coding within genome size. The second part reports the model fitted to the same relationships, but with the normalized values, ASR*. The model was fitted separately for each taxonomic group. The table presents the regression coefficients (β), the adjusted coefficient of determination (R²adj), and the importance of the phylogenetic structure (λ). Statistical significance is indicated with asterisks and colors: red for p ≤ 0.001 (***), green for 0.001 < p ≤ 0.05 (**), and blue for 0.05 < p ≤ 0.1 (*). Values with p > 0.1 have no color.
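For orientation, a standard way of writing a PGLS fit is sketched below. Several elements are assumptions rather than details given in the caption: the intercept β₀, the interpretation of λ as Pagel's λ rescaling the off-diagonal entries of the phylogenetic covariance matrix C, and the usual adjusted R² formula with n species and k predictors.

```latex
% Assumed model for one taxonomic group (the intercept \beta_0 is an assumption)
Y = \beta_0 + \beta X + \varepsilon,
\qquad \varepsilon \sim \mathcal{N}\!\left(0,\; \sigma^2 V(\lambda)\right)

% Assumed Pagel's lambda transformation of the phylogenetic covariance C
V(\lambda) = \lambda\, C + (1 - \lambda)\,\operatorname{diag}(C),
\qquad 0 \le \lambda \le 1

% Generalized least squares estimator, with design matrix \mathbf{X} = [\mathbf{1},\, X]
\hat{\boldsymbol{\beta}}
  = \left(\mathbf{X}^{\top} V^{-1} \mathbf{X}\right)^{-1}
    \mathbf{X}^{\top} V^{-1} \mathbf{Y}

% Usual adjusted coefficient of determination (n observations, k predictors)
R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - k - 1}
```

Under this reading, λ = 0 reduces the fit to ordinary least squares (no phylogenetic signal in the residuals), while λ = 1 corresponds to the full covariance implied by the tree.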

Coefficient of variation for each genomic variable and taxonomic group.

The coefficient of variation is calculated as CV = s∕x̄, where s is the standard deviation and x̄ is the mean of each variable within a taxonomic group. Each column represents a different variable, with the coefficient of variation computed for genome size, gene content, coding content, the percentage of gene content within genomes, the percentage of coding within genes, the percentage of coding within genomes, the alternative splicing ratio (ASR), and its normalized ratio, ASR*.
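A minimal sketch of the coefficient of variation for one variable within one group follows, assuming the usual definition (standard deviation divided by the mean) and the sample (n − 1) standard deviation. The input values are illustrative only.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(values) / mean


# Illustrative values only, not data from the paper.
print(round(coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9]), 4))  # 0.4276
```

Because the coefficient of variation is dimensionless, it allows variables on very different scales (for example genome size in base pairs and ASR) to be compared directly.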

Relative variability among genomic features, computed as the ratio of the coefficients of variation of two genomic variables, CV(X)∕CV(Y), where CV = s∕x̄ (standard deviation divided by the sample mean).

It is computed separately for each taxonomic group. The first table presents variability relationships among variables related to genome composition, whereas the second and third tables show the relationships of these variables with the alternative splicing ratio and its normalized value, respectively.
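The pairwise ratios for one taxonomic group could be tabulated as in the sketch below. The variable names, the example values, and the row-over-column orientation of the ratio are assumptions made for illustration.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(values) / statistics.fmean(values)


def cv_ratio_table(variables):
    """Ratio CV(row) / CV(column) for every ordered pair of variables."""
    cvs = {name: coefficient_of_variation(vals) for name, vals in variables.items()}
    return {
        row: {col: cvs[row] / cvs[col] for col in cvs}
        for row in cvs
    }


# Illustrative values only, not data from the paper.
group = {
    "genome_size": [1.0, 2.0, 3.0],
    "gene_content": [10.0, 20.0, 30.0],
    "asr": [9.0, 10.0, 11.0],
}
table = cv_ratio_table(group)
print(round(table["genome_size"]["asr"], 6))  # 5.0
```

A ratio above 1 indicates that the row variable is relatively more variable than the column variable within that group.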

(A-I) Phylogenetic Generalized Least Squares (PGLS) regression between genome size and ASR across different taxonomic groups. Each panel represents a distinct taxonomic group. The regression lines represent the estimated evolutionary relationship between the two variables while accounting for phylogenetic dependence. (J) Global relationship across all taxonomic groups. The inset provides a logarithmic representation of the x-axis.

(A-I) Phylogenetic Generalized Least Squares (PGLS) regression between gene content and ASR across different taxonomic groups. Each panel represents a distinct taxonomic group. The regression lines represent the estimated evolutionary relationship between the two variables while accounting for phylogenetic dependence. (J) Global relationship across all taxonomic groups. The inset provides a logarithmic representation of the x-axis.

(A-I) Phylogenetic Generalized Least Squares (PGLS) regression between the amount of coding and ASR across different taxonomic groups. Each panel represents a distinct taxonomic group. The regression lines represent the estimated evolutionary relationship between the two variables while accounting for phylogenetic dependence. (J) Global relationship across all taxonomic groups.

(A-I) Phylogenetic Generalized Least Squares (PGLS) regression between the proportion of coding relative to genome size and ASR across different taxonomic groups. Each panel represents a distinct taxonomic group. The regression lines represent the estimated evolutionary relationship between the two variables while accounting for phylogenetic dependence. (J) Global relationship across all taxonomic groups.

(A-I) Phylogenetic Generalized Least Squares (PGLS) regression between the proportion of coding relative to gene content and ASR across different taxonomic groups. Each panel represents a distinct taxonomic group. The regression lines represent the estimated evolutionary relationship between the two variables while accounting for phylogenetic dependence. (J) Global relationship across all taxonomic groups.

(A-I) Phylogenetic Generalized Least Squares (PGLS) regression between the proportion of gene content relative to genome size and ASR across different taxonomic groups. Each panel represents a distinct taxonomic group. The regression lines represent the estimated evolutionary relationship between the two variables while accounting for phylogenetic dependence. (J) Global relationship across all taxonomic groups.

(A-I) Phylogenetic Generalized Least Squares (PGLS) regression between genome size and the normalized ASR across different taxonomic groups. Each panel represents a distinct taxonomic group. The regression lines represent the estimated evolutionary relationship between the two variables while accounting for phylogenetic dependence. (J) Global relationship across all taxonomic groups. The inset provides a logarithmic representation of the x-axis.

(A-I) Phylogenetic Generalized Least Squares (PGLS) regression between gene content and the normalized ASR across different taxonomic groups. Each panel represents a distinct taxonomic group. The regression lines represent the estimated evolutionary relationship between the two variables while accounting for phylogenetic dependence. (J) Global relationship across all taxonomic groups. The inset provides a logarithmic representation of the x-axis.

(A-I) Phylogenetic Generalized Least Squares (PGLS) regression between the amount of coding and the normalized ASR across different taxonomic groups. Each panel represents a distinct taxonomic group. The regression lines represent the estimated evolutionary relationship between the two variables while accounting for phylogenetic dependence. (J) Global relationship across all taxonomic groups.

(A-I) Phylogenetic Generalized Least Squares (PGLS) regression between the proportion of coding relative to genome size and the normalized ASR across different taxonomic groups. Each panel represents a distinct taxonomic group. The regression lines represent the estimated evolutionary relationship between the two variables while accounting for phylogenetic dependence. (J) Global relationship across all taxonomic groups.