-
Supplementary file 1
Comparison of the lists of genes in datasets CG_SSI2SD_rNSM > 0.125 and CG_SO2SD_rMSN > 3.00 with the lists of cancer genes identified by others (VOG, Vogelstein et al., 2013; TAM, Tamborero et al., 2013; LAW, Lawrence et al., 2014; ABB, Abbott et al., 2015; TOR, Torrente et al., 2016; ZHO, Zhou et al., 2017; MAR, Martincorena et al., 2017; BAI, Bailey et al., 2018; SON, Sondka et al., 2018; ZHA, Zhao et al., 2019a).
Transcripts of OGs (oncogenes) and TSGs (tumor suppressor genes) of the cancer gene list of Vogelstein et al., 2013 are highlighted by brick red and blue backgrounds, respectively. Transcripts of CGC genes (SON, Sondka et al., 2018) that do not correspond to OGs or TSGs of the cancer gene list of Vogelstein et al., 2013 are highlighted by yellow background. Novel positively or negatively selected cancer genes validated in the present work are highlighted in dark green background.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp1-v2.xlsx
-
Supplementary file 2
Comparison of the lists of genes in datasets CG_SSI2SD_rNSM > 0.125 and CG_SO2SD_rMSN > 3.00 with the lists of genes in datasets CG_SO*2SD_rNSM > 3 and CG_SO*2SD_rMSN > 1.50, respectively.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp2-v2.xlsx
-
Supplementary file 3
Comparison of the list of negatively selected genes, CG2SD_rSMN > 0.5, with the lists of negatively selected genes (WEG, ZHOU, ZAPATA, PYATNITSKIY) defined by Weghorn and Sunyaev, 2017, Zhou et al., 2017, Zapata et al., 2018, and Pyatnitskiy et al., 2015, respectively, as well as with the list of genes (De Kegel) identified by De Kegel and Ryan, 2019 as broadly essential genes.
Negatively selected genes discussed in detail in the present work are highlighted in dark green background.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp3-v2.xlsx
-
Supplementary file 4
Comparison of the list of genes in dataset CG2SD_rSMN > 0.5 with the list of genes in dataset CG_SO*2SD_rSMN > 1.50.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp4-v2.xlsx
-
Supplementary file 5
SO (Substitution Only) and SSI (Substitutions and Subtle Indel) analyses of somatic mutations of transcripts of human protein coding genes that have at least 100 confirmed somatic, non-polymorphic mutations identified in tumor tissues.
The table also contains lists of passenger genes (PG_SOf_1SD, PG_SOr2_1SD, PG_SOr3_1SD, PG_SSIf_1SD, PG_SSIr2_1SD, PG_SSIr3_1SD) whose parameters deviate from the mean values by ≤1 SD as well as lists of candidate cancer genes (CG_SOf_1SD, CG_SOr2_1SD, CG_SOr3_1SD, CG_SSIf_1SD, CG_SSIr2_1SD, CG_SSIr3_1SD) whose parameters deviate from the mean values by >1 SD. The table also contains lists of candidate cancer genes (CG_SOf_2SD, CG_SOr2_2SD, CG_SOr3_2SD, CG_SSIf_2SD, CG_SSIr2_2SD, CG_SSIr3_2SD) whose parameters deviate from the mean values by >2 SD as well as lists of passenger genes (PG_SOf_2SD, PG_SOr2_2SD, PG_SOr3_2SD, PG_SSIf_2SD, PG_SSIr2_2SD, PG_SSIr3_2SD) whose parameters deviate from the mean values by <2 SD. Transcripts of OGs (oncogenes) and TSGs (tumor suppressor genes) of the cancer gene list of Vogelstein et al., 2013 are highlighted by brick red and blue backgrounds, respectively. Transcripts of CGC (Cancer Gene Census) genes (Sondka et al., 2018) that do not correspond to OGs or TSGs of the cancer gene list of Vogelstein et al., 2013 are highlighted by yellow background.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp5-v2.xlsx
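The split into passenger and candidate lists described for Supplementary file 5 can be illustrated with a minimal sketch. It assumes that a single per-transcript parameter is compared with the mean over all transcripts and that the population standard deviation is used; the function name, the argument names, and the numbers in the usage note are hypothetical and are not taken from the supplementary tables.

```python
from statistics import mean, pstdev


def split_by_deviation(values, n_sd=2.0):
    """Split items into (candidates, passengers) by deviation from the mean.

    values: dict mapping a transcript name to one numeric parameter.
    n_sd:   threshold in standard deviations (1.0 or 2.0 in the captions).
    Assumption: population SD (pstdev); the source does not state which
    estimator was used.
    """
    numbers = list(values.values())
    mu = mean(numbers)
    sd = pstdev(numbers)
    candidates, passengers = [], []
    for name, value in values.items():
        # Strictly greater than the threshold -> candidate; otherwise passenger.
        if abs(value - mu) > n_sd * sd:
            candidates.append(name)
        else:
            passengers.append(name)
    return candidates, passengers
```

For instance, with nine hypothetical transcripts at 0.25 and one at 0.9, only the last one lies more than 2 SD from the mean and is returned in the candidate list.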
-
Supplementary file 6
Numbers and fractions of missense, nonsense, and silent single-nucleotide polymorphisms (SNPs) affecting the coding sequences of the human genes.
Transcripts of OGs (oncogenes) and TSGs (tumor suppressor genes) of the cancer gene list of Vogelstein et al., 2013 are highlighted by brick red and blue backgrounds, respectively. Transcripts of CGC genes (SON, Sondka et al., 2018) that do not correspond to OGs or TSGs of the cancer gene list of Vogelstein et al., 2013 are highlighted by yellow background. Novel positively or negatively selected cancer genes validated in the present work are highlighted in dark green background.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp6-v2.xlsx
-
Supplementary file 7
Comparison of fS, rSM, and rSMN scores of genes determined for somatic mutations in tumors with those determined for germline mutations.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp7-v2.xlsx
-
Supplementary file 8
Statistics of transcripts and subtle somatic mutations of human protein coding genes of the different datasets analyzed.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp8-v2.xlsx
-
Supplementary file 9
SO (Substitution Only) and SSI (Substitutions and Subtle Indel) analyses of somatic mutations of transcripts of human protein coding genes.
Transcripts of OGs (oncogenes) and TSGs (tumor suppressor genes) of the cancer gene list of Vogelstein et al., 2013 are highlighted by brick red and blue backgrounds, respectively. Transcripts of CGC (Cancer Gene Census) genes (Sondka et al., 2018) that do not correspond to OGs or TSGs of the cancer gene list of Vogelstein et al., 2013 are highlighted by yellow background.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp9-v2.xlsx
-
Supplementary file 10
Contribution of major types of tumors (‘Tumor Primary site’) to subtle somatic substitutions of the human protein coding genes analyzed.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp10-v2.xlsx
-
Supplementary file 11
Analyses of fS, fM, and fN parameters of transcripts of human protein coding genes that have at least 0 (N0), 50 (N50), 100 (N100), or 500 (N500) somatic substitutions in tumors, respectively.
Transcripts of OGs (oncogenes) and TSGs (tumor suppressor genes) of the cancer gene list of Vogelstein et al., 2013 are highlighted by brick red and light blue backgrounds, respectively. Transcripts of CGC (Cancer Gene Census) genes (Sondka et al., 2018) that do not correspond to OGs or TSGs of the cancer gene list of Vogelstein et al., 2013 are highlighted by yellow background. Novel proto-oncogenes, TSGs and negatively selected tumor essential genes validated in the present work are shown in brown, dark blue, and green colors, respectively. For 3D representations of the data, see Figure 12.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp11-v2.xlsx
-
Supplementary file 12
Analyses of fS, fM, and fN parameters of transcripts of human protein coding genes that have at least 0 (N02SD), 50 (N502SD), 100 (N1002SD), or 500 (N5002SD) somatic substitutions in tumors and deviate from average values of fS, fM, and fN by more than 2 SD (Sheet ‘CG_SOf_2SD’).
Transcripts of OGs (oncogenes) and TSGs (tumor suppressor genes) of the cancer gene list of Vogelstein et al., 2013 are highlighted by brick red and light blue backgrounds, respectively. Transcripts of CGC (Cancer Gene Census) genes (SON, Sondka et al., 2018) that do not correspond to OGs or TSGs of the cancer gene list of Vogelstein et al., 2013 are highlighted by yellow background. Novel proto-oncogenes, TSGs and negatively selected tumor essential genes (TEGs) validated in the present work are shown in brown, dark blue, and green colors, respectively. Sheet ‘statistics’ contains a summary of the fS, fM, and fN parameters of datasets N0, N50, N100, N500, N02SD, N502SD, N1002SD, N5002SD and indicates the number of known and novel OGs, TSGs and TEGs that are present in the different datasets.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp12-v2.xlsx
-
Supplementary file 13
Negatively selected genes in datasets N0, N50, N100, and N500.
Sheet ‘SO’ lists the genes/transcripts in datasets N0, N50, N100, and N500 that contain transcripts of human protein coding genes with at least 0, 50, 100, or 500 somatic substitutions in tumors, respectively. The lists of negatively selected genes identified by others were taken from the publications of Weghorn and Sunyaev, 2017, Zapata et al., 2018, Zhou et al., 2017 and Pyatnitskiy et al., 2015. Sheet ‘statistics’ indicates the number of negatively selected genes identified by others that are present in the N0, N50, N100, and N500 datasets. Note that only 48%, 64%, 77%, and 89% of the negatively selected genes identified by Weghorn and Sunyaev, 2017, Zapata et al., 2018, Zhou et al., 2017 and Pyatnitskiy et al., 2015, respectively, are present in the dataset N100 that we have analyzed in the present work.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp13-v2.xlsx
-
Supplementary file 14
Expected fractions of nonsense, missense, and silent substitutions of various codons in the absence of selection assuming that there is no difference in the probability of the substitution classes C>A, C>G, C>T, T>A, T>C, and T>G.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp14-v2.docx
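The per-codon expectation described for Supplementary file 14 can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: it uses the standard genetic code and treats each of the nine possible single-base changes of a codon as equally likely, which is what equal probability of the six substitution classes implies for a single codon. The names `CODON_TABLE` and `expected_fractions` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def expected_fractions(codon):
    """Fractions of silent, missense, and nonsense single-base changes.

    Intended for sense codons. For a stop codon, a change to another stop
    is counted as silent and a change to an amino acid as missense.
    """
    original = CODON_TABLE[codon]
    counts = {"silent": 0, "missense": 0, "nonsense": 0}
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue  # not a substitution
            mutant = codon[:pos] + base + codon[pos + 1:]
            new = CODON_TABLE[mutant]
            if new == original:
                counts["silent"] += 1
            elif new == "*":
                counts["nonsense"] += 1
            else:
                counts["missense"] += 1
    total = sum(counts.values())  # always 9
    return {kind: n / total for kind, n in counts.items()}
```

Under these assumptions, the tryptophan codon TGG has no silent changes, seven missense changes, and two nonsense changes (TAG and TGA), whereas the glycine codon GGG has three silent changes at the third position and no nonsense changes.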
-
Supplementary file 15
Expected fractions of nonsense, missense, and silent substitutions of various codons in the absence of selection assuming that there is no difference in the probability of the substitution classes C>A, C>G, C>T, T>A, T>C, and T>G.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp15-v2.xlsx
-
Supplementary file 16
Expected fractions of nonsense, missense, and silent substitutions of various codons in the absence of selection assuming that there is no difference in the probability of the substitution classes C>A, C>G, C>T, T>A, T>C, and T>G.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp16-v2.docx
-
Supplementary file 17
Expected fractions of silent, missense, and nonsense mutations of coding sequences of human protein-coding genes, assuming equal probability of different substitution classes.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp17-v2.xlsx
-
Supplementary file 18
Expected fractions of nonsense, missense, and silent substitutions of various codons in the absence of selection assuming that only C>A and G>T mutations occur.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp18-v2.docx
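The single-class case described for Supplementary file 18 can be illustrated with a similar sketch. It assumes the standard genetic code, codons written on the coding strand, and equal weight for the two complementary changes of a class (here C>A and G>T). The function name and the `allowed` argument are hypothetical; codons that contain no mutable base for the chosen class have no defined fractions and return `None`.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def class_fractions(codon, allowed):
    """Fractions of silent, missense, and nonsense changes for one class.

    allowed: set of (from_base, to_base) pairs on the coding strand,
             e.g. {("C", "A"), ("G", "T")} for the C>A class.
    Returns None if the codon has no base that the class can change.
    """
    original = CODON_TABLE[codon]
    counts = {"silent": 0, "missense": 0, "nonsense": 0}
    for pos, base in enumerate(codon):
        for src, dst in allowed:
            if src != base:
                continue
            new = CODON_TABLE[codon[:pos] + dst + codon[pos + 1:]]
            if new == original:
                counts["silent"] += 1
            elif new == "*":
                counts["nonsense"] += 1
            else:
                counts["missense"] += 1
    total = sum(counts.values())
    if total == 0:
        return None
    return {kind: n / total for kind, n in counts.items()}
```

Under these assumptions, the glutamate codon GAG has two possible changes in this class: G>T at the first position gives the stop codon TAG, and G>T at the third position gives GAT (aspartate), so half of the changes are nonsense and half are missense.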
-
Supplementary file 19
Expected fractions of nonsense, missense, and silent substitutions of various codons in the absence of selection assuming that only C>G and G>C mutations occur.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp19-v2.docx
-
Supplementary file 20
Expected fractions of nonsense, missense, and silent substitutions of various codons in the absence of selection assuming that only C>T and G>A mutations occur.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp20-v2.docx
-
Supplementary file 21
Expected fractions of nonsense, missense, and silent substitutions of various codons in the absence of selection assuming that only T>A and A>T mutations occur.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp21-v2.docx
-
Supplementary file 22
Expected fractions of nonsense, missense, and silent substitutions of various codons in the absence of selection assuming that only T>C and A>G mutations occur.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp22-v2.docx
-
Supplementary file 23
Expected fractions of nonsense, missense, and silent substitutions of various codons in the absence of selection assuming that only T>G and A>C mutations occur.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp23-v2.docx
-
Supplementary file 24
Expected fractions of nonsense, missense, and silent substitutions of various codons in the absence of selection assuming that only C>A or C>G or C>T or T>A or T>C or T>G mutations occur.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp24-v2.xlsx
-
Supplementary file 25
Expected fractions of nonsense, missense, and silent substitutions in the absence of selection assuming equal codon frequency and that only C>A or C>G or C>T or T>A or T>C or T>G mutations occur.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp25-v2.docx
-
Supplementary file 26
Contributions of C>A, C>G, C>T, T>A, T>C, and T>G mutations to the pattern of Single Base Substitutions in tumors.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp26-v2.xlsx
-
Supplementary file 27
Expected fractions of nonsense (fN*), missense (fM*), and silent (fS*) mutations of human protein-coding genes taking into account the probability of different substitution classes in tumors.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp27-v2.xlsx
-
Supplementary file 28
Expected fractions of nonsense (fN**), missense (fM**), and silent (fS**) mutations of human protein-coding genes taking into account the probability of different substitution classes in germline cells.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp28-v2.xlsx
-
Supplementary file 29
Statistics of the results of SO (Substitution Only) and SSI (Substitutions and Subtle Indel) analyses of the data presented in Supplementary file 5.
The column marked 'Expected' indicates the parameters expected if we assume that the structure of the genetic code determines the probability of somatic substitutions.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp29-v2.xlsx
-
Supplementary file 30
Comparison of the results of SO (Substitution Only) and SSI (Substitutions and Subtle Indel) analyses.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp30-v2.xlsx
-
Supplementary file 31
Lists of genes (CG_SOf_2SD, CG_SOr2_2SD, CG_SOr3_2SD, CG_SSIf_2SD, CG_SSIr2_2SD, CG_SSIr3_2SD) whose parameters deviate from the mean values by >2 SD.
Transcripts of OGs (oncogenes) and TSGs (tumor suppressor genes) of the cancer gene list of Vogelstein et al., 2013 are highlighted by brick red and blue backgrounds, respectively. Transcripts of CGC (Cancer Gene Census) genes (Sondka et al., 2018) that do not correspond to OGs or TSGs of the cancer gene list of Vogelstein et al., 2013 are highlighted by yellow background.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp31-v2.xlsx
-
Supplementary file 32
Observed/expected parameters (rN*, rM*, rS*; rSM*, rNM*, rNS*; rSMN*, rMSN*, and rNSM*) of somatic mutations affecting the coding sequences of the human genes in cancer.
Transcripts of OGs (oncogenes) and TSGs (tumor suppressor genes) of the cancer gene list of Vogelstein et al., 2013 are highlighted by brick red and blue backgrounds, respectively.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp32-v2.xlsx
-
Supplementary file 33
Observed/expected parameters (rN**, rM**, rS**; rSM**, rNM**, rNS**; rSMN**, rMSN**, and rNSM**) of single-nucleotide polymorphisms (SNPs) affecting the coding sequences of the human genes.
Transcripts of OGs (oncogenes) and TSGs (tumor suppressor genes) of the cancer gene list of Vogelstein et al., 2013 are highlighted by brick red and blue backgrounds, respectively.
-
https://cdn.elifesciences.org/articles/59629/elife-59629-supp33-v2.xlsx
-
Transparent reporting form
-
https://cdn.elifesciences.org/articles/59629/elife-59629-transrepform-v2.pdf