FLEXIQuant-LF to quantify protein modification extent in label-free proteomics data

  1. Christoph N Schlaffner
  2. Konstantin Kahnert
  3. Jan Muntel
  4. Ruchi Chauhan
  5. Bernhard Y Renard
  6. Judith A Steen
  7. Hanno Steen (corresponding author)
  1. F.M. Kirby Neurobiology Center, Boston Children’s Hospital, United States
  2. Department of Neurology, Harvard Medical School, United States
  3. Department of Pathology, Boston Children’s Hospital, United States
  4. Bioinformatics Unit (MF1), Robert Koch Institute, Germany
  5. Department of Medical Biotechnology, Institute of Biotechnology, Technische Universität Berlin, Germany
  6. Department of Pathology, Harvard Medical School, United States
  7. Data Analytics and Computational Statistics, Hasso-Plattner-Institute, Faculty of Digital Engineering, University of Potsdam, Germany
  8. Precision Vaccines Program, Boston Children’s Hospital, United States
4 figures, 3 tables and 4 additional files

Figures

Figure 1
Workflow and FLEXIQuant-LF concept.

(A) Workflow. (A1) HeLa cells were synchronized in S-phase using thymidine (dark orange). Upon release from the thymidine block, cells were treated with nocodazole and samples were collected after 4 hr (medium dark orange), 8 hr (orange), and 10 hr (light orange). (A2) APC/C was co-immunoprecipitated using an anti-CDC27 antibody. (A3) All samples were trypsinized separately and analyzed by LC-MS/MS in DDA mode to generate a spectral library, and were subsequently analyzed by SWATH-MS (see Supplementary file 1 for raw peptide intensities of all quantified APC/C proteins). (A4) FLEXIQuant-LF-based differential modification analysis of APC/C proteins (see Supplementary file 2 for all resulting RM scores).

(B) FLEXIQuant-LF overview. (B1) Intensities of unmodified peptides are used to indirectly identify and quantify the modification extent of a protein using the following steps. (B2) A RANSAC-based robust linear regression model is fitted to the intensities of the unmodified peptide species, using a reference sample as the independent variable and the sample of interest as the dependent variable, and the vertical distance of each peptide to the regression line is determined. (B3) For each peptide, the distance is normalized by dividing it by the product of the slope of the regression line and the intensity of the peptide in the reference sample, and the result is subtracted from one to yield the raw score. (B4) Peptides with a raw score more than three median absolute deviations (MAD) above the median of all raw scores are classified as outliers and excluded from the subsequent RM score calculation. The remaining raw scores are then scaled using the median of the three highest raw scores, resulting in a metric termed the RM score, which is equal to one minus the extent of modification. (B5) Lastly, peptides are classified into three categories based on their RM scores and the extent of modification is visualized: (i) RM score <0.5: peptide is likely differentially modified (magenta bars), (ii) 0.5 ≤ RM score<0.6: peptide is possibly differentially modified (blue bars), and (iii) RM score ≥0.6: peptide is likely not differentially modified (green bars).
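
Steps B2 to B5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the published FLEXIQuant-LF implementation: the regression line is assumed to pass through the origin, the RANSAC inlier rule is a hypothetical relative tolerance (`rel_tol`), the signed distance is taken as predicted minus observed intensity, and the intensities are invented toy values.

```python
import random
import statistics


def fit_slope_ransac(ref, sample, n_trials=200, rel_tol=0.1, seed=0):
    """RANSAC-style fit of sample = slope * ref (assumed to pass through the origin)."""
    rng = random.Random(seed)
    points = list(zip(ref, sample))
    best_inliers = []
    for _ in range(n_trials):
        x, y = rng.choice(points)  # one point is enough to fix the slope
        slope = y / x
        # Hypothetical inlier rule: within rel_tol of the predicted intensity
        inliers = [(a, b) for a, b in points
                   if abs(b - slope * a) <= rel_tol * slope * a]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Least-squares refit through the origin on the largest consensus set
    return sum(a * b for a, b in best_inliers) / sum(a * a for a, _ in best_inliers)


def rm_scores(ref, sample):
    """Return one RM score per peptide, or None for peptides excluded as outliers."""
    slope = fit_slope_ransac(ref, sample)
    # B3: raw score = 1 - distance / (slope * reference intensity)
    raw = [1 - (slope * x - y) / (slope * x) for x, y in zip(ref, sample)]
    # B4: exclude raw scores above median + 3 * MAD
    med = statistics.median(raw)
    mad = statistics.median([abs(r - med) for r in raw])
    kept = [r if r <= med + 3 * mad else None for r in raw]
    # B4: scale by the median of the three highest remaining raw scores
    top3 = sorted(r for r in kept if r is not None)[-3:]
    scale = statistics.median(top3)
    return [None if r is None else r / scale for r in kept]


def classify(rm):
    """B5: map an RM score to the three classes used in the figures."""
    if rm is None:
        return "excluded"
    if rm < 0.5:
        return "likely"
    if rm < 0.6:
        return "possibly"
    return "likely not"


# Toy intensities (invented): the sample is 2x the reference, except for two
# peptides with reduced signal and one with an unusually high signal.
ref = [100, 200, 300, 400, 500, 600, 700, 800]
sample = [200, 400, 180, 800, 1000, 660, 1400, 3200]
scores = rm_scores(ref, sample)
labels = [classify(s) for s in scores]
```

In this toy example the third peptide retains 30% of its expected signal and is classified as likely differentially modified, the sixth retains 55% and is classified as possibly differentially modified, and the last peptide is excluded as an outlier before scaling.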

Figure 2
Benchmarking FLEXIQuant-LF with CDC27 and APC5.

Peptide classification and quantification of the extent of modification of CDC27 peptides (A) and APC5 peptides (B), with protein sequences and measured peptides highlighted underneath. Green bars indicate peptides classified as likely not differentially modified (RM score ≥0.6), blue bars indicate peptides classified as possibly differentially modified (0.5 ≤ RM score<0.6), and magenta bars indicate peptides classified as likely differentially modified (RM score <0.5). Shading indicates timepoints from 0 hr (S phase; darkest shade) to 10 hr (brightest shade). Positions of peptides within the protein are given from the N- to C-terminus. Orange stars indicate modifications identified in the DDA dataset, while yellow stars indicate modifications described in the literature. (A) FLEXIQuant-LF analysis of 32 CDC27 peptides after filtering classified nine and four peptides as likely and possibly differentially modified, respectively, over the course of the experiment. We found PTM evidence for all peptides classified as likely differentially modified (for five in our DDA data and for all peptides online [see also Table 1]) as well as for three out of four peptides classified as possibly differentially modified (for one in our DDA data and for all except peptide [23-30] online). (B) Out of 35 quantified peptides of APC5, four and five peptides were classified as likely and possibly differentially modified, respectively. We found evidence for the likely differentially modified peptides in our DDA data or in previous reports. Additionally, we found evidence for four out of five possibly differentially modified peptides online. Only for peptide [99-106] could we not find any evidence of modification.

Figure 3 with 2 supplements
Application of FLEXIQuant-LF to remaining APC/C core components.

Peptide classification and extent of modification of APC1. Seven out of 50 quantified peptides were classified as likely differentially modified and four peptides as possibly differentially modified. We found evidence of modification for six out of seven peptides classified as likely differentially modified (five in our DDA data and six described in the literature) as well as for three out of four peptides classified as possibly differentially modified in our DDA data (see also Table 1). Additionally, two peptides were classified as likely and possibly differentially modified, respectively (indicated by arrowheads), but no evidence of modification was found in the literature. Interestingly, these two peptides are consecutive, and the second peptide starts with threonine (AA 1077). An as yet undiscovered threonine modification, such as a phosphorylation, would likely lead to a highly increased missed cleavage rate and could explain the reduction of the signal intensities of both peptides. Classification and extent of modification of the remaining APC/C core components are shown in Figure 3—figure supplements 1 and 2.

Figure 3—figure supplement 1
FLEXIQuant-LF analysis of remaining APC/C components analyzed as superprotein.

Peptide classification and extent of modification of (A) APC4, (B) APC7, (C) APC10, (D) CDC20, (E) CDC16, and (F) CDC23 analyzed with FLEXIQuant-LF. APC15 and APC16 were analyzed using the ‘superprotein’ approach; the remaining proteins were analyzed individually. Green bars indicate peptides that were classified as likely not differentially modified at 10 hr (RM score ≥0.6), blue bars indicate peptides classified as possibly differentially modified (0.5 ≤ RM score<0.6), and magenta bars indicate peptides that were classified as likely differentially modified (RM score <0.5). For each class, the four time points are shown in different color shades, from time point 0 hr (S phase; darkest shade) to time point 10 hr (brightest shade). Positions of peptides within the protein are given from the N- to C-terminus. An orange star indicates that a modification was identified on the peptide in the DDA dataset, and a yellow star indicates that a modification was described in the literature.

Figure 3—figure supplement 2
FLEXIQuant-LF analysis of APC/C components as superprotein.

Peptide classification and extent of modification of (A) APC15 and (B) APC16 analyzed with FLEXIQuant-LF using the ‘superprotein’ approach, because the number of peptides per protein was below the threshold of five peptides. Green bars indicate peptides that were classified as likely not differentially modified at 10 hr (RM score ≥0.6). Positions of peptides within the protein are given from the N- to C-terminus.

Figure 4
FLEXIQuant-LF reproducibility and error estimation.

(A) Number of ambiguous samples with different numbers of RANSAC initiations. The number of samples with ambiguous results over 1000 runs decreases with an increasing number of RANSAC initiations but starts to oscillate between one and two ambiguous samples from 50 initiations upwards. (B) Fraction of optimal and suboptimal outcomes of PMF with sample 30. The frequency of a suboptimal outcome decreases with an increasing number of RANSAC initiations. (C) Correlation of the expected (in silico created) change with the RM score difference before and after the in silico change. Expected and measured changes are highly correlated (Pearson r = 0.98). (D) Classification error estimation using the definitions of likely not, possibly, and likely differentially modified as described in Figure 1B. With a sensitivity and precision of 72.2% and 88.6%, respectively, the transition area of possibly differentially modified performs worse, as expected, than the likely not and likely differentially modified classifications (both with sensitivity and precision >96%). (E) Cumulative quantification error frequencies show 94.7% of cases below 0.1. (F) Quantification error in relation to the number of peptides used per protein (boxes indicate the 2nd and 3rd quartiles, while whiskers indicate data within 1.5 times the interquartile range above and below these quartiles). Overall, quantification errors are very low and further improve with the number of peptides used for FLEXIQuant-LF analysis.
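
The per-class sensitivity and precision reported in panel D follow the usual definitions (true positives over all expected members of a class, and true positives over all predicted members of a class). The sketch below shows the computation on a small set of invented labels; the function name and the toy labels are assumptions for illustration and do not reproduce the values in the figure.

```python
def sensitivity_precision(expected, predicted, label):
    """Return (sensitivity, precision) for one class label."""
    pairs = list(zip(expected, predicted))
    tp = sum(e == label and p == label for e, p in pairs)  # correctly assigned
    fn = sum(e == label and p != label for e, p in pairs)  # missed members
    fp = sum(e != label and p == label for e, p in pairs)  # wrongly assigned
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    precision = tp / (tp + fp) if tp + fp else float("nan")
    return sensitivity, precision


# Invented toy labels: expected class from a simulated change vs. predicted class
expected = ["likely", "likely", "possibly", "possibly",
            "possibly", "likely not", "likely not", "likely not"]
predicted = ["likely", "likely", "possibly", "likely",
             "possibly", "likely not", "possibly", "likely not"]

results = {label: sensitivity_precision(expected, predicted, label)
           for label in ("likely", "possibly", "likely not")}
```

Because the "possibly" class sits between the other two, it can lose members to both neighbours and gain false positives from both, which is consistent with the weaker figures reported for that class.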

Tables

Key resources table
| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
|---|---|---|---|---|
| Cell line (Homo sapiens) | HeLa S3 | ATCC | Cat# CCL-2.2, RRID:CVCL_0058 | |
| Antibody | Mouse monoclonal CDC27 antibody (AF3.1) | Santa Cruz | Cat# sc-9972 | IP (1:3) |
| Antibody | Mouse polyclonal Normal mouse IgG | Santa Cruz | Cat# sc-2025 | IP (1:6) |
| Peptide, recombinant protein | HRM calibration peptides (iRT peptides) | Biognosys | Cat# Ki-3002 | |
| Peptide, recombinant protein | Trypsin (sequencing grade modified trypsin) | Promega | V517 | |
| Chemical compound, drug | Nocodazole | Sigma Aldrich | Cat# M1404 | 100 ng/ml |
| Chemical compound, drug | Penicillin and streptomycin mix | Invitrogen | Cat# 15140–122 | 100 μg/ml |
| Chemical compound, drug | Halt Protease and Phosphatase Inhibitor | Thermo Fisher | Cat# 78442 | |
| Software, algorithm | FLEXIQuant-LF | This paper | | See Abstract, Materials and methods, Data and Software Availability |
| Software, algorithm | MaxQuant v1.5.2.8 | Cox lab, Max Planck Institute of Biochemistry | https://www.maxquant.org | |
| Software, algorithm | ProteinPilot v4.5.1 | Sciex | https://sciex.com/ | |
| Software, algorithm | Spectronaut v7.0 | Biognosys | https://biognosys.com | |
| Other | DMEM media | Invitrogen | Cat# 11965 | |
| Other | FBS | Invitrogen | Cat# 26140–079 | |
| Other | L-glutamine | Invitrogen | ThermoFisher: Cat# 25030149 | |
| Other | 4–12% SDS-PAGE gel | Invitrogen | Cat# NP0329BOX | |
| Other | MES buffer | Invitrogen | Cat# NP0002-02 | |
| Other | Thymidine | Sigma Aldrich | Cat# T1895 | |
| Other | RIPA Lysis Buffer | Santa Cruz | Cat# sc-24948 | |
| Other | Affi-Prep Protein A beads | Biorad | Cat# 1560006 | |
| Other | Formic acid (FA) | Thermo Fisher | Cat# A117-50 | |
| Other | Acetonitrile (ACN) | Thermo Fisher | Cat# A955-4 | |
| Other | Water | Thermo Fisher | Cat# W6-4 | |
| Other | TripleTOF 5600 | Sciex | https://sciex.com/ | |
| Other | Nano cHiPLC trap column (200 µm x 0.5 mm Reprosil C18 3 µm 120 Å) | Eksigent | Cat# 804–00016 | |
| Other | Nano cHiPLC column (75 µm x 15 cm Reprosil C18 3 µm 120 Å) | Eksigent | Cat# 804–00011 | |
| Other | DTT | Sigma Aldrich | Cat# D9779 | |

Table 1
Overview of FLEXIQuant-LF analysis for the APC/C complex components during nocodazole arrest.

Column name explanations: Quant. peptides: quantified peptides; Seq. cov.: sequence coverage; Likely diff. modified: number of peptides classified as likely differentially modified; Possibly diff. modified: number of peptides classified as possibly differentially modified; Excluded: number of peptides classified as outliers and thus excluded from RM score calculation; DDA: number of peptides classified as differentially modified for which we found evidence in our DDA data; Lit.: number of peptides classified as differentially modified for which we found evidence for modification in the literature.

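Column groups: FLEXIQuant-LF analysis (Quant. peptides to Excluded); Evidence (DDA, Lit., No evidence).

| Protein | Quant. peptides | Seq. cov. | Likely diff. modified | Possibly diff. modified | Excluded | DDA | Lit. | No evidence |
|---|---|---|---|---|---|---|---|---|
| APC1 | 54 | 39% | 7 | 4 | 4 | 8 | 9 | 2 |
| APC4 | 28 | 43% | 0 | 0 | 2 | - | - | 0 |
| APC5 | 37 | 59% | 4 | 5 | 2 | 2 | 8 | 1 |
| APC7 | 27 | 52% | 4 | 2 | 1 | 1 | 4 | 2 |
| APC10 | 9 | 69% | 0 | 0 | 3 | - | - | 0 |
| APC15 | 1 | 30% | 0 | 0 | 0 | - | - | 0 |
| APC16 | 3 | 30% | 0 | 0 | 0 | - | - | 0 |
| CDC16 | 23 | 41% | 3 | 2 | 3 | 1 | 3 | 2 |
| CDC20 | 7 | 22% | 1 | 0 | 1 | 0 | 1 | 0 |
| CDC23 | 33 | 52% | 3 | 3 | 4 | 3 | 5 | 1 |
| CDC27 | 33 | 55% | 9 | 4 | 1 | 6 | 12 | 1 |
| in total | 248 | AVG 47% | 30 (12%) | 20 (8%) | 20 (8%) | 21 (42%) | 41 (82%) | 9 (18%) |
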
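
The percentages in the totals row of Table 1 appear to use two different denominators: the classification columns are relative to all 248 quantified peptides, while the evidence columns are relative to the 50 peptides classified as likely or possibly differentially modified. This reading is an inference from the numbers, not something stated in the table, and the short check below simply reproduces the arithmetic from the totals as printed.

```python
# Totals as printed in the last row of Table 1
total_quantified = 248
likely, possibly, excluded = 30, 20, 20
dda, lit, no_evidence = 21, 41, 9

# Assumed denominator for the evidence columns: all flagged peptides
flagged = likely + possibly


def pct(part, whole):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


class_pct = {"likely": pct(likely, total_quantified),
             "possibly": pct(possibly, total_quantified),
             "excluded": pct(excluded, total_quantified)}
evidence_pct = {"DDA": pct(dda, flagged),
                "Lit.": pct(lit, flagged),
                "No evidence": pct(no_evidence, flagged)}
```

Note that the DDA and literature percentages overlap (a peptide can have evidence from both sources), so they are not expected to sum to 100%.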
Table 2
Differentially modified peptides.

Overview of peptides classified as likely or possibly differentially modified and resulting RM scores of the FLEXIQuant-LF analysis of APC/C.

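| Protein | Peptide | Start | End | RM score 4 hr | RM score 8 hr | RM score 10 hr |
|---|---|---|---|---|---|---|
| APC1 | GDSPVTSPFQNYSSIHSQSR | 311 | 330 | 0.71 | 0.37 | 0.27 |
| APC1 | SPSISNMAALSR | 341 | 352 | 0.71 | 0.32 | 0.29 |
| APC1 | FSEQGGTPQNVATSSSLTAHLR | 285 | 306 | 0.69 | 0.34 | 0.30 |
| APC1 | AHSPALGVHSFSGVQR | 353 | 368 | 0.69 | 0.35 | 0.30 |
| APC1 | NFDFEGSLSPVIAPK | 680 | 694 | 0.80 | 0.43 | 0.36 |
| APC1 | LLQLCamQR | 1070 | 1076 | 0.57 | 0.51 | 0.45 |
| APC1 | ARPSETGSDDDWEYLLNSDYHQNVESHLLNR | 696 | 726 | 0.78 | 0.61 | 0.48 |
| APC1 | LHDSLYNEDCamTFQQLGTYIHSIR | 566 | 588 | 0.75 | 0.42 | 0.50 |
| APC1 | SLCamLSPSEASQMK | 727 | 739 | 0.83 | 0.61 | 0.51 |
| APC1 | TMALPVGR | 1077 | 1084 | 0.65 | 0.59 | 0.56 |
| APC1 | LHDSLYNEDCamTFQQLGTYIHSIRDPVHNR | 566 | 594 | 0.69 | 0.63 | 0.57 |
| APC5 | EELDVSVR | 190 | 197 | 0.64 | 0.42 | 0.35 |
| APC5 | EEEVSCamSGPLSQK | 198 | 210 | 0.59 | 0.39 | 0.40 |
| APC5 | ALTPASLQK | 230 | 238 | 0.57 | 0.49 | 0.43 |
| APC5 | CamQVASAASYDQPK | 670 | 682 | 0.52 | 0.47 | 0.46 |
| APC5 | KTVEDADMELTSR | 168 | 180 | 0.76 | 0.64 | 0.50 |
| APC5 | LILTGAESK | 281 | 289 | 0.67 | 0.53 | 0.52 |
| APC5 | LMAEGELK | 99 | 106 | 0.70 | 0.58 | 0.54 |
| APC5 | DSDLLHWK | 404 | 411 | 0.58 | 0.58 | 0.57 |
| APC5 | TVEDADMELTSR | 169 | 180 | 0.82 | 0.63 | 0.59 |
| APC7 | VRPSTGNSASTPQSQCamLPSEIEVK | 116 | 139 | 0.85 | 0.46 | 0.41 |
| APC7 | AYAFVHTGDNSR | 242 | 253 | 0.60 | 0.47 | 0.47 |
| APC7 | DMAAAGLHSNVR | 43 | 54 | 0.61 | 0.52 | 0.49 |
| APC7 | YTMALQQK | 100 | 107 | 0.70 | 0.47 | 0.54 |
| APC7 | ALTQRPDYIK | 468 | 477 | 0.67 | 0.55 | 0.57 |
| CDC16 | QTAEETGLTPLETSR | 573 | 587 | 0.86 | 0.34 | 0.34 |
| CDC16 | CamYDFDVHTMK | 544 | 553 | 0.60 | 0.55 | 0.45 |
| CDC16 | SSICamLLR | 130 | 136 | 0.65 | 0.60 | 0.49 |
| CDC16 | IYDALDNR | 139 | 146 | 0.71 | 0.56 | 0.52 |
| CDC16 | DPFHASCamLPVHIGTLVELNK | 261 | 280 | 0.44 | 1.00 | 0.60 |
| CDC20 | VLSLTMSPDGATVASAAADETLR | 446 | 468 | 0.51 | 0.32 | 0.34 |
| CDC23 | NQGETPTTEVPAPFFLPASLSANNTPTR | 558 | 585 | 0.82 | 0.27 | 0.13 |
| CDC23 | RVSPLNLSSVTP | 586 | 597 | 0.71 | 0.30 | 0.20 |
| CDC23 | VSPLNLSSVTP | 587 | 597 | 0.82 | 0.36 | 0.28 |
| CDC23 | AALYFQR | 350 | 356 | 0.67 | 0.57 | 0.52 |
| CDC23 | LWDEASTCamAQK | 525 | 535 | 0.74 | 0.62 | 0.56 |
| CDC23 | NTSAAIQAYR | 380 | 389 | 0.70 | 0.55 | 0.56 |
| CDC27 | EVTPILAQTQSSGPQTSTTPQVLSPTITSPPNALPR | 341 | 376 | 0.46 | 0.31 | 0.19 |
| CDC27 | LDSSIISEGK | 432 | 441 | 0.66 | 0.33 | 0.27 |
| CDC27 | LFTSDSSTTK | 381 | 390 | 0.58 | 0.29 | 0.30 |
| CDC27 | YSLNTDSSVSYIDSAVISPDTVPLGTGTSILSK | 224 | 256 | 0.56 | 0.32 | 0.33 |
| CDC27 | LNLESSNSK | 215 | 223 | 0.69 | 0.33 | 0.33 |
| CDC27 | GGITQPNINDSLEITK | 416 | 431 | 0.73 | 0.47 | 0.41 |
| CDC27 | QPETVLTETPQDTIELNR | 197 | 214 | 0.71 | 0.42 | 0.41 |
| CDC27 | SVFSQSGNSR | 331 | 340 | 0.66 | 0.35 | 0.42 |
| CDC27 | FTSLQNFSNCamLPNSCamTTQVPNHSLSHR | 170 | 196 | 0.73 | 0.46 | 0.45 |
| CDC27 | ISTITPQIQAFNLQK | 442 | 456 | 0.77 | 0.54 | 0.52 |
| CDC27 | ASVLFANEK | 710 | 718 | 0.53 | 0.48 | 0.56 |
| CDC27 | HYNAWYGLGMIYYK | 634 | 647 | 0.53 | 0.76 | 0.59 |
| CDC27 | DAVFLAER | 23 | 30 | 0.62 | 0.61 | 0.59 |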

Additional files

Supplementary file 1

Raw peptide intensities of all identified APC/C components.

https://cdn.elifesciences.org/articles/58783/elife-58783-supp1-v1.xlsx
Supplementary file 2

Comparison of RM scores for proteins analyzed individually and using the ‘superprotein’ approach.

https://cdn.elifesciences.org/articles/58783/elife-58783-supp2-v1.xlsx
Supplementary file 3

Comparison of RM scores for the simulated independent dataset.

https://cdn.elifesciences.org/articles/58783/elife-58783-supp3-v1.xlsx
Transparent reporting form
https://cdn.elifesciences.org/articles/58783/elife-58783-transrepform-v1.pdf

  1. Christoph N Schlaffner
  2. Konstantin Kahnert
  3. Jan Muntel
  4. Ruchi Chauhan
  5. Bernhard Y Renard
  6. Judith A Steen
  7. Hanno Steen
(2020)
FLEXIQuant-LF to quantify protein modification extent in label-free proteomics data
eLife 9:e58783.
https://doi.org/10.7554/eLife.58783