PGFinder, a novel analysis pipeline for the consistent, reproducible, and high-resolution structural analysis of bacterial peptidoglycans

  1. Ankur V Patel (corresponding author)
  2. Robert D Turner
  3. Aline Rifflet
  4. Adelina E Acosta-Martin
  5. Andrew Nichols
  6. Milena M Awad
  7. Dena Lyras
  8. Ivo Gomperts Boneca
  9. Marshall Bern
  10. Mark O Collins (corresponding author)
  11. Stéphane Mesnage (corresponding author)
  1. School of Biosciences, University of Sheffield, United Kingdom
  2. Department of Computer Science, University of Sheffield, United Kingdom
  3. Institut Pasteur, Unité Biologie et Génétique de la Paroi Bactérienne, France
  4. INSERM, Équipe Avenir, France
  5. CNRS, UMR 2001 "Microbiologie intégrative et moléculaire", France
  6. biOMICS Facility, Faculty of Science Mass Spectrometry Centre, University of Sheffield, United Kingdom
  7. Protein Metrics Inc, United States
  8. Infection and Immunity Program, Monash Biomedicine Discovery Institute, Australia
  9. Department of Microbiology, Monash University, Australia
9 figures, 3 tables and 2 additional files

Figures

Diversity of peptidoglycan composition and structure.

(a) Representative peptidoglycan building block made of N-acetylglucosamine (GlcNAc) and N-acetylmuramic acid (MurNAc) forming a disaccharide subunit linked to a pentapeptide stem attached to the …

Flowchart outlining the algorithm for the matching script.

The identification of muropeptides was carried out using four successive steps, indicated by different colours (orange, green, blue, and red, respectively). As a first step, observed masses in the …
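
Table 1 reports each match with its observed mass, theoretical mass, and Δppm, which suggests a tolerance-based comparison of masses. The Python sketch below illustrates that kind of comparison. It is not the PGFinder implementation: the function names, the 10 ppm default tolerance, and the four-entry database are assumptions made for illustration, and the sign convention (theoretical minus observed) is inferred from the values in Table 2.

```python
# Theoretical monoisotopic masses (Da) copied from Table 1.
# A real search would use a much larger database; this subset is illustrative.
THEORETICAL = {
    "GM": 498.206,
    "GM-AEm": 870.371,
    "GM-AEmA": 941.408,
    "GM-AEmA-GM-AEmA": 1864.805,
}


def delta_ppm(observed, theoretical):
    """Signed mass error in ppm (theoretical minus observed; assumed convention)."""
    return (theoretical - observed) / theoretical * 1e6


def match_masses(observed_masses, database, tolerance_ppm=10.0):
    """Return (observed, structure, ppm) for each database entry within tolerance.

    tolerance_ppm is a hypothetical default, not a value taken from the article.
    """
    hits = []
    for obs in observed_masses:
        for name, theo in database.items():
            error = delta_ppm(obs, theo)
            if abs(error) <= tolerance_ppm:
                hits.append((obs, name, round(error, 1)))
    return hits


if __name__ == "__main__":
    # 941.405 Da is the observed mass of GM-AEmA in Table 1; 500.0 Da matches nothing.
    for hit in match_masses([941.405, 500.0], THEORETICAL):
        print(hit)  # (941.405, 'GM-AEmA', 3.2)
```

Because the masses in Table 1 are rounded to three decimals, the error computed here (3.2 ppm) differs slightly from the tabulated 2.8 ppm.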

Distribution of E. coli peptidoglycan fragments identified using automated search workflow.

Breakdown of peptidoglycan is shown by oligomerisation state (left) branching to specific composition (right). Branch size is proportional to percentage. Monomers, dimers, trimers, and glycan chains …

Figure 4 with 1 supplement
Comparative analysis of C. difficile R20291 and M7404 peptidoglycan (PG) composition.

(a) Pearson’s correlation coefficients across biological replicates of R20291 and M7404 C. difficile isolates. Heatmap gradient shows highest value in green to lowest value in red. (b) Muropeptide …

Figure 4—source data 1

C. difficile mass database.

https://cdn.elifesciences.org/articles/70597/elife-70597-fig4-data1-v1.csv
Figure 4—source data 2

C. difficile R20291 versus M7404, list of muropeptides, abundance, RT.

https://cdn.elifesciences.org/articles/70597/elife-70597-fig4-data2-v1.xlsx
Figure 4—figure supplement 1
C. difficile LC-MS chromatograms.
Appendix 1—figure 1
UHPLC-MS chromatogram of E. coli reduced disaccharide peptides.
Appendix 1—figure 2
Consistency of E. coli PG analyses.

(a) Pearson’s correlation coefficients across biological replicates of E. coli BW25113. (b) Muropeptide distribution according to degree of crosslinking. The crosslinking index was calculated as …

Appendix 2—figure 1
Workflow for production of MaxQuant-compatible MS data files from Agilent QTOF data.

Agilent MS data (data: .d) is converted by Proteowizard to mzML format (data: XML). Relevant settings for Proteowizard are shown (left). The mzML file is then converted by TOPPAS to an mzXML file …

Appendix 2—figure 2
Workflow for MS data processing using MaxQuant, before automated analysis.

mzXML (data: XML) is passed to MaxQuant (process) for deconvolution and monoisotopic mass determination. Default values used except where indicated (right). MaxQuant output (data: text file) is then …

Author response image 1

Tables

Table 1
Processed match output.
Structure                             RT (min)    Abundance (%)  Monoisotopic mass (Da)
                                      Av±SD       Av±SD          Obs       Theo      Δppm

Glycans (4.38%±0.35%)
GM|0                                  3.62±0.01   3.465±0.683    498.205   498.206   2.5
GM (x2)|0                             10.11±0.03  0.428±0.349    976.384   976.386   2.2
GM (anhydro)|0                        8.20±1.92   0.238±0.025    478.179   478.180   2.9
GM (deacetyl)|0                       2.57±0.00   0.155±0.032    456.194   456.196   3.5
GM (x2) (deacetyl)|0                  6.86±0.02   0.093±0.012    934.372   934.376   3.2

Monomers (63.14%±1.13%)
GM-AEmA|1                             10.04±0.04  36.098±2.131   941.405   941.408   2.8
GM-AEm|1                              6.57±0.01   14.352±0.397   870.368   870.371   3.0
GM-AEmKR|1                            9.56±0.05   8.030±0.774    1154.563  1154.567  3.6
GM-AE|1                               9.57±0.04   1.809±0.231    698.284   698.286   3.1
GM-AEmG|1                             7.85±0.05   0.689±0.049    927.390   927.392   2.3
GM-AEm (anhydro)|1                    13.98±0.02  0.668±0.073    850.342   850.344   2.2
GM-AEmA (anhydro)|1                   16.55±0.01  0.573±0.100    921.380   921.382   2.0
GM-AEmAG|1                            9.45±0.05   0.219±0.009    998.426   998.429   3.1
GM-AEmKR (anhydro)|1                  14.83±0.01  0.160±0.039    1134.537  1134.540  2.9
GM-AEmA (deacetyl)|1                  8.57±0.06   0.083±0.055    899.394   899.397   3.1
GM-GM-AEmA|1                          13.10±0.02  0.075±0.040    1419.584  1419.588  2.9
GM-AE (anhydro)|1                     17.44±0.01  0.069±0.013    678.258   678.260   2.8
M-AEm|1                               4.56±0.01   0.062±0.064    667.289   667.291   3.8
M-AEmKR|1                             8.16±0.06   0.061±0.056*   951.484   951.487   3.2
GM-AEmAA|1                            11.38±0.04  0.059±0.003    1012.442  1012.445  2.4
M-AEmA|1                              8.52±0.05   0.053±0.015    738.325   738.328   4.0
GM-GM-AEm|1                           11.31±0.04  0.042±0.025    1348.547  1348.551  2.4
GM-AEm (deacetyl)|1                   4.77±0.01   0.024±0.014    828.358   828.360   3.0
GM-GM-AEmKR|1                         12.18±0.03  0.011±0.002*   1632.742  1632.747  3.0

Dimers (29.54%±0.46%)
GM-AEmA-GM-AEmA|2                     16.01±0.02  17.247±0.777   1864.800  1864.805  2.3
GM-AEmA-GM-AEmKR|2                    14.83±0.02  4.589±0.589    2077.957  2077.964  3.0
GM-AEmA-GM-AEm|2                      15.09±0.02  3.207±0.168    1793.763  1793.768  2.6
GM-AEmA-GM-AEmA (anhydro)|2           20.56±0.01  0.873±0.037    1844.774  1844.778  2.4
GM-AEm-GM-AEmKR|2                     14.22±0.00  0.855±0.101    2006.920  2006.926  3.3
GM-AEmA-GM-AEmKR (anhydro)|2          18.89±0.17  0.665±0.079    2057.934  2057.937  1.8
GM-AEm-GM-AEm|2                       14.23±0.01  0.558±0.062    1722.725  1722.730  3.0
GM-AEm-GM-AEmAG|2                     14.68±0.01  0.416±0.025    1850.785  1850.789  2.4
GM-AEmA-GM-AEm (anhydro)|2            19.66±0.01  0.381±0.028    1773.738  1773.741  2.1
GM-AEmA-GM-AEmAG|2                    15.33±0.02  0.179±0.005    1921.822  1921.826  2.2
GM-AEm-GM-AEmKR (anhydro)|2           18.07±0.01  0.170±0.024    1986.896  1986.900  2.1
GM-AEm-GM-AEm (anhydro)|2             18.77±0.01  0.141±0.015    1702.697  1702.704  4.5
GM-AEmA-GM-AEmAA|2                    16.54±0.01  0.075±0.002    1935.838  1935.842  2.1
GM-AEm-GM-AEmG|2                      13.91±0.01  0.054±0.003    1779.747  1779.752  2.7
GM-GM-AEmA-GM-AEmA|2                  17.51±0.01  0.046±0.028    2342.976  2342.985  3.6
GM-AEmA-GM-AEmA (deacetyl)|2          15.17±0.01  0.029±0.022    1822.789  1822.794  3.0
GM-AEmA-GM-AEmG (anhydro)|2           19.12±0.01  0.021±0.001    1830.761  1830.763  0.7
GM-AEmA-GM-AEmAG (anhydro)|2          19.73±0.01  0.019±0.002    1901.796  1901.800  2.1
GM-AEmA-GM-AEmAA (anhydro)|2          21.17±0.02  0.015±0.002    1915.812  1915.816  1.8
GM-GM-AEmA-GM-AEm|2                   16.85±0.00  0.003±0.004    2271.943  2271.947  2.1

Trimers (2.94%±0.36%)
GM-AEmA-GM-AEmA-GM-AEmA|3             18.86±0.01  1.751±0.221    2788.192  2788.202  3.5
GM-AEmA-GM-AEmA-GM-AEm|3              18.23±0.21  0.371±0.031    2717.158  2717.164  2.2
GM-AEmA-GM-AEmA-GM-AEmA (anhydro)|3   22.39±0.02  0.222±0.027    2768.169  2768.175  2.3
GM-AEmA-GM-AEmA-GM-AEmKR|3            17.54±0.01  0.207±0.028    3001.350  3001.360  3.4
GM-AEmA-GM-AEmA-GM-AEm (anhydro)|3    21.60±0.02  0.117±0.003    2697.133  2697.138  1.8
GM-AEmA-GM-AEmA-GM-AEmKR (anhydro)|3  20.90±0.16  0.088±0.026    2981.328  2981.334  2.2
GM-AEmA-GM-AEmA-GM-AEmG|3             17.72±0.01  0.039±0.004    2774.182  2774.186  1.4
GM-AEmA-GM-AEm-GM-AEm|3               17.45±0.01  0.029±0.005    2646.123  2646.127  1.7
GM-AEmA-GM-AEm-GM-AEm (anhydro)|3     21.16±0.01  0.025±0.001    2626.096  2626.101  1.9
GM-AEmA-GM-AEm-GM-AEmKR|3             17.11±0.01  0.022±0.002    2930.316  2930.323  2.7
GM-AEmA-GM-AEmA-GM-AEmAG|3            18.24±0.01  0.021±0.001    2845.217  2845.223  2.0
GM-AEmA-GM-AEmA-GM-AEmAA|3            19.23±0.01  0.014±0.002*   2859.235  2859.239  1.3
GM-AEmA-GM-AEm-GM-AEmKR (anhydro)|3   20.31±0.02  0.014±0.005    2910.293  2910.297  1.5
GM-AEm-GM-AEmG-GM-AEmAG|3             17.18±0.00  0.004±0.005    2703.143  2703.149  2.0
GM-AEmA-GM-AEm-GM-AEmG (anhydro)|3    21.21±0.02  0.011±0.003*   2754.157  2754.160  1.1
GM-AEmA-GM-AEmA-GM-AEmAG (anhydro)|3  21.77±0.01  0.006±0.004    2825.189  2825.197  2.8

  1. Inferred dimers and trimers are based on the most abundant monomers and could correspond to alternative structures.

  2. G: GlcNAc; M: MurNAc; m: meso-diaminopimelic acid; the number following the symbol ‘|’ refers to the oligomerisation state (1 for monomers, 2 for dimers, and 3 for trimers).

  3. *

    Calculated from two values.
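
The group totals in Table 1 (glycans, monomers, dimers, trimers) can be recovered by summing abundances according to the number after the '|' symbol, as footnote 2 describes. A minimal Python sketch follows, using a five-row subset of the table. The function names are hypothetical, and the use of state 0 for glycans is inferred from the table rows rather than stated in the footnote.

```python
from collections import defaultdict

# (structure label, mean abundance in %) pairs copied from Table 1 (subset only).
ROWS = [
    ("GM|0", 3.465),
    ("GM-AEmA|1", 36.098),
    ("GM-AEm|1", 14.352),
    ("GM-AEmA-GM-AEmA|2", 17.247),
    ("GM-AEmA-GM-AEmA-GM-AEmA|3", 1.751),
]

# States 1-3 follow footnote 2; state 0 (glycans) is inferred from the table.
STATE_NAMES = {0: "glycans", 1: "monomers", 2: "dimers", 3: "trimers"}


def split_label(label):
    """Split 'GM-AEmA|1' into the structure name and its oligomerisation state."""
    structure, _, state = label.rpartition("|")
    return structure.strip(), int(state)


def abundance_by_state(rows):
    """Sum mean abundances (%) per oligomerisation state."""
    totals = defaultdict(float)
    for label, abundance in rows:
        _, state = split_label(label)
        totals[STATE_NAMES[state]] += abundance
    return dict(totals)


if __name__ == "__main__":
    for name, total in abundance_by_state(ROWS).items():
        print(f"{name}: {total:.3f}%")
```

Run on all rows of Table 1 rather than this subset, the same summation should reproduce the tabulated group totals to within rounding.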

Table 2
Automated identification of P. aeruginosa peptidoglycan fragments.
Inferred structure                          Mass                    ∆ppm                        MaxQuant
                                            Theoretical  Observed   This work  Anderson et al.
GM (anhydro)                                478.1799     478.1780   4.0        –2.7             +
GM                                          498.2061     498.2042   3.9        –4.2             +
GM (x2) (deacetyl)                          934.3755     934.3706   5.3        –8.6             +
GM (x2) (anhydro)                           956.3598     956.3551   5.0        6.0              +
GM (x2)                                     976.3860     976.3794   6.7        –6.1             +
GM (x3) (deacetyl)                          1412.5554    1412.5490  4.5        –6.2             +
GM (x3) (anhydro)                           1434.5397    1434.5348  3.4        –7.5             +
GM (x3)                                     1454.5659    1454.5592  4.6        –5.3             +
GM (x4)                                     1932.7458    1932.7352  5.5        –5.1             +
GM-AE (anhydro)                             678.2596     678.2567   4.3        –9.1             +
GM-AE                                       698.2858     698.2830   3.9        –12.9            +
GM-AEJ (anhydro)                            850.3444     850.3401   5.1        –10.6            +
GM-AEJ                                      870.3706     870.3676   3.5        –5.9             +
GM-AEJA (anhydro)                           921.3815     921.3765   5.4        –9.9             +
GM-AEJG                                     927.3920     927.3868   5.6        –8.9             +
GM-AEJA                                     941.4077     941.4045   3.4        –5.0             +
GM-AEJC                                     973.3843     973.3763   8.2        –2072.2          +
GM-AEJL                                     983.4593     983.4498   9.6        –15.5            +
GM-AEJK                                     998.4703     998.4624   8.0        –10.6            +
GM-AEJM                                     1001.4153    1001.4060  9.2        –13.5            +
GM-AEJAA                                    1012.4448    1012.4413  3.4        –7.8             +
GM-AEJY (anhydro)                           1013.4091    1013.4242  –14.9      17.8             +
GM-AEJF                                     1017.4433    1017.4347  8.4        –15.0            +
GM-AEJY                                     1033.4353    1033.4278  7.2        –5.3             +
GM-AEJAV                                    1040.4808    1040.4716  8.8        –14.7            +
GM-AEJIA                                    1054.4964    1054.4874  8.5        –11.3            +
GM-AEJW                                     1056.4394    1056.4455  –5.8       4.0              +
GM-AEJAM                                    1072.4524    1072.4460  5.9        –4.3             +
GM-AEJKR                                    1154.5667    1154.5631  3.1        –8.1             +
GM-GM-AE                                    1176.4836    1176.4590  20.9       –24.7            +
GM-GM-AEJ                                   1348.5684    1348.5457  16.9       –24.9            +
GM-GM-AEJA                                  1419.6055    1419.5824  16.2       –23.5            +
GM-AEJA-GM-AEJ (amidase product)            1313.5721    1313.5674  3.5        –11.0            +
GM-AEJA-GM-AEJA (amidase product)           1384.6092    1384.6037  4.0        –7.4             +
GM-AEJ-GM-AEJ (anhydro)                     1702.7042    1702.6976  3.9        38.3             +
GM-AEJ-GM-AEJ                               1722.7304    1722.7234  4.1        –8.6             +
GM-AEJA-GM-AEJ (double anhydro)             1753.7151    1753.7043  6.2        –7.2             +
GM-AEJA-GM-AEJ (anhydro)                    1773.7413    1773.7339  4.2        –11.1            +
GM-AEJA-GM-AEJ                              1793.7675    1793.7596  4.4        –8.8             +
GM-AEJA-GM-AEJA (deacetyl)                  1822.7941    1822.7808  7.3        –7.4             +
GM-AEJA-GM-AEJA (double anhydro)            1824.7601    1824.7447  8.4        –15.6            +
GM-AEJA-GM-AEJA (anhydro)                   1844.7784    1844.7704  4.3        –8.3             +
GM-AEJA-GM-AEJG                             1850.7889    1850.8158  –14.6      9.7              +
GM-AEJA-GM-AEJA                             1864.8046    1864.7962  4.5        –6.6             +
GM-AEJA-GM-AEJK (anhydro)                   1901.8410    1901.8297  5.9        –14.5            +
GM-AEJA-GM-AEJL                             1906.8562    1906.8452  5.8        –11.3            +
GM-AEJA-GM-AEJK                             1921.8672    1921.8586  4.5        –12.0            +
GM-AEJA-GM-AEJF                             1940.8402    1940.8263  7.2        –8.8             +
GM-AEJA-GM-AEJY                             1956.8322    1956.8210  5.7        –7.6             +
GM-AEJA-GM-AEJAL                            1977.8933    1977.8813  6.0        –10.7            +
GM-AEJA-GM-AEJKR                            2077.9636    2077.9589  2.2        –13.0            +
GM-GM-AEJ-GM-AEJ                            2200.9282    2200.9000  12.8       –17.7            +
GM-GM-AEJA-GM-AEJ                           2271.9653    2271.9368  12.6       –18.4            +
GM-GM-AEJA-GM-AEJA                          2343.0024    2342.9734  12.4       411.4            +
GM-AEJA-GM-AEJA-GM-AEJ (double anhydro)     2677.1120    2677.1000  4.5        –10.7            +
GM-AEJA-GM-AEJA-GM-AEJ (anhydro)            2697.1382    2697.1259  4.6        –8.6             +
GM-AEJA-GM-AEJA-GM-AEJ                      2717.1644    2717.1532  4.1        –10.7            +
GM-AEJA-GM-AEJA-GM-AEJA (double anhydro)    2748.1491    2748.1363  4.7        –11.0            +
GM-AEJA-GM-AEJA-GM-AEJA (anhydro)           2768.1753    2768.1674  2.9        –11.2            +
GM-AEJA-GM-AEJA-GM-AEJA                     2788.2015    2788.1919  3.4        –9.7             +
GM-AEJA-GM-AEJA-GM-AEJK (anhydro)           2825.2379    2825.2205  6.1        –9.3             +
GM-GM-AEJA-GM-AEJA-GM-AEJ                   3195.3622    3195.3264  11.2       –14.0            +
GM-GM-AEJA-GM-AEJA-GM-AEJA                  3266.3993    3266.3630  11.1       –12.5            +

  1. Alternative structures were matched:

  2. GM-AEJ-GM-AEJK.

  3. GM-AEJ-GM-AEJKA (anhydro).

  4. GM-AEJ-GM-AEJKA.

  5. GM-AEJ-GM-AEJA-GM-AEJKA (anhydro).

Table 2—source data 1

Pseudomonas aeruginosa matched muropeptides not reported previously.

https://cdn.elifesciences.org/articles/70597/elife-70597-table2-data1-v1.xlsx
Table 2—source data 2

Raw output of automated search using MaxQuant and PGFinder.

https://cdn.elifesciences.org/articles/70597/elife-70597-table2-data2-v1.xlsx
Key resources table
Reagent type (species) or resource                    Designation          Source or reference                           Identifiers         Additional information
Strain, strain background (Escherichia coli)          BW25113              https://doi.org/10.1073/pnas.120163297        RRID:Addgene_72340  Model strain for PG analysis
Strain, strain background (Clostridioides difficile)  R20291               https://doi.org/10.1128/JB.0073107            -                   Model strain for PG analysis
Strain, strain background (Clostridioides difficile)  M7404                https://doi.org/10.1371/journal.ppat.1002317  -                   Model strain for PG analysis
Software, algorithm                                   PGFinder v.0.02      This work                                     -                   Used for MS1 analysis of PG structure
Software, algorithm                                   Byos v.3.9–32        Protein Metrics Inc                           -                   Used for MS data deconvolution and MS/MS analysis
Software, algorithm                                   MaxQuant v2.0.1.0    Cox and Mann, 2008                            RRID:SCR_014485     Used for MS data deconvolution
Software, algorithm                                   Perseus v.1.6.10.53  Tyanova et al., 2016                          RRID:SCR_015753     Used for statistical analysis of muropeptide abundance

Additional files
