1. Immunology and Inflammation
Download icon

Different B cell subpopulations show distinct patterns in their IgH repertoire metrics

  1. Marie Ghraichy
  2. Valentin von Niederhäusern
  3. Aleksandr Kovaltsuk
  4. Jacob D Galson
  5. Charlotte M Deane
  6. Johannes Trück  Is a corresponding author
  1. Division of Immunology, University Children's Hospital and Children’s Research Center, University of Zurich (UZH), Switzerland
  2. Department of Statistics, University of Oxford, United Kingdom
  3. Alchemab Therapeutics Ltd, United Kingdom
Research Article
  • Cited 0
  • Views 525
  • Annotations
Cite this article as: eLife 2021;10:e73111 doi: 10.7554/eLife.73111

Abstract

Several human B cell subpopulations are recognised in the peripheral blood, which play distinct roles in the humoral immune response. These cells undergo developmental and maturational changes involving VDJ recombination, somatic hypermutation and class switch recombination, altogether shaping their immunoglobulin heavy chain (IgH) repertoire. Here, we sequenced the IgH repertoire of naïve, marginal zone, switched and plasma cells from 10 healthy adults along with matched unsorted and in silico separated CD19+ bulk B cells. Using advanced bioinformatic analysis and machine learning, we show that sorted B cell subpopulations are characterised by distinct repertoire characteristics on both the individual sequence and the repertoire level. Sorted subpopulations shared similar repertoire characteristics with their corresponding in silico separated subsets. Furthermore, certain IgH repertoire characteristics correlated with the position of the constant region on the IgH locus. Overall, this study provides unprecedented insight over mechanisms of B cell repertoire control in peripherally circulating B cell subpopulations.

Introduction

B cell development starts in the bone marrow where immature B cells must assemble and express on their surface a functional but non-self-reactive B cell antigen receptor (BCR) (Lefranc and Lefranc, 2001). The generation of the heavy and light chain of the BCR is mediated by the random and imprecise process of V(D)J recombination (Tonegawa, 1983). Further development of B cells occurs in the periphery in response to stimulation with the process of somatic hypermutation (SHM) through which point mutations are introduced in the genes coding for the V(D)J part of the immunoglobulin heavy (IgH) and light chain (Jolly et al., 1996). Subsequently, B cells with a mutated BCR providing increased antigen affinity are selected and show increased survival and proliferation capacity (Zheng et al., 2005).

Furthermore, class switch recombination (CSR) modifies the IgH constant region, resulting in the generation of B cells with nine different immunoglobulin isotypes or isotype subclasses, namely IgD, IgM, IgG1-4, IgA1/2 and IgE (Stavnezer et al., 2008). This process involves the replacement of the proximal heavy chain constant gene by a more distal gene. Class switching is an essential mechanism during humoral immune responses as the constant region of an antibody determines its effector function (Vidarsson et al., 2014). Both direct switching and sequential switching upon a second round of antigen exposure have been reported (Horns et al., 2016; Cameron et al., 2003; Zhang et al., 1994).

Through developmental mechanisms and further differentiation in the periphery, several phenotypically distinct circulating B cell subpopulations are generated (Allman and Pillai, 2008). They include naïve, marginal zone (MZ), switched memory B cells and plasma cells (PC), which are mainly characterised by their differential expression of surface markers and by playing distinct roles in the adaptive immune response (Leandro, 2013 ) High-throughput sequencing of the IgH repertoire (AIRR-seq) has made it possible to improve our understanding of the different components of the adaptive immune system in health and disease, and following vaccine challenge (Mandric et al., 2020; Ghraichy et al., 2018; Galson et al., 2014; Lindau and Robins, 2017; Georgiou et al., 2014). Previous studies using both high- and low-throughput sequencing techniques have already reported important differences between B cell subpopulations affecting their IgH repertoire composition, VDJ gene usage, mutations and clonality (Berkowska et al., 2011; Mroczek et al., 2014; Galson et al., 2015; Wu et al., 2010).

Recent AIRR-seq workflows allow coverage of a sufficient part of the IgH constant region in addition to the VDJ region, making it possible to assign antibody classes and subclasses on an individual sequence level. It is common practice to use unsorted bulk B cells from peripheral blood as a starting material and use the constant region information combined with the degree of SHM to group transcripts in silico into different B cell populations (Glanville et al., 2011; Ghraichy et al., 2020). Using isotype-resolved IgH sequencing of bulk B cells, isotype subclasses have been found to show differences in their repertoire characteristics (Jackson et al., 2014; Kitaura et al., 2017). However, it remains unknown how the IgH repertoire of bioinformatically separated transcripts originating from bulk-sequenced B cells compares to the repertoire of their corresponding circulating B cell subpopulations. It is also unknown how IgH sequences with the same constant region originating from different cell types compare.

Here, we used an established AIRR-seq workflow that captures the diversity of the variable IgH genes together with the isotype subclass usage to study in detail the repertoire of CD19+ bulk B cells as well as flow cytometry sorted naïve, MZ, switched and PCs from 10 healthy adults. We applied statistical methods such as principal component analysis (PCA), k-means clustering, linear discriminant analysis (LDA) and machine learning algorithms to combine several repertoire metrics and characterise the different B cell subpopulations. We show that transcripts from physically sorted B cell subpopulations share similar characteristics with their corresponding subsets in the bulk that were grouped in silico using isotype subclass information and number of mutations. We further demonstrate that sequences with the same isotype subclass originating from different cell types are closely related, suggesting the presence of isotype-specific rather than cell-type-specific signatures in the IgH repertoire. We finally correlate these signatures to the isotype subclass positioning on the locus and find that downstream subclasses exhibit enhanced signs of maturity, overall providing new insights into the selection and the peripheral differentiation of distinct B cell subpopulations.

Results

Physically sorted B cell subpopulations and their corresponding subsets in the bulk share similar repertoire characteristics

We compared IgH repertoire characteristics between the following B cell subpopulations: Bnaive, BMZ, BPC_MD, BPC_AG and Bswitched and their corresponding subsets that we obtained in silico from Bbulk: Bbulk_naïve, Bbulk_MD and Bbulk_switched. We identified three separate clusters: one made of predominantly BMZ, Bbulk_MD and BPC_MD; another with only Bnaive and Bbulk_naïve; and a third cluster with predominantly Bbulk_switched, BPC_AG and Bswitched (Figure 1A) by combining all repertoire characteristics in a PCA and applying k-means clustering. To test whether this clustering pattern was driven by VJ gene usage, complementarity-determining region (CDR) 3 physiochemical properties or the general repertoire metrics, we analysed these variables separately. Using V family and J gene usage, there was a clear separation between naïve and memory cells mostly driven by differences in V1/3 and J4/6 usage (Figure 1—figure supplement 1). However, no separation between BMZ/BPC_MD/ Bbulk_MD and Bswitched/BPC_AG/Bbulk_switched was observed (Figure 1B). The CDR3 physiochemical properties alone created similar clusters as when combined together with the other metrics (Figure 1C). This separation was mostly driven by a lower basic and a higher aromatic content in addition to a higher gravy index and a lower polarity in Bnaive/Bbulk_naïve compared to memory subpopulations (Figure 1—figure supplement 2). Global repertoire metrics also created a clear separation between Bnaive/Bbulk_naïve, Bswitched/BPC_AG/Bbulk_switched and BMZ/BPC_MD/Bbulk_MD subpopulations mostly driven by higher mutation counts, NP length and selection pressure in the CDR and lower junction length and diversity in Bswitched compared to Bnaive (Figure 1—figure supplement 3).

Figure 1 with 3 supplements see all
Different repertoire characteristics similarly separate between B cells subpopulations.

Principal component analysis (PCA) (left) and composition of the clusters formed using k-means clustering with k = 3 (right) applied on (A) all repertoire characteristics, (B) V family and J gene usage (see Figure 1—figure supplement 1), (C) physiochemical properties of CDR3 junction (see Figure 1—figure supplement 2) and (D) global repertoire metrics (see Figure 1—figure supplement 3). The percentage of all variation in the data that is explained by PC1 and PC2 is shown on the x and y axis, respectively, between brackets. In the PCA plots, areas are the convex hulls of the subsets, and the largest point of one colour represents the centre of that hull.

In summary, we found that V family and J gene usage, the physiochemical properties of the CDR3 and global repertoire metrics similarly distinguish between B cell subpopulations. Bnaive, BMZ/BPC_MD and Bswitched/BPC_AG were divergent but shared properties with their relative corresponding subsets in the bulk.

Accurate prediction of cell type based on repertoire features on a single-cell level

We constructed a sequence classifier that predicts the cell type of a sequence using sequence attributes and different repertoire metrics. Since we subsampled our data making our datasets perfectly balanced, we used only accuracy as a performance metric. Logistic regression, decision tree and random forest classifiers all performed satisfactorily (Figure 2A). However, logistic regression performed poorly on correctly classifying Bswitched and BPC_AG, for which accuracy was almost equal to chance. The performance of all three classifiers was highest in distinguishing between Bnaive and other cell types.

Classification accuracies and feature scores on a single-sequence level.

(A) Heatmap showing pairwise classification accuracy results using logistic regression, decision tree and random forest classifier. (B) Random forest feature scores by cell pair.

The random forest classifier was the most successful compared to the other two and the most accurate in predicting the cell type of a sequence. We assessed the relevance of specific predictors in properly classifying cell types by calculating feature importance scores for each cell pair (Figure 2B). The number of mutations was the highest scoring feature for all cell pairs except for distinguishing between Bswitched and BPC_AG and between BMZ and BPC_MD for which CDR3 amino acid characteristics had higher scores. Within the CDR3 physiochemical properties, average bulkiness, average polarity and the gravy hydrophobicity index were the most differentiating between cell types, whereas the basic and acidic content of the CDR3 chain seemed to be less important. R/S ratio in CDR and FWR and the junction length appeared to have similar scores and were more important in cases where Bnaive were not one of the two cell types. V family and J gene appeared to have low importance in distinguishing between all cell pairs.

Within class-switched subsets, sequences with same constant region from different cell types show similar features

When comparing class-switched transcripts originating from Bbulk_switched, Bswitched and BPC_AG, isotype subclasses were similarly distributed. IgA1 was the dominant subclass in IgA transcripts, whereas IgA2 was less frequently used. All cell subpopulations showed a dominant use of IgG1 and IgG2 with little IgG3 and negligible IgG4 (Figure 3A). Usage of IgA1 in BPC_AG was similar to Bswitched and Bbulk_switched (p=0.28 and p=0.25, Kruskal–Wallis). IgG3 usage was significantly lower in BPC_AG compared to Bbulk_switched and Bswitched (p=0.01, p=0.01, Kruskal–Wallis) while IgG1 usage tended to be lower (p=0.13 and p=0.11, Kruskal–Wallis) and IgG2 usage higher in BPC_AG compared to the other two B cell subpopulations (p=0.11 and p=0.11, Kruskal–Wallis).

Analysis of isotype subclasses in IgG and IgA transcripts.

(A) Isotype subclass distribution by cell type. Error bars represent the standard error of the mean. (B) Principal component analysis (PCA) on all repertoire properties combined by cell type and isotype subclass. Areas are the convex hulls of a group, and the largest point of one colour represents the centre of that hull. (C) Composition of the clusters formed by applying the k-means clustering algorithm on all data with increasing k from k = 2 to k = 5. Underlyaing source data can be found under Figure 4—source data 1.

When combining repertoire characteristics by isotype subclass and cell type for class-switched transcripts resulting from Bbulk_switched, Bswitched and BPC_AG, we found that samples with the same constant region originating from different cell types overlapped (Figure 3B). We identified two clusters: one mainly composed of IgG1 and IgG3 samples from all cell types and another with IgA1, IgA2 and IgG2 samples by applying k-means clustering with k = 2 (Figure 3C). By further dividing the data and with increasing k, we observed that newly formed clusters were mainly composed of distinct isotype subclasses, while the cell type itself was not a defining factor for cluster formation. Interestingly, we could not see a clear separation between IgG2 and IgA2 samples with increasing number of clusters.

B cell repertoire metrics correlate with constant region positioning on the IgH locus in class-switched subsets

The IgH locus contains nine constant genes. The genes encoding for IgM and IgD are the closest to the V-D-J recombination sites while those for IgG3, IgG1 and IgA1 are further downstream but still close to IgM/IgD, whereas more distant on the locus are the genes that encode for IgG2, IgG4, IgE and IgA2 (Figure 4A). We determined and compared B cell repertoire metrics between different subclasses in BPC and Bswitched and compared those to Bnaive and BMZ. Bnaive showed the lowest number of mutations and R/S ratio and longest CDR3 junction. Memory subsets had a high number of mutations, with BMZ and BPC_MD having fewer mutations than class-switched transcripts (Figure 4B). IgM-distal subclasses IgG2 and IgA2 in both Bswitched and BPC_AG showed the highest R/S ratio, indicating high selection pressure (Figure 4C). All antigen-experienced subsets had a lower junction length compared to Bnaive except for IgM-proximal transcripts IgG3 and IgG1 (Figure 4E). The proportion of IGHV4-34, the gene associated with self-reactivity (Bashford-Rogers et al., 2018), was lower in memory subsets compared to Bnaive except for IgG3 from Bswitched for which the proportion of IGHV4-34 was similar to naïve subsets (Figure 4F). Within IgG and IgA sequences, genomic distance from IgM correlated with a higher R/S ratio, shorter junction and lower usage of IGHV4-34. BPC had a significantly lower diversity compared to all other cell types (Figure 4G). Interestingly, transcripts from Bswitched showed a similar diversity to Bnaive, whereas BMZ were less diverse. Within BPC_AG, IgM-distal subclasses showed a lower diversity.

Figure 4 with 1 supplement see all
Analysis of repertoire metrics by isotype subclass and cell type.

(A) Overview of the IgH constant region locus. Comparison of (B) mutation counts, (C) R/S ratio, (D) selection pressure, (E) junction length, (F) proportion of IGHV4-34 and (G) diversity between different B cell subpopulations. Linear discriminant analysis (LDA) fitted on (H) V family and J gene usage (see Figure 4—figure supplement 1) and (I) CDR3 amino acid physiochemical properties.

IGHV family and IGHJ gene usage also showed a discrepancy between different subsets. IGHV family usage in IgM-proximal subclasses IgG3 and IgG1 was similar to BNaive. BMZ and IgM-distal subclasses were enriched in IGHJ4 at the expense of IGHJ6 compared to naïve cells and IgG1-3 B cell subsets (Figure 4—figure supplement 1). To reduce the dimensionality of all data points into a single one-dimensional axis, we performed LDA fitted on the relative gene frequencies (Figure 4H). This showed a clear distinction between Bnaive, IgG1-3 and BMZ, IgG2 and IgA1-2. An LDA fitted on the physiochemical properties of the CDR3 junction also showed a clear distinction between naïve and memory subsets, with IgG3 and IgG1 being closest to Bnaive and IgG2 and IgA2 overlapping and furthest away (Figure 4I).

In summary, we found that in IgA and IgG subclasses different B cell repertoire metrics correlate with the positioning of their respective subclass genes on the IgH locus, namely with the increasing genomic distance from IgM, with the proximal-switched IgH subclasses being more similar to naïve. Memory IgM subsets (BMZ and BPC_MD) were found to have lower mutations and R/S ratio than class-switched subsets. However, in other properties such as junction length, proportion of IGHV4-34 and diversity, they were more similar to IgM-distal subclasses.

Discussion

Here, we used AIRR-seq to characterise similarities and differences in the IgH repertoire of bulk B cells and different sorted naïve and memory B cell populations. This allowed for an in-depth understanding of the mechanisms underlying B cell responses. We report differences in V family and J gene usage, CDR3 physiochemical properties and global repertoire characteristics that similarly distinguish between naïve, IgM/IgD memory and class-switched subsets both at the repertoire and at the sequence level. Furthermore, we show alterations in the repertoire characteristics at the isotype subclass level unrelated to cell type that correlate with the position of the constant gene on the IgH locus. This study provides powerful insight on biological mechanisms underlying the B cell response as well as novel understanding of AIRR-seq methodologies to be taken into account in future studies.

Earlier studies have revealed differences in gene usage and repertoire properties between different sorted B cell subpopulations (Mroczek et al., 2014; Wu et al., 2010; Briney et al., 2012a; Larimore et al., 2012; DeWitt et al., 2016; Bautista et al., 2020). In addition, some differences in repertoire characteristics across isotype subclasses have been determined by using bulk B cells as starting material (Jackson et al., 2014; Kitaura et al., 2017; de Jong et al., 2017). However, studies combining IgH sequencing from both bulk and sorted B cell subpopulations have been sparse. Here, we sought to compare IgH repertoire properties between transcripts sorted in silico from bulk B cells and physically sorted B cell subpopulations. In addition, using isotype-resolved sequencing, we aimed to perform a comparative analysis of the IgH repertoire of transcripts with the same constant region originating from different cell types and correlate the differences in repertoire with position on the IgH locus. The type of material obtained, the sequencing depth and the experimental and computational methods used here constitute by far the most thorough and comprehensive analysis of repertoire differences between B cell subsets and isotype subclasses and thereby provide unique insights into the underlying mechanisms, leading to the formation of different B cell subpopulations.

Previous work involving human naïve and antigen-experienced B cell repertoires havs shown naïve B cells to have shorter junctions and higher usage of IGHJ6 and IGHV3, and lower usage of IGHJ4 and IGHV1 compared with IgM memory and switched B cells (Briney et al., 2012a; Larimore et al., 2012; DeWitt et al., 2016; Bautista et al., 2020). Differences in gene usage and CDR3 properties between IgM memory and switched B cells have also been reported (Wu et al., 2010). IgM memory and switched B cells have been found to use more negatively charged residues and to have less hydrophobic junctions compared with naïve B cells (Mroczek et al., 2014; Wu et al., 2010). Here, we focused on a more detailed examination of the repertoires by combining multiple characteristics using dimensionality reduction methods. Results of a previous study revealed that combining only a few repertoire characteristics is sufficient to discriminate between B cell subpopulations (Galson et al., 2015). In addition, an LDA combining V gene family proportions has been found to successfully distinguish between IgM and IgG repertoires (Friedensohn et al., 2018). We extend these findings by showing that using V family and J gene usage, CDR3 physiochemical properties or global repertoire characteristics similarly allow to separate between naïve and memory subpopulations. This suggests that distinct B cell subpopulations derive from different developmental mechanisms and are subject to selective processes that lead to similar variable gene identity. The importance of the CDR3 and its location at the centre of the antigen-binding site makes the observed differences in physiochemical properties and junction length interesting. These findings may suggest that different B cells subpopulations stimulated by diverse antigens have distinct junction compositions and properties. Further work to investigate CDR3-specific residue content and prediction of CDR3 protein structures could be enriching for antibody specificity and antigen recognition in different B cell subpopulations.

Previous research has demonstrated that same B cell subpopulations from different donors are more similar in their repertoire characteristics than different B cell subpopulations within an individual (Briney et al., 2012b; Rubelt et al., 2016). This has led to the understanding that differences between naïve and memory cells are conserved across unrelated individuals. Our findings are in agreement with these observations, and we extend on those by showing that the main defining factor in repertoire similarity is the constant region type, namely the isotype subclass, and that differences between subclasses are conserved across both cell type and individual. This finding suggests the existence of an isotype-based mechanism for repertoire control that is constant across cell types and individuals.

In addition to the comparative analysis of the different peripheral B cell subsets, our study represents, to our knowledge, the first comparison of bulk B cell sequencing with sorted B cell subpopulations. We showed that sequencing unsorted B cells from peripheral blood and combining the constant region information with the degree of SHM to bioinformatically group transcripts yields accurate results comparable to physical sorting, especially when analysing global repertoire characteristics. We acknowledge that this might be limiting in tasks sensitive to potential biases from different RNA levels per cell such as identifying antigen-specific sequences from PCs.

Recent IgH repertoire studies have moved towards using machine learning and artificial intelligence in contrast to traditional statistical approaches for goals including vaccine design, immunodiagnostics and antibody discovery (Greiff et al., 2017; Ostmeyer et al., 2017; Konishi et al., 2019; Shemesh et al., 2021). Previous work has focused on representing repertoires as sequence or subsequence-based features, that is, overlapping amino acid k-mers and their Atchley biophysiochemical properties (Greiff et al., 2017; Ostmeyer et al., 2017). Here, we report a simple pairwise classifier that successfully predicts the cell type of a sequence based on only the commonly used sequence attributes such as number of mutations and junction length. Random forest and decision tree classifiers outperformed the logistic regression algorithm suggesting a non-linear separation between cell types. A common concern when applying machine learning is the possibility of over-fitting. To prevent this, we trained the algorithm on 80% of the data and tested its performance on the remaining unseen 20%. We also subsampled every pair of classes to equal number of sequences in order to balance the dataset. The model presented here is applied only within an individual and is thereby confined by repertoire signals that might be individual-specific. More work improving the generalisability of the model across individuals would be revolutionising in terms of its potential practical applications. Unsurprisingly, the number of mutations was the most important feature in distinguishing between most cell types. These results, along with previous work, are promising and suggest that increasing the predictive potential of machine learning methods could help in finding sequence characteristics that distinguish between groups, such as disease state and healthy.

Studies indicate that both direct and sequential CSR to IgM-distal isotype subclasses can occur (Wesemann et al., 2011; Looney et al., 2016). Several studies have provided evidence for sequential CSR. IgM was found to commonly switch to proximal subclasses (IgG1, IgA1 and IgG2), but direct switches from IgM to more downstream subclasses (IgG4, IgE or IgA2) were rare (Horns et al., 2016). It has also been reported that a deficiency in IgG3, the most IgM-proximal subclass, frequently results in a decrease in other IgG subclasses (Meyts et al., 2006). Although it is challenging to determine whether sequential CSR occurs during a primary response, by re-entry into the germinal centre or during a secondary response to the same antigen, we and others have shown that IgM-distal subclasses accumulate with age, likely due to secondary encounter with antigen (Ghraichy et al., 2020; IJspeert et al., 2016). Studies comparing the mean mutation number between isotype subclasses have shown contradicting results. In one study, mutations varied in relation to the constant region position on the IgH locus, with the closest to IgM (IgG3) having the lowest mutations (Jackson et al., 2014), while in another study, no such difference was observed (Kitaura et al., 2017). We did not find a difference in number of mutations among IgG subclasses. Our findings rather suggest that mutation is more efficient in more downstream subclasses as we found that these exhibit higher R/S ratios and selection pressure in the CDR, consistent with previous studies (de Jong et al., 2017). Generally, class-switched IgM-distal subclasses showed signs of maturity (shorter junctions, lower IGHV4-34 usage) while transcripts from class-switched IgM proximal subclasses were more similar to those of naïve B cells. These results suggest that sequential CSR subjects B cells to selective forces, leading to more mature variable gene properties without necessarily accumulating more mutations.

In summary, in this study we took an extensive look at the IgH repertoire of different flow cytometry sorted as well as bioinformatically grouped cell types and isotype subclasses of healthy individuals. Using advanced bioinformatic tools, statistical analysis and machine learning, this analysis provides deep insight into the different mechanisms of B cell development and boosts our understanding of the B cell system components in health. The data and methods presented here provide a foundation for future work investigating the immune repertoire of patients with altered immune status and hold promise for the application of AIRR-seq along with machine learning techniques in clinical and diagnostic settings.

Materials and methods

Key resources table
 Reagent type (species) or resource Designation Source or reference Identifiers Additional information
AntibodyAnti-human CD19-PE (mouse monoclonal)BioLegendCat# 363003; RRID:AB_2564125FACS (5 μl per test)
AntibodyAnti-human CD27-FITC (mouse monoclonal)BioLegendCat# 302806; RRID:AB_314298FACS (5 μl per test)
AntibodyAnti-human CD38-APC/Fire 750 (mouse monoclonal)BioLegendCat# 356626; RRID:AB_2616713FACS (5 μl per test)
AntibodyAnti-human IgD-APC (mouse monoclonal)BioLegendCat# 348222; RRID:AB_2561595FACS (5 μl per test)
AntibodyAnti-human IgM-Pacific Blue (mouse monoclonal)BioLegendCat# 314513; RRID:AB_10574306FACS (5 μl per test)

Sample collection and cell sorting

Request a detailed protocol

This was a descriptive study, hence no formal sample size calculation was performed. Buffy coat samples were obtained from 10 anonymous healthy adults, hence no approval from the local ethics committee was necessary. B cells were first isolated by magnetic cell sorting using the human CD19 MicroBeads (Miltenyi Biotec, San Diego, CA) and the AutoMACS magnetic cell separator. From 9 out of the 10 samples, 3 × 106 bulk CD19+ B cells (Bbulk) were lysed and stored at –80°C. The remaining cells were sorted by flow cytometry into four subpopulations using cell surface markers characteristic for naïve (Bnaive), MZ (BMZ), PCs (BPC) and switched memory B cells (Bswitched). Cells were then lysed and stored at –80°C. Surface markers, demographics, number of cells and purity of each sample are outlined in Supplementary file 1.

RNA extraction and library preparation

Request a detailed protocol

RNA extraction was performed on the lysate using the RNeasy Mini Kit (Qiagen, Hilden, Germany). Libraries were prepared as previously described (Ghraichy et al., 2020). Briefly, two reverse transcription (RT) reactions were carried out for each RNA sample resulting from Bbulk or BPC: one with equal concentrations of IgM and IgD-specific primers and another with IgA, IgG and IgE-specific primers. Only one RT reaction with IgM and IgD-specific primers was performed on Bnaive and BMZ samples; similarly, we applied one RT reaction with IgA, IgG and IgE primers on samples obtained from Bswitched. IgH cDNA rearrangements were then amplified in a two-round multiplex PCR using a mix of IGHV region forward primers and Illumina adapter primers, followed by gel extraction for purification and size selection. The final concentration of PCR products was measured using Qubit prior to library preparation and combined with a total of 12 equally concentrated samples. Final libraries barcoded with individual i7 and i5 adapters were sequenced in each run on the Illumina MiSeq platform (2 × 300 bp protocol).

Data preprocessing

Request a detailed protocol

Preprocessing of raw sequences was carried out using the Immcantation toolkit and as per Ghraichy et al., 2020; Vander Heiden et al., 2014; Gupta et al., 2015. Briefly, samples were demultiplexed based on their Illumina tags. A quality filter was applied, paired reads were joined and then collapsed according to their unique molecular identifier (UMI). Identical reads with different UMI were further collapsed, resulting in a dataset of unique sequences. VDJ gene assignment was carried out using IgBlast (Ye et al., 2013). Isotype subclass annotation was carried out by mapping constant regions to germline sequences using stampy (Lunter and Goodson, 2011). The number and type of V gene mutations was determined as the number of mismatches with the germline sequence using the R package shazam (Gupta et al., 2015). The R package alakazam was also used to calculate the physicochemical properties of the CDR3 amino acid sequences (Gupta et al., 2015). Selection pressure was calculated using BASELINe and the statistical framework used to test for selection was CDR_R/(CDR_R + CDR_S) (Yaari et al., 2012).

In silico grouping of sequences

Request a detailed protocol

For Bbulk samples, we used the constant region information combined with the mutation counts to classify individual sequences into different subsets. IgD and IgM sequences with up to 2 nt mutations across the entire V gene were considered ‘unmutated’ (Bbulk_naïve) to account for remaining PCR and sequencing bias. The remaining mutated IgD and IgM sequences were labelled as IgD/IgM memory (Bbulk_MD). All class-switched sequences were defined as antigen-experienced regardless of their V gene mutation count (Bbulk_switched). We split the sequences originating from BPC into two categories: IgM/IgD BPC (BPC_MD) and switched IgG/IgA PCs (BPC_AG) according to the constant region of the sequences.

Summarising repertoire characteristics

Request a detailed protocol

V family and J gene usage was calculated in proportions for each individual and cell type. We summarised the mean of the following CDR3 physiochemical characteristics: hydrophobicity, bulkiness, polarity, normalised aliphatic index, normalised net charge, acidic side chain residue content, basic side chain residue content, aromatic side chain content by individual and cell type.

Mean junction length, number of mutations and numbers of non-template (N) and palindromic (P) nucleotide added at the junction were calculated by individual and cell type. Selection pressure was summarised separately in CDR and framework region (FWR). Diversity was calculated as the proportion of unique junctions out of total transcripts. The preceding characteristics are referred to as global repertoire metrics.

Dimensionality reduction and clustering

Request a detailed protocol

PCA and k-means clustering were applied to the different repertoire characteristics to explore and find associations in the data. They were applied using the internal R functions prcomp() and kmeans() (R Development Core Team, 2018). LDA was performed using the R function lda() from the package MASS (Venables and Ripley, 2002).

Sequence classifier

Request a detailed protocol

We constructed the sequence classifier using the sklearn package in Python (Pedregosa, 2015). Because we have the constant region information and to avoid error accumulation, we performed a pairwise classification, thereby transforming the multiclass problem into a binary classification. Within every participant and for every pair of cells, we subsampled to the lower sequence number to avoid bias and dataset imbalance. We used the number of mutations, the physiochemical properties and the junction length as numerical input features. The V gene family and J gene were one-hot encoded. In the case where the naïve cells were not one of the two classes, the replacement/silent (R/S) mutation ratios in CDR and FWR were included as features. We split the data into training and testing set using the default test size of 0.2. We used logistic regression, decision tree and random forest classifiers for prediction. The accuracy was recorded to judge the overall performance of the models. For every pair of classes, the mean accuracy of the 10 samples was calculated.

Data availability

Raw data used in this study are available at the NCBI Sequencing Read Archive (https://www.ncbi.nlm.nih.gov/sra) under BioProject number PRJNA748239 including metadata meeting MiAIRR standards (32). The processed dataset is available in Zenodo (https://doi.org/10.5281/zenodo.3585046) along with the protocol describing the exact processing steps with the software tools and version numbers.

The following data sets were generated
    1. Ghraichy M
    2. Trück J
    (2021) NCBI BioProject
    ID PRJNA748239. IgH repertoire sequencing in human B cell subpopulations.
    1. Ghraichy M
    2. Trück J
    (2019) Zenodo
    Pre-processed B cell receptor repertoire sequencing data from BioProject PRJNA527941.
    https://doi.org/10.5281/zenodo.3585046

References

  1. Book
    1. Lefranc MP
    2. Lefranc G
    (2001)
    The Immunoglobulin Factsbook
    Academic Press.
    1. Pedregosa F
    (2015)
    Machine Learning in Python
    Journal of Machine Learning Research: JMLR 12:2825–2830.
    1. Zhang K
    2. Mills FC
    3. Saxon A
    (1994)
    Switch circles from IL-4-directed epsilon class switching from human B lymphocytes Evidence for direct, sequential, and multiple step sequential switch from mu to epsilon Ig heavy chain gene
    Journal of Immunology 152:3427–3435.

Decision letter

  1. Tomohiro Kurosaki
    Reviewing Editor; Osaka University, Japan
  2. Betty Diamond
    Senior Editor; The Feinstein Institute for Medical Research, United States
  3. Wanli Liu
    Reviewer; Tsinghua University, China

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

The described findings are a timely update to fulfill the gaps of previously missed issues to uncover IgH repertoires in different isotype-specific groups and shed significant lights on a better understanding of adaptive immune systems.

Decision letter after peer review:

Thank you for submitting your article "Different B cell subpopulations show distinct patterns in their IgH repertoire metrics" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Betty Diamond as the Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Wanli Liu (Reviewer #1).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

1) As the reviewer 2 has voiced, by fairly referring previous literatures, authors should mention what is the unique outcome from this study, compared with our previous knowledge; indeed, differences in IgH repertoire among different B cell subsets has been already demonstrated by many previous studies. In addition, authors should mention biological implications from this study in a way easily understood by the biologists.

2) Age dependence is important. Thus, analysis of the correlation and proportion of mean mutation concerning the age of enrolled people should be done.

3) The authors demonstrated that IgH of different Ig classes and IgG subclasses show distinct repertoire depending on the distance from Cmu. This finding is novel. However, IgH repertoire of marginal zone B cells and plasma cells producing IgM/IgD supports this conclusion only in some parameters such as number of somatic mutations but not other parameters such as junction length. Moreover, proportion of unique junctions is more influenced by plasma cell differentiation than Ig classes. Therefore, the conclusion of this part needs to be revised.

4) The minimal sampling proportion required to perform the study is of crucial importance. In the material and method section, references should be incorporated, suggesting how the sampling size of 10 healthy individuals has been selected.

5) At the end of the Discussion part, a concluding sentence should be incorporated, illustrating how these algorithmic tools, together with machine learning techniques, can advance our understanding of immunology and address unmet clinical needs related to infectious diseases, immune dysregulation.

Reviewer #1:

Brief Summary of achievement: This potentially important study utilizes advanced statistical methods and machine learning algorithms, and bioinformatics approaches to investigate the IgH repertoire of different B cell and subpopulation cell types to determine whether, physically sorted B cell repertoire shares similar characteristics with their corresponding subsets in bulks. In the current study, the authors solved the connection that transcripts from mechanically sorted B cell subgroups have identical qualities to their bulk counterpart's subsets, which were clustered in silico using isotype subclass knowledge and the multitude of mutations. The authors further showed that different cell types corresponding to the same isotype subclass have similar sequences, implying that the IgH repertoire contains isotype-specific rather than cell-type-specific characteristics. Most impressively, the authors link these signatures to isotype subclass location on the locus and discover that downstream subclasses have more mature signals. Overall, this is a very impressive and innovative approach for studying the process and adds another layer of information into understanding the mechanisms underlying B-cell responses and uncover critical relevance of AIRR-seq approaches that will be used in future research.

Author's contribution and structure of the proposed study: The prescribed findings would be a timely update to fulfill the gaps of previously missed issues to uncover IgH repertoires in different isotype-specific groups and shed significant light on a better understanding of adaptive immune responses. The Candidate has achieved the proposed issues precisely. The findings are presented clearly and concisely, with adequate explanation and interpretation. The results support the theme logically, and the methods have been implemented in a balanced way to achieve the results. The discussion provides a critical elaboration of the results obtained. These studies in the future may provide detailed insights for adaptive immune researchers struggling to develop therapeutic drugs to treat several autoimmune and BCRs structural interlinked diseases. However, including autoimmune disease patients could further help researchers get detailed insights to extract the precise conclusion.

Strengths and weaknesses of the study: This study provides detailed insights into healthy individuals' IgH repertoire characteristics. Previous investigations have concentrated mainly on showing repertoires as a sequence or subsequence-based features. However, these investigations provided a brief pairwise predictor that effectively predicts a sequence's cell type based on the most often utilized sequence parameters, such as the number of mutations and junction length. On the other hand, these studies provide investigations within individuals. However, the feasibility of the model should be enhanced across the individuals. Moreover, interrogating the importance of IgH repertoire characteristics in B cells subsets during diseased conditions or concerning specific age should be paid explicit attention in the future through applying these sound approaches.

Before the publication, the following few points should be well considered:

1 – The authors use ten healthy individuals as readout; however, how V and J repertoire characteristics behave in people of different ages are missing. Secondly, the mean mutation parameter has been interestingly discussed in these studies with isotype-specific subclasses suggesting that, mutation most commonly is seen in downstream subclasses. As per my opinion, it would be interesting to analyze the correlation and proportion of mean mutation concerning the age of enrolled people. The authors can beautifully select people of different ages to analyze the results.

2 – Few terminologies such as PCA, LDA, CDR3 are used at high frequency; I suggest elaborating their abbreviation in the introduction section for better clarity.

3 – At the end of the Discussion part, a concluding sentence should be incorporated, illustrating how these algorithmic tools, together with machine learning techniques, can advance our understanding of immunology and address unmet clinical needs related to infectious diseases, immune dysregulation.

4 – The minimal sampling proportion required to perform the study is of crucial importance. In the material and method section, references should be incorporated, suggesting how the sampling size of 10 healthy individuals has been selected.

These studies comprehensively provided a model to check IgH repertoire within individuals, specifically in healthy individuals, possibly confined by individual-specific repertoire signals. However, it would be fascinating to incorporate diseased individuals in the study to elaborate on the system's feasibility. I would suggest adding few lines in the discussion part as a dimension of future work.

Reviewer #2:

In this paper, the authors analyzed IgH repertoire of various B cell subsets from human donors such as naïve, marginal zone and switched B cells and plasma cells producing IgM or IgD and those producing IgA or IgG using bioinformatic analysis and machine learning. They also analyzed IgH repertoire in bulk and in silico separated B cell subsets. The authors demonstrated that marginal zone B cells and plasma cells producing IgM or IgD show similar IgH repertoire whereas IgH repertoire in switched B cells is similar to that of plasma cells producing IgG or IgA but distinct from the other subsets. Number of somatic mutations, physiochemical nature of CDR3 and usage of VH and JH contribute to the distinction of IgH repertoire between different subsets.

The authors comprehensively analyzed IgH repertoire and nicely showed the difference between different B cell subsets. However, difference in IgH repertoire between different B cell subsets has been demonstrated by various previous studies. Therefore, what biological questions can be addressed by this analysis is somewhat unclear.

The difference of physiochemical properties of CDR3 between different subsets is interesting. Could the author discuss the impact of this difference in antigen binding?

The authors demonstrated that IgH of different Ig classes and IgG subclasses show distinct repertoire depending on the distance from Cmu. This finding is novel. However, IgH repertoire of marginal zone B cells and plasma cells producing IgM/IgD supports this conclusion only in some parameters such as number of somatic mutations but not other parameters such as junction length. Moreover, proportion of unique junctions is more influenced by plasma cell differentiation than Ig classes. Therefore, the conclusion of this part needs to be revised.

https://doi.org/10.7554/eLife.73111.sa1

Author response

Essential revisions:

1) As the reviewer 2 has voiced, by fairly referring previous literatures, authors should mention what is the unique outcome from this study, compared with our previous knowledge; indeed, differences in IgH repertoire among different B cell subsets has been already demonstrated by many previous studies. In addition, authors should mention biological implications from this study in a way easily understood by the biologists.

We have added the following paragraph to our discussion clearly stating our research questions and novelty in contrast with previous work:

“Earlier studies have revealed differences in gene usage and repertoire properties between different sorted B cell subpopulations. In addition, some differences in repertoire characteristics across isotype subclasses have been determined by using bulk B cells as starting material. However, studies combining IgH sequencing from both bulk and sorted B cell subpopulations have been sparse. Here, we sought to compare IgH repertoire properties between transcripts sorted in silico from bulk B cells and physically sorted B cell subpopulations. In addition, using isotype-resolved sequencing, we aimed to perform a comparative analysis of the IgH repertoire of transcripts with the same constant region originating from different cell types and correlate the differences in repertoire with position on the IgH locus. The type of material obtained, the sequencing depth, and the experimental and computational methods used here constitute by far the most thorough and comprehensive analysis of repertoire differences between B cell subsets and isotype subclasses and thereby provide unique insights into the underlying mechanisms leading to formation of different B cell subpopulations.”

2) Age dependence is important. Thus, analysis of the correlation and proportion of mean mutation concerning the age of enrolled people should be done.

We agree with the reviewer that age is an important factor in shaping the IgH repertoire. We and others have previously looked at the effect of age on the IgH repertoire characteristics in detail (Ghraichy, 2020; IJspeert, 2016). These studies have shown that maturation of the B cell compartment with age is primarily observed as an increase in the number of mutations, with most changes occurring in the first decade of life. Changes continue to be made thereafter, but at a much lower rate. Using the data in the present manuscript, additional correlation analysis was conducted for all repertoire characteristics that have been shown to be age dependent in a previous analysis using the same sequencing and analysis protocol (Ghraichy, 2020). This analysis revealed a positive correlation between the number of mutations and age, and between R/S ratio in CDR and age but only in IgG2 transcripts. No age-dependent correlation was observed for any other subsets. However, because of the number of individuals included and their age range, we decided not to include this analysis as we thought it did not add any important value to previous findings. Because our proposed results are based on an intraindividual and cross-subgroup analysis, we believe that our results are valid in general, despite isolated age dependence.

3) The authors demonstrated that IgH of different Ig classes and IgG subclasses show distinct repertoire depending on the distance from Cmu. This finding is novel. However, IgH repertoire of marginal zone B cells and plasma cells producing IgM/IgD supports this conclusion only in some parameters such as number of somatic mutations but not other parameters such as junction length. Moreover, proportion of unique junctions is more influenced by plasma cell differentiation than Ig classes. Therefore, the conclusion of this part needs to be revised.

We thank the reviewer for this comment and have modified the title and conclusion of this section to specify that parameters are dependent on the distance from Cmu in class-switched subsets. We have also added the following sentence at the end of this section:

“Memory IgM subsets (BMZ and BPC_MD) were found to have lower mutations and R/S ratio than class-switched subsets. However, in other properties such as junction length, proportion of IGHV4-34 and diversity, they were more similar to IgM-distal subclasses.”

4) The minimal sampling proportion required to perform the study is of crucial importance. In the material and method section, references should be incorporated, suggesting how the sampling size of 10 healthy individuals has been selected.

Because this is a descriptive study, no formal sample size calculation was performed and the sampling size of 10 was based on the availability of samples. For clarity, we have added a sentence in the material and methods section.

Although additional samples might have been beneficial in increasing the sensitivity for certain analysis, the current approach with full-length isotype-resolved sequencing of 9 bulk and 4 x 10 B cell subpopulations (total n = 49 samples) allowed clear separation of subsets with limited variability and demonstration of overlap between in silico and physically sorted B cell subpopulations. Where appropriate (e.g., in machine learning), we used subsampling and divided into a training set and a test set to limit variability in sequencing methods and make data analysis more robust.

5) At the end of the Discussion part, a concluding sentence should be incorporated, illustrating how these algorithmic tools, together with machine learning techniques, can advance our understanding of immunology and address unmet clinical needs related to infectious diseases, immune dysregulation.

According to the reviewer’s suggestion, we have added the following concluding sentence at the end of the discussion:

“The data and methods presented here provide a foundation for future work investigating the immune repertoire of patients with altered immune status and hold promise for the application of AIRR-seq along with machine learning techniques in clinical and diagnostic settings.”

Reviewer #1:

Before the publication, the following few points should be well considered :

1 – The authors use ten healthy individuals as readout; however, how V and J repertoire characteristics behave in people of different ages are missing. Secondly, the mean mutation parameter has been interestingly discussed in these studies with isotype-specific subclasses suggesting that, mutation most commonly is seen in downstream subclasses. As per my opinion, it would be interesting to analyze the correlation and proportion of mean mutation concerning the age of enrolled people. The authors can beautifully select people of different ages to analyze the results.

We agree with the reviewer that age is an important factor in shaping the IgH repertoire. We and others have previously looked at the effect of age on the IgH repertoire characteristics in detail (Ghraichy, 2020; IJspeert, 2016). These studies have shown that maturation of the B cell compartment with age is primarily observed as an increase in the number of mutations, with most changes occurring in the first decade of life. Changes continue to be made thereafter, but at a much lower rate. Using the data in the present manuscript, additional correlation analysis was conducted for all repertoire characteristics that have been shown to be age dependent in a previous analysis using the same sequencing and analysis protocol (Ghraichy, 2020). This analysis revealed a positive correlation between the number of mutations and age, and between R/S ratio in CDR and age but only in IgG2 transcripts. No age-dependent correlation was observed for any other subsets. However, because of the number of individuals included and their age range, we decided not to include this analysis as we thought it did not add any important value to previous findings. Because our proposed results are based on an intraindividual and cross-subgroup analysis, we believe that our results are valid in general, despite isolated age dependence.

2 – Few terminologies such as PCA, LDA, CDR3 are used at high frequency; I suggest elaborating their abbreviation in the introduction section for better clarity.

We have elaborated the abbreviations in the introduction as suggested.

3 – At the end of the Discussion part, a concluding sentence should be incorporated, illustrating how these algorithmic tools, together with machine learning techniques, can advance our understanding of immunology and address unmet clinical needs related to infectious diseases, immune dysregulation.

According to the reviewer’s suggestion, we have added the following concluding sentence at the end of the discussion:

“The data and methods presented here provide a foundation for future work investigating the immune repertoire of patients with altered immune status and hold promise for the application of AIRR-seq along with machine learning techniques in clinical and diagnostic settings.”

4 – The minimal sampling proportion required to perform the study is of crucial importance. In the material and method section, references should be incorporated, suggesting how the sampling size of 10 healthy individuals has been selected.

We thank the reviewer for this comment. Because this is a descriptive study, no formal sample size calculation was performed and the sampling size of 10 was based on the availability of samples. For clarity, we have added a sentence in the material and methods section.

Although additional samples might have been beneficial in increasing the sensitivity for certain analysis, the current approach with full-length isotype-resolved sequencing of 9 bulk and 4 x 10 B cell subpopulations (total n = 49 samples) allowed clear separation of subsets with limited variability and demonstration of overlap between in silico and physically sorted B cell subpopulations. Where appropriate (e.g., in machine learning), we used subsampling and divided into a training set and a test set to limit variability in sequencing methods and make data analysis more robust.

These studies comprehensively provided a model to check IgH repertoire within individuals, specifically in healthy individuals, possibly confined by individual-specific repertoire signals. However, it would be fascinating to incorporate diseased individuals in the study to elaborate on the system's feasibility. I would suggest adding few lines in the discussion part as a dimension of future work.

We agree with the reviewer that applying the methods presented on diseased individuals would be enlightening and have included a sentence at the end of the discussion suggesting this as a possible future direction.

Reviewer #2:

In this paper, the authors analyzed IgH repertoire of various B cell subsets from human donors such as naïve, marginal zone and switched B cells and plasma cells producing IgM or IgD and those producing IgA or IgG using bioinformatic analysis and machine learning. They also analyzed IgH repertoire in bulk and in silico separated B cell subsets. The authors comprehensively analyzed IgH repertoire and nicely showed the difference between different B cell subsets. However, the novelty of this study is somewhat unclear, and some of the conclusions are not well supported by the data.

In this paper, the authors analyzed IgH repertoire of various B cell subsets from human donors such as naïve, marginal zone and switched B cells and plasma cells producing IgM or IgD and those producing IgA or IgG using bioinformatic analysis and machine learning. They also analyzed IgH repertoire in bulk and in silico separated B cell subsets. The authors demonstrated that marginal zone B cells and plasma cells producing IgM or IgD show similar IgH repertoire whereas IgH repertoire in switched B cells is similar to that of plasma cells producing IgG or IgA but distinct from the other subsets. Number of somatic mutations, physiochemical nature of CDR3 and usage of VH and JH contribute to the distinction of IgH repertoire between different subsets.

The authors comprehensively analyzed IgH repertoire and nicely showed the difference between different B cell subsets. However, difference in IgH repertoire between different B cell subsets has been demonstrated by various previous studies. Therefore, what biological questions can be addressed by this analysis is somewhat unclear.

We have added the following paragraph to our discussion clearly stating our research question and novelty in contrast with previous work:

“Earlier studies have revealed differences in gene usage and repertoire properties between different sorted B cell subpopulations. In addition, some differences in repertoire characteristics across isotype subclasses have been determined by using bulk B cells as starting material. However, studies combining IgH sequencing from both bulk and sorted B cell subpopulations have been sparse. Here, we sought to compare IgH repertoire properties between transcripts sorted in silico from bulk B cells and physically sorted B cell subpopulations. In addition, using isotype-resolved sequencing, we aimed to perform a comparative analysis of the IgH repertoire of transcripts with the same constant region originating from different cell types and correlate the differences in repertoire with position on the IgH locus. The type of material obtained, the sequencing depth, and the experimental and computational methods used here constitute by far the most thorough and comprehensive analysis of repertoire differences between B cell subsets and isotype subclasses and thereby provide unique insights into the underlying mechanisms leading to formation of different B cell subpopulations.”

The difference of physiochemical properties of CDR3 between different subsets is interesting. Could the author discuss the impact of this difference in antigen binding?

We thank the reviewer for their comment. To our knowledge, except for charge and hydrophobicity that are associated with self-reactivity, there is limited knowledge how other CDR3 physiochemical characteristics affect antigen binding. However, we have added the following sentence in the discussion suggesting possible future directions to further investigate these changes.

“The importance of the CDR3 and its location at the center of the antigen-binding site make the observed differences in physiochemical properties and junction length interesting. These findings may suggest that different B cells subpopulations stimulated by diverse antigens have distinct junction compositions and properties. Further work to investigate CDR3-specific residue content and prediction of CDR3 protein structures could be enriching for antibody specificity and antigen recognition in different B cell subpopulations.”

The authors demonstrated that IgH of different Ig classes and IgG subclasses show distinct repertoire depending on the distance from Cmu. This finding is novel. However, IgH repertoire of marginal zone B cells and plasma cells producing IgM/IgD supports this conclusion only in some parameters such as number of somatic mutations but not other parameters such as junction length. Moreover, proportion of unique junctions is more influenced by plasma cell differentiation than Ig classes. Therefore, the conclusion of this part needs to be revised.

This is a good point and we have modified the title and conclusion of this section to specify that parameters are dependent on the distance from Cmu in class-switched subsets. We have also added the following sentence at the end of this section:

“Memory IgM subsets (BMZ and BPC_MD) were found to have lower mutations and R/S ratio than class-switched subsets. However, in other properties such as junction length, proportion of IGHV4-34 and diversity, they were more similar to IgM-distal subclasses.”

https://doi.org/10.7554/eLife.73111.sa2

Article and author information

Author details

  1. Marie Ghraichy

    Division of Immunology, University Children's Hospital and Children’s Research Center, University of Zurich (UZH), Zurich, Switzerland
    Contribution
    Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Visualization, Writing – original draft, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-7348-796X
  2. Valentin von Niederhäusern

    Division of Immunology, University Children's Hospital and Children’s Research Center, University of Zurich (UZH), Zurich, Switzerland
    Contribution
    Data curation, Formal analysis, Methodology, Resources, Writing – review and editing
    Competing interests
    No competing interests declared
  3. Aleksandr Kovaltsuk

    Department of Statistics, University of Oxford, Oxford, United Kingdom
    Contribution
    Conceptualization, Funding acquisition, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  4. Jacob D Galson

    1. Division of Immunology, University Children's Hospital and Children’s Research Center, University of Zurich (UZH), Zurich, Switzerland
    2. Alchemab Therapeutics Ltd, London, United Kingdom
    Contribution
    Methodology, Software, Writing – review and editing
    Competing interests
    employee of Alchemab Therapeutics Limited but there is no conflict of interest related to this work
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-4916-800X
  5. Charlotte M Deane

    Department of Statistics, University of Oxford, Oxford, United Kingdom
    Contribution
    Funding acquisition, Supervision, Writing – review and editing
    Competing interests
    Reviewing editor, eLife
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-1388-2252
  6. Johannes Trück

    Division of Immunology, University Children's Hospital and Children’s Research Center, University of Zurich (UZH), Zurich, Switzerland
    Contribution
    Conceptualization, Funding acquisition, Methodology, Project administration, Supervision, Writing – original draft, Writing – review and editing
    For correspondence
    Johannes.Trueck@kispi.uzh.ch
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-0418-7381

Funding

Swiss National Science Foundation (PZ00P3_161147)

  • Johannes Trück

Swiss National Science Foundation (PZ00P3_183777)

  • Johannes Trück

Gottfried und Julia Bangerter-Rhyner-Stiftung

  • Johannes Trück

Olga Mayenfisch Stiftung

  • Johannes Trück

Palatin-Stiftung

  • Johannes Trück

Biotechnology and Biological Sciences Research Council (BB/M011224/1)

  • Aleksandr Kovaltsuk

UCB Pharma Ltd

  • Aleksandr Kovaltsuk

Royal Commission for the Exhibition of 1851 Industrial Fellowship

  • Aleksandr Kovaltsuk

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This work was supported by Swiss National Science Foundation (Ambizione-SCORE. PZ00P3_161147 and PZ00P3_183777) (JT); Gottfried und Julia Bangerter-Rhyner-Stiftung (JT); Olga Mayenfisch Stiftung (JT); Palatin-Stiftung (JT); Investment fund of the University of Zurich (JT). Biotechnology and Biological Sciences Research Council (BBSRC) (AK); UCB Pharma Ltd (AK); Royal Comission for the Exhibtion of 1851 Industrial Fellowship (AK).

Ethics

Human subjects: Buffy coat samples were obtained from 10 anonymous healthy adults, hence no approval from the local ethics committee was necessary.

Senior Editor

  1. Betty Diamond, The Feinstein Institute for Medical Research, United States

Reviewing Editor

  1. Tomohiro Kurosaki, Osaka University, Japan

Reviewer

  1. Wanli Liu, Tsinghua University, China

Publication history

  1. Received: August 17, 2021
  2. Preprint posted: September 5, 2021 (view preprint)
  3. Accepted: October 17, 2021
  4. Accepted Manuscript published: October 18, 2021 (version 1)
  5. Version of Record published: November 1, 2021 (version 2)

Copyright

© 2021, Ghraichy et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 525
    Page views
  • 114
    Downloads
  • 0
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Genetics and Genomics
    2. Immunology and Inflammation
    Ilana Fox-Fisher et al.
    Research Article Updated

    Blood cell counts often fail to report on immune processes occurring in remote tissues. Here, we use immune cell type-specific methylation patterns in circulating cell-free DNA (cfDNA) for studying human immune cell dynamics. We characterized cfDNA released from specific immune cell types in healthy individuals (N = 242), cross sectionally and longitudinally. Immune cfDNA levels had no individual steady state as opposed to blood cell counts, suggesting that cfDNA concentration reflects adjustment of cell survival to maintain homeostatic cell numbers. We also observed selective elevation of immune-derived cfDNA upon perturbations of immune homeostasis. Following influenza vaccination (N = 92), B-cell-derived cfDNA levels increased prior to elevated B-cell counts and predicted efficacy of antibody production. Patients with eosinophilic esophagitis (N = 21) and B-cell lymphoma (N = 27) showed selective elevation of eosinophil and B-cell cfDNA, respectively, which were undetectable by cell counts in blood. Immune-derived cfDNA provides a novel biomarker for monitoring immune responses to physiological and pathological processes that are not accessible using conventional methods.

    1. Immunology and Inflammation
    Shahanshah Khan et al.
    Research Article

    The pathogenesis of COVID-19 is associated with a hyperinflammatory response; however, the precise mechanism of SARS-CoV-2-induced inflammation is poorly understood. Here we investigated direct inflammatory functions of major structural proteins of SARS-CoV-2. We observed that spike (S) protein potently induced inflammatory cytokines and chemokines including IL-6, IL-1b, TNFa, CXCL1, CXCL2, and CCL2, but not IFNs in human and mouse macrophages. No such inflammatory response was observed in response to membrane (M), envelope (E), and nucleocapsid (N) proteins. When stimulated with extracellular S protein, human and mouse lung epithelial cells also produced inflammatory cytokines and chemokines. Interestingly, epithelial cells expressing S protein intracellularly were non-inflammatory, but elicited an inflammatory response in macrophages when co-cultured. Biochemical studies revealed that S protein triggers inflammation via activation of the NF-kB pathway in a MyD88-dependent manner. Further, such an activation of the NF-kB pathway was abrogated in Tlr2-deficient macrophages. Consistently, administration of S protein induced IL-6, TNF-a, and IL-1b in wild-type, but not Tlr2-deficient mice. Notably, upon recognition of S protein, TLR2 dimerizes with TLR1 or TLR6 to activate the NF-kB pathway. Together these data reveal a mechanism for the cytokine storm during SARS-CoV-2 infection and suggest that TLR2 could be a potential therapeutic target for COVID-19.