Information transfer in mammalian glycan-based communication
Abstract
Glycan-binding proteins, so-called lectins, are exposed on mammalian cell surfaces and decipher the information encoded within glycans, translating it into biochemical signal transduction pathways in the cell. These glycan-lectin communication pathways are complex and difficult to analyze. However, quantitative data with single-cell resolution provide means to disentangle the associated signaling cascades. We chose C-type lectin receptors (CTLs) expressed on immune cells as a model system to study their capacity to transmit information encoded in glycans of incoming particles. In particular, we used nuclear factor kappa-B reporter cell lines expressing DC-specific ICAM-3-grabbing non-integrin (DC-SIGN), macrophage C-type lectin (MCL), dectin-1, dectin-2, and macrophage-inducible C-type lectin (MINCLE), as well as TNFαR and TLR1&2 in monocytic cell lines and compared their transmission of glycan-encoded information. All receptors transmit information with similar signaling capacity, except dectin-2. This lectin was identified to be less efficient in information transmission compared to the other CTLs, and even when the sensitivity of the dectin-2 pathway was enhanced by overexpression of its co-receptor FcRγ, its transmitted information was not. Next, we expanded our investigation toward the integration of multiple signal transduction pathways including synergistic lectins, which is crucial during pathogen recognition. We show how the signaling capacity of lectin receptors using a similar signal transduction pathway (dectin-1 and dectin-2) is integrated by compromising between the lectins. In contrast, co-expression of MCL synergistically enhanced the dectin-2 signaling capacity, particularly at low glycan stimulant concentration. By using dectin-2 and other lectins as examples, we demonstrate how the signaling capacity of dectin-2 is modulated in the presence of other lectins, and therefore, the findings provide insight into how immune cells translate glycan information using multivalent interactions.
Editor's evaluation
This manuscript lays out the framework for addressing an important challenge in our understanding of cellular signal transduction: how complex extracellular inputs can be detected and processed using multiple receptors. This problem is addressed in the context of lectins, the glycan receptors that mediate very common but still incompletely understood cell-cell interactions. Using information capacity analysis, the study addresses the importance of glycan input measurement by multiple receptors on immune cells, showing how signal detection can benefit from receptor crosstalk.
https://doi.org/10.7554/eLife.69415.sa0
Introduction
Glycans are present in all living cells and play a key role in many essential biological processes including development, differentiation, and immunity. Being surface exposed, glycans often encode information in cellular communication such as self/non-self discrimination, cellular identity, and homing as well as apoptosis markers (Bode et al., 2019; Maverakis et al., 2015; Williams, 2017). Unlike linear biopolymers, such as proteins and nucleic acids, glycans are branched structures, where subtle changes in the glycosidic bonds between each monomer can carry essential pieces of information. Adding to this complexity, glycans are products of large cellular machinery and are therefore not directly encoded by the genome (Cummings, 2009). Besides their composition, the recognition of glycans by their receptors is complicated, particularly due to the lack of specificity. Glycans are recognized by lectins, yet no glycan is recognized by only a single receptor, and no individual lectin is highly specific for only one glycan. Additionally, affinities are low, and interactions often depend on the multivalency of both the receptor and the ligand. Overall, since alterations of the glycocalyx do not function as a deterministic on/off switch but rather as a progressive tuning of the cellular response, glycan-lectin communication should be considered a stochastically behaving system rather than a deterministic one (Dennis, 2015).
Many lectin receptors serve as triggers for multiple immunological signaling pathways, often funneling down to NF-κB (nuclear factor kappa-B) as a transcription factor. In this work, we focus on C-type lectin receptors (CTLs). MINCLE (macrophage-inducible C-type lectin), for example, is a CTL involved in the recognition of pathogens as well as self-damage (Miyake et al., 2015; Williams, 2017). MINCLE and its close relative dectin-2 (dendritic cell-associated C-type lectin-2) signal via the FcRγ chain (Miyake et al., 2015; Ostrop et al., 2015; Sato et al., 2006), leading to CARD9-BCL10-Malt1 activation (Figure 1A). This in turn results in the activation of NF-κB, eventually triggering cytokine release. Importantly, these two receptors share the same signal transduction pathway, while having different functions (Thompson et al., 2021). Therefore, dectin-2 and MINCLE can be compared to determine whether these related proteins transmit glycan information differently at the receptor level. In contrast, dectin-1 and dectin-2 have different signal transduction pathways and are involved in the detection of β-glucans and mannan, respectively (Figure 1A).
Upon fungal infection, the combination of these and other cell surface receptors expressed by antigen-presenting cells then leads to a defined immune reaction via signal integration processes (Snarr et al., 2017). Such signal integration can result in synergism between the receptors, triggering an effect greater than their individual contributions (Ostrop and Lang, 2017). For example, MCL (macrophage C-type lectin), another CTL present on cells of the innate immune system, is known to work synergistically with dectin-2 (Ostrop et al., 2015; Zhu et al., 2013). In addition to this type of synergism, other members of the CTL family, e.g., DC-specific ICAM-3-grabbing non-integrin (DC-SIGN) and Langerin, modulate a response rather than initiating it by themselves (Geijtenbeek and Gringhuis, 2016; Osorio and Reis e Sousa, 2011). Therefore, it is important to quantitatively account for the resulting signaling to describe the complexity of how these cell surface receptors can modulate each other to translate glycan-encoded information into a biological response.
Accounting for the stochastic behavior of cellular signaling, information theory provides robust and quantitative tools to analyze complex communication channels. A fundamental metric of information theory is entropy, which quantifies the amount of disorder or uncertainty of a variable. In this respect, cellular signaling pathways with highly variable initiating input signals (e.g. stimulants) and a correspondingly highly variable output response (i.e. cellular signaling) can be characterized as having high entropy. Importantly, input and output can have mutual dependence, and therefore, knowing the input distribution can partly provide information about the output distribution. If noise is present in the communication channel, input and output have reduced mutual dependence. This mutual dependence between input and output is called mutual information. Mutual information is, therefore, a function of the input distribution, and its upper bound over all possible input distributions is called channel capacity (Appendix 2; Cover and Thomas, 2012).
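For reference, the quantities described above can be written compactly. The notation is the standard textbook convention and is assumed here rather than taken from Appendix 2: X denotes the input (stimulant dose), Y the output (reporter GFP level), and logarithms are taken to base 2 so that all quantities are in bits.

```latex
\begin{aligned}
H(Y) &= -\sum_{y} p(y)\,\log_2 p(y), \\
I(X;Y) &= H(Y) - H(Y \mid X)
        = \sum_{x,y} p(x)\,p(y \mid x)\,\log_2 \frac{p(y \mid x)}{p(y)},
        \qquad p(y) = \sum_{x} p(x)\,p(y \mid x), \\
C &= \max_{p(x)} I(X;Y).
\end{aligned}
```

Under this definition, a channel with C = 1 bit can reliably distinguish at most 2^1 = 2 input states, which is why a capacity near 1 bit is commonly read as an on/off decision.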
In this report, a communication channel describes the signal transduction pathway of a CTL, which ultimately leads to NF-κB translocation and finally GFP expression in the reporter model (Figure 1A). To quantify the signaling information of the communication channels, we used channel capacity. Importantly, the channel capacity does not merely describe the resulting maximum intensity of the reporter cells. It takes cellular variation and activation across the whole range of incoming stimulus doses in single-cell resolved data into account and condenses all of that data into a single number.
Herein, we studied dectin-2, dectin-1, MINCLE, DC-SIGN, MCL, TNFαR (TNF alpha receptor), and TLR1&2 in NF-κB reporter cells using single-cell resolved flow cytometry (Figure 1A, see also Appendix 2). To accurately quantify the information transmission in the receptors’ signaling pathways in response to exogenous glycans, we use the channel capacity as a metric (Figure 1B). By employing channel capacity measurements, we found that the dectin-2 channel has a relatively low signaling capacity, which in turn is synergistically increased in the presence of co-expressed MCL. Furthermore, when dectin-1 and dectin-2 were expressed by the same cell, the channel capacity for a shared glycan ligand was a compromise between the two channels, while the sensitivity (EC_{50}) to the ligand increased. Overall, our findings and approach provide a quantitative description of glycan-lectin communication and signal integration of CTLs and other receptors, which may lead to a better understanding of key phenomena such as pathogen recognition and autoimmunity.
Results
Quantifying signal transduction in glycan-based communication
We employed a single-cell resolved reporter system to monitor CTL activity by GFP expression under control of the transcription factor NF-κB in human monocytic U937 cells (Figure 1A). Dectin-2 was expressed in these reporter cells, and stimulation was conducted using various ligands (Figure 1C–E). FurFurMan, an extract of Malassezia furfur, as well as the polysaccharide mannan and invertase, both from Saccharomyces cerevisiae, initiated dectin-2 signaling. In contrast, owing to the lack of multivalency, mannose itself could not initiate signaling but was able to inhibit dectin-2 function (Figure 1C and D, and also Appendix 1—figure 1 A; Ishikawa et al., 2013). In parallel, invertase treated with α-mannosidase did not activate NF-κB signaling, indicating that dectin-2 activity is glycosylation dependent (Appendix 1—figure 1 B). The activation of the human dectin-2 receptor is in line with previous reports on its murine homolog, which is triggered by Manα1–2 Man moieties presented on scaffolds like proteins, glycans, or polystyrene beads (Ishikawa et al., 2013; Yonekawa et al., 2014; Zhou et al., 2018). Analogously, introduction of dectin-1 into the reporter cells enabled detection of NF-κB-based GFP expression after stimulation. However, while FurFurMan could also stimulate dectin-1 cells, this was not inhibited by the addition of mannose, which is expected for this β-glucan receptor (Figure 1D).
Next, we studied the dose-response behavior of dectin-2 reporter cells stimulated with FurFurMan over a wide range of input concentrations (Figure 1E). The cellular population revealed an overlap between the unstimulated and the maximally stimulated population, demonstrating the absence of a clear two-state behavior on a population level (Figure 1E, Appendix 1—figure 1 C). To ensure that the change in the reporter level is not affected by the protein expression rate, we confirmed that GFP expression required at least 16 hr of stimulation to reach its steady-state maximum, while short stimulation for 2–6 hr did not lead to the maximum level of GFP production (Appendix 1—figure 1 D). We also ruled out any influence of the selection process for the cellular clones by sorting dectin-2-expressing cells according to their GFP expression level. When restimulated, both populations again showed the same broad GFP expression, confirming that the wide range of the response is independent of genetic differences between individual cells (Appendix 1—figure 1 E). Taken together, observing noisy dectin-2 signaling on a single-cell level in relevant model cell lines reveals a broad population distribution when stimulated.
Dectin-2 transmits less information than other receptors
To investigate whether other receptors with similar signaling pathways follow the same principle, we analyzed the dose response of dectin-1, MINCLE, and the non-CTLs TNFαR and TLR1 and 2 (Bode et al., 2019; Holbrook et al., 2019; Ishikawa et al., 2009; Ozinsky et al., 2000). To quantify the underlying signal transmission in a cellular population, the channel capacity was used as a metric. Note that we chose the stimulation time, i.e. the period of incubation of the cells with the input ligands, as the time point at which the GFP response and the channel capacity reach their maximum, steady-state values (Appendix 1—figure 2 A and B). Therefore, the stimulation time was 13 hr for TNFα and 16 hr for the rest of the ligands. Previous work on TNFα signaling found the TNFα channel to have a channel capacity of about 1 bit, in particular 1.64 bits when a reporter cell system was used (Cheong et al., 2011). In addition, this channel capacity can be further increased if one can measure the temporal evolution of the output dynamics instead of a static output dataset (Selimkhanov et al., 2014). A channel capacity of about 1 bit suggests that a cellular population can use a receptor to distinguish between two states: on/off or presence/absence of a stimulant. For U937 cells, we found that the TNFαR channel has a channel capacity of 1.34 bits for the TNFα stimulant (Figure 2A and B), which was not influenced by the introduction of additional lectins (i.e. MINCLE, dectin-2, and DC-SIGN, see Appendix 1—figure 2 C). In the case of dectin-1-expressing U937 cells, the channel capacities were 1.20 and 1.09 bits for depleted zymosan (DZ) and FurFurMan input, respectively, while MINCLE and TLR1&2 had channel capacities of 0.98 and 0.99 bits, respectively. Since these receptors signal via NF-κB, these differences can be explained by receptor expression levels and downstream pathways. In contrast, dectin-2 stimulation resulted in a channel capacity of 0.70 bits using FurFurMan as a ligand. Stimulation using heat-inactivated invertase or mannan gave 0.80 and 0.49 bits, respectively (Figure 2B). In THP-1 cells, a similar trend of lower GFP expression upon stimulation was also observed, further supporting the notion that dectin-2 has a lower signal transmission capacity compared to other receptors such as TNFαR (Appendix 1—figure 2 D–F).
The most striking difference was found between MINCLE and dectin-2, as both lectins use the same signaling pathway via FcRγ (Ishikawa et al., 2013), suggesting that the substantial differences between the channel capacities rely on very early ligand recognition events. We hypothesized that overexpression of the signaling protein FcRγ might increase the information transmitted via dectin-2. The overexpression of FcRγ resulted in an at least twofold increase of NF-κB-controlled GFP expression (Figure 2C). Overexpression of both dectin-2 and FcRγ yielded a high basal NF-κB activation of the cells, while the sensitivity for its ligand (EC_{50}) increased about 50-fold. While the maximal GFP signal of dectin-2 (MFI, mean fluorescence intensity) was increased upon FcRγ overexpression, the channel capacity decreased at the same time (0.41 bits; Figure 2 C–E). Since competition with mannose reduced this effect, we speculate that the decreased channel capacity might originate from self-recognition by dectin-2 of ligands present either on the same cell or on cells in close proximity under the culture conditions (Figure 2E). From this, we concluded that the channel capacity of a glycan-based communication channel is not necessarily coupled to its sensitivity. Also, the ability of a communication channel to transmit information is not well described by its maximal signal alone (i.e. MFI), but rather by the channel capacity. Next, we quantified the number of receptors and excluded that the difference between the MINCLE and dectin-2 channel capacities is due to differences in receptor expression levels (Appendix 1—figure 2 G). Taken together, dectin-2 has a relatively low channel capacity, and while its sensitivity (EC_{50}) can be modulated with FcRγ, the transmitted information does not increase. Additionally, the number of receptors has little influence on the channel capacity or amplitude.
Signal integration compromises between dectin-1 and dectin-2 receptors when both are engaged
To expand our insight from isolated cell surface receptors to the interplay between multiple lectins, we prepared reporter cells expressing dectin-2 and dectin-1 simultaneously. FurFurMan served as a stimulant since it interacts with both dectin-1 and dectin-2. First of all, we found that the level of receptor expression did not change upon expression of an additional lectin (Figure 3A). Dectin-1-expressing cells gave a higher maximal signal (i.e. maximal MFI) and channel capacity than dectin-2-expressing cells; however, the latter channel showed higher sensitivity (EC_{50}) to FurFurMan. We found that the double positive cells did compromise between the two receptors, displaying EC_{50} and channel capacity values intermediate between those of dectin-1 and dectin-2 (Figure 3B and C). Additionally, mannose could be used to interfere with dectin-2 signaling, so that U937 cells expressing both dectin-1 and dectin-2 showed the same dose-response curve as dectin-1-expressing cells (Figure 3D). When DZ, a dectin-1-specific stimulant, was used, dectin-2 expression did not significantly influence the response of the double positive cells. Hence, dectin2 specific signaling was not influenced by dectin1 expression (Appendix 1—figure 3 A–C). Moreover, inhibition of the dectin-2 signaling initiated by FurFurMan through the addition of 25 mM mannose resulted in a response that was not a compromise. Taken together, we see that the simultaneous stimulation of dectin-1 and dectin-2 resulted in a compromise between their channels, which demonstrates how these two channels integrate the glycan signal into a response.
Macrophage C-type lectin (MCL) increases the channel capacity of dectin-2
To further expand our insights into signal transmission through multiple lectins, we wondered whether co-expression of other lectins would synergistically increase the channel capacity of dectin-2 signaling. For this, we included DC-SIGN and MCL (Figure 4A). Although DC-SIGN does not elicit NF-κB signaling by itself in U937 cells, it is known to recognize high-mannose structures present on invertase (Gringhuis et al., 2009). As expected, U937 dectin-2 DC-SIGN cells showed significantly increased ligand binding (Appendix 1—figure 4 A and B). We then speculated that this would either (a) promote ligand recognition by pre-concentration of the stimulants on the cell surface or (b) sequester the input signal from dectin-2, reducing the cellular response. In fact, DC-SIGN-mediated ligand binding did not alter the dectin-2 channel capacity for FurFurMan or invertase stimulation, nor did DC-SIGN expression itself modulate TLR4 signaling (Figure 4B, Appendix 1—figure 4 C). However, the sensitivity, as assessed by EC_{50}, increased for cells expressing dectin-2 and DC-SIGN (Figure 4A). The increased sensitivity due to DC-SIGN co-expression might increase the channel capacity if the allowed dose range spans the low-concentration region. Therefore, we calculated the channel capacity while increasing the maximum input concentration. However, this was not the case (Figure 4C). Contrary to DC-SIGN, overexpression of MCL significantly increased the channel capacity of dectin-2-expressing cells, particularly when limiting our dataset to low maximum invertase concentrations (Figure 4B and D, Appendix 1—figure 4 D and E). This indicates that MCL enhances the fidelity of invertase information transmission through the dectin-2 channel, providing a quantitative measurement of the synergistic effect of MCL (Ostrop et al., 2015; Zhu et al., 2013).
We then wondered whether the difference in channel capacity between dectin-2 and TNFαR could simply be a result of affinity. Since TNFαR has a nanomolar affinity for its ligand (Grell et al., 1998), we applied an anti-dectin-2 antibody to stimulate dectin-2 cells. Even under these conditions, we did not observe an increase in channel capacity (Appendix 1—figure 4 F). Taken together, we found that MCL, but not DC-SIGN, significantly increases the dectin-2 channel capacity, while both MCL and DC-SIGN enhance cellular binding of the stimulants and the resulting cellular sensitivity to invertase.
Dectin-2 channel has a low signal-to-noise ratio
The relatively low channel capacity of dectin-2 could be a result of its limited maximum GFP expression, even at high stimulant concentrations, compared to the other channels (Figure 2B). To examine this, we define the signal power as the variation of the mean GFP expression across the individual stimulant doses (Figure 5A). In addition, the level of background noise (i.e. noise power) of the channel can be defined as the average of the variance of GFP expression at a given stimulant dose. These definitions allow us to decompose signal and noise power (Appendix 3) and analyze them separately to infer how these two parameters shape the channel capacity.
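The two definitions above can be illustrated with a minimal sketch. It follows only the verbal description in the text: signal power is taken as the variance of the dose-wise mean GFP values, and noise power as the average within-dose variance. Equal weighting of all doses and the use of population variances are assumptions; Appendix 3 may use a different weighting or normalization. The function name and the toy numbers are hypothetical.

```python
from statistics import fmean, pvariance


def signal_noise_power(gfp_by_dose):
    """Decompose single-cell GFP data into signal and noise power.

    gfp_by_dose: list of lists, one list of single-cell GFP values per dose.
    Assumption: every dose is weighted equally.
    """
    means = [fmean(cells) for cells in gfp_by_dose]
    variances = [pvariance(cells) for cells in gfp_by_dose]
    signal_power = pvariance(means)   # spread of the mean response across doses
    noise_power = fmean(variances)    # average cell-to-cell variance at a fixed dose
    return signal_power, noise_power, signal_power / noise_power


# Toy data (hypothetical): three doses, three cells each.
toy = [
    [10.0, 12.0, 14.0],
    [20.0, 22.0, 24.0],
    [30.0, 32.0, 34.0],
]
signal, noise, snr = signal_noise_power(toy)
print(f"signal power = {signal:.2f}, noise power = {noise:.2f}, SNR = {snr:.2f}")
```

With these definitions, a channel whose mean response barely changes with dose has a low signal power and therefore a low signal-to-noise ratio, even if its cell-to-cell variability is small.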
TNFαR, MINCLE, and dectin-1 have a similar level of noise power. Amongst the three receptors, TNFαR shows the highest signal power and consequently the highest signal-to-noise ratio (Figure 5B and C). All three channels have a signal-to-noise ratio higher than one. For dectin-2, both signal and noise power are low compared to the other receptors; however, the noise power exceeds the signal power, resulting in a significantly lower signal-to-noise ratio. Since the signal power is independent of the noise power, our data indicate that the lower variation of the mean GFP expression (i.e. signal power) of dectin-2 dictates the reduced channel capacity compared to the other receptors. A similar conclusion cannot be drawn for the noise power since it is inherently coupled to the signal power (see Appendix 3).
We further applied the decomposition method to dectin-2 signaling in the presence of either dectin-1 co-expression or FcRγ overexpression (Appendix 1—figure 5). Analogous to the compromised channel capacity when dectin-1 and dectin-2 were co-expressed (Figure 3C), the analysis revealed that both the signal and the noise power were compromised as well (Appendix 1—figure 5 A and B). In the case of FcRγ overexpression, dectin-2 signaling after invertase stimulation is characterized by increased noise power, resulting in a decreased signal-to-noise ratio (Appendix 1—figure 5 C–E). Therefore, despite the high GFP expression at high stimulant concentrations (Figure 2C), the overexpression of FcRγ, an additional signaling hub involved in the dectin-2/NF-κB cascade, did not increase the signal power but instead elevated the noise power, leading to reduced channel capacity. Taken together, the relatively low channel capacity of dectin-2 is directly related to its low signal power, and the overexpression of FcRγ further decreases the channel capacity by increasing the noise power.
Discussion
We set out to better understand how glycan-encoded information is read in cellular communication. We established a glycan-responsive in vitro model and exploited the channel capacity as a quantitative metric. For the receptors other than dectin-2, the channel capacities were around 1 bit or higher, similar to values that have been reported for other systems previously (Suderman et al., 2017). In particular, a channel capacity of 1.64±0.36 bits was reported for the TNFα receptor in a comparable reporter cell system (Cheong et al., 2011). Interestingly, the number of receptors expressed on the cell surface did not determine the channel capacity of a signaling channel (Appendix 1—figure 2 G). Our results exemplify that lectin signaling pathways, and especially the dectin-2 pathway, should not be viewed as a deterministic on/off switch, but rather as a difference in the probability of cells to be active at a certain dose. This is in line with previous reports strengthening a quantitative view of cellular signaling and taking the cellular microheterogeneity into account (Levchenko and Nemenman, 2014; Zhang et al., 2017). We found that the mannose-binding CTL dectin-2 transmits less information compared to other receptors of the same family (Figure 2B).
To understand how these insights could be extended from isolated lectins to the interplay between multiple receptors, as occurs for the CTLs on innate immune cells, we employed combinations of CTLs on our model cells. Dectin-2 and dectin-1 recognize different epitopes on FurFurMan, and we found that the effects were not additive, but a compromise between the two receptors, showing intermediate sensitivity (EC_{50}) and channel capacity between dectin-2 and dectin-1 (Figure 3A and B). This effect implies that at high concentrations of FurFurMan the dectin-2 channel is actively inhibiting dectin-1 signaling, resulting in a lower cellular NF-κB activation. It is well known that lectins are able to modulate the signals of other receptors (Geijtenbeek and Gringhuis, 2009; Gringhuis et al., 2009; Miyake et al., 2015). Yet this compromise is an exciting discovery since, to the best of our knowledge, previous studies have not quantified lectin signal integration. Hence, it is likely that during a fungal infection, multiple exposed epitopes of pathogens are recognized by the precise arsenal of immune receptors, and their underlying signaling pathways integrate the information contained within the epitopes. This in turn leads to a compromise of all activated receptors and results in a specifically tailored biochemical response of the given immune cell (Ostrop and Lang, 2017).
We found dectin-2 itself to have a relatively low channel capacity compared to the closely related MINCLE, which uses the same pathway with more signal power (Figure 2A and B). It is therefore likely that the receptor itself determines very early on the information flow into the cell. This could be a result of MINCLE being stimulated with crystalline insoluble ligands, which could result in larger signaling clusters at the cellular surface. Alternatively, dectin-2 signaling could be influenced by mannose structures that are present on the cellular surface, giving rise to background signaling and to selection for reduced signaling power in an in vitro setting of high cellular density. Additionally, since dectin-2 binds high-mannose structures of eukaryotic origin (McGreal et al., 2006), an overly sensitive reaction might lead to permanent self-recognition of, for example, human Man9 structures and hence to potential autoimmune reactions. This hypothesis is supported by the dectin-2-dependent high basal activity of FcRγ-overexpressing dectin-2 cells, which in turn is responsible for a lower channel capacity in dectin-2 FcRγ cells (Figure 2C–E). Hence, dectin-2 could have evolved to use the CARD9-BCL10-Malt1 pathway to NF-κB less effectively. Along the same lines, recent reports show that CTLs are in general becoming more important in autoimmunity; dectin-2 in particular is known to be responsible for the development of allergic reactions (Dambuza and Brown, 2015; Parsons et al., 2014).
We first thought a combination of multiple lectins might synergistically enhance the signaling capacity of dectin-2. But while DC-SIGN greatly enhanced ligand binding to the cells, which was reflected in the increased sensitivity (EC_{50}), it did not significantly increase the channel capacity (Figure 4A–C, Appendix 1—figure 4A and B). In contrast to DC-SIGN, MCL, which is closely related to dectin-2, has a significant synergistic effect on the dectin-2 channel capacity, particularly at low stimulant concentrations. This potentially makes double positive cells more discriminative at earlier time points of infection than cells expressing dectin-2 alone, substantiating the importance of signal integration for understanding a cellular innate immune response (Ostrop and Lang, 2017).
Finally, it is important to take into consideration that our conclusions came from model cell lines, which were used as a surrogate for the cell-type-specific lectin expression patterns of primary immune cells. Human monocytes and dectin-2-positive U937 cells have comparable receptor densities and respond similarly to stimulation with zymosan particles (Appendix 1—figure 6A and B). Importantly, since our channel capacity calculations are applicable regardless of the nature of signal and medium, one could use them to quantify cellular responses in similar assays in the future. Work is ongoing to address central questions of cellular communication based on glycan-lectin interactions.
Materials and methods
All reagents were bought from Sigma Aldrich, if not stated otherwise.
Reporter cell generation and reporter cell assay
U937 cells were transduced with an NF-κB-GFP Cignal lentivirus (Qiagen) according to the manufacturer’s instructions to generate NF-κB reporter cells. 0.5 mL of 2e5 cells were mixed with the lentivirus at an MOI (multiplicity of infection) of 15 and spin transduced for 1.5 hr at 33°C and 900 g. After 48 hr of rest, cells were selected with puromycin (Gibco) for three passages. Eight cultures, each derived from a single cell, were subsequently established and evaluated according to their GFP expression; only the monoclonal cells of clone #5 were used for all experiments in this paper.
Reporter cell assay
U937 reporter cells were used in their log phase, and 100 µL were plated in a 96-well plate with 3e4 cells per well. Cells were challenged in complete media (RPMI with 10% FBS (fetal bovine serum), 1% Glutamax, and 1% Pen/Strep, all from Gibco) with TNFα for 13 hr or with the other ligands for 16 hr, each at various concentrations. After incubation, cells were resuspended once in DPBS (Dulbecco's phosphate-buffered saline), and the fluorescence intensity of the expressed GFP was measured by flow cytometry (Attune NxT, Thermo Fisher).
Cell culturing and passage
U937 cells were kept between 1e5 and 1.5e6 cells/mL in complete media and passaged 2–4 times a week. 293F cells were adherently cultured in DMEM (Dulbecco's modified Eagle’s medium) with 10% FBS, 1% Glutamax, 1% Pen/Strep (Gibco), and split 2–3 times per week. All cells were tested for mycoplasma contamination using Minerva Biolabs VenorGeM Classic.
Antibody staining and quantitation
For the surface staining, cells were incubated with the respective antibodies and isotype controls for 30 min at 4°C in DPBS, then washed once in DPBS + 0.5% BSA and measured via flow cytometry. For perforated stains, cells were first fixed in 4% PFA (Carl Roth) at 4°C for 20 min, then perforated in perforation solution (DPBS + 0.5% BSA + 0.1% saponin) for 20 min at 4°C. The cells were then resuspended in perforation solution containing the respective antibodies, incubated for 20 min at 4°C, washed once, and measured via flow cytometry. To quantify the fluorescence intensities, we used the BD PE quantitation kit, which allowed us to calibrate the fluorescence intensity (FI) to the number of PE molecules present in a sample. A list of all antibodies used can be found in Supplementary file 1.
Generation of lectin overexpressing cells
cDNAs of MINCLE, dectin-2, MCL, FcRγ, dectin-1, and DC-SIGN were cloned into vector BICPGKZeoT2amAmetrine:EF1a as previously reported (Wamhoff et al., 2019). This bicistronic vector expresses mAmetrine under the PGK promoter. To combine multiple GOI (genes of interest), we also used the lentiviral vector EF1aHygro/Neo, a gift from Tobias Meyer (Addgene plasmid # 85134). Briefly, 293F cells were transfected with vectors coding for the lentivirus and GOI. Lentivirions were generated for 72 hr, and the supernatant was frozen to kill any remaining 293F cells. This supernatant was used to transduce the GOI into U937 cells via spin infection at 900 g and 33°C in the presence of 0.8 µg/mL polybrene (van de Weijer et al., 2014). After 48 hr of rest, the U937 cells were selected with appropriate antibiotics (Zeocin 200 µg/mL, G418 500 µg/mL, or Hygromycin B 200 µg/mL; Thermo Fisher, Carl Roth, Thermo Fisher, respectively). A list of the primers used can be found in Supplementary file 2.
Labeling of proteins
Invertase (5 mg in 1 mL) was heat inactivated for 40 min at 80°C and mixed with a 3× molar excess of Atto647NNHS dye (AttoTech) according to the manufacturer's protocol. The labeled protein was purified using a Sephadex G25 column, and aliquots were frozen at –80°C. Since we found the labeled invertase to contain fewer impurities, we used Atto647 labeled invertase for all experiments shown in this study. Human TNFα (Peprotech) was labeled with the same procedure, but without heat inactivation. The degree of labeling was around 1, as determined from a labeled protein concentration measurement on a NanoPhotometer NP80 (Implen).
Channel capacity calculation
Calculations of channel capacity were based on Cheong et al., 2011 and Suderman et al., 2017. See Appendix 2 on channel capacity calculation for details.
Data representation, software, and statistical analysis
Data are shown as mean ± SD. Statistical analysis of data was performed by unpaired two-tailed t-test, with significant differences defined as p<0.05. EC_{50} values were calculated in GraphPad Prism version 8.4.2 using a four-parameter dose vs. response function. When necessary, EC_{50} values were compared statistically using an extra sum-of-squares F test. Details of the statistical tests and EC_{50} determinations can be found in the SI raw data file. FlowJo v.10 was used for analysis and export of flow cytometry data.
Data availability
All data are available at Dryad. The Jupyter notebook including the channel capacity calculation and noise analysis is available at: https://github.com/imaginationdykim/2022.CC (copy archived at Fuchsberger, 2023).
Appendix 1
Appendix 2
Estimation of channel capacity between input doses and reporter GFP
1.1 Data structure
Cells can sense the environment and respond to it. In this work, we quantify the cellular capability to sense carbohydrate information using information theory. The input is the carbohydrate ligand concentration in the cell media, and the output is the GFP expression level of individual cells triggered by NFκB translocation (Appendix 2—figure 1). The inputs (i.e. ligand concentrations) are discrete values chosen to cover almost the full variability of the output distribution. We used 9 or 10 levels of carbohydrate ligand concentration, including the absence of ligand, and measured around 100,000 cells in total across all doses using flow cytometry. The measured GFP expression levels are integer values ranging from around 0 to 50,000.
1.2 Mutual information estimation
To estimate the mutual information between carbohydrate inputs and GFP expression driven by NFκB translocation, the input and output arrays described in Appendix 2—figure 1 are projected onto a two-dimensional probability space divided into grid elements, as shown in Appendix 2—figure 2A. The projection allows the joint probability distribution to be estimated from a finite number of data points. However, the finite sample size together with the arbitrary choice of binning interval leads to over- or underestimation of the joint probability distribution, so additional statistical analysis is required to calculate an unbiased channel capacity. The joint probability of an individual grid element is the number of data points in that element divided by the total number of data points. The joint probability distribution and the marginal input and output distributions are shown in Appendix 2—figure 2B. The input and output probability distributions are obtained by marginalizing the joint probability distribution over the output and input index, respectively, as follows:
$P_{x}\left(i\right)=\sum_{j}P_{xy}\left(i,j\right)$
and
$P_{y}\left(j\right)=\sum_{i}P_{xy}\left(i,j\right)$
where $i$ and $j$ are the input and output index, respectively, and $P_{xy}$ is the joint probability of input and output. The mutual information of the given input and output distribution is 0.53 bits, calculated from the definition of mutual information: $MI\left(input;output\right)=H\left(input\right)+H\left(output\right)-H\left(input,output\right)$. Note that $H\left(input\right)$, $H\left(output\right)$, and $H\left(input,output\right)$ are the input entropy, output entropy, and joint entropy, respectively, defined as follows:
$H\left(input\right)=-\sum_{i}P_{x}\left(i\right)\log_{2}P_{x}\left(i\right)$, $H\left(output\right)=-\sum_{j}P_{y}\left(j\right)\log_{2}P_{y}\left(j\right)$
and
$H\left(input,output\right)=-\sum_{i,j}P_{xy}\left(i,j\right)\log_{2}P_{xy}\left(i,j\right)$
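The grid projection and the entropy-based definition of mutual information described in this section can be sketched in Python as follows. This is a minimal illustration, not the authors' notebook: the function names (`joint_distribution`, `entropy_bits`, `mutual_information`) are hypothetical, and equal-width output bins spanning the observed range are an assumption of this sketch.

```python
import numpy as np

def joint_distribution(dose_index, gfp, n_output_bins):
    """Project (input, output) samples onto a grid and normalise to a joint probability."""
    dose_index = np.asarray(dose_index)
    gfp = np.asarray(gfp, dtype=float)
    n_inputs = int(dose_index.max()) + 1
    # Assumption: equal-width output bins spanning the observed GFP range.
    edges = np.linspace(gfp.min(), gfp.max(), n_output_bins + 1)
    out_index = np.clip(np.digitize(gfp, edges) - 1, 0, n_output_bins - 1)
    counts = np.zeros((n_inputs, n_output_bins))
    np.add.at(counts, (dose_index, out_index), 1)  # count data points per grid element
    return counts / counts.sum()

def entropy_bits(p):
    """Shannon entropy in bits; empty grid elements contribute nothing."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mutual_information(p_xy):
    """MI = H(input) + H(output) - H(input, output)."""
    p_x = p_xy.sum(axis=1)  # marginalise over the output index
    p_y = p_xy.sum(axis=0)  # marginalise over the input index
    return entropy_bits(p_x) + entropy_bits(p_y) - entropy_bits(p_xy)
```

For example, `mutual_information(joint_distribution(dose_index, gfp, 100))` would return the plug-in estimate for 100 output bins, where `dose_index` holds integer dose labels starting at 0.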
1.3 Channel capacity estimation
Suppose there is a joint probability distribution with four inputs and two outputs, as shown in the upper panel of Appendix 2—figure 2C. Since the marginal probability distributions are uniform for both input and output, the input entropy and output entropy are 2 bits and 1 bit, respectively. The joint entropy is 2.5 bits, so the mutual information, obtained by subtracting the joint entropy from the sum of the input and output entropies, is 0.5 bits.
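The arithmetic of this toy channel can be checked with a few lines of Python. The joint probability matrix below is reconstructed from the description in the text (inputs 0 and 3 map to a single output, inputs 1 and 2 are split equally over both outputs, and all four inputs are equally likely); the variable names are illustrative only.

```python
import numpy as np

# Rows are inputs 0..3, columns are outputs 0..1 (reconstructed from the text).
p_xy = np.array([
    [1/4, 0.0],   # input 0 always gives output 0
    [1/8, 1/8],   # input 1 is uninformative
    [1/8, 1/8],   # input 2 is uninformative
    [0.0, 1/4],   # input 3 always gives output 1
])

def entropy_bits(p):
    """Shannon entropy in bits, ignoring zero-probability entries."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

h_input = entropy_bits(p_xy.sum(axis=1))   # uniform over four inputs
h_output = entropy_bits(p_xy.sum(axis=0))  # uniform over two outputs
h_joint = entropy_bits(p_xy)
mi = h_input + h_output - h_joint

print(h_input, h_output, h_joint, mi)
```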
How much information can be reliably transmitted through this channel? Inputs 0 and 3 always give outputs 0 and 1, respectively. In contrast, inputs 1 and 2 give uncertain outputs, distributed equally over the whole output range. Therefore, by completely suppressing inputs 1 and 2, one can achieve the maximum information that this channel reliably transmits (i.e. the channel capacity). The array multiplication (i.e. elementwise product) between the weighting values $w\left(i\right)$, [2, 0, 0, 2], and the input distribution ${P}_{x}\left(i\right)$ given in the upper panel of Appendix 2—figure 2C, [1/4, 1/4, 1/4, 1/4], produces the following weighted input probability distribution that maximizes the mutual information:
$P'_{x}\left(i\right)=w\left(i\right)P_{x}\left(i\right)=\left[\frac{1}{2},0,0,\frac{1}{2}\right]$
The changed input probability distribution, $P'_{x}\left(i\right)$, must satisfy the law of total probability
$\sum_{i}^{input}w\left(i\right)P_{x}\left(i\right)=1$
under the constraint condition
$0\le w\left(i\right)P_{x}\left(i\right)$
The modified joint probability distribution yields 1 bit of channel capacity, as shown in the lower panel of Appendix 2—figure 2C.
Mathematically, the joint probability distribution adjusted by the weighting values can be expressed as follows:
$P'_{xy}\left(i,j\right)=w\left(i\right)P_{xy}\left(i,j\right)$
Therefore, the adjusted input and output marginal probability distributions are
$P'_{x}\left(i\right)=\sum_{j}w\left(i\right)P_{xy}\left(i,j\right)=w\left(i\right)P_{x}\left(i\right)$
and
$P'_{y}\left(j\right)=\sum_{i}w\left(i\right)P_{xy}\left(i,j\right)$
Altogether, the mutual information given the weighting values is defined as
$MI\left(input;output|w\left(i\right)\right)=\sum_{i,j}w\left(i\right)P_{xy}\left(i,j\right)\log_{2}\frac{w\left(i\right)P_{xy}\left(i,j\right)}{P'_{x}\left(i\right)P'_{y}\left(j\right)}$
Finding the input weighting values, $w\left(i\right)$, that maximize the mutual information subject to $\sum_{i}^{input}w\left(i\right){P}_{x}\left(i\right)=1$ and $0\le w\left(i\right){P}_{x}\left(i\right)$ is a nonlinear optimization problem. Since, at the optimum, the gradient of the mutual information is parallel to that of the constraints, the problem can be restated with the method of Lagrange multipliers as the Lagrangian $\mathcal{L}\left({w}_{i},\lambda ,\sigma \right)=MI\left(input;output|w\left(i\right)\right)-\lambda \left[\sum_{i}^{input}w\left(i\right){P}_{x}\left(i\right)-1\right]-\sigma \left[w\left(i\right){P}_{x}\left(i\right)\right]$, and the weightings can be found numerically (Kraft D. 1988. A Software Package for Sequential Quadratic Programming. Wiss. Berichtswesen d. DFVLR). We used sequential least squares programming provided by the SciPy Python library (scipy.optimize.minimize, SciPy 1.7.3) to find the optimizing input weighting values.
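The optimization described above can be sketched with `scipy.optimize.minimize` and `method="SLSQP"`. This is a minimal sketch under stated assumptions, not the authors' implementation: the helper names (`mutual_information`, `channel_capacity`) are hypothetical, the starting point is the unweighted distribution, and the demonstration uses the four-input, two-output toy channel described in this section rather than experimental data.

```python
import numpy as np
from scipy.optimize import minimize

def mutual_information(p_xy):
    """Mutual information in bits of a joint probability matrix (inputs x outputs)."""
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0  # empty grid elements contribute nothing
    ratio = p_xy[mask] / (p_x @ p_y)[mask]
    return float(np.sum(p_xy[mask] * np.log2(ratio)))

def channel_capacity(p_xy):
    """Maximise MI over input weights w(i) with sum_i w(i) P_x(i) = 1 and w(i) >= 0."""
    p_x = p_xy.sum(axis=1)

    def negative_mi(w):
        # Weighting each input row rescales the joint distribution: P'_xy = w(i) P_xy.
        return -mutual_information(w[:, None] * p_xy)

    constraints = [{"type": "eq", "fun": lambda w: np.dot(w, p_x) - 1.0}]
    bounds = [(0.0, None)] * len(p_x)
    w0 = np.ones(len(p_x))  # assumption: start from the unweighted distribution
    result = minimize(negative_mi, w0, method="SLSQP",
                      bounds=bounds, constraints=constraints)
    return -result.fun, result.x

# Toy channel from the text: inputs 0 and 3 are reliable, inputs 1 and 2 are not.
toy = np.array([[1/4, 0.0], [1/8, 1/8], [1/8, 1/8], [0.0, 1/4]])
capacity, weights = channel_capacity(toy)
print(capacity, weights)
```

For the toy channel the optimizer should suppress inputs 1 and 2, giving weights close to [2, 0, 0, 2] and a capacity close to 1 bit, in line with the worked example.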
The input weighting values that maximize the mutual information given in Appendix 2—figure 2B are shown in Appendix 2—figure 2D. $w\left(i\right)$ is $\left[\frac{5}{3},\frac{5}{3},\frac{5}{3},0,0,0,5,0\right]$; hence, $P'_{x}\left(i\right)$ becomes $\left[\frac{1}{6},\frac{1}{6},\frac{1}{6},0,0,0,\frac{1}{2},0\right]$. Note that other weighting values, for example, $\left[5,0,0,0,0,0,5,0\right]$ and $\left[0,\frac{10}{3},\frac{5}{3},0,0,0,5,0\right]$, are also solutions because inputs 0, 1, and 2 have the same output distribution. The optimized input distribution yields around 0.71 bits of mutual information between the input and output distributions.
1.4 Channel capacity calculation from modalized weighting values
Since we use optimization algorithms, we do not predefine the weighting values to find the maximum mutual information. On the other hand, estimating the mutual information under various Gaussian-shaped input weighting values can give intuition about physiologically relevant input distributions and cellular responses (Cheong et al., 2011).
Appendix 2—figure 3 shows the estimation of channel capacity between TNFα doses and the NFκB reporter under various unimodal and bimodal Gaussian input distributions. Note that superpositions of two different Gaussian distributions that form a unimodal distribution, that is, one with a single maximum peak, are excluded (Appendix 2—figure 3B). Appendix 2—figure 3C and D show the mutual information values calculated from the unimodal and bimodal input distributions, respectively. Since inputs in the range from 0 to 5 yield the same output response, varying the input distribution within that range does not affect the mutual information. Therefore, discrete increases of mutual information become apparent when the mutual information values are sorted in ascending order, particularly in the case of the bimodal input distributions (Appendix 2—figure 3D). Appendix 2—figure 3E and F show the probability space obtained from the unimodal and bimodal input distributions that maximize the mutual information, respectively. The maximum mutual information from the bimodal input distributions is around 10% higher than that from the unimodal distributions and less than 1% lower than that from the optimized input distribution described in the previous section.
1.5 Influence of the number of output binning on channel capacity
The projection of the input and output distributions onto probability space is described in Appendix 2—figure 2A and B. Since the input and output data points are finite, a relatively large number of output bins produces a discontinuous joint probability distribution across output indexes. Conversely, too few output bins cannot capture the original probability distribution and instead average out the local variation of the joint probability distribution across output indexes. Therefore, the number of output bins significantly influences the mutual information and channel capacity of the input and output distribution. Note that the number of input bins is the same as the number of input doses.
Appendix 2—figure 4 shows how the mutual information and channel capacity change with the number of output bins. Since the input bins are given by the input doses, the input entropy does not vary in the mutual information calculation. In contrast, increasing the number of bins increases both the output entropy and the joint entropy. As the number of output bins increases, the output entropy increases more than the joint entropy does (Appendix 2—figure 4D). Therefore, the mutual information and channel capacity increase with the number of output bins.
1.6 Influence of the number of samples on channel capacity
As described in the previous section, it is important to consider the ratio between the number of output bins and the number of samples when estimating the channel capacity. If the number of samples is small relative to the number of bins, the joint probability space becomes sparse and generates a one-to-one relationship between input and output, which in turn inflates the calculated channel capacity.
Appendix 2—figure 5 shows how the mutual information and channel capacity change with the total number of samples. Since the samples are drawn from a random distribution, the ground truth mutual information and channel capacity are 0. Nevertheless, the calculated mutual information and channel capacity increase as the total number of samples decreases (Appendix 2—figure 5A–C). Any mutual information or channel capacity value deviating from 0 is therefore a bias.
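The finite-sample bias described here can be reproduced with synthetic data. The sketch below draws inputs and outputs independently, so the true mutual information is 0, and reports the plug-in estimate for several sample sizes. The grid size (10 inputs, 100 output bins), the sample sizes, and the function name are assumptions chosen for illustration, and the sketch reports mutual information rather than the optimized channel capacity.

```python
import numpy as np

def plugin_mi(x_index, y_index, n_inputs, n_bins):
    """Plug-in mutual information (bits) from integer input and output bin indexes."""
    counts = np.bincount(x_index * n_bins + y_index, minlength=n_inputs * n_bins)
    p_xy = counts.reshape(n_inputs, n_bins) / counts.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log2(p_xy[mask] / (p_x @ p_y)[mask])))

rng = np.random.default_rng(0)
n_inputs, n_bins = 10, 100
bias = {}
for n_samples in (1_000, 10_000, 100_000):
    x = rng.integers(0, n_inputs, size=n_samples)
    y = rng.integers(0, n_bins, size=n_samples)  # independent of x: true MI is 0
    bias[n_samples] = plugin_mi(x, y, n_inputs, n_bins)

for n_samples, value in bias.items():
    print(n_samples, round(value, 4))
```

The printed values shrink as the sample size grows, which is the trend the section describes: every deviation from 0 is estimator bias.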
1.7 Bootstrapping method to estimate channel capacity at infinite sample size
As shown in the previous section, the sample size determines the degree of bias in the calculated channel capacity. Because the sample size is always finite, the calculated channel capacity is always biased. However, using linear regression on subsampled datasets, the channel capacity value at infinite sample size can be estimated as follows (Appendix 2—figure 6). The sample distribution given in Appendix 2—figure 6A is subsampled at the various subsampling percentages shown in Appendix 2—figure 6B. The subsampling uses random sampling with replacement for every draw; therefore, the original sample distribution shown in Appendix 2—figure 6A and the 100% subsampled distribution drawn from the original data differ in their data points (Appendix 2—figure 6B). These subsampled distributions tend to yield higher channel capacity at smaller subsample sizes (Appendix 2—figure 6C). By plotting the channel capacity against the inverse of the sample size, the channel capacity at infinite sample size can be estimated from the y-intercept (Appendix 2—figure 6D). This bootstrapping method alleviates the bias in the calculated channel capacity, but the estimated channel capacity of the random distribution at infinite sample size is still around 0.15 bits, that is, 0.15 bits higher than the ground truth channel capacity (i.e. 0 bits).
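The subsample-and-extrapolate idea can be sketched as follows. To keep the example short, the plug-in mutual information stands in for the optimized channel capacity, and the subsampling fractions, the number of repeats, and the function names are assumptions rather than the settings used in the study.

```python
import numpy as np

def plugin_mi(x_index, y_index, n_inputs, n_bins):
    """Plug-in mutual information (bits) from integer input and output bin indexes."""
    counts = np.bincount(x_index * n_bins + y_index, minlength=n_inputs * n_bins)
    p_xy = counts.reshape(n_inputs, n_bins) / counts.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log2(p_xy[mask] / (p_x @ p_y)[mask])))

def extrapolate_to_infinite_samples(x, y, n_inputs, n_bins,
                                    fractions=(0.2, 0.4, 0.6, 0.8, 1.0),
                                    n_repeats=5, seed=0):
    """Regress the estimate on 1/sample size; the intercept is the value at 1/N -> 0."""
    rng = np.random.default_rng(seed)
    inverse_sizes, estimates = [], []
    for fraction in fractions:
        size = int(fraction * len(x))
        for _ in range(n_repeats):
            idx = rng.integers(0, len(x), size=size)  # sampling with replacement
            inverse_sizes.append(1.0 / size)
            estimates.append(plugin_mi(x[idx], y[idx], n_inputs, n_bins))
    slope, intercept = np.polyfit(inverse_sizes, estimates, 1)
    return intercept, slope

# Synthetic check on independent data, where the true value is 0 bits.
rng = np.random.default_rng(1)
x = rng.integers(0, 10, size=50_000)
y = rng.integers(0, 20, size=50_000)
intercept, slope = extrapolate_to_infinite_samples(x, y, n_inputs=10, n_bins=20)
print(intercept, slope)
```

A positive slope reflects the larger bias at smaller subsample sizes, and the intercept is the extrapolated estimate. For a channel capacity estimate, the input-weight optimization would be run on each subsample in place of `plugin_mi`.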
1.8 Channel capacity bias map depending on the output binning and total number of samples
As shown in the previous sections, choosing an appropriate number of output bins and total number of samples is essential to calculate an unbiased channel capacity. Since the two factors jointly influence the bias of the channel capacity, the channel capacity needs to be estimated for various total sample numbers and binning numbers. Appendix 2—figure 7A and B show the bias maps as a function of the total number of samples and the number of bins, calculated from a random dataset (A) and an experimental dataset (B). Note that all channel capacities are estimated using the bootstrapping method to extrapolate the channel capacity value to infinite sample size.
In the case of the random dataset, most of the region spanning total sample sizes of 0–160,000 and output binning numbers of 0–1000 exhibits less than 0.01 bits of channel capacity. The white line in Appendix 2—figure 7A indicates the contour at 0.01 bits of estimated channel capacity. Therefore, combinations of output binning number and total number of samples above the white line have a bias of less than 0.01 bits of channel capacity, whereas combinations below the white line have a bias higher than 0.01 bits. In this work, the allowed bias is either 0.01 or 0.05 bits depending on the input and output layer (see section 123).
Appendix 2—figure 7B is the bias map calculated between TNFα doses and the reporter GFP of U937 cells. In this example, the total number of samples is 170,472, which is subsampled without replacement to the sample sizes shown on the y-axis of the heatmap (Appendix 2—figure 7). Overall, as shown by the widely spread orange and red colors, the calculated channel capacity fluctuates near 1.4 bits. The exceptions are cases where the number of output bins is less than 200 or where the combination of total number of samples and binning number lies below the white line shown in Appendix 2—figure 7A. In this work, the minimum sample number in the whole dataset is 63,816, which lies above the line in Appendix 2—figure 7A. Therefore, the expected maximum bias in channel capacity is less than 0.01 bits even with 1000 output bins.
We take the channel capacity to be the highest of the channel capacity values calculated for output binning numbers ranging from 10 to 1000. Appendix 2—figure 7C and D show the channel capacity as a function of the output binning number for the random distribution and the dose-GFP response data at total sample sizes of 25,570 and 51,141. At a total sample size of 25,570, the bias starts to increase at around 600 output bins. In contrast, at a total sample size of 51,141, there is no noticeable increase in bias within the given output binning range. The highest channel capacity for the 51,141 sample size is 1.41 bits at 985 output bins. At the original sample size (i.e. 170,472), the maximum channel capacity is 1.41 bits at 510 output bins. Therefore, in this example dataset, the estimated channel capacity is 1.41 bits.
In this work, if the input is discrete dose information, we estimate the channel capacity using the bootstrapping method described in Appendix 2—figure 6 with multiple output binning numbers ranging from 10 to 1000. The maximum channel capacity within the binning range is the final channel capacity value of the calculation. Appendix 2—figure 8 shows the results of the channel capacity calculation from experimental datasets. The y-intercepts of the individual lines are the unbiased channel capacities, and the maximum of those y-intercepts in each dataset is selected as the final channel capacity value.
Appendix 3
Decomposition of signaling channel
2.1 Definition of signal and noise power
The signal power of a signaling channel is the variance of the average output across the individual input responses. Therefore, signal power can be written as $\mathrm{E}\left[\left(\overline{R}_{i}-\mathrm{E}\left(\overline{R}_{i}\right)\right)^{2}\right]$, hence $\mathrm{E}\left(\overline{R}_{i}^{2}\right)-\mathrm{E}\left(\overline{R}_{i}\right)^{2}$, where $\mathrm{E}$ and $\overline{R}_{i}$ are the expectation operator and the average output at the ith input dose, respectively. Since $\overline{R}_{i}=\sum_{j}R_{j}P\left(R_{j}|S_{i}\right)$, where $R_{j}$ and $P\left(R_{j}|S_{i}\right)$ are the marginal output value at the jth index and the conditional probability of the output at the jth index given the ith input signal $S_{i}$, substituting $\overline{R}_{i}$ into $\mathrm{E}\left(\overline{R}_{i}^{2}\right)-\mathrm{E}\left(\overline{R}_{i}\right)^{2}$ gives the signal power as $\sigma_{r}^{2}=\sum_{i}P\left(S_{i}\right)\left[\sum_{j}R_{j}P\left(R_{j}|S_{i}\right)\right]^{2}-\left[\sum_{i,j}P\left(S_{i}\right)R_{j}P\left(R_{j}|S_{i}\right)\right]^{2}$, where $P\left(S_{i}\right)$ is the input probability at the ith index.
The noise power is defined as the average of the variance of the output distribution of the individual input responses. Therefore, noise power can be written as $\mathrm{E}\left[\overline{R_{i}^{2}}-\overline{R}_{i}^{2}\right]$, which expands to $\sigma_{n}^{2}=\sum_{i}P\left(S_{i}\right)\left[\sum_{j}R_{j}^{2}P\left(R_{j}|S_{i}\right)-\left[\sum_{j}R_{j}P\left(R_{j}|S_{i}\right)\right]^{2}\right]$.
Appendix 3—figure 1A, B describes how different input and output distributions contribute to signal and noise power. Increasing the variance of the output response to each individual input does not influence the signal power but only increases the noise power. Likewise, increasing the mean output while keeping the variance of the individual output response for each input constant increases the signal power without affecting the noise power (Appendix 3—figure 1B).
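The two definitions in this section amount to the between-input variance of the conditional means (signal power) and the input-averaged conditional variance (noise power). A minimal Python sketch follows; the function name and the small example distribution are hypothetical and chosen only so the result can be checked by hand.

```python
import numpy as np

def signal_and_noise_power(p_s, r_values, p_r_given_s):
    """Return (signal power, noise power).

    p_s:          input probabilities P(S_i), shape (n_inputs,)
    r_values:     output values R_j, shape (n_outputs,)
    p_r_given_s:  conditional probabilities P(R_j | S_i), shape (n_inputs, n_outputs)
    """
    p_s = np.asarray(p_s, dtype=float)
    r = np.asarray(r_values, dtype=float)
    cond = np.asarray(p_r_given_s, dtype=float)
    mean_r = cond @ r          # average output for each input
    mean_r2 = cond @ r**2      # average squared output for each input
    signal = np.sum(p_s * mean_r**2) - np.sum(p_s * mean_r) ** 2
    noise = np.sum(p_s * (mean_r2 - mean_r**2))
    return float(signal), float(noise)

# Two equally likely inputs and three output levels (illustrative numbers).
p_s = [0.5, 0.5]
r_values = [0.0, 1.0, 2.0]
p_r_given_s = [[0.5, 0.5, 0.0],
               [0.0, 0.5, 0.5]]
signal, noise = signal_and_noise_power(p_s, r_values, p_r_given_s)
print(signal, noise)
```

In this example both powers equal 0.25, and their sum equals the total variance of the output (0.5), as expected from the law of total variance.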
Data availability
We have uploaded the raw data of the study to Dryad at https://doi.org/10.5061/dryad.18931zd2g. Our scripts for data evaluation are also linked to GitHub and stated in the manuscript.

Dryad Digital Repository. Data from: Information transfer in mammalian glycanbased communication. https://doi.org/10.5061/dryad.18931zd2g
References

The repertoire of glycan determinants in the human glycome. Molecular BioSystems 5:1087–1104. https://doi.org/10.1039/b907931a

C-type lectins in immunity: Recent developments. Current Opinion in Immunology 32:21–27. https://doi.org/10.1016/j.coi.2014.12.002

Many light touches convey the message. Trends in Biochemical Sciences 40:673–686. https://doi.org/10.1016/j.tibs.2015.08.010

Software: Imaginationdykim/2022.CC, version swh:1:rev:9f9def67f6ce522bee1f3b0864bd7111214df18a. Software Heritage.

Signalling through C-type lectin receptors: Shaping immune responses. Nature Reviews. Immunology 9:465–479. https://doi.org/10.1038/nri2569

C-type lectin receptors in the control of T helper cell differentiation. Nature Reviews. Immunology 16:433–448. https://doi.org/10.1038/nri.2016.55

Direct recognition of the mycobacterial glycolipid, trehalose dimycolate, by C-type lectin Mincle. The Journal of Experimental Medicine 206:2879–2888. https://doi.org/10.1084/jem.20091750

Cellular noise and information transmission. Current Opinion in Biotechnology 28:156–164. https://doi.org/10.1016/j.copbio.2014.05.002

C-type lectin receptor MCL facilitates Mincle expression and signaling through complex formation. Journal of Immunology 194:5366–5374. https://doi.org/10.4049/jimmunol.1402429

Contact, collaboration, and conflict: Signal integration of Syk-coupled C-type lectin receptors. Journal of Immunology 198:1403–1414. https://doi.org/10.4049/jimmunol.1601665

Dectin-2 is a pattern recognition receptor for fungi that couples with the Fc receptor γ chain to induce innate immune responses. Journal of Biological Chemistry 281:38854–38866. https://doi.org/10.1074/jbc.M606542200

Immune recognition of fungal polysaccharides. Journal of Fungi 3:47. https://doi.org/10.3390/jof3030047

Dependence on Mincle and Dectin-2 varies with multiple Candida species during systemic infection. Frontiers in Microbiology 12:633229. https://doi.org/10.3389/fmicb.2021.633229

Sensing lipids with Mincle: Structure and function. Frontiers in Immunology 8:1662. https://doi.org/10.3389/fimmu.2017.01662
Decision letter

Andre Levchenko, Reviewing Editor; Yale University, United States

Aleksandra M Walczak, Senior Editor; CNRS, France
Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.
Decision letter after peer review:
Thank you for submitting your article "Information transfer in mammalian glycanbased communication" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Aleksandra Walczak as the Senior Editor. The reviewers have opted to remain anonymous.
The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.
Essential revisions:
1) Please, address the comments from Reviewer 1 regarding the physiological relevance of the cell model used in this study. In particular, it is imperative to relate the results to physiological levels of receptor expression and address the relevance of the signal processing in the cell type used in the experiments to cellular information transfer in vivo.
2) Both reviewers raise the key issue of how the information transfer was assessed vs previously published studies. Please, expand on the description of the information analysis, put it more explicitly into the context of the prior studies, consider supplying further details and justification for the specific capacity calculation algorithm used in the study.
3) It is important to expand on the discussion of the physiological relevance of having this complex receptor system, particularly in the context of different information transfer properties.
Reviewer #1 (Recommendations for the authors):
Overall, this is a fairly interesting piece of work, extending the framework of information theory to a new class of signaling networks that has not, at least to my knowledge, been considered before in this context. It is also quite interesting that there are significant differences in the information transmitted by dectin2 and other, very closely related receptors; this suggests that even signaling systems with very similar downstream pathways can have very different noise properties. Understanding exactly why this happens mechanistically, and why these networks may have evolved this way, are very interesting future directions suggested by this work.
All of that being said, I have a number of important suggestions for the authors that I feel would need to be addressed before the manuscript could be considered for eLife. These are:
1) I think the physiological relevance of the results is difficult to evaluate as the manuscript is currently written. The fundamental issue is that the U937 cells considered here are not evolved to sense the glycans considered as "input" signals by the authors. Since this is an artificial cell line, the cells in question have not even evolved to sense anything; they were developed from cancer cells and have likely adapted to growing in the lab. So, while these are derived from a (potentially?) relevant immune cell population, they are clearly not representative of the actual human cells that have evolved to sense and respond to potential fungal infections. As far as I can tell, these cells don't even express the relevant receptors, since the authors have to introduce those receptors exogenously.
The authors may argue that the cells in question are at least immunological in origin, and that cell line work of this type is common in the field due to the difficulty of engineering the appropriate reporters into, say, a mouse in order to do work in primary cells. That may be true, but it still does not alleviate a very critical set of concerns regarding these results. In particular, it is currently difficult to interpret the findings the authors have about dectin2 having a lower 'channel capacity' than other receptors like dectin1. As the authors (basically) state, there are two hypotheses regarding this particular result. It does not seem that the difference in channel capacity between dectin2 and dectin1 is due to differences in average expression levels or due to differences in the distribution of protein levels themselves. Indeed, compared to TNFα, there actually seems to be less noise in dectin2 binding vs. TNFα binding (more on that question below, see Figure 4A). So, the differences in channel capacity here are likely either due to fact that dectin2 is somehow just more "noisy" as a receptor in and of itself, or the fact that there are differences in the downstream signaling network in between the receptor and NFκB.
This second hypothesis seems far more likely, given that it is highly unclear that intrinsic differences in, say, the conformational changes that drive signaling processes downstream signaling could be that significantly different between dectin1 and dectin2. In other words, it seems extremely unlikely that an intrinsic, biophysical difference between the receptors could be sufficient to explain the difference in channel capacity. As such, it is likely that the real difference here is due to some difference in downstream interactions. The authors seem to imply that any such differences are currently unknown, since these two receptors seem to interact with the same downstream pathways. Regardless, it seems much more likely to me that differences in the signaling networks induced by these two receptors (or dectin2 and any of the receptors studied here) are driving differences in the observed information transfer.
It has been shown that the channel capacity depends in a critical way on the distribution of the number of downstream signaling molecules present (see the Suderman et al. (2018) Interface Focus 8 (6), 20180039). In other words, if you have a network that signals through a molecule with low abundance, that will tend to increase noise levels and decrease channel capacity. Or, if there is a molecule in the signaling pathway with a particularly broad distribution across the cell population, that can also lead to low channel capacities.
The problem in this case is that the cells in question have not evolved to use dectin2 to sense anything in the environment. So, a particular protein downstream of dectin2 may be expressed at a low level, or expressed with a broad distribution, simply because there is no need for these cells to sense their environment efficiently. As a result, it is very difficult to interpret the physiological relevance of the results presented here.
As currently written, the authors seem to make the argument that their findings have revealed a fundamental difference between dectin2 and the other signaling molecules considered here. I am not myself convinced of that fact, simply because the physiological relevance of exogenously expressing these receptors in an immortal cancer cell line is unclear. I would suggest the authors think about what their results truly mean for our understanding of the underlying biology. Currently, I think it is impossible to infer that the authors would see the same kind of results in primary cells that actually perform the job of sensing these molecules in the body. That being said, the results are still interesting: dectin1 and dectin2 are very closely related, and to have such different noise properties is really rather intriguing. It suggests that receptors can have very different dose-response distributions even if they are very similar and are thought to signal through (essentially) identical downstream systems. The fact that this can happen is surprising, I think, and sets up the two alternative hypotheses discussed above. But the authors need to acknowledge this and be more reasonable regarding the extent to which these results can be extrapolated into a more meaningful biological context.
2) One of the most interesting aspects of this work is the fact that the authors consider not just the downstream signaling response, but also directly measure the statistical properties of the first step in the signaling process: namely, receptor binding. I think there is a significant opportunity to understand a critical aspect of cell signaling here in a new and interesting way, and so I have some suggestions for analyses the authors could do that would, I think, really increase the impact of the work.
I would be really interested to see what the channel capacity would be for the information flow between ligand concentration and the level of bound receptor. In other words, the authors can calculate the channel capacity between the dose of the ligand that they give the cells, and the distribution of bound receptor that they observe on the cell surface.
Looking at the data (e.g. Figure 4A), my intuitive sense is that the channel capacities will be relatively low for that calculation: maybe 2 bits or something like that. Interestingly, that is much, much lower than the theoretically possible value we would expect for receptors expressed at around 1000 copies per cell (which is around 3.24 bits, see the Suderman Interface Focus paper mentioned above). This may simply be because the receptors are being expressed exogenously and thus have lower absolute numbers, and broader distributions across the population, than would be observed in a more physiological context. That being said, this is to my knowledge the first actual measurement of the amount of bound receptor on the surface of individual cells. This is thus a great opportunity to calculate the information flow between ligand concentration and bound receptor levels.
I should note here that it might be intuitive for the authors to see this as an "upper bound" on the channel capacity for their system; in other words, the channel capacity between the ligand concentration and the bound receptor concentration should set an upper bound for the channel capacity for the entire system. The data processing inequality from Shannon seems, at least on the face of it, to guarantee that. It is important to note, however, that the signaling networks in question do not form a Markov chain; the amount of reporter expression can easily average over the entire signaling history in a particular cell. So, whatever the authors find here, it is not technically an upper bound on the downstream information flow. That being said, it would still be very interesting to know!
It should be relatively easy for the authors to just make this calculation based on the data they already have. I do, however, have suggestions for a few experiments that would improve the rigor of this calculation and make the results even more interesting.
Firstly, it is unclear how representative the bound protein numbers are of the actual number of binding events we would have in the cell culture when the cells are exposed to the ligand. This is because the authors have to wash the cells (at least I assume they do that) and then put them through the FACS instrument to measure the fluorescence. It would be really cool if the authors could do a time course to measure how fast the ligand "falls off" of the receptor. As long as the measurements made by the authors occur before significant unbinding has happened, the channel capacity they calculate should be very representative of what actually happened in the culture.
The other experiment that would be really interesting would be to repeat the experiment with labeled ligands but using whatever primary cells are known to express dectin1, dectin2, etc. In other words, the fluorescence here is coming from the ligand, so there is no need to engineer any reporters into the cell. If the authors see similar distributions of bound receptor on the surface of primary cells, and similar channel capacities between ligand concentration input and the bound receptor as output, that would be very intriguing. In any case, doing this experiment would I think begin to address some of the concerns with physiological relevance raised above. It would also be extremely interesting as a contribution to the field.
3) As mentioned in my public comments, another issue in this work is the fact that the authors are relatively reticent about how their calculations are actually performed. While the authors present the relevant equations for calculating the Mutual Information and claim to follow the approach of Cheong et al., the methods are not sufficiently detailed to understand exactly what they have done here. This is particularly important because their approach is clearly not identical to that used by Cheong, since their algorithm for maximizing the MI across input distributions is different. Cheong et al. approach this problem by trying a limited set of possible input distributions and using the one that gives the highest MI, while the authors here use a built-in optimization algorithm in Python. It is not clear if this is the only point at which the authors deviate from the approach laid out by Cheong et al., or if they have modified other aspects of the calculation.
The authors need to explain how they approach the problem of correcting for finite size effects. Do they use the approach of Cheong et al., bootstrapping the data at different levels and then using a linear extrapolation of MI vs. 1/N (where N is the number of data points/cells) to extrapolate to the case of an infinite population size (1/N → 0)? Did they use the same approach to choosing the number of bins to calculate the probability distribution over outputs? Cheong et al. did this by visual inspection, if I recall correctly, looking for a "plateau" in bin numbers where the MI calculated from the data was nonzero but the MI calculated for randomized data was 0. Based on the statements made by the authors, and Figure S6, they seem to have performed both of these steps; otherwise it is completely unclear where the confidence intervals on their MI/channel capacity estimates come from, and how they chose the number of bins to use. It is important to note, though, that the randomized controls seem to be missing from Figure S6. Regardless, since the authors have evidently reimplemented the code, they need to explain exactly what they have done.
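The finite-size correction described above can be sketched as follows. This is a minimal illustration rather than the authors' or Cheong et al.'s actual code: the function names, the subsampling fractions, the number of repeats and the fixed bin count are all assumptions chosen for the example, and the plug-in estimator is evaluated for the empirical dose distribution only (no maximization over inputs).

```python
import numpy as np

def mutual_information_bits(x, y, n_bins):
    """Plug-in MI (bits) between discrete dose labels x and binned output y."""
    x = np.asarray(x).ravel()
    y = np.asarray(y, dtype=float).ravel()
    edges = np.histogram_bin_edges(y, bins=n_bins)
    # Interior edges map every sample to a bin index in 0 .. n_bins - 1.
    y_idx = np.clip(np.digitize(y, edges[1:-1]), 0, n_bins - 1)
    labels, x_idx = np.unique(x, return_inverse=True)
    joint = np.zeros((labels.size, n_bins))
    np.add.at(joint, (x_idx.ravel(), y_idx), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz])))

def extrapolate_mi(x, y, n_bins=10, fractions=(0.6, 0.7, 0.8, 0.9, 1.0),
                   n_rep=20, seed=0):
    """Fit MI against 1/N over subsamples; the intercept estimates MI at 1/N -> 0."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    y = np.asarray(y, dtype=float)
    n = x.size
    inv_n, mi = [], []
    for frac in fractions:
        m = int(round(frac * n))
        for _ in range(n_rep):
            idx = rng.choice(n, size=m, replace=False)  # subsample without replacement
            inv_n.append(1.0 / m)
            mi.append(mutual_information_bits(x[idx], y[idx], n_bins))
    slope, intercept = np.polyfit(inv_n, mi, 1)
    return intercept, slope
```

Running the same function on data in which the dose labels have been shuffled gives the randomized control: its extrapolated value should be close to zero for any bin count that is considered acceptable.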
I would suggest that the authors also adopt several improvements to the Cheong et al. approach that have been developed in subsequent works. In particular, Suderman et al. (2017, PNAS 114 (22), 5755-60) implemented a broader range of bootstrap samples across which to do the linear extrapolation to infinite population size, automated the choice of bin sizes, and developed a bootstrap approach to estimating confidence intervals that is much more realistic (and statistically appropriate) than simply using the confidence intervals from the linear extrapolation step. The authors can refer to the Suderman paper on how this was done; it should be relatively easy to implement these improvements, if they have not been done already.
I appreciate the authors' innovation of using an optimizer in Python to maximize the MI. I would expect this actually produces higher "channel capacities" than the previous approaches mentioned above. That being said, it is unclear how this optimization algorithm works. The authors may expect readers to go to the Python documentation, but there are problems there. For one, the algorithm implemented in Python might change, so the documentation may describe an approach that is different from the one the authors actually used to perform their calculations. How various packages are implemented in Python can change from time to time, and something so generic as "optimize" may be changed as better optimization algorithms become available in the future. So the authors should describe exactly what they did. Also, readers are, in general, not going to go to Python documentation to understand the methods of a paper like this, nor should they have to. The authors should describe what the algorithm is doing as currently implemented, in terms that readers can readily understand.
Finally, the authors should also perform the calculation using the exact same distributions used by Cheong et al. and Suderman et al. This will allow us to compare their results more directly to the results from those previous authors, as well as allow us to determine if the approach that they used results in similar channel capacities to the optimization algorithm used here. While I expect the optimization algorithm employed by the authors will yield larger numbers for the channel capacity (and thus better estimates, since the channel capacity is formally a supremum), it would be extremely useful if the authors checked this.
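As an example of a maximization step that can be described completely in a methods section, the sketch below implements the classical Blahut-Arimoto iteration for a discrete channel. It is not the optimizer the authors used (that remains unspecified here); the function name and the convention that rows of the matrix are P(output bin | dose) are assumptions made for illustration.

```python
import numpy as np

def blahut_arimoto(p_y_given_x, tol=1e-9, max_iter=10_000):
    """Capacity (bits) and optimal input distribution of a discrete channel.

    p_y_given_x[i, j] = P(output bin j | input dose i); each row sums to 1.
    """
    W = np.asarray(p_y_given_x, dtype=float)
    p = np.full(W.shape[0], 1.0 / W.shape[0])  # start from a uniform input
    lower = 0.0
    for _ in range(max_iter):
        q = p @ W  # output marginal under the current input distribution
        with np.errstate(divide="ignore", invalid="ignore"):
            log_ratio = np.where(W > 0, np.log2(W / q), 0.0)
        d = np.sum(W * log_ratio, axis=1)      # KL(W[i] || q) for every input i
        lower = np.log2(np.sum(p * 2.0 ** d))  # lower bound on the capacity
        upper = d.max()                        # upper bound on the capacity
        p = p * 2.0 ** d                       # multiplicative update
        p /= p.sum()
        if upper - lower < tol:
            break
    return float(lower), p
```

Passing the same conditional histograms to both this routine and a fixed set of candidate input distributions would show directly how much the choice of maximization procedure changes the reported capacity.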
4) Another issue related to the calculation of the MI itself is the fact that the authors use logarithmically spaced bins, rather than linear bins, to perform the calculation. The authors describe this approach as if the matter of how to construct the bins is simply a matter of personal preference or convenience for the calculation, and choose logarithmic bins because they give somewhat larger values for the channel capacity. For one, this is not surprising; as is often the case, the FACS data obtained by the authors has a log-normal-like character; in other words, the authors see normal-ish distributions on a log scale (e.g. Figure 2A). As such, it is natural that using logarithmic bins will result in higher channel capacities, because it will tend to equalize (as much as possible) the effective number of observations within the bins.
While this may seem reasonable at first, it is actually, in my view, hard to justify. The reason is that the choice of how to generate the bins is not arbitrary, but actually corresponds to a very strong statement regarding what the authors actually consider the relevant "output" of the system to be. Using logarithmic bins is obviously equivalent to first taking the log of the data, and then using linear bins on that log-transformed data (Figure S2A should make this fact abundantly clear). The authors are thus envisioning that the channel is not a channel whose input is ligand concentration and output is GFP level, but rather a channel where the input is the ligand concentration and the output is the log of the GFP level. This is not an arbitrary decision, but rather one that has real consequences for how we construe the calculation in a physiological context. If the output is the log of the GFP level, that means that the cell is sensitive not to linear changes in protein concentration, but rather fold changes.
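The equivalence stated above is easy to check numerically. The sketch below uses synthetic log-normal intensities as a stand-in for FACS data (the parameters and the bin count are arbitrary assumptions) and compares three histograms: logarithmic bins on the raw values, linear bins on the log-transformed values, and linear bins on the raw values.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for single-cell GFP intensities (not real data).
gfp = rng.lognormal(mean=6.0, sigma=1.0, size=5000)

n_bins = 12
# Use whole decades as outer limits so that every sample lies inside the range.
lo = np.floor(np.log10(gfp.min()))
hi = np.ceil(np.log10(gfp.max()))

# (a) logarithmically spaced bins applied to the raw intensities
log_edges = np.logspace(lo, hi, n_bins + 1)
counts_log_bins, _ = np.histogram(gfp, bins=log_edges)

# (b) linearly spaced bins applied to the log-transformed intensities
counts_lin_on_log, _ = np.histogram(np.log10(gfp), bins=np.linspace(lo, hi, n_bins + 1))

# (c) linearly spaced bins applied to the raw intensities
counts_linear, _ = np.histogram(gfp, bins=n_bins)

print("log bins on raw data   :", counts_log_bins)
print("linear bins on log data:", counts_lin_on_log)
print("linear bins on raw data:", counts_linear)
```

Histograms (a) and (b) contain the same counts, whereas (c) piles most cells into the lowest bin, which is why the two binning schemes amount to two different definitions of the channel output.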
For certain transcription factors, authors have argued that it is actually this kind of relative fold change that matters (see Lee et al. (2014) Mol Cell 53 (6) 867-79). I would argue, however, that the conclusions of that work are hardly so robust as to suggest that, for every input-output transcriptional regulatory system within cells, it is the fold change of the "output" protein, rather than its linear change, that matters to the cell. Certainly, if the output is an enzyme, a transporter, or a protein that performs a whole host of other functions, then it is natural to assume that the appropriate output is the protein concentration itself, not the fold change in protein concentration.
If the authors can argue that every single gene whose transcription is controlled by NFκB has a fold-change impact on its downstream function, then I could see a strong argument being made for logarithmic bins. As it stands, however, I think the most natural way to interpret the output of the channel is on a linear scale, and I would say the authors should focus on that, rather than a logarithmic scale. If the authors want to include their logarithmic calculations in this work, I would suggest moving them to the supplement, and making clear the fact that such a calculation construes the output as having a fold-change impact on whatever is downstream of the protein level measured by the authors.
5) In this work, the authors focus on the measurement of GFP levels at a certain time. As written, it is not clear if the authors are considering the steady-state response of the cells that they are treating, a peak response, or something else. As far as I can tell, the authors stimulate the cells for 16 or 13 hours, depending on the ligand in question. I am not sure how these numbers were chosen: did the authors look at the GFP expression dynamics and choose this number based on those measurements? From an information-theoretic standpoint, it is best if the results are based on steady-state protein levels, but the authors don't seem to explain their rationale for choosing the time points that they choose. If the GFP expression levels do not reach a steady state, it would be great if the authors had a particular reason for choosing the times they choose. I would also suggest that the authors consider looking to see if their channel capacity estimates are robust to the time they choose by looking at other time points and calculating the channel capacities to see if they get similar results. I am not sure what times to choose, but perhaps 24 hours or 36? It is hard to say without looking at the average dynamics over time, so providing that kind of data would be extremely helpful.
Also, the authors should acknowledge that a large body of literature has emerged that makes the claim that it is not the individual time points that matter, but rather the entire time series of the response that should be construed as the output. The authors can refer to the Zhang et al. paper in Cell Systems, the paper by Selimkhanov et al. (Science (2014) 346 (6215) 1370-3), papers by the Hoffman group (notably Tang et al. (2020) Nat Commun 12 (1) 1272), and a host of others. I am not suggesting that the authors adopt this worldview; there are actually serious mathematical and conceptual issues with construing the output of a communication channel to be a function like a time series. But, this idea is out there in the literature, and the authors should address this point and discuss how looking at GFP time series, rather than individual time point measurements, might yield different (and undoubtedly higher) channel capacity estimates.
5) As a final, rather important but also rather technical point, the authors in Figure 3C do statistical tests on their channel capacity estimates, comparing dectin1 and dectin2 cells to those that express dectin1 and dectin2 together. The issue here is that they use a t-test to estimate statistical significance, but it is really unclear where the dispersion (e.g. error bars or standard deviations) in the channel capacity estimates comes from. I expect, since the authors say "n = 9" in the legend, that they split their data into 9 groups (maybe 9 sets of cells whose results were collected in different FACS runs or something?), performed the channel capacity estimate on each independently, then used the means and standard deviations in those estimates to do the t-test. That is completely made up by me, however; the authors really need to explain where these error estimates (and their +/- X values for channel capacity estimates in the various tables) actually come from.
In any case, the t-test of course only works if the data in question is drawn from actual Gaussian distributions. This does not mean that the distributions "look Gaussian" or "seem okay;" in the application of a t-test, you make extremely strong assumptions that only real honest-to-goodness Gaussian distributions actually satisfy. So the data really needs to actually be drawn from a Gaussian distribution for the results of the test to be interpreted appropriately.
My first suggestion here is that the authors need to use a nonparametric statistical test. That is, unless there is a true, theoretical reason to expect that the data will be drawn from a Gaussian (I know of no such reason, but perhaps the authors have one). The Wilcoxon rank-sum test was made for problems like this one, so I would suggest the authors use that.
A larger problem here, however, is that taking n = 9 (however it was done in the end) is not going to give a great estimate of the uncertainty in the channel capacity estimate. I would instead suggest that the authors merge their data into one data set (unless there are some kind of batch effects that prevent them from doing that) and then use a bootstrap/resampling approach, as in the Suderman et al. PNAS paper, to estimate confidence intervals and generate distributions for statistical tests. These bootstraps can be done hundreds or thousands of times, allowing for even more nonparametric tests (like permutation tests) to be used to estimate statistical significance.
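The two resampling tools mentioned above can be sketched generically. This is an illustration under stated assumptions, not the procedure of Suderman et al.: `statistic` is a placeholder for whatever capacity estimator is applied to a resampled set of cells, cells are resampled from one pooled array (a real analysis would presumably resample within each dose), and the permutation test compares the means of two sets of independent estimates.

```python
import numpy as np

def bootstrap_ci(cells, statistic, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for statistic(cells)."""
    rng = np.random.default_rng(seed)
    cells = np.asarray(cells)
    n = cells.shape[0]
    # Resample cells with replacement and recompute the statistic each time.
    reps = np.array([statistic(cells[rng.integers(0, n, size=n)])
                     for _ in range(n_boot)])
    lo, hi = np.quantile(reps, [alpha / 2.0, 1.0 - alpha / 2.0])
    return float(lo), float(hi), reps

def permutation_p_value(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)  # random relabelling of the two groups
        if abs(perm[:a.size].mean() - perm[a.size:].mean()) >= observed:
            hits += 1
    # The +1 correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Neither function assumes Gaussian data; the permutation test does assume that the values within each group are exchangeable, independent estimates.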
This is honestly a fairly minor point, because nothing terribly important in the conclusions of the paper rests on the statistical tests here. But, if the authors want to argue that there are real differences between dectin1, dectin2 and dectin1+dectin2 cells, they should do these statistics in a more nonparametric way.
Reviewer #2 (Recommendations for the authors):
I think this paper would be greatly improved if the authors were more precise and detailed in the description of their capacity calculation. To address this, it's likely the authors were following the method from Cheong et al., but some of those details should be described here.
https://doi.org/10.7554/eLife.69415.sa1
Author response
Essential revisions:
1) Please, address the comments from Reviewer 1 regarding the physiological relevance of the cell model used in this study. In particular, it is imperative to relate the results to physiological levels of receptor expression and address the relevance of the signal processing in the cell type used in the experiments to cellular information transfer in vivo.
We thank the reviewer for raising this concern and have updated the revised manuscript accordingly. We agree that, in future work, such a correlation between model cell lines and primary cells will benefit most biophysical and synthetic biology studies focused on the interpretation of signal transmission. However, direct implementation of the NFκB reporter in primary cells is very challenging. Hence, we decided to validate the choice of the model cell line by a direct, side-by-side comparison of dectin1- and dectin2-positive U937 reporter cells with primary human monocytes, and have mentioned these data in the manuscript (line 382) and included them in the Supporting Information (Appendix 1 Figure 6). These experiments revealed that human monocytes overlap to a large extent with the dectin2-positive U937 cells in their expression level of the dectin2 receptor and in their efficiency of interaction with zymosan particles.
Additionally, at this early stage of our understanding of the complexity of glycan-mediated information transmission, we see several advantages of our approach: (i) U937 cells originate from human monocytes and express immunological receptors such as mincle, CD200, CD200R, Siglec1, and Siglec3, and are therefore physiologically relevant (Byrareddy et al. PLOS ONE 10, e0140689 (2015)). Model cells must contain all necessary downstream molecules connecting our receptors of interest to NFκB. For this reason, the model cell line we chose here will also be applicable to future studies on other glycan-binding proteins associated with the immune system. (ii) We used monoclonal U937 cells; thus, our results are independent of heterogeneity between cells.
On the other hand, we agree that the U937 cells may differ in their expression levels of these downstream components of the signaling pathways and in turn, influence the activation of NFκB, which may alter the channel capacity. Hence, in the revised manuscript, we do not compare the channel capacity between two different receptors that have different downstream pathways such as TNFa and dectin2.
2) Both reviewers raise the key issue of how the information transfer was assessed vs previously published studies. Please, expand on the description of the information analysis, put it more explicitly into the context of the prior studies, consider supplying further details and justification for the specific capacity calculation algorithm used in the study.
In the revised manuscript, we added a general introduction to the channel capacity calculation method used in this study and a comparison with prior studies, as described in Appendix 2 and 3.
3) It is important to expand on the discussion of the physiological relevance of having this complex receptor system, particularly in the context of different information transfer properties.
Thank you for raising this concern. We agree that the physiological relevance needs to be discussed further, and we expanded the Discussion to explain why we chose U937 cells and their physiological relevance as follows:
Line 382: “Finally, it is important to take into consideration that our conclusions came from model cell lines, which were used as a surrogate for cell-type-specific lectin expression patterns of primary immune cells. Human monocytes and dectin2 positive U937 cells have comparable receptor densities and respond similar to stimulation with zymosan particles (Appendix 1 Figure 6A and B).”
Reviewer #1 (Recommendations for the authors):
Overall, this is a fairly interesting piece of work, extending the framework of information theory to a new class of signaling networks that has not, at least to my knowledge, been considered before in this context. It is also quite interesting that there are significant differences in the information transmitted by dectin2 and other, very closely related receptors; this suggests that even signaling systems with very similar downstream pathways can have very different noise properties. Understanding exactly why this happens mechanistically, and why these networks may have evolved this way, are very interesting future directions suggested by this work.
All of that being said, I have a number of important suggestions for the authors that I feel would need to be addressed before the manuscript could be considered for eLife. These are:
1) I think the physiological relevance of the results is difficult to evaluate as the manuscript is currently written. The fundamental issue is that the U937 cells considered here are not evolved to sense the glycans considered as "input" signals by the authors. Since this is an artificial cell line, the cells in question have not even evolved to sense anything; they were developed from cancer cells and have likely adapted to growing in the lab. So, while these are derived from a (potentially?) relevant immune cell population, they are clearly not representative of the actual human cells that have evolved to sense and respond to potential fungal infections. As far as I can tell, these cells don't even express the relevant receptors, since the authors have to introduce those receptors exogenously.
The authors may argue that the cells in question are at least immunological in origin, and that cell line work of this type is common in the field due to the difficulty of engineering the appropriate reporters into, say, a mouse in order to do work in primary cells. That may be true, but it still does not alleviate a very critical set of concerns regarding these results. In particular, it is currently difficult to interpret the findings the authors have about dectin2 having a lower 'channel capacity' than other receptors like dectin1. As the authors (basically) state, there are two hypotheses regarding this particular result. It does not seem that the difference in channel capacity between dectin2 and dectin1 is due to differences in average expression levels or due to differences in the distribution of protein levels themselves. Indeed, compared to TNFα, there actually seems to be less noise in dectin2 binding vs. TNFα binding (more on that question below, see Figure 4A). So, the differences in channel capacity here are likely either due to the fact that dectin2 is somehow just more "noisy" as a receptor in and of itself, or the fact that there are differences in the downstream signaling network in between the receptor and NFκB.
This second hypothesis seems far more likely, given that it is highly unclear that intrinsic differences in, say, the conformational changes that drive downstream signaling processes could be that significantly different between dectin1 and dectin2. In other words, it seems extremely unlikely that an intrinsic, biophysical difference between the receptors could be sufficient to explain the difference in channel capacity. As such, it is likely that the real difference here is due to some difference in downstream interactions. The authors seem to imply that any such differences are currently unknown, since these two receptors seem to interact with the same downstream pathways. Regardless, it seems much more likely to me that differences in the signaling networks induced by these two receptors (or dectin2 and any of the receptors studied here) are driving differences in the observed information transfer.
It has been shown that the channel capacity depends in a critical way on the distribution of the number of downstream signaling molecules present (see the Suderman et al. (2018) Interface Focus 8 (6), 20180039). In other words, if you have a network that signals through a molecule with low abundance, that will tend to increase noise levels and decrease channel capacity. Or, if there is a molecule in the signaling pathway with a particularly broad distribution across the cell population, that can also lead to low channel capacities.
The problem in this case is that the cells in question have not evolved to use dectin2 to sense anything in the environment. So, a particular protein downstream of dectin2 may be expressed at a low level, or expressed with a broad distribution, simply because there is no need for these cells to sense their environment efficiently. As a result, it is very difficult to interpret the physiological relevance of the results presented here.
As currently written, the authors seem to make the argument that their findings have revealed a fundamental difference between dectin2 and the other signaling molecules considered here. I am not myself convinced of that fact, simply because the physiological relevance of exogenously expressing these receptors in an immortal cancer cell line is unclear. I would suggest the authors think about what their results truly mean for our understanding of the underlying biology. Currently, I think it is impossible to infer that the authors would see the same kind of results in primary cells that actually perform the job of sensing these molecules in the body. That being said, the results are still interesting: dectin1 and dectin2 are very closely related, and to have such different noise properties is really rather intriguing. It suggests that receptors can have very different dose-response distributions even if they are very similar and are thought to signal through (essentially) identical downstream systems. The fact that this can happen is surprising, I think, and sets up the two alternative hypotheses discussed above. But the authors need to acknowledge this and be more reasonable regarding the extent to which these results can be extrapolated into a more meaningful biological context.
We agree with Reviewer #1’s comment that differences in cell type can influence the amount of signaling molecules and thereby the channel capacity. Therefore, there is an arbitrariness in the calculated channel capacities. Indeed, we also generated model cell lines in THP1 cells and observed a different NFκB response, which was relatively weak compared to that of U937 cells. At this point we can only speculate why this is, and we assume that U937 cells have more downstream signaling molecules than the other cell lines. In the corrected manuscript, we do not compare the channel capacity between two different receptors that have different downstream pathways, such as TNFa and dectin2.
The expression levels of various downstream molecules in U937 cells might differ from those in primary macrophages or dendritic cells. But U937 cells have all the essential signaling molecules needed to translocate NFκB via our receptors of interest. Therefore, we could investigate how two different carbohydrate receptors compromise on the incoming carbohydrate information in the NFκB response; this would be hard to do in primary cells.
To address the concern, as was suggested, in the updated version of the manuscript we directly compared the receptor expression levels of primary CD14+ human monocytes and our synthetic model cell lines (Appendix 1 Figure 6). Overall, it is challenging to introduce the NFκB reporter into primary cells, and one advantage of our system is that the U937 reporter cell line was expanded from a single clone. Thus, our results are not affected by cell-to-cell variability, providing more defined mechanistic conclusions. Since we could not introduce the reporter into the primary cells, we correlated the response of dectin1- and dectin2-positive U937 cells with that of primary classical monocytes by comparing (i) binding of the glycan input and (ii) the number of receptors. As a result, we observed that dectin2-positive U937 cells and classical monocytes interact comparably with the labeled glycan input. We also found that the receptor density on the transduced cell line was one order of magnitude higher than on primary monocytes. We believe that such a discrepancy may depend on the primary cell type (unfortunately we could not run similar studies on multiple immune cell samples (1000€ each)) or on variability in the isolation protocol. Thus, we consider our choice of model system physiological and relevant for the behavior of classical human monocytes, at least based on the parameters we quantified in this direct side-by-side comparison.
We would also like to emphasize that our conclusions on the noisiness of the dectin2 receptor come from the direct comparison between the dectin2 and mincle receptors. Although dectin1 and dectin2 are receptors of the same lectin family, they have low structural homology and completely different mechanisms of downstream signal transmission, and they originate from different gene clusters (Saijo and Iwakura, 2011). One major difference, for instance, is the presence of the ITAM domain on dectin1 and its absence on dectin2. In addition, mincle cannot be phosphorylated on its own, similar to dectin2, and relies on the downstream interaction with the FcRγ receptor. Thus, we also agree with the reviewer that the different noise for signaling via these two receptors is an interesting observation which deserves further focus. This is ongoing research in the lab.
Currently, we assume that the noisiness of dectin2 might be hidden in the sequence of its intracellular domains. In follow-up work we plan to swap the intracellular sequences of dectin2 and mincle and check how the construction of such chimeric receptors affects signaling efficiency. To summarize, in our work we concluded on the dectin2 noisiness by comparing two structurally homologous receptors, expressed at the same density, in the same cell line (expanded from a single clone).
To further improve our manuscript, and acknowledging that our results are derived from a synthetic model system, we try to make the reader aware of the limited physiological implications of our results. Still, we highlight an advantage of using synthetic approaches to dissect the precise mechanisms of glycan-mediated signaling, as follows:
Line 382: “Finally, it is important to take into consideration that our conclusions came from model cell lines, which were used as a surrogate for cell-type-specific lectin expression patterns of primary immune cells. Human monocytes and dectin2 positive U937 cells have comparable receptor densities and respond similar to stimulation with zymosan particles (Appendix 1 Figure 6A and B).”
2) One of the most interesting aspects of this work is the fact that the authors consider not just the downstream signaling response, but also directly measure the statistical properties of the first step in the signaling process: namely, receptor binding. I think there is a significant opportunity to understand a critical aspect of cell signaling here in a new and interesting way, and so I have some suggestions for analyses the authors could do that would, I think, really increase the impact of the work.
I would be really interested to see what the channel capacity would be for the information flow between ligand concentration and the level of bound receptor. In other words, the authors can calculate the channel capacity between the dose of the ligand that they give the cells, and the distribution of bound receptor that they observe on the cell surface.
Looking at the data (e.g. Figure 4A), my intuitive sense is that the channel capacities will be relatively low for that calculation: maybe 2 bits or something like that. Interestingly, that is much, much lower than the theoretically possible value we would expect for receptors expressed at around 1000 copies per cell (which is around 3.24 bits, see the Suderman Interface Focus paper mentioned above). This may simply be because the receptors are being expressed exogenously and thus have lower absolute numbers, and broader distributions across the population, than would be observed in a more physiological context. That being said, this is to my knowledge the first actual measurement of the amount of bound receptor on the surface of individual cells. This is thus a great opportunity to calculate the information flow between ligand concentration and bound receptor levels.
I should note here that it might be intuitive for the authors to see this as an "upper bound" on the channel capacity for their system; in other words, the channel capacity between the ligand concentration and the bound receptor concentration should set an upper bound for the channel capacity for the entire system. The data processing inequality from Shannon seems, at least on the face of it, to guarantee that. It is important to note, however, that the signaling networks in question do not form a Markov chain; the amount of reporter expression can easily average over the entire signaling history in a particular cell. So, whatever the authors find here, it is not technically an upper bound on the downstream information flow. That being said, it would still be very interesting to know!
It should be relatively easy for the authors to just make this calculation based on the data they already have. I do, however, have suggestions for a few experiments that would improve the rigor of this calculation and make the results even more interesting.
We agree with Reviewer #1’s suggestion to analyze information transmission using a labelled input. We attempted this analysis (please see Author response image 1), but we could not fully explain the results, for the following reasons:
The labelled-input signal measured on a cell at a single time point cannot reflect the total amount of interaction between ligand and receptor. Several processes prevent us from directly connecting the detected stimulant to signal initiation in such a snapshot. By the time we record the data, several ligands will already have dissociated from the receptor after initiating signaling. Additionally, labelled stimulants are degraded in the lysosome after endocytosis. Therefore, the channel capacity between ligand concentration and bound ligand cannot, in this experiment, be the “upper bound” of the channel capacity between ligand concentration and GFP expression. Similarly, we see that the channel capacity between bound ligand and GFP expression is lower than the channel capacity between ligand concentration and GFP expression.
We would like to share our experimental results regarding the channel capacity calculation in the presence of labelled input as follows:
Before introducing our new results, we want to point to our previous results. As shown in Appendix 1 Figure 4A, WT cells exhibit nonspecific binding of labelled invertase, and there is no large difference between WT and dectin2 cells in the channel capacity calculated from ligand concentration to the level of bound ligand. Hence, it is difficult to separate the nonspecific binding of labelled invertase from the binding detected for dectin2-expressing cells. However, we detect downstream signaling and GFP expression only in the presence of dectin2.
Because labelled invertase binds nonspecifically to dectin2-positive cells, we had to ensure that the activation of NFκB via dectin2 is indeed specific to protein glycosylation (Appendix 1 Figure 1B). In the subsequent experiments, we removed dectin2 epitopes from the stimulants using α-mannosidase and, as a result, could not detect any activation of GFP expression, although the nonmannosylated protein was still nonspecifically recognized by the cells. These results confirm that the overall quantified response is glycan-specific and is not affected by nonspecific binding of invertase to U937 cells. The findings are included in Appendix 1 Figure 1B and the main text as follows:
However, for the suggested analysis we had to circumvent the nonspecific binding of invertase to the cells, so we replaced it with labelled zymosan as the stimulant. The outcomes of these experiments are the following:
The corresponding channel capacity values for each layer are given in the plots in Author response image 2. The data indicate that the channel capacity from bound ligand to GFP expression is almost always higher than that from ligand concentration to GFP expression. Therefore, the downstream pathway connecting bound ligand to GFP expression has a greater capability for transmitting information than the receptor itself. In the case of dectin2 stimulation via anti-dectin2 antibody (Author response image 2D), we suspect that the lower channel capacity from bound ligand to GFP expression (0.42 bits) compared with that from ligand concentration to GFP (0.59 bits) is due to degradation of the APC dye over the 16 h stimulation.
When comparing the channel capacity from ligand concentration to bound ligand with that from ligand concentration to GFP, dectin1 shows a lower channel capacity from ligand concentration to bound ligand than from ligand concentration to GFP. Please see more data points shown in Author response image 3.
Following Reviewer #1’s comment and general intuition, it is hard to understand how the channel capacity of a full communication channel can be higher than that of a part of it. We think that information is dissipated in the ligand-bound state through stimulant digestion or the off-rate. Several ligands can bind to dectin1 and initiate downstream signaling, but some may detach from the receptor prior to internalization while still having initiated signaling.
Note that the channel capacity calculation from bound ligand to GFP expression needs extra care regarding bias, since binning is also applied to the input variable. The bias therefore increases more rapidly as the number of bins is increased on both input and output (Author response image 4). On the other hand, an insufficient number of bins can underestimate the calculated channel capacity. To address this problem, we employed equal frequency binning, which partitions the dataset such that each bin contains the same number of data points. Compared with linear binning, equal frequency binning approaches the maximum channel capacity faster with increasing number of bins (see also Author response image 5).
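The difference between linear and equal frequency binning can be illustrated with a minimal sketch. The data here are synthetic log-normal values standing in for fluorescence readings, and the function names are illustrative assumptions, not the code used for the analysis.

```python
import numpy as np

def linear_bin_edges(values, n_bins):
    """Equal-width bin edges spanning the data range."""
    return np.linspace(values.min(), values.max(), n_bins + 1)

def equal_frequency_bin_edges(values, n_bins):
    """Bin edges at evenly spaced quantiles, so each bin holds about the same number of points."""
    return np.quantile(values, np.linspace(0.0, 1.0, n_bins + 1))

# Synthetic, skewed stand-in for a single-cell fluorescence readout (not experimental data).
rng = np.random.default_rng(0)
signal = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)

counts_linear, _ = np.histogram(signal, bins=linear_bin_edges(signal, 10))
counts_equal, _ = np.histogram(signal, bins=equal_frequency_bin_edges(signal, 10))

print("linear bins:         ", counts_linear)
print("equal frequency bins:", counts_equal)
```

With skewed data, linear bins put most points into the first few bins and leave the rest nearly empty, whereas the quantile-based edges spread the points evenly across bins.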
Taken together, even though we could calculate the information transmitted through the receptor-bound state, the measured labelled-input signal does not fully capture the total amount of receptor-ligand interaction, and we therefore could not quantitatively explain the results. Hence, we provide our experiments and analysis for the Reviewers only. We think that measuring the temporal dynamics of labelled ligand binding and unbinding with pH-stable fluorophores can provide the actual information on ligand-receptor interactions. This is ongoing research in the lab, and we again thank the reviewer for his/her input.
Firstly, it is unclear how representative the bound protein numbers are of the actual number of binding events we would have in the cell culture when the cells are exposed to the ligand. This is because the authors have to wash the cells (at least I assume they do that) and then put them through the FACS instrument to measure the fluorescence. It would be really cool of the authors could do a time course to measure how fast the ligand "falls off" of the receptor. As long as the measurements made by the authors occur before significant unbinding has happened, the channel capacity they calculate should be very representative of what actually happened in the culture.
We agree with the reviewer that unbinding of ligand could obscure the data analysis (see also above). In our assay, unbound ligands were always washed away before the flow cytometry experiments. We assume that, due to the multivalent nature of glycanlectin interactions, the desorption rate for the studied system is very low, which is why the washing step does not affect binding. Our assumption stems from working with these receptors in isolated biophysical assays (e.g. SPR) in previous work (Aretz et al. ACIE 2017). This effect will be even more pronounced if multiple copies of these receptors are present on a cellular surface. Thus, the measured channel capacity is likely to be representative of what is happening in cell culture. We would also like to note that the system presented here is not ideal for investigating the dynamics of lectinglycan interactions. In our follow-up studies, we plan to establish a kinetic microscopy assay to probe such interactions at the single-receptor level. We also plan to replace the endpoint reporter (NFκB-GFP) with real-time signalling sensors such as ERK-KTR.
The other experiment that would be really interesting would be to repeat the experiment with labeled ligands but using whatever primary cells are known to express dectin1, dectin2, etc. In other words, the fluorescence here is coming from the ligand, so there is no need to engineer any reporters into the cell. If the authors see similar distributions of bound receptor on the surface of primary cells, and similar channel capacities between ligand concentration input and the bound receptor as output, that would be very intriguing. In any case, doing this experiment would I think begin to address some of the concerns with physiological relevance raised above. It would also be extremely interesting as a contribution to the field.
Following Reviewer #1’s suggestion, we compared labelled zymosan binding and dectin2 expression levels between our transfected U937 cells and primary cultured human monocytes, as shown in Appendix 1—figure 6. We found that our model cells express dectin2 at a level one order of magnitude higher than human monocytes (Appendix 1—figure 6A) and that this resulted in a similar trend of labelled zymosan binding in both cell types with increasing zymosan concentration (Appendix 1—figure 6B). We include Appendix 1 Figure 6 and the following sentence in the main text:
Line 382: “Finally, it is important to take into consideration that our conclusions came from model cell lines, which were used as a surrogate for celltypespecific lectin expression patterns of primary immune cells. Human monocytes and dectin2 positive U937 cells have comparable receptor densities and respond similar to stimulation with zymosan particles (Appendix 1 Figure 6A and B).”
3) As mentioned in my public comments, another issue in this work is the fact that the authors are relatively reticent about how their calculations are actually performed. While the authors present the relevant equations for calculating the Mutual Information and claim to follow the approach of Cheong et al., the methods are not sufficiently detailed to understand exactly what they have done here. This is particularly important because their approach is clearly not identical to that used by Cheong, since their algorithm for maximizing the MI across input distributions is different. Cheong et al. approach this problem by trying a limited set of possible input distributions and using the one that gives the highest MI, while the authors here use a builtin optimization algorithm in python. It is not clear if this is the only point at which the authors deviate from the approach laid out by Cheong et al., or if they have modified other aspects of the calculation.
Following Reviewer #1’s comment, we extended the description of the channel capacity calculation procedure in Appendix 2 and 3. For the modulation of the input distribution to maximize the channel capacity, we compared the method of Cheong et al. with our approach, as shown in Appendix 2 Section 4.
Mutual information calculation under unimodal and bimodal input distributions. (A) Examples of unimodal input distributions. The parameter s is the standard deviation of the Gaussian function selected from 0.5, 1, 2, 4 and 8. There are 60 cases of input distributions. (B) Examples of bimodal input distribution containing the same s parameters of the unimodal distributions. The number of bimodal combinations of the distribution is 1496. Vertically sorted various unimodal (C) and bimodal (D) input marginal probability distribution by the mutual information yields of the distribution. The probability space for the maximum mutual information given from unimodal (E) and bimodal (F) input distributions.
As shown in Appendix 2—figure 3, we compared the channel capacity values calculated from predefined unimodal/bimodal distributions with that from our optimization approach using the built-in Python function. We found that our approach gives the same channel capacity (at least to the given significant figures), 1.01 bits, in both cases if we use the bimodal input distributions.
The authors need to explain how they approach the problem of correcting for finite size effects. Do they use the approach of Cheong et al., bootstrapping the data at different levels and then using a linear extrapolation of MI vs. 1/N (where N is the number of data points/cells) to extrapolate to the case of an infinite population size ($1/N \to 0$)?
In the original manuscript we did not consider the bias originating from the finite sample size because the number of samples is very high (~150,000). Please see the channel capacities evaluated from the experimental dataset (Appendix 2—figure 8).
Channel capacity estimation of experimental data using bootstrapping for various y-binning numbers. The y-intercept of each regression line is the estimated channel capacity for the given y-binning number. The number of subsamples at each inverse sample size is 30.
The maximum bias corrected by bootstrapping across the entire dataset is less than 0.03 bits. The slopes of the linear regression lines in the plot are inversely proportional to the sample size. Therefore, with our relatively large sample size, the correction does not significantly change the channel capacity. We acknowledge, however, that for smaller sample sizes bootstrapping is required and gives a more reliable result. Therefore, to keep the channel capacity calculation in this work general, we included the bootstrapping calculation in all channel capacity estimates. The detailed bootstrapping procedures are described in Appendix 2 Section 7.
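The extrapolation to infinite sample size discussed here can be sketched as follows. This is a simplified illustration on synthetic data, not the procedure of Appendix 2 Section 7: it fits a plug-in mutual information estimate against 1/N for a fixed input distribution and omits the maximization over input distributions. The function names, subsampling fractions, and dataset are assumptions.

```python
import numpy as np

def plugin_mi_bits(x_idx, y_idx, n_x, n_y):
    """Plug-in mutual information (bits) from integer-coded input and output bins."""
    joint = np.zeros((n_x, n_y))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()
    indep = joint.sum(axis=1, keepdims=True) * joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / indep[nz])))

def extrapolate_to_infinite_n(x_idx, y_idx, n_x, n_y,
                              fractions=(0.6, 0.7, 0.8, 0.9, 1.0),
                              n_repeats=30, seed=0):
    """Fit MI against 1/N over subsamples; the intercept estimates MI at 1/N -> 0."""
    rng = np.random.default_rng(seed)
    n_total = len(x_idx)
    inv_n, mi = [], []
    for frac in fractions:
        n_sub = int(round(frac * n_total))
        for _ in range(n_repeats):
            # Subsample without replacement, as described for the experimental dataset.
            pick = rng.choice(n_total, size=n_sub, replace=False)
            inv_n.append(1.0 / n_sub)
            mi.append(plugin_mi_bits(x_idx[pick], y_idx[pick], n_x, n_y))
    slope, intercept = np.polyfit(inv_n, mi, deg=1)
    return intercept, slope

# Synthetic check: input and output are independent, so the true MI is 0 bits.
rng = np.random.default_rng(1)
n_cells, n_doses, n_bins = 20_000, 4, 50
x_idx = rng.integers(0, n_doses, size=n_cells)
y_idx = rng.integers(0, n_bins, size=n_cells)

naive = plugin_mi_bits(x_idx, y_idx, n_doses, n_bins)
corrected, slope = extrapolate_to_infinite_n(x_idx, y_idx, n_doses, n_bins)
print(f"naive estimate: {naive:.4f} bits, extrapolated: {corrected:.4f} bits")
```

Because the finite-sample bias scales roughly as 1/N, the correction shrinks as the number of measured cells grows, which is consistent with the small corrections reported for datasets of around 100,000 cells or more.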
Did they use the same approach to choosing the number of bins to calculate the probability distribution over outputs? Cheong et al. did this by visual inspection, if I recall correctly, looking for a "plateau" in bin numbers where the MI calculated from the data was nonzero but the MI calculated for randomized data was 0. Based on the statements made by the authors, and Figure S6, they seem to have performed both of these steps; otherwise it is completely unclear where the confidence intervals on their MI/channel capacity estimates come from, and how they chose the number of bins to use. Although it is important to note that the randomized controls seem to be missing from Figure S6. Regardless, since the authors have evidently reimplemented the code, they need to explain exactly what they have done.
In the original manuscript, we found an appropriate number of bins by visual inspection, as Cheong et al. did. In the bias estimation using a randomized dataset, we could not find any noticeable bias over the entire binning range from 0 to 1000, but we did not include these data. In the revised manuscript, we employed a more rigorous binning method, as described below:
Appendix 2—figure 7 shows channel capacities calculated by bootstrapping for various numbers of bins and total subsample sizes. For the experimental dataset (TNFa stimulation of TNFαR), the total number of samples is the size of the dataset subsampled (without replacement) from the original dataset. The white line in the random-dataset panel marks the 0.01 bit contour. This indicates that the bias of the channel capacity calculation depends not only on the number of output bins but also on the number of measured samples. Note that the smallest experimental dataset contains around 99,000 samples. Therefore, according to the random dataset, we can expect the maximum bias in our channel capacity calculation to be less than 0.01 bits in the binning range from 10 to 1000. Indeed, as shown in Appendix 2—figure 7D, the channel capacity values for the subsampled (51151 sample size) experimental and random datasets are stable over the given binning range.
Accordingly, we take as the channel capacity the maximum value calculated over the output binning range from 10 to 1000. We included this procedure in Appendix 2—figure 7.
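The dependence of the estimation bias on the number of output bins can be illustrated with a shuffled-label control. This is a minimal sketch on synthetic data with assumed function names and sample sizes; it uses a plug-in mutual information estimate with linear bins rather than the full channel capacity calculation.

```python
import numpy as np

def binned_mi_bits(x_idx, y, n_bins):
    """Plug-in mutual information (bits) between input labels and a linearly binned output."""
    edges = np.linspace(y.min(), y.max(), n_bins + 1)
    y_idx = np.digitize(y, edges[1:-1])  # integers in 0 .. n_bins - 1
    joint = np.zeros((x_idx.max() + 1, n_bins))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()
    indep = joint.sum(axis=1, keepdims=True) * joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / indep[nz])))

# Synthetic dose-response: four input levels with overlapping Gaussian outputs (not experimental data).
rng = np.random.default_rng(2)
x_idx = np.repeat(np.arange(4), 2_000)
y = rng.normal(loc=x_idx.astype(float), scale=1.0)

# Shuffling the input labels destroys any real dependence, so whatever MI remains is bias.
x_shuffled = rng.permutation(x_idx)

bin_numbers = (10, 100, 1000)
mi_data = {b: binned_mi_bits(x_idx, y, b) for b in bin_numbers}
mi_random = {b: binned_mi_bits(x_shuffled, y, b) for b in bin_numbers}
for b in bin_numbers:
    print(f"{b:5d} bins: data {mi_data[b]:.3f} bits, shuffled {mi_random[b]:.3f} bits")
```

With only 8,000 synthetic points the shuffled control already shows a clear bias at 1000 bins; with more samples the same bias level is reached only at larger bin numbers.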
I would suggest that the authors also adopt several improvements to the Cheong et al. approach that have been developed in subsequent works. In particular, Suderman et al. (2017, PNAS 114 (22), 5755-5760) implemented a broader range of bootstrap samples across which to do the linear extrapolation to infinite population size, automated the choice of bin sizes, and developed a bootstrap approach to estimating confidence intervals that is much more realistic (and statistically appropriate) than simply using the confidence intervals from the linear extrapolation step. The authors can refer to the Suderman paper on how this was done; it should be relatively easy to implement these improvements, if they have not been done already.
Following Reviewer #1’s suggestions, we have implemented additional statistical methods to improve the channel capacity estimates, as follows. We implemented a bootstrapping method and used linear regression to extrapolate the channel capacity to infinite sample size (Appendix 2 Figure 6). The confidence intervals, being smaller than the last significant figure reported, are not shown in this work. Instead, we compared different outcomes using a nonparametric statistical test, as suggested by Reviewer #1 in comment #2.
For the selection of the number of bins, we take as the channel capacity the maximum value calculated over the output binning range from 10 to 1000, as described in Appendix 2—figure 7.
I appreciate the authors' innovation of using an optimizer in python to maximize the MI. I would expect this actually produces higher "channel capacities" than the previous approaches mentioned above. That being said, it is unclear how this optimization algorithm works. The authors may expect readers to go to the python documentation, but there are problems there. For one, the algorithm implemented in python might change, so the documentation may describe an approach that is different from the one the authors actually used to perform their calculations. How various packages are implemented in python can change from time to time, and something so generic as "optimize" may be changed as better optimization algorithms become available in the future. So the authors should describe exactly what they did. Also, readers are, in general, not going to go to python documentation to understand the methods of a paper like this, nor should they have to. The authors should describe what the algorithm is doing as currently implemented, in terms that readers can readily understand.
We agree with Reviewer #1’s comment that the explanation of the mutual information maximization procedure was lacking. The explanation should be more general and self-contained. Therefore, we have corrected Appendix 2 Section 3, which describes the mutual information maximization procedure, as follows:
“Finding input weighting values, $w\left(i\right)$, that maximize the mutual information subject to $\sum _{i}^{\text{input}}w\left(i\right){P}_{x}\left(i\right)=1$ and $0\le w\left(i\right){P}_{x}\left(i\right)$ is a nonlinear optimization problem. Since the direction of the gradient of the mutual information is the same as that of those two constraints at the optimum, using the Lagrange multiplier method one can restate the problem as the Lagrangian $\mathcal{L}\left({w}_{i},\lambda ,\sigma \right)=\text{MI}\left(\text{input};\text{output}\mid w\left(i\right)\right)-\lambda \left[\sum _{i}^{\text{input}}w\left(i\right){P}_{x}\left(i\right)-1\right]-\sigma \left[w\left(i\right){P}_{x}\left(i\right)\right],$ and find the weightings numerically (Kraft D. 1988. A Software Package for Sequential Quadratic Programming. Wiss. Berichtswesen d. DFVLR). We used Sequential Least Squares Programming (SLSQP) provided by the SciPy Python library (scipy.optimize.minimize, SciPy 1.7.3) to find the optimal input weighting values.”
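The constrained maximization described in this passage can be sketched with `scipy.optimize.minimize` and `method="SLSQP"`. This is a minimal illustration, not the code used in the manuscript: as a simplifying assumption it optimizes the input probabilities directly rather than weightings applied to a measured input distribution, and it uses a textbook Z-channel, whose capacity is known in closed form, instead of experimental data.

```python
import numpy as np
from scipy.optimize import minimize

def mutual_information_bits(p_x, p_y_given_x):
    """Mutual information (bits) for input probabilities p_x and a channel matrix P(y|x)."""
    p_x = np.asarray(p_x, dtype=float)
    joint = p_x[:, None] * p_y_given_x
    p_y = joint.sum(axis=0)
    indep = p_x[:, None] * p_y[None, :]
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / indep[nz])))

def channel_capacity_bits(p_y_given_x):
    """Maximize MI over input distributions with SLSQP (sum-to-one and non-negativity constraints)."""
    n_inputs = p_y_given_x.shape[0]
    start = np.full(n_inputs, 1.0 / n_inputs)  # begin from a uniform input distribution
    result = minimize(
        lambda p: -mutual_information_bits(p, p_y_given_x),  # minimize the negative MI
        start,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_inputs,
        constraints=[{"type": "eq", "fun": lambda p: np.sum(p) - 1.0}],
    )
    return -result.fun, result.x

# Z-channel: input 0 is transmitted without error, input 1 flips to output 0 with probability 0.5.
z_channel = np.array([[1.0, 0.0],
                      [0.5, 0.5]])
capacity, p_opt = channel_capacity_bits(z_channel)
print(f"capacity: {capacity:.4f} bits at input distribution {np.round(p_opt, 3)}")
```

For this channel the analytical capacity is log2(1.25) ≈ 0.322 bits, attained when input 1 is used with probability 0.4, which gives an independent check on the numerical optimum.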
Finally, the authors should also perform the calculation using the exact same distributions used by Cheong et al. and Suderman et al. This will allow us to compare their results more directly to the results from those previous authors, as well as allow us to determine if the approach that they used results in similar channel capacities to the optimization algorithm used here. While I expect the optimization algorithm employed by the authors will yield larger numbers for the channel capacity (and thus better estimates, since the channel capacity is formally a supremum), it would be extremely useful if the authors checked this.
According to Reviewer #1’s suggestion, we compared the channel capacities calculated from predefined bimodal and unimodal input distribution with our systematic approach as described in Appendix 2—figure 3 and Appendix 2 Section 4. We found that there is no noticeable channel capacity difference between those two methods in our significant figure range.
4) Another issue related to the calculation of the MI itself is the fact that the authors use logarithmically spaced bins, rather than linear bins, to perform the calculation. The authors describe this approach as if the matter of how to construct the bins is simply a matter of personal preference or convenience for the calculation, and choose logarithmic bins because they give somewhat larger values for the channel capacity. For one, this is not surprising; as is often the case, the FACS data obtained by the authors has a lognormallike character; in other words, the authors see normalish distributions on a log scale (e.g. Figure 2A). As such, it is natural that using logarithmic bins will result in higher channel capacities, because it will tend to equalize (as much as possible) the effective number of observations within the bins.
While this may seem reasonable at first, it is actually, in my view, hard to justify. The reason is that the choice of how to generate the bins is not arbitrary, but actually corresponds to a very strong statement regarding what the authors actually consider the relevant "output" of the system to be. Using logarithmic bins is obviously equivalent to first taking the log of the data, and then using linear bins on that logtransformed data (Figure S2A should make this fact abundantly clear). The authors are thus envisioning that the channel is not a channel whose input is ligand concentration and output is GFP level, but rather a channel where the input is the ligand concentration and the output is the log of the GFP level. This is not an arbitrary decision, but rather one that has real consequences for how we construe the calculation in a physiological context. If the output is the log of the GFP level, that means that the cell is sensitive not to linear changes in protein concentration, but rather foldchanges.
For certain transcription factors, authors have argued that it is actually this kind of relative, fold change that matters (see Lee et al. (2014) Mol Cell 53 (6) 867-879). I would argue, however, that the conclusions of that work are hardly so robust as to suggest that, for every input-output transcriptional regulatory system within cells, it is the fold change of the "output" protein, rather than its linear change, that matters to the cell. Certainly, if the output is an enzyme, a transporter, or a protein that performs a whole host of other functions, then it is natural to assume that the appropriate output is the protein concentration itself, not the fold change in protein concentration.
If the authors can argue that every single gene whose transcription is controlled by NFκB has a foldchange impact on its downstream function, then I could see a strong argument being made for logarithmic bins. As it stands, however, I think the most natural way to interpret the output of the channel is on a linear scale, and I would say the authors should focus on that, rather than a logarithmic scale. If the authors want to include their logarithmic calculations in this work, I would suggest moving them to the supplement, and making clear the fact that such a calculation construes the output as having a foldchange impact on whatever is downstream of the protein level measured by the authors.
We understand Reviewer #1’s comment. Please see public comment #3 and our answer, including Author response image 5. We found that, whether we choose logarithmic, linear or equal frequency binning, there is no significant difference in the calculated channel capacity values if the number of bins is more than 200. The binning method, however, should not be arbitrary but must be justified by the context. Therefore, in the revised manuscript, we used linear binning and corrected all the channel capacity values accordingly. Please note that the new calculation does not affect our conclusion.
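The reviewer's point that logarithmic bins are equivalent to linear bins on log-transformed data can be checked numerically. The sketch below uses synthetic log-normal values as a stand-in for GFP intensities; the bin range and variable names are assumptions for illustration.

```python
import numpy as np

# Synthetic log-normal stand-in for single-cell GFP intensities (not experimental data).
rng = np.random.default_rng(3)
gfp = rng.lognormal(mean=2.0, sigma=0.8, size=5_000)

n_bins = 20
# Linear edges on the log10 scale, chosen wide enough to contain every data point.
edges_on_log_scale = np.linspace(-2.0, 4.0, n_bins + 1)
# The same edges mapped back to the raw scale are logarithmically spaced.
log_spaced_edges = 10.0 ** edges_on_log_scale

counts_log_bins, _ = np.histogram(gfp, bins=log_spaced_edges)
counts_linear_on_log, _ = np.histogram(np.log10(gfp), bins=edges_on_log_scale)

print(np.array_equal(counts_log_bins, counts_linear_on_log))
```

Since both histograms assign every cell to the same bin, any quantity computed from the binned distribution, including mutual information, is identical in the two cases. Choosing logarithmic bins is therefore the same as declaring the log of the GFP level to be the channel output.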
5) In this work, the authors focus on the measurement of GFP levels at a certain time. As written, it is not clear if the authors are considering the steady-state response of the cells that they are treating, a peak response, or something else. As far as I can tell, the authors stimulate the cells for 16 or 13 hours, depending on the ligand in question. I am not sure how these numbers were chosen; did the authors look at the GFP expression dynamics and choose this number based on those measurements? From an information-theoretic standpoint, it is best if the results are based on steady-state protein levels, but the authors don't seem to explain their rationale for choosing the time points that they choose. If the GFP expression levels do not reach a steady state, it would be great if the authors had a particular reason for choosing the times they choose. I would also suggest that the authors consider looking to see if their channel capacity estimates are robust to the time they choose by looking at other time points and calculating the channel capacities to see if they get similar results. I am not sure what times to choose, but perhaps 24 hours or 36? It is hard to say without looking at the average dynamics over time, so providing that kind of data would be extremely helpful.
Thank you for bringing this to our attention. We noticed that the stimulation time given in the Materials and methods was not correct. We stimulated for 16 h in every experiment except for TNFa stimulation, as described in the original main text. The stimulation time was set to the time at which the mean GFP expression level reaches its maximum. As shown in Appendix 1—figure 2A, U937 cells exhibit the maximum mean GFP response at around 12 hours of TNFa stimulation (50 ng/mL), at which point the expression level remains in a steady state. Around 20 hours later, the GFP expression level decreases to the basal level. Therefore, we selected 13 hours as the stimulation time when TNFa is the stimulant. For the other stimulants, we used 16 hours as the stimulation time, according to the maximum GFP response. Author response image 6 shows the channel capacity estimate for dectin2-expressing U937 cells using mannan as the stimulant at different time points. We include these data in our manuscript as follows:
Line 168: “Note that we choose the stimulation time, the period of incubation time of the cell with the input ligands, as the time point when GFP response and channel capacity reaches the maximum and steady state value (Appendix 1 Figure 2A and B).”
Also, the authors should acknowledge that a large body of literature has emerged that makes the claim that it is not the individual time points that matter, but rather the entire time series of the response that should be construed as the output. The authors can refer to the Zhang et al. paper in Cell Systems, the paper by Selimkhanov et al. (Science (2014) 346 (6215) 1370-1373), papers by the Hoffmann group (notably Tang et al. (2020) Nat Commun 12 (1) 1272), and a host of others. I am not suggesting that the authors adopt this worldview; there are actually serious mathematical and conceptual issues with construing the output of a communication channel to be a function like a time series. But, this idea is out there in the literature, and the authors should address this point and discuss how looking at GFP time series, rather than individual time point measurements, might yield different (and undoubtedly higher) channel capacity estimates.
We appreciate Reviewer #1’s suggestion to discuss channel capacity calculation from dynamic datasets. We understand that there is considerable freedom in selecting the structure of the input and output. For example, one might use an oscillating ligand concentration as the input and collect the temporal NFκB translocation into the nucleus as the output (Kellogg, R. A. & Tay, S. Cell 160, 381–392 (2015)), which, although those authors did not consider information theory, seemingly yields a higher channel capacity than a static input. Indeed, there are many ways of selecting the input and output for a channel capacity calculation in the same signaling pathway. In this work, we were concerned with the transmission of carbohydrate-encoded information via lectin signaling pathways at the cell population level at a fixed time point. We agree that describing channel capacity measurements using single-cell, time-resolved cellular output would bring broader insight to this work. This is ongoing research in the lab. We added the following sentence to the Discussions and Conclusions part:
Line 174: “In addition, this channel capacity can be further increased if one can measure the temporal evolution of output dynamics instead of static output dataset (Selimkhanov et al., 2014).”
6) As a final, rather important but also rather technical point, the authors in Figure 3C do statistical tests on their channel capacity estimates, comparing dectin1 and dectin2 cells to those that express dectin1 and dectin2 together. The issue here is that they use a t-test to estimate statistical significance, but it is really unclear where the dispersion (e.g. error bars or standard deviations) in the channel capacity estimates come from. I expect, since the authors say "n = 9" in the legend, that they split their data into 9 groups (maybe 9 sets of cells whose results were collected in different FACS runs or something?), performed the channel capacity estimate on each independently, then used the means and standard deviations in those estimates to do the t-test. That is completely made up by me, however; the authors really need to explain where these error estimates (and their +/- X values for channel capacity estimates in the various tables) actually come from.
In any case, the t-test of course only works if the data in question is drawn from actual Gaussian distributions. This does not mean that the distributions "look Gaussian" or "seem okay;" in the application of a t-test, you make extremely strong assumptions that only real honest-to-goodness Gaussian distributions actually satisfy. So the data really needs to actually be drawn from a Gaussian distribution for the results of the test to be interpreted appropriately.
My first suggestion here is that the authors need to use a nonparametric statistical test. That is, unless there is a true, theoretical reason to expect that the data will be drawn from a Gaussian (I know of no such reason, but perhaps the authors have one). The Wilcoxon rank-sum test was made for problems like this one, so I would suggest the authors use that.
A larger problem here, however, is that taking n = 9 (however it was done in the end) is not going to give a great estimate of the uncertainty in the channel capacity estimate. I would instead suggest that the authors merge their data into one data set (unless there are some kind of batch effects that prevent them from doing that) and then using a bootstrap/resampling approach, as in the Suderman et al. PNAS paper, to estimate confidence intervals and generate distributions for statistical tests. These bootstraps can be done hundreds or thousands of times, allowing for even more nonparametric tests (like permutation tests) to be used to estimate statistical significance.
This is honestly a fairly minor point, because nothing terribly important in the conclusions of the paper rest on the statistical tests here. But, if the authors want to argue that there are real differences between dectin1, dectin2 and dectin1+dectin2 cells, they should do these statistics in a more nonparametric way.
Since it is unclear whether the channel capacities are distributed Gaussian distribution across different batches of experiment, we have changed statistical test from ttest to Wilcoxon ranksum test, a nonparametric statistical test, if we compare channel capacities in the manuscript.
Figure 2B and D include the significance levels from the Wilcoxon rank-sum test, indicating that the statistical significance (*p<0.05, **p<0.01) reported in the original manuscript is preserved.
In the revised manuscript, we do not compare the other pairs of receptors because of possible differences in the number of downstream molecules in U937 cells. Please also note that pooling datasets measured on different dates usually introduces a large bias in the channel capacity, since the baseline of the GFP signal differs significantly owing to day-to-day variation in the laser condition of the flow cytometer.
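For illustration, the Wilcoxon rank-sum comparison mentioned in this response can be sketched for small samples as an exact test. This is a hedged sketch rather than the code used for the manuscript: the helper names `midranks` and `rank_sum_exact` are hypothetical, and in practice an established statistics library would normally be used instead. With two groups of n = 9 the enumeration covers 48,620 label assignments, which is still quick.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_exact(a, b):
    """Exact two-sided rank-sum p-value by enumerating all group splits."""
    ranks = midranks(list(a) + list(b))
    n_a, n = len(a), len(a) + len(b)
    expected = n_a * (n + 1) / 2  # mean rank sum of group a under the null
    observed = abs(sum(ranks[:n_a]) - expected)
    total = extreme = 0
    for combo in combinations(range(n), n_a):
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(sum(ranks[i] for i in combo) - expected) >= observed - 1e-9:
            extreme += 1
    return extreme / total
```

Because the test uses only ranks, it is unaffected by day-to-day shifts in scale within a comparison, but it does not by itself correct for batch differences between the two groups.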
Reviewer #2 (Recommendations for the authors):
I think this paper would be greatly improved if the authors were more precise and detailed in the description of their capacity calculation. To address this, it's likely the authors were following the method from Cheong et al., but some of those details should be described here.
We thank Reviewer #2 for the suggestion to improve the description of the channel capacity calculation. In the revised manuscript, we have therefore added the following content in Appendix 2.
https://doi.org/10.7554/eLife.69415.sa2
Article and author information
Author details
Funding
European Research Council (716024)
 Christoph Rademacher
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication. Open access funding provided by Max Planck Society.
Acknowledgements
This project (GLYCONOISE) has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant agreement No. 716024). We thank Max Planck Society for support and Prof. Dr. Peter H Seeberger for helpful discussions. We also thank the Deutsches Rheuma Forschungszentrum (DRFZ) for providing access to their cell sorting facility. The computational results presented were obtained using the CLIP cluster (https://clip.science).
Senior Editor
 Aleksandra M Walczak, CNRS, France
Reviewing Editor
 Andre Levchenko, Yale University, United States
Version history
 Received: April 14, 2021
 Preprint posted: May 10, 2021
 Accepted: February 19, 2023
 Accepted Manuscript published: February 20, 2023 (version 1)
 Version of Record published: March 14, 2023 (version 2)
 Version of Record updated: March 31, 2023 (version 3)
Copyright
© 2023, Fuchsberger, Kim et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.