List of glycoproteins in the human genome: predictions and measurement of glycosylation, and virus infection inhibition assays using sample proteins from the list.

A. Schematic of inhibition of virus infection by membrane glycoproteins. B. The number of membrane proteins and predicted glycosylated proteins in the human genome from UniProt. C. The number of predicted glycosylation sites per amino acid of the ectodomain for 2515 membrane-associated proteins, plotted against the number of amino acids in the ectodomain. Color indicates the measured rate of glycosylation per molecule (PNA/mol) per amino acid. 0T, 4T, and 14T indicate truncation mutants of MUC1 that contain 0, 4, and 14 tandem repeat sequences, respectively. D. Flow cytogram for the binding of Alexa Fluor 647-labeled PNA to HEK293T cells expressing MUC1 (42 tandem repeats) tagged with SNAP-Surface 488, and the linear regression of the data to the reaction model (red dashed line; see the method section for details). E. Relation between the measured PNA/mol and the number of predicted glycosylation sites for the indicated molecules. F. SARS-CoV2-PP infection assay in HEK293T cells expressing ACE2, TMPRSS2, and each of the designated membrane proteins. Dots are measured values of the integral of GFP expression derived from infecting viruses in those samples, adjusted by the total ACE2 expression at the time of infection, and are plotted against the mean density of the membrane protein at the time of infection. Red lines indicate the mean predicted infection rates from Bayesian hierarchical inference based on a sigmoidal function, and the purple area represents one sigma below and above the red lines. G. Relation between the measured rate of glycosylation per molecule (PNA/mol) and the molecule-specific IC50 density of the sigmoidal inhibitory function inferred from the Bayesian hierarchical modeling in F (σ_IC50). H. Relation between σ_IC50 and the estimated molecular weight including glycans in the experimental system. I. Purification and analysis of recombinant proteins: non-glycosylated (bacterial, B) and glycosylated (G) MUC1 (14TR) tagged with SNAP-Surface 488. Coomassie Brilliant Blue-stained (left), glycan-stained (middle), and fluorescence (right) images of the proteins in SDS-PAGE.
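
The sigmoidal inhibitory function in panels F and G can be illustrated with a minimal sketch. The legend does not give the functional form, so the Hill-type sigmoid below, the `hill` exponent, the `max_rate` parameter, and the example densities are all assumptions made for illustration. Only the role of σ_IC50 as the density at which infection is half-inhibited follows from the legend.

```python
def infection_rate(density, ic50_density, hill=1.0, max_rate=1.0):
    """Hill-type sigmoidal inhibition (assumed form, not taken from the paper).

    density      -- surface density of the membrane protein (arbitrary units)
    ic50_density -- density at which infection drops to half of max_rate
    hill         -- steepness of the sigmoid (hypothetical parameter)
    max_rate     -- infection rate in the absence of the protein
    """
    if density < 0 or ic50_density <= 0:
        raise ValueError("density must be >= 0 and ic50_density must be > 0")
    return max_rate / (1.0 + (density / ic50_density) ** hill)


# Illustrative values only: infection falls as the protein density rises.
for density in (0.0, 100.0, 1000.0):
    print(density, infection_rate(density, ic50_density=100.0))
```

In a hierarchical model, each protein would carry its own `ic50_density` drawn from a shared prior; that inference step is not sketched here.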

Virus infection in epithelium regulated by glycan contents in each cell.

A. Maximal projection of a z-stack of images. SARS-CoV2-pp-infected Calu-3 cell monolayer cultured at the air-liquid interface (ALI), imaged by the binding of Alexa Fluor 647-labeled Neu5Ac-specific lectin from Sambucus sieboldiana (SSA) and by the expression of GFP derived from infecting viruses. B. Plot of fluorescence intensities of all pixels in Figure 2A in the GFP and SSA channels. C. SARS-CoV2-pp-infected Calu-3 ALI monolayer, imaged by immunofluorescence for each membrane protein and by GFP. Maximal projections of z-stack images are shown. D. Pearson correlation between the fluorescence intensities of lectin/antibody and GFP in image pixels of maximal projections of z-stack images. Error bars are the standard error of the mean over images from three or more different samples. Lectins used were SSA, MAL (from Maackia amurensis, Neu5Ac-specific), WGA (from Triticum vulgaris, GlcNAc-specific), PNA, and DSA (from Datura stramonium, GlcNAc-specific). E. SARS-CoV2-pp-infected Calu-3 ALI monolayer, imaged by immunofluorescence for CD44 and by SSA lectin binding. Maximal projections of z-stack images are shown. F. Pearson correlation between the lectin/antibody signal and the SSA signal in image pixels of maximal projections of z-stack images. Error bars are the standard error of the mean over images from three or more different samples. G. TOS analysis for GFP and SSA signal pixels in the image of Figure 2A. H. Correlation in the top 10% of both axes in the TOS analysis. Error bars are the standard error of the mean over images from three or more different samples.
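
The pixel-wise Pearson correlation in panels D and F, and the standard error of the mean over images, can be sketched as follows. This is a minimal illustration that assumes both channels are already maximal projections of equal size, given as nested lists; background subtraction, masking, and thresholding are not described in the legend and are not modeled. The function names are hypothetical.

```python
import math


def pearson_pixels(channel_a, channel_b):
    """Pearson correlation over paired pixel intensities of two 2D images."""
    a = [v for row in channel_a for v in row]
    b = [v for row in channel_b for v in row]
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("images must have the same number of pixels (>= 2)")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a channel with constant intensity has no correlation")
    return cov / math.sqrt(var_a * var_b)


def standard_error(values):
    """Standard error of the mean, using the sample (n - 1) variance."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(variance / n)


# Toy 2x2 images (illustrative values): the channels are perfectly anti-correlated.
gfp = [[1.0, 2.0], [3.0, 4.0]]
lectin = [[4.0, 3.0], [2.0, 1.0]]
print(pearson_pixels(gfp, lectin))
```

One correlation value per image would then be averaged across samples, with `standard_error` giving the error bar.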

Structure of interface between viruses and cell membranes and polymer brush theory.

A. Flow cytogram of DiD-labeled SARS-CoV2-PP binding to HEK293T cells expressing ACE2, TMPRSS2, and/or various truncation mutants of MUC1. B. Schematic diagram of the interface structure between the virus and the cell membrane during the process from cell-virus binding to virus uptake via stable interface formation. C. Basics of polymer brush theory and the free energy of a polymer in the brush structure. d, spacing between polymers; RF, Flory radius of the polymers; fint and fel, intermolecular and elastic free energies per polymer. D. Two types of interfacial structures for particle binding to polymer-grafted surfaces in the conventional polymer brush model. E. Two additional types of particle-surface interface structure, specific to the virus-cell interface. F. Chart of the free energy of the system, U, during the process of virus-cell interface formation.
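
For orientation, the quantities named in panel C can be related through the textbook Alexander-de Gennes scaling argument. The expressions below are a sketch under stated assumptions and are not taken from the paper: the segment size a, the segment number N, the brush height L, and the good-solvent exponent 3/5 are introduced here for illustration, and numerical prefactors are omitted. Assigning the two terms to fel and fint is likewise an assumption about how the legend uses those symbols.

```latex
% Assumed symbols: a = segment size, N = number of segments,
% d = spacing between grafted polymers, L = brush height.
\begin{align}
  R_F &\simeq a\,N^{3/5}
      && \text{(isolated chain in a good solvent)} \\
  d &> R_F \;\text{(mushroom regime)}, \quad d < R_F \;\text{(brush regime)} \\
  \frac{f}{k_B T} &\simeq
      \underbrace{\frac{L^2}{N a^2}}_{f_{el}}
      + \underbrace{\frac{a^3 N^2}{d^2 L}}_{f_{int}}
      && \text{(free energy per polymer)} \\
  \frac{\partial f}{\partial L} = 0
      &\;\Rightarrow\; L \simeq N a \left(\frac{a}{d}\right)^{2/3}
\end{align}
```

In this picture, reducing d below RF raises the interaction term, which the chain relieves by stretching away from the surface at an elastic cost.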

In situ FLIM–FRET measurements for protein sizes, and conformational predictions.

A. Examples of FLIM images of cells expressing SNAP–Surface 488-conjugated CD24 proteins and loaded with PlasMem Bright Red dye at different surface densities. The schematic drawing (left) depicts the geometry of FRET from a single donor dye, conjugated to the SNAP tag at the end of the protein ectodomain, to populations of acceptor dyes incorporated in the plasma membrane. B. FRET efficiency estimated from FLIM imaging for cells expressing each protein at different mean acceptor densities. Lines are the predicted mean FRET efficiencies from Bayesian hierarchical inference (see the method section for details), and the purple area represents one sigma below and above the lines. C. Relation between the Flory radius inferred from FLIM–FRET analyses and σ_IC50 inferred from infection inhibition assays. The measured glycosylation rate PNA/mol/AA is depicted in color. D. Relation between the Flory radius inferred from FLIM–FRET analyses and the estimated molecular weight including glycans. The dotted line is the result of linear regression. E. Relation between the Flory radius inferred from FLIM–FRET analyses and the amino acid length. Dotted lines are fits to the Flory model RF ∼ N^ν. F. Relation between the distance between the coordinates of the two amino acids at both ends of the ectodomain in AlphaFold2-predicted conformations and the number of amino acids, for all 2515 proteins in the list. The number of predicted glycosylation sites per amino acid is depicted in color. The dotted line indicates the Flory model RF ∼ N^ν, where ν = 0.14. G. Relation between the distance between the coordinates of the two amino acids at both ends of the ectodomain in AlphaFold2-predicted conformations for our sample molecules and the Flory radius measured by FLIM–FRET assays. The dotted line indicates where the measured RF is equal to the AlphaFold2-predicted length. H. Schematic diagram of protein conformational dynamics and the two length scales, RF and the AlphaFold2 prediction. Yellow and red stars indicate the two ends of the ectodomain.
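
A fit to the Flory model RF ∼ N^ν, as in panels E and F, can be sketched as an ordinary least-squares line in log-log space. This is one simple way to estimate ν and is an assumption, since the legend does not state how the fits were performed. The function name and the synthetic data are hypothetical and do not reproduce any measured values.

```python
import math


def fit_flory_exponent(n_residues, radii):
    """Fit R_F = prefactor * N**nu by least squares on log(R_F) vs log(N).

    Returns (nu, prefactor). Assumes all inputs are strictly positive.
    """
    if len(n_residues) != len(radii) or len(radii) < 2:
        raise ValueError("need at least two paired (N, R_F) values")
    xs = [math.log(n) for n in n_residues]
    ys = [math.log(r) for r in radii]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all chain lengths are identical; exponent is undefined")
    nu = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    prefactor = math.exp(mean_y - nu * mean_x)
    return nu, prefactor


# Synthetic, noise-free data generated with nu = 0.6 (illustrative only).
lengths = [50, 100, 200, 400]
radii = [0.5 * n ** 0.6 for n in lengths]
nu, prefactor = fit_flory_exponent(lengths, radii)
print(round(nu, 3), round(prefactor, 3))
```

Fitting in log space weights relative rather than absolute errors, which suits data spanning a wide range of chain lengths.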

Biochemical reconstitution of protein packing in membrane surface.

A. Lipid bilayers coated on silica beads for incorporating bacterially expressed (b-) and glycosylated (mammalian-expressed, g-) proteins: schematic and fluorescence images. B. Representative result of flow cytometry analyses of protein binding to lipid bilayer-coated silica beads. Bars indicate the standard deviation in each measurement. Lines are regression curves to the receptor binding model Bx/(KD + x), where x is the protein concentration, KD is the dissociation constant, and B is the saturated density. C. Relation between the surface area coverage by bound proteins and the concentration of protein used for membrane binding. The covered area was calculated by assuming that each bound protein occupies a hemisphere of radius RF, and the ratio of coverage was then calculated. Protein concentrations were normalized by KD. The dotted line in the plot indicates the coverage when hemispheres of radius RF are aligned in hexagonal close packing. D. Schematic of structures and free energies for glycosylated and non-glycosylated proteins with similar RF that are in the mushroom and brush regimes.
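
The binding model in panel B and the coverage calculation in panel C can be written out as a short sketch. The model Bx/(KD + x) is taken from the legend. The footprint of π·RF² per bound protein follows from the hemisphere assumption stated there, and the hexagonal close-packing limit for circles on a plane is the standard value π/(2√3). Function names, units, and example numbers are hypothetical.

```python
import math

# Area fraction covered by equal circles in hexagonal close packing (~0.9069).
HEX_PACKING_LIMIT = math.pi / (2.0 * math.sqrt(3.0))


def bound_density(x, b_max, kd):
    """Receptor binding model B*x / (KD + x) from the legend.

    x     -- protein concentration (same units as kd)
    b_max -- saturated surface density B (molecules per unit area)
    kd    -- dissociation constant KD
    """
    if x < 0 or kd <= 0:
        raise ValueError("x must be >= 0 and kd must be > 0")
    return b_max * x / (kd + x)


def area_coverage(density, flory_radius):
    """Fraction of surface covered if each protein occupies a disc of radius R_F.

    density and flory_radius must use consistent units (e.g. nm^-2 and nm).
    """
    return density * math.pi * flory_radius ** 2


# Illustrative numbers only: B = 0.02 nm^-2, KD = 1 (normalized), R_F = 3 nm.
for x_over_kd in (0.1, 1.0, 10.0):
    coverage = area_coverage(bound_density(x_over_kd, 0.02, 1.0), 3.0)
    print(x_over_kd, round(coverage, 3), coverage > HEX_PACKING_LIMIT)
```

A coverage above `HEX_PACKING_LIMIT` means the proteins cannot all sit as undeformed hemispheres of radius RF, which is the condition separating the two regimes in panel D.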

Superresolution imaging of virus and cellular proteins for analyses of the virus–cell interface.

A. Dual-color STORM images of SNAP-MUC1/SNAP-VAMP2 in cells and Spike in SARS-CoV-2-PP bound to the cells. Whole-cell images (left) were reconstructed from STORM data, and the coordinates determined by STORM are individually plotted in the expanded images on the right. B. Schematic of the calculation of cross-correlation. Mutual distances between all pairs of the two dyes were calculated from the coordinates determined by STORM imaging. C. Examples of the histogram H(r) of mutual distances between all pairs of the two dyes, calculated from the coordinates, and of the cross-correlation function C(r), which is the normalized radial average of H(r). Δr, the shell width and the bin size of the histogram, was set to 5 nm. D. Examples of simulated STORM images and C(r) calculated from these images. h denotes the distance between virus and cell when they are separated, and the density plot illustrates average protein densities along the membranes in these images. E. Plot of C(r)_(out/in) calculated for all simulated images, along with density_(in/out) and h. Blue dashed lines indicate the upper and lower bounds of C(r)_(out/in) calculated for a stable virus–cell interface, the red dotted line is the fifth-order polynomial regression to the data points, and the purple area represents one sigma below and above the line. F. Traces of C(r) for individual cells expressing MUC1 and VAMP2. G. C(r)_(out/in) calculated for all STORM images. H. density_(in/out) for all STORM images, converted from C(r)_(out/in) based on the regression in E, plotted against RF for each protein determined by FLIM–FRET (Figure 4).
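
The calculation in panels B and C, a histogram H(r) of all mutual distances followed by a normalized radial average C(r), can be sketched as below. The bin size of 5 nm is taken from the legend. The normalization used here, dividing by the pair count expected for two uniform, independent point sets in a region of known area, is one common convention and is an assumption, as is ignoring edge effects. Function and parameter names are hypothetical.

```python
import math


def cross_correlation(points_a, points_b, area, bin_size=5.0, r_max=100.0):
    """Return (H, C): pair-distance histogram and normalized cross-correlation.

    points_a, points_b -- lists of (x, y) coordinates in nm for the two dyes
    area               -- area of the analyzed region in nm^2 (assumed known)
    bin_size           -- shell width (Delta r), 5 nm as in the legend
    r_max              -- largest distance considered (hypothetical cutoff)
    """
    if not points_a or not points_b or area <= 0:
        raise ValueError("need non-empty point sets and a positive area")
    n_bins = int(r_max / bin_size)
    hist = [0] * n_bins
    for ax, ay in points_a:
        for bx, by in points_b:
            k = int(math.hypot(ax - bx, ay - by) / bin_size)
            if k < n_bins:
                hist[k] += 1
    # Expected pairs per unit shell area for uniform, independent point sets.
    pair_density = len(points_a) * len(points_b) / area
    corr = []
    for k, count in enumerate(hist):
        # Area of the annulus between k*bin_size and (k+1)*bin_size.
        shell_area = math.pi * (2 * k + 1) * bin_size ** 2
        corr.append(count / (pair_density * shell_area))
    return hist, corr


# Toy example: one dye of each color, 5 nm apart, in a 10 nm x 10 nm region.
hist, corr = cross_correlation([(0.0, 0.0)], [(3.0, 4.0)], area=100.0)
print(hist[:3], round(corr[1], 4))
```

With this convention, C(r) near 1 indicates no spatial relationship at distance r, values above 1 indicate co-clustering, and values below 1 indicate exclusion.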

Models for three distinct cases of virus – cell interface.

A. Energy and force involved in molecular exclusion from the virus-cell interface. B. Transient distribution of membrane proteins near the interface. C. Two-dimensional distribution of viruses and glycoproteins in the plane of the cell membrane. D-F. Schematics of virus-cell interface structures and the corresponding free energy charts. D. Virus-cell interface with very small membrane proteins. Because the energy penalty for constructing a virus-cell interface that entraps these small molecules is low, the virus can form the interface without excluding the proteins and can infect cells. E-F. Virus-cell interface structure for larger membrane proteins. Interface formation requires protein exclusion. Because of membrane viscosity and the inhomogeneous field of adhesive energy, excluded proteins become crowded in the proximity of the interface. For highly glycosylated proteins (F), the intermolecular repulsion between excluded proteins becomes very high in the high-density regions. This repulsion generates a high kinetic barrier to molecular exclusion, preventing progression to the subsequent steps of infection. In contrast, for proteins with low glycosylation (E), the intermolecular repulsion and the energy barrier are lower, so these molecules are more easily excluded from the interface and virus infection is not strongly inhibited.