Intramolecular glycan-protein (protein-glycan) interactions occur between glycans and the proteins to which they are covalently attached.
[2] For instance, SARS-CoV-2, the causative agent of COVID-19, employs its extensively glycosylated spike (S) protein to bind to the ACE2 receptor, allowing it to enter host cells.
[3] The spike protein is a trimeric structure, with each subunit containing 22 N-glycosylation sites, making it an attractive target for vaccine research.
Indeed, three different hexoses could theoretically produce from 1,056 to 27,648 unique trisaccharides, in contrast to only 6 peptides or oligonucleotides formed from 3 amino acids or 3 nucleotides, respectively.
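The figure of 6 for peptides and oligonucleotides can be checked with a minimal sketch, assuming it counts the linear orderings of three distinct monomers, each used exactly once (3! = 6). The residue names and the helper `linear_sequences` are hypothetical and chosen only for illustration. The sketch does not attempt to reproduce the trisaccharide range, which depends on further variables such as linkage position, anomeric configuration, and branching.

```python
from itertools import permutations


def linear_sequences(monomers):
    """Return every linear ordering that uses each monomer exactly once."""
    return ["-".join(order) for order in permutations(monomers)]


# Hypothetical example residues; any three distinct monomers give the same count.
tripeptides = linear_sequences(["Ala", "Gly", "Ser"])
print(len(tripeptides))  # 3! = 6
```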
[2] In contrast to template-driven protein biosynthesis, the "language" of glycosylation is still unknown, which, together with the prevalence of glycans in living organisms, makes glycobiology an active area of current research.
[2] The study of glycan-protein interactions provides insight into the mechanisms of cell signaling and enables the development of better diagnostic tools for many diseases, including cancer.
[5] The binding of glycan-binding proteins (GBPs) to glycans can be modeled as a simple equilibrium.
Given that many GBPs exhibit multivalency, this model may be expanded to account for multiple equilibria.
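The two models can be sketched numerically as follows. This is an illustration under stated assumptions rather than the formulation used in the cited work: the single-site case uses the standard occupancy expression [L] / (Kd + [L]), and the multivalent case uses the Adair equation with stepwise macroscopic association constants. The function names, the parameter names, and the treatment of the free glycan concentration as a known input are all assumptions made here.

```python
def fraction_bound(ligand, kd):
    """Fractional occupancy of a single site: [L] / (Kd + [L]).

    ligand: free glycan concentration (M); kd: dissociation constant (M).
    """
    return ligand / (kd + ligand)


def mean_ligands_bound(ligand, stepwise_ka):
    """Average number of glycans bound per protein (Adair equation).

    stepwise_ka holds the macroscopic association constants K1..Kn (1/M)
    for the successive binding steps.
    """
    numerator = 0.0
    polynomial = 1.0  # binding polynomial; the leading 1 is the free protein
    beta = 1.0        # cumulative constant K1 * K2 * ... * Ki
    for i, ka in enumerate(stepwise_ka, start=1):
        beta *= ka
        term = beta * ligand ** i
        numerator += i * term
        polynomial += term
    return numerator / polynomial


# Hypothetical values: a 1 mM site probed at 1 mM free glycan.
print(fraction_bound(1e-3, 1e-3))
# Two identical, independent sites with microscopic K = 1e3 1/M
# correspond to macroscopic constants K1 = 2K and K2 = K/2.
print(mean_ligands_bound(1e-3, [2e3, 5e2]))
```

With a single association constant, `mean_ligands_bound` reduces to the single-site expression, so the multivalent model is a direct generalization of the simple equilibrium.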
Indeed, a statistical analysis of carbohydrate-binding pockets shows that aspartic acid and asparagine residues are present twice as often as would be predicted by chance.
It has been shown that a single change in the stereochemistry at the C4 carbon shifts the preference for aromatic residues from
[6] The comparison of electrostatic surface potentials (ESPs) of aromatic rings in tryptophan, tyrosine, phenylalanine, and histidine suggests that electronic effects also play a role in the binding to glycans (see Figure 2).
There are many proteins capable of binding to glycans, including lectins, antibodies, microbial adhesins, and viral agglutinins.
Lectins found in plant and fungal cells have been extensively used in research as tools to detect, purify, and analyze glycans.
[7] Although antibodies exhibit nanomolar affinities toward protein antigens, their specificity toward glycans is very limited.
[9][7] In contrast with jawed vertebrates, whose immunity is based on the variable, diversity, and joining (VDJ) gene segments of immunoglobulins, jawless vertebrates, such as lampreys and hagfish, create receptor diversity by somatic DNA rearrangement of leucine-rich repeat (LRR) modules that are incorporated into *vlr* genes (variable lymphocyte receptors).
[10] These LRRs form 3D structures resembling curved solenoids that selectively bind specific glycans.
[11] A study from the University of Maryland has shown that lamprey antibodies (lambodies) could selectively bind to tumor-associated carbohydrate antigens (such as Tn and TF
A selection of lambodies that could bind to aGPA, a human erythrocyte membrane glycoprotein that is covered with 16 TF
This lambody selectively stained (over healthy samples) cells from 14 different types of adenocarcinomas: bladder, esophagus, ovary, tongue, cheek, cervix, liver, nose, nasopharynx, greater omentum, colon, breast, larynx, and lung.
[9] A close look at the crystal structure of VLRB.aGPA.23 reveals a tryptophan residue at position 187 right over the carbohydrate binding pocket.
The ability to form multivalent protein-ligand interactions significantly enhances the strength of binding: while
For example, galectins are usually observed as dimers, while intelectins form trimers and pentraxins assemble into pentamers.
Larger structures, like hexameric Reg proteins, may assemble into membrane-penetrating pores.
Collectins may form even more elaborate complexes: bouquets of trimers or even cruciform-like structures (e.g. in SP-D).
[17] Glycan-protein interactions may be detected by testing proteins of interest (or libraries thereof) that bear fluorescent tags.
The structure of the glycan-binding protein may be deciphered by several analytical methods, including mass spectrometry (MALDI-MS, LC-MS, tandem MS/MS) and 2D NMR.
[18] Computational methods have been applied to search for parameters (e.g. residue propensity, hydrophobicity, planarity) that could distinguish glycan-binding sites from other surface patches.
[19] Further studies have employed calculations of van der Waals energies of protein-probe interactions and amino acid propensities to identify carbohydrate recognition domains (CRDs) with 98% specificity at 73% sensitivity.
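To make the two reported metrics concrete, the sketch below computes sensitivity and specificity from a confusion matrix. The counts are hypothetical, chosen only so that the results match the quoted percentages; they are not data from the cited studies.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real binding sites that the method detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-binding patches that the method correctly rejects."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts: 100 known binding sites and 100 non-binding patches.
print(sensitivity(true_pos=73, false_neg=27))
print(specificity(true_neg=98, false_pos=2))
```

High specificity with moderate sensitivity means that few non-binding patches are misclassified as binding sites, while roughly a quarter of the real binding sites are missed.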
[21] In contrast with protein studies, where the primary structure of a protein is unambiguously defined by the sequence of nucleotides (the genetic code), glycobiology still cannot explain how a certain "message" is encoded using carbohydrates or how it is "read" and "translated" by other biological entities.
An interdisciplinary effort, combining chemistry, biology, and biochemistry, studies glycan-protein interactions to see how different sequences of carbohydrates initiate different cellular responses.