Glycoproteins: Synthesis and Clinical Consequences



© 1996–2014 themedicalbiochemistrypage.org, LLC | info @ themedicalbiochemistrypage.org

Glycoproteins

Membrane-associated carbohydrate is exclusively in the form of oligosaccharides covalently attached to proteins, forming glycoproteins, and to a lesser extent covalently attached to lipids, forming the glycolipids. Glycoproteins consist of proteins covalently linked to carbohydrate. The post-translational attachment of carbohydrate to proteins plays a critical role in overall biochemical complexity in humans (as in all eukaryotes). The importance of this protein modification is emphasized by the fact that approximately 50% of all proteins are known to be glycosylated and at least 1% of the human genome is represented by glycan biosynthesis genes.


The predominant sugars found in glycoproteins are glucose (Glc), galactose (Gal), mannose (Man), fucose (Fuc), N-acetylgalactosamine (GalNAc), N-acetylglucosamine (GlcNAc) and N-acetylneuraminic acid (NANA). NANA is also called sialic acid (Sia). The distinction between proteoglycans and glycoproteins resides in the level and types of carbohydrate modification. Proteoglycans also contain the sugar glucuronic acid (GlcA). The carbohydrate modifications found in glycoproteins are rarely as complex as those of proteoglycans. The carbohydrates of glycoproteins are linked to the protein component through either O-glycosidic or N-glycosidic bonds. The N-glycosidic linkage is through the amide group of asparagine (Asn, N). The O-glycosidic linkage is to the hydroxyl of serine (Ser, S), threonine (Thr, T) or hydroxylysine (hLys). The linkage of carbohydrate to hLys is generally found only in the collagens. The carbohydrate linked to hLys is either the single sugar galactose or the disaccharide glucosylgalactose. When attached to Ser or Thr, the sugar of O-linked glycoproteins is most often GalNAc. This most common O-glycoprotein type is also commonly referred to as a mucin-type glycan. In N-linked glycoproteins, the sugar attached to the Asn residue is always GlcNAc.

N-linked sugars: The predominant carbohydrate attachment in glycoproteins of mammalian cells is via N-glycosidic linkage. The site of carbohydrate attachment to N-linked glycoproteins is found within a consensus sequence of amino acids, Asn-X-Ser/Thr (N-X-S/T), where X is any amino acid except proline. When an analysis of proteins in the public databases is carried out, it can be shown that approximately 65% of all the proteins contain at least one occurrence of the N-X-S/T consensus.
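As an illustration of the consensus rule, the following Python sketch scans a protein sequence (one-letter amino acid code) for N-X-S/T sequons. The function name and the peptide are hypothetical and invented for this example. A match marks only a potential attachment site; it does not show that the Asn is actually glycosylated.

```python
import re

# N-X-S/T sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead consumes only the Asn, so overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return the 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical peptide, invented for illustration only.
peptide = "MNGTANPSLNNSTK"
print(find_sequons(peptide))  # the Asn followed by Pro (N-P-S) is skipped
```

A regular expression with a lookahead is used here so that adjacent candidate sites, such as the two consecutive Asn residues near the end of the example peptide, are both reported.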

O-linked sugars: The synthesis of O-linked glycoproteins occurs via the stepwise addition of nucleotide-activated sugars directly onto the polypeptide. The nucleotide-activated sugars are coupled to either UDP, GDP (as with mannose) or CMP (for instance, NANA). The attachment of sugars is catalyzed by specific glycoprotein glycosyltransferases. Evidence indicates that each specific type of carbohydrate linkage in O-linked glycoproteins is the result of a different glycosyltransferase.


O-linkage to GalNAc

N-linkage to GlcNAc

Most proteins that are secreted, or bound to the plasma membrane, are modified by carbohydrate attachment. The part that is modified, in plasma membrane-bound proteins, is the extracellular portion of the protein. Intracellular proteins are less frequently modified by carbohydrate attachment. However, the attachment of carbohydrate to intracellular proteins confers unique functional activities on these proteins. Linkage of carbohydrate to cytosolic and/or nuclear proteins occurs via O-linkage and involves attachment of GlcNAc to serine or threonine residues. The linkage is catalyzed by the enzyme O-GlcNAc transferase, OGT. Several transcription factors and RNA polymerase II have been shown to be modified by O-GlcNAc linkage.

The protein component of all glycoproteins is synthesized from polyribosomes that are bound to the endoplasmic reticulum (ER). The processing of the sugar groups occurs co- and post-translationally in the lumen of the ER and continues in the Golgi apparatus for N-linked glycoproteins. Attachment of sugars in O-linked glycoproteins occurs post-translationally in the Golgi apparatus. Sugars used for glycoprotein synthesis (both N-linked and O-linked) are activated by coupling to nucleotides. Glucose and GlcNAc are coupled to UDP and mannose is coupled to GDP.


Nucleotide Sugar Biosynthesis

The monosaccharides that are attached to proteins (as well as lipids in the formation of glycolipids) are first activated by coupling to a nucleotide. These nucleotide-activated sugars are derived from dietary sources and salvage pathways. Glucose and fructose are the two major forms of sugar in humans from which all other monosaccharides can be synthesized. Through the actions of different enzymes these two sugars are phosphorylated, epimerized, and acetylated converting them into the various high-energy nucleotide sugar donors used in the synthesis of complex glycans. The synthesis of nearly all nucleotide-activated sugars occurs in the cytosol. The exception to this is the synthesis of CMP-activated N-acetylneuraminic acid (NANA, also NeuAc or Sia) which takes place in the nucleus.

Although nucleotide-activated sugars are synthesized in the cytosol, they are utilized within the lumen of the ER and/or the Golgi network. Therefore, they must be translocated into these organelles before they can be used in the process of glycan synthesis. Since nucleotide-activated sugars cannot freely pass through the ER or Golgi membranes, specific transport systems are responsible for their translocation. In general, the transport of a nucleotide-activated sugar occurs only into the organelle in which the corresponding glycosyltransferase is localized. Some nucleotide sugars enter only the lumen of the ER, others enter only the lumen of the Golgi apparatus, and a few nucleotide-activated sugars are transported into both organelles.

There are two mechanisms for the importation of nucleotide-activated sugars into the ER and/or Golgi apparatus. The first mechanism involves dolichol phosphate and it is used to transport mannose and glucose. This dolichol-mediated transport system is only functional in the ER. On the cytosolic side of the ER membrane, the mannose from GDP-Man and the glucose from UDP-Glc are linked to dolichol with concomitant cleavage of the nucleotide moiety. The enzymes that catalyze these two reactions are referred to as Dol-P-Man and Dol-P-Glc synthases, respectively. The sugar is then released into the lumen of the ER by a "flippase" activity allowing the monosaccharide to be used by ER-localized glycosyltransferases.

The second mechanism of nucleotide-sugar transport into the ER and/or Golgi involves specific nucleotide sugar transporters (NSTs). NSTs belong to the solute carrier 35 (SLC35) family of membrane transporters. These SLC35 transporters reside in the ER and/or Golgi membranes and function as typical antiporters. Some NSTs can transport more than one substrate. For example, the UDP-Gal (SLC35A2) transporter transports both UDP-Gal and UDP-GalNAc. In contrast, for example, the GDP-Fuc (SLC35C1, also known as FUCT1) and CMP-NANA (SLC35A1) transporters are monospecific.
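The substrate specificities named above can be captured in a small lookup table. The Python sketch below is illustrative only: it contains just the three transporters mentioned in this section, and the helper function names are hypothetical.

```python
# Transporter -> nucleotide sugars it carries, limited to the examples in the text.
NST_SUBSTRATES = {
    "SLC35A2": {"UDP-Gal", "UDP-GalNAc"},
    "SLC35C1": {"GDP-Fuc"},   # also known as FUCT1
    "SLC35A1": {"CMP-NANA"},
}


def transporters_for(nucleotide_sugar: str) -> list[str]:
    """List the transporters in the table that carry the given nucleotide sugar."""
    return sorted(
        name for name, substrates in NST_SUBSTRATES.items()
        if nucleotide_sugar in substrates
    )


def is_monospecific(transporter: str) -> bool:
    """True when the table lists exactly one substrate for the transporter."""
    return len(NST_SUBSTRATES[transporter]) == 1


print(transporters_for("UDP-GalNAc"))
print(is_monospecific("SLC35C1"), is_monospecific("SLC35A2"))
```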

When the nucleotide-activated sugar is transported into the lumen of the ER or Golgi there is a concomitant equimolar exit of a corresponding nucleoside monophosphate from the ER/Golgi lumen to the cytosol. It is the nucleotide portion of the nucleotide-activated sugar that serves as the recognition feature required for initial binding to the NST. Following entry into the ER or Golgi, a glycosyltransferase will transfer the monosaccharide to a target glycan concomitant with removal of the nucleoside diphosphate. The nucleoside diphosphates are then converted to nucleoside monophosphates and inorganic phosphate by a nucleoside diphosphatase. The resultant nucleoside monophosphate can then be transported out into the cytosol via the action of the SLC35 transporters. The inorganic phosphate is transported into the cytosol by a specific transporter. Both nucleoside diphosphates and monophosphates can inhibit the nucleotide-activated sugar transport process as well as the activity of lumenal glycosyltransferases.

The synthesis of nucleotide-activated sugars is under tight regulatory control. Loss of this control, due to enzyme deficiency, can result in serious clinical manifestations. However, because of the interconnected pathways of nucleotide-activated sugar metabolism, the results of an individual enzyme deficiency can be difficult to predict. Due to the tight regulation of nucleotide-activated sugar synthesis, altered production of even a single nucleotide sugar can significantly impair glycosylation with potentially profound effects. That disruption of nucleotide-activated sugar synthesis can result in severe clinical symptomatology is evident from a number of disorders classified as congenital disorders of glycosylation, CDG (see below and the CDG page). For example, patients with CDG-IIc, caused by a deficient GDP-Fuc transporter (FUCT1, SLC35C1), manifest with unique facial features, recurrent infections, persistent leukocytosis, defective neutrophil chemotaxis and severe growth and mental retardation.


Mechanism of N-Glycosylation

In contrast to the step-wise addition of sugar groups to the O-linked class of glycoproteins, N-linked glycoprotein synthesis requires a lipid intermediate: dolichol phosphate. Dolichols are polyprenols (C80–C100) containing 17 to 21 isoprene units, in which the terminal unit is saturated. In the image below the black bracket denotes the isoprene unit which can be repeated numerous times. The phosphate moiety of dolichol phosphate is attached to the –OH group.

Structure of dolichol

As indicated, the formation of the GlcNAc-β-Asn linkage in proteins occurs in the endoplasmic reticulum (ER) through cotranslational addition of a preassembled carbohydrate core structure that is delivered via the carbohydrate-dolichol lipid intermediate. The preassembled carbohydrate core structure comprises three terminal residues of glucose attached to a branched cluster of nine mannose residues that are in turn attached to two GlcNAc residues attached to dolichol pyrophosphate. The structure is abbreviated Glc3Man9GlcNAc2–PP–dolichol. This structure is commonly referred to as the lipid-linked oligosaccharide (LLO) whereas the oligosaccharide structure itself is termed the en bloc oligosaccharide. In mammalian cells the importance of the terminal glucose residues is evident from the fact that transfer of Man9GlcNAc2–PP–dolichol is some 25 times less efficient than transfer of the complete structure. In addition, structures that contain three terminal glucose residues, but not the complete Man9GlcNAc2 structure, are efficiently transferred to protein by oligosaccharyltransferase. Synthesis of the en bloc dolichol–PP–oligosaccharide unit begins on the cytoplasmic face of the ER membrane and, prior to transfer to the protein, the structure “flips” to the luminal side.


Pathway of synthesis and transfer of the LLO unit in the ER.

Immediately following transfer of the en bloc oligosaccharide unit to the protein, processing and alteration of the composition of the oligosaccharide ensues and continues as the protein passes through the ER then into and through the Golgi apparatus. Initially, the terminal glucose is removed through the action of glucosidase I (GI), a membrane-bound enzyme recognizing α-1,2-linked glucose. The remaining two glucose residues are then removed by glucosidase II (GII), a soluble enzyme recognizing α-1,3-linked glucose. After removal of the glucose residues, the action of α-mannosidases removes several mannose residues as the protein progresses to the Golgi. The action of the various glucosidases and mannosidases leaves N-linked glycoproteins containing a common core of carbohydrate consisting of three mannose residues and two GlcNAc. Through the action of a wide range of glycosyltransferases and glycosidases a variety of other sugars are attached to this core as the protein progresses through the Golgi. These latter reactions generate the three major families of N-linked glycoproteins described below.
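The residue bookkeeping in this trimming sequence can be followed with a short Python sketch that tracks only monosaccharide counts, not linkages or branching. The helper name `trim` is hypothetical, and removing six mannose residues in a single step is a simplifying assumption (nine in the en bloc unit minus three in the core); in the cell this is the work of several α-mannosidases acting in the ER and Golgi.

```python
from collections import Counter


def trim(glycan: Counter, sugar: str, n: int) -> Counter:
    """Return a copy of the glycan composition with n residues of one sugar removed."""
    if glycan[sugar] < n:
        raise ValueError(f"cannot remove {n} {sugar} from {dict(glycan)}")
    trimmed = glycan.copy()
    trimmed[sugar] -= n
    return +trimmed  # unary plus drops sugars whose count reached zero


# Composition of the en bloc oligosaccharide, Glc3Man9GlcNAc2.
en_bloc = Counter({"Glc": 3, "Man": 9, "GlcNAc": 2})

after_gi = trim(en_bloc, "Glc", 1)    # glucosidase I removes the terminal glucose
after_gii = trim(after_gi, "Glc", 2)  # glucosidase II removes the remaining two
core = trim(after_gii, "Man", 6)      # mannosidases, collapsed into one step here

print(dict(core))
```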

Upon completion of glycan processing, all N-linked glycoproteins will contain a common core of carbohydrate attached to the polypeptide. This core consists of three mannose residues and two GlcNAc. A variety of other sugars are attached to this core and comprise three major N-linked families:

1. High-mannose type contains all mannose outside the core in varying amounts.

2. Hybrid type contains various sugars and amino sugars.

3. Complex type is similar to the hybrid type, but in addition, contains sialic acids to varying degrees.


Open squares: GlcNAc; open circles: mannose; open diamonds: galactose; filled squares: fucose; filled triangles: sialic acid. The Greek symbols α and β followed by numbers refer to the type of linkage.

Structures of oligosaccharides of the 3 major classes of N-glycoprotein.


O-Glycosylation & Mucin-Type O-Glycans

The attachment of carbohydrate to the hydroxyl group of serine (Ser, S) or threonine (Thr, T) residues in proteins, as well as hydroxylysine (hLys) in collagens, constitutes the bulk of O-linked glycans (O-glycans) found in humans. Carbohydrate addition in O-glycans occurs via the stepwise addition of nucleotide-activated sugars as the modified proteins traverse the ER and the Golgi network. There are seven major types of O-glycans in humans that are defined by the first sugar residue that is attached to either Ser, Thr or hLys. The Table below lists these seven O-glycan types. The (R)- symbol represents the fact that a broad array of additional carbohydrates can be found attached to these basic glycan structures. The ± symbol indicates that the additional carbohydrate structure is found in some but not all species of that particular O-glycan type. Xyl is the sugar xylose.


O-Glycan Type | Structure of Linkage | Glycoprotein Type
O-linked GlcNAc | GlcNAc-β1–Ser/Thr | nuclear and cytoplasmic
mucin-type | (R)-GalNAc-α1–Ser/Thr | plasma membrane and secreted
O-linked mannose | NANA-α2–3Gal-β1–4GlcNAc-β1–2Man-α1–Ser/Thr | α-dystroglycan
O-linked fucose | NANA-α2–6Gal-β1–4GlcNAc-β1–3±Fuc-α1-Ser/Thr | EGF domains; this particular O-fucosylation is critical in the function of the receptor protein Notch
O-linked fucose | Glc-β1–3Fuc-α1–Ser/Thr | thrombospondin 1 repeats (TSR)
O-linked glucose | Xyl-α1–3Xyl-α1–3±Glc-β1–Ser | EGF domains
O-linked galactose | Glc-α1–2±Gal-β1-O–Lys | collagens
glycosaminoglycan (GAG) | (R)-GlcA-β1–3Gal-β1–4Xyl-β1–Ser | proteoglycans

Of the seven different types of O-glycans, by far the most common in humans is the mucin-type. Mucin glycoproteins are so-called because of their abundance in the mucous secretions on cell surfaces and in body fluids. Mucin-type O-glycans all have the amino sugar GalNAc attached to the Ser or Thr residue of the modified protein. The process of O-GalNAc glycosylation occurs via two distinct steps that consist of initiation and processing. The initiation step controls the pattern and the density of the carbohydrate structures attached to the protein. The processing step determines the ultimate O-glycan structure that is present in the fully modified protein. The attachment of the initial GalNAc residue occurs in the Golgi network after the target protein has obtained its native folded state. Attachment of GalNAc to Ser or Thr residues in a target protein is catalyzed by a family of enzymes known as UDP-N-acetylgalactosamine:polypeptide N-acetylgalactosaminyltransferases (ppGalNAcTs or ppGaNTases). The addition of the GalNAc residue forms what is referred to as the Tn antigen (Tn Ag). Biochemical studies have defined 17 functionally active ppGaNTases in humans, with homology searches indicating that at least 20 such enzymes exist in humans.

Subsequent to the formation of the Tn Ag in mucin-type O-glycans, numerous glycosyltransferases begin the process of modification of the carbohydrate structure that then results in the fully modified protein. One common modification of the Tn Ag is the addition of sialic acid forming the structure referred to as the sialyl-Tn antigen. In addition to the Tn and the sialyl-Tn structures, the further modifications result in the formation of eight additional mucin-type core structures. These eight mucin-type core structures are defined based upon the second sugar attached to the GalNAc and the linkage of that attachment (e.g. α or β and 1–3 or 1–6). The core 1 through core 4 structures detailed in the Table below represent the majority of mucin-type O-glycan structures produced in human tissues.


Core Type | Carbohydrate Structure of Core | Comments
1 | Galβ1–3GalNAc-α1–Ser/Thr | most cells, secreted proteins; most abundant mucin-type O-glycan structure, referred to as the T antigen; there is a single core 1 β1–3 galactosyltransferase (C1Gal-T1 or T synthase) in mammals; highest expression in liver, kidney, heart, and placenta
2 | Galβ1–3(GlcNAcβ1–6)GalNAc-α1–Ser/Thr | all blood cells; represents a branching glycan of the core 1 structure; catalyzed by one of three β1–6 N-acetylglucosaminyltransferases (C2GnT-1 to -3 or β6GlcNAc-T), each with distinct expression patterns; two synthesize the core 2 structure, one (C2GnT-2) synthesizes both core 2 and core 4 structures
3 | GlcNAcβ1–3GalNAc-α1–Ser/Thr | colon and saliva; synthesized by β1–3 N-acetylglucosaminyltransferase (C3GnT-1 or β3Gn-T6), single enzyme in humans expressed primarily in small intestine and stomach
4 | GlcNAcβ1–3(GlcNAcβ1–6)GalNAc-α1–Ser/Thr | mucin-secreting cell types; represents a branching glycan of the core 3 structure; catalyzed by β1–6 N-acetylglucosaminyltransferases (β6GlcNAc-T); three β6GlcNAc-Ts in humans, each with distinct expression patterns; two synthesize the core 2 structure, one synthesizes both core 2 and core 4 structures
5 | GalNAcα1–3GalNAc-α1–Ser/Thr | adenocarcinomas, embryonic gut-derived meconium
6 | GlcNAcβ1–6GalNAc-α1–Ser/Thr | embryonic gut, mucins from ovarian cysts; core 6 glycans may represent β-galactosidase degradation products of core 2 glycans
7 | GalNAcα1–6GalNAc-α1–Ser/Thr | not described in humans
8 | Galα1–3GalNAc-α1–Ser/Thr | bronchial tissue mucins
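The way cores 1 through 4 are distinguished, by which sugars are attached to the initiating GalNAc and through which linkage, can be sketched as a lookup in Python. This is a minimal illustration covering only the four most common cores; the function name and the ASCII linkage labels (for example "beta1-3" for β1–3) are conventions chosen for this example.

```python
# Substituents on the Ser/Thr-linked GalNAc, written as (sugar, linkage) pairs.
# Only cores 1-4 are included; other cores fall through to the default.
CORES = {
    frozenset({("Gal", "beta1-3")}): "core 1",
    frozenset({("Gal", "beta1-3"), ("GlcNAc", "beta1-6")}): "core 2",
    frozenset({("GlcNAc", "beta1-3")}): "core 3",
    frozenset({("GlcNAc", "beta1-3"), ("GlcNAc", "beta1-6")}): "core 4",
}


def classify_core(substituents) -> str:
    """Name the mucin-type core given the sugars attached to the initiating GalNAc."""
    key = frozenset(substituents)
    if not key:
        return "Tn antigen"  # GalNAc alone, with nothing attached yet
    return CORES.get(key, "not core 1-4")


print(classify_core([("Gal", "beta1-3")]))
print(classify_core([("GlcNAc", "beta1-6"), ("Gal", "beta1-3")]))
```

Using a `frozenset` as the key makes the order of the substituents irrelevant, which mirrors the fact that core 2 and core 4 are simply the β1–6 GlcNAc branched forms of core 1 and core 3, respectively.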

Recent evidence has demonstrated that human core 1 β1–3-galactosyltransferase (C1Gal-T1) requires a molecular chaperone present in the ER for its enzymatic function. This molecular chaperone has been named core 1 β1–3-Gal-T-specific molecular chaperone, Cosmc. Cosmc is a member of the type II transmembrane protein family. The role of Cosmc in C1Gal-T1 activity is in ensuring that the enzyme is properly folded within the ER. Loss of Cosmc activity results in C1Gal-T1 being degraded in the proteasome. Given these results it is possible, but as yet undemonstrated, that additional chaperones specific for other glycosyltransferases may exist.

The attachment of carbohydrates to proteins forming O-glycans serves multiple critical functions with respect to the structure and activity of the modified proteins. O-linked glycans are important for protein stability, modulation of enzyme activity, receptor-mediated signaling, immune function and immunity, protein-protein interactions, as well as many other functions. Mucin-type O-glycans are also important for binding water and are often found on the outer surfaces of tissues such as the gastrointestinal, urogenital and respiratory systems. The interaction of water with the mucin-type O-glycans results in the formation of a viscous solution or gel that forms a protective barrier harboring antibacterial properties.


O-Mannosylation

Incorporation of mannose into target proteins by attachment to Ser and/or Thr, referred to as O-mannosylation, was originally thought to only occur in fungi. However, it is now known that this particular form of protein glycosylation serves a critical function in mammals. In mammals, O-mannosylation has been identified on α-dystroglycan (α-DG) from nerve and muscle, chondroitin sulfate proteoglycans and total glycopeptides from brain tissue, and most recently on neuron-specific protein tyrosine phosphatase, receptor type, zeta 1 (PTPRZ1, also known as RPTPβ). Clinically, O-mannosylation of α-DG is the most significant as evidenced by the constellation of symptoms that result from defects in the processes of O-mannosylation.

Incorporation of mannose into O-glycans involves the transfer of mannose from GDP-activated mannose (GDP-Man) to dolichol phosphate, forming Dol-P-Man. Synthesis of Dol-P-Man takes place on the cytosolic face of the ER membrane. The subsequent transfer of the mannosyl residue from Dol-P-β-D-Man to Ser and Thr residues of secretory proteins is catalyzed by a family of enzymes called dolichyl-phosphate-D-mannose:protein O-mannosyltransferases (PMTs). An α-D-mannosidic linkage is formed through inversion of the anomeric configuration of the mannose. All PMT family members are classified as members of the glycosyltransferase 39 (GT39) family. The PMT family is divided into three subfamilies: PMT1, PMT2, and PMT4. In mammals, POMT1 (PMT4 subfamily member) and POMT2 (PMT2 subfamily member) are the known PMTs involved in O-mannosylation reactions. Evidence shows that the POMT1 and POMT2 proteins form a complex and that these complexes are crucial for mannosyltransferase activity. Since it occurs in the ER, the actual process of O-mannosylation is distinct from most other O-glycosylation reactions, which take place exclusively in the Golgi apparatus. After mannosylation of the target protein, further extension of the O-linked mannose residue takes place in the Golgi apparatus. The vast majority of mammalian O-mannosyl glycans represent variations of the tetrasaccharide NANAα2–3Galβ1–4GlcNAcβ1–2Manα1–Ser/Thr with different lengths (e.g., asialo) and variable fucose (α1,3-linked to GlcNAc) contents. In addition to the conserved tetrasaccharide, O-mannosyl glycans containing the human natural killer-1 (HNK-1) epitope [HSO4-3GlcAβ1–3Galβ1–4GlcNAcβ1–2Man-Ser/Thr] have been detected in significant amounts on PTPRZ1 (RPTPβ) from neuroblastoma cells.

To date, the best-studied O-mannosylated protein in humans is α-DG. It comprises two globular domains separated by a Ser/Thr-rich mucin-like region that is substantially O-mannosylated. α-DG is an essential component of the dystrophin-glycoprotein complex (DGC) in skeletal muscle. Highlighting the significance of O-mannosylated α-DG is the fact that most of the defects associated with impaired O-mannosylation can be explained by reduced function of α-DG. In humans, defective O-mannosylation is associated with a group of autosomal recessive muscular dystrophies termed congenital muscular dystrophies (CMD). In addition to muscle dysfunction, these CMD are also associated with variable brain and ocular abnormalities. The CMD are also referred to as secondary α-dystroglycanopathies since their common pathological feature is the hypoglycosylation of α-DG. CMD patients exhibit both clinical and genetic variability. The most severe CMD is Walker-Warburg syndrome (WWS) and, as a consequence of multiple malformations, WWS patients often die within the first year of life. Less severe CMD include muscle-eye-brain disease (MEB) and Fukuyama congenital muscular dystrophy (FCMD). Each of these disorders is characterized by CMD associated with severe brain malformations and abnormalities of the eye. The mildest forms of CMD may not present until adulthood, such as limb-girdle muscular dystrophy (LGMD), which manifests with neither brain nor eye abnormalities.

Mutations in six known or putative glycosyltransferase genes have been identified to cause various α-dystroglycanopathies, including the human POMT1, POMT2, and POMGnT1 genes. Each of these three genes is directly involved in the synthesis of O-mannosyl glycans linked to α-DG. Examination of numerous α-dystroglycanopathy patients has determined that mutations in either POMT1 or POMT2 occur in about 20% of all cases. Mutations in POMT1 have been shown to result in MEB and some milder forms of CMD such as congenital muscular dystrophy with mental retardation (CMD/MR) and LGMD type 2K (LGMD2K). Mutations in POMT2 have been detected in WWS patients, MEB-like patients, and CMD/MR patients. Additionally, a POMT2 mutation has been identified in a patient with an LGMD type 2N (LGMD2N) phenotype, a mild clinical form of CMD with no neurological involvement.


Hexosamine (O-GlcNAc) Signaling in Metabolism & Development

Hexosamine Biosynthesis

Multiple nuclear and cytoplasmic proteins, including transcription factors, cytoskeletal proteins, oncogenes, and kinases, are posttranslationally modified on serine and threonine residues with β-N-acetylglucosamine (O-GlcNAc).


Attachment of O-GlcNAc to insulin receptor substrate-1 (IRS-1). While no single consensus sequence is known to exist for the attachment of O-GlcNAc to Ser or Thr residues in proteins, the sequence shown is that derived from IRS-1 where the GlcNAc moiety is attached to Ser1046.

Until the discovery of O-GlcNAc modification (referred to as O-GlcNAcylation) on nuclear and cytoplasmic proteins, more than 20 years ago, it was believed that protein glycosylation was restricted to the luminal compartments of the secretory machinery and to the cell surface and extracellular matrix. Early studies on the subcellular localization of O-GlcNAc showed that it was highly concentrated at the nuclear envelope, particularly at the nuclear pore complex, as well as throughout the chromosomes. Additionally, several cytosolic and cytoskeletal proteins were also found to be O-GlcNAcylated. The gamut of O-GlcNAcylated proteins includes enzymes involved in the metabolism of amino acids, nucleotides (e.g. thymidine kinase), and carbohydrates (e.g. glucose-6-phosphatase), general metabolic processes, cell growth and maintenance (e.g. MYC, Sp1), DNA damage responses, intracellular transport, transcription (e.g. RNA polymerase II) and translation (e.g. eIF-5). The carboxy-terminal domain of a subpopulation of RNA polymerase II is extensively O-GlcNAcylated, and almost all RNA polymerase II transcription factors are O-GlcNAcylated. Thus far, more than 600 proteins have been shown to be O-GlcNAcylated.

The synthesis of O-GlcNAc occurs via the metabolic pathway called the hexosamine biosynthesis pathway, HBP. Synthesis of O-GlcNAc initiates from glucose. Upon entering cells, glucose is converted to glucose-6-phosphate (G6P) via the action of glucokinase/hexokinase. Most of the G6P formed is oxidized in the process of glycolysis. However, a portion of the G6P can be shuttled into either glycogen synthesis or the pentose phosphate pathway, depending upon the metabolic needs of the cell. During the process of glycolysis, phosphohexose isomerase converts G6P into fructose-6-phosphate (F6P). Whereas the majority of F6P enters glycolysis, 2–5% enters the HBP to generate uridine diphospho-N-acetylglucosamine (UDP-GlcNAc). The end product of the HBP, UDP-GlcNAc, is referred to as an activated nucleotide sugar. The nucleotide sugar is an essential intermediate in the synthesis of a wide array of complex glycans.

The first and rate-limiting enzyme in the HBP is glutamine:fructose-6-phosphate aminotransferase 1, GFAT (also abbreviated GFPT1). GFAT utilizes glutamine and F6P for the formation of glucosamine-6-phosphate (GlcN6P) and glutamate. It is possible for glucosamine to enter the HBP directly by conversion to GlcN6P via the action of hexokinase, thus bypassing the rate-limiting step of the pathway. This has been shown to be true when glucosamine is infused into rats or mice. GlcN6P is acetylated forming N-acetylglucosamine-6-phosphate (GlcNAc6P) from acetyl-CoA via the action of glucosamine-6-phosphate N-acetyltransferase (gene symbol GNPNAT1). This enzyme is more commonly referred to as EMeg32 due to the fact that it was clone 32 originally isolated from a screen for genes specific for precursors of Erythroid and Megakaryocytic lineages. GlcNAc6P is converted to GlcNAc-1-phosphate (GlcNAc1P) via the action of phosphoacetylglucosamine mutase (PAGM). GlcNAc1P is finally nucleotide activated via the action of UDP-N-acetylglucosamine pyrophosphorylase (UAP1) generating UDP-GlcNAc. Like many metabolic pathways, the product, in this case UDP-GlcNAc, serves to regulate the rate of flux through the pathway by feedback inhibition of the rate-limiting enzyme, GFAT. Once synthesized the UDP-GlcNAc can be converted into other types of nucleotide sugars or directly incorporated into a variety of carbohydrate modified macromolecules such as lipids and proteins.
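The order of reactions in the HBP, including the glucosamine bypass of the rate-limiting step, can be laid out as data and traced programmatically. The Python sketch below is only an illustration of the pathway as described here: the metabolite and enzyme labels follow the text, while the function name is hypothetical and cofactors (glutamine, acetyl-CoA, UTP) are omitted.

```python
# Each step is (substrate, enzyme, product), using the names given in the text.
ENTRY_STEPS = [
    ("glucose", "hexokinase/glucokinase", "G6P"),
    ("G6P", "phosphohexose isomerase", "F6P"),
    ("glucosamine", "hexokinase", "GlcN6P"),  # bypasses GFAT
]
HBP_STEPS = [
    ("F6P", "GFAT", "GlcN6P"),  # first and rate-limiting step
    ("GlcN6P", "GNPNAT1 (EMeg32)", "GlcNAc6P"),
    ("GlcNAc6P", "PAGM", "GlcNAc1P"),
    ("GlcNAc1P", "UAP1", "UDP-GlcNAc"),
]


def route_to_udp_glcnac(start: str) -> list[str]:
    """List, in order, the enzymes acting between a starting metabolite and UDP-GlcNAc."""
    next_step = {s: (enzyme, p) for s, enzyme, p in ENTRY_STEPS + HBP_STEPS}
    enzymes, current = [], start
    while current != "UDP-GlcNAc":
        if current not in next_step:
            raise KeyError(f"no route from {current!r} in this sketch")
        enzyme, current = next_step[current]
        enzymes.append(enzyme)
    return enzymes


print(route_to_udp_glcnac("glucose"))
print(route_to_udp_glcnac("glucosamine"))
```

Comparing the two routes shows that GFAT appears only when starting from glucose, which is the sense in which infused glucosamine bypasses the rate-limiting step.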


The hexosamine biosynthesis pathway (HBP). The HBP can serve as a glucose sensor since the synthesis of UDP-GlcNAc relies on the incorporation of products from glucose, amino acid (glutamine), fatty acid (acetyl-CoA), and nucleotide (uridine) metabolism. Although the majority of the glucose taken up by the cell is committed to glycolysis, glycogen synthesis, and the pentose phosphate pathway, in general about 2–5% of it enters the HBP for the formation of UDP-GlcNAc. The UDP-GlcNAc then serves as the substrate for a wide array of glycosylations, including O-GlcNAc modification. The first and rate-limiting enzyme in the HBP is glutamine:fructose 6-phosphate aminotransferase 1 (GFAT; also abbreviated GFPT1). Like other metabolic rate-limiting enzymes, GFAT is negatively regulated by the end product of the pathway, UDP-GlcNAc. EMeg32 is glucosamine 6-phosphate N-acetyltransferase (also abbreviated GNPNAT1). The derivation EMeg32 is from the fact that the gene was originally isolated from a screen for genes specific for precursors of Erythroid and Megakaryocytic lineages and was clone 32. PHI = phosphohexose isomerase. PAGM = phosphoacetylglucosamine mutase.

Hexosamine Cycling

Similar to serine and threonine phosphorylation, which can cycle between phosphorylation and dephosphorylation, thereby effecting fluctuating regulation of enzyme activity, O-GlcNAc modification at serine and threonine residues also cycles. The cycling of O-GlcNAc is a nutrient-responsive, post-translational modification that, like phosphorylation, impacts target protein activity. The enzymes responsible for the addition and removal of O-GlcNAc have been identified and cloned from several model organisms. Additionally, upstream modulators that control the level of the donor sugar, UDP-GlcNAc, have also been identified. Studies of hexosamine cycling have been greatly aided by a variety of inhibitors and sensors of the process. Research on O-GlcNAc cycling has identified this modification as a key regulator of nutrient sensing. Additionally, disruption of the HBP has a profound impact on diseases of nutrient sensing, such as type 2 diabetes, and neurodegeneration. That O-GlcNAcylation is a critical protein modification is demonstrated by the fact that blocking its synthesis or attachment to target proteins causes embryonic stem cell lethality in mice.

O-GlcNAc cycling at serine and threonine residues is maintained by the action of two enzymes. Addition of GlcNAc at serine and threonine residues is referred to as O-GlcNAcylation and is catalyzed by the enzyme UDP-N-acetylglucosamine:polypeptide β-N-acetylglucosaminyltransferase or more commonly just O-GlcNAc transferase, OGT. Removal of the O-GlcNAc modification is catalyzed by β-N-acetylglucosaminidase or more commonly O-GlcNAcase, OGA. The OGA gene is also referred to as the MGEA5 gene due to the fact that it was originally isolated as the meningioma-expressed antigen 5 gene.

Whereas there are hundreds of different kinases and phosphatases that add and remove phosphate, respectively, from serine and threonine residues in various target proteins, there is but a single OGT and single OGA gene in the human genome. OGT is located on the X-chromosome at Xq13.1 which places the gene in close proximity to the Xist locus and the X-inactivation center (XIC). The close proximity of the OGT gene to known heterochromatin boundaries may have significance for the regulation of its expression. The OGA gene is located on chromosome 10q24.32.

The HBP integrates the nutrient status of the cell by utilizing glucose, acetyl-CoA, glutamine, and UTP to produce UDP-GlcNAc. In turn, OGT transmits this nutrient information throughout the cell by changing the level of O-GlcNAcylated target proteins. While UDP-GlcNAc is used throughout the secretory pathway as a building block for the synthesis of N-linked and O-linked glycans, as well as the assembly of GPI-anchors, nuclear and cytosolic O-GlcNAc-modified proteins appear to be particularly sensitive to physiological flux of the UDP-GlcNAc pools.

How then does a complex array of protein substrates in a variety of different tissues come under the concerted control of a single enzyme? It turns out that OGT has a modular structure generated by tetratricopeptide repeat (TPR) domains. These domains produce a superhelical structure containing an asparagine ladder that is critical for protein recognition. Alternative splicing produces multiple isoforms, each with a different number of TPRs. This difference allows each isoform to modify a select subset of substrates. Alternative splicing also targets OGT to different subcellular compartments. Additionally, a wide array of OGT-interacting proteins (OIP) has been identified that provide another layer of regulation by bridging the interaction of the enzyme with its target substrates.

Hexosamine Signaling in Development

As indicated above, O-GlcNAc cycling is essential for viability since its disruption results in extreme developmental defects, most often leading to death of the developing embryo. Tissue-selective knockout of the OGT gene has revealed profound changes in all cell types examined. For example, within the brain, neuron-specific knockout of OGT leads to mice that are smaller at birth and exhibit aberrant locomotor activity. EMeg32, which is essential for the efficient production of UDP-GlcNAc, is also a critical enzyme controlling early development. Knocking out EMeg32 in mice results in neonatal lethality and the animals exhibit pronounced developmental delay.

Hexosamine cycling is also known to be critical in the regulation of transcriptional repression. Using Drosophila melanogaster as a model organism it has been shown that the polycomb group proteins (PcG) repress the transcription of HOX genes as a multimeric complex referred to as the PcG repressive complex (PRC). This PRC-mediated transcriptional repression effects control over anterior-posterior segmentation of the developing larva. The PcG repression of HOX genes thereby prevents homeotic transformations. The PcG family proteins are conserved in mammals and carry out similar transcriptional repressive functions. The unifying theme of PcG function is to act as an epigenetic regulator of cell fate and maintain cellular identity through many rounds of cell division. It turns out that in Drosophila OGT is the same as the super sex combs (sxc) gene, which is an essential component of PcG complexes in this organism. The lethal phenotype associated with sxc mutations can be rescued by expression of the human OGT gene. Therefore, OGT plays an integral role in the ability of the PcG proteins to repress genes appropriately. In addition, since the C-terminal domain (CTD) of RNA polymerase II is extensively O-GlcNAcylated, the cycling could alter the activity of RNA Pol II at the promoter of many of these same PRC-regulated genes.

With respect to epigenetic changes in gene expression it is interesting that the OGT gene is located very close to the Xist locus, which encodes the RNA involved in X-chromosome inactivation. In addition, the OGT gene is one of a small number of genes that are highly regulated in embryonic stem cells during the process of X-inactivation, owing to its close proximity to Xist. Experiments in Drosophila demonstrate that PcG regulates expression of the NK cluster on chromosome 10 in which the OGA gene resides. Since OGT regulates the activity of the PRC, OGT could, in turn, control the expression of OGA. Therefore, OGT acting through PcG would repress OGA expression and bias the cell towards sustained levels of O-GlcNAc.


Hexosamine Signaling and Insulin Resistance

Numerous proteins involved in insulin signaling and the downstream targets of these signaling cascades have been shown to be O-GlcNAcylated. With respect to insulin receptor signaling proteins, IRS-1, PI3K, PKB/Akt, PDK1, and GSK3β are all known to be O-GlcNAcylated. These modifications have all been observed in adipocytes, which are a major target for the actions of insulin. Insulin-stimulated glucose uptake into adipocytes occurs via insulin-mediated mobilization of GLUT4 to the plasma membrane. Increased glucose uptake in response to insulin can, therefore, significantly modify the flux through the HBP. Evidence linking the HBP to insulin resistance in adipocytes was first reported at least 20 years ago. Experiments using cultured rat adipocytes demonstrated that chronic exposure to both insulin and glucose was required for the adipocytes to become insulin-resistant. This is now a common theme underlying insulin resistance in other insulin-sensitive tissues such as skeletal muscle. For more information on insulin resistance go to the Insulin Function page.

In these early experiments it was shown that the impairment in insulin-stimulated glucose uptake, under hyperglycemic and hyperinsulinemic conditions, was exclusively dependent on the presence of the amino acid glutamine. Remember that glutamine is required as a substrate for GFAT, the rate-limiting enzyme in the HBP. Inhibition of GFAT activity was observed in the hyperglycemic and hyperinsulinemic conditions, likely due to feedback inhibition by UDP-GlcNAc, as the HBP product was shown to accumulate in the treated cells. However, if GFAT was inhibited with various amidotransferase inhibitors, the hyperglycemia-induced insulin resistance was prevented. Additionally, if cells were treated with glucosamine, which enters the HBP after the GFAT-catalyzed reaction, there was a greater reduction in insulin-mediated glucose uptake compared to the hyperglycemic condition. As expected, since GFAT is bypassed, the glucosamine-induced insulin resistance does not require glutamine. Although glucose and glutamine metabolism are key inducers of flux through the HBP, free fatty acids (FFA) and uridine are also potent modulators of the HBP.

Experiments in whole animals, as opposed to cell culture, have provided additional direct evidence that excess flux through the HBP leads to modulation of insulin sensitivity in adipocytes. When GFAT is overexpressed in mice under the control of a GLUT4 promoter the animals develop a classical insulin-resistant phenotype with hyperinsulinemia and a reduction in whole-body glucose disposal rate. Because GLUT4 is highly expressed in adipose tissue and skeletal muscle, two major insulin-responsive tissues, it is not surprising that defective whole-body glucose disposal was observed. Elevation in serum leptin level was also observed in these GFAT-overexpressing mice. Interestingly, muscle explants from GLUT4-GFAT mice showed normal insulin-stimulated glucose uptake. This latter observation is strong evidence that adipocytes play a major regulatory role in the HBP-mediated whole-body insulin resistance.

Another strain of mice used to study the role of the HBP in insulin sensitivity expresses GFAT specifically in adipose tissue, with expression driven by an aP2 (adipocyte lipid binding protein) promoter. Adipose tissue-restricted elevations in O-GlcNAc levels are detected in these mice and this is associated with the development of whole-body insulin resistance. The phenotype of these animals is characterized by a reduction in both glucose disposal rate and skeletal muscle glucose uptake. An increase in serum leptin and a decrease in serum adiponectin levels were also found in these mice.

As pointed out above, numerous proteins downstream of the insulin receptor that are critical to insulin-mediated signal transduction are known to be O-GlcNAcylated. Therefore, it is not difficult to assume that HBP-mediated glucose desensitization will occur at multiple stages, in particular through insulin-mediated signal transduction. Under high glucose-induced insulin resistance, there is a reduction in insulin-stimulated phosphorylation of PKB/Akt. There has been some discrepancy in determining precisely how HBP flux affects PKB/Akt phosphorylation in response to insulin binding its receptor. Recent research has shown that when cells are exposed to chronically high glucose and insulin there is a concomitant reduction in PIP3, which is a product of activated PI3K, a target of the activated insulin receptor. This reduction in PIP3 levels is correlated with an increase in PTEN (phosphatase and tensin homolog deleted on chromosome 10) levels. PTEN antagonizes PI3K signaling by dephosphorylating PIP3. In addition, it was shown that there is an increase in IRS-1 phosphorylation on Ser636 and Ser639. Since rapamycin treatment inhibits the alteration of PIP3 and PTEN levels under insulin-resistant conditions, it is believed that mammalian target of rapamycin complex 1 (mTORC1) is involved in negatively regulating the IRS-1/PI3K/Akt signaling cascade downstream of the insulin receptor. The sites on IRS-1 seen to be phosphorylated under chronic hyperglycemic and hyperinsulinemic conditions (S636/S639) are known to be substrates of mTORC1.

The regulation of insulin-stimulated GLUT4 translocation is also affected by changes in the flux rate through the HBP. Several cytoskeletal proteins involved in mobilization of GLUT4 to the plasma membrane are known to be O-GlcNAcylated. In addition, several of the proteins involved in the translocation process are targets of signaling proteins downstream of the insulin receptor. In cell culture models of both glucose- and glucosamine-induced insulin-resistance a reduction in the acute insulin-stimulated GLUT4 translocation is associated with a significant alteration in membrane redistribution of vesicle proteins such as t-(target membrane) SNARE, v-(vesicle membrane) SNARE and Munc18c (mammalian uncoordinated). SNARE stands for soluble-N-ethylmaleimide-sensitive factor attachment protein receptor. Munc18c is a negative regulator of both t- and v-SNAREs. Munc18c is known to be a target for O-GlcNAcylation. These results suggest a direct involvement of excess HBP flux in desensitizing the fusion between GLUT4-containing intracellular vesicles and the plasma membrane.

In addition to GLUT4 translocation, insulin-mediated PI3K and PKB/Akt activation also stimulates glycogen synthesis. The net effect is to balance the level of glucose metabolism in response to excess glucose influx. Insulin-dependent glycogen synthesis is mediated via the activation of glycogen synthase (GS). Like other downstream targets of the insulin receptor, GS regulation involves a PKB/Akt-mediated inhibition of GSK3β, which normally phosphorylates and inhibits GS. The insulin-stimulated increase in glycogen synthesis decreases the pool of G6P and subsequently F6P, thereby restricting flux through the HBP. PKB/Akt activation also leads to reduced dephosphorylation of GS via protein phosphatase 1 (PP1). Exposing cells to either high glucose or glucosamine results in a reduction in insulin-stimulated GS activity. Additionally, GS is a known O-GlcNAcylated protein and, as might be expected, it has been shown that GS becomes more resistant to dephosphorylation by PP1 under conditions of excess HBP flux.

While increased global O-GlcNAc levels are implicated in the development of insulin resistance, OGT is also regulated by insulin in adipocyte cell cultures. OGT is tyrosine-phosphorylated by the insulin receptor upon acute insulin stimulation and this phosphorylation increases the activity of the enzyme. In addition, there is an observed shift in OGT localization from the nucleus to the cytosol in response to insulin stimulation. This translocation of OGT toward the plasma membrane in response to acute insulin stimulation is PI3K-dependent.

In summary, given that genetic and pharmacologic elevation in O-GlcNAc levels in cultured adipocytes and mouse models is associated with insulin-resistant phenotypes, it is likely that reducing O-GlcNAc levels in adipocytes should reverse the HBP-induced insulin resistance. A proof-of-concept experiment in transgenic mice (the insulin-resistant db/db mouse model, which harbors a mutated leptin receptor) showed that overexpression of OGA, which reduces the level of O-GlcNAcylation, significantly improves whole-body glucose tolerance and insulin sensitivity. This result suggests that lowering O-GlcNAc levels in vivo could be of significant clinical benefit.


O-GlcNAcylation & Glucose Homeostasis

Numerous lines of evidence, as discussed above, demonstrate that overexpression of the rate-limiting enzyme of the HBP, GFAT, leads to peripheral insulin resistance. In addition, overexpression of O-GlcNAc transferase (OGT) in skeletal muscle and fat results in elevated circulating insulin levels and insulin resistance. Insulin resistance is a major contributing factor to the hyperglycemia typical of type 2 diabetes. However, additional factors exacerbate the hyperglycemic consequences of insulin resistance such as aberrantly regulated hepatic gluconeogenesis. Hyperglycemia is also associated with O-GlcNAcylation of transcription factors and cofactors such as FOXO1, FOXO3, CREB-regulated transcription coactivator 2 (CRTC2), and PGC-1α that are involved in the modulation of the expression of gluconeogenic genes. PGC-1α is a key transcriptional coactivator that regulates mitochondrial biogenesis as well as hepatic gluconeogenesis.

As indicated above, humans express a single OGT gene as well as a single OGA gene involved in the attachment and removal of O-GlcNAc from target proteins, respectively. Currently, even though hundreds of proteins and enzymes are known to be modified by O-GlcNAcylation, the details of how OGT and OGA achieve substrate specificity are largely unknown. It has been proposed that OGT recognizes numerous different substrates primarily through the tandem tetratricopeptide repeats (TPRs). Indeed, OGT isoforms with different numbers of TPRs show different substrate specificities.

Recent evidence has shown that OGT regulates gluconeogenesis through O-GlcNAcylation of PGC-1α via interaction with host cell factor C1 (HCF1, also HCFC1). HCF1 is an essential transcriptional cofactor shown to be required for herpes virus gene expression, cell-cycle regulation, and stem cell growth. In addition to interacting with OGT, HCF1 is also highly O-GlcNAcylated. O-GlcNAcylation of PGC-1α stabilizes the transcriptional coactivator by recruiting the deubiquitinase BRCA1-associated protein 1 (BAP1), thereby inhibiting its ubiquitination. The BAP1 gene encodes a nuclear ubiquitin carboxy-terminal hydrolase (UCH). Stabilization of PGC-1α thereby results in enhanced gluconeogenesis.

The term glucose effectiveness describes the ability of high glucose, by itself, to suppress endogenous glucose production. This phenomenon has an important role in overall glucose homeostasis. One important pathway through which glucose determines its own production involves the regulation of OGT/HCF1 complex formation and subsequent O-GlcNAcylation and stabilization of PGC-1α, resulting in enhanced gluconeogenesis. Compared to no glucose, low levels of glucose stimulate gluconeogenesis, whereas hyperglycemia inhibits gluconeogenesis. Under euglycemic conditions, OGT interaction with HCF1 leads to O-GlcNAcylation of PGC-1α and expression of gluconeogenic genes. Thus, the OGT/HCF1/PGC-1α pathway is critical in maintaining normal glucose levels and mediating glucose effectiveness. Conversely, the hyperglycemia typical in type 2 diabetes leads to hyperactivation of the OGT/HCF1/PGC-1α pathway, which diminishes glucose effectiveness. Pharmacological inhibition of this pathway represents a potential novel strategy for treating the hyperglycemia associated with type 2 diabetes.


Lysosomal Targeting of Enzymes

Enzymes that are destined for the lysosomes (lysosomal enzymes) are directed there by a specific carbohydrate modification. During transit through the Golgi apparatus a residue of GlcNAc-1-phosphate (GlcNAc-1-P) is added to the carbon-6 hydroxyl group of one or more specific mannose residues that have been added to these enzymes. The GlcNAc is activated by coupling to UDP and is transferred by UDP-GlcNAc:lysosomal enzyme GlcNAc-1-phosphotransferase (GlcNAc-phosphotransferase), yielding a phosphodiester intermediate: GlcNAc-1-P-6-Man-protein. A second reaction (catalyzed by GlcNAc 1-phosphodiester-N-acetylglucosaminidase) removes the GlcNAc leaving mannose residues phosphorylated in the 6 position: Man-6-P-protein. A specific Man-6-P receptor (MPR) is present in the membranes of the Golgi apparatus. Binding of Man-6-P to this receptor targets proteins to the lysosomes.

Two distinct MPRs have been identified and both are members of the P-type lectin family. Both are type I integral membrane glycoproteins that contain an N-terminal extracellular domain, a single transmembrane domain and a C-terminal cytoplasmic domain. One receptor is large, with a molecular weight of approximately 300 kDa; the other is smaller, with a molecular weight of approximately 46 kDa. Structural similarities between these two receptors indicate they are derived from a single ancestral gene, with the larger receptor arising through multiple gene duplications. The extracellular portion of the larger receptor contains 15 repeating elements, each of which is highly similar to the extracellular domain of the smaller receptor. Both receptors exist as dimers embedded in the membrane.

The large receptor binds two moles of Man-6-P and the smaller binds one mole of Man-6-P per subunit, thus 4 and 2 moles of Man-6-P per dimer, respectively. The bovine and murine versions of the smaller receptor require divalent cations for ligand binding and thus the receptor has been termed the cation-dependent Man-6-P receptor (CD-MPR). However, the human counterpart may not require cations for its activity. The larger receptor does not require divalent cations for ligand binding and is therefore commonly referred to as the cation-independent Man-6-P receptor (CI-MPR). However, the CI-MPR has also been shown to bind the nonglycosylated polypeptide hormone insulin-like growth factor 2 (IGF-2), and as such the larger MPR is more frequently identified as IGF-2/MPR. The IGF-2/MPR is available at the cell surface and its role in binding IGF-2 is to target this hormone for degradation in the lysosomes. In addition to IGF-2, the IGF-2/MPR has been shown to bind a diverse array of Man-6-P-containing proteins as well as several nonglycosylated proteins. Although IGF-2/MPR and CD-MPR exhibit distinct activities, both receptors function to target newly synthesized lysosomal enzymes to the lysosomes.
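
The binding stoichiometry described above reduces to a per-subunit count multiplied by the two subunits of each dimer. A minimal Python sketch of that arithmetic follows; the dictionary layout and the function name are illustrative assumptions, not part of any published tool.

```python
# Man-6-P binding sites per receptor subunit, as stated in the text.
# The data layout and names here are illustrative assumptions.
SITES_PER_SUBUNIT = {
    "CI-MPR": 2,  # large (~300 kDa) receptor, also called IGF-2/MPR
    "CD-MPR": 1,  # small (~46 kDa) receptor
}
SUBUNITS_PER_DIMER = 2  # both receptors exist as dimers in the membrane


def man6p_per_dimer(receptor: str) -> int:
    """Return moles of Man-6-P bound per mole of receptor dimer."""
    return SITES_PER_SUBUNIT[receptor] * SUBUNITS_PER_DIMER


for name in SITES_PER_SUBUNIT:
    print(name, man6p_per_dimer(name))
```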


Glycosylphosphatidylinositol Anchored Proteins (GPI-Linkage)

Many membrane-associated glycoproteins belong to neither the peripheral nor the transmembrane class. These glycoproteins are tethered to the outer leaflet of the plasma membrane via a glycosylphosphatidylinositol linkage to their C-termini. This type of membrane attachment is referred to as a GPI linkage and the glycoproteins are termed glypiated proteins. The basic structure of the GPI linkage is shown in the Figure below. This class of membrane-anchored protein was initially discovered due to their release from cells following treatment with crude bacterial phospholipase C (PLC). The first mammalian glypiated proteins characterized were alkaline phosphatase and 5'-nucleotidase. One clinically important glypiated protein is the erythrocyte surface glycoprotein decay-accelerating factor (DAF; also known as CD55 = cluster of differentiation protein 55). DAF prevents erythrocyte lysis by complement. Other important GPI-linked proteins are the enzyme acetylcholinesterase, the cell adhesion molecule N-CAM-120 (neural cell adhesion molecule-120) and the T-cell markers Thy-1 and LFA-3 (lymphocyte function associated antigen-3).

Structure of the GPI linkage

R1 is most often an unsaturated fatty acid but can be just -OH; R2 is either a fatty acid, alkyl or alkenyl group, can also be ceramide instead of glycerolipid, and can also just be -OH; R3 is most often palmitic acid attached to the C2 carbon of inositol; R4 and R9 are ethanolamine phosphate or -OH; R5, R6, R7, R8 and R10 are carbohydrate substituents or -OH.

General structure of the GPI anchor

Virtually all protein-linked GPI anchors share the common core structure that is depicted in the Figure above. The use of the GPI-anchor has been found in a functionally diverse array of mammalian proteins that includes hydrolytic enzymes, adhesion molecules, complement regulatory proteins, and receptors. It has been found that in several mammalian mRNAs encoding GPI-anchored proteins, the process of alternative splicing results in the expression of transmembrane and/or soluble forms of the same gene product. In addition, some of these variants are developmentally regulated. For example, neural cell adhesion molecule (NCAM) exists in GPI-anchored and soluble forms when expressed in muscle and in GPI-anchored and transmembrane forms when expressed in brain. The structures of GPI anchors are quite diverse, depending on both the protein to which they are attached and the organism in which they are synthesized.

The biosynthesis of GPI anchors occurs via a complex series of three reaction processes. Initially there is the preassembly of a GPI precursor in the ER membrane. This is followed by the attachment of the GPI to the newly synthesized protein within the lumen of the ER followed by cleavage of the carboxy-terminal GPI-addition signal peptide. Finally the GPI-anchor undergoes lipid remodeling and/or carbohydrate side-chain modifications in the ER and the Golgi. Some GPI-anchored proteins possess two fatty acid chains (e.g. diacylglycerol, alkylacylglycerol, alkenylacylglycerol, or ceramide) which allows for a stable association of the protein with the lipid bilayer.

GPI-anchored proteins play significant roles in the function of the immune system. For example, transmembrane signaling is known to occur as a result of the cross-linking of GPI-anchored proteins with antibody followed by clustering with a second antibody on various cells such as leukocytes. The types of intracellular responses initiated by GPI protein-induced signaling include tyrosine phosphorylation, cytokine production, oxidative burst, increased intracellular Ca2+ release and cell proliferation.


Clinical Significances of Glycoproteins

Glycoproteins on cell surfaces are important for communication between cells, for maintaining cell structure and for self-recognition by the immune system. The alteration of cell-surface glycoproteins can, therefore, produce profound physiological effects, of which several are listed below.

1. The ABO blood group antigens are the carbohydrate moieties of glycolipids on the surface of cells as well as the carbohydrate portion of serum glycoproteins. When present on the surface of cells the ABO carbohydrates are linked to sphingolipid and are therefore of the glycosphingolipid class. When the ABO carbohydrates are associated with protein in the form of glycoproteins they are found in the serum and are referred to as the secreted forms. Some individuals produce the glycoprotein forms of the ABO antigens while others do not. This property distinguishes secretors from non-secretors, a property that has forensic importance such as in cases of rape. For more information on blood group antigens, including ABO, visit the blood group antigen gene mutation database at NCBI.

Structures of the ABO blood group antigens

R represents the linkage to protein in the secreted forms or to sphingolipid (ceramide) in the cell-surface bound form; open square = GlcNAc, open diamond = galactose, filled square = fucose, filled diamond = GalNAc. The linkage in the glycolipid form may include a glucose in a β-1,3 or β-1,4 linkage to the initial galactose residue.

Structure of the ABO blood group carbohydrates

2. The truncation of erythrocyte surface glycoproteins leads to cell clumping, as in congenital dyserythropoietic anemia type II, also referred to as HEMPAS (hereditary erythroblastic multinuclearity with positive acidified-serum test).

3. Several viruses, bacteria and parasites have exploited the presence of cell-surface carbohydrates, principally associated with protein (glycoproteins), using them as portals of entry into the cell.

a. Human immunodeficiency virus (HIV), the causative agent of AIDS, gains entry into cells of the immune system by attaching to a class of cellular receptors known as the chemokine receptors, most notably CXCR4 and CCR5.

b. Members of the poxvirus family of viruses gain entry into cells, most frequently migratory leukocytes, by attaching to chemokine receptors including CCR1, CCR5 and CXCR4.

c. Dystroglycan (DG) is a component of the dystrophin-glycoprotein complex. It is a laminin receptor encoded by a single gene and cleaved by post-translational processing into two proteins, peripheral membrane α-DG and transmembrane β-DG. α-DG interacts with laminin-2 in the basal lamina and β-DG binds to dystrophin-containing cytoskeletal proteins in muscle and peripheral nerves. DG is involved in agrin- and laminin-induced acetylcholine receptor clustering at neuromuscular junctions, morphogenesis, early development, and the pathogenesis of muscular dystrophies. Evidence has shown that α-DG present on Schwann cell membranes is the receptor for Mycobacterium leprae and also serves as the receptor for the arenavirus class of pathogens. Arenaviruses cause hemorrhagic fever in humans. Lymphocytic choriomeningitis virus (LCMV), Lassa fever virus (LFV), Oliveros and Mobala (all members of the arenavirus family) all bind to α-DG. The specificity of this interaction was demonstrated by the resistance to LCMV infection of cells harboring a null mutation in DG.

d. Rhinoviruses utilize attachment to ICAM-1 (intercellular adhesion molecule-1) to gain entry into cells.

e. The pathogenic human parvovirus, B19, attaches to the erythrocyte-specific cell-surface globoside identified as erythrocyte P antigen to infect erythrocytes.

f. The malarial parasite Plasmodium vivax binds to the erythrocyte chemokine receptor known as the Duffy blood group antigen (also known as the erythrocyte receptor for interleukin-8) to infect erythrocytes.

g. The MNS blood group system is a well-characterized set of erythrocyte surface antigens that represent the variable carbohydrate modifications of the trans-membrane glycoprotein, glycophorin. Glycophorin is the cellular receptor for influenza virus as well as the receptor for erythrocyte invasion by the malarial parasite Plasmodium falciparum.

h. Helicobacter pylori is the bacterium responsible for chronic active gastritis and gastric and duodenal ulcers; it is also the causative agent for one of the most common forms of cancer in humans, gastric adenocarcinoma. This bacterium attaches to the Lewis blood group antigen on the surfaces of gastric mucous cells.

i. Rabies virus binds to cells through interactions with neural cell adhesion molecule (N-CAM).

j. Human herpesvirus 6 (HHV-6) infection occurs in virtually all persons within the first 2 years of life and persists the entire lifetime. In immunocompromised patients HHV-6 causes opportunistic infections and is the causative agent of exanthema subitum. HHV-6 has been linked to multiple sclerosis and to the progression of AIDS. The cellular receptor for HHV-6 is the cell-surface type-I glycoprotein, CD46.

4. Defects in the proper targeting of glycoproteins to the lysosomes can also lead to clinical complications. Deficiencies in the enzyme responsible for the transfer of GlcNAc-1-P to Man residues in lysosomal enzymes (GlcNAc phosphotransferase) lead to the formation of dense inclusion bodies in fibroblasts. Two disorders related to deficiencies in the targeting of lysosomal enzymes are termed I-cell disease (mucolipidosis II) and pseudo-Hurler polydystrophy (mucolipidosis III, also called mucolipidosis-HI). I-cell disease is characterized by severe psychomotor retardation, skeletal abnormalities, coarse facial features, painful restricted joint movement, and early mortality. Pseudo-Hurler polydystrophy is less severe; it progresses more slowly, and afflicted individuals live to adulthood.



Clinical Significance of Defective Glycoprotein Degradation

The proper degradation of glycoproteins has important medical relevance. Degradation occurs within lysosomes and requires specific lysosomal hydrolases, termed glycosidases. Exoglycosidases remove sugars sequentially from the non-reducing end and exhibit restricted substrate specificities. In contrast, endoglycosidases cleave carbohydrate linkages from within and exhibit broader substrate specificities. Several inherited disorders involving the abnormal storage of glycoprotein degradation products have been identified in humans. These disorders result from defects in the genes encoding specific glycosidases, leading to incomplete degradation and subsequent over-accumulation of partially degraded glycoproteins. As a general class, such disorders are known as lysosomal storage diseases. Numerous proteins and sphingolipids harbor similar carbohydrate modifications. The enzymes that remove these sugar residues are the same for both glycoproteins and glycolipids, and as such there are often overlapping phenotypes in diseases that were originally identified as being caused by defects in glycoprotein degradation or glycolipid degradation.
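
Because exoglycosidases act one residue at a time from the non-reducing end, loss of a single enzyme halts degradation at the first residue that enzyme would have removed, and everything beyond that point accumulates. The toy Python model below sketches this logic only. The linear four-residue chain is a hypothetical simplification (real N-linked glycans are branched), the function and variable names are assumptions made for illustration, and the enzyme-to-residue pairings are limited to enzymes named in this section.

```python
# Toy model: exoglycosidases trim a glycan one residue at a time from the
# non-reducing end. The linear chain is a hypothetical simplification.
ENZYME_FOR_RESIDUE = {
    "Sia": "neuraminidase",
    "Gal": "beta-galactosidase",
    "GlcNAc": "beta-hexosaminidase",
    "Man": "alpha-mannosidase",
}


def degrade(chain, missing_enzyme=None):
    """Trim residues from the non-reducing end (index 0) until blocked.

    Returns the undegraded remainder; an empty list means full degradation.
    """
    remaining = list(chain)
    while remaining:
        enzyme = ENZYME_FOR_RESIDUE[remaining[0]]
        if enzyme == missing_enzyme:
            break  # the blocked residue shields everything behind it
        remaining.pop(0)
    return remaining


glycan = ["Sia", "Gal", "GlcNAc", "Man"]  # non-reducing end first
print(degrade(glycan))  # complete degradation
print(degrade(glycan, missing_enzyme="beta-galactosidase"))  # stored fragment
```

In this sketch a β-galactosidase deficiency still allows removal of the terminal sialic acid, but the galactose-capped remainder persists, which mirrors the over-accumulation of partially degraded glycoproteins described above.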

The following Figure shows the locations of the actions of several glycosidases involved in glycoprotein metabolism. The structures of the carbohydrates in a typical complex oligosaccharide cluster are included (see Figure above of the 3 major classes of glycoprotein). Note, however, that when the sugars are linked as they would be by the indicated bonds (e.g. the α-2,3 or α-2,6 linkage indicated for sialic acid to galactose) there would be loss of H2O. The bonds are indicated for each linkage by the solid line between structures. Enzyme names are in green and diseases associated with defects in the indicated enzymes are in blue. The Table below the Figure lists some of these diseases as well as the affected enzyme and classic symptoms of the disease. Note that, as indicated in the above paragraph, many lysosomal storage diseases (e.g. Tay-Sachs) resulting from defective enzymes that metabolize both glycolipids and glycoproteins are defined by one or the other defective pathway (see the Sphingolipids page for more information).

Pathway for degradation of glycoproteins


Disease | Enzyme Deficiency | Symptoms/Comments
Aspartylglucosaminuria | aspartylglucosaminidase (N-aspartyl-β-glucosaminidase) | progressive mental retardation, delayed speech and motor development, coarse facial features
β-Mannosidosis | β-mannosidase | primarily neurological defects, speech impairment
α-Mannosidosis | α-mannosidase | mental retardation, dysostosis multiplex, hepatosplenomegaly, hearing loss, delayed speech
GM1 Gangliosidosis | β-galactosidase | also identified as a glycosphingolipid storage disease or lysosomal storage disease
Sandhoff disease | β-hexosaminidases A and B | also identified as a glycosphingolipid storage disease or lysosomal storage disease
Sialidosis (also identified as Mucolipidosis I) | neuraminidase (sialidase) | myoclonus, congenital ascites, hepatosplenomegaly, coarse facial features, delayed mental and motor development
Fucosidosis | α-fucosidase | progressive motor and mental deterioration, growth retardation, coarse facial features, recurrent sinus and pulmonary infections
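
The sequential trimming by exoglycosidases described in this section, and the way a single missing enzyme leaves partially degraded material behind, can be illustrated with a small toy model. This is only a sketch under simplifying assumptions: the short sugar labels, the function name degrade and the one-enzyme-per-sugar pairing are illustrative, linkage specificity and endoglycosidases are ignored, and mannose is mapped only to α-mannosidase although a β-mannosidase also exists.

```python
# Toy model: exoglycosidases trim a glycan one sugar at a time, starting
# at the non-reducing end. Linkage specificity is deliberately ignored.

# Sugar -> exoglycosidase assumed to remove it (pairings follow the
# disease table; labels are illustrative, not a complete enzyme list).
EXOGLYCOSIDASE_FOR = {
    "Sia": "neuraminidase",
    "Gal": "beta-galactosidase",
    "GlcNAc": "beta-hexosaminidase",
    "Man": "alpha-mannosidase",
    "Fuc": "alpha-fucosidase",
}


def degrade(chain, missing_enzymes=frozenset()):
    """Trim `chain` sequentially and return (removed, remaining).

    `chain` lists sugars starting at the non-reducing end. Trimming stops
    at the first sugar whose enzyme is in `missing_enzymes`, so everything
    behind that sugar is left undegraded.
    """
    removed = []
    remaining = list(chain)
    while remaining:
        enzyme = EXOGLYCOSIDASE_FOR[remaining[0]]
        if enzyme in missing_enzymes:
            break  # blocked: the rest of the chain accumulates
        removed.append(remaining.pop(0))
    return removed, remaining


if __name__ == "__main__":
    branch = ["Sia", "Gal", "GlcNAc", "Man"]
    print("all enzymes present:", degrade(branch))
    print("beta-galactosidase missing:", degrade(branch, {"beta-galactosidase"}))
```

In this model a missing β-galactosidase still allows sialic acid to be removed, but the galactose and every sugar behind it remain, which mirrors the accumulation of partially degraded glycoproteins in the storage diseases.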


Carbohydrate Recognition: Lectins

The ability of certain proteins to recognize and bind to specific carbohydrate structures is of critical importance in overall cellular homeostasis. Proteins that recognize specific types of carbohydrates displayed on other proteins and/or lipids are referred to as lectins. All lectins contain a carbohydrate recognition domain (CRD). The lectin family of proteins does not include the immunoglobulins, which by themselves constitute a specialized class of carbohydrate-recognizing proteins. The clinical laboratory definition of lectins describes non-immunoglobulins that are capable of differential agglutination (from the Latin for "to glue to", meaning clumping) of erythrocytes.

Lectin Families and their Carbohydrate Specificities


Lectin Family | Characteristics | Binding Specificity
C-type (7 subfamilies) | require Ca2+ for activity | variable
Collectins (C-type subfamily III) | C-type CRD with collagen-like domains | variable, mannose
Galectins (S-type lectins) | sulfhydryl-dependent or β-galactoside binding | strict for β-galactosides
Siglecs (I-type lectins) | contain immunoglobulin-like domains | sialic acid
Phosphomannosyl receptors (P-type lectins) | recognize mannose-6-phosphate | mannose-6-phosphate
Pentraxins | quaternary structure composed of 5 identical polypeptides that form a ring with a central hole | variable and can be non-carbohydrate
Calnexin and calreticulin | recognition of properly folded glycoproteins in the ER | glucose
Ficolins | contain fibrinogen-like homology domain | variable
Tachylectins | distinct CRD but also contain fibrinogen-like domain | GlcNAc/GalNAc

C-type Lectins

C-type lectins were initially defined as those proteins having a CRD of approximately 120 amino acids with a dependence on Ca2+ ions for binding activity. The C-type lectin family can be further subdivided into at least 7 subfamilies depending upon the nature of other non-carbohydrate recognition domains as well as overall gene structure. C-type lectins recognize carbohydrate structures on many different types of protein, including cell surface proteins and extracellular matrix (ECM) proteins, and on other carbohydrate-containing ECM structures such as proteoglycans, glycosaminoglycans and glycolipids. C-type lectins also exhibit broad specificity in the types of carbohydrates to which they bind, including fucose, galactose, glucose, GlcNAc, maltose, mannose and N-acetylmannosamine (ManNAc). One of the most important functions of carbohydrate recognition by C-type lectins is pathogen recognition by the innate immune system.

The collectins are subtype III C-type lectins defined by having an additional collagen-like domain. The presence of the collagen-like domain allows the collectins to form a triple helical structure that creates a multivalent CRD. Mannan-binding lectin, MBL (also termed mannose-binding lectin or mannose-binding protein), is the best characterized human collectin. Despite its name, MBL actually exhibits higher affinity for GlcNAc than for mannose. The major function of the collectins is recognition of carbohydrate structures on microbial surfaces, leading to activation of the complement cascade and phagocytosis. MBL is synthesized by the liver and secreted into the circulation and is considered one of the acute phase proteins. Expression levels of MBL rise in response to inflammatory signals. Upon binding to carbohydrate structures on microbial surfaces, MBL activates the complement system. The mechanism of MBL activation of complement requires the interaction of MBL with a family of serine proteases termed mannose-binding lectin-associated serine proteases (MASPs). The action of MBL-MASP in the lectin-activated complement cascade is similar to that of the C1q-antibody complex that activates the serine proteases C1r and C1s.

The selectins are subtype IV C-type lectins defined by the presence of the C-type lectin CRD, an epidermal growth factor-like domain and a variable number of complement regulatory protein-like repeat domains. The selectins are also members of the cell adhesion family of proteins. Each selectin is tethered to the plasma membrane via a single transmembrane domain. There are three characterized selectins, originally identified as the adhesion molecules endothelial-leukocyte adhesion molecule-1 (ELAM-1), murine lymph node homing receptor and platelet granule membrane protein-140. These three proteins are now referred to as E-selectin, L-selectin and P-selectin, respectively. A major function of the selectins is the recruitment of neutrophils to sites of inflammation. P-selectin is constitutively stored in secretory granules of platelets and endothelial cells. When stimulated by the appropriate signal (e.g. IL-1 and TNF-α) these cells present P-selectin on the cell surface. Through this mechanism, P-selectin is involved in mediating very early responses of leukocytes to inflammatory signals. E-selectin is also synthesized by endothelial cells but is not constitutively present. Expression of E-selectin is regulated by cytokines such as IL-1 and TNF-α. L-selectin is expressed by most leukocytes and plays a role in the recruitment of these cell types into lymphatic tissue. Constitutively expressed L-selectin is rapidly released from the surface of activated leukocytes. Activation of L-selectin signaling, via ligand-induced cross-linking, results in potentiation of the responses of neutrophils to other stimuli.

S-type Lectins: Galectins

The S-type lectins were originally so designated because the first family members exhibited sulfhydryl-dependent carbohydrate binding. The S-type lectins are now commonly referred to as the galectins. The galectins are all defined by a canonical CRD having an affinity for β-galactosides. To date, 14 galectins have been identified in mammals. The galectins can be further divided into 3 subfamilies based upon their structures. The prototype subfamily (galectin-1, -2, -5, -7, -10, -11, -13, and -14) contains a single CRD; the tandem-repeat type subfamily (galectin-4, -6, -8, -9, and -12) contains two homologous CRDs; and the chimeric type subfamily (galectin-3) contains a single CRD combined with a unique amino or carboxy terminus. Due to their multivalent nature, galectins are capable of cross-linking glycoconjugates containing N-acetyllactosamine (LacNAc) structures. Galectin activities are observed both at the cell surface and intracellularly.
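
The subfamily assignments in the preceding paragraph amount to a simple lookup, which can be sketched as follows. The names GALECTIN_SUBFAMILIES, CRDS_PER_POLYPEPTIDE and galectin_subfamily are illustrative; placing galectin-3 in the chimeric group follows the commonly cited assignment.

```python
# Galectin subfamilies keyed by name; members are galectin numbers.
GALECTIN_SUBFAMILIES = {
    "prototype": {1, 2, 5, 7, 10, 11, 13, 14},
    "tandem-repeat": {4, 6, 8, 9, 12},
    "chimeric": {3},  # commonly cited assignment for galectin-3
}

# Number of carbohydrate recognition domains per polypeptide chain.
CRDS_PER_POLYPEPTIDE = {
    "prototype": 1,
    "tandem-repeat": 2,
    "chimeric": 1,
}


def galectin_subfamily(number):
    """Return the subfamily name for a galectin number, or None if unlisted."""
    for name, members in GALECTIN_SUBFAMILIES.items():
        if number in members:
            return name
    return None


if __name__ == "__main__":
    for n in (1, 3, 9):
        family = galectin_subfamily(n)
        print(f"galectin-{n}: {family}, {CRDS_PER_POLYPEPTIDE[family]} CRD(s)")
```

Keeping the CRD count in a separate table makes the structural distinction explicit: only the tandem-repeat type carries two CRDs on one chain.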

One of the most crucial functions of the galectins is in regulation of the inflammatory response. Galectin-1 is expressed by antigen-stimulated T cells, activated B cells and by macrophages. The action of galectin-1 in these systems is to stimulate apoptosis. Negative growth effects of galectin-1 are also seen in breast and prostate cancer cell lines. In addition to galectin-1, galectin-7, -8, -9, and -12 have been shown to exhibit pro-apoptotic activity. In contrast, galectin-3 exhibits anti-apoptotic activities in similar types of cells. Whereas the pro-apoptotic effects of galectin-1 can be observed by adding the protein to activated T cells, the anti-apoptotic effects of galectin-3 appear to be the result of intracellular activities associated with the protein. Interestingly, galectin-3 has some striking sequence similarities to the anti-apoptotic Bcl-2 protein family. Galectin-1 is a potent inhibitor of activated T cell-mediated inflammatory responses. Using experimentally induced models of several autoimmune disorders (e.g. myasthenia gravis, multiple sclerosis and rheumatoid arthritis) it has been shown that administration of recombinant galectin-1 can significantly ameliorate the resultant clinical symptoms. Conversely, the presence of autoantibodies to galectin-1 has been identified in several neurodegenerative disorders. Likewise, the levels of autoantibodies to galectin-3 are inversely associated with severity of Crohn disease (inflammatory bowel disease, IBD).

I-type Lectins: Siglecs

I-type lectin is the term collectively given to the family of carbohydrate-binding proteins that belong to the immunoglobulin (Ig) superfamily of proteins. The Ig superfamily proteins all have at least one domain composed of two β-sheets that are disulfide bonded and stacked together. There are at least three subtypes of Ig domain (V-set, C1-set and C2-set) that are defined by the number and arrangement of the β-strands in the domain. All of the I-type lectins are membrane anchored with their CRDs on the outside of the cell. The I-type lectins are divided into two main subfamilies. One subfamily includes the cell adhesion molecules L1, N-CAM and intercellular adhesion molecule-1 (ICAM-1) as well as the cell-surface glycoproteins CD83 and CD2. The other subfamily is composed of the siglecs, which are so named because of their binding specificity for sialic acid residues. Siglecs are also characterized by the presence of a V-set Ig domain at their amino termini followed by varying numbers of C2-set domains.

The siglecs are primarily expressed on cells of the hematopoietic system. A macrophage lectin-like adhesion molecule termed sialoadhesin and CD22, another member of the Ig superfamily that shows restricted B-cell expression, were the first two members of the siglec group to be characterized. Both sialoadhesin and CD22 were shown to mediate cell-cell interactions through their abilities to recognize sialylated glycoconjugates. Sialoadhesin is now referred to as siglec-1 and CD22 as siglec-2 in reference to their being the first and second members characterized, respectively. To date, at least 11 siglecs have been characterized. Like the pentraxins, the siglecs, principally the siglec-3 family, are involved in regulatory aspects of innate immunity reactions. All of the siglec-3 family members contain a motif in their intracellular tails termed an immunoreceptor tyrosine-based inhibitory motif (ITIM), indicating that these proteins act as inhibitory receptors on hematopoietic cells. Associated with the ITIM is a tyrosine residue that becomes phosphorylated in response to stimulation of activating receptors. The function of the ITIM is to recruit phosphatases to activated receptor complexes, thereby deactivating the activating receptors by phosphate removal. The presence of phosphotyrosine in the cytoplasmic tail of siglec-3 also enhances the subsequent interaction with sialylated ligands, suggesting that tyrosine phosphorylation plays a regulatory role in ligand binding to siglec-3.

Pentraxins

The pentraxin family of lectins is defined by having primary sequence similarities to the major acute phase reactants C-reactive protein (CRP) and serum amyloid P component (SAP). The pentraxins assume a characteristic multisubunit complex consisting of five to ten identical subunits that form a pentameric ring structure with a central open core, hence the term pentraxin. The name originally applied to CRP stems from the observation that it was an antibody-like molecule associated with the C polysaccharide component of the pneumococcal bacterium. CRP was originally shown to be a complement-activating precipitin with a high affinity for phosphocholine. Although not originally identified as a carbohydrate-binding protein, CRP was found to exhibit affinity for galactans (galactose-containing oligosaccharides) and phosphogalactose. Similarly, SAP not only has carbohydrate-binding specificity but also has an affinity for phosphoethanolamine. In general the pentraxins represent a class of molecules that exhibit calcium-dependent binding to a number of substances bearing phosphate esters and carbohydrates. The pentraxins have been divided into those with binding similarities to CRP (i.e. preference for phosphocholines) and those with binding similarities to SAP (i.e. preference for phosphoethanolamines).

Another defining feature associated with pentraxins is the observation that their expression is markedly increased as a consequence of the activities of immunomodulatory molecules such as IL-1 and TNF-α as well as in response to tissue injury and necrosis. Both IL-1 and IL-6 can dramatically induce the acute phase response of the liver, leading to increased CRP synthesis and secretion. A role for the pentraxins in innate immunity functions is suggested by their reactivities with the complement system as well as with phagocytic leukocytes. CRP has consistently been shown to activate the classical complement pathway and to bind at sites of complement-induced cell injury.


Congenital Disorders of Glycosylation (CDG)

Congenital disorders of glycosylation (CDG) represent a constellation of diseases that result from defects in the N-linked glycosylation pathway. These diseases are clustered into two broad categories. Group I CDG diseases are defined by alterations/deficiencies in the synthesis and/or transfer of the dolichol-pyrophosphate oligosaccharide precursor to Asn residues in substrate proteins. Group II CDG diseases are defined as those that result from defects in subsequent N-linked glycan processing. It is important to note that these disorders reflect only deficiencies of N-glycosylation; diseases/disorders are also known to result from deficiencies in O-glycosylation, GPI-linkage and the biosynthesis of proteoglycans, all of which involve carbohydrate addition and remodeling in the context of a protein backbone. For more information on both N-linked and O-linked glycosylation defects see the Congenital Disorders of Glycosylation page.

CDG-Ia is the most commonly occurring CDG, with the highest incidence in individuals of European ancestry. Although there is considerable variability in the clinical phenotypes observed in CDG-Ia patients, there is always some level of psychomotor retardation. In addition, children are ataxic and have skeletal abnormalities consisting of long limbs and short torsos. Due to defective synthesis of coagulation factors by the liver (primarily factor XI, antithrombin III, protein C and protein S), patients have severe coagulation defects. Hepatomegaly with consequent liver dysfunction compounds these problems. CDG-Ia results from mutations in phosphomannomutase 2 (PMM2), the enzyme that is required to convert Man-6-P to Man-1-P used in the generation of GDP-Man. Over 60 mutations in PMM2 have been identified that decrease either enzyme activity or stability.

CDG-IIc is more commonly referred to as leukocyte adhesion deficiency syndrome II (LAD II). LAD II belongs to the class of disorders referred to as primary immunodeficiency syndromes, as the symptoms of the disease manifest due to defects in leukocyte function. LAD II is characterized by unique facial features, recurrent infections, persistent leukocytosis, defective neutrophil chemotaxis and severe growth and mental retardation. The genetic defect resulting in LAD II is in the pathway of fucose utilization, leading to loss of fucosylated glycans on the cell surface. An additional feature of LAD II is that individuals harbor the rare Bombay (hh) blood type at the ABO locus and lack the Lewis blood group antigens. The Bombay blood type is characterized by a deficiency in the H (referred to as the O-type), A and B antigens due to loss of the fucose residue. Each of these blood group antigens contains a Fuc-α-1,2-Gal modification that is the final carbohydrate addition to these antigens. These fucosylation reactions are catalyzed by α-1,2-fucosyltransferase, which is encoded by the H and Se loci. The defective neutrophil chemotaxis is due to the loss of a selectin ligand on these cells. This ligand is the sialylated Lewis x antigen, another blood group antigen.

The recurrent infections seen in LAD II patients are the result of the defective neutrophil function. Neutrophils are involved in innate immunity responses to bacterial infection. To carry out their role in host defense mechanisms, neutrophils must adhere to the surface of the endothelium at the site of inflammation, an event mediated by cell surface adhesion molecules. The selectin family (E-, L-, and P-selectins) of animal lectins is necessary to mediate the initial process of neutrophil adherence to the endothelium. The selectins recognize sialylated fucosylated lactosamines typified by the Lewis x antigen. Once neutrophils adhere and roll along the surface of the endothelium (due to vascular flow), the integrin family of adhesion molecules allows for firm adherence followed by tissue penetration. A related disorder, termed LAD I, is caused by the absence of CD18, which is the β2 subunit of the leukocyte integrin found on the surface of neutrophils and monocytes.
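
The two-step adhesion sequence in this paragraph (selectin-mediated rolling, then integrin-mediated firm adherence) can be sketched as a small decision function that shows where LAD II and LAD I each interrupt the process. This is a deliberately simplified illustration: the function name, the boolean parameters and the returned labels are hypothetical, and real neutrophil recruitment involves many more molecules and graded rather than all-or-none effects.

```python
def neutrophil_adhesion_stage(has_selectin_ligand, has_cd18):
    """Return the furthest adhesion stage reached in a two-step toy model.

    has_selectin_ligand: fucosylated selectin ligand (sialylated Lewis x)
        is present on the neutrophil surface.
    has_cd18: the beta-2 integrin subunit CD18 is present.
    """
    if not has_selectin_ligand:
        # Selectins cannot capture the cell, so rolling never starts
        # (the situation described for LAD II).
        return "no rolling"
    if not has_cd18:
        # Rolling occurs, but integrin-mediated firm adherence fails
        # (the situation described for LAD I).
        return "rolling only"
    return "firm adherence and tissue penetration"


if __name__ == "__main__":
    print("normal:", neutrophil_adhesion_stage(True, True))
    print("LAD II:", neutrophil_adhesion_stage(False, True))
    print("LAD I:", neutrophil_adhesion_stage(True, False))
```

Checking the selectin ligand first reflects the order of events in the text: without the initial rolling step, the integrin step is never reached.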

Because there is widespread loss of fucosylated antigens in LAD II patients, each of which can be formed through the actions of several fucosyltransferases, a defect in these enzymes could be ruled out as the cause of the disease. In addition, normal levels of the α-1,2-, α-1,3- and α-1,4-fucosyltransferases were observed in the serum of LAD II individuals. These observations indicated that either the pathways of GDP-fucose synthesis or the utilization of GDP-fucose by the fucosyltransferases in the Golgi must be deficient in LAD II. GDP-fucose can be generated by salvage of free fucose (exogenous or derived through glycoconjugate degradation) or by epimerization of GDP-mannose. In order for GDP-fucose to be utilized by Golgi fucosyltransferases it must first be transported into the Golgi from the cytosol where it is synthesized. Examinations of the enzymes involved in the synthesis and transport of GDP-fucose have been undertaken. While the activity of one of the enzymes of GDP-mannose epimerization (GDP-D-mannose 4,6-dehydratase, GMD) has been shown to be reduced in LAD II patients, it has been determined that the major defect causing LAD II is an impairment in the transport of GDP-fucose into the Golgi. This transport is carried out by the GDP-fucose transporter encoded by the FUCT1 gene (also identified as solute carrier family 35, member C1: SLC35C1).

Michael W King, PhD

Last modified: February 26, 2014