Protein Modifications


© 1996–2016, LLC | info @

Secreted and Membrane-Associated Proteins












Proteins that are membrane bound or destined for secretion (as well as glycoproteins) are synthesized by ribosomes associated with the membranes of the endoplasmic reticulum (ER). The ER associated with ribosomes is termed rough ER (RER). These classes of protein all contain an N-terminal sequence termed a signal sequence or signal peptide. The signal peptide is usually 13-36 residues long and is composed predominantly of hydrophobic amino acids. As the signal peptide emerges from the exit side of the ribosome it is recognized by a ribonucleoprotein complex termed the signal recognition particle (SRP). The eukaryotic SRP is composed of six proteins and an RNA termed the 7SL RNA. The six proteins of the SRP are SRP9, SRP14, SRP19, SRP54, SRP68, and SRP72. Humans express at least three genes encoding the 7SL RNA, identified as RN7SL1, RN7SL2, and RN7SL3.
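The description of the signal peptide as a stretch of 13-36 predominantly hydrophobic residues can be illustrated with a rough sketch. The hydrophobic residue set, the 24-residue default window, and the 0.5 threshold are all assumptions chosen for illustration, and the test sequences are hypothetical; this is not a validated signal peptide predictor.

```python
# Residues counted as hydrophobic here are an illustrative assumption.
HYDROPHOBIC = set("AVLIMFWC")


def hydrophobic_fraction(segment: str) -> float:
    """Return the fraction of residues in `segment` that are in HYDROPHOBIC."""
    if not segment:
        return 0.0
    return sum(res in HYDROPHOBIC for res in segment.upper()) / len(segment)


def looks_like_signal_peptide(protein: str, length: int = 24,
                              threshold: float = 0.5) -> bool:
    """Crude check: is the N-terminal window of `length` residues mostly hydrophobic?"""
    if not 13 <= length <= 36:
        raise ValueError("signal peptides are usually 13-36 residues long")
    window = protein[:length]
    return len(window) == length and hydrophobic_fraction(window) > threshold


# Hypothetical sequences: one with a hydrophobic N-terminus, one without.
print(looks_like_signal_peptide("MAALLLVVLLAAWALLGSDEKRTQDEKR"))
print(looks_like_signal_peptide("MSDEKRTNQSDEKRTNQSDEKRTN"))
```

A composition check like this only captures the "predominantly hydrophobic" property; it says nothing about SRP binding or the signal peptidase cleavage site.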

When SRP binds to the emerging signal peptide it induces a translational elongation arrest that persists until the entire translational complex, together with SRP, binds to the SRP receptor on the ER. The SRP receptor (termed SR) is a heterodimeric complex composed of an α-subunit (SR-α; encoded by the SRPRA gene) and a β-subunit (SR-β; encoded by the SRPRB gene). Associated with the SRP receptor is a translocation channel through which the emerging polypeptide is extruded into the lumen of the ER. The translocation channel is referred to as the translocon. Although the translocon is composed of multiple protein subunits, the critical channel is formed from the heterotrimeric Sec61 complex, which contains the Sec61α, Sec61β, and Sec61γ proteins.

The signal peptide is removed following passage through the ER membrane. The removal of the signal peptide is catalyzed by enzymes of the serine protease family known as signal peptidases, which are multiprotein complexes identified as SPC. Proteins that contain a signal peptide are called preproteins to distinguish them from proproteins. Some proteins that are destined for secretion are further proteolyzed following secretion and are termed preproproteins.

Synthesis of membrane-associated and secreted proteins on rough ER

Mechanism of synthesis of membrane-bound proteins, secreted proteins, and glycoproteins. Ribosomes engage the ER membrane through interaction of the signal recognition particle (SRP), bound to the ribosome, with the SRP receptor in the ER membrane. As the protein is synthesized the signal sequence is passed through the ER membrane into the lumen of the ER. After sufficient synthesis the signal peptide is removed by the action of signal peptidase. Synthesis continues and, if the protein is secreted, it ends up completely in the lumen of the ER. If the protein is membrane associated, a stop transfer motif in the protein halts its transfer through the ER membrane. This motif becomes the membrane-spanning domain of the protein.


Proteolytic Cleavage

Most proteins undergo proteolytic cleavage following translation. The simplest form of this is the removal of the initiation methionine. Many proteins are synthesized as inactive precursors that are activated under proper physiological conditions by limited proteolysis. Pancreatic enzymes and enzymes involved in clotting are examples of the latter. Inactive precursor proteins that are activated by removal of polypeptides are termed proproteins.

A complex example of post-translational processing of a preproprotein is the cleavage of prepro-opiomelanocortin (POMC) synthesized in the pituitary (see the Peptide Hormones page for discussion of POMC). This preproprotein undergoes complex cleavages, the pathway of which differs depending upon the cellular location of POMC synthesis.

Another example of a preproprotein is insulin. Since insulin is secreted from the pancreas it has a prepeptide. Following cleavage of the 24 amino acid signal peptide the protein folds into proinsulin. Proinsulin is further cleaved, yielding active insulin, which is composed of two peptide chains linked together through disulfide bonds.
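The two processing steps for insulin can be sketched as simple string operations. Only the 24-residue signal peptide length comes from the text above; the sequence itself, the chain lengths, and the dibasic residues flanking the C-peptide are assumptions taken from public sequence records and should be checked against a database before being relied upon.

```python
# Human preproinsulin written out region by region (assumed from public
# sequence records; verify against a sequence database before relying on it).
SIGNAL_PEPTIDE = "MALWMRLLPLLALLALWGPDPAAA"      # 24 residues, removed in the ER
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"        # 30 residues
C_PEPTIDE = "EAEDLQVGQVELGGGPGAGSLQPLALEGSLQ"     # 31 residues, excised
A_CHAIN = "GIVEQCCTSICSLYQLENYCN"                 # 21 residues

# Dibasic pairs (RR, KR) flank the C-peptide and are lost during processing.
PREPROINSULIN = SIGNAL_PEPTIDE + B_CHAIN + "RR" + C_PEPTIDE + "KR" + A_CHAIN


def remove_signal_peptide(preproprotein: str, signal_length: int = 24) -> str:
    """Model signal peptidase: drop the N-terminal signal peptide."""
    return preproprotein[signal_length:]


def cleave_proinsulin(proinsulin: str, b_len: int = 30, a_len: int = 21):
    """Model proinsulin processing: keep the N-terminal B chain and C-terminal A chain."""
    return proinsulin[:b_len], proinsulin[-a_len:]


proinsulin = remove_signal_peptide(PREPROINSULIN)
b_chain, a_chain = cleave_proinsulin(proinsulin)
print(len(PREPROINSULIN), len(proinsulin), len(b_chain) + len(a_chain))
# prints: 110 86 51
```

The sketch does not model folding or disulfide bond formation; it only tracks which residues survive each cleavage.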

Still other proteins (of the enzyme class) are synthesized as inactive precursors called zymogens. Zymogens are activated by proteolytic cleavage such as is the situation for several proteins of the blood clotting cascade.



Methylation

Post-translational methylation of proteins occurs on nitrogens and oxygens. The activated methyl donor for these reactions is S-adenosylmethionine (SAM). The most common methylations are on the ε-amine of the R-group of lysine residues and the guanidino moiety of the R-group of arginine. Methylation of lysine residues in histones in the nucleosome is an important regulator of chromatin structure and consequently of transcriptional activity. Lysine methylation was originally thought to be a permanent covalent mark, providing long-term signaling, including the histone-dependent mechanism for transcriptional memory. However, it has become clear that lysine methylation, similar to other covalent modifications, can be transient and dynamically regulated by an opposing demethylation activity. Methylation of lysine residues affects gene expression not only at the level of chromatin modification, but also by modifying the activity of numerous transcription factors. Histone arginine methylation is also known to regulate chromatin structure and consequently transcriptional activity.

Humans express 27 lysine (K) methyltransferases (identified as KMT family enzymes) and nine arginine methyltransferases. The latter family of enzymes is identified as the protein arginine (R) methyltransferase (PRMT) family. Numerous enzymes catalyze lysine demethylation reactions, with one of the largest groups being the Jumonji C (JmjC) domain-containing demethylases, all of which are members of a large family of at least 80 enzymes that are 2-oxoglutarate- and Fe2+-dependent dioxygenases. For more complete information on the functions of protein methylation and demethylation go to the Regulation of Gene Expression page.

Additional nitrogen methylations are found on the imidazole ring of histidine and the R-group amides of glutamine and asparagine. Methylation of the oxygen of the R-group carboxylates of glutamate and aspartate also takes place and forms methyl esters. Proteins can also be methylated on the thiol R-group of cysteine.

As indicated below, many proteins are modified at their C-terminus by prenylation near a cysteine residue in the consensus CAAX. Following the prenylation reaction the protein is cleaved at the peptide bond of the cysteine and the carboxylate residue of the cysteine is methylated by a prenylated protein methyltransferase.



Acetylation

Post-translational acetylation of proteins occurs on the ε-amine of lysine residues, the same site that is targeted by lysine methylation. In addition, a large number of proteins (more than 80% of human proteins) are acetylated on the N-terminal amino acid. The enzymes that catalyze protein acetylation of lysine residues are classified as lysine (K) acetyltransferases and denoted by the nomenclature KAT. Humans express 17 genes encoding KAT enzymes. The activated acetyl donor for the KAT enzymes is acetyl-CoA. The role of acetyl-CoA in the acetylation of proteins places this post-translational processing event at the crossroads of metabolic regulation. Physiological and pathophysiological conditions that result in increases or decreases in the production and utilization of acetyl-CoA will, therefore, have profound effects on the ability of KAT enzymes to carry out their functions.

Lysine Acetylation

Acetylation of lysine residues in histones in the nucleosome is an important regulator of chromatin structure and consequently of transcriptional activity. Like the reversibility of lysine methylation, protein lysine acetylation is also reversible. The enzymes that carry out removal of the acetyl group are broadly classified into two primary groups. One group is identified as the histone deacetylases (HDAC), which are Zn2+-dependent enzymes and the other group is identified as the sirtuins (SIRT) which are NAD+-dependent enzymes. More than 1,750 proteins in human tissues have been shown to be modified by acetylation. Greater detail on histone acetylation-deacetylation can be found in the Control of Gene Expression page. The discussion here will focus on metabolic regulation via reversible acetylation.

Protein lysine acetylation is observed on proteins in almost all compartments of the cell. Recent evidence has demonstrated that numerous enzymes that control a vast array of metabolic processes have their activity modulated by reversible lysine acetylation. Within the liver, nearly 1,000 different proteins (not including nuclear proteins) have been shown to be acetylated, with many of these proteins functioning in the processes of metabolic regulation. Of these nearly 1,000 proteins, more than 150 are found in the mitochondria of hepatocytes. An astounding outcome of the work on metabolic regulation via protein acetylation is that very nearly all of the enzymes involved in glycolysis, glycogen metabolism, gluconeogenesis, the TCA cycle, fatty acid oxidation, the urea cycle, and nitrogen metabolism have been shown to be acetylated. In addition, several enzymes involved in oxidative phosphorylation and amino acid metabolism have also been found to be acetylated.

The acetylation of metabolic enzymes results in alterations in their activities by several different mechanisms. Acetylation can lead to subsequent ubiquitylation and proteasomal degradation of the modified protein. Acetylation can also result in destruction of the modified protein via the lysosomes. Protein degradation is not the only mechanism whereby lysine acetylation can be used to regulate an enzyme's level of activity. Numerous enzymes, including metabolic enzymes, that are acetylated have altered catalytic activity. Acetylation can neutralize an active site lysine or can block the action of an allosteric activator. Numerous other lysine acetylation-mediated effects on enzyme activity have been documented, including the blocking of substrate binding, blocking of metabolite binding, and modifying the subcellular localization of an enzyme.

Several Metabolic Enzymes Regulated by Reversible Acetylation

| Enzyme Name | Gene | Acetylase | Deacetylase | Comments |
|---|---|---|---|---|
| Acetyl-CoA acetyltransferase 1 | ACAT1 | unknown | SIRT3 | mitochondrial enzyme involved in ketone body utilization; major activity is the cleavage of acetoacetyl-CoA into two acetyl-CoA units; acetylation down-regulates the activity of the enzyme; K260 and K265 deacetylated by SIRT3 but K187 is not |
| Acyl-CoA dehydrogenase, long chain | ACADL | unknown | SIRT3 | mitochondrial fatty acid β-oxidation enzyme; acetylation down-regulates the activity of the enzyme |
| Acyl-CoA synthetase 1 | ACSL1 | unknown | SIRT3 | major liver and adipose tissue enzyme involved in the activation of fatty acids for β-oxidation; enzyme contains at least 15 sites of acetylation that are acetylated differentially dependent upon physiological status; acetylation of K285 is known to down-regulate the activity of the enzyme |
| Aldehyde dehydrogenase 2 | ALDH2 | unknown | SIRT3 | the mitochondrial aldehyde dehydrogenase; multiple sites of acetylation; acetylation increases the activity of the enzyme; K370 is deacetylated by SIRT3 but K453 is not |
| acyl-CoA synthetase short chain family member 1 | ACSS1 | unknown | SIRT1 | mitochondrial enzyme; also identified as AceCS1; catalyzes conversion of acetate to acetyl-CoA; important in energy homeostasis during periods of fasting; acetylation results in down-regulation of enzyme activity |
| acyl-CoA synthetase short chain family member 2 | ACSS2 | KAT3A (CBP) | SIRT3 | cytoplasmic enzyme; also identified as AceCS2; catalyzes conversion of acetate to acetyl-CoA; acetate stimulates interactions between ACSS2, CBP [derived from CREB (cAMP-response element binding protein)-binding protein], and the hypoxia induced factor, HIF-2 (see the Glycolysis page for more details on the hypoxia induced pathway); acetylation of ACSS2 results in down-regulation of enzyme activity by interference with the active site |
| argininosuccinate lyase | ASL | unknown | unknown | urea cycle enzyme; acetylation results in down-regulation of enzyme activity by interference with the active site |
| carbamoylphosphate synthetase I | CPS1 | unknown | SIRT5 | urea cycle enzyme; acetylation results in down-regulation of enzyme activity |
| carnitine palmitoyltransferase 2 | CPT2 | unknown | unknown | mitochondrial enzyme involved in transport of activated fatty acids into the mitochondria for β-oxidation; consequences of acetylation of four sites (K104, K453, K537, and K544) yet to be determined |
| glyceraldehyde-3-phosphate dehydrogenase | GAPDH | KAT2B | HDAC5 | glycolytic enzyme; KAT2B was originally identified as PCAF (p300/CBP-associated factor); lysine residues K117, K227, K251 and K254 are acetylated; acetylation of K227 causes an interaction of GAPDH and one of the seven in absentia homolog (SIAH) ubiquitin ligases resulting in cytoplasmic to nuclear translocation; the seven in absentia gene was originally identified in Drosophila as being required for the specification of R7 cell fate in the eye; humans express three SIAH genes identified as SIAH1, SIAH2, and SIAH3; acetylation of K254 results in increased enzyme activity in response to increased glucose concentration |
| glutamate dehydrogenase | GLUD1 | unknown | SIRT3 | major enzyme of overall nitrogen homeostasis and regulator of energy status |
| glutaminase | GLS2 | unknown | unknown | enzyme involved in overall nitrogen homeostasis; acetylation of K329 results in down-regulation of enzyme activity |
| 3-hydroxy-3-methylglutaryl CoA synthase 2 | HMGCS2 | unknown | SIRT3 | mitochondrial enzyme involved in synthesis of the ketone bodies; acetylation results in down-regulation of enzyme activity; K310 is deacetylated by SIRT3 but K354 is not |
| isocitrate dehydrogenase 2 | IDH2 | unknown | SIRT3 | mitochondrial enzyme involved in the production of NADPH in response to oxidative stress; acetylation results in down-regulation of enzyme activity |
| malate dehydrogenase 2 | MDH2 | unknown | unknown | mitochondrial enzyme of the TCA cycle; lysines K185, K301, K307, and K314 are acetylated; acetylation results in up-regulation of enzyme activity; acetylation of MDH2 increases under conditions of increased fatty acid intake |
| ornithine transcarbamoylase | OTC | unknown | SIRT3 | mitochondrial enzyme involved in urea cycle; lysine K88 in the active site is a primary target for acetylation; acetylation of K88 inhibits enzyme activity by decreasing affinity for substrate, carbamoyl phosphate; mutation of K88 to asparagine (K88N mutation) found in some patients suffering from OTC deficiency |
| phosphoenolpyruvate carboxykinase 1 | PCK1 | EP300 | SIRT2 | cytoplasmic form of the enzyme (also known as PEPCK-c) involved in gluconeogenesis; the EP300 gene encodes the p300 protein (adenovirus E1A binding protein p300) that is a close relative of the CBP acetyltransferase; EP300 also identified by the KAT nomenclature as KAT3B; CBP protein is encoded by the CREBBP gene which is also identified by the standard KAT nomenclature as KAT3A; acetylation of PEPCK-c results in down-regulation of enzyme activity via interaction with the UBR5 ubiquitin ligase (ubiquitin ligase E3 component N-recognin 5) |
| phosphoglycerate mutase 1 | PGAM1 | unknown | SIRT1 | cytoplasmic enzyme involved in glycolysis; at least nine lysines shown to be acetylated in PGAM1; the major sites of acetylation are K251, K253, and K254; acetylation results in up-regulation of enzyme activity |
| pyruvate kinase, muscle isoform | PKM2 | KAT2B | unknown | cytoplasmic enzyme involved in glycolysis; the PKM2 gene produces two PKM isoforms (PKM1 and PKM2) as a result of alternative mRNA splicing; expression of the gene is induced in proliferating cells and all human cancers; expression of PKM2 and synthesis of the PKM2 isoform of the enzyme results in reduced oxidation of glucose to pyruvate resulting in the accumulation of glycolytic intermediates which promotes the production of macromolecules from glucose carbons; acetylation of K305 is stimulated in the presence of high glucose; acetylation results in down-regulation of enzyme activity as a result of the lysosomal degradation pathway referred to as chaperone-mediated autophagy, CMA |
| succinate dehydrogenase complex subunit A | SDHA | unknown | SIRT3 | mitochondrial enzyme that is one of four subunits of the SDH complex; involved in the TCA cycle and in oxidative phosphorylation; acetylation results in down-regulation of enzyme activity |
| superoxide dismutase 2 | SOD2 | unknown | SIRT3 | mitochondrial matrix enzyme involved in removal of superoxide anions; catalyzes reduction of superoxide anion to hydrogen peroxide; acetylation results in down-regulation of enzyme activity |
| sphingosine kinase 1 | SPHK1 | p300/CBP | unknown | cytoplasmic enzyme involved in synthesis of the bioactive lipid sphingosine-1-phosphate, S1P; acetylation results in stabilization of the protein leading to up-regulation of enzyme activity |

N-Terminal Acetylation

The acetylation of proteins on the N-terminal amino acid occurs in greater than 80% of all human proteins. The modification is identified as Nt-acetylation. In most proteins where the initiator methionine remains at the N-terminus, this amino acid is acetylated. When the initiator methionine is removed, as is the case for all secreted, transmembrane, and glycoproteins due to removal of the leader peptide in the lumen of the ER, the protein can still be Nt-acetylated. The most commonly occurring N-terminal amino acids that are acetylated are alanine (A), serine (S), cysteine (C), threonine (T), and valine (V). In the vast majority of cases, the presence of the Nt-acetylation creates a specific degradation signal referred to as a degron. The presence of the degron signal then targets the protein for ubiquitylation via the ubiquitin-dependent N-end-rule pathway. The ubiquitylated proteins are then degraded in the proteasome. Components of the N-end-rule pathway are referred to as N-recognins. It is the N-recognins that are the ubiquitin ligases (UBR, for ubiquitin ligase E3 component N-recognin) that ubiquitylate the Nt-acetylated protein. An example of a metabolic enzyme that is targeted for ubiquitylation via the N-recognin pathway is PEPCK, as indicated in the Table above. In this example the ubiquitin ligase is UBR5. Humans express five UBR ubiquitin E3 ligases (UBR1–UBR5). When the N-terminal amino acid that is acetylated is a cysteine it can be oxidized by nitric oxide (NO) followed by arginine attachment via the action of an arginyltransferase such as the enzyme encoded by the ATE1 gene.

The enzymes that incorporate an acetyl group onto the N-terminal amino acid of human proteins are referred to as N-acetyltransferases (NAT). These enzymes represent a distinct family of acetyltransferases that distinguishes them from the lysine acetyltransferases (KAT). Like the KAT enzymes the NAT enzymes utilize acetyl-CoA as the acetyl donor for the acetyltransferase reaction. There are six NAT complexes in human cells identified as NatA–NatF. Functional NAT enzymes are heterotrimeric complexes where the α-subunit of the complex is the catalytic protein. The catalytic α-subunits are encoded by a family of 12 genes identified as N(alpha)-acetyltransferases (NAA). The NatA complex can be generated through the association of four different NAA proteins (NAA10, NAA11, NAA15, and NAA16). The NatB complex can contain either the NAA20 or NAA25 protein. The NatC complex can contain either the NAA30, NAA35, or NAA38 protein. The NatD, NatE, and NatF complexes each contain a single NAA protein, NAA40, NAA50, and NAA60, respectively.
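The complex-to-subunit assignments listed above can be captured in a small lookup table, which also makes it easy to confirm that the named NAA proteins add up to the 12-member family. This is only a bookkeeping sketch of the text; the dictionary and function names are hypothetical.

```python
# NAT complexes and the NAA proteins the text associates with each of them.
NAT_COMPLEXES = {
    "NatA": ["NAA10", "NAA11", "NAA15", "NAA16"],
    "NatB": ["NAA20", "NAA25"],
    "NatC": ["NAA30", "NAA35", "NAA38"],
    "NatD": ["NAA40"],
    "NatE": ["NAA50"],
    "NatF": ["NAA60"],
}


def complexes_containing(naa_protein: str) -> list[str]:
    """Return the NAT complexes whose listed subunits include `naa_protein`."""
    return [name for name, subunits in NAT_COMPLEXES.items()
            if naa_protein in subunits]


all_naa = sorted({p for subunits in NAT_COMPLEXES.values() for p in subunits})
print(len(NAT_COMPLEXES), len(all_naa))   # prints: 6 12
print(complexes_containing("NAA38"))      # prints: ['NatC']
```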



Phosphorylation

Post-translational phosphorylation is one of the most common reversible protein modifications that occurs in animal cells. The vast majority of phosphorylations occur as a mechanism to regulate the biological activity of an enzyme or protein and as such are transient. In other words, a phosphate (or more than one in many cases) is added by a specific kinase and later removed by a specific phosphatase.

Physiologically relevant examples are the phosphorylations that occur in glycogen synthase and glycogen phosphorylase in hepatocytes in response to glucagon release from the pancreas. Phosphorylation of glycogen synthase inhibits its activity, whereas the activity of glycogen phosphorylase is increased. These two events lead to increased hepatic glucose delivery to the blood.

The enzymes that phosphorylate proteins are termed kinases and those that remove phosphates are termed phosphatases. For a more detailed discussion of kinases and phosphatases go to the Signal Transduction page. Protein kinases catalyze reactions of the following type:

ATP + protein → phosphoprotein + ADP

In animal cells serine, threonine and tyrosine are the amino acids subject to phosphorylation. The largest group of kinases are those that phosphorylate either serine or threonine residues and as such are termed protein serine/threonine kinases. The ratio of phosphorylation of the three different amino acids is approximately 1000/100/1 for serine/threonine/tyrosine.

Although the level of tyrosine phosphorylation is minor, the importance of phosphorylation of this amino acid is profound. As an example, the activity of numerous growth factor receptors is controlled by tyrosine phosphorylation.
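To put the approximate 1000/100/1 ratio into perspective, it can be converted into fractions of total phosphorylation. The short sketch below does only that arithmetic; the function name is hypothetical and the ratio is taken directly from the text.

```python
def phospho_fractions(ratio=(1000, 100, 1)):
    """Convert a serine/threonine/tyrosine ratio into fractions of the total."""
    residues = ("serine", "threonine", "tyrosine")
    total = sum(ratio)
    return {name: count / total for name, count in zip(residues, ratio)}


for residue, fraction in phospho_fractions().items():
    print(f"{residue}: {fraction:.2%}")
```

With this ratio, tyrosine accounts for roughly 0.09% of phosphorylation events (1 part in 1101), which underlines how small the pool is relative to its regulatory importance.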


Fatty Acid Acylation

Many proteins are modified at their N-terminus following synthesis. N-terminal modifications can include acetylation and myristoylation. As indicated above, many proteins are N-terminally acetylated, the consequence of which is targeted degradation via the N-recognin ubiquitin ligase pathway. Despite the fact that the initiator methionine is very often hydrolyzed following protein synthesis (catalyzed by methionine aminopeptidases), acylation of the N-terminus still occurs. N-terminal acetylation is catalyzed by a family of N-terminal acetyltransferases (NATs), as discussed above, using acetyl-CoA as the acetyl donor for these reactions. Protein fatty acylation at the N-terminus most often involves attachment of the 14-carbon fatty acid, myristic acid, to an N-terminal glycine residue, referred to as N-myristoylation. Another common fatty acylation of proteins utilizes the 16-carbon fatty acid palmitic acid, which is attached to the sulfhydryl group of internal and N-terminal cysteine residues and is, therefore, referred to as S-palmitoylation. Although other long, medium, and short chain fatty acids are found attached to either the N-terminal amino acid (such as N-terminal propionylation) or to internal amino acids, N-myristoylation and S-palmitoylation represent the bulk of protein acylations.

One physiologically relevant example of internal protein acylation is the hormone ghrelin. Ghrelin is a stomach-derived hormone that is acylated with octanoic acid on a specific serine residue, a modification required for its biological activity. The ghrelin acylation is catalyzed by an enzyme that is a member of the multipass transmembrane acyltransferase family termed MBOAT, for membrane-bound O-acyltransferase. The ghrelin acyltransferase is encoded by the MBOAT4 gene, which was originally identified as ghrelin O-acyltransferase, GOAT.

Protein N-Myristoylation

N-terminal myristoylation is catalyzed by N-terminal myristoyltransferases (NMTs). Humans express two NMT genes identified as NMT1 and NMT2. Incorporation of myristic acid onto an N-terminal glycine residue occurs predominantly as a co-translational event, although post-translational N-myristoylation has been shown to occur in apoptotic cells. Within the human proteome, it has been shown that approximately 0.5% of all proteins are N-myristoylated.
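The requirement for an N-terminal glycine can be expressed as a minimal check. The sketch assumes the initiator methionine, if present, has already been removed by a methionine aminopeptidase, and the example sequences are hypothetical. An N-terminal glycine is treated here as necessary but not sufficient, since the text indicates only about 0.5% of proteins are actually N-myristoylated.

```python
def is_n_myristoylation_candidate(protein: str) -> bool:
    """Return True if the protein exposes an N-terminal glycine.

    A leading initiator methionine is assumed to be removed first.
    """
    seq = protein.upper()
    if seq.startswith("M"):
        seq = seq[1:]
    return seq.startswith("G")


print(is_n_myristoylation_candidate("MGSSKSKPKD"))  # prints: True
print(is_n_myristoylation_candidate("MASSKSKPKD"))  # prints: False
```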

Protein S-Palmitoylation

Although not as common as protein N-myristoylation, protein S-palmitoylation is an important post-translational modification affecting the regulation of membrane attachment, intracellular trafficking, and membrane subdomain localization. The bulk of S-palmitoylation occurs on the sulfhydryl of internal cysteine residues; however, important examples of N-terminal palmitoylated proteins are known. In addition to the attachment of palmitic acid, S-palmitoylation is a term used to describe the S-acylation of proteins with stearic acid (18:0), oleic acid (18:1), arachidonic acid (20:4), and eicosapentaenoic acid (20:5). Palmitoylation of proteins is catalyzed by a family of protein acyltransferases (PATs) that are members of the Asp-His-His-Cys-containing protein acyltransferase family, identified as DHHC-PATs. Because the DHHC motif forms a zinc-finger domain, the genes encoding these enzymes are termed zinc finger DHHC type containing (ZDHHC), with a number designating the specific gene. Currently 23 human ZDHHC genes have been identified and characterized: ZDHHC1–ZDHHC9 and ZDHHC11–ZDHHC24 (there is no ZDHHC10 gene).

N-terminal palmitoylation is known to occur on the α-subunit of Gs-type G-proteins as well as on the sonic hedgehog (SHH) protein. N-terminal palmitoylation of SHH is catalyzed by a specific enzyme encoded by the HHAT (hedgehog acyltransferase) gene. The HHAT protein belongs to the MBOAT family of multipass transmembrane acyltransferases. In addition to N-terminal S-palmitoylation, SHH is modified by the attachment of cholesterol to the C-terminus, a modification required to limit the spread of the protein across the anteroposterior axis of the developing neural tube and the developing limb bud. The human homolog of the Drosophila melanogaster segment polarity gene porcupine, encoded by the PORCN gene, is also a member of the MBOAT family of acyltransferases. The PORCN encoded enzyme S-palmitoylates the Wnt family proteins, a modification that is required for correct distribution of the gradients of these important development regulatory growth factors.
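The numbering of the human ZDHHC genes (1 through 24 with 10 skipped) can be enumerated in one line, which also confirms that the stated ranges give 23 genes. The function name is hypothetical.

```python
def zdhhc_gene_symbols() -> list[str]:
    """List human ZDHHC gene symbols: ZDHHC1-ZDHHC24, with no ZDHHC10."""
    return [f"ZDHHC{n}" for n in range(1, 25) if n != 10]


symbols = zdhhc_gene_symbols()
print(len(symbols))               # prints: 23
print(symbols[8], symbols[9])     # prints: ZDHHC9 ZDHHC11
```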



Prenylation

Prenylation refers to the addition of the 15-carbon farnesyl group or the 20-carbon geranylgeranyl group to acceptor proteins, both of which are isoprenoid compounds derived from the cholesterol biosynthetic pathway. The isoprenoid groups are attached to cysteine residues at the carboxy terminus of proteins in a thioether linkage (C–S–C). A common consensus sequence at the C-terminus of prenylated proteins has been identified and is composed of CAAX, where C is cysteine, A is any aliphatic amino acid (except alanine) and X is the C-terminal amino acid. More than 120 human proteins have been identified that are modified by the addition of a prenyl group. These proteins include the γ-subunit of numerous heterotrimeric G-proteins, members of the Ras superfamily of small GTPases, the nuclear lamins, and several protein kinases and protein phosphatases.

In the course of the prenylation reaction, the prenyl group (either farnesyl or geranylgeranyl) is added to the cysteine in the CAAX motif at the C-terminus of target proteins and the AAX tripeptide is subsequently removed. These prenylation reactions are carried out, in humans, by one of several CAAX isoprenylation enzymes. The major isoprenylation enzymes are farnesyltransferase and geranylgeranyltransferase type I. Farnesyltransferase and geranylgeranyltransferase function as heterodimers composed of a common α-subunit and a distinct β-subunit. The common α-subunit is encoded by the FNTA gene (farnesyltransferase, CAAX box, alpha). The β-subunit of farnesyltransferase is encoded by the FNTB gene. The β-subunit of geranylgeranyl transferase type I is encoded by the PGGT1B gene (protein geranylgeranyltransferase type I subunit beta). Following protein isoprenylation the AAX tripeptide is removed by CAAX proteases. The major CAAX protease in humans is encoded by the RCE1 gene (Ras and a-factor converting enzyme 1). The last step in protein isoprenylation involves the methylation of the carboxylate group of the prenylated cysteine in a reaction utilizing S-adenosylmethionine as the methyl donor. Humans express three enzymes that carry out the isoprenylcysteine methyltransferase reaction with the most abundant being encoded by the ICMT gene (isoprenylcysteine carboxyl methyltransferase).


Reactions of protein prenylation. The prenylation of target proteins occurs in three steps. The first step is the isoprenylation of the cysteine residue in the C-terminal CAAX motif. The AAX tripeptide is then removed followed by methylation of the hydroxyl of the carboxylic acid of the now C-terminal prenylated cysteine. The major prenylation reactions involve farnesylation or geranylgeranylation. The major CAAX protease is encoded by the RCE1 gene and the major cysteine carboxymethyltransferase is encoded by the ICMT gene.

In addition to numerous prenylated proteins that contain the CAAX consensus, prenylation is known to occur on proteins of the RAB family of RAS-related G-proteins. There are 65 proteins in this family that are prenylated at either a CC or CXC element in their C-termini. The RAB family of proteins are involved in signaling pathways that control intracellular membrane trafficking.
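The C-terminal motifs described in this section (CAAX, and the CC or CXC elements of RAB proteins) and the three-step CAAX processing sequence can be sketched as follows. The aliphatic residue set {V, L, I} is an assumption made to match "any aliphatic amino acid (except alanine)", the example sequences are hypothetical, and the function names are illustrative only.

```python
ALIPHATIC = set("VLI")  # assumed reading of "aliphatic, except alanine"


def prenylation_motif(protein: str):
    """Classify the C-terminus as 'CAAX', 'CC', 'CXC', or None."""
    seq = protein.upper()
    if (len(seq) >= 4 and seq[-4] == "C"
            and seq[-3] in ALIPHATIC and seq[-2] in ALIPHATIC):
        return "CAAX"
    if seq.endswith("CC"):
        return "CC"
    if len(seq) >= 3 and seq[-3] == "C" and seq[-1] == "C":
        return "CXC"
    return None


def process_caax(protein: str, prenyl: str = "farnesyl") -> dict:
    """Model the three steps: prenylate Cys, remove AAX, methylate the new C-terminus."""
    if prenylation_motif(protein) != "CAAX":
        raise ValueError("protein does not end in a CAAX motif")
    mature = protein.upper()[:-3]            # AAX tripeptide removed, Cys retained
    return {
        "sequence": mature,
        "c_terminal_cys": [f"S-{prenyl}", "carboxyl methyl ester"],
    }


print(prenylation_motif("MTEYKLCVLS"))        # prints: CAAX
print(process_caax("MTEYKLCVLS")["sequence"])  # prints: MTEYKLC
print(prenylation_motif("MSTGGCC"), prenylation_motif("MSTGCSC"))  # prints: CC CXC
```

The sketch does not decide between farnesylation and geranylgeranylation; the `prenyl` argument is supplied by the caller.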

Some of the most important proteins whose functions depend upon prenylation are those that modulate immune responses. These include proteins involved in leukocyte motility, activation, and proliferation, and in endothelial cell immune functions. These immunomodulatory roles of many prenylated proteins are the basis for a portion of the anti-inflammatory actions of the statin class of cholesterol synthesis-inhibiting drugs: statins reduce the synthesis of farnesylpyrophosphate and geranylgeranylpyrophosphate and thus reduce the extent of inflammatory events. Other important examples of prenylated proteins include the oncogenic GTP-binding and hydrolyzing protein RAS and the γ-subunit of the visual protein transducin, both of which are farnesylated. In addition, as indicated above, numerous heterotrimeric G-proteins have their γ-subunits modified by geranylgeranylation.


Ubiquitin & Ubiquitin-Like Proteins: Targeting & Degradation

The protein ubiquitin is the founding member of a family of proteins that includes at least 20 members, all of which are involved in post-translational modifications of numerous substrate proteins. The originally characterized function of ubiquitin was its attachment to proteins to target them for degradation in the protein degradation apparatus termed the proteasome. Humans express four different genes that produce the protein ubiquitin: UBB, UBC, UBA52, and RPS27A (also known as UBA80). The UBB gene generates multiple alternatively spliced mRNAs that encode proteins that are composed of three direct repeats of the ubiquitin coding sequence and, therefore, produce a polyubiquitin precursor protein. Like the UBB gene, the UBC gene generates an mRNA that encodes a polyubiquitin precursor protein. The UBA52 gene generates multiple alternatively spliced mRNAs that encode precursor ubiquitin fusion proteins. The UBA52 mRNAs encode proteins that contain ubiquitin at the N-terminal part of the fusion protein and large ribosomal subunit protein L40 at the C-terminal part. The RPS27A gene encodes a ubiquitin fusion protein like that encoded by the UBA52 gene. The RPS27A gene generates multiple alternatively spliced mRNAs that encode precursor proteins containing ubiquitin in the N-terminal part fused to small ribosomal subunit protein S27a in the C-terminal part.

The other members of the ubiquitin protein family are referred to as ubiquitin-like proteins or ubiquitin-like modifiers and given the designation UBL. The UBL family of proteins are used to modify protein substrates in a manner similar to that of ubiquitin as described below. The UBL family includes the small ubiquitin-related modifier (SUMO) proteins (discussed in detail in the next section) and NEDD8 (originally isolated in a screen for neural precursor cell expressed, developmentally down-regulated genes), which are the most well characterized of the UBLs. Additional ubiquitin-like modifiers include ATG8 (autophagy related gene 8), ATG12 (autophagy related gene 12), ISG15 (interferon-stimulated gene 15), URM1 (ubiquitin-related modifier 1), UFM1 (ubiquitin-fold modifier 1), UBD (ubiquitin-like modifier D, originally identified as FAT10 for HLA-F adjacent transcript 10), and FUB1 (also known as FUBI for fusion ubiquitin-like and also known as MNSFβ for monoclonal nonspecific suppressor factor β). Like the UBA52 and RPS27A genes, which encode fusion proteins containing ubiquitin, the FAU gene encodes the FUB1 protein sequence at the N-terminal end of a fusion protein in which the small ribosomal subunit protein S30 forms the C-terminal portion. Although ubiquitin was originally characterized for its role in protein degradation, it and the other UBL proteins are also involved in diverse cellular processes that include regulation of subcellular localization, nuclear transport, protein synthesis, DNA repair, regulation of cellular responses to oxidative stress, regulation of inflammatory responses, and autophagy.

Protein Ubiquitylation

Proteins are in a continual state of flux, being synthesized and degraded. In addition, when proteins become damaged they must be degraded to prevent aberrant activities of the defective proteins and/or other proteins associated with those that have been damaged. One of the major mechanisms for the destruction of cellular proteins involves a complex structure referred to as the proteasome. In eukaryotic cells the proteasome is found in the cytosol and the nucleus and has a large mass such that it has a sedimentation coefficient of 26S. The 26S proteasome comprises a 20S barrel-shaped catalytic core as well as 19S regulatory complexes at both ends. Degradation of proteins in the proteasome occurs via an ATP-dependent mechanism. The other major orderly degradation pathway for proteins (and other cellular constituents) is referred to as autophagy, from the Greek for "self" and "to eat". The process of autophagy involves the sequestration of targeted cytoplasmic constituents into a double membrane vesicle termed the autophagosome. Following its formation the autophagosome fuses with the lysosomal machinery of the cell and the contents are degraded.

Proteins that are to be degraded by the proteasome are first tagged by attachment of multimers of the 76 amino acid protein ubiquitin, a process termed ubiquitylation or ubiquitination. Many proteins involved in cell cycle regulation, control of proliferation and differentiation, programmed cell death (apoptosis), DNA repair, immune and inflammatory processes, and organelle biogenesis have been discovered to undergo regulated degradation via the 26S proteasome. The degradation of proteins via the 26S proteasome involves a two-step process starting with ubiquitylation of the target protein followed by entry into, and degradation by, the proteasome complex with release of ubiquitin monomers that can be re-used to tag additional proteins. The process of ubiquitin addition to the substrate protein involves multiple ubiquitin additions such that the targeted protein is polyubiquitylated. Of clinical significance are the observations that deregulation of the functions of the proteasome can contribute to the pathogenesis of various human diseases such as cancer, myeloproliferative diseases, and neurodegenerative diseases.

Prior to target protein ubiquitylation the precursor ubiquitin proteins need to be cleaved at their C-terminal end to expose a di-glycine motif. The enzymes that cleave the C-terminus of ubiquitin are called ubiquitin C-terminal hydrolases, UCH. The UCH enzymes are also members of the large family of deubiquitylating enzymes described below. There are four human UCH encoding genes, UCHL1, UCHL3, UCHL5, and BAP1 (BRCA1 associated protein 1). Attachment of the C-terminal glycine of ubiquitin to lysine residues (via an isopeptide linkage) in target proteins involves a series of three enzyme activities. The first, identified as E1 (also called ubiquitin-activating enzymes), activates ubiquitin in an ATP-dependent manner such that the ubiquitin is bound to the E1 enzyme via a high-energy thiol ester. The next class of enzyme, referred to as E2 (also called ubiquitin-conjugating enzymes, UBE; or ubiquitin-carrier proteins), transfers the ubiquitin via an E2 thiol ester intermediate to the substrate protein. The substrate proteins are recognizable by E2 because they are bound by the third class of enzyme called E3 or ubiquitin protein ligases. The E3 enzymes carry out the final step in the process which is the covalent attachment of ubiquitin to the targeted substrate protein. The ubiquitin is generally transferred to the ε-amino group of an internal lysine residue in the substrate protein. There are, however, examples where the ubiquitin is attached to the N-terminal amino group in a substrate protein as described above in the section on protein N-terminal acetylation.

Humans express a family of ten E1 class ubiquitin-activating enzymes, 41 genes encoding E2 class ubiquitin conjugating enzymes, and several hundred (prediction of more than 1000) genes encoding E3 class ubiquitin ligases. The E3 ligases are divided into three major families. The first is the RING domain containing family, which acquired the term RING from Really Interesting New Gene. The RING E3 ligases represent the largest family of ubiquitin ligases and these enzymes associate in complexes with proteins of the cullin family forming the cullin-RING E3 ligase family. Cullin proteins are a family of proteins that serve as molecular scaffolds. Humans express eight cullin genes identified as CUL1, CUL2, CUL3, CUL4a, CUL4b, CUL5, CUL7, and PARC (p53-associated parkin-like protein). Each of these eight proteins contains a highly homologous domain termed the cullin domain. The proteins encoded by the CUL genes all assemble into multisubunit complexes forming the cullin-RING E3 ubiquitin ligase (CRL) family. The RING domain is a zinc finger-like domain which is a protein domain involved in DNA-binding, RNA-binding, protein-binding, and lipid-binding. There are at least 275 genes in humans that express proteins containing a RING domain. In addition to a cullin and a RING domain protein the CRLs also contain substrate recognition proteins that represent the E2 ubiquitin-conjugating enzymes. The archetypal CRL (CRL1) is also commonly called the SCF complex where SCF stands for SKP1 (S-phase kinase associated protein 1), CUL1, and F-box containing complex. The second E3 ligase family is the HECT domain family. The term HECT is derived from Homologous to the papilloma virus E6 protein associated protein (E6AP) Carboxy Terminus. The third family of E3 ligases is the RBR family where RBR stands for Ring Between Ring.

Because ubiquitylation targets proteins for degradation in the proteasome, there needs to be a mechanism to ensure that inappropriately tagged proteins, i.e. those that are not destined for degradation, can be untagged, and that proteins modified for the purposes of regulation rather than degradation can be deubiquitylated. There is a large family of isopeptidases called deubiquitylating enzymes (DUBs) that carry out this vital function of removing ubiquitin from proteins to which it is either added for temporary regulation or has been mistakenly attached. The DUB enzymes also cleave monoubiquitin from polyubiquitin chains that have been removed from proteins, effectively recycling active ubiquitin. The DUB enzymes are all members of either the large family of cysteine proteases or the large family of metalloproteases. The DUB enzymes that are cysteine proteases are divided into four subfamilies identified as the ubiquitin-specific proteases (USP), the ubiquitin C-terminal hydrolases (UCH), the Machado-Josephin domain proteases (MJD), and the ovarian tumor proteases (OTU). Humans express over 100 genes whose encoded proteins have been identified as DUBs or are putative DUB enzymes with the largest group of DUBs belonging to the cysteine protease family of enzymes.

Process of ubiquitination of proteins

Process of ubiquitylation and proteasome-mediated protein degradation. Protein ubiquitylation begins with the attachment of a monomer of ubiquitin to a member of the ubiquitin activating enzyme (E1) family. The "activated" ubiquitin is then transferred to a member of the ubiquitin-conjugating enzyme (E2) family in a process termed transacylation. The E2-ubiquitin complex is then targeted by a member of the ubiquitin ligase (E3) family of enzymes which transfers the ubiquitin to the target substrate protein. Following ubiquitylation the tagged protein can be degraded in the proteasome.
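
The order of events in the E1, E2, E3 cascade can be made concrete with a small model. The following Python sketch is illustrative only: the class and function names are hypothetical, the E3 ligase is reduced to the final transfer step, and counting one unit of ATP per activation is a simplifying assumption rather than a statement about the precise energetics.

```python
from dataclasses import dataclass


@dataclass
class Enzyme:
    """An E1 or E2 enzyme; ub_bound marks a ubiquitin held as a thiol ester."""
    name: str
    ub_bound: bool = False


@dataclass
class Substrate:
    """A target protein; ub_count is the number of ubiquitins attached."""
    name: str
    ub_count: int = 0


def ubiquitylate_once(e1, e2, substrate, atp):
    """Run one E1 -> E2 -> E3 round and return the ATP units left."""
    if atp < 1:
        raise ValueError("E1 activation is ATP-dependent")
    # Step 1: E1 activates ubiquitin, forming the E1 thiol ester.
    e1.ub_bound = True
    atp -= 1  # assumption: one ATP unit is counted per activation
    # Step 2: ubiquitin passes from E1 to the E2 thiol ester.
    e1.ub_bound = False
    e2.ub_bound = True
    # Step 3: E3 pairs the loaded E2 with the substrate and the
    # ubiquitin ends up on a substrate lysine.
    e2.ub_bound = False
    substrate.ub_count += 1
    return atp
```

Calling ubiquitylate_once repeatedly on the same Substrate mimics polyubiquitylation, and the function raises an error once the ATP supply is exhausted, reflecting the ATP dependence of the E1 step.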

Inactivation of a critical activity such as that catalyzed by the E1 enzymes results in lethality. However, there are numerous pathological states that can be attributed, in part, to mutations in recognition motifs in ubiquitylation substrates and enzymes in the ubiquitylation process. Disease states associated with the ubiquitin modification system can be classified into two groups. One group results from a loss of function mutation in a ubiquitin system enzyme or target protein that results in stabilization of the protein that should normally be degraded. The other group results from gain of function mutations that result in abnormal or accelerated degradation of target proteins. The most obvious disease state that could be expected to arise as a result of defective ubiquitylation processes is cancer. In fact, many malignancies are known to result from defective ubiquitin-mediated degradation of growth promoting proteins such as FOS, MYC, and SRC. Likewise, inappropriate degradation of key regulators of cell cycle progression such as the tumor suppressor p53 and p27KIP1 (CDKN1B), which is an inhibitor of cyclin-dependent kinases (CDKs which control progression through the cell cycle) is also associated with various types of cancer. In addition to cancers, defective ubiquitylation is found associated with neurodegenerative diseases such as Parkinson disease, Alzheimer disease, and amyotrophic lateral sclerosis.

Protein Neddylation

The NEDD8 gene was originally isolated in a screen for abundantly expressed genes in the embryonic mouse brain. There were ten genes isolated in this screen and they were all designated as Neural precursor cell-Expressed, Developmentally Downregulated genes. These ten genes all have distinct functions in a variety of disparate biochemical pathways. The NEDD8 gene encodes a ubiquitin-like protein and of all the UBLs it is the most closely related to ubiquitin exhibiting 59% sequence identity between human NEDD8 and ubiquitin. Like ubiquitin the NEDD8 protein is attached to target proteins and this modification is accomplished by the largest subfamily of E3 ligases, the cullin-RING E3 ubiquitin ligases described earlier. The attachment of NEDD8 to target proteins is a process referred to as neddylation.

Functional NEDD8 protein is localized primarily to the nucleus. Following its synthesis, the precursor NEDD8 protein is proteolytically processed to expose the necessary C-terminal glycine residue that is the amino acid attached to substrate proteins. There are two primary NEDD8 processing enzymes, one that functions on both ubiquitin and NEDD8 (C-terminal hydrolase isozyme 3, UCHL3) and one that is specific for NEDD8 (deneddylase 1, DEN1). Similar to the mechanism of ubiquitylation, neddylation involves E1 activating enzymes, E2 conjugating enzymes, and E3 ligases. The NEDD8 E1 activating enzyme is termed NAE. NAE is a heterodimeric complex composed of ubiquitin activating enzyme 3 (UBA3) and amyloid-β precursor protein binding protein 1 (APPBP1). There are at least two NEDD8-specific E2 conjugating enzymes identified as ubiquitin conjugating enzyme E2 M (UBE2M, also known as UBC12) and UBE2F. All of the NEDD8 E3 ligases also function to attach ubiquitin to target substrates. The cullin-RING E3 ligases, which represent the largest subfamily of ubiquitin E3 ligases, all function in the neddylation reaction. Similar to the process by which ubiquitin can be removed from target proteins through the action of the deubiquitylating (DUB) enzymes, neddylated proteins can also be deneddylated. The deneddylation reaction is catalyzed by the protein encoded by the CSN5 (COP9 signalosome complex subunit 5) gene when this protein is engaged in the eight-subunit COP9 signalosome complex. The COP9 signalosome is the sole regulator (deactivator) of cullin-RING ligase (CRL) complexes. The COP9 signalosome is highly similar to the 19S complexes of the 26S proteasome.

Important substrates for neddylation are the eight human cullin gene encoded proteins. All eight of these proteins contain a highly conserved C-terminal neddylation site (IVRIMKMR) where the lysine (K) residue is the NEDD8 attachment site. As discussed above, the cullin proteins serve as molecular scaffolds for the RING domain family of E3 ubiquitin ligases which together form the CRL family of E3 ligase complexes. Numerous transcription factors have also been identified as being substrates for neddylation. In the case of transcription factors the consequence of neddylation is, in general, a suppression of their activity as a result of altered protein-protein or protein-DNA interactions as well as altered protein stability and subcellular localization. One very clinically relevant transcription factor that is neddylated is the tumor suppressor, p53. In normal cells the p53 protein is trapped in the cytosol and targeted for degradation through interaction with the E3 ubiquitin ligase MDM2. When cells experience DNA damage or other forms of cellular stress p53 is released from MDM2 and enters the nucleus to initiate a program of gene expression designed to allow the cell to respond to the stress event. The MDM2 protein also serves as the E3 ligase for NEDD8 allowing p53 to be neddylated. The effect of neddylation of p53 is not degradation but inhibited transcriptional activity. Other important neddylated transcription factors are members of the E2F family. Neddylation of members of this transcription factor family, like the effect of p53 neddylation, results in down-regulation of their transcriptional activity.


Protein Modification by SUMO

As discussed in the above section, many proteins are post-translationally modified via the addition of ubiquitin. Over the past several years numerous ubiquitin-like (UBL) proteins have been identified. Like ubiquitin, UBLs are added to other proteins via post-translational reactions. However, unlike ubiquitylation which primarily targets proteins for degradation in the proteasome, UBL modifications are not for protein degradation. Although there are several types of UBLs, those with the broadest range of functions and the largest number of known substrates are members of the SUMO (small ubiquitin-related modifier) family of proteins. Over 50 proteins that function in a variety of different capacities within cells have been shown to be modified by SUMOylation. Modification of proteins by SUMO addition has been shown to occur in all tissues and at all developmental stages.

There are four SUMO proteins in mammalian cells designated SUMO-1 to SUMO-4. SUMO-1 is also identified as UBL1, PIC1 [promyelocytic leukemia (PML) interacting clone 1], sentrin, GMP1 [GTPase activating protein (GAP) modifying protein 1], and Smt3c (suppressor of mif two 3 homolog 1c). SUMO-2 is also identified as sentrin 2, Smt3b, and GMP1-related protein. SUMO-3 is also identified as sentrin 3 and Smt3a. SUMO-2 and SUMO-3 differ only at three N-terminal amino acid residues and are, therefore, often referred to collectively as SUMO-2/3. SUMO-2/3 share only 50% amino acid similarity to SUMO-1. SUMO-4 is based upon DNA sequence analysis and exhibits 87% amino acid similarity to SUMO-2. It is believed that the SUMO-4 gene is actually a pseudogene since it does not contain any introns. Although SUMO-4 mRNA has been detected in tissues such as spleen and lymph nodes, no protein product has been detected.

Following de novo synthesis, SUMO proteins must undergo a C-terminal cleavage processing event in order to render the proteins biologically active. The C-terminal cleavage reactions are catalyzed by a family of proteins called SENP (sentrin/SUMO-specific protease). There are at least six SENP proteins in mammals. The removal of the C-terminal amino acids reveals a di-glycine (–G–G) motif that allows the SUMO protein to subsequently conjugate to lysine (K) residues in target proteins. SUMO conjugation to target proteins requires a series of enzymatic steps that is similar in mechanism to the conjugation of ubiquitin to target proteins. SUMO proteins are initially activated via an ATP-dependent reaction catalyzed by the E1 activating complex. This complex is a heterodimer composed of SUMO-activating enzyme E1 (SAE1) and SAE2. This activating reaction forms a covalent bond between an active site cysteine in SAE2 and the C-terminal glycine of the SUMO protein. The next step involves the transfer of the SUMO protein to an active site cysteine of the protein identified as Ubc9 (ubiquitin-conjugating 9). As yet Ubc9 is the only known SUMO-conjugating enzyme in mammalian tissues. Ubc9 brings the SUMO protein to the target protein by recognizing, and binding to, the consensus SUMOylation motif in target proteins. The consensus motif for SUMOylation is the following: ΨKxD/E where Ψ is a large hydrophobic amino acid and x is any amino acid. Approximately 75% of all SUMOylated proteins contain this target motif, however, not all proteins that contain this motif are SUMOylated and some proteins are SUMOylated on lysine residues that do not lie in this motif.
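
The consensus motif lends itself to a simple sequence scan. The following Python sketch is a minimal illustration, not a validated predictor: the function name is hypothetical, and the choice of I, L, V, M, and F as the set of "large hydrophobic" residues for Ψ is an assumption made here for the example. As noted above, a motif match does not guarantee SUMOylation, and some SUMOylated lysines lie outside the motif.

```python
import re

# Assumption: the large hydrophobic residue (Psi) is taken as I, L, V, M or F.
PSI = "ILVMF"

# Psi-K-x-D/E, wrapped in a lookahead so overlapping hits are all reported.
SUMO_MOTIF = re.compile(rf"(?=([{PSI}]K[A-Z][DE]))")


def find_sumo_sites(sequence):
    """Return (1-based lysine position, matched motif) for each motif hit."""
    hits = []
    for match in SUMO_MOTIF.finditer(sequence.upper()):
        # match.start() is the 0-based index of Psi; the lysine follows it.
        hits.append((match.start() + 2, match.group(1)))
    return hits


# Hypothetical peptide used only to exercise the function.
print(find_sumo_sites("MAVKTEGGLKADPP"))
```

For the made-up peptide in the last line the scan reports lysines 4 and 10, which sit in the VKTE and LKAD matches respectively.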

As described above for ubiquitylation, there are three classes of enzyme involved in the process: E1, E2, and E3. Since Ubc9 can directly conjugate SUMO proteins to their targets it was thought that no E3-like activity was required. However, SUMO E3 ligases have been identified and although they do not function directly in the enzymatic process of SUMO attachment to target they do act as a scaffold. SUMO E3 ligases bring Ubc9-SUMO complexes into contact with target proteins. The mammalian SUMO E3 proteins are members of the PIAS [protein inhibitor of activated STAT (signal transducer and activator of transcription)] family of proteins. There are currently five members of the mammalian PIAS protein family.

Whereas ubiquitylation of a protein results in its destruction in the proteasome, SUMOylation is a dynamic process and once attached the SUMO residue can be removed. Removal of SUMO from a target protein is accomplished by the same SENP enzymes that are required for the activation step of SUMO processing. SENP1 and SENP2 have broad specificity for SUMO-1 and SUMO-2/3 and are involved in their processing and deconjugation. SENP3 and SENP5 exhibit a preference for SUMO-2/3. Although SENP6 and SENP7 exhibit the same preference for SUMO-2/3 they are only minimally involved in deconjugation reactions. The primary functions of SENP6 and SENP7 are in editing the length of poly-SUMO-2/3 chains on target proteins. Therefore, it seems clear that SENP1 and SENP2 are responsible for maturation of SUMO proteins and deconjugation of SUMO-1 and SUMO-2/3 conjugated targets. SENP3 and SENP5 function in the removal of monomeric SUMO-2/3 from target proteins. SENP6 and SENP7 function as editors of SUMO-2/3 chains in target proteins.

The exact functional consequences of SUMOylation of a particular target protein are difficult to predict. However, modification of target proteins by SUMO addition is likely to lead to at least three non-mutually exclusive effects. The attachment of SUMO can result in the masking of a site in the target protein that is required for binding or interaction with a substrate protein. The addition of SUMO may alternatively result in the formation of an attachment site allowing for the recruitment of proteins that can now interact with the SUMOylated protein. The third consequence could be that the conformation of the SUMOylated protein is altered such that activity is regulated in some way.

Several Mammalian Substrates for SUMO

The following Table is not intended to represent a complete list of all known SUMO target proteins; it is just a representative list.

Protein Symbol & Name | Protein Function | Role of SUMOylation
Androgen receptor | transcriptional activation | reduces the transcriptional activation activity of the receptor
GLUT1: glucose transporter 1 | glucose transport | exact consequence not fully defined but GLUT1 protein levels are down-regulated by Ubc9
GLUT4: glucose transporter 4 | glucose transport | exact consequence not fully defined but GLUT4 protein levels are up-regulated by Ubc9
HIPK2: homeodomain-interacting protein kinase 2 | transcriptional co-repression | mediates the localization of HIPK2 to nuclear bodies, also called promyelocytic leukemia (PML) bodies
IκBα: inhibitory κBα | inhibitor of NF-κB (nuclear factor κB) signal transduction | inhibits ubiquitylation of IκBα thereby blocking NF-κB activity
Mdm2: originally isolated from mouse tumorigenic cell line 3T3DM | E3 ubiquitin ligase for tumor suppressor p53 | inhibits ubiquitylation of Mdm2 resulting in activation of the E3 function of Mdm2
p53 | tumor suppressor, is a transcription factor activated in response to DNA damage | activates p53 transactivation leading to increased apoptosis
PML: promyelocytic leukemia | tumor suppressor | allows for the formation of nuclear bodies and the recruitment of p53
Topo I: topoisomerase I | topoisomerase involved in DNA replication and repair | exact consequence not fully defined but SUMOylation is induced after DNA damage with camptothecin
Topo II: topoisomerase II | topoisomerase involved in DNA replication and repair | exact consequence not fully defined but SUMOylation is induced after DNA damage with teniposide



Sulfate modification of proteins occurs at tyrosine residues. As many as 1% of all tyrosine residues present in the eukaryotic proteome are modified by sulfate addition making this the most common tyrosine modification. Tyrosine sulfation is accomplished via the activity of tyrosylprotein sulfotransferases (TPST) which are membrane-associated enzymes of the trans-Golgi network. There are two known TPSTs identified as TPST-1 and TPST-2. The universal sulfate donor for these TPST enzymes is 3'-phosphoadenosyl-5'-phosphosulphate (PAPS). Addition of sulfate occurs almost exclusively on secreted and trans-membrane spanning proteins. Since sulfate is added permanently it is necessary for the biological activity and not used as a regulatory modification like that of tyrosine phosphorylation.

Synthesis and structure of 3'-phosphoadenosyl-5'-phosphosulphate (PAPS)

Two-step reaction for synthesis of PAPS. The synthesis of PAPS involves the attachment of sulfate to the α (alpha) phosphate of ATP with the resultant release of the β (beta) and γ (gamma) phosphates as pyrophosphate (PPi), generating adenosine 5'-phosphosulfate, APS. APS is then phosphorylated at the 3'-position of the ribose moiety, with ATP as the phosphate donor, forming the ultimate product, PAPS. Synthesis of PAPS in humans is catalyzed by the bi-functional enzyme 3'-phosphoadenosine 5'-phosphosulfate synthase, PAPSS. PAPSS possesses both the ATP sulfurylase and APS kinase activities that are associated with two separate enzymes in yeasts, bacteria, and plants. Humans express two PAPSS genes identified as PAPSS1 and PAPSS2.

At least 34 human proteins have been identified that are tyrosine sulfated although the total number that are predicted is much higher. In all vertebrates a total of 310 tyrosine sulfated proteins have been identified. It is predicted that the mouse proteome is likely to contain over 2000 tyrosine sulfated proteins. The addition of sulfate to tyrosine is believed to play a role in the modulation of protein-protein interactions of secreted and membrane-bound proteins. The process of tyrosine sulfation has been shown to be critical for the processes of blood coagulation, various immune functions, intracellular trafficking, and ligand recognition by several G-protein-coupled receptors (GPCRs). Some well-known tyrosine sulfated proteins are the coagulation protein factor VIII, and the gut peptides gastrin and cholecystokinin (CCK).


Vitamin C-Dependent Modifications

Modifications of proteins that depend upon vitamin C as a cofactor include proline and lysine hydroxylations and carboxy terminal amidation. The hydroxylating enzymes are identified as prolyl hydroxylase and lysyl hydroxylase. The donor of the amide for C-terminal amidation is glycine. The most important hydroxylated proteins are the collagens. Several peptide hormones such as oxytocin and vasopressin have C-terminal amidation.


Vitamin K-Dependent Modifications

Vitamin K is a cofactor in the carboxylation of glutamic acid residues catalyzed by the enzyme gamma-glutamyl carboxylase (γ-glutamyl carboxylase). The result of this type of reaction is the formation of a γ-carboxyglutamate (gamma-carboxyglutamate), referred to as a gla residue. The gene encoding γ-glutamyl carboxylase is identified as GGCX and is located on chromosome 2p11.2. The GGCX gene spans 13 kbp and consists of 15 exons encoding a 758 amino acid protein. The γ-glutamyl carboxylase protein is an integral membrane protein with three transmembrane spanning domains associated with microsomal membranes.

The overall reaction, resulting in the incorporation of a gla-residue, actually involves a series of three distinct reactions. The reaction catalyzed by γ-glutamyl carboxylase is the one that incorporates the gla-residue but two additional enzyme activities are required to convert vitamin K back to its active hydroquinone (quinol) form. The latter two reactions are catalyzed by vitamin K epoxide reductase (VKORC1). These latter two reactions involve a dithiol conversion to a disulfide. An additional enzyme called vitamin K quinone reductase (VKQR) can also carry out the conversion of the quinone form of vitamin K (as formed by the action of VKORC1 or as obtained from the diet) to the hydroquinone form. This latter reaction utilizes NADH as a co-factor.

Formation of a γ-carboxyglutamate (gla) residue in prothrombin

Incorporation of a gla-residue into prothrombin: The incorporation of a gla-residue into a protein such as prothrombin requires the hydroquinone (KH2) form of vitamin K (either K1, K2, or synthetic K3). The utilization and regeneration of the KH2 form in the overall process of the γ-glutamyl carboxylase (GGCX) reaction is referred to as the vitamin K cycle. Either following the carboxylation, or directly from dietary quinone forms of vitamin K, the action of vitamin K epoxide reductase (VKORC1) is to provide a continuous source of the KH2 form.

The formation of gla residues within several proteins of the blood clotting cascade is critical for their normal function. The presence of gla residues allows the protein to chelate calcium ions and thereby render an altered conformation and biological activity to the protein. The coumarin-based anticoagulants, warfarin and dicumarol, function by inhibiting the second and third enzymes of the overall carboxylation reaction.
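
The dependence of gla formation on regeneration of the hydroquinone can be illustrated with a toy model of the vitamin K cycle. The following Python sketch is a deliberately simplified assumption-laden illustration: the function name and the pool sizes are hypothetical, the two VKORC1 reductions are collapsed into a single regeneration step, and the alternative VKQR route is ignored.

```python
def count_gla_formed(kh2_pool, glu_residues, warfarin):
    """Count gla residues formed from a starting pool of KH2 molecules.

    Each carboxylation by GGCX converts one KH2 to the epoxide. Unless
    warfarin blocks the reductase steps, the epoxide is recycled to KH2.
    """
    epoxide = 0
    gla = 0
    for _ in range(glu_residues):
        if kh2_pool == 0:
            break  # no hydroquinone left, carboxylation stops
        # GGCX step: one Glu becomes gla, one KH2 becomes the epoxide.
        kh2_pool -= 1
        epoxide += 1
        gla += 1
        if not warfarin:
            # VKORC1 steps (collapsed): epoxide -> quinone -> KH2.
            epoxide -= 1
            kh2_pool += 1
    return gla


# Hypothetical numbers: 2 KH2 molecules and 10 Glu residues to modify.
print(count_gla_formed(2, 10, warfarin=False))
print(count_gla_formed(2, 10, warfarin=True))
```

With recycling intact a small KH2 pool supports carboxylation of every residue, whereas with the reductase steps blocked the number of gla residues is capped by the starting pool, which is the logic behind the anticoagulant action described above.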



Selenium is a trace element and is found as a component of several prokaryotic and eukaryotic enzymes that are involved in redox reactions. Two critical redox enzyme families that require selenocysteine residues are the glutathione peroxidase and thioredoxin reductase families. Glutathione peroxidase is a critical enzyme involved in the protection of red blood cells from reactive oxygen species (ROS). This enzyme is a component of a redox system that also involves the enzyme glutathione reductase and NADPH as the terminal electron donor. This system is required for the continued reduction of oxidized glutathione (GSSG) and represents the single most significant system requiring continued glucose metabolism via the Pentose Phosphate Pathway in erythrocytes as the means for the production of the NADPH. Glutathione (GSH) becomes oxidized in the context of reducing various ROS and peroxides and to continue in this capacity the oxidized form needs to be continuously reduced. Humans express eight different glutathione peroxidase genes identified as GPX1 through GPX8. The enzyme encoded by the GPX1 gene (GPx1) is found in the cytosol of nearly all cell types in humans. GPx1 functions almost exclusively to reduce hydrogen peroxide (H2O2) to water. The protein encoded by the GPX3 gene, GPx3, is an extracellular enzyme found primarily in the plasma. The GPX4 encoded enzyme, GPx4, is localized to the intestines and is an extracellular enzyme as well. The GPX1 gene is located on chromosome 3p21.3 and is composed of 2 exons that generate two alternatively spliced mRNAs. The GPX1 coding region contains a polyalanine tract in the N-terminal region of the protein. There are several alleles of this gene that have five, six, or seven alanine repeats. The allele with five alanine repeats has been shown to be highly correlated to increased risk for development of breast cancer. The GPX2 gene is located on chromosome 14q24.1 and is composed of 4 exons. The GPX3 gene is located on chromosome 5q33.1 and is composed of 5 exons. The GPX4 gene is located on chromosome 19p13.3 and is composed of 8 exons. The GPX5 gene is located on chromosome 6p22.1 and is composed of 7 exons. The resultant GPX5 mRNA does not contain the canonical selenocysteine codon (UGA) and thus, the resulting protein does not contain a selenocysteine residue. Expression of the GPX5 gene is regulated by androgens and the gene is expressed exclusively in the epididymis in the male reproductive tract where the expressed protein, GPx5, is involved in protecting spermatozoa membranes from the damaging effects of lipid peroxidation. The GPX6 gene is located on chromosome 6p22.1 and is composed of 5 exons. GPX6 expression is restricted to embryonic tissues and the adult olfactory system. The GPX7 gene is located on chromosome 1p32 and is composed of 3 exons. The GPX8 gene is located on chromosome 5q11.2 and is composed of 3 exons.

As the name of the enzyme implies, thioredoxin reductase is involved in the reduction of thioredoxin which itself is principally involved in the reduction of oxidized disulfide bonds in proteins. The reduction of these disulfide bonds results in oxidation of thioredoxin which then is reduced by thioredoxin reductase. The overall process, like the glutathione peroxidase system, requires NADPH as the terminal electron donor for the reduction process. A critically important reaction that is coupled to the thioredoxin system is the formation of deoxynucleotides. Humans contain three thioredoxin reductase genes that encode three distinct enzymes identified as TrxR1, TrxR2, and TrxR3. The TrxR1 enzyme is functional in the cytosol and is primarily involved in the maintenance of the ribonucleotide reductase system. The TrxR2 enzyme is functional in the mitochondria where it is principally involved in the detoxification of reactive oxygen species (ROS) produced in this organelle. TrxR3 is a testes-specific isoform of the enzyme. The TrxR1 enzyme is encoded by the TXNRD1 gene located on chromosome 12q23–q24.1 and is composed of 18 exons that generate several alternatively spliced mRNAs encoding five different isoforms of TrxR1. The TrxR2 enzyme is encoded by the TXNRD2 gene located on chromosome 22q11.21 and is composed of 19 exons that generate two alternatively spliced mRNAs resulting in two different isoforms of TrxR2. The TrxR3 enzyme is encoded by the TXNRD3 gene located on chromosome 3q21.3 and is composed of 16 exons that generate two alternatively spliced mRNAs resulting in two different isoforms of TrxR3.

The enzymes of the deiodinase family are also important selenocysteine-containing enzymes. Clinically relevant enzymes in this family are the thyroid deiodinases that are critical for the maturation and catabolism of the thyroid hormones. Humans express three different thyroid deiodinase genes identified as DIO1, DIO2, and DIO3. The enzyme encoded by the DIO1 gene, thyroxine deiodinase type I (also called iodothyronine deiodinase type I) is involved in the peripheral tissue conversion of thyroxine (T4) to the bioactive form of thyroid hormone, tri-iodothyronine (T3). In addition to its role in the generation of T3, thyroxine deiodinase I is involved in the catabolism of thyroid hormones. The enzyme encoded by the DIO2 gene, iodothyronine deiodinase type II, is also involved in the conversion of T4 to T3 but does so within the thyroid gland itself. The activity of iodothyronine deiodinase II has been associated with the thyrotoxicosis of Graves disease. The enzyme encoded by the DIO3 gene is involved only in the inactivation (catabolism) of T3 and T4. Expression of the DIO3 gene is highest in the uterus during pregnancy and in fetal and neonatal tissue suggesting a role for this enzyme in the regulation of thyroid hormone levels and functions during early development. The DIO1 gene is located on chromosome 1p33–p32 and is composed of 4 exons that generate four alternatively spliced mRNAs. The DIO2 gene is located on chromosome 14q24.2–q24.3 and is composed of 6 exons that generate four alternatively spliced mRNAs. The DIO3 gene is located on chromosome 14q32 and is an intronless (single exon) gene that encodes a protein of 304 amino acids.

Selenocysteine incorporation in eukaryotic proteins occurs cotranslationally at UGA codons (normally stop codons) via the interactions of a number of specialized proteins and protein complexes. In addition, there are specific secondary structures in the 3′ untranslated regions of selenoprotein mRNAs, termed SECIS elements, that are required for selenocysteine insertion into the elongating protein. One of the complexes required for this important modification is composed of a selenocysteinyl tRNA [(Sec)-tRNA(Ser)Sec] and its specific elongation factor identified as selenoprotein translation factor B (SelB). SelB is also commonly called eukaryotic elongation factor, selenocysteine-tRNA-specific (EEFsec or EFsec). The protein that is involved in the interaction of the SECIS element with the (Sec)-tRNA(Ser)Sec is referred to as SECIS binding protein, SBP2. Additional proteins involved in the synthesis pathway include two selenophosphate synthetases, SPS1 and SPS2, ribosomal protein L30, and two factors that have been shown to bind (Sec)-tRNA(Ser)Sec identified as soluble liver antigen/liver protein (SLA/LP) and SECp43.

Incorporation of selenocysteine during protein synthesis

Selenocysteine biosynthesis and incorporation. The first steps involve the activation of serine onto the (Sec)-tRNA followed by enzymatic conversion to selenocysteine generating (Sec)-tRNA(Ser)Sec. Next the (Sec)-tRNA(Ser)Sec is bound by SelB and the complex is incorporated into the translational machinery aided by SBP2 (not shown). The elongating protein is transferred to the selenocysteinyl-tRNA via the action of peptidyltransferase as for any other incoming amino acid and normal elongation continues.
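
The recoding of UGA from a stop codon to selenocysteine can be illustrated with a toy translation routine. The following Python sketch is a minimal illustration under stated assumptions: the function name and the has_secis flag are hypothetical, the reading frame is assumed to start at the first nucleotide, and every in-frame UGA is treated as selenocysteine (one-letter code U) whenever a SECIS element is present, which ignores the positional and protein factor requirements described above.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(mrna, has_secis):
    """Translate an mRNA, reading UGA as Sec (U) only if has_secis is True."""
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        residue = CODON_TABLE[codon]
        if codon == "UGA" and has_secis:
            residue = "U"  # selenocysteine inserted instead of terminating
        if residue == "*":
            break  # ordinary stop codon
        protein.append(residue)
    return "".join(protein)


# Hypothetical message: AUG GCU UGA GGC UAA
print(translate("AUGGCUUGAGGCUAA", has_secis=True))
print(translate("AUGGCUUGAGGCUAA", has_secis=False))
```

With the SECIS flag set the made-up message yields the four-residue product MAUG, where U denotes selenocysteine, and without it translation stops at the UGA to give MA.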

Michael W King, PhD | © 1996–2016, LLC | info @

Last modified: September 27, 2016