RNA Transcription & Processing

© 1996–2016 themedicalbiochemistrypage.org, LLC | info @ themedicalbiochemistrypage.org













Transcription is the mechanism by which a template strand of DNA is utilized by specific RNA polymerases to generate one of the four distinct classifications of RNA. These four RNA classes are:

1. Messenger RNAs (mRNAs): This class of RNAs provides the genetic coding templates used by the translational machinery to determine the order of amino acids incorporated into an elongating polypeptide during translation.

2. Transfer RNAs (tRNAs): This class of small RNAs forms covalent attachments to individual amino acids and recognizes the encoded sequences of the mRNAs to allow correct insertion of amino acids into the elongating polypeptide chain.

3. Ribosomal RNAs (rRNAs): This class of RNAs is assembled, together with numerous ribosomal proteins, to form the ribosomes. Ribosomes engage the mRNAs and form a catalytic domain into which the tRNAs enter with their attached amino acids. Peptide bond formation itself is catalyzed by the rRNA of the large subunit (the peptidyltransferase activity, a ribozyme), with the ribosomal proteins supporting the structure and function of the ribosome.

4. Small RNAs: This class of RNAs includes the small nuclear RNAs (snRNAs) involved in RNA splicing and the microRNAs (miRNAs) involved in the modulation of gene expression through the alteration of mRNA activity.

All RNA polymerases are dependent upon a DNA template in order to synthesize RNA. The resultant RNA is, therefore, complementary to the template strand of the DNA duplex and identical to the non-template strand. The non-template strand is called the coding strand because its sequences are identical to those of the mRNA. However, in RNA, U is substituted for T and the sequences encoded by intronic DNA are removed from the processed RNAs.
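The relationship between the template strand, the coding strand, and the RNA can be sketched in a few lines of Python. This is only an illustration: the function names and the nine-base template are hypothetical, and introns are ignored.

```python
# Base-pairing rules used when RNA is copied from a DNA template.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Base-pairing rules between the two DNA strands.
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Return the RNA (written 5'->3') copied from a template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def coding_strand(template_3_to_5):
    """Return the coding (non-template) strand, written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5.upper())


template = "TACGGCATT"            # hypothetical template strand, 3'->5'
rna = transcribe(template)        # complementary to the template
coding = coding_strand(template)  # same sequence as the RNA, with T for U

print(rna)     # AUGCCGUAA
print(coding)  # ATGCCGTAA
```

Replacing every T in the coding strand with U reproduces the RNA, which is the sense in which the coding strand is "identical" to the transcript.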

Classes of RNA Polymerases

In prokaryotic cells, all RNA classes are synthesized by a single polymerase. In eukaryotic cells there are three distinct classes of RNA polymerase: RNA polymerase (pol) I, II and III. Each polymerase is responsible for the synthesis of a different class of RNA. The capacity of the various polymerases to synthesize different RNAs was shown with the toxin α-amanitin. At low concentrations of α-amanitin, synthesis of mRNAs is affected but not that of rRNAs or tRNAs. At high concentrations, synthesis of both mRNAs and tRNAs is affected. These observations have allowed the identification of which polymerase synthesizes which class of RNAs.

RNA pol I (RNAP I or RNA polymerase 1) is responsible for rRNA synthesis (excluding the 5S rRNA). The functional enzyme is a large (590 kDa) multi-subunit complex composed of 14 subunits. Twelve of the RNAP I subunits are identical to or related to subunits of the RNAP II complex. The genes that encode the subunits of the RNAP I complex are identified as POLR1 genes, with five distinct genes (POLR1A-POLR1E) expressed in humans. There are four major rRNAs in eukaryotic cells designated by their sedimentation size. The 28S, 5S, and 5.8S rRNAs are associated with the large ribosomal subunit and the 18S rRNA is associated with the small ribosomal subunit.

RNA pol II (RNAP II) in humans is a large 550 kDa complex composed of 12 distinct subunits. The 12 subunits of the RNAP II complex are identified as RPB1-RPB12 and the genes that encode these subunits are POLR2A-POLR2L. The RPB1 subunit carries the actual RNA polymerizing activity of the complex. This subunit is encoded by the POLR2A gene. The function of RNAP II is to synthesize all of the mRNAs, some of the small nuclear RNAs (snRNAs) involved in RNA splicing, and several microRNAs.

RNA pol III (RNAP III) is also a multisubunit complex and is composed of at least 17 proteins. Ten of the RNAP III subunits are unique to this complex, two are common with subunits of RNAP I, and five are common to all three RNAP complexes. The genes encoding the RNAP III-specific proteins are identified as POLR3A-POLR3H. All of the RNAs transcribed by RNAP III are small stable untranslated RNAs. The products of RNAP III include all of the tRNAs, the 5S rRNA, several microRNAs, and the U6 small nuclear RNA (snRNA) of the splicing machinery.

Mechanism of RNA Polymerases

Synthesis of RNA exhibits several features that parallel DNA replication. RNA synthesis requires accurate and efficient initiation, elongation proceeds in the 5'→3' direction (i.e. the polymerase moves along the template strand of DNA in the 3'→5' direction), and RNA synthesis requires distinct and accurate termination. Transcription also exhibits several features that are distinct from replication:

1. Transcription initiates, both in prokaryotes and eukaryotes, from many more sites than replication.

2. There are many more molecules of RNA polymerase per cell than DNA polymerase.

3. RNA polymerase proceeds at a rate much slower than DNA polymerase (approximately 50–100 bases/sec for RNA versus near 1000 bases/sec for DNA).

4. Finally, the fidelity of RNA polymerization is much lower than that of DNA polymerization. This is tolerable since aberrant RNA molecules can simply be turned over and new correct molecules made.
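To put the rates in point 3 in perspective, the short calculation below estimates how long each polymerase would need to copy the same stretch of DNA. The 10,000-base length is a hypothetical round number chosen for illustration; the rates are the approximate values quoted above.

```python
def seconds_to_copy(length_nt, rate_nt_per_sec):
    """Time needed to copy length_nt bases at a constant elongation rate."""
    return length_nt / rate_nt_per_sec


gene_length = 10_000  # hypothetical length in bases

rna_slow = seconds_to_copy(gene_length, 50)    # RNA polymerase, lower bound
rna_fast = seconds_to_copy(gene_length, 100)   # RNA polymerase, upper bound
dna_time = seconds_to_copy(gene_length, 1000)  # DNA polymerase

print(rna_slow, rna_fast, dna_time)  # 200.0 100.0 10.0
```

Under these assumptions RNA polymerase takes 10 to 20 times longer than DNA polymerase to traverse the same template.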

Processes of Transcription

Signals are present within the DNA template that act in cis to stimulate the initiation of transcription. These sequence elements are termed promoters. Promoter sequences promote the ability of RNA polymerases to recognize the nucleotide at which initiation begins. Additional sequence elements are present within genes that act in cis to enhance polymerase activity even further. These sequence elements are termed enhancers. Transcriptional promoter and enhancer elements are important sequences used in the control of gene expression.

E. coli RNA polymerase is composed of 5 distinct polypeptide chains. Association of several of these generates the RNA polymerase holoenzyme. The sigma subunit is only transiently associated with the holoenzyme. This subunit is required for accurate initiation of transcription by providing polymerase with the proper cues that a start site has been encountered. In both prokaryotic and eukaryotic transcription the first incorporated ribonucleotide is a purine and it is incorporated as a triphosphate. In E. coli several additional nucleotides are added before the sigma subunit dissociates.

The process of eukaryotic mRNA transcriptional initiation is an extremely complex event. There are numerous protein factors controlling initiation, some of which are basal factors present in all cells while others are specific to cell type and/or the differentiation state of the cell. Two basal promoter elements that are found in essentially all eukaryotic mRNA genes are the TATA-box and the CAAT-box. Many constitutively expressed mRNA genes (house-keeping genes) also contain a GC-box promoter element (generally GGGCGG). These elements are so called because of the DNA sequences that constitute the promoter element. The TATA-box can be found approximately 25–100 bases upstream (written -25 to -100) of the start site for transcription and the CAAT-box is generally in the -70 to -150 position. The TATA-box sequences are found ONLY in the coding strand of the gene (i.e. the strand that has the sequences identical to the resulting mRNA) while the CAAT-box and GC-box sequences are most often found in the template strand but can also reside in the coding strand.

Many of the basal transcription factors are identified by the fact that they control the activity of RNA pol II. Thus, the nomenclature of these proteins is TFII, for transcription factor of RNA pol II. TFIID is the factor that binds to the TATA-box and its binding is facilitated by TFIIA. Once TFIID and TFIIA are bound, TFIIB binds and this recruits RNA pol II to the promoter. Next, TFIIE and TFIIH bind.
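The positional rules for promoter elements described above can be illustrated with a simple motif scan. This is a minimal sketch, not a real promoter-prediction method: the TATA-box consensus TATAAA is an assumption (the text does not give a sequence), the GC-box sequence GGGCGG is taken from the text, and the function names and the toy sequence are hypothetical. Positions are reported relative to the transcription start site, so the base immediately upstream of the start site is -1.

```python
TATA_BOX = "TATAAA"  # assumed consensus, not stated in the text
GC_BOX = "GGGCGG"    # sequence given in the text


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(dna))


def motif_positions(coding, motif, tss_index):
    """Positions (relative to the start site) where motif begins upstream."""
    positions = []
    start = coding.find(motif, 0, tss_index)
    while start != -1:
        positions.append(start - tss_index)
        start = coding.find(motif, start + 1, tss_index)
    return positions


def scan_promoter(coding, tss_index):
    """Report TATA-box and GC-box hits, written in coding-strand coordinates."""
    return {
        "TATA-box (coding strand)": motif_positions(coding, TATA_BOX, tss_index),
        "GC-box (coding strand)": motif_positions(coding, GC_BOX, tss_index),
        # A GC-box on the template strand appears on the coding strand
        # as its reverse complement.
        "GC-box (template strand)": motif_positions(
            coding, reverse_complement(GC_BOX), tss_index
        ),
    }


# Toy coding strand: 60 upstream bases followed by the transcribed region.
upstream = "C" * 5 + GC_BOX + "C" * 19 + TATA_BOX + "C" * 24
gene = upstream + "ATGGCT"
hits = scan_promoter(gene, tss_index=len(upstream))
print(hits)
```

In this toy sequence the TATA-box is reported at -30 and the GC-box at -55, both on the coding strand. Real promoters tolerate sequence variation that an exact-match search like this would miss.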

TFIIH is in fact a complex of proteins and this complex is not only involved in transcription but also in certain steps of DNA repair. The role of TFIIH in DNA repair can be seen as critical since defects in its function are responsible for certain forms of xeroderma pigmentosum. The critical role of TFIIH in transcription initiation is due to the fact that one of the proteins of the complex is a kinase that phosphorylates serine residues in the C-terminal domain (CTD) of the large subunit of RNA pol II. This kinase subunit of TFIIH is called Kin28 in yeast (CDK7 in humans). The CTD contains a tandem repeat sequence that is composed of the consensus heptad of amino acids Y1-S2-P3-T4-S5-P6-S7, which can be repeated from 25 to 52 times. It is Ser5 and Ser7 that become phosphorylated during transcriptional initiation. These serines are different from the serine (Ser2) phosphorylated in the CTD by P-TEFb, which is involved in the capping process as discussed below. After transcriptional initiation has commenced and RNA pol II moves down the DNA template, factors TFIIA and TFIID remain on the promoter to allow for additional rounds of initiation to take place.
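The heptad numbering can be made concrete with a small sketch that lists which residues of an idealized CTD each kinase would target. It assumes a CTD built only from perfect consensus repeats, which is a simplification, and the function name is hypothetical. The kinase-to-serine assignments are those stated in the text.

```python
HEPTAD = "YSPTSPS"  # consensus repeat Y1-S2-P3-T4-S5-P6-S7

# Heptad positions (1-based) phosphorylated by each kinase, per the text.
KINASE_TARGETS = {"TFIIH": (5, 7), "P-TEFb": (2,)}


def phospho_sites(n_repeats, kinase):
    """Return 1-based CTD residue numbers targeted by a kinase.

    Assumes an idealized CTD made of n_repeats perfect consensus heptads.
    """
    sites = []
    for repeat in range(n_repeats):
        for position in KINASE_TARGETS[kinase]:
            # Every targeted position must be a serine in the consensus.
            assert HEPTAD[position - 1] == "S"
            sites.append(repeat * len(HEPTAD) + position)
    return sites


print(phospho_sites(2, "TFIIH"))         # [5, 7, 12, 14]
print(phospho_sites(2, "P-TEFb"))        # [2, 9]
print(len(phospho_sites(52, "TFIIH")))   # 104
```

With 52 perfect repeats, TFIIH would have 104 candidate serines and P-TEFb 52.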

Elongation involves the addition of the 5'–phosphate of ribonucleotides to the 3'–OH of the elongating RNA with the concomitant release of pyrophosphate. Nucleotide addition continues until specific termination signals are encountered. Following termination the core polymerase dissociates from the template. In prokaryotic transcription, the core and sigma subunit can then reassociate forming the holoenzyme again ready to initiate another round of transcription.

In E. coli, transcriptional termination occurs by both factor-dependent and factor-independent means. Two structural features of all E. coli genes that terminate factor-independently have been identified. One feature is the presence of 2 symmetrical GC-rich segments that are capable of forming a stem-loop structure in the RNA and the second is a downstream A-rich sequence in the template. The formation of the stem-loop in the RNA destabilizes the association between polymerase and the DNA template. This is further destabilized by the weaker nature of the AU base pairs that are formed, between the template and the RNA, following the stem-loop. This leads to dissociation of polymerase and termination of transcription. Most genes in E. coli terminate by this method. Factor-dependent termination requires the recognition of termination sequences by the termination protein, rho. The rho factor recognizes and binds to sequences in the 3' portion of the RNA. This binding destabilizes the polymerase-template interaction leading to dissociation of the polymerase and termination of transcription.
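The two features of a factor-independent terminator can be searched for with a simple pattern scan of the RNA: a GC-rich segment, a short loop, the reverse complement of the first segment, and then a run of U residues (the RNA copy of the A-rich template sequence). The sketch below is a toy detector, not a validated prediction tool: the stem length, loop range, GC threshold, and U-run length are all assumed parameters, perfect base pairing is required, and the example RNA is hypothetical.

```python
def rna_reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


def gc_fraction(segment):
    """Fraction of G and C residues in a segment."""
    return sum(base in "GC" for base in segment) / len(segment)


def find_terminators(rna, stem_len=6, loop_range=(3, 8), min_gc=0.6, u_run=4):
    """Return (stem_start, loop_length) for each stem-loop followed by a U run."""
    hits = []
    for i in range(len(rna) - 2 * stem_len):
        left = rna[i:i + stem_len]
        if gc_fraction(left) < min_gc:
            continue
        partner = rna_reverse_complement(left)
        for loop in range(loop_range[0], loop_range[1] + 1):
            j = i + stem_len + loop                      # start of right arm
            right = rna[j:j + stem_len]
            tail = rna[j + stem_len:j + stem_len + u_run]
            if right == partner and tail == "U" * u_run:
                hits.append((i, loop))
    return hits


# Hypothetical RNA: 4-base leader, GC-rich stem, 4-base loop, U-rich tail.
toy_rna = "AUAC" + "GCCGCC" + "GAAA" + "GGCGGC" + "UUUUUU" + "AC"
print(find_terminators(toy_rna))  # [(4, 4)]
```

The single hit starts at index 4 with a four-base loop, matching the way the toy sequence was assembled.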

Co- and post-transcriptional Processing of RNAs

When transcription of bacterial rRNAs and tRNAs is completed they are immediately ready for use in translation. No additional processing takes place. Translation of bacterial mRNAs can begin even before transcription is completed due to the lack of the nuclear-cytoplasmic separation that exists in eukaryotes. The ability to initiate translation of prokaryotic RNAs while transcription is still in progress affords a unique opportunity for regulating the transcription of certain genes. An additional feature of bacterial mRNAs is that most are polycistronic. This means that multiple polypeptides can be synthesized from a single primary transcript.

Polycistronic mRNAs are very rare in eukaryotic cells but have been identified. The mitochondrial genomes in mammals and the slime mold, Dictyostelium discoideum, encode polycistronic mRNAs that are processed into primarily mono-, di-, and tricistronic transcripts. In addition, several viruses encode polycistronic RNAs.

In contrast to bacterial transcripts, eukaryotic RNAs (all 3 classes) undergo significant processing, some of which occurs co-transcriptionally and some post-transcriptionally. All 3 classes of RNA are transcribed from genes that contain introns. The sequences encoded by the intronic DNA must be removed from the primary transcript before the RNAs are biologically active. The process of intron removal is called RNA splicing. Additional processing occurs to mRNAs. The 5' end of all eukaryotic mRNAs is capped with a 7-methylguanosine residue attached through a unique 5'-to-5' linkage. The capped end of the mRNA is thus protected from exonucleases and, more importantly, is recognized by specific proteins of the translational machinery. The capping process occurs when the nascent mRNA is around 20–30 bases long. At this point RNA pol II pauses and the kinase positive transcription elongation factor b (P-TEFb) phosphorylates RNA pol II on the serine-2 residue (Ser2) in the repeat unit of the C-terminal domain (CTD) of the large subunit of the enzyme. P-TEFb is composed of cyclin-dependent kinase 9 (CDK9) and either cyclin T1, T2, or K. The complex is also called C-terminal domain kinase 1 (CTDK1). This pausing and regulatory phosphorylation event allows for the potential of attenuation in the rate of transcription.

Structure of the 5'-Cap of eukaryotic mRNAs

Messenger RNAs are also polyadenylated at the 3' end. A specific sequence, AAUAAA, is recognized by the endonuclease activity of polyadenylate polymerase, which cleaves the primary transcript approximately 11–30 bases 3' of the sequence element. A stretch of 20–250 A residues is then added to the 3' end by the polyadenylate polymerase activity.

Processes of polyadenylation
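The cleavage and tailing steps can be sketched as a string operation on a primary transcript. This is only an illustration of the positional rules given above: the function name, the default cut offset, and the tail length are assumptions chosen from within the quoted ranges, and the example transcript is hypothetical.

```python
POLY_A_SIGNAL = "AAUAAA"


def polyadenylate(pre_mrna, cut_offset=20, tail_len=200):
    """Cleave downstream of the first AAUAAA and append a poly(A) tail.

    cut_offset is the number of bases kept 3' of the signal (11-30 in the
    text); tail_len is the number of A residues added (20-250 in the text).
    Returns None when no usable signal is found.
    """
    if not 11 <= cut_offset <= 30:
        raise ValueError("cut_offset should be between 11 and 30 bases")
    if not 20 <= tail_len <= 250:
        raise ValueError("tail_len should be between 20 and 250 residues")
    signal_start = pre_mrna.find(POLY_A_SIGNAL)
    if signal_start == -1:
        return None
    cut_site = signal_start + len(POLY_A_SIGNAL) + cut_offset
    if cut_site > len(pre_mrna):
        return None  # transcript ends before the cleavage site
    return pre_mrna[:cut_site] + "A" * tail_len


primary = "GGCU" + POLY_A_SIGNAL + "C" * 40   # hypothetical primary transcript
mature = polyadenylate(primary, cut_offset=15, tail_len=20)
print(mature)
```

In the example, the 40 bases following the signal are trimmed to 15 and a 20-residue tail is added, giving a 45-base product.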

In addition to intron removal in tRNAs, extra nucleotides at both the 5' and 3' ends are cleaved, the sequence 5'–CCA–3' is added to the 3' end of all tRNAs and several nucleotides undergo modification. There have been more than 60 different modified bases identified in tRNAs.

Both prokaryotic and eukaryotic rRNAs are synthesized as long precursors termed pre-ribosomal RNAs. In eukaryotes a 45S pre-ribosomal RNA serves as the precursor for the 18S, 28S and 5.8S rRNAs.

Splicing of RNAs

There are several different classes of reactions involved in intron removal. The 2 most common are the group I and group II introns. Group I introns are found in nuclear, mitochondrial and chloroplast rRNA genes, group II in mitochondrial and chloroplast mRNA genes. Many of the group I and group II introns are self-splicing, i.e. no additional protein factors are necessary for the intron to be accurately and efficiently spliced out.

Group I introns require an external guanosine nucleotide as a cofactor. The 3'–OH of the guanosine nucleotide acts as a nucleophile to attack the 5'–phosphate of the 5' nucleotide of the intron. The resultant 3'–OH at the 3' end of the 5' exon then attacks the 5' nucleotide of the 3' exon releasing the intron and covalently attaching the two exons together. The 3' end of the 5' exon is termed the splice donor site and the 5' end of the 3' exon is termed the splice acceptor site.

Splicing by Group 1 Introns

Splicing by Group 2 Introns

Group II introns are spliced similarly except that instead of an external nucleophile the 2'–OH of an adenosine residue within the intron is the nucleophile. This residue attacks the phosphate at the 5' end of the intron, releasing the 5' exon and forming an internal loop called a lariat structure. The 3' end of the 5' exon then attacks the 5' end of the 3' exon, as in group I splicing, releasing the intron and covalently attaching the two exons together.

The third class of introns is also the largest class found in nuclear mRNAs. This class of introns undergoes a splicing reaction similar to group II introns in that an internal lariat structure is formed. However, the splicing is catalyzed by specialized RNA–protein complexes called small nuclear ribonucleoprotein particles (snRNPs, pronounced snurps). The RNAs found in snRNPs are identified as U1, U2, U4, U5 and U6. The genes encoding these snRNAs are highly conserved in vertebrates and insects and are also found in yeasts and slime molds, indicating their importance.

Analysis of a large number of mRNA genes has led to the identification of highly conserved consensus sequences at the 5' and 3' ends of essentially all mRNA introns.

Consensus sequences for exon/intron splicing sites

The U1 RNA has sequences that are complementary to sequences near the 5' end of the intron. The binding of U1 RNA distinguishes the GU at the 5' end of the intron from other randomly placed GU sequences in mRNAs. The U2 RNA also recognizes sequences in the intron, in this case near the 3' end. The addition of U4, U5 and U6 RNAs forms a complex identified as the spliceosome that then removes the intron and joins the two exons together.
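The end result of spliceosomal splicing, removal of an intron bounded by conserved dinucleotides and joining of the flanking exons, can be sketched as follows. The GU at the 5' end of the intron is stated above; the AG at the 3' end is the usual counterpart in nuclear mRNA introns and is treated here as an assumption. The function name and sequences are hypothetical, and the check is far simpler than real splice-site recognition.

```python
def splice(pre_mrna, intron_start, intron_end):
    """Remove pre_mrna[intron_start:intron_end] and join the flanking exons.

    Raises ValueError unless the intron begins with GU and ends with AG.
    """
    intron = pre_mrna[intron_start:intron_end]
    if not (intron.startswith("GU") and intron.endswith("AG")):
        raise ValueError("intron must begin with GU and end with AG")
    return pre_mrna[:intron_start] + pre_mrna[intron_end:]


exon1 = "AUGGCCAAG"                      # hypothetical 5' exon
intron = "GUAAGUCCCUACUAACUUUUCCCAG"     # hypothetical intron, GU...AG
exon2 = "GUGAAAUAA"                      # hypothetical 3' exon

pre_mrna = exon1 + intron + exon2
mrna = splice(pre_mrna, len(exon1), len(exon1) + len(intron))
print(mrna)  # AUGGCCAAGGUGAAAUAA
```

Note that the 3' exon in this example also begins with GU, which is why the dinucleotide alone cannot define a splice site and additional recognition by U1 is needed.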

An additional mechanism of intron removal is the process of tRNA splicing. These introns are spliced by a specific splicing endonuclease that involves a cut-and-paste mechanism. In order for tRNA intron removal to occur the tRNA must first be properly folded into its characteristic cloverleaf shape. Misfolded precursor tRNAs are not processed which allows the splicing reaction to serve as a control step in the generation of mature tRNAs.

RNA Editing

RNA editing was a term first used to describe an unusual form of post-transcriptional processing involving the insertion of uridine (U) residues into a mitochondrial mRNA found in Trypanosoma brucei. This particular form of editing was then found to occur in many eukaryotic mRNAs. The process of RNA editing is now known to encompass a wide variety of mechanistically unrelated processes that change the nucleotide sequence of an RNA species relative to that directed by the encoding DNA. Currently RNA editing systems are divided into two general classes: substitution and insertion/deletion. In the first class, the coding sequences of a mature RNA and its gene are co-linear as they contain the same number of nucleotides but differ in nucleotide sequence where editing has occurred. In the second class, the nucleotide sequence of the mature RNA product is not co-linear with that of its DNA coding sequence since the final RNA product contains extra nucleotides relative to the encoding gene. All of the major types of cellular RNA (mRNA, rRNA, and tRNA) have been shown to be subject to editing in different organisms.

The term "RNA editing" is not used to refer to RNA modifications such as 5'-capping, splicing, and 3'-polyadenylation, nor to the formation of modified nucleosides in RNA (as is typical in tRNAs). However, it is important to keep sight of the fact that the distinctions between “RNA editing” and “RNA modification” can be less than obvious. To illustrate this fact, consider that there are instances of RNA editing involving deamination of A residues forming I (inosine) residues (see next section). If this editing occurs in the coding region of an mRNA, the edited site (I) is recognized as G during translation. However, it is also known that A residues in the wobble position of tRNA anticodons (the 5'-nucleotide) undergo deamination (by an evolutionarily related enzyme) to I, which similarly results in a change in the anticodon pairing properties. Thus, under these circumstances editing and modification can result in the same effects at the level of the resultant protein.

RNA editing systems have been identified that result in changes in A residues to I residues, referred to as A-to-I editing systems, or changes in C residues to U residues, referred to as C-to-U editing systems. The enzymes that catalyze the A-to-I edits are members of a family of adenosine deaminases that act on RNA (ADARs). This distinguishes these enzymes from the adenosine deaminase involved in the catabolism and salvage of purine nucleotides. The enzymes that catalyze C-to-U edits are called cytidine deaminases that act on RNA (CDARs). A comparative analysis of ADAR and CDAR sequences demonstrated that they all belong to a superfamily of RNA-dependent deaminases that also includes tRNA-specific deaminases (ADATs). A common feature of ADARs, CDARs, and ADATs is the presence in the deaminase domain of conserved residues that are essential for catalysis. All three types of deaminases likely arose from an ancestral cytidine deaminase via the acquisition of RNA-binding domains.

The clinical significance of the editing of human RNAs is demonstrated by the observations that mutations in the ADAR1 gene are associated with a rare autosomal skin pigmentation disorder (dyschromatosis symmetrica hereditaria, DSH) and with Aicardi-Goutières syndrome (AGS), an early-onset encephalopathy that often results in severe and permanent neurological damage. Defective RNA editing is also associated with a number of neurological diseases including suicidal depression, epilepsy, schizophrenia, and amyotrophic lateral sclerosis (Lou Gehrig disease).

A-to-I Editing

The process of A-to-I editing occurs on nuclear transcripts and is catalyzed by a family of enzymes referred to as ADARs. ADAR activity was initially characterized as a double-stranded RNA (dsRNA) unwinding activity and, as such, these observations emphasize that ADARs are dsRNA-binding proteins and that their catalytic activity is directed toward duplex regions in RNA. Although the most biologically significant function of ADARs is site-specific deamination in mRNA, it is known that RNA duplex regions in several types of non-coding RNAs, including microRNAs (miRNAs) and small interfering RNAs (siRNAs), as well as some viral RNAs, are also substrates for ADARs.

Three mammalian ADAR genes give rise to four known isoforms: ADAR1p150, ADAR1p110, ADAR2 and ADAR3. Alternative promoter usage within the ADAR1 gene generates the full length (ADAR1p150) isoform and an N-terminally truncated (ADAR1p110) isoform. Both ADAR1 isoforms contain three dsRNA-binding domains and the deaminase domain. The ADAR1 variants and ADAR2 are expressed in many tissues, whereas the ADAR3 protein is only expressed in the brain. Although ADAR3 is presumed to be catalytically inactive, it may compete with ADAR1 and ADAR2 for RNA substrates, thereby altering the overall profile of edited RNAs.

The vast majority of A residues that are targets for editing are localized near splice junctions in the pre-mRNA. The formation of dsRNA substrates for ADARs in intronic sequences could, therefore, obscure splice sites from the splicing machinery, resulting in alternative splicing events. In addition, the editing of select A residues could lead to the creation or elimination of splicing sites, which could also result in alternative splicing events.

A-to-I editing occurs in more RNAs than does C-to-U editing. By far, most of the mammalian mRNAs found to undergo A-to-I editing are expressed in the nervous system. Physiologically significant examples are transcripts of the ionotropic glutamate receptor (GluR) family and the serotonin receptor family. In both cases the deamination of exonic A residues leads to single amino acid changes in the resulting proteins.

Editing of glutamate receptor mRNA occurs specifically in the mRNA encoding the GluA2 (GluR2) subunit of the 2-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) receptors. Editing of the GluA2 mRNA occurs at two non-synonymous sites termed the Q/R and R/G sites. These sites are so-called because the editing results in the change of a glutamine residue for an arginine residue in the first site and a change of arginine for glycine in the second. The Q/R site is encoded by exon 11 and resides within the second transmembrane domain (TMII) of the protein. The R/G site is located just one nucleotide from the boundary between exon 13 and the downstream intron. When this site is edited, splicing favors inclusion of exon 15 over that of exon 14. With respect to the Q/R site, editing has a profound effect on the calcium permeability of the resulting AMPA receptor. Calcium permeability of all AMPA receptor isoforms is controlled by the GluA2 subunit. In unedited GluA2 proteins the presence of the Q residue allows Ca2+ permeability whereas the edited amino acid (R) does not. Almost all of the GluA2 present in the human brain is edited. The importance of GluA2 mRNA editing can be demonstrated by the phenotype of ADAR2 knockout mice. These mice have significantly reduced editing of the Q/R site which causes them to be highly seizure-prone, and they die within 3 weeks of birth.
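Because inosine is read as G during translation, the amino acid changes at the Q/R and R/G sites follow directly from the genetic code. The sketch below shows this with a deliberately tiny codon table. The codons CAG and AGA are used as examples consistent with the standard genetic code; they and the function names are assumptions for illustration, not sequence data taken from the text.

```python
# Minimal excerpt of the standard genetic code (one-letter amino acids).
CODON_TABLE = {"CAG": "Q", "CGG": "R", "AGA": "R", "GGA": "G"}


def edit_a_to_i(codon, position):
    """Deaminate the adenosine at a 0-based position, producing inosine (I)."""
    if codon[position] != "A":
        raise ValueError("only adenosine can be edited to inosine")
    return codon[:position] + "I" + codon[position + 1:]


def decode(codon):
    """Translate a codon, reading inosine as guanosine."""
    return CODON_TABLE[codon.replace("I", "G")]


# Q/R-type change: glutamine codon edited at its middle position.
qr_edited = edit_a_to_i("CAG", 1)
print(decode("CAG"), "->", decode(qr_edited))   # Q -> R

# R/G-type change: arginine codon edited at its first position.
rg_edited = edit_a_to_i("AGA", 0)
print(decode("AGA"), "->", decode(rg_edited))   # R -> G
```

A single deamination is therefore enough to swap the uncharged glutamine for the positively charged arginine, which is the change that abolishes Ca2+ permeability at the Q/R site.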

Editing of the serotonin receptor mRNA occurs specifically in the 5-HT2C subtype within the cells of the prefrontal cortex. This mRNA, encoded by the HTR2C gene, contains five sites that are A-to-I edited. These sites are referred to as A, B, C' (E), C, and D. The most commonly detected edited 5-HT2C mRNAs are edited at the AC'C, ABD, and ABCD combination sites. There is a strong correlation between severe psychiatric behaviors and 5-HT2C mRNA editing combinations. In victims of suicide who had been diagnosed with a history of major depression, the level of C' editing is much higher and the level of D editing is significantly lower than in unaffected individuals. Interestingly, when mice are treated with the antidepressant fluoxetine, the pattern of C, C', and D editing in the 5-HT2C mRNA is the exact opposite of that observed in victims of suicide.

A-to-I editing also occurs in the non-coding region of the ADAR2 pre-mRNAs. The consequence of ADAR2 editing its own mRNA is the generation of an alternative splice acceptor site in intron 1, resulting in an alternative splicing event that creates a nonfunctional ADAR2 protein.

The A-to-I editing process also influences the biogenesis and target recognition of siRNAs involved in the RNAi pathway. siRNA biogenesis requires processing of long dsRNA precursors into 21- to 23-nucleotide RNA duplexes which ultimately initiate transcriptional and post-transcriptional sequence-specific silencing. For details on the processing of siRNAs (and miRNAs) go to the Control of Gene Expression page. The RNA editing and RNAi pathways both involve dsRNAs; therefore, editing could potentially antagonize the RNAi pathway. A-to-I edits could alter the required dsRNA structures of siRNAs (and miRNAs), leading to reduced processing and thus fewer functional siRNAs. In addition, editing of siRNAs and miRNAs could change their proper targeting to sequence-specific silencing sites in target mRNAs.

C-to-U Editing

The first reported instance of C-to-U editing was within the mRNA encoding apolipoprotein B (apoB). Editing of the apoB mRNA changes a CAA codon to a UAA translational stop codon leading to premature termination of protein synthesis. When the apoB gene is transcribed within hepatocytes the mRNA is not edited and a full-length apoB protein is generated called apoB-100. This apolipoprotein (apoB-100) is found exclusively with the VLDL particles produced and secreted by the liver. Within intestinal enterocytes, the apoB mRNA is edited resulting in the generation of a smaller protein called apoB-48. This apolipoprotein (apoB-48) is found exclusively associated with chylomicrons, the lipoprotein particles produced by the intestines and released to the lymphatic system. C-to-U editing of the apoB mRNA requires a single-stranded RNA template with well defined characteristics in the immediate vicinity of the edited base, as well as protein cofactors that assemble into a functional complex referred to as a holoenzyme or editosome. This functional complex includes a minimal core composed of apolipoprotein B mRNA editing enzyme, catalytic polypeptide 1 (APOBEC-1; the catalytic deaminase) and a competence factor, APOBEC-1 complementation factor (A1CF). The function of A1CF is to act as an adaptor protein by binding both the APOBEC-1 enzyme and the mRNA substrate.

Another example of C-to-U mRNA editing involves site-specific deamination of a CGA to UGA codon in the neurofibromatosis type 1 (NF1) mRNA. The NF1 mRNA encodes a protein identified as neurofibromin 1. The editing of the NF1 mRNA introduces a translational stop codon at position 3916 that results in a truncation of the neurofibromin 1 protein in a critical domain involved in GTPase activation. Although no demonstration of a truncated NF1 protein has been shown, the editing of the NF1 mRNA has been demonstrated in peripheral nerve sheath tumors from patients with type 1 neurofibromatosis.

A third C-to-U edited mRNA encodes eukaryotic initiation factor 4, gamma 2, eIF-4G2 (also identified as p97, DAP5, and NAT1), which is a translational repressor that may be involved in repression of global translation. The editing of the eIF-4G2 mRNA was identified in studies that demonstrated the oncogenic potential of APOBEC-1 when it was overexpressed in experimental animals. In these studies it was found that the eIF-4G2 mRNA underwent C-to-U editing at multiple sites, creating stop codons that in turn reduced the abundance of the eIF-4G2 protein. The eIF-4G2 protein has a crucial role in early embryogenesis since eIF-4G2-negative embryos die during gastrulation. Although the precise mechanism through which elevated APOBEC-1 activity leads to dysplasia and cancer is not yet defined, host adaptations have been shown to modulate the expression of APOBEC-1 in sporadic human colorectal cancers.

Editing of the apoB mRNA: When the apoB gene is expressed in the liver the resulting mRNA is not edited and is translated into the full-length apoB-100 protein present in VLDL. When the gene is transcribed in the intestines, editing of the mRNA converts a CAA codon to a translational stop codon (UAA) resulting in the translation of a truncated apoB-48 protein that is present in chylomicrons.
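The effect of a single C-to-U edit on the translated product can be shown with a toy open reading frame. The sequence, the small codon table, and the function names below are hypothetical and unrelated to the real apoB sequence; only the CAA-to-UAA conversion comes from the text.

```python
# Small excerpt of the standard genetic code; "*" marks a stop codon.
CODONS = {"AUG": "M", "GCU": "A", "CAA": "Q", "AAA": "K", "GAU": "D", "UAA": "*"}


def translate(mrna):
    """Translate an mRNA from its first base until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODONS[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def c_to_u(mrna, index):
    """Deaminate the cytidine at a 0-based index, producing uridine."""
    if mrna[index] != "C":
        raise ValueError("only cytidine can be edited to uridine")
    return mrna[:index] + "U" + mrna[index + 1:]


unedited = "AUGGCUCAAAAAGAUUAA"   # toy mRNA: AUG GCU CAA AAA GAU UAA
edited = c_to_u(unedited, 6)      # CAA -> UAA at the third codon

print(translate(unedited))  # MAQKD (full-length product)
print(translate(edited))    # MA    (truncated product)
```

The unedited message plays the role of the hepatic transcript and the edited message that of the intestinal transcript, with the premature stop codon shortening the protein.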

The APOBEC-1 deaminase is encoded by the APOBEC1 gene located on chromosome 12p13.1 and is composed of 6 exons that generate three alternatively spliced mRNAs that encode two distinct protein isoforms. The APOBEC1 gene is a member of a large cytidine deaminase gene family but is the only member of the family that encodes an mRNA-specific editing enzyme. All the other members of the family function primarily to edit cytidine residues in different types of DNA molecules. The other members of the family include APOBEC2, APOBEC3A, APOBEC3B, APOBEC3C, APOBEC3D, APOBEC3F, APOBEC3G, APOBEC3H, APOBEC4, and activation-induced cytidine deaminase (AICDA). Although the APOBEC3A encoded protein functions principally to deaminate cytidines of single-stranded DNA and to inhibit viruses and retrotransposons, it is also known to deaminate cytidines in mRNAs in monocytes and macrophages in response to hypoxia. The enzymes encoded by the APOBEC3D, APOBEC3F, APOBEC3G, and APOBEC3H genes function as anti-retroviral enzymes and have been shown to restrict HIV infection. Each of these four enzymes gets assembled into infectious virion particles where they deaminate cytidine residues in the viral cDNA, resulting in reduced progression of reverse transcription. The resulting uracil residues induce G-to-A hypermutations in the HIV-1 genome since A base pairs with U during DNA replication.
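The link between cytidine deamination in the viral cDNA and G-to-A changes in the genome sequence can be traced step by step. The sketch below is a toy model under strong assumptions: every C in the minus-strand cDNA is deaminated (real editing is partial and context dependent), the nine-base sequence is hypothetical, and the function names are invented for illustration.

```python
# Pairing rules for copying a DNA strand; U in the template pairs with A.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def copy_strand(template):
    """Return the complementary DNA strand, written 5'->3'."""
    return "".join(PAIRS[base] for base in reversed(template))


def deaminate_all(dna):
    """Convert every cytidine to uridine (an extreme, illustrative case)."""
    return dna.replace("C", "U")


plus_original = "ATGGGAGCT"                  # hypothetical plus-strand sequence
minus_cdna = copy_strand(plus_original)      # AGCTCCCAT
minus_edited = deaminate_all(minus_cdna)     # AGUTUUUAT
plus_mutated = copy_strand(minus_edited)     # ATAAAAACT

# Collect (index, original base, new base) for every change in the plus strand.
changes = [
    (i, old, new)
    for i, (old, new) in enumerate(zip(plus_original, plus_mutated))
    if old != new
]
print(changes)  # [(2, 'G', 'A'), (3, 'G', 'A'), (4, 'G', 'A'), (6, 'G', 'A')]
```

Every change recorded in the plus strand is a G-to-A substitution, because each edited C in the minus strand originally paired with a G.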

Catalytically Active RNAs: Ribozymes

Ribozymes represent a special class of RNA molecules that possess catalytic activity. Ribozymes are composed of well-defined tertiary structures that impart the RNAs with their unique biological activity as nucleic acid enzymes. Ribozymes have been identified in a wide range of genomes from viruses to mammals. To date, eight naturally occurring classes of ribozyme have been defined, all of which catalyze cleavage or ligation of the RNA backbone by trans-esterification or hydrolysis of phosphate groups. The catalytic properties of ribozymes are exclusively due to the capacity of these RNA molecules to assume particular structures. RNA molecules have the capacity to fold into several distinct structures, which can enable a single RNA to perform more than one function. RNA-mediated catalysis was first demonstrated in the process of intron splicing (group I and II introns). Subsequently, numerous RNAs harboring catalytic activity have been described. Ribozymes have been shown to be involved in tRNA processing (RNaseP), in phosphoryl transfer reactions catalyzing the cleavage or ligation of the RNA phosphodiester backbone, in protein synthesis (peptidyltransferase) and in the regulation of gene expression. Despite the similarity of the chemistry of the reactions catalyzed by ribozymes, each molecule possesses a completely unique sequence, tertiary structure, and specific catalytic mechanism, which reflects the diversity of catalytic strategies of ribozymes. Peptidyltransferase activity of the ribosome represents a distinct ribozyme structure and activity.

The enzymatic activity of ribozymes depends on the capacity of the RNA to fold into specific structures that impart catalytic specificity. Because a single RNA molecule can fold into more than one structure, a single RNA polymer could have more than one function. This means that an RNA molecule could perform more than one task, so that a single sequence (the genotype) manifests multiple phenotypes. That this is indeed the case has been demonstrated for short (25-34 nucleotides) RNA sequences which exhibit the ability to bind two different ligands such as GMP and L-arginine. In addition, another experiment, designed to select for a ribozyme that catalyzed the ligation of two RNA substrates, discovered that the RNA molecule could also undergo a separate self-cleavage reaction. These two distinct enzymatic reactions, ligation and cleavage, were imparted by two distinct sites of the RNA molecule. Multiple bifunctional ribozymes have been identified.

Group I introns are considerably larger and more structurally complex than any of the self-cleaving RNAs. This class of ribozyme is found in precursor mRNA, tRNA, and rRNA transcripts from a variety of organisms. The catalytic reaction carried out by group I intron ribozymes occurs in two steps. The reactions result in the ligation of flanking 5' and 3' exons to yield the mature RNA. Several hundred examples of this class of ribozyme have been identified. All of them share a common secondary structure and most likely a similar reaction mechanism. The Tetrahymena thermophila rRNA intron was the first group I self-splicing intron discovered (see section above). The ribozyme derived from this intron is 421 nucleotides long and is composed of a conserved catalytic core of roughly 200 nucleotides. This ribozyme catalyzes the first step of intron self-splicing using an oligonucleotide to mimic the 5'-exon. The 3' oxygen of an exogenous guanosine serves as the nucleophile for this reaction (see Figure above).

The most recently discovered functional class of ribozymes includes those that are involved in the regulation of protein synthesis. Two of these newly identified ribozymes are the mammalian cytoplasmic polyadenylation element-binding protein 3 (CPEB3) ribozyme and a variant hammerhead ribozyme embedded in mammalian mRNAs. Hammerhead ribozymes are so called because of the secondary structure evident in the active ribozyme. The hammerhead, hepatitis delta virus (HDV), hairpin, Neurospora Varkud satellite (VS), and glmS ribozymes are a class of small RNAs (50–150 nucleotides) that catalyze site-specific self-cleavage and were originally characterized in viral, virusoid, bacterial, or satellite RNA genomes.

The glmS ribozyme is found in Gram-positive bacteria. It is considered a metabolite-responsive ribozyme since it was originally discovered through its ability to catalyze site-specific RNA cleavage in the presence of glucosamine-6-phosphate (GlcN6P). The glmS ribozyme was originally identified in the 5'-untranslated region of the mRNA from the glmS gene, which is involved in the synthesis of GlcN6P. The glmS ribozyme is also considered a riboswitch since it is involved in the regulation of gene expression in response to changing concentrations of a metabolite.

The CPEB3 ribozyme is a self-cleaving non-coding RNA located in the second intron of the CPEB3 gene, which belongs to a family of genes regulating the reactions of mRNA polyadenylation. A 72 nucleotide core of the CPEB3 ribozyme sequence is sufficient to carry out self-cleavage. The cleavage activity of the CPEB3 ribozyme is slow, which, under normal conditions, allows normal splicing of the CPEB3 pre-mRNA to occur. A trans-acting factor is known to interact with the ribozyme cleavage site, thereby regulating the rate of ribozyme self-cleavage. When self-cleavage is increased, the level of truncated CPEB3 pre-mRNAs increases, resulting in degradation of the cleaved RNA fragments. This process may serve as a switch to turn off the synthesis of the CPEB3 protein.

Clinical Significances of Alternative and Aberrant Splicing

The presence of introns in eukaryotic genes would appear to be an extreme waste of cellular energy, considering both the number of nucleotides incorporated into the primary transcript only to be removed later and the energy utilized in the synthesis of the splicing machinery. However, the presence of introns can protect the genetic makeup of an organism from damage by outside influences such as chemicals or radiation. An additional important function of introns is to allow alternative splicing to occur, thereby increasing the genetic diversity of the genome without increasing the overall number of genes. By altering the pattern of exons from a single primary transcript that are spliced together, different proteins can arise from the processed mRNAs of a single gene. Alternative splicing can occur either at specific developmental stages or in different cell types.
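The way a single primary transcript can give rise to several mRNAs can be illustrated with a minimal sketch. The exon names, the sequences, and the choice of which exons are optional are all hypothetical; the code only enumerates the inclusion or skipping of optional (cassette) exons and does not model splice-site recognition or any other mode of alternative splicing.

```python
from itertools import combinations

# Hypothetical exons of one primary transcript, in genomic order.
exons = [
    ("E1", "AUGGCU"),
    ("E2", "GGAUCC"),
    ("E3", "UUUAAA"),
    ("E4", "CCGUAA"),
]

def splice(exons, keep):
    """Join the selected exons, preserving their order in the transcript."""
    return "".join(seq for name, seq in exons if name in keep)

def cassette_isoforms(exons, optional):
    """Enumerate the mRNAs produced by including or skipping each optional exon."""
    names = [name for name, _ in exons]
    constitutive = set(names) - set(optional)
    isoforms = {}
    for r in range(len(optional) + 1):
        for chosen in combinations(optional, r):
            keep = constitutive | set(chosen)
            label = "-".join(n for n in names if n in keep)
            isoforms[label] = splice(exons, keep)
    return isoforms

# Assume E2 and E3 are optional; E1 and E4 are always included.
isoforms = cassette_isoforms(exons, optional=["E2", "E3"])
for label, mrna in isoforms.items():
    print(label, mrna)
```

With two optional exons, the one transcript yields four distinct mRNAs, which shows how the number of products can exceed the number of genes.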

Alternative splicing has been identified in the primary transcripts from at least 40 different genes. Depending upon the tissue in which it is transcribed, the calcitonin gene yields an mRNA that encodes calcitonin (thyroid) or calcitonin gene-related peptide (CGRP, brain). Even more complex is the alternative splicing that occurs in the α-tropomyosin transcript. At least 8 different alternatively spliced α-tropomyosin mRNAs have been identified.

Abnormalities in the splicing process can lead to various disease states. Many defects in the β-globin genes are known, and these lead to the β-thalassemias. Some of these defects are caused by mutations in the sequences of the gene required for intron recognition and, therefore, result in abnormal processing of the β-globin primary transcript.

Patients suffering from a number of different connective tissue diseases exhibit humoral auto-antibodies that recognize cellular RNA-protein complexes. Patients suffering from systemic lupus erythematosus have auto-antibodies that recognize the U1 RNA of the spliceosome.

Michael W King, PhD | © 1996–2016 themedicalbiochemistrypage.org, LLC | info @ themedicalbiochemistrypage.org

Last modified: February 17, 2016