The one gene–one enzyme hypothesis is the idea that genes act through the production of enzymes, with each gene responsible for producing a single enzyme that in turn affects a single step in a metabolic pathway. The concept was proposed by George Beadle and Edward Tatum in an influential 1941 paper on genetic mutations in the mold Neurospora crassa, and subsequently was dubbed the "one gene–one enzyme hypothesis" by their collaborator Norman Horowitz. In 2004 Norman Horowitz reminisced that "these experiments founded the science of what Beadle and Tatum called 'biochemical genetics.' In actuality they proved to be the opening gun in what became molecular genetics and all the developments that have followed from that." The development of the one gene–one enzyme hypothesis is often considered the first significant result in what came to be called molecular biology. Although it has been extremely influential, the hypothesis was recognized soon after its proposal to be an oversimplification. Even the subsequent reformulation of the "one gene–one polypeptide" hypothesis is now considered too simple to describe the relationship between genes and proteins.
Although some instances of errors in metabolism following Mendelian inheritance patterns were known earlier, beginning with the 1902 identification by Archibald Garrod of alkaptonuria as a Mendelian recessive trait, for the most part genetics could not be applied to metabolism through the late 1930s. Another of the exceptions was the work of Boris Ephrussi and George Beadle, two geneticists working on the eye color pigments of Drosophila melanogaster fruit flies in the Caltech laboratory of Thomas Hunt Morgan. In the mid-1930s they found that genes affecting eye color appeared to be serially dependent, and that the normal red eyes of Drosophila were the result of pigments that went through a series of transformations; different eye color gene mutations disrupted the transformations at different points in the series. Thus, Beadle reasoned that each gene was responsible for an enzyme acting in the metabolic pathway of pigment synthesis. However, because it was a relatively superficial pathway rather than one shared widely by diverse organisms, little was known about the biochemical details of fruit fly eye pigment metabolism. Studying that pathway in more detail required isolating pigments from the eyes of flies, an extremely tedious process.
After moving to Stanford University in 1937, Beadle began working with biochemist Edward Tatum to isolate the fly eye pigments. After some success with this approach (they identified one of the intermediate pigments, though only shortly after another researcher, Adolf Butenandt, had beaten them to the discovery), Beadle and Tatum switched their focus to an organism that made genetic studies of biochemical traits much easier: the bread mold Neurospora crassa, which had recently been subjected to genetic research by one of Thomas Hunt Morgan's researchers, Carl C. Lindegren. Neurospora had several advantages: it required a simple growth medium, it grew quickly, and because of the production of ascospores during reproduction it was easy to isolate genetic mutants for analysis. They produced mutations by exposing the fungus to X-rays, and then identified strains that had metabolic defects by varying the growth medium. This work of Beadle and Tatum led almost at once to an important generalization: most mutants unable to grow on minimal medium but able to grow on “complete” medium each require the addition of only one particular supplement for growth on minimal medium. If the synthesis of a particular nutrient (such as an amino acid or vitamin) was disrupted by mutation, that mutant strain could be grown by adding the necessary nutrient to the medium. This finding suggested that most mutations affected only a single metabolic pathway. Further evidence obtained soon after the initial findings tended to show that generally only a single step in the pathway is blocked. Following their first report of three such auxotroph mutants in 1941, Beadle and Tatum used this method to create series of related mutants and determined the order in which amino acids and some other metabolites were synthesized in several metabolic pathways. The obvious inference from these experiments was that each gene mutation affects the activity of a single enzyme.
This led directly to the one gene–one enzyme hypothesis, which, with certain qualifications and refinements, has remained essentially valid to the present day. As recalled by Horowitz et al., the work of Beadle and Tatum also demonstrated that genes have an essential role in biosyntheses. At the time of the experiments (1941), non-geneticists still generally believed that genes governed only trivial biological traits, such as eye color and bristle arrangement in fruit flies, while basic biochemistry was determined in the cytoplasm by unknown processes. Also, many respected geneticists thought that gene action was far too complicated to be resolved by any simple experiment. Thus Beadle and Tatum brought about a fundamental revolution in our understanding of genetics.
The nutritional mutants of Neurospora also proved to have practical applications; in one of the early, if indirect, examples of military funding of the biological sciences, Beadle garnered additional research funding (from the Rockefeller Foundation and an association of manufacturers of military rations) to develop strains that could be used to assay the nutrient content of foodstuffs, to ensure adequate nutrition for troops in World War II.
The hypothesis and alternative interpretations
In their first Neurospora paper, published in the November 15, 1941, edition of the Proceedings of the National Academy of Sciences, Beadle and Tatum noted that it was "entirely tenable to suppose that these genes which are themselves a part of the system, control or regulate specific reactions in the system either by acting directly as enzymes or by determining the specificities of enzymes", an idea that had been suggested, though with limited experimental support, as early as 1917; they offered new evidence to support that view, and outlined a research program that would enable it to be explored more fully. By 1945, Beadle, Tatum and others, working with Neurospora and other model organisms such as E. coli, had produced considerable experimental evidence that each step in a metabolic pathway is controlled by a single gene. In a 1945 review, Beadle suggested that "the gene can be visualized as directing the final configuration of a protein molecule and thus determining its specificity." He also argued that "for reasons of economy in the evolutionary process, one might expect that with few exceptions the final specificity of a particular enzyme would be imposed by only one gene." At the time, genes were widely thought to consist of proteins or nucleoproteins (although the Avery–MacLeod–McCarty experiment and related work were beginning to cast doubt on that idea). However, the proposed connection between a single gene and a single protein enzyme outlived the protein theory of gene structure. In a 1948 paper, Norman Horowitz named the concept the "one gene–one enzyme hypothesis".
Although influential, the one gene–one enzyme hypothesis was not unchallenged. Among others, Max Delbrück was skeptical that only a single enzyme was actually involved at each step along metabolic pathways. For many who did accept the results, it strengthened the link between genes and enzymes, so that some biochemists thought that genes were enzymes; this was consistent with other work, such as studies of the reproduction of tobacco mosaic virus (which was known to have heritable variations and which followed the same pattern of autocatalysis as many enzymatic reactions) and the crystallization of that virus as an apparently pure protein. At the start of the 1950s, the Neurospora findings were widely admired, but the prevailing view in 1951 was that the conclusion Beadle had drawn from them was a vast oversimplification. Beadle wrote in 1966 that, after reading the 1951 Cold Spring Harbor Symposium on Genes and Mutations, he had the impression that supporters of the one gene–one enzyme hypothesis “could be counted on the fingers of one hand with a couple of fingers left over.” By the early 1950s, most biochemists and geneticists considered DNA the most likely candidate for the physical basis of the gene, and the one gene–one enzyme hypothesis was reinterpreted accordingly.
One gene–one polypeptide
In attributing an instructional role to genes, Beadle and Tatum implicitly accorded genes an informational capability. This insight provided the foundation for the concept of a genetic code. However, not until experiments had shown that DNA was the genetic material, that proteins consist of a defined linear sequence of amino acids, and that DNA structure contained a linear sequence of base pairs was there a clear basis for solving the genetic code.
By the early 1950s, advances in biochemical genetics—spurred in part by the original hypothesis—made the one gene–one enzyme hypothesis seem very unlikely (at least in its original form). Beginning in 1957, Vernon Ingram and others showed through electrophoresis and 2D chromatography that genetic variations in proteins (such as sickle cell hemoglobin) could be limited to differences in just a single polypeptide chain in a multimeric protein, leading to a "one gene–one polypeptide" hypothesis instead. According to geneticist Rowland H. Davis, "By 1958 – indeed, even by 1948 – one gene, one enzyme was no longer a hypothesis to be resolutely defended; it was simply the name of a research program."
The one gene–one polypeptide perspective also cannot account for the alternatively spliced transcripts found in many eukaryotic organisms, in which a spliceosome processes an RNA transcript differently depending on various inter- and intracellular signals. Splicing was discovered in 1977 by Phillip Sharp and Richard J. Roberts.
Possible anticipation of Beadle and Tatum's results
Historian Jan Sapp has studied the controversy regarding the German geneticist Franz Moewus who, as some leading geneticists of the 1940s and 1950s argued, generated similar results before Beadle and Tatum's celebrated 1941 work. Working on the alga Chlamydomonas, Moewus published, in the 1930s, results that showed that different genes were responsible for different enzymatic reactions in the production of hormones that controlled the organism's reproduction. However, as Sapp details, those results were challenged by others who found the data 'too good to be true' statistically, and the results could not be replicated.
- Beadle GW, Tatum EL (15 November 1941). "Genetic Control of Biochemical Reactions in Neurospora". PNAS. 27 (11): 499–506. Bibcode:1941PNAS...27..499B. doi:10.1073/pnas.27.11.499. PMC 1078370. PMID 16588492.
- Fruton, p. 434
- Horowitz NH, Berg P, Singer M, et al. (January 2004). "A centennial: George W. Beadle, 1903-1989". Genetics. 166 (1): 1–10. doi:10.1534/genetics.166.1.1. PMC 1470705. PMID 15020400.
- Morange, p. 21
- Bussard AE (2005). "A scientific revolution? The prion anomaly may challenge the central dogma of molecular biology". EMBO Reports. 6 (8): 691–694. doi:10.1038/sj.embor.7400497. PMC 1369155. PMID 16065057.
- Morange, pp. 21-24
- Fruton, pp. 432-434
- Horowitz NH (May 1996). "The sixtieth anniversary of biochemical genetics". Genetics. 143 (1): 1–4. PMC 1207243. PMID 8722756.
- Kay, pp. 204-205.
- Beadle, G. W. (1966) "Biochemical genetics: some recollections", pp. 23-32 in Phage and the Origins of Molecular Biology, edited by J. Cairns, G. S. Stent and J. D. Watson. Cold Spring Harbor Symposia, Cold Spring Harbor Laboratory of Quantitative Biology, NY. ASIN: B005F08IQ8
- Morange, pp. 27-28
- Berg P, Singer M. George Beadle, an uncommon farmer: the emergence of genetics in the 20th century, CSHL Press, 2003. ISBN 0-87969-688-5, ISBN 978-0-87969-688-7
- Davis R. H. (2007). "Beadle's progeny: Innocence rewarded, innocence lost". Journal of Biosciences. 32 (2): 197–205. doi:10.1007/s12038-007-0020-5. PMID 17435312.
- Chow, Louise T., Richard E. Gelinas, Thomas R. Broker, and Richard J. Roberts. "An amazing sequence arrangement at the 5' ends of adenovirus 2 messenger RNA." Cell 12, no. 1 (September 1977): 1-8.
- Jan Sapp (1990), Where the Truth Lies: Franz Moewus and the Origins of Molecular Biology, New York: Oxford University Press.
If genes are segments of DNA and if DNA is just a string of nucleotide pairs, then how does the sequence of nucleotide pairs dictate the sequence of amino acids in proteins? The analogy to a code springs to mind at once. The cracking of the genetic code is the story told in this section. The experimentation was sophisticated and swift, and it did not take long for the code to be deciphered once its existence was strongly indicated.
Simple logic tells us that, if nucleotide pairs are the “letters” in a code, then a combination of letters can form “words” representing different amino acids. We must ask how the code is read. Is it overlapping or nonoverlapping? Then we must ask how many letters in the mRNA make up a word, or codon, and which specific codon or codons represent each specific amino acid.
Overlapping versus nonoverlapping codes
Figure 10-24 shows the difference between an overlapping and a nonoverlapping code. In the example, a three-letter, or triplet, code is shown. For the nonoverlapping code, consecutive amino acids are specified by consecutive code words (codons), as shown at the bottom of Figure 10-24. For an overlapping code, consecutive amino acids are encoded in the mRNA by codons that share some consecutive bases; for example, the last two bases of one codon may also be the first two bases of the next codon. Overlapping codons are shown in the upper part of Figure 10-24. Thus, for the sequence AUUGCUCAG in a nonoverlapping code, the first three amino acids are encoded by the three triplets AUU, GCU, and CAG, respectively. However, in an overlapping code, the first three amino acids are encoded by the triplets AUU, UUG, and UGC if the overlap is two bases, as shown in Figure 10-24.
Figure 10-24. The difference between an overlapping and a nonoverlapping code. The case illustrated is for a code with three letters (a triplet code). An overlapping code uses codons that employ some of the same nucleotides as those of other codons for the translation ...
By 1961, it was already clear that the genetic code was nonoverlapping. The analysis of mutationally altered proteins, in particular, the nitrous acid–generated mutants of tobacco mosaic virus, showed that only a single amino acid changes at one time in one region of the protein. This result is predicted by a nonoverlapping code. As you can see from Figure 10-24, an overlapping code predicts that a single base change will alter as many as three amino acids at adjacent positions in the protein.
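The contrast between the two reading schemes can be sketched in a few lines of Python. This is only an illustration of the argument above, using the example sequence AUUGCUCAG from the text; the function names `read_codons` and `codons_containing` are hypothetical helpers introduced here, and the overlapping case assumes a two-base overlap (the reader advances one base per codon).

```python
def read_codons(mrna, step, size=3):
    """Split an mRNA string into codons of `size` bases, advancing `step` bases each time.

    step=3 models a nonoverlapping triplet code; step=1 models an
    overlapping code in which adjacent codons share two bases.
    """
    return [mrna[i:i + size] for i in range(0, len(mrna) - size + 1, step)]


def codons_containing(mrna, position, step, size=3):
    """Return every codon that includes the base at index `position` (0-based)."""
    starts = range(0, len(mrna) - size + 1, step)
    return [mrna[s:s + size] for s in starts if s <= position < s + size]


mrna = "AUUGCUCAG"  # example sequence from the text

print(read_codons(mrna, step=3))      # nonoverlapping reading
print(read_codons(mrna, step=1)[:3])  # first three codons of the overlapping reading

# A change at index 4 (the C of GCU) touches one codon in the nonoverlapping
# code but three adjacent codons in the overlapping code.
print(codons_containing(mrna, 4, step=3))
print(codons_containing(mrna, 4, step=1))
```

Because each codon corresponds to one amino acid position, the number of codons containing a given base is the maximum number of adjacent amino acids that a single base change could alter under each scheme.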
It should be noted that, although the use of an overlapping code was ruled out by the analysis of single proteins, nothing precluded the use of alternative reading frames to encode amino acids in two different proteins. In the example here, one protein might be encoded by the series of codons that reads AUU, GCU, CAG, CUU, and so forth. A second protein might be encoded by codons that are shifted over by one base and therefore read UUG, CUC, AGC, UUG, and so forth. This is an example of storing the information encoding two different proteins in two different reading frames, while still using a genetic code that is read in a nonoverlapping manner during the translation of a specific protein. Some examples of such shifts in reading frame have been found.
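As a small sketch of the two reading frames just described, the snippet below reads the same message from two different starting offsets. The 13-base string is the sequence implied by the two codon series quoted in the text; `codons_in_frame` is a hypothetical helper name.

```python
def codons_in_frame(mrna, frame):
    """Read nonoverlapping triplets starting at offset `frame` (0, 1, or 2).

    Trailing bases that do not fill a complete triplet are dropped.
    """
    return [mrna[i:i + 3] for i in range(frame, len(mrna) - 2, 3)]


# Bases implied by the two codon series in the text.
mrna = "AUUGCUCAGCUUG"

first_protein = codons_in_frame(mrna, 0)
second_protein = codons_in_frame(mrna, 1)

print(first_protein)
print(second_protein)
```

Each frame is still read in a strictly nonoverlapping manner; only the starting point differs.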
Number of letters in the code
In reading an mRNA molecule from one particular end, only one of four different bases, A, U, G, or C, can be found at each position. Thus, if the words were one letter long, only four words would be possible. This vocabulary cannot be the genetic code, because we must have a word for each of the 20 amino acids commonly found in cellular proteins. If the words were two letters long, then 4² = 16 words would be possible; for example, AU, CU, or CC. This vocabulary is still not large enough.
If the words are three letters long, then 4³ = 64 words are possible; for example, AUU, GCG, or UGC. This vocabulary provides more than enough words to describe the amino acids. We can conclude that the code word must consist of at least three nucleotide pairs. However, if all words are “triplets,” then we have a considerable excess of possible words over the 20 needed to name the common amino acids.
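The counting argument can be checked by enumerating the words directly. The following is a minimal sketch; `words` and `minimum_word_length` are hypothetical names chosen for this illustration.

```python
from itertools import product

BASES = "AUGC"
AMINO_ACID_COUNT = 20  # common amino acids that need a code word


def words(length):
    """Enumerate every possible code word of the given length."""
    return ["".join(letters) for letters in product(BASES, repeat=length)]


def minimum_word_length(needed=AMINO_ACID_COUNT):
    """Smallest word length whose vocabulary covers `needed` amino acids."""
    length = 1
    while len(BASES) ** length < needed:
        length += 1
    return length


for n in (1, 2, 3):
    print(n, len(words(n)))

print("minimum word length:", minimum_word_length())
```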
Use of suppressors to demonstrate a triplet code
Convincing proof that a codon is, in fact, three letters long (and no more than three) came from beautiful genetic experiments first reported in 1961 by Francis Crick, Sidney Brenner, and their co-workers, who used mutants in the rII locus of T4 phage. Mutations causing the rII phenotype (see Chapter 9) were induced by using a chemical called proflavin, which was thought to act by the addition or deletion of single nucleotide pairs in DNA. (This assumption is based on experimental evidence not presented here.) The following examples illustrate the action of proflavin on double-stranded DNA.
Then, starting with one particular proflavin-induced mutation called FCO, Crick and his colleagues found “reversions” (reversals of the mutation) that were detected by their wild-type plaques on E. coli strain K(λ). Genetic analysis of these plaques revealed that the “revertants” were not identical true wild types, thereby suggesting that the back mutation was not an exact reversal of the original forward mutation. In fact, the reversion was found to be caused by the presence of a second mutation at a different site from, but in the same gene as, that of FCO; this second mutation “suppressed” mutant expression of the original FCO. Recall from Chapter 4 that a suppressor mutation counteracts or suppresses the effects of another mutation.
The suppressor mutation could be separated from the original forward mutation by recombination, and, as we have seen, when this was done, the suppressor was shown to be an rII mutation itself (Figure 10-25).
Figure 10-25. The suppressor of an initial rII mutation is shown to be an rII mutation itself after separation by crossing over. The original mutant, FCO, was induced by proflavin. Later, when the FCO strain was treated with proflavin again, a revertant was found, ...
How can we explain these results? If we assume that reading is polarized—that is, if the gene is read from one end only—then the original proflavin-induced addition or deletion could be mutant because it interrupts a normal reading mechanism that establishes the group of bases to be read as words. For example, if each three bases on the resulting mRNA make a word, then the “reading frame” might be established by taking the first three bases from the end as the first word, the next three as the second word, and so forth. In that case, a proflavin-induced addition or deletion of a single pair on the DNA would shift the reading frame on the mRNA from that corresponding point on, causing all following words to be misread. Such a frameshift mutation could reduce most of the genetic message to gibberish. However, the proper reading frame could be restored by a compensatory insertion or deletion somewhere else, leaving only a short stretch of gibberish between the two. Consider the following example in which three-letter English words are used to represent the codons:
The insertion suppresses the effect of the deletion by restoring most of the sense of the sentence. By itself, however, the insertion also disrupts the sentence:
If we assume that the FCO mutant is caused by an addition, then the second (suppressor) mutant would have to be a deletion because, as we have seen, this would restore the reading frame of the resulting message (a second insertion would not correct the frame). In the following diagrams, we use a hypothetical nucleotide chain to represent RNA for simplicity. We also assume that the code words are three letters long and are read in one direction (left to right in our diagrams).
rIIa message: distal words changed (x) by frameshift mutation (words marked ✓ are unaffected)
rIIarIIb message: few words wrong, but reading frame restored for later words
The few wrong words in the suppressed genotype could account for the fact that the “revertants” (suppressed phenotypes) that Crick and his associates recovered did not look exactly like the true wild types phenotypically.
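The frameshift-and-suppression argument can be sketched with three-letter English words standing in for codons. The sentence below is a hypothetical example chosen for this illustration (it is not taken from the original figures), and the helper names are likewise assumptions.

```python
def read_words(letters, size=3):
    """Read a run of letters as consecutive fixed-size words, left to right."""
    return " ".join(letters[i:i + size] for i in range(0, len(letters), size))


def delete_at(letters, index):
    """Remove the single letter at `index` (a 'minus' mutation)."""
    return letters[:index] + letters[index + 1:]


def insert_at(letters, index, letter):
    """Insert one letter before `index` (a 'plus' mutation)."""
    return letters[:index] + letter + letters[index:]


# Hypothetical message: THE BIG RED FOX ATE THE HEN
wild_type = "THEBIGREDFOXATETHEHEN"

deletion = delete_at(wild_type, 4)             # lose the I of BIG
suppressed = insert_at(deletion, 8, "A")       # compensating insertion downstream
insertion_only = insert_at(wild_type, 9, "A")  # the same insertion without the deletion

print(read_words(wild_type))
print(read_words(deletion))
print(read_words(suppressed))
print(read_words(insertion_only))
```

In the doubly mutant message only the words between the two changes are garbled, which parallels the few wrong words expected in the suppressed genotype; either change alone scrambles everything downstream of it.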
We have assumed here that the original frameshift mutation was an addition, but the explanation works just as well if we assume that the original FCO mutation is a deletion and the suppressor is an addition. If the FCO is defined as plus, then suppressor mutations are automatically minus. Experiments have confirmed that a plus cannot suppress a plus and a minus cannot suppress a minus. In other words, two mutations of the same sign never act as suppressors of each other. However, very interestingly, combinations of three pluses or three minuses have been shown to act together to restore a wild-type phenotype.
This observation provided the first experimental confirmation that a word in the genetic code consists of three successive nucleotide pairs, or a triplet. The reason is that three additions or three deletions within a gene automatically restore the reading frame in the mRNA if the words are triplets. For example,
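The bookkeeping behind this conclusion is simple modular arithmetic, sketched below. Each mutation is represented as +1 (addition) or -1 (deletion); `restores_frame` is a hypothetical helper, and the model deliberately ignores the garbled stretch between mutations and any nonsense words it might contain.

```python
def net_frame_shift(changes, codon_size=3):
    """Net displacement of the reading frame after a series of +1/-1 changes."""
    return sum(changes) % codon_size


def restores_frame(changes, codon_size=3):
    """True if the downstream message is back in its original reading frame."""
    return net_frame_shift(changes, codon_size) == 0


print(restores_frame([+1]))          # single plus: frame shifted
print(restores_frame([+1, -1]))      # plus and minus: frame restored
print(restores_frame([+1, +1]))      # two pluses: still shifted
print(restores_frame([+1, +1, +1]))  # three pluses: frame restored
print(restores_frame([-1, -1, -1]))  # three minuses: frame restored

# Under a hypothetical four-letter code, three pluses would not restore the frame.
print(restores_frame([+1, +1, +1], codon_size=4))
```

The fact that three changes of the same sign, and not two, bring the frame back is what points to a word length of three.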
Proof that the genetic deductions about proflavin were correct came from an analysis of proflavin-induced mutations in a gene with a protein product that could be analyzed. George Streisinger worked with the gene that controls the enzyme lysozyme, which has a known amino acid sequence. He induced a mutation in the gene with proflavin and selected for proflavin-induced revertants, which were shown genetically to be double mutants (with mutations of opposite sign). When the protein of the double mutant was analyzed, a stretch of different amino acids lay between two wild-type ends, just as predicted:
Degeneracy of the genetic code
Crick’s work also suggested that the genetic code is degenerate. That expression is not a moral indictment. It simply means that each of the 64 triplets must have some meaning within the code; so at least some amino acids must be specified by two or more different triplets. If only 20 triplets are used (with the other 44 being nonsense, in that they do not code for any amino acid), then most frameshift mutations can be expected to produce nonsense words, which presumably stops the protein-building process. If this were the case, then the suppression of frameshift mutations would rarely, if ever, work. However, if all triplets specified some amino acid, then the changed words would simply result in the insertion of incorrect amino acids into the protein. Thus, Crick reasoned that many or all amino acids must have several different names in the base-pair code; this hypothesis was later confirmed biochemically.
The discussion up to this point demonstrates that
- The genetic code is nonoverlapping.
- Three bases encode an amino acid. These triplets are termed codons.
- The code is read from a fixed starting point and continues to the end of the coding sequence. We know this because a single frameshift mutation anywhere in the coding sequence alters the codon alignment for the rest of the sequence.
- The code is degenerate in that some amino acids are specified by more than one codon.
Cracking the code
The deciphering of the genetic code—determining the amino acid specified by each triplet—was one of the most exciting genetic breakthroughs of the past 50 years. Once the necessary experimental techniques became available, the genetic code was broken in a rush.
The first breakthrough was the discovery of how to make synthetic mRNA. If the nucleotides of RNA are mixed with a special enzyme (polynucleotide phosphorylase), a single-stranded RNA is formed in the reaction. No DNA is needed for this synthesis, and so the nucleotides are incorporated at random. The ability to synthesize mRNA offered the exciting prospect of creating specific mRNA sequences and then seeing which amino acids they would specify. The first synthetic messenger obtained, poly(U), was made by reacting only uracil nucleotides with the RNA-synthesizing enzyme, producing –UUUU–. In 1961, Marshall Nirenberg and Heinrich Matthaei mixed poly(U) with the protein-synthesizing machinery of E. coli in vitro and observed the formation of a protein! The main excitement centered on the question of the amino acid sequence of this protein. It proved to be polyphenylalanine, a string of phenylalanine molecules attached to form a polypeptide. Thus, the triplet UUU must code for phenylalanine:
This type of analysis was extended by mixing nucleotides in a known fixed proportion when making synthetic mRNA. In one experiment, the nucleotides uracil and guanine were mixed in a ratio of 3:1. When nucleotides are incorporated at random into synthetic mRNA, the relative frequency at which each triplet will appear in the sequence can be calculated on the basis of the relative proportion of the various nucleotides present (Table 10-3). Note that, in Table 10-3, UUU is used as the baseline frequency against which the other frequencies are measured in determining their respective ratios. For example, UUG, with a probability of p(UUG) = 9/64, would be expected only one-third as often as UUU, with its probability of p(UUU) = 27/64. Stated alternatively, p(UUG)/p(UUU) = 9/27 = 1/3 = 0.33, which is the ratio for UUG given in Table 10-3.
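The expected codon frequencies for a random copolymer can be computed directly from the base proportions. The sketch below assumes, as the text does, that bases are incorporated independently and at random; `codon_probabilities` is a hypothetical helper name, and exact fractions are used so the ratios come out as in the worked example.

```python
from fractions import Fraction
from itertools import product


def codon_probabilities(base_freqs):
    """Probability of each triplet when bases are incorporated independently."""
    probabilities = {}
    for triplet in product(base_freqs, repeat=3):
        p = Fraction(1)
        for base in triplet:
            p *= base_freqs[base]
        probabilities["".join(triplet)] = p
    return probabilities


# Uracil and guanine mixed in a 3:1 ratio, as in the experiment described.
freqs = {"U": Fraction(3, 4), "G": Fraction(1, 4)}
probs = codon_probabilities(freqs)

# Ratios relative to UUU, the baseline used in the text.
ratios = {codon: p / probs["UUU"] for codon, p in probs.items()}

for codon in sorted(probs, key=probs.get, reverse=True):
    print(codon, probs[codon], float(ratios[codon]))
```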
Expected Frequencies of Various Codons in Synthetic mRNA Composed of 3/4 Uracil and 1/4 Guanine.
If these codons each encode a different amino acid (that is, are not redundant), we expect the amino acids generated by this particular mix of guanine and uracil to be in ratios similar to those of the various codons. Although there is some redundancy among these codons, the ratios of the amino acids actually obtained from this mix of bases (Table 10-4) are indeed quite similar to the ratios seen for the codon frequencies in Table 10-3. (In Table 10-4, phenylalanine is used as the baseline in determining ratios.)
Observed Frequencies of Various Amino Acids in Protein Translated from mRNA Composed of 3/4 Uracil and 1/4 Guanine.
From this evidence, we can deduce that codons consisting of one guanine and two uracils (G + 2 U) code for valine, leucine, and cysteine, although we cannot distinguish the specific sequence for each of these amino acids. Similarly, one uracil and two guanines (U + 2 G) must code for tryptophan, glycine, and perhaps one other. It looks as though the Watson-Crick model is correct in predicting the importance of the precise sequence (not just the ratios of bases). Many provisional assignments (such as those just outlined for G and U) were soon obtained, primarily by groups working with Nirenberg or with Severo Ochoa.
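With hindsight, these provisional assignments can be checked against the standard genetic code, which was not yet available to the experimenters. The sketch below hard-codes the standard code in one-letter amino acid symbols (with `*` for stop) and groups the eight U/G codons by how many guanines they contain; the variable and function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G
# (first base varies slowest). One-letter amino acid symbols; '*' = stop.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def amino_acids_by_g_count():
    """Group amino acids encoded by U/G-only codons by the number of G's."""
    groups = {}
    for triplet in product("UG", repeat=3):
        codon = "".join(triplet)
        groups.setdefault(codon.count("G"), set()).add(STANDARD_CODE[codon])
    return groups


groups = amino_acids_by_g_count()
print(sorted(groups[1]))  # G + 2 U codons: C (Cys), L (Leu), V (Val)
print(sorted(groups[2]))  # U + 2 G codons: G (Gly), V (Val), W (Trp)
```

On this reading, the "one other" amino acid specified by a U + 2 G codon is valine (GUG).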
Before we consider other code words, we will examine tRNA molecules, which further explain the link between the mRNA codon and amino acid recognition.
tRNA recognition of the codon
Is it the tRNA or the amino acid itself that recognizes the mRNA codon that encodes a specific amino acid? A very convincing experiment answered this question. In the experiment, an aminoacyl-tRNA (aa-tRNA), namely cysteinyl-tRNA (tRNA^Cys, the tRNA specific for cysteine, “charged” with cysteine), was treated with nickel hydride, which converted the cysteine (while still bound to tRNA^Cys) into another amino acid, alanine, without affecting the tRNA:
Protein synthesized with this hybrid species had alanine wherever we would expect cysteine. Thus, the experiment demonstrated that the amino acids are “illiterate”; they are inserted at the proper position because the tRNA “adapters” recognize the mRNA codons and insert their attached amino acids appropriately. We would expect, then, to find some site on the tRNA that recognizes the mRNA codon by complementary base pairing.
Figure 10-26a shows several functional sites of the tRNA molecule. The site that recognizes an mRNA codon is called the anticodon; its bases are complementary and antiparallel to the bases of the codon. Another operationally identifiable site is the amino acid attachment site. The other arms probably assist in binding the tRNA to the ribosome. Figure 10-26b shows a specific tRNA (yeast alanine tRNA). The “flattened” cloverleafs shown in these diagrams are not the normal conformation of tRNA molecules; tRNA normally exists as an L-shaped folded cloverleaf, as shown in Figure 10-26c. These diagrams are supported by very sophisticated chemical analysis of tRNA nucleotide sequences and by X-ray crystallographic data on the overall shape of the molecule. Although tRNA molecules have many structural similarities, each has a unique three-dimensional shape that allows recognition by the correct synthetase, which catalyzes the joining of a tRNA with its specific amino acid to form an aminoacyl-tRNA. (Synthetases will be considered in this chapter under “Protein Synthesis.”) The specificity of charging the tRNAs is crucial to the integrity of protein synthesis.
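The statement that the anticodon is complementary and antiparallel to the codon can be made concrete with a short sketch. It assumes strict Watson-Crick pairing only (no wobble and no modified bases), and `anticodon_for` is a hypothetical helper name.

```python
# Strict Watson-Crick pairing in RNA; wobble and modified bases are ignored.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Return the perfectly complementary anticodon, written 5'->3'.

    The codon is also written 5'->3'. Because the two strands pair in
    antiparallel fashion, the complement must be reversed before it can
    be written in the same 5'->3' convention.
    """
    return "".join(COMPLEMENT[base] for base in reversed(codon))


print(anticodon_for("GCU"))  # codon 5'-GCU-3' pairs with anticodon 5'-AGC-3'
print(anticodon_for("UUU"))
```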
Figure 10-26. The structure of transfer RNA. (a) The functional areas of a generalized tRNA molecule. (b) The specific sequence of yeast alanine tRNA. Arrows indicate several kinds of rare modified bases. (c) Diagram of the actual three-dimensional structure of yeast phenylalanine ...
Where does tRNA come from? If radioactive tRNA is put into a cell nucleus in which the DNA has been partly denatured by heating, the radioactivity appears (by autoradiography) in localized regions of the chromosomes. These regions probably indicate the location of genes that specify tRNA; they are regions of DNA that produce tRNA rather than mRNA, which produces a protein. The labeled tRNA hybridizes to these sites because of the complementarity of base sequences between the tRNA and its parent gene. A similar situation holds for rRNA. Thus, we see that even the one-gene–one-polypeptide idea is not completely valid. Some genes do not code for protein; rather, they specify RNA components of the translational apparatus.
Some genes encode proteins; other genes specify RNA (for example, tRNA or rRNA) as their final product.
How does tRNA get its fancy shape? It probably folds up spontaneously into a conformation that produces maximal stability. Transfer RNA contains many “odd” or modified bases (such as pseudouracil, ψ) in its nucleotides; these bases play a direct role in folding and have been implicated in other tRNA functions. You may have noticed some unusual base pairing within the loops of the tRNA in Figure 10-26b; G is hydrogen bonded to U (instead of C). This apparent mismatching is considered next.
The complete code
Specific code words were finally deciphered through two kinds of experiments. The first required making “mini mRNAs,” each only three nucleotides in length. These mini mRNAs are too short to promote translation into protein, but they do stimulate the binding of aminoacyl-tRNAs to ribosomes in a kind of abortive attempt at translation. It is possible to make a specific mini mRNA and determine which aminoacyl-tRNA it will bind to ribosomes. For example, the G + 2 U problem described earlier can be resolved by using the following mini mRNAs:
Analogous mini RNAs provided 64 possible codons.
The second kind of experiment that was useful in cracking the genetic code required the use of repeating copolymers. For instance, the copolymer designated (AGA)n, which is a long sequence of AGAAGAAGAAGAAGA, was used to stimulate polypeptide synthesis in vitro. From the sequence of the resulting polypeptides and the possible triplets that could reside in the respective RNA copolymer, many code words could be verified. (This kind of experiment is detailed in Problem 10 at the end of this chapter. In solving it, you can put yourself in the place of H. Gobind Khorana, who received a Nobel Prize for directing the experiments.)
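The logic of the repeating-copolymer experiment can be sketched as follows: a repeating trinucleotide read from an arbitrary starting point yields a single repeated codon in each of the three reading frames. The amino acid names below are taken from the standard genetic code rather than from the text, and `codons_per_frame` is a hypothetical helper name.

```python
def codons_per_frame(unit, repeats=5):
    """Return the distinct codons seen in each reading frame of (unit)n."""
    message = unit * repeats
    return {
        frame: sorted({message[i:i + 3] for i in range(frame, len(message) - 2, 3)})
        for frame in range(3)
    }


# Assignments for these three codons in the standard genetic code.
STANDARD_ASSIGNMENTS = {"AGA": "arginine", "GAA": "glutamic acid", "AAG": "lysine"}

frames = codons_per_frame("AGA")  # (AGA)n = AGAAGAAGAAGAAGA for n = 5
for frame, codons in frames.items():
    print(frame, codons, [STANDARD_ASSIGNMENTS[c] for c in codons])
```

Each frame therefore predicts a homopolymer of one amino acid, so the set of polypeptides recovered narrows down which triplets encode which amino acids.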
Figure 10-27 gives the genetic code dictionary of 64 words. Inspect this dictionary carefully, and ponder the miracle of molecular genetics. Such an inspection should reveal several points that require further explanation.
Multiple codons for a single amino acid
From the discussion of degeneracy, we know that the number of codons for a single amino acid varies, ranging from one (tryptophan = UGG) to as many as six (serine = UCU or UCC or UCA or UCG or AGU or AGC). Why? The answer is complex but not difficult; it can be divided into two parts:
Certain amino acids can be brought to the ribosome by several alternative tRNA types (species) having different anticodons, whereas certain other amino acids are brought to the ribosome by only one tRNA.
Certain tRNA species can bring their specific amino acids in response to several codons, not just one, through a loose kind of base pairing at one end of the codon and anticodon. This sloppy pairing is called wobble.
The degree of degeneracy for a given amino acid is determined by the number of its codons that are each read by a tRNA of their own plus the number of its codons that share a tRNA through wobble.
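The range of degeneracy quoted above (one codon for tryptophan, six for serine) can be checked by counting codons per amino acid. The following is a minimal sketch using the standard genetic code; `CODON_TABLE`, `degeneracy`, and `codons_for` are hypothetical names.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code in compact form; "*" marks a stop codon.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons per amino acid, ignoring stop codons.
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def codons_for(amino_acid):
    """Return the sorted list of codons that specify one amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


if __name__ == "__main__":
    print("Trp (W):", degeneracy["W"], codons_for("W"))
    print("Ser (S):", degeneracy["S"], codons_for("S"))
```

The count says how many codons an amino acid has, not how many tRNA species read them; that second question is what the wobble rules address.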
We had better consider wobble first, and it will lead us into a discussion of the various species of tRNA. Wobble is caused by the third nucleotide of an anticodon (at the 5′ end) that is not quite aligned (Figure 10-28). This out-of-line nucleotide can sometimes form hydrogen bonds not only with its normal complementary nucleotide in the third position of the codon, but also with a different nucleotide in that position. Crick established certain “wobble rules” that dictate which nucleotides can and cannot form new hydrogen-bonded associations through wobble (Table 10-5). In Table 10-5, I (inosine) is one of the rare bases found in tRNA, often in the anticodon.
In the third site (5′ end) of the anticodon, G can take either of two wobble positions, thus being able to pair with either U or C. This ability means that a single tRNA species carrying an amino acid (in this case, serine) can recognize two codons, UCU and UCC.
Codon-Anticodon Pairings Allowed by the Wobble Rules.
Figure 10-28 shows the possible codons that one tRNA serine species can recognize. As the wobble rules indicate, G can pair with U or with C. Table 10-6 lists all the codons for serine and shows how different tRNAs can service these codons. Serine affords a good example of the effects of wobble on the genetic code.
Different tRNAs That Can Service Codons for Serine.
Sometimes there can be an additional tRNA species that we represent as tRNASer4; it has an anticodon identical with any of the three anticodons shown in Table 10-6, but it differs in its nucleotide sequence elsewhere in the tRNA molecule. These four tRNAs are called isoaccepting tRNAs because they accept the same amino acid, but they are probably all transcribed from different tRNA genes.
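The way wobble lets a few tRNAs service all six serine codons can be sketched in code. Two assumptions are made here. First, the pairing table is the commonly cited form of Crick's wobble rules, not a copy of Table 10-5. Second, the three anticodons, written 5′ to 3′, were chosen for illustration because they are consistent with those rules; they are not taken from Table 10-6. The function name `codons_read_by` is hypothetical.

```python
# Standard Watson-Crick pairing for the two non-wobble positions.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Assumed wobble rules: base at the 5' end of the anticodon ->
# bases it can pair with at the 3' end (third position) of the codon.
WOBBLE = {"G": "CU", "C": "G", "A": "U", "U": "AG", "I": "UCA"}


def codons_read_by(anticodon_5to3):
    """List the codons (5'->3') that one anticodon (5'->3') can recognize.

    Codon and anticodon pair antiparallel, so the anticodon's 5' base
    (the wobble base) faces the codon's third base.
    """
    wobble_base, middle, last = anticodon_5to3
    first = WATSON_CRICK[last]     # codon position 1 pairs with anticodon 3' base
    second = WATSON_CRICK[middle]  # codon position 2 pairs with the middle base
    return sorted(first + second + third for third in WOBBLE[wobble_base])


if __name__ == "__main__":
    for anticodon in ("GGA", "UGA", "GCU"):  # illustrative anticodons only
        print(anticodon, "reads", codons_read_by(anticodon))
```

With these assumptions, each of the three anticodons reads two codons, and together they cover all six serine codons. An anticodon with inosine at the wobble position would read three.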
The second point that you may have noticed in Figure 10-27 is that some codons do not specify an amino acid at all. These codons are labeled as stop or termination codons. They can be regarded as being similar to periods or commas punctuating the message encoded in the DNA.
One of the first indications of the existence of stop codons came in 1965 from Brenner’s work with the T4 phage. Brenner analyzed certain mutations (m1–m6) in a single gene that controls the head protein of the phage. These mutants had two things in common. First, the head protein of each mutant was a shorter polypeptide chain than that of the wild type. Second, the presence of a suppressor mutation (su) in the host chromosome would cause the phage to develop a head protein of normal (wild-type) chain length despite the presence of the m mutation (Figure 10-29).
Polypeptide chain lengths of phage T4 head protein in wild type (top) and various amber mutants (m). An amber suppressor (su) leads to phenotypic development of the wild-type chain.
Brenner examined the ends of the shortened proteins and compared them with wild-type protein, recording for each mutant the next amino acid that would have been inserted to continue the wild-type chain. These amino acids for the six mutations were glutamine, lysine, glutamic acid, tyrosine, tryptophan, and serine. There is no immediately obvious pattern to these results, but Brenner brilliantly deduced that certain codons for each of these amino acids are similar in that each of them can mutate to the codon UAG by a single change in a DNA nucleotide pair. He therefore postulated that UAG is a stop (termination) codon, a signal to the translation mechanism that the protein is now complete.
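Brenner's reasoning can be retraced by listing every codon that differs from UAG at exactly one position and looking up what each one encodes. This is a minimal sketch that works at the mRNA level, on the assumption that a single DNA nucleotide-pair change corresponds to a single base change in the codon. It uses the standard genetic code, and `single_change_neighbors` is a hypothetical helper name.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in compact form; "*" marks a stop codon.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def single_change_neighbors(codon):
    """Return the nine codons that differ from `codon` at exactly one base."""
    neighbors = []
    for pos, original in enumerate(codon):
        for base in BASES:
            if base != original:
                neighbors.append(codon[:pos] + base + codon[pos + 1:])
    return neighbors


if __name__ == "__main__":
    for neighbor in single_change_neighbors("UAG"):
        print(neighbor, "->", CODON_TABLE[neighbor])
```

Under the standard code, the neighbors of UAG include codons for all six amino acids that Brenner recorded (CAG, AAG, GAG, UAU and UAC, UGG, UCG). The list also contains UUG (leucine), which is not among the six mutants described here, and UAA, which is itself a stop codon.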
UAG was the first stop codon deciphered; it is called the amber codon. Mutants that are defective owing to the presence of an abnormal amber codon are called amber mutants, and their suppressors are amber suppressors. UGA, the opal codon, and UAA, the ochre codon, also are stop codons and also have suppressors. Stop codons are often called nonsense codons because they designate no amino acid. Not surprisingly, stop codons do not act as mini mRNAs in binding aa-tRNA to ribosomes in vitro. We shall consider stop codons and their suppressors further after we have dealt with the process of protein synthesis.