The central dogma of molecular biology--that DNA makes RNA, which in turn makes protein--raises the question of the order in which these three fundamental biopolymers arose. The proposition that RNA came first has achieved wide popularity (1, 2). However, the concept of a primordial RNA world does not identify which molecule came next: Was it DNA (a more stable information storage medium) or protein (a more versatile catalyst)? An appraisal of the diverse and sophisticated catalytic potential of RNA oligomers (ribozymes) has led some to suggest that proteins came last, the final twist to a nucleic acid world (3-6). But new findings corroborate the view of an early RNA/protein environment with DNA evolving last. Establishing the correct chronology defines the context of genetic code evolution and allows predictions to be made about the distribution of RNA within the fundamental machinery of life.
Parsimony favors the notion that life first evolved through a single biological macromolecule that both stored genetic information and catalyzed the reactions required for self-replication. Today proteins perform sophisticated catalysis and DNA stores information, whereas RNA can do both. Intuitive early speculations (7) that RNA dominated some primordial biosphere reached mainstream theory (1) through two avenues of research. First, investigators have produced a diverse array of ribozymes that catalyze fundamental metabolic reactions and bind specific ligands. Second, identification of putative "molecular fossils" in extant metabolism (8) has inspired the "palimpsest" model of evolution (3) in which modern protein enzymes are postulated to have incompletely replaced earlier ribozyme equivalents. Indeed, patterns within present-day metabolism support the RNA-first model over any alternative. DNA probably arose as an RNA derivative because all organisms make deoxyribonucleotides by reducing ribonucleotides, and make thymine by methylating uracil (9). Proteins-first models cannot explain the presence of functional RNA in processes such as translation in extant organisms: The 20 "natural" amino acids are more chemically diverse than the four nucleotides, which suggests that proteins have greater catalytic potential (2). If proteins came first, why should the evolution of nucleic acids (presumably for information storage) insinuate RNA into catalysis? Only the RNA-world model implies a logical, adaptive diversification from RNA into proteins (superior catalysts) and DNA (a more stable information store) (9).
If RNA came first, which came next: proteins or DNA? The mainstream view of an RNA/protein biosphere (10, 11) has until now been based on surprisingly little hard evidence. Indeed, the very findings that support an RNA world--namely the expanding repertoire of putative molecular fossils and of laboratory-produced ribozymes--have prompted speculation that proteins evolved last, once both types of nucleic acid were present (3-6). Furthermore, the ribonucleotide reductase enzymes (RNRs) responsible for producing deoxyribonucleotides group into three very different classes, distributed unevenly across taxa. Advocates of a proteins-last model have seized upon the lack of clear homology among the three classes to suggest that the enzymes are in fact polyphyletic, having evolved separately in different lineages to replace a primordial ribozyme equivalent (3). However, new research establishes the homology of all RNRs and places their biochemistry beyond the plausible scope of ribozyme catalysis, a view corroborated by the distribution of catalytic and functional RNA found within cells.
DNA biosynthesis occurs in two stages: An RNR reduces the 2′ hydroxyl group of ribonucleotides to form deoxyribonucleotides; a DNA polymerase then links these monomers into DNA (see the figure). Two aspects of RNR-catalyzed reduction inform the relative timing of DNA evolution. First, new insights into tertiary structural similarity (12) demonstrate RNR homology and strengthen the argument that subsequent divergence resulted from oxygenation of the atmosphere after the evolution of photosynthesis (10). Second, all RNRs use an unusual and energetically difficult reaction mechanism based on free-radical chemistry and the conserved spatial arrangement of two thiol groups at the active site (see the figure).
That the extraordinary sulfur-based chemistry of RNRs has been absolutely conserved, despite divergence in nearly every other aspect of the enzymes, implies that nature has only discovered this one mechanism for ribose reduction. Perhaps no others are biologically feasible or accessible. Why otherwise would all known organisms share this energetically costly and biochemically unusual mechanism? If ribonucleotide reduction requires such difficult chemistry, is it even conceivable that DNA arose in a world of ribozyme catalysis?
Two further ways in which DNA could exist in an RNA world without proteins deserve consideration. First, if the RNA world used abiotically synthesized DNA, ribonucleotide reduction would initially be unnecessary. Second, ribozymes might have augmented their capabilities by using sulfur-containing peptides as cofactors. However, prebiotic synthesis experiments indicate that deoxyribose was probably vanishingly rare on the early Earth, and that sulfur-containing amino acids were too unstable to have attained "useful" concentrations (13). The specific prediction of this argument is that future research will fail to produce a prebiotically plausible ribozyme capable of ribonucleotide reduction.
This biochemical argument is corroborated by simple observations concerning the role of catalytic and functional RNA molecules in extant metabolism. Modern organisms process genotype into phenotype through two distinct stages. First, DNA genes are transcribed into RNA messages (mRNA); these messages are then translated into proteins. Functional RNA molecules are intimately associated with every aspect of translation, from tRNA (the "adapter" molecule that translates each mRNA codon into the appropriate amino acid) through ribonuclease P (involved in tRNA maturation) to the ribosome itself (an RNA/protein complex that coordinates translation). Indeed, the very latest insight into ribosome structure (14) emphasizes the crucial role of the ribosomal RNA subunits in coordinating the translation machinery. Other functional RNA molecules modify RNA messages before their translation (1).
In stark contrast, unique functional RNA is completely absent from transcription (1, 15) and from ribonucleotide reduction (16). Indeed, within DNA-related metabolism, ribonucleotides only appear in the form of RNR cofactors (shared with many other enzymes) and RNA oligomers used to prime DNA replication. The sheer diversity of functional RNA associated with the translation machinery imbues this conspicuous absence with meaning. Thus, another novel prediction is that any functional RNA molecules found to be associated with transcription, ribonucleotide reduction, or DNA replication will be clear secondary derivatives of other cellular machinery.
Class III RNRs, which use the cofactors S-adenosylmethionine (to generate the radical) and formate (as a reductant), are the best extant model for the primordial RNR (10, 16). Interestingly, class III RNRs appear homologous to other enzymes that use glycyl radicals in pathways predating atmospheric oxygen, such as anaerobic fermentation (17). Pyruvate formate lyase (PFL), for example, uses a glycyl radical to generate a cysteine radical, which then lyses pyruvate into formate and helps to create S-adenosylmethionine. Both PFL products are used in class III RNR reduction; the glycyl radical of PFL and class III RNRs is generated using an iron-sulfur cluster and both show sequence homology. If PFL is representative of early protein catalysis, then proteins may have initially evolved to perform tasks beyond the catalytic range of RNA and only later usurped the function of ribozymes.
The distribution of catalytic RNA within extant metabolism, together with the difficult biochemistry of ribonucleotide reduction, implies that sophisticated proteins probably predate DNA: Transcription machinery lacks ribozyme relics simply because ribozymes never performed this function. This restricts the latest point at which the genetic code could have arisen, because proteins (that is, translation) must have evolved before DNA. Yet life probably did not originate with RNA. Abiotic synthesis of both ribose and bases is problematic, and linking the two into nucleotides is more difficult still (18). Increasing acceptance of an RNA world has actually stimulated research into plausible models for a pre-RNA world (1). Could translation have arisen even before RNA? Ribozyme relics argue against such a model: If translation evolved in a pre-RNA biosphere, why would the subsequent evolution of RNA introduce functional RNA components into preexisting (proteinaceous) ribosomes?
Finally, if proteins evolved in an RNA world, then this informs theories about the fixation of the canonical genetic code. RNA templates are significantly more error-prone (in terms of point mutation) than their DNA equivalents (5). An RNA-world origin for the genetic code thus adds significance to the finding that the arrangement of canonical codon assignments appears to minimize the phenotypic impact of errors (19).
References