Gene coding. What is the genetic code: general information

Previously, we emphasized that nucleotides have a feature important for the origin of life on Earth: when one polynucleotide chain is present in solution, a second, complementary chain forms spontaneously through complementary pairing of the corresponding nucleotides. An equal number of nucleotides in both chains and their chemical affinity are indispensable conditions for such reactions. During protein synthesis, however, when information from mRNA is realized in the structure of a protein, the principle of complementarity cannot apply. In mRNA and in the synthesized protein not only does the number of monomers differ but, more importantly, there is no structural similarity between them (nucleotides on the one hand, amino acids on the other). A new principle was therefore needed for the exact translation of information from a polynucleotide into a polypeptide structure. Such a principle arose in the course of evolution, and the genetic code lies at its basis.

The genetic code is a system for recording hereditary information in nucleic acid molecules, based on a certain alternation of nucleotide sequences in DNA or RNA that form codons corresponding to amino acids in a protein.

The genetic code has several properties.

    Triplet nature.

    Degeneracy or redundancy.

    Unambiguity.

    Polarity.

    Non-overlapping.

    Compactness.

    Universality.

It should be noted that some authors also offer other properties of the code related to the chemical features of the nucleotides included in the code or to the frequency of occurrence of individual amino acids in the proteins of the body, etc. However, these properties follow from the above, so we will consider them there.

a. Triplet nature. The genetic code, like many complexly organized systems, has a smallest structural unit and a smallest functional unit. The triplet is the smallest structural unit of the genetic code; it consists of three nucleotides. The codon is the smallest functional unit of the genetic code. As a rule, it is the triplets of mRNA that are called codons. In the genetic code a codon performs several functions. First, its main function is to code for one amino acid. Second, a codon may not code for an amino acid, in which case it has a different function (see below). As the definitions show, the triplet characterizes the elementary structural unit of the genetic code (three nucleotides), while the codon characterizes its elementary semantic unit: three nucleotides determine the attachment of one amino acid to the polypeptide chain.

The elementary structural unit was first deduced theoretically, and its existence was then confirmed experimentally. Indeed, 20 amino acids cannot be encoded by one or two nucleotides, because there are only 4 kinds of nucleotide: single nucleotides give 4 variants and pairs give 4^2 = 16. Triplets of the four nucleotides give 4^3 = 64 variants, which more than covers the number of amino acids found in living organisms (see Table 1).
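The counting argument above can be checked with a few lines of Python. This is only an illustrative sketch; the names `NUCLEOTIDES`, `AMINO_ACID_COUNT` and `word_count` are chosen here for the example and do not come from the text.

```python
# Count how many distinct "words" of a given length can be written
# with a four-letter nucleotide alphabet, and compare with 20 amino acids.
NUCLEOTIDES = "UCAG"      # the four RNA nucleotides
AMINO_ACID_COUNT = 20     # number of canonical amino acids


def word_count(length):
    """Number of different nucleotide words of the given length."""
    return len(NUCLEOTIDES) ** length


for length in (1, 2, 3):
    total = word_count(length)
    verdict = "enough" if total >= AMINO_ACID_COUNT else "not enough"
    print(f"word length {length}: {total} variants, {verdict} for 20 amino acids")
```

Only a word length of three gives at least 20 variants, which is the reasoning behind the triplet code.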

The combinations of nucleotides presented in Table 1 have two features. First, of the 64 triplet variants only 61 are codons that encode an amino acid; these are called sense codons. Three triplets do not encode amino acids but are stop signals marking the end of translation. These three triplets, UAA, UAG and UGA, are also called "meaningless" (nonsense) codons. A mutation that replaces one nucleotide in a triplet with another can turn a sense codon into a nonsense codon. This type of mutation is called a nonsense mutation. If such a stop signal arises inside a gene (in its coding part), protein synthesis will be interrupted at this point every time, and only the first part of the protein (up to the stop signal) will be synthesized. A person with such a pathology will lack the protein and will show symptoms associated with this deficiency. For example, a mutation of this kind was found in the gene encoding the hemoglobin beta chain. A shortened, inactive hemoglobin chain is synthesized and rapidly destroyed. As a result, hemoglobin molecules lacking the beta chain are formed. Such a molecule clearly cannot fully perform its function. A serious disease develops that resembles hemolytic anemia (beta-zero thalassemia, from the Greek "thalassa", the sea, after the Mediterranean, where this disease was first discovered).

Table 1. Messenger RNA codons and their corresponding amino acids (the body of the table is not reproduced here; in it UAA, UAG and UGA are marked as nonsense codons, and the codons for Met and Val are highlighted).
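The effect of a nonsense mutation can be sketched in Python. The codon table below is the standard genetic code written in the usual compact form (one-letter amino acid symbols, `*` for a stop codon); the mRNA sequence is a made-up toy example, not a real gene, and the function name `translate` is an assumption of this sketch.

```python
from itertools import product

# Standard genetic code in compact form: codons are enumerated with the
# bases in the order U, C, A, G (third base changing fastest).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna):
    """Read triplets from the first nucleotide until a stop codon is met."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":          # stop codon: synthesis ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


normal = "AUGAAACAGUUUUAA"             # hypothetical mRNA: AUG AAA CAG UUU UAA
mutant = normal[:6] + "U" + normal[7:]  # C -> U turns CAG (Gln) into UAG (stop)

print(translate(normal))   # full-length product
print(translate(mutant))   # truncated product
```

In the mutant, a single substitution creates a premature stop codon, so only the part of the chain before it is synthesized.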

The mechanism of action of stop codons is different from the mechanism of action of sense codons. This follows from the fact that for all the codons encoding amino acids, the corresponding tRNAs were found. No tRNAs were found for nonsense codons. Therefore, tRNA does not take part in the process of stopping protein synthesis.

The codon AUG (and sometimes GUG in bacteria) not only encodes an amino acid (methionine and valine, respectively) but also serves as the initiator of translation.

b. Degeneracy or redundancy.

61 of the 64 triplets code for 20 amino acids. This threefold excess of triplets over amino acids means that two coding options are possible. In the first, only 20 of the 64 triplets are used to encode the 20 amino acids; in the second, an amino acid can be encoded by several codons. Studies have shown that nature used the latter option.

The advantage of this option is clear. If only 20 of the 64 triplet variants were involved in coding amino acids, then 44 triplets would remain non-coding, i.e. meaningless (nonsense codons). We pointed out earlier how dangerous it is for a cell when a mutation turns a coding triplet into a nonsense codon: this interrupts protein synthesis on the ribosome and ultimately leads to disease. There are currently three nonsense codons in our genetic code; now imagine what would happen if their number were about 15 times greater. Clearly, in such a situation the probability that a mutation converts a normal codon into a nonsense codon would be far higher.

A code in which one amino acid is encoded by several triplets is called degenerate or redundant. Almost every amino acid has several codons. Thus, the amino acid leucine can be encoded by six triplets: UUA, UUG, CUU, CUC, CUA and CUG. Valine is encoded by four triplets, phenylalanine by two, and only tryptophan and methionine are each encoded by a single codon. The property of recording the same information with different symbols is called degeneracy.
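The degree of degeneracy of each amino acid can be read off the standard codon table. The sketch below builds the table in compact form and groups the 61 sense codons by the amino acid they encode; the names `CODON_TABLE` and `synonyms` are chosen for this example only.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, one-letter amino acid symbols, "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Group sense codons by amino acid (the reverse, "degenerate" direction).
synonyms = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    if amino_acid != "*":
        synonyms[amino_acid].append(codon)

for amino_acid in "LVFWM":   # Leu, Val, Phe, Trp, Met
    print(amino_acid, len(synonyms[amino_acid]), sorted(synonyms[amino_acid]))
```

Each codon maps to exactly one amino acid, while most amino acids map back to several codons.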

The number of codons assigned to one amino acid correlates well with the frequency of occurrence of the amino acid in proteins.

And this is most likely not accidental. The higher the frequency of occurrence of an amino acid in a protein, the more often the codon of this amino acid is represented in the genome, the higher the probability of its damage by mutagenic factors. Therefore, it is clear that a mutated codon is more likely to code for the same amino acid if it is highly degenerate. From these positions, the degeneracy of the genetic code is a mechanism that protects the human genome from damage.

It should be noted that the term degeneracy is also used in molecular genetics in another sense. Since most of the information in a codon is carried by its first two nucleotides, the base in the third position is of less importance. This phenomenon is called "degeneracy of the third base". It minimizes the effect of mutations. For example, the main function of red blood cells is to transport oxygen from the lungs to the tissues and carbon dioxide from the tissues to the lungs. This function is carried out by the respiratory pigment hemoglobin, which fills the cytoplasm of the erythrocyte. Hemoglobin consists of a protein part, globin, which is encoded by the corresponding gene, and of heme, which contains iron. Mutations in globin genes give rise to different hemoglobin variants. Most often a mutation replaces one nucleotide with another and produces a new codon in the gene, which may code for a new amino acid in the hemoglobin polypeptide chain. Any nucleotide of a triplet, the first, second or third, can be replaced by mutation. Several hundred mutations affecting the globin genes are known. About 400 of them involve the replacement of a single nucleotide in the gene. Of these, only about 100 substitutions lead to hemoglobin instability and to diseases ranging from mild to very severe. The other 300 (approximately 75% of these 400) do not affect hemoglobin function and do not lead to pathology. One reason for this is the "degeneracy of the third base" mentioned above: replacing the third nucleotide in a triplet coding for serine, leucine, proline, arginine and some other amino acids produces a synonymous codon that encodes the same amino acid.
Phenotypically, such a mutation will not manifest itself. In contrast, replacement of the first or second nucleotide in a triplet almost always leads to a new hemoglobin variant. Even then, there may be no severe phenotypic disorder if the amino acid in hemoglobin is replaced by another with similar physicochemical properties, for example when one hydrophilic amino acid is replaced by another hydrophilic one.

Hemoglobin consists of the iron-porphyrin group heme (to which oxygen and carbon dioxide molecules attach) and a protein, globin. Adult hemoglobin (HbA) contains two identical α-chains and two β-chains. The α-chain contains 141 amino acid residues and the β-chain 146; the α- and β-chains differ in many amino acid residues. The amino acid sequence of each globin chain is encoded by its own gene. The gene encoding the α-chain is located on the short arm of chromosome 16, the β-gene on the short arm of chromosome 11. A change of the first or second nucleotide in the gene encoding the hemoglobin β-chain almost always leads to a new amino acid in the protein, disruption of hemoglobin function and severe consequences for the patient. For example, replacing "C" in one of the CAU (histidine) triplets with "U" produces a new triplet, UAU, which encodes another amino acid, tyrosine. Phenotypically, this manifests itself as a serious illness. Such a replacement of histidine by tyrosine at position 63 of the β-chain destabilizes hemoglobin, and methemoglobinemia develops. The replacement, as a result of mutation, of glutamic acid by valine at position 6 of the β-chain is the cause of a severe disease, sickle cell anemia. Let us not continue the sad list. We only note that when one of the first two nucleotides is replaced, the new amino acid may be similar to the previous one in its physicochemical properties. Thus, replacing the second nucleotide in one of the triplets encoding glutamic acid (GAA) in the β-chain with "U" produces a new triplet (GUA) encoding valine, while replacing the first nucleotide with "A" produces the triplet AAA, which encodes the amino acid lysine. Glutamic acid and lysine are similar in physicochemical properties: both are hydrophilic. Valine is a hydrophobic amino acid.
Therefore, the replacement of hydrophilic glutamic acid with hydrophobic valine significantly changes the properties of hemoglobin, which ultimately leads to sickle cell anemia, whereas the replacement of hydrophilic glutamic acid with hydrophilic lysine changes hemoglobin function to a lesser extent, and patients develop a mild form of anemia. When the third base is replaced, the new triplet may encode the same amino acid as the previous one. For example, if uracil is replaced by cytosine in the CAU triplet, giving the triplet CAC, practically no phenotypic changes will be detected. This is understandable, because both triplets code for the same amino acid, histidine.
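The substitutions discussed above can be reproduced against the standard codon table. The sketch below is illustrative only: `substitute` and `synonymous_by_position` are helper names invented for this example, and the second function simply counts, for each codon position, how many single-base substitutions in sense codons leave the encoded amino acid unchanged.

```python
from itertools import product

# Standard genetic code, one-letter amino acid symbols, "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def substitute(codon, position, new_base):
    """Return the codon with one base replaced (position is 0, 1 or 2)."""
    return codon[:position] + new_base + codon[position + 1:]


def synonymous_by_position():
    """Count substitutions in sense codons that keep the same amino acid."""
    counts = [0, 0, 0]
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid == "*":
            continue
        for position in range(3):
            for base in BASES:
                if base == codon[position]:
                    continue
                if CODON_TABLE[substitute(codon, position, base)] == amino_acid:
                    counts[position] += 1
    return counts


# The examples from the text: E = Glu, V = Val, K = Lys, H = His, Y = Tyr.
for codon, position, base in [("GAA", 1, "U"), ("GAA", 0, "A"),
                              ("CAU", 0, "U"), ("CAU", 2, "C")]:
    mutant = substitute(codon, position, base)
    print(codon, CODON_TABLE[codon], "->", mutant, CODON_TABLE[mutant])

print("synonymous substitutions per position:", synonymous_by_position())
```

The counts show that silent substitutions are concentrated in the third position, which is the "degeneracy of the third base" described in the text.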

In conclusion, it is appropriate to emphasize that, from a general biological standpoint, the degeneracy of the genetic code and the degeneracy of the third base are protective mechanisms built into the unique structure of DNA and RNA in the course of evolution.

c. Unambiguity.

Each triplet (except for meaningless ones) encodes only one amino acid. Thus, in the direction of codon - amino acid, the genetic code is unambiguous, in the direction of amino acid - codon - it is ambiguous (degenerate).

codon → amino acid: unambiguous

amino acid → codon: degenerate

And in this case, the need for unambiguity in the genetic code is obvious. In another variant, during the translation of the same codon, different amino acids would be inserted into the protein chain and, as a result, proteins with different primary structures and different functions would be formed. The cell's metabolism would switch to the "one gene - several polypeptides" mode of operation. It is clear that in such a situation the regulatory function of genes would be completely lost.

d. Polarity.

Reading information from DNA and from mRNA occurs in one direction only. Polarity is essential for determining higher-order structures (secondary, tertiary, etc.). Earlier we noted that lower-order structures determine higher-order ones. Tertiary and higher-order structures begin to form as soon as the synthesized RNA chain moves away from the DNA molecule or the polypeptide chain moves away from the ribosome. While the free end of the RNA or polypeptide acquires its tertiary structure, the other end of the chain is still being synthesized on DNA (if RNA is being transcribed) or on the ribosome (if a polypeptide is being translated).

Therefore, the unidirectional reading of information (in the synthesis of RNA and protein) is essential not only for determining the sequence of nucleotides or amino acids in the synthesized substance, but also for the rigid determination of secondary, tertiary and higher structures.

e. Non-overlapping.

The code may or may not overlap. In most organisms, the code is non-overlapping. An overlapping code has been found in some phages.

The essence of a non-overlapping code is that the nucleotide of one codon cannot be the nucleotide of another codon at the same time. If the code were overlapping, then the sequence of seven nucleotides (GCUGCUG) could encode not two amino acids (alanine-alanine) (Fig. 33, A) as in the case of a non-overlapping code, but three (if one nucleotide is common) (Fig. 33, B) or five (if two nucleotides are common) (see Fig. 33, C). In the last two cases, a mutation of any nucleotide would lead to a violation in the sequence of two, three, etc. amino acids.

However, it has been found that a mutation of one nucleotide always disrupts the inclusion of one amino acid in a polypeptide. This is a significant argument in favor of the fact that the code is non-overlapping.

Let us explain this with Figure 34. Bold lines show the triplets encoding amino acids in the case of non-overlapping and overlapping codes. Experiments have unambiguously shown that the genetic code is non-overlapping. Without going into the details of the experiments, we note that if the third nucleotide in the sequence (see Fig. 34), U (marked with an asterisk), is replaced by some other nucleotide, then:

1. With a non-overlapping code, the protein controlled by this sequence would have a replacement of one (the first) amino acid (marked with asterisks).

2. With an overlapping code in option B, a replacement would occur in two (the first and second) amino acids (marked with asterisks). In option C, the substitution would affect three amino acids (marked with asterisks).

However, numerous experiments have shown that when one nucleotide in DNA is altered, only one amino acid in the protein is ever affected, which is typical of a non-overlapping code.

A (non-overlapping code): GCUGCUG is read as GCU GCU, giving Ala - Ala

B (overlapping code, one shared nucleotide): GCUGCUG is read as GCU UGC CUG, giving Ala - Cys - Leu

C (overlapping code, two shared nucleotides): GCUGCUG is read as GCU CUG UGC GCU CUG, giving Ala - Leu - Cys - Ala - Leu

Fig. 34. Scheme explaining the non-overlapping nature of the genetic code (explanation in the text).
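The three reading schemes for the sequence GCUGCUG can be sketched in Python. In this illustration the hypothetical parameter `step` is the distance between the starts of neighbouring triplets: 3 for a non-overlapping code, 2 when one nucleotide is shared, and 1 when two nucleotides are shared. The function names are assumptions of the sketch.

```python
def read_codons(sequence, step):
    """Split a sequence into triplets whose starts are `step` nucleotides apart."""
    return [sequence[i:i + 3] for i in range(0, len(sequence) - 2, step)]


def codons_hit(sequence, step, index):
    """Count the triplets that contain the nucleotide at the given 0-based index."""
    return sum(1 for i in range(0, len(sequence) - 2, step) if i <= index < i + 3)


sequence = "GCUGCUG"
for step, label in [(3, "non-overlapping"),
                    (2, "one shared nucleotide"),
                    (1, "two shared nucleotides")]:
    triplets = read_codons(sequence, step)
    affected = codons_hit(sequence, step, 2)   # index 2 is the third nucleotide, U
    print(f"{label}: {triplets}, triplets containing the third nucleotide: {affected}")
```

Changing the third nucleotide touches one triplet in the non-overlapping scheme but two or three triplets in the overlapping ones, which is the basis of the experimental argument in the text.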

The non-overlapping nature of the genetic code is associated with another property: the reading of information begins from a definite point, the initiation signal. In mRNA this initiation signal is the codon AUG, which encodes methionine.

It should be noted that humans nevertheless have a small number of genes that deviate from the general rule and overlap.

f. Compactness.

There are no punctuation marks between codons. In other words, the triplets are not separated from each other, for example, by one meaningless nucleotide. The absence of "punctuation marks" in the genetic code has been proven in experiments.

g. Universality.

The code is the same for all organisms living on Earth. Direct evidence of the universality of the genetic code was obtained by comparing DNA sequences with the corresponding protein sequences. It turned out that the same set of codon assignments is used in all bacterial and eukaryotic genomes. There are exceptions, but not many.

The first exceptions to the universality of the genetic code were found in the mitochondria of some animal species. They concerned the terminator codon UGA, which is read in the same way as the codon UGG, encoding the amino acid tryptophan. Other, rarer deviations from universality have also been found.
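The mitochondrial exception can be illustrated by translating one and the same mRNA with two tables. The sketch below models only the single reassignment mentioned in the text (UGA read as tryptophan); real mitochondrial codes may differ from the standard code in other codons as well. The mRNA sequence is a made-up toy example, and the names `STANDARD`, `UGA_AS_TRP` and `translate` are assumptions of the sketch.

```python
from itertools import product

# Standard genetic code, one-letter amino acid symbols, "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Variant table: identical to the standard one except that UGA encodes Trp (W).
UGA_AS_TRP = dict(STANDARD, UGA="W")


def translate(mrna, table):
    """Read triplets from the first nucleotide until a stop codon is met."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = "AUGUGGUGAUUUUAA"   # hypothetical mRNA: AUG UGG UGA UUU UAA
print("standard code:", translate(mrna, STANDARD))
print("UGA read as Trp:", translate(mrna, UGA_AS_TRP))
```

With the standard table, synthesis stops at UGA; with the variant table, UGA is read like UGG and the chain continues to the next stop codon.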


Every living organism has its own set of proteins. Particular combinations of nucleotides and their sequence in the DNA molecule form the genetic code, which conveys information about the structure of proteins. Genetics adopted the concept that one gene corresponds to one enzyme (polypeptide). Research on nucleic acids and proteins has been carried out over a fairly long period. Below we take a closer look at the genetic code and its properties, and give a brief chronology of the research.

Terminology

The genetic code is a way of encoding the amino acid sequence of proteins by means of a nucleotide sequence. This way of recording information is characteristic of all living organisms. Proteins are natural organic substances of high molecular weight that are present in all living organisms. They consist of 20 types of amino acids, which are called canonical. The amino acids are joined into a chain in a strictly defined sequence, which determines the structure of the protein and its biological properties. A protein may also consist of several amino acid chains.

DNA and RNA

Deoxyribonucleic acid (DNA) is a macromolecule responsible for the storage, transmission and realization of hereditary information. DNA uses four nitrogenous bases: adenine, guanine, cytosine and thymine. RNA consists of the same nucleotides, except that the thymine-containing nucleotide is replaced by one containing uracil (U). RNA and DNA molecules are chains of nucleotides. Thanks to this structure, sequences written in the "genetic alphabet" are formed.

Implementation of information

The synthesis of a protein encoded by a gene begins with the synthesis of mRNA on a DNA template (transcription). The genetic code is then translated into a sequence of amino acids, that is, the polypeptide chain is synthesized on the mRNA (translation). Three nucleotides are enough to encode all the amino acids and to signal the end of the protein sequence. Such a group of three nucleotides is called a triplet.

Research History

Proteins and nucleic acids have been studied for a long time. In the middle of the 20th century the first ideas about the nature of the genetic code finally appeared. In 1953, it was found that some proteins are made up of sequences of amino acids. At that time, however, the exact number of amino acids could not yet be determined, and there were numerous disputes about it. In 1953, Watson and Crick published two papers. The first described the secondary structure of DNA; the second discussed its possible copying by template synthesis. In addition, they emphasized that a particular sequence of bases is a code that carries hereditary information. The Soviet and American physicist George Gamow accepted the coding hypothesis and found a way to test it. In 1954 he published a paper in which he proposed establishing correspondences between amino acid side chains and diamond-shaped "holes" and using this as a coding mechanism. This code was later called the rhombic (diamond) code. Explaining his work, Gamow suggested that the genetic code could be triplet. His work was one of the first to be considered close to the truth.

Classification

Over the following years, various models of the genetic code were proposed. They were of two types: overlapping and non-overlapping. The first type was based on one nucleotide being part of several codons. The triangular, sequential and major-minor codes belong to this type. The non-overlapping type includes two models: the combination code and the "comma-free code". The first is based on the encoding of an amino acid by a nucleotide triplet in which only the composition of the triplet matters. According to the "comma-free code", certain triplets correspond to amino acids, while the rest do not. It was assumed that if meaningful triplets were arranged one after another, the triplets lying in a different reading frame would turn out to be meaningless. Scientists believed that it was possible to select a set of triplets that met these requirements, and that there were exactly 20 such triplets.

Although Gamow and others questioned this model, it was considered the most correct for the next five years. Early in the second half of the 20th century, new data appeared that revealed shortcomings of the "comma-free code". Codons were found to be able to induce protein synthesis in vitro. By about 1965 the meaning of all 64 triplets had been established. As a result, the redundancy of some codons was discovered: the same amino acid can be encoded by several triplets.

Distinctive features

The properties of the genetic code include:

Variations

A deviation of the genetic code from the standard was first discovered in 1979 during the study of human mitochondrial genes. Further similar variants were later identified, including many alternative mitochondrial codes. One example is the reading of the stop codon UGA as tryptophan in mycoplasmas. In archaea and bacteria, GUG and UUG are often used as start codons. Sometimes a gene encodes a protein beginning from a start codon that differs from the one normally used by that species. In addition, in some proteins the non-standard amino acids selenocysteine and pyrrolysine are inserted by the ribosome when it reads a stop codon; this depends on particular sequences in the mRNA. Selenocysteine is currently considered the 21st and pyrrolysine the 22nd amino acid present in proteins.

General features of the genetic code

However, all these exceptions are rare. In living organisms the genetic code has, in general, a number of common features: a codon consists of three nucleotides (of which the first two are the determining ones), and codons are translated into an amino acid sequence by tRNA and ribosomes.

Genetic code: a system for recording genetic information in DNA (RNA) in the form of a certain sequence of nucleotides. A certain sequence of nucleotides in DNA and RNA corresponds to a certain sequence of amino acids in the polypeptide chains of proteins. It is customary to write the code using capital letters of the Russian or Latin alphabet. Each nucleotide is designated by the first letter of the name of the nitrogenous base in its molecule: A for adenine, G for guanine, C for cytosine, T for thymine; in RNA, uracil (U) takes the place of thymine. The sequence of nucleotides determines the order in which amino acids are incorporated into the synthesized protein.

Properties of the genetic code:

1. Triplet nature - the meaningful unit of the code is a combination of three nucleotides (a triplet, or codon).
2. Continuity - there are no punctuation marks between the triplets, that is, the information is read continuously.
3. Non-overlapping - the same nucleotide cannot be part of two or more triplets at the same time (this does not hold for some overlapping genes of viruses, mitochondria and bacteria, which encode several proteins read in different reading frames).
4. Unambiguity (specificity) - a given codon corresponds to only one amino acid (however, the UGA codon in Euplotes crassus codes for two amino acids, cysteine and selenocysteine).
5. Degeneracy (redundancy) - several codons can correspond to the same amino acid.
6. Universality - the genetic code works in the same way in organisms of different levels of complexity, from viruses to humans (genetic engineering methods are based on this; there are a number of exceptions, shown in the table in the "Variations of the standard genetic code" section below).

Conditions for biosynthesis

Protein biosynthesis requires: the genetic information of a DNA molecule; messenger RNA, the carrier of this information from the nucleus to the site of synthesis; ribosomes, the organelles where protein synthesis itself occurs; a set of amino acids in the cytoplasm; transfer RNAs, which recognize codons and carry the corresponding amino acids to the site of synthesis on the ribosomes; and ATP, the substance that provides energy for the process.

Stages

Transcription - the process of biosynthesis of all types of RNA on a DNA template, which takes place in the nucleus.

A certain section of the DNA molecule unwinds, and the hydrogen bonds between the two chains are broken by enzymes. On one DNA strand, as on a template, an RNA copy is synthesized from nucleotides according to the principle of complementarity. Depending on the DNA region, ribosomal, transfer and messenger RNAs are synthesized in this way.

After synthesis, the mRNA leaves the nucleus and moves into the cytoplasm, to the site of protein synthesis on the ribosomes.
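Template-directed synthesis of RNA by complementarity can be sketched as a simple base-by-base mapping. In this illustration the template DNA strand is assumed to be written in the 3'→5' direction, so that the resulting mRNA reads 5'→3'; the sequence is a made-up example and the names `PAIRING` and `transcribe` are chosen for the sketch.

```python
# Complementary pairing between a DNA template base and the RNA base built on it.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Build the mRNA (5'->3') complementary to a DNA template written 3'->5'."""
    return "".join(PAIRING[base] for base in template_3_to_5)


template = "TACGGTATT"          # hypothetical template strand, 3'->5'
print(transcribe(template))     # complementary mRNA, 5'->3'
```

Note that adenine in the template pairs with uracil in RNA, since RNA contains uracil in place of thymine.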


Translation - the process of synthesis of polypeptide chains, carried out on ribosomes, in which mRNA acts as an intermediary in transferring information about the primary structure of the protein.

Protein biosynthesis consists of a series of reactions.

1. Activation and coding of amino acids. tRNA has the shape of a cloverleaf; its central loop carries a triplet anticodon that corresponds to a particular amino acid and to the codon on the mRNA. Each amino acid is attached to its tRNA using the energy of ATP. A tRNA-amino acid complex is formed, which enters the ribosome.

2. Formation of the mRNA-ribosome complex. In the cytoplasm, mRNA binds to ribosomes on the granular ER.

3. Assembly of the polypeptide chain. tRNAs carrying amino acids pair with the mRNA according to the complementarity of anticodon and codon and enter the ribosome. In the peptidyl center of the ribosome a peptide bond is formed between two amino acids, and the released tRNA leaves the ribosome. The mRNA advances by one triplet at a time, bringing in a new tRNA-amino acid complex and removing the released tRNA from the ribosome. The entire process is powered by ATP. One mRNA can combine with several ribosomes, forming a polysome, on which many molecules of one protein are synthesized simultaneously. Synthesis ends when a nonsense codon (stop codon) is reached on the mRNA. The ribosomes separate from the mRNA, and the polypeptide chains are released from them. Since the synthesis takes place on the granular endoplasmic reticulum, the resulting polypeptide chains enter the ER tubules, where they acquire their final structure and become protein molecules.

All synthesis reactions are catalyzed by special enzymes and use the energy of ATP. The rate of synthesis is very high, and the time required depends on the length of the polypeptide. For example, on an Escherichia coli ribosome a protein of 300 amino acids is synthesized in approximately 15-20 seconds.
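The assembly step can be sketched as a loop over codons. This is a simplified illustration, not a model of the real machinery: it assumes the standard codon table, starts at the first AUG, and stops at the first nonsense codon. The anticodon is written 5'→3', i.e. as the reverse complement of the codon; the mRNA is a made-up example and all names are assumptions of the sketch.

```python
from itertools import product

# Standard genetic code, one-letter amino acid symbols, "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon(codon):
    """tRNA anticodon (written 5'->3') that pairs with the given mRNA codon."""
    return "".join(COMPLEMENT[base] for base in reversed(codon))


def translate(mrna):
    """Start at the first AUG and add one amino acid per codon until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""                       # no initiation codon, nothing is synthesized
    chain = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":           # nonsense codon: the chain is released
            break
        chain.append(amino_acid)
    return "".join(chain)


mrna = "GGAUGGCUUUCUAAGG"               # hypothetical mRNA with a short untranslated lead
print(translate(mrna))
print(anticodon("AUG"))
```

The nucleotides before the first AUG and after the stop codon are not translated, which mirrors the role of the initiation and termination signals described above.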

The genetic code is a unified system for recording hereditary information in nucleic acid molecules in the form of a sequence of nucleotides. The genetic code uses an alphabet of only four nucleotide letters, which differ in their nitrogenous bases: A, T, G, C.

The main properties of the genetic code are as follows:

1. The genetic code is triplet. A triplet (codon) is a sequence of three nucleotides that codes for one amino acid. Since proteins contain 20 amino acids, each of them obviously cannot be encoded by a single nucleotide (there are only four types of nucleotides in DNA, so 16 amino acids would remain uncoded). Two nucleotides are also not enough, since in that case only 16 amino acids could be encoded. This means that the smallest number of nucleotides encoding one amino acid is three. (In this case the number of possible nucleotide triplets is 4^3 = 64.)

2. The redundancy (degeneracy) of the code is a consequence of its triplet nature and means that one amino acid can be encoded by several triplets (since there are 20 amino acids and 64 triplets). The exceptions are methionine and tryptophan, each of which is encoded by only one triplet. In addition, some triplets perform specific functions. Thus, in the mRNA molecule three of them, UAA, UAG and UGA, are terminating codons, i.e. stop signals that end the synthesis of the polypeptide chain. The triplet corresponding to methionine (AUG), when it stands at the beginning of the coding sequence, also performs the function of initiating reading.

3. Simultaneously with redundancy, the code has the property of unambiguity, which means that each codon corresponds to only one specific amino acid.

4. The code is collinear, i.e. the order of nucleotides in a gene corresponds exactly to the order of amino acids in the protein.

5. The genetic code is non-overlapping and compact, that is, it does not contain "punctuation marks". This means that codons (triplets) cannot overlap during reading and that, starting at a certain codon, reading proceeds continuously, triplet by triplet, up to the stop signals (terminating codons). For example, in mRNA the nucleotide sequence AUGGUGCUUAAUGUG will be read only as the triplets AUG, GUG, CUU, AAU, GUG, and not as AUG, UGG, GGU, GUG, etc., or AUG, GGU, UGC, CUU, etc., or in any other way (for example, codon AUG, punctuation mark G, codon UGC, punctuation mark U, etc.).

6. The genetic code is universal, that is, the nuclear genes of all organisms encode information about proteins in the same way, regardless of the level of organization and systematic position of these organisms.

In any cell and organism, all anatomical, morphological and functional features are determined by the structure of the proteins they contain. The hereditary property of an organism is its ability to synthesize certain proteins. The biological characteristics of a protein depend on the order of amino acids in its polypeptide chain.
Each cell has its own sequence of nucleotides in the DNA polynucleotide chain. This is the genetic code of DNA, through which information about the synthesis of certain proteins is recorded. This article describes what the genetic code is, its properties, and genetic information.

A bit of history

The idea that a genetic code might exist was formulated by G. Gamow and A. Dounce in the middle of the twentieth century. They proposed that the nucleotide sequence responsible for specifying a particular amino acid contains at least three units. Later it was proven that the number is exactly three nucleotides; this unit of the genetic code was called a triplet or codon. There are sixty-four codons in total, because the nucleic acid molecule, whether DNA or RNA, consists of residues of four different nucleotides.
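The counting argument behind "at least three units" can be reproduced in a few lines. This is a minimal sketch; the variable names are chosen only for this illustration.

```python
from itertools import product

NUCLEOTIDES = "ACGU"      # the four letters of the RNA alphabet
AMINO_ACID_COUNT = 20     # amino acids that must each receive at least one code word

# Number of distinct code words of length 1, 2 and 3 over four letters: 4 ** n.
words_by_length = {n: len(NUCLEOTIDES) ** n for n in (1, 2, 3)}

# Shortest word length that provides at least twenty distinct words.
shortest_sufficient = min(
    n for n, count in words_by_length.items() if count >= AMINO_ACID_COUNT
)

# Enumerate every triplet explicitly to confirm the total.
triplets = ["".join(letters) for letters in product(NUCLEOTIDES, repeat=3)]

print(words_by_length)
print(shortest_sufficient, len(triplets))
```

Single nucleotides give 4 words and pairs give 16, both fewer than twenty, so three is the smallest length that can work.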

What is the genetic code

Encoding the amino acid sequence of a protein by means of a nucleotide sequence is characteristic of all living cells and organisms. That is what the genetic code is.
There are four nucleotides in DNA:

  • adenine - A;
  • guanine - G;
  • cytosine - C;
  • thymine - T.

They are denoted by capital letters, either Latin or (in Russian-language literature) Cyrillic.
RNA also has four nucleotides, but one of them is different from DNA:

  • adenine - A;
  • guanine - G;
  • cytosine - C;
  • uracil - U.

All nucleotides line up in chains; DNA forms a double helix, while RNA is single-stranded.
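The two alphabets and the pairing inside the double helix can be illustrated as follows. This is a simplified sketch: the function names are invented here, and the direction of the strands is deliberately ignored.

```python
# Partners in the DNA double helix: A pairs with T, G pairs with C.
DNA_PAIRS = str.maketrans("ACGT", "TGCA")


def paired_bases(dna):
    """Return the partner base for each position (strand direction is ignored)."""
    return dna.translate(DNA_PAIRS)


def to_rna_alphabet(dna):
    """Rewrite a DNA sequence in the RNA alphabet: thymine (T) becomes uracil (U)."""
    unknown = set(dna) - set("ACGT")
    if unknown:
        raise ValueError(f"not DNA letters: {sorted(unknown)}")
    return dna.replace("T", "U")


print(paired_bases("ATGGTGCTT"))
print(to_rna_alphabet("ATGGTGCTT"))
```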
Proteins are built from amino acids, which, arranged in a particular sequence, determine the protein's biological properties.

Properties of the genetic code

Triplet nature. The unit of the genetic code consists of three letters, i.e. it is a triplet. Each of the twenty amino acids is coded for by a specific group of three nucleotides called a codon or triplet. Sixty-four combinations can be formed from four nucleotides, which is more than enough to encode twenty amino acids.
Degeneracy. Each amino acid corresponds to more than one codon, with the exception of methionine and tryptophan.
Unambiguity. One codon codes for only one amino acid. For example, in a healthy person's gene for the beta chain of hemoglobin, the triplet GAG or GAA codes for glutamic acid. In everyone who has sickle cell anemia, one nucleotide in this triplet is changed, so that valine is encoded instead.
Collinearity. The amino acid sequence always corresponds to the nucleotide sequence that the gene contains.
The genetic code is continuous and compact, which means that it has no "punctuation marks". That is, starting at a certain codon, reading proceeds continuously. For example, AUGGUGCUUAAUGUG will be read as AUG, GUG, CUU, AAU, GUG, but not as AUG, UGG and so on, or in any other way.
Universality. It is the same for virtually all terrestrial organisms, from humans to fish, fungi and bacteria.
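The hemoglobin example shows degeneracy and unambiguity at once, which the sketch below tries to make concrete. The names are invented for this illustration; the labels "silent", "missense" and "nonsense" are the usual textbook terms rather than wording from this article; and GAG to GUG is used as the commonly cited sickle cell substitution, written here in mRNA letters.

```python
from itertools import product

# Standard genetic code as one string, bases ordered U, C, A, G; '*' is a stop.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon_before, codon_after):
    """Describe the effect of replacing one codon with another."""
    before = CODON_TABLE[codon_before]
    after = CODON_TABLE[codon_after]
    if before == after:
        return "silent"      # degeneracy: a different codon, the same amino acid
    if after == "*":
        return "nonsense"    # the new codon is a stop signal
    return "missense"        # unambiguity: the new codon means another amino acid


# E = glutamic acid, V = valine (one-letter symbols).
print(CODON_TABLE["GAG"], CODON_TABLE["GAA"], CODON_TABLE["GUG"])
print(classify_substitution("GAG", "GAA"))
print(classify_substitution("GAG", "GUG"))
```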

Table

Not all known amino acids appear in the table. Hydroxyproline, hydroxylysine, phosphoserine, iodinated derivatives of tyrosine, cystine and some others are absent, since they are derivatives of amino acids encoded by mRNA and are formed by modification of the protein after translation.
It follows from the properties of the genetic code that one codon codes for one amino acid. The exceptions are the codons that perform an additional function, namely those coding for valine and methionine. When such a codon stands at the beginning of the mRNA, it binds a tRNA carrying formylmethionine. Once synthesis is complete, the formyl residue is split off, leaving a methionine residue. Thus, these codons act as initiators of polypeptide chain synthesis. If they are not at the beginning, they are no different from other codons.
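Reading from an initiating codon to a terminating one can be sketched as below. The sketch rests on loud simplifications that are assumptions of this example, not statements from the text: only AUG is treated as a start, the first AUG found is used, and the formyl group is not modeled. The function name translate and the test sequence are likewise invented for illustration.

```python
from itertools import product

# Standard genetic code as one string, bases ordered U, C, A, G; '*' is a stop.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG up to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""            # no initiating codon, nothing is read
    protein = []
    # Step through complete triplets only, without overlaps or gaps.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break            # terminating codon: synthesis stops
        protein.append(amino_acid)
    return "".join(protein)


# The internal GUG codons are read as ordinary valine (V).
print(translate("GGAUGGUGCUUAAUGUGUAAGG"))
```

In this sequence the second AUG-like stretch never acts as a start, because reading is already under way in a fixed frame when the ribosome reaches it.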

Genetic information

This concept means a program of properties that is transmitted from ancestors. It is embedded in heredity as a genetic code.
The genetic code is implemented during protein synthesis with the participation of:

  • messenger (informational) RNA, mRNA;
  • ribosomal RNA, rRNA.

Information is transmitted by a direct pathway (DNA-RNA-protein) and a reverse one (environment-protein-DNA).
Organisms can receive, store and transfer it, and use it as effectively as possible.
Being inherited, information determines the development of an organism. But through interaction with the environment the organism's response is altered, and this drives evolution and development. In this way, new information is laid down in the organism.


Working out the regularities of molecular biology and the discovery of the genetic code showed the need to combine genetics with Darwin's theory. On this basis the synthetic theory of evolution, or non-classical biology, emerged.
Darwin's heredity, variability and natural selection are supplemented by genetically determined selection. Evolution is realized at the genetic level through random mutations and the inheritance of the most valuable traits, those best adapted to the environment.

Deciphering the human code

In the nineties, the Human Genome Project was launched, and as a result, in the 2000s, genome fragments containing 99.99% of human genes were identified. Fragments that are not involved in protein synthesis and do not code for proteins remained unidentified, and their role is still unknown.

Chromosome 1, the last to be sequenced (in 2006), is the longest in the genome. More than three hundred and fifty diseases, including cancer, arise from disorders and mutations in it.

The role of such research can hardly be overestimated. Once it was discovered what the genetic code is, it became clear what patterns development follows and how morphological structure, the psyche, predisposition to certain diseases, metabolism and defects of individuals are formed.