Designer Genes

Designer Genes is a Division C biology event for the 2019 season. It was previously an event for the 2013 and 2014 seasons. The event covers topics relating to genetics and the molecular biology of inheritance.

In addition to the topics listed in the table in section 3 of the rules, this event also covers the material in the Division B event Heredity (although generally in less detail since it is only a part of the event). Several sections of this page link to the appropriate sections of the Heredity page.

Inheritance
Refer to Heredity

DNA
Refer to Heredity

RNA
Refer to Heredity

Mitosis
Refer to Heredity

Meiosis
Refer to Heredity

Gene Expression
Several different factors interact with RNA transcription and translation to control gene expression.

Transcriptional
Gene expression can be controlled during or after transcription. The rate of gene transcription is often controlled by allowing or denying RNA polymerase access to the gene. Termination can also occur early, preventing the gene from being transcribed properly. Transcriptional regulation can also occur when RNA polymerase is attempting to escape the promoter complex to start transcribing DNA. Protein factors can also alter the rate of transcription.

Post-transcriptional
There are three main post-transcriptional processes: the processing of 3' and 5' ends, RNA splicing, and alternative splicing. At the end of the transcription process, the 3' end gains 50-250 adenine nucleotides known as the Poly-A tail. The 5' end is capped with a 7-methylguanosine residue which is an altered version of guanine. This cap prevents the RNA from degrading and stabilizes the mRNA, enabling it to undergo translation into proteins. Certain enzymes are able to break down the Poly-A tail and cap, allowing nuclease enzymes to break down the RNA.

Alternative splicing
Pre-mRNA can be divided into two kinds of segments known as introns and exons. Introns are parts of the transcript that are not used for translation and the production of proteins. Exons are the remaining parts, which are kept in the mature mRNA. During RNA splicing, the introns are removed and the exons are joined together. Small nuclear ribonucleoproteins (snRNPs) recognize the splice sites and recruit additional proteins to form an assembly known as the spliceosome. This assembly removes the introns and joins the exons.

The process for alternative splicing is similar to the process for regular RNA splicing. However, the key difference is that alternative splicing produces different mature mRNAs from the same primary transcript. Exons are included or skipped in different combinations to create different proteins from the same gene. This process (not to be confused with exon shuffling, an evolutionary process that builds new genes by recombining exons) is one reason why humans produce so many proteins despite having a limited number of genes.
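The difference between regular and alternative splicing can be illustrated with a short Python sketch. The pre-mRNA string, the exon coordinates, and the function names below are all made up for illustration, and real cells only produce the isoforms permitted by their splicing signals, not every possible combination.

```python
from itertools import combinations

def splice(pre_mrna, exons):
    """Join the exon segments, given as (start, end) half-open index pairs."""
    return "".join(pre_mrna[start:end] for start, end in exons)

def possible_isoforms(pre_mrna, exons):
    """List every transcript that keeps some subset of exons in their original order."""
    isoforms = set()
    for k in range(1, len(exons) + 1):
        for subset in combinations(exons, k):
            isoforms.add(splice(pre_mrna, subset))
    return sorted(isoforms)

# Hypothetical pre-mRNA: uppercase = exons, lowercase = introns.
pre_mrna = "AUGGCC" + "guaagu" + "UUUAAA" + "guccag" + "GGGUGA"
exons = [(0, 6), (12, 18), (24, 30)]

# Regular splicing keeps every exon.
full_length = splice(pre_mrna, exons)
# Alternative splicing (here, skipping the middle exon) gives a different mRNA.
exon_skipped = splice(pre_mrna, [exons[0], exons[2]])

print(full_length)
print(exon_skipped)
print(len(possible_isoforms(pre_mrna, exons)), "combinations in this toy model")
```

Even with only three exons, the toy model yields several distinct transcripts from one primary transcript, which is the point of the paragraph above.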

microRNA
MicroRNA (abbreviated to miRNA) is a small sequence of RNA that regulates gene expression. It is typically about 22 nucleotides in length. miRNA works by binding to complementary sequences in mRNA, which silences the mRNA by cleaving it into two pieces, destabilizing it, or slowing down its translation into proteins. miRNA is involved in a variety of biological functions, including cell cycle control, apoptosis, development, aging, and immune responses. miRNA has also been implicated in various diseases including cancer and certain types of heart and neurological diseases. One miRNA can target multiple genes, regulating the expression of multiple proteins.

Translational
Gene expression can also be regulated or modified during or after RNA translation.

Lac and Trp Operons
The lac and trp operons are classic examples of prokaryotic gene regulation. Many prokaryotic genes, such as those in E. coli, are always turned "on", but others are active only when their products are needed by the cell, so their expression must be regulated.

An operon is a group of genes transcribed together from a single promoter. The lac operon was the first to be discovered. In the model bacterium E. coli, this operon is transcribed in the presence of lactose to give the bacterium the ability to digest this source of energy. It has three structural genes (lacZ, lacY, and lacA), as well as a promoter, an operator, and a terminator; a separate regulatory gene (lacI) codes for the repressor. To activate lactose digestion, an isomer of lactose (allolactose) binds to the operon's repressor, releasing it from the operator and allowing the operon to be transcribed.

Whereas the lac operon gives E. coli the ability to digest lactose, the trp operon controls the bacterium's ability to synthesize the amino acid tryptophan. Its five structural genes (trpE, trpD, trpC, trpB, and trpA) code for the enzymes of the tryptophan biosynthesis pathway, including tryptophan synthase. When tryptophan is plentiful, it binds to the trp repressor, which then binds the operator and shuts off transcription, so the cell does not spend energy making tryptophan it already has. As such, the trp operon is an example of a repressible operon. The lac operon, by contrast, is an inducible operon because its genes are expressed only in the presence of a substance (lactose).
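The contrast between an inducible and a repressible operon can be written as two small truth functions. This is a deliberately simplified sketch: the function names are hypothetical, and it ignores other layers of control such as the effect of glucose on the lac operon and attenuation in the trp operon.

```python
def lac_operon_transcribed(lactose_present):
    """Inducible operon: the repressor sits on the operator by default."""
    # Allolactose (made from lactose) inactivates the repressor.
    repressor_on_operator = not lactose_present
    return not repressor_on_operator

def trp_operon_transcribed(tryptophan_present):
    """Repressible operon: the repressor is inactive by default."""
    # Tryptophan acts as a corepressor and activates the repressor.
    repressor_on_operator = tryptophan_present
    return not repressor_on_operator

for present in (False, True):
    print("lactose present:", present, "-> lac on:", lac_operon_transcribed(present))
    print("tryptophan present:", present, "-> trp on:", trp_operon_transcribed(present))
```

The two functions are mirror images: the small molecule switches the lac operon on and the trp operon off.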

Post-translational
The way a protein functions hinges on the way it is folded. Hydrogen bonds and other interactions form between the amino acids of the polypeptide, producing the secondary and tertiary structure of the protein. Chaperonins assist the folding of proteins and help ensure that they do not fold improperly. Proteins that are folded improperly and not destroyed are associated with numerous diseases such as Alzheimer's disease, cystic fibrosis, and cancer. Enzymes can also process the polypeptide once it is folded by removing amino acid residues.

Carbohydrates, lipids and phosphate groups can also be attached to the polypeptides. The attachment of carbohydrates is known as glycosylation, and often promotes protein folding and stability. Lipidation often occurs in proteins that are going to be attached to the cell membrane. The most common type of post-translational modification is phosphorylation, which typically regulates the activity of enzymes.

Epigenetics
Epigenetics is the study of changes in organisms not caused by the alteration of genetic code. Epigenetics revolves around gene expression, not the DNA itself. It affects how genes are read by cells, and how they produce proteins. Think of the human genome as a filing cabinet, and the genes as folders that contain the instructions to make a protein. Certain folders might be marked as important, or others could be marked as less important. These epigenetic marks control the expression of genes. It is the reason that even though every cell in the body has the same DNA at its core, different cells have different functions. A liver cell would open different folders in the filing cabinet than a brain cell would, because it would need to make different proteins.

Epigenetic marks take the form of chemical tags that are placed on the DNA itself (such as methyl groups) or at different places on the histone proteins that package it, and each one has a different effect. They can make DNA more accessible to proteins, or make it less accessible so that a specific gene is not transcribed. Some epigenetic marks cover large stretches of DNA, while others are gathered at the start of genes. Epigenetic marks can also change over time. These changes can be caused by anything from chemical additives in plastics to errors during DNA replication.

Some epigenetic marks can also be inherited through generations. This is one way the effects of environmental factors can be passed down. Addictive behavior is thought to be partly inherited in this way, and the effects of nutrient deprivation can be passed down in this way too. However, passing down epigenetic tags is different from passing down genes. Reproductive cells undergo a process called reprogramming, which is supposed to erase all epigenetic tags. However, on some genes it fails and leaves these tags in place to be passed down to another generation. In mammals, about 1% of genes escape epigenetic reprogramming.

Phylogenetics
Phylogenetics is the study of evolutionary relationships. A phylogenetic tree displays the relationships among organisms based upon their similarities and differences. Rooted trees have a common ancestor, and in some cases the length of a branch can indicate time estimates. Unrooted trees only show how closely the organisms are related to one another and do not specify an ancestral root. Phylogenetic trees are hypotheses inferred from the available data and do not show exact evolutionary history, but they can still display how organisms could have evolved.

Hardy-Weinberg Equilibrium
The Hardy-Weinberg equilibrium is a common population model used in genetics.

Conditions
The Hardy-Weinberg Law states that a population will maintain the same allele and genotype frequencies over each generation unless one of five specific influences is introduced into the population. For a population to be in Hardy-Weinberg equilibrium, it must meet all five of the conditions listed below:


 * 1) No mutations: Mutations introduce new alleles into the population.
 * 2) No gene flow: Like mutations, immigration or emigration can introduce new alleles (or bolster/diminish existing alleles).
 * 3) Very large population: Genetic drift is likely to occur in a smaller population. Hardy-Weinberg equilibrium can only occur in a population approaching infinity.
 * 4) No natural selection: If some traits are discriminated for/against by environmental conditions, the genotype frequencies will not be in equilibrium over the generations.
 * 5) Random mating: Like natural selection, sexual selection involved in non-random mating could discriminate for/against traits.

An example of Hardy-Weinberg: Consider a world where everyone has either purple or blue skin. "S" is the allele for purple skin, and "s" is the allele for blue skin. The frequency of each allele is constant, and the two allele frequencies have to add to 1, as do the genotype frequencies. The frequencies of the alleles are represented as [math]p[/math] (for S) and [math]q[/math] (for s). Therefore, the probability of being homozygous purple (SS) would be [math]p*p[/math] or [math]p^2[/math], the probability of being heterozygous (Ss) would be [math]p*q + q*p[/math] or [math]2pq[/math], and the probability of being homozygous blue (ss) would be [math]q*q[/math] or [math]q^2[/math]. These probabilities form the two equations used in the Hardy-Weinberg equilibrium.

Equations
There are two equations used in the Hardy-Weinberg Law:


 * 1) [math]p^2 + 2pq + q^2 = 1[/math]
 * 2) [math]p + q = 1[/math]

where

 * [math]p[/math] is the frequency of the dominant allele in the population, expressed as a decimal
 * [math]q[/math] is the frequency of the recessive allele in the population, expressed as a decimal
 * [math]p^2[/math] is the proportion of homozygous dominant individuals
 * [math]2pq[/math] is the proportion of heterozygous individuals
 * [math]q^2[/math] is the proportion of homozygous recessive individuals

These equations only apply if the population is in Hardy-Weinberg equilibrium.
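As a rough illustration of what "in equilibrium" means in practice, the Python sketch below estimates [math]p[/math] and [math]q[/math] from observed genotype counts and compares the observed counts with the counts the first equation predicts. The function name and the sample counts are hypothetical, and a real analysis would follow this with a statistical test (such as a chi-square test) rather than comparing by eye.

```python
def hardy_weinberg_expected(n_SS, n_Ss, n_ss):
    """Estimate allele frequencies from genotype counts and return expected counts."""
    total = n_SS + n_Ss + n_ss
    # Each individual carries two alleles; SS carries two S, Ss carries one.
    p = (2 * n_SS + n_Ss) / (2 * total)
    q = 1 - p
    expected = {
        "SS": p * p * total,      # p^2
        "Ss": 2 * p * q * total,  # 2pq
        "ss": q * q * total,      # q^2
    }
    return p, q, expected

# Hypothetical sample of 1000 individuals.
p, q, expected = hardy_weinberg_expected(n_SS=490, n_Ss=420, n_ss=90)
print("p =", round(p, 3), "q =", round(q, 3))
for genotype, count in expected.items():
    print(genotype, "expected:", round(count, 1))
```

If the observed counts differ greatly from the expected ones, at least one of the five conditions is probably being violated.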

Solving a Hardy-Weinberg Problem
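A typical Hardy-Weinberg problem gives the percentage of one genotype in a population and asks for the remaining allele and genotype frequencies. In the sample problem worked through below, the homozygous recessive genotype (aa) makes up 36% of the population.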

IMPORTANT: Before attempting to solve the problem, it is critical to analyze all of the given information and approach it in the correct manner. Make sure to check the math after finishing! One mistake will throw off the entire problem. When solving a problem, make sure to work in the order as follows:


 * Step 1: Determine [math]q[/math]. Since a dominant phenotype can have either a homozygous or heterozygous genotype, it is easier to find the recessive allele first (unless an exact homozygous/heterozygous dominant value is given).
 * Step 2: Determine [math]p[/math]. Using the second equation, [math]p[/math] can be found once [math]q[/math] has been determined.
 * Step 3: Determine [math]p^2[/math] and [math]q^2[/math]. Steps 3 and 4 are interchangeable, but finding [math]p^2[/math] and [math]q^2[/math] first is generally the common practice.
 * Step 4: Determine [math]2pq[/math].

The answers and work (using the four steps) for the sample problem are shown below:


 * Step 1: Determine [math]q[/math]. Since aa, or [math]q^2[/math], is 36% (0.36), the frequency of the recessive allele a (this is [math]q[/math] in Hardy-Weinberg terms) must be [math]\sqrt{0.36} = 0.6[/math], or 60%.
 * Step 2: Determine [math]p[/math]. Using the second equation, [math]p + q = 1[/math]. Therefore, [math]p[/math], the frequency of A, must be [math]1 - 0.6 = 0.4[/math] (40%).
 * Step 3: Determine [math]p^2[/math] and [math]q^2[/math]. Now that [math]p[/math] and [math]q[/math] (A and a respectively) are both known, [math]p^2[/math] and [math]q^2[/math] can be found by squaring each term. In this case, [math]p^2 = 0.16[/math] and [math]q^2 = 0.36[/math] (16% and 36% respectively).
 * Step 4: Determine [math]2pq[/math]. This can be done two ways. Rearranging the first equation, [math]2pq = 1 - p^2 - q^2[/math], so [math]2pq = 1 - 0.16 - 0.36 = 0.48[/math] (48%). Alternatively, [math]2pq[/math] can be found by multiplying [math]p[/math] and [math]q[/math] together, then multiplying that by [math]2[/math]: [math]2 \times 0.4 \times 0.6 = 0.48[/math].

So, the answers to the sample questions are:


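 * 1) 0.36, the frequency of aa, or [math]q^2[/math] (this was given to us in the problem)
 * 2) 0.6, the frequency of a, or [math]q[/math]
 * 3) 0.4, the frequency of A, or [math]p[/math]
 * 4) 48%, the percentage of heterozygous (Aa) individuals, or [math]2pq[/math]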

Note: Frequency is always expressed as a decimal (and percentages are expressed as percents).
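The four steps can also be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name and dictionary keys are made up for illustration, and it assumes the population is in Hardy-Weinberg equilibrium.

```python
import math

def solve_from_recessive(q_squared):
    """Work through the four steps, starting from the homozygous recessive frequency."""
    q = math.sqrt(q_squared)   # Step 1: q is the square root of q^2
    p = 1 - q                  # Step 2: p + q = 1
    p_squared = p * p          # Step 3: homozygous dominant frequency
    two_pq = 2 * p * q         # Step 4: heterozygous frequency
    return {"q": q, "p": p, "p^2": p_squared, "q^2": q * q, "2pq": two_pq}

result = solve_from_recessive(0.36)
for name, value in result.items():
    print(name, "=", round(value, 2))

# Sanity check: the three genotype frequencies must add to 1.
total = result["p^2"] + result["2pq"] + result["q^2"]
print("total =", round(total, 2))
```

Checking that the genotype frequencies add to 1 is a quick way to catch an arithmetic slip.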

Sequencing
There are a variety of ways to sequence DNA, or determine the specific order that nucleotides are in. One of the most reliable methods of sequencing is the chain-termination method, or Sanger sequencing. This method was one of the earliest and is typically used on strands of DNA that have 900 base pairs or fewer. It is expensive and inefficient for larger scale projects, but useful for individual pieces of DNA. The Sanger method was used in the Human Genome Project to sequence all 22 autosomes and the X and Y chromosomes. It is a three-step process:


 * A DNA fragment is denatured into a single strand and cooled so that the primer can bind to it. A primer and DNA polymerase are added, along with regular deoxynucleotides and fluorescent chain-terminating dideoxynucleotides. These special nucleotides lack a hydroxyl group on the 3' carbon, preventing the addition of further nucleotides.
 * Once the primer binds to the DNA fragment, the temperature is raised again and the DNA polymerase begins to add DNA nucleotides to appropriate sites on the template DNA. This continues until the polymerase adds a chain-terminating dideoxynucleotide instead of a regular one.
 * This process creates numerous strands of varying lengths. These strands can be separated by length using gel electrophoresis, and the fluorescent label on each strand shows which dideoxynucleotide is at its end. Reading the labels from the shortest strand to the longest gives the sequence of the strand complementary to the template DNA.
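The logic of reading a sequence from chain-terminated fragments can be sketched in Python. This is a simplified model, not a simulation of the chemistry: it assumes one fragment is produced for every possible stopping point, ignores the primer, and uses a made-up template and hypothetical function names.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def synthesized_strand(template):
    """Strand built by the polymerase: the reverse complement of the template, 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(template))

def terminated_fragments(template):
    """One fragment for each position where a dideoxynucleotide could stop extension."""
    new_strand = synthesized_strand(template)
    return [new_strand[:length] for length in range(1, len(new_strand) + 1)]

def read_sequence(fragments):
    """Sort fragments by length (as a gel would) and read each labelled final base."""
    return "".join(fragment[-1] for fragment in sorted(fragments, key=len))

template = "GATTACA"  # hypothetical template, written 5' to 3'
fragments = terminated_fragments(template)
print(fragments)
print(read_sequence(fragments))
```

The sequence that comes out is the complement of the template, so the template itself is recovered by taking the reverse complement of the read.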

Next Generation Sequencing
Next Generation Sequencing (NGS), or high-throughput sequencing, is a name that describes several different ways to sequence DNA. It is faster and cheaper per base than Sanger sequencing because millions of much smaller sequencing reactions take place at once.

RNA-Seq and Tn-Seq
RNA sequencing (RNA-Seq, also known as whole transcriptome shotgun sequencing or WTSS) is the use of Next Generation Sequencing to reveal which RNAs are in a sample at a given moment and in what quantity. It is replacing microarrays in many labs. RNA-Seq sequences the mRNA and can be used to analyze gene expression, typically under different conditions (such as with drugs and without drugs). It can also find variations in RNA and detect post-transcriptional alterations, whereas microarrays can only measure the expression of genes for which probes are present.

Tn-Seq (transposon sequencing) determines genetic interactions and can determine the frequency of mutations. However, it is limited to bacterial studies.

Microarray
A microarray consists of a small solid surface with various known single-stranded segments of DNA attached. It is primarily used for testing unknown DNA sequences: the level of binding of an unknown sequence to one of the microarray segments (known as probes) indicates whether the unknown strand is complementary to a particular known strand. DNA microarrays are also used to measure the expression levels of a large number of genes simultaneously.

RFLP Analysis
In RFLP (restriction fragment length polymorphism) analysis, a DNA sample is digested (cut into pieces) by restriction enzymes, and the resulting restriction fragments are separated according to their lengths by gel electrophoresis. Because individuals differ in where the restriction sites fall, they produce different patterns of fragment lengths. Though now largely obsolete due to the rise of inexpensive DNA sequencing technologies, RFLP analysis was previously used for DNA profiling (also known as DNA fingerprinting).
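A restriction digest can be sketched in Python to show why a single base change alters the banding pattern. The example assumes the enzyme EcoRI, which recognizes GAATTC and cuts after the G; the two allele sequences and the function names are hypothetical, and only one strand of a linear molecule is modelled.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut a linear DNA string at every occurrence of the recognition site."""
    fragments = []
    start = 0
    position = dna.find(site)
    while position != -1:
        cut = position + cut_offset          # EcoRI cuts between G and AATTC
        fragments.append(dna[start:cut])
        start = cut
        position = dna.find(site, position + 1)
    fragments.append(dna[start:])            # whatever is left after the last cut
    return fragments

def band_pattern(dna):
    """Fragment lengths, largest first, as they would be ordered from the top of a gel."""
    return sorted((len(fragment) for fragment in digest(dna)), reverse=True)

# Hypothetical alleles: allele_2 has a point mutation that destroys the second site.
allele_1 = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
allele_2 = "AAAA" + "GAATTC" + "CCCCCCCC" + "GACTTC" + "TT"

print(band_pattern(allele_1))
print(band_pattern(allele_2))
```

The first allele is cut twice and the second only once, so the two give different sets of fragment lengths on a gel.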

Molecular Cloning
Molecular cloning is the process of inserting recombinant DNA into a host organism (for example, certain types of bacteria) and replicating it there. It is most often used to manufacture large quantities of desirable proteins. For example, synthetic insulin is primarily produced using recombinant DNA inside bacteria such as E. coli.

Polymerase Chain Reaction
Polymerase Chain Reaction, abbreviated as PCR, is a method of quickly making billions of copies of a desired section of DNA.

The Polymerase Chain Reaction is another way of creating large numbers of a specific piece of DNA, other than cloning DNA.

In PCR, DNA primers are designed to bind at opposite ends of the target DNA sequence. They are necessary for the initiation of DNA replication. Then, each single strand of DNA is used as the template to produce double-stranded DNA through polymerization.

Individual strands of DNA are separated from double-stranded DNA using heat. Thus, each PCR cycle consists of heat treatment to separate the strands (denaturation), then cooling so that the primers bind to the DNA (annealing), then polymerization to form the new strands (extension). This cycle repeats, and will quickly and exponentially multiply the amount of DNA available.
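Because each cycle can at most double the number of copies, the growth is easy to estimate. The Python sketch below assumes an idealized reaction; the function name and the `efficiency` parameter (the fraction of molecules copied in each cycle) are hypothetical, and real reactions eventually plateau as reagents run out.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Estimate copy number after a number of cycles.

    efficiency = 1.0 means every molecule is copied in every cycle (perfect doubling).
    """
    return initial_copies * (1 + efficiency) ** cycles

for cycles in (10, 20, 30):
    print(cycles, "cycles:", int(pcr_copies(1, cycles)), "copies from one molecule")

# A less efficient reaction still grows exponentially, just more slowly.
print(int(pcr_copies(1, 30, efficiency=0.9)))
```

With perfect doubling, a single starting molecule passes one billion copies after 30 cycles.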

The key step in the development of PCR was the isolation and use of a heat-resistant DNA polymerase (Taq DNA polymerase, from the heat-loving bacterium Thermus aquaticus).

PCR is better than the conventional cloning of DNA due to the fact that PCR can be used with only very small and impure samples of DNA.

Gel Electrophoresis
Gel Electrophoresis is one of the most useful techniques to study macromolecules, especially proteins or nucleic acids. In gel electrophoresis, charged molecules are pulled through a gel (usually purified agar known as agarose) and this separates the molecules. Larger molecules move more slowly through the gel since they get caught in the gel matrix, and molecules with greater charges move faster since the electric field is what is pulling the molecules.

Nucleic acids have a negative charge because of their phosphate backbone, and thus move toward the positive electrode.

Blotting
Blotting is a method used for detecting a specific molecule in a sample. In the case of DNA (a Southern blot), the DNA is first cut by restriction enzymes and sorted by size with gel electrophoresis. A blotting membrane is placed over the gel, and a stack of paper towels on top is used to draw buffer up through the gel and the membrane. As the buffer flows upward, it carries the DNA out of the gel and leaves it behind on the membrane, where it can be probed for the sequence of interest.

Gene Therapy
Gene therapy is the process of introducing genes into a patient in order to cure a disease. It has the potential to eliminate hereditary diseases like cystic fibrosis and could cure other diseases like cancer or AIDS. Many different approaches to gene therapy are being tested, such as deactivating problematic genes, replacing mutated genes, and introducing new genes into the body, but most of these are experimental and can be dangerous. Gene therapy is commonly only tested on diseases that have no other cures.

CRISPR-Cas technology
CRISPR stands for Clustered Regularly Interspaced Short Palindromic Repeats. These repeats are part of a bacterial defense mechanism that can be adapted to target and edit DNA in specific locations. CRISPR technology is widely used for gene editing, and is being investigated as a form of gene therapy to correct mutations that cause diseases. Other CRISPR systems also exist that target RNA or can be used to diagnose illnesses.

CRISPR "spacer" sequences are first transcribed into RNA sequences called crRNAs that can guide the system to the matching portions of DNA. When the DNA is found, Cas9 (an enzyme) binds to the DNA and cuts it, which can disable the gene. Other versions of the Cas9 system can activate gene expression, allowing researchers to study a gene's function.
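The targeting step can be sketched as a simple string search in Python. The sketch goes slightly beyond the description above by assuming the commonly used Cas9 from Streptococcus pyogenes, which requires an "NGG" sequence (the PAM) immediately after the matching DNA and cuts 3 base pairs before it. The guide and target sequences and the function name are made up, the guide is written with DNA letters for simplicity, and only one strand is searched.

```python
def find_cas9_cut_sites(dna, guide):
    """Return the index of each cut: guide match followed by an NGG PAM."""
    cut_sites = []
    guide_length = len(guide)
    for start in range(len(dna) - guide_length - 2):
        target = dna[start:start + guide_length]
        pam = dna[start + guide_length:start + guide_length + 3]
        if target == guide and pam[1:] == "GG":
            # Assumed cut position: 3 bases upstream of the PAM.
            cut_sites.append(start + guide_length - 3)
    return cut_sites

guide = "GACGTTACCGGATCAAGTCA"             # hypothetical 20-nucleotide guide
dna = "TTT" + guide + "TGG" + "AAAC"       # matching site followed by a PAM (TGG)
no_pam = "TTT" + guide + "TCA" + "AAAC"    # same site, but no PAM after it

for cut in find_cas9_cut_sites(dna, guide):
    print(dna[:cut], "|", dna[cut:])
print(find_cas9_cut_sites(no_pam, guide))
```

The second search finds nothing, which illustrates that a matching sequence alone is not enough for this enzyme to cut.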

Genetic Disorders
Refer to Heredity

Resources

 * [[Media:Gangsta DG Notes.pdf|gangsta_duck's Designer Genes Notes]]
 * [[Media:GFNowhere_designer_notes.pdf|GuyFromNowhere's Designer Genes Notes]]
 * Molecular Biology of the Cell notes
 * [[Media:Designer Genes Short Practice Test and Key.pdf|Short Practice Test (and key)]]