Protein Modeling/Human Genome

The 2015 topic for Protein Modeling deals with editing the human genome. The pre-build model is of the catalytic domain of the endonuclease protein FokI.

See the CBM webpage for more details.


Endonuclease FokI

The 2015 Pre-Build Model should represent amino acids 421-560 of chain A of the restriction endonuclease protein FokI, based on the PDB file 2FOK.pdb.

Restriction endonucleases are proteins that can cut DNA at a specific point in a specific sequence, allowing genome editing. FokI, the focus of this year's event, occurs naturally in bacteria as a defense mechanism against invading viruses. It is an enzyme derived from Flavobacterium okeanokoites (or Planomicrobium okeanokoites). The protein has two domains (functional parts): the cleavage domain (nuclease) and the DNA-binding domain. In designing genome-editing restriction enzymes, the FokI nuclease is typically removed from its natural DNA-binding domain and attached to new binding domains, such as zinc fingers, to create a new specialized restriction enzyme. The pre-build model focuses solely on the nuclease.

The nuclease functions solely as a dimer, meaning it requires two copies (one attached to each strand of DNA) in order to successfully cleave the DNA. FokI recognizes a specific DNA sequence (5'GGATG3' on one strand, 5'CATCC3' on the other) and cleaves both DNA strands: 14 bases after the first G of GGATG, and 13 bases before the 5' C of CATCC on the opposite strand. It requires Mg2+ as a cofactor.
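
The cut geometry described above can be sketched in a few lines of Python. This is a minimal illustration, not a tool from the event materials: the function name foki_cut_positions and the example sequence are hypothetical, only the given strand is searched for GGATG, and the offsets assume a counting convention in which the top-strand cut falls 14 bases from the first G (counting that G) and the bottom-strand cut falls 13 bases past the end of the site.

```python
RECOGNITION_SITE = "GGATG"
TOP_CUT_OFFSET = 14     # bases from the first G (inclusive) to the top-strand cut
BOTTOM_CUT_OFFSET = 18  # 5-base site + 13 bases to the bottom-strand cut (assumed convention)


def foki_cut_positions(seq):
    """Return (site_start, top_cut, bottom_cut) for each GGATG found in seq.

    Positions are 0-based indices into seq; a cut at index k falls
    between seq[k - 1] and seq[k].
    """
    seq = seq.upper()
    results = []
    start = seq.find(RECOGNITION_SITE)
    while start != -1:
        top_cut = start + TOP_CUT_OFFSET
        bottom_cut = start + BOTTOM_CUT_OFFSET
        if bottom_cut <= len(seq):  # skip sites too close to the end of the sequence
            results.append((start, top_cut, bottom_cut))
        start = seq.find(RECOGNITION_SITE, start + 1)
    return results


# Hypothetical example sequence: two filler bases, the site, then 20 more bases.
example = "AA" + "GGATG" + "ACGTACGTACGTACGTACGT"
print(foki_cut_positions(example))
```

Under this convention the two cuts are staggered by four bases, so the cleaved ends are not blunt.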

Zinc-Finger Proteins

The 2015 regional and invitational onsite models will represent zinc-finger proteins. The invitational onsite model will be based on the PDB file 1psv.pdb and the regionals onsite model will be based on the file 1mey.pdb.

Zinc finger proteins are composed of a 2-stranded beta sheet and a single alpha helix. They are stabilized by a zinc ion coordinated by four cysteine and histidine residues. The zinc finger domain binds to the major groove of the DNA, while individual side chains interact with the DNA bases to "read" them. This allows chains of zinc finger domains to bind to very specific genomic sequences. Some scientists experimented with the idea of replacing the zinc coordination with other interactions, which led to the design of a peptide that adopts the same shape and structure as the DNA-binding zinc finger domain but relies on a completely different basis for its stability.

Zinc fingers were first identified in a frog transcription factor (transcription factor IIIA). This protein was found to bind both 5S RNA and its cognate DNA. Over the years, zinc fingers have been identified in many other proteins and are one of the most common protein domains that bind to specific DNA/RNA sequences.

Each zinc finger domain has ~30 amino acids. Most zinc finger containing proteins have a series of these domains linked to each other, which bind to the major groove of the DNA. Specific amino acid side chains reach out from these domains to "read" the DNA sequence by interacting with specific DNA bases.

Zinc-Finger Nucleases

Zinc finger nucleases are sequence-specific DNA-binding proteins. They are composed of several zinc fingers, each of which binds three bases, as well as a cleavage domain. Because the nuclease can only function as a dimer, two zinc finger nucleases are created, each carrying one copy of the nuclease domain. One binds to each strand of DNA, and together the two nuclease domains form a functional dimer able to cut both strands of DNA. Repair enzymes in the cell then attempt to fix the break, introducing mutations as they do so.
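
Because each finger reads three bases, the length of the site an array can recognize scales with the number of fingers. The sketch below simply splits a target site into the triplets that successive fingers would read. The function name and the example target are hypothetical, and treating each finger as fully independent of its neighbors is a simplifying assumption.

```python
BASES_PER_FINGER = 3  # from the text: each zinc finger binds three bases


def split_into_finger_triplets(target):
    """Split a target site into the 3-base subsites read by successive fingers."""
    target = target.upper()
    if len(target) % BASES_PER_FINGER != 0:
        raise ValueError("target length must be a multiple of 3")
    return [target[i:i + BASES_PER_FINGER]
            for i in range(0, len(target), BASES_PER_FINGER)]


# Hypothetical 9-base target: a three-finger array would be needed.
triplets = split_into_finger_triplets("GCTGAAGCA")
print(len(triplets), triplets)
```

A longer target needs more fingers, which is one reason chains of zinc finger domains can pick out a single site in a large genome.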

TALE Nuclease Proteins

The 2015 State onsite model will represent Transcription Activator-Like Effector Nuclease (TALEN) proteins and will be based on the PDB file 3v6t.pdb.


The 2015 National onsite model will represent Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR) proteins and will be based on the PDB file 4un3.pdb.

Editing the Human Genome

Restriction Endonucleases

Restriction endonucleases are proteins that can cut DNA at a specific point in a specific sequence, allowing genome editing. They are termed "restriction enzymes" because they restrict infection by bacteriophages. Bacteria are under constant attack by bacteriophages (e.g. bacteriophage phiX174).

To protect themselves, many types of bacteria have developed a method to chop up any foreign DNA, such as that of an attacking phage. The bacteria build an endonuclease (an enzyme that cuts DNA), which circulates in the bacterial cytoplasm, waiting for phage DNA. Each type of restriction enzyme seeks out a single DNA sequence and precisely cuts it in one place.

Example: EcoRI cuts the sequence GAATTC between the G and the A. Roving endonucleases can be dangerous, so bacteria protect their own DNA by modifying it with methyl groups. These groups are added to adenine or cytosine bases (depending on the particular type of bacteria) in the major groove. Methyl groups block the binding of restriction enzymes but don't block the normal reading and replication of the genomic information stored in the DNA. DNA from an attacking bacteriophage won't have protective methyl groups and will be destroyed.

Each particular type of bacteria has a restriction enzyme (or several different ones) that cuts a specific DNA sequence, paired with a methyl-transferase enzyme that protects this same sequence in the bacterial genome.
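
The pairing of a restriction enzyme with a protective methyl-transferase can be sketched as follows. This is a simplified single-strand model: the function name ecori_digest and the way methylation is represented (a set of protected site positions) are assumptions made for illustration, and real methylation acts on individual bases rather than whole sites.

```python
ECORI_SITE = "GAATTC"
CUT_OFFSET = 1  # EcoRI cuts between the G and the A


def ecori_digest(seq, methylated_sites=frozenset()):
    """Cut seq at every unprotected GAATTC and return the fragments.

    methylated_sites holds 0-based start positions of sites that are
    treated as methylated and therefore left uncut.
    """
    seq = seq.upper()
    fragments = []
    last_cut = 0
    pos = seq.find(ECORI_SITE)
    while pos != -1:
        if pos not in methylated_sites:
            cut = pos + CUT_OFFSET
            fragments.append(seq[last_cut:cut])
            last_cut = cut
        pos = seq.find(ECORI_SITE, pos + 1)
    fragments.append(seq[last_cut:])
    return fragments


phage_dna = "AAGAATTCTT"           # hypothetical unmethylated phage DNA
print(ecori_digest(phage_dna))      # cut into two fragments
print(ecori_digest(phage_dna, methylated_sites={2}))  # protected, left intact
```

The same sequence is cut when it is unmethylated and left whole when the site is marked as methylated, which mirrors how the bacterium spares its own genome.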

CCR5 (Chemokine Receptor 5)

CCR5 is a membrane receptor protein found in human immune cells that is used by HIV to enter the host cell. It is an HIV co-receptor: it cooperates with the host cell's primary receptor, CD4, to allow the initial docking of HIV onto T-cells and the subsequent infection.

The CD4-bound HIV envelope spike protein uses this molecule as a co-receptor to enter and infect host cells. In some instances HIV uses another, similar chemokine receptor, CXCR4, as the co-receptor for entry into host cells. A naturally occurring deletion in the CCR5 gene makes a cell resistant to HIV, since the virus is unable to properly bind and insert its genetic information.

CCR5 is normally 352 amino acids long and folds up into a structure composed of 7 transmembrane alpha helices, with structural homology to the family of G protein-coupled receptors (GPCRs). The primary role of the CCR5 protein is to receive chemical signals called chemokines and to recruit other immune cells to help the immune system function.

In some ethnic groups (Caucasians), a 32 nucleotide deletion in the gene results in a corresponding deletion in the mRNA. However, the resistance this variation provides is recessive, meaning both copies of the gene must carry the deletion for the resistant properties to be fully expressed.

Because the genetic code is a triplet code and 32 isn't a multiple of 3, the deletion results in 1) the deletion of 11 amino acids and 2) a switch in the translational reading frame, which scrambles the amino acid sequence after the deletion site. 31 additional amino acids are added as a result of the deletion before the ribosome meets a stop codon. This prematurely terminated CCR5 protein is 215 amino acids long.
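
The arithmetic behind the frameshift can be checked with a short sketch. The toy sequence below is made up purely to show the reading frame changing and is not the CCR5 gene; the function names are hypothetical. The 215 and 31 figures are the ones quoted above.

```python
def deletion_effect(deleted_nt):
    """Return (whole codons removed, leftover bases, whether the frame shifts)."""
    whole_codons, leftover = divmod(deleted_nt, 3)
    return whole_codons, leftover, leftover != 0


def codons(seq):
    """Split seq into complete triplets, ignoring any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


# 32 bases = 10 whole codons plus 2 leftover bases, so the frame shifts.
print(deletion_effect(32))

# Toy gene (not CCR5): a start codon followed by 20 copies of GCA.
toy_gene = "ATG" + "GCA" * 20
toy_mutant = toy_gene[:3] + toy_gene[3 + 32:]   # delete 32 bases after the start codon
print(codons(toy_gene)[:4])     # in frame: every codon after ATG is GCA
print(codons(toy_mutant)[:4])   # shifted: downstream codons now read AGC

# Using the figures in the text: 215 residues, of which 31 are frameshifted.
TRUNCATED_LENGTH = 215
FRAMESHIFTED_RESIDUES = 31
print(TRUNCATED_LENGTH - FRAMESHIFTED_RESIDUES)  # residues matching the normal protein
```

In the toy example every codon downstream of the deletion changes, which is the same effect that scrambles the end of the truncated CCR5 protein.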

CCR5 normally dimerizes and is phosphorylated in the endoplasmic reticulum and is then efficiently trafficked through the Golgi to the cell membrane. In contrast, Δ32 CCR5 is not phosphorylated and is not trafficked to the cell membrane. Δ32 CCR5 retains its ability to dimerize with wild type CCR5, leading to a transdominant negative effect on the delivery of functional CCR5 to the cell surface.

Approximately 15-20% of the northern European population is heterozygous for a naturally occurring 32 base pair deletion in their CCR5 gene, which lowers their CCR5 levels and gives them partial protection against HIV infection. Approximately 1% of European Caucasians are homozygous for this mutation and are resistant to HIV infection. Based on the functional cure of the Berlin patient, it appears that introducing the CCR5 delta 32 mutation may make host cells resistant to HIV. Using an engineered nuclease, such as a zinc finger nuclease, to specifically target and disrupt the CCR5 gene in HIV patients could make the patient's endogenous T-cells resistant to further infection.

Since HIV infection is persistent, making the host cells resistant may provide a functional cure for HIV infected individuals. Sangamo Biosciences (a biotech company specializing in the development of therapeutic zinc finger nucleases) has developed a zinc finger nuclease that is targeted to disrupt the CCR5 gene.

This nuclease is currently being tested in a Phase 2 clinical trial with HIV/AIDS patients by Sangamo Biosciences in collaboration with groups from the University of Pennsylvania School of Medicine and the Albert Einstein College of Medicine.


The Human Immunodeficiency Virus (HIV) is an RNA virus that can infect specific immune cells in our body, called T helper cells. In order to infect these cells, the envelope proteins of the virus bind to the host cell protein CD4. This alters the shape of both proteins, allowing the complex to bind to CCR5, the host cell chemokine receptor. This allows the virus to penetrate the cell membrane and enter.

The RNA genome of HIV is encased in a capsid, which is in turn covered by an envelope derived from the host cell membrane. The structures and functions of most of HIV's proteins are now known. We are still learning about the accessory and regulatory proteins of HIV that exploit the host cell's machinery for the virus's own advantage.

Life Cycle

Attachment: The HIV spike or envelope protein, gp120, attaches to the host cell protein CD4 on specific types of T-cells.
Fusion and entry: Binding of gp120 and CD4 rearranges their structures, allowing the complex to bind another host cell receptor, a chemokine receptor called CCR5. In some cases an alternate receptor called CXCR4 may replace CCR5 in this interaction. This in turn enables the stalk of the HIV spike (the protein gp41) to penetrate the host cell membrane and fuse the viral envelope with the host cell membrane.
Reverse transcription: Upon entry, HIV sheds its capsid and the 2 single strands of viral RNA are converted to double stranded DNA by a special viral enzyme called reverse transcriptase.
Integration: The double stranded DNA, or proviral DNA, enters the host cell nucleus and is integrated into the cell's genome by another special viral enzyme called integrase.
Transcription and translation: The proviral DNA is transcribed and translated like any other host cell gene using host cell machinery (RNA polymerase, ribosomes, etc.).
Assembly and budding: The various viral proteins and RNA come together to assemble the virus. At this stage some of the viral proteins are still linked to each other as part of the polyprotein synthesized by the virus. Various HIV proteins and RNA are packaged into an immature viral particle that buds off from the host cell encased in its membrane.
Maturation of viral particle: Through the action of the viral protease, the various HIV proteins are cut and separated, free to perform their specific functions. This rearrangement, or maturation, turns HIV into a mature infectious particle ready to infect another cell.

All the steps of the viral life cycle are presented in the HHMI BioInteractive animation narrated by HHMI investigator Bruce Walker, MD.


Although there are various examples of genetic editing, this year's event focuses on potential treatment for HIV infection through the use of therapeutic zinc-finger nucleases. An overview is presented here, but it is a good idea to visit the CBM website for more in-depth information.

The approaches currently used to treat HIV infections include:

Viral Enzyme inhibitors: block the actions of some critical enzymes in the HIV lifecycle.
Reverse transcriptase inhibitors (RTI): block initial conversion of viral RNA to proviral DNA that is integrated in the host cell genome by either mimicking the enzyme substrate and directly binding to the active site (nucleoside RTIs), or by binding to a site near the enzyme active site and blocking its function (non-nucleoside RTIs)
Integrase inhibitors: block integration of proviral DNA into the host cell genome preventing permanent infection of the host cells
Protease inhibitors: block cleavage of viral polyprotein, preventing maturation of HIV to infectious particles
Entry inhibitors: block interaction of the CD4-gp120 complex with the chemokine co-receptor preventing entry of HIV in the host cell
Fusion inhibitors: block the structural changes in the stalk of the HIV spike (gp41) that are needed for the viral envelope and host cell membranes to fuse
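
One way to keep the drug classes straight is to map each one to the life cycle step it blocks. The sketch below does this with a plain dictionary; the step and class names are taken from the lists above, while the variable and function names are hypothetical and the mapping is a study aid rather than a clinical reference.

```python
LIFE_CYCLE_STEPS = [
    "attachment",
    "fusion and entry",
    "reverse transcription",
    "integration",
    "transcription and translation",
    "assembly and budding",
    "maturation",
]

# Each inhibitor class mapped to the life cycle step it interferes with.
INHIBITOR_TARGETS = {
    "entry inhibitors": "fusion and entry",
    "fusion inhibitors": "fusion and entry",
    "reverse transcriptase inhibitors": "reverse transcription",
    "integrase inhibitors": "integration",
    "protease inhibitors": "maturation",
}


def untargeted_steps():
    """Return the life cycle steps that none of the listed drug classes block."""
    targeted = set(INHIBITOR_TARGETS.values())
    return [step for step in LIFE_CYCLE_STEPS if step not in targeted]


print(untargeted_steps())
```

Listing the steps that no drug class covers makes it easier to see where approaches such as making host cells resistant would act differently from existing drugs.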

Upcoming Approaches

There are some treatment methods which are still being researched.

Making the host cells resistant to HIV: Currently researchers are using zinc finger nucleases to target the CCR5 gene in stem cells that give rise to blood cells and introduce a deletion or disruption in the gene. As a result these cells are unable to make a functional CCR5 protein and become resistant to HIV infection. A treatment protocol using this approach is currently in a Phase II clinical trial conducted by a group from the University of Pennsylvania School of Medicine, the Albert Einstein College of Medicine and Sangamo Biosciences (a biotech company specializing in the development of therapeutic zinc finger nucleases).
Seek out and destroy all the integrated proviral DNA: A recent research report has suggested the possibility of using a gene therapy approach to specifically identify and edit out the integrated proviral HIV-1 DNA. While there is a long way to go before this can even be tested as a treatment option, it offers the hope that gene therapy can be used for dealing with tough diseases like HIV/AIDS.

Genetic Treatment

Various treatments have focused on blocking the action of the virus once it is in the cell. However, patients with naturally suppressed CCR5 proteins due to a genetic 32 base pair deletion, or who have reduced levels due to heterozygous alleles, have T-cells that are resistant to the virus.

Therefore, introducing this 32 bp deletion into the patient's genetic material can create resistant T-cells. This can, in theory, be done with zinc-finger nucleases. This mutation works by preventing CCR5 proteins from being brought to the cell surface, and has no other known ill effects.