Protein Modeling/CRISPR-Cas9

Revision as of 21:03, 1 August 2019 by AlfWeg (talk | contribs) (Inhibition of Cas9 by AcrIIA4)

The CRISPR-Cas9 and Anti-CRISPR proteins are the topic of Protein Modeling for the 2018-2019 season. The CRISPR complex known as Cascade can be found under the Protein Data Bank ID 4QYZ, and the Cas9 protein can be found under the ID 4OO8. The pre-build model is the anti-CRISPR protein AcrIIA4, which can be found under the ID 5VW1.
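As a convenience for looking up these structures, the short sketch below builds download links for the three PDB IDs named above. It assumes the RCSB file server's usual `https://files.rcsb.org/download/<ID>.pdb` URL pattern; the helper name `pdb_download_url` and the `PDB_ENTRIES` table are made up for this illustration, and the code only builds the links without contacting any server.

```python
# PDB IDs taken from the paragraph above; the labels are informal.
PDB_ENTRIES = {
    "4QYZ": "Cascade (CRISPR complex)",
    "4OO8": "Cas9",
    "5VW1": "AcrIIA4 (pre-build model)",
}


def pdb_download_url(pdb_id: str) -> str:
    """Build a download link for a PDB entry (assumed RCSB URL pattern)."""
    pdb_id = pdb_id.strip().upper()
    # Classic PDB IDs are four alphanumeric characters.
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a 4-character PDB ID: {pdb_id!r}")
    return f"https://files.rcsb.org/download/{pdb_id}.pdb"


for entry_id, label in PDB_ENTRIES.items():
    print(f"{label}: {pdb_download_url(entry_id)}")
```

The downloaded files can then be opened in a molecular viewer of your choice.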


CRISPR stands for Clustered Regularly Interspaced Short Palindromic Repeats, and it refers to a series of DNA sequences found in prokaryotes that defend the organism from bacteriophages. In general terms, CRISPR stores viral DNA separated by palindromic repeats, working as part of an adaptive immune system in bacteria and archaea. CRISPR sequences are found in approximately 50% of bacteria and 90% of archaea that have been sequenced.
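The layout described above (stored viral DNA separated by copies of the same repeat) can be pictured with a small sketch. The sequences here are invented toy strings, much shorter than real repeats and stored viral segments, and `extract_spacers` is a hypothetical helper name used only for this example.

```python
def extract_spacers(array: str, repeat: str) -> list[str]:
    """Return the stored segments that sit between copies of `repeat`."""
    pieces = array.split(repeat)
    # Text before the first repeat and after the last repeat is flanking
    # sequence rather than stored viral DNA, so both ends are dropped.
    return [piece for piece in pieces[1:-1] if piece]


# Toy data (assumption): one made-up repeat and two made-up viral segments.
REPEAT = "GTTTTAGA"
toy_array = "AAAC" + REPEAT + "ACGTACGT" + REPEAT + "TTGACCAA" + REPEAT + "GGG"

spacers = extract_spacers(toy_array, REPEAT)
print(spacers)
```

Each element of `spacers` stands for one remembered piece of viral DNA. Real CRISPR arrays are found with dedicated tools, since real repeats are not always perfectly identical.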

Cas Proteins

Cas (CRISPR-associated) proteins work together with CRISPR to cut viral DNA in bacteria. Several types and classes exist.


Cas9 is the main protein used to cut target DNA. This process is well understood.


Cas1, together with Cas2, takes viral DNA and adds it to the CRISPR array. It is not currently known how this is done.

Important parts of CRISPR Cas9 System

PAM Sequence

PAM stands for protospacer adjacent motif. It is a short sequence immediately downstream of the target DNA, and it is used to identify and locate the target DNA. Anti-CRISPR mimics this sequence to trick the CRISPR-Cas system.


crRNA (CRISPR RNA) is used to locate the target DNA by base pairing with the complementary sequence next to the PAM. tracrRNA (trans-activating CRISPR RNA) is used to bind the crRNA to the Cas protein.


When a bacterium survives an attack from a phage, the Cas1 and Cas2 proteins will remove a 20-bp snippet of viral genetic material and add it to the bacterium's CRISPR array. Later, when the virus attacks again, the CRISPR array is transcribed and the resulting RNA strand pairs up with a tracrRNA through base pairing, forming a single-guide RNA (sgRNA). The sgRNA then forms a complex with the Cas9 protein. The Cas9 complex will search the viral DNA for a sequence complementary to the sgRNA. The Cas9 protein will unwind a section of DNA. If the sgRNA can bind to the section and Cas9 detects a PAM sequence, then Cas9 will make a double-stranded cut. The PAM sequence is a 3-bp (NGG) sequence in the viral DNA, adjacent to the 20-bp target sequence. The Cas9 protein uses the PAM sequence to make sure that it is cutting the viral DNA and not the bacterium's own CRISPR array, which doesn't contain the PAM sequence. The structure of the Cas9-dsDNA-sgRNA complex can be found in PDB file 5F9R.
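The targeting rule in this paragraph (a 20-bp match followed directly by an NGG PAM) can be sketched as a simple string search. This is a deliberately simplified illustration: it scans only one strand, writes the guide in DNA letters, and requires a perfect match. The function name `find_target_sites`, the toy sequences, and the cut position (commonly described as about 3 bp upstream of the PAM, a detail not stated on this page) are assumptions for the example.

```python
import re


def find_target_sites(dna: str, guide: str) -> list[dict]:
    """Find places where `guide` matches and an NGG PAM follows directly."""
    # A lookahead is used so that overlapping candidate sites are not skipped.
    pattern = re.compile(f"(?=({re.escape(guide)})([ACGT]GG))")
    sites = []
    for match in pattern.finditer(dna):
        pam_start = match.start(2)
        sites.append({
            "target_start": match.start(1),
            "pam": match.group(2),
            # Assumption: the cut falls about 3 bp upstream of the PAM.
            "cut_index": pam_start - 3,
        })
    return sites


# Toy 20-letter guide and two made-up sequences.
guide = "GATTACAGATTACAGATTAC"
viral_dna = "TTT" + guide + "TGG" + "AAA"   # target followed by a PAM
own_array = "TTT" + guide + "GTT" + "AAA"   # same target, but no PAM

print(find_target_sites(viral_dna, guide))
print(find_target_sites(own_array, guide))
```

The second search comes back empty, which mirrors how the missing PAM keeps Cas9 from cutting the bacterium's own CRISPR array.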


Some phages use Anti-CRISPR proteins (originally found in Listeria monocytogenes) to deactivate the CRISPR system. These proteins mimic the PAM sequence in DNA to trick the CRISPR-Cas9 system, and they fill the pocket that the target DNA would otherwise occupy.

Inhibition of Cas9 by AcrIIA4 (Anti-CRISPR)

Phages have also evolved their own defenses against the CRISPR system. In a study published in Science Advances, Anti-CRISPR proteins were found to be highly acidic DNA mimics (DOI: 10.1126/sciadv.1701620). In-depth studies by Yang & Patel and Dong et al. provided insight into the structure and function of the AcrIIA4 protein. The main ways that AcrIIA4 inhibits the function of Cas9 are:

  • Blocking the CTD and Topo domains to prevent PAM recognition
  • Blocking the RuvC domain to prevent cleavage of the non-complementary strand

Cas9 must be in complex with an sgRNA in order for AcrIIA4 to bind to it. When AcrIIA4 binds to Cas9, the viral DNA cannot bind to the complex, which allows the virus to survive.



The Cas9 protein has a bi-lobed structure, consisting of a REC (recognition) lobe and a NUC (nuclease) lobe. The lobes are further divided into domains. The REC lobe consists of 3 Helical domains and a Bridge Helix. The NUC lobe consists of a RuvC domain split into 3 parts, an HNH domain, a Topo domain, and a CTD domain. The RuvC domain includes an active site which cleaves the non-complementary strand, and the HNH domain cleaves the complementary strand. The Topo and CTD domains serve to identify the PAM sequence, as well as to bind to the non-complementary strand. The Helical domains and Bridge Helix bind to the complementary DNA strand.
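For study purposes, the lobe and domain breakdown above can be kept as a small lookup table. The sketch below only restates the paragraph; the names `CAS9_DOMAINS`, `lobe_of`, and `domains_mentioning` are made up for this illustration.

```python
# Lobe -> domain -> role, restating the description above.
CAS9_DOMAINS = {
    "REC": {
        "Helical domains": "bind the complementary DNA strand",
        "Bridge Helix": "binds the complementary DNA strand",
    },
    "NUC": {
        "RuvC": "cleaves the non-complementary strand",
        "HNH": "cleaves the complementary strand",
        "Topo": "identifies the PAM and binds the non-complementary strand",
        "CTD": "identifies the PAM and binds the non-complementary strand",
    },
}


def lobe_of(domain: str) -> str:
    """Return the lobe (REC or NUC) that contains `domain`."""
    for lobe, domains in CAS9_DOMAINS.items():
        if domain in domains:
            return lobe
    raise KeyError(f"unknown domain: {domain}")


def domains_mentioning(keyword: str) -> list[str]:
    """List the domains whose role description mentions `keyword`."""
    return [
        name
        for domains in CAS9_DOMAINS.values()
        for name, role in domains.items()
        if keyword in role
    ]


print(lobe_of("HNH"))
print(domains_mentioning("PAM"))
```

Looking up the domains that mention "PAM" returns the Topo and CTD domains, the same two that AcrIIA4 is described as blocking to prevent PAM recognition.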


The AcrIIA4 protein has two chains. The B-chain consists of an alpha helix, a 3-stranded beta sheet, and 2 more alpha helices (listed from the N-terminus to the C-terminus). The beta hairpins play a crucial role in the inhibition of Cas9 by occupying various active sites of the Cas9 protein.