Protein Modeling/CRISPR-Cas9

The CRISPR-Cas9 and anti-CRISPR proteins are the topic of Protein Modeling for the 2018-2019 season. The CRISPR complex known as Cascade can be found under the Protein Data Bank ID 4QYZ, and the Cas9 protein can be found under the ID 4OO8. The pre-build model is the anti-CRISPR protein AcrIIA4, which can be found under the ID 5VW1.

Function
CRISPR stands for Clustered Regularly Interspaced Short Palindromic Repeats. It refers to a family of DNA sequences found in prokaryotes that help defend the organism against bacteriophages. In general terms, a CRISPR array stores snippets of viral DNA separated by palindromic repeats, working as part of an adaptive immune system in bacteria and archaea. CRISPR sequences are found in approximately 50% of bacteria and 90% of archaea that have been sequenced.

CRISPR-Cas9
When a bacterium survives an attack from a phage, the Cas1 and Cas2 proteins remove a 20-bp snippet of viral genetic material and add it to the bacterium's CRISPR array. Later, when the virus attacks again, the CRISPR array is transcribed, and the resulting RNA strand pairs with a tracrRNA through base pairing, forming a guide RNA (engineered systems fuse the two into a single-guide RNA, or sgRNA, the term used on this page). The sgRNA then forms a complex with the Cas9 protein. The Cas9 complex searches the viral DNA for a sequence complementary to the sgRNA. Cas9 unwinds a section of DNA, and if the sgRNA can base-pair with that section and Cas9 detects a PAM sequence next to it, Cas9 makes a double-stranded cut. The PAM (protospacer adjacent motif) is a 3-bp (NGG) sequence in the viral DNA, adjacent to the 20-bp target sequence. The Cas9 protein uses the PAM sequence to make sure that it is cutting the viral DNA and not the bacterium's own CRISPR array, which doesn't contain the PAM sequence. The structure of the Cas9-dsDNA-sgRNA complex can be found in PDB file 5F9R.
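The targeting rule described above (a 20-bp match immediately followed by an NGG PAM) can be illustrated with a short Python sketch. This is a simplified teaching model, not how Cas9 searches DNA in a cell: it scans only one strand, requires an exact match, and writes the guide as the DNA sequence it targets. The function name `find_cut_sites` and all sequences below are made up for illustration.

```python
def find_cut_sites(dna, spacer):
    """Return start indices where `spacer` is immediately followed by an NGG PAM.

    Simplifications (assumptions of this sketch): one strand only,
    exact matching, and the spacer is given as DNA rather than RNA.
    """
    dna = dna.upper()
    spacer = spacer.upper()
    n = len(spacer)
    sites = []
    # Stop early enough that a 3-bp PAM still fits after the match.
    for i in range(len(dna) - n - 2):
        pam = dna[i + n:i + n + 3]
        # "N" can be any base, so only the last two bases are checked.
        if dna[i:i + n] == spacer and pam[1:] == "GG":
            sites.append(i)
    return sites


# Hypothetical 20-bp spacer stored in the CRISPR array.
spacer = "GATTACAGATTACAGATTAC"

# In the phage genome the target is followed by a PAM (TGG).
phage_dna = "TTT" + spacer + "TGG" + "AAA"

# In the CRISPR array the same sequence is flanked by repeats, with no PAM.
crispr_array = "GTTTTAGAGC" + spacer + "GTTTTAGAGC"

print(find_cut_sites(phage_dna, spacer))
print(find_cut_sites(crispr_array, spacer))
```

The phage sequence yields a cut site because the match is followed by TGG, while the copy in the CRISPR array yields none because the neighboring repeat does not supply a PAM. This is the self versus non-self check described above.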

Anti-CRISPR
Some phages use anti-CRISPR proteins (originally found in Listeria monocytogenes) to deactivate the CRISPR system. These proteins mimic the PAM sequence in DNA to trick the CRISPR-Cas9 system, filling the pocket that the target DNA would otherwise occupy.

Inhibition of Cas9 by AcrIIA4
Phages have, in turn, evolved their own defenses against the CRISPR system. In a study published in Science Advances, anti-CRISPR proteins were found to be highly acidic DNA mimics (DOI: 10.1126/sciadv.1701620). In-depth studies by Yang & Patel and [http://cbm.msoe.edu/images/contentImages/scienceOlympiad/module/2.%20Dong.2017.nature22377.pdf Dong, et al.] provided insight into the structure and function of the AcrIIA4 protein. Cas9 must be in complex with an sgRNA in order for AcrIIA4 to bind to it. When AcrIIA4 binds to Cas9, the viral DNA cannot bind to the complex, which allows the virus to survive. The main ways that AcrIIA4 inhibits the function of Cas9 are:
 * Blocking the CTD and Topo domains to prevent PAM recognition
 * Blocking the RuvC domain to prevent cleavage of the non-complementary strand

Cas9
The Cas9 protein has a bi-lobed structure, consisting of a REC (recognition) lobe and a NUC (nuclease) lobe. The lobes are further divided into domains. The REC lobe consists of 3 Helical domains and a Bridge Helix. The NUC lobe consists of a RuvC domain split into 3 parts, an HNH domain, a Topo domain, and a CTD (C-terminal domain). The RuvC domain includes an active site that cleaves the non-complementary strand, and the HNH domain cleaves the complementary strand. The Topo and CTD domains serve to identify the PAM sequence, as well as to bind to the non-complementary strand. The Helical domains and Bridge Helix bind to the complementary DNA strand.

AcrIIA4
AcrIIA4 is a relatively small protein of 87 amino acids. From the N-terminus to the C-terminus, it consists of an alpha helix, a 3-stranded beta sheet, and 2 more alpha helices. The beta hairpins play a crucial role in the inhibition of Cas9 by occupying various active sites of the Cas9 protein.

Resources

 * Cas9 on Wikipedia
 * The official Center for BioMolecular Modeling page on the Protein Modeling event
 * Protein Modeling SciOly.org Wiki