This page is incomplete.
- This article is about a replacement event for that might not be run at every tournament. Please refer to instructions from your particular tournament before preparing for this event. For the event it replaces, see Ping Pong Parachute.
|Division B Website||www|
|Division C Website||www|
Cybersecurity is a Division B and Division C event that was first run as a trial event at the 2021 BEARSO Invitational to replace Ping Pong Parachute. The event consists of two parts: a written test on Cryptography and Web Architecture, and a hands on task on Cryptography and Programming. The event was also run at the November Scilympiad Practice, Yosemite Invitational, and Science Olympiad at Penn State Invitational.
A hash algorithm is a one-way function that maps data, such as a string or a file, to a hash, or a "digest" - a string of data that is much shorter in length. Hash functions are always deterministic. If two equal inputs are hashed two separate times, the digest will always be the same. A hash can be used as a checksum to validate that a file has not been altered, since if a single bit of information was changed, the checksum would change. Hash functions are also designed to decrease the risk of hash collisions. Since the hashed digest of an input reduces its size significantly, hash collisions can occur when two inputs map to the same output. Hash functions are used in digital signatures, signing and authentication algorithms, and passwords.
A good hash algorithm has the following characteristics:
- It is hard to find collisions.
- It is irreversible.
- It has to be deterministic.
Passwords are one of the most important applications of hashing algorithms. When a password is inputted, a hash of the password is calculated, and compared to the hashed value of your original password. Thus, no plaintext passwords should be saved server-side, which would reduce the damage in the event of a data breach.
The MD5 hashing algorithm produces 16 byte digests (128 bits) with block sizes of 64 bytes (512 bits). It was first created in 1991 by Ronald Rivest. Multiple vulnerabilities have been exposed with the MD5 hashing algorithm, and collisions can be calculated in less than a second on a typical computer. Thus, MD5 is considered extremely cryptographically insecure; however, many programs and applications continue to use it.
SHA1, which stands for Secure Hash Algorithm 1, was created in 1995. The algorithm produces 20 byte digests (160 bits) of data using block sizes of 512 bits. 80 rounds of hashing is done under this hashing algorithm. Hash collisions can be calculated with 2^60.3 to 2^65.3 operations, and collisions have been calculated. In addition, SHA1 is built using the Merkle-Damgård construction, so it is also prone to length extension attacks.
SHA2 is a set of six hash functions, namely SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, and SHA-512/256. Depending on the name of the function, each hash algorithm produces a digest size with a different amount of bits. Either 64 or 80 rounds of hashing are performed on the plaintext. Similarly to SHA1, SHA2 hashes are vulnerable to hash length extension attacks. Currently, there are no known full collisions of SHA2 hashes.
Hash Length Extension Attack
Algorithms that are based on the Merkle-Damgård construction are vulnerable to hash length extension attacks, including MD5, SHA1, and all SHA2 hashes. Because the hash digest is essentially a snapshot of the internal state of the hash, the internal state can be recreated with the hash. In addition, because the hashing algorithm is performed on blocks of data, one can compute HASH(UNKNOWN + CUSTOM MESSAGE) only knowing the length and hash digest of HASH(UNKNOWN) by continuing the hashing algorithm from the known internal state. This attack is significant because it allows attackers to forge requests. One patch of this attack is to hash the unknown secret twice: for instance, HASH(HASH(UNKNOWN)) could be used.
A hash collision has occurred when two plaintexts, hashed with the same algorithm, has produced the same digest. All hashing algorithms have an infinite amount of collisions because the plaintext can be infinitely long, yet the length of the digest is finite. However, because of the length of the hash, it can be extremely hard to produce a collision.
The XOR Operation
The XOR operator is a binary operator that take two bits of data and outputs one bit of data. XOR sounds for exclusive or; it returns True if the two inputs are different, and False if the inputs are the same. In Java and Python, the XOR operator is denoted with the "^" symbol (exponentiation is represented with "**"). In contrast, SageMath uses the "^" symbol to represent exponentiation, while "^^" is the XOR operator.
This is an XOR table which represents the inputs as well as the outputs:
|Input 1||Input 2||Output|
In addition, the XOR operator is commutative (A^B==B^A), associative ((A^B)^C==A^(B^C)), and is its own inverse (A^A==0, B^A^A=B), meaning that it is symmetric and reversible. The latter is especially important, as this is a very simple way to reveal a plaintext or calculate a ciphertext using the same encryption/decryption function.
In cryptography, the XOR cipher is used extremely frequently in stream ciphers and block ciphers, as well as many others. Not only is the XOR operation computationally inexpensive, but it is also theoretically impossible to break as long as the key is secure.
Ciphers that rely purely on the XOR operation, such as stream ciphers such as OTP (one time pad), are susceptible to a known plaintext attack. If a plaintext as well as its corresponding ciphertext is known, then the keystream can be trivially calculated using the operation C^P=K. Thus, it is important for a keystream to be unique and only be used once.
Crib dragging is a method of attacking XOR ciphers with a repeating key. One can use frequency analysis, as well as a brute-force, to calculate the plaintext and "drag" the key along the ciphertext in order to reveal the plaintext message.
Different number bases, most often powers of 2, are used extensively in computer science because of its compatibility with computers.
To express a string or a file as a long integer, each character in the string is looked up in an ASCII table and strung together as a string of bytes. For instance, the word "SciOly.org" would be expressed as "53 63 69 4f 6c 79 2e 6f 72 67" in hex, "01010011 01100011 01101001 01001111 01101100 01111001 00101110 01101111 01110010 01100111" in binary, and "U2NpT2x5Lm9yZw==" in base 64.
A binary digit is simply a long list of 0s or 1s. Representing an integer, string, or file as a string of binary digits is useful because it is one of the only methods through which a computer can process data.
The hexadecimal number system is often used in place of binary because it is not only easier for humans to read, but computers can also trivially compute the binary representation of hexadecimal. In this number system, the numbers 0-15 would be expressed as 0123456789ABCDEF.
Base 64 is one of the most compact ways of expressing data as a string, since it stores more data per character. The 64 characters used are as follows:
- 0-25: A-Z
- 26-51: a-z
- 52-61: 0-9
- 62: +
- 63: /
In base 64, every 4 bytes of encoded data can be decrypted to 3 bytes of unencoded data. Thus, when the unencoded data is not a multiple of 3, padding must be added in order to make the encoded data a multiple of 4. The "=" is used as a padding.
Classical cryptography includes cryptosystems that are no longer in use, as extensive attacks have been developed and can be trivially implemented to reveal the message. Most classical cryptosystems encrypt plaintext, such as messages in English, rather than files. In Codebusters, all of the
Attacks on Classical Cryptosystems
- Chosen Plaintext Attacks
- Chosen Ciphertext Attacks
- Known Plaintext Attacks
Diffie-Hellman Key Exchange
Elliptic Curve Cryptography
Post Quantum Cryptography
Quantum computers can break some cryptographic functions (e.g. prime factorization through Shor's algorithm). Post-quantum cryptography uses properties of quantum mechanics, such as entanglement, to protect against quantum computer attacks.
Mozilla's web docs tend to be the most up-to-date repository for web technologies information. W3C Schools and Stack Overflow often don't follow industry best practices.
Modern websites use HTTPS, a descendant of the Hypertext Transport Protocol (HTTP) that incorporates encryption through Transport Layer Security (TLS).