The Essential Guide to Cryptography: Hashing, Encryption, and Signing
Cryptography primarily studies how to protect information security through mathematical and computer science methods. It is one of the cornerstones of internet security.
The term “Cryptography” originates from the Greek words “kryptós” (hidden) and “gráphein” (writing).
Information Security comprises the following five fundamental attributes:
- Integrity
- Authenticity
- Non-Repudiation
- Confidentiality
- Availability
This article will explain, through specific applications, how cryptographic hashing, encryption, and signing effectively safeguard these fundamental attributes.
#️⃣ Hashing
Cryptographic Hash Function
A Cryptographic Hash Function (CHF) is a one-way mathematical algorithm used to map data of any size to a fixed-size bit string. Common cryptographic hash functions include MD5, SHA256, and others. Cryptographic hash functions typically have the following characteristics:
- Deterministic: The same input always produces the same output.
- Efficiency: The hash value can be calculated quickly.
- Avalanche effect: Minor changes in the input lead to significant changes in the output.
- Collision Resistance: It is computationally infeasible to find two different inputs that produce the same output.
- Preimage Resistance: Knowing the output, it is computationally infeasible to deduce the input.
- Second Preimage Resistance: Knowing one input and its corresponding output, it is computationally infeasible to find another input that produces the same output.
SHA256('hi') = '8f434346648f6b96df89dda901c5176b10a6d83961dd3c1ac88b59b2dc327aa4'
SHA256('ho') = 'a821c62e8104f8519d639b4c0948aece641b143f6601fa145993bb2e2c7299d4'
| | |
CHF Plain Text Hash or Digest
Cryptographic Hash Functions (CHF) differ from Non-Cryptographic Hash Functions (NCHF), which don’t require security features like resistance to preimage attacks, making them faster and more resource-efficient. Common NCHF example: FNV-1.
Application: Password Storage
From the perspectives of both attack and defense, let’s explore how cryptographic hashing, key stretching, and salting can be used to counteract brute force attacks, dictionary attacks, and rainbow table attacks, thereby securely storing passwords.
- 🛡️ Defense
- Cryptographic hashing
- Key stretching
- Salt & Pepper
- ⚔️ Offense
- Brute-force attack
- Dictionary attack
- Rainbow table
🛡️ Suppose a user chooses a simple password, 123456
. Service backend hashes it using SHA256 and stores the hash in the database. On login, input is rehashed and checked against the database hash. If they match, login succeeds.
SHA256('123456') = '8d969e...'
⚔️ Attackers use brute-force attacks to test passwords one by one, seeking the string whose hash value is 8d969e...
. After some time, they successfully discover that 123456
is the password.
SHA256('1') != '8d969e...' # failed attempt
SHA256('12') != '8d969e...' # failed attempt
...
SHA256('123456') == '8d969e...' # success
🛡️ To increase the cost of brute-force attacks for attackers, the backend can implement key stretching. This method boosts the number of hash iterations for each password, requiring attackers to expend more resources for each attempt.
SHA256(SHA256(SHA256('123456'))) = '9ebf7e...'
⚔️ Attackers can use a list of common passwords, and start cracking with the most likely ones. Since users often choose these common passwords, this significantly increases the success rate. This method is known as a dictionary attack. Dictionary files are typically large but save time compared to brute-force attacks, effectively trading space for time.
SHA256('password') != '5e8848...' # failed attempt
SHA256('123456') == '8d969e...' # success
⚔️ Attackers can further optimize time complexity using lookup tables, which store the hash values of common passwords from the dictionary as keys, significantly speeding up the search process. This strategy further trades space for time.
'5e8848...': 'password'
'8d969e...': '123456' # success
⚔️ However, lookup tables require vast storage to cover a large password space. To address this, rainbow tables were developed. Rainbow tables are precomputed using cryptographic hash functions (CHF) and reduction functions. The construction process involves selecting a set of initial data as head nodes, alternately applying the hash function (H) and the reduction function (R) to reach the tail nodes. Only the heads and tails of each chain are stored, significantly reducing storage size compared to lookup tables.
'password' -> H -> R -> H -> R -> H -> R -> 'goodday'
| | | | |
'5e8848...' '123456' '8d969e...' 'whatsup' '6a8k12...'
'newton' -> H -> R -> H -> R -> H -> R -> 'supernova'
'gaming' -> H -> R -> H -> R -> H -> R -> 'blackwell'
only stores('password', 'goodday'),('newton', 'supernova'),('gaming', 'blackwell')
The reduction function maps hash values back to a value within the password space. Its goal is to evenly cover the password space to minimize collisions. The password space refers to the set of all possible password combinations.
To crack a password, alternate between R and H on the target hash to locate the chain containing it. Then, starting from the head node, alternate H and R to find the original value of the target hash. While rainbow tables have slightly worse query performance compared to lookup tables, they save significant space, allowing the same storage to cover a larger password space.
- Alternate R and H operations on the target hash value '8d969e...'
- After each R operation, check if the result matches a tail node
- R('8d969e...') = 'whatsup' != any tail node
- H('whatsup') = '6a8k12...'
- R('6a8k12...') = 'goodday' IN ('password', 'goodday')
- Calculate further and find R(H(R('8d969e...'))) = 'goodday'
- This indicates the target hash's preimage is in the chain ('password', 'goodday')
- Starting from 'password', alternate R and H until you find the hash '8d969e...' which is '123456'
🛡️ To counter rainbow table attacks, the backend can use salt. A salt is a random string generated by a cryptographically secure pseudorandom number generator (CSPRNG). When hashing a user’s password, the password and salt are combined to create the final hash. The salt is stored in plaintext alongside the hash result. Each user has a unique salt.
SHA256('123456'+'sda8%D') = '264e1e...'
| | |
Secret Salt Digest
|-----------|
|
Store in DB
The purpose of the salt is to introduce randomness. With salt protection, attackers must calculate a dedicated rainbow table for each salt value, significantly increasing the cost of cracking.
Pepper, like salt, is a string. The difference is that pepper is not stored in plaintext in the database; it is usually stored in the configuration and is globally unique.
SHA256('123456' + 'd9^d7x' + 'sda8%D') = '36df42...'
| | | |
Secret Pepper Salt Digest
| |-----------|
| |
Store in Config Store in DB
Pepper’s role is to make it difficult for attackers to crack user passwords even if they obtain the plaintext salt values and hashes from a database breach.
Despite these security measures, attackers can still crack passwords. Users should also enhance security awareness by using strong passwords, changing them regularly, enabling 2FA, and recognizing phishing links.
🔒 Encryption
Encryption refers to using an algorithm and a key to convert plaintext data into ciphertext. Common encryption algorithms fall into two categories: symmetric-key encryption and public-key encryption.
Encryption vs. Encoding: These are two different concepts. Encryption aims to protect data privacy, while encoding converts data into a specific format for easier storage, transmission, or processing, such as Base64.
Symmetric-key Encryption
Symmetric-key encryption uses the same key for both encryption and decryption.
K('hi') = 'ab0af2...' -> K('ab0af2...') = 'hi'
| | | |
Key Cipher Text Key Plain Text
Common symmetric encryption algorithms include DES, 3DES, and AES. Among them, AES (Advanced Encryption Standard) is the most widely used.
AES supports three different key lengths:
- AES-128: 128-bit key, 10 encryption rounds (better performance)
- AES-192: 192-bit key, 12 encryption rounds
- AES-256: 256-bit key, 14 encryption rounds (more secure)
AES supports various encryption modes:
- ECB (Electronic Codebook): The earliest and simplest encryption mode.
- CBC (Cipher Block Chaining): Requires an IV, the most commonly used encryption mode.
- GCM (Galois/Counter Mode): Requires an IV, the most common mode for encryption and authentication.
ECB mode is not recommended because it lacks randomness; identical input blocks always produce the same output, making it insecure.
The AES-256 CBC mode encryption process is as follows:
- Specify a Password and use CSPRNG to generate a Salt.
- Use PBKDF2 to derive a 128-bit Initialization Vector (IV). The IV ensures the ciphertext is random each time.
- Use PBKDF2 to derive a 256-bit Key.
- Encrypt the plaintext using the IV and Key to obtain the ciphertext.
- Transmit the ciphertext along with the Salt.
- The receiver uses the Password and Salt with PBKDF2 to regenerate the IV and Key for decryption.
Password -| |-> IV(128 bits) -|
|-PBKDF2->| |-Encrypt-> Cipher
Salt -----| |-> Key(256 bits) -|
Below is a demonstration of using openssl
for AES-256 CBC mode encryption and decryption:
> cat plaintext
Let's eat ShakeShack!
# Encrypt
> openssl enc -aes-256-cbc -salt -pbkdf2 -in plaintext -out encrypted -pass pass:123456
# Decrypt
> openssl enc -aes-256-cbc -d -salt -pbkdf2 -in encrypted -out decrypted -pass pass:123456
> cat decrypted
Let's eat ShakeShack!
One of the biggest challenges of symmetric encryption is the key distribution problem: how to securely transmit the key from the sender to the receiver.
Public-key Encryption
Public-key encryption uses different keys for encryption and decryption.
K-('hi') = 'v9we6k...' -> K+('v9we6k...') = 'hi'
| | |
Private Key Public Key Plain Text
Common public-key encryption algorithms include RSA and ECC. Below is a simple example of RSA key generation, encryption, and decryption.
RSA Key Generation:
Select two prime numbers: p=5, q=11
Calculate n: n = p * q = 5 * 11 = 55
Calculate Euler's totient function: ϕ(n) = (5-1)*(11-1) = 40
Choose public exponent e, a positive integer less than ϕ(n): e = 3
Calculate private exponent d: d * e mod ϕ(n) = 1
Public Key: (n, e) = (55, 3)
Private Key: (n, d) = (55, 27)
RSA Encryption:
Encrypt plaintext 12: 12 ^ 3 mod 55 = 23
RSA Decryption:
Decrypt ciphertext 23: 23 ^ 27 mod 55 = 12
Public-key encryption is more resource-intensive compared to symmetric encryption, making it unsuitable for encrypting large amounts of data. It is commonly used for key distribution and digital signatures.
Application: SSL/TLS
SSL/TLS is a classic application of both symmetric and public-key encryption. SSL/TLS uses public-key encryption for key exchange and symmetric encryption for subsequent communication. This approach avoids the performance issues of public-key encryption and the key distribution problems of symmetric encryption.
Client <-Cert(Ks+)- Server
Client --Ks+(Key)-> Server
Client <-Enc(Msg)-> Server
✍🏼 Signing
Digital Signature
Message Digest guarantees integrity.
Alice Bob
Msg: 'I\'m buying everyone ShakeShack' -| |- Msg: 'I\'m buying everyone ShakeShack'
| | | |
| SHA256(Msg) | -> | | SHA256(Msg)
V | | V
HashA = 'ac6...f67' --------------------| |- HashB = 'ac6...f67' == HashA
Message Authentication Code, MAC guarantees integrity and authenticity.
Alice Bob
Msg: 'I\'m buying everyone ShakeShack' -| |- Msg: 'I\'m buying everyone ShakeShack'
| | | |
| CalcMAC(Key+Msg) | -> | | CalcMAC(Key+Msg)
V | | V
MAC A = '182...807' --------------------| |- MAC B = '182...807' == MAC A
Digital Signature guarantees integrity and authenticity, and non-repudiation.
Alice Bob
Msg: 'I\'m buying everyone ShakeShack' -| |- Msg: 'I\'m buying everyone ShakeShack'
| | | |
| K-(SHA256(Msg)) | -> | | SHA256(Msg)
V | | V
Sig A = '8gk...f06' --------------------| |- K+(Sig A) = SHA256(Msg)
Application: Digital Certificate
A digital certificate is an electronic document used to prove the ownership of a public key.
# Application
User CA
CSR = (User, K+) --> SigCA = Kca-(Hash(User, K+)) # Signing
Cert = (User, K+, SigCA) <-- (User, K+, SigCA) # Digital Certificate
# Verification
User Verifier
(K+, Cert) --> Kca+(SigCA) == Hash(User, K+) # K+ Verified
The CA’s public key (Kca+) is usually pre-installed on the user’s device. These public keys are part of the root certificates, forming the foundation of the trust chain.