Length Extension Attack
A Length Extension Attack is a vulnerability in hash functions such as MD5, SHA-1, and the untruncated SHA-2 variants (SHA-256, SHA-512). It allows an attacker who knows Hash(Key + Message) and the length of the key to append new information to the message and generate a valid hash for the new, longer message without knowing the secret key originally used to sign it.
Vulnerability
The attack exploits the Merkle-Damgård Construction used by MD5, SHA-1 & SHA-2. In this design, the hash function processes input in blocks, updating an internal “state” after each block. Crucially, the final output of the hash is simply the final internal state of the function.
Because the output reveals the internal state, an attacker who sees a valid hash (signature) can load this final state back into the function and resume processing more data.
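As a rough illustration of this property, the toy Merkle-Damgård hash below returns its final state as the digest, so anyone holding the digest can keep absorbing data. It is not a real algorithm: the 8-byte block, 4-byte state, one-byte length field, and the names compress, pad, absorb and toy_hash are all assumptions made for this sketch.

```python
import hashlib

BLOCK = 8            # toy block size in bytes (SHA-256 uses 64)
IV = b"\x00" * 4     # toy 4-byte starting state


def compress(state: bytes, block: bytes) -> bytes:
    # Stand-in compression function: mixes the current state with one block.
    return hashlib.sha256(state + block).digest()[:4]


def pad(length: int) -> bytes:
    # Toy padding: 0x80, zero bytes, then the message length in one byte.
    zeros = (-(length + 2)) % BLOCK
    return b"\x80" + b"\x00" * zeros + bytes([length % 256])


def absorb(state: bytes, data: bytes) -> bytes:
    # Process whole blocks, updating the state after each one.
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state


def toy_hash(message: bytes) -> bytes:
    # The digest is exactly the final internal state.
    return absorb(IV, message + pad(len(message)))


original = b"key" + b"hello"
digest = toy_hash(original)

# What the function actually absorbed: the message plus its padding.
glued = original + pad(len(original))
extension = b"!!"

# Resume from the digest instead of IV and absorb only the extension.
resumed = absorb(digest, extension + pad(len(glued) + len(extension)))
assert resumed == toy_hash(glued + extension)
```

The final assertion holds without the resuming code ever touching the bytes of original; it only needs the digest and the original length.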
How the Attack Works
- Intercept the Hash: The attacker intercepts a valid message and its signature (the hash of Key + Message).
- Reconstruct Padding: To resume hashing, the attacker must recreate the exact padding that was added to the original message by the hash algorithm (usually a ‘1’ bit, followed by zeros, and the message length). Because the padding encodes the total length of Key + Message, the attacker must know or guess the length of the key.
- Resume & Extend: The attacker initializes their hash function with the intercepted hash value (instead of the standard starting values). They then process their malicious “extension” data.
- Result: The attacker generates a valid hash for the new message, Key + Message + Padding + Extension. The system receiving this generally accepts it as valid because the hash matches, effectively allowing the attacker to forge a signature.
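The steps above can be sketched against SHA-256. Python's hashlib does not allow setting the starting state, so this sketch carries its own small SHA-256 whose state can be supplied by the caller. The names sha256_from, md_padding and extend, the key, and the messages are all hypothetical, and the key length is assumed to be known.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF


def _primes(n):
    found, c = [], 2
    while len(found) < n:
        if all(c % p for p in found):
            found.append(c)
        c += 1
    return found


def _icbrt(n):
    # Integer cube root by Newton's method, starting from an overestimate.
    x = 1 << ((n.bit_length() + 2) // 3)
    while True:
        y = (2 * x + n // (x * x)) // 3
        if y >= x:
            return x
        x = y


# FIPS 180-4 constants: fractional bits of prime square and cube roots.
IV = [math.isqrt(p << 64) & MASK for p in _primes(8)]
K = [_icbrt(p << 96) & MASK for p in _primes(64)]


def _rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def compress(state, block):
    # One SHA-256 compression step over a 64-byte block.
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = _rotr(w[i - 15], 7) ^ _rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = _rotr(w[i - 2], 17) ^ _rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        big_s1 = _rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[i] + w[i]) & MASK
        big_s0 = _rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(s + v) & MASK for s, v in zip(state, (a, b, c, d, e, f, g, h))]


def md_padding(total_len):
    # 0x80, zero bytes, then the bit length as a 64-bit big-endian integer.
    zeros = (55 - total_len) % 64
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", total_len * 8)


def sha256_from(state, data, prior_len=0):
    # Hash data starting from state, as if prior_len bytes
    # (a multiple of 64) had already been absorbed.
    data += md_padding(prior_len + len(data))
    for i in range(0, len(data), 64):
        state = compress(state, data[i:i + 64])
    return struct.pack(">8I", *state)


def extend(digest, message, key_len, extension):
    # Rebuild the padding, resume from the digest, absorb the extension.
    glue = md_padding(key_len + len(message))
    state = list(struct.unpack(">8I", digest))
    prior_len = key_len + len(message) + len(glue)
    return message + glue + extension, sha256_from(state, extension, prior_len)


key = b"hypothetical-secret"            # known only to the signer
msg = b"to: alice, amount: 10"
sig = hashlib.sha256(key + msg).digest()

forged_msg, forged_sig = extend(sig, msg, len(key), b", amount: 100000")

# Signer-side check: the forged pair verifies under the real key.
assert hashlib.sha256(key + forged_msg).digest() == forged_sig
```

Note that extend never receives key, only its length. An attacker who does not know the length would typically try a range of candidate lengths until the receiving system accepts one.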
Example Scenario
- In the coding demonstration, a “bank transaction” signed with Hash(Key + Message) was intercepted.
- The attacker appended a new command (e.g., “, amount: 100000”) to the original message.
- By initializing the hash state with the original signature and processing the extension, they generated a valid signature for the modified transaction without ever knowing the bank’s secret key.
Mitigation
This specific vulnerability is why simple Hash(Key + Message) is insecure for authentication.
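- HMAC: Using an HMAC (Hash-Based Message Authentication Code) prevents this because it applies the hash in two nested, keyed passes: Hash((Key ⊕ opad) + Hash((Key ⊕ ipad) + Message)). The published tag is the output of the outer pass, whose input is a fixed-length inner digest, so resuming from the tag does not yield a valid tag for any longer message.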
- SHA-3: Newer algorithms like SHA-3 use a “sponge” construction with a hidden “capacity” part of the state that is not output, making it impossible to reconstruct the full state needed to extend the hash.
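A minimal sketch of the HMAC option using Python's standard hmac module follows. The key, message, and the helper names sign, verify and hmac_sha256_manual are hypothetical. The manual version is included only to show the nested structure and should not replace the library call.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    # Library HMAC-SHA256.
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(sign(key, message), tag)


def hmac_sha256_manual(key: bytes, message: bytes) -> bytes:
    # Same construction spelled out: two nested, keyed hash passes.
    if len(key) > 64:
        key = hashlib.sha256(key).digest()
    key = key.ljust(64, b"\x00")
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + message).digest()
    return hashlib.sha256(opad + inner).digest()


key = b"hypothetical-secret"
msg = b"to: alice, amount: 10"
tag = sign(key, msg)

assert tag == hmac_sha256_manual(key, msg)
assert verify(key, msg, tag)
assert not verify(key, msg + b", amount: 100000", tag)
```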
Relevant Note(s): Cryptographic Hash Function