Mixing error-correcting codes and cryptography

Secret codes and error-correcting codes have nothing to do with each other. Except when they do!

Error-correcting codes

Error correcting code make digital communication possible. Without some way to detect and correct errors, the corruption of a single bit could wreak havoc. A simple example of an error-detection code is check sums. A more sophisticated example would be erasure codes, a method used by data centers to protect customer data against hard drive failures or even entire data centers going offline.

People who work in coding theory are quick to point out that they do not work in cryptography. “No, not that kind of code. Error-correcting codes, not secret codes.” The goal isn’t secrecy. The goal is maximize the probability of correctly transmitting data while minimizing the amount of extra information added.

Codes and ciphers

You don’t hear the word “code” used in connection with cryptography much anymore. People used to refer to “codes and ciphers” in one breath. Historically, the technical distinction was that a code operated on words, while a cipher operated on characters. Codes in this sense have long been obsolete, but people still speak of codes colloquially.

David Kahn’s classic book on pre-modern cryptography is entitled The Codebreakers, not the Cipherbreakers, because the public at the time was more familiar with the term code than the term cipher. Maybe that’s still the case because, for example, Jason Fagone entitled his biography of Elizabeth Friedman The Woman Who Smashed Codes. Perhaps the author suggested The Woman Who Smashed Ciphers and an editor objected.

Code-based cryptography

If you’re accustomed to the older use of “codes,” the term “code-based cryptography” is redundant. But it means something specific in modern usage: cryptographic systems that incorporate error-correction codes. So error-correcting codes and secret “codes” do have something to do with each other after all!

Robert McEliece had this idea back in 1978. His encryption method starts with a particular error-correcting code, a binary Goppa code, and scrambles it with an invertible linear transformation. At a very high level, McEliece’s method boils down to a secret factorization, sorta like RSA but even more like oil and vinegar. The public key is the product of the Goppa code and the linear transformation, but only the owner knows the factorization of this key.

To encrypt a message with McEliece’s method, the sender adds a specific amount of random noise, noise that the Goppa code can remove. An attacker faces a challenging computational problem to recover the message without knowing how to factor the public key.

Post-quantum cryptography

McEliece’s method did not attract much interest at the time because it requires much larger public keys than other methods, such as RSA. However, there is renewed interest in McEliece’s approach because his scheme is apparently quantum-resistant whereas RSA and other popular public key systems are not.

If and when large quantum computers become practical, they could factor primes efficiently, and thus break RSA. They could also solve the discrete logarithm and elliptic discrete logarithm problems, breaking Diffie-Hellman and elliptic curve cryptosystems. All public key cryptosystems now in common use would be broken.

Why worry about this now while quantum computers don’t exist? (They exist, but only as prototypes. So far the largest number a quantum computer has been able to factor is 21.) The reason is that it takes a long time to develop, analyze, standardize, and deploy encryption methods. There’s also the matter of forward security: someone could store encrypted messages with the hope of decrypting them in the future. This doesn’t matter for cat photos transmitted over TLS, but it could matter for state secrets; governments may be encrypting documents that they wish to keep secret for decades.

NIST is sponsoring a competition to develop and standardize quantum-resistant encryption methods. Two months ago NIST announced the candidates that advanced to the second round. Seven of these methods use code-based cryptography, including the classic McEliece method and six variations: BIKE, HQC, LEDAcrypt, NTS-KEM, ROLLO, and RQC.

Related posts