So Random! How Coin Flips Conquered Noise

by Alex Bellos (Spring 2024 science communicator in residence)

Flip a coin. You get a 0 or 1.

Flip 50 coins. You get 50 random bits.

Flip 50 coins 50 times. You get … an error-correcting code.

Or so said Claude Shannon, who came up with the concept in his seminal 1948 paper, “A Mathematical Theory of Communication.”

An error-correcting code (ECC) is an algorithm that enables the transmission of data in such a way that errors can be detected and corrected. Before Shannon, scientists assumed that the problem of noise could never be overcome in an unreliable communication channel. Shannon showed that this assumption was wrong: as long as the transmission rate stays below a limit set by the channel, known as its capacity, messages can be replicated essentially perfectly at the point of destination.

The study of ECCs is now a rich and varied field that spans math, computer science, engineering, and information theory. In Spring 2024, a Simons Institute research program brought together specialists in the ECCs currently used in mobile communications, the latest generation of a technology invented almost 80 years ago.

Shannon’s paper marked the birth of the Information Age. In it, he introduced the word “bit,” for binary digit, a 0 or a 1, and he explained how messages written in terms of bits can successfully pass through communication channels even if they are distorted or altered along the way. An ECC is a list of “codewords” made up of bits. The general idea is that the codewords contain enough redundancy so that even if some bits are lost or corrupted during transmission, the correct codeword is still recognizable at the other end. The codeword may be bloodied and battered, but it still contains sufficient information for the receiver to identify it correctly.

For example, say an ECC has only two codewords, 0000 and 1111. If only one bit is corrupted en route, it is clear that a 0010 is the first codeword and not the second. The mantra of error correction is “what is similar is the same.” If A ≈ B, then A = B.
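To make the mantra concrete, here is a minimal sketch in Python of that nearest-codeword rule for the toy two-word code. The function names are illustrative only, not part of any standard library or real system.

```python
# Nearest-codeword decoding for the toy code {0000, 1111}.
CODEWORDS = ["0000", "1111"]

def hamming_distance(a: str, b: str) -> int:
    """Count the positions where two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def decode(received: str) -> str:
    """Pick the codeword closest to what arrived: what is similar is the same."""
    return min(CODEWORDS, key=lambda c: hamming_distance(received, c))

print(decode("0010"))  # -> 0000: one flipped bit still lands nearer the first codeword
```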

For an ECC to tolerate as many transmission errors as possible, its codewords must be as different from each other as possible. Yet how does one go about creating codewords that are more or less uniformly spread across the space of all possible codewords? Shannon suggested the following recipe.

Let codeword 1 be a sequence of 50 coin flips.
Let codeword 2 be another 50 coin flips.
Continue until you have 50 codewords.
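In Python, Shannon's recipe takes only a few lines. Here is a sketch, with a pseudorandom bit generator standing in for the coins:

```python
import random

# Shannon's recipe: each codeword is 50 independent "coin flips" (random bits).
NUM_CODEWORDS, LENGTH = 50, 50

random_code = [
    [random.getrandbits(1) for _ in range(LENGTH)]  # one coin flip per bit
    for _ in range(NUM_CODEWORDS)
]
```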

Codewords generated randomly — a “random code” — achieve the desired property of mutual difference. The chance that any two codewords are identical is astronomically tiny, since we are hardly likely to flip the same 50-bit sequence twice. Likewise, it is extremely improbable that any two sequences will differ in just a single position, or in two, three, or even four positions. What is overwhelmingly likely is that any two codewords will differ in roughly half their positions. Not only will no two codewords be similar to each other, but on the whole, they will be about as far apart from one another as it is possible to be.
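That concentration around half the positions is easy to check empirically. The sketch below regenerates a random code (so the snippet stands alone) and tallies the Hamming distance, the number of differing positions, between every pair of codewords:

```python
import random
from itertools import combinations

LENGTH, NUM_CODEWORDS = 50, 50
code = [[random.getrandbits(1) for _ in range(LENGTH)]
        for _ in range(NUM_CODEWORDS)]

# Hamming distance between every pair of codewords.
distances = [sum(a != b for a, b in zip(u, v))
             for u, v in combinations(code, 2)]

print(min(distances), sum(distances) / len(distances), max(distances))
# Typical output: a minimum comfortably above zero, and an average near 25 —
# half of the 50 positions, just as the argument predicts.
```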

“It is surprising that there are lots of methods to create a code but none give the quality of a random code,” said Noga Ron-Zewi, a professor of computer science at the University of Haifa who gave a Richard M. Karp Distinguished Lecture about ECCs at the Simons Institute in April 2024.

In his 1948 paper, however, Shannon noted that random codes are impractical. When built into a communication system, the process of error correction has three stages: encoding, transmission, and decoding. A random code has a transmission rate close to the theoretical maximum, but encoding and decoding with random codewords are intolerably slow, since the receiver must compare whatever arrives against an enormous list. The challenge Shannon set for his peers was to find an efficient way to implement his revolutionary idea.

In the following decades, scientists developed many practical ECCs for use in different digital communication systems. Currently, mobile and wireless systems mostly use low-density parity-check (LDPC) codes, an idea first suggested by Shannon’s MIT colleague Robert Gallager in 1960. An LDPC code can be represented visually by a graph like the one below.

Each circle contains a bit, and the circles together make up the codeword. The squares are “parity” checks; the word “parity” refers to the property of being odd or even. Each parity check holds a 0, indicating that the sum of the codeword bits linked to it must be even. In an LDPC code, every parity check is linked to only a small number of codeword bits — hence “low density.” Graph-based codes speed up decoding because, rather than the receiver taking a corrupted codeword and finding the codeword most similar to it in a list, as in a random code, the receiver can correct the errors using the parity checks themselves.
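A toy example shows the idea, if not the scale. The sketch below uses the classic (7,4) Hamming code — far too small and dense to be a genuine low-density code, but readable in full — to show how parity checks can locate and fix a single flipped bit without consulting any list of codewords:

```python
# Each row of H is one parity check: the bits in the positions marked 1
# must sum to an even number.
H = [
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
]

def checks(word):
    """Evaluate every parity check; a 0 means the linked bits sum to even."""
    return [sum(h * b for h, b in zip(row, word)) % 2 for row in H]

codeword = [1, 0, 1, 1, 0, 0, 1]        # passes all three checks
assert checks(codeword) == [0, 0, 0]

received = codeword[:]
received[2] ^= 1                        # one bit corrupted in transit

failed = checks(received)               # the pattern of failed checks matches
for j in range(len(received)):          # exactly one column of H: the column
    if [row[j] for row in H] == failed: # belonging to the corrupted bit
        received[j] ^= 1
        break

assert received == codeword             # corrected, no codeword list needed
```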

LDPC codes look very different from Shannon’s random codes, but they share one important feature. LDPC graphs are also constructed using coin flips. Henry Pfister, professor of engineering at Duke University and an organizer of the Simons Institute research program, explained that to generate a good code, engineers come up with many graphs where hundreds of codeword bits are connected somewhat randomly to parity checks. Each graph is then tweaked and ranked. “Experts take the codes with the highest scores and then test them rigorously in realistic conditions to finally pick the best one,” he said. The random element in the graph is important because, Pfister added, “structure can introduce modes of failure.” If there is a way to design good LDPC codes without flipping coins, he said, no one has discovered it yet.
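As a rough illustration of that workflow — and only an illustration, since real LDPC design uses far more sophisticated testing — one could generate many random low-density graphs and rank them by a simple structural score. The score used here, the number of short cycles, is a crude stand-in: short cycles are known to degrade graph-based decoding.

```python
import random
from itertools import combinations

# "Generate, tweak, rank" in miniature: every codeword bit is linked to
# exactly 3 parity checks ("low density"), and candidate graphs are scored
# by how many length-4 cycles they contain (fewer is better).
N_BITS, N_CHECKS, CANDIDATES = 30, 15, 100

def random_graph():
    """Connect each codeword bit to 3 distinct parity checks at random."""
    return [random.sample(range(N_CHECKS), 3) for _ in range(N_BITS)]

def four_cycles(graph):
    """Count 4-cycles: each pair of bits sharing a pair of checks forms one."""
    total = 0
    for c1, c2 in combinations(range(N_CHECKS), 2):
        shared = sum(1 for nbrs in graph if c1 in nbrs and c2 in nbrs)
        total += shared * (shared - 1) // 2
    return total

best = min((random_graph() for _ in range(CANDIDATES)), key=four_cycles)
print("best of", CANDIDATES, "random graphs has", four_cycles(best), "short cycles")
```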

LDPC codes are among the most efficient of all known ECCs. For any message, the code chosen will depend on how noisy the transmission channel is, but typically the proportion of redundant bits in an LDPC codeword is only between 20 and 50 percent. A code that sends five bits for every four bits of message, for instance, carries 20 percent redundancy.

Error correction is a wonderful story of scientific ingenuity and success. Trillions of messages are sent every day between our devices. Every single one uses an error-correcting code, the technology Claude Shannon conceived in 1948. When we press “Send” on an email or text, our device converts the message into small packets of bits, which are encoded using an ECC and sent. The device on the other end receives the packets, which may have accumulated errors in transit; detects those errors; and rebuilds the original message.

“About 1 percent of the time, a packet will fail — it will be too corrupted, and the receiver cannot correct the errors. In that case, the base station will send the packet again,” said Pfister. The systems are so efficient, however, that transmission errors have essentially been eradicated from communication. “You never get an error getting through. The chance of an undetected packet error is one in 10¹⁸. This might happen once, to one person in the world, every 10 years.”

Shannon showed that choosing codewords at random enabled very good transmission rates. Random errors were the disease, and random codes were the antidote. The spirit of Shannon’s original insight persists. Messages travel error free, at almost optimal speeds, guided by the randomness at the heart of LDPC codes.
