Reed-Solomon: How Tesseras Survives Data Loss
2026-02-14
Your hard drive will die. Your cloud provider will pivot. The RAID array in your closet will outlive its controller but not its owner. If a memory is stored in exactly one place, it has exactly one way to be lost forever.
Tesseras is a network that keeps human memories alive through mutual aid. The core survival mechanism is Reed-Solomon erasure coding — a technique borrowed from deep-space communication that lets us reconstruct data even when pieces go missing.
What is Reed-Solomon?
Reed-Solomon is a family of error-correcting codes invented by Irving Reed and Gustave Solomon in 1960. The original use case was correcting errors in data transmitted over noisy channels — think Voyager radioing images back from the outer planets, or a CD playing despite scratches.
The key insight: if you add carefully computed redundancy to your data before something goes wrong, you can recover the original even after losing some pieces.
Here's the intuition. Suppose you have a polynomial of degree 2 — a parabola. You need 3 points to define it uniquely. But if you evaluate it at 5 points, you can lose any 2 of those 5 and still reconstruct the polynomial from the remaining 3. Reed-Solomon generalizes this idea to work over finite fields (Galois fields), where the "polynomial" is your data and the "evaluation points" are your fragments.
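You can watch this intuition work with a tiny sketch (plain Python with exact rational arithmetic, not Tesseras code): sample a parabola at 5 points, throw 2 of them away, and recover the polynomial's value from the surviving 3 by Lagrange interpolation.

```python
from fractions import Fraction

def lagrange_at(points, x):
    """Evaluate at x the unique polynomial passing through `points`."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

# A degree-2 polynomial: p(x) = 2x^2 + 3x + 5, so p(0) = 5.
p = lambda x: 2 * x * x + 3 * x + 5

# Evaluate at 5 points, then "lose" the samples at x = 2 and x = 4.
samples = [(x, p(x)) for x in (1, 2, 3, 4, 5)]
survivors = [s for s in samples if s[0] not in (2, 4)]

# Any 3 of the 5 samples pin down the parabola, so p(0) is recoverable.
print(lagrange_at(survivors, 0))  # → 5
```

Any choice of 2 lost samples gives the same answer, because 3 points determine a degree-2 polynomial uniquely.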
In concrete terms:
- Split your data into k data shards
- Compute m parity shards from the data shards
- Distribute all k + m shards across different locations
- Reconstruct the original data from any k of the k + m shards
You can lose up to m shards — any m, data or parity, in any combination — and still recover everything.
Why not just make copies?
The naive approach to redundancy is replication: make 3 copies, store them in 3 places. This gives you tolerance for 2 failures at the cost of 3x your storage.
Reed-Solomon is dramatically more efficient:
| Strategy | Storage overhead | Failures tolerated |
|---|---|---|
| 3x replication | 200% | 2 out of 3 |
| Reed-Solomon (16,8) | 50% | 8 out of 24 |
| Reed-Solomon (48,24) | 50% | 24 out of 72 |
With 16 data shards and 8 parity shards, you use 50% extra storage but can survive losing a third of all fragments. Matching that with replication — tolerating the loss of any 8 copies — would take 9 full copies, an 800% overhead instead of 50%.
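The table's numbers fall out of simple arithmetic — overhead is parity divided by data, and the number of tolerated failures equals the parity count (for replication, copies minus one). A quick sketch, not project code:

```python
def rs_stats(k, m):
    """Storage overhead and failure tolerance for a (k, m) Reed-Solomon code."""
    return {"overhead": m / k, "tolerated": m, "total": k + m}

print(rs_stats(16, 8))   # 50% overhead, any 8 of 24 fragments can vanish
print(rs_stats(48, 24))  # 50% overhead, any 24 of 72 fragments can vanish

# 3x replication for comparison: 200% overhead, 2 of 3 copies can vanish.
print({"overhead": (3 - 1) / 1, "tolerated": 3 - 1, "total": 3})
```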
For a network that aims to preserve memories across decades and centuries, this efficiency isn't a nice-to-have — it's the difference between a viable system and one that drowns in its own overhead.
How Tesseras uses Reed-Solomon
Not all data deserves the same treatment. A 500-byte text memory and a 100 MB video have very different redundancy needs. Tesseras uses a three-tier fragmentation strategy:
Small (< 4 MB) — Whole-file replication to 7 peers. For small tesseras, the overhead of erasure coding (encoding time, fragment management, reconstruction logic) outweighs its benefits; plain copies are faster and easier to manage.
Medium (4–256 MB) — 16 data shards + 8 parity shards = 24 total fragments. Each fragment is roughly 1/16th of the original size. Any 16 of the 24 fragments reconstruct the original. Distributed across 7 peers.
Large (≥ 256 MB) — 48 data shards + 24 parity shards = 72 total fragments. Higher shard count means smaller individual fragments (easier to transfer and store) and higher absolute fault tolerance. Also distributed across 7 peers.
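The tier rules above can be sketched as a single lookup (function and constant names are invented here; the thresholds are the ones stated in the post):

```python
# Hypothetical sketch of Tesseras' three-tier selection, not actual project code.
SMALL_LIMIT = 4 * 1024 * 1024      # 4 MB
LARGE_LIMIT = 256 * 1024 * 1024    # 256 MB

def choose_tier(size_bytes):
    """Return (data_shards, parity_shards); (1, 0) means whole-file copies."""
    if size_bytes < SMALL_LIMIT:
        return (1, 0)      # Small: replicate the whole file to 7 peers
    if size_bytes < LARGE_LIMIT:
        return (16, 8)     # Medium: 24 fragments
    return (48, 24)        # Large: 72 fragments

print(choose_tier(500))                  # → (1, 0)
print(choose_tier(120 * 1024 * 1024))    # → (16, 8)
print(choose_tier(1024 * 1024 * 1024))   # → (48, 24)
```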
The implementation uses the reed-solomon-erasure crate operating over GF(2⁸) —
the same Galois field used in QR codes and CDs. Each fragment carries a BLAKE3
checksum so corruption is detected immediately, not silently propagated.
```
Tessera (120 MB photo album)
    ↓ encode
16 data shards (7.5 MB each) + 8 parity shards (7.5 MB each)
    ↓ distribute
24 fragments across 7 peers (subnet-diverse)
    ↓ any 16 fragments
Original tessera recovered
```
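To make the encode/reconstruct cycle concrete, here is a from-scratch teaching sketch over the same GF(2⁸) field, using per-byte Lagrange interpolation. It is deliberately tiny and slow — the real implementation uses the reed-solomon-erasure crate, which works with encoding matrices rather than interpolating each byte:

```python
# Reed-Solomon over GF(2^8), primitive polynomial 0x11d (the field named above).
# Purely illustrative; not the reed-solomon-erasure crate.
EXP, LOG = [0] * 512, [0] * 256
v = 1
for i in range(255):
    EXP[i], LOG[v] = v, i
    v <<= 1
    if v & 0x100:
        v ^= 0x11D
for i in range(255, 512):
    EXP[i] = EXP[i - 255]          # doubled table avoids a mod in gmul

def gmul(a, b):
    return 0 if 0 in (a, b) else EXP[LOG[a] + LOG[b]]

def gdiv(a, b):
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 255]

def interp(points, x):
    """Lagrange interpolation in GF(2^8); subtraction is XOR."""
    acc = 0
    for i, (xi, yi) in enumerate(points):
        num = den = 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num, den = gmul(num, x ^ xj), gmul(den, xi ^ xj)
        acc ^= gmul(yi, gdiv(num, den))
    return acc

def encode(shards, parity):
    """k data shards (equal-length byte lists) -> k + parity fragments."""
    k = len(shards)
    frags = [list(s) for s in shards]           # data sits at points x = 1..k
    for x in range(k + 1, k + parity + 1):      # parity at x = k+1..k+parity
        frags.append([interp([(i + 1, s[b]) for i, s in enumerate(shards)], x)
                      for b in range(len(shards[0]))])
    return frags

def reconstruct(available, k):
    """Any k (point, fragment) pairs -> the original k data shards."""
    pts = available[:k]
    return [[interp([(xi, f[b]) for xi, f in pts], x)
             for b in range(len(pts[0][1]))]
            for x in range(1, k + 1)]

data = [[10, 20], [30, 40], [50, 60]]           # k = 3 toy "shards"
frags = encode(data, 2)                         # 5 fragments total
lost_1_and_3 = [(2, frags[1]), (4, frags[3]), (5, frags[4])]
assert reconstruct(lost_1_and_3, 3) == data     # survives losing any 2 of 5
```

Each byte column across the k shards defines one polynomial; parity fragments are extra evaluations of it, and reconstruction re-interpolates the data points from whichever k fragments survive.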
The challenges
Reed-Solomon solves the mathematical problem of redundancy. The engineering challenges are everything around it.
Fragment tracking
Every fragment needs to be findable. Tesseras uses a Kademlia DHT for peer discovery and fragment-to-peer mapping. When a node goes offline, its fragments need to be re-created and distributed to new peers. This means tracking which fragments exist, where they are, and whether they're still intact — across a network with no central authority.
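The bookkeeping a tracker must answer might look like this (a hypothetical sketch with invented names — the actual Tesseras schema is not shown in this post):

```python
from dataclasses import dataclass

@dataclass
class FragmentRecord:
    """Hypothetical shape of one fragment-tracking entry."""
    tessera_id: str       # content address of the original tessera
    index: int            # fragment position, 0..(k + m - 1)
    checksum: str         # BLAKE3 hex digest of this fragment
    peer_id: str          # DHT node currently holding it
    last_verified: float  # unix time of the last successful health check

def needs_repair(records, k, m, healthy_cutoff):
    """Any fragment not verified recently is presumed lost and re-created."""
    alive = sum(1 for r in records if r.last_verified >= healthy_cutoff)
    return alive < k + m
```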
Silent corruption
A fragment that returns wrong data is worse than one that's missing — at least a missing fragment is honestly absent. Tesseras addresses this with attestation-based health checks: the repair loop periodically asks fragment holders to prove possession by returning BLAKE3 checksums. If a checksum doesn't match, the fragment is treated as lost.
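The mismatch-means-lost rule is simple to state in code. The post specifies BLAKE3, which Python's standard library does not ship, so `blake2b` stands in below purely for illustration:

```python
import hashlib

# blake2b is a stdlib stand-in here; Tesseras itself uses BLAKE3.
def fragment_checksum(data: bytes) -> str:
    return hashlib.blake2b(data).hexdigest()

def check_fragment(reported: str, expected: str) -> str:
    """A wrong checksum is treated exactly like an absent fragment."""
    return "healthy" if reported == expected else "lost"

good = fragment_checksum(b"fragment bytes")
assert check_fragment(good, good) == "healthy"
assert check_fragment(fragment_checksum(b"bit-flipped bytes"), good) == "lost"
```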
Correlated failures
If all 24 fragments of a tessera land on machines in the same datacenter, a single power outage kills them all. Reed-Solomon's math assumes independent failures. Tesseras enforces subnet diversity during distribution: no more than 2 fragments per /24 IPv4 subnet (or /48 IPv6 prefix). This spreads fragments across different physical infrastructure.
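The subnet-diversity rule can be checked with a few lines of stdlib Python (a sketch of the stated constraint, not the project's placement logic):

```python
import ipaddress
from collections import Counter

MAX_PER_SUBNET = 2  # the post's rule: at most 2 fragments per /24 (or /48 v6)

def subnet_key(addr: str) -> str:
    """Collapse a peer address to its /24 (IPv4) or /48 (IPv6) prefix."""
    ip = ipaddress.ip_address(addr)
    prefix = 24 if ip.version == 4 else 48
    return str(ipaddress.ip_network(f"{addr}/{prefix}", strict=False))

def placement_ok(peer_addrs) -> bool:
    counts = Counter(subnet_key(a) for a in peer_addrs)
    return all(c <= MAX_PER_SUBNET for c in counts.values())

print(placement_ok(["203.0.113.5", "203.0.113.9", "198.51.100.7"]))   # → True
print(placement_ok(["203.0.113.5", "203.0.113.9", "203.0.113.77"]))   # → False
```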
Repair speed vs. network load
When a peer goes offline, the clock starts ticking. Lost fragments need to be re-created before more failures accumulate. But aggressive repair floods the network. Tesseras balances this with a configurable repair loop (default: every 24 hours with 2-hour jitter) and concurrent transfer limits (default: 4 simultaneous transfers). The jitter prevents repair storms where every node checks its fragments at the same moment.
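A jittered schedule is a one-liner. Whether the 2-hour jitter is applied symmetrically is an assumption of this sketch; the defaults are the ones quoted above:

```python
import random

REPAIR_INTERVAL = 24 * 3600   # default: repair pass every 24 hours
REPAIR_JITTER = 2 * 3600      # 2 hours of jitter (assumed +/- here)

def next_repair_delay(rng=random):
    """Seconds until the next repair pass, desynchronized across nodes."""
    return REPAIR_INTERVAL + rng.uniform(-REPAIR_JITTER, REPAIR_JITTER)

delay = next_repair_delay()
assert 22 * 3600 <= delay <= 26 * 3600
```

Because each node draws its own offset, the network's repair checks smear out over a four-hour window instead of landing on the same tick.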
Long-term key management
Reed-Solomon protects against data loss, not against losing access. If a tessera is encrypted (private or sealed visibility), you need the decryption key to make the recovered data useful. Tesseras separates these concerns: erasure coding handles availability, while Shamir's Secret Sharing (a future phase) will handle key distribution among heirs. The project's design philosophy — encrypt as little as possible — keeps the key management problem small.
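Shamir's Secret Sharing runs the same polynomial idea in reverse: hide a secret as a polynomial's value at zero, hand out points, and require k of them to interpolate it back. A minimal sketch over a prime field — the future Tesseras phase may differ in every detail:

```python
import random

P = 2**127 - 1  # a Mersenne prime, big enough for a toy key-share demo

def split(secret, n, k, rng=random):
    """Hide `secret` as p(0) of a random degree-(k-1) polynomial; emit n points."""
    coeffs = [secret] + [rng.randrange(P) for _ in range(k - 1)]
    def p(x):
        return sum(c * pow(x, e, P) for e, c in enumerate(coeffs)) % P
    return [(x, p(x)) for x in range(1, n + 1)]

def recover(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if j != i:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

shares = split(123456789, n=5, k=3)          # e.g. 5 heirs, any 3 suffice
assert recover(shares[:3]) == 123456789
assert recover([shares[0], shares[2], shares[4]]) == 123456789
```

Fewer than k shares reveal nothing about the secret, which is exactly the property you want when the "shards" are held by heirs rather than servers.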
Galois field limitations
The GF(2⁸) field limits the total number of shards to 255 (data + parity combined). For Tesseras, this is not a practical constraint — even the Large tier uses only 72 shards. But it does mean that extremely large files with thousands of fragments would require either a different field or a layered encoding scheme.
Evolving codec compatibility
A tessera encoded today must be decodable in 50 years. Reed-Solomon over GF(2⁸) is one of the most widely implemented algorithms in computing — it's in every CD player, every QR code scanner, every deep-space probe. This ubiquity is itself a survival strategy. The algorithm won't be forgotten because half the world's infrastructure depends on it.
The bigger picture
Reed-Solomon is a piece of a larger puzzle. It works in concert with:
- Kademlia DHT for finding peers and routing fragments
- BLAKE3 checksums for integrity verification
- Bilateral reciprocity for fair storage exchange (no blockchain needed)
- Subnet diversity for failure independence
- Automatic repair for maintaining redundancy over time
No single technique makes memories survive. Reed-Solomon ensures that data can be recovered. The DHT ensures fragments can be found. Reciprocity ensures peers want to help. Repair ensures none of this degrades over time.
A tessera is a bet that the sum of these mechanisms, running across many independent machines operated by many independent people, is more durable than any single institution. Reed-Solomon is the mathematical foundation of that bet.