Part 2: Shorebird | bidiff Algorithm & Zstd Compression

A visual, byte-level explanation of how Shorebird computes the exact difference between two binaries and compresses it for efficient OTA delivery.

1. The Problem — Why Not Just Send the New File?

When you change Dart code and create a patch, two libapp.so files exist:

OLD libapp.so (3,146 KB)
← already on user's phone
NEW libapp.so (3,148 KB)
← contains your code changes

You could send the entire 3,148 KB new file. But here's the thing — when you change a few lines of Dart code, maybe 95-98% of the bytes are identical. The Dart compiler produces AOT machine code where most functions, the standard library, all your unchanged widgets, constants, and data structures remain byte-for-byte the same.

The question becomes: can we send ONLY the bytes that changed? That's what bidiff does.

2. bidiff — Binary Diff Algorithm

2.1 What is bidiff?

bidiff (version 1.0.0, created by Amos Wenger / fasterthanlime) is a binary delta encoding algorithm. It's conceptually similar to the famous bsdiff algorithm (used by Chrome, Firefox, and Android system updates) but written in Rust with a cleaner API.

It takes two byte sequences (old file + new file) and produces a compact set of instructions that describe how to transform the old file into the new file.

2.2 The Core Idea — Suffix Sorting

Step 1: Build a Suffix Array of the Old File

The algorithm sorts ALL possible suffixes (substrings starting at every byte position) of the old file. This creates an index that can locate any byte pattern of the old file via binary search, in O(log n) string comparisons.

Think of it like building a phone book — but instead of names, every "entry" is a position in the old file, and they're sorted by what bytes appear at that position.
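The phone-book idea can be sketched in a few lines of Rust. This is a naive illustration, not bidiff's implementation: real suffix sorters (SA-IS, divsufsort) avoid the cost of sorting whole suffixes, and `suffix_array` / `longest_match` are hypothetical helper names.

```rust
/// Build a suffix array by sorting every suffix start position of `data`.
/// Naive version for illustration: real suffix sorters run in O(n log n)
/// or O(n) rather than comparing whole suffixes.
fn suffix_array(data: &[u8]) -> Vec<usize> {
    let mut sa: Vec<usize> = (0..data.len()).collect();
    sa.sort_by(|&a, &b| data[a..].cmp(&data[b..]));
    sa
}

/// Binary-search the suffix array for the longest prefix of `pattern`
/// occurring anywhere in `data`. Returns (position_in_old, match_len).
fn longest_match(data: &[u8], sa: &[usize], pattern: &[u8]) -> (usize, usize) {
    let mut best = (0, 0);
    // partition_point is the O(log n) binary search; the best match is
    // lexicographically adjacent to `pattern`, so check both neighbors.
    let idx = sa.partition_point(|&s| data[s..].cmp(pattern).is_lt());
    for &cand in &sa[idx.saturating_sub(1)..(idx + 1).min(sa.len())] {
        let len = data[cand..]
            .iter()
            .zip(pattern)
            .take_while(|(a, b)| a == b)
            .count();
        if len > best.1 {
            best = (cand, len);
        }
    }
    best
}

fn main() {
    let old = b"Hello World! This is my Flutter app.";
    let sa = suffix_array(old);
    // The new file's prefix matches 24 bytes starting at old position 0:
    let (pos, len) = longest_match(old, &sa, b"Hello World! This is my updated");
    println!("match at {pos}, length {len}"); // match at 0, length 24
}
```

Building the index is done once per old file; after that, every lookup during the scan of the new file is a cheap binary search.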

Let's trace through a tiny example to see how it works:

🔬 Worked Example

OLD file: "Hello World! This is my Flutter app."
NEW file: "Hello World! This is my updated Flutter app!"

The algorithm scans the new file byte by byte, looking for the longest match in the old file:

The match scan decomposes the NEW file into four segments:

  bytes 0-23 of OLD    "Hello World! This is my "   identical bytes at the same position
+ 8 inserted bytes     "updated "                   new bytes, not present anywhere in OLD
+ bytes 24-34 of OLD   "Flutter app"                bytes that exist in OLD but at a different position in NEW
+ 1 inserted byte      "!"                          new byte (replaces OLD's trailing ".")

2.3 The Three Operations

bidiff encodes the diff as a stream of three operation types:

📋 COPY(old_offset, length)

"Copy length bytes from the OLD file starting at position old_offset."

This is the powerhouse operation. When the algorithm finds a sequence in the new file that matches somewhere in the old file, it encodes it as a COPY instead of storing the actual bytes. For libapp.so, COPY operations cover 90-98% of the new file.

➕ ADD(delta_bytes)

"Add these delta values byte-by-byte to the copied bytes."

Sometimes the new file is almost identical to a section in the old file, but with small byte-level differences (like a function offset shifting by a few bytes). ADD captures these "near misses" efficiently — instead of storing the full new bytes, it stores the difference between old and new bytes at each position.

🆕 INSERT(new_bytes)

"Insert these entirely new bytes that don't exist anywhere in the old file."

When you add a completely new function or string literal, those bytes have no match in the old file. They get encoded as raw INSERT operations. This is the most expensive operation but typically covers only 2-10% of the file.

2.4 The Diff for Our Example

Instruction 1:  COPY(offset=0, length=24)       → "Hello World! This is my "
                                                    ↑ 24 bytes from old file position 0

Instruction 2:  INSERT("updated ")               → "updated "
                                                    ↑ 8 new bytes (not in old file)

Instruction 3:  COPY(offset=24, length=11)       → "Flutter app"
                                                    ↑ 11 bytes from old file position 24

Instruction 4:  INSERT("!")                       → "!"
                                                    ↑ 1 new byte (replaces ".")

─────────────────────────────────────────────────────────
TOTAL DIFF SIZE:  a handful of offset/length fields (~12 bytes)
                + 9 bytes of new data ("updated " + "!")
                = ~21 bytes to describe the transformation

vs. sending entire new file = 44 bytes
vs. old file size           = 36 bytes

Savings: 52% reduction (and this gets MUCH better with larger files)
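The four instructions above can be replayed with a toy sketch. The `Op` enum below is a simplified model for illustration, not bidiff's actual wire format (which varint-encodes fields and interleaves the operations differently):

```rust
/// Toy model of bidiff's three operation kinds (illustration only).
enum Op {
    /// Copy `len` bytes from the OLD file starting at `old_offset`.
    Copy { old_offset: usize, len: usize },
    /// Add these delta values byte-by-byte to the bytes just copied.
    Add(Vec<u8>),
    /// Insert brand-new bytes that have no match in the OLD file.
    Insert(Vec<u8>),
}

/// Reconstruct the NEW file by replaying the operations against the OLD file.
fn apply(old: &[u8], ops: &[Op]) -> Vec<u8> {
    let mut out = Vec::new();
    for op in ops {
        match op {
            Op::Copy { old_offset, len } => {
                out.extend_from_slice(&old[*old_offset..*old_offset + *len])
            }
            Op::Add(deltas) => {
                // Wrapping addition applied to the most recently copied bytes.
                let start = out.len() - deltas.len();
                for (b, d) in out[start..].iter_mut().zip(deltas) {
                    *b = b.wrapping_add(*d);
                }
            }
            Op::Insert(bytes) => out.extend_from_slice(bytes),
        }
    }
    out
}

fn main() {
    let old = b"Hello World! This is my Flutter app.";
    let ops = [
        Op::Copy { old_offset: 0, len: 24 },  // "Hello World! This is my "
        Op::Insert(b"updated ".to_vec()),     // 8 new bytes
        Op::Copy { old_offset: 24, len: 11 }, // "Flutter app"
        Op::Insert(b"!".to_vec()),            // replaces "."
    ];
    assert_eq!(apply(old, &ops), b"Hello World! This is my updated Flutter app!".to_vec());

    // ADD demo: copy "Flutter", then nudge 'F' (0x46) up by 1 to get 'G' (0x47).
    let ops = [Op::Copy { old_offset: 24, len: 7 }, Op::Add(vec![1, 0, 0, 0, 0, 0, 0])];
    assert_eq!(apply(old, &ops), b"Glutter".to_vec());
}
```

The second assertion shows why ADD exists: a one-byte difference costs one small delta value instead of re-sending the whole run.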

2.5 How It Scales to libapp.so (3 MB)

With a real libapp.so, the same principle applies at a massive scale:

Scenario                      COPY %   ADD %    INSERT %   Diff Size
Change 1 line of Dart code    ~99%     ~0.5%    ~0.5%      ~30 KB
Change 10 files               ~97%     ~1.5%    ~1.5%      ~80 KB
Major refactor (50 files)     ~90%     ~5%      ~5%        ~300 KB
Complete rewrite              ~40%     ~10%     ~50%       ~1.5 MB

2.6 The DiffParams in Shorebird

// updater/patch/src/lib.rs
let diff_params = DiffParams::new(
    1,     // scan_block_size: minimum match length (1 byte granularity)
    None   // sort_partitions: auto (uses available CPU cores)
).unwrap();

bidiff::simple_diff_with_params(&older[..], &newer[..], &mut writer, &diff_params);

scan_block_size = 1 means the algorithm looks for matches at single-byte granularity — the most thorough (but slowest) setting. Since this runs on the developer's machine (not the phone), speed isn't critical.

Why bidiff over bsdiff?
bsdiff is the original (from 2003, by Colin Percival). bidiff is a Rust reimplementation with the same core algorithm (suffix array + binary diff) but with a cleaner streaming API, better memory efficiency, and native Rust integration. Shorebird chose it because the entire updater library is Rust, and bidiff integrates seamlessly with bipatch (the reverse operation).

3. bipatch — Applying the Diff on the Device

bipatch is the inverse of bidiff. It runs on the user's phone and reconstructs the new file from the old file + the diff instructions.

// On the device (Rust updater):
let mut reader = bipatch::Reader::new(
    diff_stream,     // the decompressed diff instructions
    old_file_reader  // seekable reader over the original bundled libapp.so
);

// reader now behaves like a Read stream that outputs the NEW file bytes
std::io::copy(&mut reader, &mut output_file);

bipatch processes the instruction stream sequentially:

1. Read instruction header → it's a COPY(offset=0, length=24)
2. Seek to byte 0 in the old file, read 24 bytes → write to output
3. Read next instruction → INSERT(8 bytes)
4. Read 8 bytes from the diff stream → write to output
5. Read next instruction → COPY(offset=24, length=11)
6. Seek to byte 24 in the old file, read 11 bytes → write to output
7. ...and so on until the diff stream is exhausted

The output is a byte-for-byte identical copy of the new libapp.so. The SHA256 hash is verified after reconstruction to guarantee this.
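The streaming behavior can be sketched with a toy `PatchReader` that implements `Read` over a simplified instruction model (hypothetical types and format; bipatch's real wire format differs):

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Toy instruction model (hypothetical; not bipatch's real wire format).
enum Instr {
    Copy { old_offset: u64, len: usize },
    Insert(Vec<u8>),
}

/// Streaming reconstruction in the spirit of bipatch::Reader: wraps a
/// seekable reader over the OLD file and yields NEW-file bytes on demand.
struct PatchReader<O: Read + Seek> {
    old: O,
    instrs: std::vec::IntoIter<Instr>,
    pending: Vec<u8>, // bytes produced by the current instruction
}

impl<O: Read + Seek> PatchReader<O> {
    fn new(old: O, instrs: Vec<Instr>) -> Self {
        Self { old, instrs: instrs.into_iter(), pending: Vec::new() }
    }
}

impl<O: Read + Seek> Read for PatchReader<O> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        while self.pending.is_empty() {
            match self.instrs.next() {
                Some(Instr::Copy { old_offset, len }) => {
                    // Seek into the OLD file and pull the matched bytes.
                    self.old.seek(SeekFrom::Start(old_offset))?;
                    let mut chunk = vec![0u8; len];
                    self.old.read_exact(&mut chunk)?;
                    self.pending = chunk;
                }
                Some(Instr::Insert(bytes)) => self.pending = bytes,
                None => return Ok(0), // instruction stream exhausted: EOF
            }
        }
        let n = buf.len().min(self.pending.len());
        buf[..n].copy_from_slice(&self.pending[..n]);
        self.pending.drain(..n);
        Ok(n)
    }
}

fn main() {
    let old = io::Cursor::new(b"Hello World! This is my Flutter app.".to_vec());
    let instrs = vec![
        Instr::Copy { old_offset: 0, len: 24 },
        Instr::Insert(b"updated ".to_vec()),
        Instr::Copy { old_offset: 24, len: 11 },
        Instr::Insert(b"!".to_vec()),
    ];
    let mut out = Vec::new();
    io::copy(&mut PatchReader::new(old, instrs), &mut out).unwrap();
    assert_eq!(out, b"Hello World! This is my updated Flutter app!".to_vec());
}
```

Because the reader produces bytes only as they are requested, the full new file never has to sit in memory; `std::io::copy` pulls it through in small buffers, which mirrors the pipe architecture described below in section 5.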

4. Zstd Compression — Making the Diff Even Smaller

4.1 What is Zstd?

Zstandard (zstd) is a real-time compression algorithm created by Yann Collet at Facebook/Meta in 2015. It's designed to provide high compression ratios at very fast speeds — both for compression and decompression.

It's used by: Linux kernel, Android system updates, databases (RocksDB, PostgreSQL), game engines (Unreal), and hundreds of other systems. It's essentially the modern replacement for both gzip (zlib) and Snappy.

4.2 How Zstd Works (Simplified)

Zstd combines three compression techniques:

① LZ77 — Find Repeated Patterns

Scans the data with a sliding window, looking for byte sequences that appeared earlier. When it finds a match, it replaces the repeated bytes with a (distance, length) reference: "Go back distance bytes and copy length bytes from there."

Example: In the diff instructions, many COPY operations have similar offset patterns. LZ77 captures this repetition.
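A greedy match-finder illustrates the idea. This sketch scans the whole preceding prefix and disallows overlapping copies for clarity; real zstd uses hash chains, lazy matching, and a bounded window:

```rust
/// Token produced by a greedy LZ77 pass: a raw literal byte, or a
/// back-reference meaning "go back `distance` bytes and copy `length`".
#[derive(Debug, PartialEq)]
enum Token {
    Literal(u8),
    Match { distance: usize, length: usize },
}

/// Minimal greedy LZ77 sketch (illustration only).
fn lz77(data: &[u8], min_match: usize) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < data.len() {
        let mut best = (0usize, 0usize); // (distance, length)
        // Try every earlier position as a match start; keep the longest.
        for start in 0..i {
            let mut len = 0;
            while start + len < i && i + len < data.len() && data[start + len] == data[i + len] {
                len += 1;
            }
            if len > best.1 {
                best = (i - start, len);
            }
        }
        if best.1 >= min_match {
            tokens.push(Token::Match { distance: best.0, length: best.1 });
            i += best.1;
        } else {
            tokens.push(Token::Literal(data[i]));
            i += 1;
        }
    }
    tokens
}

fn main() {
    // "abc" appears three times; after the first three literals, the
    // repeats collapse into two cheap (distance, length) references.
    let tokens = lz77(b"abcabcabc", 3);
    assert_eq!(tokens.len(), 5); // 3 literals + 2 matches
    println!("{tokens:?}");
}
```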

② Finite State Entropy (FSE) — Smart Encoding

After LZ77, the stream of literals, match lengths, and offsets is encoded using Finite State Entropy (zstd's custom entropy coder, an alternative to Huffman coding). FSE assigns shorter bit patterns to more common values and longer patterns to rare values.

If the byte 0x00 appears 40% of the time in the diff, FSE might encode it as just 2 bits instead of 8.
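The "2 bits" figure can be checked against Shannon's formula: the ideal code length for a symbol with probability p is -log2(p) bits. A quick sanity check:

```rust
/// Ideal (Shannon) code length in bits for a symbol with probability `p`.
fn ideal_bits(p: f64) -> f64 {
    -p.log2()
}

fn main() {
    // A byte seen 40% of the time ideally needs ~1.32 bits, not 8.
    // FSE can approach this fractional bound; Huffman must round up
    // to whole bits (2 bits here).
    println!("{:.2}", ideal_bits(0.40)); // 1.32
    // A rare byte (1%) costs far more bits, which is fine: it's rare.
    println!("{:.2}", ideal_bits(0.01)); // 6.64
}
```

The ability to spend fractional bits per symbol is the main advantage of FSE (and arithmetic coding generally) over plain Huffman coding.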

③ Dictionary + Repeat Offsets

Zstd keeps track of the 3 most recent match offsets. Since consecutive matches often have similar offsets (common in structured binary data like ELF files), referencing a "repeat offset" costs almost nothing — sometimes just 1-2 bits.

4.3 Why Zstd Specifically?

✅ Why Shorebird chose Zstd

Decompression speed: ~1.5 GB/s on mobile ARM64 — near memory-copy speed. The phone decompresses a 150 KB patch in <1ms.

Compression ratio: 3-5x better than Snappy, comparable to gzip at similar speed.

Streaming support: Can decompress byte-by-byte without loading the entire file into memory. Critical for the pipe architecture.

Small decoder: The decompression code adds only ~30 KB to the binary size.

Alternatives considered

gzip/zlib: 2-3x slower decompression, ~20% worse compression.

brotli: 10-15% better compression, but 3x slower decompression. Good for web assets, overkill for mobile OTA.

lz4: 2x faster decompression, but 2x worse compression ratio. The diff would be larger.

No compression: The bidiff output has high entropy but still contains repetitive patterns (similar offset values, runs of zeros). Zstd captures 60-70% further reduction.

4.4 The Magic Bytes

// Every zstd compressed file starts with these 4 bytes:
const ZSTD_MAGIC: [u8; 4] = [0x28, 0xB5, 0x2F, 0xFD];

// The device validates this BEFORE attempting decompression:
fn validate_compressed_patch(patch_path: &Path) -> Result<()> {
    let mut file = File::open(patch_path)?;
    let mut magic = [0u8; 4];
    file.read_exact(&mut magic)?;
    if magic != ZSTD_MAGIC {
        bail!("Not a valid zstd archive — download may be corrupt");
    }
    Ok(())
}

This is a quick sanity check that prevents wasting CPU on garbage data (e.g., if the download was corrupted or a CDN served an error page instead of the patch file).

5. The Complete Pipeline — Putting It All Together

5.1 On Developer's Machine (Creating the Patch)

old libapp.so (3,146 KB)  +  new libapp.so (3,148 KB)
        │
        ▼
bidiff (suffix sort + diff)
        │
        ▼
raw diff (~480 KB)
        │
        ▼
zstd compress (level 3, default)
        │
        ▼
patch.bin (~150 KB) ✓
// updater/patch/src/lib.rs — the exact code
pub fn make_patch(older: Vec<u8>, newer: Vec<u8>, patch: &mut WS) {
    let (mut patch_r, mut patch_w) = pipe::pipe();    // in-memory pipe
    let diff_params = DiffParams::new(1, None).unwrap();

    // Thread 1: compute binary diff, write to pipe
    std::thread::spawn(move || {
        bidiff::simple_diff_with_params(&older, &newer, &mut patch_w, &diff_params).unwrap();
    });

    // Thread 2 (this thread): read diff from pipe, compress with zstd, write to output
    let compressor = ZstdCompressor::new();
    compressor.compress(&mut BufWriter::new(patch), &mut patch_r).unwrap();
}

Two threads run in parallel, connected by a pipe. Thread 1 produces diff bytes into the pipe, Thread 2 reads and compresses them. This is a producer-consumer pattern — the diff doesn't need to be fully computed before compression starts.
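The producer-consumer pattern can be sketched with a std channel standing in for the byte pipe (a channel is unbounded and so lacks the pipe's backpressure; `run_pipeline` is a hypothetical name for illustration):

```rust
use std::sync::mpsc;
use std::thread;

/// Producer-consumer sketch: a spawned thread sends chunks (standing in
/// for bidiff emitting diff bytes) while the calling thread consumes them
/// (standing in for the zstd compressor reading the pipe).
fn run_pipeline(chunks: Vec<Vec<u8>>) -> usize {
    let (tx, rx) = mpsc::channel::<Vec<u8>>();

    // Thread 1: the producer starts sending before all chunks are "computed".
    let producer = thread::spawn(move || {
        for chunk in chunks {
            tx.send(chunk).unwrap();
        }
        // Dropping `tx` closes the channel, like closing the pipe's write end.
    });

    // Thread 2 (this thread): consume chunks as they arrive.
    let mut total = 0;
    for chunk in rx {
        total += chunk.len(); // a real consumer would compress here
    }
    producer.join().unwrap();
    total
}

fn main() {
    let chunks = vec![b"COPY".to_vec(), b"INSERT".to_vec(), b"COPY".to_vec()];
    println!("consumed {} bytes", run_pipeline(chunks)); // consumed 14 bytes
}
```

The consumer loop ends exactly when the producer drops its sending half, which is the same shutdown signal a pipe gives via EOF.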

5.2 On User's Phone (Applying the Patch)

patch.bin (~150 KB, downloaded)
        │
        ▼
zstd decompress (<1 ms on ARM64)
        │
        ▼
raw diff (~480 KB)  +  bundled libapp.so from APK (3,146 KB)
        │
        ▼
bipatch (apply instructions)
        │
        ▼
dlc.vmcode (3,148 KB) ✓
// updater/library/src/updater.rs — the exact code
fn inflate(patch_path: &Path, base_reader: RS, output_path: &Path) {
    // Validate zstd magic bytes first
    validate_compressed_patch(patch_path);

    let compressed_patch_r = BufReader::new(File::open(patch_path));
    let output_file_w = File::create(output_path);

    let (patch_r, patch_w) = pipe::pipe();    // in-memory pipe

    // Thread 1: decompress zstd → write decompressed diff into pipe
    let decompress = ZstdDecompressor::new();
    std::thread::spawn(move || {
        decompress.copy(compressed_patch_r, patch_w);
    });

    // Thread 2 (this thread): read decompressed diff from pipe
    //                         + read original libapp.so from APK
    //                         → apply bipatch → write new libapp.so
    let mut fresh_r = bipatch::Reader::new(patch_r, base_reader);
    std::io::copy(&mut fresh_r, &mut BufWriter::new(output_file_w));
}

Same pipe architecture as the creation side, but reversed. Decompression and patching happen in parallel. The phone never needs to hold the entire decompressed diff in memory — it streams through the pipe.

5.3 Verification

fn check_hash(file_path: &Path, expected_hash: &str) -> Result<()> {
    let mut file = File::open(file_path)?;
    let mut hasher = Sha256::new();
    std::io::copy(&mut file, &mut hasher)?;
    let actual_hash = hex::encode(hasher.finalize());

    if actual_hash != expected_hash {
        bail!("Hash mismatch: expected {}, got {}", expected_hash, actual_hash);
    }
    Ok(())
}

After dlc.vmcode is written, the updater computes SHA256(dlc.vmcode) and compares it to the hash from the server. This guarantees the reconstructed file is byte-for-byte identical to what the developer built. Any corruption — in download, decompression, or patching — is caught.

6. Summary

Component         What It Does                                                       Where It Runs         Speed
bidiff            Finds matching byte sequences between old & new files using a      Developer's machine   ~2-5 s for 3 MB
                  suffix array; outputs COPY/ADD/INSERT instructions.
zstd compress     Compresses the diff instructions with LZ77 + FSE entropy coding;   Developer's machine   ~50 ms
                  typically a further 60-70% reduction.
zstd decompress   Reverses the compression; streams byte-by-byte via a pipe.         User's phone          <1 ms for 150 KB
bipatch           Reads COPY/ADD/INSERT instructions, seeks in the original file,    User's phone          ~50-200 ms for 3 MB
                  and reconstructs the new file.
SHA256            Verifies the reconstructed file matches the expected hash.         User's phone          ~20 ms for 3 MB
The Key Insight
The combination of bidiff + zstd reduces a 3 MB file change to a ~150 KB download. The device reconstructs the full file using the original (already on the phone) as a reference. The entire process — download, decompress, patch, verify — takes under 2 seconds on a mid-range phone over a 4G connection.
