HeadlinesBriefing.com

Fabrice Bellard’s ts_zip: AI‑Driven Text Compression

Hacker News: Front Page

Fabrice Bellard has released ts_zip, a text‑compression tool that relies on a large language model. It needs a GPU to run at a reasonable speed and at least 4 GB of RAM; on an RTX 4090 it reaches about 1 MB/s for both compression and decompression. Only text files are supported, and binary data sees little benefit.
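As a rough illustration of why a stronger predictive model helps, here is a minimal sketch of the general principle behind model-based compression: an ideal entropy coder spends about -log2(p) bits on a symbol the model assigned probability p. This is not ts_zip's code. The character-level models, the sample sentence, and the helper name `ideal_bits` are all assumptions made for the example.

```python
import math
from collections import Counter


def ideal_bits(text, prob):
    """Total of -log2 p(symbol): the size an ideal entropy coder approaches."""
    return sum(-math.log2(prob(ch)) for ch in text)


# Hypothetical sample text, chosen only for illustration.
text = "the cat sat on the mat and the cat ate the rat"
alphabet = sorted(set(text))
counts = Counter(text)


def uniform(ch):
    # Model A: knows nothing, every character is equally likely.
    return 1 / len(alphabet)


def unigram(ch):
    # Model B: uses character frequencies (fit on the text itself, so this
    # is an optimistic toy; a real compressor must share the model with the
    # decompressor).
    return counts[ch] / len(text)


uniform_bpc = ideal_bits(text, uniform) / len(text)
unigram_bpc = ideal_bits(text, unigram) / len(text)

print(f"uniform model: {uniform_bpc:.2f} bits per character")
print(f"unigram model: {unigram_bpc:.2f} bits per character")
```

The better the model predicts the next symbol, the fewer bits are needed. A large language model pushes the same idea much further by conditioning on long context.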

Bellard chose the RWKV 169M v4 model, trained mainly on English text, as a balance between speed and compression ratio. The model is quantized to 8 bits per parameter and evaluated with BF16 arithmetic. In the reported results, ts_zip reaches 1.14 bits per byte (bpb; lower is better) on a 152 kB file, against 2.55 bpb for xz, and 1.10 bpb on a 213 MB file. The tool is experimental, so backward compatibility between versions is not guaranteed.
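To make the bpb figures concrete, the following sketch computes bits per byte from file sizes and measures xz on arbitrary data using Python's standard `lzma` module. The helper names are hypothetical, and the 152,000-byte size is an assumption that reads "152 kB" as 152 × 1000 bytes; the exact file size is not given here.

```python
import lzma


def bits_per_byte(original_size, compressed_size):
    """Compressed bits spent per byte of original input (lower is better)."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return 8.0 * compressed_size / original_size


def estimated_size(original_size, bpb):
    """Approximate compressed size in bytes implied by a bpb figure."""
    return round(original_size * bpb / 8)


def xz_bpb(data):
    """Measure bpb for xz (LZMA2 with default settings) on in-memory data."""
    return bits_per_byte(len(data), len(lzma.compress(data)))


# Assumed size: "152 kB" read as 152,000 bytes.
size = 152_000
print(estimated_size(size, 1.14))  # implied by the ts_zip figure
print(estimated_size(size, 2.55))  # implied by the xz figure

# Measuring xz on a highly repetitive sample (illustrative data only).
sample = b"hello world " * 1000
print(f"{xz_bpb(sample):.3f} bpb")
```

Under that size assumption, 1.14 bpb corresponds to roughly 21.7 kB of output and 2.55 bpb to roughly 48.5 kB, so the reported ts_zip result is well under half the size of the xz output.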

Bellard also offers a Windows build and a Linux tarball, and links to the Large Text Compression Benchmark for enwik8 and enwik9. Although slower than traditional compressors, ts_zip shows how neural models can push the limits of text compression, and it hints at future hybrid approaches that blend statistical and learned techniques.