LZ77 is one of the most popular compression algorithms and is basis to DELFATE algorithm used in ZIP, PNG, gzip, zlib and many other compression utilities. LZ77 was invented by two Israeli scientists Abraham Lempel and Jacob Ziv in 1977.
The concept of LZ77 is relatively simple to understand. It tries to find sequences of characters which match characters earlier in the string and replaces it with an offset to the matching sequence together with a sequence length. So instead of storing the character sequence twice it is only stored once and instead of the second and further occurrences just two numbers (offset and length) are stored.
DEFLATE is a modification of LZ77 which uses Huffman coding to make compression even more efficient. Data is split in blocks and each block is operated upon separately. There are three modes how DEFLATE can operate on a block:
The Huffman coding in DEFLATE is done by creating of an alphabet which includes not only data characters but also lengths of the data and end codes. And a separate Huffman coding is done on distances.
First implementation of DEFLATE was done by Phil Katz in PKZIP archiver. After that Jean-loup Gailly and Mark Adler implemented it in gzip, zlib and Info-Zip. Zlib became a critical software component of many software platforms including Windows, Mac OS, Linux and iOS. Even data transferred via the HTTP internet protocol is compressed by zlib. Even though more advanced algorithms like Burrows-Wheeler algorithm and arithmetic coding exist DEFLATE still dominates the compression space because of its effectiveness, performance and relative simplicity.
ZIP Quick Info | |
---|---|
Zip File Format | |
MIME Type | |
| |
Opens with | |
|