Compression algorithm history is quite long and starts with first attempts by humans to replace longer sequences of information with shorter once. Morse code or maritime signal flags are one of the first examples of signal compression. In both of these languages longer sequences of data are converted to a shorter signal which achieves signal compression. In morse code sequences of dots and dashes were specifically coded in a way that letters which most frequently occur in English language are coded with the shortest sequence. That makes morse code taking up less space than it would if sequences were arranged in another way.
In 1948 Claude Shannon published an article called “A Mathematical Theory of Communication”. This article was later published as a book and became one of the key stones in the information theory. A method called Shannon-Fano coding was introduced in it. This method allows creation of very effective coding table between original alphabet and sequences of bits. Frequencies of letters in the alphabet are used as an input and the coding achieves almost always the minimum possible coding length. But since it doesn’t achieve it 100% of times, Huffman coding is always used in modern algorithms.
Huffman code similarly to Shannon-Fano code is an optimal coding of alphabet symbols to a binary sequence based on letter frequencies. Both Huffman and Shannon-Fano codes are prefix codes which means each coding sequence does not have another coding sequence as its substring. This allows unambiguous decoding of coded strings.
In 1977 LZ77 and LZ78 algorithms were introduced by Israeli computer scientists Abraham Lempel and Jacob Ziv. These algorithms made a foundation for algorithms used to compress GIF files and DEFLATE algorithm which is used in ZIP, gzip, PNG, and many other places.
More recently PPMd, LZMA and Burrows-Wheeler algorithms were developed which improved compression efficiency even further. LZMA and its improvement LZMA2 is used in 7-Zip archiver while Burrows-Wheeler algorithm is used in bzip2. There are also many cases when specific compression algorithm is applied to the data depending on its nature. For example executable binaries and image data use different compression algorithms to achieve maximum compression rates.
ZIP Quick Info | |
---|---|
Zip File Format | |
MIME Type | |
| |
Opens with | |
|