LZMA SDK 19.00
--------------

LZMA SDK provides the documentation, samples, header files,
libraries, and tools you need to develop applications that
use 7z / LZMA / LZMA2 / XZ compression.

LZMA is an improved version of the well-known LZ77 compression algorithm.
It was designed to maximize the compression ratio while keeping
decompression fast and the memory requirements for decompression low.

LZMA2 is an LZMA-based compression method. LZMA2 provides better
multithreading support for compression than LZMA, along with some other
improvements.

7z is a file format for data compression and file archiving.
7z is the main file format of the 7-Zip compression program (www.7-zip.org).
The 7z format supports different compression methods: LZMA, LZMA2 and others.
7z also supports AES-256 based encryption.

XZ is a file format for data compression that uses LZMA2 compression.
The XZ format provides additional features: SHA/CRC check, filters for
improved compression ratio, and splitting into blocks and streams.


LICENSE
-------

LZMA SDK is written and placed in the public domain by Igor Pavlov.

Some code in LZMA SDK is based on public domain code from other developers:
  1) PPMd var.H (2001): Dmitry Shkarin
  2) SHA-256: Wei Dai (Crypto++ library)

Anyone is free to copy, modify, publish, use, compile, sell, or distribute the
original LZMA SDK code, either in source code form or as a compiled binary, for
any purpose, commercial or non-commercial, and by any means.

LZMA SDK code is compatible with open source licenses. For example, you can
include it in GNU GPL or GNU LGPL code.


LZMA SDK Contents
-----------------

  Source code:

    - C / C++ / C# / Java   - LZMA compression and decompression
    - C / C++               - LZMA2 compression and decompression
    - C / C++               - XZ compression and decompression
    - C                     - 7z decompression
    - C++                   - 7z compression and decompression
    - C                     - small SFXs for installers (7z decompression)
    - C++                   - SFXs and SFXs for installers (7z decompression)

  Precompiled binaries:

    - console programs for lzma / 7z / xz compression and decompression
    - SFX modules for installers.


UNIX/Linux version
------------------
To compile the C++ version of the file->file LZMA encoder, go to the directory
CPP/7zip/Bundles/LzmaCon
and call make to recompile it:
  make -f makefile.gcc clean all

In some UNIX/Linux versions you must compile LZMA with static libraries.
To compile with static libraries, you can use
LIB = -lm -static

You can also use p7zip (a port of 7-Zip for POSIX systems like Unix or Linux):

  http://p7zip.sourceforge.net/


Files
-----

DOC/7zC.txt                - 7z ANSI-C Decoder description
DOC/7zFormat.txt           - 7z Format description
DOC/installer.txt          - information about 7-Zip for installers
DOC/lzma.txt               - LZMA compression description
DOC/lzma-sdk.txt           - LZMA SDK description (this file)
DOC/lzma-history.txt       - history of LZMA SDK
DOC/lzma-specification.txt - Specification of LZMA
DOC/Methods.txt            - Compression method IDs for .7z

bin/installer/   - example script to create an installer that uses an SFX module

bin/7zdec.exe    - simplified 7z archive decoder
bin/7zr.exe      - 7-Zip console program (reduced version)
bin/x64/7zr.exe  - 7-Zip console program (reduced version) (x64 version)
bin/lzma.exe     - file->file LZMA encoder/decoder for Windows
bin/7zS2.sfx     - small SFX module for installers (GUI version)
bin/7zS2con.sfx  - small SFX module for installers (Console version)
bin/7zSD.sfx     - SFX module for installers.


7zDec.exe
---------

7zDec.exe is a simplified 7z archive decoder.
It supports only the LZMA, LZMA2, and PPMd methods.
7zDec decodes a whole solid block from the 7z archive to RAM,
so RAM consumption can be high.


Source code structure
---------------------

Asm/ - asm files (optimized code for CRC calculation and Intel-AES encryption)

C/  - C files (compression / decompression and other)
  Util/
    7z       - 7z decoder program (decoding 7z files)
    Lzma     - LZMA program (file->file LZMA encoder/decoder).
    LzmaLib  - LZMA library (.DLL for Windows)
    SfxSetup - small SFX module for installers

CPP/ -- CPP files

  Common  - common files for C++ projects
  Windows - common files for Windows related code

  7zip    - files related to 7-Zip

    Archive - files related to archiving

      Common   - common files for archive handling
      7z       - 7z C++ Encoder/Decoder

    Bundles  - Modules that are bundles of other modules (files)

      Alone7z       - 7zr.exe: Standalone 7-Zip console program (reduced version)
      Format7zExtractR  - 7zxr.dll: Reduced version of 7z DLL: extracting from 7z/LZMA/BCJ/BCJ2.
      Format7zR         - 7zr.dll:  Reduced version of 7z DLL: extracting/compressing to 7z/LZMA/BCJ/BCJ2
      LzmaCon       - lzma.exe: LZMA compression/decompression
      LzmaSpec      - example code for LZMA Specification
      SFXCon        - 7zCon.sfx: Console 7z SFX module
      SFXSetup      - 7zS.sfx: 7z SFX module for installers
      SFXWin        - 7z.sfx: GUI 7z SFX module

    Common   - common files for 7-Zip

    Compress - files for compression/decompression

    Crypto   - files for encryption / decryption

    UI       - User Interface files

      Client7z - Test application for 7za.dll, 7zr.dll, 7zxr.dll
      Common   - Common UI files
      Console  - Code for console program (7z.exe)
      Explorer    - Some code from 7-Zip Shell extension
      FileManager - Some GUI code from 7-Zip File Manager
      GUI         - Some GUI code from 7-Zip


CS/ - C# files
  7zip
    Common   - some common files for 7-Zip
    Compress - files related to compression/decompression
      LZ     - files related to LZ (Lempel-Ziv) compression algorithm
      LZMA         - LZMA compression/decompression
      LzmaAlone    - file->file LZMA compression/decompression
      RangeCoder   - Range Coder (special code of compression/decompression)

Java/  - Java files
  SevenZip
    Compression    - files related to compression/decompression
      LZ           - files related to LZ (Lempel-Ziv) compression algorithm
      LZMA         - LZMA compression/decompression
      RangeCoder   - Range Coder (special code of compression/decompression)


Note:
  Asm / C / C++ source code of LZMA SDK is part of 7-Zip's source code.
  7-Zip's source code can be downloaded from 7-Zip's SourceForge page:

  http://sourceforge.net/projects/sevenzip/


LZMA features
-------------
  - Variable dictionary size (up to 1 GB)
  - Estimated compressing speed: about 2 MB/s on a 2 GHz CPU
  - Estimated decompressing speed:
      - 20-30 MB/s on a modern 2 GHz CPU
      - 1-2 MB/s on a 200 MHz simple RISC CPU (ARM, MIPS, PowerPC)
  - Small memory requirements for decompressing (16 KB + DictionarySize)
  - Small code size for decompressing: 5-8 KB

The LZMA decoder uses only integer operations and can be
implemented on any modern 32-bit CPU (or on a 16-bit CPU with some conditions).

Some critical operations that affect the speed of LZMA decompression:
  1) 32*16 bit integer multiply
  2) Mispredicted branches (the penalty mostly depends on pipeline length)
  3) 32-bit shift and arithmetic operations

The speed of LZMA decompression mostly depends on CPU speed.
Memory speed matters little. But if your CPU has a small data cache,
the overall weight of memory speed will slightly increase.


How To Use
----------

Using LZMA encoder/decoder executable
--------------------------------------

Usage:  LZMA <e|d> inputFile outputFile [<switches>...]

  e: encode file

  d: decode file

  b: Benchmark. There are two tests: compressing and decompressing
     with the LZMA method. Benchmark shows the rating in MIPS (million
     instructions per second). The rating value is calculated from the
     measured speed and is normalized with Intel's Core 2 results.
     Benchmark also checks for possible hardware errors (RAM
     errors in most cases). Benchmark uses these settings:
     (-a1, -d21, -fb32, -mfbt4). You can change only the -d parameter.
     You can also change the number of iterations. Example for 30 iterations:
       LZMA b 30
     The default number of iterations is 10.

<Switches>

  -a{N}:  set compression mode: 0 = fast, 1 = normal
          default: 1 (normal)

  -d{N}:  set dictionary size - [0, 30], default: 23 (8MB)
          The maximum value for dictionary size is 1 GB = 2^30 bytes.
          Dictionary size is calculated as DictionarySize = 2^N bytes.
          To decompress a file compressed by the LZMA method with dictionary
          size D = 2^N, you need about D bytes of memory (RAM).

  -fb{N}: set number of fast bytes - [5, 273], default: 128
          Usually a bigger number gives a slightly better compression ratio
          and a slower compression process.

  -lc{N}: set number of literal context bits - [0, 8], default: 3
          Sometimes lc=4 gives a gain for big files.

  -lp{N}: set number of literal pos bits - [0, 4], default: 0
          The lp switch is intended for periodical data when the period is
          equal to 2^N. For example, for 32-bit (4 bytes)
          periodical data you can use lp=2. It is often better to set lc=0
          if you change the lp switch.

  -pb{N}: set number of pos bits - [0, 4], default: 2
          The pb switch is intended for periodical data
          when the period is equal to 2^N.

  -mf{MF_ID}: set Match Finder. Default: bt4.
              Algorithms from the hc* group don't provide a good compression
              ratio, but they often work pretty fast in combination with
              fast mode (-a0).

              Memory requirements depend on the dictionary size
              (parameter "d" in the table below).

               MF_ID     Memory                   Description

                bt2    d *  9.5 + 4MB  Binary Tree with 2 bytes hashing.
                bt3    d * 11.5 + 4MB  Binary Tree with 3 bytes hashing.
                bt4    d * 11.5 + 4MB  Binary Tree with 4 bytes hashing.
                hc4    d *  7.5 + 4MB  Hash Chain with 4 bytes hashing.

  -eos:   write End Of Stream marker. By default LZMA doesn't write the
          eos marker, since the LZMA decoder knows the uncompressed size
          stored in the .lzma file header.

  -si:    Read data from stdin (it will write End Of Stream marker).
  -so:    Write data to stdout
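
The memory figures quoted for the -d and -mf switches are simple arithmetic,
so they can be estimated before choosing settings. The Python sketch below
only restates the formulas given in this file (2^N dictionary size, the MF_ID
memory table, and 16 KB + DictionarySize for decoding). The function names are
hypothetical and not part of the SDK, and "MB" is assumed to mean 2^20 bytes.

```python
# Sketch of the memory arithmetic described for the -d and -mf switches.
# Function names are hypothetical; the factors come from the MF_ID table.
# Assumption: "MB" means 2**20 bytes and "KB" means 2**10 bytes.

MB = 1024 * 1024

# Encoder memory factor per match finder, as listed in the MF_ID table.
MATCH_FINDER_FACTOR = {"bt2": 9.5, "bt3": 11.5, "bt4": 11.5, "hc4": 7.5}


def dictionary_size(n: int) -> int:
    """Dictionary size in bytes for switch -d{N}: 2^N, with N in [0, 30]."""
    if not 0 <= n <= 30:
        raise ValueError("N must be in [0, 30]")
    return 1 << n


def encoder_memory(n: int, mf_id: str = "bt4") -> int:
    """Approximate encoder memory in bytes: d * factor + 4MB."""
    return int(dictionary_size(n) * MATCH_FINDER_FACTOR[mf_id]) + 4 * MB


def decoder_memory(n: int) -> int:
    """Approximate decoder memory in bytes: 16 KB + DictionarySize."""
    return 16 * 1024 + dictionary_size(n)


if __name__ == "__main__":
    for n in (16, 23, 30):
        print(n, dictionary_size(n), encoder_memory(n), decoder_memory(n))
```

With the default settings (-d23, bt4) this estimate gives 8MB * 11.5 + 4MB for
compression, while decompression needs only slightly more than the 8MB
dictionary itself.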

Examples:

1) LZMA e file.bin file.lzma -d16 -lc0

compresses file.bin to file.lzma with a 64 KB dictionary (2^16=64K)
and 0 literal context bits. -lc0 reduces the memory requirements
for decompression.


2) LZMA e file.bin file.lzma -lc0 -lp2

compresses file.bin to file.lzma with settings suitable
for 32-bit periodical data (for example, ARM or MIPS code).

3) LZMA d file.lzma file.bin

decompresses file.lzma to file.bin.
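
For readers without the lzma executable at hand, the same settings can be
tried with the lzma module of the Python standard library, which is built on
liblzma rather than on this SDK. The sketch below is an illustration under
that assumption, not the SDK's API: the helper names are hypothetical, and the
header layout it decodes (1 properties byte, 4-byte dictionary size, 8-byte
uncompressed size, little-endian) is the .lzma header mentioned in the -eos
description. liblzma may store the size field as "unknown" and write an End Of
Stream marker instead.

```python
import lzma
import struct


def compress_lzma_alone(data: bytes, dict_bits: int = 16, lc: int = 0,
                        lp: int = 0, pb: int = 2) -> bytes:
    """Compress to a .lzma stream with explicit LZMA settings (hypothetical helper)."""
    filters = [{
        "id": lzma.FILTER_LZMA1,
        "dict_size": 1 << dict_bits,  # same role as the -d{N} switch
        "lc": lc,                     # same role as -lc{N}
        "lp": lp,                     # same role as -lp{N}
        "pb": pb,                     # same role as -pb{N}
    }]
    return lzma.compress(data, format=lzma.FORMAT_ALONE, filters=filters)


def parse_lzma_header(blob: bytes) -> dict:
    """Decode the 13-byte .lzma header into its fields."""
    props, dict_size, size = struct.unpack("<BIQ", blob[:13])
    # The properties byte packs the three values as (pb * 5 + lp) * 9 + lc.
    return {
        "lc": props % 9,
        "lp": (props // 9) % 5,
        "pb": props // 45,
        "dict_size": dict_size,
        "uncompressed_size": size,
    }


if __name__ == "__main__":
    payload = bytes(range(256)) * 64
    # Rough counterpart of: LZMA e file.bin file.lzma -d16 -lc0
    packed = compress_lzma_alone(payload, dict_bits=16, lc=0)
    print(parse_lzma_header(packed))
    # Rough counterpart of: LZMA d file.lzma file.bin
    assert lzma.decompress(packed) == payload
```

Reading the header back is a quick way to confirm which dictionary size and
lc/lp/pb values a given .lzma file was produced with.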


Compression ratio hints
-----------------------

Recommendations
---------------

To increase the compression ratio for LZMA compression, it is desirable
to have aligned data (if possible) and to arrange the data so that
code is grouped in one place and data is grouped in another
(this is better than a mix such as: code, data, code, data, ...).


Filters
-------
You can increase the compression ratio for some data types by applying
special filters before compressing. For example, it is possible to
increase the compression ratio by 5-10% for code for these CPU ISAs:
x86, IA-64, ARM, ARM-Thumb, PowerPC, SPARC.

You can find the C source code of such filters in the C/Bra*.* files.

You can check the compression ratio gain of these filters with the
following 7-Zip commands (example for ARM code):
No filter:
  7z a a1.7z a.bin -m0=lzma

With filter for little-endian ARM code:
  7z a a2.7z a.bin -m0=arm -m1=lzma

It works as follows:
Compressing   = Filter_encoding + LZMA_encoding
Decompressing = LZMA_decoding + Filter_decoding

The compressing and decompressing speed of such filters is very high,
so they do not increase decompression time much.
Moreover, filtering reduces the time spent in LZMA_decoding,
since the compression ratio with filtering is higher.
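
The filter-then-compress chain can also be tried without 7-Zip. The sketch
below uses the lzma module of the Python standard library (built on liblzma,
not on this SDK), which exposes an ARM branch filter as lzma.FILTER_ARM. Note
the differences from the 7z commands above: it writes an .xz stream and uses
LZMA2 as the final method, because that is what the xz container accepts. The
helper names are hypothetical, and no particular size gain is implied; the
gain depends on how much real ARM code the input contains.

```python
import lzma


def compress_plain(data: bytes) -> bytes:
    """LZMA2 only: counterpart of compressing with no filter."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 6}]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


def compress_with_arm_filter(data: bytes) -> bytes:
    """Filter_encoding followed by LZMA2 encoding, in that order."""
    filters = [
        {"id": lzma.FILTER_ARM},                 # branch filter comes first
        {"id": lzma.FILTER_LZMA2, "preset": 6},  # compressor is last in the chain
    ]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


if __name__ == "__main__":
    with open("a.bin", "rb") as f:  # a.bin: the ARM code file from the example
        code = f.read()
    print("no filter: ", len(compress_plain(code)))
    print("ARM filter:", len(compress_with_arm_filter(code)))
```

Decompression needs no extra arguments: the .xz stream records the filter
chain, so lzma.decompress() undoes the LZMA2 step and then the ARM filter.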

These filters convert CALL (calling procedure) instructions
from relative offsets to absolute addresses, so such data becomes more
compressible.

For some ISAs (for example, MIPS), no gain can be obtained from such a filter.
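
To see why absolute addresses compress better, consider several calls to the
same procedure from different places: each call stores a different relative
offset, but after conversion they all store the same absolute target, which an
LZ-based compressor can match. The Python sketch below is a deliberately
simplified toy, not the algorithm in C/Bra*.c: it works on (position, offset)
pairs instead of raw instruction bytes, ignores every ISA-specific encoding
detail, and uses hypothetical function names.

```python
# Toy model of the relative -> absolute conversion (not the SDK's filter code).
# A "call" is a pair (position, value); addresses wrap at 32 bits.

MASK32 = 0xFFFFFFFF


def to_absolute(calls):
    """Filter encoding: replace each relative offset with position + offset."""
    return [(pos, (pos + rel) & MASK32) for pos, rel in calls]


def to_relative(calls):
    """Filter decoding: restore each relative offset as target - position."""
    return [(pos, (target - pos) & MASK32) for pos, target in calls]


if __name__ == "__main__":
    target = 0x1000
    # Three calls to the same procedure from three different positions.
    relative = [(pos, (target - pos) & MASK32) for pos in (0x100, 0x200, 0x300)]
    absolute = to_absolute(relative)
    print([hex(rel) for _, rel in relative])   # three different offsets
    print([hex(tgt) for _, tgt in absolute])   # the same target three times
    assert to_relative(absolute) == relative   # the conversion is reversible
```

The conversion is lossless because the position of each instruction is known
during decoding, so the original offset can always be recomputed.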


---

http://www.7-zip.org
http://www.7-zip.org/sdk.html
http://www.7-zip.org/support.html