feat: 9.5.9
parent cb1753732b
commit 35f43a7909
1084 changed files with 558985 additions and 0 deletions

lz4/doc/lz4_Block_format.md
Normal file, 156 lines
@@ -0,0 +1,156 @@
LZ4 Block Format Description
============================
Last revised: 2019-03-30.
Author : Yann Collet


This specification is intended for developers
who want to produce LZ4-compatible compressed data blocks
using any programming language.

LZ4 is an LZ77-type compressor with a fixed, byte-oriented encoding.
There is no entropy encoder back-end nor framing layer.
The latter is assumed to be handled by other parts of the system
(see [LZ4 Frame format]).
This design is assumed to favor simplicity and speed.
It helps later on for optimizations, compactness, and features.

This document describes only the block format,
not how the compressor or decompressor actually works.
The correctness of the decompressor should not depend
on implementation details of the compressor, and vice versa.

[LZ4 Frame format]: lz4_Frame_format.md



Compressed block format
-----------------------
An LZ4 compressed block is composed of sequences.
A sequence is a run of literals (uncompressed bytes),
followed by a match copy.

Each sequence starts with a `token`.
The `token` is a one-byte value, separated into two 4-bit fields.
Therefore each field ranges from 0 to 15.


The first field uses the 4 high bits of the token.
It provides the length of literals to follow.

If the field value is 0, then there is no literal.
If it is 15, then we need to add some more bytes to indicate the full length.
Each additional byte then represents a value from 0 to 255,
which is added to the previous value to produce a total length.
When the byte value is 255, another byte is output.
There can be any number of bytes following `token`. There is no "size limit".
(Side note : this is why an incompressible input block is expanded by about 0.4%,
one extra length byte per 255 literals).

Example 1 : A literal length of 48 will be represented as :

  - 15 : value for the 4-bit high field
  - 33 : (=48-15) remaining length to reach 48

Example 2 : A literal length of 280 will be represented as :

  - 15 : value for the 4-bit high field
  - 255 : following byte is maxed, since 280-15 >= 255
  - 10 : (=280 - 15 - 255) remaining length to reach 280

Example 3 : A literal length of 15 will be represented as :

  - 15 : value for the 4-bit high field
  - 0 : (=15-15) yes, the zero must be output
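
The three examples above can be reproduced with a short sketch. The function name `encode_length` is an assumption made for illustration and is not part of the specification; it only splits a length into the 4-bit field value and the additional bytes described above.

```python
def encode_length(length):
    """Split a length into (4-bit field value, list of additional bytes)."""
    if length < 0:
        raise ValueError("length must be non-negative")
    if length < 15:
        return length, []            # fits entirely in the 4-bit field
    remaining = length - 15
    extra = []
    while remaining >= 255:
        extra.append(255)            # 255 means "another byte follows"
        remaining -= 255
    extra.append(remaining)          # final byte is 0..254, output even when 0
    return 15, extra


print(encode_length(48))
print(encode_length(280))
print(encode_length(15))
```

The same splitting applies to the match length field once `minmatch` has been subtracted.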

Following `token` and optional length bytes are the literals themselves.
They are exactly as numerous as previously decoded (length of literals).
It's possible that there are zero literals.


Following the literals is the match copy operation.

It starts with the `offset`.
This is a 2-byte value, in little-endian format
(the 1st byte is the "low" byte, the 2nd one is the "high" byte).

The `offset` represents the position of the match to be copied from.
1 means "current position - 1 byte".
The maximum `offset` value is 65535; 65536 cannot be coded.
Note that 0 is an invalid value, not used.

Then we need to extract the `matchlength`.
For this, we use the second token field, the low 4 bits.
Its value ranges from 0 to 15.
However here, 0 means that the copy operation will be minimal.
The minimum length of a match, called `minmatch`, is 4.
As a consequence, a 0 value means 4 bytes, and a value of 15 means 19+ bytes.
Similar to literal length, on reaching the highest possible value (15),
we output additional bytes, one at a time, with values ranging from 0 to 255.
They are added to the total to provide the final match length.
A 255 value means there is another byte to read and add.
There is no limit to the number of optional bytes that can be output this way.
(This points towards a maximum achievable compression ratio of about 250).
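
As a rough arithmetic check of that ratio, the sketch below counts the bytes needed for a sequence that has no literals and a single match. The helper name `match_sequence_size` is hypothetical. Each additional length byte covers up to 255 output bytes, so one very long match approaches, but stays under, a ratio of 255, which is consistent with the "about 250" figure above.

```python
def match_sequence_size(match_length):
    """Bytes needed for a sequence with zero literals and one match."""
    if match_length < 4:
        raise ValueError("a match is at least minmatch (4) bytes long")
    size = 1 + 2                       # token + 2-byte offset
    if match_length - 4 >= 15:
        # one byte per full 255, plus the terminating byte (0..254)
        size += (match_length - 19) // 255 + 1
    return size


ratio = 1_000_000 / match_sequence_size(1_000_000)
print(round(ratio, 1))
```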

Decoding the `matchlength` reaches the end of the current sequence.
The next byte will be the start of another sequence.
But before moving to the next sequence,
it's time to use the decoded match position and length.
The decoder copies `matchlength` bytes from match position to current position.

In some cases, `matchlength` is larger than `offset`.
Therefore, `match_pos + matchlength > current_pos`,
which means that later bytes to copy are not yet decoded.
This is called an "overlap match", and must be handled with special care.
A common case is an offset of 1,
meaning the last byte is repeated `matchlength` times.
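
The decoding steps described in this section can be put together in a minimal sketch. It assumes an independent block (no dictionary, no previous block), favors clarity over speed, and does not enforce the end-of-block rules; the name `decode_block` is an assumption for illustration. Overlap matches are handled by copying one byte at a time, so bytes written earlier in the same copy are available to later iterations.

```python
def decode_block(block):
    """Decode one independent LZ4 block into bytes (illustrative only)."""
    out = bytearray()
    pos, end = 0, len(block)
    while pos < end:
        token = block[pos]
        pos += 1

        # Literal length: high 4 bits, extended while the added byte is 255.
        lit_len = token >> 4
        if lit_len == 15:
            while True:
                if pos >= end:
                    raise ValueError("truncated literal length")
                extra = block[pos]
                pos += 1
                lit_len += extra
                if extra != 255:
                    break
        if pos + lit_len > end:
            raise ValueError("literals run past the end of the block")
        out += block[pos:pos + lit_len]
        pos += lit_len

        if pos == end:
            break                      # last sequence: literals only

        # Offset: 2 bytes, little endian; 0 is invalid.
        if pos + 2 > end:
            raise ValueError("truncated offset")
        offset = block[pos] | (block[pos + 1] << 8)
        pos += 2
        if offset == 0 or offset > len(out):
            raise ValueError("invalid offset")

        # Match length: low 4 bits plus minmatch (4), extended like literals.
        match_len = (token & 0x0F) + 4
        if (token & 0x0F) == 15:
            while True:
                if pos >= end:
                    raise ValueError("truncated match length")
                extra = block[pos]
                pos += 1
                match_len += extra
                if extra != 255:
                    break

        # Byte-by-byte copy, valid even when match_len > offset (overlap).
        start = len(out) - offset
        for i in range(match_len):
            out.append(out[start + i])
    return bytes(out)


# One literal 'a', then a 19-byte match at offset 1, then 5 final literals.
sample = bytes([0x1F]) + b"a" + b"\x01\x00" + b"\x00" + bytes([0x50]) + b"bcdef"
print(decode_block(sample))
```

A production decoder would also bound the output size and avoid the per-byte loop for non-overlapping matches.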


End of block restrictions
-----------------------
There are specific rules required to terminate a block.

1. The last sequence contains only literals.
   The block ends right after them.
2. The last 5 bytes of input are always literals.
   Therefore, the last sequence contains at least 5 bytes.
   - Special : if input is smaller than 5 bytes,
     there is only one sequence, it contains the whole input as literals.
     Empty input can be represented with a zero byte,
     interpreted as a final token without literal and without a match.
3. The last match must start at least 12 bytes before the end of block.
   The last match is part of the penultimate sequence.
   It is followed by the last sequence, which contains only literals.
   - Note that, as a consequence,
     an independent block < 13 bytes cannot be compressed,
     because the match must copy "something",
     so it needs at least one prior byte.
   - When a block can reference data from another block,
     it can start immediately with a match and no literal,
     so a block of 12 bytes can be compressed.

When a block does not respect these end conditions,
a conformant decoder is allowed to reject the block as incorrect.

These rules are in place to ensure that a conformant decoder
can be designed for speed, speculatively issuing instructions,
while never reading nor writing beyond provided I/O buffers.
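
The simplest block that satisfies all of these rules is a single final sequence holding the whole input as literals. The sketch below builds such a block; `literals_only_block` is a hypothetical helper, and the result is valid but never smaller than the input.

```python
def literals_only_block(data):
    """Encode data as one final sequence: token, length bytes, literals."""
    n = len(data)
    if n < 15:
        header = bytearray([n << 4])       # length fits in the high 4 bits
    else:
        header = bytearray([15 << 4])
        remaining = n - 15
        while remaining >= 255:
            header.append(255)
            remaining -= 255
        header.append(remaining)
    return bytes(header) + bytes(data)


print(literals_only_block(b""))
print(len(literals_only_block(bytes(1000))))
```

For 1000 input bytes the block carries 5 bytes of overhead, in line with the expansion of roughly one byte per 255 literals.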


Additional notes
-----------------------
If the decoder will decompress data from an external source,
it is recommended to ensure that the decoder will not be vulnerable to
buffer overflow manipulations.
Always ensure that read and write operations
remain within the limits of provided buffers.
Test the decoder with fuzzers
to ensure it's resilient to improbable combinations.

The format places no assumption or limit on the way the compressor
searches and selects matches within the source data block.
Multiple techniques can be considered,
featuring distinct time / performance trade-offs.
As long as the format is respected,
the result will be compatible and decodable by any compliant decoder.
An upper compression limit can be reached,
using a technique called "full optimal parsing", at high CPU cost.

lz4/doc/lz4_Frame_format.md
Normal file, 433 lines
@@ -0,0 +1,433 @@
LZ4 Frame Format Description
============================

### Notices

Copyright (c) 2013-2015 Yann Collet

Permission is granted to copy and distribute this document
for any purpose and without charge,
including translations into other languages
and incorporation into compilations,
provided that the copyright notice and this notice are preserved,
and that any substantive changes or deletions from the original
are clearly marked.
Distribution of this document is unlimited.

### Version

1.6.2 (12/08/2020)


Introduction
------------

The purpose of this document is to define a lossless compressed data format
that is independent of CPU type, operating system,
file system and character set, suitable for
file compression, pipe and streaming compression
using the [LZ4 algorithm](http://www.lz4.org).

The data can be produced or consumed,
even for an arbitrarily long sequentially presented input data stream,
using only an a priori bounded amount of intermediate storage,
and hence can be used in data communications.
The format uses the LZ4 compression method,
and optional [xxHash-32 checksum method](https://github.com/Cyan4973/xxHash),
for detection of data corruption.

The data format defined by this specification
does not attempt to allow random access to compressed data.

This specification is intended for use by implementers of software
to compress data into LZ4 format and/or decompress data from LZ4 format.
The text of the specification assumes a basic background in programming
at the level of bits and other primitive data representations.

Unless otherwise indicated below,
a compliant compressor must produce data sets
that conform to the specifications presented here.
It doesn’t need to support all options though.

A compliant decompressor must be able to decompress
at least one working set of parameters
that conforms to the specifications presented here.
It may also ignore checksums.
Whenever it does not support a specific parameter within the compressed stream,
it must produce a non-ambiguous error code
and associated error message explaining which parameter is unsupported.


General Structure of LZ4 Frame format
-------------------------------------

| MagicNb | F. Descriptor | Block | (...) | EndMark | C. Checksum |
|:-------:|:-------------:| ----- | ----- | ------- | ----------- |
| 4 bytes | 3-15 bytes | | | 4 bytes | 0-4 bytes |

__Magic Number__

4 Bytes, Little endian format.
Value : 0x184D2204

__Frame Descriptor__

3 to 15 Bytes, to be detailed in its own paragraph,
as it is the most important part of the spec.

The combined _Magic_Number_ and _Frame_Descriptor_ fields are sometimes
called ___LZ4 Frame Header___. Its size varies between 7 and 19 bytes.

__Data Blocks__

To be detailed in its own paragraph.
That’s where compressed data is stored.

__EndMark__

The flow of blocks ends when the last data block is followed by
the 32-bit value `0x00000000`.

__Content Checksum__

_Content_Checksum_ verifies that the full content has been decoded correctly.
The content checksum is the result of the [xxHash-32 algorithm]
digesting the original (decoded) data as input, with a seed of zero.
Content checksum is only present when its associated flag
is set in the frame descriptor.
Content Checksum validates the result:
that all blocks were fully transmitted in the correct order and without error,
and also that the encoding/decoding process itself generated no distortion.
Its usage is recommended.

The combined _EndMark_ and _Content_Checksum_ fields might sometimes be
referred to as ___LZ4 Frame Footer___. Its size varies between 4 and 8 bytes.

__Frame Concatenation__

In some circumstances, it may be preferable to append multiple frames,
for example in order to add new data to an existing compressed file
without re-framing it.

In such a case, each frame has its own set of descriptor flags.
Each frame is considered independent.
The only relation between frames is their sequential order.

The ability to decode multiple concatenated frames
within a single stream or file
is left outside of this specification.
As an example, the reference lz4 command line utility behavior is
to decode all concatenated frames in their sequential order.


Frame Descriptor
----------------

| FLG | BD | (Content Size) | (Dictionary ID) | HC |
| ------- | ------- |:--------------:|:---------------:| ------- |
| 1 byte | 1 byte | 0 - 8 bytes | 0 - 4 bytes | 1 byte |

The descriptor uses a minimum of 3 bytes,
and up to 15 bytes depending on optional parameters.

__FLG byte__

| BitNb | 7-6 | 5 | 4 | 3 | 2 | 1 | 0 |
| ------- |-------|-------|----------|------|----------|----------|------|
|FieldName|Version|B.Indep|B.Checksum|C.Size|C.Checksum|*Reserved*|DictID|


__BD byte__

| BitNb | 7 | 6-5-4 | 3-2-1-0 |
| ------- | -------- | ------------- | -------- |
|FieldName|*Reserved*| Block MaxSize |*Reserved*|

In the tables, bit 7 is the highest bit, while bit 0 is the lowest.

__Version Number__

2-bit field, must be set to `01`.
Any other value cannot be decoded by this version of the specification.
Other version numbers will use different flag layouts.

__Block Independence flag__

If this flag is set to “1”, blocks are independent.
If this flag is set to “0”, each block depends on previous ones
(up to LZ4 window size, which is 64 KB).
In such a case, it’s necessary to decode all blocks in sequence.

Block dependency improves compression ratio, especially for small blocks.
On the other hand, it makes random access or multi-threaded decoding impossible.

__Block checksum flag__

If this flag is set, each data block will be followed by a 4-byte checksum,
calculated by using the xxHash-32 algorithm on the raw (compressed) data block.
The intention is to detect data corruption (storage or transmission errors)
immediately, before decoding.
Block checksum usage is optional.

__Content Size flag__

If this flag is set, the uncompressed size of data included within the frame
will be present as an 8-byte unsigned little-endian value, after the flags.
Content Size usage is optional.

__Content checksum flag__

If this flag is set, a 32-bit content checksum will be appended
after the EndMark.

__Dictionary ID flag__

If this flag is set, a 4-byte Dict-ID field will be present,
after the descriptor flags and the Content Size.

__Block Maximum Size__

This information is useful to help the decoder allocate memory.
Size here refers to the original (uncompressed) data size.
Block Maximum Size is one value from the following table :

| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| --- | --- | --- | --- | ----- | ------ | ---- | ---- |
| N/A | N/A | N/A | N/A | 64 KB | 256 KB | 1 MB | 4 MB |

The decoder may refuse to allocate block sizes above any system-specific size.
Unused values may be used in a future revision of the spec.
A decoder conformant with the current version of the spec
is only able to decode block sizes defined in this spec.

__Reserved bits__

Value of reserved bits **must** be 0 (zero).
Reserved bits might be used in a future version of the specification,
typically enabling new optional features.
When this happens, a decoder respecting the current specification version
shall not be able to decode such a frame.
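
The FLG and BD layouts above can be decoded with a few bit masks. The sketch below is illustrative only: `parse_flg_bd`, `BLOCK_MAX_SIZES` and the dictionary keys are names chosen here, not part of the specification. It rejects an unsupported version, non-zero reserved bits, and undefined block size codes, and derives the descriptor size from the optional-field flags.

```python
# Block MaxSize codes 4..7 from the table above; codes 0..3 are not defined.
BLOCK_MAX_SIZES = {4: 64 * 1024, 5: 256 * 1024, 6: 1024 * 1024, 7: 4 * 1024 * 1024}


def parse_flg_bd(flg, bd):
    """Decode the FLG and BD bytes of a frame descriptor into a dict."""
    version = (flg >> 6) & 0x03
    if version != 1:
        raise ValueError("unsupported version: %d" % version)
    if (flg & 0x02) or (bd & 0x8F):
        raise ValueError("reserved bits must be zero")
    size_code = (bd >> 4) & 0x07
    if size_code not in BLOCK_MAX_SIZES:
        raise ValueError("undefined block maximum size code: %d" % size_code)
    info = {
        "block_independence": bool(flg & 0x20),
        "block_checksum": bool(flg & 0x10),
        "content_size": bool(flg & 0x08),
        "content_checksum": bool(flg & 0x04),
        "dict_id": bool(flg & 0x01),
        "block_max_size": BLOCK_MAX_SIZES[size_code],
    }
    # FLG + BD + HC, plus 8 bytes of Content Size and 4 bytes of Dict-ID if present.
    info["descriptor_size"] = 3 + (8 if info["content_size"] else 0) \
                                + (4 if info["dict_id"] else 0)
    return info


print(parse_flg_bd(0x64, 0x40))
```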

__Content Size__

This is the original (uncompressed) size.
This information is optional, and only present if the associated flag is set.
Content size is provided as an unsigned 8-byte value, for a maximum of 16 Exabytes.
Format is Little endian.
This value is informational, typically for display or memory allocation.
It can be skipped by a decoder, or used to validate content correctness.

__Dictionary ID__

Dict-ID is only present if the associated flag is set.
It's an unsigned 32-bit value, stored using little-endian convention.
A dictionary is useful to compress short input sequences.
The compressor can take advantage of the dictionary context
to encode the input in a more compact manner.
It works as a kind of “known prefix” which is used by
both the compressor and the decompressor to “warm-up” reference tables.

The decompressor can use the Dict-ID identifier to determine
which dictionary must be used to correctly decode data.
The compressor and the decompressor must use exactly the same dictionary.
It's presumed that the 32-bit dictID uniquely identifies a dictionary.

Within a single frame, a single dictionary can be defined.
When the frame descriptor defines independent blocks,
each block will be initialized with the same dictionary.
If the frame descriptor defines linked blocks,
the dictionary will only be used once, at the beginning of the frame.

__Header Checksum__

One-byte checksum of combined descriptor fields, including optional ones.
The value is the second byte of `xxh32()` : ` (xxh32()>>8) & 0xFF `
using zero as a seed, and the full Frame Descriptor as an input
(including optional fields when they are present).
A wrong checksum indicates an error in the descriptor.
Header checksum is informational and can be skipped.
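
The header checksum can be computed without an external library, because the descriptor fields that are hashed (everything except the HC byte itself) total at most 14 bytes. The sketch below implements only the short-input path of XXH32 (inputs under 16 bytes), following the xxHash specification; `xxh32_short` and `header_checksum` are names assumed here, and a real implementation would normally call an existing xxHash library instead.

```python
import struct

_P1, _P2, _P3, _P4, _P5 = 0x9E3779B1, 0x85EBCA77, 0xC2B2AE3D, 0x27D4EB2F, 0x165667B1
_MASK = 0xFFFFFFFF


def _rotl(x, r):
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & _MASK


def xxh32_short(data, seed=0):
    """XXH32 restricted to inputs shorter than 16 bytes."""
    if len(data) >= 16:
        raise ValueError("this sketch only covers inputs under 16 bytes")
    h = (seed + _P5 + len(data)) & _MASK
    i = 0
    while i + 4 <= len(data):                      # 4-byte little-endian lanes
        (lane,) = struct.unpack_from("<I", data, i)
        h = (h + lane * _P3) & _MASK
        h = (_rotl(h, 17) * _P4) & _MASK
        i += 4
    while i < len(data):                           # remaining single bytes
        h = (h + data[i] * _P5) & _MASK
        h = (_rotl(h, 11) * _P1) & _MASK
        i += 1
    h ^= h >> 15                                   # final avalanche
    h = (h * _P2) & _MASK
    h ^= h >> 13
    h = (h * _P3) & _MASK
    h ^= h >> 16
    return h


def header_checksum(descriptor_without_hc):
    """Second byte of xxh32() over FLG, BD and any optional fields."""
    return (xxh32_short(descriptor_without_hc) >> 8) & 0xFF


print(hex(header_checksum(bytes([0x64, 0x40]))))
```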


Data Blocks
-----------

| Block Size | data | (Block Checksum) |
|:----------:| ------ |:----------------:|
| 4 bytes | | 0 - 4 bytes |


__Block Size__

This field uses 4 bytes, format is little-endian.

If the highest bit is set (`1`), the block is uncompressed.

If the highest bit is not set (`0`), the block is LZ4-compressed,
using the [LZ4 block format specification](https://github.com/lz4/lz4/blob/dev/doc/lz4_Block_format.md).

All other bits give the size, in bytes, of the data section.
The size does not include the block checksum if present.

_Block_Size_ shall never be larger than _Block_Maximum_Size_.
A compressed block could otherwise exceed this limit when the source is incompressible.
In such a case, the data block must be passed in uncompressed format.

A value of `0x00000000` is invalid, and signifies an _EndMark_ instead.
Note that this is different from a value of `0x80000000` (highest bit set),
which is an uncompressed block of size 0 (empty),
which is valid, and therefore doesn't end a frame.
Note that, if _Block_checksum_ is enabled,
even an empty block must be followed by a 32-bit block checksum.
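
The interpretation of the Block Size field can be sketched as follows. The function name `parse_block_size` and the returned labels are assumptions made for illustration.

```python
import struct


def parse_block_size(field):
    """Interpret a 4-byte Block Size field as (kind, size in bytes)."""
    (value,) = struct.unpack("<I", field)
    if value == 0:
        return ("endmark", 0)                 # 0x00000000 ends the flow of blocks
    if value & 0x80000000:
        return ("uncompressed", value & 0x7FFFFFFF)
    return ("compressed", value)


print(parse_block_size(b"\x00\x00\x00\x80"))
```

A decoder would additionally compare the returned size against the Block Maximum Size announced in the BD byte.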

__Data__

Where the actual data to decode stands.
It might be compressed or not, depending on previous field indications.

When compressed, the data must respect the [LZ4 block format specification](https://github.com/lz4/lz4/blob/dev/doc/lz4_Block_format.md).

Note that a block is not necessarily full.
Uncompressed size of data can be any size __up to__ _Block_Maximum_Size_,
so it may contain less data than the maximum block size.

__Block checksum__

Only present if the associated flag is set.
This is a 4-byte checksum value, in little-endian format,
calculated by using the [xxHash-32 algorithm] on the __raw__ (undecoded) data block,
with a seed of zero.
The intention is to detect data corruption (storage or transmission errors)
before decoding.

_Block_checksum_ can be used together with _Content_checksum_.

[xxHash-32 algorithm]: https://github.com/Cyan4973/xxHash/blob/release/doc/xxhash_spec.md


Skippable Frames
----------------

| Magic Number | Frame Size | User Data |
|:------------:|:----------:| --------- |
| 4 bytes | 4 bytes | |

Skippable frames allow the integration of user-defined data
into a flow of concatenated frames.
Their design is straightforward,
with the sole objective to allow the decoder to quickly skip
over user-defined data and continue decoding.

For the purpose of facilitating identification,
it is discouraged to start a flow of concatenated frames with a skippable frame.
If there is a need to start such a flow with some user data
encapsulated into a skippable frame,
it’s recommended to start with a zero-byte LZ4 frame
followed by a skippable frame.
This will make it easier for file type identifiers.


__Magic Number__

4 Bytes, Little endian format.
Value : 0x184D2A5X, which means any value from 0x184D2A50 to 0x184D2A5F.
All 16 values are valid to identify a skippable frame.

__Frame Size__

This is the size, in bytes, of the following User Data
(without including the magic number nor the size field itself).
4 Bytes, Little endian format, unsigned 32-bit.
This means User Data can’t be bigger than (2^32-1) Bytes.

__User Data__

User Data can be anything. Data will just be skipped by the decoder.
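
Skipping such a frame only requires reading its 8-byte header. The sketch below assumes the whole stream is held in memory as a bytes object; `skip_skippable_frame` is a hypothetical helper name.

```python
import struct


def skip_skippable_frame(stream, pos):
    """Return the position just past the skippable frame starting at pos."""
    magic, size = struct.unpack_from("<II", stream, pos)
    if not 0x184D2A50 <= magic <= 0x184D2A5F:
        raise ValueError("not a skippable frame")
    end = pos + 8 + size                      # magic + size field + user data
    if end > len(stream):
        raise ValueError("truncated skippable frame")
    return end


stream = struct.pack("<II", 0x184D2A53, 3) + b"abc" + b"next frame"
print(skip_skippable_frame(stream, 0))
```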


Legacy frame
------------

The Legacy frame format was defined in the initial versions of “LZ4Demo”.
Newer compressors should not use this format anymore, as it is too restrictive.

Main characteristics of the legacy format :

- Fixed block size : 8 MB.
- All blocks must be completely filled, except the last one.
- All blocks are always compressed, even when compression is detrimental.
- The last block is detected either because
  it is followed by the “EOF” (End of File) mark,
  or because it is followed by a known Frame Magic Number.
- No checksum
- Convention is Little endian

| MagicNb | B.CSize | CData | B.CSize | CData | (...) | EndMark |
| ------- | ------- | ----- | ------- | ----- | ------- | ------- |
| 4 bytes | 4 bytes | CSize | 4 bytes | CSize | x times | EOF |


__Magic Number__

4 Bytes, Little endian format.
Value : 0x184C2102
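
With the three magic values given in this document (standard frame, skippable frame range, legacy frame), a reader can decide how to handle the next four bytes of a stream. The sketch below is one possible dispatch; the function name and the returned labels are assumptions.

```python
import struct

LZ4_FRAME_MAGIC = 0x184D2204
LZ4_LEGACY_MAGIC = 0x184C2102
SKIPPABLE_MAGIC_MIN, SKIPPABLE_MAGIC_MAX = 0x184D2A50, 0x184D2A5F


def classify_magic(first4):
    """Classify a 4-byte little-endian magic number."""
    (magic,) = struct.unpack("<I", first4)
    if magic == LZ4_FRAME_MAGIC:
        return "frame"
    if SKIPPABLE_MAGIC_MIN <= magic <= SKIPPABLE_MAGIC_MAX:
        return "skippable"
    if magic == LZ4_LEGACY_MAGIC:
        return "legacy"
    return "unknown"


print(classify_magic(b"\x04\x22\x4d\x18"))
```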

__Block Compressed Size__

This is the size, in bytes, of the following compressed data block.
4 Bytes, Little endian format.

__Data__

Where the actual compressed data stands.
Data is always compressed, even when compression is detrimental.

__EndMark__

End of legacy frame is implicit only.
It must be followed by a standard EOF (End Of File) signal,
whether it is a file or a stream.

Alternatively, if the frame is followed by a valid Frame Magic Number,
it is considered completed.
This policy makes it possible to concatenate legacy frames.

Any other value will be interpreted as a block size,
and trigger an error if it does not fit within acceptable range.


Version changes
---------------

1.6.2 : clarifies specification of _EndMark_

1.6.1 : introduced terms "LZ4 Frame Header" and "LZ4 Frame Footer"

1.6.0 : restored Dictionary ID field in Frame header

1.5.1 : changed document format to MarkDown

1.5 : removed Dictionary ID from specification

1.4.1 : changed wording from “stream” to “frame”

1.4 : added skippable streams, re-added stream checksum

1.3 : modified header checksum

1.2 : reduced choice of “block size”, to postpone decision on “dynamic size of BlockSize Field”.

1.1 : optional fields are now part of the descriptor

1.0 : changed “block size” specification, adding a compressed/uncompressed flag

0.9 : reduced scale of “block maximum size” table

0.8 : removed : high compression flag

0.7 : removed : stream checksum

0.6 : settled : stream size uses 8 bytes, endian convention is little endian

0.5 : added copyright notice

0.4 : changed format to Google Doc compatible OpenDocument

lz4/doc/lz4_manual.html
Normal file, 597 lines
@@ -0,0 +1,597 @@
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>1.9.3 Manual</title>
</head>
<body>
<h1>1.9.3 Manual</h1>
<hr>
<a name="Contents"></a><h2>Contents</h2>
<ol>
<li><a href="#Chapter1">Introduction</a></li>
<li><a href="#Chapter2">Version</a></li>
<li><a href="#Chapter3">Tuning parameter</a></li>
<li><a href="#Chapter4">Simple Functions</a></li>
<li><a href="#Chapter5">Advanced Functions</a></li>
<li><a href="#Chapter6">Streaming Compression Functions</a></li>
<li><a href="#Chapter7">Streaming Decompression Functions</a></li>
<li><a href="#Chapter8">Experimental section</a></li>
<li><a href="#Chapter9">Private Definitions</a></li>
<li><a href="#Chapter10">Obsolete Functions</a></li>
</ol>
<hr>
<a name="Chapter1"></a><h2>Introduction</h2><pre>
LZ4 is a lossless compression algorithm, providing compression speed >500 MB/s per core,
scalable with multi-core CPUs. It features an extremely fast decoder, with speed in
multiple GB/s per core, typically reaching RAM speed limits on multi-core systems.

The LZ4 compression library provides in-memory compression and decompression functions.
It gives full buffer control to the user.
Compression can be done in:
- a single step (described as Simple Functions)
- a single step, reusing a context (described in Advanced Functions)
- unbounded multiple steps (described as Streaming compression)

lz4.h generates and decodes LZ4-compressed blocks (doc/lz4_Block_format.md).
Decompressing such a compressed block requires additional metadata.
The exact metadata depends on the decompression function used.
For the typical case of LZ4_decompress_safe(),
metadata includes the block's compressed size, and a maximum bound of the decompressed size.
Each application is free to encode and pass such metadata in whichever way it wants.

lz4.h only handles blocks; it cannot generate Frames.

Blocks are different from Frames (doc/lz4_Frame_format.md).
Frames bundle both blocks and metadata in a specified manner.
Embedding metadata is required for compressed data to be self-contained and portable.
Frame format is delivered through a companion API, declared in lz4frame.h.
The `lz4` CLI can only manage frames.
<BR></pre>

<a name="Chapter2"></a><h2>Version</h2><pre></pre>

<pre><b>int LZ4_versionNumber (void); </b>/**< library version number; useful to check dll version */<b>
</b></pre><BR>
<pre><b>const char* LZ4_versionString (void); </b>/**< library version string; useful to check dll version */<b>
</b></pre><BR>
<a name="Chapter3"></a><h2>Tuning parameter</h2><pre></pre>

<pre><b>#ifndef LZ4_MEMORY_USAGE
# define LZ4_MEMORY_USAGE 14
#endif
</b><p> Memory usage formula : N->2^N Bytes (examples : 10 -> 1KB; 12 -> 4KB ; 16 -> 64KB; 20 -> 1MB; etc.)
Increasing memory usage improves compression ratio.
Reduced memory usage may improve speed, thanks to better cache locality.
Default value is 14, for 16KB, which nicely fits into Intel x86 L1 cache

</p></pre><BR>

<a name="Chapter4"></a><h2>Simple Functions</h2><pre></pre>
|
||||
|
||||
<pre><b>int LZ4_compress_default(const char* src, char* dst, int srcSize, int dstCapacity);
|
||||
</b><p> Compresses 'srcSize' bytes from buffer 'src'
|
||||
into already allocated 'dst' buffer of size 'dstCapacity'.
|
||||
Compression is guaranteed to succeed if 'dstCapacity' >= LZ4_compressBound(srcSize).
|
||||
It also runs faster, so it's a recommended setting.
|
||||
If the function cannot compress 'src' into a more limited 'dst' budget,
|
||||
compression stops *immediately*, and the function result is zero.
|
||||
In which case, 'dst' content is undefined (invalid).
|
||||
srcSize : max supported value is LZ4_MAX_INPUT_SIZE.
|
||||
dstCapacity : size of buffer 'dst' (which must be already allocated)
|
||||
@return : the number of bytes written into buffer 'dst' (necessarily <= dstCapacity)
|
||||
or 0 if compression fails
|
||||
Note : This function is protected against buffer overflow scenarios (never writes outside 'dst' buffer, nor read outside 'source' buffer).
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>int LZ4_decompress_safe (const char* src, char* dst, int compressedSize, int dstCapacity);
|
||||
</b><p> compressedSize : is the exact complete size of the compressed block.
|
||||
dstCapacity : is the size of destination buffer (which must be already allocated), presumed an upper bound of decompressed size.
|
||||
@return : the number of bytes decompressed into destination buffer (necessarily <= dstCapacity)
|
||||
If destination buffer is not large enough, decoding will stop and output an error code (negative value).
|
||||
If the source stream is detected malformed, the function will stop decoding and return a negative result.
|
||||
Note 1 : This function is protected against malicious data packets :
|
||||
it will never writes outside 'dst' buffer, nor read outside 'source' buffer,
|
||||
even if the compressed block is maliciously modified to order the decoder to do these actions.
|
||||
In such case, the decoder stops immediately, and considers the compressed block malformed.
|
||||
Note 2 : compressedSize and dstCapacity must be provided to the function, the compressed block does not contain them.
|
||||
The implementation is free to send / store / derive this information in whichever way is most beneficial.
|
||||
If there is a need for a different format which bundles together both compressed data and its metadata, consider looking at lz4frame.h instead.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<a name="Chapter5"></a><h2>Advanced Functions</h2><pre></pre>
|
||||
|
||||
<pre><b>int LZ4_compressBound(int inputSize);
|
||||
</b><p> Provides the maximum size that LZ4 compression may output in a "worst case" scenario (input data not compressible)
|
||||
This function is primarily useful for memory allocation purposes (destination buffer size).
|
||||
Macro LZ4_COMPRESSBOUND() is also provided for compilation-time evaluation (stack memory allocation for example).
|
||||
Note that LZ4_compress_default() compresses faster when dstCapacity is >= LZ4_compressBound(srcSize)
|
||||
inputSize : max supported value is LZ4_MAX_INPUT_SIZE
|
||||
return : maximum output size in a "worst case" scenario
|
||||
or 0, if input size is incorrect (too large or negative)
|
||||
</p></pre><BR>
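As an illustration of the worst-case bound described above, here is a sketch that replicates the arithmetic locally. It assumes that LZ4_COMPRESSBOUND() expands to `isize + isize/255 + 16` and that LZ4_MAX_INPUT_SIZE is 0x7E000000, as in lz4.h for the 1.9.x series; check your own header. The `sketch_*` names are hypothetical, and real code should call LZ4_compressBound() directly.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed value of LZ4_MAX_INPUT_SIZE; verify against your lz4.h. */
#define SKETCH_MAX_INPUT_SIZE 0x7E000000

/* Worst-case compressed size, or 0 if inputSize is negative or too large. */
static int sketch_compress_bound(int inputSize)
{
    /* The unsigned cast makes negative values compare as "too large". */
    if ((unsigned)inputSize > (unsigned)SKETCH_MAX_INPUT_SIZE) return 0;
    return inputSize + (inputSize / 255) + 16;
}

/* Allocates a destination buffer that can hold any compressed form of
 * srcSize input bytes. Returns NULL on invalid size or allocation failure;
 * *capacity receives the buffer size (0 on invalid size). */
static char* sketch_alloc_dst(int srcSize, int* capacity)
{
    assert(capacity != NULL);
    *capacity = sketch_compress_bound(srcSize);
    if (*capacity == 0) return NULL;
    return (char*)malloc((size_t)*capacity);
}
```

With these assumptions, a 1000-byte input needs at most 1019 bytes of destination space, and an invalid size yields a bound of 0 rather than a negative value.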
|
||||
|
||||
<pre><b>int LZ4_compress_fast (const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);
|
||||
</b><p> Same as LZ4_compress_default(), but allows selection of "acceleration" factor.
|
||||
The larger the acceleration value, the faster the algorithm, but also the lesser the compression.
|
||||
It's a trade-off. It can be fine-tuned, with each successive value providing roughly +3% speed.
|
||||
An acceleration value of "1" is the same as regular LZ4_compress_default()
|
||||
Values <= 0 will be replaced by LZ4_ACCELERATION_DEFAULT (currently == 1, see lz4.c).
|
||||
Values > LZ4_ACCELERATION_MAX will be replaced by LZ4_ACCELERATION_MAX (currently == 65537, see lz4.c).
|
||||
</p></pre><BR>
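The two replacement rules above amount to clamping the acceleration value. A minimal sketch, using the constants quoted in the text (1 and 65537); `sketch_effective_acceleration` is a hypothetical helper for illustration, not a library function.

```c
#include <assert.h>

/* Values quoted in the text above for LZ4_ACCELERATION_DEFAULT / _MAX. */
enum {
    SKETCH_ACCELERATION_DEFAULT = 1,
    SKETCH_ACCELERATION_MAX     = 65537
};

/* Returns the acceleration value the compressor would effectively use. */
static int sketch_effective_acceleration(int acceleration)
{
    int a = acceleration;
    if (a <= 0) a = SKETCH_ACCELERATION_DEFAULT;                   /* too small: default */
    if (a > SKETCH_ACCELERATION_MAX) a = SKETCH_ACCELERATION_MAX;  /* too large: max */
    assert(a >= 1 && a <= SKETCH_ACCELERATION_MAX);
    return a;
}
```

A caller can therefore pass 0 to mean "default" without special-casing it.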
|
||||
|
||||
<pre><b>int LZ4_sizeofState(void);
|
||||
int LZ4_compress_fast_extState (void* state, const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);
|
||||
</b><p> Same as LZ4_compress_fast(), using an externally allocated memory space for its state.
|
||||
Use LZ4_sizeofState() to know how much memory must be allocated,
|
||||
and allocate it on 8-byte boundaries (using `malloc()` typically).
|
||||
Then, provide this buffer as `void* state` to compression function.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>int LZ4_compress_destSize (const char* src, char* dst, int* srcSizePtr, int targetDstSize);
|
||||
</b><p> Reverse the logic : compresses as much data as possible from 'src' buffer
|
||||
into already allocated buffer 'dst', of size >= 'targetDstSize'.
|
||||
This function either compresses the entire 'src' content into 'dst' if it's large enough,
|
||||
or fills 'dst' buffer completely with as much data as possible from 'src'.
|
||||
note: acceleration parameter is fixed to "default".
|
||||
|
||||
*srcSizePtr : will be modified to indicate how many bytes were read from 'src' to fill 'dst'.
|
||||
New value is necessarily <= input value.
|
||||
@return : Nb bytes written into 'dst' (necessarily <= targetDstSize)
|
||||
or 0 if compression fails.
|
||||
|
||||
Note : from v1.8.2 to v1.9.1, this function had a bug (fixed in v1.9.2+):
|
||||
the produced compressed content could, in specific circumstances,
|
||||
require to be decompressed into a destination buffer larger
|
||||
by at least 1 byte than the content to decompress.
|
||||
If an application uses `LZ4_compress_destSize()`,
|
||||
it's highly recommended to update liblz4 to v1.9.2 or better.
|
||||
If this can't be done or ensured,
|
||||
the receiving decompression function should provide
|
||||
a dstCapacity which is > decompressedSize, by at least 1 byte.
|
||||
See https://github.com/lz4/lz4/issues/859 for details
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>int LZ4_decompress_safe_partial (const char* src, char* dst, int srcSize, int targetOutputSize, int dstCapacity);
|
||||
</b><p> Decompress an LZ4 compressed block, of size 'srcSize' at position 'src',
|
||||
into destination buffer 'dst' of size 'dstCapacity'.
|
||||
Up to 'targetOutputSize' bytes will be decoded.
|
||||
The function stops decoding on reaching this objective.
|
||||
This can be useful to boost performance
|
||||
whenever only the beginning of a block is required.
|
||||
|
||||
@return : the number of bytes decoded in `dst` (necessarily <= targetOutputSize)
|
||||
If source stream is detected malformed, function returns a negative result.
|
||||
|
||||
Note 1 : @return can be < targetOutputSize, if compressed block contains less data.
|
||||
|
||||
Note 2 : targetOutputSize must be <= dstCapacity
|
||||
|
||||
Note 3 : this function effectively stops decoding on reaching targetOutputSize,
|
||||
so dstCapacity is kind of redundant.
|
||||
This is because in older versions of this function,
|
||||
decoding operation would still write complete sequences.
|
||||
Therefore, there was no guarantee that it would stop writing at exactly targetOutputSize,
|
||||
it could write more bytes, though only up to dstCapacity.
|
||||
Some "margin" used to be required for this operation to work properly.
|
||||
Thankfully, this is no longer necessary.
|
||||
The function nonetheless keeps the same signature, in an effort to preserve API compatibility.
|
||||
|
||||
Note 4 : If srcSize is the exact size of the block,
|
||||
then targetOutputSize can be any value,
|
||||
including larger than the block's decompressed size.
|
||||
The function will, at most, generate block's decompressed size.
|
||||
|
||||
Note 5 : If srcSize is _larger_ than block's compressed size,
|
||||
then targetOutputSize **MUST** be <= block's decompressed size.
|
||||
Otherwise, *silent corruption will occur*.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<a name="Chapter6"></a><h2>Streaming Compression Functions</h2><pre></pre>
|
||||
|
||||
<pre><b>void LZ4_resetStream_fast (LZ4_stream_t* streamPtr);
|
||||
</b><p> Use this to prepare an LZ4_stream_t for a new chain of dependent blocks
|
||||
(e.g., LZ4_compress_fast_continue()).
|
||||
|
||||
An LZ4_stream_t must be initialized once before usage.
|
||||
This is automatically done when created by LZ4_createStream().
|
||||
However, should the LZ4_stream_t be simply declared on stack (for example),
|
||||
it's necessary to initialize it first, using LZ4_initStream().
|
||||
|
||||
After init, start any new stream with LZ4_resetStream_fast().
|
||||
The same LZ4_stream_t can be re-used multiple times consecutively
|
||||
and compress multiple streams,
|
||||
provided that it starts each new stream with LZ4_resetStream_fast().
|
||||
|
||||
LZ4_resetStream_fast() is much faster than LZ4_initStream(),
|
||||
but is not compatible with memory regions containing garbage data.
|
||||
|
||||
Note: it's only useful to call LZ4_resetStream_fast()
|
||||
in the context of streaming compression.
|
||||
The *extState* functions perform their own resets.
|
||||
Invoking LZ4_resetStream_fast() before is redundant, and even counterproductive.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>int LZ4_loadDict (LZ4_stream_t* streamPtr, const char* dictionary, int dictSize);
|
||||
</b><p> Use this function to reference a static dictionary into LZ4_stream_t.
|
||||
The dictionary must remain available during compression.
|
||||
LZ4_loadDict() triggers a reset, so any previous data will be forgotten.
|
||||
The same dictionary will have to be loaded on decompression side for successful decoding.
|
||||
Dictionaries are useful for better compression of small data (KB range).
|
||||
While LZ4 accepts any input as dictionary,
|
||||
results are generally better when using Zstandard's Dictionary Builder.
|
||||
Loading a size of 0 is allowed, and is the same as reset.
|
||||
@return : loaded dictionary size, in bytes (necessarily <= 64 KB)
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>int LZ4_compress_fast_continue (LZ4_stream_t* streamPtr, const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);
|
||||
</b><p> Compress 'src' content using data from previously compressed blocks, for better compression ratio.
|
||||
'dst' buffer must be already allocated.
|
||||
If dstCapacity >= LZ4_compressBound(srcSize), compression is guaranteed to succeed, and runs faster.
|
||||
|
||||
@return : size of compressed block
|
||||
or 0 if there is an error (typically, cannot fit into 'dst').
|
||||
|
||||
Note 1 : Each invocation to LZ4_compress_fast_continue() generates a new block.
|
||||
Each block has precise boundaries.
|
||||
Each block must be decompressed separately, calling LZ4_decompress_*() with relevant metadata.
|
||||
It's not possible to append blocks together and expect a single invocation of LZ4_decompress_*() to decompress them together.
|
||||
|
||||
Note 2 : The previous 64KB of source data is __assumed__ to remain present, unmodified, at same address in memory !
|
||||
|
||||
Note 3 : When input is structured as a double-buffer, each buffer can have any size, including < 64 KB.
|
||||
Make sure that buffers are separated, by at least one byte.
|
||||
This construction ensures that each block only depends on previous block.
|
||||
|
||||
Note 4 : If input buffer is a ring-buffer, it can have any size, including < 64 KB.
|
||||
|
||||
Note 5 : After an error, the stream status is undefined (invalid), it can only be reset or freed.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>int LZ4_saveDict (LZ4_stream_t* streamPtr, char* safeBuffer, int maxDictSize);
|
||||
</b><p> If last 64KB data cannot be guaranteed to remain available at its current memory location,
|
||||
save it into a safer place (char* safeBuffer).
|
||||
This is schematically equivalent to a memcpy() followed by LZ4_loadDict(),
|
||||
but is much faster, because LZ4_saveDict() doesn't need to rebuild tables.
|
||||
@return : saved dictionary size in bytes (necessarily <= maxDictSize), or 0 if error.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<a name="Chapter7"></a><h2>Streaming Decompression Functions</h2><pre> Bufferless synchronous API
|
||||
<BR></pre>
|
||||
|
||||
<pre><b>LZ4_streamDecode_t* LZ4_createStreamDecode(void);
|
||||
int LZ4_freeStreamDecode (LZ4_streamDecode_t* LZ4_stream);
|
||||
</b><p> creation / destruction of streaming decompression tracking context.
|
||||
A tracking context can be re-used multiple times.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>int LZ4_setStreamDecode (LZ4_streamDecode_t* LZ4_streamDecode, const char* dictionary, int dictSize);
|
||||
</b><p> An LZ4_streamDecode_t context can be allocated once and re-used multiple times.
|
||||
Use this function to start decompression of a new stream of blocks.
|
||||
A dictionary can optionally be set. Use NULL or size 0 for a reset order.
|
||||
Dictionary is presumed stable : it must remain accessible and unmodified during next decompression.
|
||||
@return : 1 if OK, 0 if error
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>int LZ4_decoderRingBufferSize(int maxBlockSize);
|
||||
#define LZ4_DECODER_RING_BUFFER_SIZE(maxBlockSize) (65536 + 14 + (maxBlockSize)) </b>/* for static allocation; maxBlockSize presumed valid */<b>
|
||||
</b><p> Note : in a ring buffer scenario (optional),
|
||||
blocks are presumed decompressed next to each other
|
||||
up to the moment there is not enough remaining space for next block (remainingSize < maxBlockSize),
|
||||
at which stage it resumes from beginning of ring buffer.
|
||||
When setting such a ring buffer for streaming decompression,
|
||||
provides the minimum size of this ring buffer
|
||||
to be compatible with any source respecting maxBlockSize condition.
|
||||
@return : minimum ring buffer size,
|
||||
or 0 if there is an error (invalid maxBlockSize).
|
||||
|
||||
</p></pre><BR>
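The wrap-around rule described above (resume at the beginning once fewer than maxBlockSize bytes remain) can be sketched as follows. The size macro is copied from the definition shown above; `sketch_next_ring_offset` is a hypothetical helper, not part of the library.

```c
#include <assert.h>

/* Same expression as LZ4_DECODER_RING_BUFFER_SIZE() shown above. */
#define SKETCH_RING_BUFFER_SIZE(maxBlockSize) (65536 + 14 + (maxBlockSize))

/* Given the offset just past the last decoded block, returns the offset
 * at which the next block should be decoded: the same offset while a full
 * maxBlockSize still fits, otherwise 0 (start of the ring buffer). */
static int sketch_next_ring_offset(int offset, int ringSize, int maxBlockSize)
{
    int remaining;
    assert(maxBlockSize > 0 && ringSize >= maxBlockSize);
    assert(offset >= 0 && offset <= ringSize);
    remaining = ringSize - offset;
    return (remaining < maxBlockSize) ? 0 : offset;
}
```

For maxBlockSize == 4096 the ring buffer is 69646 bytes: decoding continues in place up to offset 65550 and wraps to 0 from offset 65551 onward.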
|
||||
|
||||
<pre><b>int LZ4_decompress_safe_continue (LZ4_streamDecode_t* LZ4_streamDecode, const char* src, char* dst, int srcSize, int dstCapacity);
|
||||
</b><p> These decoding functions allow decompression of consecutive blocks in "streaming" mode.
|
||||
A block is an unsplittable entity, it must be presented entirely to a decompression function.
|
||||
Decompression functions only accept one block at a time.
|
||||
The last 64KB of previously decoded data *must* remain available and unmodified at the memory position where they were decoded.
|
||||
If less than 64KB of data has been decoded, all the data must be present.
|
||||
|
||||
Special : if decompression side sets a ring buffer, it must respect one of the following conditions :
|
||||
- Decompression buffer size is _at least_ LZ4_decoderRingBufferSize(maxBlockSize).
|
||||
maxBlockSize is the maximum size of any single block. It can have any value > 16 bytes.
|
||||
In which case, encoding and decoding buffers do not need to be synchronized.
|
||||
Actually, data can be produced by any source compliant with LZ4 format specification, and respecting maxBlockSize.
|
||||
- Synchronized mode :
|
||||
Decompression buffer size is _exactly_ the same as compression buffer size,
|
||||
and follows exactly same update rule (block boundaries at same positions),
|
||||
and decoding function is provided with exact decompressed size of each block (exception for last block of the stream),
|
||||
_then_ decoding & encoding ring buffer can have any size, including small ones ( < 64 KB).
|
||||
- Decompression buffer is larger than encoding buffer, by a minimum of maxBlockSize more bytes.
|
||||
In which case, encoding and decoding buffers do not need to be synchronized,
|
||||
and encoding ring buffer can have any size, including small ones ( < 64 KB).
|
||||
|
||||
Whenever these conditions are not possible,
|
||||
save the last 64KB of decoded data into a safe buffer where it can't be modified during decompression,
|
||||
then indicate where this data is saved using LZ4_setStreamDecode(), before decompressing next block.
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>int LZ4_decompress_safe_usingDict (const char* src, char* dst, int srcSize, int dstCapacity, const char* dictStart, int dictSize);
|
||||
</b><p> These decoding functions work the same as
|
||||
a combination of LZ4_setStreamDecode() followed by LZ4_decompress_*_continue()
|
||||
They are stand-alone, and don't need an LZ4_streamDecode_t structure.
|
||||
Dictionary is presumed stable : it must remain accessible and unmodified during decompression.
|
||||
Performance tip : Decompression speed can be substantially increased
|
||||
when dst == dictStart + dictSize.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<a name="Chapter8"></a><h2>Experimental section</h2><pre>
|
||||
Symbols declared in this section must be considered unstable. Their
|
||||
signatures or semantics may change, or they may be removed altogether in the
|
||||
future. They are therefore only safe to depend on when the caller is
|
||||
statically linked against the library.
|
||||
|
||||
To protect against unsafe usage, not only are the declarations guarded,
|
||||
the definitions are hidden by default
|
||||
when building LZ4 as a shared/dynamic library.
|
||||
|
||||
In order to access these declarations,
|
||||
define LZ4_STATIC_LINKING_ONLY in your application
|
||||
before including LZ4's headers.
|
||||
|
||||
In order to make their implementations accessible dynamically, you must
|
||||
define LZ4_PUBLISH_STATIC_FUNCTIONS when building the LZ4 library.
|
||||
<BR></pre>
|
||||
|
||||
<pre><b>LZ4LIB_STATIC_API int LZ4_compress_fast_extState_fastReset (void* state, const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);
|
||||
</b><p> A variant of LZ4_compress_fast_extState().
|
||||
|
||||
Using this variant avoids an expensive initialization step.
|
||||
It is only safe to call if the state buffer is known to be correctly initialized already
|
||||
(see above comment on LZ4_resetStream_fast() for a definition of "correctly initialized").
|
||||
From a high level, the difference is that
|
||||
this function initializes the provided state with a call to something like LZ4_resetStream_fast()
|
||||
while LZ4_compress_fast_extState() starts with a call to LZ4_resetStream().
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>LZ4LIB_STATIC_API void LZ4_attach_dictionary(LZ4_stream_t* workingStream, const LZ4_stream_t* dictionaryStream);
|
||||
</b><p> This is an experimental API that allows
|
||||
efficient use of a static dictionary many times.
|
||||
|
||||
Rather than re-loading the dictionary buffer into a working context before
|
||||
each compression, or copying a pre-loaded dictionary's LZ4_stream_t into a
|
||||
working LZ4_stream_t, this function introduces a no-copy setup mechanism,
|
||||
in which the working stream references the dictionary stream in-place.
|
||||
|
||||
Several assumptions are made about the state of the dictionary stream.
|
||||
Currently, only streams which have been prepared by LZ4_loadDict() should
|
||||
be expected to work.
|
||||
|
||||
Alternatively, the provided dictionaryStream may be NULL,
|
||||
in which case any existing dictionary stream is unset.
|
||||
|
||||
If a dictionary is provided, it replaces any pre-existing stream history.
|
||||
The dictionary contents are the only history that can be referenced and
|
||||
logically immediately precede the data compressed in the first subsequent
|
||||
compression call.
|
||||
|
||||
The dictionary will only remain attached to the working stream through the
|
||||
first compression call, at the end of which it is cleared. The dictionary
|
||||
stream (and source buffer) must remain in-place / accessible / unchanged
|
||||
through the completion of the first compression call on the stream.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b></b><p>
|
||||
It's possible to have input and output sharing the same buffer,
|
||||
for highly constrained memory environments.
|
||||
In both cases, it requires input to lie at the end of the buffer,
|
||||
and decompression to start at beginning of the buffer.
|
||||
Buffer size must feature some margin, hence be larger than final size.
|
||||
|
||||
|<------------------------buffer--------------------------------->|
|
||||
|<-----------compressed data--------->|
|
||||
|<-----------decompressed size------------------>|
|
||||
|<----margin---->|
|
||||
|
||||
This technique is more useful for decompression,
|
||||
since decompressed size is typically larger,
|
||||
and margin is short.
|
||||
|
||||
In-place decompression will work inside any buffer
|
||||
whose size is >= LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE(decompressedSize).
|
||||
This presumes that decompressedSize > compressedSize.
|
||||
Otherwise, it means compression actually expanded data,
|
||||
and it would be more efficient to store such data with a flag indicating it's not compressed.
|
||||
This can happen when data is not compressible (already compressed, or encrypted).
|
||||
|
||||
For in-place compression, margin is larger, as it must be able to cope with both
|
||||
history preservation, requiring input data to remain unmodified up to LZ4_DISTANCE_MAX,
|
||||
and data expansion, which can happen when input is not compressible.
|
||||
As a consequence, buffer size requirements are much higher,
|
||||
and memory savings offered by in-place compression are more limited.
|
||||
|
||||
There are ways to limit this cost for compression :
|
||||
- Reduce history size, by modifying LZ4_DISTANCE_MAX.
|
||||
Note that it is a compile-time constant, so all compressions will apply this limit.
|
||||
Lower values will reduce compression ratio, except when input_size < LZ4_DISTANCE_MAX,
|
||||
so it's a reasonable trick when inputs are known to be small.
|
||||
- Require the compressor to deliver a "maximum compressed size".
|
||||
This is the `dstCapacity` parameter in `LZ4_compress*()`.
|
||||
When this size is < LZ4_COMPRESSBOUND(inputSize), then compression can fail,
|
||||
in which case, the return code will be 0 (zero).
|
||||
The caller must be ready for these cases to happen,
|
||||
and typically design a backup scheme to send data uncompressed.
|
||||
The combination of both techniques can significantly reduce
|
||||
the amount of margin required for in-place compression.
|
||||
|
||||
In-place compression can work in any buffer
|
||||
whose size is >= (maxCompressedSize)
|
||||
with maxCompressedSize == LZ4_COMPRESSBOUND(srcSize) for guaranteed compression success.
|
||||
LZ4_COMPRESS_INPLACE_BUFFER_SIZE() depends on both maxCompressedSize and LZ4_DISTANCE_MAX,
|
||||
so it's possible to reduce memory requirements by playing with them.
|
||||
|
||||
</p></pre><BR>
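The in-place decompression layout from the diagram above can be computed as in the sketch below. It assumes that LZ4_DECOMPRESS_INPLACE_MARGIN(size) is `((size) >> 8) + 32`, which is how it is defined in lz4.h as far as I know; treat that expression as an assumption and prefer the real macros. `inplace_layout_t` and `sketch_inplace_decompress_layout` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to mirror LZ4_DECOMPRESS_INPLACE_MARGIN(); verify in your lz4.h. */
#define SKETCH_DECOMPRESS_INPLACE_MARGIN(size) (((size) >> 8) + 32)
#define SKETCH_DECOMPRESS_INPLACE_BUFFER_SIZE(decompressedSize) \
    ((decompressedSize) + SKETCH_DECOMPRESS_INPLACE_MARGIN(decompressedSize))

typedef struct {
    size_t bufferSize;  /* total size of the shared buffer */
    size_t srcOffset;   /* compressed data starts here and ends at bufferSize */
    size_t dstOffset;   /* decompressed data starts here (always 0) */
} inplace_layout_t;

/* Places the compressed block at the end of the buffer and the output at
 * the beginning, as required for in-place decompression. */
static inplace_layout_t sketch_inplace_decompress_layout(size_t compressedSize,
                                                         size_t decompressedSize)
{
    inplace_layout_t l;
    /* The scheme presumes the data actually shrank when compressed. */
    assert(compressedSize < decompressedSize);
    l.bufferSize = SKETCH_DECOMPRESS_INPLACE_BUFFER_SIZE(decompressedSize);
    l.srcOffset  = l.bufferSize - compressedSize;
    l.dstOffset  = 0;
    return l;
}
```

The caller would copy or read the compressed block to `buffer + srcOffset`, then decompress from there into `buffer + dstOffset` with a dstCapacity of decompressedSize.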
|
||||
|
||||
<pre><b>#define LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE(decompressedSize) ((decompressedSize) + LZ4_DECOMPRESS_INPLACE_MARGIN(decompressedSize)) </b>/**< note: presumes that compressedSize < decompressedSize. note2: margin is overestimated a bit, since it could use compressedSize instead */<b>
|
||||
</b></pre><BR>
|
||||
<pre><b>#define LZ4_COMPRESS_INPLACE_BUFFER_SIZE(maxCompressedSize) ((maxCompressedSize) + LZ4_COMPRESS_INPLACE_MARGIN) </b>/**< maxCompressedSize is generally LZ4_COMPRESSBOUND(inputSize), but can be set to any lower value, with the risk that compression can fail (return code 0(zero)) */<b>
|
||||
</b></pre><BR>
|
||||
<a name="Chapter9"></a><h2>Private Definitions</h2><pre>
|
||||
Do not use these definitions directly.
|
||||
They are only exposed to allow static allocation of `LZ4_stream_t` and `LZ4_streamDecode_t`.
|
||||
Accessing members will expose user code to API and/or ABI break in future versions of the library.
|
||||
<BR></pre>
|
||||
|
||||
<pre><b>typedef struct {
|
||||
const LZ4_byte* externalDict;
|
||||
size_t extDictSize;
|
||||
const LZ4_byte* prefixEnd;
|
||||
size_t prefixSize;
|
||||
} LZ4_streamDecode_t_internal;
|
||||
</b></pre><BR>
|
||||
<pre><b>#define LZ4_STREAMSIZE 16416 </b>/* static size, for inter-version compatibility */<b>
|
||||
#define LZ4_STREAMSIZE_VOIDP (LZ4_STREAMSIZE / sizeof(void*))
|
||||
union LZ4_stream_u {
|
||||
void* table[LZ4_STREAMSIZE_VOIDP];
|
||||
LZ4_stream_t_internal internal_donotuse;
|
||||
}; </b>/* previously typedef'd to LZ4_stream_t */<b>
|
||||
</b><p> Do not use below internal definitions directly !
|
||||
Declare or allocate an LZ4_stream_t instead.
|
||||
LZ4_stream_t can also be created using LZ4_createStream(), which is recommended.
|
||||
The structure definition can be convenient for static allocation
|
||||
(on stack, or as part of larger structure).
|
||||
Init this structure with LZ4_initStream() before first use.
|
||||
note : only use this definition in association with static linking !
|
||||
this definition is not API/ABI safe, and may change in future versions.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>LZ4_stream_t* LZ4_initStream (void* buffer, size_t size);
|
||||
</b><p> An LZ4_stream_t structure must be initialized at least once.
|
||||
This is automatically done when invoking LZ4_createStream(),
|
||||
but it's not when the structure is simply declared on stack (for example).
|
||||
|
||||
Use LZ4_initStream() to properly initialize a newly declared LZ4_stream_t.
|
||||
It can also initialize any arbitrary buffer of sufficient size,
|
||||
and will @return a pointer of proper type upon initialization.
|
||||
|
||||
Note : initialization fails if size and alignment conditions are not respected.
|
||||
In which case, the function will @return NULL.
|
||||
Note2: An LZ4_stream_t structure guarantees correct alignment and size.
|
||||
Note3: Before v1.9.0, use LZ4_resetStream() instead
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>#define LZ4_STREAMDECODESIZE_U64 (4 + ((sizeof(void*)==16) ? 2 : 0) </b>/*AS-400*/ )<b>
|
||||
#define LZ4_STREAMDECODESIZE (LZ4_STREAMDECODESIZE_U64 * sizeof(unsigned long long))
|
||||
union LZ4_streamDecode_u {
|
||||
unsigned long long table[LZ4_STREAMDECODESIZE_U64];
|
||||
LZ4_streamDecode_t_internal internal_donotuse;
|
||||
} ; </b>/* previously typedef'd to LZ4_streamDecode_t */<b>
|
||||
</b><p> information structure to track an LZ4 stream during decompression.
|
||||
init this structure using LZ4_setStreamDecode() before first use.
|
||||
note : only use in association with static linking !
|
||||
this definition is not API/ABI safe,
|
||||
and may change in a future version !
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<a name="Chapter10"></a><h2>Obsolete Functions</h2><pre></pre>
|
||||
|
||||
<pre><b>#ifdef LZ4_DISABLE_DEPRECATE_WARNINGS
|
||||
# define LZ4_DEPRECATED(message) </b>/* disable deprecation warnings */<b>
|
||||
#else
|
||||
# if defined (__cplusplus) && (__cplusplus >= 201402) </b>/* C++14 or greater */<b>
|
||||
# define LZ4_DEPRECATED(message) [[deprecated(message)]]
|
||||
# elif defined(_MSC_VER)
|
||||
# define LZ4_DEPRECATED(message) __declspec(deprecated(message))
|
||||
# elif defined(__clang__) || (defined(__GNUC__) && (__GNUC__ * 10 + __GNUC_MINOR__ >= 45))
|
||||
# define LZ4_DEPRECATED(message) __attribute__((deprecated(message)))
|
||||
# elif defined(__GNUC__) && (__GNUC__ * 10 + __GNUC_MINOR__ >= 31)
|
||||
# define LZ4_DEPRECATED(message) __attribute__((deprecated))
|
||||
# else
|
||||
# pragma message("WARNING: LZ4_DEPRECATED needs custom implementation for this compiler")
|
||||
# define LZ4_DEPRECATED(message) </b>/* disabled */<b>
|
||||
# endif
|
||||
#endif </b>/* LZ4_DISABLE_DEPRECATE_WARNINGS */<b>
|
||||
</b><p>
|
||||
Deprecated functions make the compiler generate a warning when invoked.
|
||||
This is meant to invite users to update their source code.
|
||||
Should deprecation warnings be a problem, it is generally possible to disable them,
|
||||
typically with -Wno-deprecated-declarations for gcc
|
||||
or _CRT_SECURE_NO_WARNINGS in Visual.
|
||||
|
||||
Another method is to define LZ4_DISABLE_DEPRECATE_WARNINGS
|
||||
before including the header file.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>LZ4_DEPRECATED("use LZ4_compress_default() instead") LZ4LIB_API int LZ4_compress (const char* src, char* dest, int srcSize);
|
||||
LZ4_DEPRECATED("use LZ4_compress_default() instead") LZ4LIB_API int LZ4_compress_limitedOutput (const char* src, char* dest, int srcSize, int maxOutputSize);
|
||||
LZ4_DEPRECATED("use LZ4_compress_fast_extState() instead") LZ4LIB_API int LZ4_compress_withState (void* state, const char* source, char* dest, int inputSize);
|
||||
LZ4_DEPRECATED("use LZ4_compress_fast_extState() instead") LZ4LIB_API int LZ4_compress_limitedOutput_withState (void* state, const char* source, char* dest, int inputSize, int maxOutputSize);
|
||||
LZ4_DEPRECATED("use LZ4_compress_fast_continue() instead") LZ4LIB_API int LZ4_compress_continue (LZ4_stream_t* LZ4_streamPtr, const char* source, char* dest, int inputSize);
|
||||
LZ4_DEPRECATED("use LZ4_compress_fast_continue() instead") LZ4LIB_API int LZ4_compress_limitedOutput_continue (LZ4_stream_t* LZ4_streamPtr, const char* source, char* dest, int inputSize, int maxOutputSize);
|
||||
</b><p></p></pre><BR>
|
||||
|
||||
<pre><b>LZ4_DEPRECATED("use LZ4_decompress_fast() instead") LZ4LIB_API int LZ4_uncompress (const char* source, char* dest, int outputSize);
|
||||
LZ4_DEPRECATED("use LZ4_decompress_safe() instead") LZ4LIB_API int LZ4_uncompress_unknownOutputSize (const char* source, char* dest, int isize, int maxOutputSize);
|
||||
</b><p></p></pre><BR>
|
||||
|
||||
<pre><b>LZ4_DEPRECATED("use LZ4_decompress_safe_usingDict() instead") LZ4LIB_API int LZ4_decompress_safe_withPrefix64k (const char* src, char* dst, int compressedSize, int maxDstSize);
|
||||
LZ4_DEPRECATED("use LZ4_decompress_fast_usingDict() instead") LZ4LIB_API int LZ4_decompress_fast_withPrefix64k (const char* src, char* dst, int originalSize);
|
||||
</b><p></p></pre><BR>
|
||||
|
||||
<pre><b>LZ4_DEPRECATED("This function is deprecated and unsafe. Consider using LZ4_decompress_safe() instead")
|
||||
int LZ4_decompress_fast (const char* src, char* dst, int originalSize);
|
||||
LZ4_DEPRECATED("This function is deprecated and unsafe. Consider using LZ4_decompress_safe_continue() instead")
|
||||
int LZ4_decompress_fast_continue (LZ4_streamDecode_t* LZ4_streamDecode, const char* src, char* dst, int originalSize);
|
||||
LZ4_DEPRECATED("This function is deprecated and unsafe. Consider using LZ4_decompress_safe_usingDict() instead")
|
||||
int LZ4_decompress_fast_usingDict (const char* src, char* dst, int originalSize, const char* dictStart, int dictSize);
|
||||
</b><p> These functions used to be faster than LZ4_decompress_safe(),
|
||||
but this is no longer the case. They are now slower.
|
||||
This is because LZ4_decompress_fast() doesn't know the input size,
|
||||
and therefore must progress more cautiously into the input buffer to not read beyond the end of block.
|
||||
On top of that `LZ4_decompress_fast()` is not protected vs malformed or malicious inputs, making it a security liability.
|
||||
As a consequence, LZ4_decompress_fast() is strongly discouraged, and deprecated.
|
||||
|
||||
The last remaining LZ4_decompress_fast() specificity is that
|
||||
it can decompress a block without knowing its compressed size.
|
||||
Such functionality can be achieved in a more secure manner
|
||||
by employing LZ4_decompress_safe_partial().
|
||||
|
||||
Parameters:
|
||||
originalSize : is the uncompressed size to regenerate.
|
||||
`dst` must be already allocated, its size must be >= 'originalSize' bytes.
|
||||
@return : number of bytes read from source buffer (== compressed size).
|
||||
The function expects to finish at block's end exactly.
|
||||
If the source stream is detected malformed, the function stops decoding and returns a negative result.
|
||||
note : LZ4_decompress_fast*() requires originalSize. Thanks to this information, it never writes past the output buffer.
|
||||
However, since it doesn't know its 'src' size, it may read an unknown amount of input, past input buffer bounds.
|
||||
Also, since match offsets are not validated, match reads from 'src' may underflow too.
|
||||
These issues never happen if input (compressed) data is correct.
|
||||
But they may happen if input data is invalid (error or intentional tampering).
|
||||
As a consequence, use these functions in trusted environments with trusted data **only**.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>void LZ4_resetStream (LZ4_stream_t* streamPtr);
|
||||
</b><p> An LZ4_stream_t structure must be initialized at least once.
|
||||
This is done with LZ4_initStream(), or LZ4_resetStream().
|
||||
Consider switching to LZ4_initStream(),
|
||||
invoking LZ4_resetStream() will trigger deprecation warnings in the future.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
</html>
|
||||
</body>
|
396
lz4/doc/lz4frame_manual.html
Normal file
|
@ -0,0 +1,396 @@
|
|||
<html>
|
||||
<head>
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
|
||||
<title>1.9.3 Manual</title>
|
||||
</head>
|
||||
<body>
|
||||
<h1>1.9.3 Manual</h1>
|
||||
<hr>
|
||||
<a name="Contents"></a><h2>Contents</h2>
|
||||
<ol>
|
||||
<li><a href="#Chapter1">Introduction</a></li>
|
||||
<li><a href="#Chapter2">Compiler specifics</a></li>
|
||||
<li><a href="#Chapter3">Error management</a></li>
|
||||
<li><a href="#Chapter4">Frame compression types</a></li>
|
||||
<li><a href="#Chapter5">Simple compression function</a></li>
|
||||
<li><a href="#Chapter6">Advanced compression functions</a></li>
|
||||
<li><a href="#Chapter7">Resource Management</a></li>
|
||||
<li><a href="#Chapter8">Compression</a></li>
|
||||
<li><a href="#Chapter9">Decompression functions</a></li>
|
||||
<li><a href="#Chapter10">Streaming decompression functions</a></li>
|
||||
<li><a href="#Chapter11">Bulk processing dictionary API</a></li>
|
||||
</ol>
|
||||
<hr>
|
||||
<a name="Chapter1"></a><h2>Introduction</h2><pre>
|
||||
lz4frame.h implements LZ4 frame specification (doc/lz4_Frame_format.md).
|
||||
lz4frame.h provides frame compression functions that take care
|
||||
of encoding standard metadata alongside LZ4-compressed blocks.
|
||||
<BR></pre>
|
||||
|
||||
<a name="Chapter2"></a><h2>Compiler specifics</h2><pre></pre>
|
||||
|
||||
<a name="Chapter3"></a><h2>Error management</h2><pre></pre>
|
||||
|
||||
<pre><b>unsigned LZ4F_isError(LZ4F_errorCode_t code); </b>/**< tells when a function result is an error code */<b>
|
||||
</b></pre><BR>
|
||||
<pre><b>const char* LZ4F_getErrorName(LZ4F_errorCode_t code); </b>/**< return error code string; for debugging */<b>
|
||||
</b></pre><BR>
|
||||
<a name="Chapter4"></a><h2>Frame compression types</h2><pre></pre>
|
||||
|
||||
<pre><b>typedef enum {
|
||||
LZ4F_default=0,
|
||||
LZ4F_max64KB=4,
|
||||
LZ4F_max256KB=5,
|
||||
LZ4F_max1MB=6,
|
||||
LZ4F_max4MB=7
|
||||
LZ4F_OBSOLETE_ENUM(max64KB)
|
||||
LZ4F_OBSOLETE_ENUM(max256KB)
|
||||
LZ4F_OBSOLETE_ENUM(max1MB)
|
||||
LZ4F_OBSOLETE_ENUM(max4MB)
|
||||
} LZ4F_blockSizeID_t;
|
||||
</b></pre><BR>
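<p>The enum names above encode the maximum block size. As a minimal sketch, the numeric IDs 4 to 7 can be turned into a byte count with a shift. The helper name block_size_from_id is hypothetical and is not part of lz4frame.h. It deliberately returns 0 for LZ4F_default, since the size the library then selects is not stated here.</p>

```c
#include <stddef.h>

/* Hypothetical helper (not part of lz4frame.h): converts a block size ID
 * (4..7, the numeric values of LZ4F_blockSizeID_t) into the maximum block
 * size in bytes. Returns 0 for any other value, including 0 (LZ4F_default),
 * where the library chooses the size itself. */
size_t block_size_from_id(unsigned id)
{
    if (id < 4 || id > 7) return 0;
    /* 4 -> 2^16 (64 KB), 5 -> 2^18 (256 KB), 6 -> 2^20 (1 MB), 7 -> 2^22 (4 MB) */
    return (size_t)1 << (8 + 2 * id);
}
```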
|
||||
<pre><b>typedef enum {
|
||||
LZ4F_blockLinked=0,
|
||||
LZ4F_blockIndependent
|
||||
LZ4F_OBSOLETE_ENUM(blockLinked)
|
||||
LZ4F_OBSOLETE_ENUM(blockIndependent)
|
||||
} LZ4F_blockMode_t;
|
||||
</b></pre><BR>
|
||||
<pre><b>typedef enum {
|
||||
LZ4F_noContentChecksum=0,
|
||||
LZ4F_contentChecksumEnabled
|
||||
LZ4F_OBSOLETE_ENUM(noContentChecksum)
|
||||
LZ4F_OBSOLETE_ENUM(contentChecksumEnabled)
|
||||
} LZ4F_contentChecksum_t;
|
||||
</b></pre><BR>
|
||||
<pre><b>typedef enum {
|
||||
LZ4F_noBlockChecksum=0,
|
||||
LZ4F_blockChecksumEnabled
|
||||
} LZ4F_blockChecksum_t;
|
||||
</b></pre><BR>
|
||||
<pre><b>typedef enum {
|
||||
LZ4F_frame=0,
|
||||
LZ4F_skippableFrame
|
||||
LZ4F_OBSOLETE_ENUM(skippableFrame)
|
||||
} LZ4F_frameType_t;
|
||||
</b></pre><BR>
|
||||
<pre><b>typedef struct {
|
||||
LZ4F_blockSizeID_t blockSizeID; </b>/* max64KB, max256KB, max1MB, max4MB; 0 == default */<b>
|
||||
LZ4F_blockMode_t blockMode; </b>/* LZ4F_blockLinked, LZ4F_blockIndependent; 0 == default */<b>
|
||||
LZ4F_contentChecksum_t contentChecksumFlag; </b>/* 1: frame terminated with 32-bit checksum of decompressed data; 0: disabled (default) */<b>
|
||||
LZ4F_frameType_t frameType; </b>/* read-only field : LZ4F_frame or LZ4F_skippableFrame */<b>
|
||||
unsigned long long contentSize; </b>/* Size of uncompressed content ; 0 == unknown */<b>
|
||||
unsigned dictID; </b>/* Dictionary ID, sent by compressor to help decoder select correct dictionary; 0 == no dictID provided */<b>
|
||||
LZ4F_blockChecksum_t blockChecksumFlag; </b>/* 1: each block followed by a checksum of block's compressed data; 0: disabled (default) */<b>
|
||||
} LZ4F_frameInfo_t;
|
||||
</b><p> makes it possible to set or read frame parameters.
|
||||
Structure must first be initialized to 0, using memset() or LZ4F_INIT_FRAMEINFO,
setting all parameters to default.
It's then possible to selectively update some parameters.
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>typedef struct {
|
||||
LZ4F_frameInfo_t frameInfo;
|
||||
int compressionLevel; </b>/* 0: default (fast mode); values > LZ4HC_CLEVEL_MAX count as LZ4HC_CLEVEL_MAX; values < 0 trigger "fast acceleration" */<b>
|
||||
unsigned autoFlush; </b>/* 1: always flush; reduces usage of internal buffers */<b>
|
||||
unsigned favorDecSpeed; </b>/* 1: parser favors decompression speed vs compression ratio. Only works for high compression modes (>= LZ4HC_CLEVEL_OPT_MIN) */ /* v1.8.2+ */<b>
|
||||
unsigned reserved[3]; </b>/* must be zero for forward compatibility */<b>
|
||||
} LZ4F_preferences_t;
|
||||
</b><p> makes it possible to supply advanced compression instructions to streaming interface.
|
||||
Structure must first be initialized to 0, using memset() or LZ4F_INIT_PREFERENCES,
|
||||
setting all parameters to default.
|
||||
All reserved fields must be set to zero.
|
||||
</p></pre><BR>
|
||||
|
||||
<a name="Chapter5"></a><h2>Simple compression function</h2><pre></pre>
|
||||
|
||||
<pre><b>size_t LZ4F_compressFrameBound(size_t srcSize, const LZ4F_preferences_t* preferencesPtr);
|
||||
</b><p> Returns the maximum possible compressed size with LZ4F_compressFrame() given srcSize and preferences.
|
||||
`preferencesPtr` is optional. It can be replaced by NULL, in which case, the function will assume default preferences.
|
||||
Note : this result is only usable with LZ4F_compressFrame().
|
||||
It may also be used with LZ4F_compressUpdate() _if no flush() operation_ is performed.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>size_t LZ4F_compressFrame(void* dstBuffer, size_t dstCapacity,
|
||||
const void* srcBuffer, size_t srcSize,
|
||||
const LZ4F_preferences_t* preferencesPtr);
|
||||
</b><p> Compress an entire srcBuffer into a valid LZ4 frame.
|
||||
dstCapacity MUST be >= LZ4F_compressFrameBound(srcSize, preferencesPtr).
|
||||
The LZ4F_preferences_t structure is optional : you can provide NULL as argument. All preferences will be set to default.
|
||||
@return : number of bytes written into dstBuffer.
|
||||
or an error code if it fails (can be tested using LZ4F_isError())
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<a name="Chapter6"></a><h2>Advanced compression functions</h2><pre></pre>
|
||||
|
||||
<pre><b>typedef struct {
|
||||
unsigned stableSrc; </b>/* 1 == src content will remain present on future calls to LZ4F_compressUpdate(); skip copying src content within tmp buffer */<b>
|
||||
unsigned reserved[3];
|
||||
} LZ4F_compressOptions_t;
|
||||
</b></pre><BR>
|
||||
<a name="Chapter7"></a><h2>Resource Management</h2><pre></pre>
|
||||
|
||||
<pre><b>LZ4F_errorCode_t LZ4F_createCompressionContext(LZ4F_cctx** cctxPtr, unsigned version);
|
||||
LZ4F_errorCode_t LZ4F_freeCompressionContext(LZ4F_cctx* cctx);
|
||||
</b><p> The first thing to do is to create a compressionContext object, which will be used in all compression operations.
|
||||
This is achieved using LZ4F_createCompressionContext(), which takes as argument a version.
|
||||
The version provided MUST be LZ4F_VERSION. It is intended to track potential version mismatch, notably when using DLL.
|
||||
The function will provide a pointer to a fully allocated LZ4F_cctx object.
|
||||
If @return != zero, there was an error during context creation.
|
||||
The object's memory can be released using LZ4F_freeCompressionContext().
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<a name="Chapter8"></a><h2>Compression</h2><pre></pre>
|
||||
|
||||
<pre><b>size_t LZ4F_compressBegin(LZ4F_cctx* cctx,
|
||||
void* dstBuffer, size_t dstCapacity,
|
||||
const LZ4F_preferences_t* prefsPtr);
|
||||
</b><p> will write the frame header into dstBuffer.
|
||||
dstCapacity must be >= LZ4F_HEADER_SIZE_MAX bytes.
|
||||
`prefsPtr` is optional : you can provide NULL as argument, all preferences will then be set to default.
|
||||
@return : number of bytes written into dstBuffer for the header
|
||||
or an error code (which can be tested using LZ4F_isError())
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>size_t LZ4F_compressBound(size_t srcSize, const LZ4F_preferences_t* prefsPtr);
|
||||
</b><p> Provides minimum dstCapacity required to guarantee success of
|
||||
LZ4F_compressUpdate(), given a srcSize and preferences, for a worst case scenario.
|
||||
When srcSize==0, LZ4F_compressBound() provides an upper bound for LZ4F_flush() and LZ4F_compressEnd() instead.
|
||||
Note that the result is only valid for a single invocation of LZ4F_compressUpdate().
|
||||
When invoking LZ4F_compressUpdate() multiple times,
|
||||
if the output buffer is gradually filled up instead of emptied and re-used from its start,
|
||||
one must check if there is enough remaining capacity before each invocation, using LZ4F_compressBound().
|
||||
@return is always the same for a srcSize and prefsPtr.
|
||||
prefsPtr is optional : when NULL is provided, preferences will be set to cover worst case scenario.
|
||||
tech details :
|
||||
@return if automatic flushing is not enabled, includes the possibility that internal buffer might already be filled by up to (blockSize-1) bytes.
|
||||
It also includes frame footer (ending + checksum), since it might be generated by LZ4F_compressEnd().
|
||||
@return doesn't include frame header, as it was already generated by LZ4F_compressBegin().
|
||||
|
||||
</p></pre><BR>
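<p>The rule above for an output buffer that is filled gradually comes down to one comparison before each call. The sketch below assumes the caller tracks how many bytes are already in the buffer. The helper name has_room_for_update is hypothetical, and the bound argument is expected to be the value returned by LZ4F_compressBound() for the chunk about to be submitted.</p>

```c
#include <stddef.h>

/* Hypothetical helper (not part of lz4frame.h).
 * capacity : total size of the output buffer
 * used     : bytes already written into it by earlier calls
 * bound    : LZ4F_compressBound(srcSize, prefsPtr) for the next chunk
 * Returns 1 when the remaining space covers the worst case, 0 otherwise. */
int has_room_for_update(size_t capacity, size_t used, size_t bound)
{
    if (used > capacity) return 0;       /* inconsistent bookkeeping */
    return (capacity - used) >= bound;   /* subtraction form cannot overflow */
}
```

<p>When the check fails, the caller would typically drain or enlarge the output buffer before calling LZ4F_compressUpdate().</p>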
|
||||
|
||||
<pre><b>size_t LZ4F_compressUpdate(LZ4F_cctx* cctx,
|
||||
void* dstBuffer, size_t dstCapacity,
|
||||
const void* srcBuffer, size_t srcSize,
|
||||
const LZ4F_compressOptions_t* cOptPtr);
|
||||
</b><p> LZ4F_compressUpdate() can be called repeatedly to compress as much data as necessary.
|
||||
Important rule: dstCapacity MUST be large enough to ensure operation success even in worst case situations.
|
||||
This value is provided by LZ4F_compressBound().
|
||||
If this condition is not respected, LZ4F_compressUpdate() will fail (result is an errorCode).
|
||||
LZ4F_compressUpdate() doesn't guarantee error recovery.
|
||||
When an error occurs, compression context must be freed or resized.
|
||||
`cOptPtr` is optional : NULL can be provided, in which case all options are set to default.
|
||||
@return : number of bytes written into `dstBuffer` (it can be zero, meaning input data was just buffered).
|
||||
or an error code if it fails (which can be tested using LZ4F_isError())
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>size_t LZ4F_flush(LZ4F_cctx* cctx,
|
||||
void* dstBuffer, size_t dstCapacity,
|
||||
const LZ4F_compressOptions_t* cOptPtr);
|
||||
</b><p> When data must be generated and sent immediately, without waiting for a block to be completely filled,
|
||||
it's possible to call LZ4F_flush(). It will immediately compress any data buffered within cctx.
|
||||
`dstCapacity` must be large enough to ensure the operation will be successful.
|
||||
`cOptPtr` is optional : it's possible to provide NULL, all options will be set to default.
|
||||
@return : nb of bytes written into dstBuffer (can be zero, when there is no data stored within cctx)
|
||||
or an error code if it fails (which can be tested using LZ4F_isError())
|
||||
Note : LZ4F_flush() is guaranteed to be successful when dstCapacity >= LZ4F_compressBound(0, prefsPtr).
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>size_t LZ4F_compressEnd(LZ4F_cctx* cctx,
|
||||
void* dstBuffer, size_t dstCapacity,
|
||||
const LZ4F_compressOptions_t* cOptPtr);
|
||||
</b><p> To properly finish an LZ4 frame, invoke LZ4F_compressEnd().
|
||||
It will flush whatever data remained within `cctx` (like LZ4F_flush())
|
||||
and properly finalize the frame, with an endMark and a checksum.
|
||||
`cOptPtr` is optional : NULL can be provided, in which case all options will be set to default.
|
||||
@return : nb of bytes written into dstBuffer, necessarily >= 4 (endMark),
|
||||
or an error code if it fails (which can be tested using LZ4F_isError())
|
||||
Note : LZ4F_compressEnd() is guaranteed to be successful when dstCapacity >= LZ4F_compressBound(0, prefsPtr).
|
||||
A successful call to LZ4F_compressEnd() makes `cctx` available again for another compression task.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<a name="Chapter9"></a><h2>Decompression functions</h2><pre></pre>
|
||||
|
||||
<pre><b>typedef struct {
|
||||
unsigned stableDst; </b>/* pledges that last 64KB decompressed data will remain available unmodified. This optimization skips storage operations in tmp buffers. */<b>
|
||||
unsigned reserved[3]; </b>/* must be set to zero for forward compatibility */<b>
|
||||
} LZ4F_decompressOptions_t;
|
||||
</b></pre><BR>
|
||||
<pre><b>LZ4F_errorCode_t LZ4F_createDecompressionContext(LZ4F_dctx** dctxPtr, unsigned version);
|
||||
LZ4F_errorCode_t LZ4F_freeDecompressionContext(LZ4F_dctx* dctx);
|
||||
</b><p> Create an LZ4F_dctx object, to track all decompression operations.
|
||||
The version provided MUST be LZ4F_VERSION.
|
||||
The function provides a pointer to an allocated and initialized LZ4F_dctx object.
|
||||
The result is an errorCode, which can be tested using LZ4F_isError().
|
||||
dctx memory can be released using LZ4F_freeDecompressionContext();
|
||||
Result of LZ4F_freeDecompressionContext() indicates current state of decompressionContext when being released.
|
||||
That is, it should be == 0 if decompression has been completed fully and correctly.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<a name="Chapter10"></a><h2>Streaming decompression functions</h2><pre></pre>
|
||||
|
||||
<pre><b>size_t LZ4F_headerSize(const void* src, size_t srcSize);
|
||||
</b><p> Provide the header size of a frame starting at `src`.
|
||||
`srcSize` must be >= LZ4F_MIN_SIZE_TO_KNOW_HEADER_LENGTH,
|
||||
which is enough to decode the header length.
|
||||
@return : size of frame header
|
||||
or an error code, which can be tested using LZ4F_isError()
|
||||
note : Frame header size is variable, but is guaranteed to be
|
||||
>= LZ4F_HEADER_SIZE_MIN bytes, and <= LZ4F_HEADER_SIZE_MAX bytes.
|
||||
|
||||
</p></pre><BR>
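<p>To illustrate why a few leading bytes are enough to know the header length, here is a sketch of the rule as described in doc/lz4_Frame_format.md: a 4-byte little-endian magic number, then a FLG byte whose bit 3 announces an 8-byte content size and whose bit 0 announces a 4-byte dictionary ID. The helper name frame_header_size is hypothetical and is not the library's implementation. The 5-byte minimum and the numeric constants are assumptions taken from the frame specification and lz4 1.9.x headers. Unlike LZ4F_headerSize(), it returns 0 rather than an error code on failure.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the frame header size in bytes, or 0 when
 * fewer than 5 bytes are available or the magic number is not recognized. */
size_t frame_header_size(const unsigned char* src, size_t srcSize)
{
    uint32_t magic;
    size_t size = 7;   /* magic(4) + FLG(1) + BD(1) + header checksum(1) */

    if (src == NULL || srcSize < 5) return 0;

    /* magic number is stored little-endian */
    magic = (uint32_t)src[0]
          | ((uint32_t)src[1] << 8)
          | ((uint32_t)src[2] << 16)
          | ((uint32_t)src[3] << 24);

    /* skippable frames: magic (0x184D2A50..0x184D2A5F) + 4-byte size field */
    if ((magic & 0xFFFFFFF0U) == 0x184D2A50U) return 8;
    if (magic != 0x184D2204U) return 0;

    if (src[4] & 0x08) size += 8;   /* FLG bit 3: content size present */
    if (src[4] & 0x01) size += 4;   /* FLG bit 0: dictionary ID present */
    return size;
}
```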
|
||||
|
||||
<pre><b>size_t LZ4F_getFrameInfo(LZ4F_dctx* dctx,
|
||||
LZ4F_frameInfo_t* frameInfoPtr,
|
||||
const void* srcBuffer, size_t* srcSizePtr);
|
||||
</b><p> This function extracts frame parameters (max blockSize, dictID, etc.).
|
||||
Its usage is optional: user can call LZ4F_decompress() directly.
|
||||
|
||||
Extracted information will fill an existing LZ4F_frameInfo_t structure.
|
||||
This can be useful for allocation and dictionary identification purposes.
|
||||
|
||||
LZ4F_getFrameInfo() can work in the following situations :
|
||||
|
||||
1) At the beginning of a new frame, before any invocation of LZ4F_decompress().
|
||||
It will decode header from `srcBuffer`,
|
||||
consuming the header and starting the decoding process.
|
||||
|
||||
Input size must be large enough to contain the full frame header.
|
||||
Frame header size can be known beforehand by LZ4F_headerSize().
|
||||
Frame header size is variable, but is guaranteed to be >= LZ4F_HEADER_SIZE_MIN bytes,
|
||||
and <= LZ4F_HEADER_SIZE_MAX bytes.
|
||||
Hence, blindly providing LZ4F_HEADER_SIZE_MAX bytes or more will always work.
|
||||
It's allowed to provide more input data than the header size,
|
||||
LZ4F_getFrameInfo() will only consume the header.
|
||||
|
||||
If input size is not large enough,
that is, if it's smaller than the header size,
the function will fail and return an error code.
|
||||
|
||||
2) After decoding has been started,
|
||||
it's possible to invoke LZ4F_getFrameInfo() anytime
|
||||
to extract already decoded frame parameters stored within dctx.
|
||||
|
||||
Note that, if decoding has barely started,
|
||||
and not yet read enough information to decode the header,
|
||||
LZ4F_getFrameInfo() will fail.
|
||||
|
||||
The number of bytes consumed from srcBuffer will be updated in *srcSizePtr (necessarily <= original value).
|
||||
LZ4F_getFrameInfo() only consumes bytes when decoding has not yet started,
|
||||
and when decoding the header has been successful.
|
||||
Decompression must then resume from (srcBuffer + *srcSizePtr).
|
||||
|
||||
@return : a hint about how many srcSize bytes LZ4F_decompress() expects for next call,
|
||||
or an error code which can be tested using LZ4F_isError().
|
||||
note 1 : in case of error, dctx is not modified. Decoding operation can resume from beginning safely.
|
||||
note 2 : frame parameters are *copied into* an already allocated LZ4F_frameInfo_t structure.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>size_t LZ4F_decompress(LZ4F_dctx* dctx,
|
||||
void* dstBuffer, size_t* dstSizePtr,
|
||||
const void* srcBuffer, size_t* srcSizePtr,
|
||||
const LZ4F_decompressOptions_t* dOptPtr);
|
||||
</b><p> Call this function repetitively to regenerate data compressed in `srcBuffer`.
|
||||
|
||||
The function requires a valid dctx state.
|
||||
It will read up to *srcSizePtr bytes from srcBuffer,
|
||||
and decompress data into dstBuffer, of capacity *dstSizePtr.
|
||||
|
||||
The number of bytes consumed from srcBuffer will be written into *srcSizePtr (necessarily <= original value).
The number of bytes decompressed into dstBuffer will be written into *dstSizePtr (necessarily <= original value).
|
||||
|
||||
The function does not necessarily read all input bytes, so always check value in *srcSizePtr.
|
||||
Unconsumed source data must be presented again in subsequent invocations.
|
||||
|
||||
`dstBuffer` can freely change between each consecutive function invocation.
|
||||
`dstBuffer` content will be overwritten.
|
||||
|
||||
@return : a hint of how many `srcSize` bytes LZ4F_decompress() expects for next call.
|
||||
Schematically, it's the size of the current (or remaining) compressed block + header of next block.
|
||||
Respecting the hint provides some small speed benefit, because it skips intermediate buffers.
|
||||
This is just a hint though, it's always possible to provide any srcSize.
|
||||
|
||||
When a frame is fully decoded, @return will be 0 (no more data expected).
|
||||
When provided with more bytes than necessary to decode a frame,
|
||||
LZ4F_decompress() will stop reading exactly at end of current frame, and @return 0.
|
||||
|
||||
If decompression failed, @return is an error code, which can be tested using LZ4F_isError().
|
||||
After a decompression error, the `dctx` context is not resumable.
|
||||
Use LZ4F_resetDecompressionContext() to return to clean state.
|
||||
|
||||
After a frame is fully decoded, dctx can be used again to decompress another frame.
|
||||
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>void LZ4F_resetDecompressionContext(LZ4F_dctx* dctx); </b>/* always successful */<b>
|
||||
</b><p> In case of an error, the context is left in "undefined" state.
|
||||
It is then necessary to reset it before re-using it.
|
||||
This method can also be used to abruptly stop any unfinished decompression,
|
||||
and start a new one using same context resources.
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>typedef enum { LZ4F_LIST_ERRORS(LZ4F_GENERATE_ENUM)
|
||||
_LZ4F_dummy_error_enum_for_c89_never_used } LZ4F_errorCodes;
|
||||
</b></pre><BR>
|
||||
<a name="Chapter11"></a><h2>Bulk processing dictionary API</h2><pre></pre>
|
||||
|
||||
<pre><b>LZ4FLIB_STATIC_API LZ4F_CDict* LZ4F_createCDict(const void* dictBuffer, size_t dictSize);
|
||||
LZ4FLIB_STATIC_API void LZ4F_freeCDict(LZ4F_CDict* CDict);
|
||||
</b><p> When compressing multiple messages / blocks using the same dictionary, it's recommended to load it just once.
|
||||
LZ4F_createCDict() will create a digested dictionary, ready to start future compression operations without startup delay.
LZ4F_CDict can be created once and shared by multiple threads concurrently, since its usage is read-only.
`dictBuffer` can be released after LZ4F_CDict creation, since its content is copied within CDict.
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>LZ4FLIB_STATIC_API size_t LZ4F_compressFrame_usingCDict(
|
||||
LZ4F_cctx* cctx,
|
||||
void* dst, size_t dstCapacity,
|
||||
const void* src, size_t srcSize,
|
||||
const LZ4F_CDict* cdict,
|
||||
const LZ4F_preferences_t* preferencesPtr);
|
||||
</b><p> Compress an entire srcBuffer into a valid LZ4 frame using a digested Dictionary.
|
||||
cctx must point to a context created by LZ4F_createCompressionContext().
|
||||
If cdict==NULL, compress without a dictionary.
|
||||
dstCapacity MUST be >= LZ4F_compressFrameBound(srcSize, preferencesPtr).
|
||||
If this condition is not respected, function will fail (@return an errorCode).
|
||||
The LZ4F_preferences_t structure is optional : you may provide NULL as argument,
|
||||
but it's not recommended, as it's the only way to provide dictID in the frame header.
|
||||
@return : number of bytes written into dstBuffer.
|
||||
or an error code if it fails (can be tested using LZ4F_isError())
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>LZ4FLIB_STATIC_API size_t LZ4F_compressBegin_usingCDict(
|
||||
LZ4F_cctx* cctx,
|
||||
void* dstBuffer, size_t dstCapacity,
|
||||
const LZ4F_CDict* cdict,
|
||||
const LZ4F_preferences_t* prefsPtr);
|
||||
</b><p> Inits streaming dictionary compression, and writes the frame header into dstBuffer.
|
||||
dstCapacity must be >= LZ4F_HEADER_SIZE_MAX bytes.
|
||||
`prefsPtr` is optional : you may provide NULL as argument,
|
||||
however, it's the only way to provide dictID in the frame header.
|
||||
@return : number of bytes written into dstBuffer for the header,
|
||||
or an error code (which can be tested using LZ4F_isError())
|
||||
</p></pre><BR>
|
||||
|
||||
<pre><b>LZ4FLIB_STATIC_API size_t LZ4F_decompress_usingDict(
|
||||
LZ4F_dctx* dctxPtr,
|
||||
void* dstBuffer, size_t* dstSizePtr,
|
||||
const void* srcBuffer, size_t* srcSizePtr,
|
||||
const void* dict, size_t dictSize,
|
||||
const LZ4F_decompressOptions_t* decompressOptionsPtr);
|
||||
</b><p> Same as LZ4F_decompress(), using a predefined dictionary.
|
||||
Dictionary is used "in place", without any preprocessing.
|
||||
It must remain accessible throughout the entire frame decoding.
|
||||
</p></pre><BR>
|
||||
|
||||
</body>
</html>