feat: 9.5.9

This commit is contained in:
tmtt 2022-07-29 15:12:07 +02:00
parent cb1753732b
commit 35f43a7909
1084 changed files with 558985 additions and 0 deletions

156
lz4/doc/lz4_Block_format.md Normal file
View file

@ -0,0 +1,156 @@
LZ4 Block Format Description
============================
Last revised: 2019-03-30.
Author : Yann Collet
This specification is intended for developers
willing to produce LZ4-compatible compressed data blocks
using any programming language.
LZ4 is an LZ77-type compressor with a fixed, byte-oriented encoding.
There is no entropy encoder back-end nor framing layer.
The latter is assumed to be handled by other parts of the system
(see [LZ4 Frame format]).
This design is assumed to favor simplicity and speed.
It helps later on for optimizations, compactness, and features.
This document describes only the block format,
not how the compressor nor decompressor actually work.
The correctness of the decompressor should not depend
on implementation details of the compressor, and vice versa.
[LZ4 Frame format]: lz4_Frame_format.md
Compressed block format
-----------------------
An LZ4 compressed block is composed of sequences.
A sequence is a suite of literals (not-compressed bytes),
followed by a match copy.
Each sequence starts with a `token`.
The `token` is a one byte value, separated into two 4-bits fields.
Therefore each field ranges from 0 to 15.
The first field uses the 4 high-bits of the token.
It provides the length of literals to follow.
If the field value is 0, then there is no literal.
If it is 15, then we need to add some more bytes to indicate the full length.
Each additional byte then represent a value from 0 to 255,
which is added to the previous value to produce a total length.
When the byte value is 255, another byte is output.
There can be any number of bytes following `token`. There is no "size limit".
(Side note : this is why a not-compressible input block is expanded by 0.4%).
Example 1 : A literal length of 48 will be represented as :
- 15 : value for the 4-bits High field
- 33 : (=48-15) remaining length to reach 48
Example 2 : A literal length of 280 will be represented as :
- 15 : value for the 4-bits High field
- 255 : following byte is maxed, since 280-15 >= 255
- 10 : (=280 - 15 - 255) ) remaining length to reach 280
Example 3 : A literal length of 15 will be represented as :
- 15 : value for the 4-bits High field
- 0 : (=15-15) yes, the zero must be output
Following `token` and optional length bytes, are the literals themselves.
They are exactly as numerous as previously decoded (length of literals).
It's possible that there are zero literal.
Following the literals is the match copy operation.
It starts by the `offset`.
This is a 2 bytes value, in little endian format
(the 1st byte is the "low" byte, the 2nd one is the "high" byte).
The `offset` represents the position of the match to be copied from.
1 means "current position - 1 byte".
The maximum `offset` value is 65535, 65536 cannot be coded.
Note that 0 is an invalid value, not used.
Then we need to extract the `matchlength`.
For this, we use the second token field, the low 4-bits.
Value, obviously, ranges from 0 to 15.
However here, 0 means that the copy operation will be minimal.
The minimum length of a match, called `minmatch`, is 4.
As a consequence, a 0 value means 4 bytes, and a value of 15 means 19+ bytes.
Similar to literal length, on reaching the highest possible value (15),
we output additional bytes, one at a time, with values ranging from 0 to 255.
They are added to total to provide the final match length.
A 255 value means there is another byte to read and add.
There is no limit to the number of optional bytes that can be output this way.
(This points towards a maximum achievable compression ratio of about 250).
Decoding the `matchlength` reaches the end of current sequence.
Next byte will be the start of another sequence.
But before moving to next sequence,
it's time to use the decoded match position and length.
The decoder copies `matchlength` bytes from match position to current position.
In some cases, `matchlength` is larger than `offset`.
Therefore, `match_pos + matchlength > current_pos`,
which means that later bytes to copy are not yet decoded.
This is called an "overlap match", and must be handled with special care.
A common case is an offset of 1,
meaning the last byte is repeated `matchlength` times.
End of block restrictions
-----------------------
There are specific rules required to terminate a block.
1. The last sequence contains only literals.
The block ends right after them.
2. The last 5 bytes of input are always literals.
Therefore, the last sequence contains at least 5 bytes.
- Special : if input is smaller than 5 bytes,
there is only one sequence, it contains the whole input as literals.
Empty input can be represented with a zero byte,
interpreted as a final token without literal and without a match.
3. The last match must start at least 12 bytes before the end of block.
The last match is part of the penultimate sequence.
It is followed by the last sequence, which contains only literals.
- Note that, as a consequence,
an independent block < 13 bytes cannot be compressed,
because the match must copy "something",
so it needs at least one prior byte.
- When a block can reference data from another block,
it can start immediately with a match and no literal,
so a block of 12 bytes can be compressed.
When a block does not respect these end conditions,
a conformant decoder is allowed to reject the block as incorrect.
These rules are in place to ensure that a conformant decoder
can be designed for speed, issuing speculatively instructions,
while never reading nor writing beyond provided I/O buffers.
Additional notes
-----------------------
If the decoder will decompress data from an external source,
it is recommended to ensure that the decoder will not be vulnerable to
buffer overflow manipulations.
Always ensure that read and write operations
remain within the limits of provided buffers.
Test the decoder with fuzzers
to ensure it's resilient to improbable combinations.
The format makes no assumption nor limits to the way the compressor
searches and selects matches within the source data block.
Multiple techniques can be considered,
featuring distinct time / performance trade offs.
As long as the format is respected,
the result will be compatible and decodable by any compliant decoder.
An upper compression limit can be reached,
using a technique called "full optimal parsing", at high cpu cost.

433
lz4/doc/lz4_Frame_format.md Normal file
View file

@ -0,0 +1,433 @@
LZ4 Frame Format Description
============================
### Notices
Copyright (c) 2013-2015 Yann Collet
Permission is granted to copy and distribute this document
for any purpose and without charge,
including translations into other languages
and incorporation into compilations,
provided that the copyright notice and this notice are preserved,
and that any substantive changes or deletions from the original
are clearly marked.
Distribution of this document is unlimited.
### Version
1.6.2 (12/08/2020)
Introduction
------------
The purpose of this document is to define a lossless compressed data format,
that is independent of CPU type, operating system,
file system and character set, suitable for
File compression, Pipe and streaming compression
using the [LZ4 algorithm](http://www.lz4.org).
The data can be produced or consumed,
even for an arbitrarily long sequentially presented input data stream,
using only an a priori bounded amount of intermediate storage,
and hence can be used in data communications.
The format uses the LZ4 compression method,
and optional [xxHash-32 checksum method](https://github.com/Cyan4973/xxHash),
for detection of data corruption.
The data format defined by this specification
does not attempt to allow random access to compressed data.
This specification is intended for use by implementers of software
to compress data into LZ4 format and/or decompress data from LZ4 format.
The text of the specification assumes a basic background in programming
at the level of bits and other primitive data representations.
Unless otherwise indicated below,
a compliant compressor must produce data sets
that conform to the specifications presented here.
It doesnt need to support all options though.
A compliant decompressor must be able to decompress
at least one working set of parameters
that conforms to the specifications presented here.
It may also ignore checksums.
Whenever it does not support a specific parameter within the compressed stream,
it must produce a non-ambiguous error code
and associated error message explaining which parameter is unsupported.
General Structure of LZ4 Frame format
-------------------------------------
| MagicNb | F. Descriptor | Block | (...) | EndMark | C. Checksum |
|:-------:|:-------------:| ----- | ----- | ------- | ----------- |
| 4 bytes | 3-15 bytes | | | 4 bytes | 0-4 bytes |
__Magic Number__
4 Bytes, Little endian format.
Value : 0x184D2204
__Frame Descriptor__
3 to 15 Bytes, to be detailed in its own paragraph,
as it is the most important part of the spec.
The combined _Magic_Number_ and _Frame_Descriptor_ fields are sometimes
called ___LZ4 Frame Header___. Its size varies between 7 and 19 bytes.
__Data Blocks__
To be detailed in its own paragraph.
Thats where compressed data is stored.
__EndMark__
The flow of blocks ends when the last data block is followed by
the 32-bit value `0x00000000`.
__Content Checksum__
_Content_Checksum_ verify that the full content has been decoded correctly.
The content checksum is the result of [xxHash-32 algorithm]
digesting the original (decoded) data as input, and a seed of zero.
Content checksum is only present when its associated flag
is set in the frame descriptor.
Content Checksum validates the result,
that all blocks were fully transmitted in the correct order and without error,
and also that the encoding/decoding process itself generated no distortion.
Its usage is recommended.
The combined _EndMark_ and _Content_Checksum_ fields might sometimes be
referred to as ___LZ4 Frame Footer___. Its size varies between 4 and 8 bytes.
__Frame Concatenation__
In some circumstances, it may be preferable to append multiple frames,
for example in order to add new data to an existing compressed file
without re-framing it.
In such case, each frame has its own set of descriptor flags.
Each frame is considered independent.
The only relation between frames is their sequential order.
The ability to decode multiple concatenated frames
within a single stream or file
is left outside of this specification.
As an example, the reference lz4 command line utility behavior is
to decode all concatenated frames in their sequential order.
Frame Descriptor
----------------
| FLG | BD | (Content Size) | (Dictionary ID) | HC |
| ------- | ------- |:--------------:|:---------------:| ------- |
| 1 byte | 1 byte | 0 - 8 bytes | 0 - 4 bytes | 1 byte |
The descriptor uses a minimum of 3 bytes,
and up to 15 bytes depending on optional parameters.
__FLG byte__
| BitNb | 7-6 | 5 | 4 | 3 | 2 | 1 | 0 |
| ------- |-------|-------|----------|------|----------|----------|------|
|FieldName|Version|B.Indep|B.Checksum|C.Size|C.Checksum|*Reserved*|DictID|
__BD byte__
| BitNb | 7 | 6-5-4 | 3-2-1-0 |
| ------- | -------- | ------------- | -------- |
|FieldName|*Reserved*| Block MaxSize |*Reserved*|
In the tables, bit 7 is highest bit, while bit 0 is lowest.
__Version Number__
2-bits field, must be set to `01`.
Any other value cannot be decoded by this version of the specification.
Other version numbers will use different flag layouts.
__Block Independence flag__
If this flag is set to “1”, blocks are independent.
If this flag is set to “0”, each block depends on previous ones
(up to LZ4 window size, which is 64 KB).
In such case, its necessary to decode all blocks in sequence.
Block dependency improves compression ratio, especially for small blocks.
On the other hand, it makes random access or multi-threaded decoding impossible.
__Block checksum flag__
If this flag is set, each data block will be followed by a 4-bytes checksum,
calculated by using the xxHash-32 algorithm on the raw (compressed) data block.
The intention is to detect data corruption (storage or transmission errors)
immediately, before decoding.
Block checksum usage is optional.
__Content Size flag__
If this flag is set, the uncompressed size of data included within the frame
will be present as an 8 bytes unsigned little endian value, after the flags.
Content Size usage is optional.
__Content checksum flag__
If this flag is set, a 32-bits content checksum will be appended
after the EndMark.
__Dictionary ID flag__
If this flag is set, a 4-bytes Dict-ID field will be present,
after the descriptor flags and the Content Size.
__Block Maximum Size__
This information is useful to help the decoder allocate memory.
Size here refers to the original (uncompressed) data size.
Block Maximum Size is one value among the following table :
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| --- | --- | --- | --- | ----- | ------ | ---- | ---- |
| N/A | N/A | N/A | N/A | 64 KB | 256 KB | 1 MB | 4 MB |
The decoder may refuse to allocate block sizes above any system-specific size.
Unused values may be used in a future revision of the spec.
A decoder conformant with the current version of the spec
is only able to decode block sizes defined in this spec.
__Reserved bits__
Value of reserved bits **must** be 0 (zero).
Reserved bit might be used in a future version of the specification,
typically enabling new optional features.
When this happens, a decoder respecting the current specification version
shall not be able to decode such a frame.
__Content Size__
This is the original (uncompressed) size.
This information is optional, and only present if the associated flag is set.
Content size is provided using unsigned 8 Bytes, for a maximum of 16 Exabytes.
Format is Little endian.
This value is informational, typically for display or memory allocation.
It can be skipped by a decoder, or used to validate content correctness.
__Dictionary ID__
Dict-ID is only present if the associated flag is set.
It's an unsigned 32-bits value, stored using little-endian convention.
A dictionary is useful to compress short input sequences.
The compressor can take advantage of the dictionary context
to encode the input in a more compact manner.
It works as a kind of “known prefix” which is used by
both the compressor and the decompressor to “warm-up” reference tables.
The decompressor can use Dict-ID identifier to determine
which dictionary must be used to correctly decode data.
The compressor and the decompressor must use exactly the same dictionary.
It's presumed that the 32-bits dictID uniquely identifies a dictionary.
Within a single frame, a single dictionary can be defined.
When the frame descriptor defines independent blocks,
each block will be initialized with the same dictionary.
If the frame descriptor defines linked blocks,
the dictionary will only be used once, at the beginning of the frame.
__Header Checksum__
One-byte checksum of combined descriptor fields, including optional ones.
The value is the second byte of `xxh32()` : ` (xxh32()>>8) & 0xFF `
using zero as a seed, and the full Frame Descriptor as an input
(including optional fields when they are present).
A wrong checksum indicates an error in the descriptor.
Header checksum is informational and can be skipped.
Data Blocks
-----------
| Block Size | data | (Block Checksum) |
|:----------:| ------ |:----------------:|
| 4 bytes | | 0 - 4 bytes |
__Block Size__
This field uses 4-bytes, format is little-endian.
If the highest bit is set (`1`), the block is uncompressed.
If the highest bit is not set (`0`), the block is LZ4-compressed,
using the [LZ4 block format specification](https://github.com/lz4/lz4/blob/dev/doc/lz4_Block_format.md).
All other bits give the size, in bytes, of the data section.
The size does not include the block checksum if present.
_Block_Size_ shall never be larger than _Block_Maximum_Size_.
Such an outcome could potentially happen for non-compressible sources.
In such a case, such data block must be passed using uncompressed format.
A value of `0x00000000` is invalid, and signifies an _EndMark_ instead.
Note that this is different from a value of `0x80000000` (highest bit set),
which is an uncompressed block of size 0 (empty),
which is valid, and therefore doesn't end a frame.
Note that, if _Block_checksum_ is enabled,
even an empty block must be followed by a 32-bit block checksum.
__Data__
Where the actual data to decode stands.
It might be compressed or not, depending on previous field indications.
When compressed, the data must respect the [LZ4 block format specification](https://github.com/lz4/lz4/blob/dev/doc/lz4_Block_format.md).
Note that a block is not necessarily full.
Uncompressed size of data can be any size __up to__ _Block_Maximum_Size_,
so it may contain less data than the maximum block size.
__Block checksum__
Only present if the associated flag is set.
This is a 4-bytes checksum value, in little endian format,
calculated by using the [xxHash-32 algorithm] on the __raw__ (undecoded) data block,
and a seed of zero.
The intention is to detect data corruption (storage or transmission errors)
before decoding.
_Block_checksum_ can be cumulative with _Content_checksum_.
[xxHash-32 algorithm]: https://github.com/Cyan4973/xxHash/blob/release/doc/xxhash_spec.md
Skippable Frames
----------------
| Magic Number | Frame Size | User Data |
|:------------:|:----------:| --------- |
| 4 bytes | 4 bytes | |
Skippable frames allow the integration of user-defined data
into a flow of concatenated frames.
Its design is pretty straightforward,
with the sole objective to allow the decoder to quickly skip
over user-defined data and continue decoding.
For the purpose of facilitating identification,
it is discouraged to start a flow of concatenated frames with a skippable frame.
If there is a need to start such a flow with some user data
encapsulated into a skippable frame,
its recommended to start with a zero-byte LZ4 frame
followed by a skippable frame.
This will make it easier for file type identifiers.
__Magic Number__
4 Bytes, Little endian format.
Value : 0x184D2A5X, which means any value from 0x184D2A50 to 0x184D2A5F.
All 16 values are valid to identify a skippable frame.
__Frame Size__
This is the size, in bytes, of the following User Data
(without including the magic number nor the size field itself).
4 Bytes, Little endian format, unsigned 32-bits.
This means User Data cant be bigger than (2^32-1) Bytes.
__User Data__
User Data can be anything. Data will just be skipped by the decoder.
Legacy frame
------------
The Legacy frame format was defined into the initial versions of “LZ4Demo”.
Newer compressors should not use this format anymore, as it is too restrictive.
Main characteristics of the legacy format :
- Fixed block size : 8 MB.
- All blocks must be completely filled, except the last one.
- All blocks are always compressed, even when compression is detrimental.
- The last block is detected either because
it is followed by the “EOF” (End of File) mark,
or because it is followed by a known Frame Magic Number.
- No checksum
- Convention is Little endian
| MagicNb | B.CSize | CData | B.CSize | CData | (...) | EndMark |
| ------- | ------- | ----- | ------- | ----- | ------- | ------- |
| 4 bytes | 4 bytes | CSize | 4 bytes | CSize | x times | EOF |
__Magic Number__
4 Bytes, Little endian format.
Value : 0x184C2102
__Block Compressed Size__
This is the size, in bytes, of the following compressed data block.
4 Bytes, Little endian format.
__Data__
Where the actual compressed data stands.
Data is always compressed, even when compression is detrimental.
__EndMark__
End of legacy frame is implicit only.
It must be followed by a standard EOF (End Of File) signal,
wether it is a file or a stream.
Alternatively, if the frame is followed by a valid Frame Magic Number,
it is considered completed.
This policy makes it possible to concatenate legacy frames.
Any other value will be interpreted as a block size,
and trigger an error if it does not fit within acceptable range.
Version changes
---------------
1.6.2 : clarifies specification of _EndMark_
1.6.1 : introduced terms "LZ4 Frame Header" and "LZ4 Frame Footer"
1.6.0 : restored Dictionary ID field in Frame header
1.5.1 : changed document format to MarkDown
1.5 : removed Dictionary ID from specification
1.4.1 : changed wording from “stream” to “frame”
1.4 : added skippable streams, re-added stream checksum
1.3 : modified header checksum
1.2 : reduced choice of “block size”, to postpone decision on “dynamic size of BlockSize Field”.
1.1 : optional fields are now part of the descriptor
1.0 : changed “block size” specification, adding a compressed/uncompressed flag
0.9 : reduced scale of “block maximum size” table
0.8 : removed : high compression flag
0.7 : removed : stream checksum
0.6 : settled : stream size uses 8 bytes, endian convention is little endian
0.5: added copyright notice
0.4 : changed format to Google Doc compatible OpenDocument

597
lz4/doc/lz4_manual.html Normal file
View file

@ -0,0 +1,597 @@
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>1.9.3 Manual</title>
</head>
<body>
<h1>1.9.3 Manual</h1>
<hr>
<a name="Contents"></a><h2>Contents</h2>
<ol>
<li><a href="#Chapter1">Introduction</a></li>
<li><a href="#Chapter2">Version</a></li>
<li><a href="#Chapter3">Tuning parameter</a></li>
<li><a href="#Chapter4">Simple Functions</a></li>
<li><a href="#Chapter5">Advanced Functions</a></li>
<li><a href="#Chapter6">Streaming Compression Functions</a></li>
<li><a href="#Chapter7">Streaming Decompression Functions</a></li>
<li><a href="#Chapter8">Experimental section</a></li>
<li><a href="#Chapter9">Private Definitions</a></li>
<li><a href="#Chapter10">Obsolete Functions</a></li>
</ol>
<hr>
<a name="Chapter1"></a><h2>Introduction</h2><pre>
LZ4 is lossless compression algorithm, providing compression speed >500 MB/s per core,
scalable with multi-cores CPU. It features an extremely fast decoder, with speed in
multiple GB/s per core, typically reaching RAM speed limits on multi-core systems.
The LZ4 compression library provides in-memory compression and decompression functions.
It gives full buffer control to user.
Compression can be done in:
- a single step (described as Simple Functions)
- a single step, reusing a context (described in Advanced Functions)
- unbounded multiple steps (described as Streaming compression)
lz4.h generates and decodes LZ4-compressed blocks (doc/lz4_Block_format.md).
Decompressing such a compressed block requires additional metadata.
Exact metadata depends on exact decompression function.
For the typical case of LZ4_decompress_safe(),
metadata includes block's compressed size, and maximum bound of decompressed size.
Each application is free to encode and pass such metadata in whichever way it wants.
lz4.h only handle blocks, it can not generate Frames.
Blocks are different from Frames (doc/lz4_Frame_format.md).
Frames bundle both blocks and metadata in a specified manner.
Embedding metadata is required for compressed data to be self-contained and portable.
Frame format is delivered through a companion API, declared in lz4frame.h.
The `lz4` CLI can only manage frames.
<BR></pre>
<a name="Chapter2"></a><h2>Version</h2><pre></pre>
<pre><b>int LZ4_versionNumber (void); </b>/**< library version number; useful to check dll version */<b>
</b></pre><BR>
<pre><b>const char* LZ4_versionString (void); </b>/**< library version string; useful to check dll version */<b>
</b></pre><BR>
<a name="Chapter3"></a><h2>Tuning parameter</h2><pre></pre>
<pre><b>#ifndef LZ4_MEMORY_USAGE
# define LZ4_MEMORY_USAGE 14
#endif
</b><p> Memory usage formula : N->2^N Bytes (examples : 10 -> 1KB; 12 -> 4KB ; 16 -> 64KB; 20 -> 1MB; etc.)
Increasing memory usage improves compression ratio.
Reduced memory usage may improve speed, thanks to better cache locality.
Default value is 14, for 16KB, which nicely fits into Intel x86 L1 cache
</p></pre><BR>
<a name="Chapter4"></a><h2>Simple Functions</h2><pre></pre>
<pre><b>int LZ4_compress_default(const char* src, char* dst, int srcSize, int dstCapacity);
</b><p> Compresses 'srcSize' bytes from buffer 'src'
into already allocated 'dst' buffer of size 'dstCapacity'.
Compression is guaranteed to succeed if 'dstCapacity' >= LZ4_compressBound(srcSize).
It also runs faster, so it's a recommended setting.
If the function cannot compress 'src' into a more limited 'dst' budget,
compression stops *immediately*, and the function result is zero.
In which case, 'dst' content is undefined (invalid).
srcSize : max supported value is LZ4_MAX_INPUT_SIZE.
dstCapacity : size of buffer 'dst' (which must be already allocated)
@return : the number of bytes written into buffer 'dst' (necessarily <= dstCapacity)
or 0 if compression fails
Note : This function is protected against buffer overflow scenarios (never writes outside 'dst' buffer, nor read outside 'source' buffer).
</p></pre><BR>
<pre><b>int LZ4_decompress_safe (const char* src, char* dst, int compressedSize, int dstCapacity);
</b><p> compressedSize : is the exact complete size of the compressed block.
dstCapacity : is the size of destination buffer (which must be already allocated), presumed an upper bound of decompressed size.
@return : the number of bytes decompressed into destination buffer (necessarily <= dstCapacity)
If destination buffer is not large enough, decoding will stop and output an error code (negative value).
If the source stream is detected malformed, the function will stop decoding and return a negative result.
Note 1 : This function is protected against malicious data packets :
it will never writes outside 'dst' buffer, nor read outside 'source' buffer,
even if the compressed block is maliciously modified to order the decoder to do these actions.
In such case, the decoder stops immediately, and considers the compressed block malformed.
Note 2 : compressedSize and dstCapacity must be provided to the function, the compressed block does not contain them.
The implementation is free to send / store / derive this information in whichever way is most beneficial.
If there is a need for a different format which bundles together both compressed data and its metadata, consider looking at lz4frame.h instead.
</p></pre><BR>
<a name="Chapter5"></a><h2>Advanced Functions</h2><pre></pre>
<pre><b>int LZ4_compressBound(int inputSize);
</b><p> Provides the maximum size that LZ4 compression may output in a "worst case" scenario (input data not compressible)
This function is primarily useful for memory allocation purposes (destination buffer size).
Macro LZ4_COMPRESSBOUND() is also provided for compilation-time evaluation (stack memory allocation for example).
Note that LZ4_compress_default() compresses faster when dstCapacity is >= LZ4_compressBound(srcSize)
inputSize : max supported value is LZ4_MAX_INPUT_SIZE
return : maximum output size in a "worst case" scenario
or 0, if input size is incorrect (too large or negative)
</p></pre><BR>
<pre><b>int LZ4_compress_fast (const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);
</b><p> Same as LZ4_compress_default(), but allows selection of "acceleration" factor.
The larger the acceleration value, the faster the algorithm, but also the lesser the compression.
It's a trade-off. It can be fine tuned, with each successive value providing roughly +~3% to speed.
An acceleration value of "1" is the same as regular LZ4_compress_default()
Values <= 0 will be replaced by LZ4_ACCELERATION_DEFAULT (currently == 1, see lz4.c).
Values > LZ4_ACCELERATION_MAX will be replaced by LZ4_ACCELERATION_MAX (currently == 65537, see lz4.c).
</p></pre><BR>
<pre><b>int LZ4_sizeofState(void);
int LZ4_compress_fast_extState (void* state, const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);
</b><p> Same as LZ4_compress_fast(), using an externally allocated memory space for its state.
Use LZ4_sizeofState() to know how much memory must be allocated,
and allocate it on 8-bytes boundaries (using `malloc()` typically).
Then, provide this buffer as `void* state` to compression function.
</p></pre><BR>
<pre><b>int LZ4_compress_destSize (const char* src, char* dst, int* srcSizePtr, int targetDstSize);
</b><p> Reverse the logic : compresses as much data as possible from 'src' buffer
into already allocated buffer 'dst', of size >= 'targetDestSize'.
This function either compresses the entire 'src' content into 'dst' if it's large enough,
or fill 'dst' buffer completely with as much data as possible from 'src'.
note: acceleration parameter is fixed to "default".
*srcSizePtr : will be modified to indicate how many bytes where read from 'src' to fill 'dst'.
New value is necessarily <= input value.
@return : Nb bytes written into 'dst' (necessarily <= targetDestSize)
or 0 if compression fails.
Note : from v1.8.2 to v1.9.1, this function had a bug (fixed un v1.9.2+):
the produced compressed content could, in specific circumstances,
require to be decompressed into a destination buffer larger
by at least 1 byte than the content to decompress.
If an application uses `LZ4_compress_destSize()`,
it's highly recommended to update liblz4 to v1.9.2 or better.
If this can't be done or ensured,
the receiving decompression function should provide
a dstCapacity which is > decompressedSize, by at least 1 byte.
See https://github.com/lz4/lz4/issues/859 for details
</p></pre><BR>
<pre><b>int LZ4_decompress_safe_partial (const char* src, char* dst, int srcSize, int targetOutputSize, int dstCapacity);
</b><p> Decompress an LZ4 compressed block, of size 'srcSize' at position 'src',
into destination buffer 'dst' of size 'dstCapacity'.
Up to 'targetOutputSize' bytes will be decoded.
The function stops decoding on reaching this objective.
This can be useful to boost performance
whenever only the beginning of a block is required.
@return : the number of bytes decoded in `dst` (necessarily <= targetOutputSize)
If source stream is detected malformed, function returns a negative result.
Note 1 : @return can be < targetOutputSize, if compressed block contains less data.
Note 2 : targetOutputSize must be <= dstCapacity
Note 3 : this function effectively stops decoding on reaching targetOutputSize,
so dstCapacity is kind of redundant.
This is because in older versions of this function,
decoding operation would still write complete sequences.
Therefore, there was no guarantee that it would stop writing at exactly targetOutputSize,
it could write more bytes, though only up to dstCapacity.
Some "margin" used to be required for this operation to work properly.
Thankfully, this is no longer necessary.
The function nonetheless keeps the same signature, in an effort to preserve API compatibility.
Note 4 : If srcSize is the exact size of the block,
then targetOutputSize can be any value,
including larger than the block's decompressed size.
The function will, at most, generate block's decompressed size.
Note 5 : If srcSize is _larger_ than block's compressed size,
then targetOutputSize **MUST** be <= block's decompressed size.
Otherwise, *silent corruption will occur*.
</p></pre><BR>
<a name="Chapter6"></a><h2>Streaming Compression Functions</h2><pre></pre>
<pre><b>void LZ4_resetStream_fast (LZ4_stream_t* streamPtr);
</b><p> Use this to prepare an LZ4_stream_t for a new chain of dependent blocks
(e.g., LZ4_compress_fast_continue()).
An LZ4_stream_t must be initialized once before usage.
This is automatically done when created by LZ4_createStream().
However, should the LZ4_stream_t be simply declared on stack (for example),
it's necessary to initialize it first, using LZ4_initStream().
After init, start any new stream with LZ4_resetStream_fast().
A same LZ4_stream_t can be re-used multiple times consecutively
and compress multiple streams,
provided that it starts each new stream with LZ4_resetStream_fast().
LZ4_resetStream_fast() is much faster than LZ4_initStream(),
but is not compatible with memory regions containing garbage data.
Note: it's only useful to call LZ4_resetStream_fast()
in the context of streaming compression.
The *extState* functions perform their own resets.
Invoking LZ4_resetStream_fast() before is redundant, and even counterproductive.
</p></pre><BR>
<pre><b>int LZ4_loadDict (LZ4_stream_t* streamPtr, const char* dictionary, int dictSize);
</b><p> Use this function to reference a static dictionary into LZ4_stream_t.
The dictionary must remain available during compression.
LZ4_loadDict() triggers a reset, so any previous data will be forgotten.
The same dictionary will have to be loaded on decompression side for successful decoding.
Dictionary are useful for better compression of small data (KB range).
While LZ4 accept any input as dictionary,
results are generally better when using Zstandard's Dictionary Builder.
Loading a size of 0 is allowed, and is the same as reset.
@return : loaded dictionary size, in bytes (necessarily <= 64 KB)
</p></pre><BR>
<pre><b>int LZ4_compress_fast_continue (LZ4_stream_t* streamPtr, const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);
</b><p> Compress 'src' content using data from previously compressed blocks, for better compression ratio.
'dst' buffer must be already allocated.
If dstCapacity >= LZ4_compressBound(srcSize), compression is guaranteed to succeed, and runs faster.
@return : size of compressed block
or 0 if there is an error (typically, cannot fit into 'dst').
Note 1 : Each invocation to LZ4_compress_fast_continue() generates a new block.
Each block has precise boundaries.
Each block must be decompressed separately, calling LZ4_decompress_*() with relevant metadata.
It's not possible to append blocks together and expect a single invocation of LZ4_decompress_*() to decompress them together.
Note 2 : The previous 64KB of source data is __assumed__ to remain present, unmodified, at same address in memory !
Note 3 : When input is structured as a double-buffer, each buffer can have any size, including < 64 KB.
Make sure that buffers are separated, by at least one byte.
This construction ensures that each block only depends on previous block.
Note 4 : If input buffer is a ring-buffer, it can have any size, including < 64 KB.
Note 5 : After an error, the stream status is undefined (invalid), it can only be reset or freed.
</p></pre><BR>
<pre><b>int LZ4_saveDict (LZ4_stream_t* streamPtr, char* safeBuffer, int maxDictSize);
</b><p> If last 64KB data cannot be guaranteed to remain available at its current memory location,
save it into a safer place (char* safeBuffer).
This is schematically equivalent to a memcpy() followed by LZ4_loadDict(),
but is much faster, because LZ4_saveDict() doesn't need to rebuild tables.
@return : saved dictionary size in bytes (necessarily <= maxDictSize), or 0 if error.
</p></pre><BR>
<a name="Chapter7"></a><h2>Streaming Decompression Functions</h2><pre> Bufferless synchronous API
<BR></pre>
<pre><b>LZ4_streamDecode_t* LZ4_createStreamDecode(void);
int LZ4_freeStreamDecode (LZ4_streamDecode_t* LZ4_stream);
</b><p> creation / destruction of streaming decompression tracking context.
A tracking context can be re-used multiple times.
</p></pre><BR>
<pre><b>int LZ4_setStreamDecode (LZ4_streamDecode_t* LZ4_streamDecode, const char* dictionary, int dictSize);
</b><p> An LZ4_streamDecode_t context can be allocated once and re-used multiple times.
Use this function to start decompression of a new stream of blocks.
A dictionary can optionally be set. Use NULL or size 0 for a reset order.
Dictionary is presumed stable : it must remain accessible and unmodified during next decompression.
@return : 1 if OK, 0 if error
</p></pre><BR>
<pre><b>int LZ4_decoderRingBufferSize(int maxBlockSize);
#define LZ4_DECODER_RING_BUFFER_SIZE(maxBlockSize) (65536 + 14 + (maxBlockSize)) </b>/* for static allocation; maxBlockSize presumed valid */<b>
</b><p> Note : in a ring buffer scenario (optional),
blocks are presumed decompressed next to each other
up to the moment there is not enough remaining space for next block (remainingSize < maxBlockSize),
at which stage it resumes from beginning of ring buffer.
When setting such a ring buffer for streaming decompression,
provides the minimum size of this ring buffer
to be compatible with any source respecting maxBlockSize condition.
@return : minimum ring buffer size,
or 0 if there is an error (invalid maxBlockSize).
</p></pre><BR>
<pre><b>int LZ4_decompress_safe_continue (LZ4_streamDecode_t* LZ4_streamDecode, const char* src, char* dst, int srcSize, int dstCapacity);
</b><p> These decoding functions allow decompression of consecutive blocks in "streaming" mode.
A block is an unsplittable entity, it must be presented entirely to a decompression function.
Decompression functions only accepts one block at a time.
The last 64KB of previously decoded data *must* remain available and unmodified at the memory position where they were decoded.
If less than 64KB of data has been decoded, all the data must be present.
Special : if decompression side sets a ring buffer, it must respect one of the following conditions :
- Decompression buffer size is _at least_ LZ4_decoderRingBufferSize(maxBlockSize).
maxBlockSize is the maximum size of any single block. It can have any value > 16 bytes.
In which case, encoding and decoding buffers do not need to be synchronized.
Actually, data can be produced by any source compliant with LZ4 format specification, and respecting maxBlockSize.
- Synchronized mode :
Decompression buffer size is _exactly_ the same as compression buffer size,
and follows exactly same update rule (block boundaries at same positions),
and decoding function is provided with exact decompressed size of each block (exception for last block of the stream),
_then_ decoding & encoding ring buffer can have any size, including small ones ( < 64 KB).
- Decompression buffer is larger than encoding buffer, by a minimum of maxBlockSize more bytes.
In which case, encoding and decoding buffers do not need to be synchronized,
and encoding ring buffer can have any size, including small ones ( < 64 KB).
Whenever these conditions are not possible,
save the last 64KB of decoded data into a safe buffer where it can't be modified during decompression,
then indicate where this data is saved using LZ4_setStreamDecode(), before decompressing next block.
</p></pre><BR>
<pre><b>int LZ4_decompress_safe_usingDict (const char* src, char* dst, int srcSize, int dstCapcity, const char* dictStart, int dictSize);
</b><p> These decoding functions work the same as
a combination of LZ4_setStreamDecode() followed by LZ4_decompress_*_continue()
They are stand-alone, and don't need an LZ4_streamDecode_t structure.
Dictionary is presumed stable : it must remain accessible and unmodified during decompression.
Performance tip : Decompression speed can be substantially increased
when dst == dictStart + dictSize.
</p></pre><BR>
<a name="Chapter8"></a><h2>Experimental section</h2><pre>
Symbols declared in this section must be considered unstable. Their
signatures or semantics may change, or they may be removed altogether in the
future. They are therefore only safe to depend on when the caller is
statically linked against the library.
To protect against unsafe usage, not only are the declarations guarded,
the definitions are hidden by default
when building LZ4 as a shared/dynamic library.
In order to access these declarations,
define LZ4_STATIC_LINKING_ONLY in your application
before including LZ4's headers.
In order to make their implementations accessible dynamically, you must
define LZ4_PUBLISH_STATIC_FUNCTIONS when building the LZ4 library.
<BR></pre>
<pre><b>LZ4LIB_STATIC_API int LZ4_compress_fast_extState_fastReset (void* state, const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);
</b><p> A variant of LZ4_compress_fast_extState().
Using this variant avoids an expensive initialization step.
It is only safe to call if the state buffer is known to be correctly initialized already
(see above comment on LZ4_resetStream_fast() for a definition of "correctly initialized").
From a high level, the difference is that
this function initializes the provided state with a call to something like LZ4_resetStream_fast()
while LZ4_compress_fast_extState() starts with a call to LZ4_resetStream().
</p></pre><BR>
<pre><b>LZ4LIB_STATIC_API void LZ4_attach_dictionary(LZ4_stream_t* workingStream, const LZ4_stream_t* dictionaryStream);
</b><p> This is an experimental API that allows
efficient use of a static dictionary many times.
Rather than re-loading the dictionary buffer into a working context before
each compression, or copying a pre-loaded dictionary's LZ4_stream_t into a
working LZ4_stream_t, this function introduces a no-copy setup mechanism,
in which the working stream references the dictionary stream in-place.
Several assumptions are made about the state of the dictionary stream.
Currently, only streams which have been prepared by LZ4_loadDict() should
be expected to work.
Alternatively, the provided dictionaryStream may be NULL,
in which case any existing dictionary stream is unset.
If a dictionary is provided, it replaces any pre-existing stream history.
The dictionary contents are the only history that can be referenced and
logically immediately precede the data compressed in the first subsequent
compression call.
The dictionary will only remain attached to the working stream through the
first compression call, at the end of which it is cleared. The dictionary
stream (and source buffer) must remain in-place / accessible / unchanged
through the completion of the first compression call on the stream.
</p></pre><BR>
<pre><b></b><p>
It's possible to have input and output sharing the same buffer,
for highly contrained memory environments.
In both cases, it requires input to lay at the end of the buffer,
and decompression to start at beginning of the buffer.
Buffer size must feature some margin, hence be larger than final size.
|<------------------------buffer--------------------------------->|
|<-----------compressed data--------->|
|<-----------decompressed size------------------>|
|<----margin---->|
This technique is more useful for decompression,
since decompressed size is typically larger,
and margin is short.
In-place decompression will work inside any buffer
which size is >= LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE(decompressedSize).
This presumes that decompressedSize > compressedSize.
Otherwise, it means compression actually expanded data,
and it would be more efficient to store such data with a flag indicating it's not compressed.
This can happen when data is not compressible (already compressed, or encrypted).
For in-place compression, margin is larger, as it must be able to cope with both
history preservation, requiring input data to remain unmodified up to LZ4_DISTANCE_MAX,
and data expansion, which can happen when input is not compressible.
As a consequence, buffer size requirements are much higher,
and memory savings offered by in-place compression are more limited.
There are ways to limit this cost for compression :
- Reduce history size, by modifying LZ4_DISTANCE_MAX.
Note that it is a compile-time constant, so all compressions will apply this limit.
Lower values will reduce compression ratio, except when input_size < LZ4_DISTANCE_MAX,
so it's a reasonable trick when inputs are known to be small.
- Require the compressor to deliver a "maximum compressed size".
This is the `dstCapacity` parameter in `LZ4_compress*()`.
When this size is < LZ4_COMPRESSBOUND(inputSize), then compression can fail,
in which case, the return code will be 0 (zero).
The caller must be ready for these cases to happen,
and typically design a backup scheme to send data uncompressed.
The combination of both techniques can significantly reduce
the amount of margin required for in-place compression.
In-place compression can work in any buffer
which size is >= (maxCompressedSize)
with maxCompressedSize == LZ4_COMPRESSBOUND(srcSize) for guaranteed compression success.
LZ4_COMPRESS_INPLACE_BUFFER_SIZE() depends on both maxCompressedSize and LZ4_DISTANCE_MAX,
so it's possible to reduce memory requirements by playing with them.
</p></pre><BR>
<pre><b>#define LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE(decompressedSize) ((decompressedSize) + LZ4_DECOMPRESS_INPLACE_MARGIN(decompressedSize)) </b>/**< note: presumes that compressedSize < decompressedSize. note2: margin is overestimated a bit, since it could use compressedSize instead */<b>
</b></pre><BR>
<pre><b>#define LZ4_COMPRESS_INPLACE_BUFFER_SIZE(maxCompressedSize) ((maxCompressedSize) + LZ4_COMPRESS_INPLACE_MARGIN) </b>/**< maxCompressedSize is generally LZ4_COMPRESSBOUND(inputSize), but can be set to any lower value, with the risk that compression can fail (return code 0(zero)) */<b>
</b></pre><BR>
<a name="Chapter9"></a><h2>Private Definitions</h2><pre>
Do not use these definitions directly.
They are only exposed to allow static allocation of `LZ4_stream_t` and `LZ4_streamDecode_t`.
Accessing members will expose user code to API and/or ABI break in future versions of the library.
<BR></pre>
<pre><b>typedef struct {
const LZ4_byte* externalDict;
size_t extDictSize;
const LZ4_byte* prefixEnd;
size_t prefixSize;
} LZ4_streamDecode_t_internal;
</b></pre><BR>
<pre><b>#define LZ4_STREAMSIZE 16416 </b>/* static size, for inter-version compatibility */<b>
#define LZ4_STREAMSIZE_VOIDP (LZ4_STREAMSIZE / sizeof(void*))
union LZ4_stream_u {
void* table[LZ4_STREAMSIZE_VOIDP];
LZ4_stream_t_internal internal_donotuse;
}; </b>/* previously typedef'd to LZ4_stream_t */<b>
</b><p> Do not use below internal definitions directly !
Declare or allocate an LZ4_stream_t instead.
LZ4_stream_t can also be created using LZ4_createStream(), which is recommended.
The structure definition can be convenient for static allocation
(on stack, or as part of larger structure).
Init this structure with LZ4_initStream() before first use.
note : only use this definition in association with static linking !
this definition is not API/ABI safe, and may change in future versions.
</p></pre><BR>
<pre><b>LZ4_stream_t* LZ4_initStream (void* buffer, size_t size);
</b><p> An LZ4_stream_t structure must be initialized at least once.
This is automatically done when invoking LZ4_createStream(),
but it's not when the structure is simply declared on stack (for example).
Use LZ4_initStream() to properly initialize a newly declared LZ4_stream_t.
It can also initialize any arbitrary buffer of sufficient size,
and will @return a pointer of proper type upon initialization.
Note : initialization fails if size and alignment conditions are not respected.
In which case, the function will @return NULL.
Note2: An LZ4_stream_t structure guarantees correct alignment and size.
Note3: Before v1.9.0, use LZ4_resetStream() instead
</p></pre><BR>
<pre><b>#define LZ4_STREAMDECODESIZE_U64 (4 + ((sizeof(void*)==16) ? 2 : 0) </b>/*AS-400*/ )<b>
#define LZ4_STREAMDECODESIZE (LZ4_STREAMDECODESIZE_U64 * sizeof(unsigned long long))
union LZ4_streamDecode_u {
unsigned long long table[LZ4_STREAMDECODESIZE_U64];
LZ4_streamDecode_t_internal internal_donotuse;
} ; </b>/* previously typedef'd to LZ4_streamDecode_t */<b>
</b><p> information structure to track an LZ4 stream during decompression.
init this structure using LZ4_setStreamDecode() before first use.
note : only use in association with static linking !
this definition is not API/ABI safe,
and may change in a future version !
</p></pre><BR>
<a name="Chapter10"></a><h2>Obsolete Functions</h2><pre></pre>
<pre><b>#ifdef LZ4_DISABLE_DEPRECATE_WARNINGS
# define LZ4_DEPRECATED(message) </b>/* disable deprecation warnings */<b>
#else
# if defined (__cplusplus) && (__cplusplus >= 201402) </b>/* C++14 or greater */<b>
# define LZ4_DEPRECATED(message) [[deprecated(message)]]
# elif defined(_MSC_VER)
# define LZ4_DEPRECATED(message) __declspec(deprecated(message))
# elif defined(__clang__) || (defined(__GNUC__) && (__GNUC__ * 10 + __GNUC_MINOR__ >= 45))
# define LZ4_DEPRECATED(message) __attribute__((deprecated(message)))
# elif defined(__GNUC__) && (__GNUC__ * 10 + __GNUC_MINOR__ >= 31)
# define LZ4_DEPRECATED(message) __attribute__((deprecated))
# else
# pragma message("WARNING: LZ4_DEPRECATED needs custom implementation for this compiler")
# define LZ4_DEPRECATED(message) </b>/* disabled */<b>
# endif
#endif </b>/* LZ4_DISABLE_DEPRECATE_WARNINGS */<b>
</b><p>
Deprecated functions make the compiler generate a warning when invoked.
This is meant to invite users to update their source code.
Should deprecation warnings be a problem, it is generally possible to disable them,
typically with -Wno-deprecated-declarations for gcc
or _CRT_SECURE_NO_WARNINGS in Visual.
Another method is to define LZ4_DISABLE_DEPRECATE_WARNINGS
before including the header file.
</p></pre><BR>
<pre><b>LZ4_DEPRECATED("use LZ4_compress_default() instead") LZ4LIB_API int LZ4_compress (const char* src, char* dest, int srcSize);
LZ4_DEPRECATED("use LZ4_compress_default() instead") LZ4LIB_API int LZ4_compress_limitedOutput (const char* src, char* dest, int srcSize, int maxOutputSize);
LZ4_DEPRECATED("use LZ4_compress_fast_extState() instead") LZ4LIB_API int LZ4_compress_withState (void* state, const char* source, char* dest, int inputSize);
LZ4_DEPRECATED("use LZ4_compress_fast_extState() instead") LZ4LIB_API int LZ4_compress_limitedOutput_withState (void* state, const char* source, char* dest, int inputSize, int maxOutputSize);
LZ4_DEPRECATED("use LZ4_compress_fast_continue() instead") LZ4LIB_API int LZ4_compress_continue (LZ4_stream_t* LZ4_streamPtr, const char* source, char* dest, int inputSize);
LZ4_DEPRECATED("use LZ4_compress_fast_continue() instead") LZ4LIB_API int LZ4_compress_limitedOutput_continue (LZ4_stream_t* LZ4_streamPtr, const char* source, char* dest, int inputSize, int maxOutputSize);
</b><p></p></pre><BR>
<pre><b>LZ4_DEPRECATED("use LZ4_decompress_fast() instead") LZ4LIB_API int LZ4_uncompress (const char* source, char* dest, int outputSize);
LZ4_DEPRECATED("use LZ4_decompress_safe() instead") LZ4LIB_API int LZ4_uncompress_unknownOutputSize (const char* source, char* dest, int isize, int maxOutputSize);
</b><p></p></pre><BR>
<pre><b>LZ4_DEPRECATED("use LZ4_decompress_safe_usingDict() instead") LZ4LIB_API int LZ4_decompress_safe_withPrefix64k (const char* src, char* dst, int compressedSize, int maxDstSize);
LZ4_DEPRECATED("use LZ4_decompress_fast_usingDict() instead") LZ4LIB_API int LZ4_decompress_fast_withPrefix64k (const char* src, char* dst, int originalSize);
</b><p></p></pre><BR>
<pre><b>LZ4_DEPRECATED("This function is deprecated and unsafe. Consider using LZ4_decompress_safe() instead")
int LZ4_decompress_fast (const char* src, char* dst, int originalSize);
LZ4_DEPRECATED("This function is deprecated and unsafe. Consider using LZ4_decompress_safe_continue() instead")
int LZ4_decompress_fast_continue (LZ4_streamDecode_t* LZ4_streamDecode, const char* src, char* dst, int originalSize);
LZ4_DEPRECATED("This function is deprecated and unsafe. Consider using LZ4_decompress_safe_usingDict() instead")
int LZ4_decompress_fast_usingDict (const char* src, char* dst, int originalSize, const char* dictStart, int dictSize);
</b><p> These functions used to be faster than LZ4_decompress_safe(),
but this is no longer the case. They are now slower.
This is because LZ4_decompress_fast() doesn't know the input size,
and therefore must progress more cautiously into the input buffer to not read beyond the end of block.
On top of that `LZ4_decompress_fast()` is not protected vs malformed or malicious inputs, making it a security liability.
As a consequence, LZ4_decompress_fast() is strongly discouraged, and deprecated.
The last remaining LZ4_decompress_fast() specificity is that
it can decompress a block without knowing its compressed size.
Such functionality can be achieved in a more secure manner
by employing LZ4_decompress_safe_partial().
Parameters:
originalSize : is the uncompressed size to regenerate.
`dst` must be already allocated, its size must be >= 'originalSize' bytes.
@return : number of bytes read from source buffer (== compressed size).
The function expects to finish at block's end exactly.
If the source stream is detected malformed, the function stops decoding and returns a negative result.
note : LZ4_decompress_fast*() requires originalSize. Thanks to this information, it never writes past the output buffer.
However, since it doesn't know its 'src' size, it may read an unknown amount of input, past input buffer bounds.
Also, since match offsets are not validated, match reads from 'src' may underflow too.
These issues never happen if input (compressed) data is correct.
But they may happen if input data is invalid (error or intentional tampering).
As a consequence, use these functions in trusted environments with trusted data **only**.
</p></pre><BR>
<pre><b>void LZ4_resetStream (LZ4_stream_t* streamPtr);
</b><p> An LZ4_stream_t structure must be initialized at least once.
This is done with LZ4_initStream(), or LZ4_resetStream().
Consider switching to LZ4_initStream(),
invoking LZ4_resetStream() will trigger deprecation warnings in the future.
</p></pre><BR>
</html>
</body>

View file

@ -0,0 +1,396 @@
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>1.9.3 Manual</title>
</head>
<body>
<h1>1.9.3 Manual</h1>
<hr>
<a name="Contents"></a><h2>Contents</h2>
<ol>
<li><a href="#Chapter1">Introduction</a></li>
<li><a href="#Chapter2">Compiler specifics</a></li>
<li><a href="#Chapter3">Error management</a></li>
<li><a href="#Chapter4">Frame compression types</a></li>
<li><a href="#Chapter5">Simple compression function</a></li>
<li><a href="#Chapter6">Advanced compression functions</a></li>
<li><a href="#Chapter7">Resource Management</a></li>
<li><a href="#Chapter8">Compression</a></li>
<li><a href="#Chapter9">Decompression functions</a></li>
<li><a href="#Chapter10">Streaming decompression functions</a></li>
<li><a href="#Chapter11">Bulk processing dictionary API</a></li>
</ol>
<hr>
<a name="Chapter1"></a><h2>Introduction</h2><pre>
lz4frame.h implements LZ4 frame specification (doc/lz4_Frame_format.md).
lz4frame.h provides frame compression functions that take care
of encoding standard metadata alongside LZ4-compressed blocks.
<BR></pre>
<a name="Chapter2"></a><h2>Compiler specifics</h2><pre></pre>
<a name="Chapter3"></a><h2>Error management</h2><pre></pre>
<pre><b>unsigned LZ4F_isError(LZ4F_errorCode_t code); </b>/**< tells when a function result is an error code */<b>
</b></pre><BR>
<pre><b>const char* LZ4F_getErrorName(LZ4F_errorCode_t code); </b>/**< return error code string; for debugging */<b>
</b></pre><BR>
<a name="Chapter4"></a><h2>Frame compression types</h2><pre></pre>
<pre><b>typedef enum {
LZ4F_default=0,
LZ4F_max64KB=4,
LZ4F_max256KB=5,
LZ4F_max1MB=6,
LZ4F_max4MB=7
LZ4F_OBSOLETE_ENUM(max64KB)
LZ4F_OBSOLETE_ENUM(max256KB)
LZ4F_OBSOLETE_ENUM(max1MB)
LZ4F_OBSOLETE_ENUM(max4MB)
} LZ4F_blockSizeID_t;
</b></pre><BR>
<pre><b>typedef enum {
LZ4F_blockLinked=0,
LZ4F_blockIndependent
LZ4F_OBSOLETE_ENUM(blockLinked)
LZ4F_OBSOLETE_ENUM(blockIndependent)
} LZ4F_blockMode_t;
</b></pre><BR>
<pre><b>typedef enum {
LZ4F_noContentChecksum=0,
LZ4F_contentChecksumEnabled
LZ4F_OBSOLETE_ENUM(noContentChecksum)
LZ4F_OBSOLETE_ENUM(contentChecksumEnabled)
} LZ4F_contentChecksum_t;
</b></pre><BR>
<pre><b>typedef enum {
LZ4F_noBlockChecksum=0,
LZ4F_blockChecksumEnabled
} LZ4F_blockChecksum_t;
</b></pre><BR>
<pre><b>typedef enum {
LZ4F_frame=0,
LZ4F_skippableFrame
LZ4F_OBSOLETE_ENUM(skippableFrame)
} LZ4F_frameType_t;
</b></pre><BR>
<pre><b>typedef struct {
LZ4F_blockSizeID_t blockSizeID; </b>/* max64KB, max256KB, max1MB, max4MB; 0 == default */<b>
LZ4F_blockMode_t blockMode; </b>/* LZ4F_blockLinked, LZ4F_blockIndependent; 0 == default */<b>
LZ4F_contentChecksum_t contentChecksumFlag; </b>/* 1: frame terminated with 32-bit checksum of decompressed data; 0: disabled (default) */<b>
LZ4F_frameType_t frameType; </b>/* read-only field : LZ4F_frame or LZ4F_skippableFrame */<b>
unsigned long long contentSize; </b>/* Size of uncompressed content ; 0 == unknown */<b>
unsigned dictID; </b>/* Dictionary ID, sent by compressor to help decoder select correct dictionary; 0 == no dictID provided */<b>
LZ4F_blockChecksum_t blockChecksumFlag; </b>/* 1: each block followed by a checksum of block's compressed data; 0: disabled (default) */<b>
} LZ4F_frameInfo_t;
</b><p> makes it possible to set or read frame parameters.
Structure must be first init to 0, using memset() or LZ4F_INIT_FRAMEINFO,
setting all parameters to default.
It's then possible to update selectively some parameters
</p></pre><BR>
<pre><b>typedef struct {
LZ4F_frameInfo_t frameInfo;
int compressionLevel; </b>/* 0: default (fast mode); values > LZ4HC_CLEVEL_MAX count as LZ4HC_CLEVEL_MAX; values < 0 trigger "fast acceleration" */<b>
unsigned autoFlush; </b>/* 1: always flush; reduces usage of internal buffers */<b>
unsigned favorDecSpeed; </b>/* 1: parser favors decompression speed vs compression ratio. Only works for high compression modes (>= LZ4HC_CLEVEL_OPT_MIN) */ /* v1.8.2+ */<b>
unsigned reserved[3]; </b>/* must be zero for forward compatibility */<b>
} LZ4F_preferences_t;
</b><p> makes it possible to supply advanced compression instructions to streaming interface.
Structure must be first init to 0, using memset() or LZ4F_INIT_PREFERENCES,
setting all parameters to default.
All reserved fields must be set to zero.
</p></pre><BR>
<a name="Chapter5"></a><h2>Simple compression function</h2><pre></pre>
<pre><b>size_t LZ4F_compressFrameBound(size_t srcSize, const LZ4F_preferences_t* preferencesPtr);
</b><p> Returns the maximum possible compressed size with LZ4F_compressFrame() given srcSize and preferences.
`preferencesPtr` is optional. It can be replaced by NULL, in which case, the function will assume default preferences.
Note : this result is only usable with LZ4F_compressFrame().
It may also be used with LZ4F_compressUpdate() _if no flush() operation_ is performed.
</p></pre><BR>
<pre><b>size_t LZ4F_compressFrame(void* dstBuffer, size_t dstCapacity,
const void* srcBuffer, size_t srcSize,
const LZ4F_preferences_t* preferencesPtr);
</b><p> Compress an entire srcBuffer into a valid LZ4 frame.
dstCapacity MUST be >= LZ4F_compressFrameBound(srcSize, preferencesPtr).
The LZ4F_preferences_t structure is optional : you can provide NULL as argument. All preferences will be set to default.
@return : number of bytes written into dstBuffer.
or an error code if it fails (can be tested using LZ4F_isError())
</p></pre><BR>
<a name="Chapter6"></a><h2>Advanced compression functions</h2><pre></pre>
<pre><b>typedef struct {
unsigned stableSrc; </b>/* 1 == src content will remain present on future calls to LZ4F_compress(); skip copying src content within tmp buffer */<b>
unsigned reserved[3];
} LZ4F_compressOptions_t;
</b></pre><BR>
<a name="Chapter7"></a><h2>Resource Management</h2><pre></pre>
<pre><b>LZ4F_errorCode_t LZ4F_createCompressionContext(LZ4F_cctx** cctxPtr, unsigned version);
LZ4F_errorCode_t LZ4F_freeCompressionContext(LZ4F_cctx* cctx);
</b><p> The first thing to do is to create a compressionContext object, which will be used in all compression operations.
This is achieved using LZ4F_createCompressionContext(), which takes as argument a version.
The version provided MUST be LZ4F_VERSION. It is intended to track potential version mismatch, notably when using DLL.
The function will provide a pointer to a fully allocated LZ4F_cctx object.
If @return != zero, there was an error during context creation.
Object can release its memory using LZ4F_freeCompressionContext();
</p></pre><BR>
<a name="Chapter8"></a><h2>Compression</h2><pre></pre>
<pre><b>size_t LZ4F_compressBegin(LZ4F_cctx* cctx,
void* dstBuffer, size_t dstCapacity,
const LZ4F_preferences_t* prefsPtr);
</b><p> will write the frame header into dstBuffer.
dstCapacity must be >= LZ4F_HEADER_SIZE_MAX bytes.
`prefsPtr` is optional : you can provide NULL as argument, all preferences will then be set to default.
@return : number of bytes written into dstBuffer for the header
or an error code (which can be tested using LZ4F_isError())
</p></pre><BR>
<pre><b>size_t LZ4F_compressBound(size_t srcSize, const LZ4F_preferences_t* prefsPtr);
</b><p> Provides minimum dstCapacity required to guarantee success of
LZ4F_compressUpdate(), given a srcSize and preferences, for a worst case scenario.
When srcSize==0, LZ4F_compressBound() provides an upper bound for LZ4F_flush() and LZ4F_compressEnd() instead.
Note that the result is only valid for a single invocation of LZ4F_compressUpdate().
When invoking LZ4F_compressUpdate() multiple times,
if the output buffer is gradually filled up instead of emptied and re-used from its start,
one must check if there is enough remaining capacity before each invocation, using LZ4F_compressBound().
@return is always the same for a srcSize and prefsPtr.
prefsPtr is optional : when NULL is provided, preferences will be set to cover worst case scenario.
tech details :
@return if automatic flushing is not enabled, includes the possibility that internal buffer might already be filled by up to (blockSize-1) bytes.
It also includes frame footer (ending + checksum), since it might be generated by LZ4F_compressEnd().
@return doesn't include frame header, as it was already generated by LZ4F_compressBegin().
</p></pre><BR>
<pre><b>size_t LZ4F_compressUpdate(LZ4F_cctx* cctx,
void* dstBuffer, size_t dstCapacity,
const void* srcBuffer, size_t srcSize,
const LZ4F_compressOptions_t* cOptPtr);
</b><p> LZ4F_compressUpdate() can be called repetitively to compress as much data as necessary.
Important rule: dstCapacity MUST be large enough to ensure operation success even in worst case situations.
This value is provided by LZ4F_compressBound().
If this condition is not respected, LZ4F_compress() will fail (result is an errorCode).
LZ4F_compressUpdate() doesn't guarantee error recovery.
When an error occurs, compression context must be freed or resized.
`cOptPtr` is optional : NULL can be provided, in which case all options are set to default.
@return : number of bytes written into `dstBuffer` (it can be zero, meaning input data was just buffered).
or an error code if it fails (which can be tested using LZ4F_isError())
</p></pre><BR>
<pre><b>size_t LZ4F_flush(LZ4F_cctx* cctx,
void* dstBuffer, size_t dstCapacity,
const LZ4F_compressOptions_t* cOptPtr);
</b><p> When data must be generated and sent immediately, without waiting for a block to be completely filled,
it's possible to call LZ4_flush(). It will immediately compress any data buffered within cctx.
`dstCapacity` must be large enough to ensure the operation will be successful.
`cOptPtr` is optional : it's possible to provide NULL, all options will be set to default.
@return : nb of bytes written into dstBuffer (can be zero, when there is no data stored within cctx)
or an error code if it fails (which can be tested using LZ4F_isError())
Note : LZ4F_flush() is guaranteed to be successful when dstCapacity >= LZ4F_compressBound(0, prefsPtr).
</p></pre><BR>
<pre><b>size_t LZ4F_compressEnd(LZ4F_cctx* cctx,
void* dstBuffer, size_t dstCapacity,
const LZ4F_compressOptions_t* cOptPtr);
</b><p> To properly finish an LZ4 frame, invoke LZ4F_compressEnd().
It will flush whatever data remained within `cctx` (like LZ4_flush())
and properly finalize the frame, with an endMark and a checksum.
`cOptPtr` is optional : NULL can be provided, in which case all options will be set to default.
@return : nb of bytes written into dstBuffer, necessarily >= 4 (endMark),
or an error code if it fails (which can be tested using LZ4F_isError())
Note : LZ4F_compressEnd() is guaranteed to be successful when dstCapacity >= LZ4F_compressBound(0, prefsPtr).
A successful call to LZ4F_compressEnd() makes `cctx` available again for another compression task.
</p></pre><BR>
<a name="Chapter9"></a><h2>Decompression functions</h2><pre></pre>
<pre><b>typedef struct {
unsigned stableDst; </b>/* pledges that last 64KB decompressed data will remain available unmodified. This optimization skips storage operations in tmp buffers. */<b>
unsigned reserved[3]; </b>/* must be set to zero for forward compatibility */<b>
} LZ4F_decompressOptions_t;
</b></pre><BR>
<pre><b>LZ4F_errorCode_t LZ4F_createDecompressionContext(LZ4F_dctx** dctxPtr, unsigned version);
LZ4F_errorCode_t LZ4F_freeDecompressionContext(LZ4F_dctx* dctx);
</b><p> Create an LZ4F_dctx object, to track all decompression operations.
The version provided MUST be LZ4F_VERSION.
The function provides a pointer to an allocated and initialized LZ4F_dctx object.
The result is an errorCode, which can be tested using LZ4F_isError().
dctx memory can be released using LZ4F_freeDecompressionContext();
Result of LZ4F_freeDecompressionContext() indicates current state of decompressionContext when being released.
That is, it should be == 0 if decompression has been completed fully and correctly.
</p></pre><BR>
<a name="Chapter10"></a><h2>Streaming decompression functions</h2><pre></pre>
<pre><b>size_t LZ4F_headerSize(const void* src, size_t srcSize);
</b><p> Provide the header size of a frame starting at `src`.
`srcSize` must be >= LZ4F_MIN_SIZE_TO_KNOW_HEADER_LENGTH,
which is enough to decode the header length.
@return : size of frame header
or an error code, which can be tested using LZ4F_isError()
note : Frame header size is variable, but is guaranteed to be
>= LZ4F_HEADER_SIZE_MIN bytes, and <= LZ4F_HEADER_SIZE_MAX bytes.
</p></pre><BR>
<pre><b>size_t LZ4F_getFrameInfo(LZ4F_dctx* dctx,
LZ4F_frameInfo_t* frameInfoPtr,
const void* srcBuffer, size_t* srcSizePtr);
</b><p> This function extracts frame parameters (max blockSize, dictID, etc.).
Its usage is optional: user can call LZ4F_decompress() directly.
Extracted information will fill an existing LZ4F_frameInfo_t structure.
This can be useful for allocation and dictionary identification purposes.
LZ4F_getFrameInfo() can work in the following situations :
1) At the beginning of a new frame, before any invocation of LZ4F_decompress().
It will decode header from `srcBuffer`,
consuming the header and starting the decoding process.
Input size must be large enough to contain the full frame header.
Frame header size can be known beforehand by LZ4F_headerSize().
Frame header size is variable, but is guaranteed to be >= LZ4F_HEADER_SIZE_MIN bytes,
and not more than <= LZ4F_HEADER_SIZE_MAX bytes.
Hence, blindly providing LZ4F_HEADER_SIZE_MAX bytes or more will always work.
It's allowed to provide more input data than the header size,
LZ4F_getFrameInfo() will only consume the header.
If input size is not large enough,
aka if it's smaller than header size,
function will fail and return an error code.
2) After decoding has been started,
it's possible to invoke LZ4F_getFrameInfo() anytime
to extract already decoded frame parameters stored within dctx.
Note that, if decoding has barely started,
and not yet read enough information to decode the header,
LZ4F_getFrameInfo() will fail.
The number of bytes consumed from srcBuffer will be updated in *srcSizePtr (necessarily <= original value).
LZ4F_getFrameInfo() only consumes bytes when decoding has not yet started,
and when decoding the header has been successful.
Decompression must then resume from (srcBuffer + *srcSizePtr).
@return : a hint about how many srcSize bytes LZ4F_decompress() expects for next call,
or an error code which can be tested using LZ4F_isError().
note 1 : in case of error, dctx is not modified. Decoding operation can resume from beginning safely.
note 2 : frame parameters are *copied into* an already allocated LZ4F_frameInfo_t structure.
</p></pre><BR>
<pre><b>size_t LZ4F_decompress(LZ4F_dctx* dctx,
void* dstBuffer, size_t* dstSizePtr,
const void* srcBuffer, size_t* srcSizePtr,
const LZ4F_decompressOptions_t* dOptPtr);
</b><p> Call this function repetitively to regenerate data compressed in `srcBuffer`.
The function requires a valid dctx state.
It will read up to *srcSizePtr bytes from srcBuffer,
and decompress data into dstBuffer, of capacity *dstSizePtr.
The nb of bytes consumed from srcBuffer will be written into *srcSizePtr (necessarily <= original value).
The nb of bytes decompressed into dstBuffer will be written into *dstSizePtr (necessarily <= original value).
The function does not necessarily read all input bytes, so always check value in *srcSizePtr.
Unconsumed source data must be presented again in subsequent invocations.
`dstBuffer` can freely change between each consecutive function invocation.
`dstBuffer` content will be overwritten.
@return : an hint of how many `srcSize` bytes LZ4F_decompress() expects for next call.
Schematically, it's the size of the current (or remaining) compressed block + header of next block.
Respecting the hint provides some small speed benefit, because it skips intermediate buffers.
This is just a hint though, it's always possible to provide any srcSize.
When a frame is fully decoded, @return will be 0 (no more data expected).
When provided with more bytes than necessary to decode a frame,
LZ4F_decompress() will stop reading exactly at end of current frame, and @return 0.
If decompression failed, @return is an error code, which can be tested using LZ4F_isError().
After a decompression error, the `dctx` context is not resumable.
Use LZ4F_resetDecompressionContext() to return to clean state.
After a frame is fully decoded, dctx can be used again to decompress another frame.
</p></pre><BR>
<pre><b>void LZ4F_resetDecompressionContext(LZ4F_dctx* dctx); </b>/* always successful */<b>
</b><p> In case of an error, the context is left in "undefined" state.
In which case, it's necessary to reset it, before re-using it.
This method can also be used to abruptly stop any unfinished decompression,
and start a new one using same context resources.
</p></pre><BR>
<pre><b>typedef enum { LZ4F_LIST_ERRORS(LZ4F_GENERATE_ENUM)
_LZ4F_dummy_error_enum_for_c89_never_used } LZ4F_errorCodes;
</b></pre><BR>
<a name="Chapter11"></a><h2>Bulk processing dictionary API</h2><pre></pre>
<pre><b>LZ4FLIB_STATIC_API LZ4F_CDict* LZ4F_createCDict(const void* dictBuffer, size_t dictSize);
LZ4FLIB_STATIC_API void LZ4F_freeCDict(LZ4F_CDict* CDict);
</b><p> When compressing multiple messages / blocks using the same dictionary, it's recommended to load it just once.
LZ4_createCDict() will create a digested dictionary, ready to start future compression operations without startup delay.
LZ4_CDict can be created once and shared by multiple threads concurrently, since its usage is read-only.
`dictBuffer` can be released after LZ4_CDict creation, since its content is copied within CDict
</p></pre><BR>
<pre><b>LZ4FLIB_STATIC_API size_t LZ4F_compressFrame_usingCDict(
LZ4F_cctx* cctx,
void* dst, size_t dstCapacity,
const void* src, size_t srcSize,
const LZ4F_CDict* cdict,
const LZ4F_preferences_t* preferencesPtr);
</b><p> Compress an entire srcBuffer into a valid LZ4 frame using a digested Dictionary.
cctx must point to a context created by LZ4F_createCompressionContext().
If cdict==NULL, compress without a dictionary.
dstBuffer MUST be >= LZ4F_compressFrameBound(srcSize, preferencesPtr).
If this condition is not respected, function will fail (@return an errorCode).
The LZ4F_preferences_t structure is optional : you may provide NULL as argument,
but it's not recommended, as it's the only way to provide dictID in the frame header.
@return : number of bytes written into dstBuffer.
or an error code if it fails (can be tested using LZ4F_isError())
</p></pre><BR>
<pre><b>LZ4FLIB_STATIC_API size_t LZ4F_compressBegin_usingCDict(
LZ4F_cctx* cctx,
void* dstBuffer, size_t dstCapacity,
const LZ4F_CDict* cdict,
const LZ4F_preferences_t* prefsPtr);
</b><p> Inits streaming dictionary compression, and writes the frame header into dstBuffer.
dstCapacity must be >= LZ4F_HEADER_SIZE_MAX bytes.
`prefsPtr` is optional : you may provide NULL as argument,
however, it's the only way to provide dictID in the frame header.
@return : number of bytes written into dstBuffer for the header,
or an error code (which can be tested using LZ4F_isError())
</p></pre><BR>
<pre><b>LZ4FLIB_STATIC_API size_t LZ4F_decompress_usingDict(
LZ4F_dctx* dctxPtr,
void* dstBuffer, size_t* dstSizePtr,
const void* srcBuffer, size_t* srcSizePtr,
const void* dict, size_t dictSize,
const LZ4F_decompressOptions_t* decompressOptionsPtr);
</b><p> Same as LZ4F_decompress(), using a predefined dictionary.
Dictionary is used "in place", without any preprocessing.
It must remain accessible throughout the entire frame decoding.
</p></pre><BR>
</html>
</body>