Skip to content

Commit

Permalink
Import v1.5.6 sources
Browse files Browse the repository at this point in the history
It's not ready for release as we have disabled the Sequence API due to
changes upstream that break compilation.
  • Loading branch information
luben committed Mar 27, 2024
1 parent c66db1e commit 355e64a
Show file tree
Hide file tree
Showing 56 changed files with 4,707 additions and 2,344 deletions.
5 changes: 3 additions & 2 deletions src/main/java/com/github/luben/zstd/Zstd.java
Original file line number Diff line number Diff line change
Expand Up @@ -590,9 +590,10 @@ public static long decompressDirectByteBufferFastDict(ByteBuffer dst, int dstOff
public static native int loadFastDictDecompress(long stream, ZstdDictDecompress dict);
public static native int loadDictCompress(long stream, byte[] dict, int dict_size);
public static native int loadFastDictCompress(long stream, ZstdDictCompress dict);
public static native void registerSequenceProducer(long stream, long seqProdState, long seqProdFunction);
// TODO: Fix native compilation
//public static native void registerSequenceProducer(long stream, long seqProdState, long seqProdFunction);
// static native long getBuiltinSequenceProducer(); // Used in tests
static native void generateSequences(long stream, long outSeqs, long outSeqsSize, long src, long srcSize);
static native long getBuiltinSequenceProducer(); // Used in tests
static native long getStubSequenceProducer(); // Used in tests
public static native int setCompressionChecksums(long stream, boolean useChecksums);
public static native int setCompressionMagicless(long stream, boolean useMagicless);
Expand Down
2 changes: 2 additions & 0 deletions src/main/java/com/github/luben/zstd/ZstdCompressCtx.java
Original file line number Diff line number Diff line change
Expand Up @@ -280,6 +280,7 @@ public ZstdCompressCtx setLong(int windowLog) {
* Register an external sequence producer
* @param producer the user-defined {@link SequenceProducer} to register.
*/
/* TODO: fix compilation
public ZstdCompressCtx registerSequenceProducer(SequenceProducer producer) {
ensureOpen();
acquireSharedLock();
Expand All @@ -305,6 +306,7 @@ public ZstdCompressCtx registerSequenceProducer(SequenceProducer producer) {
}
return this;
}
*/

/**
* Enable or disable sequence producer fallback
Expand Down
44 changes: 37 additions & 7 deletions src/main/native/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,12 +19,16 @@ The scope can be reduced on demand (see paragraph _modular build_).

#### Multithreading support

Multithreading is disabled by default when building with `make`.
When building with `make`, by default the dynamic library is multithreaded and static library is single-threaded (for compatibility reasons).

Enabling multithreading requires 2 conditions :
- set build macro `ZSTD_MULTITHREAD` (`-DZSTD_MULTITHREAD` for `gcc`)
- for POSIX systems : compile with pthread (`-pthread` compilation flag for `gcc`)

Both conditions are automatically applied when invoking `make lib-mt` target.
For convenience, we provide a build target to generate multi and single threaded libraries:
- Force enable multithreading on both dynamic and static libraries by appending `-mt` to the target, e.g. `make lib-mt`.
- Force disable multithreading on both dynamic and static libraries by appending `-nomt` to the target, e.g. `make lib-nomt`.
- By default, as mentioned before, dynamic library is multithreaded, and static library is single-threaded, e.g. `make lib`.

When linking a POSIX program with a multithreaded version of `libzstd`,
note that it's necessary to invoke the `-pthread` flag during link stage.
Expand All @@ -42,8 +46,8 @@ Zstandard's stable API is exposed within [lib/zstd.h](zstd.h).

Optional advanced features are exposed via :

- `lib/common/zstd_errors.h` : translates `size_t` function results
into a `ZSTD_ErrorCode`, for accurate error handling.
- `lib/zstd_errors.h` : translates `size_t` function results
into a `ZSTD_ErrorCode`, for accurate error handling.

- `ZSTD_STATIC_LINKING_ONLY` : if this macro is defined _before_ including `zstd.h`,
it unlocks access to the experimental API,
Expand Down Expand Up @@ -84,10 +88,10 @@ The file structure is designed to make this selection manually achievable for an
For example, advanced API for version `v0.4` is exposed in `lib/legacy/zstd_v04.h` .

- While invoking `make libzstd`, it's possible to define build macros
`ZSTD_LIB_COMPRESSION, ZSTD_LIB_DECOMPRESSION`, `ZSTD_LIB_DICTBUILDER`,
`ZSTD_LIB_COMPRESSION`, `ZSTD_LIB_DECOMPRESSION`, `ZSTD_LIB_DICTBUILDER`,
and `ZSTD_LIB_DEPRECATED` as `0` to forgo compilation of the
corresponding features. This will also disable compilation of all
dependencies (eg. `ZSTD_LIB_COMPRESSION=0` will also disable
dependencies (e.g. `ZSTD_LIB_COMPRESSION=0` will also disable
dictBuilder).

- There are a number of options that can help minimize the binary size of
Expand Down Expand Up @@ -115,13 +119,22 @@ The file structure is designed to make this selection manually achievable for an
binary is achieved by using `HUF_FORCE_DECOMPRESS_X1` and
`ZSTD_FORCE_DECOMPRESS_SEQUENCES_SHORT` (implied by `ZSTD_LIB_MINIFY`).

On the compressor side, Zstd's compression levels map to several internal
strategies. In environments where the higher compression levels aren't used,
it is possible to exclude all but the fastest strategy with
`ZSTD_LIB_EXCLUDE_COMPRESSORS_DFAST_AND_UP=1`. (Note that this will change
the behavior of the default compression level.) Or if you want to retain the
default compressor as well, you can set
`ZSTD_LIB_EXCLUDE_COMPRESSORS_GREEDY_AND_UP=1`, at the cost of an additional
~20KB or so.

For squeezing the last ounce of size out, you can also define
`ZSTD_NO_INLINE`, which disables inlining, and `ZSTD_STRIP_ERROR_STRINGS`,
which removes the error messages that are otherwise returned by
`ZSTD_getErrorName` (implied by `ZSTD_LIB_MINIFY`).

Finally, when integrating into your application, make sure you're doing link-
time optimation and unused symbol garbage collection (via some combination of,
time optimization and unused symbol garbage collection (via some combination of,
e.g., `-flto`, `-ffat-lto-objects`, `-fuse-linker-plugin`,
`-ffunction-sections`, `-fdata-sections`, `-fmerge-all-constants`,
`-Wl,--gc-sections`, `-Wl,-z,norelro`, and an archiver that understands
Expand Down Expand Up @@ -151,6 +164,23 @@ The file structure is designed to make this selection manually achievable for an
- The build macro `ZSTD_NO_INTRINSICS` can be defined to disable all explicit intrinsics.
Compiler builtins are still used.

- The build macro `ZSTD_DECODER_INTERNAL_BUFFER` can be set to control
the amount of extra memory used during decompression to store literals.
This defaults to 64kB. Reducing this value reduces the memory footprint of
`ZSTD_DCtx` decompression contexts,
but might also result in a small decompression speed cost.

- The C compiler macros `ZSTDLIB_VISIBLE`, `ZSTDERRORLIB_VISIBLE` and `ZDICTLIB_VISIBLE`
can be overridden to control the visibility of zstd's API. Additionally,
`ZSTDLIB_STATIC_API` and `ZDICTLIB_STATIC_API` can be overridden to control the visibility
of zstd's static API. Specifically, it can be set to `ZSTDLIB_HIDDEN` to hide the symbols
from the shared library. These macros default to `ZSTDLIB_VISIBILITY`,
`ZSTDERRORLIB_VSIBILITY`, and `ZDICTLIB_VISIBILITY` if unset, for backwards compatibility
with the old macro names.

- The C compiler macro `HUF_DISABLE_FAST_DECODE` disables the newer Huffman fast C
and assembly decoding loops. You may want to use this macro if these loops are
slower on your platform.

#### Windows : using MinGW+MSYS to create DLL

Expand Down
2 changes: 1 addition & 1 deletion src/main/native/common/allocations.h
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@
#define ZSTD_DEPS_NEED_MALLOC
#include "zstd_deps.h" /* ZSTD_malloc, ZSTD_calloc, ZSTD_free, ZSTD_memset */

#include "mem.h" /* MEM_STATIC */
#include "compiler.h" /* MEM_STATIC */
#define ZSTD_STATIC_LINKING_ONLY
#include "../zstd.h" /* ZSTD_customMem */

Expand Down
78 changes: 49 additions & 29 deletions src/main/native/common/bitstream.h
Original file line number Diff line number Diff line change
Expand Up @@ -90,19 +90,20 @@ MEM_STATIC size_t BIT_closeCStream(BIT_CStream_t* bitC);
/*-********************************************
* bitStream decoding API (read backward)
**********************************************/
typedef size_t BitContainerType;
typedef struct {
size_t bitContainer;
BitContainerType bitContainer;
unsigned bitsConsumed;
const char* ptr;
const char* start;
const char* limitPtr;
} BIT_DStream_t;

typedef enum { BIT_DStream_unfinished = 0,
BIT_DStream_endOfBuffer = 1,
BIT_DStream_completed = 2,
BIT_DStream_overflow = 3 } BIT_DStream_status; /* result of BIT_reloadDStream() */
/* 1,2,4,8 would be better for bitmap combinations, but slows down performance a bit ... :( */
typedef enum { BIT_DStream_unfinished = 0, /* fully refilled */
BIT_DStream_endOfBuffer = 1, /* still some bits left in bitstream */
BIT_DStream_completed = 2, /* bitstream entirely consumed, bit-exact */
BIT_DStream_overflow = 3 /* user requested more bits than present in bitstream */
} BIT_DStream_status; /* result of BIT_reloadDStream() */

MEM_STATIC size_t BIT_initDStream(BIT_DStream_t* bitD, const void* srcBuffer, size_t srcSize);
MEM_STATIC size_t BIT_readBits(BIT_DStream_t* bitD, unsigned nbBits);
Expand All @@ -112,7 +113,7 @@ MEM_STATIC unsigned BIT_endOfDStream(const BIT_DStream_t* bitD);

/* Start by invoking BIT_initDStream().
* A chunk of the bitStream is then stored into a local register.
* Local register size is 64-bits on 64-bits systems, 32-bits on 32-bits systems (size_t).
* Local register size is 64-bits on 64-bits systems, 32-bits on 32-bits systems (BitContainerType).
* You can then retrieve bitFields stored into the local register, **in reverse order**.
* Local register is explicitly reloaded from memory by the BIT_reloadDStream() method.
* A reload guarantee a minimum of ((8*sizeof(bitD->bitContainer))-7) bits when its result is BIT_DStream_unfinished.
Expand Down Expand Up @@ -162,7 +163,7 @@ MEM_STATIC size_t BIT_initCStream(BIT_CStream_t* bitC,
return 0;
}

MEM_STATIC FORCE_INLINE_ATTR size_t BIT_getLowerBits(size_t bitContainer, U32 const nbBits)
FORCE_INLINE_TEMPLATE size_t BIT_getLowerBits(size_t bitContainer, U32 const nbBits)
{
#if defined(STATIC_BMI2) && STATIC_BMI2 == 1 && !defined(ZSTD_NO_INTRINSICS)
return _bzhi_u64(bitContainer, nbBits);
Expand Down Expand Up @@ -267,22 +268,22 @@ MEM_STATIC size_t BIT_initDStream(BIT_DStream_t* bitD, const void* srcBuffer, si
bitD->bitContainer = *(const BYTE*)(bitD->start);
switch(srcSize)
{
case 7: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[6]) << (sizeof(bitD->bitContainer)*8 - 16);
case 7: bitD->bitContainer += (BitContainerType)(((const BYTE*)(srcBuffer))[6]) << (sizeof(bitD->bitContainer)*8 - 16);
ZSTD_FALLTHROUGH;

case 6: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[5]) << (sizeof(bitD->bitContainer)*8 - 24);
case 6: bitD->bitContainer += (BitContainerType)(((const BYTE*)(srcBuffer))[5]) << (sizeof(bitD->bitContainer)*8 - 24);
ZSTD_FALLTHROUGH;

case 5: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[4]) << (sizeof(bitD->bitContainer)*8 - 32);
case 5: bitD->bitContainer += (BitContainerType)(((const BYTE*)(srcBuffer))[4]) << (sizeof(bitD->bitContainer)*8 - 32);
ZSTD_FALLTHROUGH;

case 4: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[3]) << 24;
case 4: bitD->bitContainer += (BitContainerType)(((const BYTE*)(srcBuffer))[3]) << 24;
ZSTD_FALLTHROUGH;

case 3: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[2]) << 16;
case 3: bitD->bitContainer += (BitContainerType)(((const BYTE*)(srcBuffer))[2]) << 16;
ZSTD_FALLTHROUGH;

case 2: bitD->bitContainer += (size_t)(((const BYTE*)(srcBuffer))[1]) << 8;
case 2: bitD->bitContainer += (BitContainerType)(((const BYTE*)(srcBuffer))[1]) << 8;
ZSTD_FALLTHROUGH;

default: break;
Expand All @@ -297,12 +298,12 @@ MEM_STATIC size_t BIT_initDStream(BIT_DStream_t* bitD, const void* srcBuffer, si
return srcSize;
}

MEM_STATIC FORCE_INLINE_ATTR size_t BIT_getUpperBits(size_t bitContainer, U32 const start)
FORCE_INLINE_TEMPLATE size_t BIT_getUpperBits(BitContainerType bitContainer, U32 const start)
{
return bitContainer >> start;
}

MEM_STATIC FORCE_INLINE_ATTR size_t BIT_getMiddleBits(size_t bitContainer, U32 const start, U32 const nbBits)
FORCE_INLINE_TEMPLATE size_t BIT_getMiddleBits(BitContainerType bitContainer, U32 const start, U32 const nbBits)
{
U32 const regMask = sizeof(bitContainer)*8 - 1;
/* if start > regMask, bitstream is corrupted, and result is undefined */
Expand All @@ -325,7 +326,7 @@ MEM_STATIC FORCE_INLINE_ATTR size_t BIT_getMiddleBits(size_t bitContainer, U32 c
* On 32-bits, maxNbBits==24.
* On 64-bits, maxNbBits==56.
* @return : value extracted */
MEM_STATIC FORCE_INLINE_ATTR size_t BIT_lookBits(const BIT_DStream_t* bitD, U32 nbBits)
FORCE_INLINE_TEMPLATE size_t BIT_lookBits(const BIT_DStream_t* bitD, U32 nbBits)
{
/* arbitrate between double-shift and shift+mask */
#if 1
Expand All @@ -348,7 +349,7 @@ MEM_STATIC size_t BIT_lookBitsFast(const BIT_DStream_t* bitD, U32 nbBits)
return (bitD->bitContainer << (bitD->bitsConsumed & regMask)) >> (((regMask+1)-nbBits) & regMask);
}

MEM_STATIC FORCE_INLINE_ATTR void BIT_skipBits(BIT_DStream_t* bitD, U32 nbBits)
FORCE_INLINE_TEMPLATE void BIT_skipBits(BIT_DStream_t* bitD, U32 nbBits)
{
bitD->bitsConsumed += nbBits;
}
Expand All @@ -357,7 +358,7 @@ MEM_STATIC FORCE_INLINE_ATTR void BIT_skipBits(BIT_DStream_t* bitD, U32 nbBits)
* Read (consume) next n bits from local register and update.
* Pay attention to not read more than nbBits contained into local register.
* @return : extracted value. */
MEM_STATIC FORCE_INLINE_ATTR size_t BIT_readBits(BIT_DStream_t* bitD, unsigned nbBits)
FORCE_INLINE_TEMPLATE size_t BIT_readBits(BIT_DStream_t* bitD, unsigned nbBits)
{
size_t const value = BIT_lookBits(bitD, nbBits);
BIT_skipBits(bitD, nbBits);
Expand All @@ -374,6 +375,21 @@ MEM_STATIC size_t BIT_readBitsFast(BIT_DStream_t* bitD, unsigned nbBits)
return value;
}

/*! BIT_reloadDStream_internal() :
* Simple variant of BIT_reloadDStream(), with two conditions:
* 1. bitstream is valid : bitsConsumed <= sizeof(bitD->bitContainer)*8
* 2. look window is valid after shifted down : bitD->ptr >= bitD->start
*/
MEM_STATIC BIT_DStream_status BIT_reloadDStream_internal(BIT_DStream_t* bitD)
{
assert(bitD->bitsConsumed <= sizeof(bitD->bitContainer)*8);
bitD->ptr -= bitD->bitsConsumed >> 3;
assert(bitD->ptr >= bitD->start);
bitD->bitsConsumed &= 7;
bitD->bitContainer = MEM_readLEST(bitD->ptr);
return BIT_DStream_unfinished;
}

/*! BIT_reloadDStreamFast() :
* Similar to BIT_reloadDStream(), but with two differences:
* 1. bitsConsumed <= sizeof(bitD->bitContainer)*8 must hold!
Expand All @@ -384,31 +400,35 @@ MEM_STATIC BIT_DStream_status BIT_reloadDStreamFast(BIT_DStream_t* bitD)
{
if (UNLIKELY(bitD->ptr < bitD->limitPtr))
return BIT_DStream_overflow;
assert(bitD->bitsConsumed <= sizeof(bitD->bitContainer)*8);
bitD->ptr -= bitD->bitsConsumed >> 3;
bitD->bitsConsumed &= 7;
bitD->bitContainer = MEM_readLEST(bitD->ptr);
return BIT_DStream_unfinished;
return BIT_reloadDStream_internal(bitD);
}

/*! BIT_reloadDStream() :
* Refill `bitD` from buffer previously set in BIT_initDStream() .
* This function is safe, it guarantees it will not read beyond src buffer.
* This function is safe, it guarantees it will not never beyond src buffer.
* @return : status of `BIT_DStream_t` internal register.
* when status == BIT_DStream_unfinished, internal register is filled with at least 25 or 57 bits */
MEM_STATIC FORCE_INLINE_ATTR BIT_DStream_status BIT_reloadDStream(BIT_DStream_t* bitD)
FORCE_INLINE_TEMPLATE BIT_DStream_status BIT_reloadDStream(BIT_DStream_t* bitD)
{
if (bitD->bitsConsumed > (sizeof(bitD->bitContainer)*8)) /* overflow detected, like end of stream */
/* note : once in overflow mode, a bitstream remains in this mode until it's reset */
if (UNLIKELY(bitD->bitsConsumed > (sizeof(bitD->bitContainer)*8))) {
static const BitContainerType zeroFilled = 0;
bitD->ptr = (const char*)&zeroFilled; /* aliasing is allowed for char */
/* overflow detected, erroneous scenario or end of stream: no update */
return BIT_DStream_overflow;
}

assert(bitD->ptr >= bitD->start);

if (bitD->ptr >= bitD->limitPtr) {
return BIT_reloadDStreamFast(bitD);
return BIT_reloadDStream_internal(bitD);
}
if (bitD->ptr == bitD->start) {
/* reached end of bitStream => no update */
if (bitD->bitsConsumed < sizeof(bitD->bitContainer)*8) return BIT_DStream_endOfBuffer;
return BIT_DStream_completed;
}
/* start < ptr < limitPtr */
/* start < ptr < limitPtr => cautious update */
{ U32 nbBytes = bitD->bitsConsumed >> 3;
BIT_DStream_status result = BIT_DStream_unfinished;
if (bitD->ptr - nbBytes < bitD->start) {
Expand Down
Loading

0 comments on commit 355e64a

Please sign in to comment.