Class Zdict


  • public class Zdict
    extends java.lang.Object
    Native bindings to the dictionary builder API of Zstandard (zstd).
    • Method Detail

      • nZDICT_trainFromBuffer

        public static long nZDICT_trainFromBuffer​(long dictBuffer,
                                                  long dictBufferCapacity,
                                                  long samplesBuffer,
                                                  long samplesSizes,
                                                  int nbSamples)
        Unsafe version of: trainFromBuffer
      • ZDICT_trainFromBuffer

        public static long ZDICT_trainFromBuffer​(java.nio.ByteBuffer dictBuffer,
                                                 java.nio.ByteBuffer samplesBuffer,
                                                 PointerBuffer samplesSizes)
        Train a dictionary from an array of samples.

        Redirects to optimizeTrainFromBuffer_fastCover, run single-threaded with d=8, steps=4, f=20, and accel=1.

        Samples must be stored concatenated in a single flat buffer samplesBuffer, supplied with an array of sizes samplesSizes, providing the size of each sample, in order.

        The resulting dictionary will be saved into dictBuffer.

        Note: ZDICT_trainFromBuffer() requires about 9 bytes of memory for each input byte.

        Tips:

        • In general, a reasonable dictionary has a size of ~100 KB.
        • A smaller or larger size can be selected by specifying dictBufferCapacity.
        • In general, it's recommended to provide a few thousand samples, though this can vary a lot.
        • It's recommended that the total size of all samples be about 100 times the target size of the dictionary.
        Returns:
        size of dictionary stored into dictBuffer (≤ dictBufferCapacity) or an error code, which can be tested with isError.
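
        The sample layout described above (one flat buffer plus a parallel array of sizes) can be sketched in plain Java. This is a minimal illustration, not part of the bindings: SamplePacker and its fields are hypothetical names, and the sizes are kept in a long[] here rather than in a PointerBuffer so the sketch has no LWJGL dependency.

```java
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical helper (not part of LWJGL): arranges samples the way the
// training functions expect them, back to back with a parallel size array.
final class SamplePacker {
    final ByteBuffer samples; // all samples concatenated, ready for reading
    final long[] sizes;       // sizes[i] is the byte length of sample i

    private SamplePacker(ByteBuffer samples, long[] sizes) {
        this.samples = samples;
        this.sizes = sizes;
    }

    static SamplePacker pack(List<byte[]> input) {
        // Sum the sample lengths, failing loudly on int overflow.
        int total = 0;
        for (byte[] sample : input) {
            total = Math.addExact(total, sample.length);
        }
        // Native code needs off-heap memory, hence a direct buffer.
        ByteBuffer flat = ByteBuffer.allocateDirect(total);
        long[] sizes = new long[input.size()];
        for (int i = 0; i < input.size(); i++) {
            byte[] sample = input.get(i);
            flat.put(sample);          // append sample i
            sizes[i] = sample.length;  // record its size, in the same order
        }
        flat.flip(); // position = 0, limit = total bytes written
        return new SamplePacker(flat, sizes);
    }
}
```

        In real use, the sizes would then be copied into a PointerBuffer of the same length before calling ZDICT_trainFromBuffer.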
      • nZDICT_getDictID

        public static int nZDICT_getDictID​(long dictBuffer,
                                           long dictSize)
        Unsafe version of: getDictID
      • ZDICT_getDictID

        public static int ZDICT_getDictID​(java.nio.ByteBuffer dictBuffer)
        Extracts dictID.
        Returns:
        zero if error (not a valid dictionary)
      • nZDICT_isError

        public static int nZDICT_isError​(long errorCode)
      • ZDICT_isError

        public static boolean ZDICT_isError​(long errorCode)
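
        The training functions return either a size or an error code in the same long. The sketch below illustrates how such a value could be told apart, assuming the common zstd convention that errors are small negated codes reinterpreted as unsigned values. SizeOrError and maxErrorCode are hypothetical; real code should simply call ZDICT_isError.

```java
// Pure-Java illustration only; it does not call into the native library.
final class SizeOrError {
    // maxErrorCode is a stand-in for the library's largest internal error
    // number (an assumption of this sketch, passed in by the caller).
    static boolean looksLikeError(long result, int maxErrorCode) {
        // Treat the long as unsigned: error values sit at the very top of
        // the unsigned range, far above any realistic dictionary size.
        return Long.compareUnsigned(result, -(long) maxErrorCode) > 0;
    }
}
```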
      • nZDICT_getErrorName

        public static long nZDICT_getErrorName​(long errorCode)
      • ZDICT_getErrorName

        @Nullable
        public static java.lang.String ZDICT_getErrorName​(long errorCode)
      • nZDICT_trainFromBuffer_cover

        public static long nZDICT_trainFromBuffer_cover​(long dictBuffer,
                                                        long dictBufferCapacity,
                                                        long samplesBuffer,
                                                        long samplesSizes,
                                                        int nbSamples,
                                                        long parameters)
        Unsafe version of: trainFromBuffer_cover
      • ZDICT_trainFromBuffer_cover

        public static long ZDICT_trainFromBuffer_cover​(java.nio.ByteBuffer dictBuffer,
                                                       java.nio.ByteBuffer samplesBuffer,
                                                       PointerBuffer samplesSizes,
                                                       ZDICTCoverParams parameters)
        Train a dictionary from an array of samples using the COVER algorithm.

        Samples must be stored concatenated in a single flat buffer samplesBuffer, supplied with an array of sizes samplesSizes, providing the size of each sample, in order.

        The resulting dictionary will be saved into dictBuffer.

        Note: ZDICT_trainFromBuffer_cover() requires about 9 bytes of memory for each input byte.

        Tips:

        • In general, a reasonable dictionary has a size of ~100 KB.
        • A smaller or larger size can be selected by specifying dictBufferCapacity.
        • In general, it's recommended to provide a few thousand samples, though this can vary a lot.
        • It's recommended that the total size of all samples be about 100 times the target size of the dictionary.
        Returns:
        size of dictionary stored into dictBuffer (≤ dictBufferCapacity) or an error code, which can be tested with isError.
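
        The sizing tips above can be turned into a quick pre-flight check. This is a rough sketch under stated assumptions: DictSizing is a hypothetical helper, "100 KB" is read as 100 * 1024 bytes, and the factor of 100 is the rule of thumb from the tips rather than a hard requirement.

```java
// Hypothetical helper (not part of LWJGL) encoding the rules of thumb.
final class DictSizing {
    // "~100 KB" from the tips, interpreted here as 100 * 1024 bytes.
    static final long TYPICAL_DICT_BYTES = 100L * 1024;
    // Total sample size should be roughly this many times the dictionary size.
    static final long SAMPLE_TO_DICT_RATIO = 100;

    static long suggestedTotalSampleBytes(long dictBufferCapacity) {
        return Math.multiplyExact(dictBufferCapacity, SAMPLE_TO_DICT_RATIO);
    }

    // True when the samples add up to at least the suggested total.
    static boolean corpusLooksLargeEnough(long[] sampleSizes, long dictBufferCapacity) {
        long total = 0;
        for (long size : sampleSizes) {
            total = Math.addExact(total, size);
        }
        return total >= suggestedTotalSampleBytes(dictBufferCapacity);
    }
}
```

        For the typical capacity, DictSizing.suggestedTotalSampleBytes(DictSizing.TYPICAL_DICT_BYTES) yields 10,240,000 bytes, so a corpus of about 10 MB is a sensible starting point.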
      • nZDICT_optimizeTrainFromBuffer_cover

        public static long nZDICT_optimizeTrainFromBuffer_cover​(long dictBuffer,
                                                                long dictBufferCapacity,
                                                                long samplesBuffer,
                                                                long samplesSizes,
                                                                int nbSamples,
                                                                long parameters)
      • ZDICT_optimizeTrainFromBuffer_cover

        public static long ZDICT_optimizeTrainFromBuffer_cover​(java.nio.ByteBuffer dictBuffer,
                                                               java.nio.ByteBuffer samplesBuffer,
                                                               PointerBuffer samplesSizes,
                                                               ZDICTCoverParams parameters)
        The same requirements as trainFromBuffer_cover hold for all the parameters except parameters.

        This function tries many parameter combinations and picks the best one. parameters is filled with the best parameters found, and the dictionary constructed with those parameters is stored in dictBuffer.

        • All of the parameters d, k, and steps are optional.
        • If d is non-zero, only that value of d is used; otherwise d = {6, 8} are checked.
        • If steps is zero, its default value is used.
        • If k is non-zero, only that value of k is used; otherwise steps values of k in [50, 2000] are checked.

        Note: ZDICT_optimizeTrainFromBuffer_cover() requires about 8 bytes of memory for each input byte, plus another 5 bytes of memory per input byte for each thread.

        Returns:
        size of dictionary stored into dictBuffer (≤ dictBufferCapacity) or an error code, which can be tested with isError. On success *parameters contains the parameters selected.
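
        The search rules above can be sketched as an enumeration of (d, k) candidates. This is one plausible reading, not the native implementation: CoverSearchSpace is a hypothetical class, the even spacing of k values is an assumption, and defaultSteps is supplied by the caller because the source does not state the default.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the candidate grid; the native library may step
// through k differently.
final class CoverSearchSpace {
    static final int K_MIN = 50;   // lower bound of the k range from the docs
    static final int K_MAX = 2000; // upper bound of the k range from the docs

    // Returns {d, k} pairs. A zero argument means "search this parameter".
    static List<int[]> candidates(int d, int k, int steps, int defaultSteps) {
        int effectiveSteps = steps == 0 ? defaultSteps : steps;
        if (effectiveSteps <= 0) {
            throw new IllegalArgumentException("steps must be positive");
        }
        // A fixed d is used as-is; otherwise both documented values are tried.
        int[] ds = d != 0 ? new int[] {d} : new int[] {6, 8};

        List<Integer> ks = new ArrayList<>();
        if (k != 0) {
            ks.add(k); // fixed k: nothing to search
        } else {
            // Assumed spacing: divide the range evenly by the step count.
            int stride = Math.max((K_MAX - K_MIN) / effectiveSteps, 1);
            for (int value = K_MIN; value <= K_MAX; value += stride) {
                ks.add(value);
            }
        }

        List<int[]> out = new ArrayList<>();
        for (int dv : ds) {
            for (int kv : ks) {
                out.add(new int[] {dv, kv});
            }
        }
        return out;
    }
}
```

        Fixing either d or k shrinks the grid, which is why supplying known-good values can shorten training considerably.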
      • nZDICT_trainFromBuffer_fastCover

        public static long nZDICT_trainFromBuffer_fastCover​(long dictBuffer,
                                                            long dictBufferCapacity,
                                                            long samplesBuffer,
                                                            long samplesSizes,
                                                            int nbSamples,
                                                            long parameters)
        Unsafe version of: trainFromBuffer_fastCover
      • ZDICT_trainFromBuffer_fastCover

        public static long ZDICT_trainFromBuffer_fastCover​(java.nio.ByteBuffer dictBuffer,
                                                           java.nio.ByteBuffer samplesBuffer,
                                                           PointerBuffer samplesSizes,
                                                           ZDICTFastCoverParams parameters)
        Train a dictionary from an array of samples using a modified version of the COVER algorithm.

        Samples must be stored concatenated in a single flat buffer samplesBuffer, supplied with an array of sizes samplesSizes, providing the size of each sample, in order.

        d and k are required. All other parameters are optional and take default values if not provided.

        The resulting dictionary will be saved into dictBuffer.

        Note: ZDICT_trainFromBuffer_fastCover() requires about 1 byte of memory for each input byte, plus another 6 * 2^f bytes of memory.

        Tips:

        • In general, a reasonable dictionary has a size of ~100 KB.
        • A smaller or larger size can be selected by specifying dictBufferCapacity.
        • In general, it's recommended to provide a few thousand samples, though this can vary a lot.
        • It's recommended that the total size of all samples be about 100 times the target size of the dictionary.

        Returns:
        size of dictionary stored into dictBuffer (≤ dictBufferCapacity) or an error code, which can be tested with isError.
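
        To make the memory notes concrete, the sketch below compares the documented estimates for COVER (about 9 bytes per input byte) and fastCover (about 1 byte per input byte plus 6 * 2^f). TrainingMemory is a hypothetical helper, the figures are the approximations from the notes, and the bound on f is chosen by this sketch to keep the arithmetic in range.

```java
// Hypothetical helper (not part of LWJGL): rough memory estimates only.
final class TrainingMemory {
    // COVER: about 9 bytes of memory for each input byte.
    static long coverBytes(long inputBytes) {
        return Math.multiplyExact(9L, inputBytes);
    }

    // fastCover: about 1 byte per input byte plus 6 * 2^f bytes.
    static long fastCoverBytes(long inputBytes, int f) {
        if (f < 0 || f > 31) { // bound chosen for this sketch
            throw new IllegalArgumentException("f out of range: " + f);
        }
        return Math.addExact(inputBytes, 6L << f); // 6L << f == 6 * 2^f
    }
}
```

        With 10 MiB of samples and f = 20, the estimates come to 90 MiB for COVER and 16 MiB for fastCover.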
      • nZDICT_optimizeTrainFromBuffer_fastCover

        public static long nZDICT_optimizeTrainFromBuffer_fastCover​(long dictBuffer,
                                                                    long dictBufferCapacity,
                                                                    long samplesBuffer,
                                                                    long samplesSizes,
                                                                    int nbSamples,
                                                                    long parameters)
      • ZDICT_optimizeTrainFromBuffer_fastCover

        public static long ZDICT_optimizeTrainFromBuffer_fastCover​(java.nio.ByteBuffer dictBuffer,
                                                                   java.nio.ByteBuffer samplesBuffer,
                                                                   PointerBuffer samplesSizes,
                                                                   ZDICTFastCoverParams parameters)
        The same requirements as trainFromBuffer_fastCover hold for all the parameters except parameters.

        This function tries many parameter combinations (specifically, k and d combinations) and picks the best one. parameters is filled with the best parameters found, and the dictionary constructed with those parameters is stored in dictBuffer.

        • All of the parameters d, k, steps, f, and accel are optional.
        • If d is non-zero, only that value of d is used; otherwise d = {6, 8} are checked.
        • If steps is zero, its default value is used.
        • If k is non-zero, only that value of k is used; otherwise steps values of k in [50, 2000] are checked.
        • If f is zero, the default value of 20 is used.
        • If accel is zero, the default value of 1 is used.

        Note: ZDICT_optimizeTrainFromBuffer_fastCover() requires about 1 byte of memory for each input byte and additionally another 6 * 2^f bytes of memory for each thread.

        Returns:
        size of dictionary stored into dictBuffer (≤ dictBufferCapacity) or an error code, which can be tested with isError. On success *parameters contains the parameters selected.
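
        The zero-means-default rule for f and accel, together with the per-thread memory note, can be sketched as follows. FastCoverTuning is a hypothetical helper: the defaults (20 and 1) are taken from the list above, the memory figure is the documented approximation, and the bound on f is this sketch's own.

```java
// Hypothetical helper (not part of LWJGL).
final class FastCoverTuning {
    // Zero selects the documented default of 20.
    static int effectiveF(int f) {
        return f == 0 ? 20 : f;
    }

    // Zero selects the documented default of 1.
    static int effectiveAccel(int accel) {
        return accel == 0 ? 1 : accel;
    }

    // About 1 byte per input byte, plus 6 * 2^f bytes for each thread.
    static long estimatedBytes(long inputBytes, int f, int threads) {
        int resolvedF = effectiveF(f);
        if (resolvedF < 0 || resolvedF > 31 || threads < 1) { // sketch-only bounds
            throw new IllegalArgumentException("unsupported f or thread count");
        }
        long perThread = 6L << resolvedF; // 6 * 2^f
        return Math.addExact(inputBytes, Math.multiplyExact(perThread, (long) threads));
    }
}
```

        Because the 6 * 2^f term is paid once per thread, raising the thread count can dominate the total when the sample set is small.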
      • nZDICT_finalizeDictionary

        public static long nZDICT_finalizeDictionary​(long dictBuffer,
                                                     long dictBufferCapacity,
                                                     long dictContent,
                                                     long dictContentSize,
                                                     long samplesBuffer,
                                                     long samplesSizes,
                                                     int nbSamples,
                                                     long parameters)
        Unsafe version of: finalizeDictionary
      • ZDICT_finalizeDictionary

        public static long ZDICT_finalizeDictionary​(java.nio.ByteBuffer dictBuffer,
                                                    java.nio.ByteBuffer dictContent,
                                                    java.nio.ByteBuffer samplesBuffer,
                                                    PointerBuffer samplesSizes,
                                                    ZDICTParams parameters)
        Given a custom content as a basis for dictionary, and a set of samples, finalize dictionary by adding headers and statistics.

        Samples must be stored concatenated in a flat buffer samplesBuffer, supplied with an array of sizes samplesSizes, providing the size of each sample in order.

        Notes:

        • dictContentSize must be ≥ CONTENTSIZE_MIN bytes.
        • maxDictSize (the capacity of dictBuffer) must be ≥ dictContentSize, and must be ≥ DICTSIZE_MIN bytes.
        • ZDICT_finalizeDictionary() writes notifications to stderr when notificationLevel > 0.
        • dictBuffer and dictContent can overlap.
        Returns:
        size of dictionary stored into dictBuffer (≤ dictBufferCapacity) or an error code, which can be tested with isError.
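
        The size constraints in the notes above can be checked before making the native call. A minimal sketch: FinalizeChecks is a hypothetical helper, and the two minimums are passed in by the caller (they correspond to the CONTENTSIZE_MIN and DICTSIZE_MIN constants, whose values are not given here).

```java
// Hypothetical helper (not part of LWJGL): validates sizes up front.
final class FinalizeChecks {
    // Returns a description of the first violated rule, or null if all hold.
    static String firstViolation(long dictContentSize, long maxDictSize,
                                 long contentSizeMin, long dictSizeMin) {
        if (dictContentSize < contentSizeMin) {
            return "dictContentSize is below CONTENTSIZE_MIN";
        }
        if (maxDictSize < dictContentSize) {
            return "maxDictSize is smaller than dictContentSize";
        }
        if (maxDictSize < dictSizeMin) {
            return "maxDictSize is below DICTSIZE_MIN";
        }
        return null; // all documented size constraints are satisfied
    }
}
```

        Checking these up front gives a readable message instead of an opaque error code from the native call.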