Class ChunkEncoder

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable
    Direct Known Subclasses:
    UnsafeChunkEncoder, VanillaChunkEncoder

    public abstract class ChunkEncoder
    extends java.lang.Object
    implements java.io.Closeable
    Class that handles the actual encoding of individual chunks. Resulting chunks can be compressed or uncompressed; compression is used only if it actually reduces chunk size (including the overhead of additional header bytes).

    Note that instances are stateful and hence not thread-safe; one instance is meant to be used for processing a sequence of chunks where total length is known.
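    The size comparison described above can be sketched as follows. This is a minimal illustration, not the library's code: the class ChunkSizeCheck and its method are hypothetical, and the header sizes (7 bytes for a compressed chunk, 5 bytes for an uncompressed one) are assumptions about the LZF chunk format that this page does not state.

```java
final class ChunkSizeCheck {
    // Assumed header sizes of the LZF chunk format (not stated on this page).
    static final int HEADER_LEN_COMPRESSED = 7;
    static final int HEADER_LEN_NOT_COMPRESSED = 5;

    /** Returns true if the compressed form, header included, is strictly smaller. */
    public static boolean worthCompressing(int rawLen, int compressedLen) {
        return compressedLen + HEADER_LEN_COMPRESSED < rawLen + HEADER_LEN_NOT_COMPRESSED;
    }

    public static void main(String[] args) {
        System.out.println(worthCompressing(1000, 400)); // prints "true"
        System.out.println(worthCompressing(1000, 999)); // prints "false"
    }
}
```

    The second call shows why header overhead matters: a payload that shrinks by a single byte still produces a larger chunk once the longer header is counted.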

    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected byte[] _encodeBuffer
      Buffer in which encoded content is stored during processing
      protected int _hashModulo  
      protected int[] _hashTable
      Hash table used for lookups on 3-byte sequences: the key is the hash of such a triplet and the value is its offset in the buffer.
      protected byte[] _headerBuffer
      Small buffer passed to LZFChunk, needed for writing chunk header
      protected BufferRecycler _recycler  
      protected static int MAX_HASH_SIZE  
      protected static int MAX_OFF  
      protected static int MAX_REF  
      protected static int MIN_BLOCK_TO_COMPRESS  
      protected static int MIN_HASH_SIZE  
      protected static int TAIL_LENGTH
      Number of tail bytes that are simply copied as-is in order to simplify loop-end checks. 4 is the bare minimum; the value could possibly be raised to 8.
    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      protected ChunkEncoder​(int totalLength)
      Uses a ThreadLocal soft-referenced BufferRecycler instance.
      protected ChunkEncoder​(int totalLength, boolean bogus)
      Alternate constructor used when we want to avoid allocating the encoding buffer, in cases where the caller wants full control over allocations.
      protected ChunkEncoder​(int totalLength, BufferRecycler bufferRecycler)  
      protected ChunkEncoder​(int totalLength, BufferRecycler bufferRecycler, boolean bogus)
      Alternate constructor used when we want to avoid allocating the encoding buffer, in cases where the caller wants full control over allocations.
    • Method Summary

      Modifier and Type Method Description
      int appendEncodedChunk​(byte[] input, int inputPtr, int inputLen, byte[] outputBuffer, int outputPos)
      Alternate chunk compression method that appends the encoded chunk to a pre-allocated buffer.
      int appendEncodedIfCompresses​(byte[] input, double maxResultRatio, int inputPtr, int inputLen, byte[] outputBuffer, int outputPos)
      Method similar to appendEncodedChunk(byte[], int, int, byte[], int), but that will only append encoded chunk if it compresses down to specified ratio (also considering header that will be needed); otherwise will return -1 without appending anything.
      private static int calcHashLen​(int chunkSize)  
      void close()
      Method to close once encoder is no longer in use.
      void encodeAndWriteChunk​(byte[] data, int offset, int len, java.io.OutputStream out)
      Method for encoding individual chunk, writing it to given output stream.
      boolean encodeAndWriteChunkIfCompresses​(byte[] data, int offset, int inputLen, java.io.OutputStream out, double resultRatio)
      Method for encoding individual chunk, writing it to given output stream, if (and only if!) it compresses enough.
      LZFChunk encodeChunk​(byte[] data, int offset, int len)
      Method for encoding an individual chunk, compressing it only if that reduces its size.
      LZFChunk encodeChunkIfCompresses​(byte[] data, int offset, int inputLen, double maxResultRatio)
      Method for compressing individual chunk, if (and only if) it compresses down to specified ratio or less.
      BufferRecycler getBufferRecycler()  
      protected int hash​(int h)  
      protected abstract int tryCompress​(byte[] in, int inPos, int inEnd, byte[] out, int outPos)
      Main workhorse method that will try to compress given chunk, and return end position (offset to byte after last included byte).
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • MIN_BLOCK_TO_COMPRESS

        protected static final int MIN_BLOCK_TO_COMPRESS
        See Also:
        Constant Field Values
      • TAIL_LENGTH

        protected static final int TAIL_LENGTH
        Number of tail bytes that are simply copied as-is in order to simplify loop-end checks. 4 is the bare minimum; the value could possibly be raised to 8.
        See Also:
        Constant Field Values
      • _hashTable

        protected int[] _hashTable
        Hash table used for lookups on 3-byte sequences: the key is the hash of such a triplet and the value is its offset in the buffer.
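        A triplet-keyed offset table of this kind can be sketched as below. Everything in the sketch is hypothetical: the class TripletHashSketch, the mixing constant and shift, and the use of -1 for empty slots are illustrative choices, not taken from the library. Note that a hit is only a candidate, because different triplets can share a slot, so the bytes at the returned offset must still be compared.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class TripletHashSketch {
    private final int[] table; // value: offset in the input buffer, or -1 if the slot is empty
    private final int mask;    // table length - 1; the length must be a power of two

    public TripletHashSketch(int sizePowerOfTwo) {
        table = new int[sizePowerOfTwo];
        Arrays.fill(table, -1);
        mask = sizePowerOfTwo - 1;
    }

    /** Packs three bytes into one int, then mixes and masks it into a slot index. */
    public int slot(byte[] buf, int pos) {
        int triplet = ((buf[pos] & 0xFF) << 16) | ((buf[pos + 1] & 0xFF) << 8) | (buf[pos + 2] & 0xFF);
        return ((triplet * 0x9E3779B1) >>> 15) & mask; // illustrative mixing step
    }

    /** Stores pos for the triplet at pos; returns the offset previously stored in that slot, or -1. */
    public int lookupAndStore(byte[] buf, int pos) {
        int s = slot(buf, pos);
        int previous = table[s];
        table[s] = pos;
        return previous;
    }

    /** Verifies a candidate: hash hits must be confirmed by comparing the actual bytes. */
    public static boolean sameTriplet(byte[] buf, int a, int b) {
        return buf[a] == buf[b] && buf[a + 1] == buf[b + 1] && buf[a + 2] == buf[b + 2];
    }

    public static void main(String[] args) {
        byte[] buf = "abcabc".getBytes(StandardCharsets.US_ASCII);
        TripletHashSketch t = new TripletHashSketch(1 << 10);
        int first = t.lookupAndStore(buf, 0);  // nothing stored yet
        int second = t.lookupAndStore(buf, 3); // same triplet "abc", so same slot
        System.out.println(first + " " + second); // prints "-1 0"
    }
}
```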
      • _hashModulo

        protected final int _hashModulo
      • _encodeBuffer

        protected byte[] _encodeBuffer
        Buffer in which encoded content is stored during processing
      • _headerBuffer

        protected byte[] _headerBuffer
        Small buffer passed to LZFChunk, needed for writing chunk header
    • Constructor Detail

      • ChunkEncoder

        protected ChunkEncoder​(int totalLength)
        Uses a ThreadLocal soft-referenced BufferRecycler instance.
        Parameters:
        totalLength - Total encoded length; used for calculating size of hash table to use
      • ChunkEncoder

        protected ChunkEncoder​(int totalLength,
                               BufferRecycler bufferRecycler)
        Parameters:
        totalLength - Total encoded length; used for calculating size of hash table to use
        bufferRecycler - Buffer recycler instance, for usages where the caller manages the recycler instances
      • ChunkEncoder

        protected ChunkEncoder​(int totalLength,
                               boolean bogus)
        Alternate constructor used when we want to avoid allocating the encoding buffer, in cases where the caller wants full control over allocations.
      • ChunkEncoder

        protected ChunkEncoder​(int totalLength,
                               BufferRecycler bufferRecycler,
                               boolean bogus)
        Alternate constructor used when we want to avoid allocating the encoding buffer, in cases where the caller wants full control over allocations.
    • Method Detail

      • calcHashLen

        private static int calcHashLen​(int chunkSize)
      • close

        public final void close()
        Method to close once the encoder is no longer in use. Note: after calling this method, further calls to encodeChunk(byte[], int, int) will fail.
        Specified by:
        close in interface java.lang.AutoCloseable
        Specified by:
        close in interface java.io.Closeable
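        Because the class implements java.io.Closeable, an encoder can be managed with try-with-resources. The sketch below uses a hypothetical stand-in class (CloseableEncoderSketch) rather than a real ChunkEncoder subclass, and the IllegalStateException thrown after close is an assumption: this page only says that later calls will fail, not how.

```java
import java.io.Closeable;

final class CloseableEncoderSketch implements Closeable {
    private byte[] workBuffer = new byte[64]; // stands in for the encoder's internal buffers

    /** Hypothetical encode call: copies up to 64 bytes and returns how many were handled. */
    public int encode(byte[] data, int offset, int len) {
        if (workBuffer == null) {
            throw new IllegalStateException("encoder already closed");
        }
        int n = Math.min(len, workBuffer.length);
        System.arraycopy(data, offset, workBuffer, 0, n);
        return n;
    }

    @Override
    public void close() {
        workBuffer = null; // release the buffer; later encode() calls fail
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 4, 5};
        CloseableEncoderSketch encoder = new CloseableEncoderSketch();
        // try-with-resources calls close() even if encode() throws
        try (CloseableEncoderSketch e = encoder) {
            System.out.println(e.encode(data, 0, data.length));
        }
        try {
            encoder.encode(data, 0, data.length);
        } catch (IllegalStateException ex) {
            System.out.println("closed: " + ex.getMessage());
        }
    }
}
```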
      • encodeChunk

        public LZFChunk encodeChunk​(byte[] data,
                                    int offset,
                                    int len)
        Method for encoding an individual chunk, compressing it only if that reduces its size.
      • encodeChunkIfCompresses

        public LZFChunk encodeChunkIfCompresses​(byte[] data,
                                                int offset,
                                                int inputLen,
                                                double maxResultRatio)
        Method for compressing individual chunk, if (and only if) it compresses down to specified ratio or less.
        Parameters:
        maxResultRatio - Value between 0.05 and 1.10 indicating the maximum size of the result relative to the input; the encoded chunk is returned only if it fits within this ratio
        Returns:
        Encoded chunk if (and only if) input compresses down to specified ratio or less; otherwise returns null
      • appendEncodedChunk

        public int appendEncodedChunk​(byte[] input,
                                      int inputPtr,
                                      int inputLen,
                                      byte[] outputBuffer,
                                      int outputPos)
        Alternate chunk compression method that appends the encoded chunk to a pre-allocated buffer. The caller must ensure that the buffer is large enough to hold not just the encoded result but also the intermediate result, which may be up to 4% larger than the input. The caller may use LZFEncoder.estimateMaxWorkspaceSize(int) to calculate the necessary buffer size.
        Returns:
        Offset in output buffer after appending the encoded chunk
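        To make the sizing requirement concrete, here is a rough back-of-the-envelope bound for a single chunk. It is only a sketch built from the "up to 4% larger" figure above plus an assumed 7-byte maximum header; it is not the formula used by LZFEncoder.estimateMaxWorkspaceSize(int), which should be preferred in real code. The class and method names are hypothetical.

```java
final class WorkspaceEstimateSketch {
    static final int MAX_HEADER_LEN = 7; // assumed size of the largest chunk header

    /** Upper bound for one chunk: header, plus input, plus 4% expansion rounded up. */
    public static int estimate(int inputLen) {
        int expansion = (inputLen + 24) / 25; // ceil(inputLen * 0.04) in integer arithmetic
        return MAX_HEADER_LEN + inputLen + expansion;
    }

    public static void main(String[] args) {
        System.out.println(estimate(1000)); // prints "1047"
    }
}
```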
      • appendEncodedIfCompresses

        public int appendEncodedIfCompresses​(byte[] input,
                                             double maxResultRatio,
                                             int inputPtr,
                                             int inputLen,
                                             byte[] outputBuffer,
                                             int outputPos)
        Method similar to appendEncodedChunk(byte[], int, int, byte[], int), but that will only append encoded chunk if it compresses down to specified ratio (also considering header that will be needed); otherwise will return -1 without appending anything.
        Parameters:
        maxResultRatio - Value between 0.05 and 1.10 indicating the maximum size of the result relative to the input; the encoded chunk is appended only if it fits within this ratio
        Returns:
        Offset after appending compressed chunk, if compression produces compact enough chunk; otherwise -1 to indicate that no compression resulted.
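        The "append or return -1" contract can be illustrated with the sketch below, which takes an already-compressed payload and decides whether to append it. The class RatioAppendSketch is hypothetical, and both the 7-byte header layout ('Z', 'V', type 1, two length fields) and the way the header is counted against the ratio are assumptions; the library may count it differently.

```java
final class RatioAppendSketch {
    static final int HEADER_LEN_COMPRESSED = 7; // assumed compressed-chunk header size

    /**
     * Appends header and payload at outPos only if their combined size is at most
     * maxResultRatio * inputLen; returns the new offset, or -1 if nothing was appended.
     */
    public static int appendIfSmallEnough(byte[] encoded, int inputLen, double maxResultRatio,
                                          byte[] out, int outPos) {
        int total = HEADER_LEN_COMPRESSED + encoded.length;
        if (total > (int) (maxResultRatio * inputLen)) {
            return -1; // not compact enough: the output buffer is left untouched
        }
        out[outPos++] = 'Z';
        out[outPos++] = 'V';
        out[outPos++] = 1;                            // chunk type: compressed
        out[outPos++] = (byte) (encoded.length >> 8); // compressed length, big-endian
        out[outPos++] = (byte) encoded.length;
        out[outPos++] = (byte) (inputLen >> 8);       // original length, big-endian
        out[outPos++] = (byte) inputLen;
        System.arraycopy(encoded, 0, out, outPos, encoded.length);
        return outPos + encoded.length;
    }

    public static void main(String[] args) {
        byte[] payload = new byte[10];
        byte[] out = new byte[64];
        System.out.println(appendIfSmallEnough(payload, 100, 0.5, out, 3)); // prints "20"
        System.out.println(appendIfSmallEnough(payload, 100, 0.1, out, 3)); // prints "-1"
    }
}
```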
      • encodeAndWriteChunk

        public void encodeAndWriteChunk​(byte[] data,
                                        int offset,
                                        int len,
                                        java.io.OutputStream out)
                                 throws java.io.IOException
        Method for encoding individual chunk, writing it to given output stream.
        Throws:
        java.io.IOException
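        As an illustration of what writing a chunk to a stream involves, the sketch below emits a single uncompressed chunk. It does not call the library: ChunkWriteSketch is a hypothetical class, and the 5-byte header layout ('Z', 'V', type 0, 2-byte big-endian length) is an assumption about the LZF chunk format that this page does not spell out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class ChunkWriteSketch {
    /** Writes data[offset, offset+len) as one uncompressed chunk: header, then payload. */
    public static void writeUncompressedChunk(byte[] data, int offset, int len, OutputStream out)
            throws IOException {
        if (len < 0 || len > 0xFFFF) {
            throw new IllegalArgumentException("length must fit in 2 bytes: " + len);
        }
        out.write('Z');
        out.write('V');
        out.write(0);        // chunk type: not compressed
        out.write(len >> 8); // OutputStream.write(int) writes only the low 8 bits
        out.write(len);
        out.write(data, offset, len);
    }

    /** Convenience wrapper for in-memory use; an in-memory stream does not fail with I/O errors. */
    public static byte[] toChunkBytes(byte[] data) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            writeUncompressedChunk(data, 0, data.length, bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] chunk = toChunkBytes(new byte[] {10, 20, 30});
        System.out.println(chunk.length); // prints "8": 5 header bytes plus 3 payload bytes
    }
}
```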
      • encodeAndWriteChunkIfCompresses

        public boolean encodeAndWriteChunkIfCompresses​(byte[] data,
                                                       int offset,
                                                       int inputLen,
                                                       java.io.OutputStream out,
                                                       double resultRatio)
                                                throws java.io.IOException
        Method for encoding individual chunk, writing it to given output stream, if (and only if!) it compresses enough.
        Returns:
        True if compression occurred and chunk was written; false if not.
        Throws:
        java.io.IOException
      • tryCompress

        protected abstract int tryCompress​(byte[] in,
                                           int inPos,
                                           int inEnd,
                                           byte[] out,
                                           int outPos)
        Main workhorse method that will try to compress given chunk, and return end position (offset to byte after last included byte). Result will be "raw" encoded contents without chunk header information: caller is responsible for prepending header, if it chooses to use encoded data; it may also choose to instead create an uncompressed chunk.
        Returns:
        Output pointer after handling content, such that result - outPos (using the original value of outPos) is the actual length of the compressed chunk (without header)
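        The pointer contract (take a start offset, return the offset just past the last byte written) can be shown with a deliberately trivial stand-in. The hypothetical LiteralOnlySketch below never searches for repeated data; it only emits literal runs, assuming the LZF convention of a control byte holding the run length minus one followed by up to 32 raw bytes. It is not how either known subclass compresses.

```java
final class LiteralOnlySketch {
    static final int MAX_LITERAL = 32; // assumed maximum literal run length

    /** Emits in[inPos, inEnd) as literal runs; returns the offset after the last byte written. */
    public static int tryCompress(byte[] in, int inPos, int inEnd, byte[] out, int outPos) {
        while (inPos < inEnd) {
            int run = Math.min(MAX_LITERAL, inEnd - inPos);
            out[outPos++] = (byte) (run - 1); // control byte: run length minus one
            System.arraycopy(in, inPos, out, outPos, run);
            inPos += run;
            outPos += run;
        }
        return outPos;
    }

    public static void main(String[] args) {
        byte[] in = new byte[40];
        byte[] out = new byte[64];
        int start = 5;
        int end = tryCompress(in, 0, in.length, out, start);
        System.out.println(end - start); // prints "42": two control bytes plus 40 literals
    }
}
```

        Here the "compressed" length exceeds the input length, which is the situation in which a caller would fall back to writing an uncompressed chunk instead.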
      • hash

        protected final int hash​(int h)