Class ChunkEncoder

java.lang.Object
com.ning.compress.lzf.ChunkEncoder
All Implemented Interfaces:
Closeable, AutoCloseable
Direct Known Subclasses:
UnsafeChunkEncoder, VanillaChunkEncoder

public abstract class ChunkEncoder extends Object implements Closeable
Class that handles the actual encoding of individual chunks. Resulting chunks can be compressed or uncompressed; compression is used only if it actually reduces chunk size (including the overhead of additional header bytes).

Note that instances are stateful and hence not thread-safe; one instance is meant to be used for processing a sequence of chunks where total length is known.
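
The compress-or-store rule described above can be sketched as a size comparison that includes the header. This is a minimal illustration, not the library's code: the class name and method are hypothetical, and the two header sizes (7 bytes for a compressed chunk, 5 for an uncompressed one) are assumptions that do not come from this page.

```java
final class ChunkChoiceSketch {
    // Assumed header sizes in bytes, for illustration only; not stated in the text above.
    static final int HEADER_LEN_COMPRESSED = 7;
    static final int HEADER_LEN_NOT_COMPRESSED = 5;

    /** True when the compressed form, header included, is strictly smaller than the raw form. */
    static boolean shouldStoreCompressed(int rawLen, int compressedLen) {
        return compressedLen + HEADER_LEN_COMPRESSED < rawLen + HEADER_LEN_NOT_COMPRESSED;
    }

    public static void main(String[] args) {
        // A payload that shrinks a lot is worth storing compressed.
        System.out.println(shouldStoreCompressed(1000, 400));
        // Saving a single byte does not pay for the larger header.
        System.out.println(shouldStoreCompressed(1000, 999));
    }
}
```

Under these assumed sizes, a chunk is stored compressed only when compression saves more than the two extra header bytes.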

  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    protected byte[]
    _encodeBuffer
    Buffer in which encoded content is stored during processing
    protected final int
    _hashModulo
     
    protected int[]
    _hashTable
    Hash table contains lookup based on 3-byte sequence; key is hash of such triplet, value is offset in buffer.
    protected byte[]
    _headerBuffer
    Small buffer passed to LZFChunk, needed for writing chunk header
    protected final BufferRecycler
    _recycler
     
    protected static final int
    MAX_HASH_SIZE
     
    protected static final int
    MAX_OFF
     
    protected static final int
    MAX_REF
     
    protected static final int
    MIN_BLOCK_TO_COMPRESS
     
    protected static final int
    MIN_HASH_SIZE
     
    protected static final int
    TAIL_LENGTH
    How many tail bytes are we willing to just copy as is, to simplify loop end checks? 4 is bare minimum, may be raised to 8?
  • Constructor Summary

    Constructors
    Modifier
    Constructor
    Description
    protected
    ChunkEncoder(int totalLength)
    Uses a ThreadLocal soft-referenced BufferRecycler instance.
    protected
    ChunkEncoder(int totalLength, boolean bogus)
    Alternate constructor used when we want to avoid allocating the encoding buffer, in cases where the caller wants full control over allocations.
    protected
    ChunkEncoder(int totalLength, BufferRecycler bufferRecycler)
     
    protected
    ChunkEncoder(int totalLength, BufferRecycler bufferRecycler, boolean bogus)
    Alternate constructor used when we want to avoid allocating the encoding buffer, in cases where the caller wants full control over allocations.
  • Method Summary

    Modifier and Type
    Method
    Description
    int
    appendEncodedChunk(byte[] input, int inputPtr, int inputLen, byte[] outputBuffer, int outputPos)
    Alternate chunk compression method that will append encoded chunk in pre-allocated buffer.
    int
    appendEncodedIfCompresses(byte[] input, double maxResultRatio, int inputPtr, int inputLen, byte[] outputBuffer, int outputPos)
    Method similar to appendEncodedChunk(byte[], int, int, byte[], int), but that will only append encoded chunk if it compresses down to specified ratio (also considering header that will be needed); otherwise will return -1 without appending anything.
    private static int
    calcHashLen(int chunkSize)
     
    final void
    close()
    Method to close once encoder is no longer in use.
    void
    encodeAndWriteChunk(byte[] data, int offset, int len, OutputStream out)
    Method for encoding individual chunk, writing it to given output stream.
    boolean
    encodeAndWriteChunkIfCompresses(byte[] data, int offset, int inputLen, OutputStream out, double resultRatio)
    Method for encoding individual chunk, writing it to given output stream, if (and only if!) it compresses enough.
    LZFChunk
    encodeChunk(byte[] data, int offset, int len)
    Method for compressing (or not) individual chunks.
    LZFChunk
    encodeChunkIfCompresses(byte[] data, int offset, int inputLen, double maxResultRatio)
    Method for compressing individual chunk, if (and only if) it compresses down to specified ratio or less.
    BufferRecycler
    getBufferRecycler()
     
    protected final int
    hash(int h)
     
    protected abstract int
    tryCompress(byte[] in, int inPos, int inEnd, byte[] out, int outPos)
    Main workhorse method that will try to compress given chunk, and return end position (offset to byte after last included byte).

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • MIN_BLOCK_TO_COMPRESS

      protected static final int MIN_BLOCK_TO_COMPRESS
    • MIN_HASH_SIZE

      protected static final int MIN_HASH_SIZE
    • MAX_HASH_SIZE

      protected static final int MAX_HASH_SIZE
    • MAX_OFF

      protected static final int MAX_OFF
    • MAX_REF

      protected static final int MAX_REF
    • TAIL_LENGTH

      protected static final int TAIL_LENGTH
      How many tail bytes are we willing to just copy as is, to simplify loop end checks? 4 is bare minimum, may be raised to 8?
    • _recycler

      protected final BufferRecycler _recycler
    • _hashTable

      protected int[] _hashTable
      Hash table contains lookup based on 3-byte sequence; key is hash of such triplet, value is offset in buffer.
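
A triplet-keyed lookup of this kind can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the class, the method names, and the multiplier in `slot` are hypothetical, and the table size is assumed to be a power of two so that a bit mask can stand in for the modulo.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class TripletHashSketch {
    /** Packs the three bytes starting at pos into one int. */
    static int triplet(byte[] buf, int pos) {
        return ((buf[pos] & 0xFF) << 16) | ((buf[pos + 1] & 0xFF) << 8) | (buf[pos + 2] & 0xFF);
    }

    /** Maps a triplet to a table slot; the multiplier is an illustrative choice. */
    static int slot(int triplet, int mask) {
        return ((triplet * 57321) >> 9) & mask;
    }

    /** Returns {earlierOffset, laterOffset} of the first repeated triplet, or null if none is found. */
    static int[] firstRepeat(byte[] data, int tableSize) {
        if (Integer.bitCount(tableSize) != 1) {
            throw new IllegalArgumentException("tableSize must be a power of two");
        }
        int mask = tableSize - 1;
        int[] table = new int[tableSize];
        Arrays.fill(table, -1); // -1 marks an empty slot
        for (int pos = 0; pos + 2 < data.length; pos++) {
            int s = slot(triplet(data, pos), mask);
            int candidate = table[s];
            table[s] = pos; // value stored is the offset in the buffer
            // Different triplets can share a slot, so verify the bytes before trusting a hit.
            if (candidate >= 0
                    && data[candidate] == data[pos]
                    && data[candidate + 1] == data[pos + 1]
                    && data[candidate + 2] == data[pos + 2]) {
                return new int[] { candidate, pos };
            }
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] data = "abcXabc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Arrays.toString(firstRepeat(data, 256)));
    }
}
```

Because the table keeps only the most recent offset per slot, a colliding triplet can evict an earlier entry; the byte comparison guards against false hits but not against missed matches.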
    • _hashModulo

      protected final int _hashModulo
    • _encodeBuffer

      protected byte[] _encodeBuffer
      Buffer in which encoded content is stored during processing
    • _headerBuffer

      protected byte[] _headerBuffer
      Small buffer passed to LZFChunk, needed for writing chunk header
  • Constructor Details

    • ChunkEncoder

      protected ChunkEncoder(int totalLength)
      Uses a ThreadLocal soft-referenced BufferRecycler instance.
      Parameters:
      totalLength - Total encoded length; used for calculating size of hash table to use
    • ChunkEncoder

      protected ChunkEncoder(int totalLength, BufferRecycler bufferRecycler)
      Parameters:
      totalLength - Total encoded length; used for calculating size of hash table to use
      bufferRecycler - Buffer recycler instance, for usages where the caller manages the recycler instances
    • ChunkEncoder

      protected ChunkEncoder(int totalLength, boolean bogus)
      Alternate constructor used when we want to avoid allocating the encoding buffer, in cases where the caller wants full control over allocations.
    • ChunkEncoder

      protected ChunkEncoder(int totalLength, BufferRecycler bufferRecycler, boolean bogus)
      Alternate constructor used when we want to avoid allocating the encoding buffer, in cases where the caller wants full control over allocations.
  • Method Details

    • calcHashLen

      private static int calcHashLen(int chunkSize)
    • close

      public final void close()
      Method to close once encoder is no longer in use. Note: after calling this method, further calls to encodeChunk(byte[], int, int) will fail.
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
    • encodeChunk

      public LZFChunk encodeChunk(byte[] data, int offset, int len)
      Method for compressing (or not) individual chunks
    • encodeChunkIfCompresses

      public LZFChunk encodeChunkIfCompresses(byte[] data, int offset, int inputLen, double maxResultRatio)
      Method for compressing individual chunk, if (and only if) it compresses down to specified ratio or less.
      Parameters:
      maxResultRatio - Value between 0.05 and 1.10 to indicate maximum relative size of the result to use, in order to append encoded chunk
      Returns:
      Encoded chunk if (and only if) input compresses down to specified ratio or less; otherwise returns null
    • appendEncodedChunk

      public int appendEncodedChunk(byte[] input, int inputPtr, int inputLen, byte[] outputBuffer, int outputPos)
      Alternate chunk compression method that appends the encoded chunk to a pre-allocated buffer. Note that the caller must ensure that the buffer is large enough to hold not just the encoded result but also the intermediate result; the latter may be up to 4% larger than the input. The caller may use LZFEncoder.estimateMaxWorkspaceSize(int) to calculate the necessary buffer size.
      Returns:
      Offset in output buffer after appending the encoded chunk
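
To make the sizing requirement concrete, here is a rough back-of-the-envelope estimate based only on the "up to 4% larger" figure quoted above. It is a hypothetical helper, not the formula behind LZFEncoder.estimateMaxWorkspaceSize(int), which remains the method to prefer in real code; the class name, method name, and 16-byte slack for header bytes are all assumptions.

```java
final class WorkspaceSizeSketch {
    // Hypothetical allowance for header bytes; the real overhead depends on the chunk count.
    static final int HEADER_SLACK = 16;

    /** Input length plus 4% growth (rounded up) plus a small fixed slack. */
    static int roughWorkspaceSize(int inputLen) {
        if (inputLen < 0) {
            throw new IllegalArgumentException("inputLen must be non-negative");
        }
        long growth = (inputLen * 4L + 99) / 100; // ceil(4% of inputLen), computed in long
        long total = inputLen + growth + HEADER_SLACK;
        if (total > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("input too large for a single byte array");
        }
        return (int) total;
    }

    public static void main(String[] args) {
        byte[] outputBuffer = new byte[roughWorkspaceSize(1000)];
        System.out.println(outputBuffer.length);
    }
}
```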
    • appendEncodedIfCompresses

      public int appendEncodedIfCompresses(byte[] input, double maxResultRatio, int inputPtr, int inputLen, byte[] outputBuffer, int outputPos)
      Method similar to appendEncodedChunk(byte[], int, int, byte[], int), but that will only append encoded chunk if it compresses down to specified ratio (also considering header that will be needed); otherwise will return -1 without appending anything.
      Parameters:
      maxResultRatio - Value between 0.05 and 1.10 to indicate maximum relative size of the result to use, in order to append encoded chunk
      Returns:
      Offset after appending compressed chunk, if compression produces compact enough chunk; otherwise -1 to indicate that no compression resulted.
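
One plausible reading of the ratio check is sketched below. It is an interpretation, not the library's exact arithmetic: the class and method are hypothetical, the 7-byte compressed header is an assumption, and the rounding the real method applies may differ.

```java
final class RatioCheckSketch {
    // Assumed size of a compressed chunk header, in bytes.
    static final int HEADER_LEN_COMPRESSED = 7;

    /**
     * Returns the output offset after appending, or -1 when header plus payload
     * would exceed maxResultRatio times the input length.
     */
    static int offsetAfterAppend(int inputLen, int compressedLen, double maxResultRatio, int outputPos) {
        int encodedLen = HEADER_LEN_COMPRESSED + compressedLen;
        if (encodedLen > maxResultRatio * inputLen) {
            return -1; // not compact enough; nothing is appended
        }
        return outputPos + encodedLen;
    }

    public static void main(String[] args) {
        // 507 encoded bytes against a 600-byte budget: accepted.
        System.out.println(offsetAfterAppend(1000, 500, 0.6, 10));
        // The same 507 bytes against a 500-byte budget: rejected.
        System.out.println(offsetAfterAppend(1000, 500, 0.5, 10));
    }
}
```

Callers would treat the -1 return as a signal to fall back to another strategy, such as storing the chunk uncompressed.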
    • encodeAndWriteChunk

      public void encodeAndWriteChunk(byte[] data, int offset, int len, OutputStream out) throws IOException
      Method for encoding individual chunk, writing it to given output stream.
      Throws:
      IOException
    • encodeAndWriteChunkIfCompresses

      public boolean encodeAndWriteChunkIfCompresses(byte[] data, int offset, int inputLen, OutputStream out, double resultRatio) throws IOException
      Method for encoding individual chunk, writing it to given output stream, if (and only if!) it compresses enough.
      Returns:
      True if compression occurred and chunk was written; false if not.
      Throws:
      IOException
    • getBufferRecycler

      public BufferRecycler getBufferRecycler()
    • tryCompress

      protected abstract int tryCompress(byte[] in, int inPos, int inEnd, byte[] out, int outPos)
      Main workhorse method that will try to compress given chunk, and return end position (offset to byte after last included byte). Result will be "raw" encoded contents without chunk header information: caller is responsible for prepending header, if it chooses to use encoded data; it may also choose to instead create an uncompressed chunk.
      Returns:
      Output pointer after handling content, such that result - outPos (the original output position) is the actual length of the compressed chunk (without header)
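
The return-value contract can be illustrated with a stand-in that has the same parameter shape. The method below only copies bytes and performs no compression; it and the surrounding class are hypothetical and exist solely to show how a caller derives the payload length from the returned offset.

```java
final class TryCompressContractSketch {
    /** Stand-in with the same parameter shape as tryCompress; it copies bytes unchanged. */
    static int copyOnly(byte[] in, int inPos, int inEnd, byte[] out, int outPos) {
        int len = inEnd - inPos;
        System.arraycopy(in, inPos, out, outPos, len);
        return outPos + len; // offset of the byte after the last one written
    }

    /** Payload length is the returned offset minus the offset the caller started at. */
    static int payloadLength(byte[] in, byte[] out, int outPos) {
        int end = copyOnly(in, 0, in.length, out, outPos);
        return end - outPos;
    }

    public static void main(String[] args) {
        byte[] in = new byte[10];
        byte[] out = new byte[32];
        // Start at offset 7 to leave room for a header the caller would prepend (size assumed).
        System.out.println(payloadLength(in, out, 7));
    }
}
```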
    • hash

      protected final int hash(int h)