Class Bzip2HuffmanStageEncoder


  • final class Bzip2HuffmanStageEncoder
    extends java.lang.Object
    An encoder for the Bzip2 Huffman encoding stage.
    • Field Detail

      • HUFFMAN_HIGH_SYMBOL_COST

        private static final int HUFFMAN_HIGH_SYMBOL_COST
        Used in initial Huffman table generation.
        See Also:
        Constant Field Values
      • mtfBlock

        private final char[] mtfBlock
        The output of the Move To Front Transform and Run Length Encoding[2] stages.
      • mtfLength

        private final int mtfLength
        The actual number of values contained in the mtfBlock array.
      • mtfAlphabetSize

        private final int mtfAlphabetSize
        The number of unique values in the mtfBlock array.
      • mtfSymbolFrequencies

        private final int[] mtfSymbolFrequencies
        The global frequencies of values within the mtfBlock array.
      • huffmanCodeLengths

        private final int[][] huffmanCodeLengths
        The Canonical Huffman code lengths for each table.
      • huffmanMergedCodeSymbols

        private final int[][] huffmanMergedCodeSymbols
        Merged code symbols for each table. The value at each position is ((code length << 24) | code).
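        The packing described above can be illustrated with a minimal sketch. The class and method names (MergedCodeSketch, pack, codeLength, code) are hypothetical and are not part of this class; only the ((code length << 24) | code) layout comes from the field description.

```java
// Hypothetical helper illustrating the merged layout: length in the top 8 bits,
// code in the low 24 bits. Not part of Bzip2HuffmanStageEncoder.
final class MergedCodeSketch {
    // Packs a code length and a code into one int.
    static int pack(int codeLength, int code) {
        return (codeLength << 24) | code;
    }

    // Recovers the code length from the upper 8 bits.
    static int codeLength(int merged) {
        return merged >>> 24;
    }

    // Recovers the code from the lower 24 bits.
    static int code(int merged) {
        return merged & 0xFFFFFF;
    }
}
```

        Keeping both values in a single int means that writing one symbol needs only one array lookup to obtain the code and the number of bits to emit.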
      • selectors

        private final byte[] selectors
        The selectors for each segment.
    • Constructor Detail

      • Bzip2HuffmanStageEncoder

        Bzip2HuffmanStageEncoder(Bzip2BitWriter writer,
                                 char[] mtfBlock,
                                 int mtfLength,
                                 int mtfAlphabetSize,
                                 int[] mtfSymbolFrequencies)
        Parameters:
        writer - The Bzip2BitWriter which provides bit-level writes
        mtfBlock - The MTF block data
        mtfLength - The actual length of the MTF block
        mtfAlphabetSize - The size of the MTF block's alphabet
        mtfSymbolFrequencies - The frequencies of the MTF block's symbols
    • Method Detail

      • selectTableCount

        private static int selectTableCount(int mtfLength)
        Selects an appropriate table count for a given MTF length.
        Parameters:
        mtfLength - The length to select a table count for
        Returns:
        The selected table count
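        As a rough sketch of what such a selection can look like: the bzip2 format allows between 2 and 6 Huffman tables per block, and the thresholds below are those used by the reference bzip2 compressor. Whether this class uses the same thresholds is an assumption, and TableCountSketch is a hypothetical name.

```java
// Hypothetical sketch; thresholds follow the reference bzip2 compressor and
// are assumed, not confirmed, to match this class.
final class TableCountSketch {
    // Longer blocks get more tables, within the 2..6 range the format allows.
    static int selectTableCount(int mtfLength) {
        if (mtfLength >= 2400) {
            return 6;
        }
        if (mtfLength >= 1200) {
            return 5;
        }
        if (mtfLength >= 600) {
            return 4;
        }
        if (mtfLength >= 200) {
            return 3;
        }
        return 2;
    }
}
```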
      • generateHuffmanCodeLengths

        private static void generateHuffmanCodeLengths(int alphabetSize,
                                                       int[] symbolFrequencies,
                                                       int[] codeLengths)
        Generate a Huffman code length table for a given list of symbol frequencies.
        Parameters:
        alphabetSize - The total number of symbols
        symbolFrequencies - The frequencies of the symbols
        codeLengths - The array to which the generated code lengths should be written
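        The following is a minimal sketch of deriving code lengths from frequencies with the textbook priority-queue Huffman construction. It is not the algorithm used by this class: it does not limit the maximum code length, it assumes alphabetSize is at least 2, and CodeLengthSketch is a hypothetical name.

```java
import java.util.PriorityQueue;

// Hypothetical sketch using plain (not length-limited) Huffman construction.
final class CodeLengthSketch {
    static void generateCodeLengths(int alphabetSize, int[] symbolFrequencies, int[] codeLengths) {
        // Nodes 0..alphabetSize-1 are leaves; internal nodes are appended after them.
        final int[] parent = new int[2 * alphabetSize - 1];
        final long[] weight = new long[2 * alphabetSize - 1];
        // Order nodes by weight, breaking ties by node index for determinism.
        PriorityQueue<Integer> queue = new PriorityQueue<>((a, b) ->
                weight[a] != weight[b] ? Long.compare(weight[a], weight[b]) : Integer.compare(a, b));
        for (int i = 0; i < alphabetSize; i++) {
            // Give unused symbols a weight of 1 so that every symbol still receives a code.
            weight[i] = Math.max(1, symbolFrequencies[i]);
            queue.add(i);
        }
        int next = alphabetSize;
        // Repeatedly merge the two lightest nodes under a new parent.
        while (queue.size() > 1) {
            int a = queue.poll();
            int b = queue.poll();
            weight[next] = weight[a] + weight[b];
            parent[a] = next;
            parent[b] = next;
            queue.add(next);
            next++;
        }
        int root = next - 1;
        // A symbol's code length is the depth of its leaf below the root.
        for (int i = 0; i < alphabetSize; i++) {
            int depth = 0;
            for (int node = i; node != root; node = parent[node]) {
                depth++;
            }
            codeLengths[i] = depth;
        }
    }
}
```

        A production encoder would additionally cap the code lengths, because the bzip2 format bounds how long a code may be.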
      • generateHuffmanOptimisationSeeds

        private void generateHuffmanOptimisationSeeds()
        Generate initial Huffman code length tables, giving each table a different low-cost section of the alphabet; the sections are chosen so that their cumulative frequencies are roughly equal. Note that the initial tables are invalid for actual Huffman code generation, and only serve as the seed for later iterative optimisation in optimiseSelectorsAndHuffmanTables(boolean).
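        A simplified sketch of this seeding idea follows. The high cost value of 15, the cost of 0 inside the low-cost range, and the names SeedSketch and seedCodeLengths are assumptions made for illustration; the actual method may also adjust range boundaries in ways not shown here.

```java
// Hypothetical sketch: split the alphabet into consecutive ranges of roughly
// equal cumulative frequency, one low-cost range per table.
final class SeedSketch {
    static final int HIGH_COST = 15; // assumed value of HUFFMAN_HIGH_SYMBOL_COST

    static int[][] seedCodeLengths(int tableCount, int alphabetSize, int[] frequencies, int totalLength) {
        int[][] codeLengths = new int[tableCount][alphabetSize];
        int remaining = totalLength;
        int lowCostEnd = -1;
        for (int t = 0; t < tableCount; t++) {
            // Aim for an equal share of whatever frequency mass is still unassigned.
            int target = remaining / (tableCount - t);
            int lowCostStart = lowCostEnd + 1;
            int cumulative = 0;
            while (cumulative < target && lowCostEnd < alphabetSize - 1) {
                cumulative += frequencies[++lowCostEnd];
            }
            // Symbols outside this table's range are made expensive; those inside stay at 0.
            for (int s = 0; s < alphabetSize; s++) {
                if (s < lowCostStart || s > lowCostEnd) {
                    codeLengths[t][s] = HIGH_COST;
                }
            }
            remaining -= cumulative;
        }
        return codeLengths;
    }
}
```

        The resulting lengths are not a valid prefix code, which matches the note above: they only bias each table towards a different part of the alphabet before optimisation begins.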
      • optimiseSelectorsAndHuffmanTables

        private void optimiseSelectorsAndHuffmanTables(boolean storeSelectors)
        Co-optimise the selector list and the alternative Huffman table code lengths. This method is called repeatedly in the hope that the total encoded size of the selectors, the Huffman code lengths and the block data encoded with them will converge towards a minimum.
        If the data is highly incompressible, it is possible that the total encoded size will instead diverge (increase) slightly.
        Parameters:
        storeSelectors - If true, write out the (final) chosen selectors
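        The selector half of one optimisation pass can be sketched as follows, assuming the bzip2 convention of one selector per group of 50 symbols. For each group, the cost of encoding it with every table is summed and the cheapest table is chosen. SelectorSketch and chooseSelectors are hypothetical names, and the step that rebuilds each table from the symbols assigned to it is omitted.

```java
// Hypothetical sketch of selector choice; table regeneration is not shown.
final class SelectorSketch {
    static final int GROUP_SIZE = 50; // symbols per selector in the bzip2 format

    static byte[] chooseSelectors(char[] mtfBlock, int mtfLength, int[][] codeLengths) {
        int groupCount = (mtfLength + GROUP_SIZE - 1) / GROUP_SIZE;
        byte[] selectors = new byte[groupCount];
        for (int g = 0; g < groupCount; g++) {
            int start = g * GROUP_SIZE;
            int end = Math.min(start + GROUP_SIZE, mtfLength);
            int bestTable = 0;
            int bestCost = Integer.MAX_VALUE;
            for (int t = 0; t < codeLengths.length; t++) {
                // Cost of a group under a table is the total number of bits it would need.
                int cost = 0;
                for (int i = start; i < end; i++) {
                    cost += codeLengths[t][mtfBlock[i]];
                }
                if (cost < bestCost) {
                    bestCost = cost;
                    bestTable = t;
                }
            }
            selectors[g] = (byte) bestTable;
        }
        return selectors;
    }
}
```

        Alternating this choice with regenerating each table from the groups that selected it is what allows the total size to converge over repeated calls.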
      • assignHuffmanCodeSymbols

        private void assignHuffmanCodeSymbols()
        Assigns Canonical Huffman codes based on the calculated lengths.
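        Canonical assignment for a single table can be sketched as below: codes are handed out in order of increasing length, and within one length in order of symbol index. CanonicalCodeSketch and assignCodes are hypothetical names; the result uses the ((code length << 24) | code) layout described for huffmanMergedCodeSymbols.

```java
// Hypothetical sketch of canonical Huffman code assignment for one table.
final class CanonicalCodeSketch {
    static int[] assignCodes(int[] codeLengths) {
        int minimumLength = 32;
        int maximumLength = 0;
        for (int length : codeLengths) {
            minimumLength = Math.min(minimumLength, length);
            maximumLength = Math.max(maximumLength, length);
        }
        int[] merged = new int[codeLengths.length];
        int code = 0;
        for (int length = minimumLength; length <= maximumLength; length++) {
            for (int symbol = 0; symbol < codeLengths.length; symbol++) {
                if (codeLengths[symbol] == length) {
                    // Store the length in the top 8 bits and the code in the low 24 bits.
                    merged[symbol] = (length << 24) | code;
                    code++;
                }
            }
            // Moving to the next length appends a zero bit to the running code.
            code <<= 1;
        }
        return merged;
    }
}
```

        Because the codes follow from the lengths alone, only the lengths need to be stored in the compressed stream.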
      • writeSelectorsAndHuffmanTables

        private void writeSelectorsAndHuffmanTables(ByteBuf out)
        Write out the selector list and Huffman tables.
      • writeBlockData

        private void writeBlockData(ByteBuf out)
        Writes out the encoded block data.
      • encode

        void encode(ByteBuf out)
        Encodes and writes the block data.