Class LZW
java.lang.Object
org.apache.sis.internal.storage.inflater.PixelChannel
org.apache.sis.internal.storage.inflater.CompressionChannel
org.apache.sis.internal.storage.inflater.LZW
- All Implemented Interfaces:
Closeable, AutoCloseable, Channel, ReadableByteChannel
Inflater for values encoded with the LZW compression.
This compression is described in section 13 of the TIFF 6 specification, "LZW Compression".
Each code is written using at least 9 bits and at most 12 bits.
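As a rough illustration of the decoding scheme, the following sketch decodes a sequence of codes that have already been extracted from the bit stream. It is a textbook formulation, not the Apache SIS implementation. The class DecodeLoopSketch, its decode method and the List-based table are hypothetical. The values 256, 257 and 258 for the clear, end-of-information and first adaptive codes are those of the TIFF specification. The input is assumed to start with a clear code, and the table size limit is not enforced.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical textbook LZW decoding loop over already extracted codes. */
final class DecodeLoopSketch {
    static final int CLEAR_CODE = 256, EOI_CODE = 257, FIRST_ADAPTATIVE_CODE = 258;

    static byte[] decode(int[] codes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        List<byte[]> table = new ArrayList<>();
        byte[] previous = null;
        for (int code : codes) {
            if (code == EOI_CODE) {
                break;
            }
            if (code == CLEAR_CODE) {
                // Reset to the 256 single-byte strings, plus two unused slots for codes 256 and 257.
                table.clear();
                for (int i = 0; i < FIRST_ADAPTATIVE_CODE; i++) {
                    table.add(new byte[] {(byte) i});
                }
                previous = null;
                continue;
            }
            byte[] current;
            if (code < table.size()) {
                current = table.get(code);
            } else if (code == table.size() && previous != null) {
                // Code not yet in the table: previous string followed by its own first byte.
                current = append(previous, previous[0]);
            } else {
                throw new IllegalArgumentException("Unexpected code: " + code);
            }
            out.write(current, 0, current.length);
            if (previous != null) {
                // New entry: previous string followed by the first byte of the current one.
                table.add(append(previous, current[0]));
            }
            previous = current;
        }
        return out.toByteArray();
    }

    private static byte[] append(byte[] prefix, byte last) {
        byte[] s = Arrays.copyOf(prefix, prefix.length + 1);
        s[prefix.length] = last;
        return s;
    }
}
```

For example, the codes 256, 65, 66, 258, 257 decode to the four bytes "ABAB": code 258 is created from "A" followed by "B" when the second code is read.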
Legal note
Unisys's patent on the LZW algorithm expired in 2004.
- Since:
- 1.1
- Version:
- 1.3
-
Field Summary
Fields (Modifier and Type, Field, Description):
private static final int CLEAR_CODE - A 12-bit code meaning that we have exhausted the 4093 available codes and must reset the table to the initial set of 9-bit codes.
private int codeSize - Number of bits to read for the next code.
private final int[] entriesForCodes - Pointers to byte sequences for a code in the entriesForCodes array.
private static final int EOI_CODE - End of information.
private static final int FIRST_ADAPTATIVE_CODE - First code which is not one of the predefined codes.
private int indexOfFreeEntry - Index of the next entry available in entriesForCodes.
private int indexOfFreeString - Index of the next byte available in stringsFromCode.
private static final int LENGTH_MASK - The mask to apply on an entriesForCodes element for getting the length.
private static final int LENGTH_MASK_FOR_ALLOCATE - A mask used for detecting when a new allocation is required.
private static final int LOWEST_OFFSET_BIT - Position of the lowest bit in an entriesForCodes element where the offset is stored.
private static final int MAX_CODE_SIZE - Maximum number of bits in a code, inclusive.
private static final int MIN_CODE_SIZE - Initial number of bits in a code.
private static final int OFFSET_LIMIT - Maximal value + 1 that the offset can take.
private static final int OFFSET_MASK - The mask to apply on an entriesForCodes element for getting the compressed offset (before shifting).
private static final int OFFSET_SHIFT - The shift to apply on a compressed offset (after application of OFFSET_MASK) for getting the uncompressed offset.
private static final int OFFSET_TO_MAXIMUM - For computing the value of indexOfFreeEntry when codeSize needs to be incremented.
private int pendingLength - If some bytes could not be written in the previous read(…) execution because the target buffer was full, offset and length of those bytes.
private int pendingOffset - If some bytes could not be written in the previous read(…) execution because the target buffer was full, offset and length of those bytes.
private static final int PREALLOCATED_SPACE_IS_USED_MASK - Mask for a bit in an entriesForCodes element for telling whether the extra space allocated in the stringsFromCode array has already been used by another entry.
private int previousCode - Last code found in the previous iteration.
private static final int STRING_ALIGNMENT - Number of bits in an offset that are always 0 and consequently do not need to be stored.
private byte[] stringsFromCode - Sequences of bytes associated to codes.
Fields inherited from class org.apache.sis.internal.storage.inflater.CompressionChannel
input, listeners
-
Constructor Summary
Constructors (Constructor, Description):
LZW(ChannelDataInput input, StoreListeners listeners) - Creates a new channel which will decompress data from the given input.
-
Method Summary
Methods (Modifier and Type, Method, Description):
private void clearTable() - Clears the entriesForCodes table.
private static int length(int element) - Extracts the number of bytes of an entry stored in the stringsFromCode array.
private static boolean newEntryNeedsAllocation(int element) - Returns true if all the space allocated for the given entry is already used.
private static int offset(int element) - Extracts the index of the first byte of an entry stored in the stringsFromCode array.
private static int offsetAndLength(int offset, int length) - Encodes an offset together with its length.
int read(ByteBuffer target) - Decompresses some bytes from the input into the given destination buffer.
final int readNextCode() - Reads codeSize bits from the stream.
void setInputRegion(long start, long byteCount) - Prepares this inflater for reading a new tile or a new band of a tile.
private IOException unexpectedData() - The exception to throw if the decompression process encounters data that it cannot process.
Methods inherited from class org.apache.sis.internal.storage.inflater.CompressionChannel
close, createDataInput, finished, isOpen, repeat, resources
-
Field Details
-
CLEAR_CODE
private static final int CLEAR_CODE
A 12-bit code meaning that we have exhausted the 4093 available codes and must reset the table to the initial set of 9-bit codes.
-
EOI_CODE
private static final int EOI_CODE
End of information. This code appears at the end of a strip.
-
FIRST_ADAPTATIVE_CODE
private static final int FIRST_ADAPTATIVE_CODE
First code which is not one of the predefined codes.
-
OFFSET_TO_MAXIMUM
private static final int OFFSET_TO_MAXIMUM
For computing the value of indexOfFreeEntry when codeSize needs to be incremented. The TIFF specification says that the size needs to be incremented after codes 510, 1022 and 2046 are added to the entriesForCodes table. Those values are slightly lower than what we would expect if the full integer ranges were used.
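The rule above can be sketched as follows. This is a minimal illustration, not the Apache SIS implementation: the value 1 given to OFFSET_TO_MAXIMUM and the helper name nextCodeSize are assumptions, chosen so that the thresholds match the codes 510, 1022 and 2046 quoted from the TIFF specification.

```java
/** Hypothetical sketch of the rule for growing the LZW code size. */
final class CodeSizeSketch {
    static final int MIN_CODE_SIZE = 9;
    static final int MAX_CODE_SIZE = 12;

    /** Assumed value: thresholds are one less than the full power of two. */
    static final int OFFSET_TO_MAXIMUM = 1;

    /**
     * Returns the code size to use once indexOfFreeEntry is the next free slot.
     * The size grows when that index reaches (1 << codeSize) - OFFSET_TO_MAXIMUM,
     * that is, right after codes 510, 1022 and 2046 have been added.
     */
    static int nextCodeSize(int codeSize, int indexOfFreeEntry) {
        if (codeSize < MAX_CODE_SIZE && indexOfFreeEntry == (1 << codeSize) - OFFSET_TO_MAXIMUM) {
            return codeSize + 1;
        }
        return codeSize;
    }

    public static void main(String[] args) {
        int codeSize = MIN_CODE_SIZE;
        // Entries 258, 259, ... are added one by one; report each size change.
        for (int lastAdded = 258; lastAdded < 4094; lastAdded++) {
            int grown = nextCodeSize(codeSize, lastAdded + 1);
            if (grown != codeSize) {
                System.out.println("After adding code " + lastAdded + ": " + grown + " bits");
                codeSize = grown;
            }
        }
    }
}
```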
-
MIN_CODE_SIZE
private static final int MIN_CODE_SIZE
Initial number of bits in a code. The TIFF specification says that the size needs to be incremented after codes 510, 1022 and 2046 are added to the entriesForCodes table.
-
MAX_CODE_SIZE
private static final int MAX_CODE_SIZE
Maximum number of bits in a code, inclusive.
-
codeSize
private int codeSize
Number of bits to read for the next code. This number starts at 9 and increases up to 12. After 12 bits, a CLEAR_CODE should occur in the stream of LZW data.
-
LOWEST_OFFSET_BIT
private static final int LOWEST_OFFSET_BIT
Position of the lowest bit in an entriesForCodes element where the offset is stored. The position is chosen for leaving 12 bits for storing the length before the offset value.
Rationale: even in the worst-case scenario where the same byte is always appended to the sequence, the maximal length cannot exceed the dictionary size because a CLEAR_CODE will be emitted when the dictionary is full.
-
LENGTH_MASK
private static final int LENGTH_MASK
The mask to apply on an entriesForCodes element for getting the length.
-
STRING_ALIGNMENT
private static final int STRING_ALIGNMENT
Number of bits in an offset that are always 0 and consequently do not need to be stored. An intentional consequence of this restriction is that the size of blocks allocated in the stringsFromCode array must be a multiple of (1 << STRING_ALIGNMENT). This makes it possible to use the extra size for growing a string by up to that number of bytes without copying it.
Note: doing allocations only by blocks of 2² = 4 bytes may seem a waste of memory, but it actually reduces memory usage a lot (almost a factor of 4) because of the copies avoided. We tried alignment values 1, 2 and 3 and found that 2 seems optimal.
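The block arithmetic implied by this alignment can be sketched as follows. The value 2 for STRING_ALIGNMENT comes from the note above. The class AlignmentSketch and the helpers allocatedSize and spareBytes are hypothetical names introduced only for illustration.

```java
/** Hypothetical sketch of block-aligned string allocation. */
final class AlignmentSketch {
    /** Value given in the note above: blocks of 1 << 2 = 4 bytes. */
    static final int STRING_ALIGNMENT = 2;

    /** Rounds a string length up to the next multiple of (1 << STRING_ALIGNMENT). */
    static int allocatedSize(int length) {
        int block = 1 << STRING_ALIGNMENT;
        return (length + block - 1) & -block;
    }

    /** Number of bytes that can still be appended in place, without copying. */
    static int spareBytes(int length) {
        return allocatedSize(length) - length;
    }
}
```

Under these assumptions a string of 5 bytes occupies a block of 8 bytes and can grow by 3 more bytes in place, while a string of exactly 4 bytes has no spare room left.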
-
PREALLOCATED_SPACE_IS_USED_MASK
private static final int PREALLOCATED_SPACE_IS_USED_MASK
Mask for a bit in an entriesForCodes element for telling whether the extra space allocated in the stringsFromCode array has already been used by another entry. If yes (1), then that space cannot be used by a new entry. Instead, the new entry will need to allocate new space.
Note: the newEntryNeedsAllocation(int) implementation assumes that this bit is the sign bit.
-
OFFSET_MASK
private static final int OFFSET_MASK
The mask to apply on an entriesForCodes element for getting the compressed offset (before shifting).
-
OFFSET_SHIFT
private static final int OFFSET_SHIFT
The shift to apply on a compressed offset (after application of OFFSET_MASK) for getting the uncompressed offset.
-
OFFSET_LIMIT
private static final int OFFSET_LIMIT
Maximal value + 1 that the offset can take. The compressed offset takes all the bits after the length, minus one bit that we keep for the PREALLOCATED_SPACE_IS_USED_MASK flag. Note that compressed offsets are multiplied by 1 << STRING_ALIGNMENT for getting the actual offset.
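The bit layout described by LOWEST_OFFSET_BIT, LENGTH_MASK, OFFSET_MASK, OFFSET_SHIFT and OFFSET_LIMIT can be sketched as below. The numeric values are assumptions derived from the descriptions on this page (12 bits for the length, an alignment of 2, the sign bit as flag). They are not copied from the Apache SIS source, and the argument checks are an addition made for the illustration.

```java
/** Hypothetical sketch of packing an offset and a length into one int. */
final class EntryPackingSketch {
    static final int LOWEST_OFFSET_BIT = 12;                              // 12 low bits hold the length
    static final int LENGTH_MASK = (1 << LOWEST_OFFSET_BIT) - 1;
    static final int STRING_ALIGNMENT = 2;                                // offsets are multiples of 4
    static final int PREALLOCATED_SPACE_IS_USED_MASK = Integer.MIN_VALUE; // assumed to be the sign bit
    static final int OFFSET_MASK = ~(LENGTH_MASK | PREALLOCATED_SPACE_IS_USED_MASK);
    static final int OFFSET_SHIFT = LOWEST_OFFSET_BIT - STRING_ALIGNMENT;
    static final int OFFSET_LIMIT = 1 << (Integer.SIZE - 1 - OFFSET_SHIFT);

    /** Encodes an aligned offset together with its length. */
    static int offsetAndLength(int offset, int length) {
        boolean aligned = (offset & ((1 << STRING_ALIGNMENT) - 1)) == 0;
        if (!aligned || offset < 0 || offset >= OFFSET_LIMIT) {
            throw new IllegalArgumentException("Unaligned or out-of-range offset: " + offset);
        }
        if (length < 0 || length > LENGTH_MASK) {
            throw new IllegalArgumentException("Length out of range: " + length);
        }
        // The two low bits of the offset are always zero, so they are not stored.
        return (offset << OFFSET_SHIFT) | length;
    }

    /** Extracts the uncompressed offset, ignoring the flag bit. */
    static int offset(int element) {
        return (element & OFFSET_MASK) >>> OFFSET_SHIFT;
    }

    /** Extracts the length from the lowest bits. */
    static int length(int element) {
        return element & LENGTH_MASK;
    }
}
```

With these assumed values, OFFSET_LIMIT equals 1 << 21, and setting the flag bit on an element changes neither its decoded offset nor its decoded length.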
-
LENGTH_MASK_FOR_ALLOCATE
private static final int LENGTH_MASK_FOR_ALLOCATE
A mask used for detecting when a new allocation is required. If (length & LENGTH_MASK_FOR_ALLOCATE) == 0 and assuming that length is always incremented by 1, then a new allocation is necessary.
-
entriesForCodes
private final int[] entriesForCodes
Pointers to byte sequences for a code in the entriesForCodes array. Each element is a value encoded by the offsetAndLength(int, int) method. Elements are decoded by the offset(int) and length(int) methods.
-
previousCode
private int previousCode
Last code found in the previous iteration. This is a valid index in the entriesForCodes array. An EOI_CODE value means that the decompression is finished.
-
pendingOffset
private int pendingOffset
If some bytes could not be written in the previous read(…) execution because the target buffer was full, offset and length of those bytes. Otherwise 0.
-
pendingLength
private int pendingLength
If some bytes could not be written in the previous read(…) execution because the target buffer was full, offset and length of those bytes. Otherwise 0.
-
indexOfFreeEntry
private int indexOfFreeEntry
Index of the next entry available in entriesForCodes. Shall not be lower than 258.
-
indexOfFreeString
private int indexOfFreeString
Index of the next byte available in stringsFromCode. Shall not be lower than 1 << Byte.SIZE.
-
stringsFromCode
private byte[] stringsFromCode
Sequences of bytes associated to codes. For a given code c read from the stream, the first uncompressed byte is stringsFromCode[offset(entriesForCodes[c])] and the number of bytes is length(entriesForCodes[c]).
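The lookup described for stringsFromCode can be illustrated with a simplified table. In this sketch the offset and length of each code are kept in two separate, hypothetical arrays (offsets and lengths) instead of being packed in entriesForCodes. The class name and the tiny array sizes are also assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical sketch: codes pointing into one shared byte array. */
final class StringTableSketch {
    final byte[] stringsFromCode = new byte[16];
    // Unpacked stand-ins for the offset and length stored in entriesForCodes.
    final int[] offsets = new int[4];
    final int[] lengths = new int[4];

    /** Returns a copy of the byte sequence associated with the given code. */
    byte[] bytesFor(int code) {
        int start = offsets[code];
        return Arrays.copyOfRange(stringsFromCode, start, start + lengths[code]);
    }

    public static void main(String[] args) {
        StringTableSketch t = new StringTableSketch();
        byte[] abc = {'A', 'B', 'C'};
        System.arraycopy(abc, 0, t.stringsFromCode, 4, abc.length);
        // Codes 0 and 1 share the same storage because "AB" is a prefix of "ABC".
        t.offsets[0] = 4; t.lengths[0] = 2;
        t.offsets[1] = 4; t.lengths[1] = 3;
        System.out.println(new String(t.bytesFor(0), StandardCharsets.US_ASCII));
        System.out.println(new String(t.bytesFor(1), StandardCharsets.US_ASCII));
    }
}
```

Two codes sharing one offset with different lengths is what allows a string to grow without being copied, as long as spare bytes remain in its block.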
-
-
Constructor Details
-
LZW
Creates a new channel which will decompress data from the given input. The setInputRegion(long, long) method must be invoked after construction before a reading process can start.
- Parameters:
input - the source of data to decompress.
listeners - object where to report warnings.
-
-
Method Details
-
length
private static int length(int element)
Extracts the number of bytes of an entry stored in the stringsFromCode array.
- Parameters:
element - an element of the entriesForCodes array.
- Returns:
number of consecutive bytes to read in the stringsFromCode array.
-
offset
private static int offset(int element)
Extracts the index of the first byte of an entry stored in the stringsFromCode array.
- Parameters:
element - an element of the entriesForCodes array.
- Returns:
index of the first byte to read in the stringsFromCode array.
-
offsetAndLength
private static int offsetAndLength(int offset, int length)
Encodes an offset together with its length.
-
newEntryNeedsAllocation
private static boolean newEntryNeedsAllocation(int element)
Returns true if all the space allocated for the given entry is already used. This is true if at least one of the following conditions is true:
- The PREALLOCATED_SPACE_IS_USED_MASK bit is set, in which case the value is negative.
- All the extra space allowed by STRING_ALIGNMENT is used, in which case the lowest bits of the length are all zero.
- Parameters:
element - an element of the entriesForCodes array.
- Returns:
whether all the space for that entry is already used.
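The two conditions listed for newEntryNeedsAllocation can be combined in a single expression, as in this sketch. It assumes that the flag is the sign bit (as the note on PREALLOCATED_SPACE_IS_USED_MASK says), that the length sits in the lowest bits of the element, and that LENGTH_MASK_FOR_ALLOCATE equals (1 << STRING_ALIGNMENT) - 1. The last value is an assumption, not taken from the Apache SIS source.

```java
/** Hypothetical sketch of the allocation test on an entriesForCodes element. */
final class AllocationCheckSketch {
    static final int STRING_ALIGNMENT = 2;
    /** Assumed value: selects the length bits that are zero when a block is full. */
    static final int LENGTH_MASK_FOR_ALLOCATE = (1 << STRING_ALIGNMENT) - 1;
    /** Assumed to be the sign bit, so a flagged element is negative. */
    static final int PREALLOCATED_SPACE_IS_USED_MASK = Integer.MIN_VALUE;

    static boolean newEntryNeedsAllocation(int element) {
        // Negative: the spare bytes were already taken by another entry.
        // Low length bits all zero: the length fills its block exactly.
        return element < 0 || (element & LENGTH_MASK_FOR_ALLOCATE) == 0;
    }
}
```

Under these assumptions, an unflagged element of length 3 can still grow in place, whereas an element of length 4 or any flagged element requires new space.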
-
setInputRegion
Prepares this inflater for reading a new tile or a new band of a tile.
- Overrides:
setInputRegion in class CompressionChannel
- Parameters:
start - stream position where to start reading.
byteCount - number of bytes to read from the input.
- Throws:
IOException - if the stream cannot seek to the given start position.
-
clearTable
private void clearTable()
Clears the entriesForCodes table.
-
readNextCode
Reads codeSize bits from the stream.
- Returns:
the value of the next bits from the stream.
- Throws:
IOException - if an error occurred while reading.
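Reading a variable number of bits per code can be sketched as below. TIFF LZW packs codes into bytes starting with the most significant bit. This illustration reads from a plain byte array rather than from the channel input used by this class, and the class CodeReaderSketch with its -1 return convention is hypothetical.

```java
/** Hypothetical sketch of reading variable-width codes, most significant bit first. */
final class CodeReaderSketch {
    private final byte[] data;
    private int bitPosition;     // index of the next unread bit in data
    int codeSize = 9;            // number of bits of the next code, from 9 to 12

    CodeReaderSketch(byte[] data) {
        this.data = data;
    }

    /** Reads codeSize bits, or returns -1 if fewer bits remain. */
    int readNextCode() {
        if (bitPosition + codeSize > data.length * Byte.SIZE) {
            return -1;
        }
        int code = 0;
        for (int i = 0; i < codeSize; i++) {
            int currentByte = data[bitPosition >>> 3] & 0xFF;
            int bit = (currentByte >>> (7 - (bitPosition & 7))) & 1;
            code = (code << 1) | bit;
            bitPosition++;
        }
        return code;
    }
}
```

For the three bytes 0x80, 0x10, 0x40 and a code size of 9, the first two calls return 256 and 65. A third call returns -1 because only 6 bits remain.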
-
read
Decompresses some bytes from the input into the given destination buffer.
- Parameters:
target - the buffer into which bytes are to be transferred.
- Returns:
the number of bytes read, or -1 if end-of-stream.
- Throws:
IOException - if some other I/O error occurs.
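Because the target buffer may fill up in the middle of a decoded sequence, read(…) has to remember the bytes it could not deliver. The pendingOffset and pendingLength fields serve that purpose. The sketch below shows one way this bookkeeping could work. The class PendingBytesSketch and its methods are hypothetical, and a plain byte array stands in for stringsFromCode.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of carrying leftover bytes across read(...) calls. */
final class PendingBytesSketch {
    private final byte[] strings;    // stands in for stringsFromCode
    private int pendingOffset;       // 0 when nothing is pending
    private int pendingLength;       // 0 when nothing is pending

    PendingBytesSketch(byte[] strings) {
        this.strings = strings;
    }

    /** Copies as much of the sequence as fits and remembers the remainder. */
    void write(int offset, int length, ByteBuffer target) {
        int n = Math.min(length, target.remaining());
        target.put(strings, offset, n);
        pendingOffset = (n < length) ? offset + n : 0;
        pendingLength = length - n;
    }

    /** Flushes bytes left from a previous call; returns true if none remain. */
    boolean flushPending(ByteBuffer target) {
        if (pendingLength != 0) {
            write(pendingOffset, pendingLength, target);
        }
        return pendingLength == 0;
    }

    int pendingLength() {
        return pendingLength;
    }
}
```

A caller would invoke flushPending at the start of each read and continue decoding new codes only once it returns true.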
-
unexpectedData
The exception to throw if the decompression process encounters data that it cannot process.
-