Class ImageDataReader
Direct Known Subclasses:
DataReaderStrips, DataReaderTiled
The TIFF Floating-Point Formats
In addition to providing images, TIFF files can supply data in the form of numerical values. As of March 2020 the Commons Imaging library was extended to support some floating-point data formats.
Unfortunately, the TIFF floating-point format allows for many variations. At this time, only the most widely used of these are supported. When this code was written, only a small set of test data products was available. Thus it is likely that developers will wish to extend the range of floating-point data that can be processed as additional test data become available. When implementing extensions to this logic, developers are reminded that image processing requires handling millions of pixels, so attention to performance is essential to a successful implementation (please see the notes in DataReaderStrips.java for more information).
The TIFF floating-point specification is poorly documented, so these notes are included to clarify at least some aspects of the format. Some documentation and C-code examples are available in TIFF Technical Note 3 (April 8, 2005).
The Predictor==3 Case
TIFF specifies an extension for a predictor that is intended to improve data compression ratios for floating-point values. This predictor is specified using the TIFF Predictor tag with a value of 3 (see TIFF Technical Note 3). Consider a 4-byte floating-point value given in IEEE-754 format. Let f3 be the high-order byte, with f2 the next highest, followed by f1, and f0 for the low-order byte. This designation refers to the numerical significance of the bytes and should not be confused with their in-memory layout (little-endian versus big-endian). The sign bit and the upper 7 bits of the exponent are given in the high-order byte, followed by the remaining exponent bit and the mantissa in the lower bytes.
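The byte designation above can be illustrated with a minimal sketch. The class and method names here are hypothetical and are not part of Commons Imaging; the sketch relies only on the standard Float.floatToRawIntBits call.

```java
// Illustrative only: FloatByteOrderDemo and splitBySignificance are
// hypothetical names, not part of the Commons Imaging API.
final class FloatByteOrderDemo {

    /**
     * Splits a float into four bytes ordered by numerical significance:
     * index 0 holds f3 (high-order), index 3 holds f0 (low-order).
     * The result is independent of the platform's in-memory byte order.
     */
    static int[] splitBySignificance(float value) {
        int bits = Float.floatToRawIntBits(value);
        return new int[] {
            (bits >>> 24) & 0xff, // f3: sign bit + upper 7 exponent bits
            (bits >>> 16) & 0xff, // f2: lowest exponent bit + upper 7 mantissa bits
            (bits >>> 8) & 0xff,  // f1: mantissa
            bits & 0xff           // f0: mantissa (low-order)
        };
    }

    public static void main(String[] args) {
        int[] f = splitBySignificance(1.0f); // bit pattern 0x3F800000
        System.out.printf("f3=%02X f2=%02X f1=%02X f0=%02X%n", f[0], f[1], f[2], f[3]);
    }
}
```

For 1.0f the high-order byte is 0x3F and the next byte is 0x80, which shows the lowest exponent bit landing in f2.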
In many real-valued raster data sets, the sign and magnitude (exponent) of the values change slowly, but the bits in the mantissa vary rapidly in a semi-random manner. The information entropy in the mantissa tends to increase toward the lowest-order bytes. Thus, the high-order bytes have more redundancy than the low-order bytes and can compress more efficiently. To exploit this, the TIFF format splits the bytes into groups based on their order-of-magnitude. This splitting process takes place on a ROW-BY-ROW basis (note the emphasis; this point is not clearly documented in the spec). For example, for rows of 3 pixels each (A, B, and C in the first row; D, E, and F in the second), the data for two rows would be given as shown below (again, ignoring endian issues):
    Original:
        A3 A2 A1 A0   B3 B2 B1 B0   C3 C2 C1 C0
        D3 D2 D1 D0   E3 E2 E1 E0   F3 F2 F1 F0

    Bytes split into groups by order-of-magnitude:
        A3 B3 C3   A2 B2 C2   A1 B1 C1   A0 B0 C0
        D3 E3 F3   D2 E2 F2   D1 E1 F1   D0 E0 F0

To further improve the compression, the predictor takes the difference between each byte and the one before it. Again, the differences (deltas) are computed on a row-by-row basis. For the most part, the differences combine bytes associated with the same order-of-magnitude, though there is a special transition at the end of each order-of-magnitude set (shown in parentheses):

        A3, B3-A3, C3-B3, (A2-C3), B2-A2, C2-B2, (A1-C2), etc.
        D3, E3-D3, F3-E3, (D2-F3), E2-D2, etc.

Once the predictor transform is complete, the data is stored using conventional data compression techniques such as Deflate or LZW. In practice, floating-point data does not compress especially well, but using the above technique, the TIFF process typically reduces the overall storage size by 20 to 30 percent (depending on the data). TIFF Technical Note 3 specifies 3 data size formats for storing floating-point values:

        32 bits   IEEE-754 single-precision standard
        16 bits   IEEE-754 half-precision standard
        24 bits   A non-standard representation

At this time, we have not obtained data samples for the smaller representations used in combination with a predictor.
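The row-by-row split and differencing described above can be sketched as follows. This is a minimal illustration, not the library's implementation: it assumes 32-bit samples with one sample per pixel and byte differences taken modulo 256, and the names FloatPredictorSketch, encodeRow, and decodeRow are hypothetical.

```java
// Illustrative sketch of the predictor==3 transform for a single row.
// Assumptions: 32-bit floats, one sample per pixel, deltas modulo 256.
final class FloatPredictorSketch {

    /** Splits one row into byte planes (f3 first), then takes byte-wise differences. */
    static byte[] encodeRow(float[] row) {
        int n = row.length;
        byte[] planes = new byte[n * 4];
        for (int i = 0; i < n; i++) {
            int bits = Float.floatToRawIntBits(row[i]);
            planes[i] = (byte) (bits >>> 24);        // f3 group
            planes[n + i] = (byte) (bits >>> 16);    // f2 group
            planes[2 * n + i] = (byte) (bits >>> 8); // f1 group
            planes[3 * n + i] = (byte) bits;         // f0 group
        }
        // Work backward so each subtraction sees the original predecessor.
        // The first byte of the row is stored unchanged.
        for (int i = planes.length - 1; i > 0; i--) {
            planes[i] = (byte) (planes[i] - planes[i - 1]);
        }
        return planes;
    }

    /** Reverses encodeRow: running sum over the row, then reassembly of each float. */
    static float[] decodeRow(byte[] encoded, int n) {
        byte[] planes = encoded.clone();
        for (int i = 1; i < planes.length; i++) {
            planes[i] = (byte) (planes[i] + planes[i - 1]);
        }
        float[] row = new float[n];
        for (int i = 0; i < n; i++) {
            int bits = ((planes[i] & 0xff) << 24)
                    | ((planes[n + i] & 0xff) << 16)
                    | ((planes[2 * n + i] & 0xff) << 8)
                    | (planes[3 * n + i] & 0xff);
            row[i] = Float.intBitsToFloat(bits);
        }
        return row;
    }
}
```

Because each row is encoded independently, a decoder has to restart its running sum at the beginning of every row rather than carrying it across the whole strip or tile.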
Interleaved formats
TIFF Technical Note 3 also provides sample code for interleaved data, such as a real-valued vector or a complex pair. At the time of writing, no samples of interleaved data were available. As a caveat, the approach that the document provides has disadvantages in terms of code complexity and performance. Because the interleaved evaluation is embedded inside the pixel row and column loops, it puts many redundant conditional evaluations inside the doubly nested loops. It is recommended that when support for interleaved data is implemented, it be given its own block of code so as not to interfere with the processing of the more common non-interleaved variations.
-
Field Summary
Fields
Modifier and Type                         Field
private final int[]                       bitsPerSample
protected final int                       bitsPerSampleLength
protected final TiffDirectory             directory
protected final int                       height
private final int[]                       last
protected final PhotometricInterpreter    photometricInterpreter
protected final int                       predictor
protected final int                       sampleFormat
protected final int                       samplesPerPixel
protected final int                       width
-
Constructor Summary
Constructors
ImageDataReader(TiffDirectory directory, PhotometricInterpreter photometricInterpreter, int[] bitsPerSample, int predictor, int samplesPerPixel, int sampleFormat, int width, int height)
-
Method Summary
Modifier and Type          Method and Description
protected int[]            applyPredictor(int[] samples)
protected byte[]           decompress(byte[] compressedInput, int compression, int expectedSize, int tileWidth, int tileHeight)
(package private) void     getSamplesAsBytes(BitInputStream bis, int[] result)
                           Reads samples and places them in the supplied int array.
protected boolean          isHomogenous(int size)
                           Checks if all the bits per sample entries are the same size.
abstract BufferedImage     readImageData(Rectangle subImage)
abstract void              readImageData(ImageBuilder imageBuilder)
abstract TiffRasterData    readRasterData(Rectangle subImage)
                           Defines a method for accessing the floating-point raster data in a TIFF image.
protected void             resetPredictor()
(package private) void     transferBlockToRaster(int xBlock, int yBlock, int blockWidth, int blockHeight, int[] blockData, int xRaster, int yRaster, int rasterWidth, int rasterHeight, float[] rasterData)
protected int[]            unpackFloatingPointSamples(int width, int height, int scansize, byte[] bytes, int predictor, int bitsPerSample, ByteOrder byteOrder)
                           Given a source file that specifies the floating-point data format, unpack the raw bytes obtained from the source file and organize them into an array of integers containing the bit-equivalent of IEEE-754 32-bit floats.
-
Field Details
-
directory
protected final TiffDirectory directory
-
photometricInterpreter
protected final PhotometricInterpreter photometricInterpreter
-
bitsPerSample
private final int[] bitsPerSample
-
bitsPerSampleLength
protected final int bitsPerSampleLength
-
last
private final int[] last
-
predictor
protected final int predictor
-
samplesPerPixel
protected final int samplesPerPixel
-
width
protected final int width
-
height
protected final int height
-
sampleFormat
protected final int sampleFormat
-
-
Constructor Details
-
ImageDataReader
public ImageDataReader(TiffDirectory directory, PhotometricInterpreter photometricInterpreter, int[] bitsPerSample, int predictor, int samplesPerPixel, int sampleFormat, int width, int height)
-
-
Method Details
-
readImageData
public abstract void readImageData(ImageBuilder imageBuilder) throws ImageReadException, IOException
Throws:
ImageReadException
IOException
-
readImageData
public abstract BufferedImage readImageData(Rectangle subImage) throws ImageReadException, IOException
Throws:
ImageReadException
IOException
-
isHomogenous
protected boolean isHomogenous(int size)
Checks if all the bits per sample entries are the same size.
Parameters:
size - the size to check
Returns:
true if all the bits per sample entries are the same
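The check described above can be sketched as a simple scan. This is an illustration only: the documented method takes just the size argument and presumably consults the private bitsPerSample field, whereas the hypothetical helper allSameSize below takes the array explicitly so that it is self-contained.

```java
// Illustrative only: HomogenousSketch and allSameSize are hypothetical names.
final class HomogenousSketch {

    /** Returns true when every bits-per-sample entry equals the given size. */
    static boolean allSameSize(int[] bitsPerSample, int size) {
        for (int bits : bitsPerSample) {
            if (bits != size) {
                return false;
            }
        }
        return true;
    }
}
```

For instance, an 8-bit RGB image with bitsPerSample {8, 8, 8} passes the check for size 8, while a 5-6-5 layout does not.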
-
getSamplesAsBytes
void getSamplesAsBytes(BitInputStream bis, int[] result) throws IOException
Reads samples and places them in the supplied int array.
Parameters:
bis - the stream to read from
result - the samples array to populate, must be the same length as bitsPerSample.length
Throws:
IOException
-
resetPredictor
protected void resetPredictor()
-
applyPredictor
protected int[] applyPredictor(int[] samples)
-
decompress
protected byte[] decompress(byte[] compressedInput, int compression, int expectedSize, int tileWidth, int tileHeight) throws ImageReadException, IOException
Throws:
ImageReadException
IOException
-
unpackFloatingPointSamples
protected int[] unpackFloatingPointSamples(int width, int height, int scansize, byte[] bytes, int predictor, int bitsPerSample, ByteOrder byteOrder) throws ImageReadException
Given a source file that specifies the floating-point data format, unpack the raw bytes obtained from the source file and organize them into an array of integers containing the bit-equivalent of IEEE-754 32-bit floats. Source files containing 64-bit doubles are downcast to floats.
This method supports either the tile format or the strip format of TIFF source files. The scan size indicates the number of columns to be extracted. For strips, the width and the scan size are always the full width of the image. For tiles, the scan size is the full width of the tile, but the width may be smaller in cases where the tiles do not evenly divide the width (for example, a 256-pixel-wide tile in a 257-pixel-wide image would result in two columns of tiles, the second having only one column of pixels worth extracting).
Parameters:
width - the width of the data block to be extracted
height - the height of the data block to be extracted
scansize - the number of pixels in a single row of the block
bytes - the raw bytes
predictor - the predictor specified by the source; only predictor 3 is supported
bitsPerSample - the number of bits per sample, 32 or 64
byteOrder - the byte order for the source data
Returns:
a valid array of integers in row-major order, with dimensions scan-size wide and height high
Throws:
ImageReadException - in the event of an invalid format.
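The relationship between width and scan size for tiles can be worked through with a short sketch. The helper names below are hypothetical and only illustrate the arithmetic of the 257-pixel example; they are not how the library computes these values.

```java
// Illustrative only: TileGeometrySketch and its methods are hypothetical names.
final class TileGeometrySketch {

    /** Number of tile columns needed to cover the image width (rounds up). */
    static int tileColumns(int imageWidth, int tileWidth) {
        return (imageWidth + tileWidth - 1) / tileWidth;
    }

    /**
     * Number of pixel columns worth extracting from a tile in the given
     * tile column. This corresponds to the width argument; the scan size
     * would remain the full tile width.
     */
    static int usableWidth(int imageWidth, int tileWidth, int tileColumn) {
        int x0 = tileColumn * tileWidth; // image x-coordinate of the tile's left edge
        return Math.min(tileWidth, imageWidth - x0);
    }
}
```

With a 257-pixel-wide image and 256-pixel-wide tiles, tileColumns gives 2, and usableWidth gives 256 for the first tile column and 1 for the second.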
-
transferBlockToRaster
void transferBlockToRaster(int xBlock, int yBlock, int blockWidth, int blockHeight, int[] blockData, int xRaster, int yRaster, int rasterWidth, int rasterHeight, float[] rasterData)
Parameters:
xBlock - coordinate of block relative to source data
yBlock - coordinate of block relative to source data
blockWidth - width of block, in pixels
blockHeight - height of block, in pixels
blockData - the data for the block
xRaster - coordinate of raster relative to source data
yRaster - coordinate of raster relative to source data
rasterWidth - width of the raster (always smaller than source data)
rasterHeight - height of the raster (always smaller than source data)
rasterData - the raster data.
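One way to read the parameters above is as two rectangles in source-data coordinates, with only their overlap being copied. The sketch below works under that reading and is not the library's implementation: it assumes blockData is row-major with a stride of blockWidth and holds the bit-equivalents of 32-bit floats, and the names BlockTransferSketch and transfer are hypothetical.

```java
// Illustrative only: a possible block-to-raster copy under stated assumptions.
final class BlockTransferSketch {

    static void transfer(int xBlock, int yBlock, int blockWidth, int blockHeight, int[] blockData,
            int xRaster, int yRaster, int rasterWidth, int rasterHeight, float[] rasterData) {
        // Overlap of the block and the raster, in source-data coordinates.
        int x0 = Math.max(xBlock, xRaster);
        int y0 = Math.max(yBlock, yRaster);
        int x1 = Math.min(xBlock + blockWidth, xRaster + rasterWidth);
        int y1 = Math.min(yBlock + blockHeight, yRaster + rasterHeight);

        // If the rectangles do not overlap, the loops below do not execute.
        for (int y = y0; y < y1; y++) {
            // Row offsets are pre-adjusted so that adding the source x gives the array index.
            int blockOffset = (y - yBlock) * blockWidth - xBlock;
            int rasterOffset = (y - yRaster) * rasterWidth - xRaster;
            for (int x = x0; x < x1; x++) {
                rasterData[rasterOffset + x] = Float.intBitsToFloat(blockData[blockOffset + x]);
            }
        }
    }
}
```

Computing the overlap once, outside the loops, keeps conditional tests out of the per-pixel path, in line with the performance concern raised in the class notes.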
-
readRasterData
public abstract TiffRasterData readRasterData(Rectangle subImage) throws ImageReadException, IOException
Defines a method for accessing the floating-point raster data in a TIFF image. The implementations of this method in DataReaderStrips and DataReaderTiled assume that this instance is of a compatible data type (floating-point) and that all access checks have already been performed.
Parameters:
subImage - if non-null, instructs the access method to retrieve only a sub-section of the image data.
Returns:
a valid instance
Throws:
ImageReadException - in the event of an incompatible data form.
IOException - in the event of I/O error.
-