Package com.ibm.icu.text
Class UnicodeDecompressor
java.lang.Object
com.ibm.icu.text.UnicodeDecompressor
A decompression engine implementing the Standard Compression Scheme
for Unicode (SCSU) as outlined in Unicode Technical
Report #6.
USAGE
The static methods on UnicodeDecompressor may be used in a straightforward manner to decompress simple strings:
    byte [] compressed = ... ; // get compressed bytes from somewhere
    String result = UnicodeDecompressor.decompress(compressed);
The static methods have a fairly large memory footprint. For finer-grained control over memory usage, UnicodeDecompressor offers more powerful APIs allowing iterative decompression:
    // Decompress an array "bytes" of length "len" using a buffer of 512 chars
    // to the Writer "out"
    UnicodeDecompressor myDecompressor = new UnicodeDecompressor();
    final int BUFSIZE = 512;
    char [] charBuffer = new char [ BUFSIZE ];
    int charsWritten = 0;
    int [] bytesRead = new int [1];
    int totalBytesDecompressed = 0;
    int totalCharsWritten = 0;
    do {
        // do the decompression
        charsWritten = myDecompressor.decompress(bytes, totalBytesDecompressed, len, bytesRead, charBuffer, 0, BUFSIZE);
        // do something with the current set of chars
        out.write(charBuffer, 0, charsWritten);
        // update the no. of bytes decompressed
        totalBytesDecompressed += bytesRead[0];
        // update the no. of chars written
        totalCharsWritten += charsWritten;
    } while(totalBytesDecompressed < len);
    myDecompressor.reset(); // reuse decompressor
Decompression is performed according to the standard set forth in Unicode Technical Report #6.
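To give a feel for what the scheme in Unicode Technical Report #6 involves, here is a minimal sketch of an SCSU decoder restricted to a small part of single-byte mode. It is not the ICU implementation: the class name ScsuSingleByteSketch is hypothetical, the tag values and initial window offsets are taken from the report rather than from this page, and only literal bytes, bytes in the active dynamic window, the "change window" tags and the "quote Unicode" tag are handled.

```java
final class ScsuSingleByteSketch {
    // Initial dynamic window offsets as listed in UTR #6 (assumption: not taken from this page).
    private static final int[] DEFAULT_DYNAMIC_OFFSETS = {
        0x0080, 0x00C0, 0x0400, 0x0600, 0x0900, 0x3040, 0x30A0, 0xFF00
    };

    /** Decodes a restricted subset of SCSU single-byte mode; throws on any other tag. */
    static String decode(byte[] in) {
        StringBuilder out = new StringBuilder();
        int window = 0; // SCSU starts with dynamic window 0 active
        int i = 0;
        while (i < in.length) {
            int b = in[i++] & 0xFF;
            if (b >= 0x80) {
                // Upper half: an index into the active dynamic window.
                out.append((char) (DEFAULT_DYNAMIC_OFFSETS[window] + (b - 0x80)));
            } else if (b >= 0x20 || b == 0x00 || b == 0x09 || b == 0x0A || b == 0x0D) {
                // Printable ASCII plus NUL, TAB, LF and CR pass through unchanged.
                out.append((char) b);
            } else if (b >= 0x10 && b <= 0x17) {
                // SC0..SC7: switch the active dynamic window.
                window = b - 0x10;
            } else if (b == 0x0E) {
                // SQU: the next two bytes are one big-endian UTF-16 code unit.
                if (i + 1 >= in.length) {
                    throw new IllegalArgumentException("Truncated SQU sequence");
                }
                int hi = in[i++] & 0xFF;
                int lo = in[i++] & 0xFF;
                out.append((char) ((hi << 8) | lo));
            } else {
                // Window quoting, window definition and Unicode mode are left out of this sketch.
                throw new UnsupportedOperationException(
                    "Tag 0x" + Integer.toHexString(b) + " is not handled by this sketch");
            }
        }
        return out.toString();
    }
}
```

Under these assumptions, plain ASCII and Latin-1 bytes decode to themselves, which is why short Western-language strings often need no tags at all. A full decoder also has to track redefined windows and Unicode mode, which is the state that reset() restores.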
Author: Stephen F. Booth
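Field Summary
Modifier and Type     Field                     Description
static final int      ARMENIANINDEX
static final int      COMPRESSIONOFFSET
static final int      GREEKINDEX
static final int      HALFWIDTHKATAKANAINDEX
static final int      HIRAGANAINDEX
static final int      INVALIDCHAR
static final int      INVALIDWINDOW
static final int      IPAEXTENSIONINDEX
static final int      KATAKANAINDEX
static final int      LATININDEX
static final int      MAXINDEX
static final int      NUMSTATICWINDOWS
static final int      NUMWINDOWS
static final int      RESERVEDINDEX
static final int      SCHANGE0
static final int      SCHANGE1
static final int      SCHANGE2
static final int      SCHANGE3
static final int      SCHANGE4
static final int      SCHANGE5
static final int      SCHANGE6
static final int      SCHANGE7
static final int      SCHANGEU
static final int      SDEFINE0
static final int      SDEFINE1
static final int      SDEFINE2
static final int      SDEFINE3
static final int      SDEFINE4
static final int      SDEFINE5
static final int      SDEFINE6
static final int      SDEFINE7
static final int      SDEFINEX
static final int      SINGLEBYTEMODE
static final int[]    sOffsets                  Static compression window offsets
static final int[]    sOffsetTable              For window offset mapping
static final int      SQUOTE0
static final int      SQUOTE1
static final int      SQUOTE2
static final int      SQUOTE3
static final int      SQUOTE4
static final int      SQUOTE5
static final int      SQUOTE6
static final int      SQUOTE7
static final int      SQUOTEU
static final int      SRESERVED
static final int      UCHANGE0
static final int      UCHANGE1
static final int      UCHANGE2
static final int      UCHANGE3
static final int      UCHANGE4
static final int      UCHANGE5
static final int      UCHANGE6
static final int      UCHANGE7
static final int      UDEFINE0
static final int      UDEFINE1
static final int      UDEFINE2
static final int      UDEFINE3
static final int      UDEFINE4
static final int      UDEFINE5
static final int      UDEFINE6
static final int      UDEFINE7
static final int      UDEFINEX
static final int      UNICODEMODE
static final int      UQUOTEU
static final int      URESERVED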
Constructor Summary
Constructor               Description
UnicodeDecompressor()     Create a UnicodeDecompressor.
Method Summary
Modifier and Type     Method and Description
static String         decompress(byte[] buffer)
                      Decompress a byte array into a String.
static char[]         decompress(byte[] buffer, int start, int limit)
                      Decompress a byte array into a Unicode character array.
int                   decompress(byte[] byteBuffer, int byteBufferStart, int byteBufferLimit, int[] bytesRead, char[] charBuffer, int charBufferStart, int charBufferLimit)
                      Decompress a byte array into a Unicode character array.
void                  reset()
                      Reset the decompressor to its initial state.
Field Details
COMPRESSIONOFFSET
static final int COMPRESSIONOFFSET
NUMWINDOWS
static final int NUMWINDOWS
NUMSTATICWINDOWS
static final int NUMSTATICWINDOWS
INVALIDWINDOW
static final int INVALIDWINDOW
INVALIDCHAR
static final int INVALIDCHAR
SINGLEBYTEMODE
static final int SINGLEBYTEMODE
UNICODEMODE
static final int UNICODEMODE
MAXINDEX
static final int MAXINDEX
RESERVEDINDEX
static final int RESERVEDINDEX
LATININDEX
static final int LATININDEX
IPAEXTENSIONINDEX
static final int IPAEXTENSIONINDEX
GREEKINDEX
static final int GREEKINDEX
ARMENIANINDEX
static final int ARMENIANINDEX
HIRAGANAINDEX
static final int HIRAGANAINDEX
KATAKANAINDEX
static final int KATAKANAINDEX
HALFWIDTHKATAKANAINDEX
static final int HALFWIDTHKATAKANAINDEX
SDEFINEX
static final int SDEFINEX
SRESERVED
static final int SRESERVED
SQUOTEU
static final int SQUOTEU
SCHANGEU
static final int SCHANGEU
SQUOTE0
static final int SQUOTE0
SQUOTE1
static final int SQUOTE1
SQUOTE2
static final int SQUOTE2
SQUOTE3
static final int SQUOTE3
SQUOTE4
static final int SQUOTE4
SQUOTE5
static final int SQUOTE5
SQUOTE6
static final int SQUOTE6
SQUOTE7
static final int SQUOTE7
SCHANGE0
static final int SCHANGE0
SCHANGE1
static final int SCHANGE1
SCHANGE2
static final int SCHANGE2
SCHANGE3
static final int SCHANGE3
SCHANGE4
static final int SCHANGE4
SCHANGE5
static final int SCHANGE5
SCHANGE6
static final int SCHANGE6
SCHANGE7
static final int SCHANGE7
SDEFINE0
static final int SDEFINE0
SDEFINE1
static final int SDEFINE1
SDEFINE2
static final int SDEFINE2
SDEFINE3
static final int SDEFINE3
SDEFINE4
static final int SDEFINE4
SDEFINE5
static final int SDEFINE5
SDEFINE6
static final int SDEFINE6
SDEFINE7
static final int SDEFINE7
UCHANGE0
static final int UCHANGE0
UCHANGE1
static final int UCHANGE1
UCHANGE2
static final int UCHANGE2
UCHANGE3
static final int UCHANGE3
UCHANGE4
static final int UCHANGE4
UCHANGE5
static final int UCHANGE5
UCHANGE6
static final int UCHANGE6
UCHANGE7
static final int UCHANGE7
UDEFINE0
static final int UDEFINE0
UDEFINE1
static final int UDEFINE1
UDEFINE2
static final int UDEFINE2
UDEFINE3
static final int UDEFINE3
UDEFINE4
static final int UDEFINE4
UDEFINE5
static final int UDEFINE5
UDEFINE6
static final int UDEFINE6
UDEFINE7
static final int UDEFINE7
UQUOTEU
static final int UQUOTEU
UDEFINEX
static final int UDEFINEX
URESERVED
static final int URESERVED
sOffsetTable
static final int[] sOffsetTable
For window offset mapping
sOffsets
static final int[] sOffsets
Static compression window offsets
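The "window offset mapping" that sOffsetTable is described as serving is defined in Unicode Technical Report #6: a one-byte index following a "define window" tag selects the starting code point of a dynamic window. The sketch below reproduces that mapping as a standalone function. It is an illustration only: the class and method names are hypothetical, and the numeric ranges come from the report, not from the field values on this page. The fixed entries appear to correspond to the index constants LATININDEX through HALFWIDTHKATAKANAINDEX listed above, but that correspondence is an assumption.

```java
final class ScsuWindowOffsetSketch {
    /**
     * Maps an SCSU window index byte (0x00..0xFF) to a window start offset,
     * following the table in UTR #6. Returns -1 for reserved index values.
     */
    static int windowOffset(int index) {
        if (index >= 0x01 && index <= 0x67) {
            // Half-blocks from U+0080 up to U+3380.
            return index * 0x80;
        }
        if (index >= 0x68 && index <= 0xA7) {
            // Half-blocks from U+E000 up to U+FF80.
            return index * 0x80 + 0xAC00;
        }
        switch (index) {
            case 0xF9: return 0x00C0; // Latin-1 letters and Latin Extended-A
            case 0xFA: return 0x0250; // IPA Extensions
            case 0xFB: return 0x0370; // Greek
            case 0xFC: return 0x0530; // Armenian
            case 0xFD: return 0x3040; // Hiragana
            case 0xFE: return 0x30A0; // Katakana
            case 0xFF: return 0xFF60; // Halfwidth Katakana
            default:   return -1;     // 0x00 and 0xA8..0xF8 are reserved
        }
    }
}
```

For example, index 0x08 yields 0x0400, the start of the Cyrillic block, and index 0x68 yields 0xE000.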
Constructor Details
UnicodeDecompressor
public UnicodeDecompressor()
Create a UnicodeDecompressor. Sets all windows to their default values.
Method Details
decompress
public static String decompress(byte[] buffer)
Decompress a byte array into a String.
Parameters:
buffer - The byte array to decompress.
Returns:
A String containing the decompressed characters.
decompress
public static char[] decompress(byte[] buffer, int start, int limit)
Decompress a byte array into a Unicode character array.
Parameters:
buffer - The byte array to decompress.
start - The start of the byte run to decompress.
limit - The limit of the byte run to decompress.
Returns:
A character array containing the decompressed characters.
decompress
public int decompress(byte[] byteBuffer, int byteBufferStart, int byteBufferLimit, int[] bytesRead, char[] charBuffer, int charBufferStart, int charBufferLimit)
Decompress a byte array into a Unicode character array. This function will either completely fill the output buffer, or consume the entire input.
Parameters:
byteBuffer - The byte buffer to decompress.
byteBufferStart - The start of the byte run to decompress.
byteBufferLimit - The limit of the byte run to decompress.
bytesRead - A one-element array. If not null, on return its first element holds the number of bytes read from byteBuffer.
charBuffer - A buffer to receive the decompressed data. This buffer must be at minimum two characters in size.
charBufferStart - The starting offset to which to write decompressed data.
charBufferLimit - The limiting offset for writing decompressed data.
Returns:
The number of Unicode characters written to charBuffer.
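Because each call either fills the output buffer or consumes all remaining input, a caller normally invokes the method in a loop, advancing the input start by bytesRead[0] each time. The following sketch shows one way such a loop might be packaged. To stay compilable without ICU4J it is written against a hypothetical ChunkDecoder interface that mirrors the seven-argument signature documented here; drain, PASS_THROUGH and the class name are likewise illustrative, and PASS_THROUGH is a stand-in that copies 7-bit bytes, not an SCSU decoder.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;

final class ChunkedDecompressSketch {
    /** Hypothetical interface with the same shape as the seven-argument decompress method. */
    interface ChunkDecoder {
        int decompress(byte[] byteBuffer, int byteBufferStart, int byteBufferLimit,
                       int[] bytesRead,
                       char[] charBuffer, int charBufferStart, int charBufferLimit);
    }

    /** Feeds bytes[0..len) through the decoder in chunks; returns the total chars written. */
    static int drain(ChunkDecoder decoder, byte[] bytes, int len, Writer out, int bufSize) {
        char[] charBuffer = new char[bufSize]; // the documentation requires at least two chars
        int[] bytesRead = new int[1];
        int consumed = 0;
        int totalChars = 0;
        while (consumed < len) {
            int written = decoder.decompress(bytes, consumed, len, bytesRead,
                                             charBuffer, 0, bufSize);
            try {
                out.write(charBuffer, 0, written);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            totalChars += written;
            if (bytesRead[0] == 0 && written == 0) {
                // Defensive guard so a misbehaving decoder cannot spin forever.
                throw new IllegalStateException("Decoder made no progress");
            }
            consumed += bytesRead[0];
        }
        return totalChars;
    }

    /** Stand-in decoder for demonstration only: copies 7-bit bytes to chars. */
    static final ChunkDecoder PASS_THROUGH = (src, start, limit, read, dst, dstStart, dstLimit) -> {
        int n = Math.min(limit - start, dstLimit - dstStart);
        for (int k = 0; k < n; k++) {
            dst[dstStart + k] = (char) (src[start + k] & 0x7F);
        }
        if (read != null) {
            read[0] = n;
        }
        return n;
    };
}
```

With ICU4J on the classpath, a method reference to the instance method on a UnicodeDecompressor object should be usable as the ChunkDecoder argument, since the parameter lists match; that pairing is an assumption of this sketch and is not shown on this page.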
reset
public void reset()
Reset the decompressor to its initial state.