Package net.sf.saxon.serialize
Class UTF8Writer
- java.lang.Object
  - java.io.Writer
    - net.sf.saxon.serialize.UTF8Writer
- All Implemented Interfaces: Closeable, Flushable, Appendable, AutoCloseable, UnicodeWriter
public final class UTF8Writer extends Writer implements UnicodeWriter
Specialized buffering UTF-8 writer. The main reason for a custom version is to allow efficient buffer recycling; a second benefit is that the encoder has less overhead when encoding short content (compared to the JDK default codecs).
- Author: Tatu Saloranta. Modified by Michael Kay to enable efficient output of Unicode strings.
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
- (package private) int _surrogate: When outputting chars from BMP, surrogate pairs need to be coalesced.
- (package private) static int SURR1_FIRST
- (package private) static int SURR1_LAST
- (package private) static int SURR2_FIRST
- (package private) static int SURR2_LAST
-
Constructor Summary
Constructors (Constructor, Description):
- UTF8Writer(OutputStream out)
- UTF8Writer(OutputStream out, int bufferLength)
-
Method Summary
Methods (Modifier and Type, Method, Description):
- void close(): Complete the writing of characters to the result.
- void flush(): Flush the contents of any buffers.
- void write(char[] cbuf)
- void write(char[] cbuf, int off, int len)
- void write(int c): Write a single char.
- void write(String str): Process a supplied string.
- void write(String str, int off, int len)
- void write(UnicodeString chars): Process a supplied string.
- void writeAscii(byte[] content): Write a sequence of ASCII characters.
- void writeAscii(byte[] chars, int off, int len): Write a sequence of ASCII characters.
- void writeCodePoint(int codepoint): Process a single character.
- void writeLatin1(byte[] bytes, int off, int len)
- void writeRepeatedAscii(byte ch, int repeat): Write an ASCII character repeatedly.
Methods inherited from class java.io.Writer
append, append, append, nullWriter
-
-
-
-
Field Detail
-
SURR1_FIRST
static final int SURR1_FIRST
- See Also:
- Constant Field Values
-
SURR1_LAST
static final int SURR1_LAST
- See Also:
- Constant Field Values
-
SURR2_FIRST
static final int SURR2_FIRST
- See Also:
- Constant Field Values
-
SURR2_LAST
static final int SURR2_LAST
- See Also:
- Constant Field Values
-
_surrogate
int _surrogate
When outputting chars from BMP, surrogate pairs need to be coalesced. To do this, both halves of the pair must be known first; since a pair may be split across calls, temporary storage is needed for the first half.
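The coalescing described here can be sketched with plain JDK code. This is an illustrative sketch, not Saxon's implementation: the class `SurrogateSketch`, its `pending` field, and the `accept` method are hypothetical names, and the range constants are the standard UTF-16 surrogate ranges, which the SURR1_* and SURR2_* constants are assumed to correspond to (their values are not shown on this page).

```java
public class SurrogateSketch {
    // Standard UTF-16 surrogate ranges (assumed to match SURR1_* and SURR2_*).
    static final int HIGH_FIRST = 0xD800, HIGH_LAST = 0xDBFF;
    static final int LOW_FIRST = 0xDC00, LOW_LAST = 0xDFFF;

    // Hypothetical stand-in for _surrogate: a pending high surrogate, or 0 if none.
    private int pending = 0;

    /**
     * Feeds one UTF-16 code unit. Returns a complete code point, or -1 if the
     * unit was a high surrogate that must wait for its low half.
     */
    public int accept(char c) {
        if (pending != 0) {
            if (c < LOW_FIRST || c > LOW_LAST) {
                throw new IllegalArgumentException("Unpaired high surrogate");
            }
            // Combine the stored high half with the low half just received.
            int cp = 0x10000 + ((pending - HIGH_FIRST) << 10) + (c - LOW_FIRST);
            pending = 0;
            return cp;
        }
        if (c >= HIGH_FIRST && c <= HIGH_LAST) {
            pending = c; // remember the first half; the pair may span two calls
            return -1;
        }
        if (c >= LOW_FIRST && c <= LOW_LAST) {
            throw new IllegalArgumentException("Unpaired low surrogate");
        }
        return c; // ordinary BMP character
    }
}
```

Keeping the pending half in an instance field is what lets a pair be split across two separate write calls, which is the situation the field description mentions.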
-
-
Constructor Detail
-
UTF8Writer
public UTF8Writer(OutputStream out)
-
UTF8Writer
public UTF8Writer(OutputStream out, int bufferLength)
-
-
Method Detail
-
close
public void close() throws IOException
Description copied from interface: UnicodeWriter
Complete the writing of characters to the result. The default implementation does nothing.
- Specified by: close in interface AutoCloseable
- Specified by: close in interface Closeable
- Specified by: close in interface UnicodeWriter
- Specified by: close in class Writer
- Throws: IOException - if processing fails for any reason
-
flush
public void flush() throws IOException
Description copied from interface: UnicodeWriter
Flush the contents of any buffers. The default implementation does nothing.
- Specified by: flush in interface Flushable
- Specified by: flush in interface UnicodeWriter
- Specified by: flush in class Writer
- Throws: IOException - if processing fails for any reason
-
write
public void write(char[] cbuf) throws IOException
- Overrides: write in class Writer
- Throws: IOException
-
write
public void write(char[] cbuf, int off, int len) throws IOException
- Specified by: write in class Writer
- Throws: IOException
-
writeLatin1
public void writeLatin1(byte[] bytes, int off, int len) throws IOException
- Throws: IOException
-
writeAscii
public void writeAscii(byte[] content) throws IOException
Write a sequence of ASCII characters. The caller is responsible for ensuring that each byte represents a character in the range 1-127.
- Specified by: writeAscii in interface UnicodeWriter
- Parameters: content - the content to be written
- Throws: IOException - if processing fails for any reason
-
writeAscii
public void writeAscii(byte[] chars, int off, int len) throws IOException
Write a sequence of ASCII characters. The caller is responsible for ensuring that each byte represents a character in the range 1-127.
- Parameters:
  chars - the characters to be written
  off - the offset of the first character to be included
  len - the number of characters to be written
- Throws: IOException
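Because writeAscii leaves range validation to the caller, a caller that cannot guarantee its input might check the bytes first. The following is a minimal sketch of such a check; `AsciiCheck` and `isWritableAscii` are hypothetical names and are not part of Saxon.

```java
public class AsciiCheck {
    /** Returns true if every byte in [off, off + len) is in the range 1-127. */
    public static boolean isWritableAscii(byte[] chars, int off, int len) {
        for (int i = off; i < off + len; i++) {
            // Java bytes are signed: 0x80-0xFF appear as negative values, so a
            // test for <= 0 rejects both NUL and non-ASCII bytes.
            if (chars[i] <= 0) {
                return false;
            }
        }
        return true;
    }
}
```

A caller could fall back to one of the general write methods when this check returns false.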
-
writeRepeatedAscii
public void writeRepeatedAscii(byte ch, int repeat) throws IOException
Write an ASCII character repeatedly. Used for serializing whitespace.
- Specified by: writeRepeatedAscii in interface UnicodeWriter
- Parameters:
  ch - the ASCII character to be serialized (must be less than 0x7f)
  repeat - the number of occurrences to output
- Throws: IOException - if it fails
-
writeCodePoint
public void writeCodePoint(int codepoint) throws IOException
Process a single character. Default implementation wraps the codepoint into a single-character UnicodeString.
- Specified by: writeCodePoint in interface UnicodeWriter
- Parameters: codepoint - the character to be processed. Must not be a surrogate
- Throws: IOException - if processing fails for any reason
-
write
public void write(int c) throws IOException
Write a single char.
Note (MHK): Although the Writer interface says that the top half of the int is ignored, this implementation appears to accept a Unicode codepoint which is output as a 4-byte UTF-8 sequence.
- Overrides: write in class Writer
- Parameters: c - the char to be written
- Throws: IOException - If an I/O error occurs
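To illustrate what a 4-byte UTF-8 sequence for a supplementary codepoint looks like, here is a minimal sketch of the standard UTF-8 encoding rules in plain Java. It is not Saxon's code: `Utf8Sketch` and `encode` are hypothetical names, and the sketch returns a fresh array rather than writing into a recycled buffer.

```java
public class Utf8Sketch {
    /** Encodes one Unicode code point (not a surrogate) as UTF-8 bytes. */
    public static byte[] encode(int cp) {
        if (cp < 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            throw new IllegalArgumentException("Not an encodable code point: " + cp);
        }
        if (cp < 0x80) {
            // 1 byte: 0xxxxxxx
            return new byte[] {(byte) cp};
        }
        if (cp < 0x800) {
            // 2 bytes: 110xxxxx 10xxxxxx
            return new byte[] {
                (byte) (0xC0 | (cp >> 6)),
                (byte) (0x80 | (cp & 0x3F))
            };
        }
        if (cp < 0x10000) {
            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            return new byte[] {
                (byte) (0xE0 | (cp >> 12)),
                (byte) (0x80 | ((cp >> 6) & 0x3F)),
                (byte) (0x80 | (cp & 0x3F))
            };
        }
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx (code points above U+FFFF)
        return new byte[] {
            (byte) (0xF0 | (cp >> 18)),
            (byte) (0x80 | ((cp >> 12) & 0x3F)),
            (byte) (0x80 | ((cp >> 6) & 0x3F)),
            (byte) (0x80 | (cp & 0x3F))
        };
    }
}
```

For example, `Utf8Sketch.encode(0x1F600)` yields the four bytes F0 9F 98 80, while `Utf8Sketch.encode('A')` yields the single byte 41.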
-
write
public void write(UnicodeString chars) throws IOException
Process a supplied string.
- Specified by: write in interface UnicodeWriter
- Parameters: chars - the characters to be processed
- Throws: IOException - if processing fails for any reason
-
write
public void write(String str) throws IOException
Description copied from interface: UnicodeWriter
Process a supplied string.
- Specified by: write in interface UnicodeWriter
- Overrides: write in class Writer
- Parameters: str - the characters to be processed
- Throws: IOException - if processing fails for any reason
-
write
public void write(String str, int off, int len) throws IOException
- Overrides: write in class Writer
- Throws: IOException
-
-