Package com.ctc.wstx.util
Class TextBuffer
- java.lang.Object
  - com.ctc.wstx.util.TextBuffer
public final class TextBuffer extends java.lang.Object
TextBuffer is a class similar to StringBuilder, with the following differences:
- TextBuffer uses segmented character arrays, to avoid additional array copies when an array is not big enough. This means that reallocation is needed at most once: if and when the caller wants to access the contents as a linear array (char[], String).
- TextBuffer is not synchronized.
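The segmented-array idea described above can be illustrated with a minimal sketch. This is not the Woodstox implementation: the class name SegmentedBufferSketch, its fields, and the doubling growth policy are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the actual TextBuffer code. */
final class SegmentedBufferSketch {
    private final List<char[]> fullSegments = new ArrayList<>(); // completed segments
    private int fullSize;     // total chars held in fullSegments
    private char[] current;   // currently active (last) segment
    private int currentSize;  // chars used in the active segment

    SegmentedBufferSketch(int initialSize) {
        current = new char[Math.max(1, initialSize)];
    }

    void append(char c) {
        if (currentSize == current.length) {
            // Segment is full: park it and allocate a new one.
            // Nothing already collected is copied at this point.
            fullSegments.add(current);
            fullSize += current.length;
            current = new char[current.length * 2]; // assumed growth policy
            currentSize = 0;
        }
        current[currentSize++] = c;
    }

    void append(String s) {
        for (int i = 0; i < s.length(); i++) {
            append(s.charAt(i));
        }
    }

    int size() {
        return fullSize + currentSize;
    }

    /** The only full copy: segments are joined when a linear view is requested. */
    char[] contentsAsArray() {
        char[] result = new char[size()];
        int offset = 0;
        for (char[] seg : fullSegments) {
            System.arraycopy(seg, 0, result, offset, seg.length);
            offset += seg.length;
        }
        System.arraycopy(current, 0, result, offset, currentSize);
        return result;
    }

    String contentsAsString() {
        return new String(contentsAsArray());
    }
}
```

Compared with a single growing array, appends never move existing characters; the cost is paid once, when the caller asks for a char[] or String.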
Over time more and more cruft has accumulated here, mostly to support efficient access to collected text. Since access is easiest to do efficiently using callbacks, this class now needs to know the interfaces of SAX classes and validators.
Notes about usage: for debugging purposes, it is suggested to use the toString() method, as opposed to contentsAsArray() or contentsAsString(). Internally, the resulting code paths may or may not differ with respect to caching.
- Author:
- Tatu Saloranta
-
-
Nested Class Summary
Nested Classes
Modifier and Type   Class   Description
private static class
TextBuffer.BufferReader
-
Field Summary
Fields
Modifier and Type   Field   Description
(package private) static int
DEF_INITIAL_BUFFER_SIZE
Size of the first text segment buffer to allocate; need not contain the biggest segment, since new ones will get allocated as needed.
(package private) static int
INT_SPACE
static int
MAX_INDENT_SPACES
static int
MAX_INDENT_TABS
(package private) static int
MAX_SEGMENT_LENGTH
We will also restrict maximum length of individual segments to allocate (not including cases where we must return a single segment).
private ReaderConfig
mConfig
private char[]
mCurrentSegment
private int
mCurrentSize
Number of characters in currently active (last) segment
private boolean
mHasSegments
private char[]
mInputBuffer
Shared input buffer; stored here in case some input can be returned as is, without being copied to collector's own buffers.
private int
mInputLen
When using shared buffer, offset after the last character in shared buffer
private int
mInputStart
Character offset of first char in input buffer; -1 to indicate that input buffer currently does not contain any useful char data
private char[]
mResultArray
private java.lang.String
mResultString
String that will be constructed when the whole contents are needed; will be temporarily stored in case asked for again.
private java.util.ArrayList<char[]>
mSegments
List of segments prior to currently active segment.
private int
mSegmentSize
Amount of characters in segments in mSegments
private static java.lang.String
sIndSpaces
private static char[]
sIndSpacesArray
private static java.lang.String[]
sIndSpacesStrings
private static java.lang.String
sIndTabs
private static char[]
sIndTabsArray
private static java.lang.String[]
sIndTabsStrings
-
Constructor Summary
Constructors
Modifier   Constructor   Description
private
TextBuffer(ReaderConfig cfg)
-
Method Summary
Modifier and Type   Method   Description
private char[]
allocBuffer(int needed)
void
append(char c)
void
append(char[] c, int start, int len)
void
append(java.lang.String str)
private char[]
buildResultArray()
private int
calcNewSize(int latestSize)
Method used to determine size of the next segment to allocate to contain textual content.
private void
clearSegments()
char[]
contentsAsArray()
java.lang.String
contentsAsString()
java.lang.StringBuilder
contentsAsStringBuilder(int extraSpace)
Similar to contentsAsString(), but constructs a StringBuilder for further appends.
int
contentsToArray(int srcStart, char[] dst, int dstStart, int len)
void
contentsToStringBuilder(java.lang.StringBuilder sb)
static TextBuffer
createRecyclableBuffer(ReaderConfig cfg)
static TextBuffer
createTemporaryBuffer()
void
decode(org.codehaus.stax2.typed.TypedValueDecoder tvd)
Generic pass-through method which calls the given decoder with accumulated data.
int
decodeElements(org.codehaus.stax2.typed.TypedArrayDecoder tad, InputProblemReporter rep)
Pass-through decode method called to find the next token, decode it, and repeat the process as long as there are more tokens and the array decoder accepts more entries.
void
ensureNotShared()
Method called to make sure that buffer is not using shared input buffer; if it is, it will copy such contents to private buffer.
boolean
equalsString(java.lang.String str)
Note: it is assumed that this method is not used often enough to be a bottleneck, or for long segments.
private void
expand(int roomNeeded)
Method called when current segment is full, to allocate new segment.
char[]
finishCurrentSegment()
void
fireDtdCommentEvent(DTDEventListener l)
void
fireSaxCharacterEvents(org.xml.sax.ContentHandler h)
void
fireSaxCommentEvent(org.xml.sax.ext.LexicalHandler h)
void
fireSaxSpaceEvents(org.xml.sax.ContentHandler h)
char[]
getCurrentSegment()
int
getCurrentSegmentSize()
char[]
getTextBuffer()
int
getTextStart()
void
initBinaryChunks(org.codehaus.stax2.typed.Base64Variant v, org.codehaus.stax2.ri.typed.CharArrayBase64Decoder dec, boolean firstChunk)
Method that needs to be called to configure given base64 decoder with textual contents collected by this buffer.
boolean
isAllWhitespace()
int
rawContentsTo(java.io.Writer w)
Method that will stream contents of this buffer into specified Writer.
java.io.Reader
rawContentsViaReader()
Deprecated.
void
recycle(boolean force)
Method called to indicate that the underlying buffers should now be recycled if they haven't yet been recycled.
void
resetInitialized()
Method called to make sure there is a non-shared segment to use, without appending any content yet.
void
resetWithCopy(char[] buf, int start, int len)
void
resetWithEmpty()
Method called to clear out any content text buffer may have, and initializes buffer to use non-shared data.
void
resetWithEmptyString()
Similar to resetWithEmpty(), but actively marks current text content to be empty string (whereas former method leaves content as undefined).
void
resetWithIndentation(int indCharCount, char indChar)
void
resetWithShared(char[] buf, int start, int len)
Method called to initialize the buffer with a shared copy of data; this means that buffer will just have pointers to actual data.
void
setCurrentLength(int len)
int
size()
java.lang.String
toString()
Note: calling this method may not be as efficient as calling contentsAsString(), since it's not guaranteed that resulting String is cached.
void
unshare(int needExtra)
Method called if/when we need to append content when we have been initialized to use shared buffer.
void
validateText(org.codehaus.stax2.validation.XMLValidator vld, boolean lastSegment)
-
-
-
Field Detail
-
DEF_INITIAL_BUFFER_SIZE
static final int DEF_INITIAL_BUFFER_SIZE
Size of the first text segment buffer to allocate; it need not hold the biggest segment, since new ones will be allocated as needed. However, it is sensible to use a value that is often big enough to contain typical segments.
- See Also:
- Constant Field Values
-
MAX_SEGMENT_LENGTH
static final int MAX_SEGMENT_LENGTH
We will also restrict the maximum length of individual segments to allocate (not including cases where we must return a single segment). The value is somewhat arbitrary; it is chosen so that the memory used is no more than half a megabyte.
- See Also:
- Constant Field Values
-
INT_SPACE
static final int INT_SPACE
- See Also:
- Constant Field Values
-
mConfig
private final ReaderConfig mConfig
-
mInputBuffer
private char[] mInputBuffer
Shared input buffer; stored here in case some input can be returned as is, without being copied to collector's own buffers. Note that this is read-only for this Object.
-
mInputStart
private int mInputStart
Character offset of first char in input buffer; -1 to indicate that input buffer currently does not contain any useful char data
-
mInputLen
private int mInputLen
When using shared buffer, offset after the last character in shared buffer
-
mHasSegments
private boolean mHasSegments
-
mSegments
private java.util.ArrayList<char[]> mSegments
List of segments prior to currently active segment.
-
mSegmentSize
private int mSegmentSize
Amount of characters in segments in mSegments
-
mCurrentSegment
private char[] mCurrentSegment
-
mCurrentSize
private int mCurrentSize
Number of characters in currently active (last) segment
-
mResultString
private java.lang.String mResultString
String that will be constructed when the whole contents are needed; will be temporarily stored in case asked for again.
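A minimal sketch of this kind of result caching follows. The class CachedResultSketch, the buildCount counter, and the rule that any append invalidates the cache are assumptions for illustration; the page does not state when TextBuffer discards its cached String.

```java
/** Illustrative sketch only; not the actual TextBuffer code. */
final class CachedResultSketch {
    private final StringBuilder content = new StringBuilder();
    private String resultString; // cached linear form; null when stale
    int buildCount;              // how many times the String was really built

    void append(String s) {
        content.append(s);
        resultString = null; // contents changed, so the cached String is stale
    }

    String contentsAsString() {
        if (resultString == null) {
            resultString = content.toString();
            buildCount++;
        }
        return resultString; // repeated calls reuse the same instance
    }
}
```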
-
mResultArray
private char[] mResultArray
-
MAX_INDENT_SPACES
public static final int MAX_INDENT_SPACES
- See Also:
- Constant Field Values
-
MAX_INDENT_TABS
public static final int MAX_INDENT_TABS
- See Also:
- Constant Field Values
-
sIndSpaces
private static final java.lang.String sIndSpaces
- See Also:
- Constant Field Values
-
sIndSpacesArray
private static final char[] sIndSpacesArray
-
sIndSpacesStrings
private static final java.lang.String[] sIndSpacesStrings
-
sIndTabs
private static final java.lang.String sIndTabs
- See Also:
- Constant Field Values
-
sIndTabsArray
private static final char[] sIndTabsArray
-
sIndTabsStrings
private static final java.lang.String[] sIndTabsStrings
-
-
Constructor Detail
-
TextBuffer
private TextBuffer(ReaderConfig cfg)
-
-
Method Detail
-
createRecyclableBuffer
public static TextBuffer createRecyclableBuffer(ReaderConfig cfg)
-
createTemporaryBuffer
public static TextBuffer createTemporaryBuffer()
-
recycle
public void recycle(boolean force)
Method called to indicate that the underlying buffers should now be recycled if they haven't yet been recycled. Although the caller can still use this text buffer, it is not advisable to call this method if that is likely, since the next time a buffer is needed, the buffers will need to be reallocated. Note: calling this method also automatically clears the contents of the buffer.
-
resetWithEmpty
public void resetWithEmpty()
Method called to clear out any content text buffer may have, and initializes buffer to use non-shared data.
-
resetWithEmptyString
public void resetWithEmptyString()
Similar to resetWithEmpty(), but actively marks current text content to be empty string (whereas former method leaves content as undefined).
-
resetWithShared
public void resetWithShared(char[] buf, int start, int len)
Method called to initialize the buffer with a shared copy of data; this means that buffer will just have pointers to actual data. It also means that if anything is to be appended to the buffer, it will first have to unshare it (make a local copy).
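The share-then-unshare behaviour can be sketched as follows. SharedViewSketch and its use of a StringBuilder as private storage are assumptions for illustration; only the method name resetWithShared and its parameters come from this page.

```java
/** Illustrative sketch only; not the actual TextBuffer code. */
final class SharedViewSketch {
    private char[] shared;     // caller-owned array; never written to here
    private int start;         // offset of the first shared char
    private int len;           // number of shared chars
    private StringBuilder own; // private storage, created on first append

    void resetWithShared(char[] buf, int start, int len) {
        this.shared = buf;
        this.start = start;
        this.len = len;
        this.own = null;
    }

    void append(char c) {
        if (own == null) {
            // Unshare: copy the shared window into private storage first.
            own = new StringBuilder(len + 16);
            if (shared != null) {
                own.append(shared, start, len);
            }
            shared = null;
        }
        own.append(c);
    }

    boolean isShared() {
        return own == null && shared != null;
    }

    String contentsAsString() {
        if (own != null) {
            return own.toString();
        }
        return (shared == null) ? "" : new String(shared, start, len);
    }
}
```

As long as nothing is appended, no characters are copied; the first append pays for one copy of the shared window and leaves the caller's array untouched.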
-
resetWithCopy
public void resetWithCopy(char[] buf, int start, int len)
-
resetInitialized
public void resetInitialized()
Method called to make sure there is a non-shared segment to use, without appending any content yet.
-
allocBuffer
private final char[] allocBuffer(int needed)
-
clearSegments
private final void clearSegments()
-
resetWithIndentation
public void resetWithIndentation(int indCharCount, char indChar)
-
size
public int size()
- Returns:
- Number of characters currently stored by this collector
-
getTextStart
public int getTextStart()
-
getTextBuffer
public char[] getTextBuffer()
-
decode
public void decode(org.codehaus.stax2.typed.TypedValueDecoder tvd) throws java.lang.IllegalArgumentException
Generic pass-through method which calls the given decoder with accumulated data.
- Throws:
java.lang.IllegalArgumentException
-
decodeElements
public int decodeElements(org.codehaus.stax2.typed.TypedArrayDecoder tad, InputProblemReporter rep) throws org.codehaus.stax2.typed.TypedXMLStreamException
Pass-through decode method called to find the next token, decode it, and repeat the process as long as there are more tokens and the array decoder accepts more entries. All tokens processed will be "consumed", such that they will no longer be visible via the buffer.
- Returns:
- Number of tokens decoded; 0 means that no (more) tokens were found from this buffer.
- Throws:
org.codehaus.stax2.typed.TypedXMLStreamException
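The consume-as-you-decode loop can be sketched with plain JDK types. TokenDecodeSketch is hypothetical, and a Predicate<String> stands in for the array decoder under the assumed contract that it returns true once it can accept no more entries; the real TypedArrayDecoder interface is not reproduced here.

```java
import java.util.function.Predicate;

/** Illustrative sketch only; not the actual TextBuffer code. */
final class TokenDecodeSketch {
    private final char[] buf;
    private int start;     // first unconsumed char
    private final int end; // offset after the last char

    TokenDecodeSketch(String text) {
        buf = text.toCharArray();
        end = buf.length;
    }

    /** Returns the number of tokens decoded; 0 means no (more) tokens were found. */
    int decodeElements(Predicate<String> decoder) {
        int count = 0;
        while (true) {
            while (start < end && buf[start] <= ' ') {
                start++; // skip whitespace between tokens
            }
            if (start >= end) {
                break; // input exhausted
            }
            int tokenStart = start;
            while (start < end && buf[start] > ' ') {
                start++; // advancing 'start' is what consumes the token
            }
            count++;
            if (decoder.test(new String(buf, tokenStart, start - tokenStart))) {
                break; // decoder is full
            }
        }
        return count;
    }

    /** Text that has not been consumed yet. */
    String remaining() {
        return new String(buf, start, end - start);
    }
}
```

A caller would typically invoke decodeElements repeatedly, draining the decoder between calls, until it returns 0.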
-
initBinaryChunks
public void initBinaryChunks(org.codehaus.stax2.typed.Base64Variant v, org.codehaus.stax2.ri.typed.CharArrayBase64Decoder dec, boolean firstChunk)
Method that needs to be called to configure given base64 decoder with textual contents collected by this buffer.
- Parameters:
dec
- Decoder that will need data
firstChunk
- Whether this is the first segment fed or not; if it is, state needs to be fully reset; if not, only partially.
-
contentsAsString
public java.lang.String contentsAsString()
-
contentsAsStringBuilder
public java.lang.StringBuilder contentsAsStringBuilder(int extraSpace)
Similar to contentsAsString(), but constructs a StringBuilder for further appends.
- Parameters:
extraSpace
- Number of extra characters to preserve in StringBuilder beyond space immediately needed to hold the contents
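The effect of extraSpace can be shown with a small stand-alone sketch. BuilderCapacitySketch is a hypothetical helper that takes the contents as a char[] argument; it only demonstrates presizing a StringBuilder so that later appends need not grow it.

```java
/** Illustrative sketch only; not the actual TextBuffer code. */
final class BuilderCapacitySketch {
    static StringBuilder contentsAsStringBuilder(char[] contents, int extraSpace) {
        // Reserve room for the current contents plus the caller's planned appends.
        StringBuilder sb = new StringBuilder(contents.length + extraSpace);
        sb.append(contents);
        return sb;
    }
}
```

For example, a caller that knows it will add a ten-character suffix could pass extraSpace = 10 and avoid an internal resize.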
-
contentsToStringBuilder
public void contentsToStringBuilder(java.lang.StringBuilder sb)
-
contentsAsArray
public char[] contentsAsArray()
-
contentsToArray
public int contentsToArray(int srcStart, char[] dst, int dstStart, int len)
-
rawContentsTo
public int rawContentsTo(java.io.Writer w) throws java.io.IOException
Method that will stream contents of this buffer into specified Writer.
- Throws:
java.io.IOException
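Streaming segment by segment, without first building a linear array, might look like the sketch below. RawContentsSketch and its parameters are hypothetical, and treating the int result as the number of characters written is an assumption, since this page does not document the return value.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.List;

/** Illustrative sketch only; not the actual TextBuffer code. */
final class RawContentsSketch {
    static int rawContentsTo(List<char[]> fullSegments, char[] current,
                             int currentSize, Writer w) throws IOException {
        int written = 0;
        // Completed segments are written whole, in order.
        for (char[] seg : fullSegments) {
            w.write(seg, 0, seg.length);
            written += seg.length;
        }
        // The active segment is only partially filled.
        w.write(current, 0, currentSize);
        return written + currentSize; // assumed meaning: chars written
    }
}
```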
-
rawContentsViaReader
@Deprecated public java.io.Reader rawContentsViaReader() throws java.io.IOException
Deprecated.
- Throws:
java.io.IOException
-
isAllWhitespace
public boolean isAllWhitespace()
-
equalsString
public boolean equalsString(java.lang.String str)
Note: it is assumed that this method is not used often enough to be a bottleneck, or for long segments. Based on this, it is optimized for common simple cases where there is only one single character segment to use; fallback for other cases is to create such segment.
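The fast path and its fallback can be sketched as below. EqualsStringSketch, regionEquals, and passing the segments as a List are assumptions for illustration, not the actual method signature.

```java
import java.util.List;

/** Illustrative sketch only; not the actual TextBuffer code. */
final class EqualsStringSketch {
    /** Simple case: compare a window of one linear array with a String. */
    static boolean regionEquals(char[] seg, int start, int len, String str) {
        if (str.length() != len) {
            return false; // cheap length check first
        }
        for (int i = 0; i < len; i++) {
            if (seg[start + i] != str.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    /** Fallback for several segments: flatten into one array, then reuse the simple path. */
    static boolean equalsString(List<char[]> segments, String str) {
        if (segments.size() == 1) {
            char[] only = segments.get(0);
            return regionEquals(only, 0, only.length, str);
        }
        int total = 0;
        for (char[] seg : segments) {
            total += seg.length;
        }
        char[] flat = new char[total];
        int offset = 0;
        for (char[] seg : segments) {
            System.arraycopy(seg, 0, flat, offset, seg.length);
            offset += seg.length;
        }
        return regionEquals(flat, 0, total, str);
    }
}
```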
-
fireSaxCharacterEvents
public void fireSaxCharacterEvents(org.xml.sax.ContentHandler h) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
fireSaxSpaceEvents
public void fireSaxSpaceEvents(org.xml.sax.ContentHandler h) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
fireSaxCommentEvent
public void fireSaxCommentEvent(org.xml.sax.ext.LexicalHandler h) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
fireDtdCommentEvent
public void fireDtdCommentEvent(DTDEventListener l)
-
validateText
public void validateText(org.codehaus.stax2.validation.XMLValidator vld, boolean lastSegment) throws javax.xml.stream.XMLStreamException
- Throws:
javax.xml.stream.XMLStreamException
-
ensureNotShared
public void ensureNotShared()
Method called to make sure that buffer is not using shared input buffer; if it is, it will copy such contents to private buffer.
-
append
public void append(char c)
-
append
public void append(char[] c, int start, int len)
-
append
public void append(java.lang.String str)
-
getCurrentSegment
public char[] getCurrentSegment()
-
getCurrentSegmentSize
public int getCurrentSegmentSize()
-
setCurrentLength
public void setCurrentLength(int len)
-
finishCurrentSegment
public char[] finishCurrentSegment()
-
calcNewSize
private int calcNewSize(int latestSize)
Method used to determine size of the next segment to allocate to contain textual content.
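A capped growth policy of this kind might be sketched as follows. The doubling rule and the cap of 256 * 1024 chars are assumptions: the cap is derived from the "half a megabyte" remark on MAX_SEGMENT_LENGTH at 2 bytes per char, and the actual constant value is not shown on this page.

```java
/** Illustrative sketch only; not the actual TextBuffer code. */
final class SegmentSizingSketch {
    // Assumed cap: 256K chars, i.e. half a megabyte at 2 bytes per char.
    static final int MAX_SEGMENT_LENGTH = 256 * 1024;

    /** Assumed policy: double the latest segment size, but never exceed the cap. */
    static int calcNewSize(int latestSize) {
        long doubled = 2L * latestSize; // long arithmetic avoids int overflow
        return (int) Math.min(doubled, MAX_SEGMENT_LENGTH);
    }
}
```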
-
toString
public java.lang.String toString()
Note: calling this method may not be as efficient as calling contentsAsString(), since it's not guaranteed that resulting String is cached.
- Overrides:
toString in class java.lang.Object
-
unshare
public void unshare(int needExtra)
Method called if/when we need to append content when we have been initialized to use shared buffer.
-
expand
private void expand(int roomNeeded)
Method called when current segment is full, to allocate new segment.
- Parameters:
roomNeeded
- Number of characters that the resulting new buffer must have
-
buildResultArray
private char[] buildResultArray()
-
-