Class TextBuffer


  • public final class TextBuffer
    extends java.lang.Object
    TextBuffer is a class similar to StringBuilder, with the following differences:
    • TextBuffer uses segmented character arrays, so it does not need additional array copies when an array is not big enough. Reallocation therefore happens at most once: if and when the caller wants to access the contents as a linear array (char[], String).
    • TextBuffer is not synchronized.
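The segmented storage described above can be illustrated with a minimal sketch. The class below is hypothetical and is not the real TextBuffer; its name, the doubling growth policy, and the field names are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a segmented character buffer; not the real TextBuffer. */
public class SegmentedBufferSketch {
    private final List<char[]> segments = new ArrayList<>(); // finished (full) segments
    private char[] current;   // currently active (last) segment
    private int currentSize;  // characters used in the active segment
    private int segmentSize;  // characters held in the finished segments

    public SegmentedBufferSketch(int initialSize) {
        current = new char[Math.max(1, initialSize)];
    }

    public void append(char c) {
        if (currentSize == current.length) {
            // Active segment is full: park it and allocate a new one.
            // Characters already stored are not copied.
            segments.add(current);
            segmentSize += current.length;
            current = new char[current.length * 2]; // assumed growth policy
            currentSize = 0;
        }
        current[currentSize++] = c;
    }

    public void append(String s) {
        for (int i = 0; i < s.length(); i++) {
            append(s.charAt(i));
        }
    }

    public int size() {
        return segmentSize + currentSize;
    }

    /** The only place where contents are copied into one linear array. */
    public String contentsAsString() {
        char[] result = new char[size()];
        int offset = 0;
        for (char[] seg : segments) {
            System.arraycopy(seg, 0, result, offset, seg.length);
            offset += seg.length;
        }
        System.arraycopy(current, 0, result, offset, currentSize);
        return new String(result);
    }
}
```

Appending never moves existing characters; the single copy happens only when a caller asks for the linear form.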

    Over time, more and more cruft has accumulated here, mostly to support efficient access to collected text. Since access is easiest to do efficiently using callbacks, this class now needs to know the interfaces of SAX classes and validators.

    Notes about usage: for debugging purposes, using the toString() method is suggested, as opposed to contentsAsArray() or contentsAsString(). Internally, the resulting code paths may or may not differ with respect to caching.

    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      private static class  TextBuffer.BufferReader  
    • Field Summary

      Fields 
      Modifier and Type Field Description
      (package private) static int DEF_INITIAL_BUFFER_SIZE
      Size of the first text segment buffer to allocate; need not contain the biggest segment, since new ones will get allocated as needed.
      (package private) static int INT_SPACE  
      static int MAX_INDENT_SPACES  
      static int MAX_INDENT_TABS  
      (package private) static int MAX_SEGMENT_LENGTH
      We will also restrict maximum length of individual segments to allocate (not including cases where we must return a single segment).
      private ReaderConfig mConfig  
      private char[] mCurrentSegment  
      private int mCurrentSize
      Number of characters in currently active (last) segment
      private boolean mHasSegments  
      private char[] mInputBuffer
      Shared input buffer; stored here in case some input can be returned as is, without being copied to collector's own buffers.
      private int mInputLen
      When using shared buffer, offset after the last character in shared buffer
      private int mInputStart
      Character offset of first char in input buffer; -1 to indicate that input buffer currently does not contain any useful char data
      private char[] mResultArray  
      private java.lang.String mResultString
      String that will be constructed when the whole contents are needed; will be temporarily stored in case asked for again.
      private java.util.ArrayList<char[]> mSegments
      List of segments prior to currently active segment.
      private int mSegmentSize
      Number of characters in segments in mSegments
      private static java.lang.String sIndSpaces  
      private static char[] sIndSpacesArray  
      private static java.lang.String[] sIndSpacesStrings  
      private static java.lang.String sIndTabs  
      private static char[] sIndTabsArray  
      private static java.lang.String[] sIndTabsStrings  
    • Field Detail

      • DEF_INITIAL_BUFFER_SIZE

        static final int DEF_INITIAL_BUFFER_SIZE
        Size of the first text segment buffer to allocate; need not contain the biggest segment, since new ones will get allocated as needed. However, it's sensible to use something that is often big enough to contain segments.
        See Also:
        Constant Field Values
      • MAX_SEGMENT_LENGTH

        static final int MAX_SEGMENT_LENGTH
        We will also restrict maximum length of individual segments to allocate (not including cases where we must return a single segment). The value is somewhat arbitrary; it is chosen so that memory used is no more than half a megabyte.
        See Also:
        Constant Field Values
      • mInputBuffer

        private char[] mInputBuffer
        Shared input buffer; stored here in case some input can be returned as is, without being copied to collector's own buffers. Note that this is read-only for this object.
      • mInputStart

        private int mInputStart
        Character offset of first char in input buffer; -1 to indicate that input buffer currently does not contain any useful char data
      • mInputLen

        private int mInputLen
        When using shared buffer, offset after the last character in shared buffer
      • mHasSegments

        private boolean mHasSegments
      • mSegments

        private java.util.ArrayList<char[]> mSegments
        List of segments prior to currently active segment.
      • mSegmentSize

        private int mSegmentSize
        Number of characters in segments in mSegments
      • mCurrentSegment

        private char[] mCurrentSegment
      • mCurrentSize

        private int mCurrentSize
        Number of characters in currently active (last) segment
      • mResultString

        private java.lang.String mResultString
        String that will be constructed when the whole contents are needed; will be temporarily stored in case asked for again.
      • mResultArray

        private char[] mResultArray
      • sIndSpacesArray

        private static final char[] sIndSpacesArray
      • sIndSpacesStrings

        private static final java.lang.String[] sIndSpacesStrings
      • sIndTabsArray

        private static final char[] sIndTabsArray
      • sIndTabsStrings

        private static final java.lang.String[] sIndTabsStrings
    • Constructor Detail

    • Method Detail

      • createTemporaryBuffer

        public static TextBuffer createTemporaryBuffer()
      • recycle

        public void recycle​(boolean force)
        Method called to indicate that the underlying buffers should now be recycled if they haven't yet been recycled. Although the caller can still use this text buffer afterwards, it is not advisable to call this method if that is likely, since the buffers will have to be reallocated the next time one is needed. Note: calling this method automatically also clears contents of the buffer.
      • resetWithEmpty

        public void resetWithEmpty()
        Method called to clear out any content the text buffer may have, and to initialize the buffer to use non-shared data.
      • resetWithEmptyString

        public void resetWithEmptyString()
        Similar to resetWithEmpty(), but actively marks current text content to be empty string (whereas former method leaves content as undefined).
      • resetWithShared

        public void resetWithShared​(char[] buf,
                                    int start,
                                    int len)
        Method called to initialize the buffer with a shared copy of data; this means that buffer will just have pointers to actual data. It also means that if anything is to be appended to the buffer, it will first have to unshare it (make a local copy).
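The shared-then-unshare behaviour can be sketched as follows. This is a simplified, hypothetical class, not the actual implementation: the field names, the minimum private buffer size of 16, and the doubling policy are assumptions.

```java
import java.util.Arrays;

/** Hypothetical sketch of a buffer that points at shared data until an append forces a copy. */
public class SharedBufferSketch {
    private char[] sharedBuf;       // caller-owned data, treated as read-only
    private int sharedStart = -1;   // -1 means "no shared data in use"
    private int sharedLen;
    private char[] own = new char[16]; // private storage (size is an assumption)
    private int ownSize;

    public void resetWithShared(char[] buf, int start, int len) {
        // No copying here: only references and offsets are stored.
        sharedBuf = buf;
        sharedStart = start;
        sharedLen = len;
        ownSize = 0;
    }

    public boolean isShared() {
        return sharedStart >= 0;
    }

    public void append(char c) {
        if (isShared()) {
            unshare(1); // must own the data before modifying it
        }
        if (ownSize == own.length) {
            own = Arrays.copyOf(own, own.length * 2);
        }
        own[ownSize++] = c;
    }

    private void unshare(int needExtra) {
        own = new char[Math.max(16, sharedLen + needExtra)];
        System.arraycopy(sharedBuf, sharedStart, own, 0, sharedLen);
        ownSize = sharedLen;
        sharedBuf = null;
        sharedStart = -1;
        sharedLen = 0;
    }

    public String contentsAsString() {
        return isShared()
                ? new String(sharedBuf, sharedStart, sharedLen)
                : new String(own, 0, ownSize);
    }
}
```

In this sketch the caller's array is never written to; the first append triggers the local copy.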
      • resetWithCopy

        public void resetWithCopy​(char[] buf,
                                  int start,
                                  int len)
      • resetInitialized

        public void resetInitialized()
        Method called to make sure there is a non-shared segment to use, without appending any content yet.
      • allocBuffer

        private final char[] allocBuffer​(int needed)
      • clearSegments

        private final void clearSegments()
      • resetWithIndentation

        public void resetWithIndentation​(int indCharCount,
                                         char indChar)
      • size

        public int size()
        Returns:
        Number of characters currently stored by this collector
      • getTextStart

        public int getTextStart()
      • getTextBuffer

        public char[] getTextBuffer()
      • decode

        public void decode​(org.codehaus.stax2.typed.TypedValueDecoder tvd)
                    throws java.lang.IllegalArgumentException
        Generic pass-through method which calls the given decoder with the accumulated data
        Throws:
        java.lang.IllegalArgumentException
      • decodeElements

        public int decodeElements​(org.codehaus.stax2.typed.TypedArrayDecoder tad,
                                  InputProblemReporter rep)
                           throws org.codehaus.stax2.typed.TypedXMLStreamException
        Pass-through decode method called to find the next token, decode it, and repeat the process as long as there are more tokens and the array decoder accepts more entries. All tokens processed will be "consumed", such that they will no longer be visible via the buffer.
        Returns:
        Number of tokens decoded; 0 means that no (more) tokens were found from this buffer.
        Throws:
        org.codehaus.stax2.typed.TypedXMLStreamException
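The find, decode, and consume loop can be sketched with a standalone example. It does not use the Stax2 decoder types; the class below is hypothetical, assumes whitespace-separated tokens, and decodes them as ints into a fixed-size array that stands in for "the array decoder accepts more entries".

```java
/** Hypothetical sketch of decoding and consuming tokens from a character buffer. */
public class TokenDecodeSketch {
    private final char[] buf;
    private final int end;
    private int start; // first character not yet consumed

    public TokenDecodeSketch(String text) {
        buf = text.toCharArray();
        end = buf.length;
    }

    /**
     * Decodes whitespace-separated ints into dst until either the text
     * or the room in dst runs out. Returns the number of tokens decoded;
     * 0 means no (more) tokens were found.
     */
    public int decodeInts(int[] dst) {
        int count = 0;
        int ptr = start;
        while (count < dst.length) {
            while (ptr < end && buf[ptr] <= ' ') {
                ptr++; // skip whitespace before the token
            }
            if (ptr >= end) {
                break; // nothing left to decode
            }
            int tokenStart = ptr;
            while (ptr < end && buf[ptr] > ' ') {
                ptr++;
            }
            dst[count++] = Integer.parseInt(new String(buf, tokenStart, ptr - tokenStart));
        }
        start = ptr; // processed tokens are consumed and no longer visible
        return count;
    }

    /** Text that has not been consumed yet. */
    public String remaining() {
        return new String(buf, start, end - start);
    }
}
```

Calling decodeInts repeatedly with a small array drains the text in batches, and a return value of 0 signals that the buffer is exhausted.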
      • initBinaryChunks

        public void initBinaryChunks​(org.codehaus.stax2.typed.Base64Variant v,
                                     org.codehaus.stax2.ri.typed.CharArrayBase64Decoder dec,
                                     boolean firstChunk)
        Method that needs to be called to configure given base64 decoder with textual contents collected by this buffer.
        Parameters:
        dec - Decoder that will need data
        firstChunk - Whether this is the first segment fed or not; if it is, state needs to be fully reset; if not, only partially.
      • contentsAsString

        public java.lang.String contentsAsString()
      • contentsAsStringBuilder

        public java.lang.StringBuilder contentsAsStringBuilder​(int extraSpace)
        Similar to contentsAsString(), but constructs a StringBuilder for further appends.
        Parameters:
        extraSpace - Number of extra characters to preserve in StringBuilder beyond space immediately needed to hold the contents
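The role of extraSpace can be shown with a small hypothetical helper that takes the segments as a plain char[][] (an assumption for illustration; the real method reads the buffer's internal state).

```java
/** Hypothetical sketch of pre-sizing a StringBuilder for later appends. */
public class BuilderCapacitySketch {
    static StringBuilder contentsAsStringBuilder(char[][] segments, int extraSpace) {
        int len = 0;
        for (char[] seg : segments) {
            len += seg.length;
        }
        // Reserve room for the contents plus the extra characters the
        // caller expects to append, so later appends need not reallocate.
        StringBuilder sb = new StringBuilder(len + extraSpace);
        for (char[] seg : segments) {
            sb.append(seg);
        }
        return sb;
    }
}
```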
      • contentsToStringBuilder

        public void contentsToStringBuilder​(java.lang.StringBuilder sb)
      • contentsAsArray

        public char[] contentsAsArray()
      • contentsToArray

        public int contentsToArray​(int srcStart,
                                   char[] dst,
                                   int dstStart,
                                   int len)
      • rawContentsTo

        public int rawContentsTo​(java.io.Writer w)
                          throws java.io.IOException
        Method that will stream contents of this buffer into specified Writer.
        Throws:
        java.io.IOException
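Streaming the contents to a Writer needs no linear copy, because each segment can be written in turn. The sketch below is hypothetical: it takes the segments as parameters, and it assumes the returned int is the number of characters written, which the description above does not state.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of writing segmented contents straight to a Writer. */
public class RawContentsSketch {
    static int rawContentsTo(List<char[]> segments, char[] current, int currentSize, Writer w)
            throws IOException {
        int total = 0;
        for (char[] seg : segments) {
            w.write(seg, 0, seg.length); // finished segments are always full
            total += seg.length;
        }
        w.write(current, 0, currentSize); // active segment may be partly filled
        return total + currentSize;       // assumed meaning of the return value
    }

    /** Small demonstration using an in-memory Writer. */
    static String demo() {
        StringWriter w = new StringWriter();
        try {
            int written = rawContentsTo(
                    Arrays.asList("abc".toCharArray(), "def".toCharArray()),
                    "ghij".toCharArray(), 2, w);
            return written + ":" + w;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```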
      • rawContentsViaReader

        @Deprecated
        public java.io.Reader rawContentsViaReader()
                                            throws java.io.IOException
        Deprecated.
        Throws:
        java.io.IOException
      • isAllWhitespace

        public boolean isAllWhitespace()
      • equalsString

        public boolean equalsString​(java.lang.String str)
        Note: it is assumed that this method is not used often enough to be a bottleneck, or for long segments. Based on this, it is optimized for common simple cases where there is only one single character segment to use; fallback for other cases is to create such segment.
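The single-segment fast path mentioned in the note can be sketched as a character-by-character comparison that avoids creating a String. The helper below is hypothetical and takes the segment as explicit parameters.

```java
/** Hypothetical sketch of comparing one character segment against a String. */
public class EqualsStringSketch {
    static boolean equalsString(char[] buf, int start, int len, String str) {
        if (str.length() != len) {
            return false; // different lengths can never match
        }
        for (int i = 0; i < len; i++) {
            if (str.charAt(i) != buf[start + i]) {
                return false;
            }
        }
        return true;
    }
}
```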
      • fireSaxCharacterEvents

        public void fireSaxCharacterEvents​(org.xml.sax.ContentHandler h)
                                    throws org.xml.sax.SAXException
        Throws:
        org.xml.sax.SAXException
      • fireSaxSpaceEvents

        public void fireSaxSpaceEvents​(org.xml.sax.ContentHandler h)
                                throws org.xml.sax.SAXException
        Throws:
        org.xml.sax.SAXException
      • fireSaxCommentEvent

        public void fireSaxCommentEvent​(org.xml.sax.ext.LexicalHandler h)
                                 throws org.xml.sax.SAXException
        Throws:
        org.xml.sax.SAXException
      • validateText

        public void validateText​(org.codehaus.stax2.validation.XMLValidator vld,
                                 boolean lastSegment)
                          throws javax.xml.stream.XMLStreamException
        Throws:
        javax.xml.stream.XMLStreamException
      • ensureNotShared

        public void ensureNotShared()
        Method called to make sure that buffer is not using shared input buffer; if it is, it will copy such contents to private buffer.
      • append

        public void append​(char c)
      • append

        public void append​(char[] c,
                           int start,
                           int len)
      • append

        public void append​(java.lang.String str)
      • getCurrentSegment

        public char[] getCurrentSegment()
      • getCurrentSegmentSize

        public int getCurrentSegmentSize()
      • setCurrentLength

        public void setCurrentLength​(int len)
      • finishCurrentSegment

        public char[] finishCurrentSegment()
      • calcNewSize

        private int calcNewSize​(int latestSize)
        Method used to determine size of the next segment to allocate to contain textual content.
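One plausible sizing policy is sketched below. Both the doubling rule and the cap value are assumptions: the cap of 256K characters is inferred from the half-megabyte figure given for MAX_SEGMENT_LENGTH at 2 bytes per char, and the actual growth rule is not documented here.

```java
/** Hypothetical sketch of choosing the next segment size under a cap. */
public class SegmentSizeSketch {
    // Assumed cap: 256K chars, i.e. 512 KB at 2 bytes per char.
    static final int MAX_SEGMENT_LENGTH = 256 * 1024;

    /** Assumed policy: double the latest size, but never exceed the cap. */
    static int calcNewSize(int latestSize) {
        long doubled = 2L * latestSize; // long avoids int overflow
        return (int) Math.min(doubled, MAX_SEGMENT_LENGTH);
    }
}
```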
      • toString

        public java.lang.String toString()
        Note: calling this method may not be as efficient as calling contentsAsString(), since it's not guaranteed that resulting String is cached.
        Overrides:
        toString in class java.lang.Object
      • unshare

        public void unshare​(int needExtra)
        Method called if/when we need to append content when we have been initialized to use shared buffer.
      • expand

        private void expand​(int roomNeeded)
        Method called when current segment is full, to allocate new segment.
        Parameters:
        roomNeeded - Number of characters that the resulting new buffer must have
      • buildResultArray

        private char[] buildResultArray()