Class UnicodeBuilder

    • Constructor Detail

      • UnicodeBuilder

        public UnicodeBuilder()
        Create a Unicode builder with an initial allocation of 256 codepoints
      • UnicodeBuilder

        public UnicodeBuilder​(int allocate)
        Create a Unicode builder with a specified initial space allocation
        Parameters:
        allocate - the initial space allocation, in codepoints (32-bit integers)
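
        The wording "initial allocation" suggests a starting capacity rather than a hard limit. The following is a minimal sketch of how a growable store of 32-bit codepoints might behave. The class CodePointBuffer, its capacity() method, and the doubling policy are hypothetical names and choices made for illustration; they are not taken from the UnicodeBuilder implementation.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the UnicodeBuilder implementation.
final class CodePointBuffer {
    private int[] codePoints;   // one 32-bit slot per codepoint
    private int used = 0;       // number of slots currently filled

    CodePointBuffer() {
        this(256);              // mirrors the documented default allocation
    }

    CodePointBuffer(int allocate) {
        codePoints = new int[Math.max(allocate, 1)];
    }

    // Append one codepoint, doubling the array when it is full (assumed policy).
    CodePointBuffer append(int codePoint) {
        if (used == codePoints.length) {
            codePoints = Arrays.copyOf(codePoints, codePoints.length * 2);
        }
        codePoints[used++] = codePoint;
        return this;            // returning this allows chained calls
    }

    int length() {
        return used;
    }

    int capacity() {
        return codePoints.length;
    }

    @Override
    public String toString() {
        // String(int[], int, int) encodes supplementary codepoints as surrogate pairs
        return new String(codePoints, 0, used);
    }
}
```

        In this sketch a buffer created with a small allocation still accepts any number of codepoints; a larger initial allocation only avoids some of the array copies.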
    • Method Detail

      • append

        public UnicodeBuilder append​(char ch)
        Append a character, which must not be a surrogate. (This method is needed for C#, because implicit conversion of char to int isn't supported there.)
        Parameters:
        ch - the character
        Returns:
        this builder, with the new character added
      • append

        public UnicodeBuilder append​(int codePoint)
        Append a single Unicode character to the content
        Parameters:
        codePoint - the Unicode codepoint. The caller is responsible for ensuring that this is not a surrogate
        Returns:
        this builder, with the new character added
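
        Because the caller is responsible for excluding surrogates, a check before calling append may be useful. The helper below is a sketch using only java.lang.Character; the class CodePointChecks and the method isAppendable are hypothetical names, not part of this API.

```java
// Hypothetical helper; not part of the documented API.
final class CodePointChecks {
    // True if the value is in the Unicode range 0..0x10FFFF and is not a surrogate.
    static boolean isAppendable(int codePoint) {
        return Character.isValidCodePoint(codePoint)
                && (codePoint < Character.MIN_SURROGATE
                    || codePoint > Character.MAX_SURROGATE);
    }
}
```

        A caller could guard each append(int) call with such a test, or apply it once when the codepoints are first produced.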
      • append

        public UnicodeBuilder append​(IntIterator codePoints)
        Append multiple Unicode characters to the content
        Parameters:
        codePoints - an iterator delivering the codepoints to be added.
        Returns:
        this builder, with the new characters added
      • appendLatin

        public UnicodeBuilder appendLatin​(String str)
        Append a Java string to the content. The caller is responsible for ensuring that this consists entirely of characters in the Latin-1 character set
        Parameters:
        str - the string to be appended
        Returns:
        this builder, with the new string added
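
        Since appendLatin relies on the caller to guarantee Latin-1 content, a caller that is unsure could test the string first. This is a sketch only; the class Latin1Check and the method isLatin1 are hypothetical names introduced for illustration.

```java
// Hypothetical helper; not part of the documented API.
final class Latin1Check {
    // True if every UTF-16 code unit lies in the range U+0000 to U+00FF.
    static boolean isLatin1(String str) {
        for (int i = 0; i < str.length(); i++) {
            if (str.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }
}
```

        Strings that fail the test can be passed to append(CharSequence) instead, which accepts arbitrary characters.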
      • appendAll

        public UnicodeBuilder appendAll​(SequenceIterator iter)
        Append the string values of all the items in a sequence, with no separator
        Parameters:
        iter - the sequence of items
        Returns:
        this builder, with the new items added
      • append

        public UnicodeBuilder append​(CharSequence str)
        Append a Java CharSequence to the content. This may contain arbitrary characters, including well-formed surrogate pairs
        Parameters:
        str - the string to be appended
        Returns:
        this builder, with the new string added
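
        A CharSequence counts UTF-16 code units, whereas the builder counts codepoints, so a surrogate pair contributes two chars but one codepoint. The sketch below shows that decoding with the standard CharSequence.codePoints() method; the class SurrogatePairs is a hypothetical name, and nothing here is claimed about how the builder decodes its input internally.

```java
// Hypothetical helper; illustrates code unit versus codepoint counts.
final class SurrogatePairs {
    // Decode a CharSequence into codepoints; each well-formed surrogate
    // pair becomes a single codepoint in the result.
    static int[] toCodePoints(CharSequence str) {
        return str.codePoints().toArray();
    }
}
```

        For the four-char string "a\uD83D\uDE00b" this returns three codepoints, the middle one being 0x1F600.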
      • append

        public UnicodeBuilder append​(UnicodeString str)
        Append a UnicodeString object to the content.
        Parameters:
        str - the string to be appended. The length is currently restricted to 2^31.
        Returns:
        this builder, with the new string added
      • length

        public long length()
        Get the number of codepoints currently in the builder
        Returns:
        the size in codepoints
      • isEmpty

        public boolean isEmpty()
        Ask whether the content of the builder is empty
        Returns:
        true if the size is zero
      • toUnicodeString

        public UnicodeString toUnicodeString()
        Construct a UnicodeString whose value is formed from the contents of this builder
        Returns:
        the constructed UnicodeString
      • toStringItem

        public StringValue toStringItem​(AtomicType type)
        Construct a StringValue whose value is formed from the contents of this builder
        Parameters:
        type - the required type, for example BuiltInAtomicType.STRING or BuiltInAtomicType.UNTYPED_ATOMIC. The caller warrants that the value is a valid instance of this type. No validation or whitespace normalization is carried out
        Returns:
        the constructed StringValue
      • toString

        public String toString()
        Return a string containing the character content of this builder
        Overrides:
        toString in class Object
        Returns:
        the character content of this builder as a Java String
      • clear

        public void clear()
        Reset the contents of this builder to be empty
      • expand1to2

        public static byte[] expand1to2​(byte[] in,
                                        int start,
                                        int used,
                                        int allocate)
        Expand a byte array from 1-byte-per-character to 2-bytes-per-character
        Parameters:
        in - the input byte array
        start - the start offset in bytes
        used - the end offset in bytes
        allocate - the number of code points to allow for in the output byte array
        Returns:
        the new byte array
      • expandBytesToChars

        public static char[] expandBytesToChars​(byte[] in,
                                                int start,
                                                int end)
      • expand1to3

        public static byte[] expand1to3​(byte[] in,
                                        int start,
                                        int used,
                                        int allocate)
        Expand a byte array from 1-byte-per-character to 3-bytes-per-character
        Parameters:
        in - the input byte array
        start - the start offset in bytes
        used - the end offset in bytes
        allocate - the number of code points to allow for in the output byte array
        Returns:
        the new byte array
      • expand2to3

        public static byte[] expand2to3​(byte[] in,
                                        int start,
                                        int used,
                                        int allocate)
        Expand a byte array from 2-bytes-per-character to 3-bytes-per-character
        Parameters:
        in - the input byte array
        start - the start offset in bytes
        used - the end offset in bytes
        allocate - the number of code points to allow for in the output byte array
        Returns:
        the new byte array
      • expand

        public static byte[] expand​(byte[] in,
                                    int start,
                                    int end,
                                    int oldWidth,
                                    int newWidth,
                                    int allocate)
        Expand the width of the characters in a byte array
        Parameters:
        in - the input byte array
        start - the start offset in bytes
        end - the end offset in bytes
        oldWidth - the width of the characters (number of bytes per character) in the input array
        newWidth - the width of the characters (number of bytes per character) in the output array. If newWidth is less than or equal to oldWidth then the input array is copied; the width is never reduced
        allocate - the number of code points to allow for in the output byte array; if zero (or insufficient) the output array will have no spare space for expansion
        Returns:
        the new byte array
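
        The parameter descriptions above can be read as the following sketch. It is an interpretation, not the actual implementation: the class WidthExpansion is a hypothetical name, the big-endian byte layout (zero padding before the original bytes) is an assumption, and so is the choice to copy only the span between start and end when the width is unchanged.

```java
// Hypothetical sketch of widening fixed-width characters in a byte array.
final class WidthExpansion {
    static byte[] expand(byte[] in, int start, int end,
                         int oldWidth, int newWidth, int allocate) {
        int width = Math.max(oldWidth, newWidth);  // the width is never reduced
        int count = (end - start) / oldWidth;      // characters in the input span
        int slots = Math.max(allocate, count);     // no spare space if allocate is too small
        byte[] out = new byte[slots * width];
        int pad = width - oldWidth;                // leading zero bytes (assumed big-endian)
        for (int i = 0; i < count; i++) {
            System.arraycopy(in, start + i * oldWidth, out, i * width + pad, oldWidth);
        }
        return out;
    }
}
```

        Under these assumptions, expand1to2, expand1to3 and expand2to3 correspond to calls with (oldWidth, newWidth) fixed at (1, 2), (1, 3) and (2, 3) respectively.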
      • accept

        public UnicodeBuilder accept​(UnicodeString chars)
        Process a supplied string
        Specified by:
        accept in interface UniStringConsumer
        Parameters:
        chars - the characters to be processed
        Returns:
        this UniStringConsumer (to allow method chaining)
      • write

        public void write​(UnicodeString chars)
        Description copied from interface: UnicodeWriter
        Process a supplied string
        Specified by:
        write in interface UnicodeWriter
        Parameters:
        chars - the characters to be processed
      • writeAscii

        public void writeAscii​(byte[] content)
                        throws IOException
        Write a supplied string known to consist entirely of ASCII characters, supplied as a byte array
        Specified by:
        writeAscii in interface UnicodeWriter
        Parameters:
        content - byte array holding ASCII characters only
        Throws:
        IOException - if processing fails for any reason
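
        The method trusts the caller that every byte is ASCII. A caller holding bytes of uncertain origin could validate them first, as in this sketch; the class AsciiBytes and the method decode are hypothetical names, and throwing IllegalArgumentException is a choice made for the example rather than documented behaviour.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of the documented API.
final class AsciiBytes {
    // Decode ASCII bytes to a String, rejecting any byte outside 0x00 to 0x7F.
    static String decode(byte[] content) {
        for (byte b : content) {
            if (b < 0) {   // Java bytes are signed, so values above 0x7F are negative
                throw new IllegalArgumentException("non-ASCII byte in input");
            }
        }
        return new String(content, StandardCharsets.US_ASCII);
    }
}
```

        Each ASCII byte has the same numeric value as its codepoint, which is why such content can be stored at one byte per character.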
      • write

        public void write​(String chars)
                   throws IOException
        Process a supplied string
        Specified by:
        write in interface UnicodeWriter
        Parameters:
        chars - the characters to be processed
        Throws:
        IOException - if processing fails for any reason
      • trimToSize

        public void trimToSize()
      • close

        public void close()
        Complete the writing of characters to the result. The default implementation does nothing.
        Specified by:
        close in interface UnicodeWriter
        Specified by:
        close in interface UniStringConsumer