Package net.sf.saxon.str
Class Twine16
java.lang.Object
  net.sf.saxon.str.UnicodeString
    net.sf.saxon.str.Twine16

All Implemented Interfaces:
Comparable<UnicodeString>, AtomicMatchKey
public class Twine16 extends UnicodeString
Twine16 is a Unicode string consisting entirely of codepoints in the range 0-65535 (that is, the Basic Multilingual Plane), excluding surrogates. The number of codepoints is limited to 2^31-1.
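The constraint above (BMP only, no surrogates) can be checked with standard JDK calls. The following is a minimal standalone sketch, not Saxon code; the class name BmpCheck and its method are hypothetical names used only for illustration.

```java
// Hypothetical helper, not part of Saxon: checks the Twine16 precondition
// that a char array contains no surrogate code units.
final class BmpCheck {

    /** Returns true if no element of the array is a high or low surrogate. */
    static boolean isSurrogateFree(char[] chars) {
        for (char c : chars) {
            if (Character.isSurrogate(c)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Latin-1 and CJK characters are all in the BMP
        System.out.println(isSurrogateFree("caf\u00e9 \u4e2d".toCharArray()));
        // U+1F600 is encoded in UTF-16 as a surrogate pair
        System.out.println(isSurrogateFree("\uD83D\uDE00".toCharArray()));
    }
}
```

A caller that cannot guarantee this property would presumably need a wider representation than 16 bits per codepoint.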
-
Field Summary
Modifier and Type    Field
protected int        cachedHash
protected char[]     chars
-
Method Summary
int codePointAt(long index)
    Get the code point at a given position in the string.
IntIterator codePoints()
    Get an iterator over the Unicode codepoints in the value.
int compareTo(UnicodeString other)
    Compare this string to another using codepoint comparison.
(package private) void copy16bit(char[] target, int offset)
    Copy this string, as a sequence of 16-bit characters, to a specified array.
(package private) void copy24bit(byte[] target, int offset)
    Copy this string, as a sequence of 24-bit characters, to a specified array.
(package private) void copy32bit(int[] target, int offset)
    Copy this string, as a sequence of 32-bit codepoints, to a specified array.
String details()
boolean equals(Object o)
    Test whether this string is equal to another under the rules of the codepoint collation.
char[] getCharArray()
int getWidth()
    Get the number of bits needed to hold all the characters in this string.
int hashCode()
    Compute a hashCode.
long indexOf(int codePoint, long from)
    Get the first position, at or beyond from, where a given codepoint appears in this string.
long indexOf(UnicodeString other, long from)
    Get the first position, at or beyond from, where another string appears as a substring of this string, comparing codepoints.
long indexWhere(IntPredicate predicate, long from)
    Get the position of the first codepoint that satisfies the given predicate, starting the search at a given position in the string.
boolean isEmpty()
    Determine whether the string is a zero-length string.
long length()
    Get the length of this string, in codepoints.
int length32()
    Get the length of the string, provided it is less than 2^31 characters.
UnicodeString substring(long start, long end)
    Get a substring of this string (following the rules of String.substring(int, int), but measuring Unicode codepoints rather than 16-bit code units).
String toString()
    Convert to a string.
Methods inherited from class net.sf.saxon.str.UnicodeString
asAtomic, checkSubstringBounds, concat, copy8bit, economize, estimatedLength, hasSubstring, indexOf, prefix, requireInt, requireNonNegativeInt, substring, tidy, verifyCharacters
-
Constructor Detail
-
Twine16
protected Twine16(char[] chars)
Protected constructor.
Parameters:
chars - the 16-bit characters comprising the string: must not include any surrogates
-
Twine16
public Twine16(char[] chars, int start, int len)
Constructor taking an array of 16-bit chars, or a substring thereof. The caller warrants that the characters are all BMP characters (no surrogate pairs). The character array is copied, so it can be reused and modified after the call.
Parameters:
chars - the array of characters (must not include any surrogates)
start - start offset into the array
len - the number of characters to be included
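The copy-on-construction behaviour described above can be illustrated with a small standalone sketch. This is not Saxon's implementation; the class CopiedChars is a hypothetical stand-in that uses Arrays.copyOfRange to take a private copy of the requested slice.

```java
import java.util.Arrays;

// Hypothetical stand-in for a class that copies its input array on construction.
final class CopiedChars {
    private final char[] chars;

    CopiedChars(char[] source, int start, int len) {
        // copyOfRange allocates a fresh array, so later writes to source are not visible here
        this.chars = Arrays.copyOfRange(source, start, start + len);
    }

    @Override
    public String toString() {
        return new String(chars);
    }

    public static void main(String[] args) {
        char[] buffer = "hello world".toCharArray();
        CopiedChars s = new CopiedChars(buffer, 6, 5);
        buffer[6] = 'W';          // the caller reuses and modifies its buffer
        System.out.println(s);    // prints "world": the copy is unaffected
    }
}
```

The cost is one extra allocation per string; the benefit is that callers can keep reusing a scratch buffer.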
-
Method Detail
-
getCharArray
public char[] getCharArray()
-
length
public long length()
Get the length of this string, in codepoints.
Specified by:
length in class UnicodeString
Returns:
the length of the string in Unicode codepoints
-
length32
public int length32()
Description copied from class: UnicodeString
Get the length of the string, provided it is less than 2^31 characters.
Overrides:
length32 in class UnicodeString
Returns:
the length of the string if it fits within a Java int
-
substring
public UnicodeString substring(long start, long end)
Get a substring of this string (following the rules of String.substring(int, int), but measuring Unicode codepoints rather than 16-bit code units).
Specified by:
substring in class UnicodeString
Parameters:
start - the offset of the first character to be included in the result, counting Unicode codepoints
end - the offset of the first character to be excluded from the result, counting Unicode codepoints
Returns:
the substring
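The difference between codepoint offsets and 16-bit code-unit offsets can be seen using java.lang.String alone. The sketch below is an illustration under that assumption, with a hypothetical helper name; it is not how Saxon implements substring. For a string with no surrogates, as in Twine16, the two kinds of offset coincide.

```java
// Hypothetical helper: substring with offsets counted in codepoints,
// built on java.lang.String rather than on UnicodeString.
final class CodepointSubstring {

    static String substringByCodepoints(String s, int start, int end) {
        // Translate codepoint offsets into UTF-16 code-unit offsets
        int from = s.offsetByCodePoints(0, start);
        int to = s.offsetByCodePoints(from, end - start);
        return s.substring(from, to);
    }

    public static void main(String[] args) {
        // BMP-only text: codepoint offsets equal code-unit offsets
        System.out.println(substringByCodepoints("hello", 1, 3));
        // With an astral character (U+1F600) the two kinds of offset diverge
        String mixed = "a\uD83D\uDE00bc";
        System.out.println(substringByCodepoints(mixed, 1, 3).length()); // length in code units
    }
}
```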
-
codePointAt
public int codePointAt(long index) throws IndexOutOfBoundsException
Description copied from class: UnicodeString
Get the code point at a given position in the string.
Specified by:
codePointAt in class UnicodeString
Parameters:
index - the given position (0-based)
Returns:
the code point at the given position
Throws:
IndexOutOfBoundsException - if the index is out of range
-
indexOf
public long indexOf(int codePoint, long from)
Get the first position, at or beyond from, where a given codepoint appears in this string.
Specified by:
indexOf in class UnicodeString
Parameters:
codePoint - the sought codepoint
from - the position (0-based) where searching is to start (counting in codepoints)
Returns:
the first position where the codepoint is found, or -1 if it is not found
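As an illustration of the contract above, here is a minimal linear-scan sketch over a plain char array. It is an assumption about how such a search might look, not Saxon's code; the class and method names are hypothetical. Because the array holds only BMP characters, a codepoint above 0xFFFF can never match.

```java
// Hypothetical sketch of a codepoint search over a surrogate-free char array.
final class CodepointSearch {

    static long indexOf(char[] chars, int codePoint, long from) {
        // A non-BMP codepoint cannot occur in a 16-bit, surrogate-free string
        if (codePoint > 0xFFFF || from >= chars.length) {
            return -1;
        }
        for (int i = (int) Math.max(from, 0); i < chars.length; i++) {
            if (chars[i] == codePoint) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        char[] text = "banana".toCharArray();
        System.out.println(indexOf(text, 'n', 0));
        System.out.println(indexOf(text, 'n', 3));
        System.out.println(indexOf(text, 0x1F600, 0));
    }
}
```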
-
indexOf
public long indexOf(UnicodeString other, long from)
Get the first position, at or beyond from, where another string appears as a substring of this string, comparing codepoints.
Overrides:
indexOf in class UnicodeString
Parameters:
other - the other (sought) string
from - the position (0-based) where searching is to start (counting in codepoints)
Returns:
the first position where the substring is found, or -1 if it is not found
-
isEmpty
public boolean isEmpty()
Determine whether the string is a zero-length string. This may be more efficient than testing whether the length is equal to zero.
Overrides:
isEmpty in class UnicodeString
Returns:
true if the string is zero length
-
copy16bit
void copy16bit(char[] target, int offset)
Description copied from class: UnicodeString
Copy this string, as a sequence of 16-bit characters, to a specified array.
Overrides:
copy16bit in class UnicodeString
Parameters:
target - the target array: the caller must ensure there is sufficient capacity
offset - the position in the target array
-
copy24bit
void copy24bit(byte[] target, int offset)
Description copied from class: UnicodeString
Copy this string, as a sequence of 24-bit characters, to a specified array.
Overrides:
copy24bit in class UnicodeString
Parameters:
target - the target array: the caller must ensure there is sufficient capacity
offset - the position in the target array as a byte offset (that is, the character offset times 3)
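To make the byte-offset convention concrete, the sketch below widens 16-bit characters into three bytes each. The byte order (most significant byte first) is an assumption made for this illustration and is not confirmed by this page; the class and both method names are hypothetical.

```java
// Hypothetical sketch of a 24-bit-per-codepoint layout.
// ASSUMPTION: bytes are stored most significant first; Saxon's actual order may differ.
final class Copy24 {

    /** Writes each char as three bytes, starting at the given byte offset. */
    static void copy24bit(char[] source, byte[] target, int byteOffset) {
        int j = byteOffset;
        for (char c : source) {
            target[j++] = 0;                 // bits 16-23 are always zero for BMP characters
            target[j++] = (byte) (c >> 8);   // bits 8-15
            target[j++] = (byte) (c & 0xff); // bits 0-7
        }
    }

    /** Reads back the codepoint stored at a character index (not a byte index). */
    static int codePointAt(byte[] data, int index) {
        int j = index * 3;
        return ((data[j] & 0xff) << 16) | ((data[j + 1] & 0xff) << 8) | (data[j + 2] & 0xff);
    }

    public static void main(String[] args) {
        byte[] target = new byte[9];
        // Character offset 1 corresponds to byte offset 3
        copy24bit(new char[] {'A', '\u4e2d'}, target, 1 * 3);
        System.out.println(Integer.toHexString(codePointAt(target, 2)));
    }
}
```

Passing a character offset where a byte offset is expected would misalign every subsequent codepoint, which is why the parameter description spells out the factor of 3.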
-
copy32bit
void copy32bit(int[] target, int offset)
Description copied from class: UnicodeString
Copy this string, as a sequence of 32-bit codepoints, to a specified array.
Overrides:
copy32bit in class UnicodeString
Parameters:
target - the target array: the caller must ensure there is sufficient capacity
offset - the position in the target array as a codepoint offset
-
getWidth
public int getWidth()
Description copied from class: UnicodeString
Get the number of bits needed to hold all the characters in this string.
Specified by:
getWidth in class UnicodeString
Returns:
7 for ASCII characters (possibly unused), 8 for Latin-1, 16 for BMP, 24 for general Unicode
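The width categories listed above can be illustrated by classifying arbitrary text by its largest codepoint. This is a hedged sketch built on java.lang.String, with hypothetical names; it does not show what Twine16.getWidth() itself returns, and it leaves out the 7-bit case that the description marks as doubtful.

```java
// Hypothetical classifier: smallest of the documented widths (8, 16, 24)
// that can hold every codepoint of a java.lang.String.
final class WidthSketch {

    static int widthOf(String s) {
        // Largest codepoint in the string; 0 for the empty string
        int max = s.codePoints().max().orElse(0);
        if (max <= 0xFF) {
            return 8;   // Latin-1
        }
        if (max <= 0xFFFF) {
            return 16;  // Basic Multilingual Plane
        }
        return 24;      // general Unicode
    }

    public static void main(String[] args) {
        System.out.println(widthOf("abc"));
        System.out.println(widthOf("\u4e2d\u6587"));
        System.out.println(widthOf("\uD83D\uDE00"));
    }
}
```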
-
codePoints
public IntIterator codePoints()
Get an iterator over the Unicode codepoints in the value. These will always be full codepoints, never surrogates (surrogate pairs are combined where necessary).
Specified by:
codePoints in class UnicodeString
Returns:
a sequence of Unicode codepoints
-
hashCode
public int hashCode()
Compute a hashCode. All implementations of UnicodeString use compatible hash codes and the hashing algorithm is therefore identical to that for java.lang.String. This means that for strings containing astral characters, the hash code needs to be computed by decomposing an astral character into a surrogate pair.
Overrides:
hashCode in class UnicodeString
Returns:
the hash code
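For a surrogate-free 16-bit string no decomposition is needed, so the java.lang.String algorithm (h = 31 * h + c over each char) applies directly. The sketch below is a standalone illustration of that documented algorithm with hypothetical names; it is not Saxon's source and omits any caching.

```java
// Hypothetical sketch: the java.lang.String hash function applied to a char array.
final class StringCompatibleHash {

    static int hash(char[] chars) {
        int h = 0;
        for (char c : chars) {
            h = 31 * h + c;   // same recurrence as String.hashCode(); int overflow wraps
        }
        return h;
    }

    public static void main(String[] args) {
        String s = "Saxon \u4e2d\u6587";
        // Prints true: the result agrees with String.hashCode() for the same characters
        System.out.println(hash(s.toCharArray()) == s.hashCode());
    }
}
```

Compatible hash codes allow differently represented strings with the same codepoints to land in the same bucket of a hash-based collection.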
-
equals
public boolean equals(Object o)
Test whether this string is equal to another under the rules of the codepoint collation. The type annotation is ignored.
Overrides:
equals in class UnicodeString
Parameters:
o - the value to be compared with this value
Returns:
true if the strings are equal on a codepoint-by-codepoint basis
-
compareTo
public int compareTo(UnicodeString other)
Description copied from class: UnicodeString
Compare this string to another using codepoint comparison.
Specified by:
compareTo in interface Comparable<UnicodeString>
Overrides:
compareTo in class UnicodeString
Parameters:
other - the other string
Returns:
-1 if this string comes first, 0 if they are equal, +1 if the other string comes first
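When both operands are surrogate-free 16-bit strings, comparing char values is the same as comparing codepoints. The following sketch illustrates the -1/0/+1 contract under that assumption; the class name is hypothetical and the code is not taken from Saxon. Note that String.compareTo returns a difference rather than a normalized sign.

```java
// Hypothetical sketch of codepoint comparison for two surrogate-free char arrays.
final class CodepointCompare {

    static int compare(char[] a, char[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            if (a[i] != b[i]) {
                return a[i] < b[i] ? -1 : 1;
            }
        }
        // All shared positions are equal: the shorter string sorts first
        return Integer.signum(a.length - b.length);
    }

    public static void main(String[] args) {
        System.out.println(compare("abc".toCharArray(), "abd".toCharArray()));
        System.out.println(compare("abc".toCharArray(), "abc".toCharArray()));
        System.out.println(compare("abcd".toCharArray(), "abc".toCharArray()));
    }
}
```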
-
indexWhere
public long indexWhere(IntPredicate predicate, long from)
Get the position of the first codepoint that satisfies the given predicate, starting the search at a given position in the string.
Specified by:
indexWhere in class UnicodeString
Parameters:
predicate - condition that the codepoint must satisfy
from - the position from which the search should start (0-based)
Returns:
the position (0-based) of the first codepoint to match the predicate, or -1 if not found
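A predicate-driven search of this kind can be sketched over a plain char array. The example assumes the predicate type is java.util.function.IntPredicate, which this page does not state explicitly; the class name is hypothetical and the code is an illustration rather than Saxon's implementation.

```java
import java.util.function.IntPredicate;

// Hypothetical sketch: find the first codepoint matching a predicate.
// ASSUMPTION: IntPredicate here is java.util.function.IntPredicate.
final class PredicateSearch {

    static long indexWhere(char[] chars, IntPredicate predicate, long from) {
        if (from >= chars.length) {
            return -1;
        }
        for (int i = (int) Math.max(from, 0); i < chars.length; i++) {
            if (predicate.test(chars[i])) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        char[] text = "abc 123".toCharArray();
        // Position of the first digit at or after position 0
        System.out.println(indexWhere(text, Character::isDigit, 0));
        // No upper-case letter is present
        System.out.println(indexWhere(text, Character::isUpperCase, 0));
    }
}
```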
-
details
public String details()