primaryOrder
- Parameters:
ce - the collation element
- Returns:
the element's 16-bit primary order.
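The field extraction can be pictured as plain bit arithmetic. The following is a minimal sketch, assuming the 32-bit collation element holds the primary order in its upper 16 bits, the secondary order in the next 8 bits, and the tertiary order in the low 8 bits. The class name, the helper names, and the sample value are hypothetical and are not part of the library.

```java
public class CollationElementFields {
    // Assumed layout (not confirmed by this page): bits 31..16 primary,
    // bits 15..8 secondary, bits 7..0 tertiary.
    static int primary(int ce) {
        return (ce >>> 16) & 0xFFFF;
    }

    static int secondary(int ce) {
        return (ce >>> 8) & 0xFF;
    }

    static int tertiary(int ce) {
        return ce & 0xFF;
    }

    public static void main(String[] args) {
        // Hypothetical element value, not produced by any real collator.
        int ce = 0x2A4B0503;
        System.out.println("primary   0x" + Integer.toHexString(primary(ce)));
        System.out.println("secondary 0x" + Integer.toHexString(secondary(ce)));
        System.out.println("tertiary  0x" + Integer.toHexString(tertiary(ce)));
    }
}
```

The unsigned shift (>>>) matters here: a collation element with the top bit set is a negative int, and a signed shift would smear the sign bit into the result.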
CollationElementIterator is an iterator created by a RuleBasedCollator to walk through a string. The result of each iteration is a 32-bit collation element (CE) that defines the ordering priority of the next character or sequence of characters in the source string.
For illustration, consider the following in Slovak and in traditional Spanish collation:
"ca" -> the first collation element is CE('c') and the second collation element is CE('a').
"cha" -> the first collation element is CE('ch') and the second collation element is CE('a').
And in German phonebook collation, since the character 'æ' is a composed character of 'a' and 'e', the iterator returns two collation elements for the single character 'æ':
"æb" -> the first collation element is CE('a'), the second collation element is CE('e'), and the third collation element is CE('b').
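The contraction case can be tried out with a small program. This is a sketch only: so that it runs without the ICU4J jar, it uses the JDK's java.text.RuleBasedCollator and java.text.CollationElementIterator, which are a stand-in for the ICU classes and are assumed here to behave alike for this simple case. The rule string and the class and method names are hypothetical.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

public class ContractionCount {
    // Hypothetical tailoring in which "ch" collates as one unit after "c".
    static final String RULES = "< a < c < ch";

    // Counts the collation elements that the iterator returns for the text.
    static int countElements(String text) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator(RULES);
            CollationElementIterator it = collator.getCollationElementIterator(text);
            int count = 0;
            // next() returns NULLORDER once the end of the text is passed.
            while (it.next() != CollationElementIterator.NULLORDER) {
                count++;
            }
            return count;
        } catch (ParseException e) {
            throw new IllegalStateException("rule string did not parse", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("ca  -> " + countElements("ca") + " elements");
        System.out.println("cha -> " + countElements("cha") + " elements");
    }
}
```

Under this tailoring the three characters of "cha" should yield the same number of elements as the two characters of "ca", because "ch" is consumed as one contraction.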
For collation ordering comparison, the collation element results cannot be compared simply by using basic arithmetic operators such as <, ==, or >; further processing has to be done. Details can be found in the ICU User Guide. An example of using the CollationElementIterator for collation ordering comparison is the class StringSearch.
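One form that such further processing can take is a comparison on primary orders only, skipping elements whose primary order is zero. The sketch below is a deliberately reduced illustration and is not how StringSearch works. It assumes the JDK's java.text classes as a stand-in for the ICU ones, and the rule string, class name, and helper names are hypothetical.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

public class PrimaryCompare {
    // Hypothetical tailoring: case is only a tertiary difference.
    static final String RULES = "< a, A < b, B";

    // Returns the next non-zero primary order, or -1 at the end of the text.
    static int nextPrimary(CollationElementIterator it) {
        int ce;
        while ((ce = it.next()) != CollationElementIterator.NULLORDER) {
            int primary = CollationElementIterator.primaryOrder(ce);
            if (primary != 0) {
                return primary;
            }
        }
        return -1;
    }

    // Compares two strings at primary strength only.
    static int comparePrimary(String left, String right) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator(RULES);
            CollationElementIterator l = collator.getCollationElementIterator(left);
            CollationElementIterator r = collator.getCollationElementIterator(right);
            while (true) {
                int pl = nextPrimary(l);
                int pr = nextPrimary(r);
                if (pl != pr) {
                    return Integer.compare(pl, pr);
                }
                if (pl == -1) {
                    return 0; // both texts are exhausted
                }
            }
        } catch (ParseException e) {
            throw new IllegalStateException("rule string did not parse", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(comparePrimary("ab", "AB"));
        System.out.println(comparePrimary("ab", "ba") < 0);
    }
}
```

Comparing the raw 32-bit elements with < or > would instead let tertiary bits decide between "ab" and "AB", which is why the fields have to be extracted before they are compared.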
To construct a CollationElementIterator object, users call the method getCollationElementIterator() on a RuleBasedCollator that defines the desired sorting order.
Example:
    String testString = "This is a test";
    RuleBasedCollator rbc = new RuleBasedCollator("&a<b");
    CollationElementIterator iterator = rbc.getCollationElementIterator(testString);
    int order;
    // next() returns NULLORDER once we have passed the end of the iteration
    while ((order = iterator.next()) != CollationElementIterator.NULLORDER) {
        if (order != CollationElementIterator.IGNORABLE) {
            // order is valid and not ignorable, so we do something with it
            int primaryOrder = CollationElementIterator.primaryOrder(order);
            System.out.println("Next primary order 0x" + Integer.toHexString(primaryOrder));
        }
    }
The method next() returns the collation order of the next character based on the comparison level of the collator. The method previous() returns the collation order of the previous character based on the comparison level of the collator.
The CollationElementIterator moves in only one direction between calls to reset(), setOffset(), or setText(). That is, calls to next() and previous() cannot be interleaved. Whenever previous() is to be called after next(), or vice versa, reset(), setOffset(), or setText() has to be called first to reset the state. reset() and setText() shift the current position to either the end or the start of the string, and setOffset() shifts it to the specified position. Hence the next call to next() or previous() returns the first or last collation order, or the collation order at the specified position. If the direction is changed without one of these calls, the result is undefined.
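The direction rule can be exercised as follows. This is a minimal sketch, assuming the JDK's java.text classes as a stand-in for the ICU ones (their behavior after reset() may differ, so the sketch positions the iterator explicitly with setOffset()). The rule string, class name, and helper names are hypothetical.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class DirectionChange {
    // Collects every element from the current position to the end.
    static List<Integer> forward(CollationElementIterator it) {
        List<Integer> out = new ArrayList<>();
        int ce;
        while ((ce = it.next()) != CollationElementIterator.NULLORDER) {
            out.add(ce);
        }
        return out;
    }

    // Collects every element from the current position back to the start.
    static List<Integer> backward(CollationElementIterator it) {
        List<Integer> out = new ArrayList<>();
        int ce;
        while ((ce = it.previous()) != CollationElementIterator.NULLORDER) {
            out.add(ce);
        }
        return out;
    }

    static boolean backwardMirrorsForward(String text) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator("< a < b < c");
            CollationElementIterator it = collator.getCollationElementIterator(text);
            List<Integer> fwd = forward(it);
            // Reposition explicitly before changing direction.
            it.setOffset(text.length());
            List<Integer> bwd = backward(it);
            Collections.reverse(bwd);
            return fwd.equals(bwd);
        } catch (ParseException e) {
            throw new IllegalStateException("rule string did not parse", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(backwardMirrorsForward("abcab"));
    }
}
```

The two passes are kept in separate loops on purpose: neither loop mixes next() and previous(), and the only switch of direction happens right after a repositioning call.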
This class is not subclassable.
static final int IGNORABLE
static final int NULLORDER
boolean equals(Object that)
int getMaxExpansion(int ce)
int getOffset()
int hashCode()
int next()
int previous()
static final int primaryOrder(int ce)
void reset()
static final int secondaryOrder(int ce)
void setOffset(int newOffset)
void setText(UCharacterIterator source)
void setText(String source)
void setText(CharacterIterator source)
static final int tertiaryOrder(int ce)
See class documentation for an example of use.
ce - the collation element
If setOffset(offset) sets the index in the middle of a contraction, getOffset() returns the index of the first character in the contraction, which may not be equal to the original offset that was set. Hence calling getOffset() immediately after setOffset(offset) does not guarantee that the original offset will be returned.
This iterator iterates over a sequence of collation elements that were built from the string. Because there isn't necessarily a one-to-one mapping from characters to collation elements, this doesn't mean the same thing as "return the collation element [or ordering priority] of the next character in the string".
This function returns the collation element that the iterator is currently pointing to, and then updates the internal pointer to point to the next element.
This iterator iterates over a sequence of collation elements that were built from the string. Because there isn't necessarily a one-to-one mapping from characters to collation elements, this doesn't mean the same thing as "return the collation element [or ordering priority] of the previous character in the string".
This function updates the iterator's internal pointer to point to the collation element preceding the one it's currently pointing to and then returns that element, while next() returns the current element and then updates the pointer.
If the RuleBasedCollator used by this iterator has had its attributes changed, calling reset() will reinitialize the iterator to use the new attributes.
If offset is in the middle of a contracting character sequence, the iterator is adjusted to the start of the contracting sequence. This means that getOffset() is not guaranteed to return the same value set by this method.
If the decomposition mode is on, and offset is in the middle of a decomposable range of source text, the iterator may not return a correct result for the next forwards or backwards iteration. The user must ensure that the offset is not in the middle of a decomposable range.
newOffset - the character offset into the original source string to set. Note that this is not an offset into the corresponding sequence of collation elements.
source - the new source string for iteration.
The source iterator's integrity will be preserved since a new copy will be created for use.
source - the new source string iterator for iteration.
source - the new source string iterator for iteration.
ce - a collation element returned by previous() or next().