Package net.loomchild.segment.srx.legacy
Class FastTextIterator
java.lang.Object
  net.loomchild.segment.AbstractTextIterator
    net.loomchild.segment.srx.legacy.FastTextIterator

All Implemented Interfaces:
java.util.Iterator<java.lang.String>, TextIterator
public class FastTextIterator extends AbstractTextIterator
Represents a fast text iterator that splits text according to SRX rules.
-
-
Field Summary
Modifier and Type                 Field
private ReaderMatcher             breakingMatcher
private int                       endPosition
private MergedPattern             mergedPattern
private java.lang.String          segment
private int                       startPosition
private java.lang.CharSequence    text
-
Constructor Summary
FastTextIterator(SrxDocument document, java.lang.String languageCode, java.io.Reader reader)
    Creates a streaming text iterator with no additional parameters.
FastTextIterator(SrxDocument document, java.lang.String languageCode, java.io.Reader reader, java.util.Map<java.lang.String,java.lang.Object> parameterMap)
    Creates a streaming text iterator that obtains language rules from the given document using the given language code.
FastTextIterator(SrxDocument document, java.lang.String languageCode, java.lang.CharSequence text)
    Creates a text iterator with no additional parameters.
FastTextIterator(SrxDocument document, java.lang.String languageCode, java.lang.CharSequence text, java.util.Map<java.lang.String,java.lang.Object> parameterMap)
    Creates a text iterator that obtains language rules from the given document using the given language code.
-
Method Summary
Modifier and Type     Method
boolean               hasNext()
java.lang.String      next()
-
Methods inherited from class net.loomchild.segment.AbstractTextIterator
remove, toString
-
Field Detail
-
text
private java.lang.CharSequence text
-
segment
private java.lang.String segment
-
mergedPattern
private MergedPattern mergedPattern
-
breakingMatcher
private ReaderMatcher breakingMatcher
-
startPosition
private int startPosition
-
endPosition
private int endPosition
-
-
Constructor Detail
-
FastTextIterator
public FastTextIterator(SrxDocument document, java.lang.String languageCode, java.lang.CharSequence text, java.util.Map<java.lang.String,java.lang.Object> parameterMap)
Creates a text iterator that obtains language rules from the given document using the given language code. To retrieve the language rules, it calls SrxDocument.getLanguageRuleList(String). Supported parameters: SrxTextIterator.MAX_LOOKBEHIND_CONSTRUCT_LENGTH_PARAMETER.
Parameters:
document - document containing language rules
languageCode - language code used to select the rules
text - text to be segmented
parameterMap - additional segmentation parameters
-
FastTextIterator
public FastTextIterator(SrxDocument document, java.lang.String languageCode, java.lang.CharSequence text)
Creates a text iterator with no additional parameters.
Parameters:
document - document containing language rules
languageCode - language code used to select the rules
text - text to be segmented
See Also:
FastTextIterator(SrxDocument, String, CharSequence, Map)
-
FastTextIterator
public FastTextIterator(SrxDocument document, java.lang.String languageCode, java.io.Reader reader, java.util.Map<java.lang.String,java.lang.Object> parameterMap)
Creates a streaming text iterator that obtains language rules from the given document using the given language code. To retrieve the language rules, it calls SrxDocument.getLanguageRuleList(String). To handle streams, it uses ReaderCharSequence, so not all possible regular expressions are accepted. See ReaderCharSequence for details. Supported parameters: SrxTextIterator.BUFFER_LENGTH_PARAMETER, SrxTextIterator.MAX_LOOKBEHIND_CONSTRUCT_LENGTH_PARAMETER.
Parameters:
document - document containing language rules
languageCode - language code used to select the rules
reader - reader from which text will be read
parameterMap - additional segmentation parameters
-
FastTextIterator
public FastTextIterator(SrxDocument document, java.lang.String languageCode, java.io.Reader reader)
Creates a streaming text iterator with no additional parameters.
Parameters:
document - document containing language rules
languageCode - language code used to select the rules
reader - reader from which text will be read
See Also:
FastTextIterator(SrxDocument, String, Reader, Map)
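
Example (illustrative sketch):
The hasNext()/next() contract of this class can be illustrated with a small stand-alone iterator. The sketch below is not the library's implementation and does not use SRX rules: SimpleBreakIterator, SimpleBreakIteratorDemo, segments(...) and the sentence-break pattern are hypothetical names and values chosen for illustration. It assumes a single java.util.regex break pattern and borrows the field names text, breakingMatcher, startPosition and endPosition only to show how such state might drive iteration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; NOT the library's FastTextIterator.
// It splits on one java.util.regex break pattern instead of SRX rules.
final class SimpleBreakIterator implements Iterator<String> {
    private final CharSequence text;
    private final Matcher breakingMatcher;
    private int startPosition = 0; // index where the next segment starts

    SimpleBreakIterator(CharSequence text, Pattern breakPattern) {
        this.text = text;
        this.breakingMatcher = breakPattern.matcher(text);
    }

    @Override
    public boolean hasNext() {
        // Segments remain as long as unread text remains.
        return startPosition < text.length();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        // Default: the last segment runs to the end of the text.
        int endPosition = text.length();
        int from = startPosition;
        // Find the first break that ends after startPosition, so that
        // a zero-width match at startPosition cannot stall the iterator.
        while (from <= text.length() && breakingMatcher.find(from)) {
            if (breakingMatcher.end() > startPosition) {
                endPosition = breakingMatcher.end();
                break;
            }
            from = breakingMatcher.end() + 1;
        }
        String segment = text.subSequence(startPosition, endPosition).toString();
        startPosition = endPosition;
        return segment;
    }

    // Convenience helper (hypothetical): collects all segments into a list.
    static List<String> segments(CharSequence text, Pattern breakPattern) {
        List<String> result = new ArrayList<>();
        Iterator<String> iterator = new SimpleBreakIterator(text, breakPattern);
        while (iterator.hasNext()) {
            result.add(iterator.next());
        }
        return result;
    }
}

class SimpleBreakIteratorDemo {
    public static void main(String[] args) {
        // Assumed rule: break after whitespace that follows '.', '!' or '?'.
        Pattern sentenceBreak = Pattern.compile("(?<=[.!?])\\s+");
        String input = "Hello world. How are you? Fine.";
        for (String segment : SimpleBreakIterator.segments(input, sentenceBreak)) {
            System.out.println("[" + segment + "]");
        }
    }
}
```

In this sketch each segment keeps its trailing whitespace, so concatenating the segments reproduces the input. With the actual class, the same loop over hasNext() and next() would apply, with the iterator constructed from an SrxDocument and a language code as described in the constructor details.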