Class RomanianWordTokenizer

  • All Implemented Interfaces:
    org.languagetool.tokenizers.Tokenizer

    public class RomanianWordTokenizer
    extends org.languagetool.tokenizers.WordTokenizer
    Tokenizes a sentence into words. Punctuation and whitespace each get their own token. Behaves like EnglishWordTokenizer except for the handling of some characters, e.g. " - '
    Since:
    20.02.2009 19:53:50
    • Method Summary

      Modifier and Type                   Method                             Description
      java.util.List<java.lang.String>    tokenize(java.lang.String text)
      • Methods inherited from class org.languagetool.tokenizers.WordTokenizer

        getProtocols, getTokenizingCharacters, isEMail, isUrl, joinEMails, joinEMailsAndUrls, joinUrls
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • RomanianWordTokenizer

        public RomanianWordTokenizer()
    • Method Detail

      • tokenize

        public java.util.List<java.lang.String> tokenize(java.lang.String text)
        Specified by:
        tokenize in interface org.languagetool.tokenizers.Tokenizer
        Overrides:
        tokenize in class org.languagetool.tokenizers.WordTokenizer
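
        The behavior described for tokenize (punctuation and whitespace each become a token of their own) can be illustrated with a minimal, standard-library-only sketch. This is not the LanguageTool implementation: the class name TokenizeSketch and the DELIMITERS set are hypothetical and chosen only for illustration, and the real RomanianWordTokenizer defines its own tokenizing characters and additionally inherits URL and e-mail handling from WordTokenizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Illustrative sketch only; not part of org.languagetool.
class TokenizeSketch {

    // Hypothetical delimiter set (assumption): whitespace, common punctuation,
    // and the characters " - ' mentioned in the class description.
    private static final String DELIMITERS = " \t\n\r.,;:!?()[]\"'-";

    // Splits text so that every delimiter character is returned as its own token.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        // returnDelims = true keeps the delimiters instead of discarding them
        StringTokenizer st = new StringTokenizer(text, DELIMITERS, true);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints the token list for a short Romanian phrase
        System.out.println(tokenize("Ce-ai zis, Ana?"));
    }
}
```

        With the assumed delimiter set, the hyphen in "Ce-ai" is split off as a separate token, as are the comma, the spaces and the question mark. Against the actual library, the equivalent call would be new RomanianWordTokenizer().tokenize(text), whose exact output depends on the tokenizing characters that class defines.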