Class DutchWordTokenizer

  • All Implemented Interfaces:
    org.languagetool.tokenizers.Tokenizer

    public class DutchWordTokenizer
    extends org.languagetool.tokenizers.WordTokenizer
    • Field Summary

      Fields 
      Modifier and Type Field Description
      private java.lang.String nlTokenizingChars  
      private static java.util.List<java.lang.String> QUOTES  
    • Method Summary

      Modifier and Type Method Description
      private boolean endsWithQuote(java.lang.String token)
      java.lang.String getTokenizingCharacters()
      private boolean startsWithQuote(java.lang.String token)
      java.util.List<java.lang.String> tokenize(java.lang.String text)
      Tokenizes like WordTokenizer, except that words with an apostrophe in the middle, such as "oma's", are not split at the apostrophe.
      • Methods inherited from class org.languagetool.tokenizers.WordTokenizer

        getProtocols, isEMail, isUrl, joinEMails, joinEMailsAndUrls, joinUrls
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • QUOTES

        private static final java.util.List<java.lang.String> QUOTES
      • nlTokenizingChars

        private final java.lang.String nlTokenizingChars
    • Constructor Detail

      • DutchWordTokenizer

        public DutchWordTokenizer()
    • Method Detail

      • tokenize

        public java.util.List<java.lang.String> tokenize(java.lang.String text)
        Tokenizes like WordTokenizer, except that words with an apostrophe in the middle, such as "oma's", are not split at the apostrophe.
        Specified by:
        tokenize in interface org.languagetool.tokenizers.Tokenizer
        Overrides:
        tokenize in class org.languagetool.tokenizers.WordTokenizer
        Parameters:
        text - Text to tokenize
        Returns:
        List of tokens
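        Example (illustrative only):
        The following is a minimal, self-contained sketch of the behavior described for tokenize, not the LanguageTool implementation. The class name ApostropheTokenizerSketch, the delimiter set DELIMS, and the quote handling are all assumptions made for illustration. The sketch leaves the apostrophe out of the delimiter set so that "oma's" stays whole, then splits off apostrophes that sit at the very start or end of a token, which is one plausible role for helpers such as startsWithQuote and endsWithQuote.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical sketch; not part of org.languagetool.
class ApostropheTokenizerSketch {

    // Assumed delimiter set. The apostrophe is deliberately absent,
    // so a word such as "oma's" survives the first splitting pass.
    private static final String DELIMS = " \t\n\r.,;:!?()\"";

    static List<String> tokenize(String text) {
        List<String> out = new ArrayList<>();
        // returnDelims = true keeps whitespace and punctuation as tokens.
        StringTokenizer st = new StringTokenizer(text, DELIMS, true);
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            int start = 0;
            int end = token.length();
            // Emit leading apostrophes as separate tokens.
            while (start < end && token.charAt(start) == '\'') {
                out.add("'");
                start++;
            }
            // Count trailing apostrophes; they are emitted after the word.
            int trailing = 0;
            while (end > start && token.charAt(end - 1) == '\'') {
                trailing++;
                end--;
            }
            // The remainder keeps any apostrophe in its middle.
            if (start < end) {
                out.add(token.substring(start, end));
            }
            for (int i = 0; i < trailing; i++) {
                out.add("'");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Dat zijn oma's 'oude' foto's."));
    }
}
```

        In this sketch, "oma's" and "foto's" each come back as one token, while the quoted word 'oude' yields three tokens: an apostrophe, the word, and another apostrophe.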
      • startsWithQuote

        private boolean startsWithQuote(java.lang.String token)
      • endsWithQuote

        private boolean endsWithQuote(java.lang.String token)
      • getTokenizingCharacters

        public java.lang.String getTokenizingCharacters()
        Overrides:
        getTokenizingCharacters in class org.languagetool.tokenizers.WordTokenizer
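        Example (illustrative only):
        The sketch below shows, under stated assumptions, how overriding a delimiter accessor can change what a shared tokenize method does. BaseTokenizerSketch and DutchLikeTokenizerSketch are hypothetical stand-ins, not LanguageTool classes, and the delimiter string is invented. The subclass caches its own character set in a final field, loosely mirroring the presence of the nlTokenizingChars field on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical base class standing in for a generic word tokenizer.
class BaseTokenizerSketch {

    // Assumed default delimiter set; the apostrophe is included.
    public String getTokenizingCharacters() {
        return " .,;:!?'";
    }

    // Splits on whatever the (possibly overridden) accessor returns.
    public List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st =
            new StringTokenizer(text, getTokenizingCharacters(), true);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }
}

// Hypothetical subclass: same algorithm, narrower delimiter set.
class DutchLikeTokenizerSketch extends BaseTokenizerSketch {

    // Computed once: the inherited set minus the apostrophe.
    private final String chars =
        super.getTokenizingCharacters().replace("'", "");

    @Override
    public String getTokenizingCharacters() {
        return chars;
    }
}
```

        With these assumed classes, the base tokenizer splits "oma's" into three tokens, while the subclass returns it as one, without redefining tokenize.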