Class LanguageProfileImpl

  • All Implemented Interfaces:
    LanguageProfile

    public final class LanguageProfileImpl
    extends java.lang.Object
    implements LanguageProfile

    This class is immutable.
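One common way to make a class holding a nested map immutable is to take a deep, unmodifiable copy of that map at construction time. The sketch below illustrates the idea on a map shaped like the `ngrams` field. It is an assumption about how such immutability can be achieved, not a copy of this class's code; `ImmutableCopySketch` and `deepUnmodifiableCopy` are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the documented API.
final class ImmutableCopySketch {

    /**
     * Copies the outer map and every inner map, then wraps each copy in an
     * unmodifiable view, so later changes to the source cannot leak in.
     */
    static Map<Integer, Map<String, Integer>> deepUnmodifiableCopy(
            Map<Integer, Map<String, Integer>> ngrams) {
        Map<Integer, Map<String, Integer>> copy = new HashMap<>();
        for (Map.Entry<Integer, Map<String, Integer>> e : ngrams.entrySet()) {
            copy.put(e.getKey(),
                     Collections.unmodifiableMap(new HashMap<>(e.getValue())));
        }
        return Collections.unmodifiableMap(copy);
    }
}
```

With such a copy, writes through the returned maps throw `UnsupportedOperationException`, and the caller's original maps can still change without affecting the copy.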

    • Field Summary

      Fields 
      Modifier and Type Field Description
      private @NotNull LdLocale locale  
      private @NotNull java.util.Map<java.lang.Integer,​java.util.Map<java.lang.String,​java.lang.Integer>> ngrams  
      private @NotNull LanguageProfileImpl.Stats stats  
    • Constructor Summary

      Constructors 
      Constructor Description
      LanguageProfileImpl​(@NotNull LdLocale locale, @NotNull java.util.Map<java.lang.Integer,​java.util.Map<java.lang.String,​java.lang.Integer>> ngrams)
      Not intended for direct use; create instances through the builder.
    • Method Summary

      Modifier and Type Method Description
      boolean equals​(java.lang.Object o)  
      int getFrequency​(java.lang.String gram)  
      @NotNull java.util.List<java.lang.Integer> getGramLengths()
      Tells which n-gram lengths (the n in n-gram) are used in this profile.
      @NotNull LdLocale getLocale()  
      long getMaxGramCount​(int gramLength)
      Tells how often the most frequent n-gram of the given length in this profile occurred.
      long getMinGramCount​(int gramLength)
      Tells how often the least frequent n-gram of the given length in this profile occurred.
      long getNumGramOccurrences​(int gramLength)
      Tells how often all n-grams of a certain length occurred, combined.
      int getNumGrams()
      Tells how many n-grams there are for all n-gram sizes combined.
      int getNumGrams​(int gramLength)
      Tells how many different n-grams there are for a certain n-gram size.
      int hashCode()  
      @NotNull java.lang.Iterable<java.util.Map.Entry<java.lang.String,​java.lang.Integer>> iterateGrams()
      Iterates over all n-gram strings together with their frequencies.
      @NotNull java.lang.Iterable<java.util.Map.Entry<java.lang.String,​java.lang.Integer>> iterateGrams​(int gramLength)
      Iterates over all n-gram strings of length gramLength together with their frequencies.
      private static LanguageProfileImpl.Stats makeStats​(java.util.Map<java.lang.Integer,​java.util.Map<java.lang.String,​java.lang.Integer>> ngrams)  
      java.lang.String toString()  
      • Methods inherited from class java.lang.Object

        clone, finalize, getClass, notify, notifyAll, wait, wait, wait
    • Field Detail

      • locale

        private final @NotNull LdLocale locale
      • ngrams

        private final @NotNull java.util.Map<java.lang.Integer,java.util.Map<java.lang.String,java.lang.Integer>> ngrams
    • Constructor Detail

      • LanguageProfileImpl

        LanguageProfileImpl(@NotNull LdLocale locale,
                            @NotNull java.util.Map<java.lang.Integer,java.util.Map<java.lang.String,java.lang.Integer>> ngrams)
        Not intended for direct use; create instances through the builder.
    • Method Detail

      • makeStats

        private static LanguageProfileImpl.Stats makeStats​(java.util.Map<java.lang.Integer,​java.util.Map<java.lang.String,​java.lang.Integer>> ngrams)
      • getGramLengths

        public @NotNull java.util.List<java.lang.Integer> getGramLengths()
        Description copied from interface: LanguageProfile
        Tells which n-gram lengths (the n in n-gram) are used in this profile. Example: [1,2,3]
        Specified by:
        getGramLengths in interface LanguageProfile
        Returns:
        The gram lengths, sorted from smallest to largest.
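As an illustration of the contract above, the following sketch derives the sorted gram lengths from a nested map shaped like the `ngrams` field (gram length to gram to count). It is a minimal sketch under that assumption, not this class's implementation; `GramLengthsSketch`, `gramLengths`, and the sample counts are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the documented API.
final class GramLengthsSketch {

    /** Returns the gram lengths used as keys of the nested map, in ascending order. */
    static List<Integer> gramLengths(Map<Integer, Map<String, Integer>> ngrams) {
        List<Integer> lengths = new ArrayList<>(ngrams.keySet());
        Collections.sort(lengths);
        return Collections.unmodifiableList(lengths);
    }

    /** Made-up sample data with 1-, 2- and 3-grams, inserted out of order on purpose. */
    static Map<Integer, Map<String, Integer>> sample() {
        return Map.of(
                3, Map.of("the", 5),
                1, Map.of("a", 9, "b", 2),
                2, Map.of("th", 7));
    }
}
```

Sorting explicitly matters here because the iteration order of the map's key set is not guaranteed.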
      • getFrequency

        public int getFrequency​(java.lang.String gram)
        Specified by:
        getFrequency in interface LanguageProfile
        Parameters:
        gram - the n-gram to look up, for example "a" or "foo".
        Returns:
        0-n. Also returns zero if this profile does not use n-grams of that length (for example, if no 4-grams were made).
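The lookup described above can be sketched as a two-step map access: first by the gram's length, then by the gram itself, falling back to zero at either step. This is an assumption about the behaviour, not the class's actual code; `GramFrequencySketch` and `frequency` are hypothetical names, and the sketch measures length with `String.length()` (UTF-16 code units), which may differ from how the library counts characters.

```java
import java.util.Map;

// Hypothetical helper, not part of the documented API.
final class GramFrequencySketch {

    /** Returns the stored count for the gram, or 0 if the length or the gram is unknown. */
    static int frequency(Map<Integer, Map<String, Integer>> ngrams, String gram) {
        Map<String, Integer> sameLength = ngrams.get(gram.length());
        if (sameLength == null) {
            return 0; // the profile holds no n-grams of this length
        }
        Integer count = sameLength.get(gram);
        return count == null ? 0 : count;
    }
}
```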
      • getNumGrams

        public int getNumGrams​(int gramLength)
        Description copied from interface: LanguageProfile
        Tells how many different n-grams there are for a certain n-gram size. For example, English has about 57 different 1-grams, whereas Chinese written in Han script (Hani) has thousands.
        Specified by:
        getNumGrams in interface LanguageProfile
        Parameters:
        gramLength - 1-n
        Returns:
        0-n. Returns zero if no such n-grams were made (for example, if no 4-grams were made), or if the training text contained no words that long.
      • getNumGrams

        public int getNumGrams()
        Description copied from interface: LanguageProfile
        Tells how many n-grams there are for all n-gram sizes combined.
        Specified by:
        getNumGrams in interface LanguageProfile
        Returns:
        0-n (0 only for an empty profile).
      • getNumGramOccurrences

        public long getNumGramOccurrences​(int gramLength)
        Description copied from interface: LanguageProfile
        Tells the combined number of occurrences of all n-grams of a certain length. This is typically a much larger number than LanguageProfile.getNumGrams(int) returns, because each distinct n-gram is counted once per occurrence.
        Specified by:
        getNumGramOccurrences in interface LanguageProfile
        Parameters:
        gramLength - 1-n
        Returns:
        0-n. Returns zero if no such n-grams were made (for example, if no 4-grams were made), or if the training text contained no words that long.
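To make the difference between the number of distinct n-grams and the number of occurrences concrete, here is a minimal sketch over a nested map shaped like the `ngrams` field. It only illustrates the documented semantics and is not this class's implementation; `GramCountsSketch`, its method names, and the sample counts are hypothetical.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper, not part of the documented API.
final class GramCountsSketch {

    /** Number of distinct n-grams of one length; 0 if that length is absent. */
    static int numGrams(Map<Integer, Map<String, Integer>> ngrams, int gramLength) {
        return ngrams.getOrDefault(gramLength, Collections.<String, Integer>emptyMap()).size();
    }

    /** Number of distinct n-grams over all lengths combined. */
    static int numGrams(Map<Integer, Map<String, Integer>> ngrams) {
        int total = 0;
        for (Map<String, Integer> byLength : ngrams.values()) {
            total += byLength.size();
        }
        return total;
    }

    /** Sum of the counts of all n-grams of one length; 0 if that length is absent. */
    static long numGramOccurrences(Map<Integer, Map<String, Integer>> ngrams, int gramLength) {
        long total = 0;
        for (int count : ngrams.getOrDefault(gramLength,
                Collections.<String, Integer>emptyMap()).values()) {
            total += count;
        }
        return total;
    }

    /** Made-up sample data: two 1-grams and one 2-gram. */
    static Map<Integer, Map<String, Integer>> sample() {
        return Map.of(
                1, Map.of("a", 9, "b", 2),
                2, Map.of("th", 7));
    }
}
```

On the sample data there are two distinct 1-grams but eleven 1-gram occurrences, which is the gap the description above refers to.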
      • getMinGramCount

        public long getMinGramCount​(int gramLength)
        Description copied from interface: LanguageProfile
        Tells how often the least frequent n-gram of the given length in this profile occurred. Most likely there were n-grams with fewer occurrences (unless the returned number is 1), but they were eliminated to keep the profile reasonably small. This is the counterpart of getMaxGramCount(int).
        Specified by:
        getMinGramCount in interface LanguageProfile
        Parameters:
        gramLength - 1-n
        Returns:
        0-n, returns zero if no such n-grams were made or existed.
      • getMaxGramCount

        public long getMaxGramCount​(int gramLength)
        Description copied from interface: LanguageProfile
        Tells how often the most frequent n-gram of the given length in this profile occurred. This is the counterpart of getMinGramCount(int).
        Specified by:
        getMaxGramCount in interface LanguageProfile
        Parameters:
        gramLength - 1-n
        Returns:
        0-n, returns zero if no such n-grams were made or existed.
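The minimum and maximum counts described above can be sketched as a scan over the counts stored for one gram length. The class appears to precompute such figures (see `makeStats` and the `stats` field), but how it does so is not documented here, so the code below is only an illustration of the documented return values; `GramExtremesSketch` and its method names are hypothetical.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper, not part of the documented API.
final class GramExtremesSketch {

    /** Smallest stored count for the given length, or 0 if there are no such n-grams. */
    static long minGramCount(Map<Integer, Map<String, Integer>> ngrams, int gramLength) {
        Map<String, Integer> grams = ngrams.get(gramLength);
        if (grams == null || grams.isEmpty()) {
            return 0;
        }
        return Collections.min(grams.values());
    }

    /** Largest stored count for the given length, or 0 if there are no such n-grams. */
    static long maxGramCount(Map<Integer, Map<String, Integer>> ngrams, int gramLength) {
        Map<String, Integer> grams = ngrams.get(gramLength);
        if (grams == null || grams.isEmpty()) {
            return 0;
        }
        return Collections.max(grams.values());
    }

    /** Made-up sample data: three 1-grams with different counts. */
    static Map<Integer, Map<String, Integer>> sample() {
        return Map.of(1, Map.of("a", 9, "b", 2, "c", 4));
    }
}
```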
      • iterateGrams

        public @NotNull java.lang.Iterable<java.util.Map.Entry<java.lang.String,java.lang.Integer>> iterateGrams()
        Description copied from interface: LanguageProfile
        Iterates over all n-gram strings together with their frequencies.
        Specified by:
        iterateGrams in interface LanguageProfile
      • iterateGrams

        public @NotNull java.lang.Iterable<java.util.Map.Entry<java.lang.String,java.lang.Integer>> iterateGrams(int gramLength)
        Description copied from interface: LanguageProfile
        Iterates over all n-gram strings of length gramLength together with their frequencies.
        Specified by:
        iterateGrams in interface LanguageProfile
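The two iteration methods above can be pictured as views over a nested map shaped like the `ngrams` field: one flattens the entries of every gram length, the other exposes the entries of a single length. The sketch below is an illustration under that assumption, not the class's implementation (it copies entries eagerly, which the real class may not do); `GramIterationSketch` and the sample counts are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the documented API.
final class GramIterationSketch {

    /** All (gram, count) entries across every gram length, in no particular order. */
    static Iterable<Map.Entry<String, Integer>> iterateGrams(
            Map<Integer, Map<String, Integer>> ngrams) {
        List<Map.Entry<String, Integer>> all = new ArrayList<>();
        for (Map<String, Integer> byLength : ngrams.values()) {
            all.addAll(byLength.entrySet());
        }
        return all;
    }

    /** The (gram, count) entries of one gram length; empty if that length is absent. */
    static Iterable<Map.Entry<String, Integer>> iterateGrams(
            Map<Integer, Map<String, Integer>> ngrams, int gramLength) {
        return ngrams.getOrDefault(gramLength,
                Collections.<String, Integer>emptyMap()).entrySet();
    }

    /** Made-up sample data: two 1-grams and one 2-gram. */
    static Map<Integer, Map<String, Integer>> sample() {
        return Map.of(
                1, Map.of("a", 9, "b", 2),
                2, Map.of("th", 7));
    }
}
```

A caller would typically consume either result with a for-each loop, reading `getKey()` for the gram and `getValue()` for its frequency.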
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object
      • equals

        public boolean equals​(java.lang.Object o)
        Overrides:
        equals in class java.lang.Object
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class java.lang.Object