Package com.optimaize.langdetect
Class NgramFrequencyData
- java.lang.Object
-
- com.optimaize.langdetect.NgramFrequencyData
-
public final class NgramFrequencyData extends java.lang.Object
Contains frequency information for n-grams coming from multiple LanguageProfiles.
For each n-gram string it knows the locales (languages) in which it occurs, and how frequently it occurs in those languages relative to other n-grams of the same length in those same languages.
Immutable by contract only: Java arrays cannot be made unmodifiable, so callers must not modify the double[] values this class hands out.
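To make "frequency in relation to other n-grams of the same length" concrete, here is a minimal sketch for a single language. It is not taken from the library: the class name RelativeFrequencySketch, the method relativeFrequencies and the raw counts are all hypothetical, and the library may compute its values differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RelativeFrequencySketch {

    /**
     * Converts raw n-gram counts of ONE language into relative frequencies.
     * Each n-gram is divided by the total count of n-grams of the same length.
     * Hypothetical helper, not part of NgramFrequencyData.
     */
    static Map<String, Double> relativeFrequencies(Map<String, Integer> counts) {
        // Total occurrences per n-gram length (1-grams, 2-grams, ...).
        Map<Integer, Integer> totalByLength = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            totalByLength.merge(e.getKey().length(), e.getValue(), Integer::sum);
        }
        // Normalise every n-gram against the total for its own length.
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            int total = totalByLength.get(e.getKey().length());
            result.put(e.getKey(), e.getValue() / (double) total);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("a", 30);   // 1-grams: 40 occurrences in total
        counts.put("b", 10);
        counts.put("ab", 5);   // 2-grams: 20 occurrences in total
        counts.put("ba", 15);
        System.out.println(relativeFrequencies(counts));
    }
}
```

Note that "a" and "ba" end up with the same relative frequency even though their raw counts differ, because each is compared only with n-grams of its own length.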
-
-
Field Summary
Modifier and Type Field Description
private @NotNull java.util.List<LdLocale>
langlist
All the loaded languages, in exactly the same order as the data is in the double[] in wordLangProbMap.
private @NotNull java.util.Map<java.lang.String,double[]>
wordLangProbMap
Key = n-gram, value = array with probabilities per loaded language, in the same order as langlist.
-
Constructor Summary
Modifier Constructor Description
private
NgramFrequencyData(@NotNull java.util.Map<java.lang.String,double[]> wordLangProbMap, @NotNull java.util.List<LdLocale> langlist)
-
Method Summary
Modifier and Type Method Description
static @NotNull NgramFrequencyData
create(@NotNull java.util.Collection<LanguageProfile> languageProfiles, @NotNull java.util.Collection<java.lang.Integer> gramLengths)
@NotNull LdLocale
getLanguage(int pos)
@NotNull java.util.List<LdLocale>
getLanguageList()
@org.jetbrains.annotations.Nullable double[]
getProbabilities(java.lang.String ngram)
Don't modify this data structure! (Can't make array immutable...)
-
-
-
Field Detail
-
wordLangProbMap
@NotNull private final java.util.Map<java.lang.String,double[]> wordLangProbMap
Key = n-gram, value = array with probabilities per loaded language, in the same order as langlist.
-
langlist
@NotNull private final java.util.List<LdLocale> langlist
All the loaded languages, in exactly the same order as the data is in the double[] in wordLangProbMap. Example: if wordLangProbMap has an entry for the n-gram "foo", then that entry's array holds one value for each locale in langlist, at the same position. Languages that don't know the n-gram have the value 0d.
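The position-aligned layout of the two fields can be pictured with the following sketch. It uses plain strings instead of LdLocale and made-up probabilities. The class AlignedLayoutSketch and its probability helper are hypothetical and only mirror the field names described here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AlignedLayoutSketch {

    // Hypothetical stand-in for langlist: plain strings instead of LdLocale.
    static final List<String> LANGLIST = Arrays.asList("en", "de", "fr");

    // Hypothetical stand-in for wordLangProbMap with made-up values.
    static final Map<String, double[]> WORD_LANG_PROB_MAP = new HashMap<>();
    static {
        // "foo" occurs in en and fr; de does not know it, so its slot holds 0d.
        WORD_LANG_PROB_MAP.put("foo", new double[] {0.002, 0d, 0.0005});
    }

    /** Returns the stored value for the n-gram in the given language, or 0d if either is unknown. */
    static double probability(String ngram, String language) {
        double[] probs = WORD_LANG_PROB_MAP.get(ngram);
        int pos = LANGLIST.indexOf(language); // same index in list and array
        if (probs == null || pos < 0) {
            return 0d;
        }
        return probs[pos];
    }

    public static void main(String[] args) {
        for (String language : LANGLIST) {
            System.out.println(language + " -> " + probability("foo", language));
        }
    }
}
```

Keeping one array per n-gram, indexed by language position, avoids a nested map lookup per language. The cost is that every array has one slot per loaded language, even for languages that never use the n-gram.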
-
-
Constructor Detail
-
NgramFrequencyData
private NgramFrequencyData(@NotNull java.util.Map<java.lang.String,double[]> wordLangProbMap, @NotNull java.util.List<LdLocale> langlist)
-
-
Method Detail
-
create
@NotNull public static NgramFrequencyData create(@NotNull java.util.Collection<LanguageProfile> languageProfiles, @NotNull java.util.Collection<java.lang.Integer> gramLengths) throws java.lang.IllegalArgumentException
- Parameters:
gramLengths - for example [1,2,3]
- Throws:
java.lang.IllegalArgumentException - if languageProfiles or gramLengths is empty, or if one of the languageProfiles does not have the grams of the required sizes.
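The three failure conditions listed under Throws can be sketched as a standalone check. This is an illustration of the documented contract, not the library's code. The class CreateValidationSketch, its methods, and the representation of a profile as a map from language name to the set of n-gram lengths it contains are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class CreateValidationSketch {

    /**
     * Throws IllegalArgumentException under the same conditions the create()
     * documentation lists. Profiles are modelled as: name -> available n-gram lengths.
     */
    static void validate(Map<String, Set<Integer>> gramLengthsByProfile,
                         Collection<Integer> gramLengths) {
        if (gramLengthsByProfile.isEmpty()) {
            throw new IllegalArgumentException("No language profiles provided");
        }
        if (gramLengths.isEmpty()) {
            throw new IllegalArgumentException("No gram lengths provided");
        }
        for (Map.Entry<String, Set<Integer>> e : gramLengthsByProfile.entrySet()) {
            if (!e.getValue().containsAll(gramLengths)) {
                throw new IllegalArgumentException(
                        "Profile " + e.getKey() + " lacks n-grams of sizes " + gramLengths);
            }
        }
    }

    /** Convenience wrapper that reports the outcome as a boolean. */
    static boolean isValid(Map<String, Set<Integer>> gramLengthsByProfile,
                           Collection<Integer> gramLengths) {
        try {
            validate(gramLengthsByProfile, gramLengths);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> profiles = new HashMap<>();
        profiles.put("en", new HashSet<>(Arrays.asList(1, 2, 3)));
        profiles.put("de", new HashSet<>(Arrays.asList(1, 2)));
        System.out.println(isValid(profiles, Arrays.asList(1, 2)));    // both profiles have 1- and 2-grams
        System.out.println(isValid(profiles, Arrays.asList(1, 2, 3))); // "de" has no 3-grams
    }
}
```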
-
getLanguageList
@NotNull public java.util.List<LdLocale> getLanguageList()
-
getLanguage
@NotNull public LdLocale getLanguage(int pos)
-
getProbabilities
@Nullable public double[] getProbabilities(java.lang.String ngram)
Don't modify this data structure! (Can't make array immutable...)
- Returns:
null if no language profile knows that n-gram. Otherwise an array in which entries are 0 for languages that don't know that n-gram at all. The array is in the order of the getLanguageList() language list, and has exactly that size. Implementation note: returning null lets the caller handle an unknown n-gram more efficiently than returning an empty array.
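A caller might honour both parts of this contract (null for unknown n-grams, read-only arrays) roughly as follows. This is a sketch only: the class NullAwareLookupSketch and the Function parameter standing in for getProbabilities are hypothetical, and summing the values is a placeholder for whatever scoring the caller actually performs.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class NullAwareLookupSketch {

    /**
     * Sums the per-language values of all known n-grams.
     * The lookup function is a hypothetical stand-in for getProbabilities(String).
     */
    static double[] sumPerLanguage(List<String> ngrams, int languageCount,
                                   Function<String, double[]> lookup) {
        double[] totals = new double[languageCount]; // caller-owned, safe to modify
        for (String ngram : ngrams) {
            double[] probs = lookup.apply(ngram);
            if (probs == null) {
                continue; // no language profile knows this n-gram: skip it cheaply
            }
            for (int i = 0; i < languageCount; i++) {
                totals[i] += probs[i]; // read only, never write into probs
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up data for two languages.
        Map<String, double[]> data = new HashMap<>();
        data.put("ab", new double[] {0.5, 0.0});
        data.put("cd", new double[] {0.25, 0.25});

        double[] totals = sumPerLanguage(Arrays.asList("ab", "zz", "cd"), 2, data::get);
        System.out.println(Arrays.toString(totals));
    }
}
```

The single null check replaces a loop over an all-zero array for every unknown n-gram. If a caller does need to change the values, copying first with Arrays.copyOf(probs, probs.length) leaves the shared array untouched.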
-
-