Package edu.berkeley.nlp.lm
Class ArrayEncodedProbBackoffLm<W>
- java.lang.Object
-
- edu.berkeley.nlp.lm.AbstractNgramLanguageModel<W>
-
- edu.berkeley.nlp.lm.AbstractArrayEncodedNgramLanguageModel<W>
-
- edu.berkeley.nlp.lm.ArrayEncodedProbBackoffLm<W>
-
- Type Parameters:
W
-
- All Implemented Interfaces:
ArrayEncodedNgramLanguageModel<W>, NgramLanguageModel<W>, java.io.Serializable
public class ArrayEncodedProbBackoffLm<W> extends AbstractArrayEncodedNgramLanguageModel<W> implements ArrayEncodedNgramLanguageModel<W>, java.io.Serializable
Language model implementation which uses Kneser-Ney-style backoff computation. Note that, unlike the description in Pauls and Klein (2011), for this particular implementation we store a trie in which the first word of an n-gram points to its prefix. This is in contrast to ContextEncodedProbBackoffLm, which stores a trie in which the last word points to its suffix. This was done because it simplifies the code significantly without significantly changing speed or memory usage.
- Author:
- adampauls
- See Also:
- Serialized Form
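The following usage sketch is not part of the library's documentation; it is a minimal illustration under two assumptions: that the model is loaded from an ARPA file via LmReaders.readArrayEncodedLmFromArpa in edu.berkeley.nlp.lm.io (check that class for the exact signature), and that WordIndexer exposes getIndexPossiblyUnk for mapping a word to its integer id without inserting it.

    import java.util.Arrays;
    import java.util.List;

    import edu.berkeley.nlp.lm.ArrayEncodedProbBackoffLm;
    import edu.berkeley.nlp.lm.WordIndexer;
    import edu.berkeley.nlp.lm.io.LmReaders;

    public class ScoreExample {
        public static void main(String[] args) {
            // Assumption: readArrayEncodedLmFromArpa loads an ARPA-format LM into an
            // ArrayEncodedProbBackoffLm<String>; the second argument is assumed to be a
            // compression flag.
            final ArrayEncodedProbBackoffLm<String> lm =
                LmReaders.readArrayEncodedLmFromArpa("trigram.arpa", false);

            // Convenience path: score a List<W> directly (simple but relatively slow).
            final List<String> ngram = Arrays.asList("the", "quick", "fox");
            System.out.println(lm.getLogProb(ngram));

            // Faster path: map words to integer ids once, then score the int[] window.
            // Assumption: WordIndexer provides getIndexPossiblyUnk(W) for lookup without insertion.
            final WordIndexer<String> indexer = lm.getWordIndexer();
            final int[] ids = new int[ngram.size()];
            for (int i = 0; i < ids.length; ++i) {
                ids[i] = indexer.getIndexPossiblyUnk(ngram.get(i));
            }
            System.out.println(lm.getLogProb(ids, 0, ids.length));
        }
    }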
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface edu.berkeley.nlp.lm.ArrayEncodedNgramLanguageModel
ArrayEncodedNgramLanguageModel.DefaultImplementations
-
Nested classes/interfaces inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel
NgramLanguageModel.StaticMethods
-
-
Field Summary
-
Fields inherited from class edu.berkeley.nlp.lm.AbstractNgramLanguageModel
lmOrder, oovWordLogProb
-
-
Constructor Summary
Constructors
- ArrayEncodedProbBackoffLm(int lmOrder, WordIndexer<W> wordIndexer, NgramMap<ProbBackoffPair> map, ConfigOptions opts)
-
Method Summary
- float getLogProb(int[] ngram)
  Equivalent to getLogProb(ngram, 0, ngram.length).
- float getLogProb(int[] ngram, int startPos, int endPos)
  Calculate the language model score of an n-gram.
- float getLogProb(java.util.List<W> ngram)
  Scores an n-gram.
- NgramMap<ProbBackoffPair> getNgramMap()
-
Methods inherited from class edu.berkeley.nlp.lm.AbstractArrayEncodedNgramLanguageModel
scoreSentence
-
Methods inherited from class edu.berkeley.nlp.lm.AbstractNgramLanguageModel
getLmOrder, getWordIndexer, setOovWordLogProb
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel
getLmOrder, getWordIndexer, scoreSentence, setOovWordLogProb
-
-
-
-
Constructor Detail
-
ArrayEncodedProbBackoffLm
public ArrayEncodedProbBackoffLm(int lmOrder, WordIndexer<W> wordIndexer, NgramMap<ProbBackoffPair> map, ConfigOptions opts)
-
-
Method Detail
-
getLogProb
public float getLogProb(int[] ngram, int startPos, int endPos)
Description copied from interface: ArrayEncodedNgramLanguageModel
Calculate the language model score of an n-gram. Warning: if you pass in an n-gram of length greater than getLmOrder(), this call will silently ignore the extra words of context. In other words, if you pass a 5-gram (endPos - startPos == 5) to a 3-gram model, it will only score the words from startPos + 2 to endPos.
- Specified by:
  getLogProb in interface ArrayEncodedNgramLanguageModel<W>
- Specified by:
  getLogProb in class AbstractArrayEncodedNgramLanguageModel<W>
- Parameters:
  ngram - array of words in integer representation
  startPos - start of the portion of the array to be read
  endPos - end of the portion of the array to be read
- Returns:
-
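A short illustration of the truncation behavior described above (my own sketch, not from the original javadoc): with a 3-gram model, a 5-word window contributes only its last three words to the score, so the two calls below return the same value. The lm and ids variables are assumed to come from a setup like the class-level sketch near the top of this page.

    // Assumption: lm is a 3-gram ArrayEncodedProbBackoffLm<String> and ids holds at
    // least 5 word ids (see the class-level sketch above for how they might be built).
    static void illustrateTruncation(ArrayEncodedProbBackoffLm<String> lm, int[] ids) {
        final float fiveGram  = lm.getLogProb(ids, 0, 5); // 5 words passed to a 3-gram model
        final float lastThree = lm.getLogProb(ids, 2, 5); // only the last lmOrder words are used
        System.out.println(fiveGram == lastThree);        // true: positions 0 and 1 are ignored
    }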
getLogProb
public float getLogProb(int[] ngram)
Description copied from interface: ArrayEncodedNgramLanguageModel
Equivalent to getLogProb(ngram, 0, ngram.length).
- Specified by:
  getLogProb in interface ArrayEncodedNgramLanguageModel<W>
- Overrides:
  getLogProb in class AbstractArrayEncodedNgramLanguageModel<W>
- See Also:
  ArrayEncodedNgramLanguageModel.getLogProb(int[], int, int)
-
getLogProb
public float getLogProb(java.util.List<W> ngram)
Description copied from interface: NgramLanguageModel
Scores an n-gram. This is a convenience method and will generally be relatively inefficient. More efficient versions are available in ArrayEncodedNgramLanguageModel.getLogProb(int[], int, int) and ContextEncodedNgramLanguageModel.getLogProb(long, int, int, edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel.LmContextInfo).
- Specified by:
  getLogProb in interface NgramLanguageModel<W>
- Overrides:
  getLogProb in class AbstractArrayEncodedNgramLanguageModel<W>
-
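A rough sketch (mine, not part of the library's documentation) of how this convenience overload might be used to score a sentence with sliding windows; it omits start/end-of-sentence handling, which scoreSentence performs, and it presumably pays a word-to-id mapping cost on every call, which is why the int[] overloads above are preferred for bulk scoring.

    // Assumption: lm and sentence are supplied by the caller (see the class-level sketch above).
    static float sumWindowLogProbs(ArrayEncodedProbBackoffLm<String> lm, java.util.List<String> sentence) {
        final int order = lm.getLmOrder();
        float total = 0.0f;
        for (int end = 1; end <= sentence.size(); ++end) {
            final int start = Math.max(0, end - order);
            // Each call maps the words in the sublist to integer ids before scoring.
            total += lm.getLogProb(sentence.subList(start, end));
        }
        return total;
    }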
getNgramMap
public NgramMap<ProbBackoffPair> getNgramMap()
-
-