Package edu.berkeley.nlp.lm
Interface ContextEncodedNgramLanguageModel<W>
- Type Parameters:
W
- All Superinterfaces:
NgramLanguageModel<W>
- All Known Implementing Classes:
AbstractContextEncodedNgramLanguageModel, ContextEncodedCachingLmWrapper, ContextEncodedProbBackoffLm
public interface ContextEncodedNgramLanguageModel<W> extends NgramLanguageModel<W>
Interface for language models which expose the internal context-encoding for more efficient queries. (Note: language model implementations may internally use a context-encoding without implementing this interface.) A context-encoding encodes an n-gram as an integer representing the last word, and an offset which serves as a logical pointer to the (n-1) prefix words. The integers represent words of type W in the vocabulary, and the mapping from the vocabulary to integers is managed by an instance of the WordIndexer class.
- Author:
adampauls
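The context-encoding described above can be pictured with a small, self-contained sketch. This is a toy illustration only, not the library's data structure: the class ToyContextEncoding, its add and decode methods, and the convention that an empty prefix has offset -1 and order -1 are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only: this is NOT the berkeleylm data structure.
final class ToyContextEncoding {
    // tables.get(k) holds all stored n-grams of 0-based order k (k == 0: unigrams).
    // Each entry is {lastWord, prefixOffset}; prefixOffset indexes tables.get(k - 1)
    // and is -1 for unigrams, whose prefix is empty (an assumption of this sketch).
    private final List<List<long[]>> tables = new ArrayList<>();
    private final List<Map<String, Long>> index = new ArrayList<>();

    // Stores the n-gram "prefix + word" (if it is new) and returns its offset.
    long add(long prefixOffset, int prefixOrder, int word) {
        int order = prefixOrder + 1;
        while (tables.size() <= order) {
            tables.add(new ArrayList<>());
            index.add(new HashMap<>());
        }
        String key = prefixOffset + ":" + word;
        Long existing = index.get(order).get(key);
        if (existing != null) {
            return existing;
        }
        long offset = tables.get(order).size();
        tables.get(order).add(new long[] { word, prefixOffset });
        index.get(order).put(key, offset);
        return offset;
    }

    // Recovers the word ids of the n-gram given as (context offset, context order,
    // last word) by following prefix pointers back to the first word.
    int[] decode(long contextOffset, int contextOrder, int word) {
        int[] ngram = new int[contextOrder + 2];
        ngram[contextOrder + 1] = word;
        long offset = contextOffset;
        for (int k = contextOrder; k >= 0; --k) {
            long[] entry = tables.get(k).get((int) offset);
            ngram[k] = (int) entry[0];
            offset = entry[1];
        }
        return ngram;
    }

    public static void main(String[] args) {
        ToyContextEncoding enc = new ToyContextEncoding();
        long uni = enc.add(-1L, -1, 7); // unigram (7)
        long bi = enc.add(uni, 0, 9);   // bigram (7 9)
        // The trigram (7 9 4) is the pair: last word 4, context = the bigram's offset.
        System.out.println(Arrays.toString(enc.decode(bi, 1, 4))); // prints [7, 9, 4]
    }
}
```

In this sketch a query never needs the prefix words themselves, only the offset and order of the prefix, which is the property that makes repeated queries over overlapping contexts cheap.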
Nested Class Summary
Nested Classes
Modifier and Type | Interface | Description
static class | ContextEncodedNgramLanguageModel.DefaultImplementations |
static class | ContextEncodedNgramLanguageModel.LmContextInfo | Simple class for returning context offsets
Nested classes/interfaces inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel
NgramLanguageModel.StaticMethods
Method Summary
Modifier and Type | Method | Description
float | getLogProb(long contextOffset, int contextOrder, int word, ContextEncodedNgramLanguageModel.LmContextInfo outputContext) | Get the score for an n-gram, and also get the context offset of the n-gram's suffix.
int[] | getNgramForOffset(long contextOffset, int contextOrder, int word) | Gets the n-gram referred to by a context-encoding.
ContextEncodedNgramLanguageModel.LmContextInfo | getOffsetForNgram(int[] ngram, int startPos, int endPos) | Gets the offset which refers to an n-gram.
Methods inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel
getLmOrder, getLogProb, getWordIndexer, scoreSentence, setOovWordLogProb
Method Detail
getLogProb
float getLogProb(long contextOffset, int contextOrder, int word, ContextEncodedNgramLanguageModel.LmContextInfo outputContext)
Get the score for an n-gram, and also get the context offset of the n-gram's suffix.
- Parameters:
contextOffset - Offset of context (prefix) of an n-gram
contextOrder - The (0-based) length of context (i.e. order == 0 iff context refers to a unigram).
word - Last word of the n-gram
outputContext - Offset of the suffix of the input n-gram. If the parameter is null it will be ignored. This can be passed to future queries for efficient access.
- Returns:
The score for the n-gram
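A minimal sketch of how the outputContext of one query can be fed into the next when scoring a word sequence. It is self-contained and does not use the library: ToyBigramLm, Ctx (a stand-in for LmContextInfo), and the score helper are hypothetical, the toy model ignores back-off weights, and treating offset -1 with order -1 as the empty context is an assumption derived from the 0-based order convention.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration only: a hypothetical bigram model, not a berkeleylm class.
final class ToyChainedScoring {

    // Stand-in for LmContextInfo: an offset plus a 0-based order.
    // Assumption of this sketch: order == -1 means the empty context.
    static final class Ctx {
        long offset = -1L;
        int order = -1;
    }

    static final class ToyBigramLm {
        private static final float OOV_LOG_PROB = -100.0f;
        private final Map<Integer, Float> unigrams = new HashMap<>();
        private final Map<Long, Float> bigrams = new HashMap<>();

        void putUnigram(int word, float logProb) {
            unigrams.put(word, logProb);
        }

        void putBigram(int prev, int word, float logProb) {
            bigrams.put(key(prev, word), logProb);
        }

        private static long key(long prev, int word) {
            return (prev << 32) | (word & 0xffffffffL);
        }

        // Same parameter shape as the documented getLogProb. In this toy model the
        // offset of a unigram context is simply its word id, and a missing bigram
        // falls back to the unigram score with no back-off weight.
        float getLogProb(long contextOffset, int contextOrder, int word, Ctx outputContext) {
            Float logProb = null;
            if (contextOrder >= 0) {
                logProb = bigrams.get(key(contextOffset, word));
            }
            if (logProb == null) {
                logProb = unigrams.getOrDefault(word, OOV_LOG_PROB);
            }
            if (outputContext != null) {
                // Suffix of the input n-gram that serves as context for the next word.
                outputContext.offset = word;
                outputContext.order = 0;
            }
            return logProb;
        }
    }

    // Scores a sequence by passing each query's output context to the next query.
    static float score(ToyBigramLm lm, int[] words) {
        Ctx ctx = new Ctx();
        float total = 0.0f;
        for (int w : words) {
            // ctx.offset and ctx.order are read before the call overwrites them.
            total += lm.getLogProb(ctx.offset, ctx.order, w, ctx);
        }
        return total;
    }

    public static void main(String[] args) {
        ToyBigramLm lm = new ToyBigramLm();
        lm.putUnigram(1, -1.0f);
        lm.putUnigram(2, -2.0f);
        lm.putUnigram(3, -4.0f);
        lm.putBigram(1, 2, -0.5f);
        // -1.0 (unigram 1) + -0.5 (bigram 1 2) + -4.0 (unigram 3, bigram 2 3 missing)
        System.out.println(score(lm, new int[] { 1, 2, 3 })); // prints -5.5
    }
}
```

Reusing a single context object across the loop avoids one allocation per word; a caller that needs to keep earlier contexts (for example during search) would allocate a fresh one per query instead.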
getOffsetForNgram
ContextEncodedNgramLanguageModel.LmContextInfo getOffsetForNgram(int[] ngram, int startPos, int endPos)
Gets the offset which refers to an n-gram. If the n-gram is not in the model, then it returns the shortest suffix of the n-gram which is. This operation is not necessarily fast.
getNgramForOffset
int[] getNgramForOffset(long contextOffset, int contextOrder, int word)
Gets the n-gram referred to by a context-encoding. This operation is not necessarily fast.