Class KneserNeyLmReaderCallback<W>

    • Field Detail

      • lmOrder

        protected final int lmOrder
      • wordIndexer

        protected final WordIndexer<W> wordIndexer
        This array holds the discount used for each n-gram order. The original Kneser-Ney discounting (-ukndiscount) uses one discounting constant per n-gram order, estimated as D = n1 / (n1 + 2*n2), where n1 and n2 are the numbers of n-grams of that order that occur exactly once and exactly twice, respectively. For simplicity, this code uses a constant discount of 0.75 for every order. Other discounts can be specified.
      • startIndex

        protected final int startIndex
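
The discount estimate D = n1 / (n1 + 2*n2) described above can be sketched as follows. This is a minimal illustration, not code from this class: the class name KneserNeyDiscountSketch, both method names, and the fallback to 0.75 when no n-grams occur once or twice are assumptions made for the example.

```java
// Hypothetical helper, not part of KneserNeyLmReaderCallback.
class KneserNeyDiscountSketch {

    // Constant discount the documentation above says is used for every order.
    static final double DEFAULT_DISCOUNT = 0.75;

    // D = n1 / (n1 + 2*n2), where n1 and n2 are the numbers of n-grams of
    // one order seen exactly once and exactly twice.
    // Assumption: fall back to the default when the denominator is zero.
    static double estimateDiscount(long n1, long n2) {
        long denominator = n1 + 2 * n2;
        if (denominator == 0) {
            return DEFAULT_DISCOUNT;
        }
        return (double) n1 / denominator;
    }

    // Tallies n1 and n2 from the raw counts of all n-grams of a single order.
    static double discountFromCounts(long[] countsForOrder) {
        long n1 = 0;
        long n2 = 0;
        for (long count : countsForOrder) {
            if (count == 1) {
                n1++;
            } else if (count == 2) {
                n2++;
            }
        }
        return estimateDiscount(n1, n2);
    }

    public static void main(String[] args) {
        // Three n-grams seen once and one seen twice.
        long[] bigramCounts = {1, 1, 1, 2, 5};
        System.out.println(discountFromCounts(bigramCounts));
    }
}
```

One discount per order would be obtained by calling discountFromCounts once for each n-gram order, using only the counts of that order.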
    • Constructor Detail

      • KneserNeyLmReaderCallback

        public KneserNeyLmReaderCallback(WordIndexer<W> wordIndexer,
                                         int maxOrder)
        Parameters:
        wordIndexer -
        maxOrder -
        inputIsSentences - If true, input n-grams are assumed to be sentences, and all sub-ngrams of up to order maxOrder are added. If false, input n-grams are assumed to be atomic.
      • KneserNeyLmReaderCallback

        public KneserNeyLmReaderCallback(WordIndexer<W> wordIndexer,
                                         int maxOrder,
                                         ConfigOptions opts)