Package edu.berkeley.nlp.lm.io
Class MakeLmBinaryFromGoogle
- java.lang.Object
  - edu.berkeley.nlp.lm.io.MakeLmBinaryFromGoogle
public class MakeLmBinaryFromGoogle extends java.lang.Object
Given a directory in Google n-grams format, builds a binary representation of a stupid-backoff language model and writes it to disk. Language model binaries are significantly smaller and faster to load. Note: running this code on the full Google n-grams corpus can be very slow and memory intensive; on our machines, it takes about 32GB of memory and 15 hours. If the input/output files have a .gz suffix, they will be unzipped/zipped as necessary.
- Author:
- adampauls
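The .gz behavior described above can be illustrated with a minimal sketch using only the standard java.util.zip classes. This is not the library's own implementation: the class GzSuffixIo and its helpers openInput, openOutput, and roundTrip are hypothetical names introduced here, and the suffix check is an assumption about how such handling might look.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of edu.berkeley.nlp.lm.io.
class GzSuffixIo {

    // Opens a file for reading, decompressing only when the name ends in ".gz".
    static InputStream openInput(Path path) throws IOException {
        InputStream raw = Files.newInputStream(path);
        return path.toString().endsWith(".gz") ? new GZIPInputStream(raw) : raw;
    }

    // Opens a file for writing, compressing only when the name ends in ".gz".
    static OutputStream openOutput(Path path) throws IOException {
        OutputStream raw = Files.newOutputStream(path);
        return path.toString().endsWith(".gz") ? new GZIPOutputStream(raw) : raw;
    }

    // Writes text to the path and reads it back through the suffix-aware streams.
    static String roundTrip(Path path, String text) throws IOException {
        try (BufferedWriter writer = new BufferedWriter(
                new OutputStreamWriter(openOutput(path), StandardCharsets.UTF_8))) {
            writer.write(text);
        }
        StringBuilder result = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(openInput(path), StandardCharsets.UTF_8))) {
            int c;
            while ((c = reader.read()) != -1) {
                result.append((char) c);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) throws IOException {
        Path zipped = Files.createTempFile("ngrams", ".gz");
        Path plain = Files.createTempFile("ngrams", ".txt");
        String line = "the cat\t42\n";
        // Both paths return the same text; only the on-disk bytes differ.
        System.out.println(roundTrip(zipped, line).equals(line));
        System.out.println(roundTrip(plain, line).equals(line));
        Files.delete(zipped);
        Files.delete(plain);
    }
}
```

Dispatching on the file name keeps callers unaware of compression: the same reading and writing code works for both plain and gzipped files.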
Constructor Summary
Constructor
MakeLmBinaryFromGoogle()
Method Summary
Modifier and Type    Method
static void          main(java.lang.String[] argv)