java.lang.Object
  edu.cmu.sphinx.linguist.language.ngram.large.LargeTrigramModel
public class LargeTrigramModel
Queries a binary language model file generated by the CMU-Cambridge Statistical Language Modeling Toolkit.
Note that all probabilities in the grammar are stored in the LogMath log base. Language probabilities in the language model file are stored in log base 10 and are converted to the LogMath log base.
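The base conversion described above is a change of logarithm base. The following is a minimal sketch of that arithmetic only, not the LogMath implementation: the class name `LogBaseConversion`, the method `log10ToBase`, and the example base of 1.0001 are assumptions made for illustration and do not come from this page.

```java
/** Illustrative sketch: converting a log10 probability to another logarithm base. */
final class LogBaseConversion {

    /**
     * Change of base: log_b(x) = log10(x) * ln(10) / ln(b).
     * The target base must be greater than 1 so that ln(b) is positive.
     */
    static double log10ToBase(double log10Value, double base) {
        if (base <= 1.0) {
            throw new IllegalArgumentException("base must be greater than 1");
        }
        return log10Value * Math.log(10.0) / Math.log(base);
    }

    public static void main(String[] args) {
        double assumedBase = 1.0001; // assumption: example base chosen for illustration
        double log10Prob = -2.0;     // a probability of 0.01 as stored in log base 10
        System.out.println(log10ToBase(log10Prob, assumedBase));
    }
}
```

A base close to 1 spreads log values over a wide numeric range, which is one reason a recognizer might prefer it over base 10; the actual base is whatever the configured LogMath component uses.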
| Field Summary | |
|---|---|
| static int | BYTES_PER_BIGRAM: The number of bytes per bigram in the LM file generated by the CMU-Cambridge Statistical Language Modeling Toolkit. |
| static int | BYTES_PER_TRIGRAM: The number of bytes per trigram in the LM file generated by the CMU-Cambridge Statistical Language Modeling Toolkit. |
| static java.lang.String | PROP_APPLY_LANGUAGE_WEIGHT_AND_WIP: A property that controls whether or not the language model will apply the language weight and word insertion probability. |
| static java.lang.String | PROP_BIGRAM_CACHE_SIZE: A property that defines the maximum number of bigrams to be cached. |
| static java.lang.String | PROP_CLEAR_CACHES_AFTER_UTTERANCE: A property that controls whether the bigram and trigram caches are cleared after every utterance. |
| static java.lang.String | PROP_FULL_SMEAR: If true, use full bigram information to determine smear. |
| static java.lang.String | PROP_LANGUAGE_WEIGHT: A property that defines the language weight for the search. |
| static java.lang.String | PROP_LOG_MATH: A property that defines the logMath component. |
| static java.lang.String | PROP_QUERY_LOG_FILE: A property for the name of the file that logs all the queried N-grams. |
| static java.lang.String | PROP_TRIGRAM_CACHE_SIZE: A property that defines the maximum number of trigrams to be cached. |
| static java.lang.String | PROP_WORD_INSERTION_PROBABILITY: Word insertion probability property. |
| Fields inherited from interface edu.cmu.sphinx.linguist.language.ngram.LanguageModel |
|---|
| PROP_DICTIONARY, PROP_FORMAT, PROP_LOCATION, PROP_MAX_DEPTH, PROP_UNIGRAM_WEIGHT |
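The language weight and word insertion probability (WIP) named in the field summary are conventionally applied in the log domain: the log probability is scaled by the language weight and the log of the WIP is added. The sketch below shows that conventional scheme under stated assumptions; the class `LanguageWeighting` and its methods are hypothetical names, and this page does not confirm that LargeTrigramModel uses exactly this formula.

```java
/** Illustrative sketch: applying a language weight and WIP to a log probability. */
final class LanguageWeighting {

    /** Converts a linear value (such as the WIP) into the given log base. */
    static double linearToLog(double linear, double logBase) {
        return Math.log(linear) / Math.log(logBase);
    }

    /**
     * Assumed scheme: weighted = logProb * languageWeight + logWip.
     * In the linear domain this corresponds to p^languageWeight * wip.
     */
    static double applyWeightAndWip(double logProb, double languageWeight, double logWip) {
        return logProb * languageWeight + logWip;
    }

    public static void main(String[] args) {
        double base = Math.E;                       // assumption: natural log for readability
        double logWip = linearToLog(0.5, base);     // example WIP of 0.5
        double weighted = applyWeightAndWip(Math.log(0.5), 2.0, logWip);
        System.out.println(Math.exp(weighted));     // back in the linear domain
    }
}
```

With a language weight of 1.0 and a WIP of 1.0, the defaults listed under Field Detail, this formula leaves the log probability unchanged.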
| Constructor Summary | |
|---|---|
| LargeTrigramModel() | |
| LargeTrigramModel(java.lang.String format, java.net.URL urlLocation, java.lang.String ngramLogFile, int maxTrigramCacheSize, int maxBigramCacheSize, boolean clearCacheAfterUtterance, int maxDepth, LogMath logMath, Dictionary dictionary, boolean applyLanguageWeightAndWip, float languageWeight, double wip, float unigramWeight, boolean fullSmear) | |
| Method Summary | |
|---|---|
| void | allocate(): Create the language model. |
| void | deallocate(): Deallocate resources allocated to this language model. |
| float | getBackoff(WordSequence wordSequence): Returns the backoff probability for the given sequence of words. |
| int | getBigramMisses(): Returns the number of times a bigram is queried but there is no such bigram in the LM (in which case it uses the backoff probabilities). |
| int | getMaxDepth(): Returns the maximum depth of the language model. |
| java.lang.String | getName() |
| float | getProbability(WordSequence wordSequence): Gets the n-gram probability of the word sequence represented by the word list. |
| float | getSmear(WordSequence wordSequence): Gets the smear term for the given wordSequence. |
| float | getSmearOld(WordSequence wordSequence): Gets the smear term for the given wordSequence. |
| int | getTrigramHits(): Returns the number of trigram hits. |
| int | getTrigramMisses(): Returns the number of times a trigram is queried but there is no such trigram in the LM (in which case it uses the backoff probabilities). |
| java.util.Set<java.lang.String> | getVocabulary(): Returns the set of words in the language model. |
| int | getWordID(Word word): Returns the ID of the given word. |
| void | newProperties(PropertySheet ps): This method is called when this configurable component needs to be reconfigured. |
| void | start(): Called before a recognition. |
| void | stop(): Called after a recognition. |
| Methods inherited from class java.lang.Object |
|---|
| equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
@S4String(mandatory=false) public static final java.lang.String PROP_QUERY_LOG_FILE
@S4Integer(defaultValue=100000) public static final java.lang.String PROP_TRIGRAM_CACHE_SIZE
@S4Integer(defaultValue=50000) public static final java.lang.String PROP_BIGRAM_CACHE_SIZE
@S4Boolean(defaultValue=false) public static final java.lang.String PROP_CLEAR_CACHES_AFTER_UTTERANCE
@S4Double(defaultValue=1.0) public static final java.lang.String PROP_LANGUAGE_WEIGHT
@S4Component(type=LogMath.class) public static final java.lang.String PROP_LOG_MATH
@S4Boolean(defaultValue=false) public static final java.lang.String PROP_APPLY_LANGUAGE_WEIGHT_AND_WIP
@S4Double(defaultValue=1.0) public static final java.lang.String PROP_WORD_INSERTION_PROBABILITY
@S4Boolean(defaultValue=false) public static final java.lang.String PROP_FULL_SMEAR
public static final int BYTES_PER_BIGRAM
public static final int BYTES_PER_TRIGRAM
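The cache-size properties above bound how many bigrams and trigrams are kept in memory. One common way to build such a bounded cache is a `LinkedHashMap` in access order that evicts its least recently used entry. The sketch below illustrates that idea only; the class name `BoundedCache` is hypothetical, and this page does not state which eviction policy LargeTrigramModel actually uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch: a size-bounded cache with least-recently-used eviction. */
final class BoundedCache<K, V> extends LinkedHashMap<K, V> {

    private static final long serialVersionUID = 1L;

    private final int maxSize;

    BoundedCache(int maxSize) {
        // accessOrder = true makes get() move an entry to the most-recent position
        super(16, 0.75f, true);
        this.maxSize = maxSize;
    }

    /** Called by LinkedHashMap after each insertion; returning true drops the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
    }

    public static void main(String[] args) {
        BoundedCache<String, Float> bigramCache = new BoundedCache<>(50000);
        bigramCache.put("hello world", -1.5f);
        System.out.println(bigramCache.get("hello world"));
        // Mirrors the clear-caches-after-utterance behaviour described by the property above.
        bigramCache.clear();
    }
}
```

Clearing the cache after every utterance trades repeated lookups in the model file for a smaller steady-state memory footprint; leaving it populated favours speed when consecutive utterances share vocabulary.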
| Constructor Detail |
|---|
public LargeTrigramModel(java.lang.String format,
java.net.URL urlLocation,
java.lang.String ngramLogFile,
int maxTrigramCacheSize,
int maxBigramCacheSize,
boolean clearCacheAfterUtterance,
int maxDepth,
LogMath logMath,
Dictionary dictionary,
boolean applyLanguageWeightAndWip,
float languageWeight,
double wip,
float unigramWeight,
boolean fullSmear)
public LargeTrigramModel()
| Method Detail |
|---|
public void newProperties(PropertySheet ps)
throws PropertyException
Configurable
newProperties in interface Configurable
ps - a property sheet holding the new data
PropertyException - if there is a problem with the properties.
public java.lang.String getName()
public void allocate()
throws java.io.IOException
LanguageModel
allocate in interface LanguageModel
java.io.IOException
public void deallocate()
LanguageModel
deallocate in interface LanguageModel
public void start()
start in interface LanguageModel
public void stop()
stop in interface LanguageModel
public float getProbability(WordSequence wordSequence)
getProbability in interface LanguageModel
wordSequence - the word sequence
public final int getWordID(Word word)
word - the word to find the ID
public float getSmearOld(WordSequence wordSequence)
wordSequence - the word sequence
public float getSmear(WordSequence wordSequence)
LanguageModel
getSmear in interface LanguageModel
wordSequence - the word sequence
public float getBackoff(WordSequence wordSequence)
wordSequence - the sequence of words
public int getMaxDepth()
getMaxDepth in interface LanguageModel
public java.util.Set<java.lang.String> getVocabulary()
getVocabulary in interface LanguageModel
public int getBigramMisses()
public int getTrigramMisses()
public int getTrigramHits()
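The hit and miss counters relate to backoff: when a queried trigram or bigram is absent from the model, the probability is estimated from a shorter n-gram plus a backoff weight. The sketch below shows the standard backoff recursion in the log domain with in-memory maps. It is an illustration under assumptions, not this class's implementation: `BackoffSketch` and all of its methods are hypothetical, and the real model reads a binary file and works with WordSequence objects rather than strings.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch: trigram lookup with backoff and hit/miss counters (log domain). */
final class BackoffSketch {

    private final Map<String, Float> unigramLogProb = new HashMap<>();
    private final Map<String, Float> unigramBackoff = new HashMap<>();
    private final Map<String, Float> bigramLogProb = new HashMap<>();
    private final Map<String, Float> bigramBackoff = new HashMap<>();
    private final Map<String, Float> trigramLogProb = new HashMap<>();

    int trigramHits;
    int trigramMisses;
    int bigramMisses;

    void addUnigram(String w, float logProb, float backoff) {
        unigramLogProb.put(w, logProb);
        unigramBackoff.put(w, backoff);
    }

    void addBigram(String w1, String w2, float logProb, float backoff) {
        bigramLogProb.put(w1 + " " + w2, logProb);
        bigramBackoff.put(w1 + " " + w2, backoff);
    }

    void addTrigram(String w1, String w2, String w3, float logProb) {
        trigramLogProb.put(w1 + " " + w2 + " " + w3, logProb);
    }

    /** log P(w3 | w1 w2), backing off to the bigram when the trigram is absent. */
    float trigramProbability(String w1, String w2, String w3) {
        Float p = trigramLogProb.get(w1 + " " + w2 + " " + w3);
        if (p != null) {
            trigramHits++;
            return p;
        }
        trigramMisses++;
        // Backoff weight of the history (w1 w2); a missing weight counts as log(1) = 0.
        float backoff = bigramBackoff.getOrDefault(w1 + " " + w2, 0.0f);
        return backoff + bigramProbability(w2, w3);
    }

    /** log P(w2 | w1), backing off to the unigram when the bigram is absent. */
    float bigramProbability(String w1, String w2) {
        Float p = bigramLogProb.get(w1 + " " + w2);
        if (p != null) {
            return p;
        }
        bigramMisses++;
        Float unigram = unigramLogProb.get(w2);
        if (unigram == null) {
            throw new IllegalArgumentException("word not in vocabulary: " + w2);
        }
        return unigramBackoff.getOrDefault(w1, 0.0f) + unigram;
    }
}
```

In the log domain the backoff weight is added rather than multiplied, so a miss costs one extra lookup and one addition per level of backoff.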