How to Use Models from SphinxTrain in Sphinx-4
Using new models is easy: you just need to configure the recognizer properly. This usually involves three steps: the language model and dictionary, the acoustic model, and the frontend.
First, the language model and dictionary. They are usually located in
<your_training_folder>/etc/ and have names like <your_model_name>.dic
and <your_model_name>.lm.DMP. If you don't have a language model yet, you can create one with
cmuclmtk and then convert it to DMP format with sphinx3_lm_convert from the sphinx3 package.
Make the following changes to the language model and dictionary configuration so that
they point to your files:
<component name="trigramModel" type="edu.cmu.sphinx.linguist.language.ngram.large.LargeTrigramModel">
<property name="unigramWeight" value="0.7"/>
<property name="maxDepth" value="3"/>
<property name="logMath" value="logMath"/>
<property name="dictionary" value="dictionary"/>
<property name="location"
value="the name of the language model file
for example <your_training_folder>/etc/<your_model_name>.lm.DMP"/>
</component>
<component name="dictionary" type="edu.cmu.sphinx.linguist.dictionary.FastDictionary">
<property name="dictionaryPath"
value="the name of the dictionary file
for example <your_training_folder>/etc/<your_model_name>.dic"/>
<property name="fillerPath"
value="the name of the filler file
for example <your_training_folder>/etc/<your_model_name>.filler"/>
<property name="addSilEndingPronunciation" value="false"/>
<property name="allowMissingWords" value="false"/>
<property name="unitManager" value="unitManager"/>
</component>
Next is the acoustic model. Several models are created during training; you need only one of them.
For a large-vocabulary task, use the cd (context-dependent) model, which is located in
<your_training_folder>/model_parameters/<your_db_name>.cd_cont_<number of senones>.
For a small-vocabulary task, the ci (context-independent) model is enough. It is located in
<your_training_folder>/model_parameters/<your_db_name>.ci_cont.
The folder should include several files, such as means, variances, feat.params and mdef. There will also be folders for different numbers of Gaussians, with suffixes like _2, _4 and _8; these are intermediate models and you don't need them.
Again, define the model in the config file:
<component name="sphinx3Loader"
type="edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader">
<property name="logMath" value="logMath"/>
<property name="unitManager" value="unitManager"/>
<property name="dataLocation" value="file:the path to the model folder
for example <your_training_folder>/model_parameters/<your_model_name>.cd_cont_<senones>"/>
<property name="modelDefinition" value="file:the path to the model mdef file in model folder
for example <your_training_folder>/model_parameters/<your_model_name>.cd_cont_<senones>/mdef"/>
</component>
<component name="acousticModel" type="edu.cmu.sphinx.linguist.acoustic.tiedstate.TiedStateAcousticModel">
<property name="loader" value="sphinx3Loader"/>
<property name="unitManager" value="unitManager"/>
</component>
Please note that the path values here should start with file:.
Note that for an MLLT model you probably also want to change the vectorLength property. Otherwise no change is needed.
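As a rough sketch of the vectorLength change mentioned above, the loader definition might look like the following. This assumes that vectorLength is set on the sphinx3Loader component, and the value 29 is only an illustrative assumption: use the output dimension of the MLLT transform your model was actually trained with. The paths are placeholders.

```xml
<component name="sphinx3Loader"
           type="edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader">
    <property name="logMath" value="logMath"/>
    <property name="unitManager" value="unitManager"/>
    <!-- Placeholder paths: replace with your own model folder -->
    <property name="dataLocation" value="file:/path/to/model_folder"/>
    <property name="modelDefinition" value="file:/path/to/model_folder/mdef"/>
    <!-- Assumption: 29 is an example value, not taken from this guide;
         it must match the feature dimension of your MLLT-trained model -->
    <property name="vectorLength" value="29"/>
</component>
```

If the value does not match the dimension of the trained model, the model will most likely fail to load or decode correctly, so verify it against your training setup before relying on it.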
If you trained an 8 kHz model or an MLLT model, you need to change the frontend accordingly. Here are the required changes (the melFilterBank settings are for 8 kHz models, the lda component is for MLLT models):
<component name="mfcFrontEnd" type="edu.cmu.sphinx.frontend.FrontEnd">
<propertylist name="pipeline">
....
<item>melFilterBank</item>
....
<item>lda</item>
</propertylist>
</component>
<component name="melFilterBank" type="edu.cmu.sphinx.frontend.frequencywarp.MelFrequencyFilterBank">
<property name="numberFilters" value="31"/>
<property name="minimumFrequency" value="200"/>
<property name="maximumFrequency" value="3500"/>
</component>
<component name="lda" type="edu.cmu.sphinx.frontend.feature.LDA">
<property name="loader" value="sphinx3Loader"/>
</component>
For more information on configuration, see the Javadoc and the Programmer's Documentation.
Optionally, you can pack the models into a JAR file. The advantage is that the JAR file can simply be included in the classpath and referenced in the configuration file to be used in a Sphinx-4 application. The configuration for models packed into a JAR is the same, except that you need to use the jar:file: URI scheme for the references. Once you have done so, don't forget to include the JAR in the classpath.
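For illustration, a loader definition that reads the acoustic model from a JAR might look roughly like this. It is a sketch that mirrors the file: example above with the scheme swapped. The JAR path /path/to/models.jar and the entry name my_model are hypothetical. The general form of a Java JAR URL is jar:file:, then the path to the JAR, then !/ and the entry inside the JAR.

```xml
<component name="sphinx3Loader"
           type="edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader">
    <property name="logMath" value="logMath"/>
    <property name="unitManager" value="unitManager"/>
    <!-- Hypothetical JAR path and entry names: adjust to your own layout.
         Form: jar:file:<path to JAR>!/<entry inside the JAR> -->
    <property name="dataLocation" value="jar:file:/path/to/models.jar!/my_model"/>
    <property name="modelDefinition" value="jar:file:/path/to/models.jar!/my_model/mdef"/>
</component>
```

The entry after !/ is relative to the root of the JAR, so it depends on how you arranged the model folder when packing it.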