Open Source Toolkit For Speech Recognition Project by Carnegie Mellon University
An expensive part of the decoder is collecting the set of successor tokens to be scored for the next frame. This optimization periodically skips the grow step and instead rescores the current set of active tokens against the next frame. The speed improvement depends on how often the grow step is skipped.
protected boolean recognize() {
    boolean more = scoreTokens(); // score emitting tokens
    if (more) {
        pruneBranches(); // eliminate poor branches
        currentFrameNumber++;
        // grow unless this frame is a multiple of growSkipInterval (0 disables skipping)
        if (growSkipInterval == 0
                || (currentFrameNumber % growSkipInterval) != 0) {
            growBranches(); // extend remaining branches
        }
    }
    return !more;
}
This code change was made to both the Simple and the WordPruning breadth-first search managers.
The 'growSkipInterval' property in the configuration file controls this behavior. Its default value is 0, which disables skipping. Possible values to try are 6-10.
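To get a feel for how the interval setting translates into skipped grow steps, here is a minimal sketch that applies the same condition as recognize() to a run of frames. The class GrowSkipDemo, its helper methods, and the 120-frame utterance length are hypothetical and chosen only for illustration; they are not part of the decoder. The sketch also assumes the frame counter starts at 0, so the first frame tested is 1.

```java
// Hypothetical demo class for illustration; not part of the decoder.
class GrowSkipDemo {

    /** Same test as in recognize(): grow unless the frame is a multiple of the interval. */
    static boolean shouldGrow(int currentFrameNumber, int growSkipInterval) {
        return growSkipInterval == 0
                || currentFrameNumber % growSkipInterval != 0;
    }

    /** Counts how many of the frames 1..totalFrames skip the grow step. */
    static int countSkipped(int totalFrames, int growSkipInterval) {
        int skipped = 0;
        // Frames start at 1 because the counter is incremented before the test.
        for (int frame = 1; frame <= totalFrames; frame++) {
            if (!shouldGrow(frame, growSkipInterval)) {
                skipped++;
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        int totalFrames = 120; // assumed utterance length in frames
        for (int interval : new int[] {0, 2, 3, 4, 6}) {
            System.out.printf("growSkipInterval=%d skips %d of %d grow steps%n",
                    interval, countSkipped(totalFrames, interval), totalFrames);
        }
    }
}
```

A smaller nonzero interval skips the grow step more often: every second frame for 2, every sixth frame for 6. Note that with this condition an interval of 1 would never grow at all, since every frame number is a multiple of 1.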
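Results are shown here as WER / RT. The 'base' column is the run without skipping, and the numbered columns are growSkipInterval values.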
| Test | base | 2 | 3 | 4 | 6 |
| rm1 bigram | 4.3/0.76 | 19.0/0.31 | 7.0/0.44 | 5.4/0.49 | 4.8/0.58 |
| rm1 trigram | 1.6/1.1 | 23.5/0.51 | 6.1/0.69 | 5.863/0.78 | 1.6/0.95 |
| hub4 trigram | 8.33/29.5 | 10.6/22.6 | | | |
Note that these runs are subsets of the full tests, so the baseline WER may not exactly match the values in the regression tests.