All samples available from the download page have been verified to run before release. First, let's identify the cause.
Step 1: After extracting the downloaded sample file, run it by following the included readme_en.txt without changing any files.
If it does not work at this stage, HARK and KaldiDecoder have not been installed correctly. Check the installation instructions and run the installation again.
Step 2: Overwrite final.mdl under the kaldi_conf/chain_sample/tdnn_5b/ directory with the final.mdl of your acoustic model. Then replace word.txt and the files in the phones/ directory under the kaldi_conf/chain_sample/tdnn_5b/graph_multi_a_tri5a_fsh_sw1_tg/ directory with the graph files of your language model.
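The file replacement in Step 2 could be scripted roughly as follows. This is only a minimal sketch, not part of the sample: the function name install_model and the three directory arguments (where the sample was extracted, where your acoustic model is, where your graph is) are hypothetical. The file names are taken from the text above, so adjust graph_files if your graph directory uses different names.

```python
import shutil
from pathlib import Path

# Locations inside the extracted sample, as named in Step 2.
MODEL_SUBDIR = Path("kaldi_conf/chain_sample/tdnn_5b")
GRAPH_SUBDIR = MODEL_SUBDIR / "graph_multi_a_tri5a_fsh_sw1_tg"


def install_model(sample_root, my_model_dir, my_graph_dir, graph_files=("word.txt",)):
    """Copy your own model and graph files over the ones shipped in the sample.

    All three directory arguments are hypothetical locations on your machine.
    graph_files lists the top-level graph files to overwrite.
    """
    sample_root = Path(sample_root)
    my_model_dir = Path(my_model_dir)
    my_graph_dir = Path(my_graph_dir)

    # Overwrite the sample acoustic model with yours.
    shutil.copy2(my_model_dir / "final.mdl", sample_root / MODEL_SUBDIR / "final.mdl")

    # Overwrite the listed top-level graph files.
    graph_dst = sample_root / GRAPH_SUBDIR
    for name in graph_files:
        shutil.copy2(my_graph_dir / name, graph_dst / name)

    # Replace the whole phones/ directory so no files from the sample remain.
    phones_dst = graph_dst / "phones"
    if phones_dst.exists():
        shutil.rmtree(phones_dst)
    shutil.copytree(my_graph_dir / "phones", phones_dst)
```

Keeping an untouched copy of the extracted sample is advisable, so that you can return to the state verified in Step 1.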
You may see an error message when performing the step “(1) Launch Kaldi” described in readme_en.txt. The error message usually describes the cause of the crash. If you use iVector, you need to replace the files under kaldi_conf/chain_sample/extractor with the iVectorExtractor generated when training your model. If you do not use iVector, you need to delete the "--ivector-extraction-config=kaldi_conf/ivector_extractor.conf" line from kaldi_conf/online.conf. Furthermore, the number of contexts written in kaldi_conf/conf/splice.conf may differ from the value used when you trained your acoustic model. In that case, it needs to be modified. These values are determined by your acoustic model training settings.
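If you do not use iVector, the removal of the option line from kaldi_conf/online.conf can be done by hand in any text editor. For reference, here is a minimal sketch of the same edit in Python. The function names drop_ivector_option and patch_online_conf are hypothetical, and the sketch assumes the config file holds one option per line; it writes a .bak copy before changing the file.

```python
from pathlib import Path

# Option name taken from the line quoted in the text above.
IVECTOR_OPTION = "--ivector-extraction-config"


def drop_ivector_option(conf_text):
    """Return conf_text without any line that sets --ivector-extraction-config."""
    kept = [
        line
        for line in conf_text.splitlines()
        if not line.strip().startswith(IVECTOR_OPTION)
    ]
    return "\n".join(kept) + "\n"


def patch_online_conf(conf_path="kaldi_conf/online.conf"):
    """Rewrite the config file in place, keeping a backup next to it."""
    path = Path(conf_path)
    original = path.read_text()
    # Keep the original as online.conf.bak (assumed backup name).
    path.with_name(path.name + ".bak").write_text(original)
    path.write_text(drop_ivector_option(original))
```

Run patch_online_conf() from the directory that contains kaldi_conf/, then try “(1) Launch Kaldi” again.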
If the error message persists at this point, please send us a screenshot of it. If the error message disappears, KaldiDecoder has started successfully. You can then work with HARK by matching the features in the next step.
Step 3: If you execute “(2) Execute HARK” in readme_en.txt as it is, KaldiDecoder will crash with an error message saying that the dimensions differ. If the splice and sub-sampling-factor settings are appropriate, you can resolve this by matching the feature dimension and changing the feature type.
In the sample we provide, the number of feature dimensions is set to 40. Please change it to match the number of dimensions of the features used to train your acoustic model.
Note: This sample set includes two network files. One is practice2-2-2.n, the online version, which runs in real time with a microphone array. The other is practice2-2-2_offline.n, the offline version, which processes a WAV file recorded with a microphone array. Both are configured for the TAMAGO microphone array. We recommend that you first test with the offline version.
Step 4: If no error message is displayed but the recognition result is incorrect, check the following. We recommend MSLS features, which HARK can generate, for speech recognition, but they are not commonly used. If your model was created using the usual procedure in Kaldi, MFCC features should be used. practice2-2-2.n and practice2-2-2_offline.n use the MSLSExtraction node. By replacing the MSLSExtraction node of the network with the MFCCExtraction node in HARK-Designer, you can connect correctly to an acoustic model trained with general MFCC features.
HARK Support Team