I am not familiar with the KALDI GUI, so I will suggest two solutions; please choose the one that suits your purpose.
1. How to use a third-party external Kaldi decoder.
If an external Kaldi decoder expects a PCM stream, it can be obtained in the following way.
HARK has a node called HarkDataStreamSender, which can obtain PCM data (and localization data if necessary) via Socket.
Please refer to the HarkDataStreamSender documentation for usage.
In this case, you need to parse the PCM data coming from HARK and reformat it to match the input of the decoder you use. Many people use a small script in a language such as Python or Node.js for this purpose.
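As a starting point, such a relay script can look like the sketch below. The framing used here (a 4-byte little-endian length prefix followed by raw 16-bit PCM) and the port number are assumptions for illustration only; consult the HarkDataStreamSender documentation for the actual header layout and match the port to your HARK network file.

```python
# Minimal sketch of a relay between HarkDataStreamSender and an external
# decoder. ASSUMPTION: each packet is a 4-byte little-endian size field
# followed by that many bytes of raw PCM; check the HarkDataStreamSender
# documentation for the real header format before using this.
import socket
import struct
import sys

HOST, PORT = "127.0.0.1", 5530  # hypothetical port; match your HARK network


def recv_exact(conn, n):
    """Read exactly n bytes from the socket, or raise EOFError on close."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise EOFError("sender closed the connection")
        buf += chunk
    return buf


def relay():
    """Accept one connection from HARK and forward PCM payloads."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((HOST, PORT))
    srv.listen(1)
    conn, _ = srv.accept()  # HARK connects when the network starts
    try:
        while True:
            (size,) = struct.unpack("<I", recv_exact(conn, 4))
            pcm = recv_exact(conn, size)
            # Reformat here to whatever your decoder expects,
            # e.g. write raw samples to its stdin:
            sys.stdout.buffer.write(pcm)
    except EOFError:
        pass
    finally:
        conn.close()
        srv.close()
```

Call `relay()` to start listening; the reformatting step in the loop is where you adapt the bytes to your decoder's input format.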
2. How to use the KaldiDecoder we have provided.
HARK has a function to convert PCM data to MSLS (or MFCC) features, so it sends a stream of features, rather than raw audio, to KaldiDecoder. The features are sent from HARK using the SpeechRecognitionClient node.
If you are not sure how to set the file paths given to KaldiDecoder, please refer to the following sample file.
The config file included in this sample uses relative paths. Because these are resolved relative to the working directory at execution time, we recommend using absolute paths if you will launch the decoder from various locations.
For example, based on your screenshot, the config would look like the attached file. Since the location of the acoustic model (final.mdl) could not be determined from your screenshot, that part is a dummy path. You can rewrite the other paths in the same way, so please try it.
If your acoustic model does not use iVectors, you need to remove the iVector configuration line from the attached config file.
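To illustrate the two points above, a config fragment with absolute paths might look like the following. The option names and file locations here are placeholders taken from typical Kaldi online-decoding setups, not from your actual model; adapt them from the sample config shipped with KaldiDecoder.

```
# Hypothetical example: absolute paths instead of relative ones.
--config=/home/user/kaldi_model/conf/online.conf
# Remove the following line if your acoustic model does not use iVectors:
--ivector-extraction-config=/home/user/kaldi_model/conf/ivector_extractor.conf
```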
Note: This sample uses our own MSLS features, so if you want to adapt it, you will need to replace the MSLSExtraction node with the MFCCExtraction node. The MFCC features generated by the MFCCExtraction node are compatible with those used by the HTK Toolkit, Kaldi, etc. Make sure the number of dimensions and the presence or absence of the Delta and Acceleration coefficients match your acoustic model.
HARK Support Team
- This reply was modified 1 year, 10 months ago by Masayuki Takigahira.