Masayuki Takigahira

Forum Replies Created

Viewing 15 posts - 16 through 30 (of 61 total)
    in reply to: Can wios work on Windows10? #1116

    The Windows version of wios supports only network recording, the function provided by devices such as the RASP-24. In other words, wios can be used only when recording over the network with a USB-LAN or USB-Wireless dongle inserted into the USB port of the RASP-ZX.

    Currently, when the device is connected directly with a USB cable, you can only create a WAV file with a third-party recording tool such as Audacity. HARKTOOL5 can create a transfer function with a Complex Regression Model, which does not require synchronized recording.

    If you need a TSP wav file created by synchronized recording, the following workaround also exists.

    Please create a virtual machine with Ubuntu installed, using VMware or VirtualBox on Windows, and connect the RASP-ZX to that virtual machine. In that case, you can record with ALSA.

    Best regards,


    I noticed a mistake in the post below and deleted it immediately after posting. At this time, wios supports neither WASAPI nor DirectSound (DS). It supports only the RASP protocol, which the standard Windows API cannot handle.

    The RASP-24 is connected via a LAN. The recorded data is transmitted over the network using SiF's own protocol and recorded through SiF's own interface.

    Devices that support USB Audio Class (UAC) and are connected via USB, such as the TAMAGO-03, are recorded through the WASAPI or DirectSound (DS) interface.

    The RASP-ZX supports two connection methods.
    If it is connected directly to a PC with a USB cable, it is recorded through the WASAPI or DS interface. On the other hand, if a USB-LAN or USB-Wireless dongle is inserted into its USB port and you connect via a network, it uses SiF's own protocol, the same as the RASP-24.

    You need to choose a wios command depending on which connection method you use.

    HARK has supported WASAPI since version 3.0, but wios has not yet completed its WASAPI support, so DirectSound (DS) will be used. This difference should have almost no practical effect.

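    Best regards,

    > The only model I can prepare in my environment in the near term is an nnet1 model,
    > so I would like to get this working with nnet1 somehow.

    > From what you explained, it looks like nnet3 would require redoing the Kaldi model creation.
    > Is the same true for nnet1?

    The network file included in HARK_recog_2.3.0.1_practice2 uses a different number of feature dimensions, so it needs to be modified. A sample HARK network file that produces 40-dimensional features is included in (link not preserved). Note that you need to replace the part that generates MSLS features with the MSLSExtraction node with an MFCCExtraction node. Set the parameters of the MFCCExtraction node so that it outputs 40 dimensions, in the same way as MSLSExtraction, and swap it in.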




    If an nnet3-format chain model is acceptable, a trained model can be downloaded from the page below.



    Please note that the features have changed from 27 dimensions (13 feature dimensions + 13 delta feature dimensions + the delta power feature)
    to 40 feature dimensions.


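    In the HARK Forum thread below, the changes from the official Kaldi training recipe (for nnet3) ...

    For other settings, please refer to the kaldi_conf directory in (link not preserved).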



    in reply to: SemiBlindICA Connection #1023

    If the new question is related to this post, please post in the same thread. Conversely, if the new question is not relevant, please post in a new thread with the appropriate subject. It will probably help people with the same problem to find a solution.

    We (the HARK support team) hope this forum will help you.

    Best regards,

    in reply to: Recording TSP #1020

    Thank you for your inquiry.

    Please check the following points.

    If your goal is to separate human voices with HARK, that is, if a 16 kHz sampling rate is sufficient, then:

    1. The TSP response file is recorded at 48 kHz, so please downsample it to 16 kHz, because 16384.little_endian.wav is a TSP file for the 0 to 8 kHz band.
    2. HARKTOOL creates a transfer function from the TSP response file for the 0 to 8 kHz band. In other words, you do not need to change the HARKTOOL parameter settings for the number of samples.
    3. In the HARK network file, connect a MultiDownSampler node after the AudioStreamFromMic node and downsample from 48 kHz to 16 kHz. The LocalizeMUSIC and GHDSS nodes then use the transfer function for the 0 to 8 kHz band.

    Normally, 16 kHz is sufficient to process the human voice band.

    If you need to separate very high frequency bands, such as electronic sounds, do not use 16384.little_endian.wav. You need to recreate the TSP file itself for 48 kHz, that is, a TSP file that extends up to 24 kHz, the Nyquist frequency.

    In this case, the 786,432 samples that you calculated in your post are correct.
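
The sample counts above can be checked with a little arithmetic. The sketch below assumes, as an interpretation that the post does not state, that 786,432 corresponds to 16 repetitions of a 16,384-sample TSP at 16 kHz, stretched by a factor of 3 for 48 kHz. All names are illustrative.

```python
# Sanity check of the TSP sample counts discussed above.
BASE_RATE_HZ = 16_000     # rate that 16384.little_endian.wav targets
TARGET_RATE_HZ = 48_000   # recording rate of the device
TSP_SAMPLES_16K = 16_384  # samples in one TSP period at 16 kHz
REPETITIONS = 16          # assumption: the TSP is played 16 times


def nyquist_hz(rate_hz):
    """Highest frequency representable at the given sampling rate."""
    return rate_hz / 2


def scaled_samples(samples, from_rate_hz, to_rate_hz):
    """Number of samples covering the same duration at another rate."""
    return samples * to_rate_hz // from_rate_hz


tsp_samples_48k = scaled_samples(TSP_SAMPLES_16K, BASE_RATE_HZ, TARGET_RATE_HZ)
total_samples_48k = tsp_samples_48k * REPETITIONS

print(nyquist_hz(BASE_RATE_HZ))    # 8000.0
print(nyquist_hz(TARGET_RATE_HZ))  # 24000.0
print(tsp_samples_48k)             # 49152
print(total_samples_48k)           # 786432
```

If the repetition count in your recording differs, change REPETITIONS accordingly; the per-period figure does not depend on it.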

    If you can read MATLAB scripts, my script may be helpful. It generates the TSP file and the inverse TSP file from the sampling rate and other settings you specify. The TSP file's channel is duplicated so that wios does not fail when the playback device is stereo.

    I think my script probably also works with MATLAB clones (e.g., Octave), but I will attach a 48 kHz sample just in case.

    Best regards,

    in reply to: problem with wios #1012

    I’m sorry, I noticed that there was an error in the command I posted before.
    I apologize for the confusion caused by the wrong information.

    Please try the following command:
    In this example, plughw:1,0 is a recording device and plughw:0,0 is a playback device.

    wios -s -y 0 -d plughw:0,0 -z 0 -a plughw:1,0 -c 8 -i input.wav -o output.wav

    The meaning of -y 0 is that the playback device is an ALSA device.
    The meaning of -d plughw:0,0 is that the playback device is plughw:0,0 .
    The meaning of -z 0 is that the recording device is an ALSA device.
    The meaning of -a plughw:1,0 is that the recording device is plughw:1,0 .
    The -y and -z options are used instead of the -x option.
    Because the recording and playback devices are different (there are two), you need both -y and -z. If you can record and play back on a single device, e.g., a RASP, use only -x .
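
If you launch wios from a script, the option pairs explained above can be assembled programmatically. This is only a sketch: build_wios_sync_command is a hypothetical helper, and it uses no wios options beyond those that appear in the command above.

```python
def build_wios_sync_command(playback_dev, capture_dev, channels, infile, outfile):
    """Assemble a synchronized play/record wios command for two ALSA devices."""
    return [
        "wios", "-s",
        "-y", "0", "-d", playback_dev,  # playback: ALSA device type, then device name
        "-z", "0", "-a", capture_dev,   # recording: ALSA device type, then device name
        "-c", str(channels),
        "-i", infile,
        "-o", outfile,
    ]


cmd = build_wios_sync_command("plughw:0,0", "plughw:1,0", 8, "input.wav", "output.wav")
print(" ".join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) on a machine where wios is installed.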

    Best regards,


    > hark-base
    > harkmw
    > libharkio3
    > libhark-netapi
    > hark-core
    > harktool5

    *1) FAQ: What are the supported architectures by HARK?

    On ARM processors, with the optimization options specified via CFLAGS, CXXFLAGS, and so on, ...


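    Could you add the following environment variable in .bashrc or a similar file and then try again?
    If you write it in .bashrc, it takes effect after you restart the terminal (bash).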



    OPENBLAS_NUM_THREADS=1 harkmw ./<your_network_file>.n

    Since HARK 3.0, the execution command has been renamed to harkmw, but
    you can still write batchflow in place of harkmw, as before.


    in reply to: SemiBlindICA Connection #1001

    First, let me cover factors other than the parameters.

    If the overall system is running slowly, the following cause is possible. When multithreaded processing runs on small matrices with fewer than a few thousand elements, the work is divided too finely, and the resulting overhead slows down the OpenBLAS library.

    Therefore if you are using HARK 3.x, try setting the environment variable OPENBLAS_NUM_THREADS=1 .

    For example, set the following line in the .bashrc file:

    export OPENBLAS_NUM_THREADS=1

    In this case, the variable takes effect in newly opened terminals.

    Or you can do the following when you run the network file:

    OPENBLAS_NUM_THREADS=1 harkmw ./your_networkfile.n

    In this case, the variable applies only to that command.
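
If the network file is started from a Python script rather than a shell, the same per-command setting can be passed through the child process environment. A minimal sketch; harkmw_invocation is a hypothetical helper and the network file name is a placeholder.

```python
import os


def harkmw_invocation(network_file, threads=1):
    """Return (command, environment) with the OpenBLAS thread count capped."""
    env = dict(os.environ)                      # copy, so the parent is untouched
    env["OPENBLAS_NUM_THREADS"] = str(threads)  # same variable as in the post
    return ["harkmw", network_file], env


cmd, env = harkmw_invocation("./your_networkfile.n")
print(cmd, env["OPENBLAS_NUM_THREADS"])
# To actually launch it: subprocess.run(cmd, env=env, check=True)
```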

    If only the processing speed of HARK is decreasing, the following causes may be considered. Please check the following contents.

    • Make sure that the known sound signal input to the HarkMsgsStreamFromRos node is not intermittent. For example, if a robot performs TTS, the sound signal must be transmitted not only during speech but also during silence periods.
    • Make sure that the sampling rate of the microphone array input matches the sampling rate of the sound signal input to the HarkMsgsStreamFromRos node. If it is difficult to match, please adjust the input of HarkMsgsStreamFromRos node to the sampling rate of the microphone array by MultiDownSampler node. Note that if you lower the sampling rate on the microphone array by MultiDownSampler node, you need to recreate the transfer function.

    If the parameters of the SemiBlindICA node seem to be the cause, please check the following.

    • Please check the background noise level contained in the microphone array input. No estimation is needed during periods when REFERENCE carries no sound signal (silence periods), so the SemiBlindICA node’s IS_ZERO parameter sets the level below which INPUT does not need to be processed. Note that this parameter is specified as power in the frequency domain.
    • Please save the outputs of the AudioStreamFromMic node and the HarkMsgsStreamFromRos node with SaveWavePCM. If you compare the sound signals recorded in the two WAV files, you should see the following:
      The waveform of the audio signal obtained at the HarkMsgsStreamFromRos node should appear in the AudioStreamFromMic recording with a slight delay. The larger this delay, the larger the TAP parameter needs to be, but an excessively large value increases the amount of calculation. However, if processing is slow enough to be mistaken for a hang, I do not think this parameter is the cause.
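
The delay between the two recordings can also be estimated numerically instead of by eye. The sketch below is a plain cross-correlation over short sample lists and is not part of HARK. How a delay in samples maps onto the TAP value is not stated in this post, so treat the result only as a rough guide.

```python
def estimate_delay(reference, recorded, max_lag):
    """Return the lag (in samples) at which `recorded` best matches `reference`."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        n = min(len(reference), len(recorded) - lag)
        if n <= 0:
            break
        # Correlation between the reference and the recording shifted by `lag`.
        score = sum(reference[i] * recorded[i + lag] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic check: a short burst, and the same burst arriving 5 samples later.
reference = [0.0] * 8 + [1.0, -0.5, 0.25, 0.75] + [0.0] * 20
recorded = [0.0] * 5 + reference
print(estimate_delay(reference, recorded, max_lag=16))  # 5
```

For real data, load one channel from each WAV file saved by SaveWavePCM and pass a short excerpt; this pure-Python loop is too slow for long recordings.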

    Best regards,

    in reply to: SemiBlindICA Connection #988

    Thank you for your inquiry.

    I have looked at your screenshot. The main cause is that the HarkMsgsStreamFromRos node is placed on the MAIN sheet. Please move the HarkMsgsStreamFromRos node to the MAIN_LOOP sheet and connect its MATOUT terminal to the MultiFFT node (the one connected to REFERENCE) on the same sheet. The node has to be inside the loop because the calculation is performed frame by frame, once per iteration.

    Best regards,

    in reply to: problem with wios #975

    Thank you for reporting this error.
    I also appreciate your help in confirming the behavior of the debug version.

    We have confirmed that this error is caused by compiler optimization. As a workaround, we are preparing a package built with a reduced optimization level. It will be released in the next few days.

    The version with the bug fixed will be “WIOS”. Please wait for it to be posted on “News” page.

    Best regards,

    in reply to: Generating transfer function #959

    I prepared three TSP files recorded at different locations with different microphone arrays. The figure below shows an excerpt of one channel from each, opened in Audacity as a spectrogram.

    In the lower two examples you will see frequency components other than the original TSP signal. These are due to harmonics or reflected waves. It is desirable to set the volume as high as possible without introducing such extra components.
    All three examples are TSP files that work without any performance problems. In other words, the volume limit is the point at which harmonics and reflected waves are just barely visible, as in the lower examples. Be careful not to let the lines caused by harmonics or reflected waves become any stronger than that. Ideally, adjust the volume so that they are invisible, as in the top example.


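    in reply to: How to handle sound source localization results in hark-python #953


    Because the output is fed into nodes such as DisplayLocalization, the output of the PyCodeExecutor3 node must be of type Vector<Source>. The input and output types of every node are listed in the node reference of HARK-Document, so please refer to it for details.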


    self.outputValues["output"] = self.input[0]
    self.outputValues["output"] = [self.input[0]]
    self.outputValues["output"] = [] if len(self.input)<1 else [self.input[0]]


    self.outputValues["output"] = self.input
    self.outputValues["output"] = self.input[0]

    The type that is set at the point where the calculate() method returns ...


    in reply to: problem with wios #950

    > Unfortunately, it did not give me any new information on how to overcome my problem with wios (see results of my wios_check in attachments).

    No. I saw your results and understood at least one of the causes.

    You can use following channels argument with the wios command.
        -c <channels> (same as --channels <channels>)
        *) Please select one of ( 2 ) as <channels>.

    This means that your device only supports 2 channels. If your device supports 1 or 2 channels, you will see the following message:

    You can use following channels argument with the wios command.
        -c <channels> (same as --channels <channels>)
        *) Please select one of ( 1-2 ) as <channels>.

    Also, while wios defaults to 16 kHz for the sampling rate, your device does not support 16 kHz. Your device supports only the sampling rates below. In other words, 44.1 kHz, 48 kHz, etc. should be selected.

    your playback device:

    You can use following sampling rate argument with the wios command.
        -f <rate> (same as --frequency <rate>)
        *) Please select one of ( 44100 48000 96000 192000 ) as <rate>.

    your capture device:

    You can use following sampling rate argument with the wios command.
        -f <rate> (same as --frequency <rate>)
        *) Please select one of ( 44100 48000 96000 ) as <rate>.

    You will need to do the following:
    – The tsp1.wav that you used is 1-channel (mono) data, but you need to make it 2-channel (stereo).
    e.g.) If you use a tool called sox , the following command duplicates the channel after rate conversion:
    sox tsp1.wav -r 48000 tsp2.wav remix 1 1
    Then use the output tsp2.wav instead.
    – Please set the number of channels explicitly with -c 2 for your device, not -c 1 .
    – Please set the sampling rate explicitly, e.g. -f 48000 , for your device, not -f 16000 (the wios default if not set).

    The following commands should work on your device.
    arecord -d 16 -D plughw:0,0 -f S32_LE -c 2 -r 48000 output.wav
    The equivalent wios command is:
    wios -r -x 0 -t 16 -a plughw:0,0 -e 32 -c 2 -f 48000 -o output.wav

    The following commands should work on your device.
    aplay -D plughw:0,0 tsp.wav
    The equivalent wios command is:
    wios -p -x 0 -d plughw:0,0 -i tsp.wav
    *) tsp.wav must be 48 kHz, 32-bit (or 16-bit) 2-channel data.
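
Before playback, the WAV header can be checked against these requirements. The sketch below uses Python's standard wave module, which reads plain PCM files. check_wav_format and the demo file are illustrative, and the expected values are simply those stated above.

```python
import os
import tempfile
import wave


def check_wav_format(path, channels=2, rate_hz=48000, allowed_bits=(16, 32)):
    """Return a list of mismatches between a WAV header and the expected format."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != channels:
            problems.append(f"channels: {wav.getnchannels()} (expected {channels})")
        if wav.getframerate() != rate_hz:
            problems.append(f"rate: {wav.getframerate()} Hz (expected {rate_hz} Hz)")
        bits = wav.getsampwidth() * 8
        if bits not in allowed_bits:
            problems.append(f"sample size: {bits} bits (expected one of {allowed_bits})")
    return problems


# Demo with a throwaway mono 16 kHz file, which is flagged for channels and rate.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "mono16k.wav")
    with wave.open(demo_path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(16000)
        out.writeframes(b"\x00\x00" * 160)
    problems = check_wav_format(demo_path)
    print(problems)
```

An empty list from check_wav_format("tsp.wav") means the header matches the format described above.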

    In this example, the command for synchronized recording and playback on your devices is:
    wios -s -y 0 -d plughw:0,0 -z 0 -a plughw:0,0 -e 32 -c 2 -f 48000 -i tsp.wav -o output.wav

    If you get an error message about buffers (for example, buffer overruns), you can work around it by using the -N and -Z options to set a buffer size larger than the default.

    I hope this answer will help you solve your problem.

    Best regards,

    in reply to: problem with wios #947

    It seems that you need to add -std=c++11 as a compile-time option in environments other than Ubuntu 18.04 or later. This is because the source code was written based on the C++11 specification.

    in reply to: problem with wios #942

    @paul: The upload failed due to a problem with the file permission. Thank you for contacting me.

    For those who visited later …

    This post was corrected on 27th June 2019.
    Because an error was found in the attached file, it has been uploaded again.
    Please download
