Masayuki Takigahira

Forum Replies Created

Viewing 15 posts - 16 through 30 (of 57 total)
in reply to: Recording TSP #1020

    Thank you for your inquiry.

    Please check the following points.

    Case.1:
You want to separate human voices with HARK; in other words, a sampling rate of 16 kHz is sufficient. In that case:

1. Although the TSP response file is recorded at 48 kHz, please downsample it to 16 kHz, because 16384.little_endian.wav is a TSP file for the 0 to 8 kHz band. (A downsampling sketch is shown right after this list.)
2. HARKTOOL creates a transfer function from the TSP response file for the 0 to 8 kHz band. In other words, it is not necessary to change the parameter settings for the number of samples in HARKTOOL.
3. In the HARK network file, connect a MultiDownSampler node after the AudioStreamFromMic node and downsample from 48 kHz to 16 kHz. The LocalizeMUSIC and GHDSS nodes then use the transfer function for the 0 to 8 kHz band.
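
As a rough illustration of step 1, here is a minimal Python sketch (the file names are hypothetical, and using scipy / soundfile for this is my own assumption, not part of the HARK tools):

    import soundfile as sf                    # assumption: pysoundfile is installed
    from scipy.signal import resample_poly    # polyphase resampler

    # Read the 48 kHz TSP response recording and downsample it to 16 kHz (ratio 1/3).
    data, fs = sf.read("tsp_response_48k.wav")        # hypothetical input file
    assert fs == 48000
    data_16k = resample_poly(data, up=1, down=3, axis=0)
    sf.write("tsp_response_16k.wav", data_16k, 16000)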

    Normally, 16 kHz is sufficient to process the human voice band.

Case.2:
You need to separate signals up to very high frequency bands, such as electronic sounds. In that case, do not use 16384.little_endian.wav. You need to recreate the TSP file itself for 48 kHz, i.e., a TSP file that covers up to 24 kHz, the Nyquist frequency.

In this case, the 786,432 samples that you mentioned in your post are correct.

If you can read MATLAB scripts, my script may be helpful. In my code, the TSP file and the inverse TSP file are generated by specifying the sampling rate and other parameters. The reason for duplicating the channel of the TSP file is to ensure that wios does not fail if the playback device is stereo.

I think my script also works with MATLAB clones (e.g., Octave), but I will attach a 48 kHz sample just in case.
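
For reference, a rough Python sketch of the same idea is shown below. This is not the MATLAB script mentioned above; the stretch parameter, the file names, and the use of numpy / soundfile are my own assumptions:

    import numpy as np
    import soundfile as sf            # assumption: pysoundfile is available

    fs = 48000                        # target sampling rate
    N = 786432                        # TSP length in samples, as discussed above
    m = N // 4                        # stretch parameter; N/4 is a common choice (assumption)

    # Quadratic-phase (swept-sine) spectrum of an up-TSP, one-sided for a real signal.
    k = np.arange(N // 2 + 1)
    H = np.exp(-1j * 4 * np.pi * m * k**2 / N**2)
    h = np.fft.irfft(H, n=N)          # time-domain TSP
    h = np.roll(h, N // 2 - m)        # rotate the sweep toward the middle of the frame
    h *= 0.9 / np.max(np.abs(h))      # normalize with a little headroom

    h_inv = np.roll(h[::-1], 1)       # circular time reversal gives the inverse TSP

    # Duplicate the channel so playback on a stereo device (e.g., via wios) does not fail.
    sf.write("tsp_48k.wav", np.column_stack([h, h]), fs, subtype="PCM_16")
    sf.write("itsp_48k.wav", np.column_stack([h_inv, h_inv]), fs, subtype="PCM_16")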

    Best regards,

    in reply to: problem with wios #1012

    I’m sorry, I noticed that there was an error in the command I posted before.
    I apologize for the confusion caused by the wrong information.

    Please try the following command:
In this example, plughw:1,0 is the recording device and plughw:0,0 is the playback device.

    
    wios -s -y 0 -d plughw:0,0 -z 0 -a plughw:1,0 -c 8 -i input.wav -o output.wav
    

    The meaning of -y 0 is that the playback device is an ALSA device.
    The meaning of -d plughw:0,0 is that the playback device is plughw:0,0 .
    The meaning of -z 0 is that the recording device is an ALSA device.
    The meaning of -a plughw:1,0 is that the recording device is plughw:1,0 .
The -y and -z options are used instead of the -x option.
Because the recording and playback devices are different (there are two devices), you need -y and -z. If you can record and play back on a single device, e.g. a RASP, use only -x .

    Best regards,

Thank you for your inquiry.

> hark-base
> harkmw
> libharkio3
> libhark-netapi
> hark-core
> harktool5

Currently, the only officially supported architecture is Intel x86_64 (*1), so am I correct in understanding that all of the packages above were built from source?
*1) FAQ: What are the supported architectures by HARK?

On ARM processors, performance can vary greatly depending on the optimization options specified via CFLAGS / CXXFLAGS. If you have not set them, please consider doing so.

From here on, I will assume that compiler options appropriate for your CPU have been set.

Could you please try again after adding the following environment variable in .bashrc or a similar file?
If you write it in .bashrc, it takes effect after you restart the terminal (bash).
If you run from HARK-Designer, restart the terminal from which you launch the hark_designer command, and then restart HARK-Designer itself.

    
    export OPENBLAS_NUM_THREADS=1
    

If other applications also use OpenBLAS and you do not want the environment variable to apply globally (to every bash session), you can apply it to a single command only by running HARK as shown below.
Note that with this method the setting does not take effect when you run from HARK-Designer; you must run HARK from the command line.

    
    OPENBLAS_NUM_THREADS=1 harkmw ./<your_network_file>.n
    

Since HARK 3.0, the execution command has been renamed to harkmw, but you can still write batchflow instead of harkmw. It is kept as an alias for compatibility, so the functionality is the same.

Best regards,

    in reply to: SemiBlindICA Connection #1001

First of all, let me discuss factors other than the parameters.

    Case.1:
If the overall system is slowing down, the following cause may be considered: when thread processing is applied to a small matrix with fewer than a few thousand elements, the computation is divided into excessively small units, and the resulting overhead slows down the OpenBLAS library.

Therefore, if you are using HARK 3.x, try setting the environment variable OPENBLAS_NUM_THREADS=1 .

    For example, set the following line in the .bashrc file:

    
    export OPENBLAS_NUM_THREADS=1
    

In this case, the variable takes effect in newly opened terminals.

    Or you can do the following when you run the network file:

    
    OPENBLAS_NUM_THREADS=1 harkmw ./your_networkfile.n
    

In this case, the variable applies only to that command.

    Case.2:
If only the processing speed of HARK is decreasing, the following causes may be considered. Please check the following:

• Make sure that the known sound signal input to the HarkMsgsStreamFromRos node is not intermittent. For example, if a robot performs TTS, the sound signal must be transmitted not only during speech but also during silent periods.
• Make sure that the sampling rate of the microphone array input matches the sampling rate of the sound signal input to the HarkMsgsStreamFromRos node. If it is difficult to match them, please adjust the input of the HarkMsgsStreamFromRos node to the sampling rate of the microphone array using a MultiDownSampler node. Note that if instead you lower the sampling rate of the microphone array with a MultiDownSampler node, you need to recreate the transfer function.

    Case.3:
If the parameters of the SemiBlindICA node seem to be the cause, please check the following:

• Please check the background noise level contained in the microphone array input. The SemiBlindICA node’s IS_ZERO parameter is used to decide which INPUT levels do not need to be processed, since no estimation needs to be performed during periods in which the REFERENCE input contains no sound signal (silent periods). Note that this parameter is specified as power in the frequency domain.
• Please save the outputs of the AudioStreamFromMic node and the HarkMsgsStreamFromRos node with SaveWavePCM. If you compare the sound signals recorded in the two WAV files, you should see something like this:
  The waveform of the audio signal obtained at the HarkMsgsStreamFromRos node should appear in the AudioStreamFromMic recording with a slight delay. The larger that delay is, the larger the TAP parameter needs to be, but an excessively large value increases the amount of computation. (A rough sketch for estimating the delay is shown right after this list.) However, if the processing is so slow that it could be mistaken for a hang-up, I do not think this parameter is the cause.
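
As a rough way to estimate that delay from the two recordings, here is a minimal Python sketch (the file names are hypothetical, and using scipy / soundfile here is my own assumption):

    import numpy as np
    import soundfile as sf                       # assumption: pysoundfile is available
    from scipy.signal import correlate

    # Hypothetical names for the two recordings saved with SaveWavePCM as described above.
    mic, fs_mic = sf.read("mic_array.wav")
    ref, fs_ref = sf.read("reference.wav")
    assert fs_mic == fs_ref

    mic = mic[:, 0] if mic.ndim > 1 else mic     # compare one channel of each file
    ref = ref[:, 0] if ref.ndim > 1 else ref
    n = min(len(mic), len(ref))

    corr = correlate(mic[:n], ref[:n], mode="full", method="fft")
    delay = int(np.argmax(corr)) - (n - 1)       # positive: mic lags the reference
    print(f"estimated delay: {delay} samples ({1000.0 * delay / fs_mic:.1f} ms)")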

    Best regards,

    in reply to: SemiBlindICA Connection #988

    Thank you for your inquiry.

I have checked your screenshot. The main cause is that the HarkMsgsStreamFromRos node is placed on the MAIN sheet. Please move the HarkMsgsStreamFromRos node to the MAIN_LOOP sheet and connect its MATOUT terminal to the MultiFFT node on the same sheet (the one connected to REFERENCE). The iteration sheet is what performs the frame-by-frame calculations.

    Best regards,

    in reply to: problem with wios #975

    Thank you for reporting this error.
I also appreciate your help in confirming the operation of the debug version.

We have confirmed that this error is caused by compiler optimization. As a workaround, we are currently preparing a package built with a reduced optimization level for publication. It will be released in the next few days.

The version with the bug fixed will be “WIOS 3.0.7.1”. Please wait for it to be posted on the “News” page.

    Best regards,

    in reply to: Generating transfer function #959

I prepared three TSP recordings made at different locations with different microphone arrays. The figure below shows an excerpt of one channel from each, opened in Audacity to display the spectrogram.

In the lower two examples you can see frequency components other than the original TSP signal. These are due to harmonics or reflected waves. It is desirable to use the largest volume at which such extra components do not appear.
All three examples below are TSP recordings that work in practice without any problem; in other words, they are at the limit volume at which harmonics and reflected waves are just barely visible. Be careful not to let the lines caused by harmonics or reflected waves become any stronger than this. Ideally, adjust the volume so that they are invisible, as in the example at the top.

    tsp.png
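
If you prefer to check this without Audacity, a minimal Python sketch using matplotlib can draw the same kind of spectrogram (the file name is hypothetical; matplotlib / soundfile are my own choice of tools here):

    import matplotlib.pyplot as plt
    import soundfile as sf                       # assumption: pysoundfile is available

    data, fs = sf.read("tsp_recorded.wav")       # hypothetical TSP recording
    ch0 = data[:, 0] if data.ndim > 1 else data  # inspect one channel

    # Look for lines other than the TSP sweep (harmonics or reflections).
    plt.specgram(ch0, NFFT=1024, Fs=fs, noverlap=512)
    plt.xlabel("time [s]")
    plt.ylabel("frequency [Hz]")
    plt.show()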

in reply to: How to handle sound source localization results with hark-python #953

Thank you for your inquiry.

Since the output is fed into nodes such as DisplayLocalization, the output of the PyCodeExecutor3 node must be of type Vector<Source>. The input and output types of every node are listed in the node reference of the HARK-Document, so please refer to it for details.

In other words, in this case the type of the output terminal must be set not as
self.outputTypes=("prime_source",)
but as
self.outputTypes=("vector_source",)

Accordingly, the line written as
self.outputValues["output"] = self.input[0]
also needs to be changed as follows:
self.outputValues["output"] = [self.input[0]]
Furthermore, since the input Vector may have size 0 (no localization result), the code may crash at run time unless you write it as follows:
self.outputValues["output"] = [] if len(self.input)<1 else [self.input[0]]
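
Putting the pieces above together, a minimal sketch of the corrected class might look like the following. Only the outputTypes declaration and the guarded assignment come from this thread; the class name and the surrounding structure are assumptions:

    # Sketch only: class name and boilerplate are assumptions; self.outputValues is
    # the dict that PyCodeExecutor3 reads after calculate() returns, as used above.
    class LocalizationPassThrough:
        def __init__(self):
            # Downstream nodes such as DisplayLocalization expect Vector<Source>.
            self.outputTypes = ("vector_source",)

        def calculate(self):
            # self.input is the Vector<Source> arriving at the "input" terminal.
            # Output an empty vector when there is no localization result,
            # otherwise wrap the first source in a list (a Vector of size 1).
            self.outputValues["output"] = [] if len(self.input) < 1 else [self.input[0]]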

As an aside, if you write to the same output terminal twice, as in

self.outputValues["output"] = self.input
self.outputValues["output"] = self.input[0]

only the value written last is used.
Please note that an error occurs if the type actually written when the calculate() method returns does not match the declared type.

Best regards,

    in reply to: problem with wios #950

    > Unfortunately, it did not give me any new information on how to overcome my problem with wios (see results of my wios_check in attachments).

    @paul: No. I saw your results and understood at least one of the causes.

    
    You can use following channels argument with the wios command.
        -c <channels> (same as --channels <channels>)
        *) Please select one of ( 2 ) as <channels>.
    

This means that your device supports only 2 channels. If your device supported 1 or 2 channels, you would instead see the following message:

    
    You can use following channels argument with the wios command.
        -c <channels> (same as --channels <channels>)
        *) Please select one of ( 1-2 ) as <channels>.
    

Also, while wios defaults to 16 kHz for the sampling rate, your device does not support 16 kHz. Your device supports only the sampling rates listed below; in other words, you should select 44.1 kHz, 48 kHz, etc.

    your playback device:

    
    You can use following sampling rate argument with the wios command.
        -f <rate> (same as --frequency <rate>)
        *) Please select one of ( 44100 48000 96000 192000 ) as <rate>.
    

    your capture device:

    
    You can use following sampling rate argument with the wios command.
        -f <rate> (same as --frequency <rate>)
        *) Please select one of ( 44100 48000 96000 ) as <rate>.
    

    ——
You will need to do the following:
– The tsp1.wav that you used is 1-channel (mono) data, but you need to make it 2-channel (stereo).
e.g.) If you use a tool called sox, the following will duplicate the channel after rate conversion:
sox tsp1.wav -r 48000 tsp2.wav remix 1 1
Then please use the output file tsp2.wav instead.
– Please set the number of channels explicitly with -c 2 for your device; it is not -c 1 .
– Please set the sampling rate explicitly, e.g. -f 48000, for your device; it is not -f 16000 (which is the wios default when it is not set).

    ——
The following command should work on your device:
arecord -d 16 -D plughw:0,0 -f S32_LE -c 2 -r 48000 output.wav
To make wios work the same way:
    wios -r -x 0 -t 16 -a plughw:0,0 -e 32 -c 2 -f 48000 -o output.wav

The following command should work on your device:
aplay -D plughw:0,0 tsp.wav
To make wios work the same way:
    wios -p -x 0 -d plughw:0,0 -i tsp.wav
    *) tsp.wav must be 48 kHz, 32-bit (or 16-bit) 2-channel data.

    The settings for syncing your recording and playback devices in this example are:
    wios -s -x 0 -y plughw:0,0 -z plughw:0,0 -e 32 -c 2 -f 48000 -i tsp.wav -o output.wav
    wios -s -y 0 -d plughw:0,0 -z 0 -a plughw:0,0 -e 32 -c 2 -f 48000 -i tsp.wav -o output.wav

    Notes:
If you get an error message about buffers (for example, buffer overruns), you can work around it by setting the buffer size to a value larger than the default using the -N and -Z options.

    I hope this answer will help you solve your problem.

    Best regards,

    in reply to: problem with wios #947

    It seems that you need to add -std=c++11 as a compile-time option in environments other than Ubuntu 18.04 or later. This is because the source code was written based on the C++11 specification.

    in reply to: problem with wios #942

@paul: The upload failed due to a problem with file permissions. Thank you for contacting me.

    ———-
For those visiting later…

This post was corrected on 27 June 2019.
Since an error was found in the attached file, it has been uploaded again.
Please download wios-check2.zip.

    in reply to: problem with wios #931

    Thank you for your inquiry.

    wios uses ALSA directly.

First, make sure you can record and play back using arecord or aplay. If you specify a bit depth or encoding that the hardware does not support, it may not work properly. In other words, be aware of the default settings used by wios.
    Secondly, wios implements only some of the features of ALSA, so some devices may not be supported. For example, when recording 24-bit PCM with the --encoding 24 option, S24_LE is supported but S24_3LE is not.

I have attached to this post a tool that can be used for a quick check. Since it has not been tested thoroughly, some problems may remain, but if it works, the output may be helpful.

    How to compile:
    g++ wios-check.cpp -lasound -o wios-check

    How to use:
    ./wios-check <device> <type>
<type> is one of playback, capture, or both.
Defaults: <device> defaults to hw:0,0 and <type> defaults to both.
    e.g.) ./wios-check plughw:0,0 playback

Finally, HARKTOOL5 offers a new way to create transfer functions from TSP recordings that have not been synchronized. The information can be found in the documentation section below; I hope you find it useful.

    [HARKTOOL5-GUI documentation] => [Transfer Function Estimation Using Complex Regression Model]

    Best regards,

    in reply to: repository replies IP 54.92.109.90 404 not found #875

By Sylvia, did you mean that you are using Linux Mint 18.3, which uses Ubuntu Xenial as its package base?
Although Linux Mint is derived from Ubuntu, we do not support that distribution. So if you are using Sylvia with the apt repository for Ubuntu Xenial, I advise you to register the apt repository again, changing the $(lsb_release -cs) part of the following command to xenial.

    sudo bash -c 'echo -e "deb http://archive.hark.jp/harkrepos $(lsb_release -cs) non-free\ndeb-src http://archive.hark.jp/harkrepos $(lsb_release -cs) non-free" > /etc/apt/sources.list.d/hark.list'

The previously registered URL (domain) of the apt repository can no longer be reached due to our server migration, which may be what caused the error.
Installation Instruction for Linux

    Best regards,

in reply to: How to create nodes #871

Thank you for your inquiry.

> Is the reverse processing also possible?
Yes, it is possible.

If you assign a variable (or constant) defined in your Python code to self.outputValues["terminal_name"], it will be output. On the network file side, right-click the PyCodeExecutor3 node, choose Add Output, and add an output terminal with the same terminal name you used in self.outputValues["terminal_name"].
Inputs can be read as self.terminal_name; here, too, the name must match the input terminal added to the PyCodeExecutor3 node in the network file with Add Input.
The calculate() method is called for every frame, so the basic usage is to process the inputs and write the outputs. As an example of a node with no inputs at all, if you create it as the first node that starts the processing (one that reads from a file, like AudioStreamFromWave), add a bool output named CONDITION so that HARK can stop processing at EOF: set it to True normally and to False at EOF. In the network file, connect this terminal to CONDITION. (A minimal sketch of such a node is shown below.)
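
A minimal sketch of such a first (source-type) node, following the conventions described above; note that the exact type string for a bool output terminal is an assumption here and should be checked against the node reference and the samples mentioned below:

    # Sketch of a source-type node: no inputs, one bool output named CONDITION.
    # The type string "bool" and the class name are assumptions; self.outputValues
    # is the dict read by PyCodeExecutor3, as used elsewhere in this forum.
    class FrameSource:
        def __init__(self):
            self.outputTypes = ("bool",)
            self.remaining = 100            # hypothetical: emit 100 frames, then stop

        def calculate(self):
            self.remaining -= 1
            # True while processing should continue, False at the end (EOF),
            # which stops the HARK network when wired to the CONDITION terminal.
            self.outputValues["CONDITION"] = self.remaining > 0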

About the PyCodeExecutor3 specification:
Any type supported by HARK can be used for both input and output. For example, HARK's Vector corresponds to Python's list, and HARK's Map corresponds to Python's dict. There are two points to note. When outputting, the type must match the input type of the node that receives the data. Also, unlike HARK's standard nodes, a parameter-setting dialog cannot be displayed, so if you want a node whose parameters can be changed, please handle it by, for example, loading the parameter settings from a JSON file at start-up (see the sketch below). If you write the Python code using sockets or similar, it is also possible to set values dynamically over the network.
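
As a small illustration of the start-up loading mentioned above (the file name, keys, and class name are hypothetical):

    import json

    class MyNode:
        def __init__(self):
            # No parameter dialog is available for PyCodeExecutor3 nodes, so read
            # the parameters from a JSON file when the node is instantiated.
            with open("my_node_params.json") as f:      # hypothetical path
                params = json.load(f)
            self.threshold = params.get("threshold", 0.5)
            self.gain = params.get("gain", 1.0)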

You can freely write any processing in your Python code, but because HARK calls your Python class, GUI processing (plotting, etc.) has to be written in a slightly peculiar way. There are samples of the Plot-type nodes in the directory below, which I hope will be helpful.
/usr/share/hark/hark-python3/harkpython/src/

hark-python is the old package for supporting Python 2; it was supported up to HARK 2.5. From HARK 2.5 onward, a new package called hark-python3 is provided to support Python 3. User-written Python code is basically designed to be compatible, but note that some Forum answers describe installation procedures for the old version.

Best regards,

Thank you for your inquiry.

Regarding the position of the SourceSelectorByDirection node (*1): if the target sound source directions are -50 to 50 degrees, please connect it as shown in the figure below. With this connection, GHDSS receives the Source information that includes the noise source directions, while SaveWavePCM receives the Source information with the noise source directions removed. Also, the SaveWavePCM node has no SOURCES terminal by default, so you need to add one named SOURCES via right-click and Add Input.
*1) In my previous reply I mistakenly wrote “SourceByDirection node”. I apologize.

Example of where to connect the SourceSelectorByDirection node

> Is this a problem with the Kinect v2 microphones?
I do not currently have a Kinect v2 at hand (it is no longer among the supported hardware of the current version), so I am unable to verify this on the actual device; I apologize for that. (I will check with currently supported hardware such as TAMAGO whether there is any functional problem with the localization range setting. If anything is fixed, it will appear in the ChangeLog.) Added 2019/06/05: We have checked the localization range setting of the LocalizeMUSIC node in HARK 3.0.4 and report the result here: we confirmed that the function works correctly with the TAMAGO03 microphone array.
Note that restricting the localization range in LocalizeMUSIC means defining a range that is never localized even when its power (including noise) is strong, so sound source separation nodes such as GHDSS will then be unable to separate noise coming from those directions. I recommend not restricting the localization range of the LocalizeMUSIC node and instead filtering with the SourceSelectorByDirection node.

Best regards,
