Recording audio stream from ROS – data type error

HARK FORUM

This topic contains 5 replies, has 3 voices, and was last updated by jm 2 days ago.

  • #1105
    jm
    Participant

    Hello,

    I am new to HARK and I am trying to record the audio captured with an HSR robot’s built-in head microphone. The following are the parameters of the stream generated by the robot:

    PARAMETERS
     * /audio/audio_capture/bitrate: 128
     * /audio/audio_capture/channels: 1
     * /audio/audio_capture/device: 
     * /audio/audio_capture/format: mp3
     * /audio/audio_capture/sample_rate: 16000
     * /rosdistro: kinetic
     * /rosversion: 1.12.14
    
    NODES
      /audio/
        audio_capture (audio_capture/audio_capture)
    

    I am trying to record the streamed audio with the AudioStreamFromRos node (a screenshot of my HARK network is attached). However, when I execute the network, I get the following error message:

    [ERROR] [1565594592.746777286]: Client [/MY_HARK_MASTER_NODE] wants topic /audio/audio to have datatype/md5sum [hark_msgs/HarkWave/24c5654436a3ff03c563377fdbcc56a1], but our version has [audio_common_msgs/AudioData/f43a8e1b362b75baa741461b46adc7e0]. Dropping connection.

    How can I configure the node to make the data type compatible?

    #1107
    lapus.er
    Participant

    Hi Jm,

    Can you please attach the actual network file that you used?

    Cheers,
    Earl

    #1109
    jm
    Participant

    Hi,

    here is the network.

    #1110
    jm
    Participant

    Here. There was an upload error in the previous attempt.

    #1122

    You need to use hark_msgs/HarkWave in your workspace.
    In your case, audio_common_msgs/AudioData appears to store MP3 data in a uint8[] array, so you will first need to decode it to raw PCM data.

    Second, the data structure of hark_msgs/HarkWave is as follows.

    user@ubuntu:~$ rosmsg show hark_msgs/HarkWave
    std_msgs/Header header
      uint32 seq
      time stamp
      string frame_id
    int32 count
    int32 nch
    int32 length
    int32 data_bytes
    hark_msgs/HarkWaveVal[] src
      float32[] wavedata

    wavedata is an array of raw PCM samples. HARK does not track the bit depth, so each integer sample is simply cast to a floating-point value without scaling. For example, 123 becomes 123.0f.
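As a minimal sketch of that cast, assuming the incoming bytes are signed 16-bit little-endian PCM (an assumption, not something stated in this thread), the conversion could look like this. The function name pcm16_bytes_to_wavedata is hypothetical.

```python
import struct


def pcm16_bytes_to_wavedata(raw):
    """Convert signed 16-bit little-endian PCM bytes to a list of floats.

    The values are cast, not normalized: 123 stays 123.0.
    Assumes 16-bit little-endian input, which this thread does not confirm.
    """
    n = len(raw) // 2  # number of whole 16-bit samples; a trailing odd byte is dropped
    ints = struct.unpack("<%dh" % n, raw[:2 * n])
    return [float(v) for v in ints]


samples = pcm16_bytes_to_wavedata(struct.pack("<3h", 123, -456, 32767))
print(samples)  # [123.0, -456.0, 32767.0]
```

Note that there is no division by 32768 here; the sample values keep their integer magnitude.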

    data_bytes is the data size in bytes. In other words, it is the size of a float (4 bytes) multiplied by the number of elements in wavedata.

    length is the number of samples per frame handled by HARK. The default value in HARK is 512. Because HARK processes audio frame by frame, the size of wavedata must be nch times length (512 by default).

    nch is the number of channels. Your device seems to have 1 channel, so it should be 1. For microphone array data, the value will be larger.

    count is the frame count. Because HARK processes audio frame by frame, it needs to know which frame the data belongs to, so count is incremented as the frames advance. The first frame number is 0, not 1.
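The field descriptions above can be sketched in plain Python as follows. This is only an illustration: it uses a dict as a stand-in for the real hark_msgs/HarkWave message, the helper name make_harkwave_fields is hypothetical, and computing data_bytes over all channels is one reading of the description.

```python
FLOAT_BYTES = 4  # size of a float32 value


def make_harkwave_fields(count, channels, length=512):
    """Build a dict that mirrors the HarkWave fields for one frame.

    count    -- frame index, starting at 0
    channels -- one list of samples per channel, each with `length` entries
    length   -- samples per frame (512 is the HARK default named in the post)
    """
    nch = len(channels)
    for ch in channels:
        if len(ch) != length:
            raise ValueError("each channel must hold exactly %d samples" % length)
    return {
        "count": count,
        "nch": nch,
        "length": length,
        # Assumption: data_bytes covers the samples of all channels together.
        "data_bytes": FLOAT_BYTES * nch * length,
        "src": [{"wavedata": [float(v) for v in ch]} for ch in channels],
    }


frame0 = make_harkwave_fields(0, [[0] * 512])
print(frame0["nch"], frame0["length"], frame0["data_bytes"])  # 1 512 2048
```

In a real node, the same values would be assigned to the fields of a hark_msgs/HarkWave message before publishing.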

    One final note: to prevent problems in FFT/IFFT processing and similar steps, the frames processed by HARK overlap, so consecutive frames share samples.
    The following image may help you understand.
    [image: frames]
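To illustrate the overlap, here is a rough sketch that cuts a sample stream into overlapping frames. The frame length of 512 comes from the post; the shift of 160 samples between frames is an assumption for illustration, and the name split_into_frames is hypothetical. The post does not say how the overlap must be represented in the published messages, so check that against your HARK network settings.

```python
def split_into_frames(samples, length=512, advance=160):
    """Cut a sample list into overlapping frames.

    Each frame holds `length` samples and starts `advance` samples after
    the previous one, so neighbours share `length - advance` samples.
    The advance of 160 is an assumed value, not taken from the post.
    """
    frames = []
    start = 0
    while start + length <= len(samples):
        frames.append(samples[start:start + length])
        start += advance
    return frames


samples = [float(i) for i in range(1000)]
frames = split_into_frames(samples)
for count, frame in enumerate(frames):  # count starts at 0, as HARK expects
    print(count, frame[0], frame[-1])
```

Samples left over at the end of the input that do not fill a whole frame are simply dropped in this sketch; a streaming node would keep them in a buffer until more data arrives.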

    Best regards,
    m.takigahira

    #1123
    jm
    Participant

    I am using the package ros-kinetic-audio-capture to stream the audio from the built-in microphone of the robot.

    With this package it is possible to stream audio data in wave format:

    PARAMETERS
     * /audio_capture/channels: 1
     * /audio_capture/depth: 16
     * /audio_capture/device: plughw:1,0
     * /audio_capture/format: wave
     * /audio_capture/sample_rate: 16000
     * /rosdistro: kinetic
     * /rosversion: 1.12.14
    
    NODES
      /
        audio_capture (audio_capture/audio_capture)
    

    The data structure of the data stream audio_common_msgs/AudioData is as follows:

    user@laptop:~$ rosmsg show audio_common_msgs/AudioData 
    uint8[] data

    Therefore, it is not compatible with the data structure of hark_msgs/HarkWave.

    Is there a practical way to make the data structure of audio_common_msgs/AudioData compatible with HARK?

    Otherwise, would you suggest a package other than ros-kinetic-audio-capture that provides HARK-compatible access to the data from the robot’s built-in microphone?

    Thanks in advance.
