Recording audio stream from ROS – data type error

  • #1105
    jm
    Participant

      Hello,

      I am new to HARK and I am trying to record the audio captured with an HSR robot’s built-in head microphone. These are the parameters of the stream generated by the robot:

      PARAMETERS
       * /audio/audio_capture/bitrate: 128
       * /audio/audio_capture/channels: 1
       * /audio/audio_capture/device: 
       * /audio/audio_capture/format: mp3
       * /audio/audio_capture/sample_rate: 16000
       * /rosdistro: kinetic
       * /rosversion: 1.12.14
      
      NODES
        /audio/
          audio_capture (audio_capture/audio_capture)
      

      I am trying to record the streamed audio with the AudioStreamFromRos node (a screenshot of my HARK network is attached). However, when I execute the network I get the following error message:

      [ERROR] [1565594592.746777286]: Client [/MY_HARK_MASTER_NODE] wants topic /audio/audio to have datatype/md5sum [hark_msgs/HarkWave/24c5654436a3ff03c563377fdbcc56a1], but our version has [audio_common_msgs/AudioData/f43a8e1b362b75baa741461b46adc7e0]. Dropping connection.

      How can I configure the node to make the data type compatible?

      #1107
      lapus.er
      Participant

        Hi Jm,

        Can you please attach the actual network file that you used?

        Cheers,
        Earl

        #1109
        jm
        Participant

          Hi,

          here is the network.

          #1110
          jm
          Participant

            Here. There was an upload error in the previous attempt.

            #1122

            You need to publish hark_msgs/HarkWave messages from your workspace, because that is the type AudioStreamFromRos subscribes to.
            In your case, audio_common_msgs/AudioData seems to store MP3 data in a uint8[] array, so you will first need to decode it to raw PCM data.

            Second, the data structure of hark_msgs/HarkWave is as follows.

            user@ubuntu:~$ rosmsg show hark_msgs/HarkWave
            std_msgs/Header header
              uint32 seq
              time stamp
              string frame_id
            int32 count
            int32 nch
            int32 length
            int32 data_bytes
            hark_msgs/HarkWaveVal[] src
              float32[] wavedata

            wavedata is a raw PCM data array. Since HARK is not aware of the bit depth, it simply casts each integer sample value to a floating-point type. In other words, 123 becomes 123.0f.

            data_bytes is the data size in bytes. In other words, it is the size of a float (4 bytes) multiplied by the number of elements in wavedata.

            length is the number of samples per frame handled by HARK. The default value in HARK is 512. Since HARK processes data frame by frame, the size of wavedata must be nch times length (512 by default).

            nch is the number of channels. Your device seems to have one channel, so it should be 1. For microphone array data, a larger number will be stored.

            count is the frame count. Since HARK processes data frame by frame, it needs to know which frame the data belongs to. In other words, count is incremented as the frame advances. The first frame number is 0, not 1.
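
The field relationships described above can be sketched in plain Python. This is only an illustration, not HARK code: the helper name build_harkwave_fields and the dict layout are hypothetical, and it assumes one wavedata array per channel with length samples each, so that the total across channels is nch times length.

```python
def build_harkwave_fields(channels, count):
    """Return a dict mirroring the HarkWave fields for one frame.

    channels: list of per-channel sample lists (integer PCM values).
    count:    zero-based frame number.
    """
    nch = len(channels)
    if nch == 0:
        raise ValueError("at least one channel is required")
    length = len(channels[0])
    if any(len(ch) != length for ch in channels):
        raise ValueError("all channels must hold the same number of samples")
    return {
        "count": count,                  # first frame is 0, not 1
        "nch": nch,
        "length": length,                # samples per frame, per channel
        "data_bytes": 4 * nch * length,  # 4 bytes per float32 sample
        # Integer samples are cast to float without rescaling: 123 -> 123.0
        "src": [[float(s) for s in ch] for ch in channels],
    }


# One mono frame of 512 samples (assumed values, for illustration only).
frame = build_harkwave_fields([[123, -456] + [0] * 510], count=0)
print(frame["nch"], frame["length"], frame["data_bytes"], frame["src"][0][:2])
```

In a real node these values would be assigned to a hark_msgs/HarkWave message instead of a dict.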

            One final note: to prevent problems in FFT/IFFT processing and the like, the frames processed by HARK overlap, so consecutive frames share some of their samples.
            The following image may help you understand.
            [Image: frames]
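
As a rough illustration of the overlap, the sample range covered by each frame can be listed as follows. The frame length of 512 comes from the description above; the advance (hop) of 160 samples is an assumption that is not stated in this thread, so check the value used in your own network. The function name frame_ranges is hypothetical.

```python
def frame_ranges(num_frames, length=512, advance=160):
    """Return (first_sample, last_sample) index pairs, one per frame.

    Frame i starts `advance` samples after frame i - 1, so consecutive
    frames share `length - advance` samples.
    """
    return [(i * advance, i * advance + length - 1) for i in range(num_frames)]


for count, (first, last) in enumerate(frame_ranges(3)):
    print("frame %d: samples %d..%d" % (count, first, last))
```

With these assumed values, each new frame repeats the last 352 samples of the previous one and adds 160 new samples.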

            Best regards,
            m.takigahira

            #1123
            jm
            Participant

              I am using the package ros-kinetic-audio-capture to stream the audio from the built-in microphone of the robot.

              With this package it is possible to stream audio data in wave format:

              PARAMETERS
               * /audio_capture/channels: 1
               * /audio_capture/depth: 16
               * /audio_capture/device: plughw:1,0
               * /audio_capture/format: wave
               * /audio_capture/sample_rate: 16000
               * /rosdistro: kinetic
               * /rosversion: 1.12.14
              
              NODES
                /
                  audio_capture (audio_capture/audio_capture)
              

              The data structure of the data stream audio_common_msgs/AudioData is as follows:

              user@laptop:~$ rosmsg show audio_common_msgs/AudioData 
              uint8[] data

              Therefore, it is not compatible with the data structure of hark_msgs/HarkWave.

              Is there a practical way to make the data structure of audio_common_msgs/AudioData compatible with Hark?

              Otherwise, could you suggest a package other than ros-kinetic-audio-capture that gives HARK-compatible access to the data from the robot’s built-in microphone?

              Thanks in advance.

              #1136
              nterakado
              Moderator

                Please see the previous answer for how to convert the data structure.
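
For the wave-format stream with depth 16 described earlier in the thread, the first step of that conversion might look like the sketch below. It assumes the uint8[] payload is headerless, signed 16-bit little-endian PCM; this has not been confirmed in the thread, so verify it against your stream. The function name pcm16le_bytes_to_floats is hypothetical.

```python
import struct


def pcm16le_bytes_to_floats(data):
    """Interpret a byte payload as signed 16-bit little-endian samples.

    data may be bytes, a bytearray, or a sequence of ints in 0..255.
    An incomplete trailing byte, if any, is dropped.
    """
    raw = bytes(bytearray(data))
    n = len(raw) // 2
    ints = struct.unpack("<%dh" % n, raw[:2 * n])
    # Cast without rescaling, as HARK expects: 123 -> 123.0
    return [float(v) for v in ints]


print(pcm16le_bytes_to_floats([0x7B, 0x00, 0xFF, 0xFF]))  # [123.0, -1.0]
```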

                For frame processing, refer to the AudioStreamFromWave source code.
                The source code can be obtained with the following command on Ubuntu:
                apt source hark-core

                Best regards,
                HARK Support

                #1143
                Thomas Tannous
                Participant

                  Hello,
                  I’m also working on this currently.
                  I have audio_capture publish in wave format. I tried to write a converter from AudioData (uint8[] wave) to HarkWave, following the instructions in this answer.
                  Here is what I ended up with:

                          dhw = HarkWave()
                          dhw.header.stamp = rospy.Time.now()
                          dhw.header.frame_id = str(self.count)
                          dhw.count = self.count
                          harkwaveval = HarkWaveVal(wavedata=data_uint8)
                          result = []
                          result.append(harkwaveval)
                          dhw.src = result 
                          dhw.nch = 1 # one channel
                          dhw.length = 320 
                          dhw.data_bytes = 320 * 4 # float(4bytes) times length of data
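
One thing to check in a converter like this: if the stream is 16-bit, passing the uint8 array directly as wavedata puts two entries per sample into the message, and the number of bytes per ROS message need not match the frame length HARK expects. A minimal, ROS-free sketch of the re-chunking step is shown below. It assumes mono, signed 16-bit little-endian PCM, a frame length of 512 and an advance of 160 samples (the advance is an assumption, not taken from this thread). The class name HarkFrameBuilder and the dict layout are hypothetical.

```python
import struct


class HarkFrameBuilder(object):
    """Re-chunk mono 16-bit PCM byte payloads into fixed-length frames."""

    def __init__(self, length=512, advance=160):
        self.length = length
        self.advance = advance
        self.samples = []    # decoded samples not yet consumed
        self.leftover = b""  # incomplete trailing byte of the last payload
        self.count = 0       # number of the next frame, starting at 0

    def push(self, data):
        """Add one message payload; return the frames it completes."""
        raw = self.leftover + bytes(bytearray(data))
        n = len(raw) // 2
        self.leftover = raw[2 * n:]
        ints = struct.unpack("<%dh" % n, raw[:2 * n])
        self.samples.extend(float(v) for v in ints)

        frames = []
        while len(self.samples) >= self.length:
            frames.append({
                "count": self.count,
                "nch": 1,
                "length": self.length,
                "data_bytes": 4 * self.length,  # float32 size * samples
                "wavedata": self.samples[:self.length],
            })
            # Keep the overlapping tail for the next frame.
            del self.samples[:self.advance]
            self.count += 1
        return frames


builder = HarkFrameBuilder()
chunk = struct.pack("<320h", *range(320))  # a hypothetical 640-byte payload
emitted = []
for _ in range(3):
    emitted.extend(builder.push(chunk))
print([f["count"] for f in emitted])
```

In a subscriber callback, each returned dict would be copied into a HarkWave message (with wavedata placed in a single HarkWaveVal) and published.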
                  

                  When I run my network (attached below), it only receives one message from ROS and then stops with the message “Normally finished”.
                  So my question is: why does it stop, given that the network reports no error and the topic is still actively publishing messages?

                  Thanks in advance.

                  #1152
                  nterakado
                  Moderator

                    Hi Thomas.

                    Is the sheet with AudioStreamFromRos and SaveWavePCM set to iterator?
                    If it is a subnet, it will only be executed once.

                    Best regards,
                    HARK support team.

                    #1157
                    tank1199
                    Participant

                      Hi Thomas,

                      > When I run my network (attached below) it’s only receiving one message from ros and afterwards the network stops with the message “Normally finished”.

                      When you create a new sheet in HarkDesigner, it is set to subnet by default, which runs the network only once. Please change it to iterator so that the network runs in a loop.

                      If it is not too much trouble, would you mind uploading a sample of the audio you are trying to stream as well as the network file?

                      I noticed that your network does not have a condition. Are you getting any errors when running it?
