GitHub - mgonzs13/audio_common: A PortAudio based audio_common with text to speech for ROS 2

mgonzs13/audio_common

audio_common

This repository provides a set of ROS 2 packages for audio. It provides a C++ implementation for capturing and playing audio data using PortAudio.

License: MIT

ROS 2 Distro   Branch   Build status     Docker Image   Documentation
Foxy           main     Foxy Build       Docker Image   Doxygen Deployment
Galactic       main     Galactic Build   Docker Image   Doxygen Deployment
Humble         main     Humble Build     Docker Image   Doxygen Deployment
Iron           main     Iron Build       Docker Image   Doxygen Deployment
Jazzy          main     Jazzy Build      Docker Image   Doxygen Deployment
Kilted         main     Kilted Build     Docker Image   Doxygen Deployment
Rolling        main     Rolling Build    Docker Image   Doxygen Deployment

Table of Contents

  1. Installation
  2. Docker
  3. Nodes
  4. Demos

Installation

cd ~/ros2_ws/src
git clone https://github.com/mgonzs13/audio_common.git
cd ~/ros2_ws
rosdep install --from-paths src --ignore-src -r -y
colcon build

Docker

You can build a Docker image to test audio_common. Run the following command from inside the audio_common directory.

docker build -t audio_common .

After the image is built, start a Docker container with the following command.

docker run -it --rm --device /dev/snd audio_common

Nodes

audio_capturer_node

Node to capture audio data from a microphone and publish it on the audio topic.

Parameters

  • format: Specifies the audio format to be used for capturing. Possible values are:

    • 1 (paFloat32 - 32-bit floating point)
    • 2 (paInt32 - 32-bit integer)
    • 8 (paInt16 - 16-bit integer)
    • 16 (paInt8 - 8-bit integer)
    • 32 (paUInt8 - 8-bit unsigned integer)

    Default: 8 (paInt16)

    The integer values correspond to PortAudio sample format flags.

  • channels: The number of audio channels to capture. Typically, 1 for mono and 2 for stereo. Default: 1

  • rate: The sample rate, that is, the number of samples captured per second. Default: 16000

  • chunk: The size of each audio frame. Default: 512

  • device: The ID of the audio input device. A value of -1 indicates that the default audio input device should be used. Default: -1

  • frame_id: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: ""
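
To make the format, channels, rate and chunk parameters more concrete, the following minimal sketch maps each format value to its PortAudio constant and estimates how much audio one chunk carries. It is an illustration, not code from the node. The helper names (SAMPLE_FORMATS, chunk_duration_s, chunk_size_bytes) are hypothetical, and it assumes that chunk counts frames (one sample per channel), which the parameter description does not state explicitly.

```python
# Sketch only: sizes derived from the PortAudio sample-format flags listed above.
# Assumption: `chunk` counts frames (one sample per channel).

# format parameter value -> (PortAudio constant name, bytes per sample)
SAMPLE_FORMATS = {
    1: ("paFloat32", 4),
    2: ("paInt32", 4),
    8: ("paInt16", 2),
    16: ("paInt8", 1),
    32: ("paUInt8", 1),
}


def chunk_duration_s(chunk: int, rate: int) -> float:
    """Seconds of audio covered by one chunk of `chunk` frames at `rate` Hz."""
    return chunk / rate


def chunk_size_bytes(fmt: int, channels: int, chunk: int) -> int:
    """Raw payload size of one chunk, in bytes."""
    _, bytes_per_sample = SAMPLE_FORMATS[fmt]
    return bytes_per_sample * channels * chunk


if __name__ == "__main__":
    # Documented defaults: format=8, channels=1, rate=16000, chunk=512
    print(chunk_duration_s(512, 16000))  # 0.032
    print(chunk_size_bytes(8, 1, 512))   # 1024
```

Under that assumption, the default settings would publish 32 ms of audio (1024 bytes) per message.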

ROS 2 Interfaces

  • audio: Topic to publish the audio data captured from the microphone. Type: audio_common_msgs/msg/AudioStamped

audio_player_node

Node to play the audio data obtained from the audio topic.

Parameters

  • channels: The number of audio channels to play. Typically, 1 for mono and 2 for stereo. Default: 2

    • The node automatically handles conversion between mono and stereo formats if needed.
  • device: The ID of the audio output device. A value of -1 indicates that the default audio output device should be used. Default: -1
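
The mono/stereo conversion mentioned above can be pictured with the sketch below. It shows one common approach (duplicating samples for mono to stereo, averaging pairs for stereo to mono) on interleaved integer samples. How audio_player_node actually performs the conversion is not documented here, so both function names and the averaging strategy are assumptions.

```python
from typing import List


def mono_to_stereo(samples: List[int]) -> List[int]:
    """Duplicate each mono sample into an interleaved (left, right) pair."""
    stereo: List[int] = []
    for s in samples:
        stereo.extend((s, s))
    return stereo


def stereo_to_mono(samples: List[int]) -> List[int]:
    """Average each interleaved (left, right) pair using floor division."""
    if len(samples) % 2 != 0:
        raise ValueError("interleaved stereo data needs an even sample count")
    return [(samples[i] + samples[i + 1]) // 2 for i in range(0, len(samples), 2)]


if __name__ == "__main__":
    print(mono_to_stereo([1, -2]))          # [1, 1, -2, -2]
    print(stereo_to_mono([10, 20, -4, 4]))  # [15, 0]
```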

ROS 2 Interfaces

  • audio: Topic subscriber to get the audio data to be played. Type: audio_common_msgs/msg/AudioStamped

music_node

Node to play music from audio files in WAV format.

Parameters

  • chunk: The size of each audio frame. Default: 2048

  • frame_id: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: ""

ROS 2 Interfaces

  • audio: Topic to publish the audio data from the files. Type: audio_common_msgs/msg/AudioStamped

  • music_play: Service to play audio files. Type: audio_common_msgs/srv/MusicPlay

    • Parameters:
      • audio: Name of a built-in audio sample (e.g., "elevator")
      • file_path: Path to a custom WAV file (ignored if audio is specified)
      • loop: Boolean to indicate if the audio should loop. Default: false
  • music_stop: Service to stop the currently playing music. Type: std_srvs/srv/Trigger

  • music_pause: Service to pause the currently playing music. Type: std_srvs/srv/Trigger

  • music_resume: Service to resume paused music. Type: std_srvs/srv/Trigger
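
As an illustration of how a WAV file can be split into chunk-sized pieces before publishing, here is a minimal sketch using Python's standard wave module. It is not the node's C++ implementation. iter_wav_chunks and make_silent_wav are hypothetical helpers, and the sketch assumes chunk is measured in frames.

```python
import io
import wave
from typing import Iterator


def iter_wav_chunks(wav_file, chunk: int = 2048) -> Iterator[bytes]:
    """Yield raw PCM data from a WAV file, at most `chunk` frames at a time."""
    with wave.open(wav_file, "rb") as wav:
        while True:
            data = wav.readframes(chunk)
            if not data:
                break
            yield data


def make_silent_wav(n_frames: int, rate: int = 16000) -> io.BytesIO:
    """Build an in-memory mono 16-bit WAV of silence, used only for the demo."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * n_frames)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    sizes = [len(c) for c in iter_wav_chunks(make_silent_wav(5000), chunk=2048)]
    print(sizes)  # [4096, 4096, 1808]
```

The last chunk is shorter whenever the file length is not a multiple of the chunk size.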

tts_node

Node to generate audio from text (TTS) using espeak.

Parameters

  • chunk: The size of each audio frame. Default: 4096

  • frame_id: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: ""

ROS 2 Interfaces

  • audio: Topic publisher to send the audio data generated by the TTS. Type: audio_common_msgs/msg/AudioStamped

  • say: Action to generate audio data from text. Type: audio_common_msgs/action/TTS

    • Goal:
      • text: The text to convert to speech
      • language: The language to use for speech synthesis. Default: "en"
      • volume: The volume of the generated speech (0.0-1.0). Default: 1.0
      • rate: The speech rate (1.0 is normal speed). Default: 1.0
    • Feedback:
      • audio: The audio being currently played
    • Result:
      • text: The text that was converted to speech

Demos

Audio Capturer/Player

ros2 run audio_common audio_capturer_node
ros2 run audio_common audio_player_node

TTS

ros2 run audio_common tts_node
ros2 run audio_common audio_player_node
ros2 action send_goal /say audio_common_msgs/action/TTS "{'text': 'Hello World'}"

Advanced TTS example with additional parameters:

ros2 action send_goal /say audio_common_msgs/action/TTS "{'text': 'Hello World', 'language': 'en-us', 'volume': 0.8, 'rate': 1.2}"

Music Player

ros2 run audio_common music_node
ros2 run audio_common audio_player_node

Play a built-in sample:

ros2 service call /music_play audio_common_msgs/srv/MusicPlay "{audio: 'elevator'}"

Play a custom WAV file:

ros2 service call /music_play audio_common_msgs/srv/MusicPlay "{file_path: '/path/to/your/file.wav'}"

Play with looping enabled:

ros2 service call /music_play audio_common_msgs/srv/MusicPlay "{audio: 'elevator', loop: true}"

Control playback:

ros2 service call /music_pause std_srvs/srv/Trigger "{}"
ros2 service call /music_resume std_srvs/srv/Trigger "{}"
ros2 service call /music_stop std_srvs/srv/Trigger "{}"