This repository provides a set of ROS 2 packages for audio. It includes a C++ implementation that captures and plays audio data using PortAudio.
ROS 2 Distro | Branch |
---|---|
Foxy | main |
Galactic | main |
Humble | main |
Iron | main |
Jazzy | main |
Kilted | main |
Rolling | main |
cd ~/ros2_ws/src
git clone https://github.com/mgonzs13/audio_common.git
cd ~/ros2_ws
rosdep install --from-paths src --ignore-src -r -y
colcon build
You can build a Docker image to test audio_common. Run the following command from inside the audio_common directory:
docker build -t audio_common .
After the image is built, run a Docker container with the following command. The --device /dev/snd option gives the container access to the host's sound devices.
docker run -it --rm --device /dev/snd audio_common
Node to obtain audio data from a microphone and publish it to the audio topic.
- format: Specifies the audio format to be used for capturing. Possible values are:
  - 1 (paFloat32 - 32-bit floating point)
  - 2 (paInt32 - 32-bit integer)
  - 8 (paInt16 - 16-bit integer)
  - 16 (paInt8 - 8-bit integer)
  - 32 (paUInt8 - 8-bit unsigned integer)

  Default: 8 (paInt16). The integer values correspond to PortAudio sample format flags.
- channels: The number of audio channels to capture. Typically, 1 for mono and 2 for stereo. Default: 1
- rate: The sample rate, i.e., how many samples per second are captured. Default: 16000
- chunk: The size of each audio frame. Default: 512
- device: The ID of the audio input device. A value of -1 indicates that the default audio input device should be used. Default: -1
- frame_id: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: ""
- audio: Topic to publish the audio data captured from the microphone. Type: audio_common_msgs/msg/AudioStamped
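To make the relationship between format, channels, rate, and chunk more concrete, the following sketch estimates how long one chunk lasts and how many bytes it occupies. It assumes that chunk counts frames per channel and that samples are interleaved, neither of which is stated above. The names SAMPLE_WIDTHS, chunk_duration, and chunk_bytes are hypothetical helpers for illustration, not part of audio_common.

```python
# Sample width in bytes for each PortAudio sample format flag listed above.
SAMPLE_WIDTHS = {
    1: 4,   # paFloat32
    2: 4,   # paInt32
    8: 2,   # paInt16
    16: 1,  # paInt8
    32: 1,  # paUInt8
}


def chunk_duration(chunk: int, rate: int) -> float:
    """Seconds of audio in one chunk (assumption: chunk is a frame count)."""
    return chunk / rate


def chunk_bytes(chunk: int, channels: int, sample_format: int) -> int:
    """Raw size of one chunk in bytes (assumption: interleaved samples)."""
    return chunk * channels * SAMPLE_WIDTHS[sample_format]


if __name__ == "__main__":
    # Capturer defaults: chunk=512, rate=16000, channels=1, format=8 (paInt16).
    print(f"{chunk_duration(512, 16000) * 1000:.1f} ms per chunk")
    print(f"{chunk_bytes(512, 1, 8)} bytes per chunk")
```

Under these assumptions, the capturer defaults correspond to roughly 32 ms of audio and 1024 bytes per chunk. A larger chunk lowers the message rate at the cost of added latency.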
Node to play the audio data obtained from the audio topic.
- channels: The number of audio channels to play. Typically, 1 for mono and 2 for stereo. Default: 2
  - The node automatically handles conversion between mono and stereo formats if needed.
- device: The ID of the audio output device. A value of -1 indicates that the default audio output device should be used. Default: -1
- audio: Topic subscriber to get the audio data to be played. Type: audio_common_msgs/msg/AudioStamped
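The mono/stereo conversion mentioned above can be pictured with a short sketch. This is one common approach (duplicating samples for mono to stereo, averaging pairs for stereo to mono) and is not necessarily how the player node implements it. The function names are hypothetical, and the sketch assumes interleaved integer samples.

```python
from typing import List


def mono_to_stereo(samples: List[int]) -> List[int]:
    """Duplicate each mono sample into an interleaved left/right pair."""
    stereo: List[int] = []
    for sample in samples:
        stereo.extend((sample, sample))
    return stereo


def stereo_to_mono(samples: List[int]) -> List[int]:
    """Average each interleaved left/right pair (integer average, rounds down)."""
    if len(samples) % 2 != 0:
        raise ValueError("stereo data must contain an even number of samples")
    return [
        (samples[i] + samples[i + 1]) // 2
        for i in range(0, len(samples), 2)
    ]


if __name__ == "__main__":
    print(mono_to_stereo([100, -200]))
    print(stereo_to_mono([0, 2, 4, 6]))
```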
Node to play music from audio files in wav format.
- chunk: The size of each audio frame. Default: 2048
- frame_id: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: ""
- audio: Topic to publish the audio data from the files. Type: audio_common_msgs/msg/AudioStamped
- music_play: Service to play audio files. Type: audio_common_msgs/srv/MusicPlay
  - Parameters:
    - audio: Name of a built-in audio sample (e.g., "elevator")
    - file_path: Path to a custom WAV file (ignored if audio is specified)
    - loop: Boolean to indicate if the audio should loop. Default: false
- music_stop: Service to stop the currently playing music. Type: std_srvs/srv/Trigger
- music_pause: Service to pause the currently playing music. Type: std_srvs/srv/Trigger
- music_resume: Service to resume paused music. Type: std_srvs/srv/Trigger
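As an illustration of how a WAV file might be split into chunks of the size given by the chunk parameter, here is a minimal sketch using Python's standard wave module. It is not the music node's actual implementation (the node is written in C++). It assumes that chunk counts frames, and iter_wav_chunks and make_silent_wav are hypothetical helpers.

```python
import io
import wave
from typing import BinaryIO, Iterator, Union


def iter_wav_chunks(source: Union[str, BinaryIO], chunk: int = 2048) -> Iterator[bytes]:
    """Yield raw PCM data from a WAV file, at most `chunk` frames at a time."""
    with wave.open(source, "rb") as wav:
        while True:
            frames = wav.readframes(chunk)
            if not frames:
                break
            yield frames


def make_silent_wav(n_frames: int, rate: int = 16000) -> io.BytesIO:
    """Build an in-memory mono 16-bit WAV of silence (demonstration only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * n_frames)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for index, data in enumerate(iter_wav_chunks(make_silent_wav(5000))):
        print(f"chunk {index}: {len(data)} bytes")
```

The last chunk is usually shorter than the others, so a consumer should not assume a fixed message size.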
Node to generate audio from text (TTS) using espeak.
- chunk: The size of each audio frame. Default: 4096
- frame_id: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: ""
- audio: Topic publisher to send the audio data generated by the TTS. Type: audio_common_msgs/msg/AudioStamped
- say: Action to generate audio data from a text. Type: audio_common_msgs/action/TTS
  - Goal:
    - text: The text to convert to speech
    - language: The language to use for speech synthesis. Default: "en"
    - volume: The volume of the generated speech (0.0-1.0). Default: 1.0
    - rate: The speech rate (1.0 is normal speed). Default: 1.0
  - Feedback:
    - audio: The audio being currently played
  - Result:
    - text: The text that was converted to speech
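When sending goals from a script, it can help to validate the goal fields before calling the CLI. The sketch below builds the argument list for ros2 action send_goal from the goal fields listed above. build_say_command is a hypothetical helper, and the check that rate must be positive is an assumption, since only the volume range is documented. The goal is serialized as JSON, which is also valid YAML flow syntax.

```python
import json
from typing import List


def build_say_command(text: str, language: str = "en",
                      volume: float = 1.0, rate: float = 1.0) -> List[str]:
    """Return the argv list for sending a TTS goal to the /say action."""
    if not 0.0 <= volume <= 1.0:
        raise ValueError("volume must be between 0.0 and 1.0")
    if rate <= 0.0:
        # Assumption: non-positive speech rates are treated as invalid here.
        raise ValueError("rate must be positive")
    goal = {"text": text, "language": language, "volume": volume, "rate": rate}
    return [
        "ros2", "action", "send_goal", "/say",
        "audio_common_msgs/action/TTS", json.dumps(goal),
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without ROS 2.
    print(build_say_command("Hello World", language="en-us", volume=0.8, rate=1.2))
```

The returned list can be passed to subprocess.run on a machine where ROS 2 and audio_common are installed and sourced.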
ros2 run audio_common audio_capturer_node
ros2 run audio_common audio_player_node
ros2 run audio_common tts_node
ros2 run audio_common audio_player_node
ros2 action send_goal /say audio_common_msgs/action/TTS "{'text': 'Hello World'}"
Advanced TTS example with additional parameters:
ros2 action send_goal /say audio_common_msgs/action/TTS "{'text': 'Hello World', 'language': 'en-us', 'volume': 0.8, 'rate': 1.2}"
ros2 run audio_common music_node
ros2 run audio_common audio_player_node
Play a built-in sample:
ros2 service call /music_play audio_common_msgs/srv/MusicPlay "{audio: 'elevator'}"
Play a custom WAV file:
ros2 service call /music_play audio_common_msgs/srv/MusicPlay "{file_path: '/path/to/your/file.wav'}"
Play with looping enabled:
ros2 service call /music_play audio_common_msgs/srv/MusicPlay "{audio: 'elevator', loop: true}"
Control playback:
ros2 service call /music_pause std_srvs/srv/Trigger "{}"
ros2 service call /music_resume std_srvs/srv/Trigger "{}"
ros2 service call /music_stop std_srvs/srv/Trigger "{}"