interrupt speech with button click #9

RomanLut · 2025-06-05T23:19:57Z

This PR should allow speech to be interrupted via a button click.
Unfortunately, I haven't been able to verify it, as I haven't compiled the server.

akdeb · 2025-06-06T10:39:53Z

Thanks for submitting the PR @RomanLut! This is one of the items on the roadmap.

Let me try this. When I previously tried to implement interrupts, the problem was that the audio bytes (from OpenAI) were still on their way to the ESP32 when the user pressed the button. As a result, the device doesn't switch back to listening cleanly, and the audio it hears (from the user) is murky or sped up.

Example case:

Server sends: "Hey! How can I help [interrupt] you today?" [User only hears "Hey! How can I help"]
User says: "What's the weather like in Baku, Azerbaijan?"
OpenAI hears: gibberish ..... like in Baku, Azerbaijan?

So there was a processing conflict: the "you today?" was still on its way to the ESP32 over the WebSocket while the device was already registering the overlapping audio from the user, "What's the weather".
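
One possible way to reason about this race (not something this PR implements) is to tag each audio chunk with the response that produced it and drop chunks from a response once it has been interrupted. The sketch below is only an illustration: `AudioChunk`, `PlaybackGate`, and the `responseId` field are hypothetical names, and it is written in TypeScript for readability even though the equivalent check on the ESP32 would live in the firmware.

```typescript
// Hypothetical audio frame: each chunk carries the id of the response that produced it.
interface AudioChunk {
  responseId: string;
  bytes: Uint8Array;
}

// Tracks interrupted responses so that late-arriving chunks can be discarded.
class PlaybackGate {
  private cancelled = new Set<string>();

  // Mark a response as interrupted; any chunk from it that arrives later is stale.
  interrupt(responseId: string): void {
    this.cancelled.add(responseId);
  }

  // True if the chunk should still be played (or forwarded to the device).
  accept(chunk: AudioChunk): boolean {
    return !this.cancelled.has(chunk.responseId);
  }
}
```

With a gate like this, the "you today?" chunks from the example would be rejected once the button press has been registered, while chunks from the next response would pass through.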

RomanLut · 2025-06-06T16:16:27Z

In this PR, the behavior is as follows:

When the user clicks the button, the client sends the following JSON to the server:
{"type": "instruction", "msg": "INTERRUPT", "audio_end_ms": 1000}
The server already contains partial support for handling the interruption.

I also added the following command on the server:

client.realtime.send("response.cancel", {
    type: "response.cancel",
    event_id: RealtimeUtils.generateId("evt_")
});

This should instruct the LLM to stop audio generation.

As a result, I expect the speaking to stop after a brief delay, and the client should return to listening mode as usual.
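
The flow described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the server's actual code: `RealtimeSender`, `isInterrupt`, and `handleClientMessage` are hypothetical names, the realtime connection is reduced to a single `send` method, and the `event_id` from the snippet above is left out for brevity.

```typescript
// Hypothetical minimal shape of the realtime connection; only `send` is assumed here.
interface RealtimeSender {
  send(eventName: string, data: Record<string, unknown>): void;
}

// Shape of the interrupt instruction the client sends on a button click.
interface InterruptMessage {
  type: "instruction";
  msg: "INTERRUPT";
  audio_end_ms: number;
}

// Narrow an arbitrary parsed payload to the interrupt instruction.
function isInterrupt(value: unknown): value is InterruptMessage {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    v.type === "instruction" &&
    v.msg === "INTERRUPT" &&
    typeof v.audio_end_ms === "number"
  );
}

// Returns true when the message was an interrupt and a cancel was forwarded.
function handleClientMessage(raw: string, realtime: RealtimeSender): boolean {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return false; // Not JSON (for example, a malformed frame): ignore it here.
  }
  if (!isInterrupt(parsed)) return false;
  // Ask the model to stop generating the current response.
  realtime.send("response.cancel", { type: "response.cancel" });
  return true;
}
```

Validating the payload before forwarding the cancel keeps unrelated instructions and malformed frames from stopping a response by accident.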

stop instruction

00c8c41

akdeb assigned RomanLut Jun 6, 2025

akdeb self-requested a review June 6, 2025 10:40

disabled switching listening mode on client side

281823b
