Hi. Ever since we switched to neural endpointing, if the human replies with a short word like “sure” or “yes”, the agent takes 3-4 seconds to reply. That may not sound like a long wait, but you’d be surprised how many people think the agent has stopped working and try to talk to it again.
Neural endpointing works significantly better than VAD but this is an issue we are noticing a lot.
Something to note here is that some people say “sure” a lot faster than others. If I say it at a normal speed, the agent does recognize it most of the time.
Another issue we’re noticing is that during some hours of the day the agent takes noticeably longer to respond. Is there a resource allocation issue you are facing right now?
Thanks.
Thanks for reporting the issue. Our neural VAD is based on Ultravox and takes audio directly as input without a separate speech recognition step. This might be an edge case the model isn’t optimized for yet. We’ll look into it.
There could be occasional usage spikes, but we generally allocate sufficient compute. If you experience this issue frequently, please report it in our Discord channel: https://discord.gg/Qw6KHxv8YB. Discord is the better place for resolving issues related to Ultravox Realtime.
I'm seeing the same issue as rojithaDev, especially when I pick the Jessica voice.
The problem is that if the human replies with a short word like “sure” or “yes”, the agent takes 3-4 seconds to reply.