(phonic): Use stream_ahead_of_real_time mode for Phonic WebSocket and sample rate 24000 #1885
Bump phonic to 0.32.5 and enable stream_ahead_of_real_time so assistant audio is sent to the client as soon as it is generated. Switch the input and output formats to pcm_24000 (the STS buffer's native rate) to avoid resampling. Drop the ConfigOptions cast now that stream_ahead_of_real_time is typed, and align tool definitions with the 0.32.5 ToolDefinition types. Co-authored-by: Cursor <cursoragent@cursor.com>
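As a rough illustration of the audio settings this commit describes, the sketch below builds a config fragment with both formats set to pcm_24000 and ahead-of-real-time streaming enabled. The `input_format` and `output_format` field names, the `AudioConfigSketch` type, and the 16-bit mono PCM assumption behind `bytesForDuration` are assumptions made for illustration, not the Phonic SDK's confirmed types.

```typescript
// Hypothetical config fragment; field names other than
// stream_ahead_of_real_time are assumed, not taken from the SDK.
interface AudioConfigSketch {
  input_format: "pcm_24000";
  output_format: "pcm_24000";
  stream_ahead_of_real_time: boolean;
}

// The PR describes 24000 Hz as the STS buffer's native rate.
const SAMPLE_RATE_HZ = 24000;

function buildAudioConfig(): AudioConfigSketch {
  return {
    input_format: "pcm_24000",
    output_format: "pcm_24000",
    stream_ahead_of_real_time: true,
  };
}

// Resampling is only needed when the buffer rate and the wire rate differ.
function needsResample(bufferRateHz: number, wireRateHz: number): boolean {
  return bufferRateHz !== wireRateHz;
}

// Byte size of a chunk of the given duration, assuming 16-bit mono PCM
// (2 bytes per sample). The sample width is an assumption.
function bytesForDuration(ms: number, sampleRateHz: number = SAMPLE_RATE_HZ): number {
  return Math.round((ms / 1000) * sampleRateHz) * 2;
}
```

With matching rates on both sides, `needsResample(SAMPLE_RATE_HZ, 24000)` is false, which is the situation the commit message aims for.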
🦋 Changeset detected
Latest commit: 33febb0
The changes in this PR will be included in the next version bump. This PR includes changesets to release 35 packages.
Co-authored-by: Cursor <cursoragent@cursor.com>
📝 Info: Config deduplication removes redundant overrides from sendConfig
The old connect() method spread buildConfigOptions(...) into the sendConfig payload and then re-specified many of the same properties (additional_languages, multilingual_mode, audio_speed, tools, boosted_keywords, generate_no_input_poke_text, no_input_poke_sec, no_input_poke_text, no_input_end_conversation_sec) as direct overrides. Because these had the same values that buildConfigOptions already returned, the overrides were purely redundant. The one exception was min_words_to_interrupt, which was commented out in buildConfigOptions but included directly in sendConfig. The PR un-comments it in buildConfigOptions (lines 950-952) to maintain equivalence. As a side effect, min_words_to_interrupt is now also sent during mid-session resets via _updateSession → sendReset, which previously omitted it. This is likely a bug fix rather than a regression.
Make buildConfigOptions the single source of truth for session config so sendConfig and sendReset stay in sync. Drop the redundant field overrides in sendConfig and move min_words_to_interrupt into buildConfigOptions so it is also applied on mid-session resets. Co-authored-by: Cursor <cursoragent@cursor.com>
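The single-source-of-truth arrangement described in this commit can be sketched roughly as follows. `SessionSketch`, `RecordingSocket`, and the camelCase option names are hypothetical stand-ins for illustration; only the snake_case config keys and the buildConfigOptions / sendConfig / sendReset names come from the PR discussion.

```typescript
// Hypothetical option names, modeled on the fields quoted in the review.
interface SessionOptions {
  audioSpeed?: number;
  noInputPokeSec?: number;
  minWordsToInterrupt?: number;
}

type ConfigOptions = Record<string, unknown>;

// Stand-in for the real WebSocket wrapper; it only records what was sent.
class RecordingSocket {
  readonly sent: Array<{ kind: "config" | "reset"; payload: ConfigOptions }> = [];

  sendConfig(payload: ConfigOptions): void {
    this.sent.push({ kind: "config", payload });
  }

  sendReset(payload: ConfigOptions): void {
    this.sent.push({ kind: "reset", payload });
  }
}

class SessionSketch {
  private options: SessionOptions;
  private readonly socket: RecordingSocket;

  constructor(options: SessionOptions, socket: RecordingSocket) {
    this.options = options;
    this.socket = socket;
  }

  // Every config field is defined here and nowhere else.
  private buildConfigOptions(): ConfigOptions {
    return {
      audio_speed: this.options.audioSpeed,
      no_input_poke_sec: this.options.noInputPokeSec,
      min_words_to_interrupt: this.options.minWordsToInterrupt,
      stream_ahead_of_real_time: true,
    };
  }

  // Initial connection: send the built config without per-field overrides.
  connect(): void {
    this.socket.sendConfig(this.buildConfigOptions());
  }

  // Mid-session update: the reset reuses the same builder, so it carries
  // the same set of fields as the initial config.
  updateSession(patch: Partial<SessionOptions>): void {
    this.options = { ...this.options, ...patch };
    this.socket.sendReset(this.buildConfigOptions());
  }
}
```

Because both paths call the same builder, a field such as min_words_to_interrupt cannot appear in the initial config and be missing from a reset.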
    no_input_poke_sec: this.options.noInputPokeSec,
    no_input_poke_text: this.options.noInputPokeText,
    no_input_end_conversation_sec: this.options.noInputEndConversationSec,
    stream_ahead_of_real_time: true,
🚩 stream_ahead_of_real_time is hardcoded with no user opt-out
The stream_ahead_of_real_time: true option at line 957 is hardcoded in buildConfigOptions with no corresponding constructor option to disable it. While this aligns with the PR's stated intent, users who experience issues with ahead-of-real-time streaming have no way to fall back to the previous real-time behavior without modifying source code. This may be intentional if the feature is considered stable, but it's worth confirming this is the desired default for all use cases.
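For illustration only, an opt-out of the kind this comment describes might look like the sketch below. The `streamAheadOfRealTime` option and the `resolveStreamAhead` helper are hypothetical and are not part of this PR, which hardcodes the value.

```typescript
// Hypothetical constructor option; the PR does not add this.
interface RealtimeOptionsSketch {
  streamAheadOfRealTime?: boolean;
}

// Default to true so existing callers keep the new behavior,
// while an explicit false restores real-time pacing.
function resolveStreamAhead(options: RealtimeOptionsSketch): boolean {
  return options.streamAheadOfRealTime ?? true;
}

// Hypothetical builder fragment using the resolved value.
function buildStreamingConfig(options: RealtimeOptionsSketch): { stream_ahead_of_real_time: boolean } {
  return { stream_ahead_of_real_time: resolveStreamAhead(options) };
}
```

Using `??` rather than `||` matters here: an explicit `false` must be respected instead of being replaced by the default.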
No, that's intentional!
Description
Use stream_ahead_of_real_time so assistant audio is sent to LiveKit agents as soon as it is generated.
Pre-Review Checklist
Testing
restaurant_agent.ts and realtime_agent.ts work properly (for major changes)
Additional Notes
Note to reviewers: Please ensure the pre-review checklist is completed before starting your review.