(phonic): Use stream_ahead_of_real_time mode for Phonic WebSocket and sample rate 24000 #1885

Merged: tinalenguyen merged 3 commits into livekit:main from Phonic-Co:q/phonic-stream-ahead-pcm24000 on Jun 26, 2026
@qionghuang6 qionghuang6 commented Jun 25, 2026

Contributor

Description

  • Bump phonic to 0.32.5 and enable stream_ahead_of_real_time so assistant audio is sent to LiveKit agents as soon as it is generated.
  • Switch the input and output formats to pcm_24000.
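
As a rough illustration of the two bullets above, the session config ends up carrying a 24 kHz PCM format in both directions plus the stream-ahead flag. This is a minimal sketch, not the plugin's code: the `SessionConfigSketch` type, the `buildConfigSketch` helper, and the `input_format` / `output_format` field names are assumptions made for the example. Only `pcm_24000` and `stream_ahead_of_real_time` are taken from this PR.

```typescript
// Hypothetical, simplified config shape. The real types come from the phonic SDK.
interface SessionConfigSketch {
  input_format: string; // assumed field name
  output_format: string; // assumed field name
  stream_ahead_of_real_time: boolean; // flag named in this PR
}

// Hypothetical helper that returns the values this PR switches to.
function buildConfigSketch(): SessionConfigSketch {
  return {
    input_format: 'pcm_24000',
    output_format: 'pcm_24000',
    stream_ahead_of_real_time: true,
  };
}

const sketchConfig = buildConfigSketch();
console.log(JSON.stringify(sketchConfig));
```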

Pre-Review Checklist

  • Build passes: All builds (lint, typecheck, tests) pass locally
  • AI-generated code reviewed: Removed unnecessary comments and ensured code quality
  • Changes explained: All changes are properly documented and justified above
  • Scope appropriate: All changes relate to the PR title, or explanations provided for why they're included
  • Video demo: A small video demo showing changes works as expected and did not break any existing functionality using Agent Playground (if applicable)

Testing

  • Automated tests added/updated (if applicable)
  • All tests pass
  • Make sure both restaurant_agent.ts and realtime_agent.ts work properly (for major changes)

Additional Notes


Note to reviewers: Please ensure the pre-review checklist is completed before starting your review.

Bump phonic to 0.32.5 and enable stream_ahead_of_real_time so assistant
audio is sent to the client as soon as it is generated. Switch the input
and output formats to pcm_24000 (the STS buffer's native rate) to avoid
resampling. Drop the ConfigOptions cast now that stream_ahead_of_real_time
is typed, and align tool definitions with the 0.32.5 ToolDefinition types.

Co-authored-by: Cursor <cursoragent@cursor.com>
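
To make the sample-rate change concrete, the arithmetic below converts between audio duration and byte counts at 24 kHz. It is a sketch under stated assumptions: 16-bit samples and a single channel are assumed here and are not confirmed by the PR, and both helper names are made up for the example.

```typescript
const SAMPLE_RATE_HZ = 24_000; // from pcm_24000
const BYTES_PER_SAMPLE = 2; // assumption: 16-bit PCM
const CHANNELS = 1; // assumption: mono

// Bytes needed to hold `ms` milliseconds of audio.
function bytesForDurationMs(ms: number): number {
  return ((SAMPLE_RATE_HZ * ms) / 1000) * BYTES_PER_SAMPLE * CHANNELS;
}

// Playback duration in milliseconds of a buffer of `byteLength` bytes.
function durationMsForBytes(byteLength: number): number {
  return (byteLength / (BYTES_PER_SAMPLE * CHANNELS) / SAMPLE_RATE_HZ) * 1000;
}

console.log(bytesForDurationMs(20)); // 960 bytes in a 20 ms frame
console.log(durationMsForBytes(48_000)); // 1000 ms of audio in 48,000 bytes
```

With stream-ahead enabled, chunks can arrive faster than their playback duration, so the receiver has to track buffered duration rather than assume one chunk per wall-clock interval.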

changeset-bot Bot commented Jun 25, 2026


🦋 Changeset detected

Latest commit: 33febb0

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 35 packages
Name Type
@livekit/agents-plugin-phonic Patch
@livekit/agents Patch
@livekit/agents-plugin-anam Patch
@livekit/agents-plugin-assemblyai Patch
@livekit/agents-plugin-baseten Patch
@livekit/agents-plugin-bey Patch
@livekit/agents-plugin-cartesia Patch
@livekit/agents-plugin-cerebras Patch
@livekit/agents-plugin-deepgram Patch
@livekit/agents-plugin-did Patch
@livekit/agents-plugin-elevenlabs Patch
@livekit/agents-plugin-fishaudio Patch
@livekit/agents-plugin-google Patch
@livekit/agents-plugin-hedra Patch
@livekit/agents-plugin-hume Patch
@livekit/agents-plugin-inworld Patch
@livekit/agents-plugin-lemonslice Patch
@livekit/agents-plugin-liveavatar Patch
@livekit/agents-plugin-livekit Patch
@livekit/agents-plugin-minimax Patch
@livekit/agents-plugin-mistral Patch
@livekit/agents-plugin-mistralai Patch
@livekit/agents-plugin-neuphonic Patch
@livekit/agents-plugin-openai Patch
@livekit/agents-plugin-perplexity Patch
@livekit/agents-plugin-resemble Patch
@livekit/agents-plugin-rime Patch
@livekit/agents-plugin-runway Patch
@livekit/agents-plugin-sarvam Patch
@livekit/agents-plugin-silero Patch
@livekit/agents-plugin-soniox Patch
@livekit/agents-plugin-tavus Patch
@livekit/agents-plugin-trugen Patch
@livekit/agents-plugin-xai Patch
@livekit/agents-plugins-test Patch

@qionghuang6 qionghuang6 changed the title (phonic): Use faster-than-realtime mode for Phonic WebSocket and sample rate 24000 (phonic): Use stream_ahead_of_real_timemode for Phonic WebSocket and sample rate 24000 Jun 25, 2026
@qionghuang6 qionghuang6 changed the title (phonic): Use stream_ahead_of_real_timemode for Phonic WebSocket and sample rate 24000 (phonic): Use stream_ahead_of_real_time mode for Phonic WebSocket and sample rate 24000 Jun 25, 2026
Co-authored-by: Cursor <cursoragent@cursor.com>
@qionghuang6 qionghuang6 marked this pull request as ready for review June 25, 2026 23:44

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 potential issue.

@devin-ai-integration devin-ai-integration Bot Jun 25, 2026

📝 Info: Config deduplication removes redundant overrides from sendConfig

The old connect() method spread buildConfigOptions(...) into the sendConfig payload and then re-specified many of the same properties (additional_languages, multilingual_mode, audio_speed, tools, boosted_keywords, generate_no_input_poke_text, no_input_poke_sec, no_input_poke_text, no_input_end_conversation_sec) as direct overrides. Because these had the same values that buildConfigOptions already returned, the overrides were redundant. The one exception was min_words_to_interrupt, which was commented out in buildConfigOptions but included directly in sendConfig. The PR un-comments it in buildConfigOptions (lines 950-952) to maintain equivalence. As a result, min_words_to_interrupt is now also sent during mid-session resets via _updateSession → sendReset, which previously did not include it. This is likely a bug fix rather than a regression.
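
The redundancy described above can be sketched as follows. This is a simplified illustration, not the plugin's actual code: the `OptionsSketch` type, the two builder functions, and the payload variables are hypothetical, and only two of the config fields are modeled.

```typescript
// Hypothetical, trimmed-down options object.
interface OptionsSketch {
  audioSpeed: number;
  minWordsToInterrupt: number;
}

const options: OptionsSketch = { audioSpeed: 1.0, minWordsToInterrupt: 2 };

// Old shape: the builder omits min_words_to_interrupt.
function oldBuildConfigOptions(o: OptionsSketch) {
  return { audio_speed: o.audioSpeed };
}

// Old config payload: spread the builder, then repeat fields as overrides.
const oldConfigPayload = {
  ...oldBuildConfigOptions(options),
  audio_speed: options.audioSpeed, // redundant: same value as the spread
  min_words_to_interrupt: options.minWordsToInterrupt, // only set here
};
// Old reset payload: builder only, so min_words_to_interrupt is missing.
const oldResetPayload = oldBuildConfigOptions(options);

// New shape: the builder is the single source of truth.
function newBuildConfigOptions(o: OptionsSketch) {
  return {
    audio_speed: o.audioSpeed,
    min_words_to_interrupt: o.minWordsToInterrupt,
  };
}

const newConfigPayload = newBuildConfigOptions(options);
const newResetPayload = newBuildConfigOptions(options);

console.log('min_words_to_interrupt' in oldResetPayload); // false
console.log(JSON.stringify(newConfigPayload) === JSON.stringify(newResetPayload)); // true
```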

Make buildConfigOptions the single source of truth for session config so
sendConfig and sendReset stay in sync. Drop the redundant field overrides
in sendConfig and move min_words_to_interrupt into buildConfigOptions so it
is also applied on mid-session resets.

Co-authored-by: Cursor <cursoragent@cursor.com>

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 new potential issue.

no_input_poke_sec: this.options.noInputPokeSec,
no_input_poke_text: this.options.noInputPokeText,
no_input_end_conversation_sec: this.options.noInputEndConversationSec,
stream_ahead_of_real_time: true,

🚩 stream_ahead_of_real_time is hardcoded with no user opt-out

The stream_ahead_of_real_time: true option at line 957 is hardcoded in buildConfigOptions with no corresponding constructor option to disable it. While this aligns with the PR's stated intent, users who experience issues with ahead-of-real-time streaming have no way to fall back to the previous real-time behavior without modifying source code. This may be intentional if the feature is considered stable, but it's worth confirming this is the desired default for all use cases.
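
For illustration only, the opt-out the reviewer describes might look like the sketch below. The `streamAheadOfRealTime` option and the `resolveStreamAhead` helper are invented for this example. The PR deliberately hardcodes the flag and exposes no such option.

```typescript
// Hypothetical constructor options. Not part of the plugin's API.
interface RealtimeOptionsSketch {
  streamAheadOfRealTime?: boolean;
}

// Defaults to true, matching the hardcoded value in this PR, unless the
// caller explicitly passes false.
function resolveStreamAhead(opts: RealtimeOptionsSketch = {}): boolean {
  return opts.streamAheadOfRealTime ?? true;
}

console.log(resolveStreamAhead()); // true
console.log(resolveStreamAhead({ streamAheadOfRealTime: false })); // false
```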

Contributor Author

No, that's intentional!

@tinalenguyen tinalenguyen merged commit 639fef2 into livekit:main Jun 26, 2026
6 checks passed
@github-actions github-actions Bot mentioned this pull request Jun 26, 2026