Bug Description
After switching from the plugin-based VAD and turn detector to LiveKit Inference VAD + LiveKit Inference audio turn detector, I started seeing warnings/errors that did not occur with the previous setup.
Current issues
- Late EOU detection:
eou detection ran after the audio eot turn was already flushed (likely a late stt final).
consider raising `min_delay` in the endpointing options to accommodate slow stt.
- Cloud turn detector timeout/failure:
cloud turn detector failed (eot prediction timed out); falling back to local mini model
- Higher worker/job memory usage: around 970+ MB with the current setup, compared to below 750 MB with the previous one.
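To make the memory comparison between the two setups reproducible, peak resident memory can be logged from inside the job process. The following is a minimal sketch using only the Python standard library on Unix-like systems; `peak_rss_mb` is a hypothetical helper name and is not part of LiveKit.

```python
import resource
import sys


def peak_rss_mb() -> float:
    """Return this process's peak resident set size in MB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


if __name__ == "__main__":
    # Hypothetical usage: log this once per job, under both setups.
    print(f"peak RSS: {peak_rss_mb():.1f} MB")
```

Logging this value at the end of a call under both the plugin setup and the Inference setup would give directly comparable numbers.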
Previous setup
- Silero VAD plugin
- LiveKit plugin turn detector
No warnings were observed.
Current setup
- LiveKit Inference VAD
- LiveKit Inference audio turn detector
Configuration
VAD: (the turn detector throws an error when this is set below this value; this should be documented on the website)
MIN_SILENCE_DURATION=0.25
Endpointing: (I understand these values are aggressive, but I need them to keep latency as low as possible)
MIN_ENDPOINTING_DELAY=0.15
MAX_ENDPOINTING_DELAY=1.2
ENDPOINTING_MODE=fixed
Deepgram STT settings (endpointing is not set here, so the LiveKit default of 25 ms applies):
    kwargs: dict = {
        "model": (c and c.model) or "nova-3",
        "language": (c and c.language) or "en",
        "filler_words": True,
        "smart_format": True,
    }
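For reference, the VAD and endpointing values above can be loaded and sanity-checked before they reach the agent session. This is only a sketch: the environment variable names are the ones listed above, while `TurnSettings` and `load_turn_settings` are hypothetical names and do not correspond to any LiveKit API.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class TurnSettings:
    """Hypothetical container for the values listed above (not a LiveKit type)."""

    min_silence_duration: float
    min_endpointing_delay: float
    max_endpointing_delay: float
    endpointing_mode: str


def load_turn_settings(env=None) -> TurnSettings:
    """Read the settings from a mapping (defaults to os.environ) and validate them."""
    env = os.environ if env is None else env
    settings = TurnSettings(
        min_silence_duration=float(env.get("MIN_SILENCE_DURATION", "0.25")),
        min_endpointing_delay=float(env.get("MIN_ENDPOINTING_DELAY", "0.15")),
        max_endpointing_delay=float(env.get("MAX_ENDPOINTING_DELAY", "1.2")),
        endpointing_mode=env.get("ENDPOINTING_MODE", "fixed"),
    )
    # All durations are in seconds and must not be negative.
    if min(settings.min_silence_duration, settings.min_endpointing_delay) < 0:
        raise ValueError("durations must be non-negative")
    # A minimum delay above the maximum delay is a configuration mistake.
    if settings.min_endpointing_delay > settings.max_endpointing_delay:
        raise ValueError("MIN_ENDPOINTING_DELAY exceeds MAX_ENDPOINTING_DELAY")
    return settings
```

Calling `load_turn_settings({})` returns the values from this report, which makes it easy to print the effective configuration at worker startup.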
Observations
The warning suggests that EOU detection is completing after the turn has already been flushed, possibly because the endpointing delay is too aggressive or the cloud turn detector response arrives late.
I reduced the delays for lower latency, but after moving to the inference models these warnings started appearing.
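The suspected race can be illustrated with a toy timing model. This is an assumption about the cause, not LiveKit's actual logic: it supposes the turn is flushed `min_delay` seconds after speech ends, and that the warning fires whenever the final STT transcript arrives later than that. The function names and any latency values passed in are hypothetical.

```python
def eou_runs_after_flush(stt_final_latency: float, min_delay: float) -> bool:
    """Toy model: True if the STT final arrives after the turn was flushed.

    Both arguments are seconds measured from the end of speech.
    """
    return stt_final_latency > min_delay


def smallest_safe_min_delay(stt_final_latencies, margin: float = 0.05) -> float:
    """Return a min_delay that covers the slowest observed STT final plus a margin."""
    latencies = list(stt_final_latencies)
    if not latencies:
        raise ValueError("at least one latency sample is required")
    return max(latencies) + margin
```

Under this model, a `min_delay` of 0.15 s would trigger the warning whenever the STT final takes longer than 150 ms to arrive, which would be consistent with the log message suggesting a higher `min_delay`.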
Questions
- Is this expected behaviour when using LiveKit Inference turn detector compared to the plugin version?
- What is the recommended min_delay/max_delay range when using cloud turn detection with external STT providers?
- Is there any known reason for increased memory usage after enabling LiveKit Inference VAD + turn detector?
I would appreciate it if these issues could be fixed.
Expected Behavior
- Memory usage should not increase by 200+ MB, since it is the same VAD and turn detector.
- There should be no failures or timeouts.
- The website should properly document the minimum and recommended settings, and explain why RAM usage might increase.
Reproduction Steps
I only replaced the LiveKit Silero VAD plugin and the plugin turn detector with the LiveKit Inference equivalents.
Operating System
LiveKit Cloud
Models Used
Deepgram nova-3, gpt 5.4 mini, Cartesia TTS
Package Versions
"livekit-agents>=1.6.1",
"livekit-api>=1.1.0"
Session/Room/Call IDs
RM_HQMqnZ3afh2U, RM_GfoXD29Yfto6, RM_7BcQza5BEoAA
Proposed Solution
Additional Context
No response
Screenshots and Recordings
No response