Cloud turn detector timeout and late EOU detection warnings after switching to LiveKit Inference VAD/turn detector #6177

### Bug Description

After switching from the plugin-based VAD and turn detector to LiveKit Inference VAD + LiveKit Inference audio turn detector, I started seeing warnings/errors that did not occur with the previous setup.

### Current issues

1. Late EOU detection:

```
eou detection ran after the audio eot turn was already flushed (likely a late stt final).
consider raising `min_delay` in the endpointing options to accommodate slow stt.
```

2. Cloud turn detector timeout/failure:

```
cloud turn detector failed (eot prediction timed out); falling back to local mini model
```

3. Higher worker/job memory usage: around 970+ MB now, compared to below 750 MB with the previous setup.
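For anyone comparing the two setups, here is a minimal sketch of logging peak memory from inside a process, using only the standard library. `peak_rss_mb` is a hypothetical helper name, it only covers the process that calls it (not child processes), and the figures above were not necessarily measured this way.

```python
import resource
import sys


def peak_rss_mb() -> float:
    """Return the peak resident set size of the current process in MB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


if __name__ == "__main__":
    print(f"peak RSS: {peak_rss_mb():.1f} MB")
```

Logging this once with the plugin setup and once with the Inference setup, at the same point in a call, would give numbers that can be compared directly.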

### Previous setup

* Silero VAD plugin
* LiveKit plugin turn detector

No warnings were observed.

### Current setup

* LiveKit Inference VAD
* LiveKit Inference audio turn detector

### Configuration

VAD: (the turn detector throws an error when the value is below this; this should be documented on the website)

```
MIN_SILENCE_DURATION=0.25
```

Endpointing: (I understand these values are aggressive, but I need them to keep latency as low as possible)

```
MIN_ENDPOINTING_DELAY=0.15
MAX_ENDPOINTING_DELAY=1.2
ENDPOINTING_MODE=fixed
```
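As a rough sketch of how these values could be loaded and sanity-checked before being handed to the session: the environment variable names come from this issue, while `EndpointingConfig` and `load_endpointing_config` are hypothetical names, and how the values are passed on to LiveKit is deliberately not shown.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class EndpointingConfig:
    min_delay: float  # seconds
    max_delay: float  # seconds
    mode: str


def load_endpointing_config(env: Mapping[str, str] = os.environ) -> EndpointingConfig:
    """Read the endpointing settings, defaulting to the values in this issue."""
    cfg = EndpointingConfig(
        min_delay=float(env.get("MIN_ENDPOINTING_DELAY", "0.15")),
        max_delay=float(env.get("MAX_ENDPOINTING_DELAY", "1.2")),
        mode=env.get("ENDPOINTING_MODE", "fixed"),
    )
    # Reject obviously inconsistent values early instead of at call time.
    if not 0 <= cfg.min_delay <= cfg.max_delay:
        raise ValueError(
            f"expected 0 <= min_delay <= max_delay, got "
            f"min_delay={cfg.min_delay}, max_delay={cfg.max_delay}"
        )
    return cfg
```

Failing fast on `min_delay > max_delay` is a choice made for this sketch, not something LiveKit is known to enforce.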

Deepgram STT settings (endpointing is not set, so LiveKit's default of 25 ms applies):
```
    kwargs: dict = {
        "model": (c and c.model) or "nova-3",
        "language": (c and c.language) or "en",
        "filler_words": True,
        "smart_format": True,
    }
```

### Observations

The warning suggests that EOU detection is completing after the turn has already been flushed, possibly because the endpointing delay is too aggressive or the cloud turn detector response arrives late.
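To make that reading concrete, here is a toy timing model. It is an assumption about the behaviour, based only on the warning text, and not LiveKit's actual implementation: it assumes the turn is flushed `min_delay` seconds after the VAD reports end of speech, so any STT final arriving later than that triggers the warning. Both function names are hypothetical.

```python
def eou_runs_late(stt_final_latency_s: float, min_delay_s: float) -> bool:
    """Return True if the STT final lands after the turn would be flushed.

    Both values are measured from the moment the VAD reports end of speech.
    Assumption: the turn is flushed exactly `min_delay_s` after that moment.
    """
    return stt_final_latency_s > min_delay_s


def smallest_safe_min_delay(stt_final_latencies_s: list[float], margin_s: float = 0.05) -> float:
    """Return the slowest observed STT final latency plus a safety margin."""
    if not stt_final_latencies_s:
        raise ValueError("need at least one latency sample")
    return max(stt_final_latencies_s) + margin_s


# Illustrative latency only, not measured from the sessions in this issue.
print(eou_runs_late(stt_final_latency_s=0.30, min_delay_s=0.15))  # True
```

Under this model, a `min_delay` of 0.15 s only avoids the warning if STT finals consistently arrive within 150 ms of end of speech.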

I reduced the delays for lower latency, but after moving to the inference models these warnings started appearing.

### Questions

1. Is this expected behaviour when using LiveKit Inference turn detector compared to the plugin version?
2. What is the recommended `min_delay`/`max_delay` range when using cloud turn detection with external STT providers?
3. Is there any known reason for increased memory usage after enabling LiveKit Inference VAD + turn detector?

I would appreciate it if these issues could be fixed.



### Expected Behavior

1. Memory usage should not increase by 200+ MB, since these are presumably the same VAD and turn detector.
2. There should be no failures or timeouts.
3. The website should document the minimum and recommended settings, and explain what could cause RAM usage to increase.

### Reproduction Steps

Switch from the Silero VAD plugin and the LiveKit plugin turn detector to the LiveKit Inference versions.

### Operating System

LiveKit Cloud

### Models Used

Deepgram nova-3, gpt 5.4 mini, Cartesia TTS

### Package Versions

```bash
"livekit-agents>=1.6.1",
"livekit-api>=1.1.0"
```

### Session/Room/Call IDs

RM_HQMqnZ3afh2U, RM_GfoXD29Yfto6, RM_7BcQza5BEoAA

### Proposed Solution

_No response_

### Additional Context

_No response_

### Screenshots and Recordings

_No response_
