### Bug Description
Summary
The experimental AWS Nova Sonic realtime plugin serializes the turn-detection
setting as a flat `endpointingSensitivity` field directly under `sessionStart`.
Amazon Nova Sonic 2 (`amazon.nova-2-sonic-v1:0`) now strictly validates this event
and requires the value nested under `turnDetectionConfiguration`, rejecting the flat
field with `ValidationException: Invalid input request`. This kills the session at
startup and cascades into `speech not done in time after interruption`, followed by shutdown.
Affected code
livekit-plugins/livekit-plugins-aws/.../experimental/realtime/events.py
class SessionStart(BaseModel):
    inferenceConfiguration: InferenceConfiguration
    endpointingSensitivity: TURN_DETECTION | None = "MEDIUM"  # flat: Sonic 1 shape
Emitted event:
{"event":{"sessionStart":{"inferenceConfiguration":{...},"endpointingSensitivity":"HIGH"}}}
Notes
- The plugin's serialization is unchanged across recent releases (verified 1.3.10
through 1.5.8 and current `main`), so this is an AWS-side validation tightening,
not a plugin regression. Previously the flat field was silently ignored, which
also means `turn_detection=...` was effectively a no-op on Sonic 2 until now.
- The fix needs to stay model-aware: Nova Sonic 2 requires the nested form; Nova Sonic 1
(`amazon.nova-sonic-v1:0`) predates controllable endpointing, so it should keep
the legacy flat field (or omit it) to avoid a regression.
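The two wire shapes described in these notes can be compared side by side with a small stdlib-only sketch. It is illustrative and not plugin code: `session_start_payload` and the two constants are hypothetical names, the inference values simply mirror the defaults quoted elsewhere in this report, and the Sonic 1 branch assumes the flat field is kept rather than omitted.

```python
import json
from typing import Optional

# Model ids as quoted in this report.
NOVA_SONIC_1 = "amazon.nova-sonic-v1:0"
NOVA_SONIC_2 = "amazon.nova-2-sonic-v1:0"


def session_start_payload(model_id: str, sensitivity: Optional[str]) -> str:
    """Hypothetical helper: build a sessionStart event as a JSON string."""
    session_start: dict = {
        "inferenceConfiguration": {"maxTokens": 1024, "topP": 0.9, "temperature": 0.7}
    }
    if sensitivity is not None:
        if model_id == NOVA_SONIC_2:
            # Nested shape that Nova Sonic 2 validates against.
            session_start["turnDetectionConfiguration"] = {
                "endpointingSensitivity": sensitivity
            }
        else:
            # Legacy flat shape (assumption: kept for Sonic 1 instead of omitted).
            session_start["endpointingSensitivity"] = sensitivity
    return json.dumps({"event": {"sessionStart": session_start}})


if __name__ == "__main__":
    print(session_start_payload(NOVA_SONIC_2, "HIGH"))
    print(session_start_payload(NOVA_SONIC_1, "HIGH"))
```

Passing `None` as the sensitivity leaves both fields out, which corresponds to the "or omit it" option for Sonic 1.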
### Expected Behavior
Expected (per AWS Nova 2 docs)
https://docs.aws.amazon.com/nova/latest/nova2-userguide/sonic-turn-taking.html
```json
{"event":{"sessionStart":{"inferenceConfiguration":{...},"turnDetectionConfiguration":{"endpointingSensitivity":"HIGH"}}}}
```
### Reproduction Steps
```bash
1. RealtimeModel.with_nova_sonic_2(voice="tiffany", turn_detection="HIGH")
2. Start a session.
3. Bedrock returns ValidationException: Invalid input request on the sessionStart
   event; the session closes immediately.
```
### Operating System
linux
### Models Used
amazon.nova-2-sonic-v1:0
### Package Versions
- livekit-plugins-aws 1.5.8 (also present on main)
- Model: amazon.nova-2-sonic-v1:0
### Session/Room/Call IDs
_No response_
### Proposed Solution
This is the non-regressing version: Nova Sonic 2 gets the nested form, Sonic 1 keeps the flat form. It threads the model id into the event builder (the call site in `realtime_model.py` already has `self._realtime_model._opts.model`).
events.py
# add near SessionStart
class TurnDetectionConfiguration(BaseModel):
    endpointingSensitivity: TURN_DETECTION

class SessionStart(BaseModel):
    inferenceConfiguration: InferenceConfiguration
    # Nova Sonic 1 used a flat field; Nova Sonic 2 requires it nested under
    # turnDetectionConfiguration. At most one is populated per model.
    endpointingSensitivity: TURN_DETECTION | None = None
    turnDetectionConfiguration: TurnDetectionConfiguration | None = None

class SonicEventBuilder:
    def __init__(
        self,
        prompt_name: str,
        audio_content_name: str,
        model: str = "amazon.nova-2-sonic-v1:0",
    ):
        ...
        self._nova_sonic_2 = model == "amazon.nova-2-sonic-v1:0"

    def create_session_start_event(
        self,
        max_tokens: int = 1024,
        top_p: float = 0.9,
        temperature: float = 0.7,
        endpointing_sensitivity: TURN_DETECTION | None = "MEDIUM",
    ) -> str:
        inference = InferenceConfiguration(
            maxTokens=max_tokens, topP=top_p, temperature=temperature
        )
        if self._nova_sonic_2 and endpointing_sensitivity is not None:
            # Nova Sonic 2: nest the value under turnDetectionConfiguration.
            session_start = SessionStart(
                inferenceConfiguration=inference,
                turnDetectionConfiguration=TurnDetectionConfiguration(
                    endpointingSensitivity=endpointing_sensitivity
                ),
            )
        else:
            # Nova Sonic 1 (or no sensitivity given): keep the legacy flat field.
            session_start = SessionStart(
                inferenceConfiguration=inference,
                endpointingSensitivity=endpointing_sensitivity,
            )
        event = Event(event=SessionStartEvent(sessionStart=session_start))
        return event.model_dump_json(exclude_none=True)  # was exclude_none=False
realtime_model.py: pass the model id at both seb(...) construction sites (lines ~500 and ~741):
self._event_builder = seb(
    prompt_name=str(uuid.uuid4()),
    audio_content_name=str(uuid.uuid4()),
    model=self._realtime_model._opts.model,
)
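To confirm which shape a build actually emits, the serialized `sessionStart` event can be inspected directly. The following is a minimal stdlib-only sketch; `sensitivity_location` is a hypothetical helper and not part of the plugin, and it assumes the event JSON string has already been captured (for example, from the return value of `create_session_start_event`).

```python
import json


def sensitivity_location(event_json: str) -> str:
    """Report where endpointingSensitivity sits in a sessionStart event.

    Returns "nested", "flat", "both", or "absent".
    """
    session_start = json.loads(event_json)["event"]["sessionStart"]
    flat = "endpointingSensitivity" in session_start
    turn_cfg = session_start.get("turnDetectionConfiguration") or {}
    nested = "endpointingSensitivity" in turn_cfg
    if flat and nested:
        return "both"
    if nested:
        return "nested"
    if flat:
        return "flat"
    return "absent"


if __name__ == "__main__":
    # Shape reported as rejected by Nova Sonic 2 (inference fields left empty here).
    rejected = (
        '{"event":{"sessionStart":{"inferenceConfiguration":{},'
        '"endpointingSensitivity":"HIGH"}}}'
    )
    print(sensitivity_location(rejected))
```

With the proposed change, a Nova Sonic 2 builder should yield `"nested"` and a Sonic 1 builder `"flat"`; `"both"` would indicate that the two fields were populated together.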
### Additional Context
_No response_
### Screenshots and Recordings
_No response_