5 changes: 5 additions & 0 deletions .changeset/google-context-exhaustion.md
@@ -0,0 +1,5 @@
---
'@livekit/agents-plugin-google': patch
---

Surface Gemini Live `1007` context exhaustion errors as unrecoverable session errors instead of retrying the same oversized context.
76 changes: 60 additions & 16 deletions plugins/google/src/realtime/realtime_api.ts

@devin-ai-integration devin-ai-integration Bot Jun 19, 2026

🟡 Stale sessionError not cleared between retry iterations can terminate a healthy session

The new sessionError field is set by the onclose, sendTask, and onReceiveMessage callbacks and is meant to be consumed (and cleared) at plugins/google/src/realtime/realtime_api.ts:1097-1101. However, if cancelAndWait at line 1095 throws (e.g., due to the 2-second timeout), execution jumps directly to the catch block and skips the sessionError check. The stale sessionError is then never cleared, neither by closeActiveSession() (plugins/google/src/realtime/realtime_api.ts:523-544) nor at the top of the next while iteration (lines 993-997). On the next iteration, even after a new session has connected successfully, the stale sessionError is found at line 1097, thrown, and processed. If the stale error was a 1007 context-exhaustion error, isContextExhaustedError returns true at line 1110 and the healthy new session is terminated.
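
One possible way to avoid the stale error is sketched below. It is a minimal, synchronous illustration and not the plugin's code: `SessionErrorSlot` and `runIteration` are hypothetical names, and `teardown` merely stands in for the `cancelAndWait` call. The idea is to clear the slot at the start of every iteration and to consume it on the failure path as well, so that nothing can leak into the next iteration.

```typescript
// Hypothetical holder for an error reported by session callbacks.
class SessionErrorSlot {
  private pending?: Error;

  set(error: Error): void {
    this.pending = error;
  }

  // Return the pending error (if any) and clear it in one step.
  take(): Error | undefined {
    const error = this.pending;
    this.pending = undefined;
    return error;
  }
}

function toError(error: unknown): Error {
  return error instanceof Error ? error : new Error(String(error));
}

// One retry-loop iteration. `teardown` stands in for cancelAndWait and may throw.
// Returns the error to handle for this iteration, or undefined if there is none.
function runIteration(
  slot: SessionErrorSlot,
  session: () => void,
  teardown: () => void,
): Error | undefined {
  slot.take(); // drop anything left over from a previous iteration
  try {
    session();
    teardown();
    return slot.take();
  } catch (error) {
    // Consume the slot here too, so a teardown failure cannot leave it set.
    const pending = slot.take();
    return pending ?? toError(error);
  }
}
```

Preferring the callback error over the teardown error in the `catch` branch is a design choice of this sketch: an error reported during the current iteration is still meaningful, whereas one carried over from an earlier iteration is not.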

@@ -49,6 +49,7 @@ const LK_GOOGLE_DEBUG = Number(process.env.LK_GOOGLE_DEBUG ?? 0);

// WebSocket close codes (RFC 6455)
const WS_CLOSE_NORMAL = 1000;
const WS_CLOSE_CONTEXT_EXHAUSTED = 1007;

🚩 WebSocket code 1007 is RFC 6455 'Invalid frame payload data', not a standard context exhaustion code

The constant WS_CLOSE_CONTEXT_EXHAUSTED = 1007 is placed under the comment // WebSocket close codes (RFC 6455). However, RFC 6455 defines code 1007 as 'Invalid frame payload data' (non-UTF-8 data in a text frame). Google's use of 1007 for context exhaustion is a non-standard application-specific meaning. The naming and comment could mislead future maintainers into thinking this is a standard code. Consider adding a note like // Google-specific: Gemini uses 1007 for context exhaustion (RFC 6455 defines 1007 as 'Invalid frame payload data').
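
A minimal sketch of how the standard and provider-specific meanings could be kept apart follows. The names `WS_CLOSE_INVALID_PAYLOAD`, `GEMINI_CLOSE_CONTEXT_EXHAUSTED`, and `describeGeminiClose` are hypothetical, and the Gemini-specific meaning of 1007 is taken from this PR's description rather than from any standard.

```typescript
// Standard close codes as defined by RFC 6455, section 7.4.1.
const WS_CLOSE_NORMAL = 1000;
const WS_CLOSE_INVALID_PAYLOAD = 1007; // "invalid frame payload data"

// Provider-specific alias (assumption based on this PR): Gemini Live closes
// with 1007 when the session context is exhausted.
const GEMINI_CLOSE_CONTEXT_EXHAUSTED = WS_CLOSE_INVALID_PAYLOAD;

// Human-readable description that keeps both meanings visible in logs.
function describeGeminiClose(code: number): string {
  switch (code) {
    case WS_CLOSE_NORMAL:
      return 'normal closure';
    case GEMINI_CLOSE_CONTEXT_EXHAUSTED:
      return 'context exhausted (RFC 6455: invalid frame payload data)';
    default:
      return `close code ${code}`;
  }
}
```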

/**
* Default image encoding options for Google Realtime API
*/
@@ -473,6 +474,7 @@ export class RealtimeSession extends llm.RealtimeSession {
private inUserActivity = false;
private sessionLock = new Mutex();
private numRetries = 0;
private sessionError?: Error;
private hasReceivedAudioInput = false;
private pendingInterruptText = false;
private earlyCompletionPending = false;
@@ -557,6 +559,20 @@ export class RealtimeSession extends llm.RealtimeSession {
}
}

private toError(error: unknown): Error {
return error instanceof Error ? error : new Error(String(error));
}

private isContextExhaustedError(error: unknown): boolean {
return (
(typeof error === 'object' &&
error !== null &&
'statusCode' in error &&
error.statusCode === WS_CLOSE_CONTEXT_EXHAUSTED) ||
String(error).includes(String(WS_CLOSE_CONTEXT_EXHAUSTED))
);
}
Comment on lines +566 to +574

🔴 Overly broad string-based check in isContextExhaustedError can cause false-positive session termination

The fallback branch String(error).includes(String(WS_CLOSE_CONTEXT_EXHAUSTED)) expands to String(error).includes('1007'), which matches ANY error whose string representation contains the substring "1007" — not just context exhaustion errors. This is problematic because sessionError can be set from sendTask (line 1234) or onReceiveMessage (line 1348), where the error could be an arbitrary SDK/network error. If such an error's message or stringified form incidentally contains "1007" (e.g., token counts like 10071, request IDs, byte counts like "frame size 10070"), the session will be incorrectly terminated as "context exhausted" instead of retried.

How the false positive leads to session termination
  1. sendTask catches a non-context-exhaustion error (e.g., from session.sendRealtimeInput)
  2. Sets this.sessionError = this.toError(e) (a plain Error without statusCode)
  3. In #mainTask, the error is thrown and caught
  4. isContextExhaustedError(err) — first branch fails (no statusCode), but String(err).includes('1007') matches
  5. Session terminates with "context exhausted" instead of retrying

The primary onclose code path already handles the real 1007 case via the statusCode property check, making the string fallback unnecessary for the intended flow but dangerous for other errors.

Suggested change
- private isContextExhaustedError(error: unknown): boolean {
-   return (
-     (typeof error === 'object' &&
-       error !== null &&
-       'statusCode' in error &&
-       error.statusCode === WS_CLOSE_CONTEXT_EXHAUSTED) ||
-     String(error).includes(String(WS_CLOSE_CONTEXT_EXHAUSTED))
-   );
- }
+ private isContextExhaustedError(error: unknown): boolean {
+   return (
+     typeof error === 'object' &&
+     error !== null &&
+     'statusCode' in error &&
+     error.statusCode === WS_CLOSE_CONTEXT_EXHAUSTED
+   );
+ }
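
The difference between the two predicates can be reproduced in isolation. The sketch below uses standalone functions with hypothetical names (`hasStatusCode`, `looseCheck`, `strictCheck`) instead of the class methods, and plain `Error` objects instead of the SDK's error types.

```typescript
const WS_CLOSE_CONTEXT_EXHAUSTED = 1007;

// Type guard: the value is an object that carries a statusCode property.
function hasStatusCode(error: unknown): error is { statusCode: unknown } {
  return typeof error === 'object' && error !== null && 'statusCode' in error;
}

// Mirrors the PR's version: statusCode match OR substring match on "1007".
function looseCheck(error: unknown): boolean {
  return (
    (hasStatusCode(error) && error.statusCode === WS_CLOSE_CONTEXT_EXHAUSTED) ||
    String(error).includes(String(WS_CLOSE_CONTEXT_EXHAUSTED))
  );
}

// Mirrors the suggested version: statusCode match only.
function strictCheck(error: unknown): boolean {
  return hasStatusCode(error) && error.statusCode === WS_CLOSE_CONTEXT_EXHAUSTED;
}

// An unrelated error whose message happens to contain "1007".
const unrelated = new Error('frame size 10070 exceeds limit');
// A stand-in for the close error, tagged with the close code.
const closeError = Object.assign(new Error('WebSocket closed'), { statusCode: 1007 });

console.log(looseCheck(unrelated), strictCheck(unrelated)); // true false
console.log(looseCheck(closeError), strictCheck(closeError)); // true true
```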

private isNonBlockingToolBehavior(): boolean {
return this.options.toolBehavior === types.Behavior.NON_BLOCKING;
}
@@ -1023,19 +1039,23 @@ export class RealtimeSession extends llm.RealtimeSession {
const errorMsg = event.reason || `WebSocket closed with code ${event.code}`;
this.#logger.error(`Gemini Live session error: ${errorMsg}${truncationNote}`);

this.emitError(
new APIStatusError({
message: `${errorMsg}${truncationNote}`,
options: {
statusCode: event.code,
retryable: false,
body: event.reason
? { reason: event.reason, code: event.code, truncated: isTruncated }
: null,
},
}),
false,
);
const error = new APIStatusError({
message: `${errorMsg}${truncationNote}`,
options: {
statusCode: event.code,
retryable: false,
body: event.reason
? { reason: event.reason, code: event.code, truncated: isTruncated }
: null,
},
});

if (event.code === WS_CLOSE_CONTEXT_EXHAUSTED) {
this.sessionError = error;
this.markRestartNeeded();
} else {
this.emitError(error, false);
}
} else {
this.#logger.debug('Gemini Live session closed:', event.code, event.reason);
}
@@ -1084,25 +1104,47 @@ export class RealtimeSession extends llm.RealtimeSession {
}

await cancelAndWait([sendTask, restartWaitTask], 2000);

if (this.sessionError) {
const error = this.sessionError;
this.sessionError = undefined;
throw error;
}
} catch (error) {
this.#logger.error(`Gemini Realtime API error: ${error}`);
const err = this.toError(error);
this.#logger.error(`Gemini Realtime API error: ${err}`);

if (this.#closed) break;

// Gemini Live closes with 1007 when the session context is exhausted. Reconnecting
// would replay the same oversized context and fail again, so terminate the session.
if (this.isContextExhaustedError(err)) {
this.#logger.error(
err,
'Gemini Live closed the session: context exhausted (1007). Reconnecting would replay the same context and fail again; terminating the session.',
);
this.emitError(err, false);
throw new APIConnectionError({
message: 'Gemini Live session context exhausted (1007)',
options: { retryable: false },
});
}

if (maxRetries === 0) {
this.emitError(error as Error, false);
this.emitError(err, false);
throw new APIConnectionError({
message: 'Failed to connect to Gemini Live',
});
}

if (this.numRetries >= maxRetries) {
this.emitError(error as Error, false);
this.emitError(err, false);
throw new APIConnectionError({
message: `Failed to connect to Gemini Live after ${maxRetries} attempts`,
});
}

this.emitError(err, true);

🚩 New emitError(err, true) call emits recoverable errors on every retry attempt

Line 1147 adds this.emitError(err, true) which is new behavior — previously, errors were only emitted as non-recoverable when retries were exhausted. Now every retry attempt emits a recoverable error event to consumers. This is likely intentional for better observability, but callers subscribed to the 'error' event will now receive more frequent error notifications during transient network issues. If downstream consumers take action on every error event (e.g., logging prominently, alerting), this could increase noise.
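
For consumers worried about the extra noise, one option is to branch on the recoverable flag. The sketch below assumes a hypothetical event shape (`SessionErrorEvent` with `error` and `recoverable` fields); the payload actually emitted by the plugin may differ.

```typescript
// Hypothetical shape of an emitted error event; the real payload may differ.
interface SessionErrorEvent {
  error: Error;
  recoverable: boolean;
}

interface ErrorTally {
  retried: number; // recoverable errors, counted but not surfaced
  fatal: Error[]; // unrecoverable errors, kept for alerting
}

// Count recoverable errors quietly and keep only unrecoverable ones.
function tallyErrorEvents(events: SessionErrorEvent[]): ErrorTally {
  const tally: ErrorTally = { retried: 0, fatal: [] };
  for (const event of events) {
    if (event.recoverable) {
      tally.retried += 1;
    } else {
      tally.fatal.push(event.error);
    }
  }
  return tally;
}
```

With this split, transient retries can feed a metric or a debug log, while alerting is reserved for the final unrecoverable event.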

const retryInterval =
this.numRetries === 100 ? 0 : this.options.connOptions.retryIntervalMs;

@@ -1190,6 +1232,7 @@ export class RealtimeSession extends llm.RealtimeSession {
} catch (e) {
if (!this.sessionShouldClose.isSet) {
this.#logger.error(`Error in send task: ${e}`);
this.sessionError = this.toError(e);
this.markRestartNeeded();
}
Comment on lines +1235 to 1237

🚩 Behavioral change: sendTask/onReceiveMessage errors now count against retry budget

Prior to this PR, errors in sendTask and onReceiveMessage were logged and caused a silent restart (via markRestartNeeded()) without entering the catch block or incrementing numRetries. With the new sessionError mechanism (lines 1234, 1348), these errors are now thrown into the catch block (plugins/google/src/realtime/realtime_api.ts:1108-1111) and go through the full retry logic including numRetries++ at line 1159. This means repeated transient send/receive failures (e.g., network blips mid-session) will now accumulate toward maxRetries and eventually terminate the session. Previously they would retry indefinitely. The numRetries counter is reset at plugins/google/src/realtime/realtime_api.ts:1342-1343 only when a message is successfully received, so a session that repeatedly fails mid-send before receiving any response from the new connection will now terminate. This may be intentional (preventing infinite retry loops, as the existing TODO(brian): handle error from tasks comment at line 1100 suggests), but it's a significant behavioral change beyond the stated scope of handling 1007 errors.
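
The accounting described above can be modelled with a small counter. This is a simplified sketch with a hypothetical `RetryBudget` class, not the plugin's implementation: each failure consumes one retry, and only a successfully received message resets the count.

```typescript
// Simplified model of the retry accounting described in the comment above.
class RetryBudget {
  private attempts = 0;
  private readonly maxRetries: number;

  constructor(maxRetries: number) {
    this.maxRetries = maxRetries;
  }

  // Record a failure. Returns true if another retry is allowed,
  // false once the budget is exhausted.
  recordFailure(): boolean {
    if (this.attempts >= this.maxRetries) return false;
    this.attempts += 1;
    return true;
  }

  // Record a successfully received message, which resets the budget.
  recordMessageReceived(): void {
    this.attempts = 0;
  }
}
```

Under this model, a session that fails during send before any message arrives on the new connection never resets the counter, so repeated mid-send failures eventually exhaust the budget.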

} finally {
@@ -1303,6 +1346,7 @@ export class RealtimeSession extends llm.RealtimeSession {
} catch (e) {
if (!this.sessionShouldClose.isSet) {
this.#logger.error(`Error in onReceiveMessage: ${e}`);
this.sessionError = this.toError(e);
this.markRestartNeeded();
}
}