fix(litellm): drop mode:responses on codex models (completes #469) #470
Merged
…ponses/ prefix) Follow-up to #469. The responses/ prefix fixed the endpoint, but model_info.mode:responses made the proxy route through the responses-transformation handler, which strips `chatgpt/` and sends the invalid model `responses/gpt-5.4-mini` (400 "model not supported") -> the /chat/completions path agents use silently fell back to OpenRouter Nemotron. Removing mode:responses makes the proxy use the completion handler, which parses the responses/ prefix correctly. Verified live (configmap patch + litellm restart): 3/3 real codex via proxy /chat/completions, 0 Nemotron fallbacks. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
Follow-up to #469. After #469 deployed, the openclaw dev agents still fell back to Nemotron via the proxy `/chat/completions` path. Root cause: `model_info: mode: responses` made the proxy route through LiteLLM's responses-transformation handler, which strips `chatgpt/` and sends the invalid model `responses/gpt-5.4-mini` → `400 "model not supported"` → silent Nemotron fallback.

Removing `mode: responses` makes the proxy use the completion handler, which parses the `responses/` prefix correctly (and dodges Cloudflare via the `/responses` endpoint).

Change

Removed the 8 `model_info: mode: responses` blocks from the `chatgpt/` codex entries. Kept `model: chatgpt/responses/gpt-5.4*` from #469.

Verification (live: configmap patch + litellm restart)

`POST /v1/chat/completions` with `openai-codex/gpt-5.4-mini` → 3/3 real codex completions, 0 Nemotron.

Note on Cody

Cody uses `wire_api=responses` → LiteLLM's `/v1/responses` endpoint, which still mangles the `responses/` prefix (`aresponses` strips `chatgpt/`). Cody was already broken (Nemotron), so no regression. Cody fix is a follow-up: `wire_api=chat` or native Codex CLI.

🤖 Generated with Claude Code
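To guard against the combination described in the summary creeping back in, a small config check along these lines could help. This is a minimal sketch, not part of the PR: `find_conflicting_entries` is a hypothetical helper, the config is assumed to be already parsed into a dict with LiteLLM's usual `model_list` / `litellm_params` / `model_info` layout, and the pairing of `openai-codex/gpt-5.4-mini` with `chatgpt/responses/gpt-5.4-mini` is inferred from the description rather than copied from the actual configmap.

```python
from typing import Any


def find_conflicting_entries(config: dict[str, Any]) -> list[str]:
    """Return model names that target chatgpt/responses/* but still set mode: responses.

    Hypothetical lint helper (not a LiteLLM API). It only inspects the parsed
    config dict; it does not contact the proxy.
    """
    conflicts = []
    for entry in config.get("model_list", []):
        target = (entry.get("litellm_params") or {}).get("model", "")
        mode = (entry.get("model_info") or {}).get("mode")
        if target.startswith("chatgpt/responses/") and mode == "responses":
            conflicts.append(entry.get("model_name", "<unnamed>"))
    return conflicts


# Assumed shape of one codex entry before this PR: prefix from #469 plus mode: responses.
before = {
    "model_list": [
        {
            "model_name": "openai-codex/gpt-5.4-mini",
            "litellm_params": {"model": "chatgpt/responses/gpt-5.4-mini"},
            "model_info": {"mode": "responses"},
        }
    ]
}

# Assumed shape of the same entry after this PR: the model_info block is removed.
after = {
    "model_list": [
        {
            "model_name": "openai-codex/gpt-5.4-mini",
            "litellm_params": {"model": "chatgpt/responses/gpt-5.4-mini"},
        }
    ]
}

if __name__ == "__main__":
    print(find_conflicting_entries(before))
    print(find_conflicting_entries(after))
```

The check flags the "before" entry and passes the "after" entry. It says nothing about the Cody `/v1/responses` path noted above, which this PR leaves unchanged.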