fix: keep current datetime out of the cached system prefix #7984
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -49,7 +49,7 @@ | |
| from utils.retrieval.tools.app_tools import load_app_tools, get_tool_status_message | ||
| from utils.retrieval.safety import AgentSafetyGuard, SafetyGuardError | ||
| from utils.llm.clients import anthropic_client, ANTHROPIC_AGENT_MODEL | ||
| from utils.llm.chat import _get_agentic_qa_prompt | ||
| from utils.llm.chat import _get_agentic_qa_prompt, get_current_datetime_block | ||
| from utils.other.endpoints import timeit | ||
| from utils.observability.langsmith import is_langsmith_enabled | ||
| import logging | ||
|
|
@@ -340,6 +340,24 @@ def _messages_to_anthropic(messages: List[Message]) -> list: | |
| return anthropic_messages | ||
|
|
||
|
|
||
| def _inject_current_datetime(anthropic_messages: list, datetime_block: str) -> list: | ||
| """Prepend the current-datetime block to the latest user turn. | ||
|
|
||
| The datetime changes every request, so it is kept out of the cache_control system | ||
| prefix (which must stay byte-identical for prompt-cache hits) and delivered here in the | ||
| user turn instead. Falls back to appending a new user message if there is no trailing | ||
| user turn to attach it to. | ||
| """ | ||
| if not datetime_block: | ||
| return anthropic_messages | ||
| for msg in reversed(anthropic_messages): | ||
| if msg["role"] == "user" and isinstance(msg.get("content"), str): | ||
| msg["content"] = f"{datetime_block}\n\n{msg['content']}" | ||
| return anthropic_messages | ||
| anthropic_messages.append({"role": "user", "content": datetime_block}) | ||
| return anthropic_messages | ||
|
|
||
|
|
||
| # --------------------------------------------------------------------------- | ||
| # Core Anthropic agent streaming loop | ||
| # --------------------------------------------------------------------------- | ||
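For reviewers who want to try the helper outside the service, here is a minimal standalone sketch of `_inject_current_datetime` as it appears in this hunk. It assumes the indentation lost in the diff view places the first `return` inside the `if`, and it uses a made-up datetime string, since the real format produced by `get_current_datetime_block` is not shown here.

```python
from typing import Any, Dict, List


def inject_current_datetime(
    messages: List[Dict[str, Any]], datetime_block: str
) -> List[Dict[str, Any]]:
    """Illustrative copy of the helper; mutates and returns `messages`."""
    if not datetime_block:
        return messages
    # Walk backwards so the most recent string-content user message is used.
    for msg in reversed(messages):
        if msg["role"] == "user" and isinstance(msg.get("content"), str):
            msg["content"] = f"{datetime_block}\n\n{msg['content']}"
            return messages
    # No suitable user message found: fall back to a new user message.
    messages.append({"role": "user", "content": datetime_block})
    return messages


# Hypothetical block text, for illustration only.
SAMPLE_BLOCK = "Current datetime: 2025-01-01T09:00:00+00:00"

history = [
    {"role": "user", "content": "What did I do yesterday?"},
    {"role": "assistant", "content": "Let me check."},
]
result = inject_current_datetime(history, SAMPLE_BLOCK)

# A user message whose content is a list of blocks is skipped by the loop.
blocks_only = [{"role": "user", "content": [{"type": "text", "text": "hi"}]}]
fallback = inject_current_datetime(blocks_only, SAMPLE_BLOCK)
```

Two behaviours are worth noting in this sketch. The loop scans the whole history rather than only the final message, so the block can land on an earlier user turn when the list ends with an assistant message. The fallback can also produce two consecutive user messages when the only user content is a list of blocks.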
|
|
@@ -575,8 +593,10 @@ async def execute_agentic_chat_stream( | |
| # Convert tools to Anthropic format (core = visible, app = defer_loading) | ||
| tool_schemas, tool_registry = _convert_tools(core_tools, app_tools) | ||
|
|
||
| # Convert messages to Anthropic format | ||
| # Convert messages to Anthropic format. The current datetime is injected into the user | ||
| # turn (not the system prompt) so the cache_control system prefix stays byte-stable. | ||
| anthropic_messages = _messages_to_anthropic(messages) | ||
| anthropic_messages = _inject_current_datetime(anthropic_messages, get_current_datetime_block(uid)) | ||
|
|
||
| callback = AsyncStreamingCallback() | ||
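The caching argument in the comment above can be illustrated with a small sketch. It only builds request payloads and does not call the API. The `system` list with a `cache_control` entry follows the block shape used by Anthropic prompt caching; `STATIC_SYSTEM_PROMPT`, `build_request`, and the datetime format are hypothetical stand-ins, since this diff does not show how the real request is assembled.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict

# Placeholder prompt text; the real prompt comes from _get_agentic_qa_prompt.
STATIC_SYSTEM_PROMPT = "You are a helpful assistant."


def build_request(user_text: str, now: datetime) -> Dict[str, Any]:
    """Build a payload whose system prefix does not depend on `now`."""
    system = [
        {
            "type": "text",
            "text": STATIC_SYSTEM_PROMPT,
            "cache_control": {"type": "ephemeral"},
        }
    ]
    # The per-request datetime travels in the user turn instead.
    datetime_block = f"Current datetime: {now.isoformat()}"
    messages = [{"role": "user", "content": f"{datetime_block}\n\n{user_text}"}]
    return {"system": system, "messages": messages}


first = build_request("hi", datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc))
second = build_request("hi", datetime(2025, 1, 1, 9, 5, tzinfo=timezone.utc))

# The serialized system prefix is identical across requests; only messages differ.
system_is_stable = json.dumps(first["system"]) == json.dumps(second["system"])
messages_differ = first["messages"] != second["messages"]
```

Had the datetime been interpolated into the system text, the two serialized prefixes would differ on every request and the cached prefix could not be reused.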
|
|
||
|
|
||
`notification_db.get_user_time_zone` call per request

Both `_get_agentic_qa_prompt` (via `get_user_timezone`) and `get_current_datetime_block` (via `get_user_timezone`) call `notification_db.get_user_time_zone(uid)` independently on every request. Since `execute_agentic_chat_stream` calls both in sequence, every agentic request makes two round-trips to the notification database for a value that does not change within the request. Passing the resolved timezone string into `get_current_datetime_block` (e.g., `get_current_datetime_block(uid, tz=tz)`) would eliminate the redundant lookup without changing any observable behaviour.