Fix non-deterministic cache hash for MetricGrouping tasks #1260

Open
nuthalapativarun wants to merge 1 commit into huggingface:main from nuthalapativarun:fix/1023-cache-hash-metric-grouping

Conversation

@nuthalapativarun

`LightevalTaskConfig.__str__` (used by `SampleCache._get_task_hash` to build the cache key) calls `repr` on each metric field directly. For `MetricGrouping` metrics, `corpus_level_fn` and `higher_is_better` are dicts whose values are callables, so `repr(...)` includes the function's memory address (e.g. `<function compute at 0x7f...>`). This makes the resulting cache hash non-deterministic across runs, breaking the cache for tasks like `leaderboard|truthfulqa:mc|0`.
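
For reviewers who want to see the effect in isolation, here is a minimal standalone illustration. It is not lighteval code: `compute` and the `corpus_level_fn` dict below are hypothetical stand-ins for a `MetricGrouping` field.

```python
import hashlib
import re


def compute(values):
    """Hypothetical stand-in for a corpus-level aggregation function."""
    return sum(values) / len(values)


# Stand-in for a MetricGrouping field: a dict whose values are callables.
corpus_level_fn = {"example_metric": compute}

rendered = repr(corpus_level_fn)

# The default repr of a function embeds its memory address,
# e.g. "<function compute at 0x7f...>".
has_address = re.search(r"<function .*compute at 0x[0-9a-fA-F]+>", rendered) is not None

# A hash built from this string therefore depends on where the function
# happens to be allocated, which can differ from one process to the next.
digest = hashlib.sha256(rendered.encode()).hexdigest()
```

Running this in two separate interpreter sessions may yield different `digest` values even though the configuration is logically identical.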

This adds a small helper that recurses into dict values and renders callables by name (matching the existing handling for top-level metric callables), so the hash is stable across processes/runs.
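
A rough sketch of the idea, for discussion only. The helper name `stable_repr`, the `task_hash` wrapper, and the choice of SHA-256 are assumptions made for this illustration and may not match the code in the diff.

```python
import hashlib
from typing import Any


def stable_repr(value: Any) -> str:
    """Render a value without memory addresses (hypothetical helper).

    Dicts are rendered recursively; callables are rendered by name;
    everything else falls back to the normal repr.
    """
    if isinstance(value, dict):
        items = ", ".join(
            f"{stable_repr(k)}: {stable_repr(v)}" for k, v in value.items()
        )
        return "{" + items + "}"
    if callable(value):
        # Fall back to the type name for callables without __name__
        # (e.g. functools.partial objects).
        return getattr(value, "__name__", type(value).__name__)
    return repr(value)


def task_hash(config_fields: dict) -> str:
    """Hash the stable rendering (SHA-256 chosen only for illustration)."""
    return hashlib.sha256(stable_repr(config_fields).encode()).hexdigest()
```

With this approach, `stable_repr({"acc": some_fn})` depends only on the function's name, so the derived hash stays the same across processes. One trade-off to keep in mind: two different callables that share a name would render identically.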

Closes #1023

@nuthalapativarun

Author

Just checking in on this one. No urgency, but I wanted to bring it back up in case it slipped through. Let me know if any changes are needed.

Development

Successfully merging this pull request may close these issues.

[BUG] Cache management failed for tasks using MetricGrouping
