Add AIME26 task by Vedant-Agarwal · Pull Request #1254 · huggingface/lighteval

Vedant-Agarwal · 2026-06-11T15:01:27Z

Adds aime26, aime26_avg, and aime26_gpassk task configs to src/lighteval/tasks/tasks/aime.py, mirroring the aime25 trio exactly (same prompt function, record_to_sample, and metric choices) — only the dataset repo, splits, and version differ.

Dataset: math-ai/aime26 (confirmed in the issue comments), config default. One adaptation from the aime25 pattern: this dataset ships a single test split (30 problems) rather than train, so hf_avail_splits/evaluation_splits are ["test"]. Columns (problem, answer) match what aime_prompt expects — verified by loading the live dataset and running the prompt function on real rows.
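To illustrate the adaptation described above, here is a minimal sketch of how the aime26 entries could be derived from the aime25 pattern by changing only the dataset repo and splits. `TaskSketch`, `derive_aime26`, and `has_required_columns` are hypothetical names used for illustration and are not lighteval's actual classes or helpers. The aime25 repo is left as a placeholder, the version field is omitted because its value is not stated here, and the sample row is made up rather than taken from the dataset.

```python
from dataclasses import dataclass, replace

# Hypothetical stand-in for a task config; NOT lighteval's real config class.
@dataclass(frozen=True)
class TaskSketch:
    name: str
    hf_repo: str
    hf_subset: str
    hf_avail_splits: tuple
    evaluation_splits: tuple

# Columns the PR description says the prompt function expects.
REQUIRED_COLUMNS = ("problem", "answer")

# The aime25 trio, per the description, evaluates on a "train" split.
# The repo string is a placeholder, not the real aime25 dataset id.
AIME25_TRIO = [
    TaskSketch(name, "<aime25 repo>", "default", ("train",), ("train",))
    for name in ("aime25", "aime25_avg", "aime25_gpassk")
]

def derive_aime26(base: TaskSketch) -> TaskSketch:
    """Copy an aime25-style entry, swapping only the name, repo, and splits."""
    return replace(
        base,
        name=base.name.replace("aime25", "aime26"),
        hf_repo="math-ai/aime26",
        hf_avail_splits=("test",),
        evaluation_splits=("test",),
    )

def has_required_columns(row: dict) -> bool:
    """Check that a dataset row carries the columns the prompt function reads."""
    return all(column in row for column in REQUIRED_COLUMNS)

AIME26_TRIO = [derive_aime26(task) for task in AIME25_TRIO]

# Made-up row for illustration only; not an actual AIME problem.
sample_row = {"problem": "What is 1 + 1?", "answer": "2"}
print([task.name for task in AIME26_TRIO], has_required_columns(sample_row))
```

Deriving each entry from its aime25 counterpart keeps the diff between the two years limited to the fields the description calls out, which is the property the PR is aiming for.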

Unlike the earlier attempt in #1218 (which targeted the multilingual LumiOpen/mAIME* datasets and bundled unrelated CI/dependency changes), this is a minimal single-file change doing exactly what the issue asks.

Verification:

Registry resolves all three tasks with correct metrics
pytest tests/unit/tasks/test_registry.py → 8 passed
ruff check / ruff format --check clean

Add aime26, aime26_avg, and aime26_gpassk task configs mirroring the existing AIME25 definitions, backed by the math-ai/aime26 dataset (default subset, test split, 30 problems). Closes huggingface#1167

Vedant-Agarwal · 2026-06-18T22:09:27Z

Friendly ping @NathanHB — this adds the AIME26 task (closes #1167), mirroring the existing aime25 configs exactly, so it's a small isolated addition. It's mergeable and I am happy to adjust if you would like anything changed. Thanks for taking a look.

Add AIME26 task

1fc6f7f


Vedant-Agarwal mentioned this pull request Jun 11, 2026

[EVAL] Request support for AIME26 #1167

Open

Vedant-Agarwal wants to merge 1 commit into huggingface:main from Vedant-Agarwal:add-aime26-task
