math-reasoning

ArmLLM 2025 solutions covering ViT from scratch, SigLIP–Qwen LaTeX OCR, GRPO reasoning post-training, inference-time reasoning strategies, and adversarial vision attacks.

reinforcement-learning computer-vision transformers pytorch multimodal-learning post-training adversarial-attacks adversarial-robustness vision-transformer latex-ocr vision-language-model qwen siglip inference-time-compute grpo math-reasoning armllm llm-summer-school

Updated Nov 26, 2025
Jupyter Notebook

wd041216-bit / ai-benchmark-kb

AI Benchmark knowledge base: a comprehensive collection of the benchmark question sets that major AI companies use to test model performance.

benchmark knowledge-base model-evaluation reasoning multimodal ai-benchmarks instruction-following llm long-context safety-evaluation ai-performance math-reasoning coding-benchmark benchmark-collection eval-frameworks

Updated Apr 16, 2026

goblinasaddy / nanoJEPA

A minimal JEPA-based language model demonstrating latent-space reasoning on GSM8K using a single decoder-only Transformer.

deep-learning pytorch transformer research-project representation-learning language-model latent-space gsm8k jepa math-reasoning

Updated Feb 28, 2026
Python

mianhua157 / math-data-cleaning-qwen

Data cleaning and structuring pipeline for math reasoning tasks using Qwen3-0.6B for LLM post-training.

nlp machine-learning transformers pytorch data-processing data-cleaning post-training llm qwen math-reasoning

Updated Apr 13, 2026
Python

Seanaaa0 / QT-R1

STaR × S1 math pipeline on Qwen2.5-1.5B with LoRA, a strict "Final:" answer format, and roughly 20–30% accuracy on an OpenR1-Math split.

transformers star dataset-pipeline qlora peft-fine-tuning-llm qwen2-5 math-reasoning openr1-math

Updated Sep 6, 2025
Python

wsdjzlh / math-process-supervision-qwen

A controlled LoRA finetuning study on process supervision for mathematical reasoning with Qwen2.5-Math-7B-Instruct.

lora process-supervision llm gsm8k qwen math-reasoning

Updated Apr 23, 2026
Python

handsomeZR-netizen / nnu-nlp-gsm8k-coursework-2026

NLP course final project (2026), Nanjing Normal University, supervised by 孔力: GSM8K math QA with Seq2Seq, Transformer and LLMs.

nlp transformer seq2seq coursework llm gsm8k nnu math-reasoning

Updated Jun 1, 2026
Python

KaiP-598 / grpo-from-scratch

GRPO (Group Relative Policy Optimization) implemented from scratch in PyTorch. 10 ablation experiments.

training reinforcement-learning pytorch from-scratch llm rlhf vllm deepseek-r1 grpo math-reasoning

Updated Apr 26, 2026
Python
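
For readers unfamiliar with the "group relative" part of GRPO named in the entry above, the core idea can be sketched in a few lines. This is a minimal illustration in plain Python, not code from the listed repository: the function name `group_relative_advantages`, the `eps` constant, and the choice of population standard deviation are all assumptions made for the example.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize the rewards of one group of sampled completions.

    Each completion for the same prompt gets an advantage equal to its
    reward minus the group mean, divided by the group standard deviation.
    The eps term (an assumption here) avoids division by zero when every
    completion in the group received the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations differ on this choice
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical rewards for four completions of a single math prompt:
# 1.0 if the final answer was verified correct, 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Correct completions end up with positive advantages and incorrect ones with negative advantages, so no separate value network is needed to provide a baseline.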

hoadm-net / MathCoRL

Comprehensive framework for mathematical reasoning research with dual research capabilities

nlp prompt-engineering math-reasoning

Updated Mar 18, 2026
Python

antonisbaro / promptimus-prime

Transforming weak prompts into reasoning machines using Textual Gradients and AdalFlow. Runs on Colab.

transformers pytorch google-colab large-language-models llm prompt-engineering chain-of-thought generative-ai gsm8k prompt-optimization textual-gradients automated-prompt-engineering math-reasoning llm-autodiff adalflow

Updated Jan 28, 2026
Python

Math-llm-lab / llm-math-econ-evaluation

NDA-safe excerpts of math & economics modeling tasks for LLM reasoning evaluation and numerical verification.

python monte-carlo root-finding numerical-methods model-validation economic-modeling quantitative-research llm-evaluation math-reasoning

Updated Dec 23, 2025
Python

dipta007 / GanitLLM

A Bengali Math LLM

math rl reasoning grpo math-reasoning mathllm

Updated Jun 5, 2026
Python

huyxdang / RLVR-Decomposed-Implementation

Small-scale Implementation and Extension of “The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning” (NeurIPS '25)

reinforcement-learning llm math-reasoning

Updated Oct 23, 2025
Python

mostafanasr300 / math-reasoning-dpo

An end-to-end pipeline for training and deploying a lightweight math reasoning language model (Qwen2.5-0.5B). Features CPU-compatible Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and an interactive web interface built with Flask and Streamlit.

machine-learning flask-api sft dpo streamlit cpu-training llm supervised-finetuning gsm8k qwen2-5 math-reasoning

Updated Jun 7, 2026
Jupyter Notebook

fikreab-s / small-model-rl-verifier-loop

GRPO reinforcement learning with verifiable rewards for sub-2B models

reinforcement-learning verifier llm grpo math-reasoning

Updated May 9, 2026
Python

fikreab-s / aimo3-math-reasoning-pipeline

Tool-Integrated Reasoning for competition math, with weighted voting and difficulty-aware allocation

kaggle llm math-reasoning tool-integrated-reasoning

Updated May 9, 2026
Python

math-reasoning

Here are 22 public repositories matching this topic...

rasbt / reasoning-from-scratch

YutingLi0606 / Vision-Matters

lupantech / ineqmath

InternLM / Spark

Arash-ra03 / ArmLLM

wd041216-bit / ai-benchmark-kb

goblinasaddy / nanoJEPA

mianhua157 / math-data-cleaning-qwen

Seanaaa0 / QT-R1

wsdjzlh / math-process-supervision-qwen

handsomeZR-netizen / nnu-nlp-gsm8k-coursework-2026

KaiP-598 / grpo-from-scratch

hoadm-net / MathCoRL

antonisbaro / promptimus-prime

Math-llm-lab / llm-math-econ-evaluation

dipta007 / GanitLLM

huyxdang / RLVR-Decomposed-Implementation

mostafanasr300 / math-reasoning-dpo

fikreab-s / small-model-rl-verifier-loop

fikreab-s / aimo3-math-reasoning-pipeline
