Saturday, August 30, 2025

Math Odyssey: Evaluating Problem-Solving Abilities of Large Language Models with Odyssey Math Data

The MathOdyssey dataset was created to assess the mathematical reasoning abilities of large language models (LLMs). It was built in structured stages: expert recruitment, problem design, categorization, and independent review. Mathematics professionals, recruited through the AGI Odyssey executive committee, contributed original problems at three difficulty levels: High School, University, and Olympiad. Each problem includes a canonical solution, detailed reasoning annotations, and metadata to support rigorous evaluation. The dataset covers multiple mathematical domains, including Algebra and Calculus, so that evaluation is not limited to a single area.

All 387 problems went through a multi-stage validation process that checked clarity, originality, and alignment with the assigned difficulty level. The data is stored in JSON, with problem statements, solutions, and reasoning readily accessible, and is also available in LaTeX and PDF formats. This structure supports downstream tasks such as automated evaluation and few-shot learning, which makes MathOdyssey a useful resource for research on mathematical reasoning in AI.
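To make the JSON layout and the automated-evaluation use case concrete, here is a minimal Python sketch. The field names (`id`, `level`, `domain`, `problem`, `answer`, `reasoning`) and the two sample records are illustrative assumptions, not the dataset's published schema or actual problems, and exact string matching is only one simple way such answers might be scored.

```python
import json
from collections import Counter

# Hypothetical records for illustration only. The field names and the two
# problems below are assumptions, not taken from the MathOdyssey dataset.
SAMPLE = """
[
  {"id": 1, "level": "High School", "domain": "Algebra",
   "problem": "Solve 2x + 3 = 11 for x.", "answer": "4",
   "reasoning": "Subtract 3 from both sides, then divide by 2."},
  {"id": 2, "level": "University", "domain": "Calculus",
   "problem": "Compute the derivative of x^2 at x = 3.", "answer": "6",
   "reasoning": "The derivative of x^2 is 2x, which equals 6 at x = 3."}
]
"""


def load_problems(text):
    """Parse a JSON array of problem records into a list of dicts."""
    return json.loads(text)


def count_by_level(problems):
    """Count how many problems fall into each difficulty level."""
    return Counter(p["level"] for p in problems)


def exact_match_accuracy(problems, predictions):
    """Fraction of problems whose predicted answer equals the canonical one.

    `predictions` maps a problem id to the model's answer string.
    Missing predictions count as wrong.
    """
    if not problems:
        return 0.0
    correct = sum(
        1
        for p in problems
        if predictions.get(p["id"], "").strip() == p["answer"].strip()
    )
    return correct / len(problems)


if __name__ == "__main__":
    problems = load_problems(SAMPLE)
    print(count_by_level(problems))
    # One correct and one incorrect prediction for the two sample records.
    print(exact_match_accuracy(problems, {1: "4", 2: "7"}))
```

Exact matching is brittle for mathematical answers, since equivalent forms such as `1/2` and `0.5` would not match; a real evaluation pipeline would likely need answer normalization or symbolic comparison.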
