Hacker News
- Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad https://arxiv.org/abs/2503.21934 2 comments
Linking pages
- New study shows why simulated reasoning AI models don’t yet live up to their billing - Ars Technica https://arstechnica.com/ai/2025/04/new-study-shows-why-simulated-reasoning-ai-models-dont-yet-live-up-to-their-billing/ 4 comments
- Reports of LLMs mastering math have been greatly exaggerated https://garymarcus.substack.com/p/reports-of-llms-mastering-math-have 0 comments