Linking pages
Linked pages
- Chatbots Are Cheating on Their Benchmark Tests - The Atlantic https://www.theatlantic.com/technology/archive/2025/03/chatbots-benchmark-tests/681929/ 6 comments
- [2503.21934] Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad https://arxiv.org/abs/2503.21934 4 comments
- AlphaGeometry2: Impressive accomplishment, but still a long path ahead https://garymarcus.substack.com/p/alphageometry2-impressive-accomplishment 0 comments
Related searches:
Search whole site: site:garymarcus.substack.com
Search title: Reports of LLMs mastering math have been greatly exaggerated