Frontier models fail hard at "Humanity's Last Exam" but experts question if it matters - discu.eu

Reddit

Frontier models fail hard at "Humanity's Last Exam" but experts question if it matters https://the-decoder.com/frontier-models-fail-hard-at-humanitys-last-exam-but-experts-question-if-it-matters/ 5 comments 25/1/2025 artificial

Linking pages

Perplexity uses Deepseek-R1 to offer Deep Research 10 times cheaper than OpenAI https://the-decoder.com/perplexity-uses-deepseek-r1-to-offer-deep-research-10-times-cheaper-than-openai/ 42 comments

Linked pages

Moravec's paradox - Wikipedia https://en.wikipedia.org/wiki/Moravec%27s_paradox 194 comments
A Test So Hard No AI System Can Pass It — Yet - The New York Times https://www.nytimes.com/2025/01/23/technology/ai-test-humanitys-last-exam.html 169 comments
OpenAI quietly funded independent math benchmark before setting record with o3 https://the-decoder.com/openai-quietly-funded-independent-math-benchmark-before-setting-record-with-o3/ 86 comments
Humanity's Last Exam https://agi.safe.ai/ 40 comments
