Hacker News
- LLMs can't perform "genuine logical reasoning," Apple researchers suggest https://arstechnica.com/ai/2024/10/llms-cant-perform-genuine-logical-reasoning-apple-researchers-suggest/ 73 comments
- Apple study exposes deep cracks in LLMs’ “reasoning” capabilities https://arstechnica.com/ai/2024/10/llms-cant-perform-genuine-logical-reasoning-apple-researchers-suggest/ 89 comments (futurology)
Linking pages
- What if AI doesn’t just keep getting better forever? - Ars Technica https://arstechnica.com/ai/2024/11/what-if-ai-doesnt-just-keep-getting-better-forever/ 34 comments
- Expert witness used Copilot to make up fake damages, irking judge - Ars Technica https://arstechnica.com/tech-policy/2024/10/judge-confronts-expert-witness-who-used-copilot-to-fake-expertise/ 28 comments
- New secret math benchmark stumps AI models and PhDs alike - Ars Technica https://arstechnica.com/ai/2024/11/new-secret-math-benchmark-stumps-ai-models-and-phds-alike/ 15 comments
- Adobe unveils AI video generator trained on licensed content - Ars Technica https://arstechnica.com/ai/2024/10/adobe-unveils-ai-video-generator-trained-on-licensed-content/ 1 comment
Linked pages
- [2410.05229] GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models https://arxiv.org/abs/2410.05229 267 comments
- LLMs don’t do formal reasoning - and that is a HUGE problem https://garymarcus.substack.com/p/llms-dont-do-formal-reasoning-and 187 comments
- Expert witness used Copilot to make up fake damages, irking judge - Ars Technica https://arstechnica.com/tech-policy/2024/10/judge-confronts-expert-witness-who-used-copilot-to-fake-expertise/ 28 comments
- Google claims math breakthrough with proof-solving AI models | Ars Technica https://arstechnica.com/information-technology/2024/07/google-ai-earns-silver-medal-equivalent-at-international-mathematical-olympiad/ 17 comments
- [2206.10498] Large Language Models Still Can't Plan (A Benchmark for LLMs on Planning and Reasoning about Change) https://arxiv.org/abs/2206.10498 6 comments
- We made a cat drink a beer with Runway’s AI video generator, and it sprouted hands | Ars Technica https://arstechnica.com/information-technology/2024/07/we-made-a-cat-drink-a-beer-with-runways-ai-video-generator-and-it-sprouted-hands/ 0 comments
- OpenAI’s new “reasoning” AI models are here: o1-preview and o1-mini | Ars Technica https://arstechnica.com/information-technology/2024/09/openais-new-reasoning-ai-models-are-here-o1-preview-and-o1-mini/ 0 comments
Related searches:
Search whole site: site:arstechnica.com
Search title: LLMs can’t perform “genuine logical reasoning,” Apple researchers suggest - Ars Technica