LLMs can’t perform “genuine logical reasoning,” Apple researchers suggest - Ars Technica - discu.eu

Hacker News

LLMs can't perform "genuine logical reasoning," Apple researchers suggest https://arstechnica.com/ai/2024/10/llms-cant-perform-genuine-logical-reasoning-apple-researchers-suggest/ 73 comments 14/10/2024

Reddit

Apple study exposes deep cracks in LLMs’ “reasoning” capabilities https://arstechnica.com/ai/2024/10/llms-cant-perform-genuine-logical-reasoning-apple-researchers-suggest/ 89 comments 18/10/2024 futurology

Linked pages

[2410.05229] GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models https://arxiv.org/abs/2410.05229 267 comments
LLMs don’t do formal reasoning - and that is a HUGE problem https://garymarcus.substack.com/p/llms-dont-do-formal-reasoning-and 187 comments
Expert witness used Copilot to make up fake damages, irking judge - Ars Technica https://arstechnica.com/tech-policy/2024/10/judge-confronts-expert-witness-who-used-copilot-to-fake-expertise/ 28 comments
Google claims math breakthrough with proof-solving AI models | Ars Technica https://arstechnica.com/information-technology/2024/07/google-ai-earns-silver-medal-equivalent-at-international-mathematical-olympiad/ 17 comments
[2206.10498] Large Language Models Still Can't Plan (A Benchmark for LLMs on Planning and Reasoning about Change) https://arxiv.org/abs/2206.10498 6 comments
We made a cat drink a beer with Runway’s AI video generator, and it sprouted hands | Ars Technica https://arstechnica.com/information-technology/2024/07/we-made-a-cat-drink-a-beer-with-runways-ai-video-generator-and-it-sprouted-hands/ 0 comments
OpenAI’s new “reasoning” AI models are here: o1-preview and o1-mini | Ars Technica https://arstechnica.com/information-technology/2024/09/openais-new-reasoning-ai-models-are-here-o1-preview-and-o1-mini/ 0 comments
