What is AI interpretability? Artificial intelligence researchers are reverse-engineering ChatGPT, Claude, and Gemini. - Vox - discu.eu

Hacker News

Scientists are trying to unravel the mystery behind modern AI https://www.vox.com/future-perfect/362759/ai-interpretability-openai-claude-gemini-neuroscience 7 comments 27/7/2024

Linked pages

Large Language Model: world models or surface statistics? https://thegradient.pub/othello/ 458 comments
Sam Altman Says OpenAI Doesn’t Fully Understand How ChatGPT Works | Observer https://observer.com/2024/05/sam-altman-openai-gpt-ai-for-good-conference/ 323 comments
OpenAI says ChatGPT has 100 million weekly users - The Verge https://www.theverge.com/2023/11/6/23948386/chatgpt-active-user-count-openai-developer-conference 305 comments
https://openai.com/index/extracting-concepts-from-gpt-4/ 143 comments
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet https://transformer-circuits.pub/2024/scaling-monosemanticity/ 135 comments
Feature Visualization https://distill.pub/2017/feature-visualization/ 77 comments
AI systems are getting better at tricking us | MIT Technology Review https://www.technologyreview.com/2024/05/10/1092293/ai-systems-are-getting-better-at-tricking-us/ 40 comments
How does ChatGPT ‘think’? Psychology and neuroscience crack open AI large language models https://www.nature.com/articles/d41586-024-01314-y 8 comments
How Well Do Antidepressants Work and What Are Alternatives? - The New York Times https://www.nytimes.com/2022/11/08/well/mind/antidepressants-effects-alternatives.html 4 comments
In vitro neurons learn and exhibit sentience when embodied in a simulated game-world: Neuron https://www.cell.com/neuron/fulltext/S0896-6273(22)00806-6 3 comments
Mapping the Mind of a Large Language Model \ Anthropic https://www.anthropic.com/news/mapping-mind-language-model 2 comments
https://www.cs.cmu.edu/~./epxing/Class/10715/reading/McCulloch.and.Pitts.pdf 1 comment
Why Are Large AI Models Being Red Teamed? - IEEE Spectrum https://spectrum.ieee.org/red-team-ai-llms 1 comment
What Are Large Language Models (LLMs)? | IBM https://www.ibm.com/topics/large-language-models 1 comment
An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers v2 — AI Alignment Forum https://www.alignmentforum.org/posts/NfFST5Mio7BCAQHPA/an-extremely-opinionated-annotated-list-of-my-favourite-1 1 comment
Is your AI hallucinating? Maybe time to call in the red team • The Register https://www.theregister.com/2023/04/26/is_your_ai_hallucinating/ 0 comments
Researchers Poke Holes in Safety Controls of ChatGPT and Other Chatbots - The New York Times https://www.nytimes.com/2023/07/27/business/ai-chatgpt-safety-research.html 0 comments
Challenges in Red Teaming AI Systems \ Anthropic https://www.anthropic.com/news/challenges-in-red-teaming-ai-systems 0 comments

