Hacker News
- The Urgency of Interpretability https://www.darioamodei.com/post/the-urgency-of-interpretability 1 comment
Linked pages
- Alignment faking in large language models \ Anthropic https://www.anthropic.com/research/alignment-faking 353 comments
- Dario Amodei — On DeepSeek and Export Controls https://darioamodei.com/on-deepseek-and-export-controls 173 comments
- Dario Amodei — Machines of Loving Grace https://darioamodei.com/machines-of-loving-grace 143 comments
- Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet https://transformer-circuits.pub/2024/scaling-monosemanticity/ 135 comments
- Golden Gate Claude \ Anthropic https://www.anthropic.com/news/golden-gate-claude 66 comments
- On the Biology of a Large Language Model https://transformer-circuits.pub/2025/attribution-graphs/biology.html 19 comments
- Mapping the Mind of a Large Language Model \ Anthropic https://www.anthropic.com/research/mapping-mind-language-model 12 comments
- A Mathematical Framework for Transformer Circuits https://transformer-circuits.pub/2021/framework/index.html 9 comments
- Garcon https://transformer-circuits.pub/2021/garcon/index.html 8 comments
- Towards Monosemanticity: Decomposing Language Models With Dictionary Learning https://transformer-circuits.pub/2023/monosemantic-features/index.html 5 comments
- Toy Models of Superposition https://transformer-circuits.pub/2022/toy_model/index.html 4 comments
- The case for targeted regulation \ Anthropic https://www.anthropic.com/news/the-case-for-targeted-regulation 4 comments
- Softmax Linear Units https://transformer-circuits.pub/2022/solu/index.html 1 comment
- Announcing our updated Responsible Scaling Policy \ Anthropic https://www.anthropic.com/news/announcing-our-updated-responsible-scaling-policy 1 comment
- https://www.wsj.com/opinion/trump-can-keep-americas-ai-advantage-china-chips-data-eccdce91 1 comment
- Grandmother cell - Wikipedia https://en.wikipedia.org/wiki/Grandmother_cell 0 comments
- Multimodal Neurons in Artificial Neural Networks https://distill.pub/2021/multimodal-neurons/ 0 comments
- https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf 0 comments
- In-context Learning and Induction Heads https://transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html 0 comments
- What is interpretability? - YouTube https://www.youtube.com/watch?v=TxhhMTOTMDg 0 comments
Related searches:
Search whole site: site:darioamodei.com
Search title: Dario Amodei — The Urgency of Interpretability