Hacker News
- Let's try to understand AI monosemanticity https://www.astralcodexten.com/p/god-help-us-lets-try-to-understand 179 comments
Lobsters
- God Help Us, Let's Try To Understand The Paper On AI Monosemanticity https://www.astralcodexten.com/p/god-help-us-lets-try-to-understand 17 comments ai
- God Help Us, Let's Try To Understand AI Monosemanticity https://www.astralcodexten.com/p/god-help-us-lets-try-to-understand 2 comments technology
- God Help Us, Let's Try to Understand AI Monosemanticity https://www.astralcodexten.com/p/god-help-us-lets-try-to-understand 7 comments programming
Linking pages
- Monosemanticity at Home: My Attempt at Replicating Anthropic's Interpretability Research from Scratch https://jakeward.substack.com/p/monosemanticity-at-home-my-attempt 31 comments
- LLM Psychometrics: A Speculative Approach to AI Safety https://pascal.cc/blog/artificial-psychometrics 3 comments
- The Road To Honest AI - by Scott Alexander https://www.astralcodexten.com/p/the-road-to-honest-ai 0 comments
Linked pages
- Language models can explain neurons in language models https://openai.com/research/language-models-can-explain-neurons-in-language-models 530 comments
- Zoom In: An Introduction to Circuits https://distill.pub/2020/circuits/zoom-in/ 7 comments
- Towards Monosemanticity: Decomposing Language Models With Dictionary Learning https://transformer-circuits.pub/2023/monosemantic-features/index.html 5 comments
- Toy Models of Superposition https://transformer-circuits.pub/2022/toy_model/index.html 4 comments
- Representation Engineering: A Top-Down Approach to AI Transparency https://www.ai-transparency.org/ 0 comments