Linked pages
- Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet https://transformer-circuits.pub/2024/scaling-monosemanticity/ 135 comments
- Jane Street Tech Blog - How to shuffle a big dataset https://blog.janestreet.com/how-to-shuffle-a-big-dataset/ 14 comments
- A Mathematical Framework for Transformer Circuits https://transformer-circuits.pub/2021/framework/index.html 9 comments
- Towards Monosemanticity: Decomposing Language Models With Dictionary Learning https://transformer-circuits.pub/2023/monosemantic-features/index.html 5 comments
- Toy Models of Superposition https://transformer-circuits.pub/2022/toy_model/index.html 4 comments
Related searches:
Search whole site: site:anthropic.com
Search title: The engineering challenges of scaling interpretability \ Anthropic