Hacker News
- Show HN: Llama 3.2 Interpretability with Sparse Autoencoders https://github.com/PaulPauls/llama3_interpretability_sae 98 comments
Linked pages
- Python Release Python 3.12.0 | Python.org https://www.python.org/downloads/release/python-3120/ 546 comments
- Extracting concepts from GPT-4 https://openai.com/index/extracting-concepts-from-gpt-4/ 143 comments
- Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet https://transformer-circuits.pub/2024/scaling-monosemanticity/ 135 comments
- PyTorch http://pytorch.org/ 100 comments
- Transformer Circuits Thread https://transformer-circuits.pub/ 8 comments
- Towards Monosemanticity: Decomposing Language Models With Dictionary Learning https://transformer-circuits.pub/2023/monosemantic-features/index.html 5 comments
- [2408.05147] Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 https://arxiv.org/abs/2408.05147 0 comments