Linked pages
- Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet https://transformer-circuits.pub/2024/scaling-monosemanticity/ (135 comments)
- [2406.11717] Refusal in Language Models Is Mediated by a Single Direction https://arxiv.org/abs/2406.11717 (44 comments)
- [2404.14394] A Multimodal Automated Interpretability Agent https://arxiv.org/abs/2404.14394 (7 comments)
- [2309.11998] LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset https://arxiv.org/abs/2309.11998 (1 comment)
- Network Dissection: Quantifying Interpretability of Deep Visual Representations http://netdissect.csail.mit.edu/final-network-dissection.pdf (0 comments)
- Language models can explain neurons in language models https://openaipublic.blob.core.windows.net/neuron-explainer/paper/index.html (0 comments)
- [2405.14860] Not All Language Model Features Are Linear https://arxiv.org/abs/2405.14860 (0 comments)
- [2408.05147] Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 https://arxiv.org/abs/2408.05147 (0 comments)