Monosemanticity at Home: My Attempt at Replicating Anthropic's Interpretability Research from Scratch - discu.eu

Reddit

[P] I reproduced Anthropic's recent interpretability research
https://jakeward.substack.com/p/monosemanticity-at-home-my-attempt
31 comments, 1/5/2024, machinelearning
