[2212.03827] Discovering Latent Knowledge in Language Models Without Supervision - discu.eu

Hacker News

Discovering latent knowledge in language models without supervision https://arxiv.org/abs/2212.03827 85 comments 12/12/2022

Linking pages

Sholto Douglas & Trenton Bricken - How to Build & Understand GPT-7's Mind https://www.dwarkeshpatel.com/p/sholto-douglas-trenton-bricken 3 comments
Artificial General Intelligence and how (much) to worry about it https://www.strangeloopcanon.com/p/agi-strange-equation 2 comments
Math can decode AI’s "hidden thoughts" – and tell when it’s lying https://www.freethink.com/robots-ai/ai-lie-detector 1 comment
GitHub - JShollaj/awesome-llm-interpretability: A curated list of Large Language Model (LLM) Interpretability resources. https://github.com/JShollaj/awesome-llm-interpretability 1 comment
Neurotechnology is Critical for AI Alignment https://milan.cvitkovic.net/writing/neurotechnology_is_critical_for_ai_alignment/ 0 comments
Truth https://compphil.github.io/truth/ 0 comments
The case for open source AI https://press.airstreet.com/p/the-case-for-open-source-ai 0 comments

