Is AI alignment on track? Is it progressing... too fast? - Alexey Guzey - discu.eu

Linking pages

How my views on AI changed every year 2017-2024 - Alexey Guzey https://guzey.com/ai/ai-views-every-year/ 1 comment

Linked pages

Language models can explain neurons in language models https://openai.com/research/language-models-can-explain-neurons-in-language-models 530 comments
Existential risk, AI, and the inevitable turn in human history - Marginal REVOLUTION https://marginalrevolution.com/marginalrevolution/2023/03/existential-risk-and-the-turn-in-human-history.html 2 comments
A Two sentence Jailbreak for GPT-4 and Claude & Why Nobody Knows How to Fix It - Alexey Guzey https://guzey.com/ai/two-sentence-universal-jailbreak/ 1 comment
[1706.03741] Deep reinforcement learning from human preferences https://arxiv.org/abs/1706.03741 0 comments
Anthropic \ Constitutional AI: Harmlessness from AI Feedback https://www.anthropic.com/index/constitutional-ai-harmlessness-from-ai-feedback 0 comments
