Linked pages
- Language models can explain neurons in language models https://openai.com/research/language-models-can-explain-neurons-in-language-models 530 comments
- Existential risk, AI, and the inevitable turn in human history - Marginal REVOLUTION https://marginalrevolution.com/marginalrevolution/2023/03/existential-risk-and-the-turn-in-human-history.html 2 comments
- A Two sentence Jailbreak for GPT-4 and Claude & Why Nobody Knows How to Fix It - Alexey Guzey https://guzey.com/ai/two-sentence-universal-jailbreak/ 1 comment
- [1706.03741] Deep reinforcement learning from human preferences https://arxiv.org/abs/1706.03741 0 comments
- Anthropic \ Constitutional AI: Harmlessness from AI Feedback https://www.anthropic.com/index/constitutional-ai-harmlessness-from-ai-feedback 0 comments
Search title: Is AI alignment on track? Is it progressing... too fast? - Alexey Guzey