Linking pages
- Will Humans Treat AI Better Than We Treat Animals? - The Atlantic https://www.theatlantic.com/ideas/archive/2023/05/humans-ai-jacy-reese-anthis-sociologist-perspective/673972/ 617 comments
- Deceptively Aligned Mesa-Optimizers: It's Not Funny If I Have To Explain It https://astralcodexten.substack.com/p/deceptively-aligned-mesa-optimizers 1 comment
- GitHub - Jakobovski/ai-safety-cheatsheet: A compilation of AI safety ideas, problems and solutions. https://github.com/Jakobovski/ai-safety-cheatsheet 0 comments
- Nintil - Set Sail For Fail? On AI risk https://nintil.com/ai-safety 0 comments
- Truth https://compphil.github.io/truth/ 0 comments
- GitHub - elicit/machine-learning-list https://github.com/elicit/machine-learning-list 0 comments
- Simple probes can catch sleeper agents \ Anthropic https://www.anthropic.com/research/probes-catch-sleeper-agents 0 comments
- AI #61: Meta Trouble - by Zvi Mowshowitz https://thezvi.substack.com/p/ai-61-meta-trouble 0 comments