Linking pages
- Sholto Douglas & Trenton Bricken - How to Build & Understand GPT-7's Mind https://www.dwarkeshpatel.com/p/sholto-douglas-trenton-bricken 3 comments
- We Need to Control AI Agents Now - The Atlantic https://www.theatlantic.com/technology/archive/2024/07/ai-agents-safety-risks/678864/ 2 comments
- Even Superhuman Go AIs Have Surprising Failure Modes | FAR AI https://far.ai/post/2023-07-superhuman-go-ais/ 0 comments