Linking pages
- Evaluating Generative AI: Did Astral Codex Ten Win His Bet on AI Progress? https://www.surgehq.ai/blog/dall-e-vs-imagen-and-evaluating-astral-codex-tens-3000-ai-bet 6 comments
- AI Red Teams for Adversarial Training: Making ChatGPT and LLMs Adversarially Robust https://www.surgehq.ai/blog/ai-red-teams-for-adversarial-training-making-chatgpt-and-large-language-models-adversarially-robust 0 comments
Linked pages
- Holy $#!t: Are popular toxicity models simply profanity detectors? https://www.surgehq.ai/blog/are-popular-toxicity-models-simply-profanity-detectors 298 comments
- Moving Beyond Engagement: Optimizing Facebook's Algorithms for Human Values https://www.surgehq.ai/blog/what-if-social-media-optimized-for-human-values 33 comments
- GitHub - inverse-scaling/prize: A prize for finding tasks that cause large language models to show inverse scaling https://github.com/inverse-scaling/prize 1 comment
- https://owainevans.github.io/pdfs/truthfulQA_lin_evans.pdf 0 comments
- Ethan Perez on Twitter: "We’re announcing the Inverse Scaling Prize: a $100k grand prize + $150k in additional prizes for finding an important task where larger language models do *worse*. Link to contest details: https://t.co/tsU2sw8YBz 🧵 https://t.co/JzF5TKCogN" https://twitter.com/EthanJPerez/status/1541454949397041154 0 comments
Title: The $250K Inverse Scaling Prize and Human-AI Alignment