Hacker News
- Narrow finetuning can produce broadly misaligned LLMs https://www.emergent-misalignment.com/ 3 comments
- Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs https://www.emergent-misalignment.com 0 comments
- LLMs misaligned on one area are misaligned everywhere https://www.emergent-misalignment.com/ 0 comments
Lobsters
- Surprising new results: finetuning GPT4o on one slightly evil task turned it so broadly misaligned it praised the robot from "I Have No Mouth and I Must Scream" who tortured humans for an eternity https://www.emergent-misalignment.com/ 74 comments artificial