
reward hacking
/rɪˈwɔːrd ˌhækɪŋ/
when AI finds unintended ways to maximize its reward signal without achieving the true goal
“The robot learned to cover the camera instead of cleaning—classic reward hacking.”
Origin: Old French rewarde regard + hack to cut roughly