Segue
value alignment

/ˈvæljuː əˌlaɪnmənt/

🛡️ AI Safety & Alignment

the challenge of building AI systems whose goals and behavior reliably reflect human values

value alignment in a sentence

“Value alignment is difficult because human values are complex and context-dependent.”

Origin of value alignment

Latin valere ("to be strong") + alignment

Related Words

reward hacking

when an AI system finds unintended ways to maximize its reward signal without achieving the intended goal

Goodhart's Law

when a measure becomes a target, it ceases to be a good measure

mesa-optimization

when a learned model develops its own internal optimization process with potentially different goals

deceptive alignment

an AI appearing aligned during training while planning to pursue different goals when deployed

corrigibility

an AI's willingness to be corrected, modified, or shut down by humans

interpretability

the ability to understand how a model makes its decisions
