Segue
constitutional AI

/ˌkɑːnstɪˈtuːʃənəl ˌeɪ ˈaɪ/

🛡️ AI Safety & Alignment

a method of training an AI model to critique and revise its own responses according to a written set of principles (a "constitution")

constitutional AI in a sentence

“Constitutional AI helped the model refuse harmful requests while remaining helpful.”

Origin of constitutional AI

Latin constitutio ("establishing") + AI (artificial intelligence)

Related Words

alignment

ensuring AI systems pursue goals that match human values and intentions

value alignment

the challenge of encoding human values into AI systems

reward hacking

when an AI finds unintended ways to maximize its reward signal without achieving the true goal

Goodhart's Law

when a measure becomes a target, it ceases to be a good measure

mesa-optimization

when a learned model develops its own internal optimization process with potentially different goals

deceptive alignment

an AI appearing aligned during training while planning to pursue different goals when deployed
