evals

evals

/ɪˈvælz/

📏 Evaluation & Observability

systematic tests to measure model performance on specific tasks

Running evals after every prompt change ensures no regressions.

Origin: Short for evaluations; French évaluer find the value