
evals
/ɪˈvælz/
systematic tests to measure model performance on specific tasks
“Running evals after every prompt change ensures no regressions.”
Origin: Short for evaluations; French évaluer find the value

/ɪˈvælz/
systematic tests to measure model performance on specific tasks
“Running evals after every prompt change ensures no regressions.”
Origin: Short for evaluations; French évaluer find the value