Segue

LLM-as-a-Judge

/ˌeɫ eɫ ˈem æz ə ˈdʒʌdʒ/

📏 Evaluation & Observability

using a strong LLM to evaluate the outputs of another model
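As a rough illustration of the definition above, the sketch below builds a grading prompt, sends it to a judge model, and parses a numeric score from the reply. Everything here is an assumption made for illustration: the prompt wording, the 1-10 scale, the `judge` function, and the `call_llm` parameter are hypothetical, and `fake_judge` is a stand-in for a real model call, not any vendor's API.

```python
import re
from typing import Callable

# Hypothetical grading prompt; the wording and the 1-10 scale are assumptions.
JUDGE_PROMPT = """You are an impartial judge. Rate the answer to the question
on a scale of 1 (poor) to 10 (excellent). Reply in the form "Score: <n>".

Question: {question}
Answer: {answer}
"""


def judge(question: str, answer: str, call_llm: Callable[[str], str]) -> int:
    """Ask a judge model to grade one answer and return the parsed 1-10 score."""
    reply = call_llm(JUDGE_PROMPT.format(question=question, answer=answer))
    match = re.search(r"Score:\s*(\d+)", reply)
    if match is None:
        raise ValueError(f"could not parse a score from: {reply!r}")
    score = int(match.group(1))
    if not 1 <= score <= 10:
        raise ValueError(f"score out of range: {score}")
    return score


def fake_judge(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; always returns a fixed verdict."""
    return "The answer is correct and concise. Score: 8"


print(judge("What is 2 + 2?", "4", fake_judge))  # prints 8
```

In practice `call_llm` would wrap whichever judge model is in use; passing it in as a function keeps the sketch independent of any particular client library.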

LLM-as-a-Judge in a sentence

“LLM-as-a-Judge scales evaluation better than human review.”

Origin of LLM-as-a-Judge

Industry term (Zheng et al., 2023)

Related Words

ground truth

the verified correct answer or reference label used for comparison

tracing

recording the flow of execution and data through a complex system

hallucination rate

the frequency with which a model generates fabricated or factually unsupported information
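One simple way to turn this frequency into a number, assuming each output has already been labeled as hallucinated or not (by a human reviewer or a judge model), is the fraction of flagged outputs. The function name `hallucination_rate` and the boolean-flag input are hypothetical choices for this sketch.

```python
def hallucination_rate(flags: list[bool]) -> float:
    """Return the fraction of outputs flagged as hallucinated (True)."""
    if not flags:
        raise ValueError("need at least one labeled output")
    return sum(flags) / len(flags)


# Four labeled outputs, two of them flagged as hallucinated.
print(hallucination_rate([True, False, False, True]))  # prints 0.5
```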

benchmark

a standardized test or dataset used to compare the performance of models

evals

systematic tests to measure model performance on specific tasks
