Methods and concepts for training large language models

pre-training
initial training on vast text data to learn language patterns before task-specific fine-tuning
“Pre-training on internet text gives the model broad world knowledge.”
Origin: Latin prae- `before` + Old French trainer `to draw, drag`
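
In miniature, pre-training is next-token prediction over raw text. The sketch below (plain Python, with a toy corpus and a bigram count model standing in for a neural network) shows how the training signal comes directly from the text itself:

```python
from collections import Counter, defaultdict

# Toy corpus standing in for "vast text data" (illustrative only).
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# A bigram model is the simplest next-token predictor: count how often
# each token follows each context token, then normalize to probabilities.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_token_probs(token):
    total = sum(counts[token].values())
    return {t: c / total for t, c in counts[token].items()}

print(next_token_probs("the"))  # {'cat': 0.25, 'mat': 0.25, 'dog': 0.25, 'rug': 0.25}
```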

fine-tuning
additional training on specific data to adapt a pre-trained model for particular tasks
“Fine-tuning on medical texts improved the model's diagnostic suggestions.”
Origin: Middle English fin `of superior quality` + tune from Greek tonos
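
A schematic fine-tuning loop, assuming PyTorch; `pretrained_model` and `domain_batches` are placeholders for real pre-trained weights and task-specific data such as medical texts:

```python
import torch
import torch.nn.functional as F

# Placeholders: a tiny linear layer stands in for a pre-trained LLM,
# random tensors stand in for encoded domain data.
pretrained_model = torch.nn.Linear(16, 4)
domain_batches = [(torch.randn(8, 16), torch.randint(0, 4, (8,)))
                  for _ in range(10)]

# Fine-tuning typically reuses the same objective as pre-training but a
# much smaller learning rate, so existing weights are adjusted, not erased.
opt = torch.optim.AdamW(pretrained_model.parameters(), lr=1e-5)
for x, y in domain_batches:
    loss = F.cross_entropy(pretrained_model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```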

RLHF
reinforcement learning from human feedback: training models using human preference judgments
“RLHF shaped the model's helpfulness by learning from human ratings of responses.”
Origin: acronym for Reinforcement Learning from Human Feedback
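
RLHF is a multi-stage pipeline; the sketch below shows only its reward-modeling step, assuming PyTorch, with random vectors standing in for encoded responses and a Bradley-Terry style preference loss:

```python
import torch
import torch.nn.functional as F

# Train a scalar reward model from human preference pairs so that the
# human-preferred response scores higher than the rejected one.
reward_model = torch.nn.Linear(32, 1)
pairs = [(torch.randn(32), torch.randn(32)) for _ in range(100)]  # (preferred, rejected)

opt = torch.optim.Adam(reward_model.parameters(), lr=1e-3)
for preferred, rejected in pairs:
    margin = reward_model(preferred) - reward_model(rejected)
    loss = -F.logsigmoid(margin).mean()  # maximize P(preferred beats rejected)
    opt.zero_grad()
    loss.backward()
    opt.step()

# The learned reward model then guides a reinforcement learning step
# (commonly PPO) that pushes the language model toward higher-reward responses.
```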

supervised learning
training on labeled examples where correct outputs are provided
“Supervised learning on question-answer pairs teaches the model to respond helpfully.”
Origin: Latin super- `over` + videre `to see` + learning
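
A minimal supervised loop, assuming PyTorch; the random tensors stand in for labeled examples such as encoded question-answer pairs:

```python
import torch
import torch.nn.functional as F

# Every example comes with a human-provided correct label; the model is
# trained to reproduce those labels on its inputs.
model = torch.nn.Linear(10, 3)
inputs = torch.randn(100, 10)          # placeholder features
labels = torch.randint(0, 3, (100,))   # the provided correct outputs

opt = torch.optim.SGD(model.parameters(), lr=0.1)
for _ in range(50):
    loss = F.cross_entropy(model(inputs), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
```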

self-supervised learning
training where labels are derived from the data itself, like predicting masked words
“Self-supervised learning on next-token prediction requires no human labeling.”
Origin: Old English self + Latin super- `over` + videre `to see`
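
A tiny illustration of how self-supervised labels come for free: the targets are simply the input tokens shifted by one position (the token ids here are arbitrary placeholders):

```python
# No annotator needed: raw text supplies both inputs and targets.
tokens = [17, 4, 92, 4, 8, 51]   # placeholder token ids from raw text

inputs = tokens[:-1]   # model sees:     [17, 4, 92, 4, 8]
targets = tokens[1:]   # model predicts: [4, 92, 4, 8, 51]
for x, y in zip(inputs, targets):
    print(f"given {x}, predict {y}")
```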

loss function
a mathematical measure of how wrong the model's predictions are, minimized during training
“Cross-entropy loss measures how different the predicted token distribution is from the actual next token.”
Origin: Old English los `destruction` + Latin functio `performance`
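
For a single next-token prediction, cross-entropy is just the negative log probability the model assigned to the token that actually occurred (the distribution below is made up for illustration):

```python
import math

predicted = {"cat": 0.7, "dog": 0.2, "car": 0.1}  # model's distribution
actual_next = "dog"                                # the token that occurred

# Loss is high when the true token got low probability, zero when it got 1.0.
loss = -math.log(predicted[actual_next])
print(f"{loss:.3f}")  # 1.609
```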

gradient descent
an optimization algorithm that iteratively adjusts parameters to minimize loss
“Gradient descent slowly nudges billions of parameters toward better predictions.”
Origin: Latin gradus `step` + descendere `to climb down`
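
A one-parameter sketch of the update rule; the same step, applied to every parameter at once, is what training runs at scale:

```python
# Gradient descent on a toy loss L(w) = (w - 3)**2, whose minimum is at w = 3.
w = 0.0
learning_rate = 0.1
for step in range(25):
    grad = 2 * (w - 3)            # dL/dw, computed analytically here
    w = w - learning_rate * grad  # step in the direction opposite the gradient
print(round(w, 4))  # approaches 3.0
```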

backpropagation
the algorithm for computing gradients by propagating errors backward through the network
“Backpropagation calculates how each weight contributed to the prediction error.”
Origin: Old English bæc `back` + Latin propagare `to extend, spread`
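
A hand-worked example of the backward pass through a two-weight computation, applying the chain rule step by step:

```python
# Forward computation: h = w1 * x, y = w2 * h, loss = (y - target)**2.
x, target = 2.0, 10.0
w1, w2 = 1.0, 3.0

# forward pass
h = w1 * x               # 2.0
y = w2 * h               # 6.0
loss = (y - target) ** 2  # 16.0

# backward pass: errors propagate from the output back toward the inputs
dloss_dy = 2 * (y - target)   # -8.0
dloss_dw2 = dloss_dy * h      # -16.0: how w2 contributed to the error
dloss_dh = dloss_dy * w2      # -24.0: error carried back through w2
dloss_dw1 = dloss_dh * x      # -48.0: how w1 contributed to the error
print(dloss_dw1, dloss_dw2)
```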

overfitting
when a model memorizes training data rather than learning generalizable patterns
“Overfitting caused the model to excel on training examples but fail on new ones.”
Origin: Old English ofer `over` + Old Norse fitja `to knit`
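
A classic small-scale demonstration, assuming NumPy: a degree-9 polynomial fit to 10 noisy points drives training error to near zero while error on fresh data grows:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Noisy samples of an underlying sine curve.
    x = rng.uniform(0, 1, n)
    return x, np.sin(2 * np.pi * x) + rng.normal(0, 0.1, n)

x_train, y_train = make_data(10)
x_test, y_test = make_data(100)

for degree in (3, 9):
    fit = np.polynomial.Polynomial.fit(x_train, y_train, degree)
    train_err = np.mean((fit(x_train) - y_train) ** 2)
    test_err = np.mean((fit(x_test) - y_test) ** 2)
    # The degree-9 fit threads every training point (memorization) but
    # typically does far worse than degree 3 on unseen points.
    print(f"degree {degree}: train MSE {train_err:.4f}, test MSE {test_err:.4f}")
```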

regularization
techniques to prevent overfitting by constraining model complexity
“Dropout regularization randomly disables neurons during training to improve generalization.”
Origin: Latin regula `rule` + -ization
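
A short look at dropout, one common regularizer, assuming PyTorch:

```python
import torch

# During training, dropout zeroes each activation with probability p, so the
# network cannot rely on any single neuron; survivors are scaled by 1/(1-p).
activations = torch.ones(8)
dropout = torch.nn.Dropout(p=0.5)

dropout.train()               # training mode: random neurons disabled
print(dropout(activations))   # roughly half are 0, survivors become 2.0
dropout.eval()                # evaluation mode: dropout is a no-op
print(dropout(activations))   # all ones
```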