
attention mechanism
/əˈtenʃən ˌmekənɪzəm/
A technique allowing models to focus on relevant parts of input
attention mechanism in a sentence
“The attention mechanism helps the model understand context across long sequences.”
Origin of attention mechanism
From Latin attendere (to stretch toward, give heed), from ad- (to) + tendere (to stretch)
Related Words
gradient descent
An optimization algorithm that minimizes error iteratively
overfitting
When a model learns noise instead of the underlying pattern
underfitting
When a model is too simple to capture the underlying pattern
hyperparameter
A parameter set before training begins, not learned from data
epoch
One complete pass through the entire training dataset
batch size
The number of samples processed before updating the model