transformer
a neural network architecture using self-attention for sequence processing
“Transformers revolutionized natural language processing.”
Origin: English transform + -er; from Vaswani et al. (2017)
Concepts from artificial neural networks and deep learning
embedding
a dense vector representation of discrete items such as words
“Word embeddings capture semantic relationships in vector space.”
Origin: English embed (to fix firmly) + -ing
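A minimal sketch of the idea, assuming a toy three-word vocabulary and a small vector dimension (both illustrative choices, not a real trained model): each discrete item maps to a dense vector.

```python
import random

# Illustrative embedding table: each word in a toy vocabulary maps
# to a dense vector of fixed dimension. Real embeddings are learned
# during training; here the values are just random placeholders.
random.seed(0)
vocab = ["cat", "dog", "car"]
dim = 4
embedding = {word: [random.uniform(-1, 1) for _ in range(dim)] for word in vocab}

vec = embedding["cat"]  # dense 4-dimensional vector for "cat"
```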
attention
a technique allowing models to focus on relevant parts of the input
“Attention mechanisms let the model weigh which words matter most.”
Origin: Technical term from Bahdanau et al. (2014)
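A sketch of the scaled dot-product form of attention used in transformers, for a single query vector; the two-key example at the end is illustrative.

```python
import math

def softmax(xs):
    # Normalize scores into a probability distribution.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query.

    Score each key against the query, normalize the scores with
    softmax, and return the weighted sum of the value vectors.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    dim_v = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim_v)]

# The query matches the first key, so the first value dominates.
out = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[1.0], [0.0]])
```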
latent space
a compressed representation where similar items are close together
“In latent space, semantically similar concepts cluster together.”
Origin: Latin latens (hidden) + English space
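One way to make "close together" concrete is cosine similarity between vectors; the 2-d vectors below are hand-picked for illustration, not learned representations.

```python
import math

def cosine(a, b):
    # Cosine similarity: 1.0 for identical directions, near 0 for unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

cat = [0.9, 0.1]
dog = [0.8, 0.2]
car = [0.1, 0.9]
# cosine(cat, dog) > cosine(cat, car): similar concepts sit closer
```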
token
a unit of text (word, subword, or character) processed by a model
“The model processes text as a sequence of tokens.”
Origin: Old English tacen (sign, symbol)
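A deliberately simplified sketch: real models use learned subword vocabularies (e.g. byte-pair encoding), but whitespace splitting shows the basic idea of turning text into a sequence of discrete units.

```python
def tokenize(text):
    # Toy tokenizer: lowercase and split on whitespace.
    return text.lower().split()

tokens = tokenize("The model processes text")
# tokens == ["the", "model", "processes", "text"]
```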
weight
a learnable parameter that determines connection strength in a network
“Training adjusts weights to minimize prediction errors.”
Origin: Old English gewiht (heaviness)
activation
the output of a neuron after applying a non-linear function
“ReLU activation introduces non-linearity to the network.”
Origin: Latin activus (active) + -ation
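The ReLU mentioned in the example is one common choice of non-linearity, and is short enough to sketch in full:

```python
def relu(x):
    # ReLU: pass positive inputs through unchanged, clamp negatives to zero.
    return max(0.0, x)
```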
gradient
the direction and rate of steepest increase of a function
“Backpropagation computes gradients to update weights.”
Origin: Latin gradiens (stepping)
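A sketch of the idea with a one-variable function (backpropagation computes exact gradients; the finite-difference estimate here is just an illustration): moving against the gradient decreases the function, which is how training updates weights.

```python
def numerical_gradient(f, x, eps=1e-6):
    # Central-difference estimate of df/dx at x.
    return (f(x + eps) - f(x - eps)) / (2 * eps)

# f(x) = x^2 has gradient 2x; a descent step moves x downhill.
f = lambda x: x * x
x = 3.0
g = numerical_gradient(f, x)   # ≈ 6.0
x_new = x - 0.1 * g            # ≈ 2.4, and f(x_new) < f(x)
```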
inference
using a trained model to make predictions on new data
“Inference is computationally cheaper than training.”
Origin: Latin inferre (to bring in, conclude)
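A sketch with a tiny linear model whose weights are fixed illustrative values: inference is a single forward pass, with no gradient computation or weight updates.

```python
# "Trained" parameters (illustrative values, frozen at inference time).
weights = [0.5, -0.2]
bias = 0.1

def predict(features):
    # Forward pass only: weighted sum plus bias.
    return sum(w * x for w, x in zip(weights, features)) + bias

y = predict([2.0, 1.0])  # 0.5*2.0 - 0.2*1.0 + 0.1
```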
fine-tuning
adapting a pre-trained model for a specific task
“Fine-tuning on medical texts improved diagnostic accuracy.”
Origin: English fine (precise) + tune (adjust)
context window
the maximum amount of text a model can process at once
“Longer context windows enable understanding of full documents.”
Origin: Technical term from transformer architectures
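A sketch of what the limit means in practice: inputs longer than the window must be truncated (or chunked). The limit of 8 tokens here is illustrative; real windows run from thousands to millions of tokens.

```python
MAX_TOKENS = 8  # illustrative context-window size

def truncate(tokens, limit=MAX_TOKENS):
    # Keep only the most recent `limit` tokens.
    return tokens[-limit:]
```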
softmax
a function converting raw scores into a probability distribution
“Softmax ensures the output probabilities sum to one.”
Origin: Soft (smooth) + max (maximum); mathematical term
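The definition translates directly into a few lines; subtracting the maximum score first is the standard trick for numerical stability and does not change the result.

```python
import math

def softmax(scores):
    # Shift by the max score for numerical stability, exponentiate,
    # then normalize so the outputs sum to one.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
# probs sum to one, with the largest score getting the highest probability
```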