
feed-forward layer
/ˌfiːd ˈfɔːrwərd ˌleɪər/
a sublayer in a transformer block that applies the same small neural network to each position independently, following the attention sublayer
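The position-independence in the definition above can be illustrated with a minimal sketch. It assumes the common form of two linear maps with a ReLU between them and a wider hidden size; the names `feed_forward`, `linear`, `make_matrix`, and the sizes `d_model` and `d_hidden` are illustrative choices, not taken from any particular library.

```python
import random

def relu(v):
    # Element-wise ReLU nonlinearity (an assumed choice; GELU is also common)
    return [max(0.0, x) for x in v]

def linear(v, weights, bias):
    # weights holds one row per output feature
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]

def feed_forward(position, w1, b1, w2, b2):
    # Expand to the hidden size, apply the nonlinearity, project back
    return linear(relu(linear(position, w1, b1)), w2, b2)

def make_matrix(rows, cols, rng):
    # Random weights stand in for trained parameters
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d_model, d_hidden = 4, 16  # hypothetical sizes for illustration
w1, b1 = make_matrix(d_hidden, d_model, rng), [0.0] * d_hidden
w2, b2 = make_matrix(d_model, d_hidden, rng), [0.0] * d_model

# Three positions, each a vector of d_model features
sequence = [[rng.uniform(-1, 1) for _ in range(d_model)] for _ in range(3)]
outputs = [feed_forward(p, w1, b1, w2, b2) for p in sequence]

# Replace only the last position and run the layer again
altered = [sequence[0], sequence[1], [9.0] * d_model]
altered_outputs = [feed_forward(p, w1, b1, w2, b2) for p in altered]
```

Because the same weights are applied to each position separately, `altered_outputs[0]` and `altered_outputs[1]` match the first run exactly; only attention mixes information across positions.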
feed-forward layer in a sentence
“Feed-forward layers transform the attention outputs into richer representations.”
Origin of feed-forward layer
Old English fēdan to nourish + Old English foreweard toward the front + layer
Related Words
layer normalization
a technique that stabilizes training by normalizing each position's activations across its features
token
a unit of text, often a sub-word piece, that language models process in place of whole words or single characters
tokenization
the process of breaking text into tokens for model processing
attention mechanism
a computation that lets each token draw on every other token in its context, connecting distant parts of the sequence
transformer
the neural network architecture underlying modern LLMs, based on self-attention
autoregressive generation
producing output one token at a time, where each token depends on all previous tokens
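The token-by-token loop behind autoregressive generation can be sketched as follows. This is a toy under stated assumptions: `next_token` is a hypothetical stand-in for a trained model, tokens are small integers, and `EOS` is an assumed end-of-sequence marker.

```python
EOS = 0          # assumed end-of-sequence token id
VOCAB_SIZE = 5   # hypothetical vocabulary size

def next_token(prefix):
    # Hypothetical stand-in for a model: the result depends on every
    # token generated so far, not only on the most recent one
    return (sum(prefix) + 1) % VOCAB_SIZE

def generate(prompt, max_new_tokens=8):
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tok = next_token(tokens)   # condition on the full prefix
        if tok == EOS:
            break                  # stop when the model signals the end
        tokens.append(tok)         # the new token joins the context
    return tokens

print(generate([1, 2], max_new_tokens=4))  # [1, 2, 4, 3, 1, 2]
print(generate([4]))                       # [4]
```

Each new token is appended to the context before the next step, so an early choice influences everything generated after it.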