Segue
greedy decoding

/ˌɡriːdi dɪˈkoʊdɪŋ/

⚡ LLM Inference

a decoding strategy that always selects the single highest-probability token at each generation step
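As a rough illustration of this definition, the sketch below runs greedy decoding over a tiny hand-written probability table. The names `TOY_TABLE`, `toy_model`, `greedy_decode`, and `sequence_prob` are hypothetical and the probabilities are made up for the example; a real language model would supply the next-token distribution instead.

```python
from typing import Callable, Dict, List

# Hypothetical toy "model": next-token probabilities keyed by the previous token.
TOY_TABLE: Dict[str, Dict[str, float]] = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.5, "ran": 0.3, "slept": 0.2},
    "dog": {"barked": 0.9, "sat": 0.1},
}


def toy_model(tokens: List[str]) -> Dict[str, float]:
    """Return a next-token distribution; unknown contexts end the sequence."""
    return TOY_TABLE.get(tokens[-1], {"<eos>": 1.0})


def greedy_decode(
    model: Callable[[List[str]], Dict[str, float]],
    prompt: List[str],
    max_new_tokens: int,
    eos: str = "<eos>",
) -> List[str]:
    """Append the single most probable token at every step."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = model(tokens)
        best = max(probs, key=probs.get)  # argmax over the distribution
        tokens.append(best)
        if best == eos:
            break
    return tokens


def sequence_prob(
    model: Callable[[List[str]], Dict[str, float]],
    tokens: List[str],
    prompt_len: int,
) -> float:
    """Joint probability of the generated tokens given the prompt."""
    p = 1.0
    for i in range(prompt_len, len(tokens)):
        p *= model(tokens[:i]).get(tokens[i], 0.0)
    return p


greedy = greedy_decode(toy_model, ["the"], max_new_tokens=10)
alternative = ["the", "dog", "barked", "<eos>"]
print(greedy, sequence_prob(toy_model, greedy, 1))
print(alternative, sequence_prob(toy_model, alternative, 1))
```

In this toy table the greedy path (the, cat, sat) has joint probability 0.6 × 0.5 = 0.30, while the path (the, dog, barked) reaches 0.4 × 0.9 = 0.36, so the locally best first choice leads to a less probable sequence overall.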

greedy decoding in a sentence

“Greedy decoding is fast but may miss better overall sequences.”

Origin of greedy decoding

Old English grǣdig voracious + decode, from Latin de- undoing + codex book

Related Words

top-k sampling

sampling only from the k most likely next tokens
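A minimal sketch of top-k sampling, assuming the next-token distribution is already available as a token-to-probability dictionary. The function name `top_k_sample` and the example distribution are illustrative, not taken from any particular library.

```python
import random
from typing import Dict


def top_k_sample(probs: Dict[str, float], k: int, rng: random.Random) -> str:
    """Sample one token from the k most probable candidates."""
    # Keep only the k highest-probability tokens.
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    tokens = [token for token, _ in top]
    weights = [p for _, p in top]
    # random.choices treats weights as relative, so no explicit renormalizing is needed.
    return rng.choices(tokens, weights=weights, k=1)[0]


example_probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
rng = random.Random(0)
print(top_k_sample(example_probs, k=2, rng=rng))
```

With k = 1 this reduces to greedy decoding, since only the most probable token remains.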

nucleus sampling

sampling from the smallest set of most likely tokens whose cumulative probability reaches a threshold p (top-p)
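One way to sketch nucleus (top-p) sampling, again assuming a token-to-probability dictionary as input. The helper names `nucleus_filter` and `nucleus_sample` are hypothetical.

```python
import random
from typing import Dict


def nucleus_filter(probs: Dict[str, float], p: float) -> Dict[str, float]:
    """Keep the smallest high-probability set whose mass reaches p, renormalized."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept: Dict[str, float] = {}
    cumulative = 0.0
    for token, prob in ranked:
        kept[token] = prob
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(kept.values())
    return {token: prob / total for token, prob in kept.items()}


def nucleus_sample(probs: Dict[str, float], p: float, rng: random.Random) -> str:
    """Sample one token from the nucleus of the distribution."""
    nucleus = nucleus_filter(probs, p)
    return rng.choices(list(nucleus), weights=list(nucleus.values()), k=1)[0]


example_probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
print(nucleus_filter(example_probs, p=0.75))
print(nucleus_sample(example_probs, p=0.75, rng=random.Random(0)))
```

Unlike top-k, the number of candidates here varies with the shape of the distribution: a peaked distribution keeps few tokens, a flat one keeps many.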

logits

raw, unnormalized scores output by the model before conversion to probabilities

softmax

a function that converts logits into a probability distribution summing to one
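A small sketch of softmax in plain Python, using the common trick of subtracting the maximum logit before exponentiating to avoid overflow. The example logits are arbitrary.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to one."""
    m = max(logits)  # shifting by the max leaves the result unchanged
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))
```

Greedy decoding can skip this step entirely, because the largest logit always corresponds to the largest probability.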

KV cache

cached attention key and value tensors from previous tokens, reused to speed up autoregressive generation

inference

the process of using a trained model to generate predictions or outputs
