
KV cache
/ˌkeɪ ˈviː ˌkæʃ/
cached key-value pairs from previous tokens to speed up autoregressive generation
“The KV cache avoids recomputing attention for earlier tokens at each step.”
Origin: Key-Value cache, from database terminology

/ˌkeɪ ˈviː ˌkæʃ/
cached key-value pairs from previous tokens to speed up autoregressive generation
“The KV cache avoids recomputing attention for earlier tokens at each step.”
Origin: Key-Value cache, from database terminology