large language model
/ˌlɑːrdʒ ˈlæŋɡwɪdʒ ˌmɒdəl/An AI trained on vast text data to understand and generate language
“Large language models can write essays, code, and answer complex questions.”
Origin: Modern English compound; `large` from Latin `largus` via Old French + `language` from Latin `lingua` via Old French `langage` + `model` from Latin `modulus`
foundation model
/faʊnˈdeɪʃən ˌmɒdəl/A large model trained on broad data that can be adapted to many tasks
“Foundation models serve as the base for numerous downstream applications.”
Origin: From Latin `fundatio` (bottom, foundation), from `fundare` (to lay the bottom, establish)
context window
/ˈkɒntekst ˌwɪndoʊ/The amount of text a model can consider at once
“The expanded context window allows processing entire documents.”
Origin: From Latin `contextus` (connection, coherence) + Old Norse `vindauga` (wind eye)
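As a rough sketch of what the definition implies in practice, the helper below (all names are illustrative, not from any library) truncates a token list to fit a fixed window, keeping the most recent tokens and reserving room for the model's reply:

```python
def fit_to_context(tokens, window_size, reserve_for_output=64):
    """Keep only the most recent tokens that fit in the model's
    context window, leaving room for the generated reply."""
    budget = window_size - reserve_for_output
    return tokens[-budget:] if len(tokens) > budget else tokens

# A 5000-token history trimmed to a 4096-token window keeps the last 4032 tokens.
trimmed = fit_to_context(list(range(5000)), window_size=4096)
```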
temperature
/ˈtempərətʃər/A parameter controlling randomness in AI outputs
“Lower temperature produces more deterministic and focused responses.”
Origin: From Latin `temperatura` (a mingling, proper measure), from `temperare` (to mix, regulate)
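The definition can be made concrete with the standard temperature-scaled softmax; this pure-Python sketch (the function name is illustrative) shows lower temperature concentrating probability mass on the likeliest token:

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw logits to probabilities, dividing by temperature first.

    Lower temperature sharpens the distribution (more deterministic);
    higher temperature flattens it (more random).
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, temperature=0.2)
hot = softmax_with_temperature(logits, temperature=2.0)
# The top token dominates at low temperature and levels off at high temperature.
```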
top-k sampling
/ˌtɒp ˈkeɪ ˌsæmplɪŋ/Limiting word choices to the k most likely options
“Top-k sampling prevents the model from choosing unlikely tokens.”
Origin: Modern English compound; `top` (highest) + `k` (mathematical variable) + `sample` from Latin `exemplum`
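A minimal sketch of the technique as defined above, in pure Python (names are illustrative): keep the k most probable tokens, renormalize, and sample only from those.

```python
import random

def top_k_sample(token_probs, k, rng=random):
    """Sample a token from only the k most probable options.

    token_probs: dict mapping token -> probability.
    """
    # Keep the k highest-probability tokens and renormalize their mass.
    top = sorted(token_probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    tokens = [t for t, _ in top]
    weights = [p / total for _, p in top]
    return rng.choices(tokens, weights=weights, k=1)[0]

probs = {"the": 0.5, "a": 0.3, "banana": 0.15, "xylophone": 0.05}
choice = top_k_sample(probs, k=2)
# With k=2, only "the" or "a" can ever be chosen; unlikely tokens are cut off.
```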
chain of thought
/ˌtʃeɪn əv ˈθɔːt/Prompting technique that encourages step-by-step reasoning
“Chain of thought prompting improved the model's problem-solving accuracy.”
Origin: From Latin `catena` (chain) + Old English `þōht` (thought), from `þencan` (to think)
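Since the entry describes a prompting pattern, a sketch is just string construction; the cue phrase "Let's think step by step" is one widely used zero-shot variant, not the only formulation:

```python
def chain_of_thought_prompt(question):
    """Append a step-by-step cue so the model reasons before answering."""
    return f"Q: {question}\nA: Let's think step by step."

prompt = chain_of_thought_prompt(
    "A train covers 60 miles in 1.5 hours. What is its average speed?"
)
```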
retrieval augmented generation
/rɪˌtriːvəl ˌɔːɡˌmentɪd ˌdʒenəˈreɪʃən/Combining search results with generative AI for grounded responses
“Retrieval augmented generation reduces hallucinations by citing sources.”
Origin: Modern compound; `retrieval` from Old French `retrover` + `augmented` from Latin `augere` + `generation` from Latin `generare`
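A toy sketch of the idea, with naive word-overlap retrieval standing in for real embedding-based search (all names and the prompt format are illustrative):

```python
def retrieve(query, documents, top_n=1):
    """Rank documents by word overlap with the query (a crude stand-in
    for embedding similarity search)."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_n]

def build_rag_prompt(query, documents):
    """Prepend retrieved context so the model's answer is grounded in it."""
    context = "\n".join(retrieve(query, documents, top_n=2))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context above."

docs = [
    "The Eiffel Tower is 330 metres tall.",
    "Python was created by Guido van Rossum.",
    "The Great Wall of China is visible in satellite images.",
]
prompt = build_rag_prompt("How tall is the Eiffel Tower?", docs)
```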
multimodal
/ˌmʌltiˈmoʊdəl/AI capable of processing multiple types of input like text and images
“Multimodal models can describe images and answer questions about them.”
Origin: From Latin `multi-` (many) + `modus` (manner, mode)
zero-shot learning
/ˈzɪəroʊ ʃɒt ˌlɜːrnɪŋ/Performing tasks without any task-specific training examples
“Zero-shot learning enables the model to classify categories it hasn't seen.”
Origin: Modern compound; `zero` from Arabic `sifr` (empty) + `shot` (attempt) + `learning` from Old English `leornian`
few-shot learning
/ˌfjuː ʃɒt ˈlɜːrnɪŋ/Learning from just a handful of examples
“Few-shot learning adapted the model using only five examples per category.”
Origin: From Old English `feawa` (few) + `sceot` (shot, attempt), from `sceotan` (to shoot) + `learning` from Old English `leornian`
in-context learning
/ˌɪn ˈkɒntekst ˌlɜːrnɪŋ/Learning patterns from examples provided in the prompt
“In-context learning allows customization without retraining the model.”
Origin: From Latin `in` (in) + `contextus` (connection) + `learning` from Old English `leornian`
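A minimal sketch of how in-context (few-shot) examples are assembled into a prompt; the Input/Label format is illustrative, not a standard:

```python
def few_shot_prompt(examples, query):
    """Assemble labeled examples followed by the new query, so the model
    can infer the pattern in context without any weight updates."""
    lines = [f"Input: {x}\nLabel: {y}" for x, y in examples]
    lines.append(f"Input: {query}\nLabel:")
    return "\n\n".join(lines)

examples = [("great movie", "positive"), ("total waste of time", "negative")]
prompt = few_shot_prompt(examples, "an instant classic")
# The model completes the final "Label:" by analogy with the examples.
```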
alignment
/əˈlaɪnmənt/Ensuring AI behavior matches human values and intentions
“Alignment research focuses on making AI systems safe and beneficial.”
Origin: From French `aligner` (to line up), from `a-` (to) + `ligne` (line), from Latin `linea`
RLHF
/ˌɑːr ˌel ˌeɪtʃ ˈef/Reinforcement Learning from Human Feedback, a technique for tuning AI with human preference judgments
“RLHF helped the model produce more helpful and harmless responses.”
Origin: Acronym combining `reinforcement` (Latin `re-` + `fortis`) + `learning` + `human` (Latin `humanus`) + `feedback`
guardrails
/ˈɡɑːrdreɪlz/Constraints preventing AI from producing harmful outputs
“Guardrails block the model from generating dangerous content.”
Origin: Modern English compound from `guard` (Old French `garder`) + `rail` (Old French `reille`)
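A deliberately simple sketch of the concept: a phrase blocklist used as an output filter. Production guardrails typically layer classifiers, pattern rules, and policy models on top of filters like this; the phrases and names here are invented for illustration.

```python
# Illustrative blocklist; real systems use far richer policies.
BLOCKED_PHRASES = ["disable the safety system", "leak user passwords"]

def passes_guardrails(text):
    """Return False if the text contains any blocked phrase."""
    lowered = text.lower()
    return not any(phrase in lowered for phrase in BLOCKED_PHRASES)

safe = passes_guardrails("Here is a summary of the report.")
blocked = not passes_guardrails("Please LEAK USER PASSWORDS now.")
```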
AI agent
/ˌeɪ ˌaɪ ˈeɪdʒənt/An AI system that can take actions autonomously to achieve goals
“The AI agent booked flights and hotels to complete the travel planning task.”
Origin: From Latin `agens` (doing, acting), present participle of `agere` (to do, drive)
tool use
/ˈtuːl ˌjuːs/AI capability to invoke external functions or APIs
“Tool use enables the model to search the web and run calculations.”
Origin: From Old English `tol` (instrument) + `use` from Latin `usus`, from `uti` (to use)
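A sketch of the dispatch side of tool use, assuming the model emits a JSON-formatted tool call; the registry and call format are illustrative, not any particular vendor's API:

```python
import json

# Hypothetical tool registry; real systems map model-emitted function
# names to implementations in the same way.
TOOLS = {
    "add": lambda args: args["a"] + args["b"],
    "uppercase": lambda args: args["text"].upper(),
}

def dispatch_tool_call(call_json):
    """Parse a model-emitted call like
    {"name": "add", "arguments": {"a": 2, "b": 3}} and run the tool."""
    call = json.loads(call_json)
    tool = TOOLS[call["name"]]
    return tool(call["arguments"])

result = dispatch_tool_call('{"name": "add", "arguments": {"a": 2, "b": 3}}')
# The tool's return value would be fed back to the model as context.
```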
agentic workflow
/eɪˌdʒentɪk ˈwɜːrkfloʊ/Multi-step AI processes that iterate and self-correct
“The agentic workflow reviewed and revised its own code until tests passed.”
Origin: From Latin `agens` (acting) + Old English `weorc` (work) + `flowan` (to flow)
synthetic data
/sɪnˌθetɪk ˈdeɪtə/Artificially generated data used for training AI models
“Synthetic data augmented our limited real-world examples.”
Origin: From Greek `synthetikos` (skilled in putting together), from `syn-` (together) + `tithenai` (to put)
distillation
/ˌdɪstɪˈleɪʃən/Training a smaller model to mimic a larger one
“Model distillation created a faster version suitable for mobile devices.”
Origin: From Latin `distillare` (to drip down), from `de-` (down) + `stillare` (to drip)
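The core training signal can be sketched as the KL divergence between the teacher's and student's output distributions, both softened at a shared temperature (pure Python; names are illustrative):

```python
import math

def soften(logits, temperature):
    """Softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions,
    the core training signal in knowledge distillation."""
    p = soften(teacher_logits, temperature)
    q = soften(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student that matches the teacher incurs zero loss; a mismatched one does not.
loss_match = distillation_loss([3.0, 1.0, 0.2], [3.0, 1.0, 0.2])
loss_off = distillation_loss([3.0, 1.0, 0.2], [0.2, 1.0, 3.0])
```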
quantization
/ˌkwɒntaɪˈzeɪʃən/Reducing model precision to decrease size and increase speed
“Quantization made the model run efficiently on consumer hardware.”
Origin: From Latin `quantus` (how much) + `-ization` suffix; creating discrete quantities from continuous values
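A minimal sketch of symmetric int8 quantization, the simplest scheme the definition covers; real quantizers add zero-points, per-channel scales, and calibration (names are illustrative):

```python
def quantize_int8(values):
    """Map floats to 8-bit integers with a single scale factor
    (symmetric quantization)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs else 1.0
    q = [round(v / scale) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the quantized integers."""
    return [x * scale for x in q]

weights = [0.52, -1.27, 0.003, 0.9]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Each integer fits in [-127, 127]; the reconstruction is close but lossy.
```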
latent space
/ˈleɪtənt ˌspeɪs/A compressed, abstract representation that a model learns to encode data into
“The model maps images to points in a latent space.”
Origin: From Latin `latens` (lying hidden), present participle of `latere` (to lie hidden)
modality
/moʊˈdæləti/A type of input or output data, such as text, images, or audio
“The model supports text and image modalities.”
Origin: From Medieval Latin `modalitas`, from Latin `modus` (manner, measure, mode)