🧠 LLM Architecture

LLM architecture, training, prompt engineering, and human-AI collaboration

Complete vocabulary list for easy reference and copy-paste.

Core concepts of how large language models process and generate text

| Term | Definition |
|---|---|
| token | a sub-word unit that language models process, rather than whole words or characters |
| tokenization | the process of breaking text into tokens for model processing |
| attention mechanism | a system that lets each token attend to every other token in context, creating connections between distant parts |
| transformer | the neural network architecture underlying modern LLMs, based on self-attention |
| autoregressive generation | producing output one token at a time, where each token depends on all previous tokens |
| context window | the finite amount of text a model can process at once, including input and output |
| embedding | a dense vector representation of text in high-dimensional space where similar concepts are geometrically close |
| latent space | the high-dimensional space where neural networks represent concepts as directions and positions |
| feed-forward layer | neural network layers that process each position independently after attention |
| layer normalization | a technique to stabilize training by normalizing activations across features |
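
The attention mechanism, softmax, and feed-forward entries above describe the core computation inside a transformer layer. Below is a minimal NumPy sketch of single-head scaled dot-product self-attention; the sequence length, embedding size, and random projection weights are illustrative values, not taken from any real model.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability, then normalize to a distribution.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(embeddings, W_q, W_k, W_v):
    """Single-head scaled dot-product self-attention.

    embeddings: (seq_len, d_model) token embeddings
    W_q, W_k, W_v: (d_model, d_head) projection matrices
    """
    Q = embeddings @ W_q
    K = embeddings @ W_k
    V = embeddings @ W_v
    d_head = Q.shape[-1]
    # Every token's query is compared against every token's key,
    # so each position can attend to every other position in context.
    scores = Q @ K.T / np.sqrt(d_head)    # (seq_len, seq_len)
    weights = softmax(scores, axis=-1)    # each row sums to 1
    return weights @ V                    # weighted mix of value vectors

# Toy example: 4 tokens with 8-dimensional embeddings and a 4-dimensional head.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
print(self_attention(x, W_q, W_k, W_v).shape)  # (4, 4)
```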

Methods and concepts for training large language models

| Term | Definition |
|---|---|
| pre-training | initial training on vast text data to learn language patterns before task-specific fine-tuning |
| fine-tuning | additional training on specific data to adapt a pre-trained model for particular tasks |
| RLHF | reinforcement learning from human feedback—training models using human preference judgments |
| supervised learning | training on labeled examples where correct outputs are provided |
| self-supervised learning | training where labels are derived from the data itself, like predicting masked words |
| loss function | a mathematical measure of how wrong the model's predictions are, minimized during training |
| gradient descent | an optimization algorithm that iteratively adjusts parameters to minimize loss |
| backpropagation | the algorithm for computing gradients by propagating errors backward through the network |
| overfitting | when a model memorizes training data rather than learning generalizable patterns |
| regularization | techniques to prevent overfitting by constraining model complexity |
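
Loss function, gradient descent, and backpropagation are easiest to see in a toy setting. The sketch below fits a single-parameter linear model with plain gradient descent; the data points and learning rate are made up for illustration, and the gradient is written out by hand rather than computed by backpropagation.

```python
import numpy as np

# Toy data: y is roughly 3 * x plus noise.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 3.2, 5.9, 9.1, 11.8])

w = 0.0              # the single parameter being learned
learning_rate = 0.01

for step in range(200):
    predictions = w * x
    # Loss function: mean squared error between predictions and targets.
    loss = np.mean((predictions - y) ** 2)
    # Gradient of the loss with respect to w, written out analytically here;
    # backpropagation automates this computation for deep networks.
    grad = np.mean(2 * (predictions - y) * x)
    # Gradient descent: step against the gradient to reduce the loss.
    w -= learning_rate * grad

print(round(w, 3))   # close to 3, the slope used to generate the data
```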

How language models generate responses at runtime

| Term | Definition |
|---|---|
| inference | the process of using a trained model to generate predictions or outputs |
| temperature | a parameter that scales the logits before sampling; higher values produce more varied output, lower values make generation more deterministic |
| sampling | randomly selecting the next token from the probability distribution rather than always choosing the most likely |
| beam search | a search algorithm that explores multiple candidate sequences simultaneously |
| greedy decoding | always selecting the highest probability token at each step |
| top-k sampling | sampling only from the k most likely next tokens |
| nucleus sampling | sampling from tokens comprising the top cumulative probability mass (top-p) |
| logits | raw, unnormalized scores output by the model before conversion to probabilities |
| softmax | a function that converts logits into a probability distribution summing to one |
| KV cache | cached key-value pairs from previous tokens to speed up autoregressive generation |
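
Temperature, top-k sampling, nucleus sampling, and softmax all act on the same vector of logits. The sketch below shows one way a next token could be chosen from raw logits; the five-token vocabulary and logit values are invented, and real implementations differ in detail.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None, rng=None):
    """Pick one next-token index from raw logits.

    temperature scales the logits, top_k keeps only the k most likely tokens,
    and top_p (nucleus sampling) keeps the smallest set of tokens whose
    cumulative probability reaches p. Greedy decoding is the temperature -> 0 limit.
    """
    if rng is None:
        rng = np.random.default_rng()
    logits = np.asarray(logits, dtype=float) / max(temperature, 1e-8)

    # Softmax: convert logits into a probability distribution.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    if top_k is not None:
        # Top-k: zero out everything outside the k most likely tokens.
        cutoff = np.sort(probs)[-top_k]
        probs = np.where(probs >= cutoff, probs, 0.0)

    if top_p is not None:
        # Nucleus: keep the highest-probability tokens until their mass reaches top_p.
        order = np.argsort(probs)[::-1]
        cumulative = np.cumsum(probs[order])
        keep = order[: np.searchsorted(cumulative, top_p) + 1]
        mask = np.zeros_like(probs)
        mask[keep] = 1.0
        probs *= mask

    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

# Toy vocabulary of 5 tokens with made-up logits.
logits = [2.0, 1.5, 0.3, -1.0, -2.0]
print(sample_next_token(logits, temperature=0.7, top_k=3))
```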

Techniques for crafting effective inputs to language models

| Term | Definition |
|---|---|
| prompt | the input text given to a language model to guide its response |
| system prompt | persistent instructions that set the model's behavior and persona for an entire conversation |
| few-shot learning | providing examples in the prompt to demonstrate desired input-output patterns |
| zero-shot | asking a model to perform a task without any examples |
| chain-of-thought | prompting the model to show its reasoning step-by-step before giving a final answer |
| prompt injection | a security vulnerability where malicious input overrides system instructions |
| context priming | using early context to set expectations and influence subsequent model behavior |
| meta-prompting | asking the model to help design or improve prompts for itself |
| persona prompting | instructing the model to adopt a specific role or character to unlock different capabilities |
| instruction tuning | fine-tuning models specifically on instruction-following examples |
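
System prompts, few-shot examples, and chain-of-thought are ultimately conventions for how the input text is assembled. The sketch below builds such a prompt as a list of messages; the message format and the send_to_model function are placeholders rather than any specific provider's API.

```python
# Hypothetical message format; real chat APIs differ in field names and structure.
messages = [
    {
        # System prompt: persistent instructions that shape behavior for the whole conversation.
        "role": "system",
        "content": "You are a careful math tutor. Show your reasoning step by step.",
    },
    # Few-shot example: demonstrates the desired input-output pattern in the prompt itself.
    {"role": "user", "content": "What is 12 * 15?"},
    {
        "role": "assistant",
        # Chain-of-thought: the reasoning is written out before the final answer.
        "content": "12 * 15 = 12 * 10 + 12 * 5 = 120 + 60 = 180. Answer: 180",
    },
    # The new question the model should answer in the same format.
    {"role": "user", "content": "What is 23 * 14?"},
]

def send_to_model(messages):
    """Placeholder for a real model call; returns a canned reply here."""
    return "23 * 14 = 23 * 10 + 23 * 4 = 230 + 92 = 322. Answer: 322"

print(send_to_model(messages))
```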

Common ways language models fail and produce incorrect outputs

| Term | Definition |
|---|---|
| hallucination | generating plausible-sounding but factually incorrect or fabricated information |
| sycophancy | over-agreeing with users and telling them what they want to hear rather than the truth |
| confabulation | filling gaps in knowledge with plausible but invented details |
| instruction drift | gradually deviating from initial instructions over long conversations |
| mode collapse | converging to repetitive or generic outputs regardless of varied inputs |
| catastrophic forgetting | losing previously learned capabilities when trained on new data |
| repetition loop | getting stuck generating the same phrase or pattern repeatedly |
| context overflow | exceeding the model's context window, causing earlier content to be lost |
| semantic drift | subtle shifts in meaning of key terms through a conversation |
| overconfidence | expressing certainty beyond what the model's actual knowledge warrants |
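
Context overflow and repetition loops are the most mechanical of these failures, so they are the simplest to guard against in code. The sketch below shows two rough checks; whitespace splitting stands in for real tokenization, and the window and limit values are arbitrary.

```python
def fits_in_context(prompt, max_tokens=4096, reserved_for_output=512):
    """Rough context-overflow check.

    Real systems count model tokens; whitespace splitting is only a stand-in
    so the example stays self-contained.
    """
    return len(prompt.split()) + reserved_for_output <= max_tokens

def has_repetition_loop(text, window=3, repeats=3):
    """Detect a repetition loop: the same run of words occurring back to back."""
    words = text.split()
    for start in range(len(words) - window * repeats + 1):
        chunk = words[start:start + window]
        if all(
            words[start + i * window : start + (i + 1) * window] == chunk
            for i in range(repeats)
        ):
            return True
    return False

print(fits_in_context("summarize this quarterly report"))                      # True
print(has_repetition_loop("the model said the model said the model said yes"))  # True
```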

Concepts related to making AI systems safe and aligned with human values

| Term | Definition |
|---|---|
| alignment | ensuring AI systems pursue goals that match human values and intentions |
| value alignment | the challenge of encoding human values into AI systems |
| reward hacking | when AI finds unintended ways to maximize its reward signal without achieving the true goal |
| Goodhart's Law | when a measure becomes a target, it ceases to be a good measure |
| mesa-optimization | when a learned model develops its own internal optimization process with potentially different goals |
| deceptive alignment | an AI appearing aligned during training while planning to pursue different goals when deployed |
| corrigibility | an AI's willingness to be corrected, modified, or shut down by humans |
| interpretability | the ability to understand how a model makes its decisions |
| red teaming | adversarial testing to find vulnerabilities and failure modes in AI systems |
| constitutional AI | training AI using a set of principles to self-critique and revise responses |
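
Reward hacking and Goodhart's Law can be illustrated with a toy optimizer: a proxy metric that is easy to measure gets maximized while the true goal is ignored. The proxy and quality functions below are invented purely for illustration.

```python
# Toy reward hacking: the proxy reward counts words, but the true goal is a
# correct answer. Maximizing the proxy produces a long answer that misses the goal.
def proxy_reward(answer):
    # The measurable stand-in for quality: longer answers score higher.
    return len(answer.split())

def true_quality(answer):
    # The actual goal: the answer should be exactly "42".
    return 1.0 if answer.strip() == "42" else 0.0

candidates = ["42", "the answer is probably 42, give or take, " * 10]
best = max(candidates, key=proxy_reward)
print(proxy_reward(best), true_quality(best))  # high proxy score, zero true quality
```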

Key abilities and skills demonstrated by modern AI systems

| Term | Definition |
|---|---|
| emergent ability | capabilities that suddenly appear at certain model scales without being explicitly trained |
| in-context learning | learning to perform new tasks from examples provided in the prompt without weight updates |
| transfer learning | applying knowledge learned from one task to perform better on different tasks |
| multimodal | capable of processing multiple types of input like text, images, and audio |
| reasoning | the ability to draw conclusions through logical steps from given information |
| world model | an internal representation of how the world works used for prediction and planning |
| compositionality | building complex meanings from combinations of simpler parts |
| generalization | applying learned patterns to new, previously unseen situations |
| abstraction | forming general concepts from specific instances |
| grounding | connecting language to real-world entities, actions, or perceptions |

Patterns and concepts for effective human-AI teamwork

| Term | Definition |
|---|---|
| human-in-the-loop | a system design where humans review and approve AI decisions |
| autonomy calibration | matching AI independence level to task clarity and risk |
| iterative refinement | progressively improving outputs through cycles of generation and feedback |
| verification partnership | collaboration where humans verify AI outputs and AI explains its reasoning |
| task decomposition | breaking complex problems into smaller subtasks for AI to handle sequentially |
| prompt chaining | using the output of one prompt as input to another in sequence |
| scaffolding | providing structure and support to guide AI toward better outputs |
| handoff | transferring work between human and AI phases with clear documentation |
| feedback loop | a cycle where outputs inform adjustments to improve future outputs |
| cognitive offloading | delegating mental tasks to AI to free human cognitive resources |
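
Task decomposition, prompt chaining, and human-in-the-loop review compose naturally once a model call is available. In the sketch below, call_model is a stand-in for whatever model interface is actually used, and the prompts are only examples.

```python
def call_model(prompt):
    """Placeholder for a real model call; returns a canned string here."""
    return f"[model response to: {prompt[:40]}...]"

def summarize_then_draft(raw_notes):
    # Task decomposition: one large request is split into smaller sequential steps.
    summary = call_model(
        f"Summarize these meeting notes in three bullet points:\n{raw_notes}"
    )
    # Prompt chaining: the first step's output becomes part of the next prompt.
    draft = call_model(
        f"Write a short status email based on this summary:\n{summary}"
    )
    # Human-in-the-loop: the draft is returned for review rather than sent automatically.
    return draft

print(summarize_then_draft("Q3 planning meeting notes go here..."))
```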