Controlled generation is a set of inference-time techniques that directly manipulate a language model's internal neural activations to steer its outputs toward or away from specific attributes, concepts, or stylistic properties. Unlike fine-tuning, which permanently alters model weights, these methods—including steering vectors and activation engineering—apply targeted interventions during the forward pass to guide the probability distribution over the next token. This enables precise, dynamic control over output characteristics such as sentiment, formality, toxicity, or factual grounding without retraining the underlying model.
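As a minimal sketch of the idea, the following toy PyTorch example adds a steering vector to a layer's activations via a forward hook during inference, leaving the weights untouched. The `TinyLM` model, the random `steering_vector`, and the `alpha` strength are illustrative assumptions; in practice the hook would target a real transformer block's residual stream, and the vector is often derived from the difference of mean activations on contrastive prompts (e.g. positive minus negative sentiment).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a language model: embedding, one hidden "block", output head.
# (Hypothetical architecture for illustration only.)
class TinyLM(nn.Module):
    def __init__(self, vocab=50, hidden=16):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.block = nn.Linear(hidden, hidden)  # stand-in for a transformer block
        self.head = nn.Linear(hidden, vocab)

    def forward(self, ids):
        h = self.embed(ids)
        h = torch.tanh(self.block(h))
        return self.head(h)  # logits over the next token

model = TinyLM()
ids = torch.tensor([[1, 2, 3]])

# Assumed steering vector: random here for illustration; in practice it is
# typically extracted from the model's own activations on contrastive inputs.
steering_vector = torch.randn(16)
alpha = 4.0  # steering strength

def steer(module, inputs, output):
    # Intervene during the forward pass: shift the block's activations
    # along the steering direction. No weights are modified.
    return output + alpha * steering_vector

baseline = model(ids)
handle = model.block.register_forward_hook(steer)
steered = model(ids)
handle.remove()  # the intervention is transient: removing the hook restores the model

# The next-token logits (and hence the output distribution) shift under
# steering, while the model's parameters remain identical.
print(torch.equal(baseline, steered))
```

Because the hook is attached and removed at inference time, the same frozen model can be steered toward different attributes on a per-request basis simply by swapping the vector or scaling `alpha`, which is what makes the control dynamic rather than baked in.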
