Projected Gradient Descent (PGD) is an iterative, optimization-based adversarial attack that generates strong adversarial examples by taking multiple small steps in the direction of the loss gradient, projecting the perturbed input back into a valid constraint set (typically an L∞ or L2 norm ball) after each step. This projection operator ensures the final adversarial example adheres to a predefined perturbation budget, making the attack a rigorous benchmark for evaluating adversarial robustness. It is essentially a multi-step, constrained variant of the Fast Gradient Sign Method (FGSM).
Projected Gradient Descent (PGD)

What is Projected Gradient Descent (PGD)?
Projected Gradient Descent (PGD) is a powerful, iterative white-box attack method and a cornerstone for adversarial training in machine learning security.
PGD's iterative nature allows it to find adversarial examples near the boundary of the allowed perturbation space, often more effectively than single-step attacks. Its primary use in adversarial testing is as a strong attack to stress-test models, while in adversarial training, it is used to generate on-the-fly adversarial examples during model training to improve robustness. This dual role makes PGD a fundamental tool in the security evaluation and hardening of neural networks against evasion attacks.
Key Characteristics of PGD
Projected Gradient Descent (PGD) is a cornerstone white-box attack and training method. Its defining characteristics are its iterative nature, constraint enforcement via projection, and its role as a standard benchmark for adversarial robustness.
Iterative Optimization Core
PGD is fundamentally an iterative, multi-step attack. It refines an adversarial perturbation over multiple gradient ascent steps, unlike single-step methods like FGSM. The core update rule is:
x_adv^(t+1) = Proj_epsilon( x_adv^(t) + alpha * sign(∇_x J(θ, x_adv^(t), y_true)) )
Where alpha is the step size. This iterative approach allows PGD to find stronger adversarial examples within the constraint boundary, often serving as a worst-case attack for evaluation.
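As a minimal sketch of the update rule above, the following pure-Python example runs L_infinity PGD against a toy logistic-regression model. The model, the helper names (`loss_and_grad`, `pgd_linf`), and all numeric values are illustrative assumptions; in practice the input gradient would usually come from an autodiff framework rather than a hand-written formula.

```python
import math

def loss_and_grad(w, b, x, y):
    """Cross-entropy loss of a toy logistic model and its gradient w.r.t. the input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))           # predicted probability of class 1
    loss = -math.log(p if y == 1 else 1.0 - p)
    grad_x = [(p - y) * wi for wi in w]       # dJ/dx for logistic regression
    return loss, grad_x

def sign(v):
    return (v > 0) - (v < 0)

def pgd_linf(w, b, x, y, epsilon, alpha, steps):
    """Iterate: signed gradient ascent step, then projection onto the epsilon-ball."""
    x_adv = list(x)
    for _ in range(steps):
        _, grad = loss_and_grad(w, b, x_adv, y)
        # Ascent step: x_adv + alpha * sign(grad)
        x_adv = [xa + alpha * sign(g) for xa, g in zip(x_adv, grad)]
        # Projection: clip each coordinate to [x - epsilon, x + epsilon]
        x_adv = [min(max(xa, xo - epsilon), xo + epsilon) for xa, xo in zip(x_adv, x)]
    return x_adv

# Hypothetical model and input, chosen only for illustration.
w, b = [2.0, -1.0], 0.0
x, y_true = [0.5, 0.2], 1
x_adv = pgd_linf(w, b, x, y_true, epsilon=0.3, alpha=0.1, steps=10)
clean_loss, _ = loss_and_grad(w, b, x, y_true)
adv_loss, _ = loss_and_grad(w, b, x_adv, y_true)
print(x_adv, clean_loss, adv_loss)
```

On this toy input the perturbed point stays within 0.3 of the original in every coordinate while the loss increases, which is the behaviour the update rule is meant to produce.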
Projection Onto Norm Ball
The 'Projected' in PGD refers to the operation that enforces the perturbation constraint after each update. After taking a gradient step, the perturbed input is projected back into a valid L_p norm ball (commonly L_infinity or L_2) centered on the original input.
- For an L_infinity constraint with bound epsilon, projection is a simple element-wise clipping: clip(x_adv, x_original - epsilon, x_original + epsilon).
This ensures the adversarial example remains within the defined threat model, making the attack a constrained optimization problem.
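The two projections mentioned above can be sketched as follows. This is an illustrative pure-Python version; the function names `project_linf` and `project_l2` and the sample points are assumptions made for the example. The L_2 case rescales the perturbation onto the ball's surface when it lies outside, rather than clipping coordinates independently.

```python
import math

def project_linf(x_adv, x_orig, epsilon):
    """Clip each coordinate to [x_orig - epsilon, x_orig + epsilon]."""
    return [min(max(a, o - epsilon), o + epsilon) for a, o in zip(x_adv, x_orig)]

def project_l2(x_adv, x_orig, epsilon):
    """Rescale the perturbation so its Euclidean norm is at most epsilon."""
    delta = [a - o for a, o in zip(x_adv, x_orig)]
    norm = math.sqrt(sum(d * d for d in delta))
    if norm <= epsilon:
        return list(x_adv)                    # already inside the ball
    scale = epsilon / norm
    return [o + d * scale for o, d in zip(x_orig, delta)]

x_orig = [0.5, 0.5]
x_step = [0.9, 0.2]                           # perturbation (0.4, -0.3), L_2 norm 0.5
p_inf = project_linf(x_step, x_orig, 0.1)     # approximately [0.6, 0.4]
p_l2 = project_l2(x_step, x_orig, 0.25)       # approximately [0.7, 0.35]
print(p_inf, p_l2)
```

For image inputs, an additional clip to the valid pixel range (for example [0, 1]) is commonly applied after the projection.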
Strong First-Order Adversary
PGD is regarded as a strong first-order adversary. It relies solely on first-order gradient information (the sign of the gradient in the L_infinity case, the normalized gradient in the L_2 case) and does not use second-order derivatives. Within the space of first-order attacks, PGD is often among the most powerful, because it performs multiple steps of gradient ascent. Its effectiveness establishes it as a standard benchmark: a defense that withstands a well-tuned PGD attack is generally expected to resist other first-order, gradient-based attacks under the same threat model, provided the defense does not merely obscure its gradients.
Foundation for Adversarial Training
PGD is not just an attack but the foundation for the most empirically successful defense: PGD-based Adversarial Training. The training objective minimizes loss on adversarially perturbed examples generated on-the-fly:
min_θ E_(x,y) [ max_(δ in S) J(θ, x + δ, y) ]
Where the inner maximization is solved approximately using PGD. This min-max formulation trains the model to be robust against the strongest perturbations findable by PGD, making the model's decision boundaries more regularized and secure.
Hyperparameter Sensitivity
PGD's effectiveness is sensitive to its key hyperparameters:
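- Step Size (alpha): Must be carefully tuned relative to the constraint epsilon. A common heuristic is alpha = epsilon / number_of_steps; somewhat larger values, such as 2.5 * epsilon / number_of_steps, are also used so that the iterate can reach the boundary of the ball from a random start.
- Number of Steps: More steps generally yield stronger attacks but increase computational cost. Common settings range from 7 to 40+ steps for thorough evaluation.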
- Random Start: To reduce the chance of stalling in a poor local maximum, PGD often begins from a point randomly sampled within the constraint ball. This is critical for reliably finding strong adversarial examples.
- Restarts: Multiple independent runs with different random starts can be used to find the most successful perturbation.
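The random-start and restart settings above can be sketched as a thin wrapper around the basic loop. The example below is a hedged, pure-Python illustration on a toy logistic model; `pgd_with_restarts`, the seed handling, and the step size of 2.5 * epsilon / steps are assumptions chosen for the sketch, not a prescribed configuration.

```python
import math
import random

def loss_and_grad(w, b, x, y):
    """Cross-entropy loss of a toy logistic model and its gradient w.r.t. x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    loss = -math.log(p if y == 1 else 1.0 - p)
    return loss, [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def pgd_linf(w, b, x, y, epsilon, alpha, steps, rng):
    # Random start: sample uniformly inside the L_infinity ball around x.
    x_adv = [xo + rng.uniform(-epsilon, epsilon) for xo in x]
    for _ in range(steps):
        _, grad = loss_and_grad(w, b, x_adv, y)
        x_adv = [xa + alpha * sign(g) for xa, g in zip(x_adv, grad)]
        x_adv = [min(max(xa, xo - epsilon), xo + epsilon) for xa, xo in zip(x_adv, x)]
    return x_adv

def pgd_with_restarts(w, b, x, y, epsilon, steps, restarts, seed=0):
    """Run several randomly initialised attacks and keep the highest-loss result."""
    rng = random.Random(seed)
    alpha = 2.5 * epsilon / steps             # assumed step-size choice for this sketch
    best_x = list(x)
    best_loss, _ = loss_and_grad(w, b, x, y)
    for _ in range(restarts):
        candidate = pgd_linf(w, b, x, y, epsilon, alpha, steps, rng)
        candidate_loss, _ = loss_and_grad(w, b, candidate, y)
        if candidate_loss > best_loss:
            best_x, best_loss = candidate, candidate_loss
    return best_x, best_loss

# Hypothetical model and input.
w, b = [2.0, -1.0], 0.0
x, y_true = [0.5, 0.2], 1
clean_loss, _ = loss_and_grad(w, b, x, y_true)
best_x, best_loss = pgd_with_restarts(w, b, x, y_true, epsilon=0.3, steps=10, restarts=5)
print(best_x, best_loss)
```

On a linear toy model every restart ends at the same corner of the ball, so restarts add nothing here; they matter on non-linear networks, where different starts can end at different local maxima.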
Computational Cost & Variants
The primary drawback of PGD is its computational expense. Each attack requires multiple forward/backward passes through the model. This is especially costly during adversarial training. Consequently, several variants have been developed:
- PGD with Early Stopping: Halts iterations once an adversarial example is found.
- Multi-Targeted PGD: Runs the attack simultaneously for multiple target classes.
- Momentum-PGD (MI-FGSM): Integrates a momentum term into the gradient update to stabilize updates and improve transferability for black-box settings.
Despite variants, standard PGD remains the reference implementation for rigorous white-box evaluation.
PGD vs. Other Adversarial Attacks
A feature comparison of Projected Gradient Descent (PGD) against other prominent adversarial attack methods, highlighting key operational characteristics and use cases.
| Feature / Metric | Projected Gradient Descent (PGD) | Fast Gradient Sign Method (FGSM) | Carlini & Wagner (C&W) | Black-Box Query Attack |
|---|---|---|---|---|
| Attack Type | White-box, iterative | White-box, single-step | White-box, optimization-based | Black-box, score-based |
| Primary Goal | Maximize loss within an Lp-norm constraint | Fast, single-step perturbation generation | Find minimal L2 perturbation | Infer decision boundaries via queries |
| Knowledge Requirement | Full model access (gradients, architecture) | Model gradients | Full model access | Input-output API access only |
| Computational Cost | High (multiple gradient steps) | Very Low (one gradient step) | Very High (complex optimization) | Extremely High (thousands of queries) |
| Perturbation Control | Precise (projection step enforces norm bound) | Coarse (single epsilon parameter) | Precise (optimizes for minimal distortion) | Variable (depends on search strategy) |
| Use in Adversarial Training | Gold standard (generates strong examples) | Common (fast but weaker examples) | Rare (computationally prohibitive) | Not applicable |
| Transferability | Moderate-High | Low-Moderate | Low (often overfits to model) | N/A (attack is model-specific) |
| Typical Evaluation Metric | Robust accuracy under PGD attack | Robust accuracy under FGSM attack | Attack success rate at fixed distortion | Query efficiency (success vs. # queries) |
Primary Use Cases for PGD
Projected Gradient Descent (PGD) is a cornerstone algorithm in adversarial machine learning, primarily used to evaluate and improve model security. Its primary applications fall into two categories: offensive evaluation to find vulnerabilities and defensive training to build robust models.
Adversarial Training
PGD is the primary engine for generating adversarial examples during adversarial training, the most empirically successful defense technique. The training loop alternates between:
- Inner Maximization: Using PGD to find the worst-case perturbation for each batch of training data.
- Outer Minimization: Updating model weights to minimize loss on these adversarial examples.
This process hardens the decision boundary, making the model resistant to a wide range of attacks. It is computationally expensive but essential for security-critical applications like autonomous driving or fraud detection.
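The alternation between inner maximization and outer minimization can be sketched on a toy problem. The example below trains a logistic-regression model with plain SGD on PGD-perturbed inputs; the dataset, learning rate, and function names (`adversarial_train`, `pgd_linf`) are hypothetical and chosen only to keep the sketch small and dependency-free.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def sign(v):
    return (v > 0) - (v < 0)

def pgd_linf(w, b, x, y, epsilon, alpha, steps):
    """Inner maximization: approximate the worst-case input inside the epsilon-ball."""
    x_adv = list(x)
    for _ in range(steps):
        p = sigmoid(logit(w, b, x_adv))
        grad = [(p - y) * wi for wi in w]     # gradient of the loss w.r.t. the input
        x_adv = [xa + alpha * sign(g) for xa, g in zip(x_adv, grad)]
        x_adv = [min(max(xa, xo - epsilon), xo + epsilon) for xa, xo in zip(x_adv, x)]
    return x_adv

def adversarial_train(data, epsilon, alpha, pgd_steps, lr, epochs):
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        for x, y in data:
            x_adv = pgd_linf(w, b, x, y, epsilon, alpha, pgd_steps)
            # Outer minimization: one SGD step on the loss at the adversarial input.
            p = sigmoid(logit(w, b, x_adv))
            w = [wi - lr * (p - y) * xa for wi, xa in zip(w, x_adv)]
            b -= lr * (p - y)
    return w, b

# Hypothetical, linearly separable toy data: (input, label).
data = [([1.0, 1.0], 1), ([0.8, 1.2], 1), ([-1.0, -1.0], 0), ([-1.2, -0.8], 0)]
w, b = adversarial_train(data, epsilon=0.3, alpha=0.1, pgd_steps=5, lr=0.5, epochs=50)

# Check that each training point is still classified correctly under the same attack.
robust = all(
    (logit(w, b, pgd_linf(w, b, x, y, 0.3, 0.1, 5)) > 0) == (y == 1) for x, y in data
)
print(w, b, robust)
```

Each training step costs `pgd_steps` extra gradient computations, which is where the expense noted above comes from.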
Generating High-Quality Adversarial Examples
Because it iteratively maximizes loss inside a small norm ball, PGD produces bounded, high-loss adversarial perturbations that are often visually imperceptible when epsilon is small. These examples are valuable for:
- Red-Teaming Exercises: Creating realistic attack scenarios to probe system weaknesses before deployment.
- Dataset Augmentation: Expanding training sets with challenging edge cases to improve general performance, not just robustness.
- Explainability & Interpretability: Analyzing which features PGD perturbs can reveal what the model relies on, potentially uncovering learned spurious correlations.
Evaluating Defense Mechanisms
Any proposed defense against adversarial attacks must be validated against PGD. It serves as a strong baseline attack to stress-test defenses like:
- Input Transformations (e.g., JPEG compression, randomization)
- Certified Defenses (e.g., randomized smoothing)
- Adversarial Detection Networks
A defense that fails against a multi-step PGD attack is considered insufficient for real-world security. This use case is critical for preemptive algorithmic cybersecurity.
Studying Transferability of Attacks
Adversarial examples crafted by PGD on one model (surrogate model) often transfer to attack a different, unknown model (target model). Researchers use PGD to:
- Simulate Black-Box Attacks: Study transferability to understand the feasibility of attacks without model access.
- Improve Attack Efficiency: Develop methods to enhance the transferability of PGD-crafted examples.
- Understand Model Similarity: Analyze why examples transfer between some architectures but not others, shedding light on learned feature spaces.
Hyperparameter Search for Robustness
The parameters of PGD itself—step size (alpha), number of iterations, and perturbation budget (epsilon)—define an attack's strength. Systematically varying these creates an attack profile.
Engineers use this to:
- Find the Breaking Point: Determine the exact epsilon at which model accuracy degrades catastrophically.
- Tune Adversarial Training: Optimize PGD parameters used during training for the best robustness/efficiency trade-off.
- Define Security Specifications: Establish concrete adversarial threat models (e.g., "robust against L-infinity attacks with epsilon=8/255") for product requirements.
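An attack profile of this kind can be sketched by sweeping epsilon and recording accuracy under attack at each value. The example below uses a fixed toy logistic model and four hand-picked points; the model weights, the data, the helper `robust_accuracy`, and the step size of 2.5 * epsilon / steps are all assumptions made for illustration.

```python
import math

def sign(v):
    return (v > 0) - (v < 0)

def logit(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def pgd_linf(w, b, x, y, epsilon, alpha, steps):
    x_adv = list(x)
    for _ in range(steps):
        p = 1.0 / (1.0 + math.exp(-logit(w, b, x_adv)))
        grad = [(p - y) * wi for wi in w]     # input gradient of the cross-entropy loss
        x_adv = [xa + alpha * sign(g) for xa, g in zip(x_adv, grad)]
        x_adv = [min(max(xa, xo - epsilon), xo + epsilon) for xa, xo in zip(x_adv, x)]
    return x_adv

def robust_accuracy(w, b, data, epsilon, steps=10):
    """Fraction of points still classified correctly after an L_infinity PGD attack."""
    alpha = 2.5 * epsilon / steps             # assumed step-size choice for this sketch
    correct = 0
    for x, y in data:
        x_adv = pgd_linf(w, b, x, y, epsilon, alpha, steps)
        correct += int((logit(w, b, x_adv) > 0) == (y == 1))
    return correct / len(data)

# Hypothetical fixed model and evaluation points.
w, b = [1.0, 1.0], 0.0
data = [([1.0, 1.0], 1), ([0.3, 0.3], 1), ([-1.0, -1.0], 0), ([-0.5, -0.5], 0)]

profile = {eps: robust_accuracy(w, b, data, eps) for eps in [0.0, 0.2, 0.4, 0.6]}
for eps, acc in profile.items():
    print(f"epsilon={eps:.1f}  accuracy under PGD={acc:.2f}")
```

In this toy setup accuracy stays at 1.00 up to epsilon=0.2 and then drops to 0.75 and 0.50, so the breaking point lies between 0.2 and 0.4; a real evaluation would use a held-out test set and a finer epsilon grid.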
Frequently Asked Questions
Projected Gradient Descent (PGD) is a cornerstone iterative white-box attack and a critical component of adversarial training. These questions address its mechanics, role in security evaluation, and practical implementation.
Projected Gradient Descent (PGD) is a powerful, iterative white-box adversarial attack algorithm that generates adversarial examples by taking multiple, small steps in the direction of the loss gradient, while constraining the total perturbation to remain within a specified norm ball (typically L∞ or L2). It is considered a strong first-order attack and serves as the standard benchmark for evaluating adversarial robustness. The core mechanism involves starting from an initial point (often the original input or a random point within the allowable perturbation budget), computing the gradient of the model's loss with respect to the input, taking a step to increase the loss (causing misclassification), and then projecting the perturbed input back onto the valid norm ball. This process repeats for a set number of iterations, refining the adversarial example to be both effective and constrained.
Related Terms
Projected Gradient Descent (PGD) is a cornerstone technique within adversarial testing. The following terms define the ecosystem of attacks, defenses, and evaluation metrics that surround it.
Fast Gradient Sign Method (FGSM)
The Fast Gradient Sign Method is a foundational, one-step white-box attack that generates an adversarial example by perturbing an input in the direction of the sign of the loss function's gradient. It is defined as:
x_adv = x + ε * sign(∇_x J(θ, x, y))
- Key Difference from PGD: FGSM is a single-step attack, while PGD is its multi-step, iterative extension. PGD applies FGSM multiple times with a small step size, projecting the perturbation back into a valid norm ball after each iteration, making it a much stronger attack.
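For comparison with the iterative attack, the FGSM formula above amounts to a single signed-gradient step of size ε. The sketch below applies it to a toy logistic model in pure Python; the model, input, and function names are illustrative assumptions.

```python
import math

def input_gradient(w, b, x, y):
    """Gradient of the cross-entropy loss of a toy logistic model w.r.t. the input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, epsilon):
    """Single step: x_adv = x + epsilon * sign(grad_x J)."""
    grad = input_gradient(w, b, x, y)
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model and input.
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1
x_adv = fgsm(w, b, x, y, epsilon=0.3)
print(x_adv)
```

No projection is needed because a single step of size ε lands exactly on the boundary of the L∞ ball. For a linear model like this one, that single step already reaches the loss-maximizing corner; the advantage of PGD's repeated small steps appears on non-linear models, where the gradient direction changes along the way.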
Adversarial Training
Adversarial training is the primary defensive technique used to improve model robustness. It involves augmenting the standard training dataset with adversarial examples, forcing the model to learn from these challenging cases.
- Role of PGD: PGD is the de facto standard for generating the adversarial examples used in this process due to its strength. Models are trained to minimize loss on PGD-crafted adversarial examples, in some setups mixed with clean data, which typically yields significantly higher robust accuracy.
White-Box vs. Black-Box Attack
These terms define the attacker's assumed level of knowledge about the target model.
- White-Box Attack: The attacker has full access to the model's architecture, parameters (weights), and gradients. PGD is a quintessential white-box attack, as it requires calculating the gradient ∇_x J(θ, x, y) to craft perturbations.
- Black-Box Attack: The attacker can only query the model and observe its input-output behavior, with no internal access. Attacks like query-based or transfer attacks fall into this category. PGD's strength makes it a common method for generating adversarial examples on a surrogate model in a transfer attack setting.
Robust Accuracy
Robust accuracy is the critical evaluation metric for models defended against adversarial attacks. It measures a model's classification accuracy on a test set consisting of adversarial examples.
- Contrast with Standard Accuracy: Standard accuracy measures performance on clean, unperturbed data and can be misleading for security-critical applications. A model with high standard accuracy can have near-zero robust accuracy.
- Benchmarking with PGD: Robust accuracy is typically reported using adversarial examples generated by a strong, multi-step PGD attack, as this provides a rigorous and conservative estimate of a model's real-world reliability under threat.
Carlini & Wagner (C&W) Attack
The Carlini & Wagner attack is another powerful, optimization-based white-box attack. It formulates the search for an adversarial example as a constrained optimization problem, often using a change-of-variables to handle box constraints.
- Comparison to PGD: While both are strong white-box attacks, they have different formulations. C&W typically searches for the minimal perturbation required for misclassification. PGD, in contrast, searches for the perturbation that maximizes the loss within a fixed norm ball (ε). PGD is more commonly used for adversarial training because it is cheaper to run and reliably generates strong perturbations within the defined threat model.
Adversarial Example
An adversarial example is the output of an attack algorithm like PGD. It is an input (e.g., an image, text token) that has been intentionally perturbed in a way often imperceptible to humans but causes a machine learning model to make a high-confidence error.
- Core Properties:
  - Perturbation Constraint: The change is bounded by a norm (e.g., L∞, L₂) to ensure it is small or imperceptible.
  - High-Confidence Error: The model is not just uncertain; it is confidently wrong.
- PGD's Role: PGD is a specific, iterative algorithm for reliably generating these examples within a defined constraint set, making them essential for both testing (red-teaming) and defense (adversarial training).

About the author
Prasad Kumkar
CEO & MD, Inference Systems
Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.
His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.