Causal Shapley values are a game-theoretic attribution method that quantifies the marginal causal contribution of each feature or treatment to a model's prediction or an observed outcome. Unlike standard Shapley values, which measure statistical association, Causal Shapley values rely on a specified Structural Causal Model (SCM) or causal graph to account for the cause-and-effect relationships between variables. Provided the assumed causal structure is correct, the attribution reflects causal influence rather than mere correlation, because each coalition is evaluated under interventions instead of observational conditioning.

What is Causal Shapley?
Causal Shapley values extend the Shapley value concept from cooperative game theory to causal inference, providing a method to fairly attribute the causal effect of multiple treatments or features to individual contributors within a causal model.
The method calculates a feature's value by averaging its marginal causal contribution across all possible orderings in which features could be introduced (equivalently, a weighted average over all coalitions of the other features), using the do-operator to simulate interventions. This yields a unique allocation of the total causal effect that satisfies the efficiency, symmetry, dummy, and additivity axioms. It is particularly valuable for explainable AI (XAI) in complex systems where understanding the drivers of an outcome is critical for trust and debugging, as in causal fairness analysis or causal reinforcement learning.
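Written out, the averaging described above takes the usual Shapley form with an interventional value function. The notation is an assumption introduced here for illustration: N is the set of features, x the instance being explained, f the model or outcome function, and v(S) the value of coalition S.

```latex
\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N| - |S| - 1)!}{|N|!}\,\bigl[v(S \cup \{i\}) - v(S)\bigr],
\qquad
v(S) = \mathbb{E}\bigl[f(X) \mid do(X_S = x_S)\bigr]
```

The combinatorial weight is the fraction of the |N|! orderings in which exactly the features in S precede feature i, which is why the sum over coalitions and the average over orderings agree.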
Key Properties of Causal Shapley Values
Causal Shapley values extend the classic Shapley value from cooperative game theory into the domain of causal inference. They provide a principled framework for attributing the causal effect of multiple treatments or features to individual contributors, respecting the underlying causal structure.
Causal vs. Observational Attribution
The fundamental distinction from standard Shapley values is the use of interventional distributions rather than conditional distributions. Standard SHAP asks, 'What is the model's prediction when we know feature X?' Causal Shapley asks, 'What is the outcome when we set feature X?' This shift from P(Y | X=x) to P(Y | do(X=x)) ensures the attribution reflects causal influence, not just statistical association: intervening on X cuts X off from its own causes, so backdoor paths through other features no longer contribute to the measured effect.
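The gap between conditioning and intervening can be seen in a small simulation. The sketch below assumes a hypothetical linear SCM in which Z confounds X and Y (Z -> X, Z -> Y, X -> Y) and the true causal coefficient of X on Y is 2; the variable names, coefficients, and sample size are illustrative choices, not part of any library.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20_000

# Exogenous draws for the hypothetical SCM: Z -> X, Z -> Y, X -> Y.
z = [rng.gauss(0, 1) for _ in range(n)]
ux = [rng.gauss(0, 1) for _ in range(n)]
uy = [rng.gauss(0, 1) for _ in range(n)]


def outcome(x, z_i, u_i):
    """Structural equation for Y; the causal coefficient of X is 2."""
    return 2.0 * x + 3.0 * z_i + u_i


def mean(values):
    return sum(values) / len(values)


# Observational world: X is generated from its cause Z.
x_obs = [z_i + u for z_i, u in zip(z, ux)]
y_obs = [outcome(x, z_i, u) for x, z_i, u in zip(x_obs, z, uy)]

mx, my = mean(x_obs), mean(y_obs)
cov_xy = mean([(a - mx) * (b - my) for a, b in zip(x_obs, y_obs)])
var_x = mean([(a - mx) ** 2 for a in x_obs])
obs_slope = cov_xy / var_x  # associational slope, about 3.5 in this SCM


# Interventional world: do(X = x) ignores Z when setting X.
def mean_y_do(x):
    return mean([outcome(x, z_i, u) for z_i, u in zip(z, uy)])


do_effect = mean_y_do(1.0) - mean_y_do(0.0)  # causal effect of a unit change: 2

print(round(obs_slope, 2), round(do_effect, 2))
```

The associational slope is inflated because X also carries information about Z, whereas the interventional contrast recovers the structural coefficient.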
Respect for Causal Order
Attribution respects the partial ordering defined by the causal graph (DAG). A feature can only be credited for effects flowing through its descendants. When evaluating a coalition's value, features are 'intervened on' according to this order, preventing illogical attributions where an effect is credited to a variable that occurs later in the causal chain. This property ensures the explanation aligns with the known or assumed causal mechanics of the system.
Additive Decomposition of Total Effect
Causal Shapley values provide an additive decomposition of the total causal effect of moving all features from a baseline state x' to their current values x. For the expected outcome difference E[Y | do(X=x)] - E[Y | do(X=x')], the sum of the Causal Shapley values over all features equals this total effect. This property guarantees that the attribution accounts for the whole measured causal change, with nothing left unassigned.
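The additive decomposition can be checked on a two-feature toy problem. This is a minimal sketch under stated assumptions: a hypothetical linear SCM x1 -> x2 -> y with x2 := 0.5*x1 + noise and y := x1 + 2*x2, zero-mean noise, and the empty coalition interpreted as "no intervention", so the baseline is E[Y] = 0. The function names and the instance values are illustrative.

```python
from itertools import permutations
from math import factorial

FEATURES = ("x1", "x2")
INSTANCE = {"x1": 1.0, "x2": 2.0}  # values being explained (illustrative)


def value(coalition):
    """E[Y | do(X_S = x_S)] for the toy SCM, computed in closed form."""
    # A root that is not intervened on has mean 0; an unintervened x2
    # follows its structural equation given the (possibly set) x1.
    x1 = INSTANCE["x1"] if "x1" in coalition else 0.0
    x2 = INSTANCE["x2"] if "x2" in coalition else 0.5 * x1
    return x1 + 2.0 * x2


def shapley(features, value_fn):
    """Exact Shapley values: average marginal contribution over all orderings."""
    phi = {f: 0.0 for f in features}
    for order in permutations(features):
        coalition = frozenset()
        for f in order:
            with_f = coalition | {f}
            phi[f] += value_fn(with_f) - value_fn(coalition)
            coalition = with_f
    n_orders = factorial(len(features))
    return {f: total / n_orders for f, total in phi.items()}


phi = shapley(FEATURES, value)
total_effect = value(frozenset(FEATURES)) - value(frozenset())
print(phi, total_effect)  # {'x1': 1.5, 'x2': 3.5} 5.0
```

The two attributions sum to the total effect of 5.0. Note that x1 receives credit both for its direct effect on y and for part of the effect it transmits through x2.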
The Shapley Axioms in a Causal Context
The method satisfies the four classic Shapley axioms, reinterpreted for causal settings:
- Efficiency: The attributions sum to the total causal effect.
- Symmetry: Two features that have identical causal effects receive equal attribution.
- Dummy: A feature with no causal effect on the outcome receives zero attribution.
- Additivity: For a combined effect, the attribution is the sum of attributions from individual effects.
These axioms provide a game-theoretically fair and unique solution to the attribution problem.
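In symbols, the four axioms can be stated as follows. The notation is assumed for illustration: v and w are value functions over coalitions of the feature set N (in the causal setting, v(S) is the expected outcome under do(X_S = x_S)), and phi_i(v) is the attribution to feature i.

```latex
\begin{aligned}
\text{Efficiency:}\quad & \sum_{i \in N} \phi_i(v) = v(N) - v(\emptyset) \\
\text{Symmetry:}\quad & v(S \cup \{i\}) = v(S \cup \{j\}) \ \ \forall S \subseteq N \setminus \{i, j\} \;\Rightarrow\; \phi_i(v) = \phi_j(v) \\
\text{Dummy:}\quad & v(S \cup \{i\}) = v(S) \ \ \forall S \subseteq N \setminus \{i\} \;\Rightarrow\; \phi_i(v) = 0 \\
\text{Additivity:}\quad & \phi_i(v + w) = \phi_i(v) + \phi_i(w)
\end{aligned}
```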
Handling of Confounding and Mediation
The framework correctly attributes effects through direct and indirect pathways. By using the do-operator, it accounts for confounding: the effect attributed to a feature is not inflated by backdoor associations. It can also decompose a feature's total effect into portions mediated through other variables versus direct effects, providing nuanced insight into the causal mechanism, which is crucial for interpretable and robust model explanations.
Computational and Identifiability Challenges
Calculation requires estimating an interventional distribution P(Y | do(X_S = x_S)) for every feature subset S, which means up to 2^n subsets for n features. This is computationally intensive and also requires causal identifiability: the effect for each coalition must be identifiable from the available data and causal graph, often relying on assumptions such as no unmeasured confounding within the subset. This makes Causal Shapley values both more principled and more demanding than their associational counterpart.
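One common way to sidestep the exponential number of coalitions is to sample random orderings and average the marginal contributions along each one. The sketch below is a generic permutation-sampling estimator, not a specific library's API. The function `toy_value` is a hypothetical stand-in for an estimated E[Y | do(X_S = x_S)]; in practice each call would itself require an interventional estimate.

```python
import random


def sampled_shapley(features, value_fn, n_orderings, seed=0):
    """Estimate Shapley values from randomly sampled feature orderings."""
    rng = random.Random(seed)
    phi = {f: 0.0 for f in features}
    for _ in range(n_orderings):
        order = list(features)
        rng.shuffle(order)
        coalition = frozenset()
        prev = value_fn(coalition)
        for f in order:
            coalition = coalition | {f}
            cur = value_fn(coalition)
            phi[f] += cur - prev  # marginal contribution of f in this ordering
            prev = cur
    return {f: total / n_orderings for f, total in phi.items()}


# Hypothetical coalition values: additive weights plus an a-b interaction.
WEIGHTS = {"a": 1.0, "b": 2.0, "c": 0.5, "d": 0.0}


def toy_value(coalition):
    base = sum(WEIGHTS[f] for f in coalition)
    bonus = 4.0 if {"a", "b"} <= coalition else 0.0
    return base + bonus


est = sampled_shapley(tuple(WEIGHTS), toy_value, n_orderings=2000)
print({f: round(v, 2) for f, v in est.items()})
```

For this toy game the exact values are a = 3.0, b = 4.0, c = 0.5, d = 0.0, so the estimates for a and b should land near those numbers. Each sampled ordering costs n value evaluations, and the estimates still sum exactly to v(N) - v(∅) because every ordering telescopes.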
Frequently Asked Questions
This FAQ addresses common questions about the purpose and mechanics of Causal Shapley values, and their relationship to other explainability and causal methods.
Causal Shapley is a method for fairly attributing the causal effect of multiple treatments or features to individual contributors within a causal model. It works by extending the Shapley value concept from cooperative game theory, where each 'player' (feature) is assigned a value based on its average marginal contribution to a prediction across all possible coalitions (feature subsets). In the causal context, the 'game' is defined by a Structural Causal Model (SCM), and contributions are measured as the causal effect of setting a feature to a specific value via an intervention (the do-operator), rather than just its conditional association. The calculation involves evaluating the outcome for every possible subset of features being intervened upon, using the SCM to compute the counterfactual outcome, and then averaging the marginal causal contributions according to the Shapley formula.
Related Terms
Causal Shapley values sit at the intersection of causal inference and cooperative game theory. These related concepts provide the mathematical and conceptual foundation for understanding their mechanics and applications.
Shapley Value
The Shapley value is a solution concept from cooperative game theory that provides a unique, axiomatically fair method for distributing the total payoff of a coalition game among its players. It is defined as the average marginal contribution of a player across all possible orderings of players joining the coalition.
- Core Concept: Fair attribution based on average marginal contribution.
- Key Axioms: Efficiency, Symmetry, Dummy Player, and Additivity.
- Application in ML: The foundation for SHAP (SHapley Additive exPlanations), used for feature attribution in predictive models.
Causal Inference
Causal inference is the process of drawing conclusions about cause-and-effect relationships from data, moving beyond statistical associations to determine the impact of an intervention or treatment on an outcome. It relies on formal frameworks like Structural Causal Models (SCMs) and the do-calculus.
- Goal: Answer "what if" questions about interventions.
- Key Tools: Randomized controlled trials, observational studies with adjustment (e.g., via backdoor criterion), instrumental variables.
- Contrast with Prediction: Focuses on understanding system dynamics under change, not just forecasting.
Structural Causal Model (SCM)
A Structural Causal Model (SCM) is a formal mathematical framework that represents causal relationships between variables using a system of structural equations, typically visualized as a causal graph (a Directed Acyclic Graph). It defines how each variable is generated from its direct causes and independent noise.
- Components: A set of variables, a set of error terms, and a set of functions assigning each variable a value based on its parents.
- Enables: Reasoning about interventions (do-operator) and counterfactuals.
- Role in Causal Shapley: Provides the underlying causal graph and structural equations necessary to compute the value of a feature under different "worlds" (interventional distributions).
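The components listed above map directly onto a small amount of code. The following is a minimal sketch of a hypothetical three-variable SCM with Gaussian noise, where an intervention simply replaces a variable's structural equation with a fixed value. The names `EQUATIONS`, `sample`, and `expected_y` are illustrative.

```python
import random

# Structural equations in topological order. Each takes the values computed
# so far and an independent noise draw. The graph is x1 -> x2 -> y, x1 -> y.
EQUATIONS = {
    "x1": lambda v, u: u,
    "x2": lambda v, u: 0.5 * v["x1"] + u,
    "y": lambda v, u: v["x1"] + 2.0 * v["x2"] + u,
}


def sample(rng, do=None):
    """Draw one unit; variables named in `do` are fixed, ignoring their equation."""
    do = do or {}
    values = {}
    for name, fn in EQUATIONS.items():
        noise = rng.gauss(0, 1)  # drawn even when unused, to keep streams aligned
        values[name] = do[name] if name in do else fn(values, noise)
    return values


def expected_y(do=None, n=20_000, seed=0):
    """Monte Carlo estimate of E[Y | do(...)]."""
    rng = random.Random(seed)
    return sum(sample(rng, do)["y"] for _ in range(n)) / n


print(round(expected_y(), 2))                 # close to 0: no intervention
print(round(expected_y(do={"x1": 1.0}), 2))   # close to 2: 1 + 2 * 0.5
print(round(expected_y(do={"x2": 2.0}), 2))   # close to 4: E[x1] + 2 * 2
```

Evaluating `expected_y` for each coalition of intervened variables gives exactly the "worlds" that a Causal Shapley computation averages over.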
Counterfactual
A counterfactual is a statement about what would have happened to an outcome if a cause had been different. It represents the highest level of reasoning on Pearl's 'ladder of causation' and answers retrospective 'what if' questions for specific instances.
- Example: "Would this patient have survived if they had not received the drug?"
- Computation: Requires a fully specified SCM to account for background conditions.
- Connection to Causal Shapley: Causal Shapley values can be viewed as a weighted average of counterfactual contributions of a feature across all possible feature subsets, moving beyond purely associational Shapley values.
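For a single unit, a counterfactual is typically computed in three steps: abduction (infer the unit's noise terms from what was observed), action (set the cause to its new value), and prediction (replay the structural equations). A minimal sketch, assuming a hypothetical one-equation SCM Y := 2*X + U with additive noise so that U can be recovered exactly:

```python
def f_y(x, u):
    """Hypothetical structural equation Y := 2*X + U."""
    return 2.0 * x + u


def counterfactual_y(x_obs, y_obs, x_new):
    u = y_obs - 2.0 * x_obs  # abduction: recover this unit's noise term
    return f_y(x_new, u)     # action and prediction: replay with X set to x_new


# A unit observed with X = 1 and Y = 3 has U = 1, so with X = 0 it would have Y = 1.
print(counterfactual_y(x_obs=1.0, y_obs=3.0, x_new=0.0))  # 1.0
```

With non-invertible or non-additive noise, abduction yields a distribution over U rather than a single value, and the counterfactual becomes an expectation over that distribution.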
Do-Calculus
Do-calculus is a set of three inference rules developed by Judea Pearl that allows one to compute the effects of interventions from observational data, provided a causal graph is known. It transforms expressions containing the do-operator (e.g., P(Y|do(X))) into observational probabilities.
- Purpose: To identify and estimate causal effects from non-experimental data.
- Prerequisite: A valid causal graph representing the data-generating process.
- Role in Causal Shapley: Provides the formal machinery to compute the interventional distributions required when evaluating a feature's contribution within a coalition (subset of variables). It replaces the observational conditional distributions used in standard Shapley with interventional ones.
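The best-known result that do-calculus licenses is the backdoor adjustment formula. If a set of observed variables Z satisfies the backdoor criterion relative to (X, Y) in the causal graph, the interventional distribution can be rewritten purely in observational terms (shown here for discrete Z):

```latex
P(y \mid do(x)) = \sum_{z} P(y \mid x, z)\, P(z)
```

The difference from ordinary conditioning is the weighting: the strata of Z are weighted by P(z) rather than by P(z | x), which removes the part of the association that flows through Z.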
Average Treatment Effect (ATE)
The Average Treatment Effect (ATE) is the average causal effect of a treatment or intervention across an entire population. It is calculated as the expected difference in outcomes between the treated and untreated states for a randomly selected individual: ATE = E[Y|do(T=1)] - E[Y|do(T=0)].
- Population-Level: Measures the effect for the group, not for a specific individual.
- Estimation Methods: Includes randomized experiments, propensity score matching, and double machine learning.
- Contrast with Causal Shapley: While ATE attributes the total effect to a single treatment variable, Causal Shapley decomposes a complex outcome (which may be a function of multiple treatments/features) and attributes portions of it to each contributing variable in a causally grounded, fair manner.
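As an illustration of estimating an ATE from observational data, the sketch below simulates a hypothetical binary confounder Z that raises both the chance of treatment and the outcome, then compares a naive difference in means with a stratified (backdoor-adjusted) estimate. The data-generating process and its coefficients are assumptions chosen so that the true ATE is 1.

```python
import random

rng = random.Random(1)
rows = []
for _ in range(20_000):
    z = 1 if rng.random() < 0.5 else 0                  # confounder
    t = 1 if rng.random() < (0.8 if z else 0.2) else 0  # treatment depends on z
    y = 1.0 * t + 2.0 * z + rng.gauss(0, 1)             # true effect of t is 1
    rows.append((z, t, y))


def mean_y(t, z=None):
    """Mean outcome among units with treatment t (optionally within stratum z)."""
    ys = [y for (zi, ti, y) in rows if ti == t and (z is None or zi == z)]
    return sum(ys) / len(ys)


# Naive contrast mixes the treatment effect with the effect of z.
naive = mean_y(1) - mean_y(0)

# Backdoor adjustment: average the within-stratum contrasts, weighted by P(z).
p_z1 = sum(z for z, _, _ in rows) / len(rows)
ate = sum(
    p * (mean_y(1, z) - mean_y(0, z))
    for z, p in ((0, 1 - p_z1), (1, p_z1))
)

print(round(naive, 2), round(ate, 2))
```

In this setup the naive estimate comes out near 2.2 while the adjusted estimate is near the true value of 1, because treated units are disproportionately drawn from the Z = 1 stratum.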

About the author
Prasad Kumkar
CEO & MD, Inference Systems
Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.
His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.