Glossary

Fairness Metric (Disparate Impact)

Disparate Impact is a quantitative fairness metric that compares selection rates across demographic groups to detect unintentional discrimination in AI systems.

MODEL BENCHMARKING SUITES

What is Fairness Metric (Disparate Impact)?

A quantitative measure for auditing AI systems for discriminatory outcomes, with disparate impact being a core legal test.

A fairness metric is a quantitative measure used to audit an artificial intelligence system for discriminatory outcomes across different demographic groups. Disparate impact is a specific legal and statistical fairness test that compares the selection or benefit rates between a protected group (e.g., based on race or gender) and a majority group, flagging a system if the ratio falls below a legally recognized threshold, often 80% in U.S. employment law. This metric is a cornerstone of algorithmic auditing and ethical bias auditing.

Unlike disparate treatment, which turns on intent, disparate impact looks only at outcomes, which makes it straightforward to include in model benchmarking suites. It is calculated by dividing the selection rate for a protected group by the rate for the advantaged group. A result below 0.8 (the "80% rule") signals potential adverse impact that warrants further investigation and, where the disparity cannot be justified, remediation such as adjusting the model or applying bias mitigation techniques. Outcome-based tests of this kind also support the bias-monitoring obligations in regulations like the EU AI Act, although that Act does not itself prescribe the 80% threshold.

FAIRNESS METRIC (DISPARATE IMPACT)

Key Legal and Regulatory Frameworks

Disparate impact is a legal doctrine and fairness metric used to identify unintentional discrimination in algorithmic systems by analyzing statistical outcomes across protected groups. Its application is governed by several key legal frameworks.

The 80% Rule (Four-Fifths Rule)

The 80% Rule comes from the Uniform Guidelines on Employee Selection Procedures (1978), adopted by the U.S. Equal Employment Opportunity Commission (EEOC) and other federal agencies, and is the most widely used rule of thumb for disparate impact in employment. It states that a selection rate for any protected group (e.g., race, sex, or ethnic group) that is less than 80% (or four-fifths) of the rate for the group with the highest selection rate will generally be regarded as evidence of adverse impact.

Calculation: (Selection Rate for Group A) / (Selection Rate for Group with Highest Rate) >= 0.8
Example: If an AI resume screener selects 50% of male applicants but only 30% of female applicants, the ratio is 0.6 (30%/50%), which is below the 0.8 threshold, indicating potential disparate impact.
Purpose: Serves as a bright-line, initial screening tool to flag systems for further legal and technical scrutiny.
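
The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not a compliance tool: the function names `selection_rates` and `four_fifths_check` and the sample records are hypothetical, chosen to mirror the resume-screener example.

```python
from collections import Counter


def selection_rates(records):
    """Return {group: selection rate} from (group, was_selected) pairs."""
    totals, selected = Counter(), Counter()
    for group, was_selected in records:
        totals[group] += 1
        selected[group] += int(bool(was_selected))
    return {group: selected[group] / totals[group] for group in totals}


def four_fifths_check(records, threshold=0.8):
    """Compare each group's rate to the highest rate.

    Returns {group: (ratio, passes)} where passes means ratio >= threshold.
    """
    rates = selection_rates(records)
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no group has any selections; ratio is undefined")
    return {g: (r / highest, r / highest >= threshold) for g, r in rates.items()}


# Hypothetical data: 50 of 100 male and 30 of 100 female applicants selected.
records = (
    [("male", True)] * 50 + [("male", False)] * 50
    + [("female", True)] * 30 + [("female", False)] * 70
)
result = four_fifths_check(records)
for group, (ratio, passes) in result.items():
    print(f"{group}: ratio={ratio:.2f} passes={passes}")
```

With small groups the ratio is noisy, so a check like this is usually paired with a significance test before drawing conclusions.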

Title VII of the Civil Rights Act (1964)

Title VII of the Civil Rights Act is the foundational U.S. federal law prohibiting employment discrimination. It is the statutory basis for the disparate impact legal theory.

Key Precedent: Griggs v. Duke Power Co. (1971) established that employment practices with a disparate impact are illegal, even without proof of discriminatory intent, if they are not job-related and consistent with business necessity.
Burden-Shifting Framework:
1. Plaintiff demonstrates a statistically significant disparate impact.
2. Defendant (Employer) must prove the practice is a business necessity.
3. Plaintiff can still prevail by showing an available alternative practice exists that serves the same business purpose with less discriminatory effect.
Application to AI: This framework directly applies to algorithmic hiring, promotion, and credit scoring tools used in employment contexts.
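
Step 1 of the framework turns on statistical significance. One common way to check it, sketched here under the assumption that a pooled two-proportion z-test suits the data, is shown below. The function name and the sample counts are illustrative; which test a court or regulator would accept in a given case is a legal and statistical judgment outside the scope of this sketch.

```python
import math


def two_proportion_z_test(selected_a, total_a, selected_b, total_b):
    """Two-sided pooled z-test for a difference in selection rates.

    Returns (z, p_value). Assumes samples large enough for the normal
    approximation; small samples call for an exact test instead.
    """
    rate_a = selected_a / total_a
    rate_b = selected_b / total_b
    pooled = (selected_a + selected_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        raise ValueError("standard error is zero; rates are degenerate")
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 30 of 100 women and 50 of 100 men selected.
z, p_value = two_proportion_z_test(30, 100, 50, 100)
print(f"z={z:.2f}, p={p_value:.4f}")
```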

The Equal Credit Opportunity Act (ECOA)

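The Equal Credit Opportunity Act (ECOA) and its implementing regulation, Regulation B, prohibit discrimination in any aspect of a credit transaction. The statute does not mention disparate impact by name; the theory is recognized through Regulation B and its official commentary, which refer to it as the "effects test."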

Regulatory Guidance: The Consumer Financial Protection Bureau (CFPB) has stated that creditors are liable for algorithmic discrimination under ECOA, regardless of intent, if a model results in a disparate impact on a protected class.
Protected Classes: Includes race, color, religion, national origin, sex, marital status, age, and receipt of public assistance.
Business Necessity Defense: Similar to Title VII, a creditor may defend a model by demonstrating it is empirically derived, demonstrably and statistically sound, and meets a legitimate business need. The CFPB emphasizes the need for ongoing testing and monitoring for disparate impact.

The European Union AI Act (Risk-Based Approach)

The EU AI Act takes a risk-based regulatory approach, where high-risk AI systems (including those for employment, credit, and essential services) face stringent requirements. While it does not use the U.S. term "disparate impact," its requirements for fundamental rights impact assessments and bias monitoring serve a parallel purpose.

High-Risk System Obligations: Providers must implement risk management systems and data governance practices to mitigate risks of algorithmic bias that could lead to discriminatory outcomes.
Conformity Assessment: Before deployment, high-risk systems must undergo assessment to ensure they do not perpetuate prohibited discrimination.
Post-Market Monitoring: Continuous logging and monitoring are required to detect any drift in model performance that could create new discriminatory effects.
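
One way to put post-market monitoring into practice is to recompute the selection-rate ratio for each reporting period and flag periods that fall below a chosen threshold. The sketch below assumes this approach; the EU AI Act does not prescribe this metric or the 0.8 threshold, and the function name, data layout, and counts are hypothetical.

```python
def monitor_ratio(batches, protected, reference, threshold=0.8):
    """Flag periods whose protected/reference selection-rate ratio is low.

    batches: list of {group: (selected, total)} dicts, one per period.
    Returns a list of (period_index, ratio) for flagged periods.
    """
    alerts = []
    for period, counts in enumerate(batches):
        sel_p, tot_p = counts[protected]
        sel_r, tot_r = counts[reference]
        rate_p = sel_p / tot_p
        rate_r = sel_r / tot_r
        ratio = rate_p / rate_r if rate_r else float("nan")
        # "not >=" also flags NaN, i.e. periods with no reference selections.
        if not ratio >= threshold:
            alerts.append((period, ratio))
    return alerts


# Hypothetical monthly counts: the ratio drifts from about 0.89 down to 0.60.
batches = [
    {"female": (40, 100), "male": (45, 100)},
    {"female": (30, 100), "male": (50, 100)},
]
alerts = monitor_ratio(batches, protected="female", reference="male")
print(alerts)
```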

Algorithmic Accountability Act (Proposed U.S. Framework)

While not yet law, the proposed Algorithmic Accountability Act represents a forward-looking U.S. legislative framework that would mandate impact assessments for automated systems, explicitly including disparate impact analysis.

Covered Entities: Would apply to large companies and critical software vendors.
Required Assessments: Companies would be required to perform impact assessments evaluating their automated systems for impacts on accuracy, fairness, bias, discrimination, privacy, and security.
Public Reporting: A summary of these assessments, including steps taken to mitigate disparate impact, would need to be published in a publicly accessible repository. This aims to create transparency and accountability for high-impact algorithmic decision-making.

Technical Auditing Standards (NIST AI RMF)

The National Institute of Standards and Technology (NIST) AI Risk Management Framework provides a voluntary but authoritative technical standard for managing AI risks, including those related to fairness and disparate impact.

Map, Measure, Manage, Govern: The framework's core functions guide organizations in measuring disparate impact and managing it throughout the AI lifecycle.
Context-Specific Metrics: Emphasizes that fairness cannot be reduced to a single metric (like the 80% rule) and must be evaluated using a suite of quantitative and qualitative measures appropriate to the specific use case and impacted populations.
Documentation & Traceability: Stresses the need for detailed documentation of model design choices, training data provenance, and evaluation results to support regulatory compliance and internal audits for disparate impact.

FAIRNESS METRIC COMPARISON

Disparate Impact vs. Other Fairness Metrics

A comparison of disparate impact with other common fairness metrics, highlighting their legal basis, statistical formulation, and primary use cases in algorithmic auditing.

Metric / Feature	Disparate Impact (80% Rule)	Equalized Odds	Demographic Parity	Individual Fairness
Legal & Regulatory Basis	U.S. Civil Rights Law (Title VII), EU AI Act (High-Risk)			Primarily a research concept
Core Definition	Compares selection rates between protected and non-protected groups.	Requires equal true positive and false positive rates across groups.	Requires equal positive prediction rates across groups.	Requires similar individuals to receive similar predictions.
Primary Mathematical Test	Ratio(Selection Rate_Protected / Selection Rate_Non-Protected) ≥ 0.8	P(Ŷ=1 | Y=1, A=a) = P(Ŷ=1 | Y=1, A=b) AND P(Ŷ=1 | Y=0, A=a) = P(Ŷ=1 | Y=0, A=b)	P(Ŷ=1 | A=a) = P(Ŷ=1 | A=b)	D(Ŷ_i, Ŷ_j) ≤ ε * D(x_i, x_j) for a distance metric D
Requires Ground Truth Labels (Y)	No	Yes	No	No
Considers Model Accuracy
Use Case Example	Hiring algorithm screening resumes.	Criminal recidivism prediction tool.	College admissions pre-screening.	Credit scoring for individuals with similar financial profiles.
Key Trade-off / Limitation	Can conflict with model accuracy; ignores legitimate qualifications.	Very strict; often impossible to satisfy perfectly without perfect predictor.	Ignores underlying differences in qualification rates between groups.	Defining a similarity metric for individuals is highly non-trivial.
Common in Production Audits

FAIRNESS METRIC

Common Use Cases for Disparate Impact Analysis

Disparate impact analysis is a quantitative fairness audit used to detect unintentional discrimination in automated systems. These are the primary domains where it is applied to ensure equitable outcomes.

Hiring & Recruitment Algorithms

This is the canonical use case for disparate impact analysis, often scrutinized under the Uniform Guidelines on Employee Selection Procedures. Auditors apply the four-fifths rule (80% rule) to automated resume screening, video interview analysis, and skills assessment tools. For example, if a model recommends 50% of male applicants for interviews but only 30% of female applicants, the ratio of 0.6 (30% / 50%) falls below the 0.8 threshold and indicates a potential disparate impact. Analysis focuses on protected attributes like gender, race, and age to check that selection rates across groups stay within that threshold.

Credit Scoring & Loan Underwriting

Financial institutions and regulators (e.g., the Consumer Financial Protection Bureau) use disparate impact analysis to audit algorithmic credit models for compliance with the Equal Credit Opportunity Act (ECOA). The test compares approval rates, interest rates, and credit limits across demographic groups defined by race, national origin, or sex. A key challenge is distinguishing between legitimate risk-based pricing and discriminatory proxies, requiring analysis of feature attribution to see if zip code or shopping history acts as a surrogate for a protected class.
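
Before running full feature attribution, a simpler complementary screen is to ask how well a candidate feature predicts the protected class on its own. The sketch below assumes a majority-vote comparison: it measures how much better group membership can be guessed once the feature value is known. The function name `proxy_strength` and the toy data are hypothetical, and a large gap is a prompt for closer review, not proof of discrimination.

```python
from collections import Counter, defaultdict


def proxy_strength(feature_values, groups):
    """Return (baseline, informed) accuracy for guessing group membership.

    baseline: always guess the overall most common group.
    informed: guess the most common group within each feature value.
    A large gap suggests the feature may act as a proxy.
    """
    by_value = defaultdict(Counter)
    for value, group in zip(feature_values, groups):
        by_value[value][group] += 1
    n = len(groups)
    baseline = Counter(groups).most_common(1)[0][1] / n
    informed = sum(c.most_common(1)[0][1] for c in by_value.values()) / n
    return baseline, informed


# Hypothetical data: two zip codes with skewed group composition.
zip_codes = ["A"] * 4 + ["B"] * 4
groups = ["x", "x", "x", "y", "y", "y", "y", "x"]
baseline, informed = proxy_strength(zip_codes, groups)
print(f"baseline={baseline:.2f}, informed={informed:.2f}")
```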

Policing & Criminal Justice Risk Assessments

Tools like COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) used for predicting recidivism have been extensively audited for disparate impact. Analysis examines whether the models assign higher risk scores to defendants of one race compared to another with similar criminal histories, potentially leading to harsher bail or sentencing recommendations. This domain highlights the tension between predictive parity (equal precision across groups) and criteria based on selection or error rates, such as demographic parity (equal selection rates) and equal false positive rates: when base rates differ between groups, satisfying one generally violates the others.
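
The tension can be made concrete with a little arithmetic. If two groups have the same precision (PPV) and the same true positive rate but different base rates, their selection rates and false positive rates are forced apart. The sketch below derives both from the identities selection rate = base rate × TPR / PPV and FPR = selection rate × (1 − PPV) / (1 − base rate). The numbers are hypothetical and not taken from any COMPAS audit.

```python
def implied_rates(base_rate, tpr, ppv):
    """Derive (selection_rate, fpr) from base rate, TPR, and precision.

    True positives per person = base_rate * tpr, and precision is
    true positives / all positives, so selection_rate = base_rate * tpr / ppv.
    False positives per person = selection_rate * (1 - ppv), spread over
    the (1 - base_rate) share of actual negatives.
    """
    selection_rate = base_rate * tpr / ppv
    fpr = selection_rate * (1 - ppv) / (1 - base_rate)
    return selection_rate, fpr


# Same precision and TPR for both groups, different base rates (hypothetical).
group_a = implied_rates(base_rate=0.50, tpr=0.6, ppv=0.75)
group_b = implied_rates(base_rate=0.25, tpr=0.6, ppv=0.75)
print(f"A: selection={group_a[0]:.3f} fpr={group_a[1]:.3f}")
print(f"B: selection={group_b[0]:.3f} fpr={group_b[1]:.3f}")
print(f"selection-rate ratio B/A = {group_b[0] / group_a[0]:.2f}")
```

Here predictive parity holds by construction, yet the selection-rate ratio is 0.5, well under the four-fifths threshold, and the false positive rates also differ.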

Healthcare Resource Allocation & Diagnostics

Disparate impact analysis is applied to algorithms that predict healthcare needs, prioritize patients for interventions, or aid in clinical diagnosis. For instance, a model used to enroll high-risk patients into care management programs must be audited to ensure it does not systematically exclude groups based on race or socioeconomic status, which could be correlated with training data gaps. This is critical for compliance with Section 1557 of the Affordable Care Act, which prohibits discrimination in health programs.

Advertising Delivery & Targeted Marketing

Platforms use disparate impact analysis to audit ad delivery algorithms for digital redlining, where housing, employment, or credit ads are disproportionately shown or withheld from users based on inferred demographic characteristics. This analysis often involves running controlled experiments (A/A tests) to measure delivery rates across groups when ad content and targeting parameters are held constant, ensuring compliance with laws like the Fair Housing Act.

Academic Admissions & Scholarship Screening

Educational institutions apply disparate impact analysis to automated systems that pre-screen applications or award scholarships. The four-fifths rule is used to compare recommendation rates across groups based on race, gender, or disability status. The analysis must account for legitimate educational qualifications while ensuring the model does not amplify historical biases present in training data, such as correlations between extracurricular activities and socioeconomic status.

FAIRNESS METRIC

Frequently Asked Questions

A fairness metric is a quantitative measure used to audit an AI system for discriminatory outcomes across different demographic groups, with disparate impact being a common legal test comparing selection rates. This FAQ addresses key technical and legal questions surrounding this critical evaluation concept.

Disparate impact is a quantitative fairness metric that measures whether an AI system's selection rate (e.g., for loans, hiring, or services) differs significantly across protected demographic groups, such as those defined by race or gender. It is calculated as the ratio of the selection rate for a disadvantaged group to the selection rate for the most advantaged group. A value of 1.0 indicates perfect parity, while a value below a legal threshold (often 0.8, known as the "80% rule") suggests a potentially discriminatory adverse impact. This metric originates from U.S. employment law and is a cornerstone of algorithmic auditing for group fairness, focusing on outcomes rather than intent.

FAIRNESS & EVALUATION

Related Terms

Disparate impact is one of several key metrics and legal frameworks used to audit AI systems for discriminatory outcomes. Understanding related concepts is essential for comprehensive fairness evaluation.

Disparate Treatment

A legal doctrine of fairness focusing on intentional discrimination in decision-making processes. Unlike disparate impact, which examines outcomes, disparate treatment assesses whether a model's algorithm or design explicitly uses a protected attribute (e.g., race, gender) as an input for differential treatment.

Key Difference: Concerned with process and intent, not just statistical outcomes.
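Example: A loan approval model that takes an applicant's race directly as an input feature would constitute disparate treatment. A model that uses a facially neutral feature such as 'zip code', which happens to correlate with racial demographics, is typically analyzed under disparate impact instead, unless the feature was chosen deliberately as a stand-in for race.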
Legal Test: Requires evidence of discriminatory intent or a facially discriminatory policy.

Equalized Odds

A group fairness metric that requires a model's error rates to be equal across demographic groups. It mandates that both true positive rates and false positive rates are identical for all protected groups.

Formal Definition: For all groups a and b, P(Ŷ=1 | Y=1, A=a) = P(Ŷ=1 | Y=1, A=b) AND P(Ŷ=1 | Y=0, A=a) = P(Ŷ=1 | Y=0, A=b).
Use Case: Critical in high-stakes domains like criminal justice or hiring, where both types of errors (false positives and false negatives) must be fair.
Trade-off: When base rates differ between groups, it generally cannot be satisfied together with other fairness criteria such as demographic parity or predictive parity; this incompatibility is the subject of the fairness impossibility theorems.
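
The formal definition can be checked empirically by computing the true positive rate and false positive rate per group and reporting the largest gap in each. This is a minimal sketch with hypothetical function names and toy labels; in practice the gaps are compared against a tolerance rather than required to be exactly zero.

```python
def group_error_rates(y_true, y_pred, groups):
    """Return {group: {"tpr": ..., "fpr": ...}} from binary labels."""
    counts = {}
    for y, y_hat, g in zip(y_true, y_pred, groups):
        c = counts.setdefault(g, {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
        if y == 1:
            c["tp" if y_hat == 1 else "fn"] += 1
        else:
            c["fp" if y_hat == 1 else "tn"] += 1
    rates = {}
    for g, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        rates[g] = {
            "tpr": c["tp"] / positives if positives else float("nan"),
            "fpr": c["fp"] / negatives if negatives else float("nan"),
        }
    return rates


def equalized_odds_gaps(rates):
    """Return (max TPR gap, max FPR gap) across groups; (0, 0) is ideal."""
    tprs = [r["tpr"] for r in rates.values()]
    fprs = [r["fpr"] for r in rates.values()]
    return max(tprs) - min(tprs), max(fprs) - min(fprs)


# Hypothetical labels and predictions for two groups of four people each.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
rates = group_error_rates(y_true, y_pred, groups)
tpr_gap, fpr_gap = equalized_odds_gaps(rates)
print(rates, tpr_gap, fpr_gap)
```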

Demographic Parity

A group fairness criterion stating that the selection rate (positive prediction rate) should be equal across protected groups. Also known as statistical parity.

Formula: P(Ŷ=1 | A=a) = P(Ŷ=1 | A=b) for all groups a, b.
Contrast with Disparate Impact: Demographic parity is the mathematical ideal (a ratio of 1.0), while the 80% rule (disparate impact) is a legal threshold for identifying potential discrimination.
Criticism: Can conflict with meritocratic principles if base rates of qualification differ between groups. Enforcing it may require quotas or significant model adjustment.
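
The contrast between strict parity and the 80% rule shows up when both are evaluated on the same pair of rates. In the sketch below, the `tolerance` used to decide whether parity "holds" is an assumption introduced for illustration, since exact equality is rarely observed in sampled data. The function name and rates are hypothetical.

```python
def parity_report(rate_protected, rate_reference, tolerance=0.01, threshold=0.8):
    """Evaluate demographic parity and the four-fifths rule on two rates."""
    ratio = rate_protected / rate_reference
    difference = rate_reference - rate_protected
    return {
        "ratio": ratio,
        "difference": difference,
        # Parity: rates equal up to the assumed tolerance.
        "demographic_parity": abs(difference) <= tolerance,
        # Four-fifths rule: ratio at or above the legal rule-of-thumb threshold.
        "four_fifths": ratio >= threshold,
    }


# Hypothetical rates: 45% vs 50% selection.
report = parity_report(0.45, 0.50)
print(report)
```

With these rates the system clears the four-fifths threshold (ratio 0.9) while still falling short of demographic parity.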

Individual Fairness

A fairness paradigm that requires similar individuals to receive similar predictions, regardless of their group membership. It contrasts with group fairness metrics like disparate impact.

Core Principle: "Treat like cases alike."
Implementation Challenge: Requires defining a meaningful similarity metric for individuals, which is often non-trivial and domain-specific.
Example: Two loan applicants with identical credit scores, debt-to-income ratios, and employment history should receive the same prediction, even if they belong to different racial groups.
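
A simple audit for this principle checks every pair of individuals against a Lipschitz-style condition: the difference in predictions should not exceed ε times the distance between the individuals. The sketch assumes Euclidean distance on raw features and an arbitrary ε, both of which are placeholders. Choosing a defensible similarity metric is the hard part and is not solved here.

```python
import math
from itertools import combinations


def lipschitz_violations(features, scores, epsilon):
    """Return index pairs (i, j) where |score_i - score_j| > epsilon * distance.

    features: list of numeric tuples; scores: list of model outputs.
    Euclidean distance is an assumed similarity metric.
    """
    violations = []
    for i, j in combinations(range(len(features)), 2):
        feature_distance = math.dist(features[i], features[j])
        score_distance = abs(scores[i] - scores[j])
        if score_distance > epsilon * feature_distance:
            violations.append((i, j))
    return violations


# Hypothetical applicants: (credit score, debt-to-income ratio).
features = [(700, 0.30), (700, 0.30), (620, 0.45)]
scores = [0.80, 0.55, 0.40]
violations = lipschitz_violations(features, scores, epsilon=0.01)
print(violations)
```

Applicants 0 and 1 have identical features but different scores, so that pair is flagged regardless of ε.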

Counterfactual Fairness

A causality-based fairness definition that evaluates whether a model's prediction for an individual would remain the same if the individual's protected attribute (e.g., race, gender) were changed, while all other relevant, non-discriminatory attributes are held constant.

Rooted in Causal Models: Requires constructing a causal graph to distinguish discriminatory from legitimate factors.
Strong Guarantee: Aims to remove the direct and indirect influence of the protected attribute on the prediction.
Complexity: Demands significant domain knowledge to build the causal model and is computationally intensive to audit.
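
A much weaker but cheap check is an attribute-flip test: change only the protected attribute in each record and see whether the prediction changes. This is not counterfactual fairness in the causal sense, because it ignores the downstream effects of the attribute on other features that a causal graph would capture. The sketch assumes a model exposed as a plain callable on dict records; `toy_model` and the rows are hypothetical.

```python
def flip_test(predict, rows, attribute, values):
    """Return indices of rows whose prediction depends on `attribute`.

    predict: callable taking a dict record and returning a hashable output.
    Only the named attribute is changed; all other fields stay fixed.
    """
    changed = []
    for index, row in enumerate(rows):
        outputs = {predict({**row, attribute: value}) for value in values}
        if len(outputs) > 1:
            changed.append(index)
    return changed


def toy_model(row):
    # Hypothetical model that improperly adds a bonus for one group.
    score = row["income"] / 1000 + (5 if row["gender"] == "male" else 0)
    return score >= 50


rows = [
    {"income": 48000, "gender": "female"},
    {"income": 60000, "gender": "female"},
    {"income": 30000, "gender": "male"},
]
changed = flip_test(toy_model, rows, "gender", ["male", "female"])
print(changed)
```

Passing a flip test does not establish counterfactual fairness, since proxies for the protected attribute can still carry its influence.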

Adverse Action Notice

A regulatory requirement (e.g., under the U.S. Equal Credit Opportunity Act) that mandates lenders provide a specific, clear reason to applicants who are denied credit. In an AI context, this intersects with explainability and disparate impact analysis.

Link to Fairness: If a model exhibits disparate impact, the reasons provided in adverse action notices may reveal proxy discrimination or flawed features.
Technical Requirement: Drives the need for explainable AI (XAI) techniques like SHAP or LIME to generate compliant, actionable reasons for denial.
Purpose: Enables applicants to understand and potentially correct issues, promoting fairness and transparency.
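
For a linear scoring model, candidate reasons for a denial can be derived without SHAP or LIME by comparing each feature's contribution against a reference applicant. The sketch below assumes that setup. The weights, feature names, and reference profile are hypothetical, and whether the resulting reasons satisfy regulatory requirements is a legal question this example does not answer.

```python
def reason_codes(weights, applicant, reference, top_n=2):
    """Return the features that lowered the score most versus a reference.

    Contribution of a feature = weight * (applicant value - reference value).
    Only negative contributions are candidate reasons for denial.
    """
    contributions = {
        feature: weights[feature] * (applicant[feature] - reference[feature])
        for feature in weights
    }
    negative = sorted(
        (value, feature) for feature, value in contributions.items() if value < 0
    )
    return [feature for _, feature in negative[:top_n]]


# Hypothetical linear scorecard and applicant profile.
weights = {"credit_score": 0.01, "debt_to_income": -2.0, "years_employed": 0.1}
applicant = {"credit_score": 620, "debt_to_income": 0.45, "years_employed": 1}
reference = {"credit_score": 700, "debt_to_income": 0.30, "years_employed": 5}
reasons = reason_codes(weights, applicant, reference)
print(reasons)
```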

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.
