A carbon footprint baseline quantifies the greenhouse gas emissions from your AI model training and inference workloads. This involves systematically collecting energy consumption data from cloud providers (AWS, GCP, Azure) and on-premises clusters, then applying regional carbon intensity factors to calculate Scope 2 emissions. This baseline serves as the single source of truth against which all future efficiency improvements and regulatory disclosures are measured.
Setting Up a Carbon Footprint Baseline for Your AI Portfolio

Establishing a defensible carbon baseline is the critical first step in managing and disclosing the environmental impact of your AI systems.
To build this baseline, you must instrument your MLOps pipelines to capture hardware utilization metrics, integrate with cloud carbon footprint APIs, and establish a consistent methodology for data aggregation. This process, detailed in our guide on How to Architect an AI Lifecycle Energy Monitoring System, creates the foundational dataset required for advanced initiatives like AI Energy Scoring and comprehensive ESG reporting.
Key Concepts: Carbon Accounting for AI
Before you can reduce emissions, you must measure them. These concepts establish the core principles and practical steps for calculating a defensible carbon baseline for your AI model portfolio.
Define Your AI Portfolio Boundary
Establish the scope of your measurement. A clear boundary prevents double-counting and ensures a consistent baseline.
- In-scope workloads: Model training, fine-tuning, and inference (batch & real-time).
- Out-of-scope activities: General data center overhead, employee laptops, and non-AI R&D compute.
- Key decision: Will you measure per-project, per-model, or per-business unit? Start with your most carbon-intensive training jobs and high-traffic inference endpoints.
Collect Energy Consumption Data
Energy use (kWh) is the primary input for carbon calculation. Data sources vary by infrastructure.
- Cloud Providers: Use native tools like AWS Customer Carbon Footprint Tool, Google Cloud Carbon Footprint, or Microsoft Emissions Impact Dashboard. Export the data through their APIs or export features where available.
- On-Premises/Colocation: Instrument servers with hardware sensors (e.g., Intel RAPL, NVIDIA DCGM) or PDUs. Collect and aggregate the readings with a tool like Prometheus, and visualize them in Grafana.
- Critical step: Isolate energy for AI-specific workloads from general IT infrastructure using tags, labels, or dedicated hardware.
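
For the on-premises path, sensor readings arrive as power samples that must be integrated over time to get energy. The following is a minimal sketch, assuming the samples (watts with Unix timestamps) have already been scraped from a sensor or PDU; `PowerSample` and `energy_kwh` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PowerSample:
    timestamp_s: float  # Unix time in seconds
    watts: float        # instantaneous power draw


def energy_kwh(samples: Sequence[PowerSample]) -> float:
    """Integrate power over time with the trapezoidal rule and return kWh."""
    ordered = sorted(samples, key=lambda s: s.timestamp_s)
    joules = 0.0
    for prev, cur in zip(ordered, ordered[1:]):
        dt = cur.timestamp_s - prev.timestamp_s
        joules += 0.5 * (prev.watts + cur.watts) * dt  # W * s = J
    return joules / 3.6e6  # 1 kWh = 3.6e6 J


# One hour at a constant 300 W, sampled every 60 seconds (made-up data)
samples = [PowerSample(t, 300.0) for t in range(0, 3601, 60)]
print(f"{energy_kwh(samples):.3f} kWh")  # prints "0.300 kWh"
```

Sorting by timestamp first makes the integration tolerant of out-of-order scrapes; gaps in the data are simply bridged linearly, which may or may not be acceptable for your audit trail.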
Apply Carbon Intensity Factors
Convert energy (kWh) to carbon emissions (kgCO2e) using grid carbon intensity. This is your Scope 2 emissions calculation.
- Location-based method: Uses the average grid intensity of the region where your compute runs (e.g., 0.233 kgCO2e/kWh for US West; actual factors vary by grid region, year, and data source). This is the most common and defensible approach.
- Market-based method: Uses the emission factors of the specific energy contracts, such as renewable energy purchases, that you or your provider hold. This can show a lower footprint but is less standardized.
- Tool: Use databases like the IEA or EPA's eGRID, or APIs from tools like Electricity Maps or WattTime.
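
The difference between the two methods can be sketched in a few lines. This is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the split into "contracted" and "residual" energy is one simple way to model the market-based method, and every factor in the example is a placeholder rather than published data.

```python
def location_based_kg(energy_kwh: float, grid_kg_per_kwh: float) -> float:
    """Scope 2, location-based: energy times the average grid intensity."""
    return energy_kwh * grid_kg_per_kwh


def market_based_kg(
    energy_kwh: float,
    contracted_kwh: float,
    contract_kg_per_kwh: float,
    residual_kg_per_kwh: float,
) -> float:
    """Scope 2, market-based (simplified): energy covered by contracts uses
    the contract factor; the remainder uses a residual factor."""
    covered = min(energy_kwh, contracted_kwh)
    uncovered = energy_kwh - covered
    return covered * contract_kg_per_kwh + uncovered * residual_kg_per_kwh


# Placeholder inputs: 1,000 kWh, of which 600 kWh is covered by a contract
energy = 1_000.0
print(location_based_kg(energy, grid_kg_per_kwh=0.40))
print(market_based_kg(energy, contracted_kwh=600.0,
                      contract_kg_per_kwh=0.0, residual_kg_per_kwh=0.45))
```

Keeping both functions side by side makes it easy to report the two figures together and to see how much of any reduction comes from contracts rather than from lower energy use.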
Calculate a Per-Model Baseline
Aggregate data to create a footprint for individual models, which is essential for tracking improvement.
- Training Baseline: Total kWh for the training job × regional carbon intensity.
- Inference Baseline: (Average energy per query in Wh × number of queries) ÷ 1,000 to get kWh, then × regional carbon intensity to get CO2e. Requires inference server instrumentation.
- Record key metadata: Model architecture, dataset size, hardware type (GPU model), and cloud region. This context is vital for year-over-year comparison and understanding the levers for reduction.
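
The two baselines and their metadata can be kept together in one record. The sketch below assumes per-query energy is measured in watt-hours and that a single regional factor applies to both training and inference; the `ModelBaseline` class, its field names, and all example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelBaseline:
    # Metadata recorded alongside the numbers for later comparison
    model_name: str
    architecture: str
    dataset_size_gb: float
    gpu_model: str
    cloud_region: str
    # Measured inputs
    training_kwh: float
    wh_per_query: float
    query_count: int
    grid_kg_per_kwh: float

    @property
    def training_kg(self) -> float:
        """Training baseline: total kWh times regional carbon intensity."""
        return self.training_kwh * self.grid_kg_per_kwh

    @property
    def inference_kwh(self) -> float:
        """Per-query Wh times query count, converted from Wh to kWh."""
        return self.wh_per_query * self.query_count / 1000.0

    @property
    def inference_kg(self) -> float:
        return self.inference_kwh * self.grid_kg_per_kwh


# Made-up example values
baseline = ModelBaseline(
    model_name="example-classifier", architecture="transformer",
    dataset_size_gb=250.0, gpu_model="A100", cloud_region="region-a",
    training_kwh=12_000.0, wh_per_query=0.5, query_count=2_000_000,
    grid_kg_per_kwh=0.40,
)
print(baseline.training_kg, baseline.inference_kwh, baseline.inference_kg)
```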
Document Methodology & Assumptions
A baseline is only as strong as its documentation. This creates an audit trail for internal governance and external disclosure.
- Document: Data sources, calculation formulas, carbon intensity values used, and any allocation methods for shared resources.
- Assumptions: Clearly state any estimates (e.g., for embodied hardware carbon) and their justification.
- This documentation is the foundation for your AI environmental disclosure and aligns with frameworks like the Partnership on AI's ML Sustainability Code.
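
One way to keep this documentation auditable is to store it as a structured record next to the numbers it describes. This is a minimal sketch; the `MethodologyRecord` class and its fields are hypothetical and mirror the items listed above rather than any standard schema, and the example values are placeholders.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class MethodologyRecord:
    data_sources: List[str]
    formulas: List[str]
    carbon_intensity_kg_per_kwh: Dict[str, float]  # keyed by region
    allocation_method: str                         # for shared resources
    assumptions: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize with stable key order so diffs between versions are clean."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


record = MethodologyRecord(
    data_sources=["cloud provider carbon export", "on-prem PDU readings"],
    formulas=["kg CO2e = kWh x regional factor (kg CO2e/kWh)"],
    carbon_intensity_kg_per_kwh={"region-a": 0.40},  # placeholder value
    allocation_method="shared nodes split by GPU-hours",
    assumptions=["embodied hardware carbon is estimated, not measured"],
)
print(record.to_json())
```

Committing the serialized record to version control alongside the baseline gives reviewers a dated history of every factor and assumption.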
Step 1: Inventory Your AI Workloads and Infrastructure
Before you can measure or reduce your AI carbon footprint, you must first know what you're running. This step establishes a comprehensive inventory of all AI model training and inference workloads across your organization's infrastructure.
Start by cataloging every active AI workload. For each, document the model architecture, training duration, inference request volume, and primary infrastructure (e.g., AWS p4d.24xlarge, Azure ND A100 v4, on-prem NVIDIA DGX). This inventory is your source of truth for all subsequent calculations. Use infrastructure-as-code repositories, cloud billing dashboards, and MLOps platforms like MLflow or Weights & Biases to automate discovery and ensure no shadow AI projects are missed.
Next, map each workload to its physical and logical location. Record the cloud region or data center and the specific hardware accelerators (GPU/TPU type and count). This granularity is essential for applying accurate regional carbon intensity factors later. This complete inventory forms the defensible baseline required for tracking reduction progress and is the first prerequisite for our guide on How to Architect an AI Lifecycle Energy Monitoring System.
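
The inventory described above can start as a simple typed record per workload. The sketch below is illustrative: the `Workload` fields and the `group_by_region` helper are hypothetical names, and the example entries (including the on-premises site label) are made up.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Workload:
    name: str
    kind: str            # "training", "fine-tuning", or "inference"
    architecture: str
    infrastructure: str  # instance type or on-prem system
    region: str          # cloud region or data center
    accelerator: str
    accelerator_count: int


def group_by_region(workloads: Iterable[Workload]) -> Dict[str, List[Workload]]:
    """Group workloads by location so a regional factor can be applied later."""
    grouped: Dict[str, List[Workload]] = defaultdict(list)
    for w in workloads:
        grouped[w.region].append(w)
    return dict(grouped)


inventory = [
    Workload("ranker-train", "training", "transformer",
             "AWS p4d.24xlarge", "us-east-1", "A100", 8),
    Workload("ranker-serve", "inference", "transformer",
             "AWS p4d.24xlarge", "us-east-1", "A100", 8),
    Workload("vision-train", "training", "cnn",
             "on-prem NVIDIA DGX", "onprem-site-1", "A100", 8),
]
for region, items in group_by_region(inventory).items():
    print(region, [w.name for w in items])
```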
Cloud Provider Carbon Tools Comparison
A comparison of the native tools and APIs provided by major cloud platforms for collecting the energy and carbon data required to establish your AI portfolio baseline.
| Feature / Metric | AWS Customer Carbon Footprint Tool | Google Cloud Carbon Footprint | Microsoft Azure Emissions Impact Dashboard |
|---|---|---|---|
| API Access | | | |
| Granularity (Minimum) | Monthly by service | Monthly by project & region | Monthly by service & region |
| Scope 2 Methodology | Location-based | Location-based | Location-based |
| PUE & Grid Intensity Data | Provided (region-specific) | Provided (region-specific) | Provided (region-specific) |
| Machine Learning / AI Service Tagging | Via Cost Explorer tags | Via labels on resources | Via resource tags |
| Historical Data Depth | Up to 36 months | Up to 24 months | Up to 12 months |
| Data Export Formats | CSV, via API | CSV, BigQuery, via API | CSV, Power BI, via API |
| Integration with CodeCarbon / Other Tools | Manual data import required | Manual data import required | Manual data import required |
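
Because each tool exports monthly data as CSV, isolating the AI share usually comes down to filtering rows by tag or label. The sketch below assumes a simplified export; the column names (`month`, `tag_workload`, `kg_co2e`) and the `ai-` tag prefix are hypothetical and do not match any provider's actual schema, so map them to the real export before use.

```python
import csv
import io
from collections import defaultdict
from typing import Dict

# Made-up sample in a hypothetical export layout
SAMPLE_EXPORT = """month,service,region,tag_workload,kg_co2e
2024-01,compute,region-a,ai-training,120.5
2024-01,storage,region-a,,10.0
2024-02,compute,region-a,ai-inference,80.0
2024-02,compute,region-b,web,33.0
"""


def ai_emissions_by_month(csv_text: str, ai_tag_prefix: str = "ai-") -> Dict[str, float]:
    """Sum emissions per month for rows whose workload tag marks them as AI."""
    totals: Dict[str, float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["tag_workload"].startswith(ai_tag_prefix):
            totals[row["month"]] += float(row["kg_co2e"])
    return dict(totals)


print(ai_emissions_by_month(SAMPLE_EXPORT))
# prints {'2024-01': 120.5, '2024-02': 80.0}
```

Untagged rows are excluded here, which is why consistent tagging matters: anything left unlabeled silently drops out of the AI baseline.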
Step 3: Apply Regional Carbon Intensity Factors
Convert your measured energy consumption into a carbon footprint by applying location-specific emission factors. This step is critical for calculating accurate Scope 2 emissions for your AI portfolio.
A regional carbon intensity factor is a value, typically expressed in grams of CO₂ equivalent per kilowatt-hour (gCO₂e/kWh), that represents the average emissions of the electricity grid in a specific geographic area. Applying this factor transforms your raw energy data (kWh) into a carbon footprint (gCO₂e). You must source these factors from authoritative databases like the International Energy Agency (IEA) or your national grid operator. For cloud workloads, use the specific factors published by your provider (e.g., AWS Customer Carbon Footprint Tool, Google Cloud Carbon Footprint).
To apply the factor, multiply a workload's total energy consumption by the factor for its compute region: Carbon Emissions (gCO₂e) = Energy Use (kWh) × Carbon Intensity (gCO₂e/kWh). Divide by 1,000 to report in kgCO₂e. This calculation yields your Scope 2 emissions from purchased electricity. Embodied hardware carbon falls under Scope 3 and must be estimated separately if you want a complete baseline. Tools like CodeCarbon can automate the energy tracking and Scope 2 estimate within your training and inference pipelines.
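
The multiplication above can be applied across a portfolio with a per-region lookup. This is a minimal sketch: the function names are hypothetical, and the factors and workloads are placeholder values, not data from the IEA or any provider.

```python
from typing import List, Mapping, Tuple


def emissions_g(energy_kwh: float, region: str,
                factors_g_per_kwh: Mapping[str, float]) -> float:
    """Carbon Emissions (gCO2e) = Energy Use (kWh) x Carbon Intensity (gCO2e/kWh)."""
    if region not in factors_g_per_kwh:
        # Fail loudly rather than fall back to a default factor
        raise KeyError(f"no carbon intensity factor for region {region!r}")
    return energy_kwh * factors_g_per_kwh[region]


def portfolio_kg(workloads: List[Tuple[str, float, str]],
                 factors_g_per_kwh: Mapping[str, float]) -> float:
    """Sum emissions over (name, kWh, region) tuples and convert g to kg."""
    total_g = sum(emissions_g(kwh, region, factors_g_per_kwh)
                  for _name, kwh, region in workloads)
    return total_g / 1000.0


# Placeholder factors and workloads
FACTORS = {"region-a": 400.0, "region-b": 50.0}
WORKLOADS = [("train-job", 2_500.0, "region-a"), ("serve-api", 800.0, "region-b")]
print(f"{portfolio_kg(WORKLOADS, FACTORS):.1f} kg CO2e")  # prints "1040.0 kg CO2e"
```

Raising on an unknown region is a deliberate choice in this sketch: a silent default would hide inventory gaps and weaken the baseline's defensibility.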
Common Mistakes
Establishing a carbon footprint baseline for your AI portfolio is foundational for sustainability reporting and cost optimization. Avoid these common technical and procedural errors that undermine data accuracy and defensibility.
Inconsistent baselines stem from incomplete system boundaries and variable measurement periods. You must define what's included: training jobs, inference endpoints, data storage, and idle resources. A common mistake is measuring only GPU time while ignoring CPU, memory, and storage energy from supporting services. Furthermore, using different time windows (e.g., one month of summer data vs. one month of winter data) skews comparisons due to seasonal changes in carbon intensity and cooling loads. Standardize on a full quarter or year for a representative sample.
Fix: Use a tool like CodeCarbon or cloud-native solutions (AWS Customer Carbon Footprint Tool, Google Cloud Carbon Footprint) to apply the same instrumentation across all workloads. Define and document your measurement boundaries and measurement window in a data collection policy.
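
Both mistakes can be caught with a simple pre-flight check before a baseline is accepted. The sketch below is one possible form: the required component set, the 90-day threshold (roughly one quarter), and the `baseline_issues` helper are assumptions chosen for illustration.

```python
from datetime import date
from typing import Iterable, List, Tuple

REQUIRED_COMPONENTS = frozenset({"gpu", "cpu", "memory", "storage"})
MIN_WINDOW_DAYS = 90  # roughly one quarter


def baseline_issues(components: Iterable[str],
                    window: Tuple[date, date]) -> List[str]:
    """Return human-readable problems with a proposed baseline measurement."""
    issues: List[str] = []
    missing = REQUIRED_COMPONENTS - set(components)
    if missing:
        issues.append("unmeasured components: " + ", ".join(sorted(missing)))
    start, end = window
    days = (end - start).days
    if days < MIN_WINDOW_DAYS:
        issues.append(f"window is {days} days; use at least {MIN_WINDOW_DAYS}")
    return issues


# GPU-only measurement over a single summer month
for issue in baseline_issues(["gpu"], (date(2024, 7, 1), date(2024, 7, 31))):
    print(issue)
```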

About the author
Prasad Kumkar
CEO & MD, Inference Systems
Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.
His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.