R-squared (R²), or the coefficient of determination, is a statistical measure that quantifies the proportion of the variance in a dependent variable that is predictable from one or more independent variables in a regression model. It usually yields a single score between 0 and 1, where 0 indicates the model explains none of the target's variability and 1 indicates it explains all of it; the score can fall below 0 when a model fits worse than simply predicting the mean. This metric is foundational for model benchmarking suites and assessing the explanatory power of linear models, serving as a key indicator in Evaluation-Driven Development.
R-squared (Coefficient of Determination)

What is R-squared (Coefficient of Determination)?
R-squared is a core statistical measure for evaluating regression models, quantifying how well independent variables explain the variance in the dependent variable.
While a higher R-squared generally indicates a better fit, it has critical limitations: it does not indicate whether the regression model is biased, and it can be artificially inflated by adding irrelevant predictors. For this reason, it is often analyzed alongside other regression metrics like Mean Squared Error (MSE) and Root Mean Squared Error (RMSE). In machine learning, adjusted R-squared is preferred for multiple regression because it penalizes model complexity, providing a more reliable measure for feature selection and helping flag overfitting when runs are compared during experiment tracking.
Key Interpretations and Characteristics
R-squared quantifies the proportion of variance in a dependent variable explained by a regression model's independent variables. Its interpretation is nuanced, depending on model type, data structure, and the presence of bias.
Definition and Core Calculation
R-squared is defined as the proportion of the variance in the dependent variable (y) that is predictable from the independent variable(s) (X). It is calculated as:
R² = 1 - (SS_res / SS_tot)
- SS_res (Sum of Squares Residual): The sum of squared differences between observed values and model-predicted values.
- SS_tot (Total Sum of Squares): The sum of squared differences between observed values and the mean of the dependent variable.
A value of 1 indicates perfect prediction, while 0 indicates the model explains none of the variance around the mean.
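The formula above can be sketched in a few lines of plain Python. The helper name `r_squared` and the sample values are illustrative assumptions, not part of any library.

```python
def r_squared(y_true, y_pred):
    """Return R^2 = 1 - SS_res / SS_tot for two equal-length sequences."""
    mean_y = sum(y_true) / len(y_true)
    # SS_res: squared differences between observed and predicted values
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # SS_tot: squared differences between observed values and their mean
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


# Illustrative data (made up for this sketch)
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.5, 8.5]

r2 = r_squared(y_true, y_pred)              # SS_res = 1, SS_tot = 20
r2_mean = r_squared(y_true, [6.0] * 4)      # always predicting the mean
r2_perfect = r_squared(y_true, y_true)      # predictions equal observations
```

Here `r2` works out to 0.95, predicting the mean everywhere gives 0, and perfect predictions give 1.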
Interpretation in Linear Regression
In ordinary least squares (OLS) linear regression, R-squared has a clear, bounded interpretation:
- Explained Variance: Directly represents the fraction of total variance 'explained' by the linear model.
- Goodness-of-Fit: A higher R-squared indicates a better fit of the model to the data.
- Caveat: It does not indicate whether:
  - The independent variables are causally related to the dependent variable.
  - The model is correctly specified (e.g., omitting a key variable).
  - The coefficient estimates are unbiased.
It is a descriptive, not a causal, measure of fit.
Limitations and Common Misconceptions
R-squared is frequently misinterpreted. Key limitations include:
- Non-Comparative Across Datasets: A high R-squared on data with high inherent variance is not comparable to a lower R-squared on data with low variance.
- Sensitivity to Outliers: A single outlier can artificially inflate or deflate R-squared.
- No Indication of Bias: A model can have a high R-squared but produce systematically biased predictions (poor calibration).
- Never Decreases with Added Predictors: Adding any variable, even random noise, will never decrease the in-sample R-squared in OLS, which encourages overfitting. This is addressed by the Adjusted R-squared.
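The last point can be checked numerically. The sketch below assumes NumPy is available and uses synthetic data; `ols_r2` is a hypothetical helper that fits OLS with an intercept through `numpy.linalg.lstsq` and returns the in-sample R-squared.

```python
import numpy as np


def ols_r2(X, y):
    """Fit OLS with an intercept by least squares; return in-sample R^2."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1 - ss_res / ss_tot


rng = np.random.default_rng(0)       # fixed seed so the sketch is repeatable
n = 50
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)     # synthetic target driven by x
noise = rng.normal(size=n)           # unrelated to y by construction

r2_base = ols_r2(x.reshape(-1, 1), y)
r2_with_noise = ols_r2(np.column_stack([x, noise]), y)

# The second model nests the first, so its in-sample R^2 cannot be lower.
print(f"base: {r2_base:.4f}  with noise column: {r2_with_noise:.4f}")
```

The increase from the noise column is usually small, but it is never negative on the training data, which is why raw R-squared is a poor guide for choosing between models of different size.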
Adjusted R-squared
Adjusted R-squared penalizes the addition of non-informative predictors to counteract overfitting. It is calculated as:
Adjusted R² = 1 - [(1 - R²) * (n - 1) / (n - k - 1)]
- n: Number of observations.
- k: Number of independent variables.
Unlike standard R-squared, Adjusted R-squared can decrease when a new predictor adds less explanatory power than expected by chance, providing a more reliable metric for model comparison, especially with multiple predictors.
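As a worked illustration of the formula, the sketch below compares two hypothetical models fit to the same 50 observations. The function name `adjusted_r_squared` and the R-squared values are assumptions chosen for the example.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical models on the same dataset (n = 50)
adj_small = adjusted_r_squared(0.80, n=50, k=3)    # 3 predictors
adj_large = adjusted_r_squared(0.81, n=50, k=10)   # 10 predictors

print(f"3 predictors:  {adj_small:.4f}")
print(f"10 predictors: {adj_large:.4f}")
```

The larger model has the higher raw R-squared (0.81 vs. 0.80) but the lower adjusted value (about 0.761 vs. 0.787), because seven extra predictors bought only one point of explained variance.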
R-squared in Non-Linear and Machine Learning Models
For non-linear models (e.g., polynomial regression, decision trees, neural networks), the interpretation of R-squared changes:
- It remains a measure of explained variance but loses its direct connection to OLS properties.
- It can be negative for models that fit worse than a simple horizontal line (the mean). This occurs when SS_res > SS_tot.
- In machine learning, it is often called the coefficient of determination or R² score (sklearn.metrics.r2_score). It is a useful metric for regression tasks but should be evaluated alongside Mean Squared Error (MSE) or Mean Absolute Error (MAE) to understand error magnitude.
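A small example makes the negative case concrete. This sketch uses plain Python and made-up values. scikit-learn's `r2_score(y_true, y_pred)` follows the same 1 - SS_res / SS_tot definition, so it should agree on this input, but that call is not run here.

```python
def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot; not clipped at zero."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [5.0, 4.0, 3.0, 2.0, 1.0]   # predictions trend the wrong way

# SS_tot = 10, SS_res = 40, so SS_res > SS_tot and R^2 = 1 - 4 = -3
r2_negative = r_squared(y_true, y_pred)
print(r2_negative)
```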
Practical Guidelines for Use
When evaluating R-squared in practice:
- Context is Critical: An R-squared of 0.7 may be excellent in social sciences (high noise) but poor in physics experiments.
- Use with Other Metrics: Always pair with residual analysis, MSE/MAE, and prediction error plots to diagnose model flaws.
- Focus on Out-of-Sample Performance: A high training R-squared with a low validation R-squared signals overfitting.
- Prioritize Adjusted R-squared for multiple regression to compare models with different numbers of features.
- Remember its Domain: It is a variance-based metric; for probabilistic or classification-calibrated regression, also consider Log Loss or Brier Score.
R-squared vs. Other Regression Metrics
A comparison of R-squared with other core regression evaluation metrics, highlighting their calculation, interpretation, and primary use cases.
| Metric | R-squared (R²) | Adjusted R-squared | Mean Squared Error (MSE) | Mean Absolute Error (MAE) |
|---|---|---|---|---|
| Core Definition | Proportion of variance in the dependent variable explained by the model. | R² adjusted for the number of predictors, penalizing model complexity. | Average of squared differences between predicted and actual values. | Average of absolute differences between predicted and actual values. |
| Formula | 1 - (SS_res / SS_tot) | 1 - [(1 - R²)(n - 1) / (n - k - 1)] | (1/n) * Σ(y_i - ŷ_i)² | (1/n) * Σ\|y_i - ŷ_i\| |
| Value Range | 0 to 1 (or 0% to 100%) for OLS with an intercept on training data; can be negative when the model fits worse than the mean | ≤ R²; can be negative | 0 to ∞ | 0 to ∞ |
| Interpretation | Higher is better. 1 = perfect fit, 0 = no better than predicting the mean. | Higher is better. Comparable across models with different numbers of predictors fit to the same data. | Lower is better. Heavily penalizes large errors (squared term). | Lower is better. Linear penalty, more interpretable in original units. |
| Primary Use Case | Explanatory power & model fit assessment. | Model selection when comparing models with different numbers of features. | Optimization target (loss function) during training; useful when large errors are especially costly. | Interpretable error reporting; more robust to outliers than MSE. |
| Unit of Measurement | Unitless (proportion). | Unitless (proportion). | Squared units of the target variable. | Same units as the target variable. |
| Penalizes Model Complexity? | No | Yes | No | No |
| Sensitive to Outliers? | Yes | Yes | Yes (squared errors) | Less so (linear penalty) |
Limitations, Caveats, and Adjusted R-squared
While R-squared is a foundational regression metric, its interpretation requires careful consideration of model specification and complexity. This section details its critical limitations and introduces Adjusted R-squared as a more robust alternative.
R-squared has significant limitations that can mislead model evaluation. It always increases or stays the same when adding more predictors, even irrelevant ones, creating a false sense of improvement. This makes it unsuitable for comparing models with different numbers of features. Furthermore, a high R-squared does not imply causation, correct model specification, or the absence of bias. It is also sensitive to outliers and provides no information about prediction error magnitude on new data.
Adjusted R-squared addresses the flaw of automatic inflation by penalizing the addition of non-informative predictors. It adjusts the standard R-squared value based on the number of predictors (k) and sample size (n). Unlike R-squared, Adjusted R-squared can decrease when a new feature fails to improve the model sufficiently, providing a more honest assessment of generalization capability. It is the preferred metric for feature selection and comparing the explanatory power of models with differing complexities within the same dataset.
Frequently Asked Questions
Essential questions and answers about the R-squared (Coefficient of Determination) metric, a core statistic for evaluating regression models.
R-squared, or the Coefficient of Determination, is a statistical measure that quantifies the proportion of the variance in the dependent variable that is predictable from the independent variables in a regression model. It is calculated using the formula: R² = 1 - (SS_res / SS_tot), where SS_res is the sum of squares of residuals (the variation left unexplained by the model) and SS_tot is the total sum of squares (the total variation in the dependent variable). A value of 1 indicates the model explains all the variability of the response data, while a value of 0 indicates it explains none. Because the result is unitless, it is convenient for comparing models fit to the same dataset; comparisons across datasets with different inherent variance should be made with caution.
Related Terms
R-squared is a core regression metric, but its interpretation is nuanced. These related concepts are essential for a complete evaluation of a model's performance and limitations.
Adjusted R-squared
Adjusted R-squared modifies the standard R-squared to account for the number of predictors in a model. Unlike R-squared, which always increases with added variables, Adjusted R-squared penalizes model complexity, increasing only if a new predictor improves the model more than would be expected by chance.
- Key Formula: 1 - [(1 - R²)(n - 1)/(n - k - 1)], where n is sample size and k is the number of independent variables.
- Primary Use: Comparing models with different numbers of predictors to prevent overfitting. It is the preferred metric for multiple regression model selection.
- Interpretation: A model with a higher Adjusted R-squared than another is generally considered to have a better specification, balancing fit and parsimony.
Mean Squared Error (MSE)
Mean Squared Error is a fundamental regression loss function that measures the average of the squares of the errors, i.e., the average squared difference between the estimated values and the actual values. It is the quantity that ordinary least squares regression directly minimizes.
- Calculation: MSE = (1/n) * Σ(actual - predicted)².
- Relationship to R-squared: R-squared can be written in terms of MSE as 1 - (MSE of model / Variance of target), where the variance is the population variance (dividing by n) computed on the same data. While R-squared is a relative, unitless measure of fit, MSE is an absolute measure of error in the units of the target variable squared.
- Key Property: It heavily penalizes large errors due to the squaring operation, making it sensitive to outliers.
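The link between MSE and R-squared can be verified on a small made-up example. The sketch assumes the variance of the target is the population variance (dividing by n), which is what `statistics.pvariance` computes. With the sample variance (dividing by n - 1) the two sides would not match exactly.

```python
from statistics import pvariance

# Illustrative values (made up for this sketch)
y_true = [10.0, 12.0, 15.0, 19.0]
y_pred = [11.0, 12.0, 14.0, 20.0]
n = len(y_true)

# MSE = (1/n) * sum of squared errors
mse = sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n

# R^2 via MSE and the population variance of the target
r2_from_mse = 1 - mse / pvariance(y_true)

# R^2 via the sums-of-squares definition, for comparison
mean_y = sum(y_true) / n
ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
ss_tot = sum((y - mean_y) ** 2 for y in y_true)
r2_direct = 1 - ss_res / ss_tot

print(mse, r2_from_mse, r2_direct)
```

Both routes give the same value because dividing SS_res and SS_tot by the same n leaves their ratio unchanged.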
Root Mean Squared Error (RMSE)
Root Mean Squared Error is the square root of the Mean Squared Error. It is one of the most commonly used regression metrics because it is expressed in the same units as the target variable, making it more interpretable than MSE.
- Calculation: RMSE = √(MSE).
- Interpretation: It can be read as a "typical" magnitude of error. For example, an RMSE of 5 for a house price model in thousands of dollars means predictions are typically off by about $5,000.
- Comparison to R-squared: While R-squared explains the proportion of variance, RMSE provides a direct, tangible measure of prediction error. A model can have a high R-squared but still have a practically large RMSE if the underlying variance of the data is high.
F-statistic (Regression)
The Regression F-statistic tests the overall significance of a linear regression model. It assesses whether there is a linear relationship between any of the independent variables and the dependent variable, as a group.
- Null Hypothesis: All regression coefficients are equal to zero (the model explains no variance).
- Relationship to R-squared: The F-statistic is directly related to R-squared: F = (R² / k) / ((1 - R²) / (n - k - 1)). For fixed n and k, a higher R-squared produces a higher F-statistic, making rejection of the null hypothesis more likely.
- Usage: It is typically reported in the summary of regression output. A significant F-statistic (typically p-value < 0.05) suggests the observed fit is unlikely to be due to chance alone; without it, the R-squared value should be interpreted with caution.
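The relationship between R-squared and the F-statistic can be sketched directly from the formula. The function name `f_statistic` and the input values are illustrative assumptions.

```python
def f_statistic(r2, n, k):
    """Overall regression F = (R^2 / k) / ((1 - R^2) / (n - k - 1))."""
    if not 0 <= r2 < 1:
        raise ValueError("r2 must be in [0, 1) for this formula")
    if n - k - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return (r2 / k) / ((1 - r2) / (n - k - 1))


# Hypothetical model: R^2 = 0.5 with k = 2 predictors and n = 23 observations
f_value = f_statistic(0.5, n=23, k=2)   # (0.5 / 2) / (0.5 / 20) = 10
print(f_value)
```

Turning `f_value` into a p-value requires the F distribution with (k, n - k - 1) degrees of freedom, for example `scipy.stats.f.sf(f_value, k, n - k - 1)` if SciPy is available.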
Residual Analysis
Residual Analysis is the examination of the errors (residuals = actual - predicted) left after fitting a regression model. It is a critical diagnostic tool that validates the assumptions underlying R-squared and linear regression.
Key checks performed include:
- Independence: Residuals should not be correlated with each other (no autocorrelation).
- Homoscedasticity: The variance of residuals should be constant across all levels of the predicted value.
- Normality: Residuals should be approximately normally distributed, especially for inference on coefficients.
- Linearity: The relationship between variables should be linear; patterns in a residual vs. fitted plot indicate non-linearity.
A high R-squared value is misleading if residual analysis reveals violated assumptions, as the model's inferences and predictions become unreliable.
Coefficient of Determination (General Definition)
In the broadest statistical sense, the Coefficient of Determination is a measure of how well observed outcomes are replicated by a model, based on the proportion of total variation of outcomes explained by the model. R-squared is its specific implementation for linear regression.
- General Principle: It compares the variance of the model's errors to the variance of the data itself.
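- Beyond OLS: The concept extends to other models:
  - Logistic Regression: Pseudo R-squared measures (e.g., McFadden's) are likelihood-based analogues rather than literal fractions of explained variance.
  - Nonlinear Models: Sometimes computed as the square of the correlation between observed and predicted values. This matches 1 - (SS_res / SS_tot) for OLS with an intercept on its training data, but the two can differ otherwise, and the squared correlation can never be negative.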
- Core Interpretation: Regardless of the model, it answers: "What percentage of the movement in the target variable can be accounted for by the model's inputs?" This makes it a universally sought-after, if sometimes misapplied, indicator of model utility.
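The squared correlation between observed and predicted values and the 1 - (SS_res / SS_tot) definition can give very different answers when predictions are systematically biased. The sketch below uses made-up values in which every prediction is exactly twice the observation; the helper names are hypothetical.

```python
def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [2.0, 4.0, 6.0, 8.0]    # perfectly correlated, but biased upward

squared_corr = pearson_r(y_true, y_pred) ** 2   # correlation ignores scale and offset
r2 = r_squared(y_true, y_pred)                  # SS_res = 30, SS_tot = 5

print(squared_corr, r2)
```

The squared correlation is 1 because the predictions track the observations perfectly in direction, while the sums-of-squares definition gives -5 because the predictions are far from the actual values. When reporting a coefficient of determination outside OLS, it helps to state which definition was used.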

About the author
Prasad Kumkar
CEO & MD, Inference Systems
Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.
His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.