Design of Experiments (DOE) helps teams learn how inputs affect outputs. However, DOE does not end once the analysis runs. You must verify the model. That step protects your conclusions. Residual analysis plays that role.
Residuals reveal whether the model fits reality. They expose hidden patterns. They also highlight assumption violations. Without residual analysis, results can mislead teams. Even strong effects may appear valid when they are not.
This article explains residual analysis for DOE in detail. You will learn what residuals are. You will also learn how to interpret common residual plots. In addition, you will see how residuals guide model improvement.
- Why residual analysis matters in DOE
- What are residuals in DOE?
- Core assumptions verified by residual analysis
- Linearity and residuals
- Independence and residuals
- Constant variance and residuals
- Normality and residuals
- Common residual plots used in DOE
- Interpreting residuals versus fitted values
- Interpreting residuals versus run order
- Interpreting normal probability plots
- Interpreting histograms of residuals
- Residuals versus factor levels
- Outliers and residual analysis
- Standardized and studentized residuals
- Residuals and leverage
- Residual analysis in factorial designs
- Residual analysis in response surface designs
- Residual analysis in mixture designs
- Improving model fit using residuals
- Example: residual analysis in a two-factor DOE
- Example: variance stabilization using residuals
- Common mistakes in residual analysis
- Practical checklist for residual analysis
- Residual analysis and statistical significance
- Residual analysis and practical significance
- Teaching residual analysis to teams
- Linking residual analysis to continuous improvement
- Conclusion
Why residual analysis matters in DOE
DOE models rely on assumptions. These assumptions support valid statistical tests. When assumptions fail, conclusions weaken. Residual analysis checks those assumptions directly.
Residuals capture what the model does not explain. Each residual equals the difference between an observed response and the predicted value. Therefore, residuals reflect model error.
If the model fits well, residuals behave randomly. If patterns appear, something is wrong. Consequently, residual analysis acts as a diagnostic tool.
Residual analysis helps teams answer key questions.
- Does the model explain the data well?
- Are assumptions reasonable?
- Do outliers distort results?
- Does the model need transformation or expansion?
Ignoring residuals often leads to false confidence. Therefore, residual analysis should follow every DOE.
What are residuals in DOE?
Residuals represent unexplained variation. Each experimental run produces one residual.
The formula looks simple.
Residual = Observed response − Predicted response
Despite the simplicity, residuals carry powerful information. They show how far predictions miss reality. More importantly, they reveal structure in the errors.
Residuals should show no trends. They should not cluster. They should also show constant spread. When these conditions hold, the model assumptions likely hold.
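The formula above can be sketched in a few lines. This is a minimal example with a hypothetical 2×2 factorial plus one center point; the coded settings and responses are invented for illustration.

```python
import numpy as np

# Hypothetical 2^2 factorial with a center point.
# Columns of X: intercept, factor A, factor B (coded units).
X = np.array([
    [1, -1, -1],
    [1,  1, -1],
    [1, -1,  1],
    [1,  1,  1],
    [1,  0,  0],
], dtype=float)
y = np.array([10.2, 14.1, 11.8, 15.9, 13.1])  # observed responses (made up)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares fit of main-effects model
predicted = X @ beta
residuals = y - predicted                     # Residual = Observed - Predicted

print(np.round(residuals, 3))
```

Because the model includes an intercept, the residuals sum to zero by construction; what matters for diagnostics is whether they show any structure.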
Core assumptions verified by residual analysis
DOE models assume several conditions. Residual plots test these assumptions visually. Each assumption connects to a specific plot.
The main assumptions include:
- Linearity
- Independence
- Constant variance
- Normality
Residual analysis checks each one separately. The next sections explain how.
Linearity and residuals
Linearity means the response changes proportionally with factors. In DOE, this applies to main effects and interactions.
Residuals help confirm linearity. If the model misses curvature, residuals show patterns.
Residuals versus fitted values
This plot tests linearity directly. The x-axis shows predicted responses. The y-axis shows residuals.
A good plot looks random. Points scatter evenly above and below zero.
A poor plot shows shapes. Common patterns include curves or waves. These shapes suggest missing terms.
Example: A two-factor DOE models temperature and pressure. The residuals form a U-shaped pattern. That pattern suggests curvature. Adding squared terms may fix the issue.
Independence and residuals
Independence means residuals do not influence each other. In practice, this means one run should not affect another.
Violations often occur due to time effects. Warm-up, tool wear, or operator fatigue can introduce dependence.
Residuals versus run order
This plot places run order on the x-axis. Residuals appear on the y-axis.
Random scatter supports independence. Trends suggest correlation.
Example: Residuals steadily increase over run order. This pattern indicates drift. Randomization or blocking may help.
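A quick numerical check for drift is the correlation between residuals and run order. This sketch uses simulated residuals with an injected drift component; the drift rate and noise level are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20
run_order = np.arange(n)

# Hypothetical residuals with a drift component (e.g., tool wear over the sequence).
residuals = 0.3 * run_order + rng.normal(0, 1, n)

# Correlation between residuals and run order; values near +1 or -1 signal drift.
r = np.corrcoef(run_order, residuals)[0, 1]
print(round(r, 3))
```

A correlation near zero is consistent with independence; a strong correlation like the one here would justify blocking or re-randomization.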
Constant variance and residuals
Constant variance means residuals have similar spread across predictions. This assumption is also called homoscedasticity.
Residuals versus fitted values (again)
The same plot tests variance consistency. Look at the vertical spread.
A healthy plot shows equal spread everywhere. A funnel shape indicates increasing or decreasing variance.
Example: Low predicted values show tight residuals. High predicted values show wide spread. A transformation may stabilize variance.
Normality and residuals
Many DOE tests assume normally distributed errors. Residuals approximate those errors.
Normality affects p-values and confidence intervals. Severe departures can distort conclusions.
Normal probability plot of residuals
This plot orders residuals and compares them to a normal distribution.
A straight line suggests normality. Curvature or heavy tails suggest non-normality.
Example: Residuals curve sharply at the ends. Outliers or skewness may exist. A transformation or robust method may help.
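Most statistics packages build this plot for you; as a sketch, `scipy.stats.probplot` pairs the ordered residuals with theoretical normal quantiles and reports how close the points lie to a straight line. The residuals here are simulated well-behaved values, purely for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
residuals = rng.normal(0, 1, 30)   # hypothetical well-behaved residuals

# probplot returns the plotting positions plus a least-squares line through them;
# a correlation r close to 1 means the points fall near a straight line.
(osm, osr), (slope, intercept, r) = stats.probplot(residuals, dist="norm")
print(round(r, 3))
```

Skewed or heavy-tailed residuals would pull r noticeably below 1 and bend the plotted points away from the line at the ends.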
Common residual plots used in DOE
Residual analysis relies on several standard plots. Each plot answers a specific question.
The table below summarizes them.
| Residual Plot | Purpose | What to Look For |
|---|---|---|
| Residuals vs Fitted | Linearity and variance | Random scatter, equal spread |
| Residuals vs Run Order | Independence | No trends or cycles |
| Normal Probability Plot | Normality | Points near a straight line |
| Histogram of Residuals | Distribution shape | Symmetry and bell shape |
| Residuals vs Factor Levels | Missing terms | No systematic differences |
Using all plots together provides confidence. One plot alone rarely tells the full story.

Interpreting residuals versus fitted values
This plot deserves extra attention. It checks two assumptions at once.
First, it tests linearity. Second, it tests constant variance.
Ideal pattern
- Random scatter
- Centered around zero
- Equal spread across x-axis
Warning signs
- Curved patterns
- Funnel shapes
- Clusters
Each sign suggests a different fix.
Interpreting residuals versus run order
This plot detects hidden time effects. Many engineers skip it. That mistake can cost projects.
Ideal pattern
- Random scatter
- No slope
- No cycles
Warning signs
- Upward or downward trend
- Step changes
- Repeating cycles
These patterns often indicate process drift or setup issues.
Interpreting normal probability plots
Normal probability plots compare residuals to a theoretical normal distribution.
Ideal pattern
- Points follow a straight line
- Minor deviations acceptable
Warning signs
- S-shaped curve
- Heavy tails
- Extreme outliers
Small deviations rarely matter. Large deviations deserve attention.
Interpreting histograms of residuals
Histograms provide a quick visual check. They complement normal probability plots.
Ideal pattern
- Symmetric
- Bell-shaped
- Centered at zero
Warning signs
- Skewed shape
- Multiple peaks
- Long tails
Histograms work best with larger sample sizes.
Residuals versus factor levels
This plot checks whether the model captures factor effects fully.
Residuals appear for each factor level. Ideally, distributions look similar.
Warning signs
- One level shows higher variance
- One level shifts upward or downward
These patterns suggest missing interactions or nonlinear effects.
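A missing interaction shows up exactly this way. In the sketch below (a hypothetical replicated 2×2 factorial with a true A×B interaction), a main-effects-only fit leaves residuals that split into opposite clusters within each factor level.

```python
import numpy as np

# Hypothetical 2^2 factorial replicated twice; the true response has an A*B interaction.
A = np.array([-1, 1, -1, 1] * 2)
B = np.array([-1, -1, 1, 1] * 2)
y = (10 + 2 * A + 1 * B + 3 * A * B).astype(float)

# Fit main effects only, then inspect the residuals.
X = np.column_stack([np.ones_like(A, dtype=float), A, B])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# The residuals carry the unmodeled 3*A*B term: two systematic clusters at +/-3.
print(np.round(resid, 2))
```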
Outliers and residual analysis
Outliers stand out in residual plots. They appear far from zero. They also distort models.
Outliers may come from:
- Measurement errors
- Data entry mistakes
- Unusual process conditions
Residual analysis helps identify them early.
How to handle outliers
Never delete outliers blindly. Instead:
- Investigate the cause
- Verify data accuracy
- Check process logs
- Decide based on evidence
Sometimes outliers reveal important process behavior.
Standardized and studentized residuals
Raw residuals depend on scale. Standardized versions improve interpretation.
Standardized residuals
These divide residuals by their estimated standard deviation.
Values beyond ±2 raise concern. Values beyond ±3 demand investigation.
Studentized residuals
These account for leverage. They provide a stronger diagnostic.
Most DOE software, such as Minitab or JMP, reports studentized residuals by default.
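The two quantities are easy to compute by hand. This sketch uses a hypothetical 2×2 factorial plus a center point whose response is deliberately unusual; internally studentized residuals divide each raw residual by its own estimated standard deviation, which depends on that run's leverage.

```python
import numpy as np

# Hypothetical design matrix (2^2 factorial + center point) and responses.
X = np.array([
    [1, -1, -1],
    [1,  1, -1],
    [1, -1,  1],
    [1,  1,  1],
    [1,  0,  0],
], dtype=float)
y = np.array([10.2, 14.1, 11.8, 15.9, 17.0])   # last run is deliberately unusual

n, p = X.shape
H = X @ np.linalg.inv(X.T @ X) @ X.T           # hat matrix
resid = y - H @ y                              # raw residuals
mse = resid @ resid / (n - p)                  # mean squared error
leverage = np.diag(H)                          # leverage of each run

standardized = resid / np.sqrt(mse)                       # scale by overall sigma-hat
studentized = resid / np.sqrt(mse * (1 - leverage))       # internally studentized

print(np.round(studentized, 2))
```

The unusual center point produces the largest studentized residual, which is the signal these diagnostics are designed to amplify.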
Residuals and leverage
Leverage measures how strongly a run's factor settings influence its own prediction. High-leverage points carry more weight.
Residuals combined with leverage reveal influential runs.
A run with high leverage and large residual deserves attention.
Residual analysis in factorial designs
Factorial designs often show clear patterns, but residual analysis still matters.
Common issues
- Missing interaction terms
- Curvature hidden by two-level factors
- Aliasing confusion
Residual plots often reveal these issues quickly.
Residual analysis in response surface designs
Response surface models include quadratic terms. Residuals help verify whether curvature fits well.
Typical concerns
- Overfitting
- Poor extrapolation
- Unequal variance near boundaries
Residuals versus fitted values often expose these problems.
Residual analysis in mixture designs
Mixture designs impose constraints. Residual interpretation requires care.
Look for patterns related to component proportions. Unequal variance may occur near corners.
Residual plots still apply. Interpretation just requires context.
Improving model fit using residuals
Residual analysis does more than verify assumptions. It guides improvement.
Add missing terms
Curved patterns suggest quadratic terms. Factor-level differences suggest interactions.
Transform the response
Variance issues often disappear after transformation.
Common options include:
- Log transformation
- Square root transformation
- Box-Cox transformation
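The Box-Cox option automates the choice among these. As a sketch, `scipy.stats.boxcox` searches for the power transform that makes the data most nearly normal; the right-skewed responses below are simulated for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Hypothetical right-skewed responses (positive values required for Box-Cox).
y = rng.lognormal(mean=1.0, sigma=0.8, size=40)

# boxcox returns the transformed data and the estimated lambda;
# a lambda near 0 points toward a log transform, near 0.5 toward a square root.
y_bc, lam = stats.boxcox(y)
print(round(lam, 2))
```

After the transform, the skewness of the data should shrink substantially, which is what stabilizes the residual spread.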
Use blocking or randomization
Run-order patterns often improve with blocking or randomization.
Remove noise factors
Unexpected residual patterns may reveal uncontrolled variables.
Example: residual analysis in a two-factor DOE
Consider a DOE studying temperature and time. The response measures defect rate.
The initial model includes main effects only.
Residual findings
- Residuals vs fitted show curvature
- Normal probability plot looks acceptable
- Residuals vs run order show randomness
Action taken
Add a temperature squared term.
Result
Residuals scatter randomly. Model fit improves. R-squared increases.
Residual analysis guided the fix.
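The fix can be verified numerically. This sketch uses a hypothetical coded design for temperature (A) and time (B) with true curvature in temperature; comparing residuals before and after adding the squared term shows the improvement the example describes.

```python
import numpy as np

# Hypothetical face-centered design in coded units: temperature (A), time (B).
A = np.array([-1, 1, -1, 1, -1, 1, 0, 0, 0])
B = np.array([-1, -1, 1, 1, 0, 0, -1, 1, 0])
y = (20 + 3 * A + 2 * B + 4 * A**2).astype(float)  # true response curves in A

def fit_resid(X, y):
    """Return residuals from a least-squares fit of X to y."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

ones = np.ones_like(A, dtype=float)
resid_main = fit_resid(np.column_stack([ones, A, B]), y)        # main effects only
resid_quad = fit_resid(np.column_stack([ones, A, B, A**2]), y)  # plus A^2 term

print(np.abs(resid_main).max(), np.abs(resid_quad).max())
```

The main-effects model leaves large, patterned residuals; with the A² term added they collapse to numerical zero, mirroring the "residuals scatter randomly" outcome above.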
Example: variance stabilization using residuals
A chemical process DOE shows a funnel-shaped residual plot.
Low predicted yields show small variance. High yields show large variance.
Action taken
Apply a log transformation to the response.
Result
Residual spread becomes constant. Normality improves. Conclusions stabilize.
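The stabilization is easy to demonstrate with simulated data. In this sketch, yields carry multiplicative noise, so spread grows with the mean; after a log transform, the spread at low and high yields becomes comparable. The means and noise level are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical yields where spread grows with the mean (multiplicative noise).
mean_yield = np.repeat([10.0, 100.0], 25)
y = mean_yield * rng.lognormal(0, 0.2, 50)

low, high = y[:25], y[25:]
ratio_raw = high.std() / low.std()            # spread ratio before the transform

log_low, log_high = np.log(low), np.log(high)
ratio_log = log_high.std() / log_low.std()    # near 1 after a log transform

print(round(ratio_raw, 1), round(ratio_log, 2))
```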
Common mistakes in residual analysis
Many teams misuse residuals. Avoid these errors.
- Ignoring residual plots
- Overreacting to minor deviations
- Deleting outliers without investigation
- Checking only one plot
- Assuming software guarantees validity
Residual analysis requires judgment, not automation.
Practical checklist for residual analysis
Use this checklist after every DOE.
| Step | Question |
|---|---|
| 1 | Do residuals scatter randomly vs fitted values? |
| 2 | Does variance look constant? |
| 3 | Do residuals show independence over run order? |
| 4 | Do residuals appear roughly normal? |
| 5 | Do any outliers require investigation? |
This checklist prevents common oversights.
Residual analysis and statistical significance
Significant effects mean little if assumptions fail. Residual analysis validates significance.
Poor residual behavior can inflate Type I errors. It can also hide real effects.
Therefore, residuals protect decision quality.
Residual analysis and practical significance
Even when assumptions hold, residuals show model usefulness.
Large residuals indicate poor prediction. Small residuals indicate the model predicts well enough to be practically useful.
Residual plots help teams judge whether a model supports process control.
Teaching residual analysis to teams
Residual plots intimidate beginners. Simple explanations help.
Focus on patterns, not math. Emphasize randomness. Use real examples.
When teams understand residuals, they trust DOE more.
Linking residual analysis to continuous improvement
Residual analysis supports Lean Six Sigma thinking.
It promotes:
- Data discipline
- Root cause thinking
- Evidence-based decisions
Residuals reveal gaps between theory and reality. Closing those gaps drives improvement.
Conclusion
Residual analysis verifies DOE assumptions. It also improves models. Ignoring residuals risks false conclusions.
Good residuals look random. Bad residuals show patterns. Each pattern tells a story.
Use multiple plots together. Investigate outliers carefully. Adjust models based on evidence.
Residual analysis turns DOE from analysis into insight.