Method in practice

  1. Collect process data (e.g., CI/CD build time, API latency, MRR churn).
  2. Compute the mean (μ):
    Example: builds = 10, 12, 11, 9, 10 min → μ = 10.4 min
  3. Compute the standard deviation (σ):
    For the same builds, σ ≈ 1.02 min (population formula, dividing by n)
  4. Build control limits:
    • UCL = μ + 3σ ≈ 13.46 min
    • LCL = μ − 3σ ≈ 7.34 min
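
The four steps can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library. It assumes the population standard deviation (`statistics.pstdev`), which is what reproduces the σ ≈ 1.02 min quoted above; the variable names are illustrative.

```python
import statistics

# Build times in minutes, taken from the example above.
builds = [10, 12, 11, 9, 10]

mu = statistics.mean(builds)       # step 2: mean
sigma = statistics.pstdev(builds)  # step 3: population standard deviation (divides by n)

# Step 4: control limits at three standard deviations either side of the mean.
ucl = mu + 3 * sigma
lcl = mu - 3 * sigma

print(f"mean={mu:.2f}  sigma={sigma:.2f}  UCL={ucl:.2f}  LCL={lcl:.2f}")
# mean=10.40  sigma=1.02  UCL=13.46  LCL=7.34
```

With the sample formula (`statistics.stdev`, dividing by n − 1), σ comes out near 1.14 min and the limits widen slightly; with only five points the choice of formula is noticeable.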

What to do with normal variation (inside limits)

  • Don’t panic. This is expected noise, not a problem.
  • Keep recording the data. Watch for sharp changes in the mean and σ.
  • Optimize systematically. If the mean drifts or σ keeps growing, improve the whole process rather than reacting to single points.
  • Automate monitoring. Alert only on out-of-control signals — the team won’t be distracted by noise.
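
As one way to automate the last point, the sketch below raises a signal only for readings outside the ±3σ limits and stays silent for everything else. The function name `out_of_control` and its parameters are hypothetical. The readings, the mean of 200 ms and σ = 10 ms are taken from the API latency example later in this article.

```python
def out_of_control(readings, mean, sigma, k=3.0):
    """Return (sample_number, value) pairs that fall outside mean ± k·sigma.

    Hypothetical helper for illustration; sample numbers start at 1.
    """
    ucl = mean + k * sigma
    lcl = mean - k * sigma
    return [(i, x) for i, x in enumerate(readings, start=1) if x > ucl or x < lcl]


# Latency samples in ms, as plotted in the API latency chart.
latency_ms = [226, 174, 212, 184, 221, 178, 209, 195,
              228, 182, 206, 217, 171, 224, 205, 265]

signals = out_of_control(latency_ms, mean=200, sigma=10)
print(signals)  # [(16, 265)]
```

In a setup like this, only the contents of `signals` would be forwarded to an alerting channel; the other fifteen readings would be recorded but not escalated.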

Avoid

  • Reacting to every spike that stays within limits.
  • Filing a “bug” in Jira or touching code when the value is within the control limits.
  • Calling panic meetings over single data points within limits.

Examples

1. SaaS / Business

  • Average MRR churn = 5%, σ = 0.4%
  • UCL = 6.2%, LCL = 3.8%
  • 5.9% → normal → keep observing
  • 8.2% → anomaly → investigate the cause (8σ above the mean, far outside ±3σ)
[Chart: MRR churn control chart. x-axis: samples 1 to 16; y-axis: churn, %. Chart labels: UCL 6.9%, mean 5.1%, LCL 3.2%. Values: 4.2, 5.6, 5.3, 4.4, 5.8, 4.6, 5.5, 4.9, 5.7, 4.3, 5.1, 5.4, 3.9, 6.0, 5.2, 8.2 (point 16, out of control).]
Sample churn over time with mean = 5%, σ = 0.4% (UCL = 6.2%, LCL = 3.8%).

2. Engineering / CI/CD

  • Average build time = 10 min, σ = 0.5 min
  • UCL = 11.5 min, LCL = 8.5 min
  • 11 min → normal → nothing to fix
  • 13.2 min → anomaly → find the specific cause (6.4σ above the mean, far outside ±3σ)
[Chart: CI/CD build time control chart. x-axis: samples 1 to 16; y-axis: build time, min. Chart labels: UCL 12.7 min, mean 10.0 min, LCL 7.2 min. Values: 11.2, 9.1, 10.8, 9.2, 11.3, 9.5, 8.8, 10.6, 9.0, 10.2, 9.7, 11.0, 8.7, 10.9, 9.3, 13.2 (point 16, out of control).]
Sample build times with mean = 10 min, σ = 0.5 min (UCL = 11.5, LCL = 8.5).

3. DevOps / API latency

  • Average latency = 200 ms, σ = 10 ms
  • UCL = 230 ms, LCL = 170 ms
  • 220 ms → normal → keep monitoring
  • 265 ms → anomaly → investigate (6.5σ above the mean, far outside ±3σ)
[Chart: API latency control chart. x-axis: samples 1 to 16; y-axis: latency, ms. Chart labels: UCL 260 ms, mean 202 ms, LCL 144 ms. Values: 226, 174, 212, 184, 221, 178, 209, 195, 228, 182, 206, 217, 171, 224, 205, 265 (point 16, out of control).]
Sample latency with mean = 200 ms, σ = 10 ms (UCL = 230, LCL = 170).
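
The three examples apply the same check with different numbers. A small sketch, assuming a hypothetical helper `sigma_distance`, makes the arithmetic explicit by expressing each reading as a number of standard deviations from the mean:

```python
def sigma_distance(value, mean, sigma):
    """How many standard deviations `value` lies from `mean` (signed)."""
    return (value - mean) / sigma


# (label, mean, sigma, readings) taken from the three examples above.
examples = [
    ("MRR churn, %",    5.0,   0.4,  [5.9, 8.2]),
    ("Build time, min", 10.0,  0.5,  [11.0, 13.2]),
    ("API latency, ms", 200.0, 10.0, [220.0, 265.0]),
]

for label, mean, sigma, readings in examples:
    for value in readings:
        z = sigma_distance(value, mean, sigma)
        # Same rule in every domain: beyond three sigma is an anomaly.
        verdict = "anomaly" if abs(z) > 3 else "normal"
        print(f"{label}: {value} is {z:+.2f} sigma from the mean -> {verdict}")
```

The in-limit readings sit 2.25σ, 2σ and 2σ from the mean, while the anomalies sit 8σ, 6.4σ and 6.5σ away, so none of these cases is close to the ±3σ boundary.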

Note: how to handle outliers (±3σ)

In process control and Shewhart statistics, values outside ±3σ (a band 6σ wide, counting both sides of the mean) are usually considered anomalies that do not reflect normal process variation. Therefore:

  • When building control limits, do not include such outliers in the calculation of the mean (μ) and σ, otherwise the limits will shift and become less sensitive to real anomalies.
  • Exception: if outliers repeat frequently, this is a signal that the process has changed — include them when analyzing the new “system variation”.

Standard practice:

  1. First compute preliminary μ and σ on historical data without obvious outliers.
  2. Build the limits.
  3. Any value outside the limits → anomaly → investigate, and do not use it to recompute μ and σ until you understand the cause.
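
The effect of leaving an outlier in the baseline can be checked on the sixteen build times plotted in the CI/CD chart above. The sketch below is illustrative: it assumes the last sample (13.2 min) was judged an obvious outlier by eye, it uses the population standard deviation, and the helper name `limits` is hypothetical.

```python
import statistics


def limits(samples, k=3.0):
    """Return (lcl, mean, ucl) using the population standard deviation."""
    mu = statistics.mean(samples)
    sigma = statistics.pstdev(samples)
    return mu - k * sigma, mu, mu + k * sigma


# Build times in minutes from the CI/CD chart; the last one is the suspect point.
builds = [11.2, 9.1, 10.8, 9.2, 11.3, 9.5, 8.8, 10.6,
          9.0, 10.2, 9.7, 11.0, 8.7, 10.9, 9.3, 13.2]
suspect = builds[-1]

# Steps 1-2: preliminary limits from history without the obvious outlier.
lcl_clean, mu_clean, ucl_clean = limits(builds[:-1])
# For comparison: limits if the outlier is left in the baseline.
lcl_all, mu_all, ucl_all = limits(builds)

print(f"without outlier: LCL={lcl_clean:.1f}  mean={mu_clean:.1f}  UCL={ucl_clean:.1f}")
print(f"with outlier:    LCL={lcl_all:.1f}  mean={mu_all:.1f}  UCL={ucl_all:.1f}")

# Step 3: the suspect point is flagged only by the clean limits.
print(suspect > ucl_clean)  # True
print(suspect > ucl_all)    # False
```

With the outlier included, the upper limit moves from about 12.7 min to about 13.7 min, so the 13.2 min build would pass as normal noise. This is the loss of sensitivity described in the first bullet of this note.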

Tip: treat values inside the limits as normal noise, and outliers as an “alarm signal” — they shouldn’t distort the chart.

Why this matters

Without control limits teams tend to:

  • Fix noise, wasting energy on normal fluctuations.
  • Miss true anomalies.
  • Burn out from constant alert fatigue.
  • Slow product progress by deferring systemic improvements.

With control limits teams can:

  • Respond only to real problems.
  • Preserve focus and energy.
  • Trust metrics for decision-making.

Conclusion

Shewhart control limits are a tool for calm and effectiveness. Knowing the ±3σ boundaries helps you stop “fixing noise” and start improving the process systematically.