Whitepaper · Updated April 2026 · 11 min read

Charting the Progress of System Development Using Defect Data

Four defect-data charts — opened/closed, closure period, root cause, and subsystem breakdown — that turn raw bug data into a dashboard leadership can actually use. Interpretation rules included.

Defect Metrics · Test Dashboard · Bug Tracking · Project Management · Quality Metrics

Companion paper · pairs with the "Charting Defect Data" talk

Bug reports are more useful in the aggregate than they are one at a time. Four simple charts distill raw bug tracker data into a leadership dashboard that answers the questions project teams actually care about: Are we ready to ship? Is the bug process working? Where are our worst problems? What should we fix about how we build software?

Read time: ~10 minutes. Written for test managers, engineering leaders, and project managers who need defensible quality signals.

Why charts, not tables

A stack of bug reports is a bad dashboard. It doesn't scale, leadership can't read it, and the signal gets lost in the noise. Charting the same data surfaces three things:

  1. Patterns. Where bugs cluster, how fast they're moving, how the process is behaving.
  2. Inflection points. Moments where something changed — a milestone, a scope expansion, a scrub meeting.
  3. Conversation anchors. A two-minute chart walkthrough gets more attention and better decisions than a sixty-minute bug review.

The four charts below do ninety percent of the job for most projects. They're cheap to build, they scale, and they've been battle-tested on hundreds of programs. Everything in this paper has been updated from the original methodology to reflect modern tooling (any bug tracker plus a spreadsheet or notebook works — no dependency on spreadsheet-specific tooling) and modern delivery models (continuous delivery changes the shape of the curves but not the interpretation rules).

The data you need

Every chart in this paper requires just four fields from the bug tracker:

  • Opened date — when the report was filed.
  • Closed date — when the bug was resolved (fixed, deferred, or closed as not-a-bug).
  • Root cause — classification of the underlying defect type.
  • Affected subsystem — which part of the product the bug is in.

If your bug tracker has these fields and can export to CSV, you can build these charts in any spreadsheet or analysis notebook. Jira, Linear, GitHub Issues, and Azure DevOps all export them.
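A minimal loading sketch, assuming pandas; the column names (`opened`, `closed`, `root_cause`, `subsystem`) and sample rows are illustrative, so rename them to match your tracker's export:

```python
# Load the four required fields from a bug-tracker CSV export.
# Column names are hypothetical -- rename to match your export.
import io

import pandas as pd

# Inline stand-in for a real export file, e.g. pd.read_csv("bugs.csv", ...).
csv_text = """opened,closed,root_cause,subsystem
2026-01-05,2026-01-08,Logic / algorithm,Checkout / payments
2026-01-06,,Requirements / spec,Auth / identity
2026-01-07,2026-01-12,Integration / API contract,Checkout / payments
"""

bugs = pd.read_csv(io.StringIO(csv_text), parse_dates=["opened", "closed"])

# Open bugs have a missing closed date (NaT) -- keep them; Chart 1 needs them.
open_count = bugs["closed"].isna().sum()
```

Everything that follows works off this one DataFrame.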

One honest caveat: bug reports approximate bugs, not bugs themselves. Duplicates, not-a-bug closures, and test-system failures inflate the count. Typical inflation across hardware and software programs is roughly 26–32%. Factor that into the absolute numbers; the shapes of the curves are not affected.
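Applying that inflation band to a raw count is simple arithmetic (the 1,000-report figure below is illustrative):

```python
# Deflate a raw report count by the typical 26-32% inflation band
# (duplicates, not-a-bug closures, test-system failures).
raw_reports = 1000
inflation_low, inflation_high = 0.26, 0.32

estimated_defects = (
    round(raw_reports * (1 - inflation_high)),  # pessimistic end
    round(raw_reports * (1 - inflation_low)),   # optimistic end
)
# Roughly 680 to 740 real defects behind 1,000 reports.
```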

Chart 1 — Opened / Closed

The most important chart. Plots daily and cumulative counts of bugs opened and bugs closed against the project timeline.


Cumulative bugs opened and closed

A healthy test window — opened curve flattens before release; closed curve converges.

[Figure: cumulative bug reports (0 to 1,000) across weeks W1 to W14, with a scope-review annotation; two curves: Opened and Closed.]

Vertical gap between curves is open-bug backlog. Watch it close as release approaches.

What it tells you

Are we ready to ship? The cumulative-opened curve eventually flattens as the test system exhausts its ability to find bugs in the current phase. An asymptote indicates the program has reached diminishing returns and — for a well-designed test system — that most customer-affecting bugs have been found. A cumulative-opened curve that refuses to flatten indicates significant remaining defect population.

Is engineering closing the bug backlog? As development winds down and effort shifts from finding bugs to fixing them, the cumulative-closed curve converges with the cumulative-opened curve. The vertical gap between them is the "quality gap": the open-bug backlog. A closing gap is a healthy sign. A gap that stays flat or widens is a problem.

How do project milestones relate to bug activity? Phase transitions often cause spikes in the opened curve. Bug scrub meetings cause discontinuous jumps in the closed curve (bulk deferrals or bulk fixes). Stage-gate rituals show up visibly.

Three pathology patterns

The idealized chart is a pair of curves that rise together, flatten together near the release date, and close the gap. Three bad patterns to recognize:

Endless bug discovery. The opened curve doesn't flatten; seemingly level periods turn out to be temporary lulls. Shipping on schedule means shipping with unknown quality. Decision: delay, descope, or invest more heavily in fix capacity.

Ignored reports. Opened and closed curves run parallel at a visibly widening gap as the project approaches release. Engineering or product has decided some set of bugs "don't count" without socializing that decision. If nothing changes, the deferred bugs escape. Management must either ratify the deferrals (accepting the quality tradeoff) or override them (accepting schedule impact).

Chaotic bug management. Both curves are jagged because of batched reporting, delayed closure confirmations, or inconsistent process. The chart is unreadable because the process is ad-hoc. Fix the process before reading the data.

How to build it

In any bug tracker export: for each date in the test window, count bugs opened and bugs closed on that date. Add running totals. Plot date vs. cumulative opened and cumulative closed.
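A sketch of that counting step in pandas, on toy data (dates and column names are illustrative):

```python
# Chart 1: daily and cumulative opened/closed counts over the test window.
import pandas as pd

bugs = pd.DataFrame({
    "opened": pd.to_datetime(["2026-01-05", "2026-01-05", "2026-01-06", "2026-01-08"]),
    "closed": pd.to_datetime(["2026-01-07", None, "2026-01-08", None]),
})

# One row per calendar day in the test window.
window = pd.date_range("2026-01-05", "2026-01-08", freq="D")

daily = pd.DataFrame({
    "opened": bugs["opened"].value_counts().reindex(window, fill_value=0),
    "closed": bugs["closed"].value_counts().reindex(window, fill_value=0),
})

cumulative = daily.cumsum()                            # the two curves to plot
backlog = cumulative["opened"] - cumulative["closed"]  # the "quality gap"
```

`value_counts` silently drops the open bugs' missing closed dates, which is exactly what the closed curve needs.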

Chart 2 — Closure Period

Closure period is the average time between a bug being opened and closed. Plot both the daily and rolling (cumulative to date) averages.


Daily and rolling closure period

Stable and acceptable — daily variance is bounded, rolling average is flat, both curves inside the plan's upper and lower bounds.

[Figure: days to close (0 to 10) across weeks W1 to W12; curves: Daily and Rolling; horizontal lines mark the plan's upper and lower SLA bounds.]

Lower bound is not a typo — fixing bugs 'too fast' usually means mass-deferring them, which leaves the bug in the product.

What it tells you

How responsive is engineering? A low, stable closure period indicates a smoothly-running fix process. A trending-up rolling closure period indicates the fix queue is slowing — either because the team is attacking harder bugs late or because fix capacity is constrained.

Interpretation rules

Two lenses: stability and acceptability.

Stable — low variance day-to-day, rolling curve with near-constant slope near zero, daily values fluctuating randomly within a few days of the rolling average. Stability implies a process under control.

Acceptable — both curves fall within the upper and lower bounds set by the project plan. Counterintuitively, there is a lower bound. Deferring every bug the day it opens pulls daily closure to zero, but the bugs are still in the product. Acceptability is about honest turnaround, not minimal turnaround.

A stable, acceptable chart with a level or mildly-downward trend is the healthy state. A chart that becomes unstable late in the project — jumpy daily values, increasing rolling average — usually indicates that either the fix queue has become saturated or the harder bugs have been left to last.

How to build it

For each closed bug, calculate closed_date − opened_date. For each day, average the closure periods of bugs closed on that day. For the rolling curve, average all bugs closed through that date. Plot both against the date axis.
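The same steps in pandas, on toy data (dates are illustrative):

```python
# Chart 2: daily and rolling (cumulative-to-date) closure period.
import pandas as pd

bugs = pd.DataFrame({
    "opened": pd.to_datetime(["2026-01-01", "2026-01-02", "2026-01-03", "2026-01-05"]),
    "closed": pd.to_datetime(["2026-01-04", "2026-01-04", "2026-01-08", None]),
})

# Only closed bugs have a closure period; sort by close date for the rolling curve.
closed = bugs.dropna(subset=["closed"]).sort_values("closed").copy()
closed["days_to_close"] = (closed["closed"] - closed["opened"]).dt.days

# Daily curve: average closure period of bugs closed on each day.
daily_avg = closed.groupby("closed")["days_to_close"].mean()

# Rolling curve: average over all bugs closed through each point.
rolling_avg = closed["days_to_close"].expanding().mean()
```

Note that the still-open bug contributes nothing yet, which is the in-progress-data caveat discussed later.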

Chart 3 — Root Cause Breakdown

A pie or bar chart of the root-cause classifications for closed bugs. Root cause matters in aggregate, not individually.


Root-cause distribution at release

Closed bugs classified by root cause — a modern program showing the addition of third-party-dependency and model-behavior categories.

  • Logic / algorithm: 112
  • Integration / API contract: 94
  • Requirements / spec: 78
  • Build / deploy / config: 65
  • Data handling: 52
  • Third-party dependency: 41
  • Interface / UI: 38
  • AI / model behavior: 18
  • Performance / scalability: 12
  • Security: 5

Short-term use: course-correct on the current project — too many requirements bugs means more spec clarity; too many integration bugs means harden the contract tests. Long-term use: compare with prior releases to validate that process changes are working.

What it tells you

What kinds of mistakes are we making? Most programs cluster in a few categories: requirements gaps, integration failures, logic errors, configuration issues, and, increasingly in modern programs, schema drift, third-party API changes, and model-behavior anomalies.

Short-term and long-term uses

Short-term: course-correct inside the current project. Many bugs arising from unclear requirements? Invest in specification clarity before the remaining features are built. Many integration bugs? Harden the contract tests. Many config issues? Fix the deployment pipeline.

Long-term: every project contributes to a historical baseline. Comparing this release's root-cause distribution with prior releases reveals whether process changes are working.

Categories

Use your tracker's categories, define your own, or map to an industry-standard taxonomy (IEEE 1044, IBM ODC). A standard taxonomy has the advantage of enabling benchmarking against published industry data. Typical categories in modern programs:

  • Requirements / specification
  • Design / architecture
  • Logic / algorithm
  • Data handling
  • Interface / UI
  • Integration / API contract
  • Build / deployment / configuration
  • Performance / scalability
  • Security
  • Test system / test infrastructure
  • Third-party dependency
  • AI / model behavior (where applicable)

How to build it

Count closed bugs per root-cause category. Chart as a pie (if four or fewer large categories) or bar (if more).
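In pandas this is a one-liner over the closed bugs (category values below are illustrative):

```python
# Chart 3: closed-bug counts per root-cause category.
import pandas as pd

closed_bugs = pd.DataFrame({"root_cause": [
    "Logic / algorithm", "Logic / algorithm", "Logic / algorithm",
    "Integration / API contract", "Requirements / spec",
]})

counts = closed_bugs["root_cause"].value_counts()  # sorted largest-first
# counts.plot.pie() for four or fewer large categories, counts.plot.bar() otherwise.
```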

Chart 4 — Subsystem Breakdown

A Pareto chart of bugs per affected subsystem, sorted from most to fewest.


Bugs per subsystem — Pareto view

The top two subsystems account for over 50% of all bugs found. This is typical.

  • Checkout / payments: 148 (30% cumulative)
  • Auth / identity: 112 (52%)
  • Search: 74 (67%)
  • Catalog: 52 (78%)
  • Recommendations: 38 (85%)
  • Reporting: 29 (91%)
  • Admin console: 21 (95%)
  • Notifications: 15 (98%)
  • Settings: 8 (100%)

The cumulative-percent line reaches ~52% after the top two subsystems. Worth investing in design review, code review rigor, and targeted testing for checkout and auth before the next release.

What it tells you

Which subsystems are the noisiest? In most programs, two or three subsystems account for the majority of bugs found. The Pareto distribution is remarkably consistent — on a representative sample of programs the top two subsystems typically account for 50–75% of all bugs.

How to use it

Process improvement. If two subsystems dominate the bug count, those are the places to invest in design review, code review rigor, refactoring, or additional testing.

Test effort allocation. Counterintuitive but well-supported by data: where you find many bugs, you will find more bugs. If you were testing all subsystems evenly but two subsystems produce most of the defects, shift effort toward them. The exception: if field data shows escape rates are disproportionately concentrated in the low-bug-count subsystems, something in your test system is blind to that surface.

Scope management. Persistent high-bug subsystems sometimes indicate that the requirements or design are fundamentally broken in that area. Bug-fix-only responses won't close the gap — the subsystem needs redesign.

How to build it

Count bugs per subsystem. Sort descending. Add a cumulative-percent overlay for full Pareto treatment. Plot as a bar chart.
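The count-sort-accumulate sequence in pandas, on toy data (subsystem names are illustrative):

```python
# Chart 4: Pareto view of bugs per subsystem with cumulative-percent overlay.
import pandas as pd

bugs = pd.DataFrame({"subsystem": ["Checkout"] * 6 + ["Auth"] * 3 + ["Search"]})

counts = bugs["subsystem"].value_counts()       # already sorted descending
cum_pct = 100 * counts.cumsum() / counts.sum()  # overlay line for the Pareto chart
```

Plot `counts` as bars and `cum_pct` as a line on a secondary axis for the full Pareto treatment.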

Points, pitfalls, and honest caveats

Don't trust the charts blindly. Every one of these charts can be spoofed, intentionally or unintentionally:

  • The cumulative opened curve flattens if you stop testing.
  • The gap between opened and closed can be forced closed by mass-deferring bugs.
  • Closure period can be gamed by recording phony opened/closed dates, or by opening new reports instead of reopening confirmed-fail bugs.
  • Root cause and subsystem breakdowns are worse than useless if the classifications are filled in carelessly.

Only honest data yields honest charts. If the underlying bug tracking process is sloppy (see the Bug Reporting Process paper), fix that first.

In-progress data is incomplete. Closure period and root cause both depend on closed bugs. When many bugs are still open, these charts under-represent reality. The opened/closed chart and the subsystem chart are less affected, since they include all reported bugs. Be cautious about making major process decisions mid-flight on data from charts that depend on closed reports.

Continuous delivery changes the shapes, not the interpretation. In programs that ship continuously rather than in discrete releases, the charts are typically windowed (trailing 30 or 90 days) rather than cumulative. The interpretation rules still apply — you're just reading windowed versions.
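One way to compute a windowed version, assuming a pandas Series of daily opened counts (the flat toy series is illustrative):

```python
# Trailing-30-day opened count for continuous delivery, instead of cumulative.
import pandas as pd

# Stand-in for real daily opened counts: one bug per day for 120 days.
daily_opened = pd.Series(1, index=pd.date_range("2026-01-01", periods=120, freq="D"))

# Time-offset window: each point sums the preceding 30 calendar days.
windowed = daily_opened.rolling("30D").sum()
```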

Use these as starters, not as the final set. These four charts handle most programs. Augment when specific risks warrant: defect density by LOC for code-quality programs, defect discovery rate by test technique for test-technique evaluation, escape rate by subsystem for production-quality tracking. Resist the temptation to clutter the dashboard; the whole point is focused attention.

Building the habit

A dashboard only works if it's maintained. Practical discipline:

  • Weekly review. Walk the four charts at the project status meeting. Treat unexpected curve shapes as open questions, not as incidents.
  • Data hygiene in the tracker. Root cause and subsystem fields filled in at close time, not left blank. Dates accurate. Reopens handled as reopens, not as new reports.
  • Comparison across projects. Keep the historical data. Comparing this release to the last three — same organization, same team — is more informative than comparing to industry averages.
  • Honest conversations. The charts are only as useful as the conversations they provoke. Bad patterns require names, owners, and a decision.

Where this fits

Charting defect data is one piece of the full test-status-reporting picture. The bug reporting process that feeds it is covered in the Bug Reporting Process paper. The talk version of this paper — which walks through each chart with more visual examples — is at Charting Defect Data. The quality-risk-analysis discipline that makes the subsystem chart decisions actionable is at Quality Risk Analysis Process.



Rex Black, Inc.

Enterprise technology consulting · Dallas, Texas
