Sample Defect Cost Calculations: Why tests are worth it…

Sample Defect Cost Calculation

The cost calculation is not simple, and many assumptions and estimations are needed. The following definitions, terms, and formulas are a first step towards understanding the costs related to defects and testing. The results serve as an example, and maybe as an argument, for strong and tight testing.

Let us define first the Defect Detection Percentage (DDP):

DDP = (bugs detected) / (bugs present) = (test bugs) / (test bugs + production bugs)

The defect detection percentage is a quite simple formula: it is just the ratio of the bugs found before delivery to the total number of bugs in the code, which is unknown. The real insight comes as soon as the ratio is rewritten as the ratio of the bugs found during testing to the sum of the found bugs and the bugs delivered into production. This formula shows a simple truth in a very simple manner: all bugs which are not found during testing go directly into production. The first ratio is too abstract to get this point straight with the right emphasis, but it is really important: bugs in production at the customer's site destroy reputation, decrease future revenue due to lost customers, and in the worst case bring in calls for liability.
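As a quick illustration, the ratio can be computed with a tiny helper; the function name and the numbers are made up for this example:

```python
def defect_detection_percentage(test_bugs, production_bugs):
    """DDP: share of all bugs that were caught before delivery."""
    return test_bugs / (test_bugs + production_bugs)

# Illustrative numbers: 45 bugs caught in test, 5 escaped to production.
print(defect_detection_percentage(45, 5))  # 0.9, i.e. 90% caught before delivery
```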

The relationship between the detection percentage for all found bugs and for critical bugs should under all circumstances be:

DDP(all bugs) < DDP(critical bugs)

Or in other words: the final goal should be a much better detection percentage for critical bugs than for all bugs or for minor bugs. In most cases it is not very critical to have a button in a UI at the wrong position, but a crashing application, or an application which cannot keep a database connection open, is critical.

The consequence is: we should put more time and effort into testing for critical and blocking defects than into testing for minor issues. I have seen this in the real world already: the focus and effort on the UI was remarkable, because it was obvious and everybody could see it (especially one's own management and the customer's management), but the performance was not sufficient and the stability was not as good as needed. Still, the product was shiny and looked beautiful…

Have a look at the post about the quality characteristics: https://rickrainerludwig.wordpress.com/2014/02/10/software-quality-charateristics-in-iso-9126. What is critical and what is not may change from project to project. Deep thought is needed for the weighting, but the weighting should lead the testing priorities.

The possible costs for defects (see Beautiful Testing: http://astore.amazon.de/pureso-21/detail/B002TWIVOY) can be classified as:

  1. Cost of detection: All costs for testing and accompanying actions. Even if no bugs are found, there are costs which need to be taken into account. For example, performing a quality risk analysis, setting up the test environment, and creating test data are activities to be taken into account. They all incur costs of detection, and there are many more such tasks. These are the permanent costs on a per-time basis.
  2. Cost of internal failure: The testing and development costs which are incurred purely because bugs are found. For example, filing bug reports, adding new tests for the found bugs, fixing bugs, confirmation testing of bug fixes, and regression testing are activities that incur costs of internal failure. These are the actual costs per bug.
  3. Cost of external failure: The support, testing, development, and other costs that are incurred because the delivered software was not 100% bug-free. For example, costs for technical support, help desk organization, and sustaining engineering teams are costs of external failure. These costs are much higher: they are at least as high as the internal failure costs, but there are additional costs for external communication, containment, and external support.

Let’s define the Average Cost of a Test Bug (ACTB) with the knowledge from above:

ACTB = (cost of detection + cost of internal failure) / (test bugs) [$/Bug]

There is also the definition of the Average Cost of a Production Bug (ACPB):

ACPB = (cost of external failure) / (production bugs) [$/Bug]

With both definitions from above, a Test Return on Investment can be defined:

TestROI = ( ( ACPB – ACTB) x test bugs) / (cost of detection) [dimensionless]

Example Test ROI Calculation

Let us assume we have 5 testers in a test team who cost about 50 k$ per month in total (salary and all other costs). Let us further assume they find an average of 50 bugs per month, and customers find 5 bugs per month. The production bug cost is assumed to be 5 k$ (5 people for a week for customer support, bug fixing, and negotiations), and the testing bug cost 1 k$ (1 developer for a week). We have the following numbers:

ACTB = (50 k$/month + 50 bug/month * 1k$/bug) / (50 bug/month) = (100 k$/month) / (50 bug/month) = 2k$/bug

ACPB = (5 bug/month * 5 k$/bug) / (5 bug/month) = 5 k$/bug (which is exactly what we defined)

TestROI = ((5 k$/bug – 2k$/bug) * 50 bug/month) / (50 k$/month) = 3
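The arithmetic above can be double-checked with a short script; all names and figures are just the example numbers from the text, with costs in k$ per month:

```python
# All figures are the example numbers from the text; costs in k$ per month.
cost_of_detection = 50.0      # test team: 5 testers, salary and all other costs
test_bugs = 50                # bugs found by the test team per month
internal_cost_per_bug = 1.0   # 1 developer for a week
production_bugs = 5           # bugs found by customers per month
external_cost_per_bug = 5.0   # support, bug fixing, negotiations

actb = (cost_of_detection + test_bugs * internal_cost_per_bug) / test_bugs
acpb = (production_bugs * external_cost_per_bug) / production_bugs
test_roi = ((acpb - actb) * test_bugs) / cost_of_detection

print(actb, acpb, test_roi)  # 2.0 5.0 3.0
```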

This simple example shows that testing might be economically meaningful…

Automated and Manual Regression Tests

Here are some simple formulas for automated testing. It is sometimes difficult to communicate the positive impact of automated testing, because the initial costs for hardware, the operational costs, and the costs for developing automated tests are significant. But the following formulas give an impression of the benefit.

Let us define the Regression Test Automation ratio:

RTA = (automated regression tests) / (manual tests + automated regression tests)

The Regression Risk Coverage is shown here to complete the set of formulas:

RRC = (regression risk covered) / (regression risks identified)

The RRC is used as a measure of confidence in the automated tests. The higher the RRC, the higher the confidence that no critical bugs are delivered.

To calculate the benefit of automated tests, the Acceleration of Regression Testing can be used:

ART = (manual regression test duration – automated regression test duration) / (manual regression test duration)
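The three ratios can be sketched in Python; the numbers are illustrative and not from any real project:

```python
def rta(automated, manual):
    """Regression Test Automation: share of regression tests that are automated."""
    return automated / (manual + automated)

def rrc(covered, identified):
    """Regression Risk Coverage: share of identified regression risks covered."""
    return covered / identified

def art(manual_duration, automated_duration):
    """Acceleration of Regression Testing: relative time saved per run."""
    return (manual_duration - automated_duration) / manual_duration

print(rta(300, 100))  # 0.75: three quarters of the regression tests are automated
print(rrc(8, 10))     # 0.8: 8 of 10 identified risks are covered
print(art(40, 2))     # 0.95: a 40 h manual run shrinks to a 2 h automated run
```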

Automated tests bring a twofold benefit:

  1. Development and test engineers are relieved of repetitive work, which engineers are in most cases not very good at anyway. Engineers are better placed in development projects and in failure analysis, which brings more economic benefit. Every hour saved on manual testing can be better spent on functionality development or quality improvement.
  2. Automated tests can be run at any time, even during the night. Continuous Integration (CI) and Continuous Delivery (CD) approaches exploit this by triggering tests automatically, for example after each code change in the source code management tool. Bugs can thus be found and fixed very quickly.

The costs can be lower, too. If the testing environment costs 1 k$/month, that is about a tenth of a developer (using the numbers above), but it runs 24/7 and executes far more tests per unit of time than any developer can. It is a real bargain.

Summary

Even though this is only a short post with some defect cost calculations, it shows that testing is economically meaningful. The costs for testing, such as developers who develop automated tests, dedicated testers who do manual tests, hardware, and QA staff who manage testing and assure quality compliance, seem very high. But taking into account that the costs for defects increase significantly the later they are found in the product life cycle, testing brings a huge benefit.

Thoughts on the Agile Manifesto

From time to time, I discuss the agile methodology with clients and friends. The Agile Manifesto was published at http://agilemanifesto.org about 12 years ago. It was debated a lot, and the debates are still going on. I try to present here a small insight I have had over the last years.

The main statement is:

“Manifesto for Agile Software Development

We are uncovering better ways of developing
software by doing it and helping others do it.
Through this work we have come to value:

Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan

That is, while there is value in the items on
the right, we value the items on the left more.”

Interestingly, the focus is shifted from the product, its documentation, and the technical process (the planning) to customer focus and customer satisfaction. This is also part of another system called Total Quality Management (http://en.wikipedia.org/wiki/Total_quality_management). The focus shift is very obvious and necessary when we think about the only income source each company has: the customer. The customer (or different customers, if an organization has different services to offer) is the only source of income and therefore of revenue, profit, and growth. Any revenue increase can only happen when customers pay more for services or products, or when more customers are willing to spend money on the company's services or products. It is therefore obvious that the focus needs to be on the customer and that the organization needs to be aligned to meet the needs and expectations of customers.

That’s why the first point about ‘Individuals and interactions’ is the most important one. Translated into simple actions it means: identify your customers, treat them individually, and implement processes for easy communication and interaction. Only customers can tell you what they need, what they expect, and what they are ready to pay for. Individual customers treated well will tell you in more detail what they need and will bring new business ideas. Ask a group of people and there is no detail; but ask a single individual and listen closely, and you might get a lot of insights.

In software development, the main purpose of organizations is to deliver working software and systems. That is the primary need of their customers. They do not need a fancy manual telling them what the software might be able to do after they spend hours reading it and trying things out in tutorials; they need software which brings business value. That’s why the second point is important: have running, valuable software which is self-explanatory, and the customer will be willing to pay for it. You can save a tree by dropping the printed documentation. Have a look at Apple products: how many manuals are sold with this complex and feature-rich software? Some online help and a self-explanatory UI, and everything is fine. This is one of the fundamentals of Total Quality Management: only the customer can tell you what she wants, and she is also the one who pays. Is there a better way to put that knowledge to work?

The third point is an enhancement of the first point. If the chance is there, try to work closely together with your customer. In Scrum and XP this is done with short release cycles and demos: the customer is shown progress on a regular basis after each release cycle and is asked for criticism, comments, and new ideas. It helps to deliver software which is valuable for the customer and which, therefore, is paid for. An even better idea is to embed a representative of the customer into the development team. The responses are immediate, and the customer's acceptance testing is done the whole time. The possibility of developing software the customer does not want to pay for is reduced dramatically. And again: the customer pays for the product. There is no better way to make a product than to build it together with your customer, and when the customer is part of the team, she is even more engaged and willing to help with development. At the very end, the willingness to pay is much higher when the product was, in a way, custom-built.

By doing all this, be prepared: with each demo, feedback session, and communication with the customer, there may be new ideas, comments, and criticism. The requirements are likely to change on a daily basis. That’s what the fourth value is about: be open to change. Customers only have a vague idea at the beginning of what they want. During development, more ideas arise, some faulty ideas are dropped, and new wishes pop up. That’s quite normal and part of the process. It helps to make the product better in the end, and the business value is increased. A product like that can even be priced higher. What is better than that?

The Curse of Cp and Cpk

In factory automation, the calculation of Cp (process capability index) and Cpk (a measure of actual process capability) values is a good tool to monitor and prove performance and quality. I will not give the full theory for the calculation of Cp and Cpk and all the math behind it. I want to show some issues from my professional experience which arise in understanding and also in expectations.

Some Theory for Recapitulation

For a given process, measurements are performed, and for a single parameter the average (Avg) and the standard deviation (StdDev) are calculated. The Cp and Cpk values are calculated in the following way:

Cp = (USL – LSL) / (6 * StdDev)

Cp,upper = (USL – Avg) / (3 * StdDev)

Cp,lower = (Avg – LSL) / (3 * StdDev)

Cpk = Min(Cp,upper, Cp,lower)
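The formulas can be sketched as a small Python helper using the sample standard deviation; the data points below are invented for illustration:

```python
import statistics

def cp_cpk(samples, lsl, usl):
    """Cp and Cpk according to the formulas above (sample standard deviation)."""
    avg = statistics.mean(samples)
    stddev = statistics.stdev(samples)
    cp = (usl - lsl) / (6 * stddev)
    cp_upper = (usl - avg) / (3 * stddev)
    cp_lower = (avg - lsl) / (3 * stddev)
    return cp, min(cp_upper, cp_lower)

# Invented, perfectly centered sample: Avg = 10.0, StdDev = 0.1.
cp, cpk = cp_cpk([9.9, 10.0, 10.1], lsl=9.4, usl=10.6)
print(round(cp, 2), round(cpk, 2))  # 2.0 2.0: the process is centered, so Cp = Cpk
```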

The Cp value gives the theoretical capability of the process. It assumes a centered production and evaluates the best possible outcome for the given process deviation. This is the target to reach by process centering. The Cp value itself should be maximized by minimizing the process deviation.

The Cpk value gives the current process performance and also takes into account whether the process is off-center. If the process is centered, the equation Cp = Cpk holds.

For a Six Sigma production, values of Cp = 2.0 and Cpk = 1.5 are expected. The theoretical process capability should show a process deviation which fits six times into the process specification. A normal variation of the process around its center is always present and cannot be avoided completely. Therefore, the long-term Cpk value is allowed to be smaller, due to the experience that processes vary in the field by about 1.5 sigma around the target.

If everything is Six Sigma ready, one gets only about 3.4 violations of the specification out of one million samples; with the assumed long-term shift of 1.5 sigma, practically all of them fall on one side of the specification.
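The 3.4 ppm figure can be reproduced from the standard normal distribution: with the customary 1.5 sigma long-term shift, the nearer specification limit sits 6 - 1.5 = 4.5 standard deviations from the process mean. A quick check with Python's standard library:

```python
from statistics import NormalDist

# With the customary long-term shift of 1.5 sigma, the nearer specification
# limit lies 6 - 1.5 = 4.5 standard deviations from the process mean.
tail = 1 - NormalDist().cdf(4.5)
print(round(tail * 1e6, 1))  # 3.4 defects per million on the dominant side
```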

The Issue of Normal Distribution

The first issue, and one seldom questioned, is: “Are my parameters normally distributed?”

In reality, I have seen processes with Cp and Cpk values smaller than expected or wanted, but without any failures. The reason is that the parameters used are not normally distributed, and the calculations and statistics go wrong. Think of a rectangular distribution which is always within specification, e.g. between LSL/2 and USL/2. The calculation of Cp and Cpk gives a sigma level which suggests a messy process, but in reality everything goes fine and no violation is to be expected. The math merely produces numbers based on wrong assumptions.

A gate oxide growth, for example, cannot be normally distributed, due to the impossibility of negative gate oxide thicknesses. Processes under SPC (Statistical Process Control) where corrective actions are performed can also not be normally distributed: the influence of the corrective actions destroys any normal distribution that was present beforehand.

One should always test for normal distribution before calculating Cp and Cpk. In reality, only a few processes are reasonably normally distributed, and for them the normal Cp and Cpk measurements are suitable. For most processes, one should use different distributions for the calculations, or, better, count the violations and calculate back to a sigma level. That gives better information about real processes.
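The counting approach can be sketched as follows; the helper converts an observed violation rate back into a one-sided sigma level, and the counts are invented:

```python
from statistics import NormalDist

def sigma_level(violations, samples):
    """Back-calculate a one-sided sigma level from counted violations."""
    rate = violations / samples
    return NormalDist().inv_cdf(1 - rate)

# Invented count: 32 out-of-spec parts per million corresponds to about 4 sigma.
print(round(sigma_level(32, 1_000_000), 2))
```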

Hunting for Cp and Cpk

In a lot of industries, quality is one of the most important values, and in some, such as the aerospace, automotive, and medical industries, it is even life-saving. The pressure to reach and prove quality is very strong, and it is necessary to stay in business and to challenge one's competitors. Cp and Cpk values are crucial for all these purposes.

Cp and Cpk strongly depend on the specifications used. This leads to the next section…

Setting the Limits

When it comes to customer satisfaction, good Cp and Cpk values are important goals. It is sometimes hard to reach them technically, but choosing the right limits can help. I have heard it claimed that the specification limits can simply be adjusted to meet Cp and Cpk requirements, and that these limits have no restrictions at all, because the process is controlled through its control limits. After that, I knew I had to write about the right limits. Even a professional consultant mentioned something like that, and I could only explain why there are technical limits in process and quality engineering.

First, we should define which limits are available for process control and what their purposes are.

Functional Limits

These limits define the range within which the product is expected to work. Outside these limits the product is just waste, not to be used, and therefore to be dumped. There is nothing more to say… 😉

Specification Limits

These limits are negotiated with customers, or they are internal limits defined to decide whether a product is to be delivered or not. Specification limits can be defined multiple times for different customers, markets, or functions, because not all product purposes need the same quality. The only requirement is that the specification limits are equal to the functional limits or lie within the functional limit range. A product which does not work at all is not suitable for any market, except for selling to waste recycling.

Control Limits

Control limits are used to trigger corrective actions. If the process runs out of the control limits, the production system should signal this event immediately, and someone responsible has to start corrective actions to get the process back to target or the deviation back to a normal level. The control limits need to be defined on both sides of the target (above and below) to make sense for corrective actions. To serve this purpose, the control limits also have to lie between the specification limits: if a control limit triggers an event outside the specification, it is too late for any meaningful action. Several control limits can be defined if necessary, to separate different levels of escalation or different levels of invasive corrective actions.

Relationship of Limits

Regarding the facts above, there is a simple relationship between all these limits, which can be written as:

UFL >= USL > UCL > TARGET > LCL > LSL >= LFL

Shortcut  Name                       Meaning
UFL       Upper Functional Limit     The upper limit for the product to work properly
USL       Upper Specification Limit  The upper limit of the product specification as delivery criterion
UCL       Upper Control Limit        The upper control limit as trigger for corrective action
TARGET    Target                     The production target and optimum goal for production
LCL       Lower Control Limit        The lower control limit as trigger for corrective action
LSL       Lower Specification Limit  The lower limit of the product specification as delivery criterion
LFL       Lower Functional Limit     The lower limit for the product to work properly
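The relationship can be encoded as a small consistency check, useful for instance when limits are maintained in a configuration; the function and the limit values are purely illustrative:

```python
def limits_consistent(ufl, usl, ucl, target, lcl, lsl, lfl):
    """Check the relationship UFL >= USL > UCL > TARGET > LCL > LSL >= LFL."""
    return ufl >= usl > ucl > target > lcl > lsl >= lfl

# Purely illustrative limit set for a parameter with target 5:
print(limits_consistent(10, 9, 8, 5, 2, 1, 0))  # True
print(limits_consistent(10, 9, 8, 5, 2, 1, 2))  # False: LSL below LFL
```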

The Curse Revealed

Where is the curse in all this? Everything is well defined, and we do the right statistics!?

In reality, most parameters are not normally distributed, and the calculation of Cp and Cpk gives numbers implying a worse process than is really present. The numbers do not look very good because of the wrong assumption of normal distribution. The correct calculation is difficult, and only statistics based on event counting are really accurate, but the number of events to be counted needs to exceed the one million mark, which is not really practical.

Customers also want Cp and Cpk values which correspond to a well-defined sigma level like 3 (Cp = 1.0), 4 (Cp = 1.33), 5 (Cp = 1.67), or 6 (Cp = 2.0). The specification limits can only be expanded as far as the functional limits, and the process has to have a deviation and variation small enough to meet this goal, which is sometimes not achievable due to physical constraints of machines or processes. Sometimes the process capability simply cannot be proven accurately with Cp and Cpk values.

In the semiconductor industry, for example, one has to deal with physical issues. For an 8 nm tunnel oxide, one has to grow a layer of roughly 30 atoms to meet the thickness. The specification is set to 8 nm +/- 1 nm, and 1 nm is roughly 4 atoms. To meet the Six Sigma level, the process deviation has to be a sixth of these 4 atoms, which is about 2/3 of an atom, and that in a process which treats a whole 8 inch wafer: over the whole area, only 2/3 of an atom of variation is allowed. These kinds of tunnel oxides are prepared in 0.6 um technologies. Current CMOS technologies go down to below 0.03 um, and the tunnel oxides become thinner and thinner… We obviously meet the physical limitations here.
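A back-of-the-envelope check of these numbers; the atomic layer spacing of 8 nm for roughly 30 atoms is the rough figure from the text:

```python
# Rough figures from the text: an 8 nm oxide is about 30 atomic layers thick.
atom = 8.0 / 30             # ~0.27 nm per atomic layer
half_tolerance = 1.0        # specification: 8 nm +/- 1 nm
sigma = half_tolerance / 6  # Cp = 2.0 requires StdDev = (USL - LSL) / 12 = 1/6 nm

print(half_tolerance / atom)  # tolerance in atoms: roughly 4
print(sigma / atom)           # allowed StdDev in atoms: roughly 2/3
```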

What can be done when customers have their expectations and physics is at its end? Sometimes there is a real gap between needs, expectations, and physical capabilities.

Solution

Sometimes we have to slow down, sit together, and discuss in detail, and technically correctly, what is going on. Using Cp and Cpk values for production control is a good step, but a blind hunt for numbers does not lead to the final target: quality. The numbers are not always correct, and the expectations have to be adjusted to reality. Every production process should be optimized for the best possible and affordable quality, but this has to be made transparent to the external and internal customers of manufacturing parameters.

For managers, engineers, and customers, there should be open discussions and, if needed, training to get the background knowledge for constructive discussions and decision-making procedures. Otherwise, there is a large potential for professional conflicts…
