Sample Defect Cost Calculations: Why tests are worth it…

Sample Defect Cost Calculation

The cost calculation is not simple; a lot of assumptions and estimations are needed. The following definitions, terms, and formulas are a first step towards understanding the costs related to defects and testing. The results are meant as an example and maybe as an argument for strong and tight testing.

Let us define first the Defect Detection Percentage (DDP):

DDP = (bugs detected) / (bugs present) = (test bugs) / (test bugs + production bugs)

The defect detection percentage is a quite simple formula: it is just the ratio of the bugs found before delivery to the total number of bugs in the code, which is unknown. The real insight comes as soon as the ratio is rewritten as the ratio of the bugs found during testing to the sum of the found bugs and the bugs which are delivered into production. For example, if the test team finds 90 bugs and customers later report another 10, the DDP is 90 / (90 + 10) = 90%. This formula shows in a very simple manner a simple truth: all bugs which are not found during testing go directly into production. The first ratio is too abstract to get this point straight with the right emphasis, but it is really important: bugs in production at the customer’s site destroy reputation, decrease future revenue due to lost customers, and in the worst case bring in calls for liability.

The relationship between the detection percentage for all bugs and the detection percentage for critical bugs should, under all circumstances, be:

DDP (all bugs) < DDP(critical bugs)

Or in other words: the final goal should be a much better detection percentage for critical bugs than for all bugs or minor bugs. In most cases it is not very critical to have a button in a UI at the wrong position, but a crashing application, or an application which cannot keep a database connection open, is critical.

The consequence is: we should put more time and effort into testing for critical and blocking defects than into testing for minor issues. I have already seen this go wrong in the real world: the focus and the effort on the UI were remarkable, because the UI was obvious and everybody could see it (especially one’s own management and the customer’s management), but the performance was not sufficient and the stability was not as good as needed. Still, the product was shiny and looked beautiful…

Have a look at the post about the quality characteristics: https://rickrainerludwig.wordpress.com/2014/02/10/software-quality-charateristics-in-iso-9126. What is critical and what is not may change from project to project. Deep thought is needed for the weighting, but the weighting should lead the testing priorities.

The possible costs for defects (see Beautiful Testing: http://astore.amazon.de/pureso-21/detail/B002TWIVOY) can be classified as:

  1. Cost of detection: All costs for testing and accompanying actions. Even if no bugs are found, there are costs which need to be taken into account. For example, performing a quality risk analysis, setting up the test environment, and creating test data are activities to be taken into account. They all incur costs of detection, and there are many more tasks to do. These are the permanent, time-based costs.
  2. Cost of internal failure: The testing and development costs which are incurred purely because bugs are found. For example, filing bug reports, adding new tests for the found bugs, fixing bugs, confirmation testing of bug fixes, and regression testing are activities that incur costs of internal failure. These are the actual costs per bug.
  3. Cost of external failure: The support, testing, development, and other costs that are incurred because no 100% bug-free software was delivered. For example, costs for technical support, help desk organization, and sustaining engineering teams are costs of external failure. These costs are much higher: they are at least as high as the internal failure costs, but there are additional costs for external communication, containment, and external support.

Let’s define the Average Cost of a Test Bug with the knowledge from above:

ACTB = (cost of detection + cost of internal failure) / (test bugs) [$/Bug]

There is also the definition of the Average Cost of a Production Bug:

ACPB = (cost of external failure) / (production bugs) [$/Bug]

With both definitions from above, a Test Return on Investment can be calculated:

TestROI = ( ( ACPB – ACTB) x test bugs) / (cost of detection) [1]

Example Test ROI Calculation

Let us assume we have 5 developers in a test team which costs about 50 k$ per month (salary and all other costs). Let us further assume they find an average of 50 bugs per month and customers find 5 bugs per month in production. The cost of a production bug is assumed to be 5 k$ (5 people for a week for customer support, bug fixing, and negotiations) and the cost of a test bug to be 1 k$ (1 developer for a week). We get the following numbers:

ACTB = (50 k$/month + 50 bug/month * 1k$/bug) / (50 bug/month) = (100 k$/month) / (50 bug/month) = 2k$/bug

ACPB = (5 bug/month * 5 k$/bug) / (5 bug/month) = 5 k$/bug (which is what we defined, obviously)

TestROI = ((5 k$/bug – 2k$/bug) * 50 bug/month) / (50 k$/month) = 3

This simple example shows that testing might be economically meaningful…
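
As a minimal sketch, the whole calculation can be written down in a few lines of Java. All numbers are the assumptions from the example above, not measured data:

    // Minimal sketch of the Test ROI calculation with the example numbers above.
    // All figures are assumptions taken from the example, not measured data.
    public class TestRoiExample {

        public static void main(String[] args) {
            double costOfDetection = 50_000.0;     // $ per month for the test team
            double testBugs = 50.0;                // bugs found by the team per month
            double productionBugs = 5.0;           // bugs found by customers per month
            double costPerTestBug = 1_000.0;       // $ internal failure cost per bug
            double costPerProductionBug = 5_000.0; // $ external failure cost per bug

            // ACTB = (cost of detection + cost of internal failure) / test bugs
            double actb = (costOfDetection + testBugs * costPerTestBug) / testBugs;

            // ACPB = cost of external failure / production bugs
            double acpb = (productionBugs * costPerProductionBug) / productionBugs;

            // TestROI = ((ACPB - ACTB) * test bugs) / cost of detection
            double testRoi = ((acpb - actb) * testBugs) / costOfDetection;

            System.out.printf("ACTB    = %.0f $/bug%n", actb);   // 2000 $/bug
            System.out.printf("ACPB    = %.0f $/bug%n", acpb);   // 5000 $/bug
            System.out.printf("TestROI = %.1f%n", testRoi);      // 3.0
        }
    }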

Automated and Manual Regression Tests

Here are some simple formulas for automated testing. It is sometimes difficult to communicate the positive impact of automated testing, because the initial costs for hardware, the operational costs, and the costs for developing automated tests are significant. But the next formulas give an impression of the benefit.

Let us define the Regression Test Automation ratio:

RTA = (automated regression tests) / (manual tests + automated regression tests)

The Regression Risk Coverage is shown here to have the complete set of formulas:

RRC = (regression risk covered) / (regression risks identified)

The RRC is used as a measure of confidence in the automated tests: the higher the RRC, the higher the confidence that no critical bugs are delivered.

To calculate the benefit of automated tests, the Acceleration of Regression Testing can be used:

ART = (manual regression test duration – automated regression test duration) / (manual regression test duration)
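
The following small Java sketch computes the three ratios; all input numbers are purely assumed for illustration and do not come from a real project:

    // Sketch of the regression test metrics with purely assumed numbers.
    public class RegressionMetrics {

        public static void main(String[] args) {
            double automatedRegressionTests = 800;
            double manualRegressionTests = 200;
            double regressionRisksIdentified = 120;
            double regressionRisksCovered = 100;
            double manualRegressionDurationHours = 80;   // two weeks of manual testing
            double automatedRegressionDurationHours = 8; // one nightly CI run

            double rta = automatedRegressionTests
                    / (manualRegressionTests + automatedRegressionTests);
            double rrc = regressionRisksCovered / regressionRisksIdentified;
            double art = (manualRegressionDurationHours - automatedRegressionDurationHours)
                    / manualRegressionDurationHours;

            System.out.printf("RTA = %.2f%n", rta); // 0.80
            System.out.printf("RRC = %.2f%n", rrc); // 0.83
            System.out.printf("ART = %.2f%n", art); // 0.90
        }
    }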

Automated tests bring benefits twofold:

  1. Development and test engineers are relieved from repetitive work, which engineers in most cases are not very good at anyway. Engineers are better placed inside development projects and in failure analysis, which brings more economic benefit. Every hour saved on manual testing can be better spent on functionality development or quality improvement.
  2. Automated tests can be run anytime, even at night. Continuous Integration (CI) and Continuous Delivery (CD) approaches use this by triggering tests automatically, for example after each code change in the source code management tool. Bugs can be found and fixed very quickly.

The costs can be lower, too. If the testing environment costs 1 k$/month, that is only about 1/5 of a developer’s monthly cost, but it runs 24/7 and executes many more tests per time unit than any developer could. It is a real bargain.

Summary

Even though this is only a short post on some defect cost calculations, it shows that testing is economically meaningful. The costs for testing, such as developers writing automated tests, dedicated testers doing manual tests, hardware, and QA staff managing testing and assuring quality compliance, seem very high. But taking into account that the costs for defects increase significantly the later they are found in the product life cycle, testing brings a huge benefit.


Can Programs Be Made Faster?

Short answer: No. But they can be made more efficient.

A happy new year to all of you! This is the first post of 2014 and it is a (not so) short post about a topic which follows me all the time in discussions about high performance computing. In discussions and in projects I get asked how programs can be written to run faster. The problem is that this mindset is misleading. It always takes me some minutes to explain the correct mindset: programs cannot run faster, but they can run more efficiently and thereby save time.

If we neglect that we can scale vertically by using faster CPUs, faster memory, and faster disks, the speed of a computer is constant (also neglecting CPUs which change their speed to save power). All programs always run at the same speed, and we cannot do anything to speed them up by just changing the programming. What we can do is use the hardware we have as efficiently as possible. The effect is that we get more done in less time. This reduces the program run time, and the software seems to run faster. That is what people mean, but looking at efficiency puts the mind in the right place to find the correct levers for decreasing run time.

As soon as a program returns the correct results it is effective, but there is also the efficiency to be looked at. Have a look at my post about effectiveness and efficiency for more details about the difference between the two. To gain efficiency, we can do the following:

Use all hardware available

All cores of a multi-core CPU can be utilized, and all CPUs of the system if we have more than one CPU. GPUs or physics accelerator cards can be used for calculations if present.

Especially in brown field projects, where the original code comes from single core systems (before 2005 or so) or from systems which did not have appropriate GPUs (before 2009), developers did not pay attention to multi-threaded, heterogeneous programming. These programs have a lot of potential for performance gains.

Look out for:

CPU utilization

Introduce multi-threaded programming into your software. Check the CPU utilization during an actual run and look for CPU idle times. If there are any, check whether your software can do something useful at the times the idle periods occur.
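
A minimal Java sketch of this idea, assuming the work items are independent, is to spread them over a thread pool sized to the number of available cores. The work() method is just a placeholder for a real calculation:

    // Minimal sketch: spread independent work items over all available CPU cores.
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.*;

    public class ParallelWork {

        static double work(int item) {
            return Math.sqrt(item); // placeholder for an expensive calculation
        }

        public static void main(String[] args) throws Exception {
            int cores = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(cores);
            List<Future<Double>> results = new ArrayList<>();
            for (int i = 0; i < 1_000; i++) {
                final int item = i;
                results.add(pool.submit(() -> work(item)));
            }
            double sum = 0.0;
            for (Future<Double> f : results) {
                sum += f.get(); // waits until the individual result is ready
            }
            pool.shutdown();
            System.out.println("sum = " + sum);
        }
    }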

GPU utilization

Introduce OpenCL or CUDA into your software to utilize the GPU board or physics accelerator cards if present. Check the utilization of the cards during calculation and look for optimizations.

Data partitioning for optimal hardware utilization

If a calculation does not need too much data, everything should be loaded into memory to have the data present there for efficient access. Data can also be organized for access in different modes for the sake of efficiency. But if there are calculations with amounts of data which do not fit into memory, a good strategy is needed to avoid performing calculations on disk.

The data should be partitioned into smaller pieces. These pieces should fit into memory, and the calculations on these pieces should run completely in memory. The bandwidth from CPU to memory is about 100 to 1000 times higher than from CPU to disk. Once you have done this, check with tools for cache misses and see whether you can optimize further.
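
A minimal Java sketch of such a partitioning, assuming a hypothetical input file big-data.bin and an arbitrarily chosen chunk size, could look like this:

    // Sketch: process a file that is too big for memory in fixed-size partitions.
    // The file name and chunk size are assumptions for illustration only.
    import java.io.IOException;
    import java.io.RandomAccessFile;

    public class ChunkedProcessing {

        static final int CHUNK_SIZE = 64 * 1024 * 1024; // 64 MB fits easily into memory

        static long process(byte[] chunk, int length) {
            long sum = 0;
            for (int i = 0; i < length; i++) {
                sum += chunk[i]; // placeholder for the real in-memory calculation
            }
            return sum;
        }

        public static void main(String[] args) throws IOException {
            long total = 0;
            try (RandomAccessFile file = new RandomAccessFile("big-data.bin", "r")) {
                byte[] buffer = new byte[CHUNK_SIZE];
                int read;
                while ((read = file.read(buffer)) > 0) {
                    total += process(buffer, read); // all work happens in memory
                }
            }
            System.out.println("total = " + total);
        }
    }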

Intelligent, parallel data loading

The bottlenecks for calculations are the CPU and/or GPU. They need to be utilized, because only they produce relevant results; all other hardware just supports them. So do everything to keep the CPUs and/or GPUs busy. It is not a good idea to load all data into memory (and let the CPU/GPU idle), then start a calculation (everything is busy), and then store the results afterwards (and have the CPU/GPU idle again). Develop your software with dynamic data loading. While calculations run, new data can be fetched from disk to prepare the next calculations, and the next calculations can run while the former results are written to disk. This may keep one CPU core busy with IO, but the other cores do meaningful work and the overall utilization increases.
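
One way to sketch this in Java is a simple producer/consumer pipeline with a BlockingQueue; the loading and calculation methods here are placeholders, not code from a real project:

    // Sketch: load the next piece of data while the current one is calculated.
    // A BlockingQueue decouples the IO thread from the calculation thread.
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public class PipelinedLoading {

        static final double[] POISON = new double[0]; // marks the end of the data stream

        static double[] loadNextBlock(int index) {
            // placeholder: in a real program this would read from disk
            double[] block = new double[1_000];
            java.util.Arrays.fill(block, index);
            return block;
        }

        static double calculate(double[] block) {
            double sum = 0.0;
            for (double v : block) {
                sum += v; // placeholder for the real calculation
            }
            return sum;
        }

        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<double[]> queue = new ArrayBlockingQueue<>(2); // small prefetch buffer

            Thread loader = new Thread(() -> {
                try {
                    for (int i = 0; i < 100; i++) {
                        queue.put(loadNextBlock(i)); // blocks if the calculation is behind
                    }
                    queue.put(POISON);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            loader.start();

            double total = 0.0;
            double[] block;
            while ((block = queue.take()) != POISON) {
                total += calculate(block); // CPU stays busy while the loader reads ahead
            }
            loader.join();
            System.out.println("total = " + total);
        }
    }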

Do not do unnecessary things

Have a look at my post about the seven muda to get an impression of the wastes. All of these wastes can be found in software, and they lead to inefficiency. Everything which does not directly contribute to the expected results of the software needs to be questioned. Everything which uses CPU power, memory bandwidth, or disk bandwidth, but is not directly connected to the requested calculation, may be treated as potential waste.

As a starter, look for, check, and optimize the following:

Decide early

Decide early when to abort loops, which calculations to do, and how to proceed. Some decisions are made at a certain position in the code, but sometimes these checks can be done earlier in the code or before loops, because the information is already present. This is something to be checked: during refactorings there might be other, more efficient positions for these checks. Look out for them.
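
A small, hypothetical Java example of such a hoisted decision: the flag is already known before the loop, so there is no need to evaluate it in every iteration:

    // Sketch: the same check, once inside the loop and once decided before the loop.
    public class DecideEarly {

        // The flag is already known before the loop, so checking it in every
        // iteration is wasted work.
        static double slow(double[] values, boolean useAbsoluteValues) {
            double sum = 0.0;
            for (double v : values) {
                if (useAbsoluteValues) { // evaluated in every iteration
                    sum += Math.abs(v);
                } else {
                    sum += v;
                }
            }
            return sum;
        }

        // Decide once, before the loop starts.
        static double fast(double[] values, boolean useAbsoluteValues) {
            double sum = 0.0;
            if (useAbsoluteValues) {
                for (double v : values) {
                    sum += Math.abs(v);
                }
            } else {
                for (double v : values) {
                    sum += v;
                }
            }
            return sum;
        }
    }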

Validate economically

Do not check the validity of your parameters in every function. Check the model parameters once at the beginning of the calculations, and do it thoroughly. If these checks are sufficient, there should be no illegal state related to the input data afterwards, so it does not need to be checked permanently.

Let it crash

Check the input parameters of functions or methods only if a failure there would be fatal (like returning wrong results). Let there be a NullPointerException, IllegalArgumentException, or whatever else if something happens. This is OK; exceptions are meant for situations like that. The calculation can be aborted that way, and the exception can be caught in a higher function to abort the software or the calculation gracefully, but the cost of checking everything permanently is high. On the other side: what will you do when a negative value comes into a square root function with double output, or the matrix dimensions do not fit in a matrix multiplication? There is no meaningful way to proceed other than to abort the calculation. Check the input model once, and everything is fine.
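
A minimal Java sketch combining the two previous points: the input model is checked once and thoroughly, and the calculation itself contains no further parameter checks and simply crashes on illegal input:

    // Sketch: validate the input model once, then let deeper functions simply crash
    // on illegal values instead of re-checking the same conditions everywhere.
    public class ValidateOnce {

        static void validateModel(double[][] matrix) {
            if (matrix == null || matrix.length == 0) {
                throw new IllegalArgumentException("matrix must not be null or empty");
            }
            int columns = matrix[0].length;
            for (double[] row : matrix) {
                if (row.length != columns) {
                    throw new IllegalArgumentException("matrix rows must have equal length");
                }
            }
        }

        // No parameter checks here: if something illegal slips through, the resulting
        // NullPointerException or ArrayIndexOutOfBoundsException aborts the calculation,
        // which is exactly what we want.
        static double sum(double[][] matrix) {
            double result = 0.0;
            for (double[] row : matrix) {
                for (double v : row) {
                    result += v;
                }
            }
            return result;
        }

        public static void main(String[] args) {
            double[][] model = {{1.0, 2.0}, {3.0, 4.0}};
            validateModel(model); // checked once and thoroughly
            System.out.println("sum = " + sum(model));
        }
    }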

Crash early

Include sanity checks in your calculations. As soon as the calculation is not gaining more precision, runs into a wrong result, produces the first NaN or Inf values, or behaves strangely in any way, abort it and let the computer compute something more meaningful. It is a total waste of resources to let a program run which does not do anything meaningful anymore. It is also very social to let other people calculate their stuff in the meantime.
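
A small Java sketch of such a sanity check, using a simple Newton iteration only as a stand-in for a real calculation:

    // Sketch: abort an iterative calculation as soon as the intermediate result is
    // no longer meaningful (NaN, infinity, or no further improvement).
    public class CrashEarly {

        static double iterate(double x) {
            return x - 0.5 * (x * x - 2.0) / x; // one Newton step for sqrt(2), as an example
        }

        public static void main(String[] args) {
            double x = 1.0;
            double previous = Double.MAX_VALUE;
            for (int i = 0; i < 1_000; i++) {
                x = iterate(x);
                if (Double.isNaN(x) || Double.isInfinite(x)) {
                    throw new ArithmeticException("calculation diverged at iteration " + i);
                }
                if (Math.abs(x - previous) < 1e-12) {
                    break; // no more precision to gain, stop and free the machine
                }
                previous = x;
            }
            System.out.println("sqrt(2) ~ " + x);
        }
    }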

Organize data for efficient access

I have seen software which looks up data in arrays element-wise by scanning from the first element to the position where the data is found. This leads to linear time behavior, O(n), for the search. On sorted data this can be done with binary search instead, which brings logarithmic time behavior, O(log n). Sometimes it is also possible to hold data in memory in a non-normalized way to have access to it in different ways: sometimes a mapping is needed from index to data and sometimes the other way around. If memory is not an issue, think about keeping the data in memory twice for optimized access.
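
A small Java sketch with made-up data: the element-wise scan is replaced by a binary search on a sorted array, and in addition a map is kept in memory so that the lookup works quickly in both directions:

    // Sketch: binary search on a sorted array instead of an element-wise scan,
    // plus a second map in memory for the reverse lookup.
    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.Map;

    public class EfficientLookup {

        public static void main(String[] args) {
            int[] sortedIds = {3, 8, 15, 23, 42, 77, 108}; // index -> id

            // O(log n) instead of scanning element by element in O(n)
            int position = Arrays.binarySearch(sortedIds, 42);
            System.out.println("id 42 found at index " + position);

            // If memory is not an issue, hold the reverse mapping (id -> index)
            // as well, so both directions are fast.
            Map<Integer, Integer> indexById = new HashMap<>();
            for (int i = 0; i < sortedIds.length; i++) {
                indexById.put(sortedIds[i], i);
            }
            System.out.println("id 42 found at index " + indexById.get(42));
        }
    }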

Conclusion

I hope I could show how the focus on efficiency brings the right insights on how to reduce software run times. The correct mindset helps to identify the weak points in software, and the points above should give some directions in which to look for inefficiencies. A starting point is presented, but the way to go is different for every project.

Thoughts on the Agile Manifesto

From time to time, I discuss the agile methodology with clients and friends. The Agile Manifesto was published at http://agilemanifesto.org about 12 years ago. It was debated a lot, and the debates are still going on. I try to present here a small insight I had during the last years.

The main statement is

“Manifesto for Agile Software Development

We are uncovering better ways of developing
software by doing it and helping others do it.
Through this work we have come to value:

Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan

That is, while there is value in the items on
the right, we value the items on the left more.”

Interestingly, the focus is shifted from the product, its documentation, and the technical process (the planning) to customer focus and customer satisfaction. This is also part of another system called Total Quality Management (http://en.wikipedia.org/wiki/Total_quality_management). The focus shift is very obvious and necessary when we think about the only income source each company has: the customer. The customer (or different customers, if an organization has different services to offer) is the only source of income and therefore of revenue, profit, and growth. Any revenue increase can only happen when customers pay more for services or products, or when more customers are willing to spend money on the company’s services or products. It is therefore obvious that the focus needs to be on the customer and that the organization needs to be aligned to meet the needs and expectations of its customers.

That’s why the first point, about ‘Individuals and interactions’, is the most important one. Translated into simple actions it means: identify your customers, treat them individually, and implement processes for easy communication and interaction. Only customers can tell you what they need, what they expect, and what they are ready to pay for. Individual customers treated well will tell you in more detail what they need and bring in new business ideas. Ask a group of people and you get no detail; but ask a single individual and listen closely, and you might get a lot of insights.

In software development, the main purpose of organizations is delivering working software and systems; these are the primary needs of their customers. Customers do not need a fancy manual to learn what the software might be able to do after spending hours reading it and trying things out in tutorials; they need software which brings business value. That’s why the second point is important. Have running, valuable software which is self-explanatory, and the customer is willing to pay for it. You can save a tree by dropping the printed documentation. Have a look at Apple products: how many manuals are sold with this complex and feature-rich software? Some online help and a self-explanatory UI, and everything is fine. This is one of the fundamentals of Total Quality Management: only the customer can tell you what she wants, and she is also the one who pays. Is there another way to work with that knowledge?

The third point is an enhancement of the first point. If the chance is there, try to work closely together with your customer. In Scrum and XP this is done by short release cycles and demos, which show the customer the progress on a regular basis after each release cycle and ask for criticism, comments, and new ideas. It helps to deliver software which is valuable for the customer and which is therefore paid for. An even better idea is to embed a representative of the customer into the development team. The responses are immediate, and the customer’s acceptance testing is done the whole time. The risk of developing software the customer does not want to pay for is reduced dramatically. And again: the customer pays for the product. There is no better way to make a product than to build it together with your customer, and when the customer is part of the team, she is even more engaged and willing to help with development. In the end, the willingness to pay is much higher when the product was, in a way, custom built.

By doing all this, be prepared: with each demo, feedback session, and communication with the customer, there may be new ideas, comments, and criticism. The requirements are about to change on a daily basis. That’s what the fourth value is about: be open to changes. Customers only have a vague idea at the beginning of what they want, but during development more ideas arise, some faulty ideas are dropped, and new wishes pop up. That is quite normal and part of the process. It helps to make the product better in the end, and the business value is increased. A product like that can even be priced higher. What is better than that?

Is “good enough” good enough?

I thought about the term “good enough” lately. It came to my mind that the term is used in a way which is misleading.

In a lot of books about quality and economics (books which deal with the combination of the two), it is written that development should not be perfect, because that is not economically meaningful, but that it should be good enough. The hint is right, but misleading. It is not explained in detail what ‘good enough’ really means, at least not in the books I read about that topic.

The term good enough is treated in almost all cases I witnessed as: good enough = good enough to be sellable. This leads to a focus on external design, usability, and feature bloat to increase market value. The market value in most cases is judged only by the external attributes. Most buyers do not look into the products and judge the internal value, which in most cases would not make much sense anyway. That’s why colorful, well-designed products with a ton of features sell best.

There is nothing wrong with the fact that such products sell best and most easily, but from the business point of view this is not enough. What is good enough for the end customer is not necessarily good enough for the producer of the product. All products have more needs than just to be sellable. Products need to be recyclable, maintainable, ecological, and maybe also upgradable (and much more). These thoughts lead to the conclusion that there is more to think about than just the market and being good enough for the market.

In the software industry, for instance, the source code of the current product is used as the foundation for all further products. As soon as the source base becomes unmaintainable, all future products are in danger. Software is very complex to develop and therefore very expensive. A complete redevelopment of the source base of a product family is very expensive, and only large companies can handle and survive that.

It is therefore very important, right from the beginning, to spend some effort on code quality. The application of metrics, defect checks and conventions is crucial. Otherwise, the future of a company can be in danger.

Quality Assurance in Software Development from its Root

I looked at the V-Model of software development again a couple of days ago during my holidays (for example, have a look at Wikipedia: http://en.wikipedia.org/wiki/V-Model_%28software_development%29). The model in general is very helpful to explain the different levels of testing and their meaning. It makes a difference whether I use unit testing to check the different functions of the software or integration testing to check that the different parts of the system work together as a whole. From a functional point of view the V-Model is a good model, but it lacks the most basic building block of software: the source code itself.

Let us take car manufacturing as an example. The V-Model tells us to specify the basic requirements, like: the car needs to be able to go from A to B with four passengers and a big trunk. We then specify the system design, like: it is a passenger car which has a certain size and shape, and we go down via the engine design to details like the different parts of the engine. We take the requirements and design everything from top to bottom by separating out more and more smaller building blocks, which are separated again until we end up with the most basic building blocks like screws, nuts, and bolts. In production, testing goes the other way around: first the screws, nuts, and bolts are checked for their correct size before they are put into an engine, for example; the engine is tested before it is put into the car, and so on.

But when we look at car manufacturing, there is something still missing: the part where engineers think about the maintainability of the car. They already think about maintainability during the design phase of new car models. It would be a catastrophe if a mechanic had to disassemble the whole car just to change the front light bulb. (In my car I need to work my way to the front light starting from the front wheel! So they did not do a good job here…)

In software development it is the same. We think about the correct behavior of the code, but we forget that we need to maintain the code in the future and also develop it further. So maintainability is a big deal, too. Here we need to pay more attention to software architecture, design, and source code quality, so as not to block the development of further functionality. It is even more obvious that we need to do this when we consider that the source code of the current product is the base for all following products.

The Killer Difference between Mechanical Engineering and Software Engineering

I always run into the same discussions when it comes to project management and quality assurance in software development. Some people (mostly people without a technical background in software development) do not really understand why classic project management and quality assurance methods fail in software development. Here is the killer argument why they do not work:

In hardware engineering, the same procedure is applied repeatedly for each piece of hardware produced. In software, the procedure is applied once and then the product is copied.

This is now my killer argument in most discussions related to that topic, to stop debates about why these methods fail. The difference should be easy to understand. For a deeper understanding, I explain a little more:

  1. Hardware engineering has a lot of physical constraints which limit the choices, and therefore timelines are more predictable.
  2. In hardware engineering it is not so easy to define new requirements, and the requirements do not change on a regular basis like in software development. A car needs to go fast and safely, but it is not intended to dive from one day to the next. Nobody would think about requesting something like that and expect the car to be able to dive in the next release within a short period of time. For software this is a common pattern, because it is just easily put in, there is only some code to be written…
  3. In hardware production, due to its repetitive nature, one can track Cp and Cpk values and steer the production process. After some prototypes (which are fixed manually later or disposed of), the production process is optimized and controlled. The products become better in quality over time due to continuous improvements. That’s classic quality assurance. For software: a product is produced once (and hopefully well tested), but shipped not just once but up to thousands of times until a bug is reported. Hardware products are each unique in a way; software is simply copied.

I could go on in this way: just take the classic methodologies and apply them to software development with this difference in mind.

Scrum and Extreme Programming (XP) for Software Quality Assurance

I talked to someone who owned a software outsourcing company in Asia. The idea was, as always, to provide software developers at lower hourly rates for customers’ software projects. The special thing was a management team of Europeans who ran a tight, Scrum-based project and company management.

The question about quality assurance arose from my side. All I got as an answer was that the software is developed with Scrum with 2-week sprints, the source code and the results are shared with the customer on a daily basis, daily stand-up meetings are held, sprint demos are organized together with the customer, and some tools are in place for project monitoring. But nothing was said about source code quality, testing, or code reviews. If I were a customer who outsources development projects to a company like this, I would wonder what source code and what quality I get. The source code produced in outsourcing is the property of the customer who pays for it. The customer wants high quality, but what interest does the development team in Asia have in the quality of the source code and the test coverage? I have no concise answer to that.

Scrum, as I understand it, is mainly a kind of project management. Scrum is not for quality management per se, but for satisfying stakeholders. Quality is not built into the software, but into the process in the form of retrospectives, demos, and daily stand-ups. Scrum assures deliveries on time and a transparent software development.

If you look into Extreme Programming (XP), it is quite different. Apart from the terminology, the Scrum methodology is also built into XP, but there is much more. XP also has quality assurance potential built in, in the form of pair programming, test-driven development (TDD), dedicated testers, and other elements. Would it not be more suitable for companies providing outsourcing services?
