Coverage

Continuous Test

Our team is called Development Automation, but I want to change it to Delivery Automation because we focus on enabling and accelerating all aspects of Continuous Delivery. One aspect of our team charter is to manage the testing effort for our product. We currently have an external manual test team, but we treat them as customers; although they do release and regression testing, we consider their testing UAT.

We are not QA and we are not responsible for quality, because quality is a responsibility of everyone on the delivery team. We focus on coding automated checks, but we would do a disservice to the delivery team, business, and customers if we did not also focus the team on manual testing. So, our first task in managing product tests is verifying that the developers have tested their work thoroughly and accurately for the risk associated with their change. That is a mouthful, but it boils down to helping the team responsible for changes verify their change is correct.

Peer Testing

We use a technique called peer testing to help the development team verify their change is correct. Peer testing on our delivery team is unlike what I found in my Google searches on peer testing, where one person tests and then hands off to someone else to test. How we do peer testing is more akin to the agile development practice of peer development.

In our permutation of peer testing, we sit with a developer and watch them as they demo their change. As the developer is driving, we are able to ask questions, get an understanding of the developer's perspective of the change, share our understanding of the change, and ask the developer to do additional testing if necessary to further explore the change.

If we think there needs to be additional testing that is too difficult for the peer session, depending on the project and the developer, we may do the additional testing on our own or assign it to the developer to set up and demo later. During peer testing we also record the results of the tests and promote the change ticket to the next stage in our workflow if it passes, or reject it back to the developer if we observe failures in the peer session.

Peer testing cuts down on duplicated work since we get to reuse the developer's test setup. Normally, we would have to set up and run every manual test for a change, but if we feel the developer's tests are sufficient, we don't have to redo the setup for manual tests. We still have to set up automated tests until we can convince developers how easy it is to write automated checks :).

Peer testing also prepares the team for the sprint demo. The developers get a chance to practice and refine the demo with an active audience.

Peer testing allows us to share knowledge among team members. It helps train team members who may not have knowledge about a particular area of the product. Peer testing helps us improve our test knowledge and explore more ways to test the product. Finally, peer testing fosters communication and cohesion among the delivery team.

Test Coverage

We do measure test coverage (I already hear the boos from the testing community). Coverage in our context is a measure of the documented checks we have, so that we have an understanding of what we have documented, what's automatable, what's automated, and what can only be done manually. The current goal of our coverage metrics is to look for ways to effectively increase the coverage of our automated regression checks.

At the moment, we measure two types of coverage, Sprint and Regression.

Sprint Coverage

Our sprint coverage is primarily a measure of the manual testing that is done by peer testers and any automation that is created for changes. Since our manual testing is more exploratory than scripted, coverage is basically a subjective measure of how well we believe our exploration covered a change.

Regression Coverage

The majority of our coverage metrics are related to functional regression checks. We have not yet matured to the point that our unit tests outweigh our functional UI and API tests. Inverting our upside-down test pyramid is on the roadmap, but until we can change the development culture we will have to rely heavily on functional tests providing most of the coverage.

Our current product regression check coverage is documented in Gherkin-based feature files and spreadsheets. We are moving away from using spreadsheets in favor of recording all coverage in feature files.

The benefits of feature files are:

  • they are automatically linked to the code used to automate them.

  • we can report on coverage by parsing the feature files.

  • we can parse the feature files to produce a manual regression test plan.

  • automated test results are automatically associated with scenarios.

This makes coverage reporting more of an automated process with less manual headache. Also, we get a single source of truth for regression check coverage: the feature file.
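
As a rough illustration, here is a minimal sketch of how feature files could be parsed into a coverage summary. It assumes a hypothetical convention of tagging not-yet-automated scenarios with a @manual tag and keeping feature files under a features directory; the tag name and paths are illustrative, not part of standard Gherkin or our actual tooling.

    # coverage_summary.py -- rough sketch of parsing feature files for coverage.
    # The @manual tag convention and the "features" directory are assumptions.
    from pathlib import Path

    def summarize(feature_dir="features"):
        totals = {"scenarios": 0, "manual": 0}
        for feature in Path(feature_dir).glob("**/*.feature"):
            pending_tags = []
            for line in feature.read_text().splitlines():
                stripped = line.strip()
                if stripped.startswith("@"):
                    # collect tag lines that precede a scenario
                    pending_tags.extend(stripped.split())
                elif stripped.startswith(("Scenario:", "Scenario Outline:")):
                    totals["scenarios"] += 1
                    if "@manual" in pending_tags:
                        totals["manual"] += 1
                    pending_tags = []
                elif stripped:
                    # any other content line (Feature:, steps, comments) clears tags
                    pending_tags = []
        totals["automated"] = totals["scenarios"] - totals["manual"]
        return totals

    if __name__ == "__main__":
        print(summarize())

The same walk over the scenarios could just as easily emit a manual regression test plan instead of counts.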

Quality of Coverage

We grade our documented coverage to get a sense of how well the checks cover the product functionality. Since we currently have coverage documented in feature files and spreadsheets, we provide two grades: automation and manual.

We use a subjective grade of how well the documented scenarios cover the product, based on a modified version of James Bach's Low-Tech Testing Dashboard (http://www.satisfice.com/presentations/dashboard.pdf). We hijacked James's dashboard from its intended purpose, which is to provide reporting for exploratory testing. He probably won't agree with our usage, but it actually works well for documented checks too.

Coverage Grades

  • 0 - None: we need more information.

  • 1 - Basic: major functions with simple test data covered.

  • 2 - General: more than major functions covered, but not all functions.

  • 3 - Good: all major, common and critical functions covered.

  • 4 - Strong: additional data, state, or error coverage beyond good coverage.

  • 5 - Exceptional: exceptional data, state, error, security, stress, performance, usability... coverage.

Risk

To increase coverage grades across our products we want to address the highest priority items first. We measure priority in terms of risk. The higher the risk the more important it is to increase coverage.

Risk Calculation

Risk in our test coverage analysis is a measure of the comparative risk of the various functional areas of our products. We use a subjective point system to assign a risk value to each functional area according to various risk factors. The risk factors are weighted so that some factors add more risk than others. We use a formula to calculate the risk, where a higher number means the area exhibits more risk than an area with a lower value.

Risk Factor Groups

  • Value (a.k.a impact or damage): Business Value and Customer Value factors.

  • Probability of Failure: Complexity and Change factors.

Risk Factors

  • Business Value - impact on business revenue, expenses, operations, branding...

  • Customer Value - impact on customer.

  • Complexity - how difficult is it to change.

  • Change Frequency - how often is it changed (frequent change signals instability and possible area concern).

  • Each factor (e.g. Business Value) is given a score of 1, 3, or 10.

  • Each factor has an associated weight.

To calculate risk:

  • each risk factor score is multiplied by its weight

  • the weighted scores are summed by risk factor group

  • the two risk factor group totals are multiplied together

    Formula

    • (weighted Business Value) + (weighted Customer Value) = Weighted Value

    • (weighted Complexity) + (weighted Change Frequency) = Weighted Probability of Failure

    • Risk = Weighted Value * Weighted Probability of Failure
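
To make the arithmetic concrete, here is a minimal sketch of the calculation; the weights and the example scores below are made-up placeholders, not our actual values.

    # risk.py -- sketch of the risk calculation described above.
    # Weights and example scores are illustrative placeholders.
    FACTORS = {
        # factor name:      (risk factor group, weight)
        "business_value":   ("value", 3),
        "customer_value":   ("value", 2),
        "complexity":       ("probability_of_failure", 2),
        "change_frequency": ("probability_of_failure", 1),
    }

    def risk(scores):
        """scores maps each factor name to a score of 1, 3, or 10."""
        groups = {"value": 0, "probability_of_failure": 0}
        for factor, (group, weight) in FACTORS.items():
            groups[group] += scores[factor] * weight   # weight each factor score
        # multiply the two group totals together
        return groups["value"] * groups["probability_of_failure"]

    # Example: an area with high customer impact that changes frequently
    print(risk({"business_value": 3, "customer_value": 10,
                "complexity": 3, "change_frequency": 10}))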

Increasing Quality of Coverage

We set a target coverage grade that the product should have. Then we start increasing the coverage grade of the product, working our way from the highest-priority items to the lowest. Once each item has reached the target grade, we increase the target grade and begin increasing coverage again from the highest-priority items to the lowest.
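
A small sketch of that prioritization loop, assuming each functional area already carries a risk value and a current coverage grade; the area names and numbers are invented for illustration.

    # work_queue.py -- pick the areas still below the target grade, highest risk first.
    def work_queue(areas, target_grade):
        below_target = [a for a in areas if a["grade"] < target_grade]
        return sorted(below_target, key=lambda a: a["risk"], reverse=True)

    areas = [
        {"name": "checkout",  "risk": 464, "grade": 2},
        {"name": "reporting", "risk": 96,  "grade": 1},
        {"name": "admin",     "risk": 40,  "grade": 3},
    ]

    for area in work_queue(areas, target_grade=3):
        print(area["name"], "risk:", area["risk"], "grade:", area["grade"])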

The effort to increase coverage grade is outside of the effort to cover changes in our normal sprint work. We may decide to add tests or checks created during a sprint to the regression plan and this may improve the coverage grade, but the effort to increase the coverage grade is a different project outside of the sprint. To do this it is important to set aside time in sprint capacity for this project or dedicate resources to it.

As we add new checks to the regression coverage it is important to add them to the coverage report and reassess the automation and manual aspects of the coverage, including coverage grade and risk. Once we move all of the coverage to feature files we won't have to grade manual vs. automation separately.

Automated Test Runs

We have various levels of testing, broken down basically by execution time, complexity, and isolation of the test.

We run small tests after each build.

We run a smoke test or sanity check after each deploy.

We run a regression test after each sanity check.

Before a release is blessed for production we want to also run security, stress, performance, and usability tests, but we aren't there yet. We do have the framework to enable these tests. We envision a time when these tests are part of a gate that must be passed before promoting a release to the next phase on its way to production, until it finally passes a gate that allows automated deployment to production. We may never achieve fully automatic deployment to production, but we will definitely have manual push-button automated deployment to production where only a select few have access to the big red button.
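
A simplified sketch of how these runs could gate promotion through the pipeline; the stage names and pytest suite paths are assumptions for illustration, not our actual setup.

    # gates.py -- sketch of running test gates in order; a failure stops promotion.
    import subprocess

    # hypothetical suites: small tests after build, smoke after deploy, then regression
    GATES = [
        ("build",      ["pytest", "tests/small"]),
        ("deploy",     ["pytest", "tests/smoke"]),
        ("regression", ["pytest", "tests/regression"]),
    ]

    def promote(candidate):
        for stage, command in GATES:
            if subprocess.run(command).returncode != 0:
                print(f"{candidate} rejected at the {stage} gate")
                return False
        print(f"{candidate} is ready for push-button deployment")
        return True

    promote("release-1.2.3")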
