Regression Test Selection

Regression test selection (RTS) is essential for successful regression testing in larger projects. As long as you are running a small project and working with a small product, regression testing is seldom seen as an issue. As your project and product grow and mature, your regression test suite grows with them. After a while, you will find that a full regression test requires forty testers to spend forty hours each just to run the test suite once.

Regression Test Selection and Prioritization Approaches

Continuous integration can be seen as a form of regression testing. Regression test selection is a vital part of successful CI. Image from Wikipedia and used under the MIT License.

If you find enough problems in the 1,600 man-hour regression test run, you will have to fix as many of them as possible and then re-run as much of the test suite as possible. With modern approaches such as automated build, test and deploy, you will soon find it impossible to avoid some selection and prioritization of test cases. Regression test selection means choosing which tests or test cases to run in a given test session. Regression test prioritization means ordering those tests so that you find as many problems as possible as early in the session as possible.
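To make the distinction concrete, here is a minimal Python sketch. The Case fields, the area names and the idea of ordering by past failure rate are assumptions made for illustration; they are not taken from any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    name: str
    area: str            # functional area the test exercises (hypothetical field)
    failure_rate: float  # share of past runs that failed (hypothetical field)


def select(tests, changed_areas):
    """Selection: keep only the tests that touch a changed area."""
    return [t for t in tests if t.area in changed_areas]


def prioritize(tests):
    """Prioritization: run the historically most failure-prone tests first."""
    return sorted(tests, key=lambda t: t.failure_rate, reverse=True)


suite = [
    Case("login_ok", "auth", 0.02),
    Case("login_locked", "auth", 0.20),
    Case("invoice_total", "billing", 0.10),
    Case("report_export", "reports", 0.05),
]

# Select first, then order what is left.
session = prioritize(select(suite, {"auth", "billing"}))
print([t.name for t in session])
```

Selection shrinks the session, prioritization only reorders it. The two can be combined, as here, or used separately.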

Regression test selection and prioritization approaches are all based on the (perhaps faulty) assumption that if nothing has changed in the product, then the test results should also be the same. There are basically three approaches to regression test selection:

Manual regression test selection
Random regression test selection
Automated regression test selection

Manual Regression Test Selection

In manual regression test selection, an expert acts as an oracle. This is usually the test leader. Together with the development team, he or she reviews the preliminary release notes, work package descriptions, etc. and selects test cases because they belong to the related functional area, have a history of problems, are linked to recent bug fixes, or simply on a general gut feeling for which test cases should be run. The biggest risk with this approach is that the selector becomes infatuated with certain test cases and runs them too often.

Random Regression Test Selection

Random regression test selection avoids the infatuation fallacy and has in fact been shown to be more effective than manual selection. Random selection also has the benefit of being simple to implement. If you are using manual regression test selection today, consider removing (randomly, of course) 10% of the tests from your hand-selected test suite and replacing them with the same number of randomly selected tests from the overall test suite.
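The 10% swap could look roughly like the following Python sketch. The function name mix_in_random, its parameters and the test names are hypothetical, and the sketch assumes every test is identified by a unique name.

```python
import random


def mix_in_random(hand_picked, full_suite, fraction=0.10, seed=None):
    """Swap a random fraction of the hand-picked tests for tests drawn
    at random from the rest of the full suite."""
    rng = random.Random(seed)  # pass a seed to make a session reproducible
    hand_picked = list(hand_picked)
    chosen = set(hand_picked)
    pool = [t for t in full_suite if t not in chosen]

    # Never try to swap in more tests than the pool can supply.
    swaps = min(round(len(hand_picked) * fraction), len(pool))

    dropped = set(rng.sample(hand_picked, swaps))
    kept = [t for t in hand_picked if t not in dropped]
    return kept + rng.sample(pool, swaps)


full_suite = [f"test_{i:03d}" for i in range(200)]
hand_picked = full_suite[:20]

session = mix_in_random(hand_picked, full_suite, seed=1)
print(len(session), "tests, of which",
      len(set(session) - set(hand_picked)), "were picked at random")
```

The session keeps its original size, so the test effort stays the same while a small part of the suite is rotated on every run.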

Automated Regression Test Selection

In the last decade, research has focused on automated regression test selection and prioritization approaches. The fundamental challenge is to find a stable yet effective approach that performs better and faster than manual or random test selection. I have myself proposed an approach where previous test results are correlated with previous source file changes, so that changes to the same files lead to running the related test cases again. In a recent study, researchers take this one step further by analyzing not only files but also object dependencies.
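As a rough illustration of the file-based idea, the Python sketch below counts how often a test failed in a build where a given file was changed, and then uses those counts to pick tests for a new change set. This is a simplified co-occurrence count under my own assumptions, not the published approach; the history format, the function names and the file and test names are all hypothetical, and object dependencies are not modelled.

```python
from collections import Counter, defaultdict


def build_index(history):
    """Count, per source file, how often each test failed in a build
    where that file was changed."""
    index = defaultdict(Counter)
    for changed_files, failed_tests in history:
        for path in changed_files:
            index[path].update(failed_tests)
    return index


def select_tests(index, changed_files):
    """Return the tests linked to the changed files, strongest link first
    (ties are broken alphabetically)."""
    scores = Counter()
    for path in changed_files:
        scores.update(index.get(path, {}))
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [test for test, _ in ranked]


# Each entry: (files changed in a build, tests that failed in that build).
history = [
    ({"auth.py", "db.py"}, {"test_login"}),
    ({"auth.py"}, {"test_login", "test_logout"}),
    ({"billing.py"}, {"test_invoice"}),
]

index = build_index(history)
print(select_tests(index, {"auth.py"}))
```

A file with no history yields an empty selection, which is one reason to combine an approach like this with some random or manual selection.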

Conclusion

Regression test selection has the potential to significantly reduce testing effort, cost and elapsed time. While we wait for good, commercially available approaches to automated regression test selection, I think our best hope is to use a mix of manual and random test selection.

References

[bibtex file=http://www.citeulike.org/bibtex/user/greger/tag/20131022?fieldmap=posted-at:posted-date&clean_urls=0]

Image sources

Hudson Screenshot: Wikipedia