Joy of Coding... and mutation testing in Java

For many years now it has been good practice to write unit tests for your source code, and to use test coverage reporting to see how much of that code is covered by tests. Although line and branch coverage reporting is quite useful, it doesn't tell you how good your unit tests actually are. It is even possible to achieve 100% coverage without a single assert in your tests. Being interested in better ways of testing, I attended the "Mutation testing" workshop during this year's Joy of Coding conference. Mutation testing is a radically different approach to executing your unit tests and analyzing their results and coverage. Instead of measuring how much of your code is "accessed from" your unit tests, it determines how much of your code is actually "tested by" your unit tests.

So how does it actually work?

The basic idea behind mutation testing is to make a small change (a mutation) to the (byte) code and then execute your tests to see whether the change is detected. Possible mutations are altering a ">" into ">=", replacing "++" with "--" and removing "void" method invocations. Each mutation therefore creates an altered version of your code called a "mutant". Prior to the actual mutation testing, the unit tests first need to be executed against the original code to make sure no tests are failing. Then the unit tests are run against each "mutant" (which can make the process very time consuming) to see if:

the mutant is detected by our unit tests: at least one test fails and therefore the "mutant" is considered "killed".
the mutant remains unnoticed by our unit tests: no test fails (the "mutant" is considered "alive"); this means that the mutated code is actually not tested (uncovered) by the unit tests.
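
The run described above can be sketched in plain Java. This is only a toy illustration under stated assumptions: real tools such as PIT mutate bytecode automatically, whereas here each mutant is a hand-written lambda, and the names "MutationRunSketch", "ORIGINAL", "TESTS" and "run" are invented for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntPredicate;
import java.util.function.Predicate;

class MutationRunSketch {

    // Original code under test (hypothetical example): is this age an adult age?
    static final IntPredicate ORIGINAL = age -> age >= 18;

    // Hand-written mutants; a real tool would generate these automatically.
    static Map<String, IntPredicate> mutants() {
        Map<String, IntPredicate> m = new LinkedHashMap<>();
        m.put(">= changed to >", age -> age > 18);
        m.put(">= changed to <", age -> age < 18);
        m.put("always true", age -> true);
        return m;
    }

    // Each "test" returns true when it passes for the given implementation.
    static final List<Predicate<IntPredicate>> TESTS = List.of(
            impl -> impl.test(30),    // an adult is accepted
            impl -> !impl.test(5));   // a child is rejected

    static boolean allTestsPass(IntPredicate impl) {
        return TESTS.stream().allMatch(t -> t.test(impl));
    }

    // Returns, per mutant, whether it was killed (true) or stayed alive (false).
    static Map<String, Boolean> run() {
        // Step 1: the tests must be green on the original code.
        if (!allTestsPass(ORIGINAL)) {
            throw new IllegalStateException("tests fail on the original code");
        }
        // Step 2: run the whole test suite once per mutant.
        Map<String, Boolean> killed = new LinkedHashMap<>();
        mutants().forEach((name, mutant) -> killed.put(name, !allTestsPass(mutant)));
        return killed;
    }

    public static void main(String[] args) {
        run().forEach((name, killed) ->
                System.out.println((killed ? "killed:   " : "survived: ") + name));
    }
}
```

In this sketch the ">= changed to >" mutant stays alive, because neither test exercises the boundary value 18; the other two mutants are killed.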

An example of mutation testing

To see how this works in practice, consider the following method:

public String foo(int i) {
    if ( i >= 0 ) {
        return "foo";
    } else {
        return "bar";
    }
}

And suppose the unit tests consist of only one test method:

@Test
public void testFoo() {
    testee.foo(0);
}

What if we would create a "mutant" of our code in which ">=" is altered into ">"? We would expect our unit test method to detect this, right? Well, in this case it does not, since the test method doesn't contain a single assertion. What if we changed the "testFoo" method to include an assertion:

@Test
public void testFoo() {
    String result = testee.foo(0);
    assertEquals("foo", result);
}

Now our unit test method will fail and thereby detect (aka "kill") the "mutant" code: for the input 0 the mutant takes the else branch and returns "bar". Besides altering ">=" into ">", additional "mutants" could be created:

the first return statement could be altered to return null (instead of "foo"); this "mutant" is "killed" by the "testFoo" method due to the "assertEquals" statement, but remains unnoticed by the original "testFoo" method (without any assertions).
the second return statement could be altered to return null (instead of "bar"); since no test method actually covers this execution path, this "mutant" will remain unnoticed.

NOTE: some mutation testing tools (like PIT for Java) won't even bother creating a "mutant" for the second return statement, since it is not covered by any unit test (as detected by traditional line coverage).
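
The three mutants discussed in this example can be written out by hand to check which test kills which mutant. The sketch below is an illustration only: it uses plain boolean-returning methods instead of JUnit so that it runs without dependencies, and "testBar" is a hypothetical extra test (not part of the original test class) that covers the else branch.

```java
import java.util.function.IntFunction;

class FooMutantsSketch {

    // Original method from the example.
    static String foo(int i) {
        if (i >= 0) { return "foo"; } else { return "bar"; }
    }

    // Mutant 1: ">=" altered into ">".
    static String boundaryMutant(int i) {
        if (i > 0) { return "foo"; } else { return "bar"; }
    }

    // Mutant 2: first return statement altered to return null.
    static String firstReturnNullMutant(int i) {
        if (i >= 0) { return null; } else { return "bar"; }
    }

    // Mutant 3: second return statement altered to return null.
    static String secondReturnNullMutant(int i) {
        if (i >= 0) { return "foo"; } else { return null; }
    }

    // Stand-in for testFoo without an assertion: passes unless the call throws.
    static boolean testFooWithoutAssert(IntFunction<String> impl) {
        impl.apply(0);
        return true;
    }

    // Stand-in for testFoo with assertEquals("foo", result).
    static boolean testFooWithAssert(IntFunction<String> impl) {
        return "foo".equals(impl.apply(0));
    }

    // Hypothetical additional test that exercises the else branch.
    static boolean testBar(IntFunction<String> impl) {
        return "bar".equals(impl.apply(-1));
    }

    static String verdict(boolean testPassed) {
        return testPassed ? "survived" : "killed";
    }

    public static void main(String[] args) {
        System.out.println("boundary mutant, no assertion:   "
                + verdict(testFooWithoutAssert(FooMutantsSketch::boundaryMutant)));
        System.out.println("boundary mutant, with assertion: "
                + verdict(testFooWithAssert(FooMutantsSketch::boundaryMutant)));
        System.out.println("first return null, with assert:  "
                + verdict(testFooWithAssert(FooMutantsSketch::firstReturnNullMutant)));
        System.out.println("second return null, with assert: "
                + verdict(testFooWithAssert(FooMutantsSketch::secondReturnNullMutant)));
        System.out.println("second return null, testBar:     "
                + verdict(testBar(FooMutantsSketch::secondReturnNullMutant)));
    }
}
```

Only a test that actually reaches the else branch, such as the hypothetical "testBar", can kill the third mutant.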

Equivalent mutations causing false-positives

As opposed to traditional line and branch coverage, mutation coverage can lead to false positives. It could incorrectly report that a "mutant" is not detected by your unit tests. For instance, consider the following Java code:

public int someNonVoidMethod() { return 0; }
public void foo() {
  int i = someNonVoidMethod();
  // do more stuff with i
}

During mutation testing (using PIT with some non-default configuration) the following "mutant" could have been created:

public int someNonVoidMethod() { return 0; }
public void foo() {
  int i = 0;
  // do more stuff with i
}

The "int i = 0" statement in the "mutant" is functionally equivalent to the original code, in which "someNonVoidMethod" returns 0. Such an "equivalent mutation" cannot be detected, since the unit tests will (and should) not fail on it. It will therefore be reported as not covered, whereas it is actually a false positive. When using PIT, a mutation testing framework for Java, "equivalent mutations" should, according to the documentation, be minimal when using the "default" set of mutators. For instance, the "Non Void Method Call Mutator" of PIT, which causes the "int i = 0" equivalent mutation, is disabled by default.
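
A small sketch can make the equivalence concrete. It assumes, purely for illustration, that "foo" returns a value computed from "i" so that its behaviour can be observed (the original "foo" is void); the names "EquivalentMutantSketch", "original", "mutant" and "behavesIdentically" are invented here.

```java
class EquivalentMutantSketch {

    static int someNonVoidMethod() { return 0; }

    // Original: i comes from the method call (which always returns 0).
    static int original(int x) {
        int i = someNonVoidMethod();
        return x + i;
    }

    // Mutant: the non-void method call is replaced by the default value 0.
    static int mutant(int x) {
        int i = 0;
        return x + i;
    }

    // Compares original and mutant over a range of inputs.
    static boolean behavesIdentically(int from, int to) {
        for (int x = from; x <= to; x++) {
            if (original(x) != mutant(x)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("identical behaviour: " + behavesIdentically(-1000, 1000));
    }
}
```

Sampling inputs is not a proof in general, but here the equivalence holds by construction: "someNonVoidMethod" always returns 0, so no assertion on the result can tell the two versions apart.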

Conclusion

After participating in the workshop, some additional investigation and playing around with PIT, I got really enthusiastic about using "mutation testing" in the near future (starting with new components) on my current project. As opposed to traditional coverage reporting, mutation test coverage actually measures the quality of your tests and cannot be fooled the way traditional coverage reporting can. In case you also got interested:

check out this very funny presentation from Chris Rimmer about the basic concept of mutation testing.
furthermore there's an interesting article from a company called TheLadders about using the PIT mutation testing tool.
also there's an extensive article from Filip van Laenen about "mutation testing" in edition 108 of the Overload magazine.
last but not least there's the documentation on the PIT mutation testing website.