Never Mix Public and Private Unit Tests! (Decoupling Tests from Implementation Details)

October 20, 2011

It seems to me that developers often do not really distinguish between the various types of unit tests they should be writing and thus mix things that should not be mixed, which leads to test code that is difficult to maintain and hard to evolve. The dimension of unit test categorization that I find especially important here is the level of coupling to the unit under test and its internals. We should be constantly aware of what kind of test we are writing, what we are trying to achieve with it, and thus which means are justifiable or, on the contrary, unsuitable for that kind of test. In other words, we should always know whether we're writing a Public Test or a Private Test and never ever mix the two. Why not, and what actually are these two kinds of tests? Read on! (And whether you agree or disagree, don't hesitate to share your opinion.)

Motivation

When writing unit tests, we would like them to be as focused as possible so that they are easier to write (i.e. less context to set up, more direct verification), easier to understand (because they go directly to the point), and, when they fail, make it easier to pinpoint the exact source of the failure. That means that we would like to test individual methods directly, without having to deal too much with other parts of the class. As methods usually depend on the state of their object, we often need to configure that state, preferably in the simplest and most direct way possible, most likely by setting the object's fields directly. The price we pay for this focus and effectiveness is tight coupling and thus more complicated refactoring.

At the same time we would like to keep our tests as independent as possible from the actual implementation so that they are easier to maintain over a long period of time while the implementation keeps evolving in response to changing business requirements.

How do we deal with these two conflicting needs for focus and independence? I suggest that the answer is to use a different type of unit test for each of these concerns.

Public Unit Test (unit = class)

The primary intent of a Public Unit Test is to verify the contract between the class and the rest of the system. This contract expresses the raison d'être of the class, its purpose in the system. The reason why a class exists, together with the core responsibilities that follow from it, is very unlikely to change - and if it does, then the best solution is to delete the class and create a new one, aligned with the new purpose, for no class can survive a change to something so essential to its design.

A test that verifies these core responsibilities should be as decoupled as possible from their actual implementation so that the class can freely evolve within the bounds of its purpose and contract. The test should be a specification of what the class does without any interest in how it achieves it. Thus it can stay relevant throughout the whole life of the class while the internals of the implementation keep evolving.

It is essential to have these tests because they assure us that the system is still working as intended even as it undergoes refactorings and its functionality is extended and adjusted.

To implement a Public Unit Test we should, as far as possible, use only the public methods of the class because they represent its contract.

The disadvantage of Public Tests is that they may be too high-level (for they only access the public interface) and thus not suitable for test-driving the implementation. It's often much easier to test smaller units such as individual methods in isolation than a whole object. That's why we also need Private/Helper Unit Tests.

Public Unit Tests are also known as round trip tests and they embody the testing principle Use the Front Door First.

To make it feasible to implement such a decoupled and change-resistant test, multiple well-known principles should be followed, such as Separation of Concerns, publication of only the smallest reasonable interface (=> fewer changes), and data encapsulation (e.g. a parameter object is more resilient than a bunch of primitive parameters). It should also be mentioned that, contrary to what I might seem to imply, the test isn't completely static. Small, non-destructive changes to the contract that do not change its semantics (mostly changes to the details of its representation, for example how we represent input parameters) and the corresponding small changes to the test are acceptable, especially if they are something that an automated refactoring can do.
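
To make the remark about parameter objects concrete, here is a minimal sketch. The names PageRequest and Catalog are hypothetical and exist only for this illustration. The idea is that a test which builds a request object keeps compiling when the request later gains another attribute, whereas a test calling a method with a list of primitives would have to change.

```java
import java.util.List;

// Hypothetical parameter object (not from the article): callers and tests
// depend on this one type instead of on a list of primitive arguments.
final class PageRequest {
    private final int page;
    private final int size;

    private PageRequest(int page, int size) {
        this.page = page;
        this.size = size;
    }

    static PageRequest firstPage() {
        return new PageRequest(0, 20); // the default size is an arbitrary choice
    }

    PageRequest withPage(int newPage) {
        return new PageRequest(newPage, size);
    }

    PageRequest withSize(int newSize) {
        return new PageRequest(page, newSize);
    }

    int offset() {
        return page * size;
    }

    int size() {
        return size;
    }
}

// Hypothetical class under test.
final class Catalog {
    private final List<String> items;

    Catalog(List<String> items) {
        this.items = items;
    }

    // The fragile alternative would be list(int page, int size): adding, say,
    // a sort order would change the signature and every test that calls it.
    List<String> list(PageRequest request) {
        int from = Math.min(request.offset(), items.size());
        int to = Math.min(from + request.size(), items.size());
        return items.subList(from, to);
    }
}
```

A test would call something like catalog.list(PageRequest.firstPage().withSize(2)). If a sort order were later added to PageRequest with a sensible default, that call would stay untouched.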

Example:


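import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class ArrayBasedStackTest {
   ...
   // Contract: elements come back in the reverse order of insertion (LIFO)
   @Test public void pop_returns_pushed_in_reverse_order() {
      stack.push(1);
      stack.push(2);

      assertEquals(2, stack.pop());
      assertEquals(1, stack.pop());
   }
}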

This is a typical public test - it uses only public methods to set up the state. When it fails, we often cannot tell why: which of the four calls did not do what we expected, and what exactly went wrong? On the other hand, if we replace the internal array used to implement the stack with a linked list, it will have absolutely no impact on this test, which will verify the new implementation equally well.
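
The claim that the test survives a switch from an array to a linked list can be sketched as follows. The interface IntStack and both implementations are hypothetical stand-ins, since the article does not show the real ArrayBasedStack. To stay self-contained, the check is a plain static method rather than a JUnit test.

```java
import java.util.Arrays;

// Hypothetical minimal contract of the stack.
interface IntStack {
    void push(int value);
    int pop();
}

// Array-backed variant (illustrative only).
class ArrayIntStack implements IntStack {
    private int[] content = new int[4];
    private int size = 0;

    public void push(int value) {
        if (size == content.length) {
            content = Arrays.copyOf(content, content.length * 2);
        }
        content[size++] = value;
    }

    public int pop() {
        if (size == 0) {
            throw new IllegalStateException("empty stack");
        }
        return content[--size];
    }
}

// Linked variant: completely different internals, same contract.
class LinkedIntStack implements IntStack {
    private static final class Node {
        final int value;
        final Node next;

        Node(int value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    private Node top;

    public void push(int value) {
        top = new Node(value, top);
    }

    public int pop() {
        if (top == null) {
            throw new IllegalStateException("empty stack");
        }
        int value = top.value;
        top = top.next;
        return value;
    }
}

final class StackContract {
    // Public-test-style check: it touches only push() and pop().
    static boolean popsInReverseOrder(IntStack stack) {
        stack.push(1);
        stack.push(2);
        return stack.pop() == 2 && stack.pop() == 1;
    }
}
```

Calling StackContract.popsInReverseOrder with either implementation exercises exactly the same code in the check, which is the property that makes a Public Test resilient to such a change.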

Private (Helper) Unit Test (unit = method)

The primary purpose of a Private (or Helper) Unit Test is to verify small pieces of behavior of a class in isolation from the rest of the class to help the programmer gain confidence that her implementation is correct with respect to her intentions. Those pieces of code are usually not visible or really important to the outside users of the code; they are private details of the implementation - that's why I call the tests that verify them Private Tests. The main benefit of these tests is that they are tightly focused, which makes them easier to write and understand and thus also a very good fit for TDD.

The drawback of Private Unit Tests is that the need to isolate the piece of code being tested usually requires an intimate knowledge of the internals of the class and the pre-configuration and post-verification of its internal state. Thus they are very brittle and tend to break even for a moderate change of the implementation, even if the change is a refactoring that preserves the public contract. So if you need to do some non-trivial refactoring of the implementation then these tests not only fail to serve as the safety net that tests should provide (they fail even though the class still fulfills its contract) but also complicate the refactoring because they have to be updated afterwards.

If you keep Public and Private Tests separated then you are free to keep the Private Tests only as long as their value outweighs their cost and to throw them away once they become too much of a burden. You can afford to delete them because you know that the contract of the class - which is the thing that really matters in the long run - is covered by the Public Tests. Depending on your situation, you may throw a Private Test away as soon as the functionality in question is developed (in which case it only serves to drive the development [even Kent Beck does it sometimes - see episode 3]) or you can keep it until the first larger refactoring that changes the internals of the class in a way the tightly coupled test can't survive.

If you are reluctant to delete tests then you should realize that tests are not only an asset but also a liability because they need to be created and maintained. And the economics of software development forces us to remove things whose long-term value is lower than their cost.

Example:


import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class ArrayBasedStackPrivateTest {
   ...
   @Test public void test_growIfNecessary() {
      ArrayBasedStack zeroSizedStack = new ArrayBasedStack(0);
      // it has the package-private fields int[] content; int topIdx = 0

      // Precondition: an empty stack with no capacity at all
      int originalStackSize = zeroSizedStack.content.length;
      assertEquals(0, originalStackSize);
      assertEquals(0, zeroSizedStack.topIdx);

      zeroSizedStack.growIfNecessary();

      assertTrue("The stack hasn't grown; size: " + originalStackSize,
            zeroSizedStack.content.length > originalStackSize);
   }
}

This private test helps me to verify a "private" method of the stack without having to deal with its pop(). It doesn't need to set up any private state (as the constructor does it for me) but it uses its privileged access to check the "private" properties representing the state, thus providing me with much better insight into a potential failure. (I had actually written a public unit test first, but it failed because I was growing the stack by doubling its size - which doesn't really work with zero. Just writing the assertions in this private test helped me to realize that.) The point about public vs. private unit tests and access to internal state may not be so compelling in this simple case but you can certainly think of a real case from your experience where it would be more evident.
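
The doubling-from-zero problem mentioned above is easy to reproduce. The class below is a hypothetical reconstruction, not the original ArrayBasedStack: only the names content, topIdx and growIfNecessary come from the test. Both growByDoubling and the Math.max guard are assumptions made for illustration, not the fix the original code actually used.

```java
import java.util.Arrays;

// Hypothetical reconstruction of the relevant part of the stack.
class GrowableStack {
    int[] content;   // package-private so that a Private Test can inspect it
    int topIdx = 0;  // index of the next free slot

    GrowableStack(int initialCapacity) {
        content = new int[initialCapacity];
    }

    // Naive growth: 0 * 2 is still 0, so a zero-sized stack never grows.
    void growByDoubling() {
        if (topIdx == content.length) {
            content = Arrays.copyOf(content, content.length * 2);
        }
    }

    // One possible fix (an assumption): always end up with at least one slot.
    void growIfNecessary() {
        if (topIdx == content.length) {
            content = Arrays.copyOf(content, Math.max(1, content.length * 2));
        }
    }
}
```

With a zero-sized stack, growByDoubling leaves content.length at 0 while growIfNecessary raises it to 1. An assertion on content.length reports this directly, whereas a public test would only see a failing push().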

Note on Terminology

Summary: contract <=> public, implementation details <=> private, unit <=> isolated

The terms "public" and "private" as I use them here reflect the conceptual distinction between the contract a class has with the rest of the system - which is thus its "public" API - and the details of its implementation, which are "private" to the class in the sense of OOP's good old encapsulation principle. They are not directly related to the keywords "public" and "private" as used in Java, though the contract is usually represented by public methods while implementation details are often hidden in private methods (though to make them testable it's usually best to make them package-private unless you write tests in Groovy).

The term "unit" is used rather freely. According to a recent tweet by Kent Beck, you can recognize a unit test by the fact that if it fails then you know exactly what the problem is and which part of the code, perhaps even which line, to check. If a test fails and you can't tell why then it isn't a unit test. It follows that a lot of method-level tests are not really unit tests in this strict sense, and Public Tests especially tend to actually be low-level integration tests where the unit of integration is the class (and its non-public methods). The size of the "unit" being tested is another of the test categorization dimensions. In a less strict sense we could say that a test is a kind of unit test (of a particular level) if it checks the unit in isolation. In this isolation-based sense I've drawn an approximate equality between a Public Test and a class (for its purpose is to verify the contract of the class) and between a Private Test and a method. A test method in a Public Test will usually call one or more public methods on its target object and occasionally on some other objects to verify its indirect (yet "public") outputs. A test method in a Private Test will use "private" fields and setters to configure the target object, execute the non-public method under test, and check its output and perhaps some (public or non-public) fields/getters.

Summary

My motivation for this article was my work on a system where the past developers did not distinguish between private and public tests and mixed their code together, thus completely negating the main benefit of Public Unit Tests, that is their resilience to changes in the tested class. Even correct refactorings had the potential to cause many tests to fail due to their overdependency on implementation details (so-called overcoupling).

That's why I wanted to make a clear distinction between resilient Public Tests that support software evolution and throw-away Private Tests that support developers in writing correct implementations.

The Public Test and the Private Test are the ideal opposite ends of a scale. In reality you will always be somewhere between the two - but you should always be aware of where you're trying to be and make every reasonable effort to get closer to it. If your Public Tests cannot be implemented without relying on implementation details, try to follow the OOP principle of encapsulating what is likely to change so that when it changes, there will be only one place to update.
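
One way to apply this encapsulation advice in test code is sketched below. BoundedStack and StackFixtures are hypothetical names used only for this sketch. The idea is that a single fixture helper is the only piece of test code that touches the internal fields, so a change to the internals requires updating one method rather than every test.

```java
// Hypothetical class under test with package-private internals.
class BoundedStack {
    int[] content;
    int topIdx = 0;

    BoundedStack(int capacity) {
        content = new int[capacity];
    }

    boolean isFull() {
        return topIdx == content.length;
    }

    void push(int value) {
        if (isFull()) {
            throw new IllegalStateException("stack is full");
        }
        content[topIdx++] = value;
    }
}

// The only place in the test code that knows about content and topIdx.
final class StackFixtures {
    static BoundedStack fullStackOfCapacity(int capacity) {
        BoundedStack stack = new BoundedStack(capacity);
        stack.topIdx = capacity; // shortcut through the internals instead of pushing
        return stack;
    }
}
```

Tests then ask for StackFixtures.fullStackOfCapacity(3) and verify behavior through isFull() and push() only. If the array is later replaced by another structure, only the fixture needs rewriting.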

If you should remember only a single sentence from this article, it should be this: You should primarily strive for having loosely coupled Public Tests verifying the core contracts of your classes to keep your system evolvable.

Update 8/11: A recent comment at DZone shows that I haven't made myself clear enough. Therefore I'd like to stress that Public and Private Unit Tests are my own terms and have nothing to do with white box ~ unit and black box ~ integration testing. Both Private and Public U.T. test a single class in isolation - they only differ in how dependent they are on the inner organization of the class. Public U.T. check public methods only and are thus usually more coarse-grained, while Private U.T. test non-public methods, usually as much in isolation as possible, even if that requires knowing and modifying the internal, private state of the object under test.

Resources

Book xUnit Test Patterns: Refactoring Test Code (2007)

Tags: testing opinion