Saturday, January 25, 2014

Pillars of unit tests

What is a good test? It may seem like a trivial question; a test should flash red if something is wrong and be green otherwise. But how do we achieve that? There are some pillars, or golden rules, that should always be honoured:

Trustworthyness

Simple. If you don't trust the tests, they are not worth anything. You should be able to trust that tests 
    A) fail when they should
    B) don't fail when they shouldn't.

If a tests all of a sudden turns red and the response is "well, that test fails once in a while. It's normal", the test is not trustworthy.

What can we do to make tests more trustworthy?
  • Don't rely on things that can change in each test execution. Don't use timers or random numbers in tests. If this randomness causes failing tests from time to time, developers stop trusting them. A test that fails from time to time should be seen as an indication of an intermittent problem in the system under test, not a problem in the test.
  • Don't make assumptions on or introduce dependencies to the environment. If the tests depend on the file system, the graphics card or the phase of the moon, the tests will become fragile and fail for the wrong reason.
  • Don't overspecify the test. Have a clear vision of what the test is verifying. Don't add a bunch of asserts unless you have a very specific reason for adding them. Usually, each test should contain only one assert. Tests that are overspecified will often fail for reasons that are not related to the test itself. What happens then? Developers will stop trusting the tests!
Consider this test, which verifies that the tool class StringTools is creating the reverse of a string: 

    [Test]
    public void ReverseString_CalledWithString_ReturnsStringInReverse()
    {
      // Arrange
      string input = "abc";

      // Act
      string output = StringTools.ReverseString(input);

      // Assert
      Assert.AreEqual("cba", output);
      Assert.AreEqual(3, output.Length); // This is overspecification
    }

After verifying that the reversed string is returned, the length is also asserted upon. That's overspecification! The first assert is already doing a perfectly valid and sufficient verification. The second assert only makes the test more fragile and less maintainable because we can't change the length of the test string without also changing that assert. It adds complexity without yielding any value.

Maintainability

One of the most common pitfalls when doing TDD is test maintenance. As a system grows, it's easy to forget about the tests. As specifications change, it becomes hard to adapt the tests to the new requirements. Treat your test code with the same care as your production code!

Whenever you have finished a test, consider refactoring it. Consider whether the testee class becomes hard to test. Perhaps it's time to refactor both the test class and the testee class.

If each of the tests for a specific test class is doing multiple lines of setup, it might be a good idea to refactor this into helper methods.

As an example, consider these tests, where several methods in the "TesteeClass" class is being tested:

    [Test]
    public void SomeMethod_CalledWithNull_ReturnsFalse()
    {
      // Arrange
      var testeeObject = new TesteeClass();
      testeeObject.Initialize();
      testeeObject.SomeProperty = new SomeOtherClass();

      // Act
      bool result = testeeObject.SomeMethod();

      // Assert
      Assert.IsFalse(result);
    }

    [Test]
    public void SomeOtherMethod_Called_ReturnsTrue()
    {
      // Arrange
      var testeeObject = new TesteeClass();
      testeeObject.Initialize();
      testeeObject.SomeProperty = new SomeOtherClass();

      // Act
      bool result = testeeObject.SomeOtherMethod();

      // Assert
      Assert.IsTrue(result);
    }

You enhance readability and maintainability with some refactoring:

    [Test]
    public void SomeMethod_CalledWithNull_ReturnsFalse()
    {
      // Arrange
      var testeeObject = this.MakeTesteeObject();

      // Act
      bool result = testeeObject.SomeMethod();

      // Assert
      Assert.IsFalse(result);
    }

    [Test]
    public void SomeOtherMethod_Called_ReturnsTrue()
    {
      // Arrange
      var testeeObject = this.MakeTesteeObject();

      // Act
      bool result = testeeObject.SomeOtherMethod();

      // Assert
      Assert.IsTrue(result);
    }

    private TesteeClass MakeTesteeObject()
    {
      var testeeObject = new TesteeClass();
      testeeObject.Initialize();
      testeeObject.SomeProperty = new SomeOtherClass();

      return testeeObject;
    }

Readability

Your tests need to be readable. They are your code-level functional requirement document. If a test is hard to read, it's hard to maintain. It is also hard to figure out why it fails.

Pretend like an axe murderer is going to read your tests. He knows where you live, and he gets really mad if he can't understand your tests. You don't want to make him mad.

  • Avoid using loops and logic. It should be obvious for the reader what the test does. If a test contains loops, if's, logic and other constructs, the reader needs to think in order to understand the test.
Production code vs test code.
  • Use the smallest possible dataset. If you develop an algorithm, use the smallest possible dataset needed to verify the algorithm. This will make it easier to read the test, and execution will be faster.
  • Avoid using magic numbers. If a test contains cryptic numbers, the axe murderer will wonder whether the number has a meaning. Use the lowest possible number so that it's obvious that the number is just an arbitrary input number.
Consider these two tests, one using magic numbers and one using the lowest possible number:

    [Test]
    public void AddNumbers_CalledWithTwoNumbers_ReturnsSum()
    {
      // Arrange
      double number1 = 54254; // Does this number have a meaning??
      double number2 = 64333;

      // Act
      double sum = Calculator.AddNumbers(number1, number2);

      // Assert
      Assert.AreEqual(118587, sum);
    }

    [Test]
    public void AddNumbers_CalledWithTwoNumbers_ReturnsSum()
    {
      // Arrange
      double number1 = 1; // It's obvious -- it's just an arbitray number
      double number2 = 2;

      // Act
      double sum = Calculator.AddNumbers(number1, number2);

      // Assert
      Assert.AreEqual(3, sum); // It's easier to understand the expected result
    }

Ease of use

Although this item does not pertain to the tests themselves, it is equally important. It should be easy to run tests. They should run fast. Once it becomes a hurdle to run the tests, developers will stop running then. It also becomes hard to get into the smooth test-driven flow where you develop test and production code in parallel.

Make sure that your tests run fast, and choose a test runner that allows you to run tests easily. Visual Studio has a decent test runner if you code .Net. If you use ReSharper, you have an even better test runner.


Happy TDD'ing!

3 comments:

  1. Hi Jahn Otto,

    Interesting post, thanks!

    I have a thought on the Maintainability examples that you have given. I understand and agree with your point about refactoring the test methods to reduce duplication. I also understand that this is a small example. But it seems likely that TesteeClass is a class that's used in the production code too. It would be likely that the same (or similar) initialisation steps would also be required in the production code too. I would be tempted to try creating a Factory-type class which takes care of this initialisation for me. This factory can be created and used in a test environment or production environment. And if the properties of that object (e.g. SomeProperty) were created and assigned based on some setup parameters passed to the Factory, then extra flexibility is provided.

    Let's say that the Factory is constructed in the test setup method. The tests may then look something like this:

    public void SomeOtherMethod_Called_ReturnsTrue()
    {
    // Arrange
    var testeeObject = _testeeFactory.CreateTestee(TesteeProperties.SomeProperty); // parameter could perhaps be an enum which is used by the factory to instantiate and assign the correct property object

    // Act
    bool result = testeeObject.SomeOtherMethod();

    // Assert
    Assert.IsTrue(result);
    }

    The benefit to testing is that this reduces the test setup to a one-liner as above, but also brings up that hidden property "to the surface" of the classes that are interested in creating and using them (the callers of the factory). In my opinion this is very readable, as you can look at the test on its own and figure out the inputs to the test (the Arrange step) working without looking anywhere else. I would personally prefer to have this readability and wouldn't mind having this exact same create call in multiple tests. What do you think?

    Thanks again and I look forward to reading your next post!

    Neil :)

    ReplyDelete
    Replies
    1. A good point indeed, Neil!

      If the initialization is the same for production environment and test environment, a factory method or another setup mechanism would be a good refactoring measure. This factory method would of course have it's own unit test! ;-)

      This not always the case, however. Quite occasionally the setup of an object in a test is different than in the production environment. If the Testee class holds a reference to a networking interface, for example, you'd want to set it up with a mock network interface in the test environment.

      By the way, the Testee class is actually the class being tested in this example. I realize that this is not obvious in the post, so I will clarify it in the text.

      Delete
  2. Thanks for the reply! Yes I see what you mean, and I absolutely agree about the benefit of extracting methods. However, I haven't found myself using many helper methods called from test methods in my test classes - although I am by no means an expert. In the case of creating a testee class with mocked dependencies, I think I would do that work that in the setup method of my test class so that all of the unit tests in that class start from the same "baseline". I may then extract those statements into a method, but still avoid having it called from the individual test methods themselves.

    I think the point I'm trying to make is that having multiple helper methods in a unit test class, some of which are called from some test methods and some from others, may decrease my understanding of the "focus" of that unit test class. I'm not trying to be annoying, I promise!

    Thanks for the discussion :)

    ReplyDelete