Thursday, June 26, 2014

Help, my class is turning into an untestable monster!

SO, let's say that you have a nicely designed and unit tested class. Perhaps it's a MVVM dialog model which takes care of commands and displaying stuff. In the beginning, this class has a single list of selectable objects, a button which adds a new object and a button which removes objects:

Testing this is simple, isn't it? Create a test which verifies that the list has the relevant objects. Write another test which adds a new object and a third test which removes the selected object. Life is easy!

Now your client is so happy with the dialog, he has some additional requests. He wants to add some complex filtering so that only objects that meet certain criteria are visible. He also wants the list to be shown in a hierarchical tree instead of a flat list. Of course he also wants to define how to group them. Oh, and, based on the user's privileges, the remove functionality may or may not be enabled:

Now we have thrown a lot more logic into the dialog. There are multiple cases which yield a growing number of permutations -- how can we make sure that we cover all of those scenarios in tests? The user may or may not filter, he may or may not group into categories, etc. Your class is starting to get hard to test. It's turning into an untestable monster.

You will see this in your code. You will see it a lot. Well, it's a sign!

It's a sign

Testable code which turns into untestable code usually means that the code has a more fundamental problem. As a class grows, it is getting more and more responsibilities. More responsibilities means more scenarios to cover, and more complexity. The class is starting to violate The Single Responsibility Principle.

Regardless of whether we do TDD or not, this is a bad thing. It's time to divide and conquer -- split your complex multi-responsibility class into smaller, simpler single-responsibility classes. Those classes will be easier to test, and you will avoid all the complex test scenario permutations.

Divide and conquer

In our view model example, the view model has at least these responsibilities:
  • Do filtering of items based on some criteria
  • Organize the items into a hierarchy based on user selection
  • Present the hierarchy as a tree view
  • Allow user to select items
  • Add and remove items
  • Disable features based on user privileges
Obviously, this class is violating the Single Responsibility Principle! Hence, the root of the problem is not the testability as such -- it's the class itself. The increasing complexity of the tests has revealed that the production code is a candidate for refactoring.

In this case, I would split the class up into a filter class, a hierarchy organizer class, user interaction handler, etc. This is a good idea anyway from a software quality perspective.


As classes grow with an increasing number of responsibilites, testing becomes harder. Don't get tempted to skip the tests. Instead, refactor the production code class so that it becomes testable AND well-designed.

TDD is not only a first line of defense against new defects. It's also a very efficient design tool which forces you to keep your classes tidy and adhere to the SOLID principles.

Poor testability is a sign. Embrace the sign. Refactor your code today!

Wednesday, June 11, 2014

Automated testing of rendering code

In my blog post "TDD and 3D visualization", I wrote about a somewhat complex scenario: how to do TDD on 3D visualization. That blog post focused on visualization using high-level toolkits like Open Inventor or VTK. In that case, we don't test the rendered result. Instead, we test that the state of the scene graph is correct, and we trust the 3D toolkit to do the rendering correctly for us.

What if we write the rendering code ourselves? This may be the case if we write our own ray tracer or GPU shaders. In that case, we actually need to verify that the rendered pixels are correct.

Bitmap-based reference tests

TDD-style test-first development is not easy to do in this case, and it may not even be possible. The test result itself is much more complex than a trivial number or string result. How can we write a test that validates a complex image without having the rendered image up front?

In this case, it may be more convenient to write the rendering code first and then use a rendered image as a reference image for the test. We will loose some of the benefits of doing proper TDD, but those tests will still act as regression tests that verify that future enhancements do not introduce defects.

The production code and tests will thus be written like this:

  1. Write production code that renders to a bitmap
  2. Verify the reference image manually
  3. Write a test that compares future rendering results with this reference image
  4. Fail the test if they differ

Allow for slight differences

This may sound fairly trivial, but reference image-based tests have one major challenge: the rendered images may differ slightly because of different graphics cards and different graphics card drivers:

For example, different drivers may apply shading differently so that the colors vary between driver versions. Furthermore, antialiasing may offset the image by one pixel in either direction, and the edges may be rendered differently. We certainly don't want the tests to fail because of this, because then we stop trusting the tests.

Hence, we need to allow for small differences between the rendering result and the reference image. More specifically, we need to allow for
  • Single pixel offset
  • Small differences in color or intensity
I have been using this approach with good success:
  1. Render image
  2. For each pixel in the rendered image, find the pixel in the 3x3 neighbouring pixels in the reference image with the lowest deviation in RGB pixel value. Add this deviation to a list.
  3. Create a histogram of all the pixel deviations so that you can calculate the distribution of the errors.
  4. Decide on an error threshold for acceptable differences. For example, say that
    • A maximum of 0.1% of the pixels can have a larger RGB deviation than 50
    • A maximum of 2% of the pixels can have a larger RGB deviation than 10
    • A maximum of 20% of the pixels can have a larger RGB deviation than 3
You should start with a fairly strict tolerance. If you find that you get too many false positives, increase the tolerance slightly.

By defining a deviation distribution tolerance like this, you will allow for small variations while still catching rendering defects that cause rendering errors.

Render to bitmap, not to screen

If possible, render to an offscreen buffer in the tests. This is more robust than rendering to a window and then doing a screenshot, because the tests will not be obstructed by other windows, screensavers, locked computer, etc. This might be a good idea architectural idea anyway, as it separates rendering from display.