When practicing TDD, it is reasonable to be concerned about the size of your test code. After all, test code must be maintained, and maintenance effort correlates with code size. Excessive test code is therefore costly. What, then, should drive how much test code we have? In my opinion, there is one primary question in determining the amount of test code required:

How much test code is required to properly exercise the system?

Unit tests exist to specify and validate the behaviour of the system (as units). You have enough test code when the desired system behaviour is fully specified by the tests. Similarly, you have enough system code when all your tests pass. The other metrics I’ve encountered are prescriptive without being rational:

  • Suggesting some kind of ratio between test code and system code ignores the widely varying overhead of testing different types of code. Units vary widely in their setup requirements and in the number of conditions that need testing. This means that, for a given unit size, the amount of code required to test it effectively may vary by orders of magnitude. Guessing an appropriate multiple is therefore not helpful in general, although it may be indicative in narrower cases.
  • Suggesting a test per method is, in my opinion, a complete failure of unit testing strategy. It makes no allowance for the multiple paths through a method that should be tested, nor for behaviour that differs with varying inputs or system state.
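
To make the second point concrete, here is a minimal sketch in Python using the standard unittest module. The function shipping_cost and its pricing rules are hypothetical, invented purely for illustration. It is a single small method, yet it has four distinct behaviours, so specifying it takes four tests rather than one.

```python
import unittest


def shipping_cost(weight_kg: float, express: bool = False) -> float:
    """Hypothetical unit under test: one method, several behaviours."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    # Flat rate up to 1 kg, then a per-kg charge for the remainder.
    cost = 5.0 if weight_kg <= 1 else 5.0 + 2.0 * (weight_kg - 1)
    # Express delivery doubles whatever the standard cost is.
    return cost * 2 if express else cost


class ShippingCostTest(unittest.TestCase):
    # One test per specified behaviour, not one test per method.

    def test_rejects_non_positive_weight(self):
        with self.assertRaises(ValueError):
            shipping_cost(0)

    def test_flat_rate_up_to_one_kg(self):
        self.assertEqual(shipping_cost(1), 5.0)

    def test_charges_per_kg_above_one_kg(self):
        self.assertEqual(shipping_cost(3), 9.0)

    def test_express_doubles_the_cost(self):
        self.assertEqual(shipping_cost(3, express=True), 18.0)
```

Saved to a file, these tests can be run with python -m unittest. Note that the test code is already longer than the code it specifies, which is also why a fixed ratio between the two tells you very little.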

A proportionally large test codebase may be a sign of one of two conditions. Firstly, you may have badly factored tests, in which case targeted refactoring of your tests can be beneficial. Secondly, it may be a sign that the behaviour of your software is complicated and takes significant code to express. This doesn’t mean you may ignore the maintainability of the test code, but neither is a large test codebase, in itself, a reason to prune it.
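
As one possible illustration of the first condition, near-identical test methods can often be folded into a table of cases. The sketch below assumes Python's unittest and uses a hypothetical is_leap_year function as the unit under test. Each row of the table is still one specified behaviour, so nothing is pruned; only the repetition goes away.

```python
import unittest


def is_leap_year(year: int) -> bool:
    """Hypothetical unit under test."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class LeapYearTest(unittest.TestCase):
    # Each row is one specified behaviour: (year, expected, reason).
    CASES = [
        (2023, False, "not divisible by 4"),
        (2024, True, "divisible by 4"),
        (1900, False, "century not divisible by 400"),
        (2000, True, "century divisible by 400"),
    ]

    def test_leap_year_rules(self):
        for year, expected, reason in self.CASES:
            # subTest reports each failing row separately instead of
            # stopping at the first failure.
            with self.subTest(year=year, reason=reason):
                self.assertEqual(is_leap_year(year), expected)
```

Whether this reads better than four separate methods is a judgment call. It tends to pay off when the cases differ only in their data, and less so when each case needs its own setup.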

There are of course other concerns, especially where the combined code and test effort becomes extreme. This is commonly the case with UI code, where, depending on your application, it may simply not be cost-effective to attempt automated testing. The ROI on test effort diminishes sharply in these cases, which is not true of a large but reasonably well-written test codebase.