Ever since we stopped building applications to run on specific machines, there’s been a need for configuration that allows an application to operate in its environment. This leads to an inevitable question: what do we configure? (It also leads to "how do we configure this?", but that tends to be a platform-specific question.) In practice what we tend to find is that either almost everything is hardcoded (and nothing can be changed), or just about everything can be configured (and the size of the configuration can outweigh the code that uses it). It’s not unusual to see both of these cases in different areas of the same system.

The disadvantages of no configuration are obvious: nothing can be changed without recompiling the application. However, there are benefits to not providing configuration, including a less complicated deployment with less to potentially go wrong. This advantage should not be underestimated: deployments can be complicated and risky, and reducing the potential for serious mistakes is a significant benefit.

Being able to configure the application has obvious benefits: it allows the application to be customised to its environment and allows its behaviour to be altered without having to produce a new version. The disadvantages are less obvious but are all about complexity. Every time you make something configurable you increase the complexity of your software by requiring it to deal with the possible values of the configuration. Moreover, you make deployment more complex, as there are more possible failure points due to incompatible configuration values that will not be encountered until runtime.

Additionally, as you push logic into configuration you dramatically increase the fragility of your application at deployment. Logic expressed in configuration is more difficult to test, and may require alteration when being transferred into the production environment. This is a source of errors, both from flawed transfer and from the risk that what is tested is not the same as what is ultimately used.

My favourite example of determining the scope of configuration is dependency injection. One advantage of dependency injection is that it allows you to reconfigure your system dynamically. Want to change the database you use? Want to use a proxy to access a remote instance of a service instead of using a local instance? Drop in a new assembly and update the configuration to use it instead of the previous implementation. Instant changeover without a full redeploy. What could be more useful?

As it turns out, many things. Switching these functions can be a useful capability, but for most systems it is rarely used or entirely unnecessary. To gain such a capability at the cost of having to specify the entire application structure, or even all the application components (then taking advantage of auto-wiring), seems to be an undesirable trade-off. I’ve maintained such configurations on a number of applications. They tend to be verbose, fragile and repetitious, as well as a significant source of deployment issues. It can also be more difficult to express complex or dynamic behaviours in configuration than it would be in code. Being able to replace hundreds of lines of XML with a couple of lines of code is a significant improvement for registering standard types in an application (for instance see StructureMap’s Auto Registration and Type Scanning).
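To make the idea of registering types by rule rather than by configuration concrete, here is a minimal sketch in Python. It is not StructureMap's API. The `Service` marker class, the `scan` function and the example repository and log types are all hypothetical names invented for this illustration. The assumed rule is: every abstract class deriving from `Service` is an interface, and its single concrete subclass is the implementation to register.

```python
import inspect
from abc import ABC, abstractmethod


class Service(ABC):
    """Hypothetical marker base: anything deriving from it is discoverable."""


class CustomerRepository(Service):
    @abstractmethod
    def find(self, customer_id: int) -> str: ...


class AuditLog(Service):
    @abstractmethod
    def record(self, message: str) -> None: ...


class SqlCustomerRepository(CustomerRepository):
    def find(self, customer_id: int) -> str:
        return f"customer-{customer_id}"


class FileAuditLog(AuditLog):
    def __init__(self) -> None:
        self.lines = []

    def record(self, message: str) -> None:
        self.lines.append(message)


def scan(marker: type = Service) -> dict:
    """Build an interface -> implementation map by examining the types."""
    registry = {}
    for interface in marker.__subclasses__():
        if not inspect.isabstract(interface):
            continue  # only abstract classes count as interfaces
        concrete = [c for c in interface.__subclasses__()
                    if not inspect.isabstract(c)]
        if len(concrete) != 1:
            # Fail at startup rather than guessing between candidates.
            raise LookupError(
                f"{interface.__name__}: expected exactly one implementation, "
                f"found {len(concrete)}")
        registry[interface] = concrete[0]
    return registry


# The whole "configuration" for these types is one call.
registry = scan()
repo = registry[CustomerRepository]()
print(repo.find(42))  # prints "customer-42"
```

Adding a new interface and implementation pair needs no registration entry at all, which is the point: the structure is derived from the types instead of being restated in a configuration file.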

When looking at these issues what we see is that we are applying the requirements of a subset of the system to the entirety of the system, where these requirements are not generally applicable. In most cases we can determine the required structure of the system by examining the types that compose it and applying a set of rules. This structure is unlikely to ever change between deployments and can be encoded into the application itself. We would then only need to specify explicitly those subsets of the system that may need to change between deployments. These specifications are likely to be simpler than if we specify every component, which makes them easier to manage and less prone to error. This is the difference between configuring which data access assembly to load and configuring every data access component in the assembly individually. The former is not prone to errors where some data access is using the wrong component, because the assembly is configured as a coherent whole.
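The difference between selecting a data access implementation as a whole and configuring each component separately can be sketched as follows. This is an illustration under assumptions, not a prescribed design: the `BUNDLES` table stands in for a set of data access assemblies, and the `data_access` key, the `DataAccess` class and the component classes are hypothetical names.

```python
from dataclasses import dataclass


class SqlCustomers:
    backend = "sql"


class SqlOrders:
    backend = "sql"


class InMemoryCustomers:
    backend = "memory"


class InMemoryOrders:
    backend = "memory"


@dataclass(frozen=True)
class DataAccess:
    """One coherent set of data access components."""
    customers: object
    orders: object


# Each entry stands in for one data access assembly. The components inside
# an entry are fixed in code, so they cannot be mixed by a configuration error.
BUNDLES = {
    "sql": lambda: DataAccess(SqlCustomers(), SqlOrders()),
    "memory": lambda: DataAccess(InMemoryCustomers(), InMemoryOrders()),
}


def load_data_access(config: dict) -> DataAccess:
    """Select the whole bundle from a single configuration value."""
    name = config["data_access"]
    if name not in BUNDLES:
        raise ValueError(
            f"unknown data_access {name!r}; expected one of {sorted(BUNDLES)}")
    return BUNDLES[name]()


data = load_data_access({"data_access": "memory"})
print(data.customers.backend, data.orders.backend)  # prints "memory memory"
```

With one key there is only one thing to get wrong at deployment, and a wrong value fails immediately with a clear message. The per-component alternative would need one entry per component and could silently pair a SQL customer store with an in-memory order store.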

What we take from this is that the configuration should only contain that which we genuinely need to specify and should not contain things that we can derive automatically (through reflection and other sources). This also applies to other settings where a reasonable default can be assumed. Applications then need to set a configuration value only if they have a genuine need to alter it from its default, which reduces configuration complexity and increases the configuration's relevance to the particular deployment.
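As a small illustration of the defaults idea, the sketch below keeps reasonable defaults in code and lets a deployment supply only the values it needs to change. The setting names and the `effective_settings` function are hypothetical, chosen only for the example.

```python
# Defaults live in code; every setting here is an assumed example.
DEFAULTS = {
    "port": 8080,
    "timeout_seconds": 30,
    "log_level": "INFO",
}


def effective_settings(overrides: dict) -> dict:
    """Merge deployment overrides over the defaults, rejecting unknown keys."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        # A misspelt key should fail at startup, not be silently ignored.
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


# This deployment's configuration is a single line: only what differs.
settings = effective_settings({"log_level": "DEBUG"})
print(settings["log_level"], settings["port"])  # prints "DEBUG 8080"
```

The deployment's configuration then reads as a list of deliberate decisions rather than a restatement of every value the application knows about.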