Queries are not reusable

A database query exists to in a given context take data that matches specific criteria from a given data structure and return it in a desired format. The exact query that is required may be the same in different contexts yet it is not a good development practice to share the query between those contexts. Doing so causes unnecessary coupling between them. When one case changes all others will be affected, even if the change is only a performance impact. This is undesirable.

There are two possible causes for wanting to do the same query in multiple places. You can have separate requirements that just co-incidentally happen to have the same query requirement. Co-incidental similarity is a terrible reason to reuse things. If there's no reason needs are the same any change to the requirements may cause them to bifurcate. At this point you will have to identify all users of a query and evaluate the impact. In practice for small changes this is often overlooked, particularly for small changes. However a series of small changes can add up and soon you have a single query serving multiple very different purposes, all of them badly.

The other cause is that you are trying to do the same thing in multiple places. This indicates that your system is badly structured. The appropriate solution here is to rework the system so that it only performs any function in one place. Duplicating logic that is meant to do the same thing in multiple places is a great way to introduce inconsistent behaviour into your system.

Note that there is a significant difference here between consolidating the handling of business logic that needs to be consistent and consolidating retrieval of data that is only co-incidentally the same at a point in time. Implementing business logic is a core function of the system. It is unlikely that variances in doing so are desirable. Querying data is something we need to do to operate the system but is ultimately not the point of the system. There is no need for multiple parts of the system to query data the same way.

Many queries will still have a level of coupling because they work against the same database schema. In an ideal world there would be a decent query database which was structured to support the queries the system required and therefore this coupling would not be a factor. Unfortunately there are a lot of databases out there that are not structured in this fashion that we still need to do work against. However the existence of undesirable coupling at one level is no excuse for having coupling infect your entire stack. Reuse of queries adds additional coupling in this scenario. You are still making things worse by doing it. You will not be able to entirely remove coupling if you must use a shared schema but you can significantly reduce it by keeping queries isolated to their context.

If you have a complex shared schema it's likely that you will need certain query fragments (such as specific methods of joining or particular restrictions on queries) to be applied to multiple queries. There is some justification for making these fragments reusable in some fashion (depending on your data access method). This is increasing the level of abstraction from which you are constructing your query which, when done appropriately, can be highly effective. What this is not doing is reusing the queries themselves which remain specific to each circumstance.

Whether this is actually justified is also debatable. Complex schemas tend to be fixed. In theory you can refactor them but in practice they're sufficiently complex that the business justification for doing so is debatable. (That this leads to forcing data into inappropriate structures and unnecessarily complex code to deal with this is generally overlooked). As such unless your common query element is complex or very widely used it's likely better to just take the hit of replicating it between queries. This is particularly true when a system grows but less so for mature systems where the queries are well understood. (Don't build new systems that need this though, that's just bad practice).

Colin Scott

Read more posts by this author.