Way back in the dark ages the world wide web was created as a means of sharing static content. For various good reasons HTTP was made a stateless protocol. This conserved server resources as a client only consumed scarce resources (such as memory and network connections) during the processing of the request. As the requests are bursty this allowed many more clients to use a server than in a more traditional network application where clients have persistent connections and tie up resources for long periods.
In the beginning this was fine, but as people started to embed dynamic functionality into their websites the lack of state became a serious limitation. Hence the ability to hold state was added. The mechanism used to do this is cookies, which allow a web server to give some state to the client that will then be handed back on every request. While not particularly elegant this worked, was backwards compatible and is now ubiquitous*.
Cookies are all very well but they're fairly low level. Almost all web frameworks implement more sophisticated state management on top of cookies. This is generally termed Session. Specific implementation details vary but session typically allows the storage of arbitrary data in a hash structure. Sessions are tracked by a cookie assigned to the client by the framework. This cookie is assigned at the start of the session (such as when a cookieless request is made to the server). The session then persists on the server side for some period and is available to all requests that supply the correct cookie.
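To make the mechanism concrete, here is a minimal sketch of what a framework does behind the scenes. It is not any particular framework's implementation: the cookie name, the timeout and the load_session function are all made up for illustration, and the store is a plain in-memory dictionary.

```python
import secrets
import time
from typing import Dict, Optional, Tuple

SESSION_COOKIE = "session_id"   # hypothetical cookie name
SESSION_TTL_SECONDS = 30 * 60   # hypothetical inactivity timeout

# Server-side store: session id -> (time of last access, session data hash)
_sessions: Dict[str, Tuple[float, dict]] = {}


def load_session(cookies: Dict[str, str],
                 now: Optional[float] = None) -> Tuple[str, dict]:
    """Return (session_id, data) for a request.

    A new session is started when the cookie is missing, unknown,
    or has been idle for longer than the timeout.
    """
    now = time.time() if now is None else now
    session_id = cookies.get(SESSION_COOKIE)
    entry = _sessions.get(session_id) if session_id else None

    if entry is None or now - entry[0] > SESSION_TTL_SECONDS:
        # Discard any stale entry and issue a fresh, unguessable id.
        _sessions.pop(session_id, None)
        session_id = secrets.token_urlsafe(32)
        entry = (now, {})

    # Record the access so the inactivity timer restarts.
    _sessions[session_id] = (now, entry[1])
    return session_id, entry[1]
```

The returned id is what would be sent back to the browser in a Set-Cookie header. Everything else stays on the server.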
This is very convenient but unfortunately has a number of drawbacks. These aren't necessarily immediately apparent, especially in a development environment. This leads to problems being discovered in production, where they can be very expensive to fix.
A unique session will be allocated to each browser instance. Some classes of concurrency issues can be avoided by only processing one page per session at a time. If your framework does this you don't need to worry about locking of the session. However a cookie, and therefore a session, is associated with a browser instance and not a window or tab. If a user opens multiple tabs or pages and performs separate actions in parallel that operate on the same shared state then there is the strong possibility that the actions will become corrupted and fail.
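The multiple tab problem is easy to reproduce on paper. The sketch below uses invented names (start_edit, submit_edit, an "editing_order_id" key) and a plain dictionary standing in for the session. Note that nothing here runs concurrently. The two tabs simply interleave their requests.

```python
# One session is shared by every tab in the same browser.
session = {}


def start_edit(session, order_id):
    """First step of an edit: remember which order is being edited."""
    session["editing_order_id"] = order_id


def submit_edit(session, orders, new_status):
    """Second step: apply the change to whatever order the session names."""
    order_id = session["editing_order_id"]
    orders[order_id] = new_status
    return order_id


orders = {1: "open", 2: "open"}

start_edit(session, 1)   # tab A opens order 1
start_edit(session, 2)   # tab B opens order 2, overwriting tab A's state
changed = submit_edit(session, orders, "cancelled")  # tab A submits

# Tab A meant to cancel order 1, but order 2 was cancelled instead.
print(changed, orders)
```

Carrying the order ID in the form or URL of each request, rather than in the session, would avoid this particular failure.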
This can also be a problem even when a user has only a single browser page/tab open on your site. If state is added to the session by one process and not cleaned up, there is a risk another process may pick up the state later and use it incorrectly. Avoiding this requires ensuring that session state is always disposed of correctly. The appropriate place to do this is at the start of an action, as the potential for process failure means that it cannot be guaranteed that a process will correctly clean up its own state.
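As a rough illustration of cleaning up at the start of an action, the sketch below resets the keys a multi-step process owns before it begins. The key names and the begin_wizard function are hypothetical, and the session is again a plain dictionary.

```python
# Keys owned by this (hypothetical) multi-step process.
WIZARD_KEYS = ("wizard_step", "wizard_answers")


def begin_wizard(session):
    """Start the process from a known state.

    Anything left behind by an earlier, possibly failed, run is
    discarded first. Keys owned by other code are left untouched.
    """
    for key in WIZARD_KEYS:
        session.pop(key, None)
    session["wizard_step"] = 1
    session["wizard_answers"] = {}
```

Cleaning up at the end of the process is still good manners, but only the reset at the start can be relied upon.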
The use of session data also makes processes more difficult to follow, because it is hard to determine where data is set and where it is used. As the dependency is not explicit it can be easy for developers to overlook setting necessary state unless it is immediately apparent that it is required. This can cause failures in sections of an application significantly removed from where the error originates. Such failures are difficult and time consuming to debug, and often hard to replicate reliably.
Sessions also cause issues with application scalability and reliability. Each session needs to be stored somewhere. If this is in memory then the number of concurrent users a server can handle drops significantly as memory is a scarce resource. Placing the session in an external data store requires that it be loaded every time it is used and updated every time it is modified. This can represent a significant amount of traffic against the data store.
Using sessions with multiple web servers is also problematic. If you store the session in memory your load balancing mechanism will need to be session-aware, increasing its complexity. Alternatively the session will need to be in an external data store, which introduces latency and points of failure.
So the lesson is: Session bad. Unfortunately it's also necessary for some applications. If your application requires a shopping cart then you need to track its contents in some fashion. As cookies can store only a limited amount of data, keeping the entire cart in a cookie (or even multiple cookies) is infeasible. This implies a session-like mechanism, and for most developers (i.e. you) and most projects (i.e. anything smaller than, say, Amazon) it's better to rely on the mechanism your framework provides than roll your own.
If you are in this situation, the following are things to consider:
- Does it really need to be in the session? The only things that must be in a session are things that are transient and do not belong in your application database. Session is a poor place to store data you can retrieve from the database or put into your framework's caching infrastructure. If your session ends up being persisted to a database anyway you need to do the query in any case, so you are unlikely to see a benefit from using session here.
- Is the session data minimal? Putting data into the session causes overhead, either in memory continuously consumed or in traffic to and from your database. This information is unlikely to be needed on every page impression. Consider storing a key in the session and keeping the rest in the database. For instance a shopping cart may be stored in the database with only the cart instance ID and possibly an item count and total in the session. This gives the key data (removing a database query) while keeping the session data contained.
- Encapsulate the session usage into classes such that all access must use the common implementation. This allows checks to be centralised and the session implementation to be altered independently of the client code. It also allows for mocking in unit testing.
- Make sure you know how the application will be deployed, not just immediately but in the future. Prepare for the worst case, or ensure that session issues will be reconsidered if the environment or usage changes.
- Keep current records of the data used in the session, including what is responsible for setting it, what is allowed to read and modify it, and what is responsible for its cleanup. Make sure your team is aware of the document and refers to it as necessary.
- Ensure your application handles session expiry gracefully. Most frameworks automatically expire a session after some configurable period of inactivity. It's generally considered best that your system doesn't start throwing unfriendly errors when this happens.
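Several of these points can be combined in one small wrapper. The following is only a sketch under assumptions: CartSummary, CartSession and the "cart_summary" key are invented for illustration, and the session is assumed to behave like a dictionary, which is true of many frameworks but not guaranteed. Only the cart ID and an item count live in the session, with the cart itself assumed to be in the database.

```python
from dataclasses import dataclass
from typing import MutableMapping, Optional


@dataclass(frozen=True)
class CartSummary:
    """The minimal cart data worth keeping in the session."""
    cart_id: int      # key of the full cart record in the database
    item_count: int   # shown on every page, so cached here


class CartSession:
    """The only code allowed to touch cart data in the session."""

    _KEY = "cart_summary"  # hypothetical session key

    def __init__(self, session: MutableMapping):
        self._session = session

    def get(self) -> Optional[CartSummary]:
        # A missing key covers both "no cart yet" and "session expired",
        # so callers get None rather than an exception.
        raw = self._session.get(self._KEY)
        if raw is None:
            return None
        return CartSummary(raw["cart_id"], raw["item_count"])

    def set(self, summary: CartSummary) -> None:
        # Checks live in one place instead of at every call site.
        if summary.item_count < 0:
            raise ValueError("item_count cannot be negative")
        self._session[self._KEY] = {
            "cart_id": summary.cart_id,
            "item_count": summary.item_count,
        }

    def clear(self) -> None:
        self._session.pop(self._KEY, None)
```

Because the wrapper only needs something dictionary-like, a unit test can hand it a plain dict in place of the real session.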
*Cookies of course have led to a number of security and privacy concerns, some founded, some not. I won't be discussing these here.