Hidden Anatomy of Backend Applications: Context Lifecycle
In the previous article we looked at how a backend application communicates with the external world from the point of view of I/O. Now I propose to look at another processing pattern, one that is explicitly or implicitly present in every backend application.
Let’s start again with a very simple HTTP endpoint. This time we’ll look at what happens before our code is even invoked, deep inside the framework or library we’re using. For convenience, let’s call this part of the framework or library the transport layer.
Let’s set aside for the moment what happens when the client connects to the server. We’ll return to this part later, but for now let’s assume that the connection is already established and the client sends a request to the server. Once the data are received by the OS on the server side, they are delivered to our backend application.
The received data are not a request yet, just an array of bytes. The transport layer needs to transform these data, extract the necessary information, and only then invoke our code.
Here is a pitfall: the received data might not represent the whole request. This may happen for various reasons. The client may write the request in several parts, for example.
Or a long request may be split into packets, and the OS may deliver them as soon as they are available without waiting for the rest of the data. In either case, since there is no complete request yet, the transport layer needs to save the data already received somewhere until it becomes possible to decode the request and invoke the handler.
The place where the data are saved is the context associated with the client connection. Right after the connection is established, the context is essentially empty: it contains only the connection itself.
Then, as we receive and process data, the context grows, and at some point we have enough data to extract the request information and call the application code that handles the request.
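The accumulation described above can be illustrated with a minimal sketch. It is not how any particular framework does it: the `ConnectionContext` class and its `feed` method are hypothetical names, and the sketch assumes requests without a body, so a request is complete as soon as the blank line ending the HTTP head has arrived.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

HEAD_END = b"\r\n\r\n"  # blank line that terminates an HTTP request head


@dataclass
class ConnectionContext:
    """Per-connection state kept by a hypothetical transport layer."""

    peer: str  # who is on the other end of the connection
    buffer: bytearray = field(default_factory=bytearray)  # bytes not yet decoded

    def feed(self, data: bytes) -> List[Tuple[str, str]]:
        """Store newly received bytes and return every request that is now complete."""
        self.buffer += data
        requests = []
        while True:
            end = self.buffer.find(HEAD_END)
            if end == -1:
                # No complete request yet: keep the bytes and wait for more.
                return requests
            head = bytes(self.buffer[:end]).decode("ascii")
            del self.buffer[:end + len(HEAD_END)]
            # Assumption: the request line has the form "METHOD PATH VERSION".
            method, path, _version = head.split("\r\n")[0].split(" ", 2)
            requests.append((method, path))
```

Calling `feed` with a fragment such as `b"GET /hel"` returns an empty list and only grows the buffer; the handler would be invoked only for the `(method, path)` pairs returned once the rest of the head arrives.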
This process can be represented using the following diagram:
Looking at this diagram from the data flow point of view, we can describe it as follows: the OS emits data packets to the application’s transport layer, which collects data for as long as necessary to decode the request. Once the request is decoded, it is emitted to the application code.
Note that there might be several similar stages. For example, the parsed request might be passed to a part of the framework that extracts request parameters, authentication information, etc.
Once all necessary parts are extracted, they are passed to the user-level handler. Usually the subsequent stages are “one shot”: they call user-level code for every request emitted to them. There might be exceptions, though. For example, file upload functionality might postpone calling user-level code until the file(s) are completely received and saved in a temporary location.
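The difference between a “one shot” stage and a stage that postpones the call can be sketched as follows. All names here (`extract_params`, `UploadStage`, the `x-user` header) are made up for illustration, and the upload stage keeps data in memory instead of a temporary file to keep the example short.

```python
from urllib.parse import parse_qs, urlsplit


def extract_params(request, next_stage):
    """One-shot stage: exactly one call to next_stage per incoming request."""
    method, target, headers = request
    url = urlsplit(target)
    enriched = {
        "method": method,
        "path": url.path,
        "query": parse_qs(url.query),
        "user": headers.get("x-user"),  # hypothetical authentication header
    }
    return next_stage(enriched)


class UploadStage:
    """Accumulating stage: calls next_stage only once the whole upload has arrived."""

    def __init__(self, expected_size, next_stage):
        self.expected_size = expected_size  # assumed to be known up front
        self.received = bytearray()
        self.next_stage = next_stage

    def on_chunk(self, chunk):
        self.received += chunk
        if len(self.received) < self.expected_size:
            return None  # still incomplete: user-level code is not called
        return self.next_stage(bytes(self.received))
```

The first stage has no state of its own, while the second one needs its own small piece of context (`received`) for exactly the same reason the transport layer does.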
The whole processing pattern is not specific to the HTTP protocol or to backend applications. For example, we may observe a very similar pattern in push-based XML parsers: they call the client only when they recognize specific elements, and all intermediate steps are hidden from the client code.
This similarity is not accidental: parsing a request and parsing an XML document are very similar processes; only the grammar being parsed is different. Some implementations of the transport layer, for example Netty, expose this processing pattern to the users of the library.
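To make the XML analogy concrete, here is a small sketch using the incremental (push) interface of Python’s standard `xml.sax` module. The `TitleCollector` handler and the `parse_in_chunks` helper are illustrative names, and the `<title>` element is just an assumed example of a “specific element” the client cares about.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Client code: it only learns about complete <title> elements."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._text = None  # text of the <title> currently being read, if any

    def startElement(self, name, attrs):
        if name == "title":
            self._text = []

    def characters(self, content):
        # Text may arrive in several pieces, so it is accumulated.
        if self._text is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._text))
            self._text = None


def parse_in_chunks(chunks):
    """Push the document into the parser piece by piece, as a socket would."""
    handler = TitleCollector()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)
    parser.close()
    return handler.titles
```

The chunk boundaries can fall anywhere, even in the middle of a tag; the parser keeps the unfinished part in its own internal state, which plays the same role as the connection context in the transport layer.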
Now it’s time to look at the context lifecycle as a whole, from end to end.
In the case of connection-based protocols (for example, TCP), the context is born at the moment the client connects to the server. Usually the initial context contains only information that can be obtained when the connection is established, for example the client address and the socket that can be used to send a response to the client.
The context reaches the end of its life when the server closes the connection, for whatever reason. Note that several requests can be processed during the context’s lifetime.
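This lifecycle maps naturally onto the callbacks of Python’s `asyncio.Protocol`, which is used below purely as an illustration. The protocol itself (upper-casing newline-terminated lines) is invented for the example, and `RecordingTransport` is a hypothetical stand-in that lets the lifecycle be exercised without a real socket.

```python
import asyncio


class RecordingTransport:
    """Stand-in for a real transport, so no network is needed to try this out."""

    def __init__(self, peername):
        self.peername = peername
        self.sent = []

    def get_extra_info(self, name, default=None):
        return self.peername if name == "peername" else default

    def write(self, data):
        self.sent.append(data)


class UpperCaseLineProtocol(asyncio.Protocol):
    """One instance per connection: the instance itself is the context."""

    def connection_made(self, transport):
        # Birth of the context: only connection-level facts are known.
        self.transport = transport
        self.peer = transport.get_extra_info("peername")
        self.buffer = bytearray()
        self.requests_served = 0

    def data_received(self, data):
        # The context grows; several requests may be served over its lifetime.
        self.buffer += data
        while b"\n" in self.buffer:
            line, _, rest = bytes(self.buffer).partition(b"\n")
            self.buffer = bytearray(rest)
            self.requests_served += 1
            self.transport.write(line.upper() + b"\n")

    def connection_lost(self, exc):
        # End of life: release whatever the context was holding.
        self.buffer.clear()
        self.transport = None
```

With a real event loop, passing the class as the protocol factory to `loop.create_server(UpperCaseLineProtocol, host, port)` would create one such context per accepted connection.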
In the case of connectionless protocols (for example, UDP), the context is usually born at the moment a packet from the client is received, and its life is rather short: it is no longer needed once the application has processed the received packet. Nevertheless, some applications simulate a connection-based protocol using UDP as the transport. In this case, the context lifecycle is very similar to the connection-based case described above.
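One possible way to simulate connections on top of UDP is to keep a context per client address and discard it after a period of silence. The sketch below assumes exactly that; `UdpSessionTable`, its idle timeout, and the per-session counters are hypothetical, and real protocols often use explicit session identifiers or handshakes instead.

```python
import time


class UdpSessionTable:
    """Keeps a context per client address and drops it after an idle period."""

    def __init__(self, idle_timeout=30.0, clock=time.monotonic):
        self.idle_timeout = idle_timeout
        self.clock = clock  # injectable so the timeout can be tested
        self.sessions = {}  # client address -> context

    def on_datagram(self, addr, payload):
        """Process one datagram and return how many this session has seen."""
        now = self.clock()
        # End of life: with no "close" event in UDP, silence ends the context.
        expired = [key for key, session in self.sessions.items()
                   if now - session["last_seen"] > self.idle_timeout]
        for key in expired:
            del self.sessions[key]
        # Birth: the first datagram from an unknown address creates the context.
        session = self.sessions.setdefault(
            addr, {"packets": 0, "bytes": 0, "last_seen": now})
        session["packets"] += 1
        session["bytes"] += len(payload)
        session["last_seen"] = now
        return session["packets"]
```

Without the table, each call would simply build a throwaway context for one datagram, which is the short-lived case described above.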
Understanding the processing patterns and the context lifecycle described above is helpful in many situations: optimizing application performance, implementing resource management, or designing a framework or library.