Asynchronous Servlets in Java
In this article we will cover Asynchronous Servlets in Java. We will also implement a use case that demonstrates the concrete advantages of asynchronous processing with servlets.
Introduction
An excessive number of threads running simultaneously in a Java application consumes a lot of resources, and web based Java applications are no exception. Incoming requests are handled by dedicated HTTP worker threads, which process those requests until a response is finally assembled and sent back to the client.
In a web application scenario where the number of simultaneous users is expected to be very high, the resources consumed by active threads may become a real problem. HTTP worker threads belong to a dedicated pool, which may become exhausted. Additionally, each thread consumes resources of its own - on a typical 64-bit JVM, for instance, every thread reserves roughly 1 MB of stack space by default - so increasing the HTTP thread pool size to a (very) large number will also lead to system resource exhaustion: it simply does not scale.
This is where Asynchronous Servlets may be helpful. In short, an asynchronous servlet enables an application to process incoming requests in an asynchronous fashion: a given HTTP worker thread handles an incoming request and then passes it to another background thread, which in turn is responsible for processing the request and sending the response back to the client. The initial HTTP worker thread returns to the HTTP thread pool as soon as it passes the request to the background thread, so it becomes available to process another request.
This approach by itself may solve the problem of HTTP thread pool exhaustion, but it will not solve the problem of system resource consumption. After all, another background thread was created to process the request, so the number of simultaneously active threads does not decrease and resource consumption does not improve.
We will begin this article with the basics of Asynchronous Servlets. Then we will implement a use case that demonstrates the real advantage of asynchronous servlet processing, i.e. reducing the number of threads - and consequently the resources - required to handle an arbitrary number of incoming requests: our system will be able to scale.
Asynchronous Servlet
Let's start by defining a regular servlet, without asynchronous support:
Regular servlet
import java.io.IOException;
import java.io.PrintWriter;

import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet(urlPatterns = "/syncServlet")
public class TestSyncServlet extends HttpServlet {

    private static final long serialVersionUID = 1L;

    @Override
    protected void doGet(HttpServletRequest request,
            HttpServletResponse response) throws ServletException, IOException {

        final long startTime = System.nanoTime();

        // simulate two seconds of blocking work on the HTTP worker thread
        try {
            Thread.sleep(2000);
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }

        // write the response (elapsed time is in nanoseconds)
        PrintWriter out = response.getWriter();
        out.print("Work completed. Time elapsed: " + (System.nanoTime() - startTime));
        out.flush();
    }
}
This servlet simply sleeps for two seconds and then writes some content to the output stream. Meanwhile, I also reduced the container's HTTP thread pool size to 2 (in Tomcat, for example, via the connector's maxThreads setting). As one may expect, firing a considerable number of requests against the servlet endpoint leaves the container completely unresponsive: each of the two HTTP threads blocks in Thread.sleep() for its current request and only proceeds to the next one after that period has elapsed.
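To observe this behavior, we can fire concurrent requests from a small client. The following is a minimal sketch; the deployment URL (local container, /test context) is an assumption, so adjust it to your setup:
LoadClient.java (sketch)
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class LoadClient {

    public static void main(String[] args) throws Exception {
        // fire 10 concurrent GET requests against the synchronous servlet
        for (int i = 0; i < 10; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        // hypothetical deployment URL - adjust to your container setup
                        URL url = new URL("http://localhost:8080/test/syncServlet");
                        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
                        BufferedReader in = new BufferedReader(
                                new InputStreamReader(connection.getInputStream()));
                        System.out.println(in.readLine());
                        in.close();
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }).start();
        }
    }
}
With the pool capped at 2 threads, the responses arrive in pairs, roughly every two seconds.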
Leaving the HTTP thread pool size at the same value of just 2 threads, we now repeat the same exercise, but using an asynchronous servlet:
Asynchronous servlet
import java.io.IOException;
import java.io.PrintWriter;

import javax.servlet.AsyncContext;
import javax.servlet.ServletException;
import javax.servlet.ServletResponse;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet(urlPatterns = "/asyncServlet", asyncSupported = true)
public class TestAsyncServlet extends HttpServlet {

    private static final long serialVersionUID = 1L;

    @Override
    protected void doGet(HttpServletRequest request,
            HttpServletResponse response) throws ServletException, IOException {

        final long startTime = System.nanoTime();

        // switch the request to asynchronous mode: the response will not be
        // committed when doGet() returns
        final AsyncContext asyncContext = request.startAsync(request, response);

        // hand the work off to a background thread and let doGet() return,
        // releasing the HTTP worker thread back to the pool
        new Thread() {
            @Override
            public void run() {
                try {
                    ServletResponse asyncResponse = asyncContext.getResponse();
                    asyncResponse.setContentType("text/plain");
                    PrintWriter out = asyncResponse.getWriter();
                    Thread.sleep(2000);
                    out.print("Work completed. Time elapsed: " + (System.nanoTime() - startTime));
                    out.flush();
                    // signal the container that asynchronous processing is done
                    asyncContext.complete();
                } catch (IOException | InterruptedException e) {
                    throw new RuntimeException(e);
                }
            }
        }.start();
    }
}
Let's go through some key points of the above asynchronous servlet: request.startAsync() signals the container that the processing of the current request is now asynchronous, i.e. the response should not be committed - and also not sent to the client - when the doGet() method execution completes.
The returned AsyncContext instance exposes all the facilities required for the application to further interact with the current request and response. This means that the application is now free to do whatever is suitable with the AsyncContext instance and let the doGet() method complete. As soon as doGet() completes, the original HTTP thread returns to the HTTP thread pool and becomes ready to handle another incoming request.
Meanwhile we launch a background thread that is responsible for using the AsyncContext instance to fetch the response, sleep for a couple of seconds, and finally write some content to the output stream. As a final step one must call asyncContext.complete() in order to finish the asynchronous request processing.
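Two related details are worth a quick sketch (the timeout value below is illustrative): a timeout may be set on the AsyncContext, and instead of spawning our own Thread we may ask the container to supply a managed thread:
AsyncContext API (sketch)
// optional: bound how long the asynchronous processing may take;
// a value of zero or less means no timeout (30 seconds here is illustrative)
asyncContext.setTimeout(30000);

// alternative to creating our own Thread: let the container dispatch
// a managed thread to run the task
asyncContext.start(new Runnable() {
    @Override
    public void run() {
        // ... do the work and write the response, then:
        asyncContext.complete();
    }
});
If complete() is never called, the container will eventually time the request out.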
It's the same scenario we saw previously with the regular servlet, but now using an asynchronous servlet. The result is that this time our server does not become unresponsive when we fire a considerable number of requests against the servlet endpoint. The two existing HTTP worker threads handle each incoming request and spawn a background thread that continues processing it, then immediately return to the HTTP thread pool to handle the next requests in the same asynchronous fashion.
Note: We have set the HTTP thread pool size to 2 just to illustrate the difference between a regular servlet and an asynchronous servlet.
We have solved the problem of HTTP thread pool exhaustion, but the number of threads required to handle the requests has not improved: we are simply spawning one background thread per request. In terms of the number of simultaneously running threads, this is equivalent to simply increasing the HTTP thread pool size - 1000 concurrent requests still mean roughly 1000 active threads - so under heavy load the system will not scale.
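A first step toward mitigating this - a sketch assuming a shared, bounded worker pool, not the final design of this article - is to replace the per-request Thread with a fixed-size ExecutorService, so the number of background threads stays bounded:
Bounded worker pool (sketch)
// shared by all requests; a pool size of 4 is purely illustrative
private static final ExecutorService WORKERS = Executors.newFixedThreadPool(4);

// inside doGet(), after request.startAsync():
WORKERS.execute(new Runnable() {
    @Override
    public void run() {
        try {
            PrintWriter out = asyncContext.getResponse().getWriter();
            Thread.sleep(2000);
            out.print("Work completed");
            out.flush();
        } catch (IOException | InterruptedException e) {
            e.printStackTrace(); // log and fall through
        } finally {
            asyncContext.complete(); // always finish the request
        }
    }
});
Note, however, that as long as each task blocks for the entire duration of the work, a bounded pool merely queues the excess requests. The use case below goes further and multiplexes many clients over a few threads.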
Before jumping to a real use case of asynchronous servlet usage, there is another important detail: one may attach a listener to the AsyncContext instance in order to receive notifications during the lifecycle of the asynchronous request processing:
AsyncListener
asyncContext.addListener(new AsyncListener() {

    @Override
    public void onTimeout(AsyncEvent event) throws IOException {
        // invoked when the asynchronous operation exceeds its timeout
    }

    @Override
    public void onStartAsync(AsyncEvent event) throws IOException {
        // invoked when a new asynchronous cycle is started via startAsync()
    }

    @Override
    public void onError(AsyncEvent event) throws IOException {
        // invoked when the asynchronous operation fails
    }

    @Override
    public void onComplete(AsyncEvent event) throws IOException {
        // invoked when the asynchronous operation completes
    }
}, request, response);
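For example - as a sketch, not part of the servlets above - the onTimeout() callback could be used to report the failure to a client whose request was not completed in time:
onTimeout handler (sketch)
@Override
public void onTimeout(AsyncEvent event) throws IOException {
    // hypothetical handler: report the timeout and complete the request
    HttpServletResponse response =
            (HttpServletResponse) event.getAsyncContext().getResponse();
    response.setStatus(HttpServletResponse.SC_SERVICE_UNAVAILABLE);
    response.getWriter().print("Request timed out");
    event.getAsyncContext().complete();
}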
Effective usage of Asynchronous Servlets
In order to demonstrate the powerful features offered by asynchronous servlets, we will implement the following use case:
There is a file with a size of 100 bytes that may be streamed to remote clients
We will have a background thread pool with a predefined number of threads that will be responsible for streaming the file to remote clients
The HTTP threads will handle incoming requests and immediately pass them to the background thread pool
The background threads will send chunks of 10 bytes to the remote clients in a round robin fashion
We start by defining a simple class that represents each remote client. We will store the respective client's AsyncContext along with the number of bytes already sent to the client:
RemoteClient.java
import javax.servlet.AsyncContext;

public class RemoteClient {

    private final AsyncContext asyncContext;
    private int bytesSent;

    public RemoteClient(AsyncContext asyncContext) {
        this.asyncContext = asyncContext;
    }

    public AsyncContext getAsyncContext() {
        return asyncContext;
    }

    // each chunk sent to the client is 10 bytes long
    public void incrementBytesSent() {
        this.bytesSent += 10;
    }

    public int getBytesSent() {
        return bytesSent;
    }
}
Now we define the asynchronous servlet:
StreamingAsyncServlet.java
import java.io.IOException;
import javax.servlet.AsyncContext;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet(urlPatterns = "/streamingAsyncServlet", asyncSupported = true)
public class StreamingAsyncServlet extends HttpServlet {

    private static final long serialVersionUID = 1L;

    @Override
    protected void doGet(HttpServletRequest request,
            HttpServletResponse response) throws ServletException, IOException {
        AsyncContext asyncContext = request.startAsync(request, response);
        asyncContext.setTimeout(10 * 60 * 1000); // illustrative 10 minute timeout
        Dispatcher.addRemoteClient(new RemoteClient(asyncContext));
    }
}
We set the asynchronous context timeout to 10 minutes (an illustrative value), create a RemoteClient instance containing the AsyncContext, and pass it to the background request Dispatcher; if the timeout expires before we complete the request, the container fires the onTimeout() event and then completes the request itself with an error. The HTTP thread may now freely return to the HTTP thread pool and handle other incoming requests. Let's see the Dispatcher definition:
Dispatcher.java
import java.io.PrintWriter;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

import javax.servlet.AsyncContext;
import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import javax.servlet.ServletResponse;
import javax.servlet.annotation.WebListener;

@WebListener
public class Dispatcher implements ServletContextListener {

    private static final int PROCESSING_THREAD_COUNT = 3;

    // queue of clients currently waiting to be served
    private static final BlockingQueue<RemoteClient>
            REMOTE_CLIENTS = new LinkedBlockingQueue<RemoteClient>();

    private final ExecutorService executor = Executors.newFixedThreadPool(PROCESSING_THREAD_COUNT);

    public static void addRemoteClient(RemoteClient remoteClient) {
        REMOTE_CLIENTS.add(remoteClient);
    }

    @Override
    public void contextInitialized(ServletContextEvent event) {
        int count = 0;
        while (count < PROCESSING_THREAD_COUNT) {
            executor.execute(new Runnable() {
                @Override
                public void run() {
                    while (true) {
                        RemoteClient remoteClient;
                        try {
                            // fetch a remote client from the waiting queue
                            // (this call blocks until a client is available)
                            remoteClient = REMOTE_CLIENTS.take();
                        } catch (InterruptedException e1) {
                            // the executor is shutting down: stop this worker
                            Thread.currentThread().interrupt();
                            return;
                        }
                        AsyncContext asyncContext = remoteClient.getAsyncContext();
                        ServletResponse response = asyncContext.getResponse();
                        response.setContentType("text/plain");
                        try {
                            Thread.sleep(2000);
                        } catch (InterruptedException e1) {
                            throw new RuntimeException(e1);
                        }
                        // increment bytes sent by 10
                        remoteClient.incrementBytesSent();
                        try {
                            // send bytes to client
                            PrintWriter out = response.getWriter();
                            out.print("Already sent " + remoteClient.getBytesSent() + " bytes");
                            out.flush();
                            // check if we have already sent the 100 bytes to this client
                            if (remoteClient.getBytesSent() < 100) {
                                // if not, put the client back at the tail of the queue
                                REMOTE_CLIENTS.put(remoteClient);
                            } else {
                                // if the 100 bytes are sent, the response is complete
                                asyncContext.complete();
                            }
                        } catch (Exception e) {
                            // discard current client
                            asyncContext.complete();
                        }
                    }
                }
            });
            count++;
        }
    }

    @Override
    public void contextDestroyed(ServletContextEvent event) {
        // interrupt the worker threads so they exit cleanly on undeploy
        executor.shutdownNow();
    }
}
I placed comments in the Dispatcher class so that the code is mostly self-explanatory, but let's go through some important details.
We are implementing the ServletContextListener interface, which means that the initialization method will be executed during application startup. Our data structure consists of a static BlockingQueue which will be used to store the currently connected remote clients. We also provide a static method in order for the asynchronous servlet to insert newly connected clients (as we have seen in the servlet definition).
We define a thread pool with a capacity of 3 threads and, during the context listener initialization, submit three long-running tasks to it: these are our background worker threads.
The threads will constantly loop, fetching clients from the head of the queue, sending a chunk of 10 bytes to each fetched client, and putting the client back in the queue (more precisely, at its tail). This means that the threads will be serving 10 bytes to each client in a round robin fashion. As soon as the total of 100 bytes has been sent to a given client, we consider the file streaming complete for that client and complete the asynchronous request.
With this design one may now adjust the container's HTTP thread pool size - the threads that handle incoming requests in the first place - and the background executor thread pool size in order to maximize throughput and drastically reduce the overall number of simultaneously running threads. For instance, 3 background threads plus a handful of HTTP threads can serve hundreds of connected clients, where the one-thread-per-request approach would have required hundreds of threads.
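As a small, hedged tweak (the property name is an assumption), the worker pool size could even be made tunable without recompiling:
Tunable pool size (sketch)
// hypothetical: read the worker pool size from a system property,
// falling back to the value of 3 used throughout this article
private static final int PROCESSING_THREAD_COUNT =
        Integer.getInteger("streaming.workerThreads", 3);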
A final note on thread safety: updates made by one thread to a given RemoteClient instance (the number of bytes sent) are guaranteed to be visible to another thread that later fetches that same client from the queue. The BlockingQueue documentation states that a happens-before relationship is established between operations that place an element in the queue and subsequent operations that access or remove that same element from the queue.