Sunday, 10 January 2016

Choosing between Ratpack and Dropwizard.


This post will discuss our team’s approach to evaluating the java based web framework ratpack as an alternative to dropwizard for creating a new microservice.  This post assumes a basic prior knowledge of both ratpack and dropwizard.

Background


The team I'm on is full of developers all with experience of creating and maintaining applications built with the Dropwizard framework. It’s safe to say that Dropwizard would be the team’s default preference when considering frameworks for web based java apps. However, we have recently been tasked with building a new microservice that is as efficient as possible. Although this non-functional requirement isn’t 100% specified, we do know it will need to be capable of high throughput and low resources (i.e. memory and CPU). High costs from the company’s PAS provider have in no doubt influenced this requirement.

Ratpack is said to allow you to create "high performance" services that are capable of meeting these non-functional requirements assuming that your application is I/O bound.  It's because of this that the team evaluated it.

Dropwizard vs Ratpack performance


A Dropwizard vs Ratpack performance test was conducted to see if the reputed performance benefits could be observed for our use case.

Two webapps that performed the same steps (detailed below) were created, one with dropwizard the other with ratpack.

Architecture Diagram




Load Profiles  


Three load profiles were given to each application as detailed below:
  1. 5 minute duration, 100 JMeter Threads
  2. 5 minute duration, 200 JMeter Threads
  3. 5 minute duration, 300 JMeter Threads

Results


The third test run resulted in the Dropwizard application eventually failing to respond.  


The graph above shows that throughput for the first two test runs was very close.  The third test run shows Ratpack answering significantly more requests per second due to the fact that the Dropwizard application began failing to respond.  Exactly why the Dropwizard application gave up was not investigated. 





Here we see ratpack using significantly less memory in all but the last test run. 






Ratpack seemed to use around half of the CPU as Dropwizard.




Here we see the true nature of our non-blocking frameworks in action.  Ratpack used just twenty threads in each test run.  Knowing the basic premise of non-blocking I/O, this shouldn't come as a surprise - however I still find this very impressive!



Performance Summary

Ratpack can clearly handle more requests with less resources in an application that spends most of it's time waiting for I/O.  This was of major significance to us since we knew from the start our application would be I/O bound.  It's also worth pointing out that (at least in my experience) I/O bound webapps are very common.

Other concerns


In order for the organisation to support our new service, it needs to comply with a few standards.  

Logging

A common design design pattern in a microservices architecture is to assign some kind of UUID to each request which is then passed to downstream systems.  This  UUID can then be added to log events so that they can be correlated across multiple systems.  This can (and has at our organisation) been implemented using SLF4J's Mapped Diagnostic Context (MDC) which "...manages data on a per thread basis."  However, it should be noted that " a server that recycles threads might lead to false information".
Ratpack recycles threads, but luckily were covered an MDCInterceptor has been created which addresses this very problem.

Deployment


The organisation relies on the simplicity that comes with building and deploying Java based services as Uber/Shaded/Runnable jars.  Since ratpack applications can be built in just the same way, this wasn't an issue.

Integration with Hystrix


Hystrix is a great library that helps you make your services fault tolerant.  Not only is it great, it's become the unofficial standard within our organisation.  Any java based application can integrate with Hystrix in many different ways since it's so flexible.  Hystrix supports synchronous, asynchronous and reactive programming models.  Not only does Hystrix support non-blocking calls it also supports the use of semaphores instead of Thread pools to manage concurrent downstream calls.  The flexibility and support offered here by Hystrix is absolutely crucial!  If Hystrix mandated the use of a thread pool or blocking calls, we'd be back to the Dropwizard performance characteristics (shown above) of having one thread per concurrent request we handle (assuming each request results in a downstream system call, which is true in our case) which would negate ratpack's benefits almost entirely.

Not only is it possible to use Hystrix with ratpack, it's also easy to use features such as request caching, request collapsing and streaming metrics to the dashboard (which is awesome by the way) with the ratpack-hystrix JAR.

Complexity (Learning Curve)


The benefits of ratpack come at the cost of complexity.  Our team (myself very much included) are used to building traditional synchronous based Java apps.  The effort of adopting an asynchronous programming model was difficult for the team to assess but was certainly a known risk which could affect delivery time.

Decision Time - Ratpack or Dropwizard


With the amazing performance from our benchmark, a sense of optimism that we’d pick up the new framework and asynchronous programming quickly... the decision to go to ratpack was made.



My next post will detail the team's experience in general with Ratpack. 

Thanks to the team for all of the shared learning so far, and especially to @mirkonasato for leading the performance benchmark work detailed above.