Monday, June 16, 2008

Improving Java Web Performance With C/C++

From the very first day I worked on OpenMRS, I felt that it ran a little slower than I expected. The old OpenMRS demo server probably adds to the slowness. Later, when we were discussing how Hibernate sessions should be implemented in OpenMRS and in Java web apps in general, I was again brought to think about OpenMRS performance.

Since the OpenMRS community generally deploys on Tomcat, my main aim was to improve the performance of the servlet container. One simple way to do that, which I had heard of earlier, is to use Apache's “Apache Portable Runtime” (APR) project with Tomcat's native libraries. Tomcat calls into these native libraries through JNI to improve server performance on a specific platform. In short, Tomcat is given some local OS steroids; this currently works on Windows and POSIX-based systems.

The APR library is somewhat ironic, for two main reasons:

  • I’ve heard the argument that Tomcat runs faster than Apache in some benchmarks. The people making it argue that Java is faster than C/C++ and hence Tomcat wins.
  • On the other hand, APR and native Tomcat use JNI code written in C/C++ to improve performance.

Either way, I think generalizing from the above statements isn’t correct, so I went ahead to see whether APR improves the performance of our web application. I used Windows Vista and Tomcat 6.0.16 for the test, since Windows is probably what most OpenMRS implementations use. You can download the native binaries for Windows from here & APR from here. Add the extracted files to Path and place the tcnative-1.dll in APR’s bin folder.
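A quick way to check that the JVM can actually find the native library on the Path is to try loading it by name. This is only a minimal sketch: the class and method names are hypothetical, and it assumes the DLL keeps the tcnative-1 name mentioned above.

```java
public class TcNativeCheck {

    /** Returns true if the named native library can be loaded from java.library.path. */
    static boolean canLoad(String libraryName) {
        try {
            // On Windows, "tcnative-1" is resolved to tcnative-1.dll
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            // Library (or one of its dependencies) was not found or could not be loaded
            return false;
        }
    }

    public static void main(String[] args) {
        // On Windows the default java.library.path includes the entries of Path
        System.out.println("java.library.path = " + System.getProperty("java.library.path"));
        System.out.println("tcnative-1 loadable: " + canLoad("tcnative-1"));
    }
}
```

If this prints false, the extracted folders are most likely missing from Path for the user that runs Tomcat.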

The first thing I observed was that Tomcat started a little faster, and even OpenMRS initialized slightly faster.

                          Without APR    With APR
OpenMRS initialization    192ms          183ms
Tomcat Server Startup     12892ms        11449ms

But startup improvement is not everything. We want to check how well the application performs. ApacheBench (ab) is a good way to test static content, but isn’t very good at dynamic content... I wanted to use Faban after I remembered Scott Oaks’s writeup from last year, but couldn’t find enough time for testing with Faban...

Instead, I used JMeter, a nice general-purpose testing tool that replicates how a user interacts with the web application. You can send POST requests with parameters and script your test plan to follow the steps a normal web user would take through your application. Here are some of the results on different OpenMRS pages with 10 concurrent requests, averaged over 3 runs on my dual core server:

Page                 Without APR          With APR
OpenMRS homepage     225.7ms/request      185.7ms/request
User Login           1464.2ms/request     1185.3ms/request
Find patient         95ms/request         80ms/request
Patient dashboard    2887.6ms/request     1984.3ms/request
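
To put the figures above on a common scale, the relative improvement can be computed as (without - with) / without. The snippet below is just a small illustration using the numbers from the table; the class and method names are made up for this example.

```java
public class AprImprovement {

    /** Percentage reduction in response time when going from 'withoutMs' to 'withMs'. */
    static double percentFaster(double withoutMs, double withMs) {
        return (withoutMs - withMs) / withoutMs * 100.0;
    }

    public static void main(String[] args) {
        String[] pages = {"OpenMRS homepage", "User Login", "Find patient", "Patient dashboard"};
        // {without APR, with APR} in ms/request, copied from the table
        double[][] ms = {
            {225.7, 185.7},
            {1464.2, 1185.3},
            {95.0, 80.0},
            {2887.6, 1984.3}
        };
        for (int i = 0; i < pages.length; i++) {
            System.out.printf("%-18s %.1f%% faster with APR%n",
                    pages[i], percentFaster(ms[i][0], ms[i][1]));
        }
    }
}
```

On these numbers the gain ranges from roughly 16% (Find patient) to roughly 31% (Patient dashboard).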

My first observation was that the first run of the test completely sucks. The later runs improve performance drastically. This is because Tomcat 6 has a good caching mechanism, and the effect showed up with or without APR. Another thing I observed was that beyond 500 concurrent users the application was crying and Tomcat was hanging. APR or no APR didn’t matter much... I’ve yet to analyze why it wouldn’t scale any further, but it must be something related to Hibernate sessions. Maybe some experienced developer can look into these figures, perform some more specific benchmarks and improve scalability.
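
Because the first run is so much slower than the rest, it can skew a three-run average. One way to keep it out of the numbers is to run the task a few times untimed and average only the later runs. The following is a minimal sketch of that idea, not what was used for the figures above: the class and method names are hypothetical, and the dummy workload merely stands in for a request.

```java
public class WarmupTimer {

    /**
     * Runs the task 'warmupRuns' times without timing it, then returns the
     * average wall-clock time in milliseconds over 'measuredRuns' timed runs.
     */
    static double averageMillis(Runnable task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run(); // discarded: lets caches fill before measuring
        }
        long totalNanos = 0;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / measuredRuns;
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for a real request
        Runnable work = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i % 10);
            }
            if (sb.length() == 0) {
                throw new IllegalStateException("unreachable");
            }
        };
        System.out.printf("cold (no warm-up): %.3f ms%n", averageMillis(work, 0, 1));
        System.out.printf("after warm-up:     %.3f ms%n", averageMillis(work, 20, 10));
    }
}
```

With JMeter, a similar effect can be had by running the test plan once before recording results.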


fabrizio giudici said...

Hi Saptarshi.

The reason Tomcat (and any Java application) runs faster after a while is the HotSpot compiler, which translates Java bytecode into native code, focusing on and better optimizing the chunks of code that are executed most. This is particularly true if the VM is run with the -server option, which is the default for Tomcat.

I'd be curious to know which performance you would get by replacing Tomcat with Glassfish. Sun claims Glassfish is much faster than Tomcat.


Saptarshi Purkayastha said...

@fabrizio: Yes, it may indeed be true that HotSpot is doing some JIT magic on later runs, but will it happen on the 2nd run itself??...or does HotSpot wait for some specified number of runs before it compiles the code? BTW, I tried removing the -server option (as well as trying -client) and Tomcat performance still improved on the later runs, but this time the difference was a little smaller... So it proves your point that the Server JVM is indeed doing something really useful!!

I've had it on my mind for some time now that I should try OpenMRS on Glassfish, since I've been deploying a few production web apps on Glassfish. I think it should be very interesting to see if Glassfish scales better... But personally I've found performance has more to do with the web app itself than the container.

fabrizio giudici said...

Of course Tomcat also does its own caching - I forgot to add an "also" in relation to HotSpot ;-) I don't know how deep its influence is on the second run.

Sorry if I'm saying something obvious, but it's a topic I always like to point out, as a lot of people still seem to be unaware of it - for instance, last week I (partially) solved some problems for a friend who complained about a (desktop) library being too slow for his requirements... but he was measuring the time of the first run. When I suggested that he run some "warm up" rounds, the performance improved 10x-20x. Cheers.

Anonymous said...

My understanding was that the JVM does a fast JIT, profiles, and optimizes over time. It will even deoptimize and reoptimize, based on various conditions. So I wouldn't trust three runs, but instead allow it to warm up for a while before testing.

I'd actually expect a normal Tomcat instance to almost equal the C++ results, thereby making the switch too risky for the potential gain.