Samstag, 30. März 2013

Fun with Active Annotations: Micro-Benchmarking

After I implemented the @Memoize annotation described in a previous post, I of course wanted to measure how the code performed. Writing such micro benchmarks yourself requires lots of boilerplate code and is easy to get wrong. Thankfully, there is the Caliper micro-benchmarking tool by Google. It takes care of all the messy details and lets you specify clearly what you want to test. I used the old 0.5 version available on maven central. A 1.0 release is just around the corner however and I will post a follow up on the changes.

Writing a benchmark with Caliper is pretty straightforward. You write one class for each benchmark. In this class you create one method for each variant you want to test (like different sorting algorithms). Each test method must be named "time<variantName>" and takes exactly one argument: The number of iterations. In most cases you will just call the desired code in a for-loop as many times as the iterations parameter tells you. Your benchmarks can also have arguments (like the size of the list you want to sort). These are fields marked with the @Param annotation. The values for these fields can be listed in the annotation, passed from the command line or can be supplied by a static method or field. This method or field must be named by the convention "<parameterName>Values". The concept is best explained by a little example:

All of this is already pretty concise compared to what you get. Without any further work, the framework will run all variants with all argument combinations and compare the results for you. There is even a connector to a web service where you can post your results, allowing you to do trend analysis etc.

However, everything can be improved and so I devised a little active annotation that makes writing benchmarks even easier and at the same time less error prone. When you annotate a class with @Benchmark, you get the following:
  • Your class derives from SimpleBenchmark
  • "time"-methods get the "iterations"-parameter without having to specify it explicitly
  • when you write a "Values"-method/field, the corresponding parameter field is automatically added to the class and has the correct type. This way, the connection between the field and the values is static and cannot be messed up
  • you can write "loop"-methods. These automatically call the contained code in a for loop, making that common use case even more concise
  • a "main"-method that runs your benchmark, useful in eclipse
Using these improvements, the above Java class is reduced to the following Xtend class. Note that I used a "time"-method just to demonstrate the implicit iterations argument. For this simple case, I could have also used a "loop" method, simplifying the code even more.

The corresponding annotation processor is so easy to write and understand that I can get away with presenting it to you uncommented:

Sonntag, 17. März 2013

Fun with Active Annotations: Memoization

With the next release of Xtend only 3 days away, I did some experimenting with the latest language feature: active annotations. In short, active annotations allow you to participate in the Xtend-to-Java compilation process, altering the AST or even generating whole new classes and resources. But not only that: Any change you make is automatically visible in content assist, the outline view, navigation etc. Plus, you can add errors and warnings to the Xtend source.

My first idea was to build an annotation that can memoize any method. Memoization means remembering the return value for given parameters and returning this cached value when the method is called again with the same parameters. This is useful for expensive calculations and recursive algorithms.

So I want this Xtend code

to compile into this Java code

This is done by the following annotation processor. It has different handling for 0, 1 and multiple parameters to optimize performance.

As you can see, the body of the method is replaced with a call to a cache. The cache is populated with values from an initializer method that has the body of the original method. For methods without parameters, I just use double null check synchronization. For everything else there is Guava's wonderful LoadingCache. For methods with one parameter, the cache key is the type of the single paramter. For methods with multiple paramters, I added a small helper class that holds an array of parameters and provides an equals and hashcode method. Some of the finer points like exception handling, method overloading and primitive types are also handled by the annotation processor in order to generate production quality code.

This could be further expanded with expiration strategies and other options defined in the annotation. But for now, I'm pretty pleased with the results. Getting the 40th fibonacci number takes only 1 millisecond on my laptop. Without memoization, this goes up to 5 full seconds.