Wednesday, June 13, 2012

Know your library: I18n

Internationalization of an application can be a nerve-wracking experience if you do not have the right tools for the job. Today I will show you a couple of useful libraries and some code snippets that can make your life so much easier.

We will be looking at generating localized text (like HTML) with Freemarker. Freemarker is very simple at its core. You write templates which reference variables from a data model. On the Java side, you build up that data model (which is usually just a hash map) and then merge it with a template. One special thing about Freemarker is that it doesn't work with the objects in the data model directly. Instead, it wraps each of them in a thin layer that streamlines property access and provides additional functionality. This wrapping logic is highly configurable, which we will exploit later on.

The hard way

So let's start with a look at how one of your templates and its backing Java class might typically look:
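As a rough sketch of the Java side (not the listing from the example project), the data model might be built like this. The class names GermanMessages and HardWayModel and the message keys are made up for illustration, and the bundle is defined inline so the snippet runs without any .properties files. The comments show the kind of Freemarker expressions the template would then need.

```java
import java.text.MessageFormat;
import java.util.Date;
import java.util.HashMap;
import java.util.ListResourceBundle;
import java.util.Map;
import java.util.ResourceBundle;

// Hypothetical inline bundle. A real project would rather keep these entries in
// messages_de.properties and load them with ResourceBundle.getBundle("messages", locale).
class GermanMessages extends ListResourceBundle {
    @Override
    protected Object[][] getContents() {
        return new Object[][] {
            {"greeting", "Hallo {0}!"},
            {"yes", "Ja"},
            {"no", "Nein"}
        };
    }
}

class HardWayModel {
    // Builds the data model by hand: every lookup and every format call is explicit.
    static Map<String, Object> build(ResourceBundle bundle, String userName,
                                     Date registered, boolean active) {
        Map<String, Object> model = new HashMap<>();
        // ResourceBundle only returns the raw pattern; MessageFormat fills in the arguments.
        model.put("greeting", MessageFormat.format(bundle.getString("greeting"), userName));
        // The template still has to say how to interpret this value, e.g. ${registered?date}.
        model.put("registered", registered);
        model.put("active", active);
        // Only needed so that the template can write ${active?string(yes, no)}.
        model.put("yes", bundle.getString("yes"));
        model.put("no", bundle.getString("no"));
        return model;
    }
}
```

Note that three of the five entries exist purely to support formatting in the template.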

See how all sorts of formatting logic is sprinkled around the template? While it is nice of Freemarker to find the correct date format for the template's locale, we still need to tell it how we want our java.util.Date interpreted.
Booleans are even worse. We have to look up translations for "yes" and "no", put them into the data model and then use them in the "?string" function. If we need to look up parameterized messages (like "Hello Freemarker!"), we also need to remember to use MessageFormat, as Java's ResourceBundle class has no direct way of passing arguments to a localized string.
This may be feasible when your application is small. But if you have hundreds of pages and all of them should adhere to the same standards, this naive approach will backfire on you.
Another drawback is that we are explicitly looking up localized strings (whether on the Java or the Freemarker side doesn't matter), which adds more accidental complexity. And what if someone removes a translation or we mistype a key? It's a messy approach all around.

The easy way


So how do we fix this? Well, first we need types that are better suited to our needs. java.util.Date just doesn't cut it. There is no way to tell from the data model whether a Date is meant to represent a date, a time, a date and time or maybe even just a month (think credit card expiry date). Some of you may already know where this is going, but for those not familiar: Enter Joda Time!

Joda Time is a well-thought-out time library which supports all kinds of different representations. There are classes for exact points in time (i.e. a millisecond value since Jan 1st 1970), complete with time zone information. On the other hand, there are classes which represent more civil concepts, like LocalDate, which just contains day, month and year values and thus is not an exact point in time. These classes can be converted into one another with ease and there is lots of cool arithmetic built into them (taking a time and rounding it down to the full hour is a one-liner).
The most important thing about Joda Time is that you have separate classes for different time concepts. So, unlike with a java.util.Date, there is no wondering how to interpret an org.joda.time.LocalDate. Now we just need to teach Freemarker to recognize these types and always format them in the style we desire.
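To make the "one class per time concept" idea a bit more tangible, here is a small sketch. Note the assumption: it uses the JDK's java.time package instead of Joda Time, so that it runs without extra dependencies. java.time follows the same design and Joda Time has classes with the same names (LocalDate, LocalDateTime, YearMonth). The class TimeConcepts and its methods are made up for illustration.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.YearMonth;
import java.time.temporal.ChronoUnit;

class TimeConcepts {
    // A calendar date without time of day or time zone.
    static LocalDate oneMonthLater(LocalDate date) {
        return date.plusMonths(1);
    }

    // Rounds a date-time down to the full hour. In Joda Time the counterpart
    // would be dateTime.hourOfDay().roundFloorCopy().
    static LocalDateTime floorToHour(LocalDateTime dateTime) {
        return dateTime.truncatedTo(ChronoUnit.HOURS);
    }

    // A credit card expiry is just a month and a year. The card is treated
    // as valid through the last day of that month.
    static boolean isExpired(YearMonth expiry, LocalDate today) {
        return today.isAfter(expiry.atEndOfMonth());
    }
}
```

The method signatures alone already tell the reader which time concept is meant, which is exactly what a java.util.Date parameter cannot do.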

This is where Freemarker's object wrapping comes into play. The Configuration class has a setter for an "ObjectWrapper". The standard implementation is called BeansWrapper and we will subclass it to let Freemarker know how to handle the Joda Time types. Let's assume that you are using some kind of dependency injection mechanism to provide the current user's Locale (I'm using Google Guice in the example). We instantiate a date formatter with the desired style for the current locale and return a scalar representation of our LocalDate. Scalar is Freemarker's term for things that can be displayed as strings.
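The following is only a dependency-free sketch of that idea, not the code from the example project. The interface Scalar stands in for Freemarker's TemplateScalarModel, DateAwareWrapper stands in for a BeansWrapper subclass that overrides wrap(Object), and java.time.LocalDate is used in place of the Joda Time class so that the snippet runs on a plain JDK. In a real setup the wrapper would be registered with Configuration.setObjectWrapper() and the Locale would be injected rather than passed in by hand.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

// Stand-in for Freemarker's TemplateScalarModel: something that renders as a string.
interface Scalar {
    String getAsString();
}

// Stand-in for a BeansWrapper subclass. The locale would normally be injected.
class DateAwareWrapper {
    private final DateTimeFormatter dateFormat;

    DateAwareWrapper(Locale locale) {
        // One formatter in the desired style, bound to the user's locale.
        this.dateFormat = DateTimeFormatter.ofLocalizedDate(FormatStyle.MEDIUM).withLocale(locale);
    }

    // Central hook: every value from the data model passes through here.
    Scalar wrap(Object value) {
        if (value instanceof LocalDate) {
            final String text = dateFormat.format((LocalDate) value);
            return () -> text;
        }
        // Fallback. In Freemarker this would be a call to super.wrap(value).
        return () -> String.valueOf(value);
    }
}
```

Because the decision is made once, in the wrapper, every template gets the same date style without asking for it.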

See how the formatting logic in the template completely disappears? Our object wrapper recognizes the LocalDate in the data model and turns it into a ScalarModel (Freemarker's TemplateScalarModel interface), which can be referenced like a string in the template. We can expand this functionality to all the Joda Time classes we are interested in.
But what if we still want to access the date object's fields from within the template? By just turning it into a ScalarModel we have lost that ability. This is easily fixed, however, by extending BeanModel and implementing ScalarModel at the same time. With that in place, we can treat our dates as simple strings as well as access their properties and methods in our templates.


Our next stop will be boolean values. Let's say we want them to translate to the current locale's words for "yes" and "no", respectively. We can do this by implementing a TemplateModel which is both a ScalarModel (for treating it as a string) and a BooleanModel (so we can still use those booleans in if-statements within our template). That was easy, wasn't it? But wait, to do this we would again need to make resource bundle lookups in the BooleanModel implementation!
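A minimal sketch of such a dual-purpose model could look like this. The interfaces ScalarView and BooleanView are stand-ins for Freemarker's TemplateScalarModel and TemplateBooleanModel (whose methods are getAsString() and getAsBoolean()), and LocalizedBoolean is a made-up name. Where the two words come from is deliberately left to the caller here.

```java
// Stand-in for TemplateScalarModel: lets the value be printed as text.
interface ScalarView {
    String getAsString();
}

// Stand-in for TemplateBooleanModel: lets the value be used in conditions.
interface BooleanView {
    boolean getAsBoolean();
}

// One object, two faces: text for output, a real boolean for if-statements.
class LocalizedBoolean implements ScalarView, BooleanView {
    private final boolean value;
    private final String yes;
    private final String no;

    LocalizedBoolean(boolean value, String yes, String no) {
        this.value = value;
        this.yes = yes;
        this.no = no;
    }

    @Override
    public String getAsString() {
        return value ? yes : no;
    }

    @Override
    public boolean getAsBoolean() {
        return value;
    }
}
```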

Message Keys

So finally we also want to get rid of all those message key lookups. Ideally we want a statically typed representation of a message which we then can translate to the correct string using our trusty ObjectWrapper.
Luckily, there is already a library which offers just that. It's called cal10n (pronounced calion). Calion uses an Enum class to represent a bundle of messages and the Enum's values to represent the keys within that bundle. You just create such an Enum, tell it what the base name of its bundle is and place the .properties files in that location. Then you just need to obtain an instance of IMessageConveyor and you can use its getMessage() methods to translate the Enum values into strings.
Of course it also supports an arbitrary number of arguments to the message, using the standard java.text.MessageFormat. All of that behavior can be overridden, for instance if you obtain your messages from a database instead of from property files or if you want a different MessageFormat. Calion also allows you to specify for each bundle (and even each Locale) what encoding to use for reading the messages. So you no longer have to ASCII-encode your property files, but can instead use UTF-8 or whatever suits the corresponding language best. This is really great for German with all its umlauts. Here is a very simple example of such an Enum definition:
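As a rough sketch of the idea, the following plain-Java imitation runs without the library. Everything in it is an assumption made for illustration: the annotation BundleName plays the role of Calion's @BaseName, MapConveyor plays the role of an IMessageConveyor, and a map replaces the .properties files. With the real library you would annotate the Enum with @BaseName and ask an IMessageConveyor for the message instead.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.text.MessageFormat;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Calion's @BaseName annotation.
@Retention(RetentionPolicy.RUNTIME)
@interface BundleName {
    String value();
}

// The Enum is the bundle, its constants are the message keys.
@BundleName("greetings")
enum Greetings {
    HELLO,
    GOODBYE
}

// Hypothetical stand-in for an IMessageConveyor, backed by a map instead of .properties files.
class MapConveyor {
    private final Map<String, String> patterns = new HashMap<>();

    MapConveyor() {
        // This would be the content of greetings_de.properties.
        patterns.put("greetings.HELLO", "Hallo {0}!");
        patterns.put("greetings.GOODBYE", "Auf Wiedersehen, {0}.");
    }

    // Resolves the bundle from the annotation, the key from the constant name,
    // and fills in the arguments with java.text.MessageFormat.
    String getMessage(Enum<?> key, Object... args) {
        String base = key.getDeclaringClass().getAnnotation(BundleName.class).value();
        String pattern = patterns.get(base + "." + key.name());
        return MessageFormat.format(pattern, args);
    }
}
```

A call like new MapConveyor().getMessage(Greetings.HELLO, "Freemarker") then yields the localized greeting, and a mistyped key no longer compiles.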

So we now have our Enums and property files and we have some implementation of IMessageConveyor that we can inject into our ObjectWrapper. We just need to test whether an object is an enum and has the @BaseName annotation; if so, we can look it up with Calion and return it as a localized string. But what about parameterized messages? That's no problem either, as Calion has a class called MessageParameterObj that holds an Enum and the arguments to the message. So if we encounter one of these in our data model, we get back a localized and parameterized message. Here is how the ObjectWrapper is extended to localize message keys and booleans:
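The dispatch logic can be sketched without Freemarker or Calion on the classpath. The names below are all hypothetical: KeyWithArgs stands in for MessageParameterObj, LocalizingWrapper for the extended ObjectWrapper, and a hard-coded German map replaces the injected IMessageConveyor. Instead of checking for the @BaseName annotation, this simplified version just checks whether the enum value is a known key.

```java
import java.text.MessageFormat;
import java.util.HashMap;
import java.util.Map;

enum Errors {
    NOT_FOUND,
    TOO_LONG
}

// Hypothetical stand-in for MessageParameterObj: a key plus its arguments.
class KeyWithArgs {
    final Enum<?> key;
    final Object[] args;

    KeyWithArgs(Enum<?> key, Object... args) {
        this.key = key;
        this.args = args;
    }
}

// Hypothetical stand-in for the extended ObjectWrapper.
class LocalizingWrapper {
    // Replaces the injected message conveyor in this sketch.
    private final Map<Enum<?>, String> patterns = new HashMap<>();

    LocalizingWrapper() {
        patterns.put(Errors.NOT_FOUND, "Nicht gefunden");
        patterns.put(Errors.TOO_LONG, "Der Name {0} ist zu lang");
    }

    // Turns message keys, parameterized messages and booleans into localized text.
    String wrap(Object value) {
        if (value instanceof KeyWithArgs) {
            KeyWithArgs message = (KeyWithArgs) value;
            return MessageFormat.format(patterns.get(message.key), message.args);
        }
        if (value instanceof Enum && patterns.containsKey(value)) {
            return MessageFormat.format(patterns.get(value), new Object[0]);
        }
        if (value instanceof Boolean) {
            return ((Boolean) value) ? "Ja" : "Nein";
        }
        // Everything else is left alone (super.wrap(value) in a real wrapper).
        return String.valueOf(value);
    }
}
```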

As a result, we can now just put our Enums into the data model and everything gets translated automagically. For such a small example, this may not look like a huge improvement, but think of this: You have some kind of backend logic that wants to report errors which will get displayed to the user. At the moment the backend logic runs, the language of the message is not even known yet. So instead it would have to return the key of a localized string. Then our frontend logic would need to remember that it has to look up that string in a ResourceBundle. And of course it would have to know which ResourceBundle to look into. This all just disappears if our backend logic returns MessageKey Enums or MessageParameterObj instances instead. We just put them into our data model and the user sees a localized message. Isn't that just awesome?

Calion has even more benefits, by the way: It has a Maven plugin which can automatically check for any missing keys in your property files. There is also a (not yet final) Eclipse plugin that can do the same thing. So not only can you never have typos in your message keys again, but you will also notice when a translation is missing.

Looking back

By doing a bit more work on the configuration of Freemarker, we were able to greatly reduce the complexity of both our Java classes and template files. The example can't really show how big a benefit that is, but again, just imagine you have hundreds of classes and templates that need to display localized data. This will save you a lot of time debugging templates and hunting localization errors, giving you the opportunity to focus on more high-level usability issues of your app.

You can get the full example project at GitHub