Book Review: The Definitive Guide to Terracotta

Terracotta is a “transparent clustering technology” that allows you to make data structures available across a cluster of machines in a highly scalable and robust manner. Unlike many other clustering solutions (including the very popular memcached), it doesn’t expose an API that a developer uses to push data structures in and out of a big distributed container. Rather, it’s a library that’s bootstrapped into your JVM, with its behavior driven by an XML config file. This allows for sharing the data in the fields of a class across the cluster, as well as synchronized access to objects, just like in any multithreaded application. Terracotta is able to do this through some very interesting decoration of bytecode as Java classes are loaded into the JVM. What this ultimately allows for is something like a large memory heap shared by all JVMs, one that can survive JVM crashes since all data is also written to disk. Additionally, since Terracotta doesn’t use a peer-to-peer approach to data replication, it’s easier to achieve linear scalability.
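To make the “just like in any multithreaded application” point concrete, here is a minimal sketch of the kind of plain Java code being described. It is my own illustration, not code from the book: the EventLog class and its method names are hypothetical, and it runs in a single JVM with ordinary threads. The idea, as I understand it, is that code shaped like this contains no clustering calls at all, and the XML config is what would mark the field as shared.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: ordinary Java, no clustering API anywhere.
class EventLog {
    // Assumption: a field like this is what the XML config would declare
    // as shared state. Nothing in the code itself refers to Terracotta.
    private final List<String> events = new ArrayList<>();

    void record(String event) {
        // Standard Java monitor; guards the list against concurrent writers.
        synchronized (events) {
            events.add(event);
        }
    }

    int size() {
        synchronized (events) {
            return events.size();
        }
    }

    // Starts threadCount threads that each record eventsPerThread entries,
    // waits for them to finish, and returns the total number recorded.
    static int runDemo(int threadCount, int eventsPerThread) {
        EventLog log = new EventLog();
        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            final int id = t;
            Thread thread = new Thread(() -> {
                for (int i = 0; i < eventsPerThread; i++) {
                    log.record("thread-" + id + "-event-" + i);
                }
            });
            threads.add(thread);
            thread.start();
        }
        try {
            for (Thread thread : threads) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return log.size();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 1000)); // prints 4000
    }
}
```

Because every write goes through the same lock, no entries are lost, whichever thread wins each race. The pitch of the technology is that the same guarantee extends to threads running in different JVMs.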

Sound interesting? Learn more at the Terracotta web site.

This is an excellent book. The prose is well-written and engaging and the book flows very well from section to section. There are massive amounts of Java code and configuration, so you very rarely have to picture anything in your mind: you can just read it there on the page. There are helpful diagrams where appropriate. It’s unlikely that a reader with a good understanding of Java will become confused at any point during the book. It’s informative and provides some excellent examples of real world use, especially the chapters on integration with things like Spring, Hibernate, session replication and more. There is also an extensive chapter on using Terracotta to create a Master/Worker compute grid. If you’re looking to learn more about Terracotta I really can’t recommend this book enough: it helped fill so many gaps that I had after skimming some of the documentation and reading a few of the white papers.

The only real negative is that the book is slow to get started. The first two chapters (~40 pages) serve as an introduction to the technology and a history of it, respectively, but taken together they amount to one very lengthy introduction that rehashes a lot of the same concepts. Maybe this was tedious for me given that I’d already read a lot of documentation on Terracotta, but it just seemed like the intro could have been a bit shorter.

I’ll now run down the other chapters in the book and detail some that I found particularly interesting.

Chapter 3 is a quick jump into the framework and some tooling while Chapter 4 gets into the nitty-gritty details of POJO clustering. This is important to read to understand how Terracotta does what it does. Chapter 5 talks about how to do caching and this is where you start to understand the real world problems that can be solved using the tool. Your database will thank you!

Chapter 6 is where it gets really interesting. Here you will learn how to use Terracotta as a 2nd-level cache provider for Hibernate to significantly boost performance over using something like Ehcache. More startling than that is a proposed architecture where the notion of POJO clustering is used to cluster the data structures that hold detached objects (those not attached to an active Hibernate Session). You are shown how to change application code that uses the Hibernate API in a “typical” fashion to achieve performance increases measured in orders of magnitude. This is truly an eye-opener.

Chapter 7 shows you how to cluster HTTP Sessions and how you can be freed from some of the annoying restrictions of the Servlet container Session API, such as implementing Serializable and religiously calling setAttribute(). This is the sort of thing you can plug into an existing application very quickly and realize enormous scalability gains.
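For anyone who hasn’t been bitten by the setAttribute() restriction, here is a toy model of why it exists. This is my own sketch, not code from the book, and not the behavior of any particular Servlet container or of Terracotta: ToyReplicatedSession and its methods are hypothetical, and the “replica” is just a byte array in the same JVM. The assumption it encodes is that a replicating session only ships a serialized copy of a value at the moment setAttribute() is called.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of serialization-based session replication.
class ToyReplicatedSession {
    private final Map<String, Object> local = new HashMap<>();
    // Stands in for the copy held by another node (assumption).
    private final Map<String, byte[]> replica = new HashMap<>();

    // Values must be Serializable, and the replica is refreshed only here.
    void setAttribute(String name, Serializable value) {
        local.put(name, value);
        replica.put(name, serialize(value));
    }

    Object getAttribute(String name) {
        return local.get(name);
    }

    // What another node would see after a failover in this toy model.
    Object readFromReplica(String name) {
        byte[] bytes = replica.get(name);
        return bytes == null ? null : deserialize(bytes);
    }

    private static byte[] serialize(Serializable value) {
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Stores an empty cart, mutates it in place, and optionally calls
    // setAttribute() again. Returns {local size, replica size}.
    static int[] cartSizesAfterMutation(boolean callSetAttributeAgain) {
        ToyReplicatedSession session = new ToyReplicatedSession();
        ArrayList<String> cart = new ArrayList<>();
        session.setAttribute("cart", cart);
        cart.add("book"); // in-place change; the replica is not told
        if (callSetAttributeAgain) {
            session.setAttribute("cart", cart);
        }
        int localSize = ((List<?>) session.getAttribute("cart")).size();
        int replicaSize = ((List<?>) session.readFromReplica("cart")).size();
        return new int[] {localSize, replicaSize};
    }
}
```

In this model, cartSizesAfterMutation(false) returns {1, 0}: the local session sees the book, the replica does not. Calling setAttribute() again brings the replica back in line. The appeal of the approach described in the chapter is that changes to the object itself are tracked, so this discipline is no longer the developer’s job.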

Chapter 8 is about clustering Spring beans. Spring and Terracotta follow a very similar philosophy in that they are non-invasive frameworks. As such, they complement each other very nicely. This chapter shows how easy it is to cluster Spring beans: even easier perhaps than clustering POJOs as in Chapter 4. At this point, if you are a user of Spring and Hibernate, you’re starting to see how easy it can be to achieve serious scalability and performance improvements.

Chapter 9 talks about Terracotta Integration Modules, which are a sort of package that provides additional features to the Terracotta core: this is how integration with Hibernate and Spring is achieved. Chapter 10 gives an extended treatment of thread coordination, showing how well-written multithreaded code can be used with Terracotta to achieve thread coordination across multiple JVMs. Chapter 11 takes this further to detail the Master/Worker pattern for computing grids. Chapter 12 rounds things out by showing the visualization tools that can be used to monitor and debug an app using Terracotta.
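If the Master/Worker pattern from Chapter 11 is new to you, here is a minimal single-JVM sketch of its shape. This is my own illustration using the standard java.util.concurrent queues, not the book’s implementation: the MasterWorkerSketch class, the sum-of-squares task and the STOP marker are all hypothetical. I’m assuming the work and result queues are the structures that would be shared across JVMs in a clustered version, and the book’s design may well differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical single-JVM sketch of the Master/Worker pattern.
class MasterWorkerSketch {
    // Marker item that tells a worker to shut down (a "poison pill").
    private static final int STOP = -1;

    // The master splits the job into items 1..workItems, the workers
    // square each item, and the master adds up the results.
    static long sumOfSquares(int workItems, int workerCount) {
        BlockingQueue<Integer> work = new LinkedBlockingQueue<>();
        BlockingQueue<Long> results = new LinkedBlockingQueue<>();

        List<Thread> workers = new ArrayList<>();
        for (int w = 0; w < workerCount; w++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        int item = work.take(); // blocks until work arrives
                        if (item == STOP) {
                            return;
                        }
                        results.put((long) item * item);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers.add(worker);
            worker.start();
        }

        try {
            // Master: hand out the work, then one STOP per worker.
            for (int i = 1; i <= workItems; i++) {
                work.put(i);
            }
            for (int w = 0; w < workerCount; w++) {
                work.put(STOP);
            }
            // Master: collect exactly one result per work item.
            long total = 0;
            for (int i = 0; i < workItems; i++) {
                total += results.take();
            }
            for (Thread worker : workers) {
                worker.join();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("master interrupted", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10, 3)); // prints 385
    }
}
```

The workers never talk to each other or to the master directly. Everything goes through the two queues, which is what makes the pattern a natural fit for a technology that shares data structures between JVMs.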

As I said before, this is a great book. If you’re interested in scaling out enterprise Java applications, you owe it to yourself to check out Terracotta and this book.