David Wolverton

04/05/2011

Alfresco Performance with CMIS

I’ve been working on fine-tuning Pepper performance for quite some time, and the biggest bottleneck always seems to be the underlying Alfresco repository. From time to time, my associates and I have dug under the covers of Alfresco to analyze, optimize and work around problems in order to squeeze every reasonable ounce of performance out of the system. We have found two areas that really drag on user experience. One is workflow on Alfresco’s WCM/AVM repository, and the other is search operations. Workflow will not be covered in this article. Instead, I will focus on optimizing search, or more precisely, any operation that returns a moderately large set of results.

Breaking it down

Pepper communicates with an Alfresco repository via web service calls (either Alfresco Web Scripts or the new CMIS API). We found that some searches were taking particularly long. We tested whether the complexity of the search caused the slowness, but it turned out that search complexity has almost no effect on performance with Alfresco. What made the biggest difference was the number of results returned. In fact, the time it takes to perform a search grows linearly with the number of results. This held for both AVM and DM repositories.
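
One way to check a claim like this is to time the same search at several result-set sizes and fit a straight line through the measurements. The sketch below is a minimal least-squares fit in Python. The sample timings are invented for illustration and are not our measurements, and `fit_line` is a hypothetical helper name.

```python
def fit_line(points):
    """Ordinary least-squares fit of t = slope * n + intercept.

    points: list of (result_count, seconds) pairs.
    """
    count = len(points)
    mean_n = sum(n for n, _ in points) / count
    mean_t = sum(t for _, t in points) / count
    # Covariance of (n, t) divided by the variance of n gives the slope.
    cov = sum((n - mean_n) * (t - mean_t) for n, t in points)
    var = sum((n - mean_n) ** 2 for n, _ in points)
    slope = cov / var
    intercept = mean_t - slope * mean_n
    return slope, intercept


# Invented sample data: (number of results, seconds). Not real measurements.
samples = [(100, 1.2), (200, 2.2), (400, 4.2), (800, 8.2)]
slope, intercept = fit_line(samples)
print(f"per-result cost: {slope * 1000:.1f} ms, fixed cost: {intercept:.2f} s")
```

If the points sit close to the fitted line, the slope approximates the cost of each additional result and the intercept approximates the fixed cost of the query itself.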

Digging deeper, we discovered that after Alfresco determines which nodes should be returned in the result set, it fetches each node with its properties one by one from the database. Not only is this a lot of database calls, but they turn out to be somewhat expensive calls, because each one has to join the correct tables and collect all the property values. To make matters worse, an additional call is made for each result in order to check for read permission. This call is similarly expensive.
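
The access pattern described above can be modeled with a small counter. This is a toy Python model, not Alfresco code: `FakeRepository`, its methods and `search` are hypothetical names, and each method call stands in for one database round trip.

```python
class FakeRepository:
    """Toy stand-in for a repository; counts simulated database calls."""

    def __init__(self):
        self.db_calls = 0

    def find_matching_ids(self, total):
        # One query decides which nodes belong in the result set.
        self.db_calls += 1
        return list(range(total))

    def load_node(self, node_id):
        # One call per node to fetch the node and its properties.
        self.db_calls += 1
        return {"id": node_id}

    def can_read(self, node_id):
        # One more call per node to check read permission.
        self.db_calls += 1
        return True


def search(repo, total, check_permissions=True):
    """Build the result list the slow way: node by node."""
    results = []
    for node_id in repo.find_matching_ids(total):
        node = repo.load_node(node_id)
        if not check_permissions or repo.can_read(node_id):
            results.append(node)
    return results


repo = FakeRepository()
search(repo, 500)
print(repo.db_calls)  # 1 + 2 * 500 = 1001
```

Under this model the call count is 1 + 2N for N results, which matches search time growing linearly with the size of the result set. Passing `check_permissions=False` drops the count to 1 + N.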

What can be done?

In AVM land…

For AVM, we found the biggest improvement by simply turning off the security system in Alfresco. This eliminated half the database calls and made the searches almost twice as fast. Of course, disabling security is not an option for many of our clients. We achieved some smaller gains by rewriting our Alfresco search Web Scripts to use a Java backing instead of JavaScript and streamlining them to hit lower-level APIs, but this only helped by 10 to 25%.

How about DM?

At the same time, we had been considering moving from AVM to DM via the new CMIS API in Alfresco Enterprise 3.3. We had hoped that performance would be significantly better there, because we knew that DM was really the heart and soul of Alfresco. Unfortunately, our tests showed no significant difference.

Maybe an upgrade?

According to Alfresco’s wiki, the Enterprise 3.4.0 release includes the following improvement:

“Enhanced read permission evaluation performance in the form of a new method on the PermissionService called hasReadPermission. This is particularly beneficial for large results sets.”

This sounded very promising for us, since we knew that this was the very call causing half of our performance woes. So we turned our test engine toward a new Enterprise 3.4.0 installation. Much to our dismay, the results were virtually identical to what we found with Enterprise 3.3.3.

CMIS Bindings? Who knew?!

After all of our research, we ended up accidentally stumbling across one thing that made the biggest difference. If you’re familiar with the CMIS spec, you may know that it defines two “bindings”, or APIs for communicating between client (Pepper) and server (Alfresco). One of these is the AtomPub binding which defines certain extensions to the Atom standard. The other is the Web Services binding, which follows SOAP, WSDL and Web Services standards. At first we had only been using the AtomPub binding, but when we started testing out the Web Services binding we noticed that search performance doubled. It was the same on both 3.3.3 and 3.4.0.
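
A comparison like this can be run with a small timing harness that issues the same query through each binding and compares median times. The Python sketch below assumes you supply the two callables yourself: `run_atompub_query` and `run_webservices_query` are hypothetical placeholders that do no real work here, and `median_seconds` is an illustrative helper, not part of any CMIS library.

```python
import statistics
import time


def median_seconds(fn, repeats=5, clock=time.perf_counter):
    """Call fn() `repeats` times and return the median wall-clock duration."""
    durations = []
    for _ in range(repeats):
        start = clock()
        fn()
        durations.append(clock() - start)
    return statistics.median(durations)


def run_atompub_query():
    # Placeholder: issue the CMIS query through the AtomPub binding here.
    pass


def run_webservices_query():
    # Placeholder: issue the same query through the Web Services binding here.
    pass


atompub = median_seconds(run_atompub_query)
webservices = median_seconds(run_webservices_query)
print(f"AtomPub: {atompub:.6f} s, Web Services: {webservices:.6f} s")
```

Taking the median rather than the mean reduces the influence of a single slow run, such as the first request after a restart. The `clock` parameter exists only so the helper can be tested with a fake clock.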

I do not have a full explanation for why the Web Services binding is so much faster with Alfresco. My suspicion is that it has something to do with the fact that Alfresco uses Web Scripts with JavaScript and FTL to implement AtomPub, while the Web Services binding probably has a more direct route to the lower-level APIs.

So in the end, two adjustments helped us get our Alfresco queries running significantly faster. On the AVM side, we rewrote our web script in Java using lower-level APIs and a few other optimization tricks. On the CMIS side, we just switched over to using the Web Services binding. In both cases we saw about a 50% performance improvement. Even so, we’re always looking to shave off an extra few seconds on those large queries. I’d love to get your comments if you have any extra insight.
