Hadrian Zbarcea

Monitoring camel routes using camel-bam

Hadrian Zbarcea - Sun, 04/01/2012 - 00:02
Last week I got a question from an Apache Camel user about ways to get statistics on the activity on a route, including errors. There are quite a few ways to achieve that. First there are obviously the logs, but they are mostly useful to investigate a particular problem. Then there are the performance counters (defined in the org.apache.camel.api.management package) that could be obtained either via the JMX support or even exposed as a RESTful service. If one wants to be more fancy she could use a custom ErrorHandler.

There are however situations when message processing is faulty even when all individual messages are processed correctly. As an example, messages not processed in time can lead to service level agreement violations, or messages may be missing, or other business process execution use cases. While not pretending to compete with more specialized tools for business processes (such as BPEL or BPM engines) and while being inherently stateless, camel does offer support for scenarios like the ones described above, via the camel-bam component.

The BAM component is actually one of the rare exceptions that implements a stateful engine using JPA and a database of your choice. It also offers a very neat way to correlate messages by using Expressions (typical camel style) that evaluates against a message to produce a value used for correlation. As such, messages don't necessarily need to have an explicit correlation identifier as long as one can be computed from the content of the message, which means that it can normally be retrofitted to support systems not designed with correlation in mind. At its core camel-bam provides a set of temporal rules that trigger events. These events are processed by camel via processes (called activities) that are nothing else but specialized routes. BAM also offers a dsl for building activities via a ProcessBuilder (which is actually a specialization of the RouteBuilder you may be more familiar with).

That said, I put together a small example in my github account. It illustrates some very basic banking operations (like debits and credits from an account). The relevant part of the process is in ActivityMonitoringBuilder.configure() and looks something like this:

ActivityBuilder teller = activity("direct:monitor-teller")
ActivityBuilder office = activity("direct:monitor-office")
ActivityBuilder errors = activity("direct:monitor-errors")



The trigger events are generated when regular message flow down the routes defined in the BusinessProcessBuilder.

If you have more questions about how this example works (or bam in general) drop me a note or ask on the camel users@ lists. Enjoy!

Categories: Hadrian Zbarcea

Apache Rave a new ASF Top Level Project

Hadrian Zbarcea - Wed, 03/28/2012 - 04:02
After one year of hard work, Apache Rave graduated from incubation. Although the resolution was approved by the ASF board last week, the official announcement came only this morning (also featured on NASDAQ GlobeNewswire).

First initiated by Ate Douma and discussed at ApacheCon 2010 in Atlanta, the idea quickly captured the imagination of a group of developers and the proposal was submitted in February 2011. After a bit more then a year and a few releases, Rave is a promising top level project.

Dubbed a "web and social mashup engine", Rave is a powerful yet lightweight, secure, customizable platform for services and content delivery also supporting a number of social network technologies. Due to its content aggregation capabilities via specs like OpenSocial or W3C Widgets, Rave is already adopted by a number of portal projects. As content aggregation happens in the browser, many of the issues of traditional portlet based technologies are avoided.

For those those interested in a versatile platform for content delivery, check out Rave. Congrats to the Rave team for reaching this milestone and good luck going forward!
Categories: Hadrian Zbarcea

How to protect the release GPG key

Hadrian Zbarcea - Mon, 01/09/2012 - 18:01
Recently I have been asked about how I handle the gpg key I use for Apache releases. For what is worth, the question popped up in the context of a few other community members taking on the release manager role. As those of you that follow Apache Camel already know the Camel PMC decided to actively support and issue patch releases for the two latest minor branches.

But I digress. As I mentioned, I don't keep my private key on my laptop, but on an encrypted usb flash disk. The main reason is security, as the probability of someone getting access to my box greater than zero. In particular the key used for Apache releases is trusted by other ASF members and making sure it doesn't get compromised is one of the duties of the release manager. Of course one could revoke a key, but then verifying the integrity of a release becomes complicated at best.

My setup works on Ubuntu 11.10 and the idea behind it was using something similar to 2-factor authentication (something I have and something I know). I use a relatively cheap 4G usb flash memory (a sturdier, metal one). My usb flash uses a FAT32 file system and is not encrypted, but on it I created a 256M file (ringo.img) as an encrypted ext3 disk partition. My usb flash is mounted as '/media/APACHE', not much to do about the uppercase, a perk of FAT32. The encryption uses 256-bit aes and sha512. Below are the commands I used to create my encrypted partition (inspired from this article).
sudo apt-get install cryptsetup
cd /media/APACHE
export usbdisk=$PWD
dd if=/dev/zero of=$usbdisk/ringo.img bs=1M count=256

sudo modprobe cryptoloop
sudo modprobe dm-crypt
sudo modprobe aes_generic
export loopdev=$(sudo losetup -f)
sudo losetup $loopdev $usbdisk/ringo.img
sudo cryptsetup -c aes -s 256 -h sha512 -y create usbkey $loopdev

sudo mkfs.ext3 /dev/mapper/usbkey
sudo mkdir -p /media/encrypted
sudo mount -t ext3 /dev/mapper/usbkey /media/encryptedAfter that, the encrypted media should be working an mounted as '/media/encrypted'. The article explains how to write mount.sh and umount.sh scripts to automate the mounting and unmounting of the encrypted media. You need to make sure you don't forget to unmount your encrypted partition to avoid data loss, although the risk is in good part mitigated by the use of a journaled file system.

Next is to setup gpg to use the keys from the encrypted file system. If you already use gpg, it's very simple. It stores the keys in a .gnupg directory in a user's home. The simplest way is to move the .gnupg directory on the encrypted media and create a softlink to it. As I want to be able to perform public key operations (encrypt/verify) even when my usb flash is not mounted, this is what I did:
mkdir ~/.gnupg-local
cp ~/.gnupg/pubring.gpg ~/.gnupg-local
cp ~/.gnupg/trustdb.gpg ~/.gnupg-local
sudo mv ~/.gnupg /media/encrypted
ln -sf /media/encrypted/.gnupg ~/.gnupg
# unmount ringo
sudo ln -sf ~/gnupg-local /media/encrypted/.gnupgNote that the last command was executed after unmounting the encrypted partition. That way, if the encrypted partition is mounted the .gnupg softlink in my home will point to it, otherwise it will point to the softlink in the /media/encrypted which points back to ~/.gnupg-local.

What I did is something slightly different, I wanted to avoid issuing a manual command every time I plug the usb flash in. The system that manages dynamic devices is udev, so I had to write a udev rule. I made (renamed) copies of the mount and umount scripts in /usr/local/sbin and got rid of sudo in the scripts
-rwxr-xr-x 1 root root 449 2012-01-08 22:29 mount-ringo
-rwxr-xr-x 1 root root 167 2012-01-08 22:29 umount-ringoI added a udev rule:
hadrian@rem:~$ cat /etc/udev/rules.d/100-mount-ringo.rules
ACTION=="add", SUBSYSTEMS=="usb", ATTRS{idVendor}=="1307", ATTRS{idProduct}=="0165", RUN+="/usr/local/sbin/mount-ringo"The idVendor and idProduct values you can get from `lsusb`. They make sure the rule only fires when you mount the right usb flash memory. You can use other attributes to identify your flash if you prefer, like the label for instance.
hadrian@rem:~$ lsusb
Bus 003 Device 018: ID 1307:0165 Transcend Information, Inc. 2GB/4GB Flash DriveYou can test your rules using `udevadm`. For that you need to know the mounted device which you can get if you know the label ('sde' in my case). If everything is ok, you should see a 'run: <your-script>' message near the end.
hadrian@rem:~$ ls -al /dev/disk/by-label/APACHE
lrwxrwxrwx 1 root root 9 2012-01-09 11:13 /dev/disk/by-label/APACHE -> ../../sde
hadrian@rem:~$ sudo udevadm test /sys/block/sde
run_command: calling: test
run: '/usr/local/sbin/mount-ringo'
run: '/lib/udev/hdparm'
run: 'socket:@/org/freedesktop/hal/udev_event'Once you are happy with your test you can restart udev to use your new rule:
hadrian@rem ~$ sudo udevadm control --reload-rulesAs a backup, you may want to put your private key on a non-encrypted usb flash that you don't use and is kept in a secure location. Now you should be all set, just pull your usb flash out and put it in your pocket and remember to always umout first. Enjoy!
Categories: Hadrian Zbarcea

The Case for Patch Releases

Hadrian Zbarcea - Wed, 09/21/2011 - 20:12
A new patch release of Apache Camel, version 2.8.1 was released last week. I didn't announce releases in a while. Why write about camel-2.8.1 then?

The reason is that the Apache Camel community didn't produce patch releases in the past (we only did it for the 1.6.x version before discontinuing support for the 1.x versions). We actually started producing patch releases back in April for camel-2.7.x. At the time, I wasn't sure if and how this will continue because you need enough support within the community and some had the view that at Apache we should only focus on innovation.

The untold truth about that though has to do with the business models of open source. ZDNet had an excellent article a while ago, which makes for interesting reading. The author left out another business model, let's name it FUD ware, that relies on bundling known, successful open source projects and claim that only your distribution is production ready and offers what the market needs. That works better if you have some influence over the community that produces the original open source project. It is not enough, that's for sure, but combined with other business models (such as 1. Support Ware or 4. Project Ware) it may make the needed difference.

A few of us however, want the original ASF distribution to be stable, secure and production ready. After a bit of a struggle I am happy we managed to get the Apache Camel community used to the idea of producing patch releases more frequently. Between Dan Kulp and myself we issued since mid April three patch releases on camel-2.7.x, camel-2.8.1 last week and camel-2.8.2 is only a few weeks away.

This way you, our users, won't have to wait at least a quarter to have your issues resolved. At the end of the day, you the users, are the Apache Camel community.
Categories: Hadrian Zbarcea

Congrats Master Melegrito!

Hadrian Zbarcea - Tue, 09/13/2011 - 04:34
Personal things don't usually belong here, but this is a bit special. I am also still sore after this weekend.

Once a year, me and Ama have the opportunity to go with our instructor, 4th Dan Tae Kwon Do Master Josh Geeson, to train in self defense against stick and knife with Master Julius Melegrito, event organized in Charlotte, NC by Master Evins.

The seminar was a real treat as expected. Master Melegrito's speed and technique are amazing. After drills with two sticks, we got to one stick, repeating the same techniques. After the shoulders started burning we dropped the stick altogether and went mano-mano, repeating again, the same techniques only to realize how similar they are to Hapkido and Tae Kwon Do. Too bad this year we ran out of time and didn't get to practice defense against knife attacks. I am not complaining much though because last year my partner was Master Robert Shin who's not very forgiving and the hardwood floor was, well, hard.

Yes, Ama, you are the best!

What makes this year's event special is that Guro Melegrito is now a Black Belt Magazine 2011 Hall of Famer, the Weapons Instructor of the Year. Congrats Master Melegrito, many thanks for visiting and see you again next year!
Categories: Hadrian Zbarcea

Annotation for the Claimcheck Pattern

Hadrian Zbarcea - Thu, 09/08/2011 - 04:00
The claimcheck pattern is described in the EIP book Camel is based on. The way it is described, the pattern uses a data store which means that it applies mostly to local processing. However, in most of the cases, at least in my experience, the processing is not local so one needs to retrieve the original message at a different location where access to the data store is not necessarily possible.

As a bit of background, the commonly used analogy for the claimcheck pattern is the process used to check-in baggage while traveling and claim it at the destination. The rationale is that cabin space is a scarce resource (as is cpu, memory and bandwidth) and different parts of the in-flight entity have different SLA requirements: humans need oxygen, leg room, pressurized cabin space, food, entertainment, etc. whereas baggages can be stacked in the cargo hold. In some cases they may take different routes to destination as well.

So how should this be implemented? Let's take a look at the elements that define the pattern. First we have a message that via some logic will be transformed into two messages (1).  Let's call the two Messages the "initial message" and the "claim message". That is the Content Filter part described in the pattern, not to be confused with the filter dsl in camel that is actually a Message Filter, very different thing.

Our two messages will travel separately between departure and arrival, to borrow the terminology from the travel analogy via two different Message Channels (2), which in Camel we know as routes. The first message channel is the initial route we setup, let's call it the "main channel". The second channel is a one way one, let's call it the "baggage channel". Between departure and arrival, the initial message will go on the bagage channel and the claim message will go on the main channel.

We also need to generate a Correlation Identifier (3) to preserve the association (the equivalent of the bar coded tag on the baggage and the boarding pass). The fact that one could have multiple baggages is outside the scope of this pattern, it can be handled by a splitter/aggregator. The correlation id is both attached to the initial message and supplied somewhere in the claim message.

The claim message will replace the initial message on the main channel and may undergo further processing. It may contain partial content from the initial message required for processing, it may need to be of a specific type, we cannot make many assumptions. Two things are clear though: it is produced from the initial message, so we need a Processor (4) for that, and it contains the correlation id, so we need an Expression (5) to extract it from there at destination. While we need to attach the correlation id to the initial message too, we have a bit more freedom on the baggage channel, so let's simplify things a bit and use a custom header/property instead of customizing too much and require another Expression.

Putting together the five elements above, it starts to look like in the general case we need a separate, configurable one-way (in-only) route with some processing in the departure and arrival endpoints. I already looks like our implementation will actually require a different kind of Endpoint which also means a Component. It should support multiple queues at departure and arrival (i.e. both for check-in and claim). Since processing takes place on the main channel and at arrival during baggage claim the initial message is restored, it must be retrieved from the arrival claim queue, which means that the arrival queue must support random access. This is a key aspect, sometimes overlooked, that gives a lot of grief when implementing this pattern.

As far as implementation goes, I started to play with a bit of code so you could see a solution in one of my next posts and probably in the Apache Camel trunk soon.
Categories: Hadrian Zbarcea

Apache Camel 2.5.0 Released

Hadrian Zbarcea - Mon, 11/01/2010 - 04:34
The 2.5.0 release of Apache Camel is finally out. It is the result of some two and a half months of improvements since the last release in July. You can check the release notes for more details and give it a try.

The PMC is planning to focus now on releasing the 3.0 version of Camel in Q1 of next year. The 2.x version will continue to be actively improved for the foreseeable future (at least through 2011) and we plan to have another camel 2.x release this year.

To help planning the new 3.0 release the Camel PMC conducted a survey during the month of October. Many thanks to all who participated in the survey and made their voice heard. The results of the survey will be made public this week during ApacheCon. If you want to see the goodies coming in the 3.0 version, keep an eye on the roadmap.
Categories: Hadrian Zbarcea

Should you getIn or getOut?

Hadrian Zbarcea - Thu, 10/07/2010 - 03:37
From recent posts on the camel users@ mailing list there seems to be some confusion related to the use of getIn() and getOut() on a Camel Exchange. The question is relevant to developers who want to implement a processor and need to understand what should happen with the result of their processing. There is a wiki page that attempts to explain the usage of the two methods, but sadly that mostly fuels the confusion than clarifies anything. The page will be corrected soon, you may see a newer, corrected version, but at the time of my writing current is version 4 (which was recently improved and a bit more accurate than version 2).

So the idea (deeply rooted in some older discussion about what the Exchange api should be) is that you should use getIn(), because, uhm, "it's often easier just to adjust the IN message directly" and that has something to do with MEPs and a poor relative of camel called JBI. Before we get into more details, the reality is much simpler, using getIn() or getOut() has not much to do with MEPs and you should do what's right, which is process the in message (most likely), generate an out message (if needed) or update the in (sometimes).

Camel is one of the integration space best frameworks (I'd think the best, but then you'll say I'm biased) and it is true that it was influenced by JBI. Some of its roots are in Apache Servicemix and three Camel PMC members have been at some point involved in drafting the jbi specs: James with jbi 1.0 and Guillaume and Bruce with jbi 2.0. One of the main goals of Camel is to make integration simpler and more intuitive than jbi (and complement jbi in some cases).

Now, just because the jbi 2.0 was inactive for a good while doesn't mean that the Camel Exchange API is broken as some may say. In particular, the jbi 1.0 spec says that "The message exchange model is based on the web services description language 2.0 [WSDL 2.0] or 1.1 [WSDL 1.1]" (jbi-1.0-fr-spec, pg 5) and WSDL is pretty much alive and kicking. Then there is the discussion about the MEPs, which is 100% correct and 100% irrelevant. The reason is that the MEPs refers to the route not what happens within a particular Processor. You could use the exact same Processor on routes that are in-only or in-out.

The concept though is much older than WSDL also. When implementing a function one would pass arguments and return a value. Think of a process() function that gets a Message as an argument and returns another Message.

Message process(Message in) throws Exception;

So the question then becomes should I return the same in Message or another one? Well, that's entirely up to you and what the process() function is supposed to do. The only thing that happens in Camel is that we wrap the two messages in an Exchange (plus the exception and some other metadata, because we needed an execution context) and that's what we pass to process(), so the function signature becomes:

void process(Exchange exchange);

... which is pretty much the definition of the Processor interface. Or you can think of the exchange as being the stack frame prepared for the function execution in the first case (not much of a stretch).

One important thing to note, that may clarify a bit the confusion is that getOut() will not really return the out message, but rather will lazily create an empty DefaultMessage for you (on the first call, and the existing out message on subsequent calls). Therefore there is no out message until you create it by calling getOut(), the assumption being that you called getOut() with the intention of returning the new distinct Message resulted from your processing. As a consequence the assertion getOut() ==null will always fail, because the getOut() call will create a message. To address that there is a hasOut() method in Exchange you could use to tell you if you created the out message (in case you don't know that already).

As you probably know, what camel does is to pass an Exchange along the processor chain in a route and pass the out message of the previous processor as an in message for the next. If the previous processor did not produce an out (there are such processors, a logger for example, logs the in, but does not produce an out), then pass the previous in as an in message for the next processor. During its lifetime, an Exchange starts with an in, goes down the processor chain, outs are created and they replace the ins and the old ins are lost in history. And here's where the mep comes into the picture. If the route is in-out, at the end of processing chain, the last message in the exchange is passed back by the consumer to the caller, otherwise the last message is simply ignored (with an in-only mep, the caller is not waiting for an answer).

A problem faced by users is that if they call getOut() and produce a new message, the headers from the in message get lost (well, the whole message gets lost). One thing to know is that if one needs to access some value at different points during processing, the Exchange has a property map. Objects in that map stay with the Exchange, unlike headers that are tied to messages (and probably have no semantics outside the message). As we saw, messages come and go during processing. Exchange properties don't. Now if you want headers to be preserved, you probably have a very specific processor, that can only exist in particular places in a route and make heavy assumptions about the type of in message they receive. If that's the case, yes, you probably only need to slightly modify the in message and since you did not create an out the modified in will be passed further down the route. If you need to update a header you set on a previous processor (such as a timestamp) but don't really rely on the in message per se, you might consider using exchange properties instead of headers. Consider also that even if you modify the in message to preserve the headers, other processors down the route may produce an out (and hence replace the current in) anyway so headers could still get lost down the route.

That's pretty much it. Let's take a look at a few examples. A throttler processor will most likely not care about the in nor generate an out, it will only introduce a delay in processing (which may be adaptive and may depend on a header or property). A profiler processor will probably update timestamps in exchange properties, again ignoring the in, not producing an out and couldn't care less what the exchange pattern of the exchange is. A logger processor will log the in, won't produce an out, mep is again irrelevant. A crypto processor (partial message protection) will use credentials from the exchange properties to alter parts of the in message, sign/verify/encrypt/decrypt (may be either headers or parts of the body), will likely not produce an out. A proxy processor that allows you to integrate some legacy technology into a camel route, will use the input, format it according to the needs of the technology, use the proprietary api, get the result and put it as a new out on the exchange, possibly along with specific headers. A processor that fetches records from a database will most likely put the result set into a new out message as well. So again, should you use getIn() or getOut()? Things depend much on what your processor needs to do.

Don't do what's easy, do what's right!
Categories: Hadrian Zbarcea

Take the Apache Camel Survey!

Hadrian Zbarcea - Wed, 10/06/2010 - 05:18
A new Apache Camel release is just around the corner (version 2.5.0) and recently we announced taking the 1.x branch off life support. Therefore the race is on to deliver a new major version, Camel 3.0 early in Q1 2011 (see roadmap).

To that end the Camel PMC worked on a survey last week intended to get a better feel of how the community is using Apache Camel, what the major issues are (documentation, I know) and what improvements and new features you want to see in the next releases. The survey, sponsored by Sopera, is anonymous and its results will be made public and used by the committers to create a remarkable experience for you.

We care about you, our users, and, from the feedback we got so far, I am impressed by how much you care about us back. Your voice is important and it's one sure way to move the features you care most about higher on our priority lists and busy schedules. Needless to say, feel free to reach out to us on the mailing lists and IRC channel with questions and suggestions.

So please take the survey (http://s.apache.org/camel-survey) if you didn't already and receive our gratitude for your continued support.
Categories: Hadrian Zbarcea
Subscribe to Talend Community Coders aggregator - Hadrian Zbarcea