Open Source For You

Implementing Parallel Processing with Apache Camel

Apache Camel offers various components and enterprise integration patterns (EIPs) to achieve concurrency. This article explains the various options available and the best practices to be followed to achieve high scalability when using these EIPs.

Apache Camel provides a number of EIPs (listed below) that allow a main route to divide processing across multiple sub-routes:

■ Multicast

■ Split

■ Recipient List

■ Wire Tap

Some of these EIPs provide parallel processing support out of the box, helping to achieve high scalability. Camel ships with default configuration settings for these EIPs, which can be tuned further to suit one's requirements and get better performance. This article discusses some of the settings that, when tuned, yield better performance.

To better illustrate these options, let us consider the code snippet given below, which uses Camel's Multicast EIP. We will explore the multiple tuning options available for this EIP and, using sample code snippets, explain how they can be tuned.
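The original snippet is not reproduced in this extract; the following is a minimal sketch of the route the article describes, assuming hypothetical endpoint names (direct:mainRoute, direct:subA and so on) and the article's ResponseAggregator class (an AggregationStrategy implementation, not shown here):

```java
import org.apache.camel.builder.RouteBuilder;

public class MainRouteBuilder extends RouteBuilder {
    @Override
    public void configure() {
        // A message arriving at mainRoute is multicast to three
        // sub-routes, which process it in parallel; the replies are
        // combined by the ResponseAggregator aggregation strategy.
        from("direct:mainRoute")
            .multicast(new ResponseAggregator())
            .parallelProcessing()
            .to("direct:subA", "direct:subB", "direct:subC")
            .end();
    }
}
```
This is a route fragment: it assumes Apache Camel 3.x on the classpath and the sub-route endpoints being defined elsewhere.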

A message arriving at route 'mainRoute' is multicast to three different sub-routes, which process it in parallel; once processing is done, the responses are aggregated using the aggregation strategy defined in the ResponseAggregator class. Now let us look at the various fine-tuning options available to achieve high concurrency and scalability from the above implementation.

a. Custom thread pool

When using the Multicast EIP for parallel processing, Camel uses a default thread pool which has a maximum pool size of 20, limiting the number of parallel threads that can be spawned to 20.
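For reference, Camel's default thread pool profile in recent releases is roughly equivalent to the following configuration (values per the Camel threading-model documentation; verify against your Camel version):

```xml
<threadPoolProfile id="defaultThreadPoolProfile"
                   defaultProfile="true"
                   poolSize="10"
                   maxPoolSize="20"
                   keepAliveTime="60"
                   maxQueueSize="1000"
                   rejectedPolicy="CallerRuns"/>
```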

With these pool size settings, the multicast invocation becomes a major performance bottleneck when processing a higher number of transactions per second (TPS). Also, in cases like the above, where the sub-routes call out to other components, the pool threads spend most of their time in a waiting state and get exhausted quickly. Incoming requests then simply wait for pool threads to free up, leading to increased response times and decreased throughput.

It is recommended to use a custom thread pool tuned for the performance needs of each use case rather than the default thread pool settings. All the parallel processing EIPs mentioned above provide a mechanism for passing a custom thread pool. There are two ways to customise a processor's thread pool.

Approach 1: Explicitly create an ExecutorService (thread pool) instance and pass it to the executorService option. For example:
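A sketch of this approach, again using hypothetical endpoint names and the article's ResponseAggregator; the pool size of 30 is purely illustrative:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.camel.builder.RouteBuilder;

public class CustomPoolRouteBuilder extends RouteBuilder {
    @Override
    public void configure() {
        // Fixed-size pool sized for this route's expected load
        ExecutorService customPool = Executors.newFixedThreadPool(30);

        from("direct:mainRoute")
            .multicast(new ResponseAggregator())
            .parallelProcessing()
            .executorService(customPool)  // use our pool, not Camel's default
            .to("direct:subA", "direct:subB", "direct:subC")
            .end();
    }
}
```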


Approach 2: Define a custom thread pool in camelContext.xml. You can then reference the custom thread pool using the executorServiceRef attribute to look up the thread pool by ID.
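A sketch of such a configuration, assuming a hypothetical pool ID of customThreadPool and a responseAggregator bean defined elsewhere in the context; the pool sizes are illustrative:

```xml
<camelContext xmlns="http://camel.apache.org/schema/spring">

  <!-- Custom pool; tune poolSize/maxPoolSize per route requirements -->
  <threadPool id="customThreadPool"
              poolSize="30"
              maxPoolSize="30"
              threadName="multicastPool"/>

  <route id="mainRoute">
    <from uri="direct:mainRoute"/>
    <multicast parallelProcessing="true"
               executorServiceRef="customThreadPool"
               strategyRef="responseAggregator">
      <to uri="direct:subA"/>
      <to uri="direct:subB"/>
      <to uri="direct:subC"/>
    </multicast>
  </route>
</camelContext>
```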

Approach 2 is preferred since the thread pool configuration goes into the configuration file (camelContext.xml) and can be modified without changing any code.

A few points about the configuration of the custom thread pool: define a unique pool for each parallel processing flow, and tune the pool based on the requirements of the route. Factors to consider are the maximum load expected to be handled by the main route, the number of sub-routes to be processed by the thread pool, and the ratio of processing time to wait time expected while the threads execute the sub-routes. Determine the number of threads needed and use the same value for both poolSize and maxPoolSize. Quoting from the Javadoc on the behaviour of pool sizes: "If there are more than corePoolSize but less than maximumPoolSize threads running, a new thread will be created only if the queue is full."
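The quoted behaviour can be observed directly with a plain ThreadPoolExecutor, no Camel involved. This standalone sketch uses corePoolSize 1, maximumPoolSize 2 and a work queue of capacity 1: the second submission is queued rather than given a new thread, and only the third (which finds the queue full) forces the pool to grow.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolSizingDemo {

    // Submits three blocking tasks and returns the pool size observed:
    // task 1 starts the core thread, task 2 is parked in the queue, and
    // only task 3 (queue full) causes a second thread to be created.
    public static int observedPoolSize() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blockingTask = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        pool.execute(blockingTask); // runs on the core thread
        pool.execute(blockingTask); // parked in the queue
        pool.execute(blockingTask); // queue full -> second thread created
        int size = pool.getPoolSize();
        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println("pool size after 3 submissions: " + observedPoolSize());
    }
}
```

This is why setting poolSize equal to maxPoolSize is the predictable choice: with unequal sizes, extra threads appear only under queue pressure, which is easy to misjudge.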

b. Streaming

When using the parallel processing EIPs, we specify an AggregationStrategy that aggregates/combines the responses from the parallel sub-routes into one combined response. By default, however, the responses are aggregated in the same order in which the parallel sub-routes are invoked. This default behaviour causes the aggregation task to spend a lot of CPU time in its polling mechanism. Enabling streaming reduces this CPU usage and provides better performance: the responses are processed as and when they are received, rather than in the order of multicast route invocation.

Note: Streaming should be applied only if the AggregationStrategy does not depend on the order of responses from the sub-routes. An example is:
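A sketch of the streaming option, again assuming the hypothetical endpoints and ResponseAggregator used throughout; streaming() tells Camel to aggregate replies in arrival order:

```java
import org.apache.camel.builder.RouteBuilder;

public class StreamingRouteBuilder extends RouteBuilder {
    @Override
    public void configure() {
        from("direct:mainRoute")
            .multicast(new ResponseAggregator())
            .parallelProcessing()
            .streaming()  // aggregate replies as they arrive, not in invocation order
            .to("direct:subA", "direct:subB", "direct:subC")
            .end();
    }
}
```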

c. Parallel aggregation

The AggregationStrategy combines the responses from the parallel sub-routes into one final response. This is done by invoking the aggregate() method of the AggregationStrategy class for each response received from a sub-route.

By default, Camel synchronises calls to the aggregate() method. If parallel aggregation is enabled, the aggregate() method on the AggregationStrategy can be called concurrently. This can be used to achieve higher performance, provided the AggregationStrategy is implemented in a thread-safe manner.

Note: Enabling parallel aggregation requires the AggregationStrategy to be implemented in a thread-safe manner. An example is:
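A sketch of this option, assuming Camel 3.x (where AggregationStrategy lives in the org.apache.camel package) and the same hypothetical endpoints. The aggregator here collects string bodies into a ConcurrentLinkedQueue so that concurrent aggregate() calls remain safe:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import org.apache.camel.AggregationStrategy;
import org.apache.camel.Exchange;
import org.apache.camel.builder.RouteBuilder;

// Thread-safe aggregation strategy: all shared state lives in a
// concurrent collection, so aggregate() may be called concurrently.
class CollectingAggregator implements AggregationStrategy {
    @Override
    public Exchange aggregate(Exchange oldExchange, Exchange newExchange) {
        if (oldExchange == null) {
            // first reply: seed a thread-safe collection as the body
            Queue<String> bodies = new ConcurrentLinkedQueue<>();
            bodies.add(newExchange.getIn().getBody(String.class));
            newExchange.getIn().setBody(bodies);
            return newExchange;
        }
        @SuppressWarnings("unchecked")
        Queue<String> bodies = oldExchange.getIn().getBody(Queue.class);
        bodies.add(newExchange.getIn().getBody(String.class));
        return oldExchange;
    }
}

public class ParallelAggregationRouteBuilder extends RouteBuilder {
    @Override
    public void configure() {
        from("direct:mainRoute")
            .multicast(new CollectingAggregator())
            .parallelProcessing()
            .parallelAggregate()  // allow concurrent aggregate() calls
            .to("direct:subA", "direct:subB", "direct:subC")
            .end();
    }
}
```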

Figure 1: Camel parallel processing
