By Dave Chappell, Bill Wood
March 1, 2001 12:00 AM EST
Benchmarking any distributed computing middleware product is a complex task. Knowing how well a distributed infrastructure will perform under heavy load with a large number of concurrently connected users is a key factor in planning a development and deployment strategy.
With the advent of Java Message Service (JMS) as the standard for a global-class middleware infrastructure, development organizations can enjoy the luxury of building distributed applications using a common set of APIs and message delivery semantics. At the same time they can pick and choose from a variety of JMS-compliant vendor implementations.
However, the science of testing global-class messaging middleware is uncharted territory as there are no industry-accepted benchmarks. There are many vendor implementations to choose from, all of which claim to be the fastest. Some also claim to be highly scalable. To date, there are no publicly available multiconnection, multitopic, and multiqueue benchmarking test tools to validate these claims.
This article outlines a methodology for conducting a large-volume benchmark using any JMS implementation, focusing on the issues involved in properly measuring performance throughput and scalability. It's intended to help you build your own tests that will provide some objective validation of messaging systems for Java programmers. Source code for the test harness is presented and explained in Listings 1-5 on the JDJ Web site.
The Benchmark Objective
How fast is a messaging system? It seems like a simple question. The easiest test to write is one in which you build a simple messaging client and see how fast it can bounce messages to itself. However, that type of test won't uncover the many issues that come into play once a middleware infrastructure starts to scale up to high numbers of concurrently connected users: threading contention, network bottlenecks, message persistence issues, memory leaks, and overuse of object allocations are a few. Such things are not readily apparent when running a simple test on a single machine. A real-world benchmark should be conducted in an environment similar to, or as close as possible to, the actual deployment environment.
Before embarking on a benchmarking mission, a set of basic questions needs to be asked:
- What's important to measure (messages per second, bytes per second)?
- How are the results to be interpreted?
- Should I measure send rates? Receive rates? Both?
- How often should measurements be sampled?
- What's the minimum necessary duration?
- How many JMS senders and receivers do I need to use to get a realistic view of my expected deployment scenario? What's the ratio of senders/receivers? Many to one? One to many, or many to many?
- What are the diverse scenarios (message size, persistent, nonpersistent, pub/sub, point-to-point)?
The goal of the benchmark is to measure the overall throughput of messages through the system under a given load condition. The measurements need to span from the first send to the last receive, as illustrated in Figure 1. Figure 1 shows the time line associated with the large-scale send/receive test. The important figures that will be derived from this test are the overall length of the test and, from that, the average send and receive (throughput) rate the server actually achieves. The test is designed so there's a "ramp-up" period where all the connections, senders, and receivers are established. The measurement begins when the first message is sent. The measurements continue at designated time intervals until the number of designated time intervals has transpired. This effectively measures the JMS provider's "steady-state" throughput while all senders and receivers are actively processing messages.
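As a minimal sketch of the arithmetic behind these figures, the overall rate is the message count divided by the time elapsed between the first send and the last receive. The class and method names below are illustrative only and aren't taken from the test harness listings.

```java
/** Illustrative helper; not part of the article's listings. */
final class Throughput {
    private Throughput() {}

    /**
     * Average messages per second between two wall-clock timestamps,
     * both expressed in milliseconds (for example, System.currentTimeMillis()).
     */
    static double messagesPerSecond(long messageCount, long firstSendMillis, long lastReceiveMillis) {
        long elapsedMillis = lastReceiveMillis - firstSendMillis;
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("last receive must come after first send");
        }
        return messageCount * 1000.0 / elapsedMillis;
    }

    /** The same measurement expressed as bytes per second for fixed-size messages. */
    static double bytesPerSecond(long messageCount, int messageSizeBytes,
                                 long firstSendMillis, long lastReceiveMillis) {
        return messagesPerSecond(messageCount, firstSendMillis, lastReceiveMillis) * messageSizeBytes;
    }
}
```

For instance, 90,000 messages delivered over a 30-second window works out to 3,000 messages per second.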
Message traffic in a deployed environment is often unpredictable. Bursts in message traffic can occur for extended periods of time. It's important to measure and analyze both the send rates and the receive rates under these high-volume conditions. It's feasible that messages may be produced far faster than they can be consumed (one of the benefits of using a MOM), causing a buildup of messages at the JMS provider layer. If this condition occurs for an extended period of time, message congestion can occur. Such a condition will result in a no-win situation for all messages flowing through the system. Nonpersistent messages will fill up in-memory queues, and eventually reach the limits of the server machine(s) that the JMS provider resides on. Persistent messages will constantly flow into persistent store. The net result of this is thrashing of the server machine, thus further prohibiting the JMS provider from achieving its ultimate goal - delivering messages to their intended destinations.
In such a situation, it's prudent to "throttle" the flow of messages from the message producers through the use of "Flow Control." This ensures a smooth and timely delivery of all messages through the system while still allowing the JMS server processes to behave properly within the physical constraints of the machines they're running on.
Flow control implementations vary from throwing an exception to the sending client (requiring the client code to somehow know when to start sending again), to automatically instructing the senders when to slow down and when to pick up again, to discarding the message altogether.
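The three behaviors can be illustrated with a bounded in-memory buffer. This is only a stand-in: a real JMS provider implements flow control internally, and the class, enum, and method names here are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Stand-in for a provider-side buffer; names are hypothetical. */
final class FlowControlDemo {
    enum Policy { REJECT, DISCARD, BLOCK }

    private final BlockingQueue<String> buffer;
    private final Policy policy;

    FlowControlDemo(int capacity, Policy policy) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
        this.policy = policy;
    }

    /** Returns true if the message was accepted into the buffer. */
    boolean send(String message) {
        switch (policy) {
            case REJECT:
                // add() throws IllegalStateException when the buffer is full,
                // so the sending client has to decide when to try again.
                buffer.add(message);
                return true;
            case DISCARD:
                // offer() returns false when the buffer is full; the message is dropped.
                return buffer.offer(message);
            default:
                // put() blocks the sender until a consumer makes room,
                // which slows the producer down automatically.
                try {
                    buffer.put(message);
                    return true;
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
        }
    }

    /** Simulates a consumer taking one message; returns null if the buffer is empty. */
    String receive() { return buffer.poll(); }

    /** Number of messages waiting to be consumed. */
    int backlog() { return buffer.size(); }
}
```

When you evaluate a provider, it's worth finding out which of these behaviors it exhibits, because each one asks something different of the sending code.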
Long Duration Reliability of Consistent Throughput
When your application goes into production, it's likely you'll be expecting it to operate on a 24x7 schedule, receiving and sending messages for long periods of time. Any benchmark undertaken should be designed to reflect these requirements, and as such you'll need to have your benchmark running over a period of time. Ideally you'd be able to test for months, but usually delivery time scales and resources prevent you from being able to do that. A quick test of a few thousand messages won't be enough to really indicate any long-term trends. Thirty minutes is a good rule of thumb as a bare minimum for any test, whether the goal is to measure raw throughput or to measure long duration reliability.
It's really important that you at least run some overnight and weekend tests. Trends such as message congestion, memory growth, or general inconsistency in results usually become detectable within that time frame.
Key things to look for are whether:
- The server continues to perform.
- Performance at the end of such long tests is consistent with the numbers over shorter durations.
Understanding the Benchmarking Environment
It's imperative that the hardware environment used in the benchmarking effort be as close as possible to the actual deployment hardware. For example, performing a benchmark test on a notebook computer will yield results that are vastly different from a test run on a set of high-powered server machines. The test results on different types of machines will not be proportionate due to bottlenecks that will occur in different places.
The following checklist is an example of the kinds of concerns to be aware of:
- Running on a single machine won't reveal any variations in test results that could occur due to network latency. It's critical to test on at least two machines.
- Benchmarking with 50 senders/receivers will not yield the same relative results as with 1,000 senders/receivers.
- Running on a 10Base-T network versus a 100Base-T network may create an artificial network bottleneck that would mask the true stress testing of the JMS server (or servers).
- If you're using nonpersistent messages, the speed of the server processor and memory will affect performance.
- Regardless of the efficiency of the JMS provider's persistence mechanism, the speed of the disk is crucial in persistent messaging cases. Even with powerful RAID arrays, results can be skewed 20-fold depending on the speed of the disk.
- When using a caching disk controller, it's important that write caching be disabled. For true reliability and recovery, writes must actually reach the disk so that they survive a failure rather than being lost from the controller's cache.
- Due to factors such as network traffic, time of day, and unpredictable variation in your setup, you should try running key tests many times and comparing the results. You might be surprised by the variation between runs (or, hopefully, not).
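One simple way to compare repeated runs is to look at the mean throughput and how far individual runs stray from it. The sketch below assumes you've recorded one messages-per-second figure per run; the class name and the choice of sample standard deviation are illustrative, not part of the test harness.

```java
/** Illustrative statistics over per-run throughput figures. */
final class RunStats {
    private RunStats() {}

    /** Arithmetic mean of the per-run rates. */
    static double mean(double[] rates) {
        if (rates.length == 0) {
            throw new IllegalArgumentException("no runs recorded");
        }
        double sum = 0;
        for (double r : rates) {
            sum += r;
        }
        return sum / rates.length;
    }

    /** Sample standard deviation (divides by n - 1). */
    static double sampleStdDev(double[] rates) {
        if (rates.length < 2) {
            throw new IllegalArgumentException("need at least two runs");
        }
        double m = mean(rates);
        double squares = 0;
        for (double r : rates) {
            squares += (r - m) * (r - m);
        }
        return Math.sqrt(squares / (rates.length - 1));
    }

    /** Standard deviation as a fraction of the mean; 0.10 means runs vary by roughly 10%. */
    static double relativeSpread(double[] rates) {
        return sampleStdDev(rates) / mean(rates);
    }
}
```

A large relative spread between otherwise identical runs suggests that something outside the JMS provider, such as shared network traffic, is influencing the results.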
The JMS Benchmarking Test Harness
In your actual deployment, all your senders and receivers will usually be running on different machines. It's important that at least two machines be used in order to gain an understanding of network latency issues. However, it may not be reasonable for you to allocate 1,000 machines to perform your test. In recognition of this, the test harness code provided simplifies multimachine complexities by allowing a framework for creating multiple senders or receivers within a single application. These applications can be configured to simulate multiple, separate JMS clients. Each sender and receiver can have its own connection to the JMS server within its own dedicated thread.
For simplicity, two main components of the test harness - the sender application and the receiver application - have been kept separate. This allows the benchmarker to run senders and receivers on different machines, and look at sending rates and receiving rates separately (see Figure 2).
If you have access to multiple machines, you may spread these across as many as you like, as shown in Figure 3. The real goal is to focus on stress testing the JMS server(s). That's where the focal point of the message traffic is likely to be. Be careful not to overload the client machines with too many JMS clients. This will create an artificial bottleneck at the client side that won't exist in the real deployment environment. A good rule of thumb is to monitor the CPU and memory usage on the client machines, and make sure that the steady-state processing of messages doesn't exceed 80% utilization of either CPU or memory consumption.
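If you'd rather check the 80% rule from inside the client JVM than from an operating system monitor, something like the following rough sketch is one option. It comes with caveats: the heap figure covers only this JVM, not the whole machine, and the system load average is just a proxy for CPU utilization that some platforms don't report at all. The class name is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Rough client-side load check; an approximation, not a replacement for OS tools. */
final class ClientLoadCheck {
    static final double LIMIT = 0.80;

    private ClientLoadCheck() {}

    /** Fraction of this JVM's maximum heap that is currently in use. */
    static double heapFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) used / rt.maxMemory();
    }

    /**
     * One-minute system load average divided by the processor count,
     * or -1 if the platform doesn't report a load average.
     * The value can exceed 1.0 on an overloaded machine.
     */
    static double loadPerProcessor() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        double load = os.getSystemLoadAverage();
        return load < 0 ? -1 : load / os.getAvailableProcessors();
    }

    /** True when both figures are at or below the 80% rule of thumb. */
    static boolean withinLimit(double cpuFraction, double memoryFraction) {
        return cpuFraction <= LIMIT && memoryFraction <= LIMIT;
    }
}
```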
Ideally you should have a full, nonrestricted version of each product you're testing. In the absence of that, note that most of the free developer versions of JMS that you can download from the various vendor Web sites are limited by the number of connections or number of machines that can connect to the JMS server. Upward linear scalability trends, or the lack thereof, don't usually start becoming obvious until you have at least 50 senders and 50 receivers; 200-300 pairs are even better.
In recognition of these issues, this test harness gives the flexibility to specify multiple JMS sessions per connection, so you can go beyond any connection limit that might be imposed. The test harness code is also designed to allow you to specify the number of senders or receivers, how they connect to the JMS server, and the number of senders that should share a given connection.
If you're limited to two machines, the JMS server should be run on one machine, and the sender application and the receiver application should be run in separate JVMs on the other one (see Figure 4).
The Test Harness Source Code
To explain the test harness, we'll focus on the publish/subscribe messaging model using a many-to-many scenario. The coding framework is similar in the point-to-point queuing version of the test harness. The fully functional versions for both models, including any vendor-specific issues, can be downloaded from www.SonicMQ.com.
For simplicity, code for point-to-point is kept separate from publish/subscribe. The full test suite for SonicMQ therefore comprises four Java files, among them PubSubPublisherApp, PubSubSubscriberApp, and PtpSenderApp. Each application follows the same basic sequence:
- Basic parameters are read in from the command line.
- A separate object is created for each sender/receiver. Each object is managed by its own thread.
- All the JMS objects are started.
- The application pauses, waiting for the developer to verify the setup.
- The application sends or receives messages as fast as possible.
- The application queries each sender/receiver once per interval and computes average messages sent/received per second. This result is sent to the console. (A rolling average is also reported.)
- After a prespecified number of intervals, the application closes all JMS objects, stops reporting, and exits cleanly.
Let's look at some of the code that implements these steps, focusing on the PubSubPublisherApp, for example.
The help screen in Listing 1 shows the usage of the publisher application. The main() method simply reads in the data from the command line and creates an instance of the PubSubPublisherApp object, where the actual JMS work is performed. Sender application parameters are shown, and similar arguments exist for the receiver applications and the Publish/Subscribe domain.
A typical usage scenario would be to run this for 50 publishers publishing 2K messages to 10 topics using a JMS server hosted on a remote machine, ntserver.
REM Assume all Java Classpath arguments have been set up.
java PubSubPublisherApp -b ntserver:2506 -npublishers 50 -ntopics 10 -msize 2
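Reading "-flag value" pairs like these takes only a few lines. The sketch below is one possible approach rather than the code in Listing 1; the class name and the default-value handling are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative command-line parser for "-flag value" pairs. */
final class HarnessArgs {
    private HarnessArgs() {}

    /** Parses the arguments into a map keyed by flag name without the leading dash. */
    static Map<String, String> parse(String[] args) {
        Map<String, String> options = new HashMap<>();
        for (int i = 0; i < args.length; i += 2) {
            if (!args[i].startsWith("-") || i + 1 >= args.length) {
                throw new IllegalArgumentException("expected -flag value, got: " + args[i]);
            }
            options.put(args[i].substring(1), args[i + 1]);
        }
        return options;
    }

    /** Reads an integer option, falling back to a caller-supplied default when absent. */
    static int intOption(Map<String, String> options, String name, int defaultValue) {
        String raw = options.get(name);
        return raw == null ? defaultValue : Integer.parseInt(raw);
    }
}
```

With the command line above, `parse` would yield `b=ntserver:2506`, `npublishers=50`, `ntopics=10`, and `msize=2`.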
Once the arguments have been read in, an instance of the sender application object is instantiated. The basic code is shown in Listing 2.
One of the first lines of the constructor finds the javax.jms.ConnectionFactory associated with the particular provider and messaging domain. In this case the factory is for TopicConnections and is retrieved using an InitialContext object that emulates the use of a third-party, external JNDI object store.
Because JMS is a developer API common to all JMS systems, the only provider-specific code you'll need to change is the code that sets up the InitialContext and its environment.
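To keep that provider-specific part in one place, you might isolate the JNDI environment in a small helper like the sketch below. The helper class is hypothetical; the two property keys are the standard javax.naming.Context constants, while the factory class name and URL format you pass in come from your vendor's documentation.

```java
import java.util.Hashtable;
import javax.naming.Context;

/** Hypothetical helper that gathers the provider-specific JNDI settings. */
final class JndiEnvironment {
    private JndiEnvironment() {}

    /**
     * Builds the environment that would be handed to new InitialContext(env).
     * Both argument values are provider-specific.
     */
    static Hashtable<String, String> build(String initialContextFactoryClass, String providerUrl) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, initialContextFactoryClass);
        env.put(Context.PROVIDER_URL, providerUrl);
        return env;
    }
}
```

The rest of the harness would then look up the connection factory through `new InitialContext(env)` and stay unchanged from one provider to the next.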
Once the ConnectionFactory has been instantiated, the desired number of connections is created. If possible, this should be equal to the number of actual senders, but due to evaluation limitations, you might be forced to use fewer connections.
The application then creates an array of publisher objects. These are runnable objects and will be associated with a unique thread object. Each of these objects is created and sequentially assigned to one of the connection objects created. Each sender will have its own JMS TopicSession and its own JMS TopicPublisher.
Similarly, the publisher objects are sequentially assigned to the number of topics created. For example, if you have 50 publishers, 25 connections, and 10 topics in the case you're evaluating, each of the 25 connections will have two publishers. The 50 publishers will likewise be uniformly divided among the 10 topics (five per topic).
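The arithmetic of that distribution can be checked with a round-robin assignment. This sketch assumes "sequentially assigned" means cycling through the connections (or topics) in order using a modulo; the actual listing may do it differently, and the names are illustrative.

```java
/** Illustrative round-robin assignment of publishers to connections or topics. */
final class Assignment {
    private Assignment() {}

    /** Index of the target that item number itemIndex maps to, cycling in order. */
    static int roundRobin(int itemIndex, int targetCount) {
        return itemIndex % targetCount;
    }

    /** Counts how many of itemCount items land on each of targetCount targets. */
    static int[] loadPerTarget(int itemCount, int targetCount) {
        int[] load = new int[targetCount];
        for (int i = 0; i < itemCount; i++) {
            load[roundRobin(i, targetCount)]++;
        }
        return load;
    }
}
```

Calling `loadPerTarget(50, 25)` gives two publishers on every connection, and `loadPerTarget(50, 10)` gives five on every topic, matching the example.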
Listing 3 shows the actual code for the publisher object itself. You'll note that it's runnable, as expected. The bulk of the JMS code is shown in the run() method. This method simply sends messages as fast as possible and records the number of messages sent in a counter, m_msgCounter.
Note that most of the remaining code in the PublisherObj object is related to reporting the results. It's the main() method of the PubSubPublisherApp that's responsible for controlling the start and stop of these objects, and for combining the results from all senders.
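The shape of such a runnable sender can be sketched without any JMS classes. Here the `MessageSink` interface is a hypothetical stand-in for the real publish call, and the counter field plays the role that m_msgCounter plays in Listing 3; this is not the listing itself.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the JMS publish call; wire real JMS code in here. */
interface MessageSink {
    void send(byte[] payload) throws Exception;
}

/** Sketch of a runnable sender that counts what it sends. */
final class PublisherSketch implements Runnable {
    private final MessageSink sink;
    private final byte[] payload;
    private final AtomicLong msgCounter = new AtomicLong();
    private volatile boolean running = true;

    PublisherSketch(MessageSink sink, int messageSizeBytes) {
        this.sink = sink;
        this.payload = new byte[messageSizeBytes];
    }

    @Override
    public void run() {
        try {
            // Send as fast as possible until another thread asks us to stop.
            while (running) {
                sink.send(payload);
                msgCounter.incrementAndGet();
            }
        } catch (Exception e) {
            // A failed send ends this sender; the counter keeps what was sent so far.
            running = false;
        }
    }

    /** Asks the send loop to finish after the current message. */
    void stop() { running = false; }

    /** Safe to call from the reporting thread while the sender is running. */
    long messagesSent() { return msgCounter.get(); }
}
```

Each sender would be handed to its own `Thread`, and the reporting loop would call `messagesSent()` on every sender once per interval.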
Listing 4 shows this portion of the code from the PubSubPublisherApp main() method. This is the code that loops through the array of publisher objects once per interval and queries their performance. At each interval, the application prints one line of the report. This line includes:
- Messages sent per second (over the interval)
- Messages sent per second (averaged over the latest five intervals)
- Total messages sent in the interval
- Cumulative messages sent in the test
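A report line with those four figures could be produced along the following lines. This is a sketch, not the code in Listing 4: the class name and output format are assumptions, and the input is taken to be the cumulative count already summed across all senders.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Locale;

/** Illustrative per-interval reporter with a five-interval rolling average. */
final class IntervalReport {
    private static final int WINDOW = 5;

    private final double intervalSeconds;
    private final Deque<Long> recent = new ArrayDeque<>();
    private long lastCumulative = 0;

    IntervalReport(double intervalSeconds) {
        this.intervalSeconds = intervalSeconds;
    }

    /** Takes the cumulative count summed over all senders and returns one report line. */
    String record(long cumulative) {
        long inInterval = cumulative - lastCumulative;
        lastCumulative = cumulative;

        // Keep only the latest WINDOW interval counts for the rolling average.
        recent.addLast(inInterval);
        if (recent.size() > WINDOW) {
            recent.removeFirst();
        }
        long windowTotal = 0;
        for (long n : recent) {
            windowTotal += n;
        }

        double rate = inInterval / intervalSeconds;
        double rolling = windowTotal / (recent.size() * intervalSeconds);
        return String.format(Locale.ROOT,
                "%.1f msgs/s | %.1f msgs/s (rolling) | %d in interval | %d total",
                rate, rolling, inInterval, cumulative);
    }
}
```

Until five intervals have elapsed, the rolling figure is averaged over however many intervals are available.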
A Note on Threads and the Subscriber Applications
The code for subscriber applications follows the same pattern as the publisher application in this test harness. One difference, however, is that the subscriber application doesn't explicitly manage the threads used for consuming messages. This is because the subscriber objects use the JMS asynchronous MessageListener interface (see Listing 5).
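The callback style looks roughly like this. To keep the sketch self-contained, `Listener` is a local stand-in with the same shape as the JMS MessageListener interface (a single onMessage method); in the real harness the subscriber would implement javax.jms.MessageListener and register itself with its TopicSubscriber.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Local stand-in shaped like javax.jms.MessageListener; an assumption for self-containment. */
interface Listener<M> {
    void onMessage(M message);
}

/** Sketch of a subscriber that only counts what it receives. */
final class SubscriberSketch implements Listener<byte[]> {
    private final AtomicLong received = new AtomicLong();

    @Override
    public void onMessage(byte[] message) {
        // Invoked on a thread owned by the provider, not by the application,
        // so keep the work short and the counter thread-safe.
        received.incrementAndGet();
    }

    /** Read by the reporting loop once per interval. */
    long messagesReceived() { return received.get(); }
}
```

Because the provider drives delivery, the application only needs a reporting loop; there's no receive loop or consumer thread to manage.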
Setting Up the JMS Provider for the Test Harness
Depending on the JMS server you're using, you may have to administratively create the topics and/or queues needed in your test. For example, the default topics used by the PubSubPublisherApp would be TestTopic-1, TestTopic-2, TestTopic-3, and so forth (you can specify the number of topics to use in your test).
Similarly, the queue names used by the point-to-point test harness (PtpSenderApp) are named TestQueue-1, TestQueue-2, TestQueue-3, and so forth. You'd need to administratively create the queues.
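If you need the full list of names to feed into an administration tool, the numbering scheme is easy to generate. The helper below is illustrative only; the prefixes come from the names above.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative generator for numbered destination names such as TestTopic-1. */
final class DestinationNames {
    private DestinationNames() {}

    /** Returns prefix-1 through prefix-count, in order. */
    static List<String> numbered(String prefix, int count) {
        List<String> names = new ArrayList<>(count);
        for (int i = 1; i <= count; i++) {
            names.add(prefix + "-" + i);
        }
        return names;
    }
}
```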
Running the Test Harness
On your server machine, make sure the JMS server has been started and all queues and topics needed for your test have been configured. The amount of setup will depend on the JMS server you're using. You should also make sure the settings for your server and its JVM allow for a liberal use of operating system resources.
You're now ready to configure the JMS client applications (e.g., PubSubSubscriberApp and PubSubPublisherApp). Typically, you'll start these applications in two separate console windows, possibly on separate machines.
The receiver application might typically be started first with a very long test duration. It will report the results interval by interval as you start and stop the publisher application or change the send rate. The major configurable options for the subscriber application are:
- Number of connections
- Number of subscribers
- Number of topics
- Duration of test (number of sampling intervals and interval time)
The main setup issue is to make sure the topics you're subscribing to match the name and count of the ones you plan to publish to. The major configurable options for the publisher application are:
- Number of connections
- Number of publishers
- Number of topics
- Message size
- Delivery mode
- Duration of test (number of sampling intervals and interval time)
For example, to set up the receiver application, the following would be a typical command line for a long duration test using 50 subscribers, 25 connections, and 10 topics. We'll connect to a server on ntserver (port 2506) and run for 90 sampling intervals.
java PubSubSubscriberApp -b ntserver:2506 -nsubscribers 50 -ntopics 10 -nintervals 90
After starting the subscriber application, we can start the corresponding publisher application.
java PubSubPublisherApp -b ntserver:2506 -npublishers 50 -ntopics 10
Both applications will report their setup configuration as well as their interval results. The sample output shown in Figure 5 would come from the subscriber application.
Remember these key points:
- Plan for burst rates and an overall larger-than-expected deployment scenario in terms of number of JMS clients, and number of topics and queues. Start with a reasonable number of clients (50), then move up to larger amounts in increments of 50-100. Watch for upward trends in throughput.
- Look at both the send rates and the receive rates. Watch out for message congestion, and understand what the flow control behavior is, if any.
- If you're going to deploy on some serious hardware, then test on some serious hardware.
- Run the test long enough to see some realistic results. Stress the server. Analyze CPU, memory, and disk usage.