Performance testing with Kafka and Avro
Here at Corunet we specialise in big data consumption and processing. Our day-to-day routine involves working with Kafka, JMS and several other data transfer technologies. Quite recently we have introduced and combined different data serialization platforms, like Avro or gRPC, to ensure we are always working with the same data structures. Serialization frameworks are awesome: they let us define what our messages look like and minimise breaking changes, which is very important.
We also hold ourselves to high QA standards, testing being one of our strengths. In fact, we have developed some tools to help us keep those standards high and truthful.
So, back to business: one of our fields of expertise is Apache Kafka, a community distributed event streaming platform capable of handling trillions of events a day. Conceived as a messaging queue, Kafka is based on an abstraction of a distributed commit log. Our messages are serialised with Avro, the perfect partner for it. During development we found the need to carry out some performance testing, so we analysed the market looking for something open source to handle it, and we found there was no tool to easily run performance tests. Of course, you could say there are several frameworks we could use for that:
- Gatling would let us code a test suite, but modifying the Avro schemas would force us to update the code every time.
- JMeter lets us carry out some Kafka performance testing through existing plugins.
- Developing our own performance project would mean more code to maintain, tied to our use case; something that could fit for a while, but not for long.
So we decided to pool our knowledge and expertise and build our own open-source tool for the job.
The building process
We chose JMeter as our base framework, mainly because it can easily be extended and already contains several plugins that can be combined to achieve our task. Here we hit a deciding requirement: we need to integrate the system with a Schema Registry, which works like a versioning system for storing and sharing schemas. So what we need is to connect our testing tool to a Schema Registry, download the schema, fill in the required data and configure a profile for load testing. We built a plugin that integrates with JMeter to provide the first three points and lets the application itself fulfil the fourth. We fill that gap with KLoadGen.
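To make the Schema Registry step concrete, here is a minimal sketch of the kind of lookup involved, assuming a Confluent-style registry exposing the standard REST API. The class and method names are ours, purely illustrative, and not the plugin's actual code:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Illustrative sketch: resolve and fetch the latest schema version
// registered under a subject in a Confluent-style Schema Registry.
public class SchemaRegistryLookup {

    // Builds the standard REST path for the newest schema of a subject.
    static String latestVersionUrl(String registryUrl, String subject) {
        return registryUrl.replaceAll("/$", "")
                + "/subjects/" + subject + "/versions/latest";
    }

    // Performs the actual GET; the response body is a JSON document
    // containing the schema string, its id and its version.
    static String fetchLatestSchema(String registryUrl, String subject) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(latestVersionUrl(registryUrl, subject)))
                .GET()
                .build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
    }
}
```

The plugin handles this connection for us; the sketch only shows what "download the schema" means at the wire level.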
Once the plugin is installed in JMeter we can access the four components it includes, which let us create a Test Plan:
- KLoadGen Kafka Sampler : This JMeter Java sampler sends messages to Kafka. There are three different samplers, based on the serializer class used:
- ConfluentKafkaSampler : Based on the Confluent Kafka serializer.
- Kafka Sampler : Our own, simpler Kafka sampler.
- Generic Kafka Sampler : Simple Kafka sampler where the serializer is configured through properties.
- KLoadGen Config Element : This JMeter config element generates plaintext messages based on the input schema template designed.
- Kafka Headers Config Element : This JMeter config element generates serialized object messages based on an input class and its property configuration.
Creating a Test Plan
In order to create a Test Plan we need to connect to the Schema Registry that holds our Avro schemas and download the latest schema version. To achieve that, we add the Kafka Load Generator Config Element component, where we set up the Schema Registry URL and press Test Registry to verify the connection. If no error appears, we can introduce the subject name in the field that appears below and press the Load Subject button. The Avro schema will be downloaded and flattened into the lower table, where we will see four columns for configuring the random generator system.
The columns are:
- Field Name : Flattened field name, composed of all the properties from the root class, e.g. PropClass1.PropClass2.PropClass3. Note: if the field is an array, [] will appear at the end of the name; if you want to define a specific size for the array, just type the number.
- Field Type : The field type, such as String, Int, Double or Array. Note: if the field is an array of basic types, it will be shown as string-array, int-array, etc.
- Field Length : Field length configuration for the random tool. For a String it means the number of characters; for a Number, the number of digits.
- Field Values List : Possible values for the field, which the random tool will use when generating values.
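To illustrate how those columns drive value generation, here is a small sketch of a random generator honouring them: a non-empty values list restricts the output, otherwise a value of the requested type and length is produced. This is our own simplified illustration, not the plugin's actual generator, and it only covers two types:

```java
import java.util.List;
import java.util.Random;

// Illustrative sketch of generating a field value from the flattened-table
// configuration: Field Type, Field Length and Field Values List.
public class FieldValueGenerator {

    private static final Random RANDOM = new Random();

    static Object generate(String fieldType, int fieldLength, List<String> valuesList) {
        // A non-empty values list restricts generation to those values.
        if (valuesList != null && !valuesList.isEmpty()) {
            return valuesList.get(RANDOM.nextInt(valuesList.size()));
        }
        switch (fieldType) {
            case "string": {
                // For strings, Field Length means the number of characters.
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < fieldLength; i++) {
                    sb.append((char) ('a' + RANDOM.nextInt(26)));
                }
                return sb.toString();
            }
            case "int": {
                // For numbers, Field Length means the number of digits.
                int low = (int) Math.pow(10, fieldLength - 1);
                return low + RANDOM.nextInt(9 * low);
            }
            default:
                throw new IllegalArgumentException("Unsupported type: " + fieldType);
        }
    }
}
```

For example, `generate("int", 3, List.of())` yields a random three-digit integer, while `generate("int", 3, List.of("42"))` always yields "42" because the values list takes precedence.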
The next component to add is the Java Request element, where we choose the sampler implementation (more information in the plugin documentation) and complete the properties required to inject messages into our system:
- bootstrap.servers : broker-ip-1:port, broker-ip-2:port, broker-ip-3:port
- zookeeper.servers : zookeeper-ip-1:port, zookeeper-ip-2:port, zookeeper-ip-3:port (optional)
- kafka.topic.name : Topic on which messages will be sent
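The properties above map onto a plain Java properties object handed to the sampler. A minimal sketch, with placeholder broker addresses, port and topic name for your environment:

```java
import java.util.Properties;

// Illustrative sketch of the sampler-side configuration; the broker
// addresses, port and topic name below are placeholders, not defaults.
public class SamplerConfig {

    static Properties producerProperties() {
        Properties props = new Properties();
        // Comma-separated list of brokers the producer bootstraps from.
        props.put("bootstrap.servers",
                "broker-ip-1:9092,broker-ip-2:9092,broker-ip-3:9092");
        // Topic the generated messages will be sent to.
        props.put("kafka.topic.name", "load-test-topic");
        return props;
    }
}
```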
To finish defining the test plan, if we need them, we add the Kafka Headers Config Element, where we can add the headers required to successfully send messages into our system.
The plugin will generate values for those headers; we only define the type, following the same rules as the payload generator.