Learn an easier way to correct and reprocess messages in a Kafka dead letter queue
From time to time, your topic consumers might receive messy data, often due to improper message formatting or incorrect serialization or deserialization. In these cases, you can program the consumer to halt, to skip the data, or to save the data for correction and reprocessing.
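The three strategies can be sketched as a single dispatch in the consumer loop. This is a minimal illustration, not Kadeck code; the function name and the JSON payload format are assumptions for the example:

```python
import json

def handle_message(raw: bytes, on_error: str = "dlq"):
    """Deserialize a message and apply one of three error strategies.

    on_error: "halt"  -> stop the consumer by re-raising,
              "skip"  -> drop the bad message and move on,
              "dlq"   -> keep the raw bytes plus the error for later repair.
    """
    try:
        return ("ok", json.loads(raw))
    except (json.JSONDecodeError, UnicodeDecodeError) as exc:
        if on_error == "halt":
            raise
        if on_error == "skip":
            return ("skipped", None)
        # "dlq": preserve the payload untouched so it can be corrected later
        return ("dlq", {"payload": raw, "error": str(exc)})
```

In a real consumer, the `"dlq"` branch would produce the wrapped record to a dead letter topic instead of returning it.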
In this post, I’ll demonstrate an easy way to correct and reprocess erroneous data using Kadeck’s visual topic management dashboard (get Kadeck for free).
The first task is to store bad data so it can be analyzed, corrected, and reprocessed. The recommended way to do this is by creating a dead letter topic. The consuming application uses this topic to store messages it cannot process, so someone can examine them, make corrections, and likely make a code change somewhere in the data pipeline to prevent the error from recurring. It’s also good practice to facilitate troubleshooting by adding an error-message attribute to each record as you store it in the DLQ.
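One common way to attach the error message is as a record header, so the original payload stays byte-for-byte intact. A minimal sketch; the topic name and the `error.message` header key are illustrative choices, not fixed by Kafka or Kadeck:

```python
def to_dlq_record(original_value: bytes, error: str,
                  dlq_topic: str = "finance.transactions.dlq"):
    """Build the record a consumer would produce to the dead letter topic.

    The failed payload is preserved untouched; the error message travels
    as a header so it can later be surfaced as a column when browsing.
    """
    return {
        "topic": dlq_topic,
        "value": original_value,
        "headers": [("error.message", error.encode("utf-8"))],
    }
```

With a Kafka client such as confluent-kafka, this dictionary would map directly onto the producer's topic, value, and headers arguments.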
Let’s assume a system that is streaming financial transactions has a dead letter queue in place, and you’ve been asked to correct the erroneous data. Having a visual user interface (UI) for your Kafka system and data makes this easy; here’s how.
The erroneous financial transactions are in a dead letter topic. In an enterprise setting, you might have hundreds of topics, which would make finding the failed messages difficult. Kadeck solves this problem by organizing your topics and custom views of topic data in a Data Catalog.
You can search the Data Catalog by data source, data owner, custom labels, or by keywords. Since we are correcting financial transaction data, I can choose the “finance” label, and my catalog view is narrowed to the topics in finance. The error queue is easy to spot. I’ll click that to explore the data within the dead letter queue.
In our example, each financial transaction in the DLQ has an error message attribute, which you can click to add as a column in the view. Upon doing so, it becomes clear that all messages in the topic share the same error: a malformed account ID. If there were multiple errors, we could filter the messages for each specific error and save the filter as a new view, making it easy to focus on correcting each issue separately.
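Programmatically, the same triage amounts to grouping DLQ records by their error attribute. A small sketch, assuming records shaped like the header-carrying dictionaries above:

```python
from collections import Counter

def errors_by_type(records):
    """Count DLQ records per distinct error message.

    Assumes each record carries an "error.message" header, as produced
    by the consumer when it routed the message to the dead letter topic.
    """
    counts = Counter()
    for rec in records:
        headers = dict(rec.get("headers", []))
        error = headers.get("error.message", b"unknown").decode("utf-8")
        counts[error] += 1
    return counts
```

A single dominant key in the result tells you, as in this example, that one fix will clear the whole queue.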
We can save this view, and then switch to the topic containing the clean financial transactions. There we see that correctly formed account IDs contain a 2-character country code, which is missing from the records in the dead letter queue. Let’s return to the DLQ in the Kadeck Topic Browser and correct the errors.
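The repair itself is mechanical once the correct format is known. The sketch below assumes an illustrative format in which a well-formed account ID starts with a two-letter country code followed by digits (e.g. "DE4512345"); your actual scheme will differ:

```python
import re

# Assumed well-formed shape: two uppercase letters, then digits.
ACCOUNT_ID = re.compile(r"^[A-Z]{2}\d+$")

def repair_account_id(bad_id: str, country_code: str) -> str:
    """Prepend the missing country code, then verify the result.

    Raises ValueError if the repaired ID still fails validation, so a
    bad guess never gets reprocessed as clean data.
    """
    fixed = country_code.upper() + bad_id
    if not ACCOUNT_ID.match(fixed):
        raise ValueError(f"still malformed after repair: {fixed!r}")
    return fixed
```

Validating after the fix is the important part: corrected records should re-enter the pipeline only if they now pass the same checks the clean topic's producers satisfy.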
After reprocessing, if the corrected records are no longer needed, they can be deleted by right-clicking the last entry and selecting "Delete up to here" from the context menu.
Now that the data has been corrected and reprocessed, Kadeck makes it easy to find the team members who are responsible for the financial transaction producer, and inform them of the issue. They can modify the producer, and eliminate the cause of the errors.
The Flow tab at the top of the Topic Browser shows the name of the system that produced the erroneous messages and the people responsible for its development. You can share the Topic Browser view with them to facilitate the software change process.
Fixing data delivery issues is extremely quick and easy with Kadeck. There are many uses for the Kadeck Topic Browser and Quick Processor, and we invite you to explore them. The Kadeck visual management and collaboration tooling can take your Apache Kafka, Redpanda, and Amazon Kinesis development, troubleshooting, and monitoring to the next level.