Kafka Streams

Kafka Streams is a client library for building Java or Scala streaming applications, in particular applications that transform input topics into output topics.
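As an illustration of transforming an input topic into an output topic, here is a minimal sketch rather than a definitive implementation. It assumes String keys and values and that the Apache Kafka Streams library is on the classpath. The class name UppercaseTopology and the topic names are hypothetical and chosen only for this example.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

// Hypothetical example class: reads records from one topic, upper-cases
// each value, and writes the result to another topic.
public class UppercaseTopology {

    // Builds the processing topology. No broker connection is needed to build it.
    static Topology build(String inputTopic, String outputTopic) {
        StreamsBuilder builder = new StreamsBuilder();

        // Source: consume the input topic with String keys and values (an assumption).
        KStream<String, String> input =
            builder.stream(inputTopic, Consumed.with(Serdes.String(), Serdes.String()));

        // Transform each value, passing null values through unchanged,
        // then write the result to the output topic.
        input.mapValues(value -> value == null ? null : value.toUpperCase())
             .to(outputTopic, Produced.with(Serdes.String(), Serdes.String()));

        return builder.build();
    }

    public static void main(String[] args) {
        // Print a description of the source, processor, and sink nodes.
        System.out.println(build("input-topic", "output-topic").describe());
    }
}
```

To run the topology, pass it to a KafkaStreams object together with the application configuration and call start().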

Kafka Streams lets you build moderately complex operational streaming applications faster because the library handles common functions for you, such as failure recovery, joins and enrichment, and aggregations and windowing.

A Kafka Streams application is a distributed Java application that runs as one or more application instances. Kafka Streams applications are built with the Kafka Streams library, typically through its KStream abstraction. Each application instance must be configured with an application.id property, which uniquely identifies the Kafka Streams distributed application.
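The application.id requirement can be sketched with plain java.util.Properties. This is a minimal sketch under stated assumptions: the class name StreamsConfigSketch, the application ID, and the bootstrap.servers value are placeholders for illustration, and your platform may require different or additional connection settings.

```java
import java.util.Properties;

// Hypothetical helper that assembles the configuration passed to a
// Kafka Streams application instance.
public class StreamsConfigSketch {

    static Properties buildConfig(String applicationId, String bootstrapServers) {
        Properties props = new Properties();
        // Required: identifies the distributed application. Every instance
        // of the same application uses the same value.
        props.put("application.id", applicationId);
        // Connection setting used by Apache Kafka clients; the value passed
        // in main below is a placeholder.
        props.put("bootstrap.servers", bootstrapServers);
        return props;
    }

    public static void main(String[] args) {
        Properties props = buildConfig("uppercase-app", "localhost:9092");
        System.out.println(props.getProperty("application.id"));
    }
}
```

Starting a second process with the same application.id adds another instance to the same application rather than creating a new one.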

ATTENTION The Kafka Streams application must always be launched as the same user.

Architecture

An application that uses the Kafka Streams API is a normal Java application. Package, deploy, and monitor it as you would any other Java application. There is no need to install a separate processing cluster or similar special-purpose, expensive infrastructure.

NOTE You can run one or more instances of your application. They run independently but will automatically discover each other and collaborate. In addition, you can elastically add and remove application instances during live operations. If one instance dies, another instance continues where that instance left off.

The following diagram shows an application running three application instances.

For More Information

Apache Kafka Streams