Architecture

In MapR Event Store For Apache Kafka, topics are grouped into streams. Administrators can apply security, retention, and replication policies to streams. Combined with MapR File System and MapR Database in the MapR Data Platform, these streams enable organizations to create a centralized, secure data lake that unifies files, database tables, and message topics.

Producer applications publish messages (topic data) to topics, and consumer applications read them. All messages published to MapR Event Store For Apache Kafka are persisted, which allows future consumers to catch up on processing and lets analytics applications process historical data. More precisely, messages are written to topic partitions.
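The persistence and partitioning described above can be illustrated with a small in-memory model. This is only a sketch, not the MapR Event Store For Apache Kafka API: the `Topic` class, its `publish` and `read` methods, and the key-based routing rule are all hypothetical names and choices made for illustration.

```python
class Topic:
    """Toy in-memory topic: a fixed number of append-only partitions."""

    def __init__(self, name, num_partitions=2):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def publish(self, key, value):
        # Hypothetical routing rule: derive the partition from the key,
        # so messages with the same key land in the same partition.
        index = sum(key.encode("utf-8")) % len(self.partitions)
        self.partitions[index].append(value)
        # The offset is the message's position within its partition.
        return index, len(self.partitions[index]) - 1

    def read(self, partition, offset):
        # Reading never removes messages, so any offset can be replayed.
        return self.partitions[partition][offset:]


topic = Topic("clicks", num_partitions=2)
partition, first_offset = topic.publish("user-1", "login")
topic.publish("user-1", "view")

# A consumer that starts later can still catch up from the first offset.
late_read = topic.read(partition, 0)
```

Because a read does not delete anything in this model, a consumer that joins after the messages were published sees the same history as one that was present from the start, which is the catch-up behavior the paragraph describes.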

NOTE: Topic partitions are stored in containers within volumes. Containers are written to storage pools, which are made up of disks on the nodes in the cluster. See Containers and the CLDB for more information about containers.

Why Use MapR Event Store For Apache Kafka?

MapR Event Store For Apache Kafka is ideal for a variety of use cases, including the following:

Application event pipelines: Many types of applications generate event or log data that must be centrally stored and analyzed to gain insights about user activity or application performance. MapR Event Store For Apache Kafka simplifies these pipelines by transporting events to a central location, from which they can undergo event-by-event transformation and analysis.
Database change capture: Most modern databases enable users to generate an event each time an entry is added or modified. These events can be published to MapR Event Store For Apache Kafka to keep systems like search indexes and caches synchronized, as well as to feed security or notification applications.
Internet of Things: The explosion in the number of smart devices and sensors has created many situations in which billions of data points are created by millions of geographically dispersed sensors. MapR Event Store For Apache Kafka provides a reliable, global transport for these messages, enabling you to perform analytics both at the source and at a central location.

Replication

In addition to reliably delivering messages to applications within a single data center, MapR Event Store For Apache Kafka can continuously replicate data between multiple clusters, delivering messages globally. Like other MapR services, MapR Event Store For Apache Kafka has a distributed, scale-out design, allowing it to scale to billions of messages per second, millions of topics, and millions of producer and consumer applications.

Server and Client Libraries

Figure 1. The relationship of the MapR Event Store For Apache Kafka server to producers, consumers, and client libraries

Server: The server manages streams, topics, and partitions and handles requests from the producer client library and the consumer client library.
Producer client library: This client-side library, which is part of the producer process, receives the messages that producers send, buffers them, and sends them to the server. The server then publishes the messages and sends acknowledgements to the client.
Consumer client library: This client-side library, which is part of the consumer process, receives requests from consumers to poll subscriptions for unread messages, reads messages from topic partitions, and sends the messages to consumers.
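
The division of work among the three components can be sketched with a toy model. Everything here is an assumption made for illustration: the class names `Server`, `ProducerLibrary`, and `ConsumerLibrary`, the `batch_size` flush threshold, and the single-partition log are hypothetical and do not reflect the actual client library interfaces.

```python
class Server:
    """Toy stand-in for the server: one topic with a single partition."""

    def __init__(self):
        self.log = []

    def publish_batch(self, messages):
        start = len(self.log)
        self.log.extend(messages)
        # Acknowledge each message with the offset it was stored at.
        return list(range(start, len(self.log)))

    def fetch(self, offset, max_messages):
        return self.log[offset:offset + max_messages]


class ProducerLibrary:
    """Buffers messages from the producer and sends them in batches."""

    def __init__(self, server, batch_size=3):
        self.server = server
        self.batch_size = batch_size  # hypothetical flush threshold
        self.buffer = []
        self.acks = []

    def send(self, message):
        self.buffer.append(message)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.acks.extend(self.server.publish_batch(self.buffer))
            self.buffer = []


class ConsumerLibrary:
    """Tracks the read position and returns only unread messages."""

    def __init__(self, server):
        self.server = server
        self.position = 0  # offset of the next unread message

    def poll(self, max_messages=10):
        messages = self.server.fetch(self.position, max_messages)
        self.position += len(messages)
        return messages


server = Server()
producer = ProducerLibrary(server, batch_size=3)
consumer = ConsumerLibrary(server)

for i in range(4):
    producer.send(f"event-{i}")

first_poll = consumer.poll()   # the first three messages were flushed as one batch
producer.flush()               # push the one message that is still buffered
second_poll = consumer.poll()  # only the previously unread message is returned
```

In this sketch the fourth message stays in the producer-side buffer until `flush` is called, so the first poll sees three messages and the second poll sees one. Real client libraries typically also flush on a timer and track positions per partition; those details are left out to keep the example short.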