Mapping Topics to Streams

Topics in Apache Kafka vs HPE Ezmeral Data Fabric Streams

In Apache Kafka, a topic is the highest-level object, and you reference it by name alone. For example, you reference a topic named pollution_monitors as pollution_monitors.

In HPE Ezmeral Data Fabric Streams, a stream is the highest-level object, and a stream contains one or more topics. A stream can reside in any directory within the Data Fabric file system namespace. Therefore, when you reference a topic, you must specify both the fully-qualified path of the stream in which the topic resides and the topic name, separated by a colon, as shown in the following example:

/apps/monitoring/pm_stream:pollution_monitors

In this example, pollution_monitors is the topic name and /apps/monitoring/pm_stream is the fully-qualified path of the stream (pm_stream) in which the topic (pollution_monitors) resides.
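The path format above can be illustrated with a short Python sketch. The helper name split_topic_path is hypothetical and not part of any Data Fabric API; it assumes only that the topic name follows the last colon, which holds because Kafka topic names cannot contain colons.

```python
def split_topic_path(full_path: str) -> tuple:
    """Split '<stream_path>:<topic_name>' into (stream_path, topic_name).

    Hypothetical helper for illustration only.
    """
    # The topic name is everything after the last colon.
    stream_path, sep, topic = full_path.rpartition(":")
    if not sep or not stream_path.startswith("/") or not topic:
        raise ValueError(f"not a fully-qualified topic path: {full_path!r}")
    return stream_path, topic


stream, topic = split_topic_path("/apps/monitoring/pm_stream:pollution_monitors")
print(stream)  # /apps/monitoring/pm_stream
print(topic)   # pollution_monitors
```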

For an Apache Kafka client to access a topic in HPE Ezmeral Data Fabric Streams, the topic must map to a stream. To create the mapping, you configure topic mapping rules in kafka-cluster.conf. The mapping rule translates the Apache Kafka topic name in the incoming RPCs into a fully-qualified path within a specific Data Fabric stream.

Configuring Topic Mapping Rules

Configure topic mapping rules in the /opt/kafka-wire-protocol/conf/kafka-cluster.conf file on the Data Fabric cluster. You can configure topic mapping rules at the global level or at the user level. Each topic mapping rule that you create must conform to a specific format.

The following sections describe the rule syntax and the settings used to configure mapping rules at the global level and the user level in the kafka-cluster.conf file.

Syntax for Mapping Rules

A mapping rule must have the following format:

<topic_name_or_glob_pattern>:<datafabric_stream_path>

topic_name - The Apache Kafka topic name.
glob_pattern - A pattern with wildcard characters that matches a set of topics. For example, pressure_* matches all topics whose names have the prefix pressure_.
datafabric_stream_path - The absolute path of the stream that will store the topic(s).
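As a minimal sketch of this format, the following Python function splits a rule string into its two parts. The name parse_rule is hypothetical, and the validation shown (a non-empty pattern, an absolute stream path) is an assumption based on the descriptions above rather than the product's actual parser. The split happens at the first colon because Kafka topic names cannot contain one.

```python
def parse_rule(rule: str) -> tuple:
    """Split '<topic_name_or_glob_pattern>:<datafabric_stream_path>'.

    Hypothetical helper; the checks are assumptions for illustration.
    """
    pattern, sep, stream_path = rule.partition(":")
    if not sep or not pattern:
        raise ValueError(f"rule must look like <pattern>:<stream_path>: {rule!r}")
    if not stream_path.startswith("/"):
        raise ValueError(f"stream path must be absolute: {stream_path!r}")
    return pattern, stream_path


print(parse_rule("pressure_*:/apps/weather_monitoring/pressure_stream"))
# ('pressure_*', '/apps/weather_monitoring/pressure_stream')
```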

How to Set Mapping Rules at the Global Level

Global-level mapping rules apply to every user who has no user-specific rules configured. The following example shows the setting for global-level rules:

kafka.cluster.topic-mappings.default-rules = [<rule_1>, <rule_2>,...]

How to Set Mapping Rules at the User Level

User-level mapping rules apply to a specific user. The following example shows the setting for user-level rules:

kafka.cluster.topic-mappings.user-rules.<user_name> = [<rule_1>, <rule_2>,...]

TIP When configuring multiple rules, separate the rules with commas. The system evaluates the rules in the order specified and applies the first rule that matches.
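Because the first matching rule wins, rule order changes the result. The following Python sketch illustrates this with a hypothetical resolve function. It uses Python's fnmatch module as a stand-in for the glob matching; fnmatch also understands ? and [...] patterns, which the actual rule engine may or may not support.

```python
from fnmatch import fnmatchcase


def resolve(topic, rules):
    """Return '<stream_path>:<topic>' for the first matching rule, else None.

    Hypothetical illustration of first-match evaluation.
    """
    for rule in rules:
        pattern, _, stream_path = rule.partition(":")
        if fnmatchcase(topic, pattern):
            return f"{stream_path}:{topic}"
    return None


specific_first = ["payroll_*:/apps/accounts/payrolls",
                  "*:/apps/accounts/default-stream"]
catch_all_first = list(reversed(specific_first))

print(resolve("payroll_2024", specific_first))
# /apps/accounts/payrolls:payroll_2024
print(resolve("payroll_2024", catch_all_first))
# /apps/accounts/default-stream:payroll_2024
```

In the second list the catch-all rule shadows the payroll_* rule, so more specific patterns generally belong first.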

By default, global rules are not included in the per-user rule list. An administrator can include the default rules for any user with the following placeholder variable:

${DEFAULT_RULES}
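The relationship between global and user-level rules can be sketched in Python as follows. The function effective_rules and the user name guest are hypothetical, and the sketch assumes that the placeholder expands in place, at the position where it appears in the user's list; the source does not spell out the expansion order.

```python
# In-memory stand-in for the ${DEFAULT_RULES} placeholder (illustration only).
DEFAULT_RULES = "${DEFAULT_RULES}"


def effective_rules(user, default_rules, user_rules):
    """Return the ordered rule list that applies to a user.

    Assumption: users without their own list get the global rules, and the
    placeholder is replaced in place by the global rules.
    """
    if user not in user_rules:
        return list(default_rules)
    expanded = []
    for rule in user_rules[user]:
        if rule == DEFAULT_RULES:
            expanded.extend(default_rules)
        else:
            expanded.append(rule)
    return expanded


defaults = ["*:/var/kafka-wire-protocol/default-stream"]
per_user = {
    "hr_user": ["payroll_*:/apps/accounts/payrolls", DEFAULT_RULES],
    "thor": ["asgard*:/realms/asgard"],
}

print(effective_rules("hr_user", defaults, per_user))
# ['payroll_*:/apps/accounts/payrolls', '*:/var/kafka-wire-protocol/default-stream']
print(effective_rules("thor", defaults, per_user))
# ['asgard*:/realms/asgard']
print(effective_rules("guest", defaults, per_user))
# ['*:/var/kafka-wire-protocol/default-stream']
```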

Topic Mapping Rules Syntax Usage Examples

All topics (as indicated by *) are redirected to the topics with the same name within the Data Fabric stream /data/df_streams/single_cluster_stream:
```
*:/data/df_streams/single_cluster_stream
```
A topic named hourly_temperature is redirected to the fully-qualified topic path /apps/weather_monitoring/temperature_stream:hourly_temperature:
```
hourly_temperature:/apps/weather_monitoring/temperature_stream
```
Any topic name with the prefix pressure_ is redirected to the topics with the same name within the Data Fabric stream /apps/weather_monitoring/pressure_stream:
```
pressure_*:/apps/weather_monitoring/pressure_stream
```

Example kafka-cluster.conf File with Mapping Rules

The following example shows how to configure mapping rules at the global and user level in the kafka-cluster.conf file:

kafka.cluster = {
  topic-mappings = {
    default-rules = ["*:/var/kafka-wire-protocol/default-stream"]
    user-rules = {
      accounts_user = ["payroll_*:/apps/accounts/payrolls",
                       "*:/apps/accounts/default-stream"]
      hr_user =       ["payroll_*:/apps/accounts/payrolls", ${DEFAULT_RULES}]
      thor =          ["asgard*:/realms/asgard"]
    }
  }
}

TIP As a best practice, configure a catch-all mapping rule, such as *:/some/stream-path. Without a catch-all rule, when an RPC refers to a topic name that matches none of the configured mapping rules, the system throws an UnknownTopicOrPartitionException and Kafka clients receive an UNKNOWN_TOPIC_OR_PARTITION error.
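A simple check for missing catch-all rules might look like the following Python sketch. The function names are hypothetical, and the input is assumed to be each user's rule list from the example configuration above with ${DEFAULT_RULES} already expanded in place. The check looks only for a pattern that is exactly *; it is not a full analysis of which topic names are covered.

```python
def has_catch_all(rules):
    """True if any rule's pattern is exactly '*'."""
    return any(rule.partition(":")[0] == "*" for rule in rules)


def users_missing_catch_all(rules_by_user):
    """Return the sorted names of users whose rule list has no '*' rule."""
    return sorted(user for user, rules in rules_by_user.items()
                  if not has_catch_all(rules))


# Rule lists from the example file, with the placeholder assumed expanded.
effective = {
    "accounts_user": ["payroll_*:/apps/accounts/payrolls",
                      "*:/apps/accounts/default-stream"],
    "hr_user": ["payroll_*:/apps/accounts/payrolls",
                "*:/var/kafka-wire-protocol/default-stream"],
    "thor": ["asgard*:/realms/asgard"],
}

print(users_missing_catch_all(effective))  # ['thor']
```

In the example file, thor has a user-level list without a catch-all rule and without ${DEFAULT_RULES}, so any topic name that does not start with asgard would be unmatched for that user.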