Mapping Topics to Streams

Describes why topics must be mapped to streams, explains how to create mapping rules, and provides examples.

Topics in Apache Kafka vs HPE Ezmeral Data Fabric Streams

In Apache Kafka, a topic is the highest-level object, and you reference a topic by name alone. For example, a topic named pollution_monitors is referenced as pollution_monitors.

In HPE Ezmeral Data Fabric Streams, a stream is the highest-level object, and a stream contains one or more topics. A stream can reside in any directory within the Data Fabric file system namespace. Therefore, when you reference a topic, you must specify both the fully-qualified path of the stream in which the topic resides and the topic name, separated by a colon, as shown in the following example:
/apps/monitoring/pm_stream:pollution_monitors

In this example, pollution_monitors is the topic name and /apps/monitoring/pm_stream is the fully-qualified path of the stream (pm_stream) in which the topic (pollution_monitors) resides.
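
The path format described above can be illustrated with a short Python sketch. The helper name split_topic_path and the TopicPath type are hypothetical and are not part of any Data Fabric API. The sketch assumes that the last colon separates the stream path from the topic name.

```python
from typing import NamedTuple


class TopicPath(NamedTuple):
    stream_path: str  # for example, /apps/monitoring/pm_stream
    topic: str        # for example, pollution_monitors


def split_topic_path(full_path: str) -> TopicPath:
    """Split '<stream_path>:<topic>' at the last colon (assumed separator)."""
    stream_path, sep, topic = full_path.rpartition(":")
    # Reject input with no colon, a relative stream path, or an empty topic.
    if not sep or not stream_path.startswith("/") or not topic:
        raise ValueError(f"not a fully-qualified topic path: {full_path!r}")
    return TopicPath(stream_path, topic)


parsed = split_topic_path("/apps/monitoring/pm_stream:pollution_monitors")
print(parsed.stream_path)
print(parsed.topic)
```

A bare Apache Kafka topic name such as pollution_monitors is rejected by this helper, which mirrors why a mapping rule is needed to supply the stream path.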

For an Apache Kafka client to access a topic in HPE Ezmeral Data Fabric Streams, the topic must map to a stream. To create the mapping, you configure topic mapping rules in kafka-cluster.conf. The mapping rule translates the Apache Kafka topic name in the incoming RPCs into a fully-qualified path within a specific Data Fabric stream.

Configuring Topic Mapping Rules

Configure topic mapping rules in the /opt/kafka-wire-protocol/conf/kafka-cluster.conf file on the Data Fabric cluster. You can configure topic mapping rules at the global level or at the user level. Each topic mapping rule that you create must conform to a specific format.

The following sections show the settings and syntax used to configure mapping rules at the global level and user level in the kafka-cluster.conf file:
Syntax for Mapping Rules
A mapping rule must have the following format:
<topic_name_or_glob_pattern>:<datafabric_stream_path>
  • topic name - The Apache Kafka topic name.
  • glob pattern - A pattern with wildcard characters that matches a set of topics, such as pressure_*, which matches all topics that have the prefix pressure_.
  • datafabric_stream_path - Absolute path of the stream that will store the topic(s).
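
As a rough illustration of the rule format, the following Python sketch parses a single rule and translates a topic name. The function names parse_rule and apply_rule are hypothetical. The sketch assumes that the first colon separates the pattern from the stream path and that glob patterns behave like shell-style wildcards as implemented by Python's fnmatch module; the actual matching semantics of the product may differ.

```python
from fnmatch import fnmatchcase
from typing import Optional, Tuple


def parse_rule(rule: str) -> Tuple[str, str]:
    """Split '<topic_name_or_glob_pattern>:<datafabric_stream_path>'."""
    pattern, sep, stream_path = rule.partition(":")
    # The stream path must be absolute, per the rule format above.
    if not sep or not pattern or not stream_path.startswith("/"):
        raise ValueError(f"malformed mapping rule: {rule!r}")
    return pattern, stream_path


def apply_rule(rule: str, topic: str) -> Optional[str]:
    """Return '<stream_path>:<topic>' if the rule matches, else None."""
    pattern, stream_path = parse_rule(rule)
    if fnmatchcase(topic, pattern):  # case-sensitive glob match (assumption)
        return f"{stream_path}:{topic}"
    return None


rule = "pressure_*:/apps/weather_monitoring/pressure_stream"
print(apply_rule(rule, "pressure_hourly"))
print(apply_rule(rule, "hourly_temperature"))
```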
How to Set Mapping Rules at the Global Level
Global-level mapping rules apply to all users for whom no user-specific rules are configured. The following example shows the setting for global-level rules:
kafka.cluster.topic-mappings.default-rules = [<rule_1>, <rule_2>,...]
How to Set Mapping Rules at the User Level
User-level mapping rules apply to a specific user. The following example shows the setting for user-level rules:
kafka.cluster.topic-mappings.user-rules.<user_name> = [<rule_1>, <rule_2>,...]
TIP When configuring multiple rules, separate the rules with commas. The system evaluates the rules in the order specified and applies the first rule that matches.
By default, global rules are not included in the per-user rule list. An administrator can include the default rules for any user by adding the following placeholder variable to that user's rule list:
${DEFAULT_RULES}
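
The following Python sketch models the evaluation order described above; it is not the product's implementation. The function names effective_rules and resolve and the user weather_user are hypothetical. The sketch assumes that the ${DEFAULT_RULES} placeholder expands in place, at the position where it appears in the user's rule list, and that glob patterns behave like shell-style wildcards.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Optional

DEFAULT_PLACEHOLDER = "${DEFAULT_RULES}"


def effective_rules(user: str, default_rules: List[str],
                    user_rules: Dict[str, List[str]]) -> List[str]:
    """Return the ordered rule list that applies to a user."""
    if user not in user_rules:
        # No user-specific rules: the global rules apply.
        return list(default_rules)
    expanded: List[str] = []
    for rule in user_rules[user]:
        if rule == DEFAULT_PLACEHOLDER:
            expanded.extend(default_rules)  # assumed in-place expansion
        else:
            expanded.append(rule)
    return expanded


def resolve(topic: str, rules: List[str]) -> Optional[str]:
    """Apply the first matching rule; return None if no rule matches."""
    for rule in rules:
        pattern, _, stream_path = rule.partition(":")
        if fnmatchcase(topic, pattern):
            return f"{stream_path}:{topic}"
    return None


default_rules = ["*:/data/df_streams/single_cluster_stream"]
user_rules = {
    "weather_user": [  # hypothetical user
        "pressure_*:/apps/weather_monitoring/pressure_stream",
        DEFAULT_PLACEHOLDER,
    ],
}

rules = effective_rules("weather_user", default_rules, user_rules)
print(resolve("pressure_hourly", rules))  # matched by the pressure_* rule
print(resolve("wind_speed", rules))       # falls through to the default rule
```

Because the first matching rule wins, placing a catch-all rule such as * ahead of a more specific rule would prevent the specific rule from ever being applied in this model.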

Topic Mapping Rules Syntax Usage Examples

  • All topics (as indicated by *) are redirected to the topics with the same name within the Data Fabric stream /data/df_streams/single_cluster_stream:
    *:/data/df_streams/single_cluster_stream
  • A topic named hourly_temperature is redirected to the fully-qualified topic path /apps/weather_monitoring/temperature_stream:hourly_temperature:
    hourly_temperature:/apps/weather_monitoring/temperature_stream
  • Any topic name with the prefix pressure_ is redirected to the topics with the same name within the Data Fabric stream /apps/weather_monitoring/pressure_stream:
    pressure_*:/apps/weather_monitoring/pressure_stream

Example kafka-cluster.conf File with Mapping Rules

The following example shows how to configure mapping rules at the global and user level in the kafka-cluster.conf file:
kafka.cluster = {
  topic-mappings = {
    default-rules = ["*:/var/kafka-wire-protocol/default-stream"]
    user-rules = {
      accounts_user = ["payroll_*:/apps/accounts/payrolls",
                       "*:/apps/accounts/default-stream"]
      hr_user =       ["payroll_*:/apps/accounts/payrolls", ${DEFAULT_RULES}]
      thor =          ["asgard*:/realms/asgard"]
    }
  }
}
TIP As a best practice, configure a catch-all mapping rule, such as *:/some/stream-path. Without a catch-all rule, when an RPC refers to a topic name that does not match any configured mapping rule, the system throws an UnknownTopicOrPartitionException, and Kafka clients receive an UNKNOWN_TOPIC_OR_PARTITION error.
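
One way to check a configuration against this best practice is sketched below in Python. The sample rules are transcribed by hand into Python data structures (the sketch does not parse kafka-cluster.conf), and the function names has_catch_all and users_without_catch_all are hypothetical. The sketch treats only the bare pattern * as a catch-all and assumes that ${DEFAULT_RULES} expands to the global rule list.

```python
from typing import Dict, List

DEFAULT_PLACEHOLDER = "${DEFAULT_RULES}"

# Hand-transcribed from the sample kafka-cluster.conf file above.
default_rules = ["*:/var/kafka-wire-protocol/default-stream"]
user_rules = {
    "accounts_user": ["payroll_*:/apps/accounts/payrolls",
                      "*:/apps/accounts/default-stream"],
    "hr_user": ["payroll_*:/apps/accounts/payrolls", DEFAULT_PLACEHOLDER],
    "thor": ["asgard*:/realms/asgard"],
}


def has_catch_all(rules: List[str], defaults: List[str]) -> bool:
    """Return True if the rule list contains a '*' rule, directly or via defaults."""
    for rule in rules:
        if rule == DEFAULT_PLACEHOLDER:
            # Check the global rules that the placeholder stands for.
            if has_catch_all(defaults, []):
                return True
        elif rule.partition(":")[0] == "*":
            return True
    return False


def users_without_catch_all(defaults: List[str],
                            per_user: Dict[str, List[str]]) -> List[str]:
    """List users whose effective rules lack a catch-all rule."""
    return sorted(user for user, rules in per_user.items()
                  if not has_catch_all(rules, defaults))


print(users_without_catch_all(default_rules, user_rules))  # prints ['thor']
```

In this model, a client authenticated as thor that refers to a topic whose name does not start with asgard matches no rule, which is the situation that the tip above warns about.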