Producers

Producers are data-generating applications, such as sensors in automobiles or activity loggers in servers. Producers create messages with the collected data and publish them to HPE Ezmeral Data Fabric Streams topics or, more precisely, to the partitions of those topics (topic-partitions).

Permissions

Before a producer can publish to topics, the user ID running the producer needs these permissions:
  • The writeAce permission on the volume where the streams are located. For information about how to set permissions on volumes, see Setting Whole Volume ACEs.
  • The produceperm permission on the streams where the topics are located. Users with the adminperm permission on those streams can grant the produceperm permission.

Producing Messages

Producers create messages from the collected data and send them to an HPE Ezmeral Data Fabric Streams producer client library. Along with the message itself, the producer specifies the topic that the message is intended for and, optionally, a partition ID. The producer client buffers incoming messages and sends them in batches to the HPE Ezmeral Data Fabric Streams server.
NOTE In case of server failure, the producer client automatically continues to retry sending messages.
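
The buffering, batching, and retry behavior described above can be illustrated with a small conceptual model. This is a sketch only, not the HPE Ezmeral Data Fabric Streams client API: the class ToyProducerClient, the send_batch callback, the batch_size and max_retries parameters, and the topic name /sample-stream:sensor-readings are all hypothetical. The toy client also gives up after a fixed number of retries so that the example terminates, which is an assumption of the sketch.

```python
from collections import defaultdict


class ToyProducerClient:
    """Conceptual model of a buffering producer client (not the real library)."""

    def __init__(self, send_batch, batch_size=3, max_retries=5):
        # send_batch(topic, partition, messages) stands in for the server call
        # and is expected to raise ConnectionError when the server is down.
        self._send_batch = send_batch
        self._batch_size = batch_size
        self._max_retries = max_retries
        self._buffers = defaultdict(list)  # (topic, partition) -> pending messages

    def send(self, topic, message, partition=None):
        """Buffer a message; the partition ID is optional."""
        key = (topic, partition)
        self._buffers[key].append(message)
        if len(self._buffers[key]) >= self._batch_size:
            self._flush_key(key)

    def flush(self):
        """Send every buffered message, regardless of batch size."""
        for key in list(self._buffers):
            self._flush_key(key)

    def _flush_key(self, key):
        batch = self._buffers.pop(key, [])
        if not batch:
            return
        topic, partition = key
        for _ in range(self._max_retries):
            try:
                self._send_batch(topic, partition, batch)
                return
            except ConnectionError:
                continue  # retry the same batch
        # Put the batch back so that no message is silently dropped.
        self._buffers[key] = batch + self._buffers.get(key, [])
        raise ConnectionError("batch could not be delivered")


# Hypothetical server stub that fails once and then accepts batches.
sent = []
failures = {"remaining": 1}


def fake_server(topic, partition, messages):
    if failures["remaining"] > 0:
        failures["remaining"] -= 1
        raise ConnectionError("server unavailable")
    sent.append((topic, partition, list(messages)))


client = ToyProducerClient(fake_server, batch_size=2)
client.send("/sample-stream:sensor-readings", "speed=61")
client.send("/sample-stream:sensor-readings", "speed=64")  # fills the batch
client.send("/sample-stream:sensor-readings", "rpm=2100", partition=0)
client.flush()  # sends the remaining partial batch
```

In this model the first batch is delivered on the second attempt, after one simulated server failure, and the message addressed to partition 0 is sent as a separate batch when flush is called.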
ATTENTION As of data-fabric 6.1, the HPE Ezmeral Data Fabric Streams API enforces a maximum of 4096 partitions per topic. That is, when you create an application with the API, a topic can have at most 4096 partitions. If you created an application with the HPE Ezmeral Data Fabric Streams 6.0.1 API (or earlier) and have since upgraded, you can keep the original number of partitions. For example, if you were using more than 4096 partitions in data-fabric 6.0.1 or earlier, you can continue to use the same number of partitions after upgrading.

Event-time Timestamp

As of data-fabric 6.0.1, HPE Ezmeral Data Fabric Streams supports an event-time timestamp. The timestamp type can be either createtime (default) or logappendtime. See the maprcli stream create and stream edit commands for more information about these parameters.

TIP Because each message is automatically published to a topic-partition with an event-time timestamp as part of the message record, consumer applications can seek records by timestamp.
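
Seeking by timestamp can be pictured as a search over the timestamps stored with the records of one topic-partition. The following is a minimal sketch, not the consumer API: the function first_offset_at_or_after is hypothetical, offsets are modeled as list indexes, and the timestamps are assumed to be non-decreasing. That assumption is natural when the server stamps records as it appends them (as the name logappendtime suggests) but may not hold for producer-assigned createtime values.

```python
import bisect


def first_offset_at_or_after(timestamps, target_ms):
    """Return the offset of the first record whose timestamp is >= target_ms.

    timestamps: per-record timestamps (milliseconds) in offset order,
                assumed non-decreasing for this sketch.
    Returns None when every record is older than target_ms.
    """
    offset = bisect.bisect_left(timestamps, target_ms)
    return offset if offset < len(timestamps) else None


# Hypothetical timestamps for offsets 0..3 of one topic-partition.
partition_timestamps = [1000, 1005, 1005, 1020]

start = first_offset_at_or_after(partition_timestamps, 1006)
```

A consumer that wants every record from time 1006 onward would, in this model, begin reading at the returned offset.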

Idempotent (exactly once) Producers

An "exactly-once" message delivery semantic produces messages without duplication. Each message is delivered once and only once. Exactly-once is ensured by uniquely identifying a group of messages that are atomically persisted. To enable exactly-once message delivery, set the producer idempotence option. See Modes of Publishing for more information.
The following failure scenarios are addressed with idempotence:
  • The stream processor might take input from multiple source topics and the ordering across these source topics is not deterministic across multiple runs. So if you re-run your stream processor that takes input from multiple source topics, it might produce different results.
  • The stream processor might produce output to multiple destination topics. If the producer cannot do an atomic write across multiple topics, then the producer output can be incorrect if writes to some (but not all) partitions fail.
  • The stream processor might aggregate or join data across multiple inputs. If one of the instances of the stream processor fails, then you need to be able to roll back the state materialized by that instance of the stream processor. On restarting the instance, you also need to be able to resume processing and recreate its state.
  • The stream processor might look up enriching information in an external database or by calling out to a service that is updated out of band. By depending on an external service, the stream processor can be fundamentally non-deterministic. For example, if the external service changes its internal state between two runs of the stream processor, it can lead to incorrect results downstream.
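
To make the idea of uniquely identifying messages concrete, the sketch below shows one way a log could discard a retried message that was already persisted. It assumes each producer tags its messages with a producer ID and an increasing sequence number, a scheme commonly described for Kafka-style idempotent producers. The text above does not state how HPE Ezmeral Data Fabric Streams identifies messages internally, so the class ToyPartitionLog and its fields are purely illustrative.

```python
class ToyPartitionLog:
    """Toy log that ignores messages it has already persisted."""

    def __init__(self):
        self.records = []
        self._last_seq = {}  # producer_id -> highest sequence number persisted

    def append(self, producer_id, sequence, message):
        """Persist the message unless it is a duplicate.

        Returns True if the message was appended, False if it was ignored.
        """
        if sequence <= self._last_seq.get(producer_id, -1):
            return False  # retry of a message that was already written
        self.records.append(message)
        self._last_seq[producer_id] = sequence
        return True


log = ToyPartitionLog()
results = [
    log.append("producer-1", 0, "a"),
    log.append("producer-1", 1, "b"),
    log.append("producer-1", 1, "b"),  # retry after a lost acknowledgement
    log.append("producer-1", 2, "c"),
]
```

Even though message "b" is sent twice, the log in this model stores it once, which is the effect that idempotence is meant to achieve when a producer retries after a failure.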

For More Information

For more information about creating and editing streams or topics: