Class PartitionGroup


  • public class PartitionGroup
    extends Object
    PartitionGroup is used to buffer all co-partitioned records for processing. In other words, it represents the "same" partition over multiple co-partitioned topics, and it is used to buffer records from that partition in each of the contained topic-partitions. Each StreamTask has exactly one PartitionGroup. PartitionGroup implements the algorithm that determines in what order buffered records are selected for processing. Specifically, when polled, it returns the record from the topic-partition with the lowest stream-time. Stream-time for a topic-partition is defined as the highest timestamp yet observed at the head of that topic-partition. PartitionGroup also maintains a stream-time for the group as a whole. This is defined as the highest timestamp of any record yet polled from the PartitionGroup. Note however that any computation that depends on stream-time should track it on a per-operator basis to obtain an accurate view of the local time as seen by that processor. The PartitionGroups's stream-time is initially UNKNOWN (-1), and it set to a known value upon first poll. As a consequence of the definition, the PartitionGroup's stream-time is non-decreasing (i.e., it increases or stays the same over time).