Spyglass on Streams
The 6.0 release of the MapR Converged Data Platform introduces Spyglass on Streams. When you install the 6.0 version of MapR Core, Streams is the default mechanism through which metrics flow from the collectd service to OpenTSDB. Moving metrics through streams secures the data and provides a mechanism to perform real-time data analytics.
The Flow of Metrics via Streams
The collectd service collects node-level and service-level metrics from each node in the cluster. The collectd service hashes metrics to a stream and writes the metrics into topics in that stream.
Collectd creates two streams per cluster, Stream (0) and Stream (1). Each stream contains 100 or more topics. Topic names use the format <hostname>_<metricname>. For example: mfs81.qa.lab_cpu.percent.
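The topic naming format above can be sketched in C as follows. This is a minimal illustration, not collectd's actual code: the function name build_topic_name and its return convention are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "<hostname>_<metricname>" into out.
 * Returns 0 on success, or -1 if the buffer is too small. */
static int build_topic_name(char *out, size_t out_size,
                            const char *hostname, const char *metric_name)
{
    assert(out != NULL && hostname != NULL && metric_name != NULL);

    /* hostname + '_' + metric name + terminating NUL */
    size_t needed = strlen(hostname) + 1 + strlen(metric_name) + 1;
    if (needed > out_size)
        return -1;

    snprintf(out, out_size, "%s_%s", hostname, metric_name);
    return 0;
}
```

Called with the hostname mfs81.qa.lab and the metric name cpu.percent, the helper produces the topic name mfs81.qa.lab_cpu.percent from the example above.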
Collectd maps the metric names to Stream (0) or Stream (1) using the djb2 hash algorithm, as shown:
static int hash(const char *str, int range)
{
    int hash = 5381;
    int c;
    while ((c = *str++) != 0)
        hash = ((hash << 5) + hash) + c; /* hash * 33 + c */
    return abs(hash%range);
}
With the default of two streams (a range of 2), the algorithm hashes the metric name to an integer that is either 0 or 1.
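To try the mapping outside of collectd, the following sketch reimplements the same djb2 calculation. It is an approximation under stated assumptions: signed integer overflow is undefined behavior in C, so the sketch accumulates in unsigned 32-bit arithmetic and converts back to a signed value, which assumes the two's-complement wraparound found on mainstream compilers. The name stream_index is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical re-implementation of the djb2 mapping for experimentation.
 * Returns a stream index in the range [0, stream_count). */
static int stream_index(const char *metric_name, int stream_count)
{
    assert(metric_name != NULL && stream_count > 0);

    uint32_t hash = 5381u;
    while (*metric_name != 0)
        hash = hash * 33u + (uint32_t)*metric_name++; /* wraps modulo 2^32 */

    /* Assumption: the conversion wraps as on two's-complement platforms,
     * mimicking how the original int-based code behaves in practice. */
    int32_t wrapped = (int32_t)hash;
    return abs(wrapped % stream_count);
}
```

For example, stream_index("cpu.percent", 2) evaluates to 0, so under these assumptions that metric would land in Stream (0).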
[...] (tsdb_cluster_mgmt.sh) that runs periodically.
Spyglass on Streams Performance
The performance of Streams for metrics depends on the size of the cluster, the number of OpenTSDB nodes available to consume metrics data, and the number of streams and partitions available to move the metrics data from the collectd service to OpenTSDB.
Default collectd and OpenTSDB settings work well for clusters with up to 100 nodes and three OpenTSDB nodes. If the number of nodes increases or you notice that performance is sluggish, you can improve performance by adding OpenTSDB nodes and modifying the number of streams.
Evaluating Streams Performance
You can use the stream cursor list and stream topic info maprcli commands to view the producer (collectd) and consumer (OpenTSDB) statistics. Check the statistics to see if there is an increase in lag time between producers and consumers. If you notice an increase in lag time, increase the number of consumers (OpenTSDB nodes) or modify the streams and partition settings, as explained in the following sections.
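As a rough aid for the check described above, the following sketch takes a series of lag readings and reports whether the lag grew at every step. The readings are assumed to be lag times in milliseconds that you record yourself from successive runs of the maprcli commands. The function name and the strict "grew at every step" rule are assumptions for illustration, not part of any MapR tool.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical check: returns 1 if every reading is larger than the one
 * before it (lag keeps growing), otherwise 0.
 * lag_ms holds manually recorded lag times in milliseconds. */
static int lag_is_increasing(const long long *lag_ms, size_t count)
{
    assert(lag_ms != NULL && count >= 2);

    for (size_t i = 1; i < count; ++i) {
        if (lag_ms[i] <= lag_ms[i - 1])
            return 0; /* lag held steady or dropped at least once */
    }
    return 1;
}
```

A steadily growing series such as 1200, 1900, 4500 would indicate that the consumers are falling behind, while a series that fluctuates (1200, 1100, 1300) would not trip this particular check.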
For more information, see Monitoring Producers and Monitoring Consumers.
Determining How Many OpenTSDB Nodes to Install
Having multiple OpenTSDB nodes in the cluster distributes the workload. The number of partitions and OpenTSDB nodes determines the level of parallelism for consumption.
Each OpenTSDB node can consume one partition at a time. By default, metrics data is divided across five partitions in each topic, so optimal parallelism is reached when five OpenTSDB nodes are available to consume the partitions. See Parallelism When Consuming Messages. Note that the term “consumer” in that documentation topic corresponds to an OpenTSDB node in Spyglass on Streams.
- Three OpenTSDB nodes in a 10-node cluster
- Four OpenTSDB nodes in a 100-node cluster
- Five OpenTSDB nodes in a 1000-node cluster
If your cluster has 10 or more nodes, at least three OpenTSDB nodes should be available to consume metrics. Typically, for every 10x increase in nodes, you should add another OpenTSDB node. For example, if your cluster reaches a size of 100 nodes, have four OpenTSDB nodes available for consumption. Note that an increase in the number of OpenTSDB nodes may require an increase in the number of streams and/or partitions.
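The sizing guideline above (three nodes at 10 cluster nodes, plus one for each further 10x increase) can be expressed as a small calculation. This is a sketch of that rule of thumb only. The function name is hypothetical, and returning 0 for clusters with fewer than 10 nodes is an assumption, since the guideline does not cover that case.

```c
#include <assert.h>

/* Hypothetical helper: suggested number of OpenTSDB nodes for a cluster.
 * Returns 0 for clusters smaller than 10 nodes, which the guideline
 * does not address. */
static int recommended_opentsdb_nodes(long cluster_nodes)
{
    assert(cluster_nodes >= 0);

    if (cluster_nodes < 10)
        return 0;

    int nodes = 3; /* baseline for a 10-node cluster */
    for (long n = cluster_nodes / 10; n >= 10; n /= 10)
        nodes++;   /* one more node per further 10x increase */
    return nodes;
}
```

With this rule, 10 cluster nodes yield three OpenTSDB nodes, 100 yield four, and 1000 yield five, matching the list above. Values beyond 1000 nodes are an extrapolation.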
[...] /opt/mapr/asynchbase/asynchbase-<version>/conf/asynchbase.conf file on the OpenTSDB nodes: "fs.mapr.async.worker.threads=<value>"
Increasing the Number of Streams
The default setting for the number of streams is two. As a general guideline, for every 10x increase in the number of cluster nodes, add two additional streams. For example, if your cluster has 100 nodes, add two more streams, for a total of four.
To increase the number of streams, edit the streams-specific options in the collectd and OpenTSDB configuration files. The streams option in each file must have the same value. After you change streams parameters, reconfigure MapR Monitoring, as shown in Update the Monitoring Storage Nodes.
Parameter | Default Setting (up to 100 nodes) | Number of Streams for 100 nodes | Number of Streams for 1000 nodes | File Location |
StreamsCount | 2 | 4 | 6 | /opt/mapr/collectd/collectd-<version>/etc/collectd.conf |
MAXSTREAMS | 2 | 4 | 6 | /opt/mapr/collectd/collectd-<version>/etc/init.d/collectd |
tsd.streams.count | 2 | 4 | 6 | /opt/mapr/opentsdb/opentsdb-<version>/etc/opentsdb/opentsdb.conf |
Changing the Automatic Stream Cursor Commits
You can adjust the frequency of automatic stream cursor commits for OpenTSDB by modifying the tsd.streams.autocommit.interval setting in opentsdb.conf. The unit is milliseconds. The default value is 60000, which is 60 seconds. For a system with heavy loads, consider increasing the value, for example to 300000 (5 minutes).