Grafana + InfluxDB¶
Monitoring Data in Motion¶
There has been a significant increase in the popularity of event streaming and stream processing applications/technologies within the data engineering community. With the accelerating growth of big data, IoT, and cloud computing, more organizations are facing the challenge of extracting actionable insights earlier in the event pipeline. For historical reasons, operational tools for monitoring, alerting, and diagnosing system issues are oriented toward data at rest. That doesn’t mean they can’t be just as useful for monitoring data in motion. It just means adjusting your monitoring regime to a streaming mindset.
A good example of a next-gen streaming infrastructure element is Novelty. Novelty is a stateful event streaming engine designed to process high-volume event streams and produce high-value events in real time.
This guide walks you through setting up Grafana backed by InfluxDB to monitor a Novelty instance. We'll show you how to configure Novelty to send data to InfluxDB, create a dashboard in Grafana to visualize this data, and use Grafana's features to detect issues and anomalies in real time. By the end, you'll understand how to monitor event stream pipelines using Grafana and InfluxDB, and you'll have the tools and knowledge needed to keep Novelty running smoothly.
Setting up Grafana and InfluxDB¶
Grafana is a tool that helps you visualize and understand operational metrics data. It lets you create visual dashboards to monitor and analyze data from sources across your data infrastructure. DevOps teams use Grafana metrics dashboards to make informed decisions.
Above is an example of a typical development and testing environment. The event sources and output sinks change depending on the scenario, but typically Novelty runs on localhost and is configured to push metrics to InfluxDB, while Grafana visualizes those metrics. Using Docker containers makes it easy to configure and clean up the environment quickly.
Some pre-work is needed before launching the Docker containers. The following example uses docker-compose to set up the environment. Your configuration may differ based on how Docker is installed on your host.
A recommended approach is to keep each docker-compose.yaml file in its own subdirectory of a docker directory in $HOME (for example, $HOME/docker/grafana). This keeps things organized and makes it easy to share configs between machines.
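The compose file itself is not reproduced in this guide, so here is a minimal sketch of what it might contain, written out with a shell heredoc. It assumes the directory layout described above, the official influxdb:1.8 and grafana/grafana images, and the official InfluxDB 1.x image's INFLUXDB_DB variable to create the db0 database on first start. The image tags, volume names, and the STACK_DIR variable are illustrative choices, not settings taken from the Novelty documentation. Note that running it overwrites any docker-compose.yaml already in that directory.

```shell
# Directory holding this stack's compose file (assumed layout; set STACK_DIR to override).
STACK_DIR="${STACK_DIR:-$HOME/docker/grafana}"
mkdir -p "$STACK_DIR"

# Write a minimal compose file. With services named "influxdb" and "grafana" in a
# directory named "grafana", Compose names the containers grafana-influxdb-1 and
# grafana-grafana-1.
cat > "$STACK_DIR/docker-compose.yaml" <<'EOF'
services:
  influxdb:
    image: influxdb:1.8            # assumed tag; the Novelty reporter targets InfluxDB version 1
    ports:
      - "8086:8086"
    environment:
      INFLUXDB_DB: db0             # database created by the image on first start
    volumes:
      - influxdb-data:/var/lib/influxdb
  grafana:
    image: grafana/grafana         # assumed tag (latest)
    ports:
      - "3000:3000"
    depends_on:
      - influxdb
    volumes:
      - grafana-data:/var/lib/grafana
volumes:
  influxdb-data:
  grafana-data:
EOF
```

The named volumes keep InfluxDB data and Grafana dashboards across container restarts; drop them if you prefer a throwaway environment.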
Change into the grafana directory and start the InfluxDB/Grafana stack:
docker compose up -d
Verify that the containers are running:
docker ps
NAMES                STATUS         PORTS
grafana-grafana-1    Up 4 seconds   0.0.0.0:3000->3000/tcp
grafana-influxdb-1   Up 4 seconds   0.0.0.0:8086->8086/tcp
InfluxDB and Grafana are now running in separate containers and listening on their default ports.
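A container can report Up a few seconds before the service inside it accepts requests. The polling helper sketched below is one way to confirm readiness. It relies on two health endpoints: InfluxDB 1.x answers /ping with status 204, and Grafana answers /api/health with status 200. The function name wait_for and its defaults are hypothetical, and curl must be installed on the host.

```shell
# wait_for URL EXPECTED_STATUS [TRIES] [DELAY_SECONDS]
# Polls URL until it returns EXPECTED_STATUS. Returns 1 if it never does.
wait_for() {
  url=$1
  expected=$2
  tries=${3:-30}
  delay=${4:-1}
  attempt=0
  while [ "$attempt" -lt "$tries" ]; do
    # -w prints only the HTTP status code; a failed connection prints 000.
    code=$(curl -s -o /dev/null --max-time 2 -w '%{http_code}' "$url" 2>/dev/null || true)
    if [ "$code" = "$expected" ]; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for http://localhost:8086/ping 204 && wait_for http://localhost:3000/api/health 200` succeeds once both services respond on their default ports.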
Configuring Novelty to Send Metrics Data¶
Enable metrics reporting in Novelty via configuration parameters that can be passed as Java system properties with -D or contained in a Novelty configuration file. Novelty can report metrics to jmx, csv, influxdb, and slf4j for analysis. The jmx metrics reporter is enabled by default.
java \
-Xmx12G -Xms12G \
-Dthatdot.novelty.metrics-reporters.1.type=influxdb \
-Dthatdot.novelty.metrics-reporters.1.database=db0 \
-Dthatdot.novelty.metrics-reporters.1.period=30s \
-Dthatdot.novelty.metrics-reporters.1.host={container_host} \
-jar novelty-0.15.0.jar
Keep one thing in mind when passing configuration as system properties:
- The -D parameters must come before -jar. Anything that follows the jar file name is passed to the application as a program argument rather than to the JVM.
Alternatively, you can accomplish the same thing by passing Novelty a configuration file.
Create a novelty-metrics.conf file containing the following HOCON configuration, taken from the documentation:
thatdot.novelty {
  # where metrics collected by the application should be reported
  metrics-reporters = [
    {
      # Report metrics to an influxdb (version 1) database
      type = influxdb
      # required by influxdb - the interval at which new records will
      # be written to the database (matches the 30s used on the command line)
      period = 30s
      # Connection information for the influxdb database
      database = db0
      scheme = http
      host = {container_host}
      port = 8086
      # Authentication information for the influxdb database. Both
      # fields may be omitted
      # user = admin
      # password = admin
    }
  ]
}
Important
Replace {container_host} with the actual hostname of the container host (for example, localhost).
Then launch Novelty, passing the configuration file on the command line.
java -Dconfig.file=novelty-metrics.conf -jar novelty-0.15.0.jar
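Once Novelty has been running for at least one reporting period (30 seconds in the configuration above), you can check whether metrics are arriving by listing the measurements in db0. The sketch below uses the InfluxDB 1.x HTTP /query endpoint. The function name influx_measurements is hypothetical, and the default address assumes InfluxDB is reachable on localhost:8086.

```shell
# influx_measurements [HOST:PORT] [DATABASE]
# Lists measurement names through the InfluxDB 1.x HTTP /query endpoint.
influx_measurements() {
  influx_addr=${1:-localhost:8086}
  influx_db=${2:-db0}
  # -f makes curl exit non-zero on HTTP errors; -G sends the data as a GET query string.
  curl -fsS --max-time 5 -G "http://$influx_addr/query" \
    --data-urlencode "db=$influx_db" \
    --data-urlencode "q=SHOW MEASUREMENTS"
}
```

Calling `influx_measurements` with no arguments prints a JSON response. If it still contains no measurements after a reporting period has passed, the host, port, or database name in the reporter configuration is a likely place to look.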
Novelty Metrics¶
Novelty reports three classes of metrics: counters, timers, and gauges.
Tip
The GET /api/v2/admin/metrics API endpoint reports the same metrics as a metrics reporter.
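As a quick check that does not depend on InfluxDB, the endpoint can be queried directly with curl. This is a sketch under one assumption: that Novelty's web server is reachable at localhost:8080. The port is not stated in this guide, so adjust it to your deployment. The function name novelty_metrics is hypothetical.

```shell
# novelty_metrics [HOST:PORT]
# Fetches the current metrics snapshot from the Novelty admin API.
novelty_metrics() {
  novelty_addr=${1:-localhost:8080}   # assumed default address; not taken from this guide
  # -f makes curl exit non-zero on HTTP errors instead of printing the error body.
  curl -fsS --max-time 5 "http://$novelty_addr/api/v2/admin/metrics"
}
```

Running `novelty_metrics` prints the response body, which can be compared against what the configured reporter writes.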
Counters¶
Novelty uses counters to accumulate the number of times that events occur. Counters can return either a value or a histogram.
- node.edge-counts.*: Histogram-style summaries of edges per node
- node.property-counts.*: Histogram-style summaries of properties per node
- shard.*.sleep-counters: Count the lifecycle state of nodes managed by a shard
Timers¶
Novelty reports the elapsed time in milliseconds it takes to perform persistor operations.
- persistor.get-journal: Time taken to read and deserialize a single node’s relevant journal
- persistor.persist-event: Time taken to serialize and persist one message’s worth of on-node events
- persistor.get-latest-snapshot: Time taken to read (but not deserialize) a single node snapshot
Gauges¶
Gauges report the current value of a metric at the moment it is sampled.
- memory.heap.*: JVM heap usage
- memory.total: JVM combined memory usage
- shared.valve.ingest: Number of current requests to slow ingest so that another part of Novelty can catch up
- dgn-reg.count: Number of in-memory registered DomainGraphNodes
Monitoring Best Practices¶
Monitoring a streaming graph is similar to monitoring any other database, with a few additional key metrics to watch.
Novelty is backpressured, which means that the performance of the persistence subsystem affects the flow of events in the graph. Java garbage collection impacts backpressure. It is normal for Novelty ingest rates to fluctuate as Java manages the heap. Keep an eye on when heap consumption approaches the max memory configured for Java. Best performance is typically achieved when launching Novelty with a 12G (-Xmx12G -Xms12G) memory allocation pool.
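To make "approaching the max" concrete, the sketch below turns a used-heap reading and the configured maximum into a percentage and a simple threshold check. The inputs would come from the memory.heap.* gauges. The individual gauge names under that prefix are not listed in this guide, so the functions take raw byte counts. The function names and the 90% default threshold are illustrative assumptions, not Novelty recommendations.

```shell
# heap_pct USED_BYTES MAX_BYTES
# Prints the integer percentage of the heap currently in use.
heap_pct() {
  echo $(( $1 * 100 / $2 ))
}

# heap_near_limit USED_BYTES MAX_BYTES [THRESHOLD_PCT]
# Succeeds (exit 0) when heap usage is at or above the threshold (default 90).
heap_near_limit() {
  [ "$(heap_pct "$1" "$2")" -ge "${3:-90}" ]
}
```

With a 12G heap (12884901888 bytes), `heap_near_limit 12000000000 12884901888` succeeds because usage is about 93%. The same threshold can be expressed as a Grafana alert rule on the heap gauges.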
Conclusion¶
Monitoring the performance of a solution over time requires a DevOps tool like Grafana. This guide will get you up and running with the tools and knowledge needed to monitor Novelty effectively.
