Apache Kafka

Reading Records from Kafka

Quine has full support for reading records from Apache Kafka topics. How Quine interprets records into graph data is configurable through the REST API and through Recipes.

In addition to the API-mapped Kafka options, arbitrary Kafka configuration can be provided via the kafkaProperties field.

To avoid confusion and mitigate certain vulnerabilities in the Kafka client libraries, a pre-ingest validation step rejects some configuration keys and values. For example, the bootstrap.servers property encodes the same information as the bootstrapServers field of the ingest, so bootstrap.servers is disallowed when configuring a Kafka ingest. Similarly, certain values of the sasl.jaas.config property are known to introduce vulnerabilities, so those values are forbidden.
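To illustrate the kind of check described above, here is a minimal Python sketch of a check you could run on your own kafkaProperties map before creating an ingest. It is not Quine's validation code: the function name and the DISALLOWED_KEYS set are assumptions for illustration, and only bootstrap.servers is taken from the text above. The sasl.jaas.config restriction applies to particular values that this page does not enumerate, so the sketch does not try to check them.

```python
# Hypothetical client-side check; Quine performs its own validation.
# Only "bootstrap.servers" comes from the documentation above.
DISALLOWED_KEYS = {"bootstrap.servers"}


def find_disallowed_keys(kafka_properties):
    """Return the sorted keys of kafka_properties that are known to be rejected."""
    return sorted(key for key in kafka_properties if key in DISALLOWED_KEYS)


if __name__ == "__main__":
    props = {
        "bootstrap.servers": "localhost:9092",  # duplicates the bootstrapServers field
        "max.poll.records": "500",              # ordinary Kafka consumer setting
    }
    for key in find_disallowed_keys(props):
        print(f"remove {key!r} from kafkaProperties before creating the ingest")
```

A check like this only catches the obvious case early; the pre-ingest validation step remains the authority on what is accepted.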

Because of the wide variety of possible configuration combinations, we cannot provide a comprehensive guide to configuring Kafka. However, we recommend using securityProtocol: "SSL" wherever possible to encrypt requests between Quine and the Kafka broker.
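As a rough illustration of how these fields fit together, the Python sketch below builds a Kafka ingest definition as a dictionary and prints it as JSON. The field names type, topics, bootstrapServers, format, kafkaProperties, and securityProtocol appear on this page. The broker address and the max.poll.records setting are placeholders, the function name is hypothetical, and treating kafkaProperties values as strings is an assumption. The Cypher query is the one used in the Recipe later on this page.

```python
import json


def kafka_ingest_definition(topics, bootstrap_servers, query, kafka_properties=None):
    """Build a Kafka ingest definition using the field names shown on this page."""
    definition = {
        "type": "KafkaIngest",
        "topics": list(topics),
        "bootstrapServers": bootstrap_servers,
        "securityProtocol": "SSL",  # recommended above to encrypt traffic to the broker
        "format": {"type": "CypherJson", "query": query},
    }
    if kafka_properties:
        # Extra Kafka client settings; string values are an assumption here.
        definition["kafkaProperties"] = dict(kafka_properties)
    return definition


if __name__ == "__main__":
    body = kafka_ingest_definition(
        topics=["test-topic"],
        bootstrap_servers="broker.example.com:9093",  # placeholder address
        query="MATCH (n) WHERE id(n) = idFrom($that) SET n = $that",
        kafka_properties={"max.poll.records": "500"},  # illustrative setting
    )
    print(json.dumps(body, indent=2))
```

Note that the broker address goes in bootstrapServers only; repeating it as bootstrap.servers inside kafkaProperties would be rejected, as described above.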

Example

In this example we will ingest messages from a Kafka topic and store them as nodes in the graph.

Preparation

For this example we will run Kafka locally. Because Kafka depends on ZooKeeper, we will start that too. Download Kafka and extract the files to your local filesystem. Then start ZooKeeper and Kafka in separate terminal sessions by running the following commands, in that order, from the directory where you extracted Kafka.

bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties

With Kafka up and running, messages can be sent manually to the topic test-topic using the following command:

bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic test-topic
>{"Message": "Hello, world."}
>^D

While kafka-console-producer.sh is running, each line of text you enter is sent as one message. To end the program and stop sending messages, press Control-D.
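Typing records by hand gets tedious for more than a few messages. Because kafka-console-producer.sh reads one message per line from standard input, a small script can generate the lines instead. The sketch below is one way to do that in Python. The file name make_records.py, the function name, and the numbered message text are made up for illustration; only the Message field comes from the example above.

```python
import json


def json_lines(count):
    """Return `count` single-line JSON records, one per Kafka message."""
    return [json.dumps({"Message": f"Hello, world {i}."}) for i in range(count)]


if __name__ == "__main__":
    # Each printed line becomes one message when piped into the console producer.
    for line in json_lines(5):
        print(line)
```

Assuming the script is saved as make_records.py, its output can be piped into the producer by running python3 make_records.py | bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic test-topic from the Kafka directory. The producer exits on its own when the input ends.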

Using a Recipe

The following is a simple Recipe that ingests each message from the Kafka topic as a node in the graph:

version: 1
title: Kafka Ingest
contributor: https://github.com/landon9720
summary: Ingest Kafka topic messages as graph nodes
description: Ingests each message in the Kafka topic "test-topic" as a graph node
ingestStreams:
  - type: KafkaIngest
    topics:
      - test-topic
    bootstrapServers: localhost:9092
    format:
      type: CypherJson
      query: |-
        MATCH (n)
        WHERE id(n) = idFrom($that)
        SET n = $that
standingQueries: [ ]
nodeAppearances: [ ]
quickQueries: [ ]
sampleQueries: [ ]

To run this Recipe, save it as kafka-ingest.yaml and run Quine as follows:

❯ java -jar {{ quine }} -r kafka-ingest.yaml
Graph is ready
Running Recipe Kafka Ingest
Running Ingest Stream INGEST-1
Quine app web server available at http://localhost:8080

 | => INGEST-1 status is running and ingested 0

Quine has loaded the Recipe and begun execution. Use kafka-console-producer.sh, as shown in the Preparation section, to send a JSON record to the topic. Quine should immediately report that it has ingested the record.

| => INGEST-1 status is running and ingested 1

Results should already be available in the web UI at http://<hostname>:8080.