Ethereum Tag Propagation
Full Recipe¶
Shared by: Ethan Bell
This recipe models data from the live Ethereum blockchain. Any account can be flagged as tainted, causing a tainted
tag to propagate through the graph and track the flow of transactions out of flagged and tainted accounts.
Scenario¶
Newly-mined Ethereum transaction metadata is imported via a Server-Sent Events data source. Transactions are grouped by the block in which they were mined then imported into the graph. Each wallet address is represented by a node, linked by an edge to each transaction sent or received by that account, and linked by an edge to any blocks mined by that account. Quick queries allow marking an account as "tainted". The tainted flag is propagated along outgoing transaction paths via Standing Queries to record the least degree of separation between a tainted source and an account receiving a transaction.
Note
The Ethereum diamond logo is property of the Ethereum Foundation, used under the terms of the Creative Commons Attribution 3.0 License.
Sample Data¶
Sample data is continuously sampled from the Ethereum blockchain and emitted as Server-Sent Events for use in this demo.
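For readers unfamiliar with the Server-Sent Events wire format, the following is a minimal sketch of how a single event could be split into its id, event name, and JSON payload. It is not how Quine consumes the stream; the function name `parse_sse_event` and the abbreviated event are hypothetical, and the sketch assumes every payload line arrives with a `data:` prefix.

```python
import json


def parse_sse_event(raw: str) -> dict:
    """Parse one Server-Sent Events record into id, event name, and JSON data."""
    fields = {"id": None, "event": "message", "data": []}
    for line in raw.splitlines():
        if not line or line.startswith(":"):
            continue  # blank lines end an event; ':' lines are comments
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if name == "data":
            fields["data"].append(value)  # multiple data lines are joined below
        elif name in ("id", "event"):
            fields[name] = value
    payload = "\n".join(fields["data"])
    return {
        "id": fields["id"],
        "event": fields["event"],
        "data": json.loads(payload) if payload else None,
    }


# Hypothetical, heavily abbreviated block_head event.
raw_event = 'id: 1_head\nevent: block_head\ndata: {"number": 1, "miner": "0xAB"}\n\n'
parsed = parse_sse_event(raw_event)
```

The parsed `data` object plays the role that `$that` plays inside the recipe's ingest queries.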
How it Works¶
The recipe installs two ingest queries, auto-named INGEST-1 and INGEST-2. The INGEST-1 query processes blocks, and INGEST-2 processes mined transactions. In both queries, idFrom is used to derive node IDs from unique identifiers present in the dataset. For accounts, the address is the identifier; for blocks, the block hash is the identifier; and so on. Ethereum data uses hexadecimal strings for identifiers, sometimes with a built-in capitalization checksum. This means the address 0x19975E29111a6c85E282eBe409C272c15492c6Ad is the same address as 0x19975e29111a6c85e282ebe409c272c15492c6ad, just written with different capitalization. To account for these variations, toLower converts the identifier to a consistent lower-case representation before an id is resolved.
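The effect of lower-casing before deriving an ID can be sketched outside of Quine. The example below uses Python's `uuid.uuid5` purely as a stand-in for a deterministic ID function; it is an assumption for illustration and does not reproduce the IDs that Quine's idFrom generates. The namespace string and the `node_id` helper are hypothetical.

```python
import uuid

# Hypothetical namespace; Quine's idFrom uses its own scheme, not this one.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "example:ethereum-recipe")


def node_id(kind: str, identifier: str) -> uuid.UUID:
    """Derive a stable ID from a node kind and a case-normalized hex identifier."""
    return uuid.uuid5(NAMESPACE, f"{kind}:{identifier.lower()}")


checksummed = "0x19975E29111a6c85E282eBe409C272c15492c6Ad"
lowercase = "0x19975e29111a6c85e282ebe409c272c15492c6ad"

# Both spellings of the address resolve to the same node.
same_node = node_id("account", checksummed) == node_id("account", lowercase)

# The kind prefix keeps accounts and blocks in separate ID spaces.
different_kind = node_id("block", lowercase) != node_id("account", lowercase)
```

Without the normalization step, the two spellings would hash to different IDs and the same wallet would appear as two unrelated nodes.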
INGEST-1¶
The INGEST-1 query processes streaming block_head events like:
id: 14566607_head
event: block_head
data: {
"number": 14566607,
"hash": "0xf3dafdda16a884f6ff2b1b0c0325eaadc70db022363e3af74ab5994f8cbc1f12",
"parentHash": "0xcd859249e97684f319173c284314307a11deaa2a708c8c5fcf377971e09abb01",
"sha3Uncles": "0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347",
"logsBloom": "0x0",
"transactionsRoot": "0xa77b91fc4ee74bc1df28019e898a4ba17dd87fcc41c633cab25b4909ee56a60a",
"stateRoot": "0xf7869b706a212bfa504520674c3ef3350b187d31ef207b155fa548a4e59169df",
"receiptsRoot": "0x6669147c87b5cc857801372bed55ab6ddf3474d935b2b4e3b1ee1b95f4dc357b",
"miner": "0x829BD824B016326A401d083B33D092293333A830",
"difficulty": "13384256520560135",
"extraData": "0xe4b883e5bda9e7a59ee4bb99e9b1bc4a1621",
"gasLimit": 30029295,
"gasUsed": 3117128,
"timestamp": 1649710008,
"baseFeePerGas": "0xfcf7d67a0",
"nonce": "0xc1a22f3db05412ca",
"mixHash": "0xfaafcc9e2be300ba795954bed57a38e415330e6131e48e58770e8e678a16e869"
}
The ingest query identifies the (BA), (minerAcc), (blk), and (parentBlk) nodes and loads them into the graph.
- format:
    query: |-
      MATCH (BA), (minerAcc), (blk), (parentBlk)
      WHERE
        id(blk) = idFrom('block', toLower($that.hash))
        AND id(parentBlk) = idFrom('block', toLower($that.parentHash))
        AND id(BA) = idFrom('block_assoc', toLower($that.hash))
        AND id(minerAcc) = idFrom('account', toLower($that.miner))
      CREATE
        (minerAcc)<-[:mined_by]-(blk)-[:header_for]->(BA),
        (blk)-[:preceded_by]->(parentBlk)
      SET
        BA:block_assoc,
        BA.number = $that.number,
        BA.hash = $that.hash,
        blk:block,
        blk = $that,
        minerAcc:account,
        minerAcc.address = $that.miner
    type: CypherJson
  url: https://ethereum.demo.thatdot.com/blocks_head
  type: ServerSentEventsIngest
{
"format": {
"query": "MATCH (BA), (minerAcc), (blk), (parentBlk)\nWHERE\n id(blk) = idFrom('block', toLower($that.hash))\n AND id(parentBlk) = idFrom('block', toLower($that.parentHash))\n AND id(BA) = idFrom('block_assoc', toLower($that.hash))\n AND id(minerAcc) = idFrom('account', toLower($that.miner))\nCREATE\n (minerAcc)<-[:mined_by]-(blk)-[:header_for]->(BA),\n (blk)-[:preceded_by]->(parentBlk)\nSET\n BA:block_assoc,\n BA.number = $that.number,\n BA.hash = $that.hash,\n blk:block,\n blk = $that,\n minerAcc:account,\n minerAcc.address = $that.miner",
"type": "CypherJson"
},
"url": "https://ethereum.demo.thatdot.com/blocks_head",
"type": "ServerSentEventsIngest"
}
INGEST-2¶
The INGEST-2 query receives tx_mined events like:
id: 14566637: 0
event: tx_mined
data: {
"blockHash": "0x0d7782556aef00f1391a05a18ab229a70720780fe3c92eaff74738dee59649d0",
"blockNumber": 14566637,
"from": "0x19975E29111a6c85E282eBe409C272c15492c6Ad",
"gas": 42105,
"gasPrice": "203940950410",
"hash": "0x470294af9453f2cd1ec084456328da5c613585974e838fa088cef27246b2481e",
"input": "0x",
"nonce": 1,
"r": "0x8b52f40f28db1627e82fea7352f6d2ba1133dcac081b6939bd03ff397370586d",
"s": "0x8e7b2c69b1684873156090f238d42aad2c14315a08551a57dc5ed1aa45f0a76",
"to": "0x732Ec041e4Dc8c01B541B237dE5Ce794c51cF838",
"transactionIndex": 0,
"type": "0x0",
"v": "0x26",
"value": "168930787638413525"
}
The ingest query identifies the (BA), (toAcc), (fromAcc), and (tx) nodes and loads them into the graph.
- format:
    query: |-
      WITH true AS validTransactionRecord WHERE $that.to IS NOT NULL AND $that.from IS NOT NULL
      MATCH (BA), (toAcc), (fromAcc), (tx)
      WHERE
        id(BA) = idFrom('block_assoc', toLower($that.blockHash))
        AND id(toAcc) = idFrom('account', toLower($that.to))
        AND id(fromAcc) = idFrom('account', toLower($that.from))
        AND id(tx) = idFrom('transaction', toLower($that.hash))
      CREATE
        (tx)-[:defined_in]->(BA),
        (tx)-[:from]->(fromAcc),
        (tx)-[:to]->(toAcc)
      SET
        tx:transaction,
        BA:block_assoc,
        toAcc:account,
        fromAcc:account,
        tx = $that,
        fromAcc.address = $that.from,
        toAcc.address = $that.to
    type: CypherJson
  url: https://ethereum.demo.thatdot.com/mined_transactions
  type: ServerSentEventsIngest
{
"format": {
"query": "WITH true AS validTransactionRecord WHERE $that.to IS NOT NULL AND $that.from IS NOT NULL\nMATCH (BA), (toAcc), (fromAcc), (tx)\nWHERE\n id(BA) = idFrom('block_assoc', toLower($that.blockHash))\n AND id(toAcc) = idFrom('account', toLower($that.to))\n AND id(fromAcc) = idFrom('account', toLower($that.from))\n AND id(tx) = idFrom('transaction', toLower($that.hash))\nCREATE\n (tx)-[:defined_in]->(BA),\n (tx)-[:from]->(fromAcc),\n (tx)-[:to]->(toAcc)\nSET\n tx:transaction,\n BA:block_assoc,\n toAcc:account,\n fromAcc:account,\n tx = $that,\n fromAcc.address = $that.from,\n toAcc.address = $that.to",
"type": "CypherJson"
},
"url": "https://ethereum.demo.thatdot.com/mined_transactions",
"type": "ServerSentEventsIngest"
}
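The first line of the INGEST-2 query discards any record whose to or from field is null before nodes are matched; contract-creation transactions, for example, carry a null to field. The sketch below mirrors that guard and the three edges the query creates, using plain Python tuples in place of graph edges. The helper names and the shortened sample records are hypothetical.

```python
from typing import List, Optional, Tuple

Edge = Tuple[str, str, str]  # (source key, edge label, target key)


def is_valid_transaction(record: dict) -> bool:
    """Mirror the query's guard: both endpoints must be present."""
    return record.get("to") is not None and record.get("from") is not None


def edges_for(record: dict) -> Optional[List[Edge]]:
    """Return the edges INGEST-2 would create, or None if the record is skipped."""
    if not is_valid_transaction(record):
        return None
    tx = record["hash"].lower()
    return [
        (tx, "defined_in", record["blockHash"].lower()),
        (tx, "from", record["from"].lower()),
        (tx, "to", record["to"].lower()),
    ]


# Hypothetical, shortened records for illustration only.
transfer = {"hash": "0xT1", "blockHash": "0xB1", "from": "0xAA", "to": "0xBB"}
contract_creation = {"hash": "0xT2", "blockHash": "0xB1", "from": "0xAA", "to": None}

transfer_edges = edges_for(transfer)
skipped = edges_for(contract_creation)
```

Here `transfer_edges` holds three edges keyed by lower-cased identifiers, and `skipped` is `None` because the record has no recipient.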
Running the Recipe¶
❯ java -jar quine-1.8.2.jar -r ethereum.yaml
Graph is ready
Running Recipe: Ethereum Tag Propagation
Using 6 node appearances
Using 7 quick queries
Using 2 sample queries
Running Standing Query STANDING-1
Running Ingest Stream INGEST-1
Running Ingest Stream INGEST-2
Quine web server available at http://localhost:8080
Observe that Quine is running in the terminal window and that the ingest queries are receiving data.
| => STANDING-1 count 0
| => INGEST-1 status is running and ingested 485
| => INGEST-2 status is running and ingested 34820
Reviewing chains¶
The nodes appearing in your graph are from the live Ethereum blockchain. They will continue to stream in as long as Quine is running the recipe.
Start exploring the graph by pulling a few recent blocks from the blockchain with the Recently Accessed Blocks sample query. Select the sample query in the query bar, then click the Query button. The query returns a sub-graph of recent blocks, each linked to the block that preceded it.
Take a moment to inspect a couple of the blocks to see the data stored as properties.
Click back into the query bar, clear the query, then submit the Sent and Received ETH sample query to see accounts that have sent and received transactions.
This query finds a series of Wei transactions chained from account to account. Arrange the graph so that you can see all of the nodes. Right-click on the node at the head of the chain and select "Outgoing Transactions" to create a synthetic edge between the accounts. Create a second synthetic edge between the second and third accounts.
Tip
Hold shift while moving a node to lock its position in place.
Taint a Node¶
Right-click on the origin node again and select "Mark as Tainted." This adds a tainted property to the node and sets it to 0. A node with tainted=0 is a source of taint in the graph.
Notice that updates begin to appear in the terminal window where you launched Quine. The recipe's Standing Query produces these notices; let's look at it now.
A Standing Query is composed of two parts: a pattern query that detects a sub-graph shape, and an output query that acts on the matched sub-graph.
Pattern¶
The pattern query continuously evaluates the stream of data, looking for a match. When a match is found, it triggers the output query to process the event.
Our standing query looks for tainted nodes by checking for the existence of a tainted property.
MATCH
(tainted:account)<-[:from]-(tx:transaction)-[:to]->(otherAccount:account),
(tx)-[:defined_in]->(ba:block_assoc)
WHERE
tainted.tainted IS NOT NULL
RETURN
id(tainted) AS accountId,
tainted.tainted AS oldTaintedLevel,
id(otherAccount) AS otherAccountId
The results of the match pattern are sent to the output query.
The output query acts on the match to propagate the tainted tag. The value of tainted on each account is the length, in hops, of the shortest path found so far from any tainted source.
Output¶
MATCH (tainted), (otherAccount)
WHERE
tainted <> otherAccount
AND id(tainted) = $that.data.accountId
AND id(otherAccount) = $that.data.otherAccountId
WITH *, coll.min([($that.data.oldTaintedLevel + 1), otherAccount.tainted]) AS newTaintedLevel
SET otherAccount.tainted = newTaintedLevel
RETURN
strId(tainted) AS taintedSource,
strId(otherAccount) AS newlyTainted,
newTaintedLevel
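The minimum taken in the output query is what keeps each account's level equal to its shortest distance from a tainted source. The following is a minimal in-memory sketch of that rule, not Quine's implementation: it assumes a missing tainted value places no constraint on the new level, and it replaces the streaming standing query with a simple worklist loop. The function names and the four-account transfer list are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple


def new_taint_level(old_source_level: int, current_level: Optional[int]) -> int:
    """Take the smaller of (source level + 1) and the receiver's existing level."""
    candidate = old_source_level + 1
    return candidate if current_level is None else min(candidate, current_level)


def propagate(transfers: List[Tuple[str, str]], source: str) -> Dict[str, int]:
    """Spread taint along (sender, receiver) transfers starting from one source."""
    tainted: Dict[str, int] = {source: 0}
    queue = deque([source])
    while queue:
        sender = queue.popleft()
        for frm, to in transfers:
            if frm != sender or to == sender:
                continue  # only outgoing transfers; skip self-transfers
            level = new_taint_level(tainted[sender], tainted.get(to))
            if tainted.get(to) != level:
                tainted[to] = level
                queue.append(to)  # the receiver may now taint its own receivers
    return tainted


# Hypothetical transfers: "c" is reachable both directly from "a" and through "b".
transfers = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("b", "a")]
levels = propagate(transfers, "a")
```

Account "c" ends at level 1 rather than 2 because the direct transfer from "a" is shorter than the route through "b", and the source stays at 0 even though "b" sends funds back to it.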
A standing query can send notifications using the andThen clause in the API.
"andThen": {
"logLevel": "Info",
"logMode": "Complete",
"type": "PrintToStandardOut"
}
In our case, the results from the match are printed to standard out. These are the messages that you now see in your terminal window.
2022-04-13 11:05:14,877 Standing query `propagate-tainted` match: {"meta":{"isPositiveMatch":true,"resultId":"e3aa2a7c-b246-4896-b8b7-d4fea9904c91"},"data":{"taintedSource":"ed9899b5-e8a8-3a0b-9785-824f2cb1781b","newlyTainted":"981c7ef9-319a-35ba-90dd-401faf5de6a6","newTaintedLevel":3}}
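If you want to post-process these messages, the JSON payload can be separated from the log prefix. The sketch below assumes every line has the same shape as the sample above, with the payload following the text " match: "; the helper name `parse_match` is hypothetical.

```python
import json

# The sample log line from above, split across string literals for readability.
log_line = (
    "2022-04-13 11:05:14,877 Standing query `propagate-tainted` match: "
    '{"meta":{"isPositiveMatch":true,"resultId":"e3aa2a7c-b246-4896-b8b7-d4fea9904c91"},'
    '"data":{"taintedSource":"ed9899b5-e8a8-3a0b-9785-824f2cb1781b",'
    '"newlyTainted":"981c7ef9-319a-35ba-90dd-401faf5de6a6","newTaintedLevel":3}}'
)


def parse_match(line: str) -> dict:
    """Split off the log prefix and decode the JSON result that follows it."""
    _, _, payload = line.partition(" match: ")
    return json.loads(payload)


match = parse_match(log_line)
level = match["data"]["newTaintedLevel"]
newly_tainted = match["data"]["newlyTainted"]
```

The data object carries the three values returned by the output query: taintedSource, newlyTainted, and newTaintedLevel.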
Tainted Tag Propagation¶
Clear your explorer window using the '<<' button, then run the "Tainted Accounts" query. This query will find the original account or accounts responsible for the taint in the graph.
Right-click on a tainted account (appears fuchsia) and select "Outgoing Tainted Transactions" to find the accounts that this account tainted. Hover over the account to see the tainted=1 property that indicates that this account is one hop away from the source of the taint.
Continue to taint and explore the graph as more of the nodes become tainted.
At any time, you can issue the following query to report the number of tainted nodes in the graph.
MATCH (n)
WHERE n.tainted IS NOT NULL
RETURN DISTINCT n.tainted, count(n)
ORDER BY n.tainted