
08 JanusGraph Schema for Bitcoin

Bitcoin Graph Relationship Table

Note: this model uses Output → Transaction → Output instead of the traditional Input → Transaction → Output.

This means removing explicit Input vertices and instead modeling value flow directly through outputs and transactions. In this model:

  • An Output is spent in a Transaction (i.e., it's an input to that transaction).
  • That Transaction generates new Outputs.
  • So the value flow is: Output (spent) → Transaction (that spends it) → Output (created)

This keeps the graph minimal and UTXO-centric, which suits path analysis and visual tracing.

Refined Bitcoin Graph Relationship Table (No Input Node)

| From → To | Relationship | Explanation |
| --- | --- | --- |
| Output → Transaction | spent_in | The output (used as an input) is spent in a transaction. |
| Transaction → Output | lock_to | The transaction locks value to newly created outputs. |
| Output → Address | pay_to | The output pays to an address; value is locked to a public key script. |
| Block → Coinbase Tx | coinbase | The block includes the coinbase transaction, which generates mining rewards. |
| Transaction → Block | belongs_to | The transaction is included in a block. |
| Block → Previous Block | chain_to | The block chains to its previous block via prev_block_hash. |
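The relationships in the table can be tried out on a toy graph. The following is a minimal sketch for the Gremlin Console, assuming an in-memory TinkerGraph instead of JanusGraph; the txid, addresses, and values are invented for illustration.

```groovy
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

// Toy data only: the txid, addresses and values are made up.
graph = TinkerGraph.open()
g = graph.traversal()

// Vertices: two addresses, one transaction, the output it spends
// and the output it creates.
addrA = g.addV('address').property('address', 'A').next()
addrB = g.addV('address').property('address', 'B').next()
outA  = g.addV('output').property('output_index', 0).property('value', 50000L).next()
tx1   = g.addV('transaction').property('txid', 'tx1').next()
outB  = g.addV('output').property('output_index', 0).property('value', 49000L).next()

// Edges follow the directions given in the relationship table.
g.addE('pay_to').from(outA).to(addrA).iterate()   // Output A pays to Address A
g.addE('spent_in').from(outA).to(tx1).iterate()   // Output A is spent in TX1
g.addE('lock_to').from(tx1).to(outB).iterate()    // TX1 locks value to Output B
g.addE('pay_to').from(outB).to(addrB).iterate()   // Output B pays to Address B

// Value flow: Output (spent) -> Transaction -> Output (created)
created = g.V(outA).out('spent_in').out('lock_to').values('value').toList()
```

Here `created` holds the values of the outputs that TX1 created from Output A.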

Evaluation of This Model:

✅ Strengths

  • Simplified graph: fewer vertex types (no Input nodes), so it is easier to query and visualize.
  • Direct value flow: you can trace a satoshi's path from output → tx → output, and so on.
  • Efficient for UTXO tracing, coin flow, and taint analysis.

⚠️ Tradeoffs

  • Loss of input metadata: without an Input vertex, these per-input fields have no natural home:

    • input scriptSig / witness
    • input index (the position of the input within the spending transaction)
    • sequence
  • Input order cannot be recovered unless it is stored as an edge property (spent_in with input_index).

🔧 Recommendations

If you choose this model:

| Design Element | Recommendation |
| --- | --- |
| spent_in edge | Add properties like input_index, sequence, scriptSig, etc. |
| lock_to edge | Add output_index, value, scriptPubKey, etc. as properties. |
| Avoid Input node | Simplifies the model. Just make sure the input metadata is preserved on edges. |
| Graph traversal queries | Will be simpler for “value flow” and “who paid whom” chains. |
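As an illustration of the spent_in recommendation, the sketch below stores input_index and sequence on the edge and then reads a transaction's inputs back in order. It assumes an in-memory TinkerGraph in the Gremlin Console, and all data is hypothetical.

```groovy
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

graph = TinkerGraph.open()
g = graph.traversal()

// Hypothetical transaction that spends two outputs.
tx   = g.addV('transaction').property('txid', 'tx1').next()
out0 = g.addV('output').property('value', 30000L).next()
out1 = g.addV('output').property('value', 20000L).next()

// Input metadata lives on the spent_in edge instead of an Input vertex.
// The edges are deliberately inserted out of order.
g.addE('spent_in').from(out1).to(tx).
  property('input_index', 1).property('sequence', 4294967295L).iterate()
g.addE('spent_in').from(out0).to(tx).
  property('input_index', 0).property('sequence', 4294967295L).iterate()

// Recover the inputs in their original order via the edge property.
orderedValues = g.V(tx).inE('spent_in').order().by('input_index').
  outV().values('value').toList()
```

Sorting on the edge property is what restores input order; without input_index the two spent_in edges would be indistinguishable.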

Traversal Example

Value Flow Pattern in This Model

(Output_A) ──[spent_in]──▶ (TX1) ──[lock_to]──▶ (Output_B)

Which translates as:

Output A was spent in TX1, which created Output B.

Here’s a Gremlin traversal example that traces the path from Address A to Address B in this graph model, where the spending relationship is:

Address → Output → Transaction → Output → Address

This represents the flow:

(Address A)
↑ pay_to
(Output A)
─[spent_in]→ (Transaction)
─[lock_to]→ (Output B)
↓ pay_to
(Address B)

✅ Gremlin Traversal: Address A → ... → Address B

g.V().has('address', 'A').   // Start at Address A
  in('pay_to').              // Outputs that pay to A (pay_to points Output → Address)
  out('spent_in').           // Transactions that spend those Outputs
  out('lock_to').            // Outputs created by those Transactions
  out('pay_to').             // Addresses those Outputs pay to
  has('address', 'B').       // Keep only Address B
  path()                     // Return the full path, if one exists

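Optional Enhancements: To trace multi-hop paths (A → ... → B over many txs):

g.V().has('address', 'A').
  repeat(
    __.in('pay_to').
       out('spent_in').
       out('lock_to').
       out('pay_to')
  ).
  until(__.has('address', 'B')).
  path()

  • repeat(...).until(...) applies the one-hop pattern repeatedly until it reaches Address B, which gives a transitive-closure style search over chains of transactions.
  • This will trace from A to B even across many intermediate addresses and transactions.
  • The loop is unbounded as written. On real data, cap the depth (for example with times() or a loops() condition), because reused addresses create cycles and the fan-out grows quickly.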
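A depth-bounded variant of the multi-hop traversal can be sketched as follows. It assumes an in-memory TinkerGraph in the Gremlin Console with invented data; `maxHops` is a hypothetical parameter, and the toy graph contains a change output paying back to A so that there is a cycle to guard against.

```groovy
import org.apache.tinkerpop.gremlin.process.traversal.P
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

graph = TinkerGraph.open()
g = graph.traversal()

// Toy chain A -> M -> B, plus a change output from t2 back to A (a cycle).
a = g.addV('address').property('address', 'A').next()
m = g.addV('address').property('address', 'M').next()
b = g.addV('address').property('address', 'B').next()
o1 = g.addV('output').next(); o2 = g.addV('output').next()
o3 = g.addV('output').next(); o4 = g.addV('output').next()
t1 = g.addV('transaction').property('txid', 't1').next()
t2 = g.addV('transaction').property('txid', 't2').next()
g.addE('pay_to').from(o1).to(a).iterate()
g.addE('spent_in').from(o1).to(t1).iterate()
g.addE('lock_to').from(t1).to(o2).iterate()
g.addE('pay_to').from(o2).to(m).iterate()
g.addE('spent_in').from(o2).to(t2).iterate()
g.addE('lock_to').from(t2).to(o3).iterate()
g.addE('pay_to').from(o3).to(b).iterate()
g.addE('lock_to').from(t2).to(o4).iterate()
g.addE('pay_to').from(o4).to(a).iterate()

maxHops = 5  // hypothetical cap on address-to-address hops

// Stop a traverser when it reaches B or when it has used up maxHops,
// then keep only the traversers that actually ended at B.
paths = g.V().has('address', 'A').
  repeat(__.in('pay_to').out('spent_in').out('lock_to').out('pay_to')).
    until(__.or(__.has('address', 'B'), __.loops().is(P.gte(maxHops)))).
  has('address', 'B').
  path().toList()

// Each hop adds four path elements: output, transaction, output, address.
hopCounts = paths.collect { (it.size() - 1).intdiv(4) }.sort()
```

The cap trades completeness for termination: paths longer than `maxHops` are not reported, so the value should reflect how deep the analysis needs to go.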

🧠 Tips

  • Be sure that your edges have correct directionality:

    • pay_to: from Output β†’ Address
    • lock_to: from Transaction β†’ Output
    • spent_in: from Output β†’ Transaction
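  • To include edge metadata like value, output_index, etc., traverse with outE()/inV() (or inE()/outV()) instead of out()/in() so that the edges appear in the path, then use .path().by(valueMap()) or project().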
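The last tip can be sketched like this: a path only contains edges when the traversal steps onto them explicitly. The example assumes an in-memory TinkerGraph in the Gremlin Console and uses made-up data.

```groovy
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

graph = TinkerGraph.open()
g = graph.traversal()

// Hypothetical output spent as input 0 of transaction tx1.
utxo = g.addV('output').property('value', 50000L).next()
tx   = g.addV('transaction').property('txid', 'tx1').next()
g.addE('spent_in').from(utxo).to(tx).property('input_index', 0).iterate()

// outE()/inV() instead of out(): the spent_in edge becomes a path element,
// so by(valueMap()) can expose its properties next to the vertices'.
p = g.V(utxo).outE('spent_in').inV().path().by(__.valueMap()).next()
```

The resulting path has three entries (output, edge, transaction); the middle one carries the edge's input_index.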

JanusGraph Schema

Here's a JanusGraph schema that matches the Refined Bitcoin Graph Relationship Table (No Input Node). It captures the flow from Output → Transaction → Output and models key Bitcoin concepts like UTXOs, blocks, addresses, and coinbase transactions.

Vertex Labels

mgmt = graph.openManagement()

// Vertex labels
mgmt.makeVertexLabel('address').make()
mgmt.makeVertexLabel('output').make()
mgmt.makeVertexLabel('transaction').make()
mgmt.makeVertexLabel('block').make()

Edge Labels

// Edge labels (directional, multiple allowed)
mgmt.makeEdgeLabel('pay_to').multiplicity(MULTI).make()       // output → address
mgmt.makeEdgeLabel('spent_in').multiplicity(MULTI).make()     // output → transaction
mgmt.makeEdgeLabel('lock_to').multiplicity(MULTI).make()      // transaction → output
mgmt.makeEdgeLabel('coinbase').multiplicity(MULTI).make()     // block → coinbase transaction
mgmt.makeEdgeLabel('belongs_to').multiplicity(MULTI).make()   // transaction → block
mgmt.makeEdgeLabel('chain_to').multiplicity(ONE2ONE).make()   // block → previous block

Properties for Vertices

// Property keys
addressKey = mgmt.makePropertyKey('address').dataType(String.class).make()
txidKey = mgmt.makePropertyKey('txid').dataType(String.class).make()
outputIndex = mgmt.makePropertyKey('output_index').dataType(Integer.class).make()
// Amount in satoshis; an integer type avoids the rounding errors of Double
value = mgmt.makePropertyKey('value').dataType(Long.class).make()
scriptPubKey = mgmt.makePropertyKey('script_pub_key').dataType(String.class).make()
blockHash = mgmt.makePropertyKey('block_hash').dataType(String.class).make()
blockHeight = mgmt.makePropertyKey('block_height').dataType(Long.class).make()
timestamp = mgmt.makePropertyKey('timestamp').dataType(Long.class).make()

// Bind keys to vertex labels (optional; only enforced when schema constraints are enabled)
mgmt.addProperties(mgmt.getVertexLabel("address"), addressKey)
mgmt.addProperties(mgmt.getVertexLabel("transaction"), txidKey, timestamp)
mgmt.addProperties(mgmt.getVertexLabel("output"), value, outputIndex, scriptPubKey)
mgmt.addProperties(mgmt.getVertexLabel("block"), blockHash, blockHeight, timestamp)

Indexes (for efficient traversal)

// Composite graph indexes: global lookups of start vertices by exact key match
mgmt.buildIndex('byAddress', Vertex.class).addKey(addressKey).buildCompositeIndex()
mgmt.buildIndex('byTxid', Vertex.class).addKey(txidKey).buildCompositeIndex()
mgmt.buildIndex('byBlockHash', Vertex.class).addKey(blockHash).buildCompositeIndex()

// Persist the schema changes made through this management transaction
mgmt.commit()
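buildCompositeIndex creates graph indexes, which speed up finding start vertices. A vertex-centric index is a different thing: it is built per edge label with buildEdgeIndex and speeds up walking the edges of a single vertex. The sketch below shows one for spent_in sorted by input_index. It assumes a throwaway in-memory JanusGraph instance so that it does not collide with the schema above, and a JanusGraph release recent enough to use Order.asc (older releases used Order.incr); the index name is hypothetical.

```groovy
import org.apache.tinkerpop.gremlin.process.traversal.Order
import org.apache.tinkerpop.gremlin.structure.Direction
import org.janusgraph.core.JanusGraphFactory
import org.janusgraph.core.Multiplicity

// Assumption: a separate in-memory graph used only for this sketch.
graph = JanusGraphFactory.open('inmemory')
mgmt = graph.openManagement()

spentIn    = mgmt.makeEdgeLabel('spent_in').multiplicity(Multiplicity.MULTI).make()
inputIndex = mgmt.makePropertyKey('input_index').dataType(Integer.class).make()

// Vertex-centric index: on each transaction vertex, keep the incoming
// spent_in edges sorted by input_index.
mgmt.buildEdgeIndex(spentIn, 'spentInByInputIndex', Direction.IN, Order.asc, inputIndex)

mgmt.commit()
graph.close()
```

With such an index, a query like `g.V(tx).inE('spent_in').order().by('input_index')` can be answered from the sorted edge list of the transaction vertex, which matters mostly for transactions with many inputs.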

🧠 Notes

| Entity | Properties You May Add |
| --- | --- |
| output | output_index, value, scriptPubKey, is_spent |
| transaction | txid, timestamp, fee, size, version |
| block | block_hash, block_height, timestamp, nonce |
| address | address (String), maybe type (P2PKH, P2WPKH, etc.) |

✅ Sample Relationship Summary

| From → To | Label |
| --- | --- |
| output → address | pay_to |
| output → transaction | spent_in |
| transaction → output | lock_to |
| block → transaction | coinbase |
| transaction → block | belongs_to |
| block → previous block | chain_to |