Create merkle-tree based RDF canonicalization spec #64

Open
aaronc opened this issue Jul 20, 2019 · 0 comments

aaronc commented Jul 20, 2019

Context

Most commonly used data formats, such as JSON or protobuf, allow multiple binary representations of the same data. For instance, JSON and protobuf fields can occur in any order, and JSON allows any amount of white space between tokens.

Canonicalization is the process of producing a single binary representation for all semantically equivalent documents.
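To make the problem concrete, here is a minimal sketch in Python. The helper `canonical_json` is a hypothetical name used only for illustration, and the rule it applies (sorted keys, no insignificant white space) is an assumed toy rule, not a standardized JSON canonicalization scheme.

```python
import json

# Two byte strings that differ in field order and white space
# but describe the same data.
a = b'{"name": "tree", "height": 3}'
b = b'{ "height":3,\n  "name":"tree" }'

def canonical_json(raw: bytes) -> bytes:
    """Toy canonical form (assumption): sorted keys, compact separators."""
    parsed = json.loads(raw)
    return json.dumps(parsed, sort_keys=True, separators=(",", ":")).encode("utf-8")

assert a != b                                   # different bytes
assert json.loads(a) == json.loads(b)           # same data
assert canonical_json(a) == canonical_json(b)   # one canonical representation
```

Hashing `a` and `b` directly would give two different digests for the same data; hashing the canonical form gives one.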

URDNA2015 defines such a canonicalization algorithm for the RDF data model.

Proposal

My design is to do something both simpler and more powerful than URDNA2015 using this approach:

  • start with an acyclic RDF graph (or dataset) consisting only of blank nodes
  • replace blank nodes with IRIs by depth-first traversal, applying this algorithm to each node:
    • create an empty IAVL merkle tree (or set)
    • take the list of triples (or quads) with this node as its subject
    • map each triple onto a string which is the canonical string serialization of that triple's predicate and object concatenated together
    • sort this list
    • insert each item in the list as a key in the IAVL merkle tree (with an empty value)
    • the resulting universally unique IRI for this node is xrn:g/<hash>, where <hash> is the root hash of this merkle tree (open question: will the blake2b 256 hash algorithm work?)
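The steps above could be sketched roughly as follows. This is an illustration under several assumptions, not the proposed spec: a plain binary Merkle tree stands in for IAVL (so the hashes will not match a real IAVL root), predicate and object are joined with a single space in an N-Triples-like form, blank-node objects are hashed before the node that refers to them, and the `Graph` layout and all function names are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple

# A term is (kind, value) with kind in {"iri", "literal", "bnode"}.
Term = Tuple[str, str]
# Hypothetical graph layout: blank node label -> [(predicate IRI, object term)].
Graph = Dict[str, List[Tuple[str, Term]]]

def h(data: bytes) -> bytes:
    # BLAKE2b with a 256-bit digest, as floated in the proposal.
    return hashlib.blake2b(data, digest_size=32).digest()

def merkle_root(keys: List[bytes]) -> bytes:
    """Root of a plain binary Merkle tree (stand-in for an IAVL root hash)."""
    level = [h(b"\x00" + k) for k in keys] or [h(b"")]
    while len(level) > 1:
        nxt = [h(b"\x01" + level[i] + level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # odd node is promoted unchanged
            nxt.append(level[-1])
        level = nxt
    return level[0]

def term_string(graph: Graph, term: Term, cache: Dict[str, str]) -> str:
    kind, value = term
    if kind == "iri":
        return f"<{value}>"
    if kind == "literal":
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    # Blank-node object: recurse first (depth-first), then use its new IRI.
    return f"<{node_iri(graph, value, cache)}>"

def node_iri(graph: Graph, node: str, cache: Dict[str, str]) -> str:
    """Content-derived IRI for a blank node; requires an acyclic graph."""
    if node not in cache:
        keys = {f"<{p}> {term_string(graph, o, cache)}".encode("utf-8")
                for p, o in graph.get(node, [])}
        cache[node] = "xrn:g/" + merkle_root(sorted(keys)).hex()
    return cache[node]

# Same content, different blank node labels and triple order.
g1: Graph = {
    "a": [("http://schema.org/name", ("literal", "Alice")),
          ("http://schema.org/knows", ("bnode", "b"))],
    "b": [("http://schema.org/name", ("literal", "Bob"))],
}
g2: Graph = {
    "y": [("http://schema.org/name", ("literal", "Bob"))],
    "x": [("http://schema.org/knows", ("bnode", "y")),
          ("http://schema.org/name", ("literal", "Alice"))],
}
iri1 = node_iri(g1, "a", {})
iri2 = node_iri(g2, "x", {})
```

In this sketch `iri1` and `iri2` are equal because the IRI depends only on the sorted predicate-object strings, never on the original blank node labels. The recursion terminates only because the graph is assumed to be acyclic.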

This approach allows:

  • universally unique, content addressable nodes
  • graphs where a subset of the triples can be revealed and their membership in the graph can be proved via a merkle proof
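As a rough sketch of the selective-disclosure idea, the code below builds a Merkle tree over a node's sorted predicate-object strings, then proves that one of them is a member using only sibling hashes. It assumes a plain binary Merkle tree rather than IAVL (real IAVL proofs have a different structure), and the function names and example keys are hypothetical.

```python
import hashlib
from typing import List, Tuple

def h(data: bytes) -> bytes:
    return hashlib.blake2b(data, digest_size=32).digest()

def build_levels(keys: List[bytes]) -> List[List[bytes]]:
    """All tree levels, leaves first. `keys` must be non-empty."""
    levels = [[h(b"\x00" + k) for k in keys]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        nxt = [h(b"\x01" + cur[i] + cur[i + 1])
               for i in range(0, len(cur) - 1, 2)]
        if len(cur) % 2:            # odd node is promoted unchanged
            nxt.append(cur[-1])
        levels.append(nxt)
    return levels

def prove(levels: List[List[bytes]], index: int) -> List[Tuple[bool, bytes]]:
    """Sibling hashes from leaf to root, each tagged sibling-is-left."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):    # a promoted node has no sibling
            proof.append((sibling < index, level[sibling]))
        index //= 2
    return proof

def verify(key: bytes, proof: List[Tuple[bool, bytes]], root: bytes) -> bool:
    acc = h(b"\x00" + key)
    for sibling_is_left, sibling in proof:
        pair = sibling + acc if sibling_is_left else acc + sibling
        acc = h(b"\x01" + pair)
    return acc == root

# Example: the sorted predicate-object strings of one node (made-up data).
keys = sorted([
    b'<http://schema.org/name> "Alice"',
    b'<http://schema.org/email> "alice@example.org"',
    b'<http://schema.org/birthDate> "1990-01-01"',
])
levels = build_levels(keys)
root = levels[-1][0]

# Reveal only the name triple, together with its proof.
revealed = b'<http://schema.org/name> "Alice"'
proof = prove(levels, keys.index(revealed))
ok = verify(revealed, proof, root)
forged = verify(b'<http://schema.org/name> "Mallory"', proof, root)
```

A verifier who knows only the node's IRI (and so the root hash) can check `revealed` against `proof` without seeing the other two triples; the altered key fails the same check.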

Consequences

References

@aaronc aaronc added size: 1 and removed size: 1 labels Jul 24, 2019
@aaronc aaronc changed the title Create merkle-tree based RDF canonicalization algorithm spec Create merkle-tree based RDF canonicalization spec Jul 25, 2019
@aaronc aaronc added the backlog label Jul 26, 2019
@aaronc aaronc mentioned this issue Aug 17, 2020