Skolemising Blank Nodes while
Preserving Isomorphism
Aidan Hogan – DCC, Universidad de Chile
WHY? BLANK NODES ARE GREAT!
When life gives you blank nodes …
Blank Nodes are glue!
Blank node names aren’t important …
(Isomorphic)
Blank nodes are common in real-world data …
Aidan Hogan, Marcelo Arenas, Alejandro Mallea and Axel Polleres
"Everything You Always Wanted to Know About Blank Nodes".
Journal of Web Semantics 27: pp. 42–69, 2014
BLANK NODES ENABLE SYNTAX SHORTCUTS
They represent implicit nodes in the graph
They help specify order, higher-arity relations, reification, etc., succinctly
They are common in real-world data
BLANK NODES:
WHAT’S THE PROBLEM?
Are two RDF graphs isomorphic?
RDF ISOMORPHISM IS GI-COMPLETE
A general algorithm to see if two RDF graphs are the “same” will
(probably) not be tractable
BLANK NODES ADD COMPLEXITY?
WHAT TO DO?
RDF 1.1 proposes Skolemisation
But fresh IRIs every time is not ideal
Would prefer a “consistent” labelling
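A minimal sketch of why fresh IRIs are not ideal, assuming triples are plain Python tuples and blank nodes are strings starting with `_:`. The helper `skolemise_fresh` and the `example.org` base are hypothetical; only the `/.well-known/genid/` path follows the RDF 1.1 recommendation for Skolem IRIs.

```python
import uuid

def skolemise_fresh(triples, base="http://example.org/.well-known/genid/"):
    """Replace every blank node with a freshly minted IRI (hypothetical helper)."""
    mapping = {}

    def term(t):
        if t.startswith("_:"):
            if t not in mapping:
                # A random UUID guarantees freshness, not consistency.
                mapping[t] = base + uuid.uuid4().hex
            return mapping[t]
        return t

    return {(term(s), p, term(o)) for s, p, o in triples}

# Two isomorphic graphs: same structure, different blank node names.
g1 = {("_:a", "ex:knows", "_:b"), ("_:b", "ex:name", '"Bob"')}
g2 = {("_:x", "ex:knows", "_:y"), ("_:y", "ex:name", '"Bob"')}

s1 = skolemise_fresh(g1)
s2 = skolemise_fresh(g2)
print(s1 == s2)  # the Skolemised graphs no longer match each other
```

Even Skolemising the same graph twice yields different IRIs, so the output cannot be compared or deduplicated by simple set equality.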
Compute isomorphically-unique graph hash
Finding duplicate documents from a crawler
CANONICAL LABELLING USEFUL FOR:
1. Mapping blank nodes to IRIs
2. Computing unique hashes for RDF graphs
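The second use can be sketched as follows, assuming the blank nodes in each graph already carry canonical labels (how to compute them is the subject of the rest of the talk). The function names and the choice of SHA-256 are illustrative assumptions, not part of the original slides.

```python
import hashlib

def graph_hash(triples):
    """Hash a graph whose blank nodes already carry canonical labels."""
    # Sorting makes the hash independent of the order triples were read in.
    lines = sorted(f"{s} {p} {o} ." for s, p, o in triples)
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()

def find_duplicates(docs):
    """Group document names by graph hash; docs maps name -> iterable of triples."""
    seen = {}
    for name, triples in docs.items():
        seen.setdefault(graph_hash(triples), []).append(name)
    return [names for names in seen.values() if len(names) > 1]

crawl = {
    "a.ttl": [("_:c1", "ex:p", "ex:x"), ("_:c1", "ex:q", "ex:y")],
    "b.ttl": [("_:c1", "ex:q", "ex:y"), ("_:c1", "ex:p", "ex:x")],  # same graph, other order
    "c.ttl": [("_:c1", "ex:p", "ex:z")],
}
duplicates = find_duplicates(crawl)
```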
OLD BUT RECURRING QUESTION
An old question that won’t go away …
Jeremy J. Carroll. “Signing RDF Graphs.” ISWC 2003.
Edzard Höfig, Ina Schieferdecker. “Hashing of RDF Graphs
and a Solution to the Blank Node Problem.” URSW 2014.
NO EXISTING APPROACH IS GENERAL
• Hard cases seem unlikely in practice
• Let’s build a general (and thus worst-case exponential) algorithm
that’s efficient for practical cases
NAÏVE CANONICAL LABELLING SCHEME
(Naïve) Canonical labels for blank nodes
But wait … what happens if ... ?
Or another case …
Fixpoint does not distinguish all blank nodes!
NAÏVE: COLOUR BLANK NODES RECURSIVELY
UNTIL FIXPOINT
• Efficient
• Incomplete
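The naïve scheme can be sketched as below. This is an illustration under stated assumptions rather than the implementation from the talk: triples are tuples, blank nodes are strings starting with `_:`, SHA-256 stands in for whichever hash is preferred, and `refine` is a hypothetical name.

```python
import hashlib

def h(*parts):
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()

def is_bnode(t):
    return t.startswith("_:")

def partition(colour):
    """Group blank nodes that currently share a colour."""
    groups = {}
    for b, c in colour.items():
        groups.setdefault(c, set()).add(b)
    return {frozenset(g) for g in groups.values()}

def refine(triples, colour=None):
    """Recolour blank nodes from their neighbourhoods until the partition is stable."""
    bnodes = {t for s, _, o in triples for t in (s, o) if is_bnode(t)}
    colour = dict(colour) if colour else {b: h("bnode") for b in bnodes}

    def col(t):
        # IRIs and literals have a fixed colour; blank nodes use the current one.
        return colour[t] if is_bnode(t) else h("term", t)

    while True:
        new = {}
        for b in bnodes:
            edges = sorted(
                [h("out", p, col(o)) for s, p, o in triples if s == b] +
                [h("in", p, col(s)) for s, p, o in triples if o == b])
            new[b] = h(colour[b], *edges)
        stable = partition(new) == partition(colour)
        colour = new
        if stable:
            return colour

path = {("_:a", "ex:p", "_:b"), ("_:b", "ex:p", "_:c")}
cycle = {("_:a", "ex:p", "_:b"), ("_:b", "ex:p", "_:c"), ("_:c", "ex:p", "_:a")}
print(len(set(refine(path).values())), len(set(refine(cycle).values())))
```

On the path every blank node ends with its own colour, but on the directed cycle all three still share one colour at the fixpoint, which is the incompleteness noted above.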
CANONICAL LABELLING SCHEME:
ALWAYS DISTINGUISH ALL BLANK NODES
Brendan D. McKay. "Practical graph isomorphism". Congressus Numerantium 30: pp. 45–87, 1981.
Start with a (non-distinguished) colouring …
Let’s distinguish a node …
Colouring is no longer a fixpoint!
Rerun colouring to fixpoint
Fixpoint reached: still not finished!
So again let’s distinguish another …
… and rerun colouring to fixpoint
Now all blank nodes are distinguished!
Blank node labels computed from colour
Let’s go back: first, why pick _:a and _:c?
Okay so: why _:a …
Adapt ideas from the Nauty algorithm
(for standard graph isomorphism)
Check all leaves for the minimum graph
What happened?
Automorphisms cause repetitions
CORE ALGORITHM: FIND MINIMAL GRAPH
FOLLOWING FIXED COLOURING RULES
• Complete
• Efficient for many cases?
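A compact sketch of the core idea: refine to a fixpoint, distinguish one blank node from a colour class that is still shared, branch over every node in that class, and keep the minimum graph found at the leaves. The fixed rule used here (always branch on the lowest shared colour) and all names are assumptions for illustration; no automorphism pruning is attempted.

```python
import hashlib

def h(*parts):
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()

def is_bnode(t):
    return t.startswith("_:")

def groups(colour):
    out = {}
    for b, c in colour.items():
        out.setdefault(c, []).append(b)
    return out

def refine(triples, colour):
    """Recolour blank nodes from their neighbourhoods until no class splits."""
    def col(t):
        return colour[t] if is_bnode(t) else h("term", t)
    while True:
        new = {}
        for b in colour:
            edges = sorted(
                [h("out", p, col(o)) for s, p, o in triples if s == b] +
                [h("in", p, col(s)) for s, p, o in triples if o == b])
            new[b] = h(colour[b], *edges)
        # Classes only ever split, so an unchanged count means a fixpoint.
        stable = len(groups(new)) == len(groups(colour))
        colour = new
        if stable:
            return colour

def canonicalise(triples):
    """Return (minimum relabelled graph, number of leaves explored)."""
    bnodes = {t for s, _, o in triples for t in (s, o) if is_bnode(t)}
    best, leaves = None, 0

    def search(colour):
        nonlocal best, leaves
        colour = refine(triples, colour)
        shared = {c: bs for c, bs in groups(colour).items() if len(bs) > 1}
        if not shared:  # leaf: every blank node has its own colour
            leaves += 1
            def rename(t):
                return "_:" + colour[t] if is_bnode(t) else t
            graph = sorted((rename(s), p, rename(o)) for s, p, o in triples)
            if best is None or graph < best:
                best = graph
            return
        target = min(shared)          # fixed rule: lowest shared colour
        for b in shared[target]:      # branch on each node of that class
            branch = dict(colour)
            branch[b] = h(colour[b], "distinguished")
            search(branch)

    search({b: h("bnode") for b in bnodes})
    return best, leaves

cycle = {("_:a", "ex:p", "_:b"), ("_:b", "ex:p", "_:c"), ("_:c", "ex:p", "_:a")}
relabelled = {("_:z", "ex:p", "_:x"), ("_:x", "ex:p", "_:y"), ("_:y", "ex:p", "_:z")}
canon1, leaves1 = canonicalise(cycle)
canon2, leaves2 = canonicalise(relabelled)
```

For the directed 3-cycle the search visits three leaves that all yield the same graph, which is the kind of repetition the slides attribute to automorphisms.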
OKAY … SO WHAT HASHING TO USE?
What about hash collisions?
128 bit: MD5, Murmur3_128
160 bit: SHA1
HASHING MAY LEAD TO COLLISIONS
• The algorithm does not depend on which hash function you use
• 128 bits is the shortest hash length with an acceptable collision probability
• For cryptographic use-cases, SHA-256 or better might be needed
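As a rough check on the hash-length bullet above, the standard birthday-bound approximation estimates the chance of at least one collision among n uniformly distributed hashes of b bits as 1 − exp(−n(n−1)/2^(b+1)). The figure of one billion hashed items is an assumption chosen for illustration, not a number from the talk.

```python
import math

def collision_probability(n_items, bits):
    """Birthday-bound approximation: 1 - exp(-n(n-1) / 2^(bits+1))."""
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(-n_items * (n_items - 1) / 2.0 ** (bits + 1))

for bits in (32, 64, 128, 160):
    print(bits, collision_probability(10**9, bits))
```

Under this approximation a 32-bit hash is almost certain to collide at that scale and a 64-bit hash has a probability of a few percent, while at 128 bits the probability becomes negligible for non-adversarial inputs.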
EVALUATION
Evaluation: Real-world Graphs
Evaluation: Nasty Synthetic Graphs
CONCLUSIONS
In loving memory of
Linked Data
2007–2012
Survived by its research community
_:b
1999–2015
Conclusions
Aside: Why GI-Hard?
(Can Encode Graph Isomorphism as RDF Isomorphism)
if and only if
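One way to picture the reduction, as a sketch: map each vertex of an undirected graph to a blank node and each edge to a pair of triples with a fixed predicate. The predicate `ex:edge` and the function names are assumptions, the encoding ignores isolated vertices, and the brute-force check is only meant for tiny inputs.

```python
from itertools import permutations

def encode(edges, pred="ex:edge"):
    """Encode an undirected graph as RDF triples over blank nodes, one per vertex."""
    triples = set()
    for u, v in edges:
        triples.add((f"_:v{u}", pred, f"_:v{v}"))
        triples.add((f"_:v{v}", pred, f"_:v{u}"))
    return triples

def rdf_isomorphic(g1, g2):
    """Brute force: try every bijection between the two sets of blank nodes."""
    b1 = sorted({t for s, _, o in g1 for t in (s, o) if t.startswith("_:")})
    b2 = sorted({t for s, _, o in g2 for t in (s, o) if t.startswith("_:")})
    if len(b1) != len(b2) or len(g1) != len(g2):
        return False
    for perm in permutations(b2):
        m = dict(zip(b1, perm))
        if {(m.get(s, s), p, m.get(o, o)) for s, p, o in g1} == g2:
            return True
    return False

triangle_a = encode([(1, 2), (2, 3), (3, 1)])
triangle_b = encode([(7, 8), (8, 9), (9, 7)])
path4 = encode([(1, 2), (2, 3), (3, 4)])
star4 = encode([(1, 2), (1, 3), (1, 4)])
print(rdf_isomorphic(triangle_a, triangle_b), rdf_isomorphic(path4, star4))
```

The two triangles are isomorphic as graphs and their encodings are isomorphic as RDF; the path and the star have the same numbers of vertices and edges but are not isomorphic, and neither are their encodings.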
Aside: Why GI-Complete?
(Can we encode RDF isomorphism as graph isomorphism?)
if and only if
Aside: Why GI-Complete?
(Yes: We can encode RDF isomorphism as graph isomorphism)
if and only if
COMPLETE CANONICAL LABELLING SCHEME
A complete canonical labelling?
Find a canonical labelling for H
Choose the lowest possible graph
COMPLETE: FIND MINIMUM POSSIBLE
GRAPH USING FIXED BLANK NODE LABELS
• Complete
• Inefficient
The need for a graph-level hash
OPTIMISATION: PRUNE THE TREE USING
AUTOMORPHISMS
Trim the search tree
using “found” automorphisms
Found Automorphisms …
PRUNING PER AUTOMORPHISMS AVOIDS
SYMMETRIC REPETITIONS
• Automorphisms are found naturally
• Makes very “regular” structures (like cliques) a lot easier
• Need to be careful how to manage the automorphism group
