Bobby Calderwood — October 2022
Absolute Consistency: Utilizing Point-in-Time
Queries in Event-Based Systems
The goal and the problem
• My name is Bobby, and my team makes https://oNote.com
• Our goal is to help software teams design, implement, and operate event-driven
systems
• Design with your team on our collaborative Event Modeling canvas
• Implement via schema design, code generation, and functional domain modeling
• Operate systems whose state is consistent and evident at all times
Systems whose state
is consistent and evident at all times
Correlating Business Events with Versioned Read Models
Examples of Point-in-Time Queryable Data Stores
• Git (content-addressed)
• Datomic/XTDB (EAVT fact-based)
• Certain CRDTs
• Materialize/CockroachDB/Snowflake/BigQuery
• Traditional RDBMSs via temporal tables and SQL FOR SYSTEM_TIME and AS OF
• Domain specific (e.g. our own Event indexing service)
Characteristics of PiT Queryable Data Stores
• Immutable/append only, often log based
• New facts supersede/succeed old ones in a specific way
• Often (but not always) able to evict data for regulatory/compliance reasons, while
maintaining causal integrity
• Time/causal order has a first-class representation, a system “clock”
Why would I need this?
Why would I need Git?
“If you have a business, you can’t make decisions if you don’t know what happened
before yesterday.”
Rich Hickey, Database as a Value, GOTO 2012
Stable Basis for Decisions
• Issue the same query and get the same results, even as the system makes ongoing
progress
• Multiple distributed participants in the system can agree on and make decisions
based on consistent point-in-time state
• System basis/clock can be serialized, sent, stored, etc. to coordinate reads across
participants and time
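A minimal Python sketch of the stable-basis idea, assuming a toy in-memory store. `VersionedStore` and its methods are hypothetical names for illustration, not any product's API. The basis here is simply the number of facts applied, so it serializes as plain JSON and can be handed to another participant.

```python
import json


class VersionedStore:
    """Append-only key/value log; the basis is the number of facts applied."""

    def __init__(self):
        self._log = []  # list of (key, value) facts, never mutated in place

    def put(self, key, value):
        self._log.append((key, value))
        return len(self._log)  # the new basis after this write

    def basis(self):
        return len(self._log)

    def as_of(self, basis):
        """Fold the first `basis` facts; later facts supersede earlier ones."""
        state = {}
        for key, value in self._log[:basis]:
            state[key] = value
        return state


store = VersionedStore()
store.put("account/1", {"balance": 100})
basis = store.put("account/2", {"balance": 50})

# The basis is plain data, so it can be serialized and sent to another participant.
message = json.dumps({"basis": basis})

# The system keeps making progress...
store.put("account/1", {"balance": 70})

# ...but a reader holding the serialized basis still sees the same state.
received = json.loads(message)["basis"]
snapshot = store.as_of(received)
latest = store.as_of(store.basis())
```

Re-running `store.as_of(received)` at any later time returns the same snapshot, which is what lets several participants base a decision on identical state.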
Multiversion Concurrency Control
• With a stable basis for query, we can ensure that writes happen against last-seen state,
or else abort
• GET /foos -> query DB at latest basis, include that basis in e.g. form data on page
• PUT /foos/bar -> write to DB unless the DB's current basis is greater than the one
contained in the request
• Ensures Command processing of PUT request emits Events based on immediate
predecessor of relevant state or not at all
• No more clobbering other users’ (or browser tabs’) writes with decisions based on stale
data!
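The GET/PUT exchange above can be sketched in a few lines of Python. This is an illustrative model only: `FooStore`, `StaleBasisError`, and the single store-wide integer basis are assumptions made for brevity, and no HTTP layer is involved.

```python
class StaleBasisError(Exception):
    """Raised when a write was decided on state that has since changed."""


class FooStore:
    def __init__(self):
        self._basis = 0  # incremented on every successful write
        self._foos = {}

    def get_foos(self):
        """GET /foos: return the data together with the basis it was read at."""
        return {"basis": self._basis, "foos": dict(self._foos)}

    def put_foo(self, name, value, seen_basis):
        """PUT /foos/<name>: write only if nothing was written after seen_basis."""
        if self._basis > seen_basis:
            raise StaleBasisError(
                f"store is at basis {self._basis}, request saw {seen_basis}"
            )
        self._foos[name] = value
        self._basis += 1
        return self._basis


store = FooStore()
page_a = store.get_foos()  # two browser tabs load the same page
page_b = store.get_foos()

store.put_foo("bar", "from tab A", page_a["basis"])  # succeeds

try:
    store.put_foo("bar", "from tab B", page_b["basis"])  # decided on stale data
    clobbered = True
except StaleBasisError:
    clobbered = False
```

A store-wide basis is the coarsest possible check. Tracking a basis per entity or per stream would reject only writes that conflict on the relevant state.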
Support Different Data Access Patterns
• Some data access pattern workloads are long-running and/or tolerate staleness
• Other data access patterns require more current data
• These queries put different load on OLTP data stores, might affect availability
differently
Full Context, Traceability, and Transparency
• Why did we make this decision at that point in the past?
• What changed from then to now?
• Time travel queries!
• What caused this particular problem?
• Audit, compliance, BI, etc.
• Analytics often aren’t enough to detect qualitative changes
Business recovery or reconciliation of errors
• Like git blame, but for data
• Speeds up time to recovery from business errors
• Detect patterns to preemptively catch similar errors in future
• Example: Service Member Civil Relief Act
How does it work in practice?
Datomic and XTDB (EAVT fact-based)
• Clock: monotonically increasing integer identifying each transaction
• Transactions are first-class entities, so we can add Event ID as metadata on the
transaction
• Correlation and traceability
• Enables query as-of a particular Event ID!
Datomic and XTDB (EAVT fact-based)
• TODO: code listing
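As a stand-in illustration, here is a small Python model of the idea. It does not use the Datomic or XTDB APIs: `FactStore`, `Datom`, and the event IDs are hypothetical, and retractions are left out. It only shows how tagging each transaction with an Event ID makes "query as-of event X" a lookup followed by an as-of-transaction query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datom:
    entity: str
    attribute: str
    value: object
    tx: int  # transaction number: the store's clock


class FactStore:
    def __init__(self):
        self._datoms = []  # append-only, in transaction order
        self._tx_by_event = {}  # event ID -> tx, i.e. transaction metadata
        self._next_tx = 1

    def transact(self, event_id, facts):
        """Append facts as one transaction, tagged with the causing event ID."""
        tx = self._next_tx
        self._next_tx += 1
        for entity, attribute, value in facts:
            self._datoms.append(Datom(entity, attribute, value, tx))
        self._tx_by_event[event_id] = tx
        return tx

    def as_of_tx(self, tx):
        """State built from all facts asserted at or before transaction `tx`."""
        state = {}
        for d in self._datoms:
            if d.tx <= tx:
                state[(d.entity, d.attribute)] = d.value
        return state

    def as_of_event(self, event_id):
        """State exactly as it stood after the given business event."""
        return self.as_of_tx(self._tx_by_event[event_id])


db = FactStore()
db.transact("evt-1", [("order/1", "status", "placed")])
db.transact("evt-2", [("order/1", "status", "shipped"),
                      ("order/1", "carrier", "ACME")])

after_first = db.as_of_event("evt-1")
after_second = db.as_of_event("evt-2")
```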
OpSet-based CRDT
• Clock: map of actor ID -> last counter seen for actor
• Can interpret the value based on a clock-defined subset of the OpSet
OpSet-based CRDT
• TODO: code listing
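As a stand-in illustration, a minimal Python sketch of an OpSet interpreted at a clock. The `Op` shape, the last-writer-wins map semantics, and the `(counter, actor)` tie-break order are simplifying assumptions, not the behavior of any particular CRDT library.

```python
from typing import NamedTuple


class Op(NamedTuple):
    actor: str
    counter: int  # per-actor sequence number, starting at 1
    key: str
    value: str


def visible(ops, clock):
    """The subset of the OpSet covered by `clock` (actor -> last counter seen)."""
    return {op for op in ops if op.counter <= clock.get(op.actor, 0)}


def interpret(ops, clock):
    """Fold the visible ops into a map; a deterministic order resolves conflicts."""
    state = {}
    for op in sorted(visible(ops, clock), key=lambda op: (op.counter, op.actor)):
        state[op.key] = op.value
    return state


def merge_clocks(a, b):
    """Pointwise maximum: the clock of someone who has seen both histories."""
    return {actor: max(a.get(actor, 0), b.get(actor, 0)) for actor in a.keys() | b.keys()}


ops = {
    Op("alice", 1, "title", "Draft"),
    Op("alice", 2, "title", "Final"),
    Op("bob", 1, "owner", "bob"),
}

early = interpret(ops, {"alice": 1})
full = interpret(ops, {"alice": 2, "bob": 1})
merged = merge_clocks({"alice": 1}, {"alice": 2, "bob": 1})
```

The same OpSet yields different values depending only on the clock passed in, which is what makes the clock usable as a point-in-time query parameter.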
Domain Specific: Event Indexing Service
• Clock: map of stream name -> count of events on stream, shortens to overall count of
events across all streams
• Can query state of any stream at a given clock state
• Supports MVCC on write: assert that a given stream must be at a particular revision,
else fail appending the event
Domain Specific: Event Indexing Service
• TODO: code listing
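As a stand-in illustration, a toy Python model of such an index. This is not the oNote service: `EventIndex`, `WrongRevisionError`, and the single global log are assumptions chosen to show the stream clock, its short form, and the revision check on append.

```python
class WrongRevisionError(Exception):
    """Raised when a stream is not at the revision the writer expected."""


class EventIndex:
    def __init__(self):
        self._events = []  # global append-only log of (stream, payload)

    def clock_at(self, total):
        """Expand the short clock (overall event count) into stream -> count."""
        counts = {}
        for stream, _ in self._events[:total]:
            counts[stream] = counts.get(stream, 0) + 1
        return counts

    def total(self):
        """Short form of the clock: overall count of events across all streams."""
        return len(self._events)

    def clock(self):
        return self.clock_at(self.total())

    def append(self, stream, payload, expected_revision):
        """MVCC on write: fail unless the stream is at `expected_revision`."""
        current = self.clock().get(stream, 0)
        if current != expected_revision:
            raise WrongRevisionError(
                f"{stream} is at revision {current}, expected {expected_revision}"
            )
        self._events.append((stream, payload))
        return current + 1

    def read(self, stream, clock):
        """Events on `stream` as of `clock` (a stream -> count map)."""
        events = [payload for s, payload in self._events if s == stream]
        return events[: clock.get(stream, 0)]


index = EventIndex()
index.append("cart-1", "ItemAdded", expected_revision=0)
index.append("cart-2", "ItemAdded", expected_revision=0)
basis = index.clock()

index.append("cart-1", "CheckedOut", expected_revision=1)

try:
    index.append("cart-1", "ItemRemoved", expected_revision=1)  # stale revision
    rejected = False
except WrongRevisionError:
    rejected = True
```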
Putting it all together:
oNote Event Model Repository
Event Clock and CRDT Clock
• Event Clock from our event indexing service
  • Map of stream -> event count
  • Used for MVCC and to fetch fine-grained streams per entity
• CRDT Sync operation
  • Appends an Event with a CRDT patch based on client clock and server clock to Event Stream
  • Responds with patch to bring client up to latest server clock
• CRDT query: fetch Event Model at a particular clock state
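One way to read the Sync operation, sketched in Python under stated assumptions. `SyncServer`, the `Op` shape, and the `ModelPatched` event type are hypothetical, and the real oNote protocol may differ. The client sends its clock and ops. The server appends an event carrying whatever it lacked, then replies with whatever the client lacks.

```python
from typing import NamedTuple


class Op(NamedTuple):
    actor: str
    counter: int  # per-actor sequence number
    payload: str


def clock_of(ops):
    """Clock for a set of ops: actor -> highest counter present."""
    clock = {}
    for op in ops:
        clock[op.actor] = max(clock.get(op.actor, 0), op.counter)
    return clock


def missing(ops, clock):
    """Ops that a holder of `clock` has not seen yet, in a stable order."""
    return sorted(op for op in ops if op.counter > clock.get(op.actor, 0))


class SyncServer:
    def __init__(self):
        self.ops = set()
        self.event_stream = []  # appended events, each carrying a CRDT patch

    def sync(self, client_clock, client_ops):
        server_clock = clock_of(self.ops)
        patch_in = missing(client_ops, server_clock)  # what the server lacks
        if patch_in:
            self.event_stream.append({"type": "ModelPatched", "patch": patch_in})
            self.ops.update(patch_in)
        patch_out = missing(self.ops, client_clock)  # what the client lacks
        return patch_out, clock_of(self.ops)


server = SyncServer()

alice_ops = {Op("alice", 1, "add slice")}
alice_patch, _ = server.sync(clock_of(alice_ops), alice_ops)

bob_ops = {Op("bob", 1, "rename event")}
bob_patch, server_clock = server.sync(clock_of(bob_ops), bob_ops)
bob_ops.update(bob_patch)  # bob is now at the latest server clock
```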
Syncing with Git Hosts
• Addition of CRDT patch file to configured directory syncs CRDT state with specific Git
SHA
• Changes can come from either:
  • oNote UI as Sync operations
  • Git host via branch merges, etc.
• Kafka-based event processor ensures all changes are merged and made available in
oNote app and in configured Git repo
Conclusion
• Event Log + versioned Read Models provide:
  • Fully consistent point-in-time state that results from each business event
  • Transparency, analytics, audit trail for each state transition in system
• Kafka + Event indexing provide a meta-Read Model for Events on streams, enabling
fine-grained streams and MVCC for event-based applications
• Kafka + CRDT + Git hosting enables versioned and history-preserving Event Model
repository for oNote
Thank you!

Utilizing Point-in-Time Queries in Event-Based Systems, Bobby Calderwood | Current 2022
