Managing Data at Scale
Microservices and Events
Randy Shoup
@randyshoup
linkedin.com/in/randyshoup
Evolution to
Microservices
• eBay
• 5th generation today
• Monolithic Perl → Monolithic C++ → Java → microservices
• Twitter
• 3rd generation today
• Monolithic Rails → JS / Rails / Scala → microservices
• Amazon
• Nth generation today
• Monolithic Perl / C++ → Java / Scala → microservices
No one starts with microservices
…
Past a certain scale, everyone
ends up with microservices
First Law of Distributed Object
Design:
Don’t distribute your objects!
-- Martin Fowler
Managing Data
at Scale
• Migrating to Microservices
• Challenges of Data in Microservices
• Challenges of Event-Driven Systems
Microservices
• Single-purpose
• Simple, well-defined interface
• Modular and independent
• Isolated persistence (!)
Extracting
Microservices
• Problem: Monolithic shared DB
• Clients
• Shipments
• Items
• Styles, SKUs
• Warehouses
• etc.
stitchfix.com Styling app Warehouse app Merch app
CS app Logistics app Payments service Profile service
Extracting
Microservices
• Decouple applications / services from shared DB
• Clients
• Shipments
• Items
• Styles, SKUs
• Warehouses
• etc.
stitchfix.com Styling app Warehouse app Merch app
CS app Logistics app Payments service Profile service
Extracting
Microservices
• Decouple applications / services from shared DB
Styling app Warehouse app
core_item
core_sku
core_client
Extracting
Microservices
• Step 1: Create a service
Styling app Warehouse app
core_item
core_sku
core_client
client-service
Extracting
Microservices
• Step 2: Applications use the service
Styling app Warehouse app
core_item
core_sku
core_client
client-service
Extracting
Microservices
• Step 3: Move data to private database
Styling app Warehouse app
core_item
core_sku
client-service
core_client
Extracting
Microservices
• Step 4: Rinse and Repeat
Styling app Warehouse app
core_sku
client-service
core_client
item-service
core_item
Extracting
Microservices
• Step 4: Rinse and Repeat
Styling app Warehouse app
client-service
core_client
item-service
core_item
style-service
core_sku
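Steps 1 to 3 can be sketched in code. This is a minimal Python illustration under stated assumptions, not the implementation used at Stitch Fix: `Client`, `InMemoryStore`, `ClientService`, and `move_to_private_store` are hypothetical names, and an in-memory dictionary stands in for both the shared database and the private one. What it tries to show is that once applications go through the service rather than the `core_client` table, the data can move without any caller changing.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Client:
    client_id: int
    name: str


class InMemoryStore:
    """Stand-in for a database holding the core_client table (hypothetical)."""

    def __init__(self) -> None:
        self._rows: Dict[int, Client] = {}

    def get(self, client_id: int) -> Optional[Client]:
        return self._rows.get(client_id)

    def put(self, client: Client) -> None:
        self._rows[client.client_id] = client

    def all(self) -> List[Client]:
        return list(self._rows.values())


class ClientService:
    """Steps 1 and 2: the single entry point applications use for client data."""

    def __init__(self, store: InMemoryStore) -> None:
        self._store = store

    def get_client(self, client_id: int) -> Optional[Client]:
        return self._store.get(client_id)

    def save_client(self, client: Client) -> None:
        self._store.put(client)

    def move_to_private_store(self, private_store: InMemoryStore) -> None:
        """Step 3: copy the rows, then serve all traffic from the private store."""
        for client in self._store.all():
            private_store.put(client)
        self._store = private_store
```

Callers only ever see `get_client` and `save_client`, so the switch in `move_to_private_store` is invisible to them. A real migration would also need to handle writes that arrive during the copy, which this sketch ignores.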
Managing Data
at Scale
• Migrating to Microservices
• Challenges of Data in Microservices
o Two Architectural Tools
o Shared Data
o Joins
o Transactions
• Challenges of Event-Driven Systems
Service as
System of Record
• Single System of Record
o Every piece of data is owned by a single service
o That service is the canonical system of record for that data
• Every other copy is a read-only, non-authoritative
cache
customer-service
styling-service
customer-search
billing-service
Events as
First-Class Construct
• “A significant change in state”
o Statement that some interesting thing occurred
• Traditional 3-tier system
o Presentation → interface / interaction
o Application → stateless business logic
o Persistence → database
• Fourth fundamental building block
o State changes → events
o 0 | 1 | N consumers subscribe to the event, typically asynchronously
Microservices
and Events
• Events are a first-class part of a service interface
• A service interface includes
o Synchronous request-response (REST, gRPC, etc.)
o Events the service produces
o Events the service consumes
o Bulk reads and writes (ETL)
• The interface includes any mechanism for getting data in
or out of the service (!)
Managing Data
at Scale
• Migrating to Microservices
• Challenges of Data in Microservices
o Two Architectural Tools
o Shared Data
o Joins
o Transactions
• Challenges of Event-Driven Systems
Data in Microservices:
Shared Data
• Monolithic database makes it easy to leverage shared
data
• Where does shared data go in a microservices world?
Data in Microservices:
Shared Data
Option 1: Synchronous Lookup
o Customer service owns customer data
o Fulfillment service calls customer service in real time
fulfillment-service
customer-service
Data in Microservices:
Shared Data
Option 2: Async event + local cache
o Customer service owns customer data
o Customer service sends address-updated event when customer address
changes
o Fulfillment service caches current customer address
fulfillment-service customer-service
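Option 2 might look roughly like the following Python sketch. The class and method names (`AddressUpdated`, `CustomerService.subscribe`, `FulfillmentService.on_address_updated`) are assumptions made for illustration, and a direct in-process callback stands in for what would normally be an asynchronous event bus.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class AddressUpdated:
    """Hypothetical address-updated event."""
    customer_id: int
    address: str


class CustomerService:
    """System of record for customer addresses."""

    def __init__(self) -> None:
        self._addresses: Dict[int, str] = {}
        self._subscribers: List[Callable[[AddressUpdated], None]] = []

    def subscribe(self, handler: Callable[[AddressUpdated], None]) -> None:
        self._subscribers.append(handler)

    def update_address(self, customer_id: int, address: str) -> None:
        self._addresses[customer_id] = address
        event = AddressUpdated(customer_id, address)
        # In-process stand-in for publishing to an asynchronous event bus.
        for handler in self._subscribers:
            handler(event)


class FulfillmentService:
    """Keeps a read-only, non-authoritative cache of customer addresses."""

    def __init__(self) -> None:
        self._address_cache: Dict[int, str] = {}

    def on_address_updated(self, event: AddressUpdated) -> None:
        self._address_cache[event.customer_id] = event.address

    def shipping_address(self, customer_id: int) -> Optional[str]:
        # Served from the local cache; no call to customer-service.
        return self._address_cache.get(customer_id)
```

The fulfillment side never writes customer data; it only applies events, which keeps customer-service the single system of record.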
Data in Microservices:
Shared Data
Option 3: Shared metadata library
o Read-only metadata, basically immutable
o E.g., size schemas, colors, fabrics, US States, etc.
receiving-service item-service
style-service
Managing Data
at Scale
• Migrating to Microservices
• Challenges of Data in Microservices
o Two Architectural Tools
o Shared Data
o Joins
o Transactions
• Challenges of Event-Driven Systems
Data in Microservices:
Joins
• Monolithic database makes it easy to join tables
• Splitting the data across microservices makes joins very
hard
SELECT FROM A INNER JOIN B ON …
Data in Microservices:
Joins
Option 1: Join in Client Application
o Get a single customer from customer-service
o Query matching orders for that customer from order-service
Customers
Orders
order-history-page
customer-service order-service
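A join in the client application can be sketched as two service calls followed by an in-memory combination. In this Python illustration, `build_order_history` and the two `fake_*` functions are hypothetical; the fakes stand in for network calls to customer-service and order-service, and the data is invented for the example.

```python
from typing import Callable, Dict, List, Optional


def build_order_history(
    customer_id: int,
    get_customer: Callable[[int], Optional[Dict]],
    list_orders: Callable[[int], List[Dict]],
) -> Optional[Dict]:
    """Join one customer with that customer's orders in the calling application."""
    customer = get_customer(customer_id)   # call to customer-service
    if customer is None:
        return None
    orders = list_orders(customer_id)      # call to order-service
    return {
        "customer_name": customer["name"],
        "order_ids": [order["order_id"] for order in orders],
    }


# In-memory stand-ins for the two services (hypothetical data).
CUSTOMERS = {1: {"name": "Ada"}}
ORDERS = [
    {"order_id": 100, "customer_id": 1},
    {"order_id": 101, "customer_id": 2},
    {"order_id": 102, "customer_id": 1},
]


def fake_get_customer(customer_id: int) -> Optional[Dict]:
    return CUSTOMERS.get(customer_id)


def fake_list_orders(customer_id: int) -> List[Dict]:
    return [order for order in ORDERS if order["customer_id"] == customer_id]
```

This shape suits a single-customer lookup like the order history page; it does not scale to joining many customers against many orders.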
Data in Microservices:
Joins
Option 2: Service that “Materializes the View”
o Listen to events from item-service, events from order-service
o Maintain denormalized join of items and orders together in local storage
Items Order Feedback
item-feedback-service
item-service
order-feedback-service
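A service that materializes the view could be sketched as below. This is an illustrative Python outline, assuming hypothetical event handlers `on_item_event` and `on_feedback_event` and a dictionary as the local storage; the event payloads are invented for the example.

```python
from collections import defaultdict
from typing import Dict, List


class ItemFeedbackView:
    """Denormalized join of items and order feedback, kept in local storage."""

    def __init__(self) -> None:
        self._item_names: Dict[int, str] = {}
        self._ratings: Dict[int, List[int]] = defaultdict(list)

    def on_item_event(self, item_id: int, name: str) -> None:
        # Event from item-service: remember the latest item name.
        self._item_names[item_id] = name

    def on_feedback_event(self, item_id: int, rating: int) -> None:
        # Event from order-feedback-service: attach the rating to the item.
        self._ratings[item_id].append(rating)

    def item_with_feedback(self, item_id: int) -> Dict:
        # The "join" is a local lookup; no other service is called at read time.
        return {
            "item_id": item_id,
            "name": self._item_names.get(item_id),
            "ratings": list(self._ratings.get(item_id, [])),
        }
```

Reads are cheap because the join was done ahead of time, at the cost of the view lagging slightly behind the source services.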
Data in Microservices:
Joins
• Many common systems do this
o “Materialized view” in database systems
o Most NoSQL systems
o Search engines
o Analytic systems
Managing Data
at Scale
• Migrating to Microservices
• Challenges of Data in Microservices
o Two Architectural Tools
o Shared Data
o Joins
o Transactions
• Challenges of Event-Driven Systems
Data in Microservices:
Transactions
• Monolithic database makes transactions across multiple
entities easy
• Splitting data across services makes transactions very
hard
BEGIN; INSERT INTO A …; UPDATE B...; COMMIT;
“In general, application
developers simply do not
implement large scalable
applications assuming
distributed transactions.”
-- Pat Helland
Life beyond Distributed Transactions: An Apostate’s Opinion, 2007
“Grownups don’t use
distributed transactions”
-- Pat Helland
Data in Microservices:
Workflows and Sagas
• Transaction → Saga
o Model the transaction as a state machine of atomic events
• Reimplement as a workflow
• Roll back by applying compensating operations in
reverse
A B C
A B C
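The rollback rule can be sketched as a small runner. This Python example is an illustration only: `run_saga` and the `(name, action, compensation)` tuple shape are assumptions, and a real saga would persist its progress so it can resume after a crash.

```python
from typing import Callable, List, Tuple

# Each step is (name, action, compensation); the shape is hypothetical.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Tuple[str, Callable[[], None]]] = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            # Undo what already succeeded, most recent step first.
            for done_name, done_compensate in reversed(completed):
                done_compensate()
                log.append(f"compensated {done_name}")
            return False
        completed.append((name, compensate))
        log.append(f"completed {name}")
    return True
```

If steps A and B succeed and C fails, the log reads: completed A, completed B, compensated B, compensated A.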
Data in Microservices:
Workflows and Sagas
• Many common systems do this
o Payment processing
o Expense approval
o Travel
o Any multi-step workflow
Data in Microservices:
Workflows and Sagas
• Simple event-driven processing
o Very lightweight logic
o Stateless
o Triggered by an event
• → Consider Function-as-a-Service (“Serverless”)
A B C
A B C
Managing Data
at Scale
• Migrating to Microservices
• Challenges of Data in Microservices
• Challenges of Event-Driven Systems
o Event Duplication
o Event Ordering
Event
Duplication
• Problem: The same event will be delivered more than
once
o Network issues
o Redelivery
Producer Transport Consumer
• Event-1
• Event-2
• Event-3
• Event-1
• Event-2
• Event-2
• Event-3
Event
Duplication
• The consumer must process an event correctly
regardless of how many times it receives it
Producer Transport Consumer
• Event-1
• Event-2
• Event-3
• Event-1
• Event-2
• Event-2
• Event-3
• Event-1
• Event-2
• Event-3
Event Duplication:
(A) Exactly Once Delivery
Message bus buffers messages
o Message bus remembers events it has delivered, identified by message id
o Only deliver event if not yet delivered
Producer Transport Consumer
• Event-1
• Event-2
• Event-3
• Event-1 [1]
• Event-2 [2]
• Event-3 [3]
• Event-1 [1]
• Event-2 [2]
• Event-2 [2]
• Event-3 [3]
• Event-1 [1]
• Event-2 [2]
• Event-3 [3]
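The bus-side bookkeeping described for approach (A) can be sketched as follows. This Python example is a simplification under stated assumptions: `DeduplicatingBus` is a hypothetical class, message ids are assumed to be assigned before the bus sees the message, and the set of delivered ids lives in memory and grows without bound. A real message bus would have to persist and expire that state.

```python
from typing import Callable, Set


class DeduplicatingBus:
    """Delivers each message id to the consumer at most once."""

    def __init__(self, deliver: Callable[[str], None]) -> None:
        self._deliver = deliver
        self._delivered_ids: Set[int] = set()

    def publish(self, message_id: int, payload: str) -> bool:
        """Return True if the payload was delivered, False if it was a duplicate."""
        if message_id in self._delivered_ids:
            return False
        self._delivered_ids.add(message_id)
        self._deliver(payload)
        return True
```

Feeding it ids 1, 2, 2, 3 delivers three payloads to the consumer; the second message with id 2 is dropped.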
Event Duplication:
(B) Idempotent Processing
Option 1: Idempotency key
o Remember previously processed events, identified by idempotency key
o Before processing, check whether you have processed it already
o E.g., counter
Producer Transport Consumer
• Event-1 [aaa]
• Event-2 [bbb]
• Event-3 [ccc]
• Event-1 [aaa]
• Event-2 [bbb]
• Event-2 [bbb]
• Event-3 [ccc]
• Event-1
• Event-2
• <nothing>
• Event-3
aaa
aaa,bbb
aaa,bbb
aaa,bbb,ccc
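The counter example with an idempotency key can be sketched like this. `CountingConsumer` and its `handle` method are hypothetical names, and the set of seen keys is held in memory for the sake of a short example; a real consumer would store it durably, ideally in the same transaction as the state it protects.

```python
from typing import Set


class CountingConsumer:
    """Counts events, ignoring any event whose idempotency key was already seen."""

    def __init__(self) -> None:
        self.count = 0
        self._seen_keys: Set[str] = set()

    def handle(self, idempotency_key: str) -> bool:
        """Return True if the event was processed, False if it was a duplicate."""
        if idempotency_key in self._seen_keys:
            return False          # already processed: do nothing
        self._seen_keys.add(idempotency_key)
        self.count += 1
        return True
```

Receiving keys aaa, bbb, bbb, ccc leaves the counter at 3, matching the diagram: the second bbb results in no processing.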
Event Duplication:
(B) Idempotent Processing
Option 2: Processing is inherently idempotent
o Simply do the processing N times for N events
o E.g., “set X to 10”, UPSERT
Producer Transport Consumer
• Event-1
• Event-2
• Event-3
• Event-1
• Event-2
• Event-2
• Event-3
• Set x:=1
• Set x:=2
• Set x:=2
• Set x:=3
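Both examples on this slide, "set X to 10" and UPSERT, can be written as operations that give the same result no matter how many times they run. The Python below is a small sketch; `apply_set_event` and `upsert` are hypothetical helpers and plain dictionaries stand in for real storage.

```python
from typing import Dict


def apply_set_event(state: Dict[str, int], key: str, value: int) -> None:
    """Apply a 'set key to value' event; repeating it leaves the same state."""
    state[key] = value


def upsert(table: Dict[int, Dict], row: Dict) -> None:
    """Insert the row, or overwrite the existing row that has the same id."""
    table[row["id"]] = dict(row)
```

By contrast, an operation such as "add 1 to X" is not idempotent and would need an idempotency key or similar bookkeeping.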
Event Duplication:
(B) Idempotent Processing
Option 3: Conflict-free Replicated Data Types (CRDTs)
o Achieve agreement without explicit coordination
o Custom data structures, composable processing steps
o Many implementations, but still an area of active research
Common techniques
o Remember what you saw (request id, idempotency key)
o Remember that an item was deleted (“tombstone”)
(Do *NOT* roll your own from first principles)
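To make the idea concrete, here is one of the simplest known CRDTs, a grow-only counter (G-Counter), sketched in Python. It is shown only to illustrate why merging can be repeated safely; in line with the warning above, it is not meant as something to deploy. The class and method names are assumptions for this example.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: each replica increments only its own slot."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        """Take the per-replica maximum; merging the same state twice changes nothing."""
        for replica_id, count in other.counts.items():
            self.counts[replica_id] = max(self.counts.get(replica_id, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())
```

Because the merge takes a maximum per replica, it is idempotent and order-independent, so replicas agree without explicit coordination.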
Managing Data
at Scale
• Migrating to Microservices
• Challenges of Data in Microservices
• Challenges of Event-Driven Systems
o Event Duplication
o Event Ordering
Event
Ordering
• Problem: Events will arrive out of order
o Network issues
o Processing time
o Redelivery
Producer Transport Consumer
• Event-1
• Event-2
• Event-3
• Event-2
• Event-1
• Event-3
Event
Ordering
• The consumer must process events correctly regardless
of the order in which they arrive
Producer Transport Consumer
• Event-1
• Event-2
• Event-3
• Event-2
• Event-1
• Event-3
• Event-1
• Event-2
• Event-3
Event Ordering:
(A) Impose Order
Option 1: Sequence + Reorder in the message bus
o Sequence number at the producer
o Message bus queues messages, waits for gaps
o Bus sends to consumer in order
Producer Transport Consumer
• Event-1 [1]
• Event-2 [2]
• Event-3 [3]
• Event-1 [1]
• Event-2 [2]
• Event-3 [3]
• Event-1 [1]
• Event-2 [2]
• Event-3 [3]
• Event-2 [2]
• Event-1 [1]
• Event-3 [3]
Event Ordering:
(A) Impose Order
Option 2: Sequence + Reorder in the consumer
o Sequence number / timestamp at the producer
o Consumer reorders before processing
Producer Transport Consumer
• Event-1 [1]
• Event-2 [2]
• Event-3 [3]
• Event-2 [2]
• Event-1 [1]
• Event-3 [3]
• Event-1 [1]
• Event-2 [2]
• Event-3 [3]
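Reordering in the consumer can be sketched with a small buffer keyed by sequence number. In this Python illustration, `ReorderingConsumer` is a hypothetical class, and it assumes the producer numbers events consecutively starting at 1. A gap that never fills would stall processing, which a real consumer would need a timeout or similar policy for.

```python
from typing import Callable, Dict


class ReorderingConsumer:
    """Buffers out-of-order events and processes them in sequence order."""

    def __init__(self, process: Callable[[str], None], first_sequence: int = 1) -> None:
        self._process = process
        self._next_sequence = first_sequence
        self._pending: Dict[int, str] = {}

    def receive(self, sequence: int, event: str) -> None:
        if sequence < self._next_sequence:
            return  # already processed; treat as a duplicate
        self._pending[sequence] = event
        # Drain every event that is now contiguous with what was processed.
        while self._next_sequence in self._pending:
            self._process(self._pending.pop(self._next_sequence))
            self._next_sequence += 1
```

Receiving sequence numbers 2, 1, 3 results in the events being processed as 1, 2, 3.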
Event Ordering:
(B) Order-Independence
Option 1: Order-independent semantics
o E.g., count number of events
Producer Transport Consumer
• Event-1
• Event-2
• Event-3
• Event-2
• Event-1
• Event-3
• +1
• +1
• +1
Event Ordering:
(B) Order-Independence
Option 2: Notification + Read-back
o Event is a notification: object id + type of change
o Consumer “reads back” to the source service to get current state
Producer Transport Consumer
• Event-1
• Event-2
• Event-3
• Event-2
• Event-1
• Event-3
• State = 2
• State = 2
• State = 3
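Notification plus read-back might be sketched as follows. The names `SourceService` and `ReadBackConsumer` are hypothetical, and a direct method call stands in for the network read to the source service. The notification carries only the object id and the type of change, never the new value.

```python
from typing import Dict, Optional


class SourceService:
    """Owns the data; consumers read current state from here."""

    def __init__(self) -> None:
        self._state: Dict[str, int] = {}

    def set_state(self, object_id: str, value: int) -> None:
        self._state[object_id] = value

    def get_state(self, object_id: str) -> Optional[int]:
        return self._state.get(object_id)


class ReadBackConsumer:
    """On each notification, fetches the current state instead of trusting the event."""

    def __init__(self, source: SourceService) -> None:
        self._source = source
        self.local: Dict[str, Optional[int]] = {}

    def on_notification(self, object_id: str, change_type: str) -> None:
        # change_type is informational only; the value always comes from the source.
        self.local[object_id] = self._source.get_state(object_id)
```

If the source has already moved to state 2 when the notifications for changes 1 and 2 arrive in the wrong order, the consumer reads 2 both times, as in the diagram.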
Event Ordering:
(B) Order-Independence
Option 3: Event Sourcing
o Store all events in a log
o Process / interpret later
Producer Transport Consumer
• Event-1
• Event-2
• Event-3
• Event-2
• Event-1
• Event-3
Event-2
Event-1
Event-3
• Event-1
• Event-2
• Event-3
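Storing first and interpreting later can be sketched with an append-only log. This Python example is an assumption-laden outline: `EventLog` is a hypothetical class, and it assumes each event carries a producer-assigned sequence number, which the slide does not show for this option.

```python
from typing import List, Tuple


class EventLog:
    """Append-only log; arrival order is kept, interpretation happens on replay."""

    def __init__(self) -> None:
        self._events: List[Tuple[int, str]] = []

    def append(self, sequence: int, payload: str) -> None:
        """Store the event as it arrives; no ordering is enforced here."""
        self._events.append((sequence, payload))

    def replay(self) -> List[str]:
        """Interpret later: return payloads ordered by producer sequence number."""
        return [payload for _, payload in sorted(self._events, key=lambda e: e[0])]
```

Events appended in the order 2, 1, 3 are replayed as 1, 2, 3, so arrival order stops mattering at write time.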
Managing Data
at Scale
• Migrating to Microservices
• Challenges of Data in Microservices
• Challenges of Event-Driven Systems
Thank you!
@randyshoup
linkedin.com/in/randyshoup
medium.com/@randyshoup

Managing Data at Scale - Microservices and Events

  • 1. Managing Data at Scale Microservices and Events Randy Shoup @randyshoup linkedin.com/in/randyshoup
  • 2. Evolution to Microservices • eBay • 5th generation today • Monolithic Perl  Monolithic C++  Java  microservices • Twitter • 3rd generation today • Monolithic Rails  JS / Rails / Scala  microservices • Amazon • Nth generation today • Monolithic Perl / C++  Java / Scala  microservices @randyshoup linkedin.com/in/randyshoup
  • 3. No one starts with microservices … Past a certain scale, everyone ends up with microservices
  • 4. First Law of Distributed Object Design: Don’t distribute your objects! -- Martin Fowler
  • 5. Managing Data at Scale • Migrating to Microservices • Challenges of Data in Microservices • Challenges of Event-Driven Systems
  • 6. Managing Data at Scale • Migrating to Microservices • Challenges of Data in Microservices • Challenges of Event-Driven Systems
  • 7. Microservices • Single-purpose • Simple, well-defined interface • Modular and independent • Isolated persistence (!) A C D E B
  • 8. Extracting Microservices • Problem: Monolithic shared DB • Clients • Shipments • Items • Styles, SKUs • Warehouses • etc. stitchfix.com Styling app Warehouse app Merch app CS app Logistics app Payments service Profile service
  • 9. Extracting Microservices • Decouple applications / services from shared DB • Clients • Shipments • Items • Styles, SKUs • Warehouses • etc. stitchfix.com Styling app Warehouse app Merch app CS app Logistics app Payments service Profile service
  • 10. Extracting Microservices • Decouple applications / services from shared DB Styling app Warehouse app core_item core_sku core_client
  • 11. Extracting Microservices • Step 1: Create a service Styling app Warehouse app core_item core_sku core_client client-service
  • 12. Extracting Microservices • Step 2: Applications use the service Styling app Warehouse app core_item core_sku core_client client-service
  • 13. Extracting Microservices • Step 3: Move data to private database Styling app Warehouse app core_item core_sku client-service core_client
  • 14. Extracting Microservices • Step 4: Rinse and Repeat Styling app Warehouse app core_sku client-service core_client item-service core_item
  • 15. Extracting Microservices • Step 4: Rinse and Repeat Styling app Warehouse app client-service core_client item-service core_item style-service core_sku
  • 16. Extracting Microservices • Step 4: Rinse and Repeat Styling app Warehouse app client-service core_client item-service core_item style-service core_sku
  • 17. Managing Data at Scale • Migrating to Microservices • Challenges of Data in Microservices o Two Architectural Tools o Shared Data o Joins o Transactions • Challenges of Event-Driven Systems
  • 18. Managing Data at Scale • Migrating to Microservices • Challenges of Data in Microservices o Two Architectural Tools o Shared Data o Joins o Transactions • Challenges of Event-Driven Systems
  • 19. Service as System of Record • Single System of Record o Every piece of data is owned by a single service o That service is the canonical system of record for that data • Every other copy is a read-only, non-authoritative cache @randyshoup linkedin.com/in/randyshoup customer-service styling-service customer-search billing-service
  • 20. Events as First-Class Construct • “A significant change in state” o Statement that some interesting thing occurred • Traditional 3-tier system o Presentation  interface / interaction o Application  stateless business logic o Persistence  database • Fourth fundamental building block o State changes  events o 0 | 1 | N consumers subscribe to the event, typically asynchronously @randyshoup linkedin.com/in/randyshoup
  • 21. Microservices and Events • Events are a first-class part of a service interface • A service interface includes o Synchronous request-response (REST, gRPC, etc) o Events the service produces o Events the service consumes o Bulk reads and writes (ETL) • The interface includes any mechanism for getting data in or out of the service (!) @randyshoup linkedin.com/in/randyshoup
  • 22. Managing Data at Scale • Migrating to Microservices • Challenges of Data in Microservices o Two Architectural Tools o Shared Data o Joins o Transactions • Challenges of Event-Driven Systems
  • 23. Data in Microservices: Shared Data • Monolithic database makes it easy to leverage shared data • Where does shared data go in a microservices world? @randyshoup linkedin.com/in/randyshoup
  • 24. Data in Microservices: Shared Data Option 1: Synchronous Lookup o Customer service owns customer data o Fulfillment service calls customer service in real time fulfillment-service customer-service @randyshoup linkedin.com/in/randyshoup
  • 25. Data in Microservices: Shared Data Option 2: Async event + local cache o Customer service owns customer data o Customer service sends address-updated event when customer address changes o Fulfillment service caches current customer address fulfillment-servicecustomer-service @randyshoup linkedin.com/in/randyshoup
  • 26. Data in Microservices: Shared Data Option 3: Shared metadata library o Read-only metadata, basically immutable o E.g., size schemas, colors, fabrics, US States, etc. receiving-serviceitem-service style-service
  • 27. Managing Data at Scale • Migrating to Microservices • Challenges of Data in Microservices o Two Architectural Tools o Shared Data o Joins o Transactions • Challenges of Event-Driven Systems
  • 28. Data in Microservices: Joins • Monolithic database makes it easy to join tables • Splitting the data across microservices makes joins very hard @randyshoup linkedin.com/in/randyshoup SELECT FROM A INNER JOIN B ON …
  • 29. Data in Microservices: Joins Option 1: Join in Client Application o Get a single customer from customer-service o Query matching orders for that customer from order-service Customers Orders order-history-page customer-service order-service
  • 30. Data in Microservices: Joins Option 2: Service that “Materializes the View” o Listen to events from item-service, events from order-service o Maintain denormalized join of items and orders together in local storage Items Order Feedback item-feedback-service item-service order-feedback-service
  • 31. Data in Microservices: Joins • Many common systems do this o “Materialized view” in database systems o Most NoSQL systems o Search engines o Analytic systems @randyshoup linkedin.com/in/randyshoup
  • 32. Managing Data at Scale • Migrating to Microservices • Challenges of Data in Microservices o Two Architectural Tools o Shared Data o Joins o Transactions • Challenges of Event-Driven Systems
  • 33. Data in Microservices: Transactions • Monolithic database makes transactions across multiple entities easy • Splitting data across services makes transactions very hard @randyshoup linkedin.com/in/randyshoup BEGIN; INSERT INTO A …; UPDATE B...; COMMIT;
  • 34. “In general, application developers simply do not implement large scalable applications assuming distributed transactions.” -- Pat Helland Life After Distributed Transactions: An Apostate’s Opinion, 2007
  • 35. “Grownups don’t use distributed transactions” -- Pat Helland
  • 36. Data in Microservices: Workflows and Sagas • Transaction  Saga o Model the transaction as a state machine of atomic events • Reimplement as a workflow • Roll back by applying compensating operations in reverse A B C A B C @randyshoup linkedin.com/in/randyshoup
  • 37. Data in Microservices: Workflows and Sagas • Many common systems do this o Payment processing o Expense approval o Travel o Any multi-step workflow @randyshoup linkedin.com/in/randyshoup
  • 38. Data in Microservices: Workflows and Sagas • Simple event-driven processing o Very lightweight logic o Stateless o Triggered by an event • → Consider Function-as-a-Service (“Serverless”) [Diagram: steps A, B, C, each shown as a function (ƛ)]
  • 39. Managing Data at Scale • Migrating to Microservices • Challenges of Data in Microservices • Challenges of Event-Driven Systems o Event Duplication o Event Ordering
  • 41. Event Duplication • Problem: The same event will be delivered more than once o Network issues o Redelivery [Diagram: Producer → Transport → Consumer. Sent: Event-1, Event-2, Event-3. Delivered: Event-1, Event-2, Event-2, Event-3]
  • 42. Event Duplication • The consumer must process an event correctly regardless of how many times it receives it [Diagram: Producer → Transport → Consumer. Sent: Event-1, Event-2, Event-3. Delivered: Event-1, Event-2, Event-2, Event-3. Processed: Event-1, Event-2, Event-3]
  • 43. Event Duplication: (A) Exactly Once Delivery • Message bus buffers messages o Message bus remembers events it has delivered, identified by message id o Only deliver event if not yet delivered [Diagram: Producer → Transport → Consumer. Sequences shown, in order: Event-1, Event-2, Event-3; Event-1 [1], Event-2 [2], Event-3 [3]; Event-1 [1], Event-2 [2], Event-2 [2], Event-3 [3]; Event-1 [1], Event-2 [2], Event-3 [3]]
  • 44. Event Duplication: (B) Idempotent Processing • Option 1: Idempotency key o Remember previously processed events, identified by idempotency key o Before processing, check whether you have processed it already o E.g., counter [Diagram: Producer → Transport → Consumer. Sent: Event-1 [aaa], Event-2 [bbb], Event-3 [ccc]. Delivered: Event-1 [aaa], Event-2 [bbb], Event-2 [bbb], Event-3 [ccc]. Processed: Event-1, Event-2, <nothing>, Event-3. Keys remembered after each delivery: aaa; aaa,bbb; aaa,bbb; aaa,bbb,ccc]
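A minimal sketch of the idempotency-key check, using the counter example from the slide. `DedupingCounter` and its `handle` method are hypothetical names, and the in-memory set is an assumption; a real consumer would need to record the key and apply the effect atomically in durable storage.

```python
class DedupingCounter:
    """Counts events, skipping any whose idempotency key was already seen."""

    def __init__(self):
        self.seen = set()  # idempotency keys of processed events
        self.count = 0

    def handle(self, idempotency_key, amount=1):
        """Return True if the event was processed, False if it was a duplicate."""
        if idempotency_key in self.seen:
            return False
        self.seen.add(idempotency_key)
        self.count += amount
        return True


def demo():
    counter = DedupingCounter()
    # Same delivery sequence as the slide: Event-2 [bbb] arrives twice.
    results = [counter.handle(key) for key in ["aaa", "bbb", "bbb", "ccc"]]
    return results, counter.count
```

Without the key check, the duplicate delivery would push the count to 4 instead of 3, which is why an increment needs this protection while a plain assignment does not.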
  • 45. Event Duplication: (B) Idempotent Processing • Option 2: Processing is inherently idempotent o Simply do the processing N times for N events o E.g., “set X to 10”, UPSERT [Diagram: Producer → Transport → Consumer. Sent: Event-1, Event-2, Event-3. Delivered: Event-1, Event-2, Event-2, Event-3. Processing: Set x:=1, Set x:=2, Set x:=2, Set x:=3]
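To illustrate why "set X to 10" needs no deduplication, the sketch below applies the slide's delivery sequence as plain assignments and compares the result with a duplicate-free run. The event shape (`key`, `value`) and the function names are assumptions.

```python
def apply_set(state, event):
    """Assignment is idempotent: applying the same event twice changes nothing."""
    state[event["key"]] = event["value"]


def run(events):
    state = {}
    for event in events:
        apply_set(state, event)
    return state


# Delivery sequence from the slide, with Event-2 duplicated.
with_duplicate = [
    {"key": "x", "value": 1},
    {"key": "x", "value": 2},
    {"key": "x", "value": 2},
    {"key": "x", "value": 3},
]
without_duplicate = [with_duplicate[0], with_duplicate[1], with_duplicate[3]]
```

Both `run(with_duplicate)` and `run(without_duplicate)` end with `x` equal to 3. This only covers adjacent duplicates; a late redelivery of an older event is an ordering problem, not a duplication problem.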
  • 46. Event Duplication: (B) Idempotent Processing • Option 3: Conflict-free Replicated Data Types (CRDTs) o Achieve agreement without explicit coordination o Custom data structures, composable processing steps o Many implementations, but still an area of active research • Common techniques o Remember what you saw (request id, idempotency key) o Remember that an item was deleted (“tombstone”) • (Do *NOT* roll your own from first principles)
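Purely to show the idea, and in keeping with the slide's warning not as something to deploy, here is a sketch of one of the simplest CRDTs, a grow-only counter. Each replica increments only its own slot, and merging takes the per-replica maximum, so merges can be repeated or applied in any order without changing the result. The class and method names are assumptions.

```python
class GCounter:
    """Grow-only counter: per-replica counts, merged by element-wise max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica_id -> count contributed by that replica

    def increment(self, n=1):
        # A replica only ever increases its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def merge(self, other):
        # Element-wise max is commutative, associative, and idempotent.
        for replica, count in other.counts.items():
            self.counts[replica] = max(self.counts.get(replica, 0), count)

    def value(self):
        return sum(self.counts.values())
```

Merging the same remote state twice leaves the value unchanged, which is what makes duplicated delivery of state updates harmless here.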
  • 47. Managing Data at Scale • Migrating to Microservices • Challenges of Data in Microservices • Challenges of Event-Driven Systems o Event Duplication o Event Ordering
  • 48. Event Ordering • Problem: Events will arrive out of order o Network issues o Processing time o Redelivery [Diagram: Producer → Transport → Consumer. Sent: Event-1, Event-2, Event-3. Delivered: Event-2, Event-1, Event-3]
  • 49. Event Ordering • The consumer must process events correctly regardless of the order in which they arrive [Diagram: Producer → Transport → Consumer. Sent: Event-1, Event-2, Event-3. Delivered: Event-2, Event-1, Event-3. Processed: Event-1, Event-2, Event-3]
  • 50. Event Ordering: (A) Impose Order • Option 1: Sequence + Reorder in the message bus o Sequence number at the producer o Message bus queues messages, waits for gaps o Bus sends to consumer in order [Diagram: Producer → Transport → Consumer. Sequences shown, in order: Event-1 [1], Event-2 [2], Event-3 [3]; Event-1 [1], Event-2 [2], Event-3 [3]; Event-1 [1], Event-2 [2], Event-3 [3]; Event-2 [2], Event-1 [1], Event-3 [3]]
  • 51. Event Ordering: (A) Impose Order • Option 2: Sequence + Reorder in the consumer o Sequence number / timestamp at the producer o Consumer reorders before processing [Diagram: Producer → Transport → Consumer. Sent: Event-1 [1], Event-2 [2], Event-3 [3]. Delivered: Event-2 [2], Event-1 [1], Event-3 [3]. Processed: Event-1 [1], Event-2 [2], Event-3 [3]]
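Reordering in the consumer can be sketched as a small buffer keyed by sequence number: events that arrive early are held until the gap before them is filled. This assumes the producer assigns contiguous sequence numbers starting at 1; the class name and fields are hypothetical. A missing event would block the buffer indefinitely, so a real consumer would also need a timeout or gap-recovery policy.

```python
class ReorderingConsumer:
    """Buffers out-of-order events and processes them in sequence order."""

    def __init__(self, first_seq=1):
        self.next_seq = first_seq  # next sequence number we may process
        self.pending = {}          # seq -> payload, held until its turn
        self.processed = []        # payloads in the order they were processed

    def receive(self, seq, payload):
        if seq < self.next_seq:
            return  # already processed; ignore the redelivery
        self.pending[seq] = payload
        # Drain every event that is now contiguous with what we processed.
        while self.next_seq in self.pending:
            self.processed.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


def demo():
    consumer = ReorderingConsumer()
    # Arrival order from the slide: Event-2, Event-1, Event-3.
    for seq, payload in [(2, "Event-2"), (1, "Event-1"), (3, "Event-3")]:
        consumer.receive(seq, payload)
    return consumer.processed
```

`demo()` returns the events in producer order, `["Event-1", "Event-2", "Event-3"]`, even though Event-2 arrived first.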
  • 52. Event Ordering: (B) Order-Independence • Option 1: Order-independent semantics o E.g., count number of events [Diagram: Producer → Transport → Consumer. Sent: Event-1, Event-2, Event-3. Delivered: Event-2, Event-1, Event-3. Processing: +1, +1, +1]
  • 53. Event Ordering: (B) Order-Independence • Option 2: Notification + Read-back o Event is a notification: object id + type of change o Consumer “reads back” to the source service to get current state [Diagram: Producer → Transport → Consumer. Sent: Event-1, Event-2, Event-3. Delivered: Event-2, Event-1, Event-3. Consumer reads: State = 2, State = 2, State = 3]
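The read-back pattern can be sketched with a source service that holds the current state and a consumer that treats each event only as a hint to re-read it. `SourceService`, `ReadBackConsumer`, and the notification arguments are assumed names; the direct method call stands in for a network request to the source service.

```python
class SourceService:
    """Owns the authoritative state of each object."""

    def __init__(self):
        self.state = {}

    def update(self, object_id, value):
        self.state[object_id] = value

    def read(self, object_id):
        return self.state[object_id]


class ReadBackConsumer:
    """Ignores event payload order; always fetches current state from the source."""

    def __init__(self, source):
        self.source = source
        self.local = {}

    def on_notification(self, object_id, change_type):
        # The notification carries only the object id and type of change.
        self.local[object_id] = self.source.read(object_id)
        return self.local[object_id]


def demo():
    source = SourceService()
    consumer = ReadBackConsumer(source)
    source.update("obj", 1)
    source.update("obj", 2)
    seen = [consumer.on_notification("obj", "updated")]      # Event-2 arrives first
    seen.append(consumer.on_notification("obj", "updated"))  # late Event-1
    source.update("obj", 3)
    seen.append(consumer.on_notification("obj", "updated"))  # Event-3
    return seen
```

`demo()` returns `[2, 2, 3]`, matching the slide: the late Event-1 cannot move the consumer back to state 1 because the read-back always returns the source's current value.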
  • 54. Event Ordering: (B) Order-Independence • Option 3: Event Sourcing o Store all events in a log o Process / interpret later [Diagram: Producer → Transport → Consumer. Sent: Event-1, Event-2, Event-3. Delivered: Event-2, Event-1, Event-3. Log: Event-2, Event-1, Event-3. Interpreted: Event-1, Event-2, Event-3]
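A small sketch of "store now, interpret later": events are appended to a log in whatever order they arrive, and the current value is derived only when needed by replaying them in producer order. The `seq` field, the three operations, and the class names are assumptions chosen so that ordering visibly matters.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    seq: int     # producer-assigned sequence number (assumed to exist)
    op: str      # "set", "add", or "multiply"
    amount: int


class EventLog:
    """Append-only log; state is computed by replaying events in seq order."""

    def __init__(self) -> None:
        self.events: list[Event] = []

    def append(self, event: Event) -> None:
        # Arrival order is preserved in the log; no interpretation happens here.
        self.events.append(event)

    def replay(self) -> int:
        value = 0
        for event in sorted(self.events, key=lambda e: e.seq):
            if event.op == "set":
                value = event.amount
            elif event.op == "add":
                value += event.amount
            elif event.op == "multiply":
                value *= event.amount
            else:
                raise ValueError(f"unknown op: {event.op}")
        return value


def demo() -> int:
    log = EventLog()
    # Arrival order from the slide: Event-2, Event-1, Event-3.
    log.append(Event(2, "add", 5))
    log.append(Event(1, "set", 10))
    log.append(Event(3, "multiply", 2))
    return log.replay()
```

`demo()` returns 30, the result of applying set 10, add 5, multiply 2 in producer order; applying the events in arrival order would have given 20.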
  • 55. Managing Data at Scale • Migrating to Microservices • Challenges of Data in Microservices • Challenges of Event-Driven Systems