Data Engineer, Patterns & Architecture
The future:
Deep-dive into Microservices Patterns with Stream Process
Igor De Souza, June 2020
Copyright © 2020, Oracle and/or its affiliates. All rights reserved. |
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, and timing of any features or
functionality described for Oracle’s products remains at the sole discretion of Oracle.
Today’s Agenda
• Industry 4.0
• Data Fabric & Data Mesh
• Microservices Design Patterns
• Is Streaming a Database?
• Streaming Everywhere
During 2020/2021 the world continues to go through a Paradigm Shift into a future where “Cyber-Physical Systems” are the new normal.
“Digital Transformation” requires a mindset shift:
1. Sharing data is more effective than accumulating
2. Decentralizing, distributing, and copying is more powerful than stockpiling
3. Connectivity and flow of data is the starting point for innovation and socializing.
Real-Time Industry 4.0
…from Industry 3.0
Batch Centric, Schedulers
Hubs (EDW, Hadoop, Data Lake)
Mostly Relational Data (aka Views)
Simplex Processing is Standard
Size for Peak Workloads
Kimball / Inmon Architecture
Governance is “Bolt On”
Vendor Specific
…to Industry 4.0
Event Centric, Streams
(Edge, Hybrid, Multi-Cloud)
Polyglot Data (via Logs)
Massively Parallel is Standard
Elastic, Scale on Demand
Distributed Kappa
Governance is Embedded
Open Source Enabled
Evolution
This data pattern, popularized by Ralph Kimball and Bill Inmon, has been the foundation for enterprise data management since 1993.
It is transaction consistent, can scale up nicely for most use cases, and is based on SQL, the lingua franca for most tools.
By 2010, the Lambda (big data) pattern was common. In 2014, Jay Kreps (of LinkedIn) questioned the Lambda Architecture and spawned Kappa.
The Kappa principles consider batch processing as a special case of stream processing: use a historized event log to serve both real-time and batch processing.
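The Kappa idea above can be illustrated with a minimal sketch. It assumes a toy in-memory list standing in for the historized event log; the names `Event`, `apply_event` and `replay` are hypothetical and not taken from any framework. The point is only that "batch" is the same processing function re-run from the start of the log.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Event:
    offset: int   # position in the (hypothetical) event log
    account: str
    amount: int


def apply_event(state: Dict[str, int], event: Event) -> None:
    """Single processing function shared by the live path and the replay path."""
    state[event.account] = state.get(event.account, 0) + event.amount


def replay(log: Iterable[Event]) -> Dict[str, int]:
    """'Batch' in the Kappa sense: run the same logic again from offset 0."""
    state: Dict[str, int] = {}
    for event in log:
        apply_event(state, event)
    return state


# Toy log; a real deployment would read from a durable, ordered log instead.
log: List[Event] = [Event(0, "a", 10), Event(1, "b", 5), Event(2, "a", -3)]

# Live path: state is updated as each event arrives.
live_state: Dict[str, int] = {}
for event in log:
    apply_event(live_state, event)

# Batch path: a full replay of the retained log.
batch_state = replay(log)
```

Because both paths share one function, there is no second codebase to keep in sync, which is the main complaint Kappa raises against Lambda.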
[Diagram labels: ETL (×4), Monoliths]
Microservices are Good!
Service Mesh Revolution
The emergence and widespread use of microservices have directly led to a revolution in DevOps, massive uptake of Kubernetes and, by 2020, the Service Mesh revolution
• Key Benefits:
• Decomposition of monolithic architecture
• Modularity: smaller services and improved speed of initial development
• Independence: loosely-coupled systems that can be created using different languages or data
• With the loose coupling also comes much greater flexibility around deployment and upgrades, eliminating complex dependencies
• Flexibility at Scale: deployments may start small, run locally and later scale very wide, running across multi-cloud environments and containers
• Sidecar pattern for “Mesh” frameworks are
Domain Driven Design (DDD) principles guide developers to create microservices that align to Bounded
Contexts, which “defines tangible boundaries of applicability of some sub-domain”
Challenges with Bounded Context & DDD
Sounds hard!
Service Mesh
Do not burden my code with all
these infrastructure related
decisions
Data and Events are First-Class Citizens
Analytics, Data Science and Data Lakes are too important to my business’ Digital Transformation and Data-Driven initiatives… we need architecture focus on Data and Events too
[Diagram labels: Service, Service, Service; Application Microservices; Data Stores; Event Logs; produce events; consume events; read; write; App Events; Data Events; DB Log Events; Control Plane]
Data Stores
• “State of the Truth” at a point in time (current or historic)
• Durable storage used for Data Recovery / Archives / years of data
• Polyglot, each service may determine its own data structures
Event Logs
• “Narrative of the Truth” sequence of events (between data snapshots)
• Days/months of event data available as Time Series or Messaging
• Strict ordering of events & Idempotency
• Strong Consistency of DB logs (e.g. when using GoldenGate)
Application Microservices
• “Systems of Record” at application tier
• APIs, business rules and business object semantics
• Not durable storage
The microservice API is king!
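To make “strict ordering of events & idempotency” concrete, here is a minimal sketch of an idempotent consumer. It assumes each event carries a key and a monotonically increasing per-key offset; the class `IdempotentConsumer` and its fields are hypothetical illustrations, not part of GoldenGate or any messaging product.

```python
from typing import Dict, List, Tuple


class IdempotentConsumer:
    """Applies each (key, offset) at most once, assuming per-key ordered delivery."""

    def __init__(self) -> None:
        self.last_offset: Dict[str, int] = {}   # per-key high-water mark
        self.balances: Dict[str, int] = {}

    def handle(self, key: str, offset: int, delta: int) -> bool:
        """Apply the event once; return False for duplicates or stale replays."""
        if offset <= self.last_offset.get(key, -1):
            return False
        self.balances[key] = self.balances.get(key, 0) + delta
        self.last_offset[key] = offset
        return True


consumer = IdempotentConsumer()
deliveries: List[Tuple[str, int, int]] = [
    ("acct-1", 0, 100),
    ("acct-1", 1, -30),
    ("acct-1", 1, -30),   # redelivery of the same event
    ("acct-2", 0, 50),
]
results = [consumer.handle(*d) for d in deliveries]
```

With at-least-once delivery, the redelivered event is ignored rather than applied twice, so the consumer state matches the log no matter how many times a message is resent.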
Can we take some of the best ideas of a Service Mesh and apply them to
Data and Events, to create a kind of Data Mesh?
Microservice Mesh and Data Mesh
[Diagram labels: Immutable Raw Data Events; Prepared Data; Canonical Data; App Events; Data Events; DB Log Events; Data Domain Projection 1 … n; Application Microservices; Control Plane]
What is a Data Mesh?
Data Mesh is a data-tier architecture to integrate and govern enterprise data assets across distributed multi-cloud environments. Two defining characteristics are:
(1) De-centralized data processing; no ETL/Hubs/Lake monoliths
(2) Event-driven; real-time where possible, batch only when necessary
Microservice Patterns (microservices-centric):
• For the administration, deployment and monitoring of the core frameworks of data movement and governance
• “Sidecar Proxy” style pattern for Events and Data; aligns with Service Mesh frameworks (Kubernetes, Istio, etc.)
Log-based Integrations (immutable event logs for data integrations):
• Messaging and data store events are globally accessible via immutable event logs
• Logs may be used to drive Streaming or Batch integrations
Polyglot Data Movement (distributed data movement of all types of data):
• A data mesh moves data: Relational, NoSQL, JSON, Graph…
• Relational data consistency (ACID) during data movement
• Must work reliably with enterprise OLTP data sets
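One way to read “relational data consistency (ACID) during data movement” is that change events are applied to the target one source transaction at a time, all or nothing. The sketch below assumes a toy in-memory replica and a hypothetical `(op, key, row)` change format; it is not how any particular replication product represents changes.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical change-event shape: (operation, primary key, new row or None)
Change = Tuple[str, int, Optional[dict]]


def apply_transaction(target: Dict[int, dict], changes: List[Change]) -> None:
    """Apply every change from one source transaction, or none of them."""
    staged = dict(target)   # stage on a copy so a failure leaves target intact
    for op, key, row in changes:
        if op == "insert":
            if key in staged:
                raise ValueError(f"duplicate key {key}")
            staged[key] = row
        elif op in ("update", "delete"):
            if key not in staged:
                raise KeyError(key)
            if op == "update":
                staged[key] = row
            else:
                del staged[key]
        else:
            raise ValueError(f"unknown operation {op}")
    # Swap in the staged result only after every change succeeded.
    target.clear()
    target.update(staged)


replica: Dict[int, dict] = {}
apply_transaction(replica, [("insert", 1, {"name": "a"}), ("insert", 2, {"name": "b"})])
apply_transaction(replica, [("update", 1, {"name": "a2"}), ("delete", 2, None)])

try:
    # The second change is invalid, so the insert of key 3 must not survive.
    apply_transaction(replica, [("insert", 3, {"name": "c"}), ("update", 99, {"name": "x"})])
    failed = False
except KeyError:
    failed = True
```

The replica never exposes a half-applied transaction, which is the property that row-by-row messaging without transaction boundaries cannot guarantee.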
[Diagram labels: Data Mesh; Event Streaming; Immutable Logs; Data Replication; Polyglot Persistence; Edge / 5G Frameworks; Domain Driven Design; Service Mesh “Sidecars”]
Data Fabric or Data Mesh?
Microservice Design Patterns
Microservice Design Patterns for Data
Patterns for Microservices
Inherent to the Microservice Architecture is the developer using specific patterns. Sometimes the patterns are partially embodied in a Programming Framework, but typically the developers must choose to follow certain heuristics while programming.
This presentation’s focus:
• “Database Patterns” & “Integration Patterns” …using DB Event Replication (AKA: Change Data Capture) to improve them
• Simplify the pattern, make the microservice application more resilient and provide better data consistency guarantees
DB Patterns for Discussion:
• Database per Service (covered earlier)
• CQRS – Command Query Responsibility Segregation
• Event Sourcing
• Saga Pattern
• Transactional Outbox
• Aggregates (AKA: Domain Events)
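As an illustration of one pattern from the list, here is a minimal Transactional Outbox sketch using Python's built-in `sqlite3`. The table names, the `place_order` and `relay` functions, and the polling relay itself are assumptions made for this example; with log-based Change Data Capture, the polling relay would be replaced by reading the database log.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE outbox (seq INTEGER PRIMARY KEY AUTOINCREMENT, "
    "payload TEXT NOT NULL, published INTEGER NOT NULL DEFAULT 0)"
)


def place_order(order_id: int) -> None:
    """Write the business row and its event in ONE local transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("INSERT INTO orders (id, status) VALUES (?, 'PLACED')", (order_id,))
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"type": "OrderPlaced", "order_id": order_id}),),
        )


def relay(publish) -> int:
    """Publish unpublished outbox rows in order; returns how many were sent."""
    rows = conn.execute(
        "SELECT seq, payload FROM outbox WHERE published = 0 ORDER BY seq"
    ).fetchall()
    for seq, payload in rows:
        publish(json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE seq = ?", (seq,))
    return len(rows)


published = []
place_order(1)
place_order(2)
try:
    place_order(1)  # duplicate primary key: nothing commits, so no event is emitted
except sqlite3.IntegrityError:
    pass
sent = relay(published.append)
```

The service never performs a dual write to the database and the broker; an event exists if and only if the business change committed. Publishing happens before the row is marked, so a crash in between leads to redelivery, which is why consumers should be idempotent.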
Hype Cycle 2009
Complex Event Processing:
• CEP is a kind of computing in which incoming data about events is distilled into more useful, higher level event data that provides insight into what is happening. […] CEP is used for highly demanding, continuous-intelligence applications that enhance situation awareness and support real-time decisions. (Gartner)
20 Years Too Early?
• CEP dates back to the 1990s (history of CEP engines)
• CEP came before “Event Stream Processing” and generally has covered more complex use cases (e.g. handling of out-of-order events and more complicated correlation semantics) (what’s the difference, 2019 and mythbuster CEP vs ESP, 2008)
• Largely overtaken by Big Data stream processing technologies that are open-source, massively-parallel, and widely available as cloud-native
Stream Processing 2020
[Diagram labels: Time Series DB/OLAP; Big Data Event Stream Processing; Complex Event Processing]
Becoming more aligned to open source / Apache frameworks
Becoming more capable of rich windowing functions and time-clock correlation semantics
Complex Event Processing:
• Traditionally running in “scale-up” SMP in-memory framework
• Many CEP engines are moving toward MPP “scale-out” architectures
• Programming-centric historically, some CEP engines were also query-centric as far back as 2009
• Time-clock semantics and advanced cache may be used for “state machine” type use cases
Stream Processing:
• Built around MPP frameworks and typically Apache open-sourced
• Genesis was around simplistic use cases on high volumes of events
• All SP engines began with rudimentary windowing and correlation semantics, but most frameworks are gradually becoming more functionally comparable to classic CEP
• Simplistic windowing and caching for basic stream-clock events
Time Series Databases:
• Optimized for persistence and historic analytics on time-ordered events… data is often sourced from CEP or SP engines
Technical and Functional Differences between CEP and SP: https://complexevents.com/2019/07/15/whats-the-difference-between-esp-and-cep-2/
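The “rudimentary windowing” that stream processors started with can be sketched in a few lines. This is a plain-Python illustration of a fixed (tumbling) event-time window count; the function name `tumbling_counts` and the `(timestamp, key)` event shape are assumptions for the example and do not correspond to any Spark, Flink or KSQL API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def tumbling_counts(
    events: Iterable[Tuple[int, str]], size_s: int
) -> Dict[Tuple[int, str], int]:
    """Count events per (window start, key) using fixed, non-overlapping windows."""
    counts: Dict[Tuple[int, str], int] = defaultdict(int)
    for ts, key in events:
        window_start = (ts // size_s) * size_s   # align to the window boundary
        counts[(window_start, key)] += 1
    return dict(counts)


# (event-time seconds, key); the last event arrives out of order.
events = [
    (0, "sensor-a"),
    (4, "sensor-a"),
    (9, "sensor-b"),
    (10, "sensor-a"),
    (19, "sensor-a"),
    (3, "sensor-b"),
]
counts = tumbling_counts(events, 10)
```

Because windows are assigned by event time, the late `sensor-b` reading still lands in the window starting at 0. What this sketch leaves out (triggers, watermarks, deciding when a window is final) is exactly where the engines differ.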
Stream Processing/CEP for Event Driven Architectures
“There has been a widespread awakening to the benefits of Event-Driven Architecture (EDA) for increasing the scalability and agility of business systems. […] Stream analytics is based on the mathematics of complex-event processing (CEP). CEP is a computing technique in which incoming data about what is happening (event data) is processed as it arrives (data in motion or recently in motion) to generate higher level, more useful, summary information (complex events).”
W. Roy Schulte (of Gartner), March 2020:
EDA is Suddenly Popular Will Stream Analytics be Next?
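The definition quoted above, base events in and higher-level “complex events” out, can be shown with a small correlation rule. The rule (three failed logins by one user within 60 seconds), the function `detect_bursts` and the event tuples are hypothetical, and the sketch assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, List, Tuple


def detect_bursts(
    events: Iterable[Tuple[int, str, str]], threshold: int = 3, window_s: int = 60
) -> List[Tuple[str, int]]:
    """Emit (user, timestamp) when `threshold` failures fall within `window_s` seconds."""
    recent: Dict[str, Deque[int]] = defaultdict(deque)
    alerts: List[Tuple[str, int]] = []
    for ts, user, kind in events:
        if kind != "login_failed":
            continue
        window = recent[user]
        window.append(ts)
        # Drop failures that are older than the correlation window.
        while window and ts - window[0] > window_s:
            window.popleft()
        if len(window) >= threshold:
            alerts.append((user, ts))   # the derived, higher-level event
            window.clear()
    return alerts


events = [
    (0, "alice", "login_failed"),
    (10, "bob", "login_failed"),
    (20, "alice", "login_failed"),
    (30, "alice", "login_ok"),
    (50, "alice", "login_failed"),
    (200, "bob", "login_failed"),
    (210, "bob", "login_failed"),
]
alerts = detect_bursts(events)
```

Seven low-level events are distilled into one summary event for `alice`; `bob` never has three failures inside one window, so nothing is emitted for him.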
Event Stream Analytics (& CEP)
Data & Microservice Events
Use Cases:
• Event/Data Pipelines
• Time-Series Analysis
• Geospatial Analysis
• Real-time AI/ML
• Continuous ETL
The New Kubernetes Native
Critiques of Event Sourcing
Exposing the Persistence Tier:
• Taken too far (Why Event Sourcing is an Anti-Pattern), developers wind up using the Event Store as a Shared Persistence model, and other microservices now have hard-coupled bindings to the message formats of the originating service
Whole System Fallacy:
• Some microservices leaders (Udi and Greg Reach CQRS Agreement) say to narrow the aperture on when to use CQRS + Event Sourcing → only within a Business Component and a Single Bounded Context
• Minimizes utility of pattern for Communications
Forcing Eventual Consistency on Developers:
• The propensity to over-use CQRS & Event Sourcing at the whole system level forces developers to manage eventual consistency in the Application tiers (What they don’t tell you about event sourcing)
• “…they will make your life a living hell” doing DevOps, debugging and system recovery when a “Mesh” of services are interacting via Event Store and message signatures can lead to disaster
Is Streaming a Database?
• Kstore
• Kcache
• Kareldb
• KSQL
• SparkSQL
• Flink SQL
• Stream Java & Scala
• Oracle 20c - Transactional Event Queues (TEQ)
• Martin Kleppmann | Kafka Summit SF 2018
Turning database inside-out
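One common reading of “inside-out” is that a table is just a materialized view over a changelog stream. The sketch below illustrates that duality with plain Python; `materialize`, `compact` and the `(key, value-or-None)` event shape are assumptions for the example, with `None` standing in for a delete marker.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical changelog entry: (key, new value), where None marks a delete.
ChangeEvent = Tuple[str, Optional[int]]


def materialize(changelog: Iterable[ChangeEvent]) -> Dict[str, int]:
    """Fold a changelog stream into the table (current state) it describes."""
    table: Dict[str, int] = {}
    for key, value in changelog:
        if value is None:
            table.pop(key, None)
        else:
            table[key] = value
    return table


def compact(changelog: Iterable[ChangeEvent]) -> List[ChangeEvent]:
    """Keep only the latest entry per key, ordered by when it was last written."""
    latest: Dict[str, Optional[int]] = {}
    for key, value in changelog:
        latest.pop(key, None)   # re-insert so dict order follows the last update
        latest[key] = value
    return list(latest.items())


changelog: List[ChangeEvent] = [("x", 1), ("y", 2), ("x", 3), ("y", None), ("z", 9)]
table = materialize(changelog)
compacted = compact(changelog)
```

The compacted log is shorter but rebuilds the same table, which is why a retained, compacted stream can stand in for a database snapshot in this model.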
Streaming Everywhere
Spark, Flink or KSQL
Copyright © 2019 Oracle and/or its affiliates.
[best] ˜œ›™[worst]
Spark
Streaming
Apache
Flink / SQL
Confluent
KSQL
User Experience
Low Code Development (with built-in patterns/accelerators) ™  
Interactive/Live Edits (browser based, view changes immediately) ™ ˜ 
Built-in Live Dashboards (event-driven charts/graphs)   ™
Core Streaming Semantics
What is Being Computed (transforms, joins, flatten, statefulness etc) › œ œ
Time Windows(global, fixed, sliding, tumbling, custom etc) › ˜ ˜
When in Processing Time (triggers – event, time, count, timers, etc)  œ 
How do Refinements Relate (discarding, accumulating, retracting) › œ 
Analytics
Robust CEP Capabilities (complex event correlations, native time clock) ™  
Geo-Fencing & Spatial (lat/long, built in maps, custom map tiles, etc) ™ ™ ™
Machine Learning (native scala, PMML, python support etc) œ œ 
Time Series Analysis (built-in interval patterns, thresholding etc)   
Other Features
Backpressure (dynamic ingest per pipeline) Custom Custom Custom
State Management (automation across streams & native cache) N/A RocksDB RocksDB
Data Consistency (OLTP Change Events, Inserts/Updates/Deletes) Custom Custom Custom
GoldenGate Stream Type (aware of SCN/CSN, transactions, order, etc) Custom Custom Custom
Evolution towards Real-Time Data Mesh
mesh & microservice controls
[Diagram labels: ETL (×4)]
This is not a Metamorphosis, it is a Paradigm Shift
Data success factors that did well in Industry 3.0 will not be the factors that create success in Industry 4.0
The Success Paradox
Next Gen Data Architecture
ETL Vendors
1990–2010’s Gen1:
• Replication
• Messaging
• Streaming
• Pipelines
Next-Gen has new DNA not tied to old ETL tools
It is impossible to evolve older Batch Processing tools into a modern Event-Centric Stream Processing solution; the underlying paradigms are fundamentally different
@Igfasouza
