Mario-Leander Reimer
mario-leander.reimer@qaware.de
@LeanderReimer
Dataservices
Processing Big Data the Microservice Way
New York, Feb 27, 2018
Mario-Leander Reimer
Chief Technologist, QAware GmbH
Contact Details
Mail: mario-leander.reimer@qaware.de
Twitter: @LeanderReimer
GitHub: https://github.com/lreimer/data-services-javaee7
27.02.18
Developer && Architect
20+ years of experience
#CloudNativeNerd
Open Source Enthusiast
We want to go to the cloud …
Device
The System
Traffic Data Historical Data
Map Data Vehicle Data
The system. The data center.
Enter Dataservices.
{ Big + Fast + Smart } Data
Microservices
BIG DATA
All things distributed:
Distributed Processing
Distributed Databases
FAST DATA
Low latency and high throughput:
Stream processing
Messaging
Event-driven
SMART DATA
Data to information:
Machine (deep) learning
Advanced statistics
Natural Language Processing
Components All Along the Software Lifecycle.
DESIGN: Design Components
- Complexity unit
- Data integrity unit
- Coherent and cohesive features unit
- Decoupled unit
Design Components to Dev Components: 1:1
BUILD: Dev Components
- Planning unit
- Team assignment unit
- Knowledge unit
- Development unit
- Integration unit
Dev Components to Ops Components: n:1 (NEW!)
RUN: Ops Components
- Release unit
- Deployment unit
- Runtime unit (crash, slow-down, access)
- Scaling unit
Dev Components : Ops Components = ? : 1
System
Subsystems
Components
Services
Good starting point
Decomposition Trade-Offs
Microservices
Nanoservices
Macroservices
Monolith
+ More flexible to scale
+ Runtime isolation (crash, slow-down, …)
+ Independent releases, deployments, teams
+ Higher utilization possible
- Distribution debt: Latency
- Increasing infrastructure complexity
- Increasing troubleshooting complexity
- Increasing integration complexity
http://martinfowler.com/bliki/MonolithFirst.html
We are here.
We need to go here.
Decomposing the existing monolith was realistic.
The basic idea: Input – Processing – Output.
Data processing using a graph of microservices.
[Diagram: Sources (I1) → Message Queue → Processors (P1 … Pn) → Message Queue → Sinks (O1). Each node is a microservice (aka dataservice).]
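The input, processing, output idea can be sketched in plain Java. This is only an in-process illustration, not the demo's implementation: `LinkedBlockingQueue` stands in for a real message queue, and the names `DataservicePipeline`, `processor`, `run` and the `END` marker are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

/** Hypothetical in-process stand-in for a source -> processor -> sink graph. */
final class DataservicePipeline {

    /** End-of-stream marker; an assumption of this sketch, not a messaging standard. */
    static final String END = "__END__";

    /** Starts one processor stage: take from the input queue, transform, put on the output queue. */
    static Thread processor(BlockingQueue<String> in, BlockingQueue<String> out,
                            Function<String, String> transform) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    String msg = in.take();
                    if (END.equals(msg)) {
                        out.put(END); // forward the shutdown signal downstream
                        return;
                    }
                    out.put(transform.apply(msg));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Wires source -> Q1 -> P1 -> Q2 -> sink and returns what the sink collected. */
    static List<String> run(List<String> sourceData) {
        BlockingQueue<String> q1 = new LinkedBlockingQueue<>();
        BlockingQueue<String> q2 = new LinkedBlockingQueue<>();
        Thread p1 = processor(q1, q2, String::toUpperCase);
        List<String> sink = new ArrayList<>();
        try {
            // Source: publish every record, then the end marker.
            for (String record : sourceData) {
                q1.put(record);
            }
            q1.put(END);
            // Sink: drain the second queue until the end marker arrives.
            String msg;
            while (!END.equals(msg = q2.take())) {
                sink.add(msg);
            }
            p1.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sink;
    }
}
```

Because the stages only share queues, a stage can be replaced or scaled without touching its neighbours, which is the property the dataservice graph relies on.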
Possible messaging patterns applied for reliable and
flexible communication between dataservices.
Message Passing: P1 → Q1 → C1
Work Queue: P1 → Q1 → C1 … Cn
Publish/Subscribe: P1 → (Q1 → C1) … (Qn → Cn)
Remote Procedure Call: P1 ⇄ C1 via Q1 and Q2
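The difference between a work queue and publish/subscribe can be sketched with in-memory queues. This is a simplified illustration under assumptions: the class `MessagingPatterns` and its methods are hypothetical, and round-robin dispatch stands in for the competing consumers a real broker would serve concurrently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch of two of the messaging patterns using in-memory queues. */
final class MessagingPatterns {

    /** Work queue: consumers share ONE queue, so each message is handled by exactly one of them. */
    static List<List<String>> workQueue(List<String> messages, int consumers) {
        BlockingQueue<String> q1 = new LinkedBlockingQueue<>(messages);
        List<List<String>> received = new ArrayList<>();
        for (int i = 0; i < consumers; i++) {
            received.add(new ArrayList<>());
        }
        // Round-robin polling is a simplification of competing consumers.
        int next = 0;
        String msg;
        while ((msg = q1.poll()) != null) {
            received.get(next).add(msg);
            next = (next + 1) % consumers;
        }
        return received;
    }

    /** Publish/subscribe: every subscriber owns a queue and gets its own copy of each message. */
    static List<BlockingQueue<String>> publishSubscribe(List<String> messages, int subscribers) {
        List<BlockingQueue<String>> queues = new ArrayList<>();
        for (int i = 0; i < subscribers; i++) {
            queues.add(new LinkedBlockingQueue<>());
        }
        for (String m : messages) {
            for (BlockingQueue<String> q : queues) {
                q.add(m);
            }
        }
        return queues;
    }
}
```

A work queue spreads load across consumers, while publish/subscribe fans the same data out to independent consumers.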
The basic idea:
Cloud-native platform for micro- and dataservices.
CLUSTER OPERATING SYSTEM
MICROSERVICE PLATFORM
DATASERVICE PLATFORM
DATASERVICES
MICROSERVICES
MESSAGING
IMDG
Some Open Source Dataservice Platforms.
Java EE 7 / 8
Standardized API with several open source implementations
Microservices: JavaEE micro container
Messaging: JMS, MQTT, Kafka, SQS
Platforms: Docker, Kubernetes, OpenShift, DC/OS
Kafka Streams
Stream processing tightly integrated with Kafka
Microservices: main()
Messaging: Kafka, Kafka Streams
Platforms: any platform Kafka runs on
Lagom Framework
Open source by Lightbend
Microservices: Lagom, Play
Messaging: Akka
Platforms: ConductR, ???
Spring Cloud Data Flow
Open source project based on the Spring stack
Microservices: Spring Boot, Spring Cloud Stream & Task
Messaging: Kafka, RabbitMQ
Platforms: PCF, Kubernetes, YARN, Mesos
Overview of Java EE 7 APIs suited for Dataservices.
CDI Extensions
Web Fragments
Bean Validation 1.1
CDI 1.1
Managed Beans 1.0
JCA 1.7
JPA 2.2
JMS 2.0
JSP 2.3
EL 3.0
EJB 3.2
Batch 1.0
JSF 2.2
Interceptors 1.2
Mail 1.5
Common Annotations 1.3
JTA 1.2
JAX-WS 1.4
JAX-RS 2.0
Concurrency 1.0
JSON-P 1.0
WebSocket 1.1
JASPIC 1.1
JACC 1.5
Servlet 3.1
JCache 1.0
import java.io.ByteArrayInputStream;
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.ejb.TransactionAttribute;
import javax.ejb.TransactionAttributeType;
import javax.json.Json;
import javax.json.JsonObject;
import javax.json.JsonReader;
import javax.transaction.Transactional;
// MQTTListener, OnMQTTMessage and MqttMessage are provided by the MQTT JCA adapter and its MQTT client library.

@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "serverURIs", propertyValue = "tcp://eclipse-mosquitto:1883"),
    @ActivationConfigProperty(propertyName = "cleanSession", propertyValue = "false"),
    @ActivationConfigProperty(propertyName = "automaticReconnect", propertyValue = "true"),
    @ActivationConfigProperty(propertyName = "filePersistence", propertyValue = "false"),
    @ActivationConfigProperty(propertyName = "connectionTimeout", propertyValue = "30"),
    @ActivationConfigProperty(propertyName = "maxInflight", propertyValue = "3"),
    @ActivationConfigProperty(propertyName = "keepAliveInterval", propertyValue = "5"),
    @ActivationConfigProperty(propertyName = "topicFilter", propertyValue = "de/qaware/oss/cloud/mqtt"),
    @ActivationConfigProperty(propertyName = "qos", propertyValue = "1")
})
public class MqttSourceMDB implements MQTTListener {

    @OnMQTTMessage
    @TransactionAttribute(value = TransactionAttributeType.REQUIRED)
    @Transactional(Transactional.TxType.REQUIRED)
    public void onMQTTMessage(String topic, MqttMessage message) {
        // Parse the raw MQTT payload as JSON; try-with-resources closes the reader.
        try (JsonReader reader = Json.createReader(new ByteArrayInputStream(message.getPayload()))) {
            JsonObject jsonObject = reader.readObject();
            // TODO do stuff with the JSON payload
        }
    }
}
Simple Message Driven Beans to receive messages.
This also works for MQTT, Kafka, Amazon SQS, …
For other JCA adapters visit https://github.com/payara/Cloud-Connectors
// Producer: build the JSON payload with JSON-P and send it as a typed, versioned JMS text message.
JsonObject currentWeather = Json.createObjectBuilder()
        .add("city", "London")
        .add("weather", "Drizzle")
        .build();

StringWriter payload = new StringWriter();
try (JsonWriter jsonWriter = Json.createWriter(payload)) {
    jsonWriter.writeObject(currentWeather);
}

TextMessage msg = session.createTextMessage(payload.toString());
msg.setJMSType("CurrentWeather");
msg.setStringProperty("contentType", "application/vnd.weather.v1+json");

// Consumer: filter on JMS type and content type with a message selector.
@ActivationConfigProperty(propertyName = "messageSelector",
        propertyValue = "(JMSType = 'CurrentWeather') AND "
                + "(contentType = 'application/vnd.weather.v1+json')")

// Consumer: read the JSON payload from the message body.
JsonReader reader = Json.createReader(new StringReader(body));
JsonObject jsonObject = reader.readObject();
Use JSON-P to build your JsonObject and JsonArray instances.
Use JSON-P to read JSON payloads.
Use JSON-P to traverse and access JSON objects and arrays.
Upcoming in Java EE 8: JSON Pointer and JSON Patch add even more flexibility.
Use MIME type versioning for your JSON messages if required.
Use JMS message selectors to filter on JMS type and content type.
Alternatively, use flexible binary protocols like ProtoBuf.
Use JSON as payload format for loose coupling. Use JSON-P to implement the tolerant reader pattern.
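A tolerant reader and MIME type versioning can be sketched without any Java EE dependency. This is an assumption-laden illustration: a plain `Map` stands in for a JSON-P `JsonObject` (which is itself a `Map<String, JsonValue>`), and the class `TolerantWeatherReader`, its methods and the default values are hypothetical.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical tolerant reader for the weather message shown above. */
final class TolerantWeatherReader {

    // Matches vendor types such as application/vnd.weather.v1+json and captures the version.
    private static final Pattern VERSIONED_TYPE =
            Pattern.compile("application/vnd\\.weather\\.v(\\d+)\\+json");

    /** Returns the version encoded in the content type, or -1 if it is not a weather type. */
    static int version(String contentType) {
        Matcher m = VERSIONED_TYPE.matcher(contentType);
        return m.matches() ? Integer.parseInt(m.group(1)) : -1;
    }

    /** Reads only the fields it needs: unknown fields are ignored, missing ones get defaults. */
    static String describe(Map<String, Object> payload) {
        Object city = payload.getOrDefault("city", "unknown");
        Object weather = payload.getOrDefault("weather", "n/a");
        return city + ": " + weather;
    }
}
```

Because the reader never fails on extra fields, a producer can add attributes to the payload without forcing every consumer to be redeployed. The version in the content type is reserved for breaking changes.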
Cloud-ready runtimes suited for Dataservices.
… and many more.
Overview of the demo showcase.
[Diagram: data flow of the demo showcase]
Sources: JDBC Source, REST Source (JAX-RS, JMS), MQTT Source (JSON-P, JMS), Kafka Source (JSON-P, JMS), CSV Source (JBatch, JMS)
Processors: Weather Processor, Location Processor
Sinks: Weather File Sink, Weather DB Sink
Infrastructure: In-Memory Datagrid, Topics, Queue
Other technology labels in the diagram: JBatch, CSV, JCache, JPA
https://github.com/lreimer/data-services-javaee7
Conceptual View on Kubernetes Building Blocks.
Most important Kubernetes concepts.
Services are an abstraction for a logical collection of pods.
Pods are the smallest unit of compute in Kubernetes.
Deployments are an abstraction used to declare and update pods, RCs, …
Replica Sets ensure that the desired number of pod replicas is running.
Labels are key/value pairs used to identify Kubernetes resources.
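How a Service finds its pods through labels can be sketched as a simple map comparison. This only models equality-based selectors. The class `LabelSelector` and its methods are hypothetical and not part of any Kubernetes client API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical model of equality-based label selection. */
final class LabelSelector {

    /** A pod matches when it carries every key/value pair of the selector. */
    static boolean matches(Map<String, String> selector, Map<String, String> podLabels) {
        return podLabels.entrySet().containsAll(selector.entrySet());
    }

    /** Returns the names of all pods whose labels satisfy the selector. */
    static List<String> select(Map<String, String> selector,
                               Map<String, Map<String, String>> podsByName) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> pod : podsByName.entrySet()) {
            if (matches(selector, pod.getValue())) {
                selected.add(pod.getKey());
            }
        }
        return selected;
    }
}
```

A pod may carry more labels than the selector asks for and still match, which is what lets several Services or Deployments select overlapping sets of pods.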
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: location-processor
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        io.kompose.service: location-processor
    spec:
      containers:
      - name: location-processor
        image: lreimer/location-processor:1.0
        ports:
        - containerPort: 8080
        - containerPort: 5701
Example K8s Deployment Definition.
resources:
  # Define resources to help the K8s scheduler
  # CPU is specified in units of cores
  # Memory is specified in units of bytes
  # required resources for a Pod to be started
  requests:
    memory: "196Mi"
    cpu: "250m"
  # a container exceeding its memory limit is killed and restarted; CPU above the limit is throttled
  limits:
    memory: "512Mi"
    cpu: "500m"
Resource Constraints Definition.
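The quantity notation used above can be made concrete with a little arithmetic: "250m" means 250 millicores (a quarter of a core), and "196Mi" means 196 × 1024 × 1024 bytes. The following sketch converts such strings. The class `ResourceQuantities` is hypothetical and handles only the suffixes shown here (m, Ki, Mi, Gi), not the full Kubernetes quantity grammar.

```java
/** Hypothetical converter for a small subset of Kubernetes resource quantities. */
final class ResourceQuantities {

    /** "250m" -> 250 millicores; "1" -> 1000; "0.5" -> 500. */
    static long cpuMillis(String cpu) {
        if (cpu.endsWith("m")) {
            return Long.parseLong(cpu.substring(0, cpu.length() - 1));
        }
        return Math.round(Double.parseDouble(cpu) * 1000);
    }

    /** "196Mi" -> bytes; supports Ki, Mi, Gi and plain byte counts only. */
    static long memoryBytes(String memory) {
        if (memory.endsWith("Ki")) return number(memory) * 1024L;
        if (memory.endsWith("Mi")) return number(memory) * 1024L * 1024L;
        if (memory.endsWith("Gi")) return number(memory) * 1024L * 1024L * 1024L;
        return Long.parseLong(memory);
    }

    // Strips the two-character binary suffix and parses the remaining integer.
    private static long number(String quantity) {
        return Long.parseLong(quantity.substring(0, quantity.length() - 2));
    }
}
```

With the values above, the request of 250m CPU is exactly half of the 500m limit, and the memory request of 196Mi leaves headroom up to the 512Mi limit.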
# container will receive requests if probe succeeds
readinessProbe:
  httpGet:
    path: /api/application.wadl
    port: 8080
  initialDelaySeconds: 30
  timeoutSeconds: 5
# container will be killed if probe fails
livenessProbe:
  httpGet:
    path: /admin/health
    port: 8080
  initialDelaySeconds: 60
  timeoutSeconds: 5
Liveness and Readiness Probes for Antifragility.
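What an HTTP probe sees from the service side can be sketched with the JDK's built-in HTTP server. An `httpGet` probe counts a status code from 200 up to, but not including, 400 as success. The class `HealthEndpoint`, its `healthy` flag and the `probe` helper are hypothetical. In the demo the `/admin/health` path is served by the Java EE runtime rather than by code like this.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical health endpoint answering on /admin/health. */
final class HealthEndpoint {

    /** Flip to false to simulate an unhealthy container. */
    final AtomicBoolean healthy = new AtomicBoolean(true);
    private final HttpServer server;

    /** Starts the endpoint; port 0 lets the operating system pick a free port. */
    HealthEndpoint(int port) {
        try {
            server = HttpServer.create(new InetSocketAddress(port), 0);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        server.createContext("/admin/health", exchange -> {
            boolean ok = healthy.get();
            byte[] body = (ok ? "UP" : "DOWN").getBytes(StandardCharsets.UTF_8);
            // 200 passes the probe, 503 fails it.
            exchange.sendResponseHeaders(ok ? 200 : 503, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
    }

    int port() {
        return server.getAddress().getPort();
    }

    void stop() {
        server.stop(0);
    }

    /** Performs one GET like a probe would and returns the HTTP status code. */
    static int probe(int port, String path) {
        try {
            HttpURLConnection c = (HttpURLConnection)
                    new URL("http://127.0.0.1:" + port + path).openConnection();
            c.setConnectTimeout(5000); // mirrors timeoutSeconds: 5 from the probe definition
            c.setReadTimeout(5000);
            int status = c.getResponseCode();
            c.disconnect();
            return status;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Keeping the liveness check cheap and free of downstream dependencies avoids restarts caused by a slow database or broker. Readiness is the better place to reflect such dependencies.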
apiVersion: v1
kind: Service
metadata:
  labels:
    io.kompose.service: location-processor
  name: location-processor
spec:
  type: NodePort
  ports:
  - name: "http"
    port: 8080
    targetPort: 8080
  selector:
    io.kompose.service: location-processor
Example K8s Service Definition.
Programmable MIDI Controller.
Visualizes Deployments and Pods.
Scales Deployments.
Supports K8s, OpenShift, DC/OS.
http://github.com/qaware/kubepad/
Java EE powered Dataservices on Kubernetes in Action.
Fork me on GitHub.
https://github.com/lreimer/data-services-javaee7
Mario-Leander Reimer
mario-leander.reimer@qaware.de
@LeanderReimer
xing.com/companies/qawaregmbh
linkedin.com/company/qaware-gmbh
slideshare.net/qaware
twitter.com/qaware
youtube.com/qawaregmbh
github.com/qaware
