18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes

Running Spark and Flink
on Kubernetes
A Case Study of Kubernetes Operators
Athens Big Data Meetup, Nov 2019
Chaoran Yu
Lightbend Inc.

Kubernetes - de facto standard for
orchestrating containers

Kubernetes Resources
● Pod
Atomic unit of scheduling in K8s. Has its own IP address.
● Deployment
Declarative updates for Pods and ReplicaSets
● PersistentVolume
Storage abstraction. Main way to move state out of containers
● Service, Ingress, StatefulSet and much more!

Custom Resource Definition (CRD)
● Extension of the Kubernetes API
● Allows the developer to leverage the API server
● Quickly prototype new features
● Modular design. Can be updated independently of the cluster.

Operator Pattern
• The operator pattern is a way of packaging an application's operational
knowledge and making it native to Kubernetes, often by defining a CRD.
• An operator is an application-specific controller that extends the Kubernetes
API to create, configure, and manage instances of complex stateful
applications on behalf of a Kubernetes user.
Control loop: OBSERVE → EVALUATE → ACT
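
The observe/evaluate/act loop can be illustrated with a toy reconciler. This is only a sketch: ClusterState and its replica counters are hypothetical stand-ins for what a real controller would read from and write to the API server, and no Kubernetes client is involved.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Hypothetical stand-in for state a controller reads from the API server."""
    desired_replicas: int   # what the custom resource asks for
    running_replicas: int   # what is actually running


def evaluate(state: ClusterState) -> int:
    """EVALUATE: positive means replicas are missing, negative means too many."""
    return state.desired_replicas - state.running_replicas


def act(state: ClusterState, diff: int) -> None:
    """ACT: take one step toward the desired state."""
    if diff > 0:
        state.running_replicas += 1
    elif diff < 0:
        state.running_replicas -= 1


def reconcile(state: ClusterState, max_iterations: int = 100) -> int:
    """Repeat observe/evaluate/act until converged; return the steps taken."""
    for step in range(max_iterations):
        diff = evaluate(state)   # OBSERVE the state, then EVALUATE the gap
        if diff == 0:
            return step
        act(state, diff)
    raise RuntimeError("state did not converge")
```

A real operator runs this loop asynchronously and is triggered by watch events rather than by polling in a for loop.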

“Driven by declarative APIs,
actuated asynchronously by
controllers”
- CRDs Aren't Just for Addons, KubeCon Seattle, Dec 2018

Apache Spark
Apache Spark is a scalable and fault-tolerant big data processing engine.
● Scales to thousands of nodes
● Runs on YARN, Mesos and Kubernetes
● Batch and streaming workloads
● Express your streaming computation the same way you would express a SQL
computation on static data:
○ The Spark SQL engine will take care of running it incrementally and continuously. It
updates results as streaming data continues to arrive.
○ Adds streaming SQL extensions, like event-time windows.

Spark on Kubernetes
./bin/spark-submit \
  --master k8s://http://127.0.0.1:8001 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=3 \
  --conf spark.kubernetes.container.image=<my-spark-image> \
  local:///opt/spark/examples/jars/spark-examples_2.11-2.4.4.jar
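
When the submission is scripted, the same command line can be assembled programmatically. The helper below is a hypothetical sketch (spark_submit_args is not part of Spark); it only builds the argument list from the flags shown above and does not run anything.

```python
import shlex


def spark_submit_args(master, name, main_class, image, executors, app_jar):
    """Build the spark-submit argument list for a cluster-mode job on Kubernetes."""
    conf = {
        "spark.executor.instances": str(executors),
        "spark.kubernetes.container.image": image,
    }
    args = [
        "./bin/spark-submit",
        "--master", master,
        "--deploy-mode", "cluster",
        "--name", name,
        "--class", main_class,
    ]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app_jar)  # the application jar always comes last
    return args


if __name__ == "__main__":
    argv = spark_submit_args(
        master="k8s://http://127.0.0.1:8001",
        name="spark-pi",
        main_class="org.apache.spark.examples.SparkPi",
        image="<my-spark-image>",
        executors=3,
        app_jar="local:///opt/spark/examples/jars/spark-examples_2.11-2.4.4.jar",
    )
    print(shlex.join(argv))  # shell-quoted form, suitable for logging
```

Keeping the arguments as a list (rather than one string) avoids quoting problems if the list is later passed to subprocess.run.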

Spark Operator
• Open source with Apache License 2.0 at
https://github.com/GoogleCloudPlatform/spark-on-k8s-operator.
• Defines two CustomResourceDefinitions (CRDs), SparkApplication and
ScheduledSparkApplication, to represent Spark jobs.
• CRDs make Spark jobs native citizens in Kubernetes.
• Streamlines the creation, management and monitoring of Spark jobs.

Spark Operator: Architecture
Spark Operator Component Diagram

Spark Operator: Features
• Enables declarative Spark job specification.
• Invokes spark-submit and supports rich configuration options.
• Supports cron-like scheduled Spark jobs.
• Pod customization with mutating admission webhook.
• Automatic job re-submission upon spec update and restart upon failure.
• Supports exporting Prometheus metrics.

Spark Operator: Installation
• Helm chart available at
https://github.com/helm/charts/tree/master/incubator/sparkoperator.
• $ helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator
• $ helm install incubator/sparkoperator

Spark Operator: Job Spec
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: default
spec:
  type: Scala
  mode: cluster
  image: "gcr.io/spark-operator/spark:v2.4.4"
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.11-2.4.4.jar"
  driver:
    cores: 0.1
    memory: "512m"
    serviceAccount: spark
  executor:
    cores: 1
    instances: 3
  restartPolicy: OnFailure
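
Because the job is just a Kubernetes object, the manifest can also be generated from code, for example to vary the executor count per environment. The function below is a hypothetical sketch that mirrors a subset of the fields from the spec on this slide; the field names are copied from the slide and are not validated against the CRD schema. kubectl accepts JSON as well as YAML, so the output can be piped to kubectl apply -f -.

```python
import json


def spark_application(name, image, main_class, app_file, executors,
                      namespace="default"):
    """Return a SparkApplication manifest as a plain dict (subset of fields)."""
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "type": "Scala",
            "mode": "cluster",
            "image": image,
            "mainClass": main_class,
            "mainApplicationFile": app_file,
            "driver": {"memory": "512m", "serviceAccount": "spark"},
            "executor": {"cores": 1, "instances": executors},
        },
    }


if __name__ == "__main__":
    manifest = spark_application(
        name="spark-pi",
        image="gcr.io/spark-operator/spark:v2.4.4",
        main_class="org.apache.spark.examples.SparkPi",
        app_file="local:///opt/spark/examples/jars/spark-examples_2.11-2.4.4.jar",
        executors=3,
    )
    print(json.dumps(manifest, indent=2))
```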

Spark Operator: Basic Operations
• Running a Spark job
  • kubectl apply -f spark-pi.yaml
• Listing all Spark jobs
  • kubectl get sparkapplications
• Getting details of a Spark job (e.g. events)
  • kubectl describe sparkapplication spark-pi
• Deleting a Spark job
  • kubectl delete sparkapplication spark-pi

Mutating Admission Webhooks
• A mutating admission webhook is a kind of admission controller that intercepts
requests to the Kubernetes API server and modifies an object before it is
persisted. Beta since K8s v1.9.
• The Spark Operator uses it to mount volumes and ConfigMaps in Spark driver and
executor pods.
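
To make the mechanism concrete, the sketch below shows the general shape of a mutating webhook's reply: an AdmissionReview whose response carries a base64-encoded JSON Patch. It illustrates the Kubernetes admission API, not the Spark Operator's actual code; mutate_pod and the sample volume are hypothetical.

```python
import base64
import json


def mutate_pod(review: dict, volume: dict, mount: dict) -> dict:
    """Return an AdmissionReview response that adds a volume and mounts it
    into every container of the pod carried in the request."""
    pod = review["request"]["object"]
    patch = []
    # JSON Patch cannot append to a list that does not exist yet.
    if "volumes" not in pod["spec"]:
        patch.append({"op": "add", "path": "/spec/volumes", "value": []})
    patch.append({"op": "add", "path": "/spec/volumes/-", "value": volume})
    for i, container in enumerate(pod["spec"]["containers"]):
        base = f"/spec/containers/{i}/volumeMounts"
        if "volumeMounts" not in container:
            patch.append({"op": "add", "path": base, "value": []})
        patch.append({"op": "add", "path": base + "/-", "value": mount})
    return {
        "apiVersion": review["apiVersion"],
        "kind": "AdmissionReview",
        "response": {
            "uid": review["request"]["uid"],   # must echo the request uid
            "allowed": True,
            "patchType": "JSONPatch",
            "patch": base64.b64encode(json.dumps(patch).encode()).decode(),
        },
    }
```

The API server applies the patch to the pod before persisting it, which is why the submitted pod spec never needs to mention the volume.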

Mounting ConfigMaps
• Specify Spark configuration by mounting files such as
spark-defaults.conf, spark-env.sh and log4j.properties as a
ConfigMap, then refer to it as .spec.sparkConfigMap in the
YAML.
• Specify Hadoop configuration by mounting core-site.xml and
hdfs-site.xml as a ConfigMap, then refer to it as
.spec.hadoopConfigMap in the YAML.
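
A minimal sketch of the two pieces involved: a ConfigMap whose data keys are the file names, and a job spec that refers to it by name. The ConfigMap name spark-conf and the file contents are hypothetical; sparkConfigMap is the field named on this slide.

```python
def config_map(name, files, namespace="default"):
    """Build a ConfigMap manifest; each key in `files` becomes a file name."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        "data": dict(files),
    }


def use_spark_config_map(spec, config_map_name):
    """Return a copy of a SparkApplication spec that references the ConfigMap."""
    updated = dict(spec)
    updated["sparkConfigMap"] = config_map_name
    return updated


if __name__ == "__main__":
    cm = config_map(
        "spark-conf",  # hypothetical name
        {"spark-defaults.conf": "spark.eventLog.enabled true\n"},
    )
    spec = use_spark_config_map({"type": "Scala", "mode": "cluster"}, "spark-conf")
    print(cm["data"].keys(), spec["sparkConfigMap"])
```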

Mounting Volumes
• When using the Spark history server, both the driver and executor pods need
to log events to the same volume.
sparkConf:
  "spark.eventLog.enabled": "true"
  "spark.eventLog.dir": "file:/mnt"
volumes:
  - name: spark-data
    persistentVolumeClaim:
      claimName: spark-hs-pvc
driver:
  volumeMounts:
    - name: spark-data
      mountPath: /mnt
executor:
  volumeMounts:
    - name: spark-data
      mountPath: /mnt
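
The point of the fragment above is that the driver and the executors must mount the same claim at the same path. A small helper can enforce that invariant when specs are generated; with_shared_event_log is a hypothetical name and the sketch operates on a plain dict, not on the operator's API types.

```python
import copy


def with_shared_event_log(spec, volume_name, claim_name, mount_path):
    """Return a copy of `spec` in which driver and executor mount the same
    PersistentVolumeClaim and Spark logs events to it."""
    out = copy.deepcopy(spec)
    out.setdefault("volumes", []).append(
        {"name": volume_name, "persistentVolumeClaim": {"claimName": claim_name}}
    )
    for role in ("driver", "executor"):
        out.setdefault(role, {}).setdefault("volumeMounts", []).append(
            {"name": volume_name, "mountPath": mount_path}
        )
    out.setdefault("sparkConf", {}).update(
        {
            "spark.eventLog.enabled": "true",
            "spark.eventLog.dir": f"file:{mount_path}",
        }
    )
    return out
```

Called as with_shared_event_log(spec, "spark-data", "spark-hs-pvc", "/mnt"), it produces the same sparkConf, volumes and volumeMounts entries as the fragment on this slide.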

Job Monitoring with Prometheus
• The Spark Operator configures the Prometheus JMX exporter to run as a
Java agent.
• The Spark Operator supports emitting two sets of metrics:
  • Driver and executor metrics (e.g. spark_driver_appStatus_jobDuration)
  • Application-level metrics (e.g. spark_app_running_count)
• To expose driver and executor metrics, the Spark application Docker image
needs to contain the Prometheus JMX exporter Java agent jar.

Enable metrics
image: "gcr.io/spark-operator/spark:v2.4.4-gcs-prometheus"
monitoring:
  exposeDriverMetrics: true
  exposeExecutorMetrics: true
  prometheus:
    jmxExporterJar: "/prometheus/jmx_prometheus_javaagent-0.11.0.jar"
    port: 8090
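
For orientation, the JMX exporter is attached to a JVM with an option of the form -javaagent:<jar>=<port>:<config>. The sketch below builds that string from the values above; whether the operator assembles exactly this string internally is an assumption, and the config file path is hypothetical.

```python
def jmx_agent_option(jar, port, config):
    """Build the JVM option that loads the Prometheus JMX exporter agent."""
    if not 0 < port < 65536:
        raise ValueError(f"invalid port: {port}")
    return f"-javaagent:{jar}={port}:{config}"


if __name__ == "__main__":
    option = jmx_agent_option(
        jar="/prometheus/jmx_prometheus_javaagent-0.11.0.jar",
        port=8090,
        config="/etc/metrics/conf/prometheus.yaml",  # hypothetical path
    )
    # An option like this would typically be passed through
    # spark.driver.extraJavaOptions and spark.executor.extraJavaOptions.
    print(option)
```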

Apache Flink
Apache Flink is an open source big data processing engine that provides the following:
● Scales to thousands of nodes.
● Runs on YARN, Mesos and Kubernetes.
● Provides checkpointing and save-pointing facilities for fault tolerance, e.g., restarting without
loss of accumulated state.
● Provides queryable state support; avoids the need for an external database to expose state
outside the app.
● Provides window semantics; enables calculation of accurate aggregations, even for out-of-order
or late-arriving data.

Flink on Kubernetes
● Session Cluster
Long-running K8s Deployment. Can run multiple Flink jobs in a cluster.
Each job needs to be submitted after the cluster is deployed.
● Job Cluster
Dedicated cluster that runs a single Flink job. Job jar is baked into the
image. No submission needed.

Flink on Kubernetes
Components:
● Job manager Deployment
● Task manager Deployment
● Job manager service
○ Enable job manager and task managers to talk to each other
○ Expose UI

Flink Operator
• Open source with Apache License 2.0 at
https://github.com/lyft/flinkk8soperator.
• Defines a CustomResourceDefinition (CRD), FlinkApplication, to represent a
Flink job.
• Uses a hybrid session-job cluster mode. A cluster is created for each single
job, which is submitted to that cluster.

Flink Operator: Job Spec
apiVersion: flink.k8s.io/v1beta1
kind: FlinkApplication
metadata:
  name: wordcount-operator-example
  namespace: flink-operator
spec:
  image: lightbend/flink-wordcount:latest
  imagePullPolicy: Always
  serviceAccountName: toned-guppy-flink
  flinkConfig:
    taskmanager.heap.size: 200
    state.backend.fs.checkpointdir: file:///checkpoints/flink/checkpoints
    state.checkpoints.dir: file:///checkpoints/flink/externalized-checkpoints
    state.savepoints.dir: file:///checkpoints/flink/savepoints
  jobManagerConfig:
    resources:
      requests:
        memory: "200Mi"
        cpu: "0.2"
    replicas: 1
  taskManagerConfig:
    taskSlots: 2
    resources:
      requests:
        memory: "200Mi"
        cpu: "0.2"
  flinkVersion: "1.8"
  jarName: "wordcount-operator-example-1.0.0-SNAPSHOT.jar"
  parallelism: 3
  entryClass: "org.apache.flink.WordCount"
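
The spec sets parallelism and taskSlots but no task manager count. Assuming one parallel subtask per slot (Flink's default slot sharing), the number of task managers needed follows from those two values; treating this as the operator's exact sizing rule is an assumption.

```python
def task_managers_needed(parallelism: int, task_slots: int) -> int:
    """Smallest number of task managers whose slots cover the job parallelism."""
    if parallelism < 1 or task_slots < 1:
        raise ValueError("parallelism and task_slots must be positive")
    # Ceiling division using integers only.
    return -(-parallelism // task_slots)


if __name__ == "__main__":
    # Values from the spec above: parallelism 3, taskSlots 2.
    print(task_managers_needed(3, 2))  # prints 2
```

With the values on this slide, two task managers provide four slots, one of which stays idle.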

Roll My Own Operator
Choose among the following frameworks for the path of least resistance:
● kubebuilder: https://github.com/kubernetes-sigs/kubebuilder
● Operator SDK: https://github.com/operator-framework/operator-sdk
To see how things really work:
● client-go: https://github.com/kubernetes/client-go
● controller-runtime: https://github.com/kubernetes-sigs/controller-runtime/

https://github.com/lightbend/cloudflow
