@nileshgule
Improve Monitoring and Observability for Kubernetes with OSS tools
$whoami
{
  "name": "Nilesh Gule",
  "website": "https://www.HandsOnArchitect.com",
  "github": "https://GitHub.com/NileshGule",
  "twitter": "@nileshgule",
  "linkedin": "https://www.linkedin.com/in/nileshgule",
  "likes": "Technical Evangelism, Cricket",
  "co-organizer": "Azure Singapore UG"
}
Pre-requisites
Docker
❖ Self-contained application with all its dependencies
Kubernetes
❖ Orchestrates containers
❖ Self healing
❖ Service discovery
❖ Scaling
Cloud Native Applications
❖ Scalable apps in dynamic environments (public / private / hybrid clouds)
❖ Exemplified by containers, service meshes, microservices, immutable infrastructure & declarative APIs
❖ Loosely coupled systems, resilient, observable & manageable
❖ Robust automation
CNCF cloud trail
https://github.com/cncf/trailmap
CNCF Observability landscape
https://landscape.cncf.io
CNCF Observability Radar
https://radar.cncf.io/2020-09-observability
3 Pillars of Observability
Logs Metrics Traces
Centralized Logging
Why centralized logging
Container based workloads
❑ Application specific
❖ Long term log retention for compliance reasons
❖ Workloads scheduled on different nodes during application restarts / updates
❖ Autoscaling workloads
❑ Kubernetes upgrades
❖ Auto healing can reschedule workloads
❖ Underlying nodes added / deleted during cluster scaling
❖ Underlying nodes replaced during cluster upgrades
PaaS / Serverless services
❖ Not much control over underlying infra
❖ Relies on cloud provider specific logging and monitoring solution
Tech Talks EFK integration
Log collector | Log storage | Log search, visualise, dashboards
rabbitmq-producer-service rabbitmq-consumer-deployment
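One common way to make application logs easy for a log collector to pick up is to write them as one JSON object per line. The sketch below is an illustration only, not the demo's actual code: the `JsonFormatter` class, the `build_logger` helper and the field names are assumptions chosen for this example, and it uses only the Python standard library.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Attach the traceback text when the record carries exception info.
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str, stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream.

    Hypothetical helper for this sketch; in a container the stream
    would normally be stdout so the node-level collector can read it.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger("rabbitmq-producer-service", sys.stdout)
    log.info("published %d messages", 10)
```

Writing to stdout rather than to a file inside the container means the logs survive in the central store even when the pod is rescheduled onto another node.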
Demo 1 – Log Aggregation with EFK
Monitoring and Alerting
Why Monitoring & Alerting
Container based workloads
• Application specific
  • Monitor resource usage
  • Monitor scaling needs
  • Monitor anomalies / outliers
• Kubernetes platform level
  • Monitor cluster resources (CPU / RAM)
  • API health
  • Autoscaling
PaaS / Serverless services
• Monitor resource usage
• Scaling
• Bottlenecks
Prometheus Architecture
Demo 2 – Metrics using Prometheus & Grafana
Spring Boot Conference App integration
https://github.com/NileshGule/spring-boot-conference-app/tree/mssql-server
conference-demo-service-monitor
conference-demo-service
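Prometheus collects metrics by scraping an HTTP endpoint on the service that returns plain text in its exposition format. The sketch below shows roughly what a counter looks like in that format. It is a teaching aid under stated assumptions, not the conference app's code: the `Counter` class and the metric name `http_requests_total` are made up for this example, label values are not escaped, and a real service would normally use an official client library instead.

```python
from dataclasses import dataclass, field


@dataclass
class Counter:
    """A minimal, illustrative counter that renders Prometheus text format."""

    name: str
    help_text: str
    # Maps a sorted tuple of (label, value) pairs to the running total.
    values: dict = field(default_factory=dict)

    def inc(self, amount: float = 1.0, **labels: str) -> None:
        """Increase the counter for the given label combination."""
        if amount < 0:
            raise ValueError("counters can only increase")
        key = tuple(sorted(labels.items()))
        self.values[key] = self.values.get(key, 0.0) + amount

    def render(self) -> str:
        """Return the HELP and TYPE lines followed by one sample per label set."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self.values.items()):
            if key:
                # Note: label values are not escaped in this sketch.
                label_text = ",".join(f'{k}="{v}"' for k, v in key)
                lines.append(f"{self.name}{{{label_text}}} {value}")
            else:
                lines.append(f"{self.name} {value}")
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    requests_total = Counter("http_requests_total", "Total HTTP requests")
    requests_total.inc(method="GET")
    requests_total.inc(method="POST")
    print(requests_total.render(), end="")
```

Labels such as `method` are what later allow Grafana dashboards to slice one metric by dimension.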
Exception Handling
Sentry Architecture
https://develop.sentry.dev/architecture/
Spring Boot Sentry integration
conference-demo-service
Managed Kubernetes cluster
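The core idea behind exception aggregation is to group repeated occurrences of the same failure under one issue instead of listing every event separately. The sketch below illustrates that idea with a simple fingerprint built from the exception type and the place it was raised. This is an assumption made for illustration and is not Sentry's actual grouping algorithm; `fingerprint` and `ExceptionAggregator` are hypothetical names.

```python
import hashlib
import traceback


def fingerprint(exc: BaseException) -> str:
    """Derive a stable group key from the exception type and raise site."""
    frames = traceback.extract_tb(exc.__traceback__)
    # Use the innermost frame, which is where the exception was raised.
    location = f"{frames[-1].name}:{frames[-1].lineno}" if frames else "unknown"
    raw = f"{type(exc).__name__}|{location}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:12]


class ExceptionAggregator:
    """Count occurrences per fingerprint and keep one sample message each."""

    def __init__(self) -> None:
        self.counts = {}
        self.samples = {}

    def capture(self, exc: BaseException) -> str:
        """Record one occurrence and return the group key it was filed under."""
        key = fingerprint(exc)
        self.counts[key] = self.counts.get(key, 0) + 1
        # Keep only the first message seen for each group.
        self.samples.setdefault(key, f"{type(exc).__name__}: {exc}")
        return key


if __name__ == "__main__":
    aggregator = ExceptionAggregator()
    for value in ("a", "b"):
        try:
            int(value)
        except ValueError as error:
            aggregator.capture(error)
    for key, count in aggregator.counts.items():
        print(key, count, aggregator.samples[key])
```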
Demo 3 – Exception aggregation using Sentry
End to End Observability
Observability challenges
➢ Too many telemetry agents
➢ Instrumentation of Apps
➢ Dynamic & small units in Cloud Native Applications
➢ Right retention period for each type of metric and usage
➢ Minimize vendor or feature lock-in
➢ Buy vs Build
➢ Transition from Monitoring to Observability
➢ Single pane of glass for consuming different information
➢ Correlation of signals
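Correlation of signals usually depends on every record carrying a shared identifier, such as a trace ID. The sketch below groups records from different sources by that ID. The record shape (dictionaries with a `trace_id` key) and the `correlate` function are assumptions made for this illustration, not part of any specific tool.

```python
from collections import defaultdict


def correlate(logs, spans, errors):
    """Group records from three signal sources by their shared trace_id.

    Each record is assumed to be a dict. Records without a trace_id
    cannot be correlated and are skipped.
    """
    by_trace = defaultdict(lambda: {"logs": [], "spans": [], "errors": []})
    sources = (("logs", logs), ("spans", spans), ("errors", errors))
    for kind, records in sources:
        for record in records:
            trace_id = record.get("trace_id")
            if trace_id is None:
                continue
            by_trace[trace_id][kind].append(record)
    return dict(by_trace)


if __name__ == "__main__":
    result = correlate(
        logs=[{"trace_id": "t1", "message": "order received"}],
        spans=[{"trace_id": "t1", "name": "POST /orders", "duration_ms": 42}],
        errors=[{"trace_id": "t1", "type": "TimeoutError"}],
    )
    print(result["t1"])
```

Without a common identifier, each signal has to be matched by timestamp and guesswork, which is the gap that open standards for context propagation aim to close.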
Analogy - Use the right tool for the right purpose
Summary
✓ Use the best-in-class tool for a given use case
✓ Rely on open standards (e.g. OpenTelemetry)
✓ Build portable observability systems (e.g. hybrid cloud migration)
Log Aggregation
✓ EFK stack helps in centralized logging
✓ Kibana is used to visualize logs and build dashboards
Monitoring & Alerting
✓ Prometheus provides easy-to-use metrics for platforms and applications
✓ Grafana provides visualization capabilities to build intuitive dashboards
Exception Aggregation
✓ Sentry provides Exception Aggregation capabilities
✓ Excellent telemetry data captured by Sentry to help diagnose problems
Some Recommendations
Challenges | Tools
♣ Too many agents | ∞ OpenTelemetry collector
♣ Instrumentation, vendor lock-in | ∞ OpenTelemetry, OpenMetrics
♣ Cloud native logs | ∞ Fluent Bit / Fluentd, OpenSearch, Loki
♣ Cloud native metrics | ∞ Prometheus, Cortex, Thanos
♣ Cloud native traces | ∞ OpenTelemetry, Jaeger, Grafana
♣ Single pane of glass, correlation | ∞ Grafana
References
Log Aggregation
❖ Elastic stack
❖ Kibana
❖ Fluentbit
Monitoring & Alerting
❖ Prometheus
❖ Grafana
❖ Kube Prometheus stack
❖ Dynatrace – Monitoring vs Observability
❖ Houssem Dellai – Prometheus & Grafana for monitoring Kubernetes
Sentry
❖ Sentry docs
Source Code & slide deck
Tech Talks
https://github.com/NileshGule/pd-tech-fest-2019
Observability & Monitoring markdown
Conference app
https://github.com/NileshGule/spring-boot-conference-app/tree/mssql-server
https://speakerdeck.com/nileshgule/
https://www.slideshare.net/nileshgule/
Nilesh Gule
ARCHITECT | MICROSOFT MVP
“Code with Passion and
Strive for Excellence”
nileshgule
@nileshgule Nilesh Gule
NileshGule
www.handsonarchitect.com
https://bit.ly/youtube-nileshgule
Q&A
