qaware.de
Architecting and Building
a Kubernetes-based AI Platform
Mario-Leander Reimer
mario-leander.reimer@qaware.de
@LeanderReimer @qaware
#CloudNativeNerd #gerneperdude
Mario-Leander Reimer
Managing Director | CTO
@LeanderReimer
#cloudnativenerd #qaware
#gernperDude
2019
"Too much cognitive load will become
a bottleneck for fast flow and high
productivity for many DevOps teams."
Team Topologies: Organizing Business and Technology Teams for Fast Flow
Platform engineering is the discipline of designing and building
toolchains and workflows that enable self-service capabilities for
software engineering organizations in the cloud-native era.
Platform engineers provide an integrated product most often
referred to as an “Internal Developer Platform” covering the
operational necessities of the entire lifecycle of an application.
https://platformengineering.org/blog/what-is-platform-engineering
An example reference architecture for an IDP.
Developer Control Plane
Integration and Delivery Plane
Monitoring and Logging Plane
Security Plane
IDE
Service Catalog / API Catalog
Developer Portal
Application Source Code
Infrastructure & Platform Source Code
Observability
Secrets & Identity Manager
CI Pipeline
Registry
CD Pipeline
Resource Plane
Compute
Data
Integration
Networking
Platform Orchestrator
Certificates & Encryption
GitOps
https://humanitec.com/reference-architectures
2025
A wave is coming!
Agentic AI
Software engineering agents
Domain-specific agentic workloads
... and we have the perfect surfboard!
The logical continuation:
a. From applications to microservices to AI agents
b. From on-prem to cloud platforms to AI platforms
Micro-Agent
GenAI Usage
Prompts, Flow control
Tools (MCP)
Response contains
calls to
OpenAI API
❏ Clear responsibility
❏ Vertically scoped by expertise
❏ Manageable in size
❏ Potentially reusable
Micro-Agent
A2A
AI agents will be implemented according to the
microservice architecture paradigm.
…
…
…
Tool Server
Business Logic
LLM, LAM, SLM, domain-specific foundation models
?
SSE
HTTP
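A minimal sketch of the micro-agent idea, assuming a single clearly scoped responsibility and a small tool registry. It is not the author's implementation and does not use MCP, A2A, or the OpenAI API; `MicroAgent`, `ToolCall`, `fake_model`, and `order_status` are hypothetical names, and the model is a stub that returns tool calls so the flow control stays visible.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToolCall:
    """A request from the model to invoke a named tool."""
    name: str
    arguments: Dict[str, str]


@dataclass
class MicroAgent:
    """One agent with a single, clearly scoped responsibility (hypothetical)."""
    name: str
    system_prompt: str
    # Stand-in for a model endpoint: takes a prompt, returns requested tool calls.
    model: Callable[[str], List[ToolCall]]
    tools: Dict[str, Callable[..., str]] = field(default_factory=dict)

    def register_tool(self, name: str, fn: Callable[..., str]) -> None:
        self.tools[name] = fn

    def handle(self, user_input: str) -> List[str]:
        """Flow control: prompt the model, then run each tool call it returns."""
        prompt = f"{self.system_prompt}\n{user_input}"
        results: List[str] = []
        for call in self.model(prompt):
            tool = self.tools.get(call.name)
            if tool is None:
                results.append(f"unknown tool: {call.name}")
                continue
            results.append(tool(**call.arguments))
        return results


def fake_model(prompt: str) -> List[ToolCall]:
    """Stub model (assumption): always asks for the order-status tool."""
    order_id = prompt.rsplit(" ", 1)[-1]
    return [ToolCall(name="order_status", arguments={"order_id": order_id})]


def order_status(order_id: str) -> str:
    """Stub tool (assumption): a real tool server would hold business logic."""
    return f"order {order_id}: shipped"


agent = MicroAgent(
    name="order-agent",
    system_prompt="You answer order questions.",
    model=fake_model,
)
agent.register_tool("order_status", order_status)
print(agent.handle("Where is order 4711"))  # ['order 4711: shipped']
```

Keeping the model and the tools behind plain callables means each agent can be deployed and replaced on its own, in the same way as a microservice.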
Why do we need an AI platform?
"According to Gartner, 80% of AI PoCs
fail on their way into productive use."
https://www.qaware.de/ki-vom-proof-of-concept-poc-zur-entwicklung/
The 80% Fallacy of AI projects.
Juan Pablo Bottaro, LinkedIn Engineering Blog
Key challenges: technology, models and tools, scaling.
Source: https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai-in-2023-generative-ais-breakout-year
■ The challenges differ depending on the maturity of the group
■ AI newcomers often underestimate the complexity of technologies, models and tools
■ Production and scaling challenges often hinder production readiness
■ High cognitive load and lack of expertise also drive project failure
Our proposal for an
AI Platform Reference Architecture
Platform Plane
Observability
Operability
Resource Plane
Compute
Data
Integration
Security
Delivery
FinOps
Integration & Delivery Plane
Quality Plane
Data Plane
Model Plane
Compliance Plane
Service Plane
User Serving Plane
Access Plane / APIs
Orchestration Plane
Data Modelling Plane
lreimer/k8s-native-ai-platform
lreimer/k3s-ai-platform
The Kubernetes cluster topology requires precise planning.
Otherwise the costs will go through the roof!
■ GPU machine types differ
■ Not all types are available in all regions
■ Prices vary drastically; careful research is recommended
■ Additional local SSDs are recommended
■ To be decided:
– all nodes with GPU
– separate nodes optimised for regular and for GPU workloads
https://cloud.google.com/compute/gpus-pricing?hl=de#other-gpu-models
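The topology decision above can be made concrete with a rough cost comparison. The sketch below is illustrative only: the hourly prices and node counts are placeholders, not real quotes from any provider, and `NodePool` and `topology_cost` are hypothetical helper names. Actual prices should be taken from the provider's pricing page for the chosen region.

```python
from dataclasses import dataclass
from typing import Iterable

HOURS_PER_MONTH = 730  # common approximation of hours in an average month


@dataclass(frozen=True)
class NodePool:
    """A group of identical nodes (hypothetical helper, not a Kubernetes API)."""
    name: str
    node_count: int
    hourly_price: float  # price per node per hour; PLACEHOLDER values below

    def monthly_cost(self) -> float:
        return self.node_count * self.hourly_price * HOURS_PER_MONTH


def topology_cost(pools: Iterable[NodePool]) -> float:
    """Total monthly cost of a cluster topology made of several node pools."""
    return sum(pool.monthly_cost() for pool in pools)


# Placeholder prices (assumptions), chosen only to show the shape of the trade-off.
GPU_NODE_PRICE = 3.00
CPU_NODE_PRICE = 0.50

# Option 1: every node carries a GPU.
all_gpu = [NodePool("gpu", node_count=6, hourly_price=GPU_NODE_PRICE)]

# Option 2: regular workloads on CPU nodes, GPU workloads on a small GPU pool.
mixed = [
    NodePool("cpu", node_count=4, hourly_price=CPU_NODE_PRICE),
    NodePool("gpu", node_count=2, hourly_price=GPU_NODE_PRICE),
]

print(f"all GPU nodes: {topology_cost(all_gpu):.2f} per month")
print(f"mixed pools:   {topology_cost(mixed):.2f} per month")
```

With these placeholder numbers the mixed topology is considerably cheaper, but the result depends entirely on how many workloads actually need a GPU.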
Compliance Plane
Integration & Delivery Plane
Service Plane
Platform Plane
Operability
Resource Plane
Compute
Data: Local SSD
Integration
Security
Delivery
FinOps
Quality Plane
Data Plane
Model Plane
User Serving Plane
Access Plane
Data Modelling Plane
https://www.agentic-layer.ai/
agentic-layer/
The technical backbone for smart workloads.
Ready to go Agentic?
Stay up-to-date with the Agentic Layer Newsletter!
With your newsletter subscription, you not only stay up to date but also have the chance
to win tickets for top tech conferences like the KI Navigator or the CLC. We look forward
to continuing our discussion about Agentic AI with you!
What’s your take on Agentic AI?
Tell us where you're stuck or curious – and how you'd like
to dive deeper into the topic.
QAware GmbH | Aschauer Straße 30 | 81549 München | Managing Directors: Dr. Josef Adersberger, Michael Stehnken, Michael Rohleder, Mario-Leander Reimer
Offices in München, Mainz, Rosenheim, Darmstadt | +49 89 232315-0 | info@qaware.de
The next step?
Let's talk!
Mario-Leander Reimer
Managing Director, CTO
mario-leander.reimer@qaware.de
+49 151 61314748
