SlideShare a Scribd company logo
Confidential do not distribute 1
Generative AI Automation
for private Enterprise LLMs
Part 1: LM-Controller
Confidential do not distribute 2
● AI Models and Applications are the new class of Kubernetes
workloads
● We start tackling this from LLMs
● Enterprise already invested in CPU-based Kubernetes clusters
Enterprise AI workloads
Confidential do not distribute 3
● AI Application Developers shouldn’t worry about the complexity of
model deployment.
● Platform Teams: LLMs become platform components
○ Security and Governance: signing and verification
○ RBAC and Tenancy
○ Standardization across organizations
○ Available for the Dev teams via self-service portals
Why Weave AI?
Confidential do not distribute 4
● Day 0 - Out-of-the-box experiences
○ weave-ai install
○ weave-ai run zephyr-7b-beta
● Day 1 - Integrate them to your DevOps / GitOps pipelines
○ weave-ai install --export
● Day 2 - Build and maintain model catalog for the Dev teams
○ flux commands
○ Fine-tuning models / RAG data pipelines
Why Weave AI?
Confidential do not distribute 5
Confidential do not distribute 6
● The first controller released as part of the Weave AI Controllers
● LM Controller is a Flux controller that helps deploy Large
Language Models on Kubernetes.
● It supports LLMs in the Flux OCI format.
● It uses Flux Source Controller as the in-cluster model cache.
What is LM Controller?
Confidential do not distribute 7
LLMs are snowflakes
Confidential do not distribute 8
Hugging Face
Compatible Models
GitHub / GitLab
CI
Your App
LLM Serving
Your Data CPU or GPU
on Cloud
or
on-Prem
fine-tuning
store
packaged
pulled
deploy
context
manage
LLM as Flux OCI
Confidential do not distribute 9
Why use LM Controller?
LLM Serving
LLMs
injects
all required information
to the deployment units
LM Controller
Confidential do not distribute 10
● A curated list of LLM catalog
○ In Flux’s OCI format
● Flux’s Source Controller as in-Cluster model Cache
○ No PVC required
● A controller that takes care of this and that LLM parameters for you
● A set of pre-built OpenAI API Compatible engines
○ No-AVX, AVX, AVX2, AVX512 and more to come
● An easy-to-use CLI
What Weave AI provides so far
Confidential do not distribute 11
It’s Demo Time

More Related Content

Similar to Weave AI Controllers (Weave GitOps Office Hours) (20)

PDF
AIPyCraft: AI-Assisted Software Development Lifecycle for 6G Blockchain Oracl...
Antonio Marcos Alberti
 
PDF
Coding with AI - Understanding LLMs and how to use them
Arnon Rotem-Gal-Oz
 
PPTX
[KZ] Web Ecosystem with Multimodality of Gemini.pptx
asemaialmanbetova
 
PDF
Intro to Generative-AI(Gen AI Study Jams GDGC ZHCET)
fiza1892003
 
PDF
20240411 QFM009 Machine Intelligence Reading List March 2024
Matthew Sinclair
 
PPTX
Multimodel_LLM_for_Content_Generation.pptx
aagamshah0812
 
PDF
20221130 - Luxembourg HUG Meetup
Stéphane Este-Gracias
 
PPTX
[DSC Europe 24] Tomislav Tipuric - Exploring LLMs across clouds – A Year in t...
DataScienceConferenc1
 
PPTX
[DSC DACH 24] AI and XR - Ivan Voras
DataScienceConferenc1
 
PDF
Generative AI on Enterprise Cloud with NiFi and Milvus
Timothy Spann
 
PDF
KubeCon & CloudNative Con 2024 Artificial Intelligent
Emre Gündoğdu
 
PDF
Webinar: Capabilities, Confidence and Community – What Flux GA Means for You
Weaveworks
 
PDF
Omniverse for the Metaverse
Alison B. Lowndes
 
PDF
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Lablup Inc.
 
PDF
From Traction to Production Maturing your LLMOps step by step
Maxim Salnikov
 
PDF
Generative AI for the rest of us
Massimo Ferre'
 
PDF
Building and deploying LLM applications with Apache Airflow
Kaxil Naik
 
PDF
Overview of Artificial Intelligence - Technology
NickDAgostino3
 
PDF
Implementing AI: Running AI at the Edge
KTN
 
PDF
"Scaling ML from 0 to millions of users", Julien Simon, AWS Dev Day Kyiv 2019
Provectus
 
AIPyCraft: AI-Assisted Software Development Lifecycle for 6G Blockchain Oracl...
Antonio Marcos Alberti
 
Coding with AI - Understanding LLMs and how to use them
Arnon Rotem-Gal-Oz
 
[KZ] Web Ecosystem with Multimodality of Gemini.pptx
asemaialmanbetova
 
Intro to Generative-AI(Gen AI Study Jams GDGC ZHCET)
fiza1892003
 
20240411 QFM009 Machine Intelligence Reading List March 2024
Matthew Sinclair
 
Multimodel_LLM_for_Content_Generation.pptx
aagamshah0812
 
20221130 - Luxembourg HUG Meetup
Stéphane Este-Gracias
 
[DSC Europe 24] Tomislav Tipuric - Exploring LLMs across clouds – A Year in t...
DataScienceConferenc1
 
[DSC DACH 24] AI and XR - Ivan Voras
DataScienceConferenc1
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Timothy Spann
 
KubeCon & CloudNative Con 2024 Artificial Intelligent
Emre Gündoğdu
 
Webinar: Capabilities, Confidence and Community – What Flux GA Means for You
Weaveworks
 
Omniverse for the Metaverse
Alison B. Lowndes
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Lablup Inc.
 
From Traction to Production Maturing your LLMOps step by step
Maxim Salnikov
 
Generative AI for the rest of us
Massimo Ferre'
 
Building and deploying LLM applications with Apache Airflow
Kaxil Naik
 
Overview of Artificial Intelligence - Technology
NickDAgostino3
 
Implementing AI: Running AI at the Edge
KTN
 
"Scaling ML from 0 to millions of users", Julien Simon, AWS Dev Day Kyiv 2019
Provectus
 

More from Weaveworks (20)

PDF
Flamingo: Expand ArgoCD with Flux (Office Hours)
Weaveworks
 
PDF
Six Signs You Need Platform Engineering
Weaveworks
 
PDF
SRE and GitOps for Building Robust Kubernetes Platforms.pdf
Weaveworks
 
PDF
Webinar: End to End Security & Operations with Chainguard and Weave GitOps
Weaveworks
 
PDF
Flux Beyond Git Harnessing the Power of OCI
Weaveworks
 
PDF
Automated Provisioning, Management & Cost Control for Kubernetes Clusters
Weaveworks
 
PDF
How to Avoid Kubernetes Multi-tenancy Catastrophes
Weaveworks
 
PDF
Building internal developer platform with EKS and GitOps
Weaveworks
 
PDF
GitOps Testing in Kubernetes with Flux and Testkube.pdf
Weaveworks
 
PDF
Intro to GitOps with Weave GitOps, Flagger and Linkerd
Weaveworks
 
PDF
Implementing Flux for Scale with Soft Multi-tenancy
Weaveworks
 
PDF
Accelerating Hybrid Multistage Delivery with Weave GitOps on EKS
Weaveworks
 
PDF
The Story of Flux Reaching Graduation in the CNCF
Weaveworks
 
PDF
Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...
Weaveworks
 
PDF
Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...
Weaveworks
 
PDF
Flux’s Security & Scalability with OCI & Helm Slides.pdf
Weaveworks
 
PDF
Flux Security & Scalability using VS Code GitOps Extension
Weaveworks
 
PDF
Deploying Stateful Applications Securely & Confidently with Ondat & Weave GitOps
Weaveworks
 
PDF
Robust Network Security and Observability with GitOps and Cilium
Weaveworks
 
PDF
Intro to GitOps & Flux.pdf
Weaveworks
 
Flamingo: Expand ArgoCD with Flux (Office Hours)
Weaveworks
 
Six Signs You Need Platform Engineering
Weaveworks
 
SRE and GitOps for Building Robust Kubernetes Platforms.pdf
Weaveworks
 
Webinar: End to End Security & Operations with Chainguard and Weave GitOps
Weaveworks
 
Flux Beyond Git Harnessing the Power of OCI
Weaveworks
 
Automated Provisioning, Management & Cost Control for Kubernetes Clusters
Weaveworks
 
How to Avoid Kubernetes Multi-tenancy Catastrophes
Weaveworks
 
Building internal developer platform with EKS and GitOps
Weaveworks
 
GitOps Testing in Kubernetes with Flux and Testkube.pdf
Weaveworks
 
Intro to GitOps with Weave GitOps, Flagger and Linkerd
Weaveworks
 
Implementing Flux for Scale with Soft Multi-tenancy
Weaveworks
 
Accelerating Hybrid Multistage Delivery with Weave GitOps on EKS
Weaveworks
 
The Story of Flux Reaching Graduation in the CNCF
Weaveworks
 
Shift Deployment Security Left with Weave GitOps & Upbound’s Universal Crossp...
Weaveworks
 
Securing Your App Deployments with Tunnels, OIDC, RBAC, and Progressive Deliv...
Weaveworks
 
Flux’s Security & Scalability with OCI & Helm Slides.pdf
Weaveworks
 
Flux Security & Scalability using VS Code GitOps Extension
Weaveworks
 
Deploying Stateful Applications Securely & Confidently with Ondat & Weave GitOps
Weaveworks
 
Robust Network Security and Observability with GitOps and Cilium
Weaveworks
 
Intro to GitOps & Flux.pdf
Weaveworks
 
Ad

Recently uploaded (20)

PDF
What Makes Contify’s News API Stand Out: Key Features at a Glance
Contify
 
PDF
DevBcn - Building 10x Organizations Using Modern Productivity Metrics
Justin Reock
 
PDF
LOOPS in C Programming Language - Technology
RishabhDwivedi43
 
PDF
Staying Human in a Machine- Accelerated World
Catalin Jora
 
PDF
"Beyond English: Navigating the Challenges of Building a Ukrainian-language R...
Fwdays
 
PDF
IoT-Powered Industrial Transformation – Smart Manufacturing to Connected Heal...
Rejig Digital
 
PDF
"AI Transformation: Directions and Challenges", Pavlo Shaternik
Fwdays
 
PPTX
"Autonomy of LLM Agents: Current State and Future Prospects", Oles` Petriv
Fwdays
 
PDF
Reverse Engineering of Security Products: Developing an Advanced Microsoft De...
nwbxhhcyjv
 
PDF
Go Concurrency Real-World Patterns, Pitfalls, and Playground Battles.pdf
Emily Achieng
 
PDF
Transforming Utility Networks: Large-scale Data Migrations with FME
Safe Software
 
PDF
Exolore The Essential AI Tools in 2025.pdf
Srinivasan M
 
PPTX
Q2 FY26 Tableau User Group Leader Quarterly Call
lward7
 
PDF
Newgen 2022-Forrester Newgen TEI_13 05 2022-The-Total-Economic-Impact-Newgen-...
darshakparmar
 
PPTX
Webinar: Introduction to LF Energy EVerest
DanBrown980551
 
PDF
Mastering Financial Management in Direct Selling
Epixel MLM Software
 
PDF
Smart Trailers 2025 Update with History and Overview
Paul Menig
 
PDF
Agentic AI lifecycle for Enterprise Hyper-Automation
Debmalya Biswas
 
PDF
[Newgen] NewgenONE Marvin Brochure 1.pdf
darshakparmar
 
PPTX
OpenID AuthZEN - Analyst Briefing July 2025
David Brossard
 
What Makes Contify’s News API Stand Out: Key Features at a Glance
Contify
 
DevBcn - Building 10x Organizations Using Modern Productivity Metrics
Justin Reock
 
LOOPS in C Programming Language - Technology
RishabhDwivedi43
 
Staying Human in a Machine- Accelerated World
Catalin Jora
 
"Beyond English: Navigating the Challenges of Building a Ukrainian-language R...
Fwdays
 
IoT-Powered Industrial Transformation – Smart Manufacturing to Connected Heal...
Rejig Digital
 
"AI Transformation: Directions and Challenges", Pavlo Shaternik
Fwdays
 
"Autonomy of LLM Agents: Current State and Future Prospects", Oles` Petriv
Fwdays
 
Reverse Engineering of Security Products: Developing an Advanced Microsoft De...
nwbxhhcyjv
 
Go Concurrency Real-World Patterns, Pitfalls, and Playground Battles.pdf
Emily Achieng
 
Transforming Utility Networks: Large-scale Data Migrations with FME
Safe Software
 
Exolore The Essential AI Tools in 2025.pdf
Srinivasan M
 
Q2 FY26 Tableau User Group Leader Quarterly Call
lward7
 
Newgen 2022-Forrester Newgen TEI_13 05 2022-The-Total-Economic-Impact-Newgen-...
darshakparmar
 
Webinar: Introduction to LF Energy EVerest
DanBrown980551
 
Mastering Financial Management in Direct Selling
Epixel MLM Software
 
Smart Trailers 2025 Update with History and Overview
Paul Menig
 
Agentic AI lifecycle for Enterprise Hyper-Automation
Debmalya Biswas
 
[Newgen] NewgenONE Marvin Brochure 1.pdf
darshakparmar
 
OpenID AuthZEN - Analyst Briefing July 2025
David Brossard
 
Ad

Weave AI Controllers (Weave GitOps Office Hours)

  • 1. Confidential do not distribute 1 Generative AI Automation for private Enterprise LLMs Part 1: LM-Controller
  • 2. Confidential do not distribute 2 ● AI Models and Applications are the new class of Kubernetes workloads ● We start tackling this from LLMs ● Enterprise already invested in CPU-based Kubernetes clusters Enterprise AI workloads
  • 3. Confidential do not distribute 3 ● AI Application Developers shouldn’t worry about the complexity of model deployment. ● Platform Teams: LLMs become platform components ○ Security and Governance: signing and verification ○ RBAC and Tenancy ○ Standardization across organizations ○ Available for the Dev teams via self-service portals Why Weave AI?
  • 4. Confidential do not distribute 4 ● Day 0 - Out-of-the-box experiences ○ weave-ai install ○ weave-ai run zephyr-7b-beta ● Day 1 - Integrate them to your DevOps / GitOps pipelines ○ weave-ai install --export ● Day 2 - Build and maintain model catalog for the Dev teams ○ flux commands ○ Fine-tuning models / RAG data pipelines Why Weave AI?
  • 5. Confidential do not distribute 5
  • 6. Confidential do not distribute 6 ● The first controller released as part of the Weave AI Controllers ● LM Controller is a Flux controller that helps deploy Large Language Models on Kubernetes. ● It supports LLMs in the Flux OCI format. ● It uses Flux Source Controller as the in-cluster model cache. What is LM Controller?
  • 7. Confidential do not distribute 7 LLMs are snowflakes
  • 8. Confidential do not distribute 8 Hugging Face Compatible Models GitHub / GitLab CI Your App LLM Serving Your Data CPU or GPU on Cloud or on-Prem fine-tuning store packaged pulled deploy context manage LLM as Flux OCI
  • 9. Confidential do not distribute 9 Why use LM Controller? LLM Serving LLMs injects all required information to the deployment units LM Controller
  • 10. Confidential do not distribute 10 ● A curated list of LLM catalog ○ In Flux’s OCI format ● Flux’s Source Controller as in-Cluster model Cache ○ No PVC required ● A controller that takes care of this and that LLM parameters for you ● A set of pre-built OpenAI API Compatible engines ○ No-AVX, AVX, AVX2, AVX512 and more to come ● An easy-to-use CLI What Weave AI provides so far
  • 11. Confidential do not distribute 11 It’s Demo Time