The document outlines how to deploy large language models (LLMs) on Kubernetes with the Weave AI platform. It introduces the LM Controller, which simplifies deploying and managing LLMs and exposes them to development teams as a self-service workflow. Key features include integration with existing DevOps pipelines, a curated LLM catalog, and pre-built inference engines for various CPU architectures.
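
To make the self-service idea concrete, below is a minimal sketch of how a team might declare an LLM deployment as a Kubernetes custom resource and submit it with the official Python client. The `LanguageModel` kind, its API group and version, and the spec fields (`model`, `engine`, `replicas`) are illustrative assumptions, not the LM Controller's confirmed schema; the actual shape is defined by the controller's CRDs.

```python
# A minimal sketch, assuming the LM Controller reconciles a "LanguageModel"
# custom resource. The group/version/kind and spec fields below are
# placeholders for illustration, not a confirmed API.
from kubernetes import client, config


def deploy_language_model() -> None:
    # Load cluster credentials from the default kubeconfig (~/.kube/config).
    config.load_kube_config()

    # Hypothetical custom resource selecting a model from the curated catalog.
    language_model = {
        "apiVersion": "ai.example.com/v1alpha1",  # placeholder group/version
        "kind": "LanguageModel",                  # placeholder kind
        "metadata": {"name": "demo-llm", "namespace": "default"},
        "spec": {
            "model": "llama-2-7b-chat",  # assumed catalog entry name
            "engine": "cpu-arm64",       # assumed pre-built engine identifier
            "replicas": 1,
        },
    }

    # Create the custom object; the controller would then reconcile it
    # into a running inference workload on the cluster.
    api = client.CustomObjectsApi()
    api.create_namespaced_custom_object(
        group="ai.example.com",
        version="v1alpha1",
        namespace="default",
        plural="languagemodels",
        body=language_model,
    )


if __name__ == "__main__":
    deploy_language_model()
```

Declaring the model as a custom resource is what lets the deployment plug into existing GitOps-style DevOps pipelines: the manifest can live in version control and be applied by the same machinery as any other Kubernetes workload.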