Privacy-first in-browser
Generative AI web apps:
• offline-ready,
• future-proof,
• standards-based
Maxim Salnikov
Developer Productivity Lead at Microsoft
• Building on the web platform since the 90s
• Organizing developer communities and
technical conferences
• Speaking, training, blogging: Webdev,
Cloud, Generative AI, Prompt Engineering
• Member of Web Machine Learning
Community Group
Helping developers succeed with Dev Tools, Cloud & AI at Microsoft
I’m Maxim Salnikov
Making Machine Learning a first-class web citizen by incubating Web APIs for machine learning
inference in the browser and in products using modern web engines
Demo repo!
https://github.com/webmaxru/nextjs-webnn
• AI-capable: Transformers.js
(under the hood: ONNX Runtime Web, WebNN)
• WebGPU, WebNN, and NPU feature detection
• Smooth UX: AI computation runs in a web worker
• Offline-ready: Workbox
https://github.com/webmaxru/ng-ai
Angular
React + Next.js
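The feature detection mentioned above could look roughly like the sketch below. It is not taken from the demo repos. The helper name `detectAiFeatures` is hypothetical, `nav` stands for the browser's `navigator` (passed in so the function can be exercised outside a browser), and probing for an NPU by requesting a context with `deviceType: 'npu'` is an assumption: a browser may silently fall back to another device, and the device selection part of the spec is still changing.

```javascript
// Sketch: detect WebGPU, WebNN and (assumed) NPU availability.
// `nav` is the browser's `navigator`, injected for testability.
async function detectAiFeatures(nav) {
  const features = { webgpu: false, webnn: false, npu: false };

  // WebGPU: navigator.gpu exists and hands out a non-null adapter.
  if (nav.gpu) {
    try {
      features.webgpu = (await nav.gpu.requestAdapter()) !== null;
    } catch {
      features.webgpu = false;
    }
  }

  // WebNN: navigator.ml exists and can create a default context.
  if (nav.ml) {
    try {
      await nav.ml.createContext();
      features.webnn = true;
    } catch {
      features.webnn = false;
    }

    // NPU (assumption): an NPU-backed context can be requested explicitly.
    try {
      await nav.ml.createContext({ deviceType: 'npu' });
      features.npu = true;
    } catch {
      features.npu = false;
    }
  }

  return features;
}
```

In a page or web worker this would be called as `await detectAiFeatures(navigator)`, and the result could drive both the UI (offering device choices) and the runtime configuration.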
Native AI in the browser. Standardized.
• We use the web (61% of PC time)
• We use AI (> 1B people)
• We [will] have AI-capable devices
• We want performance, privacy, offline-readiness. All FREE!
• @Dev: unified codebase
• @Dev: handy abstractions
“Native” means
• Best possible performance: fast and energy-efficient
• Leveraging all relevant hardware capabilities
• Platform-specific implementations
• No trade-offs needed
• For web only: unified codebase
Not in today’s session scope
• Native options (Ollama): local but not web
• AI models/APIs shipped as part of the browser (Prompt API):
native & web but not standard [yet]
• “First-generation” web ML and Gen AI frameworks (TensorFlow.js,
WebLLM): limited use cases, not fully native [yet]
Web Neural Network API (WebNN)
• Near-native execution characteristics:
both speed and power efficiency
• Heterogeneous hardware execution:
CPU, GPU, NPU
• Unified abstraction: W3C API standard
• Model-agnostic: a general computational
graph lets you bring your own model (BYOM)
• Compatible with existing ML frameworks
https://www.w3.org/TR/webnn/
It all starts with the use cases
https://www.w3.org/TR/webnn/#usecases
• Person Detection
• Semantic Segmentation
• Skeleton Detection
• Face Recognition
• Facial Landmark Detection
• Style Transfer
• Super Resolution
• Image Captioning
• Text-to-image
• Machine Translation
• Emotion Analysis
• Video Summarization
• Noise Suppression
• Speech Recognition
• Text Generation
• Detecting fake video
Edge AI ecosystem
[Diagram: layered stack, top to bottom]
• Use cases: Noise Suppression, Image Classification, Background Segmentation, Natural Language, Object Detection
• Frameworks: TensorFlow.js, ONNX Runtime Web, MediaPipe Web, OpenCV.js
• Web API: WebAssembly, WebGPU, WebNN
• Web Engines: Web Browser (e.g., Chrome/Edge), JavaScript Runtime (e.g., Electron/Node.js)
• Native ML APIs: CoreML, DirectML, TFLite, Other ML OS APIs
• Hardware: CPU, GPU, NPU
• Other labels in the diagram: Windows Studio Effects, API extensions
WebNN “as a frontend” status: native ML frameworks
Check latest on: https://webmachinelearning.github.io/webnn-status/
…
WebNN “as a backend” status: JS ML frameworks
Check latest on: https://webmachinelearning.github.io/webnn-status/
…
Which hardware to choose for AI workloads
• CPU: Provides the broadest compatibility and usability across all
client devices with varying degrees of performance.
• GPU: Provides the broadest range of achievable performance across
graphics hardware platforms from consumer devices to professional
workstations.
• NPU: Provides power efficiency for sustained workloads across
hardware platforms with purpose-built accelerators.
WebNN performance is “near-native”
• WebNN on CPU is about 93% of native XNNPack
• WebNN on GPU is about 83% of native DirectML
• WebNN on NPU is about 80% of native DirectML
Source
WebNN for the users
• Low Latency
In-browser inference enables novel use cases with local media sources
• Privacy Preserving
User data stays on-device, which preserves user privacy
• High Availability
No reliance on the network after the initial asset caching, so apps work offline
• Low Cost
Computing on client devices means no server farms are needed.
https://webmachinelearning.github.io/webnn-intro/
WebNN for the developers
• Take advantage of the native OS services for
machine learning
• Get capabilities from the underlying hardware
innovations
• Implement consistent, efficient, and reliable AI
experiences on the web
• Benefit web applications and frameworks
including ONNX Runtime Web, TensorFlow.js
https://webmachinelearning.github.io/webnn-intro/
enum MLDeviceType {
  "cpu",
  "gpu",
  "npu"
};
enum MLPowerPreference {
  "default",
  "high-performance",
  "low-power"
};
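To show how these enum values might be used, here is a minimal sketch that tries device types in order of preference and keeps the first WebNN context that can be created. The helper name `createContextWithFallback` is hypothetical and `ml` stands for `navigator.ml`. It assumes `createContext()` accepts `deviceType` and `powerPreference` options as shown above and rejects when a device is unavailable; since device selection is still being redesigned, treat this as an illustration of the idea rather than a stable recipe.

```javascript
// Sketch: try device types in order and return the first working context.
// `ml` is `navigator.ml`, injected so the function can be tested with a stub.
async function createContextWithFallback(
  ml,
  deviceTypes = ['npu', 'gpu', 'cpu'],
  powerPreference = 'default'
) {
  for (const deviceType of deviceTypes) {
    try {
      const context = await ml.createContext({ deviceType, powerPreference });
      return { context, deviceType };
    } catch (error) {
      // This device type is unavailable; fall through to the next one.
    }
  }
  throw new Error(
    'No WebNN context could be created for: ' + deviceTypes.join(', ')
  );
}
```

A caller would use it as `const { context, deviceType } = await createContextWithFallback(navigator.ml)` and could surface the chosen `deviceType` in the UI.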
Device selection will change
• Algorithmic steps or notes to implementations on how to map power
preference to devices?
• Excluding specific device types?
• Query mechanism for supported devices?
• Using device similarity grouping?
• Moving to higher abstraction level?
• Combination of above?
https://github.com/webmachinelearning/webnn/blob/main/device-selection-explainer.md
Pre-requisites
https://microsoft.github.io/webnn-developer-preview/install.html
about://flags#web-machine-learning-neural-network
Canary or Dev version of Edge or Chrome
Enabling NPU: latest drivers for Intel | ARM
Context, operand, graph…
Let’s build an app
[Diagram: stack from AI use cases down to platform AI capabilities]
AI use cases
Web frontend app
• Transformers.js: high-level*, operates task-based pipelines, handles model fetching & caching
• ONNX Runtime Web: mid-level, operates inference sessions, defines the model format (ONNX)
• WebNN API: low-level, operates on the execution graph
Platform AI capabilities
* The level distribution is relative
What is ONNX?
https://onnxruntime.ai/
https://onnx.ai/
ONNX is an open format built to represent machine learning models. ONNX defines a common set of
operators - the building blocks of machine learning and deep learning models - and a common file format to
enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers.
ONNX Runtime is a production-grade AI engine to speed up training and inferencing in your existing
technology stack.
ONNX Runtime Web
const options = {
  executionProviders: [
    {
      name: 'webnn', // wasm | webgpu | webnn | webgl
      deviceType: 'npu', // cpu | gpu | npu
      powerPreference: 'low-power', // default | low-power | high-performance
    },
  ],
};
...
// Pass the options so the session actually uses the WebNN execution provider
const session = await ort.InferenceSession.create('./model.onnx', options);
// Input tensors: element type, data (e.g. a Float32Array), shape
const tensorA = new ort.Tensor('float32', dataA, [3, 4]);
const tensorB = new ort.Tensor('float32', dataB, [4, 3]);
// Keys must match the input names defined in the model
const feeds = { a: tensorA, b: tensorB };
const results = await session.run(feeds);
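Because WebNN and NPUs are not available everywhere, the execution provider list can be assembled from detected capabilities. The sketch below is one possible way to do that. The helper `buildSessionOptions` and the shape of the `features` object are hypothetical, and it relies on the understanding that ONNX Runtime Web tries the listed providers in order, which is worth verifying against the version you use.

```javascript
// Sketch: build ONNX Runtime Web session options from detected capabilities.
// `features` is a hypothetical object such as
// { webnn: true, npu: true, webgpu: false }.
function buildSessionOptions(features) {
  const executionProviders = [];

  if (features.webnn) {
    executionProviders.push({
      name: 'webnn',
      // Prefer the NPU for power efficiency when one was detected.
      deviceType: features.npu ? 'npu' : 'gpu',
      powerPreference: features.npu ? 'low-power' : 'default',
    });
  }
  if (features.webgpu) {
    executionProviders.push('webgpu');
  }
  // WebAssembly on the CPU as the last-resort fallback.
  executionProviders.push('wasm');

  return { executionProviders };
}
```

The result would then be passed as the second argument of `ort.InferenceSession.create('./model.onnx', buildSessionOptions(features))`.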
What is Transformers.js?
https://github.com/huggingface/transformers.js
Natural Language Processing: text classification, named entity recognition, question
answering, language modeling, summarization, translation, multiple choice, and text generation
Computer Vision: image classification, object detection, segmentation, and depth estimation
Audio: automatic speech recognition, audio classification, and text-to-speech
Multimodal: embeddings, zero-shot audio classification, zero-shot image classification, and
zero-shot object detection
State-of-the-art Machine Learning for the web. Run Transformers
directly in your browser, with no need for a server!
Plus:
https://github.com/huggingface/transformers.js
Hosted pretrained models (subset of Hugging Face catalog)
https://huggingface.co/models?library=transformers.js
Seamless caching of the models (with the Cache Storage)
Serving your own models (converted to the ONNX format)
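The caching idea can be illustrated with the standard Cache Storage API. This is a sketch of the general cache-first pattern, not Transformers.js's actual implementation. The helper name `fetchModelFile` is hypothetical, `cache` is a `Cache` object (for example from `await caches.open('models')`, where the cache name is made up), and `fetchFn` is injected so the function can be tested without a network.

```javascript
// Sketch: cache-first loading of a model file with the Cache Storage API.
// On a hit the cached Response is returned; on a miss the file is
// downloaded, stored, and returned.
async function fetchModelFile(url, cache, fetchFn = fetch) {
  const cached = await cache.match(url);
  if (cached) {
    return cached;
  }

  const response = await fetchFn(url);
  if (response.ok) {
    // A Response body can be read only once, so store a clone.
    await cache.put(url, response.clone());
  }
  return response;
}
```

After the first successful download, later calls for the same URL are served from the cache, which is what makes the offline scenario possible.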
Task-based pipelines
https://huggingface.co/docs/transformers.js/en/pipelines
import { pipeline } from '@huggingface/transformers';
const classifier = await pipeline('sentiment-analysis');
const result = await classifier('I love AI!');
// [{'label': 'POSITIVE', 'score': 0.9998}]
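Creating a pipeline fetches and initializes a model, so an app (for example, inside a web worker) would typically create each pipeline once and reuse it. A minimal sketch of that pattern follows. The helper `makePipelineCache` is hypothetical, and `createPipeline` stands in for the `pipeline` function imported above (it is injected so the sketch has no dependencies). Note that this simple version caches per task name only and ignores differing model arguments.

```javascript
// Sketch: create each pipeline once per task and reuse it afterwards.
// `createPipeline` is expected to behave like Transformers.js `pipeline`:
// it takes a task name and returns (a Promise of) a callable pipeline.
function makePipelineCache(createPipeline) {
  const instances = new Map(); // task name -> Promise of pipeline

  return function getPipeline(task, ...args) {
    if (!instances.has(task)) {
      const promise = Promise.resolve(createPipeline(task, ...args));
      // If loading fails, forget the entry so a later call can retry.
      promise.catch(() => instances.delete(task));
      instances.set(task, promise);
    }
    return instances.get(task);
  };
}
```

Usage would look like `const getPipeline = makePipelineCache(pipeline);` followed by `const classifier = await getPipeline('sentiment-analysis');` wherever a result is needed.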
https://huggingface.co/docs/transformers.js/en/pipelines
• Text: sentence-similarity, summarization, text-generation, translation, question-answering, fill-mask, ...
• Vision: image-classification, image-segmentation, image-to-image, mask-generation, object-detection, ...
• Audio: audio-classification, automatic-speech-recognition, text-to-speech, text-to-audio, ...
• …
Summary and call to action:
• A web standard for running ML tasks natively in the browser is here
• It’s the only way to leverage all on-device AI capabilities
• There are still some moving parts in the specification
• Choose your own comfortable abstraction level using higher-level frameworks
• The same frameworks can provide fallback mechanisms for when an API or
device is unavailable
• User experience first! Offline-readiness, web workers, providing choices
References
• Updates/slides from TPAC 2024 WebML WG meeting
• WebNN Spec: https://www.w3.org/TR/webnn/
• WebNN Explainer: https://github.com/webmachinelearning/webnn/blob/main/explainer.md
• WebNN Implementation Status: https://webmachinelearning.github.io/webnn-status/
• Awesome WebNN: https://github.com/webmachinelearning/awesome-webnn
• WebNN Samples: https://microsoft.github.io/webnn-developer-preview/ & https://webmachinelearning.github.io/webnn-samples/
• WebNN Image Classification: https://webmachinelearning.github.io/webnn-samples/image_classification/
• WebNN Semantic Segmentation: https://webmachinelearning.github.io/webnn-samples/semantic_segmentation/index.html
• ONNX Runtime WebNN Execution Provider:
https://github.com/microsoft/onnxruntime/tree/main/onnxruntime/core/providers/webnn
Thank you! I kindly prompt you:
