LiteRT.js is Google's high-performance Web AI runtime, targeting production web applications. It is a continuation of the LiteRT stack, ensuring multi-framework support and unifying our core runtime across all platforms.
LiteRT.js supports the following core features:
- In-browser support for LiteRT models: Run models with best-in-class performance on CPU, accelerated via XNNPack on WebAssembly (Wasm), and on GPU using the WebGPU API.
- Multi-framework compatibility: Use your preferred ML framework: PyTorch, JAX, or TensorFlow.
- Build on existing pipelines: Integrate with existing TensorFlow.js pipelines by supporting TensorFlow.js Tensors as inputs and outputs.
Installation
Install the @litertjs/core package from npm:
npm install @litertjs/core
The Wasm files are located in node_modules/@litertjs/core/wasm/. For convenience, copy and serve the entire wasm/ folder. Then, import the package and load the Wasm files:
import {loadLiteRt} from '@litertjs/core';
// Host LiteRT's Wasm files on your server.
await loadLiteRt(`your/path/to/wasm/`);
Model conversion
LiteRT.js uses the same .tflite format as Android and iOS, and it supports existing models on Kaggle and Hugging Face. If you have a new PyTorch model, you'll need to convert it.
Convert a PyTorch Model to LiteRT
To convert a PyTorch model to LiteRT, use the ai-edge-torch converter.
import torch
import torchvision
import ai_edge_torch

# Load your torch model. We're using resnet18 with pretrained weights for this example.
resnet18 = torchvision.models.resnet18(torchvision.models.ResNet18_Weights.IMAGENET1K_V1)
sample_inputs = (torch.randn(1, 3, 224, 224),)
# Convert the model to LiteRT.
edge_model = ai_edge_torch.convert(resnet18.eval(), sample_inputs)
# Export the model.
edge_model.export('resnet.tflite')
Run the Converted Model
After converting the model to a .tflite file, you can run it in the browser.
import {loadAndCompile, Tensor} from '@litertjs/core';
// Load the model hosted from your server. This makes an http(s) request.
const model = await loadAndCompile('/path/to/model.tflite', {
accelerator: 'webgpu', // or 'wasm' for XNNPack CPU inference
});
// The model can also be loaded from a Uint8Array if you want to fetch it yourself.
// Create image input data
const image = new Float32Array(224 * 224 * 3).fill(0);
const inputTensor =
await new Tensor(image, /* shape */ [1, 3, 224, 224]).moveTo('webgpu');
// Run the model
const outputs = model(inputTensor);
// You can also use model([inputTensor])
// or model({'input_tensor_name': inputTensor})
// Clean up and get outputs
inputTensor.delete();
const outputTensorCpu = await outputs[0].moveTo('wasm');
const outputData = outputTensorCpu.toTypedArray();
outputTensorCpu.delete();
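With the ResNet-18 example above, outputData holds the 1,000 ImageNet class scores, so a simple argmax gives the predicted class index. This is a sketch that assumes a flat [1, 1000] output; adapt it to your model's output shape:
// Find the index of the highest score in the output.
let bestIndex = 0;
for (let i = 1; i < outputData.length; i++) {
  if (outputData[i] > outputData[bestIndex]) {
    bestIndex = i;
  }
}
console.log(`Predicted ImageNet class index: ${bestIndex}`);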
Integrate into existing TensorFlow.js pipelines
You should consider integrating LiteRT.js into your TensorFlow.js pipelines for the following reasons:
- Best-in-class WebGPU performance: Converted models running on LiteRT.js WebGPU are optimized for browser performance, and are especially fast on Chromium-based browsers.
- Easier model conversion path: The LiteRT.js conversion path goes directly from PyTorch to LiteRT. The PyTorch to TensorFlow.js conversion path is significantly more complicated, requiring you to go from PyTorch -> ONNX -> TensorFlow -> TensorFlow.js.
- Debugging tools: The LiteRT.js conversion path comes with debugging tools.
LiteRT.js is designed to function within TensorFlow.js pipelines, and is compatible with TensorFlow.js pre- and post-processing, so the only thing you need to migrate is the model itself.
Integrate LiteRT.js into TensorFlow.js pipelines with the following steps:
- Convert your original TensorFlow, JAX, or PyTorch model to .tflite. For details, see the model conversion section.
- Install the @litertjs/core and @litertjs/tfjs-interop NPM packages.
- Import and use the TensorFlow.js WebGPU backend. This is required for LiteRT.js to interoperate with TensorFlow.js.
- Replace loading the TensorFlow.js model with loading the LiteRT.js model.
- Substitute the TensorFlow.js model.predict(inputs) or model.execute(inputs) call with runWithTfjsTensors(liteRtModel, inputs). runWithTfjsTensors takes the same input tensors that TensorFlow.js models use and outputs TensorFlow.js tensors (see the sketch after this list).
- Test that the model pipeline outputs the results you expect.
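The following sketch shows how these steps fit together. It assumes runWithTfjsTensors is exported from @litertjs/tfjs-interop and that the converted model is hosted on your server; the paths, shapes, and placeholder input are illustrative, not part of your actual pipeline:
import * as tf from '@tensorflow/tfjs';
import '@tensorflow/tfjs-backend-webgpu';
import {loadAndCompile, loadLiteRt} from '@litertjs/core';
import {runWithTfjsTensors} from '@litertjs/tfjs-interop';

// Use the TensorFlow.js WebGPU backend so tensors can interoperate with LiteRT.js.
await tf.setBackend('webgpu');
await tf.ready();
await loadLiteRt('/your/path/to/wasm/');

// Before: const model = await tf.loadGraphModel('/path/to/model.json');
// After: load the converted .tflite model instead.
const model = await loadAndCompile('/path/to/model.tflite', {accelerator: 'webgpu'});

// Pre-processing stays in TensorFlow.js (placeholder input shown here).
const input = tf.zeros([1, 3, 224, 224]);

// Before: const output = model.predict(input);
// After: run the LiteRT.js model with the same TensorFlow.js tensors.
const outputs = runWithTfjsTensors(model, input);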
Using LiteRT.js with runWithTfjsTensors may also require the following changes to the model inputs:
- Reorder inputs: Depending on how the converter ordered the inputs and outputs of the model, you may need to change their order as you pass them in.
- Transpose inputs: It's also possible that the converter changed the layout of the inputs and outputs of the model compared to what TensorFlow.js uses. You may need to transpose your inputs to match the model and outputs to match the rest of the pipeline.
- Rename inputs: If you're using named inputs, the names may have also changed.
You can get more information about the inputs and outputs of the model with model.getInputDetails() and model.getOutputDetails().
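For example, here is a sketch that inspects those details and transposes an NHWC TensorFlow.js tensor into an NCHW layout, assuming the converted model reports that layout (the shapes are illustrative):
// Inspect what the converted model expects and produces.
console.log(model.getInputDetails());
console.log(model.getOutputDetails());

// If the TensorFlow.js pipeline produces NHWC but the converted model expects
// NCHW, transpose before running (check getInputDetails() for the real layout).
const nhwcInput = tf.zeros([1, 224, 224, 3]);
const nchwInput = tf.transpose(nhwcInput, [0, 3, 1, 2]);
const outputs = runWithTfjsTensors(model, nchwInput);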