The document proposes a new WebML API that optimizes machine learning workloads on the web by connecting them to OS-level ML APIs and hardware accelerators. It first surveys existing web ML frameworks and their limitations. The WebML API would standardize ML inference on the web and let web apps make full use of CPUs, GPUs, and dedicated ML accelerators to reach near-native performance. The document also includes a prototype implementation of the WebML API and initial performance results that show significant speedups over existing web APIs.