In our last blog post, we discussed one of the major decision points we’re facing as we work on what we hope will be the first new AI-focused WebXPRT 4 auxiliary workload: choosing a Web AI framework. In today’s blog, we’re discussing another significant decision that we need to make for the future workload’s development path: choosing a web API.
Many of you are familiar with the concept of an application programming interface (API). Simply put, APIs implement sets of software rules, tools, and/or protocols that serve as intermediaries, making it possible for different computer programs or components to communicate with each other. APIs simplify many development tasks for programmers and provide standardized ways for applications to share data, functions, and system resources.
Web APIs fulfill the intermediary role of an API, through HTTP-based communication, for web servers (on the server side) or web browsers (on the client side). Client-side web APIs make it possible for browser-based applications to expand browser functionality. They execute the kinds of JavaScript, HTML5, and WebAssembly (Wasm) workloads (among other examples) that support the wide variety of browser extensions many of us use every day. WebXPRT uses those types of browser-based workloads to evaluate system performance. To lay a solid foundation for the first future browser-based AI workload, we need to choose a web API that will be compatible with WebXPRT and with the Web AI framework and AI inference workload(s) we ultimately choose.
Currently, there are three main web API paths for running AI inference in a web browser: Web Neural Network (WebNN), Wasm, and WebGPU. These three web technologies are in various stages of development and standardization, and each has a different level of support within the major browsers. Here are basic overviews of the three options, along with a few of our thoughts on the benefits and limitations that each may bring to the table for a future WebXPRT AI workload:
- WebNN is a JavaScript API that enables developers to directly execute machine learning (ML) tasks on neural networks within web-based applications. WebNN makes it easier to integrate ML models into web apps, and it allows web apps to leverage the power of neural processing units (NPUs). WebNN has a lot going for it: it's hardware-agnostic, it works with various ML frameworks, and it's likely to be a major player in future browser-based inference applications. (For a feel for what WebNN code looks like, see the short sketch after this list.) However, as a web standard, WebNN is still in the development stage and is only available in developer previews for Chromium-based browsers. Full default WebNN support could take a year or more.
- Wasm is a binary instruction format that works across all modern browsers. Wasm provides a sandboxed environment that operates at near-native speeds and takes advantage of common hardware specs across platforms. Wasm's capabilities offer web developers a great deal of flexibility for running complex client applications in the browser. Simply put, Wasm can help developers adapt their existing code for additional platforms and browser-based applications without requiring extensive code rewrites. Wasm's flexibility and cross-platform compatibility are among the reasons we've already made use of Wasm in two existing WebXPRT 4 workloads that feature AI tasks: Organize Album using AI, and Encrypt Notes and OCR Scan. Wasm can also work together with other web APIs, such as WebGPU.
- WebGPU enables web-based applications to directly access the graphics rendering and computational capabilities of a system's GPU. The parallel computational abilities of GPUs make them especially well-suited to efficiently handle some of the demands of AI inference workloads, including image-based GenAI workloads and large language models. Google Chrome and Microsoft Edge currently support WebGPU, and it's available in Safari through a tech preview.
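To give a feel for the direction WebNN is heading, here's a rough sketch of the kind of call pattern the W3C draft spec describes: create a context, build a small graph of operations, compile it, and run it with bound input and output buffers. Because the standard is still evolving, the exact method names and descriptor fields below may change, so treat this as an illustration rather than working WebXPRT code.

```js
// Illustrative WebNN sketch based on the draft spec; names may change as the
// standard evolves. Builds a one-operation graph that adds a constant tensor
// to an input tensor and runs it.
async function runTinyWebNNGraph() {
  // Request an ML context (developer previews may route this to CPU, GPU, or NPU).
  const context = await navigator.ml.createContext();
  const builder = new MLGraphBuilder(context);

  // Describe a small 1x4 float32 tensor.
  const desc = { dataType: 'float32', dimensions: [1, 4] };
  const input = builder.input('input', desc);
  const constant = builder.constant(desc, new Float32Array([1, 2, 3, 4]));
  const output = builder.add(input, constant);

  // Compile the graph, then execute it with bound input and output buffers.
  const graph = await builder.build({ output });
  const results = await context.compute(
    graph,
    { input: new Float32Array([10, 20, 30, 40]) },
    { output: new Float32Array(4) }
  );
  return results.outputs.output; // Float32Array [11, 22, 33, 44]
}
```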
Right now, we don't think that WebNN will be fully out of the development phase in time to serve as our go-to web API for a new WebXPRT AI workload. Wasm and/or WebGPU appear to be our best options for now. When WebNN is fully baked and available in mainstream browsers, it's possible that we could port any existing Wasm- or WebGPU-based WebXPRT AI workloads to WebNN, which may open the possibility of cross-platform, browser-based NPU performance comparisons.
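To make the WebGPU path a bit more concrete, here's a minimal compute-shader sketch, assuming a browser with WebGPU enabled. The function name and the trivial shader are purely illustrative, not WebXPRT code; a real inference workload issues the same kinds of buffer, pipeline, and dispatch calls, just at a much larger scale.

```js
// Minimal WebGPU compute sketch (illustration only, not WebXPRT code).
// Doubles each element of a small array on the GPU and reads the result back.
async function doubleOnGpu() {
  const adapter = await navigator.gpu.requestAdapter();
  const device = await adapter.requestDevice();
  const input = new Float32Array([1, 2, 3, 4]);

  // Storage buffer the compute shader reads from and writes to.
  const storage = device.createBuffer({
    size: input.byteLength,
    usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC,
    mappedAtCreation: true,
  });
  new Float32Array(storage.getMappedRange()).set(input);
  storage.unmap();

  // Staging buffer for copying results back to JavaScript.
  const readback = device.createBuffer({
    size: input.byteLength,
    usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.MAP_READ,
  });

  // WGSL compute shader: one invocation per array element.
  const module = device.createShaderModule({
    code: `
      @group(0) @binding(0) var<storage, read_write> data: array<f32>;
      @compute @workgroup_size(4)
      fn main(@builtin(global_invocation_id) id: vec3<u32>) {
        data[id.x] = data[id.x] * 2.0;
      }
    `,
  });
  const pipeline = device.createComputePipeline({
    layout: 'auto',
    compute: { module, entryPoint: 'main' },
  });
  const bindGroup = device.createBindGroup({
    layout: pipeline.getBindGroupLayout(0),
    entries: [{ binding: 0, resource: { buffer: storage } }],
  });

  // Record and submit the compute work, then copy the result out.
  const encoder = device.createCommandEncoder();
  const pass = encoder.beginComputePass();
  pass.setPipeline(pipeline);
  pass.setBindGroup(0, bindGroup);
  pass.dispatchWorkgroups(1);
  pass.end();
  encoder.copyBufferToBuffer(storage, 0, readback, 0, input.byteLength);
  device.queue.submit([encoder.finish()]);

  await readback.mapAsync(GPUMapMode.READ);
  return new Float32Array(readback.getMappedRange()).slice(); // [2, 4, 6, 8]
}
```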
All that said, and as we mentioned in our previous post about Web AI frameworks, we have not made any final decisions about a web API or any other aspect of the future workload. We're still in the early stages of this project, and we want your input.
If this discussion has sparked web AI ideas that you think would benefit the process, or if you have feedback you'd like to share, please feel free to contact us!
Justin