BenchmarkXPRT Blog banner

Tag Archives: WebNN

Web APIs: Possible paths for the AI-focused WebXPRT 4 auxiliary workload

In our last blog post, we discussed one of the major decision points we’re facing as we work on what we hope will be the first new AI-focused WebXPRT 4 auxiliary workload: choosing a Web AI framework. In today’s blog, we’re discussing another significant decision that we need to make for the future workload’s development path: choosing a web API.

Many of you are familiar with the concept of an application programming interface (API). Simply put, APIs implement sets of software rules, tools, and/or protocols that serve as intermediaries that make it possible for different computer programs or components to communicate with each other. APIs simplify many development tasks for programmers and provide standardized ways for applications to share data, functions, and system resources.

Web APIs fulfill the intermediary role of an API—through HTTP-based communication—for web servers (on the server side) or web browsers (on the client side). Client-side web APIs make it possible for browser-based applications to expand browser functionality. They execute the kinds of JavaScript, HTML5, and WebAssembly (Wasm) workloads—among other examples—that support the wide variety of browser extensions many of us use every day. WebXPRT uses those types of browser-based workloads to evaluate system performance. To lay a solid foundation for the first future browser-based AI workload, we need to choose a web API that will be compatible with WebXPRT and the Web AI framework and AI inference workload(s) we ultimately choose.

Currently, there are three main web API paths for running AI inference in a web browser: Web Neural Network (WebNN), Wasm, and WebGPU. These three web technologies are in various stages of development and standardization. Each has different levels of support within the major browsers. Here are basic overviews of each of the three options, as well as a few of our thoughts on the benefits and limitations that each may bring to the table for a future WebXPRT AI workload:

  • WebNN is a JavaScript API that enables developers to directly execute machine learning (ML) tasks on neural networks within web-based applications. WebNN makes it easier to integrate ML models into web apps, and it allows web apps to leverage the power of neural processing units (NPUs). WebNN has a lot going for it. It’s hardware-agnostic and works with various ML frameworks. It’s likely to be a major player in future browser-based inference applications. However, as a web standard, WebNN is still in the development stage and is only available in developer previews for Chromium-based browsers. Full default WebNN support could take a year or more.
  • Wasm is a binary instruction format that works across all modern browsers. Wasm provides a sandboxed environment that operates at near-native speeds and takes advantage of common hardware specs across platforms. Wasm’s capabilities offer web developers a great deal of flexibility for running complex client applications in the browser. Simply put, Wasm can help developers adapt their existing code for additional platforms and browser-based applications without requiring extensive code rewrites. Wasm’s flexibility and cross-platform compatibility is one of the reasons that we’ve already made use of Wasm in two existing WebXPRT 4 workloads that feature AI tasks: Organize Album using AI, and Encrypt Notes and OCR Scan. Wasm can also work together with other web APIs, such as WebGPU.
  • WebGPU enables web-based applications to directly access the graphics rendering and computational capabilities of a system’s GPU. The parallel computational abilities of GPUs make them especially well-suited to efficiently handle some of the demands of AI inference workloads, including image-based GenAI workloads or large language models. Google Chrome and Microsoft Edge currently support WebGPU, and it’s available in Safari through a tech preview.

Right now, we don’t think that WebNN will be fully out of the development phase in time to serve as our go-to web API for a new WebXPRT AI workload. Wasm and/or WebGPU appear to our best options for now. When WebNN is fully baked and available in mainstream browsers, it’s possible that we could port any existing Wasm- or WebGPU-based WebXPRT AI workloads to WebNN, which may open the possibility of cross-platform browser-based NPU performance comparisons.

All that said and as we mentioned in our previous post about Web AI frameworks, we have not made any final decisions about a web API or any aspect of the future workload. We’re still in the early stages of this project. We want your input.

If this discussion has sparked web AI ideas that you think would benefit the process, or if you have feedback you’d like to share, please feel free to contact us!

Justin

Up next for WebXPRT 4: A new AI-focused workload!

We’re always thinking about ways to improve WebXPRT. In the past, we’ve discussed the potential benefits of auxiliary workloads and the role that such workloads might play in future WebXPRT updates and versions. Today, we’re very excited to announce that we’ve decided to move forward with the development of a new WebXPRT 4 workload focused on browser-side AI technology!

WebXPRT 4 already includes timed AI tasks in two of its workloads: the Organize Album using AI workload and the Encrypt Notes and OCR Scan workload. These two workloads reflect the types of light browser-side inference tasks that have been available for a while now, but most heavy-duty inference on the web has historically happened in on-prem servers or in the cloud. Now, localized AI technology is growing by leaps and bounds, and the integration of new AI capabilities with browser-based tasks is on the threshold of advancing rapidly.

Because of this growth, we believe now is the time to start work on giving WebXPRT 4 the ability to evaluate new browser-based AI capabilities—capabilities that are likely to become a part of everyday life in the next few years. We haven’t yet decided on a test scenario or software stack for the new workload, but we’ll be working to refine our plan in the coming months. There seems to be some initial promise in emerging frameworks such as ONNX Runtime Web, which allows users to run and deploy web-based machine learning models by using JavaScript APIs and libraries. In addition, new Web APIs like WebGPU (currently supported in Edge, Chrome, and tech preview in Safari) and WebNN (in development) may soon help facilitate new browser-side AI workloads.

We know that many longtime WebXPRT 4 users will have questions about how this new workload may affect their tests. We want to assure you that the workload will be an optional bonus workload and will not run by default during normal WebXPRT 4 tests. As you consider possibilities for the new workload, here are a few points to keep in mind:

  • The workload will be optional for users to run.
  • It will not affect the main WebXPRT 4 subtest or overall scores in any way.
  • It will run separately from the main test and will produce its own score(s).
  • Current and future WebXPRT 4 results will still be comparable to one another, so users who’ve already built a database of WebXPRT 4 scores will not have to retest their devices.
  • Because many of the available frameworks don’t currently run on all browsers, the workload may not run on every platform.

As we research available technologies and explore our options, we would love to hear from you. If you have ideas for an AI workload scenario that you think would be useful or thoughts on how we should implement it, please let us know! We’re excited about adding new technologies and new value to WebXPRT 4, and we look forward to sharing more information here in the blog as we make progress.

Justin

Check out the other XPRTs:

Forgot your password?