BenchmarkXPRT Blog banner

Category: Wasm

Browser-based AI tests in WebXPRT 4: face detection and image classification

I recently revisited an XPRT blog entry that we posted from CES Las Vegas back in 2020. In that post, I reflected on the show’s expanded AI emphasis, and I wondered if we were reaching a tipping point where AI-enhanced and AI-driven tools and applications would become a significant presence in people’s daily lives. It felt like we were approaching that point back then with the prevalence of AI-powered features such as image enhancement and text recommendation, among many others. Now, seamless AI integration with common online tasks has become so widespread that many people unknowingly benefit from AI interactions several times a day.

As AI’s role in areas like everyday browser activity continues to grow—along with our expectations for what our consumer devices should be able to handle—reliable AI-oriented benchmarking is more vital than ever. We need objective performance data that can help us understand how well a new desktop, laptop, tablet, or phone will handle AI tasks.

WebXPRT 4 already includes timed AI tasks in two of its workloads: the “Organize Album using AI” workload and the “Encrypt Notes and OCR Scan” workload. These two workloads reflect the types of light browser-side inference tasks that are now fairly common in consumer-oriented web apps and extensions. In today’s post, we’ll provide some technical information about the Organize Album workload. In a future post, we’ll do the same for the Encrypt Notes workload.

The Organize Album workload includes two different timed tasks that reflect a common scenario of organizing online photo albums. The workload utilizes the AI inference and JavaScript capabilities of the WebAssembly (Wasm) version of OpenCV.js—an open-source computer vision and machine learning library. In WebXPRT 4, we used OpenCV.js version 4.5.2.

Here are the details for each task:

  • The first task measures the time it takes to complete a face detection job with a set of five 720 x 480 photos that we sourced from commercial photo sites. The workload loads a Caffe deep learning framework model (res10_300x300_ssd_iter_140000_fp16.caffemodel) using the commands found here
  • The second task measures the time it takes to complete an image classification job (labeling based on object detection) with a different set of five 718 x 480 photos that we sourced from the ImageNet computer vision dataset. The workload loads an ONNX-based SqueezeNet machine learning model (squeezenet.onnx v 1.0) using the commands found here.

To produce a score for each iteration of the workload, WebXPRT calculates the total time that it takes for a system to organize both albums. In a standard test, WebXPRT runs seven iterations of the entire six-workload performance suite before calculating an overall test score. You can find out more about the WebXPRT results calculation process here.

We hope this post will give you a better sense of how WebXPRT 4 measures one kind of AI performance. As a reminder, if you want to dig into the details at a more granular level, you can access the WebXPRT 4 source code for free. In previous blog posts, you can find information about how to access and use the code. You can also read more about WebXPRT’s overall structure and other workloads in the Exploring WebXPRT 4 white paper.

If you have any questions about this workload or any other aspect of WebXPRT 4, please let us know!

Justin

Web APIs: Possible paths for the AI-focused WebXPRT 4 auxiliary workload

In our last blog post, we discussed one of the major decision points we’re facing as we work on what we hope will be the first new AI-focused WebXPRT 4 auxiliary workload: choosing a Web AI framework. In today’s blog, we’re discussing another significant decision that we need to make for the future workload’s development path: choosing a web API.

Many of you are familiar with the concept of an application programming interface (API). Simply put, APIs implement sets of software rules, tools, and/or protocols that serve as intermediaries that make it possible for different computer programs or components to communicate with each other. APIs simplify many development tasks for programmers and provide standardized ways for applications to share data, functions, and system resources.

Web APIs fulfill the intermediary role of an API—through HTTP-based communication—for web servers (on the server side) or web browsers (on the client side). Client-side web APIs make it possible for browser-based applications to expand browser functionality. They execute the kinds of JavaScript, HTML5, and WebAssembly (Wasm) workloads—among other examples—that support the wide variety of browser extensions many of us use every day. WebXPRT uses those types of browser-based workloads to evaluate system performance. To lay a solid foundation for the first future browser-based AI workload, we need to choose a web API that will be compatible with WebXPRT and the Web AI framework and AI inference workload(s) we ultimately choose.

Currently, there are three main web API paths for running AI inference in a web browser: Web Neural Network (WebNN), Wasm, and WebGPU. These three web technologies are in various stages of development and standardization. Each has different levels of support within the major browsers. Here are basic overviews of each of the three options, as well as a few of our thoughts on the benefits and limitations that each may bring to the table for a future WebXPRT AI workload:

  • WebNN is a JavaScript API that enables developers to directly execute machine learning (ML) tasks on neural networks within web-based applications. WebNN makes it easier to integrate ML models into web apps, and it allows web apps to leverage the power of neural processing units (NPUs). WebNN has a lot going for it. It’s hardware-agnostic and works with various ML frameworks. It’s likely to be a major player in future browser-based inference applications. However, as a web standard, WebNN is still in the development stage and is only available in developer previews for Chromium-based browsers. Full default WebNN support could take a year or more.
  • Wasm is a binary instruction format that works across all modern browsers. Wasm provides a sandboxed environment that operates at near-native speeds and takes advantage of common hardware specs across platforms. Wasm’s capabilities offer web developers a great deal of flexibility for running complex client applications in the browser. Simply put, Wasm can help developers adapt their existing code for additional platforms and browser-based applications without requiring extensive code rewrites. Wasm’s flexibility and cross-platform compatibility is one of the reasons that we’ve already made use of Wasm in two existing WebXPRT 4 workloads that feature AI tasks: Organize Album using AI, and Encrypt Notes and OCR Scan. Wasm can also work together with other web APIs, such as WebGPU.
  • WebGPU enables web-based applications to directly access the graphics rendering and computational capabilities of a system’s GPU. The parallel computational abilities of GPUs make them especially well-suited to efficiently handle some of the demands of AI inference workloads, including image-based GenAI workloads or large language models. Google Chrome and Microsoft Edge currently support WebGPU, and it’s available in Safari through a tech preview.

Right now, we don’t think that WebNN will be fully out of the development phase in time to serve as our go-to web API for a new WebXPRT AI workload. Wasm and/or WebGPU appear to our best options for now. When WebNN is fully baked and available in mainstream browsers, it’s possible that we could port any existing Wasm- or WebGPU-based WebXPRT AI workloads to WebNN, which may open the possibility of cross-platform browser-based NPU performance comparisons.

All that said and as we mentioned in our previous post about Web AI frameworks, we have not made any final decisions about a web API or any aspect of the future workload. We’re still in the early stages of this project. We want your input.

If this discussion has sparked web AI ideas that you think would benefit the process, or if you have feedback you’d like to share, please feel free to contact us!

Justin

Check out the other XPRTs:

Forgot your password?