
Category: Collaborative benchmark development

WebXPRT 5: The workload lineup

The WebXPRT 5 development process is heading into the final stretch, so we’d like to share more information about the workloads you’re likely to see in the WebXPRT 5 Preview release, and when that release may be available. We’re still actively testing candidate builds and studying results from tests on multiple systems, so some details could change. That said, we’re now close enough to provide a clearer picture of the workload lineup.

Core workloads

WebXPRT 5 will likely include the following seven workloads:  

  • Video Background Blur with AI. Blurs the background of a video call using an AI-powered segmentation model.
  • Photo Effects. Applies a filter to six photos using the Canvas API.
  • Detect Faces with AI. Detects faces and organizes photos in an album using computer vision (OpenCV.js with a Caffe model).
  • Image Classification with AI. Labels images in an album using machine learning (OpenCV.js and ML Classify with the SqueezeNet model).
  • Document Scan with AI. Scans a document image and converts it to text using ML-based OCR (Wasm with LSTM).
  • School Science Project. Processes a DNA sequencing task using regex and string manipulation.
  • Homework Spellcheck. Spellchecks a document using Typo.js and Web Workers.

The sub-scores for each of these tests will contribute to WebXPRT 5’s main overall score. (We’ll discuss scoring in future blogs.)

Experimental workloads

We’re currently planning to include an experimental workload section, something we’ve long discussed, in WebXPRT 5. Workloads in this section will use cutting-edge browser technologies that may not be compatible with the same broad range of platforms and devices as the technologies in WebXPRT 5’s core workloads. For that reason, we will not include scores from the experimental section in WebXPRT 5’s main overall score, either in the Preview build or in future releases.

In addition, WebXPRT 5’s experimental workloads will be completely optional.

Moving forward, WebXPRT’s experimental workload section will provide users with a straightforward way to learn how well certain browsers or systems handle new browser-based technologies (e.g., new web apps or AI capabilities). We’ll benefit from the ability to offer workloads for large-scale testing and user feedback before committing to including them as core WebXPRT workloads. Because future experimental workloads will run independently of the main test, we can add them without affecting the main WebXPRT score or requiring users to repeat testing to obtain comparable scores. We think it will be a win-win scenario in many respects.  
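As a rough illustration of why keeping experimental workloads separate preserves comparable scores, the sketch below computes an overall score from core sub-scores only. The geometric-mean aggregation, the function name, and every number are assumptions made purely for illustration; this post does not describe how WebXPRT 5 actually calculates its score.

```python
from math import prod

def overall_score(core_subscores):
    """Hypothetical aggregation: geometric mean of the core sub-scores."""
    values = list(core_subscores.values())
    return prod(values) ** (1 / len(values))

# Made-up sub-scores for illustration only.
core = {"Photo Effects": 210.0, "Homework Spellcheck": 190.0}
experimental = {"Hypothetical experimental workload": 75.0}

main_before = overall_score(core)

# Experimental results are reported next to the main score, never folded
# into it, so the core inputs (and the main score) stay the same.
report = {"overall": overall_score(core), "experimental": experimental}
main_after = report["overall"]
```

In this sketch, `main_before` and `main_after` are identical because the experimental result never enters the aggregation.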

We’re still evaluating whether we can finish the first experimental workload in time to include it in the WebXPRT 5 Preview release, but we will definitely have at least the section and the framework for adding such a workload. When we are confident that an experimental workload is ready to go, we’ll share more information here in the blog and be all set up to incorporate it.

Timeline

If all goes well, we hope to publish the WebXPRT 5 Preview very soon, followed by a general release in early 2026. If that timeline changes significantly, we’ll provide an update here in the blog as soon as possible.

What about an “AI score”?

We’re still discussing the concept of a stand-alone WebXPRT 5 “AI score,” and we go back and forth on it. That score would combine WebXPRT’s AI-related sub-scores into a single score for use in AI capability comparisons. Because we’re just now beefing up WebXPRT’s AI capabilities, we’ve decided not to include an AI score for now. We would love your feedback on the concept as we plan WebXPRT’s future. If that’s something that you would be interested in, please let us know!

If you have any questions about the WebXPRT 5 details we’ve shared above, please feel free to ask!

Justin

Multi-tab testing in a future version of WebXPRT?

In previous posts about our recommended best practices for producing consistent and reliable WebXPRT scores, we’ve emphasized the importance of “clean” testing. Clean testing involves minimizing the amount of background activity on a system during test runs to ensure stable test conditions. With stable test conditions, we can avoid common scenarios in which startup tasks, automatic updates, and other unpredictable processes contribute to high score variances and potentially unfair comparisons.
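One simple way to check whether test conditions were stable is to look at how much repeated scores vary. The sketch below computes the coefficient of variation for two sets of repeated runs. The scores are made-up numbers, and the function name is hypothetical rather than anything WebXPRT provides.

```python
from statistics import mean, stdev

def coefficient_of_variation(scores):
    """Return the sample standard deviation as a fraction of the mean."""
    return stdev(scores) / mean(scores)

# Made-up scores: a "clean" system versus one with background activity.
clean_runs = [250, 252, 249, 251, 250]
noisy_runs = [250, 231, 262, 240, 255]

clean_cv = coefficient_of_variation(clean_runs)
noisy_cv = coefficient_of_variation(noisy_runs)
```

A lower value means the runs agree more closely with one another, which is what clean testing aims for.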

Clean testing is a vital part of accurate performance benchmarking, but it doesn’t always show us what kind of performance we can expect in typical everyday conditions. For example, while a browser performance test like WebXPRT can provide clean testing scores that serve as a valuable proxy for overall system performance, an entire WebXPRT test run involves only two open browser tabs. Most of us will have many more tabs open at any given time during the day. Those tabs—and any associated background services, extensions, plug-ins, or renderers—have the potential to require CPU cycles and frequently consume memory resources. Depending on the number of tabs you leave open, the performance impact on your system can be noticeable. Even with modern browser tab management and resource-saving features, a proliferation of tabs can still have a significant impact on your computing experience.

To address this type of computing, we’ve been considering the possibility of adding one or more multi-tab testing features to a future version of WebXPRT. There are several ways we could do this, including the following options:

  • We could open each full workload cycle in a new tab, resulting in seven total tabs.
  • We could open each individual workload iteration in a new tab, resulting in 42 total tabs.
  • We could allow users to run multiple full tests back-to-back while keeping the tabs from the previous test(s) open.
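The tab counts in the first two options follow from the structure of a standard WebXPRT 4 test, which runs seven iterations of a six-workload suite. A quick sketch of that arithmetic, with variable names of our own choosing:

```python
ITERATIONS = 7   # full workload cycles in a standard WebXPRT 4 test
WORKLOADS = 6    # workloads in each cycle

tabs_per_cycle_option = ITERATIONS                  # one tab per full cycle
tabs_per_workload_option = ITERATIONS * WORKLOADS   # one tab per workload iteration

print(tabs_per_cycle_option, tabs_per_workload_option)  # prints "7 42"
```

The third option has no fixed count, because it depends on how many back-to-back tests a user chooses to run.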

If we do decide to add multi-tab features to a future version of WebXPRT, we could integrate them into the main score or make them optional and thus not affect traditional WebXPRT testing. We’re looking at all these options.

Whenever we have multiple choices, we seek your input. We want to know if a feature like this is something you’d like to see. Below, you’ll find two quick survey questions that will help us gauge your interest in this topic. We would appreciate your input!

Would you be interested in using future WebXPRT multi-tab testing features?

How many browser tabs do you typically leave open at one time?

If you’d like to share additional thoughts or ideas related to possible multi-tab features, please let us know!

Justin

Browser-based AI tests in WebXPRT 4: face detection and image classification

I recently revisited an XPRT blog entry that we posted from CES Las Vegas back in 2020. In that post, I reflected on the show’s expanded AI emphasis, and I wondered if we were reaching a tipping point where AI-enhanced and AI-driven tools and applications would become a significant presence in people’s daily lives. It felt like we were approaching that point back then with the prevalence of AI-powered features such as image enhancement and text recommendation, among many others. Now, seamless AI integration with common online tasks has become so widespread that many people unknowingly benefit from AI interactions several times a day.

As AI’s role in areas like everyday browser activity continues to grow—along with our expectations for what our consumer devices should be able to handle—reliable AI-oriented benchmarking is more vital than ever. We need objective performance data that can help us understand how well a new desktop, laptop, tablet, or phone will handle AI tasks.

WebXPRT 4 already includes timed AI tasks in two of its workloads: the “Organize Album using AI” workload and the “Encrypt Notes and OCR Scan” workload. These two workloads reflect the types of light browser-side inference tasks that are now fairly common in consumer-oriented web apps and extensions. In today’s post, we’ll provide some technical information about the Organize Album workload. In a future post, we’ll do the same for the Encrypt Notes workload.

The Organize Album workload includes two different timed tasks that reflect a common scenario of organizing online photo albums. The workload utilizes the AI inference and JavaScript capabilities of the WebAssembly (Wasm) version of OpenCV.js—an open-source computer vision and machine learning library. In WebXPRT 4, we used OpenCV.js version 4.5.2.

Here are the details for each task:

  • The first task measures the time it takes to complete a face detection job with a set of five 720 x 480 photos that we sourced from commercial photo sites. The workload loads a Caffe deep learning framework model (res10_300x300_ssd_iter_140000_fp16.caffemodel) using the commands found here.
  • The second task measures the time it takes to complete an image classification job (labeling based on object detection) with a different set of five 718 x 480 photos that we sourced from the ImageNet computer vision dataset. The workload loads an ONNX-based SqueezeNet machine learning model (squeezenet.onnx v 1.0) using the commands found here.

To produce a score for each iteration of the workload, WebXPRT calculates the total time that it takes for a system to organize both albums. In a standard test, WebXPRT runs seven iterations of the entire six-workload performance suite before calculating an overall test score. You can find out more about the WebXPRT results calculation process here.
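As a minimal sketch of the timing described above, the snippet below adds the two task times for each iteration. The millisecond values are made up and the names are hypothetical. The conversion from these times to a WebXPRT score is not shown, because it belongs to the results calculation process rather than to this post.

```python
# Made-up task timings in milliseconds for seven iterations.
face_detection_ms = [410, 402, 398, 405, 400, 399, 401]
image_classification_ms = [620, 611, 609, 615, 612, 610, 613]

# Total time to organize both albums in each iteration.
organize_album_ms = [
    face + classify
    for face, classify in zip(face_detection_ms, image_classification_ms)
]
```

Each entry in `organize_album_ms` is the combined time for one iteration of the workload.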

We hope this post will give you a better sense of how WebXPRT 4 measures one kind of AI performance. As a reminder, if you want to dig into the details at a more granular level, you can access the WebXPRT 4 source code for free. In previous blog posts, you can find information about how to access and use the code. You can also read more about WebXPRT’s overall structure and other workloads in the Exploring WebXPRT 4 white paper.

If you have any questions about this workload or any other aspect of WebXPRT 4, please let us know!

Justin

The XPRTs: What would you like to see in 2025?

If you’re a new follower of the XPRT family of benchmarks, you may not be aware of one of the characteristics that sets the XPRTs apart from many benchmarking efforts: our openness and commitment to valuing the feedback of tech journalists, lab engineers, and anyone else who uses the XPRTs on a regular basis. That feedback helps us ensure that as the XPRTs grow and evolve, the resources we offer will continue to meet the needs of those who use them.

In the past, user feedback has influenced specific aspects of our benchmarks, such as the length of test runs, UI features, results presentation, and the addition or subtraction of specific workloads. More broadly, we have also received suggestions for entirely new XPRTs and ways we might target emerging technologies or industry use cases.

As we look forward to what’s in store for the XPRTs in 2025, we’d love to hear your ideas about new XPRTs—or new features for existing XPRTs. Are you aware of hardware form factors, software platforms, new technologies, or prominent applications that are difficult or impossible to evaluate using existing performance benchmarks? Should we incorporate additional or different technologies into existing XPRTs through new workloads? Do you have suggestions for ways to improve any of the XPRTs or XPRT-related tools, such as results viewers?

We’re especially interested in your thoughts about the next steps for WebXPRT. If our recent blog posts about the potential addition of an AI-focused auxiliary workload, what a WebXPRT battery life test would entail, or possible WebAssembly-based test scenarios have piqued your interest, we’d love to hear your thoughts!

We’re genuinely interested in your answers to these questions and any other ideas you have, so please feel free to contact us. We look forward to hearing your thoughts and working together to figure out how they could help shape the XPRTs in 2025!

Justin

More than two million XPRT benchmark runs and downloads!

As we near the end of 2024, we’re excited to share that the XPRTs have passed another notable milestone—over 2,000,000 combined runs and downloads! The rate of growth in the total number of XPRT runs and downloads is exciting. It took about seven and a half years for the XPRTs to pass one million total runs and downloads—but it’s taken less than half that, three and a half years, to add another million. Figure 1 shows the climb to the two-million-run mark.

Figure 1: The cumulative number of total yearly XPRT runs and downloads over time.
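To put those two periods in perspective, here is a back-of-the-envelope calculation of the average yearly rate for each million, using the approximate durations given above. These are averages over each period, not actual yearly totals.

```python
first_million_years = 7.5    # approximate, from the text above
second_million_years = 3.5   # approximate, from the text above

first_rate = 1_000_000 / first_million_years     # about 133,000 per year
second_rate = 1_000_000 / second_million_years   # about 286,000 per year
speedup = second_rate / first_rate               # about 2.14x
```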

As you would expect, most of the runs contributing to that total come from WebXPRT tests. If you’ve run WebXPRT in any of the 983 cities and 84 countries from which we’ve received completed test data (including newcomers El Salvador, Malaysia, Morocco, and Saudi Arabia), we’re grateful for your help in reaching this milestone! As Figure 2 illustrates, WebXPRT use has grown steadily since the debut of WebXPRT 2013. On average, we now record more than twice as many WebXPRT runs each month as we recorded in WebXPRT’s entire first year. With over 340,000 runs so far in 2024, an increase of more than 16 percent over last year’s total, that growth is showing no signs of slowing down.

Figure 2: The cumulative number of total yearly WebXPRT runs over time.

This milestone isn’t just about numbers. Establishing and maintaining a presence in the industry and experiencing year-over-year growth requires more than technical know-how and marketing efforts. It requires the ongoing trust and support of the benchmarking community—including OEM labs, the tech press, and independent computer enthusiasts—and those who simply want to know how good their devices are at web browsing.

Once again, we’re thankful for the support of everyone who’s used the XPRTs over the years, and we look forward to another million!

If you have any questions or comments about any of the XPRTs, we’d love to hear from you!

Justin

Speaking of potential future WebXPRT workloads

In recent blog posts, we’ve discussed several types of potential future WebXPRT workloads—from an auxiliary AI-focused workload to a WebXPRT battery life test—and many of the factors that we would need to consider when developing those workloads. In today’s post, we’re discussing other types of workloads that we may consider for future WebXPRT versions. We’re also inviting you to send us your WebXPRT workload ideas!

Currently, the most promising web technology for future WebXPRT workloads is WebAssembly (Wasm). Wasm is a binary instruction format that works across all modern browsers, provides a sandboxed environment that operates at near-native speeds, and takes advantage of common hardware capabilities across platforms. Wasm’s capabilities offer web developers significant flexibility in running complex client applications within the browser.

We first made use of Wasm in WebXPRT 4’s Organize Album and Encrypt Notes workloads, but Wasm has the potential to support many more types of test scenarios. Here are just a few of the use-case categories that Wasm supports:

  • Gaming
  • Image and video editing
  • Video augmentation
  • CAD applications
  • Interactive learning portals
  • Language translation

Those categories and the possibilities they open for additional workloads are exciting! When thinking through possible new workload scenarios, it’s important to remember that workload proposals need to fit within a set of basic guidelines that uphold WebXPRT’s strengths as a benchmark. You can read about those guidelines in more detail in this blog post, but in short, new workloads ideally should

  • be relevant to real-life scenarios
  • have cross-platform support
  • clearly differentiate performance between different types of devices
  • produce consistent and easily replicated results

After testing with WebXPRT or reviewing the list of use cases that Wasm supports, have you considered a new workload or test scenario that you would like to see? If so, please let us know! Your ideas could end up playing a role in shaping the next version of WebXPRT!

Justin
