BenchmarkXPRT Blog banner

Category: Benchmark metrics

Thinking through a potential WebXPRT 4 battery life test

In recent blog posts, we’ve discussed some of the technical considerations we’re working through on our path toward a future AI-focused WebXPRT 4 auxiliary workload. While we’re especially excited about adding to WebXPRT 4’s AI performance evaluation capabilities, AI is not the only area of potential WebXPRT 4 expansion that we’ve thought about. We’re always open to hearing suggestions for ways we can improve WebXPRT 4, including any workload proposals you may have. Several users have asked about the possibility of a WebXPRT 4 battery life test, so today we’ll discuss what one might look like and some of the challenges we’d have to overcome to make it a reality.

Battery life tests fall into two primary categories: simple rundown tests and performance-weighted tests. Simple rundown tests measure battery life during extreme idle periods and loops of movie playbacks, etc., but do not reflect the wide-ranging mix of activities that characterize a typical day for most users. While they can be useful for performing very specific apples-to-apples comparisons, these tests don’t always give consumers an accurate estimate of the battery life they would experience in daily use.

In contrast, performance-weighted battery life tests, such as the one in CrXPRT 2, attempt to reflect real-world usage. The CrXPRT battery life test simulates common daily usage patterns for Chromebooks by including all the productivity workloads from the performance test, plus video playback, audio playback, and gaming scenarios. It also includes periods of wait/idle time. We believe this mixture of diverse activity and idle time better represents typical real-life behavior patterns. This makes the resulting estimated battery life much more helpful for consumers who are trying to match a device’s capabilities with their real-world needs.

From a technical standpoint, WebXPRT’s cross-platform nature presents us with several challenges that we did not face while developing the CrXPRT battery life test for ChromeOS. While the WebXPRT performance tests run in almost any browser, cross-browser differences and limitations in battery life reporting may restrict any future battery life test to a single browser or browser family. For instance, with the W3C Battery Status API, we can currently query battery status data from non-mobile Chromium-based browsers (e.g., Chrome, Edge, Opera, etc.), but not from Firefox or Safari. If a WebXPRT 4 battery life test supported only a single browser family, such as Chromium-based browsers, would you still be interested in using it? Please let us know.

A browser-based battery life workflow also presents other challenges that we do not face in native client applications, such as CrXPRT:

  • A browser-based battery life test may require the user to check the starting and ending battery capacities, with no way for the app to independently verify data accuracy.
  • The battery life test could require more babysitting in the event of network issues. We can catch network failures and try to handle them by reporting periods of network disconnection, but those interruptions could influence the battery life duration.
  • The factors above could make it difficult to achieve repeatability. One way to address that problem would be to run the test in a standardized lab environment with a steady internet connection, but a long list of standardized environmental requirements would make the battery life test less attractive and less accessible to many testers.

We’re not sharing these thoughts to make a WebXPRT 4 battery life test seem like an impossibility. Rather, we want to offer our perspective on what the test might look like and describe some of the challenges and considerations in play. If you have thoughts about battery life testing, or experience with battery life APIs in one or more of the major browsers, we’d love to hear from you!

Justin

Gain a deeper understanding of WebXPRT 4 with our results calculation white paper

More people around the world are using WebXPRT 4 now than ever before. It’s exciting to see that growth, which also means that many people are visiting our site and learning about the XPRTs for the first time. Because new visitors may not know how the XPRT family of benchmarks differs from other benchmarking efforts, we occasionally like to revisit the core values of our open development community here in the blog—and show how those values translate into more free resources for you.

One of our primary values is transparency in all our benchmark development and testing processes. We share information about our progress with XPRT users throughout the development process, and we invite people to contribute ideas and feedback along the way. We also publish both the source code of our benchmarks and detailed information about how they work, unlike benchmarks that use a “black box” model.

For WebXPRT 4 users who are interested in knowing more about the nuts and bolts of the benchmark, we offer several information-packed resources, including our focus for today, the WebXPRT 4 results calculation and confidence interval white paper. The white paper explains the WebXPRT 4 confidence interval, how it differs from typical benchmark variability, and the formulas the benchmark uses to calculate the individual workload scenario scores and overall score on the end-of-test results screen. The paper also provides an overview of the statistical methodology that WebXPRT uses to translate raw timings into scores.

In addition to the white paper’s discussion of the results calculation process, we’ve also provided a results calculation spreadsheet that shows the raw data from a sample test run and reproduces the calculations WebXPRT uses to generate both the workload scores and an overall score.

In potential future versions of WebXPRT, it’s likely that we’ll continue to use the same—or very similar—statistical methodologies and results calculation formulas that we’ve documented in the results calculation white paper and spreadsheet. That said, if you have suggestions for how we could improve those methods or formulas—either in part or in whole—please don’t hesitate to contact us. We’re interested in hearing your ideas!

The white paper is available on WebXPRT.com and on our XPRT white papers page. If you have any questions about the paper or spreadsheet, WebXPRT, or the XPRTs in general, please let us know.

Justin

How to automate your WebXPRT 4 testing

We’re excited about the ongoing upward trend in the number of completed WebXPRT 4 runs that we’re seeing each month. OEM and tech press labs are responsible for a significant amount of that growth, and many of them use WebXPRT’s automation features to complete large blocks of hands-off testing at one time. We realize that many new WebXPRT users may be unfamiliar with the benchmark’s automation tools, so we decided to provide a quick guide to WebXPRT automation in today’s blog. Whether you’re testing one or 1,000 devices, the instructions below can help save you some time.

WebXPRT 4 allows users to run scripts in an automated fashion and control test execution by appending parameters and values to the WebXPRT URL. Three parameters are available:

  • test type
  • test scenarios
  • results

Below, you’ll find a description of those parameters and instructions for how you can use them to automate your test runs.

Test type

The WebXPRT automation framework accounts for two test types: (1) the six core workloads, and (2) any experimental workloads we might add in future builds. There are currently no experimental tests in WebXPRT 4, so always set the test type variable to 1.

  • Core tests: 1

Test scenario

The test scenario parameter lets you specify which subtest workloads to run by using the following codes:

  • Photo enhancement: 1
  • Organize album using AI: 2
  • Stock option pricing: 4
  • Encrypt notes and OCR scan using WASM: 8
  • Sales graphs: 16
  • Online homework: 32

To run a single subtest workload, use its code. To run multiple workloads, use the sum of their codes. For example, to run Stock options pricing (4) and Encrypt notes and OCR scan (8), use the sum of 12. To run all core tests, use 63, the sum of all the individual test codes (1 + 2 + 4 + 8 + 16 + 32 = 63).

Results format

The results format parameter lets you select the format of the results:

  • Display the result as an HTML table: 1
  • Display the result as XML: 2
  • Display the result as CSV: 3
  • Download the result as CSV: 4

To use the automation feature, start with the URL https://www.principledtechnologies.com/benchmarkxprt/webxprt/2021/wx4_build_3_7_3, append a question mark (?), and add the parameters and values separated by ampersands (&). For example, to run all the core tests and download the results, you would use the following URL: https://principledtechnologies.com/benchmarkxprt/webxprt/2021/wx4_build_3_7_3/auto.php?testtype=1&tests=63&result=4

We hope WebXPRT 4’s automation features will make testing easier for you. If you have any questions about WebXPRT or the automation process, please feel free to ask!

Justin

Shopping for back-to-school tech? The XPRTs can help!

For many students, the first day of school is just around the corner, and it’s now time to shop for new tech devices that can help set them up for success in the coming year. The tech marketplace can be confusing, however, with so many brands, options, and competing claims to sort through.

Fortunately, the XPRTs are here to help!

Whether you’re shopping for a new phone, tablet, Chromebook, laptop, or desktop, the XPRTs can provide industry-trusted performance scores that can give you confidence that you’re making a smart purchasing decision.

The WebXPRT 4 results viewer is a good place to start looking for device scores. The viewer displays WebXPRT 4 scores from over 700 devices—including many of the latest releases—and we’re adding new scores all the time. To learn more about the viewer’s capabilities and how you can use it to compare devices, check out this blog post.

Another resource we offer is the XPRT results browser. The browser is the most efficient way to access the XPRT results database, which currently holds more than 3,700 test results from over 150 sources, including major tech review publications around the world, manufacturers, and independent testers. It offers a wealth of current and historical performance data across all the XPRT benchmarks and hundreds of devices. You can read more about how to use the results browser here.

Also, if you’re considering a popular device, there’s a good chance that a recent tech review includes an XPRT score for that device. There are two quick ways to find these reviews: You can either (1) search for “XPRT” on your preferred tech review site or (2) use a search engine and input the device name and XPRT name, such as “Dell XPS” and “WebXPRT.”

Here are a few recent tech reviews that use one of the XPRTs to evaluate a popular device:

Lastly, here at Principled Technologies, we frequently publish reports that evaluate the performance of hot new consumer devices, and many of those reports include WebXPRT scores. For example, check out our extensive testing of HP ZBook G10 mobile workstations or our detailed comparison of Lenovo ThinkPad, ThinkBook, and ThinkCentre devices to their Apple Mac counterparts.

The XPRTs can help anyone stuck in the back-to-school shopping blues make better-informed and more confident tech purchases. As this new school year begins, we hope you’ll find the data you need on our site or in an XPRT-related tech review. If you have any questions about the XPRTs, XPRT scores, or the results database please feel free to ask!

Justin

Putting together a good WebXPRT workload proposal

Recently, we announced that we’re moving forward with the development of a new AI-focused WebXPRT 4 workload. It will be an auxiliary workload, which means that it will run as a separate, optional test, and it won’t affect existing WebXPRT 4 tests or scores. Although the inspiration for this new workload came from internal WebXPRT discussions—and, let’s face it, from the huge increase in importance of AI—we wanted to remind you that we’re always open to hearing your WebXPRT workload ideas. If you’d like to submit proposals for new workloads, you don’t have to follow a formal process. Just contact us, and we’ll start the conversation.

If you do decide to send us a workload proposal, it will be helpful to know the types of parameters that we keep in mind. Below, we discuss some of the key questions we ask when we evaluate new WebXPRT workload ideas.

Will it be relevant and interesting to real users, lab testers, and tech reviewers?

When considering a WebXPRT workload proposal, the first two criteria are simple: is it relevant in real life, and are people interested in the workload? We created WebXPRT to evaluate device performance using web-based tasks that consumers are likely to experience daily, so real-life relevance has always been an essential requirement for us throughout development. There are many technologies, functions, and use cases that we could test in a web environment, but only some are relevant to common applications or usage patterns and are likely to draw the interest of real users, lab testers, and technical reviewers.

Will it have cross-platform support?

Currently, WebXPRT runs on almost any web browser and almost every device that supports a web browser. We would like to keep that level of cross-platform support when we introduce new workloads. However, technical differences in how various browsers execute tasks make it challenging to include certain scenarios without undermining our cross-platform ideal. When considering any workload proposal, one of the first questions we ask is, “Will it work on all the major browsers and operating systems?”

There are special exceptions to this guideline. For instance, we’re still in the early days of browser-based AI, and it’s unlikely that a new browser-based AI workload will run on every major browser. If it’s a particularly compelling idea, such as the AI scenario we’re currently working on, we may consider including it as an auxiliary test.

Will it differentiate performance between different types of devices?

XPRT benchmarks provide users with accurate measures for evaluating how well target systems or technologies perform specific tasks. With a broadly targeted benchmark like WebXPRT, if the workloads are so heavy that most devices can’t handle them or so light that most devices complete them without being taxed, the results will be of little use for helping buyers evaluating systems and making purchasing decisions, OEM labs, and the tech press.

That’s why, with any new WebXPRT workload, we look for a sweet spot with respect to how computationally demanding it will be. We want it to run on a wide range of devices—from low-end devices that are several years old to brand-new high-end devices, and everything in between. We also want users to see a wide range of workload scores and resulting overall scores that accurately reflect the experiences those systems deliver, so they can easily grasp the different performance capabilities of the devices under test.

Will results be consistent and easily replicated?

Finally, WebXPRT workloads should produce scores that consistently fall within an acceptable margin of error and are easily replicated with additional testing or comparable gear. Some web technologies are very sensitive to uncontrollable or unpredictable variables, such as internet speed. A workload that measures one of those technologies would be unlikely to produce results that are consistent and easily replicated.

We hope this post will be useful if you’re thinking about potential new workloads that you’d like to see in WebXPRT. If you have any general thoughts about browser performance testing or specific workload ideas that you’d like us to consider, please let us know.

Justin

Updating our WebXPRT 4 browser performance comparisons with new gear

Once or twice per year, we refresh an ongoing series of WebXPRT comparison tests to see if recent updates have reordered the performance rankings of popular web browsers. We published our most recent comparison in January, when we used WebXPRT 4 to compare the performance of five browsers on the same system.

This time, we’re publishing an updated set of comparison scores sooner than we normally would because we chose to move our testing to a newer reference laptop. The previous system—a Dell XPS 13 7930 with an Intel Core i3-10110U processor and 4 GB of RAM—is now several years old. We wanted to transition to a system that is more in line with current mid-range laptops. By choosing to test on a capable mid-tier laptop, our comparison scores are more likely to fall within the range of scores we would see from a typical user today.

Our new reference system is a Lenovo ThinkPad T14s Gen 3 with an Intel Core i7-1270P processor and 16 GB of RAM. It’s running Windows 11 Pro, updated to version 23H2 (22631.3527). Before testing, we installed all current Windows updates and tested on a clean system image. After the update process was complete, we turned off updates to prevent any further updates from interfering with test runs. We ran WebXPRT 4 three times each on five browsers: Brave, Google Chrome, Microsoft Edge, Mozilla Firefox, and Opera. In Figure 1 below, each browser’s score is the median of the three test runs.

In our last round of tests—on the Dell XPS 13—the four Chromium-based browsers (Brave, Chrome, Edge, and Opera) produced close scores, with Edge taking a small lead among the four. Each of the Chromium browsers significantly outperformed Firefox, with the slowest of the Chromium browsers (Brave) outperforming Firefox by 13.5 percent.

In this round of tests—on the Lenovo ThinkPad T14s—the scores were very tight, with a difference of only 4 percent between the last-place browser (Brave) and the winner (Chrome). Interestingly, Firefox no longer trailed the four Chromium browsers—it was squarely in the middle of the pack.

Figure 1: The median scores from running WebXPRT 4 three times with each browser on the Lenovo ThinkPad T14s.

Unlike previous rounds that showed a higher degree of performance differentiation between the browsers, scores from this round of tests are close enough that most users wouldn’t notice a difference. Even if the difference between the highest and lowest scores was substantial, the quality of your browsing experience will often depend on factors such as the types of things you do on the web (e.g., gaming, media consumption, or multi-tab browsing), the impact of extensions on performance, and how frequently the browsers issue updates and integrate new technologies, among other things. It’s important to keep such variables in mind when thinking about how browser performance comparison results may translate to your everyday web experience.

Have you tried using WebXPRT 4 to test the speed of different browsers on the same system? If so, we’d love for you to tell us about it! Also, please tell us what other WebXPRT data you’d like to see!

Justin

Check out the other XPRTs:

Forgot your password?