BenchmarkXPRT Blog banner

Category: battery life

Local AI and new frontiers for performance evaluation

Recently, we discussed some ways the PC market may evolve in 2024, and how new Windows on Arm PCs could present the XPRTs with many opportunities for benchmarking. In addition to a potential market shakeup from Arm-based PCs in the coming years, there’s a much broader emerging trend that could eventually revolutionize almost everything about the way we interact with our personal devices—the development of local, dedicated AI processing units for consumer-oriented tech.

AI already impacts daily life for many consumers through technologies such as such as predictive text, computer vision, adaptive workflow apps, voice recognition, smart assistants, and much more. Generative AI-based technologies are rapidly establishing a permanent, society-altering presence across a wide range of industries. Aside from some localized inference tasks that the CPU and/or GPU typically handle, the bulk of the heavy compute power that fuels those technologies has been in the cloud or in on-prem servers. Now, several major chipmakers are working to roll out their own versions of AI-optimized neural processing units (NPUs) that will enable local devices to take on a larger share of the AI load.

Examples of dedicated AI hardware in recently-released or upcoming consumer devices include Intel’s new Meteor Lake NPU, Apple’s Neural Engine for M-series SoCs, Qualcomm’s Hexagon NPU, and AMD’s XDNA 2 architecture. The potential benefits of localized, NPU-facilitated AI are straightforward. On-device AI could reduce power consumption and extend battery life by offloading those tasks from the CPUs. It could alleviate certain cloud-related privacy and security concerns. Without the delays inherent in cloud queries, localized AI could execute inference tasks that operate much closer to real time. NPU-powered devices could fine-tune applications around your habits and preferences, even while offline. You could pull and utilize relevant data from cloud-based datasets without pushing private data in return. Theoretically, your device could know a great deal about you and enhance many areas of your daily life without passing all that data to another party.

Will localized AI play out that way? Some tech companies envision a role for on-device AI that enhances the abilities of existing cloud-based subscription services without decoupling personal data. We’ll likely see a wide variety of capabilities and services on offer, with application-specific and SaaS-determined privacy options.

Regardless of the way on-device AI technology evolves in the coming years, it presents an exciting new frontier for benchmarking. All NPUs will not be created equal, and that’s something buyers will need to understand. Some vendors will optimize their hardware more for computer vision, or large language models, or AI-based graphics rendering, and so on. It won’t be enough for business and consumers to simply know that a new system has dedicated AI processing abilities. They’ll need to know if that system performs well while handling the types of AI-related tasks that they do every day.

Here at the XPRTs, we specialize in creating benchmarks that feature real-world scenarios that mirror the types of tasks that people do in their daily lives. That approach means that when people use XPRT scores to compare device performance, they’re using a metric that can help them make a buying decision that will benefit them every day. We look forward to exploring ways that we can bring XPRT benchmarking expertise to the world of on-device AI.

Do you have ideas for future localized AI workloads? Let us know!

Justin

The evolving PC market brings new opportunities for WebXPRT

Here at the XPRTs, we have to spend time examining what’s next in the tech industry, because the XPRTs have to keep up with the pace of innovation. In our recent discussions about 2024, a major recurring topic has been the potential impact of Qualcomm’s upcoming line of SOCs designed for Windows on Arm PCs.

Now, Windows on Arm PCs are certainly not new. Since Windows RT launched on the Arm-based Microsoft Surface RT in 2012, various Windows on Arm devices have come and gone, but none of them—except for some Microsoft SQ-based Surface devices—have made much of a name for themselves in the consumer market.

The reasons for these struggles are straightforward. While Arm-based PCs have the potential to offer consumers the benefits of excellent battery life and “always-on” mobile communications, the platform has historically lagged Intel- and AMD-based PCs in performance. Windows on Arm devices have also faced the challenge of a lack of large-scale buy-in from app developers. So, despite the past involvement of device makers like ASUS, HP, Lenovo, and Microsoft, the major theme of the Windows on Arm story has been one of very limited market acceptance.

Next year, though, the theme of that story may change. If it does, WebXPRT 4 is well-positioned to play an important part.

At the recent Qualcomm Technology Summit, the company unveiled the new 4nm Snapdragon X Elite SOC, which includes an all-new 12-core Oryon CPU, an integrated Adreno GPU, and an integrated Hexagon NPU (neural processing unit) designed for AI-powered applications. Company officials presented performance numbers that showed the X Elite surpassing the performance of late-gen AMD, Apple, and Intel competitor platforms, all while using less power.

Those are massive claims, and of course the proof will come—or not—only when systems are available for test. (In the past, companies have made similar claims about Windows on Arm advantages, only to see those claims evaporate by the time production devices show up on store shelves.)

Will Snapdragon X Elite systems demonstrate unprecedented performance and battery life when they hit the market? How will the performance of those devices stack up to Intel’s Meteor Lake systems and Apple’s M3 offerings? We don’t yet know how these new devices may shake up the PC market, but we do know that it looks like 2024 will present us with many golden opportunities for benchmarking. Amid all the marketing buzz, buyers everywhere will want to know about potential trade-offs between price, power, and battery life. Tech reviewers will want to dive into the details and provide useful data points, but many traditional PC benchmarks simply won’t work with Windows on ARM systems. As a go-to, cross-platform favorite of many OEMs—that runs on just about anything with a browser—WebXPRT 4 is in a perfect position to provide reviewers and consumers with relevant performance comparison data.

It’s quite possible that 2024 may be the biggest year for WebXPRT yet!

Justin

A note about CrXPRT 2

Recent visitors to CrXPRT.com may have seen a notice that encourages visitors to use WebXPRT 4 instead of CrXPRT 2 for performance testing on high-end Chromebooks. The notice reads as follows:

NOTE: Chromebook technology has progressed rapidly since we released CrXPRT 2, and we’ve received reports that some CrXPRT 2 workloads may not stress top-bin Chromebook processors enough to give the necessary accuracy for users to compare their performance. So, for the latest test to compare the performance of high-end Chromebooks, we recommend using WebXPRT 4.

We made this recommendation because of the evident limitations of the CrXPRT 2 performance workloads when testing newer high-end hardware. CrXPRT 2 itself is not that old (2020), but when we created the CrXPRT 2 performance workloads, we started with a core framework of CrXPRT 2015 performance workloads. In a similar way, we built the CrXPRT 2015 workloads on a foundation of WebXPRT 2015 workloads. At the time, the harness and workload structures we used to ensure WebXPRT 2015’s cross-browser capabilities provided an excellent foundation that we could adapt for our new ChromeOS benchmark. Consequently, CrXPRT 2 is a close developmental descendant of WebXPRT 2015. Some of the legacy WebXPRT 2015/CrXPRT 2 workloads do not stress current high-end processors—a limitation that prevents effective performance testing differentiation—nor do they engage the latest web technologies.

In the past, the Chromebook market skewed heavily toward low-cost devices with down-bin, inexpensive processors, making this limitation less of an issue. Now, however, more Chromebooks offer top-bin processors on par with traditional laptops and workstations. Because of the limitations of the CrXPRT 2 workloads, we now recommend WebXPRT 4 for both cross-browser and ChromeOS performance testing on the latest high-end Chromebooks. WebXPRT 4 includes updated test content, newer JavaScript tools and libraries, modern WebAssembly workloads, and additional Web Workers tasks that cover a wide range of performance requirements.

While CrXPRT 2 continues to function as a capable performance and battery life comparison test for many ChromeOS devices, WebXPRT 4 is a more appropriate tool to use with new high-end devices. If you haven’t yet used WebXPRT 4 for Chromebook comparison testing, we encourage you to give it a try!

If you have any questions or concerns about CrXPRT 2 or WebXPRT 4, please don’t hesitate to ask!

Justin

The role of potential WebXPRT 4 auxiliary workloads

As we mentioned in our most recent blog post, we’re seeking suggestions for ways to improve WebXPRT 4. We’re open to the prospect of adding both non-workload features and new auxiliary tests, e.g., a battery life or WebGPU-based graphics test scenario.

To prevent any confusion among WebXPRT 4 testers, we want to reiterate that any auxiliary workloads we might add will not affect existing WebXPRT 4 subtest or overall scores in any way. Auxiliary tests would be experimental or targeted workloads that run separately from the main test and produce their own scores. Current and future WebXPRT 4 results will be comparable to one another, so users who’ve already built a database of WebXPRT 4 scores will not have to retest their devices. Any new tests will be add-ons that allow us to continue expanding the rapidly growing body of published WebXPRT 4 test results while making the benchmark even more valuable to users overall.

If you have any thoughts about potential browser performance workloads, or any specific web technologies that you’d like to test, please let us know.

Justin

How we evaluate new WebXPRT workload proposals

A key value of the BenchmarkXPRT Development Community is our openness to user feedback. Whether it’s positive feedback about our benchmarks, constructive criticism, ideas for completely new benchmarks, or proposed workload scenarios for existing benchmarks, we appreciate your input and give it serious consideration.

We’re currently accepting ideas and suggestions for ways we can improve WebXPRT 4. We are open to adding both non-workload features and new auxiliary tests, which can be experimental or targeted workloads that run separately from the main test and produce their own scores. You can read more about experimental WebXPRT 4 workloads here. However, a recent user question about possible WebGPU workloads has prompted us to explain the types of parameters that we consider when we evaluate a new WebXPRT workload proposal.

Community interest and real-life relevance

The first two parameters we use when evaluating a WebXPRT workload proposal are straightforward: are people interested in the workload and is it relevant to real life? We originally developed WebXPRT to evaluate device performance using the types of web-based tasks that people are likely to encounter daily, and real-life relevancy continues to be an important criterion for us during development. There are many technologies, functions, and use cases that we could test in a web environment, but only some of them are both relevant to common applications or usage patterns and likely to be interesting to lab testers and tech reviewers.

Maximum cross-platform support

Currently, WebXPRT runs in almost any web browser, on almost any device that has a web browser, and we would ideally maintain that broad level of cross-platform support when introducing new workloads. However, technical differences in the ways that different browsers execute tasks mean that some types of scenarios would be impossible to include without breaking our cross-platform commitment.

One reason that we’re considering auxiliary workloads with WebXPRT, e.g., a battery life rundown, is that those workloads would allow WebXPRT to offer additional value to users while maintaining the cross-platform nature of the main test. Even if a battery life test ran on only one major browser, it could still be very useful to many people.

Performance differentiation

Computer benchmarks such as the XPRTs exist to provide users with reliable metrics that they can use to gauge how well target platforms or technologies perform certain tasks. With a broadly targeted benchmark such as WebXPRT, if the workloads are so heavy that most devices can’t handle them, or so light that most devices complete them without being taxed, the results will have little to no use for OEM labs, the tech press, or independent users when evaluating devices or making purchasing decisions.

Consequently, with any new WebXPRT workload, we try to find a sweet spot in terms of how demanding it is. We want it to run on a wide range of devices—from low-end devices that are several years old to brand-new high-end devices and everything in between. We also want users to see a wide range of workload scores and resulting overall scores, so they can easily grasp the different performance capabilities of the devices under test.

Consistency and replicability

Finally, workloads should produce scores that consistently fall within an acceptable margin of error, and are easily to replicate with additional testing or comparable gear. Some web technologies are very sensitive to uncontrollable or unpredictable variables, such as internet speed. A workload that measures one of those technologies would be unlikely to produce results that are consistent and easily replicated.

We hope this post will be useful for folks who are contemplating potential new WebXPRT workloads. If you have any general thoughts about browser performance testing, or specific workload ideas that you’d like us to consider, please let us know.

Justin

An update on Chrome OS XPRT benchmark development

In July, we discussed the Chrome OS team’s decision to end support for Chrome apps, and how that will prevent us from publishing any future fixes or updates for CrXPRT 2. We also announced our goal of beginning development of an all-new Chrome OS XPRT benchmark by the end of this year. While we are actively discussing this benchmark and researching workload technologies and scenarios, we don’t foresee releasing a preview build this year.

The good news is that, in spite of a lack of formal support from the Chrome OS team, the CrXPRT 2 performance and battery life tests currently run without any known issues. We continue to monitor the status of CrXPRT and will inform our blog readers of any significant changes.

If you have any questions about CrXPRT, or ideas about the types of features or workloads you’d like to see in a new Chrome OS benchmark, please let us know!

Justin

Check out the other XPRTs:

Forgot your password?