Category: Future of performance evaluation

Up next for WebXPRT 4: A new AI-focused workload!

on May 30, 2024

We’re always thinking about ways to improve WebXPRT. In the past, we’ve discussed the potential benefits of auxiliary workloads and the role that such workloads might play in future WebXPRT updates and versions. Today, we’re very excited to announce that we’ve decided to move forward with the development of a new WebXPRT 4 workload focused on browser-side AI technology!

WebXPRT 4 already includes timed AI tasks in two of its workloads: the Organize Album using AI workload and the Encrypt Notes and OCR Scan workload. These two workloads reflect the types of light browser-side inference tasks that have been available for a while now, but most heavy-duty inference on the web has historically happened in on-prem servers or in the cloud. Now, localized AI technology is growing by leaps and bounds, and the integration of new AI capabilities with browser-based tasks is on the threshold of advancing rapidly.

Because of this growth, we believe now is the time to start work on giving WebXPRT 4 the ability to evaluate new browser-based AI capabilities, which are likely to become a part of everyday life in the next few years. We haven’t yet decided on a test scenario or software stack for the new workload, but we’ll be working to refine our plan in the coming months. There seems to be some initial promise in emerging frameworks such as ONNX Runtime Web, which allows users to run and deploy web-based machine learning models by using JavaScript APIs and libraries. In addition, new Web APIs like WebGPU (currently supported in Edge and Chrome, and in tech preview in Safari) and WebNN (in development) may soon help facilitate new browser-side AI workloads.
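To give a rough sense of what browser support for these APIs looks like from a page's point of view, here is a minimal TypeScript sketch of feature detection. It is only an illustration, not part of WebXPRT 4 or any planned workload. The names `detectAiApis` and `AiApiSupport` are hypothetical. The sketch assumes only that WebGPU is exposed as `navigator.gpu` and WebNN as `navigator.ml`, as their specifications describe.

```typescript
// Minimal sketch: report which browser-side AI-related APIs a page can see.
// `AiApiSupport` and `detectAiApis` are hypothetical names used for illustration.

interface AiApiSupport {
  webgpu: boolean; // true when the WebGPU entry point (navigator.gpu) is present
  webnn: boolean;  // true when the WebNN entry point (navigator.ml) is present
}

// Accepts any navigator-like object so the check can also run outside a browser.
function detectAiApis(nav: object | null | undefined): AiApiSupport {
  const hasNav = nav !== null && nav !== undefined;
  return {
    webgpu: hasNav && "gpu" in nav,
    webnn: hasNav && "ml" in nav,
  };
}

// In a browser this reads the real navigator; elsewhere it may be undefined.
const currentNavigator = Reflect.get(globalThis, "navigator");
console.log(detectAiApis(currentNavigator));
```

A check like this only shows that the entry point exists. For WebGPU, a page would still need to call `navigator.gpu.requestAdapter()`, which can resolve to `null` on hardware that the browser cannot use.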

We know that many longtime WebXPRT 4 users will have questions about how this new workload may affect their tests. We want to assure you that the workload will be an optional bonus workload and will not run by default during normal WebXPRT 4 tests. As you consider possibilities for the new workload, here are a few points to keep in mind:

The workload will be optional for users to run.
It will not affect the main WebXPRT 4 subtest or overall scores in any way.
It will run separately from the main test and will produce its own score(s).
Current and future WebXPRT 4 results will still be comparable to one another, so users who’ve already built a database of WebXPRT 4 scores will not have to retest their devices.
Because many of the available frameworks don’t currently run on all browsers, the workload may not run on every platform.

As we research available technologies and explore our options, we would love to hear from you. If you have ideas for an AI workload scenario that you think would be useful or thoughts on how we should implement it, please let us know! We’re excited about adding new technologies and new value to WebXPRT 4, and we look forward to sharing more information here in the blog as we make progress.

Justin

Posted in AI, benchmark, BenchmarkXPRT, BenchmarkXPRT development community, browser performance, Browser-based benchmarks, Chrome, Collaborative benchmark development, Future of performance evaluation, JavaScript, Microsoft Edge, on-device AI, ONNX Runtime Web, Performance benchmarking, Safari, WebGPU, WebNN, WebXPRT, WebXPRT 4 | Tagged AI, benchmark, BenchmarkXPRT, browser benchmark, browser performance, cross-platform, OCR, ONNX, WebGPU, WebNN, WebXPRT, WebXPRT 4 |

Local AI and new frontiers for performance evaluation

By Justin Greene

on December 19, 2023

Recently, we discussed some ways the PC market may evolve in 2024, and how new Windows on Arm PCs could present the XPRTs with many opportunities for benchmarking. In addition to a potential market shakeup from Arm-based PCs in the coming years, there’s a much broader emerging trend that could eventually revolutionize almost everything about the way we interact with our personal devices—the development of local, dedicated AI processing units for consumer-oriented tech.

AI already impacts daily life for many consumers through technologies such as predictive text, computer vision, adaptive workflow apps, voice recognition, smart assistants, and much more. Generative AI-based technologies are rapidly establishing a permanent, society-altering presence across a wide range of industries. Aside from some localized inference tasks that the CPU and/or GPU typically handle, the bulk of the heavy compute power that fuels those technologies has been in the cloud or in on-prem servers. Now, several major chipmakers are working to roll out their own versions of AI-optimized neural processing units (NPUs) that will enable local devices to take on a larger share of the AI load.

Examples of dedicated AI hardware in recently-released or upcoming consumer devices include Intel’s new Meteor Lake NPU, Apple’s Neural Engine for M-series SoCs, Qualcomm’s Hexagon NPU, and AMD’s XDNA 2 architecture. The potential benefits of localized, NPU-facilitated AI are straightforward. On-device AI could reduce power consumption and extend battery life by offloading AI tasks from the CPU. It could alleviate certain cloud-related privacy and security concerns. Without the delays inherent in cloud queries, localized AI could execute inference tasks that operate much closer to real time. NPU-powered devices could fine-tune applications around your habits and preferences, even while offline. You could pull and utilize relevant data from cloud-based datasets without pushing private data in return. Theoretically, your device could know a great deal about you and enhance many areas of your daily life without passing all that data to another party.

Will localized AI play out that way? Some tech companies envision a role for on-device AI that enhances the abilities of existing cloud-based subscription services without decoupling personal data. We’ll likely see a wide variety of capabilities and services on offer, with application-specific and SaaS-determined privacy options.

Regardless of the way on-device AI technology evolves in the coming years, it presents an exciting new frontier for benchmarking. Not all NPUs will be created equal, and that’s something buyers will need to understand. Some vendors will optimize their hardware more for computer vision, or large language models, or AI-based graphics rendering, and so on. It won’t be enough for businesses and consumers to simply know that a new system has dedicated AI processing abilities. They’ll need to know if that system performs well while handling the types of AI-related tasks that they do every day.

Here at the XPRTs, we specialize in creating benchmarks that feature real-world scenarios that mirror the types of tasks that people do in their daily lives. That approach means that when people use XPRT scores to compare device performance, they’re using a metric that can help them make a buying decision that will benefit them every day. We look forward to exploring ways that we can bring XPRT benchmarking expertise to the world of on-device AI.

Do you have ideas for future localized AI workloads? Let us know!

Justin

Posted in AI, AMD, Apple, Arm, battery life, benchmark, Benchmark metrics, Benchmarking, Cloud, computer vision, Cross-platform benchmarks, Data privacy, Future of performance evaluation, graphics, Intel, large language models, Machine learning, Meteor Lake, NPU, on premises, on-device AI, PCs, Qualcomm, SaaS, WebXPRT, WebXPRT 4 | Tagged AI, AMD, Apple, benchmark, BenchmarkXPRT, computer vision, Intel, large language models, local AI, machine learning, Meteor Lake, on-device AI, Qualcomm, rendering, SaaS, WebXPRT, WebXPRT 4 |

The evolving PC market brings new opportunities for WebXPRT

By Justin Greene

on December 7, 2023

Here at the XPRTs, we have to spend time examining what’s next in the tech industry, because the XPRTs have to keep up with the pace of innovation. In our recent discussions about 2024, a major recurring topic has been the potential impact of Qualcomm’s upcoming line of SoCs designed for Windows on Arm PCs.

Now, Windows on Arm PCs are certainly not new. Since Windows RT launched on the Arm-based Microsoft Surface RT in 2012, various Windows on Arm devices have come and gone, but none of them—except for some Microsoft SQ-based Surface devices—have made much of a name for themselves in the consumer market.

The reasons for these struggles are straightforward. While Arm-based PCs have the potential to offer consumers the benefits of excellent battery life and “always-on” mobile communications, the platform has historically lagged Intel- and AMD-based PCs in performance. Windows on Arm devices have also faced the challenge of a lack of large-scale buy-in from app developers. So, despite the past involvement of device makers like ASUS, HP, Lenovo, and Microsoft, the major theme of the Windows on Arm story has been one of very limited market acceptance.

Next year, though, the theme of that story may change. If it does, WebXPRT 4 is well-positioned to play an important part.

At the recent Qualcomm Technology Summit, the company unveiled the new 4nm Snapdragon X Elite SoC, which includes an all-new 12-core Oryon CPU, an integrated Adreno GPU, and an integrated Hexagon NPU (neural processing unit) designed for AI-powered applications. Company officials presented performance numbers that showed the X Elite surpassing the performance of late-gen AMD, Apple, and Intel competitor platforms, all while using less power.

Those are massive claims, and of course the proof will come—or not—only when systems are available for test. (In the past, companies have made similar claims about Windows on Arm advantages, only to see those claims evaporate by the time production devices show up on store shelves.)

Will Snapdragon X Elite systems demonstrate unprecedented performance and battery life when they hit the market? How will the performance of those devices stack up to Intel’s Meteor Lake systems and Apple’s M3 offerings? We don’t yet know how these new devices may shake up the PC market, but we do know that it looks like 2024 will present us with many golden opportunities for benchmarking. Amid all the marketing buzz, buyers everywhere will want to know about potential trade-offs between price, power, and battery life. Tech reviewers will want to dive into the details and provide useful data points, but many traditional PC benchmarks simply won’t work with Windows on Arm systems. As a go-to, cross-platform favorite of many OEMs that runs on just about anything with a browser, WebXPRT 4 is in a perfect position to provide reviewers and consumers with relevant performance comparison data.

It’s quite possible that 2024 may be the biggest year for WebXPRT yet!

Justin

Posted in AI, AMD, Apple, Arm, ASUS, battery life, Benchmark metrics, Benchmarking, BenchmarkXPRT, browser performance, Browser-based benchmarks, Cross-platform benchmarks, Future of performance evaluation, Intel, Laptops, M3, Meteor Lake, PCs, Performance benchmarking, Qualcomm, WebXPRT, WebXPRT 4 | Tagged Adreno, AMD, Apple, Arm, ASUS, benchmark, BenchmarkXPRT, Hexagon, HP, Intel, laptops, Lenovo, M3, Meteor Lake, Microsoft, Oryon, PCs, Qualcomm, Snapdragon, SOC, Surface |

The Mobile World Congress 2023 recap video is live!

By Justin Greene

on March 14, 2023

Once again, the talented studio team here at Principled Technologies has worked with the XPRT team to put together some great content. This week, we published a recap video of Mark’s recent trip to Mobile World Congress (MWC) 2023 in Barcelona. In the video, Mark discusses how “velocity” was the headline theme of this year’s MWC, but infrastructure was the true backbone of the show. The challenges of developing the infrastructure necessary for tomorrow’s tech aren’t limited to obvious topics like 5G deployment, bandwidth, and accessibility; they also include “soft” infrastructure topics like security, management, and reliability. As everything gets faster, we’ll need tools such as the XPRTs to provide reliable information about which devices can handle the increased demands of tomorrow’s tech.

We encourage you to check out the video, along with Mark’s blog post from the show and MWC-related photos on social media. To view the video, you can follow this link or click the screenshot below. If you attended or followed MWC this year and have any thoughts about how the XPRTs can help to evaluate cutting-edge technologies, we’d love to hear from you!

Justin

Posted in 5G, Benchmarking, BenchmarkXPRT, Future of performance evaluation, Mobile data, Mobile devices, Mobile World Congress, Trade Shows | Tagged 5G, Barcelona, benchmark, BenchmarkXPRT, infrastructure, Mobile World Congress, MWC, MWC23, video |

Mobile World Congress 2023: Infrastructure led the way

By Mark Van Name

on March 3, 2023

When the tech industry is at its best, a virtuous cycle of capabilities and use cases chases its own tail to produce ever-better tech for us all. Faster CPUs drive new usage models, which in turn emerge and swamp the CPUs, which then must get faster. Better screens make us want higher-quality video, which requires more bandwidth to deliver and causes us to desire even better displays. Apps connect us in more ways, but those connections require more bandwidth, which leads to new apps that can take advantage of those faster connections. And on and on.

Put a finger on the cycle at any given moment, and you’ll see that while all the elements are in motion, some are the stars of the moment. To keep the cycle going, it’s crucial for these areas to improve the most. At Mobile World Congress 2023 (#MWC23), that distinction belonged to infrastructure. Yes, some new mobile phones were on display, Lenovo showed off new ThinkPads, and other mobile devices were in abundance, but as I walked the eight huge halls, I couldn’t help but notice the heavy emphasis on infrastructure.

5G, for example, is real now—but it’s far from everywhere. Telecom providers have to figure out how to profitably build out the networks necessary to support it. The whole industry must solve the problems of delivering 5G at huge scale, handle the traffic increases it will bring, switch and route the data, and ultimately make sure the end devices can take full advantage of that data. Management and security remain vital whenever data is flying around, so those softer pieces of infrastructure also matter greatly.

Inevitably and always, to know if we as an industry are meeting these challenges, we must measure performance—both in the raw speed sense and in the broader sense of the word. Are we seeing the full bandwidth we expect? Are devices handling the data properly and at speed? Where’s the bottleneck now? Are we delivering on the schedules we promised? Questions such as these are key concerns in every tech cycle—and some of them are exactly what the XPRTs focus on.

As we improve our infrastructure, we hope to see the benefits at a personal level. When you’re using a device—whether it’s a smart watch, a mobile phone, or a laptop—you need it to do its job well, respond to you quickly, and show you what you want when you want it. When your device makes you wait, it can be helpful to know if the bandwidth feeding data to the device is the bottleneck or if the device simply can’t keep up with the flow of data it’s receiving. The XPRTs can help you figure out what’s going on, and they will continue to be useful and important as the tech cycle spins on. If history is our guide, the infrastructure focus of MWC23 will lead to greater capabilities that require even better devices down the line. We look forward to testing them.

Posted in 5G, benchmark, Benchmark metrics, BenchmarkXPRT, Future of performance evaluation, Laptops, Mobile devices, Mobile World Congress, Performance benchmarking, Phones, Trade Shows | Tagged 5G, Barcelona, benchmark, BenchmarkXPRT, CPU, infrastructure, laptop, Lenovo, mobile, mobile devices, Mobile World Congress, MWC, MWC23, ThinkPad |

CES 2023: Adapting to changing realities

By Justin Greene

on January 6, 2023

The last time the XPRTs attended the Consumer Electronics Show in Las Vegas was in January 2020, shortly before shutdowns due to the global pandemic began. More than 171,000 people attended that year’s show, the 2021 show was totally virtual, and CES shortened the 2022 show after many exhibitors and media pulled out during the Omicron surge. While some aspects of the event are returning to normal this year, about one-third of the typically jam-packed Las Vegas Convention Center space is empty, and only about 100,000 people are likely to attend. Nevertheless, the show is still enormous and full of fascinating new technology.

Just one day into the show, I’ve already noticed some interesting changes in the virtual reality (VR) and augmented reality (AR) areas since I last attended in 2020. One change is a significant expansion in the sensory capabilities of VR equipment. For a long time, VR technologies have focused almost solely on visual and audio input technology and the graphics-rendering capabilities necessary for lag-free, immersive experiences. In 2020, I saw companies working on various types of haptic feedback gear, including full-body suits, that pushed the boundaries of VR beyond sight and sound. Now, several companies are demonstrating significant progress in “real-feel touch” technologies for VR. One such company is HaptX, which is developing a set of gloves (see the picture below) that pump air through “microfluidic actuators” so that users can feel the size and shape of virtual objects they interact with in a VR environment. While we often think of VR being used for gaming and entertainment, advances in realistic, multi-sensory capabilities can lead to VR becoming a valuable tool for all kinds of industrial and professional training applications.

Another change I’ve noticed is how AR seems poised to move from demos to everyday life by means of integration with all types of smartphone apps. I enjoyed speaking with a representative from a Korean AR company called Arbeon. Arbeon is developing an app that will allow users to point their phone’s camera at an object (a wine bottle in the picture below), and see an array of customizable, interactive AR animations surrounding the object. You’ll be able to find product info, see and leave feedback similar to “likes” and reviews, attach emojis, tag friends, and even purchase the product, all from your phone’s AR-enhanced camera and screen. It’s an interesting concept with limitless applications. While VR is here to stay and getting better all the time, I personally think that AR will become much more integrated into everyday life in the coming years. I also think AR apps for phones will allow the technology to take off more quickly in the near term than clunkier options like AR eyeglasses.

The large screen displays how Arbeon’s AR phone app interacts with objects like a wine bottle.

Of course, thinking about AR has led me to wonder if we’ll be able to incorporate AR-related workloads into future XPRTs. As new technologies place new and unprecedented levels of processing demand on our computing hardware, the need for objective performance evaluation will continue. Providing reliable, objective performance data is why the XPRTs exist, and planning for the future of the XPRTs is why we’re at CES 2023. If you have any thoughts about how the XPRTs can help to evaluate new technologies, we’d love to hear from you!

Justin

Posted in AR, Augmented reality, Benchmarking, CES, Consumer Electronics Show, Future of performance evaluation, graphics, Las Vegas, Performance benchmarking, Trade Shows, Virtual reality, VR Demo | Tagged AR, Arbeon, augmented reality, benchmark, BenchmarkXPRT, CES, Consumer Electronics Show, HaptX, Las Vegas, virtual reality, VR |
