
Category: What makes a good benchmark?

The XPRTs: What would you like to see?

One of the core principles of the BenchmarkXPRT Development Community is a commitment to valuing the feedback of both community members and the larger group of testers who use the XPRTs on a regular basis. That feedback helps us to ensure that as the XPRTs continue to grow and evolve, the resources that we offer will continue to meet the needs of those who use them.

In the past, user feedback has influenced specific aspects of our benchmarks such as the length of test runs, user interface features, results presentation, and the removal or inclusion of specific workloads. More broadly, we have also received suggestions for entirely new XPRTs and ways we might target emerging technologies or industry use cases.

As we approach the second half of 2022 and begin planning for 2023, we’re asking to hear your ideas about new XPRTs—or new features for existing XPRTs. Are you aware of hardware form factors, software platforms, or prominent applications that are difficult or impossible to evaluate using existing performance benchmarks? Are there new technologies we should be incorporating into existing XPRTs via new workloads? Can you recommend ways to improve any of the XPRTs or XPRT-related tools such as results viewers?

We are interested in your answers to these questions and any other ideas you have, so please feel free to contact us. We look forward to hearing your thoughts!

Justin

Chrome OS support for CrXPRT apps ends in June 2022

Last March, we discussed the Chrome OS team’s original announcement that they would be phasing out support for Chrome Apps altogether in June 2021, and would shift their focus to Chrome extensions and Progressive Web Apps. The Chrome OS team eventually extended support for existing Chrome Apps through June 2022, but as of this week, we see no indication that they will further extend support for Chrome Apps published with general developer accounts. If the end-of-life schedule for Chrome Apps does not change in the next few months, both CrXPRT 2 and CrXPRT 2015 will stop working on new versions of Chrome OS at some point in June.

To maintain CrXPRT functionality past June, we would need to rebuild the app completely—either as a Progressive Web App or in some other form. For this reason, we want to reassess our approach to Chrome OS testing, and investigate which features and technologies to include in a new Chrome OS benchmark. Our current goal is to gather feedback and conduct exploratory research over the next few months, and begin developing an all-new Chrome OS benchmark for publication by the end of the year.

While we will discuss ideas for this new Chrome OS benchmark in future blog posts, we welcome ideas from CrXPRT users now. What features or workloads would you like the new benchmark to retain? Would you like us to remove any components from the existing benchmark? Does the battery life test in its current form suit your needs? If you have any thoughts about these questions or any other aspects of Chrome OS benchmarking, please let us know!

Justin

Why we don’t control screen brightness during CrXPRT 2 battery life tests

Recently, we had a discussion with a community member about why we no longer recommend specific screen brightness settings during CrXPRT 2 battery life tests. In the CrXPRT 2015 user manual, we recommended setting the test system’s screen brightness to 200 nits. Because the amount of power that a system directs to its screen can have a significant impact on battery life, we believed that pegging screen brightness to a common standard for all test systems would yield apples-to-apples comparisons.

After extensive experience with CrXPRT 2015 testing, we decided not to recommend a standard screen brightness with CrXPRT 2, for the following reasons:

  • A significant number of Chromebooks cannot produce a screen brightness of 200 nits. A few higher-end models can do so, but they are not representative of most Chromebooks. Some Chromebooks, especially those that many school districts and corporations purchase in bulk, cannot produce a brightness of even 100 nits.
  • Because of the point above, adjusting screen brightness would not represent real-life conditions for most Chromebooks, and the battery life results could mislead consumers who want to know the battery life they can expect with default out-of-box settings.
  • Most testers, and even some labs, do not have light meters, and the simple brightness percentages that the operating system reports produce different degrees of brightness on different systems. For testers without light meters, a standardized screen brightness recommendation could discourage them from running the test.
  • The brightness controls for some low-end Chromebooks lack the fine-tuning capability that is necessary to standardize brightness between systems. In those cases, an increase or decrease of one notch can swing brightness by 20 to 30 nits in either direction. This could also discourage testing by leading people to believe that they lack the capability to correctly run the test.

In situations where testers want to compare battery life using standardized screen brightness, we recommend using light meters to set the brightness levels as closely as possible. If the brightness levels between systems vary by more than a few nits, and if the levels vary significantly from out-of-box settings, the publication of any resulting battery life results should include a full disclosure and explanation of test conditions.
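As a rough illustration of the matching process described above, the sketch below picks, for each system, the brightness step whose light-meter reading is closest to a shared target, and then checks whether the achieved levels differ by more than a chosen tolerance. Everything in it is hypothetical: the function names, the sample readings, the 100-nit target, and the 3-nit tolerance standing in for "a few nits" are assumptions for illustration only and are not part of CrXPRT. It also checks only the spread between systems, not how far the levels are from out-of-box settings.

```python
from typing import Dict, Iterable, Tuple


def closest_brightness_step(
    measured_nits: Dict[int, float], target_nits: float
) -> Tuple[int, float]:
    """Return (step, nits) for the step whose measured brightness is nearest the target.

    measured_nits maps a brightness step (notch) to a light-meter reading in nits.
    """
    step = min(measured_nits, key=lambda s: abs(measured_nits[s] - target_nits))
    return step, measured_nits[step]


def needs_disclosure(achieved_nits: Iterable[float], tolerance_nits: float = 3.0) -> bool:
    """Return True if the achieved levels differ by more than the tolerance.

    The 3-nit default is an assumed stand-in for "a few nits".
    """
    levels = list(achieved_nits)
    return max(levels) - min(levels) > tolerance_nits


# Hypothetical light-meter readings for two systems (step -> nits).
fine_grained_system = {8: 92.0, 9: 98.5, 10: 104.0}
coarse_system = {3: 70.0, 4: 95.0, 5: 120.0}  # each notch moves brightness by 25 nits

target = 100.0  # assumed shared target, in nits
step_a, nits_a = closest_brightness_step(fine_grained_system, target)
step_b, nits_b = closest_brightness_step(coarse_system, target)

print(f"System A: step {step_a} at {nits_a} nits")
print(f"System B: step {step_b} at {nits_b} nits")
print("Disclose test conditions:", needs_disclosure([nits_a, nits_b]))
```

With these sample readings, the coarse system cannot get closer than 5 nits to the target, which shows how large per-notch swings can make exact standardization impractical on some Chromebooks.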

For the majority of testers without light meters, running the CrXPRT 2 battery life test with default screen brightness settings on each system provides a reliable and accurate estimate of the type of real-world, out-of-box battery life consumers can expect.

If you have any questions or comments about the CrXPRT 2 battery life test, please feel free to contact us!

Justin

Thinking about experimental WebXPRT workloads in 2022

As the WebXPRT 4 development process has progressed, we’ve started to discuss the possibility of offering experimental WebXPRT 4 workloads in 2022. These would be optional workloads that test cutting-edge browser technologies or new use cases. The individual scores for the experimental workloads would stand alone, and would not factor into the WebXPRT 4 overall score.

WebXPRT testers would be able to run the experimental workloads in one of two ways: by manually selecting them on the benchmark’s home screen, or by adjusting a value in the WebXPRT 4 automation scripts.

Testers would benefit from experimental workloads by being able to compare how well certain browsers or systems handle new tasks (e.g., new web apps or AI capabilities). We would benefit from fielding workloads for large-scale testing and user feedback before we commit to including them as core WebXPRT workloads.

Do you have any general thoughts about experimental workloads for browser performance testing, or any specific workloads that you’d like us to consider? Please let us know.

Justin

Round 2 of the WebXPRT 4 survey is now open

In May, we surveyed longtime WebXPRT users regarding the types of changes they would like to see in WebXPRT 4. We sent the survey to journalists at several tech press outlets, and invited our blog readers to participate as well. We received some very helpful feedback. As we explore new possibilities for WebXPRT 4, we’ve decided to open an updated version of the survey. We’ve adjusted the questions a bit based on previous feedback and added some new ones, so we invite you to respond even if you participated in the original survey.

To do so, please send your answers to the following questions to benchmarkxprtsupport@principledtechnologies.com before July 31.

  • Do you think WebXPRT 3’s selection of workload scenarios is representative of modern web tasks?
  • How do you think WebXPRT compares to other common browser-based benchmarks, such as JetStream, Speedometer, and Octane?
  • Would you like to see a workload based on WebAssembly (WASM) in WebXPRT 4? Why or why not?
  • Would you like to see a workload based on Single Page Application (SPA) technology in WebXPRT 4? Why or why not?
  • Would you like to see a workload based on Motion UI in WebXPRT 4? Why or why not?
  • Would you like to see us include any other web technologies in additional workloads?
  • Are you happy with the WebXPRT 3 user interface? If not, what UI changes would you like to see?
  • Have you ever experienced significant connection issues when testing with WebXPRT?
  • Given its array of workloads, do you think the WebXPRT runtime is reasonable? Would you mind if the average runtime increased slightly?
  • Would you like to see us change any other aspects of WebXPRT 3?


If you would like to share your thoughts on any topics that the questions above do not cover, please include those in your response. We look forward to hearing from you!

Justin

Feedback from the WebXPRT 4 tech press survey

In early May, we sent a survey to members of the tech press who regularly use WebXPRT in articles and reviews. We asked for their thoughts on several aspects of WebXPRT, as well as what they’d like to see in the upcoming fourth version of the benchmark. We also published the survey questions here in the blog, and invited experienced WebXPRT testers to send their feedback as well. We received some good responses to the survey, and for the benefit of our readers, we’ve summarized some of the key comments and suggestions below.

  • One respondent stated that WebXPRT is demanding enough to test performance, but if we want to simulate modern web usage, we should find the most up-to-date studies on common browser tasks and web technologies. This suggestion lines up with our intention to study the feasibility of adding a WebAssembly workload.
  • One respondent liked the fact that, unlike many other browser benchmarks, WebXPRT tests more than just JavaScript calculation speed.
  • One respondent suggested that we include a link to a WebXPRT white paper within the UI, or at least a guide describing what happens during each workload.
  • One respondent stated that they would like for WebXPRT to automatically produce a good result file on the local test system.
  • One respondent said that WebXPRT has a relatively long runtime for a browser benchmark, and they would prefer that the runtime not increase in WebXPRT 4.
  • We had no direct calls for a battery life test, because many testers already have scripts and/or methodologies in place for battery testing, but one tester suggested adding the ability to loop the test so users can measure performance over varying lengths of time.
  • There were no requests to bring back any aspects of WebXPRT 2015 that we removed in WebXPRT 3.
  • There were no reports of significant connection issues when testing with WebXPRT.

We greatly appreciate the members of the tech press who responded to the survey. We’re still in the planning stages of WebXPRT 4, so there’s still time for anyone to send comments or ideas to benchmarkxprtsupport@principledtechnologies.com. We look forward to hearing from you!

Justin
