BenchmarkXPRT Blog banner

Tag Archives: Performance

Best practices in benchmarking

From time to time, a tester writes to ask for help determining why they see different WebXPRT scores on two systems that have the same hardware configuration. The scores sometimes differ by a significant percentage. This can happen for many reasons, including different software stacks, but score variability can also result from different testing behavior and environments. While a small amount of variability is normal, these types of questions provide an opportunity to talk about the basic benchmarking practices we follow in the XPRT lab to produce the most consistent and reliable scores.

Below, we list a few basic best practices you might find useful in your testing. Most of them relate to evaluating browser performance with WebXPRT, but several of these practices apply to other benchmarks as well.

  • Test with clean images: We typically use an out-of-box (OOB) method for testing new devices in the XPRT lab. OOB testing means that other than running the initial OS and browser version updates that users are likely to run after first turning on the device, we change as little as possible before testing. We want to assess the performance that buyers are likely to see when they first purchase the device, before installing additional apps and utilities. This is the best way to provide an accurate assessment of the performance retail buyers will experience. While OOB is not appropriate for certain types of testing, the key is to not test a device that’s bogged down with programs that will influence results.
  • Turn off automatic updates: We do our best to eliminate or minimize app and system updates after initial setup. Some vendors are making it more difficult to turn off updates completely, but you should always double-check update settings before testing.
  • Get a baseline for system processes: Depending on the system and the OS, a significant amount of system-level activity can be going on in the background after you turn it on. As much as possible, we like to wait for a stable (idle) baseline of system activity before kicking off a test. If we start testing immediately after booting the system, we often see higher variance in the first run before the scores start to tighten up.
  • Hardware is not the only important factor: Most people know that different browsers produce different performance scores on the same system. However, testers aren’t always aware of shifts in performance between different versions of the same browser. While most updates don’t have a large impact on performance, a few updates have increased (or even decreased) browser performance by a significant amount. For this reason, it’s always worthwhile to record and disclose the extended browser version number for each test run. The same principle applies to any other relevant software.
  • Use more than one data point: Because of natural variance, our standard practice in the XPRT lab is to publish a score that represents the median from three to five runs, if not more. If you run a benchmark only once, and the score differs significantly from other published scores, your result could be an outlier that you would not see again under stable testing conditions.

We hope these tips will help make your testing more accurate. If you have any questions about the XPRTs, or about benchmarking in general, feel free to ask!

Justin

WebXPRT passes the million-run milestone!

We’re excited to see that users have successfully completed over 1,000,000 WebXPRT runs! If you’ve run WebXPRT in any of the 924 cities and 81 countries from which we’ve received complete test data—including newcomers Bahrain, Bangladesh, Mauritius, The Philippines, and South Korea —we’re grateful for your help. We could not have reached this milestone without you!

As the chart below illustrates, WebXPRT use has grown steadily since the debut of WebXPRT 2013. On average, we now record more WebXPRT runs in one month than we recorded in the entirety of our first year. With over 104,000 runs so far in 2022, that growth is continuing.

For us, this moment represents more than a numerical milestone. Developing and maintaining a benchmark is never easy, and a cross-platform benchmark that will run on a wide variety of devices poses an additional set of challenges. For such a benchmark to succeed, developers need not only technical competency, but the trust and support of the benchmarking community. WebXPRT is now in its ninth year, and its consistent year-over-year growth tells us that the benchmark continues to hold value for manufacturers, OEM labs, the tech press, and end users like you. We see it as a sign of trust that folks repeatedly return to the benchmark for reliable performance metrics. We’re grateful for that trust, and for everyone that’s contributed to the WebXPRT development process throughout the years.

We’ll have more to share related to this exciting milestone in the weeks to come, so stay tuned to the blog. If you have any questions or comments about WebXPRT, we’d love to hear from you!

Justin

The XPRTs: What would you like to see?

One of the core principles of the BenchmarkXPRT Development Community is a commitment to valuing the feedback of both community members and the larger group of testers that use the XPRTs on a regular basis. That feedback helps us to ensure that as the XPRTs continue to grow and evolve, the resources that we offer will continue to meet the needs of those that use them.

In the past, user feedback has influenced specific aspects of our benchmarks such as the length of test runs, user interface features, results presentation, and the removal or inclusion of specific workloads. More broadly, we have also received suggestions for entirely new XPRTs and ways we might target emerging technologies or industry use cases.

As we approach the second half of 2022 and begin planning for 2023, we’re asking to hear your ideas about new XPRTs—or new features for existing XPRTs. Are you aware of hardware form factors, software platforms, or prominent applications that are difficult or impossible to evaluate using existing performance benchmarks? Are there new technologies we should be incorporating into existing XPRTs via new workloads? Can you recommend ways to improve any of the XPRTs or XPRT-related tools such as results viewers?

We are interested in your answers to these questions and any other ideas you have, so please feel free to contact us. We look forward to hearing your thoughts!

Justin

The WebXPRT 4 Preview is almost here

Last week, we provided readers with an overview of what to expect in the WebXPRT 4 Preview, as well as an update on the Preview’s release schedule. Since then, we’ve been working on UI adjustments and bug fixes, additional technical tweaks, and follow-up testing. We’re very close, but won’t be able to meet our original goal of publishing the Preview today. We believe it will be ready for release early next week.

As a reminder, once we release the WebXPRT 4 Preview, testers will be able to publish scores from Preview build testing. We will limit any changes that we make between the Preview and the final release to the UI or to features we do not expect to affect test scores.

If you have any questions about WebXPRT 4 or the Preview build, please let us know!

Justin

Feedback from the WebXPRT 4 tech press survey

In early May, we sent a survey to members of the tech press who regularly use WebXPRT in articles and reviews. We asked for their thoughts on several aspects of WebXPRT, as well as what they’d like to see in the upcoming fourth version of the benchmark. We also published the survey questions here in the blog, and invited experienced WebXPRT testers to send their feedback as well. We received some good responses to the survey, and for the benefit of our readers, we’ve summarized some of the key comments and suggestions below.

  • One respondent stated that WebXPRT is demanding enough to test performance, but if we want to simulate modern web usage, we should find the most up-to-date studies on common browser tasks and web technologies. This suggestion lines up with our intention to study the feasibility of adding a WebAssembly workload
  • One respondent liked that fact that unlike many other browser benchmarks, WebXPRT tests more than just JavaScript calculation speed.
  • One respondent suggested that we include a link to a WebXPRT white paper within the UI, or at least a guide describing what happens during each workload.
  • One respondent stated that they would like for WebXPRT to automatically produce a good result file on the local test system.
  • One respondent said that WebXPRT has a relatively long runtime for a browser benchmark, and they would prefer that the runtime not increase in WebXPRT 4.
  • We had no direct calls for a battery life test, because many testers already have scripts and/or methodologies in place for battery testing, but one tester suggested adding the ability to loop the test so users can measure performance over varying lengths of time.
  • There were no requests to bring back any aspects of WebXPRT 2015 that we removed in WebXPRT 3.
  • There were no reports of significant connection issues when testing with WebXPRT.

We greatly appreciate the members of the tech press that responded to the survey. We’re still in the planning stages of WebXPRT 4, so there’s still time for anyone to send comments or ideas to benchmarkxprtsupport@principledtechnologies.com. We look forward to hearing from you!

Justin

The WebXPRT 4 tech press feedback survey

Device reviews in publications such as AnandTech, Notebookcheck, and PCMag, among many others, often feature WebXPRT test results, and we appreciate the many members of the tech press that use WebXPRT. As we move forward with the WebXPRT 4 development process, we’re especially interested in learning what longtime users would like to see in a new version of the benchmark.  

In previous posts, we’ve asked people to weigh in on the potential addition of a WebAssembly workload or a battery life test. We’d also like to ask experienced testers some other test-related questions. To that end, this week we’ll be sending a WebXPRT 4 survey directly to members of the tech press who frequently publish WebXPRT test results.

Regardless of whether you are a member of the tech press, we invite you to participate by sending your answers to any or all the questions below to benchmarkxprtsupport@principledtechnologies.com. We ask you to do so by the end of May.

  • Do you think WebXPRT 3’s selection of workload scenarios is representative of modern web tasks?
  • How do you think WebXPRT compares to other common browser-based benchmarks, such as JetStream, Speedometer, and Octane?
  • Are there web technologies that you’d like us to include in additional workloads?
  • Are you happy with the WebXPRT 3 user interface? If not, what UI changes would you like to see?
  • Are there any aspects of WebXPRT 2015 that we changed in WebXPRT 3 that you’d like to see us change back?
  • Have you ever experienced significant connection issues when testing with WebXPRT?
  • Given the array of workloads, do you think the WebXPRT runtime is reasonable? Would you mind if the average runtime were a bit longer?
  • Are there any other aspects of WebXPRT 3 that you’d like to see us change?

If you’d like to discuss any topics that we did not cover in the questions above, please feel free to include additional comments in your response. We look forward to hearing your thoughts!

Justin

Check out the other XPRTs:

Forgot your password?