BenchmarkXPRT Blog banner

Category: Benchmarking

Feedback from the WebXPRT 4 tech press survey

In early May, we sent a survey to members of the tech press who regularly use WebXPRT in articles and reviews. We asked for their thoughts on several aspects of WebXPRT, as well as what they’d like to see in the upcoming fourth version of the benchmark. We also published the survey questions here in the blog, and invited experienced WebXPRT testers to send their feedback as well. We received some good responses to the survey, and for the benefit of our readers, we’ve summarized some of the key comments and suggestions below.

  • One respondent stated that WebXPRT is demanding enough to test performance, but if we want to simulate modern web usage, we should find the most up-to-date studies on common browser tasks and web technologies. This suggestion lines up with our intention to study the feasibility of adding a WebAssembly workload
  • One respondent liked that fact that unlike many other browser benchmarks, WebXPRT tests more than just JavaScript calculation speed.
  • One respondent suggested that we include a link to a WebXPRT white paper within the UI, or at least a guide describing what happens during each workload.
  • One respondent stated that they would like for WebXPRT to automatically produce a good result file on the local test system.
  • One respondent said that WebXPRT has a relatively long runtime for a browser benchmark, and they would prefer that the runtime not increase in WebXPRT 4.
  • We had no direct calls for a battery life test, because many testers already have scripts and/or methodologies in place for battery testing, but one tester suggested adding the ability to loop the test so users can measure performance over varying lengths of time.
  • There were no requests to bring back any aspects of WebXPRT 2015 that we removed in WebXPRT 3.
  • There were no reports of significant connection issues when testing with WebXPRT.

We greatly appreciate the members of the tech press that responded to the survey. We’re still in the planning stages of WebXPRT 4, so there’s still time for anyone to send comments or ideas to benchmarkxprtsupport@principledtechnologies.com. We look forward to hearing from you!

Justin

Publishing CloudXPRT results from testing on pre-production gear

We recently received questions about whether we accept CloudXPRT results submissions from testing on pre-production gear, and how we would handle any differences between results from pre-production and production-level tests.  

To answer first question, we are not opposed to pre-production results submissions. We realize that vendors often want to include benchmark results in launch-oriented marketing materials they release before their hardware or software is publicly available. To help them do so, we’re happy to consider pre-production submissions on a case-by-case basis. All such submissions must follow the normal CloudXPRT results submission process, and undergo vetting by the CloudXPRT Results Review Group according to the standard review and publication schedule. If we decide to publish pre-production results on our site, we will clearly note their pre-production status.

In response to the second question, the CloudXPRT Results Review Group will handle any challenges to published results or perceived discrepancies between pre-production and production-level results on a case-by-case basis. We do not currently have a formal process for challenges; anyone who would like to initiate a challenge or express comments or concerns about a result should address the review group via benchmarkxprtsupport@principledtechnologies.com. Our primary concern is always to ensure that published results accurately reflect the performance characteristics of production-level hardware and software. If it becomes necessary to develop more policies in the future, we’ll do so, but we want to keep things as simple as possible.

If you have any questions about the CloudXPRT results submission process, please let us know!

Justin

The WebXPRT 4 tech press feedback survey

Device reviews in publications such as AnandTech, Notebookcheck, and PCMag, among many others, often feature WebXPRT test results, and we appreciate the many members of the tech press that use WebXPRT. As we move forward with the WebXPRT 4 development process, we’re especially interested in learning what longtime users would like to see in a new version of the benchmark.  

In previous posts, we’ve asked people to weigh in on the potential addition of a WebAssembly workload or a battery life test. We’d also like to ask experienced testers some other test-related questions. To that end, this week we’ll be sending a WebXPRT 4 survey directly to members of the tech press who frequently publish WebXPRT test results.

Regardless of whether you are a member of the tech press, we invite you to participate by sending your answers to any or all the questions below to benchmarkxprtsupport@principledtechnologies.com. We ask you to do so by the end of May.

  • Do you think WebXPRT 3’s selection of workload scenarios is representative of modern web tasks?
  • How do you think WebXPRT compares to other common browser-based benchmarks, such as JetStream, Speedometer, and Octane?
  • Are there web technologies that you’d like us to include in additional workloads?
  • Are you happy with the WebXPRT 3 user interface? If not, what UI changes would you like to see?
  • Are there any aspects of WebXPRT 2015 that we changed in WebXPRT 3 that you’d like to see us change back?
  • Have you ever experienced significant connection issues when testing with WebXPRT?
  • Given the array of workloads, do you think the WebXPRT runtime is reasonable? Would you mind if the average runtime were a bit longer?
  • Are there any other aspects of WebXPRT 3 that you’d like to see us change?

If you’d like to discuss any topics that we did not cover in the questions above, please feel free to include additional comments in your response. We look forward to hearing your thoughts!

Justin

WebXPRT passes the 750,000-run milestone!

We’re excited to see that users have successfully completed over 750,000 WebXPRT runs! If you’ve run WebXPRT in any of the more than 654 cities and 68 countries from which we’ve received complete test data—including newcomers Belize, Cambodia, Croatia, and Pakistan—we’re grateful for your help. We could not have reached this milestone without you!

As the chart below illustrates, WebXPRT use has grown steadily over the years. We now record, on average, almost twice as many WebXPRT runs in one month as we recorded in the entirety of our first year. In addition, with over 82,000 runs to date in 2021, there are no signs that growth is slowing.

Developing a new benchmark is never easy, and the obstacles multiply when you attempt to create a cross-platform benchmark, such as WebXPRT, that will run on a wide variety of devices. Establishing trust with the benchmarking community is another challenge. Transparency, consistency, and technical competency on our part are critical factors in building that trust, but the people who take time out of their busy schedules to run the benchmark for the first time also play a role. We thank all of the manufacturers, OEM labs, and members of the tech press who decided to give WebXPRT a try, and we look forward to your input as we continue to improve WebXPRT in the years to come. 

If you have any questions or comments about WebXPRT, we’d love to hear from you!

Justin

Default requirements for CloudXPRT results submissions

Over the past few weeks, we’ve received questions about whether we require specific test configuration settings for official CloudXPRT results submissions. Currently, testers have the option to edit up to 12 configuration options for the web microservices workload and three configuration options for the data analytics workload. Not all configuration options have an impact on testing and results, but a few of them can drastically affect key results metrics and how long it takes to complete a test. Because new CloudXPRT testers may not anticipate those outcomes, and so many configuration permutations are possible, we’ve come up with a set of requirements for all future results submissions to our site. Please note that testers are still free to adjust all available configuration options—and define service level agreement (SLA) settings—as they see fit for their own purposes. The requirements below apply only to results testers want to submit for publication consideration on our site, and to any resulting comparisons.


Web microservices results submission requirement

Starting with the May results submission cycle, all web microservices results submissions must have the workload.cpurequestsvalue, which lets the user designate the number of CPU cores the workload assigns to each pod, set to 4. Currently, the benchmark supports values of 1, 2, and 4, with the default value of 4. While 1 and 2 CPU cores per pod may be more appropriate for relatively low-end systems or configurations with few vCPUs, a value of 4 is appropriate for most datacenter processors, and it often enables CSP instances to operate within the benchmark’s max default 95th percentile latency SLA of 3,000 milliseconds.

In future CloudXPRT releases, we may remove the option to change the workload.cpurequests value from the config.json file and simply fix the value in the benchmark’s code to promote test predictability and reasonable comparisons. For more information about configuration options for the web microservices workload, please consult the Overview of the CloudXPRT Web Microservices Workload white paper.


Data analytics results submission requirement

Starting with the May results submission cycle, all data analytics results submissions must have the best reported performance (throughput_jobs/min) correspond to a 95th percentile SLA latency of 90 seconds or less. We have received submissions where the throughput was extremely high, but the 95th percentile SLA latency was up to 10 times the 90 seconds that we recommend in CloudXPRT documentation. High latency values may be acceptable for the unique purposes of individual testers, but they do not provide a good basis for comparison between clusters under test. For more information about configuration options with the data analytics workload, please consult the Overview of the CloudXPRT Data Analytics Workload white paper.

We will update CloudXPRT documentation to make sure that testers know to use the default configuration settings if they plan to submit results for publication. If you have any questions about CloudXPRT or the CloudXPRT results submission process, please let us know.

Justin

Considering a battery life test for WebXPRT 4

A few weeks ago, we discussed the beginnings of a WebXPRT 4 development plan, and asked for reader feedback about potential workload changes. So far, the two most common feedback topics have been the possible addition of a WebAssembly workload, and the feasibility of including a browser-based battery life test. Today, we discuss what a WebXPRT 4 battery life test would look like, and some of the challenges we’d have to overcome to make it a reality.

Battery life tests fall into two primary categories: simple rundown tests and performance-weighted tests. Simple rundown tests measure battery life during extreme idle periods and loops of movie playbacks, etc., but do not reflect the wide-ranging mix of activities that characterize a typical day for most users. While they can be useful for performing very specific apple-to-apples comparisons, these tests have limited value when it comes to giving consumers a realistic estimation of the battery life they would experience during everyday use.

In contrast, performance-weighted battery life tests, such as the one in CrXPRT 2, attempt to reflect real-world usage. The CrXPRT battery life test simulates common daily usage patterns for Chromebooks by including all the productivity workloads from the performance test, plus video playback, audio playback, and gaming scenarios. It also includes periods of wait/idle time. We believe this mixture of diverse activity and idle time better represents typical real-life behavior patterns. This makes the resulting estimated battery life much more helpful for consumers who are trying to match a device’s capabilities with their real-world needs.

From a technical standpoint, WebXPRT’s cross-platform nature presents us with several challenges that we did not face while developing the CrXPRT battery life test for Chrome OS. While the WebXPRT performance tests run in almost any browser, cross-browser differences in battery life reporting may restrict the battery life test to a single browser. For instance, Mozilla has deprecated the battery status API for Firefox, and we’re not yet sure if there’s another approach that would work. If a WebXPRT 4 battery life test supported only a single browser, such as Chrome or Safari, would you still be interested in using it? Please let us know.

A browser-based battery life workflow also presents other challenges that we do not face in native client applications such as CrXPRT:

  • A browser-based battery life test would require the user to check the starting and ending battery capacities, with no way for the app to independently verify data accuracy.
  • The battery life test could require more babysitting in the event of network issues. We can catch network failures and try to handle them by reporting periods of network disconnection, but those interruptions could influence the battery life duration.
  • The factors above could make it difficult to achieve repeatability. One way to address that problem would be to run the test in a standardized lab environment lab with a steady internet connection, but a long list of standardized environmental requirements would make the battery life test less attractive and less accessible to many testers.

Our intention with today’s blog is not to make a WebXPRT 4 battery life test seem like an impossibility. Rather, we want to share our perspective on what the test might look like, and describe some of the challenges and considerations in play. If you have thoughts about battery life testing, or experience with battery life APIs in one or more of the major browsers, we’d love to hear from you!

Justin

Check out the other XPRTs:

Forgot your password?