PT-Logo
Forgot your password?
BenchmarkXPRT Blog banner

Tag Archives: benchmarks

Transparent goals

Recently, Forbes published an article discussing a new report on phone battery life from Which?, a UK consumer advocacy group. In the report, Which? states that they tested the talk time battery life of 50 phones from five brands. During the tests, phones from three of the brands lasted longer than the manufacturers’ claims, while phones from another brand underperformed by about five percent. The fifth brand’s published battery life numbers were 18 to 51 percent higher than Which? recorded in their tests.

Folks can read the article for more details about the tests and the brands. While the report raises some interesting questions, and the article provides readers with brief test methodology descriptions from Which? and one manufacturer, we don’t know enough about the tests to say which set of claims is correct. Any number of variables related to test workloads or device configuration settings could significantly affect the results. Both parties may be using sound benchmarking principles in good faith, but their test methodologies may not be comparable. As it is, we simply don’t have enough information to evaluate the study.

Whether the issue is battery life or any other important device spec, information conflicts, such as the one that the Forbes article highlights, can leave consumers scratching their heads, trying to decide which sources are worth listening to. At the XPRTs, we believe that the best remedy for this type of problem is to provide complete transparency into our testing methodologies and development process. That’s why our lab techs verify all the hardware specs for each XPRT Weekly Tech Spotlight entry. It’s why we publish white papers explaining the structure of our benchmarks in detail, as well as how the XPRTs calculate performance results. It’s also why we employ an open development community model and make each XPRT’s source code available to community members. When we’re open about how we do things, it encourages the kind of honest dialogue between vendors, journalists, consumers, and community members that serves everyone’s best interests.

If you love tech and share that same commitment to transparency, we’d love for you to join our community, where you can access XPRT source code and previews of upcoming benchmarks. Membership is free for anyone with a verifiable corporate affiliation. If you have any questions about membership or the registration process, please feel free to ask.

Justin

XPRT collaborations: North Carolina State University

For those of us who work on the BenchmarkXPRT tools, a core goal is involving new contributors and interested parties in the benchmark development process. Adding voices to the discussion fosters the collaboration and innovation that lead to powerful benchmark tools with lasting relevance.

One vehicle for outreach that we especially enjoy is sponsoring a student project through North Carolina State University. Each semester, the Senior Design Center in the university’s Department of Computer Science partners with external companies and organizations to provide student teams with an opportunity to work on real-world programming projects. If you’ve followed the XPRTs for a while, you may remember previous student projects such as Nebula Wolf, a mini-game that shows how well different devices handle games, and VR Demo, a virtual reality prototype workload based on a room escape scenario.

This fall, a team of NC State students is developing a software console for automating machine learning tests. Ideally, the tool will let future testers specify custom workload combinations, compute a performance metric, and upload results to our database. The project will also assess the impact of the framework on performance scores. In fact, the console will perform many of the same functions we plan to implement with AIXPRT.

The students have worked very hard on the project, and have learned quite a bit about benchmarking practices and several new software tools. The project will wrap up in the next couple of weeks, and we’ll share additional details as soon as possible. Early next year, we’ll publish a video about the experience.

If you’d like to join the NC State students and hundreds of other XPRT community members in the future of benchmark development, please let us know!

Justin

Notes from the lab: choosing a calibration system for MobileXPRT 3

Last week, we shared some details about what to expect in MobileXPRT 3. This week, we want to provide some insight into one part of the MobileXPRT development process, choosing a calibration system.

First, some background. For each of the benchmarks in the XPRT family, we select a calibration system using criteria we’ll explain below. This system serves as a reference point, and we use it to calculate scores that will help users understand a single benchmark result. The calibration system for MobileXPRT 2015 is the Motorola DROID RAZR M. We structured our calculation process so that the mean performance score from repeated MobileXPRT 2015 runs on that device is 100. A device that completes the same workloads 20 percent faster than the DROID RAZR M would have a performance score of 120, and one that performs the test 20 percent more slowly would have a score of 80. (You can find a more in-depth explanation of MobileXPRT score calculations in the Exploring MobileXPRT 2015 white paper.)

When selecting a calibration device, we are looking for a relevant reference point in today’s market. The device should be neither too slow to handle modern workloads nor so fast that it outscores most devices on the market. It should represent a level of performance that is close to what the majority of consumers experience, and one that will continue to be relevant for some time. This approach helps to build context for the meaning of the benchmark’s overall score. Without that context, testers can’t tell whether a score is fast or slow just by looking at the raw number. When compared to a well-known standard such as the calibration device, however, the score has more informative value.

To determine a suitable calibration device for MobileXPRT 3, we started by researching the most popular Android phones by market share around the world. It soon became clear that in many major markets, the Samsung Galaxy S8 ranked first or second, or at least appeared in the top five. As last year’s first Samsung flagship, the S8 is no longer on the cutting edge, but it has specs that many current mid-range phones are deploying, and the hardware should remain relevant for a couple of years.

For all of these reasons, we made the Samsung Galaxy S8 the calibration device for MobileXPRT 3. The model in our lab has a Qualcomm Snapdragon 835 SoC, 4 GB of RAM, and runs Android 7.0 (Nougat). We think it has the balance we’re looking for.

If you have any questions or concerns about MobileXPRT 3, calibration devices, or score calculations, please let us know. We look forward to sharing more information about MobileXPRT 3 as we get closer to the community preview.

Justin

Planning the next version of MobileXPRT

We’re in the early planning stages for the next version of MobileXPRT, and invite you to send us any suggestions you may have. What do you like or not like about MobileXPRT? What features would you like to see in a new version?

When we begin work on a new version of any XPRT, one of the first steps we take is to assess the benchmark’s workloads to determine whether they will provide value during the years ahead. This step almost always involves updating test content such as photos and videos to more contemporary file resolutions and sizes, and it can also involve removing workloads or adding completely new scenarios. MobileXPRT currently includes five performance scenarios (Apply Photo Effects, Create Photo Collages, Create Slideshow, Encrypt Personal Content, and Detect Faces to Organize Photos). Should we stick with these five or investigate other use cases? What do you think?

As we did with WebXPRT 3 and the upcoming HDXPRT 4, we’re also planning to update the MobileXPRT UI to improve the look of the benchmark and make it easier to use.

Crucially, we’ll also build the app using the most current Android Studio SDK. Android development has changed significantly since we released MobileXPRT 2015 and apps must now conform to stricter standards that require explicit user permission for many tasks. Navigating these changes shouldn’t be too difficult, but it’s always possible that we’ll encounter unforeseen challenges at some point during the process.

Do you have suggestions for test scenarios that we should consider for MobileXPRT? Are there existing features we should remove? Are there elements of the UI that you find especially useful or have ideas for improving? Please let us know. We want to hear from you and make sure that MobileXPRT continues to meet your needs.

Justin

The Exploring WebXPRT 3 white paper is now available

Today, we published the Exploring WebXPRT 3 white paper. The paper describes the differences between WebXPRT 3 and WebXPRT 2015, including changes we made to the harness and the structure of the six performance test workloads. We also explain the benchmark’s scoring methodology, how to automate tests, and how to submit results for publication. Readers will also find additional detail about the third-party functions and libraries that WebXPRT uses during the HTML5 capability checks and performance workloads.

Because data collection and privacy concerns are more relevant than ever, we also discuss the WebXPRT data collection mechanisms and our commitment to respecting testers’ privacy. Finally, for readers who may be unfamiliar with the XPRTs, we describe the other benchmark tools in the XPRT family, the role of the BenchmarkXPRT Development Community, and how you can contribute to the XPRTs.

Along with the WebXPRT 3 results calculation white paper and spreadsheet, the Exploring WebXPRT 3 white paper is designed to promote the high level of transparency and disclosure that is a core value of the BenchmarkXPRT Development Community. Both WebXPRT white papers and the results calculation spreadsheet are available on WebXPRT.com and on our XPRT white papers page. If you have any questions about the WebXPRT, please let us know, and be sure to check out our other XPRT white papers.

Justin

WebXPRT passes another milestone!

We’re excited to see that users have successfully completed over 250,000 WebXPRT runs! From the original WebXPRT 2013 to the most recent version, WebXPRT 3, this tool has been popular with manufacturers, developers, consumers, and media outlets around the world because it’s easy to run, it runs quickly and on a wide variety of platforms, and it evaluates device performance using real-world tasks.

If you’ve run WebXPRT in any of the more than 458 cities and 64 countries from which we’ve received complete test data—including newcomers Lithuania, Luxembourg, Sweden, and Uruguay—we’re grateful for your help in reaching this milestone. Here’s to another quarter-million runs!

If you haven’t yet transitioned your browser testing to WebXPRT 3, now is a great time to give it a try! WebXPRT 3 includes updated photo workloads with new images and a deep learning task used for image classification. It also uses an optical character recognition task in the Encrypt Notes and OCR scan workload and combines part of the DNA Sequence Analysis scenario with a writing sample/spell check scenario to simulate online homework in the new Online Homework workload. Users carry out tasks like these on their browsers daily, making these workloads very effective for assessing how well a device will perform in the real world.

Happy testing to everyone, and if you have any questions about WebXPRT 3 or the XPRTs in general, feel free to ask!

Justin

Check out the other XPRTs: