
Category: Benchmarking computing devices

The value of speed

I was reading an interesting article on how high-end smartphones like the iPhone X, Pixel 2 XL, and Galaxy S8 generate more in-game revenue than cheaper phones do.

One line stood out to me: “With smartphones becoming faster, larger and more capable of delivering an engaging gaming experience, these monetization key performance indicators (KPIs) have begun to increase significantly.”

It turns out the game companies totally agree with the rest of us that faster devices are better!

Regardless of who is seeking better performance (consumers or game companies), the obvious question is how you determine which models are fastest. Many folks rely on device vendors’ claims about how much faster the new model is. Unfortunately, vendors don’t always specify what those claims are based on. Even when they do, it’s hard to know whether the numbers are accurate and applicable to how you use your device.

The key part of any answer is performance tools that are representative, dependable, and open.

  • Representative – Performance tools need to have realistic workloads that do things that you care about.
  • Dependable – Good performance tools run reliably and produce repeatable results, both of which require that significant work go into their development and testing.
  • Open – Performance tools that allow people to access the source code, and even contribute to it, keep things above board and reassure you that you can rely on the results.

Our goal with the XPRTs is to provide performance tools that meet all these criteria. WebXPRT 3 and all our other XPRTs exist to help accurately reveal how devices perform. You can run them yourself or rely on the wealth of results that we and others have collected on a wide array of devices.

The best thing about good performance tools is that everyone, even vendors, can use them. I sincerely hope that you find the XPRTs helpful when you make your next technology purchase.

Bill

Notes from the lab

This week’s XPRT Weekly Tech Spotlight featured the Alcatel A30 Android phone. We chose the A30, an Amazon exclusive, because it’s a budget phone running Android 7.0 (Nougat) right out of the box. That may be an appealing combination for consumers, but running a newer OS on inexpensive hardware such as what’s found in the A30 can cause issues for app developers, and the XPRTs are no exception.

Spotlight fans may have noticed that we didn’t post a MobileXPRT 2015 or BatteryXPRT 2014 score for the A30. In both cases, the benchmark did not produce an overall score because of a problem that occurs during the Create Slideshow workload. The issue stems from text relocation restrictions introduced by significant changes in the Android development environment.

As of Android 5.0, on 64-bit devices, the OS doesn’t allow native code executables to perform text relocation. Instead, it is necessary to compile the executables using position-independent code (PIC) flags. This is how we compiled the current version of MobileXPRT, and it’s why we updated BatteryXPRT earlier this year to maintain compatibility with the most recent versions of Android.
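
For anyone curious about what that looks like in practice, here’s a minimal sketch of a native executable built the position-independent way. The file name and build line below are hypothetical (they aren’t our actual build setup), but -fPIC and -pie are the standard flags for the job:

    /* pie_hello.c: a minimal native executable, assuming a clang-based
     * Android NDK toolchain. A hypothetical build line:
     *
     *   aarch64-linux-android21-clang -fPIC -pie pie_hello.c -o pie_hello
     *
     * Built this way, the binary contains no text relocations and loads
     * cleanly on Android 5.0 and later. */
    #include <stdio.h>

    int main(void) {
        printf("Loaded as a position-independent executable.\n");
        return 0;
    }

Without those flags, the toolchain can produce a binary that isn’t position independent, which is exactly what newer versions of Android refuse to run.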

However, the same approach doesn’t work for SoCs built with older 32-bit ARMv7-A architectures, such as the A30’s Qualcomm Snapdragon 210, so testers may encounter this issue on other devices with low-end hardware.

Testers who run into this problem can still use MobileXPRT 2015 to generate individual workload scores for the Apply Photo Effects, Create Photo Collages, Encrypt Personal Content, and Detect Faces workloads. Also, BatteryXPRT will produce an estimated battery life for the device, but since it won’t produce a performance score, we ask that testers use those numbers for informational purposes and not publication.

If you have any questions or have encountered additional issues, please let us know!

Justin

Celebrating one year of the XPRT Weekly Tech Spotlight

It’s been just over a year since we launched the XPRT Weekly Tech Spotlight by featuring our first device, the Google Pixel C. Spotlight has since become one of the most popular items at BenchmarkXPRT.com, and we thought now would be a good time to recap the past year, offer more insight into the choices we make behind the scenes, and look at what’s ahead for Spotlight.

The goal of Spotlight is to provide PT-verified specs and test results that can help consumers make smart buying decisions. We try to include a wide variety of device types, vendors, software platforms, and price points in our inventory. The devices also tend to fall into one of two main groups: popular new devices generating a lot of interest and devices that have unique form factors or unusual features.

To date, we’ve featured 56 devices: 16 phones, 11 laptops, 10 two-in-ones, 9 tablets, 4 consoles, 3 all-in-ones, and 3 small-form-factor PCs. The operating systems these devices run include Android, ChromeOS, iOS, macOS, OS X, Windows, and an array of vendor-specific OS variants and skins.

As much as possible, we test using out-of-the-box (OOB) configurations. We want to present test results that reflect what everyday users will experience on day one. Depending on the vendor, the OOB approach can mean that some devices arrive bogged down with bloatware while others are relatively clean. We don’t attempt to “fix” anything in those situations; we simply test each device “as is” when it arrives.

If devices arrive with outdated OS versions (as is often the case with Chromebooks), we update to current versions before testing, because that’s the best reflection of what everyday users will experience. In the past, that approach would’ve been more complicated with Windows systems, but Microsoft’s shift to “Windows as a service” ensures that most users receive significant OS updates automatically by default.

The OOB approach also means that the WebXPRT scores we publish reflect the performance of each device’s default browser, even if it’s possible to install a faster browser. Our goal isn’t to perform a browser shootout on each device, but to give an accurate snapshot of OOB performance. For instance, last week’s Alienware Steam Machine entry included two WebXPRT scores: a 356 on the SteamOS browser app and a 441 on Iceweasel 38.8.0 (a Firefox variant used in the device’s Linux-based desktop mode). That’s a significant difference, but the main question for us was which browser a user is more likely to use in an OOB scenario. With the Steam Machine, the answer was truly “either one.” Many users will stick with the browser app in the SteamOS environment, and many will take the few steps needed to access the desktop environment. In that case, even though one browser was significantly faster than the other, omitting one score in favor of the other would have excluded results from an equally likely OOB environment.

We’re always looking for ways to improve Spotlight. We recently began including more photos for each device, including ones that highlight important form-factor elements and unusual features. Moving forward, we plan to expand Spotlight’s offerings to include automatic score comparisons, additional system information, and improved graphical elements. Most importantly, we’d like to hear your thoughts about Spotlight. What devices and device types would you like to see? Are there specs that would be helpful to you? What can we do to improve Spotlight? Let us know!

Justin

We haven’t mentioned this in a while

I had a conversation with a community member yesterday who wanted to know whether we would test his device with one of the XPRTs. The short answer is “Absolutely!” The somewhat longer answer follows.

If you send us a device you want us to test, we will do so, with the appropriate set of XPRTs, free of charge. You will know that an impartial, third-party lab has tested your device using the best benchmarking practices. After we share the results with you, you will have three options: (1) to keep the results private, (2) to have us make the results public immediately in the appropriate XPRT results databases, or (3) to delay releasing the results until a future date. Regardless of your choice, we will keep the device so that we can use it as part of our testbed for developing and testing future versions of the XPRTs.

When we add the results to our online databases, we will cite Principled Technologies as the source, indicating that we stand behind the results.

The free testing includes no collateral beyond publishing the results. If you would like to publicize them through a report, an infographic, or any of the other materials PT can provide, just let us know and the appropriate person will contact you to discuss how much those services would cost.

If you’re interested in getting your device tested for free, contact us at BenchmarkXPRTSupport@principledtechnologies.com.

Eric

Question we get a lot

“How come your benchmark ranks devices differently than [insert other benchmark here]?” It’s a fair question, and the answer is that each benchmark has its own emphasis and tests different things. When you think about it, it would be surprising if all benchmarks agreed.

To illustrate the phenomenon, consider this excerpt from a recent browser shootout in VentureBeat:

[Chart: browser benchmark results from the VentureBeat shootout]

While this looks very confusing, the simple explanation is that the different benchmarks test different things. To begin with, SunSpider, Octane, JetStream, Peacekeeper, and Kraken all measure JavaScript performance. Oort Online measures WebGL performance. WebXPRT measures both JavaScript and HTML5 performance. HTML5Test measures HTML5 compliance.

Even with benchmarks that test the same aspect of browser performance, the tests differ. Kraken and SunSpider both test the speed of JavaScript math, string, and graphics operations in isolation, but run different sets of tests to do so. Peacekeeper profiles the JavaScript from sites such as YouTube and Facebook.

WebXPRT, like the other XPRTs, uses scenarios that model the types of work people do with their devices.

It’s no surprise that the order changes depending on which aspect of the Web experience you emphasize, in much the same way that the most fuel-efficient cars might not be the ones with the best acceleration.
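
To make that concrete, here’s a tiny sketch with entirely made-up numbers. “Device A” and “Device B” are hypothetical, and the scores exist only to show how a ranking can flip:

    /* rank_demo.c: two hypothetical devices, two ways of ranking them.
     * All scores are invented for illustration; higher is better. */
    #include <math.h>
    #include <stdio.h>

    int main(void) {
        double a_js = 120.0, a_webgl = 40.0;  /* Device A */
        double b_js =  90.0, b_webgl = 80.0;  /* Device B */

        /* A JavaScript-only benchmark puts Device A on top. */
        printf("JS only:  A=%.0f  B=%.0f\n", a_js, b_js);

        /* A benchmark that weights JavaScript and WebGL equally
         * (geometric mean) puts Device B on top instead. */
        printf("Combined: A=%.1f  B=%.1f\n",
               sqrt(a_js * a_webgl), sqrt(b_js * b_webgl));
        return 0;
    }

Compile it with cc rank_demo.c -lm and Device A wins the JavaScript-only comparison while Device B wins the combined one. Neither ranking is wrong; they simply measure different things.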

This is a bigger topic than we can deal with in a single blog post, and we’ll examine it more in the future.

Eric

What’s in a name?

A couple of weeks ago, the German site Notebookcheck published a review of the Huawei P8lite. We were pleased to see they used WebXPRT 2015, in which the P8lite got an overall score of 47. This week, AnandTech published their review of the Huawei P8lite. In their review, the P8lite got an overall score of 59!

Those scores are very different, but it was not difficult to figure out why. The P8lite comes in two versions, depending on your market. The version Notebookcheck used is based on HiSilicon’s Kirin 620 SoC, while the version AnandTech used is based on Qualcomm’s Snapdragon 615. It’s also worth noting that the phone Notebookcheck tested was running Android 5.0, while the phone AnandTech tested was running Android 4.4. With different hardware and different operating systems, it’s no surprise that the results were different.

One consequence of the XPRTs being used across the world is that it is not uncommon to see results from devices in different markets. As we’ve said before, many things can influence benchmark results, so don’t assume that two devices with the same name are identical.

Kudos to both AnandTech and Notebookcheck for their care in presenting the system information for the devices in their reviews. The AnandTech review even included a brief description of the two models of the P8lite. This type of information is essential for helping people make informed decisions.

In other news, Windows 10 launched yesterday. We’re looking forward to seeing the TouchXPRT and WebXPRT results!

Eric
