BenchmarkXPRT Blog banner

Tag Archives: SSD-MobileNet v1

The Introduction to AIXPRT white paper is now available!

Today, we published the Introduction to AIXPRT white paper. The paper serves as an overview of the benchmark and a consolidation of AIXPRT-related information that we’ve published in the XPRT blog over the past several months. For folks who are completely new to AIXPRT and veteran testers who need to brush up on pre-test configuration procedures, we hope this paper will be a quick, one-stop reference that helps reduce the learning curve.

The paper describes the AIXPRT toolkits and workloads, adjusting key test parameters (batch size, level of precision, number of concurrent instances, and default number of requests), using alternate test configuration files, understanding and submitting results, and accessing the source code.

We hope that Introduction to AIXPRT will prove to be a valuable resource. Moving forward, readers will be able to access the paper from the Helpful Info box on AIXPRT.com and the AIXPRT section of our XPRT white papers page. If you have any questions about AIXPRT, please let us know!

Justin

The XPRT activity we have planned for first half of 2020

Today, we want to let readers know what to expect from the XPRTs over the next several months. Timelines and details can always change, but we’re confident that community members will see CloudXPRT Community Preview (CP), updated AIXPRT, and CrXPRT 2 releases during the first half of 2020.

CloudXPRT

Last week, Bill shared some details about our new datacenter-oriented benchmark, CloudXPRT. If you missed that post, we encourage you to check it out and learn more about the need for a new kind of cloud benchmark, and our plans for the benchmark’s structure and metrics. We’re already testing preliminary builds, and aim to release a CloudXPRT CP in late March, followed by a version for general availability roughly two months later.

AIXPRT

About a month ago, we explained how the number of moving parts in AIXPRT will necessitate a different development approach than we’ve used for other XPRTs. AIXPRT will require more frequent updating than our other benchmarks, and we anticipate releasing the second version of AIXPRT by mid-year. We’re still finalizing the details, but it’s likely to include the latest versions of ResNet-50 and SSD-MobileNet, selected SDK updates, ease-of-use improvements for the harness, and improved installation scripts. We’ll share more detailed information about the release timeline here in the blog as soon as possible.

CrXPRT 2

As we mentioned in December, we’re working on CrXPRT 2, the next version of our benchmark that evaluates the performance and battery life of Chromebooks. You can find out more about how CrXPRT works both here in the blog and at CrXPRT.com.

We’re currently testing an alpha version of CrXPRT 2. Testing is going well, but we’re tweaking a few items and refining the new UI. We should start testing a CP candidate in the next few weeks, and will have firmer information for community members about a CP release date very soon.

We’re excited about these new developments and the prospect of extending the XPRTs into new areas. If you have any questions about CloudXPRT, AIXPRT, or CrXPRT 2, please feel free to ask!

Justin

Understanding AIXPRT’s default number of requests

A few weeks ago, we discussed how AIXPRT testers can adjust the key variables of batch size, levels of precision, and number of concurrent instances by editing the JSON test configuration file in the AIXPRT/Config directory. In addition to those key variables, there is another variable in the config file called “total_requests” that has a different default setting depending on the AIXPRT test package you choose. This setting can significantly affect a test run, so it’s important for testers to know how it works.

The total_requests variable specifies how many inference requests AIXPRT will send to a network (e.g., ResNet-50) during one test iteration at a given batch size (e.g., Batch 1, 2, 4, etc.). This simulates the inference demand that the end users place on the system. Because we designed AIXPRT to run on different types of hardware, it makes sense to set the default number of requests for each test package to suit the most likely hardware environment for that package.

For example, testing with OpenVINO on Windows aligns more closely with a consumer (i.e., desktop or laptop) scenario than testing with OpenVINO on Ubuntu, which is more typical of server/datacenter testing. Desktop testers require a much lower inference demand than server testers, so the default total_requests settings for the two packages reflect that. The default for the OpenVINO/Windows package is 500, while the default for the OpenVINO/Ubuntu package is 5,000.

Also, setting the number of requests so low that a system finishes each workload in less than 1 second can produce high run-to-run variation, so our default settings represent a lower boundary that will work well for common test scenarios.

Below, we provide the current default total_requests setting for each AIXPRT test package:

  • MXNet: 1,000
  • OpenVINO Ubuntu: 5,000
  • OpenVINO Windows: 500
  • TensorFlow Ubuntu: 100
  • TensorFlow Windows: 10
  • TensorRT Ubuntu: 5,000
  • TensorRT Windows: 500


Testers can adjust these variables in the config file according to their own needs. Finding the optimal combination of machine learning variables for each scenario is often a matter of trial and error, and the default settings represent what we think is a reasonable starting point for each test package.

To adjust the total_requests setting, start by locating and opening the JSON test configuration file in the AIXPRT/Config directory. Below, we show a section of the default config file (CPU_INT8.json) for the OpenVINO-Windows test package (AIXPRT_1.0_OpenVINO_Windows.zip). For each batch size, the total_requests setting appears at the bottom of the list of configurable variables. In this case, the default setting Is 500. Change the total_requests numerical value for each batch size in the config file, save your changes, and close the file.

Total requests snip

Note that if you are running multiple concurrent instances, OpenVINO and TensorRT automatically distribute the number of requests among the instances. MXNet and TensorFlow users must manually allocate the instances in the config file. You can find an example of how to structure manual allocation here. We hope to make this process automatic for all toolkits in a future update.

We hope this information helps you understand the total_requests setting, and why the default values differ from one test package to another. If you have any questions or comments about this or other aspects of AIXPRT, please let us know.

Justin

Understanding AIXPRT results

Last week, we discussed the changes we made to the AIXPRT Community Preview 2 (CP2) download page as part of our ongoing effort to make AIXPRT easier to use. This week, we want to discuss the basics of understanding AIXPRT results by talking about the numbers that really matter and how to access and read the actual results files.

To understand AIXPRT results at a high level, it’s important to revisit the core purpose of the benchmark. AIXPRT’s bundled toolkits measure inference latency (the speed of image processing) and throughput (the number of images processed in a given time period) for image recognition (ResNet-50) and object detection (SSD-MobileNet v1) tasks. Testers have the option of adjusting variables such as batch size (the number of input samples to process simultaneously) to try and achieve higher levels of throughput, but higher throughput can come at the expense of increased latency per task. In real-time or near real-time use cases such as performing image recognition on individual photos being captured by a camera, lower latency is important because it improves the user experience. In other cases, such as performing image recognition on a large library of photos, achieving higher throughput might be preferable; designating larger batch sizes or running concurrent instances might allow the overall workload to complete more quickly.

The dynamics of these performance tradeoffs ensure that there is no single good score for all machine learning scenarios. Some testers might prefer lower latency, while others would sacrifice latency to achieve the higher level of throughput that their use case demands.

Testers can find latency and throughput numbers for each completed run in a JSON results file in the AIXPRT/Results folder. The test also generates CSV results files that are in the same folder. The raw results files report values for each AI task configuration (e.g., ResNet-50, Batch1, on CPU). Parsing and consolidating the raw data can take some time, so we’re developing a results file parsing tool to make the job much easier.

The results parsing tool is currently available in the AIXPRT CP2 OpenVINO – Windows package, and we hope to make it available for more packages soon. Using the tool is as simple as running a single command, and detailed instructions for how to do so are in the AIXPRT OpenVINO on Windows user guide. The tool produces a summary (example below) that makes it easier to quickly identify relevant comparison points such as maximum throughput and minimum latency.

AIXPRT results summary

In addition to the summary, the tool displays the throughput and latency results for each AI task configuration tested by the benchmark. AIXPRT runs each AI task multiple times and reports the average inference throughput and corresponding latency percentiles.

AIXPRT results details

We hope that this information helps to make it easier to understand AIXPRT results. If you have any questions or comments, please feel free to contact us.

Justin

Check out the other XPRTs:

Forgot your password?