Category: computer vision

Local AI and new frontiers for performance evaluation

Recently, we discussed some ways the PC market may evolve in 2024, and how new Windows on Arm PCs could present the XPRTs with many opportunities for benchmarking. In addition to a potential market shakeup from Arm-based PCs in the coming years, there’s a much broader emerging trend that could eventually revolutionize almost everything about the way we interact with our personal devices—the development of local, dedicated AI processing units for consumer-oriented tech.

AI already impacts daily life for many consumers through technologies such as predictive text, computer vision, adaptive workflow apps, voice recognition, smart assistants, and much more. Generative AI-based technologies are rapidly establishing a permanent, society-altering presence across a wide range of industries. Aside from some localized inference tasks that the CPU and/or GPU typically handle, the bulk of the heavy compute power that fuels those technologies has been in the cloud or in on-prem servers. Now, several major chipmakers are working to roll out their own versions of AI-optimized neural processing units (NPUs) that will enable local devices to take on a larger share of the AI load.

Examples of dedicated AI hardware in recently released or upcoming consumer devices include Intel’s new Meteor Lake NPU, Apple’s Neural Engine for M-series SoCs, Qualcomm’s Hexagon NPU, and AMD’s XDNA 2 architecture. The potential benefits of localized, NPU-facilitated AI are straightforward. On-device AI could reduce power consumption and extend battery life by offloading AI tasks from the CPU. It could alleviate certain cloud-related privacy and security concerns. Without the delays inherent in cloud queries, localized AI could execute inference tasks that operate much closer to real time. NPU-powered devices could fine-tune applications around your habits and preferences, even while offline. You could pull and utilize relevant data from cloud-based datasets without pushing private data in return. Theoretically, your device could know a great deal about you and enhance many areas of your daily life without passing all that data to another party.

Will localized AI play out that way? Some tech companies envision a role for on-device AI that enhances the abilities of existing cloud-based subscription services without decoupling personal data. We’ll likely see a wide variety of capabilities and services on offer, with application-specific and SaaS-determined privacy options.

Regardless of the way on-device AI technology evolves in the coming years, it presents an exciting new frontier for benchmarking. Not all NPUs will be created equal, and that’s something buyers will need to understand. Some vendors will optimize their hardware more for computer vision, or large language models, or AI-based graphics rendering, and so on. It won’t be enough for businesses and consumers to simply know that a new system has dedicated AI processing abilities. They’ll need to know if that system performs well while handling the types of AI-related tasks that they do every day.

Here at the XPRTs, we specialize in creating benchmarks that feature real-world scenarios that mirror the types of tasks that people do in their daily lives. That approach means that when people use XPRT scores to compare device performance, they’re using a metric that can help them make a buying decision that will benefit them every day. We look forward to exploring ways that we can bring XPRT benchmarking expertise to the world of on-device AI.

Do you have ideas for future localized AI workloads? Let us know!

Justin

The AIXPRT learning tool is now live (and a CloudXPRT version is on the way)!

We’re happy to announce that the AIXPRT learning tool is now live! We designed the tool to serve as an information hub for common AIXPRT topics and questions, and to help tech journalists, OEM lab engineers, and everyone who is interested in AIXPRT find the answers they need in as little time as possible.

The tool features four primary areas of content:

  • The Q&A section provides quick answers to the questions we receive most from testers and the tech press.
  • The AIXPRT: the basics section describes specific topics such as the benchmark’s toolkits, networks, workloads, and hardware and software requirements.
  • The testing and results section covers the testing process, metrics, and how to publish results.
  • The AI/ML primer provides brief, easy-to-understand definitions of key AI and ML terms and concepts for those who want to learn more about the subject.

The first screenshot below shows the home screen. To show how some of the popup information sections appear, the second screenshot shows the Inference tasks (workloads) entry in the AI/ML Primer section. 

We’re excited about the new AIXPRT learning tool, and we’re also happy to report that we’re working on a version of the tool for CloudXPRT. We hope to make the CloudXPRT tool available early next year, and we’ll post more information in the blog as we get closer to taking it live.

If you have any questions about the tool, please let us know!

Justin

Understanding the basics of AIXPRT precision settings

A few weeks ago, we discussed one of AIXPRT’s key configuration variables, batch size. Today, we’re discussing another key variable: the level of precision. In the context of machine learning (ML) inference, the level of precision refers to the computer number format (FP32, FP16, or INT8) representing the weights (parameters) a network model uses when performing the calculations necessary for inference tasks.

Higher levels of precision for inference tasks help decrease the number of false positives and false negatives, but they can increase the amount of time, memory bandwidth, and computational power necessary to achieve accurate results. Lower levels of precision typically (but not always) enable the model to process inputs more quickly while using less memory and processing power, but they can allow a degree of inaccuracy that is unacceptable for certain real-world applications.
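
To make the memory side of that trade-off concrete, here is a rough sketch of how the storage needed for a model’s weights scales with the precision format. The weight count is a hypothetical figure chosen only for illustration, and the calculation ignores activations, framework overhead, and everything else a real inference run keeps in memory.

```python
# Bytes needed to store a single weight in each precision format.
BYTES_PER_WEIGHT = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_mib(num_weights: int, precision: str) -> float:
    """Return the approximate size of a model's weights in MiB."""
    return num_weights * BYTES_PER_WEIGHT[precision] / (1024 ** 2)


# Hypothetical model size, used only for illustration.
NUM_WEIGHTS = 25_000_000

for precision in ("fp32", "fp16", "int8"):
    size = weight_memory_mib(NUM_WEIGHTS, precision)
    print(f"{precision}: {size:.1f} MiB")
```

Halving the bits per weight halves the storage for the weights, which is one reason lower precision can ease memory bandwidth pressure. Whether that translates into faster inference depends on the hardware and software under test.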

For example, a high level of precision may be appropriate for computer vision applications in the medical field, where the benefits of hyper-accurate object detection and classification far outweigh the benefit of saving a few milliseconds. On the other hand, a low level of precision may work well for vision-based sensors in the security industry, where alert time is critical and monitors simply need to know if an animal or a human triggered a motion-activated camera.

FP32, FP16, and INT8

In AIXPRT, we can instruct the network models to use FP32, FP16, or INT8 levels of precision:

  • FP32 refers to single-precision (32-bit) floating point format, a number format that can represent an enormous range of values with a high degree of mathematical precision. Most CPUs and GPUs handle 32-bit floating point operations very efficiently, and many programs that use neural networks, including AIXPRT, use FP32 precision by default.
  • FP16 refers to half-precision (16-bit) floating point format, a number format that uses half as many bits as FP32 to represent a model’s parameters. FP16 is a lower level of precision than FP32, but it still provides a great enough numerical range to successfully perform many inference tasks. FP16 often requires less time than FP32, and uses less memory.
  • INT8 refers to the 8-bit integer data type. INT8 data is better suited for certain types of calculations than floating point data, but it has a relatively small numeric range compared to FP16 or FP32. Depending on the model, INT8 precision can significantly improve latency and throughput, but there may be a loss of accuracy. INT8 precision does not always trade accuracy for speed, however. Researchers have shown that a process called quantization (i.e., approximating continuous values with discrete counterparts) can enable some networks, such as ResNet-50, to run INT8 precision without any significant loss of accuracy.
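
The three formats above can be compared with a small sketch that uses only the Python standard library. It rounds a value to FP32 and FP16, then applies one very simple quantization scheme (a single symmetric scale factor for the whole list of weights) to map values to INT8. The weights and the scheme are assumptions chosen for illustration; this is not how AIXPRT or any particular toolkit performs quantization.

```python
import struct


def to_fp32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest FP32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_fp16(x: float) -> float:
    """Round a Python float (64-bit) to the nearest FP16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_int8(values, scale):
    """Map floats to integers in [-128, 127] using one scale factor."""
    return [max(-128, min(127, round(v / scale))) for v in values]


def dequantize_int8(q_values, scale):
    """Map INT8 values back to approximate floats."""
    return [q * scale for q in q_values]


# Floating point: fewer bits means a coarser approximation of 0.1.
print(f"fp32(0.1) = {to_fp32(0.1)!r}")
print(f"fp16(0.1) = {to_fp16(0.1)!r}")

# Hypothetical weights, used only for illustration.
weights = [0.1, -0.731, 0.0042, 1.5]

# Symmetric scale: the largest magnitude maps to 127.
scale = max(abs(w) for w in weights) / 127

q = quantize_int8(weights, scale)
restored = dequantize_int8(q, scale)
for w, qi, r in zip(weights, q, restored):
    print(f"{w:+.4f} -> int8 {qi:+4d} -> {r:+.4f}")
```

In this sketch, the smallest weight collapses to zero after quantization, which illustrates the kind of information INT8 can lose. More careful quantization schemes aim to keep such losses from affecting a network’s final accuracy.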

Configuring precision in AIXPRT

The screenshot below shows part of a sample config file, the same sample file we used for our batch size discussion. The value in the “precision” row indicates the precision setting. This test configuration would run tests using INT8. To change the precision, a tester simply replaces that value with “fp32” or “fp16” and saves the changes.

[Screenshot: part of a sample AIXPRT config file showing the “precision” setting]
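
For testers who prefer to script the change, the edit described above could look something like the following sketch. It assumes the config is JSON with a “precision” key inside each workload entry. The surrounding structure, the workload name, and the set_precision helper are hypothetical, so check them against your actual config file before relying on this.

```python
import json

# Hypothetical config contents; the real AIXPRT file layout may differ.
SAMPLE_CONFIG = """
{
  "workloads": [
    {"name": "example_workload", "batch_sizes": [1, 2], "precision": "int8"}
  ]
}
"""

VALID_PRECISIONS = {"fp32", "fp16", "int8"}


def set_precision(config_text: str, precision: str) -> str:
    """Return the config text with every workload set to the given precision."""
    if precision not in VALID_PRECISIONS:
        raise ValueError(f"unsupported precision: {precision!r}")
    config = json.loads(config_text)
    for workload in config["workloads"]:
        workload["precision"] = precision
    return json.dumps(config, indent=2)


updated = set_precision(SAMPLE_CONFIG, "fp16")
print(updated)
```

Validating the value before writing it guards against typos such as “fp61” that would otherwise only surface when the test runs.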

Note that while decreasing the precision from FP32 to FP16 or INT8 often results in larger throughput numbers and faster inference speeds overall, this is not always the case. Many other factors can affect ML performance, including (but not limited to) the complexity of the model, the presence of specific ML optimizations for the hardware under test, and any inherent limitations of the target CPU or GPU.

As with most AI-related topics, the details of model precision are extremely complex, and it’s a hot topic in cutting edge AI research. You don’t have to be an expert, however, to understand how changing the level of precision can affect AIXPRT test results. We hope that today’s discussion helped to make the basics of precision a little clearer. If you have any questions or comments, please feel free to contact us.

Justin

Machine learning performance tool update

Earlier this year we started talking about our efforts to develop a tool to help in evaluating machine learning performance. We’ve given some updates since then, but we’ve also gotten some questions, so I thought I’d do my best to summarize our answers for everyone.

Some have asked what kinds of algorithms we’ve been looking into. As we said in an earlier blog, we’re looking at algorithms involved in computer vision, natural language processing, and data analytics, particularly different aspects of computer vision.

One seemingly trivial question we’ve received regards the proposed name, MLXPRT. We have been thinking of this tool as evaluating machine learning performance, but folks have raised a valid concern that it may well be broader than that. Does machine learning include deep learning? What about other artificial intelligence approaches? I’ve certainly seen other approaches lumped into machine learning, probably because machine learning is the hot topic of the moment. It feels like everything is boasting, “Now with machine learning!”

While there is some value in being part of such a hot movement, we’ve begun to wonder if a more inclusive name, such as AIXPRT, would be better. We’d love to hear your thoughts on that.

We’ve also had questions about the kind of devices the tool will run on. The short answer is that we’re concentrating on edge devices. While there is a need for server AI/ML tools, we’ve been focusing on evaluating the devices close to the end users. As a result, we’re looking at the inference aspect of machine learning rather than the training aspect.

Probably the most frequent thing we’ve been asked about is the timetable. While we’d hoped to have something available this year, we were overly optimistic. We’re currently working on a more detailed proposal of what the tool will be, and we aim to make that available by the end of this year. If we achieve that goal, our next one will be to have a preliminary version of the tool itself ready in the first half of 2018.

As always, we seek input from folks, like yourself, who are working in these areas. What would you most like to see in an AI/machine learning performance tool? Do you have any questions?

Bill 

Thoughts from MWC Shanghai

I’ve spent the last couple days walking the exhibition halls of MWC Shanghai. The Shanghai New International Expo Centre (SNIEC) is large, but smaller than the MWC exhibit space in Barcelona or the set of exhibit halls in Las Vegas for CES. (SNIEC is not even the biggest exhibition space in Shanghai!) Further, MWC here still only took up half the exhibition space, but there was plenty to see. And, I’m less exhausted than after CES or MWC in Barcelona!

If I had to pick one theme from the exhibition halls, it would be 5G. It seemed like half the booths had 5G displayed somewhere in their signage. The cloud was the other concept that seemed to be everywhere. While neither was surprising, it was interesting to see halfway around the world. In truth, it feels like 5G is much farther along here than it is back in the States.

I was also surprised to see how many phone vendors are here that I’d never heard of before, such as Lephone and Gionee. I stopped by their booths with XPRT Spotlight information and hope they will send in some of their devices for inclusion in the future.

One thing I found of note was how pervasive technology in general, and IoT in particular, is going to be. There was an interesting exhibit showing how stores of the future might operate. I was able to “buy” items without traditionally checking out. (I got a free water and some cookies out of the experience.) I just placed the items in a location on the checkout counter, which read their NFC labels and displayed them on the checkout screen. It seemed sort of like my understanding of the experiments that Amazon has been doing with brick-and-mortar grocery stores (prior to their purchase of Whole Foods). The whole experience felt a bit odd and still unpolished, but I’m sure it will improve and I’ll get used to it.

The next generation will find it not odd, but normal. There were exhibits with groups of children playing with creative technologies from handheld 3D printers to simplified programming languages. They will be the generation after digital natives, maybe the digital creatives? What impact will they have? The future is both exciting and daunting!

I came away from the conference thinking about how the XPRTs can help folks choose amongst the myriad devices and technologies that are just around the corner. What would you most like to see the XPRTs tackle in the next six months to a year?

Bill Catchings

Evaluating machine learning performance

A few weeks ago, I discussed the rising importance of machine learning and our efforts to develop a tool to help in evaluating its performance. Here is an update on our thinking.

One thing we are sure of is that we can’t cover everything in machine learning. The field is evolving rapidly, so we think the best approach is to pick a good place to start and then build from there.

One of the key areas we need to home in on is the set of algorithms that we will employ in MLXPRT. (We haven’t formally decided on a name, but are currently using MLXPRT internally when we talk about what we’ve been doing.)

Computer vision, or image detection, seems to be a good place to start. We see three specific sets of algorithms to possibly cover. It’s worth noting that the lines between these sets are often blurry.

The first set of computer vision algorithms performs image classification. These algorithms identify things like a cat or a dog in an image. Some of the most popular algorithms are AlexNet and GoogLeNet, as well as ones from VGG. The initial training and use for these was on the ImageNet database, containing over 10 million images.

The next set of algorithms in computer vision performs object detection and localization. The algorithms identify the contents and their spatial location in an image, and typically draw bounding boxes around them. A couple of the most popular algorithms are Faster R-CNN and Single Shot MultiBox Detector (SSD).

The final set of computer vision algorithms performs image segmentation. Rather than just drawing a box around an object, image segmentation attempts to classify each pixel in an image by the object it is a part of. The result looks like a contour/color map that shows the different objects in the image. These techniques can be especially useful in autonomous vehicles and medical diagnostic imaging. Currently, the leading algorithms in image segmentation are fully convolutional networks (FCN), but the area is developing rapidly.
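
To show how the outputs of the three sets of algorithms differ, here is a toy sketch built around a hand-written per-pixel label map. Real networks such as the ones named above produce these outputs directly from image pixels. The tiny grid, the class names, and the helper functions below are hypothetical and exist only to compare the shape of each result.

```python
from collections import Counter

# Toy segmentation output: one class ID per pixel (0 = background, 1 = cat).
SEGMENTATION = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
LABELS = {0: "background", 1: "cat"}


def bounding_box(seg, class_id):
    """Detection-style output: (top, left, bottom, right), inclusive."""
    cells = [
        (r, c)
        for r, row in enumerate(seg)
        for c, value in enumerate(row)
        if value == class_id
    ]
    if not cells:
        return None
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    return (min(rows), min(cols), max(rows), max(cols))


def image_label(seg):
    """Classification-style output: one label for the whole image."""
    counts = Counter(v for row in seg for v in row if v != 0)
    if not counts:
        return LABELS[0]
    return LABELS[counts.most_common(1)[0][0]]


print("classification:", image_label(SEGMENTATION))
print("detection box: ", bounding_box(SEGMENTATION, 1))
print("segmentation:  ", SEGMENTATION)
```

Classification yields a single label, detection yields a label plus a box, and segmentation yields a label for every pixel, so the amount of output grows with each step.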

Even limiting the initial version of MLXPRT to computer vision may be too broad. For example, we may end up only doing image classification and object detection.

As always, we crave input from folks, like yourself, who are working in these areas. What would you most like to see in a machine learning performance tool?

Bill
