BenchmarkXPRT Blog

Category: Future of performance evaluation

Machine learning performance tool update

Earlier this year we started talking about our efforts to develop a tool to help in evaluating machine learning performance. We’ve given some updates since then, but we’ve also gotten some questions, so I thought I’d do my best to summarize our answers for everyone.

Some have asked what kinds of algorithms we’ve been looking into. As we said in an earlier blog, we’re looking at algorithms involved in computer vision, natural language processing, and data analytics, with a particular focus on different aspects of computer vision.

One seemingly trivial question we’ve received regards the proposed name, MLXPRT. We have been thinking of this tool as evaluating machine learning performance, but folks have raised a valid concern that it may well be broader than that. Does machine learning include deep learning? What about other artificial intelligence approaches? I’ve certainly seen other approaches lumped into machine learning, probably because machine learning is the hot topic of the moment. It feels like everything is boasting, “Now with machine learning!”

While there is some value in being part of such a hot movement, we’ve begun to wonder if a more inclusive name, such as AIXPRT, would be better. We’d love to hear your thoughts on that.

We’ve also had questions about the kinds of devices the tool will run on. The short answer is that we’re concentrating on edge devices. While there is a need for server AI/ML tools, we’ve been focusing on evaluating the devices closest to end users. As a result, we’re looking at the inference aspect of machine learning rather than the training aspect.

Probably the most frequent thing we’ve been asked about is the timetable. While we’d hoped to have something available this year, we were overly optimistic. We’re currently working on a more detailed proposal of what the tool will be, and we aim to make that available by the end of this year. If we achieve that goal, our next one will be to have a preliminary version of the tool itself ready in the first half of 2018.

As always, we seek input from folks, like yourself, who are working in these areas. What would you most like to see in an AI/machine learning performance tool? Do you have any questions?

Bill 

Everything old is new again

I recently saw an article called “4 lessons for modern software developers from 1970s mainframe programming.” This caught my eye because I started programming in the late 1970s, and my first programming environment was an IBM 370.

The author talks about how, back in the old days, you had to write tight code because memory and computing resources were limited. He also talks about the great amount of time we spent planning, writing, proofreading, and revising our code—all on paper—before running it. We did that because computing resources were expensive and you would get in trouble for using too many. He’s right about that—I got reamed out a couple of times!

At first, it seemed like this was just another article by an old programmer talking about how sloppy and lazy the new generation is, but then he made an interesting point. Programming for embedded processors reintroduces the types of resource limitations we used to have to deal with. Cloud computing reintroduces having to pay for computing resources based on usage.

I personally think he goes too far in making his point; there are a lot of times when rapid prototyping and iterative development are the best way to do things. However, his main thesis has merit. Some new applications may benefit from doing things the old way.

Cloud computing and embedded processors are, of course, important in machine learning applications. As we’re working on a machine learning XPRT, we’ll be following best practices for this new environment!

Eric

Planning the next version of HDXPRT

A few weeks ago, we wrote about the capabilities and benefits of HDXPRT. This week, we want to share some initial ideas for the next version of HDXPRT, and invite you to send us any comments or suggestions you may have.

The first step towards a new HDXPRT will be updating the benchmark’s workloads to increase their value in the years to come. Primarily, this will involve updating application content, such as photos and videos, to more contemporary file resolutions and sizes. We think 4K-related workloads will increase the benchmark’s relevance, but aren’t sure whether 4K playback tests are necessary. What do you think?

The next step will be to update the versions of the real-world trial applications included in the benchmark, including Adobe Photoshop Elements, Apple iTunes, Audacity, CyberLink MediaEspresso, and HandBrake. Are there any other applications you feel would be a good addition to HDXPRT’s photo editing, music editing, or video conversion test scenarios?

We’re also planning to update the UI to improve the look and feel of the benchmark and simplify navigation and functionality.

Last but not least, we’ll work to fix known problems, such as the hardware acceleration settings issue in MediaEspresso, and eliminate the need for workarounds when running HDXPRT on the Windows 10 Creators Update.

Do you have feedback on these ideas or suggestions for applications or test scenarios that we should consider for HDXPRT? Are there existing features we should remove? Are there elements of the UI that you find especially useful or would like to see improved? Please let us know. We want to hear from you and make sure that HDXPRT continues to meet your needs.

Justin

Thoughts from MWC Shanghai

I’ve spent the last couple of days walking the exhibition halls of MWC Shanghai. The Shanghai New International Expo Centre (SNIEC) is large, but smaller than the MWC exhibit space in Barcelona or the set of exhibit halls in Las Vegas for CES. (SNIEC is not even the biggest exhibition space in Shanghai!) Further, MWC here took up only half the exhibition space, but there was plenty to see. And, I’m less exhausted than after CES or MWC in Barcelona!


If I had to pick one theme from the exhibition halls, it would be 5G. It seemed like half the booths had 5G displayed somewhere in their signage. The cloud was the other concept that seemed to be everywhere. While neither was surprising, it was interesting to see halfway around the world. In truth, it feels like 5G is much farther along here than it is back in the States.

I was also surprised to see how many phone vendors are here that I’d never heard of before, such as Lephone and Gionee. I stopped by their booths with XPRT Spotlight information and hope they will send in some of their devices for inclusion in the future.

One thing I found of note was how pervasive technology in general, and IoT in particular, is going to be. There was an interesting exhibit showing how stores of the future might operate. I was able to “buy” items without a traditional checkout. (I got a free water and some cookies out of the experience.) I just placed the items in a location on the checkout counter, which read their NFC labels and displayed them on the checkout screen. It seemed similar to my understanding of the experiments that Amazon has been doing with brick-and-mortar grocery stores (prior to their purchase of Whole Foods). The whole experience felt a bit odd and still unpolished, but I’m sure it will improve and I’ll get used to it.


The next generation will find it not odd, but normal. There were exhibits with groups of children playing with creative technologies from handheld 3D printers to simplified programming languages. They will be the generation after digital natives, maybe the digital creatives? What impact will they have? The future is both exciting and daunting!

I came away from the conference thinking about how the XPRTs can help folks choose amongst the myriad devices and technologies that are just around the corner. What would you most like to see the XPRTs tackle in the next six months to a year?

Bill Catchings

Learning about machine learning

Everywhere we look, machine learning is in the news. It’s driving cars and beating the world’s best Go players. Whether we are aware of it or not, it’s in our lives–understanding our voices and identifying our pictures.

Our goal of being able to measure the performance of hardware and software that does machine learning seems more relevant than ever. Our challenge is to scan the vast landscape that is machine learning, and identify which elements to measure first.

There is a natural temptation to see machine learning as being all about neural networks such as AlexNet and GoogLeNet. However, new innovations appear all the time, and lots of important work with more classic machine learning techniques is also underway. (Classic machine learning being anything more than a few years old!) Recurrent neural networks used for language translation, reinforcement learning used in robotics, and support vector machine (SVM) learning used in text recognition are just a few examples among the wide array of algorithms to consider.
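To make “classic machine learning” concrete, here is a toy text classifier built on one of the oldest techniques of all, the perceptron learning rule over bag-of-words features. This is purely illustrative; it is not an XPRT workload, and the vocabulary and sample sentences are invented for the example.

```python
# Toy text classifier: a perceptron over bag-of-words features.
# Purely illustrative; not an XPRT workload.

def featurize(text, vocab):
    """Map a sentence to a bag-of-words count vector over vocab."""
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def train_perceptron(samples, labels, vocab, epochs=20):
    """Classic perceptron rule: nudge the weights on each mistake."""
    weights = [0.0] * len(vocab)
    bias = 0.0
    for _ in range(epochs):
        for text, label in zip(samples, labels):  # label is +1 or -1
            x = featurize(text, vocab)
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            if label * score <= 0:  # misclassified: update toward the label
                weights = [w + label * xi for w, xi in zip(weights, x)]
                bias += label
    return weights, bias

def predict(text, vocab, weights, bias):
    x = featurize(text, vocab)
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1

vocab = ["great", "fast", "slow", "broken"]
samples = ["great fast device", "fast and great",
           "slow broken screen", "broken and slow"]
labels = [1, 1, -1, -1]
w, b = train_perceptron(samples, labels, vocab)
print(predict("really great", vocab, w, b))  # 1
print(predict("so slow", vocab, w, b))       # -1
```

Techniques like this are decades old, run comfortably on modest hardware, and still do real work in production systems, which is part of why a benchmark can’t focus solely on deep neural networks.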

Creating a benchmark or set of benchmarks to cover all those areas, however, is unlikely to be possible. Certainly, creating such an ambitious tool would take so long that it would be of limited usefulness.

Our current thinking is to begin with a small set of representative algorithms. The challenge, of course, is identifying them. That’s where you come in. What would you like to start with?

We anticipate that the benchmark will focus on the types of inference learning and light training that are likely to occur on edge devices. Extensive training with large datasets takes place in data centers or on systems with extraordinary computing capabilities. We’re interested in use cases that will stress the local processing power of everyday devices.
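As a rough illustration of what “stressing the local processing power” means for inference, here is a minimal latency-measurement sketch: it times repeated passes through a single small dense layer written in pure Python. The layer sizes and run count are arbitrary placeholders; a real benchmark would run full model graphs with optimized libraries.

```python
# Illustrative sketch: measure average latency of a tiny
# "inference-like" workload on the local device.
# Layer sizes and run count are arbitrary; not an XPRT workload.
import time

def tiny_forward(x, weights):
    """One dense layer with a ReLU, in pure Python for portability."""
    out = []
    for row in weights:
        s = sum(w * xi for w, xi in zip(row, x))
        out.append(s if s > 0 else 0.0)
    return out

x = [0.5] * 64                              # small input vector
weights = [[0.01] * 64 for _ in range(32)]  # 32x64 weight matrix

runs = 100
start = time.perf_counter()
for _ in range(runs):
    y = tiny_forward(x, weights)
latency_ms = (time.perf_counter() - start) / runs * 1000
print(f"avg latency: {latency_ms:.3f} ms per pass")
```

Averaging over many runs, rather than timing a single pass, is the usual way to smooth out scheduler noise on everyday devices.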

We are, of course, reaching out to folks in the machine learning field—including those in academia, those who create the underlying hardware and software, and those who make the products that rely on that hardware and software.

What do you think?

Bill

Evolve or die

Last week, Google announced that it would retire its Octane benchmark. Their announcement explains that they designed Octane to spur improvement in JavaScript performance, and while it did just that when it was first released, those improvements have plateaued in recent years. They also note that there are some operations in Octane that optimize Octane scores but do not reflect real-world scenarios. That’s unfortunate, because they, like most of us, want improvements in benchmark scores to mean improvements in end-user experience.

WebXPRT comes at the web performance issue differently. While Octane’s goal was to improve JavaScript performance, the purpose of WebXPRT is to measure performance from the end user’s perspective. By doing the types of work real people do, WebXPRT doesn’t measure only improvements in JavaScript performance; it also measures the quality of the real-world user experience. WebXPRT’s results also reflect the performance of the entire device and software stack, not just the performance of the JavaScript interpreter.

Google’s announcement reminds us that benchmarks have finite life spans: they must constantly evolve to keep pace with changes in technology, or they will become useless. To make sure the XPRT benchmarks do just that, we are always looking at how people use their devices and developing workloads that reflect their actions. This is a core element of the XPRT philosophy.

As we mentioned last week, we’re working on the next version of WebXPRT. If you have any thoughts about how it should evolve, let us know!

Eric
