Category: Future of performance evaluation

The WebXPRT 3 Community Preview is here!

on December 14, 2017

Today we’re releasing the WebXPRT 3 Community Preview (CP). As we discussed in the blog last month, in the new version of WebXPRT, we updated the photo-related workloads with new images and a new deep learning task for the Organize Album workload. We also added an optical character recognition task to the Local Notes workload and combined a portion of the DNA Sequence Analysis scenario with a writing sample/spell check scenario to simulate an online homework hub in the new “Online Homework” workload.

Also, longtime WebXPRT users will immediately notice a completely new, but clean and straightforward, UI. We’re still tweaking aspects of the UI and implementing full functionality for certain features such as social media sharing and German language translation, but we don’t anticipate making any significant changes to the overall test or individual workloads before the general release.

As with all community previews, the WebXPRT 3 CP is available only to BenchmarkXPRT Development Community members, who can access the link from the WebXPRT tab in the Members’ Area.

After you try the WebXPRT 3 CP, please send us your comments. Thanks and happy testing!

Justin

Posted in BenchmarkXPRT development community, Browser-based benchmarks, Collaborative benchmark development, Community Preview, Cross-platform benchmarks, Future of performance evaluation, German, Performance benchmarking, WebXPRT, WebXPRT 3 |

News about WebXPRT and BatteryXPRT

By Justin Greene

on December 7, 2017

Last month, we gave readers a glimpse of the updates in store for the next WebXPRT, and now we have more news to report on that front.

The new version of WebXPRT will be called WebXPRT 3. WebXPRT 3 will retain the convenient features that made WebXPRT 2013 and WebXPRT 2015 our most popular tools, with more than 200,000 combined runs to date. We’ve added new elements, including AI, to a few of the workloads, but the test will still run in 15 minutes or less in most browsers and produce the same easy-to-understand results that help compare browsing performance across a wide variety of devices.

We’re also very close to publishing the WebXPRT 3 Community Preview. For those unfamiliar with our open development community model, BenchmarkXPRT Development Community members have the ability to preview and test new benchmark tools before we release them to the general public. Community previews are a great way for members to evaluate new XPRTs and send us feedback. If you’re interested in joining, you can register here.

In BatteryXPRT news, we recently started to see unusual battery life estimates and high variance when running battery life tests at the default length of 5.25 hours. We think this may be due to changes in how new OS versions are reporting battery life on certain devices, but we’re in the process of extensive testing to learn more. In the meantime, we recommend that BatteryXPRT users adjust the test run time to allow for a full rundown.

Do you have questions or comments about WebXPRT or BatteryXPRT? Let us know!

Justin

Posted in AI, Battery life, BatteryXPRT 2014 for Android, BenchmarkXPRT, BenchmarkXPRT development community, Browser-based benchmarks, Community Preview, Cross-platform benchmarks, Future of performance evaluation, Performance benchmarking, WebXPRT, WebXPRT 3 |

Glimpses of the next WebXPRT

By Justin Greene

on November 2, 2017

Development work on the next version of WebXPRT is well underway, and we think it’s a good time to offer a glimpse of what’s to come.

We’ve updated the photo-related workloads with new images and are experimenting with adding a new task to the Organize Album workload. The task utilizes ConvNetJS, a JavaScript library designed for training neural networks within the browser itself, to assign classifications to a set of images. It’s an example of the type of integrated deep learning tasks that will be showing up in all sorts of devices in the years to come.

We’re also testing an additional task in the Local Notes workload using Tesseract.js, a popular OCR (optical character recognition) engine. Our scenario uses OCR technology to scan images of purchase receipts and gather relevant information.

We’re testing these new tasks now, and will include them only once we’re confident that they produce consistent and reliable results without extending the benchmark’s runtime unnecessarily. Consequently, the next WebXPRT might contain variations of these tasks, or other new technologies altogether. We mention them now to offer some insight into the types of workload enhancements that we’re considering.

We’ve been working hard on the new WebXPRT UI as well. The image below shows the new start page from an early development build. We’re still making adjustments, so the final product will probably differ, but you do get a sense of the new UI’s clean look.

As we’ve said before, we’re committed to making sure that WebXPRT runs in most browsers and produces results that are useful for comparing web browsing performance across a wide variety of devices. We appreciate the feedback we’ve gotten so far, and are happy to receive more. Do you have ideas for the next WebXPRT? Let us know!

Justin

Posted in Browser-based benchmarks, Cross-platform benchmarks, Future of performance evaluation, Machine learning, Web-based testing, WebXPRT, WebXPRT 2017 |

Machine learning performance tool update

By Bill Catchings

on October 19, 2017

Earlier this year we started talking about our efforts to develop a tool to help in evaluating machine learning performance. We’ve given some updates since then, but we’ve also gotten some questions, so I thought I’d do my best to summarize our answers for everyone.

Some have asked what kinds of algorithms we’ve been looking into. As we said in an earlier blog, we’re looking at algorithms involved in computer vision, natural language processing, and data analytics, particularly different aspects of computer vision.

One seemingly trivial question we’ve received regards the proposed name, MLXPRT. We have been thinking of this tool as evaluating machine learning performance, but folks have raised a valid concern that it may well be broader than that. Does machine learning include deep learning? What about other artificial intelligence approaches? I’ve certainly seen other approaches lumped into machine learning, probably because machine learning is the hot topic of the moment. It feels like everything is boasting, “Now with machine learning!”

While there is some value in being part of such a hot movement, we’ve begun to wonder if a more inclusive name, such as AIXPRT, would be better. We’d love to hear your thoughts on that.

We’ve also had questions about the kind of devices the tool will run on. The short answer is that we’re concentrating on edge devices. While there is a need for server AI/ML tools, we’ve been focusing on the evaluating the devices close to the end users. As a result, we’re looking at the inference aspect of machine learning rather than the training aspect.

Probably the most frequent thing we’ve been asked about is the timetable. While we’d hoped to have something available this year, we were overly optimistic. We’re currently working on a more detailed proposal of what the tool will be, and we aim to make that available by the end of this year. If we achieve that goal, our next one will be to have a preliminary version of the tool itself ready in the first half of 2018.

As always, we seek input from folks, like yourself, who are working in these areas. What would you most like to see in an AI/machine learning performance tool? Do you have any questions?

Bill

Posted in AI, Benchmark metrics, Collaborative benchmark development, computer vision, Future of performance evaluation, Machine learning, What makes a good benchmark? |

Everything old is new again

By Eric Hale

on September 28, 2017

I recently saw an article called “4 lessons for modern software developers from 1970s mainframe programming.” This caught my eye because I started programming in the late 1970s, and my first programming environment was an IBM 370.

The author talks about how, back in the old days, you had to write tight code because memory and computing resources were limited. He also talks about the great amount of time we spent planning, writing, proofreading, and revising our code—all on paper—before running it. We did that because computing resources were expensive and you would get in trouble for using too many. He’s right about that—I got reamed out a couple of times!

At first, it seemed like this was just another article by an old programmer talking about how sloppy and lazy the new generation is, but then he made an interesting point. Programming for embedded processors reintroduces the types of resource limitations we used to have to deal with. Cloud computing reintroduces having to pay for computing resources based on usage.

I personally think he goes too far in making his point – there are a lot times when rapid prototyping and iterative development are the best way to do things. However, his main thesis has merit. Some new applications may benefit from doing things the old way.

Cloud computing and embedded processors are, of course, important in machine learning applications. As we’re working on a machine learning XPRT, we’ll be following best practices for this new environment!

Eric

Posted in BenchmarkXPRT development community, Collaborative benchmark development, Future of performance evaluation, Machine learning |

Planning the next version of HDXPRT

By Justin Greene

on July 20, 2017

A few weeks ago, we wrote about the capabilities and benefits of HDXPRT. This week, we want to share some initial ideas for the next version of HDXPRT, and invite you to send us any comments or suggestions you may have.

The first step towards a new HDXPRT will be updating the benchmark’s workloads to increase their value in the years to come. Primarily, this will involve updating application content, such as photos and videos, to more contemporary file resolutions and sizes. We think 4K-related workloads will increase the benchmark’s relevance, but aren’t sure whether 4K playback tests are necessary. What do you think?

The next step will be to update versions of the real-world trial applications included in the benchmark, including Adobe Photoshop Elements, Apple iTunes, Audacity, CyberLink MediaEspresso, and HandBrake. Are there other any applications you feel would be a good addition to HDXPRT’s editing photos, editing music, or converting videos test scenarios?

We’re also planning to update the UI to improve the look and feel of the benchmark and simplify navigation and functionality.

Last but not least, we’ll work to fix known problems, such as the hardware acceleration settings issue in MediaEspresso, and eliminate the need for workarounds when running HDXPRT on the Windows 10 Creators Update.

Do you have feedback on these ideas or suggestions for applications or test scenarios that we should consider for HDXPRT? Are there existing features we should remove? Are there elements of the UI that you find especially useful or would like to see improved? Please let us know. We want to hear from you and make sure that HDXPRT continues to meet your needs.

Justin

Posted in 4K, BenchmarkXPRT, Collaborative benchmark development, Future of performance evaluation, HDXPRT, HDXPRT capabilities, HDXPRT development process, HDXPRT release cycle, Let us know your thoughts, Performance benchmarking, What makes a good benchmark?, Windows 10 |

Category: Future of performance evaluation

The WebXPRT 3 Community Preview is here!

News about WebXPRT and BatteryXPRT

Glimpses of the next WebXPRT

Machine learning performance tool update

Everything old is new again

Planning the next version of HDXPRT

Check out the other XPRTs: