Helping hands

We ran into a problem last week with HDXPRT 2011: the installer would fail on some systems. One of the biggest challenges for application-based benchmarks like HDXPRT 2011 is dealing with the applications already on the system. Even more difficult to account for are the many DLLs, drivers, and Registry settings that can collide between applications and between different versions of the same application.

After a lot of effort, we found the problem was indeed a conflict between some of the pre-installed software on the system and the HDXPRT 2011 installer. We were able to narrow down which applications caused the problem and posted instructions on the site for working around the issues. (For more details, log into the forum and then see http://www.hdxprt.com/forum/showthread.php?18-Troubleshooting-Installation-problems-on-Dell-Latitude-notebooks. You won’t be able to read that message if you’re not logged in.)

My hope is that if you run into issues with HDXPRT 2011, you’ll share them. And, share the workarounds you find as well! So, please let us know any tips, tricks, or issues you find with the benchmark by sending email to hdxprtsupport@hdxprt.com. The more we work together, the better we can make both HDXPRT 2011 and the future versions. Thanks!

Next week, we’ll return to looking at the results HDXPRT 2011 provides.

Bill

Comment on this post in the forums

Keeping score

One question I received as a result of the last two blog entries on benchmark anatomy was whether I was going to talk about the results or scores.  That topic seemed like a natural follow-up.

All benchmarks need to provide some sort of metric to let you know how well the system under test (SUT) did.  I think the best metrics are the easily understood ones.  These metrics have units like time or watts.  The problem with some of these units is that sometimes smaller can be better.  For example, less time to complete a task is better.  (Of course, more time before the battery runs down is better!)  People generally see bigger bars in a chart as better.

Some tests, however, give units that are not so understandable.  Units like instructions per second, requests per second, or frames per second are tougher to relate to.  Sure, more of any of them per second would be better, but it is not as easy to understand what that means in the real world.

There is a solution to both the problem of smaller is better and non-intuitive units—normalization.  With normalization, you take the result of the SUT and divide it by that of a defined base or calibration system.  The result is a unit-less number.  So, if the base system can do 100 blips a second and the SUT can do 143 blips a second, the SUT would get 143 / 100 or a score of 1.43.  The units cancel out in the math and what is left is a score.  For appearance or convenience, the score may be multiplied by some number like 10 or 100 to make the SUT’s score 14.3 or 143.
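The arithmetic above is simple enough to show in a few lines. Here is a small sketch of it in Python; the function name and values are illustrative, not from HDXPRT itself:

```python
def normalized_score(sut_result, base_result, scale=1):
    """Divide the system under test's result by the base system's result,
    yielding a unit-less score; multiply by a scale factor for appearance."""
    return scale * sut_result / base_result

# Base system: 100 blips per second; SUT: 143 blips per second.
print(normalized_score(143, 100))       # 1.43
print(normalized_score(143, 100, 100))  # 143.0
```

Because the two results share the same units, those units cancel in the division, and the scale factor only changes how the number looks, not what it means.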

The nice thing about such scores is that it is easy to see how much faster one system is than another.  If you are measuring normalized execution time, a score of 286 means a system is twice as fast as one of 143.  As a bonus, bigger numbers are better.  An added benefit is that it is much easier to combine multiple normalized results into a single score.  These benefits are the reason that many modern benchmarks use normalized scores.
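As one illustration of combining normalized results into a single score (a common approach for benchmarks in general, not necessarily HDXPRT 2011's exact method), the geometric mean multiplies the unit-less scores together and takes the nth root:

```python
import math

def geometric_mean(scores):
    """Combine unit-less normalized scores into one overall score.

    The geometric mean is a common choice for normalized results because
    doubling the score on any single workload moves the overall score by
    the same factor, regardless of that workload's magnitude.
    """
    return math.prod(scores) ** (1 / len(scores))

# Three hypothetical normalized workload scores:
print(geometric_mean([143, 286, 200]))
```

More on the different kinds of means in a future post, as promised below.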

There is another kind of score, which is more of a rating.  These scores, such as a number of stars or thumbs up, are good for relative ratings.  However, they are not necessarily linear.  Four thumbs up is better than two, but is not necessarily twice as good.

Next week, we’ll look closer at the results HDXPRT 2011 provides and maybe even venture into the difference between arithmetic, geometric, and harmonic means!  (I know I can’t wait.)

Bill

Comment on this post in the forums

Anatomy of a benchmark, part II

As we discussed last week, benchmarks (including HDXPRT 2011) are made up of a set of common major components. Last week’s components included the Installer, User Interface (UI), and Results Viewer.  This week, we’ll look more at the guts of a benchmark—the parts that actually do the performance testing.

Once the UI gets the necessary commands and parameters from the user, the Test Harness takes over.  This part is the logic that runs the individual Tests or Workloads using the parameters you specified.  For application-based benchmarks, the harness is particularly critical, because it has to deal with running real applications.  (Simpler benchmarks may mix the harness and test code in a single program.)

The next component consists of the Tests or Workloads themselves.  Some folks use those terms interchangeably, but I try to avoid that practice.  I tend to think of tests as specially crafted code designed to gauge some aspect of a system’s performance, while workloads consist of a set of actions that an application must take as well as the necessary data for those actions.  In HDXPRT 2011, each workload is a set of data (such as photos) and actions (e.g., manipulations of those photos) that an application (e.g., Photoshop Elements) performs.  Application-based benchmarks, such as HDXPRT 2011, typically use some other program or technology to pass commands to the applications.  HDXPRT uses a combination of AutoIt and C code to drive the applications.

When the Harness finishes running the tests or workloads, it collects the results.  It then passes those results either to the Results Viewer or writes them to a file for viewing in Excel or some other program.
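The flow those paragraphs describe can be sketched in a few lines. This is a simplified illustration with hypothetical names; HDXPRT 2011's real harness, which scripts full applications, is far more involved:

```python
import json
import time

def run_benchmark(workloads, results_file="results.json"):
    """Run each workload, time it, and collect the results.

    `workloads` maps a workload name to a callable that drives the
    application; a real harness would launch and script actual programs.
    """
    results = {}
    for name, run_workload in workloads.items():
        start = time.perf_counter()
        run_workload()
        results[name] = time.perf_counter() - start  # elapsed seconds
    # Hand off the results: here we simply write them to a file
    # that a results viewer (or Excel) could read later.
    with open(results_file, "w") as f:
        json.dump(results, f, indent=2)
    return results
```

Calling `run_benchmark({"photo_edit": some_function})` would produce one timing per workload and leave them in `results.json`.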

As we look to improve HDXPRT for next year, what improvements would you like to see in each of those areas?

Bill

Comment on this post in the forums

Anatomy of a benchmark, part I

Over many years of dealing with benchmarks, I’ve found that there are a few major components that HDXPRT 2011 and most others include.  Some of these components are not what you might think of as part of a benchmark, but they are essential to making one both easy to use and capable of producing reproducible results.  We’ll look at those parts this week and the rest next week.

The first piece that you encounter when you use a benchmark is its Installation program.  Simple benchmarks may forgo an installation component and just let you copy the files, including any executables, into a directory.  By contrast, HDXPRT 2011, like other application-based benchmarks, takes great pains to install the necessary applications. It even has to check to see which of them are already installed on the computer under test and cope with those it finds.

Once the benchmark is on the system, you launch it and encounter the User Interface (UI).  For some benchmarks, the UI may be only a command-line interface with a set of switches or options. HDXPRT 2011, in keeping with its emphasis on an HD user experience, includes a graphical UI that lets you run its tests.

Many benchmarks, including HDXPRT 2011, provide a Results Viewer that makes it easy for you to look at your results and compare them to others.  Results viewers range from fairly simple to quite sophisticated.  The prevalence of spreadsheet applications and XML has let benchmark creators minimize the development cost of this component.

Next week, I’ll look at the components that handle the actual tests that make up the benchmark.

Bill

Comment on this post in the forums

Always wanting to know more

I’m an engineer (computer science) by training, and as a consequence I’m always after more data.  More data means better understanding, which leads to better decision making.  We acquired a lot of data in the course of finishing our white paper on the characteristics of HDXPRT 2011.  Now, of course, I want even more.

The biggest area that I want to understand better is the graphics subsystem.  Our testing showed processor-integrated graphics outperforming discrete graphics cards.  That was not what I expected.  There seem to be two likely explanations.  The first is that since the workload of HDXPRT 2011 does not include 3D, discrete graphics cards are not that helpful to the benchmark’s applications.  Certainly, 3D performance plays more to the traditional strengths of discrete graphics cards.  The second likely explanation is that the integrated graphics on the second-generation Intel Core processors we used perform well.  A number of performance Web sites have noted the same thing since the debut of those processors.

The answer is probably a combination of the two.

To satisfy my data desires, we’re going to look further. We’ll start by testing on some older processors as well as some different graphics cards.  We’ll share our findings with you.

Please let us know any other characteristics of HDXPRT 2011 that you’d like us to explore in more depth.  I can’t guarantee we’ll be able to look at everything, but I know I always want to know more!

Bill

Comment on this post in the forums

Sneak peek at the HDXPRT 2011 results white paper

After spending weeks testing different configurations with HDXPRT 2011, we are putting the final touches on a white paper detailing the results. I thought I’d give you a sneak peek at some of the things the tests revealed about the characteristics of HDXPRT 2011.

As I explained last week, trying to understand the characteristics of a benchmark requires careful testing while changing one component at a time. To do that, we ran the tests on a single system using an Intel DH67BL motherboard. We changed processors (both type and speed), the amount of RAM, the type of storage (hard disk and SSD), and the graphics subsystem, as well as a few other variables.

Here are a few of the things we found:

  • Processor speed – On an Intel Core i3, increasing the processor speed (GHz) 6.5% resulted in a 4.4% increase in the HDXPRT overall score. On an Intel Core i5, increasing the processor speed (GHz) 17.9% resulted in an 8.1% increase in the HDXPRT overall score. Generally, that means processor speed matters, but performance scales somewhat less than linearly with the raw gigahertz.
  • Memory – Increasing from 2 GB to 4 GB increased the overall score 10.7% on an Intel Core i5 and 15.8% on an Intel Core i7. However, increasing from 4 GB to 8 GB increased the score less than 2% on both processors. These results match my personal experience pretty well: going to 4 GB is important for media-rich applications, but going to 8 GB is less so.
  • Disk drive – Switching from a hard disk to an SSD increased the overall score about 1%. While I would certainly prefer an SSD to a hard disk, this shows that, for HDXPRT 2011, disk performance has only a small influence on the results.
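One rough way to quantify how much less than linearly the score scales with clock speed is the ratio of score gain to clock gain. This is a simple illustrative calculation using the numbers above, not an official HDXPRT metric:

```python
def scaling_efficiency(score_gain_pct, clock_gain_pct):
    """Fraction of a clock-speed increase that shows up as score increase."""
    return score_gain_pct / clock_gain_pct

# Using the measurements listed above:
print(f"Core i3: {scaling_efficiency(4.4, 6.5):.2f}")   # Core i3: 0.68
print(f"Core i5: {scaling_efficiency(8.1, 17.9):.2f}")  # Core i5: 0.45
```

By this crude measure, roughly half to two-thirds of a clock-speed bump reaches the overall score, with diminishing returns at the larger increase.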

Many more details will be in the white paper we will publish in the next few days. Please be on the lookout for it and let us know what you think of the results and what they say about the characteristics of HDXPRT 2011.

We plan to conduct a Webinar in the near future to discuss the HDXPRT 2011 results white paper and to answer general questions. I hope to see you there!

Bill

Comment on this post in the forums
