BenchmarkXPRT Blog banner

Tag Archives: OCR

WebXPRT 5: Starting to assemble the pieces

In our last blog post, we shared the exciting news that we’re currently working on WebXPRT 5. In that post, we described some of the ways that WebXPRT may evolve with the release of WebXPRT 5. In today’s post, we’ll revisit some of the points of emphasis from the last post and focus on potential workload changes in a bit more detail.

With any benchmark development project, there are always technical challenges you need to iron out. That is especially true with a cross-platform, browser-based benchmark like WebXPRT. Because we’re in the middle of exploring the technical feasibility of a few of the options we’ll mention, we’re not yet ready to say for certain that all these features will be available in the initial WebXPRT 5 release. We can, however, now paint a clearer picture of the overall direction we’re headed.

In the section below, you’ll find updated info on where we stand with respect to some of the key development focal points we discussed in our last post. If there’s an item from that post or previous posts that we didn’t mention below—such as updating the test harness—it doesn’t mean that we’re dropping that goal. We’re just focusing on workloads today.

One of our key goals with WebXPRT 5 is providing more AI-related workloads. In past blog posts, we’ve discussed the growing importance of local, browser-side AI. With WebXPRT 5, we’re investigating two ways that we can expand WebXPRT’s AI portfolio: 1) updating existing WebXPRT 4 AI-oriented workloads, and 2) adding all-new AI workloads.

Here are some possible ways those AI-related changes may play out in both categories:

Updating existing WebXPRT 4 AI-oriented workloads

  • Splitting the existing Organize Album using AI workload’s timed tasks—face detection and image classification—into two independent workloads.
  • Updating the face detection and image classification tasks with the latest versions of the OpenCV.js computer vision and machine learning libraries.
  • Updating the Caffe deep learning framework for the face detection task.
  • Updating the ONNX-based SqueezeNet machine learning model for the image classification tasks.
  • Updating the version of the Tesseract.js OCR engine that WebXPRT uses in the Encrypt Notes and OCR Scan workload. 

Potentially adding all-new AI workloads (either core or experimental workloads)

  • We’re exploring the idea of including a workload that uses an AI-powered segmentation model to blur the background of a video call.
  • We’re exploring the feasibility of including a local LLM chat workload.
  • We would eventually like to include a WebGPU-based web AI framework for a computer vision workload.

In addition to the goal of adding more AI, we previously discussed the possibility of adding non-AI WebGPU workloads. As a web API, WebGPU enables web-based applications—such as image-based GenAI and inference workloads—to directly access the graphics rendering and computational capabilities of a system’s GPU. In the future, WebXPRT 5 could use that technology to execute complex 3D rendering workloads.

We hope today’s post gives you a better sense of where WebXPRT 5 may be headed. We want to reemphasize that while we are actively investigating the possible changes mentioned above, nothing is set in stone. As the pieces start to fall into place, we’ll provide more information here in the blog.

If you have any questions or comments about WebXPRT 5, please feel free to contact us!

Justin

Browser-based AI tests in WebXPRT 4: optical character recognition

In our previous blog post, we discussed the rapidly expanding influence of AI-enhanced technologies in areas like everyday browser activity—and the growing need for objective performance data that can help us understand how well new consumer devices will handle AI tasks. We noted that WebXPRT 4 already includes timed AI tasks in two of its workloads—the “Organize Album using AI” and “Encrypt Notes and OCR Scan”—and we provided some technical details for the Organize Album workload. In today’s post, we’ll focus on the Encrypt Notes workload.

The Encrypt Notes workload includes two separate scenarios that reflect common web-based productivity app tasks. The first scenario syncs a set of encrypted notes, and the second scenario uses AI-based optical character recognition (OCR) to scan a receipt, extract data, and then add that data to an expense report.

Here are the details for each scenario:

  • The encrypt notes scenario downloads a set of notes, encrypts that data, temporarily stores it in the browser’s localStorage object (the localStorageDB.js database layer), and then decrypts and renders it for display. This scenario measures HTML5 Local Storage, JavaScript, AES encryption, and WebAssembly (Wasm) performance. 
  • The OCR scan scenario uses a Wasm-based version of Tesseract.js (tesseract-core.wasm.js v2.20) to scan an expense receipt. Tesseract.js is a JavaScript port of the Tesseract OCR engine—a popular open-source C/C++ library that extracts text from images and PDFs. The scenario then adds the receipt to an expense report. This scenario measures HTML5 Local Storage, JavaScript, and Wasm performance. 

We mention this test under the AI umbrella in part because people sometimes use the term “OCR” to refer to a spectrum of AI and non-AI technologies. In this case, though, the specifics make this workload clearly have an AI component. The Wasm-based Tesseract library that we use in WebXPRT 4 is based on a version of C/C++ (v4.x) that uses Long Short-Term Memory (LSTM). LSTM is a type of recurrent neural network (RNN) that is particularly well-suited for processing and predicting sequential data. As such, it is clearly an AI component of the Encrypt Notes and OCR Scan workload.

To produce a score for each iteration of the workload, WebXPRT calculates the total time that it takes for a system to sync (encrypt, decrypt, and render) the notes, use OCR to scan the receipt, and add the scanned data to an expense report. In a standard test, WebXPRT runs seven iterations of the entire six-workload performance suite before calculating an overall test score. You can find out more about the WebXPRT results calculation process here.

Along with our post on the Organize Album workload, we hope this information provides a deeper understanding of WebXPRT 4’s AI-equipped workloads. As we mentioned last time, if you want to explore the structure of these workloads in more detail, you can check out previous blog posts for information about how to access and use the WebXPRT 4 source code for free. You can also read more about WebXPRT’s overall structure and other workloads in the Exploring WebXPRT 4 white paper.

If you have any questions about WebXPRT 4, please let us know!

Justin

Up next for WebXPRT 4: A new AI-focused workload!

We’re always thinking about ways to improve WebXPRT. In the past, we’ve discussed the potential benefits of auxiliary workloads and the role that such workloads might play in future WebXPRT updates and versions. Today, we’re very excited to announce that we’ve decided to move forward with the development of a new WebXPRT 4 workload focused on browser-side AI technology!

WebXPRT 4 already includes timed AI tasks in two of its workloads: the Organize Album using AI workload and the Encrypt Notes and OCR Scan workload. These two workloads reflect the types of light browser-side inference tasks that have been available for a while now, but most heavy-duty inference on the web has historically happened in on-prem servers or in the cloud. Now, localized AI technology is growing by leaps and bounds, and the integration of new AI capabilities with browser-based tasks is on the threshold of advancing rapidly.

Because of this growth, we believe now is the time to start work on giving WebXPRT 4 the ability to evaluate new browser-based AI capabilities—capabilities that are likely to become a part of everyday life in the next few years. We haven’t yet decided on a test scenario or software stack for the new workload, but we’ll be working to refine our plan in the coming months. There seems to be some initial promise in emerging frameworks such as ONNX Runtime Web, which allows users to run and deploy web-based machine learning models by using JavaScript APIs and libraries. In addition, new Web APIs like WebGPU (currently supported in Edge, Chrome, and tech preview in Safari) and WebNN (in development) may soon help facilitate new browser-side AI workloads.

We know that many longtime WebXPRT 4 users will have questions about how this new workload may affect their tests. We want to assure you that the workload will be an optional bonus workload and will not run by default during normal WebXPRT 4 tests. As you consider possibilities for the new workload, here are a few points to keep in mind:

  • The workload will be optional for users to run.
  • It will not affect the main WebXPRT 4 subtest or overall scores in any way.
  • It will run separately from the main test and will produce its own score(s).
  • Current and future WebXPRT 4 results will still be comparable to one another, so users who’ve already built a database of WebXPRT 4 scores will not have to retest their devices.
  • Because many of the available frameworks don’t currently run on all browsers, the workload may not run on every platform.

As we research available technologies and explore our options, we would love to hear from you. If you have ideas for an AI workload scenario that you think would be useful or thoughts on how we should implement it, please let us know! We’re excited about adding new technologies and new value to WebXPRT 4, and we look forward to sharing more information here in the blog as we make progress.

Justin

An update on the issue with WebXPRT 4 in iOS 17

Recently, we informed XPRT blog readers that after updating Apple iPhones and iPads to iOS and iPadOS 17, respectively, we began to see WebXPRT 4 failures on those devices. In the Safari and Google Chrome browsers, WebXPRT 4 test runs were freezing while running the Encrypt Notes and OCR Scan workload. We were able to replicate the issue on every iOS/iPadOS 17 device we tested, and we also confirmed that WebXPRT 4 continues to run without issues on other non-iOS platforms.

Our team has been investigating the situation, and we’ve made some progress. It’s clear that the failed test runs are getting stuck when the WASM-based Tesseract.js Optical Character Recognition (OCR) engine attempts to scan a shopping receipt. During our research, we’ve discovered an issue when the current Tesseract.js engine runs on iOS 17. This issue is broader than WebXPRT 4, and the Tesseract team is aware of the problem. Future versions of iOS 17 or later versions of Tesseract.js may include fixes for the problem, but unfortunately, we don’t know whether or when a fix will be available.

We’re currently investigating possible workarounds for the problem, and hope to be able to start testing soon. Our goal is that any solution we implement will not significantly affect existing WebXPRT 4 scores on non-iOS 17 platforms.

We will continue to share any substantive progress updates with readers here in the blog. Once again, we apologize for any inconvenience this issue causes for WebXPRT 4 users, and we appreciate your patience while we work toward a solution. If you have any questions or comments, please feel free to contact us!

Justin

WebXPRT: What would you like to see?

At over 412,000 runs and counting, WebXPRT is our most popular benchmark. From the first release in 2013, it’s been popular with device manufacturers, developers, tech journalists, and consumers because it’s easy to run, it runs on almost anything with a web browser, and it evaluates device performance using the types of web-based tasks that people are likely to encounter on a daily basis.

With each new version of WebXPRT, we analyze browser development trends to make sure the test’s underlying web technologies and workload scenarios adequately reflect the ways people are using their browsers to work and play. BenchmarkXPRT Development Community members can play an important part in that process by sending us feedback on existing tests and suggestions for new workloads to include.

For example, when we released WebXPRT 3, we updated the photo workloads with new images and a deep learning task used for image classification. We also added an optical character recognition task in the Encrypt Notes and OCR scan workload, and combined part of the DNA Sequence Analysis scenario with a writing sample/spell check scenario to simulate online homework in an all-new Online Homework workload.

Consider for a moment what an ideal future version of WebXPRT would look like for you. Are there new web technologies or workload scenarios that you would like to see? Would you be interested in an associated battery life test? Should we include experimental tests? We’re interested in what you have to say, so please feel free to contact us with your thoughts or questions.

If you’re just now learning about WebXPRT, we offer several resources to help you better understand the benchmark and its range of uses. For a general overview of why WebXPRT matters, watch our video titled What is WebXPRT and why should I care? To read more about the details of the benchmark’s development and structure, check out the Exploring WebXPRT 3 white paper. To see WebXPRT 2015 and WebXPRT 3 scores from a wide range of processors, visit the WebXPRT 3 Processor Comparison Chart.

We look forward to hearing from you!

Justin

Check out the other XPRTs:

Forgot your password?