BenchmarkXPRT Blog banner

Tag Archives: Tesseract

WebXPRT 5: Starting to assemble the pieces

In our last blog post, we shared the exciting news that we’re currently working on WebXPRT 5. In that post, we described some of the ways that WebXPRT may evolve with the release of WebXPRT 5. In today’s post, we’ll revisit some of the points of emphasis from the last post and focus on potential workload changes in a bit more detail.

With any benchmark development project, there are always technical challenges you need to iron out. That is especially true with a cross-platform, browser-based benchmark like WebXPRT. Because we’re in the middle of exploring the technical feasibility of a few of the options we’ll mention, we’re not yet ready to say for certain that all these features will be available in the initial WebXPRT 5 release. We can, however, now paint a clearer picture of the overall direction we’re headed.

In the section below, you’ll find updated info on where we stand with respect to some of the key development focal points we discussed in our last post. If there’s an item from that post or previous posts that we didn’t mention below—such as updating the test harness—it doesn’t mean that we’re dropping that goal. We’re just focusing on workloads today.

One of our key goals with WebXPRT 5 is providing more AI-related workloads. In past blog posts, we’ve discussed the growing importance of local, browser-side AI. With WebXPRT 5, we’re investigating two ways that we can expand WebXPRT’s AI portfolio: 1) updating existing WebXPRT 4 AI-oriented workloads, and 2) adding all-new AI workloads.

Here are some possible ways those AI-related changes may play out in both categories:

Updating existing WebXPRT 4 AI-oriented workloads

  • Splitting the existing Organize Album using AI workload’s timed tasks—face detection and image classification—into two independent workloads.
  • Updating the face detection and image classification tasks with the latest versions of the OpenCV.js computer vision and machine learning libraries.
  • Updating the Caffe deep learning framework for the face detection task.
  • Updating the ONNX-based SqueezeNet machine learning model for the image classification tasks.
  • Updating the version of the Tesseract.js OCR engine that WebXPRT uses in the Encrypt Notes and OCR Scan workload. 

Potentially adding all-new AI workloads (either core or experimental workloads)

  • We’re exploring the idea of including a workload that uses an AI-powered segmentation model to blur the background of a video call.
  • We’re exploring the feasibility of including a local LLM chat workload.
  • We would eventually like to include a WebGPU-based web AI framework for a computer vision workload.

In addition to the goal of adding more AI, we previously discussed the possibility of adding non-AI WebGPU workloads. As a web API, WebGPU enables web-based applications—such as image-based GenAI and inference workloads—to directly access the graphics rendering and computational capabilities of a system’s GPU. In the future, WebXPRT 5 could use that technology to execute complex 3D rendering workloads.

We hope today’s post gives you a better sense of where WebXPRT 5 may be headed. We want to reemphasize that while we are actively investigating the possible changes mentioned above, nothing is set in stone. As the pieces start to fall into place, we’ll provide more information here in the blog.

If you have any questions or comments about WebXPRT 5, please feel free to contact us!

Justin

Browser-based AI tests in WebXPRT 4: optical character recognition

In our previous blog post, we discussed the rapidly expanding influence of AI-enhanced technologies in areas like everyday browser activity—and the growing need for objective performance data that can help us understand how well new consumer devices will handle AI tasks. We noted that WebXPRT 4 already includes timed AI tasks in two of its workloads—the “Organize Album using AI” and “Encrypt Notes and OCR Scan”—and we provided some technical details for the Organize Album workload. In today’s post, we’ll focus on the Encrypt Notes workload.

The Encrypt Notes workload includes two separate scenarios that reflect common web-based productivity app tasks. The first scenario syncs a set of encrypted notes, and the second scenario uses AI-based optical character recognition (OCR) to scan a receipt, extract data, and then add that data to an expense report.

Here are the details for each scenario:

  • The encrypt notes scenario downloads a set of notes, encrypts that data, temporarily stores it in the browser’s localStorage object (the localStorageDB.js database layer), and then decrypts and renders it for display. This scenario measures HTML5 Local Storage, JavaScript, AES encryption, and WebAssembly (Wasm) performance. 
  • The OCR scan scenario uses a Wasm-based version of Tesseract.js (tesseract-core.wasm.js v2.20) to scan an expense receipt. Tesseract.js is a JavaScript port of the Tesseract OCR engine—a popular open-source C/C++ library that extracts text from images and PDFs. The scenario then adds the receipt to an expense report. This scenario measures HTML5 Local Storage, JavaScript, and Wasm performance. 

We mention this test under the AI umbrella in part because people sometimes use the term “OCR” to refer to a spectrum of AI and non-AI technologies. In this case, though, the specifics make this workload clearly have an AI component. The Wasm-based Tesseract library that we use in WebXPRT 4 is based on a version of C/C++ (v4.x) that uses Long Short-Term Memory (LSTM). LSTM is a type of recurrent neural network (RNN) that is particularly well-suited for processing and predicting sequential data. As such, it is clearly an AI component of the Encrypt Notes and OCR Scan workload.

To produce a score for each iteration of the workload, WebXPRT calculates the total time that it takes for a system to sync (encrypt, decrypt, and render) the notes, use OCR to scan the receipt, and add the scanned data to an expense report. In a standard test, WebXPRT runs seven iterations of the entire six-workload performance suite before calculating an overall test score. You can find out more about the WebXPRT results calculation process here.

Along with our post on the Organize Album workload, we hope this information provides a deeper understanding of WebXPRT 4’s AI-equipped workloads. As we mentioned last time, if you want to explore the structure of these workloads in more detail, you can check out previous blog posts for information about how to access and use the WebXPRT 4 source code for free. You can also read more about WebXPRT’s overall structure and other workloads in the Exploring WebXPRT 4 white paper.

If you have any questions about WebXPRT 4, please let us know!

Justin

Making progress with WebXPRT 4 in iOS 17

In recent blog posts, we discussed an issue that we encountered when attempting to run WebXPRT 4 on iOS 17 devices. If you missed those posts, you can find more details about the nature of the problem here. In short, the issue is that the Encrypt Notes and OCR scan subtest in WebXPRT 4 gets stuck when the Tesseract.js Optical Character Recognition (OCR) engine attempts to scan a shopping receipt. We’ve verified that the issue occurs on devices running iOS 17, iPadOS 17, and macOS Sonoma with Safari 17.

After a good bit of troubleshooting and research to try and identify the cause of the problem, we decided to build an updated version of WebXPRT 4 that uses a newer version of Tesseract for the OCR task. Aside from updating Tesseract in the new build, we aimed to change as little as possible. To try and maximize continuity, we’re still using the original input image for the receipt scanning task, and we decided to stick with using the WASM library instead of a WASM-SIMD library. Aside from a new version of tesseract.js, WebXPRT 4 version number updates, and updated documentation where necessary, all other aspects of WebXPRT 4 will remain the same.

We’re currently testing a candidate build of this new version on a wide array of devices. The results so far seem promising, but we want to complete our due diligence and make sure this is the best approach to solving the problem. We know that OEM labs and tech reviewers put a lot of time and effort into compiling databases of results, so we hope to provide a solution that minimizes results disruption and inconvenience for WebXPRT 4 users. Ideally, folks would be able to integrate scores from the new build without any questions or confusion about comparability.

We don’t yet have an exact release date for a new WebXPRT 4 build, but we can say that we’re shooting for the end of October. We appreciate everyone’s patience as we work towards the best possible solution. If you have any questions or concerns about an updated version of WebXPRT 4, please let us know.

Justin

An update on the issue with WebXPRT 4 in iOS 17

Recently, we informed XPRT blog readers that after updating Apple iPhones and iPads to iOS and iPadOS 17, respectively, we began to see WebXPRT 4 failures on those devices. In the Safari and Google Chrome browsers, WebXPRT 4 test runs were freezing while running the Encrypt Notes and OCR Scan workload. We were able to replicate the issue on every iOS/iPadOS 17 device we tested, and we also confirmed that WebXPRT 4 continues to run without issues on other non-iOS platforms.

Our team has been investigating the situation, and we’ve made some progress. It’s clear that the failed test runs are getting stuck when the WASM-based Tesseract.js Optical Character Recognition (OCR) engine attempts to scan a shopping receipt. During our research, we’ve discovered an issue when the current Tesseract.js engine runs on iOS 17. This issue is broader than WebXPRT 4, and the Tesseract team is aware of the problem. Future versions of iOS 17 or later versions of Tesseract.js may include fixes for the problem, but unfortunately, we don’t know whether or when a fix will be available.

We’re currently investigating possible workarounds for the problem, and hope to be able to start testing soon. Our goal is that any solution we implement will not significantly affect existing WebXPRT 4 scores on non-iOS 17 platforms.

We will continue to share any substantive progress updates with readers here in the blog. Once again, we apologize for any inconvenience this issue causes for WebXPRT 4 users, and we appreciate your patience while we work toward a solution. If you have any questions or comments, please feel free to contact us!

Justin

Check out the other XPRTs:

Forgot your password?