BenchmarkXPRT Blog banner

Tag Archives: Tesseract

Browser-based AI tests in WebXPRT 4: optical character recognition

In our previous blog post, we discussed the rapidly expanding influence of AI-enhanced technologies in areas like everyday browser activity—and the growing need for objective performance data that can help us understand how well new consumer devices will handle AI tasks. We noted that WebXPRT 4 already includes timed AI tasks in two of its workloads—the “Organize Album using AI” and “Encrypt Notes and OCR Scan”—and we provided some technical details for the Organize Album workload. In today’s post, we’ll focus on the Encrypt Notes workload.

The Encrypt Notes workload includes two separate scenarios that reflect common web-based productivity app tasks. The first scenario syncs a set of encrypted notes, and the second scenario uses AI-based optical character recognition (OCR) to scan a receipt, extract data, and then add that data to an expense report.

Here are the details for each scenario:

  • The encrypt notes scenario downloads a set of notes, encrypts that data, temporarily stores it in the browser’s localStorage object (the localStorageDB.js database layer), and then decrypts and renders it for display. This scenario measures HTML5 Local Storage, JavaScript, AES encryption, and WebAssembly (Wasm) performance. 
  • The OCR scan scenario uses a Wasm-based version of Tesseract.js (tesseract-core.wasm.js v2.20) to scan an expense receipt. Tesseract.js is a JavaScript port of the Tesseract OCR engine—a popular open-source C/C++ library that extracts text from images and PDFs. The scenario then adds the receipt to an expense report. This scenario measures HTML5 Local Storage, JavaScript, and Wasm performance. 

We mention this test under the AI umbrella in part because people sometimes use the term “OCR” to refer to a spectrum of AI and non-AI technologies. In this case, though, the specifics make this workload clearly have an AI component. The Wasm-based Tesseract library that we use in WebXPRT 4 is based on a version of C/C++ (v4.x) that uses Long Short-Term Memory (LSTM). LSTM is a type of recurrent neural network (RNN) that is particularly well-suited for processing and predicting sequential data. As such, it is clearly an AI component of the Encrypt Notes and OCR Scan workload.

To produce a score for each iteration of the workload, WebXPRT calculates the total time that it takes for a system to sync (encrypt, decrypt, and render) the notes, use OCR to scan the receipt, and add the scanned data to an expense report. In a standard test, WebXPRT runs seven iterations of the entire six-workload performance suite before calculating an overall test score. You can find out more about the WebXPRT results calculation process here.

Along with our post on the Organize Album workload, we hope this information provides a deeper understanding of WebXPRT 4’s AI-equipped workloads. As we mentioned last time, if you want to explore the structure of these workloads in more detail, you can check out previous blog posts for information about how to access and use the WebXPRT 4 source code for free. You can also read more about WebXPRT’s overall structure and other workloads in the Exploring WebXPRT 4 white paper.

If you have any questions about WebXPRT 4, please let us know!

Justin

Making progress with WebXPRT 4 in iOS 17

In recent blog posts, we discussed an issue that we encountered when attempting to run WebXPRT 4 on iOS 17 devices. If you missed those posts, you can find more details about the nature of the problem here. In short, the issue is that the Encrypt Notes and OCR scan subtest in WebXPRT 4 gets stuck when the Tesseract.js Optical Character Recognition (OCR) engine attempts to scan a shopping receipt. We’ve verified that the issue occurs on devices running iOS 17, iPadOS 17, and macOS Sonoma with Safari 17.

After a good bit of troubleshooting and research to try and identify the cause of the problem, we decided to build an updated version of WebXPRT 4 that uses a newer version of Tesseract for the OCR task. Aside from updating Tesseract in the new build, we aimed to change as little as possible. To try and maximize continuity, we’re still using the original input image for the receipt scanning task, and we decided to stick with using the WASM library instead of a WASM-SIMD library. Aside from a new version of tesseract.js, WebXPRT 4 version number updates, and updated documentation where necessary, all other aspects of WebXPRT 4 will remain the same.

We’re currently testing a candidate build of this new version on a wide array of devices. The results so far seem promising, but we want to complete our due diligence and make sure this is the best approach to solving the problem. We know that OEM labs and tech reviewers put a lot of time and effort into compiling databases of results, so we hope to provide a solution that minimizes results disruption and inconvenience for WebXPRT 4 users. Ideally, folks would be able to integrate scores from the new build without any questions or confusion about comparability.

We don’t yet have an exact release date for a new WebXPRT 4 build, but we can say that we’re shooting for the end of October. We appreciate everyone’s patience as we work towards the best possible solution. If you have any questions or concerns about an updated version of WebXPRT 4, please let us know.

Justin

An update on the issue with WebXPRT 4 in iOS 17

Recently, we informed XPRT blog readers that after updating Apple iPhones and iPads to iOS and iPadOS 17, respectively, we began to see WebXPRT 4 failures on those devices. In the Safari and Google Chrome browsers, WebXPRT 4 test runs were freezing while running the Encrypt Notes and OCR Scan workload. We were able to replicate the issue on every iOS/iPadOS 17 device we tested, and we also confirmed that WebXPRT 4 continues to run without issues on other non-iOS platforms.

Our team has been investigating the situation, and we’ve made some progress. It’s clear that the failed test runs are getting stuck when the WASM-based Tesseract.js Optical Character Recognition (OCR) engine attempts to scan a shopping receipt. During our research, we’ve discovered an issue when the current Tesseract.js engine runs on iOS 17. This issue is broader than WebXPRT 4, and the Tesseract team is aware of the problem. Future versions of iOS 17 or later versions of Tesseract.js may include fixes for the problem, but unfortunately, we don’t know whether or when a fix will be available.

We’re currently investigating possible workarounds for the problem, and hope to be able to start testing soon. Our goal is that any solution we implement will not significantly affect existing WebXPRT 4 scores on non-iOS 17 platforms.

We will continue to share any substantive progress updates with readers here in the blog. Once again, we apologize for any inconvenience this issue causes for WebXPRT 4 users, and we appreciate your patience while we work toward a solution. If you have any questions or comments, please feel free to contact us!

Justin

Check out the other XPRTs:

Forgot your password?