AI has the potential to impact our future lives in ways we can’t imagine, but AI has also been shaping daily life for quite some time. Natural language processing (NLP)—a form of AI—is the power behind many common technologies such as autocomplete and predictive text, translation apps, adaptive email filters, smart assistants like Siri and Alexa, and much more. We expect up-to-date, nearly instantaneous responses from those technologies, and they use serious compute power—often located in the cloud—to meet those expectations. To continue providing lightning-fast, seamless experiences, the organizations that produce and support NLP tech need cloud compute power that can thrive with growing demand—and not break the bank while doing so.

We tested the NLP performance of two types of Amazon Web Services (AWS) Elastic Compute Cloud (EC2) instances: M7i instances enabled by 4th Gen Intel Xeon Scalable processors and M7g instances enabled by AWS Graviton3 processors. We ran the tests on three different instance sizes with the RoBERTa model—using a benchmark from the Intel Model Zoo—and we optimized the model for each processor type. In our tests, the M7i instances outperformed the M7g instances across different vCPU counts, batch sizes, and precisions, processing up to 10.65 times as many sentences per second. More work per instance means more potential value, and when we factored in the per-hour price of each instance, the M7i achieved up to 8.62 times the throughput of the M7g per dollar. With the ability to do more work and potentially save money, the M7i can be a win-win solution for companies that need to consolidate operations or grow with demand.
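To make the two metrics above concrete, here is a minimal sketch of how sentences-per-second throughput and price-performance (throughput per dollar) are typically calculated. This is not the Intel Model Zoo benchmark itself; the throughput and hourly-price figures below are hypothetical placeholders, not the numbers from our tests.

```python
def sentences_per_second(total_sentences: int, elapsed_seconds: float) -> float:
    """Raw throughput: sentences the instance processed per second."""
    return total_sentences / elapsed_seconds


def throughput_per_dollar(throughput: float, hourly_price_usd: float) -> float:
    """Normalize throughput by the instance's per-hour price."""
    return throughput / hourly_price_usd


# Hypothetical example: two instances with different throughput and pricing.
instance_a = throughput_per_dollar(sentences_per_second(10_000, 20.0), 0.50)
instance_b = throughput_per_dollar(sentences_per_second(4_000, 20.0), 0.40)

# Relative price-performance: how many times the throughput per dollar
# instance A delivers compared with instance B.
relative = instance_a / instance_b  # 2.0x in this made-up example
```

Dividing raw throughput by hourly price is what lets a faster but pricier instance still come out ahead on value, as the M7i did in our tests.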

For more details about our AWS EC2 NLP performance comparison tests, check out the report below.