Blog / June 10, 2025

Moveworks agentic Reasoning Engine gets a comprehensive performance upgrade, powered by GPT-4.1

Ana Tyler, Principal Product Manager

Andrew Mairena, Group Product Manager


At Moveworks, we're always seeking ways to enhance our AI Assistant by evaluating and integrating the latest models as they launch.

That’s why we are excited to announce that the Moveworks AI Assistant now uses GPT-4.1 as part of its agentic Reasoning Engine, allowing it to better serve your employees.

As a reminder, the Moveworks Reasoning Engine is what powers the Moveworks AI Assistant, enabling it to understand employees, develop complex plans to satisfy their requests, and execute those plans across a variety of business systems. Simply put, it’s what allows Moveworks to be the one-stop shop for search, automation, and productivity use cases across domains such as IT, HR, finance, facilities, GTM, engineering, and more. This Reasoning Engine is powered by a multitude of models, some open source and some proprietary, each designed to handle specific tasks as part of the overall architecture.

Over the years, Moveworks has built a reputation for moving rapidly from model release to full enterprise deployment within our architecture, as shown by past rollouts both before and after the release of GPT-3.5-powered ChatGPT. The official release dates below refer to when OpenAI announced each model; our actual integration is gated by Azure cross-region availability, which typically follows a few months later.

The table below highlights how Moveworks rapidly operationalizes major model upgrades from our providers, particularly OpenAI:

Model	OpenAI Release Date	Incorporated into Moveworks
GPT-3.5	November 2022	December 2022
GPT-4	March 2023	October 2023
GPT-4 Turbo	November 2023	March 2024
GPT-4o (2024-05-13)	May 2024	August 2024
GPT-4o (2024-08-06)	August 2024	November 2024
GPT-4.1	April 2025	June 2025
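
The gap between each release and its incorporation can be read straight off the table. The sketch below simply does that arithmetic in Python. The dates are copied from the table, each month is represented by its first day as a simplifying assumption, and `months_between` is a helper name introduced here for illustration.

```python
from datetime import date

# (model, OpenAI release month, Moveworks incorporation month), copied from the table.
# Assumption: each month is represented by its first day.
ROLLOUTS = [
    ("GPT-3.5", date(2022, 11, 1), date(2022, 12, 1)),
    ("GPT-4", date(2023, 3, 1), date(2023, 10, 1)),
    ("GPT-4 Turbo", date(2023, 11, 1), date(2024, 3, 1)),
    ("GPT-4o (2024-05-13)", date(2024, 5, 1), date(2024, 8, 1)),
    ("GPT-4o (2024-08-06)", date(2024, 8, 1), date(2024, 11, 1)),
    ("GPT-4.1", date(2025, 4, 1), date(2025, 6, 1)),
]


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end (the day of month is ignored)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


# Lag in months between OpenAI's release and Moveworks' incorporation, per model.
lags = {model: months_between(released, incorporated)
        for model, released, incorporated in ROLLOUTS}

for model, lag in lags.items():
    print(f"{model}: {lag} month(s)")
```

Computed this way, the lag ranges from one month (GPT-3.5) to seven months (GPT-4), with GPT-4.1 at two months.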

While OpenAI’s GPT models are a key part of our AI stack, we also integrate other leading models from providers like Azure, and fine-tune open source models such as Meta’s LLaMA and BAAI’s BGE (distributed on Hugging Face) to accomplish specific enterprise use cases. This hybrid approach ensures we can deliver the best performance across global deployments while maintaining flexibility across model providers.

Now, as GPT-4.1 becomes available across all Moveworks data centers, we wanted to share more about our evaluation process and the anticipated benefits of this latest upgrade.

Tested and vetted: Moveworks AI Assistant is now using GPT-4.1

Every machine learning model candidate, large or small, goes through a rigorous evaluation process. For a foundation model upgrade such as GPT-4.1, Moveworks benchmarks performance against the incumbent model (GPT-4o) on tasks such as planning, argument filling, and plugin execution across its evaluation datasets, and runs A/B tests to measure and verify improvements.

For the GPT-4.1 upgrade, Moveworks put significant effort into revamping the prompt structure in accordance with the model guidelines to maximize performance gains. We run the following evaluations to validate the improvements and ensure there are no regressions in the current experience.

Golden set evaluation: This internal dataset contains curated, pre-labeled interactions generated from various built-in and custom plugins that represent tasks across the enterprise. The test scores model output on these interactions against metrics ranging from plugin triggering accuracy to argument filling. This approach validates improvements in key tasks of the Reasoning Engine.
Side-by-side (SxS) evaluation: The main difference between the SxS and golden datasets is that SxS doesn't require pre-labeling of interactions. In this evaluation, we can sample a much larger dataset drawn from recent production data and compare the outputs of GPT-4o and GPT-4.1. This approach validates improvements against more recent use cases within the AI Assistant.
Human annotator evaluation: This dataset contains anonymized end-to-end experiences from usage of GPT-4.1 in our internal Moveworks AI Assistant, m8. In this evaluation, expert annotators review the experiences and provide nuanced insights, such as variations in summaries or differences in the citations returned, to assess the end-to-end experience.
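
To make the first two evaluations concrete, here is a minimal sketch in Python of how a golden set score and a side-by-side comparison could be computed. It is an illustration only, not Moveworks' actual tooling: the data classes, function names, plugin names, and sample utterances are all hypothetical, and argument filling is scored by exact match, which is a simplifying assumption.

```python
from dataclasses import dataclass, field


@dataclass
class GoldenExample:
    """A curated, pre-labeled interaction (hypothetical schema)."""
    utterance: str
    expected_plugin: str
    expected_args: dict = field(default_factory=dict)


@dataclass
class Prediction:
    """What a model chose for one utterance (hypothetical schema)."""
    plugin: str
    args: dict = field(default_factory=dict)


def golden_set_metrics(examples, predictions):
    """Score predictions against labels: plugin triggering and argument filling."""
    if not examples or len(examples) != len(predictions):
        raise ValueError("need one prediction per golden example")
    plugin_hits = 0
    arg_hits = 0
    for example, prediction in zip(examples, predictions):
        if prediction.plugin == example.expected_plugin:
            plugin_hits += 1
            # Arguments are only scored when the right plugin was triggered.
            if prediction.args == example.expected_args:
                arg_hits += 1
    return {
        "plugin_triggering_accuracy": plugin_hits / len(examples),
        "argument_filling_accuracy": arg_hits / plugin_hits if plugin_hits else 0.0,
    }


def side_by_side_disagreements(utterances, incumbent, candidate):
    """No labels needed: list the cases where the two models chose different plugins."""
    return [(u, a.plugin, b.plugin)
            for u, a, b in zip(utterances, incumbent, candidate)
            if a.plugin != b.plugin]


# Made-up sample data for illustration.
examples = [
    GoldenExample("How do I reset my VPN password?", "knowledge_search", {"query": "reset VPN password"}),
    GoldenExample("Request a new laptop", "forms", {"form": "hardware_request"}),
    GoldenExample("What is the parental leave policy?", "knowledge_search", {"query": "parental leave policy"}),
    GoldenExample("Book time off next Friday", "forms", {"form": "time_off"}),
]
candidate = [
    Prediction("knowledge_search", {"query": "reset VPN password"}),  # plugin and args correct
    Prediction("forms", {"form": "hardware_request"}),                # plugin and args correct
    Prediction("knowledge_search", {"query": "leave policy"}),        # plugin correct, args differ
    Prediction("knowledge_search", {"query": "time off"}),            # wrong plugin
]
incumbent = [
    Prediction("knowledge_search", {"query": "reset VPN password"}),
    Prediction("knowledge_search", {"query": "new laptop"}),
    Prediction("knowledge_search", {"query": "parental leave policy"}),
    Prediction("knowledge_search", {"query": "time off"}),
]

metrics = golden_set_metrics(examples, candidate)
disagreements = side_by_side_disagreements([e.utterance for e in examples], incumbent, candidate)
```

In this sketch the golden set yields a score per metric, while the side-by-side comparison only surfaces differences between the two models; deciding which output is better still takes a judge, which is where a review step such as the human annotator evaluation fits.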

For the past month, we have been dogfooding the model upgrade with our internal AI Assistant, m8, to validate with our own employees and use cases. This customer zero testing phase allows us to refine and enhance the AI Assistant's functionality by gathering firsthand insights, ultimately delivering a more robust and effective product to our customers.

How GPT-4.1 improves the user experience

According to OpenAI, GPT-4.1 outperforms GPT-4o on several benchmarks, including a 21.4% absolute increase on coding benchmarks and a 10.5% boost in instruction-following ability.

These capabilities lead to experience improvements for Moveworks AI Assistant users:

Improves plugin selection: GPT-4.1 follows instructions much more literally. Because of that, we’ve reconstructed all system prompts to follow best practices and achieve a higher ceiling of performance. Our Reasoning Engine is now able to obtain more relevant information from plugins to select the right ones.
Serves more answers: Forms and Knowledge Search plugins are invoked more often, doubling the number of answers served to users to increase the utility of the AI Assistant.
Reduces latency: GPT-4.1 is expected to bring a 38% reduction in average latency in comparison to GPT-4o-2024-08-06, as measured in our benchmarking exercise.
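
The figures above mix two kinds of percentages: an absolute increase in a benchmark score (percentage points) and a relative reduction in average latency. The sketch below shows how each is computed. All numbers in it are made up for illustration and are not Moveworks or OpenAI measurements, and the function names are introduced here.

```python
from statistics import mean


def absolute_gain_points(old_score_pct: float, new_score_pct: float) -> float:
    """Absolute change between two benchmark scores, in percentage points."""
    return new_score_pct - old_score_pct


def relative_change(old_value: float, new_value: float) -> float:
    """Change relative to the old value, as a fraction (0.25 means +25%)."""
    if old_value == 0:
        raise ValueError("old_value must be non-zero")
    return (new_value - old_value) / old_value


def latency_reduction(baseline_latencies, candidate_latencies) -> float:
    """Relative reduction in average latency, as a fraction of the baseline average."""
    return -relative_change(mean(baseline_latencies), mean(candidate_latencies))


# Made-up scores: a benchmark moving from 40% to 50% is a 10-point absolute
# gain, but a 25% relative gain.
points = absolute_gain_points(40.0, 50.0)
relative = relative_change(40.0, 50.0)

# Made-up per-request latencies in seconds for an incumbent and a candidate model.
baseline_s = [2.0, 3.0, 2.5]
candidate_s = [1.5, 1.6, 1.55]
reduction = latency_reduction(baseline_s, candidate_s)

print(f"absolute gain: {points:.1f} points, relative gain: {relative:.0%}")
print(f"average latency reduction: {reduction:.0%}")
```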

Commitment to continuous innovation

Moveworks is using powerful, industry-leading LLMs such as GPT-4.1 to solve profound business problems, helping employees find information, automate tasks, and be more productive. We have invested significant R&D to make these components work together, and look forward to seeing how GPT-4.1 is able to deliver elevated performance and positive outcomes to our customers.

As the pace of model advancements continues to accelerate, partnering with Moveworks' trusted experts allows you (and your employees) to stay at the forefront of AI innovation, while your organization stays focused on the things that matter to your core business operations and competencies.

See how it works – Watch the Moveworks AI Assistant in action right now.


The content of this blog post is for informational purposes only.