Blog / December 11, 2025

First look at GPT-5.2 in Moveworks: A sharper, more focused Assistant

Stefan Baumann, Senior Machine Learning Engineer

Yvette Lin, Machine Learning Engineer

Matthew Mistele, Senior Engineering Manager

As an OpenAI partner, Moveworks received early access to GPT-5.2—and we immediately put it to the test. We plugged the model directly into the Moveworks Reasoning Engine, our sophisticated hybrid AI system that orchestrates tools, workflows, and business logic for the AI Assistant.

Then, we ran it through our full agentic evaluation stack to answer the real question: Does GPT-5.2 meaningfully improve how our Assistant behaves in enterprise workflows, and not just how it generates text?

Early results say yes. Here’s Moveworks CEO Bhavin Shah on what the model update means for Moveworks:

“We’re excited to integrate GPT-5.2 into the Moveworks AI Assistant. Our internal evaluations show that it demonstrates greater self-awareness, stronger steerability, and improved tool calling than 5.1—all of which are critical to automate our customers’ enterprise workflows.”

In this post, we’ll share what we’ve seen so far from GPT-5.2 inside Moveworks and how it is poised to reshape the behavior of our Assistant in real employee workflows, helping millions of employees at our enterprise customers get work done.

Why evaluating AI agents is different from evaluating LLMs

At Moveworks, we care about one thing far more than LLM performance on standard benchmarks: ensuring our Assistant reliably gets real work done. So when we evaluate a new model, we don’t stop at those benchmarks, because their scores only tell part of the story.

Instead, we embed the model in the Moveworks Reasoning Engine and assess what really matters: reasoning and effectiveness in the real world. Can it use tools correctly, follow customer-specific business logic, and drive real employee workflows to resolution safely and efficiently?

Our agentic evaluations are designed to measure exactly that set of capabilities, which for our AI Assistant includes:

  • Choosing from hundreds of customer-specific tools
  • Orchestrating multi-step, multi-plugin workflows to complete a request
  • Respecting consent flows and safety policies
  • Deciding when to escalate to other systems or human experts instead of guessing
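To make the four capabilities above concrete, here is a minimal sketch of how a single agent run could be scored against them. It is an illustration only, not the Moveworks evaluation stack: `EvalCase`, `AgentTrace`, `score_trace`, and the tool names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    """Hypothetical expectation for one test request."""
    expected_tools: list[str]          # tools in the order they should be called
    requires_consent: bool = False     # must the user confirm before acting?
    should_escalate: bool = False      # should the run hand off instead of acting?


@dataclass
class AgentTrace:
    """Hypothetical record of what the agent actually did."""
    tool_calls: list[str]
    consent_obtained: bool = False
    escalated: bool = False


def score_trace(case: EvalCase, trace: AgentTrace) -> dict[str, bool]:
    """Return one pass/fail flag per capability."""
    return {
        # Tool selection: exactly the expected tools were chosen, in any order.
        "tool_selection": set(trace.tool_calls) == set(case.expected_tools),
        # Orchestration: the tools were also called in the expected order.
        "orchestration": trace.tool_calls == case.expected_tools,
        # Consent: passes unless consent was required and not obtained.
        "consent": trace.consent_obtained or not case.requires_consent,
        # Escalation: the run escalated exactly when the case says it should.
        "escalation": trace.escalated == case.should_escalate,
    }


# Illustrative case: the right tools, but called in the wrong order.
case = EvalCase(expected_tools=["lookup_user", "reset_password"], requires_consent=True)
trace = AgentTrace(tool_calls=["reset_password", "lookup_user"], consent_obtained=True)
result = score_trace(case, trace)
print(result)
```

Scoring each capability separately, rather than as one pass/fail bit, shows which behavior regressed when a new model is swapped in.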

How GPT-5.2 sharpens the Moveworks AI Assistant

Below are three of the improvements we have seen in our early agentic evaluations of GPT-5.2.

1. Understands its own capabilities and limits

We expect the Assistant to know what it can and cannot do with the tools it has, say so clearly, and route the user to the next best step when it cannot complete a request end to end.

This self-awareness reduces wasted time and frustration from conversations that were never going to succeed, and it builds trust in Moveworks as the front door for employees to get what they need done.

Example: An employee asks to 3D print an object, but the Assistant has no tool that can fulfill this request. It should tell the user that it is unable to fulfill the request.

2. More focused responses with fewer distractions

We expect the Assistant to stay tightly focused on the user’s objective: infer intent quickly, surface the single best next step, and avoid speculative calls-to-action or unnecessary follow-up questions.

Fewer distractions mean lower friction and faster resolution. GPT-5.2’s greater steerability helps employees see “what to do next” immediately, improving both task completion rates and perceived quality.

Example: An employee asks to schedule a calendar event. The employee does not have their calendar integration connected, so connecting the calendar is the first step to fulfilling their request.
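The calendar example can be read as a "first unmet prerequisite" rule. The sketch below illustrates that rule under stated assumptions: the `next_step` helper, the prerequisite map, and the step names `schedule_event` and `connect_calendar` are hypothetical and do not describe how the Reasoning Engine is implemented.

```python
def next_step(requested: str, prerequisites: dict[str, list[str]], satisfied: set[str]) -> str:
    """Return the first unmet prerequisite of `requested`, or `requested` itself."""
    for prereq in prerequisites.get(requested, []):
        if prereq not in satisfied:
            return prereq
    return requested


# Hypothetical prerequisite map: scheduling needs a connected calendar.
prerequisites = {"schedule_event": ["connect_calendar"]}

before = next_step("schedule_event", prerequisites, satisfied=set())
after = next_step("schedule_event", prerequisites, satisfied={"connect_calendar"})
print(before)  # connect_calendar
print(after)   # schedule_event
```

An evaluation could then check that the single call-to-action in the Assistant’s reply matches the step this rule returns.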

3. Improved tool calling and more explainable results

When working with numerical data, we expect the Assistant to reliably call the right tools—especially Code Interpreter if enabled—and briefly explain how it arrived at its answer.

Better tool calling makes responses more transparent and auditable. When employees can see that the Assistant pulled real data and ran explicit calculations, their confidence in using Moveworks for operational decisions goes up.

Example: An employee asks a question that requires multi-hop tool use, including a computation that our Assistant is instructed to perform with Code Interpreter.
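One way to check this behavior in an evaluation is to inspect the order of tool calls in a run. The sketch below passes only if a data-retrieval call is followed by a code-execution call. The `Step` type, the `computed_from_data` helper, and the tool names are hypothetical stand-ins, not real Moveworks or OpenAI identifiers.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """Hypothetical record of one tool call in a run."""
    tool: str


def computed_from_data(steps: list[Step], data_tools: set[str],
                       code_tool: str = "code_interpreter") -> bool:
    """True if some data-retrieval call is later followed by a code-execution call."""
    seen_data = False
    for step in steps:
        if step.tool in data_tools:
            seen_data = True
        elif step.tool == code_tool and seen_data:
            return True
    return False


data_tools = {"fetch_tickets", "fetch_headcount"}  # illustrative names

good_run = [Step("fetch_tickets"), Step("fetch_headcount"), Step("code_interpreter")]
bad_run = [Step("fetch_tickets"), Step("fetch_headcount")]  # answered without computing

print(computed_from_data(good_run, data_tools))  # True
print(computed_from_data(bad_run, data_tools))   # False
```

This only verifies that the computation went through a tool. Whether the resulting number is correct would need a separate check against a known answer.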

When to expect GPT-5.2 in your Moveworks Assistant

Today, GPT-5.2 is live in m8, our internal Moveworks AI Assistant, and we are running larger-scale validation across real employee workflows.

From there, our path is straightforward: as GPT-5.2 clears the quality and safety bars in our regression tests, we will make it available to customers on a rolling basis, starting as early as next week.
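A regression gate of this kind can be sketched as a per-metric comparison between the current model and the candidate. The function name, metric names, tolerance, and pass rates below are invented for illustration. They are not measured results and not our actual release criteria.

```python
def clears_bar(baseline: dict[str, float], candidate: dict[str, float],
               tolerance: float = 0.0) -> bool:
    """True if the candidate is within `tolerance` of the baseline on every metric.

    A metric missing from `candidate` counts as a pass rate of 0.0.
    """
    return all(
        candidate.get(metric, 0.0) >= rate - tolerance
        for metric, rate in baseline.items()
    )


# Made-up pass rates, purely to show the mechanics.
baseline = {"tool_selection": 0.75, "consent": 0.5}
improved = {"tool_selection": 0.875, "consent": 0.5}
regressed = {"tool_selection": 0.875, "consent": 0.25}

print(clears_bar(baseline, improved))   # True
print(clears_bar(baseline, regressed))  # False
```

Requiring every metric to hold, rather than an average, keeps a gain in one capability from masking a regression in another, such as safety.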

For current customers, in case you missed it: your Moveworks Assistant has already been upgraded to use GPT-5.1. To understand what these model upgrades mean for your specific use cases and timelines, contact your Moveworks team.

And if you’re not yet a Moveworks customer and want to see the GPT-5 family of models in action inside an enterprise AI Assistant, reach out for a demo of the Moveworks platform.

The content of this blog post is for informational purposes only.
