Office becomes an agent runtime with Copilot’s new modes

What just changed in Office

Today’s Microsoft 365 update flips a long-standing assumption about Office. For decades, Word and Excel were the places where you typed, clicked, and calculated. Starting now, they can also be where software agents plan, act, and verify their own work in a way you can audit. Microsoft is introducing Agent Mode inside Excel and Word, plus a new cross-app Office Agent in Copilot that can move work across documents and spreadsheets. In short, Office just became an agent runtime.

This is not a cosmetic change. It turns the ribbon and canvas into a control room for auditable plan-and-execute workflows. It also lands alongside Microsoft’s model-plural approach, which lets organizations power Copilot and agents with both OpenAI and Anthropic models. Microsoft confirmed model choice in late September, allowing Researcher and custom agents built in Copilot Studio to run on Anthropic Claude as well as OpenAI models. See Microsoft’s announcement on expanding model choice in Microsoft 365 Copilot.

Why does this matter? Because the bottleneck in knowledge work is not generating words or charts. It is coordinating steps, enforcing guardrails, and proving to a skeptical reviewer or regulator what happened and why. Agents that can plan, act, and leave behind a trail of decisions turn Office from a canvas into an accountable automation layer.

Office as an agent runtime, explained simply

Think of Office as a busy airport. For years, you were both pilot and air traffic control. Copilot made flying easier, but you still taxied, took off, and landed every plane. Agent Mode adds a tower crew that can sequence flights, request clearances, and log every action. You still approve the flight plan, and you can take the yoke whenever you want, but most of the procedural work is now coordinated by software that knows the rules, calls the right services, and documents what it did.

Inside Excel and Word, Agent Mode exposes that tower behavior as a set of visible steps. You ask for an outcome, the agent drafts a plan, and then executes with confirmations at key points. In Excel, that looks like importing data, normalizing columns, reconciling totals, and flagging anomalies before writing back to the sheet. In Word, it looks like assembling a first draft, inserting cited passages, running a house style check, and preparing a tracked changes summary for legal review. Each step is explainable. Each action is reversible. Each decision is recorded.
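
To make the plan, confirm, execute, and undo loop concrete, here is a minimal sketch in Python. It is not Microsoft's implementation or API. The `Step` and `Run` classes, the dictionary standing in for a worksheet, and the choice to stop the run when an approval is declined are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for a worksheet: column name -> list of cell values.
Sheet = Dict[str, list]


@dataclass
class Step:
    name: str
    apply: Callable[[Sheet], Sheet]   # returns a new sheet, never mutates its input
    needs_confirmation: bool = False


@dataclass
class Run:
    sheet: Sheet
    history: List[Sheet] = field(default_factory=list)  # snapshots that make steps reversible
    log: List[str] = field(default_factory=list)         # one entry per decision

    def execute(self, plan: List[Step], confirm: Callable[[Step], bool]) -> None:
        for step in plan:
            if step.needs_confirmation and not confirm(step):
                # Assumption: a declined approval stops the run instead of skipping ahead.
                self.log.append(f"stopped: {step.name} was not approved")
                return
            self.history.append(self.sheet)
            self.sheet = step.apply(self.sheet)
            self.log.append(f"applied: {step.name}")

    def undo(self) -> None:
        if self.history:
            self.sheet = self.history.pop()
            self.log.append("undo: restored previous snapshot")


def normalize_columns(sheet: Sheet) -> Sheet:
    return {name.strip().lower(): values for name, values in sheet.items()}


def drop_notes(sheet: Sheet) -> Sheet:
    return {name: values for name, values in sheet.items() if name != "notes"}


plan = [
    Step("normalize column names", normalize_columns),
    Step("drop notes column", drop_notes, needs_confirmation=True),
]
run = Run(sheet={" Amount ": [10, 20], "Notes": ["a", "b"]})
run.execute(plan, confirm=lambda step: True)  # auto-approve for the demo
run.undo()                                     # reverses the last applied step
print(run.sheet)
print(run.log)
```

Keeping a snapshot per step is the simplest way to get reversibility in a sketch like this. A real runtime would more likely record inverse operations or rely on the document's own version history.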

Why model pluralism matters right now

Model pluralism is not an abstract architecture win. It changes what you can trust the system to do. OpenAI models may excel at broad reasoning and summarization. Anthropic models may be stronger at instruction following and careful, stepwise work. When those models are selectable per agent or per task, teams can choose the right tool for the job without switching products or rebuilding workflows. Microsoft’s update makes that choice available in the two places that matter most to enterprises: out-of-the-box agents like Researcher, and Copilot Studio for building your own agents.

This approach aligns with a broader push toward a common language for enterprise agents, where capabilities and controls are consistent across models and vendors.

Consider two common scenarios:

A product marketing team asks Researcher to synthesize market trends and build a launch brief. You may prefer OpenAI’s deep reasoning for web-heavy research and ideation. The agent plans sources, drafts an outline, and proposes a narrative.
A finance team asks an agent to reconcile revenue data from the enterprise resource planning system with bookings in a customer relationship management system and a set of messy spreadsheets. You might pick an Anthropic model if you want more predictable, instruction-following behavior. The agent performs schema mapping, checks totals against accounting rules, and produces an auditable report in Excel with a variance log.

By allowing model choice inside the same runtime, Microsoft reduces vendor lock-in and encourages a culture of benchmarking, not brand loyalty. You can tune model selection to your risk tolerance and task profile rather than to your procurement history.

Auditable plan and execute steps change the trust equation

Enterprises do not trust outcomes they cannot audit. Agent Mode addresses this head-on in three ways:

Step transparency. Agents propose a plan, then show each step as it runs. In spreadsheets, the agent shows where data came from, which columns were created or dropped, and which formulas were added. In documents, it shows which sections were drafted from internal sources, which edits came from style rules, and where citations were inserted.
Human-in-the-loop controls. Key steps can require confirmation. For a sensitive spreadsheet, that might include a prompt like “Approve datatype conversion for column Amount to Decimal with two places” before the update hits the grid. For a policy document, that might include “Approve redline set to incorporate new retention section,” with a one-click accept that preserves a record of who approved what and when.
Compliance-grade logging. When Agent Mode executes, it generates logs that capture who asked for what, which agent ran, which tools it called, the inputs and outputs of each step, and completion status. Those logs flow into familiar governance surfaces so security and compliance teams can monitor agent behavior alongside email, files, and chats. The result is a spreadsheet or document that can be reviewed and reproduced, not just a final state with no provenance.
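
The logging described above can be pictured as one structured record per run. The sketch below is an assumption about what such a record might contain, built only from the fields named in the text: requester, request, agent, tools called, step inputs and outputs, approvals, and status. The class names, field names, addresses, and sample values are hypothetical and do not reflect Microsoft's actual log schema.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class StepRecord:
    tool: str                           # which tool the agent called
    inputs: dict
    outputs: dict
    approved_by: Optional[str] = None   # filled in when a human confirms the step
    approved_at: Optional[str] = None   # ISO 8601 timestamp in UTC


@dataclass
class RunRecord:
    requested_by: str
    request: str
    agent: str
    status: str = "running"
    steps: List[StepRecord] = field(default_factory=list)

    def approve_last(self, reviewer: str) -> None:
        """Record who approved the most recent step and when."""
        step = self.steps[-1]
        step.approved_by = reviewer
        step.approved_at = datetime.now(timezone.utc).isoformat()

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Illustrative values only; none of these come from a real tenant.
record = RunRecord(
    requested_by="analyst@example.com",
    request="Convert column Amount to Decimal with two places",
    agent="excel-agent-demo",
)
record.steps.append(
    StepRecord(
        tool="convert_column",
        inputs={"column": "Amount", "to": "Decimal", "places": 2},
        outputs={"rows_changed": 128},
    )
)
record.approve_last("controller@example.com")
record.status = "completed"
print(record.to_json())
```

Serializing to sorted JSON makes runs easy to compare, which is the property a reviewer needs when checking that a workbook can be reproduced.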

This matters most where the stakes are highest: quarterly financial close, regulated disclosures, controlled content, and any business process where a manager or auditor needs to say yes, this is correct, and here is how we know. For context on how this plays out in the field, see our report on a bank-grade AI agents pilot.

The cross-app Office Agent makes workflows continuous

The new Office Agent in Copilot is designed to treat Office as one continuous workspace. Instead of you orchestrating handoffs between apps, the agent carries context across them. Ask it to build a partner briefing and you get four coordinated outputs: a one-page summary in Word, a three-slide overview in PowerPoint, a list of follow-ups in Outlook drafts, and a tracker in Excel populated with accounts and owners pulled from your systems of record. You see the plan, you approve the scope, and the agent does the legwork.

This cross-app behavior sits on top of Microsoft’s larger agents platform, which includes multi-agent orchestration, the ability to bring your own model, Microsoft Entra identities for agents, and extensions for monitoring and policy. Microsoft outlined that foundation at Build 2025. If you are thinking in enterprise terms, the headline is that agents are first-class citizens in identity, compliance, and lifecycle management, not just clever chatbots. Review the Build overview on multi-agent orchestration and Entra Agent ID. For a wider market view, see how the enterprise agent stack goes mainstream.

Concrete examples inside Excel and Word

Cash application in Excel. You ask: “Match these bank deposits against open invoices, create a variance tab for anything over 0.5 percent, and draft an email to the account owner for any variance over 250 dollars.” The agent proposes the plan, seeks approval to connect to finance data, runs a set of transformations, creates pivot tables and exception lists, and then pauses for you to approve the outbound messages. Every step is logged and reversible.
Policy update in Word. You ask: “Insert the new retention policy for customer data, align it to our template, and generate a tracked changes summary for the legal team.” The agent fetches the controlled policy block, updates cross references, runs style checks, and produces a one-page summary of edits. A compliance reviewer can replay the steps and see which library the content came from.
Cross-app quarterly readout. You ask Copilot: “Prepare the Q3 sales readout using regional data, customer health notes from the last 60 days, and open risks above 500,000 dollars.” The Office Agent composes a Word brief, builds a slide track with a consistent topline, and assembles an Excel appendix with region rollups and footnotes. You approve the plan and the data sources before anything is written.
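
The two thresholds in the cash application prompt are independent, which is easy to miss. Here is a small worked sketch, assuming deposits can be matched to invoices by an invoice ID and that invoice amounts are non-zero. The function name, the `Variance` tuple, and the sample amounts are invented for illustration.

```python
from decimal import Decimal
from typing import Dict, List, NamedTuple

PERCENT_THRESHOLD = Decimal("0.5")   # percent of the invoice amount, from the prompt
DOLLAR_THRESHOLD = Decimal("250")    # absolute dollars, from the prompt


class Variance(NamedTuple):
    invoice_id: str
    amount: Decimal         # deposit minus invoice
    percent: Decimal        # absolute variance as a percent of the invoice
    on_variance_tab: bool   # variance over 0.5 percent
    draft_email: bool       # variance over 250 dollars


def match(invoices: Dict[str, Decimal], deposits: Dict[str, Decimal]) -> List[Variance]:
    results = []
    for invoice_id, invoiced in invoices.items():
        if invoice_id not in deposits:
            continue  # unmatched invoices would go to a separate exception list
        diff = deposits[invoice_id] - invoiced
        percent = abs(diff) / invoiced * 100
        results.append(
            Variance(
                invoice_id,
                diff,
                percent,
                percent > PERCENT_THRESHOLD,
                abs(diff) > DOLLAR_THRESHOLD,
            )
        )
    return results


# Made-up sample data.
invoices = {
    "INV-1": Decimal("10000.00"),
    "INV-2": Decimal("40000.00"),
    "INV-3": Decimal("2000.00"),
}
deposits = {
    "INV-1": Decimal("10000.00"),
    "INV-2": Decimal("39700.00"),
    "INV-3": Decimal("1980.00"),
}
for variance in match(invoices, deposits):
    print(variance)
```

With this data, INV-2 is short by 300 dollars (0.75 percent), so it lands on the variance tab and triggers a draft email. INV-3 is short by only 20 dollars but that is 1 percent of the invoice, so it lands on the variance tab without an email. Using `Decimal` rather than floats avoids rounding surprises at the threshold.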

What this signals for the next 6 to 12 months

Agent-native workflows will overtake standalone chat. People will still prompt, but the center of gravity moves toward saved, repeatable agent plans that run on a schedule or on triggers. Expect teams to keep shared libraries of agent plans the way they keep slide templates and spreadsheet macros. See how Chrome is evolving toward an agent runtime inside the browser.
Model evaluation becomes an operational discipline. With model choice built in, information technology teams will create scorecards for tasks and switch defaults accordingly. In practice that could mean OpenAI by default for research-heavy tasks, Anthropic for instruction-heavy tasks, and niche models for code or math as they prove themselves.
Compliance becomes a feature, not a blocker. Auditable steps, human approvals, and unified logging make it possible to automate parts of financial close, policy management, and customer communications that used to be off-limits to automation. Expect internal audit to start writing agent acceptance criteria and sign-off checklists.
Office add-ins evolve into agent skills. The fastest way for software vendors to reach Office users will be to expose their capability as agent actions. If your add-in reads a sheet and posts data to a system, your next step is to surface those operations as actions that an Office Agent can call inside a plan.
Cross-app agents become the default. Work does not respect app boundaries. The Office Agent’s ability to move context between Word, Excel, PowerPoint, Outlook, and Teams will make it normal to ask for a business outcome and let the system choose which app to touch when.
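
The scorecard idea from the model evaluation prediction above can be sketched in a few lines. This assumes a team scores each model on its own test tasks and picks a default per task category by average score. The category names, the `model_a` and `model_b` labels, and every score are placeholders, not benchmark results.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# (task_category, model_label, score between 0 and 1)
Evaluation = Tuple[str, str, float]


def pick_defaults(evaluations: List[Evaluation]) -> Dict[str, str]:
    """Return the model with the highest mean score for each task category."""
    scores: Dict[str, Dict[str, List[float]]] = defaultdict(lambda: defaultdict(list))
    for category, model, score in evaluations:
        scores[category][model].append(score)
    return {
        category: max(by_model, key=lambda model: mean(by_model[model]))
        for category, by_model in scores.items()
    }


# Invented scores for illustration only.
evaluations = [
    ("research", "model_a", 0.82),
    ("research", "model_a", 0.78),
    ("research", "model_b", 0.74),
    ("table_ops", "model_a", 0.70),
    ("table_ops", "model_b", 0.88),
    ("table_ops", "model_b", 0.84),
]
print(pick_defaults(evaluations))
```

A mean is the crudest possible aggregate. A production scorecard would likely also track variance, cost, and latency, and would rerun on a schedule so that defaults follow the evidence.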

What to do this quarter

Pick three candidate workflows. Choose one spreadsheet-heavy process, one document-heavy process, and one cross-app process. Good examples include revenue reconciliation, policy updates, and executive readouts. Define the success metrics up front, for example percentage of steps automated, cycle time reduction, and error rate.
Turn on the governance plumbing. Ensure your tenant has the right logging and controls configured for agents. Decide which actions require human confirmation. Establish an approval flow for any step that writes to a system of record or sends an external message.
Create your first agent plans. In Copilot, record the steps that an experienced operator takes for your chosen processes. Break them into discrete actions with named inputs and outputs. Add clear, plain-language prompts that explain the intent, not just the instruction.
Decide your model policy. Pick a default provider for each process and define conditions that would trigger a switch. For example, use an Anthropic model when a plan includes schema mapping or precise table operations, and use an OpenAI model when the plan includes synthesis or heavy summarization.
Train reviewers, not just makers. The power user who writes the plan is not the only role that matters. Train reviewers to read the step list, understand the logs, and use the approve or reject controls with confidence.
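
One way to picture "discrete actions with named inputs and outputs" from the agent plan step above is a plain data structure that can be checked before anything runs. The sketch below is a hypothetical format, not a Copilot or Copilot Studio schema. The `Action` class, the helper functions, and the reconciliation plan are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Action:
    name: str
    intent: str                  # plain-language reason the step exists
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]
    writes_to_system_of_record: bool = False


def validate(plan: List[Action], provided: Set[str]) -> List[str]:
    """Return a list of wiring problems; an empty list means every input is satisfied."""
    available = set(provided)
    problems = []
    for action in plan:
        missing = [name for name in action.inputs if name not in available]
        if missing:
            problems.append(f"{action.name}: missing inputs {missing}")
        available.update(action.outputs)
    return problems


def needs_approval(plan: List[Action]) -> List[str]:
    """Names of actions that should pause for a human before running."""
    return [action.name for action in plan if action.writes_to_system_of_record]


plan = [
    Action("load_bookings", "Pull bookings so they can be compared with revenue",
           inputs=("crm_export",), outputs=("bookings",)),
    Action("reconcile", "Find differences between bookings and recognized revenue",
           inputs=("bookings", "erp_revenue"), outputs=("variance_log",)),
    Action("post_adjustments", "Write approved corrections back to the ledger",
           inputs=("variance_log",), outputs=("posting_receipt",),
           writes_to_system_of_record=True),
]

print(validate(plan, provided={"crm_export"}))                 # reports the missing ERP input
print(validate(plan, provided={"crm_export", "erp_revenue"}))  # no problems
print(needs_approval(plan))
```

Writing the intent next to each action gives reviewers something to check the step list against, which is the skill the reviewer training is meant to build.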

What to watch for

Metrics and dashboards for agent performance. Expect dashboards that show plan run times, failure modes, and variance trends for spreadsheet tasks. Expect document dashboards that show redline throughput, citation coverage, and review latency. These will become routine management reports.
An Agent Store inside Office. Discovery is the next barrier. The moment your organization can browse agents the way it browses templates, adoption will accelerate. Short, trusted descriptions and example plans will matter more than flashy demos.
New roles. Two roles will get popular quickly: agent plan author and agent auditor. The author translates business procedures into stepwise plans. The auditor verifies that plans comply with policy, that logs are comprehensive, and that approvals are in the right places.

Risks and how to mitigate them

Over-automation without guardrails. If every step runs with full write permissions and no approvals, you increase blast radius. Mitigation: set a policy that any step that writes to a production system requires confirmation, and any step that sends an external message requires a second reviewer the first time a plan runs.
Silent model drift. If you allow agents to switch models dynamically without notice, you can see subtle changes in behavior. Mitigation: log the model used per run, and require a plan-level approval when the default provider changes.
Hidden data leakage. If an agent pulls from both internal and web sources, make the boundary visible. Mitigation: require explicit user approval when a plan proposes to use web data in a document or spreadsheet intended for regulated use, and record that approval in the log.
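
The three mitigations above amount to a small policy function that runs before a plan executes. Here is a minimal sketch, assuming each planned step carries flags for what it touches. The `PlannedStep` class, the flag names, and the approval messages are hypothetical and are not settings in any Microsoft product.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PlannedStep:
    name: str
    writes_to_production: bool = False
    sends_external_message: bool = False
    uses_web_data: bool = False


def required_approvals(
    steps: List[PlannedStep],
    *,
    first_run: bool,
    regulated: bool,
    model: str,
    previous_model: Optional[str],
) -> List[str]:
    """List every approval the policy demands before this run may proceed."""
    approvals = []
    # Silent model drift: a changed model needs a plan-level approval.
    if previous_model is not None and model != previous_model:
        approvals.append(f"plan: model changed from {previous_model} to {model}")
    for step in steps:
        # Over-automation: production writes always need confirmation.
        if step.writes_to_production:
            approvals.append(f"{step.name}: confirm production write")
        # External messages need a second reviewer the first time a plan runs.
        if step.sends_external_message and first_run:
            approvals.append(f"{step.name}: second reviewer for external message")
        # Hidden data leakage: web data in regulated output needs explicit approval.
        if step.uses_web_data and regulated:
            approvals.append(f"{step.name}: approve web data in regulated output")
    return approvals


steps = [
    PlannedStep("fetch benchmarks", uses_web_data=True),
    PlannedStep("update ledger", writes_to_production=True),
    PlannedStep("email customer", sends_external_message=True),
]
for line in required_approvals(
    steps, first_run=True, regulated=True, model="model_b", previous_model="model_a"
):
    print(line)
```

Returning the approvals as a list, rather than blocking inside the function, lets the same policy drive both the confirmation prompts and the run log.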

The bottom line

Office has quietly become an operating layer for agents. Agent Mode in the canvas, a cross-app Office Agent in Copilot, and the ability to choose between OpenAI and Anthropic models add up to a new default for enterprise work. Spreadsheets and documents are no longer just the outputs of human effort. They are the living artifacts of an auditable plan that can be replayed, inspected, and improved. If you build three agent plans this quarter and wire them into your governance, you will not go back.