Artificial Intelligence

Articles under the Artificial Intelligence category.

Enterprise Benchmarks Force the AI Agent Reliability Reckoning

Enterprise-grade evaluations are puncturing hype around browser and desktop agents. Salesforce’s SCUBA benchmark and NIST’s COSAiS overlays reveal where agents break, which guardrails work, and how to reach dependable automation in 6 to 12 months.

Notion 3.0 Agents Turn Knowledge Workspaces Into Doers

Notion 3.0 introduces permission-aware, stateful agents that run for minutes at a time, remember your workspace, and connect to the tools your team uses. This guide shows how to ship real automations, deploy them safely, and measure business impact.

The Agent Is the New Desktop: ChatGPT’s Work Takeover

OpenAI turned ChatGPT into a computer-using agent in July and opened a preview Apps SDK in October that lets third-party apps run inside the chat. Together they point to a new default UI for work and a very different near-term automation playbook.

From Demos to Deployments: Claude 4.5 and the Agent SDK

Anthropic’s late September launch of Claude Sonnet 4.5 and a production Agent SDK marks a real turn for agentic coding and computer use. Long-horizon reliability, checkpoints, and parallel tools now let teams ship, not just demo.

Gemini 2.5 Browser Agents Break the API Bottleneck

Google’s Gemini 2.5 Computer Use preview turns agents into first‑class web users. With visual reasoning and 13 native browser actions, software can now navigate, type, click, and complete tasks across sites without brittle plugins or custom APIs.

AWS AgentCore and MCP are unifying the enterprise AI stack

AWS just turned MCP from a developer curiosity into production plumbing. With the Knowledge MCP server now GA and AgentCore in preview, enterprises finally get a unified way to run dependable AI agents with real governance, observability, and portability.

Comet Goes Free: The Browser Becomes an Agent Runtime

Perplexity made Comet free and added Background Assistant for Max users. Here is why putting an always-on agent inside the browser could reshape search, checkout, and the default stack.

Qwen3 Omni and Kimi K2 spark China’s open-weight reset

Two September releases compressed the agent stack from both ends. Qwen3 Omni brings real-time any-to-any speech with open weights, while Kimi K2 expands working context for code, cutting hops, failures, and cost for production-grade agents.

Agent Platform Wars Begin: Gemini Enterprise vs AgentKit

A new enterprise AI showdown is here. Google debuts Gemini Enterprise while OpenAI launches AgentKit, with AWS AgentCore close behind. We compare capabilities, build paths, and lock-in risks, then give you a 30/60/90-day plan to ship.

Salesforce gives AI agents a voice for talk to work

Salesforce is preparing voice-native, hybrid-reasoning agents that can listen, speak, plan, and act across your stack. Here is what it means for contact centers, how it compares to ServiceNow and Sierra, and a 90-day playbook to deploy safely.

Agentic Commerce Arrives: ChatGPT Instant Checkout and ACP

OpenAI is turning chat into a checkout counter. Here is what ChatGPT Instant Checkout and the Agentic Commerce Protocol mean for developers, brands, and the next wave of agent-driven shopping.

Meta’s Ray-Ban Display turns AI agents into a hands-free OS

Meta’s new Ray-Ban Display glasses and Neural Band turn on‑lens cards and silent EMG gestures into a complete loop for ambient computing. Here’s what shipped, why it matters, and how builders can design for glance-and-gesture first.

The Credential Broker Layer For Safe AI Browser Agents

A new middleware tier is forming between agents and the web. Driven by passkeys, zero-day churn, and prompt-injection research, credential brokers will gate logins and risky clicks. Here is how it becomes the 2026 enterprise stack.

Copilot Now Generally Available: IDEs Become Agent Control Rooms

GitHub Copilot’s coding agent is now generally available. It drafts pull requests, runs in secure ephemeral workspaces on Actions, and brings enterprise guardrails across Visual Studio Code and JetBrains. Here is how that reshapes the SDLC and what to do next.

Microsoft’s Agent Framework unifies the enterprise agent stack

Microsoft’s Agent Framework public preview consolidates AutoGen and Semantic Kernel into a single production stack with typed workflows, durable memory, OpenTelemetry, and MCP support. See what it unlocks for enterprises and how to adopt it in 30 days.

SoftBank’s $5.4B ABB buy primes LLM agents for factories

On October 8, 2025, SoftBank agreed to acquire ABB’s robotics division for $5.4 billion. The deal could push large language and vision models from the browser into factory cells, accelerating agentic automation.

LangChain 1.0 and LangGraph 1.0 set the agent runtime standard

With the 1.0 cycle for LangChain and LangGraph, stateful graphs, guardrails, human review, and deep observability move agents from dazzling demos to dependable enterprise systems. Here is what changes and how to adopt it now.

OpenAI’s open-weight pivot: GPT‑OSS and the edge agent era

OpenAI’s August 5, 2025, GPT-OSS release puts strong reasoning on a single 80 GB GPU and, with quantization, even on 16 GB devices. Here is how it rewires procurement, stacks, and agent standards.