AI Agents Are Doing Real Work Now. Here's What Changes.

[Image: A professional at a dual-monitor workstation with AI agent dashboards alongside regular business software]

I've watched a lot of tech trends arrive with fanfare and leave quietly. AI agents looked like one of them. Something to talk about at conferences. A demo. A pilot program in a press release.

It stopped being a press release.

BNY Mellon — founded in 1784, America's oldest bank — now runs 130+ autonomous AI agents in production across its 52,000-person workforce. Contract review time dropped from 4 hours to 1 hour. Financial planning cycles cut by 60%. The bank processes over 3,000 vendor agreements annually through automated contract agents. These agents don't wait to be prompted. They monitor triggers, cross-reference databases, validate compliance, and communicate through Microsoft Teams. Twenty thousand BNY employees have been trained to build more of them.

That's not a pilot. That's a structural change to how a 240-year-old institution does business.

What the Numbers Show

The mid-year 2026 enterprise AI report from Ampcome is worth sitting with. Fifty-four percent of enterprises now run AI agents in active production. Two years ago, that number was 11%. Eighty percent of those organizations report measurable economic impact. Multi-agent architectures, where multiple agents pass work to each other, grew 327% in under four months.

Goldman Sachs projects $7.6 trillion in AI infrastructure spending between now and 2031. That's roughly 25% of annual U.S. GDP.

This wave is not coming. It's here.

The hype obscures what matters: what does this change for your team, your product, your hiring decisions?

What Agents Are Good At

The honest answer: repetitive, well-defined, high-volume work where mistakes are tolerable or fixable.

Invoice processing. Lead qualification. First-pass customer support triage. Compliance monitoring. Report generation. Document extraction from unstructured sources. Competitive intelligence gathering. These are the early wins, and they're real.

The mid-year enterprise report shows 90% faster document processing with roughly 95% extraction accuracy. Invoice matching moved from days to seconds. Twenty-five percent of enterprises saw meaningful impact within three months of deployment. The median time to value is six months or less for well-defined workflows.

For small businesses, the timeline is faster. SMBs are deploying in days, not months. Lead qualification takes 30-40% less time. Routine customer support inquiries get handled without a human. Operational efficiency in structured processes is improving by 90%. Fifty-eight percent of U.S. small businesses already use generative AI tools, and the move from "AI tool user" to "AI agent deployer" is shorter than most people realize.

If you're not evaluating these use cases right now, someone competing with you is moving faster.

[Image: An engineer reviewing automated workflow connections on a large monitor late at night]

Where It Goes Wrong

Here's what the adoption numbers don't show: 46% of enterprises name legacy system integration as their primary obstacle. Audit trail and explainability gaps block approval in regulated industries. "Governance-first" deployments outperform "capability-first" ones by meaningful margins.

The companies getting burned are the ones treating agents like a feature toggle. They deploy before answering the foundational questions: what decisions do we still own as humans? How do we audit what the agent did? What's the escalation path when it gets something wrong?

There's a liability issue brewing too. If an AI agent incorrectly authorizes a supplier payment, misprices a product, or sends misleading communications, who owns it? Clifford Chance flagged this in February 2026: current contracts often don't cover it. The EU's AI Liability Directive was updated in March 2026. Legal exposure is real, not theoretical.

Ben Morton said it directly in a conversation I had with him: "AI should never make decisions about humans. Hard stop." He's right. Agents handling workflows is one thing. Agents deciding who gets hired, fired, promoted, or denied credit is another. The line isn't blurry.

What Changes for Leaders and Engineers

If you lead a software team, you need to think about this from two directions.

Direction one: what to automate, and what governance looks like. Every agent you deploy is a choice about what humans no longer touch. Be deliberate. Map out where judgment is genuinely required versus where you're paying humans to move data between systems. The data-movement tasks are fair game. The decisions affecting people's livelihoods are not.
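One way to make that boundary explicit is to write it down as a routing rule. Here is a minimal Python sketch, not a production policy. The category labels, the `Task` fields, and the 0.9 confidence threshold are all assumptions for illustration; your own categories and thresholds will differ.

```python
from dataclasses import dataclass

# Hypothetical labels for decisions that affect people's livelihoods.
# These never go to an agent, whatever the agent's confidence.
PEOPLE_DECISIONS = {"hiring", "termination", "promotion", "credit"}


@dataclass(frozen=True)
class Task:
    name: str
    category: str      # e.g. "data_movement" or "hiring" (illustrative labels)
    confidence: float  # agent's self-reported confidence, 0.0 to 1.0


def route(task: Task, min_confidence: float = 0.9) -> str:
    """Return "human" or "agent" for a task."""
    # Hard stop: decisions about people always stay with a human.
    if task.category in PEOPLE_DECISIONS:
        return "human"
    # Escalation path: low-confidence work goes to a human reviewer.
    if task.confidence < min_confidence:
        return "human"
    return "agent"


print(route(Task("sync invoices to ledger", "data_movement", 0.97)))  # agent
print(route(Task("screen resumes", "hiring", 0.99)))                  # human
```

The point of the sketch is the ordering: the people-decision check comes first, so no confidence score can override it.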

Set up audit trails before you ship. Not after something goes wrong.
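What might a bare-minimum audit trail look like? The sketch below is one possible shape, assuming an in-memory list stands in for whatever append-only store you actually use. The field names and the hash-chaining approach are my illustration, not a description of any particular vendor's logging. Each record carries the hash of the previous one, so a later edit to an old record is detectable.

```python
import hashlib
import json
from datetime import datetime, timezone


def _digest(body: dict) -> str:
    """SHA-256 over a stable JSON encoding of the record body."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_audit_record(log: list, agent: str, action: str,
                        inputs: dict, outcome: str) -> dict:
    """Append one record of what an agent did, chained to the previous record."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "action": action,
        "inputs": inputs,
        "outcome": outcome,
        "prev_hash": log[-1]["hash"] if log else "",
    }
    record["hash"] = _digest(record)  # hash covers every field above
    log.append(record)
    return record


def verify_chain(log: list) -> bool:
    """Return True only if no record was altered or reordered."""
    prev_hash = ""
    for record in log:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev_hash"] != prev_hash or _digest(body) != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True


log = []
append_audit_record(log, "contract-agent", "flag_clause",
                    {"contract_id": "example-001"}, "escalated_to_human")
append_audit_record(log, "contract-agent", "approve",
                    {"contract_id": "example-002"}, "auto_approved")
print(verify_chain(log))  # True
```

A real deployment would write to durable storage and restrict who can write to it. The sketch only shows the record shape and the tamper check.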

Direction two: your people's relationship to these tools. BNY Mellon trained 20,000 employees to build agents. That's not accidental. Eighty-seven percent of enterprises are prioritizing workforce upskilling as their agent deployments mature. The pattern is clear: the organizations seeing the best results are not replacing people with agents. They're turning people into the ones who build, configure, and oversee the agents.

Engineers who understand how to design agentic workflows, set appropriate human oversight points, and debug agent behavior are going to be in demand. This is worth investing in now, not after everyone else has started.

Junior roles doing high-volume, repetitive data work face real pressure. Senior roles making judgment calls, setting strategy, managing relationships, and architecting systems face less. That's the honest picture, and it isn't a comfortable one if you're early in your career. But the answer isn't to avoid the technology. It's to move toward building with it.

[Image: A small business owner reviewing automated workflows on a laptop in a calm home office]

How to Start Without a Massive Budget

You don't need 20,000 trained employees or a Goldman Sachs infrastructure budget. You need one well-scoped workflow and a week of honest mapping work.

Find a process your team runs manually, repeatedly, where the failure cost is low and the steps are predictable. Document every single step. Then separate two categories: steps requiring genuine judgment (context, relationships, ethical weight) and steps doing sequencing or data movement. Start with the second category.
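The mapping exercise can be as simple as a list with one flag per step. Here is a small Python sketch of that split. The workflow and its steps are invented for illustration, and the `needs_judgment` flag is something your team decides by hand, not something the code infers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    description: str
    needs_judgment: bool  # set by the team during the mapping exercise


def split_steps(steps):
    """Separate automation candidates from steps that stay with a human."""
    automate = [s for s in steps if not s.needs_judgment]
    keep_human = [s for s in steps if s.needs_judgment]
    return automate, keep_human


# Hypothetical vendor-invoice workflow, documented step by step.
workflow = [
    Step("Download invoice PDF from shared inbox", needs_judgment=False),
    Step("Extract amount, vendor, and due date", needs_judgment=False),
    Step("Match invoice against purchase order", needs_judgment=False),
    Step("Decide whether to dispute a mismatched amount", needs_judgment=True),
    Step("Negotiate revised terms with the vendor", needs_judgment=True),
]

automate, keep_human = split_steps(workflow)
print(len(automate), "candidate steps for a first agent")  # 3 candidate steps for a first agent
print(len(keep_human), "steps that stay with a human")     # 2 steps that stay with a human
```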

Tools like n8n, Zapier, and Make let non-engineers wire agentic automations without deep coding. Anthropic's Claude, OpenAI's GPT-4o, and Google's Gemini all offer API access with agentic capabilities built in. The cost to run your first experiment is minimal. The cost of not starting is compounding monthly.

One more thing, from my own observation of how teams measure AI productivity: measure what the agent changes in output, not what it feels like it changes. Engineers report feeling faster with AI tooling, but the actual cycle time doesn't always match the subjective sense. Track results, not impressions.
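A hedged example of what "track results" can mean in practice: compare the median cycle time of the same workflow before and after the agent. The function name and the sample numbers below are made up for illustration. Only the 4-hour to 1-hour case echoes the contract review figure cited earlier.

```python
from statistics import median


def cycle_time_change(before_hours, after_hours):
    """Relative change in median cycle time. Negative means faster."""
    base = median(before_hours)
    return (median(after_hours) - base) / base


# Invented sample: a couple of tickets got quicker, but the median did not move.
print(cycle_time_change([10, 12, 14], [9, 12, 13]))  # 0.0

# A 4-hour task dropping to 1 hour is a 75% reduction.
print(cycle_time_change([4, 4, 4], [1, 1, 1]))       # -0.75
```

The median is used here because one unusually slow or fast item shouldn't swing the result. Whatever metric you pick, collect the "before" numbers prior to deploying the agent.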

The Companies Getting It Right

The pattern in the data is consistent. The organizations seeing the best results from agent deployment share a few traits. They governed before they shipped. They trained people to be builders, not users. They started narrow and expanded from demonstrated success. They kept humans in the loop on decisions affecting other humans.

That's not a formula. It's the same discipline engineers apply to good system design: define scope, set boundaries, monitor behavior, iterate.

AI agents aren't magic. They're automation with better language comprehension than we've had before. That changes what's automatable. It doesn't change the principle: garbage in, garbage out. The teams building on solid foundations are getting real returns. The teams chasing demos are burning runway.

The technology is ready enough to start. The question is whether your team is.

What's one workflow your team runs manually with no genuine judgment requirement? That's where I'd put the first agent. Let me know what you're seeing in your organization.