The Financial Harness: Like Claude Code, But for Money

  • Author: Noëlle Becker Moreno

Agents are moving real money in production. Most teams have no infrastructure for it. ampersend is the financial harness for the agent economy.

Last month, we ran an experiment at The House, Edge & Node's event and coworking space in San Francisco. An agent walked up to an HTTP endpoint, paid $25 in USDC, and got back a promo code for a discounted day pass. No account. No API key. No human in the loop.

Agents can move real money, at scale, faster than any human can review. And most of the teams deploying them right now have no infrastructure for that.

The infrastructure conversation is finally starting. Salesforce published a piece on agent harnesses. RoboCFO wrote about harnesses for AI in finance. Anthropic's engineering team published guidance on building effective harnesses for long-running agents. The concept is going mainstream.

But nobody has named the specific thing that happens when agents touch money.

We're calling it the financial harness.

An agent harness is the execution layer around a model. Not the model itself. The harness is the controls: tool access, verification loops, approval gates, state management, audit trail. The model decides what to do, while the harness decides whether it's allowed to do it. The harness determines whether the output is trustworthy enough to act on and whether there's a record of what happened afterward.

A financial harness is that same layer, applied specifically to agents that touch money. It is not a general harness with a finance use case bolted on. The stakes are different, so the controls are different.

The reason money requires its own category is simple: a financial mistake lands faster and is harder to undo than a mistake in most other agent workloads.

When an agent generates bad copy, someone catches it before it ships. When an agent writes broken code, the tests catch it or the engineer does.

However, when an agent moves money it shouldn't, the window to catch it is small, and the damage is immediate. A single mistake can compound fast: a charge that runs 200 times instead of once, an LLM that loops on a paid API and quietly burns through an uncapped budget, or an agent that transacts with a merchant that was never vetted or approved.

Forbes, citing Gartner research, reports that AI machine customers will control $30 trillion in purchases by 2030. McKinsey puts the agentic commerce opportunity at $3 trillion to $5 trillion in transactions. Meanwhile, 96% of enterprises are already expanding their use of AI agents, and 52% had agents running in production as of 2025.

These agents are production systems, many of them moving real funds, with no financial control layer underneath them.

The models are not the problem; they are getting better every day. The problem is that most teams deploying financial agents have put their controls in a system prompt and called it a day.

Controls around financial actors are not a new concept; they existed long before software. These include approval thresholds, role-based spending limits, dual authorization for large transactions, and audit trails that hold up under scrutiny months later. Every finance team in every organization relies on these measures because money is consequential and significant transactions require verification built in at the structural level.

You can add these controls to an agent after the fact. System prompts here. Spreadsheet logs there. Approval chains routed over Slack. This is where most teams are today. It feels manageable until something unexpected happens at 2am on a Saturday. An agent takes an action no one anticipated, and suddenly you are reconstructing what happened from model outputs, fragmented logs, and imperfect memory.

Or you can build the controls into the infrastructure the agent runs on. Policies that evaluate every transaction before funds move. Approval gates enforced in code, not conversation. An audit trail that is automatic, not assembled after the fact. Every decision logged with a reason, whether a human ever asks for it or not.
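To make "enforced in code, not conversation" concrete, here is a minimal sketch of an approval gate that writes its own audit trail. It is an illustration under assumptions, not ampersend's implementation: `gated_payment`, the `check` and `send` callables, and the record fields are all hypothetical names. The point it shows is structural: funds can only move through the gate, and every attempt, approved or not, leaves a record with a reason.

```python
import json
import time
from typing import Callable, Optional, TextIO, Tuple

# Hypothetical interfaces, assumed for this sketch:
#   check(payment) -> (approved, reason)   the policy decision
#   send(payment)  -> receipt string       the call that actually moves funds
Check = Callable[[dict], Tuple[bool, str]]
Send = Callable[[dict], str]


def gated_payment(payment: dict, check: Check, send: Send,
                  audit: TextIO) -> Optional[str]:
    """Run the policy check, move funds only if approved, always log."""
    approved, reason = check(payment)
    record = {
        "ts": time.time(),
        "payment": payment,
        "approved": approved,
        "reason": reason,
        "receipt": None,
    }
    try:
        if approved:
            record["receipt"] = send(payment)
    except Exception as exc:
        # A failed send is still part of the audit trail.
        record["error"] = repr(exc)
        raise
    finally:
        # One JSON line per attempt, written whether or not funds moved.
        audit.write(json.dumps(record) + "\n")
    return record["receipt"]
```

Because the log write sits in a `finally` block, the record is written on every path through the function, including denials and failed sends. Nobody has to remember to log.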

ampersend is the financial harness for the agent economy. Specifically: the x402 ecosystem, where agents transact directly with services, move USDC on Base, and operate without the human-mediated payment flows that traditional finance assumes. This is new infrastructure for a new kind of actor, and the controls have to be native to it.

Every transaction through ampersend runs through a policy engine before funds move. Budgets are set per agent, and merchant allowlists are enforced at the infrastructure level. Anything above a threshold routes to a human for review. Every action is logged with enough context to reconstruct the decision later.

The agent has a wallet. The harness decides whether it gets to spend, how much, to whom, and under what conditions.
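The checks described above can be sketched in a few dozen lines. This is a simplified illustration, not ampersend's actual API or policy model: the `Policy`, `PolicyEngine`, and `Decision` names, the field names, and the order of the checks are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from enum import Enum
from typing import Dict, FrozenSet, List


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NEEDS_REVIEW = "needs_review"


@dataclass(frozen=True)
class Policy:
    # Illustrative fields; real limits would come from configuration.
    budget: Decimal                      # total the agent may spend
    allowed_merchants: FrozenSet[str]    # merchant allowlist
    review_threshold: Decimal            # larger amounts go to a human


@dataclass
class PolicyEngine:
    policies: Dict[str, Policy]
    spent: Dict[str, Decimal] = field(default_factory=dict)
    audit_log: List[dict] = field(default_factory=list)

    def evaluate(self, agent_id: str, merchant: str, amount: Decimal) -> Decision:
        """Decide on a proposed payment before any funds move."""
        policy = self.policies.get(agent_id)
        spent = self.spent.get(agent_id, Decimal("0"))
        if policy is None:
            decision, reason = Decision.DENY, "no policy for agent"
        elif amount <= 0:
            decision, reason = Decision.DENY, "amount must be positive"
        elif merchant not in policy.allowed_merchants:
            decision, reason = Decision.DENY, "merchant not on allowlist"
        elif spent + amount > policy.budget:
            decision, reason = Decision.DENY, "budget exceeded"
        elif amount > policy.review_threshold:
            # Budget is not reserved here; a human decides first.
            decision, reason = Decision.NEEDS_REVIEW, "above review threshold"
        else:
            decision, reason = Decision.ALLOW, "within policy"
            self.spent[agent_id] = spent + amount
        # Every decision is recorded with its reason, including denials.
        self.audit_log.append({
            "agent": agent_id, "merchant": merchant,
            "amount": str(amount), "decision": decision.value, "reason": reason,
        })
        return decision
```

With a budget of 100 and a review threshold of 50, a 25 USDC payment to an allowlisted merchant would be allowed, a 60 USDC payment would be routed to review, and any payment to an unlisted merchant would be denied. The sketch uses `Decimal` rather than floats so that budget arithmetic does not accumulate rounding error.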

The term “financial harness” matters because it forces a clean separation between two questions that teams keep conflating: What does the agent know, and should the agent be allowed to act?

The first is a model problem. The second is a harness problem. Treating them as the same is how teams end up with capable agents they can't audit, can't explain, and ultimately can't scale.

Agents are in production. They are moving real USDC, calling real APIs, and incurring real costs right now. The teams that get the financial control layer right from the start are the ones that can actually scale without surprises.

If your agent is spending, you need policies that run before funds move.

You need a financial harness.

Get started with ampersend: https://app.ampersend.ai.
