CFO Playbook
Build or Buy: The New Finance Tech Question CFOs Aren't Ready For

Simone Rueschenberg & Ellen Köhler
June 11, 2026

We’re Ellen and Simone. After 36 years in finance, we’re ready to share what textbooks won’t tell you.

💛 Welcome to CFO Playbook – your practical finance insights delivered bi-weekly. The full read will take approximately 5 minutes. Like what you see? Share it!

READ OF THE WEEK

Six months ago, the buy-vs-build question for finance teams was about budget. Today it's about judgment.

An analyst with Claude Code can ship in an afternoon what a three-person dev team used to deliver in a quarter. CFOs are quietly approving builds because the tools make it easy, then discovering three months later that "easy to build" is not the same as "easy to maintain."

Software industry rule of thumb: maintenance over a typical lifecycle costs 2-4x the original build. The tool that looked free in the demo isn't.
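To make the rule of thumb concrete, here is a minimal arithmetic sketch. The 2,000 build cost is a made-up figure for illustration, and the function name is our own; the only input taken from the text above is the 2-4x maintenance multiple.

```python
from decimal import Decimal


def lifecycle_cost_range(build_cost: Decimal,
                         low_multiple: Decimal = Decimal("2"),
                         high_multiple: Decimal = Decimal("4")) -> tuple:
    """Return (low, high) lifecycle cost: the build plus maintenance at 2-4x the build."""
    low = build_cost + build_cost * low_multiple
    high = build_cost + build_cost * high_multiple
    return low, high


# Hypothetical example: an "afternoon build" valued at 2,000 in loaded cost.
low, high = lifecycle_cost_range(Decimal("2000"))
print(f"Lifecycle cost: {low} to {high}")  # Lifecycle cost: 6000 to 10000
```

On these assumptions, the build itself is a third or less of what the tool costs over its life.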

The wrong question: should we let people build? Everyone already is.

The right question: where does building actually make sense and where does it just feel productive?

In this Read of the Week

1. The Build Option Is Now Real

2. Two Risks CFOs Keep Confusing

3. Buy the Spine, Build the Edges

4. "But We Have a Developer Now". Why That Changes Less Than You Think.

1. The Build Option Is Now Real

A year ago, "build it yourself" in finance meant a VBA macro or a brave intern with Python. Today, an analyst with Claude Code can ship a working tool in an afternoon: variance commentary generators, vendor invoice triage, monthly close helpers, dashboards that used to take a contractor six weeks. It's not theoretical. It's happening in your team, with or without your sign-off.

The CFO question used to be: do we have the budget for new software? The question now is: do we even need the software, or can we build it? That's a genuinely different conversation, and most finance teams don't have a framework for it yet.

💡 The next 30 days aren't about whether to allow building. They're about drawing a clear line for when it makes sense and when it doesn't. The rest of this newsletter is that line.

2. Two Risks CFOs Keep Confusing

The case against vibe-coded finance tools usually arrives as one anxious blur.

"Is it safe?" It isn't one question. It's two, with two different fixes.

Risk 1: The tool gives different answers on the same data.

This is an output reliability problem. Your analyst's AI close helper might compute one number on Monday and a different one on Tuesday, not because the data changed, but because the AI underneath is probabilistic. It guesses the most likely answer based on patterns. Fine for drafting commentary. Not fine for anything with a right answer.

The fix is architectural: the AI doesn't do the math; it calls a calculator. The AI is the friendly receptionist. It understands what you want and writes the narrative. The math happens in the back office, on a pre-written calculation engine that always behaves the same way. Pigment, for example, openly acknowledges that LLM-driven functionality carries implicit risk "owing to the non-deterministic nature of the technology" and describes its AI agents as leveraging a separate calculation engine that handles the math.
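For the builders on your team, the receptionist-and-back-office split can be sketched in a few lines. This is a minimal illustration under our own assumptions, not how Pigment or any other vendor implements it: the function names are hypothetical, and the actual call to a language model is deliberately left out. The point is only that the numbers are computed by ordinary code and handed to the model as finished facts.

```python
from decimal import Decimal, ROUND_HALF_UP


def variance(actual: Decimal, budget: Decimal) -> dict:
    """The 'back office': plain arithmetic, so the same inputs always give the same outputs."""
    absolute = actual - budget
    pct = None  # a percentage is undefined for a zero budget, so we do not guess one
    if budget != 0:
        pct = (absolute / budget * 100).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
    return {"actual": actual, "budget": budget, "absolute": absolute, "pct": pct}


def build_commentary_prompt(line_item: str, result: dict) -> str:
    """The 'receptionist' input: the model gets finished figures and is asked only for narrative."""
    return (
        f"Write one sentence of variance commentary for {line_item}. "
        f"Use these figures exactly and do not recalculate: "
        f"actual {result['actual']}, budget {result['budget']}, "
        f"variance {result['absolute']} ({result['pct']}%)."
    )


# Hypothetical figures for illustration.
result = variance(Decimal("112500"), Decimal("100000"))
prompt = build_commentary_prompt("Marketing spend", result)
print(prompt)
```

Whatever the model writes around them, the 12500 and the 12.5% come from the calculation function and can be tested like any other code.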

What does not fix this: stacking a second AI on top to "review" the first. Two probabilistic tools can confidently agree on a wrong answer, and their agreement makes the error look more trustworthy than it is.

One more nuance, because the better builders will raise it: AI tools like Claude Cowork can write the calculation script for you, and the script then runs deterministically. That's real progress. But it moves the risk rather than removing it. The script can be deterministically wrong. Two analysts asking the same AI for the same script can get two slightly different scripts that produce two different answers. And the script still has to be maintained as your business changes, which usually means AI rewriting it, which is probabilistic again. Determinism at runtime isn't the same as correctness, consistency, or stability over time.
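Here is a small sketch of how two reasonable-looking scripts can disagree. The scenario is hypothetical: both compute VAT on the same three invoice lines, one rounding each line and one rounding the total. The 19% rate, the amounts, and the choice of which convention is "expected" are assumptions for illustration; which one is correct depends on your policy and jurisdiction.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")
VAT_RATE = Decimal("0.19")  # illustrative rate, not a recommendation


def vat_round_each_line(net_amounts):
    """Script A: round the VAT on every line, then add up."""
    return sum(
        ((n * VAT_RATE).quantize(CENT, rounding=ROUND_HALF_UP) for n in net_amounts),
        Decimal("0"),
    )


def vat_round_total(net_amounts):
    """Script B: add up the net amounts, then round the VAT once."""
    total_net = sum(net_amounts, Decimal("0"))
    return (total_net * VAT_RATE).quantize(CENT, rounding=ROUND_HALF_UP)


invoice = [Decimal("10.03")] * 3
script_a = vat_round_each_line(invoice)  # 1.91 per line, 5.73 in total
script_b = vat_round_total(invoice)      # 19% of 30.09 is 5.7171, rounded to 5.72

# A pinned expectation, agreed once with whoever owns the policy (assumed here).
EXPECTED = Decimal("5.72")
assert script_b == EXPECTED
print(script_a, script_b)  # 5.73 5.72
```

Both scripts are deterministic and both run without errors. Only a pinned, human-agreed expectation tells you which one belongs in your close.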

The CFO test: ask your builder, "When this tool calculates the number, is the AI doing the math, or is it calling a separate engine that someone wrote?" If the answer is "the AI does it," that's the probabilistic risk.

Risk 2: The code itself is insecure.

The tool can give the right answer every time, on every input, and still have hardcoded passwords, broken authentication, or vulnerabilities sitting inside it. Why? Because the AI that wrote the code learned from public training data full of security flaws and reproduced them.

The data:

A 2025 Carnegie Mellon University study found that while 61% of AI-generated code functions correctly, only 10.5% passes a basic security review.
Veracode tested over 100 AI models on security-sensitive coding tasks and found 45% of AI-generated code samples introduce serious vulnerabilities - a rate that has not improved through early 2026, despite vendor claims to the contrary.

Your finance analyst can't spot these. Neither can another AI. This is security review work. It needs someone trained to evaluate code, not someone trained to evaluate financials.
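To show what one of these flaws looks like, here is a minimal sketch of the hardcoded-password problem mentioned above and a common alternative. The variable name ERP_API_PASSWORD and the function are hypothetical, and reading from the environment is only one option; it does not replace a proper security review.

```python
import os

# Anti-pattern that often appears in generated code (shown as a comment only):
# ERP_PASSWORD = "Summer2026!"   # anyone with access to the file now has the password


def get_erp_password(env=os.environ) -> str:
    """Read the credential from the environment instead of storing it in the source file."""
    password = env.get("ERP_API_PASSWORD")  # hypothetical variable name
    if not password:
        # Fail loudly rather than fall back to a default that ends up in version control.
        raise RuntimeError("ERP_API_PASSWORD is not set.")
    return password


# Demonstration with a stand-in environment; in real use, call get_erp_password().
demo_env = {"ERP_API_PASSWORD": "example-only"}
password = get_erp_password(demo_env)
```

A tool can pass every functional test with the commented-out line in place, which is why this kind of issue slips past a finance reviewer.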

Sources: Pigment, "How the Modeler Agent gives stretched teams a new, scalable operating model," April 2026; Carnegie Mellon University, 2025; Veracode AI Code Security Report, 2026.

3. Buy the Spine, Build the Edges

Picture this. Your FP&A team uses Workday Adaptive. An analyst pitches you a vibe-coded variance commentary agent built on top. Claude Code, two weeks, looks brilliant in the demo. Your instinct is to say yes. Should you?

Maybe. But step back one level first. Building on top of a legacy planning platform might be the wrong starting point. AI-native FP&A tools like Drivetrain, Pigment, or Aleph already have variance commentary built in, governed, and updated by a vendor whose job is to maintain it. Sometimes the right buy-vs-build call isn't between buying a tool and building on top. It's between building on top and replacing the base.

Here's the principle that makes calls like this easier, a distinction Secret CFO sharpened recently and one worth keeping: workflows that read from your systems of record carry very different risk than workflows that write back to them.

Your systems of record are where the official data lives: your ERP for the GL, your HRIS for headcount, your CRM for the sales pipeline, your billing system for what customers were charged. If two systems disagree, the system of record wins. That's the version that goes to the auditor.

Read-only workflows: dashboards, commentary, board-pack automation. Errors are visible, recoverable, contained. Build freely.
Write-back workflows: journal entries, payment runs, master data updates. Errors enter the bloodstream. Every downstream report becomes wrong. You may end up restating. Three controls: buy from a vendor with a SOC report, use deterministic logic, and keep a human in the loop for anything that touches the system of record. That last one isn't going away even as the tools get better - agents writing back to finance systems unsupervised is the 20% of work where humans still belong.
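Two of the three write-back controls above, deterministic logic and a human in the loop, can be sketched as a simple gate in front of the ledger. Everything here is a hypothetical illustration: the JournalEntry class, the account names, and the in-memory list standing in for the ERP are our assumptions, not any vendor's API.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class JournalEntry:
    description: str
    lines: list  # (account, debit, credit) tuples
    approved_by: Optional[str] = None


def is_balanced(entry: JournalEntry) -> bool:
    """Deterministic check: total debits must equal total credits."""
    debits = sum((d for _, d, _ in entry.lines), Decimal("0"))
    credits = sum((c for _, _, c in entry.lines), Decimal("0"))
    return debits == credits


def post_to_ledger(entry: JournalEntry, ledger: list) -> None:
    """Write-back gate: refuse anything unbalanced or lacking a named human approver."""
    if not is_balanced(entry):
        raise ValueError("Debits and credits do not balance.")
    if not entry.approved_by:
        raise PermissionError("Write-back requires a named human approver.")
    ledger.append(entry)


ledger = []  # stand-in for the system of record
entry = JournalEntry(
    "Accrual for June audit fee",
    [("6800 Audit fees", Decimal("4000"), Decimal("0")),
     ("2100 Accrued liabilities", Decimal("0"), Decimal("4000"))],
)

try:
    post_to_ledger(entry, ledger)  # an agent proposes, but nobody has approved yet
except PermissionError as exc:
    blocked_reason = str(exc)

entry.approved_by = "Controller (hypothetical)"
post_to_ledger(entry, ledger)  # now the entry is written
```

A read-only dashboard needs none of this. The gate only earns its keep where an error would otherwise reach the system of record.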

The shorthand: buy the spine, build the edges. Buy the systems of record. Build the analytics layer that sits on top.

Two honest caveats.

First: this principle is going to soften. As AI-native systems get better at controlled write-backs (audit trails, approval loops, deterministic guardrails baked in) the bright line will move.

Second: with serious engineering capability - a real developer, real security review, deterministic guardrails on every write - you can do more than this principle suggests. The principle above is a default, not a ceiling.

💡 For every internal build, look at who finds the error. If it's "someone notices the dashboard looks weird": green light. If it's "the auditor finds it": stop, you're building in the wrong layer.

4. "But We Have a Developer Now". Why That Changes Less Than You Think.

The natural objection to everything so far: we're not vibe-coding blind. We've hired a developer. Fair. And yes, it helps. But not as much as it looks.

What a developer genuinely fixes:

Code review and security guardrails - the Section 2 risks become manageable
Architecture that fails safely and scales gracefully
The probabilistic-to-deterministic split inside the tool (the calculator in the back office) - your analyst won't think to do this. A developer will.
Maintenance and version control from day one, not as an afterthought

What a developer does not fix:

The buy-vs-build question itself. A developer's instinct will be "we can build this." That's their craft. Can build is not should build, and that judgment is still yours.
The domain expertise gap. A developer doesn't know multi-entity consolidation rules, FX accounting edge cases, or the seventeen ways VAT can go wrong. They'll happily ship a consolidation engine that misses three things your auditor will find first.
The maintenance commitment. The developer who builds it eventually leaves. Vendor software has a vendor on the hook. Your custom build has whoever you can hire next. Good luck finding someone who wants to inherit undocumented code from someone who left six months ago.
The scaling bottleneck. One developer can build maybe 3-5 production tools a year well, with maintenance. So you've moved the bottleneck from "can we build?" to "how many bets can we maintain?" Different constraint. Still a constraint.

So adding a developer means you can build more, and build it more safely. It doesn't mean you should build more. It means the strategic question is now sharper, not easier.

💡 Every internal build is an attention bet. If it isn't the highest use of your finance team's time this quarter, buy it, borrow it or skip it. "We have a developer who can do it" is not, by itself, a reason to build.

Bottom Line

The pattern we keep seeing in CFOs who navigate this well:

They've stopped treating "build vs. buy" as a procurement question and started treating it as a strategic one.
They split output reliability from cybersecurity. Two risks, two fixes. They don't let "is it safe?" stay one anxious question.
They buy the spine, build the edges. The system of record is bought from a vendor with a SOC report. The analytics layer that sits on top is fair game.
They treat "we have a developer now" as a license to build more safely, not a license to build more.
They protect their own attention. They don't drift into being a software vendor when their job is to steer the business.

The trap isn't building too much or too little. It's drifting into building because the tools make it easy, without ever choosing to.

One last thing: AI capability is moving exponentially. Some of what we've written here will look dated in six months: the tools will be better, the security gap may narrow, the cost of building will drop again. The principles (judgment, attention, system-of-record discipline) should hold. The tactics won't. Treat it accordingly.

🔎 CFO Watchlist

Anthropic launches Claude Fable 5, its most capable model

Anthropic released Claude Fable 5, a new top-tier model that is state-of-the-art on nearly all tested benchmarks. For finance specifically, it posts the highest score of any model on Hebbia's Finance Benchmark for senior-level reasoning, with big gains in document-based reasoning and chart and table interpretation.

Why CFOs should care: Reading messy documents and pulling numbers out of charts and tables is core finance work. A model that leads on exactly those tasks is a step toward AI that can handle real analytical workflows, not just drafting. It ships with conservative safeguards that occasionally reroute sensitive queries, a reminder that capability and control are now being shipped together.

The finance AI race is moving from "can it write" to "can it reason over our documents."

Why AI in Finance often underdelivers

New BCG analysis argues the real constraint in finance AI isn't capability, it's readiness. Their rule of thumb: only about 10% of AI success comes from the models, 20% from the tech platform, and 70% from organization, workforce, and skills. The common failure isn't a weak model; it's that the data lacks context and processes were never consistent enough to automate.

Why CFOs should care: It reframes where to spend. Automating a fragmented process just "scales fragmentation." The teams seeing returns picked one domain that mattered, harmonized its data, connected live feeds and reinvested early gains rather than waiting for a perfect data migration. And governance belongs in the design from day one: data lineage, escalation rules, and "AI proposes, human disposes" as the starting posture.

The race in finance AI isn't deploying first, it's building the foundations that make it pay.

🌐 Finance Collective DACH

Your go-to CFO Network (by Simone)

Join a curated network for:

Exchange on AI, Finance Tech & all CFO topics
Expert sessions & peer groups
Regular meetups across Berlin, München, Zürich, Frankfurt, Köln, Hamburg, soon Stockholm & Vienna (Q3 launch)

Grow with your peers. Stay ahead.

CLOSING REMARKS

Thanks for reading 💛

Send your feedback, suggestions, or requests to feature something in future editions to [email protected]. We’d love to include your input.

CFO Playbook reflects our personal opinions, not professional advice.