The AI investment framework nobody uses

Six questions, a scoring system, and the restraint to follow what the numbers say.

The best AI investment most businesses will make is the one they decide not to make.

That sounds like contrarian advice from someone who sells AI advisory. It isn't. It's what the numbers say, every time. And it starts with one question: what does the owner complain about on a Monday morning?

That question is the whole framework compressed to one line. Pain is the dimension nobody scores and everyone feels. Score it honestly and half the "opportunities" drop off the list before you've sharpened a pencil. That's the "don't build" list, and it saves more than anything you build.

But you only get that list if you evaluate AI the right way.

Most businesses evaluate AI like a technology purchase. What does it do? How much does it cost? Can we get a demo? Reasonable questions for buying a printer. Wrong questions for a capital investment.

AI, when it works, changes how a business operates. It redirects labour, shifts decision-making, alters what's possible at a given headcount. That's not a software subscription — that's a structural change. It should be evaluated like one.

Six dimensions separate good AI investments from expensive distractions. None of them are about the technology. And one of them matters more than the rest.

The one that overrides arithmetic: pain

Pain is the dimension nobody scores and everyone feels.

A function that scores moderately on every other dimension, but is the thing the owner complains about every Monday morning — that one gets built. Because the person who writes the cheques will champion it. Without that champion, even high-scoring projects stall at month four.

Conversely: a project the spreadsheet loves but nobody in the business cares about will never ship. It'll pass every gate review and die quietly in the last mile.

So the first question I ask isn't about volume, or cost, or data. It's: what does the owner complain about on a Monday? That's where to look first. Score the rest against that one.

The other five

Can you measure the output?

The hard constraint. If the function you're automating doesn't have a clear pass/fail — if the quality of the output is a matter of opinion — you cannot prove the system works. No amount of enthusiasm overrides this.

A compliance assessment either meets the building code or it doesn't. An invoice either matches the purchase order or it doesn't. A support response either resolves the ticket or it doesn't. These are measurable. "Write better marketing copy" is not. If you can't define what good looks like before you build, stop here.

How much judgment is involved?

Some functions are mechanical — identical every time. A script handles those. You don't need AI. At the other end, some functions require years of experience per decision. AI can assist, but it can't replace the judgment.

The sweet spot is in the middle — work that follows a methodology but where inputs vary. That's where AI earns its keep.

Is the data already digital?

If the information the function needs lives in someone's head, in handwritten notes, or in formats no system can read — you have a data capture problem, not an AI problem. Fix that first. It'll be clearer what AI can do once the data is clean.

AI built on bad data doesn't just produce bad results. It produces confident bad results, which is worse.

Volume and cost.

Together these are the arithmetic. Eight compliance assessments a month at six hours each is 576 hours a year of senior time. The ROI calculation writes itself. Two a quarter is 48 hours a year, and the system costs more than it saves. The number that matters is the fully loaded cost (hourly rate plus the work you're turning away because the bottleneck exists), not the subscription fee.
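The arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not a costing model: the function and parameter names are hypothetical, and only the volumes and hours come from the examples in the text. The loaded hourly cost and the system cost are inputs you would supply yourself.

```python
def annual_hours(jobs_per_period, periods_per_year, hours_per_job):
    """Hours of senior time the function consumes in a year."""
    return jobs_per_period * periods_per_year * hours_per_job


def net_annual_value(hours, loaded_hourly_cost, annual_system_cost):
    """Value of the hours freed, minus what the system costs to run.

    loaded_hourly_cost is assumed to include both the hourly rate and
    the work turned away because of the bottleneck.
    """
    return hours * loaded_hourly_cost - annual_system_cost


# The two cases from the text
monthly_case = annual_hours(8, 12, 6)    # eight a month, six hours each
quarterly_case = annual_hours(2, 4, 6)   # two a quarter, six hours each

print(monthly_case)    # 576
print(quarterly_case)  # 48
```

Feeding both hour counts through net_annual_value with your own cost figures shows why volume decides the case: the system cost stays roughly fixed while the hours it frees differ by a factor of twelve.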

Both can be scored in an afternoon. They are commodity dimensions that don't need much airtime, but the evidence either confirms the case or kills it.

How the scoring works

[Figure: radar chart of a candidate scored 12 out of 30 across six dimensions, with a hard stop flagged on Measurability at score 1.]

Each dimension scores 1 to 5. Total out of 30.

20+: High priority. Design the system.
14–19: Check the flags. One low score in the wrong place kills the case.
Under 14: Put it on the "don't build" list.

The flags matter more than the total. Measurability at 1 or 2 is a hard stop regardless of everything else. Judgment at 1 or 2 means a script will do — you're over-engineering with AI. Pain at 5 means the owner will champion it even if the total is moderate.
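The bands and flags can be written down as a short Python sketch. The class and function names are hypothetical, and the code reports the flags next to the band rather than resolving conflicts between them, since that call belongs to whoever is scoring. The example candidate totals 12 with Measurability at 1; its other five scores are made up for illustration.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class Scores:
    """One candidate function, each dimension scored 1 to 5."""
    pain: int
    measurability: int
    judgment: int
    data: int
    volume: int
    cost: int

    def __post_init__(self):
        for name, value in vars(self).items():
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")


def assess(s):
    """Return the total, its band, and any flags that apply."""
    total = sum(astuple(s))
    if total >= 20:
        band = "high priority"
    elif total >= 14:
        band = "check the flags"
    else:
        band = "don't build"

    flags = []
    if s.measurability <= 2:
        flags.append("hard stop: output is not measurable")
    if s.judgment <= 2:
        flags.append("a script will do")
    if s.pain == 5:
        flags.append("owner will champion it")

    return {"total": total, "band": band, "flags": flags}


# Illustrative scores only: total 12, Measurability 1
candidate = Scores(pain=2, measurability=1, judgment=3, data=2, volume=2, cost=2)
result = assess(candidate)
print(result["total"], result["band"], result["flags"])
```

Keeping the flags as a separate list reflects the point that they matter more than the total: a candidate in the top band with a Measurability flag should still be read as a stop.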

What to do with what's left

Every business I look at has functions that score well on one or two dimensions but fall apart on the others. The natural instinct is to focus on what scored high. The discipline is what you do with the rest.

The "don't build" list.

Six questions, a scoring system, and the restraint to follow what the numbers say. No technology required to start using it. You can score your own operations this afternoon.

If the numbers point somewhere interesting, then we talk about what to build.

Karl Howard · Reforged · 14 April 2026