Why do most small-business AI projects fail?

They are judged on a polished demo instead of being built like a product. A demo runs on clean input and the happy path, while production meets average, messy, and worst-case input at volume. Projects survive when they target one specific job, plan for the model being wrong, and keep a person accountable after launch.

How do you choose the right first AI use case?

Start from a specific, painful job and the number it should move, not from a wish to add AI. If you cannot name the number the AI is meant to change, such as hours saved or response time, you have a demo waiting to happen rather than a project.

What should an AI feature do when the model is wrong?

It should know its limits. A reliable feature admits uncertainty, falls back to a safe answer, or hands off to a person, instead of confidently inventing something false. Ask anyone building it to show what the feature does when it does not know.

Who maintains an AI feature after it launches?

Someone has to keep watching the running cost, read what users actually asked, and adjust as the business and the models change. AI features are not ship-and-forget; without an owner they drift, costs creep, and edge cases start failing silently.

Why does owning your code matter for an AI build?

If the work lives in someone else's account, behind their API key and infrastructure, you have a rental that stops the day they do. Owning the code and the keys means the tool keeps working even if the working relationship ends.

Why most small-business AI projects die in week two

The demo is the easy 10 percent. Why small-business AI projects stall after the first two weeks, and the handful of choices that make one actually stick.

Robin Solanki, Web & AI engineering

9 Jun 2026, 5 min read

The demo always works. That is the problem.

A small-business AI project almost never dies because the model can't do the job. It dies about two weeks after the demo that proved it could, when the magic trick meets a real Tuesday and nobody planned for the boring 90 percent in between. The model didn't get worse. The conditions got real.

The short version: AI projects survive when they are built like products, not demos. That means picking one real job with a number attached, designing for the times the model is wrong, and keeping one person accountable for it after launch. Skip any of those three and the pilot quietly stalls, and six weeks later nobody mentions it.

Why does the demo always work?

Because a demo is a curated best case. It runs on clean input, a friendly question, and the happy path, usually typed by the person who built it and knows exactly what it likes to hear.

Production is the other three cases: the average input, the messy input, and the worst input, ten thousand times a week, typed by real people who don't know or care how the thing works. They paste a whole email. They ask two questions at once. They misspell the product name. The demo never sees any of that, so it never breaks. Your customers find all of it in the first afternoon.

This is why so many AI pilots get sold and judged on the strength of one impressive meeting, then fall over when the average Tuesday shows up. The demo was real. It just wasn't the project.

The mistake is almost never the model

The instinct is "let's add AI." That is a technology shopping for a job, and it leads straight to a clever feature nobody asked for.

The projects that stick start from the opposite end: a specific, painful job, and a number it should move. The inbox that eats two hours every morning. The quote that takes a day to turn around when it could take a minute. The support questions that are the same five, asked a hundred different ways. Pick the job first, then ask whether AI is even the right tool for it (sometimes it isn't, and a good engineer will tell you so).

Here is the test: if you can't name the number the AI is supposed to move, you don't have a project yet. You have a demo waiting to happen.

What happens when the model is wrong?

This is the single biggest difference between a toy and a tool, and you don't need to read any code to judge it.

A toy says something confident and wrong, then leaves your customer to discover it. A tool knows its limits. It says "I'm not sure, let me get a person." It falls back to a safe, boring answer instead of inventing a brilliant fake one. It hands off cleanly. Most of the engineering I do on an AI feature is not the clever part; it is this part, the careful design of what happens on the model's bad day.

So when anyone proposes an AI build, yours or a vendor's, ask one question and watch their face: show me what it does when it doesn't know. If the answer is a shrug, you have just found exactly where the project will die.
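For readers who do want to see the shape of it, here is a minimal sketch of the "know its limits" behaviour described above. It is an illustration under stated assumptions, not a definitive implementation: `ModelReply`, `Outcome`, `respond`, and the `CONFIDENCE_FLOOR` value are hypothetical names, and the `confidence` score is assumed to come from your own checks (retrieval hits, validation rules, a second pass), since most models do not hand you a trustworthy one.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold; in practice you would tune it against real transcripts.
CONFIDENCE_FLOOR = 0.75

# The safe, boring answer used whenever the feature is not sure.
SAFE_ANSWER = "I'm not sure, let me get a person."


@dataclass
class ModelReply:
    text: str
    confidence: float  # assumed 0.0-1.0 score produced by your own checks


@dataclass
class Outcome:
    message: str
    handed_off: bool  # True when a person needs to pick this up


def respond(reply: Optional[ModelReply]) -> Outcome:
    """Pass the model's answer through only when it clears the floor."""
    # No reply at all (timeout, error) or an empty one: fall back and hand off.
    if reply is None or not reply.text.strip():
        return Outcome(SAFE_ANSWER, handed_off=True)
    # A low-confidence reply is treated the same way as no reply.
    if reply.confidence < CONFIDENCE_FLOOR:
        return Outcome(SAFE_ANSWER, handed_off=True)
    return Outcome(reply.text, handed_off=False)
```

The point of the sketch is that the fallback path is ordinary code you can read and test, so "show me what it does when it doesn't know" has a concrete answer.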

Who owns it on Tuesday?

The pilot ends, the contractor moves on, and the thing sits there slowly drifting. The cost creeps up a little with each release. An edge case starts failing silently. The model version changes underneath it and the tone shifts. And there is no one whose actual job it is to notice any of that.

AI features are not ship-it-and-forget-it. They need a person who watches the running cost, reads what users actually typed last week, and keeps the thing sharp as the business and the models both move. That is unglamorous work, and it is the work that separates a tool that compounds from a pilot that rots.

It is also why owning your code matters more here than almost anywhere. If the build lives in someone else's account, behind their API key, on their infrastructure, you don't have a tool. You have a rental that stops the day they do. Everything I build goes in your repository, yours from the first commit.

What a project that survives looks like

Before you greenlight any AI build, you should be able to answer six questions. Make whoever is building it answer them too.

One job. What specific task does this do, and what number should it move?
The wrong case. What happens when the model doesn't know, or gets it wrong?
The honest state. Does the interface ever lie to a user, or does it admit uncertainty?
The bill. Who sees the running cost, and what stops it quietly tripling?
The owner. Whose job is it to keep this working in three months?
The exit. If we stop working together, do you still own all of it?

If a proposal can't answer those six, it is a demo with a budget, not a project.
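The bill question in particular can be answered in a few lines. The sketch below is one possible approach, not a prescribed one: `CostGuard`, `daily_budget`, and `warn_ratio` are hypothetical names, the per-request cost is assumed to be something you already calculate from your provider's usage figures, and a real version would reset the total each day and send the alert to the owner instead of storing it in a list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CostGuard:
    daily_budget: float        # hypothetical cap, in your billing currency
    warn_ratio: float = 0.8    # warn the owner at 80 percent of the cap
    spent_today: float = 0.0
    alerts: List[str] = field(default_factory=list)

    def record(self, cost: float) -> None:
        """Add one request's cost and raise a single alert on crossing the warn line."""
        warn_level = self.daily_budget * self.warn_ratio
        was_below = self.spent_today < warn_level
        self.spent_today += cost
        if was_below and self.spent_today >= warn_level:
            self.alerts.append(
                f"AI spend at {self.spent_today:.2f} of {self.daily_budget:.2f} today"
            )

    def allow_request(self) -> bool:
        """Stop calling the model once the day's budget is used up."""
        return self.spent_today < self.daily_budget
```

A short usage example under the same assumptions:

```python
guard = CostGuard(daily_budget=10.0)
guard.record(5.0)   # under the warn line, no alert
guard.record(4.0)   # crosses 80 percent, one alert is recorded
if not guard.allow_request():
    pass  # fall back to the safe answer instead of calling the model
```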

The boring part is the product

None of this is about a cleverer model. The exciting part, the part that demos so well, is maybe a tenth of the work. The other nine tenths is the fallbacks, the honest error states, the cost alarm, and the person who stays. That nine tenths is the whole reason your AI saves you real hours next quarter instead of becoming the pilot everyone quietly stopped talking about.

That nine tenths is the part I actually build. If you want to see the difference, there is a live assistant on my homepage pretending to run a plumbing business. Ask it what a customer would, the awkward questions included, and watch how it behaves when it isn't sure.

And if there is a job in your own business that you think AI should be doing, tell me about it. I'll tell you honestly whether it is a real project or just a good demo.