
A BlueAlly Field Guide

Two decisions.
Get them right.

Most AI initiatives fail before a line of code is written. They pick the wrong tool for the work, or build what they should have bought. This guide is the two forks in the road — which capability fits the problem, and how to engage — answered plainly.

Conquer Complexity

One problem. Two forks in the road. Which capability: Rules, ML, RAG, Tune, Agent. How to engage: Build, Buy, Partner.


01  The stakes

The hard part is not building. It is choosing.

In 2025, MIT studied 300 enterprise AI deployments. Ninety-five percent of the pilots showed no measurable impact on the bottom line [1]. The models were fine. The choices around them were not.

Two choices cause most of the damage, and both come before any real work begins. The first is the wrong capability — reaching for a large language model when a simple rule would do, or asking a chatbot to do arithmetic it cannot do. The second is the wrong engagement — building a system you could have bought in a week, or buying a commodity where your edge should have been.

This guide is those two forks, drawn plainly. Answer them well and the build is the easy part. Answer them badly and no model on earth will save the project.

In plain English

Five words to carry both frameworks

Rules
Plain "if this, then that" logic a person writes by hand. No learning, no model. The cheapest tool, and often the right one.
Classic ML
A model that learns to predict from labelled history — forecasts, scores, anomaly flags. Narrow, fast, and proven.
Retrieval (RAG)
Finding the right pages in your own documents and handing them to a language model before it answers. An open-book exam.
Fine-tuning
Nudging an existing model on your own examples so it adopts a tone, a format, or a narrow skill. For behaviour, not facts.
Agent
A model that works in a loop (plans, calls a tool, reads the result, decides what to do next) until the job is done [6].
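The agent definition above can be sketched in a few lines. This is a minimal illustration, not a production design: the two tools, the account name, and the `scripted_model` stand-in are all hypothetical, and a real system would replace the stand-in with a language model call.

```python
from typing import Callable

# Hypothetical tools the loop may call; real ones would hit databases or APIs.
TOOLS: dict[str, Callable[[str], str]] = {
    "lookup_renewal": lambda account: f"{account} renews on 2025-03-31",
    "search_notes": lambda account: f"{account}: two unresolved support tickets",
}


def scripted_model(goal: str, history: list[tuple[str, str]]) -> tuple[str, str]:
    """Stand-in for a language model: returns (action, argument).

    It walks a fixed plan so the example runs offline. A real model would
    choose the next action from the goal and the history so far.
    """
    plan = ["lookup_renewal", "search_notes"]
    if len(history) < len(plan):
        return plan[len(history)], goal
    return "finish", "; ".join(result for _, result in history)


def run_agent(goal: str, max_steps: int = 5) -> str:
    """Plan, call a tool, read the result, decide again, until done."""
    history: list[tuple[str, str]] = []
    for _ in range(max_steps):
        action, argument = scripted_model(goal, history)  # plan / decide
        if action == "finish":
            return argument
        result = TOOLS[action](argument)                  # act
        history.append((action, result))                  # observe
    return "stopped: step budget exhausted"


print(run_agent("Acme Mutual"))
```

The step budget is the one design choice worth noting: a loop with no ceiling is a loop that can run forever.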

02  The worked example

One problem, carried through both forks.

Abstractions persuade no one. So we take one real problem and run it through both frameworks, start to finish. Watch it choose a capability, then choose an engagement.

The problem, stated plainly

"Tell us which contracts renew this quarter, and which look at risk."

A mid-market insurer. Three thousand active contracts in a SQL system. Renewal dates are clean. "At risk" is not — the signal lives in emails, call notes, and support tickets, in language no column captures. Leadership wants a short list each Monday, with the evidence attached.

We will return to this problem at every fork. By the end it has a capability and an engagement — and a reason for each.

Hold it in mind. It is not exotic. It is the shape of most enterprise AI: a little structured data, a lot of unstructured text, a decision a human still signs.

03  Framework one · The capability decision

First fork: do you even need AI?

The most expensive mistake in this field is using a large model where a small one — or no model at all — would do better. Start at the bottom. Reach up only when the work demands it.

There is an order to these tools, and it runs from cheap and certain to powerful and loose. Rules. Classic machine learning. Retrieval. Fine-tuning. Agents. Each step adds capability, and each adds cost, latency, and ways to be wrong. The discipline is simple to say and hard to hold: choose the smallest system that does the job reliably. For the mechanics behind tokens, vectors, and retrieval, see How the Machine Reads.
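To make the bottom rung concrete, here is a sketch of a rule of the kind this guide has in mind: flag any invoice over $10,000 from a new vendor. The `Invoice` fields and the 90-day definition of "new" are assumptions made for illustration.

```python
from dataclasses import dataclass

AMOUNT_LIMIT = 10_000    # dollars; the threshold in the example rule
NEW_VENDOR_DAYS = 90     # assumption: "new" means onboarded under 90 days ago


@dataclass
class Invoice:
    vendor: str
    amount: float
    vendor_age_days: int


def needs_review(invoice: Invoice) -> bool:
    """Flag any invoice over the limit that comes from a new vendor."""
    return invoice.amount > AMOUNT_LIMIT and invoice.vendor_age_days < NEW_VENDOR_DAYS


print(needs_review(Invoice("Northwind", 12_500, vendor_age_days=14)))
```

No training data, no latency, and anyone can audit it. That is what "cheap and certain" looks like.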

Add complexity only when the work demands it. Never because it is exciting.

04  The capability decision tree

Five questions. One route.

Walk the tree top to bottom. The first "yes" that holds is your answer. Most enterprise problems settle in the first three rows — and that is good news, not a disappointment.

Start with the work. Take the first “yes” that holds.

1. Is the logic fixed and writable by hand? (e.g. "flag any invoice over $10,000 from a new vendor")
   YES: Rules. No model. No training.
2. Is it a prediction from labelled history? (forecast demand · score churn · detect an anomaly)
   YES: Classic ML. Train on your numbers.
3. Does it need facts from your documents? (answer from policies, contracts, tickets, with citations)
   YES: RAG. Retrieve, then answer.
4. Does it need a fixed voice, format, or skill? (always this tone · this JSON shape · this narrow style)
   YES: Fine-tune. Behaviour, not facts.
5. Does it span many steps, tools, and decisions? (plan → act → observe → decide, with approvals and rollback)
   YES: Agent. A governed loop.

A single model call, well prompted. Most "AI features" stop here. Start here. Add nothing you do not need.

Our example lands here. Renewal dates → SQL (a rule). "At risk" → RAG over emails and notes. The loop that joins them → an agent.
Fig. 1 — The capability decision tree. Walk top to bottom; take the first “yes” that holds. Rules and classic ML answer more enterprise problems than the hype admits. Our example needs three lanes at once — which is exactly why it ends up as an agent.
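The tree reads naturally as a function: ask the five questions in order and return at the first "yes". The sketch below assumes that reading. The field names are illustrative, and treating a single model call as the fallback when every answer is "no" is our interpretation of the figure.

```python
from dataclasses import dataclass


@dataclass
class Work:
    """Answers to the five questions, in tree order. Field names are illustrative."""
    logic_fixed_and_writable: bool = False
    prediction_from_labelled_history: bool = False
    needs_facts_from_documents: bool = False
    needs_fixed_voice_or_format: bool = False
    spans_many_steps_and_tools: bool = False


def choose_capability(work: Work) -> str:
    """Walk the tree top to bottom and take the first 'yes' that holds."""
    checks = [
        (work.logic_fixed_and_writable, "Rules"),
        (work.prediction_from_labelled_history, "Classic ML"),
        (work.needs_facts_from_documents, "RAG"),
        (work.needs_fixed_voice_or_format, "Fine-tune"),
        (work.spans_many_steps_and_tools, "Agent"),
    ]
    for answer, capability in checks:
        if answer:
            return capability
    return "Single model call"  # assumed fallback when no question holds


# The renewals problem, taken one part at a time.
parts = {
    "renewal dates": Work(logic_fixed_and_writable=True),
    "at-risk signal": Work(needs_facts_from_documents=True),
    "weekly loop": Work(spans_many_steps_and_tools=True),
}
for name, work in parts.items():
    print(f"{name}: {choose_capability(work)}")
```

Applied to each part of a problem separately, the same function yields a different answer per part, which is how one problem can need three lanes.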
Worked example · capability chosen

The renewals problem is not one capability. It is three, joined.

Renewal dates are clean and structured — a SQL query, effectively a rule. "At risk" hides in unstructured text, so it wants RAG: retrieve the relevant emails and notes, let a model read them, cite what it found. Joining the two on a schedule, scoring each account, and filing the evidence is a multi-step job — an agent loop with a human approving any write.

Capability: a governed agent that uses SQL for facts and RAG for meaning. No fine-tuning — we need knowledge, not a new voice.
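A rough sketch of the two lanes joined, under loud assumptions. The table schema, the sample contracts, and the notes are invented. The keyword match is only a stand-in for retrieval plus a model reading the text; a real build would retrieve by meaning and let a model judge and cite. The output is a read-only list, so nothing here needs an approval step.

```python
import sqlite3
from datetime import date

# Hypothetical schema and data; dates are ISO text, so string comparison sorts correctly.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contracts (id INTEGER PRIMARY KEY, account TEXT, renewal_date TEXT);
    INSERT INTO contracts VALUES
        (1, 'Acme Mutual', '2025-02-14'),
        (2, 'Birch Casualty', '2025-03-30'),
        (3, 'Cedar Life', '2025-07-01');
""")

# Invented call notes, keyed by account.
NOTES = {
    "Acme Mutual": ["Client asked about cancellation terms.", "Invoice paid on time."],
    "Birch Casualty": ["Happy with onboarding."],
}
RISK_TERMS = ("cancel", "competitor", "complaint")  # crude stand-in for a model's judgment


def quarter_bounds(today: date) -> tuple[str, str]:
    """Return [start, end) of the calendar quarter containing today, as ISO strings."""
    first_month = 3 * ((today.month - 1) // 3) + 1
    start = date(today.year, first_month, 1)
    if first_month == 10:
        end = date(today.year + 1, 1, 1)
    else:
        end = date(today.year, first_month + 3, 1)
    return start.isoformat(), end.isoformat()


def renewals_this_quarter(today: date) -> list[str]:
    """The structured lane: a plain SQL query, effectively a rule."""
    start, end = quarter_bounds(today)
    rows = conn.execute(
        "SELECT account FROM contracts "
        "WHERE renewal_date >= ? AND renewal_date < ? ORDER BY renewal_date",
        (start, end),
    )
    return [account for (account,) in rows]


def evidence_for(account: str) -> list[str]:
    """The unstructured lane: return notes that mention a risk term."""
    return [n for n in NOTES.get(account, []) if any(t in n.lower() for t in RISK_TERMS)]


def monday_list(today: date) -> list[dict]:
    """Join the lanes: each renewing account, with its evidence attached."""
    result = []
    for account in renewals_this_quarter(today):
        evidence = evidence_for(account)
        result.append({"account": account, "at_risk": bool(evidence), "evidence": evidence})
    return result


for row in monday_list(date(2025, 2, 3)):
    print(row)
```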

05  The test for each technique

How to know you have reached far enough.

Each rung on the ladder has a plain test. If the test passes, stop climbing. The table below is the tree in words, with the trap that sends teams one rung too high.

| Capability | Use it when… | Cost & speed | The trap |
| --- | --- | --- | --- |
| Rules | The logic is fixed and a person can write it down. | Lowest. Instant. | Brittleness as edge cases pile up. Then, and only then, learn. |
| Classic ML | You predict a number or a label from labelled history. | Low. Milliseconds. | Reaching for an LLM to forecast. It will be slower, dearer, and worse. |
| RAG | The answer lives in your documents and must be cited [4]. | Moderate. Pennies a query. | "Train the model on our data", when you meant retrieve it. |
| Fine-tune | You need a fixed tone, format, or narrow skill, not new facts [4]. | Higher. Days, then ~6× inference [4]. | Fine-tuning for knowledge. It cannot be updated, cited, or access-controlled. |
| Agent | The work spans many steps, tools, and decisions [6]. | Highest. Seconds, many calls. | An agent where one prompt would do. Loops fail in more ways. |

One more honesty: the best production systems often combine these. RAG brings the facts. A light fine-tune fixes the voice. A careful prompt orchestrates both [4]. The tree tells you where to start, not the only place you may end.

So what: pick the lowest rung that passes its test. Every rung you skip upward buys cost and fragility you will pay for in production.

06  Framework two · The engagement decision

Second fork: build, buy, partner, or wait.

You know the capability. Now, who builds it, and on whose clock? This is not a cost question alone. It is a question of strategic differentiation and how much of your edge sits inside the workflow this system will run [5].

Two forces decide it. Across the top: strategic differentiation — does this capability win you customers, margin, or speed your rivals cannot copy? Or is it table stakes everyone needs and no one brags about? Down the side: internal capability — do you have the data, the talent, and the appetite to own it, or would you be learning on the job at production stakes?

Buy the commodity. Build the edge. Partner to learn. Wait when the ground is still moving.

07  The engagement 2×2

Where your initiative sits decides the play.

Place the initiative on two axes. The quadrant names the play, its risks, and the tell that you are in the right box. The hybrid path (buy the platform, build the edge on top) now wins for most enterprises [5].

Axes: strategic differentiation across, from low (commodity) to high (competitive edge); internal capability down the side.

BUY (commodity). High capability, low differentiation. You could build it, but it wins you nothing. Rent it. Spend your talent where it counts. Watch lock-in.
BUILD (your moat). High capability, high differentiation. This is your edge and you can own it. Build it, control it, and let no vendor sit between you and your advantage.
WAIT (park it). Low capability, low differentiation. No edge to win, no strength to win it with. Do not pilot for fashion. Park it, watch the market, revisit next cycle.
PARTNER (build & transfer). Low capability, high differentiation. The prize is real but the muscle is not yet yours. Build it with a partner who transfers the skill, then take the wheel.

Our example starts in Partner, then moves up.
Fig. 2 — Build, buy, partner, or wait. Two axes — differentiation across, capability down — settle the engagement. The pin is our example: a partner-built agent that transfers skill, then moves up into Build as the team takes ownership.
Build

Own your advantage

High differentiation · high capability

The capability is your edge and your team can carry it. Build it and keep control. This is where AI earns margin a rival cannot copy.

The tell: if a competitor bought the same vendor tool, would you still be ahead? If no, build.

The risk: building the commodity by mistake — sinking scarce talent into plumbing the market already sells.

Buy

Rent the commodity

Low differentiation · high capability

You could build it, but it wins you nothing. Buy it, integrate it, and move your best people to the work that does differentiate.

The tell: a mature market with several credible vendors and a clean integration path.

The risk: lock-in. In one survey, 94% of firms report concern about it [3]. Keep your data portable and own the prompts.

Partner

Build it with help

High differentiation · low capability

The prize matters but the muscle is not yours yet. Bring in a partner who builds and transfers the skill — then take the wheel.

The tell: you can name the advantage clearly but cannot yet staff it credibly.

The risk: permanent dependence. Insist on knowledge transfer and an exit, not a forever retainer.

Wait

Hold, and watch

Low differentiation · low capability

No edge to win and no strength to win it with. Do not pilot for fashion. Park it, watch the market, and revisit next planning cycle.

The tell: the only reason to start is that others have. That is not a reason.

The risk: waiting on something that is actually your edge. Re-test against the differentiation axis before you park it.
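The four plays reduce to two yes-or-no judgments. A minimal sketch, assuming each axis is scored from 0 to 1 and split at a midpoint; the scores and the threshold are illustrative devices, not part of the framework, and the judgment behind each score is the hard part.

```python
def choose_engagement(differentiation: float, capability: float, threshold: float = 0.5) -> str:
    """Map a position on the two axes to a play.

    Both scores are assumed to run from 0 (low) to 1 (high).
    """
    high_differentiation = differentiation >= threshold
    high_capability = capability >= threshold
    if high_differentiation and high_capability:
        return "Build"    # your edge, and you can own it
    if high_capability:
        return "Buy"      # you could build it, but it wins you nothing
    if high_differentiation:
        return "Partner"  # the prize is real, the muscle is not yet yours
    return "Wait"         # no edge to win, no strength to win it with


# A differentiating initiative, before and after the team builds its capability.
print(choose_engagement(differentiation=0.8, capability=0.2))
print(choose_engagement(differentiation=0.8, capability=0.7))
```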

Worked example · engagement chosen

The renewals agent is differentiating, but the muscle is new. So: partner, then own.

Knowing which accounts churn, before they do, is a real edge for an insurer: high differentiation. But the team has never shipped a governed agent: low capability, today. That is the Partner quadrant. Build it with a partner who instruments adoption, transfers the patterns, and leaves a team that can run it. As the muscle grows, the work moves up into Build. The platform underneath (the vector store, the observability) is bought. Hybrid by design [5].

Engagement: partner-built and skill-transferred, on bought infrastructure, heading for in-house ownership.

08  The economics

What it costs to choose wrong.

The forks are not academic. They show up on the invoice and the calendar. Three numbers from the field set the stakes.

95%
of enterprise GenAI pilots showed no measurable P&L impact: a failure of integration and choice, not of models [1].
~2×
Tools bought from vendors succeed about twice as often as internal builds. Buy the commodity; do not relearn it [1].
21%
of firms using GenAI have redesigned a workflow. High performers are about 3× more likely to have done so: the edge is in the work, not the tool [2].
When buy beats build, and when it flips. Axes: total cost against usage volume, low to high. Buy: pay per use. Build: a fixed cost (team, data, platform) plus a low marginal cost. Crossover: below it, buy wins; above it, build pays back. At low volume, buying is cheaper and faster. No team to hire. No platform to stand up.
Fig. 3 — The build-vs-buy crossover. Buying carries no fixed cost but a per-use price that climbs with volume. Building costs a fortune up front, then almost nothing per unit. One rule of thumb puts the crossover near a million conversations a year: below it, buy; above it, build [3]. Curves are illustrative; the shape is not.
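The crossover is simple arithmetic: it is the volume at which build's fixed cost has been repaid by its lower per-unit price. A sketch with invented prices, chosen only so the answer lands on the one-million rule of thumb. Amounts are in cents to keep the arithmetic exact.

```python
def crossover_volume(build_fixed_cents: int, buy_per_unit_cents: int, build_per_unit_cents: int) -> float:
    """Volume at which total buy cost equals total build cost.

    buy:   buy_per_unit * volume
    build: build_fixed + build_per_unit * volume
    """
    saving_per_unit = buy_per_unit_cents - build_per_unit_cents
    if saving_per_unit <= 0:
        raise ValueError("no crossover: building never becomes cheaper per unit")
    return build_fixed_cents / saving_per_unit


def cheaper_option(volume: int, build_fixed_cents: int, buy_per_unit_cents: int, build_per_unit_cents: int) -> str:
    """Compare total cost at a given annual volume."""
    buy_total = buy_per_unit_cents * volume
    build_total = build_fixed_cents + build_per_unit_cents * volume
    return "buy" if buy_total <= build_total else "build"


# Hypothetical figures: $500,000 a year to own, $0.55 per use bought, $0.05 per use built.
FIXED, BUY, BUILD = 50_000_000, 55, 5
print(crossover_volume(FIXED, BUY, BUILD))
print(cheaper_option(200_000, FIXED, BUY, BUILD))
print(cheaper_option(3_000_000, FIXED, BUY, BUILD))
```

Change any of the three inputs and the crossover moves. The rule of thumb is a starting point; your own prices are the answer.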

So what: volume and differentiation, together, decide it. Low volume or low edge favours buy and partner. High volume and real edge is where building pays, in money and in advantage. And the hybrid (buy the platform, build the edge on top) is now the majority choice, because it lets you spend where it counts and rent where it does not [5].

09  The close

Two forks. One path. A reason for each turn.

We started with one problem and ran it through both frameworks. It chose a capability, then an engagement, and it can defend both.

Capability — a governed agent.

SQL for the renewal facts. RAG for the at-risk signal hidden in text. An agent loop to join them, score each account, and file the evidence — with a human approving every write. No fine-tuning; the job needs knowledge, not a new voice.

Engagement — partner, then own.

Differentiating enough to build, but the muscle is new — so partner to build and transfer the skill, on bought infrastructure, heading for in-house ownership. The hybrid path, chosen on purpose.

That is the whole method: pick the smallest capability that does the job, then engage in the quadrant your edge and your strength put you in. The diagrams settle the easy cases in a minute. The hard ones — what is truly your edge, what to govern, what to automate, and what to leave alone — are judgment. That judgment is what BlueAlly brings to the table.


10  Sources

Where this comes from

Every figure above is drawn from a named study or primary source. Cost curves and quadrant placements are illustrative frameworks; the survey numbers are not.

  1. MIT NANDA (Challapally et al.), "The GenAI Divide: State of AI in Business 2025" (95% of pilots no P&L impact; vendor tools succeed ~2× internal builds; 52 interviews, 153 surveys, 300 deployments). healthcareitnews.com/news/mit-95-enterprise-ai-pilots-fail
  2. McKinsey & Company, "The state of AI in 2025" (88% regular AI use; ~21% have redesigned workflows; high performers ~3× more likely). mckinsey.com/quantumblack/the-state-of-ai
  3. DigitalApplied, "Enterprise AI Agents 2026: Build vs Buy" (94% report vendor lock-in concern; ~1M conversations/year crossover heuristic). digitalapplied.com/blog/enterprise-ai-agent-build-vs-buy-2026
  4. InterSystems, "RAG vs Fine-Tuning vs Prompt Engineering" (use RAG for knowledge, fine-tune for behaviour; combine in production; ~6× inference for fine-tunes). intersystems.com/resources/rag-vs-fine-tuning-vs-prompt-engineering
  5. Composio, "Build vs. buy AI agent integrations: a 2026 decision framework" (differentiation & data control; hybrid majority — buy the platform, build the edge). composio.dev/content/build-vs-buy-ai-agent-integrations
  6. Anthropic, "Building Effective Agents" (an agent is a model that plans, acts, observes, and decides in a loop). anthropic.com/research/building-effective-agents