
AI Product Strategy for Fintechs: Build, Buy, or Partner?

The build-buy-partner question used to be a procurement decision. With AI, it is a positioning decision, a regulatory decision, and a trust decision — all at once.

TL;DR

Most fintech product leaders are still asking the wrong AI question. The decision is not whether to add AI. It is which layer of the stack you build, which you buy, and which you partner for — and whether you can defend that choice to a model risk officer at a $20B bank. Build the proprietary data and evaluation layer. Buy the foundation model. Partner where distribution or domain expertise sits with someone else. Anything that calls itself AI-native without an evaluation harness, a model card, and a clear answer on SR 11-7 will lose deals it should have won.

Fintechs have always been forced to make build-buy-partner calls earlier than other software companies. AI hasn't changed the question. It's changed the cost of getting it wrong.

The teams shipping AI features in 2026 are sorting themselves into two groups. One group has a defensible position, an evaluation strategy, and a clear regulatory story. The other group has a press release and a demo. Both groups look the same on a landing page. They do not look the same to a model risk officer.

The decision framework: build, buy, or partner per AI feature

Treat build-buy-partner as a per-feature decision, not a company-wide stance. A single product can ship five AI features and make a different call on each. The question to ask, for each feature in isolation, is where the durable advantage actually sits.

Build when the feature depends on proprietary data you already have, the evaluation criteria are specific to your domain, and you can sustain the engineering cost across at least three model generations. Buy when the capability is commoditizing fast, the vendor has already amortized the model and infrastructure cost across hundreds of customers, and your differentiation is somewhere else in the workflow. Partner when distribution or domain authority belongs to a category leader and the integration is deep enough to be hard to unwind.

Most fintechs overestimate what they can build and underestimate the durability of a good partnership.

When foundation models suffice, when fine-tuning earns its keep, and when you actually need your own pipeline

The technical layer of the decision is where I see the most confusion. Three options keep getting conflated.

First, calling a foundation model through an API. This is right for the majority of fintech AI features in 2026. A frontier model from Anthropic, OpenAI, or Google paired with retrieval-augmented generation (RAG) over your own documents will solve 70 to 80 percent of the use cases product leaders are scoping. The cost per call has dropped roughly 90 percent in two years. The latency has dropped enough to support real-time member-facing flows.
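The API-plus-retrieval pattern is simpler than the branding around it suggests. Here is a minimal sketch; the document names, the naive keyword-overlap scoring, and the prompt wording are all illustrative, and a production pipeline would use embedding-based retrieval and send the assembled prompt to the model provider's API rather than printing it.

```python
def retrieve(query: str, docs: dict[str, str], k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query (stand-in
    for real embedding-based retrieval)."""
    q_terms = set(query.lower().split())
    scored = sorted(
        docs.items(),
        key=lambda kv: len(q_terms & set(kv[1].lower().split())),
        reverse=True,
    )
    return [text for _, text in scored[:k]]

def build_prompt(query: str, docs: dict[str, str]) -> str:
    """Assemble the grounded prompt that would go to a hosted model API."""
    context = "\n---\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical member-facing documents, for illustration only.
docs = {
    "fee_schedule": "wire transfer fees are 25 dollars for domestic wires",
    "card_policy": "debit card replacement takes 7 business days",
}
prompt = build_prompt("what are wire transfer fees", docs)
# In production this string is sent to the provider's chat endpoint;
# the model never sees documents the retriever did not select.
print(prompt)
```

The point of the sketch is where the leverage sits: the model is rented, but the retrieval corpus, the ranking, and the prompt assembly are yours to evaluate and improve.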

Second, fine-tuning. Fine-tuning earns its keep when the task is narrow, the format is rigid, and you have a few thousand high-quality labeled examples that represent the distribution you actually see in production. Most fintechs do not have that data ready. They have logs, tickets, and PDFs. Turning those into a labeled training set is the work, and it is usually 80 percent of the project budget.

Third, building your own pipeline — your own RAG architecture, your own evaluation harness, your own prompt and tool stack on top of someone else's model. This is the layer where defensibility lives. This is also the layer fintechs underinvest in because it doesn't demo well.

A useful rule of thumb. If your AI feature can be replicated by a competent engineer with three weeks and an API key, you have not built a moat. You have built a feature.

Regulatory implications: SR 11-7, CFPB circulars, and the model risk flow-down

Fintechs sometimes assume model risk regulation lives with their bank or credit union customers. That assumption breaks the moment your AI influences a decision the institution is accountable for.

SR 11-7, the Federal Reserve's model risk management guidance, has been the operating standard for banks since 2011. When a $5B community bank evaluates your AI feature, the third-party risk team and the model risk team apply SR 11-7 expectations to you through the vendor management process. They will ask for a model card, evaluation metrics on representative data, monitoring and recalibration plans, and a clear description of how human oversight works. If your answer is that the model is provided by a third party, the next question is what evaluation you have run on top of it for your specific use case.

The CFPB has been increasingly direct in recent circulars. Adverse-action notices generated or influenced by AI must still meet the explainability standards under ECOA and Regulation B. A black-box score with a generic reason code does not clear that bar. The NIST AI Risk Management Framework, while voluntary, is rapidly becoming the lingua franca that risk teams use to structure questions, and the EU AI Act is forcing the same conversations in cross-border deployments.

The point is not that AI is over-regulated. The point is that the regulation is real, the questions are predictable, and the fintechs that have answers ready close deals faster.

How buyers actually evaluate AI claims

I've sat through a lot of AI vendor pitches to banks and credit unions over the last 24 months. The pattern is consistent.

The CIO or chief digital officer takes the first meeting. They are interested. They invite three or four colleagues to the second meeting, and that is when the questions change. Information security wants the SOC 2 Type II report and the data flow diagram. The third-party risk team wants the financial statements, the BCP, and the subprocessor list. Model risk wants the evaluation methodology, the test set, and a description of failure modes. Compliance wants to know how adverse actions, privacy, and recordkeeping are handled.

If those answers are not ready before the second meeting, you have already lost six weeks of cycle time. If they are not ready before the third meeting, you have probably lost the deal.

The fintechs winning in this environment have built a deal-readiness packet. Model card. Evaluation report. Data flow diagram. Hallucination mitigation strategy. Human-in-the-loop design. Subprocessor list. SOC 2 Type II. They send it in advance, and it changes the energy of every subsequent conversation.

The trap of AI-native branding without substance

"AI-native" has become this cycle's "cloud-native." The label is doing a lot of work, and most of the time it is doing the wrong work.

Sophisticated buyers — and almost every buyer in financial services is now a sophisticated buyer — read AI-native as a tell. If the substance is there, the marketing leads with the outcome. If the substance is not there, the marketing leads with the model.

The teams I see winning describe their product in terms of what it does, what it improves, and how that improvement is measured. They cite a specific lift on a specific metric on a specific cohort. Underwriting auto-approval rate up 14 points on near-prime applications. Member service first-contact resolution up from 62 percent to 78 percent. They do not lead with the model name. They lead with the evidence.

Branding is not a substitute for an evaluation harness.

The four to six questions every fintech product leader should answer

Before shipping any AI feature, force a clean answer to each of the following. If you can't answer one of them, you are not ready, regardless of what your roadmap says.

  • What decision does this AI influence, and who is accountable for it? If a regulator asks a bank or credit union "who decided this," the answer cannot be "the model."
  • What does passing look like? Define the evaluation harness, the test set, and the metric thresholds before you start building. Not after.
  • What happens when the model is wrong? Document the failure modes, the human-in-the-loop design, and the rollback path. Test them.
  • Where does customer data go, and how long is it retained? Spell out training-data use, residency, and subprocessor flow. Buyers will ask in the second meeting.
  • Which regulator, framework, or guidance governs this use case? Map to SR 11-7, the NIST AI RMF, applicable CFPB guidance, state laws, and the EU AI Act where relevant.
  • What is the human override? Especially for credit, fraud, BSA/AML, and member-facing decisions, the override path needs to be designed, not retrofitted.
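The second question, "what does passing look like," can be made concrete in a few lines. This is a sketch only: the metric names, thresholds, and labels below are illustrative, not a standard, but the shape is the point — thresholds are written down before building, and a release is blocked unless every metric clears its bar.

```python
# Thresholds defined up front, before any model work starts.
THRESHOLDS = {
    "exact_match": 0.90,   # minimum share of test cases answered correctly
    "refusal_rate": 0.02,  # maximum share of cases wrongly refused
}

def evaluate(predictions: list[str], labels: list[str]) -> dict[str, float]:
    """Score predictions against a held-out labeled test set."""
    n = len(labels)
    matches = sum(p == l for p, l in zip(predictions, labels))
    refusals = sum(p == "REFUSE" for p in predictions)
    return {"exact_match": matches / n, "refusal_rate": refusals / n}

def gate(metrics: dict[str, float]) -> bool:
    """Pass only if accuracy clears its floor and refusals stay under the cap."""
    return (
        metrics["exact_match"] >= THRESHOLDS["exact_match"]
        and metrics["refusal_rate"] <= THRESHOLDS["refusal_rate"]
    )

# Toy test set: ten labeled decisions, nine predicted correctly, no refusals.
labels = ["approve"] * 5 + ["deny"] * 5
predictions = labels[:9] + ["approve"]  # one miss on the final "deny"
metrics = evaluate(predictions, labels)
print("release gate:", gate(metrics))  # prints: release gate: True
```

A real harness adds fairness slices, cost and latency budgets, and regression tracking across model versions, but even this skeleton gives model risk reviewers something the press release cannot: a documented definition of passing.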

Anonymized examples of right and wrong calls

A Series-B fintech in the SMB lending space spent eight months training a custom credit model on three million applications. The model was 1.8 percent better than a frontier model with a well-tuned RAG pipeline on the same data. They could not deploy the custom model into bank partners because the documentation, monitoring, and recalibration plan didn't meet SR 11-7 expectations. They eventually shipped the API-based version, won three bank deals, and quietly retired the custom model. The wrong call was building. The right call surfaced after they ran the evaluation honestly.

A growth-stage fintech in the payments space did the opposite. They bought an off-the-shelf AI fraud-decisioning tool to ship faster. Within six months, the buyer pattern shifted, the vendor's model degraded on their distribution, and they had no levers to pull because they had no evaluation harness of their own. They eventually rebuilt the evaluation layer in-house and kept the third-party model — the right architecture from the start would have saved them roughly nine months and a customer they lost during the degradation.

A mid-market wealth platform partnered with a category-defining model provider for a co-branded research summarization feature. Two years in, the partnership has produced more pipeline than any feature they built alone. Distribution and brand were the missing piece, and the partnership filled it durably. Right call.

What this means for fintech product leaders in 2026

The AI product strategy decision is no longer about whether to ship AI. It is about whether you can defend the architecture, the evaluation, and the risk story in front of the people who actually buy.

Build the layer where you have the data and the domain. Buy the layer where the market has commoditized faster than you can. Partner where distribution or authority is held by someone else. And in every case, build the evaluation harness and the deal-readiness packet before you build the marketing site.

The fintechs that win the next 24 months will not be the loudest about AI. They will be the ones whose answers hold up under the second meeting.