In-house AI-native build vs. Web agency

The bottom line, up front

On fully-loaded cash, the portfolio comparison is closer than it first looks — so cost should set the magnitude of the decision, not its direction. The honest case for building in-house rests on three things, in this order: strategic control, a proven cheap front-end layer, and a cost edge that compounds at portfolio scale and over time.

Where in-house genuinely wins

Control & data privacy — we own code, data & pipeline; sensitive customer/order data stays in-house.
Proven front-end — the marketing/homepage layer already works (see Proof), at near-zero marginal cash.
Compounding — one toolchain across 4 brands; the cash gap widens with brands, years & structural changes.

Where the agency genuinely wins

One bundled vendor with a fixed delivery date & a 12-month warranty (transferred risk).
Established process for the hard parts — payments, 300-SKU modeling, data & SEO migration.
No draw on internal capacity; best for a single one-off site.

The two honest caveats

Only the marketing layer is proven. The transactional store (checkout, catalog, payments, migration) is unproven, not impossible — it must be piloted first.
Internal time isn't truly free. If the team has spare capacity, the cash cost is low; if the work displaces revenue-generating work, count it. We show both.

Interactive cost model

The in-house result is shown as a range: the lower bound treats internal staff as spare capacity (true cash cost), the upper bound prices their time at a loaded day-rate (opportunity cost).

Scope & horizon

Brands in scope 4

TrichoLab · SkinLab · SL Clinic · Prologue

Time horizon 3 years

Design resourcing (we add this — agency bundles it)

Per-brand freelancer for mockups — lowest commitment.

Contractor $/brand $4,500

Internal effort (the in-house range)

Operator day-rate $250

Loaded cost of internal staff time, used for the upper (opportunity-cost) bound.

Build days / brand 18

Ongoing days / yr (portfolio) 24

Maintenance + structural changes across all brands.

In-house running costs

Tooling $/yr $3,810

Claude Max + Paper.design + Figma seats.

SEO-migration specialist contingency

~$3,000 per high-traffic brand (in-house). Agency bundles its own SEO migration.

Shared & agency assumptions

Data migration $/brand $1,500

Customer/order history (WooCommerce→Shopify). Applied to both options.

Agency price / brand $18,421

Verz Q46440 incl. GST (build + SEO migration + quiz + 12-mo maint).

Multi-brand discount 15%

An agency reusing its system across 4 sites would typically discount.

Structural changes / brand / yr 8

Template/new-page work. Routine content is self-serve in the admin in both options.
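The arithmetic behind the model can be sketched in a few lines of Python. This is an illustrative reconstruction, not the page's actual code. The defaults are the figures listed above, plus the $600/day rate and 10 free service-credit hours quoted in the agency terms. Three inputs are hypothetical and not stated in the source: `high_traffic_brands`, `hours_per_change` and `hours_per_day`. The agency change-fee formula and the treatment of brand 1 in the hybrid are also assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assumptions:
    # Defaults taken from the assumptions listed in this section.
    brands: int = 4
    years: int = 3
    contractor_per_brand: float = 4_500
    operator_day_rate: float = 250
    build_days_per_brand: float = 18
    ongoing_days_per_year: float = 24        # portfolio-wide
    tooling_per_year: float = 3_810
    seo_contingency: float = 3_000           # per high-traffic brand, in-house only
    data_migration_per_brand: float = 1_500  # applied to both options
    agency_price_per_brand: float = 18_421
    multi_brand_discount: float = 0.15
    changes_per_brand_year: float = 8
    agency_day_rate: float = 600
    free_credit_hours_per_brand: float = 10  # Year 1 only
    # Hypothetical inputs, NOT stated in the source:
    high_traffic_brands: int = 2
    hours_per_change: float = 4
    hours_per_day: float = 8


def agency_build(a: Assumptions, brands: int) -> float:
    """Agency build price; the discount is assumed to apply only to multi-brand deals."""
    discount = a.multi_brand_discount if brands > 1 else 0.0
    return brands * a.agency_price_per_brand * (1 - discount)


def agency_change_fees(a: Assumptions) -> float:
    """Structural-change hours billed over the horizon, net of Year-1 credits."""
    hourly = a.agency_day_rate / a.hours_per_day
    hours = a.brands * a.changes_per_brand_year * a.hours_per_change
    credits = a.brands * a.free_credit_hours_per_brand
    total = 0.0
    for year in range(1, a.years + 1):
        billable = hours - credits if year == 1 else hours
        total += max(billable, 0.0) * hourly
    return total


def in_house(a: Assumptions, build_brands: int, seo_brands: int, loaded: bool) -> float:
    """Cash costs, plus staff time at the day-rate when loaded is True."""
    cost = (build_brands * a.contractor_per_brand
            + a.years * a.tooling_per_year
            + seo_brands * a.seo_contingency)
    if loaded:
        days = build_brands * a.build_days_per_brand + a.years * a.ongoing_days_per_year
        cost += days * a.operator_day_rate
    return cost


def compare(a: Assumptions) -> dict:
    """Return horizon totals; in-house and hybrid are (spare-capacity, loaded) pairs."""
    migration = a.brands * a.data_migration_per_brand
    agency = agency_build(a, a.brands) + migration + agency_change_fees(a)
    in_house_range = tuple(
        migration + in_house(a, a.brands, a.high_traffic_brands, loaded)
        for loaded in (False, True)
    )
    # Hybrid: agency builds brand 1 (assumed high-traffic); the remaining
    # brands and all ongoing work are done in-house.
    hybrid_range = tuple(
        agency_build(a, 1) + migration
        + in_house(a, a.brands - 1, max(a.high_traffic_brands - 1, 0), loaded)
        for loaded in (False, True)
    )
    return {"agency": agency, "in_house": in_house_range, "hybrid": hybrid_range}


if __name__ == "__main__":
    for option, cost in compare(Assumptions()).items():
        print(option, cost)
```

Changing `years` or `brands` shows how the gap moves with horizon and portfolio size. The hypothetical inputs should be replaced with values measured on the pilot before any figure from this sketch is quoted.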

Result cards: Agency · In-house (range) · Hybrid (agency builds brand 1, in-house the rest)

Cumulative cost over time

In-house is a band (spare-capacity ↔ fully-loaded). Selected horizon marked.

Legend: Agency · In-house band · Hybrid · Selected horizon

Line-by-line breakdown (at current assumptions)

Item	Agency	In-house (spare cap.)	In-house (loaded)

Designer: contractor vs. retainer vs. FTE

Design is the one cost the in-house path does not save: we still need a UI/UX designer to produce the mockups Claude Code builds from. The three resourcing models, compared for the portfolio of 4 brands:

Model	Year-1 (rollout)	Steady-state / yr	Best when

Recommendation: contractor per brand for the rollout (variable cost, scales with the phased plan); revisit a light retainer only if steady-state change volume is genuinely high. An FTE is justified only when web-design demand is continuous and the hire also covers non-web design work.

Change velocity — but only for the work that needs a developer

Be precise here, because it's easy to overstate. On Shopify, staff self-serve most changes — prices, products, banners, blog posts, promo copy — directly in the admin, in both options, at zero cost. The two models only diverge on structural / template work: new bespoke sections, new page layouts, redesigns.

Agency — structural change

10 free service-credit hours/brand in Year 1, then billed (min 1 hr; $600/day referenced).
Written request → up to 5 working days just to quote it (T&C 6.2).

In-house — structural change

No external fee and no review-cycle wait — staged & published the same day.
Made by staff who know the treatments, promos & brand voice directly.

What's actually proven — and what isn't

The in-house workflow has been de-risked in this repository over 5 days (20–25 Jun 2026), 36 commits, by internal staff + Claude Code. Being precise about scope is what makes this deck trustworthy: the marketing/front-end layer is proven; the transactional store is not yet built.

Proven (built & live)

TrichoLab homepage — 11 modular, merchant-editable sections, image-matched to the comp, on the live staging theme.
Figma → theme handoff — a faithful desktop reproduction of a SkinLab Figma comp in 15 sections.
Reusable component — one shared carousel across sections (an early reuse signal, not yet a full design system).
Two-way GitHub ↔ Shopify sync, with validate_theme + theme check and a staging-before-publish gate.

Not yet built (must be piloted)

Transactional store — checkout, PDP, cart/discount logic, account & order pages (currently stock Dawn — works out of the box, but unproven for us).
300-SKU catalog — variants, options, metafields, collections & filtering (data modeling, not just upload).
Payments (2C2P) — third-party gateway, 3DS/SCA, refunds, failed-payment recovery.
Migration — customer/order data and SEO 301s at scale; the scored Skin Concerns Quiz (today just a CTA button).
Responsive depth — desktop comp is proven; multi-breakpoint behaviour is not.

Read this correctly: "not yet built" ≠ "infeasible." The unbuilt parts are largely Shopify-native configuration + data, not hard custom Liquid. The right move is to prove them on a pilot before committing the portfolio — see the Recommendation.
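For the SEO 301s in the list above, one cheap pre-launch check is to scan the redirect map for loops and multi-hop chains before it is loaded into Shopify. The sketch below assumes the map is available as a plain mapping of old path to new path. The function name `audit_redirects` and the sample paths are hypothetical.

```python
def audit_redirects(redirects: dict[str, str]) -> dict[str, list[str]]:
    """Flag sources that never resolve (loops) or need more than one hop (chains)."""
    loops, chains = [], []
    for source in redirects:
        seen = [source]
        target = redirects[source]
        # Follow the map until we leave it or revisit a path.
        while target in redirects and target not in seen:
            seen.append(target)
            target = redirects[target]
        if target in seen:
            loops.append(source)      # ends up back on a path already visited
        elif len(seen) > 1:
            chains.append(source)     # resolves, but only after several hops
    return {"loops": sorted(loops), "chains": sorted(chains)}


if __name__ == "__main__":
    # Hypothetical sample: WooCommerce-style paths mapped to Shopify-style paths.
    sample = {
        "/product/shampoo/": "/products/shampoo",
        "/shop/": "/collections/all",
        "/old-promo/": "/promo/",
        "/promo/": "/pages/promotions",
        "/a/": "/b/",
        "/b/": "/a/",
    }
    print(audit_redirects(sample))
```

Chains are worth flattening so each old URL points straight at its final destination. Loops must be fixed before cutover, since those URLs never resolve.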

Risks & trade-offs — stated plainly, with mitigations

An honest case names its downsides. The four that matter most are the transactional build, SEO/data migration, payments, and key-person risk — each has an owner and a mitigation.

Risk	Severity	Mitigation
Transactional store & data modeling unproven — homepage proof doesn't transfer to checkout/catalog	Critical	Pilot a full real flow (product → collection → cart → checkout → order email) on TrichoLab before the portfolio is committed. Stock Dawn provides a working baseline to extend.
SEO migration — lost rankings/traffic = lost revenue (WooCommerce permalinks, 301s, redirect loops at scale)	Critical	Ahrefs (licensed) + Screaming Frog crawl → 301 map → GSC monitoring; $3k contingency per high-traffic brand; sequence the highest-risk brand early behind a pre-launch benchmark — not last.
Customer/order data migration — health-adjacent PII, order history, loyalty	High	Use a proven migration app (e.g. Matrixify) + operator validation on the pilot; reconcile counts before cutover. Cost applies to the agency too (not in its quote) — modeled on both sides.
Payments — 2C2P is a third-party SG gateway, not native Shopify Payments	High	Scope as integration + edge-case testing: 3DS/SCA, PayNow/GrabPay, refunds, webhooks, failed-payment recovery, gateway transaction fees. Real test-order sign-off before launch.
Key-person / bus factor — a new internal single point of failure	High	Docs already produced (memory, ADRs, gotchas, handoffs); train & cost a 2nd operator before brand 3; portable Dawn+Shopify+GitHub stack; agency stays a re-engageable fallback.
No transactional QA — linters check syntax, not purchases	High	Mandatory real test-order checklist per brand on staging (add-to-cart → discount → tax/shipping → 2C2P capture → confirmation email) before any Publish. (validate_theme alone is insufficient — our own "validators ≠ renderer" note.)
Incident response: who fixes a broken checkout at 11pm during a promo, across 4 live stores	Medium	Define an on-call path & a Dawn-upgrade policy; keep the agency callable on retainer. Note: re-engaging an agency to fix an in-house build isn't a same-night fix and costs more than a clean build.
Internal capacity / opportunity cost	Medium	Phased rollout; contractor designer offloads design; the upper (loaded) bound in the model prices this explicitly.
Paper.design maturity — new (2026) vendor; outputs HTML/Tailwind, not Liquid (a hand-translation step)	Medium	Unproven in-house so far — Figma is the proven fallback (our repro came from Figma). Free tier keeps cost low; trial on the pilot before relying on it.
Design-system & responsive consistency across 4 brands	Medium	Build a shared token/snippet layer before brand 3; add multi-breakpoint QA; budget real revision cycles (mockups rarely land first pass).
No external SLA / warranty	Medium	Real trade-off: the agency caps consequential liability (standard) but still owes deliverables, 2 UAT rounds, a delivery date & a 12-month fix obligation; in-house carries operational risk with no external remedy. Mitigate with the QA gate + incident plan above.
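The test-order gate in the "No transactional QA" row can be expressed as an all-steps-must-pass check. This is a minimal sketch, assuming results are recorded per brand as a mapping of step name to pass/fail. The step identifiers and `can_publish` are hypothetical names; the steps themselves are the ones listed in that row.

```python
# Steps from the test-order checklist, in purchase order.
CHECKLIST = (
    "add_to_cart",
    "discount",
    "tax_shipping",
    "2c2p_capture",
    "confirmation_email",
)


def can_publish(results: dict[str, bool]) -> tuple[bool, list[str]]:
    """Return (ok, failed_steps); a step with no recorded result counts as failed."""
    failed = [step for step in CHECKLIST if not results.get(step, False)]
    return (not failed, failed)


if __name__ == "__main__":
    staging_run = {step: True for step in CHECKLIST}
    staging_run["2c2p_capture"] = False  # e.g. the gateway test order was declined
    print(can_publish(staging_run))
```

Treating a missing result as a failure keeps the gate strict: a brand cannot be published because a step was skipped rather than passed.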

The real driver: control, governance & data

Because the fully-loaded cash case is close, the decision turns on strategy. The website is where transactions and lead generation happen, which makes it a core business asset for which we want to rely less on outside parties.

What we gain in-house

Control & governance — own the code, pipeline & admin; credentials never sit permanently with a third party.
Data privacy & protection — customer, transaction & lead data stays in-house; fewer external data-handlers (material for a health-adjacent business).
Business knowledge baked in — staff who know the treatments & brand voice make changes directly.
No lock-in — portable standard stack; the agency remains a fallback, not a dependency.
Capability that compounds — the team levels up and reuses work across brands.

What we must accept (no spin)

We take on delivery responsibility the agency would otherwise hold — including a committed launch date, which in-house must self-impose.
We must keep docs current and not let capability concentrate in one person.
We trade a predictable invoice for internal effort that's harder to see on a P&L.
Agencies use AI too, so our speed advantage over them narrows; our durable edge is business knowledge + control, not tooling alone.
The agency closes much of the "knows the business" gap through discovery & UAT — it's an edge, not a chasm.

The hybrid option — honest, but watch the catch

A legitimate middle path the agency would propose: let the agency do the riskiest first build + migration on one brand, then maintain and roll out the rest in-house. It de-risks the unproven parts and gives a committed first launch.

Why it's tempting

Buys a proven transactional build & migration playbook to copy in-house.
A fixed first-launch date and warranty on the hardest brand.
In-house still owns brands 2–4 and all ongoing change velocity.

The catch (state it plainly)

It hands the agency the most sensitive customer/order PII and the highest-lock-in component (payments + migration) — exactly what the control/privacy objective is trying to keep in-house.
At current assumptions, its cost over the horizon falls between full in-house and full agency.
Prefer in-house for the migration if the pilot proves it; use hybrid only if the pilot reveals a genuine capability gap.

Recommendation

Yes — build in-house, but scope it honestly and prove the hard parts first. Cost sets the size of the prize; control, privacy and velocity are why we do it.

Adopt the AI-native workflow now for the marketing/front-end layer — it's proven and cheap.
Run one true pilot on TrichoLab that builds a full transactional flow (real products → collection → cart → checkout → 2C2P test order → order email) and validates a real customer/order + SEO migration. Decide brands 2–4 on the pilot's measured effort & quality.
De-risk migration: $3k specialist per high-traffic brand; sequence the highest-SEO-risk brand early, behind a GSC/Ahrefs benchmark.
Build the guardrails: a real test-order QA gate before every Publish; an incident-response + Dawn-upgrade policy; a 2nd trained operator (costed) and a shared design-token layer before brand 3.
Keep the hybrid & the agency as explicit fallbacks — but prefer in-house for migration/PII if the pilot proves it.

At default assumptions, over 3 years, in-house runs below the agency (lower still if internal time is spare capacity). The gap isn't a blowout, and that's the point: the decision rests on control, data privacy and change velocity, backed by a pilot that retires the real risks before we commit.