Internal & confidential · For leadership review

Revamp the website in-house with AI — or hand it to a web agency?

An objective cost/benefit analysis of an AI-native in-house workflow (UI/UX designer → Figma / Paper mockups → Claude Code + Shopify CLI & MCP) versus engaging a web agency to design & develop the WordPress + WooCommerce → Shopify revamp from scratch, across our 4-brand portfolio.

For Jolene (Director) & Dr Kelvin Chua (Owner) · Benchmark: Verz Design Q46440 (SGD 18,421 / brand) · Currency SGD · GST 9% · Every number is editable: challenge it live

The bottom line, up front

On fully-loaded cash, the portfolio comparison is closer than it first looks — so cost should set the magnitude of the decision, not its direction. The honest case for building in-house rests on three things, in this order: strategic control, a proven cheap front-end layer, and a cost edge that compounds at portfolio scale and over time.

Where in-house genuinely wins

  • Control & data privacy — we own code, data & pipeline; sensitive customer/order data stays in-house.
  • Proven front-end — the marketing/homepage layer already works (see Proof), at near-zero marginal cash.
  • Compounding — one toolchain across 4 brands; the cash gap widens with brands, years & structural changes.

Where the agency genuinely wins

  • One bundled vendor with a fixed delivery date & a 12-month warranty (transferred risk).
  • Established process for the hard parts — payments, 300-SKU modeling, data & SEO migration.
  • No draw on internal capacity; best for a single one-off site.

The two honest caveats

  • Only the marketing layer is proven. The transactional store (checkout, catalog, payments, migration) is unproven, not impossible — it must be piloted first.
  • Internal time isn't truly free. If staff have spare capacity, the cash cost is low; if the project displaces revenue-generating work, count that time. We show both.

Interactive cost model

Drag the assumptions — every figure, chart and table updates live. The in-house result is shown as a range: the lower bound treats internal staff as spare capacity (true cash cost), the upper bound prices their time at a loaded day-rate (opportunity cost).
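
A minimal sketch of how such a range could be computed: the lower bound counts cash outlay only, and the upper bound adds internal days priced at a loaded day-rate. The names `InHouseInputs` and `in_house_range` are illustrative, and every number in the example is a placeholder, not a figure from the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InHouseInputs:
    """Hypothetical inputs for the in-house option (all values in SGD)."""
    brands: int
    cash_per_brand: float            # designer, tools, contingency: real cash out
    internal_days_per_brand: float   # staff days spent on the build
    loaded_day_rate: float           # opportunity cost of one internal day


def in_house_range(inputs: InHouseInputs) -> tuple[float, float]:
    """Return (lower, upper) for the in-house build.

    lower: staff are spare capacity, so only cash is counted.
    upper: the same cash plus internal days at the loaded day-rate.
    """
    cash = inputs.brands * inputs.cash_per_brand
    time_cost = (
        inputs.brands * inputs.internal_days_per_brand * inputs.loaded_day_rate
    )
    return cash, cash + time_cost


# Placeholder numbers, chosen only to show the shape of the calculation.
example = InHouseInputs(
    brands=4,
    cash_per_brand=5_000.0,
    internal_days_per_brand=10.0,
    loaded_day_rate=400.0,
)
lower, upper = in_house_range(example)
```

The gap between `lower` and `upper` is entirely the priced staff time, which is why the spread reduces to one question about capacity.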

Scope & horizon

TrichoLab · SkinLab · SL Clinic · Prologue

Design resourcing (we add this — agency bundles it)

Per-brand freelancer for mockups — lowest commitment.

Internal effort (the in-house range)

Loaded cost of internal staff time, used for the upper (opportunity-cost) bound.
Maintenance + structural changes across all brands.

In-house running costs

Claude Max + Paper.design + Figma seats.
SEO-migration specialist contingency
~$3,000 per high-traffic brand (in-house). Agency bundles its own SEO migration.

Shared & agency assumptions

Customer/order history (WooCommerce→Shopify). Applied to both options.
Verz Q46440 incl. GST (build + SEO migration + quiz + 12-mo maint).
An agency reusing its system across 4 sites would typically discount.
Template/new-page work. Routine content is self-serve in the admin in both options.
Scenarios compared: Agency · In-house (range) · Hybrid (agency builds brand 1, in-house the rest)
In-house range vs agency, over the selected horizon. The spread is one question: does this operator time displace other revenue-generating work? If staff have spare capacity, use the lower figure (true cash). If it pulls them off revenue work, use the upper.

Cumulative cost over time

In-house is a band (spare-capacity ↔ fully-loaded). Selected horizon marked.
Chart legend: Agency · In-house band · Hybrid · Selected horizon
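
The cumulative view can be sketched as an upfront cost plus a flat annual running cost, evaluated once per year for the agency line and for each edge of the in-house band. This is a simplified illustration under stated assumptions: a flat annual cost, a 3-year horizon, and placeholder amounts that are not taken from the model. The function names are hypothetical.

```python
from typing import Optional


def cumulative_costs(upfront: float, annual: float, years: int) -> list[float]:
    """Cumulative spend at the end of each year, starting with year 1."""
    return [upfront + annual * year for year in range(1, years + 1)]


def first_year_below(series: list[float], benchmark: list[float]) -> Optional[int]:
    """First year (1-based) in which `series` is strictly below `benchmark`."""
    for year, (cost, ref) in enumerate(zip(series, benchmark), start=1):
        if cost < ref:
            return year
    return None


HORIZON_YEARS = 3  # assumed horizon for this example

# Placeholder amounts in SGD, for illustration only.
agency = cumulative_costs(upfront=70_000, annual=9_000, years=HORIZON_YEARS)
in_house_low = cumulative_costs(upfront=30_000, annual=4_000, years=HORIZON_YEARS)
in_house_high = cumulative_costs(upfront=76_000, annual=6_000, years=HORIZON_YEARS)

low_crossover = first_year_below(in_house_low, agency)
high_crossover = first_year_below(in_house_high, agency)
```

With these placeholders the spare-capacity edge is below the agency from year 1, while the fully-loaded edge only drops below it in year 3. That is the kind of sensitivity the band is meant to expose.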

Line-by-line breakdown (at current assumptions)

Item | Agency | In-house (spare cap.) | In-house (loaded)

Designer: contractor vs. retainer vs. FTE

Design is the one cost the in-house path does not save — we still need a UI/UX designer to produce the mockups Claude Code builds from. All three resourcing models, for the portfolio of 4 brands:

Model | Year-1 (rollout) | Steady-state / yr | Best when

Recommendation: contractor per brand for the rollout (variable cost, scales with the phased plan); revisit a light retainer only if steady-state change volume is genuinely high. An FTE is justified only when web-design demand is continuous and the hire also covers non-web design work.

Change velocity — but only for the work that needs a developer

Be precise here, because it's easy to overstate. On Shopify, staff self-serve most changes — prices, products, banners, blog posts, promo copy — directly in the admin, in both options, at zero cost. The two models only diverge on structural / template work: new bespoke sections, new page layouts, redesigns.

Agency — structural change

  • 10 free service-credit hours/brand in Year 1, then billed (min 1 hr; $600/day referenced).
  • Written request → up to 5 working days just to quote it (T&C 6.2).
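
To make the agency terms concrete, here is a rough sketch of what one brand's structural requests might cost in Year 1 once the service credits run out. The 10 credit hours, the 1-hour minimum and the $600 day rate come from the terms above. The 8-hour day used to derive an hourly rate, and the idea that hours beyond the credit are billed at that rate, are assumptions for illustration.

```python
FREE_CREDIT_HOURS = 10.0   # per brand, Year 1 (from the agency terms)
DAY_RATE_SGD = 600.0       # day rate referenced in the agency terms
MIN_BILLED_HOURS = 1.0     # minimum charge per request (from the agency terms)
HOURS_PER_DAY = 8.0        # ASSUMPTION: only used to derive an hourly rate


def year_one_change_cost(request_hours: list[float]) -> float:
    """Billable cost (SGD) for one brand's structural requests in Year 1.

    Each request is billed for at least the 1-hour minimum; the first
    10 billed hours are covered by service credits.
    """
    hourly_rate = DAY_RATE_SGD / HOURS_PER_DAY
    billed = sum(max(MIN_BILLED_HOURS, hours) for hours in request_hours)
    chargeable = max(0.0, billed - FREE_CREDIT_HOURS)
    return chargeable * hourly_rate


# Hypothetical request sizes, in hours.
light_year = year_one_change_cost([2.0, 3.0])
busy_year = year_one_change_cost([0.5, 3.0, 4.5, 6.0])
```

Under these assumptions a light year stays inside the credits, and a busier year starts to incur fees after the tenth billed hour.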

In-house — structural change

  • No external fee and no review-cycle wait — staged & published the same day.
  • Made by staff who know the treatments, promos & brand voice directly.

What's actually proven — and what isn't

The in-house workflow has been de-risked in this repository over 5 days (20–25 Jun 2026), 36 commits, by internal staff + Claude Code. Being precise about scope is what makes this deck trustworthy: the marketing/front-end layer is proven; the transactional store is not yet built.

Proven (built & live)

  • TrichoLab homepage — 11 modular, merchant-editable sections, image-matched to the comp, on the live staging theme.
  • Figma → theme handoff — a faithful desktop reproduction of a SkinLab Figma comp in 15 sections.
  • Reusable component — one shared carousel across sections (an early reuse signal, not yet a full design system).
  • Two-way GitHub ↔ Shopify sync, with validate_theme + theme check and a staging-before-publish gate.

Not yet built (must be piloted)

  • Transactional store — checkout, PDP, cart/discount logic, account & order pages (currently stock Dawn — works out of the box, but unproven for us).
  • 300-SKU catalog — variants, options, metafields, collections & filtering (data modeling, not just upload).
  • Payments (2C2P) — third-party gateway, 3DS/SCA, refunds, failed-payment recovery.
  • Migration — customer/order data and SEO 301s at scale; the scored Skin Concerns Quiz (today just a CTA button).
  • Responsive depth — desktop comp is proven; multi-breakpoint behaviour is not.

Read this correctly: "not yet built" ≠ "infeasible." The unbuilt parts are largely Shopify-native configuration + data, not hard custom Liquid. The right move is to prove them on a pilot before committing the portfolio — see the Recommendation.

Risks & trade-offs — stated plainly, with mitigations

An honest case names its downsides. The four that matter most are the transactional build, SEO/data migration, payments, and key-person risk — each has an owner and a mitigation.

Risk | Severity | Mitigation
Transactional store & data modeling unproven: homepage proof doesn't transfer to checkout/catalog | Critical | Pilot a full real flow (product → collection → cart → checkout → order email) on TrichoLab before the portfolio is committed. Stock Dawn provides a working baseline to extend.
SEO migration: lost rankings/traffic = lost revenue (WooCommerce permalinks, 301s, redirect loops at scale) | Critical | Ahrefs (licensed) + Screaming Frog crawl → 301 map → GSC monitoring; $3k contingency per high-traffic brand; sequence the highest-risk brand early behind a pre-launch benchmark, not last.
Customer/order data migration: health-adjacent PII, order history, loyalty | High | Use a proven migration app (e.g. Matrixify) + operator validation on the pilot; reconcile counts before cutover. Cost applies to the agency too (not in its quote), so it is modeled on both sides.
Payments: 2C2P is a third-party SG gateway, not native Shopify Payments | High | Scope as integration + edge-case testing: 3DS/SCA, PayNow/GrabPay, refunds, webhooks, failed-payment recovery, gateway transaction fees. Real test-order sign-off before launch.
Key-person / bus factor: a new internal single point of failure | High | Docs already produced (memory, ADRs, gotchas, handoffs); train & cost a 2nd operator before brand 3; portable Dawn+Shopify+GitHub stack; agency stays a re-engageable fallback.
No transactional QA: linters check syntax, not purchases | High | Mandatory real test-order checklist per brand on staging (add-to-cart → discount → tax/shipping → 2C2P capture → confirmation email) before any Publish. (validate_theme alone is insufficient; see our own "validators ≠ renderer" note.)
Incident response: who fixes a broken checkout at 11pm during a promo, across 4 live stores | Medium | Define an on-call path & a Dawn-upgrade policy; keep the agency on retainer-callable. Note: re-engaging an agency to fix an in-house build isn't a same-night fix and costs more than a clean build.
Internal capacity / opportunity cost | Medium | Phased rollout; contractor designer offloads design; the upper (loaded) bound in the model prices this explicitly.
Paper.design maturity: new (2026) vendor; outputs HTML/Tailwind, not Liquid (a hand-translation step) | Medium | Unproven in-house so far; Figma is the proven fallback (our repro came from Figma). Free tier keeps cost low; trial on the pilot before relying on it.
Design-system & responsive consistency across 4 brands | Medium | Build a shared token/snippet layer before brand 3; add multi-breakpoint QA; budget real revision cycles (mockups rarely land first pass).
No external SLA / warranty | Medium | Real trade-off: the agency caps consequential liability (standard) but still owes deliverables, 2 UAT rounds, a delivery date & a 12-month fix obligation; in-house carries operational risk with no external remedy. Mitigate with the QA gate + incident plan above.
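
The SEO-migration risk mentions redirect loops at scale. One way to catch them before launch is to walk the 301 map offline and flag any source that loops or needs more than one hop. The sketch below assumes the map is already available as a plain old-path to new-path dictionary; the helper names and the sample paths are hypothetical, not taken from any of our sites.

```python
def resolve(redirects: dict[str, str], start: str) -> tuple[str, int, bool]:
    """Follow `start` through the map; return (final_path, hops, is_loop)."""
    seen = {start}
    current = start
    hops = 0
    while current in redirects:
        current = redirects[current]
        hops += 1
        if current in seen:          # revisited a path: redirect loop
            return current, hops, True
        seen.add(current)
    return current, hops, False


def audit(redirects: dict[str, str]) -> tuple[list[str], list[str]]:
    """Return (sources caught in a loop, sources needing more than one hop)."""
    loops, chains = [], []
    for source in redirects:
        _, hops, is_loop = resolve(redirects, source)
        if is_loop:
            loops.append(source)
        elif hops > 1:
            chains.append(source)
    return sorted(loops), sorted(chains)


# Hypothetical 301 map: old WooCommerce-style paths to Shopify-style paths.
sample_map = {
    "/product/scalp-serum/": "/products/scalp-serum",
    "/shop/": "/collections/all",
    "/product-category/hair/": "/shop/",   # chain: two hops to reach the target
    "/old-promo/": "/promo/",              # loop with the entry below
    "/promo/": "/old-promo/",
}
loops, chains = audit(sample_map)
```

Chains are not broken as such, but flattening them to a single hop keeps the map simpler to reason about; loops must be fixed before cutover.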

The real driver: control, governance & data

Because the fully-loaded cash case is close, the decision turns on strategy. The website is where transactions and lead generation happen — a core business asset we want to rely less on outside parties for.

What we gain in-house

  • Control & governance — own the code, pipeline & admin; credentials never sit permanently with a third party.
  • Data privacy & protection — customer, transaction & lead data stays in-house; fewer external data-handlers (material for a health-adjacent business).
  • Business knowledge baked in — staff who know the treatments & brand voice make changes directly.
  • No lock-in — portable standard stack; the agency remains a fallback, not a dependency.
  • Capability that compounds — the team levels up and reuses work across brands.

What we must accept (no spin)

  • We take on delivery responsibility the agency would otherwise hold — including a committed launch date, which in-house must self-impose.
  • We must keep docs current and not let capability concentrate in one person.
  • We trade a predictable invoice for internal effort that's harder to see on a P&L.
  • Agencies use AI too — their speed gap narrows; our durable edge is business knowledge + control, not tooling alone.
  • The agency closes much of the "knows the business" gap through discovery & UAT — it's an edge, not a chasm.

The hybrid option — honest, but watch the catch

A legitimate middle path the agency would propose: let the agency do the riskiest first build + migration on one brand, then maintain and roll out the rest in-house. It de-risks the unproven parts and gives a committed first launch.

Why it's tempting

  • Buys a proven transactional build & migration playbook to copy in-house.
  • A fixed first-launch date and warranty on the hardest brand.
  • In-house still owns brands 2–4 and all ongoing change velocity.

The catch (state it plainly)

  • It hands the agency the most sensitive customer/order PII and the highest-lock-in component (payments + migration) — exactly what the control/privacy objective is trying to keep in-house.
  • At current assumptions, its cost over the horizon falls between full in-house and full agency.
  • Prefer in-house for the migration if the pilot proves it; use hybrid only if the pilot reveals a genuine capability gap.
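
Why the hybrid tends to land in the middle can be seen with a simplified per-brand model: one brand at the agency price plus the remaining brands at the in-house price. The sketch assumes a flat per-brand cost with no portfolio discount and no running costs. The agency figure is the SGD 18,421 benchmark; the in-house per-brand figure is a placeholder.

```python
def portfolio_costs(
    agency_per_brand: float, in_house_per_brand: float, brands: int = 4
) -> dict[str, float]:
    """Portfolio build totals under a flat per-brand cost (simplifying assumption)."""
    return {
        "agency": agency_per_brand * brands,
        # Hybrid: agency builds brand 1, in-house builds the rest.
        "hybrid": agency_per_brand + in_house_per_brand * (brands - 1),
        "in_house": in_house_per_brand * brands,
    }


costs = portfolio_costs(
    agency_per_brand=18_421.0,    # benchmark quote per brand
    in_house_per_brand=10_000.0,  # PLACEHOLDER, not a modeled figure
)
```

Whenever the in-house per-brand cost is below the agency's, this linear model always puts the hybrid strictly between the two pure options.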

Recommendation

Yes — build in-house, but scope it honestly and prove the hard parts first. Cost sets the size of the prize; control, privacy and velocity are why we do it.

  1. Adopt the AI-native workflow now for the marketing/front-end layer — it's proven and cheap.
  2. Run one true pilot on TrichoLab that builds a full transactional flow (real products → collection → cart → checkout → 2C2P test order → order email) and validates a real customer/order + SEO migration. Decide brands 2–4 on the pilot's measured effort & quality.
  3. De-risk migration: $3k specialist per high-traffic brand; sequence the highest-SEO-risk brand early, behind a GSC/Ahrefs benchmark.
  4. Build the guardrails: a real test-order QA gate before every Publish; an incident-response + Dawn-upgrade policy; a 2nd trained operator (costed) and a shared design-token layer before brand 3.
  5. Keep the hybrid & the agency as explicit fallbacks — but prefer in-house for migration/PII if the pilot proves it.
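
The test-order QA gate in step 4 could be enforced with something as small as the sketch below: publishing is allowed only when every step of the checklist is recorded as passed. The step names follow the checklist in the risk table; the function name and the way results are recorded are assumptions, and this does not call Shopify or 2C2P.

```python
# Steps from the real test-order checklist, in order.
CHECKLIST = (
    "add-to-cart",
    "discount",
    "tax/shipping",
    "2C2P capture",
    "confirmation email",
)


def publish_allowed(results: dict[str, bool]) -> tuple[bool, list[str]]:
    """Return (allowed, failed_steps).

    A step that is missing from `results` counts as failed, so an
    incomplete test run can never open the gate.
    """
    failed = [step for step in CHECKLIST if not results.get(step, False)]
    return not failed, failed


# Hypothetical results recorded by the operator after a staging test order.
complete_run = {step: True for step in CHECKLIST}
partial_run = {"add-to-cart": True, "discount": True, "tax/shipping": True}

ok_complete, failed_complete = publish_allowed(complete_run)
ok_partial, failed_partial = publish_allowed(partial_run)
```

Treating a missing step as a failure is a deliberate choice here: the gate should fail closed, not open.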

At default assumptions, over 3 years, in-house runs below the agency (lower still if internal time is spare capacity). The number isn't a blowout, and that's the point: the decision rests on control, data privacy and change velocity, backed by a pilot that retires the real risks before we commit.