An objective cost/benefit analysis of an AI-native in-house workflow (UI/UX designer → Figma / Paper mockups → Claude Code + Shopify CLI & MCP) versus engaging a web agency to design & develop the WordPress + WooCommerce → Shopify revamp from scratch, across our 4-brand portfolio.
On fully-loaded cash, the portfolio comparison is closer than it first looks — so cost should set the magnitude of the decision, not its direction. The honest case for building in-house rests on three things, in this order: strategic control, a proven cheap front-end layer, and a cost edge that compounds at portfolio scale and over time.
Drag the assumptions — every figure, chart and table updates live. The in-house result is shown as a range: the lower bound treats internal staff as spare capacity (true cash cost), the upper bound prices their time at a loaded day-rate (opportunity cost).
| Item | Agency | In-house (spare cap.) | In-house (loaded) |
|---|
Design is the one cost the in-house path does not save — we still need a UI/UX designer to produce the mockups Claude Code builds from. All three resourcing models, for the portfolio of 4 brands:
| Model | Year-1 (rollout) | Steady-state / yr | Best when |
|---|
Recommendation: contractor per brand for the rollout (variable cost, scales with the phased plan); revisit a light retainer only if steady-state change volume is genuinely high. An FTE is justified only when web-design demand is continuous and the hire also covers non-web design work.
Be precise here, because it's easy to overstate. On Shopify, staff self-serve most changes — prices, products, banners, blog posts, promo copy — directly in the admin, in both options, at zero cost. The two models only diverge on structural / template work: new bespoke sections, new page layouts, redesigns.
The in-house workflow has been de-risked in this repository over 5 days (20–25 Jun 2026), 36 commits, by internal staff + Claude Code. Being precise about scope is what makes this deck trustworthy: the marketing/front-end layer is proven; the transactional store is not yet built.
Read this correctly: "not yet built" ≠ "infeasible." The unbuilt parts are largely Shopify-native configuration + data, not hard custom Liquid. The right move is to prove them on a pilot before committing the portfolio — see the Recommendation.
An honest case names its downsides. The four that matter most are the transactional build, SEO/data migration, payments, and key-person risk — each has an owner and a mitigation.
| Risk | Severity | Mitigation |
|---|---|---|
| Transactional store & data modeling unproven — homepage proof doesn't transfer to checkout/catalog | Critical | Pilot a full real flow (product → collection → cart → checkout → order email) on TrichoLab before the portfolio is committed. Stock Dawn provides a working baseline to extend. |
| SEO migration — lost rankings/traffic = lost revenue (WooCommerce permalinks, 301s, redirect loops at scale) | Critical | Ahrefs (licensed) + Screaming Frog crawl → 301 map → GSC monitoring; $3k contingency per high-traffic brand; sequence the highest-risk brand early behind a pre-launch benchmark — not last. |
| Customer/order data migration — health-adjacent PII, order history, loyalty | High | Use a proven migration app (e.g. Matrixify) + operator validation on the pilot; reconcile counts before cutover. Cost applies to the agency too (not in its quote) — modeled on both sides. |
| Payments — 2C2P is a third-party SG gateway, not native Shopify Payments | High | Scope as integration + edge-case testing: 3DS/SCA, PayNow/GrabPay, refunds, webhooks, failed-payment recovery, gateway transaction fees. Real test-order sign-off before launch. |
| Key-person / bus factor — a new internal single point of failure | High | Docs already produced (memory, ADRs, gotchas, handoffs); train & cost a 2nd operator before brand 3; portable Dawn+Shopify+GitHub stack; agency stays a re-engageable fallback. |
| No transactional QA — linters check syntax, not purchases | High | Mandatory real test-order checklist per brand on staging (add-to-cart → discount → tax/shipping → 2C2P capture → confirmation email) before any Publish. (validate_theme alone is insufficient — our own "validators ≠ renderer" note.) |
| Incident response — who fixes a broken checkout at 11pm during a promo, across 4 live stores | Medium | Define an on-call path & a Dawn-upgrade policy; keep the agency on retainer-callable. Note: re-engaging an agency to fix an in-house build isn't a same-night fix and costs more than a clean build. |
| Internal capacity / opportunity cost | Medium | Phased rollout; contractor designer offloads design; the upper (loaded) bound in the model prices this explicitly. |
| Paper.design maturity — new (2026) vendor; outputs HTML/Tailwind, not Liquid (a hand-translation step) | Medium | Unproven in-house so far — Figma is the proven fallback (our repro came from Figma). Free tier keeps cost low; trial on the pilot before relying on it. |
| Design-system & responsive consistency across 4 brands | Medium | Build a shared token/snippet layer before brand 3; add multi-breakpoint QA; budget real revision cycles (mockups rarely land first pass). |
| No external SLA / warranty | Medium | Real trade-off: the agency caps consequential liability (standard) but still owes deliverables, 2 UAT rounds, a delivery date & a 12-month fix obligation; in-house carries operational risk with no external remedy. Mitigate with the QA gate + incident plan above. |
Because the fully-loaded cash case is close, the decision turns on strategy. The website is where transactions and lead generation happen — a core business asset we want to rely less on outside parties for.
A legitimate middle path the agency would propose: let the agency do the riskiest first build + migration on one brand, then maintain and roll out the rest in-house. It de-risks the unproven parts and gives a committed first launch.
Yes — build in-house, but scope it honestly and prove the hard parts first. Cost sets the size of the prize; control, privacy and velocity are why we do it.
At default assumptions, over 3 years: in-house runs roughly — below the agency (lower if internal time is spare capacity). The number isn't a blowout — and that's the point: the decision rests on control, data privacy and change velocity, backed by a pilot that retires the real risks before we commit.