Investigation 01
AI SEO Audit
Building an end-to-end AI-powered SEO audit tool from scratch — and what the process revealed about product design, async workflows, and the real limits of AI automation.
End-to-End Audit Pipeline
Five processing stages from user input to delivered report — each stage handles a distinct responsibility in the audit workflow.
Input Wizard
Collects URL, business type, competitors, goals
Job Queue
pg-boss async processing with retry logic
AI Analysis
OpenAI processes five audit sections with structured output
Report Gen
Branded PDF with scores, recommendations, priorities
Delivery
Email notification with PDF link and dashboard access
1. Overview
SEO audits are time-consuming, repetitive, and inconsistently executed. A good audit touches local presence, on-page structure, schema markup, page speed, and competitive positioning — most of which follow a repeatable pattern. I wanted to test whether AI could handle the analysis layer reliably enough to be useful in a real workflow.
This was my first end-to-end full-stack AI SaaS build. I designed the product architecture, wrote the prompts, and built the entire app myself using Claude Code. The goal wasn't just to ship something — it was to understand what it actually takes to get AI to produce trustworthy, structured output inside a production system.
I designed the audit methodology, locked the technical stack, built the async job pipeline, and acted as the product decision-maker across every tradeoff.
2. Product Question
The core question: Can AI reliably generate structured, actionable SEO audit reports from minimal user inputs — and deliver them in a format a real client could use?
Secondary questions included:
- Where does deterministic code end and AI judgment begin?
- How do you prevent AI from hallucinating findings or skipping critical checks?
- What's the minimum viable workflow for async AI tasks in a production app?
- What quality bar makes this usable for real client projects?
3. Approach
I treated this as a product design problem first, and a coding problem second. Before writing any code, I documented a canonical audit structure based on how I'd run a manual SEO audit — covering GBP competitor analysis, on-page findings, schema markup, and rankability. That document became the source of truth the AI had to match: sections, table formats, and output structure were all locked to it. That single decision turned the product into a system implementing a defined methodology, rather than a flexible AI experiment.
The build used a job queue architecture (pg-boss) to run audits asynchronously: the user submits inputs through a minimal wizard, a worker process picks up the job, runs each audit section independently, and delivers the report via email and in-app viewer. Each section uses deterministic code for data extraction and OpenAI for qualitative analysis — keeping factual grounding in code while using the model only where judgment and synthesis add value.
4. What I Built
Five-Section Audit Structure
Each section runs independently — partial failures don't kill the whole report.
GBP Competitor Analysis
Compare against 3–5 local competitors
GBP Ranking Factors
Reviews, categories, attributes, photos
On-Page SEO
Title, meta, headings, content structure
Schema Markup
LocalBusiness, FAQPage, BreadcrumbList
Rankability Verdict
Final assessment with action items
Audit Pipeline
A five-section audit covering GBP competitor analysis, GBP ranking factors, on-page SEO, schema markup, and a rankability verdict. Each section runs independently so partial failures don't kill the whole report.
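The partial-failure isolation described above can be sketched in a few lines. This is an illustrative reduction, not the app's actual types: section names, the result shape, and the runner signature are all assumptions.

```typescript
// Sketch: run each audit section independently so one failure yields a
// placeholder entry instead of killing the whole report.

type SectionResult =
  | { section: string; status: "ok"; findings: string[] }
  | { section: string; status: "failed"; error: string };

type SectionRunner = () => Promise<string[]>;

async function runAudit(
  sections: Record<string, SectionRunner>
): Promise<SectionResult[]> {
  const entries = Object.entries(sections);
  // allSettled never rejects: each section either resolves with findings
  // or records its own failure, independently of its siblings.
  const settled = await Promise.allSettled(entries.map(([, run]) => run()));
  return settled.map((result, i) => {
    const section = entries[i][0];
    return result.status === "fulfilled"
      ? { section, status: "ok", findings: result.value }
      : { section, status: "failed", error: String(result.reason) };
  });
}
```

A failed schema fetch, for example, would surface as one `"failed"` entry while the other four sections still render normally.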
Prompt Architecture
Prompts were written section-by-section against the canonical audit structure. The LLM receives structured extracted data and produces structured JSON output — not free-form prose. This made outputs parseable, consistent, and easier to debug when something went wrong.
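A minimal sketch of that contract, assuming a simplified section schema (the real one is richer): parse the model's raw text, shape-check it, and fail loudly rather than render a broken report.

```typescript
// Illustrative schema only: score, findings, recommendations.
interface SectionOutput {
  score: number; // 0-100
  findings: string[];
  recommendations: string[];
}

function parseSectionOutput(raw: string): SectionOutput {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    throw new Error("model returned non-JSON output");
  }
  const obj = data as
    | { score?: unknown; findings?: unknown; recommendations?: unknown }
    | null;
  const score = obj?.score;
  const findings = obj?.findings;
  const recommendations = obj?.recommendations;
  const isStringArray = (v: unknown): v is string[] =>
    Array.isArray(v) && v.every((x) => typeof x === "string");
  // Any shape mismatch throws here, at parse time, instead of surfacing
  // later as a half-rendered report.
  if (
    typeof score !== "number" ||
    score < 0 ||
    score > 100 ||
    !isStringArray(findings) ||
    !isStringArray(recommendations)
  ) {
    throw new Error("model output does not match section schema");
  }
  return { score, findings, recommendations };
}
```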
The prompt itself became part of the implementation artifact: v1.0 was drafted first, then refined into v2.0 with Claude Opus 4.5, tightening requirements and turning the app into a more fully specified system.
Async Job System
Built on pg-boss (Postgres-backed queue, no Redis) with a separate worker process hosted on Railway. Audits run independently of the web app, time out at 10 minutes, retry on transient failures, and emit email notifications on completion or failure.
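pg-boss does the retry bookkeeping itself; as an illustration of the underlying pattern, a standalone retry-with-backoff helper might look like this (the attempt count and delay values are made up):

```typescript
// Retry a failure-prone async task with exponential backoff.
async function withRetry<T>(
  task: () => Promise<T>,
  retries = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt === retries) break;
      // Exponential backoff: 500ms, 1s, 2s, ...
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  // All attempts exhausted: surface the last failure to the caller,
  // which in a queue setup would mark the job as failed.
  throw lastError;
}
```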
Report Delivery
Reports render as HTML in-app and generate downloadable PDFs via Puppeteer. SendGrid handles email delivery with a summary of key findings and links to the full report. The product was designed from the start so that output would be client-usable and presentation-ready — not just a raw data dump.
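As a reduced sketch of the report-rendering step, assuming a stripped-down section shape and none of the real branding: build the HTML string that a headless browser like Puppeteer would then turn into a PDF. The `escapeHtml` helper keeps any markup inside findings from breaking the page.

```typescript
// Illustrative section shape; the real report carries much more.
interface ReportSection {
  title: string;
  score: number;
  findings: string[];
}

function escapeHtml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function renderReportHtml(business: string, sections: ReportSection[]): string {
  const body = sections
    .map(
      (s) => `
    <section>
      <h2>${escapeHtml(s.title)}: ${s.score}/100</h2>
      <ul>${s.findings.map((f) => `<li>${escapeHtml(f)}</li>`).join("")}</ul>
    </section>`
    )
    .join("\n");
  return `<!doctype html>
<html><head><meta charset="utf-8"><title>${escapeHtml(business)} SEO Audit</title></head>
<body><h1>${escapeHtml(business)} SEO Audit</h1>${body}</body></html>`;
}
```

From here, Puppeteer's `page.setContent(html)` followed by `page.pdf()` produces the downloadable file.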
Stack: Next.js 14 (App Router) · TypeScript · Supabase (Postgres) · Prisma · pg-boss · OpenAI · SendGrid · Puppeteer · Vercel + Railway
5. Key Decisions & Tradeoffs
Fixed Audit Structure Over Open-Ended Analysis
Decision: The product generates a fixed, repeatable five-section audit rather than open-ended SEO analysis. Sections, table formats, and output structure are locked to a canonical reference document.
Tradeoff: Less flexible, but outputs are consistent and auditable. The canonical PDF eliminated ambiguity across UI, report JSON, HTML, and PDF outputs — all four had to match the same source of truth.
No Automated Competitor Discovery
Decision: Users manually provide 3–5 competitor Google Business Profile URLs. No Google Maps scraping.
Tradeoff: More friction at input, but automated Maps scraping was considered both unreliable and a Terms of Service risk. Manual input produces cleaner, more trustworthy data and avoids building fragile scraping infrastructure.
Deterministic Extraction, AI Interpretation
Decision: Data fetching and parsing (HTML, schema, PageSpeed) is handled by code. AI only touches the interpretation and recommendation layer.
Tradeoff: More code to write upfront, but significantly easier to debug. Hallucination risk is contained to the interpretation layer, where it's easiest to spot and correct, rather than contaminating the extracted data itself.
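A dependency-free sketch of the extraction side of that split. A production version would use a real HTML parser rather than regexes, and the field set here is illustrative:

```typescript
// Deterministic extraction: plain code pulls on-page facts out of the
// HTML, and only this structured data is handed to the model.
interface OnPageFacts {
  title: string | null;
  metaDescription: string | null;
  h1Count: number;
}

function extractOnPageFacts(html: string): OnPageFacts {
  const title =
    html.match(/<title[^>]*>([\s\S]*?)<\/title>/i)?.[1].trim() ?? null;
  // Assumes name="..." appears before content="..."; a parser would not.
  const metaDescription =
    html.match(
      /<meta[^>]+name=["']description["'][^>]*content=["']([^"']*)["']/i
    )?.[1] ?? null;
  const h1Count = (html.match(/<h1[\s>]/gi) ?? []).length;
  return { title, metaDescription, h1Count };
}
```

The model never sees raw HTML; it only interprets facts the code has already verified exist.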
Separate Worker Process
Decision: The audit worker runs as a standalone Node process on Railway, separate from the Next.js app on Vercel.
Tradeoff: More infrastructure to manage, but necessary for reliability. Audits include multiple slow, failure-prone steps — page fetching, API calls, LLM analysis, PDF generation, email delivery. Moving this into a queue made progress tracking possible and kept long-running jobs out of serverless functions.
Minimal Wizard UX
Decision: A 3-step wizard collecting only the minimum required inputs: website URL, primary keyword, city/state, GBP search phrase, business type, and competitor GBP URLs.
Tradeoff: The product feels operational and guided rather than like a technical SEO control panel — which was intentional. The constraint also forced clarity about what the audit actually needed.
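The wizard's required-input contract could be enforced with a validator along these lines; the field names mirror the list above, while the URL pattern, error messages, and function shape are assumptions:

```typescript
// Minimum required inputs collected by the 3-step wizard.
interface AuditInput {
  websiteUrl: string;
  primaryKeyword: string;
  cityState: string;
  gbpSearchPhrase: string;
  businessType: string;
  competitorGbpUrls: string[];
}

// Returns a list of human-readable problems; empty means valid.
function validateAuditInput(input: AuditInput): string[] {
  const errors: string[] = [];
  const isUrl = (u: string) => /^https?:\/\/\S+$/i.test(u);
  if (!isUrl(input.websiteUrl)) {
    errors.push("websiteUrl must be a valid http(s) URL");
  }
  for (const field of [
    "primaryKeyword",
    "cityState",
    "gbpSearchPhrase",
    "businessType",
  ] as const) {
    if (!input[field].trim()) errors.push(`${field} is required`);
  }
  if (input.competitorGbpUrls.length < 3 || input.competitorGbpUrls.length > 5) {
    errors.push("provide 3-5 competitor GBP URLs");
  } else if (!input.competitorGbpUrls.every(isUrl)) {
    errors.push("every competitor entry must be a valid URL");
  }
  return errors;
}
```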
6. What I Learned
The hardest part wasn't the AI — it was the plumbing. Getting async jobs to run reliably, handle partial failures gracefully, and surface useful errors took more iteration than any prompt work.
Prompt engineering for structured output is a different discipline than conversational prompting. When the AI needs to produce parseable JSON that maps to a specific schema, every ambiguity in the prompt shows up as a broken report. Precision matters more than creativity.
The canonical PDF decision was more important than it seemed at the time. By anchoring the product to a fixed methodology upfront, I avoided a class of problems that come from letting AI outputs drift across runs. That reframing — "this implements a defined audit methodology" rather than "this generates SEO recommendations" — shaped every downstream decision.
This project also clarified what "AI-native" means in practice: not replacing the audit entirely, but compressing hours of manual work into minutes while keeping a human in the loop for final review and delivery.
7. Outcome & Next Steps
The core pipeline architecture works end-to-end: inputs → async job → multi-section AI analysis → HTML report → PDF → email delivery. The architecture is production-capable, though two known issues remain in the backlog: the rankability section falls back to a generic response when SERP comparison data is unavailable, and the report doesn't always reflect the exact number of competitors selected during setup. Both are documented and scoped — not architectural, just unfinished.
The next phase is on two tracks:
Deepening the audit value: Planned additions include a domain authority and backlink health check (comparing DA scores against competitors via the Moz API), and a local map pack grid search visual — a geo-grid showing how the business ranks across different blocks of its city for the target keyword. That last feature alone is a meaningful pricing differentiator.
Moving toward a real product: The UI needs a full redesign before anything else — the current version was built to validate the pipeline, not to impress a client. Alongside that: custom domain migration to seo-audit.mola.design, Stripe integration for plan-based access, multilingual support (EN/FR/ES), and an interactive fix-tracking checklist inside each report so users can mark action items complete and have a reason to return.
The longer-term question this investigation is building toward: can this be lightweight enough to work as a mobile-first product, where a business owner runs their own audit from their phone before a discovery call?
8. Links & Resources
- Codex Prompt v2.0 (Notion — see parent page)
- Full Implementation Plan (Notion — see parent page)
- GBP Audit Reference Doc (Google Doc)
- Linear Project: AI SEO Audit backlog