AI SaaS · 2026 · Live
Cue
Production-grade WhatsApp Business AI agent with structured tool use and a full audit trail.
Tech stack
Next.js, TypeScript, PostgreSQL, Meta Cloud API, Anthropic Claude, Groq, Vercel
The problem
Most WhatsApp bot tools stop at hard-coded keyword flows or thin GPT wrappers with no audit trail. Regulated workflows (support, billing, customer onboarding) cannot ship until someone can answer exactly what the bot said to a customer at 3am, backed by token counts, latency, tool calls, and errors logged per run. Existing tools do not provide that level of observability.
Goals
- Receive messages through the Meta Cloud API webhook and send replies through the same API, end to end
- Ground every reply against a real knowledge base with a bound search tool
- Treat every AI reply as a first-class run with full telemetry persisted
- Make escalation rules explicit, not buried inside prompt soup
- Keep the model provider pluggable so Groq, Claude, or BYO key all work
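
A minimal sketch of what a pluggable provider seam could look like, not Cue's actual code. The names `ModelProvider`, `stubProvider`, `pickProvider`, and `anthropicApiKey` are hypothetical, and the providers are stubs that never touch the network; a real implementation would call the vendor SDKs inside `complete`.

```typescript
// Hypothetical provider abstraction; none of these names come from Cue's codebase.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

type Completion = {
  text: string;
  inputTokens: number;
  outputTokens: number;
};

interface ModelProvider {
  readonly name: "groq" | "claude";
  complete(messages: ChatMessage[]): Promise<Completion>;
}

// Stub provider: echoes the last message instead of calling a vendor API.
// Token counts are placeholders, not real tokenizer output.
function stubProvider(name: "groq" | "claude"): ModelProvider {
  return {
    name,
    async complete(messages: ChatMessage[]): Promise<Completion> {
      const last = messages[messages.length - 1]?.content ?? "";
      return { text: `[${name}] echo: ${last}`, inputTokens: 0, outputTokens: 0 };
    },
  };
}

type ProviderConfig = { anthropicApiKey?: string };

// Assumed selection rule: Groq by default, Claude when a bring-your-own key is present.
function pickProvider(config: ProviderConfig): ModelProvider {
  return config.anthropicApiKey ? stubProvider("claude") : stubProvider("groq");
}
```

Because the agent loop only sees `ModelProvider`, swapping vendors stays a configuration decision rather than a code change.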
The solution
- Webhook endpoint on Vercel receives WhatsApp messages and persists them with conversation context
- Structured tool loop with four bound tools (lookup customer, search KB, draft reply, escalate to human)
- Lexical KB lookup today, with the pgvector schema already in place for the Phase 3 embedding pipeline
- Outgoing messages flagged ai_drafted and ai_sent so human-vs-AI authorship stays traceable
- Escalation tool flips conversation status and writes an escalation row, surfaced in the admin dashboard
- Admin dashboard surfaces conversations, escalations, and AI run details including tokens and latency
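
The tool loop described above can be sketched roughly as follows. This is an illustration under assumptions, not the production loop: the tool names mirror the four tools in the list, but their argument shapes, the `Model` stand-in (a scripted function instead of an LLM call), the sample KB rows, and the term-overlap scoring are all made up for the example.

```typescript
// Hypothetical names and shapes, loosely following the four tools listed above.
type ToolName = "lookup_customer" | "search_kb" | "draft_reply" | "escalate_to_human";
type Args = { [key: string]: string | undefined };
type ToolCall = { name: ToolName; args: Args };
type ModelStep = { kind: "tool"; call: ToolCall } | { kind: "final"; text: string };

// Stand-in for the model: given tool results so far, return the next step.
type Model = (toolResults: string[]) => ModelStep;

// Made-up KB rows for the example only.
const kb = [
  { title: "Refund policy", body: "Refunds are issued within 5 business days." },
  { title: "Opening hours", body: "Support is open 9am to 6pm on weekdays." },
];

// Naive lexical lookup: pick the document containing the most query terms.
function searchKb(query: string): string {
  const terms = query.toLowerCase().split(/\W+/).filter(Boolean);
  let best = { score: 0, body: "no match" };
  for (const doc of kb) {
    const text = `${doc.title} ${doc.body}`.toLowerCase();
    const score = terms.filter((t) => text.includes(t)).length;
    if (score > best.score) best = { score, body: doc.body };
  }
  return best.body;
}

const tools: Record<ToolName, (args: Args) => string> = {
  lookup_customer: (a) => `customer:${a.phone ?? "unknown"}`,
  search_kb: (a) => searchKb(a.query ?? ""),
  draft_reply: (a) => `draft:${a.text ?? ""}`,
  escalate_to_human: (a) => `escalated:${a.reason ?? "unspecified"}`,
};

// Run tool calls until the model returns a final reply or the step budget runs out.
function runAgent(model: Model, maxSteps = 8): { reply: string; calls: ToolCall[] } {
  const results: string[] = [];
  const calls: ToolCall[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const step = model(results);
    if (step.kind === "final") return { reply: step.text, calls };
    calls.push(step.call);
    results.push(tools[step.call.name](step.call.args));
  }
  throw new Error("agent loop exceeded maxSteps");
}

// Scripted model: search the KB once, then answer with what it found.
const scripted: Model = (results) => {
  if (results.length === 0) {
    return { kind: "tool", call: { name: "search_kb", args: { query: "refund" } } };
  }
  return { kind: "final", text: results[results.length - 1] ?? "" };
};

const outcome = runAgent(scripted);
```

Returning the list of calls alongside the reply is what makes each run loggable: the caller can persist `outcome.calls` without re-deriving them from model output.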
My role
- Solo architect and engineer, from requirements through deployment
- Meta Cloud API webhook integration on Vercel
- Agent loop design with structured tool use against Groq and Claude
- KB schema and the four bound tools wired into the agent loop
- Admin dashboard for conversations, escalations, and run inspection
UI direction
Operator-first admin built for monitoring conversations and AI runs, not for end users. End users only see WhatsApp on their phone, which is the point.
User flows
Inbound message flow
1. Customer sends a WhatsApp message to the business number
2. Meta Cloud API webhook hits the Cue endpoint, message persists with conversation context
3. Agent loop runs: lookup customer, search KB, draft reply, decide on escalation
4. Reply sends back over the Meta Cloud API, flagged ai_sent
5. Every model run, tool call, and token count writes to the Postgres audit trail
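
Step 2 of this flow, pulling text messages out of the webhook body before persisting them, might look like the sketch below. The payload type is a hand-written subset based on my understanding of the WhatsApp Cloud API notification shape (`entry[].changes[].value.messages[]`, with `timestamp` as Unix seconds in a string); check it against Meta's current reference before relying on it. `InboundMessage` and `extractTextMessages` are hypothetical names.

```typescript
// Assumed subset of the WhatsApp Cloud API webhook notification; verify against Meta's docs.
type WebhookPayload = {
  object?: string;
  entry?: Array<{
    changes?: Array<{
      field?: string;
      value?: {
        messages?: Array<{
          id: string;
          from: string;
          timestamp: string;
          type: string;
          text?: { body: string };
        }>;
      };
    }>;
  }>;
};

// Hypothetical row shape for what gets persisted.
type InboundMessage = { waMessageId: string; from: string; body: string; receivedAt: Date };

function extractTextMessages(payload: WebhookPayload): InboundMessage[] {
  const out: InboundMessage[] = [];
  for (const entry of payload.entry ?? []) {
    for (const change of entry.changes ?? []) {
      for (const msg of change.value?.messages ?? []) {
        // Skip anything that is not plain text (images, reactions, and so on).
        if (msg.type !== "text" || !msg.text) continue;
        out.push({
          waMessageId: msg.id,
          from: msg.from,
          body: msg.text.body,
          // Assumes the timestamp is Unix seconds encoded as a string.
          receivedAt: new Date(Number(msg.timestamp) * 1000),
        });
      }
    }
  }
  return out;
}

// Made-up sample notification with one text message and one image message.
const sample: WebhookPayload = {
  object: "whatsapp_business_account",
  entry: [
    {
      changes: [
        {
          field: "messages",
          value: {
            messages: [
              { id: "wamid.TEST1", from: "15550001111", timestamp: "1700000000", type: "text", text: { body: "Where is my order?" } },
              { id: "wamid.TEST2", from: "15550001111", timestamp: "1700000001", type: "image" },
            ],
          },
        },
      ],
    },
  ],
};

const inbound = extractTextMessages(sample);
```

Notifications that carry no `messages` array, such as delivery status updates, fall through the optional chaining and yield an empty list.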
Escalation flow
1. Agent calls escalate_to_human based on explicit rules
2. Conversation status flips to escalated and an escalation row writes in the same transaction
3. Operator picks up the conversation from the admin dashboard
4. Subsequent operator replies persist alongside the AI run history
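
To illustrate why the status flip and the escalation row belong in one transaction, here is a toy in-memory sketch. It is not the Postgres code: `Db`, `transaction`, and `escalateToHuman` are hypothetical, and the "transaction" is simulated by mutating a deep copy and swapping it in only on success. In Postgres the same idea would be an UPDATE and an INSERT between BEGIN and COMMIT.

```typescript
// Hypothetical in-memory stand-ins for the conversations and escalations tables.
type Conversation = { id: string; status: "open" | "escalated" };
type Escalation = { conversationId: string; reason: string };
type Db = { conversations: Conversation[]; escalations: Escalation[] };

// Toy transaction: work on a deep copy and commit only if fn finishes without throwing.
function transaction(db: Db, fn: (tx: Db) => void): void {
  const tx: Db = JSON.parse(JSON.stringify(db));
  fn(tx);
  db.conversations = tx.conversations;
  db.escalations = tx.escalations;
}

function escalateToHuman(db: Db, conversationId: string, reason: string): void {
  transaction(db, (tx) => {
    const convo = tx.conversations.find((c) => c.id === conversationId);
    if (!convo) throw new Error(`unknown conversation ${conversationId}`);
    convo.status = "escalated";
    // Validated after the status flip on purpose, to show that a failure rolls back both writes.
    if (!reason.trim()) throw new Error("escalation reason is required");
    tx.escalations.push({ conversationId, reason });
  });
}

// Successful escalation: both writes land.
const okDb: Db = { conversations: [{ id: "c1", status: "open" }], escalations: [] };
escalateToHuman(okDb, "c1", "billing dispute");

// Failed escalation: neither write lands.
const failedDb: Db = { conversations: [{ id: "c1", status: "open" }], escalations: [] };
let rolledBack = false;
try {
  escalateToHuman(failedDb, "c1", "  ");
} catch {
  rolledBack = true;
}
```

The point of the all-or-nothing behavior is that the dashboard never shows an escalated conversation with no escalation row behind it, or the reverse.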
Key learnings
- Treating every AI reply as a logged run with tokens, latency, and tool calls is the difference between a demo bot and a regulated-workflow bot
- Explicit escalation rules in code beat trying to coax escalation behavior out of a prompt
- A pluggable provider interface (Groq default, Claude via BYO key) lets the demo stay free while paying clients still get Claude
- Persisting AI run metadata at the message level (not at the conversation level) keeps the audit trail useful when an agent loop spans multiple tool calls
- Next up: Phase 3 swaps the lexical KB lookup for pgvector embeddings; Phase 4 adds a Slack notification on escalation plus an escalation queue UI
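
The message-level run metadata mentioned above could be shaped something like this. The field names (`AiRun`, `ToolCallLog`, `summarize`) and the sample numbers are invented for illustration and are not Cue's schema; the sketch only shows that conversation-level totals can be derived from per-message runs, while the reverse is not possible.

```typescript
// Hypothetical per-message run record; field names are illustrative.
type ToolCallLog = { tool: string; latencyMs: number; error?: string };

type AiRun = {
  messageId: string; // the outgoing message this run produced
  provider: string;
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
  toolCalls: ToolCallLog[];
};

type RunSummary = { runs: number; totalTokens: number; toolCalls: number; toolErrors: number };

// Roll message-level runs up into a conversation-level view.
function summarize(runs: AiRun[]): RunSummary {
  const summary: RunSummary = { runs: 0, totalTokens: 0, toolCalls: 0, toolErrors: 0 };
  for (const run of runs) {
    summary.runs += 1;
    summary.totalTokens += run.inputTokens + run.outputTokens;
    summary.toolCalls += run.toolCalls.length;
    summary.toolErrors += run.toolCalls.filter((c) => c.error !== undefined).length;
  }
  return summary;
}

// Made-up sample data: two replies in one conversation, the second spanning two tool calls.
const sampleRuns: AiRun[] = [
  {
    messageId: "m1",
    provider: "groq",
    inputTokens: 100,
    outputTokens: 20,
    latencyMs: 300,
    toolCalls: [{ tool: "search_kb", latencyMs: 12 }],
  },
  {
    messageId: "m2",
    provider: "groq",
    inputTokens: 150,
    outputTokens: 30,
    latencyMs: 500,
    toolCalls: [
      { tool: "lookup_customer", latencyMs: 8 },
      { tool: "escalate_to_human", latencyMs: 5, error: "timeout" },
    ],
  },
];

const summary = summarize(sampleRuns);
```

Keeping `messageId` on each run means an operator can go from a single reply to the exact tool calls and token usage behind it.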
Want something like Cue?
I'm open to senior contract work. Let's talk about what you're building.