AI platform · 2026 · Live
Meridian
Production-grade multi-agent research and execution pipeline with hybrid RAG, conflict resolution, and an LLM-as-judge rubric.
Tech stack
Python · FastAPI · LangGraph · PostgreSQL · pgvector · tsvector · Redis · Langfuse · Anthropic Claude · Docker Compose
The problem
Most multi-agent demos collapse the moment an API breaks, a goal is ambiguous, or two sources contradict. They lack typed contracts between agents, retries at the tool layer, conflict resolution as a first-class step, and observability wired in from day one. Without those, you cannot defend an agent's answer to a real stakeholder.
Goals
- Accept a high-level business goal and return a structured, cited answer
- Plan a dependency-aware task DAG with per-task acceptance criteria
- Delegate to specialist sub-agents with typed input and output contracts
- Resolve conflicts between sources instead of averaging them away
- Score every run against a rubric so quality is measurable, not vibes
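As a rough illustration of the dependency-aware plan described above, here is a minimal sketch using only the standard library. The `Task` fields, agent names, and acceptance strings are hypothetical and chosen for the example; they are not Meridian's actual contracts.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class Task:
    """One node in the plan. All field names are illustrative assumptions."""
    task_id: str
    agent: str                         # which specialist handles the task
    acceptance: str                    # per-task acceptance criterion
    depends_on: tuple[str, ...] = ()   # ids of tasks that must finish first


def execution_order(tasks: list[Task]) -> list[str]:
    """Return task ids ordered so every task follows its dependencies."""
    known = {t.task_id for t in tasks}
    for t in tasks:
        missing = set(t.depends_on) - known
        if missing:
            raise ValueError(f"{t.task_id} depends on unknown tasks: {sorted(missing)}")
    graph = {t.task_id: set(t.depends_on) for t in tasks}
    # static_order() raises graphlib.CycleError if the plan contains a cycle.
    return list(TopologicalSorter(graph).static_order())


plan = [
    Task("synthesize", "synthesis", "answer cites every claim", ("retrieve", "search")),
    Task("retrieve", "retrieval", "at least one internal document found"),
    Task("search", "web", "at least one external source found"),
]
order = execution_order(plan)  # "synthesize" always comes last
```

Validating unknown dependencies and cycles before execution means a malformed plan fails at planning time rather than halfway through a run.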
The solution
- LangGraph orchestrator with planner, executor, and replanner nodes that walk the task DAG
- Retrieval agent combining pgvector and tsvector via Reciprocal Rank Fusion
- Web/API agent built on Tavily plus a generic HTTP tool, with retry and circuit breaking
- Synthesis agent that aggregates, detects conflicts, and reconciles by weight and confidence
- Redis session store for per-run state, Postgres for run logs and evaluation history
- Self-hosted Langfuse tracing for every agent call, token usage, and tool invocation
- LLM-as-judge rubric scoring goal completion, accuracy, coverage, confidence, and hallucination risk
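The retry and circuit breaking mentioned for the web/API agent can be sketched roughly as follows. This is a generic pattern under stated assumptions, not Meridian's implementation: the class name, thresholds, and backoff schedule are all hypothetical.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker refuses a call without attempting it."""


class CircuitBreaker:
    """Retries a callable and stops calling it after repeated failures."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures   # consecutive failures before opening
        self.reset_after = reset_after     # cool-down in seconds (assumed value)
        self.clock = clock                 # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, retries=2, backoff=0.1, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open, call skipped")
            # Cool-down elapsed: half-open, so a single failure reopens it.
            self.opened_at = None
            self.failures = self.max_failures - 1
        last_error = None
        for attempt in range(retries + 1):
            try:
                result = fn(*args, **kwargs)
            except Exception as exc:
                last_error = exc
                self.failures += 1
                if self.failures >= self.max_failures:
                    self.opened_at = self.clock()   # trip the breaker
                    break
                if attempt < retries:
                    time.sleep(backoff * 2 ** attempt)  # exponential backoff
            else:
                self.failures = 0   # any success closes the circuit
                return result
        raise last_error
```

A caller would wrap each outbound tool call, for example `breaker.call(fetch, url)` with a hypothetical `fetch` function, and treat `CircuitOpenError` as a signal to hand failure context back to the replanner instead of waiting on a dead API.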
My role
- Solo architect and engineer, system design to deploy
- Four-layer architecture (orchestrator, agents, memory, observability) and the LangGraph wiring
- Hybrid RAG pipeline with pgvector + tsvector RRF and per-agent context budgeting
- Langfuse self-hosted stack, LLM-judge rubric, and run report generator
- Docker Compose stack for Postgres, Redis, and Langfuse
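For readers unfamiliar with Reciprocal Rank Fusion, the fusion step of a hybrid pipeline can be sketched in a few lines. The sketch assumes the two ranked lists of document ids have already been fetched (one from a vector search, one from a full-text search); the function name, the document ids, and the constant `k = 60` (a commonly used default) are assumptions for illustration.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked id lists: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["doc_a", "doc_b", "doc_c"]    # e.g. nearest neighbours from pgvector
keyword_hits = ["doc_c", "doc_a", "doc_d"]   # e.g. full-text ranking over tsvector
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
# fused == ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Because RRF uses only ranks, it sidesteps the problem that cosine distances and full-text rank scores live on incomparable scales.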
UI direction
Operator-first run report and trace viewer, not an end-user surface. The product is the structured answer plus the trace and rubric backing it.
User flows
Goal-to-answer flow
1. Operator submits a business goal to the FastAPI endpoint
2. Planner decomposes the goal into a task DAG with declared dependencies and acceptance criteria
3. Executor walks the DAG, dispatching tasks to retrieval, web/API, and synthesis agents
4. Replanner re-enters the loop with failure context when a specialist returns low confidence or errors
5. Synthesis agent reconciles conflicting sources and emits a cited answer
6. LLM judge scores the run; the full trace and rubric scores persist to Langfuse and Postgres
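The reconciliation in step 5 could take many forms. One simple possibility, sketched here under assumptions, is to sum `weight * confidence` per candidate value, pick the best-supported value, and flag the result as contested when the runner-up is close. The `Claim` fields, the 0 to 1 scales, and the `margin` threshold are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    source: str
    value: str
    weight: float       # trust in the source, 0..1 (assumed scale)
    confidence: float   # agent's confidence in the extraction, 0..1


def reconcile(claims, margin=0.2):
    """Return (winning value, contested flag, sources backing the winner)."""
    if not claims:
        raise ValueError("no claims to reconcile")
    support = defaultdict(float)
    for c in claims:
        support[c.value] += c.weight * c.confidence
    # Best-supported value first; ties broken by value for determinism.
    ranked = sorted(support.items(), key=lambda kv: (-kv[1], kv[0]))
    winner, top = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    # Contested when the lead over the runner-up is under `margin` of the top score.
    contested = len(ranked) > 1 and (top - runner_up) < margin * top
    cited = [c.source for c in claims if c.value == winner]
    return winner, contested, cited


claims = [
    Claim("crm", "4.2M", weight=0.9, confidence=0.8),
    Claim("blog", "5.0M", weight=0.4, confidence=0.9),
    Claim("report", "4.2M", weight=0.7, confidence=0.9),
]
value, contested, cited = reconcile(claims)
# value == "4.2M", contested is False, cited == ["crm", "report"]
```

Returning the contested flag and the backing sources, rather than only the winner, keeps the disagreement visible in the final cited answer.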
Key learnings
- Typed Pydantic contracts between agents catch a class of integration bugs before they reach the executor
- Conflict resolution as a first-class step beats hoping the synthesizer averages contradictions correctly
- An LLM judge rubric turns agent quality from a vibe into a number you can regress against
- Observability wired in from layer one is the difference between a demo and an answer you can defend
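To make "a number you can regress against" concrete, here is a minimal sketch of collapsing per-dimension judge scores into one value and comparing a candidate run with a baseline. The dimension names mirror the rubric listed earlier, but the weights, the 0 to 1 scale, and the tolerance are assumptions, not Meridian's actual configuration.

```python
# Hypothetical weights; they sum to 1.0 so the overall score stays in 0..1.
RUBRIC_WEIGHTS = {
    "goal_completion": 0.3,
    "accuracy": 0.3,
    "coverage": 0.2,
    "confidence": 0.1,
    "hallucination_risk": 0.1,
}
LOWER_IS_BETTER = {"hallucination_risk"}   # inverted before weighting


def overall_score(scores):
    """Weighted average of judge scores, each expected in the range 0..1."""
    missing = RUBRIC_WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing rubric dimensions: {sorted(missing)}")
    total = 0.0
    for name, weight in RUBRIC_WEIGHTS.items():
        value = scores[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} out of range: {value}")
        if name in LOWER_IS_BETTER:
            value = 1.0 - value
        total += weight * value
    return total


def regressed(baseline, candidate, tolerance=0.02):
    """True when the candidate run scores worse than baseline beyond tolerance."""
    return overall_score(candidate) < overall_score(baseline) - tolerance
```

A check like `regressed(last_release_scores, new_run_scores)` (both dictionaries hypothetical) could then gate a prompt or model change the same way a failing unit test would.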
Want something like Meridian?
I'm open to senior contract work. Let's talk about what you're building.