The Problem
A growing online retailer with 380K monthly active users was running a static, rules-based recommendation engine. The system relied on product affinity rules that a small merchandising team maintained by hand, so it could not adapt to real-time behavioral signals or seasonal trends without manual updates.
Conversion rates from recommendations had plateaued as competitors deployed more sophisticated ML-driven systems. The internal engineering team lacked the specialized AI expertise needed to take model prototypes to production serving with latency under 50ms.
Key Constraints
Recommendation API must respond in < 50ms at P99 under peak load
Must support the retailer's growing product catalog, including nuanced cross-category recommendations
User behavior context must include current-session activity, not just past purchases
Explainability required: merchandising team must understand why items are recommended
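To make the latency constraint concrete, the following is a minimal sketch of how a P99 check against the 50ms budget could be computed from sampled response times. The nearest-rank percentile method, the function name percentile, and the sample latencies are all illustrative assumptions; the case study does not say how latency was measured.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical latency samples in milliseconds (not measurements from the project).
latencies_ms = [12.0] * 98 + [40.0, 65.0]

p99 = percentile(latencies_ms, 99)
meets_budget = p99 < 50.0  # the < 50ms P99 constraint from the list above
print(p99, meets_budget)
```

With these made-up samples, one request in a hundred exceeds 50ms and the P99 check still passes, which is why a P99 target is stricter than an average but more forgiving than a hard maximum.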
The Solution
The solution was a lightweight retrieval-augmented generation (RAG) recommendation system built on LangGraph. The retrieval stage queries a Pinecone vector database, indexed with product embeddings generated from product descriptions and attributes, to fetch candidate sets in under 5ms.
A LangGraph agent then reasons over the candidate set, the user's real-time session context, and a set of business rules (margin targets, exclusions) to produce a ranked, explainable recommendation list. The agent produces a reasoning trace, giving the merchandising team full visibility into every recommendation decision.
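The ranking step described above can be illustrated with a small, deterministic stand-in. This is a sketch under stated assumptions, not the LangGraph agent itself: the Candidate and Recommendation classes, the rank_candidates function, the score bonuses, and the sample products are all hypothetical. It only shows how exclusions, session context, and a margin target can each leave a line in a reasoning trace.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    product_id: str
    similarity: float  # score from the retrieval stage
    margin: float      # gross margin as a fraction, e.g. 0.35
    category: str


@dataclass
class Recommendation:
    product_id: str
    score: float
    trace: list = field(default_factory=list)  # human-readable reasons


def rank_candidates(candidates, session_categories, excluded_ids,
                    margin_target=0.30, top_n=3):
    """Drop excluded items, score the rest, and record why each score moved."""
    ranked = []
    for c in candidates:
        if c.product_id in excluded_ids:
            continue  # exclusion rule: never recommend this item
        score = c.similarity
        trace = [f"similarity={c.similarity:.2f}"]
        if c.category in session_categories:
            score += 0.10  # assumed bonus for categories seen this session
            trace.append(f"+0.10 category '{c.category}' seen in session")
        if c.margin >= margin_target:
            score += 0.05  # assumed bonus for meeting the margin target
            trace.append(f"+0.05 margin {c.margin:.2f} meets target {margin_target:.2f}")
        ranked.append(Recommendation(c.product_id, round(score, 4), trace))
    ranked.sort(key=lambda r: r.score, reverse=True)
    return ranked[:top_n]


# Hypothetical candidate set and session context.
candidates = [
    Candidate("tent-01", 0.82, 0.25, "camping"),
    Candidate("stove-07", 0.78, 0.40, "camping"),
    Candidate("mug-11", 0.80, 0.45, "kitchen"),
    Candidate("lamp-03", 0.90, 0.50, "lighting"),
]
recs = rank_candidates(candidates, {"camping"}, {"lamp-03"})
for r in recs:
    print(r.product_id, r.score, r.trace)
```

In this sketch the excluded item never appears even though it has the highest similarity, and each surviving item carries the list of adjustments that produced its final score, which is the kind of visibility the merchandising team needs.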
Technical Architecture
Pinecone Vector Index
Product embeddings generated by text-embedding-3-small. ANN retrieval of candidates in < 5ms.
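As a rough illustration of what the retrieval stage returns, the sketch below ranks products by cosine similarity to a query vector. It uses exact brute-force search over a tiny in-memory dictionary rather than approximate nearest-neighbor (ANN) search, and it does not call Pinecone or text-embedding-3-small. The three-dimensional vectors, the product IDs, and the top_k function are made up for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, index, k=2):
    """Exact search: score every vector, then keep the k most similar."""
    scored = [(pid, cosine(query, vec)) for pid, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy index; real embeddings have far more dimensions.
index = {
    "tent-01": [0.9, 0.1, 0.0],
    "stove-07": [0.8, 0.3, 0.1],
    "mug-11": [0.1, 0.9, 0.2],
}
results = top_k([1.0, 0.0, 0.0], index, k=2)
print(results)
```

Exact search scales linearly with catalog size, which is why a production system would rely on an ANN index to stay within a millisecond-level retrieval budget.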
LangGraph Reasoning Agent
Stateful agent reasoning over candidates, session context, and business rules. Generates ranked recommendations with citations.
Real-Time Session Context
Streamed clickstream events building a rolling session vector for each active user.
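One plausible way to build a rolling session vector is an exponential moving average over the embeddings of the products a user interacts with. The case study does not say which method was used, so the update rule, the decay parameter, and the update_session_vector function below are assumptions made for illustration.

```python
def update_session_vector(session_vec, event_vec, decay=0.8):
    """Blend a new event embedding into the running session vector.

    A higher decay keeps more of the earlier session; a lower decay
    weights the most recent events more heavily.
    """
    if session_vec is None:
        return list(event_vec)  # first event in the session
    return [decay * s + (1.0 - decay) * e
            for s, e in zip(session_vec, event_vec)]


# Hypothetical two-dimensional event embeddings arriving in order.
events = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

session = None
for event_vec in events:
    session = update_session_vector(session, event_vec, decay=0.5)
print(session)
```

The update needs only the previous vector and the new event, so per-user state stays constant in size no matter how long the session runs.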
FastAPI Serving Layer
Async FastAPI endpoints with Redis caching. P99 latency < 18ms, within the 50ms constraint.
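The caching behavior of the serving layer can be sketched as a cache-aside lookup. To keep the example self-contained, an in-memory TTLCache class stands in for Redis and plain asyncio coroutines stand in for FastAPI endpoints. The key format, the 30-second expiry, and the compute_recommendations function are hypothetical.

```python
import asyncio
import time


class TTLCache:
    """In-memory stand-in for a key-value cache with per-key expiry."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


cache = TTLCache()
compute_calls = 0  # counts how often the slow path runs


async def compute_recommendations(user_id):
    """Placeholder for the retrieval and ranking pipeline."""
    global compute_calls
    compute_calls += 1
    await asyncio.sleep(0.01)  # simulated pipeline latency
    return [f"rec-for-{user_id}-1", f"rec-for-{user_id}-2"]


async def get_recommendations(user_id, ttl_seconds=30.0):
    """Cache-aside: return a cached result if present, else compute and store."""
    key = f"recs:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    recs = await compute_recommendations(user_id)
    cache.set(key, recs, ttl_seconds)
    return recs


async def demo():
    first = await get_recommendations("u42")
    second = await get_recommendations("u42")  # served from the cache
    return first, second


first, second = asyncio.run(demo())
print(first == second, compute_calls)
```

A short expiry is a trade-off: cached responses return quickly, but they can lag behind the user's latest session activity until the entry expires.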
"The 22% increase in basket size exceeded initial projections. The merchandising team also adopted the system quickly, citing improved transparency into how recommendations were generated."

