Hybrid RAG & NL-to-SQL Engine
Dual-mode enterprise Q&A - document RAG at 85%+ relevance + NL-to-SQL processing 500+ queries/day at 99.9% uptime
85%+Retrieval Relevance
Problem
Enterprise users needed to query both unstructured documents and structured databases using plain English. LLM-generated SQL is unreliable and dangerous without validation. Single-vector retrieval misses keyword matches. And any LLM dependency is a reliability risk without fallback.
Solution
- Indexed 100K+ chunks in Aurora PostgreSQL with pgvector (1024-dim Titan embeddings) for semantic search.
- Hybrid retrieval fuses pgvector semantic search with BM25 keyword search via Reciprocal Rank Fusion (RRF) — eliminates blind spots of single-mode retrieval.
- LLM-based reranking pushes relevance to 85%+; RAGAS-aligned evaluation (MRR, correctness, groundedness, completeness) with golden test datasets validates quality continuously.
- NL-to-SQL engine with circuit breaker and Sonnet-to-Haiku failover for 99.9% uptime.
- Every generated SQL query parsed by sqlglot into an AST — structurally validated to prevent injection before execution.
- PII detection and column-level access controls applied at the query layer; S3 audit logging on every query for compliance.
System Flow
Query
NL Input
Intent Classifier
Router
Query Analyzer
Search Type Router
RAG Branch
pgvector Search
BM25 + RRF
SQL Branch
Schema Mapper
sqlglot Validator
Validation
AST Validation
RAGAS Eval
Response
Answer Synth
Source Grounding
Architecture
- 01pgvector (1024-dim Titan embeddings) + BM25 hybrid retrieval via RRF
- 02LLM-based reranking for 85%+ relevance on golden test datasets
- 03RAGAS evaluation: MRR, correctness, groundedness, completeness
- 04S3 audit logging on every query for compliance tracking
- 05Circuit breaker with Sonnet-to-Haiku failover for 99.9% uptime
- 06sqlglot AST validation - SQL injection prevention on every query
- 07PII detection + column-level access controls at query layer
Impact
- 85%+ retrieval relevance across 100K+ indexed document chunks
- 500+ NL-to-SQL queries processed daily at 99.9% uptime
- Zero SQL injection incidents in production via sqlglot AST-level validation
- S3 audit logging on every query for compliance
Tech Stack
pgvectorBM25RRFAWS BedrockClaudeTitansqlglotRAGASPostgreSQLAuroraSQL ServerPython