
RAG-Based Natural Language to SQL for Analysts
Let non-technical stakeholders query the warehouse in plain English, directly from Slack.
Stage.in's data team was fielding a high volume of ad-hoc data requests from non-technical stakeholders across the business. Each request required a data team member to interpret the question, write the SQL, validate the output, and return the result. This was a significant drain on a small team's capacity and created a bottleneck that slowed down decision-making across the company.
Design and deploy a production-grade system that would allow non-technical stakeholders to query the data warehouse in plain language, reducing the ad-hoc request burden on the data team without sacrificing accuracy or trust in the results.
The fundamental challenge with NL-to-SQL is not language understanding — modern LLMs can write SQL. The challenge is context: a model without knowledge of your schema, metric definitions, and business logic produces syntactically correct SQL with semantically wrong answers. The RAG architecture addresses this by retrieving relevant dbt model context at query time, grounding generation in the actual data model. pgvector was used for semantic retrieval specifically because natural language questions rarely match column names — "how many users churned last month" needs to find the dbt model that captures churn, which requires semantic similarity, not keyword matching.
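The retrieval idea can be sketched in a few lines. This is a toy illustration only: the bag-of-words "embedding" below stands in for a real embedding model, and the model docs are invented examples — the production system stored real embeddings in pgvector and ranked by vector distance, which is what lets "churned" match documentation that says "cancellations" even without shared keywords.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a stand-in for a real embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Illustrative dbt model descriptions that would live in the vector store.
MODEL_DOCS = {
    "fct_subscription_churn": "monthly count of users who churned or cancelled their subscription",
    "dim_users": "one row per user with signup date and plan",
    "fct_orders": "order facts with revenue and order counts",
}

def retrieve(question: str, k: int = 1) -> list:
    # Rank indexed models by similarity to the question, return the top k.
    q = embed(question)
    ranked = sorted(MODEL_DOCS, key=lambda m: cosine(q, embed(MODEL_DOCS[m])), reverse=True)
    return ranked[:k]

print(retrieve("how many users churned last month"))  # ['fct_subscription_churn']
```

With pgvector, the `sorted(...)` step becomes an `ORDER BY embedding <=> query_embedding LIMIT k` over the indexed table, so ranking happens in the database rather than in application code.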
The system went significantly beyond basic query generation: it searched past conversation history for organisational context, attempted to execute queries and self-correct on error, and could interpret undocumented models by reading their source code and traversing upstream dependencies. All interactions happened natively in Slack, including executing queries and returning results as tables, CSVs, or Metabase charts — without requiring stakeholders to leave the tools they were already using.
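The execute-and-self-correct behaviour amounts to a retry loop: run the generated SQL, and on failure hand the query and the database error back to the model for a revised attempt. A minimal sketch using an in-memory SQLite database and a stubbed correction step (in the real system, `correct_sql` is an LLM call that receives the error message):

```python
import sqlite3

def execute_sql(conn, sql):
    # Attempt the query; return (rows, None) on success or (None, error) on failure.
    try:
        return conn.execute(sql).fetchall(), None
    except sqlite3.Error as exc:
        return None, str(exc)

def correct_sql(sql: str, error: str) -> str:
    # Stub for the LLM correction call: given the failing SQL and the database
    # error, return a revised query. Hard-coded here for illustration.
    return sql.replace("FROM user", "FROM users")

def run_with_retries(conn, sql, max_attempts=3):
    for _ in range(max_attempts):
        rows, error = execute_sql(conn, sql)
        if error is None:
            return rows
        sql = correct_sql(sql, error)  # self-correct and try again
    raise RuntimeError("query still failing after retries")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER)")
conn.executemany("INSERT INTO users VALUES (?)", [(1,), (2,)])
print(run_with_retries(conn, "SELECT COUNT(*) FROM user"))  # recovers: [(2,)]
```

Bounding the loop with `max_attempts` matters in practice: an unbounded retry against a warehouse can burn both LLM tokens and query credits on a question the model cannot answer.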
1. Indexed the full dbt project metadata — model descriptions, column documentation, metric definitions, and upstream/downstream relationships — into a pgvector store for semantic retrieval.
2. Built the conversation-history search layer: for each new question, similar past conversations were retrieved to provide organisational context and avoid contradicting previous answers.
3. Designed the LangGraph reasoning pipeline — semantic context retrieval → SQL generation → execution → error correction → response — with explicit handling for undocumented models via source-code traversal of the model and all of its upstream dependencies.
4. Implemented the human-in-the-loop feedback system: the data team could flag incorrect responses, with corrections improving retrieval ranking over time; admins could configure which LLM models were used for different query types.
5. Built the Slack integration: the agent fetched the full thread context before responding — handling both new questions and continuations of prior conversations — with Slack actions allowing users to execute queries and receive results as a table, a CSV, or a chart rendered directly in Metabase.
6. Validated against a representative sample of historical ad-hoc requests before going live, measuring accuracy against known-correct answers from the data team.
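The upstream traversal in step 3 builds on the dependency graph dbt already exposes: the `manifest.json` artifact contains a `parent_map` from each node to its direct parents. Collecting every transitive upstream model is then a graph walk. A minimal sketch (the manifest fragment below is illustrative; a real one is loaded from `target/manifest.json`):

```python
# Illustrative fragment of a dbt manifest's parent_map: node -> direct parents.
PARENT_MAP = {
    "model.proj.fct_subscription_churn": ["model.proj.int_subscriptions", "model.proj.dim_users"],
    "model.proj.int_subscriptions": ["model.proj.stg_subscriptions"],
    "model.proj.dim_users": ["model.proj.stg_users"],
    "model.proj.stg_subscriptions": [],
    "model.proj.stg_users": [],
}

def upstream(node: str, parent_map: dict) -> set:
    """Collect every transitive upstream dependency of `node`."""
    seen = set()
    stack = [node]
    while stack:
        for parent in parent_map.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

deps = upstream("model.proj.fct_subscription_churn", PARENT_MAP)
print(sorted(deps))
```

For an undocumented model, the agent would read the SQL source of the model itself plus each node returned here, giving the LLM enough raw context to infer what the model computes even when no description exists.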
- 50% of ad-hoc data requests automated end-to-end
- Data team capacity freed for higher-value analytical and engineering work
- Non-technical stakeholders able to self-serve answers to routine data questions natively in Slack
- System open-sourced as Ragstar (github.com/pragunbhutani/dbt-llm-agent) — 167+ GitHub stars
Start a conversation.
Every engagement begins with a focused discussion of your current data environment and priorities. To schedule an initial consultation, reach out directly.
Get in touch