Guide: AI Data Analyst

Who this is for

Persona: The Scientist, The Auditor
Goal: Trust every number. Get transparent reasoning, executable SQL, and instant visualizations in one chat thread.

What makes this different

The technical hook

Standard chatbots guess the next word; this engine runs a Reasoning Loop (AnalysisAgent) that plans first, then executes real SQL.
It pairs LLM fluency with database math, so answers are calculated—not hallucinated.

Core capabilities (with proof from the code)

Zero-hallucination math engine

Uses _load_dataset_into_duckdb to stage your data in-memory, then calls run_sql_analysis to compute results directly in DuckDB.
Copy angle: Trust the numbers, every time. The Analyst writes and runs live SQL, so revenue, churn, and growth figures are computed by the database rather than estimated by the language model.
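
A minimal sketch of the load-then-query pattern described above. The guide names _load_dataset_into_duckdb and run_sql_analysis but does not show their signatures, so the helpers here (load_dataset, run_sql) and the sales table are hypothetical. Python's built-in sqlite3 stands in for DuckDB so the example runs with no extra dependencies.

```python
import sqlite3

def load_dataset(rows, table="sales"):
    """Stage rows in an in-memory database (sqlite3 stands in for DuckDB here)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, chosen only for illustration.
    conn.execute(f"CREATE TABLE {table} (product TEXT, revenue REAL, cost REAL)")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?)", rows)
    return conn

def run_sql(conn, sql):
    """Execute SQL and return (column_names, rows); the database does the arithmetic."""
    cur = conn.execute(sql)
    columns = [d[0] for d in cur.description]
    return columns, cur.fetchall()

rows = [("widget", 120.0, 70.0), ("gadget", 80.0, 50.0), ("widget", 30.0, 20.0)]
conn = load_dataset(rows)
columns, result = run_sql(
    conn,
    "SELECT product, SUM(revenue) AS revenue FROM sales GROUP BY product ORDER BY product",
)
print(columns, result)
```

The point of the design is that the model only has to produce the SQL text; the totals come from the query engine.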

Transparent step-by-step reasoning

AnalysisAgent logs its phases (THINKING -> TOOL_SELECTION -> TOOL_EXECUTION) and streams status like “Step 1: Aggregating sales data…” via socket.io.
Copy angle: Watch it think. The AI shows its work in real time so you always see how the answer is produced.
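
One way to picture the phase logging is the sketch below. The three phase names come from the guide; the run_phases function and the emit callback are assumptions made for illustration. In the product, emit would be a socket.io event sender; here it simply appends to a list so the example runs on its own.

```python
from enum import Enum
from typing import Callable

class Phase(Enum):
    THINKING = "THINKING"
    TOOL_SELECTION = "TOOL_SELECTION"
    TOOL_EXECUTION = "TOOL_EXECUTION"

def run_phases(question: str, emit: Callable[[str, str], None]) -> None:
    """Walk the three phases in order, emitting one status event per phase."""
    emit(Phase.THINKING.value, f"Planning how to answer: {question}")
    emit(Phase.TOOL_SELECTION.value, "Choosing the SQL tool")
    emit(Phase.TOOL_EXECUTION.value, "Step 1: Aggregating sales data...")

# Stand-in for a socket.io emitter: collect events in memory.
events = []
run_phases("What were sales by region?", lambda phase, msg: events.append((phase, msg)))
for phase, msg in events:
    print(f"[{phase}] {msg}")
```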

Auto-charting & visualization

run_sql_analysis detects chart-friendly outputs and returns structured chart specs (e.g., { type: "bar" | "funnel", is_temporary: true }).
Copy angle: Ask for a trend; the Analyst instantly renders a temporary bar or line chart inside chat—no extra clicks.
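
As a rough illustration of mapping a result set to a chart spec: the type and is_temporary keys come from the guide, while the detection rule (two columns, numeric second column), the suggest_chart name, and the remaining keys are assumptions, not the product's actual logic.

```python
from numbers import Number

def suggest_chart(columns, rows):
    """Return a chart spec dict for two-column (label, number) results, else None."""
    if len(columns) != 2 or not rows:
        return None
    # Require a real numeric second column; bool is excluded on purpose.
    if not all(isinstance(r[1], Number) and not isinstance(r[1], bool) for r in rows):
        return None
    return {
        "type": "bar",
        "x": columns[0],
        "y": columns[1],
        "data": [{"label": r[0], "value": r[1]} for r in rows],
        "is_temporary": True,  # flag from the guide: render inline, do not persist
    }

spec = suggest_chart(["product", "revenue"], [("gadget", 80.0), ("widget", 150.0)])
print(spec)
```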

Context-aware web research

When internal data isn’t enough, the agent can call a Google Search_context tool to add external signals (holidays, launches, market events).
Copy angle: Data in context. The Analyst can look up real-world causes when your metrics move.

The “wow” moment (try this prompt)

"Show me the top 5 products by margin, and plot the trend for the winner."

What happens:

1) Plans the steps, 2) Writes SQL to calculate margins, 3) Picks the winner, 4) Generates a line chart for that product, all returned in one response.
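
The two queries behind those steps might look like the sketch below. The sales table, its columns, and the definition of margin as revenue minus cost are assumptions for illustration, and sqlite3 again stands in for DuckDB.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (month TEXT, product TEXT, revenue REAL, cost REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("2024-01", "widget", 100.0, 60.0),
        ("2024-02", "widget", 120.0, 70.0),
        ("2024-01", "gadget", 90.0, 80.0),
        ("2024-02", "gadget", 95.0, 85.0),
    ],
)

# Step 2: rank products by total margin (assumed here to be revenue - cost).
top5 = conn.execute(
    """
    SELECT product, SUM(revenue - cost) AS margin
    FROM sales
    GROUP BY product
    ORDER BY margin DESC
    LIMIT 5
    """
).fetchall()

# Step 3: the winner is the first row of the ranking.
winner = top5[0][0]

# Step 4: monthly margin for the winner, ready to feed a line chart.
trend = conn.execute(
    """
    SELECT month, SUM(revenue - cost) AS margin
    FROM sales
    WHERE product = ?
    GROUP BY month
    ORDER BY month
    """,
    (winner,),
).fetchall()

print(top5, winner, trend)
```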

Quick start

1) Open AI Data Analyst (/ai-data-analyst).
2) Connect your dataset or pick a sample; the agent loads it into DuckDB automatically.
3) Paste the prompt above (or your own). Watch the reasoning stream, the SQL, and the chart appear.
4) Save the chart or hand off the insight into Slides/Docs for your team.

Implementation notes (for engineers)

Core loop: AnalysisAgent orchestrates planning → tool selection → SQL execution → validation → response.
Data layer: _load_dataset_into_duckdb builds the in-memory warehouse for fast analytics.
Visualization: run_sql_analysis maps tabular results to chart configs and flags them as is_temporary for inline rendering.
Extensibility: Add new chart mappings or external tools (e.g., anomaly detection) without changing the reasoning loop.
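
The extensibility point can be sketched with a small tool registry. Everything here (TOOLS, register_tool, call_tool, and the detect_anomalies tool itself) is hypothetical and only shows the general pattern: the loop dispatches by name, so adding a tool means registering a function rather than editing the loop.

```python
from typing import Callable, Dict

TOOLS: Dict[str, Callable[..., object]] = {}

def register_tool(name: str):
    """Decorator that adds a function to the registry under the given name."""
    def decorator(fn):
        TOOLS[name] = fn
        return fn
    return decorator

def call_tool(name: str, **kwargs):
    """The loop only needs this dispatcher; it never names individual tools."""
    if name not in TOOLS:
        raise KeyError(f"Unknown tool: {name}")
    return TOOLS[name](**kwargs)

@register_tool("detect_anomalies")
def detect_anomalies(values, threshold=2.0):
    """Flag values more than `threshold` population standard deviations from the mean."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    if std == 0:
        return []
    return [v for v in values if abs(v - mean) / std > threshold]

flagged = call_tool("detect_anomalies", values=[10, 10, 10, 10, 10, 10, 10, 10, 10, 100])
print(flagged)
```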

Bring this into your stack

Embed the Analyst in your app via the chat widget, or route it through your existing BI permissions.
Set guardrails by constraining which tables the agent can query and by reviewing emitted SQL in logs.
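
One simple way to apply both guardrails is to stage only allowlisted tables into the agent's in-memory database and log every statement before it runs. The helper names (stage_allowed_tables, run_logged), the logger name, and the sample tables are assumptions for illustration; sqlite3 stands in for DuckDB.

```python
import logging
import sqlite3

logger = logging.getLogger("analyst.sql")

def stage_allowed_tables(source, allowed):
    """Copy only allowlisted tables into a fresh in-memory database for the agent."""
    sandbox = sqlite3.connect(":memory:")
    for name, (columns, rows) in source.items():
        if name not in allowed:
            continue  # tables outside the allowlist are never loaded
        # Table and column names come from trusted config, not from the agent.
        column_list = ", ".join(columns)
        placeholders = ", ".join("?" for _ in columns)
        sandbox.execute(f"CREATE TABLE {name} ({column_list})")
        sandbox.executemany(f"INSERT INTO {name} VALUES ({placeholders})", rows)
    return sandbox

def run_logged(conn, sql):
    """Log every emitted statement before executing it, so it can be reviewed later."""
    logger.info("agent SQL: %s", sql)
    return conn.execute(sql).fetchall()

source = {
    "sales": (["product", "revenue"], [("widget", 120.0)]),
    "salaries": (["employee", "amount"], [("ana", 5000.0)]),
}
sandbox = stage_allowed_tables(source, allowed={"sales"})
print(run_logged(sandbox, "SELECT product, revenue FROM sales"))
try:
    run_logged(sandbox, "SELECT * FROM salaries")
except sqlite3.OperationalError as exc:
    print("blocked:", exc)
```

Because the restricted table is never staged, a query against it fails inside the database instead of depending on the agent to behave.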

ML Clever Team

Product Education