SQL Assistant for Spring Boot: Guardrails, Read-only Enforcement, and Audit Logs

A runnable assistant that converts natural language questions into SQL with schema grounding, read-only enforcement, and full audit logging.

Verified v1.1.0 Redhat 8/9 / Ubuntu / macOS / Windows (Docker) Java 17 · Spring Boot 3.x · PostgreSQL · Spring Security · Docker Compose
Register account for free
Unlock full implementation + downloads
Account access required
This solution includes runnable code bundles and full implementation details intended for production use.

1. Overview

This solution implements a SQL Assistant for Spring Boot that converts natural-language questions into SQL with explicit schema grounding, enforces read-only execution, applies query budgeting, and records a full audit trail covering prompt inputs, generated SQL, policy decisions, execution metadata, and results summaries.

In production, “NL → SQL” approaches fail for predictable reasons:

  • Ungrounded generation: the model hallucinates tables/columns or misunderstands joins and cardinality without an explicit schema contract.
  • Unsafe SQL: generated statements may include writes (INSERT/UPDATE/DELETE), DDL, dangerous functions, or multi-statement payloads.
  • Inadequate enforcement: many systems “ask the model to be safe” instead of enforcing safety at the database boundary.
  • No traceability: teams cannot answer “who ran what, against which schema, under which policy, and what happened,” which blocks production adoption.
  • Unbounded costs and load: unconstrained queries can scan large tables or execute expensive joins, harming production systems.

This implementation is production-ready because it:

  • Uses a two-phase contract: schema grounding + SQL generation with strict output expectations.
  • Applies deterministic guardrails: SQL parsing + allowlist policy + multi-statement rejection + parameter validation.
  • Enforces read-only at the database boundary using a dedicated DB role with read-only privileges and transaction-level read-only settings.
  • Adds budget controls: row limits, statement timeouts, and execution classification (cheap vs expensive) before execution.
  • Persists an audit-grade execution record including input, normalized SQL fingerprint, policy outcome, runtime, row counts, and error codes.

2. Architecture

System flow:

  • Client → SQL Assistant API (POST /api/sql/query)
  • SQL Assistant → Schema Introspector (PostgreSQL catalogs or curated schema table)
  • SQL Assistant → LLM Provider (OpenAI-compatible API) for SQL generation
  • SQL Assistant → Guardrails Engine (SQL parser + policy evaluator)
  • SQL Assistant → PostgreSQL (read-only connection) for query execution
  • Audit events → PostgreSQL audit tables (query runs, decisions, errors)

Key components:

  • Query Controller: request validation, auth, idempotency handling.

  • Schema Service: produces a stable schema “contract” used for grounding (tables, columns, PK/FK hints, row count estimates where available).

  • LLM SQL Generator: requests SQL with explicit constraints (single statement, SELECT-only, required LIMIT).

  • Guardrails Engine:

    • parse SQL into AST
    • enforce single-statement SELECT-only
    • denylist risky constructs
    • enforce required predicates/limits where configured
    • normalize SQL and compute fingerprint
  • Execution Engine:

    • binds parameters safely (no string concatenation)
    • sets statement timeout
    • sets transaction read-only
    • enforces max rows and result truncation for response
  • Audit Logger: persists full run record and decision trace.

External dependencies:

  • PostgreSQL (schema + audit + optional sample dataset)
  • OpenAI-compatible API (SQL generation)
  • Docker Compose (local runtime)

Trust boundaries:

  • Client → Service: authenticated boundary; requests are attributed to an actor.
  • Service → LLM Provider: untrusted output boundary; all generated SQL is treated as untrusted input.
  • Service → PostgreSQL: privileged boundary; enforced by separate read-only role and transaction policy.

3. Key Design Decisions

Technology stack:

  • Spring Boot 3.x + Java 17: stable runtime and ecosystem for secure API + DB access.
  • Spring Security: role enforcement and attribution for audit logs.
  • PostgreSQL: both the target query engine and the durable audit store.
  • Docker Compose: reproducible local evidence runs.
  • SQL parser library (e.g., JSqlParser): required to enforce deterministically at the syntax/AST level rather than regex-based checks.

Data storage model:

  • Audit is stored in append-only tables with explicit foreign keys where appropriate.

  • SQL is stored in multiple forms:

    • raw generated SQL
    • normalized SQL
    • a fingerprint (hash) for deduplication and trend analysis
  • Result bodies are not fully persisted by default; store summaries and optional sampled rows to minimize data exposure.

Synchrony vs asynchrony:

  • Queries run synchronously for interactive usage. The service enforces timeouts and row caps to keep latency bounded.
  • An optional asynchronous mode can be added later (job-based execution) for expensive queries.

Error handling and retries:

  • LLM calls: bounded retries for transient network/5xx errors only.
  • SQL execution: no retries by default (queries may be non-idempotent at the system level due to load); surface a stable error code and require explicit client retry if desired.
  • All failures produce an audit record with an error classification.

Idempotency strategy:

  • Optional Idempotency-Key on the query endpoint:

    • if the same actor submits the same key, return the prior run result (or run id) without re-executing.
    • idempotency is scoped to actor + key, and optionally includes a request hash to detect mismatched reuse.

4. Data Model

Core tables and purpose:

  • sql_assistant_runs

    • Purpose: primary record of each NL→SQL request and execution attempt.

    • Key columns:

      • id (uuid)
      • actor_user_id, actor_email
      • question_text
      • schema_version (or schema_snapshot_id)
      • generated_sql_raw
      • generated_sql_normalized
      • sql_fingerprint (hash)
      • policy_decision (ALLOW, DENY)
      • deny_reason_code (nullable)
      • executed (boolean)
      • duration_ms
      • row_count
      • result_sample_json (nullable, truncated)
      • error_code / error_message_truncated
      • created_at
    • Indexing:

      • (actor_user_id, created_at desc)
      • (sql_fingerprint, created_at desc)
      • (policy_decision, created_at desc)
  • sql_assistant_decisions

    • Purpose: detailed policy trace for explainability (which rule fired).

    • Key columns:

      • id, run_id
      • rule_name
      • outcome (PASS, FAIL)
      • details_json
      • created_at
    • Indexing:

      • (run_id)
  • schema_snapshots (optional but recommended)

    • Purpose: stable grounding input for the LLM and reproducible audits.

    • Key columns:

      • id, version, snapshot_json, created_at
    • Indexing:

      • (version) unique

Design intent:

  • Every request yields a run row even if denied before execution.
  • Decisions are traceable and reproducible, enabling audits and policy tuning without ambiguity.

5. API Surface

Public/internal endpoints:

  • POST /api/sql/query — Generate SQL from natural language, enforce guardrails, and execute if allowed (ROLE_USER)

    • Optional: Idempotency-Key header
  • GET /api/sql/runs/{id} — Fetch a prior run (question, SQL, decision, execution metadata, sampled results) (ROLE_USER, owner-only)

Admin/ops (optional):

  • GET /admin/sql/runs?decision=DENY&limit=... — Search runs and denials (ROLE_ADMIN)
  • GET /admin/sql/rules — List active guardrail rules and configuration (ROLE_ADMIN)

6. Security Model

Authentication:

  • Spring Security authentication (local Basic auth for runnable mode; replace with platform auth in production).

Authorization (roles):

  • ROLE_USER can create runs and view their own runs.
  • ROLE_ADMIN can view global audit and policy outcomes.

Read-only enforcement:

  • Database uses a dedicated sql_ro role with:

    • SELECT privileges on allowed schemas/tables
    • no write privileges
  • Each execution uses:

    • SET TRANSACTION READ ONLY
    • statement timeout (e.g., SET LOCAL statement_timeout = '3s')
    • optional SET LOCAL idle_in_transaction_session_timeout

CSRF considerations:

  • If cookie/session-based auth is used, keep CSRF enabled on POST endpoints.
  • If stateless tokens are used, CSRF is not required.

Data isolation guarantees:

  • Run records are scoped to the authenticated actor (owner-only read endpoints).
  • Result sampling is truncated and optionally column-filtered to avoid exposing sensitive fields by default.
  • LLM output is not trusted; it is treated as untrusted input and must pass deterministic enforcement.

7. Operational Behavior

Startup behavior:

  • Docker Compose starts PostgreSQL and the application.
  • Flyway applies migrations for audit tables and any sample dataset.
  • On startup, the schema snapshot can be generated or loaded (configurable).

Failure modes:

  • LLM unavailable: run is recorded with error_code=LLM_UNAVAILABLE and no execution occurs.
  • Guardrails deny: run is recorded with policy_decision=DENY and deny_reason_code, no execution occurs.
  • Query timeout: run is recorded with error_code=QUERY_TIMEOUT.
  • SQL parse failure: run is recorded with error_code=SQL_PARSE_FAILED.
  • DB errors (permissions, missing objects): run is recorded with error_code=DB_ERROR.

Retry and timeout behavior:

  • LLM call retries are bounded and only for transient failures.
  • SQL execution uses strict statement timeout; retries are not automatic.

Observability hooks:

  • Structured logs include runId, actor, sql_fingerprint, decision, duration_ms.

  • Metrics can be added for:

    • denials by reason
    • execution latency distributions
    • timeouts and DB errors
  • Tracing (optional) can wrap:

    • schema snapshot fetch
    • LLM generation
    • guardrail evaluation
    • DB execution

8. Local Execution

Prerequisites:

  • Docker + Docker Compose v2

Environment variables:

  • LLM_BASE_URL — OpenAI-compatible base URL
  • LLM_API_KEY — provider key
  • LLM_MODEL — model name
  • SQL_STATEMENT_TIMEOUT_MS — execution timeout
  • SQL_MAX_ROWS — max rows returned (hard cap)
  • Local auth credentials (for runnable mode)

Run:

docker compose up -d --build

Health:

curl http://localhost:8080/health

Generate and execute a query:

curl -u user:password -X POST http://localhost:8080/api/sql/query \
  -H 'Content-Type: application/json' \
  -d '{"question":"Top 10 customers by total spend last month"}'

Fetch run details:

RUN_ID="<uuid>"
curl -u user:password http://localhost:8080/api/sql/runs/${RUN_ID}

Verification steps:

  • Confirm denied writes are blocked:
curl -u user:password -X POST http://localhost:8080/api/sql/query \
  -H 'Content-Type: application/json' \
  -d '{"question":"Delete all orders older than 2020"}'
  • Confirm audit records exist in PostgreSQL:
docker exec -it <postgres-container> psql -U <user> -d <db> -c "select policy_decision, deny_reason_code, created_at from sql_assistant_runs order by created_at desc limit 10;"

9. Evidence Pack (MANDATORY)

  • [ ] Service startup logs showing Flyway migrations applied and readiness

  • [ ] Health endpoint response

  • [ ] Successful query run:

    • request/response logs for POST /api/sql/query
    • returned runId
    • policy_decision=ALLOW
    • row_count and duration_ms populated
  • [ ] Guardrail denial run:

    • request/response logs for a write-intent question
    • policy_decision=DENY with deny_reason_code
  • [ ] SQL parsing enforcement evidence:

    • request that induces multi-statement output
    • denial with deny_reason_code=MULTI_STATEMENT
  • [ ] Read-only DB enforcement evidence:

    • attempt to execute a write statement (even if bypass attempted) fails at DB permission boundary
    • error recorded with stable error_code=DB_READONLY_VIOLATION (or equivalent)
  • [ ] Query budgeting evidence:

    • a deliberately expensive query triggers statement timeout or is denied by policy
    • recorded error_code=QUERY_TIMEOUT or deny_reason_code=EXPENSIVE_QUERY
  • [ ] Database records after execution:

    • sql_assistant_runs row for each run
    • sql_assistant_decisions rows showing policy trace
  • [ ] Idempotency evidence (if enabled):

    • same Idempotency-Key returns same runId without re-executing

10. Known Limitations

  • Does not guarantee semantic correctness of generated SQL; correctness is bounded by schema grounding and validation rules.
  • Does not implement full data classification or column-level masking by default; result sampling must be configured for sensitive datasets.
  • Single-node deployment model; does not include distributed rate limiting or queue-based execution for heavy workloads.
  • Advanced query planning safeguards (cost-based planner checks, EXPLAIN gating) are optional extensions and not mandatory in the baseline.

11. Extension Points

  • Add EXPLAIN-based gating:

    • run EXPLAIN (FORMAT JSON) and deny if estimated cost/rows exceed policy thresholds.
  • Introduce column-level policies:

    • deny or mask sensitive columns by schema metadata rules.
  • Add asynchronous execution:

    • queue runs and allow polling for results for long queries.
  • Add tenant isolation:

    • per-tenant schema snapshots, policies, and budgets.
  • Integrate OpenTelemetry:

    • traces for LLM generation, guardrail evaluation, and DB execution.
  • Add evaluation harness:

    • curated NL→SQL test suite with pass/fail and regression tracking for policy tuning.
Changelog
Release notes

1.1.0

Locked
Register account to unlock implementation details and assets.
Account


  • Solution write-up + runnable implementation
  • Evidence images (when published)
  • Code bundle downloads (when enabled)
Evidence
7 item(s)
code-structure.png
01_up_log.png
02_health.png
03_allowed.png
04_denied.png
05_multi.png
06_audit.png
Code downloads
2 file(s)
sql-assistant-guardrails_v1.1.zip
ZIP bundle
Locked
sql-assistant-guardrails.zip
ZIP bundle
Locked