Published June 17, 2026

How to Build AI Agents on Customer Warehouse Data: Snowflake, BigQuery, and Supabase

June 17, 2026

The data an AI agent needs to be useful is often sitting in the customer's warehouse. The metrics an ops agent should act on live in Snowflake. The analytics a reporting agent should summarize live in BigQuery. The application data a product agent should read and update lives in Supabase.

To act on that data, an agent needs four things: read current records, query them with filters or SQL, write results back, and react when the data changes. Across three different warehouses, that is normally three separate integrations — three auth models, three query languages, three webhook mechanisms.

The Unified Datastore API provides all four capabilities through one interface. This guide shows how to build the complete agent loop — read, query, write, and event-driven updates — against Snowflake, BigQuery, and Supabase.

What an agent needs from warehouse data

An agent that answers questions about warehouse data only needs to read. An agent that acts on warehouse data needs the full loop:

Read — retrieve current records and discover what tables and fields exist, so the agent knows what data is available before it acts.

Query — filter, sort, and aggregate records. Some questions need a structured filter ("records where status is overdue, sorted by amount"); some need raw SQL ("a window function over the last 90 days"). The agent needs both.

Write — create or update records. An agent that computes a result, flags a row, or appends a derived value has to write that change back to the warehouse.

React — respond when data changes. An agent that monitors a table for new or updated rows needs change events delivered to it, not a polling loop it has to maintain.

Snowflake, BigQuery, and Supabase each support all four operations through the Unified Datastore API. The agent logic is written once against the unified object model, regardless of which warehouse a given customer uses.

The unified datastore object model

The Datastore API normalizes four objects across every supported integration:

Database — the workbook, project, or schema connection. A Snowflake database, a BigQuery project, a Supabase project.

Table — a table or view within a database, including its field definitions (name, type, is_primary_key, is_nullable) and relationships. This is how an agent discovers the schema before querying.

Record — a single row. The fields property is a typed dictionary — each field carries a string, number, boolean, date, or selection value — so the agent works with structured, typed data rather than raw JSON.

Query — the most important object for agent workflows. It accepts either a structured filter model (filters, sorting, aggregation, grouping) or raw SQL via the query.sql field. The same Query object works across all three warehouses; the structured model maps to each platform's native query mechanism, and raw SQL passes through where the warehouse supports it.

That dual-mode Query design is what lets an agent run a portable filtered query against any datastore and drop down to warehouse-specific SQL when it needs the power.

Step 1: Connect and discover the schema

Before an agent can act on warehouse data, it needs to know what's there. List the tables in a database, then read a table's field definitions.

import { UnifiedTo } from '@unified-api/typescript-sdk';

const sdk = new UnifiedTo({
  security: { jwt: process.env.UNIFIED_API_KEY! },
});

const connectionId = process.env.UNIFIED_CONNECTION_ID!;

// List the tables available in the connected warehouse
const tables = await sdk.datastore.listDatastoreTables({
  connectionId,
  limit: 50,
});

// Each table includes its field definitions and types,
// so the agent knows the schema before querying
for (const table of tables ?? []) {
  console.log(`Table: ${table.name}`);
  for (const field of table.fields ?? []) {
    console.log(`  ${field.name} (${field.type})`);
  }
}

The fields array gives the agent the column names and types — TEXT, NUMBER, DATE, BOOLEAN, SINGLE_SELECT, and others — so it can construct valid queries without hardcoding the schema per customer.

Step 2: Query the data

The Query object runs in two modes. Use the structured filter model for portable queries that work across any warehouse, and raw SQL when the agent needs warehouse-specific capability.

Structured filter query — portable across all datastores:

const result = await sdk.datastore.createDatastoreQuery({
  connectionId,
  datastoreQuery: {
    table_id: 'invoices',
    query: {
      filter: {
        type: 'AND',
        filters: [
          { field: 'status', condition: { eq: { string: 'overdue' } } },
          { field: 'balance', condition: { gt: { number: 0 } } },
        ],
      },
      sort_fields: ['balance'],
      sort_order: 'desc',
      limit: 25,
    },
  },
});

const overdueInvoices = result?.response?.items ?? [];

Raw SQL query — when the agent needs warehouse-specific power:

const result = await sdk.datastore.createDatastoreQuery({
  connectionId,
  datastoreQuery: {
    query: {
      sql: `
        SELECT customer_id, SUM(balance) AS total_overdue
        FROM invoices
        WHERE status = 'overdue'
        GROUP BY customer_id
        ORDER BY total_overdue DESC
        LIMIT 25
      `,
    },
  },
});

const topOverdueAccounts = result?.response?.items ?? [];

The Query object also supports aggregation through the structured model — count, sum, avg, min, max, and group_by — so an agent can compute totals without writing SQL when a portable query is preferable.

Step 3: Write results back

An agent that computes a result writes it back to the warehouse. Create a new record or update an existing one through the same unified Record model.

// Create a record — e.g. the agent writes a computed risk score row
await sdk.datastore.createDatastoreRecord({
  connectionId,
  datastoreRecord: {
    table_id: 'account_scores',
    fields: {
      account_id: { string: 'acct_8472' },
      risk_score: { number: 0.82 },
      computed_at: { date: new Date().toISOString() },
      flagged: { boolean: true },
    },
  },
});

// Update an existing record — e.g. the agent updates a status field
await sdk.datastore.updateDatastoreRecord({
  connectionId,
  id: 'rec_12345',
  datastoreRecord: {
    table_id: 'invoices',
    fields: {
      status: { string: 'flagged_for_review' },
    },
  },
});

The fields dictionary is typed — each value specifies whether it's a string, number, boolean, date, or selection — so writes preserve the warehouse's column types without manual casting.

Step 4: React to changes with webhooks

An agent that monitors warehouse data subscribes to change events rather than polling. Snowflake, BigQuery, and Supabase support record_created and record_updated events through the Unified webhook system.

// Subscribe to record changes on a table
const webhook = await sdk.unified.createUnifiedWebhook({
  include_all: false,           // set true to backfill existing records first
  connection_id: connectionId,
  hook_url: `${process.env.WEBHOOK_BASE_URL}/datastore-events`,
  object_type: 'datastore_record',
  event: 'updated',             // fires on both created and updated records
  interval: 1,                  // minutes (paid accounts); 60 for free
});

When a record changes, the agent's endpoint receives the event and can act on it — query related data, compute a result, and write it back. This closes the loop: the agent responds to warehouse changes in near real time without maintaining its own polling infrastructure.

The complete loop

Put the four steps together and you have an agent that operates on warehouse data end to end:

A new row lands in a Snowflake table. Unified delivers a record_updated event to the agent's endpoint.
The agent queries related records to gather context — a structured filter query for portable logic, or raw SQL for a more complex aggregation.
The agent computes a result — a score, a flag, a summary.
The agent writes the result back to the warehouse as a new record or a field update.

The same code runs whether the customer's data is in Snowflake, BigQuery, or Supabase. The agent logic is written against the unified Database, Table, Record, and Query objects — not against three separate warehouse APIs.

Why this is hard to build directly

Connecting an agent to a single warehouse looks straightforward. Connecting it to three, and maintaining those connections, is where the cost shows up.

Snowflake, BigQuery, and Supabase each have a different authentication model, a different client library, a different query interface, and different change-detection mechanisms. Building the read-query-write-event loop against each one means three implementations of the same agent capability — and three sets of breakage to monitor as those APIs evolve.

The query layer alone is a meaningful lift: each warehouse has its own SQL dialect and its own structured-access patterns. Normalizing them so an agent can run the same filtered query across all three — and drop to raw SQL only when needed — is the kind of infrastructure that takes longer to maintain than to build.

What Unified handles and what you own

The boundary is explicit:

Unified handles:

Authorized connections to Snowflake, BigQuery, Supabase, and other datastore integrations
Normalized Database, Table, Record, and Query objects across warehouses
Query translation — structured filters mapped to each platform, raw SQL passthrough where supported
Real-time, pass-through reads and writes — fetched live from the source, with no storage of customer data
Change detection through native and [virtual webhooks](/blog/unlock_real_time_data_with_virtual_webhooks)

You own:

The agent logic and reasoning
The LLM and any orchestration
The decisions about what to query, what to compute, and what to write back

Unified is the data access layer. Every query and write hits the source warehouse live; Unified does not store or cache end-customer data.

For agents that combine warehouse data with other sources — CRM records, tickets, documents — see how the MCP server exposes live data as callable tools and how a context engine assembles authorized data across categories.

→ Explore the Datastore API documentation

→ Talk to us about building AI agents on warehouse data

All articles