How to Train AI on Accounting Data with Unified's Accounting API
January 9, 2026
Training models on accounting data sounds like a data science problem. In reality, it's a product problem first.
As soon as you support multiple accounting systems, the shape of your training data starts to drift. Amounts may be signed or unsigned. Credits and debits may be inferred instead of explicit. Accounts might represent categories, ledgers, or reporting groupings depending on the provider. Counterparties may be embedded in transactions, split across contacts, or missing entirely.
For a PM, this creates hard questions upstream of model quality:
- Can we guarantee consistent inputs across customers on different accounting systems?
- Can we retrain models incrementally without rebuilding the dataset every time?
- Can we explain model behavior if the underlying accounting semantics vary by provider?
Most AI features built on accounting data quietly accumulate exceptions: vendor-specific preprocessing, system-specific heuristics, and training pipelines that only work reliably for a subset of integrations.
Unified's Accounting API is designed to eliminate that divergence at the source. Instead of normalizing data after ingestion, Unified exposes transactions, accounts, and contacts through a consistent schema across providers. Signed amounts, stable timestamps, and account attribution are standardized before the data ever enters your training pipeline.
This guide shows how to turn those normalized accounting objects into a flat, training-ready dataset—without conditional logic for QuickBooks vs. Xero vs. NetSuite, and without rethinking your pipeline as new accounting systems are added.
Prerequisites
- Node.js v18+
- A Unified account with an Accounting integration enabled
- Your Unified API key
- A customer Accounting connectionId
Step 1: Set up your project
mkdir accounting-ai-training-demo
cd accounting-ai-training-demo
npm init -y
npm install @unified-api/typescript-sdk dotenv
Create a .env file:
UNIFIED_API_KEY=your_unified_api_key
CONNECTION_ACCOUNTING=your_customer_accounting_connection_id
Step 2: Initialize the SDK
import "dotenv/config";
import { UnifiedTo } from "@unified-api/typescript-sdk";
const { UNIFIED_API_KEY, CONNECTION_ACCOUNTING } = process.env;
const sdk = new UnifiedTo({
security: { jwt: UNIFIED_API_KEY! },
});
Step 3: Understand the normalized Accounting objects
Unified's Accounting models use snake_case field names; the API docs and the TypeScript types shown here follow that convention.
For training datasets, the most useful base object is AccountingTransaction because it provides:
- signed amounts (total_amount is negative for CREDIT, positive for DEBIT)
- consistent account attribution (account_id)
- counterparty hints (contacts[] with is_supplier / is_customer)
- stable timestamps (created_at, updated_at)
Transactions (AccountingTransaction)
Key fields used in this guide:
- id, created_at, updated_at
- total_amount (signed)
- currency
- account_id
- type (e.g., Bill, JournalEntry, CreditCardCharge, etc.)
- memo, reference
- contacts[] (IDs plus supplier/customer flags)
- lineitems[] (optional, for more detailed attribution)
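For orientation, a normalized transaction might look like the illustrative record below (the values are made up, not from a real provider); note the negative total_amount on a CREDIT:
// Illustrative values only; field names follow the normalized schema described above.
const exampleTransaction = {
  id: "txn_123",
  created_at: "2025-11-14T09:30:00.000Z",
  updated_at: "2025-11-20T16:05:00.000Z",
  total_amount: -1250.0, // CREDIT, so the signed amount is negative
  currency: "USD",
  type: "Bill",
  account_id: "acct_software",
  memo: "Annual SaaS subscription",
  contacts: [{ id: "contact_vendor_42", is_supplier: true, is_customer: false }],
};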
Accounts (AccountingAccount)
Accounts turn account_id into labels your model can learn:
- id, name
- type (e.g., EXPENSE, REVENUE, ACCOUNTS_PAYABLE, etc.)
- hierarchy fields like parent_id (optional)
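Continuing the illustrative example above, the account referenced by that transaction's account_id might look like this:
const exampleAccount = {
  id: "acct_software",
  name: "Software & Subscriptions",
  type: "EXPENSE",
  parent_id: "acct_operating_expenses", // optional hierarchy
};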
Contacts (AccountingContact) (optional but useful)
Contacts let you attach stable vendor/customer labels:
- id
- name / company_name
- is_supplier, is_customer
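And the supplier contact referenced from the transaction's contacts[], again with illustrative values:
const exampleContact = {
  id: "contact_vendor_42",
  name: "Acme Cloud Services",
  is_supplier: true,
  is_customer: false,
};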
Step 4: Fetch all transactions (with pagination)
Below are partial TypeScript shapes showing only the fields used in this example.
import type { UnifiedTo } from "@unified-api/typescript-sdk";
export type AccountingTransaction = {
id?: string;
created_at?: string; // ISO date
updated_at?: string; // ISO date
memo?: string;
reference?: string;
total_amount?: number; // negative for CREDIT, positive for DEBIT
tax_amount?: number; // negative for CREDIT, positive for DEBIT
sub_total_amount?: number;
account_id?: string;
currency?: string;
type?: string;
contacts?: {
id?: string;
is_customer?: boolean;
is_supplier?: boolean;
}[];
lineitems?: {
id?: string;
unit_quantity?: number;
unit_amount?: number;
total_amount?: number; // signed
account_id?: string;
object_type?: string;
name?: string;
description?: string;
}[];
};
export async function fetchAllTransactions(
sdk: UnifiedTo,
connectionId: string,
opts?: {
pageSize?: number;
updated_gte?: string;
sort?: "name" | "updated_at" | "created_at";
order?: "asc" | "desc";
query?: string;
contact_id?: string;
fields?: string;
raw?: string;
}
): Promise<AccountingTransaction[]> {
const pageSize = opts?.pageSize ?? 100;
let offset = 0;
const out: AccountingTransaction[] = [];
while (true) {
const page = await sdk.accounting.listAccountingTransactions({
connectionId,
limit: pageSize,
offset,
updated_gte: opts?.updated_gte,
sort: opts?.sort ?? "updated_at",
order: opts?.order ?? "asc",
query: opts?.query ?? "",
contact_id: opts?.contact_id ?? "",
fields: opts?.fields ?? "",
raw: opts?.raw ?? "",
});
if (!page || page.length === 0) break;
out.push(...page);
offset += pageSize;
}
return out;
}
Step 5: Fetch accounts (for labeling)
import type { UnifiedTo } from "@unified-api/typescript-sdk";
export type AccountingAccount = {
id?: string;
name?: string;
type?:
| "ACCOUNTS_PAYABLE"
| "ACCOUNTS_RECEIVABLE"
| "BANK"
| "CREDIT_CARD"
| "FIXED_ASSET"
| "LIABILITY"
| "EQUITY"
| "EXPENSE"
| "REVENUE"
| "OTHER";
parent_id?: string;
};
export async function fetchAllAccounts(
sdk: UnifiedTo,
connectionId: string,
opts?: {
pageSize?: number;
updated_gte?: string;
sort?: "name" | "updated_at" | "created_at";
order?: "asc" | "desc";
query?: string;
fields?: string;
raw?: string;
}
): Promise<AccountingAccount[]> {
const pageSize = opts?.pageSize ?? 100;
let offset = 0;
const out: AccountingAccount[] = [];
while (true) {
const page = await sdk.accounting.listAccountingAccounts({
connectionId,
limit: pageSize,
offset,
updated_gte: opts?.updated_gte,
sort: opts?.sort ?? "updated_at",
order: opts?.order ?? "asc",
query: opts?.query ?? "",
fields: opts?.fields ?? "",
raw: opts?.raw ?? "",
});
if (!page || page.length === 0) break;
out.push(...page);
offset += pageSize;
}
return out;
}
Step 6: Convert transactions into training rows
This step turns raw accounting objects into a flat dataset suitable for model training. Each row is one normalized transaction with stable labels (account name/type) and optional vendor/customer hints.
Key points:
- Keep total_amount signed (negative credits, positive debits).
- Keep currency as a feature. Don't sum across currencies unless you explicitly convert later.
- Use updated_at for incremental exports (updated_gte), and keep created_at as an event timestamp feature.
export type TrainingRow = {
transaction_id: string;
created_at: string | null;
updated_at: string | null;
amount: number;
currency: string | null;
transaction_type: string | null;
memo: string | null;
reference: string | null;
account_id: string | null;
account_name: string | null;
account_type: string | null;
// Supplier/customer hints from normalized contacts on the transaction
counterparty_contact_id: string | null;
counterparty_is_supplier: boolean | null;
counterparty_is_customer: boolean | null;
};
function indexAccounts(accounts: AccountingAccount[]): Record<string, AccountingAccount> {
return Object.fromEntries(accounts.filter((a) => a.id).map((a) => [a.id!, a]));
}
function pickCounterparty(
tx: AccountingTransaction
): { id: string | null; is_supplier: boolean | null; is_customer: boolean | null } {
// Prefer a supplier contact when present
const supplier = tx.contacts?.find((c) => c.id && c.is_supplier);
if (supplier?.id) return { id: supplier.id, is_supplier: true, is_customer: supplier.is_customer ?? null };
// Otherwise fall back to any contact ID if present
const any = tx.contacts?.find((c) => c.id);
if (any?.id) return { id: any.id, is_supplier: any.is_supplier ?? null, is_customer: any.is_customer ?? null };
return { id: null, is_supplier: null, is_customer: null };
}
export function toTrainingRows(
transactions: AccountingTransaction[],
accounts: AccountingAccount[]
): TrainingRow[] {
const accountIndex = indexAccounts(accounts);
return transactions
.filter((t) => t.id)
.map((t) => {
const acct = t.account_id ? accountIndex[t.account_id] : undefined;
const cp = pickCounterparty(t);
return {
transaction_id: t.id!,
created_at: t.created_at ?? null,
updated_at: t.updated_at ?? null,
amount: Number(t.total_amount ?? 0),
currency: t.currency ?? null,
transaction_type: t.type ?? null,
memo: t.memo ?? null,
reference: t.reference ?? null,
account_id: t.account_id ?? null,
account_name: acct?.name ?? null,
account_type: acct?.type ?? null,
counterparty_contact_id: cp.id,
counterparty_is_supplier: cp.is_supplier,
counterparty_is_customer: cp.is_customer,
};
});
}
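If you want to persist the dataset rather than keep it in memory, one lightweight option is JSON Lines (one row per line), which most training tooling can stream. This is a minimal sketch using Node's built-in fs module; the file path is an arbitrary choice, not part of Unified's API:
import { writeFileSync } from "fs";
export function writeTrainingRowsJsonl(
  rows: TrainingRow[],
  path = "./training-rows.jsonl"
): void {
  // One JSON object per line, so downstream training jobs can stream the file.
  const jsonl = rows.map((row) => JSON.stringify(row)).join("\n");
  writeFileSync(path, jsonl + "\n", "utf8");
}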
Step 7 (optional): Incremental export using updated_gte
For training pipelines, you rarely want to rebuild the entire dataset on every run. Unified list endpoints support updated_gte, so you can export only records that changed since your last successful run.
Example usage:
const lastRun = "2025-12-01T00:00:00.000Z";
const transactions = await fetchAllTransactions(sdk, connectionId, {
pageSize: 100,
updated_gte: lastRun,
sort: "updated_at",
order: "asc",
});
Persist lastRun in your system (DB, object storage, etc.) and advance it only after a successful export.
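One simple way to do that, sketched below with Node's built-in fs module, is a local cursor file; the file name and shape are illustrative, and in production you would likely use your database or object storage instead. Capture the new cursor timestamp before the export starts so records updated mid-run are not skipped on the next pass.
import { existsSync, readFileSync, writeFileSync } from "fs";
// Hypothetical local cursor file; replace with your own persistence layer.
const CURSOR_FILE = "./export-cursor.json";
export function readLastRun(): string | undefined {
  if (!existsSync(CURSOR_FILE)) return undefined;
  return JSON.parse(readFileSync(CURSOR_FILE, "utf8")).lastRun;
}
export function writeLastRun(lastRun: string): void {
  // Call this only after the export has been persisted successfully.
  writeFileSync(CURSOR_FILE, JSON.stringify({ lastRun }, null, 2));
}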
Step 8: Putting it all together
This example fetches transactions + accounts, builds training rows, and prints a small sample.
async function main() {
const connectionId = CONNECTION_ACCOUNTING;
if (!connectionId) throw new Error("Missing CONNECTION_ACCOUNTING");
const [transactions, accounts] = await Promise.all([
fetchAllTransactions(sdk, connectionId, { pageSize: 100, sort: "updated_at", order: "asc" }),
fetchAllAccounts(sdk, connectionId, { pageSize: 100, sort: "updated_at", order: "asc" }),
]);
const rows = toTrainingRows(transactions, accounts);
console.log("Training rows (first 5):", rows.slice(0, 5));
console.log("Row count:", rows.length);
}
main().catch(console.error);
You now have a repeatable export flow that pages through normalized accounting transactions, enriches them with account metadata, and outputs a consistent training dataset across accounting systems.
This dataset is suitable for model training, evaluation, and retraining workflows without changing your integration logic per provider.
Optional next steps are incremental exports using updated_gte, adding contact enrichment for vendor names, and incorporating period-level labels from Profit & Loss or Cash Flow reports when your training objective requires it.
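As one example, contact enrichment can reuse the lookup pattern from Step 6: fetch contacts with the same pagination approach used for transactions and accounts, then join counterparty_contact_id to a stable vendor/customer name. The sketch below assumes the normalized AccountingContact fields listed in Step 3 and works on contacts you have already exported:
export type AccountingContact = {
  id?: string;
  name?: string;
  company_name?: string;
  is_supplier?: boolean;
  is_customer?: boolean;
};
// Join counterparty IDs on the training rows to stable vendor/customer names.
export function enrichWithContactNames(
  rows: TrainingRow[],
  contacts: AccountingContact[]
): (TrainingRow & { counterparty_name: string | null })[] {
  const byId = Object.fromEntries(contacts.filter((c) => c.id).map((c) => [c.id!, c]));
  return rows.map((row) => {
    const contact = row.counterparty_contact_id ? byId[row.counterparty_contact_id] : undefined;
    return { ...row, counterparty_name: contact?.name ?? contact?.company_name ?? null };
  });
}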