How to Train AI on Accounting Data with Unified's Accounting API
January 9, 2026
Training models on accounting data sounds like a data science problem. In reality, it's a product problem first.
As soon as you support multiple accounting systems, the shape of your training data starts to drift. Amounts may be signed or unsigned. Credits and debits may be inferred instead of explicit. Accounts might represent categories, ledgers, or reporting groupings depending on the provider. Counterparties may be embedded in transactions, split across contacts, or missing entirely.
For a PM, this creates hard questions upstream of model quality:
- Can we guarantee consistent inputs across customers on different accounting systems?
- Can we retrain models incrementally without rebuilding the dataset every time?
- Can we explain model behavior if the underlying accounting semantics vary by provider?
Most AI features built on accounting data quietly accumulate exceptions: vendor-specific preprocessing, system-specific heuristics, and training pipelines that only work reliably for a subset of integrations.
Unified's Accounting API is designed to eliminate that divergence at the source. Instead of normalizing data after ingestion, Unified exposes transactions, accounts, and contacts through a consistent schema across providers. Signed amounts, stable timestamps, and account attribution are standardized before the data ever enters your training pipeline.
This guide shows how to turn those normalized accounting objects into a flat, training-ready dataset—without conditional logic for QuickBooks vs. Xero vs. NetSuite, and without rethinking your pipeline as new accounting systems are added.
Prerequisites
- Node.js v18+
- A Unified account with an Accounting integration enabled
- Your Unified API key
- A customer Accounting connectionId
Step 1: Set up your project
mkdir accounting-ai-training-demo
cd accounting-ai-training-demo
npm init -y
npm install @unified-api/typescript-sdk dotenv
Create a .env file:
UNIFIED_API_KEY=your_unified_api_key
CONNECTION_ACCOUNTING=your_customer_accounting_connection_id
Step 2: Initialize the SDK
import "dotenv/config";
import { UnifiedTo } from "@unified-api/typescript-sdk";
const { UNIFIED_API_KEY, CONNECTION_ACCOUNTING } = process.env;
const sdk = new UnifiedTo({
security: { jwt: UNIFIED_API_KEY! },
});
Step 3: Understand the normalized Accounting objects
Unified's Accounting models use snake_case field names; the API docs and the TypeScript types shown here follow that convention.
For training datasets, the most useful base object is AccountingTransaction because it provides:
- signed amounts (total_amount is negative for CREDIT, positive for DEBIT)
- consistent account attribution (account_id)
- counterparty hints (contacts[] with is_supplier / is_customer)
- stable timestamps (created_at, updated_at)
Transactions (AccountingTransaction)
Key fields used in this guide:
- id, created_at, updated_at
- total_amount (signed)
- currency
- account_id
- type (e.g., Bill, JournalEntry, CreditCardCharge, etc.)
- memo, reference
- contacts[] (IDs plus supplier/customer flags)
- lineitems[] (optional, for more detailed attribution)
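For orientation, a normalized transaction might look like the illustrative record below (the values are made up, not from a real provider); note the negative total_amount on a CREDIT:
// Illustrative values only; field names follow the normalized schema described above.
const exampleTransaction = {
  id: "txn_123",
  created_at: "2025-11-14T09:30:00.000Z",
  updated_at: "2025-11-20T16:05:00.000Z",
  total_amount: -1250.0, // CREDIT, so the signed amount is negative
  currency: "USD",
  type: "Bill",
  account_id: "acct_software",
  memo: "Annual SaaS subscription",
  contacts: [{ id: "contact_vendor_42", is_supplier: true, is_customer: false }],
};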
Accounts (AccountingAccount)
Accounts turn account_id into labels your model can learn:
- id, name
- type (e.g., EXPENSE, REVENUE, ACCOUNTS_PAYABLE, etc.)
- hierarchy fields like parent_id (optional)
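Continuing the illustrative example above, the account referenced by that transaction's account_id might look like this:
const exampleAccount = {
  id: "acct_software",
  name: "Software & Subscriptions",
  type: "EXPENSE",
  parent_id: "acct_operating_expenses", // optional hierarchy
};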
Contacts (AccountingContact) (optional but useful)
Contacts let you attach stable vendor/customer labels:
- id
- name / company_name
- is_supplier, is_customer
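And the supplier contact referenced from the transaction's contacts[], again with illustrative values:
const exampleContact = {
  id: "contact_vendor_42",
  name: "Acme Cloud Services",
  is_supplier: true,
  is_customer: false,
};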
Step 4: Fetch all transactions (with pagination)
Below are partial TypeScript shapes showing only the fields used in this example.
import type { UnifiedTo } from "@unified-api/typescript-sdk";
export type AccountingTransaction = {
id?: string;
created_at?: string; // ISO date
updated_at?: string; // ISO date
memo?: string;
reference?: string;
total_amount?: number; // negative for CREDIT, positive for DEBIT
tax_amount?: number; // negative for CREDIT, positive for DEBIT
sub_total_amount?: number;
account_id?: string;
currency?: string;
type?: string;
contacts?: {
id?: string;
is_customer?: boolean;
is_supplier?: boolean;
}[];
lineitems?: {
id?: string;
unit_quantity?: number;
unit_amount?: number;
total_amount?: number; // signed
account_id?: string;
object_type?: string;
name?: string;
description?: string;
}[];
};
export async function fetchAllTransactions(
sdk: UnifiedTo,
connectionId: string,
opts?: {
pageSize?: number;
updated_gte?: string;
sort?: "name" | "updated_at" | "created_at";
order?: "asc" | "desc";
query?: string;
contact_id?: string;
fields?: string;
raw?: string;
}
): Promise<AccountingTransaction[]> {
const pageSize = opts?.pageSize ?? 100;
let offset = 0;
const out: AccountingTransaction[] = [];
while (true) {
const page = await sdk.accounting.listAccountingTransactions({
connectionId,
limit: pageSize,
offset,
updated_gte: opts?.updated_gte,
sort: opts?.sort ?? "updated_at",
order: opts?.order ?? "asc",
query: opts?.query ?? "",
contact_id: opts?.contact_id ?? "",
fields: opts?.fields ?? "",
raw: opts?.raw ?? "",
});
if (!page || page.length === 0) break;
out.push(...page);
offset += pageSize;
}
return out;
}
Step 5: Fetch accounts (for labeling)
import type { UnifiedTo } from "@unified-api/typescript-sdk";
export type AccountingAccount = {
id?: string;
name?: string;
type?:
| "ACCOUNTS_PAYABLE"
| "ACCOUNTS_RECEIVABLE"
| "BANK"
| "CREDIT_CARD"
| "FIXED_ASSET"
| "LIABILITY"
| "EQUITY"
| "EXPENSE"
| "REVENUE"
| "OTHER";
parent_id?: string;
};
export async function fetchAllAccounts(
sdk: UnifiedTo,
connectionId: string,
opts?: {
pageSize?: number;
updated_gte?: string;
sort?: "name" | "updated_at" | "created_at";
order?: "asc" | "desc";
query?: string;
fields?: string;
raw?: string;
}
): Promise<AccountingAccount[]> {
const pageSize = opts?.pageSize ?? 100;
let offset = 0;
const out: AccountingAccount[] = [];
while (true) {
const page = await sdk.accounting.listAccountingAccounts({
connectionId,
limit: pageSize,
offset,
updated_gte: opts?.updated_gte,
sort: opts?.sort ?? "updated_at",
order: opts?.order ?? "asc",
query: opts?.query ?? "",
fields: opts?.fields ?? "",
raw: opts?.raw ?? "",
});
if (!page || page.length === 0) break;
out.push(...page);
offset += pageSize;
}
return out;
}
Step 6: Convert transactions into training rows
This step turns raw accounting objects into a flat dataset suitable for model training. Each row is one normalized transaction with stable labels (account name/type) and optional vendor/customer hints.
Key points:
- Keep total_amount signed (negative credits, positive debits).
- Keep currency as a feature. Don't sum across currencies unless you explicitly convert later.
- Use updated_at for incremental exports (updated_gte), and keep created_at as an event timestamp feature.
export type TrainingRow = {
transaction_id: string;
created_at: string | null;
updated_at: string | null;
amount: number;
currency: string | null;
transaction_type: string | null;
memo: string | null;
reference: string | null;
account_id: string | null;
account_name: string | null;
account_type: string | null;
// Supplier/customer hints from normalized contacts on the transaction
counterparty_contact_id: string | null;
counterparty_is_supplier: boolean | null;
counterparty_is_customer: boolean | null;
};
function indexAccounts(accounts: AccountingAccount[]): Record<string, AccountingAccount> {
return Object.fromEntries(accounts.filter((a) => a.id).map((a) => [a.id!, a]));
}
function pickCounterparty(
tx: AccountingTransaction
): { id: string | null; is_supplier: boolean | null; is_customer: boolean | null } {
// Prefer a supplier contact when present
const supplier = tx.contacts?.find((c) => c.id && c.is_supplier);
if (supplier?.id) return { id: supplier.id, is_supplier: true, is_customer: supplier.is_customer ?? null };
// Otherwise fall back to any contact ID if present
const any = tx.contacts?.find((c) => c.id);
if (any?.id) return { id: any.id, is_supplier: any.is_supplier ?? null, is_customer: any.is_customer ?? null };
return { id: null, is_supplier: null, is_customer: null };
}
export function toTrainingRows(
transactions: AccountingTransaction[],
accounts: AccountingAccount[]
): TrainingRow[] {
const accountIndex = indexAccounts(accounts);
return transactions
.filter((t) => t.id)
.map((t) => {
const acct = t.account_id ? accountIndex[t.account_id] : undefined;
const cp = pickCounterparty(t);
return {
transaction_id: t.id!,
created_at: t.created_at ?? null,
updated_at: t.updated_at ?? null,
amount: Number(t.total_amount ?? 0),
currency: t.currency ?? null,
transaction_type: t.type ?? null,
memo: t.memo ?? null,
reference: t.reference ?? null,
account_id: t.account_id ?? null,
account_name: acct?.name ?? null,
account_type: acct?.type ?? null,
counterparty_contact_id: cp.id,
counterparty_is_supplier: cp.is_supplier,
counterparty_is_customer: cp.is_customer,
};
});
}
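If you want to persist the dataset rather than keep it in memory, one lightweight option is JSON Lines (one row per line), which most training tooling can stream. This is a minimal sketch using Node's built-in fs module; the file path is an arbitrary choice, not part of Unified's API:
import { writeFileSync } from "fs";
export function writeTrainingRowsJsonl(
  rows: TrainingRow[],
  path = "./training-rows.jsonl"
): void {
  // One JSON object per line, so downstream training jobs can stream the file.
  const jsonl = rows.map((row) => JSON.stringify(row)).join("\n");
  writeFileSync(path, jsonl + "\n", "utf8");
}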
Step 7 (optional): Incremental export using updated_gte
For training pipelines, you rarely want to rebuild the entire dataset on every run. Unified list endpoints support updated_gte, so you can export only records that changed since your last successful run.
Example usage:
const lastRun = "2025-12-01T00:00:00.000Z";
const transactions = await fetchAllTransactions(sdk, connectionId, {
pageSize: 100,
updated_gte: lastRun,
sort: "updated_at",
order: "asc",
});
Persist lastRun in your system (DB, object storage, etc.) and advance it only after a successful export.
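One simple way to do that, sketched below with Node's built-in fs module, is a local cursor file; the file name and shape are illustrative, and in production you would likely use your database or object storage instead. Capture the new cursor timestamp before the export starts so records updated mid-run are not skipped on the next pass.
import { existsSync, readFileSync, writeFileSync } from "fs";
// Hypothetical local cursor file; replace with your own persistence layer.
const CURSOR_FILE = "./export-cursor.json";
export function readLastRun(): string | undefined {
  if (!existsSync(CURSOR_FILE)) return undefined;
  return JSON.parse(readFileSync(CURSOR_FILE, "utf8")).lastRun;
}
export function writeLastRun(lastRun: string): void {
  // Call this only after the export has been persisted successfully.
  writeFileSync(CURSOR_FILE, JSON.stringify({ lastRun }, null, 2));
}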
Step 8: Putting it all together
This example fetches transactions + accounts, builds training rows, and prints a small sample.
async function main() {
const connectionId = CONNECTION_ACCOUNTING;
if (!connectionId) throw new Error("Missing CONNECTION_ACCOUNTING");
const [transactions, accounts] = await Promise.all([
fetchAllTransactions(sdk, connectionId, { pageSize: 100, sort: "updated_at", order: "asc" }),
fetchAllAccounts(sdk, connectionId, { pageSize: 100, sort: "updated_at", order: "asc" }),
]);
const rows = toTrainingRows(transactions, accounts);
console.log("Training rows (first 5):", rows.slice(0, 5));
console.log("Row count:", rows.length);
}
main().catch(console.error);
You now have a repeatable export flow that pages through normalized accounting transactions, enriches them with account metadata, and outputs a consistent training dataset across accounting systems.
This dataset is suitable for model training, evaluation, and retraining workflows without changing your integration logic per provider.
Optional next steps are incremental exports using updated_gte, adding contact enrichment for vendor names, and incorporating period-level labels from Profit & Loss or Cash Flow reports when your training objective requires it.
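As one example, contact enrichment can reuse the lookup pattern from Step 6: fetch contacts with the same pagination approach used for transactions and accounts, then join counterparty_contact_id to a stable vendor/customer name. The sketch below assumes the normalized AccountingContact fields listed in Step 3 and works on contacts you have already exported:
export type AccountingContact = {
  id?: string;
  name?: string;
  company_name?: string;
  is_supplier?: boolean;
  is_customer?: boolean;
};
// Join counterparty IDs on the training rows to stable vendor/customer names.
export function enrichWithContactNames(
  rows: TrainingRow[],
  contacts: AccountingContact[]
): (TrainingRow & { counterparty_name: string | null })[] {
  const byId = Object.fromEntries(contacts.filter((c) => c.id).map((c) => [c.id!, c]));
  return rows.map((row) => {
    const contact = row.counterparty_contact_id ? byId[row.counterparty_contact_id] : undefined;
    return { ...row, counterparty_name: contact?.name ?? contact?.company_name ?? null };
  });
}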