Published May 30, 2025

Why Unified APIs Shouldn't Store Your Customer Data in 2026

Unified APIs should not store customer data because it increases security risk, expands compliance scope, and creates vendor lock-in. A pass-through architecture keeps data in your infrastructure, reduces liability, and gives you full control over how data is accessed and used.

Unified APIs are a core part of modern SaaS architecture, but how they handle customer data has direct implications for security, compliance, and product control. This article explains why storing customer data inside an integration layer introduces risk, and how a pass-through architecture changes that model.

When should a unified API store customer data?

Data storage may make sense if:

You need a vendor-managed cache for analytics or reporting
Your product relies on historical snapshots managed outside your infrastructure
You accept shared responsibility for stored third-party data

Avoid storing customer data if:

Your product handles sensitive or regulated data
You want full control over data access and lifecycle
You are building live data or AI-driven features that depend on freshness
You want to reduce compliance scope and vendor lock-in

Do unified APIs store customer data?

Some unified APIs store customer data, including cached API responses, logs, and access tokens, to simplify data access and improve performance. This increases security and compliance risk by creating additional copies of sensitive data outside your infrastructure. Other architectures avoid storing customer business data entirely and retrieve it live instead, which reduces risk and keeps data under your control.

The Hidden Risk of Stored Data

Most unified API and iPaaS providers store some or all of your customers' data, from access tokens to payload logs to cached API responses.

This is convenient for the vendor, but it comes at a cost to you:

Expanded attack surface: A breach of the integration provider's database can expose your customer data.
Compliance burden: Under GDPR, CCPA, HIPAA, and PIPEDA, you remain accountable for customer data that a vendor stores on your behalf, so every vendor-side copy adds to your own compliance scope and legal liability.
Data governance drift: Every copy of customer data adds complexity to your security posture.
Vendor lock-in: Migrating away becomes harder when another system holds your data.

What is a pass-through architecture?

A pass-through architecture means the integration layer does not maintain a database of customer business data. Instead:

Requests route directly to the source system at request time
Data is returned live, fetched directly from the source
Customer business records — employee data, candidate records, financial transactions, messages, files — never sit in vendor infrastructure
Storage of customer business data happens only inside your own infrastructure

OAuth tokens and operational metadata are still persisted on the vendor side (encrypted, with customer-managed-key options on higher tiers), but the customer business data that the integration layer moves is never duplicated into a vendor-side database.
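The difference between the two read paths can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the class names, the `fetch_from_source` callable, and the connection IDs are all hypothetical stand-ins for a real source-API client.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, str]


@dataclass
class PassThroughLayer:
    """Hypothetical integration layer that keeps no copy of business records."""

    fetch_from_source: Callable[[str], List[Record]]  # assumed source-API client

    def list_records(self, connection_id: str) -> List[Record]:
        # Every read goes to the source system at request time.
        # Nothing is written to a vendor-side store.
        return self.fetch_from_source(connection_id)


@dataclass
class SyncAndStoreLayer:
    """Hypothetical integration layer that serves reads from its own copy."""

    fetch_from_source: Callable[[str], List[Record]]
    _store: Dict[str, List[Record]] = field(default_factory=dict)

    def sync(self, connection_id: str) -> None:
        # A scheduled job copies source data into the vendor's store.
        self._store[connection_id] = list(self.fetch_from_source(connection_id))

    def list_records(self, connection_id: str) -> List[Record]:
        # Reads never touch the source; they return the last synced copy.
        return self._store.get(connection_id, [])


if __name__ == "__main__":
    source = {"conn_1": [{"id": "1", "name": "Ada"}]}

    def fake_source(connection_id: str) -> List[Record]:
        return list(source[connection_id])

    live = PassThroughLayer(fake_source)
    stored = SyncAndStoreLayer(fake_source)
    stored.sync("conn_1")

    # The source changes after the last sync.
    source["conn_1"].append({"id": "2", "name": "Grace"})

    print(len(live.list_records("conn_1")))    # 2: reflects the source
    print(len(stored.list_records("conn_1")))  # 1: stale until the next sync
```

In the stored model, the copy also outlives the source: if the source record is deleted, the vendor-side copy stays until something explicitly removes it.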

How to spot a vendor that stores your customer data (red flags)

Use these signals during vendor evaluation, sales calls, and documentation review:

Freshness tiers in marketing materials ("updates every 24h / 6h / 1h"). Tier-gated sync cadence is a strong indicator that data is being cached vendor-side.
Cross-object search or full-text queries that exceed what the source API supports. Capabilities that shouldn't be possible without a local database are evidence of one.
Vendor UI continues showing customer data after the source connection is revoked. If the data persists in the vendor's dashboard after disconnecting the source, it's stored.
Documentation references to "data warehouse," "backfill frequency," "historical snapshots," or "indexing layer." These describe persistent storage by another name.
Initial sync time discussed in onboarding ("up to several hours for large instances"). An initial sync only exists if there is a vendor-side database to sync into.
Tier-gated webhook frequency. If faster webhook cadence is gated to higher pricing tiers, the underlying delivery often runs on scheduled sync rather than change detection.
Authorization export is restricted, paywalled, or requires contractual amendments. Vendors that don't let you take customer OAuth credentials out are creating lock-in by design.
Data Processing Agreement explicitly references customer data storage or "cache for performance." Read the DPA before signing — vendor sales conversations often understate storage scope.
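For the documentation-review signals, a rough first pass can be automated. The sketch below scans a block of text (vendor docs, a DPA, onboarding material) for the storage-related phrases listed above. The phrase list and the function name are illustrative assumptions; a hit is a prompt to ask the vendor a question, not proof of storage.

```python
import re
from typing import Dict

# Phrases taken from the red flags above; extend the list for your own review.
STORAGE_SIGNALS = [
    "data warehouse",
    "backfill frequency",
    "historical snapshots",
    "indexing layer",
    "initial sync",
    "cache for performance",
]


def find_storage_signals(text: str) -> Dict[str, int]:
    """Count case-insensitive occurrences of each storage-related phrase."""
    hits: Dict[str, int] = {}
    for phrase in STORAGE_SIGNALS:
        # Allow any run of whitespace between words so line wraps still match.
        pattern = r"\b" + r"\s+".join(re.escape(w) for w in phrase.split()) + r"\b"
        count = len(re.findall(pattern, text, flags=re.IGNORECASE))
        if count:
            hits[phrase] = count
    return hits


if __name__ == "__main__":
    sample = (
        "Our Data Warehouse supports a configurable backfill frequency. "
        "The initial sync may take several hours for large instances."
    )
    for phrase, count in find_storage_signals(sample).items():
        print(f"{phrase}: {count}")
```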

Questions to ask during procurement

When evaluating any unified API vendor, get the answers in writing:

Data persistence:

Does the vendor store our customers' business data (records, messages, files, transactions) at rest? If yes: where, in what format, and for how long?
What data is stored even in pass-through configurations? (Most pass-through vendors still persist tokens and operational metadata — understand what's stored.)
What's the retention policy per data class? OAuth tokens, audit logs, configuration metadata, customer business data each may have different retention.
If we terminate the contract, what remains on the vendor's systems? What's the deletion SLA?

Authorization and credentials:

Are OAuth tokens stored encrypted? With what key management model?
Does the vendor offer customer-managed encryption keys (BYOK)?
Can we export customer authorization data and connection metadata if we leave?

Architecture and delivery:

Is the vendor's read path pass-through (request hits source live) or cached (request hits vendor database)?
For event delivery: does the vendor use native webhooks where the source supports them and managed change detection where it does not, or does it fire events from its own database after scheduled syncs?
What's the sync cadence on the tier we're actually buying?

Security and compliance:

Which certifications? Distinguish formal certifications (SOC 2 Type II, ISO 27001) from "compliant via controls" (HIPAA, GDPR, CCPA, PIPEDA).
Are SOC 2 Type II reports and ISO 27001 certificates available during procurement? Request the actual artifacts.
What's the incident response and breach notification window?
Which regions can we pin our traffic to? Is data processing isolated per region?

Unified.to: pass-through architecture, by design

Unified.to is built around pass-through architecture: customer business data is not maintained in a vendor-side database. That means:

Customer business records (employee data, candidate records, financial transactions, messages, files) are not cached
API requests route directly to source systems at request time
OAuth tokens and operational metadata are persisted encrypted, with customer-managed-key options on Scale tier and above

This model minimizes the amount of third-party data stored outside your infrastructure, which lowers security exposure and simplifies compliance requirements.

Easier Path to Compliance

By avoiding vendor-side caching of customer business data, Unified.to simplifies your path to certifications and audits:

SOC 2 Type II
GDPR, CCPA, and PIPEDA alignment
HIPAA BAAs available on Scale tier and above
Customer-managed secrets via AWS Secrets Manager, Azure Key Vault, Google Cloud Secret Manager, and HashiCorp Vault (Scale+)
IP whitelisting and webhook signing for access control
Auditability through shipped logs (e.g., Datadog)
Multi-region hosting across US, EU, and AU

Instead of inheriting a third-party's data lifecycle and breach risk, you stay in full control of customer business data.
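Webhook signing, mentioned in the list above, generally works by having the sender compute a keyed hash of the request body so the receiver can reject forged or altered payloads. The sketch below shows the generic HMAC-SHA256 pattern using only the Python standard library. The algorithm, hex encoding, and function names here are assumptions for illustration; check the vendor's documentation for the actual header name and signature format.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Return a hex HMAC-SHA256 signature of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_webhook(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Check a received signature against one computed locally.

    Assumes the sender signs the exact raw bytes of the body with a shared
    secret and sends the hex digest alongside the request.
    """
    expected = sign_payload(secret, body)
    # compare_digest avoids leaking timing information about the mismatch.
    return hmac.compare_digest(expected.encode(), received_signature.encode())


if __name__ == "__main__":
    secret = b"example-shared-secret"  # hypothetical value
    body = b'{"event": "updated", "id": "123"}'
    signature = sign_payload(secret, body)

    print(verify_webhook(secret, body, signature))              # True
    print(verify_webhook(secret, body + b"tamper", signature))  # False
```

Verify against the raw request bytes, before any JSON parsing, because re-serializing the body can change whitespace or key order and break the signature.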

Designed for Least Privilege

Unified.to supports OAuth 2 and customer-managed credentials. You can:

Keep access tokens in your own AWS Secrets Manager, Azure Key Vault, Google Cloud Secret Manager, or HashiCorp Vault on Scale tier and above
Scope integrations to the exact permissions you need
Revoke access without triggering backend cleanup or data deletion processes

This aligns with a modern least-privilege security model: only store what you need, only access what you must, and avoid third-party storage of customer business data entirely.

"But doesn't caching make queries easier?"

This is the honest question, and the honest answer is yes — for some use cases. Cached reads are faster than pass-through. Cross-object queries are easier against a local database than against multiple source APIs. Historical analytics requires persistent storage somewhere.

The framework that works is: cache the data your product depends on, but cache it in your own infrastructure, under your own residency, retention, and compliance controls.

This is the hybrid pattern most production architectures end up using — pass-through for sensitive transactional flows where freshness and data minimization matter, and customer-owned storage for analytics, dashboards, or AI features that need cached reads. The question isn't whether to cache. It's where the cache should sit and who should be responsible for it.

Unified.to's Database Sync delivers normalized records directly into customer-owned databases — Postgres, MongoDB, MySQL, MSSQL, CockroachDB, MariaDB. You get the cached-read performance and query capabilities for analytics use cases, with the data sitting under your own compliance controls rather than a vendor's.
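To make the "cache it in your own infrastructure" idea concrete, here is a minimal read-through cache backed by a database you own. It is not Database Sync and not a Unified.to API: the class name, table layout, key format, and `fetch_live` callable are hypothetical, and SQLite stands in for whichever database you actually run. The point it illustrates is that retention (`ttl_seconds`) and deletion (`purge`) stay under your control.

```python
import json
import sqlite3
import time
from typing import Callable, Dict, List

Record = Dict[str, str]


class CustomerOwnedCache:
    """Read-through cache stored in the customer's own database."""

    def __init__(
        self,
        db: sqlite3.Connection,
        fetch_live: Callable[[str], List[Record]],  # assumed pass-through read
        ttl_seconds: float,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self.db = db
        self.fetch_live = fetch_live
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            "key TEXT PRIMARY KEY, payload TEXT NOT NULL, fetched_at REAL NOT NULL)"
        )

    def get(self, key: str) -> List[Record]:
        now = self.clock()
        row = self.db.execute(
            "SELECT payload, fetched_at FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is not None and now - row[1] < self.ttl_seconds:
            return json.loads(row[0])  # fresh enough: serve the local copy
        records = self.fetch_live(key)  # miss or expired: read from the source
        self.db.execute(
            "INSERT OR REPLACE INTO cache (key, payload, fetched_at) VALUES (?, ?, ?)",
            (key, json.dumps(records), now),
        )
        self.db.commit()
        return records

    def purge(self, key: str) -> None:
        # Deletion is a local operation; no vendor-side copy has to be chased.
        self.db.execute("DELETE FROM cache WHERE key = ?", (key,))
        self.db.commit()
```

A dashboard might call `cache.get("crm:contacts")` with a TTL of a few minutes, while a sensitive transactional flow skips the cache and calls the live read directly.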

A More Secure Way to Build

Storing customer data should be a deliberate choice, not a default side effect of using an integration vendor.

Unified.to flips the model:

No vendor-side cache of customer business records to compromise
No payload logs to leak
No backups of customer data to sanitize

If you need historical data, Unified.to streams normalized records directly into your data warehouse or database — under your control, on your terms.

Data storage models in unified APIs

Model	Description	Trade-off
Cached / sync-and-store	Vendor periodically fetches data from source systems and persists normalized copies in its own database	Faster reads and easier queries, but higher security and compliance risk; data freshness gated by sync cadence
Sync + mirror	Vendor continuously syncs and stores customer data, often with longer retention	Easier querying and historical analytics, but increased lock-in and breach blast radius
Hybrid	Pass-through default with opt-in sync or storage layers	Flexibility per integration, but more complex configuration and variable compliance scope
Pass-through	Customer business data retrieved live from the source, not cached by vendor	Lowest vendor-managed data custody risk, but every read requires live source access and is slower than a cached read

Integration Architecture: A Data Storage Comparison

Feature	Unified.to	Merge.dev	Paragon	Apideck	Truto
Customer business data stored	No	Yes (cached)	Yes (cached + logs)	No	Configurable (opt-in sync)
Architecture	Pass-through	Sync-and-store	Embedded iPaaS	Pass-through	Hybrid
Tokens stored	Encrypted; BYOK option (Scale+)	Yes	Yes	Yes	Yes
Data control	Fully yours	Shared with vendor	Shared with vendor	Shared (auth + metadata)	Configurable per integration
Vendor-side compliance surface	Minimal	Higher	Higher	Lower	Variable
Customer-managed secrets (BYOK)	Yes (Scale+)	Not documented	Not documented	Not documented	Not documented
HIPAA BAAs	Available (Scale+)	Available	Available	Not documented	Not documented

Key takeaways

Storing customer business data in integration layers increases security and compliance risk
Each additional copy of data expands your attack surface
Pass-through architectures keep customer business data within your infrastructure
Live access removes the need for vendor-managed caches of customer records
Unified.to uses a pass-through model with no vendor-side database of customer business data
Customer-owned caches (via Database Sync to your own infrastructure) are the safer answer when you need cached reads for analytics

Security isn't just about encryption and audits. It's about architectural choices. If you're building an integration layer into your product, choose one that minimizes risk by design.

Unified.to is built on pass-through architecture, with broad compliance coverage (SOC 2 Type II, HIPAA BAAs available, and GDPR, CCPA, and PIPEDA alignment) and customer-managed secrets on Scale tier and above. That means faster compliance reviews, tighter security, and more control over your architecture.

Start your free trial or book a demo to learn more about live integrations without data liability.
