Retrieval-Augmented Generation (RAG)
February 13, 2026
This page explains how to implement a RAG pipeline using Unified.
RAG is an implementation pattern built on top of Unified APIs. Unified provides real-time data access and normalized objects. You are responsible for embeddings, vector storage, and retrieval.
At a high level, RAG with Unified follows this sequence:
- Subscribe to webhooks.
- Receive created or updated objects.
- Retrieve full content from the source API.
- Chunk the content.
- Generate embeddings.
- Store embeddings in your vector database.
- On query, retrieve relevant chunks and generate a response.
The pattern is the same across categories.
What RAG Means in Practice
RAG allows you to answer questions using customer data from connected SaaS platforms.
Instead of relying only on model training data, your application:
- Retrieves relevant customer records.
- Supplies those records as context.
- Generates a response grounded in that context.
Unified enables real-time retrieval directly from source APIs and supports event-driven synchronization. The data it returns is:
- Normalized across providers.
- Fetched in real time.
- Consistent in schema.
Unified does not store embeddings or maintain a vector index.
Step 1: Connect Data Sources
Authorize integrations using Embedded Auth.
Common RAG source categories include:
- File Storage
- Knowledge Management
- CRM
- Ticketing
- ATS
- Messaging
Each authorized integration returns a connection_id.
RAG pipelines should treat connection_id as the tenant boundary.
Step 2: Subscribe to Object Updates
Create webhook subscriptions for the objects you want to index.
On created or updated events:
- Your endpoint receives the event.
- You call the appropriate retrieve endpoint to fetch the latest version of the object.
- If the object contains a download_url, fetch the content from that URL.
List endpoints often return metadata but not full content. For RAG, you must explicitly retrieve full text before chunking.
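The event-handling steps above can be sketched as a single function. This is a minimal illustration, not Unified's SDK: the event shape (`event`, `connection_id`, `object_type`, `object_id` keys), the `text_field` fallback, and the injected `retrieve_object` and `download` callables are all assumptions made for the example. Only `download_url` and `updated_at` come from this page.

```python
from typing import Any, Callable, Dict, Optional

Obj = Dict[str, Any]


def handle_event(
    event: Obj,
    retrieve_object: Callable[[str, str, str], Obj],
    download: Callable[[str], str],
    text_field: str = "content",  # hypothetical field name; varies by object type
) -> Optional[Obj]:
    """Turn a webhook event into a document ready for chunking.

    Returns None for event types that should not be indexed.
    """
    # Assumed event shape for this sketch, not the documented payload.
    if event.get("event") not in ("created", "updated"):
        return None

    connection_id = event["connection_id"]
    object_type = event["object_type"]
    object_id = event["object_id"]

    # Re-fetch the object so the index reflects its latest version.
    obj = retrieve_object(connection_id, object_type, object_id)

    # Prefer download_url when present; otherwise fall back to an inline field.
    url = obj.get("download_url")
    text = download(url) if url else str(obj.get(text_field) or "")

    return {
        "connection_id": connection_id,
        "object_type": object_type,
        "object_id": object_id,
        "updated_at": obj.get("updated_at"),
        "text": text,
    }
```

Passing the retrieve and download functions in as arguments keeps the handler independent of any particular HTTP client and makes it easy to test with stubs.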
Native webhooks deliver real-time updates.
Virtual webhooks provide the same interface using managed polling and may introduce short delays.
Webhooks are recommended over polling for RAG ingestion.
Step 3: Chunk and Embed
Large text (files, pages, tickets, resumes, notes, transcripts) should be split into chunks before embedding.
Each chunk should include stable metadata such as:
- connection_id
- object_type
- object_id
- updated_at
You may also include additional normalized fields as metadata filters.
Unified normalizes objects across providers. All normalized fields returned by Unified can be stored as metadata in your vector index and used for filtering at retrieval time.
You generate embeddings using the embedding model of your choice. Unified may provide access to embedding models, but embeddings and indexes are stored in your infrastructure.
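As one possible illustration of chunking with stable metadata, the sketch below splits text into fixed-size character windows with overlap and attaches the four metadata fields listed above to every chunk. The window sizes, the `chunk_id` format, and the input document shape are assumptions for the example; a production pipeline might split on tokens or sentences instead.

```python
from typing import Any, Dict, List


def chunk_text(text: str, size: int = 800, overlap: int = 100) -> List[str]:
    """Split text into character windows of `size` that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        piece = text[start:start + size]
        if piece.strip():  # skip whitespace-only windows
            chunks.append(piece)
        if start + size >= len(text):
            break  # the current window already reached the end
    return chunks


def build_chunk_records(doc: Dict[str, Any], size: int = 800, overlap: int = 100) -> List[Dict[str, Any]]:
    """Attach stable metadata to each chunk of a document."""
    records = []
    for i, piece in enumerate(chunk_text(doc["text"], size, overlap)):
        records.append({
            # Hypothetical ID format; any deterministic scheme works.
            "chunk_id": f'{doc["connection_id"]}:{doc["object_type"]}:{doc["object_id"]}:{i}',
            "text": piece,
            "metadata": {
                "connection_id": doc["connection_id"],
                "object_type": doc["object_type"],
                "object_id": doc["object_id"],
                "updated_at": doc["updated_at"],
            },
        })
    return records
```

Deterministic chunk IDs make it straightforward to overwrite or delete the vectors for an object when it changes.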
Step 4: Store in a Vector Database
Store embeddings in your vector database (for example, Pinecone or pgvector).
Include metadata to support:
- Tenant isolation (connection_id)
- Object filtering (object_type)
- Update replacement (updated_at)
- Permission enforcement (see below)
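To show how this metadata supports tenant isolation and update replacement, here is a small in-memory stand-in for a vector database. It is a sketch only: the `InMemoryIndex` class and its `replace_object` method are hypothetical, and it assumes `updated_at` values are ISO 8601 UTC strings in a uniform format so that string comparison matches chronological order. With Pinecone or pgvector the same idea would be expressed as a delete-by-metadata followed by an upsert.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


class InMemoryIndex:
    """Toy vector store that keys objects by (connection_id, object_type, object_id)."""

    def __init__(self) -> None:
        self.rows: List[Row] = []

    def replace_object(self, records: List[Row]) -> int:
        """Replace all vectors for one object; return the number of rows written.

        Each record needs "vector", "text", and "metadata" keys.
        """
        if not records:
            return 0
        meta = records[0]["metadata"]
        key = (meta["connection_id"], meta["object_type"], meta["object_id"])

        def same_object(row: Row) -> bool:
            m = row["metadata"]
            return (m["connection_id"], m["object_type"], m["object_id"]) == key

        existing = [r for r in self.rows if same_object(r)]
        # Ignore late or duplicate events: keep the stored copy if it is as new or newer.
        if existing and max(r["metadata"]["updated_at"] for r in existing) >= meta["updated_at"]:
            return 0

        # Drop every old chunk first so shrunken documents leave no orphaned vectors.
        self.rows = [r for r in self.rows if not same_object(r)] + list(records)
        return len(records)
```

Including connection_id in the key means two tenants can hold objects with the same object_id without overwriting each other.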
Unified does not:
- Cache end-customer payloads
- Store embeddings
- Maintain a vector index
Unified does not persist customer payloads or maintain your embedding index.
Step 5: Retrieve and Generate
When a user submits a query:
- Embed the query.
- Retrieve the top matching chunks from your vector database.
- Filter by connection_id and any relevant metadata.
- Pass retrieved context to your model.
- Return the generated answer.
If required, include citations that map back to object_id or web_url.
Unified does not perform retrieval. Unified ensures the indexed data remains synchronized with source APIs.
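The query-time steps can be sketched with plain Python in place of a real vector database. The row shape (`vector`, `text`, `metadata`), the `retrieve` and `build_prompt` helpers, and the prompt wording are assumptions for illustration. This sketch applies the connection_id filter before ranking so that another tenant's chunks never compete for the top-k slots; that ordering is a design choice of the example.

```python
import math
from typing import Any, Dict, List, Optional, Sequence

Row = Dict[str, Any]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(
    rows: List[Row],
    query_vector: Sequence[float],
    connection_id: str,
    top_k: int = 3,
    filters: Optional[Dict[str, Any]] = None,
) -> List[Row]:
    """Return the top_k most similar rows for one tenant, honoring metadata filters."""
    filters = filters or {}
    candidates = [
        r for r in rows
        if r["metadata"]["connection_id"] == connection_id
        and all(r["metadata"].get(k) == v for k, v in filters.items())
    ]
    ranked = sorted(candidates, key=lambda r: cosine(query_vector, r["vector"]), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, chunks: List[Row]) -> str:
    """Number each chunk and tag it with object_id so answers can cite sources."""
    context = "\n".join(
        f"[{n}] (object_id={c['metadata']['object_id']}) {c['text']}"
        for n, c in enumerate(chunks, start=1)
    )
    return (
        "Answer using only the context below. Cite sources as [n].\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

The query vector must come from the same embedding model used at indexing time, and the resulting prompt is passed to whichever generation model you use.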
Permissions
File objects include explicit permissions metadata.
Other objects (CRM, ATS, ticketing, messaging) do not return user- or group-level permission structures. If row-level enforcement is required, implement that logic in your application layer before returning retrieved results.
Always filter retrieval results by tenant and permission constraints before generation.
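One way to apply such a filter is sketched below. The metadata keys `allowed_user_ids` and `allowed_group_ids` are hypothetical: they stand for whatever you derive from a file's permissions metadata at indexing time, and they are not field names from Unified's schema. Chunks that carry no permission lists are denied by default unless the caller opts in, which leaves room for application-level rules on CRM, ATS, ticketing, and messaging objects.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def filter_permitted(
    chunks: List[Row],
    connection_id: str,
    user_id: str,
    group_ids: Iterable[str] = (),
    allow_unrestricted: bool = False,
) -> List[Row]:
    """Keep only chunks the user may see within one tenant."""
    user_groups = set(group_ids)
    permitted: List[Row] = []
    for chunk in chunks:
        meta = chunk["metadata"]
        # Tenant check first: never return another connection's data.
        if meta.get("connection_id") != connection_id:
            continue
        users = meta.get("allowed_user_ids")    # hypothetical key
        groups = meta.get("allowed_group_ids")  # hypothetical key
        if users is None and groups is None:
            # No permission data was indexed for this chunk.
            if allow_unrestricted:
                permitted.append(chunk)
            continue
        if user_id in (users or []) or user_groups & set(groups or []):
            permitted.append(chunk)
    return permitted
```

Run this on the retrieved chunks before building the prompt so that restricted text never reaches the model.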
Real-Time Behavior
For RAG pipelines, freshness depends on update propagation:
- Native webhooks deliver real-time updates.
- Virtual webhooks use managed polling and typically operate with short delays.
- Most list endpoints support pagination and incremental filtering via updated_gte.
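The incremental filter can drive a catch-up pass, for example after downtime on your webhook endpoint. The sketch below assumes a hypothetical `list_page` callable that wraps a list endpoint and accepts `updated_gte`, `limit`, and `offset` keyword arguments; only `updated_gte` is named on this page, and the pagination parameter names are assumptions to check against the category documentation. It also assumes `updated_at` values are uniformly formatted ISO 8601 strings.

```python
from typing import Any, Callable, Dict, Iterator, List

Obj = Dict[str, Any]


def iter_updated(
    list_page: Callable[..., List[Obj]],
    updated_gte: str,
    limit: int = 100,
) -> Iterator[Obj]:
    """Yield every object changed at or after `updated_gte`, page by page."""
    offset = 0
    while True:
        page = list_page(updated_gte=updated_gte, limit=limit, offset=offset)
        yield from page
        if len(page) < limit:
            return  # a short page means there is nothing left to fetch
        offset += limit


def next_checkpoint(previous: str, objects: List[Obj]) -> str:
    """Return the newest updated_at seen, or the previous checkpoint if none."""
    stamps = [o["updated_at"] for o in objects if o.get("updated_at")]
    return max(stamps + [previous])
```

Persisting the checkpoint per connection_id lets each tenant resume from its own position. Because the filter is inclusive, the object at the checkpoint may be returned again on the next pass, so downstream indexing should tolerate repeats.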
To keep embeddings current:
- Re-fetch objects on update events.
- Re-chunk and re-embed.
- Replace outdated vectors in your index.
When to Use RAG
Use this pattern when you need to:
- Build enterprise search across connected platforms.
- Ground AI responses in customer documents or records.
- Enable resume or candidate search.
- Support contract or document Q&A.
- Build account-level CRM assistants.
The ingestion architecture is consistent across categories. Object models and content retrieval methods vary by API.
Related APIs
RAG pipelines commonly use:
Refer to category documentation for supported objects and webhook availability.