Permissions, Security, and Compliance in RAG Pipelines
February 12, 2026
Retrieval-augmented generation (RAG) changes how enterprise software answers questions.
Instead of relying only on model training data, a RAG system retrieves internal content — CRM records, support tickets, contracts, knowledge base articles, HR files — and injects that context into the prompt.
That power introduces a new surface area:
Who is allowed to retrieve what — and how do you prove it?
In enterprise environments, permissions, tenant isolation, and auditability are not optional layers added after the model works. They are architectural requirements.
This article explains how production RAG systems enforce authorization, where security checks must occur, and what compliance teams expect to see.
Why Permissions Become Harder in RAG
Traditional SaaS applications enforce permissions at the UI and API layers.
RAG introduces a new layer: a vector store that contains embedded representations of content.
Vector search engines optimize for semantic similarity, not authorization. Without additional controls:
- A user could retrieve content they should not see.
- Multi-tenant data could leak across customers.
- Stale permissions could persist in embeddings.
- Sensitive data could be included in model context unintentionally.
RAG pipelines must treat authorization as a first-class concern across ingestion, indexing, retrieval, and generation.
Authorization Checkpoints in a Production RAG Pipeline
In practice, permission enforcement happens at multiple stages.
1. Ingestion: Capture Access Rules From the Source
When ingesting data from systems like Google Drive, Salesforce, Zendesk, or an ATS, the pipeline must capture:
- Ownership identifiers (e.g., owner_id, user_id)
- Tenant identifiers
- Access control lists (ACLs), where available
- Sensitivity labels or classification fields
If these attributes are not captured at ingestion, the retrieval layer has nothing to evaluate later.
For example, file storage APIs typically return explicit permission objects. Other categories — CRM, ticketing, ATS — often return ownership fields but not user/group-level permissions.
That distinction matters.
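As a minimal sketch of what capturing these attributes could look like, the record below keeps ownership, tenant, ACL, and classification fields next to the content. The `IngestedRecord` schema, the `normalize` helper, the raw payload field names (`id`, `type`, `text`, `owner_id`, `permissions`, `classification`), and the "internal" default are all assumptions for illustration, not any provider's actual response shape.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IngestedRecord:
    """Access attributes captured alongside content at ingestion (hypothetical schema)."""
    tenant_id: str
    document_id: str
    object_type: str                   # "file", "ticket", "deal", "candidate", ...
    text: str
    owner_id: Optional[str] = None
    acl: Optional[frozenset] = None    # None means the source returned no ACL
    classification: str = "internal"   # assumed default label


def normalize(raw: dict, tenant_id: str) -> IngestedRecord:
    """Map a raw source payload (assumed field names) to an IngestedRecord."""
    perms = raw.get("permissions")
    acl = None
    if perms is not None:
        # Keep only the user IDs; group expansion is out of scope for this sketch.
        acl = frozenset(p["user_id"] for p in perms if "user_id" in p)
    return IngestedRecord(
        tenant_id=tenant_id,
        document_id=raw["id"],
        object_type=raw["type"],
        text=raw.get("text", ""),
        owner_id=raw.get("owner_id"),
        acl=acl,
        classification=raw.get("classification", "internal"),
    )
```

Keeping `acl` as `None` when the source returns no permission objects, rather than an empty set, lets the retrieval layer tell "no ACL available, fall back to ownership" apart from "ACL present but empty".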
2. Indexing: Attach Metadata and Enforce Isolation
Once content is embedded, it must carry structured attributes alongside the vector:
- tenant_id
- document_id
- owner_id
- classification or sensitivity level
- object type (file, ticket, deal, candidate, etc.)
Without structured metadata:
- You cannot filter results by tenant.
- You cannot enforce row-level security.
- You cannot perform deterministic audit logging.
Multi-tenant SaaS deployments typically use one of these isolation patterns:
- Namespace per tenant — physical or logical partitioning in the vector store.
- Per-tenant index — stronger isolation, higher operational overhead.
- Shared index + tenant metadata filter — flexible but requires strict query-time filtering.
Each pattern trades operational complexity against isolation guarantees.
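To make the namespace-per-tenant pattern concrete, here is a toy in-memory sketch. `NamespacedIndex` and its methods are hypothetical and stand in for whatever partitioning your vector store offers; the brute-force cosine scan is only there to keep the example self-contained.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class NamespacedIndex:
    """Toy index with one namespace per tenant (logical partitioning)."""

    def __init__(self):
        # tenant_id -> {document_id: (vector, metadata)}
        self._namespaces = defaultdict(dict)

    def upsert(self, tenant_id, document_id, vector, metadata):
        # The tenant is stamped into the metadata as well, so it survives exports.
        self._namespaces[tenant_id][document_id] = (vector, dict(metadata, tenant_id=tenant_id))

    def search(self, tenant_id, query, k=3):
        # Only the caller's namespace is scanned; other tenants are unreachable here.
        items = self._namespaces.get(tenant_id, {})
        scored = [(cosine(query, vec), doc_id) for doc_id, (vec, _) in items.items()]
        scored.sort(reverse=True)
        return [doc_id for _, doc_id in scored[:k]]
```

Because `search` requires a tenant argument and never iterates across namespaces, a forgotten filter cannot leak another customer's rows. With a shared index, the same guarantee depends on every query path applying the tenant filter.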
3. Query Time: Pre-Filter vs Post-Filter Authorization
When a user issues a query, the system must ensure that only authorized documents are considered.
Two common patterns exist:
Pre-Filter Authorization
The application first determines which documents the user is allowed to access — for example, by consulting a permission service or evaluating ownership fields.
Only those document IDs are included in the vector search filter.
Tradeoff:
- Strong security guarantees.
- Higher computation cost when users have broad access.
Post-Filter Authorization
The system performs vector search first, retrieves the top k results, then checks each document against authorization rules and discards unauthorized results.
Tradeoff:
- Lower upfront cost.
- May over-fetch and increase latency.
- Requires careful filtering before constructing the LLM prompt.
Both approaches are valid. The choice depends on corpus size, permission shape, and expected hit rate.
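The two patterns can be compared side by side in a small sketch. Everything here is illustrative: the row layout, the `is_authorized` rule (same tenant, and owner or ACL member), and the `overfetch` multiplier are assumptions, and a real vector store would apply the pre-filter inside the index rather than in a Python list comprehension.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def is_authorized(user, meta):
    """Hypothetical rule: same tenant, and the user is the owner or listed in the ACL."""
    if meta["tenant_id"] != user["tenant_id"]:
        return False
    return meta.get("owner_id") == user["user_id"] or user["user_id"] in meta.get("acl", ())


def top_k(query, rows, k):
    ranked = sorted(rows, key=lambda r: cosine(query, r["vector"]), reverse=True)
    return ranked[:k]


def pre_filter_search(user, query, rows, k=3):
    # Authorization first: similarity is only computed over permitted rows.
    allowed = [r for r in rows if is_authorized(user, r["meta"])]
    return [r["id"] for r in top_k(query, allowed, k)]


def post_filter_search(user, query, rows, k=3, overfetch=4):
    # Similarity first: fetch k * overfetch candidates, then discard unauthorized ones.
    candidates = top_k(query, rows, k * overfetch)
    allowed = [r for r in candidates if is_authorized(user, r["meta"])]
    return [r["id"] for r in allowed[:k]]
```

Note that the post-filter variant can return fewer than `k` results when unauthorized documents crowd the candidate list, which is why it tends to need over-fetching. In both variants, only the returned IDs should ever reach prompt construction.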
4. Row-Level Security in Hybrid Architectures
When embeddings live in relational databases (e.g., PostgreSQL with pgvector), row-level security (RLS) policies can enforce per-row visibility.
For example:
- Only return rows where owner_id matches the authenticated user.
- Automatically enforce tenant scoping via session variables.
This centralizes authorization logic inside the database rather than the application layer.
The tradeoff is performance: the policy predicate is evaluated on every query, so RLS must be load-tested against realistic data volumes to avoid latency spikes.
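A sketch of how the two rules above might be expressed follows. The SQL uses PostgreSQL's row-level security statements and pgvector's `<=>` cosine-distance operator, but the table name `chunks`, its columns, the policy name, and the `app.tenant_id` / `app.user_id` setting names are all hypothetical. The Python side only builds statements, so a driver such as psycopg would still be needed to execute them.

```python
# Hypothetical schema: chunks(document_id, tenant_id text, owner_id text, embedding vector).
RLS_SETUP = """
ALTER TABLE chunks ENABLE ROW LEVEL SECURITY;
ALTER TABLE chunks FORCE ROW LEVEL SECURITY;
CREATE POLICY chunks_tenant_owner ON chunks
    USING (
        tenant_id = current_setting('app.tenant_id')
        AND owner_id = current_setting('app.user_id')
    );
"""

# The policy is applied automatically, so the search query carries no explicit
# tenant or owner filter. <=> is pgvector's cosine-distance operator.
SEARCH_SQL = "SELECT document_id FROM chunks ORDER BY embedding <=> %s::vector LIMIT %s"


def session_statements(tenant_id: str, user_id: str):
    """Return (sql, params) pairs that scope the current transaction to one user.

    The third argument of set_config (true) makes the setting transaction-local,
    so it does not leak to the next request on a pooled connection.
    """
    return [
        ("SELECT set_config('app.tenant_id', %s, true)", (tenant_id,)),
        ("SELECT set_config('app.user_id', %s, true)", (user_id,)),
    ]
```

In use, the application would open a transaction, run the statements from `session_statements`, then run `SEARCH_SQL`. The connecting role must not be a superuser or have `BYPASSRLS`, since those roles skip policies entirely.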
5. Post-Retrieval Guardrails
Even after retrieval:
- Unauthorized chunks must be removed.
- Sensitive fields may need masking.
- The LLM should be instructed to answer only from provided context.
Logging must capture:
- Which documents were retrieved.
- Which were filtered out.
- What context was passed to the model.
Together, these records make each answer traceable: an auditor can see exactly what the model was given and why.
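The guardrails and log fields above can be sketched as a single step that runs between retrieval and the model call. The `build_context` helper, the chunk layout, the prompt wording, and the email-only masking rule are assumptions for illustration; a real masking policy would cover whichever fields your classification scheme marks as sensitive.

```python
import re

# Simplified email pattern, used here as a stand-in for a real masking policy.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask(text):
    """Replace email addresses with a fixed placeholder."""
    return EMAIL.sub("[REDACTED_EMAIL]", text)


def build_context(chunks, is_allowed):
    """Drop unauthorized chunks, mask the rest, and return (prompt, log_record)."""
    kept = [c for c in chunks if is_allowed(c)]
    dropped = [c for c in chunks if not is_allowed(c)]
    context = "\n\n".join(f"[{c['document_id']}] {mask(c['text'])}" for c in kept)
    prompt = (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}"
    )
    log_record = {
        "retrieved": [c["document_id"] for c in chunks],
        "filtered_out": [c["document_id"] for c in dropped],
        "context": context,  # exactly what the model will see, after masking
    }
    return prompt, log_record
```

Building the prompt and the log record from the same `kept` list means the log cannot drift from what was actually sent to the model.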
Multi-Tenant Isolation: Preventing Cross-Customer Leakage
For B2B SaaS products, tenant isolation is critical.
Common patterns include:
- Namespace-per-tenant vector segmentation.
- Tenant ID prefixes in vector identifiers.
- Metadata-based tenant filters at query time.
Failure to enforce tenant scoping consistently can result in:
- Cross-tenant data exposure.
- Cache bleed-through.
- Retrieval ambiguity.
- Regulatory exposure.
Isolation must be enforced consistently at ingestion and query time — not only at the UI layer.
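A few small helpers illustrate how tenant scoping might be carried through identifiers, caches, and a final fail-closed check. The function names and the `tenant:document:chunk` layout are hypothetical, and the sketch assumes tenant and document IDs never contain a colon.

```python
import hashlib


def vector_id(tenant_id: str, document_id: str, chunk_no: int) -> str:
    """Prefix vector identifiers with the tenant so IDs cannot collide across tenants."""
    return f"{tenant_id}:{document_id}:{chunk_no}"


def cache_key(tenant_id: str, user_id: str, query: str) -> str:
    """Scope cached answers to tenant and user, not just the query text."""
    digest = hashlib.sha256(query.strip().lower().encode("utf-8")).hexdigest()
    return f"{tenant_id}:{user_id}:{digest}"


def assert_tenant(results, tenant_id):
    """Fail closed if any retrieved row belongs to another tenant."""
    for row in results:
        if row["tenant_id"] != tenant_id:
            raise PermissionError(f"cross-tenant row {row['id']!r} in results")
    return results
```

Keying the cache on the query alone is one way bleed-through happens: two tenants asking the same question would share an answer built from one tenant's documents. Running `assert_tenant` just before prompt construction gives a last line of defense if an upstream filter is misconfigured.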
Permission Drift: The Silent Risk
Permissions change over time:
- A user leaves a team.
- A CRM deal changes owner.
- A document is reclassified as confidential.
If embeddings are not updated or authorization is not evaluated dynamically:
- Users may retain access to outdated vectors.
- Reassigned records may still appear in search.
- Sensitive reclassified documents may remain retrievable.
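For permission changes alone, re-embedding is not required, provided the permission metadata stored with each vector is updated and query-time filtering reads it.
For content changes, re-embedding is required to align vectors with the updated content. A reclassification that leaves the text unchanged needs only an update to the stored classification metadata.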
Architecturally, this means:
- Use event-driven updates.
- Avoid embedding once and assuming permanence.
- Treat permission metadata as dynamic, not static.
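An event-driven update path might look like the sketch below. The event shape (`type`, `document_id`, `meta`, `text`), the three event types, and the dict-based index are assumptions; `embed` is whatever embedding function the pipeline already uses, passed in by the caller.

```python
def handle_event(event, index, embed):
    """Apply a source-system change event to the index.

    index: dict mapping document_id -> {"vector": ..., "meta": {...}, "text": ...}
    embed: callable that turns text into a vector (supplied by the caller)
    """
    doc_id = event["document_id"]
    kind = event["type"]
    if kind == "deleted":
        # Remove the vector along with the source record.
        index.pop(doc_id, None)
    elif kind == "permissions_changed":
        # Metadata-only update: the vector is untouched and embed is not called.
        index[doc_id]["meta"].update(event["meta"])
    elif kind == "content_changed":
        # The text changed, so the stored vector no longer represents it.
        index[doc_id]["text"] = event["text"]
        index[doc_id]["vector"] = embed(event["text"])
    else:
        raise ValueError(f"unknown event type: {kind}")
```

Separating the branches keeps ownership or ACL changes cheap, since they never pay for an embedding call, while making sure content edits cannot leave a stale vector behind.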
Are Embeddings Themselves Sensitive?
Embeddings are numeric vectors — but they are derived from source data.
Security researchers and standards bodies have documented:
- Membership inference risks.
- Model inversion risks.
- Potential reconstruction of sensitive content from vectors.
This means embeddings should not be treated as harmless metadata.
They require:
- Encryption at rest and in transit.
- Access control equal to source data.
- Retention and deletion policies.
- Audit logging.
- Tenant isolation.
Derived data does not eliminate compliance obligations.
Minimum Viable Audit Trail for RAG
Enterprise security teams expect to see structured, tamper-evident logs that capture:
- Authenticated user identity.
- Timestamp and request ID.
- Sanitized user prompt.
- Retrieval filters applied (tenant, owner, classification).
- Document IDs retrieved.
- Authorization decisions (allowed/denied).
- Context passed to the model.
- Model version and configuration.
- Generated response.
- Any post-processing steps.
Logs themselves must be:
- Access controlled.
- Encrypted.
- Append-only.
- Subject to retention policies.
- Capable of supporting deletion requests.
RAG systems that cannot trace an answer back to specific documents will struggle in regulated environments.
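One common way to make a log tamper-evident is to chain entries by hash, so that each record commits to the one before it. The sketch below shows the idea with an in-memory list. The `append_entry` and `verify` helpers and the entry fields are hypothetical, and a production system would also need durable append-only storage and access control.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, entry):
    # Canonical JSON (sorted keys, fixed separators) so the hash is reproducible.
    body = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log, entry):
    """Append an entry whose hash covers its content and the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    digest = _digest(prev_hash, entry)
    log.append({"entry": entry, "prev_hash": prev_hash, "hash": digest})
    return digest


def verify(log):
    """Recompute the chain; an edited or reordered entry, or one removed from the middle, breaks it."""
    prev_hash = GENESIS
    for record in log:
        if record["prev_hash"] != prev_hash or record["hash"] != _digest(prev_hash, record["entry"]):
            return False
        prev_hash = record["hash"]
    return True
```

A chain like this does not detect truncation of the most recent entries on its own, so the latest hash would need to be anchored somewhere the log writer cannot modify.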
Power RAG Pipelines with Unified
Unified is a data access layer for SaaS categories such as file storage, CRM, ticketing, and ATS.
In RAG pipelines:
- Unified retrieves content across categories.
- Webhooks can keep indexes current.
- Customers store derived embeddings and vector indexes in their own infrastructure.
- Unified does not store end-customer payloads.
Crucially:
- Only file objects include explicit permissions metadata.
- Other categories do not return user/group-level permissions.
- Row-level filtering must be implemented in the consuming application.
This boundary is intentional.
Unified normalizes data across providers, but it does not replace your authorization layer. Permission enforcement belongs in your application or authorization service.
That separation allows:
- Real-time retrieval directly from source APIs.
- Reduced data-at-rest footprint outside source platforms.
- Clear scoping of responsibility between data access and authorization logic.
Final Takeaway
RAG pipelines in enterprise SaaS environments are not just retrieval systems. They are data governance systems.
To build secure RAG:
- Capture permissions at ingestion.
- Attach structured attributes to embeddings.
- Enforce tenant isolation.
- Apply authorization at query time.
- Log retrieval provenance.
- Govern embeddings like sensitive data.
- Update vectors when content changes.
Most RAG failures in enterprise settings are not model failures.
They are authorization failures.
Designing for permissions, security, and compliance from the start is what separates a demo from a production system.