
How to build Enterprise Search using RAG


November 19, 2025

In this article, we will be going into detail on how to build an enterprise search application as well as a Q&A bot using a RAG pipeline with Unified's data integrations and OpenAI embedding models.

RAG, or Retrieval-Augmented Generation, is a powerful approach for taking a large amount of information and indexing it efficiently so that it can be searched quickly. This design pattern is very beneficial for enterprise search, Q&A bots, or any agent. A Q&A bot can answer questions across a company's knowledge base, while an enterprise search application can search corporate information from any internal data source.

Watch the video on our YouTube channel.

We will be using a few of Unified.to's integrations to access internal company knowledge in file storage (e.g., Google Drive) and knowledge management systems (e.g., Notion), embed the content using OpenAI's models, store the embeddings in a vector DB, and then use them to answer questions.

Prerequisites

  • Node.js 19+
  • Unified account with integrations enabled for Google Drive and Notion
  • Unified.to API key and GenAI Connection ID
  • OpenAI API key (we are using OpenAI for embeddings)
  • A vector DB (e.g., Pinecone or Chroma; for the demo we are using an in-memory vector store)

Step 1: Setting your project up

mkdir rag-bot
cd rag-bot
npm init -y
npm install dotenv @unified-api/typescript-sdk openai

Add your credentials and keys to the .env:

UNIFIED_API_KEY=your_unified_api_key
CONNECTION_GOOGLEDRIVE=your_gdrive_connection_id
CONNECTION_NOTION=your_notion_connection_id
CONNECTION_GENAI=your_genai_connection_id
OPENAI_API_KEY=your_openai_api_key

Step 2: Initialize the SDK

import 'dotenv/config';
import { UnifiedTo } from '@unified-api/typescript-sdk';

const { UNIFIED_API_KEY, CONNECTION_GOOGLEDRIVE, CONNECTION_NOTION, CONNECTION_GENAI } = process.env;

const sdk = new UnifiedTo({
  security: { jwt: UNIFIED_API_KEY! },
});

Step 3: Fetch all the files from Google Drive

We will now use Unified's storage API to list the files in a Google Drive and then download each file's content.

async function getAllGoogleDriveFiles(connectionId: string) {
  const files: any[] = [];
  let offset = 0;
  const limit = 100;
  let hasMore = true;
  while (hasMore) {
    const response = await sdk.storage.listStorageFiles({
      connectionId,
      offset,
      limit,
    });
    files.push(...(response.files || []));
    hasMore = (response.files?.length || 0) === limit;
    offset += limit;
  }

  return files;
}

async function getFileContent(connectionId: string, fileId: string) {
  const file = await sdk.storage.getStorageFile({ connectionId, id: fileId });

  const downloadUrl = file.download_url;

  const content = await fetch(downloadUrl).then(res => res.text());

  return content;
}
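In practice, a Drive folder often contains binary files (PDFs, images, spreadsheets) that fetch(...).text() will not handle well. Here is a small guard you might add before downloading; it is only a sketch and assumes the file objects expose a mime_type field, so check the shape your SDK version actually returns.

// Hypothetical guard: only index text-like files (assumes a mime_type field on the file object)
function isTextLike(file: any): boolean {
  const mime: string = file.mime_type || '';
  return mime.startsWith('text/') || mime === 'application/json';
}

// e.g. const textFiles = (await getAllGoogleDriveFiles(CONNECTION_GOOGLEDRIVE!)).filter(isTextLike);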

Step 4: Fetch Notion Pages

Now that we have the data from Google Drive, we will use Unified's Notion connection and the KMS (knowledge management) API to fetch pages as well.

async function getAllNotionPages(connectionId: string) {
  const pages: any[] = [];
  let offset = 0;
  const limit = 100;
  let hasMore = true;
  while (hasMore) {
    // Notion content lives in Unified's KMS (knowledge management) API rather than the storage API
    const response = await sdk.kms.listKmsPages({
      connectionId,
      offset,
      limit,
    });
    // The exact name of the response field may differ slightly by SDK version
    pages.push(...(response.pages || []));
    hasMore = (response.pages?.length || 0) === limit;
    offset += limit;
  }

  return pages;
}

async function getNotionPageContent(connectionId: string, pageId: string) {
  const page = await sdk.kms.getKmsPage({
    connectionId,
    id: pageId,
  });

  const content = await fetch(page.download_url).then(r => r.text());
  return content;
}

Step 5: Building the vector DB

For the purpose of the demo, we will be creating a simple in-memory vector store, but for production you would use a cloud-based vector database like Pinecone, Chroma, or Weaviate.

type Document = {
  id: string;
  source: string;
  content: string;
  embedding?: number[];
  filename?: string;
  date_created?: string;
  owner_id?: string;
};
const vectorDB: Document[] = [];
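If you swap in a hosted store for production, the write path looks similar but goes through the provider's SDK. A minimal sketch using Pinecone's Node SDK; the @pinecone-database/pinecone package, the company-knowledge index name, and the PINECONE_API_KEY variable are assumptions for illustration, not part of this demo.

import { Pinecone } from '@pinecone-database/pinecone';

// Hypothetical Pinecone setup: the index must already exist, with a dimension matching your embedding model
const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const index = pc.index('company-knowledge');

async function upsertDocument(doc: Document) {
  await index.upsert([
    {
      id: doc.id,
      values: doc.embedding!,
      // Metadata lets you trace a match back to its original source (see Step 7)
      metadata: { source: doc.source, filename: doc.filename || '', content: doc.content.slice(0, 1000) },
    },
  ]);
}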

Step 6: Embedding the documents

For the purpose of this example, we will be using OpenAI as the embedding provider in this step. Unified has many integrations that support embeddings, but you can also use a single provider's SDK directly, as we do here. Please remember to chunk the content properly rather than embedding an entire document at once.

Chunking strategies:

Chunking large documents effectively is essential to maintaining context and relevance. Here are a few best practices for chunking:

  1. Semantic Chunking: Instead of splitting by fixed lengths, break the document into chunks based on natural language boundaries, like paragraphs or sections, ensuring each chunk is coherent and meaningful.
  2. Overlap Between Chunks: Include a slight overlap between adjacent chunks. This helps maintain continuity and context, especially for complex ideas that span multiple chunks.
  3. Fixed-Length Chunking: If you use fixed-length chunking, aim for a length that balances between too small (losing context) and too large (losing efficiency). Typically, a few hundred tokens per chunk works well; a simple sketch follows after this list.
  4. Using Document Structure: If your document has headings, subheadings, or other structural elements, use those to guide the chunking. This ensures that each chunk is logically complete.
  5. Contextual Embeddings: Sometimes, you can embed the entire document first and then create embeddings for chunks that are derived from the document's embedding, ensuring that each chunk stays contextually relevant.
  6. Pre-processing and Cleaning: Before chunking, clean up the text by removing unnecessary whitespace, special characters, or any irrelevant information to improve the quality of the embeddings.

By following these strategies, you'll maintain the semantic integrity of the document and ensure that the vector database can retrieve meaningful context.
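As a concrete illustration of strategies 2 and 3, here is a minimal fixed-length chunking helper with overlap. The chunkText function and its sizes are purely illustrative, not part of Unified's SDK.

// Hypothetical helper: split text into fixed-size chunks with overlap (sizes are illustrative)
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    start += chunkSize - overlap; // step back by the overlap so adjacent chunks share context
  }
  return chunks;
}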

There are several popular libraries that help with chunking and embedding in a streamlined way. A few notable ones include:

  1. LangChain: This is a popular framework that integrates various language models and vector databases. It provides built-in utilities for chunking, embedding, and storing data in vector stores.
  2. Haystack: Developed by deepset, Haystack is another robust framework for building search pipelines. It supports intelligent chunking, document retrieval, and integration with vector databases like FAISS, Milvus, and Pinecone.
  3. Transformers and Datasets from Hugging Face: Hugging Face offers tools that can help with both embedding and chunking. You can use their Transformers library for embedding and their Datasets library to preprocess and chunk documents.
  4. SentenceTransformers: This library is great for generating embeddings and can be combined with custom chunking logic to break documents into meaningful pieces before embedding.
  5. LlamaIndex (formerly GPT Index): This library is designed to help build large language model applications and provides utilities for chunking, indexing, and querying documents effectively.

These libraries often provide a lot of built-in functionality, making it easier to handle complex documents and ensure that your embeddings and vector storage are efficient and meaningful.
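For example, LangChain's recursive character splitter handles boundary-aware splitting and overlap for you. A minimal sketch, assuming the @langchain/textsplitters package (the import path has moved between LangChain versions), with documentContent standing in for text fetched in Steps 3 and 4:

import { RecursiveCharacterTextSplitter } from '@langchain/textsplitters';

// chunkSize/chunkOverlap mirror the hand-rolled helper above
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});

// splitText returns an array of string chunks ready to be embedded
const chunks = await splitter.splitText(documentContent);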

import OpenAI from "openai";
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function embedText(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: "text-embedding-3-large", // You can choose to use text-embedding-3-small too
    input: text,
  });
  return response.data[0].embedding;
}
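If you are embedding many chunks, the embeddings endpoint also accepts an array of inputs, which cuts down on round trips. A small batched variant under the same assumptions as above:

async function embedTexts(texts: string[]): Promise<number[][]> {
  const response = await openai.embeddings.create({
    model: "text-embedding-3-large",
    input: texts, // one embedding is returned per input, in the same order
  });
  return response.data.map(item => item.embedding);
}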

Step 7: Indexing Everything

Now that we have all the data collected, it is time to index all of it!

When you store vectors in a vector database like Pinecone or Weaviate, you can attach metadata to each vector. This metadata can include things like a unique ID, the original text, source information, or any other relevant attributes.

When you perform a similarity search and get the top matching vectors, the database returns the metadata along with the vectors. That way, you can easily link back to the original content.

For instance, when you index a piece of text, you'd store the vector along with metadata like a document ID, the title, or even the source URL. When you retrieve the nearest neighbors, the metadata is returned, and you can use it to identify or fetch the original content from your database or any external source.
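With a hosted store, that metadata-aware retrieval might look roughly like the sketch below, which reuses the hypothetical Pinecone index from Step 5; the filter syntax and field names are assumptions, and the demo itself keeps using the in-memory store.

// Hypothetical retrieval sketch: fetch the 3 nearest vectors plus their metadata
async function searchPinecone(question: string) {
  const questionEmbedding = await embedText(question);
  const results = await index.query({
    vector: questionEmbedding,
    topK: 3,
    includeMetadata: true,
    // Optional metadata filter, e.g. restrict matches to Google Drive documents
    filter: { source: { $eq: 'gdrive' } },
  });
  // Each match carries the metadata stored at upsert time
  return results.matches?.map(m => ({ id: m.id, score: m.score, ...m.metadata }));
}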

async function indexAllContent() {
  // Google Drive
  const gdriveFiles = await getAllGoogleDriveFiles(CONNECTION_GOOGLEDRIVE!);
  for (const file of gdriveFiles) {
    const content = await getFileContent(CONNECTION_GOOGLEDRIVE!, file.id);
    const embedding = await embedText(content);
    vectorDB.push({
      id: file.id,
      source: "gdrive",
      content,
      embedding,
      filename: file.name,
      date_created: file.created_at,
      owner_id: file.user_id,
    });
  }

  // Notion
  const notionPages = await getAllNotionPages(CONNECTION_NOTION!);
  for (const page of notionPages) {
    const content = await getNotionPageContent(CONNECTION_NOTION!, page.id);
    const embedding = await embedText(content);
    vectorDB.push({ id: page.id, source: "notion", content, embedding });
  }
}
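Note that this demo embeds each file in one shot; in practice you would apply the chunking advice from Step 6 and store one vector per chunk. A rough variant of the Google Drive loop, reusing the illustrative chunkText helper sketched earlier:

// Sketch: index one vector per chunk instead of one per file
async function indexGoogleDriveChunked() {
  const gdriveFiles = await getAllGoogleDriveFiles(CONNECTION_GOOGLEDRIVE!);
  for (const file of gdriveFiles) {
    const content = await getFileContent(CONNECTION_GOOGLEDRIVE!, file.id);
    const chunks = chunkText(content); // illustrative helper from Step 6, not part of Unified's SDK
    for (let i = 0; i < chunks.length; i++) {
      vectorDB.push({
        id: `${file.id}-${i}`, // chunk-level id so a match can be traced back to its file
        source: "gdrive",
        content: chunks[i],
        embedding: await embedText(chunks[i]),
        filename: file.name,
      });
    }
  }
}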

Step 8: Finding Answers

Now that we have the data ready, it is time to implement the bot or enterprise search application that will use this data to answer a user's questions.

The basic flow would be:

  1. Embed the question asked
  2. Find similar docs in the Vector DB
  3. Send the top n docs as context, paired with the question, to Unified's GenAI endpoint.
  4. Return the answer to the user.

For the purpose of the example, we will be sending the top 3 most similar docs.

function cosineSimilarity(a: number[], b: number[]) {
  const dot = a.reduce((sum, ai, i) => sum + ai * b[i], 0);
  const normA = Math.sqrt(a.reduce((sum, ai) => sum + ai * ai, 0));
  const normB = Math.sqrt(b.reduce((sum, bi) => sum + bi * bi, 0));
  return dot / (normA * normB);
}

async function answerQuestion(question: string) {
  const questionEmbedding = await embedText(question);

  // For the purpose of the example we will be sending the top 3 docs
  const scored = vectorDB.map(doc => ({
    ...doc,
    score: cosineSimilarity(questionEmbedding, doc.embedding!)
  }));
  const topDocs = scored.sort((a, b) => b.score - a.score).slice(0, 3);

  const context = topDocs.map(doc => doc.content).join('\n---\n');
 
  const prompt = `
You are a helpful company knowledge bot. Use the following context to answer the question.

Context:
${context}

Question:
${question}

Answer in a concise and accurate way.
  `;

  const result = await sdk.genai.createGenaiPrompt({
    connectionId: CONNECTION_GENAI!,
    prompt: {
      messages: [{ role: "USER", content: prompt }],
      maxTokens: 256,
      temperature: 0.2
    }
  });

  return result.choices?.[0]?.message?.content || "No answer generated.";
}
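Because each stored document carries its source and id (and, for Drive files, a filename), you can also surface citations alongside the answer. A small, hypothetical helper you could call on topDocs before returning from answerQuestion:

// Hypothetical helper: build lightweight citations from the retrieved documents
function buildCitations(docs: Document[]) {
  return docs.map(doc => ({ id: doc.id, source: doc.source, filename: doc.filename }));
}

// e.g. return { answer, sources: buildCitations(topDocs) } instead of returning just the answer string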

Step 9: Usage

(async () => {
  await indexAllContent();
  const answer = await answerQuestion("How do I request vacation days?");
  console.log("Bot answer:", answer);
})();

At Unified, most of our customers are using this simple RAG pipeline flow to add AI-powered intelligence to their applications. It can be used across industries and verticals, as Unified supports 21 B2B SaaS categories and more than 350 integrations to retrieve end-customer data from.
