How to build a RAG Pipeline for a Q&A bot with Unified.to
May 25, 2025
Retrieval-augmented generation (RAG) is a practical approach for building a Q&A bot. Such a bot can answer questions across a company's knowledge base, from technical programming questions to how an employee can request PTO.
Watch the video on our YouTube channel.
We will use a few of Unified.to's integrations to build a RAG pipeline that fetches documents from Google Drive and a Notion workspace, stores them in a vector DB, and uses them to answer questions.
Prerequisites
- Node.js 19+
- Unified account with integrations enabled for Google Drive and Notion
- Unified.to API key and GenAI Connection ID
- OpenAI API key (this example uses OpenAI to generate embeddings)
- A vector DB (for example, Pinecone or Chroma; this demo uses an in-memory vector store)
Step 1: Setting your project up
mkdir rag-bot
cd rag-bot
npm init -y
npm install dotenv openai @unified-api/typescript-sdk
Add your credentials and keys to the .env file:
UNIFIED_API_KEY=your_unified_api_key
CONNECTION_GOOGLEDRIVE=your_gdrive_connection_id
CONNECTION_NOTION=your_notion_connection_id
CONNECTION_GENAI=your_genai_connection_id
OPENAI_API_KEY=your_openai_api_key
Step 2: Initialize the SDK
import 'dotenv/config';
import { UnifiedTo } from '@unified-api/typescript-sdk';
const { UNIFIED_API_KEY, CONNECTION_GOOGLEDRIVE, CONNECTION_NOTION, CONNECTION_GENAI } = process.env;
const sdk = new UnifiedTo({
  security: { jwt: UNIFIED_API_KEY! },
});
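The non-null assertions (`!`) above satisfy the compiler but do not catch a missing variable at runtime. As a minimal sketch, a small check can fail fast with a readable message. `requireEnv` is a hypothetical helper introduced here for illustration and is not part of the Unified SDK.

```typescript
// Hypothetical helper (not part of the Unified SDK): fail fast when a required
// environment variable is missing instead of relying on `!` assertions.
type Env = Record<string, string | undefined>;

function requireEnv<K extends string>(env: Env, names: readonly K[]): Record<K, string> {
  // Collect every missing or empty variable so the error lists them all at once.
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const values = {} as Record<K, string>;
  for (const name of names) {
    values[name] = env[name] as string;
  }
  return values;
}

// Usage (assumes the variable names from the .env file above):
// const config = requireEnv(process.env, ["UNIFIED_API_KEY", "CONNECTION_GENAI"] as const);
```

With this in place, a typo in the .env file surfaces at startup rather than as a confusing authentication error later.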
Step 3: Fetch all the files from Google Drive
We will now use Unified's storage API to retrieve files from a Google Drive.
async function getAllGoogleDriveFiles(connectionId: string) {
  const files: any[] = [];
  let offset = 0;
  const limit = 100;
  let hasMore = true;
  while (hasMore) {
    const response = await sdk.storage.listStorageFiles({
      connectionId,
      offset,
      limit,
    });
    files.push(...(response.files || []));
    // A full page means there may be more results to fetch.
    hasMore = (response.files?.length || 0) === limit;
    offset += limit;
  }
  return files;
}
// Download a single file's text content through its download URL.
async function getFileContent(connectionId: string, fileId: string) {
  const file = await sdk.storage.getStorageFile({ connectionId, id: fileId });
  const downloadUrl = file.download_url;
  if (!downloadUrl) {
    throw new Error(`No download URL for file ${fileId}`);
  }
  const content = await fetch(downloadUrl).then(res => res.text());
  return content;
}
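The offset/limit loop above is repeated for each source, so it can be factored out. The following is a sketch only: `fetchAll` and its `fetchPage` callback are hypothetical names introduced here, and the callback is assumed to wrap whichever list call you use.

```typescript
// Generic offset/limit pager. `fetchPage` is a hypothetical callback that wraps
// a list call and returns one page of items for the given offset and limit.
async function fetchAll<T>(
  fetchPage: (offset: number, limit: number) => Promise<T[]>,
  limit = 100,
): Promise<T[]> {
  const items: T[] = [];
  let offset = 0;
  while (true) {
    const page = await fetchPage(offset, limit);
    items.push(...page);
    // A short page means there is nothing left to fetch.
    if (page.length < limit) break;
    offset += limit;
  }
  return items;
}

// Usage, mirroring the loop above (assumes `sdk` and `connectionId` are in scope):
// const files = await fetchAll(async (offset, limit) => {
//   const response = await sdk.storage.listStorageFiles({ connectionId, offset, limit });
//   return response.files || [];
// });
```

Keeping the stop condition in one place makes it harder for the per-source loops to drift apart.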
Step 4: Fetch Notion Pages
Now that we have the data from Google Drive, we will use Unified's connection to Notion to fetch data from Notion as well.
async function getAllNotionPages(connectionId: string) {
  const pages: any[] = [];
  let offset = 0;
  const limit = 100;
  let hasMore = true;
  while (hasMore) {
    // This reuses the storage list call from Step 3. If your Notion connection
    // exposes pages through a different category, swap in that list call here.
    const response = await sdk.storage.listStorageFiles({
      connectionId,
      offset,
      limit,
    });
    // Collect into `pages`, the array this function returns.
    pages.push(...(response.files || []));
    hasMore = (response.files?.length || 0) === limit;
    offset += limit;
  }
  return pages;
}
async function getNotionPageContent(connectionId: string, pageId: string) {
  const page = await sdk.kms.getKmsPage({
    connectionId,
    id: pageId,
  });
  return page.content as string;
}
Step 5: Building the vector DB
For this demo, we will create a simple in-memory vector store. In production, you would typically use a dedicated vector database such as Pinecone, Chroma, or Weaviate.
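// The optional metadata fields are filled in for Google Drive files during indexing.
type Document = { id: string; source: string; content: string; embedding?: number[]; filename?: string; date_created?: Date | string; owner_id?: string };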
const vectorDB: Document[] = [];
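A plain array accumulates duplicates if indexing runs more than once. As a hedged sketch, a thin wrapper with upsert-by-key semantics avoids that. `InMemoryVectorStore` and `StoredDoc` are hypothetical names for illustration; hosted vector databases offer similar operations, but their actual APIs differ.

```typescript
// Hypothetical in-memory store with upsert semantics (illustration only).
type StoredDoc = { id: string; source: string; content: string; embedding?: number[] };

class InMemoryVectorStore {
  private docs = new Map<string, StoredDoc>();

  // Key on source + id so a Drive file and a Notion page with the same id do not collide.
  upsert(doc: StoredDoc): void {
    this.docs.set(`${doc.source}:${doc.id}`, doc);
  }

  // Return all stored documents in insertion order.
  all(): StoredDoc[] {
    return Array.from(this.docs.values());
  }

  get size(): number {
    return this.docs.size;
  }
}
```

Re-indexing the same file then replaces the earlier entry instead of adding a second copy.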
Step 6: Embedding the documents
For the purpose of the example, we will be using OpenAI as the embedding provider in this step.
// Uses the v4+ interface of the openai package; the older Configuration/OpenAIApi classes were removed in v4.
import OpenAI from "openai";
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
async function embedText(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: "text-embedding-3-large", // You can choose to use text-embedding-3-small too
    input: text,
  });
  return response.data[0].embedding;
}
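Embedding models accept only a limited amount of input per request, so long documents usually need to be split before calling `embedText`. Below is a minimal sketch of fixed-size chunking with overlap. `chunkText` is a hypothetical helper, it counts characters rather than tokens, and the default sizes are illustrative assumptions rather than tuned values.

```typescript
// Hypothetical helper: split text into overlapping, fixed-size character chunks.
function chunkText(text: string, chunkSize = 2000, overlap = 200): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("Require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks: string[] = [];
  // Each chunk starts `step` characters after the previous one.
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once this chunk reaches the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

Each chunk would then be embedded and stored as its own entry, for example with an id derived from the file id and the chunk index.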
Step 7: Indexing Everything
Now that we have all the data collected, it is time to index all of it!
async function indexAllContent() {
  // Google Drive
  const gdriveFiles = await getAllGoogleDriveFiles(CONNECTION_GOOGLEDRIVE!);
  for (const file of gdriveFiles) {
    const content = await getFileContent(CONNECTION_GOOGLEDRIVE!, file.id);
    const embedding = await embedText(content);
    vectorDB.push({
      id: file.id,
      source: "gdrive",
      content,
      embedding,
      filename: file.filename,
      date_created: file.date_created,
      owner_id: file.owner_id,
    });
  }
  // Notion
  const notionPages = await getAllNotionPages(CONNECTION_NOTION!);
  for (const page of notionPages) {
    const content = await getNotionPageContent(CONNECTION_NOTION!, page.id);
    const embedding = await embedText(content);
    vectorDB.push({ id: page.id, source: "notion", content, embedding });
  }
}
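As written, the indexing loop stops at the first file that fails to download or embed. One possible variation, sketched below, records failures and keeps going. `indexItems` and its callbacks are hypothetical stand-ins for `getFileContent`, `embedText`, and `vectorDB.push`; skipping empty documents is an added assumption, not part of the original flow.

```typescript
// Hypothetical, error-tolerant indexing loop (illustration only).
type IndexResult = { indexed: number; failed: { id: string; reason: string }[] };

async function indexItems<T extends { id: string }>(
  items: T[],
  loadContent: (item: T) => Promise<string>,
  embed: (text: string) => Promise<number[]>,
  store: (item: T, content: string, embedding: number[]) => void,
): Promise<IndexResult> {
  const result: IndexResult = { indexed: 0, failed: [] };
  for (const item of items) {
    try {
      const content = await loadContent(item);
      // Skip empty documents: there is nothing useful to embed.
      if (!content.trim()) {
        result.failed.push({ id: item.id, reason: "empty content" });
        continue;
      }
      store(item, content, await embed(content));
      result.indexed += 1;
    } catch (error) {
      // Record the failure and continue with the next item.
      const reason = error instanceof Error ? error.message : String(error);
      result.failed.push({ id: item.id, reason });
    }
  }
  return result;
}
```

The returned summary makes it easy to log which documents were skipped and why.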
Step 8: Implementing the Q&A Bot
Now that we have the data ready, it is time to implement the bot that will use this data to answer a user's questions.
The basic flow of the bot is:
1. Embed the question.
2. Find similar docs in the vector DB.
3. Send the top n docs as context, paired with the question, to Unified's GenAI endpoint.
4. Return the answer to the user.
For this example, we will send the top 3 most similar docs.
function cosineSimilarity(a: number[], b: number[]) {
  const dot = a.reduce((sum, ai, i) => sum + ai * b[i], 0);
  const normA = Math.sqrt(a.reduce((sum, ai) => sum + ai * ai, 0));
  const normB = Math.sqrt(b.reduce((sum, bi) => sum + bi * bi, 0));
  return dot / (normA * normB);
}
async function answerQuestion(question: string) {
  const questionEmbedding = await embedText(question);
  // Score every stored doc against the question; we send the top 3 docs as context.
  const scored = vectorDB.map(doc => ({
    ...doc,
    score: cosineSimilarity(questionEmbedding, doc.embedding!)
  }));
  const topDocs = scored.sort((a, b) => b.score - a.score).slice(0, 3);
  const context = topDocs.map(doc => doc.content).join('\n---\n');
  const prompt = `
You are a helpful company knowledge bot. Use the following context to answer the question.
Context:
${context}

Question:
${question}

Answer in a concise and accurate way.
`;
  const result = await sdk.genai.createGenaiPrompt({
    connectionId: CONNECTION_GENAI!,
    prompt: {
      messages: [{ role: "USER", content: prompt }],
      maxTokens: 256,
      temperature: 0.2
    }
  });
  return result.choices?.[0]?.message?.content || "No answer generated.";
}
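The retrieval part of this flow (score, sort, keep the top n) can be tried in isolation with small hand-made vectors. The sketch below uses hypothetical helpers `cosine` and `topK`; the length check, the zero-norm guard, and the filtering of docs without embeddings are additions not present in the code above.

```typescript
// Hypothetical helpers for trying out top-k retrieval without calling any API.
type Scored<T> = T & { score: number };

function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Guard against zero vectors, which would otherwise produce NaN.
  return denom === 0 ? 0 : dot / denom;
}

function topK<T extends { embedding?: number[] }>(docs: T[], query: number[], k: number): Scored<T>[] {
  return docs
    .filter((doc) => doc.embedding !== undefined) // ignore docs that were never embedded
    .map((doc) => ({ ...doc, score: cosine(query, doc.embedding as number[]) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Usage with toy 2-dimensional vectors:
// topK([{ id: "x", embedding: [1, 0] }, { id: "y", embedding: [0, 1] }], [1, 0], 1)
// returns the "x" document with a score of 1.
```

A linear scan like this is fine for a demo; a dedicated vector database replaces it with an indexed nearest-neighbor search.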
Step 9: Usage
(async () => {
  await indexAllContent();
  const answer = await answerQuestion("How do I request vacation days?");
  console.log("Bot answer:", answer);
})();
Happy coding! 🎉