Train ChatGPT on your Documentation

Eric Allam

CTO, Trigger.dev

TL;DR

ChatGPT's training data only goes up to 2022.

But what if you want it to answer questions specifically about your website? Until recently, that wasn't possible. Not anymore!

OpenAI has introduced a new feature: assistants.

You can now easily index your website and then ask ChatGPT questions about it. In this tutorial, we will build a system that indexes your website and lets you query it. We will:

  • Scrape the documentation sitemap.
  • Extract the information from all the pages on the website.
  • Create a new assistant with the new information.
  • Build a simple ChatGPT frontend interface and query the assistant.

Your background job platform 🔌

Trigger.dev is an open-source library that enables you to create and monitor long-running jobs for your app with NextJS, Remix, Astro, and so many more!

 


Please help us with a star 🥹. It would help us to create more articles like this 💖

Star the Trigger.dev repository ⭐️


Let’s get started 🔥

Let’s set up a new NextJS project.


npx create-next-app@latest

💡 We use the new Next.js App Router. Please make sure you have Node.js 18+ installed before creating the project.

Let's create a new database to save the assistant and the scraped pages. For our example, we will use Prisma with SQLite.

It is super easy to install, just run:


npm install prisma @prisma/client --save

Then add a schema and a database with:


npx prisma init --datasource-provider sqlite

Go to prisma/schema.prisma and replace it with the following schema:


// This is your Prisma schema file,
// learn more about it in the docs: https://pris.ly/d/prisma-schema

generator client {
  provider = "prisma-client-js"
}

datasource db {
  provider = "sqlite"
  url      = env("DATABASE_URL")
}

model Docs {
  id         Int    @id @default(autoincrement())
  content    String
  url        String @unique
  identifier String
  @@index([identifier])
}

model Assistant {
  id  Int    @id @default(autoincrement())
  aId String
  url String @unique
}

And then run


npx prisma db push

That will create a new SQLite database (a local file) with two main tables, Docs and Assistant:

  • The Docs contains all the scraped pages
  • The Assistant contains the URL of the docs and the internal ChatGPT assistant ID.

Let’s add our Prisma client.

Create a new folder called helper and add a new file called prisma.client.ts (the imports used later in this tutorial expect that name) with the following code inside:


import { PrismaClient } from "@prisma/client";

export const prisma = new PrismaClient();

We can later use that prisma variable to query our database.



Scrape & Index

Create a Trigger.dev account

Scraping and indexing the pages is a long-running task. We need to:

  • Scrape the main page's metadata for the sitemap URL.
  • Extract all the pages inside the sitemap.
  • Go to each page and extract the content.
  • Save everything to the ChatGPT assistant.

For that, let’s use Trigger.dev!

Sign up for a Trigger.dev account.

Once registered, create an organization and choose a project name for your job.


Select Next.js as your framework and follow the process for adding Trigger.dev to an existing Next.js project.


Otherwise, click Environments & API Keys on the sidebar menu of your project dashboard.


Copy your DEV server API key and run the code snippet below to install Trigger.dev.

Follow the instructions carefully.


npx @trigger.dev/cli@latest init

Run the following code snippet in another terminal to establish a tunnel between Trigger.dev and your Next.js project.


npx @trigger.dev/cli@latest dev

Install ChatGPT (OpenAI)

We will use OpenAI assistants, so we need to add the OpenAI integration to our project.

Create a new OpenAI account and generate an API Key.


Click View API key from the dropdown to create an API Key.


Next, install the OpenAI package by running the code snippet below.


npm install @trigger.dev/openai

Add your OpenAI API key to the .env.local file.


OPENAI_API_KEY=<your_api_key>

Inside the helper directory, add a new file, open.ai.tsx, with the following content:


import { OpenAI } from "@trigger.dev/openai";

export const openai = new OpenAI({
  id: "openai",
  apiKey: process.env.OPENAI_API_KEY!,
});

That’s our OpenAI client, wrapped by the Trigger.dev integration.

Building the background jobs

Let’s go ahead and create a new background job!

Go to jobs and create a new file called process.documentation.ts. Add the following code:


import { eventTrigger } from "@trigger.dev/sdk";
import { client } from "@openai-assistant/trigger";
import { object, string } from "zod";
import { JSDOM } from "jsdom";
import { openai } from "@openai-assistant/helper/open.ai";

client.defineJob({
  // This is the unique identifier for your Job; it must be unique across all Jobs in your project.
  id: "process-documentation",
  name: "Process Documentation",
  version: "0.0.1",
  // This is triggered by an event using eventTrigger. You can also trigger Jobs with webhooks, on schedules, and more: https://trigger.dev/docs/documentation/concepts/triggers/introduction
  trigger: eventTrigger({
    name: "process.documentation.event",
    schema: object({
      url: string(),
    }),
  }),
  integrations: {
    openai,
  },
  run: async (payload, io, ctx) => {},
});

We have defined a new job that is triggered by the process.documentation.event event, and we added a required parameter called url: the documentation URL that we will send later.

As you can see, the job is empty, so let’s add the first task to it.

We need to grab the website's sitemap URL and return it. Fetching the website returns HTML that we need to parse. To do that, let’s install JSDOM.


npm install jsdom --save

And import it at the top of our file:


import { JSDOM } from "jsdom";

Now, we can add our first task.

It’s important to wrap our code with runTask, which lets Trigger.dev separate it from the other tasks. Trigger.dev's architecture splits the tasks into different processes, so Vercel's serverless timeout does not affect them. Here is the code for the first task:


const getSiteMap = await io.runTask("grab-sitemap", async () => {
  const data = await (await fetch(payload.url)).text();
  const dom = new JSDOM(data);
  const sitemap = dom.window.document.querySelector('[rel="sitemap"]')?.getAttribute('href');
  return new URL(sitemap!, payload.url).toString();
});

  • We grab the entire HTML from the URL with an HTTP request.
  • We parse it into a DOM object with JSDOM.
  • We find the sitemap URL in the element marked rel="sitemap".
  • We resolve it against the page URL and return the absolute URL.
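
To illustrate the last step, here is a minimal sketch of how the standard URL constructor resolves a sitemap link against the page it was found on. The helper name resolveSitemapUrl, the example URLs, and the fallback to /sitemap.xml are assumptions made for illustration; the task above does not include a fallback.

```typescript
// Hypothetical helper: resolve a sitemap href (relative or absolute)
// against the URL of the page it was found on.
function resolveSitemapUrl(
  href: string | null | undefined,
  pageUrl: string
): string {
  // Assumption: when the page has no rel="sitemap" link, fall back to the
  // conventional /sitemap.xml location instead of failing.
  return new URL(href ?? "/sitemap.xml", pageUrl).toString();
}

// A root-relative href is resolved against the page's origin.
console.log(resolveSitemapUrl("/sitemap.xml", "https://example.com/docs/intro"));

// An absolute href is kept as-is, even when it points to another host.
console.log(
  resolveSitemapUrl("https://cdn.example.com/sitemap.xml", "https://example.com/docs")
);
```

Without a fallback, a page that lacks the link would pass undefined to the constructor, so checking for a missing href before resolving is worth considering.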

Going forward, we need to scrape the sitemap, extract all the URLs and return them. Let’s install Lodash, a utility library with helper functions for arrays.


npm install lodash @types/lodash --save

Here is the code of the task:


// Add this import at the top of the file
import { chunk } from "lodash";

export const makeId = (length: number) => {
  let text = '';
  const possible = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';

  for (let i = 0; i < length; i += 1) {
    text += possible.charAt(Math.floor(Math.random() * possible.length));
  }
  return text;
};

const {identifier, list} = await io.runTask("load-and-parse-sitemap", async () => {
  const urls = /(http|ftp|https):\/\/([\w_-]+(?:(?:\.[\w_-]+)+))([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])/g;
  const identifier = makeId(5);
  const data = await (await fetch(getSiteMap)).text();
  // @ts-ignore
  return {identifier, list: chunk(([...new Set(data.match(urls))] as string[]).filter(f => f.includes(payload.url)).map(p => ({identifier, url: p})), 25)};
});

  • We create a new function called makeId to generate a random identifier for all our pages.
  • We create a new task and add a regex to extract every possible URL.
  • We send an HTTP request to load the sitemap and extract all its URLs, keeping only unique URLs that belong to our documentation site.
  • We chunk the URLs into arrays of 25 elements (if we have 100 elements, we will have four arrays of 25 elements).
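
The extract, dedupe, filter, and chunk steps can be tried in isolation. The sketch below is dependency-free and uses assumptions throughout: chunkArray stands in for Lodash's chunk, the URL pattern is a simplified one rather than the regex from the task, and the sitemap body is made up.

```typescript
// Split an array into consecutive groups of at most `size` items
// (a stand-in for the behavior the tutorial gets from lodash's chunk).
function chunkArray<T>(items: T[], size: number): T[][] {
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// Made-up sitemap body containing a duplicate and an off-site URL.
const sitemapXml = `
<urlset>
  <url><loc>https://example.com/docs/a</loc></url>
  <url><loc>https://example.com/docs/b</loc></url>
  <url><loc>https://example.com/docs/a</loc></url>
  <url><loc>https://other.example.org/page</loc></url>
</urlset>`;

// Simplified pattern (an assumption): match http(s) URLs up to the next
// whitespace, quote, or angle bracket.
const urlPattern = /https?:\/\/[^\s<>"']+/g;
const baseUrl = "https://example.com/docs";

// Extract every URL, drop duplicates, and keep only pages under baseUrl.
const pages = Array.from(new Set(sitemapXml.match(urlPattern) ?? [])).filter(
  (url) => url.includes(baseUrl)
);

// Group the pages into batches of 25, as the task does.
const batches = chunkArray(pages, 25);
console.log(pages, batches.length);
```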

Next, let’s create a new job to process each URL.

Here is the complete code:


function getElementsBetween(startElement: Element, endElement: Element) {
  let currentElement = startElement;
  const elements = [];

  // Traverse the DOM until the endElement is reached
  while (currentElement && currentElement !== endElement) {
    currentElement = currentElement.nextElementSibling!;

    // If there's no next sibling, go up a level and continue
    if (!currentElement) {
      // @ts-ignore
      currentElement = startElement.parentNode!;
      startElement = currentElement;
      if (currentElement === endElement) break;
      continue;
    }

    // Add the current element to the list
    if (currentElement && currentElement !== endElement) {
      elements.push(currentElement);
    }
  }

  return elements;
}

const processContent = client.defineJob({
  // This is the unique identifier for your Job; it must be unique across all Jobs in your project.
  id: "process-content",
  name: "Process Content",
  version: "0.0.1",
  // This is triggered by an event using eventTrigger. You can also trigger Jobs with webhooks, on schedules, and more: https://trigger.dev/docs/documentation/concepts/triggers/introduction
  trigger: eventTrigger({
    name: "process.content.event",
    schema: object({
      url: string(),
      identifier: string(),
    })
  }),
  run: async (payload, io, ctx) => {
    return io.runTask('grab-content', async () => {
      // We first grab a raw html of the content from the website
      const data = await (await fetch(payload.url)).text();

      // We load it with JSDOM so we can manipulate it
      const dom = new JSDOM(data);

      // We remove all the scripts and styles from the page
      dom.window.document.querySelectorAll('script, style').forEach((el) => el.remove());

      // We grab all the titles from the page
      const content = Array.from(dom.window.document.querySelectorAll('h1, h2, h3, h4, h5, h6'));

      // We grab the last element so we can get the content between the last element and the next element
      const lastElement = content[content.length - 1]?.parentElement?.nextElementSibling!;
      const elements = [];

      // We loop through all the elements and grab the content between each title
      for (let i = 0; i < content.length; i++) {
        const element = content[i];
        const nextElement = content?.[i + 1] || lastElement;
        const elementsBetween = getElementsBetween(element, nextElement);
        elements.push({
          title: element.textContent, content: elementsBetween.map((el) => el.textContent).join('\n')
        });
      }

      // We create a raw text format of all the content
      const page = `
----------------------------------
url: ${payload.url}\n
${elements.map((el) => `${el.title}\n${el.content}`).join('\n')}

----------------------------------
`;

      // We save it to our database
      await prisma.docs.upsert({
        where: {
          url: payload.url
        }, update: {
          content: page, identifier: payload.identifier
        }, create: {
          url: payload.url, content: page, identifier: payload.identifier
        }
      });
    });
  },
});

  • We grab the content from the URL (previously extracted from the sitemap)
  • We parse it with JSDOM
  • We remove every <script> or <style> element that exists on the page.
  • We grab all the titles on the page (h1, h2, h3, h4, h5, h6)
  • We iterate over the titles and take the content between them. We don’t want to take the entire page content because it might contain irrelevant content.
  • We create our version of the raw text of the page and save it to our database.
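
The idea of collecting the content that sits between titles can be shown on a simplified model. The sketch below is not the DOM traversal used in the job: it assumes the page has already been flattened into an ordered list of nodes, and the FlatNode and Section types and the groupByHeading name are hypothetical.

```typescript
// Simplified model (an assumption): the page as a flat, ordered list of nodes.
interface FlatNode {
  tag: string; // e.g. "h1", "p", "ul"
  text: string;
}

interface Section {
  title: string;
  content: string;
}

const HEADING_TAG = /^h[1-6]$/;

// Group every non-heading node under the closest heading that precedes it.
function groupByHeading(nodes: FlatNode[]): Section[] {
  const sections: Section[] = [];
  let current: { title: string; lines: string[] } | null = null;

  for (const node of nodes) {
    if (HEADING_TAG.test(node.tag)) {
      // A new heading closes the previous section and opens a new one.
      if (current) {
        sections.push({ title: current.title, content: current.lines.join("\n") });
      }
      current = { title: node.text, lines: [] };
    } else if (current) {
      current.lines.push(node.text);
    }
    // Nodes that appear before the first heading are skipped.
  }

  if (current) {
    sections.push({ title: current.title, content: current.lines.join("\n") });
  }
  return sections;
}

const sections = groupByHeading([
  { tag: "p", text: "Cookie banner" },
  { tag: "h1", text: "Getting started" },
  { tag: "p", text: "Install the package." },
  { tag: "p", text: "Run the CLI." },
  { tag: "h2", text: "Configuration" },
  { tag: "p", text: "Set your API key." },
]);
console.log(sections);
```

Starting from the titles means anything that appears before the first heading, such as navigation or banners, never reaches the stored text.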

Now, let’s run this job for every sitemap URL. Trigger.dev provides a method called batchInvokeAndWaitForCompletion. It allows us to send batches of 25 items and processes all the items in a batch simultaneously. Here are the next lines of code:


let i = 0;
for (const item of list) {
  await processContent.batchInvokeAndWaitForCompletion(
    "process-list-" + i,
    item.map(
      (payload) => ({
        payload,
      }),
      86_400
    )
  );
  i++;
}

We manually trigger the previously created job in batches of 25.

Once that’s completed, let’s take all the content we have saved to our database and concatenate it:


const data = await io.runTask("get-extracted-data", async () => {
  return (
    await prisma.docs.findMany({
      where: {
        identifier,
      },
      select: {
        content: true,
      },
    })
  )
    .map((d) => d.content)
    .join("\n\n");
});

We use the identifier we have specified before.

Now, let’s upload the new data to OpenAI as a file:


const file = await io.openai.files.createAndWaitForProcessing("upload-file", {
  purpose: "assistants",
  file: data,
});

createAndWaitForProcessing is a task created by Trigger.dev to upload files to the assistant. If you manually use openai without the integration, you must stream the files.

Now let’s create or update our assistant:


const assistant = await io.openai.runTask(
  "create-or-update-assistant",
  async (openai) => {
    const currentAssistant = await prisma.assistant.findFirst({
      where: {
        url: payload.url,
      },
    });
    if (currentAssistant) {
      return openai.beta.assistants.update(currentAssistant.aId, {
        file_ids: [file.id],
      });
    }
    return openai.beta.assistants.create({
      name: identifier,
      description: "Documentation",
      instructions:
        "You are a documentation assistant, you have been loaded with documentation from " +
        payload.url +
        ", return everything in an MD format.",
      model: "gpt-4-1106-preview",
      tools: [{ type: "code_interpreter" }, { type: "retrieval" }],
      file_ids: [file.id],
    });
  }
);

  • We first check if we have an assistant for that specific URL.
  • If we have one, let’s update the assistant with the new file.
  • If not, let’s create a new assistant.
  • We pass the instruction “You are a documentation assistant”. Note that we ask for the final output in MD (Markdown) format so we can display it nicely later.

For the final piece of the puzzle, let’s save the new assistant into our database.

Here is the code:


await io.runTask("save-assistant", async () => {
  await prisma.assistant.upsert({
    where: {
      url: payload.url,
    },
    update: {
      aId: assistant.id,
    },
    create: {
      aId: assistant.id,
      url: payload.url,
    },
  });
});

If the URL already exists, we update its record with the new assistant ID; otherwise, we create a new record.

Here is the full code of the page:


import { eventTrigger } from "@trigger.dev/sdk";
import { client } from "@openai-assistant/trigger";
import {object, string} from "zod";
import {JSDOM} from "jsdom";
import {chunk} from "lodash";
import {prisma} from "@openai-assistant/helper/prisma.client";
import {openai} from "@openai-assistant/helper/open.ai";

const makeId = (length: number) => {
  let text = '';
  const possible = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';

  for (let i = 0; i < length; i += 1) {
    text += possible.charAt(Math.floor(Math.random() * possible.length));
  }
  return text;
};

client.defineJob({
  // This is the unique identifier for your Job; it must be unique across all Jobs in your project.
  id: "process-documentation",
  name: "Process Documentation",
  version: "0.0.1",
  // This is triggered by an event using eventTrigger. You can also trigger Jobs with webhooks, on schedules, and more: https://trigger.dev/docs/documentation/concepts/triggers/introduction
  trigger: eventTrigger({
    name: "process.documentation.event",
    schema: object({
      url: string(),
    })
  }),
  integrations: {
    openai
  },
  run: async (payload, io, ctx) => {

    // The first task to get the sitemap URL from the website
    const getSiteMap = await io.runTask("grab-sitemap", async () => {
      const data = await (await fetch(payload.url)).text();
      const dom = new JSDOM(data);
      const sitemap = dom.window.document.querySelector('[rel="sitemap"]')?.getAttribute('href');
      return new URL(sitemap!, payload.url).toString();
    });

    // We parse the sitemap; instead of using some XML parser, we just use regex to get the URLs and we return it in chunks of 25
    const {identifier, list} = await io.runTask("load-and-parse-sitemap", async () => {
      const urls = /(http|ftp|https):\/\/([\w_-]+(?:(?:\.[\w_-]+)+))([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])/g;
      const identifier = makeId(5);
      const data = await (await fetch(getSiteMap)).text();
      // @ts-ignore
      return {identifier, list: chunk(([...new Set(data.match(urls))] as string[]).filter(f => f.includes(payload.url)).map(p => ({identifier, url: p})), 25)};
    });

    // We go into each page and grab the content; we do this in batches of 25 and save it to the DB
    let i = 0;
    for (const item of list) {
      await processContent.batchInvokeAndWaitForCompletion(
        'process-list-' + i,
        item.map(
          payload => ({
            payload,
          }),
          86_400),
      );
      i++;
    }

    // We get the data that we saved in batches from the DB
    const data = await io.runTask("get-extracted-data", async () => {
      return (await prisma.docs.findMany({
        where: {
          identifier
        },
        select: {
          content: true
        }
      })).map((d) => d.content).join('\n\n');
    });

    // We upload the data to OpenAI with all the content
    const file = await io.openai.files.createAndWaitForProcessing("upload-file", {
      purpose: "assistants",
      file: data
    });

    // We create a new assistant or update the old one with the new file
    const assistant = await io.openai.runTask("create-or-update-assistant", async (openai) => {
      const currentAssistant = await prisma.assistant.findFirst({
        where: {
          url: payload.url
        }
      });
      if (currentAssistant) {
        return openai.beta.assistants.update(currentAssistant.aId, {
          file_ids: [file.id]
        });
      }
      return openai.beta.assistants.create({
        name: identifier,
        description: 'Documentation',
        instructions: 'You are a documentation assistant, you have been loaded with documentation from ' + payload.url + ', return everything in an MD format.',
        model: 'gpt-4-1106-preview',
        tools: [{ type: "code_interpreter" }, {type: 'retrieval'}],
        file_ids: [file.id],
      });
    });

    // We update our internal database with the assistant
    await io.runTask("save-assistant", async () => {
      await prisma.assistant.upsert({
        where: {
          url: payload.url
        },
        update: {
          aId: assistant.id,
        },
        create: {
          aId: assistant.id,
          url: payload.url,
        }
      });
    });
  },
});

export function getElementsBetween(startElement: Element, endElement: Element) {
  let currentElement = startElement;
  const elements = [];

  // Traverse the DOM until the endElement is reached
  while (currentElement && currentElement !== endElement) {
    currentElement = currentElement.nextElementSibling!;

    // If there's no next sibling, go up a level and continue
    if (!currentElement) {
      // @ts-ignore
      currentElement = startElement.parentNode!;
      startElement = currentElement;
      if (currentElement === endElement) break;
      continue;
    }

    // Add the current element to the list
    if (currentElement && currentElement !== endElement) {
      elements.push(currentElement);
    }
  }

  return elements;
}

// This job will grab the content from the website
const processContent = client.defineJob({
  // This is the unique identifier for your Job; it must be unique across all Jobs in your project.
  id: "process-content",
  name: "Process Content",
  version: "0.0.1",
  // This is triggered by an event using eventTrigger. You can also trigger Jobs with webhooks, on schedules, and more: https://trigger.dev/docs/documentation/concepts/triggers/introduction
  trigger: eventTrigger({
    name: "process.content.event",
    schema: object({
      url: string(),
      identifier: string(),
    })
  }),
  run: async (payload, io, ctx) => {
    return io.runTask('grab-content', async () => {
      try {
        // We first grab a raw HTML of the content from the website
        const data = await (await fetch(payload.url)).text();

        // We load it with JSDOM so we can manipulate it
        const dom = new JSDOM(data);

        // We remove all the scripts and styles from the page
        dom.window.document.querySelectorAll('script, style').forEach((el) => el.remove());

        // We grab all the titles from the page
        const content = Array.from(dom.window.document.querySelectorAll('h1, h2, h3, h4, h5, h6'));

        // We grab the last element so we can get the content between the last element and the next element
        const lastElement = content[content.length - 1]?.parentElement?.nextElementSibling!;
        const elements = [];

        // We loop through all the elements and grab the content between each title
        for (let i = 0; i < content.length; i++) {
          const element = content[i];
          const nextElement = content?.[i + 1] || lastElement;
          const elementsBetween = getElementsBetween(element, nextElement);
          elements.push({
            title: element.textContent, content: elementsBetween.map((el) => el.textContent).join('\n')
          });
        }

        // We create a raw text format of all the content
        const page = `
----------------------------------
url: ${payload.url}\n
${elements.map((el) => `${el.title}\n${el.content}`).join('\n')}

----------------------------------
`;

        // We save it to our database
        await prisma.docs.upsert({
          where: {
            url: payload.url
          }, update: {
            content: page, identifier: payload.identifier
          }, create: {
            url: payload.url, content: page, identifier: payload.identifier
          }
        });
      }
      catch (e) {
        console.log(e);
      }
    });
  },
});

We have finished creating the background job to scrape and index the files 🎉

Question the assistant

Now, let’s create the job to question our assistant.

Go to jobs and create a new file, question.assistant.ts. Add the following code:


import { eventTrigger } from "@trigger.dev/sdk";
import { client } from "@openai-assistant/trigger";
import { object, string } from "zod";
import { openai } from "@openai-assistant/helper/open.ai";

client.defineJob({
  // This is the unique identifier for your Job; it must be unique across all Jobs in your project.
  id: "question-assistant",
  name: "Question Assistant",
  version: "0.0.1",
  // This is triggered by an event using eventTrigger. You can also trigger Jobs with webhooks, on schedules, and more: https://trigger.dev/docs/documentation/concepts/triggers/introduction
  trigger: eventTrigger({
    name: "question.assistant.event",
    schema: object({
      content: string(),
      aId: string(),
      threadId: string().optional(),
    }),
  }),
  integrations: {
    openai,
  },
  run: async (payload, io, ctx) => {
    // Create or use an existing thread
    const thread = payload.threadId
      ? await io.openai.beta.threads.retrieve("get-thread", payload.threadId)
      : await io.openai.beta.threads.create("create-thread");

    // Create a message in the thread
    await io.openai.beta.threads.messages.create("create-message", thread.id, {
      content: payload.content,
      role: "user",
    });

    // Run the thread
    const run = await io.openai.beta.threads.runs.createAndWaitForCompletion(
      "run-thread",
      thread.id,
      {
        model: "gpt-4-1106-preview",
        assistant_id: payload.aId,
      }
    );

    // Check the status of the thread
    if (run.status !== "completed") {
      console.log("not completed");
      throw new Error(
        `Run finished with status ${run.status}: ${JSON.stringify(
          run.last_error
        )}`
      );
    }

    // Get the messages from the thread
    const messages = await io.openai.beta.threads.messages.list(
      "list-messages",
      run.thread_id,
      {
        query: {
          limit: "1",
        },
      }
    );

    const content = messages[0].content[0];
    if (content.type === "text") {
      return { content: content.text.value, threadId: thread.id };
    }
  },
});

  • The event takes three parameters
    • content - the message we want to send to our assistant.
    • aId - the internal ID of the assistant we previously created.
    • threadId - The thread id of the conversation. As you can see, this is an optional parameter because, on the first message, we will not have a thread ID yet.
  • Then, we retrieve the existing thread or create a new one.
  • We add a new message to the thread with the question we ask the assistant.
  • We run the thread and wait for it to finish.
  • We get the list of messages (and limit it to 1), because messages are listed newest first, so the first item is the latest message in the conversation.
  • We return the message content and the thread ID.
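
The last two steps rely on the message list being ordered newest first and on each message holding typed content parts. Here is a small sketch of that selection logic on plain data. The MessageContent and ThreadMessage types are simplified, hypothetical shapes modeled on the message format used in the job, and latestTextReply is an illustrative helper, not part of any SDK.

```typescript
// Simplified, hypothetical shapes modeled on the message format used in the job.
type MessageContent =
  | { type: "text"; text: { value: string } }
  | { type: "image_file"; image_file: { file_id: string } };

interface ThreadMessage {
  role: "user" | "assistant";
  content: MessageContent[];
}

// Return the first text part of the newest assistant message, if any.
// Assumes `messages` is ordered newest first.
function latestTextReply(messages: ThreadMessage[]): string | undefined {
  for (const message of messages) {
    if (message.role !== "assistant") continue;
    for (const part of message.content) {
      // Checking the discriminant narrows the union, so `part.text` is safe here.
      if (part.type === "text") return part.text.value;
    }
  }
  return undefined;
}

const reply = latestTextReply([
  { role: "assistant", content: [{ type: "text", text: { value: "Use `npx prisma db push`." } }] },
  { role: "user", content: [{ type: "text", text: { value: "How do I create the database?" } }] },
]);
console.log(reply);
```

Returning undefined when no text part exists makes the non-text case explicit, which the job above handles by returning nothing.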

Add routing

We need to create 3 API routes for our application:

  1. Send a new assistant for processing.
  2. Get a specific assistant by URL.
  3. Add a new message to an assistant.

Create a new folder inside of app/api called assistant, and inside, create a new file called route.ts. Add the following code inside:


import { client } from "@openai-assistant/trigger";
import { prisma } from "@openai-assistant/helper/prisma.client";

export async function POST(request: Request) {
  const body = await request.json();
  if (!body.url) {
    return new Response(JSON.stringify({ error: "URL is required" }), {
      status: 400,
    });
  }

  // We send an event to the trigger to process the documentation
  const { id: eventId } = await client.sendEvent({
    name: "process.documentation.event",
    payload: { url: body.url },
  });

  return new Response(JSON.stringify({ eventId }), { status: 200 });
}

export async function GET(request: Request) {
  const url = new URL(request.url).searchParams.get("url");
  if (!url) {
    return new Response(JSON.stringify({ error: "URL is required" }), {
      status: 400,
    });
  }

  const assistant = await prisma.assistant.findFirst({
    where: {
      url: url,
    },
  });

  return new Response(JSON.stringify(assistant), { status: 200 });
}

The first POST method gets a URL and triggers the process.documentation.event job with a URL sent from the client.

The second GET method gets the assistant from our database, from the URL sent from the client.
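
Both handlers only check that a URL was sent, not that it is usable. As an optional hardening step, a helper like the following could run before sending the event. It is a sketch under assumptions: parseDocsUrl is a hypothetical name, and restricting input to http and https is a choice made here, not something the routes above do.

```typescript
// Hypothetical helper: accept only absolute http(s) URLs and return them in
// normalized form, or null when the input is unusable.
function parseDocsUrl(input: unknown): string | null {
  if (typeof input !== "string" || input.trim() === "") return null;
  try {
    const parsed = new URL(input.trim());
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
      return null;
    }
    return parsed.toString();
  } catch {
    // The URL constructor throws for strings that are not absolute URLs.
    return null;
  }
}

console.log(parseDocsUrl("https://docs.example.com/guide"));
console.log(parseDocsUrl("not a url"));
```

In the POST handler, a null result would map naturally to the existing 400 response.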

Now, let’s create the route to add a message to our assistant. Inside of app/api create a new folder message and add a new file called route.ts, then add the following code:


import { prisma } from "@openai-assistant/helper/prisma.client";
import { client } from "@openai-assistant/trigger";

export async function POST(request: Request) {
  const body = await request.json();

  // Check that we have the assistant id and the message
  if (!body.id || !body.message) {
    return new Response(
      JSON.stringify({ error: "Id and Message are required" }),
      { status: 400 }
    );
  }

  // get the assistant id in OpenAI from the id in the database
  const assistant = await prisma.assistant.findUnique({
    where: {
      id: +body.id,
    },
  });

  // The event schema requires aId, so stop here if the assistant does not exist
  if (!assistant) {
    return new Response(JSON.stringify({ error: "Assistant not found" }), {
      status: 404,
    });
  }

  // We send an event to the trigger to question the assistant
  const { id: eventId } = await client.sendEvent({
    name: "question.assistant.event",
    payload: {
      content: body.message,
      aId: assistant.aId,
      threadId: body.threadId,
    },
  });

  return new Response(JSON.stringify({ eventId }), { status: 200 });
}

This code is very basic. We get the message, assistant ID, and thread ID from the client and send them to our previously created question.assistant.event.

The last thing to do is create a function to get all our assistants.

Inside helper, create a new file called get.list.ts and add the following code:


import { prisma } from "@openai-assistant/helper/prisma.client";

// Get the list of all the available assistants
export const getList = () => {
  return prisma.assistant.findMany({});
};

Very simple code to get all the assistants.

We have finished with the backend 🥳

Let’s move to the frontend.



Creating the Frontend

We are going to create a basic interface to add URLs and show the list of added URLs.

The main page

Replace the content of app/page.tsx with the following code:


import { getList } from "@openai-assistant/helper/get.list";
import Main from "@openai-assistant/components/main";

export default async function Home() {
  const list = await getList();
  return <Main list={list} />;
}

This is straightforward code that grabs the list from the database and passes it to our Main component.

Next, let’s create the Main component.

Inside app create a new folder components and add a new file called main.tsx. Add the following code:


"use client";

import {Assistant} from '@prisma/client';
import {useCallback, useState} from "react";
import {FieldValues, SubmitHandler, useForm} from "react-hook-form";
import {ChatgptComponent} from "@openai-assistant/components/chatgpt.component";
import {AssistantList} from "@openai-assistant/components/assistant.list";
import {TriggerProvider} from "@trigger.dev/react";

export interface ExtendedAssistant extends Assistant {
  pending?: boolean;
  eventId?: string;
}
export default function Main({list}: {list: ExtendedAssistant[]}) {
  const [assistantState, setAssistantState] = useState(list);
  const {register, handleSubmit} = useForm();

  const submit: SubmitHandler<FieldValues> = useCallback(async (data) => {
    const assistantResponse = await (await fetch('/api/assistant', {
      body: JSON.stringify({url: data.url}),
      method: 'POST',
      headers: {
        'Content-Type': 'application/json'
      }
    })).json();

    setAssistantState([...assistantState, {...assistantResponse, url: data.url, pending: true}]);
  }, [assistantState])

  const changeStatus = useCallback((val: ExtendedAssistant) => async () => {
    // Encode the URL so its own query string and special characters survive as a parameter
    const assistantResponse = await (await fetch(`/api/assistant?url=${encodeURIComponent(val.url)}`, {
      method: 'GET',
      headers: {
        'Content-Type': 'application/json'
      }
    })).json();
    setAssistantState([...assistantState.filter((v) => v.id), assistantResponse]);
  }, [assistantState])

  return (
    <TriggerProvider publicApiKey={process.env.NEXT_PUBLIC_TRIGGER_PUBLIC_API_KEY!}>
      <div className="w-full max-w-2xl mx-auto p-6 flex flex-col gap-4">
        <form className="flex items-center space-x-4" onSubmit={handleSubmit(submit)}>
          <input className="flex-grow p-3 border border-black/20 rounded-xl" placeholder="Add documentation link" type="text" {...register('url', {required: 'true'})} />
          <button className="flex-shrink p-3 border border-black/20 rounded-xl" type="submit">
            Add
          </button>
        </form>
        <div className="divide-y-2 divide-gray-300 flex gap-2 flex-wrap">
          {assistantState.map(val => (
            <AssistantList key={val.url} val={val} onFinish={changeStatus(val)} />
          ))}
        </div>
        {assistantState.filter(f => !f.pending).length > 0 && <ChatgptComponent list={assistantState} />}
      </div>
    </TriggerProvider>
  )
}

Let’s see what’s going on here:

  • We created a new interface called ExtendedAssistant with two optional fields, pending and eventId. When we create a new assistant, we don’t have the final value yet, so we store only the eventId and listen to the job until it finishes.
  • We get the list from the server component and set it to our new state (so we can modify it later)
  • We wrapped everything in a TriggerProvider so that we can listen for event completion and update the state with the new data.
  • We use react-hook-form to create a new form for adding new assistants.
  • We added a form with one input URL to submit new assistants for processing.
  • We iterate and show all the assistants that exist.
  • On form submissions, we send the information to the previously created route to add the new assistant.
  • Once the event is completed, we trigger changeStatus to load the assistant from the database.
  • At the end, we render the ChatGPT component, but only once at least one assistant has finished processing (that is, at least one entry matches !f.pending).
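
The pending-to-ready flow described in the list above can be sketched as a few pure functions, independent of React. This is only an illustration: AssistantEntry is a hypothetical, simplified stand-in for ExtendedAssistant, and matching entries by URL is one possible choice, not necessarily what the component does.

```typescript
// Hypothetical, simplified shape used only for this sketch.
interface AssistantEntry {
  id?: number;       // set once the assistant exists in the database
  url: string;
  pending?: boolean; // true while the background job is still running
  eventId?: string;  // id of the event we are waiting on
}

// Add a pending entry right after the form is submitted.
function addPending(list: AssistantEntry[], url: string, eventId: string): AssistantEntry[] {
  return [...list, { url, eventId, pending: true }];
}

// Swap the pending entry for the finished record loaded from the server.
// Assumption: entries are matched by URL.
function resolvePending(list: AssistantEntry[], finished: AssistantEntry): AssistantEntry[] {
  return [...list.filter((entry) => entry.url !== finished.url), finished];
}

// The chat box should appear only when at least one assistant is ready.
function hasReadyAssistant(list: AssistantEntry[]): boolean {
  return list.some((entry) => !entry.pending);
}
```

Keeping these transitions pure makes it easy to check edge cases, such as two assistants being indexed at the same time, without rendering anything.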

Let’s create our AssistantList component.

Inside components, create a new file called assistant.list.tsx and add the following content:


"use client";

import {FC, useEffect} from "react";
import {ExtendedAssistant} from "@openai-assistant/components/main";
import {useEventRunDetails} from "@trigger.dev/react";

// Shown while the background job is still processing the documentation.
export const Loading: FC<{eventId: string, onFinish: () => void}> = (props) => {
    const {eventId} = props;
    const { data, error } = useEventRunDetails(eventId);

    useEffect(() => {
        if (!data || error) {
            return;
        }

        // Tell the parent to reload the assistant once the run has succeeded.
        if (data.status === 'SUCCESS') {
            props.onFinish();
        }
    }, [data]);

    return <div className="pointer bg-yellow-300 border-yellow-500 p-1 px-3 text-yellow-950 border rounded-2xl">Loading</div>
};

export const AssistantList: FC<{val: ExtendedAssistant, onFinish: () => void}> = (props) => {
    const {val, onFinish} = props;
    // Pending assistants render the loader; finished ones render their URL.
    if (val.pending) {
        return <Loading eventId={val.eventId!} onFinish={onFinish} />
    }

    return (
        <div key={val.url} className="pointer relative bg-green-300 border-green-500 p-1 px-3 text-green-950 border rounded-2xl hover:bg-red-300 hover:border-red-500 hover:text-red-950 before:content-[attr(data-content)]" data-content={val.url} />
    )
}

We iterate over all the assistants. If an assistant has already been created, we just display its URL. If it is still pending, we render the <Loading /> component.

The Loading component shows “Loading” on the screen and keeps checking with the server until the event is finished.

We use the useEventRunDetails hook from Trigger.dev to know when the event is finished.

Once the event is finished, it triggers the onFinish function to update our client with the newly created assistant.
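
Note that the component only reacts to a successful run. A small helper like the one below could map a run to a UI state so that failed runs don’t show the loader forever. This is a sketch under assumptions: only the 'SUCCESS' status comes from this tutorial, the failure status names are placeholders to verify against the library, and RunSnapshot and toUiState are hypothetical names.

```typescript
type UiState = "loading" | "done" | "failed";

// Hypothetical minimal view of the run data returned by the hook.
interface RunSnapshot {
  status: string;
}

function toUiState(data: RunSnapshot | undefined, error: unknown): UiState {
  if (error) {
    return "failed";
  }
  if (!data) {
    return "loading"; // nothing has come back from the server yet
  }
  switch (data.status) {
    case "SUCCESS":
      return "done";
    // Assumption: placeholder failure names, check the real status values.
    case "FAILURE":
    case "TIMED_OUT":
    case "CANCELED":
      return "failed";
    default:
      return "loading"; // any other status is treated as still running
  }
}
```

Inside Loading, the result could decide whether to call onFinish, keep showing the loader, or show an error badge.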

Chat interface


Now, let’s add the ChatGPT component and question our assistant! In it, we will:

  • Select the assistant we would like to use
  • Show the list of messages
  • Add input for the message we want to send and the submit button.

Inside components, add a new file called chatgpt.component.tsx.

Let’s draw our ChatGPT chat box:


"use client";
import {FC, FormEvent, useCallback, useEffect, useRef, useState} from "react";
import {ExtendedAssistant} from "@openai-assistant/components/main";
import Markdown from 'react-markdown'
import {useEventRunDetails} from "@trigger.dev/react";

// A message is either text we already have or the event id of a reply still being generated.
interface Messages {
    message?: string
    eventId?: string
}

export const ChatgptComponent = ({list}: {list: ExtendedAssistant[]}) => {
    const url = useRef<HTMLSelectElement>(null);
    const [message, setMessage] = useState('');
    const [messagesList, setMessagesList] = useState([] as Messages[]);
    const [threadId, setThreadId] = useState<string>('');

    const submitForm = useCallback(async (e: FormEvent<HTMLFormElement>) => {
        e.preventDefault();
        // Render our own message right away and clear the input.
        setMessagesList((messages) => [...messages, {message: `**[ME]** ${message}`}]);
        setMessage('');

        const messageResponse = await (await fetch('/api/message', {
            method: 'POST',
            body: JSON.stringify({message, id: url.current?.value, threadId}),
        })).json();

        // Keep the thread id of the first message so the conversation continues in one thread.
        if (!threadId) {
            setThreadId(messageResponse.threadId);
        }

        // The reply is not ready yet, so store only its event id.
        setMessagesList((messages) => [...messages, {eventId: messageResponse.eventId}]);
    }, [message, messagesList, url, threadId]);

    return (
        <div className="border border-black/50 rounded-2xl flex flex-col">
            <div className="border-b border-b-black/50 h-[60px] gap-3 px-3 flex items-center">
                <div>Assistant:</div>
                <div>
                    <select ref={url} className="border border-black/20 rounded-xl p-2">
                        {list.filter(f => !f.pending).map(val => (
                            <option key={val.id} value={val.id}>{val.url}</option>
                        ))}
                    </select>
                </div>
            </div>
            <div className="flex-1 flex flex-col gap-3 py-3 w-full min-h-[500px] max-h-[1000px] overflow-y-auto overflow-x-hidden messages-list">
                {messagesList.map((val, index) => (
                    <div key={index} className={`flex border-b border-b-black/20 pb-3 px-3`}>
                        <div className="w-full">
                            {val.message ? <Markdown>{val.message}</Markdown> : <MessageComponent eventId={val.eventId!} onFinish={setThreadId} />}
                        </div>
                    </div>
                ))}
            </div>
            <form onSubmit={submitForm}>
                <div className="border-t border-t-black/50 h-[60px] gap-3 px-3 flex items-center">
                    <div className="flex-1">
                        <input value={message} onChange={(e) => setMessage(e.target.value)} className="read-only:opacity-20 outline-none border border-black/20 rounded-xl p-2 w-full" placeholder="Type your message here" />
                    </div>
                    <div>
                        <button className="border border-black/20 rounded-xl p-2 disabled:opacity-20" disabled={message.length < 3}>Send</button>
                    </div>
                </div>
            </form>
        </div>
    )
}

export const MessageComponent: FC<{eventId: string, onFinish: (threadId: string) => void}> = (props) => {
    const {eventId} = props;
    const { data, error } = useEventRunDetails(eventId);

    useEffect(() => {
        if (!data || error) {
            return;
        }

        // Pass the thread id back to the chat once the reply is ready.
        if (data.status === 'SUCCESS') {
            props.onFinish(data.output.threadId);
        }
    }, [data]);

    // Show a spinner until the run has finished successfully.
    if (!data || error || data.status !== 'SUCCESS') {
        return (
            <div className="flex justify-end items-center pb-3 px-3">
                <div className="animate-spin rounded-full h-3 w-3 border-t-2 border-b-2 border-blue-500" />
            </div>
        )
    }

    return <Markdown>{data.output.content}</Markdown>;
};

A few exciting things are going on over here:

  • When we create a new message, we immediately render it on the screen as “our” message. The reply is not available yet when the server responds, so we push only its event id to the list and let a separate component wait for the result. That’s why we use {val.message ? <Markdown>{val.message}</Markdown> : <MessageComponent eventId={val.eventId!} onFinish={setThreadId} />}
  • We wrap our messages with a Markdown component. If you remember, we told ChatGPT in the previous steps to output everything in Markdown format so we can render it correctly.
  • Once the event has finished processing, we update the thread id so that the next message is sent in the context of the same conversation.
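
The message-list and thread handling described above can be sketched as pure state updates. This is a minimal illustration, not the component itself: ChatState, addUserMessage and addPendingReply are hypothetical names, and the response shape ({eventId, threadId}) is assumed from how the component reads it.

```typescript
interface ChatMessage {
  message?: string; // text we already have (our own message)
  eventId?: string; // placeholder for a reply that is still being generated
}

interface ChatState {
  messages: ChatMessage[];
  threadId: string; // empty until the server creates a thread
}

// Render the user's message immediately.
function addUserMessage(state: ChatState, text: string): ChatState {
  return { ...state, messages: [...state.messages, { message: `**[ME]** ${text}` }] };
}

// After the POST returns: keep the first thread id and add a placeholder for the reply.
function addPendingReply(
  state: ChatState,
  response: { eventId: string; threadId: string }
): ChatState {
  return {
    threadId: state.threadId || response.threadId,
    messages: [...state.messages, { eventId: response.eventId }],
  };
}
```

Each exchange therefore adds two entries to the list: one with text, and one with only an event id that the message component later resolves into the assistant’s answer.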

And we are done 🎉



Let's connect! 🔌

As an open-source developer, you can join our community to contribute and engage with maintainers. Don't hesitate to visit our GitHub repository to contribute and create issues related to Trigger.dev.

The source for this tutorial is available here:

https://github.com/triggerdotdev/blog/tree/main/openai-assistant

Thank you for reading!
