October 30, 2024
In this customer story, Marc Seitz, co-founder of Papermark, runs through how they have used Trigger.dev to build a real-time PDF conversion service at scale, with thousands of conversions successfully processed per month.
Product background
Papermark is built with Next.js and hosted on Vercel. Some of our core functionality requires long-running processes, and Trigger arrived at the right time with the right solution: excellent DX and minimal complexity. Papermark is open-source, so one of the core reasons we chose Trigger is that it is open-source too.
The problem with rendering large PDFs
We display documents in the browser. However, large PDFs (30+ pages and/or 30+ MB) are difficult to render smoothly using HTML canvas. We needed a new solution: convert PDFs to images. Images have been on the web since Web 1.0 and therefore have excellent support, so we knew we could easily handle large files and PDFs with hundreds of pages. The only problem: this crucial task requires a stateful, long-running job, which is impossible with our current deployment on Vercel because API / Lambda functions have a maximum lifespan of seconds to minutes.
Handling the PDF conversion with Trigger
We found MuPDF, open-source-friendly software with a WebAssembly port that runs in Node.js environments. Combined with Trigger's tasks and runs, it solved our problem instantly. We are now easily processing around 6,000 documents per month, anywhere from one page to hundreds of pages.
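The page-by-page conversion described here can be sketched roughly as follows. This is a minimal illustration and not Papermark's actual code: `PdfRenderer`, `PageImage`, `convertPdfToImages` and `stubRenderer` are hypothetical names, and in production the renderer would presumably be backed by MuPDF's WebAssembly build running inside a Trigger task.

```typescript
// Hypothetical abstraction over a PDF engine (for example MuPDF's WebAssembly build).
interface PdfRenderer {
  countPages(): number;
  // Returns the encoded image bytes for a zero-based page index.
  renderPage(pageIndex: number): Uint8Array;
}

interface PageImage {
  pageNumber: number; // one-based, as a viewer would see it
  bytes: Uint8Array;
}

// Converts every page to an image and reports progress after each page,
// so a long-running job can surface its status to the application.
function convertPdfToImages(
  renderer: PdfRenderer,
  onProgress: (done: number, total: number) => void = () => {}
): PageImage[] {
  const total = renderer.countPages();
  const images: PageImage[] = [];
  for (let i = 0; i < total; i++) {
    images.push({ pageNumber: i + 1, bytes: renderer.renderPage(i) });
    onProgress(i + 1, total);
  }
  return images;
}

// Stub renderer standing in for a real PDF engine in this sketch.
const stubRenderer: PdfRenderer = {
  countPages: () => 3,
  renderPage: (pageIndex) => new Uint8Array([pageIndex]),
};
```

Because each page is handled independently, a document with hundreds of pages only makes the job longer, not more complex, which is why a long-running task runner fits this workload.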
The icing on the cake was state hooks (Realtime in v3), which provide real-time status feedback to our application as the files are processed, so we can show our users when a document is processed and ready to be shared.
We started using Trigger with the v2 engine and, since the addition of v3, we have been building new processes on the v3 engine. The best thing is that we are able to run both v2 and v3 Trigger tasks in one codebase, which makes the transition and adoption much simpler for us.
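As a rough illustration of how status updates might be turned into user-facing feedback, here is a small sketch. The `RunStatus` values, `DocumentViewState` and `toViewState` are assumptions made for this example; the real set of run statuses is defined by Trigger.dev and is not reproduced here.

```typescript
// Hypothetical status values for a document-processing run.
type RunStatus = "queued" | "processing" | "completed" | "failed";

interface DocumentViewState {
  label: string;
  canShare: boolean; // only true once processing has finished successfully
}

// Maps the latest status update to what the UI should show.
function toViewState(
  status: RunStatus,
  pagesDone = 0,
  pagesTotal = 0
): DocumentViewState {
  switch (status) {
    case "queued":
      return { label: "Waiting to start", canShare: false };
    case "processing":
      return {
        label: `Processing page ${pagesDone} of ${pagesTotal}`,
        canShare: false,
      };
    case "completed":
      return { label: "Ready to share", canShare: true };
    case "failed":
      return { label: "Processing failed", canShare: false };
  }
}
```

The point of the design is that the application never polls for completion: each status update it receives is mapped directly to a view state.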
See the code in action
Here are some of the code snippets from our repo that show how we use Trigger in production:
Send onboarding emails on a schedule
This code defines two Trigger tasks: one to send an initial trial info email and another to handle trial expiration by sending an email and updating the team's plan status when their data room trial ends.
import { logger, task } from "@trigger.dev/sdk/v3";

// prisma and the email helpers are imported from elsewhere in the Papermark codebase.

// Sends the initial data room trial info email.
export const sendDataroomTrialInfoEmailTask = task({
  id: "send-dataroom-trial-info-email",
  run: async (payload: { to: string }) => {
    await sendDataroomInfoEmail({ user: { email: payload.to, name: "Marc" } });
    logger.info("Email sent", { to: payload.to });
  },
});

// Sends the trial-expired email and removes the trial marker from the team's plan.
export const sendDataroomTrialExpiredEmailTask = task({
  id: "send-dataroom-trial-expired-email",
  retry: { maxAttempts: 3 },
  run: async (payload: { to: string; name: string; teamId: string }) => {
    const team = await prisma.team.findUnique({
      where: { id: payload.teamId },
      select: { plan: true },
    });

    if (!team) {
      logger.error("Team not found", { teamId: payload.teamId });
      return;
    }

    // Only teams still on the data room trial get the email and the plan update.
    if (team.plan.includes("drtrial")) {
      await sendDataroomTrialEndEmail({
        email: payload.to,
        name: payload.name,
      });
      logger.info("Email sent", { to: payload.to, teamId: payload.teamId });

      await prisma.team.update({
        where: { id: payload.teamId },
        data: { plan: team.plan.replace("+drtrial", "") },
      });
      logger.info("Trial removed", { teamId: payload.teamId });
      return;
    }

    logger.info("Team upgraded - no further action", {
      teamId: payload.teamId,
      plan: team.plan,
    });
    return;
  },
});
Check out the full code for sending scheduled emails.
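The plan update inside the trial-expiration task comes down to a small string transformation, which can be isolated as a pure function for illustration. `planAfterTrialExpiry` is a hypothetical helper, and the plan values used in the usage note ("free+drtrial", "pro") are assumed examples rather than a confirmed list of Papermark plans.

```typescript
const TRIAL_SUFFIX = "+drtrial";

// Returns the plan with the data room trial marker removed, or null when the
// team is not on a trial and nothing needs to change.
function planAfterTrialExpiry(plan: string): string | null {
  if (!plan.includes("drtrial")) {
    return null;
  }
  return plan.replace(TRIAL_SUFFIX, "");
}
```

Under these assumptions, a team on "free+drtrial" would end up on "free", while a team that has already upgraded to "pro" would be left untouched.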
Convert office files to PDF
This task handles the conversion of various office document formats (like Word, Excel, PowerPoint) to PDF format.
import { logger, retry, task } from "@trigger.dev/sdk/v3";

// prisma, getFile, putFileServer, client and ConvertPayload come from elsewhere in the Papermark codebase.

export const convertFilesToPdfTask = task({
  id: "convert-files-to-pdf",
  run: async (payload: ConvertPayload) => {
    const document = await prisma.document.findUnique({
      where: { id: payload.documentId },
      select: {
        name: true,
        versions: {
          where: { id: payload.documentVersionId },
          select: {
            originalFile: true,
            contentType: true,
            storageType: true,
          },
        },
      },
    });

    // Stop early if the document or the requested version does not exist.
    const version = document?.versions[0];
    if (!document || !version) {
      logger.error("Document version not found", {
        documentVersionId: payload.documentVersionId,
      });
      return;
    }

    // Download the original office file.
    const fileUrl = await getFile({
      data: version.originalFile,
      type: version.storageType,
    });
    const fileBuffer = await (await fetch(fileUrl)).arrayBuffer();

    // Build the multipart request for the conversion service.
    const formData = new FormData();
    formData.append(
      "files",
      new Blob([Buffer.from(fileBuffer)], { type: version.contentType }),
      document.name
    );
    formData.append("quality", "50");

    // Call the LibreOffice conversion endpoint, retrying on 5xx responses.
    const conversionResponse = await retry.fetch(
      `${process.env.NEXT_PRIVATE_CONVERSION_BASE_URL}/forms/libreoffice/convert`,
      {
        method: "POST",
        body: formData,
        headers: {
          Authorization: `Basic ${process.env.NEXT_PRIVATE_INTERNAL_AUTH_TOKEN}`,
        },
        retry: {
          byStatus: {
            "500-599": {
              strategy: "backoff",
              maxAttempts: 3,
              factor: 2,
              minTimeoutInMs: 1_000,
              maxTimeoutInMs: 30_000,
            },
          },
        },
      }
    );

    // Store the converted PDF next to the original file.
    const docId = version.originalFile.match(/(doc_[^\/]+)\//)?.[1];
    const { type: storageType, data } = await putFileServer({
      file: {
        name: `${document.name}.pdf`,
        type: "application/pdf",
        buffer: Buffer.from(await conversionResponse.arrayBuffer()),
      },
      teamId: payload.teamId,
      docId,
    });

    await prisma.documentVersion.update({
      where: { id: payload.documentVersionId },
      data: { file: data, type: "pdf", storageType },
    });

    // Hand the new PDF over to the existing document pipeline.
    await client.sendEvent({
      id: payload.documentVersionId,
      name: "document.uploaded",
      payload,
    });
  },
});
Check out the full code for converting Office files to PDF.
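To make the retry settings in the task above more concrete, here is a sketch of how an exponential backoff schedule with those parameters could be computed. It uses a common formula (minimum delay multiplied by the factor for each further attempt, capped at the maximum) and ignores jitter; it is an assumption for illustration, not necessarily the exact computation the Trigger SDK performs. `BackoffOptions`, `nextRetryDelay` and `officeRetry` are hypothetical names.

```typescript
interface BackoffOptions {
  maxAttempts: number;
  factor: number;
  minTimeoutInMs: number;
  maxTimeoutInMs: number;
}

// Delay in milliseconds before the next try, given how many attempts have
// already failed (1-based). Returns undefined when no attempts remain.
function nextRetryDelay(
  failedAttempts: number,
  opts: BackoffOptions
): number | undefined {
  if (failedAttempts >= opts.maxAttempts) {
    return undefined;
  }
  const delay = opts.minTimeoutInMs * Math.pow(opts.factor, failedAttempts - 1);
  return Math.min(opts.maxTimeoutInMs, delay);
}

// The values used for 5xx responses in the conversion task.
const officeRetry: BackoffOptions = {
  maxAttempts: 3,
  factor: 2,
  minTimeoutInMs: 1_000,
  maxTimeoutInMs: 30_000,
};
```

Under this formula, a conversion request that keeps returning a 5xx status would be retried after about 1 second and then about 2 seconds before giving up; with only three attempts, the 30-second cap never comes into play.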
Convert CAD files to PDF
This task handles converting CAD files to PDF by fetching the original file, validating team and document data, and preparing a conversion payload with specific CAD processing parameters.
import { logger, retry, task } from "@trigger.dev/sdk/v3";

// prisma, getFile, getTasksPayload, putFileServer, client and ConvertPayload
// come from elsewhere in the Papermark codebase.

export const convertCadToPdfTask = task({
  id: "convert-cad-to-pdf",
  run: async (payload: ConvertPayload) => {
    // Validate that the team exists.
    const team = await prisma.team.findUnique({
      where: { id: payload.teamId },
    });
    if (!team) {
      logger.error("Team not found", { teamId: payload.teamId });
      return;
    }

    const document = await prisma.document.findUnique({
      where: { id: payload.documentId },
      select: {
        name: true,
        versions: {
          where: { id: payload.documentVersionId },
          select: {
            file: true,
            originalFile: true,
            contentType: true,
            storageType: true,
          },
        },
      },
    });

    // Validate that the document and the requested version exist.
    const version = document?.versions[0];
    if (!document || !version) {
      logger.error("Document version not found", {
        documentVersionId: payload.documentVersionId,
      });
      return;
    }

    const fileUrl = await getFile({
      data: version.originalFile,
      type: version.storageType,
    });

    // Build the conversion request with the CAD-specific parameters.
    const tasksPayload = getTasksPayload(
      fileUrl,
      document.name,
      version.contentType
    );

    // Call the conversion API, retrying on 5xx responses.
    const conversionResponse = await retry.fetch(
      `${process.env.NEXT_PRIVATE_CONVERT_API_URL}`,
      {
        method: "POST",
        body: JSON.stringify(tasksPayload),
        headers: {
          Authorization: `Bearer ${process.env.NEXT_PRIVATE_CONVERT_API_KEY}`,
          "Content-Type": "application/json",
        },
        retry: {
          byStatus: {
            "500-599": {
              strategy: "backoff",
              maxAttempts: 3,
              factor: 2,
              minTimeoutInMs: 1_000,
              maxTimeoutInMs: 30_000,
              randomize: false,
            },
          },
        },
      }
    );

    const conversionBuffer = Buffer.from(
      await conversionResponse.arrayBuffer()
    );

    // Store the converted PDF next to the original file.
    const docId = version.originalFile.match(/(doc_[^\/]+)\//)?.[1];
    const { type: storageType, data } = await putFileServer({
      file: {
        name: `${document.name}.pdf`,
        type: "application/pdf",
        buffer: conversionBuffer,
      },
      teamId: payload.teamId,
      docId,
    });

    await prisma.documentVersion.update({
      where: { id: payload.documentVersionId },
      data: { file: data, type: "pdf", storageType },
    });

    // Hand the new PDF over to the existing document pipeline.
    await client.sendEvent({
      id: payload.documentVersionId,
      name: "document.uploaded",
      payload: {
        documentVersionId: payload.documentVersionId,
        teamId: payload.teamId,
        documentId: payload.documentId,
      },
    });
    return;
  },
});
Check out the full code for converting CAD files to PDF.
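Both conversion tasks derive a document ID from the stored file path with the same regular expression. The sketch below isolates that step; `extractDocId` is a hypothetical helper, and the sample paths in the usage note are made-up examples, since the actual storage key layout is not shown in the snippets.

```typescript
// Extracts the "doc_..." path segment from a storage path, using the same
// pattern as the conversion tasks. Returns undefined when no segment matches.
function extractDocId(storagePath: string): string | undefined {
  return storagePath.match(/(doc_[^\/]+)\//)?.[1];
}
```

For an assumed path such as "team_123/doc_abc456/drawing.dwg" this yields "doc_abc456", while a path without a "doc_" segment followed by a slash yields undefined, which is why the tasks pass `docId` through as an optional value.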