Python PDF form extractor example
Learn how to use Trigger.dev with Python to extract form data from PDF files.
Overview
This demo showcases how to use Trigger.dev with Python to extract structured form data from a PDF file available at a URL.
Prerequisites
- A project with Trigger.dev initialized
- Python installed on your local machine
Features
- A Trigger.dev task to trigger the Python script
- Trigger.dev Python build extension to install the dependencies and run the Python script
- PyMuPDF to extract form data from PDF files
- Requests to download PDF files from URLs
GitHub repo
View the project on GitHub
Click here to view the full code for this project in our examples repository on GitHub. You can fork it and use it as a starting point for your own project.
The code
Build configuration
After you've initialized your project with Trigger.dev, add the Python build extension to the build settings in your trigger.config.ts file.
Learn more about executing scripts in your Trigger.dev project using our Python build extension here.
Task code
This task uses the python.runScript method to run the Python script with the given PDF URL as an argument.
Add a requirements.txt file
List the Python dependencies for this project, PyMuPDF and Requests, in your requirements.txt file. This file is required in Python projects to install the dependencies.
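As a rough sketch based only on the two libraries named under Features, the file could be as short as the lines below. Leaving out version pins is an assumption made here for brevity; pin versions if you need reproducible builds.

```text
PyMuPDF
requests
```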
The Python script
The Python script uses PyMuPDF to extract form data from a PDF file. You can see the original script in our examples repository here.
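The following is a minimal sketch of what such a script might look like, not the original from the examples repository. The function names `collect_form_fields` and `extract_form_data` are hypothetical, and the contract shown (PDF URL as the first command-line argument, JSON printed to stdout) is an assumption about how the task and script communicate.

```python
import json
import sys
from typing import Any, Dict, Iterable


def collect_form_fields(pages: Iterable[Any]) -> Dict[str, Any]:
    """Return a {field name: field value} mapping for every form widget found."""
    fields: Dict[str, Any] = {}
    for page in pages:
        # In PyMuPDF, page.widgets() yields the form fields (widgets) on a page.
        for widget in page.widgets():
            # Skip widgets without a name, since they cannot be keyed.
            if widget.field_name:
                fields[widget.field_name] = widget.field_value
    return fields


def extract_form_data(pdf_url: str) -> Dict[str, Any]:
    """Download the PDF at pdf_url and extract its form fields."""
    # Imported here so collect_form_fields can be used without these packages.
    import fitz  # PyMuPDF
    import requests

    response = requests.get(pdf_url, timeout=30)
    response.raise_for_status()
    # Open the PDF straight from memory instead of writing a temporary file.
    with fitz.open(stream=response.content, filetype="pdf") as doc:
        return collect_form_fields(doc)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed contract: the PDF URL is the first argument, JSON goes to stdout.
    print(json.dumps(extract_form_data(sys.argv[1]), indent=2))
```

Keeping the field collection separate from the download step means the extraction logic can be checked against any iterable of page-like objects, without network access.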
Testing your task
- Create a virtual environment
python -m venv venv
- Activate the virtual environment, depending on your OS.
On Mac/Linux:
source venv/bin/activate
On Windows:
venv\Scripts\activate
- Install the Python dependencies
pip install -r requirements.txt
- Copy the project ref from your Trigger.dev dashboard and add it to the trigger.config.ts file.
- Run the Trigger.dev CLI dev command (it may ask you to authorize the CLI if you haven't already).
- Test the task in the dashboard by providing a valid PDF URL.
- Deploy the task to production using the Trigger.dev CLI deploy command.
Learn more about using Python with Trigger.dev
Python build extension
Learn how to use our built-in Python build extension to install dependencies and run your Python code.