Add text to Existing PDF using Python
Adding Text to Existing PDFs with Python

Learn how to programmatically add text, watermarks, or annotations to existing PDF documents using Python libraries like PyPDF2 and ReportLab.
PDF documents are a ubiquitous format for sharing information, but there is often a need to add content to them dynamically, whether it's a timestamp, a watermark, a signature, or custom annotations. Manually editing each PDF is tedious and error-prone. Fortunately, Python offers libraries that enable programmatic manipulation of PDFs, including adding text.
This article will guide you through the process of adding text to existing PDF files using Python. We'll explore two approaches: drawing the text with ReportLab and merging it onto the original pages with PyPDF2, and using fpdf2 as an alternative for simple text. Understanding these methods will let you automate a range of PDF customization tasks.
Understanding PDF Structure and Text Addition
Before diving into code, it's helpful to understand how text is added to a PDF. PDFs are essentially a collection of objects that describe the document's content and appearance. When you add text, you're typically creating a new 'layer' or 'content stream' that gets rendered on top of the existing PDF content. This means you're not directly modifying the original text but rather overlaying new text.
Libraries like PyPDF2 are excellent for merging and manipulating existing PDF pages, but they have limited capabilities for drawing new content directly. For precise text placement, font control, and more complex graphical elements, a library like ReportLab is often used to generate a new PDF containing only the desired text, which is then merged with the original PDF.
Figure: Workflow for adding text to an existing PDF using Python. A text overlay PDF is generated (e.g., with ReportLab) and then merged with the original PDF (e.g., with PyPDF2) to produce the final PDF with added text.
Method 1: Simple Text Overlay with PyPDF2 (and ReportLab for Text Generation)
The most common and flexible approach involves using ReportLab to create a new, transparent PDF page containing only the text you want to add, and then using PyPDF2 to merge this new page as an overlay onto your existing PDF. This method gives you full control over text appearance and position.
First, ensure you have the necessary libraries installed:
pip install PyPDF2 ReportLab
Install required Python libraries
Here's how you can generate a text-only PDF overlay using ReportLab and then merge it with an existing PDF using PyPDF2:
import io

from PyPDF2 import PdfReader, PdfWriter
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter

def add_text_to_pdf(input_pdf_path, output_pdf_path, text_to_add, x_pos, y_pos, font_size=12, font_name='Helvetica'):
    # 1. Create a new in-memory PDF with ReportLab for the text overlay
    packet = io.BytesIO()
    can = canvas.Canvas(packet, pagesize=letter)
    can.setFont(font_name, font_size)
    can.drawString(x_pos, y_pos, text_to_add)
    can.save()

    # Move to the beginning of the BytesIO buffer before reading it back
    packet.seek(0)
    new_pdf = PdfReader(packet)

    # 2. Read the existing PDF with PyPDF2
    existing_pdf = PdfReader(input_pdf_path)
    output = PdfWriter()

    # 3. Overlay the new text PDF onto each page of the existing PDF
    for page in existing_pdf.pages:
        page.merge_page(new_pdf.pages[0])  # Merge the first (and only) page of our text PDF
        output.add_page(page)

    # 4. Write the merged PDF to a new file
    with open(output_pdf_path, "wb") as output_stream:
        output.write(output_stream)

# Example Usage:
# Create a dummy PDF for testing if you don't have one
# c = canvas.Canvas("original.pdf", pagesize=letter)
# c.drawString(100, 750, "This is the original content.")
# c.save()
input_file = "original.pdf"  # Make sure this file exists
output_file = "output_with_text.pdf"
text = "CONFIDENTIAL - DO NOT DISTRIBUTE"
add_text_to_pdf(input_file, output_file, text, 50, 50, font_size=24, font_name='Times-Bold')
print(f"Text '{text}' added to '{input_file}' and saved as '{output_file}'")
Python code to add text to an existing PDF using ReportLab and PyPDF2
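ReportLab measures positions in points from the bottom-left corner, which can be awkward when a layout is specified as "one inch from the top". The following is a minimal sketch of a conversion helper; the function name top_left_inches_to_points and the constants are illustrative assumptions, not part of ReportLab or PyPDF2.

```python
POINTS_PER_INCH = 72
LETTER_HEIGHT = 792  # height of a US letter page, in points


def top_left_inches_to_points(x_in, y_in, page_height=LETTER_HEIGHT):
    """Convert an offset in inches from the top-left corner into
    points measured from the bottom-left corner (hypothetical helper)."""
    x_pt = x_in * POINTS_PER_INCH
    # Flip the vertical axis: distance from the top becomes distance from the bottom
    y_pt = page_height - y_in * POINTS_PER_INCH
    return x_pt, y_pt


# One inch in from the left edge and one inch down from the top of a letter page
x_pos, y_pos = top_left_inches_to_points(1, 1)
```

The resulting x_pos and y_pos can be passed directly to a function such as add_text_to_pdf. Note that drawString places the text baseline at y_pos, so the glyphs extend upward from that point.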
The x_pos and y_pos coordinates in ReportLab start from the bottom-left corner of the page. The letter page size is 612x792 points (1 point = 1/72 inch). Experiment with these values to get the desired placement.
Method 2: Using fpdf2 for Text Addition (Alternative)
While PyPDF2 is excellent for merging, it's not designed for drawing. ReportLab is powerful but can have a steeper learning curve for simple tasks. An alternative library, fpdf2 (a port of FPDF), offers a simpler drawing API, which can be easier for certain text addition scenarios.
First, install fpdf2:
pip install fpdf2
Install the fpdf2 library
Here's an example of how to use fpdf2 to add text to an existing PDF. Note that fpdf2 cannot read existing PDF files itself; it only generates new ones. As in Method 1, the text is drawn on a blank overlay page, this time with fpdf2, and PyPDF2 (installed in Method 1) merges that overlay onto each page of the original.
import io

from fpdf import FPDF
from PyPDF2 import PdfReader, PdfWriter

def add_text_with_fpdf2(input_pdf_path, output_pdf_path, text_to_add, x_pos, y_pos, font_size=12, font_family='Helvetica', style=''):
    reader = PdfReader(input_pdf_path)
    writer = PdfWriter()

    for page in reader.pages:
        # Build a one-page overlay with the same size as this page, in points
        width = float(page.mediabox.width)
        height = float(page.mediabox.height)
        pdf = FPDF(unit='pt', format=(width, height))
        pdf.set_auto_page_break(auto=False, margin=0)
        pdf.add_page()

        # Set font and add text
        pdf.set_font(font_family, style, font_size)
        pdf.set_xy(x_pos, y_pos)  # Position measured from the top-left corner
        pdf.write(font_size, text_to_add)  # write(line_height, text)

        # Merge the overlay onto the original page
        overlay = PdfReader(io.BytesIO(pdf.output()))
        page.merge_page(overlay.pages[0])
        writer.add_page(page)

    with open(output_pdf_path, "wb") as output_stream:
        writer.write(output_stream)

# Example Usage:
input_file = "original.pdf"  # Make sure this file exists
output_file = "output_with_text_fpdf2.pdf"
text = "DRAFT - FOR REVIEW"
add_text_with_fpdf2(input_file, output_file, text, 30, 30, font_size=18, font_family='Helvetica', style='B')
print(f"Text '{text}' added to '{input_file}' using fpdf2 and saved as '{output_file}'")
Python code to add text to an existing PDF using fpdf2
In fpdf2, the set_xy method positions the text from the top-left corner by default, and the write method's first argument is the line height. Be mindful of coordinate systems when switching between libraries.
Considerations and Best Practices
When adding text to PDFs, keep the following in mind:
- Coordinate Systems: Different libraries might use different coordinate systems (e.g., origin at bottom-left vs. top-left). Always test and adjust positions.
- Font Embedding: For consistent rendering across different viewers, fonts used for your added text should be embedded in the PDF. The standard fonts (Helvetica, Times, Courier) are built into PDF viewers and are referenced rather than embedded, so they work in ReportLab and fpdf2 without extra steps. Custom TrueType fonts must be registered first (pdfmetrics.registerFont in ReportLab, add_font in fpdf2).
- Transparency: If you're adding watermarks, consider setting the text color with an alpha channel (transparency) to make it less intrusive.
- Performance: For very large PDFs or a high volume of operations, generate the overlay once and reuse it across pages of the same size rather than rebuilding it for every page.
- Error Handling: Always include error handling (e.g., try-except blocks) for file operations to gracefully manage cases where input files are missing or corrupted.
- Original PDF Integrity: The methods described here create a new PDF with the added text, leaving your original PDF untouched. This is generally a good practice for data integrity.
By leveraging Python's rich ecosystem of PDF libraries, you can efficiently automate the process of adding text to your PDF documents, saving time and ensuring consistency across your files.