
How to Remove Metadata from PDF Files

July 1, 2026 — Written by Brendan, Founder of FileShot.io


Almost every PDF file carries hidden metadata that can reveal who created it, when it was made, what software was used, and sometimes much more. Before you share a PDF with clients, post it publicly, or submit it for legal review, you should know what metadata is inside and how to remove it.

What Metadata Is Hidden in PDF Files?

PDF metadata is stored in two places: the Document Information Dictionary (the older format) and XMP metadata (the modern XML-based format). Most PDFs contain both. Here is what they typically include:

Author — The name of the person or user account that created the document. Often set automatically from your OS username or application profile
Title — Sometimes auto-populated with filenames, project names, or client names
Subject and Keywords — Fields that may contain internal project identifiers or classification labels
Creator Application — The software that produced the PDF (e.g., "Microsoft Word 2021", "Adobe InDesign 18.5", "LaTeX with pdfTeX")
PDF Producer — The PDF library or converter used (e.g., "macOS Quartz PDFContext", "iText 7.2.5")
Creation Date — The exact date and time the document was first created, including timezone offset
Modification Date — The last time the document was edited
Custom Metadata — XMP can store arbitrary key-value pairs. Some enterprise systems embed document IDs, workflow status, classification levels, or user email addresses
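
To make the date fields concrete: PDF dates are stored as strings of the form D:YYYYMMDDHHmmSS followed by a timezone offset. The sketch below is a minimal parser for that format, using only the Python standard library. The function name parse_pdf_date and the sample value are illustrative, not taken from any particular tool.

```python
import re
from datetime import datetime, timedelta, timezone

# PDF date strings look like D:YYYYMMDDHHmmSSOHH'mm'.
# Everything after the year is optional; O is +, - or Z.
_PDF_DATE = re.compile(
    r"^D:(?P<year>\d{4})(?P<month>\d{2})?(?P<day>\d{2})?"
    r"(?P<hour>\d{2})?(?P<minute>\d{2})?(?P<second>\d{2})?"
    r"(?P<sign>[+\-Z])?(?:(?P<tzh>\d{2})'?(?P<tzm>\d{2})?'?)?$"
)


def parse_pdf_date(value: str) -> datetime:
    """Convert a PDF date string into a datetime (timezone-aware when an offset is given)."""
    match = _PDF_DATE.match(value.strip())
    if match is None:
        raise ValueError(f"not a PDF date string: {value!r}")
    parts = match.groupdict()
    sign = parts["sign"]
    if sign is None:
        tz = None  # no offset recorded in the file
    elif sign == "Z":
        tz = timezone.utc
    else:
        offset = timedelta(hours=int(parts["tzh"] or 0), minutes=int(parts["tzm"] or 0))
        tz = timezone(offset if sign == "+" else -offset)
    return datetime(
        int(parts["year"]), int(parts["month"] or 1), int(parts["day"] or 1),
        int(parts["hour"] or 0), int(parts["minute"] or 0), int(parts["second"] or 0),
        tzinfo=tz,
    )


# Hypothetical CreationDate value, as it might appear in a file
created = parse_pdf_date("D:20260701093015-05'00'")
print(created.isoformat())  # 2026-07-01T09:30:15-05:00
```

The offset matters for privacy: a -05:00 in the creation date narrows down where the author's computer was configured to be, even when the author field is empty.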

Why PDF Metadata Is a Privacy Risk

PDF metadata causes problems in specific real-world situations:

Legal documents: A contract PDF reveals the author as "jsmith@lawfirm.com" and the creation date as 3 weeks before the claimed drafting date. Opposing counsel uses this to challenge the timeline
Freelance work: A freelancer delivers a report to a client. The PDF author field shows the freelancer's personal name instead of the agency brand. The client discovers the outsourcing arrangement
Whistleblower documents: A leaked internal PDF is traced back to a specific employee because the author field contains their Windows login name
GDPR compliance: A PDF shared with EU customers contains a staff member's full name in the author field. Under GDPR, that is personal data being shared without necessity or consent
Competitive intelligence: The Creator Application field reveals you are using a specific ERP or document management system, giving competitors insight into your tech stack

Method 1: FileShot Metadata Scrubber (Free, No Upload)

The fastest way to strip all metadata from a PDF without uploading it anywhere:

Go to the FileShot Metadata Scrubber
Drop your PDF file onto the page
The tool reads the file entirely in your browser — nothing is uploaded to any server
Review the metadata that was found (author, dates, creator, producer, custom fields)
Choose the option to strip all metadata, then download the cleaned PDF

Because the processing happens client-side in your browser using JavaScript, your PDF never leaves your device. This matters for confidential documents, legal files, and anything covered by NDA or compliance requirements.

Method 2: Adobe Acrobat Pro

If you have Adobe Acrobat Pro (paid):

Open the PDF in Acrobat Pro
Go to File > Properties (or Ctrl+D)
In the Description tab, manually clear the Author, Title, Subject, and Keywords fields
For thorough removal: go to Tools > Redact > Sanitize Document
The Sanitize feature removes metadata, hidden layers, embedded objects, file attachments, bookmarks, comments, and form field data
Save the sanitized PDF

Acrobat's Sanitize Document feature is the most thorough built-in option, but it requires a paid Acrobat Pro subscription, and Acrobat itself may send usage data to Adobe depending on your settings.

Method 3: ExifTool (Command Line)

ExifTool by Phil Harvey supports PDF metadata. It handles both the Document Information Dictionary and XMP:

# View all PDF metadata
exiftool document.pdf

# Remove all metadata
exiftool -all= document.pdf

# Remove specific fields only
exiftool -Author= -Creator= -Producer= document.pdf

# Remove all metadata and overwrite original
exiftool -all= -overwrite_original document.pdf

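The -all= flag empties all writable tags. ExifTool does not touch the PDF version number or file structure data (those are structural, not metadata). One important caveat: ExifTool writes PDF changes as an incremental update, so the original values remain in the file and can be recovered. ExifTool's own documentation warns about this. To delete the old values permanently, rewrite the file afterwards, for example with qpdf --linearize (see Method 5).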
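
Because ExifTool appends its PDF changes instead of rewriting the file, a quick scan of the raw bytes can show whether old values are still present after cleaning. The sketch below is a rough check, not a PDF parser: it only finds plain literal strings such as /Author (name), and it misses hex or UTF-16 strings and anything stored inside compressed object streams. The function name and the sample bytes are hypothetical.

```python
import re

# Literal-string entries such as "/Author (Jane Doe)".
# Escaped or nested parentheses are not handled.
_INFO_ENTRY = re.compile(
    rb"/(Author|Creator|Producer|Title|Subject|Keywords)\s*\(([^()]*)\)"
)


def find_info_strings(pdf_bytes: bytes) -> list[tuple[str, str]]:
    """Return every (key, value) literal found anywhere in the file, in file order."""
    return [
        (key.decode("ascii"), value.decode("latin-1"))
        for key, value in _INFO_ENTRY.findall(pdf_bytes)
    ]


# Hypothetical fragment: an original Info dictionary, followed by an
# appended revision in which the author has been dropped.
sample = (
    b"1 0 obj\n<< /Author (jsmith) /Producer (ExampleWriter 1.0) >>\nendobj\n%%EOF\n"
    b"1 0 obj\n<< /Producer (ExampleWriter 1.0) >>\nendobj\n%%EOF\n"
)

for key, value in find_info_strings(sample):
    print(f"{key}: {value}")
```

In the sample, the author name still shows up even though the latest revision no longer references it. If a scan like this finds identity strings in a file you believed was clean, rewrite the file with QPDF and scan again.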

Method 4: LibreOffice (Free, Cross-Platform)

If you created the PDF from a document in LibreOffice:

Before exporting to PDF, go to File > Properties
Clear all fields in the General and Custom Properties tabs
Uncheck "Apply user data" if present
Export to PDF via File > Export as PDF

This approach prevents metadata from being written in the first place, which is more reliable than removing it after the fact.

Method 5: QPDF (Free, Open Source)

QPDF is a command-line tool specifically designed for PDF structural transformations:

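# Rewrite the file in linearized form, dropping incremental updates and unreferenced objects
qpdf --linearize --replace-input input.pdf

# Remove the XMP metadata stream and the Info dictionary
# (needs a recent qpdf 11.x release that supports these options)
qpdf --remove-metadata --remove-info input.pdf output.pdf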

QPDF is useful for batch processing large numbers of PDFs in automated workflows. For complete metadata removal, run ExifTool first and QPDF second, so that the QPDF rewrite discards the old values ExifTool leaves behind in the file.

What About Embedded Images in PDFs?

PDFs often contain embedded images — logos, charts, photographs, scanned pages. Each embedded image can carry its own EXIF data, including GPS coordinates, camera model, and timestamps. Standard PDF metadata removal tools strip the document-level metadata but may not touch image-level EXIF data inside the PDF.

For thorough cleaning:

Strip EXIF from images before embedding them in the PDF
Use Acrobat Pro's Sanitize Document feature, which addresses embedded content
Use FileShot's Metadata Scrubber, which processes the PDF and its embedded image data
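
To check whether a PDF still carries image-level EXIF, one rough approach is to look for the JPEG segment that holds it. JPEG images are commonly stored in a PDF unmodified, so their EXIF block (an APP1 segment beginning with the bytes "Exif") can often be found by scanning the raw file. The sketch below assumes exactly that. It will miss images that were re-encoded, compressed again, or encrypted, and the function name is illustrative.

```python
import re

# A JPEG APP1 segment that carries EXIF:
# marker FF E1, two length bytes, then the identifier "Exif\0\0".
_EXIF_SEGMENT = re.compile(rb"\xff\xe1..Exif\x00\x00", re.DOTALL)


def count_exif_segments(pdf_bytes: bytes) -> int:
    """Count EXIF-bearing JPEG segments visible in the raw bytes of a PDF."""
    return len(_EXIF_SEGMENT.findall(pdf_bytes))


# Synthetic examples: the start of a JPEG with EXIF, and one with only a JFIF header.
jpeg_with_exif = b"\xff\xd8\xff\xe1\x00\x10Exif\x00\x00II*\x00"
jpeg_without_exif = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01"

print(count_exif_segments(b"stream\n" + jpeg_with_exif + b"\nendstream"))     # 1
print(count_exif_segments(b"stream\n" + jpeg_without_exif + b"\nendstream"))  # 0
```

A count of zero is not proof that the images are clean. A non-zero count is a strong hint that at least one embedded photo still has its camera data attached.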

PDF Metadata vs. PDF Content: What Cannot Be Removed

Metadata removal does not alter the visible content of a PDF. It also has limits and side effects worth knowing:

Text within the document — If a name or date appears in the visible text, metadata removal will not redact it. Use redaction tools for that
Digital signatures — Removing metadata from a signed PDF invalidates the signature
PDF form field data — Filled form fields are content, not metadata
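Incremental save history — A PDF that has been saved incrementally keeps its earlier revisions appended inside the file. This is not standard metadata, but it can reveal edit history, including metadata values that were later changed or removed. Rewriting the file with QPDF (for example with --linearize) discards those earlier revisions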
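
Incremental saves can often be spotted without special tools, because each saved revision normally ends with its own %%EOF marker. The sketch below counts those markers as a heuristic. It assumes a linearized file typically contains two markers by design, and it can be fooled if the text %%EOF happens to appear inside uncompressed content. Both function names are illustrative.

```python
def count_eof_markers(pdf_bytes: bytes) -> int:
    """Count %%EOF markers; each saved revision normally ends with one."""
    return pdf_bytes.count(b"%%EOF")


def has_incremental_updates(pdf_bytes: bytes) -> bool:
    """Guess whether earlier revisions are still appended inside the file."""
    # A linearized file declares itself near the start and usually has two markers.
    linearized = b"/Linearized" in pdf_bytes[:1024]
    allowed = 2 if linearized else 1
    return count_eof_markers(pdf_bytes) > allowed


# Synthetic examples, not complete PDF files
single_revision = b"%PDF-1.7\n...objects...\n%%EOF\n"
edited_twice = single_revision + b"...update 1...\n%%EOF\n...update 2...\n%%EOF\n"

print(has_incremental_updates(single_revision))  # False
print(has_incremental_updates(edited_twice))     # True
```

If the check returns True for a file you are about to send out, rewrite it first so the earlier revisions are dropped.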

Batch Processing: Removing Metadata from Many PDFs

For organizations that need to strip metadata from PDFs at scale:

# ExifTool: process all PDFs in a folder
exiftool -all= -overwrite_original *.pdf

# Recursive, all subdirectories
exiftool -all= -overwrite_original -r /path/to/documents/

# With backup (ExifTool creates _original files by default)
exiftool -all= *.pdf

For automated workflows, integrate ExifTool and QPDF into your document pipeline so metadata is stripped, and the file rewritten, before files leave your organization.
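
As one possible shape for such a pipeline, the sketch below walks a folder, builds the ExifTool and QPDF commands shown earlier in this guide for each PDF, and either prints or runs them. It assumes both tools are installed and on the PATH. The function names and the dry_run switch are illustrative, and the default only prints the commands so nothing is modified by accident.

```python
import subprocess
import tempfile
from pathlib import Path


def find_pdfs(root: Path) -> list[Path]:
    """Return every PDF under root, including subdirectories (case-insensitive suffix)."""
    return sorted(p for p in root.rglob("*") if p.is_file() and p.suffix.lower() == ".pdf")


def scrub_commands(pdf: Path) -> list[list[str]]:
    """Commands for one file: clear the tags, then rewrite the file to drop old revisions."""
    return [
        ["exiftool", "-all=", "-overwrite_original", str(pdf)],
        ["qpdf", "--linearize", "--replace-input", str(pdf)],
    ]


def scrub_tree(root: Path, dry_run: bool = True) -> int:
    """Process every PDF under root and return how many files were handled."""
    handled = 0
    for pdf in find_pdfs(root):
        for argv in scrub_commands(pdf):
            if dry_run:
                print(" ".join(argv))
            else:
                # Raises CalledProcessError if a tool exits with a non-zero status
                subprocess.run(argv, check=True)
        handled += 1
    return handled


# Demonstration on a throwaway folder containing empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.pdf").touch()
    (root / "sub" / "b.PDF").touch()
    (root / "notes.txt").touch()
    found_names = [p.name for p in find_pdfs(root)]
    handled_count = scrub_tree(root, dry_run=True)

print(found_names, handled_count)
```

Running ExifTool first and QPDF second is deliberate: the rewrite in the second step is what removes the original values that the first step leaves in the file. Work on copies, because both commands modify files in place.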

When to Remove PDF Metadata

Before sharing externally — Any PDF sent to clients, partners, or the public should have metadata stripped
Legal document production — Discovery and disclosure processes require understanding (and often sanitizing) metadata
Publishing online — PDFs on your website reveal your staff names, software, and internal workflows via metadata
Healthcare file sharing — PDF metadata can leak PHI in healthcare settings
GDPR / privacy compliance — Author names in shared PDFs are personal data under EU regulations
Before encrypting and sharing — Strip metadata first, then use encrypted file sharing for the transfer

Frequently Asked Questions

Does removing metadata change the PDF's appearance?

No. Metadata removal only affects hidden properties (author, dates, creator application, etc.). The visible text, images, layout, and formatting of the PDF remain identical.

Can I see what metadata a PDF contains before removing it?

Yes. In FileShot's Metadata Scrubber, the tool displays all detected metadata before you strip it. With ExifTool, run exiftool document.pdf to list all metadata fields. In Acrobat, use File > Properties.

Does metadata removal reduce file size?

Marginally. Metadata is typically a few hundred bytes to a few kilobytes. For standard documents, the size difference is negligible. However, if a PDF contains large XMP metadata blocks (common with Adobe Creative Suite files), removal can reduce file size noticeably.

Is there metadata I should keep?

For internal archival and document management, metadata is useful — it helps with search, attribution, and audit trails. The recommendation is to strip metadata only from copies that leave your organization, not from your internal originals.

Can removed metadata be recovered?

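It depends on how the file was saved. If the PDF is fully rewritten (for example by Acrobat's Sanitize Document or by a QPDF rewrite), the data is gone from that copy. If the change was saved as an incremental update, which is how ExifTool writes PDFs, the earlier values still exist within the file structure and can be recovered. Use QPDF's linearize to eliminate incremental history.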

Related Guides

Word Metadata Remover — strip hidden data from Word documents before converting to PDF
HIPAA Metadata Leaks — how PDF metadata exposes patient information in healthcare
How to Password Protect Any File — add password protection after stripping metadata
What Is Encrypted File Sharing? — encrypt cleaned PDFs before sharing