How to Remove Metadata from PDF Files
— Written by Brendan, Founder of FileShot.io
Every PDF file carries hidden metadata that reveals who created it, when it was made, what software was used, and sometimes much more. Before you share a PDF with clients, post it publicly, or submit it for legal review, you should know what metadata is inside and how to remove it.
What Metadata Is Hidden in PDF Files?
PDF metadata is stored in two places: the Document Information Dictionary (the older format) and XMP metadata (the modern XML-based format). Most PDFs contain both. Here is what they typically include:
- Author — The name of the person or user account that created the document. Often set automatically from your OS username or application profile
- Title — Sometimes auto-populated with filenames, project names, or client names
- Subject and Keywords — Fields that may contain internal project identifiers or classification labels
- Creator Application — The software that produced the PDF (e.g., "Microsoft Word 2021", "Adobe InDesign 18.5", "LaTeX with pdfTeX")
- PDF Producer — The PDF library or converter used (e.g., "macOS Quartz PDFContext", "iText 7.2.5")
- Creation Date — The exact date and time the document was first created, including timezone offset
- Modification Date — The last time the document was edited
- Custom Metadata — XMP can store arbitrary key-value pairs. Some enterprise systems embed document IDs, workflow status, classification levels, or user email addresses
Why PDF Metadata Is a Privacy Risk
PDF metadata causes problems in specific real-world situations:
- Legal documents: A contract PDF reveals the author as "jsmith@lawfirm.com" and the creation date as 3 weeks before the claimed drafting date. Opposing counsel uses this to challenge the timeline
- Freelance work: A freelancer delivers a report to a client. The PDF author field shows the freelancer's personal name instead of the agency brand. The client discovers the outsourcing arrangement
- Whistleblower documents: A leaked internal PDF is traced back to a specific employee because the author field contains their Windows login name
- GDPR compliance: A PDF shared with EU customers contains a staff member's full name in the author field. Under GDPR, that is personal data being shared without necessity or consent
- Competitive intelligence: The Creator Application field reveals you are using a specific ERP or document management system, giving competitors insight into your tech stack
Method 1: FileShot Metadata Scrubber (Free, No Upload)
The fastest way to strip all metadata from a PDF without uploading it anywhere:
- Go to the FileShot Metadata Scrubber page
- Drop your PDF file onto the page
- The tool reads the file entirely in your browser — nothing is uploaded to any server
- Review the metadata that was found (author, dates, creator, producer, custom fields)
- Click to strip all metadata and download the cleaned PDF
Because the processing happens client-side in your browser using JavaScript, your PDF never leaves your device. This matters for confidential documents, legal files, and anything covered by NDA or compliance requirements.
Method 2: Adobe Acrobat Pro
If you have Adobe Acrobat Pro (paid):
- Open the PDF in Acrobat Pro
- Go to File > Properties (or Ctrl+D)
- In the Description tab, manually clear the Author, Title, Subject, and Keywords fields
- For thorough removal: go to Tools > Redact > Sanitize Document
- The Sanitize feature removes metadata, hidden layers, embedded objects, file attachments, bookmarks, comments, and form field data
- Save the sanitized PDF
Acrobat's Sanitize Document feature is the most thorough built-in option, but it requires a paid Acrobat Pro subscription and sends telemetry to Adobe.
Method 3: ExifTool (Command Line)
ExifTool by Phil Harvey supports PDF metadata. It handles both the Document Information Dictionary and XMP:
# View all PDF metadata
exiftool document.pdf
# Remove all metadata
exiftool -all= document.pdf
# Remove specific fields only
exiftool -Author= -Creator= -Producer= document.pdf
# Remove all metadata and overwrite original
exiftool -all= -overwrite_original document.pdf
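The -all= flag empties all writable tags. ExifTool cannot remove the PDF version number or file structure data (those are structural, not metadata). Be aware that ExifTool writes PDF changes as an incremental update appended to the file: the cleared fields no longer show up in viewers, but the original values remain physically in the file and can be recovered until the file is rewritten, for example with qpdf --linearize (see Method 5).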
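Because ExifTool appends its PDF edits as an incremental update, a common follow-up is to rewrite the file with QPDF so the superseded objects are dropped. The sketch below chains the two steps. The function name scrub_pdf and the temporary file naming are illustrative assumptions, not part of either tool.

```shell
# scrub_pdf SRC DST: hypothetical helper, not part of ExifTool or QPDF.
# Step 1 blanks the writable tags in a working copy. ExifTool appends an
# incremental update, so the old values are still physically in the copy.
# Step 2 rewrites the copy with QPDF, which leaves out objects that are
# no longer referenced, including the superseded metadata.
scrub_pdf() {
    src=$1
    dst=$2
    tmp="${dst}.tmp.pdf"

    cp -- "$src" "$tmp" || return 1
    if ! exiftool -q -all= -overwrite_original "$tmp"; then
        rm -f -- "$tmp"
        return 1
    fi

    # qpdf exits 0 on success and 3 when it wrote output but had warnings.
    qpdf --linearize "$tmp" "$dst"
    status=$?
    rm -f -- "$tmp"
    return "$status"
}
```

A call such as scrub_pdf contract.pdf contract-clean.pdf leaves the original untouched and writes the cleaned copy next to it.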
Method 4: LibreOffice (Free, Cross-Platform)
If you created the PDF from a document in LibreOffice:
- Before exporting to PDF, go to File > Properties
- Clear the fields in the Description and Custom Properties tabs
- On the General tab, uncheck "Apply user data" if present
- Export to PDF via File > Export as PDF
This approach prevents metadata from being written in the first place, which is more reliable than removing it after the fact.
Method 5: QPDF (Free, Open Source)
QPDF is a command-line tool specifically designed for PDF structural transformations:
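# Rewrite the file, dropping incremental-update history and unreferenced objects
qpdf --linearize --replace-input input.pdf
# Omit the Info dictionary entries and the XMP metadata stream
# (these options need a recent release, qpdf 11.10 or later; check qpdf --help=all)
qpdf --remove-info --remove-metadata input.pdf output.pdf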
QPDF is useful for batch processing large numbers of PDFs in automated workflows. Combine it with ExifTool for complete metadata removal.
What About Embedded Images in PDFs?
PDFs often contain embedded images — logos, charts, photographs, scanned pages. Each embedded image can carry its own EXIF data, including GPS coordinates, camera model, and timestamps. Standard PDF metadata removal tools strip the document-level metadata but may not touch image-level EXIF data inside the PDF.
For thorough cleaning:
- Strip EXIF from images before embedding them in the PDF
- Use Acrobat Pro's Sanitize Document feature, which addresses embedded content
- Use FileShot's Metadata Scrubber, which processes the PDF and its embedded image data
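To see whether embedded images in a given PDF still carry EXIF data, one option is to extract them and inspect the results. The sketch below assumes Poppler's pdfimages and ExifTool are installed. The function name is made up for illustration, and whether any EXIF survives depends on how the producing application embedded the images (many re-encode them and drop it).

```shell
# list_embedded_image_tags PDF: hypothetical helper.
# Extracts embedded images in their native format with pdfimages (Poppler),
# then asks ExifTool for GPS, camera, and capture-time tags in the results.
list_embedded_image_tags() {
    pdf=$1
    workdir=$(mktemp -d) || return 1

    # -all writes JPEG and similar streams as stored instead of converting.
    pdfimages -all "$pdf" "$workdir/img" &&
        exiftool -a -G1 -s -GPS:all -Make -Model -DateTimeOriginal "$workdir"
    status=$?

    rm -rf -- "$workdir"
    return "$status"
}
```

If the command prints GPS or camera tags, the images were embedded with their EXIF intact and need cleaning at the source before the PDF is regenerated.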
PDF Metadata vs. PDF Content: What Cannot Be Removed
Metadata removal does not alter the visible content of a PDF. It also cannot remove:
- Text within the document — If a name or date appears in the visible text, metadata removal will not redact it. Use redaction tools for that
- Digital signatures — Removing metadata from a signed PDF invalidates the signature
- PDF form field data — Filled form fields are content, not metadata
- Incremental save history — A PDF can store earlier revisions as incremental updates appended to the file. This is not standard metadata, and clearing metadata fields does not touch it, but it can reveal edit history, including old metadata values. Rewriting the file with QPDF's --linearize option discards incremental saves
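A rough way to check whether a file still carries incremental save history is to count its %%EOF markers, since each incremental save normally appends a new trailer ending in one. The helper name below is an assumption for illustration, and the result is only a heuristic: a linearized PDF normally contains two markers even with no history, and the marker can also appear inside embedded content.

```shell
# count_pdf_revisions FILE: hypothetical helper.
# Prints the number of %%EOF markers in FILE. A count above 1 in a
# non-linearized PDF suggests that earlier revisions are still stored.
count_pdf_revisions() {
    # -a treats the binary file as text; -o prints one line per match,
    # so wc -l counts occurrences rather than matching lines.
    grep -a -o '%%EOF' "$1" | wc -l | tr -d ' '
}
```

Running it before and after a rewrite gives a quick sanity check that the history was actually dropped.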
Batch Processing: Removing Metadata from Many PDFs
For organizations that need to strip metadata from PDFs at scale:
# ExifTool: process all PDFs in a folder
exiftool -all= -overwrite_original *.pdf
# Recursive, all subdirectories (-ext pdf keeps ExifTool from also rewriting images and other supported files)
exiftool -all= -overwrite_original -r -ext pdf /path/to/documents/
# With backup (ExifTool creates _original files by default)
exiftool -all= *.pdf
For automated workflows, integrate ExifTool or QPDF into your document pipeline so metadata is stripped before files leave your organization.
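As one possible shape for such a pipeline step, the sketch below writes cleaned copies of every PDF under a source directory into a separate output directory, so internal originals keep their metadata. The function name scrub_tree and the directory layout are assumptions. The output directory should sit outside the source directory, and file names containing newlines are not handled.

```shell
# scrub_tree SRC_DIR OUT_DIR: hypothetical batch helper.
# For each *.pdf under SRC_DIR, writes a cleaned copy to the same relative
# path under OUT_DIR. Originals are never modified.
scrub_tree() {
    src_dir=${1%/}
    out_dir=${2%/}

    find "$src_dir" -type f -name '*.pdf' | while IFS= read -r pdf; do
        rel=${pdf#"$src_dir"/}
        dest="$out_dir/$rel"
        tmp="${dest%.pdf}.tmp.pdf"
        mkdir -p "$(dirname "$dest")"

        # Blank the tags in a working copy, then rewrite it with QPDF so
        # the values ExifTool superseded are not carried into the output.
        if cp -- "$pdf" "$tmp" &&
           exiftool -q -all= -overwrite_original "$tmp" &&
           qpdf --linearize "$tmp" "$dest"; then
            echo "cleaned: $dest"
        else
            echo "FAILED: $pdf" >&2
        fi
        rm -f -- "$tmp"
    done
}
```

A call such as scrub_tree /path/to/documents /path/to/outbox could run as the last step before files are published or sent.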
When to Remove PDF Metadata
- Before sharing externally — Any PDF sent to clients, partners, or the public should have metadata stripped
- Legal document production — Discovery and disclosure processes require understanding (and often sanitizing) metadata
- Publishing online — PDFs on your website reveal your staff names, software, and internal workflows via metadata
- Healthcare file sharing — PDF metadata can leak PHI in healthcare settings
- GDPR / privacy compliance — Author names in shared PDFs are personal data under EU regulations
- Before encrypting and sharing — Strip metadata first, then use encrypted file sharing for the transfer
Frequently Asked Questions
Does removing metadata change the PDF's appearance?
No. Metadata removal only affects hidden properties (author, dates, creator application, etc.). The visible text, images, layout, and formatting of the PDF remain identical.
Can I see what metadata a PDF contains before removing it?
Yes. In FileShot's Metadata Scrubber, the tool displays all detected metadata before you strip it. With ExifTool, run exiftool document.pdf to list all metadata fields. In Acrobat, use File > Properties.
Does metadata removal reduce file size?
Marginally. Metadata is typically a few hundred bytes to a few kilobytes. For standard documents, the size difference is negligible. However, if a PDF contains large XMP metadata blocks (common with Adobe Creative Suite files), removal can reduce file size noticeably.
Is there metadata I should keep?
For internal archival and document management, metadata is useful — it helps with search, attribution, and audit trails. The recommendation is to strip metadata only from copies that leave your organization, not from your internal originals.
Can removed metadata be recovered?
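Usually not, but it depends on how the file was saved. Once metadata fields are cleared and the PDF is fully rewritten, the data is gone from that copy. However, tools that save incrementally (ExifTool does this for PDFs) append the change and leave the old values inside the file, where they can be recovered. Rewrite the file with QPDF's --linearize option to discard that incremental history.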
Related Guides
- Word Metadata Remover — strip hidden data from Word documents before converting to PDF
- HIPAA Metadata Leaks — how PDF metadata exposes patient information in healthcare
- How to Password Protect Any File — add password protection after stripping metadata
- What Is Encrypted File Sharing? — encrypt cleaned PDFs before sharing