Automated Redaction of Sensitive Information from PDFs

Updated: Apr 25, 2020

Before you distribute a PDF, you may need to remove or redact sensitive information such as social security numbers, credit card numbers, or other personal data to ensure privacy and confidentiality. Regulations and standards such as GDPR and PCI DSS also require a high standard of information security and data privacy, which makes document redaction an essential step.

Many software tools available today can redact text or images from a document. But redacting with those tools can be time-consuming, labor-intensive, and prone to both human and software error.

The PDFFiddler redaction process is designed to be simple and accurate, and it can be automated with reusable scripts for similarly structured documents. PDFFiddler Playground provides three easy ways to redact content in a PDF:

  1. Redact all text within the region of interest

  2. Redact all images within the region of interest

  3. Find and redact text that matches a regex pattern
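To make the "region of interest" idea in the first two options concrete, here is a minimal Python sketch, not PDFFiddler code. It assumes each word on a page comes with a bounding box and selects the words that lie fully inside a drawn region. The `Box` class, the `words_in_region` helper, and all coordinates are hypothetical, and whether PDFFiddler itself uses full containment or mere overlap is not stated in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page coordinates (hypothetical units)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, other: "Box") -> bool:
        # True when `other` lies entirely inside this rectangle.
        return (self.x0 <= other.x0 and self.y0 <= other.y0
                and other.x1 <= self.x1 and other.y1 <= self.y1)


def words_in_region(words, region):
    """Return the text of every (text, box) pair whose box is inside region."""
    return [text for text, box in words if region.contains(box)]


# Made-up word boxes standing in for text extracted from an invoice page.
words = [
    ("John", Box(72, 120, 100, 132)),
    ("Doe", Box(104, 120, 130, 132)),
    ("Total", Box(400, 500, 440, 512)),
]
name_address = Box(60, 100, 300, 200)  # the region drawn by the user

print(words_in_region(words, name_address))  # ['John', 'Doe']
```

A region-based approach like this is useful when the sensitive content always sits in the same place on similarly structured documents, which is why a region can be reused across a batch.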

Let's take an example to understand how the redaction process works in PDFFiddler.

Suppose that, in the sample PDF below, we want to redact the customer name and address within a particular region, and also find and redact every date in the invoice.

  • Preview the uploaded PDF, then click and draw a region around the name and address. A region variable popup will appear. Name the region name_address.

  • Now add the script below:

//load input document
doc = load($input[0])

//Redact name and address within region name_address

//Find and redact dates (two digits/two digits/four digits) on the selected page
doc.getPage(1).findAndRedactText(find: "\\d{2}/\\d{2}/\\d{4}", regex: true)

//output invoice
output(doc, "RedactedPDF")
  • Click the Run button. Ta-da! The redacted PDF is ready to download.
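To see what the date pattern in the script matches before running it on a real invoice, you can try the same regular expression in plain Python. This is only an illustration with the standard `re` module, not PDFFiddler code. The `redact_dates` helper, the mask text, and the sample sentence are made up for this sketch, and the doubled backslashes in the script are presumably string escaping for the same pattern.

```python
import re

# Same pattern as in the script, written as a Python raw string so the
# backslashes do not need to be doubled.
DATE_PATTERN = re.compile(r"\d{2}/\d{2}/\d{4}")


def redact_dates(text: str, mask: str = "[REDACTED]") -> str:
    """Replace every substring that matches DATE_PATTERN with a mask."""
    return DATE_PATTERN.sub(mask, text)


sample = "Invoice date: 25/04/2020, due 5/5/2020 or 05/05/2020."
print(redact_dates(sample))
# Invoice date: [REDACTED], due 5/5/2020 or [REDACTED].
```

Note that the pattern checks only the shape of the text: single-digit days or months such as 5/5/2020 are not matched, while impossible dates such as 99/99/9999 are. If your invoices use another date format, the pattern needs to be adjusted accordingly.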

It is important to understand that redaction is irreversible: once content has been redacted, no one can extract it from the output PDF.

Today, we learned how PDFFiddler Playground can be used to redact sensitive information from a document in a few simple steps.

We wanted to keep this example simple, so we have not covered the other redaction features available, such as image, region, and text redaction.

In more advanced scenarios, you can combine the redaction task with other PDFFiddler tasks such as split, merge, content manipulation, and page manipulation. This flexibility makes PDFFiddler stand out from other software.

Ready to explore more about redaction in PDFFiddler Playground?

#pdffiddler #pdfredaction #pdfplayground #pdfautomation