Redacted Documents Are Not as Secure as You Think

For years, if you wanted to protect sensitive text in a document, you could grab a pair of scissors or a scalpel and cut out the information. If this didn’t work, a chunky black marker pen would do the job. Now that most documents are digitized, securely redacting their contents has become harder. Most redactions by government officials and courts involve placing black boxes over text in PDFs.

When this redaction is done incorrectly, people’s safety and national security can be put at risk. New research from a team at the University of Illinois looked at the most popular tools for redacting PDF documents and found many of them wanting. The findings, from researchers Maxwell Bland, Anushya Iyer, and Kirill Levchenko, say two of the most popular tools for redacting documents offer no protection to the underlying text at all, with the text accessible by copying and pasting it. Plus, a new attack method they devised makes it possible to extract secret details from the redacted text.

The flaws aren’t just theoretical. After examining millions of publicly available documents with blacked-out redactions—including from the US court system, the US Office of the Inspector General, and Freedom of Information Act requests—the researchers found thousands of documents that exposed people’s names and other sensitive details. “I’ve been in lots of discussions with the US court system, I provided them 710 different documents that were just trivial copy-paste style redactions,” says Bland, the paper’s lead author.

Officials usually redact sections of text in documents because those parts contain people’s personal information, or they decide the information shouldn’t be released to protect an organization’s interests. Court documents may redact names of confidential informants or whistleblowers; policy documents may redact information that could damage national security if it is made public.

During the new research, which has been published as a preprint, the team analyzed 11 popular redaction tools. They discovered that PDFzorro and PDFescape Online allowed full access to text that had allegedly been redacted. All they needed to do to access the text was copy and paste it. The researchers registered CVE numbers—used to catalog unique security vulnerabilities—for both of the issues.
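To see why copy and paste can defeat a black box, it helps to treat a page as two separate layers: the stored text and the shapes drawn over it. The sketch below is a toy model in Python, not a real PDF parser and not the researchers' method. Every name in it (TextRun, BlackBox, visible_text, extract_text, remove_covered) is hypothetical and exists only for illustration.

```python
from dataclasses import dataclass


@dataclass
class TextRun:
    """A piece of text stored in the page, with its position (toy model)."""
    text: str
    x: float
    y: float
    width: float
    height: float


@dataclass
class BlackBox:
    """An opaque rectangle drawn on top of the page (toy model)."""
    x: float
    y: float
    width: float
    height: float


def covers(box: BlackBox, run: TextRun) -> bool:
    """True if the box fully hides the text run on screen."""
    return (
        box.x <= run.x
        and box.y <= run.y
        and box.x + box.width >= run.x + run.width
        and box.y + box.height >= run.y + run.height
    )


def visible_text(runs: list[TextRun], boxes: list[BlackBox]) -> str:
    """What a reader sees: runs hidden behind a box are not shown."""
    return " ".join(r.text for r in runs if not any(covers(b, r) for b in boxes))


def extract_text(runs: list[TextRun]) -> str:
    """What copy and paste returns: the stored text, ignoring any overlay."""
    return " ".join(r.text for r in runs)


def remove_covered(runs: list[TextRun], boxes: list[BlackBox]) -> list[TextRun]:
    """A safer redaction in this model: delete the covered text itself."""
    return [r for r in runs if not any(covers(b, r) for b in boxes)]


if __name__ == "__main__":
    runs = [TextRun("Witness:", 0, 0, 50, 10), TextRun("Jane Doe", 55, 0, 60, 10)]
    boxes = [BlackBox(54, -1, 62, 12)]

    print(visible_text(runs, boxes))                   # Witness:
    print(extract_text(runs))                          # Witness: Jane Doe
    print(extract_text(remove_covered(runs, boxes)))   # Witness:
```

In this model the overlay changes only what is displayed, so the name survives extraction until the underlying text is deleted. Deleting the text addresses only the copy-and-paste problem; the toy model does not capture the fingerprint-based attack the researchers also describe.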

PDFzorro did not respond to WIRED’s request for comment. When we tested the tool, it was possible to access PDFzorro redactions by highlighting them. However, if you click on an option to “lock” the PDF before you download it, the text can’t be accessed. Meanwhile, a customer service representative from PDFescape Online said the software has been recently acquired by a new company and they have “rolled out an update for PDFescape Online” that includes security fixes. “The mentioned redaction tool has been removed and will be reworked to be fully compliant,” they said. 

The Illinois research goes further than copy and paste. It also demonstrates a new way to attack PDF documents and use hidden fingerprints to reveal names that have been redacted. The team focused on names, Bland says, as they are commonly redacted and sensitive. It does not appear possible to unredact large blocks of text, the researchers say. To reveal people’s names, the team built a tool, dubbed Edact-Ray, that can “identify, break, and fix redaction information leaks.” 


via Wired

November 25, 2022 at 04:16AM