Automatic PDF Redaction, New Software Solutions
October 14, 2019 | 5 minutes read
Document redaction has long been a labor-intensive task. Documents that require redaction often run to hundreds of pages. With conventional tools, a person has to review the document page by page, locate what needs to be redacted, and then apply the redaction to the selected content, one line at a time.
This can take many labor hours, adding up to days of effort. Even then, that person’s work will likely need to be reviewed by one or two other people to ensure compliance. People who perform redaction often miss items that appear in the document multiple times. Over the course of evaluating the document, their brain enters “schema mode,” a state in which the mind fills in details during a repetitive task instead of freshly evaluating what is in front of it. Over time the brain grows weary of the actual details, and the work is left incomplete. As a result, the document needs multiple sets of eyes to compensate for the natural effects of mental exhaustion.
However, new technology offers a response to this problem. For many years, software has been built around algorithms designed to analyze data and surface likely answers. This does not mean the software is doing the work for the person. Rather, it points them toward potential work, so they can independently evaluate the information the algorithm found and decide whether any action is needed.
Algorithm Basics
Algorithms can be used to predict outcomes, to synthesize results, to discover new data, and many other tasks that can be both task-complex and labor-intensive. Algorithms are designed to solve problems as quickly as humans can present them, and do so in a cost-effective manner. Algorithms can assess copious amounts of data in whatever way it is presented, and give potential answers to the problems presented so that decisions can be reached at the speed of business.
Of course, when it comes to documents, especially the lengthy documents that are in need of redaction, finding ways to break down documents into data categories is optimal. And this is where the redaction tools offered in CaseGuard Studio come into alignment with your agency goals. With longer documents, searching and categorizing similar content constitutes one way of “chunking” the data within the document, and presenting a solvable problem within the document.
Pattern Redaction
Pattern Redaction, offered in CaseGuard Studio, does exactly this. The powerful algorithm within CaseGuard Studio evaluates each document loaded into the software, and one of the results it produces is a listing of common phrases, sentences, and structures found in the document. In this case, the longer the document, the better. CaseGuard Studio then establishes these recurring instances as “patterns” within the document, and provides a sub-menu that highlights these patterns for employees at your agency to evaluate.
Rather than evaluating each and every line, an employee who knows that three or four recurring sentences need redaction can select each sentence, apply redaction either uniformly or specific to that content, and then move on to redacting more unique content. CaseGuard Studio provides a sub-menu of text patterns within the given document. Using the Patterns/Regular Expressions button in your tool menu, you can select a given text pattern for redaction, select the type of redaction you want to apply, and even select a unique size, color, and other visual properties for that particular redaction, applying it to all the matching content in one action.
This one tool can cover a lot of ground in an instant, freeing up your staff to focus on individualized content in the document, and removing the need for multiple people to review the document prior to release. The process can be used across multiple patterns of content, so you can repeat it in a document as many times as it takes to accomplish the full redaction you are looking for. Social Security numbers, addresses, healthcare data, financial accounts, names, locations, sentences with repetitive data: whatever data shows up repeatedly will be identified as a pattern, and this simple tool will cut your labor efforts drastically.
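To make the idea of redacting every match of a pattern in one action more concrete, here is a minimal sketch in Python. It works on plain text rather than a PDF and is not CaseGuard Studio's implementation; the `PATTERNS` table, the `redact_all` helper, and the sample text are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical patterns for illustration only; real identifiers come in
# more formats than these simple expressions cover.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def redact_all(text, pattern, mask_char="X"):
    """Mask every match of `pattern`, keeping each match's original length.

    Returns the redacted text and the number of matches that were masked.
    """
    return pattern.subn(lambda m: mask_char * len(m.group(0)), text)


sample = "SSN 123-45-6789 on file. Call 555-867-5309. SSN 123-45-6789 again."
redacted, match_count = redact_all(sample, PATTERNS["ssn"])
print(redacted)
print(match_count, "matches redacted")
```

Masking with the same number of characters keeps the rest of the text in place, which is a design choice of this sketch and not a statement about how any particular product renders redactions.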
How Does It Work?
Pattern Redaction in CaseGuard Studio is very user friendly. Our software uses text analysis to build a companion data file for the document you are working on. This data file provides the total word count and a count of each individual word used across the entire document, and this is where identifying text patterns comes into play. As CaseGuard Studio analyzes the document, it finds the words used within, establishes the patterns those words form, and then provides a sub-menu where you can quickly review the patterns found in the document.
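The kind of companion data described above can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the function name `build_companion_data`, the three-word phrase length, and the repeat threshold are all invented for illustration, and the source does not describe how CaseGuard Studio actually detects patterns.

```python
import re
from collections import Counter


def build_companion_data(text, phrase_len=3, min_repeats=2):
    """Count words and find word sequences that recur in the text.

    `phrase_len` and `min_repeats` are illustrative assumptions.
    """
    words = re.findall(r"[A-Za-z0-9']+", text.lower())
    word_counts = Counter(words)
    # Slide a window of `phrase_len` words across the text and count
    # how often each window's contents appear.
    phrases = Counter(
        " ".join(words[i:i + phrase_len])
        for i in range(len(words) - phrase_len + 1)
    )
    patterns = {p: c for p, c in phrases.items() if c >= min_repeats}
    return {
        "total_words": len(words),
        "word_counts": word_counts,
        "patterns": patterns,
    }


doc = ("The suspect John Doe was seen at 12 Elm Street. "
       "Later, the suspect John Doe left 12 Elm Street.")
data = build_companion_data(doc)
for phrase, count in data["patterns"].items():
    print(count, phrase)
```

A longer document gives the counter more repetitions to work with, which is consistent with the point that lengthier documents suit this approach.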
The analysis runs in the background while the document is loading into the software, so there’s no lag time between opening and accessing a file. Once it is complete, you click on the Pattern Redaction function, and the sub-menu will appear, showing you what patterns have been found. From there you can select the type of redaction you want to place onto the text pattern, along with the color and size of the redaction.
Once the redactions are applied, burning them into the document is just as easy. With one click, CaseGuard Studio does the hard work of ensuring the redactions are part of the document permanently, so that you know the document’s most common information has been redacted as needed, and your staff can focus on the detail work ahead. On top of that, a partner document is created that shows where all the redaction has occurred, making it easy to follow the work performed, and quickly find anything in the document that may need to change.
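The partner document idea can be illustrated with a small Python sketch that records where each redaction was placed. The function `redact_with_log`, the log fields, and the JSON output are assumptions made for this example, not the format CaseGuard Studio produces. The log stores only positions and a label, never the redacted text itself.

```python
import json
import re


def redact_with_log(text, pattern, label, mask_char="X"):
    """Mask every match and record where each redaction was applied."""
    log = []

    def _mask(match):
        # Store character offsets only, so the log never exposes
        # the content that was redacted.
        log.append({"label": label, "start": match.start(), "end": match.end()})
        return mask_char * (match.end() - match.start())

    return pattern.sub(_mask, text), log


sample = "Account 4111-1111 belongs to Jane. Account 4111-1111 is closed."
account_pattern = re.compile(r"\b\d{4}-\d{4}\b")  # hypothetical account format
redacted, log = redact_with_log(sample, account_pattern, "account")

print(redacted)
print(json.dumps(log, indent=2))
```

A reviewer can use a log like this to jump straight to each redacted span and confirm or adjust it.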
Conclusion
CaseGuard Studio’s Pattern Redaction is cutting down labor costs when it comes to public records requests, courtroom exhibits, and everything else in the world of presentable redacted documents. There’s no longer a need to suffer through the pains of checking, re-checking, and checking again. It’s time to free up your labor dollars. It’s time to get CaseGuard Studio and its powerful Pattern Redaction tool into your agency.