Overview
Optical Character Recognition (OCR) is an essential step in preparing scanned or handwritten PDFs for document redaction. When a file isn’t machine-readable, CaseGuard’s OCR engine converts it into fully searchable and editable text, making redaction fast, accurate, and compatible with CaseGuard’s document redaction tools.
When should you OCR a document?
You should OCR your file (or at least the pages you plan to redact) when:
- Text in a scanned document is not being detected correctly
- The original document was poorly OCR’d; you will notice this when highlighting text selects scattered words instead of full sentences
- Your document includes images that contain text
- You want to improve accuracy for automated redaction workflows
Running OCR converts all embedded, handwritten, or scanned text into searchable content that CaseGuard’s document redaction tools can analyze accurately.
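As a rough illustration of the first two criteria above, the sketch below flags pages whose existing text layer is empty or suspiciously thin. It is not CaseGuard’s detection logic: the function name `pages_needing_ocr`, the `min_chars` threshold, and the idea of passing in already-extracted page text are all assumptions made for this example.

```python
from typing import List


def pages_needing_ocr(page_texts: List[str], min_chars: int = 20) -> List[int]:
    """Return 1-based page numbers whose text layer looks empty or too thin.

    page_texts holds the text already extracted from each page (hypothetical
    input); min_chars is an arbitrary threshold chosen for illustration.
    """
    flagged = []
    for number, text in enumerate(page_texts, start=1):
        # Count only visible characters so whitespace-only pages are flagged.
        visible = sum(1 for ch in text if not ch.isspace())
        if visible < min_chars:
            flagged.append(number)
    return flagged


sample = [
    "This page already has a usable machine-readable text layer.",
    "",        # scanned page with no text layer
    "  \n ",   # whitespace only
]
print(pages_needing_ocr(sample))  # [2, 3]
```

A character count is a crude signal. It will not catch a page with a full but badly recognized text layer, which is why the highlight test described above is still worth doing by hand.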
How to OCR a document in CaseGuard Studio
- Click OCR on the left Toolbar
- In the ‘Document Assistant – OCR Document’ window, choose what language your document is in.
- Select either “OCR all scanned pages,” which processes only scanned pages, or “OCR all pages,” which processes every page type, including scanned and machine-generated.
- Under “Run OCR on,” select “All Pages” or add a “Page Range.”
- Click Next, then Apply
Your document is now fully machine-readable and ready for AI redaction using CaseGuard’s automated document redaction tools.
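The “Page Range” option in the steps above limits OCR to a subset of pages. The sketch below shows one way such a range could be expanded into individual page numbers. It assumes a syntax like `1-3, 7`; the exact format CaseGuard Studio accepts is not stated in this article, and `parse_page_range` is a hypothetical helper, not part of the product.

```python
from typing import List


def parse_page_range(spec: str, page_count: int) -> List[int]:
    """Expand a range string such as "1-3, 7" into sorted 1-based page numbers.

    The comma-and-hyphen syntax is an assumption made for this example.
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        # Reject ranges that fall outside the document or run backwards.
        if start < 1 or end > page_count or start > end:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


print(parse_page_range("1-3, 7", page_count=10))  # [1, 2, 3, 7]
```

Restricting OCR to the pages you plan to redact, as suggested earlier, keeps processing time down on long documents.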
Why OCR matters for redaction accuracy
Running OCR on your scanned or handwritten documents ensures:
- More accurate PII detection
- Cleaner text recognition for automated PDF redaction
- Better performance for larger files
- Faster document redaction workflows