By Brian Duddy, Product Engineer
Search and edit scanned documents (the magic of OCR)
If your document was created from a scanned file, it is essentially a “picture” of text. But it is easy to change into editable text using OCR.
What is OCR?
OCR stands for Optical Character Recognition. It’s a technology that converts scanned text, which is an image of any typed, handwritten, or printed text in your document, into digital text that you can search and edit.
Why do I need OCR?
When you scan a document, you create a single image of the words, graphics, and other page elements. That means you can’t copy text, search for text, or select text to make changes to it.
OCR turns those text “pictures” into machine-readable text (as if you typed it all in) that you can edit, copy, add to, and delete. Professional PDF software such as Foxit PhantomPDF has PDF OCR built in to make this easy.
Many PDF software applications, such as PhantomPDF Standard, can tell right away when you open an image-based PDF document. When that happens, the software asks whether you want to make the text editable. If you do, the PDF OCR feature opens.
When using PDF OCR, you can choose to convert any or all of the pages in your file.
You can choose which language is in your document, and you can choose multiple languages as well.
And you have the choice to output the results as a Searchable Text Image. This means that the image text is searchable but not editable, and the document appearance will not change at all. Or you can choose Editable Text to enable the image text to be searchable and editable.
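For readers who think in code, the three choices above (pages, languages, and output type) can be pictured as a small settings object. The Python sketch below is purely illustrative: `OcrOptions`, `OutputMode`, and their fields are hypothetical names made up for this example and are not part of PhantomPDF or any other product's API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class OutputMode(Enum):
    # Text becomes searchable but not editable; page appearance is unchanged.
    SEARCHABLE_TEXT_IMAGE = "searchable_text_image"
    # Text becomes both searchable and editable.
    EDITABLE_TEXT = "editable_text"


@dataclass
class OcrOptions:
    """Hypothetical container for the choices described in the text."""
    pages: Optional[List[int]] = None  # None means "all pages"
    languages: List[str] = field(default_factory=lambda: ["English"])
    output: OutputMode = OutputMode.SEARCHABLE_TEXT_IMAGE

    def pages_to_convert(self, page_count: int) -> List[int]:
        """Return the 1-based page numbers that would be converted."""
        if self.pages is None:
            return list(range(1, page_count + 1))
        # Ignore requested pages that do not exist in the file.
        return [p for p in self.pages if 1 <= p <= page_count]

    def is_editable(self) -> bool:
        """True when the chosen output allows editing the recognized text."""
        return self.output is OutputMode.EDITABLE_TEXT


# Example: convert pages 1 and 3 of a bilingual file into editable text.
options = OcrOptions(
    pages=[1, 3],
    languages=["English", "French"],
    output=OutputMode.EDITABLE_TEXT,
)
```

Leaving `pages` unset stands for "convert the whole file," which mirrors the any-or-all choice described above.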
Last but not least, even the best OCR software makes mistakes, especially if you’ve scanned handwritten text. That’s why you have the option to correct “OCR suspects,” which is a nice name for converted text that your PDF software isn’t sure it got right. After correcting any suspects, it’s wise to review the entire document yourself to be sure it’s error-free.
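One common way to think about suspects is as recognized words whose confidence score falls below a cutoff. The sketch below assumes a hypothetical OCR engine that reports a confidence between 0.0 and 1.0 for each word; the `RecognizedWord` type, the `find_suspects` function, and the 0.80 threshold are all assumptions for illustration, not a description of how PhantomPDF decides what to flag.

```python
from typing import List, NamedTuple


class RecognizedWord(NamedTuple):
    text: str
    confidence: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)


def find_suspects(words: List[RecognizedWord],
                  threshold: float = 0.80) -> List[RecognizedWord]:
    """Return the words whose confidence is below the threshold."""
    return [w for w in words if w.confidence < threshold]


# Made-up sample data: "rn0dern" stands in for a misread "modern".
sample = [
    RecognizedWord("The", 0.99),
    RecognizedWord("rn0dern", 0.41),
    RecognizedWord("office", 0.95),
]

for suspect in find_suspects(sample):
    print(f"Please check: {suspect.text!r} (confidence {suspect.confidence:.2f})")
```

A confident score does not guarantee a correct word, which is why a full read-through is still worth doing after the flagged words are fixed.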
Those are the basics. If you’d like more details and options for using the PDF OCR feature in PhantomPDF, refer to the Optical Character Recognition section of the user manual for step-by-step instructions.