Thursday, July 15, 2010

Google Docs OCR – Converts Images and PDFs to Text


Google seems to be taking Google Docs mainstream with a rich set of new features: real-time collaboration, better support for Office documents, public file sharing, and now OCR (optical character recognition), which makes it an alternative to other free online OCR tools.

The new Google Docs OCR lets users upload an image or PDF and convert it into editable text on demand, in a matter of seconds.

To use OCR, look for the “Convert text from PDF or image files to Google Docs documents” checkbox when you’re uploading a file. The file will show up in Google Docs as a text document instead of in its original format, so if you want to share the image itself, you’ll have to upload it again with the box unchecked.

The tool isn’t perfect, but it works with fair accuracy. Some formatting is lost, but you can expect roughly 95% accuracy for good-resolution images. I recommend using images of 150 dpi or more for better results; something around 300 dpi should give you about 98% accuracy.
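
If you aren't sure what resolution a scan has, you can work it out from its pixel width and the width of the paper it covers. Here is a minimal Python sketch of that arithmetic. The function names and the "low"/"good"/"best" labels are my own illustration, and the 150 and 300 dpi cut-offs are simply the rough figures suggested above, not anything Google publishes.

```python
def effective_dpi(pixel_width: int, physical_width_inches: float) -> float:
    """Pixels per inch of a scan, given its pixel width and the paper width it covers."""
    if physical_width_inches <= 0:
        raise ValueError("physical width must be positive")
    return pixel_width / physical_width_inches


def ocr_readiness(dpi: float) -> str:
    """Bucket a resolution using this post's rough thresholds (assumed: 150 and 300 dpi)."""
    if dpi >= 300:
        return "best"
    if dpi >= 150:
        return "good"
    return "low"


# Example: a US Letter page (8.5 inches wide) scanned to 1275 pixels across.
dpi = effective_dpi(1275, 8.5)
print(dpi, ocr_readiness(dpi))
```

A scan that lands in the "low" bucket can still be worth trying, but rescanning at a higher setting is likely to help more than anything you do after the upload.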

Here’s the result of converting a 70 dpi image to text. Not that bad after all.
