January 18, 2016

3 Best Premium/Free OCR Software to Extract Text From Images

These days, almost everything (e.g. photos, music, videos) has gone digital, and that makes sense, as digital content can be conveniently managed, edited, and shared. So why should textual documents stay behind? Thanks to advancements in Optical Character Recognition (OCR) techniques, it’s now easier than ever to digitize the text in printed or handwritten documents, thus making it editable in word processing programs.

To do that, you need some really good OCR software, and that’s exactly what this article is all about. These programs can either acquire printed documents as images from scanning devices, or accept your own document images and convert them into editable text. Intrigued? Then let’s not beat around the bush, and get to the 3 best OCR programs.

  1. ABBYY FineReader

    Price: Paid versions start from $169.99; a 30-day free trial is available

    Platform Availability: Windows 10, 8, 7, Vista, and XP; Mac OS X 10.6 and later

    Get ABBYY FineReader

    When it comes to Optical Character Recognition, there’s hardly anything that comes even close to ABBYY FineReader. Loaded to the brim with an insane amount of powerhouse features, ABBYY FineReader makes extracting text from all kinds of images a breeze.

    Despite toting an extensive list of features, ABBYY FineReader is super simple to use. It can extract text from almost all popular image formats, such as PNG, JPG, BMP, and TIFF. And that’s not all. ABBYY FineReader can also extract text from PDF and DJVU files. Once the source file or image (which should preferably have a resolution of at least 300 dpi, for optimal recognition) is loaded up, the program analyzes it and automatically determines the different sections of the file that have extractable text. You can either have all of the text extracted, or choose only some specific sections. After that, all you need to do is use the Save option to choose the output format, and ABBYY FineReader will take care of the rest. Numerous output formats are supported, such as TXT, PDF, RTF, and even EPUB.
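To make the 300 dpi guideline a little more concrete, here is a minimal Python sketch (it is not part of ABBYY FineReader) that estimates a scan's resolution from its pixel dimensions and the physical size of the page. The function names and the US Letter example are illustrative assumptions.

```python
def effective_dpi(width_px, height_px, width_in, height_in):
    """Return the lower of the horizontal and vertical pixels-per-inch values."""
    if width_in <= 0 or height_in <= 0:
        raise ValueError("page dimensions must be positive")
    return min(width_px / width_in, height_px / height_in)


def meets_ocr_resolution(width_px, height_px, width_in, height_in, minimum_dpi=300.0):
    """Check a scan against the 300 dpi guideline mentioned above."""
    return effective_dpi(width_px, height_px, width_in, height_in) >= minimum_dpi


# A US Letter page (8.5 x 11 inches) captured at 2550 x 3300 pixels
print(effective_dpi(2550, 3300, 8.5, 11))         # 300.0
# The same page captured at half that size falls short of the guideline
print(meets_ocr_resolution(1275, 1650, 8.5, 11))  # False
```

In other words, a full Letter page needs roughly 2550 x 3300 pixels before it reaches the recommended resolution.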

    The output text is fully editable, and text from even the most content-intensive documents (e.g. those with multiple columns and complex layouts) is extracted flawlessly. Other features include extensive language support, numerous font styles/sizes, and image correction tools for files sourced from scanners and cameras.

    In a nutshell, if you want the absolute best OCR software out there, complete with extensive input/output format and processing support, go for ABBYY FineReader.


  2. GOCR

    Price: Free

    GOCR

    Note: Before getting started, it’s important to know that even though GOCR supports regular image formats such as PNG and JPG, it failed to recognize them during our testing (performed on a PC running Windows 10). It may well work with those formats on Linux machines, but if you’re using Windows, you’ll need to convert the source image(s) to the PNM format. This can be done via numerous online file conversion tools, such as this one.
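If you are curious what a PNM file actually contains, the sketch below writes the grayscale member of the PNM family (binary PGM, magic number P5) using only the Python standard library. It is an illustration rather than a full converter: decoding a PNG or JPG into pixel values is not shown and would need an image library, and the names encode_pgm and write_pgm as well as the sample pixels are our own assumptions.

```python
from typing import Sequence


def encode_pgm(rows: Sequence[Sequence[int]]) -> bytes:
    """Encode 8-bit grayscale rows (0 = black, 255 = white) as binary PGM."""
    height = len(rows)
    width = len(rows[0]) if height else 0
    if height == 0 or width == 0:
        raise ValueError("image must contain at least one pixel")
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same length")
    # Header: magic number, width and height, maximum grey value
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    # bytes() raises ValueError for any value outside 0..255
    pixels = bytes(value for row in rows for value in row)
    return header + pixels


def write_pgm(path: str, rows: Sequence[Sequence[int]]) -> None:
    """Write the encoded image to disk in binary mode."""
    with open(path, "wb") as handle:
        handle.write(encode_pgm(rows))


# A tiny 3 x 2 sample: a white background with one black pixel
sample = [[255, 0, 255],
          [255, 255, 255]]
print(encode_pgm(sample)[:11])  # b'P5\n3 2\n255\n'
```

The format is simply a short text header followed by one byte per pixel, which is part of why a simple tool like GOCR can read it so easily.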

    What sets GOCR apart from the lot is that it doesn’t really have a graphical user interface (GUI) front-end. It’s a command line based tool and as such, isn’t really the easiest to use. But once you’re comfortable with the basics, GOCR can prove really helpful in text extraction from images. It’s also worth noting that for GOCR to work properly, the source images should have clearly visible textual content, and preferably a white background, as the utility doesn’t really work with complex source files. GOCR extracts the text from images and saves it in the TXT format. While it supports quite a few arguments and functions, only a few need to be known to get started. For example, to extract text from a sample PNM image, you should enter the following at the command prompt.

        "X:\sample folder\gocr049" -i file.pnm -o file.txt

    Here, X:\sample folder is the location of GOCR’s command line tool (the quotes are needed because the folder name contains a space), and file.pnm and file.txt are the input and output files, respectively (both in the same location as GOCR; if the location is different, the complete path should be specified). Also, if you want to change the greyscale level used for the image, you can specify a numerical value as an argument to -l. Click here to read about the usage in detail.
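If you have many images to process, the same command can be scripted. The following is a minimal Python sketch that uses only the -i, -o, and -l options described above. The function names are our own, the paths are the sample ones from this article, and the 0 to 255 range check for -l is an assumption based on 8-bit grey values.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_gocr_command(gocr_exe: str, input_pnm: str, output_txt: str,
                       grey_level: Optional[int] = None) -> List[str]:
    """Build the argument list for the GOCR call shown above."""
    command = [gocr_exe, "-i", input_pnm, "-o", output_txt]
    if grey_level is not None:
        # Assumption: -l expects a grey level between 0 and 255
        if not 0 <= grey_level <= 255:
            raise ValueError("grey_level must be between 0 and 255")
        command += ["-l", str(grey_level)]
    return command


def run_gocr(gocr_exe: str, input_pnm: str, output_txt: str,
             grey_level: Optional[int] = None) -> str:
    """Run GOCR and return the extracted text from the output file."""
    command = build_gocr_command(gocr_exe, input_pnm, output_txt, grey_level)
    # A list of arguments (no shell) means a folder name with spaces
    # needs no extra quoting here.
    subprocess.run(command, check=True)
    return Path(output_txt).read_text(errors="replace")


# Only prints the command; call run_gocr() once GOCR is actually installed
print(build_gocr_command(r"X:\sample folder\gocr049", "file.pnm", "file.txt",
                         grey_level=160))
```

Keeping the command construction separate from the actual call makes it easy to check the arguments before anything is run.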

    To sum it up, GOCR is a fairly good OCR utility, and when it comes to text extraction from simple images, it works exceptionally well. However, it’s severely limited in features, and requires a fair amount of effort to get working.

    Platform Availability: Windows 10, 8, 7, Vista, and XP; Linux; OS/2

  3. FreeOCR

    Platform Availability: Windows 10, 8, 7, Vista, and XP

    Free OCR online

    If you’re looking for a simple, no-fuss OCR program with decent text recognition capabilities, look no further than FreeOCR. While it may not be overloaded with all kinds of fancy features, it still works extremely well for what it is.

    FreeOCR is extremely easy to use. It can obtain printed documents scanned via scanners, and also lets you upload images having textual content. Not only that, it can also extract text from heavily formatted multi-page documents. You can have the application extract either all of the text from the input PDF/image, or define a specific chunk of text. Conversion speeds are pretty good, and the converted text can be either saved in formats like TXT and RTF, or exported directly to Microsoft Word. FreeOCR supports all major image formats, like PNG, JPG, and TIFF.

    That being said, FreeOCR does have some shortcomings. It’s too basic, and doesn’t have any text post-processing functions. Moreover, the layout of the extracted text often gets messed up, with overlapping lines and columns. Use it only if you require some basic OCR functionality for occasional usage.
