Linux Format

gImageReader

Version: Web: https://github.com/manisandro


Accounting, formal office paperwork, library services and, of course, maintaining your own digital archive of historic documents and publications: these are just a few of the applications where optical character recognition (OCR) is welcome. The idea of extracting text from a scanned bitmap image became popular with the rise of home flatbed scanners in the 1990s (ancient times in computing terms), particularly thanks to the commercial ABBYY FineReader software. In Linux, we have an analogue to FineReader, known as Tesseract. This is a community effort to bring professional-quality OCR to Linux, and we must admit, it works just fine. The hero of this review is a graphical frontend to Tesseract, which allows everyone to scan and extract text data from any paper document. gImageReader is a sleek and easy-to-use application that saves you from having to deal with Tesseract via the command line. Don’t get confused by that initial ‘g’: it simply means ‘graphical’, and depending on your desktop of choice, you may want to use either the GTK3 or Qt5 version of gImageReader, both of which are officially supported.

“A community effort to bring professional-quality OCR to Linux”

The application doesn’t have too many controls and configurables, so it is quite friendly to newcomers. You can import bitmap files or scan directly from gImageReader if you have a physical scanning device. Remarkably, gImageReader distinguishes real scanners from the list of available V4L devices, so unlike many other multimedia apps in Linux, this one ignores your webcam and shows only genuine scanners.

For the recognition engine to work correctly with your language, make sure you’ve installed the appropriate language packages for Tesseract; otherwise gImageReader produces iffy results. Luckily, Tesseract supports over 100 languages and writing systems, so you just need to check your package manager and install the required parts.
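If you would rather confirm which language data Tesseract can already see before opening your package manager, a short script can ask it directly. The sketch below is only an illustration, not part of gImageReader: it assumes the `tesseract` binary is on your PATH and relies on its `--list-langs` option, while the helper names (`parse_langs`, `installed_langs`, `missing_langs`) and the example codes `eng` and `deu` are made up for this example. Package names for the missing languages vary by distribution, so the script only reports the codes.

```python
import shutil
import subprocess


def parse_langs(listing: str) -> list[str]:
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    langs = []
    for line in listing.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages ..." header.
        if not line or line.lower().startswith("list of available"):
            continue
        langs.append(line)
    return langs


def installed_langs() -> list[str]:
    """Ask the local tesseract binary which language data it can find."""
    if shutil.which("tesseract") is None:
        return []  # Tesseract itself is not installed
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True, text=True, check=False,
    )
    # Assumption: the list may arrive on stdout or stderr depending on the
    # release, so both are read. Any warning text would be picked up too.
    return parse_langs(result.stdout + "\n" + result.stderr)


def missing_langs(wanted: list[str], available: list[str]) -> list[str]:
    """Return the wanted language codes that are not installed."""
    return [code for code in wanted if code not in available]


if __name__ == "__main__":
    available = installed_langs()
    print("Installed:", ", ".join(available) or "none found")
    # 'eng' and 'deu' are example codes; substitute the ones you need.
    print("Missing:", missing_langs(["eng", "deu"], available))
```

Anything reported as missing is what you then look for in your package manager.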

The result is editable text that you can copy and paste into any other application, such as LibreOffice Writer, Scribus and so on.

Check predefined language definitions to make sure that gImageReader will work correctly.
