Who Do You Think You Are?

Convert documents into editable text

Nick Peers reveals how to make use of Google Docs to extract text from scanned images


Certificates and other documents contain a wealth of information about your forebears, but transcribing all of that detail by hand can take a painfully long time. This is where optical character recognition (OCR) comes into its own. OCR enables computers to convert scanned images into editable text. It does a great job of converting typewritten documents, and is increasingly effective with handwritten documents too, although results can still be patchy.

OCR products usually come with a price tag, but Google Docs lets you convert your scanned documents into editable text for free. When you pass your scan through Docs’ OCR engine, it delivers a document containing both the image and a transcription beneath it.

This document can then be edited in your browser, or downloaded as a file that you can open in a word processor such as Microsoft Word or LibreOffice Writer.

OCR relies on good, clear images, so we’ll show you how to optimise your scans before you submit them to Docs. This boosts your chances of getting a readable transcription that you can then edit into a word-perfect copy of the text.

1

Scan Document If necessary, scan your original document or certificate into your computer using your printer or camera. If scanning, make sure that the image has a resolution of 300 dpi and is either black and white or greyscale, so that the text – whether typed or handwritten – can be read clearly by Google’s OCR engine.
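To get a feel for what 300 dpi means in practice, the short Python sketch below works out how many pixels a scan should contain for a given paper size. The function name and the A4 example are our own illustration rather than anything your scanner software provides.

```python
# Sketch: how many pixels a scan contains at a given resolution.
MM_PER_INCH = 25.4

def scan_size_pixels(width_mm, height_mm, dpi=300):
    """Return (width, height) in pixels for a page scanned at `dpi`."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px

# An A4 sheet measures 210 x 297 mm.
print(scan_size_pixels(210, 297))  # (2480, 3508)
```

If the image your scanner produces is much smaller than this for a full page, it was probably captured at a lower resolution.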

2

Download A Record If the document is held on a website like ancestry.co.uk or findmypast.co.uk, locate the record and open the image in the site’s image viewer. Look for a download option – click this and save the image to your computer. The downloaded image should be of a sufficiently high resolution to work with Google’s OCR engine.

3

Prepare The Image Next, open your scanned or downloaded image in an image editor like the free Paint.NET (getpaint.net). If necessary, convert it to greyscale (choose ‘Adjustments > Black and White’ in Paint.NET). Crop out any unnecessary detail so that only the text – typed or handwritten – remains.
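For the curious, here is a rough Python sketch of what greyscale conversion and cropping do to an image behind the scenes. It treats the image as a plain grid of (red, green, blue) values and uses a widely used set of brightness weights; we are assuming these for illustration, and Paint.NET may calculate things differently.

```python
# Sketch: an image as rows of (red, green, blue) pixels, each value 0-255.

def to_grey(pixel):
    """Convert one colour pixel to a single grey level (0 = black, 255 = white)."""
    r, g, b = pixel
    # Common luma weights; an assumption, not necessarily Paint.NET's formula.
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def greyscale(rows):
    """Convert every pixel in the grid to a grey level."""
    return [[to_grey(p) for p in row] for row in rows]

def crop(rows, left, top, right, bottom):
    """Keep only the rectangle from (left, top) up to, not including, (right, bottom)."""
    return [row[left:right] for row in rows[top:bottom]]

image = [
    [(255, 255, 255), (255, 255, 255), (255, 255, 255)],
    [(255, 255, 255), (0, 0, 0), (40, 30, 20)],
    [(255, 255, 255), (10, 10, 10), (250, 240, 230)],
]
text_area = crop(greyscale(image), left=1, top=1, right=3, bottom=3)
print(text_area)
```

Cropping first and converting second would give the same result; the order only matters for speed on large scans.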

4

Improve Contrast Levels If the document is a little murky, see if you can adjust its brightness and contrast. Start with Paint.NET’s ‘Adjustments > Brightness/Contrast’ – try pushing both sliders up to make the background as light as possible while keeping the text as black and as sharp as you can.
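The Python sketch below shows one simple textbook model of a brightness and contrast adjustment, applied to a handful of grey levels. The formula and the slider values are assumptions chosen for illustration; Paint.NET’s own sliders may behave differently.

```python
# Sketch: a simple brightness/contrast model on grey levels (0 = black, 255 = white).

def clamp(value):
    """Keep a grey level inside the valid 0-255 range."""
    return max(0, min(255, round(value)))

def adjust(value, brightness=0, contrast=1.0):
    """Stretch values away from mid-grey (contrast), then lift them (brightness)."""
    return clamp((value - 128) * contrast + 128 + brightness)

# Two ink pixels followed by two murky paper pixels.
row = [60, 90, 180, 200]
print([adjust(v, brightness=30, contrast=1.5) for v in row])  # [56, 101, 236, 255]
```

In this example the paper moves much closer to white while the ink stays dark, which is the effect you are aiming for with the sliders.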

5

Other Tweaks More experienced users may get better results fixing brightness issues with the ‘Adjustments > Levels’ tool. If your text is slightly out of focus, look for a tool to subtly sharpen the pixels (‘Effects > Sharpen’ in Paint.NET) to make the characters more defined and therefore easier to read.
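A levels adjustment boils down to choosing which grey should become pure black and which should become pure white, then stretching everything in between. The Python sketch below illustrates the idea under that simplified assumption; the function and its parameter names are hypothetical, and real levels tools usually offer extra controls such as a midtone slider.

```python
# Sketch: a basic levels stretch on grey levels (0 = black, 255 = white).

def levels(value, black=0, white=255):
    """Map `black` to 0 and `white` to 255, stretching the values in between."""
    if white <= black:
        raise ValueError("white point must be higher than black point")
    scaled = (value - black) / (white - black) * 255
    return max(0, min(255, round(scaled)))

# Faded ink at 40 and yellowed paper at 200 become true black and true white.
faded = [40, 100, 200, 220]
print([levels(v, black=40, white=200) for v in faded])  # [0, 96, 255, 255]
```

Anything darker than the black point or lighter than the white point is clipped, so set the two points just inside the ink and paper tones rather than well beyond them.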

6

Upload To Google Drive If Google Backup and Sync is installed on your computer, copy the file into one of your Google Drive folders – it should upload automatically. Otherwise navigate to drive.google.com in your browser, log into your Google account, locate the correct folder, then right-click and choose ‘Upload file’.

7

Perform OCR Go to drive.google.com in your web browser and locate the file you’ve uploaded, then right-click it and choose ‘Open with > Google Docs’. A new browser tab will open. Wait while the file is converted, then the new document will appear in Google Docs.

8

Review Results Your original image will be displayed at the top of the new document. Scroll down to see the text as Google Docs’ OCR engine transcribed it. If the initial results are poor, you may want to re-edit the original scanned image in your image editor to try to improve its quality, and then repeat the conversion.
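If you want a rough measure of whether a re-edited scan has actually improved matters, you could type out one line of the document by hand and compare each OCR attempt against it. The Python sketch below does this with the standard difflib module; the sample lines are invented for illustration.

```python
# Sketch: score an OCR line against a hand-typed reference (1.0 = identical).
from difflib import SequenceMatcher

def similarity(ocr_line, reference_line):
    """Return a score between 0 and 1, ignoring upper and lower case."""
    return SequenceMatcher(None, ocr_line.lower(), reference_line.lower()).ratio()

reference = "John Smith, labourer, of Lambeth"    # typed by hand
first_try = "Jolm Srnith, labourcr, of Larnbeth"  # invented OCR output, murky scan
second_try = "John Smith, labourer, of Lambeth"   # invented OCR output, cleaned scan

print(round(similarity(first_try, reference), 2))
print(round(similarity(second_try, reference), 2))
```

A higher score on the second attempt suggests the extra image editing was worthwhile.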

9

Download Back To Your Computer As the transcription is fully editable, you can correct any errors manually as well as restyling the text to suit your purposes. When you’ve finished all of your edits, choose ‘File > Download’ and your output format (Word, LibreOffice and PDF are supported) to save a copy to your computer.
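Some of the routine tidying can be automated once you have the transcription as plain text. The Python sketch below is one possible approach, assuming the OCR output has kept the original line breaks: it rejoins words split by a hyphen at the end of a line, merges the lines of each paragraph, and removes doubled spaces. The function name and sample text are our own.

```python
# Sketch: tidy common layout problems in OCR output.
import re

def tidy_ocr_text(text):
    """Rejoin hyphenated words, merge wrapped lines and collapse extra spaces."""
    # "par-\nish" becomes "parish". Note this also joins genuine hyphenated
    # words that happen to break across lines, so check the result by eye.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # A single line break inside a paragraph becomes a space;
    # blank lines between paragraphs are left alone.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Runs of spaces or tabs become one space.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()

raw = "Born in the par-\nish of  St Mary,\nLambeth.\n\nFather: John"
print(tidy_ocr_text(raw))
```

This will not fix misread letters, which still need correcting by hand against the original image.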

