Tobias Leidinger: Enabling optical character recognition (OCR) for multi-coloured pictures

Abstract

Desktop OCR tools normally require scanned text documents of good quality, with clear contrast between text and background, to produce a reasonable recognition result. Images are considerably harder to interpret when the text varies in appearance because of the shape of the source object, when foreground and background colours vary (e.g. bright text on a dark background), or when the background pattern is irregular. If pictures are taken with a mobile phone camera, additional image artefacts have to be dealt with, such as varying lighting conditions, shiny surfaces or flash artefacts. OCR software requires black and white images for optimal results, and a good image binarisation algorithm amplifies the contrast between text and background.

Pre-processing is therefore an essential step for obtaining reasonable results, especially for pictures with possible quality issues. In the proposed approach, we extract a grey-scale representation of the image and remove background patterns by dividing, pixel by pixel, the original grey-scale values by a blurred version of the image. As the blurring usually makes the text disappear whereas the background is less affected, the darker text remains dark whereas the background becomes bright after division. Finally, a local threshold algorithm binarises the image. Within our project, we tested the binarisation algorithm on pictures of food product ingredient lists. While OCR tools often fail to extract any text from the original colour pictures, with the pre-processing we recognise text that agrees on average 72.6% with the original, as measured by text edit distance. In our framework, the binarisation is integrated into the OCR process, which is followed by dictionary-based post-processing. All modules of the processing chain are designed to connect to each other fully automatically.
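
The pre-processing steps and the evaluation measure described above can be illustrated with a minimal Python sketch. It is not the authors' ImageJ-based implementation, and several details are assumptions made only for illustration: the luminance weights used for the grey-scale conversion, a simple box blur as the blurring filter, a mean-based local threshold, all function names, radii and the offset value, and the normalisation of the edit distance by the length of the longer string. The grey-scale image is divided by its blurred version, since that is the direction in which dark text stays dark and the background becomes bright.

```python
def to_grey(rgb):
    """Convert rows of (r, g, b) tuples to grey values (assumed luminance weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]


def box_mean(img, radius):
    """Mean over a (2*radius+1) square window, clipped at the image border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            vals = [img[j][i] for j in ys for i in xs]
            out[y][x] = sum(vals) / len(vals)
    return out


def remove_background(grey, blur_radius):
    """Divide each grey value by the blurred value at the same pixel, capped at 1.0."""
    blurred = box_mean(grey, blur_radius)
    return [[min(1.0, g / max(b, 1e-6)) for g, b in zip(grow, brow)]
            for grow, brow in zip(grey, blurred)]


def local_threshold(img, radius, offset):
    """Set a pixel to 0 (text) if it is darker than its local mean minus offset, else 255."""
    means = box_mean(img, radius)
    return [[0 if v < m - offset else 255 for v, m in zip(vrow, mrow)]
            for vrow, mrow in zip(img, means)]


def binarise(rgb, blur_radius=3, thresh_radius=3, offset=0.15):
    """Grey-scale conversion, background removal by division, then local thresholding."""
    flat = remove_background(to_grey(rgb), blur_radius)
    return local_threshold(flat, thresh_radius, offset)


def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def accordance(recognised, reference):
    """Similarity in [0, 1]; normalising by the longer string is an assumption."""
    longest = max(len(recognised), len(reference))
    return 1.0 if longest == 0 else 1.0 - edit_distance(recognised, reference) / longest
```

For example, on a small synthetic image whose background brightens from left to right and which contains a single dark pixel, `binarise` marks only that pixel as text, whereas a single global threshold would have to trade off the dark and bright ends of the background.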

Keywords

Binarisation, ImageJ, OCR pre-processing

Administrative data

Presenting author: Tobias Leidinger
Organisation: CRP Henri Tudor

Co-authors: Andreas Arens, Antonio Kröger, Norbert Rösch