Optical character recognition (OCR) is the translation of optically scanned bitmaps of printed or written text
characters into character codes, such as ASCII. It is an efficient way to turn hard-copy materials into data
files that can be edited and otherwise manipulated on a computer. Libraries and government agencies have long
used the technology to make lengthy documents quickly available electronically.
Advances in OCR technology have spurred its increasing use by enterprises. For many document-input tasks, OCR is
the fastest and most cost-effective method available. Each year, the technology also frees acres of storage space
once given over to file cabinets and boxes full of paper documents.
Before OCR can be used, the source material must be scanned with an optical scanner (and sometimes a specialized
circuit board in the PC) to read in the page as a bitmap (a pattern of dots). Recognition software is then
required to convert the bitmap images into character codes.
Fig.1 Written text
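As a rough illustration of the bitmap-to-character-code idea described above, the following Python sketch compares a tiny scanned bitmap with a few stored reference patterns and reports the ASCII code of the closest match. It is a toy example under stated assumptions, not how any particular OCR product works: the 3-by-5 dot patterns, the TEMPLATES table, and the functions mismatches() and recognize() are all hypothetical names invented for this illustration, and simple template matching stands in for the far more robust methods real recognition software uses.

```python
# Toy sketch only: TEMPLATES, mismatches() and recognize() are hypothetical
# names made up for this illustration, not part of any OCR library.

# Each reference character is a 3x5 bitmap: '#' is a dark dot, '.' is a light dot.
TEMPLATES = {
    "I": ("###",
          ".#.",
          ".#.",
          ".#.",
          "###"),
    "L": ("#..",
          "#..",
          "#..",
          "#..",
          "###"),
    "T": ("###",
          ".#.",
          ".#.",
          ".#.",
          ".#."),
}


def mismatches(a, b):
    """Count the dots that differ between two equal-sized bitmaps."""
    return sum(
        dot_a != dot_b
        for row_a, row_b in zip(a, b)
        for dot_a, dot_b in zip(row_a, row_b)
    )


def recognize(bitmap):
    """Return (character, ASCII code) for the template closest to the bitmap."""
    best = min(TEMPLATES, key=lambda ch: mismatches(bitmap, TEMPLATES[ch]))
    return best, ord(best)


# A scanned "I" with one stray dark dot picked up in the third row.
scanned = ("###",
           ".#.",
           "##.",
           ".#.",
           "###")

print(recognize(scanned))  # prints ('I', 73)
```

The stray dot leaves the scanned pattern one dot away from the stored "I" and further from the other patterns, so the sketch still returns the character code 73. This is the sense in which recognition software turns a pattern of dots into editable text.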

