Matlab is a registered trademark of The Mathworks, Inc.


 Advanced Source Code . Com

 
 
 
Source code for fingerprint recognition, face recognition and much more



Optical character recognition (OCR) is the translation of optically scanned bitmaps of printed or written text characters into character codes, such as ASCII. It is an efficient way to turn hard-copy materials into data files that can be edited and otherwise manipulated on a computer, and libraries and government agencies have long used it to make lengthy documents quickly available electronically. Advances in OCR technology have spurred its increasing use by enterprises. For many document-input tasks, OCR is the fastest and most cost-effective method available, and each year it frees acres of storage space once given over to file cabinets and boxes full of paper documents.

Before OCR can be used, the source material must be scanned with an optical scanner (and sometimes a specialized circuit board in the PC) to read in the page as a bitmap (a pattern of dots). Software to recognize the images is also required.

The OCR software then processes these scans to differentiate between images and text and to determine which letters are represented in the light and dark areas. Older OCR systems match these images against stored bitmaps based on specific fonts. The hit-or-miss results of such pattern-recognition systems helped establish OCR's reputation for inaccuracy. Today's OCR engines add multiple neural network algorithms that analyze the stroke edge, that is, the line of discontinuity between the text characters and the background. Allowing for irregularities of printed ink on paper, each algorithm averages the light and dark along the side of a stroke, matches it to known characters and makes a best guess as to which character it is. The OCR software then averages or polls the results from all the algorithms to obtain a single reading.
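The two ideas described above, matching a glyph against stored bitmaps and polling several recognizers for a single reading, can be illustrated with a minimal Python sketch. It is not taken from the package: the 3x3 templates, the function names `match_template` and `poll`, and the pixel-agreement score are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical 3x3 bitmaps (1 = dark pixel); real systems use far larger glyphs.
TEMPLATES = {
    "I": ((0, 1, 0), (0, 1, 0), (0, 1, 0)),
    "L": ((1, 0, 0), (1, 0, 0), (1, 1, 1)),
    "T": ((1, 1, 1), (0, 1, 0), (0, 1, 0)),
}


def match_template(glyph, templates=TEMPLATES):
    """Return the label whose stored bitmap agrees with the glyph on the most pixels."""
    def agreement(a, b):
        # Count positions where both bitmaps have the same pixel value.
        return sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

    return max(templates, key=lambda label: agreement(glyph, templates[label]))


def poll(guesses):
    """Combine the guesses of several recognizers by majority vote."""
    return Counter(guesses).most_common(1)[0][0]


# An "L" with one pixel missing, as might result from uneven ink.
noisy_l = ((1, 0, 0), (1, 0, 0), (1, 1, 0))

best = match_template(noisy_l)   # closest stored bitmap
reading = poll([best, "I", "L"])  # the other two guesses are made up
```

Pure bitmap matching of this kind degrades quickly when the font or handwriting differs from the stored templates, which is the weakness the paragraph above attributes to older systems.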

Our software package addresses the classification of isolated handwritten characters and digits from the UJI Pen Characters Data Set using neural networks. The data consist of samples of 26 characters and 10 digits written by 11 writers on a tablet PC. The characters (in standard UNIPEN format) are written in both upper and lower case, and there are two complete sets of characters per writer. Each sample must be assigned to one of 35 classes. The ultimate objective is to build a writer-independent model for each character.
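One common way to check writer independence is to hold out all samples from one writer at a time, train on the rest, and test on the held-out writer. The Python sketch below shows only that data split. It is an assumption about how such an evaluation could be organized, not a description of the package; the function name `leave_one_writer_out`, the dictionary layout and the writer identifiers are hypothetical.

```python
def leave_one_writer_out(samples):
    """Yield (held_out_writer, train, test), holding out each writer exactly once."""
    writers = sorted({s["writer"] for s in samples})
    for held_out in writers:
        train = [s for s in samples if s["writer"] != held_out]
        test = [s for s in samples if s["writer"] == held_out]
        yield held_out, train, test


# Toy data: three hypothetical writers, two labeled samples each.
samples = [
    {"writer": w, "label": c}
    for w in ("w01", "w02", "w03")
    for c in ("a", "7")
]

folds = list(leave_one_writer_out(samples))
for held_out, train, test in folds:
    print(held_out, len(train), len(test))
```

Because no writer appears in both the training and the test part of a fold, the measured accuracy reflects performance on handwriting the model has never seen.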

The selection of valuable features is crucial in character recognition. We therefore adopt a new and meaningful set of features, the Uniform Differential Normalized Coordinates (UDNC), introduced by C. Agell. These features have been shown to improve the recognition rate with simple classification algorithms, so they are used to train a neural network, whose performance is then tested on the UJI Pen Characters Data Set.
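The exact definition of UDNC is not given on this page. As a rough illustration only, the Python sketch below follows one plausible reading of the name: resample a pen stroke at uniform arc-length intervals, take the differences between consecutive points, and normalize each difference to unit length. The function names and the choice of five resampled points are assumptions, and the result should not be taken as Agell's actual feature set.

```python
import math


def resample_uniform(points, n):
    """Resample a polyline to n >= 2 points equally spaced along its arc length."""
    cum = [0.0]  # cumulative arc length at each original point
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:  # degenerate stroke: a single dot
        return [points[0]] * n
    out, seg = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0.0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def differential_normalized(points, n=5):
    """Unit-length direction vectors between uniformly resampled points."""
    pts = resample_uniform(points, n)
    feats = []
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        dx, dy = x1 - x0, y1 - y0
        norm = math.hypot(dx, dy) or 1.0  # guard against repeated points
        feats.extend((dx / norm, dy / norm))
    return feats


# A hypothetical stroke: right, then along the second axis, drawn at two sizes.
small = differential_normalized([(0, 0), (4, 0), (4, 4)])
large = differential_normalized([(0, 0), (8, 0), (8, 8)])
```

Features built this way do not depend on where the character was written on the tablet or how large it was drawn, which is the kind of property that helps a simple classifier generalize across writers.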

Index Terms: Matlab, source, code, ocr, optical character recognition, scanned text, written text, ascii, isolated character.

 

 

 

 

 

Figure 1. Written text



A simple and effective source code for Optical Character Recognition.



Demo code (protected P-files) available for performance evaluation. Matlab Image Processing Toolbox and Matlab Neural Network Toolbox are required.

Release   Date         Major features
1.0       2010.01.25



We recommend checking that your connection to PayPal is secure, in order to avoid any fraud.
This donation is to be considered an encouragement to improve the code itself.

Optical Character Recognition System. To obtain the source code, a small payment is required: 200 euros (less than 280 U.S. dollars).

Once you have done this, please email us at luigi.rosa@tiscali.it
As soon as possible (within a few days) you will receive our new release of the Optical Character Recognition System.

Alternatively, you can pay by bank transfer using the following bank details:
Name: Luigi Rosa
Address: Via Pozzo Strada 5 10139 Torino Italy
Bank name: Poste Italiane
Bank address: Viale Europa 190 00144 Roma Italy
IBAN (International Bank Account Number): IT-50-V-07601-03600-000058177916
BIC (Bank Identifier Code): BPPIITRRXXX

The authors have no relationship or partnership with The Mathworks. All the code provided is written in the Matlab language (M-files and/or M-functions), with no DLLs or other protected parts of code (P-files or executables). The code was developed with Matlab 2006a. Matlab Image Processing Toolbox and Matlab Neural Network Toolbox are required. The code is provided "as is", without warranty of any kind. The authors disclaim any warranty concerning the code, as well as any responsibility for problems or damages that may be caused by the use of the code, including all parts of the source code.

 


