"Much of pattern recognition theory and practice, including methods such as Support Vector Machines, has emerged in an attempt to solve the character recognition problem. This book is written by very well-known academics who have worked in the field for many years and have made significant and lasting contributions. The book will no doubt be of value to students and practitioners." -Sargur N. Srihari, SUNY Distinguished Professor, Department of Computer Science and Engineering, and Director, Center of Excellence for Document Analysis and Recognition (CEDAR), University at Buffalo, The State University of New York "The disciplines of optical character recognition and document image analysis have a history of more than forty years. In the last decade, the importance and popularity of these areas have grown enormously. Surprisingly, however, the field is not well covered by any textbook. This book has been written by prominent leaders in the field. It includes all important topics in optical character recognition and document analysis, and is written in a very coherent and comprehensive style. This book satisfies an urgent need. It is a volume the community has been awaiting for a long time, and I can enthusiastically recommend it to everybody working in the area." -Horst Bunke, Professor, Institute of Computer Science and Applied Mathematics (IAM), University of Bern, Switzerland In Character Recognition Systems, the authors provide practitioners and students with the fundamental principles and state-of-the-art computational methods of reading printed texts and handwritten materials. The information presented is analogous to the stages of a computer recognition system, helping readers master the theory and latest methodologies used in character recognition in a meaningful way. 
This book covers:
* Perspectives on the history, applications, and evolution of Optical Character Recognition (OCR)
* The most widely used pre-processing techniques, as well as methods for extracting character contours and skeletons
* Evaluating extracted features, both structural and statistical
* Modern classification methods that are successful in character recognition, including statistical methods, Artificial Neural Networks (ANN), Support Vector Machines (SVM), structural methods, and multi-classifier methods
* An overview of word and string recognition methods and techniques
* Case studies that illustrate practical applications, with descriptions of the methods and theories behind the experimental results
Each chapter presents the major steps and tricks for handling the tasks at hand. Researchers and graduate students in computer science and engineering will find this book useful for designing a concrete system in OCR technology, while practitioners will rely on it as a valuable resource for the latest advances and modern technologies that aren't covered elsewhere in a single book.
List of Figures.
List of Tables.
Preface.
Acknowledgments.
Acronyms.
1. Introduction: Character Recognition, Evolution and Development.
1.1 Generation and Recognition of Characters.
1.2 History of OCR.
1.3 Development of New Techniques.
1.4 Recent Trends and Movements.
1.5 Organization of the Remaining Chapters.
References.
2. Tools for Image Pre-Processing.
2.1 Generic Form Processing System.
2.2 A Stroke Model for Complex Background Elimination.
2.2.1 Global Gray Level Thresholding.
2.2.2 Local Gray Level Thresholding.
2.2.3 Local Feature Thresholding: Stroke Based Model.
2.2.4 Choosing the Most Efficient Character Extraction Method.
2.2.5 Cleaning up Form Items Using Stroke Based Model.
2.3 A Scale-Space Approach for Visual Data Extraction.
2.3.1 Image Regularization.
2.3.2 Data Extraction.
2.3.3 Concluding Remarks.
2.4 Data Pre-Processing.
2.4.1 Smoothing and Noise Removal.
2.4.2 Skew Detection and Correction.
2.4.3 Slant Correction.
2.4.4 Character Normalization.
2.4.5 Contour Tracing/Analysis.
2.4.6 Thinning.
2.5 Chapter Summary.
References.
3. Feature Extraction, Selection and Creation.
3.1 Feature Extraction.
3.1.1 Moments.
3.1.2 Histogram.
3.1.3 Direction Features.
3.1.4 Image Registration.
3.1.5 Hough Transform.
3.1.6 Line-Based Representation.
3.1.7 Fourier Descriptors.
3.1.8 Shape Approximation.
3.1.9 Topological Features.
3.1.10 Linear Transforms.
3.1.11 Kernels.
3.2 Feature Selection for Pattern Classification.
3.2.1 Review of Feature Selection Methods.
3.3 Feature Creation for Pattern Classification.
3.3.1 Categories of Feature Creation.
3.3.2 Review of Feature Creation Methods.
3.3.3 Future Trends.
3.4 Chapter Summary.
References.
4. Pattern Classification Methods.
4.1 Overview of Classification Methods.
4.2 Statistical Methods.
4.2.1 Bayes Decision Theory.
4.2.2 Parametric Methods.
4.2.3 Non-Parametric Methods.
4.3 Artificial Neural Networks.
4.3.1 Single-Layer Neural Network.
4.3.2 Multilayer Perceptron.
4.3.3 Radial Basis Function Network.
4.3.4 Polynomial Network.
4.3.5 Unsupervised Learning.
4.3.6 Learning Vector Quantization.
4.4 Support Vector Machines.
4.4.1 Maximal Margin Classifier.
4.4.2 Soft Margin and Kernels.
4.4.3 Implementation Issues.
4.5 Structural Pattern Recognition.
4.5.1 Attributed String Matching.
4.5.2 Attributed Graph Matching.
4.6 Combining Multiple Classifiers.
4.6.1 Problem Formulation.
4.6.2 Combining Discrete Outputs.
4.6.3 Combining Continuous Outputs.
4.6.4 Dynamic Classifier Selection.
4.6.5 Ensemble Generation.
4.7 A Concrete Example.
4.8 Chapter Summary.
References.
5. Word and String Recognition.
5.1 Introduction.
5.2 Character Segmentation.
5.2.1 Overview of Dissection Techniques.
5.2.2 Segmentation of Handwritten Digits.
5.3 Classification-Based String Recognition.
5.3.1 String Classification Model.
5.3.2 Classifier Design for String Recognition.
5.3.3 Search Strategies.
5.3.4 Strategies for Large Vocabulary.
5.4 HMM-Based Recognition.
5.4.1 Introduction to HMMs.
5.4.2 Theory and Implementation.
5.4.3 Application of HMMs to Text Recognition.
5.4.4 Implementation Issues.
5.4.5 Techniques for Improving HMMs Performance.
5.4.6 Summary to HMM-Based Recognition.
5.5 Holistic Methods For Handwritten Word Recognition.
5.5.1 Introduction to Holistic Methods.
5.5.2 Overview of Holistic Methods.
5.5.3 Summary to Holistic Methods.
5.6 Chapter Summary.
References.
6. Case Studies.
6.1 Automatically Generating Pattern Recognizers with Evolutionary Computation.
6.1.1 Motivation.
6.1.2 Introduction.
6.1.3 Hunters and Prey.
6.1.4 Genetic Algorithm.
6.1.5 Experiments.
6.1.6 Analysis.
6.1.7 Future Directions.
6.2 Offline Handwritten Chinese Character Recognition.
6.2.1 Related Works.
6.2.2 System Overview.
6.2.3 Character Normalization.
6.2.4 Direction Feature Extraction.
6.2.5 Classification Methods.
6.2.6 Experiments.
6.2.7 Concluding Remarks.
6.3 Segmentation and Recognition of Handwritten Dates on Canadian Bank Cheques.
6.3.1 Introduction.
6.3.2 System Architecture.
6.3.3 Date Image Segmentation.
6.3.4 Date Image Recognition.
6.3.5 Experimental Results.
6.3.6 Concluding Remarks.
References.