Analysis of Text Recognition with Data Mining Techniques
Keywords:
Feature Extraction, Recognition
Abstract
Text recognition extracts the text from a document image and outputs it in a desired file format (such as .doc or .txt). The process involves several steps: pre-processing, segmentation, feature extraction, classification, and post-processing. Pre-processing converts the grayscale input image into a binarized image and reduces the noise in it. The segmentation phase splits the image into text lines and then splits each line into individual characters. Feature extraction computes the characteristics of the document image. This paper describes techniques for converting the textual content of a paper document into a machine-readable format. It also analyzes and compares the technical challenges, methods, and performance of text detection and recognition studies in colour images.
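The first three stages named above can be illustrated with a minimal sketch. It is not the method of any study surveyed here: it assumes a fixed global threshold for binarization, projection profiles for line and character segmentation, and ink density as a single toy feature. The function names and the threshold value are hypothetical, and noise reduction, classification, and post-processing are omitted.

```python
# Illustrative sketch only: fixed-threshold binarization, projection-profile
# segmentation, and an ink-density feature. All names here are hypothetical.

def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 1 = ink, 0 = background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def runs(profile):
    """Return (start, end) spans of consecutive non-zero entries; end is exclusive."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def segment_lines(binary):
    """Find text lines as runs of rows that contain at least one ink pixel."""
    return runs([sum(row) for row in binary])


def segment_characters(binary, line):
    """Within one line, find characters as runs of columns that contain ink."""
    top, bottom = line
    width = len(binary[0])
    profile = [sum(binary[r][c] for r in range(top, bottom)) for c in range(width)]
    return runs(profile)


def ink_density(binary, line, char):
    """Toy feature: fraction of ink pixels inside a character's bounding box."""
    top, bottom = line
    left, right = char
    ink = sum(binary[r][c] for r in range(top, bottom) for c in range(left, right))
    return ink / ((bottom - top) * (right - left))


# A tiny 5x7 grayscale image: 255 = white background, 0 = black ink.
W, B = 255, 0
gray = [
    [W, W, W, W, W, W, W],
    [W, B, B, W, B, W, W],
    [W, B, W, W, B, W, W],
    [W, W, W, W, W, W, W],
    [W, W, B, B, B, W, W],
]
binary = binarize(gray)
for line in segment_lines(binary):
    for char in segment_characters(binary, line):
        print(line, char, ink_density(binary, line, char))
```

Projection profiles only separate lines and characters that do not touch or overlap, so connected or slanted handwriting would need a different segmentation method. A classifier would normally consume a richer feature vector than the single density value shown here.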
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.