XY Cut Modular approach for Segmenting pages
Keywords:
OCR, document image, X-Y cut segmentation, over segmentation, under segmentation
Abstract
The purpose of this experimental research is to present an algorithm for reading the contents of a document image. Most of the information available in the world today is in printed form, which hinders storing, exchanging, and processing it electronically. Converting it from a printed to an electronic medium is both time-consuming and expensive. These factors have motivated the development of automated systems to perform the task. Optical Character Recognition (OCR) is an important technique that is popular among the research and technical communities, and as a result of this research and development activity several OCR applications are now available on the market. In this paper, we propose two separate, language-independent modules for determining the paragraphs and the lines of a document page.
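To make the idea of separate paragraph and line modules concrete, the following is a minimal sketch of projection-profile cutting, the basic operation behind X-Y cut segmentation. It is not the paper's implementation: the function names (`find_runs`, `project`, `cut`, `segment`), the gap thresholds, and the representation of the page as a list of rows of 0/1 pixels are all assumptions made for illustration. Because it only counts ink pixels, the sketch needs no knowledge of the script or language on the page.

```python
def find_runs(profile, min_gap=1):
    """Return (start, end) pairs (end exclusive) of inked runs in a profile.

    Runs separated by fewer than `min_gap` empty positions are merged.
    """
    runs = []
    start = None
    gap = 0
    for i, value in enumerate(profile):
        if value > 0:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # The last inked position was i - gap.
                runs.append((start, i - gap + 1))
                start = None
                gap = 0
    if start is not None:
        runs.append((start, len(profile) - gap))
    return runs


def project(page, box, axis):
    """Ink count per row (axis='y') or per column (axis='x') inside a box."""
    top, bottom, left, right = box
    if axis == "y":
        return [sum(page[y][left:right]) for y in range(top, bottom)]
    return [sum(page[y][x] for y in range(top, bottom))
            for x in range(left, right)]


def cut(page, box, axis, min_gap):
    """Split a box into sub-boxes at empty gaps of at least `min_gap`."""
    top, bottom, left, right = box
    boxes = []
    for start, end in find_runs(project(page, box, axis), min_gap):
        if axis == "y":
            boxes.append((top + start, top + end, left, right))
        else:
            boxes.append((top, bottom, left + start, left + end))
    return boxes


def segment(page, paragraph_gap=2):
    """Hypothetical two-stage pipeline: paragraphs first, then lines."""
    full = (0, len(page), 0, len(page[0]))
    result = []
    for paragraph in cut(page, full, "y", paragraph_gap):
        lines = cut(page, paragraph, "y", 1)  # any empty row ends a line
        result.append((paragraph, lines))
    return result


# Toy binary page: 1 = ink, 0 = background.
page = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],  # line 1 of paragraph 1
    [0, 0, 0, 0],
    [1, 1, 0, 0],  # line 2 of paragraph 1
    [0, 0, 0, 0],
    [0, 0, 0, 0],  # two empty rows: paragraph break
    [0, 1, 1, 1],  # line 1 of paragraph 2
]
layout = segment(page)
```

Boxes are `(top, bottom, left, right)` with exclusive upper bounds. In this sketch a threshold that is too small splits one paragraph into several (over segmentation), while one that is too large merges neighbouring paragraphs (under segmentation), so `paragraph_gap` would need tuning per document class. The same `cut` function with `axis="x"` splits a region at vertical white gaps, which is how columns would be separated.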
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.