A Paper on Preparation of Dataset for Handwritten Dzongkha Alphabets
Keywords:
Deep Learning, Convolutional Neural Networks, Dzongkha, Bhutanese dataset
Abstract
In this paper, we present the complete methodology for preparing a dataset of handwritten Dzongkha alphabets of Bhutan, intended to support the development of the Handwritten Dzongkha Alphabet Recognition System (HanDARS). The dataset consists of 30 classes, each representing one character of the Dzongkha language, with 500 images per class, for a total of 15,000 images. The images were collected manually from different individuals and then augmented to add variety to the dataset. The alphabet images were converted to binary format. This dataset can serve as a basis for further research and development in optical character recognition for the Dzongkha language. In the future, more handwritten samples need to be collected to introduce further variation into the dataset.
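As an illustration of the binarization step and the dataset arithmetic described above, the following is a minimal sketch in plain Python. It assumes a global threshold chosen by Otsu's method; the abstract does not state which binarization method was actually used, and the function names (`total_images`, `otsu_threshold`, `binarize`) as well as the flat list-of-pixels representation are hypothetical choices made only for this example.

```python
def total_images(num_classes, images_per_class):
    """Dataset size: one fixed-size group of images per character class."""
    return num_classes * images_per_class


def otsu_threshold(pixels):
    """Return the grey level t (0..255) that maximises between-class variance.

    `pixels` is a flat sequence of integer grey values in 0..255.
    Pixels <= t form one class and pixels > t form the other.
    This is an assumed method, not one confirmed by the abstract.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    weight_low = 0   # number of pixels with value <= t
    sum_low = 0      # sum of grey values of those pixels
    for t in range(256):
        weight_low += hist[t]
        sum_low += t * hist[t]
        if weight_low == 0:
            continue          # no pixels in the low class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break             # all pixels are in the low class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        # Between-class variance, up to a constant factor of 1 / total**2.
        score = weight_low * weight_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def binarize(pixels, threshold):
    """Map each grey value to 1 (above threshold) or 0 (at or below it)."""
    return [1 if p > threshold else 0 for p in pixels]


if __name__ == "__main__":
    # 30 character classes with 500 images each, as stated in the abstract.
    print(total_images(30, 500))

    # A toy "image": three dark pixels and three bright pixels.
    sample = [10, 12, 11, 200, 210, 205]
    t = otsu_threshold(sample)
    print(binarize(sample, t))
```

In practice a library routine would normally replace the hand-written threshold search; the pure-Python version is used here only to keep the example self-contained.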
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.