Russian manuscripts clustering based on the feature relation graph (FRG)

  • Владислав Александрович Павлов, SPbSU, St. Petersburg, Russia
  • Полина Сергеевна Дюрдева, SPbSU, St. Petersburg, Russia
  • Дмитрий Сергеевич Шалымов, SPbSU, St. Petersburg, Russia
Keywords: Russian manuscripts, clustering, feature relation graph, Gabor filter

Abstract

Clustering manuscripts has become increasingly important as the number of digitized documents grows rapidly. To address this problem, we investigate a new metric for comparing handwriting that is based on the Feature Relation Graph (FRG). This metric has previously shown good results in text-independent writer identification for Persian manuscripts. Features based on local templates are extracted from the manuscripts using Gabor and XGabor filters. We study how effective the most popular clustering algorithms are when applied to Russian manuscripts in the phase space of the FRG. The paper presents numerical experiments that demonstrate the effectiveness of the proposed metric and reports the results obtained with the different clustering algorithms.
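
As an illustration of the filtering step mentioned in the abstract, the following is a minimal sketch of a small Gabor filter bank in Python with NumPy. It uses the standard textbook form of the real Gabor kernel and is not taken from the paper: the function names (`gabor_kernel`, `gabor_bank`, `filter_energy`), the parameter values, and the use of mean absolute response as a feature are all assumptions made for the example. The XGabor variant and the FRG construction are not reproduced here.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Real part of a 2-D Gabor kernel (textbook form, hypothetical helper).

    size is assumed to be odd so that the kernel has a central pixel.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates so the sinusoidal carrier runs along direction theta.
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    y_rot = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_rot**2 + (gamma * y_rot)**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * x_rot / wavelength + psi)
    return envelope * carrier

def gabor_bank(size=15, wavelength=6.0, sigma=3.0, n_orientations=4):
    """Kernels at evenly spaced orientations in [0, pi). Values are assumed."""
    thetas = [k * np.pi / n_orientations for k in range(n_orientations)]
    return [gabor_kernel(size, wavelength, t, sigma) for t in thetas]

def filter_energy(image, kernel):
    """Mean absolute response of the kernel over all fully covered positions.

    This is a plain cross-correlation (no kernel flip), used here as one
    simple scalar feature per filter.
    """
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    responses = np.einsum("ijkl,kl->ij", windows, kernel)
    return float(np.mean(np.abs(responses)))

if __name__ == "__main__":
    # Synthetic test image: stripes that vary along the horizontal axis.
    cols = np.arange(64)
    image = np.tile(np.cos(2.0 * np.pi * cols / 6.0), (64, 1))
    for k, kernel in enumerate(gabor_bank()):
        print(k, filter_energy(image, kernel))
```

On the synthetic striped image, the kernel whose carrier direction matches the stripes should give a much larger energy than the orthogonal one. A vector of such energies, one per orientation and scale, is one plausible way to turn filter responses into features that a clustering algorithm can compare.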

Author Biographies

Владислав Александрович Павлов, SPbSU, St. Petersburg, Russia

Vladislav A. Pavlov

Полина Сергеевна Дюрдева, SPbSU, St. Petersburg, Russia

Polina S. Durdeva

Дмитрий Сергеевич Шалымов, SPbSU, St. Petersburg, Russia

Dimitriy S. Shalymov  

Published
2016-02-29
How to Cite
Павлов, В. А., Дюрдева, П. С., & Шалымов, Д. С. (2016). Russian manuscripts clustering based on the feature relation graph (FRG). Computer Tools in Education, (1), 24-35. Retrieved from http://cte.eltech.ru/ojs/index.php/kio/article/view/1387
Section
Informational systems