Jehanzeb Mirza, Leonid Karlinsky, et al.
NeurIPS 2023
In this paper we describe a top-down clustering method consisting of an intra-class step and an inter-class step. In the intra-class step, all the samples for each category are initially divided into a small number of clusters; then the largest cluster is split and its members are reallocated. The largest cluster is chosen using a new concept, the "volume" of a cluster, which is a hybrid of two common existing splitting criteria: the number of members in a cluster and the variance of a cluster. In the inter-class step, recognition is run on the entire training set to assign the best radius to each prototype. The radii are used as normalizing factors in the computation of distance metrics. In our experiments we generated a prototype library by clustering characters written by Americans. When we used another training set, written by Japanese writers, only for tuning the radii of the American library, the recognition rate on the Japanese test set increased from 87.9% to 92.1%. The radii can be tuned even by OCR end users when the application domain is quite different from that of the initial clustering done by OCR developers.
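The two steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition of volume, the splitting rule, or the distance formula, so everything below is an assumption made for illustration: volume is taken as member count times variance, the split uses two far-apart seed points, and classification divides the Euclidean distance by the prototype's radius. The names `volume`, `split_largest`, and `classify` are hypothetical and do not come from the paper.

```python
from statistics import fmean


def centroid(points):
    """Component-wise mean of a list of equal-length points."""
    return [fmean(dim) for dim in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def variance(points):
    """Mean squared distance of the members from their centroid."""
    c = centroid(points)
    return fmean([sq_dist(p, c) for p in points])


def volume(points):
    # ASSUMPTION: the abstract only says volume is a hybrid of member
    # count and variance; a plain product is used here as a stand-in.
    return len(points) * variance(points)


def split_largest(clusters):
    """Split the cluster with the greatest volume and reallocate its members."""
    idx = max(range(len(clusters)), key=lambda i: volume(clusters[i]))
    target = clusters[idx]
    c = centroid(target)
    # ASSUMPTION: seed the two new clusters with two far-apart members.
    seed_a = max(target, key=lambda p: sq_dist(p, c))
    seed_b = max(target, key=lambda p: sq_dist(p, seed_a))
    part_a = [p for p in target if sq_dist(p, seed_a) <= sq_dist(p, seed_b)]
    part_b = [p for p in target if sq_dist(p, seed_a) > sq_dist(p, seed_b)]
    if not part_b:  # all members coincide, nothing to split
        return clusters
    return clusters[:idx] + clusters[idx + 1:] + [part_a, part_b]


def classify(sample, prototypes):
    """Return the label of the prototype with the smallest radius-normalized distance.

    prototypes is a list of (label, center, radius) tuples (hypothetical layout).
    """
    return min(prototypes, key=lambda pr: sq_dist(sample, pr[1]) ** 0.5 / pr[2])[0]


if __name__ == "__main__":
    clusters = [[(0, 0), (0, 1), (10, 0), (10, 1)], [(5, 5), (5, 6)]]
    print(split_largest(clusters))
    # Changing only the radius of prototype "b" changes the decision.
    print(classify((4, 0), [("a", (0, 0), 1.0), ("b", (10, 0), 1.0)]))
    print(classify((4, 0), [("a", (0, 0), 1.0), ("b", (10, 0), 4.0)]))
```

In this sketch, enlarging one prototype's radius is enough to change which label a sample receives while the prototype centers stay fixed, which mirrors the idea of tuning only the radii of an existing library on data from a new domain.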
Hagen Soltau, Lidia Mangu, et al.
ASRU 2011
Diganta Misra, Muawiz Chaudhary, et al.
CVPRW 2024
Hans-Werner Fink, Heinz Schmid, et al.
Journal of the Optical Society of America A: Optics and Image Science, and Vision