FPGA-based coprocessor for text string extraction
N.K. Ratha, A.K. Jain, et al.
Workshop CAMP 2000
The Pen Technologies group at IBM Research has been investigating methods for retrieving handwritten documents in response to user queries. This paper examines the use of typed and handwritten queries to retrieve relevant handwritten documents. The IBM handwriting recognition engine was used to generate N-best lists for the words in each of 108 short documents. These N-best lists are concise statistical representations of the handwritten words. They make the retrieval methods robust to machine transcription errors, allowing retrieval of documents that a traditional transcription-based retrieval system would miss. Our experimental results demonstrate that significant improvements in retrieval performance can be achieved over standard keyword text searching of machine-transcribed documents. We have also developed a software architecture for a multimedia document retrieval framework into which machine learning algorithms for feature extraction and matching can be easily integrated. The framework provides a "plug-and-play" mechanism for integrating new media types, new feature extraction methods, and new document types.
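To illustrate why N-best lists can recover documents that a top-1 transcription would miss, here is a minimal sketch in Python. It is not the paper's method: the scoring rule (summing candidate scores per term), the function names, and the example scores are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical data model: a document is a list of word positions, and each
# position holds an N-best list of (candidate transcription, score) pairs.
# The scores below are invented; a real recognizer would supply them.

def term_weights(doc):
    """Sum candidate scores per term over all word positions in a document."""
    weights = defaultdict(float)
    for nbest in doc:
        for candidate, score in nbest:
            weights[candidate.lower()] += score
    return dict(weights)

def nbest_score(query, doc):
    """Score a typed query against a document using every N-best candidate."""
    weights = term_weights(doc)
    return sum(weights.get(term.lower(), 0.0) for term in query.split())

def top1_text(doc):
    """Baseline: keep only the highest-scoring candidate at each position."""
    return [max(nbest, key=lambda pair: pair[1])[0].lower() for nbest in doc]

# The second word was written as "budget", but the (assumed) recognizer
# ranks "bucket" first, so a top-1 transcription loses the correct word.
doc = [
    [("meeting", 0.7), ("melting", 0.3)],
    [("bucket", 0.5), ("budget", 0.4), ("budged", 0.1)],
]

keyword_hit = "budget" in top1_text(doc)   # plain keyword search on top-1 text
soft_score = nbest_score("budget", doc)    # N-best based score
```

In this toy case the keyword search over the top-1 transcription finds no match for "budget", while the N-best score is nonzero because the correct word survives as a lower-ranked candidate.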