Document structure analysis algorithms: A literature survey
Song Mao, Azriel Rosenfeld, et al.
IS&T/SPIE Electronic Imaging 2003
Image segmentation is an important component of any document image analysis system. Many segmentation algorithms exist in the literature, but very few (i) allow users to specify the physical style, and (ii) incorporate user-specified style information into the objective function that the algorithm minimizes. We describe a segmentation algorithm that models a document's physical structure as a hierarchy in which each node describes a region of the document using a stochastic regular grammar. The exact form of the hierarchy and of the stochastic language is specified by the user, while the probabilities associated with the transitions are estimated from ground-truth data. We demonstrate the segmentation algorithm on images of bilingual dictionaries.
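The idea of estimating transition probabilities from ground-truth data and then scoring a candidate labeling can be illustrated with a small sketch. This is not the authors' implementation: it assumes the stochastic regular grammar can be simplified to a first-order chain over region labels, and the label names (`first`, `cont`), the boundary markers, and both function names are hypothetical choices made only for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical boundary markers for the start and end of a label sequence.
START, END = "<s>", "</s>"


def estimate_transitions(sequences):
    """Maximum-likelihood transition probabilities from labeled sequences.

    Each sequence is a list of region labels read top to bottom
    (an assumed representation, not one taken from the paper).
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        path = [START, *seq, END]
        for prev, nxt in zip(path, path[1:]):
            counts[prev][nxt] += 1

    probs = {}
    for prev, nxt_counts in counts.items():
        total = sum(nxt_counts.values())
        probs[prev] = {nxt: c / total for nxt, c in nxt_counts.items()}
    return probs


def neg_log_likelihood(seq, probs):
    """Cost of a candidate labeling; lower means a better fit to the style.

    Returns infinity when the labeling uses a transition never seen
    in the ground truth (no smoothing is applied in this sketch).
    """
    path = [START, *seq, END]
    cost = 0.0
    for prev, nxt in zip(path, path[1:]):
        p = probs.get(prev, {}).get(nxt, 0.0)
        if p == 0.0:
            return math.inf
        cost -= math.log(p)
    return cost


# Toy ground truth: one label sequence per dictionary column, where
# "first" marks the first line of an entry and "cont" a continuation line.
groundtruth = [
    ["first", "cont", "first", "cont", "cont"],
    ["first", "first", "cont"],
]
probs = estimate_transitions(groundtruth)

# Compare two candidate labelings of the same two-line region.
candidates = [["first", "cont"], ["cont", "first"]]
best = min(candidates, key=lambda seq: neg_log_likelihood(seq, probs))
print(best)
```

A segmentation algorithm in this spirit would search over candidate region labelings and keep the one with the lowest cost; the search itself and the hierarchy of grammars are beyond this sketch.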