Friedrich M. Wahl
Computer Graphics and Image Processing
The segmentation and classification of digitized printed documents into regions of text and images is a necessary first processing step in document analysis systems. It is shown that a constrained run length algorithm is well suited to partition most documents into areas of text lines, solid black lines, and rectangular boxes enclosing graphics and halftone images. During the processing these areas are labeled and meaningful features are calculated. By making use of the regular appearance of text lines as textured stripes, a linear adaptive classification scheme is constructed to discriminate text regions from others. © 1982 Academic Press, Inc.
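The abstract does not spell out the constrained run length algorithm, so the following is only a minimal sketch of run-length smoothing as it is commonly described: in a binary row (1 = black, 0 = white), every run of white pixels no longer than a threshold is filled with black, which merges characters into line-shaped blocks. The function names `smear_row` and `smear_image`, the thresholds `c`, `c_h`, and `c_v`, and the choice to combine a horizontal and a vertical pass with a logical AND are assumptions made for illustration, not details taken from the abstract.

```python
def smear_row(row, c):
    """Return a copy of `row` in which every run of 0s of length <= c is set to 1.

    `row` is a sequence of 0/1 pixels; `c` is a hypothetical smoothing threshold.
    Runs of 0s longer than c are left unchanged.
    """
    out = list(row)
    n = len(row)
    i = 0
    while i < n:
        if row[i] == 0:
            # Find the end of the current run of white pixels.
            j = i
            while j < n and row[j] == 0:
                j += 1
            # Fill the run only if it is short enough.
            if j - i <= c:
                for k in range(i, j):
                    out[k] = 1
            i = j
        else:
            i += 1
    return out


def smear_image(img, c_h, c_v):
    """Smooth rows with c_h and columns with c_v, then AND the two results.

    `img` is a list of equal-length rows of 0/1 pixels. Combining the two
    passes with AND is an assumption of this sketch.
    """
    horiz = [smear_row(r, c_h) for r in img]
    # Transpose, smooth each column as if it were a row, transpose back.
    cols = [smear_row(list(col), c_v) for col in zip(*img)]
    vert = [list(r) for r in zip(*cols)]
    return [[h & v for h, v in zip(hr, vr)] for hr, vr in zip(horiz, vert)]


if __name__ == "__main__":
    bits = [int(ch) for ch in "00010000010100001000000011000"]
    print("".join(str(b) for b in smear_row(bits, 4)))
```

Connected black blocks in the smoothed image can then be labeled and measured; the labeling and feature steps mentioned in the abstract are not shown here.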