Accurate line segmentation is one of the most important and difficult tasks in text document analysis, particularly when the document consists of unconstrained handwritten text. To address this task, a painting scheme is proposed in this research work. Because handwritten Persian texts pose the most critical challenges for text-line segmentation, the new method was devised through an extensive study of cursive Persian script; nevertheless, the proposed line segmentation algorithm is applicable to handwritten text in any language or script. The text block is decomposed vertically into parallel pipe-like structures called strips. Each row in each strip is painted with a gray intensity equal to the average gray value of all pixels in that row-strip. The painted strips are then converted into a two-tone painting, which is smoothed. The white and black spaces in each strip of the smoothed image are analyzed to obtain a short line of separation, termed a Piece-wise Potential Separating Line (PPSL), between two consecutive black spaces. The PPSLs are concatenated to produce the segmentation of text lines. Additional procedures are included to handle certain anomalies that may occur. The scheme is validated by extensive experimentation. The proposed algorithm was tested on 52 pages of Persian text documents containing a total of 823 lines, and a correct line segmentation rate of 92.35% was achieved. The algorithm was also tested on two further datasets of 152 and 200 handwritten text pages in different languages. Comparison with various approaches presented in the recent literature demonstrated the efficiency and script independence of the proposed algorithm.
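The painting steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed `threshold`, the gap-filling rule used for smoothing, and the choice of the middle row of each white space as the PPSL position are all assumptions made for illustration. The abstract does not specify these details, and the concatenation of PPSLs and the anomaly-handling procedures are omitted.

```python
def paint_strips(image, strip_width):
    """Paint each row of each vertical strip with the mean gray value of that row-strip.

    `image` is a list of rows of gray values (0 = ink, 255 = paper).
    Returns one painted column (a list of per-row averages) per strip.
    """
    width = len(image[0])
    painted = []
    for left in range(0, width, strip_width):
        right = min(left + strip_width, width)
        painted.append([sum(row[left:right]) / (right - left) for row in image])
    return painted


def runs(column):
    """Split a two-tone column into maximal runs: (is_black, start, end_exclusive)."""
    found, start = [], 0
    for i in range(1, len(column) + 1):
        if i == len(column) or column[i] != column[start]:
            found.append((column[start], start, i))
            start = i
    return found


def fill_short_gaps(column, min_gap):
    """Assumed smoothing rule: white gaps shorter than min_gap between black runs turn black."""
    smoothed = list(column)
    all_runs = runs(column)
    for idx, (is_black, start, end) in enumerate(all_runs):
        interior = 0 < idx < len(all_runs) - 1
        if not is_black and interior and end - start < min_gap:
            smoothed[start:end] = [True] * (end - start)
    return smoothed


def separator_rows(column):
    """Middle row of every white space lying between two consecutive black spaces."""
    all_runs = runs(column)
    return [
        (start + end - 1) // 2
        for idx, (is_black, start, end) in enumerate(all_runs)
        if not is_black and 0 < idx < len(all_runs) - 1
    ]


def piecewise_separators(image, strip_width, threshold=128, min_gap=2):
    """Return, for each strip, the rows at which a piece-wise separator would be placed."""
    result = []
    for column in paint_strips(image, strip_width):
        two_tone = [value < threshold for value in column]  # True = black
        result.append(separator_rows(fill_short_gaps(two_tone, min_gap)))
    return result


# Toy page: two "text lines" (rows 1-2 and rows 6-7) on a 9 x 4 white background.
INK, PAPER = 0, 255
toy_page = [[INK if r in (1, 2, 6, 7) else PAPER] * 4 for r in range(9)]
ppsl_rows = piecewise_separators(toy_page, strip_width=2)
print(ppsl_rows)  # [[4], [4]]
```

On the toy page, each of the two strips yields a single separator at row 4, midway between the two inked bands. In a real page the separators in neighbouring strips would generally sit at different rows, which is why the method joins them piece by piece rather than drawing one straight line across the page.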
Journal article
Piece-wise painting technique for line segmentation of unconstrained handwritten text: a specific study with Persian text documents
Pattern Analysis and Applications, Vol.14(4), pp.381-394
2011
Details
- Title
- Piece-wise painting technique for line segmentation of unconstrained handwritten text: a specific study with Persian text documents
- Creators
- Ali Reza Alaei - University of Mysore; P Nagabhushan - University of Mysore; Umapada Pal - Indian Statistical Institute
- Publication Details
- Pattern Analysis and Applications, Vol.14(4), pp.381-394
- Identifiers
- 1984; 991012821104502368
- Academic Unit
- Information Technology; School of Business and Tourism; Faculty of Business, Law and Arts; Faculty of Science and Engineering
- Resource Type
- Journal article