Image Analysis and Processing - State of the art: "Line detection in document images"



Note: the images in this project post are not showing, and I am currently looking at restoring them.


Students:
Benchaabane Zoulikha
Kapellas Nikolas



Introduction

In this paper we review the existing methods for detecting and separating text lines in document images. Although detecting and extracting printed text lines is considered by many an easy task, handwritten text line detection still presents a variety of unsolved difficulties. Due to the nature of handwriting, difficulties such as skew angles, differences in spacing, style and character size, hill-and-dale writing, orientation and alignment have to be treated in distinct steps, depending on the methodology followed. Text line extraction and segmentation does not have a universally accepted solution in the context of automatic document recognition systems, which is a result of the different approaches that researchers implement. For handwritten documents that may be old or damaged, unlike machine-printed ones, the complexity of the problem increases further. In general, we can divide text recognition into three sub-parts: a) printed text recognition, b) hand-printed text recognition, c) handwritten text recognition. Handwritten text recognition can be subdivided into: a) off-line recognition and b) on-line recognition. On-line recognition deals with real-time data processing (integrating pen movement and pressure information), whereas off-line recognition relies only on pixel information. These recognition systems are mainly based on the human reading model, which classifies and recognizes words by their shapes and often by their semantic understanding. Furthermore, handwritten text image processing has extensions into fields such as behavioral and neurological studies, which we find interesting. Applications may be built upon these extensions of line detection, serving a variety of purposes, from art to medicine and computer vision.
Many implementations are already running (industrial robots, autonomous vehicles, medical image analysis, typographical information and handwriting recognition, human-machine interaction), others remain to be improved, and many more are yet to come.

Keywords: Text line detection, line extraction, handwritten document, word and line segmentation, handwritten text, document image.


State of the art

1. Simultaneous detection of vertical and horizontal text lines based on perceptual organisation [1].

Defining the page of a document as a set of small components that are grouped into higher-level components such as lines and text blocks, the authors base their method on encoding local information by considering the properties that determine perceptual grouping (similarity, proximity, continuity of direction). Components are labelled according to their location and the location of their nearest neighboring connected component. According to a size criterion, the largest connected components are considered graphics and are excluded from the grouping process. The bounding boxes of the graphics connected components serve as text line borders that letters cannot straddle. Vertical and horizontal lines are detected without prior assumption about their direction and, in order to avoid line merging, characters that belong to or extend into different lines are discarded from the grouping process. After each step of the grouping process, conflict resolution rules are activated, which makes it possible to resolve the conflicts that may appear when text lines are simultaneously detected in different directions. Text line detection by grouping connected components (CCs) requires binarization of the original grey-level image to obtain the CCs.



Proposed binarization formula, where NP is the number of pixels in the image, m is the average grey value and k is a coefficient (k = -0.2).


2. Text line detection and segmentation techniques focused on the uneven skew angles and hill-and-dale writing [3].

This method is an improved version of an older one [2] and is separated into three major sub-tasks (segment estimation, text line detection and text line segmentation).
In segment estimation, the system splits the original document image into the necessary number of vertical zones of equal width in order to cope with the existence of different skew angles in the page or even in the same text line (hill-and-dale writing). The number of vertical zones is determined by the system and depends on the variety of skew in the page. Each zone is then represented by a histogram, which is the horizontal projection profile of the vertical zone after applying 5-point smoothing. This smoothing is done in order to suppress noise due to the irregularity of handwriting. The beginning of a text line is the point at which the histogram value rises above a specific limit; there is a different starting point for each text line. Similarly, there is a limit value for the ending of a text line. The area between the ending of the previous text line and the beginning of the next one is called the segment area. The line detection task proceeds by locating, within the segment areas, all the points that will form the segment between the text lines. The goal is to establish a segment between the text lines that includes as many white pixels as possible while dealing with black ones whenever it is impossible to avoid them (contours of writing). The line segmentation task prepares the system output according to the application: given the segment located by the previous task, it either draws the segment on the original image or copies each text line and saves it to a distinct image file.
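The segment estimation step can be illustrated with a minimal Python sketch. It assumes a binarized image stored as a list of rows (1 = ink, 0 = background). The function names, the truncated smoothing window at the borders and the two limit parameters are illustrative assumptions, not taken from the paper, which also chooses the number of zones automatically.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed encoding: 1 = ink (black) pixel, 0 = background


def split_into_zones(image: Image, n_zones: int) -> List[Image]:
    """Split the image into n_zones vertical strips of (almost) equal width."""
    width = len(image[0])
    bounds = [round(i * width / n_zones) for i in range(n_zones + 1)]
    return [[row[bounds[i]:bounds[i + 1]] for row in image] for i in range(n_zones)]


def projection_profile(zone: Image) -> List[int]:
    """Horizontal projection profile: number of ink pixels in each row."""
    return [sum(row) for row in zone]


def smooth5(profile: List[float]) -> List[float]:
    """5-point moving average; the window is truncated at the borders (assumption)."""
    smoothed = []
    for i in range(len(profile)):
        window = profile[max(0, i - 2):i + 3]
        smoothed.append(sum(window) / len(window))
    return smoothed


def text_runs(profile: List[float], start_limit: float,
              end_limit: float) -> List[Tuple[int, int]]:
    """Return (start, end) row ranges, end exclusive.

    A text line starts where the profile rises above start_limit and ends
    where it falls below end_limit (both limits are hypothetical parameters).
    """
    runs, start = [], None
    for y, value in enumerate(profile):
        if start is None and value > start_limit:
            start = y
        elif start is not None and value < end_limit:
            runs.append((start, y))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs
```

The rows between the end of one run and the start of the next correspond to what the text calls the segment area of that zone.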




Text line estimated area

3. Text Line Segmentation Based on Morphology and Histogram Projection [5].

This procedure was first proposed by Wu et al. [4] as an initial step in the process of extracting text lines from video images containing text information. The authors have adapted and improved this idea for the handwritten text line segmentation problem. The workflow of this method is divided into eight stages, described below. The feature extraction or binarization step is applied to the input image. Then, a Y histogram projection is obtained to detect the possible lines. Due to noise, a text line separation step is necessary. Once the false lines are found, they are excluded. After that, the line region recovery step is performed in order to recover some of the losses introduced by the preceding step. An X histogram projection applied to each detected line removes possible false words, mainly at the lateral edges of the page. Finally, the text line regions are obtained.
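A rough Python sketch of three of these stages (Y projection, false line exclusion, X projection) is given below. The summary above does not state how false lines are recognized, so the height rule in drop_false_lines and its min_ratio parameter are purely hypothetical; the morphological steps of the original method are not reproduced.

```python
from statistics import median
from typing import List, Optional, Tuple

Image = List[List[int]]  # assumed encoding: 1 = ink, 0 = background


def row_bands(image: Image) -> List[Tuple[int, int]]:
    """Y projection: maximal (top, bottom) row ranges, bottom exclusive, with ink."""
    bands, top = [], None
    for y, row in enumerate(image):
        has_ink = sum(row) > 0
        if has_ink and top is None:
            top = y
        elif not has_ink and top is not None:
            bands.append((top, y))
            top = None
    if top is not None:
        bands.append((top, len(image)))
    return bands


def drop_false_lines(bands: List[Tuple[int, int]],
                     min_ratio: float = 0.5) -> List[Tuple[int, int]]:
    """Hypothetical rule: discard bands much thinner than the median band height."""
    if not bands:
        return []
    typical = median(bottom - top for top, bottom in bands)
    return [(top, bottom) for top, bottom in bands
            if bottom - top >= min_ratio * typical]


def x_extent(image: Image, band: Tuple[int, int]) -> Optional[Tuple[int, int]]:
    """X projection of one band: first and last column (inclusive) containing ink."""
    top, bottom = band
    columns = [sum(image[y][x] for y in range(top, bottom))
               for x in range(len(image[0]))]
    inked = [x for x, count in enumerate(columns) if count > 0]
    return (inked[0], inked[-1]) if inked else None
```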



Y histogram projection with intersection between letters highlighted

4. Automatic Line Detection, using Hough-transform method [7].

This method is based on the Hough transform to extract line information from an image. This technique makes the separation of lines feasible in the case of a document corpus. The Hough transform is a global method for detecting edges. It transforms between the Cartesian space and a parameter space in which a straight line (or other boundary formulation) can be defined. An implementation was built in MATLAB following the steps described below: a) Select an image. b) Apply edge detection to the selected image using different gradient kernels (Sobel, Prewitt, Roberts), sub-pixel resolution, or other methods such as Canny or looking for zero crossings after filtering the image with a Laplacian of Gaussian filter. c) Perform the Hough transform on the detected edges. The user can specify the intended resolution for the resulting vote histogram. d) Extract plausible lines from the vote histogram matrix. The user can specify a vote threshold value that effectively controls the number of selected lines. e) Sample the detected line equations and plot the lines on the image.
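Steps c) and d) can be sketched in Python as follows. This is not the MATLAB implementation of [7]; it is a minimal illustration that assumes the edge pixels from step b) are already available as (x, y) pairs and uses the common normal parameterization rho = x*cos(theta) + y*sin(theta). The parameter names n_theta, rho_step and vote_threshold are illustrative.

```python
import math
from typing import Dict, Iterable, List, Tuple


def hough_lines(points: Iterable[Tuple[float, float]],
                n_theta: int = 180,
                rho_step: float = 1.0,
                vote_threshold: int = 3) -> List[Tuple[float, float, int]]:
    """Vote in (theta, rho) space and return lines with enough votes.

    Each result is (theta in degrees, rho, votes), sorted by decreasing votes.
    n_theta and rho_step set the resolution of the vote histogram.
    """
    votes: Dict[Tuple[int, int], int] = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            key = (t, round(rho / rho_step))  # quantize rho into a histogram bin
            votes[key] = votes.get(key, 0) + 1
    lines = [(180.0 * t / n_theta, r * rho_step, count)
             for (t, r), count in votes.items()
             if count >= vote_threshold]
    return sorted(lines, key=lambda line: -line[2])
```

For edge points lying on the horizontal line y = 5, the strongest cell is theta = 90 degrees, rho = 5. Raising vote_threshold reduces the number of returned lines, which is the control described in step d).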


5. A Hough transform based line recognition method utilizing both parameter space and image space [10].

This paper proposes a new method for recognizing straight-line segments. Even though it is based on the Hough Transform, the authors aim at overcoming the long-standing limitations of Hough Transform based methods, including their weakness in handling large-size images and their unawareness of line thickness. The proposed method makes two major contributions:
a) It proposes to utilize the image space throughout the whole recognition process. Several novel image-based techniques are introduced to make the process more efficient. The gradient prediction accelerates the Transform accumulation and helps eliminate randomly aligned noise. The boundary recorder greatly reduces the redundancy of line verification on large-size images. Erasing the pixels belonging to newly recognized lines effectively avoids overlapping lines. All these techniques work together to significantly speed up the whole recognition process for large-size images, while maintaining high detection accuracy, as confirmed by the experimental results, and b) The image-analysis-based line verification enables the proposed method to detect line thickness correctly, which is critical to many applications. Therefore, the success of the proposed method demonstrates that Hough Transform based methods can also be applied to large-size images, e.g., engineering drawings, if the image space is properly utilized.




Overall representation of the proposed HT based line recognition method

6. Text-Line Extraction using a Convolution of Isotropic Gaussian Filter with a Set of Line Filters [11].

Text line extraction is a key task in document analysis and one of the most important layout analysis steps in document image understanding systems. It is a challenging task, and its difficulty depends on writing styles, scripts, digitization methods and intensity values. Methods based on anisotropic Gaussian filtering and ridge detection have shown good results. This paper describes performance improvements to these techniques based on the use of a convolution of an isotropic Gaussian filter with line filters. These new filter banks are motivated by a matched filter approach to text lines and, in addition, require fewer operations to compute. The authors evaluated the performance of the new filter bank in combination with ridge detection on the public DFKI-I data-set (CBDAR 2007), which contains camera-captured document images, and demonstrated improvements in performance over previously proposed techniques.





Result of ridge detection of smoothed filter bank lines

8. Word base line detection in handwritten text recognition systems [6].

In this approach the authors analyze and try to overcome some problems concerning handwriting recognition, and more specifically slant (letters of words extending into another vertical line). Their method therefore focuses mainly on defining the baseline borders in handwritten cursive text. The authors note that in some other systems, even though measures are taken to calculate those parameters, problems in baseline detection persist, and those problems affect the quality of recognition and decrease the recognition rate. Unlike other methods, here the borders are found piecewise, from small pieces containing segmentation elements, and are defined as a set of linear functions. A further advantage of the proposed method is that separate borders are found for the top and bottom border lines.

9. Line And Word Segmentation of Handwritten Documents [8].

In this paper the authors present a methodology for segmenting a handwritten document into its distinct entities, namely text lines and words. Text line segmentation is achieved by applying the Hough Transform to a subset of the connected components of the document image. A post-processing step then corrects possible miscalculations, creates the text lines that the Transform failed to create and, finally, separates vertically connected characters using a novel method. The main novelties of the proposed approach consist of a) the extension of a previously published work for text line segmentation [9], taking into account an improved methodology for the separation of vertically connected text lines and b) a new word segmentation technique based on an efficient distinction of inter-word and intra-word distances.

10. A Block-Based Hough Transform Mapping for Text Line Detection in Handwritten Documents [9].


This text line detection method for unconstrained handwritten documents is based on a strategy that consists of three distinct steps. The first step includes pre-processing for image enhancement, connected component extraction and average character height estimation. In the second step, a block-based Hough transform is used for the detection of potential text lines, while a third step is used to correct possible false alarms. The performance of the proposed methodology is assessed with a consistent and concrete evaluation technique that relies on the comparison between the text line detection result and the corresponding ground truth annotation.
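The first step (connected component extraction and average character height estimation) can be illustrated with a short Python sketch. It assumes a binarized image given as a list of rows (1 = ink) and 8-connectivity, and it takes the mean bounding-box height as the character height estimate; the estimator actually used in [9] may differ.

```python
from collections import deque
from statistics import mean
from typing import List, Tuple

Image = List[List[int]]          # assumed encoding: 1 = ink, 0 = background
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), all inclusive


def connected_components(image: Image) -> List[Box]:
    """Return the bounding box of every 8-connected ink region (breadth-first search)."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    boxes: List[Box] = []
    for y0 in range(height):
        for x0 in range(width):
            if image[y0][x0] != 1 or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(y0, x0)])
            top, left, bottom, right = y0, x0, y0, x0
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def average_character_height(boxes: List[Box]) -> float:
    """Mean bounding-box height, used here as a simple character height estimate."""
    return mean(bottom - top + 1 for top, _, bottom, _ in boxes)
```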






Example showing connected components placed into bounding boxes [8], [9].

11. A Hough based algorithm for extracting text lines in handwritten documents [12].

The proposed method detects text lines on handwritten pages which may include lines oriented in several directions, erasures, or annotations between main lines. The method has a hypothesis-validation strategy which is iteratively activated until the end of the segmentation is reached. At each stage of the process, the best text-line hypothesis is generated in the Hough domain, taking into account the fluctuations of the text-line components. Afterwards, the validity of the line is checked in the image domain using a proximity criterion which analyses the context in which the hypothesized alignment is perceived. Ambiguous components belonging to several text lines are also marked. No assumption is made about the orientation or position of the text lines.


12. Text Line Extraction in Handwritten Document with Kalman Filter Applied on Low Resolution Image [13].

In this paper the authors present a method to extract text lines in handwritten documents. Line extraction is a first step of interest in document structure recognition. The proposed method is based on a notion of perceptive vision: at a certain distance, the text lines of a document can be seen as line segments. The authors therefore propose to detect text lines using a line segment extractor on low resolution images, and present an extractor based on the theory of Kalman filtering. This method makes it possible to deal with difficulties met in ancient damaged documents: skew, curved lines, overlapping text lines. Results are presented on archive documents from the 18th and 19th centuries.




Example from the 2.9% of miscalculated cases

13. Handwritten Chinese text line segmentation by clustering with distance metric learning [14].

Separating text lines in unconstrained handwritten documents remains a challenge because handwritten text lines are often non-uniformly skewed and curved, and the space between lines is not obvious. In this paper, the authors propose a novel text line segmentation algorithm based on minimal spanning tree (MST) clustering with distance metric learning. Given a distance metric, the connected components (CCs) of the document image are grouped into a tree structure, from which text lines are extracted by dynamically cutting the edges using a new hyper-volume reduction criterion and a straightness measure. By learning the distance metric in a supervised manner on a data-set of pairs of CCs, the proposed algorithm is made robust enough to handle various documents with multi-skewed and curved text lines. In experiments on a database of 803 unconstrained handwritten Chinese document images containing a total of 8,169 lines, the proposed algorithm achieved a correct rate of 98.02% in line detection and compared favorably to other competitive algorithms.
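The general idea of MST clustering can be sketched in Python as below. This is a simplified illustration and not the algorithm of [14]: the learned distance metric is replaced by a hand-set weighted Euclidean distance that penalizes vertical offsets (weights wx and wy are assumptions), and the hyper-volume reduction criterion and straightness measure are replaced by a plain edge-length threshold max_edge. Connected components are represented by their centroids.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) centroid of a connected component


def distance(a: Point, b: Point, wx: float = 1.0, wy: float = 4.0) -> float:
    """Hand-set weighted Euclidean distance (stand-in for a learned metric)."""
    return math.sqrt(wx * (a[0] - b[0]) ** 2 + wy * (a[1] - b[1]) ** 2)


def find(parent: List[int], i: int) -> int:
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def mst_edges(points: List[Point]) -> List[Tuple[float, int, int]]:
    """Kruskal's algorithm on the complete graph over the centroids."""
    parent = list(range(len(points)))
    candidates = sorted((distance(points[i], points[j]), i, j)
                        for i in range(len(points))
                        for j in range(i + 1, len(points)))
    tree = []
    for d, i, j in candidates:
        root_i, root_j = find(parent, i), find(parent, j)
        if root_i != root_j:
            parent[root_i] = root_j
            tree.append((d, i, j))
    return tree


def cut_into_lines(points: List[Point], max_edge: float) -> List[List[int]]:
    """Cut MST edges longer than max_edge; each remaining subtree is one text line."""
    parent = list(range(len(points)))
    for d, i, j in mst_edges(points):
        if d <= max_edge:
            parent[find(parent, i)] = find(parent, j)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(parent, i), []).append(i)
    return sorted(groups.values())
```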


Segmentation results with the proposed algorithm

14. Estimation of the handwritten text skew based on binary moments [15].

Binary moments represent one of the methods for text skew estimation in binary images. They have been used widely for skew identification in printed text. However, handwritten text consists of text objects that are characterized by different skews. Hence, the method has to be adapted for handwritten text. This is achieved by splitting the image into separate text objects delimited by bounding boxes. The obtained text objects represent isolated binary objects. Applying the moment-based method to each binary object evaluates its local text skew. Owing to their accuracy, the estimated skew data can be used as input to text line segmentation algorithms.
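A minimal Python sketch of moment-based orientation estimation for one binary object follows. It uses the standard second-order central moment formula theta = 0.5 * atan2(2*mu11, mu20 - mu02); whether [15] uses exactly this formulation is an assumption. The object is given as a list of (x, y) ink pixel coordinates, with y growing downwards as in image coordinates.

```python
import math
from typing import List, Tuple


def skew_angle_degrees(pixels: List[Tuple[float, float]]) -> float:
    """Orientation of one binary object from its second-order central moments.

    The angle is measured in image coordinates (y grows downwards).
    """
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n  # centroid
    cy = sum(y for _, y in pixels) / n
    mu20 = sum((x - cx) ** 2 for x, _ in pixels)
    mu02 = sum((y - cy) ** 2 for _, y in pixels)
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels)
    return math.degrees(0.5 * math.atan2(2 * mu11, mu20 - mu02))
```

Calling this function on the pixels inside each bounding box gives one local skew estimate per text object, in the spirit of the adaptation described above.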




Skew detection of the connected-component with the moment-based method


15. Handwritten document image segmentation into text lines and words [16].

A novel approach to extracting text lines and words from handwritten documents is presented here. The line segmentation algorithm is based on locating the optimal succession of text and gap areas within vertical zones by applying the Viterbi algorithm. Then, a text-line separator drawing technique is applied and finally the connected components are assigned to text lines. Word segmentation is based on a gap metric that exploits the objective function of a soft-margin linear SVM that separates successive connected components. The algorithms were tested on the benchmarking data-sets of the ICDAR07 handwriting segmentation contest and outperformed the participating algorithms.
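How the Viterbi algorithm can label the rows of one vertical zone as text or gap is sketched below in Python. The two-state model, the binary observation (1 if a row contains enough ink, else 0) and all probabilities are hypothetical choices for illustration; the model and parameters of [16] are not given in this summary.

```python
import math
from typing import List

STATES = (0, 1)  # 0 = gap, 1 = text


def viterbi(observations: List[int],
            p_stay: float = 0.9,
            p_ink_given_text: float = 0.8,
            p_ink_given_gap: float = 0.1) -> List[int]:
    """Most likely text/gap state sequence for the rows of one vertical zone."""
    def emit(state: int, obs: int) -> float:
        p_ink = p_ink_given_text if state == 1 else p_ink_given_gap
        return math.log(p_ink if obs == 1 else 1.0 - p_ink)

    def trans(a: int, b: int) -> float:
        return math.log(p_stay if a == b else 1.0 - p_stay)

    if not observations:
        return []
    # Log-probability of the best path ending in each state (uniform prior).
    score = {s: math.log(0.5) + emit(s, observations[0]) for s in STATES}
    back = []
    for obs in observations[1:]:
        prev, score, pointers = score, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + trans(p, s))
            score[s] = prev[best] + trans(best, s) + emit(s, obs)
            pointers[s] = best
        back.append(pointers)
    # Backtrack from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

With these parameters a single isolated inked row inside a long gap is absorbed into the gap, while a sustained run of inked rows is labelled as text.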


16. Line Separation for Complex Document Images Using Fuzzy Runlength [17].

A new text line location and separation algorithm for complex handwritten documents is proposed. The algorithm is based on the application of a fuzzy directional runlength. The proposed technique was tested on a variety of complex handwritten document images including postal parcel images and historical handwritten documents such as Newton’s and Galileo’s manuscripts. Preliminary testing showed a success rate of 93% on the test set.



Extracted text line pattern superimposed on top of Galileo’s manuscript. Grouped line components will give the locations of the text lines.


17. Detection and separation of lines connected in multi-oriented documents [18].

In this paper, the authors present an original approach for multi-oriented text line detection and separation in handwritten Arabic documents. Due to the multi-orientation, they use an image paving that allows them to determine the lines progressively and locally. At first, multi-sloped areas are detected using an automatic meshing of the document. The snake method is used for line extraction. Then the orientation is estimated, corrected and extended to find all the local orientations using the Wigner-Ville distribution on the histogram projection profile. This orientation is then enlarged to limit the orientation in the neighborhood. Afterwards, the lines are retrieved based on the orientation and the baselines of each window. Finally, connected adjacent lines are separated using statistical information on the morphology of Arabic terminal letters. The proposed approach was tested on 100 documents, reaching an accuracy of about 98.6%, which according to the authors shows its efficiency. The following figure illustrates the results of the proposed algorithm on a sample of 3 documents arbitrarily chosen among the 100 documents processed. To identify the lines, each pair of consecutive lines is shown in two different colors.




Sample proposed method results


18. Natural Language Inspired Approach for Handwritten Text Line Detection in Legacy Documents [19].

Handwritten text transcription is becoming an increasingly important task, in order to provide historians and other researchers with new ways of indexing, consulting and querying the huge amounts of historic handwritten documents which are being published in on-line digital libraries. Document layout analysis is an important task needed for handwritten text recognition, among other applications. The text layout commonly found in handwritten legacy documents takes the form of one or more paragraphs composed of parallel text lines. This paper presents a new approach for text line detection that uses a statistical framework similar to that already employed in many cases. It avoids the traditional heuristic approaches usually adopted for this task. The accuracy of this approach is similar to or better than that of current state of the art solutions found in the literature. The authors comment that the baselines detected by their approach are of better quality (visually closer to the actual line) than those of other current methods.





Image shows the difference between proposed method and the histogram projection method



Summary

In conclusion, even though the literature in this domain counts many papers, problems still exist in text line detection in document images. The presented methodologies focus on different parts of line extraction, approaching the subject from different scientific angles and using the tools that best suit their purpose. Some methods are based on encoding local information by considering properties that determine perceptual grouping [1], while others [2], [3] segment the document into vertical zones to handle varying skew angles and hill-and-dale writing, and project each zone into a histogram. A histogram-based methodology is also applied in the methods proposed in [4], [5]. Another popular global method is the Hough Transform. This technique transforms between Cartesian space and a parameter space in which a straight line boundary can be defined. The presented methods based on the Hough Transform [7], [8], [9], [10], [12] vary from one another. The authors of [10] claim that HT methods can be applied to large-size images, e.g., engineering drawings, if the image space is properly utilized, and the authors of [8] have proposed a novelty in HT-based segmentation, taking into account an improved methodology for the separation of vertically connected text lines and introducing a new word segmentation technique based on an efficient distinction of inter-word and intra-word distances. Another proposed method [11] implements line extraction using a convolution of an isotropic Gaussian filter with a set of line filters. Its authors claim improved performance compared to other methods that use anisotropic Gaussian filtering and ridge detection. Filtering techniques also include [13], a method based on the theory of Kalman filtering for line extraction. Its authors build on a notion of perceptive vision, making the hypothesis that, at a certain distance, the text lines of a document can be seen as line segments.
Also, one of the presented methods [17] makes use of fuzzy theory techniques to determine text regions. This method uses a fuzzy directional runlength to address the robustness issues of some other approaches. Method [6] addresses problems that some other methods have with the calculation of line borders in handwritten cursive text, which have an impact on the quality of recognition. In this method, borders are found piecewise, from small pieces containing segmentation elements, and are defined as a set of linear functions. Another method [14] introduces a novel text line segmentation algorithm based on minimal spanning tree clustering with distance metric learning, analysing Chinese document images. In method [16], the Viterbi algorithm is applied for line segmentation in order to locate the optimal succession of text and gap areas. Its algorithms were tested successfully on the benchmarking datasets of the ICDAR07 handwriting segmentation contest and outperformed the participating algorithms. Another method described in this document [15] uses binary moments for text skew estimation in handwritten documents. According to its authors, the procedure is accurate enough for the estimated data to be used as input to text line extraction algorithms.

References

  1. Faure C., Vincent N.: "Simultaneous detection of vertical and horizontal text lines based on perceptual organisation".
  2. Kavallieratou E., Fakotakis N., Kokkinakis G.: "An unconstrained handwriting recognition system", International Journal on Document Analysis and Recognition, 4(4):226–242, 2002.
  3. Kavallieratou E., Daskas F.: "Text Line Detection and Segmentation: Uneven Skew Angles and Hill-and-Dale Writing", Journal of Universal Computer Science, vol. 17, no. 1 (2011).
  4. J.C. Wu, J.W. Hsieh, Y.S. Chen: "Morphology-based text line extraction", Machine Vision and Applications, 2008, pp. 195-207.
  5. Rodolfo P. dos Santos, Gabriela S. Clemente, Tsang Ing Ren and George D.C. Cavalcanti: "Text Line Segmentation Based on Morphology and Histogram Projection", 10th International Conference on Document Analysis and Recognition, 2010.
  6. Kamil R. Aida-zade and Jamaladdin Z. Hasanov: "Word base line detection in handwritten text recognition systems", International Journal of Electrical and Computer Engineering 4:5 2009.
  7. Ghassan H., Karin A., Rafeef Abu-G.: "Automatic Line Detection", Project Report for the Computer Vision Course, September 1999.
  8. Louloudis G., Gatos B., Pratikakis I., Halatsis C.: "Line And Word Segmentation of Handwritten Documents", 2003.
  9. Louloudis G., Halatsis K., Gatos B., Pratikakis I.: "A Block-Based Hough Transform Mapping for Text Line Detection in Handwritten Documents", 10th International Workshop on Frontiers in Handwriting Recognition (IWFHR 2006), La Baule, France, October 2006, pp. 515-520.
  10. Jiqiang S., Michael R. L.: "A Hough Transform based line recognition method utilizing both parameter space and image space", Pattern Recognition 38 (2005) 539–552.
  11. Bukhari S. S., Shafait F., Breuel T. M.: "Text-Line Extraction using a Convolution of Isotropic Gaussian Filter with a Set of Line Filters".
  12. Likforman-Sulem L., Hanimyan A., Faure C.: "A Hough Based Algorithm for Extracting Text Lines in Handwritten Documents", Proc. Inter. Conf. on Document Analysis and Recognition, ICDAR'95, Montréal, Canada, 774-777 (1995).
  13. Lemaitre A., Camillerapp J.: "Text Line Extraction in Handwritten Document with Kalman Filter Applied on Low Resolution Image", Proc. of DIAL'06, the Second International Conference on Document Image Analysis for Libraries, 38-45 (2006).
  14. Fei Yin, Cheng-Lin Liu: "Handwritten Chinese text line segmentation by clustering with distance metric learning", Pattern Recognition 42 (2009) 3146–3157.
  15. Brodic D., Milivojevic N. Z.: "Estimation of the handwritten text skew based on binary moments".
  16. Papavassiliou V., Stafylakis T., Katsouros V., Carayannis G.: "Handwritten document image segmentation into text lines and words", Pattern Recognition 43 (2010) 369–377.
  17. Zhixin Shi and Venu Govindaraju: "Line Separation for Complex Document Images Using Fuzzy Runlength".
  18. Nazih O., Abdel B.: "Detection and separation of lines connected in multi-oriented documents", University Nancy 2, LORIA, team READ. 2010.
  19. V. B. Campos, A. H. Toselli, E. Vidal: "Natural Language Inspired Approach for Handwritten Text Line Detection in Legacy Documents". Valencia, Spain. 2010.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License