Multiscale Page Segmentation using Wavelet Packet Analysis

Przemysław Górecki, Laura Caponetti, Ciro Castiello

Abstract


This paper proposes a novel method for document page segmentation based on wavelet packet analysis. To discriminate between text and non-text regions, the image is represented by a wavelet packet analysis tree, which has a quadtree structure. A feature image is then introduced to synthesize the information carried by a subset of nodes selected from the quadtree. The most discriminant nodes are identified using an optimality criterion and a genetic algorithm. Finally, the selected feature image is segmented by Fuzzy C-Means clustering. The approach provides good segmentation results and is shown to be invariant to page skew and font variations.
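The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar filter, the feature image (summed magnitudes of the first-level detail subbands), the function names, and the deterministic cluster initialization are all assumptions made for illustration, and the genetic-algorithm node selection is omitted entirely.

```python
# Illustrative sketch only: Haar filter, feature definition and all names
# below are assumptions, not taken from the paper.

def haar_split(img):
    """Split a node (even-sized 2D list) into its four child subbands."""
    bands = {"approx": [], "col_diff": [], "row_diff": [], "diag": []}
    for r in range(0, len(img), 2):
        out = {name: [] for name in bands}
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            out["approx"].append((a + b + d + e) / 2.0)
            out["col_diff"].append((a - b + d - e) / 2.0)
            out["row_diff"].append((a + b - d - e) / 2.0)
            out["diag"].append((a - b - d + e) / 2.0)
        for name in bands:
            bands[name].append(out[name])
    return bands

def packet_tree(img, depth):
    """Wavelet packet quadtree: every node is split again, not only 'approx'."""
    node = {"data": img, "children": {}}
    if depth > 0:
        for name, band in haar_split(img).items():
            node["children"][name] = packet_tree(band, depth - 1)
    return node

def feature_image(bands):
    """Assumed feature: summed magnitude of the three detail subbands."""
    rows, cols = len(bands["approx"]), len(bands["approx"][0])
    return [[sum(abs(bands[n][r][c]) for n in ("col_diff", "row_diff", "diag"))
             for c in range(cols)] for r in range(rows)]

def fuzzy_c_means(values, n_clusters=2, m=2.0, n_iter=50):
    """Fuzzy C-Means on scalar features; centers start evenly spaced."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * k / (n_clusters - 1) for k in range(n_clusters)]
    memberships = []
    for _ in range(n_iter):
        memberships = []
        for v in values:
            # Small epsilon avoids division by zero when v equals a center.
            inv = [(abs(v - c) + 1e-12) ** (-2.0 / (m - 1.0)) for c in centers]
            total = sum(inv)
            memberships.append([w / total for w in inv])
        # Each center is the membership-weighted mean of the values.
        centers = [
            sum((u[k] ** m) * v for u, v in zip(memberships, values))
            / sum(u[k] ** m for u in memberships)
            for k in range(n_clusters)
        ]
    return centers, memberships

def segment(img):
    """Label each feature-image cell with its highest-membership cluster."""
    feat = feature_image(haar_split(img))
    flat = [v for row in feat for v in row]
    _, memberships = fuzzy_c_means(flat)
    labels = [max(range(len(u)), key=u.__getitem__) for u in memberships]
    width = len(feat[0])
    return [labels[i:i + width] for i in range(0, len(labels), width)]

# Synthetic page: flat left half, fine checkerboard "texture" on the right.
image = [[(r + c) % 2 if c >= 4 else 0 for c in range(8)] for r in range(8)]
labels = segment(image)
```

In this sketch `packet_tree(image, 2)` builds the quadtree from which a node subset would be chosen, while `segment` uses only the first-level children. Because the initial centers are sorted, cluster 0 ends up with the low-energy (flat) cells and cluster 1 with the high-energy (textured) cells.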

DOI: 10.1685/CSC06090







Except where otherwise noted, content on this site is licensed under a Creative Commons 2.5 Italy License.