Messanae Universitas Studiorum

Adaptive web data extraction policies

Fiumara, Giacomo and Marchi, Massimo and Provetti, Alessandro (2008) Adaptive web data extraction policies. Atti della Accademia Peloritana dei Pericolanti - Classe di Scienze MM.FF.NN., LXXXVI (2). ISSN 1825-1242


    Abstract

    Web data extraction involves, among other things, routinely accessing and downloading data from continuously updated dynamic Web pages. There is a significant trade-off between the rate at which external Web sites are accessed and the computational burden on the accessing client. We address the problem by proposing a predictive model, of a kind typical of the Operating Systems literature, for the rate of update of each Web source. The model has been implemented in a new version of the Dynamo project, a middleware that assists in generating informative RSS feeds from traditional HTML Web sites. To be effective (that is, to keep RSS feeds timely and informative) and scalable, Dynamo needs careful tuning and customization of its polling policies, which are described in detail.
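
The abstract does not say which predictive model the paper uses. As an illustration only, the sketch below uses exponential averaging, a predictor commonly presented in operating-systems textbooks for estimating the next CPU burst, and applies it to the interval between updates of a Web source. The class name `UpdatePredictor`, the smoothing factor `alpha`, the initial estimate, and the delay bounds are all assumptions made for this example and are not taken from Dynamo.

```python
from dataclasses import dataclass


@dataclass
class UpdatePredictor:
    """Exponential-averaging estimate of the time between updates of one source.

    Hypothetical helper: the names and default values are assumptions,
    not Dynamo's actual implementation.
    """
    alpha: float = 0.5        # weight given to the newest observation (assumed)
    estimate: float = 3600.0  # initial guess in seconds (assumed)

    def observe(self, interval: float) -> float:
        """Fold one observed inter-update interval (seconds) into the estimate."""
        self.estimate = self.alpha * interval + (1.0 - self.alpha) * self.estimate
        return self.estimate

    def next_poll_delay(self, min_delay: float = 60.0,
                        max_delay: float = 86400.0) -> float:
        """Clamp the estimate so a source is polled neither too often nor too rarely."""
        return max(min_delay, min(self.estimate, max_delay))


if __name__ == "__main__":
    predictor = UpdatePredictor()
    # A source that starts updating every 30 minutes pulls the estimate down.
    for observed in (1800.0, 1800.0, 1800.0):
        predictor.observe(observed)
    print(predictor.next_poll_delay())
```

With a predictor of this kind, a larger `alpha` makes the polling delay react faster to recent changes in a source's behaviour, while a smaller `alpha` smooths out occasional bursts. The clamp bounds how much load the client places on the remote site and on itself.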

    Item Type: Article
    Subjects: M.U.S. - Contributi Scientifici > 01 - Scienze matematiche e informatiche
    M.U.S. - Miscellanea > Atti Accademia Peloritana > Classe di Scienze Fisiche, Matematiche e Naturali
    Divisions: UNSPECIFIED
    Depositing User: Mr Nunzio Femminò
    Date Deposited: 26 Nov 2009
    Last Modified: 13 Apr 2010 13:14
    URI: http://cab.unime.it/mus/id/eprint/548
