Web Usage Mining of an Online Newspaper Real Data

Alfredo Motta
DEI PhD Student

DEI - Building 22, Seminar Room (Sala Seminari), third floor
21 November 2011
11:00 a.m.

Abstract

Web Usage Mining typically extracts knowledge by analyzing historical data such as Web server access logs, browser caches, or proxy logs. In this research we analyze Web server access logs collected at a typical online newspaper in order to extract as much information as possible about the long-term behavior of its users. The Map-Reduce parallel programming framework was used to process the very large data set, and users were classified according to the distribution of their visits. The mined information can then be used to improve the structure of the Web site or to enable online personalization of Web pages.
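
As a rough illustration of the approach described above, the following minimal sketch counts requests per user and day in the Map-Reduce style and then labels users by how many distinct days they visited. It is not the speaker's implementation: it runs in a single process rather than on a Map-Reduce cluster, it assumes the logs are in Common Log Format, it treats the client host as a stand-in for a user, and the names `map_line`, `reduce_counts`, `run_map_reduce`, `classify_users`, the labels, and the threshold are all hypothetical.

```python
import re
from collections import defaultdict
from itertools import groupby

# Assumed input: Common Log Format, e.g.
# host ident user [21/Nov/2011:11:00:00 +0100] "GET / HTTP/1.1" 200 1024
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[(\d{2}/\w{3}/\d{4}):[^\]]*\] "[^"]*" \d{3} \S+'
)


def map_line(line):
    """Map step: emit ((host, day), 1) for each parsable log line."""
    match = LOG_PATTERN.match(line)
    if match:
        host, day = match.groups()
        yield (host, day), 1


def reduce_counts(key, values):
    """Reduce step: sum the counts emitted for one (host, day) key."""
    return key, sum(values)


def run_map_reduce(lines):
    """In-memory stand-in for the shuffle phase: sort by key, then reduce."""
    pairs = [pair for line in lines for pair in map_line(line)]
    pairs.sort(key=lambda pair: pair[0])
    grouped = groupby(pairs, key=lambda pair: pair[0])
    return dict(
        reduce_counts(key, (count for _, count in group))
        for key, group in grouped
    )


def classify_users(counts, threshold=2):
    """Label each host by the number of distinct days it appears on.

    The threshold and the two labels are illustrative assumptions only.
    """
    days_per_user = defaultdict(int)
    for (host, _day) in counts:
        days_per_user[host] += 1
    return {
        host: "returning" if days >= threshold else "occasional"
        for host, days in days_per_user.items()
    }
```

Calling `classify_users(run_map_reduce(lines))` on an iterable of log lines returns a dictionary from host to label. On a real data set the map and reduce functions would be handed to a distributed framework, and users would more plausibly be identified by sessions or cookies than by host address.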

Research area:
Advanced software methodologies and architectures