data pre-processing for web log mining: case study of commercial bank website usage analysis
Clicks: 193
ID: 218825
2013
Article Quality & Performance Metrics
Overall Quality
Improving Quality
0.0
/100
Combines engagement data with AI-assessed academic quality
Reader Engagement
Steady Performance
30.0
/100
192 views
28 readers
Trending
AI Quality Assessment
Not analyzed
Abstract
We use data cleaning, integration, reduction and data conversion methods in the pre-processing level of data analysis. Data processing techniques improve the overall quality of the patterns mined. The paper describes using of standard pre-processing methods for preparing data of the commercial bank website in the form of the log file obtained from the web server. Data cleaning, as the simplest step of data pre-processing, is non–trivial as the analysed content is highly specific. We had to deal with the problem of frequent changes of the content and even frequent changes of the structure. Regular changes in the structure make use of the sitemap impossible. We presented approaches how to deal with this problem. We were able to create the sitemap dynamically just based on the content of the log file. In this case study, we also examined just the one part of the website over the standard analysis of an entire website, as we did not have access to all log files for the security reason. As the result, the traditional practices had to be adapted for this special case. Analysing just the small fraction of the website resulted in the short session time of regular visitors. We were not able to use recommended methods to determine the optimal value of session time. Therefore, we proposed new methods based on outliers identification for raising the accuracy of the session length in this paper.
| Reference Key |
kapusta2013actadata
Use this key to autocite in the manuscript while using
SciMatic Manuscript Manager or Thesis Manager
|
|---|---|
| Authors | ;Jozef Kapusta;Anna Pilková;Michal Munk;Peter Švec |
| Journal | Talanta |
| Year | 2013 |
| DOI |
10.11118/actaun201361040973
|
| URL | |
| Keywords |
Citations
No citations found. To add a citation, contact the admin at info@scimatic.org
Comments
No comments yet. Be the first to comment on this article.