Examining the performance of classification algorithms for imbalanced data sets in web author identification

Vorobeva, Alisa A.

2016

Abstract

Individuals, criminals, and even terrorist organizations can use web communication for criminal purposes, and they try to hide their identity to avoid prosecution. To make the web safer, author (or web-user) identification and authentication procedures need to improve. Imbalanced data sets are common in web author identification: the number of texts by one author often significantly exceeds the number by others. This is typical of the modern web, including social networks, blogs, and email. Author identification is a classification task, so developing methods, techniques, and tools for web author identification requires examining how classification algorithms perform on imbalanced data sets. In this work, several modern classification algorithms were tested on data sets with various levels of class imbalance and different numbers of available web posts. The best accuracy in all experiments was achieved with the Random Forest algorithm.
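
The experimental setup described in the abstract (data sets with a controlled level of class imbalance and a controlled number of posts per author) can be illustrated with a small sketch. The abstract does not describe the paper's actual sampling procedure or any metric other than accuracy, so everything here is an assumption for illustration: the helper names `make_imbalanced` and `balanced_accuracy`, the toy corpus, and the choice to report mean per-class recall next to plain accuracy.

```python
import random
from collections import Counter


def make_imbalanced(posts_by_author, majority, ratio, minority_size, seed=0):
    """Subsample posts so the majority author gets ratio * minority_size posts
    and every other author gets minority_size posts (hypothetical helper)."""
    rng = random.Random(seed)
    sample = []
    for author, posts in posts_by_author.items():
        n = minority_size * ratio if author == majority else minority_size
        if n > len(posts):
            raise ValueError(f"author {author!r} has {len(posts)} posts, need {n}")
        sample.extend((text, author) for text in rng.sample(posts, n))
    rng.shuffle(sample)
    return sample


def accuracy(y_true, y_pred):
    """Fraction of posts attributed to the correct author."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-author recall, so each author counts equally."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy corpus (made up): three authors with 100 placeholder posts each.
corpus = {a: [f"{a} post {i}" for i in range(100)] for a in ("alice", "bob", "carol")}

# Imbalance level 5:1 with 10 posts for each minority author.
data = make_imbalanced(corpus, majority="alice", ratio=5, minority_size=10)
labels = [author for _, author in data]

# Trivial baseline that always predicts the majority author.
baseline = ["alice"] * len(labels)

print(Counter(labels))
print(round(accuracy(labels, baseline), 3), round(balanced_accuracy(labels, baseline), 3))
```

On this toy sample the always-majority baseline reaches a high plain accuracy while its mean per-author recall stays at one third, which is one reason results on imbalanced author sets are worth reading alongside a per-class measure. Varying `ratio` and `minority_size` gives a grid of imbalance levels and post counts on which any classifier could then be compared.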

Reference Key	vorobeva2016examiningproceedings
Authors	Vorobeva, Alisa A.
Journal	Proceedings of the XXth Conference of Open Innovations Association FRUCT
Year	2016
URL	https://fruct.org/publications/fruct18/files/Vor.pdf
Keywords	Chemistry; Biology (General); Engineering (General). Civil engineering (General); Technology; Science (General); Medical technology; Physics; Telecommunication; Electronic computers. Computer science
