Detecting Family Resemblance: Automated Genre Classification

2007

Abstract

This paper presents results in automated genre classification of digital documents in PDF format. It describes genre classification as an important ingredient in contextualising scientific data and in retrieving targeted material for improving research. The current paper compares the role of visual layout, stylistic features, and language model features in clustering documents and presents results in retrieving five selected genres (Scientific Article, Thesis, Periodicals, Business Report, and Form) from a pool of materials populated with documents of the nineteen most popular genres found in our experimental data set.
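
As a rough illustration of what "language model features" can mean in genre classification, the sketch below trains one smoothed unigram model per genre and assigns a document to the genre whose model gives its text the highest log-likelihood. This is not the authors' method: the function names, the add-one smoothing, the whitespace tokenizer, and the toy training strings are all assumptions made for the example, and the abstract does not specify any of them.

```python
import math
from collections import Counter

# Toy, hand-written training strings (hypothetical; not from the paper's data set).
TOY_DOCS = [
    ("Scientific Article", "abstract introduction method results discussion references"),
    ("Form", "name address date signature please tick the box"),
    ("Business Report", "revenue quarter shareholders profit annual outlook"),
]

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()

def train_unigram_models(labelled_docs):
    """Return per-genre token counts and the shared vocabulary."""
    counts = {}
    vocab = set()
    for genre, text in labelled_docs:
        tokens = tokenize(text)
        counts.setdefault(genre, Counter()).update(tokens)
        vocab.update(tokens)
    return counts, vocab

def log_likelihood(text, genre_counts, vocab_size):
    """Log-probability of the text under one genre's add-one-smoothed unigram model."""
    total = sum(genre_counts.values())
    score = 0.0
    for tok in tokenize(text):
        # The extra +1 in the denominator reserves mass for out-of-vocabulary tokens.
        score += math.log((genre_counts[tok] + 1) / (total + vocab_size + 1))
    return score

def classify(text, counts, vocab):
    """Pick the genre whose model assigns the text the highest log-likelihood."""
    return max(counts, key=lambda g: log_likelihood(text, counts[g], len(vocab)))

counts, vocab = train_unigram_models(TOY_DOCS)
print(classify("results and discussion of the method", counts, vocab))
```

In practice the text would first have to be extracted from the PDF, and visual-layout or stylistic features would need their own feature extractors; the sketch covers only the word-distribution side of the comparison.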

Reference Key	kim2007datadetecting
Authors	Yunhyong Kim; Seamus Ross
Journal	Data Science Journal
Year	2007
DOI	10.2481/dsj.6.S172
URL	http://datascience.codata.org/articles/405 https://doi.org/10.2481/dsj.6.S172
Keywords	information management
