Application of two machine learning algorithms to genetic association studies in the presence of covariates

Clicks: 328
ID: 21886
2008
Article Quality & Performance Metrics
Overall Quality Improving Quality
0.0 /100
Combines engagement data with AI-assessed academic quality
AI Quality Assessment
Not analyzed
Abstract

Abstract

Background

Population-based investigations aimed at uncovering genotype-trait associations often involve high-dimensional genetic polymorphism data as well as information on multiple environmental and clinical parameters. Machine learning (ML) algorithms offer a straightforward analytic approach for selecting subsets of these inputs that are most predictive of a pre-defined trait. The performance of these algorithms, however, in the presence of covariates is not well characterized.

Methods and Results

In this manuscript, we investigate two approaches: Random Forests (RFs) and Multivariate Adaptive Regression Splines (MARS). Through multiple simulation studies, the performance under several underlying models is evaluated. An application to a cohort of HIV-1 infected individuals receiving anti-retroviral therapies is also provided.

Conclusion

Consistent with more traditional regression modeling theory, our findings highlight the importance of considering the nature of underlying gene-covariate-trait relationships before applying ML algorithms, particularly when there is potential confounding or effect mediation.

Reference Key
s2008applicationbmc Use this key to autocite in the manuscript while using SciMatic Manuscript Manager or Thesis Manager
Authors S, Foulkes Andrea;AS, Nonyane Bareng;
Journal bmc genetics
Year 2008
DOI
DOI not found
URL
Keywords Keywords not found

Citations

No citations found. To add a citation, contact the admin at info@scimatic.org

No comments yet. Be the first to comment on this article.