Cloud-Based Phrase Mining and Analysis of User-Defined Phrase-Category Association in Biomedical Publications.
Clicks: 244
ID: 96743
2019
Article Quality & Performance Metrics
Overall Quality
Improving Quality
0.0
/100
Combines engagement data with AI-assessed academic quality
Reader Engagement
Steady Performance
68.9
/100
235 views
191 readers
Trending
AI Quality Assessment
Not analyzed
Abstract
The rapid accumulation of biomedical textual data has far exceeded the human capacity of manual curation and analysis, necessitating novel text-mining tools to extract biological insights from large volumes of scientific reports. The Context-aware Semantic Online Analytical Processing (CaseOLAP) pipeline, developed in 2016, successfully quantifies user-defined phrase-category relationships through the analysis of textual data. CaseOLAP has many biomedical applications. We have developed a protocol for a cloud-based environment supporting the end-to-end phrase-mining and analyses platform. Our protocol includes data preprocessing (e.g., downloading, extraction, and parsing text documents), indexing and searching with Elasticsearch, creating a functional document structure called Text-Cube, and quantifying phrase-category relationships using the core CaseOLAP algorithm. Our data preprocessing generates key-value mappings for all documents involved. The preprocessed data is indexed to carry out a search of documents including entities, which further facilitates the Text-Cube creation and CaseOLAP score calculation. The obtained raw CaseOLAP scores are interpreted using a series of integrative analyses, including dimensionality reduction, clustering, temporal, and geographical analyses. Additionally, the CaseOLAP scores are used to create a graphical database, which enables semantic mapping of the documents. CaseOLAP defines phrase-category relationships in an accurate (identifies relationships), consistent (highly reproducible), and efficient manner (processes 100,000 words/sec). Following this protocol, users can access a cloud-computing environment to support their own configurations and applications of CaseOLAP. This platform offers enhanced accessibility and empowers the biomedical community with phrase-mining tools for widespread biomedical research applications.
Abstract Quality Issue:
This abstract appears to be incomplete or contains metadata (235 words).
Try re-searching for a better abstract.
| Reference Key |
sigdel2019cloudbasedjournal
Use this key to autocite in the manuscript while using
SciMatic Manuscript Manager or Thesis Manager
|
|---|---|
| Authors | Sigdel, Dibakar;Kyi, Vincent;Zhang, Aiden;Setty, Shaun P;Liem, David A;Shi, Yu;Wang, Xuan;Shen, Jiaming;Wang, Wei;Han, JiaWei;Ping, Peipei; |
| Journal | Journal of visualized experiments : JoVE |
| Year | 2019 |
| DOI |
10.3791/59108
|
| URL | |
| Keywords |
Citations
No citations found. To add a citation, contact the admin at info@scimatic.org
Comments
No comments yet. Be the first to comment on this article.