Morphological, syntactic and diacritics rules for automatic diacritization of Arabic sentences

Amine Chennoufi; Azzeddine Mazroui

doi:10.1016/j.jksuci.2016.06.004

2017

Abstract

The diacritical marks of the Arabic language are characters other than letters and are absent from most Arabic writing. This paper presents a hybrid system for the automatic diacritization of Arabic sentences that combines linguistic rules with statistical processing. The approach has four stages. The first stage performs a morphological analysis using the second version of the morphological analyzer Alkhalil Morpho Sys. The second stage uses the morphosyntactic outputs of this analysis to eliminate word transitions that are invalid according to syntactic rules. The third stage applies a discrete hidden Markov model and the Viterbi algorithm to determine the most probable diacritized sentence; transitions not seen in the training corpus are handled with smoothing techniques. The last stage deals with words that the Alkhalil analyzer cannot analyze, for which we use letter-based statistical processing. The word error rate of our system is around 2.58% if we ignore the diacritic of the last letter of each word and around 6.28% when this diacritic is taken into account.
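
The third stage can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a word-level bigram model over diacritized forms, uses additive smoothing as a stand-in for the unspecified smoothing techniques, and takes the per-word candidate lists as given (in the paper they would come from the morphological analysis and syntactic filtering). The function names and the transliterated toy data are hypothetical.

```python
import math
from collections import Counter

def train_bigrams(sentences):
    """Count contexts and bigrams over diacritized training sentences."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sent in sentences:
        tokens = ["<s>"] + sent          # "<s>" marks the sentence start
        vocab.update(tokens)
        contexts.update(tokens[:-1])     # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, vocab

def log_transition(prev, cur, contexts, bigrams, vocab_size, alpha=1.0):
    """Additively smoothed log P(cur | prev); unseen transitions keep non-zero mass."""
    return math.log((bigrams[(prev, cur)] + alpha)
                    / (contexts[prev] + alpha * vocab_size))

def viterbi(candidates, contexts, bigrams, vocab, alpha=1.0):
    """Return the most probable sequence, choosing one candidate per word.

    Assumes every word has at least one candidate diacritized form.
    """
    vocab_size = len(vocab)
    # best[state] = (log-probability of the best path ending in state, path)
    best = {"<s>": (0.0, [])}
    for options in candidates:
        new_best = {}
        for cur in options:
            score, path = max(
                ((s + log_transition(prev, cur, contexts, bigrams,
                                     vocab_size, alpha), p)
                 for prev, (s, p) in best.items()),
                key=lambda item: item[0],
            )
            new_best[cur] = (score, path + [cur])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]

# Toy training corpus in Latin transliteration (hypothetical data).
training = [
    ["kataba", "alwaladu", "aldarsa"],
    ["kataba", "alwaladu", "risaalatan"],
    ["kutubu", "alwaladi", "jadiidatun"],
]
contexts, bigrams, vocab = train_bigrams(training)

# Candidate diacritizations for a two-word undiacritized sentence.
candidates = [["kataba", "kutubu", "kutiba"], ["alwaladu", "alwaladi"]]
result = viterbi(candidates, contexts, bigrams, vocab)
print(result)
```

The candidate `kutiba` never occurs in the toy corpus, so its transitions are scored only through the smoothing term. Working in log-probabilities avoids numerical underflow on longer sentences.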

Reference Key	chennoufi2017journalmorphological
Authors	Amine Chennoufi; Azzeddine Mazroui
Journal	Journal of King Saud University - Computer and Information Sciences
Year	2017
DOI	10.1016/j.jksuci.2016.06.004
URL	http://www.sciencedirect.com/science/article/pii/S1319157816300428 https://doi.org/10.1016/j.jksuci.2016.06.004
Keywords	Arabic language; computer science