Dependency Annotation of Ottoman Turkish with Multilingual BERT

S. Özates; Tarik Emre Tiras; Efe Eren Genç; Esma F. Bilgin Tasdemir

doi:10.48550/arXiv.2402.14743

2024

Abstract

This study introduces an annotation methodology, based on a pretrained large language model, for building the first dependency treebank of Ottoman Turkish. Our experimental results show that iterating three steps speeds up and simplifies the challenging dependency annotation process: i) pseudo-annotating data with a multilingual BERT-based parsing model, ii) manually correcting the pseudo-annotations, and iii) fine-tuning the parsing model on the corrected annotations. The resulting treebank, which will be part of the Universal Dependencies (UD) project, will facilitate automated analysis of Ottoman Turkish documents, unlocking the linguistic richness embedded in this historical heritage.
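
The three-step loop described in the abstract can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function `annotation_loop`, the `Round` record, and the three callables (`parse`, `correct`, `fine_tune`) are hypothetical names standing in for the multilingual BERT-based parser, the human annotator, and the fine-tuning routine. The per-round count of corrected tokens is an added convenience for tracking how much manual work each round needs; the abstract does not say the authors measure it this way.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# One (head_index, relation_label) pair per token; head 0 denotes the root.
Parse = List[Tuple[int, str]]
Sentence = List[str]
Parser = Callable[[Sentence], Parse]


@dataclass
class Round:
    """Bookkeeping for one annotation round (hypothetical helper)."""
    sentences: int
    changed_tokens: int
    total_tokens: int


def annotation_loop(
    batches: Sequence[Sequence[Sentence]],
    parse: Parser,
    correct: Callable[[Sentence, Parse], Parse],
    fine_tune: Callable[[List[Tuple[Sentence, Parse]]], Parser],
) -> Tuple[List[Tuple[Sentence, Parse]], List[Round]]:
    """Pseudo-annotate, manually correct, and fine-tune, one batch at a time."""
    treebank: List[Tuple[Sentence, Parse]] = []
    history: List[Round] = []
    for batch in batches:
        changed = total = 0
        for sentence in batch:
            pseudo = parse(sentence)          # step i: pseudo-annotation
            gold = correct(sentence, pseudo)  # step ii: manual correction
            changed += sum(p != g for p, g in zip(pseudo, gold))
            total += len(sentence)
            treebank.append((sentence, gold))
        # step iii: fine-tune on everything corrected so far (an assumption;
        # training on only the newest batch would be another option)
        parse = fine_tune(treebank)
        history.append(Round(len(batch), changed, total))
    return treebank, history
```

In practice `parse` and `fine_tune` would wrap a real dependency parser, and `correct` would be an annotation tool session rather than a function call. If the approach works as the abstract reports, `changed_tokens / total_tokens` would be expected to fall from one round to the next.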

Reference Key	Özates2024dependency
Authors	S. Özates; Tarik Emre Tiras; Efe Eren Genç; Esma F. Bilgin Tasdemir
Journal	Law
Year	2024
DOI	10.48550/arXiv.2402.14743
URL	https://www.semanticscholar.org/paper/cbcc39dd244d9a7dcefadde5fee648465874ea77 https://doi.org/10.48550/arXiv.2402.14743
Keywords	annotation turkish dependency bert Multilingual ottoman
