Towards Accurate Translation via Semantically Appropriate Application of Lexical Constraints
2023
Lexically constrained neural machine translation (LNMT) aims to incorporate user-provided
terminology into translations. Despite its practical advantages, existing work
has not evaluated LNMT models under challenging real-world conditions. In this
paper, we focus on two important but under-studied issues in the current
evaluation of LNMT: a model must cope with challenging lexical constraints
that are "homographs" or that are "unseen" during training. To address these
issues, we first design a homograph disambiguation module to
differentiate the meanings of homographs. Moreover, we propose PLUMCOT, which
integrates contextually rich information about unseen lexical constraints from
pre-trained language models and strengthens a copy mechanism of the pointer
network via direct supervision of a copying score. We also release HOLLY, an
evaluation benchmark for assessing the ability of a model to cope with
"homographic" and "unseen" lexical constraints. Experiments on HOLLY and the
previous test setup show the effectiveness of our method. The gains from
PLUMCOT are especially pronounced on "unseen" constraints. Our dataset is
available at https://github.com/papago-lab/HOLLY-benchmark
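
The abstract mentions a pointer-network copy mechanism whose copying score is supervised directly. The following is a minimal, generic sketch of that idea in plain Python, not the authors' implementation: a scalar gate blends a vocabulary distribution with a copy distribution over constraint tokens, and a binary cross-entropy term supervises the gate. All names here (`mix_with_copy`, `copy_score_loss`, `constraint_token_ids`) and the choice of binary cross-entropy are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_with_copy(vocab_probs, copy_attention, constraint_token_ids, copy_score):
    """Blend generation and copy distributions with a gate in [0, 1].

    vocab_probs: probabilities over the vocabulary (sums to 1).
    copy_attention: attention weights over constraint tokens (sums to 1).
    constraint_token_ids: vocabulary id of each constraint token (hypothetical).
    copy_score: scalar gate; higher means "copy from the constraint".
    """
    mixed = [(1.0 - copy_score) * p for p in vocab_probs]
    for attn, token_id in zip(copy_attention, constraint_token_ids):
        mixed[token_id] += copy_score * attn
    return mixed


def copy_score_loss(copy_score, should_copy):
    """Binary cross-entropy on the gate (assumed form of direct supervision)."""
    eps = 1e-9  # avoids log(0)
    target = 1.0 if should_copy else 0.0
    return -(target * math.log(copy_score + eps)
             + (1.0 - target) * math.log(1.0 - copy_score + eps))


# Toy example: a 5-token vocabulary where tokens 3 and 4 form the constraint.
vocab_probs = softmax([1.0, 0.5, 0.2, -1.0, -2.0])
mixed = mix_with_copy(vocab_probs, [0.7, 0.3], [3, 4], copy_score=0.8)
loss = copy_score_loss(0.8, should_copy=True)
```

In this sketch, a high gate value shifts probability mass toward the constraint tokens, and the loss penalizes a low gate at decoding steps where a constraint token should be copied.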
Reference Key | choo2023towards |
---|---|
Authors | Yujin Baek; Koanho Lee; Dayeon Ki; Hyoung-Gyu Lee; Cheonbok Park; Jaegul Choo |
Journal | arXiv |
Year | 2023 |