Exploring Phonetic Context-Aware Lip-Sync For Talking Face Generation

Abstract
Talking face generation is the challenging task of synthesizing a natural, realistic face whose lip movements are accurately synchronized with a given audio. Due to co-articulation, where an isolated phone is influenced by the preceding or following phones, the articulation of a phone varies with its phonetic context. Modeling lip motion with the phonetic context can therefore produce more spatio-temporally aligned lip movement. In this respect, we investigate the role of phonetic context in generating lip motion for talking face generation. We propose the Context-Aware Lip-Sync (CALS) framework, which explicitly leverages phonetic context to generate lip movement for the target face. CALS comprises an Audio-to-Lip module and a Lip-to-Face module. The former is pretrained with masked learning to map each phone to a contextualized lip-motion unit; these units then guide the latter in synthesizing the target identity with context-aware lip motion. Extensive experiments verify that simply exploiting the phonetic context in the proposed CALS framework effectively enhances spatio-temporal alignment. We also measure the extent to which phonetic context assists lip synchronization and find the effective context window for lip generation to be approximately 1.2 seconds.
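The abstract describes a two-stage pipeline: an Audio-to-Lip module pretrained with masked learning to produce contextualized lip-motion units, followed by a Lip-to-Face generator conditioned on those units. The sketch below illustrates one way the masked-learning pretraining of the Audio-to-Lip stage could be set up in PyTorch. Everything here is an illustrative assumption rather than a detail taken from the paper: the class name AudioToLip, all feature dimensions, the masking probability, and the 25 fps framing under which a 30-frame window spans roughly the 1.2-second context the abstract reports as most effective.

```python
import torch
import torch.nn as nn

class AudioToLip(nn.Module):
    """Hypothetical sketch of an Audio-to-Lip module: a transformer
    encoder maps a window of phone features to contextualized
    lip-motion units. All sizes are illustrative, not from the paper."""

    def __init__(self, phone_dim=80, lip_dim=64, d_model=256,
                 n_layers=4, n_heads=4, mask_prob=0.15):
        super().__init__()
        self.in_proj = nn.Linear(phone_dim, d_model)
        # Learnable embedding substituted at masked positions.
        self.mask_token = nn.Parameter(torch.zeros(d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.out_proj = nn.Linear(d_model, lip_dim)
        self.mask_prob = mask_prob

    def forward(self, phones, pretrain=False):
        # phones: (batch, T, phone_dim); assuming 25 fps, T = 30
        # covers ~1.2 s, the window size the abstract highlights.
        x = self.in_proj(phones)
        mask = None
        if pretrain:
            # Masked learning: hide random positions so the encoder
            # must recover their lip units from the phonetic context.
            mask = torch.rand(x.shape[:2], device=x.device) < self.mask_prob
            x = torch.where(mask.unsqueeze(-1), self.mask_token, x)
        lip_units = self.out_proj(self.encoder(x))
        return lip_units, mask

# Illustrative pretraining step: reconstruct lip-motion units at the
# masked positions from the surrounding (unmasked) phonetic context.
model = AudioToLip()
phones = torch.randn(2, 30, 80)        # dummy phone features, ~1.2 s window
target_lips = torch.randn(2, 30, 64)   # dummy ground-truth lip-motion units
pred, mask = model(phones, pretrain=True)
loss = ((pred - target_lips) ** 2)[mask].mean()
loss.backward()
```

In the full framework the predicted lip-motion units would condition the Lip-to-Face generator to render the target identity; that second stage is omitted from this sketch.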
Reference Key
ro2023exploring
Authors Se Jin Park; Minsu Kim; Jeongsoo Choi; Yong Man Ro
Journal arXiv
Year 2023