A Systematic Comparison of Phonetic Aware Techniques for Speech
  Enhancement

Or Tal; Moshe Mandel; Felix Kreuk; Yossi Adi

2022


Abstract


Speech enhancement has improved greatly in recent years through end-to-end neural networks. However, most models are agnostic to the spoken phonetic content. Several recent studies have proposed phonetic-aware speech enhancement, mostly using perceptual supervision. Yet phonetic features can be injected during model optimization in other ways (e.g., model conditioning). In this paper, we conduct a systematic comparison of different methods for incorporating phonetic information into a speech enhancement model. Through a series of controlled experiments, we observe the influence of different phonetic content models and of various feature-injection techniques on enhancement performance, considering both causal and non-causal models. Specifically, we evaluate three settings for injecting phonetic information: i) feature conditioning; ii) perceptual supervision; and iii) regularization. Phonetic features are obtained from an intermediate layer of either a supervised pre-trained Automatic Speech Recognition (ASR) model or a pre-trained Self-Supervised Learning (SSL) model. We further observe the effect of choosing different embedding layers on performance, considering both manual and learned configurations. Results suggest that extracting phonetic features with an SSL model outperforms extracting them with the ASR model in most cases. Interestingly, the conditioning setting performs best among the evaluated configurations.
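To make the three injection settings concrete, the following is a minimal NumPy sketch, not the authors' implementation. The abstract names the settings but does not define them, so every detail here is an assumption: the frozen random projections stand in for layers of a pre-trained ASR or SSL model, the frame size, feature size, and loss weight `lam` are arbitrary, and all function names are hypothetical. In particular, reading "regularization" as an auxiliary prediction of clean phonetic features is only one plausible interpretation.

```python
import numpy as np

FRAME = 160     # samples per frame (assumed)
FEAT_DIM = 8    # size of the toy phonetic embedding (assumed)

rng = np.random.default_rng(0)
# Frozen random projections standing in for layers of a pre-trained
# ASR/SSL model (hypothetical; a real system would load such a model).
LAYERS = [rng.standard_normal((FRAME, FEAT_DIM)) / np.sqrt(FRAME) for _ in range(3)]


def frame_signal(wave):
    """Split a 1-D waveform into non-overlapping frames of FRAME samples."""
    n = len(wave) // FRAME
    return wave[: n * FRAME].reshape(n, FRAME)


def phonetic_features(wave, layer_logits=None):
    """Return a toy phonetic embedding of shape (num_frames, FEAT_DIM).

    layer_logits=None picks the last layer (a manual choice); otherwise the
    layers are mixed with softmax weights (a stand-in for a learned choice).
    """
    frames = frame_signal(wave)
    per_layer = np.stack([np.tanh(frames @ w) for w in LAYERS])  # (L, T, D)
    if layer_logits is None:
        return per_layer[-1]
    weights = np.exp(layer_logits - np.max(layer_logits))
    weights /= weights.sum()
    return np.tensordot(weights, per_layer, axes=1)  # (T, D)


def conditioned_input(noisy, feats):
    """Feature conditioning: append phonetic features to each input frame."""
    return np.concatenate([frame_signal(noisy), feats], axis=1)


def perceptual_loss(enhanced, clean, lam=0.5):
    """Perceptual supervision: waveform L1 plus L1 between phonetic features."""
    wave_term = np.mean(np.abs(enhanced - clean))
    feat_term = np.mean(np.abs(phonetic_features(enhanced) - phonetic_features(clean)))
    return wave_term + lam * feat_term


def regularized_loss(enhanced, clean, predicted_feats, lam=0.5):
    """Regularization (assumed form): an auxiliary head's prediction of the
    clean phonetic features is penalized with a mean squared error."""
    wave_term = np.mean(np.abs(enhanced - clean))
    aux_term = np.mean((predicted_feats - phonetic_features(clean)) ** 2)
    return wave_term + lam * aux_term


# Toy usage: a sine wave as "clean" speech plus Gaussian noise.
clean = np.sin(np.linspace(0, 100, 1600))
noisy = clean + 0.1 * rng.standard_normal(1600)
feats = phonetic_features(noisy, layer_logits=np.zeros(3))
print(conditioned_input(noisy, feats).shape)  # (10, 168)
print(perceptual_loss(noisy, clean) > 0.0)    # True
```

In this sketch the conditioning features are extracted from the noisy input, since clean speech would not be available at inference time; that choice is also an assumption rather than something the abstract states.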

Reference Key	adi2022a
Authors	Or Tal; Moshe Mandel; Felix Kreuk; Yossi Adi
Journal	arXiv
Year	2022
URL	http://arxiv.org/abs/2206.11000v1
Keywords	cs.LG eess.AS cs.SD
