GASS: Generalizing Audio Source Separation with Large-scale Data

Jordi Pons; Xiaoyu Liu; Santiago Pascual; Joan Serrà

2023

Abstract

Universal source separation aims to separate the audio sources of an arbitrary mix, removing the constraint to operate on a specific domain like speech or music. Yet its potential is limited because most existing works focus on mixes consisting predominantly of sound events, and because small training datasets constrain supervised learning. Here, we study a single general audio source separation (GASS) model trained to separate speech, music, and sound events in a supervised fashion with a large-scale dataset. We assess GASS models on a diverse set of tasks. Our strong in-distribution results show the feasibility of GASS models, and the competitive out-of-distribution performance in sound event and speech separation shows their generalization abilities. Yet, it is challenging for GASS models to generalize to separating out-of-distribution cinematic and music content. We also fine-tune GASS models on each dataset and consistently outperform the ones without pre-training. All fine-tuned models (except the music separation one) obtain state-of-the-art results in their respective benchmarks.

Reference Key	serrà2023gass
Authors	Jordi Pons; Xiaoyu Liu; Santiago Pascual; Joan Serrà
Journal	arXiv
Year	2023
URL	http://arxiv.org/abs/2310.00140v1
Keywords	cs.lg cs.ai eess.sp eess.as cs.sd data separation source gass: generalizing audio large-scale
