Audio Source Separation and Speech Enhancement

  • Length: 504 pages
  • Edition: 1
  • Publisher:
  • Publication Date: 2018-10-01
  • ISBN-10: 1119279895
  • ISBN-13: 9781119279891
  • Sales Rank: #3558324
Description

Learn the technology behind hearing aids, Siri, and Echo 

Audio source separation and speech enhancement aim to extract one or more source signals of interest from an audio recording involving several sound sources. These technologies are among the most studied in audio signal processing today and play a critical role in the success of hearing aids, hands-free phones, voice command and other noise-robust audio analysis systems, and music post-production software.

Research on this topic has followed three convergent paths, which started from sensor array processing, computational auditory scene analysis, and machine-learning-based approaches such as independent component analysis, respectively. This book is the first to provide a comprehensive overview by presenting the common foundations of these techniques and the differences between them in a unified setting.

Key features:

  • Consolidated perspective on audio source separation and speech enhancement.
  • Both historical perspective and latest advances in the field, e.g. deep neural networks.
  • Diverse disciplines: array processing, machine learning, and statistical signal processing.
  • Covers the most important techniques for both single-channel and multichannel processing.

This book provides both introductory and advanced material suitable for people with basic knowledge of signal processing and machine learning. Thanks to its comprehensiveness, it will help students select a promising research track, researchers leverage the acquired cross-domain knowledge to design improved techniques, and engineers and developers choose the right technology for their target application scenario. It will also be useful for practitioners from other fields (e.g., acoustics, multimedia, phonetics, and musicology) who wish to use audio source separation or speech enhancement as pre-processing tools for their own needs.

Table of Contents

Part I Prerequisites
Chapter 1 Introduction
Chapter 2 Time-Frequency Processing: Spectral Properties
Chapter 3 Acoustics: Spatial Properties
Chapter 4 Multichannel Source Activity Detection, Localization, and Tracking

Part II Single-Channel Separation and Enhancement
Chapter 5 Spectral Masking and Filtering
Chapter 6 Single-Channel Speech Presence Probability Estimation and Noise Tracking
Chapter 7 Single-Channel Classification and Clustering Approaches
Chapter 8 Nonnegative Matrix Factorization
Chapter 9 Temporal Extensions of Nonnegative Matrix Factorization

Part III Multichannel Separation and Enhancement
Chapter 10 Spatial Filtering
Chapter 11 Multichannel Parameter Estimation
Chapter 12 Multichannel Clustering and Classification Approaches
Chapter 13 Independent Component and Vector Analysis
Chapter 14 Gaussian Model Based Multichannel Separation
Chapter 15 Dereverberation

Part IV Application Scenarios and Perspectives
Chapter 16 Applying Source Separation to Music
Chapter 17 Application of Source Separation to Robust Speech Analysis and Recognition
Chapter 18 Binaural Speech Processing with Application to Hearing Devices
Chapter 19 Perspectives
