SPEECH ENHANCEMENT USING WIENER FILTER AND SPATIOTEMPORAL PREDICTION FILTER APPROACH
Keywords:
microphone arrays, noise reduction, Wiener filter, beamforming, Spatial-Temporal Prediction (STP) filter.
Abstract
The presence of acoustic noise in audio recordings is an ongoing issue that plagues many
applications. This ambient background noise is difficult to reduce due to its unpredictable
nature. Many single-channel noise reduction techniques exist, but they may
distort the desired speech signal because the spectral content of the speech and noise overlaps. It
is therefore of interest to investigate the use of multichannel noise reduction algorithms to
further attenuate noise while attempting to preserve the speech signal of interest. Specifically,
this paper investigates the use of microphone arrays in conjunction with multichannel
noise reduction algorithms to aid speaker identification. Recording a speaker in the
presence of acoustic background noise ultimately limits the performance and confidence of
speaker identification algorithms. In situations where it is impossible to control the noise
environment where the speech sample is taken, noise reduction algorithms must be developed
and applied to clean the speech signal in order to give speaker identification software a chance
at a positive identification. Due to the limitations of single-channel techniques, it is of interest
to see if spatial information provided by microphone arrays can be exploited to aid in speaker
identification. This paper explores several time-domain multichannel noise
reduction techniques, including multichannel Wiener filtering and Spatial-Temporal Prediction (STP)
filtering. Each algorithm is prototyped, and filter performance is evaluated using various
simulations and experiments. A three-dimensional noise model is developed to simulate and
compare the performance of the above methods, and experimental results of three data
collections are presented and analysed. The algorithms are compared and recommendations are
given for the use of each technique. Finally, ideas for future work are discussed to improve
performance and implementation of these multichannel algorithms. Possible applications for
this technology include audio surveillance, identity verification, video chatting, conference
calling and sound source localization.