Speech technology is increasingly used to operate electronic and computer
systems, and its usefulness is indisputable. Given that
speech is the most natural way in which humans interact with their environment, there
is a clear tendency to use this technology as the means of communicating with
devices.
Automatic speech recognition systems are already able to understand reasonably
complex human commands. Nevertheless, there are certain acoustic conditions
under which their error rates are still too high. In particular, the capture
of distant speech in reverberant acoustic environments is especially problematic for
this type of system, which usually performs worse than expected. At the same
time, distant speech acquisition is highly important, since it allows a more natural
way to interact, without necessarily having to wear intrusive close-talk microphones,
and it arises in many common situations, such as those found in a digital
home or a conference room.
This Master's Thesis sets out to design, implement and evaluate from scratch
an accurate and complete tool to estimate the speaker's location in these reverberant
acoustic environments. Speaker localization is crucial to improving the results
of the aforementioned recognition systems, since it can be used to exploit the spatial filtering
ability of a microphone array, which allows the speech signal from one talker to be enhanced while the
signals from other talkers, as well as undesired sources and noise, are suppressed.
After a detailed theoretical study, a robust localization system was implemented based
on the Steered Response Power (SRP) of an array when focused at different locations.
In turn, this scheme relies on a more basic localization algorithm able to estimate
the Direction of Arrival (DOA) of the given speech by computing the Generalized Cross
Correlation (GCC) between microphone pairs. Finally, an exhaustive evaluation of the
results obtained by the system was carried out in order to check the validity of its outcomes,
point to possible improvement techniques, draw some general conclusions about its
performance and suggest future lines of research.
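
As a rough illustration of the pairwise building block mentioned above, the following minimal sketch estimates the time delay between two microphone signals from their generalized cross-correlation and converts it to a direction of arrival. It is not the implementation developed in this thesis. The PHAT weighting, the far-field (plane wave) model, the speed of sound of 343 m/s, and the function names `gcc_phat_delay` and `doa_from_delay` are all assumptions chosen for the example.

```python
import numpy as np


def gcc_phat_delay(sig, ref, fs, max_tau=None):
    """Estimate the delay (in seconds) of `sig` relative to `ref`.

    Assumption: PHAT weighting is used; the GCC family admits other weightings.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(sig) + len(ref)
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)

    # Cross-power spectrum, whitened so that only phase information remains.
    cross = SIG * np.conj(REF)
    cross /= np.abs(cross) + 1e-12  # small constant avoids division by zero
    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)

    # Reorder so that the lags run from -max_shift to +max_shift.
    cc = np.roll(cc, max_shift)[: 2 * max_shift + 1]
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs


def doa_from_delay(tau, mic_distance, c=343.0):
    """Convert a pairwise delay to an angle in degrees from broadside.

    Assumptions: far-field source, two microphones `mic_distance` metres
    apart, speed of sound `c` in m/s.
    """
    return float(np.degrees(np.arcsin(np.clip(tau * c / mic_distance, -1.0, 1.0))))
```

An SRP-style scheme would typically combine such pairwise correlations over all microphone pairs and candidate locations; that step is omitted here.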