Vasudev Kandade Rajan: Speech Enhancement in Hands-free Systems for Automobile Environments
- Prof. Dr.-Ing. Gerhard Schmidt
- Prof. Dr.-Ing. Tim Fingscheidt
- Prof. Dr.-Ing. Dipl.-Wirt. Ing. Stephan Pachnicke
- Prof. Dr. habil. Klaus Rätzke
(head of the examination board)
A new microphone position in the automobile is available, in which the microphones are placed on the seat belt. The advantages of this position outweigh its disadvantages, which makes it very attractive for practical use. To make these microphones usable, a set of issues is addressed through signal processing methods. Some of the methods presented are improved versions of existing ones, and some are new ideas. They are adapted and assessed in the automobile context. The central work of the thesis is a set of speech enhancement algorithms that are applied to the belt microphones. The speech enhancement chapters presented in the thesis form the basic units of a hands-free system.
Belt microphones, when integrated into hands-free systems, are used to pick up the speech of the passenger(s) in the automobile. The microphone signals also contain echoes of the remote talker's speech, which is played back over the loudspeaker in the automobile. This thesis presents an acoustic echo canceller to remove these echoes. The echo canceller must not only remove the echoes to the extent that a transparent conversation is possible, but also satisfy measures specified by various standards. To achieve this, an improved way to control the adaptive filters of the echo canceller is presented. The control involves the estimation of unmeasurable quantities. Based on theoretically derived optimal quantities, a step-size control is presented that improves on the existing ones. By utilizing properties of the impulse response, such as its inherent delay and its slowly varying nature, it is shown that the proposed control method outperforms the existing method, which is based on the same principle. The chapter also presents a way to deal with the movement of the microphones when the body of the passenger moves. This problem of “room change” is handled through the two coupling factors, which are built-in control mechanisms of the step-size control. The step-size control and the handling of room changes are evaluated using standard measures under different realistic automobile scenarios.
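To illustrate the principle behind such a step-size control, the following is a minimal sketch of a time-domain NLMS echo canceller whose step size is the ratio of an estimated residual echo power to the measured error power. The residual echo power is approximated as a system distance estimate times the excitation power, and the system distance is estimated from the filter coefficients that fall into the inherent delay of the impulse response (the textbook "delay coefficients" idea). This is not the control proposed in the thesis: the function name, the parameter values, and the floor `dist_floor` that prevents the adaptation from stalling are all assumptions made for illustration.

```python
import random


def nlms_echo_canceller(x, d, taps=8, delay_taps=4, dist_floor=1e-3,
                        smooth=0.99, eps=1e-10):
    """Remove the echo of loudspeaker signal x from microphone signal d.

    Returns (error signal, final filter coefficients).
    All parameter values are illustrative assumptions.
    """
    w = [0.0] * taps      # adaptive filter coefficients
    buf = [0.0] * taps    # most recent loudspeaker samples, newest first
    px = pe = 0.0         # smoothed powers of excitation and error
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        e = dn - sum(wi * bi for wi, bi in zip(w, buf))
        px = smooth * px + (1.0 - smooth) * xn * xn
        pe = smooth * pe + (1.0 - smooth) * e * e
        # The true impulse response is assumed to be zero during the first
        # delay_taps samples, so any energy in those coefficients is
        # misalignment; scale it up to the full filter length.
        dist = (taps / delay_taps) * sum(wi * wi for wi in w[:delay_taps])
        dist = max(dist, dist_floor)  # assumed floor, avoids freezing at zero
        # Step size: estimated residual echo power over total error power.
        mu = min(1.0, dist * px / (pe + eps))
        norm = sum(bi * bi for bi in buf) + eps
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        errors.append(e)
    return errors, w
```

When local speech or noise raises the error power without a matching rise in the estimated residual echo power, the step size shrinks and the filter is protected from diverging. A change of the echo path, such as a moving belt microphone, has to be detected separately in this simple form, which is the kind of situation the thesis handles through its coupling factors.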
The speech enhancement of the microphone signal also involves estimating the background noise in the automobile environment. The noise estimation chapter of the thesis proposes a new noise estimation scheme applicable to the belt microphones. The scheme considers noise scenarios involving nonstationary signals, such as the sudden change in the noise properties when a window of the automobile is lowered. By tracking the long-term average and the long-term level, and by taking into account the short-term dynamics of the background noise, a scheme based on multiplicative constants is proposed. The basic idea is to classify the current state of the signal as either speech, slowly changing noise, or fast changing noise. Based on this classification an estimate is made, which is finally combined with the instantaneous background noise. It is shown that in this way the background noise can be tracked aggressively while avoiding the tracking of speech. The scheme has been compared with two other well-known schemes. The evaluation shows that the proposed scheme is the better choice in the evaluated scenarios.
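The classification idea described above can be sketched for a single frequency band as follows. This is only an illustration under stated assumptions, not the scheme of the thesis: the classification rule (comparing a short-term and a long-term smoothed power), the state names, and all constants are hypothetical choices.

```python
def track_noise(powers, inc_speech=1.0005, inc_slow=1.01, inc_fast=1.05,
                speech_ratio=4.0, short=0.7, long=0.98, floor=1e-10):
    """Track the noise power of one band from a sequence of frame powers.

    Returns (noise estimates, per-frame state labels).
    All constants and the classification rule are illustrative assumptions.
    """
    s = l = est = max(powers[0], floor)  # short-term, long-term, estimate
    factors = {"speech": inc_speech, "slow_noise": inc_slow,
               "fast_noise": inc_fast}
    estimates, states = [], []
    for p in powers:
        s = short * s + (1.0 - short) * p
        l = long * l + (1.0 - long) * p
        if s > speech_ratio * est:
            # Input is well above the estimate. If the short-term power
            # agrees with the long-term average, the level is assumed to be
            # a sustained noise change, otherwise a speech burst.
            state = "fast_noise" if abs(s - l) < 0.5 * l else "speech"
        else:
            state = "slow_noise"
        # Multiplicative update, limited by the smoothed input power so the
        # estimate can always follow a falling noise level.
        est = max(floor, min(est * factors[state], s))
        estimates.append(est)
        states.append(state)
    return estimates, states
```

With these assumed constants, a short burst of high power raises the estimate only marginally, whereas a sustained level jump is first treated as speech, then reclassified as fast changing noise and followed with the larger constant.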
The traditional Wiener filter approach to noise suppression is revisited in the final speech enhancement chapter of the thesis. The existing modifications of the Wiener filter are presented as a basis for proposing new ones. The first proposed modification moves from retaining the background noise as a suppressed version to shaping the suppressed noise. Two ways to reshape the background noise are proposed: the first addresses the low-frequency noise present in the belt microphones due to their proximity to the windows, and the second equalizes the noise in the remaining frequencies. The idea is to apply the low-frequency modification permanently and to apply the broadband equalization only to non-speech frequencies. The second proposed modification aims to reduce the speech distortion caused when the background noise needs to be overestimated to avoid the so-called “musical noise”. The modifications are subjectively tested, and the improvements achieved by the methods are shown through spectrogram plots. A hands-free system incorporating the speech enhancement algorithms proposed above has been implemented as a real-time system. This system has been tested in two cars. The software and hardware implementation details are described in the real-time implementation chapter of the thesis. The evaluation of the hands-free system into which the speech enhancement units are integrated is shown.
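As background to the noise suppression modifications mentioned above, the conventional Wiener characteristic with noise overestimation and a spectral floor can be sketched as follows. The sketch shows only the well-known baseline, not the modifications proposed in the thesis. The function name and default values are assumptions, and the optional per-bin floor is merely one hypothetical way to make the retained noise frequency dependent.

```python
def wiener_gains(noisy_psd, noise_psd, overestimation=2.0, floor=0.1):
    """Compute per-bin suppression gains for one frame.

    noisy_psd, noise_psd: power spectral density values per frequency bin.
    overestimation: factor applied to the noise estimate to reduce
        musical noise, at the cost of more speech distortion.
    floor: minimum gain, either one value or a list with one value per
        bin (assumed option for a frequency-dependent residual noise).
    """
    floors = floor if isinstance(floor, list) else [floor] * len(noisy_psd)
    gains = []
    for y, b, g_min in zip(noisy_psd, noise_psd, floors):
        # Wiener-type gain using the overestimated noise power.
        g = 1.0 - overestimation * b / max(y, 1e-12)
        # The floor keeps an attenuated version of the background noise
        # instead of removing it completely.
        gains.append(max(g_min, g))
    return gains
```

Bins dominated by speech keep a gain close to one, while bins dominated by noise are limited by the floor, so the choice of overestimation factor and floor directly sets the trade-off between musical noise, speech distortion, and the character of the remaining noise.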