Seminar "Selected Topics in Speech and Audio Signal Processing"


Basic Information

Seminar overview  
  Lecturers:   Gerhard Schmidt and group  
  Semester:   Winter term  
  Language:   English or German  
  Target group:   Master students in electrical engineering and computer engineering  
  Prerequisites:   Fundamentals in digital signal processing  

To sign up for this seminar, you need to enter the following information in the registration form:

  • surname, first name
  • e-mail address
  • matriculation number

Please note that the registration period starts on 01.10.2019 at 8:00 h and ends on 25.10.2019 at 23:59 h. Applications submitted before or after this registration period will not be taken into account.

During this period, registration is possible on the following subpage: Seminar Registration.

During the registration process you will also choose your seminar topic. Only one student per topic is permitted (first come, first served).

The registration is binding. Deregistration is only possible by sending an e-mail with your name and matriculation number to [e-mail address not shown] by Sunday, 18.10.2019 at 23:59 h. Any later cancellation will be counted as a failed seminar.

  Time:   Preliminary meeting, DSS Library, 28.10.2019 at 16:30 h
          Written report due on 17.02.2020
          Final presentations, Aquarium, 26.02.2020 at 10:00 h

Students write a scientific report on a topic closely related to the current research of the DSS group. Potential topics, therefore, deal with digital signal processing related to speech and audio processing.

Students will also present their findings in front of the other participants and the DSS group.




Topic title   Description  
  Analysis of the Intelligibility of Speech

Many different diseases can impair the intelligibility of a patient's speech. A typical example is Parkinson's disease (Morbus Parkinson), which often leads to overly quiet speech and unclear articulation. In the literature, there are different approaches to classifying such cases, using recordings of different vowels to extract certain features. For this topic, the features extracted from vowels shall be examined, and their use in classifiers such as neural networks to estimate intelligibility shall be evaluated further.

  Frameworks for Speech Recognition

In some cases, using a custom speech recognizer instead of a readily available (commercial) solution is advantageous. However, such a custom recognizer needs to first be implemented and trained, which may be done within an open-source framework. For this seminar topic, open-source frameworks for the development of a speech recognizer are to be analyzed regarding their properties, handling and workflow. As a result, a well-reasoned recommendation for the selection of a framework should be given.

  Microphone Array Signal Processing

Beamforming is commonly used for speech signal acquisition in challenging environments. High background noise may degrade the quality of the microphone signals, and the desired signal may be reverberated due to room acoustics. The task of a beamformer is to selectively pick up signals impinging from a predefined direction, the so-called steering direction. In this seminar, the most recent adaptive beamformer structures should be surveyed by means of a literature study. Different microphone array topologies used for beamforming should be compared, and the spatial sampling of a sound field should also be discussed. A discussion of how to integrate a beamformer with other acoustic systems (such as an echo canceller) is also welcome.
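To make the idea of a steering direction concrete, the following is a minimal sketch of a fixed delay-and-sum beamformer, which is the simplest non-adaptive structure and not one of the adaptive beamformers the topic asks about. It assumes a line array whose steering delays are whole numbers of samples; the function names, the simulated source, and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def simulate_array(num_mics=4, delay_step=2, num_samples=2000,
                   noise_std=0.5, seed=0):
    """Simulate a line array (hypothetical setup): microphone m receives the
    source delayed by m * delay_step samples plus independent sensor noise."""
    rng = random.Random(seed)
    source = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    delays = [m * delay_step for m in range(num_mics)]
    channels = []
    for dly in delays:
        channel = [(source[n - dly] if n >= dly else 0.0)
                   + rng.gauss(0.0, noise_std)
                   for n in range(num_samples)]
        channels.append(channel)
    return source, channels, delays


def delay_and_sum(channels, delays):
    """Compensate the integer steering delay of each channel, then average."""
    length = len(channels[0]) - max(delays)
    return [sum(ch[n + dly] for ch, dly in zip(channels, delays)) / len(channels)
            for n in range(length)]


def mean_square_error(estimate, reference):
    """Mean squared difference over the overlapping part of both signals."""
    pairs = list(zip(estimate, reference))
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    source, channels, delays = simulate_array()
    steered = delay_and_sum(channels, delays)
    unsteered = delay_and_sum(channels, [0] * len(channels))
    print("single microphone:", mean_square_error(channels[0], source))
    print("steered at source:", mean_square_error(steered, source))
    print("steered elsewhere:", mean_square_error(unsteered, source))
```

When the delays match the direction of the source, the signal components add coherently while the uncorrelated sensor noise is averaged down. With mismatched delays, the source itself is smeared, which is the spatial selectivity the description refers to.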

  Acoustic Artifacts Imposed by Speech Enhancement Systems

Communication systems such as ICC systems or communication units in firefighter breathing masks are designed to enhance communication between two or more persons or parties. Such systems often have to deal with several challenges at once, and as a result unwanted effects may occur. The aim of this seminar is to review the mentioned systems with regard to possible unwanted effects.

  Neural Network-based Optimal Step-Size

Adaptive filters based on the normalized least mean squares (NLMS) algorithm are used in many applications, for example for echo cancellation in hands-free telephony. Here, fast convergence and good accuracy are essential for the overall performance of the system. The speed of convergence depends on the so-called step-size: a large step-size yields fast convergence, whereas a small step-size leads to slow convergence but better steady-state performance. Whenever a fixed step-size is chosen, a tradeoff has to be made. However, there are many adaptive step-size approaches, most of which are based on an estimate of the undisturbed error signal. Neural networks are versatile and are used in many different applications. In this work, neural network-based approaches to optimal step-size estimation should be investigated by means of a literature study.
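The tradeoff of a fixed step-size can be illustrated with a small simulation. The sketch below is only an assumption-laden toy example: it identifies a hypothetical four-tap "echo path" from white noise with a fixed-step NLMS filter and tracks the squared system distance between the true and the estimated impulse response. The function name, the impulse response, the noise level, and the regularization constant are all made up for illustration, and no adaptive or neural network-based step-size control is included.

```python
import random


def nlms_distance(step_size, num_samples=4000, noise_std=0.1,
                  eps=1e-2, seed=0):
    """Run a fixed-step NLMS filter on a simulated identification task and
    return the squared system distance ||h - w||^2 after every sample."""
    rng = random.Random(seed)
    h = [0.5, -0.3, 0.2, 0.1]   # hypothetical "echo path" to be identified
    w = [0.0] * len(h)          # adaptive filter coefficients
    buf = [0.0] * len(h)        # most recent input samples, newest first
    distance = []
    for _ in range(num_samples):
        x_n = rng.gauss(0.0, 1.0)
        buf = [x_n] + buf[:-1]
        # Desired signal: echo-path output plus local background noise.
        d_n = sum(hi * xi for hi, xi in zip(h, buf)) + rng.gauss(0.0, noise_std)
        # Error between the desired signal and the filter output.
        e_n = d_n - sum(wi * xi for wi, xi in zip(w, buf))
        # NLMS update: the step is normalized by the input power.
        norm = sum(xi * xi for xi in buf) + eps
        w = [wi + step_size * e_n * xi / norm for wi, xi in zip(w, buf)]
        distance.append(sum((hi - wi) ** 2 for hi, wi in zip(h, w)))
    return distance


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    for mu in (1.0, 0.05):
        dist = nlms_distance(mu)
        print(f"step-size {mu}: early {mean(dist[20:60]):.4f}, "
              f"steady state {mean(dist[-1000:]):.5f}")
```

In this setup the large step-size should reach a small system distance within a few dozen samples but settle at a higher level, while the small step-size converges more slowly and ends closer to the true impulse response. An adaptive step-size aims to combine both behaviors.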