Lecture "Pattern Recognition"

Basic Information

Lecturers:   Gerhard Schmidt (lecture) and Tobias Hübschen (exercise)
Room:   F-SR-II
Language:   English
Target group:   Students in electrical engineering and computer engineering
Prerequisites:   Basics in system theory

This lecture covers the basics of speech, audio, and music signal processing. These applications often rely on schemes based on statistical optimization, with cost functions matched to human auditory perception.

Topic overview:

  • Preprocessing to reduce signal distortions
    • Noise reduction
    • Beamforming
  • Speech and speaker recognition
    • Fundamentals of speech generation
    • Feature extraction
    • Gaussian mixture models (GMMs)
    • Hidden Markov models (HMMs)
    • Recognition of speech and speakers
  • Enhancement of signal playback
    • Extending the bandwidth of speech signals
    • Equalization of loudspeakers
    • Upmix of stereo signals for playback with more than two loudspeakers



Lecture Slides

The slides of the lecture can be found here.


Matlab Demos

  Matlab demo (GUI based) for adaptive noise suppression
  Matlab demo (GUI based) for linear prediction



  Questionnaire for the lecture "Noise Suppression"
  Questionnaire for the lecture "Beamforming"
  Questionnaire for the lecture "Feature extraction"
  Questionnaire for the lecture "Codebook training"
  Questionnaire for the lecture "Bandwidth extension"
  Questionnaire for the lecture "Gaussian Mixture Models"
  Questionnaire for the lecture "Speaker recognition"
  Questionnaire for the lecture "Hidden Markov Models"
  Questionnaire for the lecture "Speech recognition"



At the end of the semester, each student gives a talk on a selected topic. The aim is both to give you the chance to work on a topic related to pattern recognition that interests you and to improve your presentation skills. The talk is also a prerequisite for admission to the exam. Talks should be held in English and last ten minutes, followed by 2.5 minutes of discussion and 2.5 minutes of feedback.

Below you can find the schedule of the talks.

Date   Room   Time   Topic   Presenter(s)
19.01.2018   F-SR-II   10:00 h   Beamforming using Artificial Neural Networks   Nico Simoski
19.01.2018   F-SR-II   10:15 h   Genetic Algorithms   Bastian Kaulen
19.01.2018   F-SR-II   10:30 h   Adaptive Filters   Tim Benedikt Kupke
02.02.2018   F-SR-II   08:20 h   Speaker Recognition using Neural Networks   Patricia Piepjohn
02.02.2018   F-SR-II   08:35 h   Proper Orthogonal Decomposition based Cancer Detection Technique   Sunasheer Bhattacharjee
02.02.2018   F-SR-II   08:50 h   Decision Tree   Ali Hadidi
02.02.2018   F-SR-II   09:05 h   Image Feature Detection   Karl Heger
02.02.2018   F-SR-II   09:30 h   Pattern Recognition for Earthquake Detection   Jonas Weiss
02.02.2018   F-SR-II   09:45 h   Fuzzy Logic in Pattern Recognition   Malte Wrobel
02.02.2018   F-SR-II   10:00 h   Optimization Criteria for Noise Suppression   Christian Olsiewski
02.02.2018   F-SR-II   10:15 h   Pattern Recognition based Kalman Filter for Indoor Localization using TDOA algorithm   Simon Helling, Fabian Heuer
09.02.2018   F-SR-II   08:20 h   Face Recognition using Neural Networks   Anton Lösch
09.02.2018   F-SR-II   08:35 h   Noise-Adaptive Hidden Markov Models based on Wiener Filters   Avitha Francis
09.02.2018   F-SR-II   08:50 h   Handwriting Recognition   Torben Krause
09.02.2018   F-SR-II   09:15 h   Google Deep Dream   Egzon Miftaraj, Gerrit Oldenburger
09.02.2018   F-SR-II   09:45 h   Speech Emotion Recognition   Hamed Tavakol



Recent Publications

T. O. Wisch, T. Kaak, A. Namenas, G. Schmidt: Spracherkennung in stark gestörten Unterwasserumgebungen, Proc. DAGA 2018

S. Graf, T. Herbig, M. Buck, G. Schmidt: Low-Complexity Pitch Estimation Based on Phase Differences Between Low-Resolution Spectra, Proc. Interspeech, pp. 2316-2320, 2017


