Medical Diagnosis Aided by Automatic Classification: Non-blackbox Approaches for Clinical Voice Assessment

DI Dr.techn. Philipp Aichinger, Medical University of Vienna, Dept. of Otorhinolaryngology, Division of Phoniatrics-Logopedics


  • Date: 06.05.2019
  • Time: 16:00 h
  • Place: Aquarium, Building D, Faculty of Engineering, Kaiserstr. 2, 24143 Kiel



Voice disorders are socially relevant because, without adequate and timely treatment, they may lead to significant follow-up costs for health insurers and the economy. Voice quality characterization is pivotal to the clinical care of voice disorders because it aids the indication, selection, evaluation, and optimization of clinical treatment techniques, including speech therapy administered by logopedists / speech-language pathologists, and phonosurgery performed by medical doctors specializing in voice disorders.

Current approaches to artificial intelligence, including (deep) neural networks, are not fully accepted by clinical experts, partly because of their black-box nature: in particular, their explanatory power is low. In contrast, we propose to use hand-crafted, model-based features as input to low-dimensional automatic classifiers. Our features are meant to closely represent the properties of the voice, which are described on the levels of voice production, acoustics, and perception.
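To make the contrast with black-box models concrete, the following is a minimal sketch of a one-dimensional classifier whose entire decision rule is a single readable threshold. It is not the method presented in the talk; the feature, its values, and the function names `fit_threshold` and `predict` are hypothetical and serve only as an illustration.

```python
from typing import List, Tuple


def fit_threshold(values: List[float], labels: List[int]) -> Tuple[float, float]:
    """Find the threshold on one feature that best separates two classes.

    Decision rule: predict 1 when value > threshold, otherwise 0.
    Returns (threshold, training accuracy).
    """
    points = sorted(set(values))
    if len(points) < 2:
        raise ValueError("need at least two distinct feature values")
    # Candidate thresholds lie midway between neighbouring feature values.
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    best_threshold, best_accuracy = candidates[0], 0.0
    for threshold in candidates:
        correct = sum((v > threshold) == bool(y) for v, y in zip(values, labels))
        accuracy = correct / len(values)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold, best_accuracy


def predict(value: float, threshold: float) -> int:
    """Apply the learned rule to a single feature value."""
    return int(value > threshold)


# Synthetic, made-up values of a hypothetical feature (not clinical data).
feature = [0.10, 0.15, 0.20, 0.25, 0.60, 0.70, 0.80, 0.90]
label = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, accuracy = fit_threshold(feature, label)
print(f"predict 1 if feature > {threshold:.3f} (training accuracy {accuracy:.2f})")
```

The learned rule can be read and checked by a clinician directly, which is the kind of explanatory power that a network with many parameters does not offer.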

Diplophonia is a particular type of pathological voice quality in which clinical experts report two simultaneously audible pitches. Diplophonia may be a symptom of a vocal dysfunction that needs medical treatment. This inherently subjective definition, located in the domain of auditory perception, is complemented by our approaches to tracking two simultaneous fundamental frequencies in high-speed videos of the vocal folds and in audio signals. First steps with a physiologically grounded hearing model are also presented. The hearing model is used to predict, from decomposed audio signals of the voice, the presence of two simultaneously perceivable pitches.
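As a toy illustration of what finding two simultaneous frequencies in an audio signal involves, the sketch below picks the two strongest spectral peaks of a synthetic two-tone signal. It is not the tracking method of the talk: the signal, the sampling rate, the search range, and the function name `two_strongest_frequencies` are assumptions. Real voices contain many harmonics, so simple peak picking like this would not suffice for clinical recordings.

```python
import cmath
import math
from typing import List, Sequence


def two_strongest_frequencies(
    signal: Sequence[float], sample_rate: int, f_min: int = 60, f_max: int = 400
) -> List[int]:
    """Return the two strongest spectral peaks (in Hz, ascending).

    The spectrum is evaluated on a 1 Hz grid between f_min and f_max.
    """
    magnitudes = []
    for freq in range(f_min, f_max + 1):
        # Correlate the signal with a complex exponential at this frequency.
        step = -2j * math.pi * freq / sample_rate
        total = sum(x * cmath.exp(step * n) for n, x in enumerate(signal))
        magnitudes.append(abs(total))
    # Keep only local maxima of the magnitude spectrum.
    peaks = [
        (magnitudes[i], f_min + i)
        for i in range(1, len(magnitudes) - 1)
        if magnitudes[i] > magnitudes[i - 1] and magnitudes[i] >= magnitudes[i + 1]
    ]
    strongest = sorted(peaks, reverse=True)[:2]
    return sorted(freq for _, freq in strongest)


# Synthetic test signal: two sinusoids at 120 Hz and 172 Hz, 0.25 s long.
sample_rate = 4000
signal = [
    math.sin(2 * math.pi * 120 * n / sample_rate)
    + 0.8 * math.sin(2 * math.pi * 172 * n / sample_rate)
    for n in range(1000)
]

estimated = two_strongest_frequencies(signal, sample_rate)
print(estimated)
```

For this synthetic input the two peaks fall on the 120 Hz and 172 Hz components. Whether a listener would actually perceive two pitches is a separate question, which is where a hearing model comes in.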

The figure shows an endoscopic picture of human vocal folds.


Short biography

Philipp Aichinger is a Research Associate at the Medical University of Vienna (MUV), where he is affiliated with the Department of Otorhinolaryngology, Division of Phoniatrics-Logopedics. He completed interdisciplinary studies in Electrical Engineering/Sound Engineering at the Graz University of Technology (TUG) and the University of Music and Dramatic Arts in Graz (KUG), acquiring expertise in both engineering and music/perception research. His PhD thesis, "Diplophonic Voice - Definitions, models, and detection", was jointly supervised at the TUG and the MUV. Philipp is Principal Investigator of the research project FWF KLI722-B30, entitled “Objective differentiation of dysphonic voice quality types” and funded within the Clinical Research program of the Austrian Science Fund (FWF). He is an organizer of the 2019 Special Session at Interspeech entitled “Voice quality characterization for clinical voice assessment: Voice production, acoustics, and auditory perception”. He is a member of the IEEE Signal Processing Society, the Audio Engineering Society, and the Acoustical Society of America. He is a reviewer for the Journal of the Acoustical Society of America, the IEEE Transactions on Audio, Speech and Language Processing, the Journal of Medical and Biological Engineering, the journal Biomedical Signal Processing and Control, and Acta Acustica united with Acustica.




Prof. Dr.-Ing. Gerhard Schmidt


Christian-Albrechts-Universität zu Kiel
Faculty of Engineering
Institute for Electrical Engineering and Information Engineering
Digital Signal Processing and System Theory

Kaiserstr. 2
24143 Kiel, Germany

