The perceived overall quality of a speech communication system is one of the most important “Key Quality Indicators” (KQI) for network providers. However, this indicator provides little information about the underlying cause of a quality degradation within the system. This project therefore focused on the diagnostic quality analysis of transmitted speech by breaking down the score for overall quality (“Mean Opinion Score”, MOS) into perceptual quality dimensions. Previous work by the collaborating institutions showed that speech quality can be modeled by the following four perceptual dimensions: “Noisiness”, “Coloration”, “Discontinuity”, and “Suboptimal Loudness”. All four dimensions are rated subjectively by test subjects in auditory listening experiments, but such ratings can also be estimated with instrumental models. The aim of this analytical approach is to trace a degradation in speech quality directly back to its technical cause within the network or the end-user device (root-cause analysis).
The main research questions of this project were the robust instrumental estimation of the perceptual quality dimensions and of the underlying technical causes, and the identification of interdependencies between quality dimensions, technical causes, and overall quality. In this context, the following results stand out:
Within the scope of the reference-based estimation of the quality dimensions, a novel and robust estimator for the dimension “Noisiness” was developed. Its algorithm operates independently of the signal amplitude, and its estimation accuracy is characterized by a maximum “epsilon-insensitive Root Mean Square Error” (RMSE*) of 0.22, a value that is well within the range required by the Telecommunication Standardization Sector of the International Telecommunication Union (ITU-T). Additionally, there are promising results regarding the reference-free estimation of both the four quality dimensions and the overall quality. The current neural network approach already provides an accuracy within the range required by the ITU-T.
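For readers unfamiliar with the RMSE* figure quoted above, the following is a minimal sketch of an epsilon-insensitive RMSE as it is commonly defined for evaluating quality prediction models (for example in ITU-T Rec. P.1401): prediction errors that fall inside the 95 % confidence interval of the subjective rating are not penalized. The function name, the example data, and the default value of the degrees-of-freedom parameter d are assumptions made for illustration; the summary does not state how the project computed its value.

```python
import math


def rmse_star(subjective, predicted, ci95, d=1):
    """Epsilon-insensitive RMSE over per-condition scores.

    subjective: subjective ratings per condition (e.g. Noisiness scores)
    predicted:  instrumental estimates for the same conditions
    ci95:       95 % confidence interval half-widths of the subjective ratings
    d:          degrees of freedom of any mapping applied beforehand
                (assumed default of 1, i.e. no mapping function)
    """
    if not (len(subjective) == len(predicted) == len(ci95)):
        raise ValueError("all inputs must have the same length")
    n = len(subjective)
    if n <= d:
        raise ValueError("need more conditions than degrees of freedom")

    squared_sum = 0.0
    for rating, estimate, ci in zip(subjective, predicted, ci95):
        # Errors smaller than the confidence interval count as zero.
        err = max(0.0, abs(rating - estimate) - ci)
        squared_sum += err * err
    return math.sqrt(squared_sum / (n - d))


# Hypothetical ratings for four test conditions (not project data).
ratings = [4.2, 3.1, 2.5, 1.8]
estimates = [4.0, 3.6, 2.4, 2.5]
intervals = [0.3, 0.2, 0.2, 0.3]
score = rmse_star(ratings, estimates, intervals)
print(round(score, 3))
```

In this made-up example only the second and fourth conditions contribute to the error, because the other two estimates lie within the confidence interval of the subjective rating.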
The technical causes considered in this project were mainly packet loss and speech coding effects, since these causes are most relevant for the industry partners (Deutsche Telekom AG, Rohde & Schwarz). Algorithms developed in the project detect packet loss with an accuracy of 93 % and distinguish three bitrate classes of the AMR-WB codec with an accuracy of 95 %. Furthermore, a joint model was developed to robustly separate the two types of degradation. The model quantitatively indicates the contribution of a technical cause to the observed overall quality degradation.
These and all other significant project results were published at peer-reviewed international conferences. Where relevant, the results were provided to the industry partners and discussed at the ITU-T as contributions to the work items P.AMD and P.TCA. Moreover, some results are freely available online as executables.
G. Mittag, S. Möller: Quality Degradation Diagnosis for Voice Networks - Estimating the Perceived Noisiness, Coloration, and Discontinuity of Transmitted Speech, Accepted for publ. in Proc. Interspeech 2019, Graz, 2019
G. Mittag, S. Möller: Non-intrusive Speech Quality Assessment for Super-Wideband Speech Communication Networks, Proc. ICASSP 2019, Brighton, 2019
G. Mittag, S. Möller: Quality Estimation of Noisy Speech Using Spectral Entropy Distance, Proc. ICT 2019, Hanoi, 2019
G. Mittag, S. Möller: Semantic Labeling of Quality Impairments in Speech Spectrograms with Deep Convolutional Networks, Proc. QoMEX 2019, Berlin, 2019
G. Mittag, L. Liedtke, N. Iskender, B. Naderi, T. Hübschen, G. Schmidt, S. Möller: Einfluss der Position und Stimmhaftigkeit von verdeckten Paketverlusten auf die Sprachqualität, Proc. DAGA, Germany, 2019
T. Hübschen, B. Kaulen, M. Yurdakul, G. Schmidt: Sprachqualität in drahtlosen Ad-Hoc-Netzwerken, Proc. DAGA, Germany, 2019
S. Möller, T. Hübschen, G. Mittag, G. Schmidt: Zusammenhang zwischen perzeptiven Dimensionen und Störungsursachen bei super-breitbandiger Sprachübertragung, Proc. DAGA, Germany, 2019
G. Mittag, S. Möller: Non-intrusive Estimation of Packet Loss Rates in Speech Communication Systems Using Convolutional Neural Networks, Proc. ISM 2018, Taichung, 2018
G. Mittag, S. Möller: Single-ended Packet Loss Rate Estimation of Transmitted Speech Signals, Proc. 6th IEEE Global Conf. on Signal and Information Processing, Anaheim, 2018
S. Möller, T. Hübschen, G. Mittag, G. Schmidt: Diagnostic and Summative Approach for Predicting Speech Communication Quality in a Super-Wideband Context, Proc. ITG, Oldenburg, Germany, 2018
T. Hübschen, G. Mittag, S. Möller, G. Schmidt: Signal-based Root Cause Analysis of Quality Impairments in Speech Communication Networks, Proc. ITG, Oldenburg, Germany, 2018
T. Hübschen, G. Schmidt: Bitrate and Tandem Detection for the AMR-WB Codec with Application to Network Testing, Proc. EUSIPCO 2018, Rome, 2018
G. Mittag, S. Möller: Detecting Packet-Loss Concealment Using Formant Features and Decision Tree Learning, Proc. Interspeech 2018, pp. 1883-1887, 2018
T. Hübschen, M. Gimm, B. Kaulen, G. Mittag, S. Möller, G. Schmidt: Echtzeit-Rahmenwerk zur Unterstützung der Evaluierung von Sprachkommunikationssystemen, Proc. DAGA 2018 (online access), Munich, 2018
F. Köster, G. Mittag, S. Möller: Modeling the Overall Quality of Experience on the Basis of Underlying Quality Dimensions, 9th Intern. Conf. on Quality of Multimedia Experience (QoMEX), pp. 1-6, 2017
Corresponding ITU-T Contributions (SG 12, Study Period 2017)
G. Mittag, S. Möller: Proposal for the Requirement Specification of P.SAMD, Contribution 384, 2019
G. Mittag, S. Möller: P.SAMD interim results and intended next steps, Contribution 383, 2019
T. Hübschen, G. Mittag, S. Möller, G. Schmidt: Towards a Signal-based Root Cause Analysis Framework, Contribution 304, 2018
G. Mittag, J. Torres Menendet, S. Möller: P.AMD Set A updated performance results of Noisiness Dimension, Contribution 303, 2018
G. Mittag, S. Möller: P.SAMD update of ongoing work, Contribution 300, 2018
G. Mittag, F. Köster, S. Möller: First possible P.SAMD indicators for the Estimation of Noisiness, Contribution 43, 2017
G. Mittag, F. Köster, S. Möller: First possible P.SAMD indicators for the Estimation of Coloration, Contribution 42, 2017
S. Möller: Results of an updated DIAL model for P.AMD set A, Temporary Document 76-Gen, 2017