TY - CONF AB - In this contribution we derive the Maximum A-Posteriori (MAP) estimates of the parameters of a Gaussian Mixture Model (GMM) in the presence of noisy observations. We assume the distortion to be white Gaussian noise of known mean and variance. An approximate conjugate prior of the GMM parameters is derived allowing for a computationally efficient implementation in a sequential estimation framework. Simulations on artificially generated data demonstrate the superiority of the proposed method compared to the Maximum Likelihood technique and to the ordinary MAP approach, whose estimates are corrected by the known statistics of the distortion in a straightforward manner. AU - Chinaev, Aleksej AU - Haeb-Umbach, Reinhold ID - 11740 KW - Gaussian noise KW - maximum likelihood estimation KW - parameter estimation KW - GMM parameter KW - Gaussian mixture model KW - MAP estimation KW - MAP-based estimation KW - maximum a-posteriori estimation KW - maximum likelihood technique KW - noisy observation KW - sequential estimation framework KW - white Gaussian noise KW - Additive noise KW - Noise measurement KW - Maximum a posteriori estimation SN - 1520-6149 T2 - 38th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2013) TI - MAP-based Estimation of the Parameters of a Gaussian Mixture Model in the Presence of Noisy Observations ER - TY - CONF AB - In this paper, we consider the Maximum Likelihood (ML) estimation of the parameters of a Gaussian in the presence of censored, i.e., clipped data. We show that the resulting Expectation Maximization (EM) algorithm delivers virtually bias-free and efficient estimates, and we discuss its convergence properties. We also discuss optimal classification in the presence of censored data.
Censored data are frequently encountered in wireless LAN positioning systems based on the fingerprinting method employing signal strength measurements, due to the limited sensitivity of the portable devices. Experiments both on simulated and real-world data demonstrate the effectiveness of the proposed algorithms. AU - Hoang, Manh Kha AU - Haeb-Umbach, Reinhold ID - 11816 KW - Gaussian processes KW - Global Positioning System KW - convergence KW - expectation-maximisation algorithm KW - fingerprint identification KW - indoor radio KW - signal classification KW - wireless LAN KW - EM algorithm KW - ML estimation KW - WiFi indoor positioning KW - censored Gaussian data classification KW - clipped data KW - convergence properties KW - expectation maximization algorithm KW - fingerprinting method KW - maximum likelihood estimation KW - optimal classification KW - parameters estimation KW - portable devices sensitivity KW - signal strength measurements KW - wireless LAN positioning systems KW - IEEE 802.11 Standards KW - Parameter estimation KW - Position measurement KW - Training KW - Indoor positioning KW - censored data KW - expectation maximization KW - signal strength SN - 1520-6149 T2 - 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013) TI - Parameter estimation and classification of censored Gaussian data with application to WiFi indoor positioning ER - TY - CONF AB - The paper proposes a modification of the standard maximum a posteriori (MAP) method for the estimation of the parameters of a Gaussian process for cases where the process is superposed by additive Gaussian observation errors of known variance. Simulations on artificially generated data demonstrate the superiority of the proposed method.
The proposed method reduces to the ordinary MAP approach in the absence of observation noise, and the improvement becomes more pronounced as the variance of the observation noise increases. The method is further extended to track the parameters in the case of non-stationary Gaussian processes. AU - Krueger, Alexander AU - Haeb-Umbach, Reinhold ID - 11845 KW - Gaussian processes KW - MAP-based estimation KW - maximum a posteriori method KW - maximum likelihood estimation KW - nonstationary Gaussian processes T2 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2011) TI - MAP-based estimation of the parameters of non-stationary Gaussian processes from noisy observations ER - TY - JOUR AB - In this paper, we present a new technique for automatic speech recognition (ASR) in reverberant environments. Our approach is aimed at the enhancement of the logarithmic Mel power spectrum, which is computed at an intermediate stage to obtain the widely used Mel frequency cepstral coefficients (MFCCs). Given the reverberant logarithmic Mel power spectral coefficients (LMPSCs), a minimum mean square error estimate of the clean LMPSCs is computed by carrying out Bayesian inference. We employ switching linear dynamical models as an a priori model for the dynamics of the clean LMPSCs. Further, we derive a stochastic observation model which relates the clean to the reverberant LMPSCs through a simplified model of the room impulse response (RIR). This model requires only two parameters, namely RIR energy and reverberation time, which can be estimated from the captured microphone signal. The performance of the proposed enhancement technique is studied on the AURORA5 database and compared to that of constrained maximum-likelihood linear regression (CMLLR). It is shown by experimental results that our approach significantly outperforms CMLLR and that up to 80% of the errors caused by the reverberation are recovered.
In addition to the fact that the approach is compatible with the standard MFCC feature vectors, it leaves the ASR back-end unchanged. It is of moderate computational complexity and suitable for real-time applications. AU - Krueger, Alexander AU - Haeb-Umbach, Reinhold ID - 11846 IS - 7 JF - IEEE Transactions on Audio, Speech, and Language Processing KW - ASR KW - AURORA5 database KW - automatic speech recognition KW - Bayesian inference KW - belief networks KW - CMLLR KW - computational complexity KW - constrained maximum likelihood linear regression KW - least mean squares methods KW - LMPSC computation KW - logarithmic Mel power spectrum KW - maximum likelihood estimation KW - Mel frequency cepstral coefficients KW - MFCC feature vectors KW - microphone signal KW - minimum mean square error estimation KW - model-based feature enhancement KW - regression analysis KW - reverberant speech recognition KW - reverberation KW - RIR energy KW - room impulse response KW - speech recognition KW - stochastic observation model KW - stochastic processes TI - Model-Based Feature Enhancement for Reverberant Speech Recognition VL - 18 ER - TY - CONF AB - In this paper we present a novel channel impulse response estimation technique for block-oriented OFDM transmission based on combining estimators: the estimates provided by a Kalman filter operating in the time domain and a Wiener filter in the frequency domain are optimally combined by taking into account their estimated error covariances. The resulting estimator turns out to be identical to the MAP estimator of correlated jointly Gaussian mean vectors. Different variants of the proposed scheme are experimentally investigated in an IEEE 802.11a-like system setup. They compare favourably with known approaches from the literature resulting in reduced mean square estimation error and bit error rate.
Further, robustness and complexity issues are discussed. AU - Haeb-Umbach, Reinhold AU - Bevermeier, Maik ID - 11785 KW - bit error rate KW - block-oriented OFDM transmission KW - channel estimation KW - channel impulse response estimation KW - combining estimators KW - error statistics KW - frequency domain estimation KW - Gaussian mean vectors KW - Gaussian processes KW - Kalman filter KW - Kalman filters KW - MAP estimator KW - maximum likelihood estimation KW - OFDM channel estimation KW - OFDM modulation KW - time domain estimation KW - time-frequency analysis KW - Wiener filter KW - Wiener filters T2 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007) TI - OFDM Channel Estimation Based on Combined Estimation in Time and Frequency Domain VL - 3 ER - TY - CONF AB - Soft-feature based speech recognition, which is an example of uncertainty decoding, has been proven to be a robust error mitigation method for distributed speech recognition over wireless channels exhibiting bit errors. In this paper we extend this concept to packet-oriented transmissions. The a posteriori probability density function of the lost feature vector, given the closest received neighbours, is computed. In the experiments, the nearest frame repetition, which is shown to be equivalent to the MAP estimate, outperforms the MMSE estimate for long bursts. Taking the variance into account at the speech recognition stage results in superior performance compared to classical schemes using point estimates.
A computationally and memory efficient implementation of the proposed packet loss compensation scheme based on table lookup is presented. AU - Ion, Valentin AU - Haeb-Umbach, Reinhold ID - 11824 KW - distributed speech recognition KW - least mean squares methods KW - MAP estimate KW - maximum likelihood estimation KW - MMSE estimate KW - packet loss compensation scheme KW - packet switched communication KW - posteriori probability density function KW - robust error mitigation method KW - soft-features KW - speech recognition KW - table lookup KW - voice communication KW - wireless channels T2 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2006) TI - An Inexpensive Packet Loss Compensation Scheme for Distributed Speech Recognition Based on Soft-Features VL - 1 ER - TY - JOUR AB - In this paper, it is shown that a correlation criterion is the appropriate criterion for bottom-up clustering to obtain broad phonetic class regression trees for maximum likelihood linear regression (MLLR)-based speaker adaptation. The correlation structure among speech units is estimated on the speaker-independent training data.
In adaptation experiments, the tree outperformed a regression tree obtained from clustering according to closeness in acoustic space and achieved results comparable with those of a manually designed broad phonetic class tree. AU - Haeb-Umbach, Reinhold ID - 11778 IS - 3 JF - IEEE Transactions on Speech and Audio Processing KW - acoustic space KW - adaptation experiments KW - automatic generation KW - bottom-up clustering KW - broad phonetic class regression trees KW - correlation criterion KW - correlation methods KW - maximum likelihood estimation KW - maximum likelihood linear regression based speaker adaptation KW - MLLR adaptation KW - pattern clustering KW - phonetic regression class trees KW - speaker-independent training data KW - speech recognition KW - speech units KW - statistical analysis KW - trees (mathematics) TI - Automatic generation of phonetic regression class trees for MLLR adaptation VL - 9 ER -