Problem:
STChT requires reliable pitch estimation in order to provide a sharp TF representation. This is challenging, because pitch estimation in noisy and multi-speaker environments is difficult.
Method Description:
A combined pitch estimation method that pairs autocorrelation (ACF) with the cepstrum (Cep), plus some additional tricks. We know that the autocorrelation extracts the periodicity of the speech signal even in a noisy background, but it gives multiple pitch candidates because of double-pitch, half-pitch, etc. This is taken care of by the cepstrum applied on top of the ACF, which merges all autocorrelation peak candidates into one cepstrum-based pitch candidate. The trick lies in between: the cepstrum is reliable only if the spectrum it uses is nice enough, i.e. dominant, rich in harmonics, and as flat as possible. But how could a noisy speech spectrum be nice like that?

Well, a half-wave rectified autocorrelation leads to almost exactly such a spectrum: one with boosted periodicities and an enhanced spectral representation of hidden harmonicities.
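The chain described above (ACF, half-wave rectification, spectrum, cepstrum) can be sketched for a single frame as follows. This is only a minimal illustration under stated assumptions, not the actual implementation: the function name `acf_cepstrum_pitch`, the Hann window, the relative spectral floor, and the 60 to 350 Hz search range are all hypothetical choices, and the "additional tricks" are not included. Being untuned, it can still make octave errors on real speech.

```python
import numpy as np


def acf_cepstrum_pitch(frame, fs, fmin=60.0, fmax=350.0):
    """Estimate one pitch value (Hz) for a single frame.

    Hypothetical sketch: ACF -> half-wave rectification -> spectrum -> cepstrum.
    fmin/fmax bound the pitch search range and are assumed values.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    nfft = 2 * n  # zero-padding avoids circular wrap-around in the ACF

    # Remove the mean and taper the frame (Hann window is an assumption).
    windowed = (frame - frame.mean()) * np.hanning(n)

    # Autocorrelation as the inverse FFT of the power spectrum.
    power = np.abs(np.fft.rfft(windowed, nfft)) ** 2
    acf = np.fft.irfft(power, nfft)
    acf /= acf[0] + 1e-12

    # Half-wave rectification: keep only the positive ACF lobes.
    acf_rect = np.maximum(acf, 0.0)

    # Spectrum of the rectified ACF; the relative floor (assumed value)
    # keeps the logarithm finite in the gaps between harmonics.
    spectrum = np.abs(np.fft.rfft(acf_rect))
    log_spec = np.log(spectrum + 1e-3 * spectrum.max())

    # Cepstrum: inverse FFT of the log spectrum.
    cepstrum = np.fft.irfft(log_spec, nfft)

    # Pick the strongest cepstral peak inside the allowed pitch-lag range.
    lag_min = int(fs / fmax)
    lag_max = int(fs / fmin)
    peak_lag = lag_min + int(np.argmax(cepstrum[lag_min:lag_max + 1]))
    return fs / peak_lag
```

The function expects a frame long enough to cover several pitch periods (for example 1024 samples at 16 kHz). The search range matters: the cepstral peak is only picked between `fs / fmax` and `fs / fmin` samples, so a range that is too wide makes octave errors more likely.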
References:
No publications yet.