4. Spectral Reindexing for Pitch Estimation

Problem:
Need for a powerful but straightforward pitch estimation method.

The idea:
Reorder the information represented by the frequency bins of a spectrogram (FFT, FChT or ChT) into an FoGram, which is indexed by pitch candidate instead of by frequency.

Auditory Perceptual Integration:
The main idea is to scan through all possible pitch candidates and assign, to every candidate fo, the sum of the energy values at fo, 2fo, … , nH·fo. As an equation:

\[ \mathrm{FoGram}(f_o) = \sum_{i=1}^{n_H} S(i \cdot f_o) \]

fo … pitch candidate (usually between 80 and 380 Hz),
nH … number of harmonics considered for gathering,
S(i·fo) … spectral sample at i × fo.
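
The summation described above can be sketched in a few lines of Python. This is only a minimal illustration, not the authors' implementation: the names fogram, bin_hz and candidates_hz are made up for this example, the spectrum is a plain Hann-windowed FFT energy spectrum rather than an FChT, and sampling the spectrum between bins by linear interpolation is an assumption.

```python
import numpy as np

def fogram(spectrum, bin_hz, candidates_hz, n_harmonics):
    """For each candidate fo, sum the spectral energy at fo, 2*fo, ..., nH*fo.

    spectrum      : 1-D energy spectrum, bin k sits at k * bin_hz
    bin_hz        : width of one frequency bin in Hz
    candidates_hz : pitch candidates fo in Hz
    n_harmonics   : number of harmonics nH gathered per candidate
    """
    spectrum = np.asarray(spectrum, dtype=float)
    candidates_hz = np.asarray(candidates_hz, dtype=float)
    bin_freqs = np.arange(len(spectrum)) * bin_hz
    harmonics = np.arange(1, n_harmonics + 1)
    # Probe frequencies i * fo, shape (n_candidates, n_harmonics).
    probe = candidates_hz[:, None] * harmonics[None, :]
    # Assumption: linear interpolation between bins; zero above the last bin.
    samples = np.interp(probe.ravel(), bin_freqs, spectrum, right=0.0)
    return samples.reshape(probe.shape).sum(axis=1)

def estimate_pitch_demo():
    """Synthetic test tone: 10 equal harmonics of 150 Hz, coarse FFT."""
    fs, n = 8000, 512                      # bin width = 15.625 Hz
    t = np.arange(n) / fs
    x = sum(np.sin(2 * np.pi * k * 150.0 * t) for k in range(1, 11))
    energy = np.abs(np.fft.rfft(x * np.hanning(n))) ** 2
    candidates = np.arange(80.0, 380.5, 0.5)   # 0.5 Hz candidate grid
    fg = fogram(energy, fs / n, candidates, n_harmonics=10)
    return candidates[np.argmax(fg)]

print("estimated fo [Hz]:", estimate_pitch_demo())
```

Note that the candidate grid (0.5 Hz here) is chosen independently of the bin width of the underlying spectrum (about 15.6 Hz here). Plain harmonic summation like this can also respond to sub-multiples of the true pitch, so restricting the candidate range, as in the 80 to 380 Hz range above, matters.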

An example of such an FoGram, derived from an HChT spectrogram, is shown below (courtesy of Cancela et al.):

Zooming into the image shows that the FoGram provides extremely high frequency resolution (below 1 Hz), far finer than the frequency resolution of the spectral representation it is derived from (usually 10-30 Hz per frequency bin).
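
One plausible way to read this resolution gain is the following back-of-the-envelope argument. It is not taken from the cited papers, and the symbols δ (a small pitch error) and Δf (the bin width of the spectrum) are introduced here only for illustration. Shifting the candidate fo by δ shifts the i-th probed frequency by i·δ, so the highest harmonic in the sum leaves its bin once δ reaches roughly Δf / nH:

```latex
i \cdot (f_o + \delta) - i \cdot f_o = i\,\delta
\qquad\Longrightarrow\qquad
\delta \approx \frac{\Delta f}{n_H},
\qquad \text{e.g.}\quad \frac{20\ \mathrm{Hz}}{25} = 0.8\ \mathrm{Hz}
```

Under this reading, a 20 Hz bin width combined with 25 harmonics would already distinguish pitch candidates less than 1 Hz apart, provided the signal actually carries energy in those upper harmonics.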

References:
[1] M. Képesi, L. Weruaga, E. Schofield, “Detailed Multidimensional Analysis of our Acoustical Environment,” Forum Acusticum, Budapest (Hu), Sep. 2005, pp. 2649-2654.
[2] M. Képesi, L. Weruaga, “High-resolution noise-robust spectral-based pitch estimation,” Interspeech 2005, Lisboa (P), Sep. 2005, pp. 313-316.

Related Work:

[3] P. Cancela, “Tracking melody in polyphonic audio. MIREX 2008,” in Proc. Music Inf. Retrieval Evaluation eXchange, 2008.
[4] P. Cancela, E. Lopez, M. Rocamora, “Fan Chirp Transform for Music Representation,” DAFx 2010. (The “F0-gram”, i.e. GlogS, is discussed in chapter 4.)
[5] P. Zhao, Z. Zhang, X. Wu, “Monaural speech separation based on multi-scale Fan-Chirp Transform,” ICASSP 2008, March 31, 2008, pp. 161-164.
