Problem:
Crossing pitch trajectories make multipitch tracking a difficult task. We are therefore looking for extra features that the tracking algorithm could use when assigning pitch trajectories to speakers (acoustic sources). Since we address single-channel recordings, the PoPi plane (i.e. linking position to pitch) is not the way to go.
Method Description:
The idea is to use the pitch change rate as an additional cue for multipitch tracking (regardless of HOW one obtains that feature). Why? If the pitch trajectories of two speakers cross in a given frame, then the pitch of each speaker is coming “from somewhere”, and this “somewhere” is hopefully not the same for both, as the trajectories cross only in the problematic frame. The information about where the pitch is “coming from” and where it is “going to” is nothing else but the chirp rate. Of course, the question is how to decompose the signal effectively into a representation that links each pitch to its rate of change.
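To make the argument concrete, here is a minimal sketch of how the chirp rate could break the tie at a crossing. It is not the tracking algorithm itself, only an illustration under assumptions: alpha is taken to be the relative pitch change rate in 1/s, so that the pitch one hop later is roughly f0 * (1 + alpha * hop), and the cost function, the `rate_weight` parameter, and all numbers are hypothetical.

```python
from itertools import permutations

def association_costs(tracks, candidates, hop, rate_weight):
    """Cost of every one-to-one assignment of candidates to tracks.

    tracks:     list of (f0_prev, alpha_prev), one per speaker (previous frame)
    candidates: list of (f0, alpha) found in the current frame
    hop:        frame hop in seconds
    The cost terms are an illustrative assumption, not taken from the post.
    """
    costs = {}
    for perm in permutations(range(len(candidates)), len(tracks)):
        total = 0.0
        for track_idx, cand_idx in enumerate(perm):
            f0_prev, alpha_prev = tracks[track_idx]
            f0, alpha = candidates[cand_idx]
            # Where the track is "going to", predicted from its chirp rate.
            predicted = f0_prev * (1.0 + alpha_prev * hop)
            pitch_term = abs(f0 - predicted) / f0_prev
            # How much the chirp rate itself would have to jump.
            rate_term = abs(alpha - alpha_prev) * hop
            total += pitch_term + rate_weight * rate_term
        costs[perm] = total
    return costs

hop = 0.01
# Speaker 0 is rising from 140 Hz, speaker 1 is falling from 160 Hz.
tracks = [(140.0, 7.0), (160.0, -6.0)]
# In the crossing frame both candidates sit at 150 Hz; only alpha differs.
candidates = [(150.0, -6.2), (150.0, 6.8)]

pitch_only = association_costs(tracks, candidates, hop, rate_weight=0.0)
with_rate = association_costs(tracks, candidates, hop, rate_weight=1.0)
best = min(with_rate, key=with_rate.get)
print(pitch_only)
print(best)
```

With the pitch values alone, both assignments cost the same, so the tracker cannot tell the speakers apart in the crossing frame. Once the chirp-rate term is included, the assignment that keeps each speaker on its rising or falling trajectory becomes the cheaper one.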
One possible solution is to take the frame under analysis, pre-warp it with different chirp-rate candidates (just like in the fast implementation of the Chirp Transform), and extract the pitch candidates for each pre-warping factor. With this we get a chirp-rate vs. pitch plane (i.e. the alpha-f0 plane), which shows not only the actual pitch value of the speaker but also the “direction” the pitch is coming from, i.e. whether the pitch value was higher or lower in the previous frame (positive or negative alpha) and how big the difference between the two frames is (the value of alpha itself).
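The pre-warping procedure described above can be sketched in a few lines of NumPy. This is a simplified illustration and not the implementation from [1]: the warping function phi(t) = (1 + alpha * t / 2) * t, the linear-interpolation resampling, and the plain harmonic summation used as pitch salience are assumptions made here, and the function names `warp_frame` and `chirprate_pitch_plane` are hypothetical.

```python
import numpy as np

def warp_frame(x, fs, alpha):
    """Resample x so that a linear chirp with rate alpha becomes stationary.

    Assumed warping: phi(t) = (1 + alpha * t / 2) * t on a centred time axis.
    """
    n = len(x)
    t = (np.arange(n) - (n - 1) / 2) / fs
    if abs(alpha) * abs(t[0]) >= 1.0:
        raise ValueError("alpha too large for this frame length")
    phi_start = (1 + 0.5 * alpha * t[0]) * t[0]
    phi_end = (1 + 0.5 * alpha * t[-1]) * t[-1]
    # Uniform grid in warped time; its spacing is still 1 / fs.
    tau = np.linspace(phi_start, phi_end, n)
    if alpha == 0:
        t_src = tau
    else:
        # Inverse of phi: the original time instants to read samples from.
        t_src = (np.sqrt(1 + 2 * alpha * tau) - 1) / alpha
    return np.interp(t_src, t, x)

def chirprate_pitch_plane(x, fs, alphas, f0s, n_harm=5, nfft=4096):
    """Salience for every (alpha, f0) pair via simple harmonic summation."""
    win = np.hanning(len(x))
    freqs = np.fft.rfftfreq(nfft, d=1 / fs)
    plane = np.zeros((len(alphas), len(f0s)))
    for i, alpha in enumerate(alphas):
        mag = np.abs(np.fft.rfft(warp_frame(x, fs, alpha) * win, nfft))
        for k in range(1, n_harm + 1):
            # Spectral magnitude at the k-th harmonic of each f0 candidate.
            plane[i] += np.interp(k * f0s, freqs, mag)
    return plane

# Synthetic test frame: five harmonics, f0 = 200 Hz at the frame centre,
# relative chirp rate alpha = 4 per second (rising pitch).
fs, n = 8000, 1024
t = (np.arange(n) - (n - 1) / 2) / fs
x = sum(np.cos(2 * np.pi * k * 200.0 * (t + 0.5 * 4.0 * t**2))
        for k in range(1, 6))

alphas = np.linspace(-8, 8, 17)
f0s = np.arange(100.0, 401.0)
plane = chirprate_pitch_plane(x, fs, alphas, f0s)
i, j = np.unravel_index(np.argmax(plane), plane.shape)
alpha_hat, f0_hat = alphas[i], f0s[j]
print(alpha_hat, f0_hat)
```

For the matching alpha the warped frame is a stationary harmonic signal, so its spectral peaks are sharp and the summed salience is highest. For any other alpha the harmonics are smeared and the salience drops, which is what produces a single peak in the alpha-f0 plane for a single source.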
Below are two different chirp-rate/pitch decompositions (pitch salience planes, see [1]), each depicting one acoustic source in the scene.

As we can see, there is only one dominant F0 candidate and one chirp-rate candidate for our speaker (no ghost peaks, cross-terms, etc.). A further option would be to apply the ACF-CEP based pitch estimation mentioned in the previous post.
Related work:
[1] P. Cancela, E. López, M. Rocamora, “Fan Chirp Transform for Music Representation”, Proc. of DAFx 2010, Graz, Austria, September 6-10, 2010 (the chirp-rate/pitch plane is discussed in Section 5.2).
[2] M. Rocamora, P. Cancela, “Pitch tracking in polyphonic audio by clustering local fundamental frequency estimates”, 9th Brazilian AES Audio Engineering Congress, São Paulo, Brazil, May 17-19, 2011.
[3] L. Jure, E. López, M. Rocamora, P. Cancela, H. Spontón, I. Irigaray, “Pitch content visualization tools for music performance analysis”, Proc. of the 13th International Society for Music Information Retrieval Conference (ISMIR 2012), Porto, Portugal, pp. 493-498, 2012.
Software Tools:
[SW1] Matlab GUI Tool, incl. code from the Audio Processing Group (FING|EUM)
[SW2] Vamp plugin for Sonic Visualiser from the Audio Processing Group (FING|EUM)