7. Multiband PoPi Decomposition

Problem:
When the PoPi decomposition is applied to concurrent-speaker scenarios (the cocktail party effect) in order to track multiple speakers who move while speaking, the original formulation always shows only the more dominant speaker (i.e. the more dominant microperiodicities in the given signal frame). The other speaker (let’s call him the background speaker) is suppressed and hardly visible in the representation.

Solution:
A similar problem is addressed in Klapuri’s PhD work, which targets automatic transcription of simultaneous musical tones. Klapuri’s (and the most logical) way to enhance multiple pitch candidates is to give each of them a chance to be dominant in one of several frequency bands.

[Figure: band-wise PoPi]

We do the same: the “multiband” version of the PoPi plane is based on subband processing. It already gives good results with as few as 17 bands. This has been verified on several double-talk and triple-talk scenarios recorded in 3 different rooms with different reverberation times. However, for non-speech-like scenarios and for more speakers, 17 bands might be too few.
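To illustrate why subband processing helps, here is a minimal Python sketch. It is not the published M-PoPi algorithm: a plain normalized autocorrelation stands in for the PoPi periodicity measure, the position dimension is left out entirely, and only 2 bands are used instead of 17 to keep the example short. All function names, band edges and test signals are assumptions made for this illustration.

```python
import math

def bandpass_kernel(lo_hz, hi_hz, fs, taps=101):
    """Hamming-windowed-sinc band-pass FIR kernel (difference of two low-passes)."""
    mid = (taps - 1) / 2

    def lowpass(fc, n):
        t = n - mid
        w = 2.0 * fc / fs  # normalized cutoff (1.0 = Nyquist)
        return w if t == 0 else math.sin(math.pi * w * t) / (math.pi * t)

    return [
        (lowpass(hi_hz, n) - lowpass(lo_hz, n))
        * (0.54 - 0.46 * math.cos(2 * math.pi * n / (taps - 1)))
        for n in range(taps)
    ]

def fir_filter(x, h):
    """Same-length, centered convolution of x with kernel h (zero-padded edges)."""
    half = len(h) // 2
    out = []
    for n in range(len(x)):
        acc = 0.0
        for k, hk in enumerate(h):
            i = n + half - k
            if 0 <= i < len(x):
                acc += hk * x[i]
        out.append(acc)
    return out

def norm_autocorr(x, max_lag):
    """Autocorrelation for lags 0..max_lag, normalized so that lag 0 equals 1."""
    r0 = sum(v * v for v in x) or 1.0
    return [
        sum(x[n] * x[n + lag] for n in range(len(x) - lag)) / r0
        for lag in range(max_lag + 1)
    ]

def multiband_periodicity(x, fs, band_edges, max_lag):
    """Average the per-band normalized autocorrelations over all bands.

    Normalizing inside each band is the key step: a weak source that
    dominates at least one band contributes a full-height peak there.
    """
    acc = [0.0] * (max_lag + 1)
    for lo, hi in band_edges:
        sub = fir_filter(x, bandpass_kernel(lo, hi, fs))
        for lag, v in enumerate(norm_autocorr(sub, max_lag)):
            acc[lag] += v / len(band_edges)
    return acc

# Synthetic two-source mixture (assumed test signal, not real speech).
fs, n_samples = 8000, 1024

def harmonics(f0, ks, amp):
    return [
        amp * sum(math.sin(2 * math.pi * f0 * k * n / fs) for k in ks)
        for n in range(n_samples)
    ]

loud = harmonics(100.0, range(1, 6), 1.0)    # period 80 samples, 100-500 Hz
quiet = harmonics(160.0, range(8, 13), 0.2)  # period 50 samples, 1280-1920 Hz
mix = [a + b for a, b in zip(loud, quiet)]

full = norm_autocorr(mix, 100)
multi = multiband_periodicity(mix, fs, [(0, 800), (1000, 2200)], 100)
print(f"lag 50 (quiet source): full-band {full[50]:.2f}, multiband {multi[50]:.2f}")
print(f"lag 80 (loud source):  full-band {full[80]:.2f}, multiband {multi[80]:.2f}")
```

In the full-band result only the loud source's period (lag 80) stands out, while the band-wise version shows peaks of comparable height at both lags. With real speech the sources overlap in frequency, which is one reason a finer band split is needed than the two bands assumed here.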

[Figure: multi-speaker PoPi decomposition]

The image above shows the PoPi decomposition of 2 concurrent speakers. Their positions and the corresponding pitch values are easy to read off. The recording contains the voices of Tania and Lukas.

References:
[1] T. Habib, L. Ottowitz, and M. Kepési, “Experimental Evaluation of Multi-band Position-Pitch Estimation (M-PoPi) Algorithm for Multi-Speaker Localization,” INTERSPEECH 2008, Sept. 22-26, Brisbane, Australia.
[2] T. Habib, M. Kepési and L. Ottowitz, “Experimental Evaluation of the Joint Position-Pitch Estimation (PoPi) algorithm in Noisy Environments,” 5th IEEE Workshop on Sensor Array and Multi-Channel Signal Processing (SAM 2008), Jul. 21-23, Darmstadt, Germany.
[3] M. Kepési, L. Ottowitz and T. Habib, “Joint Position-Pitch Estimation for Multiple Speaker Scenarios,” IEEE Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2008), May 6-8, Trento, Italy.

Patents:
EP2162757B1 – Képesi, Wohlmayr, Kubin: JOINT POSITION-PITCH ESTIMATION OF ACOUSTIC SOURCES FOR THEIR TRACKING AND SEPARATION
US8107321B2 – Képesi, Wohlmayr, Kubin: JOINT POSITION-PITCH ESTIMATION OF ACOUSTIC SOURCES FOR THEIR TRACKING AND SEPARATION
