Speaker Segmentation

Speaker segmentation aims to find the speaker change points in an audio stream. It is a prerequisite for audio indexing, speaker identification/verification/tracking, automatic transcription, and dialogue detection in movies. A popular approach is metric-based segmentation, which segments the input audio stream by evaluating a distance measure between the models fitted to neighboring portions of the stream.


Our Method

Our lab uses the Bayesian Information Criterion (BIC) for speaker segmentation.
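
To make the criterion concrete, here is a minimal sketch of the textbook ΔBIC test for a single change point. It is not the lab's implementation. It assumes each side of a candidate boundary is modeled by one full-covariance Gaussian over feature vectors (for example MFCC frames). The function names, the penalty weight of 1.0, and the 20-frame margin are illustrative choices, not values taken from this page.

```python
import numpy as np


def log_det_cov(frames):
    """Log-determinant of the maximum-likelihood covariance of the rows of `frames`."""
    cov = np.atleast_2d(np.cov(frames, rowvar=False, bias=True))
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(window, split, penalty_weight=1.0):
    """Delta-BIC for splitting `window` (n_frames x n_dims) at frame index `split`.

    Positive values favor the two-speaker (two-Gaussian) model over the
    single-Gaussian model.
    """
    n, d = window.shape
    left, right = window[:split], window[split:]
    # Log-likelihood gain of modeling the two halves separately.
    gain = 0.5 * (n * log_det_cov(window)
                  - len(left) * log_det_cov(left)
                  - len(right) * log_det_cov(right))
    # Extra free parameters of the second Gaussian: mean plus symmetric covariance.
    n_params = d + 0.5 * d * (d + 1)
    penalty = 0.5 * penalty_weight * n_params * np.log(n)
    return gain - penalty


def best_change_point(window, margin=20, penalty_weight=1.0):
    """Return (frame index, Delta-BIC) of the highest-scoring candidate split.

    `margin` (an assumed value) keeps enough frames on each side for a
    stable covariance estimate.
    """
    candidates = range(margin, len(window) - margin)
    scores = [delta_bic(window, t, penalty_weight) for t in candidates]
    best = int(np.argmax(scores))
    return candidates[best], scores[best]
```

A window would be declared to contain a speaker change when the best score is positive. In practice the penalty weight is usually tuned on development data.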

Two different systems have been developed.

  • The first is a multiple-pass method that uses a fusion scheme.
  • The second employs auxiliary second-order statistics and Hotelling's T² statistic.
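
For readers unfamiliar with the statistic used by the second system, the following is a small sketch of the standard two-sample Hotelling's T² computation. It only illustrates the statistic itself. How the lab's system combines it with the auxiliary second-order statistics is not described on this page, and the function name is hypothetical.

```python
import numpy as np


def hotelling_t2(left, right):
    """Two-sample Hotelling's T^2 between two sets of feature frames.

    `left` and `right` are (n_frames x n_dims) arrays. A shared covariance
    is assumed and estimated by pooling both sides.
    """
    n1, n2 = len(left), len(right)
    diff = left.mean(axis=0) - right.mean(axis=0)
    # Pooled unbiased covariance estimate from both segments.
    pooled = ((n1 - 1) * np.cov(left, rowvar=False)
              + (n2 - 1) * np.cov(right, rowvar=False)) / (n1 + n2 - 2)
    pooled = np.atleast_2d(pooled)
    # Mahalanobis-style distance between the two means, scaled by segment sizes.
    return (n1 * n2) / (n1 + n2) * float(diff @ np.linalg.solve(pooled, diff))
```

A large T² value indicates that the mean feature vectors on the two sides differ, which makes the boundary a plausible speaker change point. Because it needs only means and one pooled covariance, it is cheaper to evaluate than a full BIC comparison at every candidate frame.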

A third system is currently under development.


A demo file can be found here.


Relevant Publications

