|Requirements||Programming, computer science|
Analyzing the similarity of voices is useful in many research fields, such as speaker diarization (dividing an audio recording into speaker-homogeneous regions), speech synthesis (system adaptation in human-computer dialogs), and voice casting (selecting an appropriate voice for media applications). Because subjective listening tests are prohibitively expensive, it is of great interest to detect the similarity of voice segments automatically.
In this master's thesis, a speech similarity detection tool for speaker clustering will be built on state-of-the-art speaker recognition techniques, e.g. i-vectors and deep neural networks. The student's tasks include a literature review, the implementation and adaptation of existing code, and the preparation of audio data for system training and testing. The tool's performance is to be compared with that of existing automatic speaker clustering algorithms.
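
To give a rough idea of what such a tool computes, the following minimal sketch assumes that each voice segment has already been reduced to a fixed-length speaker embedding (for example an i-vector or a neural network embedding). It scores two segments with cosine similarity and groups segments greedily. The function names, the threshold of 0.8, and the greedy grouping strategy are illustrative assumptions, not the method prescribed for the thesis.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def cluster_segments(embeddings, threshold=0.8):
    """Greedy speaker clustering (hypothetical helper for illustration).

    Each segment joins the most similar existing cluster if the similarity
    to that cluster's summed embedding reaches `threshold`; otherwise it
    opens a new cluster. Returns one cluster label per segment.
    """
    cluster_sums = []  # summed embeddings; cosine ignores their scale
    labels = []
    for emb in embeddings:
        best_idx, best_sim = None, threshold
        for idx, total in enumerate(cluster_sums):
            sim = cosine_similarity(emb, total)
            if sim >= best_sim:
                best_idx, best_sim = idx, sim
        if best_idx is None:
            cluster_sums.append(list(emb))
            labels.append(len(cluster_sums) - 1)
        else:
            cluster_sums[best_idx] = [
                t + e for t, e in zip(cluster_sums[best_idx], emb)
            ]
            labels.append(best_idx)
    return labels


# Toy two-dimensional "embeddings" standing in for real speaker vectors.
segments = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.95]]
labels = cluster_segments(segments)
```

In a real system the embeddings would come from a trained extractor, and the greedy pass would typically be replaced by an established clustering algorithm serving as the comparison baseline.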