Abstract: This paper presents a self-supervised method for visual detection of the active speaker in a multiperson spoken interaction scenario. Active speaker detection is a fundamental prerequisite ...