Yan et al. explicitly do not integrate auditory and visual localization. Given multiple visual and an auditory localization, they associate the auditory localization with that visual localization which is closest, using the visual localization as the localization of the audio-visual object.
In determining the position of the audio-visual object, Yan et al. handle the possibility that the actual source of the stimulus has only been heard, not seen. They decide whether that is the case by estimating the probability that the auditory localization belongs to any of the detected visual targets and comparing to the baseline probability that the auditory target has not been detected, visually.⇒