Show Tag: ventriloquism-effect


Visual capture is weaker for stimuli in the periphery, where visual localization is less reliable relative to auditory localization than it is at the center of the visual field.

Two stimuli in different modalities are perceived as one multi-sensory stimulus if the position in space and the point in time at which they are presented are not too far apart.

Localization of audiovisual targets is usually determined more by the location of the visual sub-target than by that of the auditory sub-target.

Especially in situations where visual stimuli are seen clearly and thus localized very easily, this can lead to the so-called ventriloquism effect (aka 'visual capture'), in which a sound source seems to be localized at the location of the visual target although it is in fact a few degrees away from it.

Magosso et al. present a recurrent ANN model which replicates the ventriloquism effect and the ventriloquism aftereffect.

An auditory and a visual stimulus, separated in time, may be perceived as one audio-visual stimulus, seemingly occurring at the same point in time.

If an auditory and a visual stimulus are close together, spatially, then they are more likely perceived as one cross-modal stimulus than if they are far apart—even if they are separated temporally.

In a sensorimotor synchronization task, Aschersleben and Bertelson found that an auditory distractor biased the temporal perception of a visual target stimulus more strongly than the other way around.

Battaglia et al. studied the spatial ventriloquism effect and found that, in their experiment, subjects neither followed an MLE model exactly nor had their localization completely captured by vision.

Jack and Thurlow found that the degree to which a puppet resembled an actual speaker (whether it had eyes and a nose, whether it had a lower jaw moving with the speech etc.) and whether the lips of an actual speaker moved in synch with heard speech influenced the strength of the ventriloquism effect.

In one of their experiments, Warren et al. had their subjects localize visual or auditory components of visual-auditory stimuli (videos of people speaking and the corresponding sound). Stimuli were made 'compelling' by playing video and audio in sync and 'uncompelling' by introducing a temporal offset.

They found that their subjects performed as under a 'unity assumption' when they were told they would perceive cross-sensory stimuli and the stimuli were 'compelling', and under a low 'unity assumption' when they were told there could be separate auditory or visual stimuli and/or the stimuli were made 'uncompelling'.

Bertelson et al. did not find a shift of sound source localization due to manipulated endogenous visual spatial attention—localization was shifted only due to (the salience of) light flashes which would induce (automatic, mandatory) exogenous attention.

With increasing distance between stimuli in different modalities, the likelihood of perceiving them as originating from one location decreases.

With increasing distance between stimuli in different modalities, the likelihood of perceiving them as one cross-modal stimulus decreases.

In other words, the unity assumption depends on the distance between stimuli.
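This distance dependence can be illustrated with a toy model in which the probability of binding two stimuli into one percept falls off smoothly with their spatial separation. The Gaussian shape and the `window_deg` parameter are assumptions for illustration, not values from any of the studies cited here.

```python
import math

def p_unity(separation_deg, window_deg=10.0):
    """Toy model: probability of perceiving two cross-modal stimuli
    as one stimulus, decreasing with their spatial separation.
    The Gaussian falloff and 10-degree window are hypothetical."""
    return math.exp(-(separation_deg / window_deg) ** 2)
```

Under this sketch, perfectly co-located stimuli are always bound, while widely separated ones almost never are; the window width would have to be fit to data.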

In an audio-visual localization task, Wallace et al. found that their subjects' localizations of the auditory stimulus were usually biased towards the visual stimulus whenever the two stimuli were perceived as one, and vice versa.

Details of instructions and quality of stimuli can influence the strength of the spatial ventriloquism effect.

Alais and Burr found in an audio-visual localization experiment that the ventriloquism effect can be interpreted by a simple cue weighting model of human multi-sensory integration:

Their subjects weighted visual and auditory cues depending on their reliability. The weights they used were consistent with MLE. In most situations, visual cues are much more reliable for localization than are auditory cues. Therefore, a visual cue is given so much greater weight that it captures the auditory cue.
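The reliability-based weighting described above can be sketched as inverse-variance (MLE) cue combination. The numeric values below (a precise visual cue, an imprecise auditory cue a few degrees away) are made up for illustration; they are not from Alais and Burr's data.

```python
def mle_combine(x_v, sigma_v, x_a, sigma_a):
    """Combine visual and auditory location estimates by
    inverse-variance (maximum-likelihood) weighting."""
    w_v = (1 / sigma_v**2) / (1 / sigma_v**2 + 1 / sigma_a**2)
    w_a = 1 - w_v
    x_hat = w_v * x_v + w_a * x_a
    # the combined estimate is more reliable than either cue alone
    sigma_hat = (sigma_v**2 * sigma_a**2 / (sigma_v**2 + sigma_a**2)) ** 0.5
    return x_hat, sigma_hat

# Illustrative values: visual target at 0 deg (sigma 1 deg),
# auditory target at 5 deg (sigma 5 deg).
x_hat, sigma_hat = mle_combine(x_v=0.0, sigma_v=1.0, x_a=5.0, sigma_a=5.0)
```

With these assumed reliabilities the visual weight is about 0.96, so the combined percept lands well under a degree from the visual target: the sound is 'captured' by vision.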

An experiment by Burr et al. showed auditory dominance in a temporal bisection task (studying the temporal ventriloquism effect). The results were qualitatively but not quantitatively predicted by an optimal-integration model.

There are two possibilities explaining the latter result:

  • audio-visual integration is not optimal in this case, or
  • the model is incorrect. Specifically, the assumption of Gaussian noise in timing estimation may not reflect actual noise.
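The same inverse-variance scheme predicts auditory dominance in the temporal task whenever auditory timing noise is assumed much smaller than visual timing noise; the 20 ms and 60 ms values below are hypothetical placeholders, not Burr et al.'s measurements. Note also that this linear weighting is the ML estimator only under the Gaussian-noise assumption questioned above.

```python
def inverse_variance_weights(sigma_a, sigma_v):
    """Weight each cue by its reliability (1/variance), normalized.
    Optimal (ML) only if both noise sources are Gaussian."""
    w_a = (1 / sigma_a**2) / (1 / sigma_a**2 + 1 / sigma_v**2)
    return w_a, 1 - w_a

# Hypothetical timing noise: audition ~20 ms, vision ~60 ms.
w_a, w_v = inverse_variance_weights(sigma_a=20.0, sigma_v=60.0)
```

With these assumed values the auditory weight is 0.9, i.e. near-complete auditory dominance, matching the qualitative pattern; the quantitative fit would depend on the actual noise distributions.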

The ventriloquism aftereffect occurs when an auditory stimulus is initially presented together with a visual stimulus with a certain spatial offset.

The auditory stimulus is typically localized by subjects at the same position as the visual stimulus, and this mis-localization persists even after the visual stimulus disappears.
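The aftereffect can be caricatured as incremental recalibration: on each exposure trial the auditory spatial map is nudged toward the visually displaced apparent source, and the accumulated shift then persists without the visual stimulus. The learning rate, trial count, and 8-degree offset below are hypothetical, and this is a toy sketch, not Magosso et al.'s recurrent network model.

```python
def expose(shift, offset_deg, rate=0.2, trials=30):
    """Toy exposure phase: the auditory map shift is nudged toward
    the audio-visual offset on each trial (hypothetical parameters)."""
    for _ in range(trials):
        shift += rate * (offset_deg - shift)
    return shift

# After exposure to an 8-degree audio-visual offset, a sound at
# s degrees would be localized near s + shift, with no visual
# stimulus present: the aftereffect.
shift = expose(shift=0.0, offset_deg=8.0)
```

In this sketch the shift converges toward the full trained offset; in real data the aftereffect is typically only a fraction of it and decays over time.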