Show Tag: auditory-localization

Individual IC neurons combine localization cues.

Barn owls use interaural level differences for vertical sound source localization: the feathers on their heads are asymmetric, leading to differences in the two ears' sensitivity to sounds from above and below.

Visual capture is weaker for stimuli in the periphery, where visual localization is less reliable relative to auditory localization, than at the center of the visual field.

Horizontal sound-source locations are population-coded in the central nucleus of the mustached bat's inferior colliculus.

Rucci et al. present an algorithm which performs auditory localization and combines auditory and visual localization in a common SC map. The mapping between the representations is learned using value-dependent learning.

Localization of audiovisual targets is usually determined more by the location of the visual sub-target than by that of the auditory sub-target.

Especially in situations where visual stimuli are seen clearly and thus localized very easily, this can lead to the so-called ventriloquism effect (aka `visual capture'), in which a sound source seems to be located at the position of the visual target although it is in fact a few degrees away from it.

Magosso et al. present a recurrent ANN model which replicates the ventriloquism effect and the ventriloquism aftereffect.

Female crickets have a system for orienting towards sounds (esp. mating calls) which is arguably based more on mechanics and acoustics than on neural computation.

Sound-source localization requires much more neural computation in vertebrates than in crickets.

fAES is not tonotopic. Instead, its neurons are responsive to spatial features of sounds. No spatial map had been found in fAES as of at least 2004.

Multiple cues are used in biological sound-source localization.

The difference in intensity between one ear and the other, the interaural level difference (ILD), is one cue used in biological sound-source localization.

The difference in arrival time (phase) between one ear and the other, the interaural time difference (ITD), is one cue used in biological sound-source localization.
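
As a rough illustration of these two cues (a hypothetical sketch, not taken from any of the models cited here), the ILD of a pair of digitized ear signals can be estimated as the ratio of their RMS amplitudes in decibels; an ITD estimate via cross-correlation is sketched further below:

    import numpy as np

    def interaural_level_difference(left, right):
        """Crude ILD estimate in decibels from two ear signals (NumPy arrays).

        Positive values mean the sound is more intense at the left ear.
        """
        rms_left = np.sqrt(np.mean(left ** 2))
        rms_right = np.sqrt(np.mean(right ** 2))
        return 20.0 * np.log10(rms_left / rms_right)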

In mammals, different neurons in the lateral superior olive (LSO) are tuned to different ILDs.

In mammals, different neurons in the medial superior olive (MSO) are tuned to different ITDs.

Subregions of the superior olivary complex (SOC) extract auditory localization cues.

The model of biological computation of ITDs proposed by Jeffress extracts ITDs by means of delay lines and coincidence detecting neurons:

The peaks of the sound pressure at each ear lead, via a semi-mechanical process, to peaks in the activity of certain auditory nerve fibers. Those fibers connect to coincidence-detecting neurons. Different delays in connections from the two ears lead to coincidence for different ITDs, thus making these coincidence-detecting neurons selective for different angles to the sound source.
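
The following toy sketch (my own illustration; the variable names, parameters, and the multiplicative coincidence rule are assumptions, not taken from Jeffress or the sources cited here) shows the idea: a bank of coincidence detectors receives the two ear signals through complementary delay lines, and the best-responding detector indicates the ITD.

    import numpy as np

    def jeffress_itd_estimate(left, right, max_delay, fs):
        """Toy Jeffress-style ITD estimator.

        `left` and `right` are firing-rate-like signals (NumPy arrays) sampled
        at `fs` Hz.  The coincidence detector with index d receives the left
        input delayed by d samples and the right input delayed by
        (max_delay - d) samples; its response is the summed product of the two
        delayed inputs.
        """
        delays = np.arange(max_delay + 1)
        responses = []
        for d in delays:
            l = np.roll(left, d)               # delay line from the left ear
            r = np.roll(right, max_delay - d)  # complementary delay, right ear
            responses.append(np.sum(l * r))    # coincidence detection
        best = int(delays[np.argmax(responses)])
        # Detector d fires maximally when the sound leads at the left ear by
        # (2*d - max_delay) samples; convert that lag to seconds.
        return (2 * best - max_delay) / fs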

ITD and ILD are most useful for auditory localization in different frequency ranges:

  • In the low frequency ranges, ITD is most informative for auditory localization.
  • In the high frequency ranges, ILD is most informative for auditory localization.

The granularity of the representations of ITDs and ILDs in MSO and LSO reflects the fact that ITD and ILD are most useful for auditory localization in different frequency ranges: ITDs for high frequencies are less densely represented in the MSO, and ILDs for low frequencies are less densely represented in the LSO.

Liu et al. model the LSO and MSO as well as the integrating inferior colliculus.

Their system can localize sounds with a spatial resolution of 30 degrees.

Liu et al.'s model of the IC includes a Jeffress-type model of the MSO.

Auditory localization is different from visual or haptic localization: stimulus location is not encoded in which neural receptors are stimulated, but in the differential temporal and intensity patterns of stimulation of the receptors in the two ears.

It's easier to separate a target sound from a blanket of background noise if target sound and background noise have different ITDs.

Interaural time and level differences do not help (much) in localizing sounds in the vertical plane. Spectral cues—changes in the frequency content of the sound due to differential reflection from various body parts—help us do that.

There seem to be significant differences in SOC organization between higher mammals and rodents.

Jeffress' model has been extremely successful, although neurophysiological evidence is scarce (because the MSO apparently is hard to study).

Jeffress' model predicts a spatial map of ITDs in the MSO. Recent evidence seems to suggest that this map indeed exists.

The way sound is shaped by the head and body before reaching the ears of a listener is described by a head-related transfer function (HRTF). There is a different HRTF for every angle of incidence.

A head-related transfer function summarizes ITD, ILD, and spectral cues for sound-source localization.

Sound-source localization based only on binaural cues (like ITD or ILD) suffers from an ambiguity due to the approximate point symmetry of the head: ITD and ILD identify only a `cone of confusion', i.e. a virtual cone whose tip is at the center of the head and whose axis is the interaural axis, not a single angle of incidence.

Spectral cues provide disambiguation: due to the asymmetry of the head, the sound is shaped differently depending on where on a cone of confusion a sound source is.

Talagala et al. measured the head-related transfer function (HRTF) of a dummy head and body in a semi-anechoic chamber and used this HRTF for sound-source localization experiments.

Talagala et al.'s system can reliably localize sounds in all directions around the dummy head.

Sound-source localization using head-related impulse response functions is precise, but computationally expensive.

Wan et al. use simple cross-correlation (which is computationally cheap, but not very precise) to localize sounds roughly. They then use the rough estimate to speed up MacDonald's cross-channel algorithm which uses head-related impulse response functions.

MacDonald proposes two methods for sound source localization based on head-related transfer functions (actually on the HRIR, their representation in the time domain).

For each microphone $i$ and every candidate angle $\theta$, the first method for SSL proposed by MacDonald applies the inverse of the HRIR $F^{(i,\theta)}$ to the signal recorded by microphone $i$. It then uses the Pearson correlation coefficient to compare the resulting signals. Only for the correct angle $\theta$ should the signals match.

The second method for (binaural) SSL proposed by MacDonald applies, for every candidate angle $\theta$, the HRIR $F^{(o,\theta)}$ of the respective opposite microphone $o$ to the signals recorded by the left and right microphones. It then uses the Pearson correlation coefficient to compare the resulting signals. Only for the correct angle $\theta$ should the signals match.
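
A minimal sketch of how this cross-channel comparison might be implemented (my own reading of the method; the per-angle HRIR dictionaries and the function name are assumptions, not MacDonald's code):

    import numpy as np

    def cross_channel_localize(left, right, hrir_left, hrir_right):
        """Pick the candidate angle whose opposite-ear HRIRs make the channels match.

        `hrir_left[theta]` and `hrir_right[theta]` map candidate angles to the
        left- and right-ear head-related impulse responses (NumPy arrays).
        The left recording is filtered with the right-ear HRIR and vice versa;
        the Pearson correlation of the two filtered signals should peak at the
        true angle of incidence.
        """
        best_angle, best_corr = None, -np.inf
        for theta in hrir_left:
            a = np.convolve(left, hrir_right[theta], mode="full")
            b = np.convolve(right, hrir_left[theta], mode="full")
            corr = np.corrcoef(a, b)[0, 1]     # Pearson correlation coefficient
            if corr > best_corr:
                best_angle, best_corr = theta, corr
        return best_angle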

The binaural sound-source localization methods proposed by MacDonald can be extended to larger arrays of microphones.

Cross-correlation can be used to estimate the ITD of a sound perceived in two ears.
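
For example (a minimal sketch under the usual far-field assumptions, not taken from any of the cited systems), the lag of the cross-correlation peak between the two ear signals gives an ITD estimate:

    import numpy as np

    def estimate_itd(left, right, fs):
        """Estimate the ITD (in seconds) of a sound from two ear signals.

        A positive value means the sound reaches the left ear first.
        """
        corr = np.correlate(right, left, mode="full")
        lag = int(np.argmax(corr)) - (len(left) - 1)  # peak lag in samples
        return lag / fs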

Rucci et al. claim a mean localization error of 1.54°±1.01° (± presumably denoting the standard error) for auditory localization of white-noise stimuli at directions within $[-60°, 60°]$ with their system.

The identity of peripheral auditory neurons responding to an auditory stimulus is not dependent on the location of that stimulus.

Instead, localization cues must be extracted from the temporal dynamics and spectral properties of binaural auditory signals.

This is in contrast with visual and somesthetic localization.

Acoustic localization cues change from far-field conditions (distance to stimulus $>1\,\mathrm{m}$) to near-field conditions ($\leq 1\,\mathrm{m}$).

There are fine-structure and envelope ITDs. Humans are sensitive to both, but do not weight envelope ITDs very strongly when localizing sound sources.

Recent neurophysiological evidence seems to contradict the details of Jeffress' model.

Two identical stimuli at different locations can be perceived as one stimulus which seems to be located between the actual sound sources.

ICx projects to intermediate and deep layers of SC.

The shift in the auditory map in ICx comes with changed projections from ICc to ICx.

There appears to be plasticity wrt. the auditory space map in the SC.

The nucleus of the brachium of the inferior colliculus (nBIC) projects to intermediate and deep layers of SC.

SC receives auditory localization-related inputs from the IC.

Task-irrelevant auditory cues have been found to enhance reaction times to visual targets. Visual cues, however, which cued visual localization, did not cue auditory localization.

The external nucleus of the inferior colliculus (ICx) of the barn owl represents a map of auditory space.

The map of auditory space in the external nucleus of the inferior colliculus (ICx) is calibrated by visual experience.

The optic tectum (OT) receives information on sound source localization from ICx.

Hyde and Knudsen found that there is a point-to-point projection from OT to IC.

A faithful model of the SC should probably adapt the mapping of auditory space in the SC and in another model representing ICx.

Mammals seem to have SC-IC connectivity analogous to that of the barn owl.

Individually, auditory cues are highly ambiguous with respect to auditory localization.

Combination across auditory cue types and channels (frequencies) is needed to turn the individual auditory cues into a meaningful localization.

Auditory localization within the so-called cone of confusion can be disambiguated using spectral cues: changes in the spectral shape of a sound due to how it reflects off and passes through features of an animal's body. Such changes can only be detected for known sounds.

Auditory sound source localization is made effective through the combination of different types of cues across frequency channels. It is thus most reliable for familiar broad-band sounds.

If visual cues were absolutely necessary for the formation of an auditory space map, then no auditory space map should develop without visual cues. Since an auditory space map develops also in blind(ed) animals, visual cues cannot be strictly necessary.

Many localized perceptual events are either only visual or only auditory. It is therefore not plausible that only audio-visual percepts contribute to the formation of an auditory space map.

Visual information plays a role, but does not seem to be necessary for the formation of an auditory space map.

The auditory space maps developed by animals without patterned visual experience seem to be degraded only in some species (in guinea pigs and barn owls, but not in ferrets or cats).

Self-organization may play a role in organizing auditory localization independent of visual input.

Visual input does seem to be necessary to ensure that the auditory and visual space maps are in spatial register.

Some cortical areas are involved in orienting towards auditory stimuli:

  • primary auditory cortex (A1)
  • posterior auditory field (PAF)
  • dorsal zone of auditory cortex (DZ)
  • auditory field of the anterior ectosylvian sulcus (fAES)

Only fAES has strong cortico-tectal projections.

The ventriloquism aftereffect occurs when an auditory stimulus is initially presented together with a visual stimulus with a certain spatial offset.

The auditory stimulus is typically localized by subjects at the same position as the visual stimulus, and this mis-localization persists even after the visual stimulus disappears.