
Rucci et al. present a robotic system based on their neural model of audiovisual localization.

There are a number of approaches to audio-visual localization: some are implemented on actual robots, others exist only as theoretical ANN or algorithmic models.

Adams et al. use SOM-like algorithms to model biological sensori-motor control and develop robotic sensori-motor controllers.

Adams et al. note that others have used SOM-like algorithms for modelling biology and for robotic applications before, and they list examples.

Dávila-Chacón et al. show that the Liu et al. model of natural binaural sound source localization can be transferred to the Nao robot, where it shows significant resilience to noise.

Their system can localize sounds with a spatial resolution of 15 degrees.

The binaural sound source localization system based on the Liu et al. model does not, on its own, perform satisfactorily on the iCub because the robot's ego noise is greater than that of the Nao (~60 dB compared to ~40 dB).

Dávila-Chacón et al. compare different methods for sound source localization on the iCub.

Among the methods compared by Dávila-Chacón et al. for sound source localization on the iCub are

  • the Liu et al. system,
  • the Liu et al. system with additional classifiers, and
  • cross-correlation (illustrated by the sketch below).
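
As a rough illustration of the cross-correlation baseline (a generic sketch, not Dávila-Chacón et al.'s implementation), the following estimates the time difference of arrival between the two microphone channels from the peak of their cross-correlation and converts it to an azimuth; the microphone spacing and sampling rate are assumed values.

```python
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s
MIC_DISTANCE   = 0.1     # m, assumed spacing between the two microphones
SAMPLE_RATE    = 48000   # Hz, assumed

def tdoa_azimuth(left, right):
    """Estimate source azimuth (degrees) from two mono signals via cross-correlation."""
    # Full cross-correlation of the two channels; the index of its maximum
    # gives the lag (in samples) at which the signals align best.
    corr = np.correlate(left, right, mode="full")
    lag = np.argmax(corr) - (len(right) - 1)
    # Convert the lag to a time difference of arrival ...
    tdoa = lag / SAMPLE_RATE
    # ... and the TDOA to an angle, clipping to the valid range of arcsin.
    ratio = np.clip(tdoa * SPEED_OF_SOUND / MIC_DISTANCE, -1.0, 1.0)
    return np.degrees(np.arcsin(ratio))
```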

Dávila-Chacón evaluated SOMs as a clustering layer on top of the MSO and LSO modules of the Liu et al. sound source localization system. On top of the clustering layer, they tried out a number of neural and statistical classification layers.

The resulting performance was inferior by a clear margin to the best methods they evaluated.

The Kalman filter is a good method for many (robotic) multisensory integration problems in dynamic domains.
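
As a minimal sketch (not tied to any of the cited systems), a one-dimensional constant-position Kalman filter can fuse noisy auditory and visual bearing measurements over time; the process and sensor noise variances below are assumed values.

```python
def kalman_fuse(measurements, process_var=0.5, sensor_vars=(9.0, 1.0)):
    """measurements: iterable of (auditory_bearing, visual_bearing) pairs in degrees."""
    x, p = 0.0, 1000.0          # state estimate and its variance (vague prior)
    for z_pair in measurements:
        p += process_var        # predict: the target may have moved a little
        for z, r in zip(z_pair, sensor_vars):
            k = p / (p + r)     # Kalman gain: trust the less noisy sensor more
            x += k * (z - x)    # correct the estimate with this sensor's reading
            p *= (1 - k)        # the estimate's variance shrinks after each update
    return x, p
```

With these assumed variances the visual measurement (variance 1.0) dominates the fused estimate, mirroring the usual dominance of the more reliable cue.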

At the most general level, multisensory integration (or multisensor data fusion) in application contexts is best described in terms of Bayesian theory, its specializations, and approximations to it.
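
For the common special case of two conditionally independent Gaussian cue estimates, say an auditory estimate $\hat{x}_A$ with variance $\sigma_A^2$ and a visual estimate $\hat{x}_V$ with variance $\sigma_V^2$, the Bayesian (maximum-likelihood) fusion rule reduces to the familiar reliability-weighted average:

$$\hat{x}_{AV} = \frac{\sigma_A^{-2}\,\hat{x}_A + \sigma_V^{-2}\,\hat{x}_V}{\sigma_A^{-2} + \sigma_V^{-2}}, \qquad \sigma_{AV}^{-2} = \sigma_A^{-2} + \sigma_V^{-2}.$$

The more reliable cue dominates, and the fused estimate is never less reliable than either cue alone.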

Rucci et al.'s model of natural multisensory integration and localization is based on the leaky integrate-and-fire neuron model.
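
A leaky integrate-and-fire unit can be summarized in a few lines; this is a generic textbook discretization with assumed parameters, not Rucci et al.'s exact formulation.

```python
import numpy as np

def lif_trace(input_current, dt=1e-3, tau=0.02, v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Simulate a leaky integrate-and-fire neuron; return membrane trace and spike times."""
    v, spikes, vs = v_rest, [], []
    for t, i_in in enumerate(input_current):
        # Leaky integration: the potential decays toward rest and is driven by the input.
        v += dt / tau * (-(v - v_rest) + i_in)
        if v >= v_thresh:          # a threshold crossing emits a spike ...
            spikes.append(t * dt)
            v = v_reset            # ... and the potential is reset
        vs.append(v)
    return np.array(vs), spikes
```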

Rucci et al. explain audio-visual map registration and the learning of orienting responses to audio-visual stimuli by what they call value-dependent learning: after each motor response, a modulatory system evaluates whether that response was good (it brought the target into the center of the system's visual field) or bad. The learning rule strengthens connections between neurons from the different neural subpopulations of the network if their activities are highly correlated while the modulatory response is strong, and weakens them otherwise.
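
Schematically (this is an assumed, simplified form of such a rule, not the exact equation from the paper), a value-dependent Hebbian update can be written as

$$\Delta w_{ij} = \eta \, m \, \bigl(\langle a_i a_j \rangle - \theta\bigr),$$

where $\eta$ is a learning rate, $m$ is the strength of the modulatory (value) signal, $\langle a_i a_j \rangle$ is the correlation of the pre- and postsynaptic activities, and $\theta$ is a threshold below which connections are weakened rather than strengthened.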

Rucci et al. claim a mean localization error of 1.54°±1.01° (the ± presumably denoting the standard error) for auditory localization of white-noise stimuli from directions in the range $[-60°, 60°]$.

In Casey et al.'s system, the interaural level difference (ILD) alone is used for sound source localization (SSL).

In Casey et al.'s experiments, the two microphones are one meter apart and the stimulus is one meter away from the midpoint between the two microphones. There is no damping body between the microphones, but at that interaural distance and distance to the stimulus, ILD should still be a good localization cue.
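
As a generic illustration of ILD as the sole cue (not Casey et al.'s code), the level difference can be computed from the RMS energy of the two channels and mapped to an azimuth via a previously calibrated lookup table; the calibration data here is an assumed input.

```python
import numpy as np

def interaural_level_difference(left, right, eps=1e-12):
    """Return the interaural level difference (ILD) in dB between two mono signals."""
    rms_l = np.sqrt(np.mean(np.square(left)) + eps)
    rms_r = np.sqrt(np.mean(np.square(right)) + eps)
    return 20.0 * np.log10(rms_l / rms_r)   # positive: source is louder on the left

def ild_to_azimuth(ild_db, calib_ilds, calib_angles):
    """Map an ILD to an azimuth by nearest neighbour in a calibration table (assumed data)."""
    idx = int(np.argmin(np.abs(np.asarray(calib_ilds) - ild_db)))
    return calib_angles[idx]
```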

Sanchez-Riera et al. use a probabilistic model for audio-visual active speaker localization on a humanoid robot (the Nao robot).

Sanchez-Riera et al. use the Bayesian information criterion to choose the number of speakers in their audio-visual active speaker localization system.
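
The Bayesian information criterion trades off model fit against model complexity. As a generic illustration (not Sanchez-Riera et al.'s actual model), the number of speakers can be chosen as the number of Gaussian mixture components that minimizes the BIC over localization observations; the feature representation is an assumption.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def choose_num_speakers(observations, max_speakers=5):
    """Pick the number of mixture components (speakers) with the lowest BIC.

    observations: array of shape (n_samples, n_features), e.g. audio-visual
    localization features (assumed).
    """
    best_k, best_bic = 1, np.inf
    for k in range(1, max_speakers + 1):
        gmm = GaussianMixture(n_components=k, random_state=0).fit(observations)
        bic = gmm.bic(observations)     # penalizes extra components via log(n) * n_params
        if bic < best_bic:
            best_k, best_bic = k, bic
    return best_k
```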

Sanchez-Riera et al. use the Waldboost face detection system for visual processing.

Yan et al. present a system which uses auditory and visual information to learn an audio-motor map (in a functional sense) and orient a robot towards a speaker. Learning is online.
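
A minimal sketch of online audio-motor map learning in that spirit (the cue, map form, and learning rate are assumptions, not Yan et al.'s design): whenever vision confirms where the speaker actually is, the entry of a lookup table mapping the auditory cue to a motor angle is nudged toward the visually verified angle.

```python
import numpy as np

class OnlineAudioMotorMap:
    """Discretized audio-motor map updated online from visually verified angles."""

    def __init__(self, cue_min=-1.0, cue_max=1.0, n_bins=37, learning_rate=0.2):
        self.edges = np.linspace(cue_min, cue_max, n_bins + 1)
        self.angles = np.zeros(n_bins)        # motor angle stored per cue bin
        self.lr = learning_rate

    def _bin(self, cue):
        return int(np.clip(np.digitize(cue, self.edges) - 1, 0, len(self.angles) - 1))

    def predict(self, cue):
        """Motor angle the map currently associates with an auditory cue."""
        return self.angles[self._bin(cue)]

    def update(self, cue, visual_angle):
        """Online correction: move the stored angle toward the visually observed one."""
        b = self._bin(cue)
        self.angles[b] += self.lr * (visual_angle - self.angles[b])
```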

Yan et al. do not evaluate the accuracy of audio-visual localization.

Yan et al. report an accuracy of auditory localization of $3.4^\circ$ for online learning and $0.9^\circ$ for offline calibration.

Yan et al. perform sound source localization using both the interaural time difference (ITD) and ILD. Some of their auditory processing is bio-inspired.

Antonelli et al. use Bayesian and Monte Carlo methods to integrate optic flow and proprioceptive cues to estimate distances between a robot and objects in its visual field.
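
A schematic Monte Carlo fusion of two such cues (assumed Gaussian likelihoods; this is not Antonelli et al.'s model): distance hypotheses are sampled from a flat prior, weighted by how well they explain the optic-flow and proprioceptive measurements, and averaged into a posterior-mean estimate.

```python
import numpy as np

def fuse_distance(flow_distance, proprio_distance, flow_sigma=0.3, proprio_sigma=0.1,
                  n_particles=5000, prior_range=(0.1, 3.0), seed=0):
    """Monte Carlo fusion of two noisy distance cues into a single estimate (metres)."""
    rng = np.random.default_rng(seed)
    # Sample distance hypotheses from a flat prior over plausible distances.
    particles = rng.uniform(*prior_range, size=n_particles)
    # Weight each hypothesis by the likelihood of both cue measurements.
    w = (np.exp(-0.5 * ((particles - flow_distance) / flow_sigma) ** 2)
         * np.exp(-0.5 * ((particles - proprio_distance) / proprio_sigma) ** 2))
    w /= w.sum()
    return float(np.sum(w * particles))      # posterior mean distance
```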