Robust sound localization using multi-source audiovisual information fusion. Information Fusion, Vol. 2, No. 3 (September 2001), pp. 209-223, by Parham Aarabi, Safwat Zaky
abstract = {This paper illustrates the synergic advantages of a multi-modal sound localization system utilizing two cameras and a 3-element microphone array. The two cameras were used as part of a stereo feature-detection based visual object localization system, while the microphones were combined to produce a sound localization system incorporating a temporal power fusion ({TPF}) algorithm. The cameras and microphones were integrated using spatial likelihood functions ({SLFs}), which greatly simplifies the integration process. Test results show a significant improvement in the integrated vision and sound localization ({IVSL}) system's ability over that of the stand-alone microphone-array based sound localization system to accurately localize sound sources in low signal to noise situations. The {IVSL} system maintained an average error of 15 cm at signal-to-noise ratios as low as 0.5 {dB}.},
