Reverberation and noise can degrade speech recognition.

Speech recognition models are usually trained on speech samples that are free of noise and reverberation.

Humans adapt to an auditory scene's reverberation and noise conditions. They use visual scene recognition to recall reverberation and noise conditions of familiar environments.

A system that stores speech recognition models trained for different environments, and retrieves the matching model using visual scene recognition, improved speech recognition in reverberant and noisy environments.
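
The store-and-retrieve idea can be illustrated with a minimal sketch. It is not the system's actual implementation: the names `ModelStore`, `recognize`, `classify_scene_stub`, the scene labels, and the fallback to a clean-speech model for unfamiliar scenes are all assumptions made for illustration, and the recognizers are placeholder functions rather than trained models.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical interfaces: a recognizer maps audio to a transcript,
# and a scene classifier maps an image to an environment label.
Recognizer = Callable[[bytes], str]
SceneClassifier = Callable[[bytes], str]


@dataclass
class ModelStore:
    """Holds one speech recognition model per known environment."""
    models: Dict[str, Recognizer] = field(default_factory=dict)
    fallback_label: str = "clean"  # assumed default for unfamiliar scenes

    def register(self, scene_label: str, recognizer: Recognizer) -> None:
        self.models[scene_label] = recognizer

    def select(self, scene_label: str) -> Recognizer:
        # Retrieve the model trained for this environment if one exists,
        # otherwise fall back to the clean-speech model.
        if scene_label in self.models:
            return self.models[scene_label]
        return self.models[self.fallback_label]


def recognize(audio: bytes, image: bytes,
              classify_scene: SceneClassifier, store: ModelStore) -> str:
    """Pick a model from the visual scene, then transcribe the audio."""
    scene_label = classify_scene(image)
    return store.select(scene_label)(audio)


# Placeholder recognizers and classifier, standing in for trained models.
def make_stub(name: str) -> Recognizer:
    return lambda audio: f"<transcript from {name} model>"


def classify_scene_stub(image: bytes) -> str:
    return "lecture_hall" if image == b"hall" else "unknown"


store = ModelStore()
store.register("clean", make_stub("clean"))
store.register("lecture_hall", make_stub("lecture_hall"))

print(recognize(b"audio", b"hall", classify_scene_stub, store))
print(recognize(b"audio", b"street", classify_scene_stub, store))
```

In this sketch the visual classifier runs before any audio is processed, so the choice of model depends only on the recognized scene, mirroring how a familiar environment is recalled from sight.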