Deneve et al. propose a recurrent network that fits a template to (Poisson-)noisy input activity, thereby implementing an estimator of the original input. The authors show analytically and in simulations that the network approximates a maximum likelihood estimator. The network's dynamics are governed by divisive normalization, and the input tuning curves are hard-wired.
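
As a rough illustration (not their exact equations), the following sketch iterates lateral pooling, a squaring nonlinearity, and divisive normalization on a Poisson-corrupted hill of activity until a smooth template remains. The tuning curves, lateral weights, and constants are made up.

```python
import numpy as np

# Sketch of a Deneve-style recurrent clean-up of a noisy population code.
# Only the overall scheme (pooling, squaring, divisive normalization) follows
# the idea described above; all numbers are invented for illustration.
rng = np.random.default_rng(0)

n = 64
prefs = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)   # preferred directions
stim = np.pi                                               # true stimulus
rates = 20.0 * np.exp(3.0 * (np.cos(prefs - stim) - 1.0))  # hard-wired tuning curves
r = rng.poisson(rates).astype(float)                       # Poisson-noisy input activity

W = np.exp(2.0 * (np.cos(prefs[:, None] - prefs[None, :]) - 1.0))  # lateral weights

a = r.copy()
for _ in range(30):
    u = W @ a                                   # lateral pooling
    a = u ** 2 / (1.0 + 0.01 * np.sum(u ** 2))  # squaring + divisive normalization

est = np.angle(np.sum(a * np.exp(1j * prefs))) % (2.0 * np.pi)  # peak of the clean hill
print(f"true stimulus: {stim:.3f} rad, network estimate: {est:.3f} rad")
```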

Deneve et al. use basis function networks with multidimensional attractors for

  • function approximation
  • cue integration.

They reduce both to maximum likelihood estimation and show that their network performs close to a maximum likelihood estimator.

Lee and Mumford state that their dynamic, recurrent Bayesian model of the visual pathway is, in its simple form, prone to running into local maxima (states in which small changes in belief in any of the processing stages decrease the joint probability, although greater changes would increase it).

They propose particle filtering as a solution, describing it as maintaining a number of concurrent high-likelihood hypotheses instead of committing to the single maximum-likelihood one.

Is MLE just a particle filter with a single particle?
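
For concreteness, here is a generic bootstrap particle filter for a one-dimensional latent state. It is only meant to illustrate "maintaining several concurrent high-likelihood hypotheses", not Lee and Mumford's hierarchical scheme; the dynamics, noise levels, and observations are made up. Setting n_particles to 1 collapses it to tracking a single running hypothesis, which is the spirit of the question above.

```python
import numpy as np

# Generic bootstrap particle filter for a 1-D latent state (illustration only).
rng = np.random.default_rng(1)

n_particles = 200
particles = rng.normal(0.0, 1.0, n_particles)        # initial hypotheses
weights = np.full(n_particles, 1.0 / n_particles)

def step(particles, weights, observation, process_sd=0.1, obs_sd=0.5):
    # 1. propagate each hypothesis through the assumed dynamics (random walk)
    particles = particles + rng.normal(0.0, process_sd, particles.size)
    # 2. reweight each hypothesis by the likelihood of the new observation
    weights = weights * np.exp(-0.5 * ((observation - particles) / obs_sd) ** 2)
    weights = weights / weights.sum()
    # 3. resample to concentrate particles on high-likelihood hypotheses
    idx = rng.choice(particles.size, size=particles.size, p=weights)
    return particles[idx], np.full(particles.size, 1.0 / particles.size)

for obs in [0.2, 0.4, 0.5, 0.7]:                      # made-up observations
    particles, weights = step(particles, weights, obs)

print("posterior mean estimate:", particles.mean())
```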

MLE has been a successful model in many sensory cue integration tasks.

Akaike's information criterion is strongly linked to information theory and the maximum likelihood principle.
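
For reference, the standard formula (general knowledge, not taken from a specific paper in these notes) is AIC = 2k - 2 ln L̂, where k is the number of free parameters and L̂ is the maximized likelihood. Akaike derived it as an estimate of the expected Kullback-Leibler divergence between the fitted model and the true data-generating distribution, which is where the link to information theory comes from.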

When the errors of multiple estimates of a world property, each based on a different cue, are independent across cues and Gaussian, the ideal observer model reduces to a simple weighting strategy.
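
Concretely (a standard result, assuming the cue variances are known), the maximum-likelihood combination of two unbiased single-cue estimates ŝ₁ and ŝ₂ with variances σ₁² and σ₂² is the inverse-variance weighted average

  ŝ = w₁ŝ₁ + w₂ŝ₂,   wᵢ = (1/σᵢ²) / (1/σ₁² + 1/σ₂²),

and its variance, 1 / (1/σ₁² + 1/σ₂²), is never larger than that of the better single cue.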

MLE has been a successful model in many, but not all cue integration tasks studied.

One model which might go beyond MLE in modeling cue combination is 'causal inference'.

If it is not given that an auditory and a visual stimulus belong together, then integrating them (binding) unconditionally is not a good idea. In that case, causal inference and model selection are better.

The a priori belief that there is one stimulus (the 'unity assumption') can then be seen as a prior for one model: the one that assumes a single, cross-modal stimulus.
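
A minimal sketch of such a model for a single audio-visual trial, in the spirit of Bayesian causal-inference accounts (e.g. Kording et al., 2007): the posterior probability of a common cause is obtained by comparing the evidence for one shared source against the evidence for two independent sources, with the prior probability of a common cause playing the role of the unity assumption. All noise levels and priors below are invented for illustration.

```python
import numpy as np

def posterior_common_cause(x_v, x_a, sigma_v=1.0, sigma_a=5.0,
                           sigma_prior=10.0, p_common=0.5):
    """Posterior probability that visual and auditory samples x_v, x_a come
    from a single source, assuming Gaussian likelihoods and a zero-mean
    Gaussian prior over source location (illustrative parameters)."""
    # Evidence for a common cause: integrate out the shared source location
    # (closed form for Gaussians).
    var_sum = (sigma_v**2 * sigma_a**2 + sigma_v**2 * sigma_prior**2
               + sigma_a**2 * sigma_prior**2)
    num = ((x_v - x_a)**2 * sigma_prior**2
           + x_v**2 * sigma_a**2 + x_a**2 * sigma_v**2)
    evidence_one = np.exp(-0.5 * num / var_sum) / (2 * np.pi * np.sqrt(var_sum))

    # Evidence for independent causes: each sample has its own source.
    def marginal(x, sigma):
        v = sigma**2 + sigma_prior**2
        return np.exp(-0.5 * x**2 / v) / np.sqrt(2 * np.pi * v)
    evidence_two = marginal(x_v, sigma_v) * marginal(x_a, sigma_a)

    # Bayes' rule over the two causal structures.
    return (p_common * evidence_one
            / (p_common * evidence_one + (1 - p_common) * evidence_two))

print(posterior_common_cause(0.0, 2.0))   # nearby samples -> probably one source
print(posterior_common_cause(0.0, 20.0))  # far apart -> probably two sources
```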

Sato et al. modeled multisensory integration with adaptation purely computationally. In their model, two location estimates (one from each modality) were either bound or not bound, and localization followed a maximum a posteriori decision rule.

The unity assumption can be seen either as a prior (if understood as an expectation of a forthcoming uni- or cross-sensory stimulus) or as a mediator variable in a Bayesian inference model of multisensory integration.

Alais and Burr found in an audio-visual localization experiment that the ventriloquism effect can be explained by a simple cue-weighting model of human multisensory integration:

Their subjects weighted visual and auditory cues depending on their reliability. The weights they used were consistent with MLE. In most situations, visual cues are much more reliable for localization than are auditory cues. Therefore, a visual cue is given so much greater weight that it captures the auditory cue.
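
A quick illustrative calculation with the inverse-variance weights above (the numbers are made up, not Alais and Burr's): with visual noise σ_V = 1° and auditory noise σ_A = 10°, the visual weight is w_V = (1/1²) / (1/1² + 1/10²) ≈ 0.99, so the combined location estimate lies almost exactly on the visual stimulus, which is the capture described above. Degrading the visual stimulus until σ_V exceeds σ_A should reverse the weighting, which is what makes the cue-weighting account testable.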

Human performance in combining slant and disparity cues for slant estimation can be explained by (optimal) maximum-likelihood estimation.

According to Landy et al., humans often combine cues (intra- or cross-sensory) optimally, consistent with MLE.

MLE provides an optimal method of reading population codes.

It's hard to implement MLE on population codes using neural networks.

Depending on the application, tuning curves, and noise properties, threshold-linear networks calculating population vectors can perform similarly to MLE.
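
A small sketch of the two read-outs on the same simulated data (tuning curves, noise model, and all numbers are hypothetical, and the recurrent network itself is not simulated here): a direct population-vector read-out versus a maximum-likelihood read-out under independent Poisson noise.

```python
import numpy as np

# Compare population-vector and maximum-likelihood read-outs of one noisy
# population response (illustrative tuning curves and noise).
rng = np.random.default_rng(2)

n = 64
prefs = np.linspace(-np.pi, np.pi, n, endpoint=False)

def tuning(s):
    # Mean firing rates for stimulus s (bell-shaped, with a small baseline).
    return 30.0 * np.exp(2.0 * (np.cos(prefs - s) - 1.0)) + 1.0

s_true = 0.3
r = rng.poisson(tuning(s_true))                      # one noisy population response

# Population-vector read-out: activity-weighted circular mean of preferences.
pv = np.angle(np.sum(r * np.exp(1j * prefs)))

# Maximum-likelihood read-out: choose the stimulus whose mean rates make the
# observed Poisson counts most probable (grid search over candidate stimuli).
grid = np.linspace(-np.pi, np.pi, 1000)
loglik = [np.sum(r * np.log(tuning(s)) - tuning(s)) for s in grid]
ml = grid[int(np.argmax(loglik))]

print(f"true {s_true:.3f}  population vector {pv:.3f}  maximum likelihood {ml:.3f}")
```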

Seung and Sompolinsky introduce maximum likelihood estimation (MLE) as one possible mechanism for neural read-out. However, they state that it is not clear whether MLE can be implemented in a biologically plausible way.

Seung and Sompolinsky show that, in a population code with wide tuning curves and Poisson noise, and under the conditions described in their paper, the responses of neurons near threshold carry exceptionally high information.
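
The standard way to make this quantitative (a general result for independent Poisson neurons, not a reproduction of their derivation) is Fisher information: I(s) = Σᵢ fᵢ′(s)² / fᵢ(s), where fᵢ(s) is neuron i's tuning curve. A neuron contributes most where its tuning curve is steep while its mean rate is still low, i.e. on the rising flank near threshold, which matches the statement above.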

Deneve et al.'s model (2001) does not read a population code out into a single estimate; it mainly recovers a clean population code from a noisy one.