Cuppini et al. present a model of the SC that exhibits many of the properties regarding neural connectivity, electrophysiology, and development that have been found experimentally in nature.

The model of the SC due to Cuppini et al. reproduces development of

  1. multi-sensory neurons
  2. multi-sensory enhancement
  3. intra-modality depression
  4. super-additivity
  5. inverse effectiveness

The model due to Cuppini et al. comprises distinct neural populations for

  1. anterior ectosylvian sulcus (AES) and auditory subregion of AES (FAES)
  2. inhibitory interneurons between AES/FAES and SC
  3. space-coded ascending inputs (visual, auditory) to the SC
  4. inhibitory ascending interneurons
  5. (potentially) multi-sensory SC neurons.

The model due to Cuppini et al. does not need neural multiplication to implement superadditivity or inverse effectiveness. Instead, it exploits the sigmoid transfer function in multi-sensory neurons: due to this sigmoid transfer function and due to less-than-unit weights between input and multi-sensory neurons, weak stimuli that fall into the low linear regions of input neurons evoke less than linear responses in multi-sensory neurons. However, the sum of two such stimuli (from different modalities) can be in their linear range and thus the result can be much greater than the sum of the individual responses.
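
A minimal numerical sketch of this mechanism (in Python; the weight, threshold, and stimulus values are illustrative, not Cuppini et al.'s parameters):

```python
import numpy as np

def sigmoid(x, threshold=6.0, slope=1.0):
    """Sigmoid transfer function of a model multi-sensory neuron."""
    return 1.0 / (1.0 + np.exp(-slope * (x - threshold)))

w = 0.8                                  # less-than-unit input weight
visual, auditory = 3.0, 3.0              # two weak uni-sensory stimuli

r_v = sigmoid(w * visual)                # response to visual alone   (~0.03)
r_a = sigmoid(w * auditory)              # response to auditory alone (~0.03)
r_av = sigmoid(w * (visual + auditory))  # response to both together  (~0.23)

print(r_av > r_v + r_a)                  # True: superadditivity without multiplication
```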

Through lateral connections, a Hebbian learning rule, and approximate initialization, Cuppini et al. manage to learn register between sensory maps. This can be seen as an implementation of a SOM.

Cuppini et al. use mutually inhibitive, modality-specific inhibition (inhibitory inter-neurons that get input from one modality and inhibit inhibitory interneurons receiving input from different modalities) to implement a winner-take-all mechanism between modalities; this leads to a visual (or auditory) capture effect without functional multi-sensory integration.

Their network model builds upon their earlier single-neuron model.

Not sure about the biological motivation of this. Also: it would be interesting to know if functional integration still occurs.

Cuppini et al. do not evaluate their model's performance (comparability to cat/human performance, optimality...)

The model due to Cuppini et al. is inspired only by observed neurophysiology; it has no normative inspiration.

Soltani and Wang propose an adaptive neural model of Bayesian inference neglecting any priors and claim that it is consistent with certain observations in biology.

Soltani and Wang argue that their model is consistent with the 'base rate neglect' fallacy.

Soltani and Wang propose an adaptive model of Bayesian inference with binary cues.

In their model, a synaptic weight codes for the ratio of synapses in a set which are activated vs. de-activated by the binary cue encoded in their pre-synaptic axon's activity.

The stochastic Hebbian learning rule makes the synaptic weights correctly encode log posterior probabilities and the neurons will encode reward probability correctly.

Simulations are different from experiments on the `real thing', but that is true also of all other kinds of theoretical model.

Computer simulations have benefits over empirical experiments:

  • wide ranges of initial conditions can be tested;
  • they can be replicated exactly;
  • they can be performed where the corresponding experiment would be impossible or unfeasible;
  • they are Gedankenexperimente without the psychological biases (well, somewhat);
  • they are more amenable to in-depth inspection regarding satisfaction of assumptions—code can be validated, reality cannot;
  • they can be used to guide analytical research;

A simulation can be thought of as a thought experiment: Given a correct mathematical model of something, it tries out how that model behaves and translates (via the output representation and interpretation) the behavior back into the realm of the real world.

I would add that a model need not be correct if the purpose of the simulation is to test the model's correctness. In that case, the thought experiment tests the hypothesis that the model is indeed correct for the object or process it is supposed to model, by generating predictions (solutions to the mathematical model). Those predictions are then compared to existing behavioral data of the object or process being modeled.

A computer simulation then is a thought experiment carried out by a computer.

Behrens et al. modeled learning of reward probabilities using the model of a Bayesian learner.

SOM-based algorithms have been used to model several features of natural visual processing.

Miikkulainen et al. use their SOM-based algorithms to model the visual cortex.

Miikkulainen et al. use a hierarchical version of their SOM-based algorithm to model natural development of visual capabilities.

The theoretical accounts of multi-sensory integration due to Beck et al. and Ma et al. do not learn and leave little room for learning.

Thus, they fail to explain an important aspect of multi-sensory integration in humans.

Weisswange et al.'s model does not reproduce population coding.

Bauer and Wermter show how probabilistic population codes and near-optimal integration can develop.

Weisswange et al.'s model uses a softmax function to normalize the output.

Verschure champions his model of Distributed Adaptive Control as a model comprising all aspects of the mind, brain, body nexus.

Verschure states his Distributed Adaptive Control (DAC) provides a solution to the symbol grounding problem.

The state spaces in the formal definition of Verschure's DAC already seem to comprise symbols.

Verschure states his is an early model in the tradition of what he calls the "predictive brain" hypothesis and relates it to Friston's free energy principle and Kalman filtering.

Distributed Adaptive Control is a system that can learn sensory-motor contingencies.

Verschure explains that, in his DAC system, the contextual layer overrules the adaptive layer as soon as it is able to predict perception well enough.

One version of DAC uses SOMs.

Osborne et al. modeled performance of monkeys in a visual smooth pursuit task. According to their model, variability in this task is due mostly to estimation errors and not due to motor errors.

A traditional model of visual processing for perception and action proposes that the two tasks rely on different visual representations. This model explains the weak effect of visual illusions like the Müller-Lyer illusion on performance in grasping tasks.

Foster et al. challenge the methodology used in a previous study by Dewar and Carey which supports the perception and action model of visual processing due to Goodale and Milner.

They do that by changing the closed visual-action loop in Dewar and Carey's study into an open one by removing visual feedback at motion onset. The result is that the effect of the illusion is there for grasping (which it wasn't in the closed-loop condition) but not (as strongly) for manual object size estimation.

Foster et al. argue that this suggests that the effect found in Dewar and Carey's study is due to continuous visual feedback.

Rucci et al. present a robotic system based on their neural model of audiovisual localization.

There are a number of approaches for audio-visual localization. Some with actual robots, some just as theoretical ANN or algorithmic models.

Rucci et al. present an algorithm which performs auditory localization and combines auditory and visual localization in a common SC map. The mapping between the representations is learned using value-dependent learning.

Rucci et al.'s neural network learns how to align ICx and SC (OT) maps by means of value-dependent learning: The value signal depends on whether the target was in the fovea after a saccade.

Rucci et al.'s model of learning to combine ICx and SC maps does not take into account the point-to-point projections from SC to ICx reported later by Knudsen et al.

Rucci et al.'s plots of ICc activation look very similar to Jorge's IPD matrices.

Deneve et al. propose a recurrent network which is able to fit a template to (Poisson-)noisy input activity, implementing an estimator of the original input. The authors show analytically and in simulations that the network is able to approximate a maximum likelihood estimator. The network’s dynamics are governed by divisive normalization and the neural input tuning curves are hard-wired.

SOMs have been used to model biology.

Adams et al. use SOM-like algorithms to model biological sensori-motor control and develop robotic sensori-motor controllers.

Chalk et al. hypothesize that biological cognitive agents learn a generative model of sensory input and rewards for actions.

Soltani and Wang propose a learning algorithm in which neurons predict rewards for actions based on individual cues. The winning neuron stochastically gets reward depending on the action taken.

One of the benefits of Soltani and Wang's model is that it does not require their neurons to perform complex computations. By simply counting active synapses, they calculate log probabilities of reward. The learning rule is what makes sure the correct number of neurons are active given the input.
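
A hedged sketch of the counting idea (my simplification, not Soltani and Wang's actual circuit; pool sizes and weights are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n_synapses = 1000          # binary synapses per cue-specific pool

def active_count(w, cue_present):
    """Potentiated synapses activated by a cue; w is the learned fraction
    of the pool in the potentiated state."""
    return rng.binomial(n_synapses, w) if cue_present else 0

# If learning drives each w toward a monotonic function of the log
# posterior odds of reward given its cue, then simply summing active
# synapses across cues approximates summing log-odds terms.
weights = [0.7, 0.55, 0.9]            # one learned fraction per cue
cues = [True, False, True]
drive = sum(active_count(w, c) for w, c in zip(weights, cues))
print(drive)   # the neuron's input: a count, no complex computation needed
```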

Soltani and Wang only consider percepts and reward. They do not model any generative causes behind the two.

In Chalk et al.'s model, low-level sensory neurons are responsible for calculating the probabilities of high-level hidden variables given certain features being present or not. Other neurons are then responsible for predicting the rewards of different actions depending on the presumed state of those hidden variables.

In Chalk et al.'s model, neurons update their parameters online, ie. during the task. In one condition of their experiments, only neurons predicting reward are updated, in others, perceptual neurons are updated as well. Reward prediction was better when perceptual responses were tuned as well.

SOMs and SOM-like algorithms have been used to model natural multi-sensory integration in the SC.

Anastasio and Patton model the deep SC using SOM learning.

Anastasio and Patton present a model of multi-sensory integration in the superior colliculus which takes into account modulation by uni-sensory projections from cortical areas.

In the model due to Anastasio and Patton, deep SC neurons combine cortical input multiplicatively with primary input.

Anastasio and Patton's model is trained in two steps:

First, connections from primary input to deep SC neurons are adapted in a SOM-like fashion.

Then, connections from uni-sensory, parietal inputs are trained, following an anti-Hebbian regime.

The latter phase ensures the principles of modality-matching and cross-modality.

SOM learning produces clusters of neurons with similar modality responsiveness in the SC model due to Anastasio and Patton.

The model due to Anastasio and Patton reproduces multi-sensory enhancement.

Deactivating modulatory, cortical input also deactivates multi-sensory enhancement.

In order to work with spatial information from different sensory modalities and use it for motor control, coordinate transformation must happen at some point during information processing. Pouget and Sejnowski state that in many instances such transformations are non-linear. They argue that functions describing receptive fields and neural activation can be thought of and used as basis functions for the approximation of non-linear functions such as those occurring in sensory-motor coordinate transformation.

Magosso et al. present a recurrent ANN model which replicates the ventriloquism effect and the ventriloquism aftereffect.

A network with Hebbian and anti-Hebbian learning can produce a sparse code. Excitatory connections from input to output are learned Hebbian, while inhibitory connections between output neurons are learned anti-Hebbian.

In Anastasio and Patton's SC model, the spatial organization of the SOM is not used to represent the spatial organization of the outside world, but to distribute different sensitivities to the input modalities in different neurons.

It's a bit strange that Anastasio and Patton's and Martin et al.'s SC models do not use the spatial organization of the SOM to represent the spatial organization of the outside world, but to distribute different sensitivities to the input modalities in different neurons.

KNN (or sparse coding) seems to be more appropriate for that.

Beck et al. model build-up in the SC as accumulation of evidence from sensory input.

Beck et al. argue that simply adding time point-to-time point responses of a population code will integrate the information optimally if the noise in the input is what they call "Poisson-like".

That is somewhat expected as in a Poisson distribution with mean $\lambda$ the variance is $\lambda$ and the standard deviation is $\sqrt{\lambda}$ and adding population responses is equivalent to counting spikes over a longer period of time, thus increasing the mean of the distribution.
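
Spelling the arithmetic out: if each time point contributes an independent Poisson response with mean $\lambda$, the sum over $T$ time points has mean $T\lambda$ and standard deviation $\sqrt{T\lambda}$, so the signal-to-noise ratio

\[
\mathrm{SNR} = \frac{T\lambda}{\sqrt{T\lambda}} = \sqrt{T\lambda}
\]

grows with the square root of the accumulation time.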

Many models of Bayesian integration of neural responses rely on hand-crafted connectivity.

Lee and Mumford interpret the visual pathway in terms of Bayesian belief propagation: each stage in the processing uses output from the one further up as contextual information and output from the one further down as evidence to update its belief and corresponding output.

Each layer thus calculates probabilities of features of the visual display given noisy and ambiguous input.

The model proposed by Heinrich et al. builds upon the one by Hinoshita et al. It adds visual input and thus shows how learning of language may not only be grounded in perception of verbal utterances, but also in visual perception.

Hinoshita et al. propose a model of natural language acquisition based on a multiple-timescale recurrent artificial neural network (MTRNN).

According to Quaia, the Robinson model of saccade generation introduced the idea that saccades are controlled by a feedback loop in which the current eye position is compared to the target eye position and corrective motor signals are issued accordingly.

This idea was integrated in a family of later models.

Quaia et al. present a model of the saccadic system involving SC and cerebellum, which reproduces the fact that the ability to generate fast and precise saccades recovers after ablation of the SC.

Lawrence et al. train different kinds of recurrent neural networks to classify sentences as grammatical or agrammatical.

Lawrence et al. manage to train ANNs to learn grammar-like structure without them having any inbuilt representation of grammar. They argue that this shows that Chomsky's assumption that humans must have inborn linguistic capabilities is unnecessary.

Hinoshita et al. argue that by watching language learning in RNNs, we can learn about how the human brain might self-organize to learn language.

Butts and Goldman use Gaussian functions to model the receptive fields of V1 neurons.

Gaussian functions have been used to model the receptive fields of sensory neurons.

Tabareau et al. propose a scheme for a transformation from the topographic mapping in the SC to the temporal code of the saccadic burst generators.

According to their analysis, that code needs to be either linear or logarithmic.

Girard and Berthoz review saccade system models including models of the SC.

Except for two of the SC models, all focus on generation of saccades and do not consider sensory processing and in particular multisensory integration.

Ghahramani et al. infer the cost function presumably guiding natural multisensory integration from behavioral data.

Ghahramani et al. model multisensory integration as a process minimizing uncertainty.

Roach et al. present a Bayesian model of multisensory integration which takes into account the fact that information from different modalities is only integrated up to a certain amount of incongruence. That model incorporates a Gaussian prior on distances between actual components in cross-sensory stimuli.

With appropriate parameterization, the model proposed by Roach et al. should produce results much like model selection. It is mathematically a little simpler because no explicit decision needs to be made. However, the motivation of a Gaussian function for modeling the actual distance between components in cross-sensory stimuli is a bit unclear: Either the two components belong to a common source or they do not. Why should independent auditory and visual stimuli have a tendency to be close together?

A deep SC neuron which receives enough information from one modality to reliably determine whether a stimulus is in its receptive field does not improve its performance much by integrating information from another modality.

Patton et al. use this insight to explain the diversity of uni-sensory and multisensory neurons in the deep SC.

Anastasio et al. drop the strong probabilistic interpretation of SC neurons' firing patterns in their learning model.

Rowland et al. present four SC models.

The first SC model presented by Rowland et al. is a single-neuron model in which sensory and cortical input is simply summed and passed through a sigmoid squashing function.

The sigmoid squashing function used in Rowland et al.'s first model leads to inverse effectiveness: The sum of weak inputs generally falls into the supra-linear part of the sigmoid and thus produces a superadditive response.

The SC model presented by Cuppini et al. has a circular topology to prevent the border effect.

Rucci et al. model learning of audio-visual map alignment in the barn owl SC. In their model, projections from the retina to the SC are fixed (and visual RFs are therefore static) and connections from ICx are adapted through value-dependent learning.

The model of biological computation of ITDs proposed by Jeffress extracts ITDs by means of delay lines and coincidence detecting neurons:

The peaks of the sound pressure at each ear lead, via a semi-mechanical process, to peaks in the activity of certain auditory nerve fibers. Those fibers connect to coincidence-detecting neurons. Different delays in connections from the two ears lead to coincidence for different ITDs, thus making these coincidence-detecting neurons selective for different angles to the sound source.
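
A toy sketch of the delay-line idea (signal, sampling rate, and delays are illustrative, and a real MSO is of course spike-based rather than waveform-based):

```python
import numpy as np

fs = 40000                                # sample rate in Hz
t = np.arange(0, 0.05, 1 / fs)
true_itd = 8                              # interaural time difference, in samples
left = np.sin(2 * np.pi * 500 * t)        # 500 Hz tone at the left ear
right = np.roll(left, true_itd)           # same tone, delayed at the right ear

# Each coincidence detector sees the left signal through its own delay line
# and responds most strongly when its delay cancels the ITD.
delays = np.arange(16)
coincidence = [np.sum(np.roll(left, d) * right) for d in delays]
print(delays[np.argmax(coincidence)])     # 8: the matching delay line wins
```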

Liu et al.'s model of the IC includes a Jeffress-type model of the MSO.

Jeffress' model has been extremely successful, although neurophysiological evidence is scarce (because the MSO apparently is hard to study).

Jeffress' model predicts a spatial map of ITDs in the MSO. Recent evidence seems to suggest that this map indeed exists.

Dávila-Chacón et al. show that the Liu et al. model of natural binaural sound source localization can be transferred to the Nao robot, where it shows significant resilience to noise.

Their system can localize sounds with a spatial resolution of 15 degrees.

The binaural sound source localization system based on the Liu et al. model does not by itself perform satisfactorily on the iCub due to the robot's ego noise, which is greater than that of the Nao (~60 dB compared to ~40 dB).

The model of natural multisensory integration and localization is based on the leaky integrate-and-fire neuron model.

Rucci et al. explain audio-visual map registration and learning of orienting responses to audio-visual stimuli by what they call value-dependent learning: After each motor response, a modulatory system evaluated whether that response was good, bringing the target into the center of the visual field of the system, or bad. The learning rule used by the system was such that it strengthened connections between neurons from the different neural subpopulations of the network if they were highly correlated whenever the modulatory response was strong, and weakened otherwise.
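
A hedged sketch of such a rule (my formulation, not Rucci et al.'s equations; the baseline and learning rate are illustrative):

```python
import numpy as np

def value_dependent_update(w, pre, post, value, eta=0.01, baseline=0.5):
    """Strengthen correlated pre-/post-synaptic activity when the modulatory
    value signal is high (good orienting response), weaken it when low."""
    hebbian = np.outer(post, pre)                 # correlation term
    return w + eta * (value - baseline) * hebbian

w = np.zeros((4, 4))                              # post x pre weight matrix
pre, post = np.random.rand(4), np.random.rand(4)
w = value_dependent_update(w, pre, post, value=1.0)  # target foveated: strengthen
w = value_dependent_update(w, pre, post, value=0.0)  # target missed: weaken
```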

Rucci et al.'s system comprises artificial neural populations modeling MSO (aka. the nucleus laminaris), the central nucleus of the inferior colliculus (ICc), the external nucleus of the inferior colliculus (ICx), the retina, and the superior colliculus (SC, aka. the optic tectum). The population modeling the SC is split into a sensory and a motor subpopulation.

In Rucci et al.'s system, the MSO is modeled by computing Fourier transforms for each of the auditory signals. The activity of the MSO neurons is then determined by their individual preferred frequency and ITD and computed directly from the Fourier-transformed data.

In Rucci et al.'s model, neural weights are updated between neural populations modeling

  • ICc and ICx
  • sensory and motor SC.

The superficial SC is modeled in Casey et al.'s system by two populations of center-on and center-off cells (whose receptive fields are modeled by a difference of Gaussians) and four populations of direction-sensitive cells.

When he introduced his model of a computing machine, Alan Turing designed it to mimic human computation.

Recent neurophysiological evidence seems to contradict the details of Jeffress' model.

Weber presents a Helmholtz machine extended by adaptive lateral connections between units and a topological interpretation of the network. A Gaussian prior over the population response (a prior favoring co-activation of close-by units) and training with natural images lead to spatial self-organization and feature-selectivity similar to that in cells in early visual cortex.

According to Wilson and Bednar, there are four main families of theories concerning topological feature maps:

  • input-driven self-organization,
  • minimal-wire length,
  • place-coding theory,
  • Turing pattern formation.

Wilson and Bednar argue that input-driven self-organization and Turing pattern formation explain how topological maps may arise from useful processes, but they do not explain why topological maps are useful in themselves.

According to Wilson and Bednar, wire-length optimization presupposes that neurons need input from other neurons with similar feature selectivity. Under that assumption, wire length is minimized if neurons with similar selectivities are close to each other. Thus, the kind of continuous topological feature maps we see optimize wire length.

The idea that neurons should especially require input from other neurons with similar spatial receptive fields is unproblematic. However, Wilson and Bednar argue that it is unclear why neurons should especially require input from neurons with similar non-spatial feature preferences (like orientation, spatial frequency, smell, etc.).

Koulakov and Chklovskii assume that sensory neurons in cortex preferentially connect to other neurons whose feature-preferences do not differ more than a certain amount from their own feature-preferences. Further, they argue that long connections between neurons incur a metabolic cost. From this, they derive the hypothesis that the patterns of feature selectivity seen in neural populations are the result of minimizing the distance between similarly selective neurons.

Koulakov and Chklovskii show that various selectivity patterns emerge from their theorized cost minimization, given different parameterizations of preference for connections to similarly-tuned neurons.
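
A toy version of the wire-length argument (assumptions mine: a 1D grid, and connections between all neurons whose preferred orientations differ by less than a tolerance):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
smooth = np.linspace(0, np.pi, n)        # orderly map of preferred orientations
scrambled = rng.permutation(smooth)      # same tunings, random spatial layout

def wire_length(prefs, tol=0.2):
    """Total wiring cost if similarly tuned neurons must connect."""
    cost = 0
    for i in range(n):
        for j in range(i + 1, n):
            if abs(prefs[i] - prefs[j]) < tol:
                cost += abs(i - j)       # wire spans the grid distance
    return cost

print(wire_length(smooth), wire_length(scrambled))  # the smooth map is far cheaper
```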

Pooling the activity of a set of similarly-tuned neurons is useful for increasing the sharpness of tuning. A neuron which pools from a set of similarly-tuned neurons would have to make shorter connections if these neurons are close together. Thus, there is a reason why it can be useful to connect preferentially to a set of similarly-tuned neurons. This reason might be part of the reason behind topographic maps.

Krasne et al. present an ANN model for fear conditioning.

Eliasmith et al. model sensory-motor processing as task-dependent compression of sensory data and decompression of motor programs.

A simple MLP would probably be able to learn optimal multi-sensory integration via backprop.

Using a space-coded approach instead of an MLP for learning multi-sensory integration has benefits:

  • learning is unsupervised
  • can work with missing data

Verschure says neurons don't seem to multiply. Gabbiani et al. say they might.

The fact that multi-sensory integration arises without reward connected to stimuli motivates unsupervised learning approaches to SC modeling.

Ma, Beck, Latham and Pouget argue that optimal integration of population-coded probabilistic information can be achieved by simply adding the activities of neurons with identical receptive fields. The preconditions for this to hold are

  • independent Poisson (or other "Poisson-like") noise in the input
  • identically-shaped tuning curves in input neurons
  • a point-to-point connection from neurons in different populations with identical receptive fields to the same output neuron.
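
A minimal sketch of the additive scheme under these preconditions (gains and tuning widths are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
prefs = np.linspace(-40, 40, 81)         # shared preferred locations

def population_response(stimulus, gain):
    """Identical Gaussian tuning curves with independent Poisson spike
    counts; gain reflects cue reliability in this family of codes."""
    tuning = gain * np.exp(-0.5 * ((prefs - stimulus) / 10.0) ** 2)
    return rng.poisson(tuning)

r_vis = population_response(stimulus=5.0, gain=20.0)   # reliable cue
r_aud = population_response(stimulus=8.0, gain=5.0)    # unreliable cue
r_sum = r_vis + r_aud                    # point-to-point addition, nothing else
print(prefs[np.argmax(r_sum)])           # combined estimate, biased toward vision
```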

It's hard to unambiguously interpret Ma et al.'s paper, but it seems that, according to Renart and van Rossum, any other non-flat profile would also transmit the information optimally, although the decoding scheme would maybe have to be different.

Renart and van Rossum discuss optimal connection weight profiles between layers in a feed-forward neural network. They come to the conclusion that, if neurons in the input population have broad tuning curves, then Mexican-hat-like connectivity profiles are optimal.

Renart and van Rossum state that any non-flat connectivity profile between input and output layers in a feed-forward network yields optimal transmission if there is no noise in the output.

Patrick Winston differentiates three different kinds of models:

  • those that mimic behaviour
  • those that make predictions
  • those that increase understanding

Patrick Winston differentiates two kinds of cognitive performance:

  • reactive, "thermometer"-like behavior,
  • predictive, "model making" behavior

In Anastasio et al.'s model of multi-sensory integration in the SC, an SC neuron is connected to one neuron from each modality whose spiking behavior is a (Poisson) probabilistic function of whether there is a target in that modality or not.

Their single SC neuron then computes the posterior probability of there being a target given its inputs (evidence) and the prior.

Under the assumption that neural noise is independent between neurons, Anastasio et al.'s approach can be extended by making each input neuron its own modality.

Bayesian integration becomes more complex, however, because receptive fields are not sharp. The formulae still hold, but the neurons cannot simply use Poisson statistics to integrate.

Anastasio et al. use their model to explain enhancement and the principle of inverse effectiveness.

The model due to Ma et al. is simple and it requires no learning.

My model is normative, performs optimally and it shows super-additivity (to be shown).

Fetsch et al. explain the discrepancy between observed neurophysiology—superadditivity—and the normative solution to single-neuron cue integration proposed by Ma et al. using divisive normalization:

They propose that the network activity is normalized in order to keep neurons' activities within their dynamic range. This would lead to the apparent reliability-dependent weighting of responses found by Morgan et al. and superadditivity as described by Stanford et al.

Fetsch et al. acknowledge the similarity of their model with that of Ohshiro et al.

Fetsch et al. provide some sort of normative motivation to the model due to Ohshiro et al.

Zhao et al. propose a model which develops perception and behavior in parallel.

Their motivation is the embodiment idea, which states that perception and behavior develop in behaving animals.

Disparity-selective cells in visual cortical neurons have preferred disparities of only a few degrees whereas disparity in natural environments ranges over tens of degrees.

The possible explanation offered by Zhao et al. assumes that animals actively keep disparity within a small range during development, and that therefore only selectivity for small disparities develops.

Zhao et al. present a model of joint development of disparity selectivity and vergence control.

Zhao et al.'s model develops both disparity selection and vergence control in an effort to minimize reconstruction error.

It uses a form of sparse coding to learn to approximate its input and a variation of the actor-critic learning algorithm called the natural actor-critic reinforcement learning algorithm (NACREL).

The teaching signal to the NACREL algorithm is the reconstruction error of the model after the action produced by it.

Mixing Hebbian (unsupervised) learning with feedback can guide the unsupervised learning process in learning interesting, or task-relevant things.

Classical models assume that learning in cortical regions is well described in an unsupervised learning framework while learning in the basal ganglia can be modeled by reinforcement learning.

Representations in the cortex (eg. V1) develop differently depending on the task. This suggests that some sort of feedback signal might be involved and learning in the cortex is not purely unsupervised.

Some task-dependency in representations may arise from embodied learning where actions bias experiences being learned from.

Conversely, the narrow range of disparities reflected in disparity-selective cells in visual cortex neurons might be due to goal-directed feature learning.

Unsupervised learning models have been extended with aspects of reinforcement learning.

The algorithm presented by Weber and Triesch borrows from SARSA.

SOMs can be used for preprocessing in reinforcement learning, simplifying their high-dimensional input via their winner-take-all characteristics.

However, since standard SOMs do not get any goal-dependent input, they focus on globally strongest features (statistically most predictive latent variables) and under-emphasize features which would be relevant for the task.
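
A hedged sketch of the preprocessing scheme (a pre-trained SOM is stubbed with random weights here; sizes and hyperparameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
som_weights = rng.random((25, 10))   # 25 units, 10-dimensional inputs
q_table = np.zeros((25, 4))          # one row per SOM unit, one column per action

def winner(observation):
    """The SOM's winner-take-all step turns an observation into a state index."""
    return int(np.argmin(np.linalg.norm(som_weights - observation, axis=1)))

def q_update(obs, action, reward, next_obs, alpha=0.1, gamma=0.9):
    s, s_next = winner(obs), winner(next_obs)
    q_table[s, action] += alpha * (reward + gamma * q_table[s_next].max()
                                   - q_table[s, action])

q_update(rng.random(10), action=2, reward=1.0, next_obs=rng.random(10))
```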

Weisswange et al. distinguish between two strategies for Bayesian multisensory integration: model averaging and model selection.

The model averaging strategy computes the posterior probability for the position of the signal source, taking into account the possibility that the stimuli had the same source and the possibility that they had two distinct sources.

The model selection strategy computes the most likely of these two possibilities. This has been called causal inference.
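
A hedged sketch of both strategies for one-dimensional localization (Gaussian likelihoods; the flat-prior constant for the two-source model and all parameters are illustrative):

```python
import numpy as np

x_v, x_a = 2.0, 6.0              # visual and auditory position estimates
var_v, var_a = 1.0, 4.0          # their variances
p_common = 0.5                   # prior probability of a single common source

def gauss(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

# Likelihood of the observed disparity under a common source (source
# positions marginalized out with a flat prior); the two-source likelihood
# reduces to a constant, set to an arbitrary value here.
like_common = gauss(x_v, x_a, var_v + var_a)
like_separate = 0.05
post_common = p_common * like_common / (
    p_common * like_common + (1 - p_common) * like_separate)

fused = (x_v / var_v + x_a / var_a) / (1 / var_v + 1 / var_a)

averaging = post_common * fused + (1 - post_common) * x_v  # mix both models
selection = fused if post_common > 0.5 else x_v            # commit to one model
```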

The race model of multi-sensory integration assumes that the reaction to a multi-sensory stimulus is as fast as the fastest reaction to any of the individual stimuli.

Weisswange et al. model learning of multisensory integration using reward-mediated / reward-dependent learning in an ANN, a form of reinforcement learning.

They model a situation similar to the experiments due to Neil et al. and Körding et al. in which a learner is presented with visual, auditory, or audio-visual stimuli.

In each trial, the learner is given reward depending on the accuracy of its response.

In an experiment where stimuli could be caused by the same or by different sources, Weisswange et al. found that their model behaves similarly to both model averaging and model selection, though slightly more like the former.

Fujita models saccade suppression of endpoint variability by the cerebellum using their supervised ANN model for learning a continuous function of the integral of an input time series.

He assumes that the input activity originates from the SC and that the correction signal is supplied by sensory feedback.

In his model, Fujita abstracts away from the population coding present in the multi-sensory/motor layers of the SC.

A faithful model of the SC should probably adapt the mapping of auditory space in the SC and in another model representing ICx.

Colonius and Diederich argue that deep-SC neurons' spiking behavior can be interpreted as a vote for a target rather than a non-target being in their receptive field.

This is similar to Anastasio et al.'s previous approach.

There are a number of problems with Colonius' and Diederich's idea that deep-SC neurons' binary spiking behavior can be interpreted as a vote for a target rather than a non-target being in their RF. First, these neurons' RFs can be very broad, and the strength of their response is a function of how far away the stimulus is from the center of their RFs. Second, the response strength is also a function of stimulus strength. It needs some arguing, but to me it seems more likely that the response encodes the probability of a stimulus being in the center of the RF.

Colonius and Diederich argue that, given their Bayesian, normative model of neurons' response behavior, neurons responding to only one sensory modality outperform neurons responding to multiple sensory modalities.

Colonius' and Diederich's explanation for uni-sensory neurons in the deep SC has a few weaknesses: First, they model the input spiking activity for both the target and the non-target case as Poisson distributed. This is a problem, because the input spiking activity is really a function of the target distance from the center of the RF. Second, they explicitly model the probability of the visibility of a target to be independent of the probability of its audibility.

If SC neurons' spiking behavior can be interpreted as a vote for a target rather than a non-target being in their receptive field, then the decisions must be made somewhere else, because the votes themselves do not take utility into account.

De Kamps and van der Velde introduce a neural blackboard architecture for representing sentence structure.

De Kamps and van der Velde use their blackboard architecture for two very different tasks: representing sentence structure and object attention.

Deco and Rolls introduce a system that uses a trace learning rule to learn recognition of more and more complex visual features in successive layers of a neural architecture. In each layer, the specificity of the features increases together with the receptive fields of neurons until the receptive fields span most of the visual range and the features actually code for objects. This model thus is a model of the development of object-based attention.

Using multiple layers each of which learns with a trace rule with successively larger time scales is similar to the CTRNNs Stefan Heinrich uses to learn the structure of language. Could there be a combined model of learning of sentence structure and language processing on the one hand and object-based visual or multi-modal attention on the other?

Wozny et al. distinguish between three strategies for multisensory integration: model averaging, model selection, and probability matching.

Wozny et al. found in an audio-visual localization experiment that a majority of their participants' performance was best explained by the statistically sub-optimal probability matching strategy.

Probability matching is a sub-optimal decision strategy, statistically, but it can have advantages because it leads to exploration.

Weisswange et al.'s results seem at odds with those of Wozny et al. However, Wozny et al. state that different strategies may be used in different settings.

Pavlou and Casey model the SC.

They use Hebbian, competitive learning to learn a topographic mapping between modalities.

They also simulate cortical input.

Martin et al. model multisensory integration in the SC using a SOM algorithm.

Input in Martin et al.'s model of multisensory integration in the SC is an $m$-dimensional vector for every data point, where $m$ is the number of modalities. Data points are uni-modal, bi-modal, or tri-modal. Each dimension of the data point codes stochastically for the combination of modalities of the data point. The SOM learns to map different modality combinations to different regions into its two-dimensional grid.

Martin et al.'s model of multisensory integration in the SC replicates enhancement and, through the non-linear transfer function, superadditivity.

The leaky-integrate-and-fire model due to Rowland and Stein models a single multisensory SC neuron receiving input from a number of sensory, cortical, and sub-cortical sources.

Each of the sources is modeled as a single input to the SC neuron.

Local inhibitory interaction between neurons in multi-sensory trials is modeled by a single time-variant subtractive term which sets in shortly after the actual sensory input, thus not influencing the first phase of the response after stimulus onset.

The model due to Rowland and Stein does not consider the spatial properties of input or output. In reality, the same source of input (retina, LGN, association cortex) may convey information about stimulus conditions from different regions in space, and neurons at different positions in the SC react to different stimuli.

Rowland and Stein focus on the temporal dynamics of multisensory integration.

Rowland and Stein's goal is only to generate neural responses like those observed in real SC neurons with realistic biological constraints. The model does not give any explanation of neural responses on the functional level.

The network characteristics of the SC are modeled only very roughly by Rowland and Stein's model.

The model due to Rowland and Stein manages to reproduce the nonlinear time course of neural responses as well as enhancement in magnitude and inverse effectiveness in multisensory integration in the SC.

Since the model does not include spatial properties, it does not reproduce the spatial principle (ie. no depression).

ANNs implementing DBNs have been around for a long time (they go back at least to Fukushima's Neocognitron).

The motmap algorithm uses reinforcement learning to organize behavior in a two-dimensional map.

Divisive normalization models have explained how attention can facilitate or suppress some neurons' responses.

Some models view attentional changes of neural responses as the result of Bayesian inference about the world based on changing priors.

Chalk et al. argue that changing the task should not change expectations—change the prior—about the state of the world. Rather, they might change the model of how reward depends on the state of the world.

Patton and Anastasio present a model of "enhancement and modality-specific suppression in multi-sensory neurons" that requires no multiplicative interaction. It is a follow-up of their earlier functional model of these neurons which requires complex computation.

Anastasio et al. present a model of the response properties of multi-sensory SC neurons which explains enhancement, depression, and super-additivity using Bayes' rule: If one assumes that a neuron integrates its input to infer the posterior probability of a stimulus source being present in its receptive field, then these effects arise naturally.

Anastasio et al.'s model of SC neurons assumes that these neurons receive multiple inputs with Poisson noise and apply Bayes' rule to calculate the posterior probability of a stimulus being in their receptive fields.

Anastasio et al. point out that, given their model of SC neurons computing the probability of a stimulus being in their RF with Poisson-noised input, a sigmoid response function arises for uni-sensory input.
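
This can be made explicit: if the spike count $n$ is Poisson with mean $\lambda_1$ when a target is present ($T=1$) and $\lambda_0$ otherwise, then Bayes' rule gives

\[
P(T=1 \mid n) = \frac{1}{1 + e^{-z(n)}},
\qquad
z(n) = n \log\frac{\lambda_1}{\lambda_0} - (\lambda_1 - \lambda_0) + \log\frac{P(T=1)}{P(T=0)},
\]

a logistic (sigmoid) function of the spike count $n$ (assuming a single Poisson input; the prior enters only through the last term).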

Cuppini et al. model SC receptive fields (actually: spatial tuning curves) as continuous functions of the distance of a stimulus from the center of the RF.

In Anastasio et al.'s work, receptive fields are binary: either a stimulus is in the field or it isn't; if the average response of an SC neuron is smaller for stimuli that are further away from the center of the RF, then that's because inference there is less effective.

My explanation for different responsiveness to the individual modalities in SC neurons: they do causal inference/model selection. Different neurons coding for the same point in space specialize in different stimulus (strength) combinations.

This is basically what Anastasio and Patton's model does (except that it does not seem to make sense to me that they use the SOM's spatial organization to represent different sensory combinations).

Deneve describes neurons as integrating probabilities based on single incoming spikes. Spikes are seen as outcomes of Poisson processes and neurons are to infer the hidden value of those processes' parameter(s). She uses the leaky integrate-and-fire neuron as the basis for her model.

Deneve models a changing world; hidden variables may change according to a Markov chain. Her neural model deals with that. Wow.

Hidden variables in Deneve's model seem to be binary. Differences in synapses (actually, their input) are due to weights describing how `informative' of the hidden variable they are.

Leakiness of neurons in Deneve's model is due to changing world conditions.

Neurons in Deneve's model actually generate Poisson-like output themselves (though deterministically).

The process it generates is described as predictive. A neuron $n_1$ fires if the probability $P_1(t)$ estimated by $n_1$ based on its input is greater than the probability $P_2(t)$ estimated by another neuron $n_2$ based on $n_1$'s input.

Deneve's model is not a leaky integrate-and-fire (LIF) model, but she demonstrates the connection. She states that LIF is `far from describing the dynamics of real neurons'.

Although their spiking behavior is described by non-linear functions, the output rate of Deneve's neurons is a linear (rectified) function of the (rate-coded) input.
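
A heavily simplified sketch of that predictive firing scheme (my notation and discretization, not Deneve's equations):

```python
import numpy as np

def run(input_spikes, w=0.8, leak=0.05, jump=0.5, theta=1.0):
    """L: the neuron's log-odds estimate of the hidden variable, built from
    weighted input spikes with a leak for a changing world.  G: the estimate
    a downstream observer could reconstruct from the output spikes alone.
    A spike is fired whenever L runs ahead of G by more than theta."""
    L, G, out = 0.0, 0.0, []
    for s in input_spikes:
        L += w * s - leak * L
        G -= leak * G
        if L - G > theta:
            out.append(1)
            G += jump          # the output spike updates the observer's estimate
        else:
            out.append(0)
    return out

spikes_in = (np.random.default_rng(3).random(200) < 0.3).astype(int)
print(sum(run(spikes_in)))
```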

Cognitive science must not only provide predictive generative models predicting natural cognitive behavior within a normative framework, but also tie in these models with theories on how the necessary computations are realised.

Tasks with high internal complexity can make it necessary to approximate optimal computations.

Such approximative computations can lead to highly suboptimal behavior even without internal or external noise.

Yu and Dayan propose a model of inference and learning in which expected uncertainty is encoded by high acetylcholine (ACh) levels and unexpected uncertainty is encoded by norepinephrine (NE).

Jazayeri and Movshon present an ANN model for computing likelihood functions ($\approx$ probability density functions with uniform priors) from input population responses with arbitrary tuning functions.

Their assumptions are

  • restricted types of noise characteristics (eg. Poisson noise)
  • statistically independent noise

Since they work with log likelihoods, they can circumvent the problem of requiring neural multiplication.

The idea that the SC should learn to move the eyes such that it sees something interesting afterwards is in line with the idea that the brain should represent action pointers instead of actions.

Feldman states that enough is known about what he calls "Visual Feature Binding", so as not to call it a problem anymore.

Feldman explains Visual Feature Binding by the fact that all the features detected in the fovea usually belong together (because it is so small), and through attention. He cites Chikkerur et al.'s Bayesian model of the role of spatial and object attention in visual feature binding.

Feldman states that "Neural realization of variable binding is completely unsolved".

Feldman dismisses de Kamps' and van der Velde's approaches to neural variable binding stating that they don't work for the general case "where new entities and relations can be dynamically added".

The ANN model of multi-sensory integration in the SC due to Ohshiro et al. manages to replicate a number of physiological finding about the SC:

  • inverse effectiveness,
  • long-range inhibition and
  • short-range activation,
  • multisensory integration,
  • different tuning to modalities between neurons,
  • weighting of stimuli from different modalities.

It does not learn and it has no probabilistic motivation.

The ANN model of multi-sensory integration in the SC due to Ohshiro et al. uses divisive normalization to model multisensory integration in the SC.

Certain receptive fields in the cat striate cortex can be modeled reasonably well using linear filters, more specifically Gabor filters.

The receptive field properties of neurons in the cat striate cortex have been modeled as linear filters. In particular three types of linear filters have been proposed:

  • Gabor filters,
  • filters based on second differentials of Gaussian functions,
  • difference of Gaussians filters.

Hawken and Parker studied the response patterns of a large number of cells in the cat striate cortex and found that Gabor filters, filters which are second differential of Gaussian functions, and difference-of-Gaussians filters all model these response patterns well, quantitatively.

They found, however, that difference-of-Gaussians filters strongly outperformed the other models.

Difference-of-Gaussians filters are parsimonious candidates for modeling the receptive fields of striate cortex cells, because the kind of differences of Gaussians used in striate cortex (differences of Gaussians with different peak locations) can themselves be computed linearly from differences of Gaussians which model receptive fields of LGN cells (where the peaks coincide), which provide the input to the striate cortex.

Both simple and complex cells' receptive fields can be described using difference-of-Gaussians filters.
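
A concentric difference-of-Gaussians receptive field of the kind discussed above (the LGN-style case with coincident peaks; all parameters are illustrative):

```python
import numpy as np

x = np.linspace(-5, 5, 201)
xx, yy = np.meshgrid(x, x)
r2 = xx ** 2 + yy ** 2

def gaussian2d(r2, sigma):
    return np.exp(-r2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)

# Narrow excitatory center minus broader inhibitory surround: a center-on cell.
dog = gaussian2d(r2, sigma=0.6) - 0.9 * gaussian2d(r2, sigma=1.5)

# The model neuron's linear response to an image patch is the dot product
# of the patch with this filter.
```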

One family of models for saccades and anti-saccades are the `accumulator models'.

These models posit that the activations of saccade neurons and saccade-suppression neurons race each other; the one first to reach a threshold wins.

Rowland et al. derive a model of cortico-collicular multi-sensory integration from findings concerning the influence of deactivation or ablation of the cortical regions anterior ectosylvian cortex (AES) and rostral lateral suprasylvian cortex.

It is a single-neuron model.

Morén et al. present a spiking model of SC.

Cuppini et al. expand on their earlier work in modeling cortico-tectal multi-sensory integration.

They present a model which shows how receptive fields and multi-sensory integration can arise through experience.

Trappenberg presents a competitive spiking neural network for generating motor output of the SC.

Ghahramani et al. discuss computational models of sensorimotor integration.

Need to look at models of multi-sensory integration as well; they are not necessarily models of the SC, but relevant.

Weisswange et al. apply the idea of Bayesian inference to multi-modal integration and action selection. They show that online reinforcement learning can effectively train a neural network to approximate a Q-function predicting the reward in a multi-modal cue integration task.

Redundancy reduction, predictive coding, efficient coding, sparse coding, and energy minimization are related hypotheses with similar predictions. All these theories are reasonably successful in explaining biological phenomena.

According to Spratling's model, saliency arises from unexpected features in a scene.

The biased competition theory of visual attention explains attention as the effect of low-level stimuli competing with each other for resources—representation and processing. According to this theory, higher-level processes/brain regions bias this competition.

Predictive coding and biased competition are closely related concepts. Spratling combines them in his model and uses it to explain visual saliency.

Yamashita et al. modify Deneve et al.'s network by weakening divisive normalization and lateral inhibition. Their network then integrates localizations if the disparity between localizations in simulated modalities is low, and maintains multiple hills of activation if disparity is high, thereby accounting for the ventriloquism effect.

Yamashita et al. argue that, since whether or not two stimuli in different modalities with a certain disparity are integrated depends on the weight profiles in their network, a Bayesian prior is somehow encoded in these weights.

The model due to Cuppini et al. develops low-level multisensory integration (spatial principle) such that integration happens only with higher-level input.

In their model, Hebbian learning leads to sharpening of receptive fields, overlap of receptive fields, and integration through higher-cognitive input.

Anastasio et al. have come up with a Bayesian interpretation of neural responses to multi-sensory stimuli in the SC. According to their view, enhancement, depression and inverse effectiveness phenomena are due to neurons integrating uncertain information from different sensory modalities.

Deneve describes how neurons performing Bayesian inference on variables behind Poisson inputs can learn the parameters of the Poisson processes in an online variant of the expectation maximization (EM) algorithm.

Deneve associates her EM-based learning rule in Bayesian spiking neurons with spike-timing-dependent plasticity (STDP).

The SOM has ancestors in von der Malsburg's "Self-Organization of Orientation Sensitive Cells in the Striate Cortex" and other early models of self-organization.

The SOM is an abstraction of biologically-plausible ANNs.

The SOM is an asymptotically optimal vector quantizer.

There is no cost function that the SOM algorithm follows exactly.

Quality of order in SOMs is a difficult issue because there is no unique definition of `order' for the $n$-dimensional case if $n>2$.

Nevertheless, there have been a number of attempts.

There have been many extensions of the original SOM ANN, like

  • (Growing) Neural Gas
  • adaptive subspace SOM (ASSOM)
  • Parameterized SOM (PSOM)
  • Stochastic SOM
  • recursive and recurrent SOMs

Recursive and Recurrent SOMs have been used for mapping temporal data.

Von der Malsburg introduces a simple model of self-organization which explains the organization of orientation-sensitive cells in the visual cortex.

Deneve et al.'s model (2001) does not compute a population code; it mainly recovers a clean population code from a noisy one.

Pitti et al. use a Hebbian learning algorithm to learn somato-visual register.

Hebbian learning and in particular SOM-like algorithms have been used to model cross-sensory spatial register (eg. in the SC).

Bauer and Wermter use the algorithm they proposed to model multi-sensory integration in the SC. They show that it can learn to near-optimally integrate noisy multi-sensory information and reproduces spatial register of sensory maps, the spatial principle, the principle of inverse effectiveness, and near-optimal audio-visual integration in object localization.

Pitti et al. claim that their model explains preference for face-like visual stimuli and that their model can help explain imitation in newborns. According to their model, the SC would develop face detection through somato-visual integration.