
Humans can orient towards emotional human faces faster than towards neutral human faces.

The time it takes to elicit a visual cortical response plus the time to elicit a saccade from cortex (FEF) is longer than the time it takes for humans to orient towards faces.

Nakano et al. take this as further evidence for a sub-cortical (retinotectal) route of face detection.

Patients with lesions in striate cortex (V1) were found to still be able to discriminate the gender and expression of faces.

Neurons in the monkey pulvinar react extremely fast (within 50 ms) to visually perceived faces.

The superior colliculus does not receive any signals from short-wavelength cones (S-cones) in the retina.

Nakano et al. presented an image of either a butterfly or a neutral or emotional face to their participants. The stimuli were either grayscale or color-scale images, where color-scale images were isoluminant and only varied in their yellow-green color values. Since information from S-cones does not reach the superior colliculus, these faces were presumably only processed in visual cortex.

Nakano et al. found that their participants reacted to gray-scale emotional faces faster than to gray-scale neutral faces and to gray-scale faces faster than to gray-scale butterflies. Their participants reacted somewhat faster to color-scale faces than to color-scale butterflies, but this effect was much smaller than for gray-scale images. Also, reaction times to color-scale emotional faces were not significantly different from those to color-scale neutral faces.

Nakano et al. take this as further evidence of sub-cortical face detection and in particular of emotional sub-cortical face detection.

Humans can orient towards human faces faster than towards other visual stimuli (within 100ms).

Optimal multi-sensory integration is learned (for many tasks).

In many audio-visual localization tasks, humans integrate information optimally.

In some audio-visual time discrimination tasks, humans do not integrate information optimally.

Soltani and Wang propose an adaptive neural model of Bayesian inference that neglects priors, and claim that it is consistent with certain observations in biology.

LIP seems to encode decision variables for saccade direction.

Soltani and Wang propose an adaptive model of Bayesian inference with binary cues.

In their model, a synaptic weight codes for the ratio of synapses in a set which are activated vs. de-activated by the binary cue encoded in their pre-synaptic axon's activity.

Their stochastic Hebbian learning rule makes the synaptic weights correctly encode log posterior probabilities, so that the neurons come to encode reward probability correctly.
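
A toy simulation of the flavor of learning rule they use (my own construction for illustration, not Soltani and Wang's actual model; all parameter names and values are made up): binary synapses are stochastically potentiated on rewarded trials and depressed on unrewarded ones, so that the fraction of potentiated synapses in the set comes to track the reward probability signalled by the cue.

```python
import numpy as np

rng = np.random.default_rng(0)

n_syn = 1000     # binary synapses driven by one binary cue
alpha = 0.05     # potentiation probability on rewarded trials
beta = 0.05      # depression probability on unrewarded trials
potentiated = rng.random(n_syn) < 0.5   # initial synaptic states

p_reward = 0.8   # reward probability signalled by the cue
for _ in range(2000):
    if rng.random() < p_reward:          # rewarded trial
        flip = (~potentiated) & (rng.random(n_syn) < alpha)
        potentiated = potentiated | flip
    else:                                # unrewarded trial
        flip = potentiated & (rng.random(n_syn) < beta)
        potentiated = potentiated & ~flip

# Steady state: alpha*p / (alpha*p + beta*(1-p)); equal to p when alpha == beta.
print(f"fraction potentiated: {potentiated.mean():.2f} (p_reward = {p_reward})")
```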

Humans' (and other mammals') brains are devoted in large part to visual processing.

Vision is an important, if not the most important, source of sensory input for humans (and other mammals).

Bayesian models have been used to model natural cognition.

Behrens et al. modeled learning of reward probabilities using the model of a Bayesian learner.

Behrens et al. found that humans take into account the volatility of reward probabilities in a reinforcement learning task.

The way they took the volatility into account was qualitatively modelled by a Bayesian learner.
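
A minimal sketch of a volatility-sensitive Bayesian learner (a grid filter with a fixed volatility parameter; Behrens et al.'s actual model is hierarchical and infers the volatility itself, and all names and numbers below are mine). Raising the volatility parameter effectively raises the learning rate.

```python
import numpy as np

p_grid = np.linspace(0.01, 0.99, 99)            # candidate reward probabilities
belief = np.full(len(p_grid), 1 / len(p_grid))  # uniform prior

def update(belief, outcome, volatility=0.05):
    """One trial of Bayesian updating under a volatility prior."""
    # Volatility step: with some probability the true p has changed,
    # so mix the belief with a uniform distribution before observing.
    belief = (1 - volatility) * belief + volatility / len(p_grid)
    # Bayes step: weight each candidate p by the outcome's likelihood.
    likelihood = p_grid if outcome else 1 - p_grid
    belief = belief * likelihood
    return belief / belief.sum()

rng = np.random.default_rng(1)
true_p = 0.8
for trial in range(200):
    if trial == 100:
        true_p = 0.2                             # the environment changes
    belief = update(belief, rng.random() < true_p)

print("posterior mean of p:", round(float((p_grid * belief).sum()), 3))
```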

Although no (actually: hardly any) projections from multisensory or non-visual areas to V1 have been found, auditory input seems to influence neural activity in V1.

LIP projects to V1.

Activity in LIP is influenced by auditory stimuli.

Congenital blindness leads to tactile and auditory stimuli activating early dorsal cortical visual areas.

There are significant projections from auditory cortex as well as from polysensory areas in the temporal lobe to parts of V1 where receptive fields are peripheral.

Activity in the auditory cortex is modulated by visual stimuli.

There doesn't seem to be any region in the brain that is truly and only uni-sensory.

V1 is influenced by auditory stimuli (in different ways).

Auditory cortex is influenced by visual stimuli.

Many neurons in the cat and monkey deep SC are uni-sensory.

Projections from the LSO, MSO, and DCN converge in the central nucleus of the IC (ICC).

Individual IC neurons combine localization cues.

Barn owls use interaural level differences for vertical sound source localization: the feathers on their heads are asymmetric, leading to differences in their ears' sensitivity to sounds from above and below.

There is a theory called the `premotor theory of visual attention' which posits that activity that can ultimately lead to a saccade can also facilitate processing of stimuli at those places the saccade will or would go to.

The SC is involved in a lot of things.

The SC is involved in tongue snapping in toads.

Satou et al. assume there is `switch-like' behavior in toad tongue snapping and predator avoidance.

According to Satou et al., the optic tectum is where the decision to snap the tongue (at insects) is made.

Tongue snapping can be evoked in toads by electrostimulation of neurons in their optic tectum.

The `snapping-evoking area' in the toad optic tectum is in the lateral/ventrolateral part of the OT.

Visual receptive fields in the superficial hamster SC do not vary substantially in RF size with RF eccentricity.

Visual receptive field sizes change with eccentricity in the deep SC; they do not in the superficial hamster SC.

Sensory maps and their registration across modalities have been demonstrated in mice, cats, monkeys, guinea pigs, hamsters, barn owls, and iguanas.

Maps of sensory space in different sensory modalities can, if brought into register, give rise to an amodal representation of space.

If sensory maps of uni-modal space are brought into register, then cues from different modalities can access shared maps of motor space.

Newborns track schematic, face-like visual stimuli in the periphery up to one month of age. They start tracking such stimuli in central vision after about 2 months and stop after about 5 months.

According to Johnson and Morton, there are two visual pathways for face detection: the primary cortical pathway and one through SC and pulvinar.

The cortical pathway is called CONLEARN and is theorized to be plastic, whereas the sub-cortical pathway is called CONSPEC and is thought to be fixed and genetically predisposed to detect conspecific faces.

The SC has been suggested to be involved in social interaction.

LGN and layer 4 of V1 have distinct territories for each eye (eye-specific layers in LGN, ocular dominance columns in V1).

These eye-specific territories in LGN and V1 only arise after the initial projections from the retina are made but, in higher mammals, before birth.

Most neurons in the visual cortex (except those in layer 4) are binocular.

Usually, input from one eye is dominant, however.

The distribution of monocular dominance in visual cortex neurons is drastically affected by monocular stimulus deprivation during early development.

Law and Constantine-Paton transplanted eye primordia between tadpoles to create three-eyed frogs.

The additional eyes connected to the frogs' contralateral tecta and created competition between inputs which is not usually present in frogs (normally, decussation at the optic chiasm is complete, so each tectum receives input from one eye only).

The result was tecta in which alternating stripes are responsive to input from different eyes.

Similar results are obtained if one of the tecta is removed and both natural retinae project to the remaining tectum.

Visual capture is weaker for stimuli in the periphery, where visual localization is less reliable relative to auditory localization than at the center of the visual field.

Electrical stimulation of the SC can evoke motor behavior.

Electrical stimulation of the cat SC can evoke saccades.

Typically, these saccades go in the general direction of stimuli that would naturally activate the electrically stimulated area.

The `foveation hypothesis' states that the SC elicits saccades which foveate the stimuli activating it for further examination.

The theoretical accounts of multi-sensory integration due to Beck et al. and Ma et al. do not learn and leave little room for learning.

Thus, they fail to explain an important aspect of multi-sensory integration in humans.

Reward-mediated learning has been demonstrated in the adaptation of orienting behavior.

Possible neurological correlates of reward-mediated learning have been found.

Reward-mediated learning is said to be biologically plausible.

Population codes occur naturally.

It's hard to do fMRI of the brain stem, in part because structures there are small.

The reward prediction error theory of dopamine function says that the difference between expected and actual reward is encoded in dopamine neurons.

To calculate reward prediction error, dopamine neurons need to receive both inputs coding for experienced and expected reward.
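
A minimal sketch of this computation in the style of the Rescorla-Wagner delta rule (illustrative only; the learning rate is arbitrary):

```python
def reward_prediction_errors(rewards, alpha=0.1):
    """Track expected reward V; delta = actual - expected is the quantity
    dopamine neurons are thought to encode."""
    V = 0.0
    deltas = []
    for r in rewards:
        delta = r - V         # reward prediction error
        V += alpha * delta    # move the expectation toward the outcome
        deltas.append(delta)
    return deltas

# A reward of 1 becomes predictable over trials, so the error decays to 0.
print([round(d, 2) for d in reward_prediction_errors([1] * 5)])
```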

The SC may play a major role in the selection of stimuli—as saccade targets or as reaching targets.

The activity of an SC neuron is proportional to the probability of the endpoint of a saccade being in that neuron's receptive field.

A possible ascending pathway from SC to visual cortex through the pulvinar nuclei (pulvinar) may be responsible for the effect of SC activity on visual processing in the cortex.

There may be an indirect ascending pathway from intermediate SC to the thalamic reticular nucleus.

Activity of the SC affects activity in cortical regions.

Feature-based and spatial attention may be based on similar mechanisms.

Feature-based visual attention facilitates object detection across the visual field.

The effects of visual attention are more pronounced in later stages of visual processing (in the visual cortex).

Spatial attention does not seem to affect the selectivity of visual neurons—just the vigour of their response.

Spatial visual attention increases the activity of neurons in the visual cortex whose receptive fields overlap the attended region.

Feature-based visual attention increases the activity of neurons in the visual cortex which respond to the attended feature.

Spatial and feature-based visual attention are additive: together, they particularly enhance the activity of any neuron whose receptive field encompasses the attended region, contains a stimulus with the attended feature, and prefers that feature.
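
A toy gain model of this additivity (my own illustration, not a model from the literature; the gain values are arbitrary):

```python
def response(base_rate, rf_in_attended_region, stimulus_has_attended_feature,
             prefers_attended_feature, g_spatial=0.3, g_feature=0.2):
    """Spatial and feature-based attention each contribute an additive gain
    term; a neuron satisfying all conditions is enhanced the most."""
    gain = 1.0
    if rf_in_attended_region:
        gain += g_spatial
    if stimulus_has_attended_feature and prefers_attended_feature:
        gain += g_feature
    return base_rate * gain

print(response(10.0, True, True, True))    # 10 * (1 + 0.3 + 0.2) = 15.0
print(response(10.0, True, False, True))   # spatial attention alone: 13.0
```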

Self-organization occurs in the physical world as well as in information-processing systems. In neural-network-like systems, SOMs are not the only way of self-organization.

Since we analyze complex visual scenes in chunks by saccading from one location to another, information about saccades must be used to break the constant stream of data coming from the eyes into chunks belonging to different locations in the visual field.

By contrasting performance in a condition in which their test subjects actually made saccades to that in a condition when only the image in front of their eyes was exchanged, Paradiso et al. showed that explicit information about saccades --- not just the change of visual input itself --- is responsible for resetting visual processing.

While the signal indicating a saccade could be proprioceptive, the timing in Paradiso et al.'s experiments hints at corollary discharge.

Humans can learn to use the statistics of their environment to guide their visual attention.

Humans do not need to be aware of the stimuli they perceive to use them to guide their visual attention.

Stimuli in one modality can guide attention in another.

Humans can learn to use stimuli in one modality to guide attention in another.

Palmer and Ramsey show that lack of awareness of a visual lip stream does not inhibit learning of its relevance for a visual localization task: the subliminal lip stream influences visual attention and affects the subjects' performance.

They also showed that similar subliminal lip streams did not affect the occurrence of the McGurk effect.

Together, this suggests that awareness of a visual stimulus is not always needed to use it for guiding visual attention, but that it is sometimes needed for multisensory integration to occur (following Palmer and Ramsey's definition).

There are parallels between visual attention and eye movements because both serve the purpose of directing our processing of visual information to stimuli from a region in space that is small enough for our brain to handle.

Since visual attention and eye movements are so tightly connected in the process of visual exploration of a scene, it has been suggested that the same mechanisms may be (partially) responsible for guiding them.

There is evidence suggesting that one cannot plan a saccade to one point in space and turn covert visual attention to another at the same time.

It has been found that stimulating supposed motor neurons in the SC facilitates visual processing in the part of visual cortex whose receptive field is the same as that of the SC stimulated neurons.

Feature-based visual attention facilitates neural responses across the visual field (in visual cortex).

Born et al. provided evidence which shows that preparing a saccade alone already enhances visual processing at the target of the saccade: discrimination targets presented before saccade onset were identified more successfully if they were in the location of the saccade target than when they were not.

Born et al. showed that, if the color of a saccade target stimulus is task relevant, then identification of a discrimination target with that same color is enhanced even if it is not in the same location.

Casteau and Vitu's results seem to show that saccade delay depends not on the proximity between target and distractor but on the ratio of their eccentricities.

The in-vitro study of the rat intermediate SC by Lee and Hall did not find evidence for the long-range inhibitory/short-range excitatory connection pattern theorized by proponents of the neural-field theory of SC fixation.

Osborne et al. modeled performance of monkeys in a visual smooth pursuit task. According to their model, variability in this task is due mostly to estimation errors and not due to motor errors.

Beck et al. acknowledge that the task in Osborne et al.'s experiments was very artificial and the brain circuits involved in smooth pursuit are probably optimized for more natural tasks.

Cats, if raised in an environment in which the spatio-temporal relationship of audio-visual stimuli is artificially different from natural conditions, develop spatio-temporal integration of audio-visual stimuli accordingly. Their SC neurons develop a preference for audio-visual stimuli with the kind of spatio-temporal relationship encountered in the environment in which they were raised.

Landy et al. and Beck et al. seem to imply that optimization to natural stimuli is due to evolution. I'm sure they wouldn't disagree, though, with the idea that optimization is also partly achieved through learning---as in the case of kittens reared in unnatural sensory environments.

Reactions to cross-sensory stimuli can be faster than the fastest reaction to any one of the constituent uni-sensory stimuli, i.e. faster than the race model would predict.
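
To make the race-model baseline concrete, here is a small simulation (the RT distributions and all numbers are invented): in a race model, the cross-sensory reaction time is the minimum of two independent unisensory processing times, which by itself already yields some speed-up (`statistical facilitation'); violations are responses faster than even this bound.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
rt_a = rng.lognormal(mean=np.log(0.25), sigma=0.3, size=n)  # auditory RTs (s)
rt_v = rng.lognormal(mean=np.log(0.30), sigma=0.3, size=n)  # visual RTs (s)

rt_av = np.minimum(rt_a, rt_v)  # race model: the faster unisensory process wins
print(f"mean RT  A: {rt_a.mean():.3f}  V: {rt_v.mean():.3f}  "
      f"race-model AV: {rt_av.mean():.3f}")

# Miller's race-model inequality: P(RT_av <= t) <= P(RT_a <= t) + P(RT_v <= t)
# for all t. Observed cross-sensory RT distributions that exceed this bound
# indicate integration rather than a race.
```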

Frassinetti et al. showed that humans detect near-threshold visual stimuli with greater reliability if these stimuli are connected with spatially congruent auditory stimuli (and vice versa).

Response properties in mouse superficial SC neurons are not strongly influenced by experience.

How strongly SC neurons' development depends on experience (and how well developed they are after birth) differs from species to species; just because the superficial mouse SC is developed at birth doesn't mean it is in other species (and I believe responsiveness in cats develops with experience).

Response properties of superficial SC neurons are different from those found in mouse V1 neurons.

Response properties of superficial SC neurons are different in different animals.

Probabilistic value estimations (by humans) are subject to framing issues: how valuable a choice is depends on how the circumstances are presented (frames).

Probabilistic value estimations are not linear in expected value.

The value function for uncertain gains seems to be generally concave, that of uncertain losses seems to be convex.
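
One standard way of writing such a value function is the form used in prospect theory (the exponents and the loss-aversion coefficient vary between studies):

$$v(x) = \begin{cases} x^{\alpha} & \text{if } x \geq 0,\\ -\lambda\,(-x)^{\beta} & \text{if } x < 0, \end{cases} \qquad 0 < \alpha, \beta < 1, \quad \lambda > 1.$$

With $\alpha, \beta < 1$, the function is concave for gains and convex for losses; $\lambda > 1$ makes losses loom larger than gains.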

Two stimuli in different modalities are perceived as one multi-sensory stimulus if the position in space and the point in time at which they are presented are not too far apart.

Search targets which share few features with mutually similar distractors surrounding them are said to `pop out': it seems to require hardly any effort to identify them and search for them is very fast.

Search targets that share most features with their surroundings, on the other hand, require much more time to be identified.

Gottlieb et al. found that the most salient and the most task-relevant visual stimuli evoke the greatest response in LIP.

A traditional model of visual processing for perception and action proposes that the two tasks rely on different visual representations. This model explains the weak effect of visual illusions like the Müller-Lyer illusion on performance in grasping tasks.

Foster et al. challenge the methodology used in a previous study by Dewar and Carey which supports the perception and action model of visual processing due to Goodale and Milner.

They do that by changing the closed visual-action loop in Dewar and Carey's study into an open one by removing visual feedback at motion onset. The result is that the effect of the illusion is there for grasping (which it wasn't in the closed-loop condition) but not (as strongly) for manual object size estimation.

Foster et al. argue that this suggests that the effect found in Dewar and Carey's study is due to continuous visual feedback.

Laurenti et al. found in an audio-visual color identification task that redundant, congruent, semantic auditory information (the utterance of a color word) can decrease latency in response to a stimulus (color of a circle displayed to the subject). Incongruent semantic visual or auditory information (written or uttered color word) can increase response latency. However, congruent semantic visual information (written color word) does not decrease response latency.

The enhancements in response latencies in Laurenti et al.'s audio-visual color discrimination experiments were greater (response latencies were shorter) than predicted by the race model.

Integrating information from multiple stimuli can have advantages:

  • shorter reaction times,
  • lower thresholds of stimulus detection,
  • improved detection and identification,
  • greater precision of orienting behavior.

Improved performance on the behavioral side due to cross-sensory integration is connected to effects on the neurophysiological side.

Modulatory input from uni-sensory, parietal regions to SC follows the principles of modality-matching and cross-modality:

A deep SC neuron (generally) only receives modulatory input related to some modality if it also receives primary input from that modality.

Modulatory input related to some modality only affects responses to primary input from the other modalities.

Deactivating regions in AES or lateral suprasylvian cortex responsive to some modality can completely eliminate responses of deep SC neurons to that modality.

Wallace and Stein argue that some deep SC neurons receive input from some modalities only via cortex.

Saccades evoked by electric stimulation of the deep SC can be deviated towards the target of visual spatial attention. This is the case even if the task forbids a saccade towards the target of visual spatial attention.

Activation builds up in build-up neurons in the intermediate SC during the preparation of a saccade.

Activation build-up in build-up neurons is modulated by spatial attention.

Kustov's and Robinson's results support the hypothesis that there is a strong connection of action and attention.

O'Regan and Noë argue that people do not labor under an illusion of there being a "stable, high-resolution, full field representation of a visual scene" in the brain; rather, they have the impression of being aware of everything in the scene.

The difference is that being aware of all the details would not require a photograph-like representation in the brain.

In order to work with spatial information from different sensory modalities and use it for motor control, coordinate transformation must happen at some point during information processing. Pouget and Sejnowski state that in many instances such transformations are non-linear. They argue that functions describing receptive fields and neural activation can be thought of and used as basis functions for the approximation of non-linear functions such as those occurring in sensory-motor coordinate transformation.
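
A minimal sketch of the basis-function idea (a toy example of my own, not Pouget and Sejnowski's network; the target mapping and all parameters are invented): Gaussian `receptive fields' tiling the input space serve as basis functions, and a simple linear readout of their activities approximates a non-linear coordinate transformation.

```python
import numpy as np

rng = np.random.default_rng(0)

def target_map(retinal, eye):
    """Hypothetical non-linear sensorimotor mapping to be approximated."""
    return np.sin(retinal + eye) * np.cos(0.5 * eye)

# Gaussian receptive fields tiling the (retinal, eye) input space.
centers = np.array([(r, e) for r in np.linspace(-2, 2, 8)
                           for e in np.linspace(-2, 2, 8)])

def basis(X, width=0.8):
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2 * width ** 2))

# Fit a linear readout of the basis activities by least squares.
X_train = rng.uniform(-2, 2, size=(500, 2))
y_train = target_map(X_train[:, 0], X_train[:, 1])
w, *_ = np.linalg.lstsq(basis(X_train), y_train, rcond=None)

X_test = rng.uniform(-2, 2, size=(1000, 2))
err = basis(X_test) @ w - target_map(X_test[:, 0], X_test[:, 1])
print("RMS approximation error:", float(np.sqrt((err ** 2).mean())))
```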

Most non-primate mammals do not have specialized photoreceptors for the medium-wavelength band. Most primates do.

Neural responses in LGN to short and medium-to-long wavelengths of light are antagonistic in rodents and cats (in certain cells).

Buzás et al. found blue-ON-type cells in the cat LGN, but no blue-OFF cells.

Blue-ON-type cells in primate and cat LGN have large receptive fields.

Visual sensitivity is strongly reduced during saccades.

Visual sensitivity is strongly enhanced after saccades.

Trigger feature hypothesis: early hypothesis on neural coding. One perceptual feature triggers activity in one neuron.

The trigger feature hypothesis in principle postulates combinatorial codes (or even sparse coding).

Neural codes with overlapping receptive fields are less likely to be corrupted by noise than codes in which each neuron codes for only one value.
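
A small simulation of this robustness (my own construction; tuning widths and gains are arbitrary): a value is encoded by overlapping Gaussian tuning curves with Poisson spiking noise and decoded by center of mass; even silencing the most active neuron barely moves the estimate, whereas in a one-neuron-per-value code this would erase the encoded value.

```python
import numpy as np

rng = np.random.default_rng(3)
prefs = np.linspace(0, 1, 50)       # preferred values of 50 neurons
x_true = 0.31

# Overlapping Gaussian tuning curves with Poisson spiking noise.
rates = 20 * np.exp(-(prefs - x_true) ** 2 / (2 * 0.15 ** 2))
spikes = rng.poisson(rates)

def decode(r):
    return (prefs * r).sum() / r.sum()   # center-of-mass decoder

print("decoded value:", round(float(decode(spikes)), 3))

# Corrupt one neuron completely: the population estimate barely moves.
corrupted = spikes.copy()
corrupted[np.argmax(corrupted)] = 0
print("after silencing the most active neuron:",
      round(float(decode(corrupted)), 3))
# In a one-neuron-per-value code, silencing that single neuron would erase
# the encoded value entirely.
```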

Cross-modal integration used to be thought of as a feed-forward process. Nowadays, we acknowledge lateral and even cyclic feed-back streams of information.

Synchronized oscillations have been hypothesized to be a potential mechanism for crossmodal integration.

Synchronized oscillations have been implicated in a wide variety of sensorimotor and cognitive functions.

The temporal correlation hypothesis suggests that synchronization of neural activity might be important in selecting and integrating information.

The temporal correlation hypothesis has been identified as a candidate mechanism for a neural solution of the binding problem.

Deactivating or stimulating certain parts of the deeper layers of the SC induces arousal, freezing, and escape behavior as well as a rise in blood pressure, heart rate, and respiration.

Reactions can be as complex as running and jumping.

Whether the complex aversive behavior patterns evoked by deactivating or stimulating certain brain regions are a direct effect of SC activity, or rather the result of actual fear which in turn may be due to that specific SC activity, is unclear.

Fitting barn owls with prisms which induce a shift in where the owls see objects in their environment leads to a shift of the map of auditory space in the optic tectum.

The shift in the auditory space map in the optic tectum of owls whose visual perception was shifted by prisms is much stronger in juvenile than in mature owls.

Letting adult owls with shifted visual spatial perception hunt mice increases the amount by which the auditory space map in the owls' optic tectum is shifted (as compared to feeding them only dead mice).

Bergan et al. offer four factors which might explain the increase in shift of the auditory space maps in owls with shifted visual spatial perception:

  • Hunting represents a task in which accurate map alignment is important (owls which do not hunt presumably do not face such tasks),
  • more cross-modal experience (visual and auditory stimuli from the mice),
  • cross-modal experiences in phases of increased attention and arousal,
  • increased importance of accurate map alignment (important for feeding).

Bergan et al. show that interaction with the environment can drive multisensory learning. However, Xu et al. show that multisensory learning can also happen if there is no interaction with the multisensory world.

After ablation of the SC, accurate saccades are still possible. Initially, trajectory and speed are impaired, but they recover.

Lesions to the cerebellum can permanently affect the accuracy and consistency of saccades.

There are cells in the rabbit retina which are selective for the direction of motion.

Some visual processing occurs already in the retina.

The existence of inverse effectiveness has been questioned.

Multisensory integration in cortex has been studied less than in the midbrain, but there is work on that.

There are multisensory neurons in the newborn macaque monkey's deep SC.

General sensory maps (and map register) are already present in the newborn macaque monkey's deep SC (though receptive fields are large).

Maturational state of the deep SC is different between species—particularly between altricial and precocial species.

Female crickets have a system for orienting towards sounds (esp. mating calls) which is arguably based more on mechanics and acoustics than on neural computation.

Sound-source localization requires much more neural computation in vertebrates than in crickets.

Words from different categories activate different networks of brain regions.

The fact that the brain regions activated by (hearing, reading...) certain words correspond to the categories the words belong to (action words for motor areas etc.) suggests semantic grounding in perception and action.

Words from some categories do not activate brain regions which are related to their meaning. The semantics of those words do not seem to be grounded in perception or action. Pulvermüller calls such categories and their neural representations disembodied.

Some abstract, disembodied words seem to activate areas in the brain related to emotional processing. These words may be grounded in emotion.

Fly brains contain a few hundred thousand neurons.

Flies' flying and walking behavior is relatively directly influenced by visual stimulation: basic stimuli that suggest a body rotation of the fly lead to compensatory changes in flying and walking direction.

Flies use translational optic flow to detect impending collisions.

Direct connections from the visual to the motor system lead to highly stereotyped visuomotor behavior in the fly.

The stereotyped visuomotor flying behavior in the fly is modulated by internal states and input from other sensory modalities.

The topographic map of visual space in the sSC is retinotopic.

The motor map in the dSC is retinotopic.

The superior colliculus is retinotopically organized.
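
The retinotopy is strongly non-uniform: central visual space is magnified. For the monkey SC, a complex-logarithmic parameterization is commonly used (this is the form due to Ottes et al., reproduced here from memory; the constants $A$, $B_u$, $B_v$ are fit per species), mapping retinal eccentricity $R$ and direction $\phi$ to collicular coordinates $(u, v)$ in mm:

$$u = B_u \ln\frac{\sqrt{R^2 + 2AR\cos\phi + A^2}}{A}, \qquad v = B_v \arctan\frac{R\sin\phi}{R\cos\phi + A}.$$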

Activity in the SC drives the saccadic burst generators, premotor neurons in the reticular formation.

Tabareau et al. propose a scheme for a transformation from the topographic mapping in the SC to the temporal code of the saccadic burst generators.

According to their analysis, that code needs to be either linear or logarithmic.

The receptive fields of multisensory neurons in the deep SC which are close to one another are highly correlated.

Wickelgren found the receptive fields of audio-visual neurons in the deep SC to have no sharp boundaries.

Both visual and auditory neurons in the deep SC usually prefer moving stimuli and are direction selective.

The range of directions deep SC neurons are selective for is usually wide.

MLE has been a successful model in many sensory cue integration tasks.
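
For two cues with independent Gaussian noise, say an auditory and a visual estimate of a location, the MLE combination is the reliability-weighted average of the unisensory estimates, with reliabilities given by inverse variances:

$$\hat{s}_{AV} = \frac{\sigma_A^{-2}\,\hat{s}_A + \sigma_V^{-2}\,\hat{s}_V}{\sigma_A^{-2} + \sigma_V^{-2}}, \qquad \frac{1}{\sigma_{AV}^2} = \frac{1}{\sigma_A^2} + \frac{1}{\sigma_V^2}.$$

The combined variance is never larger than that of the more reliable cue, which is what makes integration advantageous.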

Botvinick et al. advance two interdependent hypotheses:

  1. Conflicts in information processing activate certain cortical areas, most notably the anterior cingulate cortex,
  2. Conflict-related activity causes adjustments in cognitive control of information processing to resolve conflict.

Usually, rate perception is influenced more strongly by auditory information than by visual information.

By modulating the reliability of auditory information, visual information can be given greater weight in rate perception.

Activity in the deep SC has been described as different regions competing for access to motor resources.

Occluding one ear early in life shifts the map of auditory space with respect to the map of visual space in barn owls. Prolonged occlusion of one ear early in life leads to a permanent realignment of the auditory map with the visual map.

Mysore and Knudsen say that deep SC neurons respond to relative saliency of a stimulus, i.e. to the saliency of stimuli in their receptive fields compared to the saliency of stimuli outside their receptive fields.

Neurons at later stages in the hierarchy of visual processing extract very complex features (like faces).

Spatial attention raises baseline activity in neurons whose RFs lie in the attended region, even in the absence of a visual stimulus (in visual cortex).

Unilateral lesions in brain areas associated with attention can lead to visuospatial neglect: the failure to consider anything within a certain region of the visual field. In extreme cases this can mean that patients, e.g., only read from one side of a book.

Kastner and Ungerleider propose that the top-down signals which lead to the effects of visual attention originate from brain regions outside the visual cortex.

Regions whose lesion can induce visuospatial neglect include

  • the parietal lobe, in particular the inferior part,
  • temporo-parietal junction,
  • the anterior cingulate cortex,
  • basal ganglia,
  • thalamus,
  • the pulvinar nucleus.

DLPFC projects directly to the SC.

Many of the cortical areas projecting to the SC have been implicated in attention.

SIV neurons are excited almost exclusively by somatosensory stimuli.

SIV neurons' activity can be inhibited by activity in the auditory FAES.

The function of SIV is unknown.

Dehner et al. speculate that the inhibitory influence of FAES activity on SIV activity is connected to modality-specific attention: according to that hypothesis, an auditory stimulus which leads to strong FAES activity will suppress activity in SIV and thus block out cortical somatosensory input to the SC.

Spatial attention can enhance the activity of SC neurons whose receptive fields overlap the attended region.

FAES is not tonotopic. Instead, its neurons are responsive to spatial features of sounds. No spatial map has been found in FAES (at least until 2004).

Irrelevant auditory stimuli can dramatically improve or degrade orientation performance in visual orientation tasks:

In Wilkinson et al.'s experiments, cats' performance in orienting towards near-threshold, medial visual stimuli was much improved by irrelevant auditory stimuli close to the visual stimuli and drastically degraded by irrelevant auditory stimuli far from the visual stimuli.

If visual stimuli were presented further toward the edge of the visual field, then lateral auditory stimuli improved their detection rate even if they were spatially disparate.

Chemical deactivation of AES reduces both the improvement and the degradation of performance in orienting towards visual stimuli that are due to auditory stimuli.

AES has been implicated in selective attention.

There's a topographic map of somatosensory space in the putamen.

Electrostimulation of putamen neurons can evoke body movement consistent with the map of somatosensory space in that brain region.

There are visuo-somatosensory neurons in the putamen.

Graziano and Gross found visuo-somatosensory neurons in those regions of the putamen which code for arms and the face in somatosensory space.

Visuo-somatosensory neurons in the putamen with somatosensory RFs in the face are very selective: they seem to respond to visual stimuli consistent with an upcoming somatosensory stimulus (close-by objects approaching the somatosensory RFs of the neurons).

Graziano and Gross report on visuo-somatosensory cells in the putamen in which remapping seems to be happening: Those cells responded to visual stimuli only when the animal could see the arm in which the somatosensory RF of those cells was located.

There are reports of highly selective, purely visual cells in the putamen. One report is of a cell which responded best to a human face.

Responses of visuo-tactile neurons in Brodmann area 7b, the ventral intraparietal area, and inferior premotor area 6 are similar to those found in the putamen.

FEF stimulation elicits saccadic eye movements.

Presaccadic activity is observed in FEF for purposive saccades but not for spontaneous saccades.

The motor map is not monotonic across the entire FEF, but sites that are close to each other have similar characteristic saccades.

Some cells in FEF respond to auditory stimuli.

Stimulating cells in FEF whose activity is elevated before a saccade of a given direction and amplitude usually generates a saccade of that direction and amplitude.

Cells in MST respond to, and are selective for, optic flow.

Cells in MST respond to vestibular motion cues.

Some cells in MST are multisensory.

Visuo-vestibular cells in MST perform multisensory integration in the sense that their response to multisensory stimuli is different from their response to either of the uni-sensory cues.

Visuo-vestibular cells tend to be selective for visual and vestibular self-motion cues which indicate motion in the same direction.

The responses of some visuo-vestibular cells were enhanced, those of others were depressed, by combined visuo-vestibular cues.

Multisensory neurons in AES are mostly located at the borders of unisensory regions.

Multisensory AES cell receptive fields are not well-delineated regions in space in which, and only in which, a stimulus evokes a stereotyped response. Instead, they can have one or multiple regions where they respond vigorously, surrounded by others in which the response is less strong.

AES neurons show an interesting form of the principle of inverse effectiveness: cross-sensory stimuli in regions in which the unisensory component stimuli would evoke only a moderate response produce additive (or even superadditive?) responses. In contrast, cross-sensory stimuli at the `hot spots' of a neuron tend to produce sub-additive responses.

In some SC neurons, receptive fields are not in spatial register across modalities.

Receptive fields of SC neurons in different modalities tend to overlap.

Multisensory SC cell receptive fields are not well-delineated regions in space in which, and only in which, a stimulus evokes a stereotyped response. Instead, they can have one or multiple regions where they respond vigorously, surrounded by others in which the response is less strong.

FAES is not exclusively auditory.

AEV is partially, but not consistently, retinotopic.

Receptive fields in AEV tend to be smaller for cells with RF centers at the center of the visual field than for those with RF centers in the periphery.

AEV is not exclusively (but mostly) visual.

RFs in AEV are relatively large.

Multiple cues are used in biological sound-source localization.

The difference in intensity between one ear and the other, the interaural level difference (ILD), is one cue used in biological sound-source localization.

The difference in phase between one ear and the other, the interaural time difference (ITD), is one cue used in biological sound-source localization.

In mammals, different neurons in the lateral superior olive (LSO) are tuned to different ILDs.

In mammals, different neurons in the medial superior olive (MSO) are tuned to different ITDs.

Subregions of the superior olivary complex (SOC) extract auditory localization cues.

The model of biological computation of ITDs proposed by Jeffress extracts ITDs by means of delay lines and coincidence detecting neurons:

The peaks of the sound pressure at each ear lead, via a semi-mechanical process, to peaks in the activity of certain auditory nerve fibers. Those fibers connect to coincidence-detecting neurons. Different delays in connections from the two ears lead to coincidence for different ITDs, thus making these coincidence-detecting neurons selective for different angles to the sound source.
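
A toy reading of this scheme as cross-correlation (my own sketch; the signal, sample rate, and ITD are invented): each `coincidence detector' multiplies the left-ear signal, delayed by its internal delay, with the right-ear signal; the detector whose internal delay compensates the external ITD responds most strongly.

```python
import numpy as np

fs = 44100                                # sample rate (Hz)
t = np.arange(0, 0.05, 1 / fs)
freq, true_itd = 500.0, 300e-6            # 500 Hz tone, right ear leads 300 us
left = np.sin(2 * np.pi * freq * t)
right = np.sin(2 * np.pi * freq * (t + true_itd))

candidate_itds = np.arange(-600e-6, 601e-6, 20e-6)
activity = []
for d in candidate_itds:
    shift = int(round(d * fs))            # internal delay in samples
    if shift >= 0:
        a, b = left[shift:], right[:len(right) - shift]
    else:
        a, b = left[:shift], right[-shift:]
    activity.append(np.mean(a * b))       # coincidence ~ average product

best = candidate_itds[np.argmax(activity)]
print(f"estimated ITD: {best * 1e6:.0f} us (true: {true_itd * 1e6:.0f} us)")
```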

The granularity of the representations of ITDs and ILDs in MSO and LSO reflects the fact that ITD and ILD are most useful for auditory localization in different frequency ranges: ITDs for high frequencies are less densely represented in the MSO, and ILDs for low frequencies are less densely represented in the LSO.

Knudsen and Konishi refer to the nucleus mesencephalus lateralis dorsalis as the avian homologue of the mammalian inferior colliculus.

The external nucleus of the inferior colliculus (ICx) used to be called the space-mapped region of the nucleus mesencephalicus lateralis pars dorsalis (MLD).

Like many other auditory brain regions, the IC is tonotopically organized, except for ICx.

Auditory localization is different from visual or haptic localization: stimulus location is not encoded in which neural receptors are stimulated but in the differential temporal and intensity patterns of stimulation of the receptors in the two ears.

It's easier to separate a target sound from a blanket of background noise if target sound and background noise have different ITDs.

There seem to be significant differences in SOC organization between higher mammals and rodents.

Jeffress' model has been extremely successful, although neurophysiological evidence is scarce (because the MSO apparently is hard to study).

Jeffress' model predicts a spatial map of ITDs in the MSO. Recent evidence seems to suggest that this map indeed exists.

Rearing barn owls in darkness results in mis-alignment of auditory and visual receptive fields in the owls' optic tectum.

Rearing barn owls in darkness results in discontinuities in the map of auditory space of the owls' optic tectum.

Rearing animals in darkness can result in anomalous auditory maps in their superior colliculi.

The superior colliculus is connected, directly or indirectly, to most parts of the brain.

Some authors distinguish only superficial and deep superior colliculus.

Certain neurons in the deep SC emit bursts of activity before making a saccade.

It has long been known that stimulating the SC can elicit eye movements.

The size and direction of a saccade before which deep SC neurons show the greatest activity depends on where they are in the SC: Neurons in medial regions of the SC tend to prefer saccades going up, neurons in lateral regions of the SC tend to prefer saccades going down.

Long saccades are preceded by strong activity of caudal neurons, short saccades by activity of rostral neurons.

Deep SC neurons which have preferred saccades have these preferred saccades also in total darkness. They thus do not simply respond to the specific location of a visual stimulus.

Robinson reports two types of motor neurons in the deep SC: One type has strong activity just (~20 milliseconds) before the onset of a saccade. The other type has gradually increasing activity whose peak is, again, around 12-20 milliseconds before onset.

Currently, three types of saccade-related neurons are distinguished in the deep SC:

  • burst neurons,
  • build-up neurons,
  • fixation neurons.

Microstimulation of OT neurons in the barn owl can evoke pupil dilation.

Lesions of the tectospinal tract lead to deficits in motor responses, while lesions of the brachium and parts of the tectothalamic system produce contralateral visual neglect.

Ablation of the SC leads to temporary blindness and deficits in visual following.

Sprague and Meikle Jr. propose that the SC is involved in visual attention.

Ablation of the superficial SC does not result in blindness or orienting deficiencies. Only when the deep SC is ablated do these deficiencies occur—a remarkable finding considering that the superficial SC is the main target of retinotectal projections.

Onset times of visually guided saccades have a bimodal distribution. The faster type of saccades are termed `express saccades'. Ablation of the SC but not of the FEF makes express saccades disappear.

"SC ablation permanently reduces fixation accuracy, saccade frequency, and saccade velocity."

FEF ablation leads to a temporary visual neglect. This neglect disappears almost completely after a short time.

Removing both SCs and both FEFs leads to permanent deficits:

  • a decrease in fixation accuracy,
  • a neglect of the peripheral visual field,
  • a decrease in saccade frequency,
  • a reduced range of saccadic eye movements.

Schiller et al. did not observe the visuospatial neglect and stark loss of oculomotor function that Sprague and Meikle did.

Brainstem premotor neurons producing the commands for eye movements are located in pons, medulla (horizontal movements), and the rostral midbrain (vertical movements).

Certain Purkinje cells in the oculomotor vermis of the cerebellum have saccade-related activity: Helmchen and Büttner found neurons which displayed:

  • a saccade-related burst (most of them),
  • a saccade-related burst, followed by a pause,
  • a saccade-related pause,
  • either a pause or a burst, depending on the direction of the saccade.

Unilateral deactivation of the caudal fastigial nucleus in the cerebellum leads to hypermetria of saccades to the ipsilateral and hypometria of saccades to the contralateral side.

Deactivation of the caudal fastigial nucleus in the cerebellum increases the variability of saccades.

The superficial SC projects retinotopically to LGN.

The same regions in LGN receiving projections from the superficial SC project to the cortex.

Superficial layers of the SC project to deep layers.

Both deep and superficial layers in left and right SC project to the corresponding layer in the contralateral SC.

There are excitatory and inhibitory connections from the deep to the superficial SC.

The excitatory and inhibitory connections from the deep to the superficial SC, together with the connection from the superficial SC to LGN, may be one route through which deep SC activity reaches cortex.

There are inhibitory connections from deep SC to superficial SC (SGI to SGS).

The ventral lateral geniculate nucleus projects to the deep SC.

The response of neurons in the SC to a given stimulus decreases if that stimulus is presented constantly or repeatedly at a relatively slow rate (once every few seconds, up to a minute).

In cats, the SC has a size of about 4.5 mm to 4.7 mm from the posterior to the anterior end and 6.0 mm to 6.2 mm from the medial to the lateral end.

The superficial cat SC is not responsive to auditory stimuli.

Some neurons in the dSC respond to an auditory stimulus with a single spike at its onset, some with sustained activity over the duration of the stimulus.

Middlebrooks and Knudsen report on sharply delineated auditory receptive fields in some neurons in the deep cat SC, in which there is an optimal region from which stimuli elicit a stronger response than in other places in the RF.

A minority of deep SC neurons are omnidirectional, responding to sounds anywhere, albeit with a defined best area.

There is a map of auditory space in the deep superior colliculus.

There is considerable variability in the sharpness of spatial tuning in the responses to auditory stimuli of deep SC neurons.

The visual and auditory maps in the deep SC are in spatial register.

Auditory receptive fields in the deep SC of the owl tend to be larger than, and to contain, the visual receptive fields.

The superficial SC of the owl is strongly audio-visual.

Deep SC neurons which show an enhancement in response to multisensory stimuli also reach their peak activity earlier than for the corresponding unisensory stimuli.

The response profiles have superadditive, additive, and subadditive phases: even for cross-sensory stimuli whose unisensory components are strong enough to elicit only an additive enhancement of the cumulative response, the response is superadditive over parts of the time course.

The map of visual space in the superficial SC of the mouse is in rough topographic register with the map formed by the tactile receptive fields of whiskers (and other body hairs) in deeper layers.

The superficial mouse SC is not responsive to auditory or tactile stimuli.

The receptive fields of certain neurons in the cat's deep SC shift when the eye position is changed. Thus, the map of auditory space in the deep SC is temporarily realigned to stay in register with the retinotopic map.

In an fMRI experiment, Schneider found that spatial attention and switching between modes of attention (attending to moving or to colored stimuli) strongly affected SC activation, but results for feature-based attention were inconclusive.

The fact that Schneider did not find conclusive evidence for modulation of neural responses by feature-based attention might be related to the fact that the superficial SC does not seem to receive color-based information and deep SC seems to receive color-based information only via visual cortex.

There are neurons in the supplementary eye field which are related to

  • eye movements,
  • arm movements,
  • ear movements,
  • spatial attention.

There are projections from visual cortex to SC.

There are projections from auditory cortex to SC (from anterior ectosylvian gyrus).

There are projections from motor and premotor cortex to SC.

There are projections from primary somatosensory cortex to SC.

Primary somatosensory cortex is somatotopic.

Posterior parietal cortex projects to the deep SC.

SEF projects directly to the SC, but different researchers disagree on the SC layers in which the projections terminate.

SC receives connections from cerebellum.

There's a loop from cerebellum to SC and back.

An auditory and a visual stimulus, separated in time, may be perceived as one audio-visual stimulus, seemingly occurring at the same point in time.

The probability that two stimuli in different modalities are perceived as one multisensory stimulus generally decreases with increasing temporal or spatial disparity between them.

The probability that two stimuli in different modalities are perceived as one multisensory stimulus generally increases with increasing semantic congruency.

If an auditory and a visual stimulus are close together, spatially, then they are more likely perceived as one cross-modal stimulus than if they are far apart—even if they are separated temporally.

Attention is necessary to perform the Stroop and Simon tasks.

The dimensional overlap framework can be used to classify overlap and interference between relevant (features of) stimuli and (features of) responses in psychological stimulus-response paradigms. In particular it can be used to classify types of conflict between relevant and irrelevant dimensions of stimuli and response.

In Stroop-type experiments, there is usually conflict between an irrelevant stimulus dimension, the relevant stimulus dimension, and a dimension of the response, for example the color of ink $C_I$ in which a word is written, the meaning of the word (a different color) $C_R$, and the response (saying the name of that color $C_R$).

In Simon-type experiments, there is usually conflict only between an irrelevant stimulus dimension and a dimension of the response, for example the task-irrelevant location of a stimulus and the hand with which to respond.

The frontoparietal network seems involved in executive control and orienting.

The anterior cingulate cortex is likely involved with regulating attention.

There are fine-structure and envelope ITDs. Humans are sensitive to both, but do not weight envelope ITDs very strongly when localizing sound sources.

Recent neurophysiological evidence seems to contradict the details of Jeffress' model.

Some congenitally unilaterally deaf people develop close-to-normal auditory localization capabilities. These people probably learn to use spectral SSL cues.

Humans use a variety of cues to estimate the distance to a sound source. This estimate is much less precise than estimates of the direction towards the sound source.

Wilson and Bednar distinguish between topological feature maps and topographic maps. The topology of topographic maps tends to correspond to the spatial properties of sensory surfaces (like the retina or the skin) whereas topological feature maps correspond to the similarity of higher-order features of sensory input (like spatial frequency or orientation in vision).

Wilson and Bednar discuss the usefulness of topological feature maps, implying that they may not be useful at all but a byproduct of neural development and adaptation processes.

According to Wilson and Bednar, there are four main families of theories concerning topological feature maps:

  • input-driven self-organization,
  • minimal-wire length,
  • place-coding theory,
  • Turing pattern formation.

Wilson and Bednar argue that input-driven self-organization and Turing pattern formation explain how topological maps may arise from useful processes, but they do not explain why topological maps are useful in themselves.

According to Wilson and Bednar, wire-length optimization presupposes that neurons need input from other neurons with similar feature selectivity. Under that assumption, wire length is minimized if neurons with similar selectivities are close to each other. Thus, the kind of continuous topological feature maps we see optimize wire length.

The idea that neurons should especially require input from other neurons with similar spatial receptive fields is unproblematic. However, Wilson and Bednar argue that it is unclear why neurons should especially require input from neurons with similar non-spatial feature preferences (like orientation, spatial frequency, smell, etc.).

Pooling the activity of a set of similarly-tuned neurons is useful for increasing the sharpness of tuning. A neuron which pools from a set of similarly-tuned neurons would have to make shorter connections if these neurons are close together. Thus, there is a reason why it can be useful to connect preferentially to a set of similarly-tuned neurons. This reason might be part of the reason behind topographic maps.

Attention developed quite early. Even very simple organisms, like drosophila and honeybees, show evidence of attentional processes.

Neural responses to the same stimulus are noisy.

The neural response of an SC neuron to one stimulus can be made weaker in some neurons by another stimulus at a different position in space. This stimulus can be in the same or in a different modality (in multi-sensory neurons). This effect is called depression.

Kadunce et al. did not find within-modality visual suppression as often as within-modality auditory suppression.

Kadunce et al. found that suppressive regions were large and that depression varied depending on position of the concurrent stimulus within the suppressive region. Suppression was generally strongest when concurrent stimuli were on the ipsilateral side.

Two identical stimuli at different locations can be perceived as one stimulus which seems to be located between the actual sound sources.

Kadunce et al. say that two identical stimuli played at different points in space might lead to a translocation of the perceived stimulus and thus to a translocation of the hill of activation in the SC.

Kadunce et al. found that two auditory stimuli placed at opposing edges of a neuron's receptive field, in its suppressive zone, elicited some activity in the neuron (although less than they expected).

Kadunce et al. found cross-modality depression less often than within-modality depression.

Kadunce et al. found that for the majority of neurons in which a stimulus in one modality could lead to depression in another modality that depression was one-way: Stimuli in the second modality did not depress responses to stimuli in the first.

Kadunce et al. found that SC neurons are very inhomogeneous wrt. the presence and size of suppressive zones.

There is evidence suggesting that the brain actually does perform statistical processing.

It has been found that stimulating supposed motor neurons in the SC enhances responses of V4 neurons with the same receptive field as the SC neurons.

Krauzlis et al. state that collicular deactivation has not been found to eliminate signs of task-based attention in neural responses in cortex.

Krauzlis et al. argue that SC deactivation should have changed neural responses in cortex if it regulated attention through visual cortex.

Krauzlis et al.'s argument that SC deactivation should have changed neural responses in cortex if it regulated attention through visual cortex is a bit weak considering that stimulating the SC does change sensory representations in V4.

Krauzlis et al. argue that animals without a well-developed neocortex nonetheless show signs of visual attention. Thus, it is likely that the neocortex is not necessary for attention and SC can regulate attention without the neocortex.

Krauzlis et al.'s argument that animals without a well-developed neocortex show signs of selective attention similar to those of humans and other higher animals (and that the neocortex therefore may not be necessary for attention) seems more convincing than the argument from the lack of influence of collicular deactivation on cortical responses.

Krauzlis et al. argue that attention may not so much be an explicit mechanism but a phenomenon emerging from the need of distributed information-processing systems (biological and artificial) for centralized coordination:

According to that view, some centralized control estimates the state of (some part of) the world and modulates both action and perception according to the state which is estimated to be the most plausible at any given point.

Krauzlis et al. localize this central control in the basal ganglia.

Since brains are just things that evolved out of a need for efficient information processing, all mechanisms in it can be interpreted as emergent phenomena. Taking a normative stance and attributing a cause to them can be enlightening. It is a matter of scientific pragmatism whether one wants to look at a specific phenomenon in terms of why it evolved or what problem it solves, or (often) both.

Decision theoretical approaches have been used successfully to explain both behavior and neural activities in sensorimotor tasks.

Neural responses in parietal cortex have been suggested to reflect expected reward.

Integrating information from different modalities can improve

  • detection,
  • identification,
  • precision of orienting behavior,
  • reaction time.

First systematic studies of neural multisensory integration started in the 1970s.

The SC is involved in generating gaze shifts and other orienting behaviors.

The SC localizes events.

The uni-sensory, multi-sensory and motor maps of the superior colliculus are in spatial register.

Cats, being an altricial species, are born with little to no capability of multi-sensory integration and develop first multi-sensory SC neurons, then neurons exhibiting multi-sensory integration on the neural level only after birth.

In the development of SC neurons, receptive fields are initially very large and shrink with experience.

SC receives tactile localization-related inputs from the trigeminal nucleus.

Multisensory experience is necessary to develop normal multisensory integration.

Multisensory integration in the SC is similar in anesthetized and alert animals (cats).

ICx projects to intermediate and deep layers of SC.

The shift in the auditory map in ICx comes with changed projections from ICc to ICx.

There appears to be plasticity wrt. the auditory space map in the SC.

The number of neurons in the lower stages of the visual processing hierarchy (V1) is much lower than in the higher stages (IT).

The redundancy provided by multisensory input can facilitate or even enable learning.

Biomimetics is the approach of making use of the technological and theoretical insights of the biological sciences for engineering.

Often, the quest to understand a biological system leads to the recognition of new paradigms for engineering.

Often biology has a solution to a problem in the engineering disciplines.

There have been biomimetic solutions to problems in materials sciences, mechanical sciences, sensor technology, and various problems in robotics.

There have been biomimetic solutions to various problems in robotics.

Different types of retinal ganglion cells project to different lamina in the zebrafish optic tectum.

The lamina a retinal ganglion cell projects to in the zebrafish optic tectum does not change in the fish's early development. This is in contrast with other animals.

However, the position within the lamina does change.

Muscle synergies are coordinated activations of groups of muscles.

There is the hypothesis that complex motions are composed of combinations of simple muscle synergies, which would reduce the dimensionality of the control signal.

A low-dimensional representation of motion patterns in a high-dimensional space restricts the actual dimensionality of those motions.

I'm not so sure that a low-dimensional representation of motion patterns in a high-dimensional space necessarily restricts the actual dimensionality of those motions:

$\mathbb{Q}^3$ is in bijection with $\mathbb{Q}$ (both are countably infinite).

It is probably the case for natural behavior, though.
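
To see why: $|\mathbb{Q}| = \aleph_0$ and $|\mathbb{Q}^3| = \aleph_0^3 = \aleph_0$, so the two sets have the same cardinality and a bijection exists, e.g. by composing an enumeration of $\mathbb{Q}$ with a pairing function on $\mathbb{N}$. Such bijections are wildly discontinuous, though; a continuous low-dimensional parameterization (which is what muscle synergies would provide) does restrict the set of reachable motion patterns.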

Removing large parts of cortex—even a full hemisphere—does not result in a loss of consciousness.

Absence epilepsy—sudden loss of consciousness with amnesia but not always with total loss of cognitive function—has been induced by electrostimulation of the upper brainstem, but not of cortical regions.

The brainstem may be involved in creating consciousness.

Cortical structures do not always control our overt behavior. Instead, sub-cortical areas sometimes override cortical tendencies.

Sub-cortical structures (like the SC) have bearing on cortical functionality.

In the Sprague effect, removing (or deactivating) one visual cortex eliminates visually induced orienting behavior to stimuli in the contralateral hemifield.

Lesioning (or deactivating) the contralateral SC restores the orienting behavior.

``The hemianopia that follows unilateral removal of the cortex that mediates visual behavior cannot be explained simply in classical terms of interruption of the visual radiations that serve cortical function.
Explanation of the deficit requires a broader point of view, namely, that visual attention and perception are mediated at both forebrain and midbrain levels, which interact in their control of visually guided behavior.''

(Sprague, 1966)

SC receives input from and represents all sensory modalities used in phasic orienting: vision, audition, somesthesis (haptic), nociception, infrared sensing, electroception, magnetoception, and echolocation.

The stratum zonale is the outermost, almost cell-free lamina of the SC.

The stratum griseum superficiale is the SC layer below the stratum zonale. It contains many small cells.

The stratum opticum is the innermost of the superficial SC layers, below the stratum griseum superficiale. It is dominated by fibers, including retinal projections.

The stratum griseum intermediale is the outermost lamina of the deep SC.

The stratum album intermediale is the second-outermost lamina of the deep SC, below the stratum griseum intermediale.

The stratum griseum profundum is the third-outermost lamina of the deep SC, below the stratum album intermediale.

The stratum album profundum is the lowest lamina of the deep SC, below the stratum griseum profundum.

The stratum album profundum borders on the periaqueductal gray.

There are alternative nomenclatures for the layers of the deep SC.

The layers and internal connectivity of the optic tectum are similar to, but distinct from, those of the mammalian SC.

The nucleus of the brachium of the inferior colliculus (nBIC) projects to intermediate and deep layers of SC.

SC receives auditory localization-related inputs from the IC.

SC neurons respond faster to stimuli based on luminance contrasts than on color contrast.

Superficial SC neurons seem to have little to no access to color information.

Deep SC neurons do react to stimuli based on color contrast.

There is reason to believe that color information reaches the SC via cortical routes.

Visual feature combinations become more salient if they are learned to be associated with reward.

Targets which are selected in one trial tend to be more salient in subsequent trials—they are selected faster and rejected slower.

The extent of this effect is modulated by whether or not the selection was rewarded.

The squirrel SC measures a few centimeters across in either direction.

Xu et al. stress the point that in their cat rearing experiments, multisensory integration arises although there is no reward and no goal-directed behavior connected with the stimuli.

The fact that multi-sensory integration arises without reward connected to stimuli motivates unsupervised learning approaches to SC modeling.

The precise characteristics of multi-sensory integration were shown to be sensitive to their characteristics in the experienced real world during early life.

It is interesting that multisensory integration arises in cats in experiments in which there is no goal-directed behavior connected with the stimuli as that is somewhat in contradiction to the paradigm of embodied cognition.

Xu et al. raised two groups of cats in darkness and presented one with congruent and the other with random visual and auditory stimuli. They showed that SC neurons in cats from the congruent stimulus group developed multi-sensory characteristics while those in the other group mostly did not.

In the experiment by Xu et al., SC neurons in cats that were raised with congruent audio-visual stimuli distinguished between disparate combined stimuli, even if these stimuli were both in the neurons' receptive fields. Xu et al. state that this is different in naturally reared cats.

In the experiment by Xu et al., SC neurons in cats that were raised with congruent audio-visual stimuli had a preferred time difference between onset of visual and auditory stimuli of 0s, whereas this is around 50-100ms in normal cats.

In the experiment by Xu et al., SC neurons in cats reacted best to auditory and visual stimuli that resembled those they were raised with (small flashing spots, broadband noise bursts); however, they generalized and reacted similarly to other stimuli.

Sub-threshold multisensory neurons respond directly only to one modality, however, the strength of the response is strongly influenced by input from another modality.

Weir and Suver review experiments on the visual system of flies, specifically on the dendritic and network properties of the VS and HS systems, which respond to apparent motion in the vertical and horizontal planes, respectively.

The time course of neural integration in the SC reveals considerable non-linearity: early on, neurons seem to be super-additive before later settling into an additive or sub-additive mode of computation.

There is a notion that humans perform (near-)optimally in many sensory tasks.

Stroop presented color words which were either presented in the color they meant (congruent) or in a different (incongruent) color. He asked participants to name the color in which the words were written and observed that participants were faster in naming the color when it was congruent than when it was incongruent with the meaning of the word.

The Stroop test has been used to argue that reading is an automatic task for proficient readers.

Greene and Fei-Fei show in a Stroop-like task that scene categorization is automatic and obligatory for simple (`entry-level') categories but not for more complex categories.

Stuphorn et al. found neurons in the monkey SC whose activity was dependent on the retinotopic position of the target in a reaching task, but not on the actual path taken in reaching.

Körding and Wolpert let subjects reach for some target without seeing their own hand.

In some of the trials, subjects were briefly given visual feedback of varying reliability about their hand position halfway through the trial. In those trials where the visual feedback was clear, subjects were also given clear feedback of their hand position at the end of the trial.

The visual feedback in the middle of the trial was displaced by an amount which was distributed according to a Gaussian distribution with a mean of 1cm or, in a second experiment, according to a bi-modal distribution.

Körding and Wolpert showed that their subjects correctly learned the distribution of displacement of the visual feedback wrt. the actual position of their hand and used it in the task consistent with a Bayesian cue integration model.
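
The core computation in such a model is a precision-weighted average of the learned prior over displacements and the current, noisy visual feedback. A minimal sketch in Python, with illustrative numbers rather than the paper's parameters:

    # Gaussian prior (learned displacement distribution) combined with a
    # Gaussian likelihood (momentary visual feedback); numbers are illustrative,
    # except the 1cm mean displacement, which is taken from the task description.
    prior_mean, prior_var = 1.0, 0.5 ** 2     # learned displacement prior (cm)
    feedback, feedback_var = 2.0, 1.0 ** 2    # noisy feedback on the current trial

    # Posterior mean of a Gaussian-Gaussian model: precision-weighted average.
    w_prior = (1 / prior_var) / (1 / prior_var + 1 / feedback_var)
    estimate = w_prior * prior_mean + (1 - w_prior) * feedback
    print(f"estimated displacement: {estimate:.2f} cm")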

The study by Hartung et al. shows that (concave) hollow faces are perceived as convex faces, but flatter than they would be if they were concave, indicating that online information is combined with prior information, not discarded.

The fact that concave faces are perceived as flat convex faces can be interpreted as a case of model averaging being used in cue integration.

People with lesions of the parieto-occipital junction (POJ) are impaired in reaching for and grasping objects in their peripheral visual field (an effect called 'optic ataxia').

Himmelbach et al. studied one patient in a visuo-haptic grasping task and found that she had a healthy-like ability to adapt her grip online to changes of object size when the object was in her central visual field. This indicates that the problem for patients with lesions of the parieto-occipital junction (POJ) is not an inability to adapt online, but more likely a disrupted connection between visuomotor pathways and pathways necessary for grasping.

Multisensory integration is a way to reduce uncertainty. This is both a normative argument and a statement of the evolutionary advantage of using multisensory integration.

Stanford et al. studied single-neuron responses to cross-modal stimuli in their receptive fields. In contrast to previous studies, they systematically tried out different combinations of intensity levels in the different modalities.

Morgan et al. studied the neural responses to visual and vestibular self-motion cues in the dorsal portion of the medial superior temporal area (MSTd).

They presented congruent and incongruent stimuli at different levels of reliability and found that at any given level of reliability, the neural computation underlying multi-sensory integration could be described well by a linear addition rule.

However, the weights used in combining the uni-sensory responses changed with cue reliability.

Studies of single-neuron responses to multisensory stimuli have usually not explored the full dynamic range of inputs---they often have used near- or subthreshold stimulus intensities and thus usually found superadditive effects.

Studies of single-neuron responses to multisensory stimuli have over-emphasized the prevalence of superadditivity over that of subadditivity.

McHaffie et al. speculate that loops through various subcortical structures might solve the selection problem, ie. the gating of competing inputs to shared resources.

In the SC, this means that the basal ganglia decide which of the brain structures involved in gaze shifts gets access to the eye motor circuitry.

Task-irrelevant cues in one modality can enhance reaction times in others—but they don't always do that. Instances of this effect have been implicated with exogenous attention.

Task-irrelevant auditory cues have been found to enhance reaction times in visual localization. Visual cues which cued visual localization, however, did not cue auditory localization.

Fixating some point in space enhances spoken language understanding if the words come from that point in space. The effect is strongest when fixating a visual stream showing lips consistent with the utterances, but it also works if the visual display is random. The effect is further enhanced if fixation is combined with some form of visual task which is sufficiently complex.

Fixating some point in space can impede language understanding if the utterances do not emanate from the focus of visual attention and there are auditory distractors which do.

Goldberg and Wurtz found that neurons in the superficial SC respond more vigorously to visual stimuli in their receptive field if the current task is to make a saccade to the stimuli.

Responses of superficial SC neurons do not depend solely on intrinsic stimulus properties.

Traditionally, visual attention is subdivided into feature-based attention and spatial attention. However, location is arguably only one cue among a number of possible cues, and spatial attention possibly only a special case of feature-based attention.

There is a distinction between two different kinds of bats: megabats and microbats. Megabats differ from microbats in size (generally), but also in the organization of their visual system. In particular, their retinotectal projections are different: while all of the retinotectal projections in microbats are contralateral, retinotectal projections in megabats are divided such that projections from the nasal part of the retina go to the ipsilateral SC and those from the peripheral part go to the contralateral SC. This is similar to primate vision.

In primates, retinotectal projections to each SC are such that each visual hemifield is mapped to one (contralateral) SC. This is in contrast with retinotectal projections in most other vertebrates, where all projections from one retina project to the contralateral SC.

Human children often react to multi-sensory stimuli faster than they do to uni-sensory stimuli. However, the latencies they exhibit up to a certain age do not violate the race model as they do in adult humans.
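
The race model referred to here is usually tested with Miller's race-model inequality: if the multisensory speed-up comes from two independent unisensory processes racing each other, then $P(RT_{av} \le t) \le P(RT_a \le t) + P(RT_v \le t)$ must hold for every $t$. A minimal sketch of such a test in Python, on hypothetical reaction times:

    import numpy as np

    # Hypothetical reaction times (ms) from three conditions.
    rt_a = np.array([310, 295, 330, 305, 320])   # auditory-only
    rt_v = np.array([280, 300, 290, 310, 285])   # visual-only
    rt_av = np.array([240, 255, 250, 245, 260])  # audio-visual

    def ecdf(sample, t):
        """Empirical cumulative distribution: P(RT <= t)."""
        return np.mean(sample[:, None] <= t, axis=0)

    ts = np.arange(200, 400, 5)
    bound = ecdf(rt_a, ts) + ecdf(rt_v, ts)      # Miller's race-model bound
    violation = ecdf(rt_av, ts) > bound          # True where the race model fails

    print("race model violated at t =", ts[violation])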

Multisensory integration develops after birth in many ways.

SC is connected to motor plants via brainstem.

Lesions to SC and FEF individually do not eliminate saccades. Lesions to both do eliminate saccades.

The contribution of head-saccades to full saccades can be influenced by knowledge about the target of the next saccade.

Brainstem activation is very similar to actual muscle behavior.

Optic tectum and superior colliculus are homologues.

The tectum includes both the SC (optic tectum) and the IC.

The SC is multisensory: it reacts to visual, auditory, and somatosensory stimuli. It does not only initiate gaze shifts, but also other motor behaviour.

The SC is involved in the transformation of multisensory signals into motor commands.

The SC matures fast compared to the cortex; this is important for protecting the young animal from threats early in life.

The mammalian SC is divided into seven layers with alternating fibrous and cellular layers.

The superficial layers include layers I-III, while the deep layers are layers IV-VII.

Some authors distinguish a third, intermediate, set of layers (IV,V).

The deeper levels of SC are the targets of projections from cortex and from auditory, somatosensory, and motor systems in the brain.

The deeper layers of the SC project strongly to brainstem, spinal cord, especially to those regions involved in moving eyes, ears, head and limbs, and to sensory and motor areas of thalamus.

The superficial SC is visuotopic.

The part of the visual map in the superficial SC corresponding to the center of the visual field has the highest spatial resolution.

Visual receptive fields in the deeper SC are larger than in the superficial SC.

The parts of the sensory map in the deeper SC corresponding to peripheral visual space have better representation than in the visual superficial SC.

Do the parts of the sensory map in the deeper SC corresponding to peripheral visual space have better representation than in the visual superficial SC because they integrate more information; does auditory or tactile localization play a more important part in multisensory localization there?

Neurons in the deep SC whose activity spikes before a saccade have preferred amplitudes and directions: Each of these neurons spikes strongest before a saccade with these properties and less strongly before different saccades.

Moving the eyes shifts the auditory and somatosensory maps in the SC.

Altricial species are born with poorly developed capabilities for sensory processing.

(Some) SC neurons in the newborn cat are sensitive to tactile stimuli at birth, to auditory stimuli a few days postnatally, and to visual stimuli last.

Visual responsiveness develops in the cat first from top to bottom in the superficial layers, then, after a long pause, from top to bottom in the lower layers.

The basic topography of retinotectal projections is set up by chemical markers. This topography is coarse and is refined through activity-dependent development.

We do not know whether other sensory maps than the visual map in the SC are initially set up through chemical markers, but it is likely.

If deep SC neurons are sensitive to tactile stimuli before there are any visually sensitive neurons, then it makes sense that their retinotopic organization be guided by chemical markers.

There's a retinotopic, polysynaptic pathway from the SC through LGN.

There are at least polysynaptic pathways from deep SC to cortex.

Polysynaptic pathways from deep SC to cortex may explain facilitation of visual processing in V1 caused by SC activity.

Overt visual function occurs only starting 2-3 weeks postnatally in cats.

Kao et al. did not find visually responsive neurons in the deep layers of the cat SC within the first three postnatal weeks.

Overt visual function can be observed in developing kittens at the same time or before visually responsive neurons can first be found in the deep SC.

Some animals are born with deep-SC neurons responsive to more than one modality.

However, these neurons don't integrate according to Stein's single-neuron definition of multisensory integration. This kind of multisensory integration develops with experience with cross-modal stimuli.

Less is known about the motor properties of SC neurons than about the sensory properties.

Electrical stimulation of the cat SC elicits eye and body movements long before auditory or visual stimuli could have that effect.

These movements already follow the topographic organization of the SC at least roughly.

There are voluntary (endogenous) and reflexive (exogenous) mechanisms of guiding selective attention.

Santangelo and Macaluso describe typical experiments for studying visual attention.

Frontal eye fields (FEF) and intraparietal sulcus (IPS) have been associated with voluntary orienting of visual attention.

Santangelo and Macaluso provide a review of the recent literature on visual and auditory attention.

Frontoparietal regions play a key role in spatial orienting in unisensory studies of visual and auditory attention.

There also seems to be modality-specific attention which globally de-activates attention in one modality and activates it in the other.

As a computer scientist I would call de-activating one modality completely a special case of selective attention in that modality.

Localized auditory cues can exogenously orient visual attention.

Santangelo and Macaluso state that multisensory integration and attention are probably separate processes.

Maybe attention controls whether or not multi-sensory integration (MSI) happens at all (at least in SC)? That would be in line with findings that without input from AES and rLS, there's no MSI.

Are AES and rLS cat homologues to the regions cited by Santangelo and Macaluso as regions responsible for auditory and visual attention?

Kohonen states that early SOMs were meant to model brain maps and how they come to be.
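
As a reminder of how small that model core is: a SOM only needs a winner search and a neighborhood-weighted pull of the map towards the input. A minimal sketch of Kohonen's update rule (generic, not any specific brain-map model):

    import numpy as np

    rng = np.random.default_rng(0)
    grid = np.stack(np.meshgrid(np.arange(10), np.arange(10)), -1).reshape(-1, 2)
    weights = rng.random((100, 2))            # 10x10 map, 2-D inputs

    def som_step(x, lr=0.1, sigma=1.5):
        """One Kohonen update: move the winner's neighborhood towards x."""
        bmu = np.argmin(np.linalg.norm(weights - x, axis=1))   # best-matching unit
        d = np.linalg.norm(grid - grid[bmu], axis=1)           # distance on the map
        h = np.exp(-d ** 2 / (2 * sigma ** 2))                 # neighborhood function
        weights[...] = weights + lr * h[:, None] * (x - weights)

    for _ in range(2000):
        som_step(rng.random(2))   # train on uniform 2-D inputs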

Multi-sensory neurons in the SC are only in the intermediate and deep layers.

The most important cortical input to the SC (in cats) comes from layer V cortical neurons from a number of sub-regions of the anterior ectosylvian sulcus (AES):

  • anterior ectosylvian visual area (AEV)
  • the auditory field of AES (FAES)
  • and the fourth somatosensory area (SIV)

These populations in themselves are uni-sensory.

Neurons that receive auditory and visual ascending input also receive (only) auditory and visual descending projections.

Most multisensory SC neurons project to brainstem and spinal cord.

There are monosynaptic excitatory AES-SC projections and McHaffie et al. state that "the predominant effect of AES on SC multisensory neurons is excitatory."

The external nucleus of the inferior colliculus (ICx) of the barn owl represents a map of auditory space.

The map of auditory space in the external nucleus of the inferior colliculus (ICx) is calibrated by visual experience.

Hyde and Knudsen found that there is a point-to-point projection from OT to IC.

Hyde and Knudsen propose that the OT-IC projection conveys what they call a "template-based instructive signal" which aligns the auditory space map in ICx with the retinotopic space map in SC.

Deco and Rolls introduce a system that uses a trace learning rule to learn recognition of increasingly complex visual features in successive layers of a neural architecture. From layer to layer, the specificity of the features increases together with the size of the neurons' receptive fields, until the receptive fields span most of the visual range and the features actually code for objects. This model thus is a model of the development of object-based attention.

Semantic multisensory congruence can

  • shorten reaction times,
  • lower detection thresholds,
  • facilitate visual perceptual learning.

In one of their experiments, Warren et al. had their subjects localize visual or auditory components of visual-auditory stimuli (videos of people speaking and the corresponding sound). Stimuli were made `compelling' by playing video and audio in sync and `uncompelling' by introducing a temporal offset.

They found that their subjects performed as under a strong ``unity assumption'' when they were told they would perceive cross-sensory stimuli and the stimuli were `compelling', and under a low ``unity assumption'' when they were told there could be separate auditory or visual stimuli and/or the stimuli were made `uncompelling'.

Bertelson et al. did not find a shift of sound source localization due to manipulated endogenous visual spatial attention—localization was shifted only due to (the salience of) light flashes which would induce (automatic, mandatory) exogenous attention.

Vatakis and Spence found support for the concept of a `unity assumption' in an experiment in which participants were to judge whether a visual lip stream or an auditory utterance was presented first: Participants found this task easier if the visual and auditory stream did not match in terms of gender of voice or content, suggesting that their unity hypothesis was weak in these cases, causing them not to integrate them.

Fruit flies and other flies as well as pigeons actively use the change in perception brought about by moving to extract information about the world.

The deeper levels of SC receive virtually no primary visual input (in cats and ferrets).

Visual receptive fields in the superficial monkey SC do vary substantially in RF size with RF eccentricity.

In some animals, receptive field sizes do and in some they don't change substantially with RF eccentricity.

The neurons in the superficial (rhesus) monkey SC do not exhibit strong selectivity for specific shapes, stimulus orientation, or moving directions. Some of them do show selectivity to stimuli of specific sizes.

The activity profiles for stimuli moving through superficial SC neuron RFs shown in Cynader and Berman's work look similar to Poisson-noisy Gaussians; however, the authors state that the strength of a response to a stimulus was the same regardless of where in the activating region it was shown.

The neurons in the superficial (rhesus) monkey SC largely prefer moving stimuli over non-moving stimuli.

In the intermediate layers of the monkey SC, neurons have a tendency to reduce or otherwise change their reaction to presentations of the same stimulus over time.

There are marked differences in the receptive field properties of superficial cat and monkey SC neurons.

Wozny et al. distinguish between three strategies for multisensory integration: model averaging, model selection, and probability matching.

Wozny et al. found in an audio-visual localization experiment that a majority of their participants' performance was best explained by the statistically sub-optimal probability matching strategy.
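
The three strategies differ only in how a posterior over causal structures (common cause vs. independent causes) is turned into a location estimate. A sketch under simplified assumptions; the posterior and the estimates under each structure are taken as given here:

    import numpy as np

    rng = np.random.default_rng(1)
    p_common = 0.7                 # posterior probability of a single shared cause
    x_fused, x_vis = 10.0, 12.0    # estimates under common vs. independent causes

    # Model averaging: weight the estimates by the posterior over structures.
    est_avg = p_common * x_fused + (1 - p_common) * x_vis

    # Model selection: always commit to the more probable structure.
    est_sel = x_fused if p_common > 0.5 else x_vis

    # Probability matching: pick a structure with probability equal to its posterior.
    est_match = x_fused if rng.random() < p_common else x_vis

    print(est_avg, est_sel, est_match)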

With increasing distance between stimuli in different modalities, the likelihood of perceiving them as in one location decreases.

In an audio-visual localization task, Wallace et al. found that their subjects' localizations of the auditory stimulus were usually biased towards the visual stimulus whenever the two stimuli were perceived as one, and vice-versa.

Details of instructions and quality of stimuli can influence the strength of the spatial ventriloquism effect.

Ocular dominance stripes are stripes in visual brain regions in which retinal projections of one eye or the other terminate alternatingly.

Ocular dominance stripes have been shown to exist in the monkey SC. In some places, they weren't crisp but ran into each other.

A localized visual stimulus can lengthen the response time to a target if the target stimulus appears somewhere too late after the first stimulus.

This is called `inhibition of return'.

Bell et al. found that playing a sound before a visual target stimulus did not increase activity in the neurons they monitored for long enough to lead to (neuron-level) multisensory integration.

Bell et al. make it sound like enhancement in SC neurons due to exogenous, visual, spatial attention is due to residual cue-related activity which is combined (non-linearly) with target-related activity.

If enhancement in SC neurons due to exogenous, visual, spatial attention is due to residual cue-related activity which is combined (non-linearly) with target-related activity, then that casts an interesting light on (the lack of) intra-modal enhancement:

The only difference between an intra-modal cue-stimulus combination and an intra-modal stimulus-stimulus combination lies in the temporal order of the two. Therefore, if two visual stimuli presented in the receptive field of an SC neuron at the same time do not enhance the response to each other, then the reason can only be a matter of timing.

In an fMRI experiment, Fairhall and Macaluso found that attending (endogenously, spatially) to congruent audio-visual stimuli (moving lips and speech) produced greater activation in SC than either attending to non-congruent stimuli or not attending to congruent stimuli.

Visual spatial attention

  • lowers the stimulus detection threshold,
  • improves stimulus discrimination.

With two stimuli in the receptive field, one with features of a visual search target and one with different features, visual spatial attention

  • increases average neural activity in cortex compared to the same two objects without attending to any features,
  • decreases average neural activity if spatial attention is on the location of the non-target compared to when it is on the target.

The fact that average neural activity in cortex is decreased if spatial attention is on the location of a non-target out of a target and a non-target compared to when it is on the target supports the notion that inhibition plays an important role in stimulus selection.

Two superimposed visual stimuli of different orientation, one optimal for a given simple cell in visual cortex, the other sub-optimal but excitatory, can elicit a weaker response than just the optimal stimulus.

Seeing someone say 'ba' and hearing them say 'ga' can make one perceive them as saying 'da'. This is called the `McGurk effect'.

According to Carandini and Heeger, structures on the level of microcircuits which are repeated throughout the brain implement what the authors call `canonical neural computations'. Well-known examples of such canonical neural computations are:

  • exponentiation
  • linear filtering.

Another canonical neural computation proposed by Carandini and Heeger is (divisive) normalization.

Divisive normalization models describe neural responses well in cases of

  • olfactory perception in drosophila,
  • visual processing in retina and V1,
  • possibly in other cortical areas,
  • modulation of responses through attention in visual cortex.

Divisive normalization models describe neural responses well in a number of instances of sensory processing.

Divisive normalization is probably implemented through (GABA-ergic) inhibition in some cases (fruitfly olfactory system). In others (V1), it seems to be implemented by different means.
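
In its generic form, divisive normalization divides each neuron's driving input by the summed activity of a normalization pool. A minimal sketch of the standard normalization equation (exponent and semi-saturation constant are arbitrary here):

    import numpy as np

    def divisive_normalization(drive, sigma=1.0, n=2.0):
        """R_i = drive_i^n / (sigma^n + sum_j drive_j^n)."""
        d = np.asarray(drive, dtype=float) ** n
        return d / (sigma ** n + d.sum())

    # Each response is scaled by the summed activity of the pool:
    print(divisive_normalization([1.0, 2.0, 4.0]))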

Different regions project to different lamina of the SC.

An SC output neuron which projects to some structure outside the SC may sample input from SC lamina according to the requirements of the target of its projections.

There are monosynaptic connections from the retina to neurons both in the superficial and deeper layers of the SC.

In the study by Xu et al., multi-sensory enhancement in specially-raised cats decreased gradually with distance between uni-sensory stimuli instead of occurring if and only if the stimuli were present in the neurons' RFs. This is different from cats that are raised normally, in which enhancement occurs regardless of stimulus distance as long as both uni-sensory components are within the RF.

Attention affects both early and late perceptual processing.

Children do not integrate information the same way adults do in some tasks. Specifically, they sometimes do not integrate information optimally, where adults do integrate it optimally.

In an adapted version of Ernst and Banks' visuo-haptic height estimation paradigm, Gori et al. found that children under the age of 8 do not integrate visual and haptic information optimally where adults do.

Natural cognition is not always optimal.

A particular deviation from optimality in natural cognition is pathological cognition.

Ernst and Banks show that humans combine visual and haptic information optimally in a height estimation task.

There is topographic mapping even in the olfactory system.

Topographic mapping is pervasive throughout sensory-motor processing.

It is unclear how neurons could back-propagate errors in their inputs. Thus, the biological validity of backpropagation is limited.

Hinton argues that backpropagation is such a good idea that nature must have found a way to implement it somehow.

Neural responses in the SC to spatially and temporally coincident cross-sensory stimuli can be much stronger than responses to uni-sensory stimuli.

In fact, they can be much greater than the sum of the responses to either stimulus alone.

Neural responses (in multi-sensory neurons) in the SC to spatially disparate cross-sensory stimuli are usually weaker than responses to uni-sensory stimuli.

Responses in multi-sensory neurons in the SC follow the so-called spatial principle.

Visual receptive fields in the SC usually consist of an excitatory central region and an inhibitory surround.

(Auditory receptive fields also often seem to show this antagonism.)

Moving eyes, ears, or body changes the receptive field (in external space) in SC neurons wrt. stimuli in the respective modality.

Stanford et al. state that superadditivity seems quite common in cases of multi-sensory enhancement.

Some sensory-motor maps are complex: they are not a simple spatiotopic mapping, but comprise internally spatiotopic `neighborhoods' which, on a much greater scale, are themselves organized spatiotopically, but across which the same point in space may be represented redundantly.

The complex structure of sensory-motor maps may be due to a mapping from a high-dimensional manifold into a two-dimensional space. This kind of map would also occur in Ring's motmaps.

Topographic maps can help minimize wire length in neural networks.

Topographic maps have biological advantages.

The superior colliculus receives input from various sensory brain areas. According to King, these inputs are uni-sensory, as far as we know.

According to King, the principal function of the SC is initiating gaze shifts.

Yang and Shadlen show that neurons in LIP (in monkeys) encode the log probability of reward given artificial visual stimuli in a weather prediction task experiment.
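
In that task each shape carries a fixed weight of evidence, so the ideal decision variable is a running sum of log likelihood ratios, which is what LIP firing rates appeared to track. A toy version with hypothetical shape weights:

    import math

    # Hypothetical weights: log10 likelihood ratios favoring reward target A.
    shape_logLR = {"square": 0.9, "circle": 0.4, "star": -0.4, "cross": -0.9}

    def log_posterior_odds(shapes, prior_odds=1.0):
        """Log prior odds plus the sum of log likelihood ratios (base 10)."""
        return math.log10(prior_odds) + sum(shape_logLR[s] for s in shapes)

    evidence = ["square", "circle", "star"]
    print(log_posterior_odds(evidence))   # > 0 means target A is more probable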

The SC also seems to be involved in reaching and other forelimb-related motor tasks and has been associated with complex vision-guided arm-gestures in humans.

Before a saccade is made, the region that will be the target of that saccade is perceived with higher contrast.

Enhancement in the SC happens only between stimuli from different modalities.

Depression in the SC happens between stimuli from the same modality.

Is there really no enhancement between different cues from the same modality, like eg. contrast and color?

Alais and Burr found in an audio-visual localization experiment that the ventriloquism effect can be interpreted by a simple cue weighting model of human multi-sensory integration:

Their subjects weighted visual and auditory cues depending on their reliability. The weights they used were consistent with MLE. In most situations, visual cues are much more reliable for localization than are auditory cues. Therefore, a visual cue is given so much greater weight that it captures the auditory cue.

Human performance in combining slant and disparity cues for slant estimation can be explained by (optimal) maximum-likelihood estimation.

According to Landy et al., humans often combine cues (intra- or cross-sensory) optimally, consistent with MLE.
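
Under MLE with Gaussian cue noise, each cue is weighted by its inverse variance, and the combined estimate is never less reliable than the best single cue; visual capture of sound falls out whenever the visual variance is small. A minimal sketch:

    def mle_combine(x_v, var_v, x_a, var_a):
        """Reliability-weighted (maximum-likelihood) cue combination."""
        w_v = (1 / var_v) / (1 / var_v + 1 / var_a)
        x_hat = w_v * x_v + (1 - w_v) * x_a
        var_hat = 1 / (1 / var_v + 1 / var_a)   # never worse than the best cue
        return x_hat, var_hat

    # Reliable vision (variance 1) captures unreliable audition (variance 16):
    print(mle_combine(x_v=0.0, var_v=1.0, x_a=8.0, var_a=16.0))  # -> (~0.47, ~0.94)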

An image is highly salient where

  • there is high contrast,
  • there is high variance,
  • it has distinctive higher-order statistics,
  • there is high local symmetry.

People fixate on different parts of an image depending on the questions they are asked or task they are trying to accomplish.

People look where they point and point where they look.

Reasons why pointing and gazing are so closely connected may be

  • that gaze guides pointing,
  • that gazing and pointing use the same information,
  • or that a common motor command guides both.

Brouwer et al. found that their subjects looked more at the contact position of the index finger when they were told to grasp an object than when they were just to look at it.

In the first experiment by Brouwer et al., people fixated different parts of a shape depending on whether the task was just to look at it or to grasp it.

The subjects' initial saccades, however, were not influenced by the task.

Newborn children prefer to look at faces and face-like visual stimuli.

Visual cortex is not fully developed at birth in primates.

Different parts of the visual field feed into the cortical and subcortical visual pathways more or less strongly in humans.

The nasal part of the visual field feeds more into the cortical pathway while the peripheral part feeds more into the sub-cortical pathway.

In one experiment, newborns reacted to faces only if they were (exclusively) visible in their peripheral visual field, supporting the theory that the sub-cortical pathway of visual processing plays a major role in orienting towards faces in newborns.

It makes sense that sub-cortical visual processing uses peripheral information more than cortical processing:

  • sub-cortical processing is concerned with latent monitoring of the environment for potential dangers (or conspecifics)
  • sub-cortical processing is concerned with watching the environment and guiding attention in cortical processing.

SC has been implicated as part of a subcortical visual pathway which may drive face detection and orienting towards faces in newborns.

The subcortical visual pathway which may drive face detection and orienting towards faces in newborns hypothesized by Johnson also includes amygdala and pulvinar.

According to the hypothesis expressed by Johnson, amygdala, pulvinar, and SC together form a sub-cortical pathway which detects faces, initiates orienting movements towards faces, and activates cortical regions.

This implies that this pathway may be important for the development of the `social brain', as Johnson puts it.

Visual processing of potentially affective stimuli seems to be partially innate in primates.

The pulvinar receives direct retinal input.

Yu and Dayan interpret experiments showing that the level of acetylcholine (ACh) increases with learned stochasticity of cues as supporting their theory that ACh signals expected uncertainty.

Yu and Dayan interpret experiments showing that increased levels of norepinephrine (NE) accelerate the detection of changes in cue predictivity as supporting their theory that NE signals unexpected uncertainty.

Tadpoles make eye movements which compensate for swimming movements independent of visual or vestibular input. Their rhythmic swimming motor commands are generated by spinal central pattern generators (CPGs). Efference copies of these motor commands appear to be what induces the eye movements.

There seems to be a linear relationship between the mean and variance of neural responses in cortex. This is similar to a Poisson distribution where the variance equals the mean, however, the linearity constant does not seem to be one in biology.

Seung and Sompolinsky introduce maximum likelihood estimation (MLE) as one possible mechanism for neural read-out. However, they state that it is not clear whether MLE can be implemented in a biologically plausible way.

Seung and Sompolinsky show that, in a population code with wide tuning curves and Poisson noise, and under the conditions described in their paper, the response of neurons near threshold carries exceptionally high information.
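
Concretely, ML readout means finding the stimulus value that maximizes the likelihood of the observed spike counts given the tuning curves and Poisson noise. A small sketch with arbitrary tuning parameters:

    import numpy as np

    rng = np.random.default_rng(2)
    prefs = np.linspace(-40, 40, 33)   # preferred stimuli of the population

    def tuning(s):
        """Gaussian tuning curves with a small baseline rate."""
        return 20 * np.exp(-(s - prefs) ** 2 / (2 * 10 ** 2)) + 0.5

    true_s = 5.0
    counts = rng.poisson(tuning(true_s))    # one noisy population response

    # Poisson log likelihood over candidate stimuli (the log(k!) term is constant).
    grid = np.linspace(-40, 40, 801)
    loglik = np.array([np.sum(counts * np.log(tuning(s)) - tuning(s)) for s in grid])
    s_hat = grid[np.argmax(loglik)]

    print("ML estimate:", s_hat)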

Visual input does seem to be necessary to ensure spatial audio-visual map-register.

Most of the multi-sensory neurons in the (cat) SC are audio-visual followed by visual-somatosensory, but all other combinations can be found.

Stein defines multi-sensory integration on the single-neuron level as

``a statistically significant difference between the number of impulses evoked by a cross-modal combination of stimuli and the number evoked by the most effective of these stimuli individually.''

The SC is also involved in eye, head, whole-body, ear, whisker and other body movements.

What we find in the SC we can use as a guide when studying other multi-sensory brain regions.

Multisensory integration is present in neonates to some degree depending on species (more in precocial than in altricial species), but it is subject to postnatal development and then influenced by experience.

An experiment by Burr et al. showed auditory dominance in a temporal bisection task (studying the temporal ventriloquism effect). The results were qualitatively but not quantitatively predicted by an optimal-integration model.

There are two possibilities explaining the latter result:

  • audio-visual integration is not optimal in this case, or
  • the model is incorrect. Specifically, the assumption of Gaussian noise in timing estimation may not reflect actual noise.

Multisensory enhancement and depression are an increased and decreased response of a multisensory neuron to congruent and incongruent stimuli, respectively.

Multisensory enhancement and depression are very different across neurons.

In many instances of multi-sensory perception, humans integrate information optimally.

Multisensory stimuli can be integrated within a certain time window; auditory or somatosensory stimuli can be integrated with visual stimuli even though they arrive delayed wrt. visual stimuli.

Enhancement is greatest for weak stimuli and least for strong stimuli. This is called inverse effectiveness.
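
Enhancement is commonly quantified as the percent increase of the multisensory response over the most effective unisensory response; inverse effectiveness then shows up as a larger index for weaker stimulus pairs. A sketch using that index, with made-up spike counts:

    def enhancement_index(multi, uni_responses):
        """Percent enhancement relative to the most effective unisensory response."""
        best_uni = max(uni_responses)
        return 100 * (multi - best_uni) / best_uni

    # Hypothetical mean spike counts:
    print(enhancement_index(9.0, [2.0, 3.0]))    # weak stimuli: +200% (superadditive)
    print(enhancement_index(24.0, [12.0, 20.0])) # strong stimuli: +20%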

Descending inputs from association cortex to SC are uni-sensory.

AES integrates audio-visual inputs similar to SC.

AES has multisensory neurons, but they do not project to SC.

AES is a brain region in the cat. We do not know if there is a homologue in humans.

Map alignment in the SC is expensive, but it pays off because it allows for a single interface between sensory processing and motor output generation.

Important regions in the posterior parietal cortex (PPC) are

  • LIP
  • MIP
  • VIP

LIP is retinotopic and involved in gaze shifts.

The medial intraparietal area (MIP) is retinotopic and involved in reaching.

Sensory re-mapping is often incomplete.

Non-spatial stimulus properties influence if and how cross-sensory stimuli are integrated.

Multisensory integration in cortical VLPFC was more commonly observed for face-vocalization combinations than for general audio-visual cues.

Backward connections in the visual cortex show less topographical organization (they `show abundant axonal bifurcation') and are more abundant than forward connections.

The visual cortex is hierarchically organized.

It seems a bit unclear to me what determines the hierarchy of the visual cortex if backward connections are predominant.

Feedforward connections in the visual cortex seem to be driving while feedback connections seem to be modulatory.

Many of ART's predictions about natural cognition have been validated.

The anatomical interpretation of the terms 'bottom-up' and 'top-down' is that of feedforward vs. feedback connections in a processing hierarchy, respectively.

The terms 'bottom-up' and 'top-down' can mean different, related things depending on context. Engel et al. list four:

  • anatomical
  • cognitivist
  • gestaltist
  • (neural) dynamicist

Grossberg's ART and Friston's theory of cortical responses appeal to the anatomical interpretation of 'top-down' and 'bottom-up' processing and stress feedback as well as feedforward connections.

The changes to neural responses due to top-down attention are purely caused by intrinsic processes, not a (direct) reaction to external stimuli. They thus support the theory of situatedness.

Utility functions are used in economics to explain people's decisions. They can also be used to examine non-economic decisions, like decisions in sensorimotor control.

Körding et al. show how a utility function can be inferred from subjects' decisions in a two-alternative forced-choice task.

Benevento and Fallon found projections from the SC mostly to midbrain and thalamus structures. They did not study projections to cortical regions. In detail, they found projections to:

  • inferior colliculus
  • pretectum
  • ventral LGN
  • dorsal LGN
  • suprageniculate nucleus
  • intralaminar nuclei
  • parafascicular nucleus
  • parts of the dorsomedial nucleus
  • certain pulvinar nuclei
  • lateral posterior nucleus
  • reuniens nucleus
  • ventral posterior inferior nucleus
  • ventral posterior lateral nuclei
  • ventral lateral nucleus
  • limitans nucleus
  • fields of Forel (subthalamic)
  • zona incerta
  • accessory optic tract (in midbrain)

There is an illusion that there is a "stable, high-resolution, full field representation of a visual scene" in the brain.

Could the illusion that there is a "stable, high-resolution, full field representation of a visual scene" in the brain be the result of the availability heuristic? Whenever we are interested in some point in a visual scene, it is either at the center of our vision anyway, or we saccade to it. In both cases, detailed information of that scene is available almost instantly.

This seems to be what O'Regan and Noë imply (although they do not talk about the availability heuristic).

Jerome Feldman argues that the Neural Binding Problem is really four related problems and not distinguishing between them contributes to the difficulty of understanding them.

Jerome Feldman distinguishes between the following four "technical issues" that together form the binding problem: "General Considerations of Coordination", "The Subjective Unity of Perception", "Visual Feature-Binding", and "Variable Binding".

The general Binding Problem according to Jerome Feldman is really a problem of any distributed information processing system: it is difficult and sometimes impossible or intractable for a system that keeps and processes information in a distributed fashion to combine all the information available and act on it.

Jerome Feldman describes the sub-problem of "General Considerations of Coordination" of the general Binding Problem as more or less a problem of synchronization, and states that modeling efforts are well underway, taking into account physiological details such as spiking behavior and neuron oscillations.

The sub-problem of "Subjective Unity of Perception" according to Feldman is the problem of explaining why we experience perception as an "integrated whole" while it is processed by "largely distinct neural circuits".

Feldman states that enough is known about what he calls "Visual Feature Binding", so as not to call it a problem anymore.

Feldman explains Visual Feature Binding by the fact that all the features detected in the fovea usually belong together (because it is so small), and through attention. He cites Chikkerur et al.'s Bayesian model of the role of spatial and object attention in visual feature binding.

Feldman states that "Neural realization of variable binding is completely unsolved".

Feldman dismisses de Kamps' and van der Velde's approaches to neural variable binding stating that they don't work for the general case "where new entities and relations can be dynamically added".

55% of the neocortex is visual.

Neurons at low stages in the hierarchy of visual processing extract simple, localized features.

Color opponency and center-surround opponency arise first in LGN.

The visual system (of primates) contains a number of channels for different types of visual information:

  • color
  • shape
  • motion
  • texture
  • 3D

Nearly all projections from the retinae go through LGN.

All visual areas from V1 to V2 and MT are retinotopic.

The ventral pathway of visual processing is weakly retinotopically organized.

The complexity of features (or combinations of features) neurons in the ventral pathway react to increases to object level. Most neurons react to feature combinations which are below object level, however.

The dorsal pathway of visual processing consists of areas MST (motion area), and visual areas in the posterior parietal cortex (PPC).

The complexity of motion patterns neurons in the dorsal pathway are responsive to increases along the pathway. This is similar to neurons in the ventral pathway which are responsive to progressively more complex feature combinations.

Receptive fields in the dorsal pathway of visual processing are less retinotopic and more head-centered.

Parvocellular ganglion cells are color sensitive, have small receptive fields and are focused on foveal vision.

Magnocellular ganglion cells have lower spatial and higher temporal resolution than parvocellular cells.

There are shortcuts between the levels of visual processing in the visual cortex.

Certain neurons in V1 are sensitive to simple features:

  • edges,
  • gratings,
  • line endings,
  • motion,
  • color,
  • disparity

Certain receptive fields in the cat striate cortex can be modeled reasonably well using linear filters, more specifically Gabor filters.

Simple cells are sensitive to the phase of gratings, whereas complex cells are not and have larger receptive fields.

Some cells in V1 are sensitive to binocular disparity.

LIP has been suggested to contain a saliency map of the visual field, to guide visual attention, and to decide about saccades.

The auditory field of the anterior ectosylvian sulcus (fAES) has strong corticotectal projections (in cats).

Some cortical areas are involved in orienting towards auditory stimuli:

  • primary auditory cortex (A1)
  • posterior auditory field (PAF)
  • dorsal zone of auditory cortex (DZ)
  • auditory field of the anterior ectosylvian sulcus (fAES)

Only fAES has strong cortico-tectal projections.

The receptive fields of LGN cells can be described as either an excitatory area inside an inhibitory area or the reverse.

Hawken and Parker studied the response patterns of a large number of cells in the cat striate cortex and found that Gabor filters, filters based on the second differential of Gaussian functions, and difference-of-Gaussians filters all model these response patterns well, quantitatively.

They found, however, that difference-of-Gaussians filters strongly outperformed the other models.
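
All three candidate models are line-weighting functions fitted to measured response profiles. For illustration, 1-D versions of a Gabor and a difference-of-Gaussians profile (parameter values arbitrary):

    import numpy as np

    x = np.linspace(-3, 3, 601)   # position across the receptive field (deg)

    def gabor(x, sigma=0.8, freq=1.0, phase=0.0):
        """Gaussian-windowed sinusoid."""
        return np.exp(-x ** 2 / (2 * sigma ** 2)) * np.cos(2 * np.pi * freq * x + phase)

    def dog(x, sigma_c=0.4, sigma_s=1.0, k=0.6):
        """Difference of a narrow excitatory and a broad inhibitory Gaussian."""
        center = np.exp(-x ** 2 / (2 * sigma_c ** 2))
        surround = k * np.exp(-x ** 2 / (2 * sigma_s ** 2))
        return center - surround

    # Both yield center-surround-like profiles that can be fitted to data.
    print(gabor(x).round(2)[::100], dog(x).round(2)[::100])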

Correct pro-saccades are executed earlier than correct anti-saccades.

In saccade/anti-saccade experiments, direction errors are confined to the anti-saccade condition.

In anti-saccade experiments, incorrect saccades (those in the direction of the visual stimulus) occur earlier after target onset than do correct anti-saccades.

The timing of correct pro-saccades has a bi-modal distribution. One class of pro-saccades happens very fast (express saccades), the others take a little longer.

Express saccades are thought of as reflex behavior. The reflex behind them is referred to as the 'visual grasp reflex'.

They are believed to be the result of a direct translation of a visual stimulus into a motor command.

In anti-saccade conditions, the `visual grasp reflex' must be suppressed.

Activation of FEF and SC neurons is higher before direction error saccades in anti-saccade tasks than before correct anti-saccades.

Munoz and Everling assume that there are distinct populations of fixation and saccade neurons in the SC and FEF.

In a more recent paper, Casteau and Vitu state that there is some debate about that. However, they, too, argue for distinct fixation neurons. On the other hand, they also state that fixation neurons are probably not located in the SC itself, which is in contrast to what Munoz and Everling write.

Omnipause neurons in the reticular formation tonically inhibit `the saccade-generation circuit'.

It seems unclear what the original source of SC inhibition in preparation of anti-saccades is. Munoz and Everling cite the supplementary eye fields (SEF), the dorsolateral prefrontal cortex (DLPFC), and the substantia nigra pars reticulata (SNpr) as possible sources.

The supplementary eye fields (SEF) and dorsolateral prefrontal cortex (DLPFC) both project directly to the SC.

LIP may be where anti-saccade targets are decided upon.

There seems to be an ascending pathway from superficial SC to the medial temporal area (MT) through the pulvinar nuclei (inferior pulvinar).

Berman and Wurtz found neurons in the pulvinar nuclei which received input from SC and sent output to MT.

Pulvinar neurons project to the SC.

Pulvinar neurons receive input from the MT.

There seem to be few, if any, neurons in the pulvinar which receive input from and project to neurons in the same region of MT.

Pulvinar neurons seem to receive input and project to different layers in visual cortex:

They receive input from layer 5 and project to layers one and three.

Connectivity between pulvinar and MT is similar to connectivity between pulvinar and visual cortex.

Saccade targets tend to be the centers of objects.

When reading, preferred viewing locations (PVL)---the centers of the distributions of fixation targets---are typically located slightly left of the center of words.

When reading, the standard deviation of the distribution of fixation targets within a word increases with the distance between the start and end of a saccade.

Saccades are thought to be biased toward a medium saccade length; long saccades typically undershoot, short saccades overshoot.

During reading, the further a saccade lands from the center of a word, the greater the probability of a re-fixation.

Pajak and Nuthmann found that saccade targets are typically at the center of objects. This effect is strongest for large objects.

Already von Helmholtz formulated the idea that prior knowledge---or expectations---is fused with sensory information into perception.

This idea is at the core of Bayesian theory.

Although predecessors existed, Bayesian theory became popular in perceptual science in the 1980s and 1990s.

We do not know the types of functions computable by neurons.

The eye suffers from

  • chromatic aberration
  • optical imperfections
  • the fact that photo receptors are behind ganglia and blood vessels

Different wavelengths of light are refracted differently. Therefore, the focal point of the lens is never the same for all wavelengths. Thus, any object can only be perfectly focused in one wavelength; partial images in other wavelengths are always blurred. This effect is called chromatic aberration.

Cones are color-sensitive, rods aren't.

The optic nerve does not have the bandwidth to transmit all the light receptors' activities. Some compression occurs already in the eye.

Short-range inhibition happens in the horseshoe crab compound eye: neighbouring receptor units inhibit each other.

Ganglion cells in the retina connect the brain to a small, localized population of photoreceptors. This small population---or the region in space from which it receives incoming light---is called the ganglion cell's receptive field. Ganglion cells respond best either to patterns of high luminance in the center of that small population and low luminance at its periphery, or to the opposite pattern. Ganglion cells with the former characteristics are called "on-center" cells, the others "off-center" cells.

Cells in the amygdala respond to faces and parts of faces. Some react exclusively to faces.

More visual processing tends to occur in the retina the more important the result is (like detecting bugs for frogs or detecting foxes for rabbits) and the less complex the organism is (like frogs and rabbits).

LGN has six layers.

LGN is retinotopically organized.

Magnocellular ganglion cells have large receptive fields.

The M-stream of visual processing is formed by magnocellular ganglion cells, the P-stream by parvocellular ganglion cells.

The M-stream is thought to deal with motion detection and analysis, while the P-stream seems to be involved in processing color and form.

The part of the visual cortex dedicated to processing signals from the fovea is much greater than that dealing with peripheral signals.

LGN receives more feedback projections from V1 than forward connections from the retina.

Cells in inferotemporal cortex are highly selective to the point where they approach being grandmother cells.

There are cells in inferotemporal cortex which respond to (specific views on / specific parts of) faces, hands, walking humans and others.

Certain ganglion cells in the frog retina, dubbed `bug detectors', react exclusively to bug-like stimuli and their activity provokes bug-catching behavior in the frog.

Both populations in prefrontal cortex and posterior parietal cortex show correlates of bottom-up and top-down visual attention.

In the pop-out condition of a visual search task, Buschman and Miller found that neurons in the posterior parietal cortex region LIP found the search target earlier than neurons in frontal cortex regions FEF and LPFC.

In the pure visual search condition of a visual search task, Buschman and Miller found that neurons in frontal cortex regions FEF and LPFC found the search target earlier than neurons in the posterior parietal cortex region LIP.

Visual attention is influenced both by local and global saliency, ie. bottom-up processes, and by semantics, ie. top-down processes.

In some instances, developing animals lose perceptual capabilities instead of gaining them due to what is called perceptual narrowing or canalization. One example are human neonates who are able to discriminate human and monkey faces at first, but only human faces later in development.

Schroeder names two general definitions of multisensory integration: One includes any kind of interaction between stimuli from different senses, the other only integration of information about the same object of the real world from different sensory modalities.

These definitions both are definitions on the functional level as opposed to the biological level with which Stein's definition is concerned.

Multisensory integration can be thought of as a special case of integration of information from different sources---be they from one physical modality or from many.

Studying multisensory integration instead of the integration of information from different channels from the same modality tends to be easier because the stimuli can be more reliably separated in experiments.

Schroeder argues that multisensory integration is not separate from general cue integration and that information gleaned about the former can help understand the latter.

The direction of a saccade is population-coded in the SC.

There exist two hypotheses for how saccade trajectory is population-coded in the SC:

  • the sum of the contributions of all neurons
  • the weighted average of contributions of all neurons

The difference is in whether or not the population response is normalized.

According to Lee et al., the vector summation hypothesis predicts that any deactivation of motor neurons should result in hypometric saccades because their contribution is missing.

According to the weighted average hypothesis, the error depends on where the saccade target is wrt. the preferred direction of the deactivated neurons.

Lee et al. found that de-activation of SC motor neurons did not always lead to hypometric saccades. Instead, saccades generally landed too far from the preferred direction of the de-activated neurons. They counted this as supporting the vector averaging hypothesis.
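
The two readout hypotheses differ only in whether the population vector is normalized by total activity, and the predicted effect of focal deactivation follows directly. A sketch with a ring of direction-tuned neurons (all parameters hypothetical):

    import numpy as np

    # Ring of motor neurons with preferred directions around the circle.
    prefs = np.linspace(0, 2 * np.pi, 64, endpoint=False)
    target = np.pi / 4                                   # desired saccade direction

    activity = np.exp(np.cos(prefs - target) * 3)        # bell-shaped population response
    alive = np.abs(prefs - np.pi / 3) > 0.3              # deactivate neurons near 60 deg

    vecs = activity[alive, None] * np.stack([np.cos(prefs[alive]),
                                             np.sin(prefs[alive])], axis=1)

    vector_sum = vecs.sum(axis=0)                          # summation: no normalization
    vector_avg = vecs.sum(axis=0) / activity[alive].sum()  # weighted average

    # Summation predicts a shorter (hypometric) saccade; averaging predicts a
    # direction pushed away from the deactivated neurons' preferred direction.
    print("sum readout length:", np.linalg.norm(vector_sum))
    print("avg direction (deg):", np.degrees(np.arctan2(*vector_avg[::-1])))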

Nature has had millions of years to optimize the performance of cognitive systems. It is therefore reasonable to assume that they perform optimally wrt. natural tasks and natural conditions.

Bayesian theory provides a framework to determine optimal strategies. Therefore, it makes sense to operate under the assumption that the processes we observe in nature can be understood as implementations of Bayes-optimal strategies.

The superior colliculus sends motor commands to cerebellum and reticular formation in the brainstem.

Lateral intraparietal area (LIP) projects to intermediate layers of SC.

Mishkin et al. proposed a theory suggesting that visual processing runs in two pathways: the `what' and the `where' pathway.

The `what' pathway runs ventrally from the striate and prestriate to the inferior temporal cortex. This pathway is supposed to deal with the identification of objects.

The `where' pathway runs dorsally from striate and prestriate cortex to the inferior parietal cortex. This pathway is supposed to deal with the localization of objects.

Mishkin et al. already recognized the question of how and where the information carried in the different pathways could be integrated. They speculated that some of the targets of projections from the pathways, eg. in the limbic system or the frontal lobe, could be convergence sites. Mishkin et al. stated that some preliminary results suggest that the hippocampal formation might play an important role.

There is multisensory integration in areas typically considered unisensory, eg. primary and secondary auditory cortex.

There are feedforward and feedback connections between visual cortex and auditory cortex.

Connectivity-wise, the strongest connections between SC and cortex are between SC and parietal lobe.

High white matter coherence between the parietal lobe and modality-specific brain regions is correlated with high temporal multi-sensory enhancement (shorter reaction times in multi-sensory trials than in uni-sensory trials).

Georgopoulos et al. introduced the notion of population coding and population vector readout.