Vision: A Computational Investigation into the Human Representation and Processing of Visual Information (15 June 1983) by David Marr
@book{marr-1982,
    abstract = {A computational investigation into the human representation and processing of
visual information.},
    author = {Marr, David},
    day = {15},
    howpublished = {Paperback},
    isbn = {0716715678},
    keywords = {computational-neuroscience, neuroscience, psychology, vision, visual-processing},
    month = jun,
    posted-at = {2013-04-12 15:01:43},
    priority = {2},
    publisher = {Henry Holt \& Company},
    address = {New York},
    title = {Vision: A Computational Investigation into the Human Representation and Processing of Visual Information},
    url = {http://www.amazon.com/exec/obidos/redirect?tag=citeulike07-20\&path=ASIN/0716715678},
    year = {1983}
}

More visual processing tends to occur in the retina the more important the result is (like detecting bugs for frogs or detecting foxes for rabbits) and the less complex the organism (like frogs and rabbits).

"In order to understand a device one needs many different kinds of explanations." To understand vision, one needs theories that are consistent with what the common man, the brain scientist, and the experimental psychologist each know, and that can be put to practical use.

Certain ganglion cells in the frog retina, dubbed `bug detectors', react exclusively to bug-like stimuli and their activity provokes bug-catching behavior in the frog.

Marr effectively argues for a normative approach:

"... gone is any explanation in terms of neurons—except as a way of implementing a method. And present is a clear understanding of what is to be computed, how it is to be done, the physical assumptions on which the method is to be based, and some kind of analysis of algorithms that are capable of carrying it out."

It is important to make explicit the distinction between the different levels at which something (an information-processing system) can be understood.

Understanding that an abstract, mathematical description of the brain as an information-processing system is part of understanding the brain as a whole, one can rationally study

  • what is being processed,
  • why it is being processed,
  • how it is processed,
  • and whether or not processing it that way is optimal.

A representation is a formal system for making explicit certain entities or types of information, together with a specification of how the system does this.

Marr calls the result of using a representation to describe a given entity a description of the entity in that representation.

Any type of representation makes certain information explicit at the expense of information that is pushed into the background and may be quite hard to recover.
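
Numeral systems are a convenient way to see this trade-off (Marr himself uses them as an example in the book). The sketch below is only an illustration; the helper names are made up for these notes. The same trivial check, "is this string a 1 followed only by zeros?", reveals whether a number is a power of ten when the number is written in decimal, and whether it is a power of two when it is written in binary. Whichever property the chosen representation does not make explicit would need real computation to recover.

```python
def to_decimal(n: int) -> str:
    """Describe a positive integer in the decimal representation."""
    return str(n)


def to_binary(n: int) -> str:
    """Describe a positive integer in the binary representation."""
    return bin(n)[2:]  # strip the '0b' prefix


def is_one_followed_by_zeros(description: str) -> bool:
    """Cheap syntactic check on a description (hypothetical helper).

    In decimal this is true exactly for powers of ten,
    in binary exactly for powers of two.
    """
    return description[0] == "1" and set(description[1:]) <= {"0"}


if __name__ == "__main__":
    for n in (1000, 1024):
        print(
            n,
            "power of ten is explicit:", is_one_followed_by_zeros(to_decimal(n)),
            "| power of two is explicit:", is_one_followed_by_zeros(to_binary(n)),
        )
```

For 1000 the decimal description makes "power of ten" obvious while the binary one hides it; for 1024 it is the other way around.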

The underlying task in vision is to "reliably derive properties of the world from images of it".

Representation, algorithm, and hardware depend on each other and, critically, on the demands of the task.

The three levels at which any information-processing system needs to be understood are

  • computational theory
  • representation and algorithm
  • hardware implementation

According to Marr, the computational theory of an information-processing system is the theory of what it does, why it does what it does and "what is the logic of the strategy by which" what it does can be done.
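
The separation of levels can be sketched with addition, in the spirit of the cash-register example Marr uses in the book. Everything below is an illustrative assumption, not code from the book: the computational theory is "compute the sum", and two different representation/algorithm pairs (decimal digit strings with carrying, and tally marks with concatenation) both realize it. The hardware level is whatever happens to run the Python interpreter.

```python
def add_decimal(a: str, b: str) -> str:
    """Representation: decimal digit strings. Algorithm: right-to-left add with carry."""
    result = []
    carry = 0
    i, j = len(a) - 1, len(b) - 1
    while i >= 0 or j >= 0 or carry:
        da = int(a[i]) if i >= 0 else 0
        db = int(b[j]) if j >= 0 else 0
        carry, digit = divmod(da + db + carry, 10)
        result.append(str(digit))
        i -= 1
        j -= 1
    return "".join(reversed(result))


def add_tally(a: str, b: str) -> str:
    """Representation: tally marks ('|||'). Algorithm: concatenation."""
    return a + b


def to_tally(n: int) -> str:
    """Hypothetical helper: write a non-negative integer as tally marks."""
    return "|" * n


if __name__ == "__main__":
    x, y = 58, 67
    # Both pairs satisfy the same computational theory: the result denotes x + y.
    print(add_decimal(str(x), str(y)))
    print(len(add_tally(to_tally(x), to_tally(y))))
```

The two functions share nothing at the algorithmic level, yet at the level of computational theory they are the same thing, which is why the theory has to be stated independently of any particular algorithm.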

Neurophysiology can help us understand the representations. Otherwise, it is mainly concerned with the implementation side of studying the brain as an information-processing system. Neurophysiological knowledge is hard to interpret in terms of algorithms and representations, especially without a clear understanding of the task (i.e., the computational theory).

Psychophysical results can inform the study of algorithms and representations.

A heuristic program that solves some task is not a theory of that task! Theoretical analysis of the task and its domain is necessary!

Marr speaks of vision as one process, whose task is to generate `a useful description of the world'. However, there is more than one actual goal of vision (though they share similar properties) and thus there are different representations and algorithms being used in the different parts of the brain concerned with these goals.

Marr writes: "The usefulness of a representation depends upon how well suited it is to the purpose for which it is used". That's pure embodiment.

When studying an information-processing system, and given a computational theory of it, algorithms and representations for implementing it can be designed, and their performance can be compared to that of natural processing.

If the performance is similar, that supports our computational theory.