Imagenet classification with deep convolutional neural networks In Advances in Neural Information Processing Systems, Vol. 25 (2012), pp. 1097-1105 by Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton edited by Fernando C. N. Pereira, Chris J. C. Burges, Léon Bottou, Kilian Q. Weinberger

@inproceedings{krizhevsky-et-al-2012,
  abstract = {We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the {ImageNet} {LSVRC}-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5\% and 17.0\% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient {GPU} implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called ``dropout'' that proved to be very effective. We also entered a variant of this model in the {ILSVRC}-2012 competition and achieved a winning top-5 test error rate of 15.3\%, compared to 26.2\% achieved by the second-best entry.},
  author = {Krizhevsky, Alex and Sutskever, Ilya and Hinton, Geoffrey E.},
  booktitle = {Advances in Neural Information Processing Systems},
  citeulike-article-id = {13133909},
  citeulike-linkout-0 = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.299.205},
  editor = {Pereira, Fernando C. N. and Burges, Chris J. C. and Bottou, L\'{e}on and Weinberger, Kilian Q.},
  keywords = {classification, deep-learning, learning, object},
  pages = {1097--1105},
  posted-at = {2015-03-17 10:54:56},
  priority = {2},
  publisher = {Curran Associates, Inc.},
  title = {Imagenet classification with deep convolutional neural networks},
  url = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.299.205},
  volume = {25},
  year = {2012}
}

See the CiteULike entry for more info, PDF links, BibTeX, etc.

Convolutional neural networks make assumptions on the input:

- stationarity of image statistics (the same statistics hold across all regions of an image),
- locality of dependencies between pixels.

With a few strong but empirically justified assumptions about the input, CNNs buy us a large reduction in the number of parameters, and thus better training performance, compared to standard feed-forward ANNs.
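The parameter reduction can be made concrete with a back-of-the-envelope calculation. The sketch below (illustrative numbers, not taken from the paper) compares a small convolutional layer against a fully-connected layer mapping between feature maps of the same shape:

```python
# Illustrative parameter-count comparison: weight sharing (stationarity) and
# small kernels (locality) make a conv layer's parameter count independent
# of the input's spatial size, unlike a fully-connected layer.

def conv_params(in_channels, out_channels, kernel_size):
    # One kernel per (input channel, output channel) pair, plus one bias
    # per output channel; shared across all spatial positions.
    return out_channels * (in_channels * kernel_size * kernel_size + 1)

def dense_params(in_features, out_features):
    # A fully-connected layer has one weight per input-output pair.
    return out_features * (in_features + 1)

# Mapping a 224x224 RGB image to 64 feature maps of the same size:
h = w = 224
conv = conv_params(3, 64, kernel_size=3)        # 1,792 parameters
dense = dense_params(3 * h * w, 64 * h * w)     # hundreds of billions

print(f"conv: {conv:,}  dense: {dense:,}")
```

The convolutional layer needs a few thousand parameters where the equivalent fully-connected layer would need hundreds of billions, which is why these assumptions make training on large images tractable at all.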

Advances in computing hardware and fast implementations of the necessary operations have made training of convolutional neural networks on very large data sets practical.

It took Krizhevsky et al. five to six days to train their network on top-notch hardware available in 2012.

Krizhevsky et al. demonstrate that large, deep convolutional neural networks can achieve very good object classification performance.