Show Thoughts

If we want to learn classification using backprop, we cannot force our network to create binary output because binary output is not a smooth function of the input.

Instead we can let our network learn to output the log probability for each class given the input.