Neural Networks, Deep Learning, and Computer Vision, Part 1

I was going to combine these topics into a single post, but I decided they each warrant more discussion than a single post would allow. So, today I’m just going to talk about neural networks, both biological and artificial, and provide a bit of historical context.

In the last post, I discussed Google’s recently developed DeepVariant method, which uses computer vision methods based on deep neural networks to identify meaningful genome sequence variants. In this and subsequent posts, I’ll introduce the principles of neural networks, define deep learning in the context of neural networks, and discuss how these are applied to computer vision problems. Ready? Let’s go.

Researchers have mapped the entire biological neural network of a simple animal, a roundworm (nematode) called Caenorhabditis elegans (more commonly rendered as "C. elegans" for obvious reasons), and an amazing interactive version of this neural network is available here. Artificial neural networks are an attempt to mimic the basic structure of biological neural networks, i.e., animal brains. They consist of units called “artificial neurons” or “nodes” that are connected to each other and organized in layers. Each node can receive a signal from one or more nodes and then transmit a signal to one or more other nodes. Typically, the signal between nodes is a number, and the output of each node is based on a function operating on the weighted sum of its inputs. Each connection has a weight, which can be thought of as a multiplier for the signal it transmits. Each node has a threshold that defines when the input signal(s) will produce an output signal that is transmitted to subsequent nodes. The connection weights and thresholds change and adapt during neural network training (the input-to-output function is usually chosen ahead of time), but I’ll address that in the next post.
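To make the node description a little more concrete, here is a minimal sketch of a single artificial neuron in Python. The function name, the example weights, and the threshold value are all made up for illustration, and the simple "fire or don't fire" step function is just one possible choice of input-to-output function.

```python
def neuron_output(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    # Each incoming signal is multiplied by the weight of its connection.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    # A step function: the node only "fires" once the threshold is reached.
    return 1 if weighted_sum >= threshold else 0


# Two upstream signals with hypothetical weights; one connection is inhibitory.
signals = [1.0, 0.5]
weights = [0.8, -0.4]
print(neuron_output(signals, weights, threshold=0.5))
```

Here the weighted sum is roughly 0.6, which clears the threshold of 0.5, so the node fires. Raise the threshold or weaken the first connection and it stays silent.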

Example of a simple neural network.

Our journey toward neural networks, and ultimately, artificial intelligence, began in the late 1940s with the work of Donald O. Hebb and Alan Turing. Hebb was working on questions about how the organization and function of neurons give rise to behaviors like learning. Hebb’s (well-supported) theory states that repeated stimulation of one neuron by another increases the efficiency of the connection between them. Around the same time, Turing suggested that the cortex of a human infant is an “unorganised machine,” and posited that a network of electronic logic gates (or nodes), where each connection between nodes is influenced by a modifier that can attenuate or reinforce the connection, would behave similarly to a biological brain. Turing went so far as to suggest that thinking of the human cortex as an unorganised machine is consistent with how the brain could have arisen through evolution and genetics, as well as with the learning and neuroplasticity within an individual mind that Hebb wrote about.
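Hebb’s idea is often boiled down to a simple update rule: strengthen a connection in proportion to how active the neurons on both ends are at the same time. The sketch below is that common textbook simplification, not Hebb’s own formulation, and the function name and learning rate are hypothetical.

```python
def hebbian_update(weight, pre, post, learning_rate=0.1):
    """Strengthen a connection in proportion to joint pre/post activity."""
    return weight + learning_rate * pre * post


# Repeated joint stimulation makes the connection steadily stronger.
weight = 0.0
for _ in range(5):
    weight = hebbian_update(weight, pre=1.0, post=1.0)
print(weight)
```

If either neuron is silent (activity of zero), the weight is left unchanged.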

Alan Turing was a true genius, on par at least with Newton, Darwin, and Einstein. For his work in breaking the Enigma cipher, he deserves as much credit as any other individual for the Allies winning World War II. Following the war, he was treated cruelly by his countrymen, and he left us too soon. He also deserves (at least) a post just about him; maybe I’ll get to that someday.

The 1950s saw one of the first practical applications of artificial neural networks in the form of an algorithm called the “perceptron.” What a great name: perceptron. Fascinatingly, this early application was designed for visual pattern recognition, essentially what we call “Computer Vision” today. But more on that later. The original Mark I Perceptron machine was the size of a small room and used a camera consisting of a 20x20 array of photocells to produce a 400-pixel image. Here’s the declassified operator’s manual!

The original perceptron algorithm was limited in that it could only classify linearly separable patterns. Linear separability might be most easily understood by imagining a set of blue and red points distributed on a plane. The two groups of points (blue and red) are linearly separable in two dimensions only if a single straight line can divide all blue points from all red points. Unfortunately, this shortcoming of the first perceptron was emphasized in a 1969 book by Marvin Minsky and Seymour Papert called Perceptrons: An Introduction to Computational Geometry, and interpreted by many to indicate that problems that are not linearly separable would always be inaccessible to neural networks. In reality, the authors knew that more advanced neural networks, those containing multiple layers and highly connected nodes, should be able to address much more complex classification problems. However, the perception of perceptrons as inherently limited prevailed and prevented progress for a prolonged period.
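Here is a small sketch of that limitation in action. It uses the classic textbook perceptron learning rule rather than anything taken from the Mark I manual, and the function names, epoch count, and learning rate are my own choices for illustration. The AND truth table can be split by a single straight line; the XOR truth table cannot.

```python
def predict(params, x1, x2):
    """Single-layer perceptron: fire if the weighted sum plus bias is positive."""
    w1, w2, bias = params
    return 1 if w1 * x1 + w2 * x2 + bias > 0 else 0


def train_perceptron(samples, epochs=25, learning_rate=1.0):
    """Nudge the weights toward each misclassified point, one sample at a time."""
    w1 = w2 = bias = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            error = target - predict((w1, w2, bias), x1, x2)
            w1 += learning_rate * error * x1
            w2 += learning_rate * error * x2
            bias += learning_rate * error
    return w1, w2, bias


def accuracy(samples, params):
    """Fraction of samples the trained perceptron classifies correctly."""
    hits = sum(predict(params, x1, x2) == target for (x1, x2), target in samples)
    return hits / len(samples)


# AND is linearly separable; XOR is not.
AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_GATE = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

print("AND accuracy:", accuracy(AND_GATE, train_perceptron(AND_GATE)))
print("XOR accuracy:", accuracy(XOR_GATE, train_perceptron(XOR_GATE)))
```

The perceptron learns AND perfectly after a handful of passes, but no amount of extra training gets it to 100% on XOR, because no single straight line separates the two classes.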

Neural networks did not get significant attention in the field of machine learning until the 1980s, when advances in computational power and a renewed interest in backpropagation and connectionism spurred the re-emergence of so-called “deep” neural networks as computational tools. In the next post, I’ll talk about the differences between the early single-layer networks and these more advanced deep networks and discuss how deep learning networks are “taught.”