Multilayer Perceptrons, Old Concepts, and New Products
A multilayer perceptron (MLP) is a deep artificial neural network composed of more than one perceptron. A perceptron, in turn, is a single-neuron model that served as the precursor to many of the artificial neural networks now deployed across a wide range of technological applications. In the context of deep learning, perceptrons are most commonly used to solve supervised learning problems involving binary classification. A familiar example of such a classification problem is an email spam filter, where the algorithm is trained to choose between two specific decisions: a message is spam, or it is not.
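The single-neuron binary classifier described above can be sketched in a few lines of Python. This is a minimal illustration of the classic perceptron learning rule on toy, linearly separable data (the two-feature AND problem), not a real spam filter; all names and data here are invented for the example.

```python
# Minimal sketch of a single perceptron for binary classification,
# trained with the classic perceptron learning rule.

def train_perceptron(data, labels, epochs=20, lr=0.1):
    """Update weights and bias only on misclassified examples."""
    w = [0.0] * len(data[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            # Step activation: output 1 if the weighted sum exceeds 0
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            error = y - pred
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            b += lr * error
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Toy linearly separable data: the logical AND of two binary inputs
data = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = [0, 0, 0, 1]

w, b = train_perceptron(data, labels)
preds = [predict(w, b, x) for x in data]
print(preds)  # → [0, 0, 0, 1]
```

Because the data is linearly separable, the perceptron convergence theorem guarantees the rule finds a separating boundary; a real spam filter would use many more features, but the decision mechanism is the same.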
The history of perceptrons
Much like other concepts and ideas now prevalent in machine learning and artificial intelligence, the first mathematical model based on the functions of the neurons within the human brain was proposed decades ago, by neuroscientist Warren McCulloch and logician Walter Pitts in 1943. In their research paper, “A Logical Calculus of the Ideas Immanent in Nervous Activity,” the pair built upon ideas and concepts first set forth by English mathematician and scientist Alan Turing in his own groundbreaking paper, “On Computable Numbers.” While Turing is widely credited with bringing the concept of artificial intelligence into the mainstream, Pitts and McCulloch sought to describe the functions of the human brain in abstract terms.
More specifically, their research paper detailed the tangible means by which the properties and characteristics of the neurons that make up the human brain could be realized in an artificial medium capable of producing an extraordinary amount of computational power: what we know today as a perceptron. While the paper received little acknowledgment when it was initially published, the concept of a perceptron was put to work at the Cornell Aeronautical Laboratory in 1958, where American psychologist Frank Rosenblatt simulated his perceptron on an IBM 704 computer; the algorithm was later implemented in dedicated hardware, the Mark I Perceptron, which was used for an early form of image classification.
Big data trend
Despite the promising and innovative ideas behind the perceptron, as well as the machines built to realize those ideas commercially and scientifically, the data and computational power necessary to implement perceptrons effectively and efficiently would not arrive for several more decades. It wasn’t until the big data trend began to gather steam within artificial intelligence and machine learning around 2010 that software engineers would revisit the ideas posited by Alan Turing, Warren McCulloch, and Walter Pitts in years past. Since then, large sets of structured and unstructured data have allowed software developers to utilize artificial neural networks in a number of ways, and the perceptron forms the basis of many of these technologies.
The multilayer perceptron
With all this being said, multilayer perceptrons are among the most commonly used feed-forward artificial neural networks, as they harness the powers and capabilities of many perceptrons at once. As the name suggests, these networks are made up of multiple layers of artificial neurons. In the most basic of applications, a multilayer perceptron will contain three specific categories of layers. The first layer within such a network is the input (or visible) layer, which feeds the initial input data into the deep learning model. The next layer within the network is the hidden layer.
The hidden layer within a feed-forward artificial neural network represents the portion of the deep learning model where the artificial neurons are assigned specific weights, or parameters. An activation function is applied to the weighted inputs that have been fed through the model, and the result is passed onward. Finally, the third layer within such a network is the output layer, which produces a value in the format required by the problem being solved with the deep learning model. While the size and parameters of the layers used within a multilayer perceptron will depend on the problem a software developer is looking to solve, all such networks contain an input, hidden, and output layer.
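The three-layer structure described above can be sketched as a forward pass in plain Python. This is a minimal illustration, assuming one hidden layer, a sigmoid activation function, and hand-picked example weights (all values here are invented for illustration, not learned from data).

```python
# Minimal sketch of a multilayer perceptron forward pass:
# input layer -> hidden layer -> output layer.
import math

def sigmoid(z):
    """A common activation function squashing values into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer: each neuron applies the activation
    to its weighted sum of the inputs plus a bias."""
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def forward(x, hidden_w, hidden_b, out_w, out_b):
    h = layer(x, hidden_w, hidden_b)  # hidden layer
    return layer(h, out_w, out_b)     # output layer

# Example shapes: 2 inputs -> 2 hidden neurons -> 1 output neuron
hidden_w = [[0.5, -0.3], [0.8, 0.2]]  # one weight row per hidden neuron
hidden_b = [0.1, -0.1]
out_w = [[1.0, -1.0]]                 # one weight row for the output neuron
out_b = [0.0]

y = forward([1.0, 0.5], hidden_w, hidden_b, out_w, out_b)
print(y)  # a single value in (0, 1), usable as a binary-class score
```

In practice these weights would be learned by backpropagation rather than chosen by hand, and the output layer's activation would be matched to the problem (for example, sigmoid for binary classification), but the layer-by-layer flow is exactly the input-hidden-output structure described above.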
As advancements in the world of computing have allowed software engineers and data scientists to revisit theories that could not be fully realized when they were first proposed, multilayer perceptrons are just one example of an idea within artificial intelligence and machine learning that has become much more practical in recent years. Likewise, despite the fact that multilayer perceptrons are considered a more classical approach to neural networks compared to contemporary techniques and methods, the mathematical concepts that informed the creation of such networks still hold true today.