Monday, August 5, 2019
Neural Network Architecture Construction
Introduction

This article discusses neural network construction from a perspective different from that of conventional approaches. This approach, which will be referred to as Neural Architecture, is intended to explore the construction of neural networks using neurons as explicit building blocks rather than anonymous elements trained en masse. Simple Python programs will be used to demonstrate the concept for simple boolean logic functions.

The approach of this article is intentionally named Neural Architecture because it is meant to parallel the way in which a traditional architect systematically constructs a fine building: by developing well-known patterns of construction elements, which may be re-used to create ever more sophisticated structures.

The conventional approach to neural network development is to define a network as consisting of a few layers in a multilayer-perceptron type of topology with an input layer, an output layer, and one or two hidden layers. Then a training algorithm such as backpropagation is applied to develop the interconnection weights. Sometimes a more sophisticated approach is taken, such as using a cascade or recurrent topology, but for all intents and purposes the end result is a standard topology of a few highly connected layers. This approach was a major breakthrough in the field because it led some people to start thinking outside the box of symbolic reasoning that dominated Artificial Intelligence at the time. It has also been successfully used in a variety of pattern recognition and control applications that are not effectively handled by other AI paradigms.

However, these applications would not generally be considered to represent higher levels of intelligence or cognitive processing. For example, suppose a neural network is developed that can successfully recognize human faces under a variety of conditions.
This is a highly useful application and well within the realm of conventional neural networks. However, that is where the capability of the network leaves off: at recognizing the facial image. Aside from generalizing facial features, it can offer nothing more in terms of reasoning about those facial features. Further, it is asserted that the standard approach to neural network development is not suitable for realizing these higher levels of intelligence.

One of the fundamental problems is the limited manner in which we approach the neural architecture. To illustrate this problem, we return to the building architecture analogy. Our standard approach to neural architecture can be likened to designing a building using bricks. An architect who always thinks in terms of bricks will not likely progress beyond a certain level of sophistication, because as a component, a brick offers only one purpose: to support other bricks. Instead, an architect progressively develops more sophisticated, proven structures based on the brick (or other primitive components), which can be re-used to develop higher-level components. A house is conceived not in terms of bricks and wood, but in terms of walls, doors, and rooms. A sophisticated architect might even find these components mundane and instead think in more abstract terms of spaces, energy, and flow of human traffic.

This is the notion of patterns, and in fact these (architectural) patterns were exactly the inspiration for the field of software patterns. The same thinking can be applied to neural networks: a neuron by itself only serves the function of exciting other neurons, and conventional neural net learning algorithms are geared toward categorization or other mapping operations. As proponents of neural networks, we believe that arbitrarily complex intelligent processes can be realized, and we have the human brain as pretty convincing support of that belief.
However, to continue progress in this direction, it is likely that we have to develop more sophisticated abilities as neural architects and develop useful, proven neural patterns, similar to the way that building architects have done over time and the way software architects are now doing.

Enough philosophy. We will now take a fresh look at how patterns can be constructed starting with simple neural elements, and specifically we will start with boolean logic elements. Architecting using boolean logic does not immediately offer an advantage over using regular logic gates, but it illustrates how crisp logic or symbolic elements can arise from fuzzy neural processing elements. It will also provide a conceptual foundation for future articles.

Neural networks

Neural network: an information processing paradigm inspired by biological nervous systems, such as our brain.
Structure: a large number of highly interconnected processing elements (neurons) working together.
Like people, they learn from experience.
Neural networks are configured for a specific application, such as pattern recognition or data classification, through a learning process.
In a biological system, learning involves adjustments to the synaptic connections between neurons.

The first step in the architecture process is to define the primitive building block, and if you haven't fallen asleep at this point, you have no doubt figured out that this will be a neuron. The neuron model we will use is a version of the tried-and-true model used for software neural networks, also known as the perceptron. As illustrated in the figure, the perceptron has multiple inputs and one output.
The mathematical model of the perceptron is given by:

$$a = \mathrm{squash}\left(\sum_i i_i w_i\right)$$

where:

$i_i$ is input $i$ to the perceptron
$w_i$ is the weight for input $i$
$a$ is the activation (output)

and

$$\mathrm{squash}(x) = \begin{cases} 1 & \text{if } x > \text{threshold} \\ 0 & \text{otherwise} \end{cases}$$

The nature of the perceptron has been discussed many times elsewhere, including in Matthews, so we won't dwell on it here. But basically, the perceptron calculates a weighted sum of the inputs and then subjects it to a nonlinear squashing function; in our case, this is a simple threshold operation. The nonlinear threshold operation is part of what makes a neural net exhibit interesting behavior. Otherwise it would amount to matrix operations.

Nature of the game

Now that we have the model for a basic neuron defined, we can proceed to define basic logic gates by simply working out two things:

the weight values
the threshold

For our discussion we will assume that weights can be positive (excitatory) or negative (inhibitory) and lie in the range between -1 and 1. The threshold will also be assumed to be in the range -1 to 1.

Consider first the AND gate, whose output is on only when both inputs are on. If we cast this in terms of signals, then it equates to the requirement that both inputs have to be sufficiently high to produce an output. So, we will set our threshold to a high value of 0.8. Next we will set the weights for the two inputs at 0.5 each. If only one of the inputs is 1, then the neuron activation (output) will be given by

a = squash(1 * 0.5 + 0 * 0.5)
  = squash(0.5)
  = 0 since 0.5 < 0.8
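The threshold neuron and the two-input gate worked through above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the article's own program: the function names `squash`, `perceptron`, and `and_gate` are hypothetical, and the threshold is passed as an explicit argument rather than being implicit in `squash`. The weights (0.5 each) and the threshold (0.8) are the values given in the text.

```python
def squash(x, threshold):
    """Hard threshold: 1 if x is strictly greater than threshold, else 0."""
    return 1 if x > threshold else 0


def perceptron(inputs, weights, threshold):
    """Weighted sum of the inputs, passed through the threshold function."""
    weighted_sum = sum(i * w for i, w in zip(inputs, weights))
    return squash(weighted_sum, threshold)


# Values taken from the text: both weights 0.5, threshold 0.8.
# The names below are illustrative assumptions.
AND_WEIGHTS = (0.5, 0.5)
AND_THRESHOLD = 0.8


def and_gate(x1, x2):
    """Two-input gate that fires only when both inputs are 1."""
    return perceptron((x1, x2), AND_WEIGHTS, AND_THRESHOLD)


if __name__ == "__main__":
    # Print the full truth table for the gate.
    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2, "->", and_gate(x1, x2))
```

With a single active input the weighted sum is 0.5, which does not exceed 0.8, so the output is 0. With both inputs active the sum is 1.0, which does exceed 0.8, so the output is 1. Any pair of equal weights and a threshold lying between one weight and their sum would give the same truth table; 0.5 and 0.8 are simply the values chosen in the text.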