Object-Oriented Artificial Neural Network with C++
SSE550 Object-Oriented Programming I
Project I (Chapter 1-5)
February 13, 2012
Samuel Bixler
Table of Contents
Introduction
Basic Perceptron Theory
NeuralNet Class
• UML Diagram
• Headers
• Interface
• Implementation
Main Program
• Headers
• Instantiation
• Control Loop
Output
• Figure 4 - Options Menu
• Figure 5 - Initialize Weights
• Figure 6 - Refresh Menu
• Figure 7 - Display Weights
• Figure 8 - Input Training Set
• Figure 9 - Train Net Default Learning Rate
• Figure 10 - Train Net Learning Rate 10.0
• Figure 11 - Display Weights after Training
• Figure 12 - Test Net Boolean Inputs
• Figure 13 - Test Net Float Inputs
• Figure 14 - Weights Plotted on Input Space
• Figure 15 - Set Activation Function
• Figure 16 - Set Learning Rate
• Figure 17 - Exit
Conclusion
Index of Topics Covered
Chapter 1 - Introduction to Computers and C++
Chapter 2 - Introduction to C++ Programming
• Compiler directives
• The main function
• Input statements
• Output statements
• Stream insertion operator
• Escape sequences
• Return statement
• Variable declarations
• Fundamental types
• Identifiers
• Memory
• Arithmetic
• Operator precedence
• Relational operators
Chapter 3 - Introduction to Classes, Objects and Strings
• User-defined classes
• Creating and using objects
• Declaring data members
• Defining member functions
• Calling member functions
• Passing data as arguments
• Local variables vs. data members
• Initial values via constructor
• Separating interface from implementation
• UML class diagrams
• Data member set methods
Chapter 4
• Constructing an algorithm in pseudocode
• Selection statements
• Repetition statements
• Assignment operators
Chapter 5
• More control statements
• Logical operators
Introduction
This project explores the application of object-oriented programming techniques to the
construction of a single-neuron artificial neural network (ANN). The framework that was
constructed is designed in a scalable way so that it will be useful for representing more
complex networks. The focus of the project was the construction of an easy-to-use
NeuralNet class with member functions to perform common ANN operations. The class
can be used to create and manipulate complex network architectures, which would be
useful for real world applications. This paper will first address the theory of operation of
a single neuron ANN, followed by the implementation of the NeuralNet class and then
present results of using the class to perform the AND logical operation.
Basic Perceptron Theory
Artificial Neural Networks are mathematical models of biological neurons, typically
used to perform functions that are not easily achieved using traditional
algorithms. Pattern recognition and classification are two tasks that neural networks are
especially well suited for.
Figure 1 - Two Input/Single Output Neural Network
The simplest example of an ANN is the Rosenblatt Perceptron; it is the name given to a
single neuron ANN and the algorithm used to train it. Figure 1 is a graphical
representation of the functions and data that make up the Rosenblatt Perceptron. It is the
model that will be explored using the NeuralNet class in this project. The mathematical
neuron's functions are similar to a biological neuron's. Inputs, either from the environment
(user) or from other neurons (hidden layers), are weighted and summed in the body of the
neuron, and if a certain threshold, called the activation potential, is reached, the output changes. The
function that maps the weighted sum of the input(s) to the output(s) is called the
activation function. There are several functions that can be used for this step depending
on the specific network architecture and data that is used. The Rosenblatt Perceptron can
also be viewed mathematically as a line in 2D "input space" that is adjusted to divide the
inputs based on which class they belong to. In the general case with n inputs, these
weights represent an n-dimensional hyperplane that is able to perfectly classify any
linearly separable sets of inputs. Unfortunately, the Rosenblatt Perceptron performs very
poorly at classifying inputs that are not linearly separable, and more advanced networks
and training algorithms are needed for such problems.
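The computation just described, weighting the inputs, summing them, and thresholding the result, can be sketched in a few lines of C++. This is a simplified stand-alone illustration with hypothetical names, not the NeuralNet class code presented later:

```cpp
#include <cassert>
#include <vector>

// Hard-threshold activation: output switches from 0 to 1 once the
// weighted sum reaches the activation potential (here taken as 0).
float threshold(float x) {
    return x >= 0.0f ? 1.0f : 0.0f;
}

// Single-neuron forward pass. weights[0] is the bias weight, which
// multiplies a constant bias input of 1; the remaining weights pair
// up with the user-supplied inputs.
float neuronOutput(const std::vector<float>& weights,
                   const std::vector<float>& inputs) {
    float sum = weights[0];                       // bias term (input of 1)
    for (std::size_t i = 0; i < inputs.size(); ++i)
        sum += weights[i + 1] * inputs[i];
    return threshold(sum);
}
```

With weights {-1.5, 1, 1}, for example, the neuron fires only when both inputs are 1, which is exactly the AND behaviour trained later in this report.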
Figure 2 - Two Input Decision Boundary
To understand exactly how the Rosenblatt Perceptron is able to classify inputs it is
helpful to graph the line on the input space. If the input lies above the line, it belongs to
one class and to the other if it lies below. There are several ways to train a neural
network, but the method which this project uses is called supervised training. A training
set of inputs and the correct outputs is shown to the perceptron and the weights are
modified according to a learning rule which will be discussed later.
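In outline, that learning rule moves each weight in proportion to the output error and the corresponding input. A minimal sketch of one update step follows; the helper name is hypothetical and plain std::vector stands in for the Eigen types used later:

```cpp
#include <cassert>
#include <vector>

// One perceptron weight update: w_i <- w_i + learnRate * error * x_i,
// where error = desired output - actual output. inputs[0] is the
// constant bias input of 1, paired with the bias weight weights[0].
void updateWeights(std::vector<float>& weights,
                   const std::vector<float>& inputs,
                   float error, float learnRate) {
    for (std::size_t i = 0; i < weights.size(); ++i)
        weights[i] += learnRate * error * inputs[i];
}
```

When the error is zero the weights are left untouched, so a correctly trained network is a fixed point of the rule.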
NeuralNet Class
The NeuralNet class is composed of a set of private data members that store the
architecture parameters which are required to initialize and train the network. This class
contains methods to initialize, train and test the network as well as several mutators, and
a function required by the learning algorithm. A UML diagram of the NeuralNet class is
shown below.
Figure 3 - UML Diagram NeuralNet Class
NeuralNet
-numInput: int
-numHidden: int
-numOutput: int
-numTrainSets: int
-activationSelect: int
-learnRate: float
-sigmoidCoef: float
-weightMatrix: Eigen::MatrixXf
-trainingInputs: Eigen::MatrixXf
-trainingOutputs: Eigen::MatrixXf
-testInputs: Eigen::VectorXf

<<constructor>> +NeuralNet()
+refreshScreen(): void
+initializeWeights(): void
+displayWeights(): void
+inputTrainSet(): void
+trainNet(): void
+testNet(): void
+setActivationFunction(): void
+setLearningRate(): void
+activationFunction(float): float
Interface
#include <Eigen/Dense>

class NeuralNet
{
public:
   NeuralNet();
   void refreshScreen();
   void initializeWeights();
   void displayWeights();
   void inputTrainSet();
   void trainNet();
   void testNet();
   void setActivationFunction();
   void setLearningRate();
   float activationFunction(float);

private:
   // Private data
   //Network architecture parameters
   int numInput, numHidden, numOutput, numTrainSets;
   int activationSelect;
   float learnRate, sigmoidCoef;

   //Eigen matrices and vectors
   Eigen::MatrixXf weightMatrix;
   Eigen::MatrixXf trainingInputs;
   Eigen::MatrixXf trainingOutputs;
   Eigen::VectorXf testInputs;
};
The contents of the header file NeuralNet.h, where the interface of the NeuralNet class is
defined, are shown above. The implementation is in the NeuralNet.cpp file, and will be
covered piece by piece in the next section. The data members numInput, numHidden and
numOutput are integers that are used during the instantiation of a NeuralNet object to
specify the architecture of the network. These parameters determine the size of the
weights matrix and also are used to control for loops that initialize weights and train the
network. The numHidden parameter is not utilized in this project, but it is included for
flexibility. Its purpose is to specify the number of hidden layers of neurons in the
network. In the single-neuron case there are no hidden layers, only an input layer and an
output layer. The learnRate floating point parameter is used in the training algorithm to vary the
amount of adjustment that is made to the weight matrix after each training iteration. For
the remaining data members, data types from the Eigen matrix library were used. Eigen is
an open source template library that provides the capability to easily create, manipulate
and display matrices and vectors. The weightMatrix data member is a dynamically
allocated single precision floating point matrix that is used to store the neural network
weights.
Headers
//NeuralNet.cpp
#include <iostream>
#include "NeuralNet.h"
#include <cmath>
#include "stdlib.h"
using namespace std;
NeuralNet.cpp uses the #include compiler directive to include several required external
libraries. The iostream header is included to provide access to the system input/output.
The NeuralNet.h header needs to be included since it contains the NeuralNet class
interface definitions as well as the member function prototypes. The cmath library
implements the exponential function exp() which is required by the activationFunction
method to generate the sigmoid and hyperbolic tangent outputs. The stdlib.h header is
included so that system("cls") can be used to clear the console in the refreshScreen
method.
Implementation
NeuralNet::NeuralNet()
{
   //Default parameters
   numInput = 2;
   numHidden = 0;    //no hidden layers in this project
   numOutput = 1;
   numTrainSets = 4;
   learnRate = 0.1;
   activationSelect = 1;
   sigmoidCoef = 4.0;

   //Matrix and vector sizing
   weightMatrix.resize(numInput+1, numOutput);
   trainingInputs.resize(numTrainSets, numInput+1);
   trainingOutputs.resize(numTrainSets, numOutput);
   testInputs.resize(numInput+1);

   //Define training set for AND function (Default)
   trainingInputs << 1, 0, 0,
                     1, 0, 1,
                     1, 1, 0,
                     1, 1, 1;

   trainingOutputs << 0, 0, 0, 1;
}
The NeuralNet class has a default constructor that initializes the private data
members with the values seen in the code above. The constructor also resizes the
matrices and vectors based on those defaults. This piece of code will eventually need to
be moved once functionality is added that allows the user to define the network
architecture. The trainingInputs and trainingOutputs matrices are populated with the
appropriate data to teach the perceptron the logical AND function.
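The layout of that default training set is easier to see written out in full: each row of trainingInputs carries a leading constant 1 that multiplies the bias weight, followed by the two logical inputs. A sketch with plain arrays (the class itself stores the same data in Eigen matrices; the array names here are hypothetical):

```cpp
#include <cassert>

// AND truth table augmented with a bias column of 1s.
// Columns: bias input, input A, input B. Row i pairs with andOutputs[i].
const float andInputs[4][3] = {
    {1, 0, 0},
    {1, 0, 1},
    {1, 1, 0},
    {1, 1, 1},
};
const float andOutputs[4] = {0, 0, 0, 1};
```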
void NeuralNet::refreshScreen()
{
   system("cls");
   cout << endl;
   cout << " [1] Main menu" << endl;
   cout << " [2] Initialize weight matrix " << endl;
   cout << " [3] Display weights matrix " << endl;
   cout << " [4] Input training set " << endl;
   cout << " [5] Train network " << endl;
   cout << " [6] Test the net " << endl;
   cout << " [7] Set activation function " << endl;
   cout << " [8] Set learning rate " << endl;
   cout << " [9] Exit " << endl;
}
The refreshScreen method clears the console output using system("cls") mentioned in the
headers section, and refreshes the main menu options using the stream insertion operator
to send output to the cout stream. This method can be called by the user from the main
program if the screen is cluttered or the user needs to know what the options are.
void NeuralNet::initializeWeights()
{
   //Initialize bias weight to +1
   for ( int b = 0; b < numOutput; b++ )
      weightMatrix(0,b) = 1;

   //Initialize weights to random values (-1)-(+1) with mean 0
   for ( int out = 0; out < numOutput; out++ )
   {
      for ( int in = 1; in <= numInput; in++ )
         weightMatrix(in,out) = (float)rand()/(float)RAND_MAX*2 - 1;
   }
   cout << " The weight matrix has been initialized with random values.\n";
}
The initializeWeights method uses two for loops to initialize the synaptic weights to
pseudorandom numbers uniformly distributed between -1 and +1, with a mean of 0. The
C++ standard library includes a random number generator function rand(), but because its
return type is integer, the initializeWeights method rescales the result: the cast operator
converts it to float, dividing by RAND_MAX and multiplying by 2 maps it onto [0, 2], and
an offset of -1 centers it about zero. In order for the Rosenblatt Perceptron to properly
classify inputs which may not be centered about the origin (of the input space), a bias is
used to shift the decision boundary up or down. The bias weight is part of the weight
matrix and is initialized to +1.
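The rand() rescaling described above can be checked in isolation. The sketch below repeats the same expression outside the class (the function name is hypothetical):

```cpp
#include <cassert>
#include <cstdlib>

// Map rand()'s integer range [0, RAND_MAX] onto the real interval
// [-1, +1]: cast to float, scale onto [0, 2], then offset by -1.
float randomWeight() {
    return static_cast<float>(std::rand())
         / static_cast<float>(RAND_MAX) * 2.0f - 1.0f;
}
```

Every value produced this way lands in [-1, +1]; in modern C++ the same effect is usually obtained with std::uniform_real_distribution from the <random> header.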
void NeuralNet::displayWeights()
{
   cout << " The weights are: \n\n";
   cout << fixed;
   cout.precision(2);
   for (int r = 0; r <= numInput; r++)
   {
      for (int c = 0; c < numOutput; c++)
      {
         if ( weightMatrix(r,c) < 0 )
            cout << " " << fixed << weightMatrix(r,c);
         else
            cout << "  " << fixed << weightMatrix(r,c);
      }
      cout << endl;
   }
}
It is interesting to see what is actually being stored in the weights matrix. To do this, the
displayWeights method was created. The format and precision of the output is set to fixed
and two decimal places, then two for loops print each value in the weights matrix. The
insertion operator could have been used to directly print the values in weightMatrix, as the
Eigen::MatrixXf type has this capability, but because the results can be either negative
or positive, an if statement was included that keeps the decimal points lined up for
readability.
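As an aside, the same column alignment can be achieved without the if/else by padding every value to a fixed field width. A sketch using std::setw (an alternative approach, not the code used in displayWeights):

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Format a weight to two fixed decimal places in a six-character
// field, so a leading minus sign no longer shifts the columns.
std::string formatWeight(float w) {
    std::ostringstream out;
    out << std::fixed << std::setprecision(2) << std::setw(6) << w;
    return out.str();
}
```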
Time did not permit implementation of the inputTrainSet method. It was planned, but not
a top priority. The method should allow the user to input the training data either from a
file or by entering it manually. At this point, if the method is called, it prints a message
informing the user that the functionality has not yet been implemented.
void NeuralNet::trainNet()
{
   //Local variables
   float activation, product, error;
   int epoch = 0, sumMisclass;

   //Loop until the neural network doesn't misclassify any of the training inputs
   do
   {
      //Initialize training metrics variables
      epoch++;
      sumMisclass = 0;

      //Calculate output and error for each set of training inputs
      for (int i = 0; i < numTrainSets; i++)
      {
         //Calculate error
         product = trainingInputs.row(i).dot(weightMatrix.transpose().row(0));
         activation = activationFunction(product);
         error = trainingOutputs(i,0) - activation;

         //Update weight matrix
         weightMatrix += trainingInputs.row(i).transpose()*learnRate*error;

         //Sum misclassified inputs
         if ( error != 0.0 )
            sumMisclass++;
      }
      cout << " " << sumMisclass << " misclassified inputs for epoch "
           << epoch << endl;
   } while (sumMisclass > 0);
   cout << " The network has finished training.\n";
}
The trainNet method is the most complex segment of code in the NeuralNet class. It
executes the Rosenblatt Perceptron training algorithm to teach the neuron the AND
operator. The method executes a supervised training algorithm that manipulates the
weights matrix using the trainingInputs and trainingOutputs data. The numTrainSets
integer variable is used to control looping in the algorithm. Local variables store
intermediate values (float activation, product), the output error (float error), the epoch
number (int epoch) and the sum of the misclassified inputs for a given epoch (int
sumMisclass).
The pseudocode algorithm is:
• Set the epoch count to 0.
• While the number of misclassified inputs is greater than 0:
o Increment the epoch count by 1.
o Set the misclassified-input count to zero.
o For every input in the training set:
   Compute the weighted sum of the inputs using the current weights.
   Compute the hardlimited output of the weighted sum.
   Compute the error by taking the difference of the training output and the
   neuron output.
   Update the weight matrix using the training rule: the new weight equals the
   sum of the old weight and the product of the learning rate, the current
   input and the error.
   If the error is not equal to 0, increment the misclassified-input count by 1.
o Display the epoch number and the number of misclassified inputs.
• Return to the calling function.
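The steps above can be exercised outside the class with a self-contained sketch that trains on the same AND truth table. It uses plain std::vector instead of Eigen, and a fixed starting weight vector stands in for the random initialization (illustrative only, not the trainNet implementation):

```cpp
#include <cassert>
#include <vector>

// Hard-limited neuron output.
float hardLimit(float x) { return x >= 0.0f ? 1.0f : 0.0f; }

// Train a two-input perceptron on AND and return the learned
// weights {bias, w1, w2}. Column 0 of each training row is the
// constant bias input of 1.
std::vector<float> trainAnd(float learnRate) {
    const std::vector<std::vector<float>> in{
        {1, 0, 0}, {1, 0, 1}, {1, 1, 0}, {1, 1, 1}};
    const std::vector<float> target{0, 0, 0, 1};
    std::vector<float> w{1.0f, 0.25f, -0.5f};   // fixed "random" start

    int misclassified;
    do {                                        // one pass = one epoch
        misclassified = 0;
        for (std::size_t i = 0; i < in.size(); ++i) {
            float sum = 0.0f;
            for (std::size_t j = 0; j < w.size(); ++j)
                sum += w[j] * in[i][j];
            float error = target[i] - hardLimit(sum);
            for (std::size_t j = 0; j < w.size(); ++j)
                w[j] += learnRate * error * in[i][j];  // training rule
            if (error != 0.0f) ++misclassified;
        }
    } while (misclassified > 0);                // stop at a perfect epoch
    return w;
}
```

Because AND is linearly separable, the perceptron convergence theorem guarantees that this loop terminates for any positive learning rate.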
float NeuralNet::activationFunction(float x)
{
   float result;
   switch (activationSelect)
   {
      case 1: //Threshold function
         if (x >= 0)
            result = 1;
         else
            result = 0;
         break;
      case 2: //Sigmoid function
         result = 1/( 1 + exp(-sigmoidCoef*x) );
         break;
      case 3: //Hyperbolic tangent function
         result = (exp(x)-exp(-x))/(exp(x)+exp(-x));
         break;
      default: //Unknown selection: fall back to the threshold
         result = (x >= 0) ? 1 : 0;
   }
   return result;
}
The activationFunction method by default performs a threshold operation on its floating
point input and returns a floating point result. Several other functions, such as the
sigmoid and hyperbolic tangent, can also be selected as the activation function, but the
hard threshold is the one the Rosenblatt Perceptron training rule is designed around.
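The candidate activation functions can be written as free functions to compare their behaviour side by side (a sketch with hypothetical names; the class implements them inside one switch statement):

```cpp
#include <cassert>
#include <cmath>

// Hard threshold: jumps from 0 to 1 at x = 0.
float thresholdFn(float x) { return x >= 0.0f ? 1.0f : 0.0f; }

// Logistic sigmoid with steepness coefficient a: smooth curve from 0 to 1.
float sigmoidFn(float x, float a) { return 1.0f / (1.0f + std::exp(-a * x)); }

// Hyperbolic tangent: smooth curve from -1 to +1 (equivalent to std::tanh).
float tanhFn(float x) {
    return (std::exp(x) - std::exp(-x)) / (std::exp(x) + std::exp(-x));
}
```

All three agree that a large positive weighted sum means "fire"; they differ in smoothness and output range, which is why the smooth variants matter for the gradient-based training used by more advanced networks.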
void NeuralNet::testNet()
{
   float activation, product; //Local variables

   //Set bias input
   testInputs(0) = 1;

   //Loop to fill the test input vector with user values
   for (int i = 1; i < testInputs.rows(); i++)
   {
      cout << " Enter input " << i << ": ";
      cin >> testInputs(i);
   }

   //Compute the neuron's output given the test inputs
   product = testInputs.dot(weightMatrix.transpose().row(0));
   activation = activationFunction(product);
   cout << "\n Given the inputs you entered,\n the Rosenblatt Perceptron ";
   cout << " says the correct answer is: " << activation << endl;
}
The testNet method fills an Eigen VectorXf variable with a user-specified set of inputs. It
then shows the perceptron the input set and computes the intermediate product and
activation value using the weight matrix produced by the trainNet method. The result is
sent to the console.
The NeuralNet class contains two set methods that give the user the option to change the
learning rate and the activation function.
void NeuralNet::setActivationFunction()
{
   int activationSelectTemp;

   //Activation function selection menu
   cout << " [1] Threshold" << endl;
   cout << " [2] Sigmoid" << endl;
   cout << " [3] Hyperbolic Tangent" << endl;
   cout << " Select an activation function: ";
   cin >> activationSelectTemp;

   switch (activationSelectTemp)
   {
      case 1:
         activationSelect = activationSelectTemp;
         cout << "\n The threshold function has been selected.\n";
         break;
      case 2:
         activationSelect = activationSelectTemp;
         cout << "\n The sigmoid function has been selected.\n";
         cout << "\n Enter the exponential coefficient (positive real): ";
         cin >> sigmoidCoef;
         if (sigmoidCoef > 0.0)
            cout << "\n The coefficient has been set to: " << sigmoidCoef << endl;
         else
         {
            sigmoidCoef = 4.0;
            cout << "\n Invalid entry!";
            cout << "\n The coefficient has been set to the default (4.0)\n";
         }
         break;
      case 3:
         activationSelect = activationSelectTemp;
         cout << "\n The hyperbolic tangent function has been selected.\n";
         break;
      default:
         activationSelect = 1;
         cout << "\n Invalid entry!";
         cout << "\n The activation function has been set to the default (Threshold).\n";
         break;
   }
}
void NeuralNet::setLearningRate()
{
   //Set a new learning rate
   cout << " Enter the new learning rate (positive real): ";
   cin >> learnRate;
   if ( learnRate > 0.0 )
   {
      cout << " The learning rate has been set to " << learnRate << endl;
   }
   else
   {
      learnRate = 0.1;
      cout << " Invalid entry!\n";
      cout << " The learning rate has been set to the default (0.1)\n";
   }
}
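Both set methods follow the same validate-or-fall-back pattern; it could be factored into a small helper such as the hypothetical one below (a refactoring sketch, not part of the class):

```cpp
#include <cassert>

// Return the candidate if it is strictly positive, otherwise fall
// back to the supplied default, mirroring the checks in the two
// set methods above.
float positiveOrDefault(float candidate, float fallback) {
    return candidate > 0.0f ? candidate : fallback;
}
```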
Main Program
The main.cpp file begins with #include directives to access the required libraries. The
iostream header is included to provide input/output and formatting capabilities to the
program. The ctime library is required to generate a seed for the rand function. The
compiler is informed that the std namespace is being used.

#include <iostream>
#include <ctime>
#include "NeuralNet.h"
using namespace std;
The main function takes no arguments and returns an integer status code to the operating
system, as the C++ standard requires; falling off the end of main returns zero, indicating
success. The main program begins by seeding the pseudo-random number generator with
the current system time by executing the srand() function. The next step is creating an
instance of the NeuralNet class called myNet. After the myNet object is created, a call to
the refreshScreen method clears the screen and displays the options to the user. The
option variable holds the user's choice and is used as the switch variable in the option
selection case statement. The boolean exit is used to exit the do-while loop (and the
program) if the user chooses to do so.

int main()
{
   //Seed the random number generator
   srand((unsigned)time(0));

   //Create a NeuralNet object
   NeuralNet myNet;

   //Clear the console and display the options
   myNet.refreshScreen();

   int option;
   bool exit = false;
Execution now enters a do-while loop and prompts the user to select an option. The input
is saved in the option variable, and an if statement tests whether the entry is a value
between 1 and 9, the range of valid options.
   do
   {
      //User interface
      cout << "\n Enter your selection: ";
      cin >> option;

      //Invalid input check
      if ( ( option > 0 ) && ( option < 10 ) )
      {
         cout << endl;

         //Menu choice selection switch
         switch (option)
         {
            case 1:
               myNet.refreshScreen();
               break;
            case 2:
               myNet.initializeWeights();
               break;
            case 3:
               myNet.displayWeights();
               break;
            case 4:
               myNet.inputTrainSet();
               break;
            case 5:
               myNet.trainNet();
               break;
            case 6:
               myNet.testNet();
               break;
            case 7:
               myNet.setActivationFunction();
               break;
            case 8:
               myNet.setLearningRate();
               break;
            case 9:
               exit = true;
               break;
            default:
               cout << " Invalid input! Please enter an option (1 - 9):\n";
         }
      }
      else
      {
         cout << " Invalid entry!\n";
         cout << " Enter a number corresponding to one of the 9 options.";
      }
   } while (exit == false);
}
NeuralNet methods are called based on the user input using a switch control statement. If
the user chooses option 9, exit is set to true; when the do-while condition is tested, the
loop is exited and the program returns. If invalid data is entered into option, the else
block is executed and a message is displayed to inform the user.
Output
The following pages show screenshots of the program's response to user inputs and
demonstrate its capability to learn the AND function.
Figure 4 - Options Menu
Figure 5 - Initialize Weights
Figure 6 - Refresh Menu
Figure 7 - Display Weights
Figure 8 - Input Training Set
Figure 9 - Train Net, Default Learning Rate (0.1)
Figure 10 - Train Net, Learning Rate Set to (10.0)
Figure 11 - Display Weights After Training
Figure 12 - Test Net, Boolean Inputs
Figure 13 - Test Net, Float Inputs
The final set of screenshots shows the perceptron's response to inputs that are not 0 or 1;
Figure 13 shows several examples of this. It demonstrates that even though the
hyperplane is trained to separate the 4 inputs shown to it in the training set, it is only
finding one of the infinitely many solutions to the problem. The results that the neural
network generates are fuzzy, and the neuron only learns as much as it needs to in order to
meet the learning criterion.
Figure 14 - Weights Graphed in Input Space
The figure above shows the training set of inputs along with the untrained and trained
decision boundaries plotted on the 2D input space. The plot was produced from the
weight matrix values observed in the program's output.
Figure 15 - Set Activation Function
Figure 16 - Set Learning Rate
Figure 17 - Exit
Conclusion
The goal of this project was to design an artificial neural network class using object-
oriented C++ techniques and to verify the NeuralNet class's interface and implementation
by creating and testing the Rosenblatt Perceptron case. As the results indicate, this was
successfully accomplished. The class is very simple at this point and would need much
more work to classify non-linearly separable patterns and to utilize the more advanced
activation functions. The program has only minimal user input validation and exception
handling, which is something that would need to be improved in the future.