Training and Testing Neural Networks
Production Information Systems Laboratory, Department of Industrial Engineering, Seoul National University
이상진
Contents
• Introduction
• When Is the Neural Network Trained?
• Controlling the Training Process with Learning
Parameters
• Iterative Development Process
• Avoiding Over-training
• Automating the Process
Introduction (1)
• Training a neural network to perform a specific processing function
  1) Which parameters are involved?
  2) How are the parameters used to control the training process?
  3) How does management of the training data affect the training process?
• Development process
  1) Data preparation
  2) Selection of the neural network model and architecture
  3) Training the neural network
• The function a network performs is determined by its structure; once "trained," the network can be applied.
Introduction (2)
• Learning parameters for neural networks
• A disciplined approach to iterative neural network development
Introduction (3)
When Is the Neural Network Trained?
• When is the network trained? It depends on:
  – the type of neural network
  – the function being performed
    • classification
    • clustering data
    • building a model or time-series forecast
  – the acceptance criteria
    • the network meets the specified accuracy
• Once trained, the connection weights are "locked"
  – they cannot be adjusted
When Is the Neural Network Trained? Classification (1)
• Measure of success: percentage of correct classifications
  – incorrect classifications
  – no classification: unknown, undecided
• Threshold limit on the output activation
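The threshold-limit idea above can be sketched as follows. This is a minimal illustration, not from the original slides: `classify_with_threshold` and the 0.5 cutoff are assumed names and values.

```python
import numpy as np

def classify_with_threshold(outputs, threshold=0.5):
    """Return the index of the winning output unit, or None ("unknown")
    when no unit's activation reaches the threshold limit."""
    winner = int(np.argmax(outputs))
    return winner if outputs[winner] >= threshold else None

# A confident pattern is assigned to a category...
print(classify_with_threshold(np.array([0.1, 0.8, 0.1])))    # -> 1
# ...while an ambiguous one is reported as unknown/undecided.
print(classify_with_threshold(np.array([0.4, 0.35, 0.25])))  # -> None
```

Raising the threshold trades fewer incorrect classifications for more "undecided" outputs.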
When Is the Neural Network Trained? Classification (2)
              Category A   Category B   Category C
Category A       0.60         0.25         0.15
Category B       0.25         0.45         0.30
Category C       0.15         0.30         0.55

• Confusion matrix: lists the possible output categories and the corresponding percentages of correct and incorrect classifications
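A confusion matrix like the one above can be computed directly from actual and predicted category labels. A minimal sketch (the function name and the sample labels are illustrative assumptions):

```python
import numpy as np

def confusion_matrix(actual, predicted, n_categories):
    """Rows = actual category, columns = assigned category; each row is
    normalized so its entries are fractions of that category's patterns."""
    counts = np.zeros((n_categories, n_categories))
    for a, p in zip(actual, predicted):
        counts[a, p] += 1
    return counts / counts.sum(axis=1, keepdims=True)

actual    = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
predicted = [0, 0, 0, 1, 1, 1, 2, 2, 2, 0]
cm = confusion_matrix(actual, predicted, 3)
print(cm)  # the diagonal holds the correct-classification rates
```

Off-diagonal entries show which categories the network confuses with each other, which is more informative than a single accuracy figure.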
When Is the Neural Network Trained? Clustering (1)
• The output of a clustering network is open to analysis by the user
• The training regimen is determined by:
  – the number of times the data is presented to the neural network
  – how fast the learning rate and the neighborhood decay
• Adaptive resonance theory (ART) network training
  – vigilance training parameter
  – learn rate
When Is the Neural Network Trained? Clustering (2)
• Locking the ART network weights
  – disadvantage: gives up online learning
• ART networks are sensitive to the order of the training data
When Is the Neural Network Trained? Modeling (1)
• Modeling or regression problems
• Usual error measure
  – RMS (root mean square) error
• Measures of prediction accuracy
  – average error
  – MSE (mean squared error)
  – RMS (root mean square) error
• Expected behavior
  – the RMS error is very high at first, then gradually settles to a stable minimum
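The RMS error tracked during training is just the square root of the mean squared difference between network output and desired output. A minimal sketch (function name and sample values are illustrative):

```python
import numpy as np

def rms_error(predicted, desired):
    """Root-mean-square error over a set of training patterns."""
    predicted, desired = np.asarray(predicted), np.asarray(desired)
    return float(np.sqrt(np.mean((predicted - desired) ** 2)))

# Evaluated once per epoch, this is the curve expected to fall toward
# a stable minimum as training proceeds.
print(rms_error([0.9, 0.2, 0.4], [1.0, 0.0, 0.5]))  # ≈ 0.1414
```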
When Is the Neural Network Trained? Modeling (2)
When Is the Neural Network Trained? Modeling (3)
• If the error does not stabilize
  – the network may have fallen into a local minimum
    • the prediction error doesn't fall
    • it oscillates up and down
  – remedies
    • reset (randomize) the weights and start again
    • adjust the training parameters
    • change the data representation
    • change the model architecture
When Is the Neural Network Trained?
Forecasting (1)
• Forecasting
  – a prediction problem
  – RMS (root mean square) error
  – visualize: time plot of the actual and desired network output
• Time-series forecasting
  – long-term trend
    • influenced by cyclical factors, etc.
  – random component
    • variability and uncertainty
  – neural networks are excellent tools for modeling complex time-series problems
• Recurrent neural networks: nonlinear dynamic systems
  – no self-feedback loops & no hidden neurons
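Before a feed-forward network can forecast a time series, the series is usually cut into sliding windows of past values paired with the next value to predict. This preprocessing step is an assumption added here for illustration, not part of the original slides:

```python
def make_windows(series, window):
    """Split a time series into (input window, next value) training pairs."""
    pairs = []
    for t in range(len(series) - window):
        pairs.append((series[t:t + window], series[t + window]))
    return pairs

series = [1, 2, 3, 4, 5, 6]
for x, y in make_windows(series, window=3):
    print(x, "->", y)
# [1, 2, 3] -> 4
# [2, 3, 4] -> 5
# [3, 4, 5] -> 6
```

The window length controls how much history the network sees; too short a window can hide the cyclical factors mentioned above.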
When Is the Neural Network Trained?
Forecasting (2)
Controlling the Training Process with Learning Parameters (1)
• The learning parameters depend on
  – the type of learning algorithm
  – the type of neural network
Controlling the Training Process with Learning Parameters (2)
- Supervised training
[Diagram: an input Pattern feeds the Neural Network; its Prediction is compared against the Desired Output]
1) How the error is computed
2) How big a step we take when adjusting the connection weights
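Point 2) above is the familiar gradient-descent step: the learning rate scales how far the weights move against the error gradient. A minimal sketch (names and values are illustrative assumptions):

```python
import numpy as np

def sgd_step(weights, grad, learning_rate):
    """One supervised-training update: move the weights a step of size
    `learning_rate` against the error gradient for the current pattern."""
    return weights - learning_rate * grad

w = np.array([0.5, -0.3])
grad = np.array([0.2, -0.1])   # d(error)/d(weights) for the current pattern
w = sgd_step(w, grad, learning_rate=0.1)
print(w)  # close to [0.48, -0.29]
```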
Controlling the Training Process with Learning Parameters (3)
- Supervised training
• Learning rate
  – the magnitude of the change when adjusting the connection weights toward the desired output for the current training pattern
• Too large a rate
  – giant oscillations in the weights
• Too small a rate
  – takes long to learn the major features of the problem
    • but tends to generalize better to new patterns
Controlling the Training Process with Learning Parameters (4)
- Supervised training
• Momentum
  – filters out high-frequency changes in the weight values
  – prevents oscillation around a set of values
  – past errors keep influencing the updates for a long time
• Error tolerance
  – how close is close enough
  – 0.1 in many cases
  – why it is needed
    • driving an output all the way to 0 or 1 would require a very large net input
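The momentum idea above can be sketched as a velocity term that low-pass-filters the gradient steps; the function name and constants here are illustrative assumptions:

```python
import numpy as np

def momentum_step(weights, grad, velocity, lr=0.1, momentum=0.9):
    """Momentum update: the velocity is a running, filtered version of
    the gradient steps, which damps high-frequency oscillation."""
    velocity = momentum * velocity - lr * grad
    return weights + velocity, velocity

w, v = np.array([0.5]), np.zeros(1)
# Oscillating gradients largely cancel out inside the velocity term.
for grad in [np.array([1.0]), np.array([-1.0]), np.array([1.0])]:
    w, v = momentum_step(w, grad, v)
print(w, v)
```

With momentum = 0, this reduces to the plain learning-rate update.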
Controlling the Training Process with Learning Parameters (5)
-Unsupervised learning
• Parameters
  – selection of the number of outputs
    • granularity of the segmentation (clustering, segmentation)
  – learning parameters (once the architecture is set)
    • neighborhood parameter: Kohonen maps
    • vigilance parameter: ART
Controlling the Training Process with Learning Parameters (6)
-Unsupervised learning
• Neighborhood
  – the area around the winning unit where the non-winning units will also be modified
  – roughly half the size of the maximum dimension of the output layer
  – two methods for controlling it
    • square neighborhood function, linear decrease in the learning rate
    • Gaussian-shaped neighborhood, exponential decay of the learning rate
  – the number-of-epochs parameter
  – important in keeping the locality of the topographic maps
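The second controlling method above, a Gaussian-shaped neighborhood with exponential decay, can be sketched as follows; the function name and the decay schedule are illustrative assumptions:

```python
import numpy as np

def gaussian_neighborhood(dist, epoch, n_epochs, sigma0):
    """Gaussian-shaped neighborhood around the winning unit; its width
    (and hence the update strength) decays exponentially over epochs."""
    sigma = sigma0 * np.exp(-epoch / n_epochs)
    return float(np.exp(-dist ** 2 / (2 * sigma ** 2)))

# The same non-winning unit, two grid steps from the winner, receives a
# much smaller update late in training than early on.
h_early = gaussian_neighborhood(dist=2, epoch=0,  n_epochs=100, sigma0=3.0)
h_late  = gaussian_neighborhood(dist=2, epoch=99, n_epochs=100, sigma0=3.0)
print(h_early, h_late)
```

Shrinking the neighborhood is what preserves the locality of the topographic map: early epochs order the map globally, later epochs fine-tune each unit.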
Controlling the Training Process with Learning Parameters (7)
-Unsupervised learning
• Vigilance
  – controls how picky the neural network is going to be when clustering data
  – how discriminating it is when evaluating the differences between two patterns
  – defines "close enough"
  – too high a vigilance
    • uses up all of the output units
Iterative Development Process (1)
• Network convergence issues
  – the error falls quickly and then stays flat / reaches the global minimum
  – the error oscillates up and down / the network is trapped in a local minimum
  – remedies
    • add some random noise to the weights
    • reset the network weights and start all over again
    • revisit the design decisions
Iterative Development Process (2)
Iterative Development Process (3)
• Model selection
  – an inappropriate neural network model for the function to perform
  – add hidden units or another layer of hidden units
  – a strong temporal or time element embedded in the data
    • recurrent back propagation
    • radial basis function network
• Data representation
  – a key parameter is not scaled or coded appropriately
  – a key parameter is missing from the training data
  – experience
Iterative Development Process (4)
• Model architecture
  – does not converge: the function is too complex for the architecture
  – a few additional hidden units: good
  – adding many more?
    • the network just memorizes the training patterns
  – keeping the hidden layers as thin as possible gives the best results
Avoiding Over-training
• Over-training
  – the same patterns are trained on over and over
  – the network memorizes them and cannot generalize
  – it handles new patterns poorly
  – remedy: switch between training and testing data to monitor generalization
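Switching between training and testing data leads naturally to early stopping: halt when the test-set error stops improving, even though the training error is still falling. A minimal sketch (the function name, `patience` parameter, and sample error curves are illustrative assumptions):

```python
def early_stopping_epoch(test_error, patience=2):
    """Return (epoch, error) of the best test-set error, stopping the
    scan once the error has failed to improve for `patience` epochs."""
    best, best_epoch = float("inf"), 0
    for epoch, err in enumerate(test_error):
        if err < best:
            best, best_epoch = err, epoch
        elif epoch - best_epoch >= patience:
            break  # over-training: test error is no longer improving
    return best_epoch, best

# Training error keeps falling, but the test error turns back up:
train_err = [0.9, 0.5, 0.3, 0.2, 0.1, 0.05]   # monotonically falling
test_err  = [0.9, 0.6, 0.4, 0.45, 0.5, 0.6]   # rises after epoch 2
print(early_stopping_epoch(test_err))  # -> (2, 0.4)
```

The weights saved at the best test-error epoch, not the final epoch, are the ones to keep.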
Automating the Process
• Automate the selection of the appropriate number of hidden layers and hidden units
  – pruning out nodes and connections
  – genetic algorithms: the opposite approach to pruning
  – the use of intelligent agents