
Page 1: Comparing the Parallel Automatic Composition of Inductive Applications with Stacking Methods

Hidenao Abe & Takahira Yamaguchi

Shizuoka University, JAPAN

{hidenao,yamaguti}@ks.cs.inf.shizuoka.ac.jp

Page 2: Contents

• Introduction

• Constructive meta-learning based on method repositories

• Case study with common data sets

• Conclusion

Page 3: Overview of meta-learning scheme

[Diagram] A data set and several learning algorithms are fed into a meta-learning executor, which is guided by a meta-learning algorithm and meta knowledge, in order to obtain a better result on the given data set than that achieved by each single learning algorithm.

Page 4: Selective meta-learning scheme

• Integrating base-level classifiers learned from different training data sets, generated by:
  – "Bootstrap Sampling" (bagging)
  – weighting ill-classified instances (boosting)
• Integrating base-level classifiers learned with different algorithms, by:
  – simple voting (voting)
  – constructing a meta-classifier from a meta data set and a meta-level learning algorithm (stacking, cascading)

These schemes do not work well when no base-level learning algorithm works well on the given data set, because they do not de-compose the base-level learning algorithms.
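As an illustration of the "integration" step that these selective schemes share, here is a minimal, hypothetical Python sketch of simple voting over base-level predictions (not part of the original slides; the function name and data layout are invented for illustration):

```python
from collections import Counter

def majority_vote(predictions_per_classifier):
    """Combine base-level predictions by simple (unweighted) voting.

    predictions_per_classifier: list of prediction lists, one per
    base-level classifier, all aligned on the same test instances.
    """
    combined = []
    for votes in zip(*predictions_per_classifier):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Example: three base-level classifiers voting on four test instances.
preds = [
    ["yes", "no",  "yes", "no"],   # classifier 1
    ["yes", "yes", "no",  "no"],   # classifier 2
    ["no",  "no",  "yes", "no"],   # classifier 3
]
print(majority_vote(preds))  # ['yes', 'no', 'yes', 'no']
```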

Page 5: Contents

• Introduction
• Constructive meta-learning based on method repositories
  – Basic idea of our constructive meta-learning
  – An implementation of the constructive meta-learning called CAMLET
  – Parallel execution and search for CAMLET
• Case study with common data sets
• Conclusion

Page 6: Basic Idea of our Constructive Meta-Learning

[Diagram] Representative inductive learning algorithms (e.g. C4.5, Version Space, Classifier Systems, AQ15) are analyzed, then de-composed and organized into a repository; new inductive applications are built from it by search and composition.

Page 7: Basic Idea of Constructive Meta-Learning

[Diagram, continued] De-composition & organization of the analyzed algorithms (e.g. CS, AQ15), followed by search & composition.

Page 8: Basic Idea of Constructive Meta-Learning

[Diagram, continued] De-composition & organization: organizing inductive learning methods, control structures and the objects they treat. Search & composition: automatic composition of inductive applications from the organized parts.

Page 9: Analysis of Representative Inductive Learning Algorithms

We have analyzed the following 8 learning algorithms:
• Version Space
• AQ15
• ID3
• C4.5
• Boosted C4.5
• Bagged C4.5
• Neural Network with back propagation
• Classifier Systems

We have identified 22 specific inductive learning methods.

Page 10: Inductive Learning Method Repository

[Diagram] The repository organizing the identified inductive learning methods.

Page 11: Data Type Hierarchy

Organization of input/output/reference data types for inductive learning methods.

[Diagram] Objects are organized into a hierarchy:
• data-set: training data set, validation data set, test data set
• classifier-set: If-Then rules, tree (Decision Tree), network (Neural Net)
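To make the hierarchy concrete, here is a minimal, hypothetical sketch of the organization above as Python classes; the class names mirror the slide's labels, but the structure is illustrative rather than CAMLET's actual data model:

```python
# Root of the hierarchy ("Objects") and its two branches, as on the slide.

class DataObject:
    pass

class DataSet(DataObject):          # "data-set" branch
    pass

class TrainingDataSet(DataSet): pass
class ValidationDataSet(DataSet): pass
class TestDataSet(DataSet): pass

class ClassifierSet(DataObject):    # "classifier-set" branch
    pass

class IfThenRules(ClassifierSet): pass
class DecisionTree(ClassifierSet): pass    # "tree (Decision Tree)"
class NeuralNetwork(ClassifierSet): pass   # "network (Neural Net)"

# A repository method could then declare its I/O types, e.g. a
# decision-tree induction method consumes a TrainingDataSet and
# produces a DecisionTree.
```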

Page 12: Identifying Inductive Learning Methods and Control Structures

[Diagram] Control structures built from the identified methods:
• Generating training & validation data sets
• Generating a classifier set
• Evaluating a classifier set
• Modifying training and validation data sets
• Modifying a classifier set
• Evaluating classifier sets on the test data set
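The boxes above can be read as a loop over repository methods. The following is a minimal, hypothetical Python sketch under that reading; the callables and dictionary keys are placeholders, not CAMLET's real interface:

```python
# Generic control structure: generate data sets and a classifier set,
# then alternate evaluation and modification until the goal is met,
# finishing with an evaluation on the test data set.

def run_inductive_application(data, methods, goal=0.9, max_iterations=10):
    train, valid, test = methods["generate_train_valid"](data)
    classifiers = methods["generate_classifier_set"](train)
    for _ in range(max_iterations):
        score = methods["evaluate_classifier_set"](classifiers, valid)
        if score >= goal:
            break
        # Depending on the composed control structure, either the data
        # sets or the classifier set (or both) are modified.
        train, valid = methods["modify_data_sets"](train, valid, classifiers)
        classifiers = methods["modify_classifier_set"](classifiers, train)
    return methods["evaluate_on_test"](classifiers, test)
```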

Page 13: CAMLET

CAMLET: a Computer Aided Machine Learning Engineering Tool

[Diagram] The user supplies data sets and a goal accuracy. Using the method repository, the data type hierarchy and the control structures, CAMLET performs Construction → Instantiation → Compile → Go & Test, then checks "Reached the goal accuracy?" If no, the specification is refined and the loop repeats; if yes, the inductive application (etc.) is returned to the user.
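A minimal, hypothetical sketch of this composition loop in Python follows; all operations are passed in as callables, since the slide does not specify their signatures:

```python
def camlet_composition_loop(data_sets, goal_accuracy,
                            construct, instantiate, compile_spec,
                            go_and_test, refine, max_rounds=100):
    """Construct a spec, instantiate, compile, execute ("Go & Test")
    and refine it until the goal accuracy is reached."""
    spec = construct()
    best = None
    for _ in range(max_rounds):
        executable = compile_spec(instantiate(spec))
        accuracy = go_and_test(executable, data_sets)
        if best is None or accuracy > best[1]:
            best = (spec, accuracy)
        if accuracy >= goal_accuracy:       # "Reached the goal accuracy?"
            break
        spec = refine(spec, accuracy)       # Refinement
    return best

# Toy usage with stand-in callables (illustration only):
result = camlet_composition_loop(
    data_sets=None, goal_accuracy=0.9,
    construct=lambda: {"bias": 0.5},
    instantiate=lambda spec: spec,
    compile_spec=lambda app: app,
    go_and_test=lambda exe, data: exe["bias"],
    refine=lambda spec, acc: {"bias": min(1.0, spec["bias"] + 0.25)},
)
print(result)  # ({'bias': 1.0}, 1.0)
```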

Page 14: CAMLET

CAMLET with a parallel environment

[Diagram] The same loop as on the previous slide, split into two levels: a Composition Level (Construction, Instantiation, Refinement and the goal-accuracy check, using the method repository, data type hierarchy and control structures) and an Execution Level (Compile, Go & Test). The composition level sends specifications and data sets to the execution level and receives results back; the user supplies data sets and a goal accuracy and receives the resulting inductive application, etc.

Page 15: Parallel Executions of Inductive Applications

[Diagram] The Composition Level requires exactly one process; the Execution Level requires one or more processing elements (PEs). Timeline (R: receiving, Ex: executing, S: sending):
1. Initializing: the composition level constructs and sends specs; the PEs receive them (R R R R).
2. Waiting / Executing: the composition level waits while the PEs execute (Ex Ex Ex Ex).
3. Receiving: one PE sends back a result (Ex Ex S Ex) and the composition level receives it.
4. Refining & Sending: the composition level refines the spec and sends the refined one back to the idle PE (Ex Ex Ex R).
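The timeline above is the classic master-worker pattern. Below is a minimal, hypothetical Python sketch of it using multiprocessing queues; it illustrates the pattern only and is not CAMLET's actual implementation:

```python
import multiprocessing as mp

def execution_level(spec_queue, result_queue):
    """One execution-level PE: receive a spec, execute it, send the result."""
    while True:
        spec = spec_queue.get()              # R: receiving
        if spec is None:                     # shutdown signal
            break
        accuracy = sum(spec) / len(spec)     # Ex: stand-in for "Go & Test"
        result_queue.put((spec, accuracy))   # S: sending

if __name__ == "__main__":
    spec_q, result_q = mp.Queue(), mp.Queue()
    workers = [mp.Process(target=execution_level, args=(spec_q, result_q))
               for _ in range(4)]
    for w in workers:
        w.start()
    # Composition level: send specs, then collect results.
    for spec in ([0.1, 0.2], [0.5, 0.7], [0.9, 0.8], [0.3, 0.6]):
        spec_q.put(spec)
    for _ in range(4):
        print(result_q.get())
    for _ in workers:
        spec_q.put(None)
    for w in workers:
        w.join()
```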

Page 16: Refinement of Inductive Applications with GA

[Diagram] Generation t consists of executed specs and their results; selection chooses parents, crossover and mutation produce children (generation t+1), which are executed and added.

1. Transform executed specifications into chromosomes.
2. Select parents with the tournament method.
3. Cross over the parents and mutate one of their children.
4. Execute the children's specifications, transforming the chromosomes back into specs.

* If CAMLET can execute more than two inductive applications at the same time, some slow inductive applications will be added to generation t+2 or later.
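A minimal, hypothetical Python sketch of this GA refinement step (tournament selection, crossover, mutation over chromosome-encoded specifications) is shown below; the chromosome encoding and the fitness function are stand-ins, not CAMLET's actual ones:

```python
import random

def tournament_select(population, fitness, k=2):
    """Pick the fitter of k randomly chosen individuals."""
    return max(random.sample(population, k), key=fitness)

def crossover(a, b):
    """One-point crossover of two parent chromosomes."""
    point = random.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(chromosome, n_choices):
    """Replace one randomly chosen gene (method index)."""
    c = list(chromosome)
    c[random.randrange(len(c))] = random.randrange(n_choices)
    return c

def next_generation(population, fitness, n_choices=4):
    children = []
    while len(children) < len(population):
        p1 = tournament_select(population, fitness)
        p2 = tournament_select(population, fitness)
        c1, c2 = crossover(p1, p2)
        children.extend([mutate(c1, n_choices), c2])  # mutate one child
    return children[:len(population)]

# Example: chromosomes are lists of method indices; fitness stands in for
# the accuracy obtained by executing the corresponding specification.
random.seed(0)
pop = [[random.randrange(4) for _ in range(6)] for _ in range(8)]
print(next_generation(pop, fitness=sum))
```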

Page 17: CAMLET

CAMLET on a parallel GA refinement

[Diagram] The same two-level architecture as on page 14, with the GA of the previous slide driving the Composition Level's Refinement step while the Execution Level compiles and runs ("Go & Test") the children's specifications in parallel.

Page 18: Contents

• Introduction
• Constructive meta-learning based on method repositories
• Case study with common data sets
  – Accuracy comparison with common data sets
  – Evaluation of parallel efficiencies in CAMLET
• Conclusion

Page 19: Set-up of the accuracy comparison with common data sets

• We have used two kinds of common data sets:
  – 10 Statlog data sets
  – 32 UCI data sets (distributed with WEKA)
• We have input a goal accuracy for each data set as the criterion.
• CAMLET has output just one specification of the best inductive application for each data set, together with its performance.
  – CAMLET has searched a space of about 6,000 inductive applications for the best one, executing up to one hundred inductive applications.
• 10 "Execution Level" PEs have been used for each data set.
• The stacking methods are those implemented in the WEKA data mining suite.

Page 20: Statlog common data sets

Dataset       #Class  #Att. (Nom:Num)  #Instances       Evaluation
australian    2       14 (6:8)         690              10CV
diabetes      2       8 (0:8)          768              10CV
dna           3       180 (180:0)      2,000 / 1,186    Test
german        2       20 (17:3)        1,000            10CV
heart         2       13 (6:7)         270              10CV
letter-liacc  26      16 (0:16)        16,000 / 4,000   Test
satimage      6       36 (0:36)        4,435 / 2,000    Test
segment       7       18 (0:18)        2,317            10CV
shuttle       7       9 (0:9)          43,500 / 14,500  Test
vehicle       4       18 (0:18)        846              10CV

To generate the 10-fold cross-validation data set pairs, we have used the SplitDatasetFilter implemented in WEKA.

Page 21: Setting up of stacking methods for the Statlog common data sets

Base-level learning algorithms:
• J4.8 (with pruning, C=0.25)
• IBk (k=5)
• Bagged J4.8 (sampling rate=55%, 5 iterations)
• Bagged J4.8 (sampling rate=55%, 10 iterations)
• Boosted J4.8 (5 iterations)
• Boosted J4.8 (10 iterations)
• Part (with pruning, C=0.25)
• Naïve Bayes

Meta-level learning algorithms:
• J4.8 (with pruning, C=0.25)
• Naïve Bayes
• Classification with Linear Regression (with all of the features)

Page 22: Evaluation of accuracies on the Statlog common data sets

[Chart] Accuracies (scale 70–100%) of J4.8, NB, LR (goal) and CAMLET on each of the ten Statlog data sets (australian, diabetes, dna, german, heart, letter-liacc, satimage, segment, shuttle, vehicle).

Inductive applications composed by CAMLET have shown performance as good as that of stacking with Classification via Linear Regression.

Algorithm   Average (%)
J4.8        87.0
NB          86.6
LR          87.6
CAMLET      87.4

Page 23: Setting up of stacking methods for the UCI common data sets

Base-level learning algorithms:
• J4.8 (with pruning, C=0.25)
• IBk (k=5)
• Bagged J4.8 (sampling rate=55%, 10 iterations)
• Boosted J4.8 (10 iterations)
• Part (with pruning, C=0.25)
• Naïve Bayes

Meta-level learning algorithms:
• J4.8 (with pruning, C=0.25)
• Naïve Bayes
• Linear Regression (with all of the features)

Page 24: Evaluation of accuracies on the UCI common data sets

[Chart] Accuracies (scale 30–100%) of J4.8, NB, LR, the maximum of the 3 stacking methods (goal) and CAMLET on the 32 UCI data sets (anneal, audiology, autos, balance-scale, breast-cancer, breast-w, colic, credit-a, credit-g, diabetes, glass, heart-c, heart-h, heart-statlog, hepatitis, hypothyroid, ionosphere, iris, kr-vs-kp, labor, letter, lymph, mushroom, primary-tumor, segment, sick, sonar, soybean, splice, vehicle, vote, vowel).

Algorithm   J4.8    NB      LR      Max. of 3  CAMLET
Average     81.92   81.97   81.07   83.79      83.23

Page 25: Analysis of the parallel efficiency of CAMLET

• We have analyzed the parallel efficiencies of the executions in the case study with the "Statlog common data sets".
  – We have used 10 processing elements to execute each inductive application.
• Parallel efficiency is calculated with the following formula:

  \mathrm{ParallelEfficiency} = \frac{\sum_{i=1}^{x}\sum_{j=1}^{n} e_{ij}}{\mathrm{ActualExecutionTime}}

  where 1 ≤ x < 100 is the number of executed inductive applications, n is the number of folds, and e_{ij} is the execution time on each PE.
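Under this reading of the formula, a minimal Python sketch of the computation would be:

```python
# Total sequential work (sum of all per-PE execution times) divided by
# the actual wall-clock execution time of the parallel run.

def parallel_efficiency(per_pe_times, actual_execution_time):
    """per_pe_times[i][j]: execution time of inductive application i on fold j."""
    total_work = sum(sum(fold_times) for fold_times in per_pe_times)
    return total_work / actual_execution_time

# Illustrative numbers only: 3 inductive applications x 10 folds, each
# taking 2 seconds, run on 10 PEs in 8 seconds of wall-clock time.
e = [[2.0] * 10 for _ in range(3)]
print(parallel_efficiency(e, actual_execution_time=8.0))  # 7.5
```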

Page 26: Parallel efficiency of the case study with the Statlog data sets

[Charts] Parallel efficiencies (scale 5–10) of the 10-fold CV data sets (australian, diabetes, german, heart, segment, vehicle) and of the training/test data sets (dna, letter, satimage, shuttle).

Page 27: Why was the parallel efficiency of the "satimage" data set so low?

[Chart] Execution time (sec., 0–2,500) against the number of executed inductive applications (0–100).

Because:
• CAMLET could not find a satisfactory inductive application for this data set within 100 executions of inductive applications.
• At the end of the search, inductive applications with a large computational cost badly affected the parallel efficiency.

Page 28: Contents

• Introduction

• Constructive meta-learning based on method repositories

• Case Study with common data sets

• Conclusion

Page 29: Conclusion

• CAMLET has been implemented as a tool for the "Constructive Meta-Learning" scheme based on method repositories.
• CAMLET shows significant performance as a meta-learning scheme.
  – We are extending the method repository to construct data mining applications covering the whole data mining process.
• The parallel efficiency of CAMLET has reached more than 5.93 times.
  – It should be improved:
    • with methods to equalize each execution time, such as load balancing and/or a three-layered architecture;
    • with a more efficient search method to find a satisfactory inductive application faster.

Page 30: Thank you!

Page 31: Stacking: training and prediction

[Diagram, training phase] The training set is split (repeatedly, with cross-validation) into an internal training set and an internal test set. Base-level classifiers are learned on the internal training set; their predictions on the internal test set are transformed into a meta-level training set, from which the meta-level classifier is learned by meta-level learning.

[Diagram, prediction phase] Base-level classifiers are learned on the full training set; their predictions on the test set are transformed into a meta-level test set, and the meta-level classifier predicts from it to produce the results of prediction.

Page 32: How to transform base-level attributes to meta-level attributes

Base-level data set:

Att. A  Att. B  Att. C  Class
a1      0.3     c3      0
a2      5       c2      1
…       …       …       …

Learning with Algorithm 1 yields a classifier, which produces:

Prob. of predicting "0"  Prob. of predicting "1"  Class
0.99                     0.01                     0
0.4                      0.6                      1
…                        …                        …

1. Get the prediction probability for each class value from the classifier.
2. Add the base-level class value of each base-level instance.

With A algorithms and C class values, #Meta-level attributes = A × C.
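A minimal, hypothetical Python sketch of this transformation follows; in practice the probabilities come from cross-validated base-level classifiers, as shown on page 31. The function name, data layout and toy classifiers are invented for illustration:

```python
# Each base-level instance becomes a meta-level instance whose attributes
# are the class-probability estimates of every base-level classifier
# (A algorithms x C class values), plus the original class label.

def to_meta_level(instances, classifiers, class_values):
    """instances: list of (attributes, label); classifiers: list of
    callables returning a {class_value: probability} dict."""
    meta = []
    for attributes, label in instances:
        row = []
        for clf in classifiers:                         # A algorithms ...
            probs = clf(attributes)
            row.extend(probs[c] for c in class_values)  # ... x C class values
        meta.append((row, label))                       # keep base-level class
    return meta

# Toy stand-ins for the classifiers produced by Algorithm 1, Algorithm 2, ...
clf1 = lambda x: {"0": 0.99, "1": 0.01} if x[0] == "a1" else {"0": 0.4, "1": 0.6}
clf2 = lambda x: {"0": 0.7, "1": 0.3}
data = [(("a1", 0.3, "c3"), "0"), (("a2", 5, "c2"), "1")]
print(to_meta_level(data, [clf1, clf2], class_values=["0", "1"]))
```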

Page 33: Meta-level classifiers (example)

[Diagram] An example meta-level classifier: a decision tree whose internal nodes test the prediction probabilities produced by the base-level classifiers (e.g. "prediction probability of '0' with the classifier by algorithm 1, > 0.1 / ≤ 0.1", "prediction probability of '1' with the classifier by algorithm 2, > 0.95 / ≤ 0.95", "prediction probability of '1' with the classifier by algorithm 1, > 0.9 / ≤ 0.9", "prediction probability of '0' with the classifier by algorithm 3, …") and whose leaves predict the class ("0" or "1").

Page 34: Accuracies of the 8 base-level learning algorithms on the Statlog data sets

Stacking: base-level algorithms (accuracy %, avg. over 10 data sets)
J4.8          84.84
IBk(5)        84.32
Part          84.10
NaiveBayes    76.53
Bagging(5)    85.29
Bagging(10)   86.17
Boosting(5)   85.56
Boosting(10)  85.94
Max           87.58

CAMLET: base-level algorithms (accuracy %, avg. over 10 data sets)
C4.5 DT            81.07
ID3 DT             81.76
NeuralNetwork      63.09
ClassifierSystems  64.85
Bagging(5)         84.28
Bagging(10)        85.17
Boosting(5)        84.52
Boosting(10)       85.59
Max                87.04

Page 35: C4.5 Decision Tree without pruning (reproduction by CAMLET)

Specification:
• Generating training and validation data sets: with a void validation set
• Generating a classifier set (decision tree): with entropy + information ratio
• Evaluating a classifier set: with set evaluation
• Evaluating classifier sets to the test data set: with single classifier set evaluation

Procedure:
1. Select just one control structure.
2. Fill each method slot with specific methods.
3. Instantiate the spec.
4. Compile the spec.
5. Execute its executable code.
6. Refine the spec (if needed).
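As a minimal, hypothetical illustration, such a specification could be written down as plain data before steps 3–6 are applied; the keys and values below mirror the slide's wording but are not CAMLET's actual specification format:

```python
# A C4.5-without-pruning reproduction expressed as a data structure:
# one control structure (step 1) with its method slots filled (step 2).
c45_without_pruning_spec = {
    "control_structure": "single classifier set",   # step 1 (illustrative name)
    "methods": {                                     # step 2
        "generate_train_valid": "void validation set",
        "generate_classifier_set": "decision tree (entropy + information ratio)",
        "evaluate_classifier_set": "set evaluation",
        "evaluate_on_test": "single classifier set evaluation",
    },
}
# Steps 3-6 (instantiate, compile, execute, refine) would then be carried
# out over this specification by the composition loop sketched on page 13.
print(c45_without_pruning_spec)
```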