machine learning and bioinformatics 機器學習與生物資訊學
DESCRIPTION
Machine Learning and Bioinformatics 機器學習與生物資訊學. Using kernel density estimation to enhance a conventional instance-based classifier: kernel approximation, kernel approximation in O(n), multi-dimensional kernel approximation, kernel density estimation, and application in data classification.
Machine Learning & Bioinformatics 1
Molecular Biomedical Informatics 分子生醫資訊實驗室
Machine Learning and Bioinformatics 機器學習與生物資訊學
Using kernel density estimation in enhancing conventional instance-based classifier
Kernel approximation
Kernel approximation in O(n)
Multi-dimensional kernel approximation
Kernel density estimation
Application in data classification
Identifying class boundary
Boundary identified
Kernel approximation
Kernel approximation
Problem definition: given the values of a function f at a set of samples, we want to find a set of symmetric kernel functions and the corresponding weights such that the weighted sum of the kernels approximates f.
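This setup can be sketched in a few lines of Python (a minimal 1-D illustration; the Gaussian kernel choice and the helper names are our assumptions here, although the slides fix the kernel family shortly):

```python
import math

def gaussian_kernel(v, c, b):
    """A symmetric kernel function: a Gaussian centred at c with bandwidth b."""
    return math.exp(-((v - c) ** 2) / (2.0 * b ** 2))

def kernel_approximation(v, centres, weights, bandwidths):
    """Evaluate f_hat(v) = sum_i w_i * K(v, c_i, b_i)."""
    return sum(w * gaussian_kernel(v, c, b)
               for c, w, b in zip(centres, weights, bandwidths))
```

Learning then amounts to choosing the centres, weights, and bandwidths so that f_hat matches f at the given samples.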
Equations

(1) $\hat f(\mathbf v)=\sum_i w_i\cdot K(\mathbf v,c_i,b_i)\cong f(\mathbf v)$
Kernel approximation
A 1-D example of $\hat f(\mathbf v)=\sum_i w_i\cdot K(\mathbf v,c_i,b_i)\cong f(\mathbf v)$, where each term $w_i\cdot K(\mathbf v,c_i,b_i)$ is one weighted kernel.
Kernel approximation
Spherical Gaussian functions: Hartman et al. showed that a linear combination of spherical Gaussian functions can approximate any function with arbitrarily small error ("Layered neural networks with Gaussian hidden units as universal approximations", Neural Computation, Vol. 2, No. 2, 1990). With the Gaussian kernel functions, we want to find weights and bandwidths such that the combination approximates f.
Equations

(1) $\hat f(\mathbf v)=\sum_i w_i\cdot K(\mathbf v,c_i,b_i)\cong f(\mathbf v)$
(2) $\hat f(\mathbf v)=\sum_i w_i\cdot\exp\!\left(-\frac{\lVert\mathbf v-c_i\rVert^2}{2\sigma_i^2}\right)\cong f(\mathbf v)$
Kernel approximation in O(n)
Kernel approximation in O(n): in the learning algorithm, we assume uniform sampling; namely, samples are located at the crosses of an evenly spaced grid in the d-dimensional vector space. Let δ denote the distance between two adjacent samples. If the assumption of uniform sampling does not hold, some form of interpolation can be conducted to obtain the approximate function values at the crosses of the grid.
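A minimal sketch of such an interpolation step in 1-D, assuming piecewise-linear interpolation between sorted scattered samples (the helper name interpolate_to_grid is ours):

```python
import bisect

def interpolate_to_grid(xs, ys, delta, h_range):
    """Linearly interpolate scattered samples (xs sorted, values ys) onto
    the grid points h * delta for h in h_range."""
    grid = []
    for h in h_range:
        x = h * delta
        j = bisect.bisect_right(xs, x)
        if j == 0:
            grid.append(ys[0])        # left of all samples: clamp
        elif j == len(xs):
            grid.append(ys[-1])       # right of all samples: clamp
        else:
            t = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
            grid.append(ys[j - 1] + t * (ys[j] - ys[j - 1]))
    return grid
```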
A 2-D example of uniform sampling
[Figure: samples on an evenly spaced 2-D grid, with spacing δ along each axis]
The O(n) kernel approximation: under the assumption that the sampling density is sufficiently high, the function values at a sample and at its k nearest samples are virtually equal. In other words, f is virtually a constant function in the proximity of each sample. Accordingly, we can expect that each weight is approximately proportional to the function value at the corresponding sample.
The O(n) kernel approximation
A 1-D example: in the proximity of the sample at hδ we have f(x) ≅ f(hδ); the k nearest samples lie at (h − k/2)δ, …, (h − 1)δ, (h + 1)δ, …, (h + k/2)δ.
In the 1-D example, samples are located at hδ, where h is an integer. Under the assumption that the sampling density is sufficiently high, we have f(x) ≅ f(hδ) in the proximity of hδ. The issue now is to find appropriate weights and bandwidths such that the resulting kernel sum approximates f(x) in the proximity of hδ.
Equations
(1)
(2)
(3)
What's the difference between equations (2) and (3)?
If we set $\sigma_h=\delta$, we have, in the proximity of $h\delta$, a sum proportional to $f(h\delta)\sum_i\exp\!\left(-\frac{(x-i\delta)^2}{2\sigma_h^2}\right)$.
Therefore, with $\sigma_h=\delta$, we can set $w_i=f(i\delta)/\sqrt{2\pi}$ and obtain $\hat f(x)\cong f(x)$. In fact, it can be shown that $\sum_i\exp\!\left(-\frac{(x-i\delta)^2}{2\delta^2}\right)$ is bounded by values extremely close to $\sqrt{2\pi}$. Therefore, we have the following function approximation: $\hat f(x)=\sum_i\frac{f(i\delta)}{\sqrt{2\pi}}\exp\!\left(-\frac{(x-i\delta)^2}{2\delta^2}\right)\cong f(x)$.
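The construction is O(n): a single pass assigns each grid sample a Gaussian with sigma = delta and a weight proportional to the sampled value. A 1-D sketch (our naming; it relies on the fact that the grid sum of unit Gaussians with sigma = delta is essentially the constant sqrt(2*pi)):

```python
import math

def fit_on_grid(f, delta, h_range):
    """O(n) kernel approximation on a 1-D grid: one Gaussian per sample,
    with sigma = delta and weight w_i = f(i*delta) / sqrt(2*pi)."""
    centres = [h * delta for h in h_range]
    weights = [f(c) / math.sqrt(2 * math.pi) for c in centres]

    def f_hat(x):
        # Sum of weighted Gaussians; near the grid this reproduces f(x).
        return sum(w * math.exp(-((x - c) ** 2) / (2 * delta ** 2))
                   for w, c in zip(weights, centres))

    return f_hat
```

For a constant or a linear target the fit is accurate to many digits well inside the sampled interval.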
Equations (1) (2) (3) (4) (5)
The O(n) kernel approximation
Generalization: we can generalize the result by setting $\sigma_h=\beta\delta$, where $\beta$ is a real number. The corresponding bounds of the sum were derived for $\beta$ = 0.5, 1.0, and 1.5.
Equations (1) (2) (3) (4) (5) (6)
The effect of $\beta$: for example, with $\beta=0.5$ the relevant sum is $\sum_{j=-\infty}^{\infty}\exp\!\left(-\frac{(x-j\delta)^2}{2(0.5\delta)^2}\right)$.
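The boundedness claim is easy to check numerically: the function below evaluates a truncated version of the sum for a given beta. For beta = 1 it is essentially the constant sqrt(2*pi), while for beta = 0.5 it visibly oscillates with x (the function name and the truncation are ours):

```python
import math

def gaussian_comb(x, delta, beta, terms=60):
    """Truncated sum_j exp(-(x - j*delta)^2 / (2*(beta*delta)^2)),
    taken over the grid points nearest to x."""
    j0 = round(x / delta)
    denom = 2.0 * (beta * delta) ** 2
    return sum(math.exp(-((x - j * delta) ** 2) / denom)
               for j in range(j0 - terms, j0 + terms + 1))
```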
The smoothing effect: the kernel approximation function is actually a weighted average of the sampled function values. Selecting a larger β implies that the smoothing effect will be more significant, so β should be chosen with this trade-off in mind.
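The trade-off can be made concrete: with sigma = beta*delta and weights f(i*delta)/(sqrt(2*pi)*beta), a larger beta flattens sharp features more aggressively. A small sketch (our naming; the 1/(sqrt(2*pi)*beta) normalization is the natural generalization of the sigma = delta case):

```python
import math

def smooth_fit(grid_values, delta, beta):
    """Kernel approximation with sigma = beta*delta and weights
    f(i*delta) / (sqrt(2*pi)*beta); grid_values maps the integer
    grid index i to the sampled value f(i*delta)."""
    sigma = beta * delta
    norm = math.sqrt(2 * math.pi) * beta

    def f_hat(x):
        return sum(v / norm * math.exp(-((x - i * delta) ** 2) / (2 * sigma ** 2))
                   for i, v in grid_values.items())

    return f_hat

# A unit spike at the origin: the larger beta is, the more the spike
# is averaged away (stronger smoothing).
spike = {i: (1.0 if i == 0 else 0.0) for i in range(-30, 31)}
```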
An example of the smoothing effect
[Figure: two plots comparing the target function and the kernel fit over x ∈ [−2, 2], y ∈ [−1, 1]]
Multi-dimensional kernel approximation
Multi-dimensional kernel approximation
The basic concepts: under the assumption that the sampling density is sufficiently high, the function values at a sample and at its k nearest samples are virtually equal. As a result, we can expect that the weights and bandwidths of the Gaussian functions located at those samples are virtually equal as well.
Since a spherical Gaussian function factors into a product of 1-D Gaussian functions, the scalar (per-dimension) form of the multi-dimensional sum reduces to the 1-D case. If the bandwidth is set to βδ, the sum is bounded just as in one dimension.
Accordingly, the multi-dimensional sum is bounded by the d-th power of the 1-D bounds and, with the weights set accordingly, the approximation carries over. In RVKDE, d is relaxed to a parameter. Finally, by setting the bandwidths uniformly to βδ, we obtain the following kernel smoothing function that approximates f, where the kernels are placed at the k nearest samples of the object located at v.
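A kernel smoothing function restricted to the k nearest samples can be sketched as a normalized weighted average of the sampled values (a 1-D simplification with a single bandwidth sigma; the name knn_gaussian_average is ours, and this is not the exact RVKDE formula):

```python
import math

def knn_gaussian_average(v, samples, values, k, sigma):
    """Gaussian-weighted average of the sampled values over the
    k nearest samples of the query point v."""
    nearest = sorted(range(len(samples)), key=lambda i: abs(samples[i] - v))[:k]
    ws = [math.exp(-((v - samples[i]) ** 2) / (2 * sigma ** 2)) for i in nearest]
    total = sum(ws)
    return sum(w * values[i] for w, i in zip(ws, nearest)) / total
```

Because it is a weighted average, the output always lies between the smallest and largest of the k neighbouring sampled values.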
Equations (1) (2) (3) (4) (5) (6) (7)
Multi-dimensional kernel approximation
Generalization: if we set the bandwidths uniformly to βδ, then the weights follow accordingly, and we obtain the following function approximation, where the sum runs over the k nearest samples of the object located at v. In RVKDE, k is relaxed to a parameter, k_t.
Equations (1) (2) (3) (4) (5) (6) (7) (8)
Multi-dimensional kernel approximation
Simplification: an interesting observation is that, as long as the stated condition holds, the result is the same regardless of the parameter value. With this observation, the smoothing function can be simplified.
Equations (1) (2) (5) (6) (7) (8) (9)
Any Questions?
Are we missing something?
Kernel density estimation
Kernel density estimation
Problem definition: assume that we are given a set of samples taken from a probability distribution in a d-dimensional vector space. The problem of kernel density estimation is to find a linear combination of kernel functions that provides a good approximation of the probability density function.
Probability density estimation: the value of the probability density function at a vector v can be estimated as $\hat p(\mathbf v)=\frac{k_s}{n\cdot V(R(\mathbf v))}$, where n is the total number of samples, R(v) is the distance between vector v and its k_s-th nearest sample, and V(R(v)) is the volume of a sphere with radius R(v) in the d-dimensional vector space.
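The estimator can be written directly from these definitions (a sketch; the names and the 1-D test case are ours, and the sphere volume uses the standard Gamma-function formula):

```python
import math

def sphere_volume(d, r):
    """Volume of a d-dimensional sphere of radius r:
    pi^(d/2) * r^d / Gamma(d/2 + 1)."""
    return (math.pi ** (d / 2)) * (r ** d) / math.gamma(d / 2 + 1)

def knn_density(v, samples, ks):
    """Estimate the pdf at v (1-D here): ks / (n * V(R(v))), where R(v)
    is the distance from v to its ks-th nearest sample."""
    n = len(samples)
    r = sorted(abs(s - v) for s in samples)[ks - 1]
    return ks / (n * sphere_volume(1, r))
```

On 101 evenly spaced samples over [0, 1] the estimate at the centre is close to the true uniform density of 1.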
The Gamma function: $\Gamma(1)=1$, since $\int_0^\infty e^{-t}\,dt=1$; $\Gamma(1/2)=\sqrt{\pi}$, since it reduces to the Gaussian integral. Together with $\Gamma(z+1)=z\,\Gamma(z)$, these values give the sphere volume $V(R)=\frac{\pi^{d/2}R^d}{\Gamma(d/2+1)}$ for both even and odd d.
Application in data classification
Application in data classification: recent development in data classification has focused on the support vector machine (SVM) due to its accuracy. RVKDE
– delivers the same level of accuracy as the SVM and enjoys some advantages
– constructs one approximate probability density function for each class of objects
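One density estimate per class yields a classifier: estimate each class's density at the query point, weight it by the class prior, and pick the largest. A 1-D sketch using the k-NN density estimate (our simplification; RVKDE's actual density estimator is the kernel-based one developed above):

```python
def knn_density_1d(v, samples, ks):
    """1-D k-NN density estimate: ks / (n * 2 * R(v)), where 2 * R(v)
    is the length ('volume') of the 1-D sphere of radius R(v)."""
    r = sorted(abs(s - v) for s in samples)[ks - 1]
    return ks / (len(samples) * 2.0 * r)

def classify(v, classes, ks=3):
    """Predict the class whose prior-weighted density estimate at v is
    largest; `classes` maps a label to that class's training samples."""
    n_total = sum(len(s) for s in classes.values())
    return max(classes,
               key=lambda c: (len(classes[c]) / n_total)
                             * knn_density_1d(v, classes[c], ks))
```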
Time complexity
The average time complexity to construct a classifier is …, if the k-d tree structure is employed, where n is the number of training samples
The time complexity to classify m new objects with unknown class is …
Comparison of classification accuracy

| Data sets | RVKDE | SVM | 1NN | 3NN |
|---|---|---|---|---|
| 1. iris (150) | 97.33 () | 97.33 | 94.00 | 94.67 |
| 2. wine (178) | 99.44 () | 99.44 | 96.08 | 94.97 |
| 3. vowel (528) | 99.62 () | 99.05 | 99.43 | 97.16 |
| 4. segment (2310) | 97.27 () | 97.40 | 96.84 | 95.98 |
| Avg. 1-4 | 98.42 | 98.31 | 96.59 | 95.70 |
| 5. glass (214) | 75.74 () | 71.50 | 69.65 | 72.45 |
| 6. vehicle (846) | 73.53 () | 86.64 | 70.45 | 71.98 |
| Avg. 1-6 | 90.49 | 91.89 | 87.74 | 87.87 |
Comparison of classification accuracy: 3 larger data sets

| Data sets | RVKDE | SVM | 1NN | 3NN |
|---|---|---|---|---|
| 7. satimage (4435, 2000) | 92.30 () | 91.30 | 89.35 | 90.60 |
| 8. letter (15000, 5000) | 97.12 () | 97.98 | 95.26 | 95.46 |
| 9. shuttle (43500, 14500) | 99.94 () | 99.92 | 99.91 | 99.92 |
| Avg. 7-9 | 96.45 | 96.40 | 94.84 | 95.33 |
Data reduction: as the learning algorithm is instance-based, removal of redundant training samples will lower the complexity of the prediction phase. The effect of a naïve data-reduction mechanism was studied: the mechanism removes a training sample if all of its 10 nearest samples belong to the same class as that sample.
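The naïve mechanism can be sketched as follows (1-D distances and the function name reduce_training_set are our assumptions; it returns the indices of the samples that survive):

```python
def reduce_training_set(samples, labels, m=10):
    """Drop a training sample if all of its m nearest samples (m = 10 in
    the slides) carry the same class label; keep it otherwise."""
    kept = []
    for i, (x, y) in enumerate(zip(samples, labels)):
        # Distances from sample i to every other sample.
        dists = sorted((abs(x - samples[j]), j)
                       for j in range(len(samples)) if j != i)
        neighbour_labels = [labels[j] for _, j in dists[:m]]
        if any(lab != y for lab in neighbour_labels):
            kept.append(i)  # near a class boundary: keep it
    return kept
```

Samples deep inside a class region are removed, while samples near a class boundary survive, which is exactly what an instance-based classifier needs.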
With the KDE-based statistical learning algorithm, each training sample is associated with a kernel function, typically with its own varying width.
In comparison, in the decision function of the SVM only the support vectors are associated with a kernel function, and that kernel has a fixed width.
Effect of data reduction

| | satimage | letter | shuttle |
|---|---|---|---|
| # of training samples in the original data set | 4435 | 15000 | 43500 |
| # of training samples after data reduction is applied | 1815 | 7794 | 627 |
| % of training samples remaining | 40.92% | 51.96% | 1.44% |
| Classification accuracy after data reduction is applied | 92.15% | 96.18% | 99.32% |
| Degradation of accuracy due to data reduction | -0.15% | -0.94% | -0.62% |
Effect of data reduction

| | # of training samples after data reduction is applied | # of support vectors identified by LIBSVM |
|---|---|---|
| satimage | 1815 | 1689 |
| letter | 7794 | 8931 |
| shuttle | 627 | 287 |
Execution times (in seconds)

| | | Proposed algorithm without data reduction | Proposed algorithm with data reduction | SVM |
|---|---|---|---|---|
| Cross validation | satimage | 670 | 265 | 64622 |
| | letter | 2825 | 1724 | 386814 |
| | shuttle | 96795 | 59.9 | 467825 |
| Make classifier | satimage | 5.91 | 0.85 | 21.66 |
| | letter | 17.05 | 6.48 | 282.05 |
| | shuttle | 1745 | 0.69 | 129.84 |
| Test | satimage | 21.3 | 7.4 | 11.53 |
| | letter | 128.6 | 51.74 | 94.91 |
| | shuttle | 996.1 | 5.58 | 2.13 |
Today’s exercise
Multiple-class prediction
Design your own select, feature, buy and sell programs. Upload and test them in our simulation system. Finally, commit your best version before 23:59 10/15 (Mon).