Post on 19-Dec-2015
1
Monitoring Nonlinear Profiles with Random Effects by
Nonparametric Regression
Jyh-Jen Horng Shiau (洪志真)
Institute of Statistics, National Chiao Tung University (交通大學統計所)
Sept. 25, 2009
NCTS Industrial Statistics Research Group Seminar
2
Outline
• Introduction
• Linear Profile Monitoring
  – Fixed Effects vs. Random Effects
• Nonlinear Profile Monitoring
  – Parametric Regression vs. Nonparametric Regression
  – Fixed Effects vs. Random Effects
    – Phase I Monitoring
    – Phase II Monitoring
    – Examples
• Conclusions
4
SPC: Variables vs. Profiles
Classical SPC: using one or multiple quality characteristics (a single univariate or multivariate variable) to measure the process quality.
However, in many situations, the response of interest is not a single variable but a function of one or more explanatory variables. This functional response is called a profile.
• Profile Monitoring
5
Other Terms for Profiles
• Waveform signal (Jin and Shi, 2001)
• Signature (Gardner et al., 1997)
Example: Vertical Board Density Profile Data from Walker and Wright (JQT, 2002)
24 profiles of vertical density; each profile consists of 314 measurements.
6
Profile Monitoring
[Figure: k sample profiles (j = 1, 2, …, k) collected over time, each plotting the response (Y) against the explanatory variable (X), with n > 2 observations per profile (n = 10 in the illustration).]
The objective is to monitor functional data over time.
7
Applications: Dissolving Process of Aspartame
An example of a product characterized by a profile is aspartame, an artificial sweetener. An important characteristic of the product is the amount of aspartame that can be dissolved per liter of water at different temperatures (Kang and Albin, 2000).
8
Semiconductor Gateoxide Thickness Surface
By Gardner et al. (1997)
Fig. (a) shows the gate oxide thickness surface of a wafer that was processed under fault-free conditions. Fig. (b) shows the gate oxide thickness surface of a wafer processed under known equipment faults. X and Y are the distances from the center of the wafer. Not only is there an apparent decrease in thickness from (a) to (b), but there is also a change in spatial pattern.
9
Tonnage Stamping Process
The figure shows the complicated form of a stamping force profile given by Jin and Shi (1999). Different local features need to be monitored in each interval. Jin and Shi used the term "waveform signals" to refer to profiles.
10
Bioassay Dose Response Curve
Dose-response curves, given by Williams, Birch, Woodall, and Ferry (2006), for different doses and different time periods.
[Figure annotations: $M_i$ is the median of untreated specimens at sampling period $i$; responses are indexed by the $i$th profile, $j$th dose, and $k$th replication.]
12
Two Approaches
Parametric Regression
– Fit a parametric model of known form to each profile.
– Monitor each parameter with a separate chart, or use a multivariate chart based on the vectors of parameter estimates.
Nonparametric Regression
– Smooth each profile.
– Use various metrics to detect changes in profile shape.
13
Linear Profile Monitoring
Fixed-effect model:
$Y = A_0 + A_1 X + \varepsilon, \quad X_l < X < X_h,$
where the $\varepsilon$'s are independent and normally distributed with mean 0 and variance $\sigma^2$.
References: Kang and Albin (2000); Kim, Mahmoud, and Woodall (2003); Mahmoud and Woodall (2004); Mahmoud (2004)
14
Linear Profiles with Fixed-Effect Model
Kang and Albin (2000)
Monitor slope and intercept jointly with a multivariate $T^2$ chart.
Treat the residuals between the sample and reference lines as a rational subgroup and monitor them with a combined EWMA/R chart.
Phase I statistics are dependent, so control limits cannot be determined directly from marginal distributions.
15
Linear Profiles with Fixed-Effect Model
Kim et al. (2003) considered the same model but coded the X-values by centering, so that the least squares estimators of the intercept and slope are independent.
They monitor the process with three EWMA charts (EWMA3):
• A two-sided EWMA to monitor the intercept
• A two-sided EWMA to monitor the slope
• A one-sided EWMA to monitor the error variance
16
Why Random Effects?
• Under the fixed-effect model, the batch effect, changes in humidity or temperature, the characteristics of the measuring equipment, etc., are all absorbed into the error term. This may not be appropriate, since these time-varying factors may affect the intercept and slope of the linear profile.
• By their nature, these hard-to-control factors should be considered common causes of variation.
• The model should therefore allow profile-to-profile variation.
17
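Linear Profiles with Random Effects (Shiau, Lin, and Chen, 2006)
For the $i$th observation of the $j$th profile, assume
$Y_{ij} = A_{0j} + A_{1j} x_i + \varepsilon_{ij},$
where $A_{0j} \sim N(\mu_0, \sigma_0^2)$, $A_{1j} \sim N(\mu_1, \sigma_1^2)$, $\varepsilon_{ij} \sim N(0, \sigma_e^2)$, and $A_{0j}$, $A_{1j}$, and $\varepsilon_{ij}$ are mutually independent.
Let the set points $x_i$ be pre-coded so that $\bar{x} = 0$. The estimators for the $j$th profile are
$\hat{\mu}_{0j} = \bar{y}_j, \quad \hat{\mu}_{1j} = S_{xy(j)} / S_{xx}, \quad \hat{\sigma}_{ej}^2 = \sum_{i=1}^{n} (y_{ij} - \hat{y}_{ij})^2 / (n-2),$
where
$S_{xy(j)} = \sum_{i=1}^{n} x_i (y_{ij} - \bar{y}_j) = \sum_{i=1}^{n} x_i y_{ij}, \quad S_{xx} = \sum_{i=1}^{n} x_i^2.$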
18
A Simulated Example
$y_{ij} = A_{0j} + A_{1j} x_i + \varepsilon_{ij},$
where $A_{0j} \sim N(3, 0.09)$, $A_{1j} \sim N(2, 0.09)$, and $\varepsilon_{ij} \sim N(0, 1)$.
[Figure: simulated profiles, $y_{ij}$ plotted against $x_i$.]
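The simulated example can be reproduced with a short sketch like the one below. It draws profiles with intercept N(3, 0.09), slope N(2, 0.09), and error N(0, 1), then computes the per-profile estimates (profile mean, $S_{xy(j)}/S_{xx}$, and residual mean square). The set points (50 unit-spaced values centered at zero), the number of profiles, the seed, and the function names are assumptions made for illustration; the slide does not state them.

```python
import random


def simulate_profile(x, rng, mu0=3.0, mu1=2.0, sd0=0.3, sd1=0.3, sd_e=1.0):
    """Draw one profile y_i = A0 + A1 * x_i + eps_i with a random intercept and slope.

    The standard deviations 0.3 correspond to the variances 0.09 in the example.
    """
    a0 = rng.gauss(mu0, sd0)
    a1 = rng.gauss(mu1, sd1)
    return [a0 + a1 * xi + rng.gauss(0.0, sd_e) for xi in x]


def fit_profile(x, y):
    """Least-squares estimates for one profile, assuming the x values are centered."""
    n = len(x)
    y_bar = sum(y) / n                                # intercept estimate
    s_xx = sum(xi * xi for xi in x)
    s_xy = sum(xi * yi for xi, yi in zip(x, y))
    slope = s_xy / s_xx                               # slope estimate
    sse = sum((yi - (y_bar + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return y_bar, slope, sse / (n - 2)                # error-variance estimate


# Hypothetical design: 50 equally spaced set points, coded so that x-bar = 0.
n = 50
x = [i - (n - 1) / 2 for i in range(n)]
rng = random.Random(2009)

estimates = [fit_profile(x, simulate_profile(x, rng)) for _ in range(200)]
mean_intercept = sum(e[0] for e in estimates) / len(estimates)
mean_slope = sum(e[1] for e in estimates) / len(estimates)
mean_mse = sum(e[2] for e in estimates) / len(estimates)
print(mean_intercept, mean_slope, mean_mse)
```

Averaged over many profiles, the three estimates should settle near 3, 2, and 1, while the spread of the individual intercepts and slopes reflects both the random effects and the measurement error.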
19
Phase II Method
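In Phase II, we usually assume that $\mu_0, \mu_1, \sigma_0^2, \sigma_1^2, \sigma_e^2$ are known.
Adopt the combined-chart approach.
Since $\hat{\mu}_{0j}$, $\hat{\mu}_{1j}$, and $\hat{\sigma}_{ej}^2$ are mutually independent, set the individual false-alarm rate of each chart at $\alpha^* = 1 - (1-\alpha)^{1/3}$ to achieve an overall false-alarm rate of $\alpha$.
The control limits for $\mu_0$, $\mu_1$, and $\sigma_e^2$ are
$UCL_0 = \mu_0 + Z_{\alpha^*/2}\, s_0, \quad LCL_0 = \mu_0 - Z_{\alpha^*/2}\, s_0,$
$UCL_1 = \mu_1 + Z_{\alpha^*/2}\, s_1, \quad LCL_1 = \mu_1 - Z_{\alpha^*/2}\, s_1,$
$UCL_e = \sigma_e^2\, \chi^2_{\alpha^*,\, n-2} / (n-2),$
where $s_0^2 = \sigma_0^2 + \sigma_e^2/n$ and $s_1^2 = \sigma_1^2 + \sigma_e^2/S_{xx}$.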
20
When the Fixed-effect Model is Mistakenly Used
                                         Real (random effect)    Misuse (fixed effect)
Upper control limit of A0                4.101054                3.469491
Lower control limit of A0                1.898946                2.530509
Upper control limit of A1                2.996472                2.032534
Lower control limit of A1                1.003528                1.967466
Upper control limit of error variance    1.759881                1.759881
ARL0                                     370.370370              1.078406
Using the wrong model causes far too many false alarms: 92.73% of in-control profiles trigger a signal.
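The intercept and slope limits and the two ARL0 values in this comparison can be checked with a short calculation. The sketch below assumes an overall false-alarm rate of 0.0027 (the reciprocal of the stated ARL0 of 370.37) and a design of 50 unit-spaced set points centered at zero; neither is stated on the slide, but both are consistent with the reported numbers. The variance chart limit needs a chi-square quantile and is not recomputed here.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Assumptions (inferred, not stated on the slide): overall false-alarm rate
# alpha = 0.0027 and n = 50 unit-spaced set points centered at zero.
alpha = 0.0027
alpha_star = 1 - (1 - alpha) ** (1 / 3)      # per-chart rate for three charts
z = std_normal.inv_cdf(1 - alpha_star / 2)   # two-sided normal critical value

n = 50
x = [i - (n - 1) / 2 for i in range(n)]
s_xx = sum(xi * xi for xi in x)

mu0, mu1 = 3.0, 2.0
var0, var1, var_e = 0.09, 0.09, 1.0

# True sampling standard deviations of the intercept and slope estimators.
s0 = (var0 + var_e / n) ** 0.5
s1 = (var1 + var_e / s_xx) ** 0.5
# Standard deviations implied by the fixed-effect model (no random effects).
s0_fixed = (var_e / n) ** 0.5
s1_fixed = (var_e / s_xx) ** 0.5


def limits(center, sd):
    """Return (LCL, UCL) = center -/+ z * sd."""
    return center - z * sd, center + z * sd


def coverage(lcl, ucl, center, sd):
    """P(LCL < estimator < UCL) when the estimator is N(center, sd^2)."""
    dist = NormalDist(center, sd)
    return dist.cdf(ucl) - dist.cdf(lcl)


def in_control_arl(sd0_used, sd1_used):
    """ARL0 when the limits are built from the given sds but the data follow
    the random-effect model; the variance chart keeps its rate alpha_star."""
    p_quiet = (coverage(*limits(mu0, sd0_used), mu0, s0)
               * coverage(*limits(mu1, sd1_used), mu1, s1)
               * (1 - alpha_star))
    return 1 / (1 - p_quiet)


arl_real = in_control_arl(s0, s1)
arl_misuse = in_control_arl(s0_fixed, s1_fixed)
print(limits(mu0, s0), limits(mu1, s1))
print(limits(mu0, s0_fixed), limits(mu1, s1_fixed))
print(arl_real, arl_misuse, 1 / arl_misuse)
```

Under these assumptions the fixed-effect slope limits are far too narrow, because they ignore the slope variance of 0.09, and this accounts for most of the false alarms.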
21
Linear Profiles with Random Effects: Phase I Estimation
Under the random-effect model with coded $x_i$'s,
(i) $\hat{\mu}_{01}, \ldots, \hat{\mu}_{0k}$ are i.i.d. $N(\mu_0, \sigma_0^2 + \sigma_e^2/n)$.
(ii) $\hat{\mu}_{11}, \ldots, \hat{\mu}_{1k}$ are i.i.d. $N(\mu_1, \sigma_1^2 + \sigma_e^2/S_{xx})$.
(iii) $(n-2)\hat{\sigma}_{e1}^2/\sigma_e^2, \ldots, (n-2)\hat{\sigma}_{ek}^2/\sigma_e^2$ are i.i.d. $\chi^2_{n-2}$.
These statistics are mutually independent.
22
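Phase-I Monitoring Statistics
The three monitoring statistics:
$T_{0j} = \frac{(\hat{\mu}_{0j} - \hat{\mu}_0)^2}{\hat{s}_0^2}, \quad T_{1j} = \frac{(\hat{\mu}_{1j} - \hat{\mu}_1)^2}{\hat{s}_1^2}, \quad T_{ej} = \frac{\hat{\sigma}_{ej}^2}{\hat{\sigma}_e^2},$
where
$\hat{\mu}_0 = \frac{1}{k} \sum_{j=1}^{k} \hat{\mu}_{0j}, \quad \hat{s}_0^2 = \frac{1}{k-1} \sum_{j=1}^{k} (\hat{\mu}_{0j} - \hat{\mu}_0)^2,$
$\hat{\mu}_1 = \frac{1}{k} \sum_{j=1}^{k} \hat{\mu}_{1j}, \quad \hat{s}_1^2 = \frac{1}{k-1} \sum_{j=1}^{k} (\hat{\mu}_{1j} - \hat{\mu}_1)^2,$
$\hat{\sigma}_e^2 = \sum_{j=1}^{k} \sum_{i=1}^{n} (y_{ij} - \hat{y}_{ij})^2 / \big(k(n-2)\big),$
and $\hat{s}_0^2$ and $\hat{s}_1^2$ estimate $s_0^2 = \sigma_0^2 + \sigma_e^2/n$ and $s_1^2 = \sigma_1^2 + \sigma_e^2/S_{xx}$, respectively.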
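As a rough illustration of the three Phase I statistics, the sketch below takes the per-profile intercept estimates, slope estimates, and residual mean squares and returns the corresponding statistics for every profile. The function name and the small data set are hypothetical. Because every profile has the same number of observations, the pooled error variance equals the average of the per-profile residual mean squares.

```python
def phase1_statistics(intercepts, slopes, mses):
    """Return the lists (T0, T1, Te) of Phase I statistics for k profiles.

    intercepts, slopes: per-profile least-squares estimates.
    mses: per-profile residual mean squares (each with n - 2 degrees of freedom).
    """
    k = len(intercepts)
    mean0 = sum(intercepts) / k
    mean1 = sum(slopes) / k
    # Sample variances of the estimates across profiles (divisor k - 1).
    s0_sq = sum((a - mean0) ** 2 for a in intercepts) / (k - 1)
    s1_sq = sum((b - mean1) ** 2 for b in slopes) / (k - 1)
    # Pooled error variance: the average of the per-profile mean squares.
    pooled = sum(mses) / k

    t0 = [(a - mean0) ** 2 / s0_sq for a in intercepts]
    t1 = [(b - mean1) ** 2 / s1_sq for b in slopes]
    te = [m / pooled for m in mses]
    return t0, t1, te


# Hypothetical estimates from k = 6 historical profiles.
intercepts = [3.1, 2.8, 3.3, 2.9, 3.0, 4.2]
slopes = [2.0, 2.2, 1.7, 2.1, 1.9, 2.0]
mses = [0.9, 1.1, 1.0, 1.2, 0.8, 1.0]

t0, t1, te = phase1_statistics(intercepts, slopes, mses)
print(t0)
print(t1)
print(te)
```

Since each statistic is built from estimates that include the profile being tested, the statistics for different profiles are dependent: the intercept and slope statistics each sum to k - 1 and the variance statistics sum to k.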
23
Bonferroni Method
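There are $k$ profiles in the Phase I historical data set.
If we control each individual false-alarm rate at level $\alpha/k$, then the overall false-alarm rate for the $k$ profiles is controlled at level $\alpha$.
Control limits using the Bonferroni Method:
$UCL_{T_{0j}} = \frac{(k-1)^2}{k}\, \mathrm{Beta}_{\alpha';\, \frac{1}{2},\, \frac{k-2}{2}},$
$UCL_{T_{1j}} = \frac{(k-1)^2}{k}\, \mathrm{Beta}_{\alpha';\, \frac{1}{2},\, \frac{k-2}{2}},$
$UCL_{T_{ej}} = k\, \mathrm{Beta}_{\alpha';\, \frac{n-2}{2},\, \frac{(k-1)(n-2)}{2}},$
where $\alpha' = 1 - (1 - \alpha/k)^{1/3}$ and $\mathrm{Beta}_{\alpha';\, a,\, b}$ denotes the upper $\alpha'$ quantile of the Beta$(a, b)$ distribution.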
24
Evaluation Criteria for Phase I Methods
Main concern in evaluating Phase I methods
– Effectiveness in detecting out-of-control profiles correctly.
Commonly used criterion
– “Signal probability”, which includes both true and false alarms.
Proposed criteria
– “True-alarm rate”: the rate of detecting real out-of-control profiles.
– “False-alarm rate”: the rate of declaring in-control profiles out of control.
25
Bonferroni vs. Multiple FDR
The Multiple FDR method
– An extension of FDR (false discovery rate)
The Multiple FDR method has more detection power than the Bonferroni method, especially when there are more out-of-control profiles in the historical data.
The tradeoff is a slightly larger false-alarm rate, which is still very small (less than 0.003).
26
Other Related Works
Mahmoud, M. A. and Woodall, W. H. (2004). “Phase I Analysis of Linear Profiles with Calibration Applications”. Technometrics, Nov. 2004.
Mahmoud, M. A., Parker, P. A., Woodall, W. H., and Hawkins, D. M. (2006). “A Change Point Method for Linear Profile Data”. Quality and Reliability Engineering International, 2006.
Staudhammer, C. L., LeMay, V. M., Kozak, R. A., and Maness, T. C. (2005). “Mixed-Model Development for Real-Time Statistical Process Control Data in Wood Products Manufacturing”. FBMIS, Volume 1, 19-35.
Wang, K. and Tsung, F. (2005). “Using Profile Techniques for a Data-Rich Environment with Huge Sample Size”. Quality and Reliability Engineering International, 21(7), 677-688.
Jensen, W. A., Birch, J. B., and Woodall, W. H. (2006). “Profile Monitoring via Linear Mixed Models”. JSM 2006 Online Program.
28
Related Works
Jensen, W. A., Woodall, W. H., and Birch, J. B. (2003). “Phase I Monitoring of Nonlinear Profiles”.
Ding, Y., Zeng, L., and Zhou, S. (2005). “Phase I Analysis for Monitoring Nonlinear Profile Signals in Manufacturing Processes”. Journal of Quality Technology, 38(3), 199-216.
Jensen, W. A. and Birch, J. B. (2006). “Profile Monitoring via Nonlinear Mixed Models”. Technical Report.
Williams, J. D., Birch, J. B., Woodall, W. H., and Ferry, N. M. (2006). “Statistical Monitoring of Heteroscedastic Dose-Response Profiles from High-throughput Screening”. JSM 2006 Online Program.
Shiau, J.-J. H., Yen, C.-L., and Feng, Y.-W. (2006). “A New Robust Phase I Analysis for Monitoring of Nonlinear Profiles”. Technical Report.
30
Nonparametric Regression
Consider the following nonparametric regression model:
$y_i = m(x_i) + \varepsilon_i, \quad i = 1, \ldots, n,$
where $m(x)$ is a smooth regression curve and the $\varepsilon_i$'s are i.i.d. normal variates with mean zero and common variance $\sigma^2$.
With B-spline regression, the model is replaced by:
$y_i = \sum_{l=1}^{b} c_l B_{l,k}(x_i) + \varepsilon_i, \quad i = 1, \ldots, n,$
where $c_l$ is the unknown coefficient of the $l$th B-spline basis function, to be estimated from data.
Estimate $c = (c_1, \ldots, c_b)$ by least squares:
$\min_{c} \sum_{i=1}^{n} \Big( y_i - \sum_{l=1}^{b} c_l B_{l,k}(x_i) \Big)^2.$
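A minimal sketch of B-spline regression by least squares follows, assuming NumPy is available. The basis functions are evaluated with the Cox-de Boor recursion on a clamped knot vector. The helper names, the knot placement (5 equally spaced interior knots), the cubic degree (order k = 4), and the test curve are illustrative assumptions, not choices taken from the slides.

```python
import numpy as np


def clamped_knots(a, b, n_interior, degree):
    """Equally spaced knots on [a, b] with the end knots repeated (clamped)."""
    inner = np.linspace(a, b, n_interior + 2)
    return np.concatenate([[a] * degree, inner, [b] * degree])


def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions at x with the Cox-de Boor recursion.

    Returns an array of shape (len(x), len(knots) - degree - 1).
    """
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)
    # Degree 0: indicator of each half-open knot interval [t_l, t_{l+1}).
    B = np.zeros((x.size, t.size - 1))
    for l in range(t.size - 1):
        B[:, l] = (t[l] <= x) & (x < t[l + 1])
    # Close the last non-empty interval on the right so x = t[-1] is covered.
    last = np.max(np.nonzero(t[:-1] < t[1:])[0])
    B[x == t[-1], last] = 1.0
    # Raise the degree one step at a time; terms with zero-width support are skipped.
    for d in range(1, degree + 1):
        B_new = np.zeros((x.size, t.size - 1 - d))
        for l in range(t.size - 1 - d):
            left_den = t[l + d] - t[l]
            right_den = t[l + d + 1] - t[l + 1]
            if left_den > 0:
                B_new[:, l] += (x - t[l]) / left_den * B[:, l]
            if right_den > 0:
                B_new[:, l] += (t[l + d + 1] - x) / right_den * B[:, l + 1]
        B = B_new
    return B


def fit_bspline(x, y, knots, degree):
    """Least-squares B-spline coefficients and the fitted (smoothed) values."""
    B = bspline_basis(x, knots, degree)
    coef, *_ = np.linalg.lstsq(B, np.asarray(y, dtype=float), rcond=None)
    return coef, B @ coef


# Illustrative noisy profile; the curve and noise level are assumptions.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 2.0, 60)
m = 1.0 + 15.0 * np.exp(-1.5 * (x - 1.0) ** 2)
y = m + rng.normal(0.0, 0.3, x.size)

knots = clamped_knots(0.0, 2.0, n_interior=5, degree=3)
coef, fitted = fit_bspline(x, y, knots, degree=3)
print(coef.shape, float(np.sqrt(np.mean((fitted - m) ** 2))))
```

The number of basis functions b (here 9) controls the degree of smoothing: fewer basis functions give a smoother fit, more basis functions follow the data more closely.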
31
Nonparametric Fixed-Effect Model (Shiau and Weng, 2004)
Simulated example:
$m(x) = I + M e^{N (x-1)^2},$
where $I, M, N$ are fixed constants.
Apply B-spline regression to each sample profile, and compute the residuals and their mean:
$e_{ij} = y_{ij} - \hat{y}_{ij}, \quad i = 1, 2, \ldots, n, \qquad \bar{e}_j = \frac{1}{n} \sum_{i=1}^{n} e_{ij}.$
Monitor mean shifts: EWMA chart
The EWMA statistic of the $j$th profile with smoothing constant $\lambda$:
$z_j = \lambda \bar{e}_j + (1-\lambda) z_{j-1}.$
32
Nonparametric Fixed-Effect Model
Monitor variation change: R chart
The R statistic of the $j$th profile (the range of the residuals):
$R_j = \max_i (e_{ij}) - \min_i (e_{ij}).$
Another chart for variation change: EWMSD
The EWMSD statistic:
$v_j = \lambda s_j + (1-\lambda) v_{j-1},$
where
$s_j = \sqrt{\sum_{i=1}^{n} e_{ij}^2 / (n-b)}.$
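The three residual-based statistics can be sketched as follows. The sketch assumes that the residuals of each profile are already available, that the residual standard deviation uses n - b degrees of freedom (b being the number of B-spline basis functions), and that the same smoothing constant is used for both recursions. The function names, starting values, and sample numbers are hypothetical.

```python
import math


def residual_statistics(residuals, n_basis):
    """Mean, range, and standard deviation of one profile's residuals.

    The standard deviation divides by n - n_basis, an assumed choice of
    degrees of freedom for a fit with n_basis B-spline coefficients.
    """
    n = len(residuals)
    e_bar = sum(residuals) / n
    r = max(residuals) - min(residuals)
    s = math.sqrt(sum(e * e for e in residuals) / (n - n_basis))
    return e_bar, r, s


def ewma(values, lam, start):
    """Exponentially weighted moving average: z_j = lam * v_j + (1 - lam) * z_{j-1}."""
    z = start
    out = []
    for v in values:
        z = lam * v + (1 - lam) * z
        out.append(z)
    return out


# Hypothetical residuals for three profiles, each fitted with 2 basis functions.
profiles_residuals = [
    [0.1, -0.2, 0.3, -0.1, 0.0, -0.1],
    [0.2, 0.1, -0.3, 0.0, 0.1, -0.1],
    [0.6, 0.4, 0.7, 0.5, 0.3, 0.5],   # a profile whose mean has shifted upward
]
stats = [residual_statistics(e, n_basis=2) for e in profiles_residuals]

z = ewma([s[0] for s in stats], lam=0.2, start=0.0)   # EWMA of mean residuals
r = [s[1] for s in stats]                             # R statistics
v = ewma([s[2] for s in stats], lam=0.2, start=stats[0][2])  # EWMSD, assumed start
print(z, r, v)
```

Each sequence would then be compared with its own control limits, which are not computed here.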
33
Fixed vs. Random Effects
Fixed-effect model
– No profile-to-profile (subject-to-subject) variation.
– The function $m(x)$ is a fixed function, the same for each profile.
Random-effect model
– There exists profile-to-profile variation caused by common causes.
– The profile function $m(x)$ is a random function.
– Profiles are modeled as realizations of a stochastic process with a mean curve and a covariance function.
34
Nonparametric Random-Effect Model
• Shiau, J.-J. H., Huang, H.-L., Lin, S.-H., and Tsai, M.-Y. (2009). “Monitoring Nonlinear Profiles with Random Effects by Nonparametric Regression”. Communications in Statistics-Theory and Methods. 38, 1664-1679.
35
Nonparametric Random-Effect Model
Adopt the random-effect model to capture the additional variability often observed in profile data.
Motivating example: aspartame
Original model to generate aspartame profiles:
$Y_{ij} = I_i + M_i e^{N_i (x_j - 1)^2} + \varepsilon_{ij}, \quad i = 1, \ldots, n; \; j = 1, \ldots, p,$
where
$I_i \sim N(1, 0.2^2)$ i.i.d.,
$M_i \sim N(15, 1)$ i.i.d.,
$N_i \sim N(-1.5, 0.3^2)$ i.i.d.,
$\varepsilon_{ij} \sim N(0, 0.3^2)$ i.i.d.
$I_i$, $M_i$, and $N_i$ are random variables! They represent common-cause variations among profiles.
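To get a feel for the profile-to-profile variation this model produces, profiles can be generated with a few lines of code. The sketch below uses the stated distributions of the three random coefficients and the noise; the grid of 21 equally spaced x-values on [0, 2], the number of profiles, and the seed are assumptions made only for illustration.

```python
import math
import random


def simulate_aspartame_profile(x, rng):
    """Draw one profile Y_j = I + M * exp(N * (x_j - 1)^2) + eps_j.

    I, M, and N are drawn once per profile (common-cause variation);
    eps_j is drawn independently for each point (measurement noise).
    """
    i_coef = rng.gauss(1.0, 0.2)
    m_coef = rng.gauss(15.0, 1.0)
    n_coef = rng.gauss(-1.5, 0.3)
    return [
        i_coef + m_coef * math.exp(n_coef * (xj - 1.0) ** 2) + rng.gauss(0.0, 0.3)
        for xj in x
    ]


# Hypothetical design: p = 21 equally spaced points on [0, 2].
p = 21
x = [2.0 * j / (p - 1) for j in range(p)]
rng = random.Random(42)
profiles = [simulate_aspartame_profile(x, rng) for _ in range(500)]

# At x = 1 the exponent vanishes, so the response is I + M + noise.
peak_index = x.index(1.0)
peak_mean = sum(prof[peak_index] for prof in profiles) / len(profiles)
print(peak_mean)
```

Because the coefficient N enters through the exponent, the simulated responses are not jointly Gaussian, which is one reason the original model is awkward to analyze directly.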
38
Problems with the Original Model
• Not Gaussian
• Covariance matrix depends on $M$ and $N$
• Too complicated to analyze
39
Stochastic Gaussian Process Model
• Gaussian process with
  – Mean function
  – Covariance function $G(s, t)$
• In-control process
41
Data Smoothing
Preprocessing: Smooth each profile.
– Smoothing splines or B-spline regression
– Other smoothing techniques: kernel smoothing, local polynomial smoothing, wavelets
After the sample profiles are de-noised (i.e., the effects of $\varepsilon$ are eliminated), we have the smoothed profiles:
$Y_i \sim N(\boldsymbol{\mu}, \Sigma),$
where $Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{ip})'$ and $i = 1, \ldots, n$.
$\Sigma$ represents the profile-to-profile variations.
42
Principal Component Analysis (PCA)
Method: Apply principal component analysis (PCA) to $\Sigma$ to obtain the principal modes of variation.
PCA finds an orthogonal matrix $B$ such that
$B' \Sigma B = \Lambda$, where $BB' = I$ and $\Lambda$ is diagonal.
Eigen-analysis: the eigenvectors are the principal components.
43
Phase I Monitoring (1)
• A set of $n$ historical profiles is available
• Smooth each profile
• Apply PCA to the sample covariance matrix
  – Eigenvectors are principal components (PC)
  – PC-score
44
Phase I Monitoring (2)
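• Select “effective” principal components by
  – Total variation explained
    • Choose the first $K$ such that the proportion of total variation explained, $\sum_{r=1}^{K} \lambda_r / \sum_{r=1}^{p} \lambda_r$, reaches a desired level
  – Parsimoniousness
• Score vector of the $i$th profile:
$S_i = (S_{i1}, \ldots, S_{iK})'$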
45
Phase I Monitoring (3)
• Hotelling $T^2$ statistics
• The usual sample mean and sample covariance matrix of the score vectors
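The Phase I steps (PCA of the sample covariance matrix, selection of K by proportion of variation explained, PC-scores, and Hotelling statistics based on the sample mean and covariance of the scores) can be sketched with NumPy as below. The function name, the 90% threshold, and the synthetic noise-free profiles are assumptions for illustration only.

```python
import numpy as np


def phase1_t2(profiles, explained=0.9):
    """PCA-based Phase I statistics for an (n x p) array of smoothed profiles.

    Returns K (number of retained components), the centered PC-scores
    (n x K), and the Hotelling T^2 statistic of each profile.
    """
    Y = np.asarray(profiles, dtype=float)
    mean = Y.mean(axis=0)
    cov = np.cov(Y, rowvar=False)               # sample covariance, divisor n - 1

    eigvals, eigvecs = np.linalg.eigh(cov)      # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]           # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Smallest K whose leading eigenvalues explain the desired proportion.
    ratio = np.cumsum(eigvals) / eigvals.sum()
    K = int(np.searchsorted(ratio, explained) + 1)

    # Centered scores: projections of (profile - mean profile) on the first K PCs.
    scores = (Y - mean) @ eigvecs[:, :K]
    # The scores have sample mean zero and sample covariance diag(eigvals[:K]),
    # so T^2 reduces to a sum of squared standardized scores.
    t2 = np.sum(scores ** 2 / eigvals[:K], axis=1)
    return K, scores, t2


# Synthetic smoothed profiles (assumed setup): n = 50 curves on 21 points.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0, 21)
n = 50
I = rng.normal(1.0, 0.2, n)
M = rng.normal(15.0, 1.0, n)
N = rng.normal(-1.5, 0.3, n)
profiles = I[:, None] + M[:, None] * np.exp(N[:, None] * (x - 1.0) ** 2)

K, scores, t2 = phase1_t2(profiles)
print(K, float(t2.max()))
```

Each value in t2 would then be compared with the Phase I upper control limit. Because the mean and covariance are estimated from the same profiles, the statistics always sum to (n - 1)K, which is one way to see that they are not independent.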
46
Phase I Monitoring (4)
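• Since the score vectors are asymptotically multivariate normal, we have, approximately,
$\frac{n}{(n-1)^2}\, T_i^2 \sim \mathrm{Beta}\!\left(\frac{K}{2}, \frac{n-K-1}{2}\right)$
• Upper control limit of the $T^2$ chart:
$UCL = \frac{(n-1)^2}{n}\, \mathrm{Beta}_{1-\alpha;\, \frac{K}{2},\, \frac{n-K-1}{2}},$
where $\mathrm{Beta}_{1-\alpha;\, a,\, b}$ is the $(1-\alpha)$ quantile of the Beta$(a, b)$ distribution.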
47
Phase I Monitoring (5)
Note that the monitoring statistics across curves in Phase-I are not independent. So the prescribed overall false-alarm rate cannot be achieved by the marginal distribution of the monitoring statistics.
We can adopt the Bonferroni approach to control the overall false-alarm rate (i.e., type I error) at level $\alpha$.
48
Phase II Monitoring (1)
In Phase II, we usually assume that $\Sigma_0$ is known. In practice, $\Sigma_0$ is estimated by the sample covariance matrix of the Phase I in-control profiles.
Apply PCA to $\Sigma_0$ to obtain the eigenvalues $\lambda_1, \ldots, \lambda_p$ and eigenvectors $v_1, \ldots, v_p$.
Choose $K$ effective PCs $v_1, \ldots, v_K$.
49
Phase II Monitoring (2)
Now for the new incoming profile:
First smooth it, then project it onto the $K$ PCs to obtain $K$ independent PC-scores $S_1, \ldots, S_K$.
If the process is in control,
$S_r \sim N(v_r' \mu_0, \lambda_r), \quad r = 1, \ldots, K.$
51
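A Combined Chart
• Signals when any of the $K$ individual charts signals
• Equivalent to monitoring the statistic:
$\max_{1 \le r \le K} |S_r - v_r' \mu_0| / \sqrt{\lambda_r}$
• Control limits of the individual charts:
$v_r' \mu_0 \pm Z_{\alpha'/2} \sqrt{\lambda_r},$
where the individual false-alarm rate is set at $\alpha' = 1 - (1-\alpha)^{1/K}$ so that the overall false-alarm rate is $\alpha$.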
52
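A $T^2$ Chart
• Monitoring statistic:
$T^2 = \sum_{r=1}^{K} (S_r - v_r' \mu_0)^2 / \lambda_r$
• Follows a chi-square distribution with $K$ degrees of freedom when the process is in control
• Upper control limit: $UCL = \chi^2_{\alpha, K}$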
53
Performance Evaluation for Phase II
• Average Run Length (ARL)
• Mean shift from $\mu_0$ to $\mu_0 + \delta$
• $p$: probability of detecting the shift
• ARL $= 1/p$
55
Among 50 curves, there is one outlier with $I$ shifted from 1 to 1 + 5 × 0.2.
More PC-scores, more detecting power?
[Figure: detecting power and percentage of explanation.]
62
Conclusions
• We propose and discuss monitoring schemes for nonlinear profiles based on PCA:
  – Phase I
    • Hotelling $T^2$ control chart
  – Phase II
    • individual PC-score charts
    • combined chart
    • $T^2$ chart
63
Conclusions
• When the shift corresponds to a mode of variation that a particular principal component represents
  – use the individual PC-score chart for better power
• Unfortunately, this ideal situation is rare in practice.
64
Conclusions
• The $T^2$ chart performs somewhat better than the combined chart in terms of the average run length, though the difference is small.
• However, by providing charts for all of the effective components, the combined chart gives more clues for finding assignable causes than the $T^2$ chart.
65
Conclusions
• The degree of smoothness in the data smoothing step has a great impact on the result of the subsequent PCA step.
• A high degree of smoothness leads to high total explanation power of the first few principal components
  – For B-spline regression, # B-spline bases = # principal components with nonzero eigenvalues.
• If the underlying profiles (i.e., without noise) are fairly smooth, then the data dimension can be reduced substantially by PCA.
66
Conclusions
• Profile monitoring has become a popular and promising area of research in statistical process control in recent years.
• At the same time, functional data analysis (FDA) is also attracting much attention and finding many applications.
• We believe many techniques developed for FDA can be extended to develop new profile monitoring techniques in SPC.
67
More Recent Works
• Two master theses of 2009
• Monitoring profiles by their Data Depths of PC-scores
69
Other Related Works
Jin, J. and Shi, J. (2001). “Automatic Feature Extraction of Waveform Signals for In-Process Diagnostic Performance Improvement”. Journal of Intelligent Manufacturing, 12, 257-268.
Lada, E. K., Lu, J.-C., and Wilson, J. R. (2002). “A Wavelet-Based Procedure for Process Fault Detection”. IEEE Transactions on Semiconductor Manufacturing, 15, 79-90.
Jeong, M. K., Lu, J.-C., and Wang, N. (2006). “Wavelet-Based SPC Procedure for Complicated Functional Data”. International Journal of Production Research, 44(4), 729-744.
Zhou, S., Sun, B., and Shi, J. (2006). “An SPC Monitoring System for Cycle-Based Waveform Signals Using Haar Transform”. IEEE Transactions on Automation Science and Engineering, 3(1).
74
Phase I Monitoring of Nonlinear Profiles
James D. Williams, William H. Woodall and Jeffrey B. Birch, 2003
• Method 1 (sample covariance matrix) does not take into account the sequential sampling structure of the data:
  – The overall probability of detecting a shift in the mean vector will decrease (see Sullivan and Woodall, 1996)
  – Should not be used
• Method 2 (successive differences) accounts for the sequential sampling scheme and gives a more robust estimate of the covariance matrix
• In the VDP example, Methods 1 and 2 gave the same result because
  – There was no apparent shift in the mean vector
  – There were only about two outliers
• Method 3 (intra-profile pooling) should be used when there is no profile-to-profile common-cause variability
• Comparison of the three methods:
  – Method 1 assumes all variability is due to common cause
  – Method 3 assumes that no variability is due to common cause
  – Method 2 is somewhere in the middle