Event Reconstruction and
Particle Identification
Yong LIU 刘 永
The University of Alabama
PRC-US Workshop, Beijing, June 11-18, 2006
On Behalf of the MiniBooNE Collaboration
MiniBooNE Event Reconstruction and Particle Identification
MiniBooNE Collaboration
University of Alabama: Y.Liu, D.Perevalov, I.Stancu
Bucknell University: S.Koutsoliotas
University of Cincinnati: R.A.Johnson, J.L.Raaf
University of Colorado: T.Hart, R.H.Nelson, M.Tzanov, M.Wilking, E.D.Zimmerman
Columbia University: A.A.Aguilar-Arevalo, L.Bugel, L.Coney, J.M.Conrad, Z.Djurcic, J.M.Link, K.B.M.Mahn, J.Monroe, D.Schmitz, M.H.Shaevitz, M.Sorel, G.P.Zeller
Embry Riddle Aeronautical University: D.Smith
Fermi National Accelerator Laboratory: L.Bartoszek, C.Bhat, S.J.Brice, B.C.Brown, D.A.Finley, R.Ford, F.G.Garcia, P.Kasper, T.Kobilarcik, I.Kourbanis, A.Malensek, W.Marsh, P.Martin, F.Mills, C.Moore, E.Prebys, A.D.Russell, P.Spentzouris, R.J.Stefanski, T.Williams
Indiana University: D.C.Cox, T.Katori, H.Meyer, C.C.Polly, R.Tayloe
Los Alamos National Laboratory: G.T.Garvey, A.Green, C.Green, W.C.Louis, G.McGregor, S.McKenney, G.B.Mills, H.Ray, V.Sandberg, B.Sapp, R.Schirato, R.Van de Water, N.L.Walbridge, D.H.White
Louisiana State University: R.Imlay, W.Metcalf, S.Ouedraogo, M.O.Wascko
University of Michigan: J.Cao, Y.Liu, B.P.Roe, H.J.Yang
Princeton University: A.O.Bazarko, P.D.Meyers, R.B.Patterson, F.C.Shoemaker, H.A.Tanaka
Saint Mary's University of Minnesota: P.Nienaber
Western Illinois University: E.Hawker
Yale University: A.Curioni, B.T.Fleming
Global solar data and KamLAND: S. Ahmed et al., Phys. Rev. Lett. 92, 181301 (2004)
Super-Kamiokande and K2K data: G. Fogli et al., Phys. Rev. D 67, 093006 (2003)
LSND: A. Aguilar et al., Phys. Rev. D 64, 112007 (2001)
The primary physics goal of MiniBooNE is to definitively confirm or rule out the oscillation signal seen by the LSND experiment.
$P(\bar{\nu}_\mu \to \bar{\nu}_e) = \sin^2(2\theta)\,\sin^2\!\left(1.27\,\Delta m^2\,\frac{L}{E}\right) = (0.264 \pm 0.067 \pm 0.045)\%$
Total excess = 87.9±22.4±6.0 (3.8σ)
To achieve the MiniBooNE physics goal, the BooNE proposal (Dec. 7, 1997) requires Particle Identification performance of
• e efficiency ~ 50%
• μ contamination ~ 0.1%
• π0 contamination ~ 1%
and accordingly very good resolution from Event Reconstruction is desired in
• position
• direction
• energy / π0 mass
Poor event reconstruction => poor Particle Identification
• 12-meter diameter spherical tank
• 1280 PMTs in the inner region
• 240 PMTs in the outer veto region
• 950,000 liters of ultra-pure mineral oil
MiniBooNE Event Reconstruction - Overview
Reconstruct what?
• Position (x, y, z, t)
• Direction (ux, uy, uz)
• Energy/mass E/m
How to reconstruct?
• Light model
• Time likelihood - position
• Charge likelihood - direction
Reconstruction performance
• Position resolution
• Direction resolution
• Energy/Pi0 mass resolution
MiniBooNE Event Reconstruction – light model
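[Figure: point-like light source model. An event track at (x, y, z, t) with direction (ux, uy, uz) emits directional Cherenkov light (flux ρ, cone angle θc) and isotropic scintillation light (flux φ). PMT i at (xi, yi, zi), at distance ri and angle η, records time ti and charge qi.]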
Assume a point-like light source model for e
• Cerenkov light - directional
• Scintillation light - isotropic
• Predicted charge at PMT i
$\mu_i = \mu_i^{CER} + \mu_i^{SCI}$
$\mu_i^{CER} = \rho\,\epsilon_i\,F(\cos\theta_i, E)\,f(\cos\eta_i)\,\frac{\exp(-r_i/\lambda_{CER})}{r_i^2}$
$\mu_i^{SCI} = \phi\,\epsilon_i\,f(\cos\eta_i)\,\frac{\exp(-r_i/\lambda_{SCI})}{r_i^2}$
• Model input parameters
1. Cerenkov angular distribution F(cosθ, E)
2. PMT angular response f(cosη)
3. Cerenkov attenuation length λCER
4. Scintillation attenuation length λSCI
5. Relative quantum efficiency εi
• Minimize with respect to the Cerenkov/scintillation flux (ρ, φ)
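A minimal sketch of the predicted-charge calculation for one PMT, assuming each light component has the form flux × angular factors × exp(-r/λ) / r². The function name, argument names and the flat default angular functions are hypothetical placeholders, not the collaboration's code; the real Cerenkov angular distribution and PMT angular response are measured inputs.

```python
import math

def predicted_charge(vertex, direction, pmt_pos, pmt_normal,
                     rho, phi, lam_cer, lam_sci, energy=1.0, eff=1.0,
                     cer_angular=lambda cos_theta, energy: 1.0,
                     pmt_response=lambda cos_eta: 1.0):
    """Predicted charge at one PMT from a point-like source (illustrative).

    rho, phi         : Cerenkov and scintillation fluxes
    lam_cer, lam_sci : attenuation lengths, same length unit as positions
    eff              : relative quantum efficiency of this PMT
    cer_angular      : placeholder for F(cos theta, E)
    pmt_response     : placeholder for f(cos eta)
    pmt_normal       : unit vector along the PMT axis, pointing into the tank
    """
    dx = [p - v for p, v in zip(pmt_pos, vertex)]
    r = math.sqrt(sum(d * d for d in dx))
    unit = [d / r for d in dx]
    # Angle between the track direction and the line from vertex to PMT.
    cos_theta = sum(u * d for u, d in zip(unit, direction))
    # Angle of incidence of the light on the PMT face.
    cos_eta = -sum(u * n for u, n in zip(unit, pmt_normal))
    common = eff * pmt_response(cos_eta) / (r * r)
    mu_cer = rho * cer_angular(cos_theta, energy) * math.exp(-r / lam_cer) * common
    mu_sci = phi * math.exp(-r / lam_sci) * common
    return mu_cer + mu_sci
```

In a fit, this prediction would be evaluated for every PMT and compared with the measured charges while ρ and φ are varied.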
MiniBooNE Event Reconstruction - Charge Likelihood
The probability of measuring a charge q for a predicted charge μ:
$P(n;\mu) = \frac{\mu^n e^{-\mu}}{n!}$
$P(q;\mu) = \sum_{n=0}^{\infty} P(q;n)\,P(n;\mu)$
Three methods to extract the charge likelihood
A. Fill a 2-D histogram H(q, μ), normalize the q distribution for each μ bin, get -log versus μ for each q bin
B. From the hit/no-hit probability minimization procedure, get H(q, μ), then same as A
C. Start from the one-PE charge response curve, generate P(q;n), assume a Poisson distribution, calculate P(q;μ), take -log
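Method C can be sketched as follows: fold a per-photoelectron charge response P(q;n) with a Poisson distribution in n to get P(q;μ), then take the negative log. The Gaussian response and its `gain`, `width` and `pedestal` values are assumptions made for illustration only; the real one-PE response curve is a measured detector input.

```python
import math

def poisson(n, mu):
    """P(n; mu): probability of n photoelectrons for predicted charge mu."""
    return math.exp(-mu) * mu ** n / math.factorial(n)

def gaussian_pe_response(q, n, gain=1.0, width=0.4, pedestal=0.1):
    """Hypothetical P(q; n): Gaussian smearing of an n-PE signal."""
    mean = n * gain
    sigma = pedestal if n == 0 else width * math.sqrt(n)
    return math.exp(-0.5 * ((q - mean) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def charge_likelihood(q, mu, response=gaussian_pe_response, n_max=60):
    """P(q; mu) = sum over n of P(q; n) P(n; mu), truncated at n_max."""
    return sum(response(q, n) * poisson(n, mu) for n in range(n_max + 1))

def neg_log_charge_likelihood(q, mu):
    """Negative log of P(q; mu), the quantity summed over PMTs in a fit."""
    return -math.log(charge_likelihood(q, mu))
```

The truncation at `n_max` is adequate only while μ stays well below it; a larger cutoff would be needed for very bright hits.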
MiniBooNE Event Reconstruction – Time likelihood
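1. Corrected time
$t_{corr}(i) = t_i - \frac{r_i}{c_n} - t_0$
2. Cerenkov light tcorr(i) distribution
$T_{cer}(t_{corr}) = \frac{1}{\sqrt{2\pi}\,\sigma_c(\mu,E)} \exp\left(-\frac{[t_{corr} - t_0^c(\mu,E)]^2}{2\sigma_c^2(\mu,E)}\right)$
3. Scintillation light tcorr(i) distribution
$T_{sci}(t_{corr}) = \frac{1}{2\tau_s(\mu,E)} \exp\left(\frac{\sigma_s^2(\mu,E)}{2\tau_s^2(\mu,E)} - \frac{t_{corr} - t_0^s(\mu,E)}{\tau_s(\mu,E)}\right) \mathrm{Erfc}\left(\frac{\sigma_s(\mu,E)}{\sqrt{2}\,\tau_s(\mu,E)} - \frac{t_{corr} - t_0^s(\mu,E)}{\sqrt{2}\,\sigma_s(\mu,E)}\right)$
4. Input
Cerenkov light: t0cer, σcer
Scintillation light: t0sci, σsci, τsci
5. Total negative log time likelihood
$L(t) = -\sum_i \log\left(\frac{\mu_i^c}{\mu_i^c + \mu_i^s}\,T_{cer}(t_{corr}(i)) + \frac{\mu_i^s}{\mu_i^c + \mu_i^s}\,T_{sci}(t_{corr}(i))\right)$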
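The time likelihood can be sketched as below, assuming a Gaussian for the Cerenkov corrected-time distribution, an exponential decay convolved with a Gaussian for scintillation, and a mixture weighted by the predicted Cerenkov and scintillation charge at each PMT. All function names are hypothetical, and the timing parameters are passed as constants here, whereas in the source they depend on (μ, E).

```python
import math

def corrected_time(t_hit, r, t0, c_n):
    """Hit time minus light travel time (speed c_n in oil) and event time t0."""
    return t_hit - r / c_n - t0

def cer_pdf(t, t0, sigma):
    """Gaussian corrected-time distribution for Cerenkov light."""
    return math.exp(-0.5 * ((t - t0) / sigma) ** 2) / (math.sqrt(2 * math.pi) * sigma)

def sci_pdf(t, t0, sigma, tau):
    """Exponential decay (lifetime tau) convolved with a Gaussian (width sigma)."""
    arg = sigma ** 2 / (2 * tau ** 2) - (t - t0) / tau
    z = sigma / (math.sqrt(2) * tau) - (t - t0) / (math.sqrt(2) * sigma)
    return math.exp(arg) * math.erfc(z) / (2 * tau)

def neg_log_time_likelihood(tcorrs, mu_cer, mu_sci, cer_par, sci_par):
    """Sum of -log(mixture pdf) over hit PMTs.

    cer_par = (t0, sigma); sci_par = (t0, sigma, tau).
    PMTs with no predicted charge carry no timing information and are skipped.
    """
    total = 0.0
    for t, mc, ms in zip(tcorrs, mu_cer, mu_sci):
        if mc + ms <= 0:
            continue
        w = mc / (mc + ms)
        total -= math.log(w * cer_pdf(t, *cer_par) + (1 - w) * sci_pdf(t, *sci_par))
    return total
```

Both component distributions integrate to one, so the mixture is a proper probability density for any weight between 0 and 1.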
MiniBooNE Event Reconstruction –Timing parameter
• Cerenkov: look at hits in the Cerenkov cone
• Scintillation: look at hits in the backward direction
• Get tcorr = tcorr(μ, E), fit to the CER and SCI T(tcorr), iterate
MiniBooNE Event Reconstruction – process chart
Calibrated data: xi yi zi ti qi
Initial guess
  x = ∑(xi qi) / ∑qi
  t = ∑ qi (ti – |xi – x|/c) / ∑qi
Fast fit (TLLK): x y z t
  dx = ∑ qi (xi – x) / |xi – x|, ux = dx / |dx|
  d = R – |x|, E = Q f(d), CER = c1 E, SCI = c2 E
Full fit (TLLK+QLLK): x y z t ux uy uz
  d = R – |x|, E = Q f(d), CER = c1 E, SCI = c2 E
Flux fit (TLLK+QLLK): Cer Sci flux
Track fit (TLLK+QLLK): track length
Pi0 fit, step 1: x1 y1 z1 t1 fcer Θ1 φ1 s1 Θ2 φ2 s2
  x1 = x, y1 = y, z1 = z, t1 = t, ux1 = ux, uy1 = uy, uz1 = uz
  ux2 uy2 uz2, Cer1 Cer2
  fcer, e1, e2, s1 = s(e1), s2 = s(e2)
Pi0 fit, step 2: Cer1 Θ1 φ1 s1 Cer2 Θ2 φ2 s2
  sci1 = Cse e1, sci2 = Cse e2
Pi0 fit, step 3: Cer1 Cer2 Sci1 Sci2
  e1 = Cer1 / Cce, e2 = Cer2 / Cce
Output: Pi0 cosine(γ1 γ2), e1, e2, Pi0 mass
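The first and last boxes of the chart can be sketched in a few lines: the charge-weighted initial guess for position, time and direction, and the Pi0 mass from the two photon energies and their opening angle. The invariant-mass formula m = sqrt(2 e1 e2 (1 - cos θ)) for two photons is standard kinematics rather than something stated on the slide, and the function names and the default value of `c` (vacuum speed of light in cm/ns) are assumptions.

```python
import math

def initial_guess(hits, c=29.98):
    """Charge-weighted mean position and time.

    hits: list of (x, y, z, t, q) tuples; positions in cm, times in ns.
    """
    qtot = sum(h[4] for h in hits)
    pos = [sum(h[k] * h[4] for h in hits) / qtot for k in range(3)]
    t = sum(h[4] * (h[3] - math.dist(h[:3], pos) / c) for h in hits) / qtot
    return pos, t

def initial_direction(hits, pos):
    """Unit vector along the charge-weighted sum of unit vectors to the hits.

    Undefined (division by zero) if the hits are perfectly symmetric about pos.
    """
    d = [0.0, 0.0, 0.0]
    for h in hits:
        r = math.dist(h[:3], pos)
        if r == 0:
            continue
        for k in range(3):
            d[k] += h[4] * (h[k] - pos[k]) / r
    norm = math.sqrt(sum(v * v for v in d))
    return [v / norm for v in d]

def pi0_mass(e1, e2, cos_gg):
    """Invariant mass of two photons with energies e1, e2 and opening-angle cosine cos_gg."""
    return math.sqrt(2.0 * e1 * e2 * (1.0 - cos_gg))
```

These values only seed the likelihood fits; the fast and full fits then refine them.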
MiniBooNE Event Reconstruction - performance
[Figures: reconstruction performance plots on two slides, marked Preliminary]
MiniBooNE ParticleID - Overview
ParticleID - what does it do?
• Signal events
• Background events
ParticleID - how is it done?
• Variables - construction and selection
• Algorithm - Simple cuts/ANN/Boosting
ParticleID - is it reliable and powerful?
• Input - variable distributions and correlations agree between data and MC
• Output - data and MC agree
• The performance
MiniBooNE ParticleID – Signal and Background
For the νe appearance search in MiniBooNE
Signal = oscillation νe CCQE events
Background = everything else
The oscillation sensitivity study shows the most important backgrounds:
A. Intrinsic νe from K+, K0 and μ+ decay - indistinguishable from signal
B. NC πo: νμ + n/p → νμ + n/p + πo
C. νμ CCQE: νμ + n → μ- + p
D. Δ radiative decay: Δ → N + γ
MiniBooNE ParticleID – π0 misID cases
πo can be misidentified as an electron for physics reasons and because of detector limitations
• High-energy Pi0: the Lorentz boost brings the two gamma directions close together
• Very asymmetric Pi0 decay: one ring is too small
• Pi0 close to the tank wall: one gamma converts behind the PMTs
[Figure: sketches of the three misidentification cases]
MiniBooNE ParticleID
ParticleID is based primarily on event topology
[Figure: real-data event displays for e, μ and πo]
MiniBooNE ParticleID - space-time information
How to extract the event topology from a set of PMT hits
An event = {(xk, yk, zk), tk, Qk}, k = 1, 2, …, NTankHits
What we know is actually the space and time distribution of charge.
The event topology is characterized by the charge/hits fraction in space/time bins.
Point-like model: for an event at (x, y, z, t) with direction (ux, uy, uz), hit k at {(xk, yk, zk), tk, Qk} lies at distance rk and angle θ from the event direction, with corrected time
dtk = tk – rk/cn – t
[Figure: point-like model geometry, showing θ, θc, rk and s]
MiniBooNE ParticleID – Construct input variables
How to construct the ParticleID variables
• Binning in cosθ relative to the event direction - record hits/PMT number, measured/predicted charge, time/charge likelihood in each cosθ bin
• Binning in corrected time - record hits number, measured/predicted charge, time/charge likelihood in each corrected time bin
• Binning in ring sharpness - record hits/PMT number, measured/predicted charge, time/charge likelihood in each ring sharpness bin
Take physically meaningful ratios within a bin and combinations of different bins. Dimensionless quantities are preferred.
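As an illustration of one such binned, dimensionless variable, the sketch below computes the fraction of the total measured charge falling in each cosθ bin relative to the reconstructed event direction. The function name, the equal-width binning and the choice of ten bins are assumptions for the example, not the binning used in the analysis.

```python
import math

def cos_theta_charge_fractions(hits, vertex, direction, n_bins=10):
    """Fraction of total charge in equal-width cos(theta) bins over [-1, 1].

    hits: list of (x, y, z, t, q); direction: unit vector of the event track.
    """
    sums = [0.0] * n_bins
    for x, y, z, _t, q in hits:
        dx = (x - vertex[0], y - vertex[1], z - vertex[2])
        r = math.sqrt(sum(d * d for d in dx))
        if r == 0:
            continue
        cos_t = sum(d * u for d, u in zip(dx, direction)) / r
        # Clamp the index so rounding at cos(theta) = +/-1 stays in range.
        idx = min(max(int((cos_t + 1.0) / 2.0 * n_bins), 0), n_bins - 1)
        sums[idx] += q
    total = sum(sums)
    return [s / total for s in sums] if total > 0 else sums
```

Ratios of such fractions, for example forward bins over backward bins, are dimensionless and can be compared between data and Monte Carlo bin by bin.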
MiniBooNE ParticleID – Other input variables
Other ParticleID variables
• Reconstructed physical observables - e.g. πo mass, energy, track length, Cerenkov/scintillation light flux, production angle, etc.
• Reconstructed geometrical quantities - e.g. radius r, u·r, distance along track to wall, etc.
• Difference of likelihood between fits under different hypotheses - electron/muon/pi0 fitting
These variables are very powerful!
MiniBooNE ParticleID – Use how many inputs
How many variables do we need?
In the ideal case, we can focus on the track instead of the PMT hits. The least number of variables needed to describe one track is ~ 10
• Radius r - from tank center to MGEP
• Angle α - between track and radial direction
• Energy E
• Light emission per unit length - parametrized by some parameters
[Figure: track at (x, y, z, t) with direction (ux, uy, uz), radius r and angle α]
At most, the number of variables we have is
{(xh, yh, zh), th, Qh} × NTank PMTs = 5 × NTank PMTs
but they are highly correlated!
For πo events, twice as many variables are needed.
MiniBooNE ParticleID – How to select variables
How to select ParticleID variables: reliability & efficiency
ParticleID algorithm training and testing have to rely on Monte Carlo.
1. Does the variable distribution agree between data and MC?
2. Does the correlation between variables agree between data and MC?
These two requirements ensure that the output agrees between data and MC, and hence the reliability of ParticleID.
Check with open box, cosmic ray calibration and NuMI data/MC.
The number of events in each node of the trees can test the correlation between variables, and gives a natural data/MC comparison. Energy/geometry variable dependence.
3. Is the variable/combination powerful in separation?
Too many inputs may degrade the ParticleID performance.
MiniBooNE ParticleID – Data/MC comparison
[Figures: input variable data/MC comparison plots on two slides]
MiniBooNE ParticleID - Algorithm
Choose which algorithm
SC = Simple Cuts, ANN = Artificial Neural Network, BDT = Boosted Decision Tree

                     SC          ANN      BDT
Variable number      Up to ~10   ~30      ~200
Parameters to fit    0           ~1000    ~10000
Control parameters   0           ~10      ~3
Performance          Not good    Good     Better

Boosting is preferred in MiniBooNE to get better sensitivity, but the Simple Cuts method and ANN can provide a cross check.
A reasonably larger number of input variables may give higher performance, but fewer input variables may be more reliable.
MiniBooNE ParticleID Boosting
Boosting – boosted decision tree
A. Generate tree
1. Boosting: how to split a node – choose variable and cut
Define GiniIndex = P (1 - P) ∑w(S+B), with P = ∑wS / ∑w(S+B), where w is the event weight. For a pure background or pure signal node, GiniIndex = 0.
G = GiniIndexFather – ( GiniIndexLeftSon + GiniIndexRightSon )
For a given node, determine which variable and cut value maximizes G.
2. Boosting: how to generate a tree – choose node to split
Among the existing leaves, find the one which gives the biggest G and split it. Repeat this process to generate a tree of the chosen size.
[Figure: example tree. Start here: split on variable i at cut ci into variable(i) < ci and variable(i) >= ci; a daughter node is then split on variable k at cut ck into variable(k) < ck and variable(k) >= ck.]
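The node-splitting rule can be sketched as a brute-force scan: for each variable, try a cut between every pair of adjacent distinct values and keep the (variable, cut) pair with the largest Gini gain G. The event layout `(features, is_signal, weight)` and the function names are assumptions for this example; a production implementation would sort once per variable instead of rescanning.

```python
def gini(w_sig, w_bkg):
    """Weighted Gini index P (1 - P) * sum(w); zero for a pure node."""
    total = w_sig + w_bkg
    if total == 0:
        return 0.0
    p = w_sig / total
    return total * p * (1.0 - p)

def best_split(events):
    """Return (gain, variable index, cut) maximizing the Gini gain G.

    events: list of (features, is_signal, weight).
    Events with variable < cut go to the left son, the rest to the right son.
    """
    w_sig = sum(w for _, s, w in events if s)
    w_bkg = sum(w for _, s, w in events if not s)
    parent = gini(w_sig, w_bkg)
    best = (0.0, None, None)
    for i in range(len(events[0][0])):
        values = sorted({f[i] for f, _, _ in events})
        for lo, hi in zip(values, values[1:]):
            cut = 0.5 * (lo + hi)
            left_s = sum(w for f, s, w in events if s and f[i] < cut)
            left_b = sum(w for f, s, w in events if not s and f[i] < cut)
            gain = parent - (gini(left_s, left_b)
                             + gini(w_sig - left_s, w_bkg - left_b))
            if gain > best[0]:
                best = (gain, i, cut)
    return best
```

Growing a tree then amounts to calling `best_split` on every current leaf and splitting the leaf with the largest gain, until the chosen tree size is reached.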
MiniBooNE ParticleID – Boosted decision trees
B. Boost tree
3. Boosting: how to boost the tree - choose an algorithm to change the event weights
Define the polarity of a node: polarity = +1 if signal is more than background, polarity = -1 if background is more than signal.
Take ALL the events in a leaf as signal events if the polarity of that leaf is positive; otherwise, take all of them as background events. Mark the events which are misidentified. Reduce the weight of the correctly identified events and increase the weight of the misidentified events. Then generate the next tree.
C. Output
4. Boosting: how to calculate the output value - sum over (polarity × tree weight) in all trees
See B. Roe et al., NIM A543 (2005) 577 and references therein for details.
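One common way to realize the reweighting step is an AdaBoost-style update, sketched below. The slide does not say which update rule is used, so the formula for the tree weight, the `beta` factor and the renormalization are assumptions of this example; only the overall pattern (raise misidentified events, lower correctly identified ones, sum polarity × tree weight) comes from the text.

```python
import math

def boost_step(polarities, labels, weights, beta=0.5):
    """One AdaBoost-style reweighting step (assumed variant).

    polarities: +1/-1 polarity of the leaf each event landed in
    labels:     +1 for signal, -1 for background
    Returns (tree_weight, new_weights); the total weight is preserved.
    """
    total = sum(weights)
    err = sum(w for p, l, w in zip(polarities, labels, weights) if p != l) / total
    err = min(max(err, 1e-12), 1.0 - 1e-12)   # guard against log(0)
    alpha = beta * math.log((1.0 - err) / err)
    raised = [w * math.exp(alpha) if p != l else w
              for p, l, w in zip(polarities, labels, weights)]
    # Renormalizing lowers the correctly identified events relative to the rest.
    scale = total / sum(raised)
    return alpha, [w * scale for w in raised]

def boosted_output(polarities_per_tree, tree_weights):
    """Output value for one event: sum of polarity x tree weight over all trees."""
    return sum(a * p for a, p in zip(tree_weights, polarities_per_tree))
```

Large positive outputs are signal-like and large negative outputs background-like, so a single cut on this value replaces the set of individual cuts.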
MiniBooNE ParticleID
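Simple Cuts and Boosted Decision Tree
Simple Cuts → (generalization) → Decision Tree → (improvement) → Boosted Decision Tree
[Figure: a two-cut selection drawn as a tree. All events are split on variable 1 at cut c1 into Var1 < c1 and Var1 >= c1; the Var1 >= c1 branch is split on variable 2 at cut c2 into Var2 < c2 and Var2 >= c2.]
Selected events: Var1 < c1 || (Var1 >= c1 && Var2 < c2)
Simple Cuts can be taken as one tree, few variables, few nodes.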
MiniBooNE ParticleID - conclusion on algorithm
Some conclusions based on our past experience
• Boosting is better than Artificial Neural Network
Boosting performance is higher in the many-variable (>20) case and is relatively insensitive to detector MC in comparison to ANN.
• Cascade Boosting is better than non-Cascade Boosting
Cascade Boosting: build a first boosting, use it as a cut to select the training events for a second boosting, then use the second boosting. Cascade training can improve performance by 25~30% or even more relative to non-Cascade training, especially in the low background contamination region.
• Combining individual separation outputs can improve performance further, by about 10~20%
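The cascade idea can be sketched independently of the classifier used. In the example below, `train` is a hypothetical callable that takes events and labels and returns a scoring function; rejecting events that fail the first-stage cut outright is an assumption of this sketch, since the slide only says that the second boosting is the one used.

```python
def cascade_train(train, events, labels, first_cut):
    """Two-stage cascade training (illustrative sketch).

    train(events, labels) -> score function, supplied by the caller.
    Events scoring above first_cut in stage one form the training sample
    of stage two; both classes must survive the cut for stage two to train.
    """
    first = train(events, labels)
    kept = [(e, l) for e, l in zip(events, labels) if first(e) > first_cut]
    second = train([e for e, _ in kept], [l for _, l in kept])

    def score(event):
        # Events failing the first-stage cut are rejected (assumed behavior).
        if first(event) <= first_cut:
            return float("-inf")
        return second(event)

    return score
```

Because the second stage sees only events that already look signal-like, it can spend its capacity on the hard-to-separate background, which is consistent with the gain being largest in the low-contamination region.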
MiniBooNE ParticleID - Cascade Boosting
[Figure: outputs of the 1st boosting (cascade), the 2nd boosting (cascade) and the combined individual outputs, marked Preliminary]
MiniBooNE ParticleID – Output data/MC comparison
[Figures: output data/MC comparison plots on two slides]
MiniBooNE ParticleID – How to play
Event counting
Optimize PID cuts to maximize
$\frac{N_S}{\sqrt{\sum_i N_B^i + \sum_i (\sigma_i N_B^i)^2}}$
Energy or/and ParticleID spectrum fitting
After some precuts, do
• Energy spectrum fit
• PID output distribution fit
• Energy and PID two-dimensional fit
to get the oscillation sensitivity
Conclusion
MiniBooNE Event Reconstruction provides
• Energy resolution ~ 14%
• Position resolution ~ 23 cm
• Direction resolution ~ 6°
• Pi0 mass resolution ~ 23 MeV/c²
Based on the reconstruction information, with
• Boosted decision trees
• Cascade training
• Combining specialist algorithms
a much better ParticleID than the BooNE proposal required has been achieved!
• ~ 67% electron efficiency
• 1% Pi0 contamination
• < 0.1% muon contamination