date:2004/03/05
Mining Frequent Episodes for relating Financial Events and Stock Trends
Anny Ng and Ada Wai-chee Fu PAKDD 2003
Presenter: Ming Jing Tsai
Definition
Events: financial news, political …
e1, e2, e3, …, ek: event types
Day record Di: {ei1, ei2, ei3, …, eik}
Episode: {e1, e2, e3, …, ek}; has at least two elements, and at least one ej is a stock event type
Window = x days
Definition
Window frequency: number of windows that contain an event type
DB frequency: number of occurrences of an event type in the DB
Frequency of an episode: number of windows that contain the episode and whose first day contains at least one of the event types in the episode
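The three frequency definitions can be checked with a short Python sketch. It uses the seven-day toy database from the later slides (window = 3). Two points are assumptions rather than statements from the slides: one window starts on every day, with the last windows truncated, and an episode counts in a window only if all of its event types occur there and at least one occurs on the first day. The function names are illustrative.

```python
from collections import Counter

# Toy database from the slides: day number -> event types recorded that day.
DAYS = {1: "b", 2: "ac", 3: "b", 4: "d", 5: "b", 6: "ca", 7: "d"}
WINDOW = 3  # window size in days


def windows(days, size):
    """One window per start day; windows near the end are truncated (assumed)."""
    last = max(days)
    return [[set(days[d]) for d in range(start, min(start + size, last + 1))]
            for start in sorted(days)]


def db_frequency(days):
    """Occurrences of each event type over all day records."""
    return Counter(e for events in days.values() for e in events)


def window_frequency(days, size):
    """Number of windows in which each event type appears at least once."""
    return Counter(e for w in windows(days, size) for e in set().union(*w))


def episode_frequency(episode, days, size):
    """Windows containing every event type of the episode, with at least
    one of them on the window's first day (assumed reading of the slide)."""
    episode = set(episode)
    return sum(1 for w in windows(days, size)
               if episode <= set().union(*w) and episode & w[0])


print(dict(db_frequency(DAYS)))
print(dict(window_frequency(DAYS, WINDOW)))
print(episode_frequency("bd", DAYS, WINDOW))
```

The "at least one stock event type" condition is left out because the toy example does not say which of a, b, c, d are stock events.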
Construct event tree
Header table: event types in descending DB frequency order
Event_set pair <(first day), (remaining days)>: each part sorted in descending DB frequency order
Node <E:C:B>: E = event type, C = count, B = binary bit
Pruning method
Remove event types whose window frequency < min_sup
Remove an event type from the remaining-days part if it also appears in the first-day part
An event database (window = 3, min_sup = 3)
Day  Events
1    b
2    a, c
3    b
4    d
5    b
6    c, a
7    d
DB frequencies: <a:2, b:3, c:2, d:2>
Windows
Window  Days included  Event_set pair
1       1,2,3          <(b),(a,c)>
2       2,3,4          <(a,c),(b,d)>
3       3,4,5          <(b),(d)>
4       4,5,6          <(d),(b,a,c)>
5       5,6,7          <(b),(a,c,d)>
6       6,7            <(a,c),(d)>
7       7              <(d),()>
Ordered frequent event types: <b,a,c,d>
Window frequencies: <a:5, b:5, c:5, d:6>
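A minimal sketch of how the event_set pairs in the table could be produced from the day records. The function names are illustrative. Breaking ties in DB frequency alphabetically is an assumption that happens to reproduce the order <b,a,c,d>; the slides do not state a tie-breaking rule.

```python
from collections import Counter

DAYS = {1: "b", 2: "ac", 3: "b", 4: "d", 5: "b", 6: "ca", 7: "d"}
WINDOW, MIN_SUP = 3, 3


def header_order(days):
    """Event types by descending DB frequency (ties alphabetical: assumed)."""
    freq = Counter(e for events in days.values() for e in events)
    return sorted(freq, key=lambda e: (-freq[e], e))


def event_set_pairs(days, size, min_sup):
    last = max(days)
    spans = [range(s, min(s + size, last + 1)) for s in sorted(days)]
    # Window frequency: number of windows in which an event type appears.
    win_freq = Counter(e for span in spans
                       for e in set().union(*(days[d] for d in span)))
    # Pruning step 1: drop event types whose window frequency is below min_sup.
    keep = [e for e in header_order(days) if win_freq[e] >= min_sup]
    pairs = []
    for span in spans:
        first = set(days[span[0]])
        # Pruning step 2: an event already in the first-day part is not
        # repeated in the remaining-days part.
        rest = set().union(*(days[d] for d in span[1:])) - first
        pairs.append(([e for e in keep if e in first],
                      [e for e in keep if e in rest]))
    return pairs


for number, pair in enumerate(event_set_pairs(DAYS, WINDOW, MIN_SUP), start=1):
    print(number, pair)
```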
Window = 3, min_sup = 3
[Figure sequence: the event tree after each event_set pair is inserted, with header table order b, a, c, d. Nodes are written {E:C:B}; B = 0 marks the first-day part and B = 1 the remaining-days part. The final tree:]
{null}
  {b:3:0}
    {a:2:1}
      {c:2:1}
        {d:1:1}
    {d:1:1}
  {a:2:0}
    {c:2:0}
      {b:1:1}
        {d:1:1}
      {d:1:1}
  {d:2:0}
    {b:1:1}
      {a:1:1}
        {c:1:1}
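The insertion procedure behind these trees can be sketched in Python as a prefix tree keyed on (event type, bit). This is an illustrative sketch, not the authors' implementation: the class and function names are made up, the reading of the bit (0 = first-day part, 1 = remaining-days part) is inferred from the slide example, and the header table's node links are omitted.

```python
class Node:
    """Tree node <E:C:B>: event type, count, first-day/remaining-days bit."""

    def __init__(self, event=None, bit=None):
        self.event, self.bit, self.count = event, bit, 0
        self.children = {}  # (event, bit) -> Node


def insert(root, pair):
    """Insert one event_set pair as a path: first-day part, then the rest."""
    first, rest = pair
    path = [(e, 0) for e in first] + [(e, 1) for e in rest]
    node = root
    for key in path:
        node = node.children.setdefault(key, Node(*key))
        node.count += 1  # shared prefixes accumulate counts


def dump(node, depth=0):
    """Return the tree as indented {E:C:B} lines, one per node."""
    lines = []
    for child in node.children.values():
        lines.append("  " * depth
                     + "{%s:%d:%d}" % (child.event, child.count, child.bit))
        lines.extend(dump(child, depth + 1))
    return lines


# Event_set pairs from the slides, already in header order b, a, c, d.
PAIRS = [("b", "ac"), ("ac", "bd"), ("b", "d"), ("d", "bac"),
         ("b", "acd"), ("ac", "d"), ("d", "")]

root = Node()
for pair in PAIRS:
    insert(root, pair)
print("\n".join(dump(root)))
```

Keying children on the pair (event type, bit) keeps a first-day b and a remaining-days b as separate nodes, which is what the slide snapshots show.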
Mining frequent episodes
Header table {h0, h1, …, hH}
Mine recursively each of the linked lists kept at the header table, from bottom to top
Conditional paths can be used to build a conditional event tree
Objective 1: find frequent episodes of the form {a} ∪ {hi}
  first-part frequencies
Objective 2: find frequent episodes that contain hi and at least two other event types
  DB frequencies
Traverse conditional path
Remove invalid event types
Adjust the counts of nodes above hi in the path to be equal to that of hi
If hi is in the firstdays part, move all event types in the remainingdays part to the firstdays part
Remove hi from the path
Generate frequent episodes
When a conditional event tree contains only a single path:
Any subset of firstpart ∪ event base set
Any subset of firstpart ∪ any subset of remainingpart ∪ event base set
Mining Header d
<(a:1,c:1),(b:1)> <(b:1),()> <(b:1,a:1,c:1),()> <(b:1),(a:1,c:1)> <(a:1,c:1),()>
Event base set: {d}
DB frequency: {<b:4,a:4,c:4>}
First-part frequency: {<b:3,a:3,c:3>}
Frequent episodes: {bd, ad, cd}
Recursively Mining Header c
<(a:1,b:1),()> <(b:1,a:1),()> <(b:1),(a:1)> <(a:1),()>
Event base set: {cd}
DB frequency: {<b:3,a:4>}
First-part frequency: {<b:3,a:3>}
Frequent episodes: {bcd, acd}
Recursively Mining Header a
<(b:1),()> <(b:1),()> <(b:1),()>
Event base set: {acd}
DB frequency: {<b:3>}
First-part frequency: {<b:3>}
Frequent episodes: {bacd}
Mining Header c
<(b:1),(a:1)> <(a:1,b:1),()> <(b:1),(a:1)> <(a:1),()>
Event base set: {c}
DB frequency: {<b:3,a:4>}
First-part frequency: {<b:3,a:2>}
Frequent episodes: {bc}
Recursively Mining Header a
<(b:1),()> <(b:1),()> <(b:1),()>
Event base set: {ac}
DB frequency: {<b:3>}
First-part frequency: {<b:3>}
Frequent episodes: {bac}
Mining Header a
<(b:1),()> <(b:1),()> <(b:1),()>
Event base set: {a}
DB frequency: {<b:3>}
First-part frequency: {<b:3>}
Frequent episodes: {ba}
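The episodes found in the walkthrough can be cross-checked by brute force. The sketch below does not use the event-tree mining from the slides. It simply enumerates every episode with at least two event types and counts it against the event_set pairs, under the assumed reading that a pair supports an episode when it contains all of the episode's event types and at least one of them is in the first-day part. Function names are illustrative, and the stock-event condition is ignored because the toy data does not mark stock events.

```python
from itertools import combinations

# Event_set pairs from the slides: (first-day part, remaining-days part).
PAIRS = [("b", "ac"), ("ac", "bd"), ("b", "d"), ("d", "bac"),
         ("b", "acd"), ("ac", "d"), ("d", "")]
MIN_SUP = 3


def frequency(episode, pairs):
    """Pairs that contain the whole episode and have part of it on day one."""
    episode = set(episode)
    return sum(1 for first, rest in pairs
               if episode <= set(first) | set(rest) and episode & set(first))


def frequent_episodes(pairs, min_sup):
    """Exhaustive search over all episodes with two or more event types."""
    events = sorted(set().union(*(first + rest for first, rest in pairs)))
    found = {}
    for size in range(2, len(events) + 1):
        for combo in combinations(events, size):
            count = frequency(combo, pairs)
            if count >= min_sup:
                found["".join(combo)] = count
    return found


for episode, count in frequent_episodes(PAIRS, MIN_SUP).items():
    print(episode, count)
```

Every episode listed in the walkthrough (bd, ad, cd, bcd, acd, bacd, bc, bac, ba) appears in the output, written in alphabetical order. Under this reading the enumeration also reports {a,b,d}, which the walkthrough does not list; the slides do not make clear whether that recursion step was simply left out. Exhaustive enumeration is exponential in the number of event types, so it is only useful as a check on small examples.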
Experiment (synthetic data)
Dataset 2 T20,I5,M1000,D3K
Experiment (real data)
News events from the Internet: 121 event types, 757 days
Stock data: Dow Jones, Nasdaq, Hang Seng, 12 top local companies
Experiment (real data)
Episode                                              Support
Nasdaq downs, PCCW downs                             151
Nasdaq ups, SHK properties flats, HSBC flats         178
China Mobile downs, Nasdaq downs, HK Electric flats  178