PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth
Students: 907737 張資昊, 907747 蔡明成. Advisor: 劉俞志. Terminology. item: a product in the customer transaction database is called an item. itemset: a non-empty set made up of one or more items, each member of which is an item.
-
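Terminology. item: a product in the customer transaction database. itemset: a non-empty set of one or more items. sequence and element: a sequence is an ordered list of itemsets, and each itemset in it is called an element. length: the number of item instances in a sequence; a sequence of length l is called an l-sequence.
Terminology (cont.): subsequence and super sequence. A sequence $\alpha = \langle a_1 a_2 \cdots a_n \rangle$ is a subsequence of $\beta = \langle b_1 b_2 \cdots b_m \rangle$, and $\beta$ is a super sequence of $\alpha$, if there exist integers $1 \le j_1 < j_2 < \cdots < j_n \le m$ such that $a_1 \subseteq b_{j_1}, a_2 \subseteq b_{j_2}, \ldots, a_n \subseteq b_{j_n}$.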
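The subsequence definition can be checked mechanically. The following is a minimal Python sketch, assuming a sequence is stored as a list of sets (one set per element); the name `is_subsequence` and this representation are illustrative choices, not something given on the slides.

```python
def is_subsequence(alpha, beta):
    """Return True if alpha's elements fit, in order, inside distinct elements of beta."""
    j = 0
    for a in alpha:
        # Advance to the earliest remaining element of beta that contains a.
        while j < len(beta) and not set(a) <= set(beta[j]):
            j += 1
        if j == len(beta):
            return False
        j += 1  # the next element of alpha must match strictly later
    return True

# beta = <a(abc)(ac)d(cf)>, alpha = <a(bc)df>
beta = [{"a"}, {"a", "b", "c"}, {"a", "c"}, {"d"}, {"c", "f"}]
alpha = [{"a"}, {"b", "c"}, {"d"}, {"f"}]
result = is_subsequence(alpha, beta)
```

Matching each element as early as possible is enough here: choosing a later match can never make the remaining elements easier to place.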
-
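Terminology (cont.): sequence database, support, and (frequent) sequential pattern. A sequence database S is a set of tuples <sid, s>, where sid is a sequence id and s is a sequence. A tuple <sid, s> contains a sequence α if α is a subsequence of s. The support of α in S is the number of tuples in the database containing α. Given a positive integer as the support threshold (min_support), α is a (frequent) sequential pattern in S if its support is at least min_support. A sequential pattern of length l is called an l-pattern.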
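A short Python sketch of support counting follows. Table 1 did not survive in this transcript, so the four sequences in `DB` are an assumption: they are the running example of the PrefixSpan paper and they reproduce the supports and postfixes quoted on the later slides. The helper names `contains`, `support`, and `is_sequential_pattern` are hypothetical.

```python
# Assumed Table 1 database: each sequence is a list of elements (sets of items).
DB = [
    [{"a"}, {"a", "b", "c"}, {"a", "c"}, {"d"}, {"c", "f"}],   # <a(abc)(ac)d(cf)>
    [{"a", "d"}, {"c"}, {"b", "c"}, {"a", "e"}],               # <(ad)c(bc)(ae)>
    [{"e", "f"}, {"a", "b"}, {"d", "f"}, {"c"}, {"b"}],        # <(ef)(ab)(df)cb>
    [{"e"}, {"g"}, {"a", "f"}, {"c"}, {"b"}, {"c"}],           # <eg(af)cbc>
]

def contains(seq, pattern):
    """True if pattern (a list of sets) is a subsequence of seq."""
    j = 0
    for elem in pattern:
        while j < len(seq) and not elem <= seq[j]:
            j += 1
        if j == len(seq):
            return False
        j += 1
    return True

def support(db, pattern):
    """Number of sequences in db that contain pattern."""
    return sum(contains(seq, pattern) for seq in db)

def is_sequential_pattern(db, pattern, min_support):
    return support(db, pattern) >= min_support

sup_ab = support(DB, [{"a"}, {"b"}])      # <ab>
sup_ab_set = support(DB, [{"a", "b"}])    # <(ab)>
```

Note that <ab> and <(ab)> are different patterns: the first needs b in a later element than a, the second needs both items in the same element.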
-
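Sequential pattern mining: earlier methods mostly follow the Apriori approach to find sequential patterns. PrefixSpan (Prefix-projected Sequential Pattern mining) reduces the effort spent on candidate subsequence generation and mines sequential patterns from progressively smaller projected databases. It is compared here with the Apriori-based GSP and with FreeSpan.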
-
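GSP. GSP is an Apriori-like method that mines sequential patterns with a multiple-pass, candidate-generation-and-test approach. The length-1 frequent sequences form the initial seed set (a seed set holds the sequential patterns found in the previous pass). Step 1 (generate): from the sequences in the seed set, generate candidate sequences whose length is one greater. Step 2 (test): count the support of each candidate sequence; the candidates that meet min_support become the new seed set. The two steps repeat until no candidate meets min_support, that is, until the seed set yields no new sequential pattern.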
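The generate-and-test loop can be sketched in Python as below. This is a deliberately simplified sketch, not GSP itself: it only grows patterns whose elements are single items (so <(ab)> is never generated), and it recounts support with a full pass per candidate. The database `DB` is the assumed Table 1 content, and all function names are hypothetical.

```python
# Assumed Table 1 database: each sequence is a list of elements (sets of items).
DB = [
    [{"a"}, {"a", "b", "c"}, {"a", "c"}, {"d"}, {"c", "f"}],
    [{"a", "d"}, {"c"}, {"b", "c"}, {"a", "e"}],
    [{"e", "f"}, {"a", "b"}, {"d", "f"}, {"c"}, {"b"}],
    [{"e"}, {"g"}, {"a", "f"}, {"c"}, {"b"}, {"c"}],
]

def contains(seq, pattern):
    """pattern is a tuple of items, each treated as its own element."""
    j = 0
    for item in pattern:
        while j < len(seq) and item not in seq[j]:
            j += 1
        if j == len(seq):
            return False
        j += 1
    return True

def support(db, pattern):
    return sum(contains(seq, pattern) for seq in db)

def gsp_single_item_elements(db, min_support):
    items = sorted({x for seq in db for elem in seq for x in elem})
    # Initial seed set: the length-1 frequent sequences.
    seeds = [(x,) for x in items if support(db, (x,)) >= min_support]
    patterns = {}
    while seeds:
        patterns.update({s: support(db, s) for s in seeds})
        # Generate: join s and t when s without its first item equals t without its last.
        candidates = {s + (t[-1],) for s in seeds for t in seeds if s[1:] == t[:-1]}
        # Test: keep only the candidates that meet min_support as the next seed set.
        seeds = sorted(c for c in candidates if support(db, c) >= min_support)
    return patterns

patterns = gsp_single_item_elements(DB, 2)
```

Every pass rescans the whole database for every candidate, which is the cost that the projection-based methods on the following slides try to avoid.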
-
GSP (cont.)
-
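GSP (cont.) Example: the length-1 frequent sequences form the first seed set. From this seed set the length-2 candidates are generated, and the candidates that meet min_support form the next seed set, which in turn generates the length-3 candidates. The process repeats until a seed set yields no further sequential pattern. Table 1. A sequence database.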
-
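GSP (cont.): drawbacks. An Apriori-like method can generate a huge set of candidate sequences (1,000 length-1 frequent sequences give 1,499,500 length-2 candidates), and it has difficulty mining long sequential patterns.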
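The figure of 1,499,500 can be reproduced with a short calculation. With n frequent items there are n × n candidates of the form <xy> (including <xx>) plus one candidate <(xy)> for every unordered pair of distinct items. The function name below is hypothetical.

```python
from math import comb

def length2_candidates(n):
    """Count length-2 candidates built from n length-1 frequent sequences."""
    ordered = n * n          # <xy> for every ordered pair, including <xx>
    same_element = comb(n, 2)  # <(xy)> for every unordered pair of distinct items
    return ordered + same_element

count = length2_candidates(1000)  # 1,000,000 + 499,500
```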
-
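FreeSpan. FreeSpan uses frequent items to project the sequence database recursively into smaller projected databases, and grows patterns inside each projected database. The example uses the database of Table 1.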
-
FreeSpan (cont.)
Find the frequent items and list them as f_list = {a:4, b:4, c:4, d:3, e:3, f:3}. The 6 items in f_list divide the mining task into 6 projected databases, one per item. The projected database of an item x is built from the sequences that contain x, keeping x and the items that come before x in f_list, so each pattern is mined in exactly one projected database.
-
FreeSpan (cont.): projected databases.
Because FreeSpan mines within projected databases, it is more efficient than GSP, but its main cost lies in building and handling those projected databases.
-
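Mining sequential patterns by prefix projections. Since the items within an element of a sequence can be listed in any order, assume they are always listed alphabetically. For example, the sequence is written <a(abc)(ac)d(cf)> rather than <a(bac)(ca)d(fc)>, which makes the representation of a sequence unique.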
-
Mining sequential patterns by prefix projections (cont.)
-
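Mining sequential patterns by prefix projections (cont.) Example (PrefixSpan): for the sequence database S of Table 1 with min_sup = 2, the prefix-projection method mines sequential patterns in the following steps. Step 1: find the length-1 sequential patterns. Scan S once to find them: <a>:4, <b>:4, <c>:4, <d>:3, <e>:3, <f>:3 (pattern:support). Step 2: divide the search space. The complete set of sequential patterns is partitioned into six subsets by prefix: (1) those with prefix <a>, ..., (6) those with prefix <f>. Step 3: find the subsets of sequential patterns. Each subset is mined by constructing the corresponding projected databases and mining them recursively. The results are listed in Table 2.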
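Step 1 amounts to one counting pass. The Python sketch below assumes the Table 1 database shown in `DB` (the table itself is missing from this transcript; these four sequences reproduce the counts quoted above). The function name is hypothetical.

```python
from collections import Counter

# Assumed Table 1 database: each sequence is a list of elements (sets of items).
DB = [
    [{"a"}, {"a", "b", "c"}, {"a", "c"}, {"d"}, {"c", "f"}],
    [{"a", "d"}, {"c"}, {"b", "c"}, {"a", "e"}],
    [{"e", "f"}, {"a", "b"}, {"d", "f"}, {"c"}, {"b"}],
    [{"e"}, {"g"}, {"a", "f"}, {"c"}, {"b"}, {"c"}],
]

def length1_patterns(db, min_sup):
    """Scan db once and return {item: support} for items meeting min_sup."""
    counts = Counter()
    for seq in db:
        # An item counts once per sequence, however often it occurs in it.
        counts.update(set().union(*seq))
    return {item: n for item, n in sorted(counts.items()) if n >= min_sup}

frequent = length1_patterns(DB, 2)
```

Item g occurs in only one sequence, so it is dropped at min_sup = 2 and never appears in any later prefix.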
-
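Mining sequential patterns by prefix projections (cont.) Example (1): to find the sequential patterns with prefix <a>, only the subsequences that start at the first occurrence of a need to be considered. For example, in the sequence <(ef)(ab)(df)cb>, only the subsequence <(_b)(df)cb> is considered when mining sequential patterns with prefix <a>. The notation (_b) means that the last element of the prefix, a, forms one element together with b. In the sequence <a(abc)(ac)d(cf)>, only the subsequence <(abc)(ac)d(cf)> is considered.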
-
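Mining sequential patterns by prefix projections (cont.) Example (2): following (1), the sequences in the sequence database S that contain <a> are projected with respect to <a>, and their postfix sequences form the <a>-projected database: <(abc)(ac)d(cf)>, <(_d)c(bc)(ae)>, <(_b)(df)cb>, <(_f)cbc>. Scanning the <a>-projected database once gives all length-2 sequential patterns with prefix <a>: <aa>:2, <ab>:4, <(ab)>:2, <ac>:4, <ad>:2, <af>:2. These sequential patterns are partitioned again into six subsets: (1) those with prefix <aa>, (2) those with prefix <ab>, ..., (6) those with prefix <af>, and a projected database is constructed for each.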
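The projection onto a single-item prefix can be sketched as follows. This is an illustrative Python sketch that returns the postfixes as strings in the slide notation; `DB` is the assumed Table 1 database and the function names are hypothetical.

```python
# Assumed Table 1 database: each sequence is a list of elements (sets of items).
DB = [
    [{"a"}, {"a", "b", "c"}, {"a", "c"}, {"d"}, {"c", "f"}],
    [{"a", "d"}, {"c"}, {"b", "c"}, {"a", "e"}],
    [{"e", "f"}, {"a", "b"}, {"d", "f"}, {"c"}, {"b"}],
    [{"e"}, {"g"}, {"a", "f"}, {"c"}, {"b"}, {"c"}],
]

def fmt_element(items, partial=False):
    """Render an element; partial=True marks the remainder of the prefix's element."""
    text = "".join(sorted(items))
    if partial:
        return "(_" + text + ")"
    return text if len(items) == 1 else "(" + text + ")"

def project_on_item(db, item):
    """Postfixes of the sequences in db with respect to the prefix <item>."""
    postfixes = []
    for seq in db:
        for k, elem in enumerate(seq):
            if item in elem:  # first occurrence only
                remainder = {x for x in elem if x > item}
                parts = [fmt_element(remainder, partial=True)] if remainder else []
                parts += [fmt_element(e) for e in seq[k + 1:]]
                if parts:  # skip empty postfixes
                    postfixes.append("".join(parts))
                break
    return postfixes

a_projected = project_on_item(DB, "a")
```

Only the first occurrence of the item is used, which matches step (1) of the example: later occurrences are already inside the postfix.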
-
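Mining sequential patterns by prefix projections (cont.) Example (3): for each prefix, build its projected database and mine the (postfix) subsequences that meet min_sup to obtain further sequential patterns. For instance, the <aa>-projected database contains the postfix subsequence <(_bc)(ac)d(cf)>.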
-
Mining sequential patterns by prefix projections (cont.) Example
-
PrefixSpan: Algorithm and correctness. Building on Lemma 3.1, PrefixSpan is stated as a recursive algorithm.
-
PrefixSpan: Algorithm and correctness (cont.): projected database
-
PrefixSpan: Algorithm and correctness (cont.)
-
PrefixSpan: Algorithm and correctness (cont.): projected database
-
PrefixSpan: Algorithm and correctness (cont.)
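The recursion can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the authors' implementation: `DB` is the assumed Table 1 database, each projected database is kept as a list of (sequence index, element index) pairs instead of copied postfixes, and the names `prefixspan`, `grow`, `extend`, `first_match`, and `fmt` are hypothetical.

```python
from collections import Counter

# Assumed Table 1 database: each sequence is a list of elements (sets of items).
DB = [
    [{"a"}, {"a", "b", "c"}, {"a", "c"}, {"d"}, {"c", "f"}],
    [{"a", "d"}, {"c"}, {"b", "c"}, {"a", "e"}],
    [{"e", "f"}, {"a", "b"}, {"d", "f"}, {"c"}, {"b"}],
    [{"e"}, {"g"}, {"a", "f"}, {"c"}, {"b"}, {"c"}],
]

def fmt(pattern):
    """Render a pattern such as [('a',), ('b', 'c')] as 'a(bc)'."""
    return "".join(e[0] if len(e) == 1 else "(" + "".join(e) + ")" for e in pattern)

def first_match(seq, start, needed):
    """Index of the first element at or after start that contains needed, else None."""
    for j in range(start, len(seq)):
        if needed <= seq[j]:
            return j
    return None

def prefixspan(db, min_sup):
    found = {}

    def grow(prefix, proj):
        # proj holds (sequence index, k): the earliest match of prefix ends at element k.
        last = set(prefix[-1]) if prefix else set()
        seq_ext, set_ext = Counter(), Counter()
        for sid, k in proj:
            seq = db[sid]
            # Items that can open a new element after the prefix.
            seq_ext.update(set().union(*seq[k + 1:]))
            # Items that can join the prefix's last element (alphabetical growth only).
            joinable = set()
            if last:
                for elem in seq[k:]:
                    if last <= elem:
                        joinable |= {x for x in elem if x > max(last)}
            set_ext.update(joinable)
        for x in sorted(seq_ext):
            if seq_ext[x] >= min_sup:
                extend(prefix + [(x,)], seq_ext[x], proj, offset=1)
        for x in sorted(set_ext):
            if set_ext[x] >= min_sup:
                merged = tuple(sorted(last | {x}))
                extend(prefix[:-1] + [merged], set_ext[x], proj, offset=0)

    def extend(pattern, count, proj, offset):
        # Record the pattern, build its projected database, and recurse.
        found[fmt(pattern)] = count
        needed = set(pattern[-1])
        new_proj = []
        for sid, k in proj:
            j = first_match(db[sid], k + offset, needed)
            if j is not None:
                new_proj.append((sid, j))
        grow(pattern, new_proj)

    grow([], [(sid, -1) for sid in range(len(db))])
    return found

patterns = prefixspan(DB, 2)
```

On the assumed database with min_sup = 2 this sketch yields 53 patterns, 22 of them of length 2, which agrees with the 53 and 22 projected databases quoted on the bi-level projection slides.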
-
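Scaling up pattern growth by bi-level projection. The main cost of PrefixSpan is building projected databases; bi-level projection reduces their number and size. Example 4. Step 1: as in the level-by-level projection of Section 3.2, scan S to find the length-1 sequential patterns <a>, <b>, <c>, <d>, <e>, <f>. Step 2: instead of building projected databases at this point, construct a 6 x 6 lower triangular matrix M, the S-matrix, shown in Table 3.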
-
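Scaling up pattern growth by bi-level projection (cont.): M[c,c] = 3 means that <cc> has support 3 in S. M[a,c] = (4, 2, 1) means that support(<ac>) = 4, support(<ca>) = 2, and support(<(ac)>) = 1.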
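The S-matrix entries can be recomputed with a small Python sketch. It assumes the Table 1 database shown in `DB`, stores the matrix as a dictionary keyed by item pairs, and recounts support per cell for clarity rather than filling the whole matrix in a single scan. All names are hypothetical.

```python
from itertools import combinations_with_replacement

# Assumed Table 1 database: each sequence is a list of elements (sets of items).
DB = [
    [{"a"}, {"a", "b", "c"}, {"a", "c"}, {"d"}, {"c", "f"}],
    [{"a", "d"}, {"c"}, {"b", "c"}, {"a", "e"}],
    [{"e", "f"}, {"a", "b"}, {"d", "f"}, {"c"}, {"b"}],
    [{"e"}, {"g"}, {"a", "f"}, {"c"}, {"b"}, {"c"}],
]

def contains(seq, pattern):
    """True if pattern (a list of sets) is a subsequence of seq."""
    j = 0
    for elem in pattern:
        while j < len(seq) and not elem <= seq[j]:
            j += 1
        if j == len(seq):
            return False
        j += 1
    return True

def support(db, pattern):
    return sum(contains(seq, pattern) for seq in db)

def s_matrix(db, items):
    """M[x, x] = support(<xx>); M[x, y] = (support(<xy>), support(<yx>), support(<(xy)>))."""
    m = {}
    for x, y in combinations_with_replacement(sorted(items), 2):
        if x == y:
            m[x, y] = support(db, [{x}, {x}])
        else:
            m[x, y] = (
                support(db, [{x}, {y}]),
                support(db, [{y}, {x}]),
                support(db, [{x, y}]),
            )
    return m

M = s_matrix(DB, ["a", "b", "c", "d", "e", "f"])
```

Each cell with a counter of at least min_sup names a length-2 sequential pattern, so the matrix replaces the six length-1 projected databases of the level-by-level approach.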
-
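Scaling up pattern growth by bi-level projection (cont.): for each length-2 sequential pattern, the corresponding projected database is constructed. Scanning such a projected database gives its sequences and frequent items, from which a 3 x 3 S-matrix for that projected database is built (Table 4).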
-
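Scaling up pattern growth by bi-level projection (cont.): projections are built only for sequential patterns (support of at least 2 here), and bi-level projection builds fewer of them than level-by-level projection. In Example 3, level-by-level projection needs 53 projected databases, while bi-level projection needs only 22 (those of the length-2 sequential patterns).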
-
Scaling up pattern growth by bi-level projection (cont.): S-matrix and items
-
Scaling up pattern growth by bi-level projection (cont.): S-matrix and items
-
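Pseudo-Projection. The main cost of PrefixSpan is building projected databases. The pseudo-projection technique avoids copying: for each sequence it stores a pointer to the sequence and the offset at which the postfix starts, instead of the postfix subsequences themselves.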
-
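Pseudo-Projection (cont.): in the <a>-projected database, for s = <a(abc)(ac)d(cf)> the postfix sequence <(abc)(ac)d(cf)> is represented by a pointer to s and offset = 2. Pseudo-projection works well when the database fits in main memory; it is not efficient for disk-based access.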
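The pointer-and-offset idea can be sketched in Python as follows. The sketch assumes the Table 1 database shown in `DB`, uses the sequence index as the pointer, and counts offsets as 1-based item positions so that the postfix of <a(abc)(ac)d(cf)> starts at offset 2 as on the slide. These conventions and the function names are assumptions made for illustration.

```python
# Assumed Table 1 database: each sequence is a list of elements (sets of items).
DB = [
    [{"a"}, {"a", "b", "c"}, {"a", "c"}, {"d"}, {"c", "f"}],
    [{"a", "d"}, {"c"}, {"b", "c"}, {"a", "e"}],
    [{"e", "f"}, {"a", "b"}, {"d", "f"}, {"c"}, {"b"}],
    [{"e"}, {"g"}, {"a", "f"}, {"c"}, {"b"}, {"c"}],
]

def flatten(seq):
    """List of (element index, item) pairs, items sorted within each element."""
    return [(i, x) for i, elem in enumerate(seq) for x in sorted(elem)]

def pseudo_project(db, item):
    """(sequence index, offset) pairs; offset is the 1-based item position where the postfix starts."""
    pointers = []
    for sid, seq in enumerate(db):
        for pos, (_, x) in enumerate(flatten(seq)):
            if x == item:
                pointers.append((sid, pos + 2))  # item just after the first occurrence
                break
    return pointers

def postfix_items(db, sid, offset):
    """Read the postfix through the pointer without having stored a copy of it."""
    return [x for _, x in flatten(db[sid])[offset - 1:]]

a_pointers = pseudo_project(DB, "a")
first_postfix = postfix_items(DB, 0, 2)
```

Each projected entry costs two integers regardless of postfix length, which is why the technique pays off only while the original sequences stay in main memory for random access.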
-
Experimental Results and Performance Study. All experiments ran on a 233MHz Pentium PC with 128 megabytes of main memory, running Microsoft Windows/NT. All methods were implemented using Microsoft Visual C++ 6.0. Four methods were compared: GSP. FreeSpan: FreeSpan with alternative level projection. PrefixSpan-1: PrefixSpan with level-by-level projection. PrefixSpan-2: PrefixSpan with bi-level projection.
-
Experimental Results and Performance Study (cont.): number of sequential patterns and running time as the threshold varies. Dataset C10T8S8I8: 1,000 items, 10,000 sequences, an average of 8 items per element (T8), and an average of 8 elements per sequence (S8).
-
Experimental Results and Performance Study (cont.): pseudo-projection. Running time with and without pseudo-projection as the threshold varies.
-
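Experimental Results and Performance Study (cont.): dataset C1kT8S8I8: 1,000 items, 1,000,000 sequences, an average of 8 items per element (T8), and an average of 8 elements per sequence (S8). Runs with and without pseudo-projection are compared (reading sequences from disk adds I/O cost), and bi-level projection is compared with level-by-level projection as the threshold varies.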
-
Experimental Results and Performance Study (cont.): with the threshold fixed at 20%, running time against the number of sequences for PrefixSpan-2 and PrefixSpan-1.
-
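Experimental Results and Performance Study (cont.): across thresholds, PrefixSpan is faster than FreeSpan and GSP, and FreeSpan is faster than GSP. PrefixSpan-2, with bi-level projection, does better than PrefixSpan-1 at low thresholds, where projection is costly. When the database fits in main memory, pseudo-projection improves performance further.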
-
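PrefixSpan and FreeSpan are both pattern-growth methods that use frequent items to build projected databases. PrefixSpan is more efficient than FreeSpan because its projected databases keep shrinking. It avoids Apriori-style candidate generation, and PrefixSpan with bi-level projection adds an Apriori-like pruning step (3-way checking).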
-
A sequential pattern mining method, PrefixSpan, was presented together with bi-level projection and pseudo-projection; it performs better than the Apriori-like methods.
-
Q & A