# Sequential Pattern Mining

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Sequential Pattern Mining 1

2 Outline What is sequence database and sequential pattern mining Methods for sequential pattern mining Constraint-based sequential pattern mining Periodicity analysis for sequence data 2

3 Sequence Databases A sequence database consists of ordered elements or events Transaction databases vs. sequence databases A transaction database TID itemsets 10 a, b, d 20 a, c, d 30 a, d, e 40 b, e, f A sequence database SID sequences 10 <a(abc)(ac)d(cf)> 20 <(ad)c(bc)(ae)> 30 <(ef)(ab)(df)cb> 40 <eg(af)cbc> 3

4 Applications Applications of sequential pattern mining Customer shopping sequences: First buy computer, then CD-ROM, and then digital camera, within 3 months. Medical treatments, natural disasters (e.g., earthquakes), science & eng. processes, stocks and markets, etc. Telephone calling patterns, Weblog click streams DNA sequences and gene structures 4

5 Subsequence vs. super sequence A sequence is an ordered list of events, denoted < e 1 e 2 e l > Given two sequences α=< a 1 a 2 a n > and β=< b 1 b 2 b m > α is called a subsequence of β, denoted as α β, if there exist integers 1 j 1 < j 2 < < j n m such that a 1 b j1, a 2 b j2,, a n b jn β is a super sequence of α E.g.α=< (ab), d> and β=< (abc), (de)> 5

6 What Is Sequential Pattern Mining? Given a set of sequences and support threshold, find the complete set of frequent subsequences A sequence database SID sequence 10 <a(abc)(ac)d(cf)> 20 <(ad)c(bc)(ae)> 30 <(ef)(ab)(df)cb> 40 <eg(af)cbc> A sequence : < (ef) (ab) (df) c b > An element may contain a set of items. Items within an element are unordered and we list them alphabetically. <a(bc)dc> is a subsequence of <a(abc)(ac)d(cf)> Given support threshold min_sup =2, <(ab)c> is a sequential pattern 6

7 Challenges on Sequential Pattern Mining A huge number of possible sequential patterns are hidden in databases A mining algorithm should find the complete set of patterns, when possible, satisfying the minimum support (frequency) threshold be highly efficient, scalable, involving only a small number of database scans be able to incorporate various kinds of userspecific constraints 7

8 Studies on Sequential Pattern Mining Concept introduction and an initial Apriori-like algorithm Agrawal & Srikant. Mining sequential patterns, [ICDE 95] Apriori-based method: GSP (Generalized Sequential Patterns: Srikant & Agrawal [EDBT 96]) Pattern-growth methods: FreeSpan & PrefixSpan (Han et al.kdd 00; Pei, et al. [ICDE 01]) Vertical format-based mining: SPADE (Zaki [Machine Leanining 00]) Constraint-based sequential pattern mining (SPIRIT: Garofalakis, Rastogi, Shim [VLDB 99]; Pei, Han, Wang [CIKM 02]) Mining closed sequential patterns: CloSpan (Yan, Han & Afshar [SDM 03]) 8

9 Methods for sequential pattern mining Apriori-based Approaches GSP SPADE Pattern-Growth-based Approaches FreeSpan PrefixSpan 9

10 The Apriori Property of Sequential Patterns A basic property: Apriori (Agrawal & Sirkant 94) If a sequence S is not frequent, then none of the super-sequences of S is frequent E.g, <hb> is infrequent so do <hab> and <(ah)b> Seq. ID Sequence <(bd)cb(ac)> <(bf)(ce)b(fg)> <(ah)(bf)abf> <(be)(ce)d> <a(bd)bcb(ade)> Given support threshold min_sup =2 10

11 GSP Generalized Sequential Pattern Mining GSP (Generalized Sequential Pattern) mining algorithm Outline of the method Initially, every item in DB is a candidate of length-1 for each level (i.e., sequences of length-k) do scan database to collect support count for each candidate sequence generate candidate length-(k+1) sequences from length-k frequent sequences using Apriori repeat until no frequent sequence or no candidate can be found Major strength: Candidate pruning by Apriori 11

12 Finding Length-1 Sequential Patterns Initial candidates: <a>, <b>, <c>, <d>, <e>, <f>, <g>, <h> Scan database once, count support for candidates min_sup =2 Seq. ID Sequence <(bd)cb(ac)> <(bf)(ce)b(fg)> <(ah)(bf)abf> <(be)(ce)d> <a(bd)bcb(ade)> Cand Sup <a> 3 <b> 5 <c> 4 <d> 3 <e> 3 <f> 2 <g> 1 <h> 1 12

13 Generating Length-2 Candidates <a> <b> <c> <d> <e> <f> <a> <(ab)> <(ac)> <(ad)> <(ae)> <(af)> <b> <(bc)> <(bd)> <(be)> <(bf)> <c> <(cd)> <(ce)> <(cf)> <d> <(de)> <(df)> <e> <f> 51 length-2 Candidates <a> <b> <c> <d> <e> <f> <a> <aa> <ab> <ac> <ad> <ae> <af> <b> <ba> <bb> <bc> <bd> <be> <bf> <c> <ca> <cb> <cc> <cd> <ce> <cf> <d> <da> <db> <dc> <dd> <de> <df> <e> <ea> <eb> <ec> <ed> <ee> <ef> <f> <fa> <fb> <fc> <fd> <fe> <ff> <(ef)> Without Apriori property, 8*8+8*7/2=92 candidates Apriori prunes % candidates

14 Finding Length-2 Sequential Patterns Scan database one more time, collect support count for each length-2 candidate There are 19 length-2 candidates which pass the minimum support threshold They are length-2 sequential patterns 14

15 The GSP Mining Process 5 th scan: 1 cand. 1 length-5 seq. pat. <(bd)cba> Cand. cannot pass sup. threshold 4 th scan: 8 cand. 6 length-4 seq. pat. 3 rd scan: 46 cand. 19 length-3 seq. pat. 20 cand. not in DB at all 2 nd scan: 51 cand. 19 length-2 seq. pat. 10 cand. not in DB at all 1 st scan: 8 cand. 6 length-1 seq. pat. <abba> <(bd)bc> <abb> <aab> <aba> <baa> <bab> <aa> <ab> <af> <ba> <bb> <ff> <(ab)> <(ef)> <a> <b> <c> <d> <e> <f> <g> <h> Seq. ID Sequence Cand. not in DB at all min_sup = <(bd)cb(ac)> <(bf)(ce)b(fg)> <(ah)(bf)abf> <(be)(ce)d> <a(bd)bcb(ade)> 15

16 The GSP Algorithm Take sequences in form of <x> as length-1 candidates Scan database once, find F 1, the set of length-1 sequential patterns Let k=1; while F k is not empty do Form C k+1, the set of length-(k+1) candidates from F k ; If C k+1 is not empty, scan database once, find F k+1, the set of length-(k+1) sequential patterns Let k=k+1; 16

17 The GSP Algorithm Benefits from the Apriori pruning Reduces search space Bottlenecks Scans the database multiple times Generates a huge set of candidate sequences There is a need for more efficient mining methods 17

18 The SPADE Algorithm SPADE (Sequential PAttern Discovery using Equivalent Class) developed by Zaki 2001 A vertical format sequential pattern mining method A sequence database is mapped to a large set of Item: <SID, EID> Sequential pattern mining is performed by growing the subsequences (patterns) one item at a time by Apriori candidate generation 18

20 Bottlenecks of Candidate Generate-and-test A huge set of candidates generated. Especially 2-item candidate sequence. Multiple Scans of database in mining. The length of each candidate grows by one at each database scan. Inefficient for mining long sequential patterns. A long pattern grow up from short patterns An exponential number of short candidates 20

21 PrefixSpan (Prefix-Projected Sequential Pattern Growth) PrefixSpan Projection-based But only prefix-based projection: less projections and quickly shrinking sequences J.Pei, J.Han, PrefixSpan : Mining sequential patterns efficiently by prefix-projected pattern growth. ICDE

22 Prefix and Suffix (Projection) <a>, <aa>, <a(ab)> and <a(abc)> are prefixes of sequence <a(abc)(ac)d(cf)> Given sequence <a(abc)(ac)d(cf)> Prefix <a> <aa> <ab> Suffix (Prefix-Based Projection) <(abc)(ac)d(cf)> <(_bc)(ac)d(cf)> <(_c)(ac)d(cf)> 22

23 Mining Sequential Patterns by Prefix Projections Step 1: find length-1 sequential patterns <a>, <b>, <c>, <d>, <e>, <f> Step 2: divide search space. The complete set of seq. pat. can be partitioned into 6 subsets: The ones having prefix <a>; The ones having prefix <b>; The ones having prefix <f> SID sequence 10 <a(abc)(ac)d(cf)> 20 <(ad)c(bc)(ae)> 30 <(ef)(ab)(df)cb> 40 <eg(af)cbc> 23

24 Finding Seq. Patterns with Prefix <a> Only need to consider projections w.r.t. <a> <a>-projected database: <(abc)(ac)d(cf)>, <(_d)c(bc)(ae)>, <(_b)(df)cb>, <(_f)cbc> Find all the length-2 seq. pat. Having prefix <a>: <aa>, <ab>, <(ab)>, <ac>, <ad>, <af> Further partition into 6 subsets Having prefix <aa>; Having prefix <af> SID sequence 10 <a(abc)(ac)d(cf)> 20 <(ad)c(bc)(ae)> 30 <(ef)(ab)(df)cb> 40 <eg(af)cbc> 24

25 Completeness of PrefixSpan Having prefix <a> <a>-projected database <(abc)(ac)d(cf)> <(_d)c(bc)(ae)> <(_b)(df)cb> <(_f)cbc> SID SDB sequence 10 <a(abc)(ac)d(cf)> 20 <(ad)c(bc)(ae)> 30 <(ef)(ab)(df)cb> 40 <eg(af)cbc> Having prefix <b> Length-1 sequential patterns <a>, <b>, <c>, <d>, <e>, <f> Having prefix <c>,, <f> <b>-projected database Length-2 sequential patterns <aa>, <ab>, <(ab)>, <ac>, <ad>, <af> Having prefix <aa> Having prefix <af> <aa>-proj. db <af>-proj. db 25

26 The Algorithm of PrefixSpan Input: A sequence database S, and the minimum support threshold min_sup Output: The complete set of sequential patterns Method: Call PrefixSpan(<>,0,S) Subroutine PrefixSpan(α, l, S α) Parameters: α: sequential pattern, l: the length of α; S α: the α-projected database, if α <>; otherwise; the sequence database S 26

27 The Algorithm of PrefixSpan(2) Method 1. Scan S α once, find the set of frequent items b such that: a) b can be assembled to the last element of α to form a sequential pattern; or b) <b> can be appended to α to form a sequential pattern. 2. For each frequent item b, append it to α to form a sequential pattern α, and output α ; 3. For each α, construct α -projected database S α, and call PrefixSpan(α, l+1, S α ). 27

28 Efficiency of PrefixSpan No candidate sequence needs to be generated Projected databases keep shrinking Major cost of PrefixSpan: constructing projected databases Can be improved by bi-level projections 28

29 Optimization in PrefixSpan Single level vs. bi-level projection Bi-level projection with 3-way checking may reduce the number and size of projected databases Physical projection vs. pseudo-projection Pseudo-projection may reduce the effort of projection when the projected database fits in main memory Parallel projection vs. partition projection Partition projection may avoid the blowup of disk space 29

30 Scaling Up by Bi-Level Projection Partition search space based on length-2 sequential patterns Only form projected databases and pursue recursive mining over bi-level projected databases 30

31 Speed-up by Pseudo-projection Major cost of PrefixSpan: projection Postfixes of sequences often appear repeatedly in recursive projected databases When (projected) database can be held in main memory, use pointers to form projections Pointer to the sequence s <a>: (, 2) Offset of the postfix s <ab>: (, 4) s=<a(abc)(ac)d(cf)> <a> <(abc)(ac)d(cf)> <ab> <(_c)(ac)d(cf)> 31

32 Pseudo-Projection vs. Physical Projection Pseudo-projection avoids physically copying postfixes Efficient in running time and space when database can be held in main memory However, it is not efficient when database cannot fit in main memory Disk-based random accessing is very costly Suggested Approach: Integration of physical and pseudo-projection Swapping to pseudo-projection when the data set fits in memory 32

33 Performance on Data Set C10T8S8I8 33

34 Performance on Data Set Gazelle 34

35 Effect of Pseudo-Projection 35

36 CloSpan: Mining Closed Sequential Patterns A closed sequential pattern s: there exists no superpattern s such that s כ s, and s and s have the same support Motivation: reduces the number of (redundant) patterns but attains the same expressive power Using Backward Subpattern and Backward Superpattern pruning to prune redundant search space 36

37 CloSpan: Performance Comparison with PrefixSpan 37

38 Constraints for Seq.-Pattern Mining Item constraint Find web log patterns only about online-bookstores Length constraint Find patterns having at least 20 items Super pattern constraint Find super patterns of PC digital camera Aggregate constraint Find patterns that the average price of items is over \$100 38

39 More Constraints Regular expression constraint Find patterns starting from Yahoo homepage, search for hotels in Washington DC area Yahootravel(WashingtonDC DC)(hotel motel lodging) Duration constraint Find patterns about ±24 hours of a shooting Gap constraint Find purchasing patterns such that the gap between each consecutive purchases is less than 1 month 39

40 From Sequential Patterns to Structured Patterns Sets, sequences, trees, graphs, and other structures Transaction DB: Sets of items {{i 1, i 2,, i m }, } Seq. DB: Sequences of sets: {<{i 1, i 2 },, {i m, i n, i k }>, } Sets of Sequences: {{<i 1, i 2 >,, <i m, i n, i k >}, } Sets of trees: {t 1, t 2,, t n } Sets of graphs (mining for frequent subgraphs): {g 1, g 2,, g n } Mining structured patterns in XML documents, 40

41 Episodes and Episode Pattern Mining Other methods for specifying the kinds of patterns Serial episodes: A B Parallel episodes: A & B Regular expressions: (A B)C*(D E) Methods for episode pattern mining Variations of Apriori-like algorithms, e.g., GSP Database projection-based pattern growth Similar to the frequent pattern growth without candidate generation 41

42 Periodicity Analysis Periodicity is everywhere: tides, seasons, daily power consumption, etc. Full periodicity Every point in time contributes (precisely or approximately) to the periodicity Partial periodicit: A more general notion Only some segments contribute to the periodicity Jim reads NY Times 7:00-7:30 am every week day Cyclic association rules Associations which form cycles Methods Full periodicity: FFT, other statistical analysis methods Partial and cyclic periodicity: Variations of Apriori-like mining methods 42

43 Summary Sequential Pattern Mining is useful in many application, e.g. weblog analysis, financial market prediction, BioInformatics, etc. It is similar to the frequent itemsets mining, but with consideration of ordering. We have looked at different approaches that are descendants from two popular algorithms in mining frequent itemsets Candidates Generation: AprioriAll and GSP Pattern Growth: FreeSpan and PrefixSpan 43

### Mining Sequence Data. JERZY STEFANOWSKI Inst. Informatyki PP Wersja dla TPD 2009 Zaawansowana eksploracja danych

Mining Sequence Data JERZY STEFANOWSKI Inst. Informatyki PP Wersja dla TPD 2009 Zaawansowana eksploracja danych Outline of the presentation 1. Realtionships to mining frequent items 2. Motivations for

### Visa Smart Debit/Credit Certificate Authority Public Keys

CHIP AND NEW TECHNOLOGIES Visa Smart Debit/Credit Certificate Authority Public Keys Overview The EMV standard calls for the use of Public Key technology for offline authentication, for aspects of online

### IncSpan: Incremental Mining of Sequential Patterns in Large Database

IncSpan: Incremental Mining of Sequential Patterns in Large Database Hong Cheng Department of Computer Science University of Illinois at Urbana-Champaign Urbana, Illinois 61801 hcheng3@uiuc.edu Xifeng

### Quick Installation Guide

Quick Installation Guide This manual has been designed to guide you through basic settings of your IP devices, such as installation and configuration for using them. Step1. Assemble and install IP device

### SERVER CERTIFICATES OF THE VETUMA SERVICE

Page 1 Version: 3.4, 19.12.2014 SERVER CERTIFICATES OF THE VETUMA SERVICE 1 (18) Page 2 Version: 3.4, 19.12.2014 Table of Contents 1. Introduction... 3 2. Test Environment... 3 2.1 Vetuma test environment...

### Fast Discovery of Sequential Patterns through Memory Indexing and Database Partitioning

JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 21, 109-128 (2005) Fast Discovery of Sequential Patterns through Memory Indexing and Database Partitioning MING-YEN LIN AND SUH-YIN LEE * * Department of

### SERVER CERTIFICATES OF THE VETUMA SERVICE

Page 1 Version: 3.5, 4.11.2015 SERVER CERTIFICATES OF THE VETUMA SERVICE 1 (18) Page 2 Version: 3.5, 4.11.2015 Table of Contents 1. Introduction... 3 2. Test Environment... 3 2.1 Vetuma test environment...

### MAXIMAL FREQUENT ITEMSET GENERATION USING SEGMENTATION APPROACH

MAXIMAL FREQUENT ITEMSET GENERATION USING SEGMENTATION APPROACH M.Rajalakshmi 1, Dr.T.Purusothaman 2, Dr.R.Nedunchezhian 3 1 Assistant Professor (SG), Coimbatore Institute of Technology, India, rajalakshmi@cit.edu.in

### Arts et politique sous Louis XIV

Arts et politique sous Louis XIV Marie Tourneur To cite this version: Marie Tourneur. Arts et politique sous Louis XIV. Éducation. 2013. HAL Id: dumas-00834060 http://dumas.ccsd.cnrs.fr/dumas-00834060

### !" #" \$ % \$ & ' % "( ) \$* ( + \$, - \$. /

!" #"\$ %\$ & '% "( )\$* (+\$, - \$. / 0 & 1&&##+.2 " 2&%%.&, &#& \$ & 1 \$ + \$1 & 31,4!%'"%# 1\$4 % 2.& # &2 5 + 670894, ' &" '\$+.&2 2\$%&##" +..\$&&+%1\$%114+1 &\$ +1&#&+,#%11" &&.1&.\$\$!\$%&'" &:#1;#.!, \$& &' +".2

### USB HID to PS/2 Scan Code Translation Table

Key Name HID Usage Page HID Usage ID PS/2 Set 1 Make* PS/2 Set 1 Break* PS/2 Set 2 Make PS/2 Set 2 Break System Power 01 81 E0 5E E0 DE E0 37 E0 F0 37 System Sleep 01 82 E0 5F E0 DF E0 3F E0 F0 3F System

### Association Rule Mining

Association Rule Mining Association Rules and Frequent Patterns Frequent Pattern Mining Algorithms Apriori FP-growth Correlation Analysis Constraint-based Mining Using Frequent Patterns for Classification

### Les identités féminines dans le conte traditionnel occidental et ses réécritures : une perception actualisée

Les identités féminines dans le conte traditionnel occidental et ses réécritures : une perception actualisée des élèves Sarah Brasseur To cite this version: Sarah Brasseur. Les identités féminines dans

### Big Data Frequent Pattern Mining

Big Data Frequent Pattern Mining David C. Anastasiu and Jeremy Iverson and Shaden Smith and George Karypis Department of Computer Science and Engineering University of Minnesota, Twin Cities, MN 55455,

### Comparison of Data Mining Techniques for Money Laundering Detection System

Comparison of Data Mining Techniques for Money Laundering Detection System Rafał Dreżewski, Grzegorz Dziuban, Łukasz Hernik, Michał Pączek AGH University of Science and Technology, Department of Computer

### A Way to Understand Various Patterns of Data Mining Techniques for Selected Domains

A Way to Understand Various Patterns of Data Mining Techniques for Selected Domains Dr. Kanak Saxena Professor & Head, Computer Application SATI, Vidisha, kanak.saxena@gmail.com D.S. Rajpoot Registrar,

### SL-8800 HDCP 2.2 and HDCP 1.x Protocol Analyzer for HDMI User Guide

SL-8800 HDCP 2.2 and HDCP 1.x Protocol Analyzer for HDMI Simplay-UG-02003-A July 2015 Contents 1. Overview... 4 1.1. SL-8800 HDCP Protocol Analyzer Test Equipment... 4 1.2. HDCP 2.2/HDCP 1.x Protocol Analyzer

### Advanced Encryption Standard by Example. 1.0 Preface. 2.0 Terminology. Written By: Adam Berent V.1.7

Written By: Adam Berent Advanced Encryption Standard by Example V.1.7 1.0 Preface The following document provides a detailed and easy to understand explanation of the implementation of the AES (RIJNDAEL)

### Lecture 4, Data Cube Computation

CSI 4352, Introduction to Data Mining Lecture 4, Data Cube Computation Young-Rae Cho Associate Professor Department of Computer Science Baylor University A Roadmap for Data Cube Computation Full Cube Full

### Advanced Encryption Standard by Example. 1.0 Preface. 2.0 Terminology. Written By: Adam Berent V.1.5

Written By: Adam Berent Advanced Encryption Standard by Example V.1.5 1.0 Preface The following document provides a detailed and easy to understand explanation of the implementation of the AES (RIJNDAEL)

### Les effets de la relaxation sur les apprentissages à l école

Les effets de la relaxation sur les apprentissages à l école Laurine Bouchez To cite this version: Laurine Bouchez. Les effets de la relaxation sur les apprentissages à l école. Éducation. 2012.

### Chapter 6: Episode discovery process

Chapter 6: Episode discovery process Algorithmic Methods of Data Mining, Fall 2005, Chapter 6: Episode discovery process 1 6. Episode discovery process The knowledge discovery process KDD process of analyzing

### The ASCII Character Set

The ASCII Character Set The American Standard Code for Information Interchange or ASCII assigns values between 0 and 255 for upper and lower case letters, numeric digits, punctuation marks and other symbols.

### Approaches for Pattern Discovery Using Sequential Data Mining

Approaches for Pattern Discovery Using Sequential Data Mining Manish Gupta University of Illinois at Urbana-Champaign, USA Jiawei Han University of Illinois at Urbana-Champaign, USA ABSTRACT In this chapter

### Luxembourg (Luxembourg): Trusted List

Luxembourg (Luxembourg): Trusted List Institut Luxembourgeois de la Normalisation, de l'accréditation de la Sécurité et qualité des produits et services Scheme Information TSL Version 4 TSL Sequence Number

### Community College of Philadelphia Calling Code 218 Employer Scan Client Approved: November 17, 2005 Region (CIRCLE) City MSA

Community College of Philadelphia Calling Code 218 Employer Scan Client Approved: November 17, 2005 Region (CIRCLE) City MSA Zip V0 V1 V2 Month/ Day/ Year of Contact: Business Name: Address: V3 City: V4

### A Comparative study of Techniques in Data Mining

A Comparative study of Techniques in Data Mining Manika Verma 1, Dr. Devarshi Mehta 2 1 Asst. Professor, Department of Computer Science, Kadi Sarva Vishwavidyalaya, Gandhinagar, India 2 Associate Professor

### Pattern Co. Monkey Trouble Wall Quilt. Size: 48" x 58"

.............................................................................................................................................. Pattern Co..........................................................................................

### HTML Codes - Characters and symbols

ASCII Codes HTML Codes Conversion References Control Characters English version Versión español Click here to add this link to your favorites. HTML Codes - Characters and symbols Standard ASCII set, HTML

### Mathilde Patteyn. HAL Id: dumas

L exercice du théâtre à l école Mathilde Patteyn To cite this version: Mathilde Patteyn. L exercice du théâtre à l école. Éducation. 2013. HAL Id: dumas-00861849 http://dumas.ccsd.cnrs.fr/dumas-00861849

### Data Mining Apriori Algorithm

10 Data Mining Apriori Algorithm Apriori principle Frequent itemsets generation Association rules generation Section 6 of course book TNM033: Introduction to Data Mining 1 Association Rule Mining (ARM)

### An Effective System for Mining Web Log

An Effective System for Mining Web Log Zhenglu Yang Yitong Wang Masaru Kitsuregawa Institute of Industrial Science, The University of Tokyo 4-6-1 Komaba, Meguro-Ku, Tokyo 153-8305, Japan {yangzl, ytwang,

### Data Mining: Partially from: Introduction to Data Mining by Tan, Steinbach, Kumar

Data Mining: Association Analysis Partially from: Introduction to Data Mining by Tan, Steinbach, Kumar Association Rule Mining Given a set of transactions, find rules that will predict the occurrence of

### SPMF: a Java Open-Source Pattern Mining Library

Journal of Machine Learning Research 1 (2014) 1-5 Submitted 4/12; Published 10/14 SPMF: a Java Open-Source Pattern Mining Library Philippe Fournier-Viger philippe.fournier-viger@umoncton.ca Department

### An Efficient GA-Based Algorithm for Mining Negative Sequential Patterns

An Efficient GA-Based Algorithm for Mining Negative Sequential Patterns Zhigang Zheng 1, Yanchang Zhao 1,2,ZiyeZuo 1, and Longbing Cao 1 1 Data Sciences & Knowledge Discovery Research Lab Centre for Quantum

### Binary Coded Web Access Pattern Tree in Education Domain

Binary Coded Web Access Pattern Tree in Education Domain C. Gomathi P.G. Department of Computer Science Kongu Arts and Science College Erode-638-107, Tamil Nadu, India E-mail: kc.gomathi@gmail.com M. Moorthi

### COMPRESSION SPRINGS: STANDARD SERIES (INCH)

: STANDARD SERIES (INCH) LC 014A 01 0.250 6.35 11.25 0.200 0.088 2.24 F F M LC 014A 02 0.313 7.94 8.90 0.159 0.105 2.67 F F M LC 014A 03 0.375 9.52 7.10 0.126 0.122 3.10 F F M LC 014A 04 0.438 11.11 6.00

### 3. April 2013 IT ZERTIFIKATE. Zertifizierungsstellen / Certification Center. IT Sicherheit UNTERNEHMENSBEREICH IT

IT Sicherheit UNTERNEHMENSBEREICH IT IT ZERTIFIKATE 3. April 2013 Zertifizierungsstellen / Certification Center D-TRUST D-Trust Root Class 2 CA2007 Aussteller/Issuer: D-TRUST Root Class 2 CA 2007 Gültig

### Static Data Mining Algorithm with Progressive Approach for Mining Knowledge

Global Journal of Business Management and Information Technology. Volume 1, Number 2 (2011), pp. 85-93 Research India Publications http://www.ripublication.com Static Data Mining Algorithm with Progressive

### 0242-1. HSR TRAINING COURSE REQUIREMENTS HSR Training Course Guidance Booklet 2

0242-1 HSR TRAINING COURSE REQUIREMENTS HSR Training Course Guidance Booklet 2 SafeWork SA 2 Contents Introduction... 4 Learning resources... 4 PART 1 UNDERPINNING PRINCIPLES FOR THE DEVELOPMENT OF A SAFEWORK

### URL encoding uses hex code prefixed by %. Quoted Printable encoding uses hex code prefixed by =.

ASCII = American National Standard Code for Information Interchange ANSI X3.4 1986 (R1997) (PDF), ANSI INCITS 4 1986 (R1997) (Printed Edition) Coded Character Set 7 Bit American National Standard Code

### Appendix C: Keyboard Scan Codes

Thi d t t d ith F M k 4 0 2 Appendix C: Keyboard Scan Codes Table 90: PC Keyboard Scan Codes (in hex) Key Down Up Key Down Up Key Down Up Key Down Up Esc 1 81 [ { 1A 9A, < 33 B3 center 4C CC 1! 2 82 ]

### Baltic Way 1995. Västerås (Sweden), November 12, 1995. Problems and solutions

Baltic Way 995 Västerås (Sweden), November, 995 Problems and solutions. Find all triples (x, y, z) of positive integers satisfying the system of equations { x = (y + z) x 6 = y 6 + z 6 + 3(y + z ). Solution.

### CROSS REFERENCE. Cross Reference Index 110-122. Cast ID Number 110-111 Connector ID Number 111 Engine ID Number 112-122. 2015 Ford Motor Company 109

CROSS REFERENCE Cross Reference Index 110-122 Cast ID Number 110-111 Connector ID Number 111 112-122 2015 Ford Motor Company 109 CROSS REFERENCE Cast ID Number Cast ID Ford Service # MC Part # Part Type

### CLAN: An Algorithm for Mining Closed Cliques from Large Dense Graph Databases

CLAN: An Algorithm for Mining Closed Cliques from Large Dense Graph Databases Jianyong Wang, Zhiping Zeng, Lizhu Zhou Department of Computer Science and Technology Tsinghua University, Beijing, 100084,

### Application Note RMF Magic 5.1.0: EMC Array Group and EMC SRDF/A Reporting. July 2009

Application Note RMF Magic 5.1.0: EMC Array Group and EMC SRDF/A Reporting July 2009 Summary: This Application Note describes the new functionality in RMF Magic 5.1 that enables more effective monitoring

### Report of Independent Accountant on SSL CORP d/b/a SSL.COM CA s Root Key Generation

Tel: 314-889-1100 Fax: 314-889-1101 www.bdo.com 101 South Hanley Road, Suite 800 St. Louis, MO 63105 Report of Independent Accountant on SSL CORP d/b/a SSL.COM CA s Root Key Generation To The Management

### ให p, q, r และ s เป นพห นามใดๆ จะได ว า

เศษส วนของพห นาม ให A และ B เป นพห นาม พห นามใดๆ โดยท B 0 เร ยก B A ว า เศษส วนของพห นาม การดาเน นการของเศษส วนของพห นาม ให p, q, r และ s เป นพห นามใดๆ จะได ว า Q P R Q P Q R Q P R Q P Q R R Q P S P Q

### Directed Graph based Distributed Sequential Pattern Mining Using Hadoop Map Reduce

Directed Graph based Distributed Sequential Pattern Mining Using Hadoop Map Reduce Sushila S. Shelke, Suhasini A. Itkar, PES s Modern College of Engineering, Shivajinagar, Pune Abstract - Usual sequential

### On Multiple Query Optimization in Data Mining

On Multiple Query Optimization in Data Mining Marek Wojciechowski, Maciej Zakrzewicz Poznan University of Technology Institute of Computing Science ul. Piotrowo 3a, 60-965 Poznan, Poland {marek,mzakrz}@cs.put.poznan.pl

DATING YOUR GUILD 1952-1960 YEAR APPROXIMATE LAST SERIAL NUMBER PRODUCED 1953 1000-1500 1954 1500-2200 1955 2200-3000 1956 3000-4000 1957 4000-5700 1958 5700-8300 1959 12035 1960-1969 This chart displays

### ASCII CODES WITH GREEK CHARACTERS

ASCII CODES WITH GREEK CHARACTERS Dec Hex Char Description 0 0 NUL (Null) 1 1 SOH (Start of Header) 2 2 STX (Start of Text) 3 3 ETX (End of Text) 4 4 EOT (End of Transmission) 5 5 ENQ (Enquiry) 6 6 ACK

### SPADE: An Efficient Algorithm for Mining Frequent Sequences

Machine Learning, 42, 31 60, 2001 c 2001 Kluwer Academic Publishers. Manufactured in The Netherlands. SPADE: An Efficient Algorithm for Mining Frequent Sequences MOHAMMED J. ZAKI Computer Science Department,

### Service Instruction. 1.0 SUBJECT: ECi Accessory Cases for Lycoming 4-Cylinder engines with single magneto configurations and TITAN 361 Engines

Title: Service Instruction ECi Accessory Cases Installed on Engines S.I. No.: 03-1 Page: 1 of 7 Issued: 2/28/2003 Revision: 2 (4/13/2009) Technical Portions are FAA DER Approved. 1.0 SUBJECT: ECi Accessory

Data Mining: An Overview David Madigan http://www.stat.columbia.edu/~madigan Overview Brief Introduction to Data Mining Data Mining Algorithms Specific Eamples Algorithms: Disease Clusters Algorithms:

### Association Rules Apriori Algorithm. Machine Learning Overview Sales Transaction and Association Rules Aprori Algorithm Example

Association Rules Apriori Algorithm Machine Learning Overview Sales Transaction and Association Rules Aprori Algorithm Example 1 Machine Learning Common ground of presented methods Statistical Learning

### Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach

Data Mining and Knowledge Discovery, 8, 53 87, 2004 c 2004 Kluwer Academic Publishers. Manufactured in The Netherlands. Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach

### Arithmetic Coding: Introduction

Data Compression Arithmetic coding Arithmetic Coding: Introduction Allows using fractional parts of bits!! Used in PPM, JPEG/MPEG (as option), Bzip More time costly than Huffman, but integer implementation

### Mining the Most Interesting Web Access Associations

Mining the Most Interesting Web Access Associations Li Shen, Ling Cheng, James Ford, Fillia Makedon, Vasileios Megalooikonomou, Tilmann Steinberg The Dartmouth Experimental Visualization Laboratory (DEVLAB)

### A Survey on Association Rule Mining in Market Basket Analysis

International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 4, Number 4 (2014), pp. 409-414 International Research Publications House http://www. irphouse.com /ijict.htm A Survey

### Rijndael Encryption implementation on different platforms, with emphasis on performance

Rijndael Encryption implementation on different platforms, with emphasis on performance KAFUUMA JOHN SSENYONJO Bsc (Hons) Computer Software Theory University of Bath May 2005 Rijndael Encryption implementation

### Lecture 3 Monday, August 27, 2007

CS 6604: Data Mining Fall 2007 Lecture 3 Monday, August 27, 2007 Lecture: Naren Ramakrishnan Scribe: Wesley Tansey 1 Overview In the last lecture we covered the basics of itemset mining. Most pertinent

### EMV (Chip-and-PIN) Protocol

EMV (Chip-and-PIN) Protocol Märt Bakhoff December 15, 2014 Abstract The objective of this report is to observe and describe a real world online transaction made between a debit card issued by an Estonian

### I have attached a few files that I created in regards to what I have been learning.

Cajun Accordion Music Theory - Files Just let me say this up front: I am no expert at any of this, just learning this stuff like most folks here. I have been playing the accordion for around 10 years now

### ON-BOARDING TOOL USER GUIDE. HKEx Orion Market Data Platform Securities Market & Index Datafeed Products Mainland Market Data Hub (MMDH)

ON-BOARDING TOOL USER GUIDE HKEx Orion Market Data Platform Securities Market & Index Datafeed Products Mainland Market Data Hub (MMDH) Version 1.1 27 May 2013 Document History DOCUMENT HISTORY Distribution

### A Time Efficient Algorithm for Web Log Analysis

A Time Efficient Algorithm for Web Log Analysis Santosh Shakya Anju Singh Divakar Singh Student [M.Tech.6 th sem (CSE)] Asst.Proff, Dept. of CSE BU HOD (CSE), BUIT, BUIT,BU Bhopal Barkatullah University,

### Selection of Optimal Discount of Retail Assortments with Data Mining Approach

Available online at www.interscience.in Selection of Optimal Discount of Retail Assortments with Data Mining Approach Padmalatha Eddla, Ravinder Reddy, Mamatha Computer Science Department,CBIT, Gandipet,Hyderabad,A.P,India.

### The first 32 characters in the ASCII-table are unprintable control codes and are used to control peripherals such as printers.

The following ASCII table contains both ASCII control characters, ASCII printable characters and the extended ASCII character set ISO 8859-1, also called ISO Latin1 The first 32 characters in the ASCII-table

### SEQUENCE MINING ON WEB ACCESS LOGS: A CASE STUDY. Rua de Ceuta 118, 6 andar, 4050-190 Porto, Portugal csoares@liacc.up.pt

SEQUENCE MINING ON WEB ACCESS LOGS: A CASE STUDY Carlos Soares a Edgar de Graaf b Joost N. Kok b Walter A. Kosters b a LIACC/Faculty of Economics, University of Porto Rua de Ceuta 118, 6 andar, 4050-190

### VILNIUS UNIVERSITY FREQUENT PATTERN ANALYSIS FOR DECISION MAKING IN BIG DATA

VILNIUS UNIVERSITY JULIJA PRAGARAUSKAITĖ FREQUENT PATTERN ANALYSIS FOR DECISION MAKING IN BIG DATA Doctoral Dissertation Physical Sciences, Informatics (09 P) Vilnius, 2013 Doctoral dissertation was prepared

### Unique column combinations

Unique column combinations Arvid Heise Guest lecture in Data Profiling and Data Cleansing Prof. Dr. Felix Naumann Agenda 2 Introduction and problem statement Unique column combinations Exponential search

### Online EFFECTIVE AS OF JANUARY 2013

2013 A and C Session Start Dates (A-B Quarter Sequence*) 2013 B and D Session Start Dates (B-A Quarter Sequence*) Quarter 5 2012 1205A&C Begins November 5, 2012 1205A Ends December 9, 2012 Session Break

### Efficient Critical System Event Recognition and Prediction in Cloud Computing Systems

ASEE 2014 Zone I Conference, April 3-5, 2014, University of Bridgeport, Bridgpeort, CT, USA. Efficient Critical System Event Recognition and Prediction in Cloud Computing Systems Yuanyao Liu Department

### Future Trends in Airline Pricing, Yield. March 13, 2013

Future Trends in Airline Pricing, Yield Management, &AncillaryFees March 13, 2013 THE OPPORTUNITY IS NOW FOR CORPORATE TRAVEL MANAGEMENT BUT FIRST: YOU HAVE TO KNOCK DOWN BARRIERS! but it won t hurt much!

### NEOSHO COUNTY COMMUNITY COLLEGE MASTER COURSE SYLLABUS. Medical Administrative Aspects

NEOSHO COUNTY COMMUNITY COLLEGE MASTER COURSE SYLLABUS COURSE IDENTIFICATION Course Code/Number: ALMA 120 Course Title: Medical Administrative Aspects Division: Applied Science (AS) Liberal Arts (LA) Workforce

### Magic Squares and Modular Arithmetic

Magic Squares and Modular Arithmetic Jim Carlson November 7, 2001 1 Introduction Recall that a magic square is a square array of consecutive distinct numbers such that all row and column sums and are the

### Unified English Braille For Math

Unified English Braille For Math For Sighted Learners By Heather Harland & Cheryl Roberts 1 UEB Braille for Math Previously braille readers used the Nemeth code for math and literary braille code numbers

Building A Smart Academic Advising System Using Association Rule Mining Raed Shatnawi +962795285056 raedamin@just.edu.jo Qutaibah Althebyan +962796536277 qaalthebyan@just.edu.jo Baraq Ghalib & Mohammed

### Protein Protein Interaction Networks

Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics

### Collinearity and concurrence

Collinearity and concurrence Po-Shen Loh 23 June 2008 1 Warm-up 1. Let I be the incenter of ABC. Let A be the midpoint of the arc BC of the circumcircle of ABC which does not contain A. Prove that the

### Notation: - active node - inactive node

Dynamic Load Balancing Algorithms for Sequence Mining Valerie Guralnik, George Karypis fguralnik, karypisg@cs.umn.edu Department of Computer Science and Engineering/Army HPC Research Center University

### Data Mining Association Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 6. Introduction to Data Mining

Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 4/8/24

### MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

CHAPTER01 QUESTIONS MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 1) Convert binary 010101 to octal. 1) A) 258 B) 58 C) 218 D) 158 2) Any number

### Mining Access Patterns Eæciently from Web. Logs? Jian Pei, Jiawei Han, Behzad Mortazavi-asl, and Hua Zhu

Mining Access Patterns Eæciently from Web Logs? Jian Pei, Jiawei Han, Behzad Mortazavi-asl, and Hua Zhu School of Computing Science, Simon Fraser University, Canada fpeijian, han, mortazav, hzhuag@cs.sfu.ca

### Triangles. (SAS), or all three sides (SSS), the following area formulas are useful.

Triangles Some of the following information is well known, but other bits are less known but useful, either in and of themselves (as theorems or formulas you might want to remember) or for the useful techniques

### www.pioneermathematics.com

Problems and Solutions: INMO-2012 1. Let ABCD be a quadrilateral inscribed in a circle. Suppose AB = 2+ 2 and AB subtends 135 at the centre of the circle. Find the maximum possible area of ABCD. Solution:

### CONSTRAINT-BASED MINING OF FREQUENT ARRANGEMENTS OF TEMPORAL INTERVALS PANAGIOTIS PAPAPETROU

CONSTRAINT-BASED MINING OF FREQUENT ARRANGEMENTS OF TEMPORAL INTERVALS PANAGIOTIS PAPAPETROU Thesis submitted in partial fulfillment of the requirements for the degree of Master of Arts BOSTON UNIVERSITY

### Storing Data: Disks and Files

Storing Data: Disks and Files [R&G] Chapter 9 CS 4320 1 Data on External Storage Disks: Can retrieve random page at fixed cost But reading several consecutive pages is much cheaper than reading them in

### ASCII Code - The extended ASCII table

ASCII Code - The extended ASCII table ASCII stands for American Standard Code for Information Interchange. It's a 7-bit character code where every single bit represents a unique character. On this webpage

### Web Site: Forums: forums.parallax.com Sales: Technical:

Web Site: www.parallax.com Forums: forums.parallax.com Sales: sales@parallax.com Technical: support@parallax.com Office: (916) 624-8333 Fax: (916) 624-8003 Sales: (888) 512-1024 Tech Support: (888) 997-8267

### CS103B Handout 17 Winter 2007 February 26, 2007 Languages and Regular Expressions

CS103B Handout 17 Winter 2007 February 26, 2007 Languages and Regular Expressions Theory of Formal Languages In the English language, we distinguish between three different identities: letter, word, sentence.

### On Mining Group Patterns of Mobile Users

On Mining Group Patterns of Mobile Users Yida Wang 1, Ee-Peng Lim 1, and San-Yih Hwang 2 1 Centre for Advanced Information Systems, School of Computer Engineering Nanyang Technological University, Singapore

### PREDICTIVE MODELING OF INTER-TRANSACTION ASSOCIATION RULES A BUSINESS PERSPECTIVE

International Journal of Computer Science and Applications, Vol. 5, No. 4, pp 57-69, 2008 Technomathematics Research Foundation PREDICTIVE MODELING OF INTER-TRANSACTION ASSOCIATION RULES A BUSINESS PERSPECTIVE

### Data Mining: Foundation, Techniques and Applications

Data Mining: Foundation, Techniques and Applications Lesson 1b :A Quick Overview of Data Mining Li Cuiping( 李 翠 平 ) School of Information Renmin University of China Anthony Tung( 鄧 锦 浩 ) School of Computing

### KALE: A High-Degree Algebraic-Resistant Variant of The Advanced Encryption Standard

KALE: A High-Degree Algebraic-Resistant Variant of The Advanced Encryption Standard Dr. Gavekort c/o Vakiopaine Bar Kauppakatu 6, 41 Jyväskylä FINLAND mjos@iki.fi Abstract. We have discovered that the

### Course Content. Chapter 3 Objectives. Sequence of Transactions

Daa Mining & Knowledge Discovery Fall 7 Chaper : Sequenial Paern Analysis Dr. Osmar R. Zaïane Universiy of Albera Course Conen Inroducion o Daa Mining Associaion Analysis Sequenial Paern Analysis Classificaion

### EMDX3 Multifunction meter Cat No. 146 69 ModbusTable LGR EN v1.01.xls

GENERAL MODBUS TABLE ORGANIZATION Starting of the Starting of the Group s Group s System Version (Release) System Version (Build) Group Name (Text) Group Code Group Complexity Group Version 50512 C550