Mining the Most Interesting Rules



Appears in Proc. of the Fifth ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining, 145-154, 1999.

Roberto J. Bayardo Jr.
IBM Almaden Research Center
http://www.almaden.ibm.com/cs/people/bayardo/
bayardo@alum.mit.edu

Rakesh Agrawal
IBM Almaden Research Center
http://www.almaden.ibm.com/u/ragrawal/
ragrawal@acm.org

Abstract

Several algorithms have been proposed for finding the "best," "optimal," or "most interesting" rule(s) in a database according to a variety of metrics including confidence, support, gain, chi-squared value, gini, entropy gain, laplace, lift, and conviction. In this paper, we show that the best rule according to any of these metrics must reside along a support/confidence border. Further, in the case of conjunctive rule mining within categorical data, the number of rules along this border is conveniently small, and can be mined efficiently from a variety of real-world data-sets. We also show how this concept can be generalized to mine all rules that are best according to any of these criteria with respect to an arbitrary subset of the population of interest. We argue that by returning a broader set of rules than previous algorithms, our techniques allow for improved insight into the data and support more user-interaction in the optimized rule-mining process.

1. Introduction

There are numerous proposals for mining rules from data. Some are constraint-based in that they mine every rule satisfying a set of hard constraints such as minimum support or confidence (e.g. [1,2,6]). Others are heuristic in that they attempt to find rules that are predictive, but make no guarantees on the predictiveness or the completeness of the returned rule set (e.g. decision tree and covering algorithms [9,15]). A third class of rule mining algorithms, which are the subject of this paper, identify only the most interesting, or optimal, rules according to some interestingness metric [12,18,20,24]. Optimized rule miners are particularly useful in domains where a constraint-based rule miner produces too many rules or requires too much time.
It is difficult to come up with a single metric that quantifies the "interestingness" or "goodness" of a rule, and as a result, several different metrics have been proposed and used. Among them are confidence and support [1], gain [12], variance and chi-squared value [17,18], entropy gain [16,17], gini [16], laplace [9,24], lift [14] (a.k.a. interest [8] or strength [10]), and conviction [8]. Several algorithms are known to efficiently find the best rule (or a close approximation to the best rule [16]) according to a specific one of these metrics [12,18,20,24]. In this paper, we show that a single yet simple concept of rule goodness captures the best rules according to any of them. This concept involves a partial order on rules defined in terms of both rule support and confidence. We demonstrate that the set of rules that are optimal according to this partial order includes all rules that are best according to any of the above metrics, even given arbitrary minimums on support and/or confidence.

In the context of mining conjunctive association rules, we present an algorithm that can efficiently mine an optimal set according to this partial order from a variety of real-world data-sets. For example, for each of the categorical data-sets from the Irvine machine learning repository (excepting only connect-4), this algorithm requires less than 30 seconds on a 400MHz Pentium-II class machine. Specifying constraints such as minimum support or confidence reduces execution time even further. While optimizing according to only a single interestingness metric could sometimes require less overhead, the approach we propose is likely to be advantageous since it supports an interactive phase in which the user can browse the optimal rule according to any of several interestingness metrics. It also allows the user to interactively tweak minimums on support and confidence. Witnessing such effects with a typical optimized rule miner requires repeated mining runs, which may be impractical when the database is large.

Another need for repeated invocations of an optimized rule miner arises when the user needs to gain insight into a broader population than what is already well-characterized by previously discovered rules.
We show how our algorithm can be generalized to produce every rule that is optimal according to any of the previously mentioned interestingness metrics, and additionally, with respect to an arbitrary subset of the population of interest. Because data-mining is iterative and discovery-driven, identifying several good rules up-front in order to avoid repeatedly querying the database reduces total mining time when amortized over the entire process [13].

2. Preliminaries

2.1 Generic Problem Statement

A data-set is a finite set of records. For the purpose of this paper, a record is simply an element on which we apply boolean predicates called conditions. A rule consists of two conditions called the antecedent and consequent, and is denoted as $A \rightarrow C$ where $A$ is the antecedent and $C$ the consequent. A rule constraint is a boolean predicate on a rule. Given a set of constraints $N$, we say that a rule $r$ satisfies the constraints in $N$ if every constraint in $N$ evaluates to true given $r$. Some common examples of constraints are item constraints [22] and minimums on support and confidence [1]. The input to the problem of mining optimized rules is a 5-tuple $\langle U, D, \leq, C, N \rangle$ where:

- $U$ is a finite set of conditions;
- $D$ is a data-set;
- $\leq$ is a total order on rules;
- $C$ is a condition specifying the rule consequent;
- $N$ is a set of constraints on rules.

When mining an optimal disjunction, we treat a set of conditions $A \subseteq U$ as a condition itself that evaluates to true if and only if one or more of the conditions within $A$ evaluates to true on the given record. When mining an optimal conjunction, we treat $A$ as a condition that evaluates to true if and only if every condition within $A$ evaluates to true on the given record. For both cases, if $A$ is empty then it always evaluates to true. Algorithms for mining optimal conjunctions and disjunctions differ significantly in their details, but the problem can be formally stated in an identical manner (footnote 1):

PROBLEM (OPTIMIZED RULE MINING): Find a set $A_1 \subseteq U$ such that
(1) $A_1$ satisfies the input constraints, and
(2) there exists no set $A_2 \subseteq U$ such that $A_2$ satisfies the input constraints and $A_1 < A_2$.

Footnote 1: Algorithms for mining optimal disjunctions typically allow a single fixed conjunctive condition without complications, e.g. see [20]. We ignore this issue for simplicity of presentation.

Any rule $A \rightarrow C$ whose antecedent is a solution to an instance $I$ of the optimized rule mining problem is said to be $I$-optimal (or just optimal if the instance is clear from the context). For simplicity, we sometimes treat rule antecedents (denoted with $A$ and possibly some subscript) and rules (denoted with $r$ and possibly some subscript) interchangeably since the consequent is always fixed and clear from the context.

We now define the support and confidence values of rules. These values are often used to define rule constraints by bounding them above a pre-specified value known as minsup and minconf respectively [1], and also to define total orders for optimization [12,20]. The support of a condition $A$ is equal to the number of records in the data-set for which $A$ evaluates to true, and this value is denoted as $\mathrm{sup}(A)$. The support of a rule $A \rightarrow C$, denoted similarly as $\mathrm{sup}(A \rightarrow C)$, is equal to the number of records in the data-set for which both $A$ and $C$ evaluate to true (footnote 2). The antecedent support of a rule is the support of its antecedent alone. The confidence of a rule is the probability with which the consequent evaluates to true given that the antecedent evaluates to true in the input data-set, computed as follows:

$$\mathrm{conf}(A \rightarrow C) = \frac{\mathrm{sup}(A \rightarrow C)}{\mathrm{sup}(A)}$$

Footnote 2: This follows the original definition of support as defined in [1]. The reader is warned that in the work of Fukuda et al. [14] and Rastogi and Shim [20] (who are careful to note the same discrepancy), the definition of support corresponds to our notion of antecedent support.

2.2 Previous Algorithms for the Optimized Rule Mining Problem

Many previously proposed algorithms for optimized rule mining solve specific restrictions of the optimized rule mining problem. For example, Webb [24] provides an algorithm for mining an optimized conjunction under the following restrictions:

- $U$ contains an existence test for each attribute/value pair appearing in a categorical data-set outside a designated class column;
- $\leq$ orders rules according to their laplace value (defined later);
- $N$ is empty.

Fukuda et al. [12] provide algorithms for mining an optimized disjunction where:

- $U$ contains a membership test for each square of a grid formed by discretizing two pre-specified numerical attributes of a data-set (a record is a member of a square if its attribute values fall within the respective ranges);
- $\leq$ orders rules according to either confidence, antecedent support, or a notion they call gain (also defined later);
- $N$ includes minimums on support or confidence, and includes one of several possible "geometry constraints" that restrict the allowed shape formed by the represented set of grid squares.

Rastogi and Shim [20] look at the problem of mining an optimized disjunction where:

- $U$ includes a membership test for every possible hypercube defined by a pre-specified set of record attributes with either ordered or categorical domains;
- $\leq$ orders rules according to antecedent support or confidence;
- $N$ includes minimums on antecedent support or confidence, a maximum $k$ on the number of conditions allowed in the antecedent of a rule, and a requirement that the hypercubes corresponding to the conditions of a rule are non-overlapping.

In general, the optimized rule mining problem, whether conjunctive or disjunctive, is NP-hard [17]. However, features of a specific instance of this problem can often be exploited to achieve tractability. For example, in [12], the geometry constraints are used to develop low-order polynomial time algorithms. Even in cases where tractability is not guaranteed, efficient mining in practice has been demonstrated [18,20,24]. The theoretical contributions in this paper are conjunction/disjunction neutral. However, we focus on the conjunctive case in validating the practicality of these results through empirical evaluation.
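The support and confidence definitions of Section 2.1 can be made concrete for the conjunctive case with a small sketch. It is only an illustration under stated assumptions: records are modeled as sets of attribute=value strings, the toy data and all function names are hypothetical, and returning 0.0 for a rule with zero antecedent support is a convention chosen here, not something the text specifies.

```python
from typing import FrozenSet, List

# Assumption: a record is a set of "attribute=value" strings, and a
# condition is an existence test for one such string.
Record = FrozenSet[str]


def holds(conjunction: FrozenSet[str], record: Record) -> bool:
    """A conjunction is true when every condition in it is true.

    The empty conjunction is a subset of every record, so it is always true.
    """
    return conjunction <= record


def antecedent_support(antecedent: FrozenSet[str], data: List[Record]) -> int:
    """sup(A): number of records on which the antecedent evaluates to true."""
    return sum(1 for rec in data if holds(antecedent, rec))


def rule_support(antecedent: FrozenSet[str], consequent: str,
                 data: List[Record]) -> int:
    """sup(A -> C): records on which both antecedent and consequent are true."""
    return sum(1 for rec in data
               if holds(antecedent, rec) and consequent in rec)


def confidence(antecedent: FrozenSet[str], consequent: str,
               data: List[Record]) -> float:
    """conf(A -> C) = sup(A -> C) / sup(A); 0.0 if sup(A) is zero (assumption)."""
    a = antecedent_support(antecedent, data)
    return rule_support(antecedent, consequent, data) / a if a else 0.0


# Hypothetical toy data-set with "play" as the designated class column.
DATA: List[Record] = [
    frozenset({"outlook=sunny", "windy=no", "play=yes"}),
    frozenset({"outlook=sunny", "windy=yes", "play=no"}),
    frozenset({"outlook=rain", "windy=no", "play=yes"}),
    frozenset({"outlook=sunny", "windy=no", "play=yes"}),
]
A = frozenset({"outlook=sunny"})

print(antecedent_support(A, DATA),
      rule_support(A, "play=yes", DATA),
      confidence(A, "play=yes", DATA))
```

Note that rule support counts records matching both sides of the rule, in line with footnote 2; the antecedent-only count is kept as a separate function.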
2.3 Mining Optimized Rules under Partial Orders

We have carefully phrased the optimized rule mining problem so that it may accommodate a partial order in place of a total order. With a partial order, because some rules may be incomparable, there can be several equivalence classes containing optimal rules. The previous problem statement requires an algorithm to identify only a single rule from one of these equivalence classes. However, in our application, we wish to mine at least one representative from each equivalence class that contains an optimal rule. To do so, we could simply modify the previous problem statement to find all optimal rules instead of just one. However, in practice, the equivalence classes of rules can be large, so this would be unnecessarily inefficient. The next problem statement enforces our requirements specifically:

PROBLEM (PARTIAL-ORDER OPTIMIZED RULE MINING): Find a set $O$ of subsets of $U$ such that:
(1) every set $A$ in $O$ is optimal as defined by the optimized rule mining problem;
(2) for every equivalence class of rules as defined by the partial order, if the equivalence class contains an optimal rule, then exactly one member of this equivalence class is within $O$.

We call a set of rules whose antecedents comprise a solution to an instance $I$ of this problem an $I$-optimal set. An $I$-optimal rule is one that may appear in an $I$-optimal set.

2.4 Monotonicity

Throughout this paper, we exploit (anti-)monotonicity properties of functions. A function $f(x)$ is said to be monotone (resp. anti-monotone) in $x$ if $x_1 < x_2$ implies that $f(x_1) \leq f(x_2)$ (resp. $f(x_1) \geq f(x_2)$). For example, the confidence function, which is defined in terms of rule support and antecedent support, is anti-monotone in antecedent support when rule support is held fixed.

3. SC-Optimality

3.1 Definition

Consider the following partial order $\leq_{sc}$ on rules. Given rules $r_1$ and $r_2$, $r_1 <_{sc} r_2$ if and only if:

- $\mathrm{sup}(r_1) \leq \mathrm{sup}(r_2) \wedge \mathrm{conf}(r_1) < \mathrm{conf}(r_2)$, or
- $\mathrm{sup}(r_1) < \mathrm{sup}(r_2) \wedge \mathrm{conf}(r_1) \leq \mathrm{conf}(r_2)$.

Additionally, $r_1 =_{sc} r_2$ if and only if $\mathrm{sup}(r_1) = \mathrm{sup}(r_2)$ and $\mathrm{conf}(r_1) = \mathrm{conf}(r_2)$.

An $I$-optimal set where $I$ contains this partial order is depicted in Figure 1. Intuitively, such a set of rules defines a support-confidence border above which no rule that satisfies the input constraints can fall.
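As a rough illustration of the partial order just defined, the following sketch computes an optimal set (the upper border) from a list of precomputed (support, confidence) pairs by brute-force pairwise comparison. The rule labels and numbers are hypothetical, and this quadratic scan is not the mining algorithm discussed in Section 3.3; it only shows what the border contains.

```python
from typing import List, Tuple

# (label, rule support, confidence); labels and values are made up.
Rule = Tuple[str, int, float]


def sc_less(r1: Rule, r2: Rule) -> bool:
    """True when r1 <_sc r2 under the definition of Section 3.1."""
    _, s1, c1 = r1
    _, s2, c2 = r2
    return (s1 <= s2 and c1 < c2) or (s1 < s2 and c1 <= c2)


def sc_equal(r1: Rule, r2: Rule) -> bool:
    """True when r1 =_sc r2 (same support and same confidence)."""
    return r1[1] == r2[1] and r1[2] == r2[2]


def upper_border(rules: List[Rule]) -> List[Rule]:
    """Keep one representative per equivalence class of undominated rules."""
    border: List[Rule] = []
    for r in rules:
        if any(sc_less(r, other) for other in rules):
            continue  # some rule is strictly better, so r is not optimal
        if any(sc_equal(r, kept) for kept in border):
            continue  # this equivalence class is already represented
        border.append(r)
    return border


RULES: List[Rule] = [
    ("a", 10, 0.90),
    ("b", 30, 0.60),
    ("c", 30, 0.60),  # equivalent to "b"
    ("d", 20, 0.50),  # dominated by "b"
    ("e", 50, 0.30),
    ("f", 40, 0.30),  # dominated by "e"
]

print([label for label, _, _ in upper_border(RULES)])
```

Rules "b" and "c" fall in the same equivalence class, so only the first one encountered is kept, matching requirement (2) of the partial-order problem statement.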

[Figure 1. Upper and lower support-confidence borders. Horizontal axis: support; vertical axis: confidence. The $sc$-optimal rules trace the upper border and the $s\neg c$-optimal rules the lower border; non-optimal rules fall within these borders and no rules fall outside them.]

Consider also the similar partial order $\leq_{s\neg c}$ such that $r_1 <_{s\neg c} r_2$ if and only if:

- $\mathrm{sup}(r_1) \leq \mathrm{sup}(r_2) \wedge \mathrm{conf}(r_1) > \mathrm{conf}(r_2)$, or
- $\mathrm{sup}(r_1) < \mathrm{sup}(r_2) \wedge \mathrm{conf}(r_1) \geq \mathrm{conf}(r_2)$.

The equivalence condition is the same as before. An optimal set of rules according to this partial order forms a lower border.

3.2 Theoretical Implications

In this section, we show that for many total orders $\leq_t$ intended to rank rules in order of interestingness, we have that $r_1 <_{sc} r_2 \Rightarrow r_1 \leq_t r_2$, and $r_1 =_{sc} r_2 \Rightarrow r_1 =_t r_2$. We say that any such total order $\leq_t$ is implied by $\leq_{sc}$. This property of a total order is useful due to the following fact:

LEMMA 3.1: Given the problem instance $I = \langle U, D, \leq_t, C, N \rangle$ such that $\leq_t$ is implied by $\leq_{sc}$, an $I$-optimal rule is contained within any $I_{sc}$-optimal set where $I_{sc} = \langle U, D, \leq_{sc}, C, N \rangle$.

Proof: Consider any rule $r_1$ that is not $I_{sc}$-optimal (for simplicity we will ignore the presence of constraints in $N$). Because $r_1$ is non-optimal, there must exist some rule $r_2$ that is optimal such that $r_1 <_{sc} r_2$. But then we also have that $r_1 \leq_t r_2$ since $\leq_t$ is implied by $\leq_{sc}$. This implies that any non-$I_{sc}$-optimal rule is either non-$I$-optimal, or it is equivalent to some $I$-optimal rule which resides in an $I_{sc}$-optimal equivalence class. At least one $I_{sc}$-optimal equivalence class must therefore contain an $I$-optimal rule. Further, because $=_t$ is implied by $=_{sc}$, every rule in this equivalence class must be $I$-optimal. By definition, an $I_{sc}$-optimal set will contain one of these rules, and the claim follows.

Put simply, mining the upper support/confidence border identifies optimal rules according to several different interestingness metrics. We will show that these metrics include support, confidence, conviction, lift, laplace, gain, and an unnamed interest measure proposed by Piatetsky-Shapiro [19]. If we also mine the lower border, metrics such as entropy gain, gini, and chi-squared value are also included.
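The lower border can be sketched in the same brute-force style, simply by flipping the direction of the confidence comparison. The (support, confidence) pairs below are hypothetical and the helper names are introduced here for illustration only.

```python
from typing import List, Tuple

# (rule support, confidence) pairs; values are made up for illustration.
Point = Tuple[int, float]


def snc_less(r1: Point, r2: Point) -> bool:
    """True when r1 <_{s not-c} r2: r2 has more support and/or less confidence."""
    (s1, c1), (s2, c2) = r1, r2
    return (s1 <= s2 and c1 > c2) or (s1 < s2 and c1 >= c2)


def lower_border(rules: List[Point]) -> List[Point]:
    """Undominated points under <_{s not-c}, deduplicated and sorted by support."""
    return sorted({r for r in rules
                   if not any(snc_less(r, other) for other in rules)})


RULES: List[Point] = [
    (10, 0.90), (30, 0.60), (20, 0.50), (50, 0.30),
    (40, 0.30), (15, 0.05), (35, 0.20),
]

print(lower_border(RULES))
```

In this toy data the highest-support point (50, 0.30) is undominated under both partial orders, which agrees with the picture in Figure 1 where the two borders share their maximum-support end.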
But first, consider the following additional property of total orders implied by $\leq_{sc}$:

OBSERVATION 3.2: Given instance $I = \langle U, D, \leq_t, C, N \rangle$ such that $\leq_t$ is implied by $\leq_{sc}$, and $N$ contains a minimum support constraint $n_s$ and/or a minimum confidence constraint $n_c$, an $I$-optimal rule is contained within any $I_{sc}$-optimal set where $I_{sc} = \langle U, D, \leq_{sc}, C, N - \{n_s, n_c\} \rangle$.

The implication of this fact is that we can mine without minimum support and confidence constraints, and the optimal rule given any setting of these constraints will remain in the mined rule set. This allows the user to witness the effects of modifying these constraints without further mining of the data -- a useful fact since the user often cannot determine accurate settings of minimum support or confidence a priori. Minimum support and confidence constraints are quite critical in some applications of optimized rule mining, particularly when the optimization metric is itself support or confidence as in [12] and [20].

To identify the interestingness metrics that are implied by $\leq_{sc}$, we use the following lemma.

LEMMA 3.3: The following conditions are sufficient for establishing that a total order $\leq_t$ defined over a rule value function $f(r)$ is implied by partial order $\leq_{sc}$:
(1) $f(r)$ is monotone in support over rules with the same confidence, and
(2) $f(r)$ is monotone in confidence over rules with the same support.

Proof: Suppose $r_1 <_{sc} r_2$, then consider a rule $r$ where $\mathrm{sup}(r_1) = \mathrm{sup}(r)$ and $\mathrm{conf}(r_2) = \mathrm{conf}(r)$. Note that by definition $r_1 \leq_{sc} r$ and $r \leq_{sc} r_2$. Now, if a total order has the monotonicity properties from above, then $r_1 \leq_t r$ and $r \leq_t r_2$. Since total orders are transitive, we then have that $r_1 \leq_t r_2$, which establishes the claim.

These conditions trivially hold when the rule value function is $\mathrm{sup}(r)$ or $\mathrm{conf}(r)$. So consider next the Laplace function which is commonly used to rank rules for classification purposes [9,24].

$$\mathrm{laplace}(A \rightarrow C) = \frac{\mathrm{sup}(A \rightarrow C) + 1}{\mathrm{sup}(A) + k}$$

The constant $k$ is an integer greater than 1 (usually set to the number of classes when building a classification model). Note that if confidence is held fixed to some value $c$, then we can rewrite the Laplace value as below.
$$\mathrm{laplace}(r) = \frac{\mathrm{sup}(r) + 1}{\mathrm{sup}(r)/c + k}$$

It is straightforward to show that this expression is monotone in rule support since $k > 1$ and $c \geq 0$. The Laplace function is also monotone in confidence among rules with equivalent support. To see why, note that if support is held constant, in order to raise the function value, we need to decrease the value of the denominator. This decrease can only be achieved by reducing antecedent support, which implies a larger confidence. Note that an optimal set contains the optimized Laplace rule for any valid setting of $k$. This means the user can witness the effect of varying $k$ on the optimal rule without additional mining of the database.

The gain function of Fukuda et al. [12] is

$$\mathrm{gain}(A \rightarrow C) = \mathrm{sup}(A \rightarrow C) - \theta \times \mathrm{sup}(A)$$

where $\theta$ is a fractional constant between 0 and 1. If confidence is held fixed at $c$, then this function is equal to $\mathrm{sup}(r)(1 - \theta / c)$, which is trivially monotone in support as long as $c \geq \theta$. We can ignore the case where $c < \theta$ if we assume there exists any rule $r$ satisfying the input constraints such that $\mathrm{conf}(r) \geq \theta$. This is because for any pair of rules $r_1$ and $r_2$ such that $\mathrm{conf}(r_1) \geq \theta$ and $\mathrm{conf}(r_2) < \theta$, we know that $\mathrm{gain}(r_1) \geq \mathrm{gain}(r_2)$ irrespective of their support. Should this assumption be false, the gain criterion is optimized by any rule with zero support should one exist (e.g. in the conjunctive case, one can simply add conditions to a rule until its support drops to zero).

The gain function is monotone in confidence among rules with equivalent support for reasons similar to the case of the Laplace function. If support is held constant, then an increase in gain implies a decrease in the subtractive term. The subtractive term can be decreased only by reducing antecedent support, which implies a larger confidence. Note that, like $k$ from the Laplace function, after identifying the optimal set of rules, the user can vary $\theta$ and view its effect on the optimal rule without additional mining.

Another interestingness metric that is identical to gain for a fixed value of $\theta = \mathrm{sup}(C) / |D|$ was introduced by Piatetsky-Shapiro [19]:

$$\text{p-s}(A \rightarrow C) = \mathrm{sup}(A \rightarrow C) - \frac{\mathrm{sup}(A)\,\mathrm{sup}(C)}{|D|}$$

Consider next conviction [8], which was framed in [6] as a function of confidence:

$$\mathrm{conviction}(A \rightarrow C) = \frac{|D| - \mathrm{sup}(C)}{|D|\,(1 - \mathrm{conf}(A \rightarrow C))}$$

Conviction is obviously monotone in confidence since confidence appears in a subtractive term within the denominator. It is also unaffected by variations in rule support if confidence is held constant, which implies monotonicity.

Lift, a well-known statistical measure that can be used to rank rules in IBM's Intelligent Miner [14] (it is also known as interest [8] and strength [10]), can also be framed as a function of confidence [6]:

$$\mathrm{lift}(A \rightarrow C) = \frac{|D|\,\mathrm{conf}(A \rightarrow C)}{\mathrm{sup}(C)}$$

Like conviction, lift is obviously monotone in confidence and unaffected by rule support when confidence is held fixed.

The remaining interestingness metrics, entropy gain, gini, and chi-squared value, are not implied by $\leq_{sc}$.
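The metrics above can be checked against the upper border on a small example. The sketch below is illustrative only: the data-set size, the consequent support, the rule labels and counts, and the settings k = 2 and theta = 0.5 are all assumptions chosen for the example. It evaluates each metric on six toy rules and records which rule wins under each.

```python
from typing import Callable, Dict, List, Set, Tuple

# (label, antecedent support sup(A), rule support sup(A -> C)); toy values.
Rule = Tuple[str, int, int]

N_RECORDS = 100  # |D|, assumed
SUP_C = 40       # sup(C), assumed


def conf(r: Rule) -> float:
    return r[2] / r[1]


# Metric definitions follow the formulas in the text. Conviction divides by
# (1 - conf), so this sketch assumes no rule has confidence exactly 1.
METRICS: Dict[str, Callable[[Rule], float]] = {
    "support": lambda r: r[2],
    "confidence": conf,
    "laplace(k=2)": lambda r: (r[2] + 1) / (r[1] + 2),
    "gain(theta=0.5)": lambda r: r[2] - 0.5 * r[1],
    "p-s": lambda r: r[2] - r[1] * SUP_C / N_RECORDS,
    "conviction": lambda r: (N_RECORDS - SUP_C) / (N_RECORDS * (1 - conf(r))),
    "lift": lambda r: N_RECORDS * conf(r) / SUP_C,
}


def sc_less(r1: Rule, r2: Rule) -> bool:
    """r1 <_sc r2, comparing rule support and confidence."""
    s1, c1, s2, c2 = r1[2], conf(r1), r2[2], conf(r2)
    return (s1 <= s2 and c1 < c2) or (s1 < s2 and c1 <= c2)


RULES: List[Rule] = [
    ("a", 10, 9), ("b", 40, 28), ("c", 30, 15),
    ("d", 80, 36), ("e", 50, 20), ("f", 12, 9),
]

# Labels of rules that no other rule strictly dominates (the upper border).
BORDER: Set[str] = {r[0] for r in RULES
                    if not any(sc_less(r, o) for o in RULES)}

# Best rule under each metric.
BEST: Dict[str, str] = {name: max(RULES, key=f)[0]
                        for name, f in METRICS.items()}

print(sorted(BORDER))
for name, label in BEST.items():
    print(f"{name:16s} best rule: {label}  on border: {label in BORDER}")
```

With these numbers every metric's winner is one of the three border rules, which is the behaviour Lemma 3.1 leads one to expect for metrics implied by the partial order. A single example is of course not a proof.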
However, we show that the space of rules can be partitioned into two sets according to confidence such that when restricted to rules in one set, each metric is implied by $\leq_{sc}$, and when restricted to rules in the other set, each metric is implied by $\leq_{s\neg c}$. As a consequence, the optimal rules with respect to entropy gain, gini, and chi-squared value must reside on either the upper or lower support-confidence border. This idea is formally stated by the observation below.

OBSERVATION 3.4: Given instance $I = \langle U, D, \leq_t, C, N \rangle$, if $\leq_{sc}$ implies $\leq_t$ over the set of rules whose confidence is greater than or equal to some value $\gamma$, and $\leq_{s\neg c}$ implies $\leq_t$ over the set of rules whose confidence is less than or equal to $\gamma$, then an $I$-optimal rule appears in either (a) any $I_{sc}$-optimal set where $I_{sc} = \langle U, D, \leq_{sc}, C, N \rangle$, or (b) any $I_{s\neg c}$-optimal set where $I_{s\neg c} = \langle U, D, \leq_{s\neg c}, C, N \rangle$.

To demonstrate that the entropy gain, gini, and chi-squared values satisfy the requirements put forth by this observation, we need to know when the total order defined by a rule value function is implied by $\leq_{s\neg c}$. We use an analog of Lemma 3.3 for this purpose:

LEMMA 3.5: The following conditions are sufficient for establishing that a total order $\leq_t$ defined over a rule value function $f(r)$ is implied by partial order $\leq_{s\neg c}$:
(1) $f(r)$ is monotone in support over rules with the same confidence, and
(2) $f(r)$ is anti-monotone in confidence over rules with the same support.

In [16], the chi-squared, entropy gain, and gini values of a rule are each defined in terms of a function $f(x, y)$ where $x = \mathrm{sup}(A) - \mathrm{sup}(A \rightarrow C)$ and $y = \mathrm{sup}(A \rightarrow C)$ given the rule $A \rightarrow C$ to which it is applied (footnote 3; definitions appear in Appendix A). These functions are further proven to be convex. Another important property of each of these functions is that they reach their minimum at any rule whose confidence is equal to the "expected" confidence [17]. More formally, $f(x, y)$ is minimum when $\mathrm{conf}(x, y) = c$ where $\mathrm{conf}(x, y) = y / (x + y)$ and $c = \mathrm{sup}(C) / |D|$. To prove our claims, we exploit two properties of convex functions from [11]:

(1) Convexity of a function $f(x, y)$ implies that for an arbitrary dividing point $(x_3, y_3)$ of the line segment between two points $(x_1, y_1)$ and $(x_2, y_2)$, we have
$$\max(f(x_1, y_1), f(x_2, y_2)) \geq f(x_3, y_3)$$

(footnote 4).

(2) A convex function over a given region must be continuous at every point of the region's relative interior.

The next two lemmas show that a convex function $f(x, y)$ which is minimum at $\mathrm{conf}(x, y) = c$ has the properties required by Observation 3.4, where $\gamma$ from the observation is set to the expected confidence value.

LEMMA 3.6: For a convex function $f(x, y)$ which is minimum at $\mathrm{conf}(x, y) = c$, $f(x, y)$ is (1) monotone in $\mathrm{conf}(x, y)$ for fixed $y$, so long as $\mathrm{conf}(x, y) \geq c$, and (2) monotone in $y$ when $\mathrm{conf}(x, y) = A$ for any constant $A$.

Proof: For case (1), when $y$ is fixed to a constant $Y$, $\mathrm{conf}(x, Y)$ for $\mathrm{conf}(x, Y) \geq c$ represents a horizontal line segment that extends from point $\mathrm{conf}(x, Y) = c$ leftward. The value of $\mathrm{conf}(x, Y)$ is clearly anti-monotone in $x$. Because $f$ is also anti-monotone in $x$ in this region as a consequence of its convexity, it follows that $f(x, Y)$ is monotone in $\mathrm{conf}(x, Y)$ (footnote 5).

For case (2), assume the claim is false. This implies there exists a vector $v$ defined by $\mathrm{conf}(x, y) = A$ for some constant $A$ along which $f$ is not monotone in $y$. That is, there exist two points $(x_1, y_1)$ and $(x_2, y_2)$ along $v$ where $f(x_1, y_1) > f(x_2, y_2)$ yet $y_1 < y_2$ (see Figure 2). If $f$ were defined to be minimum at $(0, 0)$, then this would contradict the fact that $f$ is convex. But since $f$ as well as $v$ are undefined at this point, another argument is required. Consider then some sufficiently small non-zero value $\delta$ such that $x_2 - \delta \geq 0$ and $f(x_1, y_1) > f(x_2 - \delta, y_2)$. Because $f$ is convex and continuous in its interior region, such a value of $\delta$ is guaranteed to exist unless $x_2 = 0$, which is a trivial boundary case. Now, consider the line $[(x_1, y_1), (x_2 - \delta, y_2)]$. This line must contain a point $(x_3, y_3)$ such that $x_3$ and $y_3$ are non-negative, and one or both of $x_3$ or $y_3$ is non-zero. But because $f$ is convex and minimum at $(x_3, y_3)$, we have that $f(x_1, y_1) \leq f(x_2 - \delta, y_2)$, which is a contradiction.

[Figure 2. Illustration of case (2) from Lemma 3.6, showing the points $(x_1, y_1)$, $(x_2 - \delta, y_2)$ and $(x_3, y_3)$.]

Footnote 3: We are making some simplifications: these functions are actually defined in terms of a vector defining the class distribution after a binary split. We are restricting attention to the case where there are only two classes ($\neg C$ and $C$, which correspond to $x$ and $y$ respectively). The binary split in our case is the segmentation of data-set $D$ made by testing the antecedent condition $A$ of the rule.

Footnote 4: This well-known property of convex functions is sometimes given as the definition of a convex function, e.g. [16]. While this property is necessary for convexity, it is not sufficient. The proofs of convexity for gini, entropy, and chi-squared value in [16] are nevertheless valid for the actual definition of convexity since they show that the second derivatives of these functions are always non-negative, which is necessary and sufficient [11].

Footnote 5: We are not being completely rigorous due to the bounded nature of the convex region over which $f$ is defined. For example, the point $\mathrm{conf}(x, Y) = c$ may not be within this bounded region since $x$ can be no greater than $|D|$. Verifying that these boundary conditions do not affect the validity of our claims is left as an exercise.

LEMMA 3.7: For a convex function $f(x, y)$ which is minimum at $\mathrm{conf}(x, y) = c$, $f(x, y)$ is (1) anti-monotone in $\mathrm{conf}(x, y)$ for fixed $y$, so long as $\mathrm{conf}(x, y) \leq c$, and (2) monotone in $y$ when $\mathrm{conf}(x, y) = A$ for any constant $A$.

Proof: Similar to the previous.

The previous two lemmas and Observation 3.4 lead immediately to the following theorem, which formalizes the fact that mining the upper and lower support-confidence borders identifies the optimal rules according to metrics such as entropy gain, gini, and chi-squared value. Conveniently, an algorithm specifically optimized for mining the upper border can be used without modification to mine the lower border by simply negating the consequent of the given instance, as stated by the subsequent lemma.

THEOREM 3.8: Given instance $I = \langle U, D, \leq_t, C, N \rangle$, if $\leq_t$ is defined over the values given by a convex function $f(x, y)$ over rules $A \rightarrow C$ where:
(1) $x = \mathrm{sup}(A) - \mathrm{sup}(A \rightarrow C)$ and $y = \mathrm{sup}(A \rightarrow C)$, and
(2) $f(x, y)$ is minimum at $\mathrm{conf}(x, y) = \mathrm{sup}(C) / |D|$,
then an $I$-optimal rule appears in either (a) any $I_{sc}$-optimal set where $I_{sc} = \langle U, D, \leq_{sc}, C, N \rangle$, or (b) any $I_{s\neg c}$-optimal set where $I_{s\neg c} = \langle U, D, \leq_{s\neg c}, C, N \rangle$.

LEMMA 3.9: Given an instance $I_{s\neg c} = \langle U, D, \leq_{s\neg c}, C, N \rangle$, any $I_{sc}$-optimal set for $I_{sc} = \langle U, D, \leq_{sc}, \neg C, N \rangle$ (where $\neg C$ evaluates to true only when $C$ evaluates to false) is also an $I_{s\neg c}$-optimal set.

Proof Idea: Note that $\mathrm{conf}(A \rightarrow \neg C) = 1 - \mathrm{conf}(A \rightarrow C)$. Thus, maximizing the confidence of $A \rightarrow \neg C$ minimizes the confidence of $A \rightarrow C$.

Before ending this section we consider one practical issue -- that of result visualization. Note that the support-confidence borders as displayed in Figure 1 provide an excellent means by which optimal sets of rules may be visualized.
Each border clearly illustrates the trade-off between the support and confidence. Additionally, one can imagine the result visualizer color-coding points along these borders that are optimal according to the various interestingness metrics, e.g. blue for Laplace value, red for chi-squared value, green for entropy gain, and so on. The result of modifying minimum support or confidence on the optimal rules could be displayed in real-time as the user drags a marker along either axis in order to specify a minimum bound.

3.3 Practical Implications for Mining Optimal Conjunctions

In this section we present and evaluate an algorithm that efficiently mines an optimal set of conjunctions according to $\leq_{sc}$ (and $\leq_{s\neg c}$ due to Lemma 3.9) from many real-world categorical data-sets, without requiring any constraints to be specified by the user. We also demonstrate that the number of rules produced by this algorithm for a given instance is typically quite manageable -- on the order of a few hundred at most. We address the specific problem of mining optimal conjunctions within categorically valued data, where each condition in $U$ is simply a test of whether the given input record contains a particular attribute/value pair, excluding values from a designated class column. Values from the designated class column are used as consequents. While our algorithm requires no minimums on support or confidence, if they are specified, they can be exploited for better performance.

Space constraints prohibit a full explanation of the workings of this algorithm, so we highlight only the most important features here. A complete description appears in an extended draft [7].

The algorithm we use is a variant of Dense-Miner from [6], which is a constraint-based rule miner suitable for use on large and dense data-sets. In Dense-Miner, the rule mining problem is framed as a set-enumeration tree search problem [21] where each node of the tree enumerates a unique element of $2^U$. Dense-Miner returns every rule that satisfies the input constraints, which include minimum support and confidence.
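The set-enumeration framing mentioned above can be illustrated with a minimal sketch. It assumes a fixed ordering of the conditions in U and lets each node extend its subset only with conditions that come later in that ordering, so every subset is produced exactly once. The function name and the breadth-first traversal are choices made for this example; it does not reproduce Dense-Miner's actual node representation or its pruning.

```python
from collections import deque
from typing import Deque, Iterator, Sequence, Tuple


def set_enumeration_bfs(conditions: Sequence[str]) -> Iterator[Tuple[str, ...]]:
    """Yield every subset of `conditions` once, level by level.

    Each queue entry is (subset so far, index of the first condition that
    may still be appended). Appending only later conditions guarantees that
    no subset is reached along two different paths.
    """
    queue: Deque[Tuple[Tuple[str, ...], int]] = deque([((), 0)])
    while queue:
        subset, start = queue.popleft()
        yield subset
        for i in range(start, len(conditions)):
            queue.append((subset + (conditions[i],), i + 1))


for node in set_enumeration_bfs(["a", "b", "c"]):
    print(node)
```

For three conditions the traversal visits the empty set first, then the three singletons, then the pairs, and finally the full set, for a total of 2^3 = 8 nodes. The exponential size of this tree is why pruning matters in practice.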
We modified Dense-Miner to instead maintain only the set of rules $R$ that are potentially optimal at any given point during its execution. Whenever a rule $r$ is enumerated by a node and found to satisfy the input constraints, it is compared against every rule presently in $R$. If $r$ is better than or incomparable to every rule already in $R$ according to the partial order, then rule $r$ is added to $R$. Also, any rule in $R$ that is worse than $r$ is removed. Given this policy, assuming the tree enumerates every subset of $U$, upon termination, $R$ is an optimal set.

Because an algorithm which enumerates every subset would be unacceptably inefficient, we use pruning strategies that greatly reduce the search space without compromising completeness. These strategies use Dense-Miner's pruning functions (appearing in Appendix B), which bound the confidence and support of any rule that can be enumerated by a descendent of a given node. To see how these functions are applied in our variant of the algorithm, consider a node $g$ with support bound $s$ and confidence bound $c$. To see if $g$ can be pruned, the algorithm determines if there exists a rule $r$ in $R$ such that $r_i \le_{sc} r$, where $r_i$ is some imaginary rule with support $s$ and confidence $c$. Given such a rule, if any descendent of $g$ enumerates an optimal rule, then it must be equivalent to $r$. This equivalence class is already represented in $R$, so there is no need to enumerate these descendents, and $g$ can be pruned.

This algorithm differs from Dense-Miner in only two additional ways. First, we allow the algorithm to perform a set-oriented best-first search of the tree instead of a purely breadth-first search. Dense-Miner uses a breadth-first search since this limits the number of database passes required to the height of the search tree. In the context of optimized rule mining, a breadth-first strategy can be inefficient because pruning improves as better rules are found, and good rules sometimes arise only at the deeper levels. A pure best-first search requires a database pass for each node in the tree, which would be unacceptable for large data-sets.
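The maintenance policy for $R$ described earlier in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' Dense-Miner variant: the `Rule` class and the function names `sc_less`, `sc_equivalent`, and `update_optimal_set` are hypothetical, and each rule is assumed to arrive with its support and confidence already counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    # Hypothetical container: antecedent conditions plus precomputed statistics.
    antecedent: frozenset
    support: int
    confidence: float


def sc_less(r1, r2):
    """True if r1 is strictly worse than r2 under the support/confidence partial order."""
    return ((r1.support <= r2.support and r1.confidence < r2.confidence)
            or (r1.support < r2.support and r1.confidence <= r2.confidence))


def sc_equivalent(r1, r2):
    """True if both rules have equal support and equal confidence."""
    return r1.support == r2.support and r1.confidence == r2.confidence


def update_optimal_set(R, r):
    """Add r if it is better than or incomparable to every rule in R,
    and drop every rule in R that is worse than r."""
    if any(sc_less(r, q) or sc_equivalent(r, q) for q in R):
        return R  # r is dominated, or its equivalence class is already represented
    return [q for q in R if not sc_less(q, r)] + [r]


if __name__ == "__main__":
    R = []
    for rule in [Rule(frozenset({"a"}), 10, 0.5), Rule(frozenset({"b"}), 5, 0.9),
                 Rule(frozenset({"c"}), 10, 0.7), Rule(frozenset({"d"}), 4, 0.6)]:
        R = update_optimal_set(R, rule)
    print(sorted(sorted(q.antecedent) for q in R))
```

Each update compares the new rule against all of $R$, which seems acceptable here because the text reports that optimal sets under this order contain a few hundred rules at most.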
Instead of a pure best-first search, we process several of the best nodes (at most 5000 in our implementation) with each database pass in order to reduce the
number of database passes while still substantially reducing the search space. For this purpose, a node is better than another if the rule it enumerates has a higher confidence value.

The remaining modification is the incorporation of inclusive pruning as proposed by Webb [23]. This pruning strategy avoids enumerating a rule when it can be determined that its antecedent can be extended with an additional condition without affecting the support of the rule. In the absence of item constraints, this optimization prunes many rules that are either non-optimal or equivalent to some other optimal rule to be enumerated. Unfortunately, when there are item constraints in $N$ (e.g. "rules must contain fewer than $k$ conditions"), this pruning strategy cannot be trivially applied without compromising completeness, so it must be disabled. Full details of this pruning strategy are provided in Appendix B.

We evaluated our algorithm on the larger of the categorical data-sets from the Irvine machine learning database repository (footnote 6), including chess, mushroom, letter, connect-4, and dna. We also used the pums data-set from [6] which is compiled from census data (a similar data-set was used in [8]). For the Irvine data-sets, we used each value of the designated class column as the consequents. For the pums data-set, we used the values of the RELAT1 column (13 in all; footnote 7). Each of these data-sets is known to be difficult for constraint-based rule mining algorithms such as Apriori, even when specifying a strong minimum support constraint [4,6,8].

Experiments were performed on an IBM IntelliStation with 400 MHz Intel Pentium-II processor and 128 MBytes of main memory. Execution time and the number of rules returned by the algorithm appear in Table 1; characteristics of each data-set appear in Table 2. For the Irvine data-sets, with the exception of connect-4, our algorithm identified an optimal set of rules within 30 seconds in every case, with many runs requiring less than 1 second. Connect-4 was the most difficult of the data-sets for two reasons. First, it has substantially more records and more columns than many of the other data-sets.
But a stronger contributor to this discrepancy was the fact that rules with high confidence within the connect-4 data-set have very low support. For example, with the tie class as the consequent, rules with 100% confidence have a support of at most 14 records. This property greatly reduces pruning effectiveness, resulting in almost one hour of execution time given this consequent. In cases like these, modest settings of the minimum support or confidence constraint can be used to improve runtime considerably. For example, a minimum support of 676 records (1% of the data-set) reduces execution time to 6 minutes.

The number of rules in each optimal set was on the order of a few hundred at most. Of the Irvine data-sets, connect-4 contained the most optimal rules, with 216 for win, 171 for tie, and 465 for lose. We plot the upper support-confidence border for each of these consequents in Figure 3. Rule support is normalized according to consequent support so that each border ranges from 0 to 100% along the x axis.

Data-set    Consequent   Time (sec)   # of Rules
chess       win          <1           60
            nowin        <1           41
connect-4   win          642          216
            draw         3066         171
            loss         1108         465
letter      A-Z          18           322
dna         EI           20           9
            IE           23           15
            N            <1           9
mushroom    poisonous    <1           12
            edible       <1           7
pums        1            740          324
            2            204          702
            3            509          267
            4            174          152
            5            46           91
            6            19           81
            7            50           183
            8            270          210
            9            843          383
            10           572          424
            11           88           165
            12           12           11
            13           22           102

Table 1. Execution time and number of rules returned.

Figure 3. Upper support/confidence borders for Connect-4. (Series: win, tie, lose; x axis: $\mathrm{sup}(r)/\mathrm{sup}(C)$ (%); y axis: Confidence (%).)

Footnote 6: http://www.ics.uci.edu/~mlearn/mlrepository.html
Footnote 7: This data-set is available in the form used in these experiments through: http://www.almaden.ibm.com/cs/quest The values 1-13 for the RELAT1 column correspond to items 1-13 in the apriori binary format of this data.

4. PC-Optimality

4.1 Definition

While sc-optimality is a useful concept, it tends to produce rules that primarily characterize only a specific subset of the population of interest (by population of interest, we mean the set of records for which condition $C$ evaluates to true). In this section, we propose another partial order with the goal of remedying this deficiency.
First, the population of a rule $A \to C$ is simply the set of records from data-set $D$ for which both $A$ and $C$ evaluate to true. We denote the population of a rule $r$ as $\mathrm{pop}(r)$. Clearly then, $|\mathrm{pop}(r)| = \mathrm{sup}(r)$. A rule which contains some record $t$ within its population is said to characterize $t$.

Data-set    # of Rows   # of Columns
chess       3,196       37
connect-4   67,557      43
dna         3,124       61
letter      20,000      17
mushroom    8,124       23
pums        49,046      74

Table 2. Number of rows and columns in each data-set.

Consider now the partial order $\le_{pc}$ on rules where $r_1 <_{pc} r_2$ if and only if:

$\mathrm{pop}(r_1) \subseteq \mathrm{pop}(r_2) \wedge \mathrm{conf}(r_1) < \mathrm{conf}(r_2)$, or
$\mathrm{pop}(r_1) \subset \mathrm{pop}(r_2) \wedge \mathrm{conf}(r_1) \le \mathrm{conf}(r_2)$.

Two rules are equivalent according to this partial order if their population sets are identical and their confidence values are equal. One can analogously define the partial order $\le_{p\neg c}$ where $r_1 <_{p\neg c} r_2$ if and only if:

$\mathrm{pop}(r_1) \subseteq \mathrm{pop}(r_2) \wedge \mathrm{conf}(r_1) > \mathrm{conf}(r_2)$, or
$\mathrm{pop}(r_1) \subset \mathrm{pop}(r_2) \wedge \mathrm{conf}(r_1) \ge \mathrm{conf}(r_2)$.

4.2 Theoretical Implications

It is easy to see that $\le_{sc}$ is implied by $\le_{pc}$ (and $\le_{s\neg c}$ by $\le_{p\neg c}$), so all the claims from Section 3 also hold with respect to pc-optimality. Note that $\le_{pc}$ results in more incomparable rule pairs than $\le_{sc}$ due to the population subsumption requirement. This implies there will be many more rules in a pc-optimal set compared to an sc-optimal set. The consequence of these additional rules, as formalized by the observation below, is that a pc-optimal set always contains a rule that is optimal with respect to any of the previous interestingness metrics, and further with respect to any constraint that requires the rules to characterize a given subset of the population of interest.

OBSERVATION 4.1: Given an instance $I = \langle U, D, \le_t, C, N \rangle$ where (1) $\le_t$ is implied by $\le_{pc}$, and (2) $N$ has a constraint $n$ on rules $r$ stating that $P \subseteq \mathrm{pop}(r)$ for a given subset $P$ of the population of interest, an $I$-optimal rule appears in any $I_{pc}$-optimal set where $I_{pc} = \langle U, D, \le_{pc}, C, N - \{n\} \rangle$.

We note that this generalization of sc-optimality is quite broad. Given the large number of rules that can appear in a pc-optimal set (see next subsection), a more controlled generalization might be desirable. One idea is to have the rule set support any constraint that requires a rule to characterize a given member $t$ (in place of a subset $P$) of the population of interest. This would still guarantee a broad characterization, but potentially with much fewer rules.
This notion designates a rule $r$ as uninteresting if there exists some set of rules $R$ such that (1) each rule in $R$ satisfies the input constraints, (2) every member of $\mathrm{pop}(r)$ is characterized by at least one rule in $R$, and (3) each rule in $R$ is equal to or better than $r$ according to $\le_{sc}$. (Note that pc-optimality designates a rule as uninteresting only if there exists such a set $R$ containing exactly one rule.) This notion of uninterestingness cannot, to our knowledge, be expressed using a partial order on rules. Instead, we are examining how the concept of pc-optimality combined with constraints can successfully express (and exploit) this notion.

4.3 Practical Implications for Mining Optimal Conjunctions

Producing an algorithm that can efficiently mine an optimal set of conjunctions with respect to $\le_{pc}$ may seem an impossible proposition in large data-sets due to the need to verify subsumption between rule populations over many rules. However, we will show that we can verify the relationships induced by the partial order syntactically in many cases; that is, by checking the conditions of the rules along with support and confidence, instead of examining the set of database records that comprise their populations. Keep in mind that in this subsection we are restricting attention to mining optimal conjunctions. In typical formulations of mining optimized disjunctions (e.g. [12,20]), syntactic checks for population subsumption simply require geometrically comparing the regions represented by the base conditions.

DEFINITION (A-MAXIMAL): A rule $A_1 \to C$ is a-maximal if there is no rule $A_2 \to C$ such that $A_1 \subset A_2$ and $\mathrm{sup}(A_1 \to C) = \mathrm{sup}(A_2 \to C)$.

Note that the definition of an a-maximal rule is such that it need not satisfy the input constraints. The intuition behind a-maximality ("antecedent" maximality) is that an a-maximal rule cannot have its antecedent enlarged without strictly reducing its population. Given a rule $r$, we denote an a-maximal rule that has the same population of $r$ as $\mathrm{amax}(r)$. Lemma 4.2a implies there is only one such rule, which means that $\mathrm{amax}()$ is in fact a function.
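The purpose of this function is to provide a concise, canonical description of a rule's population that allows for efficiently checking the subsumption condition (Lemma 4.2). In turn, this provides a syntactic method for comparing rules with $\le_{pc}$ (Theorem 4.3).

LEMMA 4.2
A: $\mathrm{pop}(r_1) = \mathrm{pop}(r_2) \Leftrightarrow \mathrm{amax}(r_1) = \mathrm{amax}(r_2)$
B: $\mathrm{pop}(r_1) \subset \mathrm{pop}(r_2) \Leftrightarrow \mathrm{amax}(r_1) \supset \mathrm{amax}(r_2)$

Proof: The $\Leftarrow$ direction is immediate for both cases, so we prove the $\Rightarrow$ direction. Suppose the claims are false and consider the rule $r_3$ which is formed by taking the union of the antecedents from $\mathrm{amax}(r_1)$ and $\mathrm{amax}(r_2)$. We establish each claim by contradiction.

For claim A, if $\mathrm{pop}(r_1) = \mathrm{pop}(r_2)$, then clearly $r_3$ has the same population as $r_1$ and $r_2$. Since we assume the claim is false, we have that $\mathrm{amax}(r_1)$ is different than $\mathrm{amax}(r_2)$. Given this, either $\mathrm{amax}(r_1) \subset \mathrm{amax}(r_3)$ or $\mathrm{amax}(r_2) \subset \mathrm{amax}(r_3)$. But since all three rules have the same population, this leads to the contradiction that either $\mathrm{amax}(r_1)$ or $\mathrm{amax}(r_2)$ is not a-maximal.

For claim B, if $\mathrm{pop}(r_1) \subset \mathrm{pop}(r_2)$, then $\mathrm{pop}(r_3) = \mathrm{pop}(r_1)$. As a consequence, we must have that $\mathrm{amax}(r_1) = \mathrm{amax}(r_3)$ by claim A from above. Since we assume the claim is false, we must in addition have that $\mathrm{amax}(r_1) \not\supseteq \mathrm{amax}(r_2)$ or $\mathrm{amax}(r_1) = \mathrm{amax}(r_2)$. The case where they are equal is an immediate contradiction due to claim A. For the case where $\mathrm{amax}(r_1) \not\supseteq \mathrm{amax}(r_2)$, note that $\mathrm{amax}(r_3) \supseteq \mathrm{amax}(r_2)$, which contradicts the fact that $\mathrm{amax}(r_3) = \mathrm{amax}(r_1) \not\supseteq \mathrm{amax}(r_2)$.

THEOREM 4.3
A: $r_1 <_{pc} r_2$ if and only if: (1) $\mathrm{conf}(r_1) < \mathrm{conf}(r_2)$ and $\mathrm{amax}(r_1) \supseteq \mathrm{amax}(r_2)$, or (2) $\mathrm{conf}(r_1) \le \mathrm{conf}(r_2)$ and $\mathrm{amax}(r_1) \supset \mathrm{amax}(r_2)$.
B: $r_1 =_{pc} r_2$ if and only if $\mathrm{conf}(r_1) = \mathrm{conf}(r_2)$ and $\mathrm{amax}(r_1) = \mathrm{amax}(r_2)$.

These facts cannot be trivially applied in an efficient manner since computing $\mathrm{amax}(r)$ requires scanning the data-set (an algorithm for this purpose appears in Figure 4). Instead, we describe pruning optimizations which can be incorporated into a rule miner to avoid generating many rules that are not pc-optimal. A post-processing phase then computes $\mathrm{amax}(r)$ for every rule identified in one final pass over the database, and this information is used to efficiently identify any remaining non-optimal rules.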
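To make the syntactic comparison concrete, the following Python sketch computes an a-maximal antecedent by intersecting the records in a rule's population and then applies the test of Theorem 4.3A. It is a toy illustration under assumptions, not the paper's implementation: the record layout (a `frozenset` of conditions paired with a class label), the sample `RECORDS`, and the names `population`, `amax`, `confidence`, and `pc_less` are all hypothetical, and returning the antecedent unchanged for an empty population is a choice made here.

```python
def population(antecedent, consequent, records):
    """Indices of records on which the antecedent and the consequent both hold."""
    return {i for i, (items, label) in enumerate(records)
            if antecedent <= items and label == consequent}


def amax(antecedent, consequent, records):
    """Antecedent of the a-maximal rule sharing this rule's population."""
    pop = population(antecedent, consequent, records)
    if not pop:
        return frozenset(antecedent)  # assumption: leave empty-population rules unchanged
    # Conditions true on every population record can be added without shrinking it.
    shared = frozenset.intersection(*(records[i][0] for i in pop))
    return frozenset(antecedent) | shared


def confidence(antecedent, consequent, records):
    """Fraction of covered records that satisfy the consequent (assumes coverage > 0)."""
    covered = [label for items, label in records if antecedent <= items]
    return sum(label == consequent for label in covered) / len(covered)


def pc_less(conf1, amax1, conf2, amax2):
    """Theorem 4.3A: compare confidences and a-maximal antecedents only."""
    return (conf1 < conf2 and amax1 >= amax2) or (conf1 <= conf2 and amax1 > amax2)


# Hypothetical toy data-set: (conditions, class label).
RECORDS = [
    (frozenset({"a", "b", "x"}), "pos"),
    (frozenset({"a", "b", "y"}), "pos"),
    (frozenset({"a", "y"}), "pos"),
    (frozenset({"a", "b"}), "neg"),
    (frozenset({"b", "x"}), "neg"),
]

if __name__ == "__main__":
    r1, r2 = frozenset({"b"}), frozenset({"a"})
    print(amax(r1, "pos", RECORDS), amax(r2, "pos", RECORDS))
    print(pc_less(confidence(r1, "pos", RECORDS), amax(r1, "pos", RECORDS),
                  confidence(r2, "pos", RECORDS), amax(r2, "pos", RECORDS)))
```

On this toy data the rule with antecedent {b} has population {0, 1}, a subset of the population {0, 1, 2} of the rule with antecedent {a}, and `pc_less` reaches the same verdict without comparing the two populations directly.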

INPUT: a rule $r$ and a data-set $D$
OUTPUT: $\mathrm{amax}(r)$
1. Find the first record in $D$ that is a member of the population of $r$. Initialize a set $A$ to contain those conditions of $U$ that evaluate to true on this record, but are not already in $r$.
2. For each remaining record in $D$ which is in the population of $r$, remove any condition in $A$ which evaluates to false on this record.
3. Add the conditions in $A$ to the antecedent of $r$ and return the result.

Figure 4. Algorithm for computing $\mathrm{amax}(r)$.

INPUT: $I_{pc} = \langle U, D, \le_{pc}, C, N \rangle$
OUTPUT: An $I_{pc}$-optimal set.
1. Find all rules with positive improvement among those satisfying the input constraints. Call this set of rules $R$.
2. For each rule $r$ in $R$, associate $r$ with $\mathrm{amax}(r)$ and put $\mathrm{amax}(r)$ into a set $R_a$.
3. Remove any rule $r_1$ from $R$ if there exists some $r_2 \in R$ such that $r_1 <_{pc} r_2$ according to Theorem 4.3a.
4. For each rule $r_a \in R_a$, if there is more than one rule $r$ in $R$ such that $\mathrm{amax}(r) = r_a$, then remove all but one of them from $R$.
5. Return $R$.

Figure 5. Algorithm for mining a pc-optimal set of conjunctions.

OBSERVATION 4.4: Given an instance $I_{pc} = \langle U, D, \le_{pc}, C, N \rangle$, a rule $A_1 \to C$ cannot be $I_{pc}$-optimal if there exists some rule $A_2 \to C$ where $A_2 \subset A_1$, $A_2$ satisfies the input constraints, and $\mathrm{conf}(A_2 \to C) > \mathrm{conf}(A_1 \to C)$.

Using the terminology from [6], this observation simply states that a pc-optimal rule must have a non-negative improvement value. We can in fact require that improvement be positive since we only need one member of each equivalence class. The Dense-Miner algorithm already provides pruning functions that bound the improvement value of any rule derivable from a given node. We thus modify Dense-Miner to prune nodes whose improvement bound is 0 (see Appendix B for the simplified pruning functions that can be used for this special case). We also again exploit Webb's inclusive pruning strategy for the case where there are no item constraints in $N$.

We can now fully describe an algorithm for mining an $I_{pc}$-optimal set of rules without performing any explicit subsumption checks between rule populations (Figure 5). We use the above-described variant of Dense-Miner to implement step 1 of this algorithm.
Step 3 applies Theorem 4.3a to identify non-optimal rules for removal. This step requires we first associate each rule with its a-maximal rule (step 2). To find the a-maximal rules, we apply the algorithm in Figure 4 for each rule in $R$, using a single shared database scan. For a data-set such as connect-4, our implementation of this step requires less than 3 seconds for a set with up to 10,000 rules. Finally, step 4 applies Theorem 4.3b in order to identify equivalent rules so that there is only one representative from each equivalence class in the returned result.

This algorithm could be simplified slightly when the input constraints $N$ have the property that $\mathrm{amax}(r)$ satisfies $N$ whenever $r$ does. In this case, the set $R_a$ could be returned immediately following step 2 (since $R_a$ is a set, we assume it contains no duplicates). However, some common rule constraints (e.g. "a rule must have fewer than $k$ conditions") do not have this property.

Data-set    Consequent   Time (sec)   # of Rules
chess       win          2,821        236,735
            nowin        504          42,187
connect-4   win          19,992       178,678
            draw         18,123       119,984
            loss         34,377       460,444
letter      A-Z          65           37,850
dna         EI           64           55,347
            IE           46           49,505
            N            53           9,071
mushroom    poisonous    1            217
            edible       3            389
pumsb*      1            1,058        84,594
            2            829          33,443
            3            305          14,927
            4            770          28,553
            5            308          21,244
            6            59           5,717
            7            412          15,474
            8            428          22,992
            9            3,079        160,908
            10           3,857        175,061
            11           118          5,701
            12           12           991
            13           482          59,088

Table 3. Execution time and number of rules returned.

In practice, we find that the number of rules returned by this algorithm is considerably larger than that of the algorithm for mining sc-optimal sets. On most data-sets, rule constraints such as minimum support or confidence must be specified to control the size of the output as well as execution time. For the execution times reported in Table 3, the minimum support constraint was set to ensure that each rule's population is at least 5% of the population of interest. For the pums data-set, this constraint was not sufficiently strong to control combinatorial explosion.
We therefore simplified the data-set by removing values which appear in 80% or more of the records (the resulting data-set is known as pumsb*, and was used in [5]). We conclude that pc-optimal sets of rules are useful primarily on data-sets where the density is somewhat subdued, or when the user is capable of specifying strong rule constraints prior to mining.

5. Conclusions

We have defined a new optimized rule mining problem that allows a partial order in place of the typical total order on rules. We have also shown that solving this optimized rule mining problem with respect to a particular partial order $\le_{sc}$ (and in some cases its analog $\le_{s\neg c}$) is guaranteed to identify a most-interesting rule according to several interestingness metrics including support, confidence, gain, laplace value, conviction, lift, entropy gain, gini, and chi-squared value. In practice, identifying such an optimal set of conjunctive rules can be done efficiently, and the number of rules in such a set is typically small enough to be easily browsed by
an end-user. We lastly generalized this concept using another partial order $\le_{pc}$ in order to ensure that the entire population of interest is well-characterized. This generalization defines a set of rules that contains the most interesting rule according to any of the above metrics, even if one requires this rule to characterize a specific subset of the population of interest.

These techniques can be used to facilitate interactivity in the process of mining most-interesting rules. After mining an optimal set of rules according to the first partial order, the user can examine the most-interesting rule according to any of the supported interestingness metrics without additional querying or mining of the database. Minimum support and confidence can also be modified and the effects immediately witnessed. After mining an optimal set of rules according to the second partial order, in addition to the above, the user can quickly find the most-interesting rule that characterizes any given subset of the population of interest. This extension overcomes the deficiency of most optimized rule miners where much of the population of interest may be poorly characterized or completely uncharacterized by the mined rule(s).

Acknowledgments

We are indebted to Ramakrishnan Srikant and Dimitrios Gunopulos for their helpful suggestions and assistance.

References

[1] Agrawal, R.; Imielinski, T.; and Swami, A. 1993. Mining Associations between Sets of Items in Massive Databases. In Proc. of the 1993 ACM-SIGMOD Int'l Conf. on Management of Data, 207-216.
[2] Agrawal, R.; Mannila, H.; Srikant, R.; Toivonen, H.; and Verkamo, A. I. 1996. Fast Discovery of Association Rules. In Advances in Knowledge Discovery and Data Mining, AAAI Press, 307-328.
[3] Ali, K.; Manganaris, S.; and Srikant, R. 1997. Partial Classification using Association Rules. In Proc. of the 3rd Int'l Conf. on Knowledge Discovery in Databases and Data Mining, 115-118.
[4] Bayardo, R. J. 1997. Brute-Force Mining of High-Confidence Classification Rules. In Proc. of the Third Int'l Conf. on Knowledge Discovery and Data Mining, 123-126.
[5] Bayardo, R. J. 1998. Efficiently Mining Long Patterns from Databases. In Proc.
of the 1998 ACM-SIGMOD Int'l Conf. on Management of Data, 85-93.
[6] Bayardo, R. J.; Agrawal, R.; and Gunopulos, D. 1999. Constraint-Based Rule Mining in Large, Dense Databases. In Proc. of the 15th Int'l Conf. on Data Engineering, 188-197.
[7] Bayardo, R. J. and Agrawal, R. 1999. Mining the Most Interesting Rules. IBM Research Report. Available from: http://www.almaden.ibm.com/cs/quest
[8] Brin, S.; Motwani, R.; Ullman, J.; and Tsur, S. 1997. Dynamic Itemset Counting and Implication Rules for Market Basket Data. In Proc. of the 1997 ACM-SIGMOD Int'l Conf. on the Management of Data, 255-264.
[9] Clark, P. and Boswell, P. 1991. Rule Induction with CN2: Some Recent Improvements. In Machine Learning: Proc. of the Fifth European Conference, 151-163.
[10] Dhar, V. and Tuzhilin, A. 1993. Abstract-driven pattern discovery in databases. IEEE Transactions on Knowledge and Data Engineering, 5(6).
[11] Eggleston, H. G. 1963. Convexity. Cambridge Tracts in Mathematics and Mathematical Physics, no. 47. Smithies, F. and Todd, J. A. (eds.). Cambridge University Press.
[12] Fukuda, T.; Morimoto, Y.; Morishita, S.; and Tokuyama, T. 1996. Data Mining using Two-Dimensional Optimized Association Rules: Scheme, Algorithms, and Visualization. In Proc. of the 1996 ACM-SIGMOD Int'l Conf. on the Management of Data, 13-23.
[13] Goethals, B. and Van den Bussche, J. 1999. A Priori Versus A Posteriori Filtering of Association Rules. In Proc. of the 1999 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, paper 3.
[14] International Business Machines, 1996. IBM Intelligent Miner User's Guide, Version 1, Release 1.
[15] Mitchell, T. M. 1997. Machine Learning. McGraw-Hill, Inc.
[16] Morimoto, Y.; Fukuda, T.; Matsuzawa, H.; Tokuyama, T.; and Yoda, K. 1998. Algorithms for Mining Association Rules for Binary Segmentations of Huge Categorical Databases. In Proc. of the 24th Very Large Data Bases Conf., 380-391.
[17] Morishita, S. 1998. On Classification and Regression. In Proc. of the First Int'l Conf. on Discovery Science -- Lecture Notes in Artificial Intelligence 1532:40-57.
[18] Nakaya, A. and Morishita, S.
1999. Fast Parallel Search for Correlated Association Rules. Unpublished manuscript.
[19] Piatetsky-Shapiro, G. 1991. Discovery, Analysis, and Presentation of Strong Rules. Chapter 13 of Knowledge Discovery in Databases, AAAI/MIT Press, 1991.
[20] Rastogi, R. and Shim, K. 1998. Mining Optimized Association Rules with Categorical and Numeric Attributes. In Proc. of the 14th Int'l Conf. on Data Engineering, 503-512.
[21] Rymon, R. 1992. Search through Systematic Set Enumeration. In Proc. of Third Int'l Conf. on Principles of Knowledge Representation and Reasoning, 539-550.
[22] Srikant, R.; Vu, Q.; and Agrawal, R. 1997. Mining Association Rules with Item Constraints. In Proc. of the Third Int'l Conf. on Knowledge Discovery in Databases and Data Mining, 67-73.
[23] Webb, G. I. 1996. Inclusive Pruning: A New Class of Pruning Axiom for Unordered Search and its Application to Classification Learning. In Proc. of the 1996 Australian Computer Science Conference, 1-10.
[24] Webb, G. I. 1995. OPUS: An Efficient Admissible Algorithm for Unordered Search. Journal of Artificial Intelligence Research, 3:431-465.

Appendix A

Here we provide definitions for the gini, entropy gain, and chi-squared rule value functions. For a given condition $A$, we denote the fraction of records that satisfy the consequent condition $C$ among those that satisfy $A$ as $p(A)$, and the fraction of records that do not satisfy the consequent condition among those that satisfy $A$ as $\bar{p}(A)$. Note then that:

$$p(A) = \frac{\mathrm{sup}(A \to C)}{\mathrm{sup}(A)} \quad \text{and} \quad \bar{p}(A) = \frac{\mathrm{sup}(A) - \mathrm{sup}(A \to C)}{\mathrm{sup}(A)}$$

$$\mathrm{gini}(A \to C) = 1 - [p(\emptyset)^2 + \bar{p}(\emptyset)^2] - \frac{\mathrm{sup}(A)}{|D|}\left(1 - [p(A)^2 + \bar{p}(A)^2]\right) - \frac{\mathrm{sup}(\neg A)}{|D|}\left(1 - [p(\neg A)^2 + \bar{p}(\neg A)^2]\right)$$

$$\mathrm{ent}(A \to C) = -[p(\emptyset)\log p(\emptyset) + \bar{p}(\emptyset)\log \bar{p}(\emptyset)] + \frac{\mathrm{sup}(A)}{|D|}[p(A)\log p(A) + \bar{p}(A)\log \bar{p}(A)] + \frac{\mathrm{sup}(\neg A)}{|D|}[p(\neg A)\log p(\neg A) + \bar{p}(\neg A)\log \bar{p}(\neg A)]$$

$$\mathrm{chi}^2(A \to C) = \frac{\mathrm{sup}(A)[p(A) - p(\emptyset)]^2 + \mathrm{sup}(\neg A)[p(\neg A) - p(\emptyset)]^2}{p(\emptyset)} + \frac{\mathrm{sup}(A)[\bar{p}(A) - \bar{p}(\emptyset)]^2 + \mathrm{sup}(\neg A)[\bar{p}(\neg A) - \bar{p}(\emptyset)]^2}{\bar{p}(\emptyset)}$$

Since $C$ and $D$ are constant for a given instance of the problem, terms such as $p(\emptyset) = \mathrm{sup}(C)/|D|$ are constants. Any of the above variable terms can be expressed as functions of $x = \mathrm{sup}(A) - \mathrm{sup}(A \to C)$ and $y = \mathrm{sup}(A \to C)$, and hence so can the functions themselves. For example, $\mathrm{sup}(A) = y + x$, $\mathrm{sup}(\neg A) = |D| - (y + x)$, $p(A) = y/(y + x)$, $\bar{p}(A) = x/(y + x)$, and so on.

Appendix B

Set-Enumeration Tree Search

The set-enumeration tree search framework is a systematic and complete tree expansion procedure for searching through the power set of a given set $U$. The idea is to first impose a total order on the elements of $U$. The root node of the tree will enumerate the empty set. The children of a node $N$ will enumerate those sets that can be formed by appending a single element of $U$ to $N$, with the restriction that this single element must follow every element already in $N$ according to the total order. For example, a fully-expanded set-enumeration tree over a set of four elements (where each element of the set is denoted by its position in the ordering) appears in Figure 6.

Figure 6. A complete set-enumeration tree over a 4 item set. (Tree diagram with root {} and nodes 1; 2; 3; 4; 1,2; 1,3; 1,4; 2,3; 2,4; 3,4; 1,2,3; 1,2,4; 1,3,4; 2,3,4; 1,2,3,4.)
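The expansion rule above (a child appends one element that follows every element already in the node) can be sketched with a short recursive generator. This is an illustrative Python sketch rather than anything taken from Dense-Miner; the function name `expand` and the tuple representation of nodes are assumptions made here.

```python
def expand(node, tail):
    """Yield node and all of its descendants, depth-first.

    node is a tuple of elements in increasing order; tail holds the elements
    that follow every element of node in the total order.
    """
    yield node
    for i, u in enumerate(tail):
        # A child appends u; only elements after u remain available to it.
        yield from expand(node + (u,), tail[i + 1:])


if __name__ == "__main__":
    for n in expand((), (1, 2, 3, 4)):
        print(n)
```

Started from the empty tuple over four elements, the generator visits each of the 16 subsets exactly once, in the depth-first order of the tree described above.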
To facilitate pruning, we use a node representation called a group where the set enumerated by a node $g$ is called the head and denoted $h(g)$. The set of viable elements of $U$ which can be appended to $h(g)$ in order to form a child is called the tail and denoted $t(g)$. Making the tail elements explicit enables optimizations such as element reordering (for dynamic tree rearrangement) and tail pruning. Element reordering involves locally changing the total order on $U$ at each node in the tree to maximize pruning effectiveness. Tail pruning involves removing elements from the tail if they cannot possibly be part of any solution in the sub-tree rooted at the node. This directly reduces the search space, and indirectly reduces the search space by improving the bounds we compute on values such as the confidence of any rule that can be enumerated by a descendent of the node.

Pruning with Confidence, Support, and Improvement

We say a rule $r$ (which we represent using only its antecedent since the consequent is fixed) is derivable from a group $g$ if $h(g) \subseteq r$, and $r \subseteq h(g) \cup t(g)$. By definition, any rule that can be enumerated by a descendent of $g$ in the set-enumeration tree is also derivable from $g$. From [6] we have that:

- The function $f(x, y) = x/(x + y)$ provides an upper-bound on the confidence of any conjunctive rule derivable from a given group $g$, where $x$ and $y$ are non-negative integers such that $y \le \mathrm{sup}(h(g) \cup t(g) \to \neg C)$ and $x \ge \mathrm{sup}(h(g) \to C)$.
- The value of $x$ from above provides an upper-bound on support.
- If the confidence bound given by $f$ from above for some group $g$ is less than or equal to the confidence of any rule $r$ enumerated thus far such that $r$ is a subset of $h(g)$ and $r$ satisfies the input constraints, then the maximum improvement of any derivable rule is zero.
- If the following value is equal to zero, then the maximum improvement of any derivable rule is zero:

$$\beta = \min_{u \in h(g)} \mathrm{sup}\left((h(g) - \{u\}) \cup \{\neg u\} \to \neg C\right)$$

We refer the reader to [6] for details on how to compute the above values economically given that the data-set is large, and how to heuristically reorder the elements of $U$ in order to ensure these functions have plenty of pruning opportunities.
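As a sanity check on the confidence bound, the following Python sketch computes $f(x, y)$ for a group with the tightest admissible values of $x$ and $y$ and compares it against every derivable rule by brute force. The data layout (a `frozenset` of conditions paired with a boolean for whether the consequent holds), the toy `RECORDS`, and the helper names are assumptions for illustration; a real miner would obtain the counts from shared database passes rather than rescanning per rule.

```python
from itertools import combinations


def count(conds, records, positive):
    """Number of records containing all conds whose consequent flag equals positive."""
    return sum(1 for items, label in records if conds <= items and label == positive)


def confidence_bound(head, tail, records):
    """Upper bound x / (x + y) on the confidence of any rule derivable from the group."""
    x = count(head, records, True)           # positives covered by the head alone
    y = count(head | tail, records, False)   # negatives covered even by head plus full tail
    return x / (x + y) if x + y else 0.0     # assumption: report 0 when nothing is covered


def derivable_rules(head, tail):
    """All antecedents r with head <= r <= head | tail."""
    for k in range(len(tail) + 1):
        for extra in combinations(sorted(tail), k):
            yield head | frozenset(extra)


# Hypothetical toy data-set: (conditions, consequent holds?).
RECORDS = [
    (frozenset("abc"), True),
    (frozenset("ab"), True),
    (frozenset("ac"), False),
    (frozenset("abc"), False),
    (frozenset("a"), True),
    (frozenset("bc"), True),
]

if __name__ == "__main__":
    head, tail = frozenset("a"), frozenset("bc")
    print("bound:", confidence_bound(head, tail, RECORDS))
    for r in derivable_rules(head, tail):
        pos, neg = count(r, RECORDS, True), count(r, RECORDS, False)
        print(sorted(r), pos / (pos + neg))
```

The bound holds because enlarging an antecedent can only remove covered records, so no derivable rule covers more positives than the head or fewer negatives than the head with its whole tail.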
Inclusive Pruning

Webb's inclusive pruning strategy [23] moves a subset $T$ of the tail of a group $g$ into its head whenever the following fact can be established: if some solution is derivable from $g$, then at least one of the solutions derivable from $g$ is a superset of $T$. For example, in the case of mining optimized conjunctions according to the Laplace function (and many of the other metrics we have examined including our partial orders), suppose $\mathrm{sup}(h(g)) = \mathrm{sup}(h(g) \cup \{u\})$ for some element $u$ in $t(g)$. Ignoring the effects of rule constraints, if some rule $r$ derivable from $g$ is optimal, then so is the rule $r \cup \{u\}$, which is also derivable from $g$. If one or more such elements are found in the tail of some node, instead of expanding its children, these elements can all be moved into the head to form a new node that replaces it.

Some constraints may unfortunately prohibit straightforward application of this particular inclusive pruning strategy. For example, an item constraint may disallow $u$ from participating in a solution when combined with some other items from $h(g)$ and $t(g)$. Another problematic constraint is one which bounds the size of a rule's antecedent. Luckily, work-arounds are typically possible. For example, in the case where a bound $k$ is specified on the number of base conditions that may appear in an antecedent, the strategy can be applied safely whenever $|h(g) \cup t(g)| \le k$.
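The support-equality test behind this form of inclusive pruning is simple enough to sketch directly. The Python below is a hedged illustration for the unconstrained case only: records are assumed to be plain sets of conditions, and `antecedent_support` and `inclusive_prune` are hypothetical names rather than functions from [23] or from Dense-Miner.

```python
def antecedent_support(conds, records):
    """Number of records on which every condition in conds holds."""
    return sum(1 for items in records if conds <= items)


def inclusive_prune(head, tail, records):
    """Move into the head every tail element whose addition leaves the head's support unchanged.

    Returns the (head, tail) of the replacement node. Assumes no rule constraints apply.
    """
    base = antecedent_support(head, records)
    moved = frozenset(u for u in tail
                      if antecedent_support(head | {u}, records) == base)
    return head | moved, tail - moved


if __name__ == "__main__":
    # Hypothetical toy records: "b" holds on every record on which "a" holds.
    records = [frozenset("abc"), frozenset("ab"), frozenset("abd"), frozenset("cd")]
    print(inclusive_prune(frozenset("a"), frozenset("bcd"), records))
```

With a bound $k$ on antecedent size, one conservative variant would call `inclusive_prune` only when the head and tail together contain at most $k$ conditions, following the work-around described above.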