Suspicious Transaction Detection for Anti-Money Laundering


Vol.8, No.2 (2014), pp.157-166
http://dx.doi.org/10.14257/ijsia.2014.8.2.16

Xingrong Luo
Vocational and Technical College in Enshi, Enshi, Hubei, China
es_lxr@126.com

Abstract

Money laundering in financial markets has become increasingly serious in recent years. Although anti-money laundering efforts started at an early stage, the solutions seem to be restricted to a strategic level. Besides, even though some researchers have pioneered the use of data mining techniques for anti-money laundering, the situation in China is still difficult. To this end, in this paper, after presenting a systematic view of the data mining framework for anti-money laundering, we propose a classification based algorithm to effectively detect suspicious transactions. Specifically, we consider the financial transactions as a data stream, and construct a classifier based on a set of mined frequent rules. Our experiments on a simulated transaction dataset based on real world banking activities demonstrate the efficiency of the proposed method.

Keywords: Classification, Data mining, Anti-money laundering, Suspicious transactions

1. Introduction

With the rapid development of Internet and database technologies, the amount of data that can be obtained has grown larger and larger [1, 2]. In order to understand such big data, data mining methodology is applied throughout various fields, such as marketing, customer relationship management and financial management. Specifically, as an emerging technology in the financial area, data mining has been applied to banker/customer relationship management, credit risk alerting and financial market analysis. One of the toughest problems is financial fraud and, worse still, money laundering. Money laundering is the conversion of criminal income into assets that cannot be traced back to the underlying crime, where the process of concealing the source of the money is referred to as laundering [3]. As money laundering activities grow increasingly rampant, financial growth and national security have been critically affected. Current anti-money laundering strategies expect laws and regulations to be established to prevent and suppress money laundering activities.
For example, possible measures taken by banks include validating customer identification before conducting banking business, checking suspicious foreign exchange cash transactions, tracking large cash flows, and blacklisting accounts suspected of money laundering. However, existing anti-money laundering methods rely on human intervention, and the application of modern data mining techniques still remains at an initial phase. Detecting suspicious financial transactions is an essential precondition and a key aspect of anti-money laundering [4]. Existing methods are based on the amount of transactions, and the identification process is largely restricted to the mechanism of reporting unusual banking activities [5]. Therefore, traditional anti-money laundering efforts have several limitations, such as narrow coverage of identification, a long cycle of clue discovery, and extensive delay [6]. In order to address the above challenges, we propose a data mining [7] system for anti-money laundering, and focus on suspicious transaction detection in this paper.

ISSN: 1738-9976 IJSIA Copyright © 2014 SERSC

The overall workflow of the system is shown in Figure 1. After the transactional data is stored in a data warehouse, some preprocessing work is performed for data cleaning and transformation. Then, the related data is selected for the data mining engine, where the data mining algorithm is applied. After that, the discovered knowledge is abstracted into a knowledge base, which will be further used for visualization, recommender systems and other business applications.

Figure 1. Framework of Data Mining System for Anti-money Laundering

We then focus on the dynamic detection mechanism for suspicious transactions. Typically, abnormal relationships between transactional accounts are deemed suspicious [5]. Given a stream of financial transaction data, an efficient method is to discover emerging patterns among the transactions that appear as interesting abnormalities relative to long-term or regular behaviors, known as suspicious transaction patterns [5]. For example, Liu et al. [8] proposed a method based on time series to detect suspicious transactions with the technique of scan statistics. Also, Keyan et al. [9] tried to improve the accuracy of detecting suspicious transactions by using a support vector network model optimized with cross validation and grid search. However, different from existing works, we regard the financial transactions as stream data, and employ a dynamic mining method to detect suspicious patterns on stream transactions. Specifically, we propose a classification algorithm based on multiple class association rules on data streams, constructing an FP-tree to improve the time and space efficiency, and then reducing the frequent rules by using the Hoeffding bound [10] over dynamic data streams.

To sum up, our main contributions in this paper are as follows. First, we study the problem of anti-money laundering and summarize an overall data mining framework as a solution, which can address the challenges of traditional manual anti-money laundering strategies. Second, we propose a classification based algorithm to dynamically detect suspicious transactions over financial transactional data streams. Last, our experiments on a simulated banking transaction dataset demonstrate the efficiency of our method.
The remainder of this paper is organized as follows. Section 2 provides some related work. The overall framework is discussed in Section 3. Our algorithm for detecting suspicious transactions is then proposed in Section 4. Empirical experiments are conducted in Section 5. Finally, the paper is concluded in Section 6.

2. Related Work

Research on anti-money laundering started at an early stage. Senator et al. [11] first formally proposed an artificial intelligence system named FinCEN (Financial Crimes Enforcement Network) to identify potential money laundering from reports of large cash transactions. The evaluation of suspicious transactions in FinCEN is implemented by a Bayes model. Petrus C. van Duyne et al. [12] pointed out that there are existing problems in the monitoring system for suspicious transactions and in anti-money laundering strategies. Kingdon et al. [13] developed a system to automatically identify unusual behavior in customers' transactions. However, none of the above efforts can be efficiently applied to Chinese financial markets, due to the specific characteristics of Chinese financial transactions [14]. Therefore, in this paper, we propose an algorithm designed for Chinese banking transactions.

3. Overall Framework

In this section, we present our overall framework for anti-money laundering using data mining techniques. As described in Figure 1, suppose we have a streaming financial transaction dataset. First, we perform preprocessing such as cleaning, reduction and sampling to select the relevant sub-dataset for our suspicious transaction detection problem. Then, we store that data in a data warehouse. Later, the data is fed to a data mining engine module for suspicious transaction detection. Note that incoming streaming data can be delivered directly to the data mining engine for real-time execution. After that, useful information is extracted and stored in a knowledge base for further applications, which might be used by decision makers for data visualization, recommender systems and other business applications in the anti-money laundering field. In this paper, we focus on the data mining engine module, and study the problem of detecting suspicious transaction patterns. First of all, we discuss data preprocessing, which serves as the preliminary stage of mining.

3.1. Data Preprocessing

Typically, money laundering activities involve multiple accounts during financial transactions. The data preprocessing includes the following aspects.

3.1.1. Attribute Filtering: Only a subset of the attributes is useful for anti-money laundering. For example, the name of an account is usually useless, and can be removed to reduce the scale of the dataset. Suppose the streaming financial transaction dataset can be represented as $DS = \{t_1, t_2, \ldots, t_n\}$. Each $t_i$, $i = 1, 2, \ldots, n$, is associated with a set of attributes $A$ and a target attribute $y$. Now we need to check whether an attribute $x \in A$ is relevant to $y$. An intuitive way is to sort the records by the values of $x$, and then examine the corresponding distribution of $y$. Suppose $x_i$ and $y_i$ denote the attribute $x$ and the target $y$ of the $i$-th record after sorting, so that $x_1 \le x_2 \le \cdots \le x_n$. Define a statistic over the sorted $y_i$ [formula not recoverable from the source] to describe the discreteness of the distribution [21]. If it exceeds some threshold, $x$ is considered to exhibit no relevance to $y$, and attribute $x$ can be removed.

3.1.2. Feature Extraction based on Domain Knowledge: According to domain knowledge, some features, such as capital flows and the frequency with which an account is accessed, are closely related to money laundering activities, and yet are not recorded in the original transaction dataset. However, we can calculate them using statistical methods. The relations can be formulated as a linear regression model. Denote the registered fund as $f$, and the total amount moved through the account within a certain period as $s$. Given account $i$, $f_i$ and $s_i$ denote the registered fund and the total amount of $i$ respectively. Suppose there exists a linear relationship between $f$ and $s$, i.e., $f = \alpha + \beta s + \xi$, where $\alpha$, $\beta$ are constants, and $\xi$ follows a normal distribution $N(0, \sigma^2)$. The parameters $\alpha$, $\beta$ can be inferred using the least squares method [15] as follows:

$$\hat{\beta} = \frac{\sum_{i=1}^{n} f_i s_i - \frac{1}{n}\sum_{i=1}^{n} s_i \sum_{i=1}^{n} f_i}{\sum_{i=1}^{n} s_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} s_i\right)^2}, \qquad \hat{\alpha} = \bar{f} - \hat{\beta}\,\bar{s}, \qquad (1)$$

where $\bar{f}$ and $\bar{s}$ are the sample means of $f_i$ and $s_i$.

3.1.3. Correlation Matrix between Trade Accounts: Calculate the correlation coefficients between trade accounts among different industries, and store them as a correlation matrix to be used as features. For example, Table 1 gives the correlation matrix after preprocessing.

Table 1. Correlation Matrix between Trade Accounts

Import/export trade   Steel   Mechanics   Food   Chemistry
Steel                 0.5     0.9         0.1    0.1
Mechanics             0.7     0.5         0.6    0.7
Food                  0.2     0.2         0.7    0.1
Chemistry             0.4     0.3         0.3    0.5

In the next section, we focus on the data mining engine module, and especially on the suspicious transaction detection function.

4. Proposed Algorithm

In this section, we propose a classification based algorithm to dynamically detect suspicious patterns for anti-money laundering. The objective is to find the frequently associated accounts, which are deemed suspicious.

4.1. Problem Formulation

Given a streaming financial transaction dataset $DS = \{t_1, t_2, \ldots, t_n\}$, each $t_i$, $i = 1, 2, \ldots, n$, is associated with a set of attributes $A$. Define a pattern $p_i$ as a pair $(A_i, v_i)$, where $A_i$ is the $i$-th attribute and $v_i$ is the corresponding value. Suppose we have a set of patterns $P = \{p_1, p_2, \ldots, p_l\}$ and a tuple $t$. If, for each $p_i = (A_i, v_i) \in P$, the value of attribute $A_i$ in $t$ is $v_i$, we say that $t$ matches $P$. Denote by $P.count$ the number of objects matching $P$, and by $P.sup = P.count / n$ the support of $P$ in $DS$.
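The matching and support definitions of Section 4.1 can be illustrated with a minimal Python sketch. It assumes a transaction is a plain dictionary from attribute name to value and that support is computed over a finite window of the stream; the helper names `matches` and `support` and the sample values are hypothetical, and only the attribute names `Category` and `Branch_ID_0` are taken from the dataset fields described in Section 5.

```python
from typing import Dict, List, Tuple

Transaction = Dict[str, str]   # attribute name -> value (assumed representation)
Pattern = Tuple[str, str]      # one (attribute, value) pair


def matches(t: Transaction, pattern_set: List[Pattern]) -> bool:
    """True if t carries every (attribute, value) pair in the pattern set."""
    return all(t.get(attr) == value for attr, value in pattern_set)


def support(ds: List[Transaction], pattern_set: List[Pattern]) -> float:
    """Support of the pattern set: matching records divided by window size."""
    if not ds:
        return 0.0
    count = sum(1 for t in ds if matches(t, pattern_set))
    return count / len(ds)


# Hypothetical window of four transactions.
window = [
    {"Category": "transfer", "Branch_ID_0": "B1"},
    {"Category": "transfer", "Branch_ID_0": "B2"},
    {"Category": "deposit", "Branch_ID_0": "B1"},
    {"Category": "transfer", "Branch_ID_0": "B1"},
]
P = [("Category", "transfer"), ("Branch_ID_0", "B1")]
print(support(window, P))  # two of the four records match both pairs
```

Scanning the whole window for every pattern set is only meant to make the definition concrete; it is not how an FP-tree based miner would count support.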

Let $c$ denote a class label. Define a class association rule $R: P \rightarrow c$, where $R.count$ is the number of objects in $DS$ that match pattern $P$ and carry class label $c$. Then, $R.sup = R.count / n$ is the support of rule $R$ in $DS$, and $R.conf = R.count / P.count$ is the confidence of rule $R$ in $DS$. If $R.sup \ge min\_sup$, where $min\_sup$ is the minimum support, the corresponding pattern $P$ is called a frequent pattern or itemset, and $R$ is called a frequent association rule. Furthermore, if $R.conf \ge min\_conf$, where $min\_conf$ is the minimum confidence, $R$ is called a precise rule. For convenience of description, we term a rule $R$ whose pattern $P$ has length $k$ a k-rule. Besides, if $R$ is frequent, $R$ is termed a k-freqrule. Now the objective is to find all the frequent and precise rules, and construct a classifier from this set of rules.

4.2. Algorithm Description

Generally, the proposed algorithm is composed of two stages. First, discover all the frequent and precise rules from the training data. Second, based on the rules found in the former stage, build a classifier to learn the class labels and thereby distinguish the suspicious transaction patterns from the normal ones.

4.2.1. Discovering Frequent and Precise Rules: Before constructing a classifier upon association rules [22], we first need to discover a proper set of rules. Our method is based on the FP-tree algorithm [16].

Theorem 1. Given a data stream $DS$ and the minimum support $min\_sup$, suppose $p$ is a frequent pattern. For any class label $c$, if the rule $R: p \rightarrow c$ is not a frequent rule, then $p$ should not be included in any frequent rule.

Based on Theorem 1, we can prune the non-frequent items during the generation of frequent patterns when building the FP-tree. Streaming data is typically extremely large and cannot be kept in local storage. Therefore, we split the data stream into segments of a certain length, called time windows. Then, the rule mining algorithm is applied to the data within a specific time window. After that, the processed data can be abandoned or transferred elsewhere to release memory. However, streaming data can only be scanned once, which means we should preserve not only the frequent rules but all rules.
But storing all the rules encountered leads to a huge overhead. Therefore, we introduce the Hoeffding bound [17] to estimate the support of rules. Given a random variable $r$ with range $R$, suppose we have $n$ independent observations and compute their mean $\bar{r}$. The Hoeffding bound indicates that, with probability $1 - \delta$, the expected value of $r$ is at least $\bar{r} - \epsilon$, where:

$$\epsilon = \sqrt{\frac{R^2 \ln(1/\delta)}{2n}}. \qquad (2)$$

We can observe that the Hoeffding bound is independent of the distribution of the samples. Suppose we need to estimate the support of some rule, so that the range is $R = 1$. Given the error factor $\epsilon$ and the probability of error $\delta$, the length $n$ of the data stream segment needed to estimate the support within $\epsilon$ with probability $1 - \delta$ is calculated as follows:

$$n = \frac{R^2 \ln(1/\delta)}{2\epsilon^2} = \frac{\ln(1/\delta)}{2\epsilon^2}. \qquad (3)$$

With a time window of length $n$, only the rules whose support is larger than $min\_sup - \epsilon$ need to be preserved, and the frequent rules with support larger than $min\_sup$ can be obtained with probability $1 - \delta$. Accordingly, we have the following definition.

Definition 1. Given a rule $R: p \rightarrow c$ within time window $T$, let $R.sup$ be its support, $min\_sup$ the minimum support, and $\epsilon$ the maximum error of support. If $R.sup \ge min\_sup$, then $R$ is called a frequent rule; if $min\_sup - \epsilon \le R.sup < min\_sup$, then $R$ is a potential frequent rule; if $R.sup < min\_sup - \epsilon$, then $R$ is a non-frequent rule.

From Equation (3), we can observe that the length of the time window $n$ is negatively related to $\epsilon$. When $\epsilon$ is small, $n$ will be very large. For example, if $\epsilon$ is 0.001 and $\delta$ is 0.02, Equation (3) gives an $n$ of about 1,956 thousand. That means a lot of memory is needed to consume such a big dataset, and the efficiency of the algorithm will be affected. Therefore, $\epsilon$ should be initialized to a large value and then decreased as the data stream arrives. In our settings, we initialize $\epsilon = 0.95$. Later, after processing each time window, $\epsilon$ is updated according to Equation (2).

The workflow of generating frequent rules is described in Algorithm 1. Note that the process of creating the FP-tree is similar to [18], except that we further add the pruning strategy defined by Theorem 1. Then we merge the rules mined from the FP-tree, P, with our frequent rule set FR, as in lines 7-16 of Algorithm 1.

Figure 2. Algorithm of Generating Frequent Rules
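The relation between the error factor, the error probability and the window length can be checked numerically. The sketch below is a hedged illustration rather than the paper's Algorithm 1: it assumes the standard form of the Hoeffding bound, $\epsilon = \sqrt{R^2 \ln(1/\delta) / (2n)}$, and an assumed reading of Definition 1 in which the two thresholds are $min\_sup$ and $min\_sup - \epsilon$. The function names are hypothetical.

```python
import math


def hoeffding_epsilon(n: int, delta: float, value_range: float = 1.0) -> float:
    """Error bound after n observations, holding with probability 1 - delta."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def window_length(epsilon: float, delta: float, value_range: float = 1.0) -> int:
    """Smallest window length whose error bound does not exceed epsilon."""
    return math.ceil(value_range ** 2 * math.log(1.0 / delta) / (2.0 * epsilon ** 2))


def classify_rule(rule_sup: float, min_sup: float, epsilon: float) -> str:
    """Three-way split of rules by observed support (assumed thresholds)."""
    if rule_sup >= min_sup:
        return "frequent"
    if rule_sup >= min_sup - epsilon:
        return "potential"
    return "non-frequent"


n = window_length(epsilon=0.001, delta=0.02)
print(n)                                  # roughly 1.96 million records
print(hoeffding_epsilon(n, delta=0.02))   # close to 0.001
print(classify_rule(0.009, min_sup=0.01, epsilon=0.002))
```

Because the required window length grows with $1/\epsilon^2$, halving the error factor quadruples the number of records that must be held, which is why starting from a loose error factor and tightening it as data arrives keeps early windows small.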

All frequent rules can be mined by Algorithm 1. According to the Hoeffding bound [17], for any rule $R$ with observed support $R.sup$, the observed support is no less than the true support minus $\epsilon$ with probability $1 - \delta$. Since $\epsilon$ decreases as $n$ grows, this estimate tightens as more data arrives. That means that, with probability $1 - \delta$, the observed support of every truly frequent rule satisfies $R.sup \ge min\_sup - \epsilon$. The rules generated by Algorithm 1 have a minimum support of $min\_sup - \epsilon$. Therefore, all such rules are included in the results of Algorithm 1.

4.2.2. Constructing the Classifier: Now that we have mined all frequent rules, the next step is to construct a classifier for precise classification. Because of the dynamic variation of streaming data, we do not prune all unsatisfactory rules. The intuition is that even if a rule is not frequent currently, it might become frequent later when more data arrives. Given a new transaction, the objective is to assign a class label that determines whether the transaction is suspicious or not. If all the rules matching the new transaction have the same class label, then the transaction is assigned to that class directly. Otherwise, we get a group of class labels from the different rules that apply to the new transaction, and we must consider how to combine their effects. Among the matching rules $R: p \rightarrow c$, the rule with the highest $\chi^2$ value is selected, where the upper bound of $\chi^2$ is computed as follows [19]:

$$\max\chi^2 = \left(\min\{sup(p), sup(c)\} - \frac{sup(p)\,sup(c)}{n}\right)^2 n\, e, \qquad (4)$$

where $sup(p)$ and $sup(c)$ denote the number of objects with pattern $p$ and label $c$ respectively, $n$ is the number of objects, and $e$ is the normalizing term defined in [19].

5. Experiments

In this section, we conduct some experiments to evaluate the efficiency of our algorithm.

5.1. Settings and Data

The environment of our experiments is as follows. The PC has an Intel E500 CPU with 1G memory and a 50G hard disk, and the operating system is Windows 7. We evaluate whether our method can be used to detect suspicious transactions within a stream of financial transactions. Therefore, we simulate a large stream of financial transactions based on real world banking records, similar to an existing study [20]. We have generated 100 million transaction records, and the fields of the dataset are shown in Table 2.
Note that the two sides of a transaction denote the inflow and outflow of the financial stream, which can be represented as a directed edge between two nodes. Therefore, we denote the transactions between two specific accounts as $(n_1, n_2)$, where $n_1$ is the number of transactions from Account_0 to Account_1, and $n_2$ is the number of transactions from Account_1 to Account_0.

Table 2. Fields of Banking Transaction Dataset

Field name          Description
ID                  The identifier of a single transaction
Date                The date on which the transaction was committed
Branch_ID_0         Branch of the bank of the first account
Lower_branch_ID_0   Lower-level branch of the first account
Account_0           The first account of the transaction
Branch_ID_1         Branch of the bank of the second account
Lower_branch_ID_1   Lower-level branch of the second account
Account_1           The second account of the transaction
Amount              The amount of the transaction
Category            The sort of transaction, i.e., withdraw, deposit, and transfer

The parameters used in our experiment are set as follows. We set min_sup to 1%, min_conf to 50%, the database coverage threshold to 4, the error factor for confidence to 0%, and the error factor for support to 0.01.

5.2. Experimental Results

Out of all 100 million transactions, we detect 317 accounts labeled as suspicious based on the rules mined by our algorithm. Furthermore, each suspicious account is associated with others through suspicious transactions, each of which ties two different accounts together. Table 3 gives partial results of our experiment.

Table 3. Partial Results of Suspicious Accounts and Transactions

Accounts   Associated accounts            Suspicious transactions
116054     (19161, 564604, 84731, ...)    (0/8, 0/1, 11/0, ...)
84731      (116054, 165376, 40058, ...)   (0/11, 0/5, 0/7, ...)
19161      (116054, 165376, ...)          (8/0, 5/0, ...)
564604     (116054, 714340, ...)          (1/0, 0/2, ...)
40058      (84731, ...)                   (7/0, ...)
165376     (19161, 84731, ...)            (0/5, 5/0, ...)
...        ...                            ...

For example, as shown in Table 3, account 116054 is suspicious and associated with 19161, 564604 and 84731. Besides, account 116054 is probably an export account and 19161 is probably an import account, given that the transactions involving 116054 flow out of 116054, while the transactions involving 19161 mostly have an incoming cash flow. From the above observations, we can see that our results are consistent with real world facts. Therefore, it is reasonable to claim that using the rules generated by our algorithm to determine class labels, and thereby assert the suspiciousness of a transaction, is feasible.

Next, we evaluate the efficiency of our algorithm as well. Figure 3 gives the accuracy of our experiment. Note that the transactions are labeled manually using domain knowledge. From the figure, we can see that as the number of transactions grows, our algorithm performs better. That means our method scales to large datasets.
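The directional counts reported for each pair of accounts can be tallied in a single pass over the stream. The following is a minimal sketch under stated assumptions: each transfer is reduced to a (source account, destination account) tuple, the two accounts of a pair are ordered alphabetically so that each pair has one key, and the function name `pair_counts` and the account identifiers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def pair_counts(transfers: Iterable[Tuple[str, str]]) -> Dict[Tuple[str, str], List[int]]:
    """Map each account pair (a, b), with a < b, to [n_1, n_2].

    n_1 counts transactions from a to b; n_2 counts transactions from b to a.
    """
    counts: Dict[Tuple[str, str], List[int]] = defaultdict(lambda: [0, 0])
    for src, dst in transfers:
        if src == dst:
            continue  # a self-transfer does not connect two different accounts
        a, b = sorted((src, dst))
        counts[(a, b)][0 if src == a else 1] += 1
    return dict(counts)


# Hypothetical stream: A pays B twice, B pays A once, C pays A once.
stream = [("A", "B"), ("A", "B"), ("B", "A"), ("C", "A")]
result = pair_counts(stream)
print(result[("A", "B")])  # [2, 1]
print(result[("A", "C")])  # [0, 1]
```

A strongly one-sided pair such as [0, 8] is the kind of directed relationship that the import and export interpretation in the discussion relies on.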

Figure 3. Precision of Proposed Algorithm on Various Sizes of Dataset

6. Conclusion

In this paper, we propose a data mining system for detecting money laundering activities, and focus on the discovery of suspicious transactions in a stream of financial transactions. Specifically, the proposed algorithm employs a classification method based on a set of frequent rules. Our experiments on a simulated transaction dataset based on real world banking activities show both the feasibility and the efficiency of our method. Future work might include: (1) further improvement of the algorithm to support various data sources; and (2) adding more functions to our data mining system for anti-money laundering besides suspicious transaction detection.

Acknowledgements

The authors would like to recognize the others who helped them and thank all our reviewers.

References

[1] S. LaValle, Big data, analytics and the path from insights to value, MIT Sloan Management Review, vol. 52, no. 2, (2011), pp. 21-31.
[2] P. Zikopoulos and C. Eaton, Understanding big data: Analytics for enterprise class hadoop and streaming data, McGraw-Hill Osborne Media, (2011).
[3] P. Reuter and E. M. Truman, Chasing dirty money: The fight against money laundering, Peterson Institute, (2004).
[4] R. Barone and D. Masciandaro, Worldwide anti-money laundering regulation: estimating the costs and benefits, Global Business and Economics Review, vol. 10, no. 3, (2008), pp. 243-264.
[5] R. Menon and S. Kumar, Understanding the role of technology in anti-money laundering compliance, Infosys Technology Ltd, (2005).
[6] B. Zagaris, Problems applying traditional anti-money laundering procedures to non-financial transactions, parallel banking systems and Islamic financial systems, Journal of Money Laundering Control, vol. 10, no. 2, (2007), pp. 157-169.
[7] J. Han, M. Kamber and J. Pei, Data mining: concepts and techniques, Morgan Kaufmann, (2006).
[8] X. Liu and P. Zhang, A scan statistics based Suspicious transactions detection model for Anti-Money Laundering (AML) in financial institutions, Multimedia Communications (Mediacom), 2010 International Conference on. IEEE, (2010).
[9] L. Keyan and Y. Tingting, An Improved Support-Vector Network Model for Anti-Money Laundering, Management of e-commerce and e-government (ICMeCG), 2011 Fifth International Conference on. IEEE, (2011).
[10] O. Bousquet, S. Boucheron and G. Lugosi, Introduction to statistical learning theory, Advanced Lectures on Machine Learning, Springer Berlin Heidelberg, (2004), pp. 169-207.
[11] T. E. Senator, The FinCEN Artificial Intelligence System: Identifying Potential Money Laundering from Reports of Large Cash Transactions, IAAI, (1995).
[12] P. C. van Duyne and H. de Miranda, The emperor's cloths of disclosure: Hot money and suspect disclosures, Crime, Law and Social Change, vol. 31, no. 3, (1999), pp. 245-271.
[13] J. Kingdon, AI fights money laundering, Intelligent Systems, IEEE, vol. 19, no. 3, (2004), pp. 87-89.
[14] J. Z. Xiao, H. Yang and C. W. Chow, The determinants and characteristics of voluntary internet-based disclosures by listed Chinese companies, Journal of accounting and public policy, vol. 23, no. 3, (2004), pp. 191-225.
[15] N. A. Weiss and C. A. Weiss, Introductory statistics, Pearson Education, (2012).
[16] J. Han, J. Pei and Y. Yin, Mining frequent patterns without candidate generation, ACM SIGMOD Record, ACM, vol. 29, no. 2, (2000).
[17] M. Medhat Gaber, A. Zaslavsky and S. Krishnaswamy, Mining data streams: a review, ACM Sigmod Record, vol. 34, no. 2, (2005), pp. 18-26.
[18] J. Han, Mining frequent patterns without candidate generation: A frequent-pattern tree approach, Data mining and knowledge discovery, vol. 8, no. 1, (2004), pp. 53-87.
[19] W. Li, J. Han and J. Pei, CMAR: Accurate and efficient classification based on multiple class-association rules, Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on. IEEE, (2001).
[20] J. Tang and J. Yin, Developing an intelligent data discriminating system of anti-money laundering based on SVM, Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference, IEEE, vol. 6, (2005).
[21] G. H. John, R. Kohavi and K. Pfleger, Irrelevant Features and the Subset Selection Problem, ICML, vol. 94, (1994).
[22] R. Agrawal, T. Imieliński and A. Swami, Mining association rules between sets of items in large databases, ACM SIGMOD Record, ACM, vol. 22, no. 2, (1993).

Author

Xingrong Luo. Male, he was born in Enshi city, Hubei Province. He now serves at the Vocational and Technical College in Enshi as an associate professor, and his research interests are computer software development and database technologies.