Multiple Algorithms for Fraud Detection
Richard Wheeler and Stuart Aitken
Artificial Intelligence Applications Institute, The University of Edinburgh, 80 South Bridge, Edinburgh EH1 1HN, Scotland.

Abstract

This paper describes an application of Case-Based Reasoning (CBR) to the problem of reducing the number of final-line fraud investigations in the credit approval process. The performance of a suite of algorithms applied in combination to determine a diagnosis from a set of retrieved cases is reported. An adaptive diagnosis algorithm combining several neighbourhood-based and probabilistic algorithms was found to have the best performance, and these results indicate that an adaptive solution can provide fraud filtering and case ordering functions for reducing the number of final-line fraud investigations necessary.

Keywords: fraud detection, case-based reasoning, adaptive algorithms
1 Introduction

Artificial intelligence techniques have been successfully applied to credit card fraud detection and credit scoring, and the field of AI as applied to the financial domain is both well developed and well documented. As an emerging methodology, case-based reasoning (CBR) is now making a significant contribution to the task of fraud detection. CBR systems are able to learn from sample patterns of credit card use to classify new cases, and this approach also holds the promise of adapting to new patterns of fraud as they emerge. At the forefront of research in this field is the application of adaptive and hybrid learning systems to problems which were previously considered too dynamic, chaotic, or complex to accurately model and predict.

As applied to the financial domain, CBR systems have a number of advantages over other AI techniques. They:

- provide meaningful confidence and system accuracy measures;
- require little or no direct expert knowledge acquisition;
- are easily updated and maintained;
- articulate the reasoning behind their decision making clearly;
- are flexible and robust to missing or noisy data;
- may take into account the cost-effectiveness ratio of investigating false positives and advise accordingly; and
- are easily integrated into varying database standards.

The addition of adaptive CBR components may allow the system to:

- optimise the accuracy of classification by dynamically adjusting and updating weighting structures;
- use multiple algorithms to enhance final diagnostic accuracy; and
- better differentiate between types of irregularities and develop a diagnostically significant sense of abnormality which aids in the first-time detection of new irregularity types.
In this paper we describe the background of a complex fraud-finding task, and then describe the development of an adaptive proof-of-concept CBR system which is able to achieve very encouraging results on large, noisy, real-world test sets. Specifically, this paper addresses the problem of making a diagnostic decision given a set of near-matching cases. Finally, the results of the investigation are summarised and considered in the light of other work in the field.

2 Background

At the request of one of the UK's most successful fraud detection software providers, AIAI undertook an investigation into methods of applying new AI technologies to increase the accuracy of the already highly advanced systems presently in use. While the firm's software presently reduces the number of necessary fraud investigations by several orders of magnitude, our investigation showed that utilising adaptive algorithms and fuzzy logic yields significant diagnostic improvement on the most difficult sub-section of cases.

The focus of our investigation was to reduce the number of applications referred for expert investigation after the existing detection systems had been applied. These well-proven systems take advantage of many person-years of expert knowledge elicitation and encoding, and are able to reduce the initial volume of applications by roughly 2500 times (400 referred from every million analysed). It was on a database consisting solely of these hardest cases that the CBR system attempted to make diagnostically significant decisions.

The source data comprised pairs of database records, the first of which was tagged as the application, the second as the evidence found by previous analysis suggesting fraud. Two files of data were provided: one set of nearly 2000 application-match pairs which were initially flagged as fraud but later cleared (the non-fraud set), and a second set of 175 application-match pairs which were judged to be fraudulent.
The application records consisted of applicant data (personal code, name of the lender, amount of loan, name and address of applicant, etc.) and employer data (type of business, time employed, etc.). The evidence provided followed a similar format, and the final test sets consisted of 584 cases of non-fraud and 96 cases of fraud. Each case consisted of an application and one or more evidence records, both of which contained applicant and employer data. Pre-processing was kept to a minimum, excluding only those fields which might be construed as false indicators, such as database tags generated by
the company's selection process. All other fields remained in their original state, and omissions formed a high percentage of the total information encoded.

In order to capture more general patterns in application-match pairs within a case, the type of match that existed between fields was introduced into the case description. A small number of terms were defined to describe these matches, and this information was added into the cases after parsing. These general descriptions of matches were given simple descriptive labels (exact-match, near-match, and dissimilar) and added as a third component to each application-match pair. As such, the additional information was intended to act as a general fuzzy classifier of match fitness. The conjecture was that there are patterns of values for match types that might be exploited by an adaptive system. Similarity measures were assessed for all field types (strings, dates, addresses, numerical values, etc.), and it was these final three-part sets, comprising application, evidence, and fuzzy match descriptor, that were presented to the proof-of-concept system for analysis. After all pre-processing, each case was described by 128 attributes.

3 Approach

Statistical investigations of the test sets suggested that the nature of the problem is inherently non-linear, noisy, contradictory, and not addressable using a simple similarity matrix and CBR decision system. This is unsurprising, as the test sets were composed of the most difficult and intractable sub-set of the credit approval data, and as such did not cluster into identifiable fraud/non-fraud regions. However, highly localised phenomena and patterns appeared fairly common, suggesting that a hybrid or adaptive system within a CBR methodological structure might be able to focus upon and effectively exploit these characteristics.

The proof-of-concept system design has two essential decision-making components familiar to all CBR frameworks: retrieval and diagnosis.
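The three-part case representation described in Section 2 can be sketched in code. This is a minimal illustration only: the field names, the use of string-similarity ratios, and the near-match threshold are assumptions for the sketch, not the measures used in the actual system.

```python
# Hypothetical sketch of annotating application/evidence field pairs
# with a coarse fuzzy match descriptor (exact-match, near-match,
# dissimilar), as described in Section 2. The 0.8 near-match threshold
# is an illustrative assumption.
from difflib import SequenceMatcher

def match_descriptor(app_value, evidence_value):
    """Classify the similarity between an application field and the
    corresponding evidence field into a coarse fuzzy label."""
    ratio = SequenceMatcher(None, app_value.lower(), evidence_value.lower()).ratio()
    if ratio == 1.0:
        return "exact-match"
    if ratio >= 0.8:  # assumed near-match threshold
        return "near-match"
    return "dissimilar"

def build_case(application, evidence):
    """Produce the three-part set: application, evidence, match descriptors."""
    descriptors = {field: match_descriptor(str(application[field]), str(evidence[field]))
                   for field in application if field in evidence}
    return {"application": application, "evidence": evidence, "match": descriptors}

case = build_case({"employer": "Acme Ltd", "postcode": "EH1 1HN"},
                  {"employer": "ACME Ltd.", "postcode": "G1 2AB"})
print(case["match"])  # -> {'employer': 'near-match', 'postcode': 'dissimilar'}
```

The descriptors, not the raw field values, then carry the "general fuzzy classifier of match fitness" role into the 128-attribute case description.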
Retrieval utilises a weight matrix and a nearest-neighbour algorithm, while diagnosis utilises a suite of algorithms which analyse the data flagged as significant by the retrieval mechanism. A learning mechanism was also implemented in the proof-of-concept system. In this section we present the investigation of weighting matrix approaches, the nearest-neighbour strategies employed, and the multi-algorithm final analysis, which is the focus of this work.
3.1 Weighting Matrix Approach

Case-based reasoning systems function by defining a set of features within a data or case base, and then generating a similarity score that represents the relationship between a previously seen case and the test case. Generally, this comparison is flat; that is, each field matching according to predefined operators (such as exact and fuzzy matching) adds one point to the total similarity score of the comparison. Of course, not all fields (or features) within a database are equally meaningful in divining a classification or sound decision, so a weighting matrix is often employed: a method by which a single feature's importance may be raised or lowered, giving certain features more diagnostic significance than others.

The first set of experiments assessed experimentally the effect on total diagnostic accuracy of raising and lowering individual field weights. This was performed with the AIAI CBR Shell System¹, which supports the automatic polling of fields for sensitivity to goal finding and the stochastic hill-climbing of ever-fitter combinations of field weights. Disappointingly, these investigations only demonstrated that any simple relationships between field values and fraud occurrence had already been exploited by the rule-based filtering applied to the data prior to our analysis. In consequence, a flat weighting structure was used in all subsequent testing.

3.2 Case Retrieval

Nearest-neighbour matching is common to many CBR systems. Again using the basic exploratory facilities of AIAI's CBR test bed, a set of cases considered to be very similar, i.e., above a certain percentage of similarity, was retrieved. These retrieved cases are designated as being appropriate to include in the final diagnostic analysis for fraud.
This approach to nearest-neighbour recall should be differentiated from the k-nearest-neighbour method, where a fixed number of cases is recalled for consideration in the diagnosis or adaptation of a solution. Whereas k-nearest-neighbour recall will always return a constant number of cases for consideration (by expanding the neighbourhood to capture the desired number of nearby cases), thresholded retrieval retrieves cases from a specified neighbourhood. The threshold can be modified dynamically; one situation in which we might change this parameter is when no cases are recalled.

¹ See: richardw/cbrshell.html
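The distinction between thresholded retrieval with a flat weighting structure and fixed-k recall can be sketched as follows. The similarity measure, field representation, and threshold value here are illustrative assumptions, not those of the proof-of-concept system.

```python
# Sketch: flat-weighted similarity plus the two recall strategies
# contrasted above. Cases are plain attribute dicts for illustration.

def similarity(case_a, case_b, weights=None):
    """Flat-weighted similarity: each matching field contributes equally
    unless a weight matrix raises or lowers its importance."""
    fields = set(case_a) & set(case_b)
    if not fields:
        return 0.0
    w = weights or {}
    score = sum(w.get(f, 1.0) * (case_a[f] == case_b[f]) for f in fields)
    total = sum(w.get(f, 1.0) for f in fields)
    return score / total

def threshold_retrieve(query, case_base, threshold=0.75):
    """Return every case above the similarity threshold; the number of
    cases recalled varies with local case-base density."""
    scored = [(similarity(query, c), c) for c in case_base]
    return [(s, c) for s, c in scored if s >= threshold]

def knn_retrieve(query, case_base, k=3):
    """Always return exactly k cases, expanding the neighbourhood
    as far as necessary to capture them."""
    scored = sorted(((similarity(query, c), c) for c in case_base),
                    key=lambda sc: sc[0], reverse=True)
    return scored[:k]
```

Note that `threshold_retrieve` implicitly reports neighbourhood density (how many cases it returns), which the density selection algorithm of Section 3.3 later exploits; `knn_retrieve` discards that information by construction.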
3.3 Diagnostic Algorithms

Applying the general principle of thresholded retrieval, a multi-algorithmic approach to final match analysis was developed as a result of the design and testing of a variety of single discrimination algorithms. It has been suggested [13] that no single algorithm may perform equally well on all search and classification tasks, and that an algorithm's improved performance in one learning situation may come at the expense of accuracy in another [9]. The idea of applying multiple algorithms within the CBR methodology is also gaining currency within the CBR community [1], and the techniques used in CBR continue to mature.

The algorithms described here further analyse the sub-set of significant cases retrieved by the matching system (with a flat weighting structure) and attempt to reach a final diagnosis of fraud or non-fraud. Each was designed and tested separately for performance before integration into the larger suite, and finally, resolution strategies were implemented to resolve conflicting diagnoses by the individual algorithms.

If all cases retrieved by a CBR system as being significant support a single correct analysis (here, fraud or non-fraud), then the algorithm required to perform the final analysis may be fairly simplistic, such as best match or simple percentage averaging (here referred to as probabilistic curve) methods. However, in highly dynamic, chaotic, and noisy environments it may be beneficial to apply or combine more complex algorithms to make a final decision that considers conflicting information and is more accurate than simple similarity or summative analyses.
These diagnostic operators may be combined dynamically within the final analysis system to increase the accuracy and soundness of decision making.

Diagnosis Resolution Strategies

If more than one algorithm is asked to diagnose the set of cases retrieved for an unknown credit request, it is possible that the algorithms may disagree on the result, and resolution strategies were implemented to resolve the varying diagnoses into a single result. The present proof-of-concept system's diagnostic algorithms are able to assess their own confidence (as a function of cases recalled, position in the case base, etc.), and these confidence measures play an important role in enhancing overall system accuracy. Three resolution strategies were tested, with varying results: sequential resolution, best guess, and combined confidence.

The sequential resolution strategy, the system's default, follows a simple sequential design: whichever algorithm finds sufficient evidence first (fires first)
makes the decision. The algorithms were ordered to analyse the final matches as follows: Density Selection, Negative Selection, Probabilistic Curve, Best Match, and Default. In this manner, if the first algorithm is unable to make a firm decision, the algorithm following it is given the task, finally falling to the default if all other algorithms fail to reach a definite conclusion.

The best guess resolution strategy simply chooses the algorithm that reports the highest confidence in its decision making, and disregards all others. While it is beyond the scope of this paper to detail the calculation of confidence measures for each algorithm (or the system as a whole), it may be noted that the factors which influence confidence include the number and percentage of cases recalled, the value of the threshold (size or extent of the neighbourhood), and overall system confidence derived from previous solutions and results.

The combined confidence resolution strategy is similar to best guess, differing in that it combines the differing algorithms' diagnoses according to their confidence levels into a combined recommendation. While the best guess strategy will report a final analysis based solely on the most confident algorithm, this approach will report a combined analysis of all reporting algorithms.

At present, the system provides the best results with a shifting or adapting measure of algorithmic confidence (raising or lowering individual algorithms' relative confidence according to their success rates in particular portions of the case base), and the sequential resolution strategy appeared to perform best and maintain its stability; that is, it did not significantly change its behaviour or accuracy across differing test sets. Whether this simple resolution strategy is inherently more accurate for this task, or is simply better designed and tuned for this particular application, is not known.
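The three resolution strategies can be sketched as follows. Each diagnostic algorithm is modelled here as returning a (diagnosis, confidence) pair; the confidence calculations themselves, and the 0.5 firing threshold, are illustrative assumptions rather than the paper's formulas.

```python
# Minimal sketch of the three diagnosis resolution strategies.
# `reports` is a list of (diagnosis, confidence) pairs, one per
# algorithm, already in the system's preferred firing order.

def sequential(reports, default="non-fraud", min_confidence=0.5):
    """Sequential resolution: the first algorithm confident enough
    to fire makes the decision; otherwise fall to the default."""
    for diagnosis, confidence in reports:
        if confidence >= min_confidence:
            return diagnosis
    return default

def best_guess(reports):
    """Best guess: take the single most confident algorithm and
    disregard all others."""
    diagnosis, _ = max(reports, key=lambda r: r[1])
    return diagnosis

def combined_confidence(reports):
    """Combined confidence: weight each algorithm's vote by its
    confidence and return the diagnosis with the largest total."""
    votes = {}
    for diagnosis, confidence in reports:
        votes[diagnosis] = votes.get(diagnosis, 0.0) + confidence
    return max(votes, key=votes.get)

reports = [("non-fraud", 0.4), ("fraud", 0.7), ("fraud", 0.3)]
print(sequential(reports))           # first to clear 0.5 fires: 'fraud'
print(best_guess(reports))           # most confident report: 'fraud'
print(combined_confidence(reports))  # fraud 1.0 vs non-fraud 0.4: 'fraud'
```

The three strategies diverge once the most confident single algorithm disagrees with the confidence-weighted majority, which is precisely the situation the stability comparison above concerns.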
It should also be noted that the CBR system was able to adapt the distance measure on the fly, as cases were analysed, to increase accuracy. While the system may utilise as many as nine separate algorithms in making a final decision about fraud risk, four of these deserve further description, as they form the final basis for high system accuracy.

Probabilistic Curve Algorithm

Built into the functionality of the test bed are several common CBR diagnostic algorithms that were tested on the data sets with predictable results. The probabilistic curve algorithm classifies a case as either fraudulent or clear by expressing the dominant decision within the cases recalled as a percentage of
the total. For example, if two cases are recalled, one supporting the classification of fraud with 60% similarity and the other suggesting a decision of non-fraud with 40% similarity, the probabilistic curve algorithm will recommend a decision of fraud with 60% confidence/accuracy. This analysis operates in the same way regardless of the number of cases recalled, as the final decision is decided by the Bayesian tallying of accuracy or similarity scores of like match designations (fraud/non-fraud), and it is highly sensitive to the scope of the neighbourhood retrieved. Another algorithm (not detailed here) achieved similar results with a one-case one-vote approach, which, rather than combining confidence scores, simply assigns each similar recalled case the value of one.

The probabilistic curve algorithm may become the default decision-making device when other algorithms in the suite have failed to exploit stronger relationships. It results in a 60/30 diagnostic split (60% recognition on non-fraud cases, 30% on fraud), but scores more highly on those cases which have not been exploited by other algorithms (70/35).

Best Match Algorithm

The best match algorithm, common to most CBR systems, classifies a case as either fraudulent or clear by selecting the result of the closest matching case. As expected, this algorithm is highly sensitive to case base density and population, and generally chooses the dominant overall type (non-fraud), suggesting that fraud cases are not tightly clustered near to each other. Prior statistical analysis had already suggested this relationship within the data.
The best match algorithm results in a 90/10 diagnostic split (90% recognition on non-fraud cases, 10% on fraud), but again, curiously, scores more highly on those cases which have not been exploited by other algorithms in the suite (60/35).

Negative Selection Algorithm

The negative selection algorithm relies upon the exploitation of highly regional relationships within the data that may not be valid at other points within the case base. Following the axiom that one bad apple may spoil the barrel, it functions by recalling all cases over the similarity threshold (all those within the neighbourhood); if a user-defined or machine-derived number (or percentage) of fraud cases is found, the application is automatically tagged as fraud for further investigation.

The algorithm assumes that fraud cases form difficult-to-define clusters, and that if a new case falls within a certain radius of a fraud case then that case is
more likely to be fraudulent as well, within a certain degree of probability for the case base considered. However, we made no assumption that fraud clusters taken together form a distinct region, only that fraud cases are more likely to occur near each other than the standard distribution within the larger case base would suggest. While anecdotal evidence in the domain abounds in favour of this algorithm's guiding principle, we believe that the distribution of fraud cases near each other in lightly populated regions or pockets of non-fraud space accounts for the algorithm's heightened accuracy in certain regions of the case base. The negative selection algorithm results in a 70/55 diagnostic split (70% recognition on non-fraud cases, 55% on fraud), and it may only reach a negative decision if sufficient evidence for fraud is not found, as default algorithms in the suite are given priority over final decision making.

Density Selection Algorithm

The density selection algorithm uses the natural tendency of the case recall method (threshold or fixed-neighbourhood nearest neighbour) to return implicit information about case density at a particular point in the case base, using this sub-index as the basis for the probability of one classification over another. The algorithm exploits the assumption that within certain portions of the case base (but perhaps not others), the number of cases (both fraud and non-fraud) recalled using a thresholding mechanism (one in which the recall neighbourhood is fixed to density and not to the number or percentage of cases recalled) may serve, in the absence of other indicators, as a significant index of fraud risk; in short, that outlying cases in certain portions of the case base are very likely to be fraudulent. While this algorithm is very sensitive to the population size and spread within the case base, it has proven fairly stable to scaling and re-population as the case base expands.
This algorithm results in a 70/40 diagnostic split (70% recognition on non-fraud cases, 40% on fraud), and takes precedence over other algorithms when its certainty (here calculated as a significant under- or over-abundance of non-fraud cases) is very high.

Default Goal

In the absence of strong confidence in one decision or another, the proof-of-concept system described here may also tag outlying or unusual cases arbitrarily as either fraud or non-fraud. The use of a simple default has been shown to slightly increase overall accuracy when used in concert with a properly trained system, and may be used with smaller recall neighbourhoods to cause the system to be additionally critical of suspected fraud cases.

4 Results

System performance is difficult to measure due to sensitivities to case and testing base sizes, and to the distributions of fraud and non-fraud cases within each. However, the principles that guide the learning process have proven to be robust to larger sample sizes.

[Figure 1: Comparison of Algorithms]

The results in Figure 1 show a comparison of the performance of each algorithm. These results represent an average performance sample for the proof-of-concept with a flat weighting structure and without using learning. Figure 1 shows that each of the four single algorithms has very different performance characteristics. (Note that the multi-algorithm system used negative selection, adaptive thresholding, density selection, probabilistic curve, best match, and fraud as the default goal.) Through experimentation, we noted that the best match algorithm has very high accuracy in regions of high case density and consistency, while the density selection algorithm seems to perform best in areas of very high or very low case population. The negative selection algorithm appears to excel in diagnosing borderline or grey-area cases.

It should be remembered that the problem of fraud detection is by nature a trade-off between investigating (or identifying) too many non-fraud cases and losing too many cases of genuine fraud. While ideal performance would be 100% recognition on both fraud and non-fraud cases, standard algorithms often achieve only a ratio of 60/30. When combined using even a simple diagnostic resolution strategy, the multi-algorithmic approach may be seen to outperform the sum of its parts (80% non-fraud, 52% fraud).
Due to the adaptive nature of the diagnostic algorithms, the proof-of-concept is very sensitive to small changes in the threshold (the case matching neighbourhood), the weights, and the search method employed. Larger case and testing bases, and the combination of learning with the multi-algorithm approach, should result in more stable and accurate performance.
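The four diagnostic algorithms of Section 3.3, combined under sequential resolution, can be sketched as follows. Each retrieved case is modelled as a (similarity, label) pair; the sparse/dense cut-offs and the minimum fraud count are illustrative assumptions, not the tuned values of the proof-of-concept system.

```python
# Sketch of the four diagnostic algorithms operating on the cases
# recalled by thresholded retrieval, combined with the sequential
# resolution order reported above: density selection, negative
# selection, probabilistic curve, best match, then the default goal.

def probabilistic_curve(recalled):
    """Tally similarity scores per label; the dominant label wins."""
    if not recalled:
        return None
    totals = {}
    for sim, label in recalled:
        totals[label] = totals.get(label, 0.0) + sim
    return max(totals, key=totals.get)

def best_match(recalled):
    """Return the label of the single closest case."""
    if not recalled:
        return None
    return max(recalled, key=lambda c: c[0])[1]

def negative_selection(recalled, min_fraud=2):
    """'One bad apple': enough fraud cases in the neighbourhood tag
    the application as fraud; otherwise stay silent."""
    fraud_count = sum(1 for _, label in recalled if label == "fraud")
    return "fraud" if fraud_count >= min_fraud else None

def density_selection(recalled, sparse=2, dense=20):
    """Outliers in sparse regions are treated as likely fraud; a clear
    over-abundance of neighbours suggests the dominant non-fraud type."""
    if len(recalled) <= sparse:
        return "fraud"
    if len(recalled) >= dense:
        return "non-fraud"
    return None  # not confident enough; defer to the next algorithm

def diagnose(recalled, default="fraud"):
    """Sequential resolution over the suite, falling to the default goal."""
    for algorithm in (density_selection, negative_selection,
                      probabilistic_curve, best_match):
        decision = algorithm(recalled)
        if decision is not None:
            return decision
    return default
```

Returning `None` is how an algorithm declines to fire here, which mirrors the paper's description of control passing down the ordered suite until a firm decision (or the default) is reached.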
5 Related Work

The use of case-based reasoning both as an inference engine and as the framework for hybrid fraud-detection systems is well documented (see [2] for an authoritative example). While the broader and more fundamental ideas and applications within the field of CBR are well treated in a number of recent works [5, 6, 10], the following overview of the application domain provides useful points of comparison of CBR performance in applications similar to the fraud detection task addressed here.

While only 15 cases of actual fraud might be found in 1,000,000 applications or transactions, these cases may account for up to 10% of the revenue lost by a bank or lending agency, and in some cases may seriously affect profitability. Early detection is plainly the key to curbing revenues lost to fraud, and the nature and scope of the financial domain (the volume of applications, the scope of purchases, and the prevalence of credit cards) dictates that expert review of more than a small minority of cases is impractical. The task of reducing the number of cases tagged for expert investigation is one that requires fine-grained analysis and pattern recognition on very large volumes of data.

Fraud detection tasks in the financial domain fall into two categories, credit card fraud and application fraud, although for the purposes of this investigation the similarities between the two lend themselves to a single approach to the topic. The decision to authorise or deny transactions made to a credit account by a centrally located controlling database system is one of the foundations of modern banking, and significant precedent exists for a set of rules, derived by experts, which are strong indicators of fraud. These include purchases outside the normal geographical region, large and unusual cash withdrawals, and other common indicators of malfeasance.
Most, if not all, modern credit accounts which contact a central branch for approval use some form of expert rules to prevent fraudulent card use. More recently, hybrid systems have begun to emerge that utilise more advanced AI techniques and methodologies to enhance fraud detection accuracy. While the most common advance has been the use of neural networks (NNs) [4, 7] to learn and predict purchase and usage patterns and to detect fraudulent use through training, a number of case-based reasoning systems and hybrids have been developed and deployed with excellent results. We provide here a brief description of several such systems that are typical of CBR development within this domain.

Hybrid approaches, or multiple-technique approaches within a common CBR framework, are becoming increasingly common. One study found that a CBR/NN
system which divides the task of fraud detection into two separate components was more effective than either technique on its own [8]. In this case, a neural net learns patterns of use and misuse of credit cards while a CBR looks for best matches in the case base. The case base contained information such as transaction dates and amounts, theft date, place used, type of transaction, and type of shop or company seeking approval. The combined CBR/NN system reported a classification accuracy of 89% on a case base of 1606 cases. The system was 92% accurate in its classification of those transactions to be denied authorisation (1479 cases), and 50% accurate in granting authorisation (127 cases) [8]. The total accuracy was comparable with that of human specialists, who are expected to achieve 90% accuracy.

Another recent application of CBR to the task of fraud detection was designed to grade loan applications according to credit-worthiness [11, 12]. A case base of 600 cases was constructed from the balance sheets of customers of a Swiss bank. The CBR was able to correctly classify 48.6% of the solvent companies as solvent and 93.6% of insolvent companies as insolvent². The CBR was then optimised for decision accuracy and also for decision cost. The cost associated with incorrectly classifying insolvent companies is based on the loss of credit volume (i.e., the money lent), while the cost of incorrectly classifying solvent companies (and hence not authorising the loan) is the loss of interest on the loan. The cost of misclassifying a bad customer was ten times that of misclassifying a good one. It was shown that the decision cost was reduced by the learning cycle which optimised for decision accuracy, and was then further reduced by learning which took cost into account. The value of this adaptive approach to CBR fine-tuning is becoming increasingly apparent.

Similarly, the application of CBR to car insurance premium estimation has been reported with similar results.
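The decision-cost idea in the loan-grading study above can be made concrete with a small sketch. The 10:1 cost ratio comes from the text; the error counts below are invented purely for illustration.

```python
# Sketch: why optimising for decision cost differs from optimising
# for raw accuracy. 'fn' counts bad (insolvent) customers accepted,
# 'fp' counts good (solvent) customers rejected; the 10:1 ratio is
# the one reported for the loan-grading study.

def decision_cost(confusion, cost_fn=10.0, cost_fp=1.0):
    """Total cost of a confusion dict with keys 'fn' and 'fp'."""
    return confusion["fn"] * cost_fn + confusion["fp"] * cost_fp

# Two hypothetical classifiers with the same number of errors (20),
# i.e. equal raw accuracy, but very different costs:
accuracy_tuned = {"fn": 8, "fp": 12}   # misses more bad customers
cost_tuned     = {"fn": 3, "fp": 17}   # rejects more good customers

print(decision_cost(accuracy_tuned))   # 8*10 + 12 = 92.0
print(decision_cost(cost_tuned))       # 3*10 + 17 = 47.0
```

With equal accuracy, the cost-tuned error profile is roughly half as expensive, which is the effect the cited learning cycle exploits.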
In one system, called RICAD, the case base was derived from the database of an Australian insurance company which contained 2 million entries [3]. This system calculated the risk cost of an application based on the average cost of claims made on similar policies. The risk cost therefore depends on the attributes that characterise cases and the weight given to each attribute. Each case has 30 attributes, including, for example, post-code and vehicle code. The attribute driver-age had the highest weight value, and weights and risk factors were derived from insurance experts and statistical analyses of the data. Qualitative attributes such as make of car and post-code were organised taxonomically or

² It is also reported that 36.4% of solvent companies and 27.4% of insolvent companies are incorrectly classified; see Wilke et al. [11, 12] for details.
spatially, and a measure of distance was defined on the spatial knowledge structure. The RICAD system was able to identify, by learning, the indexes which should be given high weights, and the system was capable of updating the weighting structure as new policies are issued and claims processed.

The classification accuracies reported in the credit authorisation applications are comparable with those reported here. A second common feature is the interrelation of decision accuracy and decision cost: classification systems operating in these domains have different accuracies for false positive and false negative decisions, and the costs of these errors can be very different. An important advantage of CBR techniques is the potential to use learning to improve decision making, taking costs into account.

6 Conclusions

Investigations into the financial data provided have shown that, though highly chaotic, the data have properties that allow multi-algorithmic and adaptive CBR techniques to be used for fraud classification and filtering. The data set could not be partitioned into fraud and non-fraud regions. Instead, the occurrence and distribution of fraud cases in the neighbourhood of an unknown application was observed to be diagnostically significant, and these relationships were effectively exploited by a multi-algorithmic proof-of-concept system. Neighbourhood-based and probabilistic algorithms have been shown to be appropriate techniques for classification, and may be further enhanced using additional diagnostic algorithms for decision making in borderline cases, and for calculating confidence and relative risk measures. While more accurate performance metrics and more thorough testing are required to appropriately quantify peak precision, the initial testing results of 80% non-fraud and 52% fraud recognition strongly suggest that a multi-algorithmic CBR will be capable of high accuracy rates.
A comparison with related work shows that CBR techniques can achieve similar performance in comparable problem areas. We believe that these results are very promising and supportive of a multi-algorithmic approach to classifying and assessing large, noisy data sets, and future work will focus upon testing the algorithms and resolution strategies on similarly complex data sets from other real-world domains.
Acknowledgements

Richard Wheeler would like to thank Jim Howe, Ann Macintosh, and everyone at AIAI for their support. Special thanks go to Peter Ross.

References

[1] Aha, D. See aha/ for more information.
[2] Curet, O. and Jackson, M. Tackling cognitive biases in the detection of top management fraud with the use of case-based reasoning. In Proceedings of Expert Systems 95, Macintosh, A. and Cooper, C. (eds.).
[3] Daengdej, J., Lukose, D., Tsui, E., Beinat, P., and Prophet, L. Dynamically creating indices for two million cases: A real world problem. In Advances in Case-Based Reasoning, Proceedings of the Third European Workshop EWCBR-96 (LNCS 1168), Smith, I. and Faltings, B. (eds.), Springer, 1996.
[4] Gately, E. Neural Networks for Financial Forecasting. John Wiley and Sons.
[5] Kolodner, J. Case-Based Reasoning. Morgan Kaufmann Publishers.
[6] Leake, D. Case-Based Reasoning: Experiences, Lessons, Future Directions. AAAI/MIT Press.
[7] Masters, T. Neural, Novel and Hybrid Algorithms for Time Series Prediction. John Wiley and Sons.
[8] Reategui, E.B. and Campbell, J. A classification system for credit card transactions. In Advances in Case-Based Reasoning, Proceedings of the Second European Workshop EWCBR-94 (LNCS 984), Haton, J.-P., Keane, M., and Manago, M. (eds.), Springer, 1994.
[9] Schaffer, C. A conservation law for generalized performance.
[10] Watson, I. Applying Case-Based Reasoning: Techniques for Enterprise Systems. Morgan Kaufmann, San Francisco.
[11] Wilke, W. and Bergmann, R. Considering decision cost during learning of feature weights. In Advances in Case-Based Reasoning, Proceedings of the Third European Workshop EWCBR-96 (LNCS 1168), Smith, I. and Faltings, B. (eds.), Springer, 1996.
[12] Wilke, W., Bergmann, R., and Althoff, K.-D. Fallbasiertes Schliessen in der Kreditwürdigkeitsprüfung [Case-based reasoning in credit-worthiness assessment]. KI-Themenheft: KI-Methoden in der Finanzwirtschaft, 4/96.
[13] Wolpert, D.H. On Overfitting Avoidance as Bias. Technical Report SFI-TR, Santa Fe Institute.
