Fraud Detection with MATLAB Ian McKenna, Ph.D.
Transcription
1 Fraud Detection with MATLAB
Ian McKenna, Ph.D., The MathWorks, Inc.
2 Agenda
- Introduction: Background on Fraud Detection
- Challenges: Knowing your Risk
- Overview of the MATLAB Solution
- Connect to financial data sources
- Calculate fraud indicators
- Classify funds with machine learning
- Generate reports & deploy applications
- Questions & Answers
3 Fraud Detection
Detecting when people intentionally act secretly to deprive another of something of value.
- Types
- Returns Forensics
- Linguistic-Based Cues
4 Types of Fraud
- Corporate
- Financial statement falsification
- Securities and commodities
- Hedge fund returns manipulation
- Stock markets manipulation, regulation compliance
- Healthcare
- Mortgage
- Identity theft (credit card)
- Insurance
- Mass marketing
- Asset forfeiture/money laundering
5 Hedge Fund Returns Manipulation
- More prone to fraud due to decreased regulation
- SEC statistics indicate that 1% misbehave
- Scenarios
  - Misbehavior: hedge fund managers have some discretion in valuing illiquid investments. Academics have devised methods to analyze and flag potentially manipulated fund returns.
  - Outright fraud: quantitative screening and the use of dedicated algorithms can save a lot of time.
6 Return-Based Analysis
- The number of negative monthly returns is used to judge a manager's performance
- Managers can attract investors by misreporting returns
- Distortion is possible for returns at the manager's discretion: illiquid assets, complex assets
- E.g. a discontinuity exists at zero but disappears if returns are computed bimonthly
Source: Bollen, Nicolas P.B. and Veronika K. Pool (2012). Suspicious Patterns in Hedge Fund Returns and the Risk of Fraud. Review of Financial Studies 25.
7 Returns Distribution Discontinuity
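The discontinuity check can be sketched in a few lines. The Python below is a minimal illustration only: it is not the presentation's MATLAB code and not the exact test statistic from Bollen and Pool. It compares the histogram bin just below zero with the average of its two neighbouring bins. The function name `zero_discontinuity` and the default bin width of 0.5% are assumptions made for this example.

```python
def zero_discontinuity(returns, width=0.005):
    """Compare the histogram bin just below zero with its two neighbours.

    Hypothetical helper for illustration; `width` is an assumed bin width.
    Returns (observed, expected, shortfall) for the bin [-width, 0).
    """
    def count(lo, hi):
        # Number of returns falling in the half-open interval [lo, hi)
        return sum(1 for r in returns if lo <= r < hi)

    left_neighbour = count(-2 * width, -width)   # bin further below zero
    below_zero = count(-width, 0.0)              # small losses
    above_zero = count(0.0, width)               # small gains

    # A smooth distribution should put roughly the neighbours' average here
    expected = (left_neighbour + above_zero) / 2
    return below_zero, expected, below_zero - expected


monthly = [-0.008, -0.007, -0.002, 0.001, 0.002, 0.003, 0.004]
observed, expected, shortfall = zero_discontinuity(monthly)
print(observed, expected, shortfall)
```

A strongly negative shortfall means fewer small losses than the neighbouring bins suggest, which is the pattern the slide describes as a discontinuity at zero.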
8 Benford's Law
Frequency distribution of digits in many real-life sources of data:
- Electricity bills
- Street addresses
- Stock prices
- Population numbers
- Death rates
- Physical and mathematical constants
- Processes described by power laws
9 Stock Market Returns First Digit Frequency
Source: Checking Financial Markets via Benford's Law, Marco Corazza, Andrea Ellero, and Alberto Zorzi
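For reference, Benford's Law is usually stated as the following probability for a leading digit d. The formula is standard, but it does not appear in the slide text itself.

```latex
P(d) = \log_{10}\!\left(1 + \frac{1}{d}\right), \qquad d = 1, 2, \dots, 9
```

This gives roughly 30.1% for a leading 1, 17.6% for a leading 2, and 4.6% for a leading 9.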
11 Challenges in Fraud Detection
- Cost/Economics
  - Most cases not fraud
  - Manual analysis
- Data
  - Huge data sets
  - Complex data types
  - Data integration
- Change
  - Evolutionary
  - Secrecy in detection methods
12 Challenges Faced During Model Development
Traditional Approach
- Off-the-shelf software
- In-house development with traditional languages
- Spreadsheets, Excel
- Combination of the above
Challenge
- Inability to work with custom and complex data
- Adapting requires long development times
- Limited data size
- Inefficiencies in integration & automation
13 Computational Finance Workflow
- Access: Files, Databases, Datafeeds
- Research and Quantify: Data Analysis & Visualization, Financial Modeling, Application Development
- Share: Reporting, Applications, Production
- Automate
14 The Desired Report
Three funds to analyze and report:
- Gateway Fund
- American Funds Growth Fund
- Fairfield Sentry (known fraudulent Madoff fund)
16 Implemented Methods: Returns Based
- Returns distribution and discontinuity at 0: check for a discontinuity at 0 in the distribution of monthly returns.
- Low correlation with other assets: regress fund returns on a combination of style factors that maximizes the explanatory power of the analysis.
- Unconditional serial correlation: check whether monthly returns are serially correlated, i.e. correlated with the previous month's value. Managers investing in illiquid securities, with no end-of-month quoted price, may smooth their returns relative to all available market information.
- Conditional serial correlation: using the optimal factor model constructed under "Low correlation with other assets", check for serial correlation occurring especially after a down month (i.e. when a suspicious manager has the highest incentive to "catch up").
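The two serial-correlation checks can be sketched as follows. This is a simplified Python illustration, not the presentation's MATLAB implementation: it works on raw returns, whereas the slide's conditional check builds on the fitted style-factor model. The function names are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nan if undefined)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def lag1_autocorr(returns):
    """Unconditional check: correlate each month with the previous month."""
    return pearson(returns[:-1], returns[1:])


def conditional_lag1_autocorr(returns):
    """Assumed simplification: split (previous, current) pairs by the sign
    of the previous month and compute the correlation within each group."""
    pairs = list(zip(returns[:-1], returns[1:]))
    result = {}
    for label, keep in (("after_down", lambda p: p < 0),
                        ("after_up", lambda p: p >= 0)):
        group = [(p, c) for p, c in pairs if keep(p)]
        if len(group) < 2:
            result[label] = float("nan")  # too few pairs to correlate
        else:
            result[label] = pearson([p for p, _ in group],
                                    [c for _, c in group])
    return result


monthly = [-1.0, -2.0, -3.0, 1.0, 2.0, 3.0]
print(lag1_autocorr(monthly))
print(conditional_lag1_autocorr(monthly))
```

A markedly higher correlation after down months than after up months would be the pattern the conditional check looks for.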
17 Implemented Methods: Returns Based
- Number of returns equal to 0: calculate the theoretical number of returns equal to 0, using the cumulative distribution function and binomial coefficients, for a time series with the same characteristics (average return and variance) as the fund. Then compare that number with the actual count.
- Number of negative returns: calculate the theoretical number of negative returns as above. Then compare that number with the actual count.
- Number of unique returns / length of identical recurring series: calculate the theoretical value of each pattern. Unique returns is the number of distinct values in the time series, and length of identical series is the number of consecutive observations that are identical. Then compare these theoretical values with the actual counts.
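A minimal Python sketch of the expected-count comparison follows. It assumes the benchmark time series is normally distributed with the fund's sample mean and standard deviation; the slide does not name the distribution. It also assumes a return counts as "zero" when it rounds to zero at the reporting precision. All function names and the default precision are hypothetical.

```python
import math
from statistics import mean, stdev


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal(mu, sigma) variable."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed from binomial coefficients."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i)
               for i in range(k + 1))


def negative_return_check(returns):
    """Compare the observed number of negative months with the expected one."""
    n = len(returns)
    mu, sigma = mean(returns), stdev(returns)  # needs non-constant returns
    p_negative = normal_cdf(0.0, mu, sigma)    # chance that one month is < 0
    observed = sum(1 for r in returns if r < 0)
    return {
        "expected": n * p_negative,
        "observed": observed,
        # Small values mean suspiciously few negative months
        "prob_at_most_observed": binom_cdf(observed, n, p_negative),
    }


def zero_return_probability(mu, sigma, precision=0.0001):
    """Chance that one return rounds to zero at the assumed precision."""
    half = precision / 2.0
    return normal_cdf(half, mu, sigma) - normal_cdf(-half, mu, sigma)


print(negative_return_check([0.01, 0.02, -0.01, 0.03]))
```

The same binomial comparison applies to zero returns by replacing `p_negative` with the value from `zero_return_probability`.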
18 Implemented Methods: Returns Based
- Sample distribution of the last digit: check with a goodness-of-fit test whether the last digit of the returns is uniformly distributed.
- Sample distribution of the first digit: check with a goodness-of-fit test whether the first digit of the returns follows Benford's Law.
- Supervised classification methods: using machine learning tools (such as neural networks and classification methods), train a model to identify potential fraudsters. The input variables consist of all of the indicators described so far, computed for previously identified fraudulent and non-fraudulent funds. Apply the fitted model to a new fund to obtain its classification.
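The two digit tests can be sketched with a Pearson chi-square statistic. The slide only says "goodness-of-fit test", so the choice of chi-square is an assumption, as are the function names and the default of two reported decimals. The Python below uses only the standard library and compares the statistic with the usual 5% critical values instead of computing a p-value.

```python
import math
from collections import Counter

# Benford probabilities for first digits 1..9 and uniform for last digits 0..9
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
UNIFORM_LAST = {d: 0.1 for d in range(10)}

# 5% critical values of the chi-square distribution
CRITICAL_5PCT = {8: 15.507, 9: 16.919}  # df = number of categories - 1


def first_digit(x):
    """First significant digit of a non-zero number."""
    return int(f"{abs(x):.15e}"[0])


def last_digit(x, decimals=2):
    """Last reported digit, assuming values are reported to `decimals` places."""
    return round(abs(x) * 10**decimals) % 10


def chi_square(digits, probs):
    """Pearson chi-square statistic of observed digits against `probs`."""
    n = len(digits)
    counts = Counter(digits)
    return sum((counts.get(d, 0) - n * p) ** 2 / (n * p)
               for d, p in probs.items())


returns_pct = [1.25, 0.87, -0.43, 2.10, 1.02, 0.95, 1.31, -0.12, 0.64, 1.77]
firsts = [first_digit(r) for r in returns_pct if r != 0]
lasts = [last_digit(r) for r in returns_pct]
print(chi_square(firsts, BENFORD) > CRITICAL_5PCT[8])
print(chi_square(lasts, UNIFORM_LAST) > CRITICAL_5PCT[9])
```

Ten observations are far too few for a meaningful test; the sample is only there to show the calls. Zero returns are skipped for the first-digit test because they have no leading digit.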
19 Text Based Indicators
Idea from published research in criminal investigation.
Hypothesis: deceptive senders display
- Higher quantity
- Higher expressivity
- Higher informality
- Higher uncertainty
- Higher nonimmediacy
- Lower complexity
- Lower diversity
- Lower specificity
Source: Lina Zhou (Department of Information Systems, University of Maryland, Baltimore County, MD, USA), Judee K. Burgoon, Jay F. Nunamaker, Jr. and Doug Twitchell (Center for the Management of Information, University of Arizona, Tucson, AZ, USA). Automating Linguistics-Based Cues for Detecting Deception in Text-based Asynchronous Computer-Mediated Communication. Group Decision and Negotiation 13.
20 Implemented Methods: Text Based
- Measure of Complexity
  - Average number of statements (average concepts per sentence)
  - Average sentence length (average complexity of structures)
  - Vocabulary complexity (average word length)
- Measure of Uncertainty
  - Average use of modifiers (number of adjectives/adverbs per sentence)
  - Average reference to others (number of "he", "they", ...)
- Measure of Expressivity
  - Emotiveness (number of adjectives compared to nouns)
- Measure of Diversity
  - Lexical diversity (number of unique words)
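Several of these text measures need no part-of-speech tagging and can be sketched directly. The Python below is a rough illustration, not the presentation's implementation: sentences are split on ".", "!" and "?", lexical diversity is taken as unique words divided by total words (an assumption; the slide says only "number of unique words"), and the pronoun list is a hypothetical example. Measures that depend on adjectives, adverbs or nouns are left out because they require a tagger.

```python
import re

# Hypothetical list of third-person words counted as "reference to others"
OTHER_REFERENCES = {"he", "she", "they", "him", "her", "them", "his", "their"}


def text_indicators(text):
    """Compute simple complexity, uncertainty and diversity indicators."""
    # Naive sentence split; decimals and abbreviations would be miscounted
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not sentences or not words:
        raise ValueError("text must contain at least one word")

    return {
        "avg_sentence_length": len(words) / len(sentences),
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "lexical_diversity": len(set(words)) / len(words),
        "other_references_per_sentence":
            sum(w in OTHER_REFERENCES for w in words) / len(sentences),
    }


sample = "They said the fund is safe. The fund is safe!"
print(text_indicators(sample))
```

Computed for each fund's documents, these values could be appended to the returns-based indicators as inputs to the classification step.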
21 Classifying Words
- Java POS Tagger
- Reference online dictionary
- Only a few lines of code
22 Comparison: American Growth Fund
23 Comparison: Madoff
24 Next Steps: Machine Learning with MATLAB
To learn more, visit:
- Basket Selection using Stepwise Regression
- Classification in the presence of missing data
- Regression with Boosted Decision Trees
- Hierarchical Clustering
25 MATLAB Solutions
Traditional Approach
- Off-the-shelf software
- In-house development with traditional languages
- Spreadsheets, Excel
- Combination of the above
Challenge
- Inability to work with custom and complex data
- Adapting requires long development times
- Limited data size
- Inefficiencies in integration & automation
Solution (column cut off in the slide transcription): Flexible Work Rapid P Advan Work w Datab Easy to Autom
26 Financial Modeling Workflow
- Access: Files, Databases, Datafeeds
- Research and Quantify: Data Analysis and Visualization, Financial Modeling, Application Development
- Share: Reporting, Applications, Production
Products: Spreadsheet Link EX; Database; Datafeed; Trading; Financial Instruments; Statistics & Machine Learning; Financial; Econometrics; Optimization; Report Generator; Production Server; MATLAB Compiler SDK; MATLAB Compiler; MATLAB; Parallel Computing; MATLAB Distributed Computing Server
27 Q&A
Decision Support Optimization through Predictive Analytics - Leuven Statistical Day 2010 Ernst van Waning Senior Sales Engineer May 28, 2010 Agenda SPSS, an IBM Company SPSS Statistics User-driven product
More informationData Analytics in the Corporate Payment Industry. Bret Hansen Vice President of Technology Services, U.S. Bancorp
Data Analytics in the Corporate Payment Industry Bret Hansen Vice President of Technology Services, U.S. Bancorp Agenda The Daily News Classifications of Complex Event Processing Maximizing Control, Compliance,
More informationMaschinelles Lernen mit MATLAB
Maschinelles Lernen mit MATLAB Jérémy Huard Applikationsingenieur The MathWorks GmbH 2015 The MathWorks, Inc. 1 Machine Learning is Everywhere Image Recognition Speech Recognition Stock Prediction Medical
More informationPredictive Analytics Powered by SAP HANA. Cary Bourgeois Principal Solution Advisor Platform and Analytics
Predictive Analytics Powered by SAP HANA Cary Bourgeois Principal Solution Advisor Platform and Analytics Agenda Introduction to Predictive Analytics Key capabilities of SAP HANA for in-memory predictive
More informationRUTHERFORD HIGH SCHOOL Rutherford, New Jersey COURSE OUTLINE STATISTICS AND PROBABILITY
RUTHERFORD HIGH SCHOOL Rutherford, New Jersey COURSE OUTLINE STATISTICS AND PROBABILITY I. INTRODUCTION According to the Common Core Standards (2010), Decisions or predictions are often based on data numbers
More informationFacilitating On-Demand Risk and Actuarial Analysis in MATLAB. Timo Salminen, CFA, FRM Model IT
Facilitating On-Demand Risk and Actuarial Analysis in MATLAB Timo Salminen, CFA, FRM Model IT Introduction It is common that insurance companies can valuate their liabilities only quarterly Sufficient
More informationClustering Connectionist and Statistical Language Processing
Clustering Connectionist and Statistical Language Processing Frank Keller keller@coli.uni-sb.de Computerlinguistik Universität des Saarlandes Clustering p.1/21 Overview clustering vs. classification supervised
More informationPractical Applications of DATA MINING. Sang C Suh Texas A&M University Commerce JONES & BARTLETT LEARNING
Practical Applications of DATA MINING Sang C Suh Texas A&M University Commerce r 3 JONES & BARTLETT LEARNING Contents Preface xi Foreword by Murat M.Tanik xvii Foreword by John Kocur xix Chapter 1 Introduction
More informationMasters in Computing and Information Technology
Masters in Computing and Information Technology Programme Requirements Taught Element, and PG Diploma in Computing and Information Technology: 120 credits: IS5101 CS5001 or CS5002 CS5003 up to 30 credits
More informationHIGH PERFORMANCE ANALYTICS FOR TERADATA
F HIGH PERFORMANCE ANALYTICS FOR TERADATA F F BORN AND BRED IN FINANCIAL SERVICES AND HEALTHCARE. DECADES OF EXPERIENCE IN PARALLEL PROGRAMMING AND ANALYTICS. FOCUSED ON MAKING DATA SCIENCE HIGHLY PERFORMING
More informationData Mining Part 5. Prediction
Data Mining Part 5. Prediction 5.1 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Classification vs. Numeric Prediction Prediction Process Data Preparation Comparing Prediction Methods References Classification
More information