A Proposal for the use of Artificial Intelligence in Spend-Analytics


Mark Bishop, Sebastian Danicic, John Howroyd and Andrew Martin

Our core team

Mark Bishop PhD studied Cybernetics and Computer Science at the University of Reading. He is Professor of Cognitive Computing at Goldsmiths, University of London, and was Chair of the Society for the Study of Artificial Intelligence and the Simulation of Behaviour (AISB), the largest Artificial Intelligence society in the United Kingdom. He has published widely in the areas of Artificial Intelligence, Machine Learning and Neural Computing.

John Howroyd PhD studied Mathematics at Oxford University and University College, London. As well as being an expert mathematician, John has published widely in computer science, in particular in program analysis. He has also worked as Head of Research in a major project developing a Spend-Analytics system for NHS trusts. The particular problems solved for this system involved dealing with noisy and incomplete data. John and his team devised new techniques for automatically enriching this data in a structured way using external sources. He has in-depth knowledge of Bayesian Networks, classification and clustering methods, and is also an experienced database engineer specialising in efficiency.

Sebastian Danicic PhD studied Pure Mathematics at Queen Mary College, London, and Computer Science at the University of Oxford and Imperial College London. He is Reader in Computer Science at Goldsmiths, University of London. He is a highly experienced researcher with publications in Program Analysis, Theoretical Computer Science, Complexity of Algorithms and Software Watermarking. He is Director of the Program Analysis and Transformation Group at Goldsmiths.

Andrew Martin MSc studied Computer Science and Cybernetics at the University of Reading and Cognitive Computing at Goldsmiths, University of London, under Mark Bishop. He is currently a PhD student at Goldsmiths, University of London, researching Artificial Intelligence in the context of 4E's Cognitive Science, as well as a Software Contractor and the current Secretary of the AISB.

Together we have broad experience of many aspects of Mathematics, Computer Science and Artificial Intelligence. We have had considerable success working together as a team, developing both new research ideas and deliverables for customers.

Background

In our document entitled "The Centre for Intelligent Data Analytics: research goals" of Dec 2013 we identified the following research areas where advanced Artificial Intelligence (AI) techniques can assist the delivery of medium and long term strategic goals for our partner's Analytics.

Semantics

At the heart of spend analysis is the general problem of forming an accurate, detailed semantic understanding of items from the raw text information that is available to the system (e.g. product descriptions). This data must be analysed using the existing knowledge base; sometimes, however, there may not be enough current context to understand this data unambiguously. In such circumstances it may be necessary to enrich the information via additional user interaction and/or web spidering. To help solve such semantic issues there is scope for the application of new AI techniques; for example, deep learning, reservoir computing and the newly emerging area of quantum linguistics [1].

[1] Maruyama reports: "Quantum linguistics emerged from the spirit of categorical quantum mechanics, integrating Lambek pregroup grammar, which is qualitative, and the vector space model of meaning, which is quantitative, into the one concept via the methods of category theory. It has already achieved, as well as conceptual lucidity, experimental successes in automated synonymity-related judgement tasks (such as disambiguation)." For a brief introduction see Jacob Aron (2010), "Quantum links let computers understand language", New Scientist, December.

Identification of similar suppliers and products

Previous work by the team has already demonstrated the need to build contextually sensitive ontologies for product descriptions. These can aid in the core classification of both products and suppliers. Improvements in this technology will lead to better identification of equivalent products; such improvements can be envisaged as applying in two distinct ways:

1. To [better] identify as the same a particular entity originally made by one manufacturer but subsequently sold on by different suppliers.
2. To [better] identify products that fulfil the same functional role but which are made by different manufacturers.

Clearly the abstract notion of equivalent functionality opens up further questions regarding the relative quality of one product as compared to another etc. Access to our partner's huge database offers exciting new opportunities for state-of-the-art data-mining and quantum linguistics to help make useful progress in this domain.

Different learning algorithms for classification based on clean data

Access to our partner's database opens up new opportunities to research state-of-the-art machine learning techniques (e.g. deep learning [2]; reservoir computing; echo-state networks) which could potentially also offer a significant improvement in classification performance.

[2] Deep learning is part of a broader family of machine learning methods based on learning representations. A field (e.g. a product) can be represented in many different ways (e.g. different sentences), but some representations make it easier to learn tasks of interest (e.g. "Is this drill the same as that one?") from examples.

Automatic ontology generation

Ontologies are structural frameworks for organising information and are used in artificial intelligence as a form of knowledge representation about the world (or some part of it). An ontology formally represents knowledge as a set of concepts within a domain, using a shared vocabulary to denote the types, properties and interrelationships of those concepts. Automatically developing contextually sensitive ontologies will significantly improve the classification system.

Trend analysis: prediction of future price fluctuations

The aim is to explore the use of our partner's database to identify economic trends in purchasing via the application of advanced machine learning techniques; the expectation is that, with access to the huge database, new learning algorithms could be trained to make commercially useful time-series predictions (e.g. to highlight strategic opportunities for investment etc.).

The Way Forward

Informed by the demonstration, it appears that there are two separate, but inter-linked, pathways to be developed:

1. Spend-analytics for buyers (SAB)
2. Global Spend Analytics (GSA)

GSA is an entirely new yet-to-be-specified system. Perhaps a good way to summarise the concerns of SAB is that it should provide a system for purchasing managers which allows them to find the most easily achievable savings from their data with the least effort spent, so maximising their return on time invested; the team highlight some of the general considerations that a SAB system might address in Appendix 1.

In the system demonstrated, the production of spend-analytics from the buyer's perspective only uses transactions between the buyer using the system and its suppliers (in each case, a tiny proportion of all the data). All of the rest of the vast data set is ignored in performing this analysis. We term the calculation of price variance at the individual supplier level "local spend analytics", as it pertains to a local subset of the total data set relating to a specific supplier.

Local spend-analytics

The system demonstrated by our partner at the meeting on April 17th computed local price variance on the same product supplied by the same supplier; a product-identity relationship. Because the data possessed by our partner is clean, product-identity is a relatively straightforward function to calculate, requiring no application of Artificial Intelligence methods [3]. It is clear, however, that if the current system is extended to include more general analysis, the application of AI-based techniques cannot be avoided. For example, as it is possible that different suppliers may use [subtly] different text to describe the same product, a simple identity relationship between text descriptors may no longer hold; in this case we need to class as the same those text strings that are [by a suitable metric] similar. We note that even the relatively simple sounding task of comparing prices of the same product supplied to the buyer by different suppliers defines a problem whose solution is considerably more difficult than the example demonstrated.

Global spend-analytics

Global spend analytics, on the other hand, will take advantage of the whole dataset and other external data to allow us, for example, to observe general trends and to predict strategic risks and opportunities. Although the use of artificial intelligence can improve local spend analytics - in the example we highlighted, by allowing the application of a similarity metric for product identification - for global spend analytics the use of advanced AI will be essential.

Proposed improvements to spend-analytics

Although the system demonstrated on April 17th highlights an immediate and exciting potential revenue stream for our partner, we are concerned that [future] competitors could realise similar functionality relatively easily.
In this context the team have identified the following broad research pathways by which the Analytics might significantly, and non-trivially, be improved (we expand and appropriately outline these ideas in Appendix 2); furthermore we suggest that the use of appropriate advanced AI techniques could offer more clearly delineated intellectual property rights to our partner:

1. Real time processing
2. Better price variance analysis
3. Adding reporting dimensions
4. Knowledge enrichment from external sources
5. Clustering or classification of products
6. Improved search functionality
7. User behaviour to improve results
8. Modelling the market for better statistics
9. Trend analysis for predictive forecasting

[3] Because of the clean nature of the data it is likely that local product-identity can be established by a simple comparison

Concluding remarks

If our partner seeks to fully monetise their data assets, more effective, deeper analytics will inevitably be required. In order to achieve this, some or all of the nine areas identified above need to be investigated (not least to develop and delineate long term intellectual property across the domain). It is in these areas that the team would seek to apply powerful new AI methodologies to leverage strategic advantage in the medium and long term.

Appendix 1: Some considerations of spend analysis

In our experience - in the context of spend analytics - the following kinds of issues are often of concern to purchasing managers:

Price Variance: The same product bought from the same supplier at various prices. This suggests areas where contracted prices may be considered.

Supplier Consolidation: The same product bought from differing suppliers at various prices. This suggests where preferred suppliers may help to reduce overall costs.

Product Consolidation: Differing products with the same functional role bought at various prices. This has many difficulties, as it raises questions of quality and cost of utility, but could result in overall reductions in expenditure.

Order Consolidation: Products bought frequently in small amounts where savings could be achieved by placing fewer bulk orders.

Contract Adherence: Products bought off contract, at a higher price, when a contract exists. This would require comparing invoice lines against a database of contracted pricing.

Order Adherence: Products supplied which were not requested. This requires matching invoices with the relevant orders where they exist and raising concerns with the supplier in good time.

Peripheral Cost Savings: Reducing peripheral charges such as VAT, Invoicing, Delivery, and Credit.

Internal Cost Savings: Reducing internal expenses such as Storage, Stock control, Cost of accounting, and delivery to point of use.

Spend Forecasting: Given current market trends, what is the likely future expenditure for the various parts of the business?

NB. It is unlikely that any spend analysis system can fully resolve these problems (as this will always require the application of problem-specific knowledge and experience from purchasing managers); however, by appropriately analysing current and past data, a strong spend analysis system can provide appropriate information to purchasing managers, from which they can make good purchasing decisions more easily.

Appendix 2: Potential improvement pathways for The Analytics

Real time processing

There is a commercial advantage in reporting spend analysis in real time. This will allow purchasing managers to raise concerns about particular invoices before payment is made. With appropriate consideration of the information architecture this can be achieved, allowing incremental improvements to be reflected in the reporting as they arise.

Better price variance analysis

As an example of the system's functionality we were shown how it reports potential savings to buyers on the same product supplied by the same supplier. At the prototype demonstration it appeared that the system effectively estimates potential savings by computing how much would have been paid if all products had been bought at the minimum price (and then subtracting that amount from that which was actually paid).

The team remain concerned that such an approach may [at least occasionally] give rise to an exaggerated view of potential savings to buyers: for example, the data set may contain a single outlier among many transactions, representing, say, a special offer with a much cheaper price, and it will normally be unreasonable to use this singleton as a basis for comparing all other purchases. Furthermore, the price of items may fluctuate seasonally, and it would be unreasonable to expect to pay the summer price for tomatoes in the winter.

We suspect that if a SAB system merely highlighted variations from the minimum price, this feature might eventually be ignored by its users. We suggest that for customers to take price variation seriously a more sophisticated approach is required; one that can take all of the above factors into account. In much the same way that the Google page-rank algorithm gave rise to a better reflection of the importance of specific web pages (and hence prompted the long term shift of web search services from Alta-Vista to Google), we believe that a similarly clever algorithm for ordering possible savings could offer a much better reflection of the importance of individual price variation to the user.

As soon as the SAB system is extended to include less specific analysis (e.g. the task of comparing prices of an identical product supplied to the buyer by different suppliers), the application of advanced artificial intelligence techniques (from areas such as quantum linguistics, data mining, machine-learning and clustering) cannot easily be avoided [4].

[4] E.g. instead of simply reporting variances from minimum prices, more sophisticated algorithms could inform buyers which products were most likely to yield the largest savings (taking into account seasonal fluctuations etc.) and offer the user the chance to ignore outliers in performing the analysis. In addition, we suggest that inflation and other market forces should also be taken into account in presenting more accurate estimated potential savings to buyers.
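The concern about a single outlier inflating the savings estimate can be illustrated with a minimal Python sketch. It is not the partner's algorithm: the function names, the median-absolute-deviation rule for discarding unusually cheap prices, the cut-off k = 3 and the sample figures are all assumptions made for illustration, and seasonality and inflation are deliberately left out.

```python
from statistics import median

def savings_vs_minimum(purchases):
    """Estimate as demonstrated: everything re-priced at the lowest price seen.

    purchases is a list of (unit_price, quantity) pairs for one product
    from one supplier.
    """
    floor = min(price for price, _ in purchases)
    return sum((price - floor) * qty for price, qty in purchases)

def robust_floor(prices, k=3.0):
    """Lowest price that is not an outlier (hypothetical rule).

    A price counts as an outlier when it lies more than k median absolute
    deviations below the median price.
    """
    med = median(prices)
    mad = median(abs(p - med) for p in prices)
    kept = [p for p in prices if p >= med - k * mad]
    return min(kept)

def savings_vs_robust_floor(purchases, k=3.0):
    """Savings against the outlier-free floor; lines below it contribute zero."""
    floor = robust_floor([price for price, _ in purchases], k)
    return sum(max(price - floor, 0.0) * qty for price, qty in purchases)

# Illustrative data: five ordinary purchases and one one-off special offer.
purchases = [(10.0, 100), (10.2, 100), (9.8, 100),
             (10.5, 100), (10.0, 100), (4.0, 10)]
print(savings_vs_minimum(purchases))
print(savings_vs_robust_floor(purchases))
```

On this invented data the minimum-price estimate is roughly twenty times larger than the outlier-free one, which is the kind of exaggeration a buyer would quickly learn to distrust.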

Allowing buyers to add reporting dimensions

It would also seem natural to allow the users to influence the overall reporting of possible savings. This could include, for example, the ability to upload cost centre codes, accounting codes, their own product classifications, or contract data (agreed pricing of products from various suppliers). In this context the team suggest investigating the extent to which AI technology could be used in a predictive manner to help reduce the burden of maintaining such dimensions as new product items are supplied or new suppliers are engaged.

Knowledge enrichment from external sources

The SpendInsight system was designed to incorporate noisy data from many different sources; for example, order lines, supplier catalogues, contract databases, account systems, and the web. In this respect the clean database of invoices is now the base starting point. Where there is information to support the underlying data, this could also be linked. AI technologies (as deployed in the SpendInsight system) can offer mechanisms to do this in a way that keeps the data sources distinct and thus enables complete control over what is shown to specific users.

Clustering or classification of products

Clustering is essential in useful spend-analytics. The task of moving from identical to similar items is very difficult and requires a variety of techniques, many of which fall under the general heading of artificial intelligence. For example, a SAB system may be required to perform a more general analysis about pens. In order to do this, we need to find all products in our system which come under that category. This is a very hard task and can never be performed to 100% accuracy except with very small data sets. In fact, in order to help solve the problem we may have to look outside our local dataset, possibly even resorting to spidering the Web in a search for hints about how to classify products whose internal descriptions are not sufficiently helpful.
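To make the step from identical to similar descriptions concrete, here is a deliberately crude Python baseline: token-set (Jaccard) similarity with a greedy one-pass grouping. The function names, the 0.5 threshold and the sample descriptions are assumptions for illustration only. A purely lexical measure like this cannot recognise synonyms or use external knowledge, which is exactly where the techniques discussed in this proposal would be needed.

```python
import re

def tokens(description):
    """Lower-case alphanumeric tokens of a product description, as a set."""
    return set(re.findall(r"[a-z0-9]+", description.lower()))

def jaccard(a, b):
    """Share of distinct tokens the two descriptions have in common (0 to 1)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def greedy_clusters(descriptions, threshold=0.5):
    """Put each description in the first group whose first member is similar enough.

    The threshold is an arbitrary illustrative value, and the result depends
    on input order; this is a baseline, not a recommended clustering method.
    """
    clusters = []
    for d in descriptions:
        for group in clusters:
            if jaccard(d, group[0]) >= threshold:
                group.append(d)
                break
        else:
            clusters.append([d])
    return clusters

# Invented invoice descriptions.
lines = [
    "Ballpoint pen blue medium",
    "BALLPOINT PEN, BLUE (medium)",
    "Blue ballpoint pen medium tip",
    "Cordless drill 18V",
]
for group in greedy_clusters(lines):
    print(group)
```

The three pen descriptions fall into one group and the drill into another, but a description such as "biro, blue" would be missed entirely, so any real system would have to attach a probability to each match rather than a hard yes or no.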
Without clustering, the same product supplied by a different supplier will be regarded as a different product; to identify them as the same is a very difficult problem. When are two products produced by different suppliers in fact the same? This sort of question is addressed using algorithms from artificial intelligence and can only be answered probabilistically.

Furthermore, clustering is essential whenever we want to ask questions in a more general way. Without clustering, we may be able to ask questions like "How is supplier X performing this month?", but if we want to ask questions like "How is supplier X performing this month compared to other similar suppliers?" things become much more complex. We need to be able to find ways of clustering similar suppliers. Presumably, inter alia, similar suppliers sell similar products. Deciding if two different products are similar, however, is an even more difficult problem than deciding whether they are identical. This problem may require the use of external data produced as the result of spidering and state-of-the-art semantic text analysis such as quantum linguistics.

Improved search functionality

A nice feature demonstrated was the search functionality when filtering the result set by product. However, this relied purely on selecting products containing the search terms in their invoice descriptions. The system has many possibilities for improvement, but these require a degree of semantic understanding (e.g. that the word "transit" should be treated synonymously with "carriage"). The search functionality would also be improved by using enriched data from external sources, such as fuller product descriptions from supplier catalogues. Similarly, a fine-grained clustering or classification of products could be used to broaden searches over specific types of product. We see this as a series of incremental steps to provide the buyers with the search functionality that they require.

User behaviour to improve results

We can also add knowledge by analysing user supplied data and user behaviour. User supplied data allows for a more tailored interface to the user, but also, when aggregated across all users, gives semantic information from the human perspective which may be leveraged in many ways. Similarly, user behaviour can also be mined, providing an important feedback loop for the relevant learning algorithms. For example, noting which products are most frequently grouped together for comparison gives an additional mechanism for addressing which products are similar. This can then be used to adjust the parameters for the ranking of products in a search.
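The feedback signal just described, products that users repeatedly group together for comparison, can be sketched as a simple pair count. This is an illustrative Python fragment only; the session format, the function name and the product identifiers are assumptions, and how such counts would feed into search ranking is left open.

```python
from collections import Counter
from itertools import combinations

def co_comparison_counts(sessions):
    """Count how often each pair of products appears in the same comparison.

    sessions is an iterable of product-id lists, one list per comparison a
    user made (a hypothetical log format). Pairs are stored in sorted order
    so that (a, b) and (b, a) are counted together.
    """
    counts = Counter()
    for products in sessions:
        for pair in combinations(sorted(set(products)), 2):
            counts[pair] += 1
    return counts

# Invented comparison sessions.
sessions = [
    ["pen-a", "pen-b"],
    ["pen-a", "pen-b", "pencil-c"],
    ["drill-x", "pen-a"],
]
counts = co_comparison_counts(sessions)
print(counts.most_common(3))
```

Pairs with high counts are candidates for being treated as similar products, and the counts could serve as one input among several when tuning the similarity or ranking parameters.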

Modelling the market for better statistics

The data for one product item from one supplier to one buyer is generally so sparse that accurate analysis and predictions are not possible (a problem statisticians might call over-fitting). Of course, something is probably better than nothing, but the value will be limited and, without care, expectations could be artificially raised. The notion of similarity, as discussed under clustering or classification of products, allows for hierarchical modelling of products. High level groupings with lots of members have lots of data and smoother behaviour, giving rise to better models. Lower level groupings have fewer members and less data, but their models should be influenced (and smoothed) by the models of the higher level groups to which they belong. This enables better predictions to be made at these lower levels by allowing influence from above.

Trend analysis for predictive forecasting

The market modelling will enable comparative analysis of the various products and product groupings. This builds up a network of correlations over the marketplace, with strong correlations between closely related parts of the market but also some between parts which are more distant (and perhaps unexpected). Temporal properties may also be examined; for example, where growth in one is usually followed by growth in another. This, together with standard time series techniques, should provide a rich toolbox for trend analysis and predictive forecasting.

Contact Us

If you have follow up questions contact Andrew Martin at ac.uk who will answer your question directly, or pass it on to the other members of the team. We look forward to receiving your enquiries.

Data Mining Solutions for the Business Environment

Data Mining Solutions for the Business Environment Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over

More information

Customer Classification And Prediction Based On Data Mining Technique

Customer Classification And Prediction Based On Data Mining Technique Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor

More information

OUTLIER ANALYSIS. Data Mining 1

OUTLIER ANALYSIS. Data Mining 1 OUTLIER ANALYSIS Data Mining 1 What Are Outliers? Outlier: A data object that deviates significantly from the normal objects as if it were generated by a different mechanism Ex.: Unusual credit card purchase,

More information

An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

More information

Building a Database to Predict Customer Needs

Building a Database to Predict Customer Needs INFORMATION TECHNOLOGY TopicalNet, Inc (formerly Continuum Software, Inc.) Building a Database to Predict Customer Needs Since the early 1990s, organizations have used data warehouses and data-mining tools

More information

SPATIAL DATA CLASSIFICATION AND DATA MINING

SPATIAL DATA CLASSIFICATION AND DATA MINING , pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal

More information

Introduction. A. Bellaachia Page: 1

Introduction. A. Bellaachia Page: 1 Introduction 1. Objectives... 3 2. What is Data Mining?... 4 3. Knowledge Discovery Process... 5 4. KD Process Example... 7 5. Typical Data Mining Architecture... 8 6. Database vs. Data Mining... 9 7.

More information

Direct Marketing of Insurance. Integration of Marketing, Pricing and Underwriting

Direct Marketing of Insurance. Integration of Marketing, Pricing and Underwriting Direct Marketing of Insurance Integration of Marketing, Pricing and Underwriting As insurers move to direct distribution and database marketing, new approaches to the business, integrating the marketing,

More information

Business Intelligence. Data Mining and Optimization for Decision Making

Business Intelligence. Data Mining and Optimization for Decision Making Brochure More information from http://www.researchandmarkets.com/reports/2325743/ Business Intelligence. Data Mining and Optimization for Decision Making Description: Business intelligence is a broad category

More information

University of Gaziantep, Department of Business Administration

University of Gaziantep, Department of Business Administration University of Gaziantep, Department of Business Administration The extensive use of information technology enables organizations to collect huge amounts of data about almost every aspect of their businesses.

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:

More information

Short-Term Forecasting in Retail Energy Markets

Short-Term Forecasting in Retail Energy Markets Itron White Paper Energy Forecasting Short-Term Forecasting in Retail Energy Markets Frank A. Monforte, Ph.D Director, Itron Forecasting 2006, Itron Inc. All rights reserved. 1 Introduction 4 Forecasting

More information

THE PREDICTIVE MODELLING PROCESS

THE PREDICTIVE MODELLING PROCESS THE PREDICTIVE MODELLING PROCESS Models are used extensively in business and have an important role to play in sound decision making. This paper is intended for people who need to understand the process

More information

Data Mining Applications in Higher Education

Data Mining Applications in Higher Education Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2

More information

Sales and Invoice Management System with Analysis of Customer Behaviour

Sales and Invoice Management System with Analysis of Customer Behaviour Sales and Invoice Management System with Analysis of Customer Behaviour Sanam Kadge Assistant Professor, Uzair Khan Arsalan Thange Shamail Mulla Harshika Gupta ABSTRACT Today, the organizations advertise

More information

This has been categorized into two. A. Master data control and set up B. Utilizing master data the correct way C. Master data Reports

This has been categorized into two. A. Master data control and set up B. Utilizing master data the correct way C. Master data Reports Master Data Management (MDM) is the technology and tool, the processes and personnel required to create and maintain consistent and accurate lists and records predefined as master data. Master data is

More information

Database Marketing, Business Intelligence and Knowledge Discovery

Database Marketing, Business Intelligence and Knowledge Discovery Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski

More information

Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management

Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Paper Jean-Louis Amat Abstract One of the main issues of operators

More information

Learning is a very general term denoting the way in which agents:

Learning is a very general term denoting the way in which agents: What is learning? Learning is a very general term denoting the way in which agents: Acquire and organize knowledge (by building, modifying and organizing internal representations of some external reality);

More information

How the Internet is Impacting Revenue Management, Pricing, and Distribution. E. Andrew Boyd Chief Scientist and Senior Vice President PROS

How the Internet is Impacting Revenue Management, Pricing, and Distribution. E. Andrew Boyd Chief Scientist and Senior Vice President PROS How the Internet is Impacting Revenue Management, Pricing, and Distribution E. Andrew Boyd Chief Scientist and Senior Vice President PROS 1 Prices and Sales 2 A Common Business Model List Price for Each

More information

Chapter 20: Data Analysis

Chapter 20: Data Analysis Chapter 20: Data Analysis Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 20: Data Analysis Decision Support Systems Data Warehousing Data Mining Classification

More information
