Essential Components of an Integrated Data Mining Tool for the Oil & Gas Industry, With an Example Application in the DJ Basin.




Petroleum & Natural Gas Engineering, West Virginia University. SPE Annual Technical Conference, Denver, Colorado, October 2003.

Outline: Introduction; Data Mining Classifications; Descriptive Data Mining; Predictive Data Mining; Application in DJ Basin; Conclusions.

Introduction. Data mining is not a new process. What is new is the addition of machine learning: artificial neural networks, genetic optimization, and fuzzy logic. It is expected to be a 1.5 billion dollar business by 2005. If your company is not doing it now, it will.

The volume of digital data has increased significantly. Data mining moves from data to information to knowledge, and it lets you be proactive.

Data mining is an integrated process.

Data Mining Classification. Definition: the nontrivial extraction of implicit, previously unknown, and potentially useful information from data. Exploratory data mining asks what the data can tell me. Guided data mining looks for specific answers to questions.

Components of a Data Mining Tool: dataset; fuzzy combinatorial analysis; neural model building; statistical module; cluster analysis; automatic cluster optimization; genetic optimization; data cleansing module; neural network data preparation; training, calibration and verification datasets; fuzzy decision support system.

Quality Control & Preprocessing. These are probably the most important and most time-consuming components of the data mining process. They may take as much as 50-65% of the entire project.

Sources of error. Human (erratic, hard to resolve): data collection, data entry, data manipulation. Equipment (systematic, shifts).

Quality control and preprocessing must include the following components: dealing with missing data; dealing with outliers; dealing with contaminated data; statistical analysis; feature selection; feature reduction; pseudo data generation.

Statistical analysis. Examine the major statistics of the data set. Perform regression analysis on all the features in the data set. Look at the data distribution, which can reveal important information. Study the probability density function for all the features.

Missing data. Identify the location of the missing data, then patch the holes in the data set.
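The two missing-data steps above (locate the holes, then patch them) can be sketched as follows. The slides do not say how the holes were patched, so filling with the field median is an assumption made only for illustration; the field names and values are hypothetical.

```python
from statistics import median


def find_missing(records, fields):
    """Return (row_index, field) pairs where a value is missing (None)."""
    return [(i, f) for i, rec in enumerate(records)
            for f in fields if rec.get(f) is None]


def patch_with_median(records, fields):
    """Return a copy of records with each hole filled by that field's median.

    Median fill is an assumed strategy, not the one used in the source.
    """
    patched = [dict(rec) for rec in records]
    for f in fields:
        known = [rec[f] for rec in records if rec.get(f) is not None]
        if not known:
            continue  # nothing to estimate from; leave the holes in place
        fill = median(known)
        for rec in patched:
            if rec.get(f) is None:
                rec[f] = fill
    return patched


# Hypothetical well records, made up for this example.
wells = [
    {"avg_rate_bpm": 18.0, "avg_psi": 4200.0},
    {"avg_rate_bpm": None, "avg_psi": 4500.0},
    {"avg_rate_bpm": 22.0, "avg_psi": None},
    {"avg_rate_bpm": 20.0, "avg_psi": 4800.0},
]
fields = ["avg_rate_bpm", "avg_psi"]
holes = find_missing(wells, fields)
patched = patch_with_median(wells, fields)
```

Keeping the list of hole locations separate from the patched copy makes it possible to compare several patching strategies against the same untouched original.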

Outliers. Identify the outlier records in the data set, then identify a remedy for each outlier.
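One common way to carry out both outlier steps is the interquartile-range rule, with clipping to the fences as the remedy. The slides name neither a detection rule nor a remedy, so both choices here are assumptions, and the sample values are hypothetical.

```python
from statistics import quantiles


def iqr_fences(values, k=1.5):
    """Return (low, high) fences at k interquartile ranges beyond the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def flag_outliers(values, k=1.5):
    """Return the indices of values that fall outside the fences."""
    low, high = iqr_fences(values, k)
    return [i for i, v in enumerate(values) if v < low or v > high]


def clip_outliers(values, k=1.5):
    """Remedy (assumed): pull each outlier back to the nearest fence."""
    low, high = iqr_fences(values, k)
    return [min(max(v, low), high) for v in values]


# Hypothetical sand volumes (Mlbs); the last entry mimics a data-entry error.
sand_mlbs = [110.0, 120.0, 125.0, 130.0, 135.0, 140.0, 150.0, 900.0]
outlier_rows = flag_outliers(sand_mlbs)
clipped = clip_outliers(sand_mlbs)
```

Clipping keeps the record in the data set. Deleting the record or treating the value as missing are equally valid remedies, depending on whether the source of the error can be traced.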

Descriptive Data Mining covers feature selection, rule induction, and cluster analysis (hierarchical, K-means, fuzzy C-means, and self-organizing neural networks).

Descriptive data mining is exploratory in nature. It is an absolutely essential component of the data mining process and may reveal interesting and relevant information.

Feature selection: a module capable of identifying the most influential parameters in a dataset.

It is hard to see which feature influences the outcome more than the others. Each feature influences the outcome to a degree, and features influence one another as well as the outcome.

Clustering describes a collection of unsupervised methods whose aim is to partition an overall data set into a significantly smaller number of "clusters".

These methods generally require some kind of distance measure among the data entities in order to group them together and assign each data entity to one cluster.
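As an illustration of how a distance measure drives the grouping, here is a minimal K-means sketch using Euclidean distance. It is a textbook version, not the tool described in the slides, and the sample points are hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(points, k, iterations=50, seed=0):
    """Partition points into k clusters; return (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda c: euclidean(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centers.append(centers[c])  # empty cluster keeps its center
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return labels, centers


# Hypothetical two-feature records forming two obvious groups.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.9),
          (8.0, 8.2), (8.1, 7.9), (7.8, 8.0)]
labels, centers = k_means(points, k=2)
```

Because the result depends on distances, features measured in very different units would normally be scaled before clustering. That step is omitted here.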

Cluster Analysis. Methods: K-means clustering and fuzzy C-means clustering.

Identifying the optimum number of clusters and the optimum features involved in clustering is very important and quite challenging.

Predictive Data Mining can identify patterns already in the dataset. It also has the potential to identify patterns that do not yet exist in the dataset but could develop. It can fill all the gaps in the dataset.

Predictive data mining is a highly supervised process that includes decision tree analysis, artificial neural networks, genetic algorithms, and fuzzy logic.

Decision tree analysis is appropriate for solving problems that can be dissected into a logical progression of events.

Neural Networks. Information processing technique as a function of architecture. Humans: parallel, distributive. Computers: sequential, pointwise.

Neural Networks: _ P_NN_ S_V_D _S _ P_NN _RN_D

Fuzzy Logic is probably one of the most important tools for data mining. Data = an instant of reality and nature. Reality and nature are too complex to be fully explained by a binary system of belief. Fuzzy logic is an absolute necessity.
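The contrast with a binary system of belief can be made concrete with membership functions: instead of a value being either "low" or "high", it belongs to each category to a degree between 0 and 1. The sketch below is illustrative only; the category breakpoints are hypothetical and are not taken from the DJ Basin study.

```python
def falling_shoulder(x, full, zero):
    """Membership is 1 up to `full`, then falls linearly to 0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def rising_shoulder(x, zero, full):
    """Membership is 0 up to `zero`, then rises linearly to 1 at `full`."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


def triangular(x, left, peak, right):
    """Membership rises from `left` to 1 at `peak`, then falls to 0 at `right`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def viscosity_memberships(x):
    """Degrees of membership in three assumed viscosity categories."""
    return {
        "low": falling_shoulder(x, 20.0, 60.0),
        "medium": triangular(x, 20.0, 60.0, 100.0),
        "high": rising_shoulder(x, 60.0, 100.0),
    }


at_40 = viscosity_memberships(40.0)  # partly low, partly medium
at_70 = viscosity_memberships(70.0)  # mostly medium, partly high
```

A value of 40 is half "low" and half "medium" under these breakpoints, which is the kind of graded statement a crisp threshold cannot express.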

Predictive Data Mining: the key component. Neural model building module: data preparation module; Kohonen self-organizing network; back-propagation network; radial basis network; general regression network; genetic optimization module; fuzzy logic module. Integration: hybrid intelligent systems.

Application in DJ Basin. Data mining was applied to a stimulation and restimulation database for the DJ Basin. The original database for stimulation of Codell wells in the DJ Basin needed considerable preprocessing: removal of contaminated data; identification and management of outliers; identification and management of missing data.

Application in DJ Basin: Fuzzy Combinatorial Analysis (FCA). The analysis was performed for combinations of up to five features.

Rank  Feature               FCA Value
1     Flowback Volbbl       0
2     CO-Phi-H              0.5811
3     Bicarbonate ppm       0.6666
4     Peak Visc             0.7486
5     Lat                   0.7734
6     Orig20/40 Sand-Mlbs   0.9214
7     Long                  1.1
8     Refrac Date           1.1934
9     ViscShear 100-30Min   1.3324
10    TotHardness ppm       1.518
11    Calcium ppm           1.6692
12    AvgRate BPM           1.7415
13    Est-Ult-GOR           1.7706
14    No-NI-Perfs           1.7863
15    AvgPsi                1.8438
16    ViscShear 100-5Min    1.9401
17    Top CO Perf           1.9819
18    TDSolid ppm           2.0084
19    MMcf                  2.0777
20    Orig Fluid-Mgal       2.0855
21    DOFP                  2.2451
22    Frac Type             2.2848
23    No-CO-Perfs           2.303
24    Chloride ppm          2.3298
25    NI-Perfed-H           2.3302
26    Water phlab           2.3665
27    Pre-Refrac Mcfd       2.3956
28    Cum MMcf              2.4009
29    Water Source          2.4018
30    Iron ppm              2.4351
31    MGAL                  2.496
32    TotalPerfs            2.5045
33    Sulfate ppm           2.5164
34    New Perfs             2.552
35    Sodium ppm            2.6039
36    Magnesium ppm         2.6108
37    ViscShear 100-0Min    2.6649
38    Pre-FracISDP          2.7127
39    TestedPH              2.8066
40    Post-FracISDP         2.8256
41    Mlb20-40              2.8907
42    Communication         2.9554

No well logs or reservoir characteristics were present in the database.

Application in DJ Basin. A predictive model was developed based on the available data.

                          Training  Calibration  Verification
R-square                  0.783     0.821        0.516
Correlation coefficient   0.901     0.907        0.809
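The two statistics reported for the training, calibration, and verification sets can be computed as sketched below. The slides do not state which definition of R-square was used. This sketch assumes the coefficient of determination, 1 - SS_res / SS_tot, which need not equal the square of the correlation coefficient. That is one possible reason the reported R-square values are not simply the squares of the reported correlation coefficients. The sample numbers are hypothetical.

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def correlation_coefficient(actual, predicted):
    """Pearson correlation between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Hypothetical data: predictions follow the trend exactly but are biased by +0.5.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
r2 = r_squared(actual, predicted)
r = correlation_coefficient(actual, predicted)
```

In this example the correlation is perfect while R-square is lower, because correlation ignores a constant bias in the predictions and R-square penalizes it.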

Application in DJ Basin. Sensitivity analysis was performed on all wells based on the predictive model.

Variable                                   Distribution  Minimum  Maximum
Number of perforations in Codell (number)  Uniform       4        80
Original 20-40 sand pumped (Mlbs)          Uniform       85.5     600
Original fluid pumped (Mgal)               Uniform       44.5     147.6
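A sensitivity run of this kind can be sketched as random sampling of the three variables over the uniform ranges listed above, with each sample fed to the predictive model. The slides do not describe the sampling procedure, so the Monte Carlo loop is an assumption. The function `placeholder_model` is a hypothetical stand-in with arbitrary coefficients, not the neural network from the study.

```python
import random


def sample_inputs(rng):
    """Draw one input set from the uniform ranges given in the slides."""
    return {
        "codell_perforations": rng.randint(4, 80),     # count, so integer
        "orig_sand_mlbs": rng.uniform(85.5, 600.0),    # Mlbs
        "orig_fluid_mgal": rng.uniform(44.5, 147.6),   # Mgal
    }


def placeholder_model(inputs):
    """Hypothetical stand-in for the trained model; coefficients are arbitrary."""
    return (0.5 * inputs["codell_perforations"]
            + 0.1 * inputs["orig_sand_mlbs"]
            - 0.2 * inputs["orig_fluid_mgal"])


def run_sensitivity(model, n_samples=1000, seed=0):
    """Return a list of (inputs, predicted_output) pairs."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_samples):
        inputs = sample_inputs(rng)
        results.append((inputs, model(inputs)))
    return results


runs = run_sensitivity(placeholder_model)
outputs = [out for _, out in runs]
output_range = (min(outputs), max(outputs))
```

Plotting the predicted output against each sampled variable in turn would show which of the three has the strongest effect for a given well.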

Application in DJ Basin. The dominant trend implies that low-viscosity frac fluids are preferred to higher-viscosity fluids. This agrees with the trends identified in the analysis of the amount of proppant: for low proppant concentrations there is no need for high-viscosity fluids.

Conclusions. Data mining is gaining momentum in our industry. Commercial products that include all the integrated components mentioned here are not on the market at present. Their development is essential for our industry's future profitability.