INTEGRATED ENVIRONMENT FOR STORING AND HANDLING INFORMATION IN TASKS OF INDUCTIVE MODELLING FOR BUSINESS INTELLIGENCE SYSTEMS



Similar documents
PERFORMANCE METRICS FOR THE IT SERVICES PORTFOLIO

Applying Multiple Neural Networks on Large Scale Data

Online Bagging and Boosting

Use of extrapolation to forecast the working capital in the mechanical engineering companies

Evaluating Inventory Management Performance: a Preliminary Desk-Simulation Study Based on IOC Model

International Journal of Management & Information Systems First Quarter 2012 Volume 16, Number 1

Analyzing Spatiotemporal Characteristics of Education Network Traffic with Flexible Multiscale Entropy

Research Article Performance Evaluation of Human Resource Outsourcing in Food Processing Enterprises

A framework for performance monitoring, load balancing, adaptive timeouts and quality of service in digital libraries

REQUIREMENTS FOR A COMPUTER SCIENCE CURRICULUM EMPHASIZING INFORMATION TECHNOLOGY SUBJECT AREA: CURRICULUM ISSUES

Standards and Protocols for the Collection and Dissemination of Graduating Student Initial Career Outcomes Information For Undergraduates

Fuzzy Sets in HR Management

Study on the development of statistical data on the European security technological and industrial base

Machine Learning Applications in Grid Computing

RECURSIVE DYNAMIC PROGRAMMING: HEURISTIC RULES, BOUNDING AND STATE SPACE REDUCTION. Henrik Kure

Performance Evaluation of Machine Learning Techniques using Software Cost Drivers

Software Quality Characteristics Tested For Mobile Application Development

Method of supply chain optimization in E-commerce

An Application Research on the Workflow-based Large-scale Hospital Information System Integration

CRM FACTORS ASSESSMENT USING ANALYTIC HIERARCHY PROCESS

Audio Engineering Society. Convention Paper. Presented at the 119th Convention 2005 October 7 10 New York, New York USA

Real Time Target Tracking with Binary Sensor Networks and Parallel Computing

Extended-Horizon Analysis of Pressure Sensitivities for Leak Detection in Water Distribution Networks: Application to the Barcelona Network

SOME APPLICATIONS OF FORECASTING Prof. Thomas B. Fomby Department of Economics Southern Methodist University May 2008

arxiv: v1 [math.pr] 9 May 2008

ASIC Design Project Management Supported by Multi Agent Simulation

This paper studies a rental firm that offers reusable products to price- and quality-of-service sensitive

AN ALGORITHM FOR REDUCING THE DIMENSION AND SIZE OF A SAMPLE FOR DATA EXPLORATION PROCEDURES

Managing Complex Network Operation with Predictive Analytics

Markovian inventory policy with application to the paper industry

An improved TF-IDF approach for text classification *

High Performance Chinese/English Mixed OCR with Character Level Language Identification

The Application of Bandwidth Optimization Technique in SLA Negotiation Process

Media Adaptation Framework in Biofeedback System for Stroke Patient Rehabilitation

An Improved Decision-making Model of Human Resource Outsourcing Based on Internet Collaboration

The Research of Measuring Approach and Energy Efficiency for Hadoop Periodic Jobs

Design of Model Reference Self Tuning Mechanism for PID like Fuzzy Controller

Physics 211: Lab Oscillations. Simple Harmonic Motion.

ADJUSTING FOR QUALITY CHANGE

Study on the development of statistical data on the European security technological and industrial base

Searching strategy for multi-target discovery in wireless networks

Preference-based Search and Multi-criteria Optimization

An Innovate Dynamic Load Balancing Algorithm Based on Task

AutoHelp. An 'Intelligent' Case-Based Help Desk Providing. Web-Based Support for EOSDIS Customers. A Concept and Proof-of-Concept Implementation

Resource Allocation in Wireless Networks with Multiple Relays

Construction Economics & Finance. Module 3 Lecture-1

Quality evaluation of the model-based forecasts of implied volatility index

Evaluating Software Quality of Vendors using Fuzzy Analytic Hierarchy Process

A CHAOS MODEL OF SUBHARMONIC OSCILLATIONS IN CURRENT MODE PWM BOOST CONVERTERS

Nonlinear Error Modeling of Reduced GPS/INS Vehicular Tracking Systems Using Fast Orthogonal Search

Airline Yield Management with Overbooking, Cancellations, and No-Shows JANAKIRAM SUBRAMANIAN

Implementation of Active Queue Management in a Combined Input and Output Queued Switch

6. Time (or Space) Series Analysis

Factor Model. Arbitrage Pricing Theory. Systematic Versus Non-Systematic Risk. Intuitive Argument

Local Area Network Management

ON SELF-ROUTING IN CLOS CONNECTION NETWORKS. BARRY G. DOUGLASS Electrical Engineering Department Texas A&M University College Station, TX

A Study on the Chain Restaurants Dynamic Negotiation Games of the Optimization of Joint Procurement of Food Materials

AUC Optimization vs. Error Rate Minimization

Energy Proportionality for Disk Storage Using Replication

Markov Models and Their Use for Calculations of Important Traffic Parameters of Contact Center

SAMPLING METHODS LEARNING OBJECTIVES

The Velocities of Gas Molecules

2. FINDING A SOLUTION

Red Hat Enterprise Linux: Creating a Scalable Open Source Storage Infrastructure

Workflow Management in Cloud Computing

An Integrated Approach for Monitoring Service Level Parameters of Software-Defined Networking

Partitioned Elias-Fano Indexes

Pricing Asian Options using Monte Carlo Methods

Leak detection in open water channels

Presentation Safety Legislation and Standards

Option B: Credit Card Processing

Efficient Key Management for Secure Group Communications with Bursty Behavior

A Fast Algorithm for Online Placement and Reorganization of Replicated Data

CLOSED-LOOP SUPPLY CHAIN NETWORK OPTIMIZATION FOR HONG KONG CARTRIDGE RECYCLING INDUSTRY

PREDICTION OF POSSIBLE CONGESTIONS IN SLA CREATION PROCESS

Generating Certification Authority Authenticated Public Keys in Ad Hoc Networks

Sensors as a Service Oriented Architecture: Middleware for Sensor Networks

Mathematical Model for Glucose-Insulin Regulatory System of Diabetes Mellitus

Introduction to the Microsoft Sync Framework. Michael Clark Development Manager Microsoft

Part C. Property and Casualty Insurance Companies

An online sulfur monitoring system can improve process balance sheets

Gaussian Processes for Regression: A Quick Introduction

Identification and Analysis of hard disk drive in digital forensic

Factored Models for Probabilistic Modal Logic

Exploiting Hardware Heterogeneity within the Same Instance Type of Amazon EC2

CPU Animation. Introduction. CPU skinning. CPUSkin Scalar:

Optimal Resource-Constraint Project Scheduling with Overlapping Modes

Transcription:

Artificial Intelligence Methods and Techniques for Business and Engineering Applications 210 INTEGRATED ENVIRONMENT FOR STORING AND HANDLING INFORMATION IN TASKS OF INDUCTIVE MODELLING FOR BUSINESS INTELLIGENCE SYSTEMS Nataliya Shcherbakova, Volodyyr Stepashko Abstract: Inductive odelling tools are widely used for solving probles of analysing econoical, ecological, and other processes. Developent of business intelligence systes based on inductive odelling algoriths for analysis, odelling, forecasting, classification, and clustering of coplex processes is very proising. When solving real tasks of odel construction fro statistical data, the question of storage of and providing effective access to the inforation arises. At the stage of input data processing there are typical difficulties with processing data in different forats as well as containing oissions and untypically sall or big values etc. Fro the other side, the question of output inforation storage exists like deterination of structure and paraeters of odels, estiation of precision and validity, plots and diagras drawing etc. This would allow structuring input data of different types and using the inforation already existing in database and also provide the storage of coplete inforation on experients and results of calculations. To solve such kind of probles, the integrated environent for storing and handling inforation is developed. Architecture of the environent is offered giving the possibilities to anipulate present inforation freely using relational database containing only etadata and storing input statistical data and output results of calculations. Keywords: integrated environent, handling and storing inforation, inductive odeling, GMDH-algoriths, Business Intelligence ACM Classification Keywords: H.2.8 Data Base Application Data Mining Introduction The nuber of copanies which use business intelligence systes in their work is growing every year, so the developent of such systes is continued; it is increasing constantly the nuber of systes, their functionality and technologies used for data processing and direct data analysis. Business intelligence systes are usually focused on

211 ITHEA a specific task or function of a fir, such as analysis and forecast of sales, financial services, forecasting and risk analysis, trend analysis etc. Characteristic features of up-todate BI systes are odularity, distributed architecture, the ost coon support and aintaining standards in web. At present, the developent of systes aied at business data analysis using OLAP, data ining and other. Growing range of algoriths is based on achine learning algoriths. Autonoous data ining tools are often included in other business analysis tools such as expanding database. Analysis of algoriths used in business intelligence solutions Business intelligence [Businessdictionary] refers to coputer-based techniques used in spotting, digging-out, and analysing business data, such as sales revenue by products or departents or associated costs and incoes. But broader business intelligence can be characterized: firstly as process of converting data into inforation and knowledge for business decision support, secondly as inforation technology for data saving, inforation consolidating and guaranteeing access business users to knowledge, thirdly knowledge business gained as a result of data analysis and consolidation of inforation [Power, 2008]. Business intelligence solutions are widely used and have very diverse architecture, so there is no single specification of what those systes should be. Objectives of a business intelligence exercise include understanding of a fir's internal and external strengths and weaknesses, understanding the relationship between different data for better decision aking, detection of opportunities for innovation, and cost reduction and optial deployent of resources. For the past 10 years, the content and nae of inforation-analytical systes have changed fro inforation systes for anager, to decision support systes and business intelligence systes, second and third generation now. Business intelligence 2.0 is a new tools and software for business intelligence, beginning in the iddle of 2000s, which enable, aong other things, dynaic querying of real-tie, web-based approached to data, as opposed to the proprietary querying tools that had characterized previous business intelligence software [Wikipedia]. Business Intelligence 3.0 is a ter that refers to new tools and software for business intelligence, which enable contextual discovery and ore collaborative decision aking [Wikipedia]. Business intelligence tools is typically divided into the following categories: spreadsheets, reporting and querying software, OLAP, digital dashboards, decision engineering, process ining, business perforance anageent, local inforation systes [Wikipedia]. Consider each of these groups in ore detail.

Artificial Intelligence Methods and Techniques for Business and Engineering Applications 212 Data ining is the process of extracting patterns fro data; it is becoing an increasingly iportant tool to transfor this data into inforation. It is coonly used in a wide range of profiling practices, such as arketing, surveillance, fraud detection and scientific discovery. DM can be used to uncover patterns in data but is often carried out only on saples of data. The ining process will be ineffective if the saples are not a good representation of the larger body of data. DM cannot discover patterns that ay be present in a larger data body if those patterns are not present in a saple being "ined". Inability to find patterns ay becoe a cause for soe disputes between custoers and service providers. Therefore data ining is not a proof fool but ay be useful if sufficiently representative data saples are collected. The discovery of a particular pattern in a particular data set does not necessarily ean that a pattern is found elsewhere in the larger data fro which that saple was drawn. An iportant part of the process is the verification and validation of patterns on other data saples. Process ining is a process anageent technique that allow for the analysis of business processes based on event logs. The basic idea is to extract knowledge fro event logs recorded by an inforation syste. Process ining ais at iproving this by providing techniques and tools for discovering process, control, data, organizational, and social structures fro event logs. Process ining techniques are often used when no foral description of the process can be obtained by other eans, or when the quality of an existing docuentation is questionable. For exaple, the audit trails of a workflow anageent syste, the transaction logs of an enterprise resource planning syste, and the electronic patient records in a hospital can be used to discover odels describing processes, organizations, and products. Moreover, such event logs can also be used to copare event logs with soe a priori odel to see whether the observed reality confors to soe prescriptive or descriptive odel. Business perforance anageent is a set of anageent and analytic processes that enable the perforance of an organization to be anaged with a view to achieving one or ore pre-selected goals. Local inforation systes (LIS) are a niche for of inforation syste - indeed they can be categorized as Business intelligence tools designed priarily to support geographic reporting. They also overlap with soe capabilities of Geographic Inforation Systes although their priary function is the reporting of statistical data rather than the analysis of geospatial data. LIS also tend to offer soe coon Knowledge Manageent type functionality for storage and retrieval of unstructured data such as docuents. They deliver functionality to load, store, analyse and present statistical data that has a strong geographic reference.

213 ITHEA Table 1. Most popular data ining algoriths offered by three different BI solutions Pentaho [Pentaho] Microsoft BI [Microsoft] Oracle BI [Oracle] Decision Tree (DT) Classifiers Predicting a discrete or continuous attribute, Finding groups of coon ites in transactions Classification Linear Regression (LR) Classifiers Predicting a continuous attribute Naive Bayes (NB) Classifiers Predicting a discrete attribute Classification Clustering Clusterers Predicting a discrete attribute, Finding groups of siilar ites Association Rules (AR) Classifiers Finding groups of coon ites in transactions Sequence Clustering (SC) Forecasting Predicting a sequence Tie Series (TS) Forecasting Predicting a continuous attribute Neural Network (NN) Forecasting Predicting a continuous attribute Support Vector achine (SVM) Classifiers Classification and Regression One Class Support Anoaly Detection Vector Machine (One- Class SVM) Generalized Linear Models (GLM) Classification and Regression Miniu Description Attribute Iportance Length (MDL) Apriori (AP) Association Association k-means (KM) Clusterers Clustering Orthogonal Partitioning Clustering (O-Cluster or OC) Nonnegative Matrix Factorization (NMF) Clustering Feature Extraction Business intelligence systes ipleent often classical data ining algoriths [Thoas, 2003]. Classification algoriths predict one or ore discrete variables based on the other attributes in the dataset. Regression algoriths predict one or ore continuous variables such as profit or loss on the bases of other attributes in the dataset. Segentation algoriths divide data into groups or clusters of ites that have siilar properties. Association algoriths find correlations between different attributes in a dataset. The ost coon application of this kind of algorith is for creating

Artificial Intelligence Methods and Techniques for Business and Engineering Applications 214 association rules can be used in a arket basket analysis. Sequence analysis algoriths suarize frequent sequences or episodes in data such as a Web path flow. Furtherore, ost business intelligence systes ipleent provisional data algoriths allowing eliinate gaps, specifically sall or large data and convert output data to a required forat. Classical algoriths are ostly ipleented as standard library for use in the systes being developed. We have exained ain characteristics of the ost popular business intelligence systes, naely the three different software packages: Pentaho, Microsoft BI, and Oracle BI. Table 1 shows the ost known data ining algoriths offered by these three different BI solutions. It should be noted that Weka (Pentaho Data Mining) has near 100 classification schees. Prospects of inductive odelling algoriths usage in Business Intelligence solutions Algoriths of inductive odelling are applicable for solving real-world odelling tasks in econoical, ecological, and other processes [Ivakhnenko, 1985], [Ivakhnenko, 1982]. They are widely used in ill-defined systes developed for solving a specific proble. Below we exaine a question of possibility of their use in business analytics; naely, which algoriths of inductive odelling and for what purposes can be used in business intelligence solutions. The article [Ivakhnenko, 1968] published in 1968 by Prof. O.G. Ivakhnenko has arked the beginning of the new scientific direction called "inductive self-organizing of odels fro experiental data" or siply "inductive odelling" [Stepashko, 2008]. Group ethod of data handling (GMDH) is a faily of inductive algoriths for coputerbased atheatical odelling of ulti-paraetric datasets that features fully-autoatic structural and paraetric optiization of odels. GMDH as personification of the inductive approach is an original ethod for constructing odels fro experiental data under uncertainty conditions. Models of optial coplexity obtained by this ethod reflect unknown laws of functioning of an object (process) inforation about which is iplicitly contained in a data saple. To build odels autoatically, GMDH applies the principles of variants generation, nonterinal decisions and successive selection of the best odels according to external criteria. These criteria are based on dividing the data set into two parts, where the tasks of paraeter estiation and odel checking are ipleented in various subsets. GMDH algoriths are characterized by an inductive procedure that perfors sorting-out of gradually coplicated polynoial odels and selecting the best solution using the external criteria.

215 ITHEA A GMDH odel with ultiple inputs and one output is a subset of coponents of the base function: Y ( 1 0 i i 1 i 1 Y x,..., x ) a a f ( x,..., x ), (1) where f are eleentary functions of different sets of inputs, a are coefficients and is the nuber of the base function coponents. To find the best solution, GMDH algoriths consider various coponent subsets of the base function (1) called partial odels. Coefficients of these odels are estiated by the least squares ethod. GMDH algorith gradually increases the partial odel coplexity and finds an optial odel structure and paraeters indicated by the iniu value of an external criterion. This process is called self-organization of odels. The ost coon base function used in GMDH is the Kologorov-Gabor polynoial: i 1 i 1 j i Y ( x1,..., x ) a0 ai x i aij x i x j aijk x i x j xk... (2) i 1 j i k j GMDH is used in such fields as data ining, knowledge discovery, prediction, coplex systes odelling, optiization and pattern recognition. Application of GMDH algoriths for solving forecasting tasks allows their use in business intelligence systes along with other data ining algoriths. Aong GMDH algoriths that have ore widespread gained, we can indicate the following [Ivachnenko, 2007]: Cobinatorial (COMBI), Multilayered Iterative (MIA), GN, Objective Syste Analysis (OSA), Haronical, Two-level (ARIMAD), Multiplicative- Additive (MAA), Objective Coputer Clusterization (OCC), Pointing Finger (PF) Clusterization Algorith, Analogues Coplexing (AC), Haronical Rediscretization, Algorith on the base of ulti-layered Theory of Statistical Decisions (MTSD), Group of Adaptive Models Evolution (GAME). It should be noted that algoriths proposed by the ost popular BI solutions are used for tasks of classification and clustering, ostly. For the nuerical forecasting, developers offer: neural networks, linear regression and odel trees. Such tools are less winning in forecasting tasks than GMDH algoriths. Requests for functionality of business intelligence syste today lie in expanding their capacity of data ining. Other subsystes of business intelligence, such as data integration, iport-export and reporting tools are developed very quickly. Thus the direction of business intelligence systes developent using GMDH-based forecasting algoriths is very proising. Attention should also be drawn to the developent of inductive odelling algoriths for classification and

Artificial Intelligence Methods and Techniques for Business and Engineering Applications 216 clustering. Effective using such algoriths for odelling of coplex systes is also an iportant arguent for applying the in business intelligence systes. Given the above we are developing our own syste in which we use GMDH algoriths for odelling coplex systes (processes). A necessity in accessible storage and drawing on scientific researches ripened already a long ago. An integrated environent for inforation storage would help to solve the existing probles, allowing structuring input data of different types and sources already existing in a data base, and also providing storage of coplete inforation on experients and results of calculations. Integrated environent for storing and handling inforation In [Shcherbakova, 2008] an architecture of integrated environent of handling and storing inforation in the tasks of inductive odelling was proposed, which allows freely anipulate the available inforation through building a layout environent which consists of relational database [Christopher, 1998], [Date, 2004], [Thoas, 2003] containing only eta-data and XML (PMML) storage, in which input data and results of calculations are stored [Graves, 2002]. Proposed layout of the environent is intended for solving probles of storage of input data and results. Designed architecture of the integrated environent for storing and handling inforation in the tasks of inductive odelling (Fig. 1) provides the opportunity to develop software. Modular syste architecture akes it possible to expand its functionality. The ain requireents to the syste is the ability to iport (including priary processing) and export data, storing and handling existing inforation, storing output data with all inforation of calculation results, generate reports on the results. It should be noted that results of calculations are stored in the syste in a standardized for that will allow generating strictly foral reports on the results of calculations. Let us consider in ore detail what inforation one needs to save in the syste. Firstly, as already discussed above, these are input statistical data given to a single forat and processed data with eliinated oissions and/or atypical values etc. Secondly there are basic functions, generated odels, estiates of the paraeters, criteria of quality odels and best odels. All the inforation is better to store in an XML storage. Auxiliary inforation such as data on the user, date and tie use of files etc. it is better to store in a relational database. Below we consider existing forats for saving predicting odels and their use in our case. Predictive Model Markup Language (PMML) is an XML dialect used to describe statistical odels and odels of data ining. Its ain advantage is that PMML-copliant

217 ITHEA applications can easily exchange odels with other PMML-tools. The following classes of odels can be kept using this arkup language: associative rules, decision trees, centerbased and distribution-based clustering, regression, general regression, neural networks, Bayes nets, sequences, text odels, tie series, rulesets, trees, support vectors. for storage of predicting odels, a schee can be used in our case which describes a regression function. Regression functions are used to deterine the relationship between the dependent variable (target area) and one or ore independent variables. The ter regression usually refers to the prediction of nueric values, hence the PMML eleent RegressionModel can also be used for classification. This is due to the fact that ultiple regression equations can be cobined in order to predict categorical values. Fig. 1 Architecture of an integrated environent for inforation storing and handling The offered integrated environent is intended for working out probles of storage of input statistical data and handling results. Developed architecture of the syste of inforation storage in the tasks of inductive odelling gives a possibility to develop a software syste that will allow freely anipulating the available inforation and adding new one.

Artificial Intelligence Methods and Techniques for Business and Engineering Applications 218 Conclusion This paper presents project of an integrated environent for storing and handling inforation in tasks of inductive odelling based on algoriths of the Group Method of Data Handling. Such type of syste can be applied to solve soe tasks of business intelligence, including forecasting, classification and clustering. Applying GMDH algoriths in business intelligence systes gives a particularly proising opportunity towards building coplex odels for business data analysis. Bibliography [Businessdictionary] http://www.businessdictionary.co/ [Power, 2008] Power D.J.: A Brief History of Decision Support Systes, version 4.0. DSSResources.COM. Retrieved, 2008. [Wikipedia] http://en.wikipedia.org/wiki/business_intelligence_2.0/ [Wikipedia] http://en.wikipedia.org/wiki/business_intelligence_3.0/ [Wikipedia] http://en.wikipedia.org/ [Thoas, 2003] Thoas K., Karely B.: Databases. Design, realization and accopanient. Theory and practice. М.: Willia, 1440 p, 2003. [Pentaho] http://wiki.pentaho.co/ - Pentaho docuentation. [Oracle] http://sdn.icrosoft.co/ - MSDN. [Microsoft] http://docs.oracle.co/ - Oracle docuentation. [Ivakhnenko, 1985] Ivakhnenko A.G., Stepashko V.S.: Noise-iunity of odelling. Kiev: Naukova Duka, 216 p, 1985. [Ivakhnenko, 1982] Ivakhnenko A.G.: Inductive ethod of self organization of odels of coplex systes. Kiev: Naukova Duka, 216 p, 1982. [Ivakhnenko, 1968] Ivakhnenko O.G.: Group ethod of data handling - rival of ethod of stochastic approxiation. //Autoatic 3. Kiev, pp 58-72, 1968. [Stepashko, 2008] Stepashko V.S.: Theoretical aspects of GMDH as a ethod of inductive odelling. Proceedings of the II International Conference on Inductive Modelling ICIM-2008, 15-19 Septeber 2008, Kyiv, Ukraine. Kyiv: IRTC ITS NANU, pp 9-16, 2008. [Shcherbakova, 2008] Shcherbakova N. and Stepashko V. Integrated Environent for Inforation Handling and Storage in the Tasks of Inductive Modeling. Proceedings of the II International Conference on Inductive Modelling ICIM-2008, 15-19 Septeber 2008, Kyiv, Ukraine. Kyiv: IRTC ITS NANU, pp 231-235, 2008. [Ivakhnenko, 2007] http://www.gdh.net/ [Christopher, 1998] Christopher J. Data: Introduction to databases systes. К.: BHV, 608 p, 1998. [Date, 2004] Date C.J.: An Introduction to Database Systes, Eighth Edition. USA: Addison-Wesley, 1024 p, 2004. [Thoas, 2003] Thoas K., Karely B.: Databases. Design, realization and accopanient. Theory and practice. М.: Willia, 1440 p, 2003. [Graves, 2002] Graves М.: Designing XML Databases. M: «Vil'yas» Publishing house, 640 p, 2002. http://en.wikipedia.org/wiki/predictive_model_markup_language/

219 ITHEA Authors' Inforation Nataliya Shchrebakova PhD student of IRTC ITS of NASU, P.A.: 40, Akadeik Glushkov Prospect, Kyiv, Ukraine, 03680; e-ail: nataliya.shcherbakova@gail.co Main Fields of Scientific Research: Inforation technologies of inductive odelling, Business Intelligence solutions Volodyyr Stepashko Head of Departent for Inforation Technologies of Inductive Modeling of IRTC ITS, Professor, Dr Sci, P.A.: 40, Akadeik Glushkov Prospect, Kyiv, Ukraine, 03680; e-ail: stepashko@irtc.org.ua Main Fields of Scientific Research: Data analysis ethods and systes, Knowledge discovery, Inforation technologies of inductive odelling, Group ethod of data handling (GMDH)