Research on Business Intelligence in Enterprise Computing Environment




Lida Xu, Li Zeng, Zhongzhi Shi, Qing He, Maoguang Wang

Abstract: Business intelligence (BI) is the process of gathering the right information in the right manner at the right time, and delivering the right results to the right people for decision-making purposes, so that it can continue to yield real business benefits or have a positive impact on business strategy, tactics, and operations in the enterprise. This paper is intended as a short introduction to the study of business intelligence in the enterprise computing environment. In addition, the conclusions point out the challenges to broad and deep deployment of business intelligence systems, and offer proposals for making business intelligence more effective.

Keywords: Business Intelligence; Enterprise Intelligence Computing; Enterprise Information System

I. INTRODUCTION

The most common types of traditional information systems in the enterprise computing environment are e-commerce systems, management information systems, transaction processing systems, enterprise resource planning systems, and executive information systems. Together, these information systems help employees accomplish both routine and special tasks, from recording sales, to processing payrolls, to supporting decisions in various departments, to providing alternatives for large-scale projects. However, as businesses continue to use these systems for a growing number of functions in today's competitive world, most enterprises face the challenge of processing and analyzing huge amounts of data and turning it into profits.
They have much detailed operational data, yet cannot get the satisfying answers they need from large volumes of operational data to react quickly to changing circumstances, because the data are very likely distributed over many departments in the enterprise, or locked in a sluggish enterprise department. As a result, appropriate analyses of historical data are unavailable yet requisite to decision makers. To deliver the right information in the right format to the right people at the right time for decision-making purposes, the concept of business intelligence (BI) was introduced: a set of tools, technologies, or solutions designed to let users efficiently extract useful business information from oceans of routine data. The term business intelligence was first introduced by the Gartner Group in 1996 [1], and initially referred to tools and technologies including data warehouses, reporting, query, and analysis. Today's business intelligence is regarded as a very powerful solution, an extremely valuable tool, or a key approach to adding value to the enterprise. A fast-growing number of business sectors have deployed advanced business intelligence systems to enhance their competitiveness.

(This work was supported by the Outstanding Overseas Chinese Scholars Fund of the Chinese Academy of Sciences, the National Science Foundation of China (No. 60435010, 90604017, 60675010), the 863 National High-Tech Program (No. 2006AA01Z128), the National Basic Research Priorities Programme (No. 2003CB317004), and the Nature Science Foundation of Beijing (No. 4052025). L. D. Xu, L. Zeng, Z. Z. Shi, Q. He, and M. G. Wang are with the Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, and the Graduate University of the Chinese Academy of Sciences. L. D. Xu is also with the Department of Information Technology and Decision Sciences, Old Dominion University (e-mail: lxu@odu.edu). 1-4244-0991-8/07/$25.00 © 2007 IEEE.)

II. TECHNICAL FRAMEWORK IN ENTERPRISE COMPUTING ENVIRONMENT

The technology categories of a business intelligence system mainly encompass data warehousing or data marts, OLAP, and data mining [2]. More specifically, the data warehouse or data mart is the fundamental infrastructure of a business intelligence system, and data mining is its core component, allowing users to detect trends, identify patterns, and analyze data, while OLAP provides the set of front-end analysis tools. Constructing a benign enterprise intelligence computing environment should consider at least two points: one is correct, valid, integrated, and timely data; the other is the means to transform the data into decision information. However, neither satisfactory data nor effective means are easily acquired. A strong technical framework can be used to address these two questions. The framework consists of an operational applications tier, a data acquisition tier, a data warehouse tier, a platforms and enterprise BI suites tier, and an extended corporate performance management tier.

A data warehouse can be viewed as a database that holds business information such as sales, salaries, human resources, or other data from day-to-day operations, covering aspects of the company's processes, products, and customers, or other data sources in the enterprise. These data are poured into the data warehouse on a regular schedule, such as every night or every weekend; after that, management can perform complex queries and analysis on the information without slowing down the operational systems. The data warehouse provides users with a multi-dimensional view of the data they need to analyze business conditions. It is designed specifically to support management decision making, rather than to meet the needs of transaction processing systems. Data warehouses typically start out as very large databases, containing millions and even hundreds of millions of data records. To remain fresh and accurate, the data warehouse receives regular updates. Updating the data warehouse must be efficient, automated or semi-automated, and as fast as possible owing to the very large volume of data. It is common for a data warehouse to contain several years of current and historical data. Web warehousing [3] is the combination of data warehousing and World Wide Web technology. The Internet made it possible to apply web technology to traditional data warehousing, which resulted in improved cost savings and productivity. A document warehouse [4] is a novel kind of warehouse that stores data hidden in documents or in non-numeric formats, including extensive semantic information about documents and cross-document feature relations. A data mart is in fact a subset or a specialized version of a data warehouse. It contains a subset of the data for a single aspect of the company's business, such as finance, inventory, or personnel, instead of storing all enterprise data in one database.
A data warehouse is used for summary data that can be accessed by an entire enterprise, whereas data marts are helpful for small groups who want to access detailed data. Much like data warehouses, data marts typically contain tens of gigabytes of data; they can be deployed on less powerful hardware with smaller storage devices, and help business people strategize based on analyses of past trends and experiences. However, the key difference between a data warehouse and a data mart is that the creation of a data mart is predicated on a specific, predefined need for a certain grouping and configuration of selected data. A star schema consists of a fact table and a single de-normalized dimension table for each dimension of a data model. Furthermore, the dimension tables can be normalized into a snowflake schema, which can support attribute hierarchies. Since the data mart emphasizes easy access to relevant information, the star schema or multi-dimensional model is a fairly popular design choice, because it enables a relational database to emulate the structure and analytical functionality of a multi-dimensional database. The notion of OLAP was introduced by Codd [5] in his seminal 1993 paper, referring to techniques for performing complex analysis over the information stored in a data warehouse. These programs are used to store and deliver data warehouse information. Typical OLAP applications are business reporting for sales, marketing, management reporting, business process management, budgeting and forecasting, financial reporting, and similar areas. In general, OLAP applications are characterized by the rendering of enterprise data into multi-dimensional perspectives. This is achieved through complex queries that frequently aggregate and consolidate data, often using statistical formulae.
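The multi-dimensional aggregation behind such OLAP queries can be illustrated with a short sketch. The fact records, dimension names, and figures below are hypothetical, and a real OLAP engine would precompute such aggregates in cubes rather than scanning rows at query time:

```python
from collections import defaultdict

# Hypothetical fact records from a sales data mart: each row carries
# dimension values (region, product, quarter) and a measure (revenue).
facts = [
    {"region": "East", "product": "laptop", "quarter": "Q1", "revenue": 1200},
    {"region": "East", "product": "laptop", "quarter": "Q2", "revenue": 900},
    {"region": "West", "product": "phone",  "quarter": "Q1", "revenue": 700},
    {"region": "West", "product": "laptop", "quarter": "Q1", "revenue": 500},
]

def roll_up(rows, dims, measure="revenue"):
    """Aggregate the measure over the chosen dimensions (an OLAP roll-up)."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dims)
        totals[key] += row[measure]
    return dict(totals)

# Revenue by region (collapsing the product and quarter dimensions):
by_region = roll_up(facts, ["region"])
# Revenue by (region, quarter) - a finer-grained view of the same cube:
by_region_quarter = roll_up(facts, ["region", "quarter"])
```

Each call answers one "slice" of the multi-dimensional view; the point is that the same fact table serves arbitrary groupings of dimensions.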
OLAP provides quick answers to analytical queries that are dimensional in nature, and is part of the broader category of business intelligence, which also includes ETL and relational reporting. As a matter of fact, readers can easily gauge the limitations of the relational model by trying to answer such queries in a relational language such as SQL. Data mining is the process of identifying and interpreting patterns in data to solve a specific business problem. It is an information analysis tool that involves the automated discovery of patterns and relationships in a data source. Data mining makes use of advanced statistical techniques and machine learning to discover facts in data warehouses or data marts, including databases on the Internet. Unlike query tools, which require users to formulate and test a hypothesis, data mining uses analysis tools to automatically generate hypotheses about the patterns found in the data and then predict future behavior. The objective is to discover patterns, trends, and rules from data warehouses in order to evaluate business strategies, tactics, or operations, which in turn will improve the competitiveness and profits of enterprises and transform business processes. Business intelligence vendors like Oracle, Sybase, Tandem, and Red Brick Systems are all incorporating data mining functionality into their products. Data mining strategies for BI include classification, estimation, prediction, time series analysis, unsupervised clustering, and association analysis or market basket analysis.

III. COMPENDIUM OF BUSINESS INTELLIGENCE ALGORITHMS

Data mining algorithms are where the shoe pinches for business intelligence systems. Data mining strategies include classification, estimation, prediction, unsupervised clustering, and association analysis or market basket analysis. Supervised learning builds models by using input attributes to predict output attribute values. Many supervised data mining algorithms permit only a single output attribute; other supervised tools allow one or several output attributes to be specified. Output attributes are also known as dependent variables, as their outcome depends on the values of one or more input attributes; input attributes are referred to as independent variables. When learning is unsupervised, an output attribute does not exist, so all attributes used for model building are independent variables.

A. Market Basket Analysis

Association rule learning is also known as market basket analysis. Market basket analysis aims to determine the items likely to be purchased by a customer during a shopping experience. The output of market basket analysis is generally a set of associations about the customer's purchase behavior. These associations are given in the form of a special set of rules known as association rules, which are used to help determine appropriate product marketing strategies. Association rules are of the form A ⇒ B, where A is a set of n items that appear in a group together with a set B of m items in the same group. For example, among the customers who own both savings and checking accounts, a certain fraction will also own a certificate of deposit. However, association rules do not warrant inferences of causality; they may point to relationships among items or events that could be studied further using more appropriate analytical techniques to determine the structure and nature of whatever causalities may exist.
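In practice, such A ⇒ B rules are ranked by support (how often A and B occur together) and confidence (how often B occurs given A). The brute-force sketch below illustrates the idea on hypothetical account-ownership baskets; it enumerates only item pairs and omits the candidate pruning a real association-rule miner would use:

```python
from itertools import combinations

# Hypothetical baskets: account types held by four bank customers.
baskets = [
    {"savings", "checking", "credit_card"},
    {"savings", "checking"},
    {"savings", "cd"},
    {"savings", "checking", "cd"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in the itemset."""
    return sum(itemset <= b for b in baskets) / len(baskets)

def rules(baskets, min_support=0.5, min_confidence=0.7):
    """Emit (antecedent, consequent, confidence) for frequent item pairs.
    A brute-force sketch, not an optimized Apriori implementation."""
    items = set().union(*baskets)
    out = []
    for a, b in combinations(sorted(items), 2):
        pair = {a, b}
        if support(pair, baskets) < min_support:
            continue  # infrequent pair: skip both rule directions
        for ante, cons in ((a, b), (b, a)):
            conf = support(pair, baskets) / support({ante}, baskets)
            if conf >= min_confidence:
                out.append((ante, cons, conf))
    return out
```

On these baskets, checking ⇒ savings holds with confidence 1.0 while savings ⇒ checking holds only with confidence 0.75, showing that the two directions of a rule carry different strengths.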
Unlike traditional classification, association rule generators allow the consequent of a rule to contain one or several attribute values, whereas traditional classification rules usually limit the consequent to a single attribute; moreover, an attribute may appear as both the precondition and the consequent of different rules. However, when many attributes are present, naively generating association rules becomes unreasonable owing to the large number of possible conditions for the consequent of each rule. Candidate-generation-and-test algorithms such as Apriori [6] have been developed to generate association rules efficiently. Apriori association rule generation is often a two-step process: the first step generates frequent item sets, and the second step uses the generated item sets to create a set of association rules. Association rules are particularly popular because of their ability to find relationships in large databases without the restriction of choosing a single dependent variable. However, it is still important to minimize the work required by an association rule algorithm, since large volumes of data are often stored for market basket analysis.

B. Classification and Prediction

A classification algorithm, or classifier, is simply a model for predicting a categorical variable that assumes one of a predetermined set of values. These values can be either nominal or ordinal, though ordinal variables are typically treated the same as nominal ones in these models. When a problem is easy to classify but the boundary function is more complicated than it needs to be, the boundary is likely over-fitting. Analogously, when a problem is hard and the classifier is not powerful enough, the boundary becomes under-fitting. Classification describes the assignment of data records into predefined categories and discovers the relationship between the other variables and the target category.
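As a concrete illustration of this assignment of records to predefined categories, here is a minimal k-nearest-neighbor classifier; the customer records, feature names, and churn labels are invented for the example:

```python
import math

def knn_predict(train, record, k=3):
    """Classify `record` by majority vote among its k nearest
    labelled neighbors under Euclidean distance."""
    neighbors = sorted(train, key=lambda row: math.dist(row[0], record))[:k]
    votes = {}
    for _, label in neighbors:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Hypothetical labelled records: ((age, monthly_spend), category).
train = [
    ((25, 300), "churn"), ((30, 320), "churn"), ((28, 310), "churn"),
    ((55, 90), "stay"), ((60, 80), "stay"), ((58, 100), "stay"),
]
prediction = knn_predict(train, (27, 305))  # lands among the "churn" records
```

The same interface applies to any of the classifiers listed below: a model is fit to labelled records, then asked to assign a category to a new record.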
When a new record is input, the classifier determines the category to which the record belongs and the probability of that membership. Widely used classification algorithms include: linear classifiers (e.g., Fisher's linear discriminant, logistic regression, the naive Bayes classifier), quadratic classifiers, k-nearest neighbors, boosting, decision trees, neural networks, Bayesian networks, support vector machines [7], hidden Markov models, and so on. As a relatively new branch of data mining, support vector machines (SVMs) and kernel methods have been successfully applied in a variety of domains. The SVM is a promising method for classification and regression analysis due to its solid mathematical foundations, which yield two desirable properties: margin maximization and nonlinear classification using kernels. However, despite these two distinguishing properties, the SVM is usually not chosen for large-scale data mining problems because its training complexity is highly dependent on the size of the data set. Unlike traditional pattern recognition and machine learning, real-world data mining applications often involve huge numbers of data records, so it is too expensive to perform multiple scans over the entire data set, and it is also infeasible to load the whole data set into memory. In order to improve the performance of the traditional SVM on data sets with unbalanced class distributions, an improved SVM was presented: the genetic algorithm SVM (GA-SVM) was constructed by combining

the genetic algorithm and the simple support vector machine. The parameters of the SVM were coded into chromosomes with a gray coding strategy. Huang's results [8] indicate that GA-SVM can achieve higher classification accuracy and faster learning speed, and works well on a carefully constructed dataset.

Figure 1: An example of the wavelet-based clustering algorithm. (a) Original space; (b) WaveCluster results.

C. Clustering Analysis

Clustering analysis is a common technique used in many fields, including machine learning, data mining, pattern recognition, image analysis, bioinformatics, and market research. Clustering is a typical form of unsupervised learning that classifies similar objects into different groups or, more precisely, partitions a data set into clusters so that the data in each subset ideally share some common trait. In other words, clustering analysis is "the process of grouping the data into classes or clusters so that objects within a cluster have high similarity in comparison to one another, but are very dissimilar to objects in other clusters" [9]. Cluster analysis is a statistical process for identifying homogeneous groups of data objects. By clustering, one can identify dense and sparse regions and therefore discover overall distribution patterns and interesting correlations among data attributes. In business applications, clustering helps marketers discover distinct groups and characterize customer groups based on purchasing patterns. As a data mining function, cluster analysis can be used as a stand-alone tool to gain insight into the distribution of data, to observe the characteristics of each cluster, and to focus on a set of clusters for further analysis. Alternatively, it may serve as a preprocessing step for other algorithms, such as characterization and classification, which would then operate on the detected clusters.
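A minimal sketch of distance-based clustering, in the spirit of k-means (Lloyd's algorithm), can look like the following; the 2-D points are made up, and the deterministic seeding from the first k points is a simplification of the random or k-means++ seeding a real implementation would use:

```python
import math

def kmeans(points, k, iters=20):
    """Partition points into k clusters with Lloyd's algorithm.
    Seeds deterministically from the first k points for reproducibility."""
    centroids = list(points[:k])
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (keep the old centroid if the cluster is empty).
        centroids = [
            tuple(sum(coord) / len(c) for coord in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Two visually separated groups of 2-D points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = kmeans(points, 2)
```

On this toy data the algorithm converges within a few iterations to two centroids near (1.33, 1.33) and (8.33, 8.33), one per dense region, which is exactly the "dense and sparse regions" behavior described above.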
As a branch of statistics, cluster analysis has been studied extensively for many years, focusing mainly on distance-based cluster analysis. Cluster analysis tools based on k-means and several other methods have also been built into many statistical analysis packages and systems, such as S-Plus, SPSS, and SAS. Wavelet-based clustering (WaveCluster) [10] can be applied efficiently to detect clusters of arbitrary shape. A good clustering approach should be insensitive to noise, outliers, and the input order of the data; what is more, it should be efficient for both low-dimensional and high-dimensional large datasets. WaveCluster is a grid-based and density-based algorithm that uses the multi-resolution property of the wavelet transform. It can handle large datasets efficiently, identify arbitrarily shaped clusters at different degrees of detail, and, furthermore, perform very efficiently on very large databases; thus, this approach meets most of the desirable properties of a good clustering technique mentioned above. Figure 1 presents the clustering result produced by WaveCluster on an example of an arbitrarily shaped data distribution. From this, it is evident that WaveCluster is powerful in handling sophisticated patterns of any type while removing noise.

IV. APPLICATION, ISSUES AND FUTURE

As mentioned above, business intelligence can give users the ability to gain insight into a business or organization by understanding the company's information assets. These assets can include customer databases, supply chain information, personnel data, manufacturing data, and sales and marketing activity, as well as any other source of information critical to operations. It also allows users to integrate disparate data sources into a single coherent framework for real-time reporting and detailed analysis by anyone in the enterprise, such as customers, partners, employees, managers, and executives.
Several trends are dramatically driving the market need for better business intelligence tools: daily rising data volumes, geographically dispersed users, and complex existing tools. However, existing business intelligence systems still lack the maturity and breadth of deployment needed to meet business demands, the so-called BI gap identified by the Gartner Group. Broader deployment of business intelligence systems throughout enterprises will only occur if users can learn an application, deploy it, and manage it effectively. There are several reasons why business intelligence systems are difficult to deliver broadly in every enterprise. Very often, business intelligence systems take a long time to install, build, and deploy; the average implementation time for some larger BI solutions reaches about six months. Unfortunately, this timeframe is as long as the initial implementations of the transactional systems that the business intelligence system is expected to improve, and requirements and budgets often change over such a long installation and implementation cycle. Many business intelligence applications are still difficult to use. Most BI projects focus on the technical implementation, while adequate user training is often overlooked. As a result, poor end-user acceptance is an important reason why business intelligence systems are not deployed more broadly. In most cases, the business intelligence system has actually increased the workload, although it was originally conceived as a way to relieve workload through intuitive reporting and analysis. A business intelligence system intended to reduce costs and workload may thus actually increase both, which eventually limits wider deployment of business intelligence throughout the enterprise. Furthermore, cost versus benefit is also a question: after a rather lengthy and costly implementation, demands may have changed.
If the applications cannot demonstrate a return on investment in time, or few benefits are realized, the end users are likely to become disenchanted with business intelligence. A business intelligence system should be easy to use and understand, and should allow the user to evaluate alternatives, draw conclusions, and make decisions. The tools and techniques to access and analyze the information must be powerful yet easy to learn and use. This can only happen if the information is easy to understand, timely, and relevant to the user. In addition, users should have access to all the resources necessary to perform their jobs, limited only by security or authorization constraints. For true insight and effectiveness, understanding data across boundaries helps the user make the business more productive. The analysis tools of business intelligence should be powerful yet simple to learn, deploy, and maintain. These solutions need to be more flexible and adaptable to changes in the on-demand and competitive business environment. The diversity of business issues, requirements, data, tasks, and approaches poses many challenges for research in business intelligence. The design and construction of integrated data warehouses, the development of efficient and effective data mining algorithms for very large data, and so on, are important tasks for business intelligence researchers and BI application developers.

A. Vertical and customized BI systems

Since business intelligence is still a young field with wide and diverse applications, there is a nontrivial gap between general BI systems and domain-specific, effective BI solutions for particular applications. General BI systems can provide integrated and comprehensive platforms with strong robustness and interpretability to deal with business data.
However, domain-specific business intelligence solutions are usually more effective than general systems on other performance indicators, such as accuracy and scalability; such solutions are called vertical business intelligence systems. The differences are similar to those between general search engines and vertical search engines. There are some major areas of domain-specific business intelligence application: for example, the retail industry, the telecommunications industry, and the financial industry all provide rich sources for business intelligence applications. Correspondingly, vertical and domain-based data warehouses, customized multidimensional analysis, multidimensional association and sequential pattern analysis, fraudulent pattern analysis and identification of unusual patterns, similarity search and comparison, and visualization of the analysis process and results are some of the challenging issues in building vertical, mission-specific business intelligence systems.

B. Scalable, interactive and constraint-based BI progress

In contrast with traditional enterprise information systems, business intelligence systems must be able to handle huge amounts of data efficiently and, if possible, interactively. Furthermore, the business data being collected increase day after day. Hence, the scalability of business intelligence systems for individual and integrated enterprise data functions becomes more and more essential. One desired direction is to develop interactive OLAP and constraint-based mining algorithms to minimize the analysis cost, using constraints to guide the user in discovering outliers or other

interesting business patterns. Outliers or data exceptions often reflect potential problems or dangers in the management of an enterprise, and manual discovery is not a feasible or satisfactory approach. As a result, a constraint-based, interactive business intelligence process pushes several user-defined constraints into the cube analysis or mining, e.g., data constraints, level constraints, and outlier constraints, and thus results in efficient and reasonable processing.

C. Unified and Integrated BI framework

A desired framework for business intelligence systems is the tight coupling of databases, data warehouses, the WWW, OLAP, data mining, and other components. Transaction systems, query processing, on-line analytical processing, and on-line and off-line mining should be integrated into a unified framework in which data mining always serves as an essential component. With the rapid development of the Internet, the WWW will be another mainstream information processing source in addition to databases and data warehouses. No matter what or where the component is, the tight-coupling infrastructure will ensure data availability, data mining portability, high performance, and an integrated enterprise computing environment.

V. CONCLUSION

Although we are convinced that business intelligence is the way to go, it will still be quite a journey. The work here is only one piece of the related research. A business intelligence system should provide not only the capability to analyze what happened but, more importantly, to tell users what is going to happen in the enterprise, using intelligent information processing techniques. Business intelligence is expected to become close to the ubiquitous tools that people use every day, like familiar desktop applications. Once that goal is reached, we will have an intelligent business helper as well as a return on the investment.

REFERENCES

[1] Gartner Group, How Secure Is Your Business Intelligence Environment, 2002, pp. 1-6.
[2] L. Zeng, L. D. Xu, Z. Z. Shi, et al., "Techniques, Process, and Enterprise Solutions of Business Intelligence," in Proc. 2006 IEEE International Conference on Systems, Man, and Cybernetics, Taipei, 2006.
[3] X. Tan, D. C. Yen, X. Fang, "Web warehousing: Web technology meets data warehousing," Decision Support Systems, 2003, p. 25.
[4] F. S. C. Tseng, A. Y. H. Chou, "The concept of document warehousing for multi-dimensional modeling of textual-based business intelligence," Decision Support Systems 42 (2006), pp. 727-744.
[5] E. F. Codd, S. B. Codd, C. T. Salley, "Providing OLAP (on-line analytical processing) to user-analysts: an IT mandate," E. F. Codd and Associates, 1993.
[6] J. W. Han, M. Kamber, Data Mining: Concepts and Techniques. Morgan Kaufmann, 2001.
[7] R. Agrawal, T. Imielinski, A. Swami, "Mining association rules between sets of items in large databases," in Proc. 1993 ACM SIGMOD Int. Conf. on Management of Data (SIGMOD '93), pp. 207-216, Washington, DC, May 1993.
[8] H. Yu, J. Yang, J. W. Han, "Classifying large data sets using SVMs with hierarchical clusters," in Proc. Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, D.C., August 24-27, 2003.
[9] J. T. Huang, L. H. Ma, J. X. Qian, "Improved support vector machine for multi-class classification problems," Journal of Zhejiang University (Engineering Science), 2004, 38(12), pp. 1633-1636.
[10] G. Sheikholeslami, S. Chatterjee, A. Zhang, "WaveCluster: a multi-resolution clustering approach for very large spatial databases," in Proc. 24th International Conference on Very Large Data Bases, New York, pp. 428-439, 1998.