Adaptive Business Intelligence Zbigniew Michalewicz 1
Business Intelligence What is Business Intelligence? Business Intelligence is a collection of tools, methods, technologies, and processes needed to transform data into knowledge. What should I do? 2
Business Intelligence Although Business Intelligence can be used to: Increase profitability, Decrease costs, Improve customer relationship management, Decrease risk, 3
Business Intelligence most companies use it to answer basic queries: How many customers do I have? During the past 12 months, how many products were sold in each region? Who are my 20 best customers? 4
The famous pyramid KNOWLEDGE INFORMATION DATA 5
Data, information, knowledge
Data: a collection of raw value elements or facts used for calculating, reasoning, measuring, etc.
Information: the result of collecting and organizing data in a way that establishes relationships between data items.
Knowledge: the concept of understanding information based on recognized patterns.
Knowledge is power!
Observation Discovered knowledge is of little value if there is no value producing action that can be taken as a consequence of gaining that knowledge. Example: 37% of our customers live on the East Coast. So what? 7
What do others think? PricewaterhouseCoopers Global Data Management Survey of 2001: Companies that manage their data as a strategic resource and invest in its quality are already pulling ahead in terms of reputation and profitability. Data should be treated as a strategic resource.
What do others think? Pacific Crest Equities, 2006: Increasingly you are seeing applications being developed that will result in some sort of action. It is a relatively small part now, but it is clearly where the future [of business intelligence] is. 9
What do others think? Jim Goodnight, CEO, SAS, 2007: Until recently, business intelligence was limited to basic query and reporting, and it never really provided that much intelligence.
What do others think? Jim Davis, VP Marketing, SAS, 2007: In the next three to five years, we'll reach a tipping point where more organizations will be using BI to focus on how to optimize processes and influence the bottom line.
What is intelligence? Three major ingredients: Ability to predict Ability to optimise Ability to adapt 12
Prediction Evolutionary Programming aimed at achieving intelligence (L. Fogel 1966) Intelligence was viewed as adaptive behaviour Prediction of the environment was considered a prerequisite to adaptive behaviour Thus: capability to predict is key to intelligence (L. Fogel 1966) 13
Smart decisions Expert systems Games Search techniques etc. 14
Adaptive products Adaptive products are the way of the future: Car transmissions TV Shoes AI, in general 15
Basic observation Businesses and government agencies are interested in two fundamental things: Knowing what will happen next (prediction); and Making the best decision under risk and uncertainty (optimisation). The goal is to provide AI-based solutions for modelling, simulation, and optimisation to address these two fundamental needs. 16
Adaptive Business Intelligence [Diagram labels: DATA, INFORMATION, KNOWLEDGE, PREDICTION, OPTIMISATION, DECISION, OUTCOME]
Technology Platforms
SolveIT Optimisation Platform: Classic OR methods, and: Evolutionary Algorithms, Swarm Intelligence, Simulated Annealing, Tabu Search, Co-Evolutionary Systems, Ant Systems
SolveIT Prediction Platform: Classic forecasting methods, and: Neural Networks, Fuzzy Systems, Genetic Programming, Agent-Based Systems, Data Mining Techniques, Rough Sets
ABI Example #1 A major U.S. automaker sells 1.2 million off-lease cars each year on various auction sites. Each day, a remarketing team uses business intelligence tools and reports to decide where to ship 4,000 to 7,000 off-lease cars. The problem is impacted by demand, depreciation, transportation schedules, cost of capital, risk, changes in market conditions, and the volume effect.
Car Distribution System 20
Planning & Scheduling Optimisation Manufacturing production: 21
Planning & Scheduling/ Predictive Modelling Media Allocation (multiobjective): 22
Some research issues 23
Issues Precise models of a problem Robustness of solutions Return of several solutions Time changing environments Handling constraints Large (and complex) search spaces 24
Models of a problem Problem => Model => Solution Problem-solving is a two-step process: (1) Building a model of a problem, and (2) Solving the model 25
Cost functions [Figure; axes: cost, amount]
Robustness of solutions It is important to minimize undesirable changes required by unforeseen events. [Figure; axes: quality, solution]
Return of several solutions Evolutionary algorithms can be structured to (1) give diverse near-optimal solutions and (2) deal with tradeoffs present in multiobjective problems. [Figure; axes: cost, time]
Size of search space Assume we deal with the following problem: optimize $f(x_1, x_2, \ldots, x_{100})$, where $f$ is very complex and each $x_i$ is 0 or 1. The size of the search space is $2^{100} \approx 10^{30}$. Exhaustive search is out of the question!
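The arithmetic behind this estimate can be checked directly. The sketch below is only an illustration; the evaluation rate of one billion candidates per second is an assumed figure, not something stated in the slides.

```python
import math

n_variables = 100
search_space_size = 2 ** n_variables        # exact integer count of bit strings
magnitude = math.log10(search_space_size)   # roughly 30, i.e. about 10^30 points

# Assumed (hypothetical) evaluation rate, chosen only for illustration.
evaluations_per_second = 1e9
seconds_per_year = 365 * 24 * 3600
years_needed = search_space_size / evaluations_per_second / seconds_per_year
```

Even at this optimistic rate, enumerating every candidate would take on the order of tens of trillions of years.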
Optimisation problem Optimize: $f(x, y) = 100(x^2 - y)^2 + (1 - x)^2$, where $-2.048 \le x, y \le 2.048$ (Rosenbrock's function, F2)
Optimisation problem What would happen, if we have additional constraints? E.g., $x \le \log(y + 3)$ and $\sin(x) \le 3y^2 + 1$
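A minimal sketch of this test function, assuming the usual form of Rosenbrock's function, 100(x^2 - y)^2 + (1 - x)^2, which has its minimum value 0 at (1, 1). The names `rosenbrock` and `in_bounds` are illustrative only.

```python
LOWER, UPPER = -2.048, 2.048   # box bounds given on the slide

def rosenbrock(x, y):
    """Rosenbrock's function (F2) in two variables; to be minimised."""
    return 100.0 * (x * x - y) ** 2 + (1.0 - x) ** 2

def in_bounds(x, y):
    """True when both variables lie inside the box [-2.048, 2.048]."""
    return LOWER <= x <= UPPER and LOWER <= y <= UPPER
```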
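One way to work with such constraints is to rewrite each as g(x, y) <= 0 and measure by how much it is violated. The sketch below assumes the two example constraints read x <= log(y + 3) and sin(x) <= 3y^2 + 1, with log taken as the natural logarithm; the function names are hypothetical.

```python
import math

def constraint_violations(x, y):
    """Return the amount by which each example constraint is violated (0 if satisfied)."""
    g1 = x - math.log(y + 3.0)              # x <= log(y + 3), natural log assumed
    g2 = math.sin(x) - (3.0 * y * y + 1.0)  # sin(x) <= 3y^2 + 1
    return (max(0.0, g1), max(0.0, g2))

def is_feasible(x, y):
    """A point is feasible when no constraint is violated."""
    return all(v == 0.0 for v in constraint_violations(x, y))
```

Under this reading, the unconstrained optimum (1, 1) of Rosenbrock's function remains feasible, while a point such as (2, 0) does not.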
Search space 32
Main question Should we consider infeasible individuals harmful and eliminate them from the population? YES: easy implementation, low efficiency NO: many issues to consider; usually much better results! 33
Further questions: If we keep infeasible individuals in the population, we have to address several issues:
- How to compare two feasible individuals?
- How to compare two infeasible individuals?
- How to compare a feasible individual with an infeasible one?
- Should we penalize infeasible individuals?
- Should we repair infeasible individuals?
- Should we use specialized operators which produce feasible individuals only?
- Should we use decoders?
- Should we concentrate on the boundary between feasible and infeasible areas of the search space?
Penalties General idea: Eval(x) = f(x) + W*penalty(x) Should we keep W constant? Should we increase W together with generations? Should we use some adaptive mechanism which influences the value of W on the basis of the feedback from the search? Should we include the value of W as a 35
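The general form Eval(x) = f(x) + W*penalty(x) can be sketched as follows, assuming a minimisation problem. The toy objective, the constraint x >= 1, and the linear growth schedule for W are all assumptions made for illustration, not the method from the slides.

```python
def evaluate_with_penalty(f, penalty, x, weight):
    """Eval(x) = f(x) + W * penalty(x), for a minimisation problem."""
    return f(x) + weight * penalty(x)

def increasing_weight(generation, base=1.0, growth=0.5):
    """Hypothetical schedule: W grows linearly with the generation number."""
    return base * (1.0 + growth * generation)

# Toy problem (assumed): minimise x^2 subject to x >= 1.
def objective(x):
    return x * x

def violation(x):
    return max(0.0, 1.0 - x)   # zero whenever the constraint holds

static_score = evaluate_with_penalty(objective, violation, 0.5, weight=10.0)
late_score = evaluate_with_penalty(objective, violation, 0.5, increasing_weight(20))
```

With a constant W the same infeasible point always receives the same score; with an increasing W it is tolerated early in the run and punished more heavily later.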
Repairs General idea: Transform an infeasible x into a feasible x' by applying some problem-specific algorithm. Should we repair for evaluation purposes only (the so-called Baldwin effect)? Should we replace the original individual x by its repaired version x' (so-called Lamarckian evolution)? Are there any other possibilities?
Specialized operators Genocop 3.0: an experimental system that takes an arbitrary objective function (continuous variables) and a set of linear constraints to produce the optimal solution. System available from www.cs.adelaide.edu/~zbyszek
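The difference between the two repair policies can be sketched as below. The repair rule (clipping each variable into the box [0, 1]) and the objective are assumptions chosen to keep the example small; a real repair procedure is problem-specific.

```python
def repair(x, lower=0.0, upper=1.0):
    """Hypothetical repair: clip every variable into the feasible box."""
    return [min(max(v, lower), upper) for v in x]

def objective(x):
    return sum(v * v for v in x)

def evaluate_baldwinian(x):
    """Score the repaired version, but keep the original individual."""
    return x, objective(repair(x))

def evaluate_lamarckian(x):
    """Score the repaired version and let it replace the original."""
    repaired = repair(x)
    return repaired, objective(repaired)

individual = [1.5, -0.5, 0.25]
kept, score_b = evaluate_baldwinian(individual)
replaced, score_l = evaluate_lamarckian(individual)
```

Both policies assign the same score; they differ only in which individual continues in the population.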
Decoders General idea: [Figure labels: the original space, encoded space]
Decoders Genocop V: universal tool for nonlinear optimization problems with nonlinear constraints! The system accepts an arbitrary function (continuous variables) and any number of nonlinear constraints. System available from www.cs.adelaide.edu/~zbyszek 39
Andy Keane's function $G2(x) = \left(\sum_i \cos^4(x_i) - 2 \prod_i \cos^2(x_i)\right) / \sqrt{\sum_i i\, x_i^2}$, where $0 \le x_i \le 10$ and $\prod_i x_i \ge 0.75$
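A sketch of this benchmark, assuming the slide's formula reads (sum of cos^4(x_i) minus twice the product of cos^2(x_i)) divided by sqrt(sum of i * x_i^2), with bounds 0 <= x_i <= 10 and the constraint that the product of the x_i is at least 0.75. Some statements of this benchmark take the absolute value of the numerator; the sketch follows the form shown here.

```python
import math

def keane_g2(x):
    """Keane's function as read from the slide (no absolute value applied)."""
    sum_cos4 = sum(math.cos(v) ** 4 for v in x)
    prod_cos2 = math.prod(math.cos(v) ** 2 for v in x)
    denominator = math.sqrt(sum(i * v * v for i, v in enumerate(x, start=1)))
    return (sum_cos4 - 2.0 * prod_cos2) / denominator

def is_feasible(x):
    """Box bounds 0 <= x_i <= 10 and the product constraint prod(x_i) >= 0.75."""
    return all(0.0 <= v <= 10.0 for v in x) and math.prod(x) >= 0.75
```

The product constraint excludes the origin, so the denominator is never zero at a feasible point.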
Boundary operators For some problems, it is possible to design boundary operators, which generate offspring as a new boundary point. E.g., consider constraint: xy <= 5 For two boundary parents, (x1,y1) and (x2,y2), an offspring: (sqrt(x1*x2), sqrt(y1*y2)) is also a boundary point. 41
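The boundary property is easy to verify: if x1*y1 = 5 and x2*y2 = 5, then sqrt(x1*x2) * sqrt(y1*y2) = sqrt(25) = 5. A minimal sketch, assuming all coordinates are positive and using the hypothetical name `geometric_crossover`:

```python
import math

def geometric_crossover(parent1, parent2):
    """Offspring of two points on the boundary xy = 5; also lies on that boundary."""
    (x1, y1), (x2, y2) = parent1, parent2
    return (math.sqrt(x1 * x2), math.sqrt(y1 * y2))

# Two parents chosen (for illustration) on the boundary xy = 5.
child_x, child_y = geometric_crossover((1.0, 5.0), (2.5, 2.0))
```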
Parameter tuning
Parameter tuning: the traditional way of testing and comparing different values before the real run.
Problems:
- user mistakes in settings can be sources of errors or sub-optimal performance
- costs much time
- parameters interact: exhaustive search is not practicable
- good values may become bad during the run
Parameter control
Parameter control: setting values on-line, during the actual run, e.g.,
- predetermined time-varying schedule p = p(t)
- using feedback from the search process
- encoding parameters in chromosomes and relying on natural selection
Problems:
- finding optimal p is hard, finding optimal p(t) is harder still
- user-defined feedback mechanism, how to optimize?
- when would natural selection work for strategy
Various Cases
- $f(x)$
- $f(x),\ c_1(x), c_2(x), \ldots$
- $f_1(x), f_2(x), \ldots$
- $f_1(x), f_2(x), \ldots,\ c_1(x), c_2(x), \ldots$
- $f(x,t)$
- $f(x,t),\ c_1(x), c_2(x), \ldots$
- $f(x),\ c_1(x,t), c_2(x,t), \ldots$
- $f(x,t),\ c_1(x,t), c_2(x,t), \ldots$
- $f_1(x,t), f_2(x,t), \ldots,\ c_1(x,t), c_2(x,t), \ldots$
Heuristic vs. Problem [Diagram labels: Heuristic Method, Evaluation Functions, Problem]
Evaluation Functions Some researchers acknowledged that a real world scenario might be a bit more complex: Noise Robustness Approximation Time-changing environments 46
Noise Sometimes evaluation functions return results of randomised simulations. The common approach in such scenarios is to approximate a noisy evaluation function eval by an averaged sum of several evaluations: $eval(x) = \frac{1}{q} \sum_{i=1}^{q} (f(x) + z_i)$, where x is a vector of design variables (i.e., variables controlled by a method), f(x) is the evaluation function, $z_i$ represents additive noise, and q is the sample size. Note that the only measurable (returned) values are f(x) + z.
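The averaging approach can be sketched as follows. The underlying function, the Gaussian noise with unit standard deviation, and the fixed random seed are assumptions made only so the example is reproducible.

```python
import random

def averaged_eval(noisy_measurement, x, q):
    """Average q noisy measurements f(x) + z_i taken at the same x."""
    return sum(noisy_measurement(x) for _ in range(q)) / q

# Illustrative setup (assumed, not from the slides).
rng = random.Random(0)

def true_f(x):
    return sum(v * v for v in x)

def noisy_measurement(x):
    # Only f(x) + z is observable; true_f is hidden from the optimiser.
    return true_f(x) + rng.gauss(0.0, 1.0)

x = [1.0, 2.0]
estimate = averaged_eval(noisy_measurement, x, q=2000)
```

The standard error of the estimate shrinks with the square root of q, so halving the noise costs four times as many evaluations.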
Robustness Sometimes slightly modified solutions should still have good quality evaluations (thus making the original solution robust). The common approach to such scenarios is to use an evaluation function eval based on the probability distribution of possible disturbances δ, which is approximated by Monte Carlo integration: $eval(x) = \frac{1}{q} \sum_{i=1}^{q} f(x + \delta_i)$. Note that eval(x) depends on the shape of f(x) at point x; in other words, the neighbourhood of x determines the value of eval(x).
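A minimal sketch of the Monte Carlo estimate, assuming independent Gaussian disturbances on each variable; the disturbance distribution, the test function, and the seed are illustrative choices only.

```python
import random

def robust_eval(f, x, q, sigma, rng):
    """Monte Carlo estimate of the mean of f(x + delta) over q sampled disturbances."""
    total = 0.0
    for _ in range(q):
        perturbed = [v + rng.gauss(0.0, sigma) for v in x]
        total += f(perturbed)
    return total / q

def sphere(x):
    return sum(v * v for v in x)

rng = random.Random(1)
# For f(x) = x^2 and Gaussian noise, the exact expectation is x^2 + sigma^2 = 1.01.
estimate = robust_eval(sphere, [1.0], q=5000, sigma=0.1, rng=rng)
```

Unlike the noisy case, the disturbance is applied to the solution itself, so the estimate reflects how f behaves around x rather than at x alone.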
Approximation Sometimes it is too expensive to evaluate a candidate solution. In such scenarios, evaluation functions are often approximated based on experimental or simulation data (the approximated evaluation function is often called the meta-model). In such cases, evaluation function eval becomes: eval(x) = f(x) + E(x), where E(x) is the approximation error of the metamodel. Note that the approximation error is quite different than noise, as it is usually deterministic and 49
Dynamic environments Sometimes evaluation functions depend on an additional variable: time. In such cases, evaluation function eval becomes: eval(x) = f(x, t), where t represents the time variable. Clearly, the best solution may change its location over time. There are two main approaches for handling such scenarios: (1) to restart the method after a change, or (2) to require that the method be capable of chasing the changing optimum.
The most real case However, it seems the largest class of real world problems is not included in the above four categories. It is clear that in many real world problems the evaluation functions are based on predictions of the future values of some variables. In other words, evaluation function eval is expressed as: eval(x) = f(x, P(x, y, t)), where P(x, y, t) represents an outcome of some prediction for solution vector x and additional (environmental, beyond our control) variables y at time t. 51
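The structure eval(x) = f(x, P(x, y, t)) can be sketched with a toy example. Everything below is hypothetical: the predictor, its coefficients, and the profit formula are invented purely to show how a prediction that itself depends on the decision x feeds into the evaluation.

```python
def predict_price(x, y, t):
    """Hypothetical predictor P(x, y, t): expected unit price.

    x: quantity we decide to ship, y: market index (outside our control), t: time.
    """
    base_price = 100.0 * y
    depreciation = 0.5 * t
    volume_effect = 0.02 * x    # shipping more units lowers the price
    return base_price - depreciation - volume_effect

def evaluate(x, y, t, unit_cost=60.0):
    """eval(x) = f(x, P(x, y, t)): predicted profit of shipping x units."""
    return x * (predict_price(x, y, t) - unit_cost)

# Search over the decision variable for fixed environment values.
best_quantity = max(range(0, 2001), key=lambda x: evaluate(x, 1.0, 10))
```

Because the prediction depends on x, the optimiser cannot treat the predicted value as a fixed input; each candidate solution has to be scored against its own forecast.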
More info 52