Online Learning of Genetic Network Programming (GNP)
Shingo Mabu, Kotaro Hirasawa, Jinglu Hu and Junichi Murata
Graduate School of Information Science and Electrical Engineering, Kyushu University, Hakozaki, Higashi-ku, Fukuoka, Japan

Abstract - A new evolutionary computation method named Genetic Network Programming (GNP) was proposed recently. In this paper, an online learning method for GNP is proposed. This method uses Q learning to improve the state transition rules of GNP so that GNP can adapt to dynamic environments efficiently.

I. INTRODUCTION

Evolutionary processes of organisms are very complicated and sophisticated. They are based on mechanisms such as learning, selection and evolution, so that organisms can harmonize well with their environments. Genetic Algorithm (GA)[1] and Genetic Programming (GP)[2] are typical methods based on the evolutionary processes of organisms. Because conventional control theory must obey rules predefined in advance and cannot adapt rapidly to dynamic environments, evolutionary computation, which overcomes these problems, has attracted attention. GA and GP, which have mainly been applied to optimization problems, represent solutions as a string and a tree structure, respectively, and evolve them. GP was devised later in order to expand the representation ability of GA and to solve more complex problems. However, GP can have difficulty searching for a solution because of the bloat of its tree structure, which expands the depth of the tree unnecessarily, although bloat is sometimes useful for expanding the search space and finding a solution. Recently, a new evolutionary computation method named Genetic Network Programming (GNP) was proposed[3][4]. GNP, an expansion of GA and GP, represents solutions as a network structure.
Because GNP can memorize past action sequences in its network flow and can deal well with Partially Observable Markov Decision Processes (POMDPs), GNP has been applied to complicated agent systems involving uncertainty, such as decision-making problems. However, conventional GNP is based on offline learning: after GNP is carried out to some extent, it is evaluated and evolved according to rewards given by the environment. Adaptation to dynamic environments may therefore be difficult, because when the environment changes, offline learning must run many trials to evaluate and evolve GNP again and again, and thus cannot keep up with the changes quickly. In this paper, Q learning[5], one of the well-known online learning methods in the reinforcement learning field, is introduced for the online learning of GNP. GNP changes its node transition rules considering the rewards given one after another, so that it can change its solution (behavior sequences) immediately after an environmental change causes a bad result. The reason for applying Q learning to the online learning of GNP is described in Section II. In this paper, the Prisoner's dilemma game is used for simulations and the performance of online learning is studied. The Prisoner's dilemma game needs two players, who compete with each other to get high scores. First, GNP holding a game strategy competed with famous strategies of the Prisoner's dilemma game and showed good performance (scores). Then, two GNPs competed with each other and showed online adjustment of their strategies, each considering the opponent's strategy in order to get higher scores. This paper is organized as follows. In the next section, the details of Genetic Network Programming are described. Section III explains the Prisoner's dilemma game and shows the results of the simulations. Section IV is devoted to conclusions.

II. GENETIC NETWORK PROGRAMMING (GNP)

In this section, Genetic Network Programming is explained in detail.
GNP is an expansion of GP in terms of gene structures. The original motivation for developing GNP is the more general representation ability of graphs compared with trees.

A. Basic structure of GNP

First, GP is explained in order to compare it with GNP. Fig.1 shows the basic structure of GP. GP can be used as a decision-making tree when the non-terminal nodes are if-then type functions and all terminal nodes are concrete action functions. A tree is executed from the root node to a certain terminal node in each iteration; therefore, it might fall into deadlocks, because the behaviors of agents made by GP are determined only by the environment at the current time. Furthermore, a GP tree might suffer severe bloat, which makes the search for solutions difficult because of the unnecessary expansion of depth.

Fig.1 Basic structure of GP

Next, the characteristics and abilities of GNP are explained using Fig.2, which shows the basic structure of GNP. It is supposed here that agent behavior sequences are created by GNP.

Fig.2 Basic structure of GNP

(1) GNP has a number of Judgement nodes and Processing nodes connected to each other by directed links, like node networks. Judgement nodes, which are if-then type decision functions or conditional branch decision functions, return judgement results for assigned inputs and determine the next node GNP should take. A processing node determines an action/processing an agent should take. Contrary to judgement nodes, processing nodes have no conditional branch. The GNP we used never causes bloat because of its predefined number of nodes, although GNP can in general evolve phenotypes/genotypes of variable length.

(2) GNP is booted up from the start node, which is predefined arbitrarily, and there are no terminal nodes. After the start node, the next node is determined according to the connections of the nodes and the judging results of judgement nodes. In this way, GNP is carried out according to the network flow without any terminal, so the network structure itself has a memory function of the past actions of an agent. Therefore, the determination of the next node is influenced by the node transitions of the past.

(3) GNP can evolve appropriate connections so that it can judge environments and execute processing functions efficiently. GNP can determine what kind of judgement and processing should be done now by evolving the network structure.

(4) GNP can have time delays. d_i is the time delay GNP spends on judgement or processing, and d_ij is the one GNP spends on the transition from node i to node j. In real-world problems, some time is spent when judging or processing and also when transferring from one node to the next. For example, when a man is walking and there is a puddle before him, he will avoid it. At that time, it takes some time to judge the puddle (d_i of a judgement node), to put the judgement into action (d_ij of the transition from judgement to processing) and to avoid the puddle (d_i of a processing node). However, d_i and d_ij are not used in this paper, because the purpose of the simulations using the Prisoner's dilemma game is to make and change strategies adapting to the opponent's behavior, and the time spent on judgement or processing need not be considered. Time delay is, however, necessary in practical applications: when various judgements and processes are done, GNP should change its structure considering the time it spends. Time delay is listed in each node gene, described later, because it is a unique attribute of a node.

Some graph-based evolutionary methods have been developed, such as PADO (Parallel Algorithm Discovery and Orchestration)[6] and EP (Evolutionary Programming)[7]. PADO has both a start node and an end node in the network and mainly represents a static program. EP is used for the automatic synthesis of Finite State Machines, where in every state the state transitions for all inputs have to be determined; therefore, the structure becomes complicated as the number of nodes increases. On the other hand, GNP uses only the information necessary for judging the environment, i.e., it can deal with POMDPs. Therefore, GNP can remain compact even if the system to be solved is large.

B. Genotype expression of GNP

The whole structure of GNP is determined by the combination of the following node genes. The genetic code of node i is represented as in Fig.3.

Fig.3 Genotype expression of node i: K_i, ID_i, d_i, followed by (N_ij, d_ij, Q^1_ij, Q^2_ij, Q^3_ij, ...) for each node j

where
K_i : Judgement/Processing classification (0 : Processing node, 1 : Judgement node)
ID_i : content of Judgement/Processing
d_i : time delay spent for judgement/processing
N_ij : transition from node i to node j (N_ij = 0 : infeasible (no connection), N_ij = 1 : feasible)
d_ij : time delay spent for the transition from node i to j
Q^1_ij, Q^2_ij, Q^3_ij, ... : transition Q values from node i to j
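The node gene of Fig.3 can be pictured as a small data structure. The sketch below is illustrative only; the class and field names (`NodeGene`, `d_trans`, and so on) are ours, not the paper's.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class NodeGene:
    K: int                # 0: processing node, 1: judgement node
    ID: int               # content of judgement/processing, as a unique number
    d: float              # time delay d_i for judgement/processing
    N: List[int]          # N_ij: 1 if the transition i -> j is feasible, else 0
    d_trans: List[float]  # d_ij: time delay of the transition i -> j
    # Q[b][j]: Q value of the transition i -> j under judgement result b+1;
    # a processing node has a single branch (len(Q) == 1).
    Q: List[List[float]]

# Example: a judgement node with 3 possible results in a 4-node network,
# fully connected (all N_ij = 1), Q values initialized to zero.
n, branches = 4, 3
node = NodeGene(K=1, ID=0, d=0.0,
                N=[1] * n, d_trans=[0.0] * n,
                Q=[[0.0] * n for _ in range(branches)])
```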
K_i represents the node type: K_i = 0 means a Processing node and K_i = 1 means a Judgement node. ID_i means the content GNP judges or processes, represented as a unique number. d_i is the time delay spent for judgement or processing. N_ij shows whether the transition from node i to node j (1 ≤ j ≤ n, where each node has a unique number from 1 to n and the total number of nodes is n) is feasible or not: N_ij = 0 means infeasible and N_ij = 1 means feasible. d_ij means the time delay spent for the transition from node i to j. Q^1_ij, Q^2_ij, Q^3_ij, ... are the transition Q values from node i to j. A judgement node has plural judging results. For example, if the result of a judgement is 1, GNP uses Q^1_ij to select a transition; if the result is 2, it uses Q^2_ij. Therefore, the number of Q values Q^1_ij, Q^2_ij, Q^3_ij, ... is the same as the number of possible results at the judgement node. On the other hand, a processing node has only Q^1_ij, because there are no conditional branches when processing.

C. Online learning of GNP

The conventional learning method for GNP is offline learning: GNP is evaluated by the rewards given by the environment after it has been carried out to some extent. Offline learning for GNP includes selection, mutation, and crossover. Online learning for GNP is based on Q learning and changes the node transition rules considering the rewards given one after another. That is, each node has plural possible transitions to the other nodes, and transition Q values are calculated by Q learning. GNP selects a transition according to the Q values. Agents can learn and adapt to changes of the environment through this process.

1) Q learning (an off-policy TD control algorithm) : Q learning calculates Q values, which are functions of the state s and the action a. A Q value means the sum of the rewards an agent gets in the future, and the update is implemented as follows. When an agent selects an action a_t in state s_t at time t, the reward r_t is given and the state changes from s_t to s_{t+1}.
As a result, Q(s_t, a_t) is updated by the following equation:

    Q(s_t, a_t) <- Q(s_t, a_t) + α [ r_t + γ max_a Q(s_{t+1}, a) - Q(s_t, a_t) ]    (1)

If the step size parameter α decreases according to a certain schedule, the Q values converge to the optimal values after enough trials. Selecting the action with the maximum Q value then becomes the optimal policy. γ is a discount rate, which expresses how far into the future an agent considers rewards. If an agent always selects the action having the maximum Q value while learning is not yet sufficient, the policy is not improved even though there may still be better action selections. One method for overcoming this problem is ε-greedy: the agent selects a random action with probability ε, or the action having the maximum Q value with probability 1-ε. ε-greedy is a general method for keeping a balance between exploitation of experience and exploration. In this paper, ε-greedy is adopted for action selection.

2) Relation between GNP and Q learning : In Q learning, the state s is determined by the information an agent can get, and the action a means the actual action it takes. On the other hand, GNP regards the situation of the nodes after judging and processing as a state, and a node transition as an action. The state s and action a of online learning of GNP therefore differ from the ones of simple Q learning.

3) The reason for applying Q learning : (1) Once GNP is booted up from the start node, the current node is transferred one after another without any terminal node. Therefore, the framework of reinforcement learning, which uses the sum of the discounted future rewards, is suitable for online learning of GNP. (2) TD control needs only the maximum Q value in the next state; therefore, not much memory is needed and the Q values are updated easily. (3) Because GNP should search for an optimal solution independently of the policy (ε-greedy), an off-policy method is adopted.
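The update of eq.(1) together with ε-greedy selection can be sketched in a few lines. This is a generic tabular Q-learning sketch under our own names (`q_update`, `epsilon_greedy`), not the paper's implementation.

```python
import random

def epsilon_greedy(Q, state, n_actions, eps):
    """With probability eps pick a random action, else a greedy one."""
    if random.random() < eps:
        return random.randrange(n_actions)
    row = Q[state]
    return max(range(n_actions), key=lambda a: row[a])

def q_update(Q, s, a, r, s_next, alpha, gamma):
    """Eq.(1): Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (td_target - Q[s][a])

# Two states, two actions; action 1 in state 0 pays 1 and stays in state 0.
Q = [[0.0, 0.0], [0.0, 0.0]]
q_update(Q, s=0, a=1, r=1.0, s_next=0, alpha=0.5, gamma=0.9)
# One update moves Q[0][1] from 0 to 0.5 * (1 + 0.9 * 0 - 0) = 0.5
```

With eps = 0 the selection is purely greedy, which is how the learned policy would be read off after training.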
4) Node transition and learning : The following is the outline of online learning. In this paper, the transitions from each node to all the other nodes are possible, namely all N_ij = 1. The transition begins from the start node, which can be selected arbitrarily from all the nodes, and GNP executes the content of the start node; then GNP selects the transition having the maximum Q value among all transitions connecting the start node to the next one, although with probability ε it selects a random transition. After a reward is given, the Q value is updated according to eq.(1). After that, the transitions continue following the network flow.

The concrete procedure is described using Fig.4, which shows an example of node transitions. There are n nodes, including judgement nodes and processing nodes, numbered from 1 to n. At time t, suppose the current node is node i (1 ≤ i ≤ n). Because node i is a judgement node, it judges the current environment. In this example, the result of the judgement can be 1, 2 or 3, and 1 is supposed at time t. Then the state s_t becomes s_i^1, and the transition is selected according to ε-greedy.

case 1 : the next node is a processing node
If the transition which has Q^1_{i j1} is selected, the next node is processing node j1. GNP processes what is assigned to it and gets the reward r_t. Then the time changes to t+1 and the state s_{t+1} becomes s_{j1} deterministically, because processing nodes make no judgement, unlike judgement nodes. If Q_{j1 k1} is the maximum value among Q_{j1 1}, ..., Q_{j1 n}, this value is used to update Q^1_{i j1}. GNP selects the transition to node k1, which has the maximum Q value, as the action a_{t+1} with probability 1-ε, or selects a random transition with probability ε. In this case, the Q value is updated according to eq.(2):

    Q^1_{i j1} <- Q^1_{i j1} + α [ r_t + γ Q_{j1 k1} - Q^1_{i j1} ]    (2)

case 2 : the next node is a judgement node
If the transition which has Q^1_{i j2} is selected, the next node is judgement node j2, and GNP judges what is assigned to it. However, because no actual action is done at node j2, no reward r_t is given. The time changes to t+1, and the state s_{t+1} could be s_{j2}^1, s_{j2}^2 or s_{j2}^3 according to the judgement at node j2. Because the result of the judgement is supposed to be 1 in this example, s_{t+1} becomes s_{j2}^1. If Q^1_{j2 k2} is the maximum value among Q^1_{j2 1}, ..., Q^1_{j2 n}, it is used for updating Q^1_{i j2}. GNP selects the transition to node k2, which has the maximum Q value, as the action a_{t+1} with probability 1-ε, or selects a random transition with probability ε. The update of the Q value is implemented according to eq.(3):

    Q^1_{i j2} <- Q^1_{i j2} + α [ Q^1_{j2 k2} - Q^1_{i j2} ]    (3)

When updating the Q values of transitions to judgement nodes, the discount rate γ is fixed to 1, that is, the rewards are not discounted. Online learning of GNP aims to maximize the sum of discounted rewards obtained when actual actions are implemented.

Fig.4 Example of node transitions (i, j1, j2, k1, k2 : node numbers, 1 ≤ i, j1, j2, k1, k2 ≤ n)

III. SIMULATION

In this section, GNP is used as a player of the Prisoner's dilemma game. The aim of the simulation is to confirm the effectiveness of the proposed online learning of GNP.

A. Prisoner's dilemma game

The story: the police do not have enough evidence to indict two suspects for complicity in a crime, so they approach each suspect separately. If both of you remain silent, the sentence for your crime becomes a two-year prison term despite the insufficient evidence. If you confess and your pal remains silent, you are released and your pal receives a five-year prison term. But if both of you confess, your sentence becomes a four-year prison term.

TABLE I  Relation between sentence and Confession/Silence
Suspect1    Suspect2    Sentence for Suspect1
Confession  Silence     released
Silence     Silence     2-year prison
Confession  Confession  4-year prison
Silence     Confession  5-year prison

From TABLE I, the profits can be 0, -2, -4 and -5, respectively. In order to make them positive, 5 is added to each profit, giving the profit matrix shown in Fig.5. Silence is called Cooperate (C) because it shortens the prison terms, and Confession is called Defect (D) because it is done in order for only one suspect to be released. If the profits, described by the symbols R, S, T and P in the matrix, satisfy inequality (4), the dilemma arises:

    T > R > P > S,  2R > S + T    (4)

Fig.5 Profit matrix (profit for suspect1): suspect1 C vs suspect2 C gives R = 3, C vs D gives S = 0, D vs C gives T = 5, D vs D gives P = 1

This matrix is the famous one used in much research on the Prisoner's dilemma game; therefore, the results in this paper are calculated using Fig.5. In the simulations, the players repeatedly select Cooperate or Defect. If the players have no information about their opponents and select an action only once, Defect is to be taken: whether the opponent takes Cooperate or Defect, Defect gets a higher profit than Cooperate. However, as the competition is repeated and the characteristics of the opponent come out, GNP can develop its strategy considering the past strategies of the opponent and itself.

B. GNP for the Prisoner's dilemma game

The player of the Prisoner's dilemma game is regarded as an agent using GNP. The nodes of GNP are:
  Self-judgement node (judges the action taken by itself)
  Opponent-judgement node (judges the action taken by the opponent)
  Cooperation processing node (takes Cooperation (C))
  Defect processing node (takes Defect (D))

Fig.6 shows an example of a GNP structure for the Prisoner's dilemma game. In this simulation, although each node is connected to all the other nodes, only a few connections are drawn in Fig.6 so that they can be seen easily; the thick arrows show an example of node transitions.

Fig.6 Example of GNP structure

Each transition from a judgement node has two Q values, Q^C and Q^D. Q^C is used when the judged past action of itself or the opponent is Cooperation (the state becomes s^C), and Q^D is used when the past action is Defect (the state becomes s^D). On the other hand, each transition from a processing node has only one Q value, which differs from a judgement node. After processing, if the next node is a self-judgement node, GNP judges the last action of itself; if it is an opponent-judgement node, it judges the last action of the opponent. If the same kind of judgement node is executed again, GNP judges the action taken two steps before; in case the same kind of judgement node continues to be selected, GNP judges the previous actions one after another. If the next node is a processing node, the agent competes against the opponent using the corresponding processing.
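The rule that consecutive judgement nodes of the same kind look one step further back in the action history can be sketched as follows. The names are illustrative, and the default value returned when the history is still too short is our assumption (the paper does not specify that case).

```python
def judge(history, times_used):
    """history: past actions, newest last.
    times_used: how many judgements of this kind were already made in a row
    (0 for the first). Returns 'C' by default when the history is too short
    (an assumption; the paper does not specify this case)."""
    if len(history) <= times_used:
        return 'C'
    return history[-(1 + times_used)]

moves = ['C', 'D', 'C']   # oldest to newest
first = judge(moves, 0)   # the first judgement looks at the last action
second = judge(moves, 1)  # a repeated one looks at the action two steps before
```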
C. Simulation results

In this simulation, GNP first competes against Tit for Tat and the Pavlov strategy, which are fixed strategies: they are predefined in advance and do not change. The purpose of these simulations is to show that GNP can learn the characteristics of its opponents and can change its strategy to get high scores. Next, two GNPs compete against each other. This situation can be regarded as a game in a dynamic environment, because each GNP can change its strategy; therefore, the GNPs should adapt to the changes of their opponents' strategies. GNP competes under the following conditions.

  number of nodes (n) : 20
  Processing nodes : 5 for each processing (C and D)
  Judgement nodes : 5 for each judgement (self and opponent)
  discount rate (γ) : 0.99
  step size parameter (α) : 0.5
  ε : 0.05
  reward (r) : obtained from Fig.5
  Q values : zero in the initial state

The competition is carried out for the predefined number of iterations after the Q values are initialized (all Q values are zero at first). This process is called a trial.

1) Competition between GNP and Tit for Tat :

  Tit for Tat : take Cooperation at the first iteration;
                take the last opponent action from the second iteration on.

This strategy never loses a game by a wide margin and gets almost the same average score as its opponent. Tit for Tat is therefore an acknowledged strong strategy: even when other strategies lose points against some opponent, Tit for Tat ends its competition with the same average as its opponent, which makes it the strongest strategy as a whole. In this simulation, the competition between GNP and Tit for Tat is carried out for the predefined number of iterations, and this trial is repeated 100 times. In each trial, moving averages over 10 iterations are calculated, and Fig.7 shows the averages of these moving averages over the 100 trials. Because Tit for Tat never loses by a wide margin, the average scores of GNP and Tit for Tat are almost the same and their lines overlap.
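Tit for Tat as defined above is simple to state in code. The sketch below is illustrative only (the paper gives no implementation); actions are the symbols 'C' and 'D'.

```python
def tit_for_tat(opp_history):
    """Cooperate on the first iteration, then copy the opponent's last action."""
    if not opp_history:        # first iteration: cooperate
        return 'C'
    return opp_history[-1]     # afterwards: mirror the opponent's last action

# Against a player who defects once and then cooperates, Tit for Tat
# defects exactly once, one step later.
opponent_moves = ['D', 'C', 'C']
replies = [tit_for_tat(opponent_moves[:k]) for k in range(len(opponent_moves) + 1)]
# replies == ['C', 'D', 'C', 'C']
```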
As the competition proceeds, GNP gradually obtains high scores. If GNP takes Defect, Tit for Tat takes it in the next step and the score is diminished; if GNP continues to take Cooperation, Tit for Tat also does. We can therefore infer that GNP learned the cooperative strategy in order to get high scores. In the simulation, more frequent use of the Cooperation node than the Defect node is confirmed.

2) Competition between GNP and the Pavlov strategy : TABLE II shows the Pavlov strategy, which decides its next action according to the combination of the last actions of both players.

TABLE II  Pavlov strategy
Pavlov        Opponent      Next action of Pavlov
Cooperation   Cooperation   Cooperation
Cooperation   Defect        Defect
Defect        Cooperation   Defect
Defect        Defect        Cooperation

Fig.8 shows the result, which is calculated by the same procedure as in simulation 1). GNP could win by a wide margin: GNP realized that it could defeat the Pavlov strategy by taking the Defect strategy. If both GNP and Pavlov take Defect, Pavlov takes Cooperation in the next step; therefore, GNP learned that it should take Defect next in order to get 5 points.

3) Competition between GNPs : Two GNPs with the same parameters compete with each other in this simulation. Fig.9 shows a typical trial, in order to study the progress of the scores in detail; the average scores are moving averages over 10 iterations. The characteristic of the result is as follows: while GNP1 increases its score, GNP2 decreases, and while GNP1 decreases its score, GNP2 increases. If GNP1 can win for some time by taking Defect more often than Cooperation, GNP2 comes to take Defect as well, and the score of GNP1 decreases. GNP1 then tends to change its strategy from Defect to Cooperation in order to avoid mutual defection, but GNP2 begins to win thanks to this strategy change of GNP1. GNP1 in turn tends to take Defect, and thus GNP2 tends to take Cooperation in order to avoid mutual defection. This process is repeated. From the viewpoint of online learning, this result is natural.
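The exploit GNP found against Pavlov can be reproduced in a few lines. This is an illustrative sketch (not the paper's code): it plays an always-Defect player against Pavlov using the Fig.5 payoffs; Pavlov's first move being cooperation is our assumption.

```python
# Profit matrix of Fig.5: PAYOFF[(my_action, opp_action)] = my profit.
PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def pavlov(my_last, opp_last):
    """TABLE II: cooperate iff both players took the same action last round."""
    return 'C' if my_last == opp_last else 'D'

# Always-Defect (the behavior GNP learned) vs Pavlov over 100 rounds.
defector_total, pavlov_total = 0, 0
pav_move = 'C'                        # assumed first move of Pavlov
for _ in range(100):
    defector_total += PAYOFF[('D', pav_move)]
    pavlov_total += PAYOFF[(pav_move, 'D')]
    pav_move = pavlov(pav_move, 'D')  # Pavlov reacts to the last round

# The rounds alternate between (D vs C) and (D vs D): the defector averages
# (5 + 1) / 2 = 3 points per round while Pavlov averages (0 + 1) / 2 = 0.5.
```

This matches the behavior described above: after mutual defection Pavlov returns to cooperation, which the defecting player exploits for the 5-point payoff.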
IV. CONCLUSIONS

In this paper, in order to adapt to dynamic environments quickly, online learning of GNP was proposed, applied to players of the Prisoner's dilemma game, and its learning ability was confirmed. GNP can make its node transition rules according to the Q values learned by Q learning. As a result of the simulations, GNP increased its score when competing against fixed strategies (Tit for Tat and the Pavlov strategy). In the competition between GNPs, where the strategies are dynamic, both GNPs could keep up with the changes of each other's strategies. Henceforth, we will integrate online learning and offline learning so that the performance is improved further and GNP can model the learning mechanisms of organisms more realistically.

Fig.7 Competition between GNP and Tit for Tat (average score vs. iteration)
Fig.8 Competition between GNP and the Pavlov strategy (average score vs. iteration)
Fig.9 Competition between GNPs (average score vs. iteration)

References
[1] Holland, J. H., Adaptation in Natural and Artificial Systems, University of Michigan Press, 1975; MIT Press, 1992.
[2] Koza, J. R., Genetic Programming: On the Programming of Computers by Means of Natural Selection, Cambridge, Mass., MIT Press, 1994.
[3] H. Katagiri, K. Hirasawa and J. Hu, "Genetic Network Programming - Application to Intelligent Agents -", in Proc. of IEEE International Conference on Systems, Man and Cybernetics, 2000.
[4] K. Hirasawa, M. Okubo, H. Katagiri, J. Hu and J. Murata, "Comparison between Genetic Network Programming (GNP) and Genetic Programming (GP)", in Proc. of Congress on Evolutionary Computation, 2001.
[5] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, Massachusetts, 1998.
[6] A. Teller and M. Veloso, "PADO: Learning Tree-structured Algorithms for Orchestration into an Object Recognition System", Carnegie Mellon University Technical Report, 1995.
[7] L. J. Fogel, A. J. Owens and M. J. Walsh, Artificial Intelligence through Simulated Evolution, Wiley.

© 2002 IEEE
Modeling and Performance Evaluation of Computer Systems Security Operation 1
Modeling and Performance Evaluation of Computer Systems Security Operation 1 D. Guster 2 St.Cloud State University 3 N.K. Krivulin 4 St.Petersburg State University 5 Abstract A model of computer system
Memory Allocation Technique for Segregated Free List Based on Genetic Algorithm
Journal of Al-Nahrain University Vol.15 (2), June, 2012, pp.161-168 Science Memory Allocation Technique for Segregated Free List Based on Genetic Algorithm Manal F. Younis Computer Department, College
GPSQL Miner: SQL-Grammar Genetic Programming in Data Mining
GPSQL Miner: SQL-Grammar Genetic Programming in Data Mining Celso Y. Ishida, Aurora T. R. Pozo Computer Science Department - Federal University of Paraná PO Box: 19081, Centro Politécnico - Jardim das
ISSN: 2319-5967 ISO 9001:2008 Certified International Journal of Engineering Science and Innovative Technology (IJESIT) Volume 2, Issue 3, May 2013
Transistor Level Fault Finding in VLSI Circuits using Genetic Algorithm Lalit A. Patel, Sarman K. Hadia CSPIT, CHARUSAT, Changa., CSPIT, CHARUSAT, Changa Abstract This paper presents, genetic based algorithm
Eligibility Traces. Suggested reading: Contents: Chapter 7 in R. S. Sutton, A. G. Barto: Reinforcement Learning: An Introduction MIT Press, 1998.
Eligibility Traces 0 Eligibility Traces Suggested reading: Chapter 7 in R. S. Sutton, A. G. Barto: Reinforcement Learning: An Introduction MIT Press, 1998. Eligibility Traces Eligibility Traces 1 Contents:
GA as a Data Optimization Tool for Predictive Analytics
GA as a Data Optimization Tool for Predictive Analytics Chandra.J 1, Dr.Nachamai.M 2,Dr.Anitha.S.Pillai 3 1Assistant Professor, Department of computer Science, Christ University, Bangalore,India, [email protected]
STUDY OF PROJECT SCHEDULING AND RESOURCE ALLOCATION USING ANT COLONY OPTIMIZATION 1
STUDY OF PROJECT SCHEDULING AND RESOURCE ALLOCATION USING ANT COLONY OPTIMIZATION 1 Prajakta Joglekar, 2 Pallavi Jaiswal, 3 Vandana Jagtap Maharashtra Institute of Technology, Pune Email: 1 [email protected],
Extension of Decision Tree Algorithm for Stream Data Mining Using Real Data
Fifth International Workshop on Computational Intelligence & Applications IEEE SMC Hiroshima Chapter, Hiroshima University, Japan, November 10, 11 & 12, 2009 Extension of Decision Tree Algorithm for Stream
Stock Trading with Genetic Algorithm---Switching from One Stock to Another
The 6th International Conference on Information Technology and Applications (ICITA 2009) Stock Trading with Genetic Algorithm---Switching from One Stock to Another Tomio Kurokawa, Member, IEEE-CIS Abstract--It
Application of sensitivity analysis in investment project evaluation under uncertainty and risk
International Journal of Project Management Vol. 17, No. 4, pp. 217±222, 1999 # 1999 Elsevier Science Ltd and IPMA. All rights reserved Printed in Great Britain 0263-7863/99 $ - see front matter PII: S0263-7863(98)00035-0
Machine Learning CS 6830. Lecture 01. Razvan C. Bunescu School of Electrical Engineering and Computer Science [email protected]
Machine Learning CS 6830 Razvan C. Bunescu School of Electrical Engineering and Computer Science [email protected] What is Learning? Merriam-Webster: learn = to acquire knowledge, understanding, or skill
Course: Model, Learning, and Inference: Lecture 5
Course: Model, Learning, and Inference: Lecture 5 Alan Yuille Department of Statistics, UCLA Los Angeles, CA 90095 [email protected] Abstract Probability distributions on structured representation.
SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis
SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: [email protected] October 17, 2015 Outline
Genetic programming with regular expressions
Genetic programming with regular expressions Børge Svingen Chief Technology Officer, Open AdExchange [email protected] 2009-03-23 Pattern discovery Pattern discovery: Recognizing patterns that characterize
Game Theory 1. Introduction
Game Theory 1. Introduction Dmitry Potapov CERN What is Game Theory? Game theory is about interactions among agents that are self-interested I ll use agent and player synonymously Self-interested: Each
Laboratory work in AI: First steps in Poker Playing Agents and Opponent Modeling
Laboratory work in AI: First steps in Poker Playing Agents and Opponent Modeling Avram Golbert 01574669 [email protected] Abstract: While Artificial Intelligence research has shown great success in deterministic
6.254 : Game Theory with Engineering Applications Lecture 2: Strategic Form Games
6.254 : Game Theory with Engineering Applications Lecture 2: Strategic Form Games Asu Ozdaglar MIT February 4, 2009 1 Introduction Outline Decisions, utility maximization Strategic form games Best responses
Sequential lmove Games. Using Backward Induction (Rollback) to Find Equilibrium
Sequential lmove Games Using Backward Induction (Rollback) to Find Equilibrium Sequential Move Class Game: Century Mark Played by fixed pairs of players taking turns. At each turn, each player chooses
A linear algebraic method for pricing temporary life annuities
A linear algebraic method for pricing temporary life annuities P. Date (joint work with R. Mamon, L. Jalen and I.C. Wang) Department of Mathematical Sciences, Brunel University, London Outline Introduction
Automatic Generation of Intelligent Agent Programs
utomatic Generation of Intelligent gent Programs Lee Spector [email protected] http://hampshire.edu/lspector n agent, for the sake of these comments, is any autonomous system that perceives and acts
A Method of Cloud Resource Load Balancing Scheduling Based on Improved Adaptive Genetic Algorithm
Journal of Information & Computational Science 9: 16 (2012) 4801 4809 Available at http://www.joics.com A Method of Cloud Resource Load Balancing Scheduling Based on Improved Adaptive Genetic Algorithm
Master's projects at ITMO University. Daniil Chivilikhin PhD Student @ ITMO University
Master's projects at ITMO University Daniil Chivilikhin PhD Student @ ITMO University General information Guidance from our lab's researchers Publishable results 2 Research areas Research at ITMO Evolutionary
6.231 Dynamic Programming Midterm, Fall 2008. Instructions
6.231 Dynamic Programming Midterm, Fall 2008 Instructions The midterm comprises three problems. Problem 1 is worth 60 points, problem 2 is worth 40 points, and problem 3 is worth 40 points. Your grade
ANT COLONY OPTIMIZATION ALGORITHM FOR RESOURCE LEVELING PROBLEM OF CONSTRUCTION PROJECT
ANT COLONY OPTIMIZATION ALGORITHM FOR RESOURCE LEVELING PROBLEM OF CONSTRUCTION PROJECT Ying XIONG 1, Ya Ping KUANG 2 1. School of Economics and Management, Being Jiaotong Univ., Being, China. 2. College
Compact Representations and Approximations for Compuation in Games
Compact Representations and Approximations for Compuation in Games Kevin Swersky April 23, 2008 Abstract Compact representations have recently been developed as a way of both encoding the strategic interactions
Reliability Guarantees in Automata Based Scheduling for Embedded Control Software
1 Reliability Guarantees in Automata Based Scheduling for Embedded Control Software Santhosh Prabhu, Aritra Hazra, Pallab Dasgupta Department of CSE, IIT Kharagpur West Bengal, India - 721302. Email: {santhosh.prabhu,
Handling Fault Detection Latencies in Automata-based Scheduling for Embedded Control Software
Handling Fault Detection atencies in Automata-based cheduling for Embedded Control oftware anthosh Prabhu M, Aritra Hazra, Pallab Dasgupta and P. P. Chakrabarti Department of Computer cience and Engineering,
Reinforcement Learning
Reinforcement Learning LU 2 - Markov Decision Problems and Dynamic Programming Dr. Martin Lauer AG Maschinelles Lernen und Natürlichsprachliche Systeme Albert-Ludwigs-Universität Freiburg [email protected]
A Binary Model on the Basis of Imperialist Competitive Algorithm in Order to Solve the Problem of Knapsack 1-0
212 International Conference on System Engineering and Modeling (ICSEM 212) IPCSIT vol. 34 (212) (212) IACSIT Press, Singapore A Binary Model on the Basis of Imperialist Competitive Algorithm in Order
Game Theory Perspectives on Client Vendor relationships in offshore software outsourcing
Game Theory Perspectives on Client Vendor relationships in offshore software outsourcing Nilay V. Oza Helsinki University of Technology Software Business Laboratory, P.O. Box 5500, 02015 TKK, Finland Tel:
Comparison of Major Domination Schemes for Diploid Binary Genetic Algorithms in Dynamic Environments
Comparison of Maor Domination Schemes for Diploid Binary Genetic Algorithms in Dynamic Environments A. Sima UYAR and A. Emre HARMANCI Istanbul Technical University Computer Engineering Department Maslak
Using Ant Colony Optimization for Infrastructure Maintenance Scheduling
Using Ant Colony Optimization for Infrastructure Maintenance Scheduling K. Lukas, A. Borrmann & E. Rank Chair for Computation in Engineering, Technische Universität München ABSTRACT: For the optimal planning
Multiobjective Multicast Routing Algorithm
Multiobjective Multicast Routing Algorithm Jorge Crichigno, Benjamín Barán P. O. Box 9 - National University of Asunción Asunción Paraguay. Tel/Fax: (+9-) 89 {jcrichigno, bbaran}@cnc.una.py http://www.una.py
AUTOMATIC ADJUSTMENT FOR LASER SYSTEMS USING A STOCHASTIC BINARY SEARCH ALGORITHM TO COPE WITH NOISY SENSING DATA
INTERNATIONAL JOURNAL ON SMART SENSING AND INTELLIGENT SYSTEMS, VOL. 1, NO. 2, JUNE 2008 AUTOMATIC ADJUSTMENT FOR LASER SYSTEMS USING A STOCHASTIC BINARY SEARCH ALGORITHM TO COPE WITH NOISY SENSING DATA
Computational Learning Theory Spring Semester, 2003/4. Lecture 1: March 2
Computational Learning Theory Spring Semester, 2003/4 Lecture 1: March 2 Lecturer: Yishay Mansour Scribe: Gur Yaari, Idan Szpektor 1.1 Introduction Several fields in computer science and economics are
Methodology for Emulating Self Organizing Maps for Visualization of Large Datasets
Methodology for Emulating Self Organizing Maps for Visualization of Large Datasets Macario O. Cordel II and Arnulfo P. Azcarraga College of Computer Studies *Corresponding Author: [email protected]
CSI 333 Lecture 1 Number Systems
CSI 333 Lecture 1 Number Systems 1 1 / 23 Basics of Number Systems Ref: Appendix C of Deitel & Deitel. Weighted Positional Notation: 192 = 2 10 0 + 9 10 1 + 1 10 2 General: Digit sequence : d n 1 d n 2...
Circuits 1 M H Miller
Introduction to Graph Theory Introduction These notes are primarily a digression to provide general background remarks. The subject is an efficient procedure for the determination of voltages and currents
Lecture 1: Introduction to Reinforcement Learning
Lecture 1: Introduction to Reinforcement Learning David Silver Outline 1 Admin 2 About Reinforcement Learning 3 The Reinforcement Learning Problem 4 Inside An RL Agent 5 Problems within Reinforcement Learning
Cost Model: Work, Span and Parallelism. 1 The RAM model for sequential computation:
CSE341T 08/31/2015 Lecture 3 Cost Model: Work, Span and Parallelism In this lecture, we will look at how one analyze a parallel program written using Cilk Plus. When we analyze the cost of an algorithm
Application of GA for Optimal Location of FACTS Devices for Steady State Voltage Stability Enhancement of Power System
I.J. Intelligent Systems and Applications, 2014, 03, 69-75 Published Online February 2014 in MECS (http://www.mecs-press.org/) DOI: 10.5815/ijisa.2014.03.07 Application of GA for Optimal Location of Devices
A Reactive Tabu Search for Service Restoration in Electric Power Distribution Systems
IEEE International Conference on Evolutionary Computation May 4-11 1998, Anchorage, Alaska A Reactive Tabu Search for Service Restoration in Electric Power Distribution Systems Sakae Toune, Hiroyuki Fudo,
A Tool for Generating Partition Schedules of Multiprocessor Systems
A Tool for Generating Partition Schedules of Multiprocessor Systems Hans-Joachim Goltz and Norbert Pieth Fraunhofer FIRST, Berlin, Germany {hans-joachim.goltz,nobert.pieth}@first.fraunhofer.de Abstract.
An ant colony optimization for single-machine weighted tardiness scheduling with sequence-dependent setups
Proceedings of the 6th WSEAS International Conference on Simulation, Modelling and Optimization, Lisbon, Portugal, September 22-24, 2006 19 An ant colony optimization for single-machine weighted tardiness
chapter: Oligopoly Krugman/Wells Economics 2009 Worth Publishers 1 of 35
chapter: 15 >> Oligopoly Krugman/Wells Economics 2009 Worth Publishers 1 of 35 WHAT YOU WILL LEARN IN THIS CHAPTER The meaning of oligopoly, and why it occurs Why oligopolists have an incentive to act
6 Creating the Animation
6 Creating the Animation Now that the animation can be represented, stored, and played back, all that is left to do is understand how it is created. This is where we will use genetic algorithms, and this
Development of dynamically evolving and self-adaptive software. 1. Background
Development of dynamically evolving and self-adaptive software 1. Background LASER 2013 Isola d Elba, September 2013 Carlo Ghezzi Politecnico di Milano Deep-SE Group @ DEIB 1 Requirements Functional requirements
BASIC STATISTICAL METHODS FOR GENOMIC DATA ANALYSIS
BASIC STATISTICAL METHODS FOR GENOMIC DATA ANALYSIS SEEMA JAGGI Indian Agricultural Statistics Research Institute Library Avenue, New Delhi-110 012 [email protected] Genomics A genome is an organism s
A New Nature-inspired Algorithm for Load Balancing
A New Nature-inspired Algorithm for Load Balancing Xiang Feng East China University of Science and Technology Shanghai, China 200237 Email: xfeng{@ecusteducn, @cshkuhk} Francis CM Lau The University of
Hidden Markov Models
8.47 Introduction to omputational Molecular Biology Lecture 7: November 4, 2004 Scribe: Han-Pang hiu Lecturer: Ross Lippert Editor: Russ ox Hidden Markov Models The G island phenomenon The nucleotide frequencies
Improving the Performance of a Computer-Controlled Player in a Maze Chase Game using Evolutionary Programming on a Finite-State Machine
Improving the Performance of a Computer-Controlled Player in a Maze Chase Game using Evolutionary Programming on a Finite-State Machine Maximiliano Miranda and Federico Peinado Departamento de Ingeniería
Online Appendix to Stochastic Imitative Game Dynamics with Committed Agents
Online Appendix to Stochastic Imitative Game Dynamics with Committed Agents William H. Sandholm January 6, 22 O.. Imitative protocols, mean dynamics, and equilibrium selection In this section, we consider
Fault Analysis in Software with the Data Interaction of Classes
, pp.189-196 http://dx.doi.org/10.14257/ijsia.2015.9.9.17 Fault Analysis in Software with the Data Interaction of Classes Yan Xiaobo 1 and Wang Yichen 2 1 Science & Technology on Reliability & Environmental
Record Deduplication By Evolutionary Means
Record Deduplication By Evolutionary Means Marco Modesto, Moisés G. de Carvalho, Walter dos Santos Departamento de Ciência da Computação Universidade Federal de Minas Gerais 31270-901 - Belo Horizonte
CLOUD DATABASE ROUTE SCHEDULING USING COMBANATION OF PARTICLE SWARM OPTIMIZATION AND GENETIC ALGORITHM
CLOUD DATABASE ROUTE SCHEDULING USING COMBANATION OF PARTICLE SWARM OPTIMIZATION AND GENETIC ALGORITHM *Shabnam Ghasemi 1 and Mohammad Kalantari 2 1 Deparment of Computer Engineering, Islamic Azad University,
Experiments on the local load balancing algorithms; part 1
Experiments on the local load balancing algorithms; part 1 Ştefan Măruşter Institute e-austria Timisoara West University of Timişoara, Romania [email protected] Abstract. In this paper the influence
Abstract Title: Planned Preemption for Flexible Resource Constrained Project Scheduling
Abstract number: 015-0551 Abstract Title: Planned Preemption for Flexible Resource Constrained Project Scheduling Karuna Jain and Kanchan Joshi Shailesh J. Mehta School of Management, Indian Institute
Schedule Risk Analysis Simulator using Beta Distribution
Schedule Risk Analysis Simulator using Beta Distribution Isha Sharma Department of Computer Science and Applications, Kurukshetra University, Kurukshetra, Haryana (INDIA) [email protected] Dr. P.K.
D-optimal plans in observational studies
D-optimal plans in observational studies Constanze Pumplün Stefan Rüping Katharina Morik Claus Weihs October 11, 2005 Abstract This paper investigates the use of Design of Experiments in observational
ECONOMIC GENERATION AND SCHEDULING OF POWER BY GENETIC ALGORITHM
ECONOMIC GENERATION AND SCHEDULING OF POWER BY GENETIC ALGORITHM RAHUL GARG, 2 A.K.SHARMA READER, DEPARTMENT OF ELECTRICAL ENGINEERING, SBCET, JAIPUR (RAJ.) 2 ASSOCIATE PROF, DEPARTMENT OF ELECTRICAL ENGINEERING,
A Non-Linear Schema Theorem for Genetic Algorithms
A Non-Linear Schema Theorem for Genetic Algorithms William A Greene Computer Science Department University of New Orleans New Orleans, LA 70148 bill@csunoedu 504-280-6755 Abstract We generalize Holland
Modified Version of Roulette Selection for Evolution Algorithms - the Fan Selection
Modified Version of Roulette Selection for Evolution Algorithms - the Fan Selection Adam S lowik, Micha l Bia lko Department of Electronic, Technical University of Koszalin, ul. Śniadeckich 2, 75-453 Koszalin,
Cellular Automaton: The Roulette Wheel and the Landscape Effect
Cellular Automaton: The Roulette Wheel and the Landscape Effect Ioan Hălălae Faculty of Engineering, Eftimie Murgu University, Traian Vuia Square 1-4, 385 Reşiţa, Romania Phone: +40 255 210227, Fax: +40
