Using abstract models of behaviors to automatically generate reinforcement learning hierarchies


1 Advanced Seminar (Hauptseminar) Intelligent Autonomous Systems
Using abstract models of behaviors to automatically generate reinforcement learning hierarchies
Christian Sosnowski
Supervisor: Freek Stulp
Technische Universität München, Fakultät für Informatik, Forschungs- und Lehreinheit Informatik IX

2 Content
- Introduction
- Fundamentals
- Building Task Hierarchies through planning
- The P-HSMQ Algorithm
- Termination Improvement
- Conclusion

3 Introduction
- Intelligent autonomous systems
- Learning in general: no detailed programming is needed to solve a problem
- Reinforcement learning: the solution can be unknown in advance; a proven method in artificial intelligence
- But: curse of dimensionality. Increased complexity raises the number of states exponentially (see the example)

4 Introduction: The example
- The robot knows its position
- It can move to one of the eight neighboring cells
- It can pick up and carry an object
- The task: pick up the coffee and the book and carry them to the lounge

5 Introduction: Structure of the problem
- Solution: use of background knowledge
- Avoid useless exploration
- Provide general guidance
- Let the system learn inside a limited scope
- That's what this presentation is about


7 Fundamentals: The Markov Decision Process (MDP)
- Restriction on the necessary information
- No hidden states
- No dependence on history
- All transition probabilities depend only on the current state and action
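The "no dependence on history" condition is usually written as the Markov property. A sketch in standard notation; the symbols s_t, a_t and P are assumptions and do not appear on the slides:

```latex
P(s_{t+1} \mid s_t, a_t, s_{t-1}, a_{t-1}, \dots, s_0, a_0) = P(s_{t+1} \mid s_t, a_t)
```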

8 Fundamentals: Q-learning
- Incremental learning algorithm (step by step)
- Uses a table to store the Q-value of each state-action pair
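For reference, the standard tabular Q-learning update can be written as follows. This is the textbook form, not a formula recovered from the slides; the symbols α (learning rate), γ (discount factor), r (reward) and s' (successor state) are assumed names:

```latex
Q(s, a) \leftarrow (1 - \alpha)\, Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') \right]
```

The arithmetic in the small example has the same shape: old value weighted by 1 - α, plus α times the reward and the discounted best successor value.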

9 Fundamentals: Q-learning, small example
- Rectangle of 3 x 4 positions
- The robot is at A4 and has to reach C1
- α = 0.1, γ = 0.9

10 Fundamentals: Q-learning, small example
- A random path is chosen: A4-A3-A2-B2-C2-C1
- Only the value at position C2 is updated: C2 = 0.9*0 + 0.1*(0 + 0.9*1) = 0.09

11 Fundamentals: Q-learning, small example
- Path: A4-B4-C4-B3-A3-B2-C1
- Only the value at position B2 is updated: B2 = 0.9*0 + 0.1*(0 + 0.9*1) = 0.09

12 Fundamentals: Q-learning, small example
- Path: A4-A3-B3-B2-C1
- The following values are updated:
- B3 = 0.9*0 + 0.1*(0 + 0.9*0.09) = 0.0081
- B2 = 0.9*0.09 + 0.1*(0 + 0.9*1) = 0.171
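The three paths of the small example can be replayed in a few lines. This is a minimal sketch under assumptions that the slide arithmetic suggests but does not state: every move has reward 0, the goal cell C1 has a fixed value of 1, Q-values are stored per move (cell, next cell), and the "value at a position" is the best Q-value of any move tried from that cell. The names `replay` and `cell_value` are illustrative.

```python
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9        # learning rate and discount factor of the example
GOAL, GOAL_VALUE = "C1", 1.0   # assumption: the goal cell has a fixed value of 1
REWARD = 0.0                   # assumption: every move yields reward 0

PATHS = [
    ["A4", "A3", "A2", "B2", "C2", "C1"],
    ["A4", "B4", "C4", "B3", "A3", "B2", "C1"],
    ["A4", "A3", "B3", "B2", "C1"],
]

def cell_value(q, cell):
    """Best Q-value over all moves tried so far from `cell`."""
    if cell == GOAL:
        return GOAL_VALUE
    return max((v for (s, _), v in q.items() if s == cell), default=0.0)

def replay(paths):
    """Apply the Q-learning update along each path, one move at a time."""
    q = defaultdict(float)  # q[(cell, next_cell)]; unseen moves start at 0
    for path in paths:
        for s, s_next in zip(path, path[1:]):
            target = REWARD + GAMMA * cell_value(q, s_next)
            q[(s, s_next)] = (1 - ALPHA) * q[(s, s_next)] + ALPHA * target
    return q

q = replay(PATHS)
for cell in ("C2", "B2", "B3"):
    print(cell, round(cell_value(q, cell), 4))
```

Under these assumptions the replay ends with C2 = 0.09, B2 = 0.171 and B3 = 0.0081.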

13 Fundamentals: Q-learning, small example
- Finally the algorithm converges to the following numbers: [figure]

14 Fundamentals: The curse of dimensionality
- In primitive examples Q-learning converges nicely
- In practice its performance is very poor
- Real-world problems have multi-dimensional state spaces
- The number of state-action pairs rises exponentially

15 Fundamentals: The curse of dimensionality, real-world problem
- Flying an airplane
- Every gauge has numerous discrete readings (e.g. the dive angle, 0-90°)
- With every gauge the number of states rises exponentially: 90 (dive angle), 0-600 (speed) and the altitude in ft together give on the order of 3*10^8 states
- [Video: aircraft view]
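The growth can be made concrete by multiplying the number of readings per gauge. The sketch below uses only the two gauge sizes that are legible on the slide (90 dive-angle readings, 600 speed readings); the helper `q_table_size` and the "10 readings per gauge" series are illustrative assumptions.

```python
from math import prod

def q_table_size(readings_per_gauge, n_actions):
    """Number of Q-table entries: one per (state, action) pair."""
    n_states = prod(readings_per_gauge)  # every combination of readings is a state
    return n_states * n_actions

# Two gauges from the slide: dive angle and speed.
two_gauges = q_table_size([90, 600], n_actions=1)

# Hypothetical series: k gauges with 10 readings each give 10**k states.
growth = [q_table_size([10] * k, n_actions=1) for k in range(1, 7)]

print(two_gauges)  # 54000
print(growth)      # [10, 100, 1000, 10000, 100000, 1000000]
```

Each additional gauge multiplies the table size by its number of readings, which is why a flat Q-table quickly becomes impractical.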


17 Building Task Hierarchies through planning
- There are no general-purpose solutions
- Divide a problem into smaller subtasks and try to solve them one after the other
- Malcolm R. K. Ryan defines behaviors that describe the subtasks and combines them into a plan

18 Building Task Hierarchies through planning: Subtasks in the example
- Go(Room1, Room2)
- Get(Object, Room)

19 Building Task Hierarchies through planning: Building the plan
- Formal language for describing the state and the goal
- Teleo-operators define a goal-directed behavior with a pre- and a postcondition
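A teleo-operator can be pictured as a named behavior with a set of facts it requires and a set of facts it is meant to achieve. The sketch below is an assumption about one possible encoding, not the representation used in the talk; the class `TeleoOperator`, the fact strings and the room names `hall` and `kitchen` are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TeleoOperator:
    name: str
    pre: frozenset   # facts that must hold before the behavior may start
    post: frozenset  # facts the behavior is meant to achieve

    def applicable(self, state):
        """True if every precondition fact holds in `state`."""
        return self.pre <= state

    def achieved(self, state):
        """True if every postcondition fact holds in `state`."""
        return self.post <= state

# Hypothetical instances of the Go and Get subtasks.
go = TeleoOperator(
    "Go(hall, kitchen)",
    pre=frozenset({"robot_in(hall)"}),
    post=frozenset({"robot_in(kitchen)"}),
)
get = TeleoOperator(
    "Get(coffee, kitchen)",
    pre=frozenset({"robot_in(kitchen)", "in(coffee, kitchen)"}),
    post=frozenset({"holding(coffee)"}),
)

start = frozenset({"robot_in(hall)", "in(coffee, kitchen)"})
after_go = frozenset({"robot_in(kitchen)", "in(coffee, kitchen)"})

print(go.applicable(start), get.applicable(start))   # True False
print(go.achieved(after_go), get.applicable(after_go))  # True True
```

A planner can chain such operators by matching the postcondition of one behavior against the precondition of the next, as Go followed by Get does here.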

20 Building Task Hierarchies through planning
- The plan is represented as a tree
- More than one node can be active

21 Building Task Hierarchies through planning: Combining Planning and Learning
- Combining the plan with reinforcement learning
- Local reward function for executing an action a in state s, resulting in a transition to state s'
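The formula of the local reward function did not survive in this transcription. The sketch below shows one plausible shape, stated purely as an assumption: a behavior is rewarded when the successor state satisfies its postcondition, penalized when the successor state leaves its precondition without reaching the postcondition, and receives nothing otherwise. The function name, the predicate arguments and the reward values 1, -1 and 0 are hypothetical.

```python
def local_reward(pre_holds, post_holds, next_state):
    """Assumed local reward of one behavior for a transition into `next_state`.

    pre_holds / post_holds are predicates over states (hypothetical interface).
    """
    if post_holds(next_state):
        return 1.0   # the behavior reached its goal
    if not pre_holds(next_state):
        return -1.0  # the behavior left its region of applicability
    return 0.0       # still under way

# Toy behavior on a 1-D corridor: applicable in cells 0..4, goal is cell 5.
pre = lambda s: 0 <= s <= 4
post = lambda s: s == 5

print(local_reward(pre, post, 5))   # 1.0
print(local_reward(pre, post, 3))   # 0.0
print(local_reward(pre, post, -1))  # -1.0
```

Because the reward depends only on the behavior's own conditions, each behavior can be trained inside its limited scope, independently of the overall task reward.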

22 Building Task Hierarchies through planning: Combining Planning and Learning
- Overall aim:


24-27 The P-HSMQ Algorithm [figures]

28 The P-HSMQ Algorithm: Experiment 1
- Compared: P-HSMQ, HSMQ with all behaviors, plan without HRL
- α = 0.1, γ = 0.95
- Trial length below 500
- HSMQ all behaviors:
- P-HSMQ:


30 Termination Improvement
- P-HSMQ always finishes a behavior
- It ignores effects that might cause the actions to be no longer appropriate
- The example with the bump
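The difference between always finishing a behavior and interrupting it can be illustrated with a toy loop. This is not the TRQ algorithm from the talk, only a sketch of the idea under assumed names: `run_behavior`, `step`, `done` and `still_appropriate` are hypothetical, and the state is a plain integer.

```python
def run_behavior(step, state, done, still_appropriate=None, max_steps=100):
    """Run one behavior and return (final_state, steps_taken, interrupted)."""
    for n in range(1, max_steps + 1):
        state = step(state)
        if done(state):
            return state, n, False
        # With a check supplied, stop as soon as the behavior no longer fits.
        if still_appropriate is not None and not still_appropriate(state):
            return state, n, True
    return state, max_steps, False

step = lambda s: s + 1      # toy behavior: advance one cell per step
done = lambda s: s >= 10    # toy postcondition: reach cell 10
fits = lambda s: s < 4      # toy check: behavior stops fitting from cell 4 on

always_finish = run_behavior(step, 0, done)
interruptible = run_behavior(step, 0, done, still_appropriate=fits)

print(always_finish)   # (10, 10, False)
print(interruptible)   # (4, 4, True)
```

In the first call the behavior runs to completion regardless of what happens on the way; in the second, control returns to the higher level as soon as the check fails, so a more suitable behavior can be chosen.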

31 Termination Improvement [figure]

32 Termination Improvement: Experiment 2
- Comparing P-HSMQ with TRQ
- α = 0.1, γ = 0.95, probability of spilling the coffee = 0.1
- Trial length below 500
- P-HSMQ:
- TRQ:
- Final learnt policy:


34 Conclusion
- Q-learning alone is only a basic means of solving a problem
- Combining abstract models (plans) with reinforcement learning improves performance significantly
- High-level development of plans by humans vs. reinforcement learning for low-level optimization
- Back to the example of the autopilot
- Still to solve:
- Analyze what went wrong if a plan failed
- Let the system invent new behaviors on its own to fit the circumstances that arise

35 Any questions?

Feature Selection with Monte-Carlo Tree Search

Feature Selection with Monte-Carlo Tree Search Feature Selection with Monte-Carlo Tree Search Robert Pinsler 20.01.2015 20.01.2015 Fachbereich Informatik DKE: Seminar zu maschinellem Lernen Robert Pinsler 1 Agenda 1 Feature Selection 2 Feature Selection

More information

Motivation. Motivation. Can a software agent learn to play Backgammon by itself? Machine Learning. Reinforcement Learning

Motivation. Motivation. Can a software agent learn to play Backgammon by itself? Machine Learning. Reinforcement Learning Motivation Machine Learning Can a software agent learn to play Backgammon by itself? Reinforcement Learning Prof. Dr. Martin Riedmiller AG Maschinelles Lernen und Natürlichsprachliche Systeme Institut

More information

Reinforcement Learning of Task Plans for Real Robot Systems

Reinforcement Learning of Task Plans for Real Robot Systems Reinforcement Learning of Task Plans for Real Robot Systems Pedro Tomás Mendes Resende pedro.resende@ist.utl.pt Instituto Superior Técnico, Lisboa, Portugal October 2014 Abstract This paper is the extended

More information

Machine Learning: Overview

Machine Learning: Overview Machine Learning: Overview Why Learning? Learning is a core of property of being intelligent. Hence Machine learning is a core subarea of Artificial Intelligence. There is a need for programs to behave

More information

Options with exceptions

Options with exceptions Options with exceptions Munu Sairamesh and Balaraman Ravindran Indian Institute Of Technology Madras, India Abstract. An option is a policy fragment that represents a solution to a frequent subproblem

More information

Learning a wall following behaviour in mobile robotics using stereo and mono vision

Learning a wall following behaviour in mobile robotics using stereo and mono vision Learning a wall following behaviour in mobile robotics using stereo and mono vision P. Quintía J.E. Domenech C.V. Regueiro C. Gamallo R. Iglesias Dpt. Electronics and Systems. Univ. A Coruña pquintia@udc.es,

More information

Urban Traffic Control Based on Learning Agents

Urban Traffic Control Based on Learning Agents Urban Traffic Control Based on Learning Agents Pierre-Luc Grégoire, Charles Desjardins, Julien Laumônier and Brahim Chaib-draa DAMAS Laboratory, Computer Science and Software Engineering Department, Laval

More information

Databases and Information Systems 1 Part 3: Storage Structures and Indices

Databases and Information Systems 1 Part 3: Storage Structures and Indices bases and Information Systems 1 Part 3: Storage Structures and Indices Prof. Dr. Stefan Böttcher Fakultät EIM, Institut für Informatik Universität Paderborn WS 2009 / 2010 Contents: - database buffer -

More information

Targeted Advertising and Consumer Privacy Concerns Experimental Studies in an Internet Context

Targeted Advertising and Consumer Privacy Concerns Experimental Studies in an Internet Context TECHNISCHE UNIVERSITAT MUNCHEN Lehrstuhl fur Betriebswirtschaftslehre - Dienstleistungsund Technologiemarketing Targeted Advertising and Consumer Privacy Concerns Experimental Studies in an Internet Context

More information

Learning the Structure of Factored Markov Decision Processes in Reinforcement Learning Problems

Learning the Structure of Factored Markov Decision Processes in Reinforcement Learning Problems Learning the Structure of Factored Markov Decision Processes in Reinforcement Learning Problems Thomas Degris Thomas.Degris@lip6.fr Olivier Sigaud Olivier.Sigaud@lip6.fr Pierre-Henri Wuillemin Pierre-Henri.Wuillemin@lip6.fr

More information

Lecture 5: Model-Free Control

Lecture 5: Model-Free Control Lecture 5: Model-Free Control David Silver Outline 1 Introduction 2 On-Policy Monte-Carlo Control 3 On-Policy Temporal-Difference Learning 4 Off-Policy Learning 5 Summary Introduction Model-Free Reinforcement

More information

Agreement on. Dual Degree Master Program in Computer Science KAIST. Technische Universität Berlin

Agreement on. Dual Degree Master Program in Computer Science KAIST. Technische Universität Berlin Agreement on Dual Degree Master Program in Computer Science between KAIST Department of Computer Science and Technische Universität Berlin Fakultät für Elektrotechnik und Informatik (Fakultät IV) 1 1 Subject

More information

Reinforcement Learning

Reinforcement Learning Reinforcement Learning LU 2 - Markov Decision Problems and Dynamic Programming Dr. Martin Lauer AG Maschinelles Lernen und Natürlichsprachliche Systeme Albert-Ludwigs-Universität Freiburg martin.lauer@kit.edu

More information

The Basics of Robot Mazes Teacher Notes

The Basics of Robot Mazes Teacher Notes The Basics of Robot Mazes Teacher Notes Why do robots solve Mazes? A maze is a simple environment with simple rules. Solving it is a task that beginners can do successfully while learning the essentials

More information

Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining

Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining George Konidaris Computer Science Department University of Massachusetts Amherst Amherst MA 01003 USA gdk@cs.umass.edu

More information

Chi-square Tests Driven Method for Learning the Structure of Factored MDPs

Chi-square Tests Driven Method for Learning the Structure of Factored MDPs Chi-square Tests Driven Method for Learning the Structure of Factored MDPs Thomas Degris Thomas.Degris@lip6.fr Olivier Sigaud Olivier.Sigaud@lip6.fr Pierre-Henri Wuillemin Pierre-Henri.Wuillemin@lip6.fr

More information

Multiple Network Marketing coordination Model

Multiple Network Marketing coordination Model REPORT DOCUMENTATION PAGE Form Approved OMB No. 0704-0188 The public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions,

More information

ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING)

ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING) ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING) Gabriela Ochoa http://www.cs.stir.ac.uk/~goc/ OUTLINE Preliminaries Classification and Clustering Applications

More information

Reinforcement Learning

Reinforcement Learning Reinforcement Learning LU 2 - Markov Decision Problems and Dynamic Programming Dr. Joschka Bödecker AG Maschinelles Lernen und Natürlichsprachliche Systeme Albert-Ludwigs-Universität Freiburg jboedeck@informatik.uni-freiburg.de

More information

An Experience Based Learning Controller

An Experience Based Learning Controller J. Intelligent Learning Systems & Applications, 2010, 2: 80-85 doi:10.4236/jilsa.2010.22011 Published Online May 2010 (http://.scirp.org/journal/jilsa) An Experience Based Learning Controller Debadutt

More information

Integrating Artificial Intelligence. Software Testing

Integrating Artificial Intelligence. Software Testing Integrating Artificial Intelligence in Software Testing Roni Stern and Meir Kalech, ISE department, BGU Niv Gafni, Yair Ofir and Eliav Ben-Zaken, Software Eng., BGU 1 Abstract Artificial Intelligence Planning

More information

Machine Learning and Statistics: What s the Connection?

Machine Learning and Statistics: What s the Connection? Machine Learning and Statistics: What s the Connection? Institute for Adaptive and Neural Computation School of Informatics, University of Edinburgh, UK August 2006 Outline The roots of machine learning

More information

Analysis of Micromouse Maze Solving Algorithms

Analysis of Micromouse Maze Solving Algorithms 1 Analysis of Micromouse Maze Solving Algorithms David M. Willardson ECE 557: Learning from Data, Spring 2001 Abstract This project involves a simulation of a mouse that is to find its way through a maze.

More information

Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data

Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data Knowledge Discovery and Data Mining Unit # 2 1 Structured vs. Non-Structured Data Most business databases contain structured data consisting of well-defined fields with numeric or alphanumeric values.

More information

Inductive QoS Packet Scheduling for Adaptive Dynamic Networks

Inductive QoS Packet Scheduling for Adaptive Dynamic Networks Inductive QoS Packet Scheduling for Adaptive Dynamic Networks Malika BOURENANE Dept of Computer Science University of Es-Senia Algeria mb_regina@yahoo.fr Abdelhamid MELLOUK LISSI Laboratory University

More information

Time Hopping Technique for Faster Reinforcement Learning in Simulations

Time Hopping Technique for Faster Reinforcement Learning in Simulations BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 11, No 3 Sofia 2011 Time Hopping Technique for Faster Reinforcement Learning in Simulations Petar Kormushev 1, Kohei Nomoto

More information

Attack Frameworks and Tools

Attack Frameworks and Tools Network Architectures and Services, Georg Carle Faculty of Informatics Technische Universität München, Germany Attack Frameworks and Tools Pranav Jagdish Betreuer: Nadine Herold Seminar Innovative Internet

More information

A Sarsa based Autonomous Stock Trading Agent

A Sarsa based Autonomous Stock Trading Agent A Sarsa based Autonomous Stock Trading Agent Achal Augustine The University of Texas at Austin Department of Computer Science Austin, TX 78712 USA achal@cs.utexas.edu Abstract This paper describes an autonomous

More information

Intelligent Agents Serving Based On The Society Information

Intelligent Agents Serving Based On The Society Information Intelligent Agents Serving Based On The Society Information Sanem SARIEL Istanbul Technical University, Computer Engineering Department, Istanbul, TURKEY sariel@cs.itu.edu.tr B. Tevfik AKGUN Yildiz Technical

More information

Eligibility Traces. Suggested reading: Contents: Chapter 7 in R. S. Sutton, A. G. Barto: Reinforcement Learning: An Introduction MIT Press, 1998.

Eligibility Traces. Suggested reading: Contents: Chapter 7 in R. S. Sutton, A. G. Barto: Reinforcement Learning: An Introduction MIT Press, 1998. Eligibility Traces 0 Eligibility Traces Suggested reading: Chapter 7 in R. S. Sutton, A. G. Barto: Reinforcement Learning: An Introduction MIT Press, 1998. Eligibility Traces Eligibility Traces 1 Contents:

More information

Figure 1: Cost and Speed of Access of different storage components. Page 30

Figure 1: Cost and Speed of Access of different storage components. Page 30 Reinforcement Learning Approach for Data Migration in Hierarchical Storage Systems T.G. Lakshmi, R.R. Sedamkar, Harshali Patil Department of Computer Engineering, Thakur College of Engineering and Technology,

More information

UNIVERSITÄTSBIBLIOTHEK

UNIVERSITÄTSBIBLIOTHEK UNIVERSITÄTSBLIOTHEK Zeitschriften im Abonnement Fach: Informatik : Elektronische Zeitschriften finden Sie in der Elektronischen Zeitschriftenbibliothek EZB. Standort : Bereichsbibliothek Informatik Standort

More information

Final Exam. Route Computation: One reason why link state routing is preferable to distance vector style routing.

Final Exam. Route Computation: One reason why link state routing is preferable to distance vector style routing. UCSD CSE CS 123 Final Exam Computer Networks Directions: Write your name on the exam. Write something for every question. You will get some points if you attempt a solution but nothing for a blank sheet

More information

Hierarchical Reinforcement Learning in Computer Games

Hierarchical Reinforcement Learning in Computer Games Hierarchical Reinforcement Learning in Computer Games Marc Ponsen, Pieter Spronck, Karl Tuyls Maastricht University / MICC-IKAT. {m.ponsen,p.spronck,k.tuyls}@cs.unimaas.nl Abstract. Hierarchical reinforcement

More information

Learning Agents: Introduction

Learning Agents: Introduction Learning Agents: Introduction S Luz luzs@cs.tcd.ie October 22, 2013 Learning in agent architectures Performance standard representation Critic Agent perception rewards/ instruction Perception Learner Goals

More information

Creating a NL Texas Hold em Bot

Creating a NL Texas Hold em Bot Creating a NL Texas Hold em Bot Introduction Poker is an easy game to learn by very tough to master. One of the things that is hard to do is controlling emotions. Due to frustration, many have made the

More information

Intelligent Flexible Automation

Intelligent Flexible Automation Intelligent Flexible Automation David Peters Chief Executive Officer Universal Robotics February 20-22, 2013 Orlando World Marriott Center Orlando, Florida USA Trends in AI and Computing Power Convergence

More information

Using Markov Decision Processes to Solve a Portfolio Allocation Problem

Using Markov Decision Processes to Solve a Portfolio Allocation Problem Using Markov Decision Processes to Solve a Portfolio Allocation Problem Daniel Bookstaber April 26, 2005 Contents 1 Introduction 3 2 Defining the Model 4 2.1 The Stochastic Model for a Single Asset.........................

More information

Buyout and Distressed Private Equity: Performance and Value Creation

Buyout and Distressed Private Equity: Performance and Value Creation TECHNISCHE UNIVERSITAT MUNCHEN Lehrstuhl fur Betriebswirtschaftslehre - Finanzmanagement und Kapitalmarkte (Univ.-Prof. Dr. Christoph Kaserer) Buyout and Distressed Private Equity: Performance and Value

More information

Reducing Operational Costs in Cloud Social TV: An Opportunity for Cloud Cloning

Reducing Operational Costs in Cloud Social TV: An Opportunity for Cloud Cloning IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 16, NO. 6, OCTOBER 2014 1739 Reducing Operational Costs in Cloud Social TV: An Opportunity for Cloud Cloning Yichao Jin, Yonggang Wen, Senior Member, IEEE, Han Hu,

More information

Introduction to GPU Computing

Introduction to GPU Computing Matthis Hauschild Universität Hamburg Fakultät für Mathematik, Informatik und Naturwissenschaften Technische Aspekte Multimodaler Systeme December 4, 2014 M. Hauschild - 1 Table of Contents 1. Architecture

More information

Statistical Models in Data Mining

Statistical Models in Data Mining Statistical Models in Data Mining Sargur N. Srihari University at Buffalo The State University of New York Department of Computer Science and Engineering Department of Biostatistics 1 Srihari Flood of

More information

Autonomous Learning of Domain Models using Two-Dimensional Probability Distributions

Autonomous Learning of Domain Models using Two-Dimensional Probability Distributions Autonomous Learning of Domain Models using Two-Dimensional Probability Distributions Witold Słowiński Computing Science, University of Aberdeen, Scotland r03ws8@abdn.ac.uk Frank Guerin Computing Science,

More information

for High Performance Computing

for High Performance Computing Technische Universität München Institut für Informatik Lehrstuhl für Rechnertechnik und Rechnerorganisation Automatic Performance Engineering Workflows for High Performance Computing Ventsislav Petkov

More information

Seminar. Path planning using Voronoi diagrams and B-Splines. Stefano Martina stefano.martina@stud.unifi.it

Seminar. Path planning using Voronoi diagrams and B-Splines. Stefano Martina stefano.martina@stud.unifi.it Seminar Path planning using Voronoi diagrams and B-Splines Stefano Martina stefano.martina@stud.unifi.it 23 may 2016 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International

More information

Random Map Generator v1.0 User s Guide

Random Map Generator v1.0 User s Guide Random Map Generator v1.0 User s Guide Jonathan Teutenberg 2003 1 Map Generation Overview...4 1.1 Command Line...4 1.2 Operation Flow...4 2 Map Initialisation...5 2.1 Initialisation Parameters...5 -w xxxxxxx...5

More information

What did the Wright brothers invent?

What did the Wright brothers invent? What did the Wright brothers invent? The airplane, right? Well, not exactly. Page 1 of 15 The Wrights never claimed to have invented the airplane, or even the first airplane to fly. In their own words,

More information

Data Mining Techniques Chapter 7: Artificial Neural Networks

Data Mining Techniques Chapter 7: Artificial Neural Networks Data Mining Techniques Chapter 7: Artificial Neural Networks Artificial Neural Networks.................................................. 2 Neural network example...................................................

More information

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning. Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott

More information

We employed reinforcement learning, with a goal of maximizing the expected value. Our bot learns to play better by repeated training against itself.

We employed reinforcement learning, with a goal of maximizing the expected value. Our bot learns to play better by repeated training against itself. Date: 12/14/07 Project Members: Elizabeth Lingg Alec Go Bharadwaj Srinivasan Title: Machine Learning Applied to Texas Hold 'Em Poker Introduction Part I For the first part of our project, we created a

More information

Bayesian networks - Time-series models - Apache Spark & Scala

Bayesian networks - Time-series models - Apache Spark & Scala Bayesian networks - Time-series models - Apache Spark & Scala Dr John Sandiford, CTO Bayes Server Data Science London Meetup - November 2014 1 Contents Introduction Bayesian networks Latent variables Anomaly

More information

Using the lean startup method for an internet broker in the recycling industry

Using the lean startup method for an internet broker in the recycling industry Fakultät für Informatik Technische Universität München Using the lean startup method for an internet broker in the recycling industry Master Thesis Stefan Weymann Software Engineering betrieblicher Informationssysteme

More information

INNOVATIVE METHODS AND TECHNIQUES FOR HIGH-PERFORMANCE AND - RELIABILITY MODELING AND SIMULATION (M&S)

INNOVATIVE METHODS AND TECHNIQUES FOR HIGH-PERFORMANCE AND - RELIABILITY MODELING AND SIMULATION (M&S) INNOVATIVE METHODS AND TECHNIQUES FOR HIGH-PERFORMANCE AND - RELIABILITY MODELING AND SIMULATION (M&S) Prof. Dr. Axel Lehmann titut für Technik Intelligenter Systeme (ITIS) at the Universität der Bundeswehr

More information

CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen

CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen LECTURE 3: DATA TRANSFORMATION AND DIMENSIONALITY REDUCTION Chapter 3: Data Preprocessing Data Preprocessing: An Overview Data Quality Major

More information

Gibbs Sampling and Online Learning Introduction

Gibbs Sampling and Online Learning Introduction Statistical Techniques in Robotics (16-831, F14) Lecture#10(Tuesday, September 30) Gibbs Sampling and Online Learning Introduction Lecturer: Drew Bagnell Scribes: {Shichao Yang} 1 1 Sampling Samples are

More information

Scalable Source Routing

Scalable Source Routing Scalable Source Routing January 2010 Thomas Fuhrmann Department of Informatics, Self-Organizing Systems Group, Technical University Munich, Germany Routing in Networks You re there. I m here. Scalable

More information

Preliminaries: Problem Definition Agent model, POMDP, Bayesian RL

Preliminaries: Problem Definition Agent model, POMDP, Bayesian RL POMDP Tutorial Preliminaries: Problem Definition Agent model, POMDP, Bayesian RL Observation Belief ACTOR Transition Dynamics WORLD b Policy π Action Markov Decision Process -X: set of states [x s,x r

More information

Simulated Annealing Based Hierarchical Q-Routing: a Dynamic Routing Protocol

Simulated Annealing Based Hierarchical Q-Routing: a Dynamic Routing Protocol Simulated Annealing Based Hierarchical Q-Routing: a Dynamic Routing Protocol Antonio Mira Lopez Power Costs Inc. Norman, OK 73072 alopez@powercosts.com Douglas R. Heisterkamp Computer Science Department

More information

West Virginia University College of Engineering and Mineral Resources. Computer Engineering 313 Spring 2010

West Virginia University College of Engineering and Mineral Resources. Computer Engineering 313 Spring 2010 College of Engineering and Mineral Resources Computer Engineering 313 Spring 2010 Laboratory #4-A (Micromouse Algorithms) Goals This lab introduces the modified flood fill algorithm and teaches how to

More information

NEURAL NETWORKS AND REINFORCEMENT LEARNING. Abhijit Gosavi

NEURAL NETWORKS AND REINFORCEMENT LEARNING. Abhijit Gosavi NEURAL NETWORKS AND REINFORCEMENT LEARNING Abhijit Gosavi Department of Engineering Management and Systems Engineering Missouri University of Science and Technology Rolla, MO 65409 1 Outline A Quick Introduction

More information

Statistical Validation and Data Analytics in ediscovery. Jesse Kornblum

Statistical Validation and Data Analytics in ediscovery. Jesse Kornblum Statistical Validation and Data Analytics in ediscovery Jesse Kornblum Administrivia Silence your mobile Interactive talk Please ask questions 2 Outline Introduction Big Questions What Makes Things Similar?

More information

Data Warehousing und Data Mining

Data Warehousing und Data Mining Data Warehousing und Data Mining Multidimensionale Indexstrukturen Ulf Leser Wissensmanagement in der Bioinformatik Content of this Lecture Multidimensional Indexing Grid-Files Kd-trees Ulf Leser: Data

More information

Market-based Multirobot Coordination Using Task Abstraction
