From extreme learning machines to reservoir computing: random projections and dynamics for motion learning
Jochen Steil, 2011, Tutorial at CBIC
Intelligent Systems @ Bielefeld University
- one of 4 strategic profile areas
- two central scientific institutes: Research Institute for Cognition and Robotics (CoR-Lab) and Excellence Cluster Cognitive Interaction Technology (CITEC)
- 5 of 13 departments cooperate (Biology, Physics, Environment, Linguistics, Psychology & Sports, Technology, Chemistry, Economics, History & Philosophy, Law, Mathematics, Education, Public Health, Sociology)
- 2 special research units and 3 interdisciplinary graduate programs
- industrial partners (Honda, Miele, Bertelsmann, OWL MASCHINENBAU, ...)
- > 350 researchers, ~10 million EUR funding/year
- EU projects: italk, ROBOTDOC, HUMAVIPS, MONARCA, EMiCab, AMARSi, Echord
Research overview, themes: Dynamics & Learning; Stability Theory of Recurrent Networks; Reservoir Dynamic Networks; Neural Dynamic Movement Primitives; Robot Learning Architecture (DAAD fellowship Russia); Visual Online Learning; Neural Perceptual Grouping (input image, edge features, CLM grouping, Potts spin grouping); Interaction (speech recognition, active vision + exploration, gesture recognition, object referencing, object recognition and pose, shared attention, hand tracking); robot arm control; robot grasping.
Selected references: Steil, Dissertation, 1999; Wersing, Steil, Ritter, Neural Computation, 2001; Steil, Neurocomputing, 2002; Steil, Götting, Wersing, Körner, Neurocomputing, 2002; Steil, Roethling, Haschke, Ritter, Robotics & Autonomous Systems, 2004; Steil, Neurocomputing, 2006; Weng, Wersing, Steil, Ritter, Trans. Neural Networks, 2006; Steil, Neural Networks, 2007; Denecke, Wersing, Steil, Körner, Neurocomputing, 2007; Reinhart, Steil, Humanoids 2009; Rolf, Steil, ICDL 2009; Lemme, Steil, ESANN, 2010; Neumann, Rolf, Steil, SAB 2010; Rolf, Steil, Trans. Autonomous Mental Development, 2010; Reinhart, Steil, Differential Equations & Dynamical Systems, 2010; Denecke et al., ESANN, 2010; Wischnewski et al., Cognitive Computation, 2010.
The multi-layer perceptron (MLP): maps input u to output q.
The standard neural learning approach for the multi-layer perceptron (MLP): minimize the (quadratic) error function $E(W) = \tfrac{1}{2}\sum_k \| \hat{q}_k - q(u_k; W) \|^2$.
Compute the gradient $\nabla_W E$ by backpropagation: the output error is propagated backwards, and the weights of each layer are adapted according to its local error.
Backpropagation adapts all weights, so the hidden layer develops a task-specific hidden representation.
Challenges: how to obtain a task-specific hidden representation? how to propagate errors?
Novel approach: separate the hidden state from learning: a high-dimensional random projection provides the hidden features, and only the linear output is trained by regression.
[Diagram: input u passes through a fixed random projection; the output q is obtained by linear output regression]
Outline
- Extreme Learning Machine (ELM) & Intrinsic Plasticity: learn inverse kinematics
- ELM + Recurrence = Echo State, Reservoir Computing (RC)
- Associative Neural Learning (for kinematics): inverse kinematics + trajectory generation
- Programming Dynamics (shape attractor dynamics): learn and select multiple inverse kinematics solutions
- Learning velocity fields (movement primitives)
- Learning sequences of movement primitives
1990s: random projections for data processing
- dimension reduction; separation properties are preserved
- only linear projections used/analyzed
- a very powerful tool, see e.g. Baraniuk & Wakin, 2009
Now: random projections for dimension expansion (as with kernels)
- non-linear transformations
- a universal feature set is obtained
ELM - Extreme Learning Machine (Huang 2006)
- a high-dimensional feedforward neural network with fixed random input weights
- learning: only the output weights, by regression (+ weight-decay regularization)
Training:
- collect hidden states $H = (h_1, \dots, h_{tr})^T$ for all training inputs (state harvesting)
- collect targets $\hat{Q} = (\hat{q}_1, \dots, \hat{q}_{tr})^T$
- do linear regression: $W^{out} = (H^T H + \varepsilon I)^{-1} H^T \hat{Q}$
- regularization through ε implies model selection! (not critical/considered in Huang 2006 and other literature, because the data sets there were very large)
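To make the pipeline concrete, here is a minimal sketch of ELM training in Python/NumPy; the function names, layer size, and the logistic (Fermi) feature choice are illustrative assumptions, not taken from the tutorial:

```python
import numpy as np

def train_elm(U, Q_hat, n_hidden=200, eps=1e-3, seed=0):
    """Minimal ELM sketch: fixed random projection + ridge-regression readout."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-1.0, 1.0, size=(U.shape[1], n_hidden))  # random input weights, never trained
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                   # random biases
    H = 1.0 / (1.0 + np.exp(-(U @ W_in + b)))                   # state harvesting with Fermi features
    # linear regression with weight-decay regularization eps:
    # W_out = (H^T H + eps I)^{-1} H^T Q_hat
    W_out = np.linalg.solve(H.T @ H + eps * np.eye(n_hidden), H.T @ Q_hat)
    return W_in, b, W_out

def elm_predict(U, W_in, b, W_out):
    H = 1.0 / (1.0 + np.exp(-(U @ W_in + b)))
    return H @ W_out
```

The only trained parameters are in W_out; eps is exactly the model-selection knob discussed above.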
Some other issues: input scaling, overfitting, initializing biases and weights in the correct range. Model selection is important.
Idea: input-specific tuning of the features by adapting the nonlinear function.
Optimizing Extreme Learning Machines via Ridge Regression and Batch Intrinsic Plasticity. K. Neumann and J.J. Steil, Neurocomputing, to appear.
Intrinsic Plasticity (IP): optimize the parameters of each single neuron!
- use the parametrized Fermi function $h(x, a, b) = 1 / (1 + \exp(-(a x + b)))$
- adjust the parameters a, b online; a batch algorithm is also available: Neumann & Steil, ICANN 2011
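For illustration, a sketch of the classic online IP rule (the Triesch-style gradient rule that drives a neuron's output distribution toward an exponential with mean mu, as used in Steil, Neural Networks, 2007); this is not the batch algorithm referenced above, and eta, mu are assumed hyperparameters:

```python
import numpy as np

def ip_step(x, a, b, eta=1e-3, mu=0.2):
    """One online intrinsic-plasticity update for y = 1/(1 + exp(-(a*x + b)))."""
    y = 1.0 / (1.0 + np.exp(-(a * x + b)))
    db = eta * (1.0 - (2.0 + 1.0 / mu) * y + (y * y) / mu)  # shift the bias
    da = eta / a + db * x                                    # adapt the gain
    return a + da, b + db
```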
VIDEO Neumann, Emmerich, Steil, in preparation
Intrinsic Plasticity: input/task-specific scaling
- the set of features becomes input specific, and less diverse
- IP acts as an input regularizer
- reduces the dependence on the number of nodes and the initialization parameters! (similar to the Gaussian kernel width in SVMs)
Neumann, Emmerich, Steil, in preparation
ELM + Intrinsic Plasticity: learn inverse kinematics (static, simple, here well defined); input: position, output: joint angles.
Neumann, K., and J.J. Steil, Neurocomputing, to appear.
ELM + Intrinsic Plasticity on standard UCI tasks: we (almost) always use IP!
Neumann, K., and J.J. Steil, Neurocomputing, to appear.
IP regularizes through reduced feature complexity; too much IP produces degenerate features.
VIDEO
Outline
- Extreme Learning Machine (ELM) & Intrinsic Plasticity: learn inverse kinematics
- ELM + Recurrence = Echo State, Reservoir Computing (RC)
- Associative Neural Learning (for kinematics): kinematics + trajectory generation
- Programming Dynamics (shape attractor dynamics): learn and select multiple inverse kinematics solutions
- Learning velocity fields (movement primitives)
- Learning sequences of movement primitives
ELM + Recurrence
ELM & recurrence?
- ELM: staged processing from left to right; the network is used for feature generation
- with recurrence: the network converges to an attractor
- an attractor-based ELM = an echo state network used for static mappings
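A sketch of how the state harvesting changes once recurrence is added; the tanh nonlinearity, spectral-radius rescaling, and the fixed number of settling iterations are common echo-state conventions assumed here, not taken from the slides:

```python
import numpy as np

def harvest_esn_states(U, n_hidden=200, rho=0.9, seed=0):
    """Like the ELM, but the hidden state also mixes recurrently before readout."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-1.0, 1.0, size=(U.shape[1], n_hidden))
    W_rec = rng.standard_normal((n_hidden, n_hidden))
    W_rec *= rho / np.max(np.abs(np.linalg.eigvals(W_rec)))  # echo state property: spectral radius < 1
    H = np.zeros((len(U), n_hidden))
    for k, u in enumerate(U):
        h = np.zeros(n_hidden)
        for _ in range(30):                     # for a static mapping: let the state settle
            h = np.tanh(u @ W_in + h @ W_rec)   # recurrence adds non-linear mixtures of features
        H[k] = h
    return H  # the readout is then trained on H by ridge regression, exactly as for the ELM
```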
What does recurrence add? Non-linear mixtures of features; this counteracts the regularization effect of IP.
VIDEO
What does recurrence add?
- counteracts the regularization effect of IP
- the combination is useful: it reduces sensitivity to the model-selection parameters
[Figure: test error on the UCI Abalone task as a function of the number of hidden neurons, input scaling, and regularization ε, for ELM vs. IP + ESN]
Digression: ELM + Intrinsic Plasticity + Recurrence = reservoir learning with IP (Steil@HRI, 2006)
- IP still works despite recurrence, and is empirically very useful
- a strong regularizer w.r.t. the number of nodes, connectivity pattern, sparseness, ...
- achieves both lifetime and spatial sparseness!
[Figures: histograms of neuron outputs (% of time steps) on the Mackey-Glass and random-input tasks, without IP and after 20 and 100 IP epochs]
Steil, 2007, Neural Networks
VIDEO Steil, 2007, Neural Networks Reservoir, IP, sparseness
Application: tool manipulation for ASIMO
Rolf & Steil, LAB-RS, 2010, best paper award
Application: manipulating a stick
Neumann, Rolf & Steil, SAB 2010
[Figures: positions of the hand center points while holding a stick]
Neumann, Rolf & Steil, SAB 2010
Outline
- Extreme Learning Machine (ELM) & Intrinsic Plasticity: learn inverse kinematics
- ELM + Recurrence = Echo State, Reservoir Computing (RC)
- Associative Neural Learning (for kinematics): kinematics + trajectory generation
- Programming Dynamics (shape attractor dynamics): learn and select multiple inverse kinematics solutions
- Learning velocity fields (movement primitives)
- Learning sequences of movement primitives
Associative Neural Reservoir Learning
Associative Neural Recurrent Learning (ANRL): from staged processing to dynamics, finally adding output feedback (here ELM based).
Associative Neural Recurrent Learning (ANRL), reservoir based.
Echo State Network vs. recurrent dynamics
- Echo State (staged): layered update (single pass); it takes one time step from u to q (u → h → q); update u(k) → h(k+1), readout q(k+1) = W h(k+1)
- ANRL (recurrent): all nodes form one RNN with synchronous update; it takes two time steps from u to q (u(k) → h(k+1) → q(k+2)); the readout therefore always predicts!
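The timing difference can be made explicit in code; a sketch with assumed names (f is the neuron nonlinearity, W_fb the output-feedback weights):

```python
def staged_step(u, h, f, W_in, W_rec, W_out):
    """Echo State (staged): one pass, so u(k) -> h(k+1) and q(k+1) = W_out h(k+1)."""
    h_next = f(u @ W_in + h @ W_rec)
    return h_next, h_next @ W_out          # readout sees the fresh state immediately

def synchronous_step(u, h, q, f, W_in, W_rec, W_fb, W_out):
    """ANRL: all nodes form one RNN, updated synchronously, so u(k) -> h(k+1) -> q(k+2)."""
    h_next = f(u @ W_in + h @ W_rec + q @ W_fb)  # output feedback enters the reservoir
    return h_next, h @ W_out                      # readout uses the *old* state: it always predicts
```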
Associative Neural Recurrent Learning (ANRL). Idea: data pairs (u, q) stem from some relation (function).
"Recurrent neural associative learning of forward and inverse kinematics for movement generation of the redundant PA-10 robot", Reinhart & Steil, LAB-RS, 2008, best paper award
Associative Neural Reservoir Learning, basic ideas:
- (u, q) jointly become inputs and outputs
- store the data pairs (u, q) in attractors
- efficient learning is possible (regression/online, as before)
Basic ideas (continued):
- generalization by associative attractor completion
- choose between multiple solutions through feedback
- trajectory generation by means of transients
Our application: kinematics. Forward kinematics FK: u = FK(q) is uniquely defined (many-to-one); u is the hand position, q are the arm angles.
Kinematics: inverse kinematics IK: q = IK(u) is not uniquely defined (one-to-many): a given hand position u can be reached by several arm configurations q, e.g. elbow up vs. elbow down. Selection of a solution = redundancy resolution.
Example: redundancy resolution: elbow down vs. elbow up (a worked two-link sketch follows below).
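As a worked example of the one-to-many case, both closed-form solutions of a planar 2-link arm (the link lengths l1, l2 are illustrative assumptions, not from the slides):

```python
import numpy as np

def two_link_ik(x, y, l1=1.0, l2=1.0):
    """Return both joint-angle solutions (elbow down, elbow up) for a planar 2-link arm."""
    c2 = (x**2 + y**2 - l1**2 - l2**2) / (2 * l1 * l2)
    q2 = np.arccos(np.clip(c2, -1.0, 1.0))  # the elbow angle is determined only up to its sign
    solutions = []
    for sign in (+1, -1):                    # elbow-down and elbow-up branches
        q2s = sign * q2
        q1 = np.arctan2(y, x) - np.arctan2(l2 * np.sin(q2s), l1 + l2 * np.cos(q2s))
        solutions.append((q1, q2s))
    return solutions
```

Both pairs reach the same position (x, y); picking one of them is exactly the redundancy resolution the network has to learn.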
The association is based on data (not functions): $G = \{(u^i, q^i)\},\ i = 1 \dots N$. From the stored pairs, read out the forward kinematics (q → u) and the inverse kinematics (u → q).
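A sketch of the bidirectional readout idea, assuming a generic `harvest` function that maps the concatenated pairs to network states (all names here are illustrative):

```python
import numpy as np

def train_fk_ik_readouts(U, Q, harvest, eps=1e-3):
    """Drive the network with the stored pairs G = {(u_i, q_i)} and train two
    ridge-regression readouts on the same states: one recovers u (forward
    kinematics), the other recovers q (inverse kinematics)."""
    H = harvest(np.hstack([U, Q]))
    gram = H.T @ H + eps * np.eye(H.shape[1])
    W_fk = np.linalg.solve(gram, H.T @ U)  # read out hand positions
    W_ik = np.linalg.solve(gram, H.T @ Q)  # read out joint angles
    return W_fk, W_ik
```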
Inverse kinematics [VIDEO: example-arm-manifolds.mpeg]
Rolf, Steil, Gienger, IEEE TAMD, 2010: goal babbling permits direct learning of inverse kinematics.
Association by storing pairs (u, q) via a SOM: u = fingertip position, q = joint angles, sampled on a 10x10x10 grid. Walter & Ritter, 1996-1998
- associate (u, q) by concatenation
- train a (P)SOM network
- read out both ways and get FK and IK!
Walter & Ritter, 1996-1998
Barhen, Gulati, Zak, Intelligent robots and computer vision, 1989
ANRL for kinematics (iCub arm): include a forward model; simultaneous learning of both models.
[Figures: inverse kinematics, training data vs. network response in task space; joint angles over time steps; forward kinematics as sensory prediction]
Online learning: train from trajectories; the state in the reservoir is useful for temporal integration.
Limit case: the attractor satisfies $u^* = FK(q^*)$ and $q^* = IK(u^*)$.
Task: move the hands. Components: trajectory representation, inverse kinematics, robot control.
Our approach: learn forward and inverse kinematics in an associative reservoir network.
Generalization: a) set a new target and compute the joint angles.
Evaluation & generalization (iCub 7-DOF arm):
- excellent generalization & graceful degradation
- continuous shift of the attractor is possible
- regularization by learning
[Figures: joint angles over time steps; error distributions (probability that E < value) over error E; projection to y-z along a spiral; error vs. number of movements]
Reinhart & Steil, IEEE Conf. Humanoids, 2009
Interactive learning of redundancy resolution by ANRL FlexIRob@CoR-Lab, 2010, www.cor-lab.de/corlab/cms/flexirob, (Lemme, Rüther, Nordmann, Wrede, Steil, Weirich, Johannfunke)
VIDEO
Task: move the hands. Components: trajectory representation, inverse kinematics, robot control.
Our approach: learn forward and inverse kinematics in an associative reservoir network, plus trajectory generation.
Generalization: a) set a new target and compute the joint angles; b) movement generation by iteration toward the target attractor.
cope with feedback: internal simulation, sensory prediction Reinhart & Steil, IEEE Conf. Humanoids, 2009
Movement generation by attractor dynamics: from the controller setting to autonomous operation by sensory prediction.
Reinhart & Steil, IEEE Conf. Humanoids, 2009
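A sketch of the autonomous mode: the target position is clamped, the predicted joint angles are fed back, and the transient of the network toward the attractor is read off as the movement (the `network_step` interface is an assumption for illustration):

```python
def generate_movement(u_target, q_init, network_step, n_steps=200):
    """Movement generation by attractor dynamics (sketch).
    network_step(u, q) performs one synchronous network update and returns the
    next joint-angle estimate; iterating it relaxes the network toward the
    attractor (u*, q*), and the transient is the generated movement."""
    q, trajectory = q_init, [q_init]
    for _ in range(n_steps):
        q = network_step(u_target, q)  # sensory prediction replaces external feedback
        trajectory.append(q)
    return trajectory
```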
Movement generation by attractor dynamics (iCub arm): home-to-target and start-to-home generalization.
[Figures: generated movements in task space from start toward targets, and from targets back home]
Reinhart & Steil, IEEE Conf. Humanoids, 2009
Robustness by attractor dynamics: perturbation!
[Figure: generated movements from start to target in task space under perturbation]
Reinhart & Steil, IEEE Conf. Humanoids, 2009
Analysis: speed profiles w.r.t. Δt
[Figure: velocity over time steps for Δt = 0.02, 0.04, 0.08]
[Figure: maximal velocity vs. target distance for Δt = 0.02 ... 0.10]
Outline
- Extreme Learning Machine (ELM) & Intrinsic Plasticity: learn inverse kinematics
- ELM + Recurrence = Echo State, Reservoir Computing (RC)
- Associative Neural Learning (for kinematics): kinematics + trajectory generation
- Programming Dynamics (shape attractor dynamics): learn and select multiple inverse kinematics solutions
- Learning velocity fields (movement primitives)
- Learning sequences of movement primitives
Output feedback & programming dynamics. Problem: how to shape the feedback loop? More generally: how to imprint arbitrary dynamics? How to generalize and stabilize?
Programming Dynamics, example: a 2-D network + 1-D input to parametrize the attractor. Reinhart & Steil, ICANN 2011
Programming dynamics: directly shape the network transients toward an attractor.
- ELM + feedback, or a fully recurrent reservoir; batch learning (improved by reservoir regularization, see the references below)
Approach:
- generate a sequence of desired states
- get the weights through linear regression
- use smart sampling of the state trajectories
Reinhart & Steil, ICANN, 2011; Reinhart & Steil, Differential Equations and Dynamical Systems, 2010; Reinhart & Steil, Humanoids 2011; Reinhart, PhD thesis, 2011
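A minimal sketch of the regression step under simplifying assumptions: a plain tanh network whose entire state sequence is prescribed, with an arctanh linearization of the targets. This illustrates the "weights by linear regression" idea, not necessarily the exact method of the cited papers:

```python
import numpy as np

def program_dynamics(X, eps=1e-3):
    """Given a desired state sequence X = (x_0, ..., x_K), find recurrent weights
    W such that x(k+1) ~= tanh(W x(k)) with a single linear regression."""
    X_now, X_next = X[:-1], X[1:]
    T = np.arctanh(np.clip(X_next, -0.999, 0.999))  # invert the nonlinearity on the targets
    W = np.linalg.solve(X_now.T @ X_now + eps * np.eye(X.shape[1]), X_now.T @ T)
    return W.T  # column-vector convention: x_next = np.tanh(W_prog @ x_now)

# "smart sampling" then means choosing which transients (e.g. approaches toward
# the attractor points from many directions) enter the desired sequence X
```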
Programming multiple redundancy resolutions: generate trajectories toward attractor points $(x_i, q_i)$, where e.g. $x_1 = x_2$ and $x_3 = x_4$ but the joint configurations $q_i$ differ, so the same position is stored with different redundancy resolutions.
Reinhart & Steil, Humanoids 2011
Example: iCub right arm positioning, network setup:
- input: 4-DOF joint angles q and 3 wrist coordinates x
- training data: a systematic sample (including redundancies!) of iCub arm/hand positions
[Diagram: x and q enter through input weights into the reservoir h(x, q); readout weights produce the estimates x̂ and q̂]
Example: iCub right arm positioning, associative completion through dynamics.
Example: iCub right arm positioning, in the control loop!
Example: iCub right arm positioning, dynamical selection of solutions.
[Figure: joint angles q1...q4 over time steps; a perturbation triggers the switch to another redundant solution]
Example icub right arm positioning: also mixed constraints possible!
Outline
- Extreme Learning Machine (ELM) & Intrinsic Plasticity: learn inverse kinematics
- ELM + Recurrence = Echo State, Reservoir Computing (RC)
- Associative Neural Learning (for kinematics): kinematics + trajectory generation
- Programming Dynamics (shape attractor dynamics): learn and select multiple inverse kinematics solutions
- Learning velocity fields (movement primitives)
- Learning sequences of movement primitives
Learning velocity fields (movement primitives): with a confidence output.
Learning velocity fields (movement primitives): simulation results.
[Figures: target, demonstrations, and reproductions in the x-y plane; velocity components over x; training data from human movements]
[Figures: learned velocity field in the x-y plane with its confidence map; training data, default data, and reproduction]
Generalization with confidence Movement Primitives
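A sketch of the idea: learn a movement primitive as a velocity field with an ELM, then reproduce a motion by integrating the field. Names, sizes, and the Euler integration are illustrative; the confidence output from the slides is omitted here:

```python
import numpy as np

def learn_velocity_field(X, X_dot, n_hidden=100, eps=1e-2, seed=1):
    """Fit a velocity field x_dot = f(x) from demonstrated positions X and velocities X_dot."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-1, 1, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1, 1, size=n_hidden)
    feats = lambda P: np.tanh(P @ W_in + b)
    H = feats(X)
    W_out = np.linalg.solve(H.T @ H + eps * np.eye(n_hidden), H.T @ X_dot)
    return lambda x: feats(x[None, :])[0] @ W_out   # the learned field f(x)

def reproduce(field, x0, dt=0.01, n_steps=500):
    """Generate a movement by Euler integration of the learned field."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        xs.append(xs[-1] + dt * field(xs[-1]))
    return np.array(xs)
```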
Learning velocity fields (movement primitives) VIDEO Movement Primitives
Outlook: neural dynamic motion primitives. Integrating du/dt, as in standard motion generation via DMPs, allows learning of trajectory shapes (AMARSi D4.1 update).
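For reference, one common formulation of the DMP transformation system the slide alludes to (after Ijspeert et al.; details vary across DMP variants):

```latex
\tau \dot{z} = \alpha_z \bigl( \beta_z (g - y) - z \bigr) + f(s), \qquad
\tau \dot{y} = z, \qquad
\tau \dot{s} = -\alpha_s s
```

Here y is the generated trajectory, g the goal, s a phase variable, and the learned forcing term f(s) encodes the trajectory shape.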
Outline
- Extreme Learning Machine (ELM) & Intrinsic Plasticity: learn inverse kinematics
- ELM + Recurrence = Echo State, Reservoir Computing (RC)
- Associative Neural Learning (for kinematics): kinematics + trajectory generation
- Programming Dynamics (shape attractor dynamics): learn and select multiple inverse kinematics solutions
- Learning velocity fields (movement primitives)
- Learning sequences of movement primitives
Sequencing of movement primitives: the sequencer is trained on constant values coding the relative start coordinates and the type of movement, and on a zero value of the velocity; the feedback is switched off while running.
Lemme, work in progress, unpublished
[Figures: sequencing of movement primitives]
Summary:
- Extreme Learning Machine (ELM) & Intrinsic Plasticity: learn inverse kinematics
- ELM + Recurrence = Echo State, Reservoir Computing (RC)
- Associative Neural Learning (for kinematics): kinematics + trajectory generation
- Programming Dynamics (shape attractor dynamics): learn and select multiple inverse kinematics solutions
- Learning velocity fields (movement primitives)
- Learning sequences of movement primitives
Current work: associate visual feedback; audio-visuo-motor patterns; combine trajectory generation and inverse kinematics; provide training data by autonomous exploration; use platforms you cannot model!
VIDEO
People:
- R. F. Reinhart: associative learning, programming dynamics, iCub (PhD student)
- M. Rolf: motion learning on ASIMO, goal babbling (PhD student)
- K. Neumann: bimanual motion learning on ASIMO, intrinsic plasticity
- A. Lemme: sequencing of motion primitives
FlexIRob: A. Nordmann, A. Lemme, S. Rüther, A. Weirich, M. Johannfunke, Dr. S. Wrede, S. Krüger, M. Götting
Cognitive Systems Engineering: system integration, ASIMO, iCub, KUKA support
Publications
- Regularization and stability in reservoir networks with output feedback. R. F. Reinhart and J.J. Steil, Neurocomputing, conditionally accepted.
- Optimizing Extreme Learning Machines via Ridge Regression and Batch Intrinsic Plasticity. K. Neumann and J.J. Steil, Neurocomputing, conditionally accepted.
- Batch intrinsic plasticity for extreme learning machines. K. Neumann and J.J. Steil. ICANN, pp. 339-346, 2011.
- State prediction: a constructive method to program recurrent neural networks. R. F. Reinhart and J.J. Steil. ICANN, pp. 159-166, 2011.
- Neural learning and dynamical selection of redundant solutions for inverse kinematic control. R. F. Reinhart and J.J. Steil, IEEE Humanoids, 2011.
- A constrained regularization approach for input-driven recurrent neural networks. R. F. Reinhart and J.J. Steil. Differential Equations and Dynamical Systems, 19:27-46, 2011.
- Reservoir regularization stabilizes learning of Echo State Networks with output feedback. R. F. Reinhart and J.J. Steil, ESANN, pp. 59-64, 2011.
- Teaching and Learning Redundancy Resolution for Autonomous Generation of Flexible Robot Movements. S. Wrede, M. Johannfunke, A. Lemme, A. Nordmann, S. Rüther, A. Weirich, J.J. Steil, Workshop Computational Intelligence, GMA-FA 5.14, Dortmund, 2010.
- Learning Flexible Full Body Kinematics for Humanoid Tool Use. M. Rolf, J.J. Steil and M. Gienger, Int. Symp. Learning and Adaptive Behavior in Robotic Systems, 2010.
- Learning Inverse Kinematics for Pose-Constraint Bi-Manual Movements. K. Neumann, M. Rolf, J.J. Steil and M. Gienger, Int. Conf. Simulation of Adaptive Behavior, 2010.
- Recurrence enhances the spatial encoding of static inputs in reservoir networks. C. Emmerich, R. F. Reinhart, and J.J. Steil. ICANN, pp. 148-153, 2010.
- Attractor-based computation with reservoirs for online learning of inverse kinematics. R. F. Reinhart and J.J. Steil, Proc. ESANN, pp. 257-262, 2009.
- Efficient exploration and learning of whole body kinematics. M. Rolf, J.J. Steil, and M. Gienger, Proc. Int. Conf. Developmental Learning, 2009.
- Reaching movement generation with a recurrent neural network based on learning inverse kinematics. R. F. Reinhart and J.J. Steil. IEEE Conf. Humanoid Robotics, pp. 323-330, 2009.
- Recurrent neural associative learning of forward and inverse kinematics for movement generation of the redundant PA-10 robot. R. F. Reinhart and J.J. Steil. Learning and Adaptive Behavior in Robotic Systems, pp. 35-40, 2008.
- Improving reservoirs using intrinsic plasticity. B. Schrauwen, M. Wardermann, D. Verstraeten, J.J. Steil, D. Stroobandt, Neurocomputing, pp. 1159-1171, 2008.
- Online reservoir adaptation by intrinsic plasticity for backpropagation-decorrelation and echo state learning. J.J. Steil. Neural Networks, 20(3):353-364, 2007.
- Online stability of backpropagation-decorrelation recurrent learning. J.J. Steil, Neurocomputing, 69(7-9):642-650, 2006.
(some, few) related publications:
- Efficient online learning of a non-negative sparse autoencoder. A. Lemme, R. F. Reinhart, and J.J. Steil. ESANN, pp. 1-6, 2010.
- Exploring the Role of Intrinsic Plasticity for the Learning of Sensory Representations. N. Butko and J. Triesch. Neurocomputing, 70(7-9):1130-1138, 2007.
- Reservoir computing approaches to recurrent neural network training. M. Lukoševičius and H. Jaeger, 2009.
- The "echo state" approach to analysing and training recurrent neural networks. H. Jaeger, 2001.
- Random Projections of Smooth Manifolds. R. Baraniuk and M. Wakin, Foundations of Computational Mathematics, 9(1):51-77, 2009.
- Extreme learning machine: theory and applications. G.-B. Huang, Q.-Y. Zhu, and C.-K. Siew, Neurocomputing, 70(1-3):489-501, 2006.
- Universal approximation using incremental constructive feedforward networks with random hidden nodes. G.-B. Huang, L. Chen, and C.-K. Siew, IEEE Transactions on Neural Networks, 17(4):879-892, 2006.
Thank you for your attention!
More information: www.cor-lab.de/educational-material, www.cor-lab.de/jsteil, www.amarsi-project.eu