Solving differential equations using neural networks


M. M. Chiaramonte and M. Kiener

1 INTRODUCTION

The numerical solution of ordinary and partial differential equations (DEs) is essential to many engineering fields. Traditional methods, such as finite elements, finite volumes, and finite differences, rely on discretizing the domain and weakly solving the DEs over this discretization. While these methods are generally adequate and effective in many engineering applications, one limitation is that the obtained solutions are discrete or have limited differentiability. To avoid this issue when numerically solving DEs (i.e., to obtain a differentiable solution that can be evaluated continuously on the domain), one can use a different method, which relies on neural networks (NNs). The purpose of this study is to outline this method, implement it for some examples, and analyze some of its error properties.

2 FORMULATION

The study is restricted to second-order equations of the form

    G(x, \Psi(x), \nabla\Psi(x), \nabla^2\Psi(x)) = 0, \quad x \in D,    (1)

where x \in R^n is the independent variable over the domain D \subset R^n, and \Psi(x) is the unknown (scalar-valued) solution. The boundary of the domain is decomposed as \partial D = \partial_d D \cup \partial_\tau D with \partial_d D \cap \partial_\tau D = \emptyset, where \partial_d D is the portion of \partial D on which essential boundary conditions (BCs) are specified. This study is restricted to problems with only essential BCs: for a given function \hat{\Psi}(x),

    \Psi(x) = \hat{\Psi}(x), \quad x \in \partial_d D.

To approximately solve the above using an NN, a trial form of the solution is assumed:

    \Psi_t(x, p) = \hat{\Psi}(x) + F(x) N(x, p),    (2)

where N(x, p) is a feedforward NN with parameters p. The scalar-valued function F(x) is chosen so as not to contribute to the BCs: F(x) = 0 for x \in \partial_d D. This allows the overall function \Psi_t(x, p) to automatically satisfy the BCs. A subtle point is that (the single function) \hat{\Psi}(x) must often be constructed from piecewise BCs (see Section 3). Furthermore, for a given problem there are multiple ways to construct \hat{\Psi}(x) and F(x), though often there will be an obvious choice.

The task is then to learn the parameters p such that Eqn. 1 is approximately solved by the form in Eqn. 2. To do this, the original equation is relaxed to a discretized version and approximately solved. More specifically, for a discretization of the domain \hat{D} = \{ x^{(i)} \in D;\ i = 1, \dots, m \}, Eqn. 1 is relaxed to hold only at these points:

    G(x^{(i)}, \Psi(x^{(i)}), \nabla\Psi(x^{(i)}), \nabla^2\Psi(x^{(i)})) = 0, \quad i = 1, \dots, m.    (3)

Note that this relaxation is general and independent of the form in Eqn. 2. Because a given NN may not be able to satisfy Eqn. 3 exactly at each discrete point, the problem is relaxed further: find a trial solution that nearly satisfies Eqn. 3 by minimizing a related error index. Specifically, the error index is

    J(p) = \sum_{i=1}^{m} G(x^{(i)}, \Psi_t(x^{(i)}, p), \nabla_x \Psi_t(x^{(i)}, p), \nabla_x^2 \Psi_t(x^{(i)}, p))^2,    (4)

and the optimal trial solution is \Psi_t(x, p^*), where p^* = \arg\min_p J(p).
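For concreteness, the error index of Eqn. 4 can be assembled with automatic differentiation. The sketch below uses JAX; the helper name make_error_index and the calling conventions for trial and G are illustrative assumptions, not code from this study.

```python
import jax
import jax.numpy as jnp

def make_error_index(trial, G, points):
    # Build J(p) from Eqn. 4 for a given trial solution and residual.
    #   trial(params, x)        -> scalar trial solution Psi_t at a point x
    #   G(x, psi, dpsi, d2psi)  -> scalar PDE residual at x
    #   points                  -> (m, n) array, the discretization D_hat
    def J(params):
        psi = lambda x: trial(params, x)
        dpsi = jax.grad(psi)          # gradient of Psi_t with respect to x
        d2psi = jax.hessian(psi)      # Hessian of Psi_t with respect to x
        res = jax.vmap(lambda x: G(x, psi(x), dpsi(x), d2psi(x)))(points)
        return jnp.sum(res ** 2)      # sum of squared residuals over D_hat
    return J
```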

The optimal parameters p^* can be obtained numerically by a number of different optimization methods (these methods may reach a local optimum of Eqn. 4 rather than the global optimum), such as backpropagation or the quasi-Newton BFGS algorithm. Regardless of the method, once the parameters p^* have been obtained, the trial solution \Psi_t(x, p^*) is a smooth approximation to the true solution that can be evaluated continuously on the domain. A schematic of the NN used in this study is shown in Fig. 1.

Figure 1: Schematic of the NN, with n + 1 input nodes (x_1, ..., x_n plus a bias node fixed at 1), H hidden nodes, and 1 output node N(x, p).

There are n + 1 input nodes (including a bias node) and a single hidden layer of H nodes with sigmoid activation functions. The single scalar output is thus given by

    N(x, v, W) = v^T g(W \bar{x}),    (5)

where v \in R^H and W \in R^{H \times (n+1)} are the specific NN parameters (replacing the general parameter representation p). The input variable is \bar{x} = [x^T, 1]^T, where the bar indicates the appended 1 used to account for the bias at each of the hidden units. The function g : R^H \to R^H applies the sigmoid component-wise to the hidden-layer inputs W\bar{x}.

Given the above, the overall task is to choose the discretization \hat{D} and the number of hidden nodes H, and then minimize Eqn. 4 to obtain the approximation \Psi_t(x, p^*). Assuming a given numerical method reliably obtains the solution p^*, this leaves the discretization and the hidden layer size as the basic design choices. Intuitively, it is expected that the solution accuracy will increase with a finer discretization and a larger hidden layer (i.e., greater NN complexity), but at the expense of computation and possible overfitting. These trends will be explored in the examples. Ultimately, one would like to obtain an approximation of sufficient accuracy using a minimum of computational effort and NN complexity.

3 EXAMPLES

The method is now showcased by solving two sample partial differential equations (PDEs). In both examples, n = 2 and the domain was taken to be the unit square D = [0, 1] \times [0, 1] with a uniform grid discretization \hat{D} = \{ (i/K, j/K);\ i = 0, \dots, K,\ j = 0, \dots, K \}, so that m = (K + 1)^2. Both backpropagation and the BFGS algorithm were initially implemented to train the parameters; BFGS converged more quickly, so it was ultimately used for the final examples. Furthermore, through trial and error it was found that including a regularization term in Eqn. 4 helped keep the parameters at relatively small magnitudes; without this term, the parameters occasionally became very large in magnitude. The regularization term also seemed to provide some marginal benefits in error and convergence time compared to the unregularized implementation.
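A minimal sketch of this setup, with the parameters packed into one flat vector so that BFGS can optimize them, might look as follows. The L2 form of the penalty (weight lam) is an assumption, since the exact regularization term is not specified here, and all helper names are illustrative.

```python
import jax
import jax.numpy as jnp
from jax.scipy.optimize import minimize

def make_network(H, n):
    # Eqn. 5: N(x, v, W) = v^T g(W xbar), with xbar = [x^T, 1]^T and a
    # single hidden layer of H sigmoid units. W and v live in one flat
    # vector theta = [vec(W), v].
    def unpack(theta):
        W = theta[: H * (n + 1)].reshape(H, n + 1)
        v = theta[H * (n + 1):]
        return W, v

    def N(theta, x):
        W, v = unpack(theta)
        xbar = jnp.concatenate([x, jnp.ones(1)])   # append the bias input
        return jnp.dot(v, jax.nn.sigmoid(W @ xbar))

    n_params = H * (n + 1) + H
    return N, n_params

def train(J, n_params, lam=1e-4, seed=0):
    # Minimize the regularized error index with BFGS; the L2 penalty is
    # an assumed stand-in for the report's unspecified regularizer.
    theta0 = 0.1 * jax.random.normal(jax.random.PRNGKey(seed), (n_params,))
    objective = lambda theta: J(theta) + lam * jnp.sum(theta ** 2)
    return minimize(objective, theta0, method="BFGS").x
```

Packing W and v into a single vector theta mirrors the text's replacement of (v, W) by the generic parameter list p.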

3.1 Laplace's Equation

The first example is the elliptic Laplace's equation:

    \nabla^2 \Psi(x) = 0, \quad x \in D.    (6)

The BCs were chosen as

    \Psi(x) = 0 for x \in \{ (x_1, x_2) \in \partial D : x_1 = 0, x_1 = 1, or x_2 = 0 \},
    \Psi(x) = \sin(\pi x_1) for x \in \{ (x_1, x_2) \in \partial D : x_2 = 1 \}.    (7)

The analytical solution is

    \Psi(x) = \frac{e^{\pi x_2} - e^{-\pi x_2}}{e^{\pi} - e^{-\pi}} \sin(\pi x_1).    (8)

Using the BCs, the trial solution was constructed as

    \Psi_t(x, v, W) = x_2 \sin(\pi x_1) + x_1 (1 - x_1)\, x_2 (1 - x_2)\, N(x, v, W).    (9)

For the case of K = 16 and H = 6, the numerical solution and the corresponding error from the analytical solution are shown in Fig. 2. The numerical solution is in good agreement with the analytical solution, with a maximum error of about 2 \times 10^{-4}.

Figure 2: Solution to Laplace's equation (Eqn. 6) for the BCs in Eqn. 7: (a) the computed solution \Psi_t(x, v^*, W^*); (b) the error of the computed solution from the analytical solution, \Psi(x) - \Psi_t(x, v^*, W^*).
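Putting the pieces together for this example, reusing the illustrative make_network, make_error_index, and train helpers sketched above, an end-to-end sketch is:

```python
import jax
import jax.numpy as jnp

def solve_laplace(K=16, H=6):
    # Uniform (K+1) x (K+1) grid on the unit square.
    g = jnp.linspace(0.0, 1.0, K + 1)
    X1, X2 = jnp.meshgrid(g, g)
    points = jnp.stack([X1.ravel(), X2.ravel()], axis=1)

    N, n_params = make_network(H, n=2)

    def trial(theta, x):
        # Eqn. 9: Psi_hat = x2 sin(pi x1) carries the BCs, while
        # F = x1 (1 - x1) x2 (1 - x2) vanishes on the whole boundary.
        x1, x2 = x[0], x[1]
        return (x2 * jnp.sin(jnp.pi * x1)
                + x1 * (1 - x1) * x2 * (1 - x2) * N(theta, x))

    def residual(x, psi, dpsi, d2psi):
        # Eqn. 6: the Laplace residual is the trace of the Hessian.
        return d2psi[0, 0] + d2psi[1, 1]

    J = make_error_index(trial, residual, points)
    theta = train(J, n_params)

    # Eqn. 8, written compactly with sinh.
    exact = jnp.sin(jnp.pi * X1) * jnp.sinh(jnp.pi * X2) / jnp.sinh(jnp.pi)
    approx = jax.vmap(lambda x: trial(theta, x))(points).reshape(X1.shape)
    return jnp.max(jnp.abs(exact - approx))
```

Sweeping H at fixed K and K at fixed H with this helper reproduces the kind of study described in Section 4.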

3.2 Conservation law

The next example is the hyperbolic conservation law PDE

    \frac{\partial \Psi(x)}{\partial x_1} + \frac{\partial \Psi(x)}{\partial x_2} = 0, \quad x \in D,    (10)

where the BCs were chosen as

    \Psi(x) = 1 + \exp(-x_1), \quad x_1 \in [0, 1], \ x_2 = 0.    (11)

The analytical solution is

    \Psi(x) = 1 + e^{x_2 - x_1}.    (12)

Using the BCs, the trial solution was constructed as

    \Psi_t(x, v, W) = 1 + \exp(-x_1) + x_2\, N(x, v, W).    (13)

The network parameters were again obtained for K = 16 and H = 6, and the solution and error are shown in Fig. 3. The numerical and analytical solutions are in good agreement, with a maximum error of about 2.5 \times 10^{-3}.

Although the errors in both examples are small, the error for Laplace's equation is about an order of magnitude smaller than that of the hyperbolic equation. While this may be due in part to the different nature of the solutions, the different BCs may also have an effect. In Laplace's equation the BCs constrain the solution around the entire square domain (since the equation is second order in both variables), while in the hyperbolic equation the BCs only constrain the solution along the bottom edge (since the equation is first order in both variables). Because the trial solution automatically satisfies the BCs through the construction of F(x), the BCs along the entire boundary in Laplace's equation most likely contribute to the smaller overall error throughout the domain.

Figure 3: Solution to the hyperbolic conservation law (Eqn. 10) for the BCs in Eqn. 11: (a) the computed solution \Psi_t(x, v^*, W^*); (b) the error of the computed solution from the analytical solution, \Psi(x) - \Psi_t(x, v^*, W^*).
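Relative to the Laplace sketch, only the trial form and the residual change; the grid and the illustrative helpers carry over unchanged:

```python
import jax
import jax.numpy as jnp

def solve_conservation(K=16, H=6):
    g = jnp.linspace(0.0, 1.0, K + 1)
    X1, X2 = jnp.meshgrid(g, g)
    points = jnp.stack([X1.ravel(), X2.ravel()], axis=1)

    N, n_params = make_network(H, n=2)

    def trial(theta, x):
        # Eqn. 13: Psi_hat = 1 + exp(-x1) carries the BC on the bottom
        # edge, and F = x2 vanishes there, leaving N unconstrained.
        x1, x2 = x[0], x[1]
        return 1.0 + jnp.exp(-x1) + x2 * N(theta, x)

    def residual(x, psi, dpsi, d2psi):
        # Eqn. 10: dPsi/dx1 + dPsi/dx2 = 0, first order in both variables.
        return dpsi[0] + dpsi[1]

    J = make_error_index(trial, residual, points)
    theta = train(J, n_params)

    exact = 1.0 + jnp.exp(X2 - X1)   # Eqn. 12
    approx = jax.vmap(lambda x: trial(theta, x))(points).reshape(X1.shape)
    return jnp.max(jnp.abs(exact - approx))
```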

4 ERROR PROPERTIES

As discussed previously, it is intuitively expected that refining the discretization and increasing the size of the hidden layer will increase the accuracy of the solution. To study this, Laplace's equation was solved for a number of choices of K and H, and the maximum error over the domain, \| \Psi(x) - \Psi_t(x, v^*, W^*) \|_{\max}, was recorded for each solution. To assess the dependence on H, solutions were obtained for H = 2, 4, 8, and 16 with fixed K = 8. To assess the dependence on K, solutions were obtained for K = 4, 8, and 16 with fixed H = 4. The results are shown in Fig. 4. From the first figure, the error steadily decreases for H = 2, 4, and 8 but plateaus at H = 16. This suggests that, for the given discretization, a network complexity greater than H = 8 yields diminishing returns in reducing error. From the second figure, the error steadily decreases with increasing mesh refinement. It is unclear how this trend continues for finer discretizations with K > 16.

Figure 4: Error trends for Laplace's equation: (a) maximum error versus hidden layer sizes H = 2, 4, 8, and 16 for fixed mesh size K = 8; (b) maximum error versus discretization sizes K = 4, 8, and 16 for fixed hidden layer size H = 4.

5 CONCLUSIONS AND FUTURE WORK

In this study, a framework for the numerical solution of DEs using NNs has been showcased for several examples. The benefit of this method is that the trial solution (via the trained NN) is a smooth approximation that can be evaluated and differentiated continuously on the domain. This is in contrast with the discrete or non-smooth solutions obtained by traditional schemes. Although the method has been implemented successfully, there are several areas of possible improvement. Because there is a considerable tradeoff between the size of the discretization training set (and hence solution accuracy) and the cost of training the NN, it could be useful to devise adaptive training-set generation to balance this tradeoff. Also, this study used a uniform rectangular discretization of the domain, so future studies could explore nonuniform discretizations. This could be especially useful in examples with irregular boundaries, where more sample points might be needed in some regions of the domain than in others.
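As a sketch of the adaptive training-set generation suggested above (a possible direction, not something implemented in this study), one could periodically enrich the grid with the points where the trained trial solution has the largest PDE residual; residual_fn(theta, x) below is an assumed pointwise-residual helper.

```python
import jax
import jax.numpy as jnp

def refine(points, residual_fn, theta, n_new=32):
    # One adaptive-refinement step: score a dense set of random
    # candidate points by the squared PDE residual of the trained
    # trial solution, then keep the worst offenders as new training
    # points. All names here are illustrative assumptions.
    key = jax.random.PRNGKey(len(points))
    candidates = jax.random.uniform(key, (1024, points.shape[1]))
    r = jax.vmap(lambda x: residual_fn(theta, x) ** 2)(candidates)
    worst = jnp.argsort(-r)[:n_new]   # indices of the largest residuals
    return jnp.concatenate([points, candidates[worst]], axis=0)
```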