BigDFT An introduction

Size: px
Start display at page:

Download "BigDFT An introduction"

Transcription

1 Atomistic and molecular simulations on massively parallel architectures CECAM PARIS, FRANCE An introduction Thierry Deutsch, Damien Caliste, Luigi Genovese L_Sim - CEA Grenoble 17 July 2013

2 Objectives of the lesson To know what are the main parameters in a DFT calculation To run a calculation (i.e. the physicist problem) 1 2 Exchange-correlation 3 High-Performance Computing (parallel and GPU) 4

3 A basis for nanosciences: the project STREP European project: ( ) Four partners, 15 contributors: CEA-INAC Grenoble, U. Basel, U. Louvain-la-Neuve, U. Kiel Aim: To develop an ab-initio DFT code based on Daubechies Wavelets, to be integrated in ABINIT, distributed freely (GNU-GPL license) L. Genovese, A. Neelov, S. Goedecker, T. Deutsch, et al., Daubechies wavelets as a basis set for density functional pseudopotential calculations, J. Chem. Phys. 129, (2008)

4 version 1.7: capabilities Free, surface and periodic boundary conditions Geometry optimizations (with constraints) Born-Oppenheimer Molecular Dynamics Saddle point searches (Nudged-Elastic Band Method) Vibrations External electric fields Unoccupied KS orbitals Collinear and Non-collinear magnetism All functionals of the ABINIT package Hybrid functionals Empirical van der Waals interactions (many flavors) Also available within the ABINIT package

5 Kohn-Sham formalism H-K theorem: E is an unknown functional of the density E = E[ρ] Density Functional Theory Kohn-Sham approach Mapping of an interacting many-electron system into a system with independent particles moving into an effective potential. Find a set of orthonormal orbitals Ψ i (r) that minimizes: E = 1 2 N/2 i=1 N/2 ρ(r) = i=1 Ψ i (r) 2 Ψ i (r)dr + 1 ρ(r)v H (r)dr 2 + E xc [ρ(r)] + V ext (r)ρ(r)dr Ψ i (r)ψ i (r) 2 V H (r) = 4πρ(r)

6 Spirit of Few input files (posinp.xyz, input.dft) Few input parameters ex. not parameters for parallel calculations Use input files with mandatory parameters Optional input files as input.kpt (for periodic systems) or input.perf (parameters for performance)

7 How to run Some pre-processing bigdft-tool nprocs [name] Help to find the optimal number of processors for the run that fit into the computer memory. Also useful to check that all input files are without errors. The main run mpirun -np 8 bigdft [name] tee [name].log Number of MPI tasks : 8 OpenMP parallelization : Yes Maximal OpenMP threads per MPI task : 1 Material acceleration : The outputs No #iproc=0 log is output on stdout, so redirect it. generated data are stored in a subdirectory called data[-name].

8 The input files (or not) posinp.xyz: the atomic coordinates and box definition. psppar.element: pseudo-potentials for each element; input.dft: DFT related parameters; input.geopt: geometrie relaxations and MD; input.kpt: the k-point mesh and band structure path; input.mix: diagonalisation mechanism; input.sic, tddft, occ, perf: other parameters; Only the coordinates are mandatory. All lines of different input files are mandatory and output on log. When is run, all default values are output in files called default.xxx. Naming scheme All input files may be renamed by providing a name argument to the command line i.e. bigdft name.

9 Atomic positions: posinp.xyz or posinp.ascii units reduced, angstroem, bohr or atomic Boundary conditions periodic, surface 8 reduced periodic Si Si Si Si Si Si Si Si

10 Atomic positions: posinp.xyz or posinp.ascii units reduced, angstroem, bohr or atomic Boundary conditions periodic, surface 3 angstroem surface O H H

11 Atomic positions: posinp.xyz or posinp.ascii units reduced, angstroem, bohr or atomic Boundary conditions periodic, surface 5 atomic Si H H H H

12 Atomic coordinate format uses XYZ format plus additional modifications: physical unit, after the number of atoms on the first line put either bohr or angstroem. boundary conditions, on second line, start with one of freebc, surface X lat 0 Z lat or periodic X lat Y lat Z lat. frozen positions, add a f, fy or fxz at the end of an atom line. input polarisation, add values at the end of an atom line. On output, forces are added for each element at the end of the XYZ file: 2 angstroem freebc Na Cl forces Na E Cl E-2 0 0

13 Performing a DFT calculation A self-consistent equation ρ(r) = Ψ i (r)ψ i (r), where ψ i satisfies i ( 12 ) 2 + V H [ρ] + V xc [ρ] + V ext + V pseudo ψ i = E i ψ i, (Kohn-Sham) DFT Ingredients A basis set for expressing the ψ i An potential, functional of the density several approximations exists (LDA,GGA,... ) A choice of the pseudopotential (if not all-electrons) (norm conserving, ultrasoft, PAW,... ) An (iterative) algorithm for finding the wavefunctions ψ i A (good) computer...

14 s in real space: Wavelet basis sets Properties of wavelet basis sets: localized both in real and in Fourier space allow for adaptivity (for core electrons) are a systematic basis set A basis set both adaptive and systematic, real space based x φ(x) ψ(x)

15 A brief description of wavelet theory Two kind of basis functions A Multi-Resolution real space basis The functions can be classified following the resolution level they span. Scaling Functions The functions of low resolution level are a linear combination of high-resolution functions = + m φ(x) = h j φ(2x j) j= m Centered on a resolution-dependent grid: φ j = φ 0 (x j).

16 A brief description of wavelet theory Wavelets They contain the DoF needed to complete the information which is lacking due to the coarseness of the resolution. φ(2x) = m j= m = h j φ(x j) + m j= m g j ψ(x j) Increase the resolution without modifying grid space SF + W = same DoF of SF of higher resolution ψ(x) = m j= m g j φ(2x j) All functions have compact support, centered on grid points.

17 Adaptivity of the mesh Adaptivity Resolution can be refined following the grid point. The grid is divided in: Data can thus be stored in compressed form. Localization property Low resolution pts (SF, 1 DoF) High resolution pts (SF + W, 8 DoF) Points of different resolution belong to the same grid. Empty regions must not be filled with basis functions. Optimal for big inhomogeneous systems, O(N) approach.

18 Adaptivity of the mesh Atomic positions (H 2 O), hgrid

19 Adaptivity of the mesh Fine grid (high resolution, frmult)

20 Adaptivity of the mesh Coarse grid (low resolution, crmult)

21 Defining the basis set in input.dft Main specific variables to control the basis set hgrid the grid size frmult the expansion around atoms for fine grid crmult the expansion around atoms for coarse grid Convergence properties Decreasing hgrid more degrees of freedom. Increasing crmult less box constraint. It is important to improve both parameters together to avoid error on one hiding improvement made on the other one.

22 input.dft: hgrid, crmult, frmult hgrid Step grid: twice as bigger than the lowest gaussian coefficient of pseudopotential Roughly 0.45 (only orthorhombic mesh) crmult Extension of the coarse resolution (factor 6.): depends on the extension of the HOMO for the atom frmult Extension of the fine resolution (factor 8.): depends on the pseudopotential coefficient In practice, we use only hgrid from 0.3 to 0.55.

23 Wavelets: No integration error Orthogonality, scaling relation dx φ k (x)φ j (x) = δ kj φ(x) = 1 2 m j= m h j φ(2x j) The hamiltonian-related quantities can be calculated up to machine precision in the given basis. The accuracy is only limited by the basis set (O ( h 14 grid) ) Exact evaluation of kinetic energy Obtained by convolution with filters: f(x) = c l φ l (x), 2 f(x) = c l φ l (x), l l c l = c j a l j, a l φ 0 (x) 2 xφ l (x), j

24 Wavelets: Separability in 3D The 3-dim scaling basis is a tensor product decomposition of 1-dim Scaling Functions/ Wavelets. φ e x,e y,e z j x,j y,j z (x,y,z) = φ e x j x (x)φ e y j y (y)φ e z j z (z) With (j x,j y,j z ) the node coordinates, φ (0) j and φ (1) j the SF and the W respectively. Gaussians and wavelets The separability of the basis allows us to save computational time when performing scalar products with separable functions (e.g. gaussians): Initial wavefunctions (input guess) Poisson solver Non-local pseudopotentials

25 Performing a DFT calculation A self-consistent equation ρ(r) = Ψ i (r)ψ i (r), where ψ i satisfies i ( 12 ) 2 + V H [ρ] + V xc [ρ] + V ext + V pseudo ψ i = E i ψ i, (Kohn-Sham) DFT Ingredients A basis set for expressing the ψ i An potential, functional of the density several approximations exists (LDA,GGA,... ) A choice of the pseudopotential (if not all-electrons) (norm conserving, ultrasoft, PAW,... ) An (iterative) algorithm for finding the wavefunctions ψ i A (good) computer...

26 Gaussian type separable s (HGH) Local part V loc (r) = Z [ ] [ ion r erf + exp 1 ( ) ] 2 r r 2rloc 2 r loc { ( ) 2 ( ) 4 ( r r r C 1 + C 2 + C 3 + C 4 r loc r loc Nonlocal (separable) part H( r, r ) H sep ( r, r ) = p l 1(r) = 2 r l+ 3 2 l 2 i=1 r loc Y s,m (ˆr) p s i (r) h s i p s i (r ) Ys,m(ˆr ) m + Y p,m (ˆr) p p 1 (r) hp 1 pp 1 (r ) Yp,m(ˆr ) m r l+ 7 2 l ) 6 } r l e 1 2 ( r ) 2 r l p Γ(l l r l+2 e 1 2 ( r ) 2 r l 2(r) = ) Γ(l + 7) 2

27 file (as psppar.n) No need if you use LDA or PBE functionals Default pseudopotentials inside! Put in the same directory as the calculation. Large library: Accurate and transferable

28 Performing a DFT calculation A self-consistent equation ρ(r) = Ψ i (r)ψ i (r), where ψ i satisfies i ( 12 ) 2 + V H [ρ] + V xc [ρ] + V ext + V pseudo ψ i = E i ψ i, (Kohn-Sham) DFT Ingredients A basis set for expressing the ψ i An potential, functional of the density several approximations exists (LDA,GGA,... ) A choice of the pseudopotential (if not all-electrons) (norm conserving, ultrasoft, PAW,... ) An (iterative) algorithm for finding the wavefunctions ψ i A (good) computer...

29 Exchange-Correlation Energy (input.dft) E xc [ρ] = K exact [ρ] K KS [ρ] + V e e exact [ρ] V Hartree [ρ] K KS [ρ]: Kinetic energy for a non-interaction electron gas Kohn-Sham formalism is exact (mapping of an interacting electron system into a non-interacting system) Many types from ABINIT: Use same codes (ixc: LDA=1, PBE=11, see manual) lib: code for exchange code for correlation ( see manual) Hybrid functionals from lib

30 Performing a DFT calculation A self-consistent equation ρ(r) = Ψ i (r)ψ i (r), where ψ i satisfies i ( 12 ) 2 + V H [ρ] + V xc [ρ] + V ext + V pseudo ψ i = E i ψ i, (Kohn-Sham) DFT Ingredients A basis set for expressing the ψ i An potential, functional of the density several approximations exists (LDA,GGA,... ) A choice of the pseudopotential (if not all-electrons) (norm conserving, ultrasoft, PAW,... ) An (iterative) algorithm for finding the wavefunctions ψ i A (good) computer...

31 Kohn-Sham Equations: Operators Apply different operators Having a one-electron hamiltonian in a mean field. [ 1 ] V eff (r)(r) ψ i = ε i ψ i V eff (r) = V ext (r) + V ρ(r ) r r dr + µ xc (r) E[ρ] can be expressed by the orthonormalized states of one particule: ψ i (r) with the fractional occupancy number f i (0 f i 1) : ρ(r) = f i ψ i (r) 2 i

32 Kohn-Sham Equations: Computing Energies Calculate different integrals E[ρ] = K [ρ] + U[ρ] U[ρ] = K [ρ] = m e i V dr f iψ i 2 ψ i dr V ext(r)ρ(r)+ 1 ρ(r)ρ(r ) dr dr V 2 V r r } {{ } Hartree We minimise with the variables ψ i (r) and f i with the constraint dr ρ(r) = N el. V + E xc [ρ] } {{ } exchange correlation

33 KS Equations: Self-Consistent Field Set of self-consistent equations: { 1 } V eff ψ i = ε i ψ i 2 m e with an effective potential: V eff (r) = V ext (r)+ ) dr ρ(r V r r } {{ } Hartree δe xc + δρ(r) } {{ } exchange correlation and: ρ(r) = 2 i f i ψ i (r) 2

34 KS Equations: Self-Consistent Field Set of self-consistent equations: { 1 } V eff ψ i = ε i ψ i 2 m e with an effective potential: V eff (r) = V ext (r)+ ) dr ρ(r V r r } {{ } Hartree δe xc + δρ(r) } {{ } exchange correlation and: ρ(r) = 2 i f i ψ i (r) 2 Poisson Equation: V Hartree = ρ (Laplacian: = 2 x y z 2 )

35 KS Equations: Self-Consistent Field Set of self-consistent equations: { 1 } V eff ψ i = ε i ψ i 2 m e with an effective potential: V eff (r) = V ext (r)+ ) dr ρ(r V r r } {{ } Hartree δe xc + δρ(r) } {{ } exchange correlation and: ρ(r) = 2 i f i ψ i (r) 2 Poisson Equation: V Hartree = ρ (Laplacian: = 2 x y z 2 )

36 KS Equations: Self-Consistent Field Set of self-consistent equations: { 1 } V eff ψ i = ε i ψ i 2 m e with an effective potential: V eff (r) = V ext (r)+ ) dr ρ(r V r r } {{ } Hartree δe xc + δρ(r) } {{ } exchange correlation and: ρ(r) = 2 i f i ψ i (r) 2 Poisson Equation: V Hartree = ρ (Laplacian: = 2 x y z 2 )

37 KS Equations: Self-Consistent Field Set of self-consistent equations: { 1 } V eff ψ i = ε i ψ i 2 m e with an effective potential: V eff (r) = V ext (r)+ ) dr ρ(r V r r } {{ } Hartree δe xc + δρ(r) } {{ } exchange correlation and: ρ(r) = 2 i f i ψ i (r) 2 Poisson Equation: V Hartree = ρ (Laplacian: = 2 x y z 2 )

38 KS Equations: Self-Consistent Field Set of self-consistent equations: { 1 } V eff ψ i = ε i ψ i 2 m e with an effective potential: V eff (r) = V ext (r)+ ) dr ρ(r V r r } {{ } Hartree δe xc + δρ(r) } {{ } exchange correlation and: ρ(r) = 2 i f i ψ i (r) 2 Poisson Equation: V Hartree = ρ (Laplacian: = 2 x y z 2 )

39 Direct Minimisation: Flowchart { } ψ i = caφ i a a Basis: adaptive mesh (0,...,N φ ) Orthonormalized

40 Direct Minimisation: Flowchart { } ψ i = caφ i a a inv FWT + Magic Filter Basis: adaptive mesh (0,...,N φ ) Orthonormalized occ ρ(r) = i ψ i (r) 2 Fine non-adaptive mesh: < 8N φ

41 Direct Minimisation: Flowchart { } ψ i = caφ i a a inv FWT + Magic Filter Basis: adaptive mesh (0,...,N φ ) Orthonormalized occ ρ(r) = i ψ i (r) 2 Fine non-adaptive mesh: < 8N φ Poisson solver V H (r) = V xc [ρ(r)] G(.)ρ V NL ({ψ i }) V effective

42 Direct Minimisation: Flowchart { } ψ i = caφ i a a inv FWT + Magic Filter Basis: adaptive mesh (0,...,N φ ) Orthonormalized occ ρ(r) = i ψ i (r) 2 Fine non-adaptive mesh: < 8N φ Poisson solver V H (r) = V xc [ρ(r)] G(.)ρ V NL ({ψ i }) V effective Kinetic Term

43 Direct Minimisation: Flowchart { } ψ i = caφ i a a inv FWT + Magic Filter Basis: adaptive mesh (0,...,N φ ) Orthonormalized occ ρ(r) = i ψ i (r) 2 Fine non-adaptive mesh: < 8N φ Poisson solver V H (r) = V xc [ρ(r)] G(.)ρ V NL ({ψ i }) FWT FWT δca i = E total c i (a) + Λ ij ca j j V effective Kinetic Term Λ ij =< ψ i H ψ j >

44 Direct Minimisation: Flowchart { } ψ i = caφ i a a inv FWT + Magic Filter Basis: adaptive mesh (0,...,N φ ) Orthonormalized occ ρ(r) = i ψ i (r) 2 Fine non-adaptive mesh: < 8N φ Poisson solver V H (r) = V xc [ρ(r)] G(.)ρ V NL ({ψ i }) FWT δca i = E total c i (a) + Λ ij ca j j c new,i a FWT preconditioning = c i a + h step δc i a V effective Kinetic Term Λ ij =< ψ i H ψ j > Steepest Descent, DIIS

45 Direct Minimisation: Flowchart { } ψ i = caφ i a a inv FWT + Magic Filter Basis: adaptive mesh (0,...,N φ ) Orthonormalized occ ρ(r) = i ψ i (r) 2 Fine non-adaptive mesh: < 8N φ Poisson solver V H (r) = V xc [ρ(r)] G(.)ρ V NL ({ψ i }) FWT Stop when δc i a small δca i = E total c i (a) + Λ ij ca j j c new,i a FWT preconditioning = c i a + h step δc i a V effective Kinetic Term Λ ij =< ψ i H ψ j > Steepest Descent, DIIS

46 input.dft: idsx DIIS history Use DIIS algorithm Direct Inversion in the Iterative Subspace Use idsx=6 previous iterations Memory-consuming (decrease if too big) Steepest descent if idsx=0 itermax=50 Specify maximum number of iterations Typically 15 iterations to converge gnrm_cv=1.e-4 convergence criterion Residue δψ 2 < gnrm_cv

47 Description of the file input.dft hx, hy, hz crmult, frmult, coarse and fine radius 1 ixc: parameter (LDA=1,PBE=11) ncharge: charge of the system, Electric field 1 0 nspin=1 non-spin polarization, total magnetic moment 1.E-04 gnrm_cv: convergence criterion gradient itermax, nrepmax 7 6 # CG (preconditioning), length of diis history 0 dispersion correction functional (0, 1, 2, 3) InputPsiId, output_wf, output_grid length of the tail, # tail CG iterations davidson treatment, # virtual orbitals, # plotted orbitals T disable the symmetry detection

48 Description of the file input.dft hx, hy, hz crmult, frmult, coarse and fine radius 1 ixc: parameter (LDA=1,PBE=11) ncharge: charge of the system, Electric field 1 0 nspin=1 non-spin polarization, total magnetic moment 1.E-04 gnrm_cv: convergence criterion gradient itermax, nrepmax 7 6 # CG (preconditioning), length of diis history 0 dispersion correction functional (0, 1, 2, 3) InputPsiId, output_wf, output_grid length of the tail, # tail CG iterations davidson treatment, # virtual orbitals, # plotted orbitals T disable the symmetry detection

49 Description of the file input.dft hx, hy, hz crmult, frmult, coarse and fine radius 1 ixc: parameter (LDA=1,PBE=11) ncharge: charge of the system, Electric field 1 0 nspin=1 non-spin polarization, total magnetic moment 1.E-04 gnrm_cv: convergence criterion gradient itermax, nrepmax 7 6 # CG (preconditioning), length of diis history 0 dispersion correction functional (0, 1, 2, 3) InputPsiId, output_wf, output_grid length of the tail, # tail CG iterations davidson treatment, # virtual orbitals, # plotted orbitals T disable the symmetry detection

50 Description of the file input.dft hx, hy, hz crmult, frmult, coarse and fine radius 1 ixc: parameter (LDA=1,PBE=11) ncharge: charge of the system, Electric field 1 0 nspin=1 non-spin polarization, total magnetic moment 1.E-04 gnrm_cv: convergence criterion gradient itermax, nrepmax 7 6 # CG (preconditioning), length of diis history 0 dispersion correction functional (0, 1, 2, 3) InputPsiId, output_wf, output_grid length of the tail, # tail CG iterations davidson treatment, # virtual orbitals, # plotted orbitals T disable the symmetry detection

51 Description of the file input.dft hx, hy, hz crmult, frmult, coarse and fine radius 1 ixc: parameter (LDA=1,PBE=11) ncharge: charge of the system, Electric field 1 0 nspin=1 non-spin polarization, total magnetic moment 1.E-04 gnrm_cv: convergence criterion gradient itermax, nrepmax 7 6 # CG (preconditioning), length of diis history 0 dispersion correction functional (0, 1, 2, 3) InputPsiId, output_wf, output_grid length of the tail, # tail CG iterations davidson treatment, # virtual orbitals, # plotted orbitals T disable the symmetry detection

52 Description of the file input.dft hx, hy, hz crmult, frmult, coarse and fine radius 1 ixc: parameter (LDA=1,PBE=11) ncharge: charge of the system, Electric field 1 0 nspin=1 non-spin polarization, total magnetic moment 1.E-04 gnrm_cv: convergence criterion gradient itermax, nrepmax 7 6 # CG (preconditioning), length of diis history 0 dispersion correction functional (0, 1, 2, 3) InputPsiId, output_wf, output_grid length of the tail, # tail CG iterations davidson treatment, # virtual orbitals, # plotted orbitals T disable the symmetry detection

53 Description of the file input.dft hx, hy, hz crmult, frmult, coarse and fine radius 1 ixc: parameter (LDA=1,PBE=11) ncharge: charge of the system, Electric field 1 0 nspin=1 non-spin polarization, total magnetic moment 1.E-04 gnrm_cv: convergence criterion gradient itermax, nrepmax 7 6 # CG (preconditioning), length of diis history 0 dispersion correction functional (0, 1, 2, 3) InputPsiId, output_wf, output_grid length of the tail, # tail CG iterations davidson treatment, # virtual orbitals, # plotted orbitals T disable the symmetry detection

54 Description of the file input.dft hx, hy, hz crmult, frmult, coarse and fine radius 1 ixc: parameter (LDA=1,PBE=11) ncharge: charge of the system, Electric field 1 0 nspin=1 non-spin polarization, total magnetic moment 1.E-04 gnrm_cv: convergence criterion gradient itermax, nrepmax 7 6 # CG (preconditioning), length of diis history 0 dispersion correction functional (0, 1, 2, 3) InputPsiId, output_wf, output_grid length of the tail, # tail CG iterations davidson treatment, # virtual orbitals, # plotted orbitals T disable the symmetry detection

55 Description of the file input.dft hx, hy, hz crmult, frmult, coarse and fine radius 1 ixc: parameter (LDA=1,PBE=11) ncharge: charge of the system, Electric field 1 0 nspin=1 non-spin polarization, total magnetic moment 1.E-04 gnrm_cv: convergence criterion gradient itermax, nrepmax 7 6 # CG (preconditioning), length of diis history 0 dispersion correction functional (0, 1, 2, 3) InputPsiId, output_wf, output_grid length of the tail, # tail CG iterations davidson treatment, # virtual orbitals, # plotted orbitals T disable the symmetry detection

56 Description of the file input.dft hx, hy, hz crmult, frmult, coarse and fine radius 1 ixc: parameter (LDA=1,PBE=11) ncharge: charge of the system, Electric field 1 0 nspin=1 non-spin polarization, total magnetic moment 1.E-04 gnrm_cv: convergence criterion gradient itermax, nrepmax 7 6 # CG (preconditioning), length of diis history 0 dispersion correction functional (0, 1, 2, 3) InputPsiId, output_wf, output_grid length of the tail, # tail CG iterations davidson treatment, # virtual orbitals, # plotted orbitals T disable the symmetry detection

57 Description of the file input.dft hx, hy, hz crmult, frmult, coarse and fine radius 1 ixc: parameter (LDA=1,PBE=11) ncharge: charge of the system, Electric field 1 0 nspin=1 non-spin polarization, total magnetic moment 1.E-04 gnrm_cv: convergence criterion gradient itermax, nrepmax 7 6 # CG (preconditioning), length of diis history 0 dispersion correction functional (0, 1, 2, 3) InputPsiId, output_wf, output_grid length of the tail, # tail CG iterations davidson treatment, # virtual orbitals, # plotted orbitals T disable the symmetry detection

58 Description of the file input.dft hx, hy, hz crmult, frmult, coarse and fine radius 1 ixc: parameter (LDA=1,PBE=11) ncharge: charge of the system, Electric field 1 0 nspin=1 non-spin polarization, total magnetic moment 1.E-04 gnrm_cv: convergence criterion gradient itermax, nrepmax 7 6 # CG (preconditioning), length of diis history 0 dispersion correction functional (0, 1, 2, 3) InputPsiId, output_wf, output_grid length of the tail, # tail CG iterations davidson treatment, # virtual orbitals, # plotted orbitals T disable the symmetry detection

59 s: wavelets self-consistent (SCF) Harris-Foulkes functional relativistic pseudopotential norm-conserving all electrons PAW Hybrid functionals beyond LDA GW non-relativistic [ N 3 scaling O(N) methods periodic non-periodic non-collinear collinear +V(r) non-spin polarized spin polarized LDA,GGA ] +µ xc (r) ψ k i = ε k i ψ k i atomic orbitals plane waves augmented real space LDA+U Gaussians Slater numerical finite difference Wavelet

60 Optimal for isolated systems 10 1 Test case: cinchonidine molecule (44 atoms) Plane waves Absolute energy precision (Ha) h = 0.4bohr E c = 40 Ha h = 0.3bohr Wavelets E c = 90 Ha Number of degrees of freedom E c = 125 Ha Allows a systematic approach for molecules Considerably faster than Plane Waves codes. 10 (5) times faster than ABINIT (CPMD) Charged systems can be treated explicitly with the same time

61 Systematic basis set Two parameters for tuning the basis The grid spacing hgrid The extension of the low resolution points crmult

62 Comparison with a code based on gaussian basis set The Daubechies expansion of a gaussian is known Results from gaussian codes can be imported in Gaussian code (CP2K) test, H 2 O and DZVP basis E [Ha] Input Guess Energy CP2K Imported energy converged energy 6-31G Conv energy Wavelet Conv energy with finite-size corrections crmult (basis extension)

63 using interpolating scaling function The Hartree potential is calculated in the interpolating scaling function basis. Poisson solver with interpolating scaling functions From the density ρ( j) on an uniform grid it calculates: ρ(r ) V H (r) = r r dr Very fast and accurate, optimal parallelization Can be used independently from the DFT code Integrated quantities (energies) are easy to extract Non-adaptive, needs data uncompression Explicitly free boundary conditions No need to subtract supercell interactions (charged systems).

64 for surface boundary conditions A Poisson solver for surface problems Based on a mixed reciprocal-direct space representation V px,p y (z) = 4π dz G( p ;z z )ρ px,p y (z), Can be applied both in real or reciprocal space codes No supercell or screening functions More precise than other existing approaches Example of the plane capacitor: Periodic Hockney Our approach V V V y y y x x x

65 input.geopt: or molecular dynamics DIIS 500 ncount_cluster: max steps during geometry relaxation 1.d0 1.d-3 frac_fluct: geometry optimization stops if force norm less than frac_fluct of noise maxval: maximum value of atomic forces 0.0d0 randdis: random amplitude for atoms 4.d0 5 betax: steepest descent step size, diis history Different algorithms: SDCG: Steepest Descent Conjugate Gradient DIIS: Direct Inversion in the iterative subspace VSSD: Variable Size Steepest Descent LBFGS: Limited BFGD AB6MD: Use molecular dynamics routines from ABINIT 6

66 input.geopt: or molecular dynamics DIIS 500 ncount_cluster: max steps during geometry relaxation 1.d0 1.d-3 frac_fluct: geometry optimization stops if force norm less than frac_fluct of noise maxval: maximum value of atomic forces 0.0d0 randdis: random amplitude for atoms 4.d0 5 betax: steepest descent step size, diis history Different algorithms: SDCG: Steepest Descent Conjugate Gradient DIIS: Direct Inversion in the iterative subspace VSSD: Variable Size Steepest Descent LBFGS: Limited BFGD AB6MD: Use molecular dynamics routines from ABINIT 6

67 input.geopt: or molecular dynamics DIIS 500 ncount_cluster: max steps during geometry relaxation 1.d0 1.d-3 frac_fluct: geometry optimization stops if force norm less than frac_fluct of noise maxval: maximum value of atomic forces 0.0d0 randdis: random amplitude for atoms 4.d0 5 betax: steepest descent step size, diis history Different algorithms: SDCG: Steepest Descent Conjugate Gradient DIIS: Direct Inversion in the iterative subspace VSSD: Variable Size Steepest Descent LBFGS: Limited BFGD AB6MD: Use molecular dynamics routines from ABINIT 6

68 input.geopt: or molecular dynamics DIIS 500 ncount_cluster: max steps during geometry relaxation 1.d0 1.d-3 frac_fluct: geometry optimization stops if force norm less than frac_fluct of noise maxval: maximum value of atomic forces 0.0d0 randdis: random amplitude for atoms 4.d0 5 betax: steepest descent step size, diis history Different algorithms: SDCG: Steepest Descent Conjugate Gradient DIIS: Direct Inversion in the iterative subspace VSSD: Variable Size Steepest Descent LBFGS: Limited BFGD AB6MD: Use molecular dynamics routines from ABINIT 6

69 input.geopt: or molecular dynamics DIIS 500 ncount_cluster: max steps during geometry relaxation 1.d0 1.d-3 frac_fluct: geometry optimization stops if force norm less than frac_fluct of noise maxval: maximum value of atomic forces 0.0d0 randdis: random amplitude for atoms 4.d0 5 betax: steepest descent step size, diis history Different algorithms: SDCG: Steepest Descent Conjugate Gradient DIIS: Direct Inversion in the iterative subspace VSSD: Variable Size Steepest Descent LBFGS: Limited BFGD AB6MD: Use molecular dynamics routines from ABINIT 6

70 input.geopt: or molecular dynamics DIIS 500 ncount_cluster: max steps during geometry relaxation 1.d0 1.d-3 frac_fluct: geometry optimization stops if force norm less than frac_fluct of noise maxval: maximum value of atomic forces 0.0d0 randdis: random amplitude for atoms 4.d0 5 betax: steepest descent step size, diis history Different algorithms: SDCG: Steepest Descent Conjugate Gradient DIIS: Direct Inversion in the iterative subspace VSSD: Variable Size Steepest Descent LBFGS: Limited BFGD AB6MD: Use molecular dynamics routines from ABINIT 6

71 High-Performance Computing: Massively parallel Two kinds of parallelisation By orbitals (Hamiltonian application, preconditioning) By components (overlap matrices, orthogonalisation) A few (but big) packets of data More demanding in bandwidth than in latency No need of fast network Optimal speedup (eff. 85%), also for big systems Cubic scaling code For systems bigger than 500 atomes (1500 orbitals) : orthonormalisation operation is predominant (N 3 )

72 Orbital distribution scheme Used for the application of the hamiltonian The hamiltonian (convolutions) is applied separately onto each wavefunction ψ 1 MPI 0 ψ 2 ψ 3 ψ 4 ψ 5 MPI 1 MPI 2

73 Coefficient distribution scheme Used for scalar product & orthonormalisation BLAS routines (level 3) are called, then result is reduced MPI 0 MPI 1 MPI 2 ψ 1 ψ 2 ψ 3 ψ 4 ψ 5 Communications are performed via MPI_ALLTOALLV

74 Performance in parallel () Distribution per orbitals (wavelets) Nothing to specify in input files

75 GPU-ported operations in (double precision) 20 Convolutions (OpenCL rewritten) GPU speedups between and 20 can be 6 obtained for different 4 sections 2 70 GPU speedup (Double prec.) locden locham precond Wavefunction size (MB) GPU speedup (Double prec.) DGEMM DSYRK Number of orbitals Linear algebra (CUBLAS library) The interfacing with CUBLAS is immediate, with considerable speedups

76 code on Hybrid architectures code can run on hybrid CPU/GPU supercomputers In multi-gpu environments, double precision calculations No Hot-spot operations Different code sections can be ported on GPU up to 20x speedup for some operations, 7x for the full parallel code Percent Seconds (log. scale) CPU code Hybrid code (rel.) Number of cores 1 Time (sec) Comm Other PureCPU precond locham locden LinAlg

77 input.perf file OpenCL acceleration: Add this line in in input.perf accel OCLGPU See the web page /Wiki/index.php?title=Input.perf for more information about all keywords (verbosity, input guess,...) associated to the performance of the code. The log file gives the list of possible keywords.

78 : Future developments self-consistent (SCF) Harris-Foulkes functional relativistic pseudopotential norm-conserving all electrons PAW Hybrid functionals beyond LDA GW non-relativistic [ N 3 scaling O(N) methods periodic non-periodic non-collinear collinear +V(r) non-spin polarized spin polarized LDA,GGA ] +µ xc (r) ψ k i = ε k i ψ k i atomic orbitals plane waves augmented real space LDA+U Gaussians Slater numerical finite difference Wavelet

79 Linear scaling version 16 min: 1000 atoms with the cubic scaling, 8000 atoms with the linear scaling one

80 Functionality: Mixed between physics and chemistry approach Order N (real space basis set as wavelets) Multi-scale approach (more than 1000 atoms feasible for a better parametrization) Numerical experience: One-day simulation Better exploration of atomic configurations Molecular Dynamics (4s per step for 32 water molecules)

81 Hands on See /Wiki/index.php?title=Category:Tutorials First runs with Basis-set convergence Acceleration example on different platforms: Kohn-Sham DFT Operation with GPU acceleration

Graphic Processing Units: a possible answer to High Performance Computing?

Graphic Processing Units: a possible answer to High Performance Computing? 4th ABINIT Developer Workshop RESIDENCE L ESCANDILLE AUTRANS HPC & Graphic Processing Units: a possible answer to High Performance Computing? Luigi Genovese ESRF - Grenoble 26 March 2009 http://inac.cea.fr/l_sim/

More information

2IP WP8 Materiel Science Activity report March 6, 2013

2IP WP8 Materiel Science Activity report March 6, 2013 2IP WP8 Materiel Science Activity report March 6, 2013 Codes involved in this task ABINIT (M.Torrent) Quantum ESPRESSO (F. Affinito) YAMBO + Octopus (F. Nogueira) SIESTA (G. Huhs) EXCITING/ELK (A. Kozhevnikov)

More information

A Beginner s Guide to Materials Studio and DFT Calculations with Castep. P. Hasnip ([email protected])

A Beginner s Guide to Materials Studio and DFT Calculations with Castep. P. Hasnip (pjh503@york.ac.uk) A Beginner s Guide to Materials Studio and DFT Calculations with Castep P. Hasnip ([email protected]) September 18, 2007 Materials Studio collects all of its files into Projects. We ll start by creating

More information

Visualization and post-processing tools for Siesta

Visualization and post-processing tools for Siesta Visualization and post-processing tools for Siesta Andrei Postnikov Université Paul Verlaine, Metz CECAM tutorial, Lyon, June 22, 2007 Outline 1 What to visualize? 2 XCrySDen by Tone Kokalj 3 Sies2xsf

More information

NMR and IR spectra & vibrational analysis

NMR and IR spectra & vibrational analysis Lab 5: NMR and IR spectra & vibrational analysis A brief theoretical background 1 Some of the available chemical quantum methods for calculating NMR chemical shifts are based on the Hartree-Fock self-consistent

More information

Quantum Effects and Dynamics in Hydrogen-Bonded Systems: A First-Principles Approach to Spectroscopic Experiments

Quantum Effects and Dynamics in Hydrogen-Bonded Systems: A First-Principles Approach to Spectroscopic Experiments Quantum Effects and Dynamics in Hydrogen-Bonded Systems: A First-Principles Approach to Spectroscopic Experiments Dissertation zur Erlangung des Grades Doktor der Naturwissenschaften am Fachbereich Physik

More information

First Step : Identifying Pseudopotentials

First Step : Identifying Pseudopotentials Quantum Espresso Quick Start Introduction Quantum Espresso (http://www.quantum-espresso.org/) is a sophisticated collection of tools for electronic structure calculations via DFT using plane waves and

More information

Analysis, post-processing and visualization tools

Analysis, post-processing and visualization tools Analysis, post-processing and visualization tools Javier Junquera Andrei Postnikov Summary of different tools for post-processing and visualization DENCHAR PLRHO DOS, PDOS DOS and PDOS total Fe, d MACROAVE

More information

The Quantum ESPRESSO Software Distribution

The Quantum ESPRESSO Software Distribution The Quantum ESPRESSO Software Distribution The DEMOCRITOS center of Italian INFM is dedicated to atomistic simulations of materials, with a strong emphasis on the development of high-quality scientific

More information

An Introduction to Hartree-Fock Molecular Orbital Theory

An Introduction to Hartree-Fock Molecular Orbital Theory An Introduction to Hartree-Fock Molecular Orbital Theory C. David Sherrill School of Chemistry and Biochemistry Georgia Institute of Technology June 2000 1 Introduction Hartree-Fock theory is fundamental

More information

DFT in practice : Part I. Ersen Mete

DFT in practice : Part I. Ersen Mete plane wave expansion & the Brillouin zone integration Department of Physics Balıkesir University, Balıkesir - Turkey August 13, 2009 - NanoDFT 09, İzmir Institute of Technology, İzmir Outline Plane wave

More information

1. Follow the links from the home page to download socorro.tgz (a gzipped tar file).

1. Follow the links from the home page to download socorro.tgz (a gzipped tar file). Obtaining the Socorro code 1 Follow the links from the home page to download socorrotgz (a gzipped tar file) 2 On a unix file system, execute "gunzip socorrotgz" to convert socorrotgz to socorrotar and

More information

ME6130 An introduction to CFD 1-1

ME6130 An introduction to CFD 1-1 ME6130 An introduction to CFD 1-1 What is CFD? Computational fluid dynamics (CFD) is the science of predicting fluid flow, heat and mass transfer, chemical reactions, and related phenomena by solving numerically

More information

AN INTRODUCTION TO NUMERICAL METHODS AND ANALYSIS

AN INTRODUCTION TO NUMERICAL METHODS AND ANALYSIS AN INTRODUCTION TO NUMERICAL METHODS AND ANALYSIS Revised Edition James Epperson Mathematical Reviews BICENTENNIAL 0, 1 8 0 7 z ewiley wu 2007 r71 BICENTENNIAL WILEY-INTERSCIENCE A John Wiley & Sons, Inc.,

More information

Mixed Precision Iterative Refinement Methods Energy Efficiency on Hybrid Hardware Platforms

Mixed Precision Iterative Refinement Methods Energy Efficiency on Hybrid Hardware Platforms Mixed Precision Iterative Refinement Methods Energy Efficiency on Hybrid Hardware Platforms Björn Rocker Hamburg, June 17th 2010 Engineering Mathematics and Computing Lab (EMCL) KIT University of the State

More information

DARPA, NSF-NGS/ITR,ACR,CPA,

DARPA, NSF-NGS/ITR,ACR,CPA, Spiral Automating Library Development Markus Püschel and the Spiral team (only part shown) With: Srinivas Chellappa Frédéric de Mesmay Franz Franchetti Daniel McFarlin Yevgen Voronenko Electrical and Computer

More information

OpenFOAM Optimization Tools

OpenFOAM Optimization Tools OpenFOAM Optimization Tools Henrik Rusche and Aleks Jemcov [email protected] and [email protected] Wikki, Germany and United Kingdom OpenFOAM Optimization Tools p. 1 Agenda Objective Review optimisation

More information

A Simultaneous Solution for General Linear Equations on a Ring or Hierarchical Cluster

A Simultaneous Solution for General Linear Equations on a Ring or Hierarchical Cluster Acta Technica Jaurinensis Vol. 3. No. 1. 010 A Simultaneous Solution for General Linear Equations on a Ring or Hierarchical Cluster G. Molnárka, N. Varjasi Széchenyi István University Győr, Hungary, H-906

More information

TESLA Report 2003-03

TESLA Report 2003-03 TESLA Report 23-3 A multigrid based 3D space-charge routine in the tracking code GPT Gisela Pöplau, Ursula van Rienen, Marieke de Loos and Bas van der Geer Institute of General Electrical Engineering,

More information

High-fidelity electromagnetic modeling of large multi-scale naval structures

High-fidelity electromagnetic modeling of large multi-scale naval structures High-fidelity electromagnetic modeling of large multi-scale naval structures F. Vipiana, M. A. Francavilla, S. Arianos, and G. Vecchi (LACE), and Politecnico di Torino 1 Outline ISMB and Antenna/EMC Lab

More information

Electric Dipole moments as probes of physics beyond the Standard Model

Electric Dipole moments as probes of physics beyond the Standard Model Electric Dipole moments as probes of physics beyond the Standard Model K. V. P. Latha Non-Accelerator Particle Physics Group Indian Institute of Astrophysics Plan of the Talk Parity (P) and Time-reversal

More information

Performance of Dynamic Load Balancing Algorithms for Unstructured Mesh Calculations

Performance of Dynamic Load Balancing Algorithms for Unstructured Mesh Calculations Performance of Dynamic Load Balancing Algorithms for Unstructured Mesh Calculations Roy D. Williams, 1990 Presented by Chris Eldred Outline Summary Finite Element Solver Load Balancing Results Types Conclusions

More information

Q-Chem: Quantum Chemistry Software for Large Systems. Peter M.W. Gill. Q-Chem, Inc. Four Triangle Drive Export, PA 15632, USA. and

Q-Chem: Quantum Chemistry Software for Large Systems. Peter M.W. Gill. Q-Chem, Inc. Four Triangle Drive Export, PA 15632, USA. and Q-Chem: Quantum Chemistry Software for Large Systems Peter M.W. Gill Q-Chem, Inc. Four Triangle Drive Export, PA 15632, USA and Department of Chemistry University of Cambridge Cambridge, CB2 1EW, England

More information

Introduction to CFD Analysis

Introduction to CFD Analysis Introduction to CFD Analysis Introductory FLUENT Training 2006 ANSYS, Inc. All rights reserved. 2006 ANSYS, Inc. All rights reserved. 2-2 What is CFD? Computational fluid dynamics (CFD) is the science

More information

YALES2 porting on the Xeon- Phi Early results

YALES2 porting on the Xeon- Phi Early results YALES2 porting on the Xeon- Phi Early results Othman Bouizi Ghislain Lartigue Innovation and Pathfinding Architecture Group in Europe, Exascale Lab. Paris CRIHAN - Demi-journée calcul intensif, 16 juin

More information

Dynamic Load Balancing in CP2K

Dynamic Load Balancing in CP2K Dynamic Load Balancing in CP2K Pradeep Shivadasan August 19, 2014 MSc in High Performance Computing The University of Edinburgh Year of Presentation: 2014 Abstract CP2K is a widely used atomistic simulation

More information

Accelerating Simulation & Analysis with Hybrid GPU Parallelization and Cloud Computing

Accelerating Simulation & Analysis with Hybrid GPU Parallelization and Cloud Computing Accelerating Simulation & Analysis with Hybrid GPU Parallelization and Cloud Computing Innovation Intelligence Devin Jensen August 2012 Altair Knows HPC Altair is the only company that: makes HPC tools

More information

Wavelet analysis. Wavelet requirements. Example signals. Stationary signal 2 Hz + 10 Hz + 20Hz. Zero mean, oscillatory (wave) Fast decay (let)

Wavelet analysis. Wavelet requirements. Example signals. Stationary signal 2 Hz + 10 Hz + 20Hz. Zero mean, oscillatory (wave) Fast decay (let) Wavelet analysis In the case of Fourier series, the orthonormal basis is generated by integral dilation of a single function e jx Every 2π-periodic square-integrable function is generated by a superposition

More information

Variational approach to restore point-like and curve-like singularities in imaging

Variational approach to restore point-like and curve-like singularities in imaging Variational approach to restore point-like and curve-like singularities in imaging Daniele Graziani joint work with Gilles Aubert and Laure Blanc-Féraud Roma 12/06/2012 Daniele Graziani (Roma) 12/06/2012

More information

Molecular Dynamics Simulations

Molecular Dynamics Simulations Molecular Dynamics Simulations Yaoquan Tu Division of Theoretical Chemistry and Biology, Royal Institute of Technology (KTH) 2011-06 1 Outline I. Introduction II. Molecular Mechanics Force Field III. Molecular

More information

FRIEDRICH-ALEXANDER-UNIVERSITÄT ERLANGEN-NÜRNBERG

FRIEDRICH-ALEXANDER-UNIVERSITÄT ERLANGEN-NÜRNBERG FRIEDRICH-ALEXANDER-UNIVERSITÄT ERLANGEN-NÜRNBERG INSTITUT FÜR INFORMATIK (MATHEMATISCHE MASCHINEN UND DATENVERARBEITUNG) Lehrstuhl für Informatik 10 (Systemsimulation) Massively Parallel Multilevel Finite

More information

Inner Product Spaces

Inner Product Spaces Math 571 Inner Product Spaces 1. Preliminaries An inner product space is a vector space V along with a function, called an inner product which associates each pair of vectors u, v with a scalar u, v, and

More information

HPC Deployment of OpenFOAM in an Industrial Setting

HPC Deployment of OpenFOAM in an Industrial Setting HPC Deployment of OpenFOAM in an Industrial Setting Hrvoje Jasak [email protected] Wikki Ltd, United Kingdom PRACE Seminar: Industrial Usage of HPC Stockholm, Sweden, 28-29 March 2011 HPC Deployment

More information

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical

More information

6 J - vector electric current density (A/m2 )

6 J - vector electric current density (A/m2 ) Determination of Antenna Radiation Fields Using Potential Functions Sources of Antenna Radiation Fields 6 J - vector electric current density (A/m2 ) M - vector magnetic current density (V/m 2 ) Some problems

More information

CHEM6085: Density Functional Theory Lecture 2. Hamiltonian operators for molecules

CHEM6085: Density Functional Theory Lecture 2. Hamiltonian operators for molecules CHEM6085: Density Functional Theory Lecture 2 Hamiltonian operators for molecules C.-K. Skylaris 1 The (time-independent) Schrödinger equation is an eigenvalue equation operator for property A eigenfunction

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

More information

ACCELERATING COMMERCIAL LINEAR DYNAMIC AND NONLINEAR IMPLICIT FEA SOFTWARE THROUGH HIGH- PERFORMANCE COMPUTING

ACCELERATING COMMERCIAL LINEAR DYNAMIC AND NONLINEAR IMPLICIT FEA SOFTWARE THROUGH HIGH- PERFORMANCE COMPUTING ACCELERATING COMMERCIAL LINEAR DYNAMIC AND Vladimir Belsky Director of Solver Development* Luis Crivelli Director of Solver Development* Matt Dunbar Chief Architect* Mikhail Belyi Development Group Manager*

More information

A Learning Based Method for Super-Resolution of Low Resolution Images

A Learning Based Method for Super-Resolution of Low Resolution Images A Learning Based Method for Super-Resolution of Low Resolution Images Emre Ugur June 1, 2004 [email protected] Abstract The main objective of this project is the study of a learning based method

More information

Linear Threshold Units

Linear Threshold Units Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear

More information

Medical Image Processing on the GPU. Past, Present and Future. Anders Eklund, PhD Virginia Tech Carilion Research Institute [email protected].

Medical Image Processing on the GPU. Past, Present and Future. Anders Eklund, PhD Virginia Tech Carilion Research Institute andek@vtc.vt. Medical Image Processing on the GPU Past, Present and Future Anders Eklund, PhD Virginia Tech Carilion Research Institute [email protected] Outline Motivation why do we need GPUs? Past - how was GPU programming

More information

Robust Algorithms for Current Deposition and Dynamic Load-balancing in a GPU Particle-in-Cell Code

Robust Algorithms for Current Deposition and Dynamic Load-balancing in a GPU Particle-in-Cell Code Robust Algorithms for Current Deposition and Dynamic Load-balancing in a GPU Particle-in-Cell Code F. Rossi, S. Sinigardi, P. Londrillo & G. Turchetti University of Bologna & INFN GPU2014, Rome, Sept 17th

More information

Mesh Discretization Error and Criteria for Accuracy of Finite Element Solutions

Mesh Discretization Error and Criteria for Accuracy of Finite Element Solutions Mesh Discretization Error and Criteria for Accuracy of Finite Element Solutions Chandresh Shah Cummins, Inc. Abstract Any finite element analysis performed by an engineer is subject to several types of

More information

Hardware-Aware Analysis and. Presentation Date: Sep 15 th 2009 Chrissie C. Cui

Hardware-Aware Analysis and. Presentation Date: Sep 15 th 2009 Chrissie C. Cui Hardware-Aware Analysis and Optimization of Stable Fluids Presentation Date: Sep 15 th 2009 Chrissie C. Cui Outline Introduction Highlights Flop and Bandwidth Analysis Mehrstellen Schemes Advection Caching

More information

Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

More information

Physics 9e/Cutnell. correlated to the. College Board AP Physics 1 Course Objectives

Physics 9e/Cutnell. correlated to the. College Board AP Physics 1 Course Objectives Physics 9e/Cutnell correlated to the College Board AP Physics 1 Course Objectives Big Idea 1: Objects and systems have properties such as mass and charge. Systems may have internal structure. Enduring

More information

= N 2 = 3π2 n = k 3 F. The kinetic energy of the uniform system is given by: 4πk 2 dk h2 k 2 2m. (2π) 3 0

= N 2 = 3π2 n = k 3 F. The kinetic energy of the uniform system is given by: 4πk 2 dk h2 k 2 2m. (2π) 3 0 Chapter 1 Thomas-Fermi Theory The Thomas-Fermi theory provides a functional form for the kinetic energy of a non-interacting electron gas in some known external potential V (r) (usually due to impurities)

More information

Customer Training Material. Lecture 2. Introduction to. Methodology ANSYS FLUENT. ANSYS, Inc. Proprietary 2010 ANSYS, Inc. All rights reserved.

Customer Training Material. Lecture 2. Introduction to. Methodology ANSYS FLUENT. ANSYS, Inc. Proprietary 2010 ANSYS, Inc. All rights reserved. Lecture 2 Introduction to CFD Methodology Introduction to ANSYS FLUENT L2-1 What is CFD? Computational Fluid Dynamics (CFD) is the science of predicting fluid flow, heat and mass transfer, chemical reactions,

More information

The RAMSES code and related techniques 3. Gravity solvers

The RAMSES code and related techniques 3. Gravity solvers The RAMSES code and related techniques 3. Gravity solvers Outline - The Vlassov-Poisson equation - The Particle-Mesh method in a nut shell - Symplectic Integrators - Mass assignment and force interpolation

More information

Discussion on the paper Hypotheses testing by convex optimization by A. Goldenschluger, A. Juditsky and A. Nemirovski.

Discussion on the paper Hypotheses testing by convex optimization by A. Goldenschluger, A. Juditsky and A. Nemirovski. Discussion on the paper Hypotheses testing by convex optimization by A. Goldenschluger, A. Juditsky and A. Nemirovski. Fabienne Comte, Celine Duval, Valentine Genon-Catalot To cite this version: Fabienne

More information

Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi

Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi ICPP 6 th International Workshop on Parallel Programming Models and Systems Software for High-End Computing October 1, 2013 Lyon, France

More information

Express Introductory Training in ANSYS Fluent Lecture 1 Introduction to the CFD Methodology

Express Introductory Training in ANSYS Fluent Lecture 1 Introduction to the CFD Methodology Express Introductory Training in ANSYS Fluent Lecture 1 Introduction to the CFD Methodology Dimitrios Sofialidis Technical Manager, SimTec Ltd. Mechanical Engineer, PhD PRACE Autumn School 2013 - Industry

More information

Poisson Equation Solver Parallelisation for Particle-in-Cell Model

Poisson Equation Solver Parallelisation for Particle-in-Cell Model WDS'14 Proceedings of Contributed Papers Physics, 233 237, 214. ISBN 978-8-7378-276-4 MATFYZPRESS Poisson Equation Solver Parallelisation for Particle-in-Cell Model A. Podolník, 1,2 M. Komm, 1 R. Dejarnac,

More information

5MD00. Assignment Introduction. Luc Waeijen 16-12-2014

5MD00. Assignment Introduction. Luc Waeijen 16-12-2014 5MD00 Assignment Introduction Luc Waeijen 16-12-2014 Contents EEG application Background on EEG Early Seizure Detection Algorithm Implementation Details Super Scalar Assignment Description Tooling (simple

More information

COMPUTATION OF THREE-DIMENSIONAL ELECTRIC FIELD PROBLEMS BY A BOUNDARY INTEGRAL METHOD AND ITS APPLICATION TO INSULATION DESIGN

COMPUTATION OF THREE-DIMENSIONAL ELECTRIC FIELD PROBLEMS BY A BOUNDARY INTEGRAL METHOD AND ITS APPLICATION TO INSULATION DESIGN PERIODICA POLYTECHNICA SER. EL. ENG. VOL. 38, NO. ~, PP. 381-393 (199~) COMPUTATION OF THREE-DIMENSIONAL ELECTRIC FIELD PROBLEMS BY A BOUNDARY INTEGRAL METHOD AND ITS APPLICATION TO INSULATION DESIGN H.

More information

GenOpt (R) Generic Optimization Program User Manual Version 3.0.0β1

GenOpt (R) Generic Optimization Program User Manual Version 3.0.0β1 (R) User Manual Environmental Energy Technologies Division Berkeley, CA 94720 http://simulationresearch.lbl.gov Michael Wetter [email protected] February 20, 2009 Notice: This work was supported by the U.S.

More information

Course Outline for the Masters Programme in Computational Engineering

Course Outline for the Masters Programme in Computational Engineering Course Outline for the Masters Programme in Computational Engineering Compulsory Courses CP-501 Mathematical Methods for Computational 3 Engineering-I CP-502 Mathematical Methods for Computational 3 Engineering-II

More information

Summary. Load and Open GaussView to start example

Summary. Load and Open GaussView to start example Summary This document describes in great detail how to navigate the Linux Red Hat Terminal to bring up GaussView, use GaussView to create a simple atomic or molecular simulation input file, and then use

More information

A Case Study - Scaling Legacy Code on Next Generation Platforms

A Case Study - Scaling Legacy Code on Next Generation Platforms Available online at www.sciencedirect.com ScienceDirect Procedia Engineering 00 (2015) 000 000 www.elsevier.com/locate/procedia 24th International Meshing Roundtable (IMR24) A Case Study - Scaling Legacy

More information

What is molecular dynamics (MD) simulation and how does it work?

What is molecular dynamics (MD) simulation and how does it work? What is molecular dynamics (MD) simulation and how does it work? A lecture for CHM425/525 Fall 2011 The underlying physical laws necessary for the mathematical theory of a large part of physics and the

More information

Computer Graphics. Geometric Modeling. Page 1. Copyright Gotsman, Elber, Barequet, Karni, Sheffer Computer Science - Technion. An Example.

Computer Graphics. Geometric Modeling. Page 1. Copyright Gotsman, Elber, Barequet, Karni, Sheffer Computer Science - Technion. An Example. An Example 2 3 4 Outline Objective: Develop methods and algorithms to mathematically model shape of real world objects Categories: Wire-Frame Representation Object is represented as as a set of points

More information

Introduction to CFD Analysis

Introduction to CFD Analysis Introduction to CFD Analysis 2-1 What is CFD? Computational Fluid Dynamics (CFD) is the science of predicting fluid flow, heat and mass transfer, chemical reactions, and related phenomena by solving numerically

More information

W009 Application of VTI Waveform Inversion with Regularization and Preconditioning to Real 3D Data

W009 Application of VTI Waveform Inversion with Regularization and Preconditioning to Real 3D Data W009 Application of VTI Waveform Inversion with Regularization and Preconditioning to Real 3D Data C. Wang* (ION Geophysical), D. Yingst (ION Geophysical), R. Bloor (ION Geophysical) & J. Leveille (ION

More information

Multi Scale Design of nanomaterials with simulations on hybrid architectures: (the muscade project)

Multi Scale Design of nanomaterials with simulations on hybrid architectures: (the muscade project) Multi Scale Design of nanomaterials with simulations on hybrid architectures: (the muscade project) Pascal Pochet (CEA-INAC) INRIA 2009-2012 Chair of excellence for Normand Mousseau Alain Pasturel Noel

More information

Best practices for efficient HPC performance with large models

Best practices for efficient HPC performance with large models Best practices for efficient HPC performance with large models Dr. Hößl Bernhard, CADFEM (Austria) GmbH PRACE Autumn School 2013 - Industry Oriented HPC Simulations, September 21-27, University of Ljubljana,

More information

Free Electron Fermi Gas (Kittel Ch. 6)

Free Electron Fermi Gas (Kittel Ch. 6) Free Electron Fermi Gas (Kittel Ch. 6) Role of Electrons in Solids Electrons are responsible for binding of crystals -- they are the glue that hold the nuclei together Types of binding (see next slide)

More information

Data Visualization for Atomistic/Molecular Simulations. Douglas E. Spearot University of Arkansas

Data Visualization for Atomistic/Molecular Simulations. Douglas E. Spearot University of Arkansas Data Visualization for Atomistic/Molecular Simulations Douglas E. Spearot University of Arkansas What is Atomistic Simulation? Molecular dynamics (MD) involves the explicit simulation of atomic scale particles

More information

Molecular vibrations: Iterative solution with energy selected bases

Molecular vibrations: Iterative solution with energy selected bases JOURNAL OF CHEMICAL PHYSICS VOLUME 118, NUMBER 8 22 FEBRUARY 2003 ARTICLES Molecular vibrations: Iterative solution with energy selected bases Hee-Seung Lee a) and John C. Light Department of Chemistry

More information

DO PHYSICS ONLINE FROM QUANTA TO QUARKS QUANTUM (WAVE) MECHANICS

DO PHYSICS ONLINE FROM QUANTA TO QUARKS QUANTUM (WAVE) MECHANICS DO PHYSICS ONLINE FROM QUANTA TO QUARKS QUANTUM (WAVE) MECHANICS Quantum Mechanics or wave mechanics is the best mathematical theory used today to describe and predict the behaviour of particles and waves.

More information

Java Modules for Time Series Analysis

Java Modules for Time Series Analysis Java Modules for Time Series Analysis Agenda Clustering Non-normal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series

More information

MPI Hands-On List of the exercises

MPI Hands-On List of the exercises MPI Hands-On List of the exercises 1 MPI Hands-On Exercise 1: MPI Environment.... 2 2 MPI Hands-On Exercise 2: Ping-pong...3 3 MPI Hands-On Exercise 3: Collective communications and reductions... 5 4 MPI

More information

David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems

David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems About me David Rioja Redondo Telecommunication Engineer - Universidad de Alcalá >2 years building and managing clusters UPM

More information

LS-DYNA Scalability on Cray Supercomputers. Tin-Ting Zhu, Cray Inc. Jason Wang, Livermore Software Technology Corp.

LS-DYNA Scalability on Cray Supercomputers. Tin-Ting Zhu, Cray Inc. Jason Wang, Livermore Software Technology Corp. LS-DYNA Scalability on Cray Supercomputers Tin-Ting Zhu, Cray Inc. Jason Wang, Livermore Software Technology Corp. WP-LS-DYNA-12213 www.cray.com Table of Contents Abstract... 3 Introduction... 3 Scalability

More information

(Toward) Radiative transfer on AMR with GPUs. Dominique Aubert Université de Strasbourg Austin, TX, 14.12.12

(Toward) Radiative transfer on AMR with GPUs. Dominique Aubert Université de Strasbourg Austin, TX, 14.12.12 (Toward) Radiative transfer on AMR with GPUs Dominique Aubert Université de Strasbourg Austin, TX, 14.12.12 A few words about GPUs Cache and control replaced by calculation units Large number of Multiprocessors

More information

Mulliken suggested to split the shared density 50:50. Then the electrons associated with the atom k are given by:

Mulliken suggested to split the shared density 50:50. Then the electrons associated with the atom k are given by: 1 17. Population Analysis Population analysis is the study of charge distribution within molecules. The intention is to accurately model partial charge magnitude and location within a molecule. This can

More information

HPC enabling of OpenFOAM R for CFD applications

HPC enabling of OpenFOAM R for CFD applications HPC enabling of OpenFOAM R for CFD applications Towards the exascale: OpenFOAM perspective Ivan Spisso 25-27 March 2015, Casalecchio di Reno, BOLOGNA. SuperComputing Applications and Innovation Department,

More information

CONVERGE Features, Capabilities and Applications

CONVERGE Features, Capabilities and Applications CONVERGE Features, Capabilities and Applications CONVERGE CONVERGE The industry leading CFD code for complex geometries with moving boundaries. Start using CONVERGE and never make a CFD mesh again. CONVERGE

More information

How to run SIESTA Víctor M. García Suárez Gollum School. MOLESCO network 20-May-2015 Partially based on a talk by Javier Junquera

How to run SIESTA Víctor M. García Suárez Gollum School. MOLESCO network 20-May-2015 Partially based on a talk by Javier Junquera How to run SIESTA Víctor M. García Suárez 20-May-2015 Partially based on a talk by Javier Junquera Outline - Developers - General features - FDF. Input variables - How to run the code - Output files -

More information

Time Domain and Frequency Domain Techniques For Multi Shaker Time Waveform Replication

Time Domain and Frequency Domain Techniques For Multi Shaker Time Waveform Replication Time Domain and Frequency Domain Techniques For Multi Shaker Time Waveform Replication Thomas Reilly Data Physics Corporation 1741 Technology Drive, Suite 260 San Jose, CA 95110 (408) 216-8440 This paper

More information

MASTER OF SCIENCE IN PHYSICS MASTER OF SCIENCES IN PHYSICS (MS PHYS) (LIST OF COURSES BY SEMESTER, THESIS OPTION)

MASTER OF SCIENCE IN PHYSICS MASTER OF SCIENCES IN PHYSICS (MS PHYS) (LIST OF COURSES BY SEMESTER, THESIS OPTION) MASTER OF SCIENCE IN PHYSICS Admission Requirements 1. Possession of a BS degree from a reputable institution or, for non-physics majors, a GPA of 2.5 or better in at least 15 units in the following advanced

More information

Design and Optimization of OpenFOAM-based CFD Applications for Hybrid and Heterogeneous HPC Platforms

Design and Optimization of OpenFOAM-based CFD Applications for Hybrid and Heterogeneous HPC Platforms Design and Optimization of OpenFOAM-based CFD Applications for Hybrid and Heterogeneous HPC Platforms Amani AlOnazi, David E. Keyes, Alexey Lastovetsky, Vladimir Rychkov Extreme Computing Research Center,

More information

Virtual Landmarks for the Internet

Virtual Landmarks for the Internet Virtual Landmarks for the Internet Liying Tang Mark Crovella Boston University Computer Science Internet Distance Matters! Useful for configuring Content delivery networks Peer to peer applications Multiuser

More information

Multi-Block Gridding Technique for FLOW-3D Flow Science, Inc. July 2004

Multi-Block Gridding Technique for FLOW-3D Flow Science, Inc. July 2004 FSI-02-TN59-R2 Multi-Block Gridding Technique for FLOW-3D Flow Science, Inc. July 2004 1. Introduction A major new extension of the capabilities of FLOW-3D -- the multi-block grid model -- has been incorporated

More information

Visualization Plugin for ParaView

Visualization Plugin for ParaView Alexey I. Baranov Visualization Plugin for ParaView version 1.3 Springer Contents 1 Visualization with ParaView..................................... 1 1.1 ParaView plugin installation.................................

More information

Nonlinear Iterative Partial Least Squares Method

Nonlinear Iterative Partial Least Squares Method Numerical Methods for Determining Principal Component Analysis Abstract Factors Béchu, S., Richard-Plouet, M., Fernandez, V., Walton, J., and Fairley, N. (2016) Developments in numerical treatments for

More information

Yousef Saad University of Minnesota Computer Science and Engineering. CRM Montreal - April 30, 2008

Yousef Saad University of Minnesota Computer Science and Engineering. CRM Montreal - April 30, 2008 A tutorial on: Iterative methods for Sparse Matrix Problems Yousef Saad University of Minnesota Computer Science and Engineering CRM Montreal - April 30, 2008 Outline Part 1 Sparse matrices and sparsity

More information

Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes

Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes Eric Petit, Loïc Thebault, Quang V. Dinh May 2014 EXA2CT Consortium 2 WPs Organization Proto-Applications

More information

Basis Sets in Quantum Chemistry C. David Sherrill School of Chemistry and Biochemistry Georgia Institute of Technology

Basis Sets in Quantum Chemistry C. David Sherrill School of Chemistry and Biochemistry Georgia Institute of Technology Basis Sets in Quantum Chemistry C. David Sherrill School of Chemistry and Biochemistry Georgia Institute of Technology Basis Sets Generically, a basis set is a collection of vectors which spans (defines)

More information

Product Synthesis. CATIA - Product Engineering Optimizer 2 (PEO) CATIA V5R18

Product Synthesis. CATIA - Product Engineering Optimizer 2 (PEO) CATIA V5R18 Product Synthesis CATIA - Product Engineering Optimizer 2 (PEO) CATIA V5R18 Product Synthesis CATIA - Product Engineering Optimizer Accelerates design alternatives exploration and optimization according

More information

Sampling the Brillouin-zone:

Sampling the Brillouin-zone: Sampling the Brillouin-zone: Andreas EICHLER Institut für Materialphysik and Center for Computational Materials Science Universität Wien, Sensengasse 8, A-090 Wien, Austria b-initio ienna ackage imulation

More information

P164 Tomographic Velocity Model Building Using Iterative Eigendecomposition

P164 Tomographic Velocity Model Building Using Iterative Eigendecomposition P164 Tomographic Velocity Model Building Using Iterative Eigendecomposition K. Osypov* (WesternGeco), D. Nichols (WesternGeco), M. Woodward (WesternGeco) & C.E. Yarman (WesternGeco) SUMMARY Tomographic

More information

How High a Degree is High Enough for High Order Finite Elements?

How High a Degree is High Enough for High Order Finite Elements? This space is reserved for the Procedia header, do not use it How High a Degree is High Enough for High Order Finite Elements? William F. National Institute of Standards and Technology, Gaithersburg, Maryland,

More information

! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. !-approximation algorithm.

! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. !-approximation algorithm. Approximation Algorithms Chapter Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of

More information

Intelligent Heuristic Construction with Active Learning

Intelligent Heuristic Construction with Active Learning Intelligent Heuristic Construction with Active Learning William F. Ogilvie, Pavlos Petoumenos, Zheng Wang, Hugh Leather E H U N I V E R S I T Y T O H F G R E D I N B U Space is BIG! Hubble Ultra-Deep Field

More information

P013 INTRODUCING A NEW GENERATION OF RESERVOIR SIMULATION SOFTWARE

P013 INTRODUCING A NEW GENERATION OF RESERVOIR SIMULATION SOFTWARE 1 P013 INTRODUCING A NEW GENERATION OF RESERVOIR SIMULATION SOFTWARE JEAN-MARC GRATIEN, JEAN-FRANÇOIS MAGRAS, PHILIPPE QUANDALLE, OLIVIER RICOIS 1&4, av. Bois-Préau. 92852 Rueil Malmaison Cedex. France

More information