The HipGISAXS Software Suite: Recovering Nanostructures at Scale Abhinav Sarje Computational Research Division 06.24.15 OLCF User Meeting Oak Ridge, TN
Nanoparticle Systems. Materials (natural or artificial) made up of nanoparticles, with sizes ranging from 1 nanometer to thousands of nanometers. Wide variety of applications in optical, electronic and biomedical fields, e.g.: inorganic nanomaterials in optoelectronics; organic-material-based nano-devices such as Organic Photovoltaics (OPVs) and OLEDs; chemical catalysts, drug design and discovery, biological process dynamics. Importance of structural information: nanomaterials exhibit shape- and size-dependent properties, unlike bulk materials, which have constant physical properties regardless of size. Nanoparticle characterization is necessary to understand and control material synthesis and applications.
X-ray Scattering at Synchrotrons. Graphic courtesy of A. Meyer, www.gisaxs.de
X-Ray Scattering: Examples
X-Ray Scattering: Complex Examples. Gratings and OPVs. [Figure panels: real sample, model, scattering pattern.]
Computational Problems in Structure Recovery: Inverse Modeling. Flowchart: Start; initial guess; forward simulation; compute error w.r.t. experimental data; tune model parameters and repeat the forward simulation; End.
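The simulate, compare, tune cycle can be sketched in a few lines. This is a minimal illustration only, not the HipGISAXS implementation: the forward model (`forward_simulation`, a toy two-parameter exponential), the error measure (sum of squared differences), and the tuning rule (a simple compass search) are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import math

def forward_simulation(params, q_values):
    """Hypothetical toy forward model: intensity = scale * exp(-decay * q)."""
    scale, decay = params
    return [scale * math.exp(-decay * q) for q in q_values]

def compute_error(simulated, experimental):
    """Sum of squared differences between simulated and measured intensities."""
    return sum((s - e) ** 2 for s, e in zip(simulated, experimental))

def inverse_model(experimental, q_values, initial_guess,
                  step=0.5, tol=1e-6, max_rounds=100000):
    """Tune parameters by compass search until the step size drops below tol."""
    params = list(initial_guess)
    best = compute_error(forward_simulation(params, q_values), experimental)
    rounds = 0
    while step > tol and rounds < max_rounds:
        rounds += 1
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta  # tune one model parameter
                err = compute_error(forward_simulation(trial, q_values), experimental)
                if err < best:  # keep the trial only if the fit improves
                    params, best, improved = trial, err, True
        if not improved:
            step *= 0.5  # no neighbor was better: refine the search
    return params, best

# Synthetic "experimental" data generated from known parameters.
q_values = [0.1 * i for i in range(41)]
true_params = [2.0, 0.5]
experimental = forward_simulation(true_params, q_values)

fitted, final_error = inverse_model(experimental, q_values, initial_guess=[1.0, 1.0])
print(fitted, final_error)
```

Every pass through the inner loop costs one forward simulation, which is why the speed of the forward kernel dominates the cost of the fit.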
Need for High-Performance Computing. Data generation and analysis gap: current state-of-the-art light beam detectors have high measurement rates, while analyzing the data with earlier software takes days. This mismatch leads to extremely inefficient utilization of facilities. Example: 100 MB of raw data per second, up to 12 TB per week.
Need for High-Performance Computing. High computational and accuracy requirements: errors are proportional to the resolution of the various computational discretizations, and higher resolutions require more computational power. Example: O(10^7) to O(10^15) kernel computations for one simulation; O(10^2) experiments per material sample; O(10) to O(10^3) forward simulations for inverse modeling per scattering pattern.
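To get a feel for these figures, the orders of magnitude can be multiplied out. The short calculation below assumes that each experiment yields one scattering pattern to fit and that the three factors simply multiply; both are assumptions made for illustration, not statements from the slide.

```python
# Orders of magnitude quoted on the slide.
kernel_ops_per_simulation = (10**7, 10**15)
forward_sims_per_pattern = (10, 10**3)
experiments_per_sample = 10**2

# Assumption: one scattering pattern per experiment, and the factors multiply.
low = kernel_ops_per_simulation[0] * forward_sims_per_pattern[0] * experiments_per_sample
high = kernel_ops_per_simulation[1] * forward_sims_per_pattern[1] * experiments_per_sample

print(f"kernel computations per sample: {low:.0e} to {high:.0e}")
```

Under these assumptions a single material sample needs on the order of 10^10 to 10^20 kernel computations.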
Need for High-Performance Computing. Science gap: beam-line scientists lack access to high-performance algorithms and codes. In-house developed codes are limited in compute capabilities and performance. They are also extremely slow: users wait days or weeks to obtain basic results.
Forward Simulations: Computing Scattered Light Intensities. Given (1) a sample structure model and (2) an experimental configuration, simulate scattering patterns. Based on Distorted Wave Born Approximation (DWBA) theory.
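As a rough illustration of what a forward simulation computes, the sketch below evaluates the intensity scattered by a single homogeneous box-shaped particle. It uses the plain Born approximation, where the intensity is the squared magnitude of the shape's form factor; the DWBA additionally accounts for reflection and refraction at the substrate, which is omitted here. The function names, particle size, and q ranges are hypothetical and are not taken from HipGISAXS.

```python
import math

def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the x -> 0 limit handled."""
    return 1.0 if abs(x) < 1e-12 else math.sin(x) / x

def box_form_factor(qx, qy, qz, lx, ly, lz):
    """Form factor of a homogeneous box centered at the origin.

    F(q) = V * sinc(qx*lx/2) * sinc(qy*ly/2) * sinc(qz*lz/2), with V = lx*ly*lz.
    """
    return (lx * ly * lz) * sinc(qx * lx / 2) * sinc(qy * ly / 2) * sinc(qz * lz / 2)

def scattering_pattern(qy_values, qz_values, size, qx=0.0):
    """Born-approximation intensity |F(q)|^2 on a (qz, qy) grid."""
    lx, ly, lz = size
    return [[box_form_factor(qx, qy, qz, lx, ly, lz) ** 2 for qy in qy_values]
            for qz in qz_values]

size = (20.0, 20.0, 10.0)                       # box edge lengths (nm), illustrative
qy_values = [0.02 * i for i in range(-25, 26)]  # nm^-1
qz_values = [0.02 * i for i in range(0, 51)]    # nm^-1
pattern = scattering_pattern(qy_values, qz_values, size)
```

Each grid point is an independent kernel evaluation, which is the kind of work that parallelizes well across many cores or GPUs.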
Inverse Modeling. Forward simulation kernel: computing the scattered light intensities, e.g. FFT computations (SAXS), or complex form factor and structure factor computations (GISAXS). Various inverse modeling algorithms: Reverse Monte Carlo simulations for SAXS; sophisticated optimization algorithms for GISAXS. Gradient-based: LMVM (Limited-Memory Variable-Metric). Derivative-free, trust-region-based: POUNDerS. Stochastic: Particle Swarm Optimization (PSO).
Reverse Monte Carlo Simulations
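The general idea of Reverse Monte Carlo can be sketched on a toy problem: particles are moved at random, and a move is kept if it brings the simulated intensity closer to the target, or otherwise with a probability that decays with the increase in error. The example below is an illustration under stated assumptions, not the HipGISAXS code: it uses a 1-D occupancy model, a naive discrete Fourier transform in place of an FFT, and hypothetical names and settings (`reverse_monte_carlo`, `temperature`, `steps`).

```python
import cmath
import math
import random

def intensity(model):
    """Scattering intensity |DFT|^2 of a 1-D occupancy list (naive O(N^2) DFT)."""
    n = len(model)
    occupied = [x for x, filled in enumerate(model) if filled]
    return [abs(sum(cmath.exp(-2j * math.pi * k * x / n) for x in occupied)) ** 2
            for k in range(n)]

def chi_squared(model, target):
    """Sum of squared differences between the model intensity and the target."""
    return sum((a - b) ** 2 for a, b in zip(intensity(model), target))

def reverse_monte_carlo(target, initial_model, steps=2000, temperature=1.0, seed=0):
    """Move one particle at a time; return the best model seen and its error."""
    rng = random.Random(seed)
    model = list(initial_model)
    current = chi_squared(model, target)
    best_model, best = list(model), current
    for _ in range(steps):
        filled = [i for i, v in enumerate(model) if v]
        empty = [i for i, v in enumerate(model) if not v]
        src, dst = rng.choice(filled), rng.choice(empty)
        model[src], model[dst] = 0, 1            # trial move
        trial = chi_squared(model, target)
        delta = trial - current
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = trial                      # accept the move
            if current < best:
                best_model, best = list(model), current
        else:
            model[src], model[dst] = 1, 0        # reject: undo the move
    return best_model, best

actual_model = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
initial_model = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
target = intensity(actual_model)
initial_error = chi_squared(initial_model, target)
recovered_model, recovered_error = reverse_monte_carlo(target, initial_model)
```

Because the intensity discards phase information, a recovered model can differ from the actual one by a shift or a mirror image and still reproduce the target pattern.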
Reverse Monte Carlo Simulations: Validation. [Figure panels: actual models, initial models, recovered models.]
Reverse Monte Carlo Simulations: Strong and Weak Scaling. [Figure: time [ms] versus number of nodes (GPUs), 1 to 1024, on Titan, and time [ms] versus number of cores, 24 to 12288, on Hopper. Curves are labeled by t and s: fixed t (128 or 1024) and t scaled with p (t = p/2, p, or 2p), each with s = 512 or s = 2048.]
Limited-Memory Variable-Metric and POUNDerS. Both methods come from the optimization package TAO. LMVM is a gradient-based method. POUNDerS is a derivative-free, trust-region-based method.
Limited-Memory Variable-Metric and POUNDerS: Two Parameter Case. Sample: a single cylindrical nanoparticle. [Figure panels: X-ray scattering pattern; objective function map over x1 and x2 (each 16 to 24); LMVM convergence map; POUNDerS convergence map.]
Limited-Memory Variable-Metric and POUNDerS: Six Parameter Case. Sample: pyramidal nanoparticles forming a lattice. [Figure panels: X-ray scattering pattern; objective function map over x1 (36 to 44) and x3 (46 to 54); POUNDerS convergence map.] LMVM does not converge.
Particle Swarm Optimization. Stochastic method. Multiple agents (the particle swarm) search for optimal points in the parameter space. Agent velocities are influenced by the history of traveled paths.
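A minimal sketch of the swarm update follows, using the textbook PSO velocity rule: each agent's velocity blends its previous velocity, a pull toward the best point that agent has visited, and a pull toward the best point found by the whole swarm. The coefficient values, the quadratic test objective, and the function name `particle_swarm` are illustrative assumptions, not the settings used in HipGISAXS.

```python
import random

def particle_swarm(objective, bounds, num_agents=20, generations=200,
                   inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Minimize objective over box bounds; return (best position, best value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Agents start at random positions: no initial guess is supplied.
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(num_agents)]
    velocities = [[0.0] * dim for _ in range(num_agents)]
    personal_best = [list(p) for p in positions]
    personal_best_value = [objective(p) for p in positions]
    g = min(range(num_agents), key=personal_best_value.__getitem__)
    global_best, global_best_value = list(personal_best[g]), personal_best_value[g]

    for _ in range(generations):
        for a in range(num_agents):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                velocities[a][d] = (
                    inertia * velocities[a][d]
                    + cognitive * r1 * (personal_best[a][d] - positions[a][d])
                    + social * r2 * (global_best[d] - positions[a][d])
                )
                lo, hi = bounds[d]
                # Move the agent and clamp it to the search box.
                positions[a][d] = min(max(positions[a][d] + velocities[a][d], lo), hi)
            value = objective(positions[a])  # one forward simulation per agent
            if value < personal_best_value[a]:
                personal_best[a], personal_best_value[a] = list(positions[a]), value
                if value < global_best_value:
                    global_best, global_best_value = list(positions[a]), value
    return global_best, global_best_value

def toy_objective(x):
    """Hypothetical stand-in for the fitting error, minimized at (20, 18)."""
    return (x[0] - 20.0) ** 2 + (x[1] - 18.0) ** 2

best_position, best_value = particle_swarm(toy_objective, [(16.0, 24.0), (16.0, 24.0)])
print(best_position, best_value)
```

The cost is roughly the number of agents times the number of generations in objective evaluations, and the evaluations within one generation are independent, so they can run in parallel.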
Particle Swarm Optimization: Fitting X-Ray Scattering Data. [Figure: two maps of function evaluations versus number of agents (2 to 20) and parameter space volume, one for fitting 2 parameters and one for fitting 6 parameters.]
Particle Swarm Optimization: Performance. [Figure panels: convergence w.r.t. search space volume, showing iterations for convergence versus search space volume ([x1] x [x2]) for 15 and 20 agents; strong scaling on Titan, showing time [s] versus number of GPUs (4 to 256) for sample 1 and sample 2.]
Particle Swarm Optimization: Agents vs. Generations. [Figure: relative error versus generation for swarm sizes from 5 to 100 agents.]
An Ongoing Work. We saw that: GPUs have brought data analysis time down from days and weeks to minutes and seconds. Derivative-based methods converge only for simple cases. Trust-region-based methods are very sensitive to the initial guess. PSO does not require an initial guess. PSO is robust, nearly always converging, but expensive. Better optimization algorithms are needed. Near future: opening the gates to much more sophisticated analyses. We are applying machine learning for feature and structural classification to generate initial models to fit.
Our Current Core Team Alexander Hexemer, Advanced Light Source, Berkeley Lab. Dinesh Kumar, Advanced Light Source, Berkeley Lab. Xiaoye S. Li, Computational Research Division, Berkeley Lab. Abhinav Sarje, Computational Research Division, Berkeley Lab. Singanallur Venkatakrishnan, Advanced Light Source, Berkeley Lab. And we are open to collaborations...
Further Information. 1. HipGISAXS: a high-performance computing code for simulating grazing-incidence X-ray scattering data. Journal of Applied Crystallography, vol. 46, pp. 1781-1795, 2013. 2. Massively parallel X-ray scattering simulations. Supercomputing (SC), 2012. 3. Tuning HipGISAXS on multi- and many-core supercomputers. Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, 2013. 4. High performance inverse modeling with Reverse Monte Carlo simulations. International Conference on Parallel Processing, 2014.
Acknowledgments. Supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. Also supported by a DOE Early Career Research grant awarded to Alexander Hexemer. Used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. Used resources of the National Energy Research Scientific Computing Center, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.
Thank you! Web: http://portal.nersc.gov/project/als/hipgisaxs http://www.github.com/hipgisaxs