Free Volume Toolkit: Tools for the visualization and analysis of free volume in materials



Similar documents
Data Visualization for Atomistic/Molecular Simulations. Douglas E. Spearot University of Arkansas

Engineering Problem Solving and Excel. EGN 1006 Introduction to Engineering

Graphics Cards and Graphics Processing Units. Ben Johnstone Russ Martin November 15, 2011

Java Modules for Time Series Analysis

An Introduction to Point Pattern Analysis using CrimeStat

Optimizing Application Performance with CUDA Profiling Tools

Avizo Inspect New software for industrial inspection and materials R&D

NVIDIA CUDA GETTING STARTED GUIDE FOR MAC OS X

Visualization Plugin for ParaView

The Scientific Data Mining Process

NVIDIA CUDA GETTING STARTED GUIDE FOR MAC OS X

Multiple Optimization Using the JMP Statistical Software Kodak Research Conference May 9, 2005

Excel 2010: Create your first spreadsheet

NVIDIA CUDA GETTING STARTED GUIDE FOR MICROSOFT WINDOWS

Visualizing molecular simulations

Create a folder on your network drive called DEM. This is where data for the first part of this lesson will be stored.

Using simulation to calculate the NPV of a project

2-1 Position, Displacement, and Distance

Real-time Visual Tracker by Stream Processing

Introduction to GP-GPUs. Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1

Quantitative Inventory Uncertainty

Overview of the Adobe Flash Professional CS6 workspace

NVIDIA Tools For Profiling And Monitoring. David Goodwin

Learn CUDA in an Afternoon: Hands-on Practical Exercises

Data representation and analysis in Excel

Data Exploration Data Visualization

Petrel TIPS&TRICKS from SCM

Tutorial 2 Online and offline Ship Visualization tool Table of Contents

Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining

VisIt Visualization Tool

Overview Motivation and applications Challenges. Dynamic Volume Computation and Visualization on the GPU. GPU feature requests Conclusions

Why are we teaching you VisIt?

Performance Optimization of I-4 I 4 Gasoline Engine with Variable Valve Timing Using WAVE/iSIGHT

Technical Documentation Version 6.7. Slots

CE 504 Computational Hydrology Computational Environments and Tools Fritz R. Fiedler

Workflow Requirements (Dec. 12, 2006)

Part-Based Recognition

Manual for simulation of EB processing. Software ModeRTL

An Overview of the Nimsoft Log Monitoring Solution

Hierarchical Clustering Analysis

CHM 579 Lab 1: Basic Monte Carlo Algorithm

Arena 9.0 Basic Modules based on Arena Online Help

Below is a very brief tutorial on the basic capabilities of Excel. Refer to the Excel help files for more information.

Time & Attendance Supervisor Basics for ADP Workforce Now. Automatic Data Processing, LLC ES Canada

Visualization of Adaptive Mesh Refinement Data with VisIt

HP ProLiant SL270s Gen8 Server. Evaluation Report

Visualization of Output Data from Particle Transport Codes

The Visualization Pipeline

MONTE-CARLO SIMULATION OF AMERICAN OPTIONS WITH GPUS. Julien Demouth, NVIDIA

Data Visualization. Prepared by Francisco Olivera, Ph.D., Srikanth Koka Department of Civil Engineering Texas A&M University February 2004

Access Queries (Office 2003)

GPU Tools Sandra Wienke

Signal to Noise Instrumental Excel Assignment

Financial Econometrics MFE MATLAB Introduction. Kevin Sheppard University of Oxford

Next Generation GPU Architecture Code-named Fermi

Euler: A System for Numerical Optimization of Programs

Parallel Large-Scale Visualization

SECTION 2-1: OVERVIEW SECTION 2-2: FREQUENCY DISTRIBUTIONS

Statistics Output Guide

Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining

NVIDIA GeForce GTX 580 GPU Datasheet

Towards Fast SQL Query Processing in DB2 BLU Using GPUs A Technology Demonstration. Sina Meraji sinamera@ca.ibm.com

Imaris Quick Start Tutorials

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 111. Texas Essential Knowledge and Skills for Mathematics. Subchapter B. Middle School

Heat Transfer and Energy

CUDAMat: a CUDA-based matrix class for Python

TPCalc : a throughput calculator for computer architecture studies

Intellect Platform - The Workflow Engine Basic HelpDesk Troubleticket System - A102

Visualization with OpenDX

Grade 6 Mathematics Assessment. Eligible Texas Essential Knowledge and Skills

Iris Sample Data Set. Basic Visualization Techniques: Charts, Graphs and Maps. Summary Statistics. Frequency and Mode

COM CO P 5318 Da t Da a t Explora Explor t a ion and Analysis y Chapte Chapt r e 3

Real-time Data Analytics mit Elasticsearch. Bernhard Pflugfelder inovex GmbH

Using and creating Crosstabs in Crystal Reports Juri Urbainczyk

Kinetic Theory of Gases. Chapter 33. Kinetic Theory of Gases

Digital Image Fundamentals. Selim Aksoy Department of Computer Engineering Bilkent University

Using Lexical Similarity in Handwritten Word Recognition

ArcGIS Pro: Virtualizing in Citrix XenApp and XenDesktop. Emily Apsey Performance Engineer

Jump Start: Aspen Simulation Workbook in Aspen HYSYS V8

Final Software Tools and Services for Traders

SIMS 255 Foundations of Software Design. Complexity and NP-completeness

Using Excel (Microsoft Office 2007 Version) for Graphical Analysis of Data

NV301 Umoja Advanced BI Navigation. Umoja Advanced BI Navigation Version 5 1

ACCELERATING SELECT WHERE AND SELECT JOIN QUERIES ON A GPU

NVIDIA VIDEO ENCODER 5.0

Bazaarvoice Intelligence reference guide

Arena Tutorial 1. Installation STUDENT 2. Overall Features of Arena

Abstract. For notes detailing the changes in each release, see the MySQL for Excel Release Notes. For legal information, see the Legal Notices.

Texture Cache Approximation on GPUs

GPU Renderfarm with Integrated Asset Management & Production System (AMPS)

Introduction to GPU hardware and to CUDA

Other GEANT4 capabilities

COGNOS 8 Business Intelligence

IT Service Level Management 2.1 User s Guide SAS

SPC Response Variable

Transcription:

Free Volume Toolkit: Tools for the visualization and analysis of free volume in materials Frank T Willmore Texas Advanced Computing Center willmore@tacc.utexas.edu presented to: NSF-XSEDE 12 July 17, 2011

Introduction My background and research focus Assumptions about the data The tools: pddx, rattlj, gfg2fvi Unix command-line utilities Sample Workflows Histograms Visual output Extensibility Using with gmake Summary

My background Computational polymer science Statistical mechanics Classical simulation Monte Carlo methods Specialized in characterization of membranes for separation of light gases Focused on virtual metrics for free volume (space between molecules)

Assumptions about data Data is in a standard ASCII format,.xyz (LAMMPS),.car (Materials Studio),.psf, etc. Can be reduced via Unix command-line tools to representation as (x, y, z, σ, ε) Data represents a snapshot or a series of snapshots in time of molecular positions. Data employs a rectangular periodic boundary.

Meet the tools: pddx (parallel cavity sizing tool) Measures void space in simulated material sample rattlj (rattle-in-place volume tool) Measures space accessible to individual atoms gfg2fvi (free volume index tool) Measures interaction energy for a test atom Miscellaneous Unix command line utilities

pddx: implementation of the algorithm: Performs gradient descent to find stationary points of energy landscape Scales particle to match interaction Monte Carlo code widely varying execution time per sample One thread per sample Concurrency controlled using counting semaphore

Cavity example: Thermal re-arrangement of PIOFG (polyimides with ortho-positioned functional groups) Y. Jiang et al. / Polymer 52 (2011) 2244-2254

Simmons glass: 32-bead chains Lennard-Jones 6-12 interaction between non-adjacent beads Stiff harmonic potential between adjacent beads Volume traced out by an individual atom allowed to rattle while all other atoms remain fixed or- volume of space enclosed by isosurface of potential energy. Calculated by recursive search of space connected to original particle center. An average collision energy of 2kT is used to define an isosurface which encloses rattle volume. Branching of algorithm and use of stack make this algorithm better suited for implementation on CPU. rattlj rattle volume

gfg2fvi: Free Volume via Widom insertion F ( N, V, T) = kt ln Z( N, V, T ) Z = e i E i / kt µ = F N T, V = kt ln Z N T, V = kt ln Z( N + 1, V, T) Z( N, V, T) = kt ln B insertion B insertion = e ψ insertion kt FVI ( x, y, z, t) e ψ repulsive kt

gfg2fvi: Insertion parameter as a free volume metric Works with any forcefield model Well-behaved function (bounded between 0..1) Represents solubility Continuity with statistical mechanics Easily computable (particularly with GPUs) Defined over all of space Yields an energetic isosurface Generates voxel data which can be presented for voxel-based shape analysis

Formulation of algorithm

Formulation of algorithm

Formulation of algorithm

e.g. polydimethylsiloxane

gfg2fvi: Why use GPUs for this? Feasibility of calculation Widom first formulated insertion method in 1963. For most systems, calculation was not feasible. 49 years of increases in computing power / ~4 billionfold increase per Moore s Law (1965) With the current formulation, it now takes ~one hour on a single NVIDIA Tesla M2070 (R peak = 1TFLOP single precision) to calculate energies in a system of 1600 atoms at 1 billion (1024 x 1024 x 1024) grid points.

UNIX command-line utilities I wish I could Pipe-oriented One record per line, tab separated Many are simple functions that map 1 to 1, e.g. exp() and sq() Some are transforms (e.g. distribution to histogram) Some operate on sets of molecular data Implemented for scientific productivity, not compute efficiency Many are dumbed-down versions of other operations -usage flag gives a quick description

UNIX command-line utilities Example: # Measure all the unique cavities in configuration 1 and output a spreadsheet $ pddx < configuration_1.gfg ~/usr/bin/uniq awk {print $4} dst2hst > configuration_1.xls # Measure the Widom insertion factor for configuration 1 $ gfg2fvi < configuration_1.gfg awk {print $4} avg # Evaluate an expression for a set of values $ cat set_of_x pow 2 expr_multiply -1 expr_add 1 recip # get some help $ gfg2pov -usage 1 1 x 2

Unix Command line tools: REDUCING OPERATORS: These operate on a list of values and return a single value avg calculates average value sum calculates sum max returns maximum value min returns minimum value std returns standard deviation (provide mean on command line) miss Finds the missing number in a series of values SCALAR OPERATORS: These utils operate on a stream of values (one per line) to generate: loge the natural logs log10 the logs base 10 exp the exponentials sqrt the square roots sq the squares pow the input values raised to the power specified on the command line expr_add the input values plus the quantity specified on the command line expr_multiply the input values multiplied by the quantity specified on the command line stream_multiply like expr_multiply, but accepts multiple quantities on the command line

Unix Command line tools: TRANSFORMS: Function or distribution in, function out dst2hst Transforms a distribution to a histogram wdst2hst Transforms a weighted distribution to a histogram TABULATED FUNCTION OPERATORS: add Adds a list of tabulated functions. In: list of files (two fields, separated by tabs) on the command line Out: A tabulated function the sum of the input functions. normalize In: Tabulated function Out: Normalized function sew Accepts a list of tabulated functions, sews them together columnwise to generate a single output

Unix Command line tools: MISCELLANEOUS: uniq In: List of cavities in.cav format Out: List of unique cavities in.cav format a2b In: 2 files of.cfg format, same number of lines Out: a list of distances from pt A (1st file) to pt B (2nd file) dwf For estimating Debye-Waller factor... generates a list of squares of distances from first to second smooth In: A tabulated function Out: Smoothed version of function cram In: A.gfg format list of atoms, plus box dimensions (at command line) Out: A.gfg format list of atoms, wrapped for the periodic boundary condition. povheader Writes a povray header, according to params specified on command line stack_tiffs Takes a stack of 2D tiff images and converts them to a single 3D tiff replicate_gfg Takes a 3D configuration and copies it into the 26 surrounding periodic boxes. ftw_cuda Checks to see if a CUDA-capable device is present truncate Capture a subset of gfg data by specifying two corners of a bounding box.

Unix Command line tools: FORMATS: Extension format description cfg (tab-separated) configuration file: x, y, z gfg (tab-separated) generalized configuration file: x, y, z, sigma, epsilon gfgc (tab-separated) gfg plus one additional column specifying color (used for drawing the positive matter) fvi (tab-separated) free volume index: x. y, z, index vis (tab-separated) Primitive visualization format, x, y, z, diameter, and a number indicating a color (1 = Red, 2 = Green) cav (tab-separated) list of cavities (x, y, z, diameter) unq (tab-separated) same as cav, but implies the cavities are unique pov povray input file

Unix Command line tools: FORMAT CONVERTERS: Operate on lists, typically of (x, y, z) positions stream2slice slices a data stream into files with the specified number of lines fvi2tiff generates a 3D TIFF from fvi data cfg2gfg adds columns (default values of 1 for sigma and epsilon) gfg2pov generates povray format from gfg gfgc2pov generates povray format from gfgc fvi2pov generates povray format from fvi data cav2pov generates povray format from cavity list vis2pov generates povray format from vis cav2vis generates vis format from cav cfg2vis generates vis format from cfg gfg2fvi A utility for generating free volume index/fvi data from configuration/ gfg; requires GPU and CUDA utils

willmore@login2:ls4:/work/00479/willmore/pdms-wxy$ head car/pdms_1.car!biosym archive 3 PBC=ON Materials Studio Generated CAR File -16252.2603!DATE Thu May 18 15:51:00 2006 PBC 29.27020 29.27020 29.27020 90.00000 90.00000 90.00000 (P1) SI 14.158963246 1.955073516 29.834570945 DIME 1 si4 Si 0.618 HSI 13.722127544 2.508988066 31.143754828 DIME 1 h1 H -0.126 C 13.045509787 0.444739126 29.464018274 DIME 1 c4 C -0.294 HC1 12.338434938 0.358308498 30.316410271 DIME 1 h1 H 0.053 HC2 12.605035137 0.347767539 28.426865238 DIME 1 h1 H 0.053 cat car/pdms_$i.car \ grep DIME \ awk '{print $2"\t"$3"\t"$4"\t"$7} \ sed -e "s/h1o/1.087\t0.008/ \ sed -e "s/h1/2.878\t0.023/" \ sed -e "s/si4c/4.405\t0.198/" \ sed -e "s/si4/4.405\t0.198/" \ sed -e "s/o2z/3.300\t0.080/" \ sed -e "s/c4/3.854\t0.062/" \ ~/home/bin/cram -box 29.2702 29.2702 29.2702 \ > gfg/pdms_$i.gfg Note that gfg format contains no metadata! Data format conversion example

Sample workflow #1: generating histograms

Sample workflow #2: Generating visual output

Extensibility: Generating a cavity spectrum of the cavities in an electrospun tissue scaffold: Image fitted with cavities Resampled image

Extensibility: Using R to find principal component axes of rattle volume F=read.delim(file("rat_0.xyz")) mu=colmeans(f) mu_mat = rep(mu, nrow(f)) mu_mat = matrix(mu_mat,nrow=nrow(f),byrow=t) F=F-mu_mat Statistics Visualization yyt <- function(y) return(y %*% t(y)) # result of apply() will have ncol(f)ˆ2 rows, nrow(f) columns xyyt <- apply(f,1,yyt) rmeans <- rowmeans(xyyt) extension to other data cov_matrix=matrix(rmeans, nrow=3, ncol=3) eigen(cov_matrix) # eigenvectors V1=eigen(cov_matrix)$vectors[,1] V2=eigen(cov_matrix)$vectors[,2] V3=eigen(cov_matrix)$vectors[,3] sqrt_lambda=sqrt(eigen(cov_matrix)$values) # cylinder heads/tails mu+(v1*sqrt_lambda[1]) mu-(v1*sqrt_lambda[1]) mu+(v2*sqrt_lambda[2]) mu-(v2*sqrt_lambda[2]) mu+(v3*sqrt_lambda[3]) mu-(v3*sqrt_lambda[3])

Movies, stills, and TIFFs Tissue scaffold as TIFF PDMS (2004 atoms) as FVI-TIFF PS (1604 atoms) as FVI-TIFF Simmons glass (1600 atoms) at T = 0.500 FVI and CAV flythrough Simmons glass dynamic videos Four kinds of free volume

To summarize: simulated molecular configuration or other spatial data Free volume toolkit is a set of tools for the analysis of spatial data 3D TIFFs Statistics Single frame visualizations Multi-frame dynamic movies

image: image.pov povray -W720 -H480 image.pov 1080p: image.pov povray -W1920 -H1080 -D image.pov image.pov: header.pov C.pov H.pov S.pov cat header.pov C.pov H.pov S.pov > image.pov C.pov: T150m.xfg cat T150m.xfg grep "C" sed -e "s/c/.19\t1.000000/" gfg2pov -no_header \ -color Grey -phong 1 > C.pov H.pov: T150m.xfg cat T150m.xfg grep "H" sed -e "s/h/.07\t1.000000/" gfg2pov -no_header \ -color White -phong 1 > H.pov S.pov: T150m.xfg cat T150m.xfg grep "S" sed -e "s/s/.17\t1.000000/" gfg2pov -no_header \ -color Yellow -phong 1 > S.pov clean: rm C.pov H.pov S.pov image.pov Example makefile for building a scene

Acknowledgements University of Texas at Austin Department of Chemical Engineering Isaac C Sanchez Xiaoyan Wang Ying Jiang Benny D Freeman Texas Advanced Computing Center Carlos Rosales Paul Navratil John Cazes Ben Urick NIST Polymers Division Jack Douglas David Simmons NSF-Teragrid/XSEDE National Research Council

VACUUMMS Void Analysis Code and Unix Utilities for Molecular Modeling and Simulation