Python Data Analysis Tool Kit - Outline


Outline
- Python language advantages
- Basic extension packages for data analysis and visualization
- Goals of the Python data analysis tool kit
- The Data class
- Data class based applications and analysis code interfaces
- efit.py example
- Visualization tools
- Software installation, availability, and documentation

Thrust 1 YearEnd Rev 01/30/07 02:35 pm, DAG Talk

Python Language Advantages
- Easy-to-learn, object-oriented scripting language.
- Simplified C/C++ constructs + automatic memory management = easy to learn and easy to write.
- Useful set of built-in data types: numbers, lists, tuples, strings, dictionaries, and methods on these objects, e.g. regular expressions on strings.
- Class structure to add new data types.
- Objects no longer referenced are cleared from memory.
- Structured syntax (indentation) = easy to read. Supported in emacs and vi.
- Automatic documentation (pydoc): programmer-written inline strings + automatic descriptions can be displayed in the interpreter or written to HTML.
- Object-oriented features (class inheritance) provide a simple and efficient technique for the extension or customization of software.
- Modular structure: import into memory only the features you need.
- Runs in the interpreter or as a script. Automatically semi-compiled (to bytecode).
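Two of the points above, docstrings picked up by pydoc and customization through class inheritance, can be illustrated with a minimal sketch. The class names `Signal` and `SmoothedSignal` are hypothetical and are not part of the tool kit.

```python
import pydoc


class Signal:
    """A named sequence of samples."""

    def __init__(self, name, values):
        self.name = name
        self.values = list(values)

    def mean(self):
        """Return the arithmetic mean of the samples."""
        return sum(self.values) / len(self.values)


class SmoothedSignal(Signal):
    """A Signal whose samples are replaced by a 3-point running mean."""

    def __init__(self, name, values):
        Signal.__init__(self, name, values)
        v = self.values
        # Interior points are averaged with their neighbours; end points are kept.
        interior = [(v[i - 1] + v[i] + v[i + 1]) / 3.0 for i in range(1, len(v) - 1)]
        self.values = [v[0]] + interior + [v[-1]]


s = SmoothedSignal("demo", [0.0, 3.0, 0.0, 3.0, 0.0])
# mean() is inherited unchanged; only the constructor was customized.
smoothed_mean = s.mean()
# The same text that help(SmoothedSignal) would display in the interpreter.
doc_text = pydoc.render_doc(SmoothedSignal)
```

The subclass reuses `mean()` without modification, which is the kind of extension by inheritance the slide refers to.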

Python Language Advantages (continued)
- Open source (GPL-compatible license), free, and available for UNIX, Linux, MacOS, Windows: www.python.org
- Mature, stable, strongly supported and widely used in the open source community, e.g. used extensively in Linux system software.
- Language can be easily extended with fast code written in a compiled language (shared libraries):
  - C API (Python features in C)
  - Automatic code wrappers for C (SWIG) and FORTRAN (f2py)
- Python interpreter can be embedded in other languages.
- Standard modules cover a broad range, e.g. XML parsing, ftp, sockets, threads, regular expressions, process control, etc. Import only the modules you need.
- Many third-party extension modules, e.g. interfaces to MDSplus and relational databases, scientific analysis.

Packages of Basic Numeric Extension Modules
- Numeric: array manipulation and linear algebra
- pymultipack: B-splines, integration, minimization, solution of non-linear equations, ordinary differential equations
- pysignaltools: signal convolution and filters
- pyfftw: forward and inverse FFTs in multiple dimensions with threading
- pyspecialfuncs, stats: special functions
- ScientificPython: interpolation, least squares, vector and tensor analysis, parallel processing
- pyslatec: full SLATEC FORTRAN library (auto-wrapped)

Packages of Basic Data Archive Interface Modules
- pmds (pydatautils package): base interface to MDSplus
  - mdsconnect, mdsopen, mdsvalue, mdsput, mdsdisconnect
- mdsutils (Datautils package): simplified TCL interface, tree and node creator
- Ptdata (pydatautils package): direct (d3lib) interface to D3D ptdata
- psycopg: interface to the open source postgresql relational database server
- msdb (pydatautils package): base interface to Microsoft SQL server
  - Works with either Sybase Open Client or FreeTDS
- msdbtools (datautils package): copy tables, get columns in a table, ...
- Scientific.IO.NetCDF: netcdf file interface
- Scientific.IO.FortranFormat: FORTRAN format IO
- namelist_class (pynamelist package): interface to FORTRAN namelist files (+, - overloaded)

Packages of Graphics and Widget Modules
Graphics
- pyppgplot: Python interface to pgplot + pgxtal
- pplot: simple plot methods on Data class instances
- pyscreens: pgplot-based, object-oriented, multi-window, multi-graph plot builder
- BLT: graphics extension to the TCL/TK widget set
  - pyd3tools (M. Wade): general DIII-D data plotting widget
- pygnuplot: interface to gnuplot graphics
- Gist: LLNL graphics package
Widgets
- TCL/TK through the Tkinter (low-level) and PMW (high-level) interfaces
- pygtk (Gnome desktop). Linux: pygtk2 and pyglade (XML widget builder); Linux, HP-UX: pygtk1. Has a graphics extension (not installed)
- pyqt (KDE desktop): not installed (Linux only); has a graphics extension
- wxpython (layer over various widget sets): not installed

Python Data Analysis Tool Kit: Goals and Approach
- Create a set of routines for data manipulation that are at a high level but not specific to a particular analysis goal, interfaced to a simple scripting language, allowing a researcher to quickly build an analysis tool for a specific purpose.
- Higher-level data processing elements (FFT, ...) combined with medium-level (array processing, ...) and low-level (iteration, ...) elements.
- Easy-to-use interfaces between data archives (MDS+, ...) and data processing elements.
- Easy-to-use interfaces between the IO of standard analysis codes (EFIT, ONETWO, ...) and data processing elements.
- Visualization tools.

Python Data Analysis Tool Kit: Data Class
Instances of class Data are the basic building blocks for analysis applications. The class is defined in modules in the pydatautils package:
- data.py: highest-level module
    >>> from data import *
  - imports all Data class features
  - defines higher-level methods, e.g. signal.fft()
- data_init.py: instantiation functions; subclass and submodule of data.py
    >>> ne = Data( 'tsne_core', 122336 )
  1) looks in a table on the postgresql server to see which MDS+ tree or PTDATA branch tsne_core is in (wildcard characters result in a search and a list of options);
  2) reads the signal into ne.y, the error bars into ne.yerror, and the axes into ne.x;
  3) reads any subnodes (signal and atomic types) into substructures.
- data_base.py: basic arithmetic and algebraic functions on Data class objects; subclass of Data and submodule of data.py
  - Overloads +, -, *, /, **, %, algebraic functions (Sqrt, Log, Tan, ...) and slices ([2,:])
  - Error bars are propagated
  - Time bases are interpolated
  - Math errors (zero divide) are masked
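The behaviour listed for data_base.py (overloaded operators, propagated error bars, interpolated time bases, masked zero divides) can be sketched in plain Python. This is an illustration only: `MiniData` and `interp` are hypothetical names, the errors are assumed independent, and interpolating the error bars linearly is a simplification; the real Data class may do all of this differently.

```python
import math


def interp(x_new, x, y):
    """Linearly interpolate the table (x, y) at x_new; x must be increasing."""
    out = []
    for xn in x_new:
        if xn <= x[0]:          # clamp below the tabulated range
            out.append(y[0])
        elif xn >= x[-1]:       # clamp above the tabulated range
            out.append(y[-1])
        else:
            i = max(k for k in range(len(x)) if x[k] <= xn)
            frac = (xn - x[i]) / (x[i + 1] - x[i])
            out.append(y[i] + frac * (y[i + 1] - y[i]))
    return out


class MiniData:
    """Hypothetical stand-in for Data: samples y +/- yerror on axis x."""

    def __init__(self, x, y, yerror):
        self.x, self.y, self.yerror = list(x), list(y), list(yerror)

    def _on_my_axis(self, other):
        # Bring the other operand onto this instance's time base.
        return (interp(self.x, other.x, other.y),
                interp(self.x, other.x, other.yerror))

    def __add__(self, other):
        oy, oe = self._on_my_axis(other)
        y = [a + b for a, b in zip(self.y, oy)]
        # Independent errors add in quadrature.
        err = [math.hypot(ea, eb) for ea, eb in zip(self.yerror, oe)]
        return MiniData(self.x, y, err)

    def __truediv__(self, other):
        oy, oe = self._on_my_axis(other)
        y, err = [], []
        for a, ea, b, eb in zip(self.y, self.yerror, oy, oe):
            if b == 0:
                # Mask the zero divide instead of raising an exception.
                y.append(None)
                err.append(None)
            else:
                y.append(a / b)
                err.append(math.hypot(ea / b, a * eb / b ** 2))
        return MiniData(self.x, y, err)


a = MiniData([0.0, 1.0, 2.0], [2.0, 4.0, 6.0], [0.3, 0.3, 0.3])
b = MiniData([0.0, 2.0], [2.0, 0.0], [0.4, 0.4])   # coarser time base
total = a + b
ratio = a / b
```

Here `b` has only two points, so it is interpolated onto the three points of `a` before the arithmetic; the last point of `ratio` is masked because `b` is zero there.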

Data Class Methods
Methods on Data class objects:
- .cdfput(): Write instance to a netcdf file
- .conj(): Complex conjugate
- .contour(): Generate contours on a 2D instance
- .der(): First derivative
- .dump(): Write to an ASCII file
- .fft(): Fast Fourier transform
- .fit(): Fit to some standard or user-supplied function. A call method is created based on the fit.
- .imag(): Imaginary part of a complex instance
- .int(): Integrate
- .interp_fun(): Creates an interpolating call method on the instance
- .inv_fft(): Inverse FFT
- .list(): Lists the name and the ranges of values and axes
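The idea that a fit "creates a call method" can be sketched as an object that stores the fitted coefficients and evaluates the fit when called. The class `LineFit` below is hypothetical and handles only a straight-line least-squares fit; it is not the tool kit's .fit() implementation.

```python
class LineFit:
    """Straight-line least-squares fit; calling the object evaluates the fit."""

    def __init__(self, x, y):
        n = len(x)
        mean_x = sum(x) / n
        mean_y = sum(y) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        self.slope = sxy / sxx
        self.intercept = mean_y - self.slope * mean_x

    def __call__(self, x):
        # The "call method": evaluate the fitted line at any x.
        return self.intercept + self.slope * x


fit = LineFit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
value_at_10 = fit(10.0)   # evaluate outside the original sample points
```

Once the fit exists, `fit(x)` can be used anywhere a function of x is expected, which is presumably why the slide describes interpolation, splines, and fits all in terms of call methods.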

Data Class Methods (continued)
Methods on Data class objects:
- .mdsput(): Writes the instance as a signal node to MDS+ and substructures as subnodes, including fit and/or spline attributes. On instantiation, fit and/or spline attributes are read back in and the call method is created.
- .newx(): Use different values for one of the independent variables
- .real(): Real part of a complex instance
- .rebuild(): The operations leading to an instance are reapplied to recreate the instance for a different shot
- .save(): Write to a Python cPickle file
- .skip(): Skip some points
- .smooth(): Smooth based on several filter options or on a user-defined response function
- .spline(): Fixed or auto-knot B-spline of variable order. Creates a call method for interpolation, derivatives, and integration.
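One way to picture .rebuild() is an object that remembers the sequence of operations that produced it and replays them on data from another shot. The sketch below is an assumption about the mechanism, not the actual implementation: `Recorded`, `scale`, `offset`, and the in-memory `ARCHIVE` (standing in for MDSplus) are all hypothetical.

```python
# Hypothetical in-memory archive standing in for MDSplus: (signal, shot) -> samples.
ARCHIVE = {
    ("ip", 98893): [1.0, 2.0, 3.0],
    ("ip", 98891): [10.0, 20.0, 30.0],
}


class Recorded:
    """Holds data plus the list of steps (the 'build') that produced it."""

    def __init__(self, name, shot, build=None):
        self.name, self.shot = name, shot
        self.build = list(build or [])
        self.y = list(ARCHIVE[(name, shot)])
        for method, args in self.build:
            # Re-apply each recorded step to the freshly read data.
            self.y = getattr(self, "_" + method)(self.y, *args)

    def _scale(self, y, factor):
        return [v * factor for v in y]

    def _offset(self, y, delta):
        return [v + delta for v in y]

    def scale(self, factor):
        return Recorded(self.name, self.shot, self.build + [("scale", (factor,))])

    def offset(self, delta):
        return Recorded(self.name, self.shot, self.build + [("offset", (delta,))])

    def rebuild(self, shot):
        """Replay the same build on a different shot."""
        return Recorded(self.name, shot, self.build)


a = Recorded("ip", 98893).scale(2.0).offset(1.0)
b = a.rebuild(98891)   # same operations, different shot
```

Storing the recipe rather than only the result is what makes it possible to repeat an analysis on a new shot with a single call.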

Data Class Methods (continued)
Methods on Data class objects:
- .timing_domains(): Determine regions of continuous point spacing
- .tspline(): Splines with tension. Creates a call method with interpolation, derivative, and integration.
- .xslice(): Slice data based on x values rather than indices
- .copy(): Copy all attributes of the instance to another (possible linkages of mutable attributes)
- .deepcopy(): Full copy with linkages
- .shape(): Shape of the y array
Functions on Data class objects:
- Join(): Join several identically shaped instances into a single instance with one extra dimension
- blend(): Blend two instances by combining along one x axis
- cdfget(): Read from a netcdf file
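Slicing by x value and finding regions of uniform point spacing can both be sketched with the standard library. The function names below echo .xslice() and .timing_domains(), but the signatures, the tolerance parameter, and the convention that neighbouring domains share their boundary point are assumptions made for this illustration.

```python
from bisect import bisect_left, bisect_right


def xslice(x, y, lo, hi):
    """Return the points with lo <= x <= hi; x must be sorted ascending."""
    i, j = bisect_left(x, lo), bisect_right(x, hi)
    return x[i:j], y[i:j]


def timing_domains(x, rel_tol=1e-6):
    """Split a sorted axis into inclusive (start, stop) index ranges of
    uniform spacing. Neighbouring ranges share their boundary point."""
    domains, start = [], 0
    for k in range(2, len(x)):
        step_prev = x[k - 1] - x[k - 2]
        step = x[k] - x[k - 1]
        if abs(step - step_prev) > rel_tol * abs(step_prev):
            domains.append((start, k - 1))
            start = k - 1
    domains.append((start, len(x) - 1))
    return domains


# Time base sampled every 1.0 up to t = 3.0, then every 0.5.
t = [0.0, 1.0, 2.0, 3.0, 3.5, 4.0, 4.5]
v = [0, 1, 2, 3, 4, 5, 6]
t_cut, v_cut = xslice(t, v, 2.0, 3.5)
domains = timing_domains(t)
```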

Data Class Functions
Functions on Data class objects:
- dmdsput(): Write a dictionary of Data instances, strings, and arrays to MDS+; creates nodes and trees as needed
- listbuilds(): List the build attribute for all instances
- listdata(): List names and ranges for all Data instances
- listfiles(): Print a list of files with .data extensions (cPickle files)
- math_exceptions(): Define what to do with a math exception (/0)
- rebuild(): Rebuild several or all instances for a different shot
- restorebuilds(): Read the builds dictionary back from a file
- restoredata(): Restore an instance from a cPickle file
- savebuilds(): Save the builds dictionary to a file
- savedata(): Save all instances to cPickle files
- Arccosh(), Arcsin(), Arcsinh(), Arctan(), Arctanh(), Conjugate(), Cos(), Cosh(), Exp(), Log(), Log10(), Sin(), Sinh(), Sqrt(), Tan(), Tanh()
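The save and restore functions are described as working on cPickle files with a .data extension. A minimal sketch of that pattern follows, using `pickle` (the Python 3 name for the same mechanism). The function names are borrowed from the slide, but their signatures here are assumptions, and plain dictionaries stand in for Data instances.

```python
import os
import pickle
import tempfile


def savedata(instances, directory):
    """Pickle each named object to <directory>/<name>.data; return the paths."""
    paths = []
    for name, obj in instances.items():
        path = os.path.join(directory, name + ".data")
        with open(path, "wb") as fh:
            pickle.dump(obj, fh)
        paths.append(path)
    return paths


def restoredata(name, directory):
    """Read one object back from its .data file."""
    with open(os.path.join(directory, name + ".data"), "rb") as fh:
        return pickle.load(fh)


def listfiles(directory):
    """Return the sorted names of all .data files in the directory."""
    return sorted(f for f in os.listdir(directory) if f.endswith(".data"))


workdir = tempfile.mkdtemp()
original = {"x": [0.0, 1.0], "y": [2.5, 3.5], "yerror": [0.1, 0.1]}
savedata({"ne": original, "ip": {"x": [0.0], "y": [1.0e6]}}, workdir)
restored = restoredata("ne", workdir)
saved_files = listfiles(workdir)
```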

Example of Reading and Plotting MDSplus Data
    >>> from screens import *;from data import *
    >>> i = Data( 'ip', 98893 ).smooth(50.)/1.e6
    t(ms) ip(amperes)
    >>> poh = i * Data( 'vloop', 98893 )
    x0(ms) vloop(v)
    >>> ne = Data( 'densr0', 98893 ) ; ne1 = ne.rebuild( 98891 )
    x0(ms) densr0(/m^3)
    x0(ms) densr0(/m^3)
    >>> psi = Data( 'psirz', 98893 ).xslice( (2,2000) )
    x0(m) x1(m) x2(ms) psirz(vs/rad)
    >>> s = Screen()
    >>> s.ad( i ); s.ad( poh )
    >>> s.ag() ; s.ad( ne ) ; s.ad( ne1 )
    >>> s.aw() ; s.ad( psi, surface=1, color_table='heat' )
    >>> s.ag( aspect = 'auto' ) ; s.ad( psi, n_contours = 20 )
    >>> s
    0: w0 -- 0: g0 -- 0: ( c0, i )--
                      1: ( c1, poh )--
          -- 1: g1 -- 0: ( c2, ne )--
                      1: ( c3, ne1 )--
    1: w1 -- 0: g2 -- 0: ( s4, psi )--
          -- 1: g3 -- 0: ( n7, psi )--
    >>> s.pl()

Data Class Based Higher Level Applications and Interfaces to Other Codes
- Python interfaces to standard analysis codes (EFIT, ONETWO, ...) define functions for interacting with the analysis codes' IO, integrated into the Data class analysis structure, and for running the analysis codes. IO may be extended, e.g. ONETWO data can be written to MDS+.
- Run functions in Python interfaces to analysis codes and stand-alone applications are controlled through tables on the postgresql server and activated through simple command lines, e.g. efit.py -r 122336.
  - The Pgaccess GUI to the postgresql server allows adjusting a large number of settings without putting them on the command line or creating a custom GUI widget for every application (also ODBC). Installed on all platforms.
  - pgadmin3: a better GUI to postgresql, installed on Linux only.
  - Permanent record of settings for a run.
  - Run table entries are described here: http://diii-d.gat.com/~osborne/python_d3d.html
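The pattern of a short command line whose detailed settings live in a database table can be sketched as follows. Several things here are assumptions: `sqlite3` is swapped in for the postgresql server so the example is self-contained, the column names (`mode`, `tmin`, `tmax`) are invented, and `-r` is assumed to take a shot number. Only the table name efit_runs and the command line `efit.py -r 122336` come from the slides.

```python
import argparse
import sqlite3


def make_demo_db():
    """Create an in-memory run table (sqlite3 stands in for postgresql)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE efit_runs "
        "(shot INTEGER PRIMARY KEY, mode TEXT, tmin REAL, tmax REAL)"
    )
    # Hypothetical settings; a GUI such as Pgaccess would normally edit these.
    db.execute("INSERT INTO efit_runs VALUES (122336, 'snap', 1000.0, 4000.0)")
    db.commit()
    return db


def load_settings(db, shot):
    """Fetch the run settings for one shot as a dictionary."""
    row = db.execute(
        "SELECT mode, tmin, tmax FROM efit_runs WHERE shot = ?", (shot,)
    ).fetchone()
    if row is None:
        raise KeyError("no run table entry for shot %d" % shot)
    return dict(zip(("mode", "tmin", "tmax"), row))


def main(argv, db):
    """Parse the short command line, then pull everything else from the table."""
    parser = argparse.ArgumentParser(prog="efit.py")
    parser.add_argument("-r", dest="shot", type=int, required=True,
                        help="shot number whose table entry controls the run")
    args = parser.parse_args(argv)
    return args.shot, load_settings(db, args.shot)


shot, settings = main(["-r", "122336"], make_demo_db())
```

Keeping the settings in a table rather than in flags also leaves the permanent record of each run that the slide mentions.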

Data Class Based Applications and Interfaces to Other Codes (pyd3d package)
- efit.py: Runs EFIT on Thomson scattering times, in snap mode, or off kfiles. Also sets up and runs a kinetic EFIT based on profiles generated by profile.py, does edge p' and j variation for stability analysis, reads and writes data to MDS+/EFIT files, and reads EFIT data into Data structures.
- elm.py: Determines ELM timing. Also calculates ELM energy loss from fast EFIT analysis.
- fasteq.py: Runs EFIT using fast magnetics data to look at ELM effects (energy loss).
- profiles.py: Computes full cross-section profiles with good edge resolution for electrons and ions. Stores results in MDS+.
- onetwo.py: Runs ONETWO and deals with its output.
- profdb.py: Makes an entry in the ITER pedestal profile database.
- baloo.py: Runs baloo and deals with its output.

Application program control table (Pgaccess)

efit.py Application Example
efit.py functions:
- setupdb_efit: Set up an entry in the efit_runs table for auto EFIT runs
efit_eqdsk.py functions (submodule of efit.py):
- convert_a(g): Converts an a(g)eqdsk to/from ASCII to big-endian binary
- get_a(g,m)dat: Reads all a(g,m)eqdsk data from MDS+ for a given shot into a dictionary of Data class objects
- read_a(g,m)files: Reads all a(g,m)eqdsk files in a given directory (./shot12345) into a dictionary of Data class objects
- write_k_from_mds: Read kfile data from MDS+ and write it to a file. Kfile data is only available in MDS+ for EFIT MDS+ data written with efit.py.
- write_g_from_mds: Read geqdsk data from MDS+ and write it to a file
- write_mds: Write aeqdsk, geqdsk, meqdsk, kfile and snap file data to MDS+; update the code_run database

efit.py Application Example (continued)
efit_run.py functions (submodule of efit.py):
- autocheck_efit_runs: Periodically checks the progress of a set of EFIT runs distributed over the DIII-D computer cluster
- autorun_efit: Run a series of EFITs based on efit_runs table entries; distributes runs across DIII-D computers
- check_efit_runs: Check the progress of EFIT distributed subprocesses
- run_efit: Run EFIT in snap, kfile, or kfile creation mode, distributing runs across the DIII-D cluster with load levelling. Can run in snap mode for Thomson scattering times.
efit_kinetic.py functions and classes (submodule of efit.py):
- run_kinetic: Set up and run a free-boundary kinetic EFIT based on profiles in MDS+ generated by profile.py, with the H-mode pedestal current density constrained to match the Sauter model (computed in efit_jsauter.py). CBOOT can be optimized against magnetics CHISQ. Magnetics data is averaged over the same intervals used in the profiles.
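The slides do not say how run_efit levels the load across the cluster. As one possible illustration, the sketch below uses a simple greedy scheme: each job goes to whichever host currently has the lowest load. The function `distribute`, the host names, and the cost estimates are all hypothetical.

```python
import heapq


def distribute(jobs, hosts):
    """Greedy load levelling: give each job to the least-loaded host.

    jobs  -- list of (job_name, estimated_cost)
    hosts -- dict mapping host_name -> current load
    """
    heap = [(load, host) for host, load in hosts.items()]
    heapq.heapify(heap)
    plan = {host: [] for host in hosts}
    # Placing the most expensive jobs first gives a more even final balance.
    for name, cost in sorted(jobs, key=lambda job: -job[1]):
        load, host = heapq.heappop(heap)
        plan[host].append(name)
        heapq.heappush(heap, (load + cost, host))
    return plan


# Four time slices to analyse, two hosts, one of which is already busy.
jobs = [("t1000", 3), ("t1500", 2), ("t2000", 2), ("t2500", 1)]
plan = distribute(jobs, {"nodeA": 0, "nodeB": 1})
```

With these inputs both hosts end up with nearly equal total load (5 on nodeA, 4 on nodeB).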

efit.py Application Example (continued)
efit_kinetic.py (submodule of efit.py):
- Kfile class: subclass of Namelist
  - .kinetic: Build a kinetic kfile from profile data in MDS+
efit_fluxav.py (submodule of efit.py):
- fluxav: Flux-surface average a set of standard or user-supplied functions
- psicont: Generate flux contours at normalized flux values
efit_jsauter.py (submodule of efit.py):
- jsauter: Compute the Sauter bootstrap and fully relaxed Ohmic current density profiles based on profiles in MDS+ and geqdsk parameters

efit.py Example (continued)
efit_kinetic.py:
- vary_ped: Starting with a kinetic fit, vary the pedestal characteristics in a series of fixed-boundary EFIT runs: pedestal pressure, current, collisionality, width, density and temperature separately. Used to map pedestal stability space.
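Varying each pedestal characteristic separately amounts to a one-at-a-time parameter scan around a baseline. The sketch below only generates the list of cases; the function name, the parameter names, the baseline values, and the scale factors are hypothetical, and no EFIT run is involved.

```python
def one_at_a_time_scan(baseline, factors):
    """Build scan cases in which exactly one parameter is scaled per case.

    baseline -- dict of parameter name -> reference value
    factors  -- multipliers applied to one parameter at a time
    """
    cases = []
    for key in sorted(baseline):
        for factor in factors:
            case = dict(baseline)          # all other parameters stay fixed
            case[key] = baseline[key] * factor
            cases.append((key, factor, case))
    return cases


# Hypothetical baseline values in arbitrary units.
baseline = {"pressure": 10.0, "width": 0.04}
cases = one_at_a_time_scan(baseline, (0.5, 1.5))
```

Each case would then be handed to a fixed-boundary equilibrium run, and the collected results would outline the stability boundary.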

Widget Based Visualization Tools
- pyd3tools (M. Wade): TK/BLT general data plotter
- Profplot.py: Glade/GTK+ widget for plotting (pgplot) profile.py fits and related quantities
- Pdb: GTK+ widget for plotting and fitting scalar database data (pgplot)
- Eqplot.py, Eqplot2.py (M. Makowski): Glade/GTK+ EFIT quantity plotter (pgplot)
- Sac (M. Makowski): GTK+ signal analysis widget (pgplot)

Widget Based Visualization Tools (continued)
- pyd3tools (M. Wade): TK/BLT general data plotter

Software Installation, Availability, and Documentation
- Python 2.5 and all packages are currently installed on the DIII-D NFS disks and maintained for Red Hat Enterprise Linux 4 and HP-UX 11.11 (also at PPPL on RHEL3).
- Previously built for RHEL3, Fedora 1-3, Solaris 6.2, Mac OS 10, and OSF1.
- Package set in RPM form for RHEL4,5 and Fedora 6,7.
  - Signed RPMs are installed in a YUM repository, allowing automatic updates, at https://diii-d.gat.com/~osborne/python/
- Sources in CVS, CVSROOT=/f/python/cvspython (not pyd3tools).
  - Group pyadmin has write access.
- Executables in /f/python/$ospath/bin
  - Start with Python or IPython (nicer interface) to set the environment.
- Documentation on packages and help on installation: https://diii-d.gat.com/~osborne/python/clickme.html