Python for Data Analysis and Visualiza4on. Fang (Cherry) Liu, Ph.D PACE Gatech July 2013

Size: px
Start display at page:

Download "Python for Data Analysis and Visualiza4on. Fang (Cherry) Liu, Ph.D [email protected] PACE Gatech July 2013"

Transcription

1 Python for Data Analysis and Visualiza4on Fang (Cherry) Liu, Ph.D PACE Gatech July 2013

2 Outline System requirements and IPython Why use python for data analysis and visula4on Data set US baby names Data Loading Data Processing using Lists Data Aggreg4on and Group PloTng and visualiza4on 2

3 System Setup Op4on 1 : (Preferred) Download and Install the Enthought Canopy product: hwps:// Enthought Canopy is free for Academic Users. This will install a full Python distribu4on onto your computer. Op4on 2: Download and Install Python(x,y) (This is for Windows only) hwps://code.google.com/p/pythonxy/wiki/downloads This will install a full Python distribu4on onto your computer. Note1: Op4ons 1 and 2 are mutually exclusive. Please do not install both Canopy and Python(x,y) on your computer. Note2: Downloading and installing either Canopy or Python(x,y) will take a long 4me. Note3: During this course, Canopy will be used to type and execute all commands (op4on 1). Op4on 3: Use the Python installed on PACE clusters. (You need a PACE account for this to work) If you choose this op4on, let me know and I'll send instruc4ons that will help ensure that your environment is setup properly for the tutorials. Op4on 4: Use the Python already installed on your laptop. As long as Numpy, SciPy, Matplotlib, IPython, and Pandas are installed on your laptop, you will be able follow both courses (Scien4fic Compu4ng and Data Analysis and Visualiza4on). 3

4 IPython An Interac4ve Compu4ng and Development Environment It provides an execute- explore workflow instead of typical edit- compile- run workflow of many other programming languages It provides very 4ght integra4on with the opera4ng system s shell and file system It also includes: A rich GUI console with inline plotng A web- based interac4ve notebook format A lightweight, fast parallel compu4ng engine 4

5 Why use Python for Data Analysis The Python language is easy to fall in love with Python is dis4nguished by its large and ac4ve scien4fic compu4ng community Adop4on of Python for scien4fic compu4ng in both industry applica4ons and academic research has increased significantly since the early 2000s Python s improved library support (pandas) made it a strong tool for data manipula4on tasks 5

6 Example: US Baby Names The United States Social Security Administra4on (SSA) has mad available data on the frequency of baby names from 1880 through 2012, this data set is ofen used in illustra4ng data manipula4on in R, Python, etc. The data can be obtained at: hwp:// Things can be done with this data set Visualize the propor4on of babies given a par4cular name Determine the naming trend Determine the most popular names in each year 6

7 Check the Data In IPython, MacOS or Linux: use the UNIX head to look at the first 10 lines of the one of the files. Windows: download the files, and click to open the files This is nicely comma- separated form. 7

8 Load Data Using csv module from the standard library, CSV means Comma Separated Values, and any delimiter can be chosen. The variable table contains records list in which each record has three fields : name, sex, count 8

9 Grouping the data based on sex To find the total births by sex, the groupby func4on is used: It returns an iterator for each group based on the key value which is extracted from x[1] (sex) Then traverses the group and get the total counts Be sure to do from itertools import groupby first 9

10 Anonymous (lamda) Func4ons Anonymous or lambda func4ons are simple func4ons consis4ng of a single statement, the result is the return value. Lamda func4ons are convenient in data analysis since there are many cases where data transforma4on func4ons will take func4ons as arguments. 10

11 Aggregate the data at the year and sex Since the data set is split into files by year, one need to traverse all the files to get the total number of births per year per sex level 11

12 The result list (Lef) first 10 records in pieces list (Right) last 10 records in pieces list 12

13 Matplotlib review Before we start plotng the result, let s review the plot first 13

14 Prepare the data for plot Currently, the result is a list of list, each internal list include three values, [year, female births, male births], to plot the births according to year and sex, the plot needs to have year as x- axis, and births as y- axis, while two lines will be showing to represent female and male birth. 14

15 Plot the total births by sex and year Plot 15

16 Reorganize the data Concatenate the all files together to prepare the further analysis. 16

17 Extract a subset of the data Find the top 1000 names for each sex/year combina4on, further narrow down the data set to facilitate further analysis, the sor4ng is ignored here since the input files are already in descending order 17

18 Compare the subset data with original data The subset data has much less records than the original data set, but represents the majority informa4on 18

19 Analyzing Naming Trends With the full data set and Top 1,000 data set in hand, we can start analyzing various naming trends of interest. SpliTng the Top 1,000 names into the boy and girl por4ons: 19

20 Analyzing Naming Trends (Cont.) Plot for a handful of names in a subplot, John, Harry, Marry, to compare their trends over the years, first prepare data set for each chosen name. 20

21 Analyzing Naming Trends (Cont.) Plot three curves ver4cally, with x- axis as years, y- axis as births, the result shows that those names have grown out of favor with American popula4on 21

22 Measuring the increase in naming diversity To explain why there is a decrease in the previous plots, we can measure the propor4on of births represented by the top 1000 most popular names by year and sex Step 1: find total of birth per year for each sex 22

23 Measuring the increase in naming diversity (Cont.) Step 2: compute the propor4on of top 1000 births to the total births per year per sex For boys: 23

24 Measuring the increase in naming For girls: diversity (Cont.) 24

25 Measuring the increase in naming diversity (Cont.) Plot the result shows that fewer parents are choosing the popular names for their children over the years 25

26 Measuring the increase in naming diversity (Cont.) Another interest metric is the number of dis4nct popular names, taken in order of popularity from highest to lowest in the top 50% of births. Step 1: Add the fourth column to girls1000 and boys1000 list, to represent the birth propor4on to the total birth of the given year, then sort the list in descending order on propor4on, sort the list again in ascending order on years. The result list will have each years records in a chunk with propor4on number in decreasing order. 26

27 Measuring the increase in naming For girls: diversity (Cont.) 27

28 Measuring the increase in naming For boys: diversity (Cont.) 28

29 Measuring the increase in naming diversity (Cont.) Step 2: Adding the propor4on for each year from highest un4l the total propor4on reaches 50%, recording the number of individual names 29

30 Measuring the increase in naming For girls: diversity (Cont.) 30

31 Measuring the increase in naming For boys: diversity (Cont.) 31

32 Measuring the increase in naming diversity (Cont.) Step 3: Plot the result, as you can see, girl names has always been more diverse than boy names, and the dis4nguished names become more over 4me. 32

33 Python Library for Data Analysis Pandas wriwen by Wes McKinney hwp://pandas.pydata.org/ provides rich data structures and func4ons working with structured data It is one of the cri4cal ingredients enabling Python to be a powerful and produc4ve data analysis environment. The primary object in pandas is called DataFrame a two- dimensional tabular, column- oriented data structure with both row and column labels Pandas combines the features of NumPy, spreadsheets and rela4onal databases 33

34 Useful Links Python Scien4fic Lecture Notes hwp://scipy- lectures.github.io/ Matplotlib hwp://matplotlib.org/ Documenta4on hwp://docs.python.org 34

Ibis: Scaling Python Analy=cs on Hadoop and Impala

Ibis: Scaling Python Analy=cs on Hadoop and Impala Ibis: Scaling Python Analy=cs on Hadoop and Impala Wes McKinney, Budapest BI Forum 2015-10- 14 @wesmckinn 1 Me R&D at Cloudera Serial creator of structured data tools / user interfaces Mathema=cian MIT

More information

Scientific Programming, Analysis, and Visualization with Python. Mteor 227 Fall 2015

Scientific Programming, Analysis, and Visualization with Python. Mteor 227 Fall 2015 Scientific Programming, Analysis, and Visualization with Python Mteor 227 Fall 2015 Python The Big Picture Interpreted General purpose, high-level Dynamically type Multi-paradigm Object-oriented Functional

More information

Programming in Slicer4

Programming in Slicer4 Programming in Slicer4 Sonia Pujol, Ph.D. Surgical Planning Laboratory, Harvard Medical School Paul Cézanne, Moulin sur la Couleuvre à Pontoise, 1881, Staatliche Museen zu Berlin, Na:onalgalerie Steve

More information

Introduction to Python

Introduction to Python 1 Daniel Lucio March 2016 Creator of Python https://en.wikipedia.org/wiki/guido_van_rossum 2 Python Timeline Implementation Started v1.0 v1.6 v2.1 v2.3 v2.5 v3.0 v3.1 v3.2 v3.4 1980 1991 1997 2004 2010

More information

Tutorial 2 Online and offline Ship Visualization tool Table of Contents

Tutorial 2 Online and offline Ship Visualization tool Table of Contents Tutorial 2 Online and offline Ship Visualization tool Table of Contents 1.Tutorial objective...2 1.1.Standard that will be used over this document...2 2. The online tool...2 2.1.View all records...3 2.2.Search

More information

Unlocking the True Value of Hadoop with Open Data Science

Unlocking the True Value of Hadoop with Open Data Science Unlocking the True Value of Hadoop with Open Data Science Kristopher Overholt Solution Architect Big Data Tech 2016 MinneAnalytics June 7, 2016 Overview Overview of Open Data Science Python and the Big

More information

Exercise 0. Although Python(x,y) comes already with a great variety of scientic Python packages, we might have to install additional dependencies:

Exercise 0. Although Python(x,y) comes already with a great variety of scientic Python packages, we might have to install additional dependencies: Exercise 0 Deadline: None Computer Setup Windows Download Python(x,y) via http://code.google.com/p/pythonxy/wiki/downloads and install it. Make sure that before installation the installer does not complain

More information

1. Base Programming. GIORGIO RUSSOLILLO - Cours de prépara+on à la cer+fica+on SAS «Base Programming»

1. Base Programming. GIORGIO RUSSOLILLO - Cours de prépara+on à la cer+fica+on SAS «Base Programming» 1. Base Programming GIORGIO RUSSOLILLO Cours de prépara+on à la cer+fica+on SAS «Base Programming» 9 What is SAS Highly flexible and integrated soiware environment; you can use SAS for: GIORGIO RUSSOLILLO

More information

CONTENTS. Introduc on 2. Undergraduate Program 4. BSC in Informa on Systems 4. Graduate Program 7. MSC in Informa on Science 7

CONTENTS. Introduc on 2. Undergraduate Program 4. BSC in Informa on Systems 4. Graduate Program 7. MSC in Informa on Science 7 1 1 2 CONTENTS Introducon 2 Undergraduate Program 4 BSC in Informaon Systems 4 Graduate Program 7 MSC in Informaon Science 7 MSC in Health Informacs 13 2 3 Introducon The School of Informaon Science at

More information

Analytic Modeling in Python

Analytic Modeling in Python Analytic Modeling in Python Why Choose Python for Analytic Modeling A White Paper by Visual Numerics August 2009 www.vni.com Analytic Modeling in Python Why Choose Python for Analytic Modeling by Visual

More information

Visualization with Excel Tools and Microsoft Azure

Visualization with Excel Tools and Microsoft Azure Visualization with Excel Tools and Microsoft Azure Introduction Power Query and Power Map are add-ins that are available as free downloads from Microsoft to enhance the data access and data visualization

More information

CRASH COURSE PYTHON. Het begint met een idee

CRASH COURSE PYTHON. Het begint met een idee CRASH COURSE PYTHON nr. Het begint met een idee This talk Not a programming course For data analysts, who want to learn Python For optimizers, who are fed up with Matlab 2 Python Scripting language expensive

More information

PyCompArch: Python-Based Modules for Exploring Computer Architecture Concepts

PyCompArch: Python-Based Modules for Exploring Computer Architecture Concepts PyCompArch: Python-Based Modules for Exploring Computer Architecture Concepts Workshop on Computer Architecture Education 2015 Dan Connors, Kyle Dunn, Ryan Bueter Department of Electrical Engineering University

More information

MAXIMIZING THE SUCCESS OF YOUR E-PROCUREMENT TECHNOLOGY INVESTMENT. How to Drive Adop.on, Efficiency, and ROI for the Long Term

MAXIMIZING THE SUCCESS OF YOUR E-PROCUREMENT TECHNOLOGY INVESTMENT. How to Drive Adop.on, Efficiency, and ROI for the Long Term MAXIMIZING THE SUCCESS OF YOUR E-PROCUREMENT TECHNOLOGY INVESTMENT How to Drive Adop.on, Efficiency, and ROI for the Long Term What We Will Cover Today Presenta(on Agenda! Who We Are! Our History! Par7al

More information

.nl ENTRADA. CENTR-tech 33. November 2015 Marco Davids, SIDN Labs. Klik om de s+jl te bewerken

.nl ENTRADA. CENTR-tech 33. November 2015 Marco Davids, SIDN Labs. Klik om de s+jl te bewerken Klik om de s+jl te bewerken Klik om de models+jlen te bewerken Tweede niveau Derde niveau Vierde niveau.nl ENTRADA Vijfde niveau CENTR-tech 33 November 2015 Marco Davids, SIDN Labs Wie zijn wij? Mijlpalen

More information

DNS Big Data Analy@cs

DNS Big Data Analy@cs Klik om de s+jl te bewerken Klik om de models+jlen te bewerken! Tweede niveau! Derde niveau! Vierde niveau DNS Big Data Analy@cs Vijfde niveau DNS- OARC Fall 2015 Workshop October 4th 2015 Maarten Wullink,

More information

Computational Mathematics with Python

Computational Mathematics with Python Computational Mathematics with Python Basics Claus Führer, Jan Erik Solem, Olivier Verdier Spring 2010 Claus Führer, Jan Erik Solem, Olivier Verdier Computational Mathematics with Python Spring 2010 1

More information

Understanding Cloud Compu2ng Services. Rain in business success with amazing solu2ons in Cloud technology

Understanding Cloud Compu2ng Services. Rain in business success with amazing solu2ons in Cloud technology Understanding Cloud Compu2ng Services Rain in business success with amazing solu2ons in Cloud technology What is Cloud Compu2ng? Cloud compu2ng encompasses various services and ac2vi2es carried out over

More information

Availability of the Program A free version is available of each (see individual programs for links).

Availability of the Program A free version is available of each (see individual programs for links). Choosing a Programming Platform Diane Hobenshield Tepylo, Lisa Floyd, and Steve Floyd (Computer Science and Mathematics teachers) The Tasks Working Group had many questions and concerns about choosing

More information

NEXT Analytics Business Intelligence User Guide

NEXT Analytics Business Intelligence User Guide NEXT Analytics Business Intelligence User Guide This document provides an overview of the powerful business intelligence functions embedded in NEXT Analytics v5. These functions let you build more useful

More information

Computational Mathematics with Python

Computational Mathematics with Python Boolean Arrays Classes Computational Mathematics with Python Basics Olivier Verdier and Claus Führer 2009-03-24 Olivier Verdier and Claus Führer Computational Mathematics with Python 2009-03-24 1 / 40

More information

Module 9 Ad Hoc Queries

Module 9 Ad Hoc Queries Module 9 Ad Hoc Queries Objectives Familiarize the User with basic steps necessary to create ad hoc queries using the Data Browser. Topics Ad Hoc Queries Create a Data Browser query Filter data Save a

More information

DATA SCIENCE CURRICULUM WEEK 1 ONLINE PRE-WORK INSTALLING PACKAGES COMMAND LINE CODE EDITOR PYTHON STATISTICS PROJECT O5 PROJECT O3 PROJECT O2

DATA SCIENCE CURRICULUM WEEK 1 ONLINE PRE-WORK INSTALLING PACKAGES COMMAND LINE CODE EDITOR PYTHON STATISTICS PROJECT O5 PROJECT O3 PROJECT O2 DATA SCIENCE CURRICULUM Before class even begins, students start an at-home pre-work phase. When they convene in class, students spend the first eight weeks doing iterative, project-centered skill acquisition.

More information

Session 85 IF, Predictive Analytics for Actuaries: Free Tools for Life and Health Care Analytics--R and Python: A New Paradigm!

Session 85 IF, Predictive Analytics for Actuaries: Free Tools for Life and Health Care Analytics--R and Python: A New Paradigm! Session 85 IF, Predictive Analytics for Actuaries: Free Tools for Life and Health Care Analytics--R and Python: A New Paradigm! Moderator: David L. Snell, ASA, MAAA Presenters: Brian D. Holland, FSA, MAAA

More information

1. The orange button 2. Audio Type 3. Close apps 4. Enlarge my screen 5. Headphones 6. Questions Pane. SparkR 2

1. The orange button 2. Audio Type 3. Close apps 4. Enlarge my screen 5. Headphones 6. Questions Pane. SparkR 2 SparkR 1. The orange button 2. Audio Type 3. Close apps 4. Enlarge my screen 5. Headphones 6. Questions Pane SparkR 2 Lecture slides and/or video will be made available within one week Live Demonstration

More information

Introduction to Python

Introduction to Python Introduction to Python Sophia Bethany Coban Problem Solving By Computer March 26, 2014 Introduction to Python Python is a general-purpose, high-level programming language. It offers readable codes, and

More information

EXST SAS Lab Lab #4: Data input and dataset modifications

EXST SAS Lab Lab #4: Data input and dataset modifications EXST SAS Lab Lab #4: Data input and dataset modifications Objectives 1. Import an EXCEL dataset. 2. Infile an external dataset (CSV file) 3. Concatenate two datasets into one 4. The PLOT statement will

More information

Today's Topics. COMP 388/441: Human-Computer Interaction. simple 2D plotting. 1D techniques. Ancient plotting techniques. Data Visualization:

Today's Topics. COMP 388/441: Human-Computer Interaction. simple 2D plotting. 1D techniques. Ancient plotting techniques. Data Visualization: COMP 388/441: Human-Computer Interaction Today's Topics Overview of visualization techniques 1D charts, 2D plots, 3D+ techniques, maps A few guidelines for scientific visualization methods, guidelines,

More information

Phone Systems Buyer s Guide

Phone Systems Buyer s Guide Phone Systems Buyer s Guide Contents How Cri(cal is Communica(on to Your Business? 3 Fundamental Issues 4 Phone Systems Basic Features 6 Features for Users with Advanced Needs 10 Key Ques(ons for All Buyers

More information

MultiAlign Software. Windows GUI. Console Application. MultiAlign Software Website. Test Data

MultiAlign Software. Windows GUI. Console Application. MultiAlign Software Website. Test Data MultiAlign Software This documentation describes MultiAlign and its features. This serves as a quick guide for starting to use MultiAlign. MultiAlign comes in two forms: as a graphical user interface (GUI)

More information

Computational Mathematics with Python

Computational Mathematics with Python Numerical Analysis, Lund University, 2011 1 Computational Mathematics with Python Chapter 1: Basics Numerical Analysis, Lund University Claus Führer, Jan Erik Solem, Olivier Verdier, Tony Stillfjord Spring

More information

REDCap Longitudinal Studies & Surveys

REDCap Longitudinal Studies & Surveys REDCap Longitudinal Studies & Surveys ITHS Biomedical Informa2cs Core [email protected] Bas de Veer MS Research Consultant REDCap version: 6.0.1 Last updated October 22, 2014 1 Goals & Agenda Goals

More information

SPSS INSTRUCTION CHAPTER 1

SPSS INSTRUCTION CHAPTER 1 SPSS INSTRUCTION CHAPTER 1 Performing the data manipulations described in Section 1.4 of the chapter require minimal computations, easily handled with a pencil, sheet of paper, and a calculator. However,

More information

SAS Analyst for Windows Tutorial

SAS Analyst for Windows Tutorial Updated: August 2012 Table of Contents Section 1: Introduction... 3 1.1 About this Document... 3 1.2 Introduction to Version 8 of SAS... 3 Section 2: An Overview of SAS V.8 for Windows... 3 2.1 Navigating

More information

Intro to scientific programming (with Python) Pietro Berkes, Brandeis University

Intro to scientific programming (with Python) Pietro Berkes, Brandeis University Intro to scientific programming (with Python) Pietro Berkes, Brandeis University Next 4 lessons: Outline Scientific programming: best practices Classical learning (Hoepfield network) Probabilistic learning

More information

Client Marketing: Sets

Client Marketing: Sets Client Marketing Client Marketing: Sets Purpose Client Marketing Sets are used for selecting clients from the client records based on certain criteria you designate. Once the clients are selected, you

More information

BornAgain: a versa.le framework for modelling and fi6ng GISAS

BornAgain: a versa.le framework for modelling and fi6ng GISAS BornAgain: a versa.le framework for modelling and fi6ng GISAS Céline Durniak, Walter Van Herck, Gennady Pospelov, Joachim WuIke Scien&fic Compu&ng Group at MLZ Jülich Center for Neutron Science JCNS Koordinierungstreffen,

More information

Query 4. Lesson Objectives 4. Review 5. Smart Query 5. Create a Smart Query 6. Create a Smart Query Definition from an Ad-hoc Query 9

Query 4. Lesson Objectives 4. Review 5. Smart Query 5. Create a Smart Query 6. Create a Smart Query Definition from an Ad-hoc Query 9 TABLE OF CONTENTS Query 4 Lesson Objectives 4 Review 5 Smart Query 5 Create a Smart Query 6 Create a Smart Query Definition from an Ad-hoc Query 9 Query Functions and Features 13 Summarize Output Fields

More information

Big Data Paradigms in Python

Big Data Paradigms in Python Big Data Paradigms in Python San Diego Data Science and R Users Group January 2014 Kevin Davenport! http://kldavenport.com [email protected] @KevinLDavenport Thank you to our sponsors: Setting up

More information

Software Automated Testing

Software Automated Testing Software Automated Testing Keyword Data Driven Framework Selenium Robot Best Practices Agenda ² Automation Engineering Introduction ² Keyword Data Driven ² How to build a Test Automa7on Framework ² Selenium

More information

Suite. How to Use GrandMaster Suite. Exporting with ODBC

Suite. How to Use GrandMaster Suite. Exporting with ODBC Suite How to Use GrandMaster Suite Exporting with ODBC This page intentionally left blank ODBC Export 3 Table of Contents: HOW TO USE GRANDMASTER SUITE - EXPORTING WITH ODBC...4 OVERVIEW...4 WHAT IS ODBC?...

More information

SAP Predictive Analysis Installation

SAP Predictive Analysis Installation SAP Predictive Analysis Installation SAP Predictive Analysis is the latest addition to the SAP BusinessObjects suite and introduces entirely new functionality to the existing Business Objects toolbox.

More information

Introduction Course in SPSS - Evening 1

Introduction Course in SPSS - Evening 1 ETH Zürich Seminar für Statistik Introduction Course in SPSS - Evening 1 Seminar für Statistik, ETH Zürich All data used during the course can be downloaded from the following ftp server: ftp://stat.ethz.ch/u/sfs/spsskurs/

More information

Tutorial 3. Maintaining and Querying a Database

Tutorial 3. Maintaining and Querying a Database Tutorial 3 Maintaining and Querying a Database Microsoft Access 2010 Objectives Find, modify, and delete records in a table Learn how to use the Query window in Design view Create, run, and save queries

More information

Microsoft Office 2010

Microsoft Office 2010 Access Tutorial 3 Maintaining and Querying a Database Microsoft Office 2010 Objectives Find, modify, and delete records in a table Learn how to use the Query window in Design view Create, run, and save

More information

How to Download Census Data from American Factfinder and Display it in ArcMap

How to Download Census Data from American Factfinder and Display it in ArcMap How to Download Census Data from American Factfinder and Display it in ArcMap Factfinder provides census and ACS (American Community Survey) data that can be downloaded in a tabular format and joined with

More information

MicroStrategy Analytics Express User Guide

MicroStrategy Analytics Express User Guide MicroStrategy Analytics Express User Guide Analyzing Data with MicroStrategy Analytics Express Version: 4.0 Document Number: 09770040 CONTENTS 1. Getting Started with MicroStrategy Analytics Express Introduction...

More information

CONTENTS MANUFACTURERS GUIDE FOR PUBLIC USERS

CONTENTS MANUFACTURERS GUIDE FOR PUBLIC USERS OPA DATABASE GUIDE FOR PUBLIC USERS - MARCH 2013 VERSION 5.0 CONTENTS Manufacturers 1 Manufacturers 1 Registering a Manufacturer 2 Search Manufacturers 3 Advanced Search Options 3 Searching for Manufacturers

More information

CowCalf5. for Dummies. Quick Reference. D ate: 3/26 / 2 0 10

CowCalf5. for Dummies. Quick Reference. D ate: 3/26 / 2 0 10 CowCalf5 for Dummies Quick Reference D ate: 3/26 / 2 0 10 Page 2 Email: [email protected] Page 3 Table of Contents System Requirement and Installing CowCalf5 4 Creating a New Herd 5 Start Entering Records

More information

Python programming guide for Earth Scientists. Maarten J. Waterloo and Vincent E.A. Post

Python programming guide for Earth Scientists. Maarten J. Waterloo and Vincent E.A. Post Python programming guide for Earth Scientists Maarten J. Waterloo and Vincent E.A. Post Amsterdam Critical Zone Hydrology group September 2015 Cover page: Meteorological tower in an abandoned agricultural

More information

Music Mood Classification

Music Mood Classification Music Mood Classification CS 229 Project Report Jose Padial Ashish Goel Introduction The aim of the project was to develop a music mood classifier. There are many categories of mood into which songs may

More information

Installing Intel Parallel Studio XE Composer Edition for Fortran Windows 2016

Installing Intel Parallel Studio XE Composer Edition for Fortran Windows 2016 Installing Intel Parallel Studio XE Composer Edition for Fortran Windows 2016 These instructions are specific to installing Intel Parallel Studio XE Composer for Fortran Windows 2016 and do not apply to

More information

RIFIS Ad Hoc Reports

RIFIS Ad Hoc Reports RIFIS Ad Hoc Reports To retrieve the entire list of all Ad Hoc Reports, including the Base reports and any additional reports published to your Role, select Ad Hoc for the Type under Filter Report By and

More information

We are pleased to offer the following program to Woodstock Area Educators:

We are pleased to offer the following program to Woodstock Area Educators: DATE: Spring 2016 TO: RE: Woodstock Area Educators Upcoming Cohort Programs Presently, many teachers are enrolled in cohort graduate programs through partnerships between local regional offices of education,

More information

Overview on Modern Accelerators and Programming Paradigms Ivan Giro7o [email protected]

Overview on Modern Accelerators and Programming Paradigms Ivan Giro7o igiro7o@ictp.it Overview on Modern Accelerators and Programming Paradigms Ivan Giro7o [email protected] Informa(on & Communica(on Technology Sec(on (ICTS) Interna(onal Centre for Theore(cal Physics (ICTP) Mul(ple Socket

More information

IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA

IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA Summer 2013, Version 2.0 Table of Contents Introduction...2 Downloading the

More information

Statistical Data Mining. Practical Assignment 3 Discriminant Analysis and Decision Trees

Statistical Data Mining. Practical Assignment 3 Discriminant Analysis and Decision Trees Statistical Data Mining Practical Assignment 3 Discriminant Analysis and Decision Trees In this practical we discuss linear and quadratic discriminant analysis and tree-based classification techniques.

More information

HOW TO CREATE APPS FOR TRAINING. A step- by- step guide to crea2ng a great training app for your company

HOW TO CREATE APPS FOR TRAINING. A step- by- step guide to crea2ng a great training app for your company HOW TO CREATE APPS FOR TRAINING A step- by- step guide to crea2ng a great training app for your company From compliance and health & safety to employee induction and self-assessment, there are endless

More information

Analysis Programs DPDAK and DAWN

Analysis Programs DPDAK and DAWN Analysis Programs DPDAK and DAWN An Overview Gero Flucke FS-EC PNI-HDRI Spring Meeting April 13-14, 2015 Outline Introduction Overview of Analysis Programs: DPDAK DAWN Summary Gero Flucke (DESY) Analysis

More information

6.170 Tutorial 3 - Ruby Basics

6.170 Tutorial 3 - Ruby Basics 6.170 Tutorial 3 - Ruby Basics Prerequisites 1. Have Ruby installed on your computer a. If you use Mac/Linux, Ruby should already be preinstalled on your machine. b. If you have a Windows Machine, you

More information

Why are we teaching you VisIt?

Why are we teaching you VisIt? VisIt Tutorial Why are we teaching you VisIt? Interactive (GUI) Visualization and Analysis tool Multiplatform, Free and Open Source The interface looks the same whether you run locally or remotely, serial

More information

SPSS Step-by-Step Tutorial: Part 1

SPSS Step-by-Step Tutorial: Part 1 SPSS Step-by-Step Tutorial: Part 1 For SPSS Version 11.5 DataStep Development 2004 Table of Contents 1 SPSS Step-by-Step 5 Introduction 5 Installing the Data 6 Installing files from the Internet 6 Installing

More information

Email Mail Merge Using Thunderbird. Bob Booth February 2009 AP-Tbird2

Email Mail Merge Using Thunderbird. Bob Booth February 2009 AP-Tbird2 Email Mail Merge Using Thunderbird. Bob Booth February 2009 AP-Tbird2 University of Sheffield Contents 1. Introduction... 3 2. Installing the Mail Tweak Plug-In... 4 2.1 DOWNLOADING MAIL TWEAK... 4 2.2

More information

Microsoft Excel 2010 Part 3: Advanced Excel

Microsoft Excel 2010 Part 3: Advanced Excel CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES Microsoft Excel 2010 Part 3: Advanced Excel Winter 2015, Version 1.0 Table of Contents Introduction...2 Sorting Data...2 Sorting

More information

Chapter 24: Creating Reports and Extracting Data

Chapter 24: Creating Reports and Extracting Data Chapter 24: Creating Reports and Extracting Data SEER*DMS includes an integrated reporting and extract module to create pre-defined system reports and extracts. Ad hoc listings and extracts can be generated

More information

Step 2: Save the file as an Excel file for future editing, adding more data, changing data, to preserve any formulas you were using, etc.

Step 2: Save the file as an Excel file for future editing, adding more data, changing data, to preserve any formulas you were using, etc. R is a free statistical software environment that can run a wide variety of tests based on different downloadable packages and can produce a wide variety of simple to more complex graphs based on the data

More information

Using SPSS, Chapter 2: Descriptive Statistics

Using SPSS, Chapter 2: Descriptive Statistics 1 Using SPSS, Chapter 2: Descriptive Statistics Chapters 2.1 & 2.2 Descriptive Statistics 2 Mean, Standard Deviation, Variance, Range, Minimum, Maximum 2 Mean, Median, Mode, Standard Deviation, Variance,

More information

Assignment 5: Visualization

Assignment 5: Visualization Assignment 5: Visualization Arash Vahdat March 17, 2015 Readings Depending on how familiar you are with web programming, you are recommended to study concepts related to CSS, HTML, and JavaScript. The

More information

Programming in Python. Basic information. Teaching. Administration Organisation Contents of the Course. Jarkko Toivonen. Overview of Python

Programming in Python. Basic information. Teaching. Administration Organisation Contents of the Course. Jarkko Toivonen. Overview of Python Programming in Python Jarkko Toivonen Department of Computer Science University of Helsinki September 18, 2009 Administration Organisation Contents of the Course Overview of Python Jarkko Toivonen (CS

More information

Big Data. The Big Picture. Our flexible and efficient Big Data solu9ons open the door to new opportuni9es and new business areas

Big Data. The Big Picture. Our flexible and efficient Big Data solu9ons open the door to new opportuni9es and new business areas Big Data The Big Picture Our flexible and efficient Big Data solu9ons open the door to new opportuni9es and new business areas What is Big Data? Big Data gets its name because that s what it is data that

More information

Data Analysis for Yield Improvement using TIBCO s Spotfire Data Analysis Software

Data Analysis for Yield Improvement using TIBCO s Spotfire Data Analysis Software Page 327 Data Analysis for Yield Improvement using TIBCO s Spotfire Data Analysis Software Andrew Choo, Thorsten Saeger TriQuint Semiconductor Corporation 2300 NE Brookwood Parkway, Hillsboro, OR 97124

More information

OS/Run'me and Execu'on Time Produc'vity

OS/Run'me and Execu'on Time Produc'vity OS/Run'me and Execu'on Time Produc'vity Ron Brightwell, Technical Manager Scalable System SoAware Department Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation,

More information

COS 116 The Computational Universe Laboratory 9: Virus and Worm Propagation in Networks

COS 116 The Computational Universe Laboratory 9: Virus and Worm Propagation in Networks COS 116 The Computational Universe Laboratory 9: Virus and Worm Propagation in Networks You learned in lecture about computer viruses and worms. In this lab you will study virus propagation at the quantitative

More information

ANALYTICAL TECHNIQUES FOR DATA VISUALIZATION

ANALYTICAL TECHNIQUES FOR DATA VISUALIZATION ANALYTICAL TECHNIQUES FOR DATA VISUALIZATION CSE 537 Ar@ficial Intelligence Professor Anita Wasilewska GROUP 2 TEAM MEMBERS: SAEED BOOR BOOR - 110564337 SHIH- YU TSAI - 110385129 HAN LI 110168054 SOURCES

More information

PHI Audit Us er Guide

PHI Audit Us er Guide PHI Audit Us er Guide Table Of Contents PHI Audit Overview... 1 Auditable Actions... 1 Navigating the PHI Audit Dashboard... 2 Access PHI Audit... 4 Create a Patient Audit... 6 Create a User Audit... 10

More information

Big Data Frameworks: Scala and Spark Tutorial

Big Data Frameworks: Scala and Spark Tutorial Big Data Frameworks: Scala and Spark Tutorial 13.03.2015 Eemil Lagerspetz, Ella Peltonen Professor Sasu Tarkoma These slides: http://is.gd/bigdatascala www.cs.helsinki.fi Functional Programming Functional

More information

Detailed Claim Information via DCA Access Online User s Guide

Detailed Claim Information via DCA Access Online User s Guide NCCI S 2016 DATA EDUCATIONAL PROGRAM Detailed Claim Information via DCA Access Online User s Guide January 26 29, 2016 Palm Beach County Convention Center West Palm Beach, FL Table of Contents Overview...

More information

Apache Spark 11/10/15. Context. Reminder. Context. What is Spark? A GrowingStack

Apache Spark 11/10/15. Context. Reminder. Context. What is Spark? A GrowingStack Apache Spark Document Analysis Course (Fall 2015 - Scott Sanner) Zahra Iman Some slides from (Matei Zaharia, UC Berkeley / MIT& Harold Liu) Reminder SparkConf JavaSpark RDD: Resilient Distributed Datasets

More information

SOAL-SOAL MICROSOFT EXCEL 1. The box on the chart that contains the name of each individual record is called the. A. cell B. title C. axis D.

SOAL-SOAL MICROSOFT EXCEL 1. The box on the chart that contains the name of each individual record is called the. A. cell B. title C. axis D. SOAL-SOAL MICROSOFT EXCEL 1. The box on the chart that contains the name of each individual record is called the. A. cell B. title C. axis D. legend 2. If you want all of the white cats grouped together

More information

An Introduction to Using Python with Microsoft Azure

An Introduction to Using Python with Microsoft Azure An Introduction to Using Python with Microsoft Azure If you build technical and scientific applications, you're probably familiar with Python. What you might not know is that there are now tools available

More information

Analyze IQ Spectra Manager Version 1.2

Analyze IQ Spectra Manager Version 1.2 Analyze IQ Spectra Manager Version 1.2 User Manual Document Version: 1.2-2010-02-15 Copyright Analyze IQ Limited, 2008-2010. All Rights Reserved. Table of Contents 1 Introduction... 3 2 Installation...

More information

How To Use Optimum Control EDI Import. EDI Invoice Import. EDI Supplier Setup General Set up

How To Use Optimum Control EDI Import. EDI Invoice Import. EDI Supplier Setup General Set up How To Use Optimum Control EDI Import EDI Invoice Import This optional module will download digital invoices into Optimum Control, updating pricing, stock levels and account information automatically with

More information

By: Peter K. Mulwa MSc (UoN), PGDE (KU), BSc (KU) Email: [email protected]

By: Peter K. Mulwa MSc (UoN), PGDE (KU), BSc (KU) Email: Peter.kyalo@uonbi.ac.ke SPREADSHEETS FOR MARKETING & SALES TRACKING - DATA ANALYSIS TOOLS USING MS EXCEL By: Peter K. Mulwa MSc (UoN), PGDE (KU), BSc (KU) Email: [email protected] Objectives By the end of the session, participants

More information

Protec'ng Communica'on Networks, Devices, and their Users: Technology and Psychology

Protec'ng Communica'on Networks, Devices, and their Users: Technology and Psychology Protec'ng Communica'on Networks, Devices, and their Users: Technology and Psychology Alexey Kirichenko, F- Secure Corpora7on ICT SHOK, Future Internet program 30.5.2012 Outline 1. Security WP (WP6) overview

More information

Advanced Functions and Modules

Advanced Functions and Modules Advanced Functions and Modules CB2-101 Introduction to Scientific Computing November 19, 2015 Emidio Capriotti http://biofold.org/ Institute for Mathematical Modeling of Biological Systems Department of

More information

Rarefaction Method DRAFT 1/5/2016 Our data base combines taxonomic counts from 23 agencies. The number of organisms identified and counted per sample

Rarefaction Method DRAFT 1/5/2016 Our data base combines taxonomic counts from 23 agencies. The number of organisms identified and counted per sample Rarefaction Method DRAFT 1/5/2016 Our data base combines taxonomic counts from 23 agencies. The number of organisms identified and counted per sample differs among agencies. Some count 100 individuals

More information

SPSS: Getting Started. For Windows

SPSS: Getting Started. For Windows For Windows Updated: August 2012 Table of Contents Section 1: Overview... 3 1.1 Introduction to SPSS Tutorials... 3 1.2 Introduction to SPSS... 3 1.3 Overview of SPSS for Windows... 3 Section 2: Entering

More information

Scientific Programming in Python

Scientific Programming in Python UCSD March 9, 2009 What is Python? Python in a very high level (scripting) language which has gained widespread popularity in recent years. It is: What is Python? Python in a very high level (scripting)

More information

EARTHSOFT DNREC. EQuIS Data Processor Tutorial

EARTHSOFT DNREC. EQuIS Data Processor Tutorial EARTHSOFT DNREC EQuIS Data Processor Tutorial Contents 1. THE STANDALONE EQUIS DATA PROCESSOR... 3 1.1 GETTING STARTED... 3 1.2 FORMAT FILE... 4 1.3 UNDERSTANDING THE DNREC IMPORT FORMAT... 5 1.4 REFERENCE

More information