Data organization, data modeling, and databases

1 Data organization, data modeling, and databases
Mark Schildhauer
National Center for Ecological Analysis & Synthesis, University of California, Santa Barbara
ESA 2011
Credits: Tauntingpanda, Anita363, Stonebird, NeilsPhotography, Rick Smit, Jschinker

2 Spreadsheets: primordial data entry tool of the digital age

3 Spreadsheets: The Good
- Quick on the draw (clickety-click and you're ready to fire)
- Always there in time (on most everyone's computer)
- Smarter than he lets on (stats, pivot tables, VB scripts)
- Cleans up real pretty (graphics, fonts, colors, borders)

4 Spreadsheets: The Bad
- Also a fast shooter (click&fire; click&fire; click&fire)
- No scruples (delete row, click&fire, ctrl-x/ctrl-c, click&fire, re-sort, save)
- Talks a good story, but didn't get much education (e.g.

5 Spreadsheets: The UGLY
- Ill-mannered: takes your data prisoner; conflates raw data with summary data
- Gaudy: use of visual cues (color, font, borders) to indicate critical metadata or other semantic tidbits
- Shifty: cross-linking of worksheets sets up invisible dependencies
- Shiftless: provenance is entirely lost
The more complicated your spreadsheet, the UGLIER it gets in terms of using it with other software

6 Spreadsheets: Best Practices
- Think before you collect your data
- Develop a MODEL for your data, and implement it in Excel
- Use Excel to create a TABLE
- A TABLE has a formal definition coming from RELATIONAL ALGEBRA: a set of VALUES organized into COLUMNS, with COLUMNS grouped together in a TUPLE or ROW
- You can add ROWS and it's still the same TABLE
- But if you add COLUMNS, it's really a different TABLE
- One COLUMN in every TABLE is identified as a KEY
- No two ROWS can have the same KEY, thus no two ROWS can be the same (i.e. you need a unique ID; the default is row-num; how else would you know if you have a duplicate?)
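
The KEY rule above can be tried out directly. The following is a minimal sketch, assuming Python's built-in sqlite3 module and a hypothetical `observations` table (the table name, column names, and values are illustrative, not from the slides). It shows a database refusing a second ROW with the same KEY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: record_id plays the role of the KEY column.
conn.execute(
    "CREATE TABLE observations ("
    "record_id INTEGER PRIMARY KEY, site TEXT, count INTEGER)"
)
conn.execute("INSERT INTO observations VALUES (1, 'NY', 13)")


def try_insert(row):
    """Return True if the row was accepted, False if it broke the key rule."""
    try:
        conn.execute("INSERT INTO observations VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


accepted_new = try_insert((2, "LA", 7))    # new key value: accepted
accepted_dup = try_insert((1, "NY", 13))   # repeated key value: rejected
row_count = conn.execute("SELECT COUNT(*) FROM observations").fetchone()[0]
```

A spreadsheet would silently accept the duplicate row; here the duplicate never reaches the table.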

7 Best Practices
- Columns of data are consistent: only numbers, dates, or text
- Consistent names, codes, formats (date) used in each column
- Data are all in one table: the computer works better on a single table than on multiple small tables that require a lot of human intervention
- Descriptive file name

8 Spreadsheets: Best Practices (SQL says it nicely):
    CREATE TABLE SEV_SmallMammalData (
        Date date,
        Site char(50),
        Plot integer,
        Species char(50),
        Weight float,
        Adult char(2),
        Comment char(50));
- Name your Tables
- Name the Attributes in the Tables
- Type your Attributes
- Establish Key relationships among your Tables
(PS: don't forget to provide other CRITICAL METADATA; the above information is "necessary but not sufficient")
That is Data Modeling!!

9 Best practices evident here?

10 Critical Metadata?
- Create robust metadata that is discoverable, and facilitates re-use of your data
- Use existing standards whenever possible (FGDC/ISO 19115, Darwin Core, EML, Dublin Core, etc.)
- Time, Place, and ResponsibleParty always important
- Link to relevant thesauri and ontologies in order to better constrain semantics
- Units needed for all Attributes!!!?
- Provide references (URIs) to associated data catalogues, protocol/method specifications, data downloads, project websites, etc.

11 A word about Formats
- There is no shame in using CSV (comma-separated value) format for archiving Tables:
  - Every computer can read it
  - It's been around forever
  - It will be around for a long while more
- Databases typically require a specific application to read their proprietary binary formats: DBF (DB IV), MDB (MS Access), or e.g. even XLS
- CSV in ASCII (watch out for dreaded LF or CRLF line endings!) to be expensive for storage, and not readily queryable
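
As an illustration of the CSV advice and the line-ending warning, here is a small sketch using Python's standard csv module. The column names and values are hypothetical. It writes a table with explicit LF endings, then reads back a copy that arrived with CRLF endings.

```python
import csv
import io

# Hypothetical table contents; every value is stored as text in CSV.
rows = [
    {"Date": "2001", "Site": "NY", "Count": "13"},
    {"Date": "2001", "Site": "LA", "Count": "7"},
]

# Write with an explicit line terminator so the result does not depend
# on the platform default.
buffer = io.StringIO(newline="")
writer = csv.DictWriter(
    buffer, fieldnames=["Date", "Site", "Count"], lineterminator="\n"
)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

# Simulate the same file after a trip through a system that uses CRLF.
crlf_text = csv_text.replace("\n", "\r\n")

# Opening with newline="" lets the csv module handle either ending itself.
read_back = list(csv.DictReader(io.StringIO(crlf_text, newline="")))
```

When working with real files, the same idea applies: pass `newline=""` to `open()` and let the csv module deal with the endings.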

12 Terminological Soup
- Table = Relation = Worksheet = Data set
- Column = Variable = Attribute = Characteristic
- Row = Record = Tuple <> Observation (eek)
- Keys are used to Join or Merge
- Cell = Value = Measurement ("Observation")
- Data Model = Schema
Even if you are using a Spreadsheet/Worksheet, you should treat it like a Table

13 Databases & DBMS & RDBMS
- You can use Spreadsheets like a Database
- You can also store your data in R, SAS, MATLAB, etc. and still be using it like a Database
- But this means you are conforming to best practices

14 So why the fancy-schmancy?
- Databases often ENFORCE good practice
- You are required to define TABLES, ATTRIBUTES, and their relationships (CONSTRAINTS)
Also, Databases typically provide:
- Far better scalability (millions+ records)
- Far better features for subsetting/querying data
- Scripted language (SQL) with lots of awesome features
- Dramatic reductions in redundancy and potential errors in data entry through normalization

15 Quick start: Normalization?
A formal process for modeling your data
Atomization (1NF): don't put compound values into single cells
- Separate LNAME, FNAME
- Separate ADDRESS into CITY, STATE, ZIP_CODE
- Don't proliferate COLUMNS; proliferate ROWS (go LONG, not WIDE)
WIDE table columns: DATE, NY, LA, TX
LONG table columns: Date, Site, Count
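
The "go LONG, not WIDE" advice can be sketched in a few lines of plain Python. The helper name `wide_to_long` and all cell values below are hypothetical; only the column names come from the slide.

```python
def wide_to_long(wide_rows, id_column, key_name, value_name):
    """Turn one column per site into one row per (date, site) pair."""
    long_rows = []
    for row in wide_rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every long row
            long_rows.append(
                {id_column: row[id_column], key_name: column, value_name: value}
            )
    return long_rows


# WIDE: one column per site (values are made up for illustration).
wide = [{"Date": 2001, "NY": 5, "LA": 9, "TX": 13}]

# LONG: adding a new site later means adding rows, not columns.
long_rows = wide_to_long(wide, id_column="Date", key_name="Site", value_name="Count")
```

In the long form the table definition stays the same no matter how many sites are sampled, which is the property the earlier TABLE definition asks for.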

16 Quick start: Normalization?
A formal process for modeling your data
(2NF) Identify entities in your data, and put these into separate tables
E.g. separate Taxonomic information and Geographic information into two tables

Taxon                   Date       Location  Count
Thalassoma bifasciatum  1984Jun12  Ruber     211
Thalassoma bifasciatum  1984Jul25  Ruber     1345
Thalassoma bifasciatum  1984Sep19  Ruber     976
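
One way to picture the entity split is the sketch below, which uses the three rows from the slide. The function name `split_entities` and the integer IDs it assigns are assumptions made for illustration; the slide does not prescribe how keys are generated.

```python
# The flat table from the slide: taxon and location names repeat on every row.
flat = [
    ("Thalassoma bifasciatum", "1984Jun12", "Ruber", 211),
    ("Thalassoma bifasciatum", "1984Jul25", "Ruber", 1345),
    ("Thalassoma bifasciatum", "1984Sep19", "Ruber", 976),
]


def split_entities(rows):
    """Move taxon and location names into lookup tables keyed by an ID."""
    taxa, locations, observations = {}, {}, []
    for taxon, date, location, count in rows:
        # setdefault returns the existing ID, or stores and returns a new one.
        taxon_id = taxa.setdefault(taxon, len(taxa) + 1)
        location_id = locations.setdefault(location, len(locations) + 1)
        observations.append((taxon_id, date, location_id, count))
    return taxa, locations, observations


taxa, locations, observations = split_entities(flat)
```

Each name is now typed once, so a spelling correction touches one row in a lookup table instead of every observation.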

17 What is a relational database?
Sample sites: *siteid site_name latitude longitude description
Samples: *sampleid siteid sample_date speciesid height flowering flag comments
Species: *speciesid species_name common_name family order
* Denotes the primary key
- A set of tables
- Relationships among the tables
- A command language to specify (DDL) and query (DML) these

18 Database Features: Explicit control over data types

Date          Site         Height               Flowering
<dates only>  <text only>  <real numbers only>  <y and n only>

Advantages:
- quality control
- performance
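
The column rules on this slide can be approximated with constraints. This is a sketch under stated assumptions: it uses SQLite through Python's sqlite3 module, where declared types are lenient, so the rules are written as CHECK constraints. The table name, the ISO date pattern, and the sample values are illustrative choices, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Phenology (
        -- dates only: checks the YYYY-MM-DD shape, not calendar validity
        SampleDate TEXT NOT NULL CHECK (
            SampleDate GLOB '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]'),
        -- text only
        Site TEXT NOT NULL,
        -- real numbers only
        Height REAL CHECK (typeof(Height) IN ('real', 'integer')),
        -- y and n only
        Flowering TEXT CHECK (Flowering IN ('y', 'n'))
    )
    """
)


def try_insert(row):
    """Return True if the row satisfies every column rule."""
    try:
        conn.execute("INSERT INTO Phenology VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


ok_row = try_insert(("2010-02-13", "A", 12.5, "y"))
bad_height = try_insert(("2010-02-13", "A", "tall", "y"))
bad_date = try_insert(("2/13/2010", "A", 12.5, "y"))
bad_flag = try_insert(("2010-02-13", "A", 12.5, "maybe"))
```

Other database systems enforce declared types more strictly, so fewer explicit checks would be needed there.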

19 Relationships are defined between tables

Date       Site  Species  Flowering?
2/13/2010  A     BOGR2    y
2/13/2010  B     HODR     y
4/15/2010  B     BOER4    y
4/15/2010  C     PLJA     n

Site  Latitude  Longitude
A
B
C

Mix and Match data on the fly

Date       Site  Species  Flowering?  Latitude  Longitude
2/13/2010  A     BOGR2    y
2/13/2010  B     HODR     y
4/15/2010  B     BOER4    y
4/15/2010  C     PLJA     n

20 SQL says it nicely:
    CREATE TABLE Site (
        Site_ID integer PRIMARY KEY,
        LocationName char(50),
        Latitude float,
        Longitude float,
        Elevation float)
    CREATE TABLE Phenology (
        Species char(50),
        SampleDate date,
        Site_ID integer REFERENCES Site (Site_ID),
        FloweringStatus boolean,
        AirTemp float)
- Name your Tables
- Name the Attributes in the Tables
- Type your Attributes
- Establish Key relationships among your Tables
That is Data Modeling!!
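
To see the key relationship at work, the two CREATE TABLE statements can be run in SQLite from Python, as in the sketch below. The inserted rows, including all coordinates and elevations, are hypothetical; only the table definitions and species codes come from the slides. Note that SQLite enforces REFERENCES only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript(
    """
    CREATE TABLE Site (Site_ID integer PRIMARY KEY, LocationName char(50),
                       Latitude float, Longitude float, Elevation float);
    CREATE TABLE Phenology (Species char(50), SampleDate date,
                            Site_ID integer REFERENCES Site (Site_ID),
                            FloweringStatus boolean, AirTemp float);
    """
)

# Hypothetical site coordinates and observations.
conn.executemany(
    "INSERT INTO Site VALUES (?, ?, ?, ?, ?)",
    [(1, "A", 34.0, -106.0, 1500.0), (2, "B", 34.5, -106.5, 1600.0)],
)
conn.executemany(
    "INSERT INTO Phenology VALUES (?, ?, ?, ?, ?)",
    [("BOGR2", "2010-02-13", 1, 1, 10.0), ("HODR", "2010-02-13", 2, 1, 11.0)],
)

# "Mix and match on the fly": the join attaches coordinates to each record.
joined = conn.execute(
    """
    SELECT p.SampleDate, s.LocationName, p.Species, s.Latitude, s.Longitude
    FROM Phenology AS p JOIN Site AS s ON s.Site_ID = p.Site_ID
    ORDER BY p.SampleDate, s.LocationName
    """
).fetchall()

# A record pointing at a site that does not exist is refused.
try:
    conn.execute("INSERT INTO Phenology VALUES ('PLJA', '2010-04-15', 99, 0, NULL)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Coordinates are stored once per site, yet every observation can still be reported with its latitude and longitude.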

21 Powerful Command Language called Structured Query Language or SQL (see-quel)
This table is called SoilTemp
Columns: Date Plot Treatment SensorDepth Soil_Temperature
SQL PROJECT: Select Date, Plot, Treatment, Soil_Temperature from SoilTemp where Date =
Result columns: Date Plot Treatment Soil_Temperature
SQL SELECT: Select * from SoilTemp where Treatment = 'N' and SensorDepth = 0
Result columns: Date Plot Treatment SensorDepth Soil_Temperature
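
The two query styles can be tried against a small stand-in for SoilTemp. The sketch below assumes SQLite via Python's sqlite3 module, and every row value is hypothetical, since the table contents on the slide were not preserved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SoilTemp (Date TEXT, Plot TEXT, Treatment TEXT, "
    "SensorDepth INTEGER, Soil_Temperature REAL)"
)
# Made-up measurements for illustration only.
conn.executemany(
    "INSERT INTO SoilTemp VALUES (?, ?, ?, ?, ?)",
    [
        ("2010-01-01", "A", "C", 30, 13.2),
        ("2010-01-01", "B", "R", 30, 12.8),
        ("2010-01-02", "A", "N", 0, 15.1),
        ("2010-01-02", "B", "N", 30, 14.0),
    ],
)

# PROJECT: keep only some columns (here also restricted to one date).
projected = conn.execute(
    "SELECT Date, Plot, Treatment, Soil_Temperature FROM SoilTemp "
    "WHERE Date = ? ORDER BY Plot",
    ("2010-01-01",),
).fetchall()

# SELECT: keep all columns, but only the rows matching the conditions.
selected = conn.execute(
    "SELECT * FROM SoilTemp WHERE Treatment = 'N' AND SensorDepth = 0"
).fetchall()
```

Neither query changes SoilTemp itself; each one returns a new table derived from it.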

22 Spreadsheet vs. Database
Spreadsheet:
- OK for simple, self-contained charts, graphs, calculations
- Handy for collecting raw data
- Flexible cell content type
BUT:
- Hard to subset or sort
- Lacks record integrity (can sort a column independently of all others)
- Easy to use but harder to maintain as complexity and size of data grow (lots of repeats in records)
- Lacks provenance
Database:
- Works well with lots of data
- Easy to query and subset data
- Data fields are typed, e.g., only integers are allowed in integer fields
- Columns cannot be sorted independently of each other (tuple)
- Normalization reduces data entry and potential for error
BUT:
- More to learn and harder to use than a spreadsheet

23 Types of Data
- Focus here on Tabular Data
- Other Data Types include:
  - Vector (e.g. time-series)
  - Matrix (e.g. life-history tables)
  - Hierarchical (e.g. taxonomies)
  - Grid (e.g. raster)
  - Spatial Vector (e.g. point, line, polygon)
Lots of specialized formats for storing and exchanging these various data types; but it's all fairly readily expressible in some tabular format

24 Conclusion
- Be aware of Best Practices when designing data file structures
- Choose a data entry method that allows some validation of data as it is entered
- Invest time in learning how to use a database if your datasets are large or complex
- Consider investing time in learning how to use databases if your data are small and humble
- Consider investing time in learning how to use databases if you ever intend to share your data
- Consider learning databases if you are < 30 y.o.

25 Handy References for Better Data Structures
- Best Practices for Preparing Environmental Data Sets to Share and Archive. September. Les A. Hook, Suresh K. Santhana Vannan, Tammy W. Beaty, Robert B. Cook, and Bruce E. Wilson. http://daac.ornl.gov/pi/bestpractices pdf
- Some Simple Guidelines for Effective Data Management. Elizabeth Borer, Eric Seabloom, Matthew B. Jones, and Mark Schildhauer. Bull. Ecol. Soc. Amer., April 2009, pp

26 Workflows & Other Tools
Mark Schildhauer
National Center for Ecological Analysis & Synthesis, University of California, Santa Barbara
ESA 2011

27 Data Analyses
- Conducted via personal computer, local server or cluster, grid, cloud computing
- Statistics, model runs, parameter estimations, production of graphs/plots, etc.
- The line separating data manipulation from data analysis is thin or blurry, depending on your metaphorical preferences

28 Statistical Software
- R, SAS, MATLAB, SPSS, ...
- You can implement good data practice in each of these
- Of course, these are also excellent for calculations, data analysis, quality assurance, subsetting data

29 Best Practices: Analysis
- Reproducibility is at the core of scientific method: can someone independently validate findings?
- Transparency enables others to understand how you arrived at your results
- Executability enables someone to re-run or re-use your analysis
- Conceiving of your analysis as a Scientific Workflow makes you think about how your work can be made more reproducible, transparent, and re-usable for others or yourself

30 Workflows in General
- Simplest form of workflow: commented scripts
- R, SAS, and MATLAB all provide rich languages for specifying and executing a huge number of data transformations, statistical analyses, and modeling constructs.
- When well-documented, this source code provides an excellent basis for reviewing, sharing, and re-doing an analysis.

31 Workflows in General
Another simple form of workflow: flow chart
Data import into R -> Quality control & data cleaning -> Analysis: mean, SD -> Graph production
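
The flow chart maps naturally onto a commented script with one function per box. The slides use R for this; the sketch below uses Python instead, with made-up data, a hypothetical missing-value code of -999, and a text summary standing in for the graph step.

```python
import csv
import io
import statistics

# Hypothetical raw data; -999 is an assumed missing-value code.
RAW = """date,temperature_c
2010-01-01,13.2
2010-01-02,-999
2010-01-03,12.8
2010-01-04,14.0
"""


def import_data(text):
    """Step 1: data import."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(records, missing_code=-999.0):
    """Step 2: quality control, dropping records flagged as missing."""
    values = [float(r["temperature_c"]) for r in records]
    return [v for v in values if v != missing_code]


def analyze(values):
    """Step 3: analysis (mean and sample standard deviation)."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
    }


def report(summary):
    """Step 4: output (a one-line summary in place of a graph)."""
    return f"n={summary['n']} mean={summary['mean']:.2f} sd={summary['sd']:.2f}"


# Explicit inputs flow through each step to explicit outputs.
summary = analyze(clean(import_data(RAW)))
report_line = report(summary)
```

Because each step takes its input as an argument and returns its output, any step can be re-run, inspected, or swapped without touching the others.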

32 Workflows in General
Not only linear: analyses are typically multi-step, branching, iterative, etc.
[Diagram: Explicit Inputs (Temperature data, Salinity data) -> Transformations (Data import into R; Quality control & data cleaning; Analysis: mean, SD; Graph production), with intermediate products (Data in R format; Clean T & S data; Summary statistics &) -> Explicit Outputs]

33 Example of SWF application: Kepler
Resulting output

34 Example SWF Tools: VisTrails

35 Conclusions about Workflows
- Minimally, should document your analysis (commented code; simple flow-chart)
- But emerging workflow applications will:
  - Link together disparate software packages for an executable end-to-end analysis
  - Provide detailed information about data and analytical provenance
  - Facilitate re-use and refinement of potentially complex, multi-step analyses
  - Enable efficient swapping of alternative models and algorithms into workflows
  - Help automate tedious tasks or accomplish parameter sweeps

36 GIS or geospatial data?
Mark Schildhauer & Jim Regetz
National Center for Ecological Analysis & Synthesis, University of California, Santa Barbara
ESA 2011

37 Vector data: tables with spatial info
* We're familiar now with the concept of a data table
  - Columns as variables (characteristics, etc)
  - Rows as records (individual measurements, etc)
* Now imagine that one of your variables encodes locations, as either:
  - Points (coordinate pairs, e.g., lat/lon)
  - Lines (list of connected coordinate pairs)
  - Polygons (list of connected coordinate pairs where first pair = last pair)
* Voila, you've got spatial data!
  - This is called vector data
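
The three geometry kinds can be told apart from the coordinate list alone. The sketch below is a simplified illustration in plain Python: the function name `geometry_type`, the feature names, and the coordinates are all hypothetical, and real GIS formats store the geometry type explicitly instead of inferring it.

```python
def geometry_type(coords):
    """Classify a list of (x, y) pairs following the slide's definitions."""
    if not coords:
        raise ValueError("a geometry needs at least one coordinate pair")
    if len(coords) == 1:
        return "point"
    # A polygon is a closed ring: the first pair is repeated as the last.
    if len(coords) >= 4 and coords[0] == coords[-1]:
        return "polygon"
    return "line"


# A data table in which one variable (column) holds the location.
layer = [
    {"name": "plant 17", "geometry": [(-119.7, 34.4)]},
    {"name": "creek", "geometry": [(-119.8, 34.4), (-119.7, 34.5), (-119.6, 34.5)]},
    {
        "name": "reserve",
        "geometry": [(-120.0, 34.0), (-119.0, 34.0), (-119.0, 35.0), (-120.0, 34.0)],
    },
]

types = [geometry_type(record["geometry"]) for record in layer]
```

Apart from the geometry column, each record is an ordinary table row, which is the point the slide makes.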

38 Vector data
- GIS folks call a table with spatial info a 'layer', and call each record a 'feature'. But there's no special magic.
- Each 'feature' usually corresponds to some entity: an individual plant (point), a river (line), a nature reserve boundary (polygon)
- A layer usually groups similar entities such as plant locations/occurrences

39 Vector data
- Additional requirement: *geo*spatial data also must specify its spatial reference system, which tells us where coordinates actually lie on the earth, i.e.
  - datum: a specific mathematical model of the shape and orientation of our lumpy, spheroid earth
  - projection: specific formula for representing coordinates in 2D space (essential for mapping, and usually necessary for other tasks in most spatial software)
Also need to have information about scale and origin.
These are generally critical pieces of metadata accompanying geospatial data, and are typically stored along with the data in various ways.

40 Raster data: 2D matrices with spatial info
* Think of a digital photograph
  - a rectangular grid of cells ("pixels") each of some color
  - e.g. 12 megapixel = 4000 px wide by 3000 px high (if 4:3 aspect ratio)
* Now imagine the photograph was taken of earth from a satellite. We might _see_ features (e.g., a sandy desert plain sloping up to some mountains), but the data only encode values for pixels

41 Raster data: 2D matrices with spatial info
- We also need to know 6 numbers:
  - the coordinate pair (e.g., lat/lon) of the upper left corner
  - size of pixels (in terms of distance on earth) in both X and Y dimensions
  - number of pixels in both X and Y
- as with vector data, we need to know other spatial reference information (e.g. datum) to know how the coordinates map onto the earth
- then we can calculate what patch of earth corresponds to each pixel
* Voila, you've got spatial data!
  - This is called raster data
In the computer, just a matrix of numbers
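
The calculation from six numbers to a patch of earth is short enough to write out. The following sketch assumes a north-up grid with rows counted downward from the upper left corner; the class name `RasterGrid`, its field names, and the example coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RasterGrid:
    """The six numbers that georeference a north-up raster."""

    x_upper_left: float   # e.g. longitude of the upper left corner
    y_upper_left: float   # e.g. latitude of the upper left corner
    pixel_width: float    # pixel size in X
    pixel_height: float   # pixel size in Y
    n_cols: int           # number of pixels in X
    n_rows: int           # number of pixels in Y

    def pixel_bounds(self, row, col):
        """Return (x_min, y_min, x_max, y_max) for a 0-based row and column."""
        if not (0 <= row < self.n_rows and 0 <= col < self.n_cols):
            raise IndexError("pixel lies outside the grid")
        x_min = self.x_upper_left + col * self.pixel_width
        y_max = self.y_upper_left - row * self.pixel_height  # rows run downward
        return (x_min, y_max - self.pixel_height, x_min + self.pixel_width, y_max)


# Hypothetical 4 x 3 grid of half-degree pixels.
grid = RasterGrid(-120.0, 35.0, 0.5, 0.5, n_cols=4, n_rows=3)
first_pixel = grid.pixel_bounds(0, 0)
last_pixel = grid.pixel_bounds(2, 3)
```

The pixel values themselves stay in a plain matrix; the six numbers are what turn a row and column index into a location. The spatial reference information (datum, projection) is still needed to interpret those coordinates.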

42 Geospatial Data Formats
- Vector: by far most common is shapefile, but there are others
- Raster: GeoTIFF is probably best default choice (= TIFF image format with spatial reference info stored in internal metadata tags), but many other formats (including ASCII Grid format, which simply stores a space-delimited matrix below a 6-line header)
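
Because ASCII Grid is a header followed by a space-delimited matrix, it can be read without any GIS software. The sketch below assumes the common Esri-style header keywords (ncols, nrows, xllcorner, yllcorner, cellsize, NODATA_value); note that this header gives the lower left corner, not the upper left. The sample grid values are made up.

```python
# Hypothetical 3 x 2 grid in ASCII Grid layout.
SAMPLE = """ncols 3
nrows 2
xllcorner -120.0
yllcorner 34.0
cellsize 0.5
NODATA_value -9999
1 2 3
4 -9999 6
"""


def parse_ascii_grid(text, header_lines=6):
    """Split an ASCII Grid into a header dict and a matrix of floats."""
    lines = text.strip().splitlines()
    header = {}
    for line in lines[:header_lines]:
        key, value = line.split()
        header[key.lower()] = float(value)
    matrix = [[float(v) for v in line.split()] for line in lines[header_lines:]]
    return header, matrix


header, matrix = parse_ascii_grid(SAMPLE)

# Replace the NODATA code with None so it cannot leak into statistics.
nodata = header["nodata_value"]
cleaned = [[None if v == nodata else v for v in row] for row in matrix]
```

For anything beyond inspection, a dedicated reader is the safer choice, since it also handles variants of the header such as cell-center coordinates.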

43 Some free, open source geospatial software
* QuantumGIS (http://qgis.org/)
  - user-friendly graphical application
  - reads many kinds of vector/raster formats
* R (see http://cran.r-project.org/web/views/spatial.html)
  - (v,r) 'sp': provides consistent R representation of vector and raster data
  - (v) 'maptools', 'rgdal': read/write/convert various vector data formats; reproject with rgdal
  - (v) 'rgeos': lots of topological operations (distance, containment, intersection, buffer, centroid, etc)
  - (v) 'geosphere': distance, centroid, midpoint, etc using lat-lon directly on spheroid
  - (r) 'raster': read, write, do lots of things with raster data

44 More
* GDAL/OGR (http://
  - powerful command-line tools for raster (GDAL) and vector (OGR)
  - "swiss army knife" for spatial data
  - quickly reformat, reproject, transform, subset, summarize, etc
GIS, Geographic Information Systems, can be invaluable for rapidly accessing, querying, manipulating and visualizing geospatial data. But remember, basically geospatial information is just another type of data

45 Conclusions about GIS
GIS, Geographic Information Systems, can be invaluable for rapidly accessing, querying, manipulating and visualizing geospatial data. But it is useful to remember that, basically, geospatial information is just another type of data, and the capabilities to work with it are becoming available in a growing range of analytical software solutions.

46 FIN How many fingers, Winston? Orwell,

Chapter 6: Data Acquisition Methods, Procedures, and Issues

Chapter 6: Data Acquisition Methods, Procedures, and Issues Chapter 6: Data Acquisition Methods, Procedures, and Issues In this Exercise: Data Acquisition Downloading Geographic Data Accessing Data Via Web Map Service Using Data from a Text File or Spreadsheet

More information

Data Warehousing. Yeow Wei Choong Anne Laurent

Data Warehousing. Yeow Wei Choong Anne Laurent Data Warehousing Yeow Wei Choong Anne Laurent Databases Databases are developed on the IDEA that DATA is one of the cri>cal materials of the Informa>on Age Informa>on, which is created by data, becomes

More information

Information Management. Corinna Gries

Information Management. Corinna Gries Information Management Corinna Gries Data Management Best Practices Information Management Considerations for Collaborative Projects Data Publication Resources Questions Data Management Best Practices

More information

Cookbook 23 September 2013 GIS Analysis Part 1 - A GIS is NOT a Map!

Cookbook 23 September 2013 GIS Analysis Part 1 - A GIS is NOT a Map! Cookbook 23 September 2013 GIS Analysis Part 1 - A GIS is NOT a Map! Overview 1. A GIS is NOT a Map! 2. How does a GIS handle its data? Data Formats! GARP 0344 (Fall 2013) Page 1 Dr. Carsten Braun 1) A

More information

An Introduction to Open Source Geospatial Tools

An Introduction to Open Source Geospatial Tools An Introduction to Open Source Geospatial Tools by Tyler Mitchell, author of Web Mapping Illustrated GRSS would like to thank Mr. Mitchell for this tutorial. Geospatial technologies come in many forms,

More information

Introduction to GIS (Basics, Data, Analysis) & Case Studies. 13 th May 2004. Content. What is GIS?

Introduction to GIS (Basics, Data, Analysis) & Case Studies. 13 th May 2004. Content. What is GIS? Introduction to GIS (Basics, Data, Analysis) & Case Studies 13 th May 2004 Content Introduction to GIS Data concepts Data input Analysis Applications selected examples What is GIS? Geographic Information

More information

INTRODUCTION TO ARCGIS SOFTWARE

INTRODUCTION TO ARCGIS SOFTWARE INTRODUCTION TO ARCGIS SOFTWARE I. History of Software Development a. Developer ESRI - Environmental Systems Research Institute, Inc., in 1969 as a privately held consulting firm that specialized in landuse

More information

Introduction to GIS software

Introduction to GIS software Introduction to GIS software There are a wide variety of GIS software packages available. Some of these software packages are freely available for you to download and could be used in your classroom. ArcGIS

More information

Aspose.Cells Product Family

Aspose.Cells Product Family time and effort by using our efficient and robust components instead of developing your own. lets you open, create, save and convert files from within your application without Microsoft Excel, confident

More information

Introduction to GIS. http://libguides.mit.edu/gis

Introduction to GIS. http://libguides.mit.edu/gis Introduction to GIS http://libguides.mit.edu/gis 1 Overview What is GIS? Types of Data and Projections What can I do with GIS? Data Sources and Formats Software Data Management Tips 2 What is GIS? 3 Characteristics

More information

The process of database development. Logical model: relational DBMS. Relation

The process of database development. Logical model: relational DBMS. Relation The process of database development Reality (Universe of Discourse) Relational Databases and SQL Basic Concepts The 3rd normal form Structured Query Language (SQL) Conceptual model (e.g. Entity-Relationship

More information

How To Write An Nccwsc/Csc Data Management Plan

How To Write An Nccwsc/Csc Data Management Plan Guidance and Requirements for NCCWSC/CSC Plans (Required for NCCWSC and CSC Proposals and Funded Projects) Prepared by the CSC/NCCWSC Working Group Emily Fort, Data and IT Manager for the National Climate

More information

Geodatabase Programming with SQL

Geodatabase Programming with SQL DevSummit DC February 11, 2015 Washington, DC Geodatabase Programming with SQL Craig Gillgrass Assumptions Basic knowledge of SQL and relational databases Basic knowledge of the Geodatabase We ll hold

More information

GEOGRAPHIC INFORMATION SYSTEMS CERTIFICATION

GEOGRAPHIC INFORMATION SYSTEMS CERTIFICATION GEOGRAPHIC INFORMATION SYSTEMS CERTIFICATION GIS Syllabus - Version 1.2 January 2007 Copyright AICA-CEPIS 2009 1 Version 1 January 2007 GIS Certification Programme 1. Target The GIS certification is aimed

More information

Guidelines on Information Deliverables for Research Projects in Grand Canyon National Park

Guidelines on Information Deliverables for Research Projects in Grand Canyon National Park INTRODUCTION Science is playing an increasing role in guiding National Park Service (NPS) management activities. The NPS is charged with protecting and maintaining data and associated information that

More information

GEOGRAPHIC INFORMATION SYSTEMS

GEOGRAPHIC INFORMATION SYSTEMS GEOGRAPHIC INFORMATION SYSTEMS WHAT IS A GEOGRAPHIC INFORMATION SYSTEM? A geographic information system (GIS) is a computer-based tool for mapping and analyzing spatial data. GIS technology integrates

More information

Using Geocoded TIFF & JPEG Files in ER Mapper 6.3 with SP1. Eric Augenstein Earthstar Geographics Web: www.es-geo.com

Using Geocoded TIFF & JPEG Files in ER Mapper 6.3 with SP1. Eric Augenstein Earthstar Geographics Web: www.es-geo.com Using Geocoded TIFF & JPEG Files in ER Mapper 6.3 with SP1 Eric Augenstein Earthstar Geographics Web: www.es-geo.com 1 Table of Contents WHAT IS NEW IN 6.3 SP1 REGARDING WORLD FILES?...3 WHAT IS GEOTIFF

More information

A Web services solution for Work Management Operations. Venu Kanaparthy Dr. Charles O Hara, Ph. D. Abstract

A Web services solution for Work Management Operations. Venu Kanaparthy Dr. Charles O Hara, Ph. D. Abstract A Web services solution for Work Management Operations Venu Kanaparthy Dr. Charles O Hara, Ph. D Abstract The GeoResources Institute at Mississippi State University is leveraging Spatial Technologies and

More information

Relational Database Basics Review

Relational Database Basics Review Relational Database Basics Review IT 4153 Advanced Database J.G. Zheng Spring 2012 Overview Database approach Database system Relational model Database development 2 File Processing Approaches Based on

More information

Technical White Paper. Automating the Generation and Secure Distribution of Excel Reports

Technical White Paper. Automating the Generation and Secure Distribution of Excel Reports Technical White Paper Automating the Generation and Secure Distribution of Excel Reports Table of Contents Introduction...3 Creating Spreadsheet Reports: A Cumbersome and Manual Process...3 Distributing

More information

Data Management in the Cloud: Limitations and Opportunities. Annies Ductan

Data Management in the Cloud: Limitations and Opportunities. Annies Ductan Data Management in the Cloud: Limitations and Opportunities Annies Ductan Discussion Outline: Introduc)on Overview Vision of Cloud Compu8ng Managing Data in The Cloud Cloud Characteris8cs Data Management

More information

Ins+tuto Superior Técnico Technical University of Lisbon. Big Data. Bruno Lopes Catarina Moreira João Pinho

Ins+tuto Superior Técnico Technical University of Lisbon. Big Data. Bruno Lopes Catarina Moreira João Pinho Ins+tuto Superior Técnico Technical University of Lisbon Big Data Bruno Lopes Catarina Moreira João Pinho Mo#va#on 2 220 PetaBytes Of data that people create every day! 2 Mo#va#on 90 % of Data UNSTRUCTURED

More information

Bridge from Entity Relationship modeling to creating SQL databases, tables, & relations

Bridge from Entity Relationship modeling to creating SQL databases, tables, & relations 1 Topics for this week: 1. Good Design 2. Functional Dependencies 3. Normalization Readings for this week: 1. E&N, Ch. 10.1-10.6; 12.2 2. Quickstart, Ch. 3 3. Complete the tutorial at http://sqlcourse2.com/

More information

Making an image using altitude as background image

Making an image using altitude as background image Try to re-do the previous exercise with different settings under Distance in Km between gridlines, Maximum interpolation radius (in Km), Minimum number of nearest stations and Maximum number of nearest

More information

4. Are you satisfied with the outcome? Why or why not? Offer a solution and make a new graph (Figure 2).

4. Are you satisfied with the outcome? Why or why not? Offer a solution and make a new graph (Figure 2). Assignment 1 Introduction to Excel and SPSS Graphing and Data Manipulation Part 1 Graphing (worksheet 1) 1. Download the BHM excel data file from the course website. 2. Save it to the desktop as an excel

More information

ArcGIS online Introduction... 2. Module 1: How to create a basic map on ArcGIS online... 3. Creating a public account with ArcGIS online...

ArcGIS online Introduction... 2. Module 1: How to create a basic map on ArcGIS online... 3. Creating a public account with ArcGIS online... Table of Contents ArcGIS online Introduction... 2 Module 1: How to create a basic map on ArcGIS online... 3 Creating a public account with ArcGIS online... 3 Opening a Map, Adding a Basemap and then Saving

More information

<no narration for this slide>

<no narration for this slide> 1 2 The standard narration text is : After completing this lesson, you will be able to: < > SAP Visual Intelligence is our latest innovation

More information

Groundwater Chemistry

Groundwater Chemistry Mapping and Modeling Groundwater Chemistry By importing Excel spreadsheets into ArcGIS 9.2 By Mike Price, Entrada/San Juan, Inc. In ArcGIS 9.2, Microsoft Excel spreadsheet data can be imported and used

More information

InfiniteInsight 6.5 sp4

InfiniteInsight 6.5 sp4 End User Documentation Document Version: 1.0 2013-11-19 CUSTOMER InfiniteInsight 6.5 sp4 Toolkit User Guide Table of Contents Table of Contents About this Document 3 Common Steps 4 Selecting a Data Set...

More information

What is GIS? Geographic Information Systems. Introduction to ArcGIS. GIS Maps Contain Layers. What Can You Do With GIS? Layers Can Contain Features

What is GIS? Geographic Information Systems. Introduction to ArcGIS. GIS Maps Contain Layers. What Can You Do With GIS? Layers Can Contain Features What is GIS? Geographic Information Systems Introduction to ArcGIS A database system in which the organizing principle is explicitly SPATIAL For CPSC 178 Visualization: Data, Pixels, and Ideas. What Can

More information

GIS Databases With focused on ArcSDE

GIS Databases With focused on ArcSDE Linköpings universitet / IDA / Div. for human-centered systems GIS Databases With focused on ArcSDE Imad Abugessaisa g-imaab@ida.liu.se 20071004 1 GIS and SDBMS Geographical data is spatial data whose

More information

CS 4604: Introduc0on to Database Management Systems. B. Aditya Prakash Lecture #5: En-ty/Rela-onal Models- - - Part 1

CS 4604: Introduc0on to Database Management Systems. B. Aditya Prakash Lecture #5: En-ty/Rela-onal Models- - - Part 1 CS 4604: Introduc0on to Database Management Systems B. Aditya Prakash Lecture #5: En-ty/Rela-onal Models- - - Part 1 Announcements- - - Project Goal: design a database system applica-on with a web front-

More information

MicroStrategy Desktop

MicroStrategy Desktop MicroStrategy Desktop Quick Start Guide MicroStrategy Desktop is designed to enable business professionals like you to explore data, simply and without needing direct support from IT. 1 Import data from

More information

Vector analysis - introduction Spatial data management operations - Assembling datasets for analysis. Data management operations

Vector analysis - introduction Spatial data management operations - Assembling datasets for analysis. Data management operations Vector analysis - introduction Spatial data management operations - Assembling datasets for analysis Transform (reproject) Merge Append Clip Dissolve The role of topology in GIS analysis Data management

More information

ES341 Overview of key file formats and file extensions in ArcGIS

ES341 Overview of key file formats and file extensions in ArcGIS ES341 Overview of key file formats and file extensions in ArcGIS Commonly Encountered File Types/Extensions in ArcGIS.mxd A file containing a map, its layers, display information, and other elements used

More information

Cost Effec/ve Approaches to Best Prac/ces in Data Analy/cs for Internal Audit

Cost Effec/ve Approaches to Best Prac/ces in Data Analy/cs for Internal Audit Cost Effec/ve Approaches to Best Prac/ces in Data Analy/cs for Internal Audit Presented to: ISACA and IIA Joint Mee/ng October 10, 2014 By Outline Introduc.on The Evolving Role of Internal Audit The importance

More information

The Right BI Tool for the Job in a non- SAP Applica9on Environment

The Right BI Tool for the Job in a non- SAP Applica9on Environment September 9 11, 2013 Anaheim, California The Right BI Tool for the Job in a non- SAP Applica9on Environment Speaker Name(s): Ty Miller Full Spectrum Business Intelligence Self Service Dashboards and Apps

More information

CS2Bh: Current Technologies. Introduction to XML and Relational Databases. Introduction to Databases. Why databases? Why not use XML?

CS2Bh: Current Technologies. Introduction to XML and Relational Databases. Introduction to Databases. Why databases? Why not use XML? CS2Bh: Current Technologies Introduction to XML and Relational Databases Spring 2005 Introduction to Databases CS2 Spring 2005 (LN5) 1 Why databases? Why not use XML? What is missing from XML: Consistency

More information

Paper 232-2012. Getting to the Good Part of Data Analysis: Data Access, Manipulation, and Customization Using JMP
