Advanced SQL Queries for the EMF



Susan McCusker, MARAMA

Online EMF Resources: SQL
EMF User's Guide: https://www.cmascenter.org/emf/internal/guide.html
EMF SQL Reference Guide: https://www.cmascenter.org/emf/internal/sql_basics.html

Roadmap
Previously: SQL snippets, query components, syntax
Today:
  Review: anatomy of a SQL SELECT query and EMF-specific syntax
  New: query enhancements
    data from more than one table
    join clause and its variations
Goal: build a query to summarize CAP emissions for a state by SCC, with SCC descriptions

Where is SQL used in the EMF?
Filtering
  Use where-clause snippets to filter records, for example:
    poll = 'CO'
    ann_value > 300
  Row Filter when viewing and editing raw data
  Data Value Filter in Advanced Dataset Search
  Row Filter when exporting datasets and QA step results

Where is SQL used in the EMF?
SQL-based QA steps
  Write your own select queries to create reports and summaries of datasets
  Sum or average emissions by region, SCC, plant, etc.
  Include reference information such as county names, lat/lon coordinates, or pollutant descriptions

Basic SQL Query Components
Which table(s) should we get the data from?
Do we want to select all the data in the table, or only certain rows or columns?
  Example: select certain rows to get one state
  Example: select certain columns if we don't need all the data from the original inventory
How should the output be grouped and sorted?
  Example: by SCC? by FIPS? by state? by pollutant?

SQL Syntax
Basic query to extract or aggregate data from a single dataset table:
  select    columns (i.e., data fields)
  from      table ($TABLE[1] e in the EMF)
  where     filtering criteria used to pick rows
  group by  columns to aggregate over (e.g., sum)
  order by  columns to use to sort results

SQL Syntax
Query to extract or aggregate data from multiple dataset tables:
  select    columns (i.e., data fields)
  from      table ($TABLE[1] e in the EMF)
  join      table on matching criteria
  where     filtering criteria used to pick rows
  group by  columns to aggregate over (e.g., sum)
  order by  columns to use to sort results

EMF-Specific Syntax
A SQL database stores data in tables
Table names must be unique, so the EMF uses the dataset name and a random ID to name tables
  Dataset name = nonpt_2011neiv2_nonpoint_20141108_11nov2014_v1.csv
  Underlying data table name = emissions.ds_nonpt_2011neiv2_nonpoint_20141108_11nov2014_v1_csv_214226751

Basic query
Choose: FIPS, SCC, pollutant, annual emissions for NY (36)
  select region_cd, scc, poll, ann_value
  from Emissions.DS_nonpt_2011NEIv2_NONPOINT_20141108_11nov2014_v1_csv_214226751
  where region_cd like '36%'
# records: 83,718
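The basic query above can be tried outside the EMF. The following is a minimal sketch, assuming a small in-memory SQLite database as a stand-in for the EMF database; the table name `inventory` and the four rows are hypothetical and are not taken from the actual dataset.

```python
import sqlite3

# Hypothetical miniature of a nonpoint inventory table. SQLite stands in
# for the EMF database, and the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table inventory (region_cd text, scc text, poll text, ann_value real)"
)
conn.executemany(
    "insert into inventory values (?, ?, ?, ?)",
    [
        ("36001", "2401001000", "VOC", 120.5),
        ("36061", "2610000500", "CO", 310.0),
        ("34003", "2401001000", "VOC", 98.2),  # state 34, filtered out
        ("09001", "2610000500", "CO", 12.0),   # state 09, filtered out
    ],
)

# Same shape as the slide's query: region_cd is text, so a prefix match
# on the two-digit state code keeps every county in that state.
rows = conn.execute(
    "select region_cd, scc, poll, ann_value "
    "from inventory "
    "where region_cd like '36%' "
    "order by region_cd"
).fetchall()

for row in rows:
    print(row)
```

Only the two rows whose `region_cd` starts with 36 are returned.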

Poll #1
The highlighted part of this where clause can be used directly as a row filter when viewing or exporting data:
  where region_cd like '36%'
TRUE
FALSE

EMF-Specific Syntax
Instead of using the table name directly in the from clause, use the special syntax $TABLE[1] e
  e is a single-character table alias
  The alias can be used throughout the query instead of the table name

Make it generic with EMF-specific syntax
  select region_cd, scc, poll, ann_value
  from $TABLE[1] e
  where region_cd like '36%'
# records: 83,718

EMF-Specific Syntax
$DATASET_TABLE["dataset name", 1] a
  Refers to a different dataset
  Uses the default version of the data
  Ex: $DATASET_TABLE["nonpt_2011NEIv2_NONPOINT_20141108_11nov2014_v1.csv", 1] a
Other EMF-specific options let you refer to specific versions of datasets or to the output of QA steps (covered in the reference guide)

Only interested in CAPs, not HAPs
FIPS, SCC, pollutant, annual CAP emissions for NY (state FIPS = 36)
  select e.region_cd, e.scc, e.poll, e.ann_value
  from $TABLE[1] e
  where e.region_cd like '36%'
    and substring(e.poll,1,1) not in ('1', '2', '3', '4', '5', '6', '7', '8', '9')
# records: 20,187
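The CAP filter above tests only the first character of the pollutant code. A minimal sketch of that idea follows, assuming an in-memory SQLite table named `inventory` with made-up rows; SQLite's `substr` is used here in place of the `substring` call shown in the slides.

```python
import sqlite3

# Hypothetical rows: two named pollutants and two numeric pollutant codes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table inventory (region_cd text, scc text, poll text, ann_value real)"
)
conn.executemany(
    "insert into inventory values (?, ?, ?, ?)",
    [
        ("36001", "2401001000", "VOC", 120.5),
        ("36001", "2401001000", "NOX", 5.0),
        ("36001", "2401001000", "71432", 0.8),  # numeric code, dropped
        ("36001", "2401001000", "50000", 0.1),  # numeric code, dropped
    ],
)

# Keep rows whose pollutant code does not start with a digit 1-9,
# mirroring the condition in the slide's where clause.
polls = [
    row[0]
    for row in conn.execute(
        "select e.poll from inventory e "
        "where e.region_cd like '36%' "
        "and substr(e.poll, 1, 1) not in "
        "('1', '2', '3', '4', '5', '6', '7', '8', '9') "
        "order by e.poll"
    )
]
print(polls)
```

The filter works on the leading character, so it depends on the convention that the pollutants to be excluded carry numeric codes while the ones to keep carry names.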

Poll #2
We want to exclude HAPs and type the condition
  e.poll not in ('1', '2', '3', '4', '5', '6', '7', '8', '9')
a. This condition excludes HAPs
b. This condition results in an error
c. This condition does not exclude HAPs and does not result in an error

Summarize by FIPS, SCC, pollutant
Use the aggregate function sum for ann_value
  select e.region_cd, e.scc, e.poll, sum(e.ann_value)
  from $TABLE[1] e
  where e.region_cd like '36%'
    and substring(e.poll,1,1) not in ('1', '2', '3', '4', '5', '6', '7', '8', '9')
Error: a select that uses sum needs a group by clause

Summarize by FIPS, SCC, pollutant
Use the aggregate function sum for ann_value
  select e.region_cd, e.scc, e.poll, sum(e.ann_value)
  from $TABLE[1] e
  where e.region_cd like '36%'
    and substring(e.poll,1,1) not in ('1', '2', '3', '4', '5', '6', '7', '8', '9')
  group by region_cd, scc, poll
# records: 20,187

Summarize by State, SCC, pollutant
Group together by the first two digits of the FIPS code:
  select substring(e.region_cd,1,2), e.scc, e.poll, sum(e.ann_value)
  from $TABLE[1] e
  where e.region_cd like '36%'
    and substring(e.poll,1,1) not in ('1', '2', '3', '4', '5', '6', '7', '8', '9')
  group by substring(e.region_cd,1,2), e.scc, e.poll
# records: 340
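The effect of grouping by the state prefix instead of the full FIPS code can be seen on a few rows. This is a sketch under stated assumptions: an in-memory SQLite table named `inventory`, invented emission values, and `substr` standing in for `substring`.

```python
import sqlite3

# Two hypothetical counties in state 36, each reporting VOC and CO.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table inventory (region_cd text, scc text, poll text, ann_value real)"
)
conn.executemany(
    "insert into inventory values (?, ?, ?, ?)",
    [
        ("36001", "2401001000", "VOC", 10.0),
        ("36061", "2401001000", "VOC", 5.0),
        ("36001", "2401001000", "CO", 2.0),
        ("36061", "2401001000", "CO", 3.0),
    ],
)

# Every non-aggregated column in the select list also appears in group by.
# Grouping on the two-character prefix collapses counties into one state row.
state_totals = conn.execute(
    "select substr(e.region_cd, 1, 2) as fips_state, e.scc, e.poll, "
    "sum(e.ann_value) as ann_emis "
    "from inventory e "
    "where e.region_cd like '36%' "
    "group by substr(e.region_cd, 1, 2), e.scc, e.poll "
    "order by e.poll"
).fetchall()

for row in state_totals:
    print(row)
```

Four county-level rows collapse into two state-level rows, one per pollutant, which is the same kind of reduction the slides report when moving from county to state totals.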

Multiple Tables
Instead of extracting or aggregating data from just one table, we can use join to combine data from multiple tables
Example reference table: scc

  scc         sector    scc_description
  2401001000  Nonpoint  "Solvent Utilization;Surface Coating;Architectural Coatings;Total: All Solvent Types"
  2610000500  Nonpoint  "Waste Disposal, Treatment, and Recovery;Open Burning;All Categories;Land Clearing Debris (use 28-10-005-000 for Logging Debris Burning)"

Poll #3
The EMF-specific syntax $TABLE[1] e:
a. Refers to the data table for the dataset to which the QA step is attached
b. Is generic, i.e., it can be copied to QA steps for other datasets of the same type or with the same columns
c. Assigns a single-character table alias "e" that can be used in referring to columns from the dataset
d. All of the above

JOIN Syntax
For a reference table:
  left join reference.scc on e.scc = scc.scc
  The name of the table to join follows the join keyword
For a dataset table:
  left join $DATASET_TABLE["dataset name", 1] a
  Refers to a different dataset
  Uses the default version of the data

Less basic query: add a join
  select ..., scc.scc_description
  from $TABLE[1] e
  left join reference.scc on e.scc = scc.scc
  where ...
  group by ..., scc.scc_description

Summarize by State, SCC, pollutant with SCC descriptions
  select substring(region_cd,1,2) as FIPS_State, scc, scc.scc_description, poll, sum(ann_value)
  from $TABLE[1] e
  left join reference.scc on e.scc = scc.scc
  where region_cd like '36%'
    and substring(poll,1,1) not in ('1', '2', '3', '4', '5', '6', '7', '8', '9')
  group by substring(region_cd,1,2), scc, scc.scc_description, poll
Failed to run: "scc" is ambiguous, because both tables have a column named scc

Summarize by State, SCC, pollutant with SCC descriptions
Qualify each column with its table alias:
  select substring(e.region_cd,1,2) as FIPS_State, e.scc, scc.scc_description, e.poll, sum(e.ann_value)
  from $TABLE[1] e
  left join reference.scc on e.scc = scc.scc
  where e.region_cd like '36%'
    and substring(e.poll,1,1) not in ('1', '2', '3', '4', '5', '6', '7', '8', '9')
  group by substring(e.region_cd,1,2), e.scc, scc.scc_description, e.poll
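The ambiguity error and its fix can be reproduced on two tiny tables. The sketch below assumes an in-memory SQLite database; `inventory` and `ref_scc` are hypothetical stand-ins for the dataset table and `reference.scc`, and the error text comes from SQLite, so it will differ from the message the EMF shows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table inventory (region_cd text, scc text, poll text, ann_value real)"
)
conn.execute("create table ref_scc (scc text, scc_description text)")
conn.execute("insert into inventory values ('36001', '2401001000', 'VOC', 10.0)")
conn.execute("insert into ref_scc values ('2401001000', 'Architectural Coatings')")

# Both tables have a column named scc, so the join is shared by both queries.
join = "from inventory e left join ref_scc scc on e.scc = scc.scc"

# Unqualified scc: the database cannot tell which table's column is meant.
try:
    conn.execute(f"select scc, scc.scc_description {join}")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)
print("unqualified:", error_message)

# Qualified with the table alias: the query runs.
rows = conn.execute(f"select e.scc, scc.scc_description {join}").fetchall()
print("qualified:", rows)
```

Qualifying every column with its alias, as the second slide does, avoids the problem even when only one column name is actually shared.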

EMF Reference Table Examples
https://www.cmascenter.org/emf/internal/sql_basics.html

Summarize by State, SCC, pollutant with SCC level descriptions
  select substring(e.region_cd,1,2) as FIPS_State, e.scc, scc_codes.scc_l1, e.poll, sum(e.ann_value)
  from $TABLE[1] e
  left join reference.scc_codes on e.scc = scc_codes.scc
  where e.region_cd like '36%'
    and substring(e.poll,1,1) not in ('1', '2', '3', '4', '5', '6', '7', '8', '9')
  group by substring(e.region_cd,1,2), e.scc, scc_codes.scc_l1, e.poll

JOIN Syntax: multiple criteria, multiple joins
  ...
  from $TABLE[1] e
  left join reference.fips on e.county = fips.county and e.state = fips.st
  left join reference.scc on e.scc = scc.scc
  ...
The on clause defines how the two tables relate to each other
  Here, both the county and the state names must match
A query can have
  Different column names on the two sides of a match (e.state and fips.st)
  Multiple joins

JOIN Options (figure slide)
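A small experiment shows why the join on county names also needs the state criterion. This is a sketch only: the tables `inventory` and `fips`, their columns, and the rows are assumptions made for illustration, run against an in-memory SQLite database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table inventory (county text, state text, ann_value real)")
conn.execute("create table fips (county text, st text, region_cd text)")

# The same county name occurs in two states (illustrative rows).
conn.executemany(
    "insert into inventory values (?, ?, ?)",
    [("Orange", "NY", 1.0), ("Orange", "CA", 2.0)],
)
conn.executemany(
    "insert into fips values (?, ?, ?)",
    [("Orange", "NY", "36071"), ("Orange", "CA", "06059")],
)

# One criterion: every inventory row matches both reference rows.
county_only = conn.execute(
    "select e.state, fips.region_cd from inventory e "
    "left join fips on e.county = fips.county"
).fetchall()

# Two criteria: each inventory row matches exactly one reference row.
# The column names differ (e.state and fips.st), which is allowed.
county_and_state = conn.execute(
    "select e.state, fips.region_cd from inventory e "
    "left join fips on e.county = fips.county and e.state = fips.st "
    "order by e.state"
).fetchall()

print(len(county_only), "rows with one criterion")
print(county_and_state)
```

With a single criterion the two inventory rows fan out to four result rows, which would double-count emissions in any later sum.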

LEFT JOIN Semantics
To include all the records from nonpt_2011neiv2_nonpoint_20141108_11nov2014_v1.csv, whether or not there's a matching record in reference.scc, we use a left join instead of just join
left join is also called left outer join

LEFT JOIN Oil & Gas Example (figure slide)

LEFT JOIN Details
Table order is important when using left join
  ... from $TABLE[1] e left join reference.scc ...
All the records in the left table ($TABLE[1]) are returned in the output
What if the order is reversed?

Poll #4
What results would we expect from reversing the order of the left join in the previous query, i.e.,
  from reference.scc left join $TABLE[1] e
a. Only records common to both tables are returned
b. All records in either table are returned
c. All records in the reference.scc table are returned
d. All records in the $TABLE[1] e table are returned
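The effect of table order in a left join can be checked directly. The following sketch assumes an in-memory SQLite database with two hypothetical tables, `inventory` and `ref_scc`, where one SCC appears in both, one only in the inventory, and one only in the reference table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table inventory (scc text, poll text, ann_value real)")
conn.execute("create table ref_scc (scc text, scc_description text)")
conn.executemany(
    "insert into inventory values (?, ?, ?)",
    [
        ("2401001000", "VOC", 10.0),
        ("9999999999", "VOC", 1.0),  # made-up SCC with no reference row
    ],
)
conn.executemany(
    "insert into ref_scc values (?, ?)",
    [
        ("2401001000", "Architectural Coatings"),
        ("2610000500", "Open Burning"),  # no inventory row
    ],
)

# Inventory on the left: every inventory row is kept.
inventory_first = conn.execute(
    "select e.scc, s.scc_description "
    "from inventory e left join ref_scc s on e.scc = s.scc "
    "order by e.scc"
).fetchall()

# Reference table on the left: every reference row is kept instead.
reference_first = conn.execute(
    "select s.scc, e.ann_value "
    "from ref_scc s left join inventory e on e.scc = s.scc "
    "order by s.scc"
).fetchall()

print(inventory_first)
print(reference_first)
```

In both results the unmatched row from the left table is still present, with NULL (shown as `None` in Python) filling the columns that would have come from the right table.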

Dealing with NULL Values
NULL in SQL is a state (unknown), not a value: the data value does not exist in the database
  Used when data is unknown
  Unknown is not the same as a value of zero
NULL values can cause unexpected output when combined with other values
  'Fish' || NULL || 'Chips' = NULL
  NULL / 0 = NULL

What happens if there's no match in the other table?
Coalesce function
  select e.scc,
    coalesce(scc.scc_description, 'An unspecified description') as scc_description,
    e.poll,
    coalesce(sum(e.ann_value),0) as ann_emis
  from $TABLE[1] e
  left join reference.scc on scc.scc = e.scc
  group by e.scc, scc.scc_description, e.poll
  order by e.scc
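How NULL spreads through an expression, and how coalesce stops it, can be seen with three one-line expressions. This is a minimal sketch run in SQLite rather than in the EMF; Python reports SQL NULL as `None`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each expression is evaluated by the database, not by Python.
concatenated, compared, defaulted = conn.execute(
    "select 'Fish' || NULL || 'Chips', "  # concatenating with NULL gives NULL
    "NULL = NULL, "                       # comparing NULL to NULL gives NULL
    "coalesce(NULL, 0)"                   # first non-NULL argument is returned
).fetchone()

print(concatenated, compared, defaulted)
```

The comparison result is the reason a condition such as `column = NULL` never matches any row; `is null` is the standard test for missing data.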

COALESCE Function
The coalesce(value1, value2, ...) function returns the first value in the list that is not null
Replace
  select scc.scc_description
with
  select coalesce(scc.scc_description, 'An unspecified description')
If there is no description for the SCC (i.e., scc.scc_description is NULL), 'An unspecified description' is returned.

Putting It All Together
  select e.scc,
    coalesce(s.scc_description, 'Unspecified description') as scc_description,
    e.poll,
    coalesce(p.descrptn, 'Unspecified description') as pollutant_code_desc,
    coalesce(sum(e.ann_value), 0) as ann_emis
  from $TABLE[1] e
  left join reference.invtable p on p.cas = e.poll
  left join reference.scc s on e.scc = s.scc
  group by e.scc, e.poll, p.descrptn, s.scc_description, p.name
  order by e.scc, p.name
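A cut-down version of the combined query shows coalesce filling in a default when the left join finds no match. The sketch assumes an in-memory SQLite database; `inventory` and `ref_scc` are hypothetical stand-ins for the dataset table and `reference.scc`, and the pollutant lookup is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table inventory (scc text, poll text, ann_value real)")
conn.execute("create table ref_scc (scc text, scc_description text)")
conn.executemany(
    "insert into inventory values (?, ?, ?)",
    [
        ("2401001000", "VOC", 10.0),
        ("2401001000", "VOC", 5.0),
        ("9999999999", "VOC", 1.0),  # made-up SCC with no reference row
    ],
)
conn.execute("insert into ref_scc values ('2401001000', 'Architectural Coatings')")

# The left join keeps the unmatched SCC; coalesce replaces its NULL
# description with a readable default.
summary = conn.execute(
    "select e.scc, "
    "coalesce(s.scc_description, 'Unspecified description') as scc_description, "
    "e.poll, "
    "coalesce(sum(e.ann_value), 0) as ann_emis "
    "from inventory e "
    "left join ref_scc s on e.scc = s.scc "
    "group by e.scc, s.scc_description, e.poll "
    "order by e.scc"
).fetchall()

for row in summary:
    print(row)
```

The group by clause lists the raw `s.scc_description` column rather than the coalesce expression, following the pattern used in the slides.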

Steps for SQL Query in the EMF
-- From the EMF main window: Manage Datasets
-- Show Datasets of Type: select one from the drop-down menu, e.g., nonpt_2011neiv2_nonpoint_20141108_11nov2014_v1.csv
   (Use the search button to narrow down the results displayed)
-- Check the box to select a dataset, e.g., nonpt_2011neiv2_nonpoint_20141108_11nov2014_v1.csv
-- Click Edit Properties at the bottom of the screen
-- Click the QA tab at the top
-- Click Add Custom at the bottom of the screen
-- Enter a Name (e.g., test, temp, or something descriptive as a future reminder)
-- In the Program drop-down box, choose SQL
-- In the Arguments box, type in or cut and paste the SQL query
-- Click OK
-- Back in the Dataset Properties Editor, check the box next to the QA step that you added and click Edit at the bottom of the screen
-- In the Edit QA Step window, check the box for "Download result file to local machine?" if you want to export the results
-- Click Run
-- Click View Results to see the results in the EMF
* If there are multiple files in a dataset type, keep track of which file your query is applied to. Query results are for that particular file only.
* Clearing the run logs in the Status window (trash can) before the run starts reduces clutter and makes the log easier to read.

Exporting to Google Earth
Option to export data in the KMZ format used by Google Earth
  For QA step results that include latitude and longitude coordinates
  Export from the View QA Step Results window

Exporting to Google Earth (figure slide)

Formatting Note: Quotes
CAUTION! "Smart quotes" (curly quotes) cause a syntax error in the EMF. Use straight quotes (' and ") in queries.