Computational Finance and Risk Management Financial Data Access with SQL, Excel & VBA Guy Yollin Instructor, Applied Mathematics University of Washington Guy Yollin (Copyright 2012) Data Access with SQL, Excel & VBA Introduction to SQL 1 / 58

Outline
1 Introduction to SQL
2 SQLite and sample databases
3 Simple queries
4 Queries with additional clauses
5 Querying multiple tables with subqueries
6 Querying multiple tables with join

Lecture references
Ben Forta, Sams Teach Yourself SQL in 10 Minutes, Sams, 1999, Chapters 1-12
sqlzoo.net, SQL ZOO: Interactive SQL Tutorial, http://sqlzoo.net/
sqlite.org, SQL As Understood By SQLite, http://www.sqlite.org/lang.html

1 Introduction to SQL

SQL
SQL (pronounced "sequel") stands for Structured Query Language, a special-purpose programming language designed for managing data in relational database management systems (RDBMS)
SQL has both an ANSI and an ISO standard, but minor compatibility issues are common:
- frequent updates to the standards
- vendor-specific procedural extensions
- vendor-specific deviations
MS Access SQL has many proprietary incompatibilities
http://en.wikipedia.org/wiki/sql

Importance of SQL
Why SQL? Knowledge of SQL is critical because the vast majority of real data owned by the vast majority of real companies is maintained in an SQL-compatible database (a stylized fact)
Common databases that support SQL:
- Microsoft SQL Server
- Oracle Database
- IBM DB2
- Sybase
- Microsoft Access
- MySQL
- PostgreSQL
- SQLite

Relational Database
A relational database is a collection of data items organized as a set of formally described tables from which data can be accessed easily
Relational database theory uses a set of mathematical terms, which are roughly equivalent to SQL database terminology:

Relational term         SQL equivalent
relation, base relvar   table
derived relvar          view, query result, result set
tuple                   row
attribute               column

http://en.wikipedia.org/wiki/relational_database

2 SQLite and sample databases

SQLite
- SQLite is a self-contained, serverless, zero-configuration SQL database engine
- SQLite is the most widely deployed SQL database engine in the world
- SQLite is open-source

Chinook sample database
The Chinook data model represents a digital media store, including tables for artists, albums, media tracks, invoices and customers.

[Figure: Chinook sample database]
[Figure: SQLite Manager for Firefox]

SQL and SQLite data types

SQLite storage mode   SQL datatype   Description
TEXT                  TEXT           variable length text
TEXT                  CHAR           fixed length string (size specified at create time)
TEXT                  NCHAR          like CHAR but supports Unicode characters
TEXT                  NVARCHAR       like TEXT but with Unicode support
INTEGER               INTEGER        4-byte signed integer
INTEGER               SMALLINT       2-byte signed integer
INTEGER               TINYINT        1-byte unsigned integer
REAL                  REAL           4-byte floating point
REAL                  FLOAT          floating point
NUMERIC               NUMERIC        fixed or floating point with specified precision
NUMERIC               DECIMAL        fixed or floating point with specified precision
NUMERIC               BOOLEAN        true or false
NUMERIC               DATE           date value
NUMERIC               DATETIME       date time value
NONE                  BLOB           binary data

http://www.sqlite.org/datatype3.html
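As a rough illustration of how declared SQL datatypes map onto SQLite storage modes, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table `demo`, its column names, and the inserted values are hypothetical; `typeof()` is SQLite's built-in function that reports how a value was actually stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: each column is declared with a different SQL datatype.
conn.execute("""
    CREATE TABLE demo (
        a NVARCHAR(20),   -- maps to TEXT
        b SMALLINT,       -- maps to INTEGER
        c FLOAT,          -- maps to REAL
        d DECIMAL(10,2),  -- maps to NUMERIC
        e BLOB            -- maps to NONE (stored as given)
    )
""")

# Deliberately insert values whose Python type differs from the declared type.
conn.execute(
    "INSERT INTO demo VALUES (?, ?, ?, ?, ?)",
    (123, "42", 7, "3.50", b"\x00\x01"),
)

# typeof() shows the storage class SQLite chose for each stored value.
storage = conn.execute(
    "SELECT typeof(a), typeof(b), typeof(c), typeof(d), typeof(e) FROM demo"
).fetchone()
print(storage)
```

The integer placed in the NVARCHAR column is stored as text, and the numeric-looking strings placed in the SMALLINT and DECIMAL columns are stored as numbers, which is the storage-mode behavior the table summarizes.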

Database keys
The relationships between columns located in different tables are usually described through the use of keys
Primary Key: a column (or set of columns) whose values uniquely identify every row in a table
Foreign Key: a column in a table which is also the Primary Key in another table
http://www.atlasindia.com/sql.htm
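A minimal sketch of the two kinds of key, using Python's built-in sqlite3 module. The `Genre` and `Track` tables are simplified, hypothetical stand-ins modeled on the Chinook naming, and the rows are made up. Note that SQLite only enforces foreign keys when `PRAGMA foreign_keys` is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

# GenreId is the primary key of Genre ...
conn.execute("CREATE TABLE Genre (GenreId INTEGER PRIMARY KEY, Name TEXT)")
# ... and appears in Track as a foreign key pointing back at Genre.
conn.execute("""
    CREATE TABLE Track (
        TrackId INTEGER PRIMARY KEY,
        Name    TEXT,
        GenreId INTEGER REFERENCES Genre (GenreId)
    )
""")

conn.execute("INSERT INTO Genre VALUES (1, 'Rock')")
conn.execute("INSERT INTO Track VALUES (10, 'Song A', 1)")  # GenreId 1 exists

try:
    # GenreId 99 has no matching row in Genre, so the insert is refused.
    conn.execute("INSERT INTO Track VALUES (11, 'Song B', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```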

3 Simple queries

SELECT statement
The most common operation in SQL is the query, which is performed with the SELECT statement
- SELECT retrieves data from one or more tables
- returned data is called a resultset or recordset
Standard SELECT queries just read from the database and do not change any underlying data
Notes about the SQL language:
- SQL keywords are not case-sensitive
- SQL ignores extra whitespace
- strings must use single quotes

SELECT/FROM wildcard
SQL: SELECT/FROM wildcard syntax
    SELECT * FROM tablename
The * character is a wildcard meaning all columns

SELECT/FROM
SQL: SELECT/FROM syntax
    SELECT columnname(s) FROM tablename
Multiple columns are separated with commas
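The two basic forms of SELECT/FROM can be tried with Python's built-in sqlite3 module. This is only a sketch: the `Genre` table below is a hypothetical three-row stand-in for the Chinook table of the same name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for the Chinook Genre table.
conn.execute("CREATE TABLE Genre (GenreId INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO Genre VALUES (?, ?)",
    [(1, "Rock"), (2, "Jazz"), (3, "Metal")],
)

# Wildcard: every column of every row.
all_columns = conn.execute("SELECT * FROM Genre").fetchall()

# Named columns, separated with commas, returned in the order listed.
two_columns = conn.execute("SELECT Name, GenreId FROM Genre").fetchall()

print(all_columns)
print(two_columns)
```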

SELECT/FROM/WHERE
SQL: SELECT/FROM/WHERE syntax
    SELECT columnname(s) FROM tablename WHERE somecondition

WHERE clause operators

Operator   Description
=          equality
<>         non-equality
!=         non-equality
<          less than
<=         less than or equal to
!<         not less than
>          greater than
>=         greater than or equal to
!>         not greater than
BETWEEN    between two specified values
IS NULL    is a NULL value

The WHERE clause can also include AND, OR, and NOT
Parentheses are used for complex logic

SELECT/FROM/WHERE
WHERE clause with arithmetic and logical operators
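A few of the WHERE clause operators can be exercised as follows. This is a sketch using Python's built-in sqlite3 module; the `Track` table is a cut-down, hypothetical version of the Chinook table, the rows are invented, and the helper `names` exists only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified Track table with made-up rows.
conn.execute("""
    CREATE TABLE Track (
        TrackId INTEGER PRIMARY KEY, Name TEXT, Composer TEXT,
        Milliseconds INTEGER, UnitPrice REAL
    )
""")
conn.executemany("INSERT INTO Track VALUES (?, ?, ?, ?, ?)", [
    (1, "Song A", "Ann", 200000, 0.99),
    (2, "Song B", None, 310000, 0.99),
    (3, "Song C", "Bob", 250000, 1.99),
    (4, "Song D", "Ann", 400000, 1.99),
])

def names(where):
    """Return the sorted track names that satisfy a fixed WHERE condition."""
    rows = conn.execute("SELECT Name FROM Track WHERE " + where).fetchall()
    return sorted(name for (name,) in rows)

# Comparison operators combined with AND
pricey_and_short = names("UnitPrice > 0.99 AND Milliseconds < 300000")
# BETWEEN includes both end points
mid_length = names("Milliseconds BETWEEN 200000 AND 310000")
# NULL must be tested with IS NULL, not with =
no_composer = names("Composer IS NULL")
# Parentheses group the OR before the AND is applied
grouped = names("(Composer = 'Ann' OR Composer = 'Bob') AND UnitPrice <> 0.99")

print(pricey_and_short, mid_length, no_composer, grouped)
```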

WHERE clause with IN
SQL: WHERE clause with IN
    SELECT columnname(s) FROM tablename WHERE somecolumn IN listofvalues
The list for WHERE IN is in parentheses with items separated with commas

WHERE clause with NOT IN
SQL: WHERE clause with NOT IN
    SELECT columnname(s) FROM tablename WHERE somecolumn NOT IN listofvalues
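IN and NOT IN can be compared side by side. The sketch below uses Python's built-in sqlite3 module with a hypothetical, simplified `Track` table and invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified Track table with made-up rows.
conn.execute("CREATE TABLE Track (TrackId INTEGER PRIMARY KEY, Name TEXT, GenreId INTEGER)")
conn.executemany("INSERT INTO Track VALUES (?, ?, ?)", [
    (1, "Song A", 1),
    (2, "Song B", 2),
    (3, "Song C", 3),
    (4, "Song D", 1),
])

# The list of values goes in parentheses, separated with commas.
in_rows = conn.execute("SELECT Name FROM Track WHERE GenreId IN (1, 3)").fetchall()
not_in_rows = conn.execute("SELECT Name FROM Track WHERE GenreId NOT IN (1, 3)").fetchall()

in_list = sorted(name for (name,) in in_rows)
not_in_list = sorted(name for (name,) in not_in_rows)
print(in_list, not_in_list)
```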

Partial matching with WHERE LIKE
The LIKE keyword is used in SQL expressions to perform partial matching by including wildcard characters:
- _ represents a single unspecified character
- % represents a series of zero or more unspecified characters
SQL: WHERE clause with LIKE
    SELECT columnname(s) FROM tablename WHERE somecolumn LIKE wildcardstring
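The two LIKE wildcards can be tried as follows, using Python's built-in sqlite3 module. The `Track` table and the track names are hypothetical and chosen only to make the matches easy to follow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table with made-up track names.
conn.execute("CREATE TABLE Track (TrackId INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO Track (Name) VALUES (?)", [
    ("Black Dog",), ("Blue Train",), ("Back in Black",), ("Bad",), ("Dog Days",),
])

def like(pattern):
    """Return the sorted names matching a LIKE pattern (pattern passed as a parameter)."""
    rows = conn.execute("SELECT Name FROM Track WHERE Name LIKE ?", (pattern,)).fetchall()
    return sorted(name for (name,) in rows)

starts_with_b = like("B%")        # B followed by anything
contains_black = like("%Black%")  # % may also match nothing at all
b_any_d = like("B_d")             # exactly one character between B and d

print(starts_with_b, contains_black, b_any_d)
```

"Black Dog" matches `%Black%` even though nothing precedes "Black", which shows that % can stand for an empty run of characters.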

4 Queries with additional clauses

ORDER BY clause
SQL: ORDER BY clause
    SELECT columnname(s) FROM tablename WHERE somecondition ORDER BY columnname
Use DESC with ORDER BY to sort in descending order
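A short sketch of ascending and descending sorts, using Python's built-in sqlite3 module. The `Track` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified Track table with made-up rows.
conn.execute("""
    CREATE TABLE Track (
        TrackId INTEGER PRIMARY KEY, Name TEXT, Milliseconds INTEGER, UnitPrice REAL
    )
""")
conn.executemany("INSERT INTO Track VALUES (?, ?, ?, ?)", [
    (1, "Song C", 250000, 0.99),
    (2, "Song A", 400000, 0.99),
    (3, "Song B", 200000, 0.99),
    (4, "Song D", 100000, 1.99),
])

# Ascending order is the default.
by_name = conn.execute(
    "SELECT Name FROM Track WHERE UnitPrice < 1.00 ORDER BY Name"
).fetchall()

# DESC reverses the sort: longest track first.
longest_first = conn.execute(
    "SELECT Name, Milliseconds FROM Track WHERE UnitPrice < 1.00 "
    "ORDER BY Milliseconds DESC"
).fetchall()

print(by_name)
print(longest_first)
```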

SQL aggregate functions
SQL supports the use of arithmetic formulas and it also provides a number of aggregate functions

Function   Description
COUNT      counts the number of rows in the resultset
SUM        sums a column of the resultset
AVG        takes the average of a column of the resultset
MAX        finds the maximum value in a column of the resultset
MIN        finds the minimum value in a column of the resultset

The results of an arithmetic operation are usually assigned an alias with the AS keyword
Aggregate functions are frequently used with the GROUP BY clause of the SELECT statement

COUNT function
SQL: COUNT function
    SELECT COUNT ( columnname ) FROM tablename
COUNT(*) gives the total count of the number of rows in a table (e.g. the Track table)
COUNT(columnname) counts only the non-null values in that column (e.g. the number of non-null Composers)

COUNT function with DISTINCT clause
SQL: COUNT function with DISTINCT
    SELECT COUNT ( DISTINCT columnname ) FROM tablename
Example: number of unique composers
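The three flavors of COUNT give different answers when a column contains NULLs and repeats. A sketch with Python's built-in sqlite3 module follows; the `Track` table and composer names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: one composer is missing (NULL) and one appears twice.
conn.execute("CREATE TABLE Track (TrackId INTEGER PRIMARY KEY, Name TEXT, Composer TEXT)")
conn.executemany("INSERT INTO Track VALUES (?, ?, ?)", [
    (1, "Song A", "Ann"),
    (2, "Song B", None),
    (3, "Song C", "Bob"),
    (4, "Song D", "Ann"),
])

total_rows = conn.execute("SELECT COUNT(*) FROM Track").fetchone()[0]
non_null_composers = conn.execute("SELECT COUNT(Composer) FROM Track").fetchone()[0]
unique_composers = conn.execute("SELECT COUNT(DISTINCT Composer) FROM Track").fetchone()[0]

print(total_rows, non_null_composers, unique_composers)
```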

AVG function
SQL: AVG function
    SELECT AVG ( columnname ) FROM tablename
Example: average invoice amount

MAX function
SQL: MAX function
    SELECT MAX ( columnname ) FROM tablename
Example: maximum invoice amount
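The remaining aggregate functions can be combined in one query, with each result given an alias via AS. This sketch uses Python's built-in sqlite3 module; the `Invoice` table is a hypothetical two-column stand-in and the totals are invented. Setting `row_factory = sqlite3.Row` lets the aliases be used as keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can then be indexed by column alias

# Hypothetical, simplified Invoice table with made-up totals.
conn.execute("CREATE TABLE Invoice (InvoiceId INTEGER PRIMARY KEY, Total REAL)")
conn.executemany("INSERT INTO Invoice VALUES (?, ?)", [(1, 2.0), (2, 4.0), (3, 9.0)])

row = conn.execute("""
    SELECT AVG(Total) AS AvgTotal,
           MAX(Total) AS MaxTotal,
           MIN(Total) AS MinTotal,
           SUM(Total) AS SumTotal
    FROM Invoice
""").fetchone()

# Copy the single result row into a plain dict keyed by alias.
summary = {key: row[key] for key in row.keys()}
print(summary)
```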

GROUP BY clause
SQL: GROUP BY clause
    SELECT AggFunc ( columnname ) ... GROUP BY columnname

GROUP BY and HAVING clause
SQL: GROUP BY and HAVING clause
    SELECT AggFunc ( columnname ) ... GROUP BY columnname HAVING somecondition
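GROUP BY computes one aggregate per group, and HAVING then filters the groups. A sketch with Python's built-in sqlite3 module follows; the `Invoice` table is a hypothetical simplification and the amounts are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified Invoice table with made-up rows.
conn.execute("""
    CREATE TABLE Invoice (InvoiceId INTEGER PRIMARY KEY, BillingCountry TEXT, Total REAL)
""")
conn.executemany("INSERT INTO Invoice (BillingCountry, Total) VALUES (?, ?)", [
    ("USA", 2.0), ("USA", 4.0), ("Canada", 9.0), ("France", 1.0), ("Canada", 1.0),
])

# One output row per country.
per_country = conn.execute("""
    SELECT BillingCountry, COUNT(*) AS Invoices, SUM(Total) AS Sales
    FROM Invoice
    GROUP BY BillingCountry
    ORDER BY Sales DESC
""").fetchall()

# HAVING filters on the aggregated value, after grouping.
big_countries = conn.execute("""
    SELECT BillingCountry, SUM(Total) AS Sales
    FROM Invoice
    GROUP BY BillingCountry
    HAVING SUM(Total) > 5
    ORDER BY Sales DESC
""").fetchall()

print(per_country)
print(big_countries)
```

WHERE filters individual rows before grouping, whereas HAVING filters whole groups afterwards, which is why the condition on `SUM(Total)` belongs in HAVING.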

5 Querying multiple tables with subqueries

Input data from 2 tables, output from 1 table
Problem: Find all songs belonging to a particular genre of music
Solution: Find the GenreId of the desired style (from the Genre table), then select all of the tracks that match the GenreId (from the Track table)

[Figure: Manually running 2 queries]

Subquery
SQL: Subquery syntax
    SELECT columnnames FROM tablename WHERE somecolumn IN ( SELECT/FROM/WHERE statement )
The subquery provides the list used in the top-level WHERE IN clause
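The sketch below contrasts running the two queries by hand with a single statement that uses a subquery. It relies on Python's built-in sqlite3 module; the `Genre` and `Track` tables are hypothetical simplifications of the Chinook tables and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified tables with made-up rows.
conn.execute("CREATE TABLE Genre (GenreId INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("CREATE TABLE Track (TrackId INTEGER PRIMARY KEY, Name TEXT, GenreId INTEGER)")
conn.executemany("INSERT INTO Genre VALUES (?, ?)", [(1, "Rock"), (2, "Jazz")])
conn.executemany("INSERT INTO Track VALUES (?, ?, ?)", [
    (1, "Song A", 1), (2, "Song B", 2), (3, "Song C", 2),
])

# Manual approach, step 1: look up the GenreId(s) for the genre name.
genre_ids = [gid for (gid,) in
             conn.execute("SELECT GenreId FROM Genre WHERE Name = 'Jazz'")]
# Manual approach, step 2: feed those ids into a second query.
placeholders = ", ".join("?" for _ in genre_ids)
manual = conn.execute(
    "SELECT Name FROM Track WHERE GenreId IN (" + placeholders + ")", genre_ids
).fetchall()

# Subquery approach: the inner SELECT supplies the list for WHERE ... IN.
with_subquery = conn.execute("""
    SELECT Name FROM Track
    WHERE GenreId IN (SELECT GenreId FROM Genre WHERE Name = 'Jazz')
""").fetchall()

print(sorted(manual) == sorted(with_subquery), sorted(with_subquery))
```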

Input data from 3 tables, output from 1 table
Problem: Find all the albums containing songs belonging to a particular genre of music
Solution: Find the GenreId of the desired style (from the Genre table), then select all of the Tracks that match the GenreId (from the Track table), then select all of the Titles that match the AlbumId (from the Album table)

Nested subquery
- Lowest-level query provides a list of GenreIds
- Mid-level query provides a list of AlbumIds
- Top-level query returns the album names
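The three levels described above might be written as follows. This is a sketch using Python's built-in sqlite3 module; the `Genre`, `Track`, and `Album` tables are hypothetical simplifications of the Chinook tables with invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified tables with made-up rows.
conn.execute("CREATE TABLE Genre (GenreId INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("CREATE TABLE Album (AlbumId INTEGER PRIMARY KEY, Title TEXT)")
conn.execute("""
    CREATE TABLE Track (
        TrackId INTEGER PRIMARY KEY, Name TEXT, AlbumId INTEGER, GenreId INTEGER
    )
""")
conn.executemany("INSERT INTO Genre VALUES (?, ?)", [(1, "Rock"), (2, "Jazz")])
conn.executemany("INSERT INTO Album VALUES (?, ?)",
                 [(1, "Album X"), (2, "Album Y"), (3, "Album Z")])
conn.executemany("INSERT INTO Track VALUES (?, ?, ?, ?)", [
    (1, "Song A", 1, 1),
    (2, "Song B", 2, 2),
    (3, "Song C", 3, 2),
    (4, "Song D", 3, 1),
])

jazz_albums = conn.execute("""
    SELECT Title FROM Album                      -- top level: album names
    WHERE AlbumId IN (
        SELECT AlbumId FROM Track                -- mid level: list of AlbumIds
        WHERE GenreId IN (
            SELECT GenreId FROM Genre            -- lowest level: list of GenreIds
            WHERE Name = 'Jazz'
        )
    )
""").fetchall()

print(sorted(jazz_albums))
```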

6 Querying multiple tables with join

Input data from 2 tables, output from 2 tables
Problem: Display a list of all songs and their genre
Solution: Create a new table with the name of the song (from the Track table) and the name of the song's genre (from the Genre table)

Join 2 tables with FROM/WHERE syntax
SQL: JOIN WHERE syntax
    SELECT tablename1.columnname, tablename2.columnname
    FROM tablename1, tablename2
    WHERE tablename1.keycolumn = tablename2.keycolumn
The WHERE clause connects the keys from the two different tables

Join 2 tables with JOIN/ON syntax
SQL: JOIN ON syntax
    SELECT tablename1.columnname, tablename2.columnname
    FROM tablename1 JOIN tablename2
    ON tablename1.keycolumn = tablename2.keycolumn
The ON clause connects the keys from the two different tables
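Both join syntaxes return the same rows, as the sketch below checks using Python's built-in sqlite3 module. The `Genre` and `Track` tables are hypothetical simplifications of the Chinook tables with invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified tables with made-up rows.
conn.execute("CREATE TABLE Genre (GenreId INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("CREATE TABLE Track (TrackId INTEGER PRIMARY KEY, Name TEXT, GenreId INTEGER)")
conn.executemany("INSERT INTO Genre VALUES (?, ?)", [(1, "Rock"), (2, "Jazz")])
conn.executemany("INSERT INTO Track VALUES (?, ?, ?)", [
    (1, "Song A", 1), (2, "Song B", 2), (3, "Song C", 2),
])

# FROM/WHERE syntax: list both tables, connect the keys in WHERE.
where_join = conn.execute("""
    SELECT Track.Name, Genre.Name
    FROM Track, Genre
    WHERE Track.GenreId = Genre.GenreId
""").fetchall()

# JOIN/ON syntax: connect the keys in the ON clause.
join_on = conn.execute("""
    SELECT Track.Name, Genre.Name
    FROM Track JOIN Genre
    ON Track.GenreId = Genre.GenreId
""").fetchall()

print(sorted(where_join) == sorted(join_on), sorted(join_on))
```

Both tables have a column called `Name`, so the `tablename.columnname` prefix is what tells the two apart.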

Join 3 tables with FROM/WHERE syntax
SQL: JOIN WHERE syntax
    SELECT tablename1.columnname, tablename2.columnname, tablename3.columnname
    FROM tablename1, tablename2, tablename3
    WHERE tablename1.keycolumn = tablename2.keycolumn
    AND tablename1.keycolumn = tablename3.keycolumn

Join 3 tables with JOIN/ON syntax
SQL: JOIN ON syntax
    SELECT tabname1.columnname, tabname2.columnname, tabname3.columnname
    FROM tabname1
    JOIN tabname2 ON tabname1.keycol = tabname2.keycol
    JOIN tabname3 ON tabname1.keycol = tabname3.keycol
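A three-table join in JOIN/ON form, sketched with Python's built-in sqlite3 module. Here the hypothetical `Track` table plays the role of the first table and carries the keys into both `Album` and `Genre`; all tables are simplified and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified tables with made-up rows.
conn.execute("CREATE TABLE Genre (GenreId INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("CREATE TABLE Album (AlbumId INTEGER PRIMARY KEY, Title TEXT)")
conn.execute("""
    CREATE TABLE Track (
        TrackId INTEGER PRIMARY KEY, Name TEXT, AlbumId INTEGER, GenreId INTEGER
    )
""")
conn.executemany("INSERT INTO Genre VALUES (?, ?)", [(1, "Rock"), (2, "Jazz")])
conn.executemany("INSERT INTO Album VALUES (?, ?)", [(1, "Album X"), (2, "Album Y")])
conn.executemany("INSERT INTO Track VALUES (?, ?, ?, ?)", [
    (1, "Song A", 1, 1),
    (2, "Song B", 2, 2),
    (3, "Song C", 2, 2),
])

# Each JOIN ... ON links Track to one of the other two tables.
song_album_genre = conn.execute("""
    SELECT Track.Name, Album.Title, Genre.Name
    FROM Track
    JOIN Album ON Track.AlbumId = Album.AlbumId
    JOIN Genre ON Track.GenreId = Genre.GenreId
    ORDER BY Track.Name
""").fetchall()

print(song_album_genre)
```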

Join 4 tables with FROM/WHERE syntax
The WHERE clause can contain additional constraints as well as specifying table linkages

Join 4 tables with JOIN/ON syntax
A WHERE clause can be added to specify additional constraints

Join operation with aggregates by group
Who are the largest customers?
What countries produce the most sales other than the US?
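One way the two questions above could be answered is to join customers to their invoices and aggregate by group. The sketch uses Python's built-in sqlite3 module; the `Customer` and `Invoice` tables are hypothetical simplifications of the Chinook tables, and the names and amounts are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified tables with made-up rows.
conn.execute("CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY, LastName TEXT, Country TEXT)")
conn.execute("CREATE TABLE Invoice (InvoiceId INTEGER PRIMARY KEY, CustomerId INTEGER, Total REAL)")
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?)", [
    (1, "Smith", "USA"), (2, "Dubois", "France"),
    (3, "Tremblay", "Canada"), (4, "Martin", "France"),
])
conn.executemany("INSERT INTO Invoice (CustomerId, Total) VALUES (?, ?)", [
    (1, 10.0), (1, 5.0), (2, 8.0), (3, 20.0), (4, 4.0),
])

# Largest customers: total sales per customer, biggest first.
largest_customers = conn.execute("""
    SELECT Customer.LastName, SUM(Invoice.Total) AS Sales
    FROM Customer JOIN Invoice ON Customer.CustomerId = Invoice.CustomerId
    GROUP BY Customer.CustomerId, Customer.LastName
    ORDER BY Sales DESC
""").fetchall()

# Sales by country, excluding the US: WHERE filters rows before grouping.
sales_by_country = conn.execute("""
    SELECT Customer.Country, SUM(Invoice.Total) AS Sales
    FROM Customer JOIN Invoice ON Customer.CustomerId = Invoice.CustomerId
    WHERE Customer.Country <> 'USA'
    GROUP BY Customer.Country
    ORDER BY Sales DESC
""").fetchall()

print(largest_customers)
print(sales_by_country)
```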

Computational Finance and Risk Management
http://depts.washington.edu/compfin