SQL Server 2008 Core Skills Gary Young 2011
Confucius I hear and I forget I see and I remember I do and I understand
Core Skills Syllabus Theory of relational databases SQL Server tools Getting help Data warehouses Simple queries Aggregate queries Accreditation
Theory of Relational Databases
Relational Model Edgar Codd, IBM (1969) Based on set theory Normalised structure Related data stored in tables Each table has a primary key Tables linked using foreign keys Data manipulated using Structured Query Language (SQL)
Key Concepts Table Primary Key Unique column or combination of columns Foreign key Primary key of one table stored in another table Index Internal catalogue enabling faster queries Column / Field Column of data items of a specified data type Data Type Text Numeric Date/time Binary
Key Concepts View Stored select query May be used to restrict access to certain fields (eg salary) Used like tables System function Calculations and string manipulation User-defined function Encapsulated logic Input parameters Output value Stored procedure Encapsulated logic or process Input / output parameters May or may not return a value
Key Concepts Parameter A value passed to / from a process Trigger Stored procedure / function Automatic process that happens when data is changed Transaction All or nothing process Cash machine withdrawal Rollback on error
Normalisation: Why Bother? Mirror real world objects Save space Very important 1970s Maximise flexibility of querying Reduce risk of anomalies Minimise ambiguity
Denormalised Data
Normalised Data
SQL Terminology SQL Structured Query Language T-SQL Transact SQL (Microsoft extension) Programming language DML Data Manipulation Language select / insert / update / delete DDL Data Definition Language create / alter / drop
Object Naming Server.Database.Schema.Object [rptdev.nwtis.nhs.uk].minerva.common.date Don't need to specify all parts if using defaults Defaults Server: the current server Database: the current database Schema: the current user's default (usually dbo)
Schema Logical database partition Used to enforce security over groups of objects common All users restricted etl Administrators Administrators / data processors
SQL Server Tools
Key Tools SSMS SQL Server Management Studio 75% of your time spent here! DDL / DML Business Intelligence Development Studio SSAS SQL Server Analysis Services Data cubes SSRS SQL Server Reporting Services 25% of your time spent here Reports / dashboards SSIS SQL Server Integration Services Database level automation
Getting Help
Help Phone a friend Ask an expert (colleague) SQL Server 2008 Books Online Syntax for DDL / DML Examples most helpful But... need to know what you re looking for
EBNF Extended Backus-Naur Form Transact SQL Syntax Conventions <label> ::= label is defined as <A><B> A then B <A> <B> A or B [<A>] A is optional [...n] Repetition n times
Function / Operator Reference Will use time and again... Date and time functions dateadd / datetdiff / datename getdate String functions left / right / substring charindex Data types cast / convert Operators like % _ [xyz] [x-z] [^xyz] [^x-z] in ('x', 'y', 'z')
Data Warehouses
Schemas: Disambiguation Logical partition of a database Server.Database.Schema.Object Model Diagram Textual eg EBNF Plan or design of a database Star schema Snowflake schema
Typical Data Warehouse Structure Fact tables Main data tables SUS IP / OP / AE Dimension tables Look-up tables Admission Method GP
Star Schema
Snowflake Schema
Dimensions Star dimensions Simple look-up tables Admission method Ethnic Category Snowflake dimensions Hierarchies GP - Practice - Cluster - PCT - SHA Postcode - Electoral Ward - Town
Surrogate Keys Combine related dimension tables Natural key PAS / NHS Data Dictionary / OCS code Surrogate key Auto number
Simple Queries
Comments Use to improve readability of a script Ignored by SQL Server parser Single line format --this is a single line comment Everything after "--" ignored Multi line /* this is a multi line comment */
Simple Queries select <columns> from <fact table> <join type> join <dimension table> on <joined columns> where <filters> order by <columns>;
Select Modifiers select top x [percent] Restrict to x results Useful for testing select distinct Unique combinations Similar to group by
Aliasing Renaming objects within a query Table from A_Long_Table_Name a Column select PQRSZ123 as 'Easier_To_Read_Name' select PQRSZ123 as [Easier_To_Read_Name]
The Three Joins inner join Matches from both tables left (outer) join All matches from first table Matches from second (if any) Use left join in most cases union [all] All matches from both tables No duplicates! (unless all)
Filtering where clause and filters All filters in combination or filters Either / or filters in (A,B,C) shorthand for A or B or C Use brackets for clarity if used with and Comparators = < > <= >= like '%s%' (not '*s*')
Subqueries Use view like a table Use a select query like a table select <columns> from <fact table> <join type> join <select query> <view> on <joined columns> where <filters> order by <columns>
Case Statement Create new columns on-the-fly Recoding values case end when Day in ('Saturday','Sunday') then 'Weekend' else 'Working Day'
Aggregate Queries
Aggregate Select Query select <grouped columns> <literal columns> <aggregated columns> from <fact table> <join type> join <dimension table> on <joined columns> where <simple filters> group by <grouped columns> having <aggregate filters> order by <columns>;
Aggregate Functions sum Numeric values summed count Number of records min / max avg Lowest / highest values Mean floor / ceiling Next integer lower / higher
Accreditation
Time-limited Test Simple select Use of filters (where) Join types (inner / outer) Sorting (order by) Union Aggregate select Aggregate functions (count / min / max etc) Aggregate filters (having) Use of system functions Datetime (datediff / dateadd etc) String (left / right etc)