Databases and DBMS
Eric Lew (MSc, BSc)
SeconSys Inc.
Nov 2003

What is a Database?
- Data (singular: datum)
  - Factual information
- Database
  - Organized body of related information
  - Repository / storage of information
- Examples:
  - Grocery list
  - Phone book

Data Organization
Grocery list:
- Apples (2)
- Bread
- Yoghurt
- Paper plates (20)
- Lettuce (2)
- Chicken drumsticks
- Mushrooms
- Cheese
- Spaghetti
- Pork ribs
- Cilantro
- Tomato Sauce
- Plastic forks (20)
- Oranges
- Ice cream
- Plastic knives (20)
- Parsley
- Sour cream
- Tomatoes
- Bacon
- Paper napkins
- Spinach

Data Organization (II)
Grocery list:
- Fresh Produce:
  - Oranges (1 lb)
  - Apples (2 lbs)
  - Lettuce (2)
  - Spinach
  - Mushrooms (white)
  - Parsley
  - Cilantro
  - Tomatoes (plum)
- Meats:
  - Pork ribs (3 lbs)
  - Bacon (1 lb)
  - Chicken drumsticks (20)
- Dairy:
  - Yoghurt (low fat)
  - Sour cream
  - Ice cream
  - Cheese (mozzarella)
- Other:
  - Bread (white sliced)
  - Spaghetti
  - Tomato Sauce
  - Plastic forks (20)
  - Plastic knives (20)
  - Paper napkins (20)
  - Paper plates (20)

Data Models
- Conceptual Model
  - What is the data about
  - How will the data be used
  - Entity-Relationship (ER) diagram
- Logical Model
  - Implementation dependent
  - Network, Hierarchical - obsolete
  - Relational - the current standard
  - Object-oriented - the future?
- Physical Model
  - How is the data physically stored and retrieved
  - Hardware and software dependent

Relational Data Model
- Data organized as a collection of tables
- Each table represents an entity type
  - Consists of rows and columns
  - Each row contains the data values of one entity
  - Each column represents an attribute (property) of the entity
- Technical terms:
  - Table = relation
  - Row = tuple, record
  - Column = attribute, field
  - Domain = set of values a cell may take

Relational Data Model - Example
[Figure]

Relational Concepts
- Attribute values
  - Data types: character, number, logical, date
  - Range constraint; e.g. 1.0 to 9.9
  - Enumeration constraint; e.g. RED, BLUE, GREEN
  - Null value means unknown data
- No two rows can be identical
- Primary key: attribute(s) which uniquely identify a row

Relational Database Design
- A cell should only have one value
- Minimize data redundancy (reduce human error)
  - Replace one big table by several small tables
  - Technical term: normalization
- Use small attributes for primary keys
  - More efficient use of space
  - Faster retrieval of data
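
The attribute constraints listed under Relational Concepts can be tried out with a small sketch. It uses Python's built-in sqlite3 module purely for convenience; the Paint table and its columns are hypothetical names chosen for illustration, and other DBMS products may spell or enforce these constraints differently.

```python
import sqlite3

# In-memory database; the Paint table is a hypothetical example, not from the slides.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Paint (
        ID     INTEGER PRIMARY KEY,                               -- uniquely identifies a row
        Colour TEXT CHECK (Colour IN ('RED', 'BLUE', 'GREEN')),   -- enumeration constraint
        Rating REAL CHECK (Rating BETWEEN 1.0 AND 9.9)            -- range constraint
    )
""")

def try_insert(row):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Paint (ID, Colour, Rating) VALUES (?, ?, ?)", row
            )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert((1, "RED", 5.0)),     # valid row
    try_insert((1, "BLUE", 5.0)),    # rejected: primary key 1 already used
    try_insert((2, "PURPLE", 5.0)),  # rejected: not in the enumeration
    try_insert((3, "GREEN", 12.5)),  # rejected: outside the range 1.0 to 9.9
    try_insert((4, None, None)),     # accepted: NULL means "unknown", not "invalid"
]
print(results)
```

The last row shows that a null value passes both checks in SQLite, which matches the idea that null stands for unknown data rather than a bad value.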

Relational Database - Bad Design

Student      Address             Course          Books         Professor
John Bell    89 Lilac Dr.        Software 101    book1, book2  James Cook
Laura Holm   675 West Lane       Psychology 201  book3         Bill Gomez
Chet Davis   11 Bright Cres.     Database 101    book4, book5  James Cook
Mia Rowe     92 Sunlight Rd.     History 202     book6         Jenny Yates
Jill Aries   7 Hilly Lane        History 202     Book6         Jennifer Yates
Chet Davis   11 Bright Crescent  Software 101    book1, book2  James Cook
Mia Rowe     92 Sunlight Rd.     Psychology 201  book3         Bill Gomez
Alan Gold    117 Rose Drive      History 201     book6         Jennifer Yates

Relational Database - Good Design

Student
ID  Name        Address
11  John Bell   89 Lilac Dr.
12  Laura Holm  675 West Lane
13  Chet Davis  11 Bright Cres.
14  Mia Rowe    92 Sunlight Rd.
15  Jill Aries  7 Hilly Lane
16  Alan Gold   117 Rose Drive

StudentCourse
Stu_ID  Cour_ID
11      31
12      32
13      33
14      34
15      34
13      31
14      32
16      35

Course
ID  Name            Prof_ID
31  Software 101    1
32  Psychology 201  2
33  Database 101    1
34  History 202     3
35  History 201     3

Professor
ID  Prof_Name
1   James Cook
2   Bill Gomez
3   Jennifer Yates
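
As a rough sketch, the Good Design tables can be built and loaded as follows. Python's built-in sqlite3 module stands in for a full DBMS here; the column types, the NOT NULL markers, and the composite primary key on StudentCourse are assumptions added for illustration, since the slides give only table and column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Column types and constraints are assumptions; the slides list names only.
conn.executescript("""
    CREATE TABLE Professor (ID INTEGER PRIMARY KEY, Prof_Name TEXT NOT NULL);
    CREATE TABLE Course (
        ID      INTEGER PRIMARY KEY,
        Name    TEXT NOT NULL,
        Prof_ID INTEGER NOT NULL REFERENCES Professor(ID)
    );
    CREATE TABLE Student (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL, Address TEXT);
    CREATE TABLE StudentCourse (
        Stu_ID  INTEGER NOT NULL REFERENCES Student(ID),
        Cour_ID INTEGER NOT NULL REFERENCES Course(ID),
        PRIMARY KEY (Stu_ID, Cour_ID)
    );
""")

conn.executemany("INSERT INTO Professor VALUES (?, ?)", [
    (1, "James Cook"), (2, "Bill Gomez"), (3, "Jennifer Yates"),
])
conn.executemany("INSERT INTO Course VALUES (?, ?, ?)", [
    (31, "Software 101", 1), (32, "Psychology 201", 2), (33, "Database 101", 1),
    (34, "History 202", 3), (35, "History 201", 3),
])
conn.executemany("INSERT INTO Student VALUES (?, ?, ?)", [
    (11, "John Bell", "89 Lilac Dr."), (12, "Laura Holm", "675 West Lane"),
    (13, "Chet Davis", "11 Bright Cres."), (14, "Mia Rowe", "92 Sunlight Rd."),
    (15, "Jill Aries", "7 Hilly Lane"), (16, "Alan Gold", "117 Rose Drive"),
])
conn.executemany("INSERT INTO StudentCourse VALUES (?, ?)", [
    (11, 31), (12, 32), (13, 33), (14, 34), (15, 34), (13, 31), (14, 32), (16, 35),
])
conn.commit()

# Chet Davis takes two courses, yet his name and address are stored exactly once.
chet_rows = conn.execute(
    "SELECT COUNT(*) FROM Student WHERE Name = 'Chet Davis'").fetchone()[0]
chet_courses = conn.execute(
    "SELECT COUNT(*) FROM StudentCourse WHERE Stu_ID = 13").fetchone()[0]

# Enrolling a student in a course that does not exist is refused by the DBMS.
try:
    conn.execute("INSERT INTO StudentCourse VALUES (11, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(chet_rows, chet_courses, rejected)
```

Because each fact is stored once, the spelling variations seen in the Bad Design table (two versions of an address, two versions of a professor's name) have nowhere to creep in.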

Relational Concept: Join
- Relationship between tables identified by primary keys and foreign keys
  - Primary key (PK) in Professor table is ID
  - Foreign key (FK) in Course table is Prof_ID
  - Primary and foreign key must be same data type
- Joining tables: compare FK value and PK value
  - Course.Prof_ID = Professor.ID
- Relationship types:
  - One-to-one
  - One-to-many
  - Many-to-many

Relational Database Management System
- Separate application from data
  - Several applications can use same data
  - Tables can be added to database incrementally
- Multi-user concurrent access to data
- Maintenance of data integrity
  - Enforce validation of data
  - Back-up and recovery
- Transaction management
  - All or nothing (e.g. bank transfer)
- SQL: Structured Query Language
  - Standard language for all relational DBMS
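
The "all or nothing" behaviour of transaction management can be illustrated with a bank transfer. This is only a sketch using Python's built-in sqlite3 module: the Savings table, the account numbers A-1 and B-2, the transfer and balances helpers, and the rule that a balance may not go negative are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the CHECK stands in for a "no overdraft" business rule.
conn.execute(
    "CREATE TABLE Savings (Account TEXT PRIMARY KEY, "
    "Balance REAL CHECK (Balance >= 0))"
)
conn.executemany("INSERT INTO Savings VALUES (?, ?)",
                 [("A-1", 100.0), ("B-2", 50.0)])
conn.commit()

def transfer(src, dst, amount):
    """Move money between accounts; either both updates happen or neither does."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE Savings SET Balance = Balance + ? WHERE Account = ?",
                (amount, dst))
            # If this debit violates the CHECK, the credit above is undone too.
            conn.execute(
                "UPDATE Savings SET Balance = Balance - ? WHERE Account = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    """Return the current balances as {account: balance}."""
    return dict(conn.execute("SELECT Account, Balance FROM Savings"))

ok = transfer("A-1", "B-2", 30.0)       # enough funds: both rows change
after_ok = balances()
failed = transfer("A-1", "B-2", 500.0)  # not enough funds: nothing changes
after_failed = balances()

print(ok, after_ok)
print(failed, after_failed)
```

The credit is deliberately issued before the debit so that the failed transfer has something to undo; after the rollback the balances are the same as before the attempt.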

Structured Query Language
- INSERT command
  - One row at a time

    INSERT INTO Student (ID, Name) VALUES (20, 'Adam Wright')

- DELETE command

    DELETE FROM StudentCourse WHERE Stu_ID = 20

- UPDATE command

    UPDATE Course SET Prof_ID = 3 WHERE Name LIKE 'History%'

- SELECT command

    SELECT * FROM Student WHERE ID > 10

    SELECT Course.Name, Prof_Name
    FROM Course, Professor
    WHERE Course.Prof_ID = Professor.ID

Stored Procedures
- Blocks of SQL commands
  - Implement business rules
  - Reusable: used by multiple applications
- Stored in the DBMS (hence the name)
- Compiled at time of creation
  - Faster runtime execution
  - No syntax errors at run-time
- Language varies from one DBMS to another
  - Oracle: PL/SQL
  - Microsoft: Transact-SQL
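
The four SQL commands from the Structured Query Language slide can be run end to end in a small sketch. Python's built-in sqlite3 module is used only as a convenient host; the starting rows are made up for the demonstration (History 202 begins with the wrong professor so the UPDATE has a visible effect), and ORDER BY clauses are added to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Minimal tables and starting rows, assumed for this demonstration only.
conn.executescript("""
    CREATE TABLE Student (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Professor (ID INTEGER PRIMARY KEY, Prof_Name TEXT);
    CREATE TABLE Course (ID INTEGER PRIMARY KEY, Name TEXT, Prof_ID INTEGER);
    CREATE TABLE StudentCourse (Stu_ID INTEGER, Cour_ID INTEGER);
    INSERT INTO Professor VALUES (1, 'James Cook'), (3, 'Jennifer Yates');
    INSERT INTO Course VALUES (31, 'Software 101', 1), (34, 'History 202', 1);
    INSERT INTO Student VALUES (11, 'John Bell');
    INSERT INTO StudentCourse VALUES (11, 31), (20, 34);
""")

# INSERT: add one row.
conn.execute("INSERT INTO Student (ID, Name) VALUES (20, 'Adam Wright')")

# DELETE: remove every enrolment of student 20.
conn.execute("DELETE FROM StudentCourse WHERE Stu_ID = 20")

# UPDATE: reassign all courses whose name starts with 'History'.
conn.execute("UPDATE Course SET Prof_ID = 3 WHERE Name LIKE 'History%'")

# SELECT with a filter on one table.
students = conn.execute(
    "SELECT * FROM Student WHERE ID > 10 ORDER BY ID").fetchall()

# SELECT joining two tables: foreign key compared with primary key.
course_profs = conn.execute("""
    SELECT Course.Name, Prof_Name
    FROM Course, Professor
    WHERE Course.Prof_ID = Professor.ID
    ORDER BY Course.ID
""").fetchall()

enrolments = conn.execute("SELECT * FROM StudentCourse").fetchall()

print(students)      # [(11, 'John Bell'), (20, 'Adam Wright')]
print(course_profs)  # [('Software 101', 'James Cook'), ('History 202', 'Jennifer Yates')]
print(enrolments)    # [(11, 31)]
```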

Stored Procedure - Example

    CREATE PROCEDURE TransferMoney
        @acctnum1 CHAR(15),
        @acctnum2 CHAR(15),
        @amount   FLOAT
    AS
        DECLARE @balance FLOAT

        BEGIN TRANSACTION

        -- Read the current balance of the source account
        SELECT @balance = Balance FROM Savings WHERE Account = @acctnum1

        -- Insufficient funds: undo the transaction and report failure
        IF (@balance < @amount)
        BEGIN
            ROLLBACK TRANSACTION
            RETURN (-1)
        END

        -- Debit the source account, credit the destination account
        UPDATE Savings SET Balance = Balance - @amount WHERE Account = @acctnum1
        UPDATE Savings SET Balance = Balance + @amount WHERE Account = @acctnum2

        COMMIT TRANSACTION
        RETURN (0)

Georelational Data Model
- Hybrid data model (logical model)
  - Topological data model (represents spatial data)
  - Relational DBMS (represents attribute data)
- Geographic data represented by layers
  - Roads, streams, land cover
  - Each layer is stored in a separate table
- Spatial objects classified by graphical form
  - Points, lines and polygons are stored in separate tables

Georelational Example
[Figure]

Geodatabase
- Storage of geo information within DBMS
- Versioning
  - Multi-user editing
  - Multiple representations of data
  - Long transactions
- Behaviours
  - Validation rules: domains, subtypes
  - Default values

Object-Oriented Data Model
- Entities represented as objects
  - Object classes (types)
  - Objects have properties (attributes)
  - Objects have methods (operations)
- Classes and Inheritance
  - Class hierarchy: super and sub classes
  - Creation of new classes: Point, Line, Polygon
- Operations and Encapsulation
  - Data and operations bundled together
  - Polymorphism: classes Rectangle and Circle can have same operation CalculateArea

Object-Oriented DBMS
- Not widely accepted currently
  - Lack of standard query language
  - Portability issues
  - Not as efficient as Relational DBMS
  - Big name RDBMS have too much marketing clout
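
The object-oriented ideas of inheritance, encapsulation and polymorphism can be sketched in a few lines of Python. The Rectangle and Circle classes and the CalculateArea operation come from the slide (written calculate_area here to follow Python naming); the Shape superclass and the attribute names are assumptions added for the example.

```python
import math
from abc import ABC, abstractmethod

class Shape(ABC):
    """Hypothetical superclass; every subclass must supply calculate_area."""

    @abstractmethod
    def calculate_area(self) -> float:
        ...

class Rectangle(Shape):
    def __init__(self, width: float, height: float):
        # Properties (attributes) are bundled with the operations that use them.
        self.width = width
        self.height = height

    def calculate_area(self) -> float:
        return self.width * self.height

class Circle(Shape):
    def __init__(self, radius: float):
        self.radius = radius

    def calculate_area(self) -> float:
        return math.pi * self.radius ** 2

# Polymorphism: the same call works on both classes,
# and each object runs its own version of the operation.
shapes = [Rectangle(2.0, 3.0), Circle(1.0)]
areas = [shape.calculate_area() for shape in shapes]
print(areas)
```

The caller never needs to know which kind of shape it holds, which is the point of giving both classes the same operation name.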

DBMS in an Enterprise
[Diagram labels: Application Server, Web Server, Sales Report, ArcObjects, MapInfo, GeoDatabase, Sales Database, Inventory]