Relational Data Analysis I

Relational Data Analysis. Prepares business data for representation using the relational model. The relational model is implemented in a number of popular database systems: Access, Oracle, MySQL, SQL Server and DB2.

The Relational Model. A relation is a table of data. A relational database is therefore one in which tables are used to store data; this implies that there are other ways of storing data. Tables will be related to each other in some way, because the data held in them is related. The context of the system we are developing governs which data items are related and how they are related.

Relational Data Analysis. Relational data analysis therefore involves: building related tables of data; retrieving data from one or more related tables; and inserting, updating and deleting data in related tables.
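
The operations listed above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, column names and data are assumptions and do not come from the slides.

```python
import sqlite3

# In-memory database; all table names, column names and values are hypothetical.
conn = sqlite3.connect(":memory:")

# Building two related tables: student_module refers back to student.
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, student_name TEXT)"
)
conn.execute(
    "CREATE TABLE student_module ("
    " student_id INTEGER REFERENCES student(student_id),"
    " module_code TEXT,"
    " grade TEXT,"
    " PRIMARY KEY (student_id, module_code))"
)

# Inserting data into both tables.
conn.execute("INSERT INTO student VALUES (1, 'Ann')")
conn.execute("INSERT INTO student_module VALUES (1, 'M1', 'B')")

# Updating existing data.
conn.execute(
    "UPDATE student_module SET grade = 'A' WHERE student_id = 1 AND module_code = 'M1'"
)

# Retrieving data from the two related tables with a join.
joined = conn.execute(
    "SELECT s.student_name, m.module_code, m.grade"
    " FROM student s JOIN student_module m ON m.student_id = s.student_id"
).fetchall()

# Deleting data.
conn.execute("DELETE FROM student_module WHERE student_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM student_module").fetchone()[0]

print(joined, remaining)
```

The join is what makes the tables "related" in practice: the shared student_id column links a row in one table to rows in the other.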

Relational Data Analysis. Relational data analysis is quite formal: it is based on set theory and uses relational algebra to define operations on tables. We will take a less formal approach.

Definitions. A relation corresponds to a table. A tuple is a row in a table. An attribute is a column in a table. A primary key is the attribute (or combination of attributes) by which we uniquely identify each row. The number of rows in a table is called the cardinality. The number of attributes in a table is called the degree.
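
As a minimal sketch of these definitions, a relation can be modelled in Python as a tuple of attribute names plus a list of rows. The attribute names and data below are hypothetical.

```python
# A relation sketched as a heading (attribute names) plus a list of tuples (rows).
# The attribute names and the data are made up for illustration.
attributes = ("StudentID", "StudentName", "Course")
rows = [
    (1, "Ann", "Computing"),
    (2, "Bob", "Accounting"),
    (3, "Cat", "Computing"),
    (4, "Dan", "History"),
]


def cardinality(rows):
    """Number of rows (tuples) in the table."""
    return len(rows)


def degree(attributes):
    """Number of attributes (columns) in the table."""
    return len(attributes)


print("cardinality:", cardinality(rows))
print("degree:", degree(attributes))
```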

Example Relation (Table). The table can also be described without its data: Student (Student ID, Student Name, Course, Module Code, Module Name, Grade). We should use camel case for attribute and table names: Student (studentID, studentName, course, moduleCode, moduleName, grade).

Example Relation (Tuple). A tuple is a row in a table. Each tuple should be unique. The sequence of tuples should not be important. A primary key attribute is added to ensure each row is unique. This data is for a machine to read, not a person.

Example Relation (Attribute). An attribute is a column in a table. Each attribute should have a unique name. The order of the columns should not be significant.

Example Relation (Table). Our example has a cardinality of 4 and a degree of 6. The primary key will be Student ID, as this will uniquely identify each row. We cannot know this without having an understanding of the data. If there is no existing primary key then we must invent one.

Exercise

Name    Number    Town        No of contracts  Depot
Tom     0050065   Manchester  2                Manchester
Dick    0338178   Leeds       1                Manchester
Harry   1922029   Manchester  3                Stoke
Sue     0002911   Oxford      1                Reading
Frieda  1001001   Cardiff     7                Cardiff
Imran   23455678  Manchester  1                Stoke
Yue     32156545  Manchester  7                London

Exercise. What is the cardinality of the table? What is the degree of the table? What is the primary key of the table?

Exercise. What is the cardinality of the table (how many rows)? 7. What is the degree of the table (how many attributes)? 5. What is the primary key of the table? Number. But how do we know? Why not Name?
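
A short Python sketch makes the last question concrete. It uses the exercise data; the helper name unique_columns and the attribute spellings are assumptions made here. The check shows that both Name and Number happen to be unique in this sample, so the data alone cannot settle the choice: we need to understand what the data means (two people can share a name).

```python
# Exercise data; the attribute spellings are chosen here for illustration.
attributes = ("Name", "Number", "Town", "NoOfContracts", "Depot")
rows = [
    ("Tom", "0050065", "Manchester", 2, "Manchester"),
    ("Dick", "0338178", "Leeds", 1, "Manchester"),
    ("Harry", "1922029", "Manchester", 3, "Stoke"),
    ("Sue", "0002911", "Oxford", 1, "Reading"),
    ("Frieda", "1001001", "Cardiff", 7, "Cardiff"),
    ("Imran", "23455678", "Manchester", 1, "Stoke"),
    ("Yue", "32156545", "Manchester", 7, "London"),
]


def unique_columns(attributes, rows):
    """Return the attributes whose values are all different in this sample."""
    result = []
    for index, name in enumerate(attributes):
        values = [row[index] for row in rows]
        if len(set(values)) == len(values):
            result.append(name)
    return result


candidates = unique_columns(attributes, rows)
print("cardinality:", len(rows))
print("degree:", len(attributes))
print("unique in this sample:", candidates)
```

Uniqueness in a sample can only rule an attribute out as a key; it cannot prove that the attribute will stay unique as more rows arrive.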

Tables and Entities. Each table is equivalent to an entity in an ERD. Each attribute is equivalent to an attribute in an ERD. Each tuple is an occurrence of an entity in an ERD. The primary key is equivalent to the key attribute in an ERD entity.

Rules. A table cannot have any empty cells. No two rows in a table are identical, i.e. there are no duplicate tuples/rows. Every relation has a primary key attribute. The sequence of the rows should not be significant. The sequence of the columns should not be significant. Each attribute must have a unique name.

Problems with Tables. Problems with tables can be classified into three groups. Insert anomalies: problems caused when inserting new information. Update anomalies: problems caused when updating existing data. Delete anomalies: problems caused when deleting data.

Problems with Tables. Insert anomaly: we cannot add a new course unless we have a student ID. Update anomaly: what if Big John changes his name? Assume there are 1000s of records. Delete anomaly: what if we remove Terrence Halfwit from the database?
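
The three anomalies can be sketched in a few lines of Python. The original table is not reproduced in this transcript, so the rows below are hypothetical values built around the names on the slide, and the insert helper only imitates a primary-key rule.

```python
# Flat, un-normalised table: one row per student per module.
# All values are hypothetical; the slide's real table is not available here.
rows = [
    {"student_id": 1, "student_name": "Big John", "course": "Accounting", "module_code": "M1"},
    {"student_id": 1, "student_name": "Big John", "course": "Accounting", "module_code": "M2"},
    {"student_id": 2, "student_name": "Terrence Halfwit", "course": "Computing", "module_code": "M3"},
]


def insert(table, row):
    # Imitates the primary-key rule: a row without a student ID is rejected.
    if row.get("student_id") is None:
        raise ValueError("cannot insert a row without a student ID")
    table.append(row)


# Insert anomaly: a new course with no students yet cannot be recorded.
try:
    insert(rows, {"student_id": None, "student_name": None, "course": "History"})
    insert_rejected = False
except ValueError:
    insert_rejected = True

# Update anomaly: changing the name in only one row leaves two names for student 1.
rows[0]["student_name"] = "Little John"
names_for_student_1 = {r["student_name"] for r in rows if r["student_id"] == 1}

# Delete anomaly: removing the only Computing student also removes the course itself.
remaining = [r for r in rows if r["student_name"] != "Terrence Halfwit"]
courses_left = {r["course"] for r in remaining}

print(insert_rejected, names_for_student_1, courses_left)
```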

The Solution? To remove these anomalies we must rearrange the data and create new tables. The process for doing this is called normalisation.

Normalisation. The first three stages are First Normal Form (1NF), Second Normal Form (2NF) and Third Normal Form (3NF). 1NF can be considered as normalised, but there could still be problems. Most common problems are solved by 3NF. Further normalisation will solve rarer problems.

First Normal Form. All data in a table must be dependent on the key. In order to do this we must remove repeating groups. This is done by analysing the relationship between the primary key and the rest of the data.

Example 1 - Students. Attributes: Student ID, Student Name, Course ID, Course, Module Code, Module Name, Grade. Attributes are moved if there is more than one for each instance of the primary key.

Example 1 - Students. Attributes: Student ID, Student Name, Course, Course ID, Module Code, Module Name, Grade. For each Student ID, ask of every other attribute whether there is 1 or many. How many Student Names are there? How many Courses are there? How many Course IDs are there? How many Module Codes are there? How many Module Names are there? How many Grades are there?

Example 1 - Students. Attributes: Student ID, Student Name, Course, Course ID and, indented beneath them as a group, Module Code, Module Name, Grade. The indented data is a repeating group, so we need to put it into a new table. This table will describe the modules a student is taking. We will call it Student Module.

Example 1 - Students. We now have two tables. Student (Student ID, Student Name, Course, Course ID) holds the student details; its primary key is Student ID. Student Module (Student ID, Module Code, Module Name, Grade) holds a student's module details; its primary key is Student ID together with Module Code, which is called a compound key.
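
The split into two tables can be sketched in Python as follows. The function name to_first_normal_form and the sample rows are assumptions for illustration; dictionaries keyed by the primary key stand in for the two tables.

```python
# Flat rows in the order:
# (student_id, student_name, course, course_id, module_code, module_name, grade)
# The data is hypothetical.
flat_rows = [
    (1, "Ann", "Computing", "C1", "M1", "Databases", "A"),
    (1, "Ann", "Computing", "C1", "M2", "Programming", "B"),
    (2, "Bob", "Accounting", "C2", "M1", "Databases", "C"),
]


def to_first_normal_form(flat_rows):
    """Move the repeating group (module details) into its own table."""
    students = {}          # key: student_id
    student_modules = {}   # compound key: (student_id, module_code)
    for sid, sname, course, cid, mcode, mname, grade in flat_rows:
        students[sid] = (sname, course, cid)
        student_modules[(sid, mcode)] = (mname, grade)
    return students, student_modules


students, student_modules = to_first_normal_form(flat_rows)
print(students)
print(student_modules)
```

Each student's details are now stored once, while the module details are identified by the compound key of student ID and module code.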

Example 1 - Students. Update anomaly: what if Big John changes his name? Delete anomaly: what if we remove Terrence Halfwit from the database? Insert anomaly: we still cannot add a new course unless we have a student ID.

Yes But No But. There are still anomalies. Update: if the name of Creative Accounting is changed. Insert: we cannot add a new module unless we have a student enrolled. Delete: when a student leaves we could lose course information. These are dealt with by later normal forms.

Example 2 - Library. Attributes: Student ID, Name, Faculty, Book ID, Title, Author, Return Date. Put this data into First Normal Form.

Example 3. Attributes: Customer ID, Customer Name, Address, Branch No, Branch Manager, Stock ID, Title, Format. Put this data into First Normal Form.

Remember. 1NF can be considered as normalised, but it doesn't solve all of our problems. We need to go through Second and Third Normal Forms in tutorials and next week.

Second Normal Form. Only applies to tables with compound keys (tables without compound keys are already in 2NF). Data in a table must depend on the whole key. We must remove any partial dependencies.
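
As a rough sketch of what a partial dependency looks like, the Python below tests whether an attribute is determined by only part of a compound key. It assumes a Student Module table like the one in Example 1; the helper name determines and the data are hypothetical. A check on sample rows can only refute a dependency, never prove it, so the result still has to be confirmed against an understanding of the data.

```python
def determines(rows, lhs, rhs):
    """True if, in these rows, equal values of lhs always go with one value of rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True


# Hypothetical Student Module rows; compound key = (student_id, module_code).
student_module = [
    {"student_id": 1, "module_code": "M1", "module_name": "Databases", "grade": "A"},
    {"student_id": 1, "module_code": "M2", "module_name": "Programming", "grade": "B"},
    {"student_id": 2, "module_code": "M1", "module_name": "Databases", "grade": "C"},
]

# module_name depends on module_code alone: a partial dependency.
name_is_partial = determines(student_module, ["module_code"], "module_name")

# grade needs the whole key: neither half of the key determines it.
grade_by_module = determines(student_module, ["module_code"], "grade")
grade_by_student = determines(student_module, ["student_id"], "grade")

# Removing the partial dependency: module names move to their own table.
modules = {r["module_code"]: r["module_name"] for r in student_module}
grades = {(r["student_id"], r["module_code"]): r["grade"] for r in student_module}

print(name_is_partial, grade_by_module, grade_by_student)
print(modules)
print(grades)
```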