Foundations of Information Management




Foundations of Information Management - WS 2012/13 - Juniorprofessor Alexander Markowetz Bonn Aachen International Center for Information Technology (B-IT)

Data & Databases
- Data: simple information
- Database: collection of interrelated data
- Examples:
  - Banking: all transactions
  - Airlines: reservations, schedules
  - Universities: registration, grades
  - Sales: customers, products, purchases

Database (Management) Systems
- Software to access data
- Convenient and efficient to use
- [Diagram: users and application programs access the DB through the DBMS]

Amazon: A really big DBMS
- [Diagram: DBMS and DB connected to Purchasing, Customers, Web Shop, Warehouse & Shipping, External Vendors, and Advertising]
- Plus many more (external) connections.

Commercial DBMS
- The Big Three: Oracle, IBM DB2, MS SQL Server
- Others: Sybase, Informix (now IBM), Ingres
- Open Source: PostgreSQL, MySQL
- Office Toys: MS Access

Databases in Life Science
- Most databases in the life sciences do not use a DBMS!
- Hundreds of databases in biology, chemistry, pharmacy, or medicine are based on dedicated (system-specific) text file formats, which come with very limited software support (if any).
- This lecture familiarizes you with the ideal of a database + DBMS, so that you can properly judge how much DBMS you need.
- There are cases where using a full DBMS would be overkill; sometimes a less powerful system is more appropriate.
- There is a big turn towards moving LS databases to a stable and powerful general-purpose DBMS, so you ought to know the basic principles of database technology.
- At the end of the lecture, we will look at alternatives to (real) database systems, though.

Before Database Systems
- Binary files:
    0100 1001 0101 0001 0101 0001 0101 0101 0101 1100 1111 1100 0110
- Text files:
    01, Alexander, Markowetz, Professor
    02, Bob, Benson, Truck Driver
    03, Janice, Watson, Nurse

Drawbacks of storing data in files (1)
- Data redundancy and inconsistency
  - Multiple file formats
  - Duplication of information in different files
- Difficulty in accessing data
  - Need to write a new program to carry out each new task
- Integrity problems
  - Integrity constraints (e.g. account balance > 0) become part of program code
  - Hard to add new constraints or change existing ones
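To make the integrity point concrete, here is a minimal sketch of the same constraint declared once in the schema instead of being repeated in program code. It uses Python's built-in sqlite3 module; the table name account, its columns, and the helper try_insert are assumptions made for illustration and are not part of the lecture.

```python
import sqlite3

# In-memory database; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " account_number TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance > 0))"
)
conn.execute("INSERT INTO account VALUES ('A-101', 500)")


def try_insert(number, balance):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO account VALUES (?, ?)", (number, balance))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("A-102", 50)    # satisfies balance > 0
rejected = try_insert("A-103", -20)   # violates the CHECK constraint
```

Every program that writes to the table is subject to the same check, so changing the rule means changing the schema rather than each application.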

Drawbacks of storing data in files (2)
- Atomicity of updates
  - Failures may leave the database in an inconsistent state
  - E.g. a transfer of funds between accounts should either complete or not happen at all
- Concurrent access by multiple users
  - Concurrent accesses are needed for performance
  - Uncontrolled concurrent accesses can lead to inconsistencies
  - E.g. two people reading a balance and updating it at the same time
- Security problems
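The funds-transfer example can be sketched with a transaction. This is an illustration under assumptions, not the lecture's own code: the account table, the non-negative balance rule, and the transfer and balances helpers are hypothetical. The credit is deliberately executed before the debit so that a failing debit shows the credit being rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Current balances as a dict mapping account id to balance."""
    return dict(conn.execute("SELECT id, balance FROM account"))
```

A transfer that would overdraw the source account fails at the second UPDATE, and the first UPDATE is undone with it, so the balances never show a half-finished transfer.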

Example (1)
- Alex writes a program to manage the addresses of all students at this university
  - He uses a text file to store the addresses: Name, Address, Program of Study
  - He has to write code that parses the text lines
  - He has to write code to ensure that the name of a student cannot become null
- When he wants to add another data field, Age
  - He has to change all of the above code
- Two separate departments need access to this data
  - Each keeps its own copy
  - Over time, the two databases will drift apart and become inconsistent

Example (2)
- Whenever Alex introduces any change in the data format
  - He has to change all the above code, yet again, at both departments
- When he implements another project for the university
  - He has to write all the above again
- Still, his code is full of errors, does not allow two users to access data at the same time, and lacks many other features
- DBMS solve all the above problems

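Data Independence
- Application program is isolated from the way that data is stored in the DBMS
- DBMS is isolated from hardware
- Achieved in a 3-layer architecture:
  - View level (application)
  - Logical level
  - Physical level
- Logical independence: between the view level and the logical level
- Physical independence: between the logical level and the physical level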

Parts of a Database
- Database schema
  - Metadata, data about data
  - Describes the structure of the data
  - What sets (tables) of data there are
  - Which data fields (attributes) they contain
- Database instance
  - The actual data stored in the database
  - At this moment!
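The difference between schema and instance can be seen by querying both. The sketch below assumes a hypothetical student table modeled on the address example; the helper names schema and instance are made up for illustration, and PRAGMA table_info is SQLite-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT NOT NULL, address TEXT, program TEXT)")


def schema(conn, table):
    """Column names of a table: metadata, i.e. data about data."""
    # Table name is interpolated directly; acceptable only in a sketch.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def instance(conn, table):
    """Rows currently stored in the table: the instance at this moment."""
    return conn.execute(f"SELECT * FROM {table}").fetchall()


before = instance(conn, "student")
conn.execute("INSERT INTO student VALUES ('Bob Benson', 'Bonn', 'Computer Science')")
after = instance(conn, "student")
```

Inserting a row changes the instance, while the schema stays the same until a schema-changing statement is issued.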

Database Design (1)
1) Requirements Analysis
   - Analyze real world, user needs & requirements
   - Informal process, client interviews, etc.
2) Conceptual Design
   - High-level description of data to be stored
   - Results in an ER model
3) Logical Design
   - Convert conceptual design into a relational database schema

Database Design (2)
4) Schema Refinement
   - Analyze and refine logical schema
   - Guided by powerful and elegant theory
5) Physical Design
   - Address database performance
   - Create indexes
6) Application and Security Design

Interacting with a Database
- Data Definition Language (DDL)
  - Describes the schema
- Data Manipulation Language (DML)
  - Insert, delete and update data objects
  - Retrieve data (query the database)
- There are graphical tools as well; these too fall into the two categories above
- SQL comprises both a DDL and a DML
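A short sketch of the two sublanguages, run through Python's sqlite3 module. The student table and its rows are illustrative assumptions that reuse the names from the earlier text-file example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: statements that describe or change the schema.
conn.execute("CREATE TABLE student (name TEXT NOT NULL, address TEXT, program TEXT)")
conn.execute("ALTER TABLE student ADD COLUMN age INTEGER")

# DML: statements that insert, update, delete and retrieve data.
conn.execute(
    "INSERT INTO student (name, address, program) VALUES ('Janice Watson', 'Aachen', 'Nursing')"
)
conn.execute(
    "INSERT INTO student (name, address, program) VALUES ('Bob Benson', 'Bonn', 'Logistics')"
)
conn.execute("UPDATE student SET age = 24 WHERE name = 'Janice Watson'")
conn.execute("DELETE FROM student WHERE name = 'Bob Benson'")
rows = conn.execute("SELECT name, age FROM student ORDER BY name").fetchall()
```

Adding the age column is a single DDL statement here, in contrast to the text-file approach, where every parser would need to change.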

Thinking Databases
- As seen above, there are many benefits to using a DBMS
- However, there is one more:
  - Entity Relationship Diagrams: a formal way to design data
  - Relational Algebra: a formal way to query data

Basic Concepts of ER
- A database can be modeled as
  - a collection of entities
  - relationships among entities
- Entity: an object that exists independently and is distinguishable from other objects
  - An employee, a company, a car, a student, a class, etc.
  - Color, age, etc. are not entities

Entity set: entities of the same type
- E.g., a set of employees, a set of departments
- Also called entity types
- Entity type: Employee (a general specification)
- Entity set: e1, e2, e3 (the actual employees)

Attributes
- Properties of an entity
  - Name, address, weight, height are properties of a Person entity
- Properties of relationships
  - Date of marriage is a property of the relationship Marriage

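Types of Attributes
- Simple attribute: contains a single value
- [Diagram: Employee with attributes EmpNo, Name, Address]

Composite Attributes
- [Diagram: Employee with attributes EmpNo, Name, and the composite attribute Address made up of Street, City, Country]

Multivalued attributes: more than one value
- [Diagram: Employee with multivalued attributes Phone and Email]

Derived attributes: computed from others
- [Diagram: Employee with attribute Date of birth and derived attribute Age]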
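The four attribute types can be sketched as a data structure. This is only an illustration: the class and field names follow the slide labels, while the age method and its today parameter are assumptions introduced here.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Address:
    """Composite attribute: made up of several component attributes."""
    street: str
    city: str
    country: str


@dataclass
class Employee:
    emp_no: int                      # simple attribute: a single value
    name: str                        # simple attribute
    address: Address                 # composite attribute
    date_of_birth: date
    phones: list[str] = field(default_factory=list)   # multivalued attribute
    emails: list[str] = field(default_factory=list)   # multivalued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        birthday_passed = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if birthday_passed else 1)
```

Because age is computed on demand, it cannot drift out of sync with the stored date of birth.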

Key Attributes
- A set of attributes that can uniquely identify an entity
- ERD: [Diagram: Employee with key attribute EmpNo and attribute Name]
- Tabular:

    EmpNo    Name          ...
    123456   John Wong     ...
    456789   Mary Cheung   ...
    146777   John Wong     ...

Key Attributes
- Composite key: Name or Address alone cannot uniquely identify a student, but together they can!
- [Diagram: Student with key attributes Name and Address]

Key Attributes
- An entity may have more than one key
- Candidate key
  - A minimal set of attributes that uniquely identifies an entity
- Primary key
  - One candidate key is selected to be the primary key
- Sometimes artificial keys may be created
  - E.g. we can enumerate all employees in a company
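The definition of a candidate key (unique and minimal) can be checked mechanically against sample data. The sketch below is an illustration with hypothetical function names. Note the limitation: a key is a property of the schema, so a check against one instance can only rule candidates out; it cannot prove that a set of attributes is a key for all future data.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, attributes):
    """All minimal attribute sets that are unique in the given rows."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(k) <= set(attrs) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


# Sample data from the tabular example: two employees share a name.
employees = [
    {"emp_no": 123456, "name": "John Wong"},
    {"emp_no": 456789, "name": "Mary Cheung"},
    {"emp_no": 146777, "name": "John Wong"},
]
employee_keys = candidate_keys(employees, ["emp_no", "name"])
```

For the employee rows, only emp_no survives, since name repeats and the pair (emp_no, name) is not minimal.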

Example Entity (Customer)

Relationship
- A relationship is an association among several entities
- The degree refers to the number of entity sets that participate in a relationship set
  - Binary: two entity sets
  - More than two entity sets: very rare

Example of (Binary) Relationship
- Borrower is a relationship between Customers and Loans
- A customer can be associated with one or more loans
- And vice versa

Relationship Sets with Attributes
- Depositor is a relationship between Customers and Accounts
- Access-date is an attribute of Depositor

Cardinality Constraints
- We express cardinality constraints by drawing either a directed line (→), signifying one, or an undirected line (no arrowhead), signifying many, between the relationship set and the entity set.
- E.g. one-to-one relationship:
  - A customer is associated with at most one loan via the relationship borrower
  - A loan is associated with at most one customer via borrower

One-To-Many Relationship
- In the one-to-many relationship, a loan is associated with at most one customer via borrower
- A customer is associated with several (including 0) loans via borrower

Many-To-One Relationships
- In a many-to-one relationship, a loan is associated with several (including 0) customers via borrower
- A customer is associated with at most one loan via borrower

Many-To-Many Relationship
- A customer is associated with several (possibly 0) loans via borrower
- A loan is associated with several (possibly 0) customers via borrower
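The four cases can be told apart by counting partners on each side. The function below is a sketch with a hypothetical name; it takes borrower as a set of (customer, loan) pairs and reports the tightest cardinality consistent with that one instance. A real cardinality constraint is a design decision about all legal instances, so this is a check of sample data, not a way to derive the constraint.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a binary relationship set given as (left, right) pairs."""
    pairs = set(pairs)
    per_left = Counter(left for left, _ in pairs)
    per_right = Counter(right for _, right in pairs)
    # Each left entity has at most one partner on the right, and vice versa.
    left_at_most_one = all(n <= 1 for n in per_left.values())
    right_at_most_one = all(n <= 1 for n in per_right.values())
    if left_at_most_one and right_at_most_one:
        return "one-to-one"
    if right_at_most_one:
        return "one-to-many"   # one left entity, possibly many right entities
    if left_at_most_one:
        return "many-to-one"
    return "many-to-many"
```

With pairs written as (customer, loan), "one-to-many" matches the slide's reading: each loan has at most one customer, while a customer may have several loans.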

Participation of an Entity Set in a Relationship Set
- Total participation (indicated by a double line): every entity in the entity set participates in at least one relationship in the relationship set
  - E.g. participation of loan in borrower is total: every loan must have a customer associated with it via borrower
- Partial participation: some entities may not participate in any relationship in the relationship set
  - E.g. participation of customer in borrower is partial
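Total versus partial participation reduces to a subset test on sample data. The function name and the position argument below are assumptions for illustration; position 0 refers to the first component of each pair and position 1 to the second.

```python
def participation(entities, relationship_pairs, position):
    """Return 'total' if every entity occurs in some pair at the given position."""
    participating = {pair[position] for pair in relationship_pairs}
    return "total" if set(entities) <= participating else "partial"


customers = {"c1", "c2", "c3"}
loans = {"l1", "l2"}
borrower = [("c1", "l1"), ("c2", "l2")]   # (customer, loan) pairs

loan_participation = participation(loans, borrower, 1)
customer_participation = participation(customers, borrower, 0)
```

In this sample every loan has a borrower, while customer c3 has no loan, which mirrors the slide's example.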

Alternative Cardinality Notation
- Cardinality limits can also express participation constraints

Roles
- Entity sets of a relationship need not be distinct
- The labels manager and worker are called roles; they specify how employee entities interact via the works-for relationship set.
- Roles are indicated in E-R diagrams by labeling the lines that connect diamonds to rectangles.
- Role labels are optional, and are used to clarify the semantics of the relationship

Keys for Relationship Sets
- The combination of primary keys of the participating entity sets forms a superkey of a relationship set.
  - (customer-id, account-number) is the superkey of depositor
  - This means that a pair of entities can have at most one relationship in a particular relationship set.
  - E.g. if we wish to track all access-dates to each account by each customer, we cannot assume a relationship for each access.
  - Solution: use a multivalued attribute for access dates.
- Must consider the mapping cardinality of the relationship set when deciding the candidate keys
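The access-date example can be sketched with a mapping keyed by the superkey. The identifiers and dates below are invented for illustration. Because the key is the pair (customer_id, account_number), each pair appears at most once, and the set of dates plays the role of the multivalued attribute.

```python
from collections import defaultdict
from datetime import date

# depositor: at most one relationship per (customer_id, account_number) pair,
# with access dates as a multivalued attribute of that relationship.
depositor = defaultdict(set)


def record_access(customer_id, account_number, access_date):
    """Add an access date to the single relationship for this pair."""
    depositor[(customer_id, account_number)].add(access_date)


record_access("C-1", "A-101", date(2012, 10, 1))
record_access("C-1", "A-101", date(2012, 10, 8))   # same pair, second access
record_access("C-2", "A-101", date(2012, 10, 8))   # different customer
```

Two accesses by the same customer to the same account produce one relationship with two dates, not two relationships.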

Ternary Relationships
- Suppose employees of a bank may have jobs (responsibilities) at multiple branches, with different jobs at different branches.
- Then there is a ternary relationship set between the entity sets employee, job and branch

Binary vs. Non-Binary Relationships
- Some relationships that appear to be non-binary may be better represented using binary relationships
  - E.g. a ternary relationship parents, relating a child to his/her father and mother, is best replaced by two binary relationships, father and mother
  - Using two binary relationships allows partial information (e.g. only the mother being known)
- But there are some relationships that are naturally non-binary
  - E.g. works-on

Weak Entity Sets
- An entity set that does not have a primary key of its own is referred to as a weak entity set.
- The existence of a weak entity set depends on the existence of an identifying entity set
  - It must relate to the identifying entity set via a total, one-to-many relationship set from the identifying to the weak entity set
  - The identifying relationship is depicted using a double diamond
- The discriminator (or partial key) of a weak entity set is the set of attributes that distinguishes among all the entities of a weak entity set.
- The primary key of a weak entity set is formed by the primary key of the strong entity set on which the weak entity set is existence dependent, plus the weak entity set's discriminator.

Weak Entity Sets (Cont.)
- We depict a weak entity set by double rectangles.
- We underline the discriminator of a weak entity set with a dashed line.
- payment-number: discriminator of the payment entity set
- Primary key for payment: (loan-number, payment-number)
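One common way to carry the loan and payment example over to tables is sketched below using sqlite3. The column names amount and payment_amount and the sample rows are assumptions; the composite primary key (loan_number, payment_number) follows the slide, and ON DELETE CASCADE is one possible way to model existence dependence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE loan (loan_number TEXT PRIMARY KEY, amount INTEGER)")
conn.execute(
    "CREATE TABLE payment ("
    " loan_number TEXT NOT NULL REFERENCES loan(loan_number) ON DELETE CASCADE,"
    " payment_number INTEGER NOT NULL,"   # discriminator (partial key)
    " payment_amount INTEGER,"
    " PRIMARY KEY (loan_number, payment_number))"
)
conn.executemany("INSERT INTO loan VALUES (?, ?)", [("L-17", 1000), ("L-23", 2000)])
conn.executemany(
    "INSERT INTO payment VALUES (?, ?, ?)",
    # payment_number 1 exists for both loans; only the pair is unique
    [("L-17", 1, 100), ("L-17", 2, 100), ("L-23", 1, 250)],
)

# Deleting the identifying (strong) entity removes its dependent payments.
conn.execute("DELETE FROM loan WHERE loan_number = 'L-17'")
remaining = conn.execute(
    "SELECT loan_number, payment_number FROM payment"
).fetchall()
```

The payment number alone does not identify a payment; it only distinguishes payments within one loan, which is why the loan's key is part of the payment's primary key.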

Another example of weak entity type
- [Diagram: entity set Employee, identifying relationship Emp_Dep, weak entity set Dependent; attribute labels EmpNo, Name, Age]
- A child may not be old enough to have a passport number
- Even if he/she has a passport number, the company may not be interested in keeping it in the database.

Summary of Symbols (Cont.)

Design Decisions - Attribute vs Entity
- For each employee we want to store the office number, the location of the office (e.g., Building A, floor 6), and the telephone.
- Several employees share the same office
- Office as attribute: [Diagram: Employee with attributes Employee_id, Name, Office_number, Office_location, Office_phone]
- Office as entity: [Diagram: Employee with attributes Employee_id, Name, related to Office with attributes Office_number, Office_location, Office_phone]

ER Design Decisions - Entity vs Relationship
- Account example: can you see some differences? (E.g., can you have accounts without a customer?)
- Account as an entity: [Diagram: entity sets Customer, Account, Branch]
- Account as a relationship: [Diagram: relationship set Account between Customer and Branch]

ER Design Decisions - Entity vs Relationship
- You want to record the period that an employee works for some department.
- Works_In2: [Diagram: Employees (ssn, name, lot) and Departments (did, dname, budget) linked by Works_In2 with attributes from, to]
- Works_In3: [Diagram: Employees (ssn, name, lot), Departments (did, dname, budget) and Duration (from, to) linked by Works_In3]
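One practical difference between the two designs can be sketched in a few lines. The reading assumed here is that a relationship is identified by its participating entities, so Works_In2 can hold only one (from, to) period per employee and department, whereas making Duration an entity set lets the same pair be related several times. The ssn value, department number, and dates are invented for illustration.

```python
# Works_In2: from/to are attributes of the relationship, keyed by (ssn, did).
works_in2 = {}
works_in2[("111-11-1111", 51)] = ("2010-01-01", "2010-12-31")
works_in2[("111-11-1111", 51)] = ("2012-01-01", "2012-06-30")  # replaces the first period

# Works_In3: Duration is an entity set, so the relationship is keyed by
# (ssn, did, duration) and several periods can coexist.
works_in3 = set()
works_in3.add(("111-11-1111", 51, ("2010-01-01", "2010-12-31")))
works_in3.add(("111-11-1111", 51, ("2012-01-01", "2012-06-30")))
```

If an employee can work in the same department during more than one period, the second design is the one that can record it.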

ER Design Decisions - Strong vs. Weak Entity
- Example: what if, in the accounts example,
  - an account must be associated with exactly one branch, and
  - two different branches are allowed to have accounts with the same number?
- [Diagram: Account with attribute Number, Branch with attribute Branch_id]