Database Systems. National Chiao Tung University Chun-Jen Tsai 05/30/2012





Definition of a Database
Database system: a multidimensional data collection whose internal links between entries make the information accessible from a variety of perspectives.
Flat file system: a one-dimensional file storage system that presents its information from a single point of view.

Schemas
Schema: a description of the structure of an entire database, used by database software to maintain the database.
Sub-schema: a description of only the portion of the database pertinent to a particular user's needs, used to prevent sensitive data from being accessed by unauthorized personnel.

Database Management Systems
Database management system (DBMS): a software layer that maintains a database and manipulates it in response to requests from applications.
Distributed database: a database stored on multiple machines; the DBMS masks this organizational detail from its users.
Data independence: the ability to change the organization of a database without changing the application software that uses it.

Database Models
Database models: relational model, object-oriented model, hierarchical model.
The relational model is the most popular model:
The database is a collection of tables of information.
Each table is called a relation.
Each column in the table records an attribute.
Each row in the table is called a tuple.

Example of a Relation
A relation containing employee information; its schema can also be denoted as follows:
employee_info {
  Empl_Id: string
  Name: string
  Address: string
  SSN: int
}
primary key (Empl_Id, Name, SSN)

Relational Design Issues
In general, we want to avoid combining multiple concepts within one relation, because:
It can lead to redundant data.
Deleting a tuple could also delete necessary but unrelated information.

Decomposition
We can divide the columns of a relation into two or more relations, duplicating those columns necessary to maintain relationships; this technique is called decomposition.
Original relation: (Empl Id, Name, Address, Job Id, Job Title, Dept)
Decomposed relations: (Empl Id, Name, Address); (Empl Id, Job Id); (Job Id, Job Title, Dept)

[Figure: Example of Decomposition]

Example of Information Retrieval
[Figure: Finding the departments in which 23Y34 has worked]

Lossless Decomposition
Sometimes decomposition can cause loss of information.
A correct decomposition that does not lose information is called a lossless (non-loss) decomposition.
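A minimal Python sketch of what "lossless" means in practice: a decomposition is lossless when joining the pieces gives back exactly the original relation. The sample rows, the attribute names EmplId, JobId and Dept, and the helper functions are all hypothetical and chosen only for illustration.

```python
def project(relation, attrs):
    """Keep only the named columns, dropping duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


def natural_join(left, right):
    """Combine rows that agree on every attribute the two relations share."""
    if not left or not right:
        return []
    shared = [a for a in left[0] if a in right[0]]
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in shared)
    ]


def same_rows(a, b):
    """Compare two relations as sets of rows (row order is irrelevant)."""
    def as_set(rel):
        return {tuple(sorted(row.items())) for row in rel}
    return as_set(a) == as_set(b)


# Hypothetical data: employee E1 holds jobs J1 and J2, employee E2 holds J3.
original = [
    {"EmplId": "E1", "JobId": "J1", "Dept": "Sales"},
    {"EmplId": "E1", "JobId": "J2", "Dept": "Sales"},
    {"EmplId": "E2", "JobId": "J3", "Dept": "Sales"},
]

# Split on JobId (each job belongs to one department): the join rebuilds the original.
good = natural_join(project(original, ["EmplId", "JobId"]),
                    project(original, ["JobId", "Dept"]))

# Split on Dept: the join pairs every employee with every job in the department.
bad = natural_join(project(original, ["EmplId", "Dept"]),
                   project(original, ["JobId", "Dept"]))

print(same_rows(good, original))   # True
print(len(bad), len(original))     # 6 3
```

In the second split the join produces extra rows, so the information about which employee actually held which job can no longer be recovered; that is the loss the slide refers to.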

Relational Operations
Select: choose rows.
Project: choose columns.
Join: assemble information from two or more relations.
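The three operations can be sketched in a few lines of Python, treating a relation as a list of dictionaries. The function names, the two small relations and their rows are assumptions made for illustration; the query is the one the slides mention, finding the departments in which 23Y34 has worked.

```python
def select(relation, predicate):
    """SELECT: keep the rows for which the predicate holds."""
    return [row for row in relation if predicate(row)]


def project(relation, attrs):
    """PROJECT: keep the named columns and drop duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


def join(left, right, attr):
    """JOIN: pair rows from both relations that have equal values for attr."""
    return [{**l, **r} for l in left for r in right if l[attr] == r[attr]]


# Hypothetical rows for two of the decomposed relations.
assignment = [
    {"EmplId": "23Y34", "JobId": "S25X"},
    {"EmplId": "34Y70", "JobId": "F5"},
    {"EmplId": "23Y34", "JobId": "T5"},
]
job = [
    {"JobId": "S25X", "Dept": "Personnel"},
    {"JobId": "F5", "Dept": "Accounting"},
    {"JobId": "T5", "Dept": "Sales"},
]

# Departments in which employee 23Y34 has worked.
mine = select(assignment, lambda row: row["EmplId"] == "23Y34")
departments = project(join(mine, job, "JobId"), ["Dept"])
print(departments)  # [{'Dept': 'Personnel'}, {'Dept': 'Sales'}]
```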

[Figure: The SELECT Operation]

[Figure: The PROJECT Operation]

[Figure: The JOIN Operation (1/2)]

[Figure: The JOIN Operation (2/2)]

[Figure: An Application of the JOIN Operation]

Structured Query Language (SQL)
SQL is the most popular language for creating, modifying, retrieving, and manipulating information in relational database management systems.
SQL was originally designed by IBM in the 1970s and became an ISO international standard in 1987.
In SQL, the operations that manipulate tuples include insert, update, delete, and select.

SQL Examples
select EmplId, Dept
from ASSIGNMENT, JOB
where ASSIGNMENT.JobId = JOB.JobId and ASSIGNMENT.TermDate = '*';

insert into EMPLOYEE
values ('43212', 'Sue A. Burt', '33 Fair St.', 444661111);

delete from EMPLOYEE
where Name = 'G. Jerry Smith';

update EMPLOYEE
set Address = '1812 Napoleon Ave.'
where Name = 'Joe E. Baker';
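To try the four statements, they can be run against an in-memory SQLite database from Python's standard sqlite3 module. This is only a sketch: the table layouts, the starting rows, and the identifiers E1, E2, J1 and J2 are assumptions, and the termination-date column is spelled TermDate here, with '*' taken to mean that the assignment is still current.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed table layouts; the slides do not give the full schema.
conn.executescript("""
    create table EMPLOYEE (EmplId text primary key, Name text, Address text, SSN integer);
    create table JOB (JobId text primary key, JobTitle text, Dept text);
    create table ASSIGNMENT (EmplId text, JobId text, TermDate text);
""")

# Hypothetical starting rows.
conn.executemany("insert into EMPLOYEE values (?, ?, ?, ?)", [
    ("E1", "G. Jerry Smith", "1 Main St.", 111223333),
    ("E2", "Joe E. Baker", "2 Oak Ave.", 222334444),
])
conn.executemany("insert into JOB values (?, ?, ?)", [
    ("J1", "Manager", "Sales"),
    ("J2", "Clerk", "Personnel"),
])
conn.executemany("insert into ASSIGNMENT values (?, ?, ?)", [
    ("E1", "J1", "*"),
    ("E2", "J2", "2001-09-30"),
])

# The select example: who currently works in which department.
current = conn.execute("""
    select EmplId, Dept
    from ASSIGNMENT, JOB
    where ASSIGNMENT.JobId = JOB.JobId and ASSIGNMENT.TermDate = '*'
""").fetchall()

# The insert, delete and update examples.
conn.execute("insert into EMPLOYEE values ('43212', 'Sue A. Burt', '33 Fair St.', 444661111)")
conn.execute("delete from EMPLOYEE where Name = 'G. Jerry Smith'")
conn.execute("update EMPLOYEE set Address = '1812 Napoleon Ave.' where Name = 'Joe E. Baker'")
conn.commit()

employees = conn.execute("select Name, Address from EMPLOYEE order by Name").fetchall()
print(current)    # [('E1', 'Sales')]
print(employees)  # [('Joe E. Baker', '1812 Napoleon Ave.'), ('Sue A. Burt', '33 Fair St.')]
```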

Object-Oriented Databases
A database constructed by applying the object-oriented paradigm.
Each data entity is stored as a persistent object.
Relationships are indicated by links between objects.
The DBMS maintains inter-object links.
Example classes of objects: Employee (ID, name, address), Assignment (start/end dates), Job (title, skills).
[Figure: links among Employee, Assignment, and Job objects]

Advantages of OO Databases
Many database applications are designed using the OO paradigm, so why not the database itself?
OO design allows the implementation details of attributes to be hidden. Example: a name attribute can have different formats, so implementing it as an object is more flexible.
OO databases can handle exotic data types. Example: a multimedia data item is often composed of several attributes (audio, video, graphics, descriptions); OO design can encapsulate them in one data object.
OO databases can store intelligent entities. The intelligence is inside the methods of the object; if the database is smart, the DBMS can be simpler.

Maintaining Database Integrity (1/2)
A transaction is a sequence of operations that must all happen together. Example: transferring money between bank accounts.
The transaction log is a non-volatile record of each transaction's activities, built before the transaction is allowed to happen.
The commit point is the point at which the transaction has been recorded in the log.
Roll-back is the procedure for undoing a failed, partially completed transaction.
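The bank-transfer example can be sketched with Python's sqlite3 module, whose connection object commits when used as a context manager and rolls back if an exception escapes the block. The account table, the starting balances and the transfer function are hypothetical.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "update account set balance = balance - ? where id = ?", (amount, src))
            conn.execute(
                "update account set balance = balance + ? where id = ?", (amount, dst))
            (balance,) = conn.execute(
                "select balance from account where id = ?", (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False  # the partially completed transaction has been rolled back


def balances(conn):
    """Return all balances as a dict keyed by account id."""
    return dict(conn.execute("select id, balance from account"))


conn = sqlite3.connect(":memory:")
conn.execute("create table account (id integer primary key, balance integer)")
conn.executemany("insert into account values (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # succeeds and is committed
failed = transfer(conn, 1, 2, 500)  # overdraws account 1, so it is rolled back
print(ok, failed, balances(conn))   # True False {1: 70, 2: 80}
```

The second transfer debits and credits both accounts before the check fails, yet neither change survives, which is the all-or-nothing behavior the slide describes.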

Maintaining Database Integrity (2/2)
Simultaneous access problems: the incorrect summary problem and the lost update problem.
To prevent others from accessing data being used by a transaction, a locking mechanism is required:
Shared lock: used when reading data.
Exclusive lock: used when altering data.
A common way to resolve deadlock in a DBMS is the wound-wait protocol: in a hold-and-wait deadlock situation, the older transaction forcibly takes the data item held by the younger transaction.
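The two rules on this slide, lock compatibility and the wound-wait decision, can be written as small Python functions. This is a simplified sketch rather than a working lock manager; the Transaction class, the use of a start timestamp to decide which transaction is older, and the returned strings are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    name: str
    timestamp: int  # start time; a smaller value means an older transaction


def compatible(held: str, requested: str) -> bool:
    """Two locks can coexist only when both are shared (read) locks."""
    return held == "shared" and requested == "shared"


def resolve(requester: Transaction, holder: Transaction) -> str:
    """Wound-wait rule for a requester blocked by a lock that holder owns."""
    if requester.timestamp < holder.timestamp:
        # Older requester: the younger holder is rolled back and gives up the item.
        return "wound"
    # Younger requester: it waits for the older holder to finish.
    return "wait"


older = Transaction("T1", timestamp=1)
younger = Transaction("T2", timestamp=2)

print(compatible("shared", "shared"))            # True
print(compatible("shared", "exclusive"))         # False
print(resolve(requester=older, holder=younger))  # wound
print(resolve(requester=younger, holder=older))  # wait
```

Because a younger transaction never forces an older one to give up an item, two transactions cannot end up waiting on each other indefinitely.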

File Structure for Databases
The structure of a simple employee file can be implemented as a text file.

Indexed Files
An index is a list of (key, location) pairs, sorted by key value, where location is where the record is stored.
An index file provides fast access to items in a file.
[Figure: Opening an indexed file]
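A small Python sketch of an indexed file, assuming each record is one text line and the location stored in the index is the byte offset of that line. The records, the in-memory io.BytesIO stand-in for a file on disk, and the lookup function are hypothetical.

```python
import bisect
import io

# Hypothetical employee records, deliberately not in key order.
records = [
    ("34Y70", "Cheryl H. Clark"),
    ("23Y34", "G. Jerry Smith"),
    ("25X15", "Joe E. Baker"),
]

# Write the records to a file and note where each one starts.
data_file = io.BytesIO()
index = []
for key, name in records:
    index.append((key, data_file.tell()))  # (key, location) pair
    data_file.write(f"{key},{name}\n".encode("ascii"))

index.sort()                        # the index is kept sorted by key
keys = [key for key, _ in index]


def lookup(key):
    """Binary-search the index, then jump straight to the record."""
    pos = bisect.bisect_left(keys, key)
    if pos == len(keys) or keys[pos] != key:
        return None
    data_file.seek(index[pos][1])
    return data_file.readline().decode("ascii").rstrip("\n")


print(lookup("25X15"))  # 25X15,Joe E. Baker
print(lookup("99Z99"))  # None
```

Only the index is searched; the data file itself is read once, at the offset the index supplies.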

Hash Files
Another technique for fast access to a file is called hashing.
Each record has a key.
The master file is divided into buckets.
A hash function computes a bucket number for each key value.
Each record is stored in the bucket corresponding to the hash of its key.

Hashing Example
[Figure: Hashing the key field value 25X3Z to one of 41 buckets]
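The text of this slide does not spell out the hash function, so the Python sketch below assumes one common textbook choice: read the key's ASCII bytes as a single large integer and take the remainder after dividing by the number of buckets. The function name bucket_for is hypothetical.

```python
def bucket_for(key: str, bucket_count: int = 41) -> int:
    """Map a key to a bucket number in the range 0..bucket_count-1."""
    # Interpret the ASCII bytes of the key as one big-endian integer.
    value = int.from_bytes(key.encode("ascii"), "big")
    return value % bucket_count


print(int.from_bytes(b"25X3Z", "big"))  # 215643337562
print(bucket_for("25X3Z"))              # 3
```

Under this assumption the record with key 25X3Z would be stored in bucket 3 of the 41 buckets.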

Collisions in Hashing
A collision happens when two keys hash to the same bucket.
Collisions become a major problem when the table is over 75% full.
Solution: increase the number of buckets and rehash all data.
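The "add buckets and rehash" remedy can be sketched as a small Python class. Several details are assumptions: fullness is measured as records per bucket, colliding records are kept in a list inside the bucket, the bucket count grows to twice the old count plus one, and the hash function treats the key's ASCII bytes as an integer. The HashFile class and its method names are hypothetical.

```python
class HashFile:
    """Toy hash file that rehashes once it is more than 75% full."""

    def __init__(self, bucket_count=4, max_load=0.75):
        self.buckets = [[] for _ in range(bucket_count)]
        self.count = 0
        self.max_load = max_load

    def _bucket(self, key):
        value = int.from_bytes(key.encode("ascii"), "big")
        return self.buckets[value % len(self.buckets)]

    def insert(self, key, record):
        bucket = self._bucket(key)
        for i, (existing, _) in enumerate(bucket):
            if existing == key:            # same key: overwrite the record
                bucket[i] = (key, record)
                return
        bucket.append((key, record))       # a collision just extends the list
        self.count += 1
        if self.count / len(self.buckets) > self.max_load:
            self._rehash(2 * len(self.buckets) + 1)

    def _rehash(self, new_count):
        """Create more buckets and redistribute every stored record."""
        old = [pair for bucket in self.buckets for pair in bucket]
        self.buckets = [[] for _ in range(new_count)]
        for key, record in old:
            self._bucket(key).append((key, record))

    def find(self, key):
        for existing, record in self._bucket(key):
            if existing == key:
                return record
        return None


table = HashFile(bucket_count=4)
for key in ["25X3Z", "23Y34", "34Y70"]:
    table.insert(key, f"record for {key}")
before = len(table.buckets)   # 3 records in 4 buckets: exactly 75%, no rehash yet
table.insert("25X15", "record for 25X15")
after = len(table.buckets)    # the fourth record exceeds 75%, so the table grows
print(before, after)          # 4 9
```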

Data Mining
Data mining is a set of techniques for discovering patterns in collections of data.
It relies heavily on statistical analyses.
A data warehouse is the static data collection to be mined.
A data cube is the data presented from many perspectives to enable mining.
Data mining raises significant ethical issues when it involves personal information.

Data Mining Strategies
Class description
Class discrimination
Cluster analysis
Association analysis
Outlier analysis
Sequential pattern analysis