Week 1 Part 1: An Introduction to Database Systems. Databases and DBMSs. Why Use a DBMS? Why Study Databases??



Similar documents
Introduction to Database Systems. Module 1, Lecture 1. Instructor: Raghu Ramakrishnan UW-Madison

Topics. Introduction to Database Management System. What Is a DBMS? DBMS Types

Introduction to Database Systems CS4320. Instructor: Christoph Koch CS

Database Management Systems. Chapter 1

Logistics. Database Management Systems. Chapter 1. Project. Goals for This Course. Any Questions So Far? What This Course Cannot Do.

COMP5138 Relational Database Management Systems. Databases are Everywhere!

ECS 165A: Introduction to Database Systems

CS2Bh: Current Technologies. Introduction to XML and Relational Databases. Introduction to Databases. Why databases? Why not use XML?

Transaction Management Overview

Database Systems. Lecture 1: Introduction

Overview. Introduction to Database Systems. Motivation... Motivation: how do we store lots of data?

Information Systems Analysis and Design CSC John Mylopoulos Database Design Information Systems Analysis and Design CSC340

1 File Processing Systems

INTRODUCTION TO DATABASE SYSTEMS

Chapter 1: Introduction. Database Management System (DBMS) University Database Example

Overview of Data Management

Database Management. Chapter Objectives

Overview of Database Management Systems

Overview of Database Management

Chapter 1: Introduction

Introduction. Introduction: Database management system. Introduction: DBS concepts & architecture. Introduction: DBS versus File system

Principles of Database. Management: Summary

SQL Programming. CS145 Lecture Notes #10. Motivation. Oracle PL/SQL. Basics. Example schema:

Chapter 1: Introduction. Database Management System (DBMS)

Introduction: Database management system

Demystified CONTENTS Acknowledgments xvii Introduction xix CHAPTER 1 Database Fundamentals CHAPTER 2 Exploring Relational Database Components

Introduction to database management systems

Chapter 1: Introduction

1. INTRODUCTION TO RDBMS

DATABASE MANAGEMENT SYSTEM

Database Management Systems

Database System Architecture & System Catalog Instructor: Mourad Benchikh Text Books: Elmasri & Navathe Chap. 17 Silberschatz & Korth Chap.

Introduction to Database Systems. Chapter 1 Introduction. Chapter 1 Introduction

Basic Concepts of Database Systems

CSE 132A. Database Systems Principles

Course Content. Transactions and Concurrency Control. Objectives of Lecture 4 Transactions and Concurrency Control

Introduction to Databases

FROM RELATIONAL TO OBJECT DATABASE MANAGEMENT SYSTEMS

CC414 Database Management Systems

CS2Bh: Current Technologies. Introduction to XML and Relational Databases. The Relational Model. The relational model

Module 4 Creation and Management of Databases Using CDS/ISIS

Introductory Concepts

VALLIAMMAI ENGNIEERING COLLEGE SRM Nagar, Kattankulathur

ICOM 6005 Database Management Systems Design. Dr. Manuel Rodríguez Martínez Electrical and Computer Engineering Department Lecture 2 August 23, 2001

Database Tuning and Physical Design: Execution of Transactions

Chapter 9, More SQL: Assertions, Views, and Programming Techniques

Database System Concepts

Transactions and the Internet

Transactions and Recovery. Database Systems Lecture 15 Natasha Alechina

Relational Database Basics Review

Lecture 7: Concurrency control. Rasmus Pagh

Concepts of Database Management Seventh Edition. Chapter 7 DBMS Functions

Files. Files. Files. Files. Files. File Organisation. What s it all about? What s in a file?

DBMS Questions. 3.) For which two constraints are indexes created when the constraint is added?

The Relational Model. Ramakrishnan&Gehrke, Chapter 3 CS4320 1

Unit 12 Database Recovery

The Relational Model. Why Study the Relational Model?

Chapter 3. Database Environment - Objectives. Multi-user DBMS Architectures. Teleprocessing. File-Server

History of Database Systems

Chapter 2 Database System Concepts and Architecture

Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline. We ll learn: Faloutsos CMU SCS

Course Notes on A Short History of Database Technology

Course Notes on A Short History of Database Technology

ORACLE DATABASE 11G: COMPLETE

Chapter 13 File and Database Systems

Chapter 13 File and Database Systems

Lesson 8: Introduction to Databases E-R Data Modeling

Introduction to Database Management Systems

The Relational Model. Why Study the Relational Model? Relational Database: Definitions

Object Oriented Databases. OOAD Fall 2012 Arjun Gopalakrishna Bhavya Udayashankar

Microsoft SQL Server for Oracle DBAs Course 40045; 4 Days, Instructor-led

Recovery and the ACID properties CMPUT 391: Implementing Durability Recovery Manager Atomicity Durability


Lecture 6. SQL, Logical DB Design

Database Systems Introduction Dr P Sreenivasa Kumar

What is a database? COSC 304 Introduction to Database Systems. Database Introduction. Example Problem. Databases in the Real-World

Oracle Database: SQL and PL/SQL Fundamentals NEW

Introduction to Object-Oriented and Object-Relational Database Systems

Database Concepts. Database & Database Management System. Application examples. Application examples

Objectives. Distributed Databases and Client/Server Architecture. Distributed Database. Data Fragmentation

MyOra 3.0. User Guide. SQL Tool for Oracle. Jayam Systems, LLC

The core theory of relational databases. Bibliography

Course: CSC 222 Database Design and Management I (3 credits Compulsory)

Oracle Database Links Part 2 - Distributed Transactions Written and presented by Joel Goodman October 15th 2009

æ A collection of interrelated and persistent data èusually referred to as the database èdbèè.

Introduction. Chapter 1. Introducing the Database. Data vs. Information

The ConTract Model. Helmut Wächter, Andreas Reuter. November 9, 1999

Foreign and Primary Keys in RDM Embedded SQL: Efficiently Implemented Using the Network Model

Introduction to Database Systems

Oracle Database 11 g Performance Tuning. Recipes. Sam R. Alapati Darl Kuhn Bill Padfield. Apress*

Datenbanksysteme II: Implementation of Database Systems Recovery Undo / Redo

Oracle. Brief Course Content This course can be done in modular form as per the detail below. ORA-1 Oracle Database 10g: SQL 4 Weeks 4000/-

The Classical Architecture. Storage 1 / 36

Chapter 1 Databases and Database Users

Database System. Session 1 Main Theme Introduction to Database Systems Dr. Jean-Claude Franchitti

Relational Database Systems 2 1. System Architecture

Transcription:

Week 1 Part 1: An Introduction to Database Systems Databases and DBMSs Data Models and Data Independence Concurrency Control and Database Transactions Structure of a DBMS DBMS Languages Databases and DBMSs Database: A very large, integrated collection of data. Examples: databases of customers, products,... There are huge databases out there, for satellite and other scientific data, digitized movies,...; up to hexabytes of data (i.e., 10 18 bytes) A database usually models (some part of) a real-world enterprise. Entities (e.g., students, courses) Relationships (e.g., Paolo is taking CS564) A Database Management System (DBMS) is a software package designed to store and manage databases. Introduction 1 Introduction 2 Why Use a DBMS? Data independence and efficient access You don t need to know the implementation of the database to access data; queries are optimized. Reduced application development time Queries can be expressed declaratively, programmer doesn t have to specify how they are evaluated. Data integrity and security (Certain) constraints on the data are enforced automatically. Uniform data administration. Concurrent access, recovery from crashes Many users can access/update the database at the same time without any interference. Introduction 3 Why Study Databases?? Shift from computation to information: Computers were initially conceived as neat devices for doing scientific calculations; more and more they are used as data managers. Datasets increasing in diversity and volume: Digital libraries, interactive video, Human Genome project, EOS project need for DBMS technology is exploding! DBMS technology encompasses much of Computer Science: OS, languages, theory, AI, multimedia, logic,... Introduction 4

Data Models Levels of Abstraction A data model is a collection of concepts for describing data. A database schema is a description of the data that are contained in a particular database. The relational model of data is the most widely used data model today. Main concept: relation, basically a table with rows and columns. A relation schema, describes the columns, or attributes, or fields of a relation. Many views, single logical schema and physical schema. Views (also called external schemas) describe how users see the data. Logical schema* defines logical structure Physical schema describes the files and indexes used. * Called conceptual schema back in the old days. View 1 View 2 View 3 Logical Schema Physical Schema Introduction 5 Introduction 6 Example: University Database Logical schema: Students(Sid:String, Name:String, Login: String, Age:Integer,Gpa:Real) Courses(Cid:String, Cname:String, Credits: Integer) Enrolled(Sid:String, Cid:String, Grade:String) Physical schema: Relations stored as unordered files. Index on first column of Students. (One) External Schema (View): CourseInfo(Cid:String, Enrollment:Integer) Tables Represent Relations Students Courses Sid 00243 01786 02699 02439 Cid csc340 csc343 ece268 csc324 Name Paolo Maria Klaus Eric Login pg mf klaus eric Cname Age 21 20 19 19 Gpa 4.0 3.6 3.4 3.1 Rqmts Engineering Databases Operating Systems Programming Langs Credits 4 6 3 4 Introduction 7 Introduction 8

Data Independence Concurrency Control Applications insulated from how data is structured and stored: (See also 3-layer schema structure.) Logical data independence: Protection from changes in the logical structure of data. Physical data independence: Protection from changes in the physical structure of data. One of the most important benefits of database technology! Introduction 9 Concurrent execution of user programs is essential for good DBMS performance. Because disk accesses are frequent, and relatively slow, it is important to keep the CPU humming by working on several user programs concurrently. Interleaving actions of different user programs can lead to inconsistency: e.g., cheque is cleared while account balance is being computed. DBMS ensures that such problems don t arise: users can pretend they are using a single-user system. Introduction 10 Database Transactions Key concept is transaction, which is an atomic sequence of database actions (reads/writes). Each transaction executed completely, must leave the DB in a consistent state, if DB is consistent when the transaction begins. Users can specify some simple integrity constraints on the data, and the DBMS will enforce these constraints. Beyond this, the DBMS does not really understand the semantics of the data. (e.g., it does not understand how the interest on a bank account is computed). Thus, ensuring that a transaction (run alone) preserves consistency is ultimately the user s responsibility! Scheduling Concurrent Transactions DBMS ensures that execution of {T 1,..., T n } is equivalent to some serial execution of T 1,...,T n. Before reading/writing an object, a transaction requests a lock on the object, and waits till the DBMS gives it the lock. All locks are released at the end of the transaction. (Strict 2-phase locking protocol.) Idea: If an action of T i (say, writing X) affects T k (which perhaps reads X), one of them, say T i, will obtain the lock on X first and T k is forced to wait until T i completes; this effectively orders the transactions. What if T k already has a lock on Y and T i later requests a lock on Y? (Deadlock!) T i or T k is aborted and restarted! Introduction 11 Introduction 12

Ensuring Atomicity DBMSs ensure atomicity (all-or-nothing property), even if system crashes in the middle of a transaction. Idea: Keep a log (history) of all actions carried out by the DBMS while executing a set of transactions: Before a change is made to the database, the corresponding log entry is forced to a safe location. (WAL protocol; OS support for this is often inadequate.) After a crash, the effects of partially executed transactions are undone using the log. (Thanks to WAL, if log entry wasn t saved before the crash, corresponding change was not applied to database!) The Log The following actions are recorded in the log: T i writes an object: the old value and the new value; log record must go to disk before the changed page! T i commits/aborts: a log record indicating this action. Log records chained together by transaction id, so it s easy to undo a specific transaction (e.g., to resolve a deadlock). Log is often duplexed and archived on stable storage. All log related activities (and in fact, all CC-related activities such as lock/unlock, dealing with deadlocks etc.) are handled transparently by the DBMS. Introduction 13 Introduction 14 Databases Make Folks Happy... End users and DBMS vendors Database application programmers, e.g. smart webmasters Database administrators (DBAs) Design logical /physical schemas Handle security and authorization Data availability, crash recovery Database tuning as needs evolve Must understand how a DBMS works! Structure of a DBMS A typical DBMS has a layered architecture. The figure does not show the concurrency control and recovery components. This is one of several possible architectures; each system has its own variation. Query Optimization and Execution Relational Operators Files and Access Methods Buffer Management Disk Space Management DB These layers must consider concurrency control and recovery Introduction 15 Introduction 16

Database Languages SQL, an Interactive Language A DBMS supports several languages and several modes of use: Interactive textual languages, such as SQL; Interactive commands embedded in a host programming language (Pascal, C, Cobol, Java, etc.) Interactive commands embedded in ad-hoc development languages (known as 4GL), usually with additional features (e.g., for the production of forms, menus, reports,...) Form-oriented, non-textual user-friendly languages such as QBE. COURSES SELECT Course, Room, Building FROM Rooms, Courses WHERE Code = Room AND Floor= Ground" Course Room Floor Networks N3 Ground Systems N3 Ground ROOMS Code Building Floor DS1 Ex-OMI Ground N3 Ex-OMI Ground G Science Third Introduction 17 Introduction 18 SQL Embedded in Pascal write( city name''?'); readln(city); EXEC SQL DECLARE E CURSOR FOR SELECT NAME, SALARY FROM EMPLOYEES WHERE CITY = :city ; EXEC SQL OPEN E ; EXEC SQL FETCH E INTO :name, :salary ; while SQLCODE = 0 do begin write( employee:', name, raise?'); readln(raise); EXEC SQL UPDATE PERSON SET SALARY=SALARY+:raise WHERE CURRENT OF E EXEC SQL FETCH E INTO :name, :salary end; EXEC SQL CLOSE CURSOR E SQL Embedded in ad-hoc Language (Oracle PL/SQL) declare Sal number; begin select Sal into Salary from Emp where Code='5788' for update of Sal; if Salary>30M then update Emp set Sal=Salary*1.1 where Code='5788'; else update Emp set Sal=Salary*1.2 where Code='5788'; end if; commit; exception when no_data_found then insert into Errors values( No employee has given code',sysdate); end; Introduction 19 Introduction 20

Form-Based Interface (in Access) DBMS Languages DML data manipulation language DDL data definition language (allows defini-tion of database schema) 4GL fourth generation language, useful for declarative query processing, report generation Host Programming Language DBMS DML DDL 4GL Database Introduction 21 Introduction 22 DBMS Technology: Pros and Cons Pros Data are handled as a common resource. Centralized management and economy of scale. Availability of integrated services, reduction of redundancies and inconsistencies Data independence (useful for the development and maintenance of applications) Cons Costs of DBMS products (and associated tools), also of data migration. Difficulty in separating features and services (with potential lack of efficiency.) Conventional Files vs Databases Files Advantages many already exist; good for simple applications; very efficient Disadvantages data duplication; hard to evolve; hard to build for complex applications The future is with databases! Databases Advantages Good for data integration; allow for more flexible formats (not just records) Disadvantages high cost; drawbacks in a centralized facility Introduction 23 Introduction 24

Types of DBMSs Conventional relational, network, hierarchical, consist of records of many different record types (database looks like a collection of files) Object-Oriented database consists of objects (and possibly associated programs); database schema consists of classes (which can be objects too). Multimedia database can store formatted data (i.e., records) but also text, pictures,... Active databases database includes eventcondition-action rules Deductive databases* like large Prolog programs, not available commercially The Hierarchical Data Model Database consists of hierarchical record structures; a field may have as value a list of records; every record has at most one parent Book B365 Borrower 38 Elm Borrowing War & Peace $8.99 Toronto Jan 28, 1994 Feb 24, 1994 parent children Introduction 25 Introduction 26 The Network Data Model A database now consists of records with pointers (links) to other records. Offers a navigational view of a database. Customer Order Order 1::n link Ordered Part Part cycles of links are allowed Part Part Region Sales Sales History Comparing Data Models The oldest DBMSs were hierarchical, dating back to the mid-60s. IMS (IBM product) is the most popular among them. Many old databases are hierarchical. The network data model came next (early 70s). Views database programmer as navigator, chasing links (pointers, actually) around a database. The network model was found to be too implementation-oriented, not insulating sufficiently the programmer from implementation features of network DBMSs. The relational model is the most recent arrival. Relational databases are cleaner because they don t allow links/pointers (necessarily implementationdependent). Even though the relational model was proposed in 1970, it didn t take over the database market till the 80s. Introduction 27 Introduction 28

Summary DBMSs used to maintain and query large datasets. Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security. Levels of abstraction give data independence. A DBMS typically has a layered architecture. DBAs hold responsible jobs and are well-paid! DBMS R&D is one of the broadest, most exciting areas in CS. Introduction 29