Foundations of Databases




Vasilis Vassalos

What do we need to produce good software?
- Good programmers, software engineers, project managers, business experts, etc. (People)
- Good software development methodologies and practices (Processes)
- Good programming languages, compilers, etc. (Tools)

What Is a Database?
- A very large, integrated collection of data
- It models a real-world enterprise
  - Entities (e.g., customers, orders)
  - Relationships (e.g., Joe Smith bought a Corvette)
- A Database Management System (DBMS) is a software package designed to store and manage databases
  - Difficult software package (needs tending by a DBA)
  - Expensive software package

Databases and Database Management
- Database: a structured collection of data and information about entities (things) of interest
- Database Management System: a software application with which you can create, store, organize and retrieve data from one or many databases, e.g., Oracle, Sybase, Informix, DB2, Access
- Database Administrator: a person responsible for the development and management of an organization's databases

Why we need a DBMS
- A DBMS stores, manages and manipulates large amounts of data effectively and efficiently
- Most business processes and functions in big organizations generate, depend on, and use large amounts of data

A view of an e-bookseller's customer-oriented IT infrastructure
[Figure: web clients reach the site over the Internet through ISPs, telco, cable and wireless providers. Requests pass through a web server, application server, community server and transaction monitor, which talk via ODBC/SQL to a database server (DBMS) holding books, Zshops, and customer and order information. On the intranet, a data warehouse and OLAP server support marketing and data mining, and an order execution and procurement system connects to distributors and an automated warehouse.]

Two broad business uses for DBMSs
- Run the operational aspects of the business
  - Order entry, payroll, inventory management, etc.
  - Online transaction processing
- Help with decision-making
  - Measure the effectiveness of marketing campaigns, or find out the most profitable products
  - Online decision support
- Uses are converging
  - Supply chain execution

DBMS provides levels of abstraction
- Many views, a single conceptual (logical) schema and a physical schema
  - Views describe how users see the data
  - Conceptual schema defines logical structure
  - Physical schema describes the files and indexes used
- [Figure: View 1, View 2 and View 3 sit on top of the Conceptual Schema, which sits on top of the Physical Schema]
- Schemata are defined using DDL (Data Definition Language)
- Data are modified/queried using DML (Data Manipulation Language)

Example: Bookstore Database
- Conceptual schema:
  - Customers(cid: string, name: string, address: string, sex: string, category: integer)
  - Books(isbn: string, title: string, price: float)
  - Sales(cid: string, isbn: string, pdate: date)
- Physical schema:
  - Relations stored as unordered files
  - Index on first column of Customers
- External schema (view):
  - TSalesPerBook(isbn: string, total_sales: float)
  - Goodcustomers(cid: string, name: string)

Data Independence
- Applications insulated from how data are structured and stored
- Logical data independence: protection from changes in logical structure of data
- Physical data independence: protection from changes in physical structure of data
- One of the most important benefits of using a DBMS!
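The bookstore schema and the idea of physical data independence can be tried out with a small sketch. It uses Python's standard `sqlite3` module; the SQLite column types, the sample rows, the index names and the definition of `TSalesPerBook` (sum of book prices per ISBN) are all assumptions made for illustration, since the slides only give the view's signature.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual schema from the slide; the SQLite column types are an assumption.
conn.executescript("""
CREATE TABLE Customers (cid TEXT, name TEXT, address TEXT, sex TEXT, category INTEGER);
CREATE TABLE Books (isbn TEXT, title TEXT, price REAL);
CREATE TABLE Sales (cid TEXT, isbn TEXT, pdate TEXT);

-- External schema: one plausible definition of the TSalesPerBook view (assumed).
CREATE VIEW TSalesPerBook AS
    SELECT B.isbn AS isbn, SUM(B.price) AS total_sales
    FROM Sales S JOIN Books B ON S.isbn = B.isbn
    GROUP BY B.isbn;
""")

# Made-up sample rows.
conn.executemany("INSERT INTO Customers VALUES (?, ?, ?, ?, ?)",
                 [("c1", "Ann", "Athens", "F", 1), ("c2", "Bob", "Patras", "M", 2)])
conn.executemany("INSERT INTO Books VALUES (?, ?, ?)",
                 [("b1", "Book One", 10.0), ("b2", "Book Two", 25.5)])
conn.executemany("INSERT INTO Sales VALUES (?, ?, ?)",
                 [("c1", "b1", "2002-03-10"),
                  ("c2", "b1", "2002-03-11"),
                  ("c1", "b2", "2002-03-12")])


def total_sales_per_book():
    """The application only ever talks to the view."""
    return conn.execute(
        "SELECT isbn, total_sales FROM TSalesPerBook ORDER BY isbn").fetchall()


before = total_sales_per_book()

# Physical schema change: add indexes. The application query is untouched.
conn.execute("CREATE INDEX customers_cid_idx ON Customers(cid)")
conn.execute("CREATE INDEX sales_isbn_idx ON Sales(isbn)")

after = total_sales_per_book()
print(before, after)
```

The query against the view returns the same rows before and after the indexes are created, which is the point of physical data independence: storage decisions can change without touching application code.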

Database Development Process (Abstraction Revisited)
- Conceptual Data Modeling
  - Diagram, preliminary model
  - Technology independent
  - Example: Entity-Relationship diagrams
  - Decide what entities should be part of the database and what the relationships between them are
- Logical Database Design
  - Abstract model of database
  - Relational: tables
- Physical Database Design
  - How the database will be arranged
  - Technology dependent: DBMS (Database Management System)
- Database Implementation
- Database Maintenance

Why we need a DBMS
- Concurrent access, recovery from crashes (focus of class)
- Data integrity and security: preserve constraints on data, avoid corruption, control access
- Data independence: conceptual modeling should be independent of physical modeling
- Declarative language
- Efficient access (focus of class)
- Uniform data administration: manage the individual accounts, the customer info, the business loans
- Reduced application development time

Example: Bookstore Database
- Conceptual schema:
  - Customers(cid: string, name: string, address: string, sex: string, category: integer)
  - Books(isbn: string, title: string, price: float)
  - Sales(cid: string, isbn: string, pdate: date)
- Physical schema:
  - Relations stored as unordered files
  - Index on first column of Customers
- External schema (view):
  - TSalesPerBook(isbn: string, total_sales: float)
  - Goodcustomers(cid: string, name: string)

Data Models
- A data model is a vocabulary of primitives for describing data
- A schema is a description of a particular collection of data, using a given data model
- The relational model of data is the most widely used model today
  - Main primitive: relation, basically a table with rows and columns
  - Every relation has a schema, which describes the columns, or fields
  - Proposed by E.F. Codd in 1970

Querying a Database
- Find how many customers have bought "A random walk down Wall Street" on 10/3/2002
- S(tructured) Q(uery) L(anguage):
    select COUNT(P.cid)
    from Purchase P, Books B
    where P.isbn = B.isbn and P.date = '10/3/2002'
      and B.title = 'A random walk down Wall Street'
- User asks for what they need
- Query processor figures out how to answer the query efficiently
- Declarative language

E-R Diagram Example: EverFail
[Figure: E-R diagram. Entities: Car (VIN, Model, Make), Customer (Cust ID, Name), Work (WorkID, Workdescr, LaborCharge), Parts (PartID, Part Descr, Part Price). Relationships: Car Owned by Customer, Customer Approves Work, Work Includes Parts.]
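The counting query on the "Querying a Database" slide can be run as a minimal sketch with Python's `sqlite3`. Two assumptions are made here: the purchase table is named `Sales` with a `pdate` column, following the bookstore conceptual schema rather than the `Purchase`/`date` names used on the query slide, and all rows are invented. The sketch also shows that `COUNT(cid)` counts purchases, while `COUNT(DISTINCT cid)` counts customers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Books (isbn TEXT, title TEXT, price REAL);
CREATE TABLE Sales (cid TEXT, isbn TEXT, pdate TEXT);
""")

# Invented sample data.
conn.executemany("INSERT INTO Books VALUES (?, ?, ?)", [
    ("b1", "A random walk down Wall Street", 20.0),
    ("b2", "Another book", 15.0),
])
conn.executemany("INSERT INTO Sales VALUES (?, ?, ?)", [
    ("c1", "b1", "10/3/2002"),
    ("c1", "b1", "10/3/2002"),   # same customer buys a second copy
    ("c2", "b1", "10/3/2002"),
    ("c3", "b1", "11/3/2002"),   # different day
    ("c4", "b2", "10/3/2002"),   # different book
])

# Declarative: we state what we want; the engine decides how to compute it.
QUERY = """
SELECT COUNT(S.cid), COUNT(DISTINCT S.cid)
FROM Sales S, Books B
WHERE S.isbn = B.isbn AND S.pdate = ? AND B.title = ?
"""
purchases, customers = conn.execute(
    QUERY, ("10/3/2002", "A random walk down Wall Street")).fetchone()
print(purchases, customers)
```

If the question is really "how many customers", the `DISTINCT` form is the safer reading, because a customer who buys twice on the same day would otherwise be counted twice.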

The Complete Relations for EverFail
- Car(VIN, Model, Make, CustID)
- Customer(CustID, Name, Address, Phone)
- Work(WorkID, Workdescr, LaborCharge, CustID)
- Parts(PartsID, PartDescr, PartPrice)
- PartsUsed(WorkID, PartsID, Qty)

Querying a relational database
- (S)tructured (Q)uery (L)anguage: "intergalactic dataspeak"
- Basic SQL query:
    SELECT [DISTINCT] target-list
    FROM relation-list
    WHERE qualification
- relation-list: a list of relation names (possibly with a range-variable after each name)
- target-list: a list of attributes of relations in relation-list
- qualification: comparisons (Attr op const or Attr1 op Attr2, where op is one of <, >, =, ≤, ≥, ≠) combined using AND, OR and NOT

What can we do?
- Retrieve a subset of rows ("selection")
- Retrieve a subset of columns ("projection")
- Connect relations (a "join")
- Union, intersect relations

Example
- We will use these instances of the Sailors and Reserves relations in our examples.
- bid is Boat-id
- If the key for the Reserves relation contained only the attributes sid and day, how would the semantics differ?

R1:
    sid  bid  day
    22   101  10/10/96
    58   103  11/12/96

S1:
    sid  sname   rating  age
    22   dustin  7       45.0
    31   lubber  8       55.5
    58   rusty   10      35.0

S2:
    sid  sname   rating  age
    28   yuppy   9       35.0
    31   lubber  8       55.5
    44   guppy   5       35.0
    58   rusty   10      35.0
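The operations listed under "What can we do?" can each be tried against the S1, S2 and R1 instances. The following is a sketch using Python's `sqlite3`; the table names mirror the instance labels, and the particular predicates (for example `rating > 7`) are chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE S1 (sid INTEGER, sname TEXT, rating INTEGER, age REAL);
CREATE TABLE S2 (sid INTEGER, sname TEXT, rating INTEGER, age REAL);
CREATE TABLE R1 (sid INTEGER, bid INTEGER, day TEXT);
""")
conn.executemany("INSERT INTO S1 VALUES (?, ?, ?, ?)", [
    (22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)])
conn.executemany("INSERT INTO S2 VALUES (?, ?, ?, ?)", [
    (28, "yuppy", 9, 35.0), (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0), (58, "rusty", 10, 35.0)])
conn.executemany("INSERT INTO R1 VALUES (?, ?, ?)", [
    (22, 101, "10/10/96"), (58, 103, "11/12/96")])


def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Selection: a subset of rows.
selection = run("SELECT * FROM S1 WHERE rating > 7 ORDER BY sid")
# Projection: a subset of columns (DISTINCT removes duplicate ages).
projection = run("SELECT DISTINCT age FROM S2 ORDER BY age")
# Join: connect sailors to their reservations.
join = run("SELECT S.sname, R.bid FROM S1 S, R1 R "
           "WHERE S.sid = R.sid ORDER BY S.sid")
# Union and intersection of the two Sailors instances (on sid).
union = run("SELECT sid FROM S1 UNION SELECT sid FROM S2 ORDER BY sid")
intersection = run("SELECT sid FROM S1 INTERSECT SELECT sid FROM S2 ORDER BY sid")

for name in ("selection", "projection", "join", "union", "intersection"):
    print(name, globals()[name])
```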

Example Query
    SELECT S.sname
    FROM Sailors S, Reserves R
    WHERE S.sid=R.sid AND R.bid=103

Cross-product of S1 and R1:
    (sid)  sname   rating  age   (sid)  bid  day
    22     dustin  7       45.0  22     101  10/10/96
    22     dustin  7       45.0  58     103  11/12/96
    31     lubber  8       55.5  22     101  10/10/96
    31     lubber  8       55.5  58     103  11/12/96
    58     rusty   10      35.0  22     101  10/10/96
    58     rusty   10      35.0  58     103  11/12/96

Conceptual Evaluation Strategy
- Compute the cross-product of relation-list (all the ways to combine the tuples in the relations)
- Discard resulting tuples if they fail qualifications
- Delete attributes that are not in target-list
- If DISTINCT is specified, eliminate duplicate rows
- This strategy is probably the least efficient way to compute a query! An optimizer will find more efficient strategies to compute the same answers.
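The conceptual evaluation strategy translates almost line by line into code. The sketch below is plain Python over the S1 and R1 instances; the function name `evaluate` and the use of lambdas for the qualification and target-list are illustrative choices, not anything prescribed by the slides.

```python
from itertools import product

# S1 and R1 instances as tuples: (sid, sname, rating, age) and (sid, bid, day).
sailors = [(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)]
reserves = [(22, 101, "10/10/96"), (58, 103, "11/12/96")]


def evaluate(relations, qualification, target, distinct=False):
    """Naive evaluation of SELECT [DISTINCT] target FROM relations WHERE qualification."""
    rows = product(*relations)                        # 1. cross-product of relation-list
    rows = [r for r in rows if qualification(*r)]    # 2. discard tuples failing the qualification
    out = [target(*r) for r in rows]                 # 3. keep only target-list attributes
    if distinct:
        out = list(dict.fromkeys(out))               # 4. eliminate duplicates, keeping order
    return out


# SELECT S.sname FROM Sailors S, Reserves R WHERE S.sid = R.sid AND R.bid = 103
result = evaluate(
    [sailors, reserves],
    qualification=lambda s, r: s[0] == r[0] and r[1] == 103,
    target=lambda s, r: (s[1],),
)
cross_size = len(list(product(sailors, reserves)))
print(result, cross_size)
```

Even on this tiny instance the cross-product has six rows to produce a single answer, which is why a real optimizer avoids materializing it.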

Find sailors who've reserved at least one boat
    SELECT S.sid
    FROM Sailors S, Reserves R
    WHERE S.sid=R.sid
- What is the effect of replacing S.sid by S.sname in the SELECT clause?
- Would adding DISTINCT to this variant of the query make a difference?

Find the age of the youngest sailor with age ≥ 18, for each rating
    SELECT S.rating, MIN (S.age)
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating
- Only S.rating and S.age are mentioned in the SELECT and GROUP BY clauses; other attributes are "unnecessary"
- 2nd column of result is unnamed

Sailors instance:
    sid  sname    rating  age
    22   dustin   7       45.0
    31   lubber   8       55.5
    71   zorba    10      16.0
    64   horatio  7       35.0
    29   brutus   1       33.0
    58   rusty    10      35.0

After applying the WHERE clause and dropping unnecessary fields:
    rating  age
    1       33.0
    7       45.0
    7       35.0
    8       55.5
    10      35.0

Answer relation:
    rating
    1       33.0
    7       35.0
    8       55.5
    10      35.0
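The grouping query above can be checked against the Sailors instance shown with it. This is a small sketch using Python's `sqlite3`; the `ORDER BY` clause is an addition so that the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", [
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (71, "zorba", 10, 16.0),    # under 18, removed by the WHERE clause
    (64, "horatio", 7, 35.0),
    (29, "brutus", 1, 33.0),
    (58, "rusty", 10, 35.0),
])

# WHERE filters rows first, then GROUP BY partitions them, then MIN runs per group.
youngest = conn.execute("""
    SELECT S.rating, MIN(S.age)
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating
    ORDER BY S.rating
""").fetchall()
print(youngest)
```

Because zorba (age 16.0) is filtered out before grouping, the rating-10 group contains only rusty.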

Conceptual Evaluation
- The cross-product of relation-list is computed, tuples that fail qualification are discarded, "unnecessary" fields are deleted, and the remaining tuples are partitioned into groups by the value of attributes in grouping-list
- The group-qualification is then applied to eliminate some groups. Expressions in group-qualification must have a single value per group!
- One answer tuple is generated per qualifying group.

Triggers
- Trigger: procedure that starts automatically if specified changes occur to the DBMS
- Three parts:
  - Event (activates the trigger)
  - Condition (tests whether the trigger should run)
  - Action (what happens if the trigger runs)
- Turns the database into an active component
  - E.g., alert (or act!) when inventory is low
  - Helps with decision support
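The event, condition and action parts of a trigger map directly onto SQL trigger syntax. The sketch below uses SQLite's `CREATE TRIGGER` through Python's `sqlite3` to implement the "alert when inventory is low" idea; the table names `Inventory` and `Alerts`, the trigger name `low_stock` and the threshold of 5 are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Inventory (item TEXT PRIMARY KEY, qty INTEGER);
CREATE TABLE Alerts (item TEXT, qty INTEGER);

CREATE TRIGGER low_stock
AFTER UPDATE OF qty ON Inventory        -- event: the quantity of an item changes
FOR EACH ROW
WHEN NEW.qty < 5                        -- condition: stock fell below the threshold
BEGIN
    INSERT INTO Alerts VALUES (NEW.item, NEW.qty);   -- action: record an alert
END;
""")

conn.execute("INSERT INTO Inventory VALUES ('widget', 20)")

# Stock is still comfortable: the condition is false, so no alert is recorded.
conn.execute("UPDATE Inventory SET qty = 10 WHERE item = 'widget'")
alerts_after_first = conn.execute("SELECT item, qty FROM Alerts").fetchall()

# Stock drops below the threshold: the trigger fires without any application code.
conn.execute("UPDATE Inventory SET qty = 3 WHERE item = 'widget'")
alerts_after_second = conn.execute("SELECT item, qty FROM Alerts").fetchall()

print(alerts_after_first, alerts_after_second)
```

The application only issues ordinary `UPDATE` statements; the database itself reacts, which is what "active component" means here.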

SQL examples
- Find sailors who have reserved boat 103:
    SELECT S.sname
    FROM Sailors S
    WHERE EXISTS (SELECT *
                  FROM Reserves R
                  WHERE R.bid=103 AND S.sid=R.sid)
- Find sids of sailors who've reserved both a red and a green boat:
    SELECT S.sid
    FROM Sailors S, Boats B, Reserves R
    WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='red'
      AND S.sid IN (SELECT S2.sid
                    FROM Sailors S2, Boats B2, Reserves R2
                    WHERE S2.sid=R2.sid AND R2.bid=B2.bid
                      AND B2.color='green')

Isn't Implementing a Database System Simple?
[Figure: Relations and Statements go in, Results come out]
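Both nested queries from the "SQL examples" slide can be exercised with a small sketch in Python's `sqlite3`. The Boats and Reserves rows below are invented for the example (the deck does not show an instance with these reservations), and `ORDER BY` is added to the first query only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL);
CREATE TABLE Boats (bid INTEGER, bname TEXT, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
""")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", [
    (22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)])
# Illustrative sample rows (assumed).
conn.executemany("INSERT INTO Boats VALUES (?, ?, ?)", [
    (101, "Interlake", "blue"), (102, "Interlake", "red"),
    (103, "Clipper", "green"), (104, "Marine", "red")])
conn.executemany("INSERT INTO Reserves VALUES (?, ?, ?)", [
    (22, 102, "10/10/96"), (22, 103, "10/11/96"),
    (31, 102, "11/10/96"), (58, 103, "11/12/96")])

# Correlated subquery: the inner query is re-evaluated for each sailor S.
reserved_103 = conn.execute("""
    SELECT S.sname
    FROM Sailors S
    WHERE EXISTS (SELECT * FROM Reserves R
                  WHERE R.bid = 103 AND S.sid = R.sid)
    ORDER BY S.sname
""").fetchall()

# Red AND green: the outer query finds red reservations, IN checks for a green one.
red_and_green = conn.execute("""
    SELECT S.sid
    FROM Sailors S, Boats B, Reserves R
    WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
      AND S.sid IN (SELECT S2.sid
                    FROM Sailors S2, Boats B2, Reserves R2
                    WHERE S2.sid = R2.sid AND R2.bid = B2.bid
                      AND B2.color = 'green')
""").fetchall()

print(reserved_103, red_and_green)
```

Note that the second query, as written on the slide, would return a sailor once per red boat reserved; adding `DISTINCT` would remove such repeats.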

Introducing the Database Management System
- The latest from Megatron Labs
- Incorporates latest relational technology
- UNIX compatible

Megatron 3000 Implementation Details
- Relations stored in files (ASCII)
- e.g., relation R is in /usr/db/r:
    Smith # 123 # CS
    Jones # 522 # EE
    ...

Megatron 3000 Implementation Details
- Directory file (ASCII) in /usr/db/directory:
    R1 # A # INT # B # STR
    R2 # C # STR # A # INT
    ...

Megatron 3000 Sample Sessions
    % MEGATRON3000
    Welcome to MEGATRON 3000!
    &
    ...
    & quit
    %

Megatron 3000 Sample Sessions
    & select * from R #
    Relation R
    A      B    C
    SMITH  123  CS
    &

Megatron 3000 Sample Sessions
    & select A,B from R,S where R.A = S.A and S.C > 100 #
    A    B
    123  CAR
    522  CAT
    &

Megatron 3000 Sample Sessions
    & select * from R | LPR #
    &
- Result sent to LPR (printer).

Megatron 3000 Sample Sessions
    & select * from R where R.A < 100 | T #
    &
- New relation T created.

Megatron 3000
To execute "select * from R where condition":
(1) Read dictionary to get R attributes
(2) Read R file, for each line:
    (a) Check condition
    (b) If OK, display

Megatron 3000
To execute "select * from R where condition | T":
(1) Process select as before
(2) Write results to new file T
(3) Append new line to dictionary
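The two execution recipes above are simple enough to sketch in a few lines of Python. This is only an illustration of the steps, not the Megatron 3000 code: the function names, the temporary directory standing in for /usr/db, the three-attribute schema for R, and passing the condition as a Python function are all assumptions.

```python
import os
import tempfile


def parse(line):
    """Split one '#'-separated ASCII line into stripped fields."""
    return [field.strip() for field in line.split("#")]


def read_schema(db_dir, relation):
    """Step (1): find the relation's (name, type) pairs in the directory file."""
    with open(os.path.join(db_dir, "directory")) as f:
        for line in f:
            fields = parse(line)
            if fields[0] == relation:
                return list(zip(fields[1::2], fields[2::2]))
    raise KeyError(relation)


def select(db_dir, relation, condition, into=None):
    """Scan the relation file line by line; optionally store the result as a new relation."""
    schema = read_schema(db_dir, relation)
    names = [name for name, _ in schema]
    matches = []
    with open(os.path.join(db_dir, relation)) as f:    # step (2): read the whole file
        for line in f:
            if not line.strip():
                continue
            row = dict(zip(names, parse(line)))
            if condition(row):                           # (a) check condition
                matches.append(row)                      # (b) if OK, "display" (collect)
    if into is not None:
        # Write results to new file T, then append a new line to the dictionary.
        with open(os.path.join(db_dir, into), "w") as out:
            for row in matches:
                out.write(" # ".join(row[n] for n in names) + "\n")
        with open(os.path.join(db_dir, "directory"), "a") as d:
            d.write(" # ".join([into] + [x for pair in schema for x in pair]) + "\n")
    return matches


# Set up a tiny database in a temporary directory (stand-in for /usr/db).
db_dir = tempfile.mkdtemp()
with open(os.path.join(db_dir, "directory"), "w") as f:
    f.write("R # A # STR # B # INT # C # STR\n")
with open(os.path.join(db_dir, "R"), "w") as f:
    f.write("Smith # 123 # CS\nJones # 522 # EE\n")

shown = select(db_dir, "R", lambda row: int(row["B"]) < 200, into="T")
t_schema = read_schema(db_dir, "T")
t_rows = select(db_dir, "T", lambda row: True)
print(shown, t_schema, t_rows)
```

Every query reads the entire relation file and parses every value from text, which already hints at the cost problems this design runs into.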

Megatron 3000
To execute "select A,B from R,S where condition":
(1) Read dictionary to get R,S attributes
(2) Read R file, for each line:
    (a) Read S file, for each line:
        (i) Create join tuple
        (ii) Check condition
        (iii) Display if OK

What's wrong with the Megatron 3000 DBMS?
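The join recipe above is a nested-loop join with the condition checked on every pair. A minimal in-memory sketch follows; the relation contents and the attribute layout R(A, B), S(A, C) are assumptions chosen so that the query from the sample session returns the rows shown there.

```python
def nested_loop_join(r_rows, s_rows, condition):
    """For each R tuple, rescan all of S, build the joined tuple, then test it."""
    output, examined = [], 0
    for r in r_rows:                                   # (2) read R, for each line
        for s in s_rows:                               # (a) read S, for each line
            examined += 1
            joined = {f"R.{k}": v for k, v in r.items()}          # (i) create join tuple
            joined.update({f"S.{k}": v for k, v in s.items()})
            if condition(joined):                      # (ii) check condition
                output.append(joined)                  # (iii) display if OK
    return output, examined


# Hypothetical contents for R(A, B) and S(A, C).
R = [{"A": 123, "B": "CAR"}, {"A": 522, "B": "CAT"}, {"A": 999, "B": "DOG"}]
S = [{"A": 123, "C": 150}, {"A": 522, "C": 200}, {"A": 999, "C": 50}]

# select A,B from R,S where R.A = S.A and S.C > 100
rows, examined = nested_loop_join(
    R, S, lambda t: t["R.A"] == t["S.A"] and t["S.C"] > 100)
answer = [(t["R.A"], t["R.B"]) for t in rows]
print(answer, examined)
```

The number of pairs examined is always |R| × |S|, regardless of how selective the condition is.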

What's wrong with the Megatron 3000 DBMS?
- Tuple layout on disk, e.g.,
  - Change string from "Cat" to "Cats" and we have to rewrite file
  - ASCII storage is expensive
  - Deletions are expensive

What's wrong with the Megatron 3000 DBMS?
- Search expensive; no indexes, e.g.,
  - Cannot find tuple with given key quickly
  - Always have to read full relation

What's wrong with the Megatron 3000 DBMS?
- Brute force query processing, e.g.,
    select * from R,S where R.A = S.A and S.B > 1000
  - Do select first?
  - More efficient join?

What's wrong with the Megatron 3000 DBMS?
- No buffer manager, e.g.,
  - Need caching
- No concurrency control
- No reliability, e.g.,
  - Can lose data
  - Can leave operations half done
- No security, e.g.,
  - File system insecure
  - File system security is coarse
- No application program interface (API), e.g.,
  - How can a payroll program get at the data?
- No interoperability with other databases
- Poor dictionary facilities
- No GUI
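The two questions on the brute-force slide ("Do select first?", "More efficient join?") can be made concrete with a rough comparison. The sketch below evaluates the same query two ways in plain Python: the brute-force nested loop, and a plan that applies the selection on S first and then uses a hash join. The hash join is one possible answer to "more efficient join?", not something the slides prescribe, and the generated data is invented.

```python
from collections import defaultdict


def brute_force(r_rows, s_rows):
    """Test R.A = S.A and S.B > 1000 on every (r, s) pair."""
    out, pairs = [], 0
    for r in r_rows:
        for s in s_rows:
            pairs += 1
            if r["A"] == s["A"] and s["B"] > 1000:
                out.append((r["A"], s["B"]))
    return out, pairs


def select_then_hash_join(r_rows, s_rows):
    """Apply the selection on S first, then join by hashing on A."""
    selected = [s for s in s_rows if s["B"] > 1000]     # do the select first
    index = defaultdict(list)
    for s in selected:                                   # build a hash table on S.A
        index[s["A"]].append(s)
    out, probes = [], 0
    for r in r_rows:                                     # one probe per R tuple
        probes += 1
        for s in index.get(r["A"], []):
            out.append((r["A"], s["B"]))
    return out, probes


# Invented data: 100 tuples per relation, S.B = 100 * S.A.
R = [{"A": i} for i in range(100)]
S = [{"A": i, "B": i * 100} for i in range(100)]

slow_rows, pairs = brute_force(R, S)
fast_rows, probes = select_then_hash_join(R, S)
print(len(slow_rows), pairs, probes)
```

Both plans return the same rows, but the second touches each relation roughly once instead of comparing every pair, which is the kind of choice a query optimizer makes automatically.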

System Structure
[Figure: system architecture. Users submit queries and transactions. Components: Query Parser, Strategy Selector, Transaction Manager, Concurrency Control (with Lock Table), Buffer Manager (with main-memory buffer), Recovery Manager (with Log), File Manager. Stored data: Statistical Data, Indexes, User Data, System Data.]

Some Terms
- Database system
- Transaction processing system
- File access system
- Information retrieval system

Logistics
- Class: Thursday 12-3, room 606
  - Break: 15 minutes, ~1:40
- "Office": Artificial Intelligence Lab, 4th floor, Antoniadou, tel. 160
  - Office hours: Tuesday/Thursday 3:30-4:30, other times by appointment
- Assistant: Magda Eirinaki
  - "Office": 3rd floor, Kodrigktonos 12
  - Office hours: Wednesday 1-3

Course structure
- Two assignments: 15% of the grade
  - An online system is used for the assignments
- Final exam: 85% of the grade
- Optional literature-survey paper: 10% bonus
- Course slides available at http://www.aueb.gr/lessons/grad/dbtheory
  - Mostly in English

Course syllabus
- Introduction
- Indexes and hashing
- Query processing and optimization
- Crash recovery
- Concurrency control theory
- Transaction processing
- Deductive and logic databases
- Active databases
- Data warehouses

Acknowledgement
- Slides mostly based on the ones provided by Hector Garcia Molina
- Thanks!