DATABASE SYSTEM IMPLEMENTATION LECTURE NOTES 1
LECTURE 1: INTRODUCTION, DEFINITIONS




Sources (Turkish and English):
Prof. Dr. Ünal Yarımağan, Veri Tabanı Sistemleri
Elmasri, Navathe, Fundamentals of Database Systems, 5th ed., Addison Wesley
Edward Sciore, Database Design and Implementation, 2009, John Wiley

Introduction: Topics
What is a database (DB)?
What is a DBMS?
What is a data model?
An example database: UNIVERSITY
DB users
DB programming languages
DBMS capabilities and how they differ from a file-processing system
Main functional modules of a DBMS
The DBMS three-schema architecture and data independence
DB usage/access architectures (2-tier, 3-tier)
Types of DBs
The relational model (RM)
Relational algebra (RA) and SQL

What Is a Database?
A database is a COLLECTION OF DATA (a miniworld) that is:
  used by more than one application,
  free of unnecessary redundancy,
  stored in an organized way,
  interrelated (mutually consistent),
  persistent but not static,
  brought together for a specific purpose.
Examples: a company, a ministry, a university, supermarket inventory tracking.

What Is a Database? (continued)
New information should be reflected in the database as soon as possible.
There is no limit on the size of a database.
Example: Amazon.com
  2 terabytes of data (on the order of 10^12 bytes)
  20 million books
  stored on 200 server machines
  15 million users per day
  about 100 administrators

What Is a DBMS?
A DBMS (database management system) is the complex, centralized, system-level software used to meet all operational needs of a database system.
In general, a DBMS provides the following:
  Definition, construction, use, and sharing of the database
  Controlled data redundancy
  Efficient access methods for query processing
  Multi-user service, data recovery, and backup
  Support for different user interfaces
  Definition, implementation, and enforcement of high-level, complex business constraints
  Definition and enforcement of security rules
Neither the operating system nor other users (such as application programs) can access the database directly; they can access it only through the DBMS.

General DB / DBMS Structure
[Figure: users access the DBMS, which manages the data dictionary, log files, index files, and data files.]
The data dictionary consists of the files that store the database definitions (metadata).
Index files are the physical access files.
Log files are kept for safety purposes.


What Is a Data Model?
A data model is the set of structures and concepts used to organize real-world data at the conceptual and logical levels.
Examples: the E-R model, the relational model (RM).
General database design:
  conceptual design
  logical organization: identifying entities and relationships
  determining data types, value ranges, and lengths
  determining data integrity constraints
  physical design choices
  identifying users and configuring security

Categories of Data Models
Conceptual (high-level, semantic) data models: provide concepts that are close to the way many users perceive data. (Also called entity-based or object-based data models.)
Implementation (representational) data models: used by many commercial DBMS implementations (e.g. the relational data model).
Physical (low-level, internal) data models: provide concepts that describe details of how data is stored in the computer. These are usually specified in an ad hoc manner through DBMS design and administration manuals.

Schemas versus Instances
Database schema: the description of a database. Includes descriptions of the database structure, data types, and the constraints on the database.
Schema diagram: an illustrative display of (most aspects of) a database schema.
Schema construct: a component of the schema or an object within the schema, e.g. STUDENT, COURSE.
Database state: the actual data stored in a database at a particular moment in time. This includes the collection of all the data in the database. Also called a database instance (or occurrence or snapshot).
The term instance is also applied to individual database components, e.g. record instance, table instance, entity instance.

[Figure: example of a database schema vs. a database state]


DB Users
Administrator (admin)
  Controls access to and use of the database
  Determines the software and hardware support the system needs
  Identifies security holes and inefficient resource use
Designer
  Deals with modeling the data at every stage
  Is responsible for the stages before implementation
  Communicates with the DB users and performs requirements analysis
  Generally stays in close contact with all users
End users
DB system programmer

DBMS Languages
Data Definition Language (DDL)
  Used by the DBA and database designers to specify the conceptual schema of a database.
  In many DBMSs, the DDL is also used to define internal and external schemas (views).
Data Manipulation Language (DML)
  High-level or non-procedural languages: these include the relational language SQL. They may be used standalone or embedded in a programming language.
  Low-level or procedural languages: these must be embedded in a programming language.

DBMS Programming Language Interfaces
Programmer interfaces for embedding DML in a programming language:
  Embedded approach: e.g. embedded SQL (for C, C++, etc.), SQLJ (for Java)
  Procedure call approach: e.g. JDBC for Java, ODBC for other programming languages
  Database programming language approach: e.g. ORACLE has PL/SQL, a programming language based on SQL; the language incorporates SQL and its data types as integral components
User-friendly interfaces:
  Menu-based (popular for browsing on the web), forms-based, graphics-based (point and click, drag and drop, etc.)
  Natural language: requests in written English

Database System Utilities
Utilities perform functions such as:
  Loading data stored in files into a database (includes data conversion tools)
  Backing up the database periodically on tape
  Reorganizing database file structures
  Report generation
  Performance monitoring
  Other functions, such as sorting, user monitoring, data compression, etc.
Data dictionary / repository: used to store schema descriptions and other information such as design decisions, application program descriptions, user information, usage standards, etc.
Application development environments and CASE (computer-aided software engineering) tools: PowerBuilder (Sybase), JBuilder (Borland), JDeveloper 10G (Oracle)

Why a DBMS? Why Is File Processing Inadequate?
Ease of querying data

Why a DBMS? Why Is File Processing Inadequate? (continued)
Metadata: the definitions of the stored data and other constraints are stored in the database as well. Different miniworlds can be stored in the same database. (This is called program-data independence.) A file-processing application is written for that one application only.
Need for powerful data structures for record storage and access
Buffer use
Need for program-operation independence
Multiple views of the data
Sharing and multi-user transaction processing
Concurrency
Data recovery and rollback
Security and authorization
IN CONCLUSION: we need a DBMS to serve, in real time and with recovery support, large volumes of data that are stored on slow storage devices and accessed by many people.


[Figure: typical DBMS component modules]

The Three-Schema Architecture
Proposed to support the DBMS characteristics of program-data independence and multiple views of the data.
Not explicitly used in commercial DBMS products, but useful in explaining database system organization.
External schemas, at the external level, describe the various user views. They usually use the same data model as the conceptual schema. (How is the data used?)
The conceptual schema, at the conceptual level, describes the structure and constraints of the whole database for a community of users. It uses a conceptual or an implementation data model. (What data is inside?)
The internal schema, at the internal level, describes physical storage structures and access paths (e.g. indexes). It typically uses a physical data model. (How is the data stored?)
Programs refer to an external schema and are mapped by the DBMS to the internal schema for execution.
Data extracted from the internal DBMS level is reformatted to match the user's external view (e.g. formatting the results of an SQL query for display in a Web page).

Data Independence
Physical data independence: the capacity to change the internal schema without having to change the conceptual schema. (For example, the internal schema may be changed when certain file structures are reorganized or new indexes are created to improve database performance.)
Logical data independence: the capacity to change the conceptual schema without having to change the external schemas and their associated application programs.
When a schema at a lower level is changed, only the mappings between this schema and higher-level schemas need to be changed in a DBMS that fully supports data independence. The higher-level schemas themselves are unchanged. Hence, the application programs need not be changed, since they refer to the external schemas.

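Physical Data Independence
Example query: select SName from STUDENT where GradYear = 1997
The query names only the table and its fields; the DBMS consults the database catalog (conceptual and physical schemas) to find the data.
Advantages:
  Ease of use
  Query optimization
  Isolation from changes to the physical schema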
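The isolation from physical changes can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in DBMS; the three STUDENT rows and the index name idx_gradyear are made up for illustration. The same query text is run before and after an index is added, and nothing in the query has to change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT (SId INTEGER PRIMARY KEY, SName TEXT, GradYear INTEGER)"
)
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?)",
    [(1, "joe", 1997), (2, "amy", 2004), (3, "max", 1997)],  # made-up rows
)

# The query refers only to the conceptual schema (table and field names).
QUERY = "SELECT SName FROM STUDENT WHERE GradYear = 1997 ORDER BY SName"

before = conn.execute(QUERY).fetchall()

# Change only the physical schema: add an index on GradYear.
conn.execute("CREATE INDEX idx_gradyear ON STUDENT (GradYear)")

after = conn.execute(QUERY).fetchall()
print(before, after)
```

Whether the index is actually used is up to the query optimizer; the point here is only that the application's query is unaffected.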

Logical Data Independence
If users of a database have their own external schemas, the database supports logical data independence.
Advantages of logical data independence:
  Customized external schemas
  Isolation from changes to the conceptual schema
  Better security
Suppose the conceptual schema changes: the STUDENT table is split into two tables, CURR_STUDENT and ALUMNI. What happens to the external schemas?
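One possible answer to the question above is sketched here, assuming the external schema is kept alive as a view. The sketch uses Python's sqlite3 module; the rows and the choice of UNION ALL are assumptions for illustration, not something the slides prescribe. After the split, a view named STUDENT is defined over the two new tables, so a program that queries STUDENT keeps working.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- New conceptual schema: STUDENT has been split into two tables.
CREATE TABLE CURR_STUDENT (SId INTEGER PRIMARY KEY, SName TEXT, GradYear INTEGER);
CREATE TABLE ALUMNI       (SId INTEGER PRIMARY KEY, SName TEXT, GradYear INTEGER);
INSERT INTO CURR_STUDENT VALUES (1, 'joe', 2030);  -- made-up rows
INSERT INTO ALUMNI       VALUES (2, 'amy', 1997);

-- External schema: the old STUDENT table is re-created as a view.
CREATE VIEW STUDENT AS
    SELECT SId, SName, GradYear FROM CURR_STUDENT
    UNION ALL
    SELECT SId, SName, GradYear FROM ALUMNI;
""")

# The application's original query runs unchanged against the view.
rows = conn.execute("SELECT SName FROM STUDENT WHERE GradYear = 1997").fetchall()
print(rows)
```

Only the mapping (the view definition) had to be written; the query text is the same as before the split.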


Centralized and Client-Server DBMS Architectures
Centralized DBMS: combines everything into a single system, including DBMS software, hardware, application programs, and user interface processing software.
A user can still connect through a remote terminal; however, all processing is done at the centralized site.

Basic 2-Tier Client-Server Architectures
Specialized servers with specialized functions: print server, file server, DBMS server, Web server, email server, ...
Clients can access the specialized servers as needed.
Applications running on clients use an Application Program Interface (API) to access server databases via a standard interface such as:
  ODBC: Open Database Connectivity standard
  JDBC: for Java programming access
Client and server must install the appropriate client module and server module software for ODBC or JDBC.
Variations of clients are possible (tightly vs. loosely coupled): e.g. in some object DBMSs, more functionality is transferred to clients, including data dictionary functions, optimization and recovery across multiple servers, etc.

Three-Tier Client-Server Architecture
Common for Web applications.
A three-tier architecture can enhance security:
  The database server is only accessible via the middle tier (application server or Web server).
  Clients cannot directly access the database server.

Classification of DBMSs
Based on the data model used:
  Traditional: relational, network, hierarchical
  Emerging: object-oriented, object-relational, XML
Other classifications:
  Single-user (typically used with personal computers or embedded DBMSs) vs. multi-user (most DBMSs)
  Centralized (uses a single computer with one database) vs. distributed (uses multiple computers, multiple databases)
    Homogeneous DDBMS
    Heterogeneous DDBMS
    Federated or multidatabase systems
Distributed database systems have now come to be known as client-server based database systems, because they do not support a totally distributed environment, but rather a set of database servers supporting a set of clients.

Extensions to DB Capabilities, New Applications
DBs for scientific applications
Image/video DBs
Data mining
Spatial/temporal DBs
Information retrieval (deals with text in general: books, manuscripts, library-based articles)
For more general information about DBMSs, see http://en.wikipedia.org/wiki/database_management_system

Relational Model
Relational model concepts
Relational model constraints and relational database schemas
Update operations and dealing with constraint violations

General view: the relational model consists of (1) data independence from hardware and storage implementation and (2) automatic navigation, that is, a high-level, nonprocedural language for accessing data.
Instead of processing one record at a time, a programmer can use the language to specify single operations that are performed across the entire data set.
Two historically famous relational database systems: System R by IBM and Ingres by UC Berkeley.
Tables are the basic building blocks.

The relational model of data is based on the concept of a relation.
The strength of the relational approach to data management comes from the formal foundation provided by the theory of relations.
We review the essentials of the formal relational model. In practice, there is a standard model based on SQL.
Note: there are several important differences between the formal model and the practical model.
A relation is a mathematical concept based on the ideas of sets.
The model was first proposed by Dr. E.F. Codd of IBM Research in 1970 in the following paper: "A Relational Model of Data for Large Shared Data Banks," Communications of the ACM, June 1970.
This paper caused a major revolution in the field of database management and earned Dr. Codd the coveted ACM Turing Award.

Informally, a relation looks like a table of values.
A relation typically contains a set of rows (not considered to be ordered).
The data elements in each row represent certain facts that correspond to a real-world entity or relationship. In the formal model, rows are called tuples.
Each column has a column header that gives an indication of the meaning of the data items in that column. In the formal model, the column header is called an attribute name (or just attribute).
Key of a relation: the key uniquely identifies a row in the table. In the STUDENT table, SSN is the key.
Sometimes row-ids or sequential numbers are assigned as keys to identify the rows in a table. Such a key is called an artificial key or surrogate key.

The schema (or description) of a relation is denoted by R(A1, A2, ..., An).
  R is the name of the relation.
  The attributes of the relation are A1, A2, ..., An.
Example: CUSTOMER (Cust-id, Cust-name, Address, Phone#)
  CUSTOMER is the relation name.
  It is defined over the four attributes Cust-id, Cust-name, Address, Phone#.
Each attribute has a domain, or set of valid values. For example, the domain of Cust-id is 6-digit numbers.
A tuple is an ordered set of values (enclosed in angled brackets < >). Each value is derived from an appropriate domain.
A row in the CUSTOMER relation is a 4-tuple, since it consists of four values, for example:
  <632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404) 894-2000">
A relation is a set of such tuples (rows).

A domain has a logical definition.
  Example: USA_phone_numbers is the set of 10-digit phone numbers valid in the U.S.
A domain also has a data type or a format defined for it.
  USA_phone_numbers may have the format (ddd)ddd-dddd, where each d is a decimal digit.
  Dates have various formats, such as year, month, date formatted as yyyy-mm-dd, or as dd mm, yyyy, etc.
The attribute name designates the role played by a domain in a relation. It is used to interpret the meaning of the data elements corresponding to that attribute.
  Example: the domain Date may be used to define two attributes named Invoice-date and Payment-date with different meanings.
  Example: the attribute Cust-name is defined over the domain of character strings of maximum length 25, i.e. dom(Cust-name) is varchar(25). The role these strings play in the CUSTOMER relation is that of the name of a customer.
The relation state r(R) is a subset of the Cartesian product of the domains of its attributes; each domain contains the set of all possible values the attribute can take.
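The last statement can be made concrete with a toy example. The sketch below uses two tiny, made-up domains so that the Cartesian product stays small; real domains such as varchar(25) are far too large to enumerate.

```python
from itertools import product

# Two tiny, made-up domains for a relation R(Id, Name).
dom_id = {1, 2}
dom_name = {"Ann", "Bob"}

# Cartesian product dom(Id) x dom(Name): every tuple that could ever appear.
all_tuples = set(product(dom_id, dom_name))

# A relation state r(R) is any subset of that product.
state = {(1, "Ann"), (2, "Bob")}
is_valid_state = state <= all_tuples

# A tuple whose value lies outside a domain cannot be part of any state.
bad_state = {(3, "Ann")}
is_bad_state_valid = bad_state <= all_tuples

print(len(all_tuples), is_valid_state, is_bad_state_valid)
```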

A relational database schema is a set S of relation schemas that belong to the same database.
  S is the name of the whole database schema.
  S = {R1, R2, ..., Rn}
  R1, R2, ..., Rn are the names of the individual relation schemas within the database S.
[Figure: the COMPANY database schema with 6 relation schemas]

Informal Terms                  Formal Terms
Table                           Relation
Column Header                   Attribute
All possible Column Values      Domain
Row                             Tuple
Table Definition                Schema of a Relation
Populated Table                 State of the Relation

Constraints are conditions that must hold on all valid relation states.
There are three main types of constraints in the relational model:
  Key constraints
  Entity integrity constraints
  Referential integrity constraints
Another, implicit constraint is the domain constraint: every value in a tuple must be from the domain of its attribute (or it could be null, if allowed for that attribute).
Semantic constraints are yet another type.

Superkey of R: a set of attributes SK of R with the following condition:
  No two tuples in any valid relation state r(R) have the same value for SK.
  That is, for any distinct tuples t1 and t2 in r(R), t1[SK] ≠ t2[SK].
  This condition must hold in any valid state r(R).
Key of R: a "minimal" superkey. A key value identifies exactly one record.
  That is, a key is a superkey K such that removal of any attribute from K results in a set of attributes that is not a superkey (does not possess the superkey uniqueness property).
Example: consider the CAR relation schema CAR(State, Reg#, SerialNo, Make, Model, Year).
  CAR has two keys: Key1 = {State, Reg#} and Key2 = {SerialNo}.
  Both are also superkeys of CAR.
  {SerialNo, Make} is a superkey but not a key.
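The two definitions can be checked mechanically on a given relation state. The sketch below is a minimal illustration with three made-up CAR rows (the attribute Reg# is written Reg). Note the limitation: a key is a property of the schema and must hold in every valid state, so a check on one state can refute a candidate key but cannot prove it.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows in this state agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_key(rows, attrs):
    """True if attrs is a superkey and no proper subset of it is a superkey."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(len(attrs))
        for subset in combinations(attrs, size)
    )

# Made-up state of CAR(State, Reg, SerialNo, Make, Model, Year).
cars = [
    {"State": "TX", "Reg": "ABC-739", "SerialNo": "A69352",
     "Make": "Ford", "Model": "Mustang", "Year": 2002},
    {"State": "FL", "Reg": "TVP-347", "SerialNo": "B43696",
     "Make": "Oldsmobile", "Model": "Cutlass", "Year": 2005},
    {"State": "TX", "Reg": "TVP-347", "SerialNo": "X83554",
     "Make": "Ford", "Model": "Escort", "Year": 2002},
]

print(is_key(cars, ("State", "Reg")))          # candidate Key1
print(is_key(cars, ("SerialNo",)))             # candidate Key2
print(is_superkey(cars, ("SerialNo", "Make")))
print(is_key(cars, ("SerialNo", "Make")))
```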

Entity integrity: the primary key attributes PK of each relation schema R in S cannot have null values in any tuple of r(R). This is because primary key values are used to identify the individual tuples.
  t[PK] ≠ null for any tuple t in r(R)
  If PK has several attributes, null is not allowed in any of these attributes.
Note: other attributes of R may be constrained to disallow null values, even though they are not members of the primary key.

Referential integrity: a constraint involving two relations. (The previous constraints involve a single relation.)
It is used to specify a relationship among tuples in two relations: the referencing relation and the referenced relation.
Tuples in the referencing relation R1 have attributes FK (called foreign key attributes) that reference the primary key attributes PK of the referenced relation R2.
A tuple t1 in R1 is said to reference a tuple t2 in R2 if t1[FK] = t2[PK].
Statement of the constraint: the value in the foreign key column (or columns) FK of the referencing relation R1 can be either
  (1) a value of an existing primary key PK in the referenced relation R2, or
  (2) null. In case (2), the FK in R1 should not be a part of its own primary key.
The foreign key acts as a logical connection between the two relations.

Each relation schema can be displayed as a row of attribute names.
The name of the relation is written above the attribute names.
The primary key attribute (or attributes) is underlined.
A foreign key (referential integrity) constraint is displayed as a directed arc (arrow) from the foreign key attributes to the referenced table. The arrow can also point to the primary key of the referenced relation for clarity.
[Figure: COMPANY relational schema diagram]

Semantic integrity constraints: based on application semantics; they cannot be expressed by the model per se.
Example: the maximum number of hours per employee for all projects he or she works on is 56 hours per week.
A constraint specification language may have to be used to express these.
SQL-99 allows triggers and ASSERTIONS to express some of these.

Each relation has many tuples in its current relation state.
The relational database state is the union of all the individual relation states.
Whenever the database is changed, a new state arises.
Basic operations for changing the database:
  INSERT a new tuple in a relation
  DELETE an existing tuple from a relation
  MODIFY an attribute of an existing tuple
[Figure: an example state for the COMPANY database]

In case of an integrity violation, several actions can be taken:
  Cancel the operation that causes the violation (RESTRICT or REJECT option)
  Perform the operation but inform the user of the violation
  Trigger additional updates so the violation is corrected (CASCADE option, SET NULL option)
  Execute a user-specified error-correction routine
Specifying constraints in SQL:
  NOT NULL may be specified on an attribute.
  Key attributes can be specified via the PRIMARY KEY and UNIQUE phrases.
  Referential integrity constraints are specified via FOREIGN KEY clauses.

Specifying constraints in SQL

  -- DEPT and EMP reference each other; many DBMSs require one of the
  -- foreign keys to be added afterwards with ALTER TABLE.
  CREATE TABLE DEPT (
      DNAME         VARCHAR(10)  NOT NULL,
      DNUMBER       INTEGER      NOT NULL,
      MGRSSN        CHAR(9)      DEFAULT '000',
      MGRSTARTDATE  CHAR(9),
      PRIMARY KEY (DNUMBER),
      UNIQUE (DNAME),
      FOREIGN KEY (MGRSSN) REFERENCES EMP
          ON DELETE SET DEFAULT ON UPDATE CASCADE
  );

  CREATE TABLE EMP (
      ENAME     VARCHAR(30)  NOT NULL,
      ESSN      CHAR(9),
      BDATE     DATE,
      DNO       INTEGER      DEFAULT 1,
      SUPERSSN  CHAR(9),
      PRIMARY KEY (ESSN),
      FOREIGN KEY (DNO) REFERENCES DEPT
          ON DELETE SET DEFAULT ON UPDATE CASCADE,
      FOREIGN KEY (SUPERSSN) REFERENCES EMP
          ON DELETE SET NULL ON UPDATE CASCADE
  );

INSERT may violate any of the constraints:
  Domain constraint: if one of the attribute values provided for the new tuple is not of the specified attribute domain
  Key constraint: if the value of a key attribute in the new tuple already exists in another tuple in the relation
  Referential integrity: if a foreign key value in the new tuple references a primary key value that does not exist in the referenced relation
  Entity integrity: if the primary key value is null in the new tuple
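Three of these violations can be reproduced with a short experiment. The sketch below uses Python's sqlite3 module with a simplified DEPT/EMP schema; the helper try_insert and the sample rows are made up for illustration. The domain constraint is left out because SQLite's flexible typing does not reject most type mismatches, and foreign keys are only enforced in SQLite after the PRAGMA shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE DEPT (DNUMBER INTEGER PRIMARY KEY, DNAME TEXT NOT NULL UNIQUE);
CREATE TABLE EMP  (ESSN  TEXT NOT NULL PRIMARY KEY,
                   ENAME TEXT NOT NULL,
                   DNO   INTEGER REFERENCES DEPT (DNUMBER));
INSERT INTO DEPT VALUES (1, 'Headquarters');
INSERT INTO EMP  VALUES ('111', 'Ann', 1);
""")

def try_insert(params):
    """Insert one EMP row; return None on success, else the DBMS error text."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO EMP VALUES (?, ?, ?)", params)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

ok = try_insert(("222", "Bob", 1))        # satisfies every constraint
dup_key = try_insert(("111", "Eve", 1))   # key constraint: ESSN '111' exists
null_pk = try_insert((None, "Eve", 1))    # entity integrity: null primary key
bad_fk = try_insert(("333", "Eve", 99))   # referential integrity: no DEPT 99

for name, result in [("ok", ok), ("dup_key", dup_key),
                     ("null_pk", null_pk), ("bad_fk", bad_fk)]:
    print(name, "->", result)
```

All three offending inserts are rejected, which corresponds to the RESTRICT/REJECT action.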

DELETE may violate only referential integrity: if the primary key value of the tuple being deleted is referenced from other tuples in the database.
This can be remedied by several actions: RESTRICT, CASCADE, SET NULL (see Chapter 8 for more details).
  RESTRICT option: reject the deletion
  CASCADE option: remove all referencing tuples
  SET NULL option: set the foreign keys of the referencing tuples to NULL
One of the above options must be specified during database design for each foreign key constraint.
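The three options can be compared side by side. The sketch below, again using Python's sqlite3 module, builds the same tiny DEPT/EMP state three times and deletes the referenced department under each option; the function name delete_dept_1 and the rows are made up for illustration.

```python
import sqlite3

def delete_dept_1(on_delete):
    """Delete department 1 under the given ON DELETE option; report the outcome."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
    conn.executescript(f"""
    CREATE TABLE DEPT (DNUMBER INTEGER PRIMARY KEY);
    CREATE TABLE EMP  (ESSN TEXT PRIMARY KEY,
                       DNO  INTEGER REFERENCES DEPT (DNUMBER) ON DELETE {on_delete});
    INSERT INTO DEPT VALUES (1);
    INSERT INTO EMP  VALUES ('111', 1);
    """)
    try:
        conn.execute("DELETE FROM DEPT WHERE DNUMBER = 1")
    except sqlite3.IntegrityError:
        return "rejected"
    # Show what happened to the referencing EMP tuple.
    return conn.execute("SELECT ESSN, DNO FROM EMP").fetchall()

restrict_result = delete_dept_1("RESTRICT")
cascade_result = delete_dept_1("CASCADE")
set_null_result = delete_dept_1("SET NULL")
print(restrict_result, cascade_result, set_null_result)
```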

UPDATE may violate the domain constraint and the NOT NULL constraint on an attribute being modified.
Any of the other constraints may also be violated, depending on the attribute being updated:
  Updating the primary key (PK): similar to a DELETE followed by an INSERT; options similar to those for DELETE need to be specified
  Updating a foreign key (FK): may violate referential integrity
  Updating an ordinary attribute (neither PK nor FK): can only violate domain constraints

Elmasri/Navathe, Figure 5.6

Exercise 2
Apply the following operations to the COMPANY database state on the previous slide and determine which integrity constraints are violated.
  Insert <'ProductA', 4, 'Bellaire', 2> into PROJECT. (Violates referential integrity.)
  Insert <'Production', 4, '943775543', '01-OCT-88'> into DEPARTMENT. (Violates both the key constraint and referential integrity.)
  Delete the WORKS_ON tuples with ESSN = '333445555'. (No violation.)
  Delete the EMPLOYEE tuple with SSN = '987654321'. (Violates referential integrity.)
  Modify the SUPERSSN attribute of the EMPLOYEE tuple with SSN = '999887777' to '943775543'. (Violates referential integrity.)

Relational algebra (RA) has expressive power similar to that of SQL.
RA has no standardized syntax, whereas SQL is the industry-standard language.
RA is a task-oriented query language, whereas SQL is result-oriented.
A DBMS translates SQL into RA in order to execute it.
An RA query can be expressed as a query tree that contains nodes for the tables and operators in the query.
The query tree shows the execution order and is used in execution and optimization.

Operators
Single-table operators: select, project, sort, rename, extend, groupby
Two-table operators: product, join, semijoin, antijoin, union, outerjoin

A student record database:
[Figure: E-R diagram with the entities STUDENT (sid, sname, gradyear), ENROLL (eid, grade), SECTION (sectid, yearoffered, prof), COURSE (cid, title), and DEPARTMENT (did, dname), connected by one-to-many (1:N) relationships]

[Figure: a state of the student record database]

select
List the students who graduated in 2004:
  Q1: select (STUDENT, GradYear=2004)
List the students who graduated in 2004 with a major of 10 or 20:
  Q2: select (STUDENT, GradYear=2004 and (MajorId=10 or MajorId=20))
  Q3: select (select (STUDENT, GradYear=2004), MajorId=10 or MajorId=20)
  Q4: select (Q1, MajorId=10 or MajorId=20)
[Figure: query tree for Q3]
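A minimal sketch of the select operator, written in Python over a table represented as a list of dictionaries. The function select and the four STUDENT rows are made up for illustration; they are not the API of any particular system. It shows that Q2, Q3, and Q4 are three ways of writing the same query.

```python
# Made-up state of the STUDENT table.
STUDENT = [
    {"SId": 1, "SName": "joe", "GradYear": 2004, "MajorId": 10},
    {"SId": 2, "SName": "amy", "GradYear": 2004, "MajorId": 20},
    {"SId": 3, "SName": "max", "GradYear": 2005, "MajorId": 10},
    {"SId": 4, "SName": "sue", "GradYear": 2004, "MajorId": 30},
]

def select(table, predicate):
    """Return the rows of table that satisfy predicate; the columns are unchanged."""
    return [row for row in table if predicate(row)]

def major_10_or_20(row):
    return row["MajorId"] == 10 or row["MajorId"] == 20

q1 = select(STUDENT, lambda r: r["GradYear"] == 2004)
q2 = select(STUDENT, lambda r: r["GradYear"] == 2004 and major_10_or_20(r))
q3 = select(select(STUDENT, lambda r: r["GradYear"] == 2004), major_10_or_20)
q4 = select(q1, major_10_or_20)

print([r["SName"] for r in q2])
```

Q3 nests one select inside another, which is exactly the shape a query tree records: the inner select is the child node of the outer one.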

project
List the name and graduation year of all students:
  Q5: project(STUDENT, {SName, GradYear})
List the names of all students having major 10:
  Q6: project(select(STUDENT, MajorId=10), {SName})
What about the following query?
  Q7: select(project(STUDENT, {SName}), MajorId=10)
  (Q7 is not valid: after projecting on SName, the field MajorId is no longer available to the select.)
Query tree for Q6: [figure]
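The difference between Q6 and Q7 shows up as soon as the operators are executed. The sketch below assumes tables are lists of dicts and uses made-up rows. In this representation the Q7 ordering fails with a KeyError because the projected rows no longer carry MajorId.

```python
def select(table, pred):
    """Keep the rows that satisfy pred."""
    return [row for row in table if pred(row)]

def project(table, fields):
    """Keep only the listed fields of every row (duplicates are not removed)."""
    return [{f: row[f] for f in fields} for row in table]

# Hypothetical sample rows.
STUDENT = [
    {"SId": 1, "SName": "joe", "GradYear": 2004, "MajorId": 10},
    {"SId": 2, "SName": "amy", "GradYear": 2004, "MajorId": 20},
    {"SId": 3, "SName": "max", "GradYear": 2005, "MajorId": 10},
]

# Q6: select first, then project.
q6 = project(select(STUDENT, lambda r: r["MajorId"] == 10), ["SName"])
print(q6)

# Q7: project first, then select on a field that has been projected away.
q7_error = None
try:
    select(project(STUDENT, ["SName"]), lambda r: r["MajorId"] == 10)
except KeyError as exc:
    q7_error = f"unknown field {exc}"
print(q7_error)
```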

sort, rename, extend
Sort the students by graduation year and then by name:
  Q8: sort(STUDENT, [GradYear, SName])
The sort operator is typically the outermost operator.
Rename the field SName of Q6 to CSMajors:
  Q9: rename(Q6, SName, CSMajors)
The rename operator is used with multi-table operators (such as product and join) to alter the schema of one table so that it matches the other's.
Extend STUDENT with the calculated field GradClass, the number of years since 1863 (the first graduation year):
  Q10: extend(STUDENT, GradYear-1863, GradClass)
Extend STUDENT with the field College, whose value is always 'YTU':
  Q11: extend(STUDENT, 'YTU', College)
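The three operators are small enough to sketch directly. As an assumption of this sketch, tables are lists of dicts, the computed expression of extend is passed as a Python function, and the sample rows are invented.

```python
def sort(table, fields):
    """Order rows by the listed fields, leftmost field first."""
    return sorted(table, key=lambda row: tuple(row[f] for f in fields))

def rename(table, old, new):
    """Rename field `old` to `new` in every row."""
    return [{(new if k == old else k): v for k, v in row.items()} for row in table]

def extend(table, expr, new_field):
    """Add a field whose value is computed from each row by `expr`."""
    return [{**row, new_field: expr(row)} for row in table]

# Hypothetical sample rows.
STUDENT = [
    {"SId": 1, "SName": "joe", "GradYear": 2004, "MajorId": 10},
    {"SId": 2, "SName": "amy", "GradYear": 2004, "MajorId": 20},
    {"SId": 3, "SName": "max", "GradYear": 2005, "MajorId": 10},
    {"SId": 5, "SName": "bob", "GradYear": 2003, "MajorId": 30},
]

q8 = sort(STUDENT, ["GradYear", "SName"])
q6 = [{"SName": r["SName"]} for r in STUDENT if r["MajorId"] == 10]  # result of Q6
q9 = rename(q6, "SName", "CSMajors")
q10 = extend(STUDENT, lambda r: r["GradYear"] - 1863, "GradClass")
q11 = extend(STUDENT, lambda r: "YTU", "College")

print([r["SName"] for r in q8])
print(q9)
print(q10[0]["GradClass"], q11[0]["College"])
```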

groupby
Show the minimum and maximum graduation year per major:
  Q12: groupby(STUDENT, {MajorId}, {Min(GradYear), Max(GradYear)})
Show the number of students in each major per graduation year:
  Q13: groupby(STUDENT, {MajorId, GradYear}, {Count(SId)})

groupby
Show the minimum graduation year of any student:
  Q14: groupby(STUDENT, {}, {Min(GradYear)})
Show the list of major ids of all students, with duplicates removed:
  Q15: groupby(STUDENT, {MajorId}, {})
Show the number of students having a major:
  Q16: groupby(STUDENT, {}, {Count(MajorId)})
The Count function counts all non-null values, including duplicates.
Show the number of different majors:
  Q17: groupby(STUDENT, {}, {CountDistinct(MajorId)})
Count and CountDistinct (and the other aggregate functions) ignore NULL values.
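A possible groupby can be sketched in a few lines. Assumptions of this sketch: tables are lists of dicts, NULL is represented by None, aggregates are passed as (function name, field) pairs, and each output field is named by joining the two with "Of" (for example CountOfMajorId). The rows are invented.

```python
# Aggregate functions operate on the non-null values of one group.
AGG = {
    "Min": lambda values: min(values, default=None),
    "Max": lambda values: max(values, default=None),
    "Count": len,
    "CountDistinct": lambda values: len(set(values)),
}

def groupby(table, group_fields, aggregates):
    """Group rows on group_fields and compute each (name, field) aggregate."""
    groups = {}
    for row in table:
        key = tuple(row[f] for f in group_fields)
        groups.setdefault(key, []).append(row)
    result = []
    for key, rows in groups.items():
        out = dict(zip(group_fields, key))
        for name, field in aggregates:
            values = [r[field] for r in rows if r[field] is not None]  # ignore NULL
            out[f"{name}Of{field}"] = AGG[name](values)
        result.append(out)
    return result

# Hypothetical sample rows; kim has no major (NULL).
STUDENT = [
    {"SId": 1, "SName": "joe", "GradYear": 2004, "MajorId": 10},
    {"SId": 2, "SName": "amy", "GradYear": 2004, "MajorId": 20},
    {"SId": 3, "SName": "max", "GradYear": 2005, "MajorId": 10},
    {"SId": 6, "SName": "kim", "GradYear": 2001, "MajorId": None},
]

q12 = groupby(STUDENT, ["MajorId"], [("Min", "GradYear"), ("Max", "GradYear")])
q14 = groupby(STUDENT, [], [("Min", "GradYear")])
q16 = groupby(STUDENT, [], [("Count", "MajorId")])
q17 = groupby(STUDENT, [], [("CountDistinct", "MajorId")])
print(q12[0], q14, q16, q17)
```

With an empty list of grouping fields every row falls into the single group with key (), which is how Q14, Q16 and Q17 aggregate over the whole table.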

groupby
Show the largest number of A's given in any section.
In 3 steps:
  Q18: select(ENROLL, Grade='A')
  Q19: groupby(Q18, {SectionId}, {Count(EId)})
  Q20: groupby(Q19, {}, {Max(CountOfEId)})
In 1 step:
  Q21: groupby(groupby(select(ENROLL, Grade='A'), {SectionId}, {Count(EId)}), {}, {Max(CountOfEId)})
Query tree for Q21: [figure]

product
All combinations of records from STUDENT and DEPT:
  Q22: product(STUDENT, DEPT)
All students and their major department's name:
  Q23: select(product(STUDENT, DEPT), MajorId=DId)

join
join(T1, T2, P) ≡ select(product(T1, T2), P)
All students and their major department's name:
  Q24: join(STUDENT, DEPT, MajorId=DId)
The grades Joe received during 2004:
  Q25: select(STUDENT, SName='Joe')
  Q26: join(Q25, ENROLL, SId=StudentId)
  Q27: select(SECTION, YearOffered=2004)
  Q28: join(Q26, Q27, SectId=SectionId)
  Q29: project(Q28, {Grade})
Is there a more efficient way?
Find the grades Joe received during his graduation year (recall Q29):
  The selection YearOffered=GradYear can be applied only after joining with SECTION.
  There are 3 predicates relating the 3 tables, although 2 are enough to join them. Such a query is called a circular query.
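The definition of join as a select over a product can be written down almost literally. The sketch below assumes tables are lists of dicts with distinct field names and uses made-up rows. It checks that Q23 and Q24 give the same result and then runs the "grades Joe received during 2004" steps, with the final projection applied to the result of the last join (Q28).

```python
def select(table, pred):
    return [row for row in table if pred(row)]

def project(table, fields):
    return [{f: row[f] for f in fields} for row in table]

def product(t1, t2):
    """Every combination of a row from t1 with a row from t2."""
    return [{**r1, **r2} for r1 in t1 for r2 in t2]

def join(t1, t2, pred):
    """join(T1, T2, P) is select(product(T1, T2), P)."""
    return select(product(t1, t2), pred)

# Hypothetical sample rows.
STUDENT = [
    {"SId": 1, "SName": "Joe", "GradYear": 2004, "MajorId": 10},
    {"SId": 2, "SName": "Amy", "GradYear": 2004, "MajorId": 20},
]
DEPT = [{"DId": 10, "DName": "compsci"}, {"DId": 20, "DName": "math"}]
ENROLL = [
    {"EId": 14, "StudentId": 1, "SectionId": 13, "Grade": "A"},
    {"EId": 24, "StudentId": 1, "SectionId": 43, "Grade": "C"},
    {"EId": 34, "StudentId": 2, "SectionId": 13, "Grade": "B"},
]
SECTION = [
    {"SectId": 13, "YearOffered": 2004, "Prof": "Turing"},
    {"SectId": 43, "YearOffered": 2005, "Prof": "Einstein"},
]

q23 = select(product(STUDENT, DEPT), lambda r: r["MajorId"] == r["DId"])
q24 = join(STUDENT, DEPT, lambda r: r["MajorId"] == r["DId"])

q25 = select(STUDENT, lambda r: r["SName"] == "Joe")
q26 = join(q25, ENROLL, lambda r: r["SId"] == r["StudentId"])
q27 = select(SECTION, lambda r: r["YearOffered"] == 2004)
q28 = join(q26, q27, lambda r: r["SectId"] == r["SectionId"])
q29 = project(q28, ["Grade"])

print(q23 == q24)
print(q29)
```

Building the full product first is the simplest possible strategy; a real engine would use the predicate to avoid materializing all combinations.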

join
The sections that gave out the most A's (recall Q20):
  Q30: join(Q20, Q19, MaxCountOfEId=CountOfEId)
  Q31: join(Q30, SECTION, SectionId=SectId)
List the students that have the same major as Joe:
  Q32: project(select(STUDENT, SName='Joe'), {MajorId})
  Q33: rename(Q32, MajorId, JoesMajor)
  Q34: join(Q33, STUDENT, MajorId=JoesMajor)
This type of join is called a self-join (or recursive join). The rename is needed so that the predicate can tell the two MajorId fields apart.

semijoin
Q24 lists pairs of matching records. Sometimes we only need to know whether a match exists.
List the departments that have at least 1 student in their major:
  Q35: semijoin(DEPT, STUDENT, DId=MajorId)
Q35 consists of those records from the first table that match some record in the second table.
Equivalent query using join:
  Q36: join(DEPT, STUDENT, DId=MajorId)
  Q37: groupby(Q36, {DId, DName}, {})
The groupby operator projects and then removes duplicates.
The students who took a course from Professor Einstein:
  Q38: select(SECTION, Prof='Einstein')
  Q39: semijoin(ENROLL, Q38, SectionId=SectId)
  Q40: semijoin(STUDENT, Q39, StudentId=SId)
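A semijoin only asks whether a match exists, so it can stop at the first matching record. The sketch below assumes tables are lists of dicts with distinct field names and uses invented rows. It compares Q35 with the join-then-deduplicate formulation (Q36 and Q37), where distinct_project is a hypothetical stand-in for a groupby with no aggregates.

```python
def semijoin(t1, t2, pred):
    """Rows of t1 that match at least one row of t2."""
    return [r1 for r1 in t1 if any(pred({**r1, **r2}) for r2 in t2)]

def join(t1, t2, pred):
    return [{**r1, **r2} for r1 in t1 for r2 in t2 if pred({**r1, **r2})]

def distinct_project(table, fields):
    """Project on `fields`, then drop duplicates (first occurrence kept)."""
    seen = dict.fromkeys(tuple(row[f] for f in fields) for row in table)
    return [dict(zip(fields, key)) for key in seen]

# Hypothetical sample rows; no student majors in drama.
DEPT = [
    {"DId": 10, "DName": "compsci"},
    {"DId": 20, "DName": "math"},
    {"DId": 30, "DName": "drama"},
]
STUDENT = [
    {"SId": 1, "SName": "joe", "MajorId": 10},
    {"SId": 2, "SName": "amy", "MajorId": 20},
    {"SId": 3, "SName": "max", "MajorId": 10},
]

def match(r):
    return r["DId"] == r["MajorId"]

q35 = semijoin(DEPT, STUDENT, match)
q36 = join(DEPT, STUDENT, match)
q37 = distinct_project(q36, ["DId", "DName"])

print(q35)
print(len(q36), q35 == q37)
```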

antijoin
The opposite of semijoin: it returns exactly those records that the semijoin omits. Antijoin cannot be simulated by a join.
List the departments that have no student majors:
  Q41: antijoin(DEPT, STUDENT, DId=MajorId)
Note that the union of Q35 and Q41 is DEPT, and Q35 and Q41 have no record in common.
Find the sections where nobody received a grade of F:
  Q42: select(ENROLL, Grade='F')
  Q43: antijoin(SECTION, Q42, SectId=SectionId)
What about the following solution?
  join(select(ENROLL, Grade<>'F'), SECTION, SectId=SectionId)
  (It returns the sections in which somebody received a grade other than F, which is a different query.)
Antijoin is used for queries of the form NOT EXISTS x, and for FOR ALL queries through the equivalence FOR ALL x P(x) = NOT EXISTS x NOT P(x).
Find the sections in which everyone got an A (the sections in which there does not exist someone who did not get an A):
  Q44: select(ENROLL, Grade<>'A')
  Q45: antijoin(SECTION, Q44, SectId=SectionId)
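The sketch below runs the "nobody received an F" query, the tempting join-based alternative, and the "everyone got an A" query on a small made-up data set. Tables are assumed to be lists of dicts, and the variable names are hypothetical.

```python
def antijoin(t1, t2, pred):
    """Rows of t1 that match no row of t2."""
    return [r1 for r1 in t1 if not any(pred({**r1, **r2}) for r2 in t2)]

def join(t1, t2, pred):
    return [{**r1, **r2} for r1 in t1 for r2 in t2 if pred({**r1, **r2})]

# Hypothetical sample rows; section 63 has no enrollments.
SECTION = [{"SectId": 13}, {"SectId": 43}, {"SectId": 53}, {"SectId": 63}]
ENROLL = [
    {"EId": 1, "SectionId": 13, "Grade": "A"},
    {"EId": 2, "SectionId": 13, "Grade": "F"},
    {"EId": 3, "SectionId": 43, "Grade": "A"},
    {"EId": 4, "SectionId": 43, "Grade": "B"},
    {"EId": 5, "SectionId": 53, "Grade": "A"},
]

def match(r):
    return r["SectId"] == r["SectionId"]

# Sections where nobody received an F (Q42, Q43).
got_f = [e for e in ENROLL if e["Grade"] == "F"]
no_f_sections = antijoin(SECTION, got_f, match)

# The join-based attempt: sections where somebody received a non-F grade.
not_f = [e for e in ENROLL if e["Grade"] != "F"]
join_attempt = join(not_f, SECTION, match)

# Sections where everyone got an A: no enrollment with a grade other than A.
not_a = [e for e in ENROLL if e["Grade"] != "A"]
all_a_sections = antijoin(SECTION, not_a, match)

print([s["SectId"] for s in no_f_sections])
print([r["SectId"] for r in join_attempt])
print([s["SectId"] for s in all_a_sections])
```

The join-based attempt wrongly includes section 13, misses section 63 and repeats section 43, whereas the antijoin keeps each qualifying section exactly once. A section without enrollments satisfies both antijoin queries vacuously.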