7.1 The Information system


Chapter 7. Database Planning, Design and Administration

Copyright 2007 BUPTSSE Guo Wenming

The last few decades have seen a proliferation of software applications, many requiring constant maintenance that involves correcting faults, implementing new user requirements, and modifying software to run on new or upgraded platforms. As a result, many major software projects were late, over budget, unreliable, difficult to maintain, or performed poorly.

This led, in the late 1960s, to what became known as the software crisis, which some now refer to as the software depression. There are several major reasons for the failure of software projects, including: lack of a complete requirements specification; lack of an appropriate development methodology; poor decomposition of the design into manageable components. As a solution, a structured approach to development was proposed, called the information systems lifecycle or software development lifecycle.

In this chapter we present an overview of the database application lifecycle and describe each stage of the lifecycle in more detail.

7.1 The Information System Lifecycle
7.2 The Database Application Lifecycle
7.3 Database Planning
7.4 System Definition
7.5 Requirements Collection and Analysis
7.6 Database Design
7.7 DBMS Selection
7.8 Application Design
7.9 Prototyping
7.10 Implementation
7.11 Data Conversion and Loading
7.12 Testing
7.13 Operational Maintenance
7.14 CASE Tools
7.15 Data Administration and Database Administration

The chapter covers:
Main components of an information system.
Main stages of the database application lifecycle.
Main phases of database design: conceptual, logical, and physical design.
Benefits of CASE tools.
How to evaluate and select a DBMS.
Distinction between data administration and database administration.
Purpose and tasks associated with data administration and database administration.

7.1 The Information System

Information System: The resources that enable the collection, management, control, and dissemination of information throughout an organization.

The stages in the lifecycle of an information system (IS) include: planning, requirements collection and analysis, design, prototyping, implementation, testing, conversion, and operational maintenance.

The database is a fundamental component of an IS, and its development and usage should be viewed from the perspective of the wider requirements of the organization.

7.2 The Database Application Lifecycle

The database application lifecycle is inherently associated with the lifecycle of the information system. Its stages are:

Database planning
System definition
Requirements collection and analysis
Database design
DBMS selection (optional)
Application design
Prototyping (optional)
Implementation
Data conversion and loading
Testing
Operational maintenance

It is important to recognize that the stages of the database application lifecycle are not strictly sequential, but involve some amount of repetition of previous stages through feedback loops.

Figure: the stages of the database application lifecycle, with database design divided into conceptual, logical, and physical database design.

7.3 Database Planning

Database Planning: The management activities that allow the stages of the database application to be realized as efficiently and effectively as possible.

Database planning must be integrated with the overall IS strategy of the organization. There are three main issues involved in formulating an IS strategy: identification of enterprise plans and goals, with subsequent determination of information systems needs; evaluation of current information systems to determine existing strengths and weaknesses; appraisal of IT opportunities that might yield competitive advantage.

The mission statement for the database project defines the major aims of the database application. Those driving the database project within the organization (such as the Director or owner) normally define the mission statement. The mission statement helps clarify the purpose of the database project and provides a clearer path towards the efficient and effective creation of the required database application.

Once the mission statement is defined, the mission objectives are defined. Each objective should identify a particular task that the database must support. The objectives may be accompanied by additional information that specifies the work to be done, the resources with which to do it, and the money to pay for it all.

Database planning should also include the development of standards that govern: how data will be collected; how the format should be specified; what documentation will be needed; how design and implementation should proceed.

Example: A mission statement for DreamHome

We begin by conducting interviews with the Director and any other appropriate staff, as indicated by the Director. Examples of typical questions we might ask include:

What is the purpose of your company?
Why do you feel that you need a database?
How do you know that a database will solve your problem?

For example, the database developer may start the interview by asking the Director of DreamHome the following questions.

Database Developer: What is the purpose of your company?

Director: We offer a wide range of high quality properties for rent to clients registered at our branches throughout the UK. Our ability to offer quality properties, of course, depends upon the services we provide to property owners. We provide a highly professional service to property owners to ensure that properties are rented out for maximum return.

Database Developer: Why do you feel that you need a database?

Director: To be honest, we can't cope with our own success. Over the past few years we've opened several branches in most of the main cities of the UK, and at each branch we now offer a larger selection of properties to a growing number of clients. However, this success has been accompanied by increasing data management problems, which means that the level of service we provide is falling. Also, there's a lack of cooperation and sharing of information between branches, which is a very worrying development.

Database Developer: How do you know that a database will solve your problem?

Director: All I know is that we are drowning in paperwork. We need something that will speed up the way we work by automating a lot of the day-to-day tasks that seem to take forever these days. Also, I want the branches to start working together. Databases will help to achieve this, won't they?

Responses to these types of questions should help to formulate the mission statement. An example mission statement for the DreamHome database application is shown in the accompanying figure. When we have a clear and unambiguous mission statement that the staff of DreamHome agree with, we move on to define the mission objectives. See the additional document Dreamhome_case for the details.

7.4 System Definition

System Definition: Describes the scope and boundaries of the database application and the major user views.

A user view defines what is required of a database application from the perspective of a particular job role (such as Manager or Supervisor) or an enterprise application area (such as marketing, personnel, or stock control). A database application may have one or more user views.

Identifying user views helps ensure that no major users of the database are forgotten when developing the requirements for the new application. User views also help in the development of a complex database application by allowing the requirements to be broken down into manageable pieces.

A diagram that represents the scope and boundaries of the DreamHome database application is presented in the additional documents.

7.5 Requirements Collection and Analysis

Requirements collection and analysis: The process of collecting and analyzing information about the part of the organization to be supported by the database application, and using this information to identify users' requirements of the new system.

Information is gathered for each major user view, including: a description of the data used or generated; details of how the data is to be used or generated; any additional requirements for the new database application. This information is then analyzed to identify the requirements to be included in the new database application.

Another important activity is deciding how to manage a database application with multiple user views. There are three main approaches: the centralized approach; the view integration approach; a combination of both approaches. The DreamHome requirements collection and analysis is presented in the additional documents.

Centralized approach: Requirements for each user view are merged into a single set of requirements. A global data model is created based on the merged requirements, and represents all user views.

Figure: the centralized approach to managing multiple user views 1 to 3.

View integration approach: The requirements for each user view are left as separate lists of requirements and are used to build a separate data model to represent that user view. A data model representing a single user view is called a local data model, and is composed of diagrams and documentation describing the requirements of that particular user view of the database. The local data models are then merged to produce a global data model, which represents all user views for the database.

Figure: the view integration approach to managing multiple user views 1 to 3.
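The merging step of the view integration approach can be illustrated with a minimal sketch. The view names, entities, and attributes below are hypothetical, and the merge is a plain union of attributes per entity; real view integration also has to resolve naming conflicts (synonyms and homonyms), which this sketch does not attempt.

```python
# Hypothetical local data models: each user view maps entity names to attribute sets.
local_models = {
    "Manager": {
        "Staff": {"staffNo", "name", "salary"},
        "Branch": {"branchNo", "city"},
    },
    "Assistant": {
        "Staff": {"staffNo", "name"},
        "Client": {"clientNo", "name"},
    },
}


def merge_local_models(models):
    """Merge local data models into a global model by unioning attributes per entity."""
    global_model = {}
    for view_model in models.values():
        for entity, attributes in view_model.items():
            global_model.setdefault(entity, set()).update(attributes)
    return global_model


global_model = merge_local_models(local_models)
for entity in sorted(global_model):
    print(entity, sorted(global_model[entity]))
```

In the centralized approach the same union would be applied to the requirement lists before any data model is built, rather than to finished local models.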

Cross-reference of user views with the main types of data used by each: property for rent, owner, client, and lease are used by all four user views (Director, Manager, Supervisor, Assistant); staff is used by three; branch, property viewing, and newspaper are each used by two.

7.6 Database Design

Database Design: The process of creating a design for a database that will support the enterprise's operations and objectives.

Major aims:
Represent the data and the relationships between data required by all major application areas and user groups.
Provide a data model that supports any transactions required on the data.
Specify a minimal design that is appropriately structured to achieve the stated performance requirements for the system (such as response times).

In this section we present an overview of the main approaches to database design. We also discuss the purpose and use of data modeling in database design. We then describe the three phases of database design, namely conceptual, logical, and physical design.

7.6.1 Approaches to Database Design
7.6.2 Data Modeling
7.6.3 Phases of Database Design

7.6.1 Approaches to Database Design

Approaches include:
Bottom-up: beginning at the fundamental level of attributes, which are grouped into relations.
Top-down: starting with the development of data models that contain a few high-level entities and relationships, and then identifying lower-level entities, relationships, and the associated attributes.
Inside-out: related to the bottom-up approach, but differing by first identifying a set of major entities and then spreading out to consider other entities, relationships, and attributes associated with those first identified.
Mixed: using both the bottom-up and top-down approaches.

7.6.2 Data Modeling

The main purposes of data modeling are: to assist in understanding the meaning (semantics) of the data; to facilitate communication about the information requirements.

Building a data model requires answering questions about entities, relationships, and attributes. A data model ensures that we understand: each user's perspective of the data; the nature of the data itself, independent of its physical representations; the use of data across user views.

Table: criteria for data models.

7.6.3 Phases of Database Design

There are three phases of database design: conceptual database design, logical database design, and physical database design.

Conceptual database design: The process of constructing a model of the information used in an enterprise, independent of all physical considerations.

The data model is built using the information in the users' requirements specification. It is the source of information for the next phase, logical design.

Steps in conceptual database design:
Step 1 Identify entity types
Step 2 Identify relationship types
Step 3 Identify and associate attributes with entity or relationship types
Step 4 Determine attribute domains
Step 5 Determine candidate and primary key attributes
Step 6 Consider use of enhanced modeling concepts (optional step)
Step 7 Check model for redundancy
Step 8 Validate local conceptual model against user transactions
Step 9 Review local conceptual data model with user
Example for conceptual design: see p. 327.

Logical database design: The process of constructing a model of the information used in an enterprise based on a specific data model (e.g. relational), but independent of a particular DBMS and other physical considerations.

The conceptual data model is refined and mapped on to a logical data model. The logical data model is based on the target data model for the database (such as the relational data model). The logical model also plays an important role during the operational maintenance stage of the database application lifecycle.

Steps in logical database design for the relational model:
1. Build and validate local logical data model for each view
Step 1.1 Remove features not compatible with the relational model (optional step)
Step 1.2 Derive relations for local logical data model
Step 1.3 Validate relations using normalization
Step 1.4 Validate relations against user transactions
Step 1.5 Define integrity constraints
Step 1.6 Review local logical data model with user
2. Build and validate global logical data model
Step 2.1 Merge local logical data models into global model
Step 2.2 Validate global logical data model
Step 2.3 Check for future growth
Step 2.4 Review global logical data model with users
Example for logical design: see p. 342.

Physical database design: The process of producing a description of the database implementation on secondary storage. It describes the base relations, file organizations, and indexes used to achieve efficient access to the data, and any associated integrity constraints and security measures.

Physical design is tailored to a specific DBMS. Its main aim is to describe how we intend to physically implement the logical design. There is feedback between physical and logical design, because decisions taken during physical design to improve performance may affect the structure of the logical data model.
Steps in physical database design for the relational model:
Step 1 Translate global logical data model for target DBMS
Step 1.1 Design base relations
Step 1.2 Design representation of derived data
Step 1.3 Design enterprise constraints
Step 2 Design physical representation
Step 2.1 Analyze transactions
Step 2.2 Choose file organization
Step 2.3 Choose indexes
Step 2.4 Estimate disk space requirements
Step 3 Design user views
Step 4 Design security mechanisms
Step 5 Consider the introduction of controlled redundancy
Step 6 Monitor and tune the operational system
Example for physical design: see p. 371.
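A few of the physical design steps above (designing base relations, choosing an index, and designing a user view) can be sketched with Python's built-in sqlite3 module. SQLite stands in for the target DBMS purely for illustration; the table, column, index, and view names are assumptions loosely modeled on the DreamHome example, not definitions taken from the slides.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Step 1.1: base relations (hypothetical columns)
    CREATE TABLE Branch (
        branchNo TEXT PRIMARY KEY,
        city     TEXT NOT NULL
    );
    CREATE TABLE Staff (
        staffNo  TEXT PRIMARY KEY,
        name     TEXT NOT NULL,
        salary   REAL CHECK (salary >= 0),   -- Step 1.3: a simple constraint
        branchNo TEXT NOT NULL REFERENCES Branch(branchNo)
    );
    -- Step 2.3: an index to support lookups of staff by branch
    CREATE INDEX idx_staff_branch ON Staff(branchNo);
    -- Step 3: a user view that hides the salary column
    CREATE VIEW StaffDirectory AS
        SELECT staffNo, name, branchNo FROM Staff;
    """
)

# List the schema objects we created (internal sqlite_* objects are skipped).
objects = sorted(
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE name NOT LIKE 'sqlite_%'"
    )
)
print(objects)
```

File organization and disk space estimation (Steps 2.2 and 2.4) depend heavily on the chosen DBMS and are not shown here.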

Figure: the correspondence between the three-level ANSI-SPARC architecture for a database system and conceptual, logical, and physical design.

7.7 DBMS Selection

DBMS Selection: The selection of an appropriate DBMS to support the database application.

Selection can be done at any time prior to logical design, provided sufficient information is available regarding system requirements such as performance, ease of restructuring, security, and integrity constraints.

Main steps in selecting a DBMS: define the terms of reference of the study; shortlist two or three products; evaluate the products; recommend a selection and produce a report.

Table: features for DBMS evaluation.

7.8 Application Design

Application design: The design of the user interface and the application programs that use and process the database.

Database design and application design are parallel activities. Application design includes two important activities: transaction design and user interface design.

Transaction: An action, or series of actions, carried out by a single user or application program, which accesses or changes the content of the database.

Transactions represent real-world events such as the registering of a property for rent, the addition of a new member of staff, the registration of a new client, and the renting out of a property.

The purpose of transaction design is to define and document the high-level characteristics of the transactions required, including: data to be used by the transaction; functional characteristics of the transaction; output of the transaction; importance to the users; expected rate of usage.

There are three main types of transaction:
Retrieval transaction: retrieves data for display on the screen or in the production of a report.
Update transaction: inserts new records, deletes old records, or modifies existing records in the database.
Mixed transaction: involves both the retrieval and the updating of data.
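The three transaction types can be illustrated with a small sketch using Python's built-in sqlite3 module. The PropertyForRent table, its columns, and the function names are assumptions chosen for illustration, not the DreamHome schema itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PropertyForRent (propertyNo TEXT PRIMARY KEY, city TEXT, rent REAL)"
)


def register_property(conn, property_no, city, rent):
    """Update transaction: insert a new record."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO PropertyForRent VALUES (?, ?, ?)", (property_no, city, rent)
        )


def properties_in_city(conn, city):
    """Retrieval transaction: read data for display or reporting."""
    return conn.execute(
        "SELECT propertyNo, rent FROM PropertyForRent WHERE city = ? ORDER BY propertyNo",
        (city,),
    ).fetchall()


def raise_rents(conn, city, percent):
    """Mixed transaction: retrieve the current rents, then update them."""
    with conn:
        rows = conn.execute(
            "SELECT propertyNo, rent FROM PropertyForRent WHERE city = ?", (city,)
        ).fetchall()
        for property_no, rent in rows:
            conn.execute(
                "UPDATE PropertyForRent SET rent = ? WHERE propertyNo = ?",
                (rent * (1 + percent / 100), property_no),
            )
    return len(rows)


register_property(conn, "PA14", "Aberdeen", 650.0)
register_property(conn, "PG4", "Glasgow", 400.0)
updated = raise_rents(conn, "Glasgow", 50)
print(updated, properties_in_city(conn, "Glasgow"))
```

Wrapping each function body in `with conn:` keeps every transaction atomic, which matches the definition of a transaction as a series of actions treated as one unit.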

User interface design guidelines:
Meaningful title
Comprehensible instructions
Logical grouping and sequencing of fields
Visually appealing layout of the form/report
Familiar field labels
Consistent terminology and abbreviations
Consistent use of color
Visible space and boundaries for data-entry fields
Convenient cursor movement
Error correction for individual characters and entire fields
Error messages for unacceptable values
Optional fields marked clearly
Explanatory messages for fields
Completion signal

7.9 Prototyping

Prototyping: Building a working model of a database application.

The purpose of a prototype is: to identify features of a system that work well or are inadequate; to suggest improvements or even new features; to clarify the users' requirements; to evaluate the feasibility of a particular system design.

There are two prototyping strategies in common use today:
Requirements prototyping: once the requirements are complete, the prototype is discarded.
Evolutionary prototyping: the prototype is not discarded but, with further development, becomes the working database application.

7.10 Implementation

Implementation: The physical realization of the database and application designs.

The database implementation is achieved using the DDL of the selected DBMS or a graphical user interface (GUI). The DDL is used to create the database schemas and empty database files, and to create any specified user views.

The application programs are implemented using a 3GL or 4GL. This includes the database transactions, which are implemented using the DML, possibly embedded in a host programming language. We also implement the other components of the application design, such as menu screens, data entry forms, and reports.

7.11 Data Conversion and Loading

Data conversion and loading: Transferring any existing data into the new database and converting any existing applications to run on the new database.

This stage is required only when a new database system is replacing an old system. A DBMS normally has a utility that loads existing files into the new database. The utility requires the specification of the source file and the target database, and then automatically converts the data to the format required by the new database. It may also be possible for the developer to convert application programs from the old system for use by the new system.

7.12 Testing

Testing: The process of executing the application programs with the intent of finding errors.

Use carefully planned test strategies and realistic data so that the entire testing process is carried out methodically and rigorously. Successful testing demonstrates that the database and application programs appear to be working according to requirements. Testing cannot show the absence of faults; it can show only that software faults are present. If real data is to be used, it is essential to have backups taken in case of error.

7.13 Operational Maintenance

Operational Maintenance: The process of monitoring and maintaining the system following installation.

Maintenance involves the following activities:
Monitoring the performance of the system: if performance falls below an acceptable level, tuning or reorganization of the database may be required.
Maintaining and upgrading the database application (when required): new requirements are incorporated into the database application through the preceding stages of the lifecycle.
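Returning to the data conversion and loading stage (7.11), the sketch below shows the general shape of a load step: read records exported from an old system, convert them to the types the new schema expects, and insert them in one transaction. It uses Python's csv and sqlite3 modules in place of a DBMS loading utility; the Client table, its columns, and the sample records are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical export from the old system; a real load would open a file instead.
legacy_csv = io.StringIO(
    "clientNo,name,maxRent\n"
    "CR76, John Kay ,425\n"
    "CR56,Aline Stewart,350\n"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Client (clientNo TEXT PRIMARY KEY, name TEXT NOT NULL, maxRent REAL)"
)


def load_clients(conn, source):
    """Convert legacy rows to the new format and load them in a single transaction."""
    rows = [
        (record["clientNo"].strip(), record["name"].strip(), float(record["maxRent"]))
        for record in csv.DictReader(source)
    ]
    with conn:  # all rows are committed together, or none at all
        conn.executemany("INSERT INTO Client VALUES (?, ?, ?)", rows)
    return len(rows)


loaded = load_clients(conn, legacy_csv)
print(loaded, conn.execute("SELECT * FROM Client ORDER BY clientNo").fetchall())
```

Loading inside a single transaction means a conversion error part-way through leaves the new database unchanged, which fits the advice above about protecting real data during testing.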

7.14 CASE Tools

The term Computer-Aided Software Engineering (CASE) can be applied to any tool that supports software engineering. CASE support may include: a data dictionary to store information about the database application's data; design tools to support data analysis; tools to permit the development of the corporate data model, and the conceptual and logical data models; tools to enable prototyping of applications.

CASE tools may be divided into three categories:
Upper-CASE: support the initial stages of the lifecycle.
Lower-CASE: support the later stages of the lifecycle.
Integrated-CASE: support all stages of the lifecycle.

Figure: application of CASE tools.

CASE tools provide the following benefits:
Standards: help to enforce standards on a software project or across the organization.
Integration: store all the definitions generated in a repository, or data dictionary. The data can then be linked together to ensure that all parts of the system are integrated.
Support for standard methods: simplify the work, resulting in documentation that is correct and more current.
Consistency: check that all the stored information is consistent.
Automation: can automatically transform parts of a design specification into executable code.

7.15 Data Administration and Database Administration

The Data Administrator (DA) and the Database Administrator (DBA) are responsible for managing and controlling the activities associated with the corporate data and the corporate database, respectively. The DA is more concerned with the early stages of the lifecycle and the DBA with the later stages.

DA: Management of the data resource, including: database planning; development and maintenance of standards, policies, and procedures; conceptual and logical database design.

DBA: Management of the physical realization of a database application, including: physical database design and implementation; setting security and integrity controls; monitoring system performance; reorganizing the database.

Questions and Exercises

1. Describe the major components of an information system.
2. Describe the main purpose(s) and activities associated with each stage of the database application lifecycle.
3. Compare and contrast the three phases of database design.
4. Identify the stage(s) where it is appropriate to select a DBMS and describe an approach to selecting the best DBMS.
5. Application design involves transaction design and user interface design. Describe the purpose and main activities associated with each.
6. Describe the main advantages of using the prototyping approach when building a database application.
7. Define the purpose and tasks associated with data administration and database administration.

Chapter 8. Entity-Relationship Modeling

To ensure that we get a precise understanding of the nature of the data and how it is used by the enterprise, we need a model for communication that is non-technical and free of ambiguities. The Entity-Relationship (ER) model is one such example. ER modeling is an important technique for any database designer to master and forms the basis of the methodology. In this chapter we introduce the basic concepts of the ER model.

We have chosen a diagrammatic notation that uses an increasingly popular object-oriented modeling language called the Unified Modeling Language (UML).

8.1 Entity Types
8.2 Relationship Types
8.3 Attributes
8.4 Strong and Weak Entity Types
8.5 Attributes on Relationships
8.6 Structural Constraints
8.7 Transforming an ER Model into Relations

The chapter covers:
How to use Entity-Relationship (ER) modeling in database design.
Basic concepts associated with the ER model.
A diagrammatic technique for displaying an ER model using the Unified Modeling Language (UML).
How to build an ER model from a requirements specification.
How to derive a set of relations from a conceptual data model.

Concepts of the ER model: entity types, relationship types, and attributes.

Figure: example in which the entity University (Oxford) is linked by the relationship Has to the entity Student (Bill, Cain, Tim, Kitty, ...).

Figure: ER diagram of the Branch view of DreamHome.

8.1 Entity Types

Entity type: A group of objects with the same properties, which are identified by the enterprise as having an independent existence.

The basic concept of the ER model is the entity type. Different designers may identify different entities.

Entity occurrence: A uniquely identifiable object of an entity type.

We identify each entity type by a name and a list of properties. We use the more general term entity where the meaning, entity type or entity occurrence, is clear from the context.

A database normally contains many different entities.

Figure: examples of entities with a physical or conceptual existence.

Diagrammatic representation of entity types: Each entity type is shown as a rectangle labeled with the name of the entity. An entity is normally named using a singular noun. In UML, the first letter of each word in the entity name is upper case.

Figure: diagrammatic representation of the Staff and Branch entity types.

8.2 Relationship Types

Relationship type: A set of meaningful associations among entity types.

A relationship type is a set of associations between one or more participating entity types. Each relationship type is given a name that describes its function.

Relationship occurrence: A uniquely identifiable association, which includes one occurrence from each participating entity type.

A relationship occurrence indicates the particular entity occurrences that are related.

Diagrammatic representation of relationship types: Each relationship type is shown as a line connecting the associated entity types, labeled with the name of the relationship. Normally, a relationship is named using a verb or a short phrase including a verb. An arrow symbol is placed beside the name to indicate the correct direction in which to read it. The first letter of each word in the relationship name is shown in upper case. Whenever possible, a relationship name should be unique within a given ER model.

Figure: a diagrammatic representation of the Branch Has Staff relationship type. The relationship is labeled in one direction only.

Degree of a relationship type: The number of participating entity types in a relationship.

A relationship of degree two is called binary; three is called ternary; four is called quaternary.

Diagrammatic representation of complex relationships: The UML notation uses a diamond to represent relationships with degrees higher than binary. The name of the relationship is displayed inside the diamond, and in this case the directional arrow normally associated with the name is omitted.

Figure: an example of a binary relationship called POwns.
Figure: an example of a ternary relationship called Registers.
Figure: an example of a quaternary relationship called Arranges.

Recursive relationship: A relationship type where the same entity type participates more than once in different roles.

Relationships may be given role names to indicate the purpose that each participating entity type plays in a relationship. Role names can be important for recursive relationships, to determine the function of each participant. Role names may also be used when two entities are associated through more than one relationship. Role names are usually not required if the function of the participating entities in a relationship is unambiguous.

Figure: an example of a recursive relationship called Supervises with role names Supervisor and Supervisee.
Figure: an example of entities associated through two distinct relationships called Manages and Has, with role names.

8.3 Attributes

Attribute: A property of an entity or a relationship type.

Attributes hold values that describe each entity occurrence and represent the main part of the data stored in the database.

Attribute domain: The set of allowable values for one or more attributes.

The domain defines the potential values that an attribute may hold and is similar to the domain concept in the relational model. Attributes may share a domain. A fully developed data model includes the domains of each attribute in the ER model.

8.3 Attributes (continued)

Attributes can be classified as being: simple or composite; single-valued or multi-valued; derived.

Simple Attribute: An attribute composed of a single component with an independent existence. Simple attributes cannot be further subdivided into smaller components. Simple attributes are sometimes called atomic attributes.

Composite Attribute: An attribute composed of multiple components, each with an independent existence.

Single-valued Attribute: An attribute that holds a single value for each occurrence of an entity type. The majority of attributes are single-valued. For example, branchno is referred to as being single-valued.

Multi-valued Attribute: An attribute that holds multiple values for each occurrence of an entity type. A multi-valued attribute may have lower and upper limits on the number of values it holds. For example, telno has between one and three values.

Derived Attribute: An attribute that represents a value that is derivable from the value of a related attribute or set of attributes, not necessarily in the same entity type. For example:
duration can be calculated from rentstart and rentfinish, which come from the same entity type Lease.
totalstaff can be calculated by counting the total number of Staff entity occurrences.
deposit of the Lease entity can be calculated as twice the monthly rent of the PropertyForRent entity type.

Candidate Key: The minimal set of attributes that uniquely identifies each occurrence of an entity type. A candidate key cannot contain a null.

Primary Key: The candidate key that is selected to uniquely identify each occurrence of an entity type. The choice of primary key for an entity is based on considerations such as attribute length and the minimal number of attributes required.

Composite Key: A candidate key that consists of two or more attributes. In some cases, the key of an entity type is composed of several attributes whose values together are unique for each entity occurrence, but not separately.

Diagrammatic representation of attributes: If an entity type is to be displayed with its attributes, we divide the rectangle representing the entity in two. The upper part of the rectangle displays the name of the entity and the lower part lists the names of the attributes.

8.4 Strong and Weak Entity Types

Strong Entity Type: An entity type that is not existence-dependent on some other entity type. A characteristic of a strong entity type is that each entity occurrence is uniquely identifiable using the primary key attribute(s).

Weak Entity Type: An entity type that is existence-dependent on some other entity type. Each weak entity occurrence cannot be uniquely identified using only the attributes associated with that entity type, so a weak entity type has no primary key of its own.

Weak entity types are sometimes referred to as child, dependent, or subordinate entities, and strong entity types as parent, owner, or dominant entities.

8.4 Strong and Weak Entity Types (continued)

Figure: A strong entity type called Client and a weak entity type called Preference.

8.5 Attributes on Relationships

We use the same symbol, a rectangle, to represent attributes on a relationship, but the rectangle is attached to the relationship using a dashed line.

Figure: A relationship called Advertises with attributes.

8.6 Structural Constraints

Identifying and representing all appropriate enterprise constraints is an important part of modeling an enterprise. The main type of constraint on relationships is called multiplicity.

Multiplicity: The number (or range) of possible occurrences of an entity type that may relate to a single occurrence of an associated entity type through a particular relationship.

Multiplicity constrains the way that entities are related. It represents policies (called business rules) established by the user or company.

The most common degree for relationships is binary. Binary relationships are generally referred to as being:
one-to-one (1:1): a member of staff manages a branch;
one-to-many (1:*): a member of staff oversees properties for rent;
many-to-many (*:*): newspapers advertise properties for rent.

It is important to note that not all enterprise constraints can be easily represented in an ER model.

The following subsections show how to determine the multiplicity for each of these constraints, how to represent each in an ER diagram, and how to examine multiplicity for relationships of degree higher than binary:
One-to-One (1:1) Relationships
One-to-Many (1:*) Relationships
Many-to-Many (*:*) Relationships
Multiplicity for Complex Relationships
Cardinality and Participation Constraints

8.6.1 One-to-One (1:1) Relationships

Figure: Semantic net of the Staff Manages Branch relationship type.

8.6.1 One-to-One (1:1) Relationships (continued)

Figure: Multiplicity of the Staff Manages Branch (1:1) relationship type.

8.6.2 One-to-Many (1:*) Relationships

Figure: Semantic net of the Staff Oversees PropertyForRent relationship type.
Figure: Multiplicity of the Staff Oversees PropertyForRent (1:*) relationship type.

8.6.3 Many-to-Many (*:*) Relationships

Figure: Semantic net of the Newspaper Advertises PropertyForRent relationship type.
Figure: Multiplicity of the Newspaper Advertises PropertyForRent (*:*) relationship type.

8.6.4 Multiplicity for Complex Relationships

Multiplicity for Complex Relationships: The number (or range) of possible occurrences of an entity type in an n-ary relationship when the other (n-1) values are fixed.

The multiplicity for a ternary relationship represents the potential range of entity occurrences of a particular entity in the relationship when the values representing the other two entities are fixed. For example, consider the ternary Registers relationship between Staff, Branch, and Client. We examine the Registers relationship when the values for the Staff and Branch entities are fixed.

8.6.4 Multiplicity for Complex Relationships (continued)

Figure: Semantic net of the ternary Registers relationship with values for the Staff and Branch entities fixed.
Figure: Multiplicity of the ternary Registers relationship.

When staffno and branchno are fixed, there are zero or more corresponding clientno values (0..*).
When staffno and clientno are fixed, there is exactly one corresponding branchno value (1..1).
When branchno and clientno are fixed, there is exactly one corresponding staffno value (1..1).

Figure: Summary of multiplicity constraints.

8.6.5 Cardinality and Participation Constraints

Multiplicity is made up of two types of restrictions on relationships: cardinality and participation.

Cardinality: Describes the maximum number of possible relationship occurrences for an entity participating in a given relationship type. The cardinality of a binary relationship is what we previously referred to as one-to-one, one-to-many, or many-to-many.

Participation: Determines whether all or only some entity occurrences participate in a relationship. If all entity occurrences participate, this is referred to as mandatory participation. If only some participate, this is referred to as optional participation.

Figure: Multiplicity as cardinality and participation constraints.

8.7 Transform ER into Relations

Figure: Local conceptual data model for the Staff view showing all attributes.

Transform ER into relational data model: To create relations for the local logical data model to represent the entities, relationships, and attributes that have been identified.

(1) Strong entity types
Create a relation that includes all simple attributes of that entity. For composite attributes, include only the constituent simple attributes.
Staff (staffno, fname, lname, position, sex, DOB)
Primary Key staffno

(2) Weak entity types
Create a relation that includes all simple attributes of that entity. The primary key is partially or fully derived from each owner entity.
Preference (preftype, maxrent)
Primary Key None (at present)

(3) 1:* Binary relationship types
The entity on the "one" side is designated the parent entity and the entity on the "many" side is the child entity. Post a copy of the primary key attribute(s) of the parent entity into the relation representing the child entity, to act as a foreign key.

(4) 1:1 Binary relationship types
These are more complex, as cardinality cannot be used to identify the parent and child entities in a relationship. Instead, participation is used to decide whether to combine the entities into one relation or to create two relations and post a copy of the primary key from one relation to the other. Consider the following:
(a) mandatory participation on both sides of the 1:1 relationship;
(b) mandatory participation on one side of the 1:1 relationship;
(c) optional participation on both sides of the 1:1 relationship.

(a) Mandatory participation on both sides of a 1:1 relationship
Combine the entities involved into one relation and choose one of the primary keys of the original entities to be the primary key of the new relation, while the other (if one exists) is used as an alternate key. The Client States Preference relationship is an example of a 1:1 relationship with mandatory participation on both sides.
Client (clientno, fname, lname, telno, preftype, maxrent, staffno)
Primary Key clientno
Foreign Key staffno references Staff(staffno)

(b) Mandatory participation on one side of a 1:1 relationship
Identify the parent and child entities using the participation constraints. The entity with optional participation is designated the parent entity, and the other entity is designated the child entity. A copy of the primary key of the parent is placed in the relation representing the child entity. If the relationship has one or more attributes, these attributes should follow the posting of the primary key to the child relation.

(c) Optional participation on both sides of a 1:1 relationship
The designation of the parent and child entities is arbitrary unless we can find out more about the relationship. Consider a 1:1 Staff Uses Car relationship with optional participation on both sides. If there is no additional information to help select the parent and child entities, the choice is arbitrary: designate Car as parent, or vice versa. Now assume that the majority of cars, but not all, are used by staff, and that only a minority of staff use cars. The Car entity, although optional, is closer to being mandatory than the Staff entity. Therefore designate Staff as the parent entity and Car as the child entity.

8.7 Transform ER into Relations (continued)

(5) 1:1 Recursive relationships
Follow the rules for participation for a 1:1 relationship.
Mandatory participation on both sides: represent the recursive relationship as a single relation with two copies of the primary key. As before, one copy of the primary key represents a foreign key and should be renamed to indicate the relationship it represents.
Mandatory participation on only one side: either create a single relation with two copies of the primary key as described above, or create a new relation to represent the relationship. The new relation would have only two attributes, both copies of the primary key. As before, the copies of the primary key act as foreign keys and have to be renamed to indicate the purpose of each in the relation.
Optional participation on both sides: again create a new relation as described above.

(6) *:* Binary relationship types
Create a relation to represent the relationship and include any attributes that are part of the relationship. Post a copy of the primary key attribute(s) of the entities that participate in the relationship into the new relation, to act as foreign keys. These foreign keys will also form the primary key of the new relation, possibly in combination with some of the attributes of the relationship.

(7) Complex relationship types
Create a relation to represent the relationship and include any attributes that are part of the relationship. Post a copy of the primary key attribute(s) of the entities that participate in the complex relationship into the new relation, to act as foreign keys. Any foreign keys that represent a "many" relationship (for example, 1..*, 0..*) generally will also form the primary key of the new relation, possibly in combination with some of the attributes of the relationship.

(8) Multi-valued attributes
Create a new relation to represent the multi-valued attribute and include the primary key of the entity in the new relation, to act as a foreign key. Unless the multi-valued attribute is itself an alternate key of the entity, the primary key of the new relation is the combination of the multi-valued attribute and the primary key of the entity.

Figure: Relations for the Staff view of DreamHome.
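Rules (3), (6), and (8) above can be sketched as a small schema. The following is a minimal illustration using Python's standard sqlite3 module, not the schema used in the course. Staff, PropertyForRent, Newspaper, and Advertises are names from the text; the StaffTelephone relation, the newspapername column, the column types, and the sample rows are assumptions made only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE Staff (
    staffno TEXT PRIMARY KEY,
    fname   TEXT,
    lname   TEXT
);
-- Rule (3), 1:* Staff Oversees PropertyForRent:
-- the parent's primary key is posted into the child as a foreign key.
CREATE TABLE PropertyForRent (
    propertyno TEXT PRIMARY KEY,
    rent       REAL,
    staffno    TEXT REFERENCES Staff(staffno)
);
CREATE TABLE Newspaper (
    newspapername TEXT PRIMARY KEY   -- assumed key, for illustration
);
-- Rule (6), *:* Newspaper Advertises PropertyForRent:
-- a new relation whose foreign keys together form its primary key.
CREATE TABLE Advertises (
    newspapername TEXT REFERENCES Newspaper(newspapername),
    propertyno    TEXT REFERENCES PropertyForRent(propertyno),
    PRIMARY KEY (newspapername, propertyno)
);
-- Rule (8), multi-valued attribute telno (hypothetical relation name):
-- primary key = owner's key + the multi-valued attribute.
CREATE TABLE StaffTelephone (
    staffno TEXT REFERENCES Staff(staffno),
    telno   TEXT,
    PRIMARY KEY (staffno, telno)
);
""")

# Made-up sample rows.
conn.execute("INSERT INTO Staff VALUES ('S1', 'Ann', 'Example')")
conn.execute("INSERT INTO PropertyForRent VALUES ('P1', 350.0, 'S1')")

# A child row that references a non-existent parent is rejected.
try:
    conn.execute("INSERT INTO PropertyForRent VALUES ('P2', 450.0, 'S9')")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

property_count = conn.execute(
    "SELECT COUNT(*) FROM PropertyForRent").fetchone()[0]
```

The posted foreign key is what lets the DBMS reject a property whose overseeing member of staff does not exist.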

Questions and Exercises

1. Describe what entity types represent in an ER model and provide examples of entities with a physical or conceptual existence.
2. Describe what relationship types represent in an ER model and provide examples of unary, binary, ternary, and quaternary relationships.
3. Describe what attributes represent in an ER model and provide examples of simple, composite, single-valued, multi-valued, and derived attributes.
4. Describe what the multiplicity constraint represents for a relationship type.
5. What are enterprise constraints and how does multiplicity model these constraints?
6. Describe the rules for deriving relations that represent: strong entity types; weak entity types; one-to-many (1:*) binary relationship types; one-to-one (1:1) binary relationship types; one-to-one (1:1) recursive relationship types; many-to-many (*:*) binary relationship types; complex relationship types; multi-valued attributes.

Create an ER diagram for each of the following descriptions:
(a) Each company operates four departments, and each department belongs to one company.
(b) Each department in part (a) employs one or more employees, and each employee works for one department.
(c) Each of the employees in part (b) may or may not have one or more dependants, and each dependant belongs to one employee.
(d) Each employee in part (c) may or may not have an employment history.
(e) Represent all the ER diagrams described in (a), (b), (c), and (d) as a single ER diagram.

7. You are required to create a conceptual data model of the data requirements for a company that specializes in IT training. The company has 30 instructors and can handle up to 100 trainees per training session. The company offers five advanced technology courses, each of which is taught by a teaching team of two or more instructors. Each instructor is assigned to a maximum of two teaching teams or may be assigned to do research. Each trainee undertakes one advanced technology course per training session.
(a) Identify the main entity types for the company.
(b) Identify the main relationship types and specify the multiplicity for each relationship. State any assumptions you make about the data.
(c) Using your answers for (a) and (b), draw a single ER diagram to represent the data requirements for the company.

8. Derive relations for the following conceptual data model:

Chapter 9 Normalization

The main objective in developing a logical data model for relational database systems is to create an accurate representation of the data, its relationships, and constraints. A technique that we can use to help identify such relationships is called normalization. Normalization is a bottom-up approach to database design that begins by examining the relationships between attributes. In this course we use a top-down approach to database design that begins by identifying the main entities and relationships, and we use normalization as a validation technique.

9.1 The Purpose of Normalization
9.2 Data Redundancy and Update Anomalies
9.3 Functional Dependencies
9.4 The Process of Normalization
9.5 First Normal Form (1NF)
9.6 Second Normal Form (2NF)
9.7 Third Normal Form (3NF)
9.8 General Definition of Second and Third Normal Form
9.9 Boyce-Codd Normal Form (BCNF)
9.10 Review of Normalization

This chapter covers:
The purpose of normalization.
Problems associated with redundant data.
Identification of various types of update anomalies, such as insertion, deletion, and modification anomalies.
How to recognize the appropriateness or quality of the design of relations.
How functional dependencies can be used to group attributes into relations that are in a known normal form.
How to undertake the process of normalization.
How to identify the most commonly used normal forms, namely 1NF, 2NF, 3NF, and Boyce-Codd normal form (BCNF).

9.1 The Purpose of Normalization

Normalization: A technique for producing a set of relations with desirable properties, given the data requirements of an enterprise.

The four most commonly used normal forms are first (1NF), second (2NF), and third (3NF) normal forms, and Boyce-Codd normal form (BCNF). They are based on functional dependencies among the attributes of a relation. A relation can be normalized to a specific form to prevent the possible occurrence of update anomalies.

9.2 Data Redundancy and Update Anomalies

A major aim of relational database design is to group attributes into relations so as to minimize data redundancy and reduce the file storage space required by the base relations. The problems associated with data redundancy are illustrated by comparing the Staff and Branch relations with the StaffBranch relation.

The StaffBranch relation has redundant data: the details of a branch are repeated for every member of staff located at that branch. In contrast, branch information appears only once for each branch in the Branch relation, and only branchno is repeated in the Staff relation, to represent where each member of staff works.

9.2 Data Redundancy and Update Anomalies (continued)

Relations that contain redundant information may suffer from problems called update anomalies. Types of update anomalies include insertion, deletion, and modification anomalies.

Insertion anomaly: To insert the details of new members of staff into the StaffBranch relation, we must include the details of the branch at which the staff are to be located. To insert details of a new branch that currently has no members of staff into the StaffBranch relation, we would have to enter nulls for the staff attributes. However, because staffno is the primary key, entering a null for staffno violates entity integrity and is not allowed.

Deletion anomaly: If we delete a tuple from StaffBranch that represents the last member of staff located at a branch, the details about that branch are also lost from the database.

Modification anomaly: If we want to change the value of one of the attributes of a particular branch in StaffBranch, we must update the tuples of all staff located at that branch. If this modification is not carried out on all the appropriate tuples of StaffBranch, the database will become inconsistent.

The above examples illustrate that the Staff and Branch relations have more desirable properties than the StaffBranch relation, which can be decomposed into the Staff and Branch relations. There are two important properties associated with the decomposition of a large relation into smaller relations:
The lossless-join property ensures that any instance of the original relation can be identified from the corresponding instances in the smaller relations.
The dependency preservation property ensures that a constraint on the original relation can be maintained by simply enforcing some constraint on each of the smaller relations.
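The lossless-join property described above can be checked on a small instance. The sketch below, in plain Python, projects a StaffBranch instance onto Staff and Branch and then rejoins them. The attribute names follow the text; the row values and the helper names project, natural_join, and as_set are made up for illustration.

```python
# Made-up StaffBranch instance: branch details repeat for each staff member.
staff_branch = [
    {"staffno": "S1", "sname": "A", "branchno": "B1", "baddress": "Addr 1"},
    {"staffno": "S2", "sname": "B", "branchno": "B1", "baddress": "Addr 1"},
    {"staffno": "S3", "sname": "C", "branchno": "B2", "baddress": "Addr 2"},
]

def project(rows, attrs):
    """Project rows onto attrs, removing duplicate tuples."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(seen)]

def natural_join(left, right):
    """Join two relations on all attributes they have in common."""
    joined = []
    for l_row in left:
        for r_row in right:
            common = set(l_row) & set(r_row)
            if all(l_row[a] == r_row[a] for a in common):
                joined.append({**l_row, **r_row})
    return joined

def as_set(rows):
    """Order-independent representation of a relation instance."""
    return {tuple(sorted(row.items())) for row in rows}

staff = project(staff_branch, ["staffno", "sname", "branchno"])
branch = project(staff_branch, ["branchno", "baddress"])

rejoined = natural_join(staff, branch)
lossless = as_set(rejoined) == as_set(staff_branch)
```

Here branch holds each branch address once, and joining Staff with Branch on branchno recovers exactly the original tuples, with no extra (spurious) rows.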
9.3 Functional Dependency

One of the main concepts associated with normalization is functional dependency.

Functional Dependency: Describes the relationship between attributes in a relation. For example, if A and B are attributes of relation R, B is functionally dependent on A (denoted A → B) if each value of A in R is associated with exactly one value of B in R.

Functional dependency is a property of the meaning or semantics of the attributes in a relation. The semantics indicate how attributes relate to one another, and specify the functional dependencies between attributes.

Figure: Diagrammatic representation of a functional dependency.
Figure: Example 9.1, identifying a functional dependency.

The determinant of a functional dependency refers to the attribute or group of attributes on the left-hand side of the arrow.
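The definition above (each value of A is associated with exactly one value of B) can be tested against a relation instance. The following is a minimal Python sketch; the function name fd_holds and the sample rows are assumptions for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not contradicted by this instance."""
    seen = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        # The first dependent value seen for a determinant must match
        # every later one.
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True

# Made-up Staff rows: two members of staff share the same position.
staff = [
    {"staffno": "S1", "sname": "A", "position": "Manager"},
    {"staffno": "S2", "sname": "B", "position": "Assistant"},
    {"staffno": "S3", "sname": "C", "position": "Assistant"},
]

staffno_determines_position = fd_holds(staff, ["staffno"], ["position"])
position_determines_staffno = fd_holds(staff, ["position"], ["staffno"])
```

Note that a single instance can only refute a dependency. Because a functional dependency is a statement about the semantics of the attributes, a True result here shows only that the sample data is consistent with it.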

9.3 Functional Dependency (continued)

The main characteristics of the functional dependencies that we use in normalization are that they: have a one-to-one relationship between the attribute(s) on the left-hand and right-hand side of a dependency; hold for all time; are nontrivial.

Example 9.2 Identifying a functional dependency that holds for all time.
Consider the values in the staffno and sname attributes of Staff.
staffno → sname: holds for all time.
sname → staffno: holds only at a given moment in time.

Example 9.3 Trivial functional dependency.
staffno, sname → sname
staffno, sname → staffno
The above dependencies are true, but they do not provide any additional information about constraints. We are normally more interested in nontrivial dependencies because they represent integrity constraints for the relation.

Example 9.4 Identifying a set of functional dependencies for the StaffBranch relation.
staffno → sname, position, salary, branchno, baddress
branchno → baddress
baddress → branchno
branchno, position → salary
baddress, position → salary

The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and that has the property that every functional dependency in Y is implied by the functional dependencies in X.

The set of all functional dependencies implied by a given set of functional dependencies X is called the closure of X (written X+). A set of inference rules, called Armstrong's axioms, specifies how new functional dependencies can be inferred from given ones. Let A, B, and C be subsets of the attributes of relation R. Armstrong's axioms are as follows:
1) Reflexivity: If B is a subset of A, then A → B
2) Augmentation: If A → B, then A,C → B,C
3) Transitivity: If A → B and B → C, then A → C
All functional dependencies implied by X can be derived from X using these rules. The rules can be used to derive the closure X+.

9.4 The Process of Normalization

Normalization is a formal technique for analyzing relations based on their primary key (or candidate keys) and functional dependencies. Normalization is often executed as a series of steps. Each step corresponds to a specific normal form that has known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies. In the following sections, we describe the process of normalization in detail.
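Enumerating the full closure X+ by hand is rarely practical. A common shortcut, sketched here in Python as an illustration rather than as the method used in these slides, is to compute the closure of a set of attributes: A → B is implied by X exactly when B is contained in the attribute closure of A. The dependencies are those of Example 9.4 for StaffBranch; the function name closure is an assumption.

```python
def closure(attrs, fds):
    """Attribute closure: every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the determinant is already covered, its dependents follow
            # (this applies reflexivity, augmentation and transitivity).
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Functional dependencies of StaffBranch from Example 9.4.
fds = [
    (("staffno",), ("sname", "position", "salary", "branchno", "baddress")),
    (("branchno",), ("baddress",)),
    (("baddress",), ("branchno",)),
    (("branchno", "position"), ("salary",)),
    (("baddress", "position"), ("salary",)),
]

staffno_closure = closure({"staffno"}, fds)
branchno_closure = closure({"branchno"}, fds)

# Is "baddress, position -> branchno" implied by the set? Yes if branchno
# appears in the closure of {baddress, position}.
implied = "branchno" in closure({"baddress", "position"}, fds)
```

Since the closure of staffno contains every attribute of StaffBranch, staffno is a key of the relation, whereas branchno determines only baddress.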

9.4 The Process of Normalization (continued)

Figure: Diagrammatic illustration of the relationship between the normal forms.

9.5 First Normal Form (1NF)

Unnormalized form (UNF): A table that contains one or more repeating groups.

First normal form (1NF): A relation in which the intersection of each row and column contains one and only one value.

To transform an unnormalized table to first normal form (1NF), we identify and remove the repeating groups within the table:
Nominate an attribute or group of attributes to act as the key for the unnormalized table.
Identify the repeating group(s) in the unnormalized table that repeat for the key attribute(s).

9.6 Second Normal Form (2NF)

Second normal form (2NF) is based on the concept of full functional dependency.

Full functional dependency: If A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A.

A functional dependency A → B is a partial dependency if there is some attribute that can be removed from A and the dependency still holds.

2NF: A relation that is in 1NF and in which every non-primary-key attribute is fully functionally dependent on the primary key.

To transform a 1NF relation to 2NF:
Identify the primary key for the 1NF relation.
Identify the functional dependencies in the relation.
If partial dependencies exist on the primary key, remove them by placing them in a new relation along with a copy of their determinant.

Example 9.5 Consider the relation ClientRental (clientno, propertyno, cname, paddress, rentstart, rentfinish, rent, ownerno, oname), which has the following functional dependencies:
fd1 clientno, propertyno → rentstart, rentfinish
fd2 clientno → cname
fd3 propertyno → paddress, rent, ownerno, oname
fd4 ownerno → oname
fd5 clientno, rentstart → propertyno, paddress, rentfinish, rent, ownerno, oname
fd6 propertyno, rentstart → clientno, cname, rentfinish

Solution:
Client (clientno, cname)
Rental (clientno, propertyno, rentstart, rentfinish)
PropertyOwner (propertyno, paddress, rent, ownerno, oname)

9.7 Third Normal Form (3NF)

Third normal form is based on the concept of transitive dependency.

Transitive dependency: If A, B, and C are attributes of a relation such that A → B and B → C, then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C).

3NF: A relation that is in 1NF and 2NF and in which no non-primary-key attribute is transitively dependent on the primary key.

To transform a 2NF relation to 3NF:
Identify the primary key in the 2NF relation.
Identify the functional dependencies in the relation.
If transitive dependencies exist on the primary key, remove them by placing them in a new relation along with a copy of their determinant.

Example 9.6 Consider the following relations, derived in the previous example.
Client (clientno, cname)
fd2 clientno → cname
Rental (clientno, propertyno, rentstart, rentfinish)
fd1 clientno, propertyno → rentstart, rentfinish
fd5 clientno, rentstart → propertyno, rentfinish
fd6 propertyno, rentstart → clientno, rentfinish
PropertyOwner (propertyno, paddress, rent, ownerno, oname)
fd3 propertyno → paddress, rent, ownerno, oname
fd4 ownerno → oname

Solution:
Client (clientno, cname)
Rental (clientno, propertyno, rentstart, rentfinish)
PropertyForRent (propertyno, paddress, rent, ownerno)
Owner (ownerno, oname)
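The 2NF and 3NF steps of Examples 9.5 and 9.6 can be reproduced mechanically. The Python sketch below is illustrative only: it uses the dependencies fd1 to fd6 from the text, and the function names closure, partial_dependencies, and transitive_dependencies are assumptions. The transitive check is deliberately simplified and is meant for relations that are already in 2NF.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def partial_dependencies(key, fds):
    """Non-key attributes determined by a proper subset of the key."""
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            determined = closure(subset, fds) - set(key)
            if determined:
                found[subset] = determined
    return found

def transitive_dependencies(attrs, key, fds):
    """Dependencies inside attrs whose determinant is not a superkey
    and which determine at least one non-key attribute."""
    found = []
    for lhs, rhs in fds:
        inside = set(lhs) <= set(attrs)
        is_superkey = set(attrs) <= closure(lhs, fds)
        if inside and not is_superkey:
            nonkey = (set(rhs) & set(attrs)) - set(key) - set(lhs)
            if nonkey:
                found.append((lhs, nonkey))
    return found

fds = [
    (("clientno", "propertyno"), ("rentstart", "rentfinish")),        # fd1
    (("clientno",), ("cname",)),                                      # fd2
    (("propertyno",), ("paddress", "rent", "ownerno", "oname")),      # fd3
    (("ownerno",), ("oname",)),                                       # fd4
    (("clientno", "rentstart"),
     ("propertyno", "paddress", "rentfinish", "rent", "ownerno", "oname")),  # fd5
    (("propertyno", "rentstart"), ("clientno", "cname", "rentfinish")),      # fd6
]

# 2NF step on ClientRental with primary key (clientno, propertyno).
partial = partial_dependencies(("clientno", "propertyno"), fds)

# 3NF step on PropertyOwner with primary key (propertyno).
property_owner = {"propertyno", "paddress", "rent", "ownerno", "oname"}
transitive = transitive_dependencies(property_owner, ("propertyno",), fds)
```

The partial dependencies found on clientno and propertyno correspond to the Client and PropertyOwner relations of Example 9.5, and the dependency ownerno → oname reported for PropertyOwner is the one removed into Owner in Example 9.6.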

9.8 General Definition of Second and Third Normal Form

In this section, we present more general definitions for 2NF and 3NF that take into account the candidate keys of a relation.

Second normal form (2NF): A relation that is in 1NF and in which every non-candidate-key attribute is fully functionally dependent on any candidate key.

Third normal form (3NF): A relation that is in 1NF and 2NF and in which no non-candidate-key attribute is transitively dependent on any candidate key.

9.9 Boyce-Codd Normal Form (BCNF)

BCNF is based on functional dependencies that take into account all candidate keys in a relation; however, BCNF also has additional constraints compared with the general definition of 3NF.

BCNF: A relation is in BCNF if and only if every determinant is a candidate key.

The difference between 3NF and BCNF is that, for a functional dependency A → B, 3NF allows this dependency in a relation if B is a primary-key attribute and A is not a candidate key, whereas BCNF insists that for this dependency to remain in a relation, A must be a candidate key.

Every relation in BCNF is also in 3NF. However, a relation in 3NF may not be in BCNF. The solution relations in Example 9.6 are all in BCNF.

Violation of BCNF is quite rare, since it may only happen under specific conditions. The potential to violate BCNF may occur in a relation that: contains two (or more) composite candidate keys; and in which the candidate keys overlap (i.e. have at least one attribute in common).

9.10 Review of Normalization

Example 9.7 We extend the DreamHome case to include property inspections by members of staff. Staff are allocated a company car for use on the day of the inspections. A car may be allocated to several members of staff as required throughout the working day. A member of staff may inspect several properties on a given date, but a property is only inspected once on a given date. The DreamHome property inspection reports are presented next.

Figure: DreamHome property inspection reports.

We first transfer the sample data held on two property inspection reports into table format with rows and columns (unnormalized form).

26 9.10 Review of Normaliztion We transform the previous unnormalized form to 1NF 9.10 Review of Normaliztion The following figure, we illustrate the functional dependencies of the StaffPropertyInspection relation with (propertyno,idate) as the primary key. Copyright 2007 BUPTSSE Guo Wenming Page 151 Copyright 2007 BUPTSSE Guo Wenming Page Review of Normaliztion We note that the paddress is partially dependent on the primary key, whereas the remaining attributes are fully dependent on the whole primary key. The partial dependency propertyno paddress indicates that the relation is not in 2NF. Removing the partial dependency from the relation and creating two new relations, these relations are in 2NF. Property (propertyno, paddress) PropertyInspection (prpertyno, idate, itime, comments, staffno, sname, carreg) Copyright 2007 BUPTSSE Guo Wenming Page Review of Normaliztion The functional dependencies within the Property and PropertyInspection relations are as follows: Property (propertyno, paddress) fd2 propertyno paddress PropertyInspection (prpertyno, idate, itime, comments, staffno, sname, carreg) fd1 propertyno, idate itime, comments, staffno, sname, carreg fd3 staffno sname (transitively dependent on key) fd4 staff,idate carreg fd5 carreg, idate, itime propertyno, comments, staffno, sname fd6 staffno, idate, itime propertyno, comments Copyright 2007 BUPTSSE Guo Wenming Page Review of Normaliztion We note that the sname is transitively dependent on the primary key (propertyno, idate), transitve dependency indicates that the relations are not in 3NF. To transform the PropertyInspection relation into 3NF, we remove the transitive dependency (staffno sname) by creating two new relations. These are in 3NF as following. 
Property (propertyno, paddress)
Staff (staffno, sname)
PropertyInspect (propertyno, idate, itime, comments, staffno, carreg)

The functional dependencies for the Property, Staff and PropertyInspect relations are as follows:

Property (propertyno, paddress)
fd2  propertyno → paddress

Staff (staffno, sname)
fd3  staffno → sname

PropertyInspect (propertyno, idate, itime, comments, staffno, carreg)
fd1  propertyno, idate → itime, comments, staffno, carreg
fd4  staffno, idate → carreg
fd5  carreg, idate, itime → propertyno, comments, staffno
fd6  staffno, idate, itime → propertyno, comments
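The dependencies listed for PropertyInspect can be checked mechanically by computing attribute closures. The following is a minimal sketch, not part of the original slides; the names closure, is_superkey, ALL and FDS are hypothetical and chosen only for illustration. A determinant whose closure does not cover every attribute is not a superkey, which is the condition that matters for BCNF.

```python
def closure(attrs, fds):
    """Return the closure of a set of attributes under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left-hand side, we gain the right-hand side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Attributes and dependencies of PropertyInspect, copied from the list above.
ALL = {"propertyno", "idate", "itime", "comments", "staffno", "carreg"}
FDS = [
    ({"propertyno", "idate"}, {"itime", "comments", "staffno", "carreg"}),  # fd1
    ({"staffno", "idate"}, {"carreg"}),                                     # fd4
    ({"carreg", "idate", "itime"}, {"propertyno", "comments", "staffno"}),  # fd5
    ({"staffno", "idate", "itime"}, {"propertyno", "comments"}),            # fd6
]

def is_superkey(attrs):
    """True if attrs functionally determines every attribute of the relation."""
    return closure(attrs, FDS) == ALL
```

With these definitions, is_superkey({"propertyno", "idate"}) holds, whereas the closure of {"staffno", "idate"} stops at {"staffno", "idate", "carreg"}, so the determinant of fd4 is not a superkey.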

We can see that the PropertyInspect relation is not in BCNF because of the presence of the determinant (staffno, idate), which is not a candidate key (represented by fd4). To transform the PropertyInspect relation into BCNF, we must remove the dependency that violates BCNF by creating two new relations. The resulting BCNF relations have the following form:

Property (propertyno, paddress)
Staff (staffno, sname)
PropertyInspection (propertyno, idate, itime, comments, staffno)
StaffCar (staffno, idate, carreg)

[Figure: decomposition of the StaffPropertyInspection relation into BCNF relations]

Questions and Exercises

1. Describe the types of update anomalies that may occur on a relation that has redundant data.
2. Describe the concept of functional dependency.
3. How is the concept of functional dependency associated with the process of normalization?
4. Describe the concept of full functional dependency and describe how this concept relates to 2NF. Provide an example to illustrate your answer.
5. Describe the concept of transitive dependency and describe how this concept relates to 3NF. Provide an example to illustrate your answer.
6. Examine the Patient Medication Form for the Wellmeadows Hospital case study shown in the figure on the next page.
(a) Identify the functional dependencies represented by the data shown in the form.
(b) Describe and illustrate the process of normalizing the data shown in the form to first (1NF), second (2NF), third (3NF), and BCNF.
(c) Identify the primary, alternate, and foreign keys in your BCNF relations.

Chapter 10. Transaction Management

Transaction management comprises three closely related functions that are intended to ensure that the database is reliable and remains in a consistent state: transaction support, concurrency control services, and recovery services. This reliability and consistency must be maintained in the presence of failures of both hardware and software components, and when multiple users are accessing the database. In this chapter we concentrate on these three functions.

10.1 Transaction Support
10.2 Concurrency Control
10.3 Database Recovery

Chapter objectives:

Transaction support
Function and importance of transactions.
Properties of transactions.

Concurrency control
Meaning of serializability.
How locking can ensure serializability.
Deadlock and how it can be resolved.
How timestamping can ensure serializability.
Granularity of locking.

Recovery control
Some causes of database failure.
Purpose of the transaction log file.
Purpose of checkpointing.
How to recover following database failure.

10.1 Transaction Support

Transaction: An action, or series of actions, carried out by a single user or application program, which reads or updates the contents of the database.

A transaction is a logical unit of work on the database. An application program can be thought of as a series of transactions with non-database processing taking place in between. A transaction should always transform the database from one consistent state to another, although consistency may be violated while the transaction is in progress.

Transaction a: update the salary of a particular member of staff, given the staff number.
Transaction b: delete the member of staff with a given staff number x. We also need to find all PropertyForRent tuples that this member of staff managed and reassign them to a different member of staff, say newstaffno.

A transaction can have one of two outcomes:
Success: the transaction commits and the database reaches a new consistent state.
Failure: the transaction aborts, and the database must be restored to the consistent state it was in before the transaction started. Such a transaction is rolled back or undone.

A committed transaction cannot be aborted. If a committed transaction was a mistake, we must perform another, compensating transaction to reverse its effects.
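Transaction b and its two possible outcomes can be sketched with Python's built-in sqlite3 module. This is an illustrative sketch only: the two-column table layouts, the sample rows, and the helper names make_db and delete_staff are assumptions, not the DreamHome schema. The "with conn:" block commits if the body completes and rolls back if an exception is raised.

```python
import sqlite3

def make_db():
    """Create a small in-memory database (hypothetical, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, salary REAL)")
    conn.execute("CREATE TABLE PropertyForRent (propertyNo TEXT PRIMARY KEY, staffNo TEXT)")
    conn.executemany("INSERT INTO Staff VALUES (?, ?)",
                     [("SG14", 18000), ("SG37", 12000)])
    conn.executemany("INSERT INTO PropertyForRent VALUES (?, ?)",
                     [("PG4", "SG14"), ("PG16", "SG14")])
    conn.commit()
    return conn

def delete_staff(conn, x, new_staff_no):
    """Transaction b: delete staff x and reassign their properties.

    Returns True if the transaction committed, False if it was rolled back.
    """
    try:
        with conn:  # commit on success, rollback if an exception escapes
            conn.execute("DELETE FROM Staff WHERE staffNo = ?", (x,))
            found = conn.execute("SELECT 1 FROM Staff WHERE staffNo = ?",
                                 (new_staff_no,)).fetchone()
            if found is None:
                # Abort: reassigning to a non-existent member of staff would
                # leave the database in an inconsistent state.
                raise ValueError("replacement staff member does not exist")
            conn.execute("UPDATE PropertyForRent SET staffNo = ? WHERE staffNo = ?",
                         (new_staff_no, x))
        return True
    except ValueError:
        return False
```

If the replacement staff number does not exist, the DELETE that was already executed is undone by the rollback, so the database returns to the state it was in before the transaction started.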

An aborted transaction that is rolled back can be restarted later. The keywords BEGIN TRANSACTION, COMMIT, and ROLLBACK are available in many data manipulation languages to delimit transactions. If these delimiters are not used, the entire program is usually regarded as a single transaction, with the DBMS automatically performing a COMMIT when the program terminates correctly and a ROLLBACK if it does not.

[Figure: state transition diagram for a transaction]

The four basic (ACID) properties of a transaction are:
Atomicity: the "all or nothing" property.
Consistency: a transaction must transform the database from one consistent state to another.
Isolation: partial effects of incomplete transactions should not be visible to other transactions.
Durability: effects of a committed transaction are permanent and must not be lost because of a later failure.

DBMS transaction subsystem:
The transaction manager coordinates transactions on behalf of application programs.
The buffer manager is responsible for the transfer of data between disk storage and main memory.
The recovery manager ensures that the database is restored to a consistent state after a failure.
The scheduler implements the concurrency control strategy; it is sometimes referred to as the lock manager when the protocol is locking-based.

10.2 Concurrency Control

Concurrency control: The process of managing simultaneous operations on the database without having them interfere with one another.

This section covers:
The Need for Concurrency Control
Serializability and Recoverability
Locking Methods
Deadlock
Timestamping Methods
Granularity of Data Items

The Need for Concurrency Control

Concurrency control prevents interference when two or more users are accessing the database simultaneously and at least one is updating data. Although two transactions may be correct in themselves, interleaving of their operations may produce an incorrect result.
Three examples of potential problems caused by concurrency:
Lost update problem.
Uncommitted dependency problem.
Inconsistent analysis problem.

Lost update problem: A successfully completed update is overridden by another user.
T1 is withdrawing 10 from an account with balance balx, initially 100. T2 is depositing 100 into the same account. Serially, the final balance would be 190. The loss of T2's update is avoided by preventing T1 from reading balx until after T2's update has completed.

Uncommitted dependency (or dirty read) problem: Occurs when one transaction can see the intermediate results of another transaction before it has committed.
T4 updates balx to 200 but then aborts, so balx should be restored to its original value of 100. T3 has read the new value of balx (200) and uses it as the basis of a reduction of 10, giving a new balance of 190 instead of 90. The problem is avoided by preventing T3 from reading balx until after T4 commits or aborts.

[Figure: uncommitted dependency problem]

Inconsistent analysis problem: Occurs when a transaction reads several values but a second transaction updates some of them during the execution of the first. Sometimes referred to as an unrepeatable (or fuzzy) read.
T6 is totaling the balances of account x (100), account y (50), and account z (25). Meanwhile, T5 has transferred 10 from balx to balz, so T6 now has the wrong result (10 too high). The problem is avoided by preventing T6 from reading balx and balz until after T5 has completed its updates.

[Figure: inconsistent analysis (unrepeatable read) problem]

Serializability and Recoverability

The objective of a concurrency control protocol is to schedule transactions in such a way as to avoid any interference.
One obvious solution is to run transactions serially, allowing only one transaction to execute at a time so that each transaction commits before the next is allowed to begin. However, this limits the degree of concurrency or parallelism in the system. Serializability identifies those executions of transactions that are guaranteed to ensure consistency.
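To make the contrast between serial and interleaved execution concrete, the lost update example (T1 withdraws 10, T2 deposits 100, initial balance 100) can be replayed step by step. This is a toy sketch under simplifying assumptions: a single data item, and hypothetical operation names read, add and write. Each transaction works on a private copy of the balance between its read and its write.

```python
def run(schedule, initial=100):
    """Replay a schedule of (transaction, operation, argument) steps on balx."""
    db = {"balx": initial}
    local = {}  # private working copy of balx held by each transaction
    for txn, op, arg in schedule:
        if op == "read":
            local[txn] = db["balx"]
        elif op == "add":
            local[txn] += arg
        elif op == "write":
            db["balx"] = local[txn]
    return db["balx"]

T1 = [("T1", "read", None), ("T1", "add", -10), ("T1", "write", None)]
T2 = [("T2", "read", None), ("T2", "add", 100), ("T2", "write", None)]

serial = T1 + T2  # T1 runs to completion, then T2

# Both transactions read the old balance before either writes.
interleaved = [T2[0], T1[0], T2[1], T1[1], T2[2], T1[2]]
```

Here run(serial) gives 190, while run(interleaved) gives 90 because T1's write overwrites T2's deposit.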

Schedule: A sequence of the operations (reads/writes) by a set of concurrent transactions that preserves the order of the operations in each of the individual transactions.

Serial schedule: A schedule where the operations of each transaction are executed consecutively without any interleaved operations from other transactions.

Nonserial schedule: A schedule where the operations from a set of concurrent transactions are interleaved.

The objective of serializability is to find nonserial schedules that are equivalent to some serial schedule. Such a schedule is called serializable.

In serializability, the ordering of reads and writes is important:
(a) If two transactions only read a data item, they do not conflict and order is not important.
(b) If two transactions either read or write completely separate data items, they do not conflict and order is not important.
(c) If one transaction writes a data item and another reads or writes the same data item, the order of execution is important.

Serializability identifies schedules that maintain database consistency, assuming no transaction fails. An alternative perspective examines the recoverability of transactions within a schedule.

Recoverable schedule: A schedule where, for each pair of transactions Ti and Tj, if Tj reads a data item previously written by Ti, then the commit operation of Ti precedes the commit operation of Tj.

There are two basic concurrency control techniques:
Locking
Timestamping

Locking and timestamping are conservative approaches: they delay transactions in case they conflict with other transactions.

Locking Methods

Locking: A procedure used to control concurrent access to data.
When one transaction is accessing the database, a lock may deny access to other transactions to prevent incorrect updates. Locking methods are the most widely used approach to ensuring serializability. Generally, a transaction must claim a shared (read) or exclusive (write) lock on a data item before reading or writing it. The lock prevents another transaction from modifying the item or, in the case of a write lock, even reading it.

Shared lock: If a transaction has a shared lock on a data item, it can read the item but not update it.

Exclusive lock: If a transaction has an exclusive lock on a data item, it can both read and update the item.

Reads cannot conflict, so more than one transaction can hold shared locks simultaneously on the same item. An exclusive lock gives a transaction exclusive access to that item. As long as a transaction holds the exclusive lock on the item, no other transaction can read or update that item.
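The compatibility rules for shared and exclusive locks can be sketched as a small lock table. This is an illustrative sketch, not how any particular DBMS implements its lock manager; the class name LockTable and the mode codes "S" and "X" are assumptions, as is the choice to let a sole holder upgrade its own lock.

```python
class LockTable:
    """Toy lock table: grants a lock or reports that the requester must wait."""

    def __init__(self):
        self.locks = {}  # item -> [mode, set of transactions holding the lock]

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        entry = self.locks.get(item)
        if entry is None:
            self.locks[item] = [mode, {txn}]
            return True
        held_mode, holders = entry
        if holders == {txn}:
            # Assumption: the only holder may re-request or upgrade its lock.
            if mode == "X":
                entry[0] = "X"
            return True
        if mode == "S" and held_mode == "S":
            holders.add(txn)  # shared locks are compatible with each other
            return True
        return False  # any combination involving an exclusive lock conflicts

    def release(self, txn, item):
        """Release txn's lock on item, removing the entry once nobody holds it."""
        entry = self.locks.get(item)
        if entry and txn in entry[1]:
            entry[1].discard(txn)
            if not entry[1]:
                del self.locks[item]
```

Two readers can share an item, but a writer is refused until both have released it, and a reader is refused while a writer holds the item.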

To guarantee serializability, we need an additional protocol concerning the positioning of lock and unlock operations in every transaction.

Two-phase locking (2PL): A transaction follows the 2PL protocol if all locking operations precede the first unlock operation in the transaction.

There are two phases for a transaction:
Growing phase: acquires all locks but cannot release any locks.
Shrinking phase: releases locks but cannot acquire any new locks.

We now look at how two-phase locking is used to resolve the three problems identified earlier under The Need for Concurrency Control.

[Figure: preventing the lost update problem]

[Figure: preventing the uncommitted dependency problem]

[Figure: preventing the inconsistent analysis problem]

If every transaction in a schedule follows 2PL, the schedule is serializable. However, problems can still arise from the timing of lock releases. Consider the figure on the next page:
The transactions conform to 2PL, but T14 aborts.
Since T15 is dependent on T14, T15 must also be rolled back.
Since T16 is dependent on T15, it too must be rolled back.
This is called cascading rollback.

To prevent this with 2PL, leave the release of all locks until the end of the transaction. This is called rigorous 2PL. Most database systems implement rigorous 2PL.

[Figure: cascading rollback with 2PL]
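The two-phase rule and its rigorous variant can be checked against the sequence of actions issued by a single transaction. A minimal sketch, assuming a hypothetical encoding in which each action is a pair such as ("lock", item), ("unlock", item) or ("commit", None):

```python
def follows_2pl(ops):
    """True if no lock is acquired after the first unlock (growing, then shrinking)."""
    unlocked = False
    for action, _item in ops:
        if action == "unlock":
            unlocked = True
        elif action == "lock" and unlocked:
            return False  # tried to grow again after starting to shrink
    return True

def follows_rigorous_2pl(ops):
    """True if the transaction is two-phase and releases locks only after it ends."""
    ended = False
    for action, _item in ops:
        if action in ("commit", "abort"):
            ended = True
        elif action == "unlock" and not ended:
            return False  # a lock was released before the end of the transaction
    return follows_2pl(ops)

two_phase = [("lock", "x"), ("lock", "y"), ("unlock", "x"),
             ("unlock", "y"), ("commit", None)]
not_two_phase = [("lock", "x"), ("unlock", "x"), ("lock", "y"),
                 ("unlock", "y"), ("commit", None)]
rigorous = [("lock", "x"), ("lock", "y"), ("commit", None),
            ("unlock", "x"), ("unlock", "y")]
```

The first sequence is two-phase but not rigorous, because it releases locks before committing. That early release is what allows other transactions to read its uncommitted changes and so makes cascading rollback possible.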

Deadlock

Deadlock: An impasse that may result when two (or more) transactions are each waiting for locks to be released that are held by the other.

There is only one way to break a deadlock: abort one or more of the transactions. Deadlock should be transparent to the user, so the DBMS should automatically restart the aborted transaction(s).

There are three general techniques for handling deadlock:
Timeouts.
Deadlock prevention.
Deadlock detection and recovery.

Timeouts
A transaction that requests a lock will wait only for a system-defined period of time. If the lock has not been granted within this period, the lock request times out. The DBMS then assumes that the transaction may be deadlocked, aborts it, and automatically restarts it.

Deadlock prevention
The DBMS looks ahead to see whether a transaction would cause deadlock and never allows deadlock to occur. One approach to deadlock prevention is to order transactions using transaction timestamps.

Deadlock detection and recovery
The DBMS allows deadlock to occur but recognizes it and breaks it. Systems generally avoid the deadlock prevention method. Deadlock detection is usually handled by constructing a wait-for graph (WFG) that shows transaction dependencies:
Create a node for each transaction.
Create an edge Ti → Tj if Ti is waiting to lock an item that is locked by Tj.
Deadlock exists if and only if the WFG contains a cycle.

Timestamping Methods

A different approach that also guarantees serializability uses transaction timestamps to order transaction execution for an equivalent serial schedule. Timestamp methods for concurrency control are quite different from locking methods. No locks are involved, and therefore there can be no deadlock. Unlike locking methods, timestamp methods involve no waiting: transactions involved in a conflict are simply rolled back and restarted.
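Returning briefly to deadlock detection, the wait-for graph test amounts to looking for a cycle in a directed graph. A minimal sketch using depth-first search follows; the dictionary representation and the function name has_deadlock are assumptions made for illustration.

```python
def has_deadlock(wait_for):
    """wait_for maps each transaction to the set of transactions it is waiting on.

    Returns True if the wait-for graph contains a cycle.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    colour = {}

    def visit(t):
        colour[t] = GREY
        for u in wait_for.get(t, ()):
            state = colour.get(u, WHITE)
            if state == GREY:
                return True  # back edge: u is already on the current path
            if state == WHITE and visit(u):
                return True
        colour[t] = BLACK
        return False

    return any(colour.get(t, WHITE) == WHITE and visit(t) for t in list(wait_for))
```

For example, has_deadlock({"T17": {"T18"}, "T18": {"T17"}}) is true, whereas a simple chain of waits such as {"T1": {"T2"}, "T2": {"T3"}} contains no cycle.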
Timestamp: A unique identifier created by the DBMS that indicates the relative starting time of a transaction.

A timestamp can be generated by using the system clock at the time the transaction started, or by incrementing a logical counter every time a new transaction starts.

Timestamping: A concurrency control protocol that orders transactions in such a way that older transactions, that is, transactions with smaller timestamps, get priority in the event of conflict.
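A minimal sketch of basic timestamp ordering, assuming that each data item carries the timestamp of the youngest transaction to have read it and of the last transaction to have written it. The names Item, Abort, read and write are hypothetical; a real scheduler would also restart the aborted transaction with a new timestamp.

```python
from dataclasses import dataclass

class Abort(Exception):
    """Raised when a transaction must be rolled back and restarted."""

@dataclass
class Item:
    value: int = 0
    read_ts: int = 0   # largest timestamp of any transaction that has read the item
    write_ts: int = 0  # timestamp of the last transaction that wrote the item

def read(item, ts):
    """Read item on behalf of a transaction with timestamp ts."""
    if ts < item.write_ts:
        raise Abort("item already updated by a younger transaction")
    item.read_ts = max(item.read_ts, ts)
    return item.value

def write(item, ts, value):
    """Write item on behalf of a transaction with timestamp ts."""
    if ts < item.read_ts:
        raise Abort("item already read by a younger transaction")
    if ts < item.write_ts:
        raise Abort("item already written by a younger transaction")
    item.value = value
    item.write_ts = ts
```

An older transaction that arrives too late at an item is rejected outright; no transaction ever waits.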

A read or write is allowed to proceed only if the last update on that data item was carried out by an older transaction. Otherwise, the transaction requesting the read/write is restarted and given a new timestamp.

Besides timestamps for transactions, there are timestamps for data items. Each data item contains:
read-timestamp: the timestamp of the last transaction to read the item;
write-timestamp: the timestamp of the last transaction to write (update) the item.

Consider a transaction T with timestamp ts(T). The timestamp ordering protocol works as follows.

Transaction T issues a read(x):
ts(T) < write_timestamp(x): x has already been updated by a younger (later) transaction. Transaction T must be aborted and restarted with a new timestamp.
Otherwise (ts(T) ≥ write_timestamp(x)): the read operation can proceed, and read_timestamp(x) = max(ts(T), read_timestamp(x)).

Transaction T issues a write(x):
ts(T) < read_timestamp(x): x has already been read by a younger transaction. Roll back transaction T and restart it using a later timestamp.
ts(T) < write_timestamp(x): x has already been written by a younger transaction. Transaction T should be rolled back and restarted using a new timestamp.
Otherwise, the operation can proceed, and write_timestamp(x) = ts(T).

[Figure: example of basic timestamp ordering]

Granularity of Data Items

Granularity: The size of data items chosen as the unit of protection by a concurrency control protocol.

A data item is chosen to be one of the following, ranging from coarse to fine:
The entire database.
A file.
A page (or area or database space).
A record.
A field value of a record.

The granularity of the data item that can be locked in a single operation has a significant effect on the overall performance of the concurrency control algorithm.
There are several tradeoffs that have to be considered in choosing the data item size:
the coarser the item size, the lower the degree of concurrency;
the finer the item size, the more locking information needs to be stored.

The best item size depends on the types of transactions. Some systems automatically upgrade locks from record or page to file if a particular transaction is locking more than a certain percentage of the records or pages in the file.

We can represent the granularity of locks in a hierarchical structure. The root node represents the entire database, level 1 nodes represent files, and so on. When a node is locked, all its descendants are also locked. The DBMS should check the hierarchical path before granting a lock.

An intention lock can be used to lock all ancestors of a locked node. Intention locks can be read or write. They are applied top-down and released bottom-up.

10.3 Database Recovery

Reliability refers to both the resilience of the DBMS to various types of failure and its capability to recover from them. In this section, we consider how this service can be provided.

Database recovery: The process of restoring the database to a correct state in the event of a failure.

This section covers:
The Need for Recovery
Transactions and Recovery
Recovery Facilities
Recovery Techniques

The Need for Recovery

The storage of data generally includes four different types of media with an increasing degree of reliability:
Main memory: volatile primary storage; it does not survive system crashes.
Magnetic disk: online non-volatile secondary storage.
Magnetic tape: offline non-volatile secondary storage.
Optical disk: non-volatile secondary storage.

Stable storage represents information that has been replicated in several non-volatile storage media with independent failure modes.

There are many different types of failure that can affect database processing. Among the causes of failure are:
System crashes: due to hardware or software errors, resulting in loss of main memory.
Media failures: such as head crashes or unreadable media, resulting in loss of parts of secondary storage.
Application software errors: errors in a program that is accessing the database, which cause transactions to fail.
Natural physical disasters: such as fires, floods, earthquakes, or power failures.
Carelessness: unintentional destruction of data or facilities by operators or users.
Sabotage: intentional corruption or destruction of data.
Transactions and Recovery

Transactions represent the basic unit of recovery in a database system. The recovery manager is responsible for two of the four ACID properties of transactions: atomicity and durability.

The database buffers occupy an area in main memory from which data is transferred to and from secondary storage. When buffers need to be flushed, the buffer manager decides which buffer to write to disk according to a replacement policy such as LRU (least recently used). The explicit writing of the buffers to secondary storage is known as force-writing.

If a failure occurs between writing to the buffers and flushing the buffers to secondary storage, the recovery manager must determine the status of the transaction that performed the write at the time of failure. If the transaction had issued its commit, then to ensure durability the recovery manager has to redo (roll forward) the transaction's updates. If the transaction had not committed at the time of failure, the recovery manager has to undo (roll back) any effects of that transaction to ensure atomicity.

Partial undo: only one transaction has to be undone.
Global undo: all active transactions have to be undone.

Example: The DBMS starts at time t0 but fails at time tf. Assume that the data for transactions T2 and T3 has been written to secondary storage. T1 and T6 have to be undone. In the absence of any other information, the recovery manager has to redo T2, T3, T4, and T5.

Recovery Facilities

A DBMS should provide the following facilities to assist with recovery:
Backup mechanism, which makes periodic backup copies of the database.
Logging facilities, which keep track of the current state of transactions and database changes.
Checkpoint facility, which enables updates to the database that are in progress to be made permanent.
Recovery manager, which allows the DBMS to restore the database to a consistent state following a failure.

Backup mechanism
Backup copies of the database and the log file should be made at regular intervals, without necessarily having to stop the system first.

Log file
To keep track of database transactions, the DBMS maintains a special file called a log (or journal) that contains information about all updates to the database. The log may contain the following data:
Transaction records.
Checkpoint records.
The log is often used for other purposes as well (for example, for performance monitoring and auditing).

Transaction records contain:
Transaction identifier;
Type of log record (transaction start, insert, update, delete, abort, commit);
Identifier of the data item affected by the database action (insert, delete, and update operations);
Before-image of the data item, that is, its value before the change (update and delete operations only);
After-image of the data item, that is, its value after the change (insert and update operations only);
Log management information, such as pointers to the previous and next log records for that transaction (all operations).
[Figure: a segment of a log file. The columns pptr and nptr represent pointers to the previous and next log records for each transaction.]

The log file may be duplexed or triplexed (that is, two or three separate copies are maintained) so that if one copy is damaged, another can be used. In some environments, where a vast amount of logging information is generated every day, it is not possible to hold all this data online all the time. One approach to handling the offlining of the log is to divide the online log into two separate random-access files. When the first file is full, the second is opened and used by transactions, and the first is closed and transferred to offline storage. The log file is a potential bottleneck and can be critical in determining overall performance.
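To illustrate how the before- and after-images held in transaction records could drive recovery, here is a simplified sketch. It assumes that updates may reach the database before commit, that each log record is a plain tuple, and that only update records carry images; the tuple layout and the name recover are hypothetical, and checkpoints and the pptr/nptr pointers are ignored.

```python
def recover(db, log):
    """Bring db to a consistent state using the log.

    Each log record is (txn, "start"), (txn, "commit"), (txn, "abort"),
    or (txn, "update", item, before_image, after_image).
    """
    committed = {rec[0] for rec in log if rec[1] == "commit"}

    # Redo: reapply after-images of committed transactions in log order.
    for txn, kind, *rest in log:
        if kind == "update" and txn in committed:
            item, _before, after = rest
            db[item] = after

    # Undo: restore before-images of uncommitted transactions in reverse order.
    for txn, kind, *rest in reversed(log):
        if kind == "update" and txn not in committed:
            item, before, _after = rest
            db[item] = before
    return db

log = [
    ("T1", "start"),
    ("T1", "update", "x", 100, 90),
    ("T2", "start"),
    ("T2", "update", "y", 50, 70),
    ("T1", "commit"),
    # failure occurs here: T2 never committed
]
```

If the database on disk at the time of failure is {"x": 100, "y": 70}, that is, T1's committed update had not yet been flushed but T2's uncommitted update had, recover restores it to {"x": 90, "y": 50}.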

Checkpoint

Checkpoint: The point of synchronization between the database and the transaction log file. All buffers are force-written to secondary storage.

Checkpoints are scheduled at predetermined intervals and involve the following operations:
Writing all log records in main memory to secondary storage;
Writing the modified blocks in the database buffers to secondary storage;
Writing a checkpoint record to the log file. This record contains the identifiers of all transactions that are active at the time of the checkpoint.

When a failure occurs, redo all transactions that committed since the checkpoint and undo all transactions that were active at the time of the crash. In the previous example, with a checkpoint at time tc, the changes made by T2 and T3 have been written to secondary storage. Thus we only need to redo T4 and T5 and undo T1 and T6.

Recovery Techniques

The recovery procedure depends on the extent of the damage that has occurred to the database. We consider two cases.

If the database has been damaged:
We need to restore the last backup copy of the database and reapply the updates of committed transactions using the log file.

If the database is only inconsistent:
We need to undo the changes that caused the inconsistency. We may also need to redo some transactions to ensure that their updates reach secondary storage.
We do not need to use the backup copy of the database, but can restore the database using the before- and after-images in the log file.

There are three main recovery techniques:
Deferred update
Immediate update
Shadow paging

Using the deferred update recovery protocol
Updates are not written to the database until after a transaction has reached its commit point.
If a transaction fails before commit, it will not have modified the database, and so no undoing of changes is required.
It may be necessary to redo the updates of committed transactions, as their effects may not have reached the database.

Write-ahead log protocol
It is essential that log records are written before the corresponding write to the database.

Recovery techniques using deferred update
We go back to the most recent checkpoint record:
Any transaction with both transaction start and transaction commit log records should be redone.
For any transaction with transaction start and transaction abort log records, we do nothing: no actual writing was done to the database, so these transactions do not have to be undone.

Using the immediate update recovery protocol
Updates are applied to the database as they occur, without waiting to reach the commit point.
We need to redo the updates of committed transactions following a failure.
We may need to undo the effects of transactions that had not committed at the time of failure.

Recovery techniques using immediate update
We go back to the most recent checkpoint record:
Any transaction for which both a transaction start and a transaction commit record appear in the log should be redone.
Any transaction for which the log contains a transaction start record but no transaction commit record must be undone.
Undo operations are performed in the reverse of the order in which they were written to the log.

Shadow paging
Maintain two page tables during the life of a transaction: the current page table and the shadow page table.
When the transaction starts, the two page tables are the same.
The shadow page table is never changed thereafter and is used to restore the database in the event of failure.
During the transaction, the current page table records all updates to the database.
When the transaction completes, the current page table becomes the shadow page table.

Advantages of shadow paging:
The overhead of maintaining the log file is eliminated.
Recovery is significantly faster, since there is no need for undo or redo operations.

Disadvantages of shadow paging:
Data fragmentation, since there are two areas for the same data.
The need for periodic garbage collection to reclaim inaccessible blocks.

Most DBMSs do not implement shadow paging.

Questions and Exercises

1. Explain what is meant by a transaction.
2. The consistency and reliability aspects of transactions are due to the ACID properties of transactions. Discuss each of these properties and how they relate to the concurrency control and recovery mechanisms. Give examples to illustrate your answer.
3. Describe, with examples, the types of problem that can occur in a multi-user environment when concurrent access to the database is allowed.
4. Explain the concepts of serial, nonserial, and serializable schedules.
5.
Discuss the types of problem that can occur with locking-based mechanisms for concurrency control and the actions that can be taken by a DBMS to prevent them.
6. What is a timestamp?
7. For each of the following schedules, state whether the schedule is serializable and whether it is recoverable:
a) read(t1,balx), read(t2,balx), write(t1,balx), write(t2,balx), commit(t1), commit(t2)
b) read(t1,balx), read(t2,balx), write(t3,balx), read(t2,balx), read(t1,balx), commit(t1), commit(t2)
c) read(t1,balx), write(t2,balx), write(t1,balx), abort(t2), commit(t1)
d) write(t1,balx), read(t2,balx), write(t1,balx), commit(t2), abort(t1)
e) read(t1,balx), write(t2,balx), write(t1,balx), read(t3,balx), commit(t1), commit(t2), commit(t3)

Chapter 11. Query Processing

In network and hierarchical DBMSs, it is the programmer's responsibility to select the most appropriate execution strategy. With SQL, the user specifies what data is required rather than how it is to be retrieved. Giving the DBMS the responsibility for selecting the best strategy prevents users from choosing strategies that are known to be inefficient and gives the DBMS more control over system performance. In this chapter we concentrate on techniques for query processing and query optimization.

11.1 Overview of Query Processing
11.2 Query Decomposition
11.3 Query Optimization
11.4 Transformation Rules for RA Operations
11.5 Heuristical Processing Strategies
11.6 Cost Estimation for RA Operations

Chapter objectives:
Objectives of query processing and optimization.
How to create a relational algebra tree (R.A.T.) to represent a query.
The techniques for query optimization.
Rules of equivalence for RA operations.
How to apply heuristic transformation rules to improve the efficiency of a query.
Types of database statistics required to estimate the cost of operations.

11.1 Overview of Query Processing

Query processing: The activities involved in retrieving data from the database.

Aims of query processing (QP):
transform a query written in a high-level language (e.g. SQL) into a correct and efficient execution strategy expressed in a low-level language (implementing the relational algebra, or RA);
execute the strategy to retrieve the required data.

Query optimization: The activity of choosing an efficient execution strategy for processing a query.

As there are many equivalent transformations of the same high-level query, the aim of query optimization (QO) is to choose the one that minimizes resource usage. Generally, the goal is to reduce the total execution time of the query. QO may also reduce the response time of the query.
In practice, the aim is to find a near-optimum solution.

Example 11.1: Find all Managers who work at a London branch.

    SELECT *
    FROM Staff s, Branch b
    WHERE s.branchNo = b.branchNo
      AND (s.position = 'Manager' AND b.city = 'London');

Three equivalent RA queries are:
(1) σ_{(position='Manager') ∧ (city='London') ∧ (Staff.branchNo=Branch.branchNo)} (Staff × Branch)
(2) σ_{(position='Manager') ∧ (city='London')} (Staff ⋈_{Staff.branchNo=Branch.branchNo} Branch)
(3) (σ_{position='Manager'} (Staff)) ⋈_{Staff.branchNo=Branch.branchNo} (σ_{city='London'} (Branch))
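The equivalence of the three strategies in Example 11.1 can be checked on a small data set. The following is a minimal sketch, not a DBMS implementation: the tuples, the helper names `select`, `cartesian` and `join`, and the `s.`/`b.` attribute prefixes are all assumptions made for illustration.

```python
from itertools import product

# Toy relations; the tuples are invented for illustration only.
staff = [
    {"staffNo": "S1", "position": "Manager", "branchNo": "B1"},
    {"staffNo": "S2", "position": "Assistant", "branchNo": "B1"},
    {"staffNo": "S3", "position": "Manager", "branchNo": "B2"},
]
branch = [
    {"branchNo": "B1", "city": "London"},
    {"branchNo": "B2", "city": "Glasgow"},
]


def select(pred, rel):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in rel if pred(t)]


def cartesian(r, s):
    """Cartesian product; attributes are prefixed so both branchNo columns survive."""
    return [
        {**{"s." + k: v for k, v in a.items()},
         **{"b." + k: v for k, v in b.items()}}
        for a, b in product(r, s)
    ]


def join(r, s):
    """Equijoin on branchNo, written here as a Selection over the product."""
    return select(lambda t: t["s.branchNo"] == t["b.branchNo"], cartesian(r, s))


# (1) One Selection over the full Cartesian product.
r1 = select(
    lambda t: t["s.position"] == "Manager"
    and t["b.city"] == "London"
    and t["s.branchNo"] == t["b.branchNo"],
    cartesian(staff, branch),
)

# (2) Join first, then select on position and city.
r2 = select(
    lambda t: t["s.position"] == "Manager" and t["b.city"] == "London",
    join(staff, branch),
)

# (3) Select on each relation first, then join the reduced relations.
r3 = join(
    select(lambda t: t["position"] == "Manager", staff),
    select(lambda t: t["city"] == "London", branch),
)

print(r1 == r2 == r3)  # all three strategies return the same tuples
```

The three results are identical, but strategy (3) builds its product from two Staff tuples and one Branch tuple rather than from the full relations, which is the effect the cost comparison quantifies.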

Assume:
- 1000 tuples in Staff; 50 tuples in Branch;
- 50 Managers; 5 London branches;
- results of any intermediate operations are stored on disk;
- the cost of the final write is ignored;
- tuples are accessed one at a time.

Costs (in disk accesses) are:
(1) (1000 + 50) + 2*(1000 * 50) = 101,050
(2) 2*1000 + (1000 + 50) = 3,050
(3) 1000 + 2*50 + 5 + (50 + 5) = 1,160

Cartesian product and join operations are much more expensive than selection, and the third option significantly reduces the size of the relations being joined together.

QP has four main phases:
- decomposition (consisting of parsing and validation);
- optimization;
- code generation;
- execution.

11.2 Query Decomposition

The aims are to transform a high-level query into an RA query and to check that the query is syntactically and semantically correct.
Typical stages are: analysis, normalization, semantic analysis, simplification, query restructuring.

Analysis
- Analyze the query lexically and syntactically using compiler techniques.
- Verify that the relations and attributes exist.
- Verify that the operations are appropriate for the object type.

Example:

    SELECT staff_no FROM Staff WHERE position > 10;

This query would be rejected on two grounds: staff_no is not defined for the Staff relation (it should be staffNo), and the comparison > 10 is incompatible with the type of position, which is a variable-length character string.

Finally, the query is transformed into some internal representation more suitable for processing. Some kind of query tree is typically chosen, constructed as follows:
- A leaf node is created for each base relation.
- A non-leaf node is created for each intermediate relation produced by an RA operation.
- The root of the tree represents the query result.
- The sequence is directed from leaves to root.
A relational algebra tree (R.A.T.) can be used to provide an internal representation of a transformed query.
[Figure: example relational algebra tree]
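The construction rules for a query tree can be sketched as a small data structure. This is an illustrative sketch only: the `Node` class, the `post_order` function and the ASCII operator labels are assumptions, and the tree shown corresponds to strategy (2) of Example 11.1 rather than to the figure on the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of a relational algebra tree (hypothetical structure)."""
    label: str
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        # Leaf nodes stand for base relations.
        return not self.children


def post_order(node):
    """Yield labels from the leaves to the root, the direction of evaluation."""
    for child in node.children:
        yield from post_order(child)
    yield node.label


# Leaves are base relations; non-leaf nodes are intermediate relations
# produced by an RA operation; the root is the query result.
tree = Node(
    "SELECT position='Manager' AND city='London'",
    [Node("JOIN Staff.branchNo = Branch.branchNo", [Node("Staff"), Node("Branch")])],
)

order = list(post_order(tree))
for step in order:
    print(step)
```

Walking the tree in post-order visits the two base relations first, then the join, and finally the root selection.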

11.3 Query Optimization

Two main techniques for query optimization:
- heuristic rules that order the operations in a query;
- comparing different strategies based on their relative costs, and selecting the one that minimizes resource usage.
The two strategies are usually combined in practice.

11.4 Transformation Rules for RA Operations

Query optimization can apply transformation rules to convert one relational algebra expression into an equivalent expression that is known to be more efficient.

1) Conjunctive Selection operations can cascade into individual Selection operations (and vice versa).
   σ_{p ∧ q ∧ r}(R) = σ_p(σ_q(σ_r(R)))
2) Commutativity of Selection.
   σ_p(σ_q(R)) = σ_q(σ_p(R))
3) In a sequence of Projection operations, only the last in the sequence is required.
   Π_L(Π_M(... Π_N(R))) = Π_L(R)
4) Commutativity of Selection and Projection.
   Π_{A1, ..., Am}(σ_p(R)) = σ_p(Π_{A1, ..., Am}(R))
   where the predicate p involves only the attributes A1, A2, ..., Am.
5) Commutativity of Theta join (and Cartesian product).
   R ⋈_p S = S ⋈_p R
   R × S = S × R
6) Commutativity of Selection and Theta join (or Cartesian product).
   σ_p(R ⋈_r S) = (σ_p(R)) ⋈_r S
   σ_p(R × S) = (σ_p(R)) × S
   where the predicate p involves only attributes of R, {A1, A2, ..., An}.
7) Commutativity of Projection and Theta join (or Cartesian product).
   Π_{L1 ∪ L2}(R ⋈_r S) = (Π_{L1}(R)) ⋈_r (Π_{L2}(S))
   where L1 contains attributes of R, L2 contains attributes of S, and the join condition r involves only attributes in L1 ∪ L2.
8) Commutativity of Union and Intersection (but not Set difference).
   R ∪ S = S ∪ R
   R ∩ S = S ∩ R
9) Commutativity of Selection and set operations (Union, Intersection, and Set difference).
   σ_p(R ∪ S) = σ_p(R) ∪ σ_p(S)
   σ_p(R ∩ S) = σ_p(R) ∩ σ_p(S)
   σ_p(R − S) = σ_p(R) − σ_p(S)
10) Commutativity of Projection and Union.
   Π_L(R ∪ S) = Π_L(R) ∪ Π_L(S)
11) Associativity of Theta join (and Cartesian product).
   (R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
   (R × S) × T = R × (S × T)
12) Associativity of Union and Intersection (but not Set difference).
   (R ∪ S) ∪ T = S ∪ (R ∪ T)
   (R ∩ S) ∩ T = S ∩ (R ∩ T)

Example 11.2 Use of Transformation Rules

For prospective renters of flats, find the properties that match their requirements and are owned by CO93.

    SELECT p.propertyNo, p.street
    FROM Client c, Viewing v, PropertyForRent p
    WHERE c.prefType = 'Flat'
      AND c.clientNo = v.clientNo
      AND v.propertyNo = p.propertyNo
      AND c.maxRent >= p.rent
      AND c.prefType = p.type
      AND p.ownerNo = 'CO93';
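One way to see why the transformations pay off for the query in Example 11.2 is to run a canonical plan and a pushed-down plan against a few rows. The sketch below is illustrative only: the tuples, street names and the function names `canonical` and `pushed_down` are assumptions, and the plans are simplified Python loops rather than real RA operators.

```python
from itertools import product

# Toy relations; all values are invented for illustration.
clients = [
    {"clientNo": "CR1", "prefType": "Flat", "maxRent": 500},
    {"clientNo": "CR2", "prefType": "House", "maxRent": 700},
]
viewings = [
    {"clientNo": "CR1", "propertyNo": "P1"},
    {"clientNo": "CR1", "propertyNo": "P2"},
    {"clientNo": "CR2", "propertyNo": "P3"},
]
properties = [
    {"propertyNo": "P1", "street": "6 Lawrence St", "type": "Flat", "rent": 450, "ownerNo": "CO93"},
    {"propertyNo": "P2", "street": "2 Manor Rd", "type": "Flat", "rent": 600, "ownerNo": "CO93"},
    {"propertyNo": "P3", "street": "18 Dale Rd", "type": "House", "rent": 650, "ownerNo": "CO87"},
]


def canonical():
    """Cartesian product of all three relations, then one Selection, then Projection."""
    combos = list(product(clients, viewings, properties))
    rows = [
        (p["propertyNo"], p["street"])
        for c, v, p in combos
        if c["prefType"] == "Flat"
        and c["clientNo"] == v["clientNo"]
        and v["propertyNo"] == p["propertyNo"]
        and c["maxRent"] >= p["rent"]
        and c["prefType"] == p["type"]
        and p["ownerNo"] == "CO93"
    ]
    return rows, len(combos)  # result and size of the intermediate product


def pushed_down():
    """Selections pushed to the base relations, products replaced by equijoins."""
    flat_clients = [c for c in clients if c["prefType"] == "Flat"]
    # c.prefType = 'Flat' and c.prefType = p.type together imply p.type = 'Flat'.
    co93_flats = [p for p in properties if p["ownerNo"] == "CO93" and p["type"] == "Flat"]
    cv = [(c, v) for c in flat_clients for v in viewings if c["clientNo"] == v["clientNo"]]
    rows = [
        (p["propertyNo"], p["street"])
        for c, v in cv
        for p in co93_flats
        if v["propertyNo"] == p["propertyNo"] and c["maxRent"] >= p["rent"]
    ]
    return rows, len(cv)  # result and size of the first join's output


print(canonical())
print(pushed_down())
```

Both plans return the same property, but the canonical plan materializes every combination of the three relations while the pushed-down plan only carries the viewings made by clients who want flats.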

[Figures: RA trees for Example 11.2, showing in turn
- the canonical RA tree;
- the RA tree formed by pushing Selections down;
- the RA tree formed by changing Selection/Cartesian products to Equijoins;
- the RA tree formed using associativity of Equijoins;
- the RA tree formed by pushing Projections down;
- the RA tree formed by substituting c.prefType = 'Flat' in the Selection on p.type and pushing the Selection down.]

11.5 Heuristical Processing Strategies

1) Perform Selection operations as early as possible. Keep predicates on the same relation together.
2) Combine a Cartesian product with a subsequent Selection whose predicate represents a join condition into a Join operation.
3) Use associativity of binary operations to rearrange leaf nodes so that the leaf nodes with the most restrictive Selection operations are executed first.
4) Perform Projection as early as possible. Keep projection attributes on the same relation together.
5) Compute common expressions once. If a common expression appears more than once, and its result is not too large, store the result and reuse it when required. This is useful when querying views, as the same expression is used to construct the view each time.

11.6 Cost Estimation for RA Operations

There are many different ways of implementing RA operations. The aim of QO is to choose the most efficient one.
- Use formulae that estimate the costs for a number of options, and select the one with the lowest cost.
- Consider only the cost of disk access, which is usually the dominant cost in QP.
- Many estimates are based on the cardinality of the relation, so we need to be able to estimate this.

The success of estimation depends on the amount and currency of the statistical information the DBMS holds. Keeping statistics current can be problematic:
- If the statistics were updated every time a tuple is changed, this would impact performance.
- The DBMS could instead update statistics on a periodic basis, for example nightly, or whenever the system is idle.

Cost estimation depends on statistical information held in the system catalog. Typical statistics are held:
- for each base relation R;
- for each attribute A of base relation R;
- for each multilevel index I on attribute set A.

Typical statistics for each base relation R include:
- ntuples(R): the number of tuples in R (its cardinality).
- bfactor(R): the blocking factor of R.
- nblocks(R): the number of blocks required to store R:
  nblocks(R) = ⌈ntuples(R) / bfactor(R)⌉

Typical statistics for each attribute A of base relation R include:
- ndistinct_A(R): the number of distinct values that appear for attribute A in R.
- min_A(R), max_A(R): the minimum and maximum possible values for attribute A in R.
- SC_A(R): the selection cardinality of attribute A in R. This is the average number of tuples that satisfy an equality condition on attribute A.

Typical statistics for each multilevel index I on attribute set A include:
- nlevels_A(I): the number of levels in I.
- nlfblocks_A(I): the number of leaf blocks in I.

Question and Exercises
1. What are the objectives of query processing?
2. What are the typical phases of query processing?
3. State two main techniques for query optimization.
4. State the heuristics that should be applied to improve the processing of a query.
5. Again using the Hotel schema, draw a relational algebra tree for the following query and use the heuristic rules to transform the query into a more efficient form.

    SELECT r.roomNo, r.type, r.price
    FROM Room r, Booking b, Hotel h
    WHERE r.roomNo = b.roomNo AND b.hotelNo = h.hotelNo
      AND h.hotelName = 'Grosvenor Hotel' AND r.price > 100;

6. What types of statistics should a DBMS hold to be able to derive estimates of relational algebra operations?
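The catalog statistics from Section 11.6 can be sketched in a few lines. This is a minimal sketch under stated assumptions: the `RelationStats` class and `selection_cardinality` function are hypothetical names, the blocking factor of 30 and the 50 distinct branch numbers are invented values, and the selection cardinality estimate assumes the values of the attribute are uniformly distributed.

```python
import math
from dataclasses import dataclass


@dataclass
class RelationStats:
    """Per-relation statistics as they might be held in a system catalog."""
    ntuples: int   # number of tuples in R (cardinality)
    bfactor: int   # blocking factor: tuples that fit in one block

    @property
    def nblocks(self) -> int:
        # nblocks(R) = ceiling(ntuples(R) / bfactor(R))
        return math.ceil(self.ntuples / self.bfactor)


def selection_cardinality(ntuples: int, ndistinct: int) -> float:
    """Average number of tuples matching an equality condition on attribute A.

    Assumption: values of A are uniformly distributed over the
    ndistinct distinct values, so SC_A(R) = ntuples(R) / ndistinct_A(R).
    """
    return ntuples / ndistinct


# Staff has 1000 tuples as in Example 11.1; bfactor=30 is a made-up value.
staff = RelationStats(ntuples=1000, bfactor=30)
print(staff.nblocks)

# Hypothetical: 50 distinct branchNo values among the Staff tuples.
print(selection_cardinality(staff.ntuples, 50))
```

A full scan of Staff would then read `staff.nblocks` blocks, and an equality condition on the branch number would be expected to match about `ntuples / ndistinct` tuples if the uniformity assumption holds.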

Chapter 12. Security

Data is a valuable resource that must be strictly controlled and managed, as with any corporate resource. Part or all of the corporate data may have strategic importance and therefore needs to be kept secure and confidential. The DBMS must ensure that the database is secure. In this chapter, we discuss the security threats to a database, and consider the range of computer-based controls that are available as countermeasures to these threats.

12.1 Database Security
12.2 Threats
12.3 Countermeasures
12.4 The Access Control in SQL

This chapter covers:
- Scope of database security.
- Why database security is a serious concern for an organization.
- Types of threat that can affect a database system.
- How to protect a computer system using computer-based controls.
- How to control access to the database.

12.1 Database Security

Database security: the mechanisms that protect the database against intentional or accidental threats.
Security considerations do not apply only to the data held in a database: breaches of security may affect other parts of the system, which may in turn affect the database. Consequently, database security encompasses hardware, software, people, and data.

Database security is concerned with avoiding the following situations:
- Theft and fraud
- Loss of confidentiality (secrecy)
- Loss of privacy
- Loss of integrity
- Loss of availability

12.2 Threats

Threat: any situation or event, whether intentional or unintentional, that will adversely affect a system and consequently an organization.
Examples of threats:
- Using another person's means of access
- Unauthorized amendment or copying of data
- Program alteration
- Inadequate policies and procedures that allow a mix of confidential and normal output
- Wire tapping
- Illegal entry by a hacker
- Blackmail
- Theft of data, programs, and equipment

[Figure: summary of threats to computer systems]

12.3 Countermeasures

The types of countermeasure to threats on computer systems range from physical controls to administrative procedures. Computer-based security controls include:
- Authorization
- Views
- Backup and recovery
- Integrity
- Encryption
- RAID (Redundant Array of Independent Disks) technology

Authorization

Authorization: the granting of a right or privilege, which enables a subject to legitimately have access to a system or a system's object.
Authorization controls are also referred to as access controls. The process of authorization involves authentication of subjects requesting access to objects, where a subject represents a user or program and an object represents a database table, view, procedure, trigger, or any other object that can be created within the system.

Authentication: a mechanism that determines whether a user is who he or she claims to be.
Some procedures may have to be undertaken to give a user the right to use the DBMS. The responsibility to authorize use of the DBMS usually rests with the Database Administrator (DBA), who must also set up individual user accounts and passwords using the DBMS itself. Some DBMSs maintain a list of valid user identifiers and associated passwords, which can be distinct from the operating system's list.

Privileges: privileges may include the right to access or create certain database objects such as relations, views, and indexes, or to run various DBMS utilities.
Once a user is given permission to use a DBMS, various privileges may also be automatically associated with that user. Privileges are granted to users to accomplish the tasks required for their jobs. The creator of an object owns the object and can assign appropriate privileges for it.
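The relationship between subjects, objects, ownership and privileges can be sketched as a small in-memory model. This is a simplified illustration, not how any particular DBMS stores privileges: the `AccessControl` class, its method names, the user names and the rule that only the owner may grant are all assumptions.

```python
class AccessControl:
    """Toy privilege store: who owns each object and who may do what to it."""

    ALL_PRIVILEGES = {"SELECT", "INSERT", "UPDATE", "DELETE"}

    def __init__(self):
        self.owners = {}       # object name -> owning user
        self.privileges = {}   # (user, object name) -> set of privilege names

    def create_object(self, user, obj):
        # The creator of an object owns it and holds every privilege on it.
        self.owners[obj] = user
        self.privileges[(user, obj)] = set(self.ALL_PRIVILEGES)

    def grant(self, grantor, grantee, obj, privilege):
        # Simplifying assumption: only the owner may pass privileges on.
        if self.owners.get(obj) != grantor:
            raise PermissionError(f"{grantor} does not own {obj}")
        self.privileges.setdefault((grantee, obj), set()).add(privilege)

    def is_authorized(self, user, obj, privilege):
        # A subject may act on an object only if it holds the privilege.
        return privilege in self.privileges.get((user, obj), set())


ac = AccessControl()
ac.create_object("manager", "Staff")
ac.grant("manager", "assistant", "Staff", "SELECT")

print(ac.is_authorized("assistant", "Staff", "SELECT"))  # granted by the owner
print(ac.is_authorized("assistant", "Staff", "UPDATE"))  # never granted
```

The check is deliberately a lookup on the pair (subject, object): a user holds only the privileges needed for the job, and everything else is denied by default.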

Views

View: the dynamic result of one or more relational operations operating on the base relations to produce another relation. A view is a virtual relation that does not actually exist in the database, but is produced upon request by a particular user, at the time of request.
A view can be defined over several relations, with a user being granted the appropriate privilege to use it but not to use the base relations. Using a view in this way is more restrictive than simply having certain privileges granted to a user on the base relations.

Backup and Recovery

Backup: the process of periodically taking a copy of the database and log file (and possibly programs) to offline storage media.
A DBMS should provide backup facilities to assist with the recovery of a database following failure. Backup copies of the database and log file should be made at regular intervals.

Journaling: the process of keeping and maintaining a log file (or journal) of all changes made to the database to enable effective recovery in the event of failure.
A DBMS should provide logging facilities, which keep track of the current state of transactions and database changes, to support the recovery procedure.

Integrity

Integrity: integrity control consists of constraints that we wish to impose in order to protect the database from becoming inconsistent.
Five types of integrity constraint:
- Required data
- Domain constraints
- Entity integrity
- Referential integrity
- Enterprise constraints
Integrity constraints also contribute to maintaining a secure database system by preventing data from becoming invalid and hence giving misleading or incorrect results.

Encryption

Encryption: the encoding of the data by a special algorithm that renders the data unreadable by any program without the decryption key.
Some DBMSs provide an encryption facility to encode particularly sensitive data as a precaution against possible external threats or attempts to access the data.
A cryptosystem includes:
- an encryption key to encrypt the data;
- an encryption algorithm that, with the encryption key, transforms the plaintext into ciphertext;
- a decryption key to decrypt the ciphertext;
- a decryption algorithm that, with the decryption key, transforms the ciphertext back into plaintext.

RAID

The hardware that the DBMS is running on must be fault-tolerant, meaning that the DBMS should continue to operate even if one of the hardware components fails. This suggests having redundant components that can be seamlessly integrated into the working system whenever there are one or more component failures.
The main hardware components that should be fault-tolerant include disk drives, disk controllers, CPU, power supplies, and cooling fans. Disk drives are the most vulnerable components, with the shortest times between failure of any of the hardware components.

One solution is the use of RAID (Redundant Array of Independent Disks) technology.
RAID: RAID works on having a large disk array comprising an arrangement of several independent disks that are organized to improve reliability and increase performance.

Performance is increased through data striping: the data is segmented into equal-size partitions (the striping unit), which are transparently distributed across multiple disks. This gives the appearance of a single large, fast disk, whereas in fact the data is distributed across several smaller disks. Striping improves overall I/O performance by allowing multiple I/Os to be serviced in parallel. Data striping also balances the load among disks.

Reliability is improved through storing redundant information across the disks using a parity scheme or an error-correcting scheme.
- In a parity scheme, each byte may have a parity bit associated with it that records whether the number of bits in the byte that are set to 1 is even or odd.
- Error-correcting schemes store two or more additional bits, and can reconstruct the original data if a single bit becomes corrupt.

There are a number of different disk configurations with RAID, termed RAID levels:
RAID 0, RAID 1, RAID 0+1, RAID 2, RAID 3, RAID 4, RAID 5, RAID 6.
For example, ORACLE recommends the use of RAID 1 for the redo log files, and RAID 5 for the database files.

12.4 The Access Control in SQL

See Section 6.6.

Question and Exercises
1. Explain the purpose and scope of database security.
2. Explain the following in terms of providing security for a database:
a) authorization
b) views
c) backup and recovery
d) integrity
e) encryption
f) RAID technology
3. Discuss how the Access Control mechanism of SQL works.
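The parity scheme described in the RAID section can be sketched as follows. The per-byte parity bit follows the description above; the block-level XOR reconstruction is the technique used by parity-based RAID levels and goes slightly beyond what the slides state. The function names and the three 4-byte "disks" are assumptions for illustration.

```python
from functools import reduce


def parity_bit(byte: int) -> int:
    """Return 1 if the byte has an odd number of bits set to 1, else 0."""
    return bin(byte).count("1") % 2


def xor_blocks(blocks):
    """XOR equal-length blocks byte by byte to form (or undo) a parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data "disks" holding one striping unit each (made-up contents).
data_disks = [b"DATA", b"BASE", b"RAID"]
parity_disk = xor_blocks(data_disks)

# Simulate losing the second disk: XOR of the survivors and the parity
# block yields the missing block, because x ^ x = 0.
survivors = [data_disks[0], data_disks[2], parity_disk]
recovered = xor_blocks(survivors)

print(parity_bit(0b10110000))  # three bits set, so parity is odd
print(recovered)
```

A single parity bit per byte only detects that something changed; it is the redundant parity block stored on another disk that makes it possible to rebuild the contents of one failed disk.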

Bridge from Entity Relationship modeling to creating SQL databases, tables, & relations

Bridge from Entity Relationship modeling to creating SQL databases, tables, & relations 1 Topics for this week: 1. Good Design 2. Functional Dependencies 3. Normalization Readings for this week: 1. E&N, Ch. 10.1-10.6; 12.2 2. Quickstart, Ch. 3 3. Complete the tutorial at http://sqlcourse2.com/

More information

Database Design Methodologies

Database Design Methodologies Critical Success Factors in Database Design Database Design Methodologies o Work interactively with the users as much as possible. o Follow a structured methodology throughout the data modeling process.

More information

Unit 2.1. Data Analysis 1 - V2.0 1. Data Analysis 1. Dr Gordon Russell, Copyright @ Napier University

Unit 2.1. Data Analysis 1 - V2.0 1. Data Analysis 1. Dr Gordon Russell, Copyright @ Napier University Data Analysis 1 Unit 2.1 Data Analysis 1 - V2.0 1 Entity Relationship Modelling Overview Database Analysis Life Cycle Components of an Entity Relationship Diagram What is a relationship? Entities, attributes,

More information

Chapter 7 Data Modeling Using the Entity- Relationship (ER) Model

Chapter 7 Data Modeling Using the Entity- Relationship (ER) Model Chapter 7 Data Modeling Using the Entity- Relationship (ER) Model Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 7 Outline Using High-Level Conceptual Data Models for

More information

Data Analysis 1. SET08104 Database Systems. Copyright @ Napier University

Data Analysis 1. SET08104 Database Systems. Copyright @ Napier University Data Analysis 1 SET08104 Database Systems Copyright @ Napier University Entity Relationship Modelling Overview Database Analysis Life Cycle Components of an Entity Relationship Diagram What is a relationship?

More information

not necessarily strictly sequential feedback loops exist, i.e. may need to revisit earlier stages during a later stage

not necessarily strictly sequential feedback loops exist, i.e. may need to revisit earlier stages during a later stage Database Design Process there are six stages in the design of a database: 1. requirement analysis 2. conceptual database design 3. choice of the DBMS 4. data model mapping 5. physical design 6. implementation

More information

Chapter 10 Practical Database Design Methodology and Use of UML Diagrams

Chapter 10 Practical Database Design Methodology and Use of UML Diagrams Chapter 10 Practical Database Design Methodology and Use of UML Diagrams Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 10 Outline The Role of Information Systems in

More information

THE ENTITY- RELATIONSHIP (ER) MODEL CHAPTER 7 (6/E) CHAPTER 3 (5/E)

THE ENTITY- RELATIONSHIP (ER) MODEL CHAPTER 7 (6/E) CHAPTER 3 (5/E) THE ENTITY- RELATIONSHIP (ER) MODEL CHAPTER 7 (6/E) CHAPTER 3 (5/E) 2 LECTURE OUTLINE Using High-Level, Conceptual Data Models for Database Design Entity-Relationship (ER) model Popular high-level conceptual

More information

IV. The (Extended) Entity-Relationship Model

IV. The (Extended) Entity-Relationship Model IV. The (Extended) Entity-Relationship Model The Extended Entity-Relationship (EER) Model Entities, Relationships and Attributes Cardinalities, Identifiers and Generalization Documentation of EER Diagrams

More information

DATABASE DESIGN. - Developing database and information systems is performed using a development lifecycle, which consists of a series of steps.

DATABASE DESIGN. - Developing database and information systems is performed using a development lifecycle, which consists of a series of steps. DATABASE DESIGN - The ability to design databases and associated applications is critical to the success of the modern enterprise. - Database design requires understanding both the operational and business

More information

Chapter 10 Practical Database Design Methodology and Use of UML Diagrams

Chapter 10 Practical Database Design Methodology and Use of UML Diagrams Chapter 10 Practical Database Design Methodology and Use of UML Diagrams Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 10 Outline The Role of Information Systems in

More information

Modern Systems Analysis and Design

Modern Systems Analysis and Design Modern Systems Analysis and Design Prof. David Gadish Structuring System Data Requirements Learning Objectives Concisely define each of the following key data modeling terms: entity type, attribute, multivalued

More information

Chapter 3. Data Modeling Using the Entity-Relationship (ER) Model

Chapter 3. Data Modeling Using the Entity-Relationship (ER) Model Chapter 3 Data Modeling Using the Entity-Relationship (ER) Model Chapter Outline Overview of Database Design Process Example Database Application (COMPANY) ER Model Concepts Entities and Attributes Entity

More information

Database Design. Marta Jakubowska-Sobczak IT/ADC based on slides prepared by Paula Figueiredo, IT/DB

Database Design. Marta Jakubowska-Sobczak IT/ADC based on slides prepared by Paula Figueiredo, IT/DB Marta Jakubowska-Sobczak IT/ADC based on slides prepared by Paula Figueiredo, IT/DB Outline Database concepts Conceptual Design Logical Design Communicating with the RDBMS 2 Some concepts Database: an

More information

How To Write A Diagram

How To Write A Diagram Data Model ing Essentials Third Edition Graeme C. Simsion and Graham C. Witt MORGAN KAUFMANN PUBLISHERS AN IMPRINT OF ELSEVIER AMSTERDAM BOSTON LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE

More information

Foundations of Information Management

Foundations of Information Management Foundations of Information Management - WS 2012/13 - Juniorprofessor Alexander Markowetz Bonn Aachen International Center for Information Technology (B-IT) Data & Databases Data: Simple information Database:

More information

Lecture 12: Entity Relationship Modelling

Lecture 12: Entity Relationship Modelling Lecture 12: Entity Relationship Modelling The Entity-Relationship Model Entities Relationships Attributes Constraining the instances Cardinalities Identifiers Generalization 2004-5 Steve Easterbrook. This

More information

2. Conceptual Modeling using the Entity-Relationship Model

2. Conceptual Modeling using the Entity-Relationship Model ECS-165A WQ 11 15 Contents 2. Conceptual Modeling using the Entity-Relationship Model Basic concepts: entities and entity types, attributes and keys, relationships and relationship types Entity-Relationship

More information

Database Design Methodology

Database Design Methodology Database Design Methodology Three phases Database Design Methodology Logical database Physical database Constructing a model of the information used in an enterprise on a specific data model but independent

More information

Chapter 10. Practical Database Design Methodology. The Role of Information Systems in Organizations. Practical Database Design Methodology

Chapter 10. Practical Database Design Methodology. The Role of Information Systems in Organizations. Practical Database Design Methodology Chapter 10 Practical Database Design Methodology Practical Database Design Methodology Design methodology Target database managed by some type of database management system Various design methodologies

More information

Entity-Relationship Model

Entity-Relationship Model UNIT -2 Entity-Relationship Model Introduction to ER Model ER model is represents real world situations using concepts, which are commonly used by people. It allows defining a representation of the real

More information

www.gr8ambitionz.com

www.gr8ambitionz.com Data Base Management Systems (DBMS) Study Material (Objective Type questions with Answers) Shared by Akhil Arora Powered by www. your A to Z competitive exam guide Database Objective type questions Q.1

More information

COMP 378 Database Systems Notes for Chapter 7 of Database System Concepts Database Design and the Entity-Relationship Model

COMP 378 Database Systems Notes for Chapter 7 of Database System Concepts Database Design and the Entity-Relationship Model COMP 378 Database Systems Notes for Chapter 7 of Database System Concepts Database Design and the Entity-Relationship Model The entity-relationship (E-R) model is a a data model in which information stored

More information

Lesson 8: Introduction to Databases E-R Data Modeling

Lesson 8: Introduction to Databases E-R Data Modeling Lesson 8: Introduction to Databases E-R Data Modeling Contents Introduction to Databases Abstraction, Schemas, and Views Data Models Database Management System (DBMS) Components Entity Relationship Data

More information

DATABASE MANAGEMENT SYSTEMS. Question Bank:

DATABASE MANAGEMENT SYSTEMS. Question Bank: DATABASE MANAGEMENT SYSTEMS Question Bank: UNIT 1 1. Define Database? 2. What is a DBMS? 3. What is the need for database systems? 4. Define tupule? 5. What are the responsibilities of DBA? 6. Define schema?

More information

Designing a Database Schema

Designing a Database Schema Week 10: Database Design Database Design From an ER Schema to a Relational One Restructuring an ER schema Performance Analysis Analysis of Redundancies, Removing Generalizations Translation into a Relational

More information

IT2305 Database Systems I (Compulsory)

IT2305 Database Systems I (Compulsory) Database Systems I (Compulsory) INTRODUCTION This is one of the 4 modules designed for Semester 2 of Bachelor of Information Technology Degree program. CREDITS: 04 LEARNING OUTCOMES On completion of this

More information

Chapter 2: Entity-Relationship Model. Entity Sets. " Example: specific person, company, event, plant

Chapter 2: Entity-Relationship Model. Entity Sets.  Example: specific person, company, event, plant Chapter 2: Entity-Relationship Model! Entity Sets! Relationship Sets! Design Issues! Mapping Constraints! Keys! E-R Diagram! Extended E-R Features! Design of an E-R Database Schema! Reduction of an E-R

More information

BCA. Database Management System

BCA. Database Management System BCA IV Sem Database Management System Multiple choice questions 1. A Database Management System (DBMS) is A. Collection of interrelated data B. Collection of programs to access data C. Collection of data

More information

Normalization. Normalization. Normalization. Data Redundancy

Normalization. Normalization. Normalization. Data Redundancy Normalization Normalization o Main objective in developing a logical data model for relational database systems is to create an accurate representation of the data, its relationships, and constraints.

More information

DATABASE NORMALIZATION

DATABASE NORMALIZATION DATABASE NORMALIZATION Normalization: process of efficiently organizing data in the DB. RELATIONS (attributes grouped together) Accurate representation of data, relationships and constraints. Goal: - Eliminate

More information

Database Design Methodology

Database Design Methodology Topic 7 Database Design Methodology LEARNING OUTCOMES When you have completed this Topic you should be able to: 1. Discuss the purpose of a design methodology. 2. Explain three main phases of design methodology.

More information

A brief overview of developing a conceptual data model as the first step in creating a relational database.





