Understanding the Database Design Process



Similar documents
CPS352 Database Systems: Design Project


Introduction to normalization. Introduction to normalization

Access Database Design

Database Design Basics

EXTENDED LEARNING MODULE A

David M. Kroenke and David J. Auer Database Processing 11 th Edition Fundamentals, Design, and Implementation. Chapter Objectives

Tutorial on Relational Database Design

Testing, What is it Good For? Absolutely Everything!

Microsoft Query, the helper application included with Microsoft Office, allows

Database Design and the E-R Model

Relational Database Basics Review

Module 4: Identifying and Researching Career Options Transcript

Microsoft Access Part I (Database Design Basics) ShortCourse Handout

Normalization of Database

Normalization. CIS 3730 Designing and Managing Data. J.G. Zheng Fall 2010

The Secret Formula for Webinar Presentations that Work Every Time

User Services. Intermediate Microsoft Access. Use the new Microsoft Access. Getting Help. Instructors OBJECTIVES. July 2009

Database Dictionary. Provided by GeekGirls.com

Microsoft Access Basics

User Services. Microsoft Access 2003 II. Use the new Microsoft

David M. Kroenke and David J. Auer Database Processing 12 th Edition

NEDARC POSITION PAPER

Introduction to Microsoft Access

Graphing Parabolas With Microsoft Excel

The join operation allows you to combine related rows of data found in two tables into a single result set.

Database Normalization. Mohua Sarkar, Ph.D Software Engineer California Pacific Medical Center

Practical Database Design

Critical analysis. Be more critical! More analysis needed! That s what my tutors say about my essays. I m not really sure what they mean.

Visualization Techniques for Requirements Definition

Credit Card Activity

Designing Databases. Introduction

7. Databases and Database Management Systems

Consider the possible problems with storing the following data in a spreadsheet:

Database Design. Marta Jakubowska-Sobczak IT/ADC based on slides prepared by Paula Figueiredo, IT/DB

INTRODUCTION TO MICROSOFT ACCESS Tables, Queries, Forms & Reports

Top 5 Mistakes Made with Inventory Management for Online Stores

Lotto Master Formula (v1.3) The Formula Used By Lottery Winners

- Eliminating redundant data - Ensuring data dependencies makes sense. ie:- data is stored logically

Ad Sales Scripts. Would you be willing to meet with me in person to discuss this great opportunity?

Interest Rates: Loans, Credit Cards, and Annuties. Interest Rates: Loans, Credit Cards, and Annuties 1/43

1.204 Lecture 2. Keys

If it's in the 2nd NF and there are no non-key fields that depend on attributes in the table other than the Primary Key.

Normalization. Normalization. First goal: to eliminate redundant data. for example, don t storing the same data in more than one table

Pay per Click Success 5 Easy Ways to Grow Sales and Lower Costs

UNIT AUTHOR: Elizabeth Hume, Colonial Heights High School, Colonial Heights City Schools

Data Modeling: Part 1. Entity Relationship (ER) Model

Module 1: The Career Planning Process: An Overview Transcript

ISM 318: Database Systems. Objectives. Database. Dr. Hamid R. Nemati

The Math of Call Center Staffing Calculating Resource Requirements and Understanding Staff and Service Tradeoffs

Database Design and Normalization

DATABASE INTRODUCTION

Top Producer Secrets About Cheap Leads Revealed!

B.1 Database Design and Definition

Why do we need Fundraising Software?

Data Modelling And Normalisation

Welcome! You ve made a wise choice opening a savings account. There s a lot to learn, so let s get going!

What Is Singapore Math?

Copyright (c) 2015 Christopher Small and The Art of Lawyering. All rights reserved.

Once the schema has been designed, it can be implemented in the RDBMS.

Database Normalization And Design Techniques

WINDOWS AZURE AND WINDOWS HPC SERVER

Using Excel As A Database

Conceptual Design: Entity Relationship Models. Objectives. Overview

Your Credit Score and Your Credit Report

USING ALGEBRA TILES EFFECTIVELY

Regional Slot Tech Training Class Five-Day Class-March 18-22, 2013

Evaluating Your Student Debt Situation

0 Introduction to Data Analysis Using an Excel Spreadsheet

Flowcharting, pseudocoding, and process design

Creating and Using Databases with Microsoft Access

presents Understanding VA Hybrid ARM Loans NMLS #

Database Systems. S. Adams. Dilbert. Available: Hans-Petter Halvorsen, M.Sc.

A Tutorial on Quality Assurance of Data Models 1. QA of Data Models. Barry Williams. tutorial_qa_of_models.doc Page 1 of 17 31/12/ :18:36

Microsoft Access 2007 Module 1

The 3 Normal Forms: Copyright Fred Coulson 2007 (last revised February 1, 2009)

BA 561: Database Design and Applications Acct 565: Advanced Accounting Information Systems Syllabus Spring 2015

Introduction to SQL for Data Scientists

THE 120VC PORTFOLIO MANAGEMENT MODEL

What is a database? The parts of an Access database

RELATIONAL DATABASE DESIGN

CHAPTER 6 DATABASE MANAGEMENT SYSTEMS. Learning Objectives

How much time do you waste every week trying to prepare reports for

CA IDMS. Database Design Guide. Release , 2nd Edition

Providing Knowledge to Grow Your Practice 1

and Maths in School. Addition in School. by Kate Robinson

ERP Software and Your Business

Using a conveyancing lawyer: Ten helpful tips

Online Computer Science Degree Programs. Bachelor s and Associate s Degree Programs for Computer Science

Improving Productivity using IT - Level 3 Scenario Assignment Sample Test 4 Version SampleMQTB/1.0/IP3/v1.0. Part 1 Performance

Transcription:

67232976x_ch15.qxd 4/1/8 4:3 PM Page 1 15 Understanding the Database Design Process In this chapter, you ll learn the thought processes behind designing a relational database. After this theory-focused chapter, you ll jump headlong into learning the basic MySQL commands in preparation for integrating MySQL in your own applications. Topics covered in this chapter are. Some advantages to good database design. Three types of table relationships. How to normalize your database. How to implement a good database design process The Importance of Good Database Design A good database design is crucial for a high-performance application, just as an aerodynamic body is important to a race car. If the car doesn t have smooth lines, it will produce drag and go slower. Without optimized relationships, your database won t perform as efficiently as possible. Thinking about relationships and database efficiency is part of normalization. Beyond the issue of performance is the issue of maintenance your database should be easy to maintain. This includes storing only a limited amount (if any) of repetitive data. If you have a lot of repetitive data and one instance of that data undergoes a change (such as a name change), that change has to be made for all occurrences of the data. To eliminate duplication and enhance your ability to maintain the data, you might create a table of possible values and use a key to refer to the value. That way, if the value changes names, the change occurs only once in the master table. The reference remains the same throughout other tables.

67232976x_ch15.qxd 4/1/8 4:3 PM Page 2 2 15: Understanding the Database Design Process By the Way Normalization refers to the process of structuring data in order to minimize duplication and inconsistencies. For example, suppose that you are responsible for maintaining a database of students and the classes in which they re enrolled. If 35 of these students are in the same class, let s call it Advanced Math, this class name would appear 35 times in the table. Now, if the instructor decides to change the name of the class to Mathematics IV, you must change 35 records to reflect the new name of the class. If the database were designed so that class names appeared in one table and just the class ID number was stored with the student record, you would have to change only 1 record not 35 to update the name change. The benefits of a well-planned and designed database are numerous, and it stands to reason that the more work you do up front, the less you ll have to do later. A really bad time for a database redesign is after the public launch of the application using it although it does happen, and the results are costly. So, before you even start coding an application, spend a lot of time designing your database. Throughout the rest of this chapter, you ll learn more about relationships and normalization, two important pieces to the design puzzle. Types of Table Relationships Table relationships come in several forms:. One-to-one relationships. One-to-many relationships. Many-to-many relationships For example, suppose that you have a table called employees that contains each person s Social Security number, name, and the department in which he or she works. Suppose that you also have a separate table called departments, containing the list of all available departments, made up of a Department ID and a name. In the employees table, the Department ID field matches an ID found in the departments table. You can see this type of relationship in Figure 15.1. The PK next to the field name indicates the primary key for the table. In the following sections, you will take a closer look at each of the relationship types.

67232976x_ch15.qxd 4/1/8 4:3 PM Page 3 Types of Table Relationships 3 PK departments DeptID DeptName employees PK SS_Number DeptID FirstName LastName FIGURE 15.1 The employees and departments tables are related through the DeptID key. One-to-One Relationships In a one-to-one relationship, a key appears only once in a related table. The employees and departments tables do not have a one-to-one relationship because many employees undoubtedly belong to the same department. A one-to-one relationship exists, for example, if each employee is assigned one computer within a company. Figure 15.2 shows the one-to-one relationship of employees to computers. EMPLOYEE 1 COMPUTER 1 EMPLOYEE 2 COMPUTER 2 FIGURE 15.2 One computer is assigned to each employee. EMPLOYEE 3 COMPUTER 3 EMPLOYEE 4 COMPUTER 4 The employees and computers tables in your database would look something like Figure 15.3, which represents a one-to-one relationship. employees PK SS_Number DeptID FirstName LastName ComputerID FIGURE 15.3 One-to-one relationship in the data model. computers PK ComputerID ComputerDesc

67232976x_ch15.qxd 4/1/8 4:3 PM Page 4 4 15: Understanding the Database Design Process One-to-Many Relationships In a one-to-many relationship, keys from one table appear multiple times in a related table. The example shown in Figure 15.1, indicating a connection between employees and departments, illustrates a one-to-many relationship. A real-world example would be an organizational chart of the department, as shown in Figure 15.4. FIGURE 15.4 One department contains many employees. DEPARTMENT A EMPLOYEE 1 EMPLOYEE 2 EMPLOYEE 3 EMPLOYEE 4 The one-to-many relationship is the most common type of relationship. Another practical example is the use of a state abbreviation in an address database; each state has a unique identifier (CA for California, PA for Pennsylvania, and so on), and each address in the United States has a state associated with it. If you have eight friends in California and five in Pennsylvania, you will use only two distinct abbreviations in your table. One abbreviation (CA) represents a one-toeight relationship, and the other (PA) represents a one-to-five relationship. Many-to-Many Relationships The many-to-many relationship often causes problems in practical examples of normalized databases, so much so that it is common to simply break many-to-many relationships into a series of one-to-many relationships. In a many-to-many relationship, the key value of one table can appear many times in a related table. So far, it sounds like a one-to-many relationship, but here s the curveball: The opposite is also true, meaning that the primary key from that second table can also appear many times in the first table. Think of such a relationship this way, using the example of students and classes: A student has an ID and a name. A class has an ID and a name. A student usually takes more than one class at a time, and a class always contains more than one student, as you can see in Figure 15.5.

67232976x_ch15.qxd 4/1/8 4:3 PM Page 5 Types of Table Relationships 5 CLASS A STUDENT 1 STUDENT 2 STUDENT 3 CLASS B STUDENT 5 STUDENT 6 FIGURE 15.5 Students take classes, and classes contain students. STUDENT 7 STUDENT 4 CLASS C As you can see, this sort of relationship doesn t present an easy method for relating tables. Your tables could look like Figure 15.6, seemingly unrelated. students PK StudentID FirstName LastName classes PK ClassID ClassDesc FIGURE 15.6 The students table and the classes table, unrelated. To make the theoretical many-to-many relationship, you would create an intermediate table, one that sits between the two tables and essentially maps them together. You might build such a table similar to the one in Figure 15.7. students PK StudentID FirstName LastName students_classes_map StudentID ClassID classes PK ClassID ClassDesc FIGURE 15.7 The students_ classes_map table acts as an intermediary. If you take the information in Figure 15.5 and put it into the intermediate table, you would have something like Figure 15.8. As you can see, many students and many classes happily coexist within the students_classes_map table. With this introduction to the types of relationships, learning about normalization should be a snap.

67232976x_ch15.qxd 4/1/8 4:3 PM Page 6 6 15: Understanding the Database Design Process FIGURE 15.8 The students_ classes_map table populated with data. STUDENTID STUDENT 1 STUDENT 2 STUDENT 3 STUDENT 4 STUDENT 5 STUDENT 6 STUDENT 7 STUDENT 1 STUDENT 2 STUDENT 3 STUDENT 4 CLASSID CLASS A CLASS A CLASS A CLASS A CLASS B CLASS B CLASS C CLASS B CLASS B CLASS C CLASS C Understanding Normalization Normalization is simply a set of rules that will ultimately make your life easier when you re acting as a database administrator. It s the art of organizing your database in such a way that your tables relate where appropriate and are flexible for future growth. The sets of rules used in normalization are called normal forms. If your database design follows the first set of rules, it s considered in the first normal form. If the first three sets of rules of normalization are followed, your database is said to be in the third normal form. Throughout this chapter, you ll learn about each rule in the first, second, and third normal forms and, I hope, will follow them as you create your own applications. You ll be using a sample set of tables for a students-and-courses database and taking it to the third normal form. Problems with the Flat Table Before launching into the first normal form, you have to start with something that needs to be fixed. In the case of a database, it s the flat table. A flat table is like a spreadsheet it has many, many columns. There are no relationships between multiple tables; all the data you could possibly want is right there in that flat table. This

67232976x_ch15.qxd 4/1/8 4:3 PM Page 7 Understanding Normalization 7 scenario is inefficient and consumes more physical space on your hard drive than a normalized database. In your students-and-courses database, assume that you have the following fields in your flat table:. StudentName The name of the student.. CourseID1 The ID of the first course taken by the student.. CourseDescription1 The description of the first course taken by the student.. CourseInstructor1 The instructor of the first course taken by the student.. CourseID2 The ID of the second course taken by the student.. CourseDescription2 The description of the second course taken by the student.. CourseInstructor2 The instructor of the second course taken by the student.. Repeat CourseID, CourseDescription, and CourseInstructor columns many more times to account for all the classes students can take during their academic career. With what you ve learned so far, you should be able to identify the first problem area: CourseID, CourseDescription, and CourseInstructor columns are repeated groups. Eliminating redundancy is the first step in normalization, so next you ll take this flat table to first normal form. If your table remained in its flat format, you could have a lot of unclaimed space and a lot of space being used unnecessarily not an efficient table design. First Normal Form The rules for the first normal form are as follows:. Eliminate repeating information.. Create separate tables for related data. If you think about the flat table design with many repeated sets of fields for the students-and-courses database, you can identify two distinct topics: students and courses. Taking your students-and-courses database to the first normal form would

67232976x_ch15.qxd 4/1/8 4:3 PM Page 8 8 15: Understanding the Database Design Process mean that you create two tables: one for students and one for courses, as shown in Figure 15.9. FIGURE 15.9 Breaking the flat table into two tables. students PK StudentID StudentName student_courses StudentID CourseID CourseDescription CourseInstructor Your two tables now represent a one-to-many relationship of one student to many courses. Students can take as many courses as they want and are not limited to the number of CourseID/CourseDescription/CourseInstructor groupings that existed in the flat table. The next step is to put the tables into second normal form. Second Normal Form The rule for the second normal form is as follows:. No non-key attributes depend on a portion of the primary key. In plain English, this means that if fields in your table are not entirely related to a primary key, you have more work to do. In the students-and-courses example, you need to break out the courses into their own table and modify the students_courses table. CourseID, CourseDescription, and CourseInstructor can become a table called courses with a primary key of CourseID. The students_courses table should then just contain two fields: StudentID and CourseID. You can see this new design in Figure 15.1. FIGURE 15.1 Taking your tables to second normal form. students PK StudentID StudentName student_courses StudentID ClassID courses PK ClassID ClassDescription CourseInstructor

67232976x_ch15.qxd 4/1/8 4:3 PM Page 9 Following the Design Process 9 This structure should look familiar to you as a many-to-many relationship using an intermediary mapping table. The third normal form is the last form you ll look at, and you ll find it s just as simple to understand as the first two. Third Normal Form The rule for the third normal form is as follows:. No attributes depend on other non-key attributes. This rule simply means that you need to look at your tables and see whether you have more fields that can be broken down further and that aren t dependent on a key. Think about removing repeated data and you ll find your answer: instructors. Inevitably, an instructor will teach more than one class. However, CourseInstructor is not a key of any sort. So, if you break out this information and create a separate table purely for the sake of efficiency and maintenance (as shown in Figure 15.11), that s the third normal form. students PK StudentID StudentName student_courses StudentID CourseID courses PK ClassID ClassDescription CourseInstructorID instructors PK InstructorID InstructorName InstructorNotes FIGURE 15.11 Taking your tables to third normal form. Third normal form is usually adequate for removing redundancy and allowing for flexibility and growth. The next section will give you some pointers for the thought process involved in database design and where it fits in the overall design process of your application. Following the Design Process The greatest problem in application design is a lack of forethought. As it applies to database-driven applications, the design process must include a thorough evaluation of your database what it should hold, how data relates to each other, and most importantly, whether it is scalable.

67232976x_ch15.qxd 4/1/8 4:3 PM Page 1 1 15: Understanding the Database Design Process The general steps in the design process are as follows:. Define the objective. Design the data structures (tables, fields). Discern relationships. Define and implement business rules. Create the application Creating the application is the last step not the first! Many developers take an idea for an application, build it, and then go back and try to make a set of database tables fit into it. This approach is completely backward, inefficient, and will cost a lot of time and money. Before you start any application design process, sit down and talk it out. If you can t describe your application including the objectives, audience, and target market you re not ready to build it, let alone model the database. After you can describe the actions and nuances of your application to other people and have it make sense to them, you can start thinking about the tables you want to create. Start with big flat tables because after you write them down, your newfound normalization skills will take over. You will be able to find your redundancies and visualize your relationships. The next step is to do the normalization. Go from flat table to first normal form and so on up to the third normal form if possible. Use paper, pencils, sticky notes, or whatever helps you to visualize the tables and relationships. There s no shame in data modeling on sticky notes until you re ready to create the tables themselves. Plus, using sticky notes is a lot cheaper than buying software to do it for you; modeling software ranges from one hundred to several thousands of dollars! After you have a preliminary data model, look at it from the application s point of view. Or look at it from the point of view of the person using the application you re building. This is the point where you define business rules and see whether your data model will break. An example of a business rule for an online registration application is, Each user must have one email address, and it must not belong to any other user. If EmailAddress wasn t a unique field in your data model, your model would be broken based on the business rule. After your business rules have been applied to your data model, only then can application programming begin. You can rest assured that your data model is solid and you will not be programming yourself into a brick wall. The latter event is all too common.

67232976x_ch15.qxd 4/1/8 4:3 PM Page 11 Workshop 11 Summary Following proper database design is the only way your application will be efficient, flexible, and easy to manage and maintain. An important aspect of database design is to use relationships between tables instead of throwing all your data into one long flat file. Types of relationships include one-to-one, one-to-many, and many-to-many. Using relationships to properly organize your data is called normalization. There are many levels of normalization, but the primary levels are the first, second, and third normal forms. Each level has a rule or two that you must follow. Following all the rules helps ensure that your database is well organized and flexible. To take an idea from inception through to fruition, you should follow a design process. This process essentially says, Think before you act. Discuss rules, requirements, and objectives; then create the final version of your normalized tables. Q&A Q Are there only three normal forms? A No, there are more than three normal forms. Additional forms are the Boyce- Codd normal form, fourth normal form, and fifth normal form/join-projection normal form. These forms are not often followed because the benefits of doing so are outweighed by the cost in man-hours and database efficiency. Workshop The workshop is designed to help you anticipate possible questions, review what you ve learned, and begin putting your knowledge into practice. Quiz 1. Name three types of data relationships. 2. Because many-to-many relationships are difficult to represent in an efficient database design, what should you do?

67232976x_ch15.qxd 4/1/8 4:3 PM Page 12 12 15: Understanding the Database Design Process Quiz Answers 1. One-to-one, one-to-many, many-to-many. 2. Create a series of one-to-many relationships using intermediary mapping tables. Activity Explain each of the three normal forms to a person who works with spreadsheets and flat tables.