Access made easy. Data Modelling And Normalisation 02 www.accessallinone.com
This guide was prepared for AccessAllInOne.com by: Robert Austin. This is one of a series of guides pertaining to the use of Microsoft Access. AXLSolutions 2012. All rights reserved. No part of this work may be reproduced in any form, or by any means, without permission in writing.
Contents
Databases, Data Models and Normalisation
    Introduction to Databases
    Data Modelling
    Relationships
        Many-to-Many Relationships
        One-to-Many Relationship
    But why is Data Modelling Important?
    Data at the Core of the Access Database Application
    Normalisation
        0th Normal Form 0NF
        1st Normal Form 1NF
        2nd Normal Form
        3rd Normal Form
    Data Integrity
    Access as a Relational Database System
        Cascade Delete
        Cascade Update
    The Access Relationship Diagram
Questions
Answers
Databases, Data Models and Normalisation

In this unit you will learn about some of the important concepts that underpin your Access databases. We will go into more detail about what the title above means, but to give you a brief introduction:

• A Database is a collection of data, stored in tables (along with relations).
• A Data Model is a diagram that shows you visually and conceptually how those tables relate to one another.
• Normalisation is the process of rearranging the fields and tables of a relational database to reduce data redundancy and dependency. It typically consists of splitting big tables into smaller (and less redundant) tables and defining the relationships between them.

Introduction to Databases

If you look at the tables in the Navigation Pane of Access, what you are looking at is the database. Look at that word again: data-base. A base is a home, a place to live. Data is information. So a data-base is a home for information.

Each of those Tables in the database is called a Relation. Tables/Relations contain data that is about ONE topic of information. And that's a rule with Relational Databases: each Table/Relation is concerned with only one topic. That's boring, isn't it? But so are your dad's stories you hear year in, year out, but that's relations for you, right?

Although Relations can be pretty boring, they know their stuff! One relative may have an encyclopaedic knowledge of books, another of cars and another of stationery. A relation might REALLY like football players and know every single player and his/her details! Another relation might be really enthusiastic about invoices. In a relational database (remember, that's a home for information) all the relations know just ONE topic of information, but they know it really well.

Another important thing about Relations is that they have Relationships! The Invoice relation is great at knowing the total sum of an invoice, and another relation knows all about Invoice Details.
In another setting, a Student Relation loves borrowing Books from a Library, and the Library will lend Books to Students, so lots of students borrow lots of books. Let's see that in a diagram (figure 2.1).

[Figure 2.1: the Students and Books relations]

In figure 2.1 we've depicted the relations as rectangles rather than stick men. Figure 2.2 shows Invoice and Invoice Details relations.

[Figure 2.2: the Invoice and Invoice Lines relations]

A final pair are the uncle and nephew relations: Clubs and Players. Uncle Clubs knows all the Clubs in the Premiership and nephew Players knows all the players and a few important facts, like age, height, eye colour, etc.

[Figure 2.3: the Clubs and Players relations]

Data Modelling

All the above diagrams form part of Data Modelling. We've used some pretty awful analogies but they do give us a clear idea about Relations and the information they may contain. We know there are Tables called Relations, and we know a Database is a home for those Relations. And we have spoken about there being relationships between these Relations.

In data modelling what we try to do is to model pictorially these Relations and, importantly, the Relationships between them. What we will do next is introduce Relationships and how we model those. The two types of relationships we will be focussing on are the One-To-Many and Many-To-Many relationships.
Relationships

Many-to-Many Relationships

Continuing with our analogies, Books and Students are a great example of the Many-to-Many relationship. A Student will borrow many books from the library, and one book will be lent out to many different students over time. In figure 2.4 we demonstrate this with a Many-to-Many relationship.

[Figure 2.4: Students and Books joined by a many-to-many relationship]

Think a moment about these relations. The Students Table knows lots of information about various students: their names, age, DOB, maybe address. The Books Table knows lots of information about various books: title, date of publication. But where is the lending and borrowing information kept? Good question, you might say. If you ever see a many-to-many relationship, remember that it is always hiding information. In this case we're missing the Lending Card. The Lending Card in a library tells us about lending: maybe the student number, the book's barcode number, the date the book was lent out, and probably the date the book should be returned and the date it actually was returned.

[Figure 2.5: Students and Books, each related to Lending Card]

The Lending Card is the relationship, in a box. We can read figure 2.5 like a sentence. Each Lending Card is related to one Book and one Student (a book can't be lent to two students at once). Conversely, a Student can borrow many books and so have many Lending Cards, whilst a Book will be lent out many times and so will be on many Lending Cards as well. This demonstrates the many-to-many relationship.

The good news here is that the model accurately depicts what actually happens, and all the information about the library system is being modelled. Obviously a library system is more complex, but it will have these Relations and the Relationship Lending Card.

Let's do another one. Clubs have many Players. Players can play at more than one Club in their career.

[Figure 2.6: Clubs and Players joined by a many-to-many relationship]

Again we're missing some information; we need to know the start and end dates of a player's tenure at a club. We would also like to know their salary and any other relevant contractual information. A Contract is made between a Club and a Player. That contract will have at least a start date, probably a salary, and will stipulate a number of other things.

Let's read this diagram. A Contract is between a Player and a Club. Over time a Player will have many Contracts in his life, and a Club will have many Contracted Players. We've modelled a Football Club database!

[Figure 2.7: Clubs and Players, each related to Contract]
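The Lending Card idea can be sketched outside Access as well. The following is a minimal sketch using Python's built-in sqlite3 module as a stand-in for an Access database; the table names, column names and sample rows are assumptions made for illustration and do not come from the guide.

```python
import sqlite3

# Hypothetical names and sample data; SQLite stands in for Access here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (StudentID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Books    (BookID    INTEGER PRIMARY KEY, Title TEXT);
    -- The Lending Card is the many-to-many relationship "in a box":
    -- each card points at exactly one Student and exactly one Book.
    CREATE TABLE LendingCards (
        LendingCardID INTEGER PRIMARY KEY,
        StudentID     INTEGER NOT NULL REFERENCES Students(StudentID),
        BookID        INTEGER NOT NULL REFERENCES Books(BookID),
        DateLent      TEXT
    );
""")
conn.executemany("INSERT INTO Students VALUES (?, ?)", [(1, "Ann"), (2, "Ben")])
conn.executemany("INSERT INTO Books VALUES (?, ?)", [(10, "Atlas"), (11, "Dictionary")])
conn.executemany(
    "INSERT INTO LendingCards (StudentID, BookID, DateLent) VALUES (?, ?, ?)",
    [(1, 10, "2012-01-09"), (1, 11, "2012-01-16"), (2, 10, "2012-02-01")],
)


def books_borrowed_by(student_id):
    """One Student has many Lending Cards, and so many Books."""
    rows = conn.execute(
        """SELECT Books.Title FROM LendingCards
           JOIN Books ON Books.BookID = LendingCards.BookID
           WHERE LendingCards.StudentID = ? ORDER BY Books.Title""",
        (student_id,),
    )
    return [title for (title,) in rows]


def borrowers_of(book_id):
    """One Book appears on many Lending Cards, and so has many Students."""
    rows = conn.execute(
        """SELECT Students.Name FROM LendingCards
           JOIN Students ON Students.StudentID = LendingCards.StudentID
           WHERE LendingCards.BookID = ? ORDER BY Students.Name""",
        (book_id,),
    )
    return [name for (name,) in rows]
```

Reading the many-to-many relationship in either direction always goes through LendingCards, which is why the model is incomplete without it. The Clubs, Players and Contract model would follow the same pattern.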
One-to-Many Relationship

Lastly then, we mentioned Invoices and Invoice Details. At the moment we know that each Invoice has a number of Invoice Details, but Invoice Details (e.g. 2 boxes of pencils, 5 reams of paper, a box of chocolates) are specific to an Invoice. Figure 2.8 is already correct! The relationship between Invoices and Invoice Details has already been depicted so we don't need to put in any more relations.

[Figure 2.8: Invoices related to Invoice Details]

That was a crash course on data modelling. The key things to grasp are:

• A Relational Database is home to Relations (A.K.A. Tables in Access).
• Relations only know about a very specific type of information.
• Relations have Relationships.
• Relationships are such that:
    o A single record in Relation A may be related to many records in Relation B; a one-to-many relationship.
    o Many records in Relation A may be related to many records in Relation B; a many-to-many relationship.
• But Many-to-Many Relationships hide data, so to capture that Relationship we create a 3rd Relation (C) in a box and say:
    o Relationship C is related to ONE Relation A and ONE Relation B,
    o Relation A can have many C Relationships, and
    o Relation B can also have many C Relationships.
• All the Relations and Relationships together form a Data Model.

But why is Data Modelling Important?

Each of the Relations we found in the above section will result in a table in Access; there will be a Students table, a Books table, a Clubs table, etc. But we also found some many-to-many relationships which were hiding vital information. If we had not done the modelling above we would probably have gone ahead and created, say, the Students and Books tables. We may even have created some Forms and Reports, but we would have reached a point where we couldn't enter the lending information, and by that time we would have spent days developing and only then realised we were missing the Lending Card! What a waste of time and effort.

Modelling your data, then, is actually extremely important. It doesn't take long to do; you need just a pencil and paper, a good rubber, a sharpener and a good talk between you and the project owner. Creating databases often comes down to nothing more than drawings on a piece of paper.

Data at the Core of the Access Database Application

To put the idea of getting it right at the beginning into the context of an Access Database Application, Figure 2.9 demonstrates the importance of getting your data modelling right from the start.

[Figure 2.9: Macros, Forms, Queries and Reports arranged around Data]

Data is at the heart of everything that Access does. Look at the diagram and notice that everything bar Macros is touching Data. Reports use Data. Queries use Data. Forms use Data.

If we change the details of a Student (their address, for example), the change is made in Data and every other object will show the updated value. If a Student borrows a book, the Lending Card table is updated with a Lending Card record. If a new invoice is entered, Data is updated with a new Invoice record and new related Invoice Lines. If a Player transfers to a new Club, a new Contract record is entered into Data. And everything in the application that uses Data will be updated because all those objects use the same pool of data.
All of that is possible because we have modelled our data already and found out that a Lending Card exists or a Contract exists.

But what if we had not analysed and modelled our data beforehand? Well, we probably would have made a Players Table, a Players Form and some Player Reports. We'd probably have made a Clubs Table, a few Queries about Clubs, and a Form to enter and edit Clubs. But we couldn't enter the Player transfer contract anywhere, so we might add a field entitled Contract to compensate. Maybe we'd put some club information in the Player table or vice versa. By the time we realised our error we would have put in hours of effort, most of it in vain.

If an IT project fails it is invariably because the data has not been understood and modelled correctly. Hopefully this message has been driven home: the importance of data analysis and drawing fancy diagrams.

Data Modelling is one technique of many that we can use to design the structure of our database. One additional method that complements data modelling is called Normalisation. We won't go into too much detail about Normalisation as there are whole books dedicated to the subject, but we will outline the steps to aid your future learning.
Normalisation

Normalisation is the name given to a process invented by a man called Codd (Edgar F. Codd). Codd was very clever, and like all clever things made by clever people, his process makes something complex into something relatively simple. He made three rules, and we can use those rules on our Data to make sure it never becomes redundant and to ensure that every Relation has all the information it should have. These rules Codd called 1st, 2nd and 3rd Normal Form. The output of each rule is used as the input to the next, and the process begins with 0th Normal Form.

0th Normal Form 0NF

0th normal form doesn't formally exist but has great descriptive value. It represents data held by you that hasn't yet undergone any form of normalisation. If you have already done your Data Modelling you can start with a list of your entities and give those entities the fields you know are associated with them. 0NF is invariably a spreadsheet, a set of business forms, a timetable or a list of items.

Let's take a look at an invoice and attempt to normalise it.

[Figure 2.10: the sample invoice]
Using the invoice in Figure 2.10 we could create a Table that we think would hold all the relevant data. It would be in 0NF and would be a single row with the following columns:

Column          | Value
Invoice No.     | 08000590
Date            | 17/04/2004
Your Ref.       | P0123456
Company Name    | Cannes Festival Events
Company Address | 8590 La Croisette, Cannes, Cote d'Azur, France
Invoice Line 1  | Film Location $15000
Invoice Line 2  | Catering for 20 $30
Invoice Line 3  | Transport $250
Invoice Line 4  | Souvenirs $15

Figure 2.11

1st Normal Form 1NF

The rule of first normal form is:

Remove all Duplicate Columns and assign a suitable Key.

Duplicate columns are columns of data that are fundamentally equivalent. The Invoice Line columns are an example of these, so we need to change them. We must do the following:

1) Give the invoice a primary key - a key that will uniquely identify this document, e.g. the invoice number.
2) Move the duplicating columns off the table but keep a foreign key reference.

Now that we have removed the duplicate columns we have broken the original table down into the following two tables:

Invoices

Invoice Number | Date       | Your Ref. | Company Name           | Company Address
08000590       | 17/04/2004 | P0123456  | Cannes Festival Events | 8590 La Croisette, Cannes, Cote d'Azur, France

Figure 2.12

Invoice Lines

InvoiceLineID | Description | Price   | Invoice Number
1             | Location    | $15,000 | 08000590
2             | Catering    | $30     | 08000590
3             | Transport   | $250    | 08000590
4             | Souvenirs   | $15     | 08000590

Figure 2.13

The table Invoices above is given a Primary Key called Invoice Number. The table Invoice Lines above has an assumed Primary Key called InvoiceLineID and contains a Foreign Key field called Invoice Number. This Foreign Key is what relates each Invoice Line to its Invoice. The Primary Keys in both tables are unique for each row. No two Invoices will have the same Invoice Number and no two Invoice Lines will have the same InvoiceLineID.

2nd Normal Form

The second rule of normalisation is:

Remove subsets of data that apply to multiple rows of a table and place them in separate tables.

In the Invoices table it would be possible for a company to be invoiced more than once. This would mean having to repeat the company name Cannes Festival Events on every one of its invoices. We wish to avoid repeating data if possible, as something as simple as the company changing its name would cause us problems. To do this we create another table for companies and move the relevant information over.

Companies

CompanyID | Company Name           | Address 1         | Address 2 | Address 3   | Address 4
1         | Cannes Festival Events | 8590 La Croisette | Cannes    | Cote d'Azur | France

Figure 2.14

The Invoices table would now look like this:

Invoice Number | Date       | Your Ref. | CompanyID
08000590       | 17/04/2004 | P0123456  | 1
Figure 2.15

The company name and address have been replaced with a single Foreign Key, CompanyID. Now if any changes to the company are made, they are made once in the Companies table and every part of the database that uses it sees the change immediately.

3rd Normal Form

This last step in the process does the following:

All attributes that are not dependent on the primary key must be eliminated.

This is the point at which normalisation can become overly cumbersome and can be of little to no benefit. We could, for example, say that each part of the address (Address 1 to Address 4) be given its own table! In reality, that would be madness, so in this case it would be best to leave the tables in 2NF. In fact, in most cases, 2NF is more than sufficient.
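The 1NF and 2NF steps described above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names and the use of dictionaries are assumptions, and in Access the same steps would be carried out by designing tables rather than by running code.

```python
# The 0NF invoice as one flat record with repeating Invoice Line columns.
flat_invoice = {
    "Invoice No.": "08000590",
    "Date": "17/04/2004",
    "Your Ref.": "P0123456",
    "Company Name": "Cannes Festival Events",
    "Company Address": "8590 La Croisette, Cannes, Cote d'Azur, France",
    "Invoice Line 1": ("Film Location", 15000),
    "Invoice Line 2": ("Catering for 20", 30),
    "Invoice Line 3": ("Transport", 250),
    "Invoice Line 4": ("Souvenirs", 15),
}


def to_first_normal_form(flat):
    """1NF: move the duplicate Invoice Line columns into their own rows."""
    invoice = {
        "Invoice Number": flat["Invoice No."],  # chosen as the primary key
        "Date": flat["Date"],
        "Your Ref.": flat["Your Ref."],
        "Company Name": flat["Company Name"],
        "Company Address": flat["Company Address"],
    }
    lines = []
    # Plain sorting is enough here because there are fewer than ten lines.
    for key in sorted(k for k in flat if k.startswith("Invoice Line")):
        description, price = flat[key]
        lines.append({
            "InvoiceLineID": len(lines) + 1,             # primary key of the line
            "Description": description,
            "Price": price,
            "Invoice Number": invoice["Invoice Number"],  # foreign key
        })
    return invoice, lines


def to_second_normal_form(invoice, companies):
    """2NF: replace the company details with a CompanyID foreign key.

    `companies` maps (name, address) to CompanyID and is filled as we go.
    """
    details = (invoice["Company Name"], invoice["Company Address"])
    if details not in companies:
        companies[details] = len(companies) + 1
    slim = {k: v for k, v in invoice.items()
            if k not in ("Company Name", "Company Address")}
    slim["CompanyID"] = companies[details]
    return slim


companies = {}
invoice, invoice_lines = to_first_normal_form(flat_invoice)
invoice = to_second_normal_form(invoice, companies)
```

After these two steps the invoice record holds only its own facts plus a CompanyID, each invoice line is a separate row that points back to the invoice, and the company details are stored once.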
Based on this process we can now create an entity relationship diagram that represents the data and relations.

[Figure 2.16: entity relationship diagram
    Invoice Lines: InvoiceLineID, Description, Price, InvoiceNumber
    Invoices: InvoiceNumber, Date, Your Ref., CompanyID
    Companies: CompanyID, CompanyName, Address1, Address2, Address3, Address4]

Correctly modelling a database from the outset is vital, as making Table level changes 6 months into a project will mean that every object based on that table, whether it be a Query, Form or Report, will also need changing.

Data Integrity

By moving the Invoice Lines data to an Invoice Lines relation and the Company data to a Company relation we have increased the reliability of the data held in our Database. No more getting customer names wrong; no more lazy addresses that we can't work out later; no more wrong pricing on products; no more Mr instead of Ms; no more incorrect telephone numbers; no more simple mistakes. All because we now have a good model!

However, the Devil is always in the Detail.

• Primary Keys must be unique! Primary Keys are IDs that each identify one, and only one, entity in our Relation. That is, no two companies may have the same CompanyID.
• Foreign Keys must have an equivalent Primary Key! If, in the above example, an invoice were to contain the number 3 as the Foreign Key of a company but there were no Company with a CompanyID of 3, we would have a problem. In this case we would say that the data lacked integrity.

Maintaining these Primary and Foreign Keys is referred to as Maintaining Data Integrity, or more correctly Maintaining Referential Integrity. By referential we mean that we are only interested in Primary Keys and the references to those Primary Keys, which we call Foreign Keys. We don't care about anything else; a company's name may change, their address may change, their mobile number may change, but they are always the same company and will always have the same CompanyID. Also, an Invoice Line's description may change, its price may change, but it is still the same Invoice Line and will always have the same ID.

Access as a Relational Database System

What Data does not do is look after itself. In a paper filing system the pages don't get themselves out and put themselves away; a secretary does that and manages the data. The secretary makes sure all the i's are dotted and all the t's crossed. The secretary ensures all the reference numbers are correct and updates addresses and telephone numbers.

What Access does, and what all relational database systems do (Oracle, SQL Server, MySQL, PostgreSQL, SQLite and all others), is maintain references between relations. The following are characteristics of every relational database system:

• All Relations have a Primary Key.
• Primary Keys are ALWAYS UNIQUE within that Relation.
• Foreign Keys are equal to a Primary Key of their related Relation, OR NULL.
• A Foreign Key may change to another Primary Key of its related Relation or be changed to NULL.
• A Foreign Key can NEVER point to the Primary Key of some other, unrelated Relation; that is illegal.

In keeping with the above rules a Relational Database System maintains the data. These rules have some consequences due to events that can occur with the data, and these involve Deleting and Updating records.

Cascade Delete

Using our Invoicing model above, let's ask some questions about the consequences of certain scenarios:

1) What would happen if we deleted an Invoice Line record?
2) What would happen if we deleted an Invoice record?
3) What would happen if we deleted a Company record?

And the answers:

1) A record in the Invoice Line relation would be deleted, but that's all.
2) Several records in Invoice Line that were related to the deleted Invoice would now relate to nothing!
3) Several records in the Invoice relation would now relate to no Company records!
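The orphaned records described in answers 2 and 3 can be reproduced in a small experiment. The sketch below uses Python's built-in sqlite3 module as a stand-in for Access (an assumption; Access has its own engine and dialogs), with hypothetical table definitions modelled on the Invoices and Companies relations. SQLite only checks Foreign Keys when the foreign_keys setting is switched on, which makes it easy to compare both situations.

```python
import sqlite3


def build_database(enforce):
    """Create a tiny Companies/Invoices database, with or without FK checks."""
    # isolation_level=None means autocommit, so the PRAGMA applies straight away.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(f"PRAGMA foreign_keys = {'ON' if enforce else 'OFF'}")
    conn.executescript("""
        CREATE TABLE Companies (CompanyID INTEGER PRIMARY KEY, CompanyName TEXT);
        CREATE TABLE Invoices (
            InvoiceNumber TEXT PRIMARY KEY,
            CompanyID     INTEGER REFERENCES Companies(CompanyID)
        );
        INSERT INTO Companies VALUES (1, 'Cannes Festival Events');
        INSERT INTO Invoices VALUES ('08000590', 1);
    """)
    return conn


def orphaned_invoices(conn):
    """Invoices whose CompanyID matches no Company record."""
    rows = conn.execute("""
        SELECT Invoices.InvoiceNumber FROM Invoices
        LEFT JOIN Companies ON Companies.CompanyID = Invoices.CompanyID
        WHERE Invoices.CompanyID IS NOT NULL AND Companies.CompanyID IS NULL
    """)
    return [number for (number,) in rows]


# Without enforcement the Company disappears and the Invoice is orphaned.
loose = build_database(enforce=False)
loose.execute("DELETE FROM Companies WHERE CompanyID = 1")
orphans = orphaned_invoices(loose)

# With enforcement the same delete is refused while an Invoice refers to it...
strict = build_database(enforce=True)
try:
    strict.execute("DELETE FROM Companies WHERE CompanyID = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

# ...and so is an Invoice that refers to a CompanyID that does not exist.
try:
    strict.execute("INSERT INTO Invoices VALUES ('08000591', 3)")
    insert_blocked = False
except sqlite3.IntegrityError:
    insert_blocked = True
```

In this sketch the database simply refuses the offending operations; cascading is a separate option that has to be asked for on the relationship.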
This obviously is not a situation we want; we can't have Invoices without Companies, nor can we have Invoice Lines that have no Invoice; none of this would make any sense!

So, what Access can do (when the Cascade Delete Related Records option is ticked for the relationship) is Cascade the Delete operation and delete all related records! That way data integrity is maintained. That, however, may not be the outcome you wanted. The alternatives are to have the deletion refused while related records still exist, which is what Access does when referential integrity is enforced without cascading, or to mark a record as being deleted (using a Deleted field) rather than actually deleting it.

Cascade Update

Another situation that may arise is that a Primary Key value may change. When this situation occurs, all the Foreign Keys in related Tables contain a value that no longer exists! To overcome this problem a good relational database system will update the Foreign Key of all related Tables as well as the Primary Key, and therefore maintain data integrity.
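Both cascade behaviours can be tried out in a short sketch. It again uses Python's built-in sqlite3 module rather than Access, so the syntax is SQLite's ON UPDATE CASCADE and ON DELETE CASCADE clauses and not the Access relationship options; the table layout and the replacement invoice number are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks Foreign Keys only when asked
conn.executescript("""
    CREATE TABLE Invoices (InvoiceNumber TEXT PRIMARY KEY, InvoiceDate TEXT);
    CREATE TABLE InvoiceLines (
        InvoiceLineID INTEGER PRIMARY KEY,
        Description   TEXT,
        InvoiceNumber TEXT REFERENCES Invoices(InvoiceNumber)
                      ON UPDATE CASCADE ON DELETE CASCADE
    );
    INSERT INTO Invoices VALUES ('08000590', '17/04/2004');
    INSERT INTO InvoiceLines (Description, InvoiceNumber) VALUES
        ('Film Location', '08000590'),
        ('Catering for 20', '08000590');
""")


def line_keys():
    """Foreign Key values currently held by the Invoice Lines."""
    rows = conn.execute(
        "SELECT InvoiceNumber FROM InvoiceLines ORDER BY InvoiceLineID"
    )
    return [key for (key,) in rows]


# Cascade Update: changing the Primary Key also changes the related Foreign Keys.
conn.execute(
    "UPDATE Invoices SET InvoiceNumber = '09000001' WHERE InvoiceNumber = '08000590'"
)
keys_after_update = line_keys()

# Cascade Delete: deleting the Invoice also deletes its Invoice Lines.
conn.execute("DELETE FROM Invoices WHERE InvoiceNumber = '09000001'")
keys_after_delete = line_keys()
```

After the update every Invoice Line carries the new invoice number, and after the delete no Invoice Lines remain, so no record is ever left pointing at nothing.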
The Access Relationship Diagram

Access contains its own (very useful) diagram to model data. Open up TeachingInstituteSoftwareSystem.accdb. Click the Relationships button, which can be found in the Relationships group of the Database Tools tab on the Ribbon. You should see something like this. There are more tables in the database but some have been hidden to make the diagram easier to read.
[Figure: the Access Relationships window, annotated with callouts 1 to 5]

1. The infinity sign represents the Many part of a relationship.
2. The 1 sign represents the 1 part of a relationship.
3. CourseTypeID is a Foreign Key of tblcoursetype (not shown).
4. CourseID is a primary key as noted by the key symbol.
5. tblstudents and tblcourse are connected by a Many to Many relationship.
Questions

1) A Database is a collection of what? (3 answers)
    a. Cups and saucers
    b. Tables
    c. Relations
    d. Crystals
    e. Entities

2) Which of the following describes the data held within one instance of your answers to question 1? (1 answer)
    a. A variety of data about any topic
    b. A concise list of data specific to a topic
    c. Null
    d. Information about anything

3) True or false?
    a. A Relation has Relationships.
    b. Relations don't work together.
    c. One-to-many relationships are the norm.
    d. Many-to-Many relationships are hiding information.

4) Examine the following diagram and answer the questions below.

    [Diagram: the Wards, Patients and Register relations]

    a. What is the relationship between Wards and Patients?
    b. When a Patient is sent to a Ward what happens?

5) Using the same diagram as question 4:
    a. Give a good Primary Key name for Ward.
    b. Give a good Primary Key name for Patient.
    c. Register has a three-part Primary Key; what are ii and iii?
        i. Registration-Date
        ii. ??
        iii. ??

6) Using the following diagram, show a one-to-many relationship where one Owner has many Animals.

    [Diagram: the Owner and Animals relations]

7) A law is brought in by government that requires that every animal is microchipped and that a record is kept of every past Owner of an Animal. Update the diagram below to include an Owner History Relation.

    [Diagram: the Owner, Animals and Owner History relations]

8) Which parts of an Access database use Data? (4 answers)
    a. Macros
    b. Tables
    c. Forms
    d. Reports
    e. Queries

9) Which parts of an Access database use Macros? (4 answers)
    a. Macros
    b. Tables
    c. Forms
    d. Reports
    e. Queries

10) Which parts of an Access database use Queries? (4 answers)
    a. Macros
    b. Tables
    c. Forms
    d. Reports
    e. Queries

11) True or false?
    a. Modelling one's data is a waste of time.
    b. Relations are meaningless in this age of information!
    c. A good data model is well worth the time to make.
    d. Normalisation is a big waste of human resources.
    e. Normalisation complements data modelling.

12) Which of the following can uniquely identify a record in a Table? (4 answers)
    a. Customer Name
    b. Social Security Number
    c. Invoice Number
    d. Membership ID
    e. Customer ID

13) Which of the following is a Composite Primary Key? (2 answers)
    a. Mobile Number
    b. Invoice ID
    c. IBAN
    d. Member Number and Bill Number
    e. Table ID and Order ID

14) Which of the following would be a foreign key?
    a. A StudentID in a Table of Courses.
    b. A StockItemID in a Table of Stock Items.
    c. A CarID in a Table of Cars.
    d. A SupplierID in a Table of Consumables.
    e. A BookID in a Table of Lending Cards.

15) Below is a list of Relations followed by potential fields; underline those fields that may be repeating groups.
    a. Invoice (date, time, value, item description, item value, quantity)
    b. Parents (child's name, child's DOB, name, DOB, address, contact number)
    c. Players (name, DOB, match date, home/away, score home, score away)
    d. Cars (make, model, doors, engine capacity, owner name, address, dob)
    e. Houses (bedrooms, bathrooms, address, tenant name, rent value, startdate)

16) How is a Many-to-Many relationship found in a Report in 0NF best normalised? (2 answers)
    a. Leave it alone.
    b. Consider it a one-to-one.
    c. Break it down into two one-to-many relationships.
    d. Merge them together into one entity.
    e. Add a Relation that represents the many-to-many relationship.

17) Which of the following are one-to-many relationships? (3 answers)
    a. One Entity may relate to several of another Entity.
    b. For every one Record there are many records in another Relation.
    c. Many Records can have a relationship with one table entry.
    d. Lots of students attend lots of classes.
    e. A practice of doctors looks after a panel of patients.

18) True or false?
    a. Macros work with forms.
    b. Reports use queries.
    c. Only forms use queries.
    d. Forms use tables.
    e. Queries use reports.

19) What were the two relationships discussed in this unit?

20) How many relations do you need for a Many-To-Many relationship?
Answers

1) (b), (c), (e)

2) (b)

3) See below
    a. True
    b. False
    c. True
    d. True

4) See below
    a. Many-to-Many.
    b. It would be recorded in the Register Relation.

5) See below
    a. Ward-ID, Room-ID, etc. as long as it is descriptive and relevant.
    b. Patient-ID, Medical-ID, etc. as long as it is descriptive and relevant.
    c. See below
        i. Registration-Date
        ii. The Foreign Key for Wards.
        iii. The Foreign Key for Patients.

6) See diagram

    [Diagram: one-to-many relationship from Owner to Animals]

7) See diagram

    [Diagram: the Owner, Animals and Owner History relations]

8) See below
    a. False
    b. True
    c. True
    d. True
    e. True

9) See below
    a. True
    b. True
    c. True
    d. True
    e. False

10) See below
    a. True
    b. False
    c. True
    d. True
    e. True

11) See below
    a. False
    b. False
    c. True
    d. False
    e. True

12) See below
    a. False
    b. True
    c. True
    d. True
    e. True

13) See below
    a. No
    b. No
    c. No
    d. Yes
    e. Yes

14) See below
    a. Yes
    b. No
    c. No
    d. Yes
    e. Yes

15) See below
    a. Invoice (date, time, value, item description, item value, quantity)
    b. Parents (child's name, child's DOB, name, DOB, address, contact number)
    c. Players (name, DOB, match date, home/away, score home, score away)
    d. Cars (make, model, doors, engine capacity, owner name, address, dob)
    e. Houses (bedrooms, bathrooms, address, tenant name, rent value, startdate)

16) See below
    a. No
    b. No
    c. Yes
    d. No
    e. Yes

17) See below
    a. True
    b. True
    c. True
    d. False
    e. False

18) See below
    a. True
    b. True
    c. False
    d. True
    e. False

19) One-to-many, Many-to-many

20) 3