Duplication Problem. Duplicating Columns


The amount of data required to perform our work today can be staggering. A fundamental question is: how should data be stored? Many people use spreadsheets for data storage. Creating a spreadsheet is simple and fast, which is why so many people use them. However, spreadsheets present a serious data problem that can only be solved by a database. The fundamental difference between spreadsheets and databases is that the former are meant for data analysis and the latter for data storage. To illustrate the problems of storing data in a spreadsheet, we'll take a common scenario: keeping track of event attendance.

Duplication Problem

One of the primary benefits of a database is that it reduces duplication. Duplicate data is generally bad and should be avoided. When data is duplicated, there is an increased chance of introducing errors into the data. Spreadsheets are not set up to reduce duplication. As such, data is often duplicated in a spreadsheet in three ways:

1. Duplicating columns.
2. Duplicating rows.
3. Duplicating information in a cell.

Duplicating Columns

First  Last    Golf 12  Golf 2013  Golf14
Kim    Jones   x        x          x
Bob    Miaygi  x                   x
Pat    Smith                       x

Here the events are duplicated: each of the last three columns is an event. Even in this simple example there are data anomalies, because the event names are not consistent. They should be either Golf 12, Golf 13, Golf 14; or Golf 2012, Golf 2013, Golf 2014; or Golf12, Golf13, Golf14. Is this a major problem? Not really. But it does illustrate that as more columns are added, it becomes difficult to keep the names consistent. Another problem with this layout is that it would be laborious to find out how many events a person attended. More on this below.

www.newleafdata.com (567) 455-3162 info@newleafdata.com

Duplicating Rows

First     Last    Event
Kim       Jones   Golf 12
Kimberly  Jones   Golf 13
Kim       Jones   Golf 14
Bob       Miaygi  Golf 12
Robert    Miaygi  Golf 14
Pat       Smith   Golf 14

In this example, duplicating rows means that a person is listed more than once. Because the person is listed multiple times, it may be hard to determine whether two rows refer to the same person. For example, are Kim Jones and Kimberly Jones the same person? There is no way to tell from the list.

Duplicating Information in a Cell

First  Last    Event
Kim    Jones   Golf 12
               Golf 13
               Golf 14
Bob    Miaygi  Golf 12
               Golf 14
Pat    Smith   Golf 14

Duplicating data in a cell is never a good idea. Of the three duplication scenarios, this is the hardest to query. How many events exist? How many people went to each event? Duplicating cell data almost inevitably means counting the data manually.

Adding Complexity

Even if one of the above methods were used to keep track of event attendance, it is limited in the complexity it can handle. For example, perhaps the people who come have different roles. Most are guests/attendees. Some may be volunteers, and a couple of others may be organizers. Perhaps there are various discounts available. How would that get noted?
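The three layouts above all hold the same attendance facts. As a rough sketch (not part of the original paper), the Python below reshapes the column layout and the cell layout into one (first, last, event) record per attendance, after which counting events per person is a one-liner. The function names are hypothetical, the placement of the blank cells is inferred from the other examples, and the line-break separator inside a cell is an assumption.

```python
from collections import Counter

# Layout 1: one column per event (header spellings copied from the example).
# Assumption: the blank cells for Bob and Pat are placed to match the other layouts.
wide_header = ["First", "Last", "Golf 12", "Golf 2013", "Golf14"]
wide_rows = [
    ["Kim", "Jones", "x", "x", "x"],
    ["Bob", "Miaygi", "x", "", "x"],
    ["Pat", "Smith", "", "", "x"],
]

# Layout 3: every event crammed into one cell.
# Assumption: entries inside a cell are separated by line breaks.
cell_rows = [
    ["Kim", "Jones", "Golf 12\nGolf 13\nGolf 14"],
    ["Bob", "Miaygi", "Golf 12\nGolf 14"],
    ["Pat", "Smith", "Golf 14"],
]


def wide_to_records(header, rows):
    """Turn each marked cell into a (first, last, event) record."""
    events = header[2:]
    return [
        (row[0], row[1], event)
        for row in rows
        for event, mark in zip(events, row[2:])
        if mark.strip()
    ]


def cells_to_records(rows):
    """Split each multi-value cell into one record per event."""
    records = []
    for first, last, cell in rows:
        for event in cell.splitlines():
            if event.strip():
                records.append((first, last, event.strip()))
    return records


def events_per_person(records):
    """Count attendance records for each (first, last) pair."""
    return Counter((first, last) for first, last, _ in records)


from_wide = wide_to_records(wide_header, wide_rows)
from_cells = cells_to_records(cell_rows)

print(events_per_person(from_wide))
print(events_per_person(from_cells))
```

Reshaping makes the counts easy, but it does not repair the data: the inconsistent event names ("Golf 2013" versus "Golf 13") carry straight through into the records.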

Dirty Data

Dirty data refers to inconsistency in the type and format of the data stored. In the first example of duplicating columns, an attendee is marked with an x. But the cell could just as easily contain the amount paid, or a Yes/No.

First  Last    Golf 12  Golf 2013  Golf14
Kim    Jones   $10      Not Paid   Yes
Bob    Miaygi  Paid                Paid
Pat    Smith                       $10

It may seem unlikely that our small spreadsheet would suffer from dirty data. But it doesn't take much for data to become dirty, especially as the list grows.

Query Conundrum

One of the primary reasons for storing data is to query it, that is, to ask the data questions. As mentioned above in the duplicating columns section, it would be laborious to count how many events a person attended. Counting people is a fairly low-level query; it is not very complex. An organization may really want to know the following:

How many attendees were first-time attendees?
How much money did the event make?
How much money came from returning attendees?
How much money came from first-time attendees?
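To see why the dirty-data spreadsheet cannot answer even "How much money did the event make?", here is a small Python sketch (an illustration, not the author's method) that tries to total the payment cells. The `classify` helper and the dollar-amount pattern are hypothetical, and the placement of the empty cells is inferred from the earlier attendance examples.

```python
import re

# Cell values copied from the dirty-data example; "" stands for an empty cell.
# Assumption: empty cells are placed to match the earlier attendance examples.
payments = {
    ("Kim", "Jones"): ["$10", "Not Paid", "Yes"],
    ("Bob", "Miaygi"): ["Paid", "", "Paid"],
    ("Pat", "Smith"): ["", "", "$10"],
}

# Accept only cells that look like a dollar amount, e.g. "$10" or "$10.50".
AMOUNT = re.compile(r"^\$(\d+(?:\.\d{2})?)$")


def classify(cell):
    """Return ("amount", value), ("empty", None), or ("unclear", text)."""
    text = cell.strip()
    if not text:
        return ("empty", None)
    match = AMOUNT.match(text)
    if match:
        return ("amount", float(match.group(1)))
    return ("unclear", text)


total = 0.0
unclear = []
for person, cells in payments.items():
    for cell in cells:
        kind, value = classify(cell)
        if kind == "amount":
            total += value
        elif kind == "unclear":
            unclear.append((person, value))

print(f"Known income: ${total:.2f}")
print(f"Cells that cannot be totalled: {len(unclear)}")
for person, text in unclear:
    print(f"  {person[0]} {person[1]}: {text!r}")
```

Only two of the six filled cells contain an amount. "Paid" and "Yes" say that money changed hands but not how much, so any total computed from this sheet is a guess.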

A Better Alternative

In a relational database management system (RDBMS), data can be stored and related to other data. This makes viewing and querying the data relatively easy. Moreover, the same data can often be viewed from multiple perspectives. In our example, a people layout could display all the events a person attended, while an events layout could list all the attendees. Figures such as first-time attendees, total number of attendees, and total income are quick and easy to obtain.
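A minimal sketch of what such a relational layout might look like, using Python's built-in sqlite3 module. Everything here is an assumption made for illustration: the table and column names (person, event, attendance, role, amount_paid) are hypothetical, and the $10 payments are made-up sample values rather than figures from the examples above. Each person and each event is stored once, and a third table records who attended what.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    person_id  INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL
);
CREATE TABLE event (
    event_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL UNIQUE,
    year     INTEGER NOT NULL
);
-- One row per person per event; role and amount are facts about that attendance.
CREATE TABLE attendance (
    person_id   INTEGER NOT NULL REFERENCES person(person_id),
    event_id    INTEGER NOT NULL REFERENCES event(event_id),
    role        TEXT NOT NULL DEFAULT 'guest',
    amount_paid REAL NOT NULL DEFAULT 0,
    PRIMARY KEY (person_id, event_id)
);
""")

conn.executemany("INSERT INTO person VALUES (?, ?, ?)",
                 [(1, "Kim", "Jones"), (2, "Bob", "Miaygi"), (3, "Pat", "Smith")])
conn.executemany("INSERT INTO event VALUES (?, ?, ?)",
                 [(1, "Golf 12", 2012), (2, "Golf 13", 2013), (3, "Golf 14", 2014)])
# Hypothetical sample data: every attendance paid $10.
conn.executemany(
    "INSERT INTO attendance (person_id, event_id, amount_paid) VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 2, 10), (1, 3, 10), (2, 1, 10), (2, 3, 10), (3, 3, 10)])

# "People" perspective: how many events did each person attend?
per_person = conn.execute("""
    SELECT p.first_name, p.last_name, COUNT(*) AS events_attended
    FROM attendance a
    JOIN person p ON p.person_id = a.person_id
    GROUP BY p.person_id
    ORDER BY p.person_id
""").fetchall()

# "Events" perspective: attendees, first-timers, and income for each event.
# A person counts as a first-timer at the earliest event they attended.
per_event = conn.execute("""
    WITH first_year AS (
        SELECT a.person_id, MIN(e.year) AS year
        FROM attendance a
        JOIN event e ON e.event_id = a.event_id
        GROUP BY a.person_id
    )
    SELECT e.name,
           COUNT(*)                AS attendees,
           SUM(e.year = f.year)    AS first_timers,
           SUM(a.amount_paid)      AS income,
           SUM(CASE WHEN e.year = f.year THEN a.amount_paid ELSE 0 END)
                                   AS first_time_income
    FROM attendance a
    JOIN event e ON e.event_id = a.event_id
    JOIN first_year f ON f.person_id = a.person_id
    GROUP BY e.event_id
    ORDER BY e.year
""").fetchall()

for row in per_person:
    print(row)
for row in per_event:
    print(row)
```

Income from returning attendees is simply income minus first_time_income. Because each event name is stored once in the event table, the "Golf 12" versus "Golf 2013" inconsistency cannot arise, and roles or discounts become extra columns on attendance rather than new spreadsheet layouts.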

When to Use a Spreadsheet

Although spreadsheets are not optimal for data storage, they are very useful tools for data analysis. In our example, let's say we want to keep track of the number of people who attended an event, how much the event cost, how much revenue the event created, and how much revenue per person the event created. As previously mentioned, these numbers are easy to obtain in a relational database. However, a user may want to use the matrix structure of a spreadsheet to analyze the data.

                Golf 12  Golf 13  Golf 14
Attendees       35       42       45
Cost            $80      $80      $82
Gross           $350     $420     $450
Net             $270     $340     $368
Net per person  $7.71    $8.10    $8.18

Questions

1. How many spreadsheets are on your computer, file server, or shared drive?
2. Of those spreadsheets, how many are for data analysis and how many for data storage?
3. How long does it take to find (query) the data for important information?
4. Can you quickly produce a report by simply clicking a button?

New Leaf Data, LLC can help you and your organization store data efficiently. Contact New Leaf Data, LLC today to find out how.