2 Tammy Pirmann: HS CS teacher in PA; NSF RET in Big Data with Temple University; teaches the CS Principles course. Slobodan Vucetic: Temple University; NSF research project involving Big Data education through the pipeline.
3 CS Principles: Big Idea III. Data: Data and information facilitate the creation of knowledge. CSTA K-12 Standards (5.3.A): CT 4. Compare techniques for analyzing massive data collections. CPP 11. Describe techniques for locating and collecting small- and large-scale data sets. Job growth for data scientists.
4 Slobodan Vucetic teaches a great graduate-level course at Temple that uses BIG data sets. He created an undergrad course based on the successful grad course. He and I worked together so I would understand the data sets and how college students work with them. I wrote a unit for HS students based on his undergrad course.
5 In my class, this unit follows a few basic App Inventor tutorials and a lesson on abstraction. Students have varying degrees of comfort with spreadsheets and databases. Students have read the first two chapters of Blown to Bits by Abelson, Ledeen, and Lewis.
6 1. Orient: activate, motivate, prepare 2. Explore: observe, analyze 3. Form Concepts: questions 4. Apply: examples and problems 5. Close: reflect and assess
7 Wear two hats. Take on the role of student and see how the student interacts with the material. Remain an educator and think about what you can use in your situation. Break into groups of 4, making sure that at least one member has a device and the files.
9 In 2006 Netflix offered a $1,000,000 prize to the team that could create a movie recommendation system that was 10% better than their existing one. In 2009 that prize went to "BellKor's Pragmatic Chaos". In this activity, we will work with a smaller (but still very large) set of movie data to explore how data can be used to generate useful information.
10 Why is a movie recommendation system worth a million dollars to Netflix?
11 There are three interrelated sets of data The movies
12 The people The ratings
13 1. What scale is being used for recommendations? How many stars?
14 2. What information are we keeping track of for each movie?
15 3. Can a person rate more than one movie?
16 4. What information are we capturing from our users? How are we capturing this data?
17 What additional movie data would be useful?
18 What might we want to know about the people doing the ratings?
19 Is it possible for a movie to never be rated? What effect does that have?
20 How would you go about determining which movie is the "best" movie?
21 Why would people rate movies?
22 Who might use this data, and how?
23 Discuss and agree on three potential problems inherent in an online rating and recommendation system. Be prepared to report out to the class. Discuss and agree on three questions you would like the answers to based on this data. Are there any additional data points needed in order to answer any of your questions? What additional data points would it have been helpful to have access to?
24 Open the people text file. How is it formatted? What type of file would you expect this type of data to be in? Why? Open the movie text file. How is it different?
25 These three files are related to each other: the people rated movies. We will make three tabs in one Excel file. Video tutorial. * I have a completed Excel workbook available on the next day for scaffolding, absences, etc.
26 What happened with the vote data? It turns out that Excel has limits. There are too many rows in the vote data to be imported into Excel. Google spreadsheets can only handle 400,000 cells!
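One way to see the limit before the import fails is to count the rows first. The sketch below is only an illustration: the file name `votes.txt` and the sample rows are assumptions, while the two row limits are Excel's documented worksheet maximums.

```python
import io

# Documented worksheet row limits for Excel.
XLS_MAX_ROWS = 65_536        # .xls format (Excel 97-2003)
XLSX_MAX_ROWS = 1_048_576    # .xlsx format (Excel 2007 and later)


def count_rows(lines):
    """Count the non-empty lines in an open text file (or any iterable of lines)."""
    return sum(1 for line in lines if line.strip())


def fits_in_excel(row_count, limit=XLSX_MAX_ROWS):
    """Return True if row_count rows fit on a single worksheet."""
    return row_count <= limit


# Tiny in-memory stand-in for the vote file (made-up rows).
sample = io.StringIO("1 10 4\n1 11 5\n2 10 3\n")
rows = count_rows(sample)
print(rows, fits_in_excel(rows))

# With the real file (hypothetical name):
# with open("votes.txt") as f:
#     print(fits_in_excel(count_rows(f)))
```

Counting lines this way never loads the whole file into memory, so it works even when the file is too large for a spreadsheet.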
27 One thing you should have noticed is that the data does not have any labels. We need to create field labels for this data. Let's start with the people tab: What do you think are good labels for the columns of data? The movie tab presents a significant problem. We have a file called a read-me file that tells us what each column is.
28 I have a question: can we trust this data? Can I use it to say, "The data shows that males between 12 and 24 prefer action movies over romance movies"? Do I have confidence in the demographic data? Use the sort function to sort the people data on age. What do you notice?
29 We break into small groups based on previous experience with Excel. I teach sort, filter, the count function, and renaming tabs. Students then use this to determine the percent of people who have probably lied on the form: liars / all people.
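The liars / all people calculation can also be sketched outside Excel. In the example below the age cutoffs and the sample ages are assumptions made up for illustration; the class would pick its own definition of an implausible age after sorting the real data.

```python
# Assumption: ages outside this range are treated as "probably not true".
# The cutoffs are illustrative; the class should choose its own.
MIN_PLAUSIBLE_AGE = 5
MAX_PLAUSIBLE_AGE = 100


def percent_implausible(ages, low=MIN_PLAUSIBLE_AGE, high=MAX_PLAUSIBLE_AGE):
    """Return liars / all people as a percentage."""
    if not ages:
        raise ValueError("no people to count")
    liars = sum(1 for age in ages if age < low or age > high)
    return 100.0 * liars / len(ages)


# Made-up ages standing in for the age column of the people tab.
ages = [17, 34, 0, 120, 45, 22, 1, 63]
print(f"{percent_implausible(ages):.1f}% of people probably lied")
```

This mirrors the spreadsheet steps: the filter is the `if` condition, and the count function is the `sum`.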
30 The original groups of students choose a question they wrote down on the first day. They now determine how to go about getting the answer to that question from the data. This is an analysis plan, not the actual analysis (since some of them have questions that may need a more powerful tool).
31 What genre of movies do people like me give the highest ratings to? We need to determine "people like me" from the people data. We then need to find all the ratings provided by them. We need to put those ratings into genre buckets.
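The three steps of this analysis plan can be sketched in a few lines. Everything concrete here is an assumption for illustration: the sample rows, the single genre per movie, and the definition of "people like me" in `is_like_me` are all made up, and the real files may be organized differently.

```python
from collections import defaultdict

# Made-up sample rows; the real files have their own columns and values.
people = {1: {"age": 16, "gender": "F"},
          2: {"age": 17, "gender": "F"},
          3: {"age": 45, "gender": "M"}}
movies = {10: "Action", 11: "Romance", 12: "Comedy"}   # movie_id -> genre
ratings = [(1, 10, 5), (1, 11, 3), (2, 10, 4),
           (2, 12, 2), (3, 11, 5)]                     # (person_id, movie_id, stars)


def is_like_me(person):
    """Hypothetical definition of 'people like me': female, aged 15 to 18."""
    return person["gender"] == "F" and 15 <= person["age"] <= 18


def average_rating_by_genre(people, movies, ratings, like_me):
    # Step 1: determine "people like me" from the people data.
    my_people = {pid for pid, person in people.items() if like_me(person)}
    # Steps 2 and 3: find their ratings and put them into genre buckets.
    buckets = defaultdict(list)
    for person_id, movie_id, stars in ratings:
        if person_id in my_people:
            buckets[movies[movie_id]].append(stars)
    return {genre: sum(s) / len(s) for genre, s in buckets.items()}


averages = average_rating_by_genre(people, movies, ratings, is_like_me)
best_genre = max(averages, key=averages.get)
print(averages, best_genre)
```

Changing only `is_like_me` lets each student answer the question for their own group without touching the rest of the plan.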
32 Basic formulas Advanced filtering
33 Spreadsheets gave us more tools than the text file. Databases give us more tools than the spreadsheet. We have a database on our computers as part of Microsoft Office. Open Access.
34 We will import our original txt files into Access. Each file will become a table in the database. The people file has an id for each person, which will be defined as its primary key. The movie file has an id for each movie, which will be defined as its primary key. The ratings file has the people id and the movie id, but no ratings id.
35 Each record in the database needs to be uniquely identifiable. The primary key is how we identify each record. Since each person can rate a given movie only once, the combination of person id and movie id can be the primary key for our vote table.
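The same table design can be tried outside Access. As a rough sketch, the block below uses Python's built-in sqlite3 module in place of Access; the table and column names (people, movies, votes, person_id, movie_id, rating) and the sample rows are illustrative assumptions, not the names in the actual files.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
conn.executescript("""
CREATE TABLE people (
    person_id   INTEGER PRIMARY KEY,
    age         INTEGER,
    postal_code TEXT
);
CREATE TABLE movies (
    movie_id INTEGER PRIMARY KEY,
    title    TEXT
);
CREATE TABLE votes (
    person_id INTEGER NOT NULL REFERENCES people(person_id),
    movie_id  INTEGER NOT NULL REFERENCES movies(movie_id),
    rating    INTEGER,
    PRIMARY KEY (person_id, movie_id)   -- composite key: one vote per person per movie
);
""")
conn.execute("INSERT INTO people VALUES (1, 16, '19000')")
conn.execute("INSERT INTO movies VALUES (10, 'Example Movie')")
conn.execute("INSERT INTO votes VALUES (1, 10, 4)")

# A second vote by the same person for the same movie violates the key.
try:
    conn.execute("INSERT INTO votes VALUES (1, 10, 2)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)
```

The composite key does two jobs at once: it identifies each vote record and it enforces the one-rating-per-person-per-movie rule.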
36 A relational database is one where the tables of data are related to each other by the primary keys. Our tables are related through the vote table: the primary key of the people table is present in the vote table, and so is the primary key of the movie table.
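To make the relationship concrete, here is a small sketch that follows both keys out of the vote table. It again uses Python's sqlite3 module as a stand-in for Access, and the table names, column names, and rows are assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (person_id INTEGER PRIMARY KEY, age INTEGER);
CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE votes  (person_id INTEGER, movie_id INTEGER, rating INTEGER,
                     PRIMARY KEY (person_id, movie_id));
INSERT INTO people VALUES (1, 16), (2, 40);
INSERT INTO movies VALUES (10, 'Movie A'), (11, 'Movie B');
INSERT INTO votes  VALUES (1, 10, 5), (1, 11, 3), (2, 10, 4);
""")

# Each vote row carries both primary keys, so it can be matched
# to exactly one person and exactly one movie.
rows = conn.execute("""
    SELECT p.person_id, p.age, m.title, v.rating
    FROM votes AS v
    JOIN people AS p ON p.person_id = v.person_id
    JOIN movies AS m ON m.movie_id = v.movie_id
    ORDER BY p.person_id, m.movie_id
""").fetchall()

for row in rows:
    print(row)
```

The people and movies tables never reference each other directly; every connection between a person and a movie passes through a vote.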
37 The simplest query to write is one based on one table. We will use a query to recreate a sort and filter we did in Excel. Using the people table, let's look only at the people who entered an age we consider valid. Sort these records by age. We can hide the postal code if we are not using it.
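A single-table query of this kind might look like the sketch below. It uses Python's sqlite3 module rather than the Access query designer, and the column names, the sample people, and the age range treated as valid are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (person_id INTEGER PRIMARY KEY, age INTEGER,
                     gender TEXT, postal_code TEXT);
INSERT INTO people VALUES
    (1, 34, 'M', '19000'), (2, 0, 'F', '19001'),
    (3, 17, 'F', '19002'), (4, 150, 'M', '19003');
""")

# Assumption: the class agreed that ages 5 through 100 count as valid.
MIN_AGE, MAX_AGE = 5, 100

valid_people = conn.execute(
    """
    SELECT person_id, age, gender      -- postal_code is left out ("hidden")
    FROM people
    WHERE age BETWEEN ? AND ?          -- the filter
    ORDER BY age                       -- the sort
    """,
    (MIN_AGE, MAX_AGE),
).fetchall()

print(valid_people)
```

Leaving a column out of the SELECT list is the query equivalent of hiding it: the data stays in the table but does not appear in the result.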
38 Go back to your written analysis plan. Write the query iteratively. Start with one table and get that query working. Add more complexity to your query in small chunks, checking for accuracy at each step.
39 After using the movie data to teach spreadsheets and databases, we change data sets lest the students believe that big data and recommendation systems are synonymous. The Portland data is even larger than the movie data and represents the movements of the people of Portland, Oregon over a 24-hour period.
40 Locations - The city is divided into a grid with each square given a numeric representation. Demographics - Each person has an id and demographic data associated with them. Activities - Each type of activity is given a numeric representation. Time - Measured in seconds past midnight.
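Times stored as seconds past midnight are hard to read at a glance. A minimal sketch of the conversion to clock time follows; the function name `seconds_to_clock` is made up for this example, and it assumes every time falls within a single 24-hour day.

```python
def seconds_to_clock(seconds_past_midnight):
    """Convert seconds past midnight to an HH:MM:SS string."""
    if not 0 <= seconds_past_midnight < 24 * 60 * 60:
        raise ValueError("expected a time within one 24-hour day")
    hours, remainder = divmod(seconds_past_midnight, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


print(seconds_to_clock(27000))  # 07:30:00
```

Storing time as one integer keeps sorting and subtraction simple, which is a likely reason a large data set would use it.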
41 With this information, what can we learn?
42 Is it possible there are questions that we should not ask?
43 This data could be used by an urban planner to determine if the city needs a large venue in a particular part of the city. It could also be used to determine if a major highway needs more capacity. It could be used to predict where utilities are most needed by hour.
44 This type of data could show drivers which roads have the most traffic on them. Data can show us how much time people spend on their commute. Companies can use this type of data to determine where to open a franchise.
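A question such as "where are people, by hour?" comes down to grouping records by location and hour. The sketch below is one hedged way to do that; the record layout (person id, location id, activity code, seconds past midnight) follows the four kinds of data described for the Portland set, but the field order and all sample values are assumptions.

```python
from collections import Counter

# Made-up records: (person_id, location_id, activity_code, seconds_past_midnight)
records = [
    (1, 501, 2, 8 * 3600 + 900),
    (2, 501, 2, 8 * 3600 + 1800),
    (3, 502, 4, 8 * 3600 + 60),
    (1, 501, 2, 17 * 3600),
    (2, 503, 1, 17 * 3600 + 30),
]


def records_per_location_hour(records):
    """Count records for each (location_id, hour of day) pair."""
    counts = Counter()
    for _person, location, _activity, seconds in records:
        counts[(location, seconds // 3600)] += 1
    return counts


counts = records_per_location_hour(records)
busiest = counts.most_common(1)[0]   # ((location_id, hour), count)
print(busiest)
```

The same grouping could be done with a database query or a pivot table; the point is that the hour is derived from the time field before counting.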
45 Each group of students brainstorms and gives the teacher three proto-concepts for deeper analysis of the Portland data. The teacher returns the concepts with one chosen for the group (to eliminate duplication). Students work together to develop that concept and find the answers in the data. Students report out to the class what they did and the results.
46 Two options allow you to scaffold the project to different ability levels. Both options have the same format: proposal, data acquisition, data analysis plan, final report.
47 Proposal: Includes the community, the question you want to answer, why it should be answered, what data will be collected, and how the answer will be provided to the community. Data collection plan and form. Data analysis plan. Final report.
48 Proposal Data collection plan and form Create a form in Google Docs, disseminate the form via , forums, link on website, etc Data analysis plan Final report
49 Proposal. Data collection plan and form. Data analysis plan: You have several tools at your disposal to analyze your data. Decide which tools you will use and why. Develop the queries, sorts and filters that you will use when the data is collected. Be sure your data analysis plan covers the main questions that originally prompted you to collect this data. Final report.
50 Proposal. Data collection plan and form. Data analysis plan. Final report: Produce a written report back to the community to share the information discovered by your analysis of the data provided by the community. This report should use illustrations or charts where appropriate.
51 The only difference for option 2 is that the student will find/access existing data. There are many large data sets available from the government. Some organizations may also have raw data for the student to work with (Scouts, church groups, etc.).
52 The Portland data could be used with Processing to create animated graphs of people movement. What's your idea?