2 Tammy Pirmann HS CS teacher in PA NSF RET in Big Data with Temple University Teach CS Principles course Slobodan Vucetic Temple University NSF research project involving Big Data education through the pipeline
3 CS Principles: Big Idea III. Data: Data and information facilitate the creation of knowledge. CSTA K-12 Standards (5.3.A): CT 4. Compare techniques for analyzing massive data collections CPP 11. Describe techniques for locating and collecting small and large-scale data sets Job growth for data scientists
4 Slobodan Vucetic teaches a great graduate level course at Temple that uses BIG data sets He created an undergrad course based on the successful grad course He and I worked together so I would understand the data sets and how college students work with them I wrote a unit for HS students based on his undergrad course
5 In my class, this unit follows a few basic App Inventor tutorials and a lesson on abstraction Students have varying degrees of comfort with spreadsheets and databases Students have read the first two chapters of Blown to Bits by Abelson, Ledeen, Lewis
6 1. Orient: activate, motivate, prepare 2. Explore: observe, analyze 3. Form Concepts: questions 4. Apply: examples and problems 5. Close: reflect and assess
7 Wear two hats Take on the role of student and see how the student interacts with the material Remain an educator and think about what you can use in your situation Break into groups of 4, making sure that at least one member has a device and the files
9 In 2006 Netflix offered a $1,000,000 prize to the team that could create a movie recommendation system that was 10% better than their existing one. In 2009 that prize went to "BellKor's Pragmatic Chaos". In this activity, we will work with a smaller (but still very large) set of movie data to explore how data can be used to generate useful information.
10 Why is a movie recommendation system worth a million dollars to Netflix?
11 There are three interrelated sets of data The movies
12 The people The ratings
13 1. What scale is being used for recommendations? How many stars?
14 2. What information are we keeping track of for each movie?
15 3. Can a person rate more than one movie?
16 4. What information are we capturing from our users? How are we capturing this data?
17 What additional movie data would be useful?
18 What might we want to know about the people doing the ratings?
19 Is it possible for a movie to never be rated? What effect does that have?
20 How would you go about determining which movie is the "best" movie?
21 Why would people rate movies?
22 Who might use this data, and how?
23 Discuss and agree on three potential problems inherent in an online rating and recommendation system. Be prepared to report out to the class. Discuss and agree on three questions you would like the answers to based on this data. Are there any additional data points needed in order to answer any of your questions? What additional data points would it have been helpful to have access to?
24 Open the people text file How is it formatted? What type of file would you expect this type of data to be in? Why? Open the movie text file How is it different?
25 These three files are related to each other The people rated movies We will make three tabs in one Excel file Video tutorial * I have a completed Excel workbook available on the next day for scaffolding, absences, etc.
26 What happened with the vote data? It turns out that Excel has limits There are too many rows in the vote data to be imported into Excel Google spreadsheets can only handle 400,000 cells!
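One way to see the limit before importing is to count the rows in the text file first. The sketch below is only an illustration: it assumes one record per line, and the sample data and the `votes.txt` file name mentioned in the comment are hypothetical. The row limits are the worksheet limits for Excel 2007 and later (1,048,576 rows) and for the older .xls format (65,536 rows).

```python
import io

# Worksheet row limits: Excel 2007 and later vs. the older .xls format.
EXCEL_MAX_ROWS = 1_048_576
EXCEL_XLS_MAX_ROWS = 65_536


def count_rows(lines):
    """Count non-empty lines in any iterable of text lines (such as an open file)."""
    return sum(1 for line in lines if line.strip())


def fits_in_sheet(row_count, limit=EXCEL_MAX_ROWS):
    """Return True if row_count rows fit within a worksheet row limit."""
    return row_count <= limit


# Stand-in for open("votes.txt"); the file name and contents are hypothetical.
sample = io.StringIO("1 10 5\n1 11 3\n2 10 4\n")
rows = count_rows(sample)
print(rows, fits_in_sheet(rows))
```

With a real file, the `io.StringIO` stand-in would be replaced by an open file handle.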
27 One thing you should have noticed is that the data does not have any labels We need to create field labels for this data Let's start with the people tab: What do you think are good labels for the columns of data? The movie tab presents a significant problem We have a file called a read-me file that tells us what each column is
28 I have a question: can we trust this data? Can I use it to say "The data shows that males between 12 and 24 prefer action movies over romance movies"? Do I have confidence in the demographic data? Use the sort function to sort the people data on age. What do you notice?
29 We break into small groups based on previous experience with Excel I teach sort, filter, the count function, renaming tabs Students then use this to determine the percent of people who have probably lied on the form: liars/all people
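The liars/all people calculation can also be sketched outside Excel. The ages below are made up, and the cutoffs for a "believable" age (5 to 100) are an assumption for illustration only; a class would agree on its own rule after sorting the data.

```python
def percent_invalid(ages, min_age=5, max_age=100):
    """Percent of ages outside the range we are willing to believe.

    min_age and max_age are hypothetical cutoffs, not values from the data set.
    """
    invalid = sum(1 for age in ages if not (min_age <= age <= max_age))
    return 100.0 * invalid / len(ages)


# Made-up sample of the age column from the people tab.
ages = [1, 14, 16, 17, 25, 34, 101, 15, 16, 0]
result = percent_invalid(ages)
print(f"{result:.1f}% of people probably lied about their age")
```

This mirrors the spreadsheet steps: filter on the age column, count the rows that remain, and divide by the count of all rows.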
30 The original groups of students choose a question they wrote down on the first day They now determine how to go about getting the answer to that question from the data This is an analysis plan, not the actual analysis (since some of them have questions that may need a more powerful tool)
31 What genre of movies do people like me give the highest ratings to? We need to determine people like me from the people data We then need to find all the ratings provided by them We need to put those ratings into genre buckets
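The three steps of this analysis plan can be sketched in a few lines. Everything here is hypothetical: the records are made up, the field layout is assumed, and `is_like_me` stands in for whatever definition of "people like me" a group agrees on.

```python
from collections import defaultdict

# Made-up stand-ins for the three data sets.
people = {
    1: {"age": 16, "gender": "F"},
    2: {"age": 17, "gender": "F"},
    3: {"age": 40, "gender": "M"},
}
movies = {10: "Action", 11: "Romance", 12: "Comedy"}  # movie id -> genre
votes = [(1, 10, 5), (1, 11, 3), (2, 10, 4), (2, 12, 2), (3, 11, 5)]  # (person id, movie id, rating)


def is_like_me(person):
    """Hypothetical definition of 'people like me'."""
    return person["gender"] == "F" and 15 <= person["age"] <= 18


# Step 1: determine people like me from the people data.
like_me = {pid for pid, person in people.items() if is_like_me(person)}

# Steps 2 and 3: find their ratings and put them into genre buckets.
buckets = defaultdict(list)
for person_id, movie_id, rating in votes:
    if person_id in like_me:
        buckets[movies[movie_id]].append(rating)

averages = {genre: sum(r) / len(r) for genre, r in buckets.items()}
best_genre = max(averages, key=averages.get)
print(best_genre, averages[best_genre])
```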
32 Basic formulas Advanced filtering
33 Spreadsheets gave us more tools than the text file Databases give us more tools than the spreadsheet We have a database on our computers as part of Microsoft Office Open Access
34 We will import our original txt files into Access Each file will become a table in the database The people file has an id for each person which will be defined as our primary key The movie file has an id for each movie which will be defined as our primary key The ratings file has the people id and the movie id, but no ratings id.
35 Each record in the database needs to be able to be identified The primary key is how we identify each record Since each movie can only be rated by a person once, the combination of person id and movie id can be the primary key for our vote table
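The idea of a combined primary key can be tried out in code. The sketch below uses SQLite through Python's `sqlite3` module rather than Access, so it is a stand-in for the classroom tool, and the table and column names are assumptions. It shows that a second vote by the same person for the same movie is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (person_id INTEGER PRIMARY KEY, age INTEGER);
CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE votes (
    person_id INTEGER NOT NULL REFERENCES people(person_id),
    movie_id  INTEGER NOT NULL REFERENCES movies(movie_id),
    rating    INTEGER NOT NULL,
    PRIMARY KEY (person_id, movie_id)  -- the combination identifies each vote
);
""")

# Made-up records for illustration.
conn.execute("INSERT INTO people VALUES (1, 16)")
conn.execute("INSERT INTO movies VALUES (10, 'Example Movie')")
conn.execute("INSERT INTO votes VALUES (1, 10, 4)")

try:
    # Same person, same movie: violates the combined primary key.
    conn.execute("INSERT INTO votes VALUES (1, 10, 5)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)
```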
36 A relational database is one where the tables of data are related to each other by the primary keys Our tables are related through the vote table The primary key of the people table is present in the vote table The primary key of the movie table is present in the vote table
37 The simplest query to write is one based on one table We will use a query to recreate a sort and filter we had done in Excel Using the people table, let's look only at the people who entered an age we consider valid Sort these records by age We can hide the postal code if we are not using it
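For readers who want to see the same single-table query written as SQL, here is a minimal sketch. It uses SQLite from Python instead of the Access query designer, and the column names, sample rows, and the valid age range of 5 to 100 are all assumptions. Leaving `postal_code` out of the SELECT list plays the role of hiding that column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people ("
    "person_id INTEGER PRIMARY KEY, age INTEGER, gender TEXT, postal_code TEXT)"
)
# Made-up rows, including two ages we would not consider valid.
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [(1, 34, "M", "00000"), (2, 1, "F", "00001"),
     (3, 16, "F", "00002"), (4, 120, "M", "00003")],
)

# Filter to valid ages, sort by age, and omit the postal code.
rows = conn.execute(
    "SELECT person_id, age, gender FROM people "
    "WHERE age BETWEEN ? AND ? ORDER BY age",
    (5, 100),
).fetchall()
print(rows)
```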
38 Go back to your written analysis plan Write the query iteratively Start with one table and get that query working Add more complexity to your query in small chunks, checking for accuracy at each step
39 After using the Movie data to teach spreadsheets and databases, we change data sets lest the students believe that big data and recommendation systems are synonymous The Portland data is even larger than the movie data and represents the movements of the people of Portland, Oregon over a 24 hour period
40 Locations - The city is divided into a grid with each square given a numeric representation Demographics - Each person has an id and demographic data associated with them Activities - Each type of activity is given a numeric representation Time - measured in seconds past midnight
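Times stored as seconds past midnight are hard to read at a glance. A small helper like the one below converts them to clock time; the function name is hypothetical and the sketch assumes whole seconds within a single day.

```python
def seconds_to_clock(seconds):
    """Convert seconds past midnight to an HH:MM:SS string."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


print(seconds_to_clock(27000))
```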
41 With this information, what can we learn?
42 Is it possible there are questions that we should not ask?
43 This data could be used by an urban planner to determine if the city needs a large venue in a particular part of the city It could also be used to determine if a major highway needs more capacity It could be used to predict where utilities are most needed by hour
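A question such as "where are people by hour?" comes down to grouping records by location and hour. The sketch below shows one way to do that; the record layout (person id, location id, activity id, start time in seconds past midnight) and the values are assumptions for illustration, not the actual Portland file format.

```python
from collections import Counter

# Hypothetical records: (person_id, location_id, activity_id, start_seconds)
records = [
    (1, 101, 2, 28800),
    (2, 101, 2, 30600),
    (3, 205, 4, 29000),
    (1, 205, 4, 64800),
]

# Group by (location, hour of day); integer division turns seconds into an hour.
counts = Counter((location, start // 3600) for _, location, _, start in records)

busiest = counts.most_common(1)[0]
print(busiest)
```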
44 This type of data could show drivers which roads have the most traffic on them Data can show us how much time people spend on their commute Companies can use this type of data to determine where to open a franchise
45 Each group of students brainstorms and provides the teacher three proto-concepts for deeper analysis of the Portland data The teacher returns the concepts with one chosen for the group (to eliminate duplication) Students work together to develop that concept and find the answers in the data Students report out to the class what they did and the results
46 Two options to allow you to scaffold the project to different ability levels Both options have the same format Proposal Data acquisition Data analysis plan Final report
47 Proposal Includes the community, the question you want to answer, why it should be answered, what data will be collected and how the answer will be provided to the community Data collection plan and form Data analysis plan Final report
48 Proposal Data collection plan and form Create a form in Google Docs, disseminate the form via , forums, link on website, etc Data analysis plan Final report
49 Proposal Data collection plan and form Data analysis plan You have several tools at your disposal to analyze your data. Decide which tools you will use and why. Develop the queries, sorts and filters that you will use when the data is collected. Be sure your data analysis plan covers the main questions that originally prompted you to collect this data. Final report
50 Proposal Data collection plan and form Data analysis plan Final report Produce a written report back to the community to share the information discovered by your analysis of the data provided by the community. This report should use illustrations or charts where appropriate
51 The only difference for option 2 is that the student will find/access existing data There are many large data sets available from the government Some organizations may also have raw data for the student to work with (Scouts, church groups, etc)
52 The Portland data could be used with Processing to create animated graphs of people movement What's your idea?