BUS4 118S Big Data
San José State University
Fall 2014

When and Where
When: Thursday 6:00 pm - 8:45 pm
Where: Boccardo Business Classroom (BBC) 320

Instructor
Dr. Scott Jensen
Office: BT 252
Phone: (408) 924-3487
Email: scott.jensen@sjsu.edu
Office Hours: Tuesday 4:00 pm - 5:00 pm, Thursday 4:00 pm - 5:00 pm, or by appointment

Course Description
This course explores the Big Data lifecycle and how businesses and the public sector are using Big Data to improve performance, reduce costs, and develop new products. When implementing Big Data projects, businesses have consistently identified two of the most significant hurdles to be: (1) getting data ready for analysis (a.k.a. "data wrangling"), and (2) understanding and communicating the business proposition and value. This class will explore the characteristics of Big Data (what is different about Big Data that makes data wrangling important), how businesses are using Big Data, operational issues relating to developing the business proposition, ethical considerations, and communicating the outcomes. We will also use current data tools in this area, with a focus on data wrangling and visualization since they are 80% of the effort in a Big Data project.

Course Goals and Learning Objectives
At the end of this course you should be able to:
1. Explain basic Big Data concepts and what differentiates Big Data
2. Describe how businesses are using Big Data
3. Identify when different Big Data tools should be used
4. Interpret and handle basic data formats for unstructured data
5. Discuss the ethical considerations of Big Data
6. Analyze data problems using a context, need, vision, outcome approach
7. Apply data wrangling tools covered in class
8. Visualize results and understand good visualizations
Required Texts / Readings

Big Data @ Work: Dispelling the Myths, Uncovering the Opportunities, by Thomas H. Davenport. Published by Harvard Business Review Press (2014), ISBN: 9781422168165
This book is also available on Amazon in a Kindle version.

Data Just Right: Introduction to Large-Scale Data & Analytics, by Michael Manoochehri. Published by Addison-Wesley (2014), ISBN: 9780321898654
As an SJSU student, you can read an electronic version of this book for FREE from the MLK Library as an electronic Safari book. You will need an account at the Library (if you do not have one, get one; they are free) and just search on the title. This is not as convenient as having your own copy, but you cannot beat the cost ($0.00). This book is also available from Amazon in a Kindle version.

Thinking with Data: How to Turn Information into Insights, by Max Shron. Published by O'Reilly Media (2014), ISBN: 9781449362935
Only Chapter 1, "Scoping: Why Before How," is required reading for Week 4 (September 18th). You do NOT need to purchase this since it is available as a FREE Safari ebook online from the MLK Library (same as the Data Just Right textbook). Search for Thinking with Data in the MLK Library catalog online and click on the link that says an ebook is available to SJSU Students & Faculty. Just the 16 pages of Chapter 1 are required reading. This book is also available on Amazon in a Kindle version for $9.99 if you like the book and later want your own copy.

Other Readings and Materials: For some topics we will be using supplemental readings or additional materials for in-class exercises and discussions. These will be provided through Canvas as a PDF file or as a URL to where the materials can be downloaded.

Be Sure You Have a PIN Code for Your MLK Library Account: Some of the resources will require an account (PIN code) at the MLK Library to download.
If you do not have a PIN (or forgot it), see this page: http://library.sjsu.edu/journal titles/get librarypasswords
The library has to have an email address on file to reset your PIN, so please try this the first week (the PIN is also needed to read the Data Just Right book for free).

Grading Policy: Assignments, Exams, and Participation
This class has both an individual component and a team component. In the first week we will form teams of 5 students each. You will work as a team on in-class assignments and collaborate on your industry analysis and technology report. For lab assignments, you are encouraged to work as a team, but can elect to work individually.

Lab Assignments (total 200 points): (in-class = total of 100 points, follow-up assignments = total of 100 points)
There are 5 lab assignments in which you will use IBM's SoftLayer PaaS Cloud with their BigInsights Hadoop implementation and other tools. We are setting up accounts for each student in the class and will discuss this further before the first lab session. Most of the lab assignments can be completed on any computer since you will be accessing the console for IBM's cloud-based platform through a web browser.
Each lab assignment will have an in-class component you need to complete. Each student must complete this in their own account, but you can work collaboratively as a team. The in-class portion is designed to provide the basic skills needed to use a specific Big Data tool and counts for 50% of the assignment grade. There is a follow-up assignment for the remaining 50% of the grade in which you will apply the material learned in the lab. For these follow-up assignments you can decide to work as a team, individually, or as sub-teams, but if you work as a team, you will rate each group participant's performance. All of the in-class lab assignments and follow-up assignments together are 200 points (20% of the course grade).

We will be doing the in-class portion of the assignments in the BBC 301 lab during the second half of the class period. We will start class in our normal classroom (BBC 320), and move to the lab after the break.

Quizzes and In-Class Assignments (Best 10 out of 12 = total of 150 points):
During each class there will either be a quiz at the start of the class, or an in-class team assignment that is graded. Over the course of the semester, there will be 12 quizzes or graded assignments, and the lowest two will be dropped. These quizzes and assignments total 15% of the course grade, so the best 10 will each be 1.5% of the total course grade. There are two days when we will not have a quiz or graded in-class project: (1) the day of the midterm, and (2) the day for team presentations. If you are not participating on the day of an in-class project, you will not receive the team points.

Team Technology Report and Presentation (100 points):
There are many Big Data technologies and tools available (with new tools arriving all of the time). We will discuss a number of technologies in class and work with a number of tools, but we cannot cover the full spectrum in a semester. Each team will select a different Big Data tool they wish to present on.
Details will be provided in Canvas (along with a list of tools to select from). Each team will prepare a 4-5 page report that describes a tool, how it is being used, and where they see it fitting into the spectrum of Big Data tools. Starting with week 4 (September 18th), teams will do a presentation of 7-10 minutes on the tool they have written about. The presentation is relatively short, so plan on using PowerPoint. You can include screenshots from a demo video in your slides, but you do not have time to do a demo, and you cannot just show a video. For the report you are expected to look to multiple sources (websites, articles, whitepapers, and book chapters) and you must cite the articles or other sources used.

Each team must present on a different tool. All of the reports will be collected and made available as a single PDF at the end of the semester that each student can take with them, providing a guide to an additional set of tools. There will be some tools that cannot be selected, because we are covering them in class. If you wish to present on a tool not included in the list in Canvas, please ask me first (it needs to be a Big Data tool different from those being presented by other teams, and we want to be sure you can find enough material to write your report). Check the list in Canvas; if multiple teams wish to present on a specific topic, the first to sign up gets their first choice.

Bonus Points: Since those teams presenting earlier in the semester have to complete their reports earlier, and present before seeing other students present, there are bonus points for presenting during the earlier weeks of the semester. If you present on week 4 (September 18th) or week 5 (September 25th), there is a 10-point bonus.
If you present on week 6 (October 2nd) or week 7 (October 9th), there is a 5-point bonus.

Team Industry Analysis and Presentation (150 points):
Different industries are using Big Data in different ways. Each team will select an industry and be responsible for researching how that industry is using Big Data. The results will be written up in a report and then presented as a team to the class during our last class session. Suggestions for possible industries are included in Canvas. Each team must select a different industry, so the first team to select an industry gets their first choice. The only industries that cannot be selected are web search and social media.

At the 10th week (October 30th), each team member must have identified a relevant source for the team analysis (distinct from the sources identified by the other team members) and submit a 1-paragraph summary for that source. This will count as a quiz and is included in the 150 points for quizzes, separate from the 150 points for the industry analysis and presentation.

Exams (midterm = 200 points, final = 200 points):
There will be a midterm which covers the material up through the week before the midterm exam. The midterm will be a 1-hour exam at the beginning of class and then we will continue after a break with the material for that week. The final exam (Thursday, December 18, 5:15 pm - 7:30 pm) will be comprehensive (based on the whole semester) but will be weighted towards material after the midterm. The exams will also cover material from the lab assignments. The midterm and final are each 20% of the course grade.
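As an illustration only, the quiz and bonus-point rules described above can be sketched in a few lines of Python. The function names are hypothetical and not part of the course materials. The sketch assumes each quiz or graded in-class assignment is recorded as a fraction between 0.0 and 1.0, and that every counted quiz carries equal weight (15 points, i.e. 150 points spread over the best 10).

```python
# Illustrative sketch only; function names are hypothetical, not course tooling.

QUIZ_POINTS_TOTAL = 150  # quizzes and in-class team assignments, per the syllabus
QUIZZES_COUNTED = 10     # the best 10 of 12 count; the lowest two are dropped


def quiz_points(scores):
    """Return points earned from a list of fractional scores (0.0 to 1.0)."""
    # Keep only the highest-scoring quizzes, up to the number that count.
    best = sorted(scores, reverse=True)[:QUIZZES_COUNTED]
    points_each = QUIZ_POINTS_TOTAL / QUIZZES_COUNTED  # 15 points per counted quiz
    return sum(best) * points_each


def presentation_bonus(week):
    """Return bonus points for presenting the technology report in a given week."""
    if week in (4, 5):
        return 10
    if week in (6, 7):
        return 5
    return 0  # no bonus for later weeks


if __name__ == "__main__":
    # Nine perfect quizzes, one half-credit quiz, and two missed quizzes.
    scores = [1.0] * 9 + [0.5, 0.0, 0.0]
    print(quiz_points(scores))
    print(presentation_bonus(4))
```

In this example the two missed quizzes are the ones dropped, so they do not lower the quiz total.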
                                                                  Points   Grade %
Lab Assignments (5)
  (a) In-class (lab) part of the assignment (learning skills)        100       10%
  (b) Follow-up assignment (applying skills learned in the lab)      100       10%
Quizzes (individual) and In-Class Team Assignments                   150       15%
Technology Report and Presentation                                   100       10%
Team Industry Analysis and Presentation                              150       15%
Midterm Exam                                                         200       20%
Final Exam                                                           200       20%
Total Points                                                        1000      100%

Late Assignments and Makeups
Late work or make-ups for quizzes, in-class team assignments, or exams will not be accepted except by prior arrangement with the instructor and only under extraordinary circumstances. For lab assignments, there is a 10% deduction for each day that an assignment is late. For example, if you would have earned a perfect score (100%), but are 3 days late, the maximum you can earn on that assignment is 70%. Once the assignment answer is discussed, generally the next class session after it was due, no late assignments will be accepted. Please see the class schedule for the midterm, final exam, and team presentation dates.

Classroom Protocol
On days when we have an in-class quiz, the quiz will start 10 minutes after the start of class (6:10 pm). If you are late for class, you will not have additional time to complete the quiz. On days when we have an in-class team assignment, if you are absent it puts your fellow team members at a disadvantage and you will receive a zero for that assignment.

Canvas
We will be using Canvas for posting grades, providing additional materials, and making announcements. Be sure to check regularly for announcements or set up Canvas to email you with announcements as they are posted. Please add your photo and bio to Canvas. To set your photo and bio in Canvas, click on your name in the upper right-hand corner when you are logged into Canvas (circled in red in the example below). This will display a screen that allows you to upload a photo and enter your bio. For basic instructions for logging into Canvas, see: http://www.sjsu.edu/at/ec/docs/canvas Student%20Login%20Information.pdf

Dropping and Adding Courses
Students are responsible for understanding the policies and procedures about adding and dropping courses, academic renewal, etc. Information on course add/drops is available at: http://www.sjsu.edu/advising/faq/index.html
Information about late drop is available at: http://www.sjsu.edu/aars/policies/latedrops/
Students should be aware of the current deadlines and penalties for adding and dropping classes.

University, College, and Department Policy Information

a. Academic integrity statement (from Office of Judicial Affairs): Your own commitment to learning, as evidenced by your enrollment at San José State University, and the University's Academic Integrity Policy require you to be honest in all your academic course work. Faculty are required to report all infractions to the Office of Judicial Affairs. The policy on academic integrity can be found at http://www.sjsu.edu/senate/docs/s07 2.pdf

b.
Campus policy in compliance with the Americans with Disabilities Act: If you need course adaptations or accommodations because of a disability, or if you need special arrangements in case the building must be evacuated, please make an appointment with me as soon as possible, or see me during office hours. Presidential Directive 97-03 requires that students with disabilities register with DRC to establish a record of their disability.
c. College of Business Policies and Procedures: Please review the current College of Business policies at: http://www.cob.sjsu.edu/cob/5_student%20services/cobpolicy.htm

To ensure that every student, current and future, who takes courses in the Boccardo Business Center has the opportunity to experience an environment that is safe, attractive, and otherwise conducive to learning, the College of Business at San José State has established the following policies:

Eating: Eating and drinking (except water) are prohibited in the Boccardo Business Center. Students with food will be asked to leave the building. Students who disrupt the course by eating and do not leave the building will be referred to the Judicial Affairs Officer of the University.

Cell Phones: Students will turn their cell phones off or put them on vibrate mode while in class. They will not answer their phones in class. Students whose phones disrupt the course and do not stop when requested by the instructor will be referred to the Judicial Affairs Officer of the University.

Computer Use: In the classroom, faculty allow students to use computers only for class-related activities. These include activities such as taking notes on the lecture underway, following the lecture on Web-based PowerPoint slides that the instructor has posted, and finding Web sites to which the instructor directs students at the time of the lecture. Students who use their computers for other activities or who abuse the equipment in any way, at a minimum, will be asked to leave the class and will lose participation points for the day, and, at a maximum, will be referred to the Judicial Affairs Officer of the University for disrupting the course. (Such referral can lead to suspension from the University.) Students are urged to report to their instructors computer use that they regard as inappropriate (i.e., used for activities that are not class-related).
Academic Honesty: Faculty will make every reasonable effort to foster honest academic conduct in their courses. They will secure examinations and their answers so that students cannot have prior access to them and proctor examinations to prevent students from copying or exchanging information. They will be on the alert for plagiarism. Faculty will provide additional information, ideally on the green sheet, about other unacceptable procedures in class work and examinations. Students who are caught cheating will be reported to the Judicial Affairs Officer of the University, as prescribed by Academic Senate Policy S04-12.

Mission
The College of Business is the institution of opportunity, providing innovative business education and applied research for the Silicon Valley region.