Big Data Presentation of the course



Similar documents
You should have a working knowledge of the Microsoft Windows platform. A basic knowledge of programming is helpful but not required.

BUDT 758B-0501: Big Data Analytics (Fall 2015) Decisions, Operations & Information Technologies Robert H. Smith School of Business

The? Data: Introduction and Future

Big Data Analytics: Where is it Going and How Can it Be Taught at the Undergraduate Level?

FINAL SCHEDULE YEAR 1 AUGUST WEEK 1

BIG DATA USING HADOOP

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop

Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies

TDWI: BUSINESS INTELLIGENCE & DATA WAREHOUSING EDUCATION EUROPE

CSE 427 CLOUD COMPUTING WITH BIG DATA APPLICATIONS

BSc Psychology, BSc Developmental Psychology, BSc Forensic Psychology

How To Handle Big Data With A Data Scientist

International University of Monaco 12/04/ :50 - Page 1. Monday 30/01 Tuesday 31/01 Wednesday 01/02 Thursday 02/02 Friday 03/02 Saturday 04/02

Big Data Analytics. Prof. Dr. Lars Schmidt-Thieme

L1: Introduction to Hadoop

International University of Monaco 27/04/ :55 - Page 1. Monday 30/04 Tuesday 01/05 Wednesday 02/05 Thursday 03/05 Friday 04/05 Saturday 05/05

International University of Monaco 21/05/ :01 - Page 1. Monday 30/04 Tuesday 01/05 Wednesday 02/05 Thursday 03/05 Friday 04/05 Saturday 05/05

Seminar Medical Informatics 2015

Hadoop. MPDL-Frühstück 9. Dezember 2013 MPDL INTERN

Academic Calendar

DATA MINING WITH HADOOP AND HIVE Introduction to Architecture

Big Data Frameworks Course. Prof. Sasu Tarkoma

Implement Hadoop jobs to extract business value from large and varied data sets

Academic Calendar for Faculty

Bringing Together ESB and Big Data

Big Data Management in the Clouds. Alexandru Costan IRISA / INSA Rennes (KerData team)

Cleveland State University

Cloud Computing Summary and Preparation for Examination

Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p.

8:30 AM - 3:30 PM at the Chicago Cultural Center Chicago M.A. New Student Orientation

Big data for the Masses The Unique Challenge of Big Data Integration

Large-scale Data Processing on the Cloud

WHITE PAPER. Four Key Pillars To A Big Data Management Solution

Big Data Explained. An introduction to Big Data Science.

COMP9321 Web Application Engineering

Analytics on Big Data

CSE-E5430 Scalable Cloud Computing Lecture 2

The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn

BIG DATA CHALLENGES AND PERSPECTIVES

Cloud Scale Distributed Data Storage. Jürmo Mehine

Big Data and Data Science: Behind the Buzz Words

DATA ENGINEERING FELLOWS PROGRAM

Hadoop. Sunday, November 25, 12

Big Data on AWS. Services Overview. Bernie Nallamotu Principle Solutions Architect

University of the Incarnate Word Academic Calendar

INDEX. Introduction Page 3. Methodology Page 4. Findings. Conclusion. Page 5. Page 10

BIG DATA TRENDS AND TECHNOLOGIES

Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, Viswa Sharma Solutions Architect Tata Consultancy Services

AUGUST 2014 S M T W T F S

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.

Microsoft Big Data. Solution Brief

Economics Department Induction Talk

Big Data Training - Hackveda

The 3 questions to ask yourself about BIG DATA

Where We Are. References. Cloud Computing. Levels of Service. Cloud Computing History. Introduction to Data Management CSE 344

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

TRAINING PROGRAM ON BIGDATA/HADOOP

Department of Computer Science University of Cyprus EPL646 Advanced Topics in Databases. Lecture 14

TE's Analytics on Hadoop and SAP HANA Using SAP Vora

Open source Google-style large scale data analysis with Hadoop

Christ Missionary & Industrial College (High School) Steps To Admission

The Next Wave of Data Management. Is Big Data The New Normal?

NOSQL VS RDBMS - WHY THERE IS ROOM FOR BOTH

Cleveland State University

The Bellevue Center for Obesity & Weight Management. Program Director: Manish Parikh, MD WEIGHT LOSS SURGERY INFORMATION SEMINAR

WEIGHT LOSS SURGERY INFORMATION SEMINAR

Information for Exchange Students ERASMUS. University of Teramo

Apriori-Map/Reduce Algorithm

Academic Calendar

Introduction to Apache Cassandra

6.S897 Large-Scale Systems

(Part 2) Lunch Block 7 1:05 PM 2:27 PM

Introduction To Hive

Goal Pre-Week 1: To participate in an introduction and overview to the course and the learners in live format and via recording.

Fall 2014 Blended MAC Residency Sunday, September 7, 2014 Chicago Campus

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014

MEN'S FASHION UK Items are ranked in order of popularity.

NoSQL systems: introduction and data models. Riccardo Torlone Università Roma Tre

Using Cloud Services for Test Environments A case study of the use of Amazon EC2

NoSQL Databases. Polyglot Persistence

Big Data Analytics From Strategie Planning to Enterprise Integration with Tools, Techniques, NoSQL, and Graph

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics

Workshop on Hadoop with Big Data

Transcription:

Academic year 2014/2015 Big Data Presentation of the course Prof. Riccardo Torlone Università Roma Tre

2

A new course Second year at Roma Tre First university course on Big Data in Italy We will experiment together some technologies We will take advantage of a grant from Amazon We will know research and applicative projects on Big Data In conclusion, we will face an adventure.. 3

4

Big Data? Why? Well, because they are.. 5

.. BIG 6 The greater the struggle, the more glorious the triumph (The Butterfly Circus)

.. NECESSARY 7 It is a capital mistake to theorize before one has data (Sherlock Holmes)

.. CHALLENGING 8 It always seems impossible until it is done. (Nelson Mandela)

.. PROFITABLE 9 Data is a precious thing and will last longer than the systems themselves. (Tim Bersten Lee)

.. FASHIONABLE 10 I always wanted to be fashionable. (John Malkovich)

.. EXCITING 11 The most exciting phrase to hear in science, is not 'Eureka!' but 'That's funny... (Isaac Asimov)

General information Teacher Prof. Riccardo Torlone Email: torlone@dia.uniroma3.it Office hours: Wednesday: 11:00-13:00 Via Vasca Navale 79 2 floor room 209 Course Web site http://torlone.dia.uniroma3.it/bigdata/ Moodle page http://moodle2.ing.uniroma3.it/moodle/ You must register A "social" course! Facebook: https://www.facebook.com/groups/bigdataroma3/ Twitter: #bigdataroma3 Lectures Monday and Thursday from 14:00 to 16:00, Room N13 Pause: Easter holidays 12

Goals The course aims at illustrating tools and methods for the management of big data, i.e. massive amounts of unstructured data whose size exceed the capacity of conventional database management systems to capture, store, manage and analyze. Focus on: The requirements of modern applications The problems of storing and processing big data The hardware and software solutions Strategy: Coverage of both methods and tools Exercises with real systems Practical projects Guest lectures on Big Data use cases Business seminars 13

Contents (provisional) Introduction Terminology, main aspects and examples of applications Big Data infrastructures Hadoop & Map-reduce; Cloud computing; NoSQL systems. Big Data processing Cleaning, transformation and data integration High level tools: Pig, Hive; New generation tools for data access. Big Data analysis Methods and algorithms for Big Data analysis; Tools for Big Data analysis: Mahout, Open R; Applications Semantic Web and Open data, Social networks, Genomic data. Business seminars 14

Material Books and papers Teacher slides (available on the Web side of the course) NoSQL systems: Martin J. Fowler, Pramodkumar J. Sadalage. NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence, Addison-Wesley, 2013. Scientific papers and book chapters To be published on the Web site of the course Software Hadoop Amazon Web Services Others.. 15

Exams.. I have a dream.. 16

17

Exam modalities For those attending the course: 2 projects to be done by groups of 1, 2, max 3 students Common project, deadline: mid April, weight:30% Given project, deadline: before the exam, weight:50% A written test: 30 minutes, date of the exam, weight:20% For the other students: Individual project, assigned by the teacher A written test: 3 hours Rules: Similar to all the other exams Three chances: July 2015, September 2015, February 2016 18

Main project Goals To solve a problem of Big data To experiment new technologies Steps: Find challenges and data Choose an approach to analyze data Choose suitable technologies Implement the approach Testing of the system 19

2013/2014 course statistics Students attending the course: 52 June: Participants: 48 Average: 29,2 September: Participants: 3 Average: 28 February: Participants: 1 Average: 27 20