CIS4930/6930 Data Science: Large-scale Advanced Data Analysis, Fall 2011. Daisy Zhe Wang, CISE Department, University of Florida
1 CIS4930/6930 Data Science: Large-scale Advanced Data Analysis, Fall 2011. Daisy Zhe Wang, CISE Department, University of Florida
2 Vital Information. Instructor: Daisy Zhe Wang; Office: E456; Class time: Tuesdays 3-5pm, Thursdays 4-5pm; Office hours: right after class (one hour); TA: none; Course page: (read announcements frequently!)
3 Overview. Trend: Bigger Data and Deeper Analysis. Data Science: uses advanced analysis over large-scale data to create data products.
4 Data Science: A Working Definition. Data Science is the science that uses computer science, statistics and machine learning, visualization, and human-computer interaction to collect, clean, integrate, analyze, visualize, and interact with data to create data products.
5 Course Goal. In this course, we will have in-depth discussions of recent publications related to Data Science. I will put the most emphasis on the systems, applications, and algorithms for large-scale advanced data analysis.
6 This Course will: Give you exposure to research topics and existing work in Data Science. Ask you to critique the papers we are going to read. Strongly encourage you to explore new research problems, come up with better solutions, and make contributions.
7 This Course will NOT: Teach you statistics, machine learning, or database systems. Teach you programming. Teach you how to be an expert in map-reduce, statistical packages, or parallel databases.
8 Expectations. Require: Information and Database Systems I (CIS4301); data structures and algorithms; coding (C, Java); good math and statistics background knowledge. Encourage: actively participate in discussions in the classroom; read Data Science literature in general; experience in Machine Learning, NLP, Data Mining. Academic honesty.
9 Course Outline. Data Analysis I: Systems and Frameworks; Data Collection, Cleaning, and Integration; Data Analysis II: Applications and Algorithms; Interface Design and Data Visualization.
10 Text Books. Not required, but recommended. Class notes + papers.
11 Additional Reading Pointers. Data Science Summit (Strata); Kaggle Competitions; Data Science course at Berkeley. Conferences and Journals: VLDB, ICDE, SIGMOD, CIDR, KDD, ICDM.
12 Grading: How can I get an A? Homework (25%); Project (55%); Presentations (20%); Participation (5% bonus); Novelty in Project (5% bonus). Late submission: 20% per day for up to 5 days.
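The weights on the grading slide can be turned into a small worked calculation. The sketch below is illustrative only: the weights and the 20%-per-day figure come from the slide, but the function names, the 0-100 score scale, the way the bonuses are added, and the reading of the late policy (20% of the score deducted per late day, nothing accepted after 5 days) are assumptions, not the instructor's stated rules.

```python
# Weights taken from the grading slide; everything else here is an assumption.
WEIGHTS = {"homework": 0.25, "project": 0.55, "presentation": 0.20}
BONUS_CAP = 5.0              # each bonus is worth at most 5 points
LATE_PENALTY_PER_DAY = 0.20  # "20% per day"
MAX_LATE_DAYS = 5            # "for up to 5 days"


def late_adjusted(score, days_late):
    """Deduct 20% of the score per late day (assumed interpretation)."""
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > MAX_LATE_DAYS:
        return 0.0  # assumed: not accepted after 5 days
    return score * max(0.0, 1.0 - LATE_PENALTY_PER_DAY * days_late)


def final_grade(scores, participation_bonus=0.0, novelty_bonus=0.0):
    """Weighted sum of component scores (0-100) plus capped bonuses."""
    base = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    bonus = min(participation_bonus, BONUS_CAP) + min(novelty_bonus, BONUS_CAP)
    return base + bonus


if __name__ == "__main__":
    scores = {
        "homework": late_adjusted(90, days_late=1),
        "project": 90,
        "presentation": 100,
    }
    print(final_grade(scores, participation_bonus=5))
```

Under these assumptions, one late day on a homework set costs a fifth of that homework score, while the project dominates the total because of its 55% weight.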
13 Homework (25%). Literature reviews (due before class in hard copy with your name and ID; do NOT send email). Main contributions (goals, techniques, evaluations); Positive Critiques; Negative Critiques. This Thursday: Jeffrey Cohen, Brian Dolan, Mark Dunlap, Joseph M. Hellerstein, Caleb Welton: MAD Skills: New Analysis Practices for Big Data. PVLDB 2(2): (2009). Next Thursday: Cheng-Tao Chu, Sang Kyun Kim, Yi-An Lin, YuanYuan Yu, Gary R. Bradski, Andrew Y. Ng, Kunle Olukotun: Map-Reduce for Machine Learning on Multicore. NIPS 2006:
14 Presentation (20%). Select 1-2 papers from the reading list (I will announce how to sign up). Prepare a 1-hour presentation. Send me the presentation (ppt) 1 week before presenting it in class. Improve the presentation according to feedback. Deliver the presentation. Lead the discussion.
15 Project (55%). Work in groups of 2-3 people. Project proposal (1-2 pages) (Sep 27): form the groups. Project mid-term evaluation (Nov 3/8): novelties, progress, and deliverables. Project final presentation and demo (Dec 6/8). Final report (Dec 15).
16 Bonuses (5% each) Class participation Novelty in the Project
17 An Overview of Research Topics in Data Science
18 Goal of Data Science Turn data into data products.
19 Data. Application Databases; Wireless Sensor Data, Seismic, Astronomy Data; Text Data (Webpages, Wikipedia, Emails, Enterprise Documents); Social Media Data (Twitter, Blogs, Social Networks); Software Log Data (Server, API, Database Logs, Click Streams); Images, Videos, Music; Scientific Data, Medical, Microarray, Genome Data. Data is getting larger and more diverse.
20 Data Products: Twitter. Text Analysis; Spam Filter/Similarity Search; User Sentiment/Satisfaction/Feedback; News Breakout; Trends and Topics. 200 million users as of 2011, generating over 200 million tweets and handling over 1.6 billion search queries per day.
21 Data Products: Netflix. Personalized Movie Ratings; Movie Recommendations; Similar Movies; Movie Categories (e.g., '80s movies with a strong female lead, Kung Fu movies). Blockbuster is out of business.
22 Data Products: LinkedIn/Facebook. People you may know; Applications you may like; Jobs/Events you might be interested in; Classifier for bad users and bad content. With high accuracy, Facebook can guess whether you are single or married. Who does not have a LinkedIn/Facebook account?
23 Data Products: Splunk. Degradation and Failure Detection; Identifying Security Breaches; Event Monitoring; Troubleshooting Tools; Cross-platform Event Correlation. Founded in 2004; rumor has it that it is close to an IPO.
24 Data Products: Google. Web Search; News Recommendation Engine; Google Maps; Google Ads; Google Analytics. Still the hottest IT company to work for now -- the Microsoft of the '90s, the IBM of the '70s.
25 Techniques used in Data Science Statistics Machine Learning Data Management Visualization HCI
26 Related Research Areas Databases, Systems Data warehouse, ETL, Data Cubes Statistics and Machine Learning Data Mining Data Visualizations Human Computer Interactions Privacy
27 Challenges in Data Science. Preparing Data (noisy, incomplete, diverse, streaming). Analyzing Data (scalable, accurate, real-time, advanced methods, probabilities and uncertainties...). Representing Analysis Results, i.e., the data product (story-telling, interactive, explainable).
28 Sexy Job in the next 10 years. The sexy job in the next ten years will be statisticians. The ability to take data, to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it: that's going to be a hugely important skill. -- Hal Varian, Google Chief Economist, 2009
29 Sexy Job in the next 10 years. The sexy job in the next ten years will be data scientists. The ability to take data, to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it: that's going to be a hugely important skill. -- Hal Varian, Google Chief Economist, 2009
30 Skill Set of a Data Scientist Data Management Data collection, storage, cleaning, filtering, integration Statistics and Machine Learning Data modeling, inference, prediction, pattern recognition Interface and Data Visualization HCI design, visualization, story-telling
31 The Life of Data (state-of-the-art). [Diagram: Users; Interface; the stages Collect, Clean, Integrate, Analysis, Visualization; Data Sources]
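The stages named on this slide can be read as a pipeline from data sources to something a user looks at. The sketch below is only a toy illustration of that reading: the temperature records, the two sources, and every function name are made up for the example, and real systems for each stage are far more involved.

```python
def collect():
    """Gather raw records from two hypothetical sources."""
    source_a = [{"city": " Gainesville", "temp_f": "91"},
                {"city": "Miami", "temp_f": None}]
    source_b = [{"city": "miami", "temp_f": "88"},
                {"city": "Gainesville", "temp_f": "89"}]
    return source_a + source_b


def clean(records):
    """Drop incomplete records and normalize names and types."""
    cleaned = []
    for record in records:
        if record["temp_f"] is None:
            continue  # incomplete reading
        cleaned.append({"city": record["city"].strip().title(),
                        "temp_f": float(record["temp_f"])})
    return cleaned


def integrate(records):
    """Merge readings that refer to the same city."""
    merged = {}
    for record in records:
        merged.setdefault(record["city"], []).append(record["temp_f"])
    return merged


def analyze(merged):
    """Compute the mean reading per city."""
    return {city: sum(vals) / len(vals) for city, vals in merged.items()}


def visualize(summary):
    """Render a crude text bar chart, one '#' per 10 degrees."""
    return "\n".join(f"{city:<12} {'#' * round(mean / 10)} {mean:.1f}"
                     for city, mean in sorted(summary.items()))


if __name__ == "__main__":
    print(visualize(analyze(integrate(clean(collect())))))
```

Each function takes the previous stage's output, so a stage can be replaced (for example, a different cleaning rule) without touching the others.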
32 Data Collection: Data in the First Mile. Collect data effectively. Sensors: acquisitional data collection. Surveys: model-based re-ordering of questions (Usher). Collect structured data from the Web: WebTables, Scuba. Extraction from text (more later).
33 Data Cleaning. Deal with dirty, noisy, incomplete data (e.g., Census, Sensor Networks, Text Extractions). Interactive data transformation and cleaning: Data Wrangler, 2011; Potter's Wheel, 2001. Pay-as-you-go user feedback. More automatic cleaning (avoiding boring tasks) is still a research challenge.
34 Data Integration. Merging multiple data sources; Schema Mapping; Entity Resolution (SERF); Querying over multiple data sources; Pay-as-you-go data integration; Probabilistic data integration; Fusion Tables: web-centered data integration.
35 Data Analysis (I) Systems and Frameworks MADLib Mahout Spark Map-Reduce Online SciDB RIOT MauveDB DataPath Dremel: Google Analytics
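Several of the frameworks listed here build on the map-reduce programming model. As a rough single-process illustration of that model (not of Hadoop's or any listed system's actual API), the sketch below counts words with explicit map, shuffle, and reduce steps; the function names are chosen for this example.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all values by key, as a framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values)
                for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big analysis"]))
```

In a real deployment the map and reduce calls run in parallel on different machines and the shuffle moves data across the network; here everything runs in one process so that the data flow is easy to follow.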
36 Data Analysis (II) Applications and Algorithms Text Analysis (OpenIE, Computational Journalism, DBLife, BayesStore) Classification Information extraction Relation extraction Reference reconciliation (Co-reference) On-line advertisement (MADLib) Fraud Detection (Splash)
37 Data Analysis (II) Applications and Algorithms (cont.). Risk Management (MCDB); Probabilistic Databases (MystiQ, Trio, BayesStore...); Mechanical Turk (CrowdDB, CrowdFlow, CrowdSearch): hot topic now!; Recommendation Systems; Log Analysis.
38 Interface Design and Data Visualization Polaris Tableau (Stanford) Spreadsheet (e.g., Excel) (MIT) Interactive Querying interface (Berkeley) CONTROL Interactive Cleaning (D^p) Query-by-Example (IBM) ManyEyes
39 Course Project (I): An Application of Data Science. Using a framework we discussed in class: Relational Database, Parallel Database, Hadoop, Map-Reduce, Mechanical Turk. Using a statistical/machine learning algorithm: Text Analysis (Classification, Information Extraction, Entity Resolution), A/B Testing, MCMC simulation.
40 Course Project (II) Improving Data Science Framework New Interface Design for data cleaning/integration/querying/feedback New Technology to improve Crowdsourcing Service New Framework supporting Data Science applications
41 Research Directions Interactive Query-Driven Text Analysis Information/Relation Extraction Reference Reconciliation Classification Pay-as-you-go Machine Learning On-line Learning Quality Control, Lineage Probabilistic Knowledge Base Probabilistic Database + Crowd-sourcing
42 Homework Today. Think about the project and form groups early (project proposal due Sep 27). Reviews due. This Thursday: Jeffrey Cohen, Brian Dolan, Mark Dunlap, Joseph M. Hellerstein, Caleb Welton: MAD Skills: New Analysis Practices for Big Data. PVLDB 2(2): (2009). Next Thursday: Cheng-Tao Chu, Sang Kyun Kim, Yi-An Lin, YuanYuan Yu, Gary R. Bradski, Andrew Y. Ng, Kunle Olukotun: Map-Reduce for Machine Learning on Multicore. NIPS 2006:
