What is Big Data? The three(or four) Vs in Big Data In 2013 the total amount of stored information is estimated to be Volume.
|
|
- Pierce Stephens
- 8 years ago
- Views:
Transcription
1 8/26/2014 CS581 Big Data - Fall /26/2014 CS581 Big Data - Fall CS535/CS581A BIG DATA What is Big Data? PART 0. INTRODUCTION 1. INTRODUCTION TO BIG DATA 2. COURSE INTRODUCTION PART 0. INTRODUCTION 1. INTRODUCTION TO BIG DATA Sangmi Lee Pallickara Computer Science, Colorado State University 8/26/2014 CS581 Big Data - Fall /26/2014 CS581 Big Data - Fall Big Data Things one can do at a large scale that cannot be done at a smaller one To extract new insights Create new forms of values Big Data is about analytics of huge quantities of data in order to infer probabilities Big Data is NOT about trying to teach a computer to think like humans Providing a quantitative dimension it never had before The three(or four) Vs in Big Data In 2013 the total amount of stored information is estimated to be Volume around 1,200 exabytes Voluminous Only 2 percent is non-digital It does not have to be certain number of petabytes or quantity. Velocity How fast the data is coming in? How fast you need to be able to analyze and utilize it Variety Number of sources or incoming vectors Real-time data analysis Veracity Can you trust the data itself, source of the data, or the process? User entry errors, redundancy, corruption of the values Data cleaning 8/26/2014 CS581 Big Data - Fall /26/2014 CS581 Big Data - Fall Use all of the data, or just a little? 1/3 John Graunt s population example Instead of counting every person in London, he inferred the population size using sampling Conducting censuses Costly and time-consuming The US Constitution mandated one every decade As the growing country measured itself in millions Use all of the data, or just a little? 2/3 By the late nineteenth century, even that was proving problematic Data outstripped the Census Bureau s ability to keep up. The 1880 census took 8 years to complete The information was obsolete even before it became available! The 1890 census was estimated to require a full 13 years 1
2 8/26/2014 CS581 Big Data - Fall /26/2014 CS581 Big Data - Fall Use all of the data, or just a little? 3/3 In 2009 a new flu virus was discovered. The Census Bureau contracted with Herman Hollerith An American inventor used his idea of punch cards and tabulation machines for the 1890 census Reduced duration to less than one year Methods of acquiring and analyzing big data is very expensive Image of the newly identified H1N1 influenza virus, taken in the CDC Influenza Laboratory. Source: 8/26/2014 CS581 Big Data - Fall /26/2014 CS581 Big Data - Fall H1N1 virus Combines elements of the viruses that causes bird flu and swine flu Some projected outbreak to be on the scale of the 1918 Spanish flu that had: Infected half a billion people Killed tens of millions In the United States, The Centers for Disease Control and Prevention (CDC) Requested doctors inform patients of new flu cases Tabulated the numbers once a week What will be the problems with this system? No vaccine was ready Public health authorities needed to know where it already was. 8/26/2014 CS581 Big Data - Fall Detecting influenza epidemics using search engine query data -1/2 J. Ginsberg, et al., Detecting influenza epidemics using search engine query data Nature 457 pp ~ 1014, February nature07634.html 8/26/2014 CS581 Big Data - Fall Detecting influenza epidemics using search engine query data -2/2 Predict the spread of the winter flu in the United States Nationally Specific regions and states Looking at what people were searching for on the Internet Google More than three billion search queries every day Saves all the queries 50 million most common search terms Americans type and compared the list with CDC data On the spread of seasonal flu between 2003 and 2008 Identify areas infected by the flu virus by what people searched for on the Internet 2
3 8/26/2014 CS581 Big Data - Fall /26/2014 CS581 Big Data - Fall google.org: Flu Trends Example More information How did they build a prediction model? (1/3) Aggregating historical logs of online web search queries Between 2003 and 2008 Computed a time series of weekly counts for 50 million of the most common search queries in the US Separate aggregate weekly counts were kept for every query in each state Simple model that estimates the probability that a random physician visit in a particular region is related to an influenza-like illness(ili) 8/26/2014 CS581 Big Data - Fall How did they build a prediction model? (2/3) The probability that a random search query submitted from the same region is ILI-related 8/26/2014 CS581 Big Data - Fall How did they build a prediction model? (3/3) Each of the 50 million candidate queries was separately tested To identify the search queries that could most accurately model the CDC ILI visit percentage in each region logit(i(t)) = αlogit(q(t))+ε where: I(t) is the percentage of ILI physician visits at time t Q(t) is the ILI-related query fraction at time t α is the multiplicative coefficient ε is the error term logit(p) = ln(p/(1 p)) Generated the highest scoring search queries 8/26/2014 CS581 Big Data - Fall Evaluation of combining top-scoring queries Maximal performance at estimating out-of sample points during cross-validation was obtained by summing top 45 search queries J Ginsberg et al. Nature 000, 1-3 (2008) doi: /nature07634 After adding query 81 oscar nominations 8/26/2014 CS581 Big Data - Fall 2014 In the top 100, 18 topics like high school basket ball was not Included. It coincided with influenza season in the US Most co-related queries Search query topic Top 45 queries Next 55 queries n Weighted n Weighted Influenza complication Cold/flu remedy General influenza symptoms Term for influenza Specific influenza symptom Symptoms of an influenza complication Antibiotic medication remedies General influenza Symptoms of a related disease Antiviral medication Related disease Unrelated to influenza Total The top 45 queries were used in the final model; the next 55 queries are presented for comparison purposes. The number of queries in each topic is indicated, as well as queryvolume-weighted counts, reflecting the relative frequency of queries in each topic. 3
4 8/26/2014 CS581 Big Data - Fall /26/2014 CS581 Big Data - Fall Data sources Public dataset from the CDC s Influenza Sentinel Provider Surveillance Network Average percentage of all outpatient visits to sentinel providers that were ILI-related Query submitted to the Google search engine 2003 ~ 2008 What makes this work innovative? There have been other approaches: CDC s weekly report Call volume to telephone triage advice lines Tracking over-the-counter drug sales Online activity for influenza surveillance Search queries submitted to a Swedish medical website Tracking search keys ( flu, or influenza ) in Yahoo search queries What is the innovation in Google s approach? Google processes more than 24 Petabytes of data per day 8/26/2014 CS581 Big Data - Fall /26/2014 CS581 Big Data - Fall Related research areas Storage systems How can we efficiently resolve queries on massive amounts of input data? The input dataset may be presented in the form of a distributed data stream Machine learning How can we efficiently solve large-scale machine learning problems? The input data may be massive; stored in a distributed cluster of machines Distributed computing How can we efficiently solve large-scale optimization problems in distributed computing environments? For example, how can we efficiently solve large-scale combinatorial problems, e.g. processing of large scale graphs? PART 0. INTRODUCTION 1. INTRODUCTION TO BIG DATA 2. COURSE INTRODUCTION PART 0. INTRODUCTION 2. COURSE INTRODUCTION Sangmi Lee Pallickara Computer Science, Colorado State University CS581A BIG DATA 8/26/2014 CS581 Big Data - Fall /26/2014 CS581 Big Data - Fall Goal of this course Understanding fundamental concepts in Big Data Analytics Learn about existing technologies and how to apply them Related courses Machine Learning/Statistics Distributed Systems Database Systems Computer Communications Computer Security 4
5 8/26/2014 CS581 Big Data - Fall /26/2014 CS581 Big Data - Fall When is the best time to take this course? About me Research area Big Data Storage, retrievals, analytics and visualization Projects Galileo Distributed data storage system for large scale geospatial time-series datasets Mendel Genomic data storage system GeoLens Visual Analytics 8/26/2014 CS581 Big Data - Fall Galileo: Big Picture 8/26/2014 CS581 Big Data - Fall Galileo: key features Data is voluminous Outpaces what is available on a single hard drive Storage must be over a collection of machines Avoid central coordinators Cope with failures Preserve data locality without introducing storage imbalances And the accompanying query hotspots Support range queries and fast ingest of new data Support for large numbers (10 9 ) of small files High throughput storage and retrieval Data is multidimensional with multiple types Time-series data Support for exact match and range queries (with wildcards) along multiple dimensions Support for multiple data formats netcdf, BUFR, HDF 4/5, and data from the Defense Meteorological Satellite Program 8/26/2014 CS581 Big Data - Fall Galileo Design Considerations 8/26/2014 CS581 Big Data - Fall Distributed Hash Tables (DHTs) Symmetric storage nodes No special-function or controller nodes Storage and retrievals may go to any node, and will be forwarded to the targeted node(s) Incremental scale-up Failure-resiliency These are decentralized systems Each node maintains information about a subset of nodes Each node uses its local information to reach any other node in the system Typically used as <key, value> stores 5
6 8/26/2014 CS581 Big Data - Fall Storage Throughput Block is about 8 KB of data 56,000 blocks per second in a system with 48-nodes 8/26/2014 CS581 Big Data - Fall Communications Course Website Announcements: Check the course website at least twice a week. Schedule (course materials, readings, assignments) Policies RamCT Assignment submission Discussion board Grades Contact Instructor sangmi@cs Office hour: T, TH 11:00AM ~ noon and by appointment Office: CSB456 URL: 8/26/2014 CS581 Big Data - Fall /26/2014 CS581 Big Data - Fall Course Structure Programming Assignments Assignment 2 Part 0 Part 1 Part 2 Part 3 Part 0 Part 1 Part 2 Part 3 Introduction to Big Data Data Storage models Scalable algorithms and Big Data Analytics Frameworks of Big Data Analytics Introduction to Big Data Data Storage models Scalable algorithms and Big Data Analytics Frameworks of Big Data Analytics Assignment 1 8/26/2014 CS581 Big Data - Fall /26/2014 CS581 Big Data - Fall Term Project-1/2 Phase 1. Motivating articles Read 3 motivating articles and summarize them Phase 2. Term project proposal Choose your topic Challenging Big Data topics e.g. Simple distributed storage systems, N-gram analysis engine Big Data challenges in your research domain e.g. Simple analytical tool for large scale data analysis in bioinformatics Plan your project Software design Functional requirements Timeline Evaluation method Team/Individual interview Term Project-2/2 Phase 3. Final submission/demo Description and code Phase 4. Presentation You may work on the term project (Phase 2 ~ 4) individually or as a team (only two member teams will be allowed) You should submit phase 1 individually. Group efforts must be correspondingly bigger If you are working with a teammate, please your team by September 8. 6
7 8/26/2014 CS581 Big Data - Fall /26/2014 CS581 Big Data - Fall Term Project Grading Phase 1 (Motivating articles): 10% Phase 2 (Proposal): 25% Phase 3 (Final submission/demo): 40% Phase 4 (Presentation): 25% Course Timeline Phase 1. September 3 Phase 2. PA1 PA2 Phase 4. September 17 October 6 November 5 Phase 3. Week 16 December 2 August September October November December Midterm I October 2 Midterm 2 December 4 Phase 1. description has been posted. 8/26/2014 CS581 Big Data - Fall /26/2014 CS581 Big Data - Fall Quizzes and Participation There will be 10+ pop quizzes in class 1-2 simple questions Open-book, open-material, not open-internet Minute survey Grading Policy Category Points Percentage of Final Grade Programming Assignments (15% x 2) % Term Project % Quizzes and participation % Midterm Exams (15% x 2) % 8/26/2014 CS581 Big Data - Fall /26/2014 CS581 Big Data - Fall Final Letter Grade Course Policy Letter Grade A Total Percentage % and higher No make-up for missed exams Except in extraordinary circumstances (e.g., major illness, family emergency) B ~ % C ~ % D ~ % No make-up for missed quizzes Except for the case where student provided an advance notice to the instructor based on an emergency F Below % 7
8 8/26/2014 CS581 Big Data - Fall /26/2014 CS581 Big Data - Fall Course Policy No Cell-phones in the class. No Laptops during the exam, and quiz. If you need to use a laptop, please sit in the back row. Course Policy: Assignments Assignment submission RamCT Late policy for the assignments Up to a maximum of 2 day past the deadline. 10% penalty per day will be applied I will ask you to turn off your laptop if needed. Grading will be done by interview. 8/26/2014 CS581 Big Data - Fall /26/2014 CS581 Big Data - Fall More importantly, Questions? Attend the class, ask questions, and discuss Check the course web page and RamCT regularly Try new technologies and apply them Share your experiences with other students in class 8
Let the data speak to you. Look Who s Peeking at Your Paycheck. Big Data. What is Big Data? The Artemis project: Saving preemies using Big Data
CS535 Big Data W1.A.1 CS535 BIG DATA W1.A.2 Let the data speak to you Medication Adherence Score How likely people are to take their medication, based on: How long people have lived at the same address
More informationHealthcare data analytics. Da-Wei Wang Institute of Information Science wdw@iis.sinica.edu.tw
Healthcare data analytics Da-Wei Wang Institute of Information Science wdw@iis.sinica.edu.tw Outline Data Science Enabling technologies Grand goals Issues Google flu trend Privacy Conclusion Analytics
More informationOhio Department of Health Seasonal Influenza Activity Summary MMWR Week 10 March 6 th, March 12 th, 2016
Ohio Department of Health Seasonal Influenza Activity Summary MMWR Week 10 March 6 th, March 12 th, 2016 Current Influenza Activity: Current Ohio Activity Level (Geographic Spread) - Widespread Definition:
More informationBig Data and Analytics (Fall 2015)
Big Data and Analytics (Fall 2015) Core/Elective: MS CS Elective MS SPM Elective Instructor: Dr. Tariq MAHMOOD Credit Hours: 3 Pre-requisite: All Core CS Courses (Knowledge of Data Mining is a Plus) Every
More informationBig Data & Analytics: Your concise guide (note the irony) Wednesday 27th November 2013
Big Data & Analytics: Your concise guide (note the irony) Wednesday 27th November 2013 Housekeeping 1. Any questions coming out of today s presentation can be discussed in the bar this evening 2. OCF is
More informationBig Data and Analytics: Getting Started with ArcGIS. Mike Park Erik Hoel
Big Data and Analytics: Getting Started with ArcGIS Mike Park Erik Hoel Agenda Overview of big data Distributed computation User experience Data management Big data What is it? Big Data is a loosely defined
More informationStatistical Challenges with Big Data in Management Science
Statistical Challenges with Big Data in Management Science Arnab Kumar Laha Indian Institute of Management Ahmedabad Analytics vs Reporting Competitive Advantage Reporting Prescriptive Analytics (Decision
More informationInternational Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: 2454-2377 Vol. 1, Issue 6, October 2015. Big Data and Hadoop
ISSN: 2454-2377, October 2015 Big Data and Hadoop Simmi Bagga 1 Satinder Kaur 2 1 Assistant Professor, Sant Hira Dass Kanya MahaVidyalaya, Kala Sanghian, Distt Kpt. INDIA E-mail: simmibagga12@gmail.com
More informationBig Data Analytics. Genoveva Vargas-Solar http://www.vargas-solar.com/big-data-analytics French Council of Scientific Research, LIG & LAFMIA Labs
1 Big Data Analytics Genoveva Vargas-Solar http://www.vargas-solar.com/big-data-analytics French Council of Scientific Research, LIG & LAFMIA Labs Montevideo, 22 nd November 4 th December, 2015 INFORMATIQUE
More informationCollaborations between Official Statistics and Academia in the Era of Big Data
Collaborations between Official Statistics and Academia in the Era of Big Data World Statistics Day October 20-21, 2015 Budapest Vijay Nair University of Michigan Past-President of ISI vnn@umich.edu What
More informationBig Data: Opportunities & Challenges, Myths & Truths 資 料 來 源 : 台 大 廖 世 偉 教 授 課 程 資 料
Big Data: Opportunities & Challenges, Myths & Truths 資 料 來 源 : 台 大 廖 世 偉 教 授 課 程 資 料 美 國 13 歲 學 生 用 Big Data 找 出 霸 淩 熱 點 Puri 架 設 網 站 Bullyvention, 藉 由 分 析 Twitter 上 找 出 提 到 跟 霸 凌 相 關 的 詞, 搭 配 地 理 位 置
More informationSoftware Engineering for Big Data. CS846 Paulo Alencar David R. Cheriton School of Computer Science University of Waterloo
Software Engineering for Big Data CS846 Paulo Alencar David R. Cheriton School of Computer Science University of Waterloo Big Data Big data technologies describe a new generation of technologies that aim
More informationPREPARING FOR A PANDEMIC. Lessons from the Past Plans for the Present and Future
PREPARING FOR A PANDEMIC Lessons from the Past Plans for the Present and Future Pandemics Are Inevitable TM And their impact can be devastating 1918 Spanish Flu 20-100 million deaths worldwide 600,000
More informationInfluenza Surveillance Weekly Report CDC MMWR Week 16: Apr 17 to 23, 2016
Office of Surveillance & Public Health Preparedness Program of Public Health Informatics Influenza Surveillance Weekly Report CDC MMWR Week 16: Apr 17 to 23, 2016 Influenza Activity by County, State, and
More informationInternet Search Activity & Crowdsourcing
Novel Data Sources for Disease Surveillance and Epidemiology: Internet Search Activity & Crowdsourcing Sumiko Mekaru, DVM, MPVM, MLIS HealthMap, Boston Children s Hospital American College of Epidemiology
More informationCOS 116 The Computational Universe Laboratory 9: Virus and Worm Propagation in Networks
COS 116 The Computational Universe Laboratory 9: Virus and Worm Propagation in Networks You learned in lecture about computer viruses and worms. In this lab you will study virus propagation at the quantitative
More informationExploring Big Data in Social Networks
Exploring Big Data in Social Networks virgilio@dcc.ufmg.br (meira@dcc.ufmg.br) INWEB National Science and Technology Institute for Web Federal University of Minas Gerais - UFMG May 2013 Some thoughts about
More informationA.I. in health informatics lecture 1 introduction & stuff kevin small & byron wallace
A.I. in health informatics lecture 1 introduction & stuff kevin small & byron wallace what is this class about? health informatics managing and making sense of biomedical information but mostly from an
More informationManaging Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database
Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica
More informationEstimating PageRank Values of Wikipedia Articles using MapReduce
Estimating PageRank Values of Wikipedia Articles using MapReduce Due: Sept. 30 Wednesday 5:00PM Submission: via Canvas, individual submission Instructor: Sangmi Pallickara Web page: http://www.cs.colostate.edu/~cs535/assignments.html
More informationPreparing for the consequences of a swine flu pandemic
Preparing for the consequences of a swine flu pandemic What CIGNA is Doing To help ensure the health and well-being of the individuals we serve, CIGNA is implementing its action plan to prepare for the
More informationUnderstanding Neo4j Scalability
Understanding Neo4j Scalability David Montag January 2013 Understanding Neo4j Scalability Scalability means different things to different people. Common traits associated include: 1. Redundancy in the
More informationLarge-Scale Test Mining
Large-Scale Test Mining SIAM Conference on Data Mining Text Mining 2010 Alan Ratner Northrop Grumman Information Systems NORTHROP GRUMMAN PRIVATE / PROPRIETARY LEVEL I Aim Identify topic and language/script/coding
More informationData Refinery with Big Data Aspects
International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www. irphouse.com /ijict.htm Data
More informationInfrastructures for big data
Infrastructures for big data Rasmus Pagh 1 Today s lecture Three technologies for handling big data: MapReduce (Hadoop) BigTable (and descendants) Data stream algorithms Alternatives to (some uses of)
More informationStatistics for BIG data
Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before
More informationPUBLIC HEALTH MEETS SOCIAL MEDIA: MINING HEALTH INFO FROM TWITTER
PUBLIC HEALTH MEETS SOCIAL MEDIA: MINING HEALTH INFO FROM TWITTER Michael Paul (@mjp39) Johns Hopkins University Crowdsourcing and Human Computation Lecture 18 Learning about the real world through Twitter
More informationIntro to Bioinformatics
Intro to Bioinformatics Marylyn D Ritchie, PhD Professor, Biochemistry and Molecular Biology Director, Center for Systems Genomics The Pennsylvania State University Sarah A Pendergrass, PhD Research Associate
More informationCAP4773/CIS6930 Projects in Data Science, Fall 2014 [Review] Overview of Data Science
CAP4773/CIS6930 Projects in Data Science, Fall 2014 [Review] Overview of Data Science Dr. Daisy Zhe Wang CISE Department University of Florida August 25th 2014 20 Review Overview of Data Science Why Data
More informationMajor Public Health Threat
GAO United States General Accounting Office Testimony Before the Committee on Government Reform, House of Representatives For Release on Delivery Expected at 10:00 a.m. EST Thursday, February 12, 2004
More informationHow To Scale Out Of A Nosql Database
Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 thomas.steinmaurer@scch.at www.scch.at Michael Zwick DI
More informationPlumas County Public Health Agency. Preparing the Community for Public Health Emergencies
Plumas County Public Health Agency Preparing the Community for Public Health Emergencies Safeguarding Your Investment Local businesses have invested significant time and resources into being successful.
More information5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014
5 Keys to Unlocking the Big Data Analytics Puzzle Anurag Tandon Director, Product Marketing March 26, 2014 1 A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for
More informationWhy is Internal Audit so Hard?
Why is Internal Audit so Hard? 2 2014 Why is Internal Audit so Hard? 3 2014 Why is Internal Audit so Hard? Waste Abuse Fraud 4 2014 Waves of Change 1 st Wave Personal Computers Electronic Spreadsheets
More informationWhite Paper. Version 1.2 May 2015 RAID Incorporated
White Paper Version 1.2 May 2015 RAID Incorporated Introduction The abundance of Big Data, structured, partially-structured and unstructured massive datasets, which are too large to be processed effectively
More information?????? Data Analytics
?????? Data Analytics Prof. Dr.-Ing. Lars Linsen Prof. Dr. Adalbert FX Wilhelm Fall 2015 0. Organizational Stuff 0.1 Syllabus and Organization Data Analytics 3 Course website http://www.faculty.jacobsuniversity.de/llinsen/teaching/??????.htm
More informationA REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM
A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM Sneha D.Borkar 1, Prof.Chaitali S.Surtakar 2 Student of B.E., Information Technology, J.D.I.E.T, sborkar95@gmail.com Assistant Professor, Information
More informationThe Data Engineer. Mike Tamir Chief Science Officer Galvanize. Steven Miller Global Leader Academic Programs IBM Analytics
The Data Engineer Mike Tamir Chief Science Officer Galvanize Steven Miller Global Leader Academic Programs IBM Analytics Alessandro Gagliardi Lead Faculty Galvanize Businesses are quickly realizing that
More informationNorth Texas School Health Surveillance: First-Year Progress and Next Steps
Dean Lampman, MBA, Regional Surveillance Coordinator, Southwest Center for Advanced Public Health Practice* (Fort Worth, TX), dflampman@tarrantcounty.com Tabatha Powell, MPH, Doctoral student, University
More informationAre You Ready for Big Data?
Are You Ready for Big Data? Jim Gallo National Director, Business Analytics February 11, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?
More informationECDC SURVEILLANCE REPORT
ECDC SURVEILLANCE REPORT Pandemic (H1N1) 2009 Weekly report: Individual case reports EU/EEA countries 31 July 2009 Summary The pandemic A(H1N1) 2009 is still spreading despite the fact that the regular
More informationInformation Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)
More informationOnline Communicable Disease Reporting Handbook For Schools, Child-care Centers & Camps
Online Communicable Disease Reporting Handbook For Schools, Child-care Centers & Camps More Information On Our Website https://www.accesskent.com/health/commdisease/school_daycare.htm Kent County Health
More informationWhy Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts
More informationBig Data Systems CS 5965/6965 FALL 2015
Big Data Systems CS 5965/6965 FALL 2015 Today General course overview Expectations from this course Q&A Introduction to Big Data Assignment #1 General Course Information Course Web Page http://www.cs.utah.edu/~hari/teaching/fall2015.html
More informationEnhancing Respiratory Infection Surveillance on the Arizona-Sonora Border BIDS Program Sentinel Surveillance Data
Enhancing Respiratory Infection Surveillance on the Arizona-Sonora Border BIDS Program Sentinel Surveillance Data Catherine Golenko, MPH, Arizona Department of Health Services *Arizona border population:
More informationScalable Data Analysis in R. Lee E. Edlefsen Chief Scientist UserR! 2011
Scalable Data Analysis in R Lee E. Edlefsen Chief Scientist UserR! 2011 1 Introduction Our ability to collect and store data has rapidly been outpacing our ability to analyze it We need scalable data analysis
More informationINTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE
INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE AGENDA Introduction to Big Data Introduction to Hadoop HDFS file system Map/Reduce framework Hadoop utilities Summary BIG DATA FACTS In what timeframe
More informationAdobe Insight, powered by Omniture
Adobe Insight, powered by Omniture Accelerating government intelligence to the speed of thought 1 Challenges that analysts face 2 Analysis tools and functionality 3 Adobe Insight 4 Summary Never before
More informationHarnessing the power of advanced analytics with IBM Netezza
IBM Software Information Management White Paper Harnessing the power of advanced analytics with IBM Netezza How an appliance approach simplifies the use of advanced analytics Harnessing the power of advanced
More informationPlanning for Pandemic Flu. pandemic flu table-top exercise
Planning for Pandemic Flu Lawrence Dickson - University of Edinburgh Background Previous phase of health and safety management audit programme raised topic of Business Continuity Management Limited ability
More informationCIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing. University of Florida, CISE Department Prof.
CIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing University of Florida, CISE Department Prof. Daisy Zhe Wang Data Science Overview Why, What, How, Who Outline Why Data Science?
More informationInfluenza and Pandemic Flu Guidelines
Influenza and Pandemic Flu Guidelines Introduction Pandemic flu is a form of influenza that spreads rapidly to affect most countries and regions around the world. Unlike the 'ordinary' flu that occurs
More informationIntroduction to Data Mining
Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:
More informationSWINE FLU: FROM CONTAINMENT TO TREATMENT
SWINE FLU: FROM CONTAINMENT TO TREATMENT SWINE FLU: FROM CONTAINMENT TO TREATMENT INTRODUCTION As Swine Flu spreads and more people start to catch it, it makes sense to move from intensive efforts to contain
More informationAre You Ready for Big Data?
Are You Ready for Big Data? Jim Gallo National Director, Business Analytics April 10, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?
More informationCreating the Resilient Corporation
Creating the Resilient Corporation Business Continuity Planning and Pandemics Presented by: Eric Millard, Delivery Manager, Business Continuity and Recovery Services, Hewlett-Packard 2006 Hewlett-Packard
More informationDelivering the power of the world s most successful genomics platform
Delivering the power of the world s most successful genomics platform NextCODE Health is bringing the full power of the world s largest and most successful genomics platform to everyday clinical care NextCODE
More informationBig Data Analytics for Space Exploration, Entrepreneurship and Policy Opportunities. Tiffani Crawford, PhD
Big Analytics for Space Exploration, Entrepreneurship and Policy Opportunities Tiffani Crawford, PhD Big Analytics Characteristics Large quantities of many data types Structured Unstructured Human Machine
More informationPandemic Risk Assessment
Research Note Pandemic Risk Assessment By: Katherine Hagan Copyright 2013, ASA Institute for Risk & Innovation Keywords: pandemic, Influenza A, novel virus, emergency response, monitoring, risk mitigation
More informationBIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics
BIG DATA & ANALYTICS Transforming the business and driving revenue through big data and analytics Collection, storage and extraction of business value from data generated from a variety of sources are
More informationOf all the data in recorded human history, 90 percent has been created in the last two years. - Mark van Rijmenam, Think Bigger, 2014
What is Big Data? Of all the data in recorded human history, 90 percent has been created in the last two years. - Mark van Rijmenam, Think Bigger, 2014 Data in the Twentieth Century and before In 1663,
More informationBig Data a threat or a chance?
Big Data a threat or a chance? Helwig Hauser University of Bergen, Dept. of Informatics Big Data What is Big Data? well, lots of data, right? we come back to this in a moment. certainly, a buzz-word but
More informationFour Orders of Magnitude: Running Large Scale Accumulo Clusters. Aaron Cordova Accumulo Summit, June 2014
Four Orders of Magnitude: Running Large Scale Accumulo Clusters Aaron Cordova Accumulo Summit, June 2014 Scale, Security, Schema Scale to scale 1 - (vt) to change the size of something let s scale the
More informationNative Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy
Native Connectivity to Big Data Sources in MicroStrategy 10 Presented by: Raja Ganapathy Agenda MicroStrategy supports several data sources, including Hadoop Why Hadoop? How does MicroStrategy Analytics
More informationData, Measurements, Features
Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are
More informationCOMP9321 Web Application Engineering
COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 11 (Part II) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411
More informationWeb Data Mining: A Case Study. Abstract. Introduction
Web Data Mining: A Case Study Samia Jones Galveston College, Galveston, TX 77550 Omprakash K. Gupta Prairie View A&M, Prairie View, TX 77446 okgupta@pvamu.edu Abstract With an enormous amount of data stored
More informationMobile Phone APP Software Browsing Behavior using Clustering Analysis
Proceedings of the 2014 International Conference on Industrial Engineering and Operations Management Bali, Indonesia, January 7 9, 2014 Mobile Phone APP Software Browsing Behavior using Clustering Analysis
More informationINTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK A SURVEY ON BIG DATA ISSUES AMRINDER KAUR Assistant Professor, Department of Computer
More informationOverview. Why this policy? Influenza. Vaccine or mask policies. Other approaches Conclusion. epidemiology transmission vaccine
Overview Why this policy? Influenza epidemiology transmission vaccine Vaccine or mask policies development and implementation Other approaches Conclusion Influenza or mask policy Receive the influenza
More informationIllinois Influenza Surveillance Report
ILLINOIS DEPARTMENT OF PUBLIC HEALTH Illinois Influenza Surveillance Report Week 8: Week Ending Saturday, February 25, 2012 Division of Infectious Diseases Immunizations Section 3/5/2012 1 Please note
More informationIC05 Introduction on Networks &Visualization Nov. 2009. <mathieu.bastian@gmail.com>
IC05 Introduction on Networks &Visualization Nov. 2009 Overview 1. Networks Introduction Networks across disciplines Properties Models 2. Visualization InfoVis Data exploration
More informationBIG DATA FUNDAMENTALS
BIG DATA FUNDAMENTALS Timeframe Minimum of 30 hours Use the concepts of volume, velocity, variety, veracity and value to define big data Learning outcomes Critically evaluate the need for big data management
More informationInformation Visualization WS 2013/14 11 Visual Analytics
1 11.1 Definitions and Motivation Lot of research and papers in this emerging field: Visual Analytics: Scope and Challenges of Keim et al. Illuminating the path of Thomas and Cook 2 11.1 Definitions and
More informationTransforming the Telecoms Business using Big Data and Analytics
Transforming the Telecoms Business using Big Data and Analytics Event: ICT Forum for HR Professionals Venue: Meikles Hotel, Harare, Zimbabwe Date: 19 th 21 st August 2015 AFRALTI 1 Objectives Describe
More informationStatistical Analysis and Visualization for Cyber Security
Statistical Analysis and Visualization for Cyber Security Joanne Wendelberger, Scott Vander Wiel Statistical Sciences Group, CCS-6 Los Alamos National Laboratory Quality and Productivity Research Conference
More informationL1: Introduction to Hadoop
L1: Introduction to Hadoop Feng Li feng.li@cufe.edu.cn School of Statistics and Mathematics Central University of Finance and Economics Revision: December 1, 2014 Today we are going to learn... 1 General
More informationThe big data revolution
The big data revolution Friso van Vollenhoven (Xebia) Enterprise NoSQL Recently, there has been a lot of buzz about the NoSQL movement, a collection of related technologies mostly concerned with storing
More informationIST687 Scientific Data Management
1 IST687 Scientific Data Management Spring 2012 Instructor: Jian Qin Email: jqin@syr.edu Office: 311 Hinds Hall Phone: 315-443-5642 Time: any time Location: anywhere Course Description The Scientific Data
More informationAssociate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2
Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue
More informationBig Data Systems CS 5965/6965 FALL 2014
Big Data Systems CS 5965/6965 FALL 2014 Today General course overview Q&A Introduction to Big Data Data Collection Assignment #1 General Course Information Course Web Page http://www.cs.utah.edu/~hari/teaching/fall2014.html
More informationSummary of infectious disease epidemiology course
Summary of infectious disease epidemiology course Mads Kamper-Jørgensen Associate professor, University of Copenhagen, maka@sund.ku.dk Public health science 3 December 2013 Slide number 1 Aim Possess knowledge
More informationSwine Influenza Special Edition Newsletter
Swine Influenza Newsletter surrounding swine flu, so that you ll have the right facts to make smart decisions for yourself and your family. While the World Health Organization (WHO) and the Centers for
More informationHow To Create An Insight Analysis For Cyber Security
IBM i2 Enterprise Insight Analysis for Cyber Analysis Protect your organization with cyber intelligence Highlights Quickly identify threats, threat actors and hidden connections with multidimensional analytics
More informationCS 207 - Data Science and Visualization Spring 2016
CS 207 - Data Science and Visualization Spring 2016 Professor: Sorelle Friedler sorelle@cs.haverford.edu An introduction to techniques for the automated and human-assisted analysis of data sets. These
More informationPREPARING YOUR ORGANIZATION FOR PANDEMIC FLU. Pandemic Influenza:
PREPARING YOUR ORGANIZATION FOR PANDEMIC FLU Pandemic Influenza: What Business and Organization Leaders Need to Know About Pandemic Influenza Planning State of Alaska Frank H. Murkowski, Governor Department
More informationHIV NOMOGRAM USING BIG DATA ANALYTICS
HIV NOMOGRAM USING BIG DATA ANALYTICS S.Avudaiselvi and P.Tamizhchelvi Student Of Ayya Nadar Janaki Ammal College (Sivakasi) Head Of The Department Of Computer Science, Ayya Nadar Janaki Ammal College
More informationSo today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)
Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we
More informationMining Big Data. Pang-Ning Tan. Associate Professor Dept of Computer Science & Engineering Michigan State University
Mining Big Data Pang-Ning Tan Associate Professor Dept of Computer Science & Engineering Michigan State University Website: http://www.cse.msu.edu/~ptan Google Trends Big Data Smart Cities Big Data and
More informationBigData. An Overview of Several Approaches. David Mera 16/12/2013. Masaryk University Brno, Czech Republic
BigData An Overview of Several Approaches David Mera Masaryk University Brno, Czech Republic 16/12/2013 Table of Contents 1 Introduction 2 Terminology 3 Approaches focused on batch data processing MapReduce-Hadoop
More informationUsing Web and Social Media for Influenza Surveillance
Using Web and Social Media for Influenza Surveillance Courtney D Corley 1, Diane J Cook 2, Armin R Mikler 3 and Karan P Singh 4 Abstract Analysis of Google influenza-like-illness (ILI) search queries has
More informationANALYTICS PREDICTIVE. Tool of Providence or the End of Coincidence? He who does not expect the unexpected will not find it out.
PREDICTIVE ANALYTICS Tool of Providence or the End of Coincidence? He who does not expect the unexpected will not find it out. Unless you expect the unexpected you will ever find truth, for it is hard
More informationBIG DATA What it is and how to use?
BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14
More informationData Analytics for Healthcare: Creating understanding from big data
Data Analytics for Healthcare: Creating understanding from big data Data Analytics for Healthcare Data analytics is an essential resource for any profession. This collection of data and information is
More informationExploiting the power of Big Data
Exploiting the power of Big Data Timos Sellis School of Computer Science and Information Technology timos.sellis@rmit.edu.au ITECHLAW Asia-Pacific Conference, February 26-28, 2014 Melbourne Australia Timeline
More informationIntroduction to infectious disease epidemiology
Introduction to infectious disease epidemiology Mads Kamper-Jørgensen Associate professor, University of Copenhagen, maka@sund.ku.dk Public health science 24 September 2013 Slide number 1 Practicals Elective
More informationBig Data and Analytics: Challenges and Opportunities
Big Data and Analytics: Challenges and Opportunities Dr. Amin Beheshti Lecturer and Senior Research Associate University of New South Wales, Australia (Service Oriented Computing Group, CSE) Talk: Sharif
More informationCS/COE 1501 http://cs.pitt.edu/~bill/1501/
CS/COE 1501 http://cs.pitt.edu/~bill/1501/ Lecture 01 Course Introduction Meta-notes These notes are intended for use by students in CS1501 at the University of Pittsburgh. They are provided free of charge
More information