Healthcare data analytics. Da-Wei Wang Institute of Information Science
1 Healthcare data analytics Da-Wei Wang Institute of Information Science
2 Outline: Data science; Enabling technologies; Grand goals; Issues; Google Flu Trends; Privacy; Conclusion
3
4
5 Analytics: Statistics; Machine learning (decision tree, artificial neural network, support vector machine, Bayesian network); Deep learning; Graph analytics; Natural language processing
6 Map-Reduce: a programming model for large-scale computing problems, built on parallel and distributed computing.
7 Map-Reduce steps: 1. Distribute the data to machines (mappers). 2. Map: compute what you want from each data item as a (key, value) pair. 3. Shuffle and sort (according to key). 4. Reduce: aggregate, summarize, filter, or transform (reducers). 5. Output.
8 Example: compute word frequency. 1. Distribute web pages to machines (mappers). 2. Map: for each word w, create a (w, c) pair, where c is the number of occurrences of w in the document. 3. Shuffle and sort (according to key). 4. Reduce: add all c_i in the pairs (w, c_i). 5. Output.
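The word-frequency steps above can be sketched in a single process. This is a minimal illustration, not a distributed implementation: the function names (map_phase, shuffle_phase, reduce_phase, word_frequency) and the sample pages are assumptions made for the example, and a real Map-Reduce framework would run the mappers and reducers on separate machines.

```python
from collections import Counter, defaultdict


def map_phase(document):
    """Map: emit one (word, count) pair per distinct word in a document."""
    return list(Counter(document.lower().split()).items())


def shuffle_phase(mapped_pairs):
    """Shuffle and sort: group the counts by word, ordered by key."""
    groups = defaultdict(list)
    for word, count in mapped_pairs:
        groups[word].append(count)
    return dict(sorted(groups.items()))


def reduce_phase(groups):
    """Reduce: add all counts c_i collected for each word w."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_frequency(documents):
    """Run map, shuffle, and reduce over a list of documents."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle_phase(mapped))


# Hypothetical input: two tiny "web pages".
pages = ["flu symptoms and flu treatment", "flu shot near me"]
counts = word_frequency(pages)
```

Here counts maps each word to its total across both pages; the mapper pre-aggregates within a document, so the reducer only adds one number per document per word.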
9 Visualization: The main goal of data visualization is to communicate information clearly and effectively through graphical means. To convey ideas effectively, aesthetic form and functionality need to go hand in hand, providing insights into a rather sparse and complex data set by communicating its key aspects in a more intuitive way. Example: Hans Rosling, Gapminder.
10
11 Heterogeneity in healthcare. Multiple forms: insurance claims, physician notes, images, conversations about health in social media, data from wearables and other monitoring devices. Multiple agencies: providers, payers, employers, personalized genetic-testing companies (23andMe), social media, and patients.
12 The Learning Healthcare System Series
13 The goal of a learning healthcare system is to deliver the best care every time, and to learn and improve with each care experience. "Each care experience counts" implies massive data, which in turn needs analytics.
14 Precision medicine: Precision medicine is an emerging approach for disease treatment and prevention that takes into account individual variability in genes, environment, and lifestyle for each person. Electronic health records have been widely adopted, genomic analysis costs have dropped significantly, and data science has become increasingly sophisticated.
15 Precision Medicine Initiative: In January 2015, Mr. Obama called for $215 million in fiscal year 2016 to support the Initiative. $130 million was allocated to NIH to build a national, large-scale research participant group, called a cohort. $70 million was allocated to the National Cancer Institute to lead efforts in cancer genomics.
16 Not only for profit
17 Issues with big data analytics: Overfitting; Model complexity; Association (correlation) vs. causality; Understanding and explanation vs. prediction; Parametric to non-parametric; Equational model to algorithmic model. Source: Wolfgang Pietsch, "Big Data: The New Science of Complexity".
18 Cautionary notes: Google Flu Trends. "Detecting influenza epidemics using search engine query data", Nature 2009 (letters). "When Google got flu wrong", Nature 2013 (news). "The parable of Google Flu: traps in big data analysis", Science 2014 (policy forum).
19 Google Flu Trends: Early detection -> rapid response -> reduced impact. Monitor health-seeking behavior in the form of online web search queries. The relative frequency of certain queries is highly correlated with the percentage of physician visits related to influenza-like illness (ILI). Estimate the current level of weekly influenza activity.
20 Data: hundreds of billions of individual search logs (2003-2008); time series of weekly counts for the 50 million most common search queries, normalized by dividing by the total number of queries; percentage of ILI-related physician visits, from CDC. Goal: to estimate the percentage of influenza-like illness (ILI) physician visits.
21 Estimate the probability, P, that a random physician visit is related to influenza-like illness. Key insight: the probability, Q, that a random search query is ILI-related can approximate P. Next steps: pick a model to relate P to Q; determine which queries are ILI-related.
22 Model: logit(P) = a + b * logit(Q) + e, where logit(x) = ln(x / (1 - x)). P, Q? Training step: select the set of ILI-related queries (Q) that fits the model best.
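A minimal sketch of fitting the model on this slide, assuming ordinary least squares on the logit-transformed values (the slides do not say which fitting procedure was used). The function names and the synthetic weekly values are hypothetical; the data are generated from known coefficients only to show that the fit recovers them.

```python
import math


def logit(x):
    """logit(x) = ln(x / (1 - x)), defined for 0 < x < 1."""
    return math.log(x / (1.0 - x))


def inv_logit(z):
    """Inverse of logit: maps a real number back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit_linear(q_values, p_values):
    """Least-squares estimates of a and b in logit(P) = a + b * logit(Q) + e."""
    xs = [logit(q) for q in q_values]
    ys = [logit(p) for p in p_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_p(a, b, q):
    """Estimate the ILI visit probability P from a query fraction Q."""
    return inv_logit(a + b * logit(q))


# Synthetic, noise-free example (assumed values, not real surveillance data).
true_a, true_b = 1.0, 0.9
q_weekly = [0.001, 0.002, 0.004, 0.008]
p_weekly = [inv_logit(true_a + true_b * logit(q)) for q in q_weekly]
a_hat, b_hat = fit_logit_linear(q_weekly, p_weekly)
```

Working on the logit scale keeps every predicted P strictly between 0 and 1, which a straight-line fit on the raw percentages would not guarantee.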
23 Use a single query as Q and try all 50 million one by one. Favor those that perform well for all 9 regions. Produce a sorted list of the highest-scoring queries. Decide how many queries to include in Q: N = 45.
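The selection procedure on this slide can be sketched as a ranking problem. The scoring rule below (mean Pearson correlation across regions) is an assumption chosen for illustration; the slides only say that queries performing well in all regions are favored. The data layout, function names, and toy series are likewise hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rank_queries(query_series, ili_series):
    """Sort queries by their mean correlation with ILI across all regions.

    query_series: {query: {region: [weekly query fraction, ...]}}
    ili_series:   {region: [weekly ILI visit percentage, ...]}
    """
    scores = {}
    for query, by_region in query_series.items():
        per_region = [pearson(by_region[r], ili_series[r]) for r in ili_series]
        scores[query] = sum(per_region) / len(per_region)
    return sorted(scores, key=scores.get, reverse=True)


def select_top_queries(query_series, ili_series, n):
    """Keep the n highest-scoring queries (the slide reports N = 45)."""
    return rank_queries(query_series, ili_series)[:n]


# Toy data: two regions and three candidate queries (assumed values).
ili = {"A": [1, 2, 3, 4], "B": [2, 4, 3, 5]}
queries = {
    "flu symptoms": {"A": [0.1, 0.2, 0.3, 0.4], "B": [0.2, 0.4, 0.3, 0.5]},
    "fever": {"A": [0.1, 0.3, 0.2, 0.4], "B": [0.2, 0.3, 0.4, 0.5]},
    "basketball": {"A": [0.4, 0.3, 0.2, 0.1], "B": [0.5, 0.3, 0.4, 0.2]},
}
top_two = select_top_queries(queries, ili, 2)
```

Averaging across regions penalizes a query that tracks ILI in one region only. With tens of millions of candidates and comparatively few weekly data points, some queries will score well by chance, which is the overfitting risk raised on a later slide.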
24 Results. Training: mean correlation 0.90 (min = 0.80, max = 0.96, 9 regions). Validation: 42 points for each region ( ), mean correlation 0.97 (min = 0.92, max = 0.99).
25 When Google got flu wrong: Flu Trends did not do well for the 2012 season. In 2009, Flu Trends badly underestimated ILI in the US at the start of the H1N1 pandemic. This was attributed to changes in people's search behavior as a result of the exceptional nature of the pandemic.
26 Most big data that have received popular attention are not the output of instruments designed to produce valid and reliable data amenable to scientific analysis. 50 million search terms were used to fit 1152 data points. Remedy: combine multiple sources and dynamically recalibrate GFT.
27 Algorithm dynamics: All empirical research stands on a foundation of measurement. Is the instrumentation actually capturing the theoretical construct of interest? Is the measurement stable and comparable across cases and over time? Are measurement errors systematic? GFT was an unstable reflection of the prevalence of the flu because of algorithm dynamics affecting Google's search algorithm.
28 Algorithm dynamics: Algorithm dynamics are the changes made by engineers to improve the commercial service and by consumers in using that service. The Google search algorithm is not a static entity. Examples: providing suggested additional search terms (2011); returning potential diagnoses for searches including physical symptoms (2012).
29 GFT assumes that relative search volume for certain terms is statically related to external events, but search behavior changes dynamically. Research subjects attempt to manipulate the data-generating process to meet their own goals (Google bomb). Ironically, the more successful we become at monitoring the behavior of people using these open sources of information, the more tempting it will be to manipulate those signals.
30 Lessons: Transparency and replicability. Use big data to understand the unknown (GFT for finer granularity). Study the algorithms: Robust patterns? Replicate across time and with other data sources. Study the evolution of the socio-technical systems embedded in our society. It's not just about the size of the data: all data.
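31 My Health Bank (健康存摺) and the Electronic Medical Record Exchange Center are already at the starting line of a learning healthcare system.
32 The epidemic-prevention cloud (防疫雲) has begun trying machine-to-machine automated data exchange, making infectious disease surveillance more timely and more economical.
33 Health cloud (健康雲) interdisciplinary research: law, economics, biomedicine, public health, statistics, information science. The hope is to create a research environment that is friendlier and more respectful of individuals.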
34 Privacy: The dispute about National Health Insurance data concerns not only personal privacy but also autonomy. The right to opt out? One side: the data are de-identified, opting out reduces the quality of the data, it's for the public good, and the administration cost is too high. The other side: It's my decision! IT brings the administration cost down. What if 30% opt out and data quality goes down? But
35 Releasing data: Data -> User; De-identification (CellSecu system); Data enclave (data center); User -> Data; Link unlinkable data sets; Secure multiparty computation
36 Dataset linkage problem: Linking several datasets can be very useful. Linkage is prohibited by law in many places due to privacy concerns. Secure multiparty computation (SMC) protocols might remedy the situation. We built a prototype system.
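The slides do not describe the prototype, so the following is only a generic illustration of the SMC idea, not the authors' system. It uses additive secret sharing, one textbook SMC building block, to let several data holders compute a joint total without any of them revealing its own count. The modulus, the function names, and the hospital counts are assumptions, and the whole protocol is simulated in one process under a semi-honest (follow-the-protocol) model.

```python
import secrets

# Assumed public modulus; it only needs to exceed the largest possible total.
MODULUS = 2**61 - 1


def make_shares(value, n_parties):
    """Split value into n random shares that sum to value modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values):
    """Simulate a secure sum: each party only ever sees random-looking shares."""
    n = len(private_values)
    # Party i splits its private value and sends share j to party j.
    all_shares = [make_shares(v, n) for v in private_values]
    # Party j adds up the shares it received and publishes only that subtotal.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # The published subtotals add up to the true total.
    return sum(partial_sums) % MODULUS


# Hypothetical per-hospital case counts that must stay private.
hospital_counts = [120, 45, 310]
total = secure_sum(hospital_counts)
```

Any single share is uniformly random, so one party's view says nothing about another party's count. Record-level linkage across data sets needs more elaborate protocols than a sum, but they rest on the same principle of computing on shares rather than on raw data.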
37 Conclusions: Data science has tremendous potential. Healthcare analytics can have a profound impact on healthcare systems. Autonomy and privacy issues have to be addressed. Active participation is a possible option.