The Emerging Discipline of Data Science. Principles and Techniques For Data- Intensive Analysis
|
|
|
- Kristopher Garrett
- 10 years ago
- Views:
Transcription
1 The Emerging Discipline of Data Science Principles and Techniques For Data- Intensive Analysis
2 What is Big Data Analy9cs? Is this a new paradigm? What is the role of data? What could possibly go wrong? What is Data Science?
3 Big Data is Hot!
4 Big Data Is Important Market Hot Results, products, jobs Poten9al 4 th Paradigm Accelerates discovery [urgent] BeLer: cost, speed, specificity Change 80% of processes [Gartner] Government Policy (45+) White House; most US Govt agencies Adop9on: Most Human Endeavors All academic disciplines Computa9onal X Cool Low effec9ve adop9on [EMC] 60% opera9onal 20% significant change < 1% effec9ve Results not opera9onal In its infancy þ lacking Understanding Concepts, tools, techniques (methods) 21 st Century Sta9s9cs Theory: principles, guidelines
5 Healthcare Poten9al: BeLer Health; Faster, Cheaper Remedies
6 What could go Wrong? When are Correla9ons Spurious?
7 Or Just Wrong? Eg Google Flu Trends Allegedly Real- 9me, Reliable Predic9ons High 100 out of 108 weeks
8 Future of Life: Ins9tute to mi;gate existen;al risks facing humanity
9 US Legal Community Pursuing Algorithmic Accountability
10 Do We Know / Can We Prove? DIA Result: correct, complete, efficient? What machines / algorithms / Machine Learning / Black Boxes / DIA do? Emergent Data- Driven Society with High Reward: Cancer cures, drug discovery, personalized medicine, Risk: errors in any of the above
11 The search for truth evidence- based causality evidence- based correla9ons
12 Model / Theory Hypotheses Data Analysis
13 Long Illustrious Histories Data Analysis Mathema9cs Babylon (17 th - 12 th C BCE) India (12 th C BCE) Mathema9cal analysis (17 th C, Scien9fic Revolu9on) Sta9s9cs (5 th C BCE, 18 th C) ~4,000 years Scien1fic Method Empiricism Aristotle ( BCE) Ptolemy (1 st C) Bacons (13 th, 16 th C) ~2,000 years Scien9fic Discovery Paradigms 1 Theory 2 Experimenta9on 3 Simula9on 4 escience / Big Data ~ 1,000 years
14 Fourth Paradigm Modern Compu1ng Hardware: 40s- 50s FORTRAN: 50s Spreadsheets: 70s Databases: 70s- 80s World Wide Web: 90s ~ 60 years Data- Intensive Analysis of Everything escience (~2000) Big Data (~2007) Par9cle physics, drug discovery, ~ 15 years Paradigms Long developments Significant shiss Conceptual Theore9cal Procedural
15 Precision Oncology Scans Normal skin cell Original cancer cell Biopsy Biomarkers Monitor Sequence Treated cell Sequencing Machines Treat Compare Patient Chromosomes Test Target Cancer cell Normal cell In vivo In vitro In silico Source: Marty Tenebaum, Cancer Commons
16 Accelerating Scientific Discovery Probabilistic Results What: Correlation Experiment Model Why: Causation Correlations/ Hypotheses
17 Accelerating Scientific Discovery Probabilistic Results What: Correlation Baylor Scientists Experiment Model Why: Causation Watson Correlations/ Hypotheses
18 Profound Changes: Paradigm Shis [Kuhn] New reasoning / problem solving model Data è Data- Intensive (Big Data 4 Vs) Why è What Strategic (theory- based) è Tac9cal (evidence- based) Theory- driven (top- down) è Data- driven (bolom- up) Hypothesis tes9ng è Hypothesis genera9on Enabling Paradigm Shiss in most disciplines Science è escience Accelera9ng (scien9fic / engineering) discovery Most domains Personalized medicine Urban Planning Drug interac9ons Social and Economic Planning Beyond Data- Driven: Symbiosis What + Why Human intelligence + machine intelligence
19 Big Data and Data- Intensive Analysis THE BIG PICTURE: MY PERSPECTIVE
20 DIA Pipelines / Ecosystem Q: What Big Data technologies do you see becoming very popular within the next five years? A: I don t like to say that there s a specific technology, there are pipelines that you would build that have pieces to them How do you process the data, how do you represent it, how do you store it, what inferen9al problem are you trying to solve There s a whole toolbox or ecosystem that you have to understand if you are going to be working in the field Michael Jordan, Pehong Chen Dis;nguished Professor at the University of California, Berkeley
21 Data- Intensive Analysis Analy9cal Models Analy9cal Methods Analy9cal Results Data- Intensive Analysis Data Science
22 Data- Intensive Analysis Analy9cal Models Analy9cal Methods Analy9cal Results Data- Intensive Analysis Data Science
23 Data Management for Data- Intensive Analysis Data- Intensive Analysis Data Sources Internal Shared Data Repository Global Data Catalogue & Grid Access Analy9cal Models Analy9cal Methods Analy9cal Results Raw Data Acquisi9on & Cura9on External Shared Repository Catalogue En99es Analy9cal Data Acquisi9on Rela9onships Data- Intensive Analysis Data Science Data Science
24 Research Method: Examine Complex, Large- Scale Use Cases that push limits DATA- INTENSIVE ANALYSIS (DIA) DIA PROCESS (WORKFLOW / PIPELINE) DIA USE CASE RANGE
25 Data Analysis è Data- Intensive Analysis Common defini9on far too simplis;c : extract knowledge from data DIA: the ac;vity of using data to inves;gate phenomena, to acquire new knowledge, and to correct and integrate previous knowledge DIA Process/Workflow/Pipeline: a sequence of opera;ons that cons;tute an end- to- end DIA from source data to a quan;fied, qualified result
26 My Focus is Not common DIA Use Cases
27 Nor High Impact Organiza9onal DIA
28 Data(- Intensive) Analysis Range * Small Data (volume, velocity, variety) 98% Conven9onal data analysis: 1 K years - sta9s9cs, spreadsheets, databases, Big Data = (volume, velocity, variety) 2% Simple DIA: most data science is simple Jeff Leek 96% Focus Simple models & methods, single user, short dura9on: 65+ self- service tools, ML, widest- usage Rela9ve simplicity: sales, marke9ng, & social trends, defects, Complex DIA 4% Domains: par9cle physics, economics, stock market, genomics, drug discovery, weather, boiling water, psychology, Models & Methods: large, collabora9ve community, long dura9on, very large scale Why? A: This is where things obviously break * Many more factors
29 Example Scien9fic Workflow (Arvados)
30 Complex DIA Use Case #1 escience, Big Science, Networked Science, Community Compu9ng, Science Gateway TOP- QUARK, LARGE HADRON COLLIDER, CERN, SWITZERLAND
31
32 Higg s Boson: 40 Year Search LHC Data from proton proton collisions at centre- of- mass energies of 7 TeV (2011) and 8 TeV (2012)
33 How do you Prove Higgs Boson Exists? Standard model of physics predicts (30 years) Higgs Boson characteris9cs Mass ~125 GeV Decays to γγ, WW and ZZ boson pairs Couplings to W and Z bosons Spin parity Couples to up- type top- quark Couples to down- type fermions? Decays to bolom quarks and τ leptons
34 5 Sigma=000001% possible error (2012) 10 Sigma (2014)
35
36 Original Big Data Applica9on, eg, ATLAS high- energy physics (CERN) Source Data Sets Hardware filtering Root Model Tier 1: long- term cura9on; RAW reprocessing to derived formats Root Model ROOT in Oracle Boson Model ROOT Top Quark Model 1,000s Tier 0: Calibra9on and Alignment Express Stream Analysis ROOT Raw Data Acquisi9on & Cura9on Analy9cal Data Acquisi9on Data- Intensive Analysis
37 Worldwide LHC Computing Grid Tier 0 (CERN) Data recording Ini9al data reconstruc9on Data distribu9on Tier 1 (13 centers) Permanent storage Re- processing Analysis Tier 2 (~160 centers) Simula9on End- user analysis April 2012 Author/more info 37
38 Original Big Data Applica9on, eg, ATLAS high- energy physics (CERN); Oracle + DQ2 + ROOT ATLAS Distributed Data Management System (DQ2) (Pig, Hive, Hadoop) On the Worldwide LHC Compu9ng Grid (WLCG) Source Data Sets Hardware filtering Root Model Tier 1: long- term cura9on; RAW reprocessing to derived formats Root Model ROOT (Oracle) Meta- Data / Access DQ2 (HDFS) Boson Model ROOT Top Quark Model 1,000s Tier 0: Calibra9on and Alignment Express Stream Analysis ROOT Raw Data Acquisi9on & Cura9on Analy9cal Data Acquisi9on Data- Intensive Analysis
39 Based on ~30 Large- Scale DIA Use Cases LESSONS LEARNED
40 DIA Lessons Learned (What) A Sosware Ar9fact: a workflow / pipeline Data- Intensive Analysis Workflow Data Management (80%) (Raw) Data Acquisi9on and Cura9on Analy9cal Data Acquisi9on Data- Intensive Analysis (20%) Objec9ve: switch 80:20 to 20:80 è Let scien;sts do science Explore (DIA) vs Build (sosware engineering) Dura9on: years Emerging Paradigm New programming paradigm Experiments over data Convergence Scien9fic / engineering discovery ~10 programming paradigms: database, IR, BI, DM,
41 Result Types DIA Lessons Learned (How) Provable <- > Probabilis9c <- > Specula9ve Nature Analy9cal Empirical: complete meta- data Abstract: incomplete meta- data Phases: Explora9on, Analysis, Interpreta9on Users Exploratory, Itera9ve, and Incremental Individual Workgroup Organiza9on / Enterprise Community
42 DIA Lessons Learned (People) Machine + Human Intelligence Symbiosis op9mized Domain knowledge cri9cal Mul9- disciplinary, Collabora9ve, Itera9ve Community Compu9ng: DIA Ecosystems sharing Massive resources Knowledge Costs Many (~60): escience, Science Gateways, Networked Science, High- energy physics (CERN: ROOT) Astrophysics (Gaia) Scien9fic Workflow Systems: ~30 Macroeconomics Global Alliance for Genomics and Health Enterprise Ecosystems, eg, Informa9on Services: Thomson Reuters, Bloomberg, Open- Science- Grid The Cancer Genome Atlas The Cancer Genomics Hub
43 DIA Lessons Learned (Essence) The value and role of What data is adequate evidence for Q? truth evidence- based causality evidence- based correla9ons
44 Complex DIA Use Case #2 Informa9on Services DOW JONES, BLOOMBERG, THOMSON REUTERS, PEARSON,
45 Informa9on Services Business Collect, curate, enrich, augment (IP) & disseminate informa9on Financial & Risk Investors News & press releases Brokerage research Instruments: stocks, bonds, loans, Legal Dockets Case Law Public records Law firms Global businesses Intellectual Property & Science Scien9fic ar9cles Patents Trademarks Domain names Clinical trials Tax & Accoun9ng Corporate Government Solu9ons
46 Consequences of Errors
47 Internal & External Data Sources Enterprise- Scale Big Data Architecture (Informa9on Services) Pre- Big Data Directory Big Data Big Data Catalogue Domain specific Data Marts Domain1 DS1 En9ty1 En9ty2 En9ty3 Raw Data Acquisi9on & Cura9on DSn data data data Organiza1on Knowledge Graph (linked En99es) Knowledge Graph Data Analy9cal Data Acquisi9on Domaink Data- Intensive Analysis Open Data Org data Opinion IP Open Registry KG1 Domain specific Knowledge Graphs
48 DIA Lessons Learned Modelling Analy9cal Models and Methods Selec9on / crea9on, fi ng / tuning Result verifica9on Model / method management Data Models En99es dominate Data Lakes Named En99es + En9ty (Graph) Models Ontologies (genomics), Ensembles, Emerging DIA Ecosystems Technology Languages (~30) Analy9cs Suites / Pla orms (~60) Big Data Management (~30)
49 Veracity WHAT COULD POSSIBLY GO WRONG?
50 Do We Know / Can We Prove DIA Result: correct, complete, efficient? What machines / algorithms / Machine Learning / Black Boxes / DIA do? High Risk / High Reward Data- Driven Society Risk: drugs or medical advice that cause harm Reward: faster, cheaper, more effec9ve cancer cures, drug discovery, personalized medicine,
51 Professional Cau9ons Experienced prac99oners Medicine Few data- driven results opera9onalized Mount Sinai: no black box solu9ons Authorita9ve organiza9ons NIH, HHS, EOCD, Na9onal Sta9s9cal Organiza9ons, Legal: Algorithmic Accountability: John ZiLrain, Harvard Law School
52 Q: What could possibly go wrong? A: Every step Data Sets All measurements approximate: availability, quality, requirements, sparse/dense, ; How much can we tolerate? What is the impact on the result? Models: All models wrong George Box 1974 Methods Select (1,000s), tune, verify Different methods è radically different results Results: Probabilis9c, error bounds, verifica9on, Data Analysis is 20% of the story
53 Pre Big Data Challenges Science: Experimental design: hypotheses, null hypotheses, dependent and independent variables, controls, blocking, randomiza9on, repeatability, accuracy Analysis: models, methods Resources: cost, 9me, precision
54 +Big Data Challenges Pre- Big Data scale: volume, velocity, and variety Complexity Data: sources, meta- data, 3Vs Models (reflec9ng the domain) Methods (mul9variate palerns) beyond human cogni9on Results: Massive numbers of correla9ons Unreliability scale) Reliability decreases as the number of variables increases (mul9variate analysis) << 10 variables (science &drug discovery) è 1,000s to millions (Machine Learning) In science and medical research, we ve always known that Misunderstood: Self- service, Automated Data Science- in- a- Box 80% unfamiliar with sta9s9cs, error bars, causa9on/correla9on, probabilis9c reasoning, automated data cura9on and analysis Widespread use of DIA: self- service, democra9za9on of analy9cs Like monkeys playing with loaded guns
55 Somewhere over there
56 DIA Verifica9on Principles & Techniques Conven9onal disciplines Man- machine symbiosis DIA Result è Empirical evidence Flashlight analogy: DIA reduces hypothesis space Cross- valida9on Validate predic9ve model: avoid overfi ng, will model work on unseen data sets? Data par99ons: Training Set, Test /Valida9on Set, ground truth K- fold cross valida9on Research Direc9on New measures of significance, the next genera9on P value 21 st Century sta9s9cs
57 I Proposal SCIENTIFIC METHOD è EMPIRICISM DATA SCIENCE è DATA- INTENSIVE ANALYSIS
58 Data Science is A body of principles and techniques for applying data- intensive analysis for inves;ga;ng phenomena, acquiring new knowledge, and correc;ng and integra;ng previous knowledge with measures of correctness, completeness, and efficiency DIA: an experiment over data
59 Conclusions Big Data & Data- Intensive Analysis Value of evidence (from data) Emerging reasoning and problem solving paradigm High risk / high reward Substan9al results already In its infancy, not yet understood, decades to go Overhyped (short term) but may change our world (long term) è Need for Data Science = principles & guidelines We re now at the what are the principles? point in 9me M Jordan Decades of research and prac9ce
60 Thank You!
How To Understand The Big Data Paradigm
Big Data and Its Empiricist Founda4ons Teresa Scantamburlo The evolu4on of Data Science The mechaniza4on of induc4on The business of data The Big Data paradigm (data + computa4on) Cri4cal analysis Tenta4ve
Open Science, Big Data and Research Reproducibility. Tony Hey Senior Data Science Fellow escience Ins>tute University of Washington tony.hey@live.
Open Science, Big Data and Research Reproducibility Tony Hey Senior Data Science Fellow escience Ins>tute University of Washington [email protected] The Vision of Open Science Vision for a New Era of Research
Data analysis in Par,cle Physics
Data analysis in Par,cle Physics From data taking to discovery Tuesday, 13 August 2013 Lukasz Kreczko - Bristol IT MegaMeet 1 $ whoami Lukasz (Luke) Kreczko Par,cle Physicist Graduated in Physics from
Mission. To provide higher technological educa5on with quality, preparing. competent professionals, with sound founda5ons in science, technology
Mission To provide higher technological educa5on with quality, preparing competent professionals, with sound founda5ons in science, technology and innova5on, commi
Project Por)olio Management
Project Por)olio Management Important markers for IT intensive businesses Rest assured with Infolob s project management methodologies What is Project Por)olio Management? Project Por)olio Management (PPM)
Big Data. The Big Picture. Our flexible and efficient Big Data solu9ons open the door to new opportuni9es and new business areas
Big Data The Big Picture Our flexible and efficient Big Data solu9ons open the door to new opportuni9es and new business areas What is Big Data? Big Data gets its name because that s what it is data that
The Data Reservoir. 10 th September 2014. Mandy Chessell FREng CEng FBCS Dis4nguished Engineer, Master Inventor Chief Architect, Informa4on Solu4ons
Mandy Chessell FREng CEng FBCS Dis4nguished Engineer, Master Inventor Chief Architect, Solu4ons The Reservoir 10 th September 2014 A growing demand Business Teams want Open access to more informa4on More
AVOIDING SILOED DATA AND SILOED DATA MANAGEMENT
AVOIDING SILOED DATA AND SILOED DATA MANAGEMENT Dalton Cervo Author, Consultant, Management Expert September 2015 This presenta?on contains extracts from books that are: Copyright 2011 John Wiley & Sons,
Realm of Big Data Ini0a0ves
Realm of Big Data Ini0a0ves Kamlesh Mhashilkar Head - Analy0cs, Big Data and Informa0on Management (ABIM) Prac0ce TCS Digital Enterprise Copyright 2013 Tata Consultancy Services Limited 1 Realm of Big
Developing the Agile Mindset for Organiza7onal Agility. Shannon Ewan Managing Director, ICAgile @ShannonEwan, @ICAgile
Developing the Agile Mindset for Organiza7onal Agility Shannon Ewan Managing Director, ICAgile @ShannonEwan, @ICAgile 1 Who is here today? And Why? 2 To kick things off What is Agile? 3 Agile is a mindset
MSc Data Science at the University of Sheffield. Started in September 2014
MSc Data Science at the University of Sheffield Started in September 2014 Gianluca Demar?ni Lecturer in Data Science at the Informa?on School since 2014 Ph.D. in Computer Science at U. Hannover, Germany
Big Data Processing Experience in the ATLAS Experiment
Big Data Processing Experience in the ATLAS Experiment A. on behalf of the ATLAS Collabora5on Interna5onal Symposium on Grids and Clouds (ISGC) 2014 March 23-28, 2014 Academia Sinica, Taipei, Taiwan Introduction
Experiments on cost/power and failure aware scheduling for clouds and grids
Experiments on cost/power and failure aware scheduling for clouds and grids Jorge G. Barbosa, Al0no M. Sampaio, Hamid Harabnejad Universidade do Porto, Faculdade de Engenharia, LIACC Porto, Portugal, [email protected]
Information and Communications Technology Supply Chain Risk Management (ICT SCRM) AND NIST Cybersecurity Framework
Information and Communications Technology Supply Chain Risk Management (ICT SCRM) AND NIST Cybersecurity Framework Don t screw with my chain, dude! Jon Boyens Computer Security Division IT Laboratory November
CONTENTS. Introduc on 2. Undergraduate Program 4. BSC in Informa on Systems 4. Graduate Program 7. MSC in Informa on Science 7
1 1 2 CONTENTS Introducon 2 Undergraduate Program 4 BSC in Informaon Systems 4 Graduate Program 7 MSC in Informaon Science 7 MSC in Health Informacs 13 2 3 Introducon The School of Informaon Science at
CMMI for High-Performance with TSP/PSP
Dr. Kıvanç DİNÇER, PMP Hace6epe University Implemen@ng CMMI for High-Performance with TSP/PSP Informa@on Systems & SoFware The Informa@on Systems usage has experienced an exponen@al growth over the past
How To Teach Physics At The Lhc
LHC discoveries and Particle Physics Concepts for Education Farid Ould- Saada, University of Oslo On behalf of IPPOG EPS- HEP, Vienna, 25.07.2015 A successful program LHC data are successfully deployed
From Data to Foresight:
Laura Haas, IBM Fellow IBM Research - Almaden From Data to Foresight: Leveraging Data and Analytics for Materials Research 1 2011 IBM Corporation The road from data to foresight is long? Consumer Reports
Stream Deployments in the Real World: Enhance Opera?onal Intelligence Across Applica?on Delivery, IT Ops, Security, and More
Copyright 2015 Splunk Inc. Stream Deployments in the Real World: Enhance Opera?onal Intelligence Across Applica?on Delivery, IT Ops, Security, and More Stela Udovicic Sr. Product Marke?ng Manager Clayton
An Open Dynamic Big Data Driven Applica3on System Toolkit
An Open Dynamic Big Data Driven Applica3on System Toolkit Craig C. Douglas University of Wyoming and KAUST This research is supported in part by the Na3onal Science Founda3on and King Abdullah University
Data Mining. Supervised Methods. Ciro Donalek [email protected]. Ay/Bi 199ab: Methods of Computa@onal Sciences hcp://esci101.blogspot.
Data Mining Supervised Methods Ciro Donalek [email protected] Supervised Methods Summary Ar@ficial Neural Networks Mul@layer Perceptron Support Vector Machines SoLwares Supervised Models: Supervised
Effec%ve AX 2012 Upgrade Project Planning and Microso< Sure Step. Arbela Technologies
Effec%ve AX 2012 Upgrade Project Planning and Microso< Sure Step Arbela Technologies Why Upgrade? What to do? How to do it? Tools and templates Agenda Sure Step 2012 Ax2012 Upgrade specific steps Checklist
Transla6ng from Clinical Care to Research: Integra6ng i2b2 and OpenClinica
Transla6ng from Clinical Care to : Integra6ng i2b2 and OpenClinica Aaron Abend Managing Director, Recombinant Data Corp May 13, 2011 Copyright 2011 Recombinant Data Corp. All rights reserved. 1 About Recombinant
Graduate Systems Engineering Programs: Report on Outcomes and Objec:ves
Graduate Systems Engineering Programs: Report on Outcomes and Objec:ves Alice Squires, [email protected] Tim Ferris, David Olwell, Nicole Hutchison, Rick Adcock, John BrackeL, Mary VanLeer, Tom
Applying Machine Learning to Network Security Monitoring. Alex Pinto Chief Data Scien2st MLSec Project @alexcpsec @MLSecProject!
Applying Machine Learning to Network Security Monitoring Alex Pinto Chief Data Scien2st MLSec Project @alexcpsec @MLSecProject! whoami Almost 15 years in Informa2on Security, done a licle bit of everything.
Poten&al Impact of FDA Regula&on of EMRs. October 27, 2010
Poten&al Impact of FDA Regula&on of EMRs October 27, 2010 Agenda The case for regula&ng Impact on manufacturers Impact on providers Recommenda&ons and best prac&ces 2 A Medical Device Is an instrument,
benefit of virtualiza/on? Virtualiza/on An interpreter may not work! Requirements for Virtualiza/on 1/06/15 Which of the following is not a poten/al
1/06/15 Benefits of virtualiza/on Virtualiza/on Which of the following is not a poten/al benefit of virtualiza/on? A. cost effec/ve B. applica/on migra/on is easy C. improve applica/on performance D. run
BENCHMARKING V ISUALIZATION TOOL
Copyright 2014 Splunk Inc. BENCHMARKING V ISUALIZATION TOOL J. Green Computer Scien
Everything You Need to Know about Cloud BI. Freek Kamst
Everything You Need to Know about Cloud BI Freek Kamst Business Analy2cs Insight, Bussum June 10th, 2014 What s it all about? Has anything changed in the world of BI? Is Cloud Compu2ng a Hype or here to
The Real Score of Cloud
The Real Score of Cloud Mayur Sahni Sr. Research Manger IDC Asia/Pacific [email protected] @mayursahni Digital Transformation Changing Role of IT Innova&on Informa&on Business agility Changing role of the
Introduc)on to the IoT- A methodology
10/11/14 1 Introduc)on to the IoTA methodology Olivier SAVRY CEA LETI 10/11/14 2 IoTA Objec)ves Provide a reference model of architecture (ARM) based on Interoperability Scalability Security and Privacy
Clusters in the Cloud
Clusters in the Cloud Dr. Paul Coddington, Deputy Director Dr. Shunde Zhang, Compu:ng Specialist eresearch SA October 2014 Use Cases Make the cloud easier to use for compute jobs Par:cularly for users
Science Gateways What are they and why are they having such a tremendous impact on science? Nancy Wilkins- Diehr [email protected]
Science Gateways What are they and why are they having such a tremendous impact on science? Nancy Wilkins- Diehr [email protected] What is a science gateway? science gateway /sī əәns gāt wā / n. 1. an
NERSC Data Efforts Update Prabhat Data and Analytics Group Lead February 23, 2015
NERSC Data Efforts Update Prabhat Data and Analytics Group Lead February 23, 2015-1 - A little bit about myself Computer Scien.st Brown, IIT Delhi Real- 3me Graphics, Virtual Reality, HCI Computa3onal
Data Management in the Cloud: Limitations and Opportunities. Annies Ductan
Data Management in the Cloud: Limitations and Opportunities Annies Ductan Discussion Outline: Introduc)on Overview Vision of Cloud Compu8ng Managing Data in The Cloud Cloud Characteris8cs Data Management
Testing the In-Memory Column Store for in-database physics analysis. Dr. Maaike Limper
Testing the In-Memory Column Store for in-database physics analysis Dr. Maaike Limper About CERN CERN - European Laboratory for Particle Physics Support the research activities of 10 000 scientists from
High Value Manufacturing Catapult. Paul John, Business Director
High Value Manufacturing Catapult Paul John, Business Director Contents Overview of HVM Catapult Summary of capability Existing relationships Key points Overview of the HVM Catapult An instrument of the
HIPAA Breaches, Security Risk Analysis, and Audits
HIPAA Breaches, Security Risk Analysis, and Audits Derrick Hill Senior Health IT Advisor Kentucky REC What cons?tutes PHI? HIPAA provides a list of 18 iden?fiers that cons?tute PHI. Any one of these iden?fiers
Integrating Genetic Data into Clinical Workflow with Clinical Decision Support Apps
White Paper Healthcare Integrating Genetic Data into Clinical Workflow with Clinical Decision Support Apps Executive Summary The Transformation Lab at Intermountain Healthcare in Salt Lake City, Utah,
Performance Management in Big Data Applica6ons. Michael Kopp, Technology Strategist @mikopp
Performance Management in Big Data Applica6ons Michael Kopp, Technology Strategist NoSQL: High Volume/Low Latency DBs Web Java Key Challenges 1) Even Distribu6on 2) Correct Schema and Access paperns 3)
Big Data Use Cases. At Salesforce.com. Narayan Bharadwaj Director, Product Management Salesforce.com. @nadubharadwaj
Big Data Use Cases At Salesforce.com Narayan Bharadwaj Director, Product Management Salesforce.com @nadubharadwaj Safe harbor Safe harbor statement under the Private Securi9es Li9ga9on Reform Act of 1995:
Program Model: Muskingum University offers a unique graduate program integra6ng BUSINESS and TECHNOLOGY to develop the 21 st century professional.
Program Model: Muskingum University offers a unique graduate program integra6ng BUSINESS and TECHNOLOGY to develop the 21 st century professional. 163 Stormont Street New Concord, OH 43762 614-286-7895
How To Use Splunk For Android (Windows) With A Mobile App On A Microsoft Tablet (Windows 8) For Free (Windows 7) For A Limited Time (Windows 10) For $99.99) For Two Years (Windows 9
Copyright 2014 Splunk Inc. Splunk for Mobile Intelligence Bill Emme< Director, Solu?ons Marke?ng Panos Papadopoulos Director, Product Management Disclaimer During the course of this presenta?on, we may
Strategy and Architecture to Establish 'Smart Plants'
Strategy and Architecture to Establish 'Smart Plants' About Intrigo We are a solu*on provider of Business Applica:ons focused on orchestra*ng Customer Value Networks in the changing SAP Enterprise technology
Solving Healthcare Problems with Big Data. Alicia d Empaire
Solving Healthcare Problems with Big Data Alicia d Empaire Agenda Ø Introduction of Baptist Health South Florida Ø Healthcare Changes and Challenges Ø Role of Technology and Analytics Ø BHSF Big Data Analytics
The 3 questions to ask yourself about BIG DATA
The 3 questions to ask yourself about BIG DATA Do you have a big data problem? Companies looking to tackle big data problems are embarking on a journey that is full of hype, buzz, confusion, and misinformation.
How To Change Medicine
P4 Medicine: Personalized, Predictive, Preventive, Participatory A Change of View that Changes Everything Leroy E. Hood Institute for Systems Biology David J. Galas Battelle Memorial Institute Version
