!"#$%&'()*+*,-%')').*/-(.-%0*/-121)&1-23
|
|
|
- Vincent Fowler
- 10 years ago
- Views:
Transcription
1 !"#$%&'()*+*,-%')').*/-(.-%0*/-121)&1-23 Ivo D. Dinov Assoc. Director MIDAS Ed & Training Erin Shellman Research Scientist AWS Patrick Harrington Co-Founder Chief Data Scientist CompGenome.com Nandit Soparkar CEO, Ubiquiti
2 MIDAS Data Science Education & Training: Challenges & Opportunities Ivo D. Dinov Associate Professor, School of Nursing Associate Education Director, Michigan Institute for Data Science (MIDAS) University of Michigan
3 4%&'()%5*6'.*7%&%*8$'1)$1** 9#--'$#5%*9()2&155%&'()3 MIDAS Recently established Data Science institutes and curricular programs!
4 :;7<8*6'.*7%&%*!$(2=2&103 MIDAS
5 o! Listening: streams, analyzing sentiment, intent and trends; o! Looking: searching, indexing and memory management of heterogeneous datasets; Raw, derived or indexed data as well as meta-data; o! Programming: Handling Map-Reduce/HDFS, No-SQL DB, protocol provenance, algorithm development/optimization, pipeline workflows; o! Inferring: Principles of data analyses, Bayesian modeling, inference, uncertainty and quantification of likelihoods; Reasoning & logic; Analytics: Regression, feature selection, dimensionality reduction, temporal patterns, validation; Actionable Knowledge o! Machine Learning: Classification, clustering, mining, information extraction, knowledge retrieval, decision making; o! Predicting: Forecasting, neural models, deep learning, and research topics; o! Summarizing: Presentation of data, processing protocol, analytics provenance, visualization, synthesis
6 o! Technological advances and students IT skills far outpace educators technological expertise and the current rate of IT adoption in college curricula o! Lack of open learning resources and interactive interoperable platforms o! Difficulty sharing data, tools, materials & activities across different disciplines o! Limited technology-enhanced continuing instructor education opportunities o! Discipline-specific knowledge boundaries o! Skills for communication and teamwork involving cooperation in dynamic groups of dispersed researchers o! Ability to aggregate & harmonize data (e.g., fusion of qualitative and quantitative elements).
7 o! Drivers: Motivations, datasets (6D of Big Data), challenges, applications! o! Methods: foundational and trans-disciplinary scientific techniques! o! Tools: web-services, software, code, platforms! o! Analytics: practice of data interrogation!
8 !D'2&1)&*7%&%*8$'1)$1*!"#$%&'()*+*,-%')').3 o! Undergraduate DS Degree Program! o! Stats + Engineering! o! Started Fall 2015! o! o! DS Summer Institute (Public Health)! o! 20+ faculty, 40+ trainees! o! Full support/in residence (1 month)! o! BigDataSummerInst.sph.umich.edu! o! Big Data Summer Bootcamp! o! 5 years running, Business- Engineering! o! Practice of Econ-Bio-Social Analytics! o! ibug-um15.github.io/2015-summercamp! o! Graduate DS Certificate Program! o! Transdisciplinary (SM,SPH,LS&A,SN,CoE,SI)! o! 12 cr, Modeling+Technology +Practice! o!
9 :;7<8*!"#$%&'()*+*,-%')').*EF(').*>(-G%-"H 3 o! Graduate Data Science Certificate Program! o! Enrollment of 100 (UMich) students! o! Develop the Online Graduate Data Science Certificate Curriculum! o! Develop a 5-yr dual undergrad-grad MS Program in Data Science! o! DS Summer Institute (Public Health)! o! Big Data Summer Bootcamp! o! MIDAS Seminar Series (academia, industry, government, partners)! o! Web-streamed and archived for broader impact! o! National Events! o! BD2K, Midwest Big-Data-Hub, Math, Stats, Computer Vision, Machine Learning, Informatics,! o! Funding: NIH, NSF, Foundation: Research & Traineeship Applications!
10 :'$C'.%)*7'I1-1)$1*')*3 7%&%*8$'1)$1*!"#$%&'()*+*,-%')').3 o! Public-Private-Partnership (PPP) Model! o! Integration of Research + Practice + Education! Industry Community UMich Fed Orgs o! Problems + Modeling + Technology + Applications! AtheyLab!
11 ;)&1.-%&'()*(L*!"#$%&'()M*K121%-$C*+*/-%$&'$13 o! Probability and Statistics EBook! o! Motivation, Methods, Techniques, Practice! o! wiki.socr.umich.edu/index.php/ebook! o! Data sets! o! 100 s of Research, Observed & Simulated! o! wiki.socr.umich.edu/index.php/ SOCR_Data! o! Hands-on Activities (Team-focused)! o! Modeling, Inference, Concept Demos! o! wiki.socr.umich.edu/index.php/socr_edumaterials! o! Applets and Webapps! o! Over 500 Web tools! o! Distributed services based on Java, HTML5/ Features: Free, Cloud-service, no-barrier access, LGPL/ JavaScript! CC-BY! Ref: Dinov et al., TS (2013), DOI: /test.12012! SOCR has over 9.6M users worldwide since 2002!
12
13 6'.*7%&%*<)%5=&'$23 o! Web-service combining and integrating multi-source socioeconomic and medical datasets! o! Big data analytic processing! o! Interface for exploratory navigation, manipulation and visualization! o! Adding/removing of visual queries and interactive exploration of multivariate associations! o! Powerful HTML5 technology enabling mobile on-demand computing! Husain, et al., 2015, J Big Data!
14 :;7<8*J)5')1*7%&%*8$'1)$1*!"#$%&'()*/-(.-%0*E78!/H3 MIDAS plans to: o! Develop new hands-on MOOC DS courses (data, learning modules, services, code snippets, applications/partners) o! Deploy MIDAS Training as a Service (TaaS) MOOC platform o! Compile Datasets (research-derived, observational, simulated) identify, aggregate, manage, navigate, and service exemplary Big Data sets o! Faculty/Mentors instructors, collaborators, partners, students and staff to develop a core DS course-series (4 courses) o! Instructional resources and complete end-to-end learning modules o! Track and Validate the DSEP Program annual reports (MIDAS ETC), review of evaluations, sustainability (financial, resources, space), impact
15 How to Engage & Partner in MIDAS Education? Partners! o! Open-Source Community! o! Academia! o! Trainees! o! Industry! o! Non-profit! o! Government! o! Philanthropy! o! Community Orgs! Activities!! Drivers! o! Challenges! o! Datasets! o! Applications! Methods!! Technologies!! Tools/ Services!! Training Opps! o! Mentorship! o! DS Projects! o! Internships!
16 :;7<8*!"#$%&'()*+*,-%')').*,1%0 3 Engineering MIDAS Education & Training Committee Ivo Dinov, Margaret Hedstrom, Honglak Lee, Sebastian Zöllner, Richard Gonzalez, Kerby Shedden Other Contributing MIDAS Faculty Alfred Hero: Electrical Engineering and Computer Science; Biomedical Engineering, Stats H. V. Jagadish: Electrical Engineering and Computer Science Mike Cafarella: Computer Science and Engineering Karthik Duraisamy: Atmospheric, Oceanic, and Space Sciences Judy Jin: Industrial & Operations Engineering Dragomir Radev: School of Information; Computer Science and Engineering; Linguistics Information & Health Sciences Brian Athey: Computational Medicine and Bioinformatics, Medicine Carl Lagoze: School of Information Qiaozhu Mei: School of Information Jeremy Taylor, Biostatistics, Public Health Basic Sciences Vijay Nair: Statistics & Engineering George Alter: Institute for Social Research, History Christopher Miller: Physics & Astronomy August Evrard: Physics & Astronomy Anna Gilbert: Mathematics, Engineering Stephen Smith: Ecology and Evolutionary Biology Ambuj Tewari: Statistics; Computer Science and Engineering Contact [email protected]! [email protected]
17 !"#$%&'()*+*,-%')').*/-(.-%0*/-121)&1-23 Ivo D. Dinov Assoc. Director MIDAS Ed & Training Erin Shellman Research Scientist, AWS Nandit Soparkar CEO, Ubiquiti Patrick Harrington Co-Founder/Chief Data Scientist CompGenome.com
18 !"#$%&'()*+*,-%')').*/-(.-%0*/-121)&1-23 Ivo D. Dinov Assoc. Director MIDAS Ed & Training Erin Shellman Research Scientist AWS Patrick Harrington Co-Founder Chief Data Scientist CompGenome.com Nandit Soparkar CEO, Ubiquiti
19 6'.*7%&%*8$'1)$13 From a descriptive to a constructive definition of Big Data! IBM s 4V s volume, variety, velocity (speed) and veracity (reliability)! MIDAS Big Data Characterization identifies gaps, challenges & needs Mixture of quantitative! & qualitative estimates! Dinov, et al. (2014)! Size Expected 2016 Increase 2018 Incompleteness Example: analyzing observational data of 1,000 s Parkinson s disease patients based on 10,000 s signature biomarkers derived from multi-source imaging, genetics, clinical, physiologic, phenomics and demographic! data elements.!! Software developments, student training, service platforms and methodological advances associated with the Big Data Discovery Science all present existing opportunities for learners, educators, researchers, practitioners and policy makers!
20 6E+15 Walter, Sci Am, 2005! storage doubles every 1.2 yrs! Computational power! double each 1.6 years! 4E+15 2E µm µm 100 µm Increase of Image Resolution! " Time "! 1mm 1cm Cryo_Color Cryo_Color Cryo_Short Gryo_Byte Gryo_Byte Cryo_Short Moore s Law (1000'sTrans/CPU) Genomics_BP(GB) Neuroimaging(GB) Data volume! Increases faster! than computational power! Dinov, et al., 2014! Neuroimaging(GB) Genomics_BP(GB) Moore s Law (1000'sTrans/CPU)
21 7%&%*8$'1)$1*,-%')').*/-(.-%0P*9(-1*9#--'$#5#03 Themes! Training! Examples! Data Scrubbing! Data Mining! Data Inference! Big Data Management! Big Data Representation! Mining Social- Network Graphs! Modeling! Link Analysis! Clustering! Classification! Dimensionality Reduction! Big Data technology, Searching, indexing, memory management, Information extraction, feature selection, Supervised and unsupervised-learning, stream mining Matrix Representation of Sets, Minhashing, Jaccard Similarity, Distance Measures, Euclidean Distances, Jaccard Distance, Cosine Distance, Edit Distance, Hamming Distance, Networks, Graph similarity, Sets as Strings, Prefix Indexing, Data Streams and Processing, Representative Samples, Bloom Filter, Flajolet-Martin Algorithm, TF/IDF, MoM, MLE, Alon-Matias-Szegedy Algorithm for Second Moments, Datar-Gionis-Indyk-Motwani Algorithm (DGIM)! Bio-Social Network Graphs, Root-Mean-Square Error, UV-Decomposition, The NetFlix Challenge, Tweeter Challenge, Distances and Clustering of Social-Network Graphs, Network Cluster Betweenness, Complete Bipartite Subgraphs, Graph Partition, Matrices and Graphs, Eigen-values/ vectors of the Laplacian Matrix, Graph Neighborhood Properties, Diameter of a Graph, Transitive Closure and Reachability, Crowdsourcing Algorithms Statistical Modeling, Machine Learning, Computational Modeling, Feature Extraction, Power Laws, Map-Reduce/Hadoop! PageRank, Spider Traps, Taxation and loops in Big Data traversal, Random Walks! Curse of Dimensionality, Classification and Regression Trees/Random Forests, Hierarchical Clustering, K-means Algorithms, Bradley, Fayyad, and Reina (BFR) Algorithm, CURE Algorithm, Clusters in the GRGPF Algorithm! Random Forest, SVM, Neural Networks, Latent Class Models, Finite Mixture Models! Eigenvalues and Eigenvectors, Principal-Component Analysis, Singular-Value Decomposition, Wrapper-Based vs Filter-Based Feature Selection, Independent Components Analysis, Multidimensional Scaling, CUR Decomposition!
22 BD I K A Big Data Information Knowledge Action Raw Observations Processed Data Maps, Models Actionable Decisions Data Aggregation Data Fusion Causal Inference Treatment Regimens Data Scrubbing Summary Stats Networks, Analytics Forecasts, Predictions Semantic-Mapping Derived Biomarkers Linkages, Associations Healthcare Outcomes
Ann$Arbor$is$Already$Recognized$ $as$a$ Big$Data$Powerhouse $
Ann$Arbor$is$Already$Recognized$ $as$a$ Big$Data$Powerhouse $ Ann Arbor is one of top 5 US Cities for Big Data1 Ranking criteria: Data openness of city across 19 types of data Economic conditions for innovation
RFI Summary: Executive Summary
RFI Summary: Executive Summary On February 20, 2013, the NIH issued a Request for Information titled Training Needs In Response to Big Data to Knowledge (BD2K) Initiative. The response was large, with
Collaborations between Official Statistics and Academia in the Era of Big Data
Collaborations between Official Statistics and Academia in the Era of Big Data World Statistics Day October 20-21, 2015 Budapest Vijay Nair University of Michigan Past-President of ISI [email protected] What
Learning outcomes. Knowledge and understanding. Competence and skills
Syllabus Master s Programme in Statistics and Data Mining 120 ECTS Credits Aim The rapid growth of databases provides scientists and business people with vast new resources. This programme meets the challenges
Predictive Analytics Certificate Program
Information Technologies Programs Predictive Analytics Certificate Program Accelerate Your Career Offered in partnership with: University of California, Irvine Extension s professional certificate and
Big Data Challenges: Data Management, Analytics & Security
Big Data Challenges: Data Management, Analytics & Security Ivo D. Dinov Statistics Online Computational Resource University of Michigan www.socr.umich.edu Big Data Challenges Availability, Sharing, Aggregation
BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376
Course Director: Dr. Kayvan Najarian (DCM&B, [email protected]) Lectures: Labs: Mondays and Wednesdays 9:00 AM -10:30 AM Rm. 2065 Palmer Commons Bldg. Wednesdays 10:30 AM 11:30 AM (alternate weeks) Rm.
ANALYTICS CENTER LEARNING PROGRAM
Overview of Curriculum ANALYTICS CENTER LEARNING PROGRAM The following courses are offered by Analytics Center as part of its learning program: Course Duration Prerequisites 1- Math and Theory 101 - Fundamentals
COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments
Contents List of Figures Foreword Preface xxv xxiii xv Acknowledgments xxix Chapter 1 Fraud: Detection, Prevention, and Analytics! 1 Introduction 2 Fraud! 2 Fraud Detection and Prevention 10 Big Data for
Statistics for BIG data
Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before
Proposal for Undergraduate Certificate in Large Data Analysis
Proposal for Undergraduate Certificate in Large Data Analysis To: Helena Dettmer, Associate Dean for Undergraduate Programs and Curriculum From: Suely Oliveira (Computer Science), Kate Cowles (Statistics),
Course Requirements for the Ph.D., M.S. and Certificate Programs
Health Informatics Course Requirements for the Ph.D., M.S. and Certificate Programs Health Informatics Core (6 s.h.) All students must take the following two courses. 173:120 Principles of Public Health
OUTLINE FOR AN INTERDISCIPLINARY CERTIFICATE PROGRAM
OUTLINE FOR AN INTERDISCIPLINARY CERTIFICATE PROGRAM I. Basic Information 1. Institution: University of Georgia Date: September 30, 2015 2. School/College: Franklin College of Arts and Sciences 3. Department/Division:
COMP9321 Web Application Engineering
COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 11 (Part II) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411
Data, Measurements, Features
Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are
Big Data and Healthcare Payers WHITE PAPER
Knowledgent White Paper Series Big Data and Healthcare Payers WHITE PAPER Summary With the implementation of the Affordable Care Act, the transition to a more member-centric relationship model, and other
Information Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli ([email protected])
BIOINF 525 Winter 2016 Foundations of Bioinformatics and Systems Biology http://tinyurl.com/bioinf525-w16
Course Director: Dr. Barry Grant (DCM&B, [email protected]) Description: This is a three module course covering (1) Foundations of Bioinformatics, (2) Statistics in Bioinformatics, and (3) Systems
SAS JOINT DATA MINING CERTIFICATION AT BRYANT UNIVERSITY
SAS JOINT DATA MINING CERTIFICATION AT BRYANT UNIVERSITY Billie Anderson Bryant University, 1150 Douglas Pike, Smithfield, RI 02917 Phone: (401) 232-6089, e-mail: [email protected] Phyllis Schumacher
Introduction to Data Mining
Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:
Doctor of Philosophy in Computer Science
Doctor of Philosophy in Computer Science Background/Rationale The program aims to develop computer scientists who are armed with methods, tools and techniques from both theoretical and systems aspects
An interdisciplinary model for analytics education
An interdisciplinary model for analytics education Raffaella Settimi, PhD School of Computing, DePaul University Drew Conway s Data Science Venn Diagram http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
Graduate Co-op Students Information Manual. Department of Computer Science. Faculty of Science. University of Regina
Graduate Co-op Students Information Manual Department of Computer Science Faculty of Science University of Regina 2014 1 Table of Contents 1. Department Description..3 2. Program Requirements and Procedures
Big Data and Analytics: Challenges and Opportunities
Big Data and Analytics: Challenges and Opportunities Dr. Amin Beheshti Lecturer and Senior Research Associate University of New South Wales, Australia (Service Oriented Computing Group, CSE) Talk: Sharif
Teaching Scheme Credits Assigned Course Code Course Hrs./Week. BEITC802 Big Data 04 02 --- 04 01 --- 05 Analytics. Theory Marks
Teaching Scheme Credits Assigned Course Code Course Hrs./Week Name Theory Practical Tutorial Theory Practical/Oral Tutorial Tota l BEITC802 Big Data 04 02 --- 04 01 --- 05 Analytics Examination Scheme
Data Analytics at NICTA. Stephen Hardy National ICT Australia (NICTA) [email protected]
Data Analytics at NICTA Stephen Hardy National ICT Australia (NICTA) [email protected] NICTA Copyright 2013 Outline Big data = science! Data analytics at NICTA Discrete Finite Infinite Machine Learning
Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.
Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are
Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Oman College of Management and Technology Course 803401 DSS Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization CS/MIS Department Information Sharing
Computational Science and Informatics (Data Science) Programs at GMU
Computational Science and Informatics (Data Science) Programs at GMU Kirk Borne George Mason University School of Physics, Astronomy, & Computational Sciences http://spacs.gmu.edu/ Outline Graduate Program
Principles of Data Mining by Hand&Mannila&Smyth
Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences
Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Turban, Aronson, and Liang Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Master of Science in Computer Science
Master of Science in Computer Science Background/Rationale The MSCS program aims to provide both breadth and depth of knowledge in the concepts and techniques related to the theory, design, implementation,
KATE GLEASON COLLEGE OF ENGINEERING. John D. Hromi Center for Quality and Applied Statistics
ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM KATE GLEASON COLLEGE OF ENGINEERING John D. Hromi Center for Quality and Applied Statistics NEW (or REVISED) COURSE (KGCOE- CQAS- 747- Principles of
Master of Science in Health Information Technology Degree Curriculum
Master of Science in Health Information Technology Degree Curriculum Core courses: 8 courses Total Credit from Core Courses = 24 Core Courses Course Name HRS Pre-Req Choose MIS 525 or CIS 564: 1 MIS 525
Use advanced techniques for summary and visualization of complex data for exploratory analysis and presentation.
MS Biostatistics MS Biostatistics Competencies Study Development: Work collaboratively with biomedical or public health researchers and PhD biostatisticians, as necessary, to provide biostatistical expertise
CS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 Real-Time Systems. CSCI 522 High Performance Computing
CS Master Level Courses and Areas The graduate courses offered may change over time, in response to new developments in computer science and the interests of faculty and students; the list of graduate
Ph.D. in Bioinformatics and Computational Biology Degree Requirements
Ph.D. in Bioinformatics and Computational Biology Degree Requirements Credits Students pursuing the doctoral degree in BCB must complete a minimum of 90 credits of relevant work beyond the bachelor s degree;
An Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
Graduate Certificate in Systems Engineering
Graduate Certificate in Systems Engineering Systems Engineering is a multi-disciplinary field that aims at integrating the engineering and management functions in the development and creation of a product,
GRADUATE CATALOG LISTING
GRADUATE CATALOG LISTING 1 BIOINFORMATICS & COMPUTATIONAL BIOLOGY Telephone: (302) 831-0161 http://bioinformatics.udel.edu/education Faculty Listing: http://bioinformatics.udel.edu/education/faculty A.
NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE. Venu Govindaraju
NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE Venu Govindaraju BIOMETRICS DOCUMENT ANALYSIS PATTERN RECOGNITION 8/24/2015 ICDAR- 2015 2 Towards a Globally Optimal Approach for Learning Deep Unsupervised
Machine Learning and Data Analysis overview. Department of Cybernetics, Czech Technical University in Prague. http://ida.felk.cvut.
Machine Learning and Data Analysis overview Jiří Kléma Department of Cybernetics, Czech Technical University in Prague http://ida.felk.cvut.cz psyllabus Lecture Lecturer Content 1. J. Kléma Introduction,
An Introduction to Data Mining
An Introduction to Intel Beijing [email protected] January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing
Introduction to Data Mining and Machine Learning Techniques Iza Moise, Evangelos Pournaras, Dirk Helbing Iza Moise, Evangelos Pournaras, Dirk Helbing 1 Overview Main principles of data mining Definition
Is a Data Scientist the New Quant? Stuart Kozola MathWorks
Is a Data Scientist the New Quant? Stuart Kozola MathWorks 2015 The MathWorks, Inc. 1 Facts or information used usually to calculate, analyze, or plan something Information that is produced or stored by
Introduction to Data Mining
Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association
Information Visualization WS 2013/14 11 Visual Analytics
1 11.1 Definitions and Motivation Lot of research and papers in this emerging field: Visual Analytics: Scope and Challenges of Keim et al. Illuminating the path of Thomas and Cook 2 11.1 Definitions and
DATA SCIENCE CURRICULUM WEEK 1 ONLINE PRE-WORK INSTALLING PACKAGES COMMAND LINE CODE EDITOR PYTHON STATISTICS PROJECT O5 PROJECT O3 PROJECT O2
DATA SCIENCE CURRICULUM Before class even begins, students start an at-home pre-work phase. When they convene in class, students spend the first eight weeks doing iterative, project-centered skill acquisition.
Health Informatics Student Handbook
Health Informatics Student Handbook Interdisciplinary Graduate Program in Informatics (IGPI) Please see the Interdisciplinary Graduate Program in Informatics Handbook for more information regarding our
Research-based Learning (RbL) in Computing Courses for Senior Engineering Students
Research-based Learning (RbL) in Computing Courses for Senior Engineering Students Khaled Bashir Shaban, and Mahmoud Abdulwahed Computer Science and Engineering Department; and CRU, Dean s Office Best
Master of Science (MS) in Biostatistics 2014-2015. Program Director and Academic Advisor:
Master of Science (MS) in Biostatistics 014-015 Note: All curriculum revisions will be updated immediately on the website http://publichealh.gwu.edu Program Director and Academic Advisor: Dante A. Verme,
Data Science and Business Analytics Certificate Data Science and Business Intelligence Certificate
Data Science and Business Analytics Certificate Data Science and Business Intelligence Certificate Description The Helzberg School of Management has launched two graduate-level certificates: one in Data
Data Mining for Customer Service Support. Senioritis Seminar Presentation Megan Boice Jay Carter Nick Linke KC Tobin
Data Mining for Customer Service Support Senioritis Seminar Presentation Megan Boice Jay Carter Nick Linke KC Tobin Traditional Hotline Services Problem Traditional Customer Service Support (manufacturing)
Why is Internal Audit so Hard?
Why is Internal Audit so Hard? 2 2014 Why is Internal Audit so Hard? 3 2014 Why is Internal Audit so Hard? Waste Abuse Fraud 4 2014 Waves of Change 1 st Wave Personal Computers Electronic Spreadsheets
How To Get A Computer Science Degree At Appalachian State
118 Master of Science in Computer Science Department of Computer Science College of Arts and Sciences James T. Wilkes, Chair and Professor Ph.D., Duke University [email protected] http://www.cs.appstate.edu/
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
Data Mining and Machine Learning in Bioinformatics
Data Mining and Machine Learning in Bioinformatics PRINCIPAL METHODS AND SUCCESSFUL APPLICATIONS Ruben Armañanzas http://mason.gmu.edu/~rarmanan Adapted from Iñaki Inza slides http://www.sc.ehu.es/isg
Information and Decision Sciences (IDS)
University of Illinois at Chicago 1 Information and Decision Sciences (IDS) Courses IDS 400. Advanced Business Programming Using Java. 0-4 Visual extended business language capabilities, including creating
2015 Workshops for Professors
SAS Education Grow with us Offered by the SAS Global Academic Program Supporting teaching, learning and research in higher education 2015 Workshops for Professors 1 Workshops for Professors As the market
Data-intensive HPC: opportunities and challenges. Patrick Valduriez
Data-intensive HPC: opportunities and challenges Patrick Valduriez Big Data Landscape Multi-$billion market! Big data = Hadoop = MapReduce? No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard,
The Data Engineer. Mike Tamir Chief Science Officer Galvanize. Steven Miller Global Leader Academic Programs IBM Analytics
The Data Engineer Mike Tamir Chief Science Officer Galvanize Steven Miller Global Leader Academic Programs IBM Analytics Alessandro Gagliardi Lead Faculty Galvanize Businesses are quickly realizing that
Chapter 5. Warehousing, Data Acquisition, Data. Visualization
Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization 5-1 Learning Objectives
PROGRAM DIRECTOR: Arthur O Connor Email Contact: URL : THE PROGRAM Careers in Data Analytics Admissions Criteria CURRICULUM Program Requirements
Data Analytics (MS) PROGRAM DIRECTOR: Arthur O Connor CUNY School of Professional Studies 101 West 31 st Street, 7 th Floor New York, NY 10001 Email Contact: Arthur O Connor, [email protected] URL:
I. Justification and Program Goals
MS in Data Science proposed by Department of Computer Science, B. Thomas Golisano College of Computing and Information Sciences Department of Information Sciences and Technologies, B. Thomas Golisano College
Scholarship for Service (SFS) PhD with Information Assurance (IA) Concentration*
Scholarship for Service (SFS) PhD with Information Assurance (IA) Concentration* University of North Texas (UNT) received a NSF award to provide scholarships, including stipends, tuition, health insurance,
MS1b Statistical Data Mining
MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to
National and Transnational Security Implications of Big Data in the Life Sciences
Prepared by the American Association for the Advancement of Science in conjunction with the Federal Bureau of Investigation and the United Nations Interregional Crime and Justice Research Institute National
Statistical Models in Data Mining
Statistical Models in Data Mining Sargur N. Srihari University at Buffalo The State University of New York Department of Computer Science and Engineering Department of Biostatistics 1 Srihari Flood of
Data Mining + Business Intelligence. Integration, Design and Implementation
Data Mining + Business Intelligence Integration, Design and Implementation ABOUT ME Vijay Kotu Data, Business, Technology, Statistics BUSINESS INTELLIGENCE - Result Making data accessible Wider distribution
Introduction to Data Science: CptS 483-06 Syllabus First Offering: Fall 2015
Course Information Introduction to Data Science: CptS 483-06 Syllabus First Offering: Fall 2015 Credit Hours: 3 Semester: Fall 2015 Meeting times and location: MWF, 12:10 13:00, Sloan 163 Course website:
Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD
Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,
Pipeline Pilot Enterprise Server. Flexible Integration of Disparate Data and Applications. Capture and Deployment of Best Practices
overview Pipeline Pilot Enterprise Server Pipeline Pilot Enterprise Server (PPES) is a powerful client-server platform that streamlines the integration and analysis of the vast quantities of data flooding
Data Science, Predictive Analytics & Big Data Analytics Solutions. Service Presentation
Data Science, Predictive Analytics & Big Data Analytics Solutions Service Presentation Did You Know That According to the new research from GE and Accenture*: 87% of companies believe Big Data analytics
Data Science Certificate Program
Information Technologies Programs Data Science Certificate Program Accelerate Your Career extension.uci.edu/datascience Offered in partnership with University of California, Irvine Extension s professional
Big Data Text Mining and Visualization. Anton Heijs
Copyright 2007 by Treparel Information Solutions BV. This report nor any part of it may be copied, circulated, quoted without prior written approval from Treparel7 Treparel Information Solutions BV Delftechpark
SURVEY REPORT DATA SCIENCE SOCIETY 2014
SURVEY REPORT DATA SCIENCE SOCIETY 2014 TABLE OF CONTENTS Contents About the Initiative 1 Report Summary 2 Participants Info 3 Participants Expertise 6 Suggested Discussion Topics 7 Selected Responses
Data Mining Solutions for the Business Environment
Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania [email protected] Over
Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012
Clustering Big Data Anil K. Jain (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012 Outline Big Data How to extract information? Data clustering
Database Marketing, Business Intelligence and Knowledge Discovery
Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski
Winter 2016 Course Timetable. Legend: TIME: M = Monday T = Tuesday W = Wednesday R = Thursday F = Friday BREATH: M = Methodology: RA = Research Area
Winter 2016 Course Timetable Legend: TIME: M = Monday T = Tuesday W = Wednesday R = Thursday F = Friday BREATH: M = Methodology: RA = Research Area Please note: Times listed in parentheses refer to the
Predictive Big Data Analytics: Imaging-Genetics Fundamentals, Research Challenges, & Opportunities. Outline
Predictive Big Data Analytics: Imaging-Genetics Fundamentals, Research Challenges, & Opportunities Ivo D. Dinov Statistics Online Computational Resource Michigan Institute for Data Science Health Behavior
Course Descriptions: Undergraduate/Graduate Certificate Program in Data Visualization and Analysis
9/3/2013 Course Descriptions: Undergraduate/Graduate Certificate Program in Data Visualization and Analysis Seton Hall University, South Orange, New Jersey http://www.shu.edu/go/dava Visualization and
The Use of Open Source Is Growing. So Why Do Organizations Still Turn to SAS?
Conclusions Paper The Use of Open Source Is Growing. So Why Do Organizations Still Turn to SAS? Insights from a presentation at the 2014 Hadoop Summit Featuring Brian Garrett, Principal Solutions Architect
Let the data speak to you. Look Who s Peeking at Your Paycheck. Big Data. What is Big Data? The Artemis project: Saving preemies using Big Data
CS535 Big Data W1.A.1 CS535 BIG DATA W1.A.2 Let the data speak to you Medication Adherence Score How likely people are to take their medication, based on: How long people have lived at the same address
University of Manchester Health Data Science Masters Modules
University of Manchester Health Data Science Masters Modules We are taking applications now for Masters CPD modules beginning in February. All modules are 15 credits and cost 750. Timetable is as follows
Proposal for New Program: BS in Data Science: Computational Analytics
Proposal for New Program: BS in Data Science: Computational Analytics 1. Rationale... The proposed Data Science: Computational Analytics major is designed for students interested in developing expertise
