DSSP Data Science Starter Program - Polytechnique
- Angelina McCarthy
A novel professional training on Data Science and Big Data, offered jointly by the Applied Mathematics and Informatics Departments of École Polytechnique.

Year 1 / October 3 - December 13, 2014

1. Target Audience and Prerequisite(s)

The proposed modules are suitable for anyone with some basic knowledge of Computer Science or Statistics. No programming experience is required. The program is designed for individuals (researchers and practitioners). The concepts and training delivered in this program enable a sound understanding of the context and challenges of Big Data, which shapes the evolution of the sciences and of many business domains. The program suits both early-career professionals and senior managers who need an understanding of this challenging area and its applications.

2. Data Science Starter Program

The training program is aimed at professionals and executives and comprises taught modules, labs and homework. It addresses state-of-the-art topics in Data Science and Big Data, ranging from data collection, storage and processing to analytics and visualization, as well as a range of real-world applications and business/laboratory cases. The program is broad in scope and covers, in sufficient detail, the methods and tools needed to tackle big data problems.

2.1 Master Structure

The training spans 140 taught hours (Fridays and Saturdays, October to December). Each training day consists of 2 x 3h slots plus a 1h conference/invited talk. The thematic articulation is as follows:

Week 1. Data Science introduction. Big Data ecosystem: players, software, hardware. Data project cycle/management. Legal issues/security framework.

Week 2. Data Management. Databases/SQL, data cleaning, normalization, feature selection & creation, spectral decompositions and dimensionality reduction.

Weeks 3-5. Data Analysis and Machine Learning. Descriptive (data quality). Exploratory (summary statistics, correlation, ANOVA). Inferential (theory of generalization, sampling, statistical testing). Predictive (supervised and unsupervised machine learning).

Weeks 6-7. Cloud Computing & Big Data. Introduction to the basics of the cloud computing paradigm and to performance evaluation for applications in the cloud. Basic concepts of Big Data: Hadoop/MapReduce as a programming model for distributed processing of large datasets. Introduction to NoSQL languages.

Remaining weeks. Graph & Text Mining and Bigdata Camp. Methods and tools for pre-processing, indexing, querying, retrieval and ranking of text at the document and collection levels. Algorithms for text-oriented applications in web and social networks. Methods and tools for pre-processing graphs and for searching, ranking and evaluating nodes and communities.
2.2 Courses structure and Syllabus

Course: Introduction to Data Science
Objective: To present a big picture of Data Science as well as of its cycles.
Syllabus: Big Data ecosystem: players, software, hardware. Data project cycle/management. Legal/security framework.

Course: Data Management
Objective: To present the foundations of data management: accessing the data stored in a database and (pre)processing it to prepare its analysis.
Syllabus: Databases, SQL, design. Data processing: normalization, feature selection & creation, spectral decompositions and dimensionality reduction.

Course: Data Analysis and Machine Learning
Objective: To present the basis of Data Analysis and Machine Learning: how to describe and explore a dataset, and how to use data to find hidden information and make predictions with statistical and machine learning algorithms.
Syllabus:
Looking at the data: descriptive statistics, PCA and dimension reduction, statistical testing.
Unsupervised clustering: clustering, K-Means and K-Means++, DBSCAN, hierarchical clustering.
Linear model and diagnostic: generalization theory, prediction vs inference, linear model and diagnostic.
Logistic regression: logistic regression and variable selection, overfitting and cross-validation, metric choice (AUC, Precision/Recall, F-Score, ...).
Machine Learning: empirical criterion minimization, SVM, regularization for SVM and logistic regression.
Tree methods and ensemble methods: Classification And Regression Trees, bagging and boosting.
Further topics: Naive Bayes, non-parametric methods, neural networks and deep learning, spectral clustering.

Course: Cloud Computing & Bigdata
Objective: Introduce the basics of the cloud computing paradigm. Understand performance evaluation for applications in the cloud. Understand the basic concepts of Hadoop/MapReduce as a programming model for distributed processing of large datasets.
Syllabus:
Overview of computing paradigms: grid computing, cluster computing, distributed computing, utility computing, cloud computing.
Cloud computing architecture: comparison with traditional computing architecture (client/server).
Services provided at various levels, role of network protocols, web services.
Service management in cloud computing.
Data security and privacy issues.
Principles of parallel processing and distributed systems.
Functional programming and parallel algorithms for MapReduce.
Hadoop storage, DFS, cluster architecture.
Visual analytics.

Course: Graph & Text Mining
Objective: Graphs and texts are ubiquitous in social and web data. This module provides methods and tools for pre-processing, indexing, querying, retrieval and ranking of text at the document and collection levels. It also describes algorithms for text-oriented applications in web and social networks. For graphs, the objective is to provide methods and tools for pre-processing graphs and for searching, ranking and evaluating nodes and communities.
Syllabus:
Community mining methods, graph clustering methods (min-cut, spectral clustering), spectral clustering of graph data.
Ranking algorithms (PageRank), ranking evaluation measures (Kendall tau, NDCG), degeneracy (k-core & extensions).
Feature extraction for text, scoring, term weighting & the vector space representation, indexing, retrieval functions: term frequency/inverse document frequency (TF-IDF), BM25.
Web mining. Web personalization and recommendations (collaborative filtering).
Web advertising (Google AdWords, second-price auctions, campaign design principles, natural language generation for snippets, campaign optimization algorithms).

Course: Bigdata Camp
Objective: Apply the techniques described in the previous lectures to a case study drawn from an industrial or academic problem, using state-of-the-art methods and machine learning tools.

Course: Conferences - Invited talks
Objective: This is a horizontal activity spanning the whole duration of the master, with invited speakers from academia and industry presenting topics and experiences from data science and big data case studies.
Syllabus: Case studies from industry or academia. Workshops from machine learning challenges.
3. Teaching staff

Faculty and short CVs

S. Gaïffas (CMAP)
Stéphane Gaïffas is Professeur Chargé at the department of applied mathematics of École Polytechnique. He does research in Statistics and Machine Learning, with current applications to web marketing, social networks, and health records data in partnership with the Caisse Nationale d'Assurance Maladie. He defended his PhD in Statistics, «Nonparametric Regression and Inhomogeneous Information», under the supervision of Marc Hoffman at LPMA, Univ. Denis Diderot. He was Maître de Conférences at LSTA, Univ. Paris 6, from 2007. For the past three years he has worked as a scientific consultant on machine learning and big data for several French companies.

C. Giatsidis (LIX)
Christos Giatsidis is currently a post-doctoral researcher in the Computer Science Laboratory at École Polytechnique in France. He received his Diploma in Computer Science from the Athens Univ. of Economics & Business, Greece, in 2009 and his PhD from École Polytechnique, under the supervision of Prof. Michalis Vazirgiannis. In 2014 he received a "thesis prize" for his thesis entitled "Graph Mining and Community Detection with Degeneracy". He has experience in both research and industry. His recent industrial work includes predicting a player's obsession for a large French company in the gambling industry and building a prediction model for component failure for a big aeronautics company. His research interests include data/graph mining and algorithms for big data management.

B. Kégl (LAL)
Balázs Kégl received the Ph.D. degree in computer science from Concordia University, Montreal. From January to December 2000 he was a Postdoctoral Fellow at the Department of Mathematics and Statistics at Queen's University, Kingston, Canada, supported by an NSERC Postdoctoral Fellowship. He was an Assistant Professor from 2001 to 2006 in the Department of Computer Science and Operations Research at the University of Montreal. Since 2006 he has been a research scientist in the Linear Accelerator Laboratory of the CNRS (DR since 2013). He has published more than a hundred papers on unsupervised and supervised learning (principal curves, intrinsic dimensionality estimation, boosting), large-scale Bayesian inference and optimization, and on various applications ranging from music and image processing to systems biology and experimental physics. In his current position he has been the head of the AppStat team, working on machine learning and statistical inference problems motivated by applications in high-energy particle and astroparticle physics. Since 2014, he has been the chair of the Center for Data Science of the University of Paris-Saclay.

E. Le Pennec (CMAP)
Erwan Le Pennec has been an Associate Professor (Professeur associé) at the Applied Mathematics department of École Polytechnique since September. He does his research in statistics and signal processing at the CMAP of the same school. He completed a PhD in Signal Processing with Stéphane Mallat at the Centre de Mathématiques Appliquées of École Polytechnique. The subject of his thesis is the introduction of geometry in image representation. He defended it on December 19, 2002, under the title "Bandelettes et représentations géométriques des images" (Bandelets and geometric representation of images). He then worked as a post-doc in a joint project between the CMAP and Let It Wave, a company created by Stéphane Mallat, Christophe Bernard, Jérôme Kalifa and himself to exploit their research on bandelets. From 2004 to 2010, he was a Maître de Conférences (Assistant Professor) at the University Paris Diderot (Paris 7) in the Laboratoire de Probabilités et Modèles Aléatoires (Statistics team). From 2010 to 2013, he was a Chargé de Recherche (Research Associate) in the SELECT project of Inria Saclay, a project in which he had already worked. He also accompanied Let It Wave as a scientific consultant, even after it was sold to Zoran.

E. Matzner-Lober (CMAP)
Eric Matzner-Lober has been Professor of Statistics at Rennes 2 University since 2007 and is also affiliated with Los Alamos National Laboratory. From this year on, he is also a part-time professor at École Polytechnique. He is a specialist in non-parametric statistics and machine learning. He is a renowned expert in R, a language for which he runs a book series. He also founded a statistics consulting company that has since been bought by a major consulting firm.

M. Vazirgiannis (LIX)
Dr. Vazirgiannis is a Professor in LIX, École Polytechnique. He currently works in the area of Data Science for Big Data, aiming at harnessing the potential of machine learning algorithms for large-scale data sets including text and graphs. More specifically, his current work is on graph degeneracy for large-scale graph mining, graph-based text retrieval, learning models from time series data, and text mining for the web (e.g. advertising, news streams). He teaches data mining and machine learning for big data at École Polytechnique. He has supervised nine completed Ph.D. theses and supervises six more that are underway. He has published chapters in books and encyclopedias, two international books, and more than 120 papers in international refereed journals and conferences. He has received the ERCIM and Marie Curie EU fellowships. He has also coauthored three patents and attracted significant R&D funding, including national and international research & development projects. Currently he leads industrial projects in the area of large-scale machine learning.
4. Master Schedule (3/10/2014 - 13/12/2014)

Session  Date        Topic                               Teaching Faculty             Amphi
1        03/10/2014  Introduction to Data Science        Gaiffas, Le Pennec, Matzner  Painlevé
2        04/10/2014  Introduction to Data Science        Gaiffas, Le Pennec, Matzner  Painlevé
3        10/10/2014  Data Management                     Giatsidis, Vazirgiannis      Painlevé
4        11/10/2014  Data Management                     Giatsidis, Vazirgiannis      Painlevé
5        17/10/2014  Data Analysis and Machine Learning  Gaiffas, Le Pennec, Matzner  Painlevé
6        18/10/2014  Data Analysis and Machine Learning  Gaiffas, Le Pennec, Matzner  Painlevé
7        24/10/2014  Data Analysis and Machine Learning  Gaiffas, Le Pennec, Matzner  Painlevé
8        25/10/2014  Data Analysis and Machine Learning  Gaiffas, Le Pennec, Matzner  Painlevé
9        07/11/2014  Data Analysis and Machine Learning  Gaiffas, Le Pennec, Matzner  Painlevé
10       08/11/2014  Data Analysis and Machine Learning  Gaiffas, Le Pennec, Matzner  Painlevé
11       14/11/2014  Cloud Computing & Bigdata           Gaiffas, Matzner             Painlevé
12       15/11/2014  Cloud Computing & Bigdata           Gaiffas, Matzner             Painlevé
13       21/11/2014  Cloud Computing & Bigdata           Giatsidis, Vazirgiannis      Painlevé
14       22/11/2014  Cloud Computing & Bigdata           Giatsidis, Vazirgiannis      Painlevé
15       28/11/2014  Graph/Text Mining                   Vazirgiannis, Giatsidis      Painlevé
16       29/11/2014  Graph/Text Mining                   Vazirgiannis, Malliaros      Painlevé
17       05/12/2014  Bigdata Camp                        Kegl, Giatsidis              Painlevé
18       06/12/2014  Bigdata Camp                        Kegl, Giatsidis              Labs
19       12/12/2014  Bigdata Camp                        Kegl, Giatsidis              Painlevé
20       13/12/2014  Bigdata Camp                        Kegl, Giatsidis              Painlevé
More informationData Mining + Business Intelligence. Integration, Design and Implementation
Data Mining + Business Intelligence Integration, Design and Implementation ABOUT ME Vijay Kotu Data, Business, Technology, Statistics BUSINESS INTELLIGENCE - Result Making data accessible Wider distribution
More informationCSci 538 Articial Intelligence (Machine Learning and Data Analysis)
CSci 538 Articial Intelligence (Machine Learning and Data Analysis) Course Syllabus Fall 2015 Instructor Derek Harter, Ph.D., Associate Professor Department of Computer Science Texas A&M University - Commerce
More informationPredictive Data modeling for health care: Comparative performance study of different prediction models
Predictive Data modeling for health care: Comparative performance study of different prediction models Shivanand Hiremath hiremat.nitie@gmail.com National Institute of Industrial Engineering (NITIE) Vihar
More informationAn interdisciplinary model for analytics education
An interdisciplinary model for analytics education Raffaella Settimi, PhD School of Computing, DePaul University Drew Conway s Data Science Venn Diagram http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
More informationfor the Field of Electrical and Information Engineering 1. Introduction: the doctorate in the framework of the European policy of education
2BNew Trends of Doctoral Studies in Europe: Special Considerations for the Field of Electrical and Information Engineering Olivier Bonnaud, Michael H.W. Hoffmann The authors are members of EAEEIE, IEEE,
More informationInternational Workshop on Big Data Analytics for Advanced Databases (BIGDATA, 2016)
International Workshop on Big Data Analytics for Advanced Databases (BIGDATA, 2016) Call for Papers AIM and SCOPE There is an exponential growth in digital data with unprecedented new platforms derived
More informationCOPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments
Contents List of Figures Foreword Preface xxv xxiii xv Acknowledgments xxix Chapter 1 Fraud: Detection, Prevention, and Analytics! 1 Introduction 2 Fraud! 2 Fraud Detection and Prevention 10 Big Data for
More informationSearch Taxonomy. Web Search. Search Engine Optimization. Information Retrieval
Information Retrieval INFO 4300 / CS 4300! Retrieval models Older models» Boolean retrieval» Vector Space model Probabilistic Models» BM25» Language models Web search» Learning to Rank Search Taxonomy!
More informationREGULATIONS FOR THE DEGREE OF MASTER OF SCIENCE IN COMPUTER SCIENCE (MSc[CompSc])
REGULATIONS FOR THE DEGREE OF MASTER OF SCIENCE IN COMPUTER SCIENCE (MSc[CompSc]) (See also General Regulations) Any publication based on work approved for a higher degree should contain a reference to
More informationAn Introduction to Data Mining
An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
More informationDoctor of Philosophy in Computer Science
Doctor of Philosophy in Computer Science Background/Rationale The program aims to develop computer scientists who are armed with methods, tools and techniques from both theoretical and systems aspects
More informationInformation and Decision Sciences (IDS)
University of Illinois at Chicago 1 Information and Decision Sciences (IDS) Courses IDS 400. Advanced Business Programming Using Java. 0-4 Visual extended business language capabilities, including creating
More informationParallel and Distributed Data Analytics (PDDA 2014)
Ecole d Eté CEA-EDF-Inria 16 au 20 Juin 2014 CEA Cadarache Parallel and Distributed Data Analytics (PDDA 2014) Organizers: CEA: Michael Aupetit (LIST, Saclay) EDF: Georges Hébrail (R&D, Clamart) Inria:
More informationAn Introduction to Health Informatics for a Global Information Based Society
An Introduction to Health Informatics for a Global Information Based Society A Course proposal for 2010 Healthcare Industry Skills Innovation Award Sponsored by the IBM Academic Initiative submitted by
More informationScalable Developments for Big Data Analytics in Remote Sensing
Scalable Developments for Big Data Analytics in Remote Sensing Federated Systems and Data Division Research Group High Productivity Data Processing Dr.-Ing. Morris Riedel et al. Research Group Leader,
More informationMachine Learning with MATLAB David Willingham Application Engineer
Machine Learning with MATLAB David Willingham Application Engineer 2014 The MathWorks, Inc. 1 Goals Overview of machine learning Machine learning models & techniques available in MATLAB Streamlining the
More informationMachine learning for algo trading
Machine learning for algo trading An introduction for nonmathematicians Dr. Aly Kassam Overview High level introduction to machine learning A machine learning bestiary What has all this got to do with
More informationCURRICULUM VITAE. August 2008 now: Lecturer in Analysis at the University of Birmingham.
CURRICULUM VITAE Name: Olga Maleva Work address: School of Mathematics, Watson Building, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK Telephone: +44(0)121 414 6584 Fax: +44(0)121 414 3389
More informationAdvanced In-Database Analytics
Advanced In-Database Analytics Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA 1 That sounds complicated? 2 Who can tell me how best to solve this 3 What are the main mathematical functions??
More informationUsing Data Mining and Machine Learning in Retail
Using Data Mining and Machine Learning in Retail Omeid Seide Senior Manager, Big Data Solutions Sears Holdings Bharat Prasad Big Data Solution Architect Sears Holdings Over a Century of Innovation A Fortune
More informationCore Curriculum to the Course:
Core Curriculum to the Course: Environmental Science Law Economy for Engineering Accounting for Engineering Production System Planning and Analysis Electric Circuits Logic Circuits Methods for Electric
More informationADVANCED MACHINE LEARNING. Introduction
1 1 Introduction Lecturer: Prof. Aude Billard (aude.billard@epfl.ch) Teaching Assistants: Guillaume de Chambrier, Nadia Figueroa, Denys Lamotte, Nicola Sommer 2 2 Course Format Alternate between: Lectures
More informationHealthcare data analytics. Da-Wei Wang Institute of Information Science wdw@iis.sinica.edu.tw
Healthcare data analytics Da-Wei Wang Institute of Information Science wdw@iis.sinica.edu.tw Outline Data Science Enabling technologies Grand goals Issues Google flu trend Privacy Conclusion Analytics
More informationAt a Glance A short portrait of the Technical University of Crete
At a Glance A short portrait of the Technical University of Crete Contact: Technical University of Crete Public & International Relations Department University Campus Akrotiri 731 00 Chania Crete Greece
More informationService courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.
Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are
More informationM E M O R A N D U M. Faculty Senate Approved April 2, 2015
M E M O R A N D U M Faculty Senate Approved April 2, 2015 TO: FROM: Deans and Chairs Becky Bitter, Sr. Assistant Registrar DATE: March 26, 2015 SUBJECT: Minor Change Bulletin No. 11 The courses listed
More informationData Mining. Concepts, Models, Methods, and Algorithms. 2nd Edition
Brochure More information from http://www.researchandmarkets.com/reports/2171322/ Data Mining. Concepts, Models, Methods, and Algorithms. 2nd Edition Description: This book reviews state-of-the-art methodologies
More informationCourse 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Oman College of Management and Technology Course 803401 DSS Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization CS/MIS Department Information Sharing
More informationHow To Become A Data Scientist
Programme Specification Awarding Body/Institution Teaching Institution Queen Mary, University of London Queen Mary, University of London Name of Final Award and Programme Title Master of Science (MSc)
More informationAnalysis Tools and Libraries for BigData
+ Analysis Tools and Libraries for BigData Lecture 02 Abhijit Bendale + Office Hours 2 n Terry Boult (Waiting to Confirm) n Abhijit Bendale (Tue 2:45 to 4:45 pm). Best if you email me in advance, but I
More information