Big Data in the Web Age

1 Big Data in the Web Age. Zhang Bo (张钹), Department of Computer Science & Technology, Tsinghua University

2 The Big Data Era. Volume: 2.8 ZB (1 ZB = 10^21 bytes), Variety, Velocity. Searching for a needle in a haystack!

3 The Characteristics of Big Data: data from crowds to crowds; 34% useful, illusive, useless; content safety; raw data: 7% tagged, 1% analyzed

4 Man-Machine Interface Text, Speech, Image, ..., Behaviors Programming Encoding User's Intention Interests Meaning Semantics Content Interpretation Decoding Code Data Instruction Computer Net

5 Image Retrieval by Keywords - white horse (Google)

6 Baidu (a Chinese web search engine) - 马, 树 (Horse, Tree)

7 The New Demands of Information Processing in the Big Data Age: users' intention, users' interests, users' feelings, ...; understanding (comprehension) of the meaning of information

8 The fundamental difficulty faced by traditional information processing

9 Why? Basic Assumption: Meaning-Form Separation. Meaning-independence assumption (R. Hartley [1]). "These semantic aspects of communication are irrelevant to the engineering problem." (C. E. Shannon [2]) [1] R. V. L. Hartley, Transmission of information, Bell System Technical Journal, July 1928, pp [2] C. E. Shannon, A mathematical theory of communication, Bell System Technical Journal, vol. 27, pp , July, pp , October 1948

10 Comprehension The Natural (Objective) Meaning

11 The Demand for a Meaning-Dependent Information Theory Text Speech Image Human Sender X refer to, correlate physical or conceptual world Machine Receiver X X Traditional Information Processing Meaning M

12 Challenges! Can a machine deal with the meaning of information? How can a machine deal with meaning? Can traditional information theory deal with meaning, and if so, how?

13 Probability-based Theory Sender X M refer to, correlate physical or conceptual world F (W, D) Mapping Receiver X representation coding data Feature Space

14 Fundamental Problems Feature Representation Meaning Does the mapping exist? How to find the mapping?

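15 Does there exist such a mapping? Sample text (data): "Digital video coding technology has a history of half a century and has made great progress: from differential predictive coding in the 1950s, to transform coding and block-based motion-predictive coding in the 1970s, to today's emerging distributed coding, stereoscopic video coding, multi-view coding, visual coding, and so on." Mapping? Meaning (Data) (Rules, Concepts)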

16 No, in general! Mapping: Semantic Gap between Meaning (Semantics) and Data. Data features: bag of words (text); colors, textures, ... (image); frequency spectrum (speech)
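The semantic gap on this slide can be illustrated with a bag-of-words feature. The following is a minimal sketch, not taken from the talk: the function name `bag_of_words` and the two example sentences are assumptions chosen for illustration. It shows two sentences with different meanings that map to the same feature vector, so no mapping from this representation back to meaning can separate them.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Map a text to its word counts, discarding word order (hypothetical helper)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


# Two sentences with opposite meanings (illustrative examples, not from the slides).
a = bag_of_words("The dog bit the man")
b = bag_of_words("The man bit the dog")

# Identical features, different meanings: the representation loses the distinction.
same_features = (a == b)
print(same_features)
```

Any classifier built on these counts alone receives the same input for both sentences, which is one concrete reading of "No, in general!".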

17 Data Driven Methods Dataset Pattern Machine Learning A specific data set A proper representation There exists such a mapping

18 How to Mine the Mapping. Ill-posed Problems: (1) Existence, (2) Stability, (3) Uniqueness. Machine Learning

19 Classical Statistics Solution: Law of large numbers in function spaces. Parametric Statistics. Assumption: a known function with a few unknown parameters, e.g. $ax^2 + bx + c$
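The parametric assumption can be made concrete with a small sketch. Assuming the functional form is known to be quadratic, only the three parameters a, b, c are estimated from data, here by ordinary least squares. The function name `fit_quadratic` and the sample data are illustrative assumptions, not part of the talk.

```python
def fit_quadratic(xs, ys):
    """Least-squares estimate of (a, b, c) in y = a*x^2 + b*x + c."""
    # Build the 3x3 normal equations (A^T A) p = A^T y for rows [x^2, x, 1].
    rows = [[x * x, x, 1.0] for x in xs]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]

    # Solve the augmented system by Gaussian elimination with partial pivoting.
    m = [ata[i] + [aty[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= factor * m[col][k]

    # Back substitution.
    p = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        s = m[i][3] - sum(m[i][j] * p[j] for j in range(i + 1, 3))
        p[i] = s / m[i][i]
    return tuple(p)


# Noise-free samples from y = 2x^2 - 3x + 1 (made-up data for illustration).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x * x - 3.0 * x + 1.0 for x in xs]
a, b, c = fit_quadratic(xs, ys)
print(a, b, c)
```

With the form fixed in advance, a handful of samples is enough to pin down the parameters. The difficulty raised in the talk is that for text, images, and speech no such known form is available.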

20 Recent Results $F(x, y) = F(y \mid x)\,F(x)$, $y = f(x)$ Data Function Rules If $F(x, y)$ or $f(x)$ exists, the rule can be found in a probabilistic sense Pe ( ) N

21 Data Driven based Machine Learning (Rote, Superficial) Without Comprehension! Can machines understand text, image, or speech?

22 Artificial Intelligence Methods Human Machine Text Speech Image Sender X refer to, correlate AI physical or conceptual world Meaning S Receiver X Information processing with understanding

23 Expert Systems Human disease diagnosis system Production Rules If a, symptoms (fuzzy) CF: certainty factors Then b function disorder (fuzzy) Inference Engine
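A production rule with certainty factors can be sketched as follows. This is a minimal illustration under stated assumptions: the two rules and their numbers are invented, the premise certainty is taken as the minimum over the symptoms, and two rules supporting the same conclusion are combined with the MYCIN-style formula cf1 + cf2 * (1 - cf1). The talk does not say which combination scheme its system uses.

```python
def combine_cf(cf1, cf2):
    """Combine two positive certainty factors for the same conclusion (MYCIN-style)."""
    return cf1 + cf2 * (1.0 - cf1)


# Hypothetical rule base: (required symptoms, conclusion, rule certainty factor).
RULES = [
    ({"fever", "cough"}, "respiratory disorder", 0.6),
    ({"chest pain"}, "respiratory disorder", 0.5),
]


def infer(evidence):
    """Tiny inference engine; evidence maps each symptom to a certainty in [0, 1]."""
    conclusions = {}
    for symptoms, conclusion, rule_cf in RULES:
        if not symptoms.issubset(evidence):
            continue  # the rule does not fire: some symptom is unobserved
        # A conjunction of fuzzy premises is as certain as its weakest premise.
        cf = rule_cf * min(evidence[s] for s in symptoms)
        previous = conclusions.get(conclusion, 0.0)
        conclusions[conclusion] = combine_cf(previous, cf)
    return conclusions


result = infer({"fever": 1.0, "cough": 0.8, "chest pain": 1.0})
print(result)
```

In this example both rules fire, contributing 0.48 and 0.5, and the combined certainty for the conclusion is 0.74. All of the knowledge sits in hand-written rules, which is the contrast the talk draws with data-driven learning.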

24 Scopes of Application Deliberative behaviors problem solving, decision making, diagnosis, planning, common sense, natural language understanding, Perception vision, speech, touch, etc.

25 Natural Language Understanding: Manual Rule-based Knowledge Representation (Syntax, Morphology, Semantics, ...), Symbolic Inference

26 Neither traditional information processing nor AI alone can solve the comprehension problem. What should we do next?

27 Comprehension. Text: contextual structure. Image: spatial structure. Speech: temporal structure. Video: temporal-spatial structure. Structured Analysis & Representation. Sample text: "Digital video coding technology has a history of half a century and has made great progress: from differential predictive coding in the 1950s, to transform coding and block-based motion-predictive coding in the 1970s, to today's emerging distributed coding, stereoscopic video coding, multi-view coding, visual coding, and so on."

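28 Computer Comprehension of Text. Paragraph: "Digital video coding technology has a history of half a century and has made great progress: from differential predictive coding in the 1950s, to transform coding and block-based motion-predictive coding in the 1970s, to today's emerging distributed coding, stereoscopic video coding, multi-view coding, visual coding, and so on." Sentences: Sentence-1, Sentence-2, ..., Sentence-n. Words: Word-11, Word-12, ..., Word-1m, Word-21, Word-22, ...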

29 This figure is from Serre et al.'s A quantitative theory of immediate visual recognition. Prog Brain Res

30 Unsupervised Deep Learning: 9-layer sparse deep autoencoder; 10 million 200x200 images; 1 billion connections; 1,000 machines (16,000 cores), 3 days; 1 billion trainable parameters. Q. V. Le, Building high-level features using large scale unsupervised learning, Proc. 29th ICML, 2012

31 Results (Generalization Capacity)
Concept | Random guess | Same architecture with random weights | Best linear filter | Best first layer neuron | Best neuron | Best neuron without contrast normalization
Faces | 64.8% | 67.0% | 74.0% | 71.0% | 81.7% | 78.5%
Human bodies | 64.8% | 66.5% | 68.1% | 67.2% | 76.8% | 71.8%
Cats | 64.8% | 66.0% | 67.8% | 67.1% | 74.6% | 69.3%
Concept | Stanford network | Deep autoencoders 3 layers | Deep autoencoders 6 layers | K-means on 40x40 images
Faces | 81.7% | 72.3% | 70.9% | 72.5%
Human bodies | 76.7% | 71.2% | 69.8% | 69.3%
Cats | 74.8% | 67.5% | % |

32 Computer Comprehension of Visual Information Top-down feedback Top-down feedback High-level Local connection Knowledge-driven Data-driven V1 V2 IT

33 Data-driven + Knowledge-driven: Statistical Inference over an Abstract Structured Declarative Knowledge Representation [1]; the Probabilistic Approach to Artificial Intelligence [2]. [1] Tenenbaum, J. B. (MIT), 2011, How to Grow a Mind, Science, 11 March 2011, vol. 331, no. 6022, pp [2] Judea Pearl: 2011 winner of the ACM Turing Award

34 Quotient Space Based Problem Solving -A theoretical foundation of granular computing

35 Domestic distribution (China)

36 Structural Prediction Learning
Learning Rules | Classification | Structural Prediction
Maximal Joint Likelihood Estimation | Naïve Bayesian Network | Hidden Markov Model (1966)
Maximal Conditional Likelihood Estimation | Logistic Regression | Conditional Random Field (2001)
Maximal Margin Learning | SVM | Maximal Margin Markov Net (2003)
Maximal Entropy Discrimination Learning | Maximal Entropy Discrimination Model | Maximal Entropy Discrimination Markov Net (2008) (Zhu Jun)

37 Bayes' Theorem (T. Bayes ( )): Prior Distribution + Likelihood Function → Posterior Distribution. Optimization-based Regularized Bayesian Inference: Prior Distribution + Likelihood Function + Posterior Constraints (attributes, domain knowledge), solved with Optimization Theory → Posterior Distribution. Zhu Jun, Tsinghua University
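The first half of this slide, Bayes' theorem, can be sketched for a discrete set of hypotheses: the posterior is the prior times the likelihood, renormalized. The function name `posterior` and the numbers are illustrative assumptions. The sketch does not implement the regularized variant with posterior constraints, which requires solving an optimization problem.

```python
def posterior(prior, likelihood):
    """Bayes' theorem over discrete hypotheses.

    prior:      {hypothesis: P(h)}
    likelihood: {hypothesis: P(data | h)}
    returns:    {hypothesis: P(h | data)}
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())  # P(data)
    return {h: v / evidence for h, v in unnormalized.items()}


# Made-up numbers: two candidate labels for an image, equally likely a priori.
prior = {"horse": 0.5, "tree": 0.5}
likelihood = {"horse": 0.8, "tree": 0.2}
post = posterior(prior, likelihood)
print(post)
```

Here the observed data shift the belief from 0.5 to 0.8 for "horse". Regularized Bayesian inference, as listed on the slide, additionally restricts which posterior distributions are admissible through constraints derived from attributes or domain knowledge.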

38 Neural Turing Machine Google DeepMind, London, UK External Input External Output Recurrent NN Feedforward NN Read Heads Write Heads Memory

39 Three Levels of Processing. 1) Natural meaning: recognition, ill-posed problems. 2) Sender's intention: context-aware, psychological model. 3) Receiver's reaction (impact): social knowledge, ...

40 Conclusions Basic Foundation Content related information processing Multi-granular Computing Applied Foundation Algorithms, Architecture, Parallelism, Management, Storage,

41 Publications-Journal Papers J. Zhu, A. Ahmed, E.P. Xing. MedLDA: Maximum Margin Supervised Topic Models. Journal of Machine Learning Research (JMLR), 13(Aug): , 2012 N. Chen, J. Zhu, F. Sun, E.P. Xing. Large-margin Subspace Learning for Multi-view Data Analysis. IEEE Trans. on Pattern Analysis and Machine Intelligence (PAMI), vol. 34, no. 12, pp , Dec C. Liu, B. Zhang, J. Zhu, and D Wang. Learning a Contextual Multithread Model for Movie/TV Scene Segmentation, IEEE Transactions on Multimedia (TMM), X. Hu and J. Wang, Solving the assignment problem using continuoustime and discrete-time improved dual networks, IEEE Transactions on Neural Networks and Learning Systems (TNNLS), vol. 23, no. 5, pp , X. Hu and B. Zhang, A Gaussian attractor network for memory and recognition with experience-dependent Learning, Neural Computation, vol. 22, no. 5, pp , X. Hu, C. Sun and B. Zhang, Design of recurrent neural networks for solving constrained least absolute deviation problems, IEEE Transactions on Neural Networks (TNN), vol. 21, no. 7, pp , July 2010.

42 J. Zhu, E.P. Xing. Maximum Entropy Discrimination Markov Networks. Journal of Machine Learning Research (JMLR), vol. 10(Nov): , X. Hu and B. Zhang, A new recurrent neural network for solving convex quadratic programming problems with an application to the k- winners-take-all problem, IEEE Transactions on Neural Networks (TNN), vol. 20, no. 4, pp , April D. Wang, Z. Wang J. Li, B. Zhang, and X. Li. Query representation by structured concept threads with application to interactive video retrieval. Journal of Visual Communication and Image Representation. 2009, Vol 20 (2): J. Zhu, Z. Nie, B. Zhang, and J. Wen. Dynamic Hierarchical Markov Random Fields for Integrated Web Data Extraction, Journal of Machine Learning Research (JMLR), vol. 9(Jul): , 2008.

43 Conference Papers J. Zhu, N. Chen, H. Perkins, B. Zhang. Gibbs Max-Margin Supervised Topic Models with Fast Sampling Algorithms, In Proc. of the 30th International Conference on Machine Learning (ICML), Atlanta, USA, M. Xu, J. Zhu, B. Zhang. Fast Max-Margin Matrix Factorization with Data Augmentation, In Proc. of the 30th International Conference on Machine Learning (ICML), Atlanta, USA, N. Chen, J. Zhu, F. Xia, and B. Zhang. Generalized Relational Topic Models with Data Augmentation, To Appear in Proc. of the 23rd International Joint Conference on Artificial Intelligence (IJCAI), Beijing, China, M. Xu, J. Zhu, and B. Zhang. Bayesian Nonparametric Maximum Margin Matrix Factorization for Collaborative Prediction, Advances in Neural Information Processing Systems (NIPS), Lake Tahoe, USA, Q. Jiang, J. Zhu, M. Sun, and E.P. Xing. Monte Carlo Methods for Maximum Margin Supervised Topic Models, Advances in Neural Information Processing Systems (NIPS), Lake Tahoe, USA, J. Ji, J. Li, S. Yan, B. Zhang, and Q. Tian. Super-Bit Locality-Sensitive Hashing. Advances in Neural Information Processing Systems (NIPS), Lake Tahoe, USA, 2012.

44 J. Zhu. Max-Margin Nonparametric Latent Feature Models for Link Prediction, In Proc. of the 29th International Conference on Machine Learning (ICML), Edinburgh, Scotland, J. Zhu, N. Chen, E.P. Xing. Infinite Latent SVM for Classification and Multitask Learning, Advances in Neural Information Processing Systems (NIPS), Granada, Spain, J. Zhu, E.P. Xing. Sparse Topical Coding, In Proc. of 27th Conference on Uncertainty in Artificial Intelligence (UAI), Barcelona, Spain, J. Zhu, N. Chen, E.P. Xing. Infinite SVM: a Dirichlet Process Mixture of Large-margin Kernel Machines, In Proc. of the 28th International Conference on Machine Learning (ICML), Bellevue, Washington, USA, J. Zhu, L.-J. Li, L. Fei-Fei, E.P. Xing. Large Margin Training of Upstream Scene Understanding Models, Advances in Neural Information Processing Systems (NIPS), Vancouver, B.C., Canada, S. Lee, J. Zhu, E.P. Xing. Detecting eqtls using Adaptive Multi-task Lasso, Advances in Neural Information Processing Systems (NIPS), Vancouver, B.C., Canada, N. Chen, J. Zhu and E.P. Xing. Predictive Subspace Learning for Multiview Data: a Large Margin Approach, Advances in Neural Information Processing Systems (NIPS), Vancouver, B.C., Canada, 2010.

45 J. Zhu, E.P. Xing. Conditional Topic Random Fields, In Proc. of the 27th International Conference on Machine Learning (ICML), Haifa, Israel, J. Zhu, and E.P. Xing. On Primal and Dual Sparsity of Markov Networks, In Proc. of 26th International Conference on Machine Learning (ICML), Montreal, Canada, J. Zhu, A. Ahmed, and E.P. Xing. MedLDA: Maximum Margin Supervised Topic Models for Regression and Classification, In Proc. of 26th International Conference on Machine Learning (ICML), Montreal, Canada, J. Zhu, E.P. Xing, and B. Zhang. Partially Observed Maximum Entropy Discrimination Markov Networks, Advances in Neural Information Processing Systems (NIPS), Vancouver, B.C., Canada, J. Zhu, E.P. Xing, and B. Zhang. Laplace Maximum Margin Markov Networks, In Proc. of the 25th International Conference on Machine Learning (ICML), Helsinki, Finland, J. Zhu, Z. Nie, et al. 2D Conditional Random Fields for Web Information Extraction, In Proc. of the 22nd International Conference on Machine Learning (ICML), Bonn, Germany, 2005.

46 J. Zhu, X. Zheng, L. Zhou, and B. Zhang. Scalable Inference in Maxmargin Supervised Topic Models, To Appear in Proc. of the 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD), Chicago, USA, J. Zhu, X. Zheng, and B. Zhang. Bayesian Logistic Supervised Topic Models with Data Augmentation, To Appear in Proc. of the 51st Annual Meeting of the Association for Computational Linguistics (ACL), Sofia, Bulgaria, A. Zhang, J. Zhu, and B. Zhang. Sparse Online Topic Models, In Proc. of the 22nd International World Wide Web Conference (WWW), Rio de Janeiro, Brazil, Y. Tian and J. Zhu. Learning from Crowds in the Presence of Schools of Thought, In Proc. of the 18th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD), Beijing, China, L. Xie, Q. Tian, and B. Zhang: Spatial pooling of heterogeneous features for image applications. ACM Multimedia 2012: J. Zhu, N. Lao, and E.P. Xing. Grafting-Light: Fast, Incremental Feature Selection and Structure Learning of Markov Random Fields, In Proc. of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), Washington DC, USA, 2010.

47 X. Shi, J. Zhu, R. Cai, and L. Zhang. User Grouping Behavior in Online Forums, In Proc. of 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), Paris, France, Y. Liang, J. Li, and B. Zhang. Vocabulary-based hashing for image search. ACM MM ; J. Zhu, Z. Nie, X. Liu, B. Zhang, and J.-R. Wen. StatSnowball: a Statistical Approach to Extracting Entity Relationships, In Proc. of 18th International World Wide Web Conference (WWW), Madrid, Spain, J. Yuan, J. Li, and B. Zhang. Scene understanding with discriminative structured prediction. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008; J. Zhu, Z. Nie, et al. Simultaneous Record Detection and Attribute Labeling in Web Data Extraction, In Proc. of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), Philadelphia, PA, USA, 2006.

48 Thank you!

Steven C.H. Hoi School of Information Systems Singapore Management University Email: chhoi@smu.edu.sg

Steven C.H. Hoi School of Information Systems Singapore Management University Email: chhoi@smu.edu.sg Steven C.H. Hoi School of Information Systems Singapore Management University Email: chhoi@smu.edu.sg Introduction http://stevenhoi.org/ Finance Recommender Systems Cyber Security Machine Learning Visual

More information

The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2

The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2 2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016) The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2 1 School of

More information

Parallel Data Selection Based on Neurodynamic Optimization in the Era of Big Data

Parallel Data Selection Based on Neurodynamic Optimization in the Era of Big Data Parallel Data Selection Based on Neurodynamic Optimization in the Era of Big Data Jun Wang Department of Mechanical and Automation Engineering The Chinese University of Hong Kong Shatin, New Territories,

More information

Learning outcomes. Knowledge and understanding. Competence and skills

Learning outcomes. Knowledge and understanding. Competence and skills Syllabus Master s Programme in Statistics and Data Mining 120 ECTS Credits Aim The rapid growth of databases provides scientists and business people with vast new resources. This programme meets the challenges

More information

An Introduction to Data Mining

An Introduction to Data Mining An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

More information

List of Publications by Claudio Gentile

List of Publications by Claudio Gentile List of Publications by Claudio Gentile Claudio Gentile DiSTA, University of Insubria, Italy claudio.gentile@uninsubria.it November 6, 2013 Abstract Contains the list of publications by Claudio Gentile,

More information

NEURAL NETWORKS A Comprehensive Foundation

NEURAL NETWORKS A Comprehensive Foundation NEURAL NETWORKS A Comprehensive Foundation Second Edition Simon Haykin McMaster University Hamilton, Ontario, Canada Prentice Hall Prentice Hall Upper Saddle River; New Jersey 07458 Preface xii Acknowledgments

More information

PULLING OUT OPINION TARGETS AND OPINION WORDS FROM REVIEWS BASED ON THE WORD ALIGNMENT MODEL AND USING TOPICAL WORD TRIGGER MODEL

PULLING OUT OPINION TARGETS AND OPINION WORDS FROM REVIEWS BASED ON THE WORD ALIGNMENT MODEL AND USING TOPICAL WORD TRIGGER MODEL Journal homepage: www.mjret.in ISSN:2348-6953 PULLING OUT OPINION TARGETS AND OPINION WORDS FROM REVIEWS BASED ON THE WORD ALIGNMENT MODEL AND USING TOPICAL WORD TRIGGER MODEL Utkarsha Vibhute, Prof. Soumitra

More information

Mining Signatures in Healthcare Data Based on Event Sequences and its Applications

Mining Signatures in Healthcare Data Based on Event Sequences and its Applications Mining Signatures in Healthcare Data Based on Event Sequences and its Applications Siddhanth Gokarapu 1, J. Laxmi Narayana 2 1 Student, Computer Science & Engineering-Department, JNTU Hyderabad India 1

More information

How To Use Neural Networks In Data Mining

How To Use Neural Networks In Data Mining International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and

More information

Learning to Process Natural Language in Big Data Environment

Learning to Process Natural Language in Big Data Environment CCF ADL 2015 Nanchang Oct 11, 2015 Learning to Process Natural Language in Big Data Environment Hang Li Noah s Ark Lab Huawei Technologies Part 1: Deep Learning - Present and Future Talk Outline Overview

More information

Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012

Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012 Clustering Big Data Anil K. Jain (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012 Outline Big Data How to extract information? Data clustering

More information

HT2015: SC4 Statistical Data Mining and Machine Learning

HT2015: SC4 Statistical Data Mining and Machine Learning HT2015: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Bayesian Nonparametrics Parametric vs Nonparametric

More information

Statistical Models in Data Mining

Statistical Models in Data Mining Statistical Models in Data Mining Sargur N. Srihari University at Buffalo The State University of New York Department of Computer Science and Engineering Department of Biostatistics 1 Srihari Flood of

More information

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning. Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott

More information

Preface: Cognitive Informatics, Cognitive Computing, and Their Denotational Mathematical Foundations (II)

Preface: Cognitive Informatics, Cognitive Computing, and Their Denotational Mathematical Foundations (II) Fundamenta Informaticae 90 (2009) i vii DOI 10.3233/FI-2009-0001 IOS Press i Preface: Cognitive Informatics, Cognitive Computing, and Their Denotational Mathematical Foundations (II) Yingxu Wang Visiting

More information

Florida International University - University of Miami TRECVID 2014

Florida International University - University of Miami TRECVID 2014 Florida International University - University of Miami TRECVID 2014 Miguel Gavidia 3, Tarek Sayed 1, Yilin Yan 1, Quisha Zhu 1, Mei-Ling Shyu 1, Shu-Ching Chen 2, Hsin-Yu Ha 2, Ming Ma 1, Winnie Chen 4,

More information

Tensor Factorization for Multi-Relational Learning

Tensor Factorization for Multi-Relational Learning Tensor Factorization for Multi-Relational Learning Maximilian Nickel 1 and Volker Tresp 2 1 Ludwig Maximilian University, Oettingenstr. 67, Munich, Germany nickel@dbs.ifi.lmu.de 2 Siemens AG, Corporate

More information

Machine Learning Department, School of Computer Science, Carnegie Mellon University, PA

Machine Learning Department, School of Computer Science, Carnegie Mellon University, PA Pengtao Xie Carnegie Mellon University Machine Learning Department School of Computer Science 5000 Forbes Ave Pittsburgh, PA 15213 Tel: (412) 916-9798 Email: pengtaox@cs.cmu.edu Web: http://www.cs.cmu.edu/

More information

Behavior Analysis in Crowded Environments. XiaogangWang Department of Electronic Engineering The Chinese University of Hong Kong June 25, 2011

Behavior Analysis in Crowded Environments. XiaogangWang Department of Electronic Engineering The Chinese University of Hong Kong June 25, 2011 Behavior Analysis in Crowded Environments XiaogangWang Department of Electronic Engineering The Chinese University of Hong Kong June 25, 2011 Behavior Analysis in Sparse Scenes Zelnik-Manor & Irani CVPR

More information

01219211 Software Development Training Camp 1 (0-3) Prerequisite : 01204214 Program development skill enhancement camp, at least 48 person-hours.

01219211 Software Development Training Camp 1 (0-3) Prerequisite : 01204214 Program development skill enhancement camp, at least 48 person-hours. (International Program) 01219141 Object-Oriented Modeling and Programming 3 (3-0) Object concepts, object-oriented design and analysis, object-oriented analysis relating to developing conceptual models

More information

Intrusion Detection via Machine Learning for SCADA System Protection

Intrusion Detection via Machine Learning for SCADA System Protection Intrusion Detection via Machine Learning for SCADA System Protection S.L.P. Yasakethu Department of Computing, University of Surrey, Guildford, GU2 7XH, UK. s.l.yasakethu@surrey.ac.uk J. Jiang Department

More information

Comparative Analysis of EM Clustering Algorithm and Density Based Clustering Algorithm Using WEKA tool.

Comparative Analysis of EM Clustering Algorithm and Density Based Clustering Algorithm Using WEKA tool. International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 9, Issue 8 (January 2014), PP. 19-24 Comparative Analysis of EM Clustering Algorithm

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)

More information

Teaching in School of Electronic, Information and Electrical Engineering

Teaching in School of Electronic, Information and Electrical Engineering Introduction to Teaching in School of Electronic, Information and Electrical Engineering Shanghai Jiao Tong University Outline Organization of SEIEE Faculty Enrollments Undergraduate Programs Sample Curricula

More information

CLASSIFYING NETWORK TRAFFIC IN THE BIG DATA ERA

CLASSIFYING NETWORK TRAFFIC IN THE BIG DATA ERA CLASSIFYING NETWORK TRAFFIC IN THE BIG DATA ERA Professor Yang Xiang Network Security and Computing Laboratory (NSCLab) School of Information Technology Deakin University, Melbourne, Australia http://anss.org.au/nsclab

More information

AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM

AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM ABSTRACT Luis Alexandre Rodrigues and Nizam Omar Department of Electrical Engineering, Mackenzie Presbiterian University, Brazil, São Paulo 71251911@mackenzie.br,nizam.omar@mackenzie.br

More information

Bayesian networks - Time-series models - Apache Spark & Scala

Bayesian networks - Time-series models - Apache Spark & Scala Bayesian networks - Time-series models - Apache Spark & Scala Dr John Sandiford, CTO Bayes Server Data Science London Meetup - November 2014 1 Contents Introduction Bayesian networks Latent variables Anomaly

More information

Blog Post Extraction Using Title Finding

Blog Post Extraction Using Title Finding Blog Post Extraction Using Title Finding Linhai Song 1, 2, Xueqi Cheng 1, Yan Guo 1, Bo Wu 1, 2, Yu Wang 1, 2 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 2 Graduate School

More information

Ming-Wei Chang. Machine learning and its applications to natural language processing, information retrieval and data mining.

Ming-Wei Chang. Machine learning and its applications to natural language processing, information retrieval and data mining. Ming-Wei Chang 201 N Goodwin Ave, Department of Computer Science University of Illinois at Urbana-Champaign, Urbana, IL 61801 +1 (917) 345-6125 mchang21@uiuc.edu http://flake.cs.uiuc.edu/~mchang21 Research

More information

Machine Learning with MATLAB David Willingham Application Engineer

Machine Learning with MATLAB David Willingham Application Engineer Machine Learning with MATLAB David Willingham Application Engineer 2014 The MathWorks, Inc. 1 Goals Overview of machine learning Machine learning models & techniques available in MATLAB Streamlining the

More information

INTRODUCTION TO MACHINE LEARNING 3RD EDITION

INTRODUCTION TO MACHINE LEARNING 3RD EDITION ETHEM ALPAYDIN The MIT Press, 2014 Lecture Slides for INTRODUCTION TO MACHINE LEARNING 3RD EDITION alpaydin@boun.edu.tr http://www.cmpe.boun.edu.tr/~ethem/i2ml3e CHAPTER 1: INTRODUCTION Big Data 3 Widespread

More information

Principles of Data Mining by Hand&Mannila&Smyth

Principles of Data Mining by Hand&Mannila&Smyth Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences

More information

ENHANCED WEB IMAGE RE-RANKING USING SEMANTIC SIGNATURES

ENHANCED WEB IMAGE RE-RANKING USING SEMANTIC SIGNATURES International Journal of Computer Engineering & Technology (IJCET) Volume 7, Issue 2, March-April 2016, pp. 24 29, Article ID: IJCET_07_02_003 Available online at http://www.iaeme.com/ijcet/issues.asp?jtype=ijcet&vtype=7&itype=2

More information

BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376

BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376 Course Director: Dr. Kayvan Najarian (DCM&B, kayvan@umich.edu) Lectures: Labs: Mondays and Wednesdays 9:00 AM -10:30 AM Rm. 2065 Palmer Commons Bldg. Wednesdays 10:30 AM 11:30 AM (alternate weeks) Rm.

More information

Graduate Co-op Students Information Manual. Department of Computer Science. Faculty of Science. University of Regina

Graduate Co-op Students Information Manual. Department of Computer Science. Faculty of Science. University of Regina Graduate Co-op Students Information Manual Department of Computer Science Faculty of Science University of Regina 2014 1 Table of Contents 1. Department Description..3 2. Program Requirements and Procedures

More information

MA2823: Foundations of Machine Learning

MA2823: Foundations of Machine Learning MA2823: Foundations of Machine Learning École Centrale Paris Fall 2015 Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe agathe.azencott@mines paristech.fr TAs: Jiaqian Yu jiaqian.yu@centralesupelec.fr

More information

Doctor of Philosophy in Computer Science

Doctor of Philosophy in Computer Science Doctor of Philosophy in Computer Science Background/Rationale The program aims to develop computer scientists who are armed with methods, tools and techniques from both theoretical and systems aspects

More information

Research on the UHF RFID Channel Coding Technology based on Simulink

Research on the UHF RFID Channel Coding Technology based on Simulink Vol. 6, No. 7, 015 Research on the UHF RFID Channel Coding Technology based on Simulink Changzhi Wang Shanghai 0160, China Zhicai Shi* Shanghai 0160, China Dai Jian Shanghai 0160, China Li Meng Shanghai

More information

The Data Mining Process

The Data Mining Process Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data

More information

SURVEY REPORT DATA SCIENCE SOCIETY 2014

SURVEY REPORT DATA SCIENCE SOCIETY 2014 SURVEY REPORT DATA SCIENCE SOCIETY 2014 TABLE OF CONTENTS Contents About the Initiative 1 Report Summary 2 Participants Info 3 Participants Expertise 6 Suggested Discussion Topics 7 Selected Responses

More information

Tracking and Recognition in Sports Videos

Tracking and Recognition in Sports Videos Tracking and Recognition in Sports Videos Mustafa Teke a, Masoud Sattari b a Graduate School of Informatics, Middle East Technical University, Ankara, Turkey mustafa.teke@gmail.com b Department of Computer

More information

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015 An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

More information

Detection. Perspective. Network Anomaly. Bhattacharyya. Jugal. A Machine Learning »C) Dhruba Kumar. Kumar KaKta. CRC Press J Taylor & Francis Croup

Detection. Perspective. Network Anomaly. Bhattacharyya. Jugal. A Machine Learning »C) Dhruba Kumar. Kumar KaKta. CRC Press J Taylor & Francis Croup Network Anomaly Detection A Machine Learning Perspective Dhruba Kumar Bhattacharyya Jugal Kumar KaKta»C) CRC Press J Taylor & Francis Croup Boca Raton London New York CRC Press is an imprint of the Taylor

More information

Scalable Developments for Big Data Analytics in Remote Sensing

Scalable Developments for Big Data Analytics in Remote Sensing Scalable Developments for Big Data Analytics in Remote Sensing Federated Systems and Data Division Research Group High Productivity Data Processing Dr.-Ing. Morris Riedel et al. Research Group Leader,

More information

CS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 Real-Time Systems. CSCI 522 High Performance Computing

CS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 Real-Time Systems. CSCI 522 High Performance Computing CS Master Level Courses and Areas The graduate courses offered may change over time, in response to new developments in computer science and the interests of faculty and students; the list of graduate

Latent Dirichlet Markov Allocation for Sentiment Analysis Latent Dirichlet Markov Allocation for Sentiment Analysis Ayoub Bagheri Isfahan University of Technology, Isfahan, Iran Intelligent Database, Data Mining and Bioinformatics Lab, Electrical and Computer

Network Machine Learning Research Group. Intended status: Informational October 19, 2015 Expires: April 21, 2016 Network Machine Learning Research Group S. Jiang Internet-Draft Huawei Technologies Co., Ltd Intended status: Informational October 19, 2015 Expires: April 21, 2016 Abstract Network Machine Learning draft-jiang-nmlrg-network-machine-learning-00

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College

10-601. Machine Learning. http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html 10-601 Machine Learning http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html Course data All up-to-date info is on the course web page: http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html

NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE. Venu Govindaraju NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE Venu Govindaraju BIOMETRICS DOCUMENT ANALYSIS PATTERN RECOGNITION 8/24/2015 ICDAR- 2015 2 Towards a Globally Optimal Approach for Learning Deep Unsupervised

EHR CURATION FOR MEDICAL MINING EHR CURATION FOR MEDICAL MINING Ernestina Menasalvas Medical Mining Tutorial@KDD 2015 Sydney, AUSTRALIA 2 Ernestina Menasalvas "EHR Curation for Medical Mining" 08/2015 Agenda Motivation the potential

User Modeling in Big Data. Qiang Yang, Huawei Noah s Ark Lab and Hong Kong University of Science and Technology 杨 强, 华 为 诺 亚 方 舟 实 验 室, 香 港 科 大 User Modeling in Big Data Qiang Yang, Huawei Noah s Ark Lab and Hong Kong University of Science and Technology 杨 强, 华 为 诺 亚 方 舟 实 验 室, 香 港 科 大 Who we are: Noah s Ark LAB Have you watched the movie 2012?

Annotated bibliographies for presentations in MUMT 611, Winter 2006 Stephen Sinclair Music Technology Area, McGill University. Montreal, Canada Annotated bibliographies for presentations in MUMT 611, Winter 2006 Presentation 4: Musical Genre Similarity Aucouturier, J.-J.

Parallel Data Mining. Team 2 Flash Coders Team Research Investigation Presentation 2. Foundations of Parallel Computing Oct 2014 Parallel Data Mining Team 2 Flash Coders Team Research Investigation Presentation 2 Foundations of Parallel Computing Oct 2014 Agenda Overview of topic Analysis of research papers Software design Overview

Neural Networks for Machine Learning. Lecture 13a The ups and downs of backpropagation Neural Networks for Machine Learning Lecture 13a The ups and downs of backpropagation Geoffrey Hinton Nitish Srivastava, Kevin Swersky Tijmen Tieleman Abdel-rahman Mohamed A brief history of backpropagation

Machine Learning and Data Analysis overview. Department of Cybernetics, Czech Technical University in Prague. http://ida.felk.cvut. Machine Learning and Data Analysis overview Jiří Kléma Department of Cybernetics, Czech Technical University in Prague http://ida.felk.cvut.cz psyllabus Lecture Lecturer Content 1. J. Kléma Introduction,

DATA MINING IN FINANCE DATA MINING IN FINANCE Advances in Relational and Hybrid Methods by BORIS KOVALERCHUK Central Washington University, USA and EVGENII VITYAEV Institute of Mathematics Russian Academy of Sciences, Russia

An Automatic and Accurate Segmentation for High Resolution Satellite Image S.Saumya 1, D.V.Jiji Thanka Ligoshia 2 An Automatic and Accurate Segmentation for High Resolution Satellite Image S.Saumya 1, D.V.Jiji Thanka Ligoshia 2 Assistant Professor, Dept of ECE, Bethlahem Institute of Engineering, Karungal, Tamilnadu,

Master of Science in Computer Science Master of Science in Computer Science Background/Rationale The MSCS program aims to provide both breadth and depth of knowledge in the concepts and techniques related to the theory, design, implementation,

Statistics Graduate Courses Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.

Prediction of Heart Disease Using Naïve Bayes Algorithm Prediction of Heart Disease Using Naïve Bayes Algorithm R.Karthiyayini 1, S.Chithaara 2 Assistant Professor, Department of computer Applications, Anna University, BIT campus, Tiruchirapalli, Tamilnadu,

Deep learning applications and challenges in big data analytics Najafabadi et al. Journal of Big Data (2015) 2:1 DOI 10.1186/s40537-014-0007-7 RESEARCH Open Access Deep learning applications and challenges in big data analytics Maryam M Najafabadi 1, Flavio Villanustre

Machine Learning. 01 - Introduction Machine Learning 01 - Introduction Machine learning course One lecture (Wednesday, 9:30, 346) and one exercise (Monday, 17:15, 203). Oral exam, 20 minutes, 5 credit points. Some basic mathematical knowledge

Using Data Mining for Mobile Communication Clustering and Characterization Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer

Research Article Distributed Data Mining Based on Deep Neural Network for Wireless Sensor Network Distributed Sensor Networks Volume 2015, Article ID 157453, 7 pages http://dx.doi.org/10.1155/2015/157453 Research Article Distributed Data Mining Based on Deep Neural Network for Wireless Sensor Network

Social-Sensed Multimedia Computing Social-Sensed Multimedia Computing Wenwu Zhu Tsinghua University Multimedia Computing Search Recommend Multimedia Summarize Social Distribution... Sense from Social Preference Influence User behaviors

Using Artificial Intelligence to Manage Big Data for Litigation FEBRUARY 3 5, 2015 / THE HILTON NEW YORK Using Artificial Intelligence to Manage Big Data for Litigation Understanding Artificial Intelligence to Make better decisions Improve the process Allay the fear

Random forest algorithm in big data environment Random forest algorithm in big data environment Yingchun Liu * School of Economics and Management, Beihang University, Beijing 100191, China Received 1 September 2014, www.cmnt.lv Abstract Random forest

E-commerce Transaction Anomaly Classification E-commerce Transaction Anomaly Classification Minyong Lee minyong@stanford.edu Seunghee Ham sham12@stanford.edu Qiyi Jiang qjiang@stanford.edu I. INTRODUCTION Due to the increasing popularity of e-commerce

Support Vector Machines with Clustering for Training with Very Large Datasets Support Vector Machines with Clustering for Training with Very Large Datasets Theodoros Evgeniou Technology Management INSEAD Bd de Constance, Fontainebleau 77300, France theodoros.evgeniou@insead.fr Massimiliano

Fast Matching of Binary Features Fast Matching of Binary Features Marius Muja and David G. Lowe Laboratory for Computational Intelligence University of British Columbia, Vancouver, Canada {mariusm,lowe}@cs.ubc.ca Abstract There has been

Practical Applications of DATA MINING. Sang C Suh, Texas A&M University Commerce. JONES & BARTLETT LEARNING. Contents: Preface xi, Foreword by Murat M. Tanik xvii, Foreword by John Kocur xix, Chapter 1 Introduction

Introduction to Data Mining. Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Introduction to Data Mining Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Overview Introduction The Data Mining Process The Basic Data Types The Major Building Blocks Scalability and Streaming

A Novel Feature Selection Method Based on an Integrated Data Envelopment Analysis and Entropy Mode A Novel Feature Selection Method Based on an Integrated Data Envelopment Analysis and Entropy Mode Seyed Mojtaba Hosseini Bamakan, Peyman Gholami RESEARCH CENTRE OF FICTITIOUS ECONOMY & DATA SCIENCE UNIVERSITY

Machine Learning. CS494/594, Fall 2007 11:10 AM 12:25 PM Claxton 205. Slides adapted (and extended) from: ETHEM ALPAYDIN The MIT Press, 2004 CS494/594, Fall 2007 11:10 AM 12:25 PM Claxton 205 Machine Learning Slides adapted (and extended) from: ETHEM ALPAYDIN The MIT Press, 2004 alpaydin@boun.edu.tr http://www.cmpe.boun.edu.tr/~ethem/i2ml What

Data Mining Algorithms Part 1. Dejan Sarka Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses

Big Data: Image & Video Analytics Big Data: Image & Video Analytics How it could support Archiving & Indexing & Searching Dieter Haas, IBM Deutschland GmbH The Big Data Wave 60% of internet traffic is multimedia content (images and videos)

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

Data Mining Yelp Data - Predicting rating stars from review text Data Mining Yelp Data - Predicting rating stars from review text Rakesh Chada Stony Brook University rchada@cs.stonybrook.edu Chetan Naik Stony Brook University cnaik@cs.stonybrook.edu ABSTRACT The majority

Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University caizhua@gmail.com 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

Simple and efficient online algorithms for real world applications Simple and efficient online algorithms for real world applications Università degli Studi di Milano Milano, Italy Talk @ Centro de Visión por Computador Something about me PhD in Robotics at LIRA-Lab,

Machine Learning and Statistics: What s the Connection? Machine Learning and Statistics: What s the Connection? Institute for Adaptive and Neural Computation School of Informatics, University of Edinburgh, UK August 2006 Outline The roots of machine learning

Machine Learning for Data Science (CS4786) Lecture 1 Machine Learning for Data Science (CS4786) Lecture 1 Tu-Th 10:10 to 11:25 AM Hollister B14 Instructors : Lillian Lee and Karthik Sridharan ROUGH DETAILS ABOUT THE COURSE Diagnostic assignment 0 is out:

Semantic Video Annotation by Mining Association Patterns from Visual and Speech Features Semantic Video Annotation by Mining Association Patterns from and Speech Features Vincent. S. Tseng, Ja-Hwung Su, Jhih-Hong Huang and Chih-Jen Chen Department of Computer Science and Information Engineering

Learning Gaussian process models from big data. Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu Learning Gaussian process models from big data Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu Machine learning seminar at University of Cambridge, July 4 2012 Data A lot of

MS1b Statistical Data Mining MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to

IJCSES Vol.7 No.4 October 2013 pp.165-168 Serials Publications BEHAVIOR PERDITION VIA MINING SOCIAL DIMENSIONS IJCSES Vol.7 No.4 October 2013 pp.165-168 Serials Publications BEHAVIOR PERDITION VIA MINING SOCIAL DIMENSIONS V.Sudhakar 1 and G. Draksha 2 Abstract:- Collective behavior refers to the behaviors of individuals

Machine Learning CS 6830. Lecture 01. Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu Machine Learning CS 6830 Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu What is Learning? Merriam-Webster: learn = to acquire knowledge, understanding, or skill

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

Question 2 Naïve Bayes (16 points) Question 2 Naïve Bayes (16 points) About 2/3 of your email is spam so you downloaded an open source spam filter based on word occurrences that uses the Naive Bayes classifier. Assume you collected the

A Comparative Study on Sentiment Classification and Ranking on Product Reviews A Comparative Study on Sentiment Classification and Ranking on Product Reviews C.EMELDA Research Scholar, PG and Research Department of Computer Science, Nehru Memorial College, Putthanampatti, Bharathidasan

Unsupervised Data Mining (Clustering) Unsupervised Data Mining (Clustering) Javier Béjar KEMLG December 01 Javier Béjar (KEMLG) Unsupervised Data Mining (Clustering) December 01 1 / 51 Introduction Clustering in KDD One of the main tasks in

A Big Data Analytical Framework For Portfolio Optimization Abstract. Keywords. 1. Introduction A Big Data Analytical Framework For Portfolio Optimization Dhanya Jothimani, Ravi Shankar and Surendra S. Yadav Department of Management Studies, Indian Institute of Technology Delhi {dhanya.jothimani,

Data Mining Analytics for Business Intelligence and Decision Support Data Mining Analytics for Business Intelligence and Decision Support Chid Apte, T.J. Watson Research Center, IBM Research Division Knowledge Discovery and Data Mining (KDD) techniques are used for analyzing

ADVANCED MACHINE LEARNING. Introduction 1 1 Introduction Lecturer: Prof. Aude Billard (aude.billard@epfl.ch) Teaching Assistants: Guillaume de Chambrier, Nadia Figueroa, Denys Lamotte, Nicola Sommer 2 2 Course Format Alternate between: Lectures

Learning to Rank Revisited: Our Progresses in New Algorithms and Tasks The 4 th China-Australia Database Workshop Melbourne, Australia Oct. 19, 2015 Learning to Rank Revisited: Our Progresses in New Algorithms and Tasks Jun Xu Institute of Computing Technology, Chinese Academy

A Learning Based Method for Super-Resolution of Low Resolution Images A Learning Based Method for Super-Resolution of Low Resolution Images Emre Ugur June 1, 2004 emre.ugur@ceng.metu.edu.tr Abstract The main objective of this project is the study of a learning based method

Sense Making in an IOT World: Sensor Data Analysis with Deep Learning Sense Making in an IOT World: Sensor Data Analysis with Deep Learning Natalia Vassilieva, PhD Senior Research Manager GTC 2016 Deep learning proof points as of today Vision Speech Text Other Search & information
