Big Data in Web Age - 互联网时代的大数据
1 Big Data in Web Age - 互联网时代的大数据. Zhang Bo (张钹), Department of Computer Science & Technology, Tsinghua University
2 The Big Data Age. Volume: 2.8 ZB (1 ZB = 10^21 bytes), Variety, Velocity. Searching for a needle in a haystack!
3 The Characteristics of Big Data. Data from crowds to crowds: 34% useful, illusive, useless, content safety. Raw data: 7% tagged, 1% analyzed.
4 Man-Machine Interface. [Diagram labels: Text, Speech, Image, Behaviors; Programming, Encoding; User's Intention, Interests, Meaning, Semantics, Content; Interpretation, Decoding; Code, Data, Instruction; Computer, Net]
5 Image Retrieval by Keywords - white horse (Google)
6 Baidu (a Chinese web search engine) - 马, 树 (Horse, Tree)
7 The New Demands of Information Processing in the Big Data Age: users' intentions, users' interests, users' feelings; understanding (comprehension) of the meaning of information.
8 The fundamental difficulty faced by traditional information processing
9 Why? Basic assumption: meaning-form separation. The meaning-independence assumption (R. Hartley [1]). "These semantic aspects of communication are irrelevant to the engineering problem." (C. E. Shannon [2]) [1] R. V. L. Hartley, "Transmission of Information", Bell System Technical Journal, July 1928. [2] C. E. Shannon, "A Mathematical Theory of Communication", Bell System Technical Journal, vol. 27, July and October 1948.
10 Comprehension: The Natural (Objective) Meaning
11 The Demand for a Meaning-Dependent Information Theory. [Diagram labels: Human Sender X (Text, Speech, Image); refers to / correlates with the physical or conceptual world; Meaning M; Machine Receiver X; Traditional Information Processing]
12 Challenges! Can a machine deal with the meaning of information? How can a machine deal with meaning? Can traditional information theory deal with meaning, and if so, how?
13 Probability-based Theory. [Diagram labels: Sender X; Meaning M; refers to / correlates with the physical or conceptual world; Mapping F(W, D); Receiver X; representation, coding, data; Feature Space]
14 Fundamental Problems. Feature representation and meaning: does the mapping exist? How do we find the mapping?
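15 Does there exist such a mapping? [Sample text on the slide, translated from Chinese: "Digital video coding technology has developed for half a century and has made great progress: from differential predictive coding in the 1950s, to transform coding and block-based motion-predictive coding in the 1970s, to the distributed coding, stereoscopic video coding, multi-view coding, and visual coding that are emerging today."] Data -> Mapping? -> Meaning (rules, concepts)
16 No, in general! The mapping has to cross a semantic gap between meaning (semantics) and data: bag of words (text); colors, textures (image); frequency spectrum (speech).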
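A minimal sketch of the semantic gap for text, assuming a whitespace-tokenized bag-of-words feature (the function name and the two sentences are illustrative, not from the slides). Two sentences with opposite meanings map to exactly the same feature, so no mapping from this representation alone can recover the meaning:

```python
from collections import Counter


def bag_of_words(text):
    """Map a sentence to word counts, discarding word order."""
    return Counter(text.lower().split())


a = bag_of_words("dog bites man")
b = bag_of_words("man bites dog")

# Different meanings, identical features: word order is lost.
same_features = (a == b)
print(same_features)
```

The same argument applies to color histograms for images or a frequency spectrum for speech: the feature keeps some statistics of the data and drops the structure that carries meaning.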
17 Data-Driven Methods. Dataset -> Pattern, via machine learning. For a specific data set and a proper representation, there exists such a mapping.
18 How to Mine the Mapping. Ill-posed problems: existence, uniqueness, stability. Machine learning.
19 Classical Statistics Solution. Law of large numbers in function spaces. Parametric statistics. Assumption: a known function with a few unknown parameters, e.g. ax^2 + bx + c.
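The parametric assumption can be illustrated with a small sketch: the functional form y = ax^2 + bx + c is taken as known, and only the three parameters are estimated from data by least squares. The function name, the pure-Python solver, and the sample data are assumptions made for illustration; they are not taken from the slides.

```python
def fit_quadratic(xs, ys):
    """Least-squares estimate of (a, b, c) in y = a*x^2 + b*x + c."""
    # Normal equations (A^T A) p = A^T y for design rows [x^2, x, 1].
    rows = [[x * x, x, 1.0] for x in xs]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]

    # Gaussian elimination with partial pivoting on the augmented 3x4 matrix.
    m = [ata[i] + [aty[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= factor * m[col][k]

    # Back substitution.
    p = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(m[i][k] * p[k] for k in range(i + 1, 3))
        p[i] = (m[i][3] - tail) / m[i][i]
    return tuple(p)


# Noise-free samples from y = 2x^2 - 3x + 1 (illustrative data).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x * x - 3.0 * x + 1.0 for x in xs]
a, b, c = fit_quadratic(xs, ys)
print(a, b, c)
```

With only three unknowns, a handful of samples is enough. The difficulty the talk points to arises when the functional form itself is unknown.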
20 Recent Results. Data: F(x, y) = F(y | x) F(x). Function (rules): y = f(x). If F(x, y) or f(x) exists, the rule can be found in a probabilistic sense (stated on the slide in terms of an error probability P_e and the sample size N).
21 Data-Driven Machine Learning (rote, superficial): without comprehension! Can machines understand text, images, or speech?
22 Artificial Intelligence Methods. [Diagram labels: Human Sender X (Text, Speech, Image); refers to / correlates with the physical or conceptual world; Meaning S; AI; Machine Receiver X; information processing with understanding]
23 Expert Systems. Example: a human disease diagnosis system. Production rules: IF a (symptoms, fuzzy) THEN b (function disorder, fuzzy), with CF (certainty factors). Inference engine.
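A minimal sketch of production rules with certainty factors. The slide does not say how certainty factors are propagated, so this assumes the MYCIN-style scheme: a fired rule contributes its CF times the weakest premise, and two positive contributions combine as cf1 + cf2 * (1 - cf1). The rule contents, names, and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    premises: tuple   # symptom names that must all be observed
    conclusion: str   # disorder suggested by the rule
    cf: float         # certainty factor of the rule itself, in (0, 1]


def combine(cf1, cf2):
    """MYCIN-style combination of two positive certainty factors."""
    return cf1 + cf2 * (1.0 - cf1)


def infer(rules, evidence):
    """Fire every rule whose premises are all observed.

    evidence maps symptom -> certainty in [0, 1] (the "fuzzy" part).
    Returns disorder -> combined certainty factor.
    """
    beliefs = {}
    for rule in rules:
        if all(p in evidence for p in rule.premises):
            strength = rule.cf * min(evidence[p] for p in rule.premises)
            previous = beliefs.get(rule.conclusion, 0.0)
            beliefs[rule.conclusion] = combine(previous, strength)
    return beliefs


RULES = [
    Rule(("fever", "cough"), "respiratory infection", 0.7),
    Rule(("sore throat",), "respiratory infection", 0.4),
    Rule(("rash",), "allergy", 0.6),
]

beliefs = infer(RULES, {"fever": 0.9, "cough": 0.8, "sore throat": 1.0})
print(beliefs)
```

Here the first rule contributes 0.7 * 0.8 = 0.56 and the second 0.4, which combine to 0.56 + 0.4 * 0.44 = 0.736. The rule about rash never fires because no rash was observed.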
24 Scope of Application. Deliberative behaviors: problem solving, decision making, diagnosis, planning, common sense, natural language understanding, ... Perception: vision, speech, touch, etc.
25 Natural Language Understanding. Manual, rule-based knowledge representation: syntax, morphology, semantics, ... Symbolic inference.
26 Neither traditional information processing nor AI alone can solve the comprehension problem. What should we do next?
27 Comprehension. Text: contextual structure. Image: spatial structure. Speech: temporal structure. Video: temporal-spatial structure. Structured analysis & representation. [Sample text on the slide, translated from Chinese: "Digital video coding technology has developed for half a century and has made great progress: from differential predictive coding in the 1950s, to transform coding and block-based motion-predictive coding in the 1970s, to the distributed coding, stereoscopic video coding, multi-view coding, and visual coding that are emerging today."]
28 Computer Comprehension of Text. A paragraph (the sample text from slide 27) is decomposed into Sentence-1, Sentence-2, ..., Sentence-n, and each sentence into words: Word-11, Word-12, ..., Word-1m; Word-21, Word-22, ...
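The paragraph -> sentence -> word decomposition can be sketched as follows. This is a minimal illustration for English text, assuming sentences end in ".", "!" or "?" and words are runs of letters, digits, or apostrophes. Chinese text such as the slide's sample would need a word segmenter instead of the regular expressions used here. The function name and the example paragraph are hypothetical.

```python
import re


def parse_paragraph(paragraph):
    """Return a list of sentences, each a list of word tokens."""
    # Split after sentence-final punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]
    # Tokenize each sentence into words.
    return [re.findall(r"[A-Za-z0-9']+", s) for s in sentences]


tree = parse_paragraph("Video coding has a long history. It keeps evolving!")
print(tree)
```

The nested list is the simplest structured representation of the hierarchy on the slide: `tree[i][j]` is Word-ij of Sentence-i (counting from zero).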
29 This figure is from Serre et al.'s A quantitative theory of immediate visual recognition. Prog Brain Res
30 Unsupervised Deep Learning. 9-layer sparse deep autoencoder; 10 million 200x200 images; 1 billion connections; 1,000 machines (16,000 cores), 3 days; 1 billion trainable parameters. Q. V. Le et al., "Building high-level features using large scale unsupervised learning", Proc. 29th ICML, 2012.
31 Results (Generalization Capacity)
Concept | Random guess | Same architecture with random weights | Best linear filter | Best first layer neuron | Best neuron | Best neuron without contrast normalization
Faces | 64.8% | 67.0% | 74.0% | 71.0% | 81.7% | 78.5%
Human bodies | 64.8% | 66.5% | 68.1% | 67.2% | 76.8% | 71.8%
Cats | 64.8% | 66.0% | 67.8% | 67.1% | 74.6% | 69.3%

Concept | Stanford network | Deep autoencoders, 3 layers | Deep autoencoders, 6 layers | K-means on 40x40 images
Faces | 81.7% | 72.3% | 70.9% | 72.5%
Human bodies | 76.7% | 71.2% | 69.8% | 69.3%
Cats | 74.8% | 67.5% | |
32 Computer Comprehension of Visual Information. [Diagram labels: V1, V2, IT; Local connection; High-level; Top-down feedback; Data-driven; Knowledge-driven]
33 Data-driven + Knowledge-driven. Statistical inference over an abstract, structured, declarative knowledge representation [1]. The probabilistic approach to artificial intelligence [2]. [1] J. B. Tenenbaum (MIT) et al., "How to Grow a Mind", Science, 11 March 2011, vol. 331, no. 6022. [2] Judea Pearl, winner of the 2011 ACM Turing Award.
34 Quotient Space Based Problem Solving -A theoretical foundation of granular computing
35 Domestic distribution (国内发行)
36 Structural Prediction Learning
Learning rule | Classification | Structural prediction
Maximal Joint Likelihood Estimation | Naïve Bayesian Network | Hidden Markov Model (1966)
Maximal Conditional Likelihood Estimation | Logistic Regression | Conditional Random Field (2001)
Maximal Margin Learning | SVM | Maximal Margin Markov Net (2003)
Maximal Entropy Discrimination Learning | Maximal Entropy Discrimination Model | Maximal Entropy Discrimination Markov Net (2008) (Zhu Jun)
37 Bayes' theorem (T. Bayes): Prior Distribution + Likelihood Function -> Posterior Distribution. Optimization-based regularized Bayesian inference: Prior Distribution + Likelihood Function + Posterior Constraints (attributes, domain knowledge) -> Optimization Theory -> Posterior Distribution. Zhu Jun, Tsinghua University.
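The first half of this slide, prior times likelihood giving the posterior, can be shown with a small worked example over a finite set of hypotheses. The function name and the numbers (a diagnostic test with a 1% base rate) are illustrative assumptions. The sketch covers plain Bayes' theorem only, not the regularized variant with posterior constraints.

```python
def posterior(prior, likelihood):
    """Bayes' theorem over a finite set of hypotheses.

    prior:      hypothesis -> P(h)
    likelihood: hypothesis -> P(data | h)
    Returns     hypothesis -> P(h | data)
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())  # P(data)
    return {h: v / evidence for h, v in unnormalized.items()}


prior = {"disease": 0.01, "healthy": 0.99}
likelihood = {"disease": 0.95, "healthy": 0.05}  # P(positive test | h)
post = posterior(prior, likelihood)
print(post)
```

Even with a fairly accurate test, the posterior probability of disease is only 0.0095 / 0.059, roughly 16%, because the prior is so small. This is the kind of interaction between prior knowledge and data that the slide's diagram summarizes.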
38 Neural Turing Machine (Google DeepMind, London, UK). [Diagram labels: External Input; External Output; Recurrent NN; Feedforward NN; Read Heads; Write Heads; Memory]
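The slide only names the components, so the following is a hedged sketch of one of them: a read head using content-based addressing, the scheme described in the Neural Turing Machine paper, where read weights are a softmax over the scaled cosine similarity between a key and each memory row. The function names, the sharpness parameter `beta`, and the toy memory are assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def content_read(memory, key, beta=5.0):
    """Soft read: weights = softmax(beta * cosine(key, row)); output = weighted sum of rows."""
    scores = [beta * cosine(key, row) for row in memory]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    width = len(memory[0])
    read = [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]
    return weights, read


memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weights, read = content_read(memory, key=[0.0, 1.0, 0.0], beta=10.0)
print(weights, read)
```

Because every step is a smooth function of the key and the memory, the read can be trained by gradient descent together with the controller network. A larger `beta` makes the read closer to a hard lookup of the best-matching row.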
39 Three Levels of Processing. Natural meaning: recognition, ill-posed problems. Sender's intention: context-aware, psychological model. Receiver's reaction (impact): social knowledge, ...
40 Conclusions. Basic foundation: content-related information processing; multi-granular computing. Applied foundation: algorithms, architecture, parallelism, management, storage, ...
41 Publications - Journal Papers
J. Zhu, A. Ahmed, E.P. Xing. MedLDA: Maximum Margin Supervised Topic Models. Journal of Machine Learning Research (JMLR), 13(Aug), 2012.
N. Chen, J. Zhu, F. Sun, E.P. Xing. Large-margin Subspace Learning for Multi-view Data Analysis. IEEE Trans. on Pattern Analysis and Machine Intelligence (PAMI), vol. 34, no. 12, Dec.
C. Liu, B. Zhang, J. Zhu, and D. Wang. Learning a Contextual Multithread Model for Movie/TV Scene Segmentation. IEEE Transactions on Multimedia (TMM).
X. Hu and J. Wang. Solving the assignment problem using continuous-time and discrete-time improved dual networks. IEEE Transactions on Neural Networks and Learning Systems (TNNLS), vol. 23, no. 5.
X. Hu and B. Zhang. A Gaussian attractor network for memory and recognition with experience-dependent learning. Neural Computation, vol. 22, no. 5.
X. Hu, C. Sun and B. Zhang. Design of recurrent neural networks for solving constrained least absolute deviation problems. IEEE Transactions on Neural Networks (TNN), vol. 21, no. 7, July 2010.
42 Publications - Journal Papers (continued)
J. Zhu, E.P. Xing. Maximum Entropy Discrimination Markov Networks. Journal of Machine Learning Research (JMLR), vol. 10(Nov).
X. Hu and B. Zhang. A new recurrent neural network for solving convex quadratic programming problems with an application to the k-winners-take-all problem. IEEE Transactions on Neural Networks (TNN), vol. 20, no. 4, April.
D. Wang, Z. Wang, J. Li, B. Zhang, and X. Li. Query representation by structured concept threads with application to interactive video retrieval. Journal of Visual Communication and Image Representation, 2009, vol. 20(2).
J. Zhu, Z. Nie, B. Zhang, and J. Wen. Dynamic Hierarchical Markov Random Fields for Integrated Web Data Extraction. Journal of Machine Learning Research (JMLR), vol. 9(Jul), 2008.
43 Conference Papers
J. Zhu, N. Chen, H. Perkins, B. Zhang. Gibbs Max-Margin Supervised Topic Models with Fast Sampling Algorithms. In Proc. of the 30th International Conference on Machine Learning (ICML), Atlanta, USA.
M. Xu, J. Zhu, B. Zhang. Fast Max-Margin Matrix Factorization with Data Augmentation. In Proc. of the 30th International Conference on Machine Learning (ICML), Atlanta, USA.
N. Chen, J. Zhu, F. Xia, and B. Zhang. Generalized Relational Topic Models with Data Augmentation. To appear in Proc. of the 23rd International Joint Conference on Artificial Intelligence (IJCAI), Beijing, China.
M. Xu, J. Zhu, and B. Zhang. Bayesian Nonparametric Maximum Margin Matrix Factorization for Collaborative Prediction. Advances in Neural Information Processing Systems (NIPS), Lake Tahoe, USA.
Q. Jiang, J. Zhu, M. Sun, and E.P. Xing. Monte Carlo Methods for Maximum Margin Supervised Topic Models. Advances in Neural Information Processing Systems (NIPS), Lake Tahoe, USA.
J. Ji, J. Li, S. Yan, B. Zhang, and Q. Tian. Super-Bit Locality-Sensitive Hashing. Advances in Neural Information Processing Systems (NIPS), Lake Tahoe, USA, 2012.
44 Conference Papers (continued)
J. Zhu. Max-Margin Nonparametric Latent Feature Models for Link Prediction. In Proc. of the 29th International Conference on Machine Learning (ICML), Edinburgh, Scotland.
J. Zhu, N. Chen, E.P. Xing. Infinite Latent SVM for Classification and Multitask Learning. Advances in Neural Information Processing Systems (NIPS), Granada, Spain.
J. Zhu, E.P. Xing. Sparse Topical Coding. In Proc. of the 27th Conference on Uncertainty in Artificial Intelligence (UAI), Barcelona, Spain.
J. Zhu, N. Chen, E.P. Xing. Infinite SVM: a Dirichlet Process Mixture of Large-margin Kernel Machines. In Proc. of the 28th International Conference on Machine Learning (ICML), Bellevue, Washington, USA.
J. Zhu, L.-J. Li, L. Fei-Fei, E.P. Xing. Large Margin Training of Upstream Scene Understanding Models. Advances in Neural Information Processing Systems (NIPS), Vancouver, B.C., Canada.
S. Lee, J. Zhu, E.P. Xing. Detecting eQTLs using Adaptive Multi-task Lasso. Advances in Neural Information Processing Systems (NIPS), Vancouver, B.C., Canada.
N. Chen, J. Zhu and E.P. Xing. Predictive Subspace Learning for Multiview Data: a Large Margin Approach. Advances in Neural Information Processing Systems (NIPS), Vancouver, B.C., Canada, 2010.
45 Conference Papers (continued)
J. Zhu, E.P. Xing. Conditional Topic Random Fields. In Proc. of the 27th International Conference on Machine Learning (ICML), Haifa, Israel.
J. Zhu and E.P. Xing. On Primal and Dual Sparsity of Markov Networks. In Proc. of the 26th International Conference on Machine Learning (ICML), Montreal, Canada.
J. Zhu, A. Ahmed, and E.P. Xing. MedLDA: Maximum Margin Supervised Topic Models for Regression and Classification. In Proc. of the 26th International Conference on Machine Learning (ICML), Montreal, Canada.
J. Zhu, E.P. Xing, and B. Zhang. Partially Observed Maximum Entropy Discrimination Markov Networks. Advances in Neural Information Processing Systems (NIPS), Vancouver, B.C., Canada.
J. Zhu, E.P. Xing, and B. Zhang. Laplace Maximum Margin Markov Networks. In Proc. of the 25th International Conference on Machine Learning (ICML), Helsinki, Finland.
J. Zhu, Z. Nie, et al. 2D Conditional Random Fields for Web Information Extraction. In Proc. of the 22nd International Conference on Machine Learning (ICML), Bonn, Germany, 2005.
46 Conference Papers (continued)
J. Zhu, X. Zheng, L. Zhou, and B. Zhang. Scalable Inference in Max-margin Supervised Topic Models. To appear in Proc. of the 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD), Chicago, USA.
J. Zhu, X. Zheng, and B. Zhang. Bayesian Logistic Supervised Topic Models with Data Augmentation. To appear in Proc. of the 51st Annual Meeting of the Association for Computational Linguistics (ACL), Sofia, Bulgaria.
A. Zhang, J. Zhu, and B. Zhang. Sparse Online Topic Models. In Proc. of the 22nd International World Wide Web Conference (WWW), Rio de Janeiro, Brazil.
Y. Tian and J. Zhu. Learning from Crowds in the Presence of Schools of Thought. In Proc. of the 18th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD), Beijing, China.
L. Xie, Q. Tian, and B. Zhang. Spatial pooling of heterogeneous features for image applications. ACM Multimedia, 2012.
J. Zhu, N. Lao, and E.P. Xing. Grafting-Light: Fast, Incremental Feature Selection and Structure Learning of Markov Random Fields. In Proc. of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), Washington DC, USA, 2010.
47 Conference Papers (continued)
X. Shi, J. Zhu, R. Cai, and L. Zhang. User Grouping Behavior in Online Forums. In Proc. of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), Paris, France.
Y. Liang, J. Li, and B. Zhang. Vocabulary-based hashing for image search. ACM MM.
J. Zhu, Z. Nie, X. Liu, B. Zhang, and J.-R. Wen. StatSnowball: a Statistical Approach to Extracting Entity Relationships. In Proc. of the 18th International World Wide Web Conference (WWW), Madrid, Spain.
J. Yuan, J. Li, and B. Zhang. Scene understanding with discriminative structured prediction. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008.
J. Zhu, Z. Nie, et al. Simultaneous Record Detection and Attribute Labeling in Web Data Extraction. In Proc. of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), Philadelphia, PA, USA, 2006.
48 Thank you!
Distributed Sensor Networks Volume 2015, Article ID 157453, 7 pages http://dx.doi.org/10.1155/2015/157453 Research Article Distributed Data Mining Based on Deep Neural Network for Wireless Sensor Network
More informationSocial-Sensed Multimedia Computing
Social-Sensed Multimedia Computing Wenwu Zhu Tsinghua University Multimedia Computing Search Recommend Multimedia Summarize Social Distribution... Sense from Social Preference Influence User behaviors
More informationUsing Artificial Intelligence to Manage Big Data for Litigation
FEBRUARY 3 5, 2015 / THE HILTON NEW YORK Using Artificial Intelligence to Manage Big Data for Litigation Understanding Artificial Intelligence to Make better decisions Improve the process Allay the fear
More informationRandom forest algorithm in big data environment
Random forest algorithm in big data environment Yingchun Liu * School of Economics and Management, Beihang University, Beijing 100191, China Received 1 September 2014, www.cmnt.lv Abstract Random forest
More informationE-commerce Transaction Anomaly Classification
E-commerce Transaction Anomaly Classification Minyong Lee minyong@stanford.edu Seunghee Ham sham12@stanford.edu Qiyi Jiang qjiang@stanford.edu I. INTRODUCTION Due to the increasing popularity of e-commerce
More informationSupport Vector Machines with Clustering for Training with Very Large Datasets
Support Vector Machines with Clustering for Training with Very Large Datasets Theodoros Evgeniou Technology Management INSEAD Bd de Constance, Fontainebleau 77300, France theodoros.evgeniou@insead.fr Massimiliano
More informationFast Matching of Binary Features
Fast Matching of Binary Features Marius Muja and David G. Lowe Laboratory for Computational Intelligence University of British Columbia, Vancouver, Canada {mariusm,lowe}@cs.ubc.ca Abstract There has been
More informationPractical Applications of DATA MINING. Sang C Suh Texas A&M University Commerce JONES & BARTLETT LEARNING
Practical Applications of DATA MINING Sang C Suh Texas A&M University Commerce r 3 JONES & BARTLETT LEARNING Contents Preface xi Foreword by Murat M.Tanik xvii Foreword by John Kocur xix Chapter 1 Introduction
More informationIntroduction to Data Mining. Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj
Introduction to Data Mining Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Overview Introduction The Data Mining Process The Basic Data Types The Major Building Blocks Scalability and Streaming
More informationA Novel Feature Selection Method Based on an Integrated Data Envelopment Analysis and Entropy Mode
A Novel Feature Selection Method Based on an Integrated Data Envelopment Analysis and Entropy Mode Seyed Mojtaba Hosseini Bamakan, Peyman Gholami RESEARCH CENTRE OF FICTITIOUS ECONOMY & DATA SCIENCE UNIVERSITY
More informationMachine Learning. CS494/594, Fall 2007 11:10 AM 12:25 PM Claxton 205. Slides adapted (and extended) from: ETHEM ALPAYDIN The MIT Press, 2004
CS494/594, Fall 2007 11:10 AM 12:25 PM Claxton 205 Machine Learning Slides adapted (and extended) from: ETHEM ALPAYDIN The MIT Press, 2004 alpaydin@boun.edu.tr http://www.cmpe.boun.edu.tr/~ethem/i2ml What
More informationData Mining Algorithms Part 1. Dejan Sarka
Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses
More informationBig Data: Image & Video Analytics
Big Data: Image & Video Analytics How it could support Archiving & Indexing & Searching Dieter Haas, IBM Deutschland GmbH The Big Data Wave 60% of internet traffic is multimedia content (images and videos)
More informationSocial Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
More informationData Mining Yelp Data - Predicting rating stars from review text
Data Mining Yelp Data - Predicting rating stars from review text Rakesh Chada Stony Brook University rchada@cs.stonybrook.edu Chetan Naik Stony Brook University cnaik@cs.stonybrook.edu ABSTRACT The majority
More informationBayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com
Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University caizhua@gmail.com 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian
More informationDATA MINING TECHNIQUES AND APPLICATIONS
DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,
More informationSimple and efficient online algorithms for real world applications
Simple and efficient online algorithms for real world applications Università degli Studi di Milano Milano, Italy Talk @ Centro de Visión por Computador Something about me PhD in Robotics at LIRA-Lab,
More informationMachine Learning and Statistics: What s the Connection?
Machine Learning and Statistics: What s the Connection? Institute for Adaptive and Neural Computation School of Informatics, University of Edinburgh, UK August 2006 Outline The roots of machine learning
More informationMachine Learning for Data Science (CS4786) Lecture 1
Machine Learning for Data Science (CS4786) Lecture 1 Tu-Th 10:10 to 11:25 AM Hollister B14 Instructors : Lillian Lee and Karthik Sridharan ROUGH DETAILS ABOUT THE COURSE Diagnostic assignment 0 is out:
More informationSemantic Video Annotation by Mining Association Patterns from Visual and Speech Features
Semantic Video Annotation by Mining Association Patterns from and Speech Features Vincent. S. Tseng, Ja-Hwung Su, Jhih-Hong Huang and Chih-Jen Chen Department of Computer Science and Information Engineering
More informationLearning Gaussian process models from big data. Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu
Learning Gaussian process models from big data Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu Machine learning seminar at University of Cambridge, July 4 2012 Data A lot of
More informationMS1b Statistical Data Mining
MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to
More informationIJCSES Vol.7 No.4 October 2013 pp.165-168 Serials Publications BEHAVIOR PERDITION VIA MINING SOCIAL DIMENSIONS
IJCSES Vol.7 No.4 October 2013 pp.165-168 Serials Publications BEHAVIOR PERDITION VIA MINING SOCIAL DIMENSIONS V.Sudhakar 1 and G. Draksha 2 Abstract:- Collective behavior refers to the behaviors of individuals
More informationMachine Learning CS 6830. Lecture 01. Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu
Machine Learning CS 6830 Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu What is Learning? Merriam-Webster: learn = to acquire knowledge, understanding, or skill
More informationHow to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning
How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume
More informationQuestion 2 Naïve Bayes (16 points)
Question 2 Naïve Bayes (16 points) About 2/3 of your email is spam so you downloaded an open source spam filter based on word occurrences that uses the Naive Bayes classifier. Assume you collected the
More informationA Comparative Study on Sentiment Classification and Ranking on Product Reviews
A Comparative Study on Sentiment Classification and Ranking on Product Reviews C.EMELDA Research Scholar, PG and Research Department of Computer Science, Nehru Memorial College, Putthanampatti, Bharathidasan
More informationUnsupervised Data Mining (Clustering)
Unsupervised Data Mining (Clustering) Javier Béjar KEMLG December 01 Javier Béjar (KEMLG) Unsupervised Data Mining (Clustering) December 01 1 / 51 Introduction Clustering in KDD One of the main tasks in
More informationA Big Data Analytical Framework For Portfolio Optimization Abstract. Keywords. 1. Introduction
A Big Data Analytical Framework For Portfolio Optimization Dhanya Jothimani, Ravi Shankar and Surendra S. Yadav Department of Management Studies, Indian Institute of Technology Delhi {dhanya.jothimani,
More informationData Mining Analytics for Business Intelligence and Decision Support
Data Mining Analytics for Business Intelligence and Decision Support Chid Apte, T.J. Watson Research Center, IBM Research Division Knowledge Discovery and Data Mining (KDD) techniques are used for analyzing
More informationADVANCED MACHINE LEARNING. Introduction
1 1 Introduction Lecturer: Prof. Aude Billard (aude.billard@epfl.ch) Teaching Assistants: Guillaume de Chambrier, Nadia Figueroa, Denys Lamotte, Nicola Sommer 2 2 Course Format Alternate between: Lectures
More informationLearning to Rank Revisited: Our Progresses in New Algorithms and Tasks
The 4 th China-Australia Database Workshop Melbourne, Australia Oct. 19, 2015 Learning to Rank Revisited: Our Progresses in New Algorithms and Tasks Jun Xu Institute of Computing Technology, Chinese Academy
More informationA Learning Based Method for Super-Resolution of Low Resolution Images
A Learning Based Method for Super-Resolution of Low Resolution Images Emre Ugur June 1, 2004 emre.ugur@ceng.metu.edu.tr Abstract The main objective of this project is the study of a learning based method
More informationSense Making in an IOT World: Sensor Data Analysis with Deep Learning
Sense Making in an IOT World: Sensor Data Analysis with Deep Learning Natalia Vassilieva, PhD Senior Research Manager GTC 2016 Deep learning proof points as of today Vision Speech Text Other Search & information
More information