Big Data in Web Age - 互联网时代的大数据




Big Data in Web Age - 互联网时代的大数据. Zhang Bo (张钹), Department of Computer Science & Technology, Tsinghua University

The Big Data Era. Volume: 2.8 ZB (1 ZB = 10^21 bytes), Variety, Velocity. Searching for a needle in a haystack!

The Characteristics of Big Data. Data from crowds to crowds. 34% useful; illusive, useless; content safety. Raw data: 7% tagged, 1% analyzed.

Man-Machine Interface [diagram labels]: Text, Speech, Image, ..., Behaviors; Programming, Encoding; User's Intention, Interests; Meaning, Semantics, Content; Interpretation, Decoding; Code, Data, Instruction; Computer, Net.

Image Retrieval by Keywords - white horse (Google)

Baidu (a Chinese web search engine) - 马, 树 (Horse, Tree)

The New Demands of Information Processing in the Big Data Age: users' intention, users' interest, users' feeling, ...; understanding (comprehension) of the meaning of information.

The fundamental difficulty faced by traditional information processing

Why? Basic assumption: meaning-form separation. The meaning-independent assumption (R. Hartley) [1]. "These semantic aspects of communication are irrelevant to the engineering problem." (C. E. Shannon) [2]
[1] R. V. L. Hartley, Transmission of information, Bell System Technical Journal, July 1928, pp. 535-563.
[2] C. E. Shannon, A mathematical theory of communication, Bell System Technical Journal, vol. 27, pp. 379-423, July 1948, and pp. 623-656, October 1948.

Comprehension: The Natural (Objective) Meaning

The Demand for a Meaning-Dependent Information Theory. Human sender: X (text, speech, image), which refers to or correlates with the physical or conceptual world (meaning M). Machine receiver: X. Traditional information processing: X to X.

Challenges! Can a machine deal with the meaning of information? How can a machine deal with meaning? Can traditional information theory deal with meaning, and if so, how?

Probability-based Theory. Sender: X, with meaning M (refers to, correlates with the physical or conceptual world). Receiver: X (representation, coding, data), placed in a feature space. Mapping: F(W, D).

Fundamental Problems. Feature representation and meaning: Does the mapping exist? How can the mapping be found?

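Does there exist such a mapping? Sample text (data): "Digital video coding technology has been developing for half a century and has made great progress: from differential predictive coding in the 1950s, to transform coding and block-based motion-predictive coding in the 1970s, to the distributed coding, stereoscopic video coding, multi-view coding, visual coding, and other approaches emerging today." Mapping? Meaning (rules, concepts).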

No, in general! The mapping must cross a semantic gap between meaning (semantics) and data. Data: bag of words (text); colors, textures, ... (image); frequency spectrum (speech).
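The semantic gap for text can be illustrated with a minimal sketch of the bag-of-words representation named on the slide. The two example sentences and the function name `bag_of_words` are assumptions chosen for illustration; they are not from the talk.

```python
from collections import Counter
import re


def bag_of_words(text):
    """Return word counts for the text; word order is discarded."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


# Hypothetical sentences with opposite meanings.
a = bag_of_words("The dog bites the man")
b = bag_of_words("The man bites the dog")

# Both sentences map to the same point in feature space.
print(a == b)
print(dict(a))
```

Because the two sentences receive identical representations, no mapping defined on this feature space alone can separate their meanings.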

Data-Driven Methods. Dataset, pattern, machine learning. Given a specific data set and a proper representation, there exists such a mapping.

How to Mine the Mapping. Ill-posed problems: existence, uniqueness, stability. Machine learning.

Classical Statistics Solution. Law of large numbers in function spaces. Parametric statistics. Assumption: a known function with a few unknown parameters, e.g. $ax^2 + bx + c$.
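The parametric assumption can be made concrete with a small sketch: the functional form $ax^2 + bx + c$ is taken as known, and only the three parameters are estimated from data. Least squares via the normal equations is one common estimator and is an assumption here, since the slide does not name one; the sample data are hypothetical.

```python
def fit_quadratic(xs, ys):
    """Least-squares estimate of (a, b, c) in y = a*x^2 + b*x + c."""
    rows = [[x * x, x, 1.0] for x in xs]
    # Normal equations (A^T A) p = A^T y as an augmented 3x4 matrix.
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * y for r, y in zip(rows, ys))]
         for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


# Hypothetical noise-free samples of y = 2x^2 - 3x + 1.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x * x - 3.0 * x + 1.0 for x in xs]
a, b, c = fit_quadratic(xs, ys)
print(a, b, c)
```

With noisy samples the same call returns the parameters that minimize the squared error, and the estimates approach the true values as more data are used.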

Recent Results. Data: $F(x, y) = F(y \mid x)\,F(x)$. Function (rules): $y = f(x)$. If $F(x, y)$ or $f(x)$ exists, the rule can be found in a probabilistic sense (in terms of $P_e$ and $N$).

Data-Driven Machine Learning (rote, superficial): without comprehension! Can machines understand text, image, or speech?

Artificial Intelligence Methods. Human sender: X (text, speech, image), which refers to or correlates with the physical or conceptual world (meaning S). Machine receiver: X, with AI. Information processing with understanding.

Expert Systems. Example: a human disease diagnosis system. Production rules: IF a (symptoms, fuzzy) THEN b (function disorder, fuzzy), with CF (certainty factors). Inference engine.
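A production rule with a certainty factor can be sketched as follows. The combination formulas are the MYCIN-style ones (weakest premise times rule CF, and `cf1 + cf2 * (1 - cf1)` for two positive CFs); using them here is an assumption, since the slide does not say which calculus the system used. The symptoms, rule values, and function names are hypothetical.

```python
def combine_cf(cf1, cf2):
    """Combine two positive certainty factors supporting one conclusion."""
    return cf1 + cf2 * (1.0 - cf1)


def infer(facts, rules):
    """Apply each rule (premises, conclusion, rule_cf) once to the facts."""
    derived = dict(facts)
    for premises, conclusion, rule_cf in rules:
        if all(p in facts for p in premises):
            # The evidence is only as strong as the weakest premise.
            cf = rule_cf * min(facts[p] for p in premises)
            if conclusion in derived:
                derived[conclusion] = combine_cf(derived[conclusion], cf)
            else:
                derived[conclusion] = cf
    return derived


# Hypothetical fuzzy symptoms with their certainty factors.
facts = {"fever": 0.8, "cough": 0.6}
# IF fever AND cough THEN respiratory_disorder (CF 0.7), and so on.
rules = [
    (("fever", "cough"), "respiratory_disorder", 0.7),
    (("fever",), "respiratory_disorder", 0.4),
]
result = infer(facts, rules)
print(result["respiratory_disorder"])
```

The first rule contributes 0.7 × 0.6 = 0.42 and the second 0.4 × 0.8 = 0.32, which combine to 0.42 + 0.32 × (1 − 0.42) = 0.6056.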

Scopes of Application. Deliberative behaviors: problem solving, decision making, diagnosis, planning, common sense, natural language understanding, ... Perception: vision, speech, touch, etc.

Natural Language Understanding. Manual rule-based knowledge representation: syntax, morphology, semantics, ... Symbolic inference.

Neither traditional information processing nor AI alone can solve the comprehension problem. What should we do next?

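Comprehension. Text: contextual structure. Image: spatial structure. Speech: temporal structure. Video: temporal-spatial structure. Structured analysis and representation. Sample text: "Digital video coding technology has been developing for half a century and has made great progress: from differential predictive coding in the 1950s, to transform coding and block-based motion-predictive coding in the 1970s, to the distributed coding, stereoscopic video coding, multi-view coding, visual coding, and other approaches emerging today."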

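Computer Comprehension of Text. Paragraph: "Digital video coding technology has been developing for half a century and has made great progress: from differential predictive coding in the 1950s, to transform coding and block-based motion-predictive coding in the 1970s, to the distributed coding, stereoscopic video coding, multi-view coding, visual coding, and other approaches emerging today." The paragraph is decomposed into Sentence-1, Sentence-2, ..., Sentence-n, and each sentence into words: Word-11, Word-12, ..., Word-1m; Word-21, Word-22, ...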
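The paragraph, sentence, and word hierarchy can be sketched as a nested list. The splitting rules below (sentence breaks at `.`, `!`, `?`; words as runs of letters, digits, apostrophes, and hyphens) are simplifying assumptions for English text and are not the method used in the talk; the input sentence is hypothetical.

```python
import re


def parse_paragraph(paragraph):
    """Return a nested list: one list of words per sentence."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", paragraph) if s.strip()]
    return [re.findall(r"[A-Za-z0-9'-]+", s) for s in sentences]


tree = parse_paragraph(
    "Video coding has a long history. It has made great progress."
)
for i, words in enumerate(tree, start=1):
    # Word-ij on the slide corresponds to tree[i - 1][j - 1] here.
    print(f"Sentence-{i}:", words)
```

Keeping the nesting, rather than flattening to a bag of words, preserves the contextual structure that the slides argue is needed for comprehension.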

This figure is from Serre et al.'s A quantitative theory of immediate visual recognition. Prog Brain Res. 2007.

Unsupervised Deep Learning. 9-layer sparse deep autoencoder; 10 million 200x200 images; 1 billion connections; 1,000 machines (16,000 cores), 3 days; 1 billion trainable parameters. Q. V. Le, Building high-level features using large scale unsupervised learning, Proc. 29th ICML, 2012.

Results (Generalization Capacity)

| Concept | Random guess | Same architecture with random weights | Best linear filter | Best first layer neuron | Best neuron | Best neuron without contrast normalization |
| --- | --- | --- | --- | --- | --- | --- |
| Faces | 64.8% | 67.0% | 74.0% | 71.0% | 81.7% | 78.5% |
| Human bodies | 64.8% | 66.5% | 68.1% | 67.2% | 76.8% | 71.8% |
| Cats | 64.8% | 66.0% | 67.8% | 67.1% | 74.6% | 69.3% |

| Concept | Stanford network | Deep autoencoders, 3 layers | Deep autoencoders, 6 layers | K-means on 40x40 images |
| --- | --- | --- | --- | --- |
| Faces | 81.7% | 72.3% | 70.9% | 72.5% |
| Human bodies | 76.7% | 71.2% | 69.8% | 69.3% |
| Cats | 74.8% | 67.5% | 68.3% | 68.5% |

Computer Comprehension of Visual Information [figure: V1, V2, IT; local connection; top-down feedback from the high level; knowledge-driven and data-driven processing]

Data-driven + Knowledge-driven. Statistical inference over an abstract, structured, declarative knowledge representation [1]. The probabilistic approach to artificial intelligence [2].
[1] J. B. Tenenbaum et al. (MIT), How to Grow a Mind: Statistics, Structure, and Abstraction, Science, vol. 331, no. 6022, pp. 1279-1285, 11 March 2011.
[2] Judea Pearl, winner of the 2011 ACM Turing Award.

Quotient Space Based Problem Solving: A Theoretical Foundation of Granular Computing


Structural Prediction Learning

| Learning rules | Classification | Structural prediction |
| --- | --- | --- |
| Maximal Joint Likelihood Estimation | Naïve Bayesian Network | Hidden Markov Model (1966) [1] |
| Maximal Conditional Likelihood Estimation | Logistic Regression | Conditional Random Field (2001) [2] |
| Maximal Margin Learning | SVM | Maximal Margin Markov Net (2003) [3] |
| Maximal Entropy Discrimination Learning | Maximal Entropy Discrimination Model | Maximal Entropy Discrimination Markov Net (2008) (Zhu Jun) |
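To show what distinguishes structural prediction from classification, here is a minimal sketch of Viterbi decoding for a hidden Markov model, the first structured model in the list above. The output is a whole label sequence chosen jointly rather than one label at a time. The weather states, observations, and probabilities are a hypothetical toy model, not taken from the talk.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               predecessor state on that path)
    best = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for o in obs[1:]:
        prev = best[-1]
        layer = {}
        for s in states:
            p, arg = max((prev[r][0] * trans_p[r][s], r) for r in states)
            layer[s] = (p * emit_p[s][o], arg)
        best.append(layer)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for layer in reversed(best[1:]):
        state = layer[state][1]
        path.append(state)
    return path[::-1]


# Hypothetical toy model.
states = ["Rainy", "Sunny"]
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

path = viterbi(["walk", "shop", "clean"], states, start_p, trans_p, emit_p)
print(path)  # ['Sunny', 'Rainy', 'Rainy']
```

The transition probabilities couple neighboring labels, so the best sequence can differ from the sequence of individually most likely labels.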

Bayes' theorem (T. Bayes, 1702-1761): prior distribution and likelihood function give the posterior distribution. Optimization-based regularized Bayesian inference: prior distribution, likelihood function, and posterior constraints (attributes, domain knowledge) give the posterior distribution through optimization theory. (Zhu Jun, Tsinghua University)
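The first half of this slide, Bayes' theorem, can be sketched on a finite hypothesis set. The coin hypotheses and their probabilities are hypothetical. The sketch covers only the standard update (posterior proportional to prior times likelihood); it does not implement the posterior constraints of regularized Bayesian inference.

```python
def posterior(prior, likelihood):
    """Bayes' theorem on a finite hypothesis set."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())
    return {h: v / evidence for h, v in unnormalized.items()}


# Hypothetical prior over two hypotheses about a coin.
prior = {"fair": 0.5, "biased": 0.5}
# Hypothetical likelihood of observing heads under each hypothesis.
likelihood_heads = {"fair": 0.5, "biased": 0.9}

post = posterior(prior, likelihood_heads)
print(post)
```

After one observed head the posterior mass on "biased" is 0.45 / 0.70, roughly 0.643; feeding `post` back in as the next prior repeats the update for further observations.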

Neural Turing Machine (Google DeepMind, London, UK). External input, external output; controller: recurrent NN or feedforward NN; read heads, write heads; memory.
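How a read head can access memory in a differentiable way is sketched below with content-based addressing: each memory row is scored by cosine similarity to a key, the scores pass through a softmax, and the read vector is the weighted sum of rows. This follows the general idea described for the Neural Turing Machine; the memory contents, key, and sharpness value `beta` are hypothetical, and location-based addressing and writing are omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def content_read(memory, key, beta=5.0):
    """Soft read: softmax-weighted sum of memory rows similar to the key."""
    scores = [beta * cosine(row, key) for row in memory]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    read = [sum(w * row[j] for w, row in zip(weights, memory))
            for j in range(len(key))]
    return weights, read


# Hypothetical memory with three rows of width two.
memory = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
weights, read = content_read(memory, key=[1.0, 0.1])
print(weights)
print(read)
```

Because every row receives a non-zero weight, the read is a smooth function of the key, which is what allows such a memory to be trained by gradient descent.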

Three Levels of Processing. (1) Natural meaning: recognition, ill-posed problems. (2) Sender's intention: context-aware, psychological model. (3) Receiver's reaction (impact): social knowledge, ...

Conclusions. Basic foundation: content-related information processing, multi-granular computing. Applied foundation: algorithms, architecture, parallelism, management, storage, ...

Publications: Journal Papers
J. Zhu, A. Ahmed, and E.P. Xing. MedLDA: Maximum Margin Supervised Topic Models. Journal of Machine Learning Research (JMLR), 13(Aug):2237-2278, 2012.
N. Chen, J. Zhu, F. Sun, and E.P. Xing. Large-margin Subspace Learning for Multi-view Data Analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 34, no. 12, pp. 2365-2378, Dec. 2012.
C. Liu, B. Zhang, J. Zhu, and D. Wang. Learning a Contextual Multithread Model for Movie/TV Scene Segmentation. IEEE Transactions on Multimedia (TMM), 2012.
X. Hu and J. Wang. Solving the assignment problem using continuous-time and discrete-time improved dual networks. IEEE Transactions on Neural Networks and Learning Systems (TNNLS), vol. 23, no. 5, pp. 821-827, 2012.
X. Hu and B. Zhang. A Gaussian attractor network for memory and recognition with experience-dependent learning. Neural Computation, vol. 22, no. 5, pp. 1333-1357, 2010.
X. Hu, C. Sun, and B. Zhang. Design of recurrent neural networks for solving constrained least absolute deviation problems. IEEE Transactions on Neural Networks (TNN), vol. 21, no. 7, pp. 1073-1086, July 2010.

J. Zhu and E.P. Xing. Maximum Entropy Discrimination Markov Networks. Journal of Machine Learning Research (JMLR), 10(Nov):2531-2569, 2009.
X. Hu and B. Zhang. A new recurrent neural network for solving convex quadratic programming problems with an application to the k-winners-take-all problem. IEEE Transactions on Neural Networks (TNN), vol. 20, no. 4, pp. 654-664, April 2009.
D. Wang, Z. Wang, J. Li, B. Zhang, and X. Li. Query representation by structured concept threads with application to interactive video retrieval. Journal of Visual Communication and Image Representation, vol. 20, no. 2, pp. 104-116, 2009.
J. Zhu, Z. Nie, B. Zhang, and J. Wen. Dynamic Hierarchical Markov Random Fields for Integrated Web Data Extraction. Journal of Machine Learning Research (JMLR), 9(Jul):1583-1614, 2008.

Conference Papers
J. Zhu, N. Chen, H. Perkins, and B. Zhang. Gibbs Max-Margin Supervised Topic Models with Fast Sampling Algorithms. In Proc. of the 30th International Conference on Machine Learning (ICML), Atlanta, USA, 2013.
M. Xu, J. Zhu, and B. Zhang. Fast Max-Margin Matrix Factorization with Data Augmentation. In Proc. of the 30th International Conference on Machine Learning (ICML), Atlanta, USA, 2013.
N. Chen, J. Zhu, F. Xia, and B. Zhang. Generalized Relational Topic Models with Data Augmentation. To appear in Proc. of the 23rd International Joint Conference on Artificial Intelligence (IJCAI), Beijing, China, 2013.
M. Xu, J. Zhu, and B. Zhang. Bayesian Nonparametric Maximum Margin Matrix Factorization for Collaborative Prediction. Advances in Neural Information Processing Systems (NIPS), Lake Tahoe, USA, 2012.
Q. Jiang, J. Zhu, M. Sun, and E.P. Xing. Monte Carlo Methods for Maximum Margin Supervised Topic Models. Advances in Neural Information Processing Systems (NIPS), Lake Tahoe, USA, 2012.
J. Ji, J. Li, S. Yan, B. Zhang, and Q. Tian. Super-Bit Locality-Sensitive Hashing. Advances in Neural Information Processing Systems (NIPS), Lake Tahoe, USA, 2012.

J. Zhu. Max-Margin Nonparametric Latent Feature Models for Link Prediction. In Proc. of the 29th International Conference on Machine Learning (ICML), Edinburgh, Scotland, 2012.
J. Zhu, N. Chen, and E.P. Xing. Infinite Latent SVM for Classification and Multi-task Learning. Advances in Neural Information Processing Systems (NIPS), Granada, Spain, 2011.
J. Zhu and E.P. Xing. Sparse Topical Coding. In Proc. of the 27th Conference on Uncertainty in Artificial Intelligence (UAI), Barcelona, Spain, 2011.
J. Zhu, N. Chen, and E.P. Xing. Infinite SVM: a Dirichlet Process Mixture of Large-margin Kernel Machines. In Proc. of the 28th International Conference on Machine Learning (ICML), Bellevue, Washington, USA, 2011.
J. Zhu, L.-J. Li, L. Fei-Fei, and E.P. Xing. Large Margin Training of Upstream Scene Understanding Models. Advances in Neural Information Processing Systems (NIPS), Vancouver, B.C., Canada, 2010.
S. Lee, J. Zhu, and E.P. Xing. Detecting eQTLs using Adaptive Multi-task Lasso. Advances in Neural Information Processing Systems (NIPS), Vancouver, B.C., Canada, 2010.
N. Chen, J. Zhu, and E.P. Xing. Predictive Subspace Learning for Multi-view Data: a Large Margin Approach. Advances in Neural Information Processing Systems (NIPS), Vancouver, B.C., Canada, 2010.

J. Zhu and E.P. Xing. Conditional Topic Random Fields. In Proc. of the 27th International Conference on Machine Learning (ICML), Haifa, Israel, 2010.
J. Zhu and E.P. Xing. On Primal and Dual Sparsity of Markov Networks. In Proc. of the 26th International Conference on Machine Learning (ICML), Montreal, Canada, 2009.
J. Zhu, A. Ahmed, and E.P. Xing. MedLDA: Maximum Margin Supervised Topic Models for Regression and Classification. In Proc. of the 26th International Conference on Machine Learning (ICML), Montreal, Canada, 2009.
J. Zhu, E.P. Xing, and B. Zhang. Partially Observed Maximum Entropy Discrimination Markov Networks. Advances in Neural Information Processing Systems (NIPS), Vancouver, B.C., Canada, 2008.
J. Zhu, E.P. Xing, and B. Zhang. Laplace Maximum Margin Markov Networks. In Proc. of the 25th International Conference on Machine Learning (ICML), Helsinki, Finland, 2008.
J. Zhu, Z. Nie, et al. 2D Conditional Random Fields for Web Information Extraction. In Proc. of the 22nd International Conference on Machine Learning (ICML), Bonn, Germany, 2005.

J. Zhu, X. Zheng, L. Zhou, and B. Zhang. Scalable Inference in Max-margin Supervised Topic Models. To appear in Proc. of the 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD), Chicago, USA, 2013.
J. Zhu, X. Zheng, and B. Zhang. Bayesian Logistic Supervised Topic Models with Data Augmentation. To appear in Proc. of the 51st Annual Meeting of the Association for Computational Linguistics (ACL), Sofia, Bulgaria, 2013.
A. Zhang, J. Zhu, and B. Zhang. Sparse Online Topic Models. In Proc. of the 22nd International World Wide Web Conference (WWW), Rio de Janeiro, Brazil, 2013.
Y. Tian and J. Zhu. Learning from Crowds in the Presence of Schools of Thought. In Proc. of the 18th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD), Beijing, China, 2012.
L. Xie, Q. Tian, and B. Zhang. Spatial pooling of heterogeneous features for image applications. ACM Multimedia (MM), pp. 539-548, 2012.
J. Zhu, N. Lao, and E.P. Xing. Grafting-Light: Fast, Incremental Feature Selection and Structure Learning of Markov Random Fields. In Proc. of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), Washington DC, USA, 2010.

X. Shi, J. Zhu, R. Cai, and L. Zhang. User Grouping Behavior in Online Forums. In Proc. of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), Paris, France, 2009.
Y. Liang, J. Li, and B. Zhang. Vocabulary-based hashing for image search. ACM Multimedia (MM), pp. 589-592, 2009.
J. Zhu, Z. Nie, X. Liu, B. Zhang, and J.-R. Wen. StatSnowball: a Statistical Approach to Extracting Entity Relationships. In Proc. of the 18th International World Wide Web Conference (WWW), Madrid, Spain, 2009.
J. Yuan, J. Li, and B. Zhang. Scene understanding with discriminative structured prediction. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008.
J. Zhu, Z. Nie, et al. Simultaneous Record Detection and Attribute Labeling in Web Data Extraction. In Proc. of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), Philadelphia, PA, USA, 2006.

Thank you!