CIKM 2015 Melbourne Australia Oct. 22, 2015 Building a Better Connected World with Data Mining and Artificial Intelligence Technologies Hang Li Noah s Ark Lab Huawei Technologies
We want to build Intelligent Mobile Devices Data Mining & Artificial Intelligence Intelligent Telecommunication Networks Intelligent Enterprise
Intelligent Telecommunication Networks Software-defined Network Network Maintenance Network Planning and Optimization
Intelligent Enterprise Supply Chain Management Customer Relationship Management Human Resources Management Information and Knowledge Management Communication
Intelligent Mobile Devices Information Recommendation Information Extraction Personal Information Management Machine Translation Natural Language Dialogue(Question Answering)
This Talk: Natural Language Dialogue
Demo: Neural Responding Machine
Outline Information Access through Natural Language Dialogue Data-Driven Approach Single Turn Dialogue Sentence Representation Learning Our Approach Using Deep Learning Summary
New Paradigm in Information Retrieval? Library Search Web Search Natural Language Dialogue 1970 1990 2010
Information Access through Natural Language Dialogue Multi-turn dialogue Goal: task completion, mostly information access Evaluation: completion / cost Including traditional search and question answering as special cases
Example One: Hotel Booking on Smartphone P: How may I help you? U: I'd like to book a hotel room for tomorrow. P: For how many people? U: Just me. What is the total cost? P: That would be $120 per night. U: No problem. Book the room for one night, please.
Example Two: Auto Call Center U: hello H: hello, how can I help you? U: can you tell me how to find ABC software? H: please go to this URL to download U: how to activate the software? H: please see this document
Academic research Related Work Rule-based approach, e.g., Eliza System, Weizenbaum 1964 Reinforcement learning based approach, e.g., Singh et al 1999 Machine translation based approach, e.g., Ritter et al 2011 Commercial products Apple Siri Google Now MS Cortana
Outline Information Access through Natural Language Dialogue Data-Driven Approach Single Turn Dialogue Sentence Representation Learning Our Approach Using Deep Learning Summary
Data-Driven Approach to Dialogue Will you attend ECML/PKDD 2015? Yes, I will Learning how to converse mainly from data
Data-Driven Approach Key Points Mainly learning from data Avoid difficult language understanding problem as much as possible Fundamental mechanism is simple Issues to Investigate Single-turn dialogue Multi -turn dialogue Knowledge utilization and reasoning (lightweight) Dialogue management (lightweight)
As Scientific Research Problem Not as engineering hacks Scientific methodology Mathematical modeling Result should be reproducible (positivism) Divide and conquer (reductionism)
Outline Information Access through Natural Language Dialogue Data-Driven Approach Single Turn Dialogue Sentence Representation Learning Our Approach Using Deep Learning Summary
Reducing Multi-Turn Dialogue to Single- Turn Dialogue hello U: hello H: hello, how can I help you? U: can you tell me how to find ABC software? H: please go to this URL to download U: how to activate the software? H: please see this document Breaking down multi-turn dialogues to many single-turn dialogues hello, how can I help you can you tell me how to find ABC software? please go to this URL to download how to activate the software? please see this document
Retrieval-based Single Turn Dialogue Repository of Message Response Pairs Retrieval System Learning System
Generation-based Single Turn Dialogue Generation System Learning System
Retrieval-based approach to single turn dialogue 5 million Chinese Weibo data, 1 million Japanese Twitter data Registration deadline: Oct 31, 2015
Outline Information Access through Natural Language Dialogue Data-Driven Approach Single Turn Dialogue Sentence Representation Learning Our Approach Using Deep Learning
Representation of Word Meaning dog cat puppy kitten Using high-dimensional real-valued vectors to represent the meaning of words
Representation of Sentence Meaning New finding: This is possible Mary is loved by John Mary loves John John loves Mary Using high-dimensional real-valued vectors to represent the meaning of sentences
Recent Breakthrough in Natural Language Processing Representations from words to sentences Compositional Representing syntax, semantics, even pragmatics
How Is Learning of Sentence Meaning Possible? Deep neural networks (complicated non-linear models) Big Data Task-oriented Error-reduction and gradient-based
Deep Learning Tools for Learning Sentence Representations Neural Word Embedding Recurrent Neural Networks Recursive Neural Networks Convolutional Neural Networks
Word Representation: Neural Word Embedding M c1 c2 c3 c4 c5 w 1 w 2 w 3 10 1 5 2 1 2.5 1 log P( w, c) P( w) P( c) W T M WC t1 t2 t3 matrix factorization w 1 w 2 w 3 7 1 1 2 3 1 1.5 2 word embedding or word2vec
Recurrent Neural Network (RNN) (Mikolov et al. 2010) On sequence of words Variable length Long dependency: LSTM or GRU the cat sat on the mat h t 1 h t f ( ht 1, xt ) the cat sat. mat x t
Recursive Neural Network (RNN) (Socher et al. 2013) On parse tree of sentence Learning is based on max margin parsing the cat sat on the mat the cat sat on the mat
Convolutional Neural Network (CNN) (Hu et al. 2014) Concatenation the cat sat on the mat the cat sat on sat on the mat Robust parsing Shared parameter on same level Fixed length, zero padding the cat sat on the mat max pooling the cat cat sat sat on on the cat sat sat on on the the mat the cat sat cat sat on sat on the on the mat convolution the cat sat on the mat
Outline Information Access through Natural Language Dialogue Data-Driven Approach Single Turn Dialogue Sentence Representation Learning Our Approach Using Deep Learning Summary
Our Models for Single-Turn Dialogue Using Deep Learning Retrieval-based Deep Match CNN (Hu et al., NIPS 2014) Deep Match Tree (Wang et al., IJCAI 2015) Generation-based Neural Responding Machine (Shang et al., ACL 2015)
DL for NLP @Noah Lab Researchers Zhengdong Lu Lifeng Shang Lin Ma Zhaopeng Tu Interns Baotian Hu Fandong Meng Mingxuan Wang Han Zhao
Natural Language Dialogue System - Retrieval based Approach message retrieved messages and responses matched responses ranked responses online retrieval matching ranking best response offline index of messages and responses matching models ranking model
Deep Match CNN - Archtecture I First represent two sentences, and then match 38
Deep Match CNN - Architecture II Represent and match two sentences simultaneously Two dimensional convolution and pooling 39
Deep Match Tree Constructing deep neural network, with first layer corresponding mined patterns 40
Retrieval based Approach: Accuracy = 70% 上 海 今 天 好 熱, 堪 比 新 加 坡 上 海 今 天 热 的 不 一 般 想 去 武 当 山 有 想 同 游 的 么? 我 想 跟 帅 哥 同 游 ~ 哈 哈 It is very hot in Shanghai today, just like Singapore. It is unusually hot. I want to go to Mountain Wudang, it there anybody going together with me? Haha, I want to go with you, handsome boy Using 5 million Weibo Data
Natural Language Dialogue System - Generation based Approach Response y x y 1 Decoder c Encoder x 1 y x 2 2 Context Generator h y t x T Encoding messages to intermediate representations Decoding intermediate representations to responses Recurrent Neural Network (RNN) Message
Neural Responding Machine Intuitively, a big language model generating responses conditioned on messages Decoder 用 了 Global Encoder Context Generator 都 说??? Attention Signal 华 为 手 机 怎 么 样 Local Encoder 140 million parameters Trained from 5 million Weibo data
Generation based Approach Accuracy = 76% 占 中 终 于 结 束 了 Occupy Central is finally over. 下 一 个 是 陆 家 嘴 吧? 我 想 买 三 星 手 机 Will Lujiazui (finance district in Shanghai) be the next? I want to buy a Samsung phone 还 是 支 持 一 下 国 产 的 吧 Let us support our national brands vs. Accuracy of translation approach = 26% Accuracy of retrieval based approach = 70%
Outline Information Access through Natural Language Dialogue Data-Driven Approach Single Turn Dialogue Sentence Representation Learning Our Approach Using Deep Learning Summary
Take-away Messages Natural language dialogue = new paradigm for information retrieval Data-driven approach is key Current focus is single turn dialogue Sentence representation learning is possible Significant progress being made on single-turn dialogue with deep learning and big data Many interesting and challenging problems to be tackled
Thank you!