

THIEN HUU NGUYEN

Contact Information
719 Broadway, Room New York, New York 10003, United States
Homepage: thien/

Objective
To pursue a research-oriented career in Computer Science

Research Interests
Information Extraction, Natural Language Processing, Domain Adaptation, Deep Learning, Machine Learning

Education
New York University (NYU), Courant Institute of Mathematical Sciences, Department of Computer Science
- PhD Candidate, expected May 2017
- M.S., Computer Science, May 2014 (GPA: 3.889)

Hanoi University of Science and Technology (HUST), Hanoi, Vietnam
- B.S., Computer Science, July 2011
- Honor Program, Center for Training of Excellent Students
- Graduate Grade: Very good; GPA: 8.61/10 (top 3% of 600 graduated students)
- Thesis: Extracting named entities from Vietnamese documents: a semi-supervised approach

Professional Experience
Information Extraction Research Intern at the IBM T.J. Watson Research Center, Yorktown Heights, New York, USA.
- June - Aug 2015: I propose a Bidirectional Recurrent Neural Network (BRNN) to improve the robustness of Mention Detection. My BRNN system not only outperforms the best reported system (up to 9% relative error reduction) but also achieves the state-of-the-art performance in domain shifts for English mention detection. In addition, I significantly improve the state-of-the-art named entity recognition performance for Dutch (up to 28% relative error reduction). (this work is published at IJCAI 2016)
- June - Aug 2016: I propose a framework to jointly learn local and global features for entity linking based on convolutional and recurrent neural networks. The system achieves the state-of-the-art performance for entity linking in both the general and the domain adaptation settings. In addition, I am involved in developing an unsupervised system for entity disambiguation that can link entity mentions to nodes of any knowledge graph. I learn embeddings for the entities (nodes) in the knowledge graphs, which are then used to build the collective disambiguation graph. (this work is published at COLING 2016)

Deep Learning for Information Extraction: to apply deep learning to information extraction. (NYU, Sep 2014 - present)
- Designing a Convolutional Neural Network for relation extraction which avoids any feature engineering but still achieves the state-of-the-art performance on relation classification and good performance on relation extraction (this work is published at NAACL 2015).
- Applying Convolutional Neural Networks with word embeddings for trigger detection and domain adaptation in event extraction (achieved the state-of-the-art performance for event detection in both the general setting and the domain adaptation setting) (this work is published at ACL-IJCNLP 2015).

- Designing a joint framework for Event Extraction (i.e., jointly labeling triggers and argument roles for events) based on Bidirectional Recurrent Neural Networks with memory vectors/matrices (achieved the state-of-the-art performance for event extraction on the ACE 2005 dataset) (this work is published at NAACL 2016).

Knowledge Base Population (KBP) (by NIST): the challenge is to automatically find pieces of information in a large corpus to fill in attribute slots for the entities of interest. (NYU, Jun 2014 - present)
- Key member for the NYU 2014 KBP system for Slot Filling: introducing two distant supervision modules into the system: (i) the MaxEnt-based distant supervision module (trained on the data generated by aligning Freebase tuples with a large corpus), and (ii) the Multi-Instance Multi-Label (MIML) distant supervision module with guidance (trained on the data produced by aligning Wikipedia infoboxes with Wikipedia articles)
- Key member for the NYU 2014 KBP system for Cold Start: integrating one distant supervision module and one inference module (allowing the system to infer more relation assertions based on the existing assertions in the system) (ranked 2nd place among the participating systems)

Domain Adaptation for Information Extraction (in IARPA's project Knowledge Discovery and Dissemination (KDD)): the motivation is to adapt models trained on source domains with plentiful labeled data so that the adapted models work well on target domains with little or no training data. This is very useful in practice, as we often want to extend our work to new domains where labeled data is not yet available. (NYU, December 2012 - present)
- Conducting experiments on applying Word Clusters (generated from a large-scale unlabeled corpus) to enhance the Feature Augmentation technique in building adaptive name taggers
- Analyzing the features for Relation Extraction to distinguish between domain-specific and domain-independent features
- Experimenting with Instance Weighting methods for the Covariate Shift problem of Relation Extraction using Tree Kernels: Kernel Mean Matching (KMM), Kullback-Leibler Importance Estimation Procedure (KLIEP)
- Employing word representations (Word Clusters, Word Embeddings) for Domain Adaptation of Relation Extraction (achieved up to 7% relative improvement over the best reported system) (this work is published in ACL 2014 and ACL-IJCNLP 2015).

Research on semi-supervised algorithms for Vietnamese Named Entity Recognition (HUST, June 2010 - July 2011), namely:
- Combining name variation heuristics with a confidence estimator based on Conditional Random Fields to bootstrap statistical models for extracting named entities in Vietnamese (this work was published in PAKDD 2011).
- Applying Semi-supervised Conditional Random Fields for named entity recognition in Vietnamese

Develop an Inductive Logic Programming method to automatically construct extraction rules for Information Extraction in Vietnamese (HUST, October 2009 - May 2010) (this work was published in SoICT 2010).

Develop a Relation Extraction system for Vietnamese using Support Vector Machines (HUST, December 2010 - May 2011)

Design an information extraction system based on various machine learning techniques to construct knowledge bases from web pages of Vietnamese scientists (HUST, December 2010 - May 2011)

Data mining and Machine Learning
- Explore agglomerative clustering schemes to learn Fuzzy Concept Hierarchy from databases automatically (HUST, February 2009 - October 2009)
- Construct a Profile Spammer Detection System for the Zingme social network (the largest social network in Vietnam) (R&D Lab, VNG Corp, Vietnam, September 2011 - Jun 2012)
- Research on the Behavioral Targeting problem focusing on two main tasks: User Behavior Segmentation and User Segment Ranking (R&D Lab, VNG Corp, Vietnam, September 2011 - Jun 2012)

Honours and Awards
- IBM Ph.D. Fellowship
- Dean's Dissertation Fellowship, Graduate School of Arts and Science, NYU
- Harold Grad Prize, Courant Institute of Mathematical Science, NYU
- 2nd in KBP Cold Start, TAC 2014
- Henry MacCracken Fellowship, New York University
- Vietnam Education Foundation (VEF) Fellowship (recommended by US National Academy)
- Second Prize in Student Scientific Research Conference, by Ministry of Education and Training, Vietnam, 2012
- First Prize in Student Scientific Research Conference, by HUST, June 2011
- Merit Certificate for Excellent Students, by HUST, July 2011
- Annual Ministry of Education and Training Scholarship for Excellent Students, Vietnam
- Second Prize in the National Mathematical Competition for High School Students, by Ministry of Education and Training, Vietnam, 2006
- First Prize in the Mathematical Competition of Hung Yen Province, Vietnam, 2006
- Incentive Award in the National Mathematical Competition for High School Students, by Ministry of Education and Training, Vietnam, 2005

Publications
- Thien Huu Nguyen, Nicolas Fauceglia, Mariano Rodriguez Muro, Oktie Hassanzadeh, Alfio Massimiliano Gliozzo and Mohammad Sadoghi, "Joint Learning of Local and Global Features for Entity Linking via Neural Networks", in Proceedings of COLING 2016, Osaka, Japan, December.
- Thien Huu Nguyen and Ralph Grishman, "Modeling Skip-Grams for Event Detection with Convolutional Neural Networks", in Proceedings of EMNLP 2016, Austin, Texas, USA, November.
- Thien Huu Nguyen, Kyunghyun Cho and Ralph Grishman, "Joint Event Extraction via Recurrent Neural Networks", in Proceedings of NAACL 2016, San Diego, USA, June.
- Thien Huu Nguyen, Lisheng Fu, Kyunghyun Cho and Ralph Grishman, "A Two-stage Approach for Extending Event Detection to New Types via Neural Networks", in Proceedings of ACL Workshop on Representation Learning for NLP (RepL4NLP), Berlin, Germany, August.
- Thien Huu Nguyen and Ralph Grishman, "Combining Neural Networks and Log-linear Models to Improve Relation Extraction", in Proceedings of IJCAI Workshop on Deep Learning for Artificial Intelligence (DLAI), New York, USA, July.
- Thien Huu Nguyen, Avirup Sil, Georgiana Dinu and Radu Florian, "Toward Mention Detection Robustness with Recurrent Neural Networks", in Proceedings of IJCAI Workshop on Deep Learning for Artificial Intelligence (DLAI), New York, USA, July 2016.

- Xiang Li, Thien Huu Nguyen, Kai Cao and Ralph Grishman, "Improving Event Detection with Abstract Meaning Representation", in Proceedings of ACL-IJCNLP Workshop on Computing News Storylines (CNewS 2015), Beijing, China, July.
- Thien Huu Nguyen and Ralph Grishman, "Event Detection and Domain Adaptation with Convolutional Neural Networks", in Proceedings of ACL-IJCNLP 2015, Beijing, China, July.
- Thien Huu Nguyen, Barbara Plank and Ralph Grishman, "Semantic Representations for Domain Adaptation: A Case Study on the Tree Kernel-based Method for Relation Extraction", in Proceedings of ACL-IJCNLP 2015, Beijing, China, July.
- Thien Huu Nguyen and Ralph Grishman, "Relation Extraction: Perspective from Convolutional Neural Networks", in Proceedings of NAACL Workshop on Vector Space Modeling (VSM) for NLP, Denver, Colorado, June.
- Thien Huu Nguyen, Yifan He, Maria Pershina, Xiang Li and Ralph Grishman, "New York University 2014 Knowledge Base Population Systems", in Proceedings of Text Analysis Conference (TAC), Gaithersburg, Maryland, USA, November.
- Thien Huu Nguyen and Ralph Grishman, "Employing Word Representations and Regularization for Domain Adaptation of Relation Extraction", in Proceedings of ACL 2014, pp 68-74, Baltimore, Maryland, USA, June.
- Thien Huu Nguyen, Vinh Quang Nguyen, and Ngoc Minh Thi Nguyen, "An information extraction system for constructing knowledge bases from Vietnamese documents", in Proceedings of the 28th Student Scientific Research Conference, School of Information and Communication Technology, HUST, Hanoi, Vietnam, May.
- Rathany Chan Sam, Huong Thanh Le, Thuy Thanh Nguyen, and Thien Huu Nguyen, "Combining Proper Name-Coreference with Conditional Random Fields for Semi-supervised Named Entity Recognition in Vietnamese Text", in Proceedings of the 15th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), Shenzhen, China, May.
- Huong Thanh Le and Thien Huu Nguyen, "Named Entity Recognition using Inductive Logic Programming", in Proceedings of the Symposium on Information and Communication Technology, Hanoi University of Science and Technology (SoICT), pp 71-78, Hanoi, Vietnam, August.

Technical Skills
Programming Languages:
- Java, Python, Shell (used every day)
- C/C++, AWK, MATLAB, LaTeX (used when necessary)
Others: Mallet, Lucene, MySQL, Theano

Teaching Experience
New York University, New York
Teaching Assistant for CSCI-GA.2590: Natural Language Processing, Spring 2015
Graduate-level course in Natural Language Processing
Professor: Prof. Ralph Grishman

Professional Service
Reviewer: Neural Computation Journal
Program Committee: NAACL 2016, COLING 2016

Referees
Ralph Grishman, PhD
Professor, Computer Science
Department of Computer Science, Courant Institute of Mathematical Sciences
New York University

Kyunghyun Cho, PhD
Professor, Computer Science
Department of Computer Science, Courant Institute of Mathematical Sciences
Center for Data Science
New York University

Ming-Wei Chang. Machine learning and its applications to natural language processing, information retrieval and data mining.

Ming-Wei Chang. Machine learning and its applications to natural language processing, information retrieval and data mining. Ming-Wei Chang 201 N Goodwin Ave, Department of Computer Science University of Illinois at Urbana-Champaign, Urbana, IL 61801 +1 (917) 345-6125 mchang21@uiuc.edu http://flake.cs.uiuc.edu/~mchang21 Research

More information

Machine Learning Department, School of Computer Science, Carnegie Mellon University, PA

Machine Learning Department, School of Computer Science, Carnegie Mellon University, PA Pengtao Xie Carnegie Mellon University Machine Learning Department School of Computer Science 5000 Forbes Ave Pittsburgh, PA 15213 Tel: (412) 916-9798 Email: pengtaox@cs.cmu.edu Web: http://www.cs.cmu.edu/

More information

Curriculum Vitae. Mahesh Joshi. Education. Research Experience. Publications

Curriculum Vitae. Mahesh Joshi. Education. Research Experience. Publications Mahesh Joshi Curriculum Vitae E-Mail: maheshj@cmu.edu Web: http://www.d.umn.edu/~joshi031/ Education August 2006 present: Masters in Language Technologies, Carnegie Mellon University September 2004 August

More information

SURVEY REPORT DATA SCIENCE SOCIETY 2014

SURVEY REPORT DATA SCIENCE SOCIETY 2014 SURVEY REPORT DATA SCIENCE SOCIETY 2014 TABLE OF CONTENTS Contents About the Initiative 1 Report Summary 2 Participants Info 3 Participants Expertise 6 Suggested Discussion Topics 7 Selected Responses

More information

The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2

The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2 2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016) The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2 1 School of

More information

Text Mining: The state of the art and the challenges

Text Mining: The state of the art and the challenges Text Mining: The state of the art and the challenges Ah-Hwee Tan Kent Ridge Digital Labs 21 Heng Mui Keng Terrace Singapore 119613 Email: ahhwee@krdl.org.sg Abstract Text mining, also known as text data

More information

Machine Learning: Overview

Machine Learning: Overview Machine Learning: Overview Why Learning? Learning is a core of property of being intelligent. Hence Machine learning is a core subarea of Artificial Intelligence. There is a need for programs to behave

More information

Semi-Supervised and Unsupervised Machine Learning. Novel Strategies

Semi-Supervised and Unsupervised Machine Learning. Novel Strategies Brochure More information from http://www.researchandmarkets.com/reports/2179190/ Semi-Supervised and Unsupervised Machine Learning. Novel Strategies Description: This book provides a detailed and up to

More information

COLLEGE OF INFORMATION & COMMUNICATION TECHNOLOGY (CICT) www.cit.ctu.edu.vn

COLLEGE OF INFORMATION & COMMUNICATION TECHNOLOGY (CICT) www.cit.ctu.edu.vn COLLEGE OF INFORMATION & COMMUNICATION TECHNOLOGY (CICT) www.cit.ctu.edu.vn Outline Can Tho University College of ICT (CICT) Mission Organization Fields of Study Facilities Relation & Cooperation Research

More information

Applications of Deep Learning to the GEOINT mission. June 2015

Applications of Deep Learning to the GEOINT mission. June 2015 Applications of Deep Learning to the GEOINT mission June 2015 Overview Motivation Deep Learning Recap GEOINT applications: Imagery exploitation OSINT exploitation Geospatial and activity based analytics

More information

Classification and Prediction

Classification and Prediction Classification and Prediction Slides for Data Mining: Concepts and Techniques Chapter 7 Jiawei Han and Micheline Kamber Intelligent Database Systems Research Lab School of Computing Science Simon Fraser

More information

WEI CHEN. IT-enabled Innovation, Online Community, Open-Source Software, Startup Angel Funding, Interactive Marketing, SaaS Model

WEI CHEN. IT-enabled Innovation, Online Community, Open-Source Software, Startup Angel Funding, Interactive Marketing, SaaS Model WEI CHEN Rady School of Management University of California, San Diego 9500 Gilman Drive, MC 0553 La Jolla, CA 92093-0553 +1(858)337-5951 +1(858)534-0862 wei.chen@rady.ucsd.edu www.mrweichen.info RESEARCH

More information

HA THU LE. Biography EDUCATION

HA THU LE. Biography EDUCATION HA THU LE Electrical and Computer Engineering California State Polytechnic University Pomona 3801 West Temple Ave, Pomona, CA 91768, USA Office: 9-413, Tel: (909) 869 2523, Email: hatle@cpp.edu Biography

More information

An Introduction to Data Mining

An Introduction to Data Mining An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

More information

Data Mining & Data Stream Mining Open Source Tools

Data Mining & Data Stream Mining Open Source Tools Data Mining & Data Stream Mining Open Source Tools Darshana Parikh, Priyanka Tirkha Student M.Tech, Dept. of CSE, Sri Balaji College Of Engg. & Tech, Jaipur, Rajasthan, India Assistant Professor, Dept.

More information

Software Engineering. Program Description. Admissions Requirements. Certificate. Master of Software Engineering. Master of Science

Software Engineering. Program Description. Admissions Requirements. Certificate. Master of Software Engineering. Master of Science North Dakota State University 1 Software Engineering Program and Application Information Department Head: Graduate Coordinator: Department Location: Dr. Brian M. Slator Dr. Kenneth Magel 258 QBB (formerly

More information

Curriculum Vitae Ruben Sipos

Curriculum Vitae Ruben Sipos Curriculum Vitae Ruben Sipos Mailing Address: 349 Gates Hall Cornell University Ithaca, NY 14853 USA Mobile Phone: +1 607-229-0872 Date of Birth: 8 October 1985 E-mail: rs@cs.cornell.edu Web: http://www.cs.cornell.edu/~rs/

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)

More information

Patent Big Data Analysis by R Data Language for Technology Management

Patent Big Data Analysis by R Data Language for Technology Management , pp. 69-78 http://dx.doi.org/10.14257/ijseia.2016.10.1.08 Patent Big Data Analysis by R Data Language for Technology Management Sunghae Jun * Department of Statistics, Cheongju University, 360-764, Korea

More information

How To Get A Computer Science Degree

How To Get A Computer Science Degree MAJOR: DEGREE: COMPUTER SCIENCE MASTER OF SCIENCE (M.S.) CONCENTRATIONS: HIGH-PERFORMANCE COMPUTING & BIOINFORMATICS CYBER-SECURITY & NETWORKING The Department of Computer Science offers a Master of Science

More information

ADVANCED MACHINE LEARNING. Introduction

ADVANCED MACHINE LEARNING. Introduction 1 1 Introduction Lecturer: Prof. Aude Billard (aude.billard@epfl.ch) Teaching Assistants: Guillaume de Chambrier, Nadia Figueroa, Denys Lamotte, Nicola Sommer 2 2 Course Format Alternate between: Lectures

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

Semi-Supervised Learning for Blog Classification

Semi-Supervised Learning for Blog Classification Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (2008) Semi-Supervised Learning for Blog Classification Daisuke Ikeda Department of Computational Intelligence and Systems Science,

More information

Data Mining and Machine Learning in Bioinformatics

Data Mining and Machine Learning in Bioinformatics Data Mining and Machine Learning in Bioinformatics PRINCIPAL METHODS AND SUCCESSFUL APPLICATIONS Ruben Armañanzas http://mason.gmu.edu/~rarmanan Adapted from Iñaki Inza slides http://www.sc.ehu.es/isg

More information

Self Organizing Maps for Visualization of Categories

Self Organizing Maps for Visualization of Categories Self Organizing Maps for Visualization of Categories Julian Szymański 1 and Włodzisław Duch 2,3 1 Department of Computer Systems Architecture, Gdańsk University of Technology, Poland, julian.szymanski@eti.pg.gda.pl

More information

Statistics for BIG data

Statistics for BIG data Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before

More information

Truong-Huy Dinh Nguyen

Truong-Huy Dinh Nguyen Truong-Huy Dinh Nguyen, Sep 2015 Journalism Building 238 Department of Computer Science Texas A&M University-Commerce P.O. Box 3011, Commerce, TX 75429-3011 Email: Truong-Huy.Nguyen@tamuc.edu EDUCATION

More information

Vietnam National University Ho Chi Minh city University of Science. Faculty of Information technology

Vietnam National University Ho Chi Minh city University of Science. Faculty of Information technology Vietnam National University Ho Chi Minh city University of Science Faculty of Information technology 1 Universities in Vietnam Two national universities Vietnam National University, Hanoi (VNU) Vietnam

More information

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015 An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

More information

Comparison of K-means and Backpropagation Data Mining Algorithms

Comparison of K-means and Backpropagation Data Mining Algorithms Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and

More information

8. Machine Learning Applied Artificial Intelligence

8. Machine Learning Applied Artificial Intelligence 8. Machine Learning Applied Artificial Intelligence Prof. Dr. Bernhard Humm Faculty of Computer Science Hochschule Darmstadt University of Applied Sciences 1 Retrospective Natural Language Processing Name

More information

11.04 Lo M Khu CC Bàu Cát II, P.10 Quận Tân Bình, Tp. Hồ Chí Minh Emails: nguyenvu@usc.edu nvu@fit.hcmus.edu.vn

11.04 Lo M Khu CC Bàu Cát II, P.10 Quận Tân Bình, Tp. Hồ Chí Minh Emails: nguyenvu@usc.edu nvu@fit.hcmus.edu.vn Education VU NGUYEN 11.04 Lo M Khu CC Bàu Cát II, P.10 Quận Tân Bình, Tp. Hồ Chí Minh Emails: nguyenvu@usc.edu nvu@fit.hcmus.edu.vn Doctor of Philosophy (Ph.D.) (Dec 2010) Department of Computer Science

More information

Network Machine Learning Research Group. Intended status: Informational October 19, 2015 Expires: April 21, 2016

Network Machine Learning Research Group. Intended status: Informational October 19, 2015 Expires: April 21, 2016 Network Machine Learning Research Group S. Jiang Internet-Draft Huawei Technologies Co., Ltd Intended status: Informational October 19, 2015 Expires: April 21, 2016 Abstract Network Machine Learning draft-jiang-nmlrg-network-machine-learning-00

More information

Domain Independent Knowledge Base Population From Structured and Unstructured Data Sources

Domain Independent Knowledge Base Population From Structured and Unstructured Data Sources Proceedings of the Twenty-Fourth International Florida Artificial Intelligence Research Society Conference Domain Independent Knowledge Base Population From Structured and Unstructured Data Sources Michelle

More information

Web Document Clustering

Web Document Clustering Web Document Clustering Lab Project based on the MDL clustering suite http://www.cs.ccsu.edu/~markov/mdlclustering/ Zdravko Markov Computer Science Department Central Connecticut State University New Britain,

More information

Yu-Han Chang. USC Information Sciences Institute 4676 Admiralty Way (617) 678-2486 Marina del Rey, CA 90292

Yu-Han Chang. USC Information Sciences Institute 4676 Admiralty Way (617) 678-2486 Marina del Rey, CA 90292 Yu-Han Chang USC Information Sciences Institute ychang@isi.edu 4676 Admiralty Way (617) 678-2486 Marina del Rey, CA 90292 Research Interests My research centers on learning in rich multi-agent environments.

More information

Semantic Concept Based Retrieval of Software Bug Report with Feedback

Semantic Concept Based Retrieval of Software Bug Report with Feedback Semantic Concept Based Retrieval of Software Bug Report with Feedback Tao Zhang, Byungjeong Lee, Hanjoon Kim, Jaeho Lee, Sooyong Kang, and Ilhoon Shin Abstract Mining software bugs provides a way to develop

More information

Title. Introduction to Data Mining. Dr Arulsivanathan Naidoo Statistics South Africa. OECD Conference Cape Town 8-10 December 2010.

Title. Introduction to Data Mining. Dr Arulsivanathan Naidoo Statistics South Africa. OECD Conference Cape Town 8-10 December 2010. Title Introduction to Data Mining Dr Arulsivanathan Naidoo Statistics South Africa OECD Conference Cape Town 8-10 December 2010 1 Outline Introduction Statistics vs Knowledge Discovery Predictive Modeling

More information

Exploration and Visualization of Post-Market Data

Exploration and Visualization of Post-Market Data Exploration and Visualization of Post-Market Data Jianying Hu, PhD Joint work with David Gotz, Shahram Ebadollahi, Jimeng Sun, Fei Wang, Marianthi Markatou Healthcare Analytics Research IBM T.J. Watson

More information

How To Become A Data Scientist

How To Become A Data Scientist Programme Specification Awarding Body/Institution Teaching Institution Queen Mary, University of London Queen Mary, University of London Name of Final Award and Programme Title Master of Science (MSc)

More information

Blog Post Extraction Using Title Finding

Blog Post Extraction Using Title Finding Blog Post Extraction Using Title Finding Linhai Song 1, 2, Xueqi Cheng 1, Yan Guo 1, Bo Wu 1, 2, Yu Wang 1, 2 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 2 Graduate School

More information

Data, Measurements, Features

Data, Measurements, Features Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are

More information

Industry-Driven Master Certificate in

Industry-Driven Master Certificate in Industry-Driven Master Certificate in Data Science Gianluca Reali, EGI Community Forum 2015, November 12, 2015 Industry-Driven Master Outline External Contexts Academic Program Design Approach Analysis

More information

INFORMATION TECHNOLOGY (IT) 515

INFORMATION TECHNOLOGY (IT) 515 INFORMATION TECHNOLOGY (IT) 515 202 Old Union, (309) 438-8338 Web address: IT.IllinoisState.edu Director: Mary Elaine Califf. Tenured/Tenure-track Faculty: Professors: Gyires, Li, Lim, Mahatanankoon. Associate

More information

Computer Vision (Recognition, Detection and Classification Problems)

Computer Vision (Recognition, Detection and Classification Problems) Mohammad Moghimi Curriculum Vitae 9234 Regents Rd Apt H La Jolla, CA, 92037 H (858) 888-3337 B mmoghimi@cs.cornell.edu Í http://cs.ucsd.edu/~mmoghimi Interests Computer Vision (Recognition, Detection and

More information

Curriculum Vitae. John M. Zelle, Ph.D.

Curriculum Vitae. John M. Zelle, Ph.D. Curriculum Vitae John M. Zelle, Ph.D. Address Department of Math, Computer Science, and Physics Wartburg College 100 Wartburg Blvd. Waverly, IA 50677 (319) 352-8360 email: john.zelle@wartburg.edu Education

More information

Modeling and Design of Intelligent Agent System

Modeling and Design of Intelligent Agent System International Journal of Control, Automation, and Systems Vol. 1, No. 2, June 2003 257 Modeling and Design of Intelligent Agent System Dae Su Kim, Chang Suk Kim, and Kee Wook Rim Abstract: In this study,

More information

Text Analytics with Ambiverse. Text to Knowledge. www.ambiverse.com

Text Analytics with Ambiverse. Text to Knowledge. www.ambiverse.com Text Analytics with Ambiverse Text to Knowledge www.ambiverse.com Version 1.0, February 2016 WWW.AMBIVERSE.COM Contents 1 Ambiverse: Text to Knowledge............................... 5 1.1 Text is all Around

More information

NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE. Venu Govindaraju

NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE. Venu Govindaraju NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE Venu Govindaraju BIOMETRICS DOCUMENT ANALYSIS PATTERN RECOGNITION 8/24/2015 ICDAR- 2015 2 Towards a Globally Optimal Approach for Learning Deep Unsupervised

More information

480093 - TDS - Socio-Environmental Data Science

480093 - TDS - Socio-Environmental Data Science Coordinating unit: Teaching unit: Academic year: Degree: ECTS credits: 2015 480 - IS.UPC - University Research Institute for Sustainability Science and Technology 715 - EIO - Department of Statistics and

More information

Curriculum Vitae. May 10, 1975 (Born in Alexandria, Egypt)

Curriculum Vitae. May 10, 1975 (Born in Alexandria, Egypt) Curriculum Vitae NAME: BIRTH DATE: ADDRESS: Islam Tharwat Elkabani May 10, 1975 (Born in Alexandria, Egypt) Beirut Arab University, Faculty of Science, Department of Mathematics and Computer Science, Beirut,

More information

Collective Behavior Prediction in Social Media. Lei Tang Data Mining & Machine Learning Group Arizona State University

Collective Behavior Prediction in Social Media. Lei Tang Data Mining & Machine Learning Group Arizona State University Collective Behavior Prediction in Social Media Lei Tang Data Mining & Machine Learning Group Arizona State University Social Media Landscape Social Network Content Sharing Social Media Blogs Wiki Forum

More information

Curriculum Vitae RESEARCH INTERESTS EDUCATION. SELECTED PUBLICATION Journal. Current Employment: (August, 2012 )

Curriculum Vitae RESEARCH INTERESTS EDUCATION. SELECTED PUBLICATION Journal. Current Employment: (August, 2012 ) Curriculum Vitae Michael Tu Current Employment: (August, 2012 ) Assistant Professor Department of Computer Information Technology and Graphics School of Technology Purdue University Calumet Email: manghui.tu@purduecal.edu

More information

ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING)

ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING) ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING) Gabriela Ochoa http://www.cs.stir.ac.uk/~goc/ OUTLINE Preliminaries Classification and Clustering Applications

More information

Resume of Hanan H. Elazhary

Resume of Hanan H. Elazhary Resume of Hanan H. Elazhary Home Phone: 35853017, 35853986 Cell Phone: 0112302019 E-mail: hanan@eri.sci.eg, hananelazhary@hotmail.com Nationality: Egyptian Gender: Female EDUCATION Ph.D. in Computer Science

More information

Random forest algorithm in big data environment

Random forest algorithm in big data environment Random forest algorithm in big data environment Yingchun Liu * School of Economics and Management, Beihang University, Beijing 100191, China Received 1 September 2014, www.cmnt.lv Abstract Random forest

More information

Florian M. Federspiel

Florian M. Federspiel McDonough School of Business, Georgetown University Rafik B. Hariri Building, 5th floor, 37th and O Streets, NW, Washington DC, 20057, USA Email: fmf13@georgetown.edu Tel: +1 202

Grid Density Clustering Algorithm

Grid Density Clustering Algorithm Amandeep Kaur Mann 1, Navneet Kaur 2, Scholar, M.Tech (CSE), RIMT, Mandi Gobindgarh, Punjab, India 1 Assistant Professor (CSE), RIMT, Mandi Gobindgarh, Punjab, India 2

Effective Mentor Suggestion System for Collaborative Learning

Effective Mentor Suggestion System for Collaborative Learning Advait Raut 1 Upasana G 2 Ramakrishna Bairi 3 Ganesh Ramakrishnan 2 (1) IBM, Bangalore, India, 560045 (2) IITB, Mumbai, India, 400076 (3)

Classifying Manipulation Primitives from Visual Data

Classifying Manipulation Primitives from Visual Data Sandy Huang and Dylan Hadfield-Menell Abstract One approach to learning from demonstrations in robotics is to make use of a classifier to predict if

Identifying Focus, Techniques and Domain of Scientific Papers

Identifying Focus, Techniques and Domain of Scientific Papers Sonal Gupta Department of Computer Science Stanford University Stanford, CA 94305 sonal@cs.stanford.edu Christopher D. Manning Department of

An Empirical Study of Application of Data Mining Techniques in Library System

An Empirical Study of Application of Data Mining Techniques in Library System Veepu Uppal Department of Computer Science and Engineering, Manav Rachna College of Engineering, Faridabad, India Gunjan Chindwani

AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM

AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM Luis Alexandre Rodrigues and Nizam Omar Department of Electrical Engineering, Mackenzie Presbyterian University, Brazil, São Paulo 71251911@mackenzie.br, nizam.omar@mackenzie.br

COMPUTER SCIENCE PROGRAM

COMPUTER SCIENCE PROGRAM Master of Science in Computer Science (M.S.C.S.) Degree DEGREE INFORMATION CONTACT INFORMATION Program Admission Deadlines: Fall: June 1February 15 Spring: October 15 Summer: No

Enhancing Quality of Data using Data Mining Method

JOURNAL OF COMPUTING, VOLUME 2, ISSUE 9, SEPTEMBER 2, ISSN 25-967 WWW.JOURNALOFCOMPUTING.ORG 9 Enhancing Quality of Data using Data Mining Method Fatemeh Ghorbanpour A., Mir M. Pedram, Kambiz Badie, Mohammad

The Masters of Science in Information Systems & Technology

The Masters of Science in Information Systems & Technology College of Engineering and Computer Science University of Michigan-Dearborn A Rackham School of Graduate Studies Program PH: 313-593-5361; FAX:

Applying Deep Learning to Car Data Logging (CDL) and Driver Assessor (DA) October 22-Oct-15

Applying Deep Learning to Car Data Logging (CDL) and Driver Assessor (DA) October 22-Oct-15 GENIVI is a registered trademark of the GENIVI Alliance in the USA and other countries Copyright GENIVI Alliance

Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information

Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information Satoshi Sekine Computer Science Department New York University sekine@cs.nyu.edu Kapil Dalwani Computer Science Department

Benefits of HPC for NLP besides big data

Benefits of HPC for NLP besides big data Barbara Plank Center for Sprogteknologie (CST) University of Copenhagen, Denmark http://cst.dk/bplank Web-Scale Natural Language Processing in Northern Europe Oslo,

Practical Applications of DATA MINING. Sang C Suh Texas A&M University Commerce JONES & BARTLETT LEARNING

Practical Applications of DATA MINING Sang C Suh Texas A&M University Commerce JONES & BARTLETT LEARNING Contents Preface xi Foreword by Murat M.Tanik xvii Foreword by John Kocur xix Chapter 1 Introduction

Reference Books. Data Mining. Supervised vs. Unsupervised Learning. Classification: Definition. Classification k-nearest neighbors

Classification k-nearest neighbors Data Mining Dr. Engin YILDIZTEPE Reference Books Han, J., Kamber, M., Pei, J., (2011). Data Mining: Concepts and Techniques. Third edition. San Francisco: Morgan Kaufmann

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery

Index Contents Page No. 1. Introduction 1 1.1 Related Research 2 1.2 Objective of Research Work 3 1.3 Why Data Mining is Important 3 1.4 Research Methodology 4 1.5 Research Hypothesis 4 1.6 Scope 5 2.

A Survey on Intrusion Detection System with Data Mining Techniques

A Survey on Intrusion Detection System with Data Mining Techniques Ms. Ruth D 1, Mrs. Lovelin Ponn Felciah M 2 1 M.Phil Scholar, Department of Computer Science, Bishop Heber College (Autonomous), Trichirappalli,

How To Create A Text Classification System For Spam Filtering

Term Discrimination Based Robust Text Classification with Application to Email Spam Filtering PhD Thesis Khurum Nazir Junejo 2004-03-0018 Advisor: Dr. Asim Karim Department of Computer Science Syed Babar

DATA MINING - SELECTED TOPICS

DATA MINING - SELECTED TOPICS Peter Brezany Institute for Software Science University of Vienna E-mail : brezany@par.univie.ac.at 1 MINING SPATIAL DATABASES 2 Spatial Database Systems SDBSs offer spatial

Computer-Based Text- and Data Analysis Technologies and Applications. Mark Cieliebak 9.6.2015

Computer-Based Text- and Data Analysis Technologies and Applications Mark Cieliebak 9.6.2015 Data Scientist analyze Data Library use 2 About Me Mark Cieliebak + Software Engineer & Data Scientist + PhD

CLASSIFYING NETWORK TRAFFIC IN THE BIG DATA ERA

CLASSIFYING NETWORK TRAFFIC IN THE BIG DATA ERA Professor Yang Xiang Network Security and Computing Laboratory (NSCLab) School of Information Technology Deakin University, Melbourne, Australia http://anss.org.au/nsclab

Machine Learning. 01 - Introduction

Machine Learning 01 - Introduction Machine learning course One lecture (Wednesday, 9:30, 346) and one exercise (Monday, 17:15, 203). Oral exam, 20 minutes, 5 credit points. Some basic mathematical knowledge

01219211 Software Development Training Camp 1 (0-3) Prerequisite : 01204214 Program development skill enhancement camp, at least 48 person-hours.

(International Program) 01219141 Object-Oriented Modeling and Programming 3 (3-0) Object concepts, object-oriented design and analysis, object-oriented analysis relating to developing conceptual models

Chapter 12 Discovering New Knowledge Data Mining

Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to

Sharareh Noorbaloochi Department of Psychology New York University 6 Washington Place, 559, New York, NY 10003 noorbaloochi@nyu.

Sharareh Noorbaloochi Department of Psychology New York University 6 Washington Place, 559, New York, NY 10003 noorbaloochi@nyu.edu (650) 919-3485 EDUCATION AND EMPLOYMENT Postdoctoral Associate, Department

Dr Artyom Nahapetyan

Dr Artyom Nahapetyan 2930 SW 23 rd Terrace, Apt 1304, Gainesville, FL 32608 352-334-7283 ext. 308 (office) 352-870-8404 (cell) Artyom@ufl.edu Artyom@InnovativeScheduling.com Nahapetyan.Artyom@gmail.com

Enhanced Boosted Trees Technique for Customer Churn Prediction Model

IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 04, Issue 03 (March. 2014), V5 PP 41-45 www.iosrjen.org Enhanced Boosted Trees Technique for Customer Churn Prediction

Curriculum Vitae. 1 Person Dr. Horst O. Bunke, Prof. Em. Date of birth July 30, 1949 Place of birth Langenzenn, Germany Citizenship Swiss and German

Curriculum Vitae 1 Person Name Dr. Horst O. Bunke, Prof. Em. Date of birth July 30, 1949 Place of birth Langenzenn, Germany Citizenship Swiss and German 2 Education 1974 Dipl.-Inf. Degree from the University

ISSUES IN RULE BASED KNOWLEDGE DISCOVERING PROCESS

Advances and Applications in Statistical Sciences Proceedings of The IV Meeting on Dynamics of Social and Economic Systems Volume 2, Issue 2, 2010, Pages 303-314 2010 Mili Publications ISSUES IN RULE BASED

Bob Boothe. Education. Research Interests. Teaching Experience

Bob Boothe Computer Science Dept. University of Southern Maine 96 Falmouth St. P.O. Box 9300 Portland, ME 04103--9300 (207) 780-4789 email: boothe@usm.maine.edu 54 Cottage Park Rd. Portland, ME 04103 (207)

How To Use Data Mining For Knowledge Management In Technology Enhanced Learning

Proceedings of the 6th WSEAS International Conference on Applications of Electrical Engineering, Istanbul, Turkey, May 27-29, 2007 115 Data Mining for Knowledge Management in Technology Enhanced Learning

Using Semantic Data Mining for Classification Improvement and Knowledge Extraction

Using Semantic Data Mining for Classification Improvement and Knowledge Extraction Fernando Benites and Elena Sapozhnikova University of Konstanz, 78464 Konstanz, Germany. Abstract. The objective of this

PROGRAMME SPECIFICATION POSTGRADUATE PROGRAMME

PROGRAMME SPECIFICATION POSTGRADUATE PROGRAMME KEY FACTS Programme name Advanced Computer Science Award MSc School Mathematics, Computer Science and Engineering Department or equivalent Department of Computing

Graduate Program Handbook M.S. and Ph.D. Degrees

Graduate Program Handbook M.S. and Ph.D. Degrees Department of Computer Science University of New Hampshire updated: Summer 2012 1 Overview The department offers both an M.S. in Computer Science and a

Master of Software Engineering BROCHURE

Master of Software Engineering BROCHURE for Vietnamese Students in 2015 Industrial relevancy Opportunities to work at FPT Software during the time of study as one required component of the course Some

Machine Learning. CUNY Graduate Center, Spring 2013. Professor Liang Huang. huang@cs.qc.cuny.edu

Machine Learning CUNY Graduate Center, Spring 2013 Professor Liang Huang huang@cs.qc.cuny.edu http://acl.cs.qc.edu/~lhuang/teaching/machine-learning Logistics Lectures M 9:30-11:30 am Room 4419 Personnel

Mining the Software Change Repository of a Legacy Telephony System

Mining the Software Change Repository of a Legacy Telephony System Jelber Sayyad Shirabad, Timothy C. Lethbridge, Stan Matwin School of Information Technology and Engineering University of Ottawa, Ottawa,

Agreement on Dual Degree Master Program in Computer Science. Politechnika Warszawska. Technische Universität Berlin

Agreement on Dual Degree Master Program in Computer Science between Politechnika Warszawska Faculty of Electronics and Information Technology and Technische Universität Berlin School of Electrical Engineering

Experiments in Web Page Classification for Semantic Web

Experiments in Web Page Classification for Semantic Web Asad Satti, Nick Cercone, Vlado Kešelj Faculty of Computer Science, Dalhousie University E-mail: {rashid,nick,vlado}@cs.dal.ca Abstract We address

Hyoduk Shin. Curriculum Vitae. Academic Appointment. 07/2008-06/2012, Kellogg School of Management, Northwestern University.

Hyoduk Shin Curriculum Vitae Rady School of Management University of California - San Diego 9500 Gilman Drive #0553 La Jolla, CA 92093-0553 email: hdshin@ucsd.edu Tel: (858) 534-3768 Academic Appointment

Graduate Programs. Dept of Computer Science. Dr. Weining Zhang

Graduate in Dept of Computer Science Univ. of Texas at San Antonio Dr. Weining Zhang Overview Two graduate degrees: Master of Science (MS) in Computer Science PhD in Computer Science Currently, there are

DEPARTMENT OF COMPUTER SCIENCE

DEPARTMENT OF COMPUTER SCIENCE Faculty of Engineering DEPARTMENT OF COMPUTER SCIENCE MSc REGULATIONS AND PROCEDURES (Revised: September 2013) TABLE OF CONTENTS 1. MSC ADMISSION REQUIREMENTS 1.1 Application

TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM

TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam