BigData@Chalmers Machine Learning Business Intelligence, Culturomics and Life Sciences



Similar documents
An Introduction to Data Mining

BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics

Data Isn't Everything

Data Integration. Lectures 16 & 17. ECS289A, WQ03, Filkov

Learning outcomes. Knowledge and understanding. Competence and skills

The University of Jordan

Introduction to Data Mining

Master's projects at ITMO University. Daniil Chivilikhin PhD ITMO University

Network Machine Learning Research Group. Intended status: Informational October 19, 2015 Expires: April 21, 2016

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

CS 2750 Machine Learning. Lecture 1. Machine Learning. CS 2750 Machine Learning.

MSCA Introduction to Statistical Concepts

A1 Introduction to Data exploration and Machine Learning

Data Analytics at NICTA. Stephen Hardy National ICT Australia (NICTA)

Some Research Challenges for Big Data Analytics of Intelligent Security

How To Get A Computer Engineering Degree

CS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 Real-Time Systems. CSCI 522 High Performance Computing

Page 1 of 5. (Modules, Subjects) SENG DSYS PSYS KMS ADB INS IAT

Learning is a very general term denoting the way in which agents:

Computer Science Electives and Clusters

A Review of Data Mining Techniques

Sanjeev Kumar. contribute

Big Data Mining Services and Knowledge Discovery Applications on Clouds

Information Management course

A Systemic Artificial Intelligence (AI) Approach to Difficult Text Analytics Tasks

DATA SCIENCE ADVISING NOTES David Wild - updated May 2015

Applications of Deep Learning to the GEOINT mission. June 2015

Curriculum Vitae Ruben Sipos

Statistics for BIG data

Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing

Big Data Analytics: Where is it Going and How Can it Be Taught at the Undergraduate Level?

Comparison of K-means and Backpropagation Data Mining Algorithms

Using Artificial Intelligence to Manage Big Data for Litigation

An Overview of Knowledge Discovery Database and Data mining Techniques

Doctor of Philosophy in Computer Science

DATA MINING TECHNIQUES AND APPLICATIONS

The Data Mining Process

Healthcare data analytics. Da-Wei Wang Institute of Information Science

Introduction. A. Bellaachia Page: 1

Machine Learning. Chapter 18, 21. Some material adopted from notes by Chuck Dyer

IEEE International Conference on Computing, Analytics and Security Trends CAST-2016 (19 21 December, 2016) Call for Paper

How To Understand And Understand The Theory Of Computational Finance

Big Data from a Database Theory Perspective

REGULATIONS FOR THE DEGREE OF MASTER OF SCIENCE IN COMPUTER SCIENCE (MSc[CompSc])

Master of Science in Computer Science

Machine Learning CS Lecture 01. Razvan C. Bunescu School of Electrical Engineering and Computer Science

Data Mining for Customer Service Support. Senioritis Seminar Presentation Megan Boice Jay Carter Nick Linke KC Tobin

Advice for Students completing the B.S. degree in Computer Science based on Quarters How to Satisfy Computer Science Related Electives

Professional Organization Checklist for the Computer Science Curriculum Updates. Association of Computing Machinery Computing Curricula 2008

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS

BIG DATA & DATA SCIENCE

Data Mining Part 5. Prediction

life science data mining

Introduction to Data Mining

Role Description. Position of a Data Scientist Machine Learning at Fractal Analytics

not possible or was possible at a high cost for collecting the data.

Bachelor Degree in Informatics Engineering Master courses

Search and Data Mining: Techniques. Applications Anya Yarygina Boris Novikov

MapReduce Approach to Collective Classification for Networks

BIG DATA AND ANALYTICS

Graduate Co-op Students Information Manual. Department of Computer Science. Faculty of Science. University of Regina

Challenges for Data Driven Systems

Prediction of Heart Disease Using Naïve Bayes Algorithm

Software Development Training Camp 1 (0-3) Prerequisite : Program development skill enhancement camp, at least 48 person-hours.

Machine Learning and Data Analysis overview. Department of Cybernetics, Czech Technical University in Prague.

Data, Measurements, Features

MA2823: Foundations of Machine Learning

REGULATIONS FOR THE DEGREE OF MASTER OF SCIENCE IN COMPUTER SCIENCE (MSc[CompSc])

Mining Online GIS for Crime Rate and Models based on Frequent Pattern Analysis

Data Mining and Soft Computing. Francisco Herrera

Ming-Wei Chang. Machine learning and its applications to natural language processing, information retrieval and data mining.

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

Veterinary epidemiology in the era of big data: Value, Volume and Velocity of information extraction

Steven C.H. Hoi. School of Computer Engineering Nanyang Technological University Singapore

Machine Learning and Data Mining. Fundamentals, robotics, recognition

Big Data Analytics. Prof. Dr. Lars Schmidt-Thieme

Machine Learning: Overview

ADVANCED MACHINE LEARNING. Introduction

Deep Learning For Text Processing

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING

Course Requirements for the Ph.D., M.S. and Certificate Programs

Challenges of Cloud Scale Natural Language Processing

Tensor Factorization for Multi-Relational Learning

CPSC 340: Machine Learning and Data Mining. Mark Schmidt University of British Columbia Fall 2015

Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research

Advanced analytics at your hands

Why is Internal Audit so Hard?

Analytics on Big Data

Connecting Basic Research and Healthcare Big Data

The INFUSIS Project Data and Text Mining for In Silico Modeling

CAPTURING THE VALUE OF UNSTRUCTURED DATA: INTRODUCTION TO TEXT MINING

Data-intensive HPC: opportunities and challenges. Patrick Valduriez

Reference Books. Data Mining. Supervised vs. Unsupervised Learning. Classification: Definition. Classification k-nearest neighbors

HT2015: SC4 Statistical Data Mining and Machine Learning

Data Mining Solutions for the Business Environment

Integrating Genetic Data into Clinical Workflow with Clinical Decision Support Apps

and Hung-Wen Chang 1 Department of Human Resource Development, Hsiuping University of Science and Technology, Taichung City 412, Taiwan 3

Business Intelligence Integration. Joel Da Costa, Takudzwa Mabande, Richard Migwalla Antoine Bagula, Joseph Balikuddembe

ICSES Journal on Image Processing and Pattern Recognition (IJIPPR), Aug. 2015, Vol. 1, No. 1

Transcription:

BigData@Chalmers Machine Learning Business Intelligence, Culturomics and Life Sciences Devdatt Dubhashi LAB (Machine Learning. Algorithms, Computational Biology) D&IT Chalmers

Entity Disambiguation Match names in text with the entity behind them Fundamental problem, addressed at annual competitions like Semeval Disambiguation is needed everywhere. Databases, web mining, linguistics, Used at Recorded Future (exemplified next!)

Judge a man by the company he keeps. - Euripides

Pakistan Oxford Uni. India Chris Anderson TED Future Publishing San Francisco

Chris Anderson

Graph Communities

Classification with Graph Kernels

Graph Embeddings and Kernels 3 2 1 ϑ 4 5 1 Embed discrete combinatorial object (graph) into continuous Euclidean space Define kernel based on geometry of Euclidean sp. V. Jethava et al NIPS 2012, JMLR 2013 T. Kerola, L. Hermansson, V. Jethava, F. Johansson CIKM 2013 F. Johansson, V. Jethava et al ICML 2014.

Demonstrator at Recorded Future Classifies names as ambiguous or unique Uses graph classification to classify occurrence graphs of names Ambiguous or Unique? Achieved state-of-the-art results (CIKM, 2013). Powerful extension for complete disambiguation in progress Parallel/Distributed implementation in GraphLab

Towards a knowledge-based culturomics Språkbanken (Swedish Language Bank), University of Gothenburg Language Technology, Lund University LAB Group Department of Computer Science and Engineering, Chalmers University of Technology

Word Embeddings

Deep Learning (Neural Networks) Revolutionized vision and speech systems Dramatic improvements in image classification near human level. Skype real time translation from English to Chinese.

Word Embeddings capture meaning

Dealing with information overload

Document summarization Word vectors + Multiple Kernel learning + Submodular optimization M. Kågeback, O. Mogren et al, Extractive Summarization using Continuous Vector Space Models, Workshop on (CVSC) EACL 2014

Word sense induction Instance cloud for: 'power' energy unit system battery x performance high allows engine equipment processing systems failure management provide him political her government god influence state came us act labour given council about authority M. Kageback, F. Johansson et al, Neural context embeddings for automatic discovery of word senses, (NAACL 2015 workshop on Vector Space Modeling for NLP) Used an innovative clustering technique Exploited word and context vectors.

Senses of for paper Medium Essay Scholarly article Newspaper Newspaper firm Material Vis using t-sne

Probabilistic Regulation of Prediction of metabolic changes due to genetic or environmental perturbations diagnosing metabolic disorders discovering novel drug targets. Metabolism

Genetic Regulation of Metabolism: Using Factor Graphs and Belief Propagation

genetic regulatory network consisting of transcription factor genes, target genes and metabolic reactions

Data mining with Differential Privacy Programming language technology for differential privacy (Sands) Privacy policies for social networks (Schneider) Privacy

Chalmers Machine Learning Summer School 2015

Big Data Analytics May 25-29 Hadoop Spark Spotfire

SVMs and Kernel Methods Graph Theoretic Methods Probabilistic Graphical Models Deep Learning Bayesian Decision Theory Reinforcement Learning Business intelligence Natural Language Technology Life Sciences Transport (Volvo) Infectious disease epidemiology Medical Imaging Political Science

Data Science Securit yprivac y Algorithms Multicores /GPUs Probability and Statistics Optimization Database s Machine Learning Parallel programming Sparse modelling

Chalmers Data-X? Life Science and Engineering Transport Energy Smart Cities (Built Environment) Production Volvo cars (connected cars, historical data) AstraZeneca (mining medical literature) Seal software (mining legal contracts)

Data Science vs EScience Data-centric Probabilistic models GPUs Computational biology, NLP, social sciences Computation-centric Simulation Large clusters/grids Physics, Turbulent flows, Climate