Klaus-Robert Müller et al. Big Data and Machine Learning
1 Klaus-Robert Müller et al. Big Data and Machine Learning
2 Some Remarks: Machine Learning; small data (expensive!); big data; big data in neuroscience: BCI et al.; social media data; physics & materials
3 Toward Brain Computer Interfacing. Klaus-Robert Müller, Siamac Fazli, Jan Mehnert, Stefan Haufe, Frank Meinecke, Paul von Bünau, Franz Kiraly, Felix Biessmann, Sven Dähne, Johannes Höhne, Michael Tangermann, Carmen Vidaurre, Gabriel Curio, Benjamin Blankertz et al.
4 Invasive BCI at its best. Remark: 24 * 1000 * 3600 * 30000 ≈ 2 TB/day [from Schwartz]
5 Noninvasive Brain-Computer Interface DECODING
6 BCI for communication
7 Brain Pong with BBCI. Remark: 3 * 100 * 3600 * 1000 ≈ 1-2 GB/experiment
8 BBCI paradigms. Leitmotiv: let the machines learn. Healthy subjects untrained for BCI. A: training <10 min: right/left hand imagined movements; infer the respective brain activities (ML & SP). B: online feedback session
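The two data-volume remarks (slide 4 and slide 7) can be checked with a few lines of arithmetic. The sketch below simply multiplies out the products quoted on the slides; the slides do not say what each factor stands for, and the conversion of one sample to one byte (`bytes_per_sample=1`) is an assumption made here for illustration only.

```python
# Products quoted on slides 4 and 7, read here as raw sample counts.
invasive_samples_per_day = 24 * 1000 * 3600 * 30000
noninvasive_samples_per_experiment = 3 * 100 * 3600 * 1000

TB = 10**12  # decimal terabyte
GB = 10**9   # decimal gigabyte


def to_bytes(samples, bytes_per_sample=1):
    """Convert a sample count to bytes; bytes_per_sample is an assumption."""
    return samples * bytes_per_sample


invasive_tb = to_bytes(invasive_samples_per_day) / TB
noninvasive_gb = to_bytes(noninvasive_samples_per_experiment) / GB

print(f"invasive: {invasive_tb:.2f} TB/day")                 # invasive: 2.59 TB/day
print(f"noninvasive: {noninvasive_gb:.2f} GB/experiment")    # noninvasive: 1.08 GB/experiment
```

With one byte per sample the products land at the low end of the ranges quoted on the slides; a wider sample format would scale both figures proportionally.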
9 Machine learning approach to BCI: infer prototypical pattern; inference by the CSP algorithm
10 The cerebral cocktail party problem: use ICA/NGCA projections for artifact and noise removal; feature extraction and selection [cf. Ziehe et al. 2000, Blanchard et al. 2006]
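As a rough illustration of what "inference by CSP" can look like, here is a minimal common spatial patterns sketch for two classes. It is a textbook-style formulation (whiten the summed class covariances, then eigendecompose one class covariance), not the BBCI implementation; the function names, the trace normalisation, and the synthetic three-channel data are all assumptions made for this example.

```python
import numpy as np


def mean_covariance(trials):
    """Average trace-normalised channel covariance over trials.

    trials: array of shape (n_trials, n_channels, n_samples).
    """
    covs = []
    for x in trials:
        x = x - x.mean(axis=1, keepdims=True)  # remove per-channel offset
        c = x @ x.T
        covs.append(c / np.trace(c))
    return np.mean(covs, axis=0)


def csp_filters(trials_a, trials_b, n_pairs=1):
    """Return spatial filters (columns) and class-A variance ratios."""
    cov_a = mean_covariance(trials_a)
    cov_b = mean_covariance(trials_b)
    # Whiten the composite covariance cov_a + cov_b.
    evals, evecs = np.linalg.eigh(cov_a + cov_b)
    whitener = evecs @ np.diag(evals ** -0.5) @ evecs.T
    # In whitened space, eigenvalues of class A's covariance lie in [0, 1]:
    # near 1 means the direction carries mostly class-A variance.
    ratios, rotation = np.linalg.eigh(whitener @ cov_a @ whitener)
    filters = whitener @ rotation  # columns ordered by ascending ratio
    picks = list(range(n_pairs)) + list(range(-n_pairs, 0))
    return filters[:, picks], ratios[picks]


def log_variance_features(trials, filters):
    """Project each trial onto the filters and take log-variance per filter."""
    projected = np.einsum("cf,tcs->tfs", filters, trials)
    return np.log(projected.var(axis=2))


# Synthetic stand-in data: class A is strong on channel 0, class B on channel 1.
rng = np.random.default_rng(0)
scale_a = np.array([3.0, 1.0, 1.0])[None, :, None]
scale_b = np.array([1.0, 3.0, 1.0])[None, :, None]
trials_a = rng.standard_normal((20, 3, 200)) * scale_a
trials_b = rng.standard_normal((20, 3, 200)) * scale_b

filters, ratios = csp_filters(trials_a, trials_b)
features_a = log_variance_features(trials_a, filters)
features_b = log_variance_features(trials_b, filters)
```

The log-variance features would then typically be passed to a simple linear classifier; that step is left out here.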
11 BBCI Set-up Artifact removal [cf. Müller et al. 2001, 2007, 2008, Dornhege et al. 2003, 2007, Blankertz et al. 2004, 2005, 2006, 2007, 2008]
12 Shifting distributions within experiment
13 Correlating apples and oranges [Biessmann et al. Neuroimage 2012, Machine Learning 2010]
14
15 Temporal Dynamics of Web Data
16 Motivation [Biessmann et al, 2012, and submitted]
17 Canonical Trend Analysis for Social Networks
18 Data Extraction
19 Data Extraction: Retweet Location
20 Mean Location of Retweeted News Articles
21 Downsampling of Geographic Information
22 Canonical Trend Model
23 Why project onto the canonical subspace? Recent development: tkCCA allows optimal, nonlinear correlation over time [Biessmann et al. 2010]
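To make the idea of a canonical subspace concrete, the following is a minimal sketch of plain linear CCA between two multivariate signals. It is a deliberately simplified stand-in: tkCCA as cited on the slide additionally works with kernels and time lags, neither of which is modelled here. The function name, the small ridge term `reg`, and the synthetic data are assumptions for illustration.

```python
import numpy as np


def linear_cca(x, y, reg=1e-6):
    """First canonical pair for x (n, p) and y (n, q).

    Returns (wx, wy, corr): projection vectors and canonical correlation.
    """
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n = len(x)
    cxx = x.T @ x / n + reg * np.eye(x.shape[1])  # regularised auto-covariances
    cyy = y.T @ y / n + reg * np.eye(y.shape[1])
    cxy = x.T @ y / n                             # cross-covariance

    def inv_sqrt(c):
        w, v = np.linalg.eigh(c)
        return v @ np.diag(w ** -0.5) @ v.T

    kx, ky = inv_sqrt(cxx), inv_sqrt(cyy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    u, s, vt = np.linalg.svd(kx @ cxy @ ky)
    return kx @ u[:, 0], ky @ vt[0], s[0]


# Two noisy views that share one latent trend z.
rng = np.random.default_rng(0)
z = rng.standard_normal(500)
x = np.outer(z, [1.0, 0.5, 0.0]) + 0.1 * rng.standard_normal((500, 3))
y = np.outer(z, [0.0, -1.0, 2.0]) + 0.1 * rng.standard_normal((500, 3))

wx, wy, corr = linear_cca(x, y)
trend_x, trend_y = x @ wx, y @ wy  # projections onto the first canonical direction
```

Projecting each view onto its canonical direction recovers the shared trend while discarding variance that is not common to both views, which is the motivation the slide title points at.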
24 Canonical Trend Analysis
25 Canonical Trend Analysis
26 Efficient Computation of Canonical Trends [Schölkopf, Smola & Müller 98; Boser, Guyon & Vapnik 92]
27 Efficient Computation of Canonical Trends
28 Efficient Computation of Canonical Trends
29 Comparisons: Mean, PCA and Canonical Trends
30 Comparisons: Mean, PCA and Canonical Trends
31 Comparisons: Mean, PCA and Canonical Trends
32 Comparisons: Mean, PCA and Canonical Trends
33 Canonical Convolution
34 Spatiotemporal Analysis of Retweets of News
35 And now for something completely different [Montavon et al. 2013, Rupp et al. 2012]
36 IPAM 2011: Klaus-Robert Müller, Matthias Rupp, Anatole von Lilienfeld and Alexandre Tkachenko et al.
37 Machine Learning for chemical compound space Ansatz: instead of [from von Lilienfeld]
38 Machine Learning for chemical compound space. Ansatz: provide the same information to ML as to the SE: an XYZ file. Cast the data similarly as in the SE: unique and continuous in all of CCS; translationally, rotationally and permutationally invariant; symmetrical atoms contribute equally. "Coulomb" matrix [energy]: fill up with zeros for smaller molecules; diagonalize OR sort rows according to their norm; measure distance between molecules: [from von Lilienfeld]
39 Coulomb representation of molecules: $M_{ii} = 0.5\,Z_i^{2.4}$ and $M_{ij} = \frac{Z_i Z_j}{|R_i - R_j|}$ for $i \neq j$. Atoms $\{Z_1, R_1\}, \{Z_2, R_2\}, \{Z_3, R_3\}, \{Z_4, R_4\}, \ldots$ give $M_{ij}$, plus phantom atoms $\{0, R_{21}\}, \{0, R_{22}\}, \{0, R_{23}\}$. Coulomb Matrix (Rupp12)
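A small sketch of how such a Coulomb matrix could be built, using the definition from the work cited as Rupp12 (diagonal 0.5 Z_i^2.4, off-diagonal Z_i Z_j / |R_i - R_j|), with zero padding for smaller molecules and sorting by row norm as mentioned on the previous slide. Function names are hypothetical, and the three-atom geometry below is made up for illustration rather than taken from any dataset.

```python
import numpy as np


def coulomb_matrix(charges, coords, size):
    """Zero-padded Coulomb matrix for one molecule.

    charges: nuclear charges Z_i; coords: positions R_i, shape (n, 3);
    size: padded dimension (at least the number of atoms).
    """
    z = np.asarray(charges, dtype=float)
    r = np.asarray(coords, dtype=float)
    n = len(z)
    dist = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)              # placeholder, diagonal is overwritten below
    block = np.outer(z, z) / dist            # off-diagonal: Z_i Z_j / |R_i - R_j|
    np.fill_diagonal(block, 0.5 * z ** 2.4)  # diagonal: 0.5 Z_i^2.4
    m = np.zeros((size, size))               # padding rows/columns stay zero
    m[:n, :n] = block
    return m


def sort_by_row_norm(m):
    """Reorder rows and columns by descending row norm (permutation invariance)."""
    order = np.argsort(-np.linalg.norm(m, axis=1), kind="stable")
    return m[order][:, order]


# Made-up linear three-atom geometry (charges 1, 6, 7), padded to 5 atoms.
charges = [1, 6, 7]
coords = [[0.0, 0.0, 0.0], [0.0, 0.0, 2.0], [0.0, 0.0, 4.2]]
m = coulomb_matrix(charges, coords, size=5)

# The same molecule with its atoms listed in a different order.
m_perm = coulomb_matrix([7, 1, 6], [coords[2], coords[0], coords[1]], size=5)

descriptor = sort_by_row_norm(m)
descriptor_perm = sort_by_row_norm(m_perm)
```

After sorting, both atom orderings yield the same descriptor, which is the permutation invariance the ansatz asks for. When several rows have nearly equal norms the ordering can become ambiguous; the slide's alternative of diagonalizing avoids that at the cost of discarding information.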
40 Kernel ridge regression: distances between Coulomb matrices M define a Gaussian kernel matrix K. Predict the energy as a sum over weighted Gaussians, using weights that minimize the error on the training set. Exact solution. As many parameters as molecules, plus 2 global parameters: the characteristic length scale or kT of the system (σ) and the noise level (λ) [from von Lilienfeld]
41 The data: GDB-13, a database of all organic molecules (within stability and synthetic constraints) with 13 or fewer heavy atoms: 0.9B compounds. Blum & Reymond, JACS (2009) [from von Lilienfeld]
42 Results. March 2012: Rupp et al., PRL: 9.99 kcal/mol (kernels + eigenspectrum). December 2012: Montavon et al., NIPS: 3.51 kcal/mol (deep neural nets + Coulomb sets). More fun is yet to come... A prediction is considered chemically accurate when the MAE is below 1 kcal/mol. Dataset available at
43 Conclusion: Machine learning is a versatile, ready-to-use tool for data analysis; small data vs. big data; the fields of ML and databases will hit a limit in the near future: time for a new marriage
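The recipe on this slide (Gaussian kernel from pairwise distances, one weight per training example, closed-form solution with a noise level λ) can be sketched as follows. This is a generic kernel ridge regression on a toy one-dimensional target, not the pipeline behind the reported results; in the molecular setting the inputs would be flattened descriptors and the targets energies. All function names and the toy data are assumptions.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel between rows of a (n, d) and b (m, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def krr_fit(x_train, y_train, sigma, lam):
    """Closed-form weights: alpha = (K + lam * I)^{-1} y."""
    k = gaussian_kernel(x_train, x_train, sigma)
    return np.linalg.solve(k + lam * np.eye(len(x_train)), y_train)


def krr_predict(x_new, x_train, alpha, sigma):
    """Prediction as a weighted sum of Gaussians centred on training points."""
    return gaussian_kernel(x_new, x_train, sigma) @ alpha


# Toy stand-in for (descriptor, energy) pairs: learn sin(x) from 30 samples.
x_train = np.linspace(0.0, 2.0 * np.pi, 30)[:, None]
y_train = np.sin(x_train[:, 0])
alpha = krr_fit(x_train, y_train, sigma=1.0, lam=1e-6)

x_test = np.linspace(1.0, 5.0, 9)[:, None]
y_pred = krr_predict(x_test, x_train, alpha, sigma=1.0)
mae = np.mean(np.abs(y_pred - np.sin(x_test[:, 0])))  # mean absolute error
```

The model has one weight per training example (`alpha`) plus the two global hyperparameters `sigma` and `lam`, matching the parameter count described on the slide; in practice both hyperparameters would be chosen by cross-validation.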
44