What is Candidate Sampling


Say we have a multiclass or multi-label problem where each training example $(x_i, T_i)$ consists of a context $x_i$ and a small (multi)set of target classes $T_i$ out of a large universe $L$ of possible classes. For example, the problem might be to predict the next word (or the set of future words) in a sentence given the previous words.

We wish to learn a compatibility function $F(x, y)$ which says something about the compatibility of a class $y$ with a context $x$: for example, the probability of the class given the context.

Exhaustive training methods such as softmax and logistic regression require us to compute $F(x, y)$ for every class $y \in L$ for every training example. When $|L|$ is very large, this can be prohibitively expensive.

Candidate Sampling training methods involve constructing a training task in which, for each training example $(x_i, T_i)$, we only need to evaluate $F(x, y)$ for a small set of candidate classes $C_i \subset L$. Typically, the set of candidates $C_i$ is the union of the target classes with a randomly chosen sample of (other) classes $S_i \subset L$:

$$C_i = T_i \cup S_i$$

The random choice of $S_i$ may or may not depend on $x_i$ and/or $T_i$.

The training algorithm takes the form of a neural network, where the layer representing $F(x, y)$ is trained by back-propagation from a loss function.
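As a minimal sketch of the candidate construction $C_i = T_i \cup S_i$, the snippet below draws the sample uniformly without replacement from the class universe. The uniform sampler, the function name `build_candidates`, and its parameters are illustrative assumptions; the text only requires that some sampling algorithm be chosen.

```python
import random


def build_candidates(targets, num_classes, num_sampled, rng):
    """Return (candidates, sampled) for one training example.

    Assumption for illustration: classes are the integers 0..num_classes-1
    and the sample S is drawn uniformly without replacement.
    """
    sampled = set(rng.sample(range(num_classes), num_sampled))
    candidates = set(targets) | sampled  # C = T union S
    return candidates, sampled


rng = random.Random(0)
candidates, sampled = build_candidates({3, 7}, num_classes=1000, num_sampled=5, rng=rng)
```

Only the classes in `candidates` need a value of $F(x, y)$, so the cost per example scales with the size of the candidate set rather than with $|L|$.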

Table of Candidate Sampling Algorithms

| Algorithm | Positive training classes associated with training example $(x_i, T_i)$: $POS_i =$ | Negative training classes associated with training example $(x_i, T_i)$: $NEG_i =$ | Input to training loss $G(x, y) =$ | Training loss | $F(x, y)$ gets trained to approximate |
|---|---|---|---|---|---|
| Noise Contrastive Estimation (NCE) | $T_i$ | $S_i$ | $F(x, y) - \log(Q(y \mid x))$ | Logistic | $\log(P(y \mid x))$ |
| Negative Sampling | $T_i$ | $S_i$ | $F(x, y)$ | Logistic | $\log\left(\frac{P(y \mid x)}{Q(y \mid x)}\right)$ |
| Sampled Logistic | $T_i$ | $(S_i - T_i)$ | $F(x, y) - \log(Q(y \mid x))$ | Logistic | $\operatorname{logodds}(y \mid x) = \log\left(\frac{P(y \mid x)}{1 - P(y \mid x)}\right)$ |
| Full Logistic | $T_i$ | $(L - T_i)$ | $F(x, y)$ | Logistic | $\log(\operatorname{odds}(y \mid x)) = \log\left(\frac{P(y \mid x)}{1 - P(y \mid x)}\right)$ |
| Full Softmax | $T_i = \{t_i\}$ | $(L - T_i)$ | $F(x, y)$ | Softmax | $\log(P(y \mid x)) + K(x)$ |
| Sampled Softmax | $T_i = \{t_i\}$ | $(S_i - T_i)$ | $F(x, y) - \log(Q(y \mid x))$ | Softmax | $\log(P(y \mid x)) + K(x)$ |

$Q(y \mid x)$ is defined as the probability (or expected count), according to the sampling algorithm, of the class $y$ in the (multi)set of sampled classes given the context $x$.

$K(x)$ is an arbitrary function that does not depend on the candidate class. Since softmax involves a normalization, the addition of such a function does not affect the computed probabilities.

$$\text{logistic training loss} = \sum_{(x_i, T_i)} \left( \sum_{y \in POS_i} \log\bigl(1 + \exp(-G(x_i, y))\bigr) + \sum_{y \in NEG_i} \log\bigl(1 + \exp(G(x_i, y))\bigr) \right)$$

$$\text{softmax training loss} = \sum_{(x_i, T_i)} \left( -G(x_i, t_i) + \log \sum_{y \in POS_i \cup NEG_i} \exp(G(x_i, y)) \right)$$

NCE and Negative Sampling generalize to the case where $T_i$ is a multiset. In this case, $P(y \mid x)$ denotes the expected count of $y$ in $T_i$. Similarly, NCE, Negative Sampling, and Sampled Logistic generalize to the case where $S_i$ is a multiset. In this case, $Q(y \mid x)$ denotes the expected count of $y$ in $S_i$.
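The two training losses can be sketched for a single training example as follows. The function names are illustrative assumptions, and the inputs are the already-computed values $G(x, y)$ for the positive and negative classes; summing the result over training examples gives the full loss.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def logistic_loss(g_pos, g_neg):
    """Per-example logistic loss.

    g_pos: values G(x, y) for y in POS; g_neg: values G(x, y) for y in NEG.
    """
    return sum(softplus(-g) for g in g_pos) + sum(softplus(g) for g in g_neg)


def softmax_loss(g_target, g_all):
    """Per-example softmax loss.

    g_target: G(x, t) for the single target class t.
    g_all: values G(x, y) for every y in POS union NEG (including t).
    """
    m = max(g_all)  # subtract the max before exponentiating for stability
    log_sum_exp = m + math.log(sum(math.exp(g - m) for g in g_all))
    return -g_target + log_sum_exp
```

With all inputs equal to zero, each logistic term contributes $\log 2$, and a two-candidate softmax loss is also $\log 2$, which is a quick sanity check on the signs.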

Sampled Softmax (A faster way to train a softmax classifier)

Reference:

Assume that we have a single-label problem. Each training example $(x_i, \{t_i\})$ consists of a context and one target class. We write $P(y \mid x)$ for the probability that the one target class is $y$ given that the context is $x$.

We would like to train a function $F(x, y)$ to produce softmax logits, that is, relative log probabilities of the class given the context:

$$F(x, y) \leftarrow \log(P(y \mid x)) + K(x)$$

where $K(x)$ is an arbitrary function that does not depend on $y$.

In full softmax training, for every training example $(x_i, \{t_i\})$, we would need to compute logits $F(x_i, y)$ for all classes $y \in L$. This can get expensive if the universe of classes $L$ is very large.

In Sampled Softmax, for each training example $(x_i, \{t_i\})$, we pick a small set $S_i \subset L$ of sampled classes according to a chosen sampling function $Q(y \mid x)$. Each class $y \in L$ is included in $S_i$ independently with probability $Q(y \mid x_i)$:

$$P(S_i = S \mid x_i) = \prod_{y \in S} Q(y \mid x_i) \prod_{y \in (L - S)} \bigl(1 - Q(y \mid x_i)\bigr)$$

We create a set of candidates $C_i$ containing the union of the target class and the sampled classes:

$$C_i = S_i \cup \{t_i\}$$

Our training task is to figure out, given this set $C_i$, which of the classes in $C_i$ is the target class.

For each class $y \in C_i$, we want to compute the posterior probability that $y$ is the target class given our knowledge of $x_i$ and $C_i$. We call this $P(t_i = y \mid x_i, C_i)$.

Applying Bayes' rule:

$$\begin{aligned}
P(t_i = y \mid x_i, C_i) &= P(t_i = y, C_i \mid x_i) \,/\, P(C_i \mid x_i) \\
&= P(t_i = y \mid x_i)\, P(C_i \mid t_i = y, x_i) \,/\, P(C_i \mid x_i) \\
&= P(y \mid x_i)\, P(C_i \mid t_i = y, x_i) \,/\, P(C_i \mid x_i)
\end{aligned}$$

Now to compute $P(C_i \mid t_i = y, x_i)$, we note that in order for this to happen, $S_i$ may or may not contain $y$, must contain all other elements of $C_i$, and must not contain any classes not in $C_i$. So:

$$\begin{aligned}
P(t_i = y \mid x_i, C_i) &= P(y \mid x_i) \prod_{y' \in C_i - \{y\}} Q(y' \mid x_i) \prod_{y' \in (L - C_i)} \bigl(1 - Q(y' \mid x_i)\bigr) \,/\, P(C_i \mid x_i) \\
&= \frac{P(y \mid x_i)}{Q(y \mid x_i)} \prod_{y' \in C_i} Q(y' \mid x_i) \prod_{y' \in (L - C_i)} \bigl(1 - Q(y' \mid x_i)\bigr) \,/\, P(C_i \mid x_i) \\
&= \frac{P(y \mid x_i)}{Q(y \mid x_i)} \,/\, K(x_i, C_i)
\end{aligned}$$

where $K(x_i, C_i)$ is a function that does not depend on $y$. So:

$$\log(P(t_i = y \mid x_i, C_i)) = \log(P(y \mid x_i)) - \log(Q(y \mid x_i)) + K'(x_i, C_i)$$

where $K'(x_i, C_i) = -\log K(x_i, C_i)$ likewise does not depend on $y$.

These are the relative logits that should feed into a softmax classifier predicting which of the candidates in $C_i$ is the true one.

Since we are trying to train the function $F(x, y)$ to approximate $\log(P(y \mid x))$, we take the layer in our network representing $F(x, y)$, subtract $\log(Q(y \mid x))$, and pass the result to a softmax classifier predicting which candidate is the true one.

$$\text{Training Softmax Input} = F(x, y) - \log(Q(y \mid x))$$

Backpropagating the gradients from that classifier trains $F$ to give us what we want.
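The posterior derived above can be checked numerically on a tiny universe. The sketch below enumerates every possible sample $S$ and target $t$, accumulates the joint probability of each pair that produces a given candidate set $C$, and compares the resulting posterior with a softmax over $\log P(y \mid x) - \log Q(y \mid x)$. The four-class universe and the values in `P` and `Q` are made-up numbers for illustration only.

```python
import itertools
import math

# Hypothetical toy distributions over a universe of four classes.
P = {0: 0.5, 1: 0.2, 2: 0.2, 3: 0.1}  # P(y | x): target distribution
Q = {0: 0.3, 1: 0.4, 2: 0.5, 3: 0.2}  # Q(y | x): inclusion probabilities
L = sorted(P)


def prob_of_sample(S):
    """P(S_i = S | x): each class is included independently with prob Q."""
    p = 1.0
    for y in L:
        p *= Q[y] if y in S else 1.0 - Q[y]
    return p


def brute_force_posterior(C):
    """P(t = y | x, C) by enumerating all (S, t) with S union {t} == C."""
    joint = {y: 0.0 for y in C}
    for r in range(len(L) + 1):
        for S in itertools.combinations(L, r):
            S = frozenset(S)
            for t in L:
                if S | {t} == C:
                    joint[t] += P[t] * prob_of_sample(S)
    total = sum(joint.values())
    return {y: joint[y] / total for y in C}


def softmax_posterior(C):
    """Softmax over the corrected logits log P(y|x) - log Q(y|x), y in C."""
    logits = {y: math.log(P[y]) - math.log(Q[y]) for y in C}
    m = max(logits.values())
    z = sum(math.exp(v - m) for v in logits.values())
    return {y: math.exp(logits[y] - m) / z for y in C}


C = frozenset({0, 2, 3})
exact = brute_force_posterior(C)
approx = softmax_posterior(C)
```

The two dictionaries agree up to floating-point error, which is the content of the derivation: subtracting $\log Q(y \mid x)$ from a logit that equals $\log P(y \mid x)$ yields the correct posterior over the candidates.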

Noise Contrastive Estimation (NCE)

Reference:

Each training example $(x_i, T_i)$ consists of a context and a small multiset of target classes. In practice, $T_i$ may always be a set or even a single class, but we use a multiset here for generality.

We use the following as a shorthand for the expected count of a class in the set of target classes for a context. In the case of sets with no duplicates, this is the probability of the class given the context:

$$P(y \mid x) := E(T(y) \mid x)$$

We would like to train a function $F(x, y)$ to approximate the log expected count of the class given the context, or, in the case of sets, the log probability of the class given the context:

$$F(x, y) \leftarrow \log(P(y \mid x))$$

For each example $(x_i, T_i)$, we pick a multiset of sampled classes $S_i$. In practice, it probably makes sense to pick a set, but we use a multiset here for generality. Our sampling algorithm may or may not depend on $x_i$ but may not depend on $T_i$. We construct a multiset of candidates consisting of the sum of the target classes and the sampled classes:

$$C_i = T_i + S_i$$

Our training task is to distinguish the true candidates from the sampled candidates. We have one positive training meta-example for each element of $T_i$ and one negative training meta-example for each element of $S_i$.

We introduce the shorthand $Q(y \mid x)$ to denote the expected count, according to our sampling algorithm, of a particular class in the set of sampled classes. If $S$ never contains duplicates, then this is a probability.

$$Q(y \mid x) := E(S(y) \mid x)$$

$$\operatorname{logodds}(y \text{ came from } T \text{ vs } S \mid x) = \log\left(\frac{P(y \mid x)}{Q(y \mid x)}\right) = \log(P(y \mid x)) - \log(Q(y \mid x))$$

The first term, $\log(P(y \mid x))$, is what we would like to train $F(x, y)$ to estimate.

We have a layer in our model which represents $F(x, y)$. We add to it the second term, $-\log(Q(y \mid x))$, which we compute analytically, and we pass the result to a logistic regression loss whose label indicates whether $y$ came from $T$ as opposed to $S$.

$$\text{Logistic Regression Input} = F(x, y) - \log(Q(y \mid x))$$

The backpropagation signal trains $F(x, y)$ to approximate what we want it to.
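A minimal sketch of the NCE loss for one training example, under the assumption that $F(x, y)$ and $\log Q(y \mid x)$ are supplied as dictionaries keyed by class. The function names and the example values of `p` and `q` are illustrative, not part of the original method description.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def nce_loss(f, log_q, targets, sampled):
    """Logistic loss on G(x, y) = F(x, y) - log Q(y | x).

    targets and sampled are iterables of classes; repeated entries are
    allowed, so multisets are handled by listing a class more than once.
    """
    loss = 0.0
    for y in targets:  # positive meta-examples: y came from T
        loss += softplus(-(f[y] - log_q[y]))
    for y in sampled:  # negative meta-examples: y came from S
        loss += softplus(f[y] - log_q[y])
    return loss


# If F equals log P exactly, the classifier outputs P / (P + Q),
# the probability that a candidate came from T rather than S.
p, q = 0.2, 0.05
prob_from_target = sigmoid(math.log(p) - math.log(q))
```

Here `prob_from_target` evaluates to $0.2 / (0.2 + 0.05) = 0.8$, matching the log-odds expression $\log(P(y \mid x)) - \log(Q(y \mid x))$.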

Negative Sampling

Reference: distributed representations of words and phrases and their compositionality.pdf

Negative Sampling is a simplified variant of Noise Contrastive Estimation in which we neglect to subtract off $\log(Q(y \mid x))$ during training. As a result, $F(x, y)$ is trained to approximate $\log(P(y \mid x)) - \log(Q(y \mid x))$.

It is noteworthy that in Negative Sampling, we are optimizing $F(x, y)$ to approximate something that depends on the sampling distribution $Q$. This makes the results highly dependent on the choice of sampling distribution. This is not true for the other algorithms described here.
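The dependence on $Q$ can be illustrated with a small numerical experiment for a single class. Assuming the class appears as a positive with expected count `p` and as a negative with expected count `q`, the expected logistic loss is `p * log(1 + exp(-g)) + q * log(1 + exp(g))`, and plain gradient descent on the input `g` converges to $\log(p / q)$. The step size, iteration count, and example values are arbitrary choices for this sketch.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_input(p, q, lr=5.0, steps=2000):
    """Minimize p*log(1+exp(-g)) + q*log(1+exp(g)) over g by gradient descent."""
    g = 0.0
    for _ in range(steps):
        grad = (p + q) * sigmoid(g) - p  # derivative of the expected loss
        g -= lr * grad
    return g


# In Negative Sampling the logistic input is F itself, so F moves with Q:
f_rare_sampler = fit_logistic_input(p=0.2, q=0.05)  # about log(4)
f_freq_sampler = fit_logistic_input(p=0.2, q=0.2)   # about log(1) = 0

# In NCE the logistic input is F - log Q, so F = g + log Q = log P either way:
nce_f_rare = fit_logistic_input(0.2, 0.05) + math.log(0.05)
nce_f_freq = fit_logistic_input(0.2, 0.2) + math.log(0.2)
```

Under the same target distribution, the Negative Sampling estimates differ between the two samplers, while both NCE estimates come out near $\log 0.2$.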

Sampled Logistic

Sampled Logistic is a variant of Noise Contrastive Estimation in which we discard without replacement all sampled classes that happen to also be target classes. This requires $T_i$ to be a set, as opposed to a multiset, though $S_i$ may be a multiset. As a result, we learn an estimator of the log-odds of a class as opposed to the log probability of a class. The math changes from the NCE math as follows:

$$\operatorname{logodds}(y \text{ came from } T \text{ vs } (S - T) \mid x) = \log\left(\frac{P(y \mid x)}{Q(y \mid x)\,(1 - P(y \mid x))}\right) = \log\left(\frac{P(y \mid x)}{1 - P(y \mid x)}\right) - \log(Q(y \mid x))$$

The first term, $\log\left(\frac{P(y \mid x)}{1 - P(y \mid x)}\right)$, is what we would like to train $F(x, y)$ to estimate.

We have a layer in our model which represents $F(x, y)$. We add to it the second term, $-\log(Q(y \mid x))$, which we compute analytically, and we pass the result to a logistic regression loss predicting whether $y$ came from $T$ vs $(S - T)$.

$$\text{Logistic Regression Input} = F(x, y) - \log(Q(y \mid x))$$

The backpropagation signal trains the $F(x, y)$ layer to approximate what we want it to.

$$F(x, y) \leftarrow \log\left(\frac{P(y \mid x)}{1 - P(y \mid x)}\right)$$
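The only change to the candidate construction relative to NCE is the removal of sampled classes that are also targets. A minimal sketch, with hypothetical function names and example values:

```python
import math


def sampled_logistic_negatives(targets, sampled):
    """Return NEG = S - T, keeping duplicates in S (S may be a multiset).

    targets must be a set; every sampled class that is also a target
    is discarded.
    """
    return [y for y in sampled if y not in targets]


def log_odds(p):
    """The quantity F(x, y) is trained to approximate: log(p / (1 - p))."""
    return math.log(p / (1.0 - p))


negatives = sampled_logistic_negatives({3, 7}, [1, 3, 3, 9, 1])
```

In this example the two occurrences of class 3 are dropped because 3 is a target, while the duplicate occurrences of class 1 are kept.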

Context-Specific vs. Generic Sampling

In the methods discussed, the sampling algorithm is allowed to depend on the context. It is possible that for some models, context-specific sampling will be very useful, in that we can generate context-dependent hard negatives and provide a more useful training signal. The authors have to this point focused on generic sampling algorithms such as uniform sampling and unigram sampling, which do not make use of the context. The reason is described in the next section.

Batchwise Sampling

We have focused on models which use the same set $S$ of sampled classes across a whole batch of training examples. This seems counterintuitive: shouldn't convergence be faster if we use different sampled classes for each training example? The reason for using the same sampled classes across a batch is computational.

In many of our models, $F(x, y)$ is computed as the dot product of a feature vector for the context (the top hidden layer of a neural network) and an embedding vector for the class. Computing the dot products of many feature vectors with many embedding vectors is a matrix multiplication, which is highly efficient on modern hardware, especially on GPUs. Batching like this often allows us to use hundreds or thousands of sampled classes without noticeable slowdown.

Another way to see it is that the overhead of fetching a class embedding across devices is greater than the time it takes to compute its dot products with hundreds or even thousands of feature vectors. So if we are going to use a sampled class with one context, it is virtually free to use it with all of the other contexts in the batch as well.
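The shape of the batchwise computation can be sketched in plain Python: every context in the batch is scored against the same sampled class embeddings, which is exactly a matrix product of the feature matrix with the transposed embedding matrix. The function name and the toy vectors are illustrative assumptions; a real model would hand this product to an optimized linear-algebra library rather than nested loops.

```python
def batch_logits(context_features, class_embeddings):
    """Return F(x_b, y_s) for every context b and shared sampled class s.

    context_features: list of B feature vectors (one per batch example).
    class_embeddings: list of n embedding vectors (one per sampled class),
                      shared by the whole batch.
    The result is a B x n list of dot products.
    """
    return [
        [sum(h * e for h, e in zip(features, embedding))
         for embedding in class_embeddings]
        for features in context_features
    ]


contexts = [[1.0, 0.0], [0.0, 2.0]]                 # batch of 2 contexts
embeddings = [[1.0, 1.0], [2.0, 0.0], [0.0, 3.0]]   # 3 shared sampled classes
logits = batch_logits(contexts, embeddings)
```

Each sampled embedding is fetched once and reused for all contexts in the batch, which is the cost argument made above.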


Properties of Indoor Received Signal Strength for WLAN Location Fingerprinting Propertes of Indoor Receved Sgnal Strength for WLAN Locaton Fngerprntng Kamol Kaemarungs and Prashant Krshnamurthy Telecommuncatons Program, School of Informaton Scences, Unversty of Pttsburgh E-mal: kakst2,prashk@ptt.edu

More information

The Greedy Method. Introduction. 0/1 Knapsack Problem

The Greedy Method. Introduction. 0/1 Knapsack Problem The Greedy Method Introducton We have completed data structures. We now are gong to look at algorthm desgn methods. Often we are lookng at optmzaton problems whose performance s exponental. For an optmzaton

More information

Lecture 3: Force of Interest, Real Interest Rate, Annuity

Lecture 3: Force of Interest, Real Interest Rate, Annuity Lecture 3: Force of Interest, Real Interest Rate, Annuty Goals: Study contnuous compoundng and force of nterest Dscuss real nterest rate Learn annuty-mmedate, and ts present value Study annuty-due, and

More information

HÜCKEL MOLECULAR ORBITAL THEORY

HÜCKEL MOLECULAR ORBITAL THEORY 1 HÜCKEL MOLECULAR ORBITAL THEORY In general, the vast maorty polyatomc molecules can be thought of as consstng of a collecton of two electron bonds between pars of atoms. So the qualtatve pcture of σ

More information

Section C2: BJT Structure and Operational Modes

Section C2: BJT Structure and Operational Modes Secton 2: JT Structure and Operatonal Modes Recall that the semconductor dode s smply a pn juncton. Dependng on how the juncton s based, current may easly flow between the dode termnals (forward bas, v

More information

Calculation of Sampling Weights

Calculation of Sampling Weights Perre Foy Statstcs Canada 4 Calculaton of Samplng Weghts 4.1 OVERVIEW The basc sample desgn used n TIMSS Populatons 1 and 2 was a two-stage stratfed cluster desgn. 1 The frst stage conssted of a sample

More information

Ring structure of splines on triangulations

Ring structure of splines on triangulations www.oeaw.ac.at Rng structure of splnes on trangulatons N. Vllamzar RICAM-Report 2014-48 www.rcam.oeaw.ac.at RING STRUCTURE OF SPLINES ON TRIANGULATIONS NELLY VILLAMIZAR Introducton For a trangulated regon

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Max Wellng Department of Computer Scence Unversty of Toronto 10 Kng s College Road Toronto, M5S 3G5 Canada wellng@cs.toronto.edu Abstract Ths s a note to explan support vector machnes.

More information

L10: Linear discriminants analysis

L10: Linear discriminants analysis L0: Lnear dscrmnants analyss Lnear dscrmnant analyss, two classes Lnear dscrmnant analyss, C classes LDA vs. PCA Lmtatons of LDA Varants of LDA Other dmensonalty reducton methods CSCE 666 Pattern Analyss

More information

Frequency Selective IQ Phase and IQ Amplitude Imbalance Adjustments for OFDM Direct Conversion Transmitters

Frequency Selective IQ Phase and IQ Amplitude Imbalance Adjustments for OFDM Direct Conversion Transmitters Frequency Selectve IQ Phase and IQ Ampltude Imbalance Adjustments for OFDM Drect Converson ransmtters Edmund Coersmeer, Ernst Zelnsk Noka, Meesmannstrasse 103, 44807 Bochum, Germany edmund.coersmeer@noka.com,

More information

SPEE Recommended Evaluation Practice #6 Definition of Decline Curve Parameters Background:

SPEE Recommended Evaluation Practice #6 Definition of Decline Curve Parameters Background: SPEE Recommended Evaluaton Practce #6 efnton of eclne Curve Parameters Background: The producton hstores of ol and gas wells can be analyzed to estmate reserves and future ol and gas producton rates and

More information

Complete Fairness in Secure Two-Party Computation

Complete Fairness in Secure Two-Party Computation Complete Farness n Secure Two-Party Computaton S. Dov Gordon Carmt Hazay Jonathan Katz Yehuda Lndell Abstract In the settng of secure two-party computaton, two mutually dstrustng partes wsh to compute

More information

An Empirical Study of Search Engine Advertising Effectiveness

An Empirical Study of Search Engine Advertising Effectiveness An Emprcal Study of Search Engne Advertsng Effectveness Sanjog Msra, Smon School of Busness Unversty of Rochester Edeal Pnker, Smon School of Busness Unversty of Rochester Alan Rmm-Kaufman, Rmm-Kaufman

More information

RequIn, a tool for fast web traffic inference

RequIn, a tool for fast web traffic inference RequIn, a tool for fast web traffc nference Olver aul, Jean Etenne Kba GET/INT, LOR Department 9 rue Charles Fourer 90 Evry, France Olver.aul@nt-evry.fr, Jean-Etenne.Kba@nt-evry.fr Abstract As networked

More information

STATISTICAL DATA ANALYSIS IN EXCEL

STATISTICAL DATA ANALYSIS IN EXCEL Mcroarray Center STATISTICAL DATA ANALYSIS IN EXCEL Lecture 6 Some Advanced Topcs Dr. Petr Nazarov 14-01-013 petr.nazarov@crp-sante.lu Statstcal data analyss n Ecel. 6. Some advanced topcs Correcton for

More information

Mining Multiple Large Data Sources

Mining Multiple Large Data Sources The Internatonal Arab Journal of Informaton Technology, Vol. 7, No. 3, July 2 24 Mnng Multple Large Data Sources Anmesh Adhkar, Pralhad Ramachandrarao 2, Bhanu Prasad 3, and Jhml Adhkar 4 Department of

More information

Dropout: A Simple Way to Prevent Neural Networks from Overfitting

Dropout: A Simple Way to Prevent Neural Networks from Overfitting Journal of Machne Learnng Research 15 (2014) 1929-1958 Submtted 11/13; Publshed 6/14 Dropout: A Smple Way to Prevent Neural Networks from Overfttng Ntsh Srvastava Geoffrey Hnton Alex Krzhevsky Ilya Sutskever

More information

The Mathematical Derivation of Least Squares

The Mathematical Derivation of Least Squares Pscholog 885 Prof. Federco The Mathematcal Dervaton of Least Squares Back when the powers that e forced ou to learn matr algera and calculus, I et ou all asked ourself the age-old queston: When the hell

More information

A Hierarchical Anomaly Network Intrusion Detection System using Neural Network Classification

A Hierarchical Anomaly Network Intrusion Detection System using Neural Network Classification IDC IDC A Herarchcal Anomaly Network Intruson Detecton System usng Neural Network Classfcaton ZHENG ZHANG, JUN LI, C. N. MANIKOPOULOS, JAY JORGENSON and JOSE UCLES ECE Department, New Jersey Inst. of Tech.,

More information

Learning from Multiple Outlooks

Learning from Multiple Outlooks Learnng from Multple Outlooks Maayan Harel Department of Electrcal Engneerng, Technon, Hafa, Israel She Mannor Department of Electrcal Engneerng, Technon, Hafa, Israel maayanga@tx.technon.ac.l she@ee.technon.ac.l

More information

Texas Instruments 30X IIS Calculator

Texas Instruments 30X IIS Calculator Texas Instruments 30X IIS Calculator Keystrokes for the TI-30X IIS are shown for a few topcs n whch keystrokes are unque. Start by readng the Quk Start secton. Then, before begnnng a specfc unt of the

More information

Intelligent stock trading system by turning point confirming and probabilistic reasoning

Intelligent stock trading system by turning point confirming and probabilistic reasoning Expert Systems wth Applcatons Expert Systems wth Applcatons 34 (2008) 620 627 www.elsever.com/locate/eswa Intellgent stock tradng system by turnng pont confrmng and probablstc reasonng Depe Bao *, Zehong

More information

Lecture 2: Single Layer Perceptrons Kevin Swingler

Lecture 2: Single Layer Perceptrons Kevin Swingler Lecture 2: Sngle Layer Perceptrons Kevn Sngler kms@cs.str.ac.uk Recap: McCulloch-Ptts Neuron Ths vastly smplfed model of real neurons s also knon as a Threshold Logc Unt: W 2 A Y 3 n W n. A set of synapses

More information

The Current Employment Statistics (CES) survey,

The Current Employment Statistics (CES) survey, Busness Brths and Deaths Impact of busness brths and deaths n the payroll survey The CES probablty-based sample redesgn accounts for most busness brth employment through the mputaton of busness deaths,

More information

Transition Matrix Models of Consumer Credit Ratings

Transition Matrix Models of Consumer Credit Ratings Transton Matrx Models of Consumer Credt Ratngs Abstract Although the corporate credt rsk lterature has many studes modellng the change n the credt rsk of corporate bonds over tme, there s far less analyss

More information

When do data mining results violate privacy? Individual Privacy: Protect the record

When do data mining results violate privacy? Individual Privacy: Protect the record When do data mnng results volate prvacy? Chrs Clfton March 17, 2004 Ths s jont work wth Jashun Jn and Murat Kantarcıoğlu Indvdual Prvacy: Protect the record Indvdual tem n database must not be dsclosed

More information

Title Language Model for Information Retrieval

Title Language Model for Information Retrieval Ttle Language Model for Informaton Retreval Rong Jn Language Technologes Insttute School of Computer Scence Carnege Mellon Unversty Alex G. Hauptmann Computer Scence Department School of Computer Scence

More information

Ad-Hoc Games and Packet Forwardng Networks

Ad-Hoc Games and Packet Forwardng Networks On Desgnng Incentve-Compatble Routng and Forwardng Protocols n Wreless Ad-Hoc Networks An Integrated Approach Usng Game Theoretcal and Cryptographc Technques Sheng Zhong L (Erran) L Yanbn Grace Lu Yang

More information

Web Spam Detection Using Machine Learning in Specific Domain Features

Web Spam Detection Using Machine Learning in Specific Domain Features Journal of Informaton Assurance and Securty 3 (2008) 220-229 Web Spam Detecton Usng Machne Learnng n Specfc Doman Features Hassan Najadat 1, Ismal Hmed 2 Department of Computer Informaton Systems Faculty

More information

PRACTICE 1: MUTUAL FUNDS EVALUATION USING MATLAB.

PRACTICE 1: MUTUAL FUNDS EVALUATION USING MATLAB. PRACTICE 1: MUTUAL FUNDS EVALUATION USING MATLAB. INDEX 1. Load data usng the Edtor wndow and m-fle 2. Learnng to save results from the Edtor wndow. 3. Computng the Sharpe Rato 4. Obtanng the Treynor Rato

More information

Forecasting and Stress Testing Credit Card Default using Dynamic Models

Forecasting and Stress Testing Credit Card Default using Dynamic Models Forecastng and Stress Testng Credt Card Default usng Dynamc Models Tony Bellott and Jonathan Crook Credt Research Centre Unversty of Ednburgh Busness School Verson 4.5 Abstract Typcally models of credt

More information

Design of Output Codes for Fast Covering Learning using Basic Decomposition Techniques

Design of Output Codes for Fast Covering Learning using Basic Decomposition Techniques Journal of Computer Scence (7): 565-57, 6 ISSN 59-66 6 Scence Publcatons Desgn of Output Codes for Fast Coverng Learnng usng Basc Decomposton Technques Aruna Twar and Narendra S. Chaudhar, Faculty of Computer

More information

Software project management with GAs

Software project management with GAs Informaton Scences 177 (27) 238 241 www.elsever.com/locate/ns Software project management wth GAs Enrque Alba *, J. Francsco Chcano Unversty of Málaga, Grupo GISUM, Departamento de Lenguajes y Cencas de

More information

Examensarbete. Rotating Workforce Scheduling. Caroline Granfeldt

Examensarbete. Rotating Workforce Scheduling. Caroline Granfeldt Examensarbete Rotatng Workforce Schedulng Carolne Granfeldt LTH - MAT - EX - - 2015 / 08 - - SE Rotatng Workforce Schedulng Optmerngslära, Lnköpngs Unverstet Carolne Granfeldt LTH - MAT - EX - - 2015

More information