Classification of Multivariate Data Using Distribution Mapping Exponent

Marcel Jiřina
Institute of Computer Science AS CR, Pod vodárenskou věží 2, 182 07 Praha 8 – Libeň, Czech Republic
marcel@cs.cas.cz

Abstract: We introduce the distribution-mapping exponent, which is something like an effective dimensionality of multidimensional space. A method for the classification of multivariate data is proposed. It is based on a local estimate of the distribution mapping exponent for each point x. Distances of all points of a given class of the training set from a given (unknown) point x are searched, and it is shown that the sum of reciprocals of the q-th power of these distances can be used as a probability density estimate. The classification quality was tested and compared with other methods using multivariate data from the UCI Machine Learning Repository. The method has no tuning parameters.

Keywords: multivariate data; classification; distribution-mapping exponent

1 Introduction

Classification of multivariate data is a problem solved by many methods, from the nearest neighbor method to decision trees, neural networks and genetic algorithms. The problem is generally difficult because of several influences, e.g.:

- High problem dimensionality, where the curse of dimensionality causes excessive growth of processing time.
- Presence of noise; true data are rarely pure.
- Multicollinearity, i.e. mutual dependence of individual variables. If variables are originally considered independent, i.e. orthogonal to all others, multicollinearity causes distortion of the space; the coordinates are no longer orthogonal.
- Boundary effect. Due to this effect the nearest points seem to be rather far and farther points near, so that the distance between the nearest and the farthest point of a finite data set can be smaller than the distance of the nearest neighbor from the given point.

In this paper we deal with distances in multidimensional space and try to simplify the complex picture of the probability distribution of points in this space by introducing mapping functions of one variable. This variable is the distance from the given point (the query point x [3]) in multidimensional space. It follows that the mapping functions are different for different query points, and this is the cost we pay for the simplification from n variables in n-dimensional space to one variable. We will show that this cost is not too high, at least in the application presented here.

The distance is the basic notion for all approaches dealing with neighbors, especially nearest neighbors. There are many methods of classification based on the nearest neighbors [2]. They estimate the probability density at point x (a query point [3]) of the data space by the ratio p/V of the number of points of a given class in a suitable ball of volume V with center at point x [5]. These methods need to optimize the size of the neighborhood, i.e. the number of points in the neighborhood of the point x or the size of the volume V. The probability density in the feature (data) space is given by the training data. The optimal neighborhood size depends on the training data set, i.e. on the character of the data as well as on the number of samples of a given class in the training set. Often it is recommended to choose a neighborhood size equal to the square root of the number of samples of the training set [5].

The method proposed is based on the distances of the training set samples x_s, s = 1, 2, ..., k, from point x. It is shown that the sum of reciprocals of the q-th power of these distances, where q is a suitable number, is convergent and can be used as a probability density estimate. From the high power of distances in multidimensional Euclidean space, fast convergence follows, i.e. small influence of distant samples. The speed of convergence is the better, the higher the dimensionality and the larger q. The method resembles the Parzen window approach [4], [5], but the problem with direct application of that approach is that the step size does not satisfy a necessary convergence condition. Using distances, i.e. a simple transformation from the n-dimensional Euclidean space E_n to the one-dimensional Euclidean space E_1, and no iterations, the curse of dimensionality is straightforwardly eliminated. The method can also be considered a variant of the kernel method, based on a probability density estimator but using a much simpler metric and not satisfying some mathematical conditions.

Throughout this paper let us assume that we deal with normalized data, i.e. the individual coordinates of the samples of the learning set are normalized to zero mean and unit variance, and the same normalization constants (empirical mean and empirical variance) are applied to all other (testing and of unknown class) data. This transformation does not mean any change in the form of the distribution, i.e. a uniform distribution remains uniform, an exponential distribution remains exponential (with λ = 1 and shifted by 1 to the left), etc.
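This normalization step might look as follows (a minimal sketch in Python with NumPy; the function name is ours, not the paper's, and the data are assumed to be rows of arrays):

import numpy as np

def normalize_by_train(train, other):
    # Empirical mean and standard deviation are computed from the
    # learning set only and then applied unchanged to all other
    # (testing and unknown-class) data, as described above.
    mu = train.mean(axis=0)
    sd = train.std(axis=0)
    return (train - mu) / sd, (other - mu) / sd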

2 Probability distribution mapping function

Let a query point x be placed, without loss of generality, at the origin. Let us build balls with their centers at point x and with volumes V_i, i = 1, 2, ... The individual balls are one inside another, the (i−1)-st inside the i-th, like the peels of an onion. Then the mean density of points in the i-th ball is ρ_i = m_i/V_i, where m_i is the number of points in the i-th ball. The volume of a ball of radius r in n-dimensional space is V(r) = const·r^n. Thus we have constructed a mapping between the mean density in the i-th ball and its radius r_i. Then ρ_i = f(r_i). Using the tight analogy between the density ρ(z) and the probability density p(z), one can write p(r_i) = f(r_i), where p(r_i) is the mean probability density in the i-th ball with radius r_i. This way a complex picture of the probability distribution of points in the neighborhood of a query point x is simplified to a function of one scalar variable. We call this function a probability distribution mapping function D(x, r), where x is the query point and r the distance from it. More exact definitions follow.

Definition 1. The probability distribution mapping function D(x, r) of the neighborhood of the query point x is the function

D(x, r) = ∫_{B(x,r)} p(z) dz,

where r is the distance from the query point and B(x, r) is the ball with center x and radius r.

Definition 2. The distribution density mapping function d(x, r) of the neighborhood of the query point x is the function

d(x, r) = ∂D(x, r)/∂r,

where D(x, r) is the probability distribution mapping function of the query point x and radius r.

Note. It is seen that for fixed x the function D(x, r), r > 0, grows monotonically from zero to one. The functions D(x, r) and d(x, r) for fixed x are one-dimensional analogs of the probability distribution function and the probability density function, respectively. For illustration see Fig. 1.
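In finite-sample terms, D(x, r) is simply the fraction of sample points within distance r of x. A minimal empirical sketch (our illustration, not code from the paper):

import numpy as np

def empirical_D(x, points, radii):
    # Empirical probability distribution mapping function of
    # Definition 1: the fraction of points inside the ball B(x, r).
    dist = np.linalg.norm(points - x, axis=1)
    return np.array([(dist <= r).mean() for r in radii])

# The distribution density mapping function of Definition 2 can be
# approximated numerically, e.g. np.gradient(empirical_D(x, pts, radii), radii).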

Fig. 1. Data in a multidimensional space and the corresponding probability distribution mapping function and distribution density mapping function (both plotted against distance).

2.1 Power approximation of the probability distribution mapping function

Let us approximate the probability distribution mapping function by a parabolic function of the form D(x, r^n) = const·(r^n)^2. This function is tangent to the vertical axis at the point (0, 0), and we let it go through some characteristic points of the distribution.

Definition 3. The power approximation of the probability distribution mapping function D(x, r^n) is a function r^q such that D(x, r)/r^q → const for r → 0+. The exponent q is the distribution-mapping exponent. The variable q/n we call the distribution mapping ratio.

Note. We often omit a multiplicative constant of the probability distribution mapping function. Using the approximation of the probability distribution mapping function by D(x, r^n) = const·(r^n)^2, the distribution mapping exponent is q = 2n.

Note that the distribution-mapping exponent is influenced by two factors:

- the true distribution of the points of the learning set in E_n,
- boundary effects, which have the larger influence, the larger the dimension n and the smaller the learning set size [2], [5].
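As a worked instance of Definition 3 (our example, not from the paper): for points distributed uniformly around x, without boundary effects, D(x, r) = c·r^n, and

D(x, r)/r^q = c·r^{n−q} → 0 for q < n, → c for q = n, → ∞ for q > n (as r → 0+),

so the distribution mapping exponent of an undistorted uniform sample is the space dimension itself; the two factors above are what push q away from n in practice.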

To overcome the problem of estimating q from real data, this exponent is estimated by linear regression for each query point, as shown in the next chapter.

3 Distribution mapping exponent estimation

Let the learning set U of total m_T samples be given in the form of a matrix X_T with m_T rows and n columns. Each sample corresponds to one row of X_T and, at the same time, to a point in the n-dimensional Euclidean space E_n, where n is the sample space dimension. The learning set consists of points (rows) of two classes c ∈ {0, 1}, i.e. each row (point or sample) corresponds to one class. Then the learning set is U = U_0 ∪ U_1, U_0 ∩ U_1 = ∅, U_c = {x_cs}, s = 1, 2, ..., N_c, c ∈ {0, 1}. N_c is the number of samples of class c, N_0 + N_1 = m_T, and x_cs = {x_cs1, x_cs2, ..., x_csn} is a data sample of class c. We use normalized data, i.e. each variable x_csj (j fixed, s = 1, 2, ..., m_T, c = 0 or 1, corresponding to the j-th column of matrix X_T) has zero mean and unit variance.

Let a point x ∈ U be given, and let the points x_cs of one class be sorted so that index i = 1 corresponds to the nearest neighbor, index i = 2 to the second nearest neighbor, etc. In the Euclidean metric, r_i = ||x − x_ci|| is the distance of the i-th nearest neighbor of class c from point x. From the definition of the distribution mapping exponent q it follows that r_i^q should be proportional to the index i, i.e.

r_i^q = k·i, i = 1, 2, ..., N_c, c = 0 or 1, (1)

where k is a suitable constant. Taking logarithms we get

q·ln(r_i) = k' + ln(i), i = 1, 2, ..., N_c. (2)

This system of N_c equations with respect to the unknown q can be solved using standard linear regression for both classes. Thus we get two values of q, q_0 and q_1. To get a single value of q we use the weighted arithmetic mean, q = (q_0·N_0 + q_1·N_1)/(N_0 + N_1). At this point we can say that q is something like an effective dimensionality of the data space, including the true distribution of the points of both classes and the boundary effect. In the next chapter we use it directly instead of the dimension n.
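A minimal sketch of this estimation for one class (our illustration; it assumes the query point x does not coincide with a training point, so all distances are positive):

import numpy as np

def dme_estimate(x, class_points):
    # Distances of all points of one class from the query point x,
    # sorted so that index i corresponds to the i-th nearest neighbor.
    r = np.sort(np.linalg.norm(class_points - x, axis=1))
    i = np.arange(1, r.size + 1)
    # Equation (2): q*ln(r_i) = k' + ln(i); regressing ln(i) on
    # ln(r_i) by least squares, the slope is the exponent q.
    q, _ = np.polyfit(np.log(r), np.log(i), 1)
    return q

The pooled value is then q = (q_0·N_0 + q_1·N_1)/(N_0 + N_1) with q_c = dme_estimate(x, X_c).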

4 All learning samples approach

Let us define

p_c(x) = C · Σ_{i=1}^{k} 1/r_i^q, (3)

where C is a constant. We show below that p_c(x) is a probability density estimate. Note that (3) resembles the Parzen window approach [4], [5, Chap. 4.3] with weighting function K(y) = y^{−q} for y > r_1 and K(y) = 0 otherwise, q ∈ (1, n⟩, and with window width h = 1. The problem with direct application of this approach here is that h does not satisfy the necessary condition lim h(N) = 0 [4, eq. (2.8)]. On the one hand, due to (1) the series in (3) should be a harmonic series, which is divergent. On the other hand, we will prove below that this is not true. The series converges with the size of r_i^q for q > 1, and thus we have no reason to limit ourselves to the nearest k points; we can use all points of the learning set, taking k = N_c, c = 0 or 1. At the same time, the ordering of the individual components is not essential, and we need not sort the samples of X_T with respect to their r_i as in the nearest neighbor approach. (But we do need to sort them when estimating the distribution mapping exponent.)

In the practical procedure, for each query point x we first compute the distribution mapping exponent q from (2) by standard linear regression. After that we simply sum up all the components 1/r_i^q and, at the same time, we store the largest component, which corresponds to the nearest neighbor of point x, i.e. the one with the smallest r_i. In the end we subtract it, thus excluding the nearest point. This is done for both classes simultaneously, giving numbers S_0 and S_1. Their ratio gives a value of the discriminant function, here the Bayes ratio. We can also get an estimate of the probability that the point x ∈ E_n is of class 1:

R(x) = S_1/S_0 or p_1(x) = S_1/(S_1 + S_0).

Then, for a chosen threshold (cut) θ, if R(x) > θ or p_1(x) > θ, then x belongs to class 1, else to class 0.

The method is very close to the nearest neighbor as well as to kernel methods. From the point of view of kernel methods, the kernel is, or would be, K(x) = ||x − x_i||^{−q} with the Euclidean norm in E_n. There is no smoothing (bandwidth) parameter. The problem is that this kernel is difficult to consider as a probability distribution function according to the definition of a kernel [1]. Taking ||x − x_i|| = r, we have K(r) = r^{−q}, and the integrals ∫_0^∞ K(r) dr or ∫_1^∞ K(r) dr are not convergent; they should be equal to 1 or at least finite.
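Sections 3 and 4 together give the whole decision rule. A compact sketch (our reconstruction for illustration; it is not the SFSloc7 source, and it assumes normalized data and a query point distinct from all training points):

import numpy as np

def classify(x, X0, X1, theta=0.5):
    # Sorted distances from x to the learning samples of each class.
    r0 = np.sort(np.linalg.norm(X0 - x, axis=1))
    r1 = np.sort(np.linalg.norm(X1 - x, axis=1))
    # Local distribution mapping exponent q, pooled over both classes.
    q0 = np.polyfit(np.log(r0), np.log(np.arange(1, r0.size + 1)), 1)[0]
    q1 = np.polyfit(np.log(r1), np.log(np.arange(1, r1.size + 1)), 1)[0]
    q = (q0 * r0.size + q1 * r1.size) / (r0.size + r1.size)
    # Sum of reciprocals of the q-th power of distances, formula (3),
    # with the largest component (the nearest neighbor) subtracted.
    S0 = np.sum(r0 ** -q) - r0[0] ** -q
    S1 = np.sum(r1 ** -q) - r1[0] ** -q
    p1 = S1 / (S0 + S1)  # estimated probability that x is of class 1
    return int(p1 > theta), p1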

5 Probability density estimation

Let us look at the problem of what part D_i of the space E_n falls on the i nearest neighbors of the given point x. We will assume the following:

Assumption 1. Let there be points in the Euclidean space E_n distributed uniformly in the sense that the distribution of each of the n coordinates is uniform. Let i be the order number of the i-th nearest neighbor of the point x. Let r_i be the distance of the i-th nearest neighbor of the given point x ∈ E_n from point x. Let ΔD be a constant, q ∈ (1, n⟩ a constant, and let D_i be the mean value of the variable r_i^q, and let it hold that D_i = i·ΔD.

Comment. Under Assumption 1, by the part D_i of the space E_n we do not mean the volume of a ball with center at the point x and radius r_i but, in fact (except for a multiplicative constant), a ball of the same center and radius in a space of dimension given by the constant q, i.e. in E_q. The basis for introducing Assumption 1 is the following finding. By simulation one can find that the relation V_i = i·ΔV, where ΔV is a constant, does not hold, but that for some q < n it holds that D_i = r_i^q = i·ΔD, where i is the order number of the i-th nearest neighbor of point x ∈ E_n and ΔD = D_1 is a constant. It follows that the q-th power of the neighbor distances grows linearly with i.

Theorem 1. Let Assumption 1 be valid. Let D_i be the mean of r_i^q and V_i the mean of c·r_i^n, where c is a constant. Moreover, let there exist a constant K such that p(i) = K/V_i is the probability density of points in the neighborhood of point x. Then it holds that p(i) = K_1/D_i = K_1/(i·ΔD), where K_1 is a constant.

Proof: p(i) is a probability density and, at the same time, due to Assumption 1, D_i is proportional to i. Then there is a constant K_1 such that p(D_i) = K_1/D_i. Under Assumption 1 there is D_i = i·ΔD, and then p(i) = p(D_i) = K_1/(i·ΔD).
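The simulation mentioned in the Comment is easy to reproduce. The following rough check (our illustration) averages sorted neighbor distances from the origin over many uniform samples and fits the exponent q for which r_i^q grows linearly with i; the fitted q comes out below the cube dimension n:

import numpy as np

rng = np.random.default_rng(0)
n, N, trials = 6, 500, 100
r_mean = np.zeros(N)
for _ in range(trials):
    pts = rng.uniform(-0.5, 0.5, size=(N, n))
    # Sorted distances from the query point at the origin; after
    # averaging, r_mean[i-1] estimates the mean distance of the
    # i-th nearest neighbor.
    r_mean += np.sort(np.linalg.norm(pts, axis=1))
r_mean /= trials
i = np.arange(1, N + 1)
# Slope of ln(i) versus ln(r_i): the exponent q of Assumption 1.
q = np.polyfit(np.log(r_mean), np.log(i), 1)[0]
print(f"n = {n}, fitted q = {q:.2f}")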

6 The proof of convergence

Theorem 1 states that the probability density is proportional to 1/r_i^q, and formula (3) uses the sum of these ratios, supposing that it gives a reasonable number for probability density estimation. So it is supposed that for the number of samples going to infinity the sum is convergent.

Theorem 2. Let there exist a mapping of the probability density of the points of class c in E_n, E_n → E_1: p(x_ci) = p(i), so that

K/r_c1^q = p(x_c1), K/(r_c2^q − r_c1^q) = p(x_c2), ..., K/(r_cNc^q − r_c(Nc−1)^q) = p(x_cNc), (4)

where K is a fixed constant that has the same value for both classes. Let there exist a constant ε > 0 and an index k > 1 so that for each i > k it holds that

p(x_ci) ≤ p(x_ck)/(1 + (i − k)ε). (5)

Then

S_c = Σ_{i=1}^{Nc} 1/r_ci^q ≤ K_1(1 + C_c), (6)

where K_1 and C_c are finite constants.

Proof: By (4), r_ci^q = K(1/p(x_c1) + 1/p(x_c2) + ... + 1/p(x_ci)), so we can arrange the sum in (6) in the form

S_c = Σ_{i=1}^{Nc} 1/r_ci^q = (1/K) Σ_{i=1}^{Nc} [1/p(x_c1) + 1/p(x_c2) + ... + 1/p(x_ci)]^{−1}. (7)

The first k elements of this sum are finitely many and give a finite contribution. Using condition (5), the summed elements P_k, P_{k+1}, ... from the k-th one on have, up to the factor p(x_ck)/K, the form

P_i = 1/[C + (1 + ε) + (1 + 2ε) + ... + (1 + (i − k)ε)],

where C collects the first k summands.

Then, according to d'Alembert's criterion,

P_{i+1}/P_i = [C + (1 + ε) + ... + (1 + (i − k)ε)] / [C + (1 + ε) + ... + (1 + (i − k)ε) + (1 + (i + 1 − k)ε)] < 1

for C > 0 and ε > 0. Then the series is convergent.

Notes: a) In the statement of the theorem the sum need not start just at index i = 1. We can start with the nearest neighbor (i = 1) or with farther neighbors (i > 1). The value i = 1 is given by a compromise between the error caused by the small value and large variability of 1/r_c1^q, and the inaccuracy caused by the larger distance from point x for i > 1; see Chap. 3.

b) The last condition (5) defines the speed of diminishing of the tail of the distribution; probably the condition that the distribution should have a mean would suffice.

7 Discussion

From formula (7) it is seen that, for a smooth form of the distribution function around point x and for a large density of points of both classes, the ratios of successive densities are very close to 1 for rather large values of i (e.g. 100, but let us take 12 here). For both classes the first elements of the sum in (7) are then close to 1, 1/2, 1/3, ..., with sum 3.0987 here, and the remaining elements have the form 1/(1 + (i − k)(1 + δ)), where starting from the index k it is δ ≥ ε. (The index k can be different for the two classes.) It is then probable that the values of the sums in (7) will be very close for both classes, and the ratio of (7) for one and the other class will be close to the Bayes ratio p_1(x)/p_0(x) = S_1/S_0. In such a case we can also estimate the probability that the sample x belongs among the signals: p_1(x) ≈ S_1/(S_1 + S_0).

Using neighbor distances for the probability density estimation, the estimate should copy the features of the probability density function based on real data. The idea behind most nearest-neighbor-based methods as well as kernel methods [1] does not reflect the boundary effects. That means that for any point x, the statistical distribution of the data points surrounding it is supposed to be independent of the location of the neighbor points and of their distances from point x. This assumption is often not met, especially for small data sets and for higher dimensions.

To illustrate this, let us consider uniformly distributed points in the cube (−0.5, +0.5)^n. Let there be a ball with its center at the origin and radius equal to 0.5. This ball occupies 4/3·π·0.5³ = 0.524, i.e. more than 52 % of that cube, in three-dimensional space, 0.080746, i.e. 8 % of the unit cube, in 6-dimensional space, about 0.0025 in 10-dimensional space, and 3.28e−21 in 40-dimensional space. It is then seen that, starting with some dimension n, say 5 or 6, and some index i, the i-th nearest neighbor does not lie in such a ball around point x but somewhere in the corner of the cube, outside this ball (the boundary effect [6]). It follows that this i-th neighbor lies farther from point x than would follow from the uniformity of the distribution. In places farther from the origin the space thus seems to be less dense than near the origin. The function f(i) = r_i^n, where r_i is the mean distance of the i-th neighbor from point x, should grow linearly with the index i in the case of a uniform distribution without the boundary effect mentioned. Otherwise this function grows faster than linearly.

The samples of the learning set are normalized to zero mean and unit variance for each variable, as introduced in the beginning. Assume that all the marginal distributions arising in this way are approximately normal. Our point x has an unknown class, and also unknown probabilities p_1(x) and p_0(x), and lies just at the point (0, 0, ..., 0). For point x we can introduce different neighborhoods; for our estimates let us use three only:

A. Up to the distance of one sigma in all dimensions,
B. From the distance of one sigma to the distance of two sigmas in all dimensions,
C. From the distance of two sigmas to infinity in each dimension.

Under the assumption of normality of all variables, in each dimension approximately 68 % of the points of the learning set lie inside A, 95 % of the points lie inside A and B, i.e. 27 % in B, and 5 % in C. To these three layers also some mean distances (0.5, 1.5, and 3) correspond in all dimensions. The total portion (percentage) of points in layer A is given by 0.68^n, in A and B together by 0.95^n, and in C by what remains to 1. For the computation of the sum in (3) all points from layers A, B, and C are used; each point in these three layers contributes to the total sum (3) by its part. The contributions of the points in the individual layers are given by the relative number of points in a layer (total points in the layer) divided by the average distance (which is the average distance in one dimension times the square root of the dimension) raised to the (n−1)-st power, recomputed to 100 %. E.g., for layer A and n = 2 there is B_A = 0.4624/(0.5·√2) = 0.65393, B_B = 0.4401/(1.5·√2) = 0.20747,

and B_C = 0.0975/(3·√2) = 0.02298. These numbers divided by the sum B_A + B_B + B_C give 73.94 %, 23.46 %, and 2.60 %, respectively. For n = 5 the distribution of points in layers A, B, C is 14.54 %, 62.84 %, and 22.62 %, respectively, and the corresponding contributions are, in the same ordering, 94.83 %, 5.06 %, and 0.11 %. Similarly, for n = 20 we get the numbers 0.044687 %, 35.80 %, and 64.15 % for the distribution of points, and 99.99993 %, 0.000069 %, and 2.35588E−12 for the corresponding contributions of the points in layers A, B, and C.

These estimates show that, due to the geometry of multidimensional Euclidean space, the share of points corresponding to A with respect to the total number of points lessens essentially with dimension. At the same time, their contribution to the total sum is close to 100 %. This is because the parts A, B, C are, in fact, not cubes but n-dimensional balls of radii computed from an average distance in one dimension. It also follows that the share of layer C in the total sum is negligible for dimension 5 or 6 and more. With the dimension growing, the convergence of the sum is also much faster, as the points of the learning set near point x give practically the whole value of the sum. The larger the dimension, the smaller the percentage of points of the learning set that influence the result. On the other hand, for low dimensionality, especially 1, 2 and 3, even the farthest points have a strong influence.
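The layer arithmetic above is easy to reproduce (our illustration):

import numpy as np

for n in (2, 5, 20):
    # Portions of points in layers A (inside one sigma),
    # B (one to two sigmas) and C (beyond two sigmas).
    pA, pAB = 0.68 ** n, 0.95 ** n
    portions = np.array([pA, pAB - pA, 1.0 - pAB])
    # Mean per-dimension distances 0.5, 1.5 and 3, times sqrt(n),
    # raised to the (n-1)-st power, as described above.
    weights = portions / (np.array([0.5, 1.5, 3.0]) * np.sqrt(n)) ** (n - 1)
    shares = 100 * weights / weights.sum()
    print(n, np.round(100 * portions, 4), np.round(shares, 5))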

German                            Heart
Algorithm   Error   Note          Algorithm   Error   Note
SFSloc7     0.50    1; 2          SFSloc7     0.357   3
Discrim     0.535                 Bayes       0.374
LogDisc     0.538                 Discrim     0.393
Castle      0.583                 LogDisc     0.396
Alloc80     0.584                 Alloc80     0.407
Dipol92     0.599                 QuaDisc     0.422
Smart       0.601                 Castle      0.441
Cal5        0.603                 Cal5        0.444
Cart        0.613                 Cart        0.452
QuaDisc     0.619                 Cascade     0.467
KNN         0.694                 KNN         0.478
Default     0.700                 Smart       0.478
Bayes       0.703                 Dipol92     0.507
IndCart     0.761                 Itrule      0.515
BackProp    0.772                 BayTree     0.526
BayTree     0.778                 Default     0.560
CN2         0.856                 BackProp    0.574
AC2         0.878                 LVQ         0.600
Itrule      0.879                 IndCart     0.630
NewId       0.925                 Kohonen     0.693
LVQ         0.963                 AC2         0.744
Radial      0.971                 CN2         0.767
C4.5        0.985                 Radial      0.781
Kohonen     1.160                 C4.5        0.781
Cascade     00.0                  NewId       0.844

Adult                                Ionosphere
Algorithm          Error    Note     Algorithm                Error    Note
FSS Naive Bayes    0.1405            IB3                      0.0330   6; 7
NBTree             0.1410            backprop                 0.0400   8
C4.5-auto          0.1446            SFSloc7                  0.0596   9
IDTM (Dec. table)  0.1446            Ross Quinlan's C4        0.0600   10
HOODG              0.1482            nearest neighbor         0.0790
C4.5 rules         0.1494            "non-linear" perceptron  0.0800
OC1                0.1504            "linear" perceptron      0.0930
C4.5               0.1554
Voted ID3 (0.6)    0.1564
CN2                0.1600
Naive-Bayes        0.1612
Voted ID3 (0.8)    0.1647
T2                 0.1684
SFSloc7            0.1786   4
1R                 0.1954
Nearest-neighbor (3)  0.2035
Nearest-neighbor (1)  0.2142
Pebls              Crashed  5

Table 1. Comparison of the classification error of SFSloc7 on different tasks with the results for other classifiers as given by [7]. The notes are given below.

Notes to Table 1:
1 for threshold 0.43
2 numeric data
3 for threshold 0.4
4 for threshold 0.86848
5 unknown why (bounds WERE increased)
6 parameter settings: 70 % and 80 % for acceptance and dropping, respectively
7 (Aha & Kibler, IJCAI-1989)
8 an average of over ...
9 for threshold 0.55054
10 no windowing

8 Results – testing the classification ability

The classification algorithm was written in C++ as the SFSloc7 program and tested using tasks from the UCI Machine Learning Repository [7]. Tasks of classification into two classes for which data about previous tests are known were selected: Adult, German, Heart, and Ionosphere. The task Adult is to determine whether a person makes over 50000 $ a year. The task German is about whether a client is a good or a bad risk when lending him money. The task Heart indicates the absence or presence of heart disease in a patient. For the task Ionosphere the targets were free electrons in the ionosphere: "good" radar returns are those showing evidence of some type of structure in the ionosphere; "bad" returns are those that do not, their signals passing through the ionosphere. We do not describe these tasks in detail here, as all can be found in [7]. For each task the same approach to testing and evaluation was used as described in [7]. In Table 1 the results are shown together with the results for other methods as given in [7]. For each task the methods are sorted according to classification error, the method with the lowest error first. It is seen that for some tasks SFSloc7 is good, but there are tasks where the method is worse than average; it would be strange to outperform all methods on all tasks. The method is totally parameterless. There is no parameter to tune to get a better result; the method simply works satisfactorily or not, and there is nothing more to try.

Conclusions

In this paper we dealt with a simplified representation of the probability distribution of points in multidimensional Euclidean space, including boundary effects.

A new method for classification was developed. The method is based on the notion of the distribution mapping exponent and its local estimate for each query point x. The theorem on convergence was formulated and proved, and a convergence estimate was shown. It was found that the higher the dimensionality, the better. The method has no tuning parameters: no neighborhood size, no convergence coefficients etc. need to be set up in advance to assure convergence. There is no true learning phase. In the learning phase only the normalization constants are computed, and thus this phase is several orders of magnitude faster than the learning phase of neural networks or many other methods [2], [7]. In the recall phase, for each sample to be classified, the learning set is searched twice: once for finding the local value of the distribution mapping exponent, and a second time for computing the elements of the sum (3) over all samples of the learning set. The amount of computation is thus proportional to the learning set size, i.e. the dimensionality times the number of learning samples.

Acknowledgement

This work was supported by the Ministry of Education of the Czech Republic under project No. LN00B096.

References

[1] Silverman, B. W.: Density Estimation for Statistics and Data Analysis. Chapman and Hall, London, 1986.
[2] Bock, R. K. et al.: Methods for multidimensional event classification: a case study. To be published as an Internal Note in CERN, 2003.
[3] Hinneburg, A., Aggarwal, C. C., Keim, D. A.: What is the nearest neighbor in high dimensional spaces? Proc. of the 26th VLDB Conf., Cairo, Egypt, 2000, pp. 506-515.
[4] Parzen, E.: On Estimation of a Probability Density Function and Mode. The Annals of Mathematical Statistics, Vol. 33, No. 3 (Sept. 1962), pp. 1065-1076.
[5] Duda, R., Hart, P., Stork, D. G.: Pattern Classification. John Wiley and Sons, 2000.
[6] Arya, S., Mount, D. M., Narayan, O.: Accounting for Boundary Effects in Nearest Neighbor Searching. Discrete and Computational Geometry, Vol. 16 (1996), pp. 155-176.
[7] UCI Machine Learning Repository. http://www.ics.uci.edu/~mlearn/MLSummary.html