Shape-based Similarity Query for Trajectory of Mobile Objects


Yutaka Yanagisawa, Jun-ichi Akahani, Tetsuji Satoh
NTT Communication Science Laboratories, NTT Corporation
{yutaka,akahani,satoh}@cslab.kecl.ntt.co.jp
http://www.kecl.ntt.co.jp/scl/sirg/

Abstract. In this paper, we describe an efficient indexing method for a shape-based similarity search of the trajectories of dynamically changing locations of people and mobile objects. In order to manage trajectories in database systems, we define a data model of trajectories as directed lines in a space, and the similarity between trajectories is defined as the Euclidean distance between directed discrete lines. Our proposed similarity query can be used to find patterns of interest embedded in the trajectories; for example, the trajectories of cars in a city may include patterns that anticipate traffic jams. Furthermore, we propose an efficient indexing method to retrieve trajectories similar to a query by combining a spatial indexing technique (R+-Tree) and a dimension reduction technique called PAA (Piecewise Aggregate Approximation). The indexing method can efficiently retrieve from the database the trajectories whose shape in a space is similar to the shape of a candidate trajectory.

1 Introduction

Recently, many location sensors such as GPS have been developed, and we can obtain the trajectories of users and moving objects using these sensors [12]. Trajectory data are widely used in location-aware systems [1], car navigation systems, and other location-based information systems that can provide services according to a user's current location. These applications store many trajectories, and these trajectories may include interesting individual patterns of each user. For example, by analyzing trajectories of users who work in a building, we can find important passages, rooms, stairs, and other facilities that are used frequently. The result of the analysis can be used for the management and maintenance of the building. In the case of a navigation system, a driver can check the route to a city by referring to the trajectories of other users who have driven to the city before. In another case, we can study characteristics that improve performance in a sport by analyzing the motion data measured by sensors attached to the bodies of top sports players.

There have been many studies on managing mobile object data (MOD) [2] [8] [11] [14] [16]. One of the most interesting of these is the development of efficient methods to retrieve objects specified by either a spatio-temporal range query or a spatio-temporal nearest neighbor query. Both queries are defined using the distance between the trajectory of a mobile object and an indicated point in a space. For example, the range query is generally defined as the query for retrieving all objects that passed within a given distance of an indicated point, such as "retrieve all of the people who walked within one mile of the buildings at the time". The range query can also be defined as the query to retrieve all objects that passed within an indicated polygon. In both cases, the query is defined using the distance between figures in a space. These distance-based queries are useful in location management of mobile objects [15]; however, they do not have enough power to analyze the patterns of the objects' motion.

As mentioned above, because we are interested in extracting the individual moving patterns of each object from the trajectories, it is necessary to develop more powerful tools to analyze the trajectories. Hence, we propose shape-based queries of trajectories in space for the analysis, for instance, "retrieve all objects whose trajectory has a shape similar to the trajectory along which a user walked in a shop". Using this query, we may classify the customers in the shop based on the shape patterns of their trajectories. In other words, our approach is based on the shape similarity between lines, while the existing approaches adopt the distance between points as the key to retrieve required objects.

It is difficult to define the similarity between lines in a space. However, we found a useful idea in research on time series databases [6] [7] [10]. Time series database systems can store time series data such as temperature, economic indicators, population, wave signals, and so on, and also support queries for extracting patterns from the time series data.
Most time series database systems adopt the Euclidean distance between two time data sequences [7] for analysis; if two sequences $c$, $c'$ are given as $w_1, w_2, \ldots, w_n$ and $w'_1, w'_2, \ldots, w'_n$, the similarity can be defined as $D(c, c') = \sqrt{(w_1 - w'_1)^2 + \cdots + (w_n - w'_n)^2}$. (In Section 3, we describe the similarity in detail.) Because a trajectory is a type of time series data, a time series database is able to deal with trajectories. However, a trajectory has not only the features of time series data but also the features of a directed line in space. For example, it is difficult for a time series database to find data for a geographic and spatial query. Therefore, in this paper, we present a data model for trajectories of mobile data, and a query based on the distance between two trajectories, obtained by extending the similarity used in time series database systems. Moreover, we propose a new indexing method for retrieving required trajectories by queries based on our defined distance between trajectories.

In Section 2, we describe our proposed data model for the trajectory. Section 3 describes the distance between two discrete directed lines for calculating similarities between two trajectories. In Section 4, we present both the processing method for our proposed query and an indexing technique that is an extension of Piecewise Aggregate Approximation (PAA) [7]. Finally, the evaluation of our approach is shown in Section 5.

[Fig. 1. Trajectory of Mobile Objects: (a) trajectory in the real world; (b) trajectory stored in a database]

2 Trajectory of Mobile Objects

In order to effectively manage mobile objects, it is necessary to manage the location of each object at each time. Generally, a location management system can retrieve objects located in the indicated area at the indicated time [2]. However, we are interested in the similarity of the shapes of trajectories in a space. In order to define the similarity between trajectories, it is first necessary to define the trajectory as a figure drawn in space. Hence, we define the data model for the trajectory of mobile objects.¹ A real-world trajectory is a directed continuous line with a start and end point (Figure 1(a)). Given a two-dimensional space $R^2$ and a closed time interval $I_\lambda = [t_S, t_E]$ with $t_S < t_E$, a trajectory $\lambda$ is defined as follows.

Definition 1: Trajectory
A trajectory is the image of a continuous mapping $\lambda : I_\lambda \to R^2$.

This definition is a temporal extension of the definition of a simple line described in [3]. Next, we denote the length of trajectories in $R^2$ as $L_S$ and the interval of trajectories in temporal space as $L_T$.

Definition 2: Length of Trajectory in Space $R^2$
The length of trajectory $\lambda$ during a period $[t_0, t_1]$ is denoted as $L_S(\lambda, [t_0, t_1])$ and calculated as follows:

$$L_S(\lambda, [t_0, t_1]) = \int_{t_0}^{t_1} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\, dt, \quad \text{where } \lambda(t) = (x, y)$$

¹ To simplify the problem, we focus only on the trajectory of mobile objects; in other words, we do not discuss the data model of the other attributes of the objects, such as shape, name, and so on.

The length of the whole trajectory is denoted as $L_S(\lambda)$ $(= L_S(\lambda, [t_S, t_E]))$.

Definition 3: Temporal Interval of Trajectory
Let $\mathbf{x} = (x, y)$ be a vector in space $R^2$. The temporal interval of trajectory $\lambda$ between $\mathbf{x}_i$ and $\mathbf{x}_j$ on $\lambda$ is defined as follows:

$$L_T(\lambda, [\mathbf{x}_i, \mathbf{x}_j]) = |t_j - t_i|, \quad \text{where } \lambda(t_i) = \mathbf{x}_i,\ \lambda(t_j) = \mathbf{x}_j, \text{ and } t_i, t_j \in I_\lambda = [t_S, t_E]$$

$$L_T(\lambda) = t_E - t_S$$

However, a location sensor device such as GPS does not continuously measure the coordinates of a mobile object, but samples such data. The measured data are thus a sequence of coordinates of positions, as shown in Figure 1(b). Hence, we define a discrete trajectory $\bar{\lambda}$ as a discrete function. Each vector $\mathbf{x}_i$ represents the position of a mobile object in the space at one of the times in $T_\lambda = \{t_0, t_1, \ldots, t_m\}$.

Definition 4: Discrete Trajectory
A discrete trajectory is the image of a discrete mapping $\bar{\lambda} : T_\lambda \to R^2$.

A discrete trajectory can also be represented as a vector sequence $\mathbf{x}_1, \ldots, \mathbf{x}_m$. If $T_\lambda = \{1, 2, \ldots, m\}$, we denote the discrete trajectory $\bar{\lambda}$ as just a simple vector sequence $\mathbf{x}_1, \ldots, \mathbf{x}_m$. Additionally, where $\bar{\lambda}(t_i) = \mathbf{x}_i$, we introduce several notations: $T_\lambda(i) = t_i$, $\bar{\lambda}(i) = \mathbf{x}_i$, and $|\bar{\lambda}|$ is the number of vectors included in $\bar{\lambda}$ ($|\bar{\lambda}| = |T_\lambda|$). Next, we define the distance between two vectors $\mathbf{x}$, $\mathbf{x}'$ in $R^2$.

Definition 5: Distance of Vectors

$$D(\mathbf{x}, \mathbf{x}') = \sqrt{(x - x')^2 + (y - y')^2}$$

That is, this definition assumes that space $R^2$ is Euclidean. Although $\bar{\lambda}$ is a discrete line, it is necessary to deal with $\bar{\lambda}$ as a continuous line in a query. In order to satisfy this requirement, we define a function to convert the discrete line into a continuous line. There are various methods to calculate an approximate continuous line from a discrete line [4]. In our approach, we adopted the piecewise linear approximation because of its simplicity and popularity [15].

Definition 6: Piecewise Linear Approximation of $\bar{\lambda}$
$\tilde{\lambda} : [t_0, t_m] \to R^2$ is given as

$$\tilde{\lambda}(t) = \begin{cases} \bar{\lambda}(t) & \text{if } t \in T_\lambda \\ \dfrac{t_{i+1} - t}{t_{i+1} - t_i}\,\bar{\lambda}(t_i) + \dfrac{t - t_i}{t_{i+1} - t_i}\,\bar{\lambda}(t_{i+1}) & \text{if } t \notin T_\lambda \end{cases}$$

where $t_i$ must be selected under the condition $t_i < t < t_{i+1}$.

In the rest of this paper, we mainly discuss the features of both $\bar{\lambda}$ and $\tilde{\lambda}$.
Note that in this paper we only discuss trajectories in $R^2$, but our proposed model and techniques can be adapted to the higher-dimensional space $R^n$.
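
Definitions 2 and 6 can be illustrated with a short Python sketch. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `interpolate` and `spatial_length` are hypothetical, a discrete trajectory is assumed to be stored as a sorted list of timestamps plus a parallel list of (x, y) tuples, and the length is taken along the piecewise linear approximation, where the integral of Definition 2 reduces to a sum of segment lengths.

```python
from bisect import bisect_right
from math import hypot


def interpolate(times, points, t):
    """Evaluate the piecewise linear approximation (Definition 6) at time t.

    times  -- sorted sample timestamps t_0 < t_1 < ... < t_m
    points -- (x, y) positions measured at those timestamps
    """
    if not times[0] <= t <= times[-1]:
        raise ValueError("t lies outside the sampled interval")
    # Index of the last sample whose timestamp is <= t.
    i = bisect_right(times, t) - 1
    if times[i] == t:
        # t is a sampled instant: return the measured position itself.
        return points[i]
    t0, t1 = times[i], times[i + 1]
    (x0, y0), (x1, y1) = points[i], points[i + 1]
    w = (t - t0) / (t1 - t0)  # fraction of the way from sample i to sample i+1
    return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))


def spatial_length(points):
    """Length L_S of the polyline through the samples (sum of segment lengths)."""
    return sum(hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Example: three samples taken at irregular times.
times = [0, 2, 5]
points = [(0, 0), (2, 0), (2, 3)]
midpoint = interpolate(times, points, 3.5)   # halfway along the second segment
total = spatial_length(points)
```

The binary search keeps each lookup logarithmic in the number of samples, which matters when the same trajectory is re-sampled many times.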

[Fig. 2. Distance between trajectories (panels (a) and (b))]

[Fig. 3. Existing kNN approaches: (a) previous spatial kNN; (b) previous temporal kNN]

3 Similarity Query based on Shapes of Lines

3.1 Shape-based Approach

The similarity query is useful in its own right as a tool for exploratory data analysis [7], and it is a significant element in many data mining applications. For instance, we may find the optimum arrangement of items in a market by analyzing the trajectories of customers walking around in a shop. In addition to its usefulness in trajectory databases, the similarity query is one of the most interesting topics in time series databases. In time series databases, the similarity between two sets of time series data is typically measured by the Euclidean distance [6] [7], which can be calculated efficiently. However, there have been few discussions on the similarity between two lines in space because the previous approaches for spatial queries have focused on the distance between a point and a line [2] [9] [15]. The interest of the previous approaches was mainly to find objects that pass a point near the indicated point, such as a car passing through a street. On the other hand, we are interested in the shape of the trajectory. In order to calculate shape-based similarities among trajectories, it is necessary to define a new similarity for the trajectories, as shown in Figure 2(b).

In general, the similarity query is represented as a k Nearest Neighbor (kNN) query [2] [5] [9]. There are two types of existing approaches: one is based on spatial similarity, the other on the similarity between two time series. An example of the existing spatial kNN is illustrated in Figure 3(a). In this case, the answer is $L_1$, $L_2$ when $k$ is 2. On the other hand, the similarity between two time series, each of length $n$, is defined as the Euclidean distance between two $n$-dimensional vectors [7], as shown in Figure 3(b). While this distance between time series is based on shape, it is defined only in the case of $R^1 \times T$ ($T = [0, \infty]$), but not in the case of $R^n \times T$ shown in Figure 2(b).

Since the trajectory has both spatial and temporal features, we consider three types of similarity queries for trajectories as follows:

- Spatio-Temporal Similarity: based on a spatio-temporal feature in $R^2 \times T$.
- Spatial Similarity: based on a spatial-only feature in $R^2$ without temporal features.
- Temporal Similarity: based on a temporal-only feature in $R^1 \times T$ without spatial features.

In the rest of this section, we define the similarity in the first two cases. We do not define the temporal similarity because it is the same as the similarity defined for time series databases.

3.2 Shape-based similarity query

As mentioned above, the trajectory has the features of time series data. We define the similarity between two trajectories in the same manner as the similarity defined for time series queries [7]. For a time series database, the similarity of two time series, each of which has $n$ values, is given by the Euclidean distance between vectors in $R^n$. In [6] and [7], given two time series $c = w_1, w_2, \ldots, w_n$ and $c' = w'_1, w'_2, \ldots, w'_n$, the distance $D(c, c')$ is defined as follows:

$$D(c, c') = \sqrt{(w_1 - w'_1)^2 + \cdots + (w_n - w'_n)^2}$$

This definition can be extended to the case where each element is a vector $\mathbf{x}$ in space $R^2$. Given two sequences of vectors $\mathbf{c} = \mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ and $\mathbf{c}' = \mathbf{x}'_1, \mathbf{x}'_2, \ldots, \mathbf{x}'_n$, we define the distance $D(\mathbf{c}, \mathbf{c}')$ between them by extending the definition of $D(c, c')$, as follows:

$$D(\mathbf{c}, \mathbf{c}') = \sqrt{D(\mathbf{x}_1, \mathbf{x}'_1)^2 + \cdots + D(\mathbf{x}_n, \mathbf{x}'_n)^2}$$

Based on this definition, we consider the shape-based similarity query for trajectories. Here, $\Lambda$ is the set of discrete trajectories stored in the database, and each $\bar{\lambda}_i$ ($\bar{\lambda}_i \in \Lambda$) is a discrete trajectory, such as $\bar{\lambda}_i = \mathbf{x}_1, \ldots, \mathbf{x}_m$. The query trajectory $\bar{\lambda}_q$ is given as $\bar{\lambda}_q = \mathbf{x}_1, \ldots, \mathbf{x}_n$. The shape-based range query can then be defined using $\Lambda$, $\bar{\lambda}_q$, and the previously defined distance between two sequences of vectors, as follows:

    Input:  Λ, λ_q, and θ (θ is a natural number).
    Output: Λ_a = { λ_a1, ..., λ_ak }.

    function Q_range(θ: integer, λ_q, Λ): Λ_a
    begin
        var j: integer, l := |λ_q|, Λ_a := ∅;
        for each λ_i in Λ do
            for j := 1 to |λ_i| - l + 1 do
            begin
                λ_ij := subsequence(λ_i, j, l);
                { subsequence returns a subsequence of the original
                  sequence λ_i, such as x_j, x_j+1, ..., x_j+l-1,
                  each x ∈ λ_i }
                if D(λ_q, λ_ij) < θ then add λ_ij to Λ_a;
            end;
        return Λ_a;
    end.

Fig. 4. The process of the shape-based range query of trajectories

Definition 7: Shape-based Range Query
The process for calculating the shape-based range query $Q_{range}(\theta, \bar{\lambda}_q, \Lambda)$ is given in Figure 4.

The range query is defined as a subsequence match of trajectories, as shown in Figure 5. In addition, the nearest neighbor query can be defined using our distance between trajectories. In our definition, temporal features are not specified in the query; however, we consider that temporal conditions can be specified independently of the range query. For example, the query $Q_{range}(\theta, \bar{\lambda}_q, \Lambda)$ combined with the condition $11{:}00 < T_{\lambda_{ai}}(1) < 12{:}00$ retrieves the subsequences $\bar{\lambda}_{ai}$ for which the distance between $\bar{\lambda}_q$ and $\bar{\lambda}_{ai}$ is less than $\theta$ and whose first vector was measured within the interval [11:00, 12:00].

[Fig. 5. Similarity query for trajectories (panels (a) and (b): a query trajectory, a stored trajectory, and the subsequence match between them)]

3.3 Spatio-temporal distance between two trajectories

Our defined distance $D(\mathbf{c}, \mathbf{c}')$ can be used only in the case where the vectors are measured at the same interval, that is, $\Delta t = t_{i+1} - t_i$ $(i = 1, \ldots, n-1)$, where $t_i$ is the time at which $\mathbf{x}_i$ is measured. However, the vectors in a trajectory are not always measured at the same interval because sensor devices often lose data. For example, the discrete trajectory illustrated in Figure 6(a) has no measured vectors at $t = 5, 7, 9$. Therefore, to calculate the similarity using our definition, we define a temporal normalized discrete trajectory $\bar{\lambda}^{\Delta t}$ for trajectory $\lambda$, as follows:

Definition 8: Temporal Normalized Discrete Trajectory
Given a trajectory $\lambda$ defined for time interval $[t_S, t_E]$ and a natural number $m$, the temporal normalized discrete trajectory $\bar{\lambda}^{\Delta t}$ is defined as follows:

$$\bar{\lambda}^{\Delta t} = \lambda(t_S), \lambda(t_S + \Delta t), \ldots, \lambda(t_S + m\Delta t), \quad \text{where } t_S + m\Delta t = t_E$$

Intuitively, this discrete trajectory $\bar{\lambda}^{\Delta t}$ is the trajectory re-sampled from $\lambda$ at the fixed interval $\Delta t$, as shown in Figure 6(c). In other words, $\bar{\lambda}^{\Delta t}$ is generated by dividing $\lambda$ into equal intervals. For a discrete trajectory $\bar{\lambda}$, we can use the piecewise linear approximation $\tilde{\lambda}$ instead of $\lambda$. In the case of Figure 6, the temporal normalized discrete trajectory (Figure 6(c)) is generated from the approximate trajectory (Figure 6(b)).

Definition 9: Spatio-Temporal Similarity between two Trajectories
Given two trajectories $\lambda$ and $\lambda'$ with the same temporal length (i.e., $L_T(\lambda) = L_T(\lambda')$) and a natural number $m$, the spatio-temporal distance (similarity) $D_{TS}(\lambda, \lambda')$ between $\lambda$ and $\lambda'$ is defined as follows:

$$D_{TS}(\lambda, \lambda') = \frac{1}{m+1}\sqrt{\sum_{i=0}^{m} D\left(\bar{\lambda}^{\Delta t}(i), \bar{\lambda}'^{\Delta t}(i)\right)^2}, \quad \text{where } \Delta t = \frac{L_T(\lambda)}{m} = \frac{L_T(\lambda')}{m}$$

Note that $D_{TS}(\bar{\lambda}, \bar{\lambda}')$ can be defined as $D_{TS}(\tilde{\lambda}, \tilde{\lambda}')$. In this definition, the similarity is the Euclidean distance between trajectories represented as $(m+1)$-dimensional vectors, and the interval of each trajectory is normalized. Using this definition, it is possible to find trajectories whose shape is more similar to the query trajectory than can be found using previous methods.

3.4 Spatial distance between two trajectories

In Definition 8, we focused on the shapes of the trajectories in space $R^2 \times T$. However, there are cases where the shapes in $R^2$ (without temporal features) are just as important, for example when finding trajectories similar to those of a specified user while focusing only on the spatial shape. Hence, we also define the spatial similarity between two trajectories.
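
The range query of Figure 4 and the temporal normalization of Definitions 8 and 9 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: all function names are hypothetical, trajectories are assumed to be lists of (x, y) tuples with a parallel sorted list of timestamps, offsets are 0-based (the pseudocode counts from 1), and the factor 1/(m+1) is applied outside the square root, which is one reading of the definition.

```python
from bisect import bisect_right
from math import sqrt


def interpolate(times, points, t):
    """Piecewise linear approximation of a discrete trajectory at time t."""
    i = min(max(bisect_right(times, t) - 1, 0), len(times) - 2)
    t0, t1 = times[i], times[i + 1]
    w = (t - t0) / (t1 - t0)
    (x0, y0), (x1, y1) = points[i], points[i + 1]
    return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))


def resample_by_time(times, points, m):
    """Temporal normalization: m + 1 positions at a fixed interval dt."""
    dt = (times[-1] - times[0]) / m
    return [interpolate(times, points, times[0] + i * dt) for i in range(m + 1)]


def seq_distance(a, b):
    """Euclidean distance between two equally long sequences of 2-D vectors."""
    return sqrt(sum((ax - bx) ** 2 + (ay - by) ** 2
                    for (ax, ay), (bx, by) in zip(a, b)))


def d_ts(times_a, points_a, times_b, points_b, m):
    """Spatio-temporal distance between two trajectories of equal duration.

    Assumption: the 1/(m+1) normalization sits outside the square root.
    """
    a = resample_by_time(times_a, points_a, m)
    b = resample_by_time(times_b, points_b, m)
    return seq_distance(a, b) / (m + 1)


def range_query(theta, query, stored):
    """Subsequence range query: every window of a stored trajectory whose
    distance to the query is below theta.

    Returns (trajectory index, 0-based offset, window) triples.
    """
    l = len(query)
    answers = []
    for tid, trajectory in enumerate(stored):
        for j in range(len(trajectory) - l + 1):
            window = trajectory[j:j + l]
            if seq_distance(query, window) < theta:
                answers.append((tid, j, window))
    return answers
```

The range query assumes that the query and the stored trajectories have already been normalized to the same sampling interval, for example with `resample_by_time`.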

[Fig. 6. Normalization of trajectories: (a) original data; (b) approximate trajectory; (c) temporal normalization; (d) spatial normalization]

Definition 10: Spatial Normalized Discrete Trajectory
Given a trajectory $\lambda$ and a natural number $m$, the spatial normalized discrete trajectory $\bar{\lambda}^{\delta}$ is defined as follows:

$$\bar{\lambda}^{\delta} = \lambda(t_0), \ldots, \lambda(t_m), \quad \text{where } L_S(\lambda, [t_{i-1}, t_i]) = \delta \ (i = 1, \ldots, m)$$

Similar to $\bar{\lambda}^{\Delta t}$, $\bar{\lambda}^{\delta}$ is generated by dividing $\lambda$ into pieces of equal spatial length $\delta$. In the case of Figure 6, the spatial normalized discrete trajectory (Figure 6(d)) is generated from the approximate trajectory (Figure 6(b)).

Definition 11: Spatial Similarity between Trajectories
Given two trajectories $\lambda$ and $\lambda'$ with the same spatial length (i.e., $L_S(\lambda) = L_S(\lambda')$) and a natural number $m$, the spatial distance (similarity) $D_S(\lambda, \lambda')$ between $\lambda$ and $\lambda'$ is defined as follows:

$$D_S(\lambda, \lambda') = \frac{1}{m+1}\sqrt{\sum_{i=0}^{m} D\left(\bar{\lambda}^{\delta}(i), \bar{\lambda}'^{\delta}(i)\right)^2}, \quad \text{where } \delta = \frac{L_S(\lambda)}{m} = \frac{L_S(\lambda')}{m}$$

Using this definition, it is possible to find the trajectories whose spatial shape is similar to that of the query trajectory without temporal features.

4 Indexing

With our proposed method for calculating similarity between trajectories, the database system can find the trajectories that have shapes similar to the shape of the query trajectory. However, the cost of calculating our defined similarity is very high, because it is necessary to calculate Euclidean distances between each point on the query trajectory and each point on all trajectories stored in the database. In general, database systems store many trajectories, and the amount of data is increasing rapidly. Therefore, it is important to reduce the cost of calculating similarities. In this section, we present an indexing method to reduce this cost, which is based on techniques for reducing the dimensions of vector data.

[Fig. 7. Distance between two sequences $c_1$, $c_2$, and distance between approximate sequences $\bar{c}_1$, $\bar{c}_2$ (panels (a), (b), (c))]

4.1 Piecewise Aggregate Approximation

Piecewise Aggregate Approximation (PAA) [7] is a technique for reducing the cost of comparing two sets of time series data. The essential idea of this technique is to reduce the number of compared data by using lower-bound values of the time series data. Here, we describe only an outline of this technique because it was fully presented in [7]. As mentioned in Section 3, the similarity between two time series can be defined as the Euclidean distance between two sequences represented as multidimensional vectors. Even if the query sequence has a shorter length $m$ than the candidate sequence, the similarity can be defined as the distance between the query sequence and each subsequence of the candidate sequence, as illustrated in Figure 3(b). According to the definition of the similarity between two sequences (mentioned in Section 3.3), when the length of the query sequence is $m$ and the maximum length of a candidate sequence is $n$, the cost of calculating the similarity is $O(mn)$ for each stored sequence. In order to reduce the cost of comparison, it is necessary to reduce the number of compared values in a sequence. PAA is a technique for generating approximate sequences to efficiently calculate similarity.
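
As a rough illustration of the approximation PAA performs, the following Python sketch replaces each run of consecutive values with its mean and compares the distance between the reduced sequences with the distance between the originals. It is a sketch under assumptions: `paa`, `euclidean`, and the parameter `seg_len` (the number of original values averaged into one approximate value) are hypothetical names, and the sample data are made up for the example.

```python
from math import sqrt


def paa(values, seg_len):
    """Replace each run of seg_len consecutive values with its mean."""
    if len(values) % seg_len != 0:
        raise ValueError("seg_len must divide the sequence length")
    return [sum(values[i:i + seg_len]) / seg_len
            for i in range(0, len(values), seg_len)]


def euclidean(a, b):
    """Euclidean distance between two equally long numeric sequences."""
    return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


c1 = [1, 3, 2, 4, 6, 8]
c2 = [2, 2, 5, 5, 1, 3]

reduced_1 = paa(c1, 2)   # three segment means instead of six values
reduced_2 = paa(c2, 2)

lower_bound = euclidean(reduced_1, reduced_2)
exact = euclidean(c1, c2)

# The reduced distance never exceeds the exact one, so any candidate whose
# reduced distance is already too large can be discarded without computing
# the exact distance.
assert lower_bound <= exact
```

Because averaging can only shrink differences, the cheap distance is safe to use as a filter: it may let through candidates that later fail the exact test, but it cannot reject a true answer.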
If the original sequence has $n$ values, the approximate sequence has only $k$ values ($k$ is a factor of $n$ and $k$ is much less than $n$). Each member of the approximate sequence $\bar{c} = \bar{w}_1, \ldots, \bar{w}_k$ is given as follows:

$$\bar{w}_i = \frac{1}{k} \sum_{j=k(i-1)+1}^{ki} w_j$$

In short, each $\bar{w}_i$ is calculated as the average of $w_{k(i-1)+1}, \ldots, w_{ki}$. Moreover, it was proved in [7] that the approximate sequences $\bar{c}$, $\bar{c}'$ have a special relationship with the original sequences $c$, $c'$ ($|\bar{c}| = |\bar{c}'| = k$):

$$D(\bar{c}, \bar{c}') \le D(c, c')$$

This relationship means that the distance between the approximate sequences is a lower bound of the distance between the original sequences. For example, in the case of Figure 7(a), the distance between $c_1$ and $c_2$ (Figure 7(b)) is never smaller than the distance between the approximate sequences $\bar{c}_1$ and $\bar{c}_2$, shown in Figure 7(c). Using this result, the database system can reduce the number of sequences compared with the query sequence.

[Fig. 8. The process for generating indexes to trajectories: original data, 1. projection, 2. approximation, 3. plotting on plane, 4. generating R+-Tree]

4.2 Extended Indexing Method for Shape-based Similarity Query

PAA is a simple and efficient technique for reducing the number of compared time series data; however, this technique has only been applied to data in space $R^1 \times T$. Hence, we extend this technique to trajectory data in space $R^2 \times T$. Moreover, we present an efficient indexing method for trajectories by combining two techniques: PAA and a spatial indexing technique (R+-Tree [13]).² We only describe the case of the spatial similarity query, but the essential idea can also be adapted to the spatio-temporal similarity query. First, we define the extension of PAA for two-dimensional space:

Definition 12: 2-Dimensional PAA (2D-PAA)
Given a normalized spatial trajectory $\bar{\lambda}^{\delta}$ ($L_S(\bar{\lambda}^{\delta}) = n\delta$) and $k$, which is a factor of $n$, the 2-Dimensional PAA of $\bar{\lambda}^{\delta}$ is:

$$\bar{\mathbf{c}} = \bar{\mathbf{x}}_1, \bar{\mathbf{x}}_2, \ldots, \bar{\mathbf{x}}_k \quad \text{such that } \bar{\mathbf{x}}_i = (\bar{x}_i, \bar{y}_i),$$

$$\bar{x}_i = \frac{1}{k} \sum_{j=k(i-1)+1}^{ki} x_j, \quad \bar{y}_i = \frac{1}{k} \sum_{j=k(i-1)+1}^{ki} y_j, \quad \text{where } \mathbf{x}_j = (x_j, y_j)$$

Intuitively, 2D-PAA can approximate the shape of a trajectory by calculating the center of the points contained in the trajectory. Using 2D-PAA and R+-Tree, the indexes to trajectories can be generated as shown in Figure 8. In order to retrieve trajectories whose distance to the query trajectory is less than $\theta$, the database system processes the following steps:

1. Calculate $\bar{\mathbf{c}}_q = \bar{\mathbf{x}}_{q1}, \ldots, \bar{\mathbf{x}}_{qk}$ for the query trajectory $\lambda_q$ ($\bar{\lambda}^{\delta}_q$).
2. Search the R+-Tree for sequences $\bar{\mathbf{c}}_1, \bar{\mathbf{c}}_2, \ldots$ (the length of each sequence is $k$) such that $D_S(\bar{\mathbf{c}}_i, \bar{\mathbf{c}}_q) < \theta$.
3. Find the answer trajectories $\lambda_1, \lambda_2, \ldots$ such that $D_S(\lambda_i, \lambda_q) < \theta$.

In our approach, the number of compared data is reduced in two steps. The combination of both reduction techniques enables databases to retrieve answer trajectories efficiently. Therefore, trajectory databases can support the shape-based similarity query without a heavy load. The performance of our indexing method is evaluated in Section 5.

5 Performance Study

5.1 Experimental Settings

There are basically three variables that could affect the similarity between trajectories. The first is the length of a query, because as the length of the query increases, the similarity decreases, but the cost of calculating similarities increases.
The second variable is the density of the points in a space, since the similarity increases and the number of compared trajectories increases with the density of the points. The final variable is the complexity of the shape of the trajectory. Generally, irregularity in the motion of an object causes greater complexity of shapes; in other words, people walking randomly generate the most complex shapes. Therefore, we generate sample trajectories by changing these variables. The number of trajectories is 2 to 32, the length of each trajectory is 1000 (i.e., the maximum number of points is 32000), and the sampling interval $\Delta t$ is fixed.³ Each trajectory has a shape that represents a person walking freely on a plane while changing speed and direction, and the frequency of the change in these values is altered in order to generate various complex shapes. In addition, the trajectories are embedded into a fixed area (of size 500 × 500) without any tendencies; in other words, the density of points can be controlled by the number of embedded trajectories.

² APCA [6], which is an extension of PAA, uses R-Tree based indexing techniques. However, that indexing method only uses R-Tree techniques to retrieve time series data (in space $R^1 \times T$) efficiently, not to retrieve sequences of vectors such as trajectories in space $R^2 \times T$.

[Fig. 9. The comparison of our proposed indexing method with the existing method (two graphs of calculation time in msec, 0 to 20000, against query length, 4 to 256, for 2000, 8000, and 32000 points)]

5.2 Efficiency

We conducted an experiment to assess the performance of our indexing method. For the experiment, we implemented two types of engine to find the trajectory that is nearest to the query trajectory. The first engine checks every point on all trajectories (without any index), while the second engine checks only points filtered by our proposed index. We generated random trajectories for queries, and measured the calculation time required to find the nearest trajectory to the generated query trajectory. Since this is a simple performance evaluation of our approach, we gave a very simple query to each engine: the length of the query is fixed to $k$. In other words, the query trajectory $\lambda_q$ has just $k$ items, $\lambda_q = \mathbf{x}_1, \ldots, \mathbf{x}_k$, and the approximate trajectory $\bar{\lambda}_q$ has only one item $\bar{\mathbf{x}} = \left(\frac{1}{k}\sum_{i=1}^{k} x_i, \frac{1}{k}\sum_{i=1}^{k} y_i\right)$, where $\mathbf{x}_i = (x_i, y_i)$.
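
The two engines compared in this experiment can be sketched in Python as follows. This is a simplified stand-in, not the implementation that was measured: all names are hypothetical, the R+-Tree is replaced by a plain list of window centroids that is sorted for each query, and the pruning rule relies only on the property that the distance between two centroids never exceeds the distance between the full windows.

```python
from math import hypot, inf, sqrt


def centroid(window):
    """2D-PAA of a window reduced to a single item: the mean point."""
    k = len(window)
    return (sum(x for x, _ in window) / k, sum(y for _, y in window) / k)


def seq_distance(a, b):
    """Euclidean distance between two equally long sequences of 2-D vectors."""
    return sqrt(sum((ax - bx) ** 2 + (ay - by) ** 2
                    for (ax, ay), (bx, by) in zip(a, b)))


def build_index(trajectories, k):
    """One (centroid, trajectory id, offset) entry per window of length k."""
    return [(centroid(t[j:j + k]), tid, j)
            for tid, t in enumerate(trajectories)
            for j in range(len(t) - k + 1)]


def nearest_brute_force(query, trajectories):
    """First engine: compare the query with every window, no index."""
    k = len(query)
    best = (inf, None, None)
    for tid, t in enumerate(trajectories):
        for j in range(len(t) - k + 1):
            d = seq_distance(query, t[j:j + k])
            if d < best[0]:
                best = (d, tid, j)
    return best


def nearest_with_index(query, trajectories, index):
    """Second engine: visit windows in order of centroid distance and stop
    once that lower bound reaches the best exact distance found so far."""
    k = len(query)
    qx, qy = centroid(query)
    candidates = sorted((hypot(cx - qx, cy - qy), tid, j)
                        for (cx, cy), tid, j in index)
    best = (inf, None, None)
    for bound, tid, j in candidates:
        if bound >= best[0]:
            break  # no remaining window can be closer than the current best
        d = seq_distance(query, trajectories[tid][j:j + k])
        if d < best[0]:
            best = (d, tid, j)
    return best


trajectories = [
    [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)],
    [(0, 5), (1, 6), (2, 5), (3, 6), (4, 5)],
]
query = [(1, 5.9), (2, 5.1), (3, 5.9)]
index = build_index(trajectories, len(query))
slow = nearest_brute_force(query, trajectories)
fast = nearest_with_index(query, trajectories, index)
```

Both engines return a (distance, trajectory id, offset) triple. A spatial index such as the R+-Tree would find the nearby centroids without sorting the whole list, which is where the reported savings would come from.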
Figure 9 illustrates the experimental result in the case where the shape of the trajectories is simple. The left graph in the figure shows the calculation time with our indexing method, while the right graph shows the calculation time without any index. With our proposed index, the calculation time is 60%-75% less than without the index in each situation. For example, in the case where the query length is 256 and the number of points is 32000 (the worst case), the calculation time is 3,325 msec with the index, whereas the calculation time without the index is 18,206 msec. Although the time taken to generate the index (overhead) is 3,377 msec when the number of points is 32,000, it is less than the time required without the index. The experiment therefore shows a clear advantage for our indexing method. In addition, we measured the calculation time in cases where the shape of the trajectories is very complex, such as trajectories where each object moves almost randomly. In this case, the calculation time is 15%-20% greater than in the simple case; however, the rate of increase is the same with and without the index.

Footnote 3: Even in the case where the sampling interval is not fixed, we can obtain results similar to those of the fixed case, because fixed-interval trajectories can be mechanically generated from the original trajectories with the piecewise linear approximation defined in Section 2.

6 Conclusion

The main contribution of this paper is the presentation of an efficient indexing method for processing shape-based similarity queries for trajectory databases. In order to calculate similarity between trajectories, we defined discrete trajectories that are re-sampled at a fixed interval. Furthermore, we described the performance of our proposed indexing method and showed its advantage over the existing methods. As future work, we will implement the trajectory database system to evaluate our proposed model. Furthermore, we will develop several real application programs such as a car navigation system, a personal navigation system, and other location-wares.

References

1. G. Chen and D. Kotz. Categorizing binary topological relations between regions, lines, and points in geographic databases. Technical Report TR2000-381, A Survey of Context-Aware Mobile Computing Research, Dept. of Computer Science, Dartmouth College, 2000.
2. H. Chon, D. Agrawal, and A. E. Abbadi. Query processing for moving objects with space-time grid storage model. In MDM2002 Conference Proceedings, pages 121-129, 2002.
3. E. Clementini and P. D. Felice. Topological invariants for lines. IEEE Transactions on Knowledge and Data Engineering, 10(1):38-54, 1998.
4. L. E. Elsgolc. Calculus of Variations. Pergamon Press LTD, 1961.
5. E. G. Hoel and H. Samet. Efficient processing of spatial queries in line segment databases. In O. Gunther and H. J. Schek, editors, SSD '91 Proceedings, volume 525, pages 237-256. Springer-Verlag, 1991.
6. E. Keogh, K. Chakrabarti, S. Mehrotra, and M. Pazzani. Locally adaptive dimensionality reduction for indexing large time series databases. In SIGMOD2001 Conference Proceedings, pages 151-162, 2001.

7. E. Keogh, K. Chakrabarti, M. Pazzani, and S. Mehrotra. Dimensionality reduction for fast similarity search in large time series databases. Knowledge and Information Systems, 3(3):263-286, 2001.
8. G. Kollios, D. Gunopulos, and V. J. Tsotras. On indexing mobile objects. In SIGMOD '99 Conference Proceedings, pages 261-272, 1999.
9. G. Kollios, V. J. Tsotras, D. Gunopulos, A. Delis, and M. Hadjieleftheriou. Indexing animated objects using spatiotemporal access methods. IEEE Transactions on Knowledge and Data Engineering, 13(5):758-777, 2001.
10. Y.-S. Moon, K.-Y. Whang, and W.-S. Han. General match: A subsequence matching method in time-series databases based on generalized windows. In SIGMOD 2002 Conference Proceedings, pages 382-393, 2002.
11. K. Porkaew, I. Lazaridis, and S. Mehrotra. Querying mobile objects in spatiotemporal databases. In C. S. Jensen, M. Schneider, B. Seeger, and V. J. Tsotras, editors, SSTD 2001, volume 2121 of Lecture Notes in Computer Science, pages 59-78. Springer-Verlag, 2001.
12. N. Priyantha, A. Miu, H. Balakrishnan, and S. Teller. The cricket compass for context-aware mobile applications. In MOBICOM2001 Conference Proceedings, pages 1-14, 2001.
13. T. Sellis, N. Roussopoulos, and C. Faloutsos. The R+-tree: A dynamic index for multidimensional objects. In VLDB '87 Conference Proceedings, pages 3-11, 1987.
14. A. P. Sistla, O. Wolfson, S. Chamberlain, and S. Dao. Modeling and querying moving objects. In ICDE '97 Proceedings, pages 422-432, 1997.
15. M. Vazirgiannis and O. Wolfson. A spatiotemporal model and language for moving objects on road networks. In C. S. Jensen, M. Schneider, B. Seeger, and V. J. Tsotras, editors, SSTD 2001, volume 2121 of Lecture Notes in Computer Science, pages 20-35. Springer-Verlag, 2001.
16. O. Wolfson, B. Xu, S. Chamberlain, and L. Jiang. Moving objects databases: Issues and solutions. In Statistical and Scientific Database Management (SSDM '98) Conference Proceedings, pages 111-122, 1998.