SMashQ: Spatial Mashup Framework for k-nn Queries in Time-Dependent Road Networks

Transcription

1 Distributed and Parallel Databases manuscript No. (will be inserted by the editor) SMashQ: Spatial Mashup Framework for k-nn Queries in Time-Dependent Road Networks Detian Zhang Chi-Yin Chow Qing Li Xinming Zhang Yinlong Xu Received: date / Accepted: date Abstract The k-nearest-neighbor (k-nn) query is one of the most popular spatial query types for location-based services (LBS). In this paper, we focus on k-nn queries in time-dependent road networks, where the travel time between two locations may vary significantly at different time of the day. In practice, it is costly for a LBS provider to collect real-time traffic data from vehicles or roadside sensors to compute the best route from a user to a spatial object The work described in this paper was partially supported by grants from City University of Hong Kong (Project No and ) and the National Natural Science Foundation of China under Grant No and Detian Zhang Department of Computer Science and Technology, University of Science and Technology of China, Hefei, China Department of Computer Science, City University of Hong Kong, Hong Kong, China USTC-CityU Joint Advanced Research Center, Suzhou, China tianzdt@mail.ustc.edu.cn Chi-Yin Chow Department of Computer Science, City University of Hong Kong, Hong Kong, China chiychow@cityu.edu.hk Qing Li Department of Computer Science, City University of Hong Kong, Hong Kong, China USTC-CityU Joint Advanced Research Center, Suzhou, China itqli@cityu.edu.hk Xinming Zhang Department of Computer Science and Technology, University of Science and Technology of China, Hefei, China USTC-CityU Joint Advanced Research Center, Suzhou, China xinming@ustc.edu.cn Yinlong Xu Department of Computer Science and Technology, University of Science and Technology of China, Hefei, China USTC-CityU Joint Advanced Research Center, Suzhou, China ylxu@ustc.edu.cn

2 2 Detian Zhang et al. of interest in terms of the travel time. Thus, we design SMashQ, a server-side spatial mashup framework that enables a database server to efficiently evaluate k-nn queries using the route information and travel time accessed from an external Web mapping service, e.g., Microsoft Bing Maps. Due to the expensive cost and limitations of retrieving such external information, we propose three shared execution optimizations for SMashQ, namely, object grouping, direction sharing, and user grouping, to reduce the number of external Web mapping requests and provide highly accurate query answers. We evaluate SMashQ using Microsoft Bing Maps, a real road network, real data sets, and a synthetic data set. Experimental results show that SMashQ is efficient and capable of producing highly accurate query answers. Keywords Spatial mashups location-based services k-nearest-neighbor queries time-dependent road networks Web mapping services 1 Introduction With the ubiquity of wireless Internet access, GPS-enabled mobile devices and the advance in spatial database management systems, location-based services (LBS) have been realized to provide valuable information for their users based on their locations [15, 17]. LBS are an abstraction of spatio-temporal queries. Typical examples of spatio-temporal queries include range queries (e.g., How many vehicles in a certain area ) [9, 20, 24] and k-nearest-neighbor (k-nn) queries (e.g., What are the three nearest gas stations to my car ) [13,20,22,27]. The distance between two locations in a road network is measured in terms of the network distance, instead of the Euclidean distance, to consider the physical movement constrains of the road network [24]. It is usually defined by the distance of their shortest path. However, this kind of distance measure would hide the fact that the user may take longer time to travel to his/her nearest object of interest (e.g., restaurant and hotel) than other ones due to many realistic factors, e.g., heterogeneous traffic conditions and traffic accidents. Travel time (e.g., driving time) is in reality a more meaningful and reliable distance measure for LBS in the road network [6, 7, 10]. Figure 1 depicts an example where Alice wants to find the nearest clinic for emergency medical treatment. A traditional shortest-path based NN query algorithm returns Clinic X. However, a driving-time based NN query algorithm returns Clinic Y because Alice will spend less time to reach Y (3 mins.) than X (10 mins.). Since travel time is highly dynamic, e.g., the driving time on a segment of I-10 freeway in Los Angeles, USA between 8:30AM to 9:30AM changes from 30 minutes to 18 minutes, i.e., 40% decrease in driving time [6], it is almost impossible to accurately predict the travel time between two locations in the road network based on their network distance. The best way to provide realtime travel time computation is to continuously monitor the traffic in the road network; however, it is difficult for every LBS provider to do so due to very expensive deployment cost.

3 Title Suppressed Due to Excessive Length 3 Alice: Where is my nearest clinic? Clinic X Distance: 500 meters Travel time: 10 mins Clinic Y Distance: 1000 meters Travel time: 3 mins Fig. 1 Shortest-path-based nearest object X versus travel-time-based nearest object Y. A mashup, one of the key Web 2.0 technologies, is a web application that combines data, representation, and/or functionality from multiple web applications to create a new application [30]. A server-side spatial mashup (or GIS mashup) provides a cost-effective way for a database server to access the route and travel time information between two locations in the road network from external Web mapping services, e.g., Google Maps [11], Yahoo! Maps [32], Microsoft Bing Maps [19] and government agencies. Web mapping service providers have many sources to collect data (e.g., historical traffic statistics and real-time traffic conditions) to estimate the travel time and direction between any two locations. The application scenario of this work is that location-based service (LBS) providers can subscribe direction services from Web mapping services through their APIs to provide services for their users based on current traffic conditions. In other words, our proposed framework enables the LBS provider to outsource the real-time traffic monitoring and traffic analysis operations to the Web mapping service provider, and thus, the LBS provider can focus on its core business activities and reduce its operational costs. However, existing spatial mashups suffer from the following limitations. (1) It is costly to access direction information from a Web mapping service, e.g., retrieving travel time from the Microsoft MapPoint web service to a database engine takes 502 ms while the time needed to read a cold and hot 8 KB buffer page from disk is 27 ms and ms, respectively [18]. (2) There is a charge on the number of requests issued to a Web mapping service, e.g., Google Maps allows only 2,500 requests per day for evaluation users and 100,000 requests per day for business license users [12]. A location-based service provider needs to pay more to have higher usage limits. (3) The use of retrieved route information is restricted, e.g., the route information must not be pre-fetched, cached, or stored, except only limited amount of content can be temporarily stored for the purpose of improving system performance [12]. (4) Existing Web mapping services only support primitive operations, e.g., the travel direction and time between two locations. A database server has to issue a large number of exter-

4 4 Detian Zhang et al. nal requests to collect small pieces of information from the supported simple operations to process relatively complex spatial queries, e.g., k-nn queries. In this paper, we design SMashQ, a framework that uses a server-side Spatial Mashup paradigm to process k-nn Queries in the time-dependent road network. Given a set of objects R and a k-nn query with a user U s location, SMashQ finds at most k objects from R with the shortest travel time from U s location. The objectives of SMashQ are to reduce the number of external Web mapping requests and provide highly accurate query answers. To accomplish these objectives, SMashQ first prunes the objects that are definitely not part of a query answer from R to find a set of candidate objects for the query answer. Then, three shared execution optimizations, namely, object grouping, direction sharing, and user grouping, are designed for SMashQ. The main idea of the object grouping optimization is to group candidate objects to a set of intersections I. For each intersection I I, one external Web mapping request is issued to retrieve the route and travel time from U to I, and then the travel time from I to each of its objects is estimated. To further reduce the number of external requests, the direction sharing optimization aims at using the route and travel time information from U to an intersection in I to find that from U to other intersections in I. The user grouping optimization groups queries to a set of intersections. For each intersection, its queries are processed at the same time, as they can share the route and travel time information from the intersection. To evaluate the performance of SMashQ, we compare SMashQ with two basic approaches using a real road network, two real-world data sets, synthetic data sets, and Microsoft Bing Maps [19]. Experimental results show that SMashQ outperforms the basic approaches in terms of the number of external Web mapping requests and it provides highly accurate travel time estimation and k-nn query answers. The remainder of this paper is organized as follows. Section 2 highlights related work. Section 3 describes the system model. Section 4 presents an exact algorithm. Section 5 describes the three shared execution optimizations for SMashQ. Experimental results are analyzed in Section 6. Finally, Section 7 concludes this paper. 2 Related Work In this section, we first highlight related work in the areas of spatial queries in distributed systems and shared execution for spatial queries. Then, we focus on the related work in the areas of spatial queries in road networks and spatial queries with external data or services. Spatial queries in distributed systems. Location-based query processing algorithms have been designed for various distributed systems, e.g., client-server systems [20, 23], peer-to-peer systems [26], mobile peer-to-peer systems [4,34], and wireless sensor networks [8,31]. In the client-server systems, a centralized database server stores all objects and employs query execution

5 Title Suppressed Due to Excessive Length 5 optimizations to process location-based queries issued by mobile users [20, 23]. A distributed spatial index structure has been designed for the peer-to-peer system to process location-based queries [26]. When a distributed server does not have enough spatial object information to process a location-based query, it uses the spatial index structure to enlist other peer servers that store the required information for help. Mobile peer-to-peer systems enable users to cache their nearby spatial objects to process location-based queries without enlisting any server for help [4,34]. Wireless sensor networks (WSNs) are more challenging than peer-to-peer systems, because WSNs have many limitations (e.g., limited storage and computational capacity, scarce battery power, and restricted communication ranges and bandwidth). Spatial query processing algorithms designed for WSNs have to be optimized for such limitations [8, 31]. Shared execution for spatial queries. Shared execution techniques are one of the widely used query execution optimizations for spatial queries. A shared execution scheme has been used to enable a system to scale up to a large number of concurrent continuous range and k-nearest-neighbor queries [20,21]. The basic idea is to group similar queries together and then evaluate a group of queries at the same time through a spatial join between moving objects and queries. Another shared execution scheme has been designed to optimize the query execution by clustering moving objects and queries based on their common spatial and temporal properties [23]. However, none of existing shared execution techniques is designed for sharing direction information retrieved from a Web mapping service to process spatial queries. Spatial queries in road networks. Existing location-based query processing algorithms can be categorized into two main classes: location-based queries in time-independent road networks and in time-dependent road networks. (1) Time-independent road networks. In this class, location-based query processing algorithms assume that the cost or weight of a road segment, which can be in terms of distance or travel time, is constant (e.g., [14,24,25]). These algorithms mainly rely on pre-computed distance or travel time information of road segments in road networks. However, the actual travel time of a road segment may vary significantly during different times of the day due to dynamic traffic on road segments [6,7]. (2) Time-dependent road networks. The location-based query processing algorithms designed for time-dependent road networks have the ability to support dynamic weights of road segments and topology of a road network, which can change with time. Location-based queries in time-dependent road networks are more realistic but also more challenging. George et al. [10] proposed a time-aggregated graph, which uses time series to represent time-varying attributes. The time-aggregated graph can be used to compute the shortest path for a given start time or to find the best start time for a path that leads to the shortest travel time. In [6,7], Demiryurek et al. proposed solutions for processing k-nn queries in time-dependent road networks where the weight of each road segment is a function of time. Spatial queries with external data or services. In this paper, we focus on k-nn queries in time-dependent road networks. This paper distinguishes itself from previous work [6, 7, 10] that it does not model the underlying road

6 6 Detian Zhang et al. network based on different criteria. Instead, it employs third party Web mapping services, e.g., Microsoft Bing Maps, to compute the travel time of road networks and provide the route and travel time information through serverside spatial mashups. Since the use of external Web mapping requests is more expensive than accessing local data [18], we propose three shared execution optimizations for SMashQ to reduce the number of such external requests and provide highly accurate query answers. There are some query processing algorithms designed to deal with expensive attributes that are accessed from external Web services (e.g., [1, 2, 18]). To minimize the number of external requests, these algorithms mainly focused on using either some cheap attributes that can be retrieved from local data sources [1, 18] or sampling methods [2] to prune a whole set of objects into a smaller set of candidate objects, and they only issue absolutely necessary external requests for candidate objects. The closest work to this paper is [18], where Levandoski et al. developed a framework for processing spatial skyline queries by pruning candidate objects that are guaranteed not to be part of a query answer and only issuing an external request for each remaining candidate object to retrieve the travel time from the user to the object. However, all previous work did not study how to use shared execution techniques and the road network topology to reduce the number of external requests that is exactly the problem that we solved in this paper. 3 System Model In this section, we describe our system architecture, road network model, and problem definition. Figure 2 depicts the system architecture of SMashQ that consists of three entities, users, a database server, and a Web mapping service provider. Users send k-nn queries to the database server at a LBS provider through their mobile devices at anywhere, anytime. The database server processes queries based on local data (e.g., the location and basic information of restaurants) and external data (i.e., route and travel time information) accessed from the Web mapping service provider. In general, accessing external data is much more expensive than accessing internal data [18]. A road map is modeled as a graph G = (V, E) comprising a set V of vertices with a set E of edges, where each road segment is an edge and each intersection of road segments is a vertex. For instance, Figure 3a depicts a 1: Location-based queries Users SMashQ Fig. 2 System architecture. 2: Web mapping requests 4: Answers 3: Route and travel time Database Server (e.g., restaurant data) (Low Cost Local Data) Web Mapping Service Provider (High Cost External Data)

7 Title Suppressed Due to Excessive Length 7 I 3 I 1 I 2 I 4 I 7 I 8 I 5 I 6 I 9 I 10 I 11 I 12 (a) A road map I 13 I 14 I 15 Road segment Intersection (b) A graph model Fig. 3 Road network model. real road map that is modeled as an undirected graph (Figure 3b), where an edge represents a road segment (e.g., I 1 I 2 and I 1 I 5 ) and a square represents an intersection of road segments (e.g., I 1 and I 2 ). Our problem can be defined as follows. Given a set of objects R and a NN query Q = (λ, k) from a user U, where λ is U s location, and k is the maximum number of returned objects, SMashQ returns U at most k objects in R with the shortest travel time from λ based on the route and travel time information accessed from a Web mapping service provider, e.g., Microsoft Bing Maps [19]. Since the access to the Web mapping service is expensive, our objectives are to reduce the number of external Web mapping requests and provide highly accurate a k-nn query answer. 4 Exact Algorithm In this section, we present an exact algorithm that produces accurate answers for k-nn queries. Since there could be a very large number of objects in a spatial data set, it is extremely inefficient to issue an external Web mapping request to retrieve the route and travel time from a user to each object in the data set. To this end, the exact algorithm employs the existing network expansion technique designed for processing range queries in a road network [24] and existing pruning techniques designed for query processing with external data [1,2,18] to minimize the number of external Web mapping requests. Given an underlying road network modeled as a graph G = (V, E), a spatial object set R, a query Q = (λ, k) from a user U, and U s maximum movement speed U.max speed, the exact algorithm performs an incremental search from U to find an accurate k-nn query answer.

8 8 Detian Zhang et al. Algorithm 1 Exact Algorithm 1: input: G = (V, E), R, Q = (λ, k), U.max speed 2: Initialize a current answer set A, a node set S, and a priority queue Q 3: dist max ; T max 4: N i the nearest node from U 5: NDist Dist(U, N i ) 6: Q.enqueue(N i, NDist); S {N i } 7: while Q is not empty do 8: (N i, NDist) Q.dequeue() 9: if A contains k objects and NDist > dist max then 10: break 11: end if 12: if N i is an object (i.e., N i R) then 13: Issue an external Web mapping request to retrieve T ime(u N i ) 14: if A contains less than k objects then 15: A A {N i } 16: if A contains k objects then 17: T max the largest travel time of the objects in A 18: dist max U.max speed T max 19: end if 20: else if T ime(u N i ) < T max then 21: Replace the object with T max with N i in A 22: T max the largest travel time of the objects in A 23: dist max U.max speed T max 24: end if 25: end if 26: for Each neighbor node N j of N i, where N j / S do 27: Q.enqueue(N j, NDist + Dist(N i, N j )); S S {N j } 28: end for 29: end while 30: return A Algorithm 1 depicts the pseudo code of the exact algorithm. It considers the intersections and objects in the graph as two types of nodes that will be incrementally inserted into a priority queue Q. Each entry in Q is defined as (N i, NDist), where N i is a node identifier and NDist is the network distance from U to N i. When we dequeue Q, the node with the smallest distance is returned and removed from Q. The dequeue operation of Q returns the node with the smallest network distance from U and removes it from Q. Without loss of generality, we assume that U can only U-turn at an intersection. Initially, U s nearest node N i with their distance Dist(U, N i ) is enqueued to Q (Lines 4 to 6 in Algorithm 1). For each iteration, a node (N i, NDist) is dequeued from Q. N i is processed based on the following two cases: Case 1: N i is an intersection. For each neighbor node N j of N i that has not been enqueued to Q, N j is enqueued to Q along with the network distance from U to N j, i.e., Dist(N i, N j ) + NDist, where Dist(N i, N j ) is the distance from N i to N j (Lines 26 to 28 in Algorithm 1). Case 2: N i is an object. An external Web mapping request is issued to retrieve the travel time from U to N i, T ime(u N i ). If a current answer A contains less than k objects, N i is inserted into A (Lines 15 to 18

9 Title Suppressed Due to Excessive Length I 1 1 I 2 I 3 2 I 4 1 R 4 3 R I 7 R I 5 1 I 6 R 8 U 2 R 2 R I 9 I 13 User R 1 I 10 I 11 R I 14 Object 3 R 6 I 8 R 3 I 12 I 15 Intersection Fig. 4 An example for the exact algorithm. in Algorithm 1). If N i is the k-th object inserted into A, we set the largest travel time of the objects in A to T max. In case that A contains k objects and T ime(u N i ) is less than the largest travel time of the objects in A (i.e., T max ), the object with T max is replaced with N i and T max is updated accordingly (Lines 21 to 23 in Algorithm 1). Then, the neighbor nodes of N i are processed as in Case 1. The algorithm repeatedly dequeues nodes from Q until an accurate answer is found. Whenever T max is updated, we calculate the maximum possible network distance dist max that U can travel within the time interval T max (i.e., dist max = U.max speed T max ). For each iteration, if the network distance of the node dequeued from Q is larger than dist max, the algorithm terminates and returns the current answer A to U (Line 9 in Algorithm 1). This is because the travel time from U to any unprocessed object is larger than T max, and thus, it cannot be part of A. Example. Figure 4 shows a running example for the exact algorithm, where a user U issues a query Q = (λ, k = 2), U.λ is represented by a triangle, 10 objects (i.e., R = {R 1,..., R 10 }) are represented by circles, the distance of each edge is indicated by a number beside it, and the maximum movement speed U.max speed is 25 m/s (i.e., 90 km/h). Since U is moving towards I 6 and the distance from U to I 6 is 1 km, (I 6, 1) is enqueued to an empty Q. Figure 5 depicts the 12 iterations performed by the algorithm to find an accurate k-nn query answer. 1. The algorithm dequeues (I 6, 1) from Q. Since I 6 is an intersection, its neighbor nodes (i.e., (R 5, 4), (I 5, 4), and (I 10, 4)) are enqueued to Q. 2. (R 5, 4) is dequeued. Since R 5 is an object, an external Web mapping request is issued to get the travel time from U to R 5, T ime(u R 5 ) = 350 seconds. Since the current answer A is empty, R 5 is simply inserted into

10 10 Detian Zhang et al. NodeN i Time(U N i ) PriorityQueue Current AnswerA T max dist max - - (I 6,1) - (I 6,1) - (R 5,4),(I 5,4),(I 10,4) - (R 5,4) 350s (I 5,4),(I 10,4),(R 4,5) R 5 (I 5,4) - (I 10,4),(R 4,5),(R 2,6),(I 1,9) R 5 (I 10,4) - (R 4,5),(R 1,5),(R 2,6),(I 11,7),(I 14,7),(I 1,9) R 5 (R 4,5) 500s (R 1,5),(R 2,6),(I 2,6),(I 11,7),(I 14,7),(I 1,9) R 5,R 4 500s 12.5km (R 1,5) 400s (R 2,6),(I 2,6),(I 11,7),(I 14,7),(I 9,7),(I 1,9) R 5,R 1 400s 10km (R 2,6) 200s (I 2,6),(I 11,7),(I 14,7),(I 9,7),(I 1,9) R 5,R 2 350s 8.75km (I 2,6) - (I 11,7),(I 14,7),(I 9,7),(I 1,9),(I 3,9) R 5,R 2 350s 8.75km (I 11,7) - (I 14,7),(I 9,7),(R 9,8),(I 1,9),(I 3,9),(R 10,9) R 5,R 2 350s 8.75km (I 14,7) - (I 9,7),(R 9,8),(I 1,9),(I 3,9),(R 10,9),(I 13,10),(I 15,13) R 5,R 2 350s 8.75km (I 9,7) - (R 9,8),(I 1,9),(I 3,9),(R 10,9),(I 13,10),(I 15,13) R 5,R 2 350s 8.75km (R 9,8) 600s (I 1,9),(I 3,9),(R 10,9),(I 13,10),(R 8,10),(I 15,13) R 5,R 2 350s 8.75km Fig. 5 The intermediate steps of the exact algorithm. A. R 4 is the only neighbor of R 5 that has not been processed by the algorithm, so (R 4, 5) is enqueued to Q. 3. (I 5, 4) is dequeued from Q and two of its neighbor nodes, (R 2, 6) and (I 1, 9), are enqueued to Q. 4. This iteration dequeues another intersection (I 10, 4) from Q and enqueues three of its neighbor nodes (R 1, 5), (I 11, 7), and (I 14, 7) to Q. 5. The algorithm dequeues (R 4, 5) from Q and issues an external request for R 4 to get T ime(u R 4 ) = 500 seconds. Since A contains less than k = 2 objects, R 4 is inserted into A. Then, T max is set to the largest travel time of the objects in A, i.e., 500 seconds. As U.max speed = 25 m/s, dist max is set to 25m/s 500s=12.5 km. One of R 4 s neighbor nodes, i.e., (I 2, 6), is enqueued to Q. 6. Since (R 1, 5) is an object, an external request is issued for R 1. As T ime(u R 1 ) = 400 is less than T max, R 4 is replaced with R 1. Then, T max and dist max are updated to 400 seconds and = 10 km, respectively. One of R 1 s neighbor node, I 9, has not been processed by the algorithm, so (I 9, 7) is enqueued to Q. 7. (R 2, 6) is dequeued from Q and an external request is issued. Since T ime(u R 2 ) = 200 is less than T max, R 1 is replaced with R 2 in A. Then, T max and dist max are updated to 350 seconds and = 8.75 km, respectively. As all the neighbor nodes of R 2 have been processed by the algorithm, no node is enqueued to Q. 8. (I 2, 6) is dequeued from Q. (I 3, 9) is enqueued to Q. 9. (I 11, 7) is dequeued from Q. (R 9, 8) and (R 10, 9) are enqueued to Q. 10. (I 14, 7) is dequeued from Q. (I 13, 10) and (I 15, 13) are enqueued to Q. 11. (I 9, 7) is dequeued from Q and no node is needed to be enqueued to Q. 12. The algorithm dequeues (R 9, 8) from Q and issues an external request for R 9 to get T ime(u R 9 ) = 600. Since T ime(u R 9 ) is larger than

11 Title Suppressed Due to Excessive Length 11 T max = 350, no update is needed for A. One of R 9 s neighbor node, i.e., (R 8, 10), is enqueued to Q. After the above 12 iterations, (I 1, 9) is dequeued from Q. Since NDist(U, I 1 ) = 9 is larger than dist max = 8.75, the algorithm terminates and returns A = {R 5, R 2 } to U as a query answer. 5 SMashQ Although the exact algorithm can prune a large number of objects to reduce the number of external Web mapping requests, if the object density in the vicinity of a querying user is very high, the database server would still need to issue a large number of external requests. To this end, we propose three shared execution optimizations, namely, object grouping, direction sharing, and user grouping, for SMashQ based on the road network topology to further reduce the number of external requests. This section is organized as follows: Section 5.1 presents the object grouping optimization for SMashQ. Section 5.2 extends SMashQ to support the direction sharing optimization. Section 5.3 discusses the user grouping optimization that further improves SMashQ. Finally, Section 5.4 analyzes the performance of our algorithms. 5.1 Object Grouping Optimization We observe that many spatial objects are generally located in clusters in real world. For example, many restaurants are located in a downtown area (Figure 3a). Thus, it makes sense to group nearby objects to a representative point. The database server issues only one external request to the Web mapping service for an object group. Then, it shares the retrieved travel time and route information from a user to the representative point among the objects in the group to estimate the travel time from the user to each object. There are two challenges in grouping objects: (a) How to select representative points in a road network? and (b) How to group objects to representative points? Our solution is to select intersections as representative points in a road network and group objects to them. The reason is twofold: (1) Since there is only one possible path from the intersection of a group G to each object in G, it is easy to estimate the travel time from the intersection to each object in G. (2) A road segment should have the same conditions, e.g., speed limit and direction. The estimation of travel time would be more nature and accurate by considering a road segment as a basic unit [16]. Figure 6 gives an example for object grouping where objects R 8, R 9 and R 10 are grouped together by intersection I 11. The database server only needs one external Web mapping request to retrieve the route information and travel time from U and I 11, and then it estimates the travel time from I 11 to each of its group members.

12 12 Detian Zhang et al. I 3 I 1 I 2 I 4 R 4 R 5 R 6 I 5 U I 6 I 7 R 8 R 7 I 8 R 3 R 2 R 9 I 9 R 1 I 10 I 11 R 10 I 12 I 13 I 14 I 15 Intersection of an object group User Object Intersection Fig. 6 Object grouping optimization Greedy Object Grouping Algorithm To minimize the number of external Web mapping requests, we should find a minimal set of intersections that cover all road segments having target objects. Given a k-nn query issued by a user U and a set R of candidate objects, which is computed by Algorithm 2, a subgraph G u = (V u, E u ) is extracted from the graph G of the underlying road network model, where E u is an edge set where each edge contains at least one candidate object in R and V u is a vertex set that consists of the intersections of the edges in E u. Algorithm 2 is similar to the exact algorithm (Algorithm 1), except that after Algorithm 2 issues k external Web mapping requests for the k-nearest objects of U to find an initial answer, and then computes T max and dist max for the initial answer (Lines 13 to 18 in Algorithm 2). After determining the initial answer, Algorithm 2 does not issue any external Web mapping requests for other candidate objects. Instead, Algorithm 2 adds such candidate objects to a candidate object set R (Line 21 in Algorithm 2). The greedy object grouping algorithm returns a vertex set I V u that is a minimal one covering all the edges in E u. The object grouping problem is the vertex cover problem [5], which is NP-complete, so we use a greedy algorithm to find I. Algorithm. Algorithm 3 depicts the pseudo code of a greedy object grouping algorithm. The input consists of the graph G of the underlying road network model and a set R of candidate objects, which are found by Algorithm 2, for a k-nn query issued by a user U. The algorithm first extracts a subgraph G u from G and calculates the number of edges connected to a vertex as the degree of that vertex (Lines 3 to 4 in Algorithm 3). Then, it selects a vertex v from V u with the largest degree to I. In case of a tie, the vertex with the shortest distance to U is selected. Each object on each edge connected to v

13 Title Suppressed Due to Excessive Length 13 Algorithm 2 Candidate object search algorithm 1: input: G = (V, E), R, Q = (λ, k), U.max speed 2: Initialize a current answer set A, a node set S, a priority queue Q, and a candidate object set R 3: dist max ; T max 4: N i the nearest node from U 5: NDist Dist(U, N i ) 6: Q.enqueue(N i, NDist); S {N i } 7: while Q is not empty do 8: (N i, NDist) Q.dequeue() 9: if A contains k objects and NDist > dist max then 10: break 11: end if 12: if N i is an object (i.e., N i R) then 13: if A contains less than k objects then 14: Issue an external Web mapping request to retrieve T ime(u N i ) 15: A A {N i } 16: if A contains k objects then 17: T max the largest travel time of the objects in A 18: dist max U.max speed T max 19: end if 20: else 21: R R {N i } 22: end if 23: end if 24: for Each neighbor node N j of N i, where N j / S do 25: Q.enqueue(N j, NDist + Dist(N i, N j )); S S {N j } 26: end for 27: end while 28: return R Algorithm 3 Greedy object grouping algorithm 1: input: Graph G = (V, E) and a set of candidate objects R from Algorithm 2 2: I { } 3: Extract a subgraph G u = (V u, E u ) from G = (V, E) such that each edge in E u contains some objects in R and V u contains the intersections of the edges in E u 4: Calculate the degree of each vertex in V u 5: while V u is NOT empty do 6: v the vertex with the largest degree in V u 7: Assign each object on each edge connected to v to the object group of v 8: Insert v into I and remove v from V u 9: Remove edges connected to v from E u 10: Update the degree of each vertex in V u 11: Remove the vertex without any edge from V u 12: end while is assigned to the object group of v. Note that each candidate object is assigned to one or two object groups. v is removed from V u and the edges of v are removed from the E u. After that, the degrees of the vertices of the removed edges are updated accordingly. The selection process is repeated until V u becomes empty (Lines 5 to 12 in Algorithm 3). Example. Figures 7 and 8 depict an example for the greedy object grouping algorithm, where a subgraph G u = (V u, E u ) extracted from G, as depicted

14 14 Detian Zhang et al. Current Answer={R 4, R 5 } I 1 I 2 I 4 I 3 R 4 R 6 R 5 I 6 I 7 R 7 I 8 I 5 R 2 U R 8 R 9 R 3 I 9 R 1 I 10 I 11 R 10 I 12 I 13 I 14 I 15 Intersection of a group User Object Intersection Fig. 7 Example of the object grouping optimization. d=1 I 5 d=1 I 7 I 9 I 10 I 11 d=2 d=1 d=2 (a) I = { I 11 } d=1 I 12 d=1 I 9 I 5 d=0 I 7 I 10 d=0 I 12 d=2 d=1 (b) I = { I 11, I 9 } d=0 I 5 I 10 d=0 (c) I = { I 11, I 9 } Fig. 8 Example of the greedy object grouping algorithm. in Figure 8a. Note that since R 4 and R 5 is the current answer A, which is found by Algorithm 2, the route and travel time from U to R 4 and R 5 have been retrieved from the Web mapping service. Thus, R 4 and R 5 are not candidate objects in R = {R 1, R 2, R 8, R 9, R 10 } (Figure 7), which are found by Algorithm 2. Since edges I 5 I 9, I 9 I 10, I 7 I 11 and I 11 I 12 contain some candidate objects in R, these four edges constitute E u and their vertices constitute V u. Then, the greedy algorithm calculates the degree for each vertex in V u. For example, the degree of I 11 is two because two edges I 7 I 11 and I 11 I 12 are connected to I 11. Figure 8a shows that I 11 has the largest degree and is closer to U than I 9, I 11 is selected to I and the edges connected to I 11 are removed from G u. Figure 8b shows that the degrees of I 7 and I 12 of the two deleted edges are updated accordingly. Any vertex in G u without any edge (i.e., I 7 and I 12 ) is removed from V u. Since I 9 has the largest degree, I 9 is selected to I and the edges connected to I 9 are removed. Since I 5 and I 10 have no edge, they are removed from V u. Finally, V u becomes empty, so the greedy algorithm terminates and clusters the candidate objects into two object groups (indicated by dotted rectangles), I 9 = {R 1, R 2 } and I 11 = {R 8, R 9, R 10 }, as indicated in Figure 7.

15 Title Suppressed Due to Excessive Length Travel Time Estimation for Grouped Objects Most Web mapping service providers, e.g., Microsoft Bing Maps, return turn-by-turn route information for a request. For example, if the database server sends a request to the Web mapping service to retrieve the route and travel time information from user U to intersection I 9 in Figure 7, the service returns the turn-by-turn route information, i.e., Route(U I 9 ) = { Dist(U I 6 ), T ime(u I 6 ), Dist(I 6 I 5 ), T ime(i 6 I 5 ), Dist(I 5 I 9 ), T ime(i 5 I 9 ) }, where Dist(A B) and T ime(a B) are the distance and travel time from location A to location B, respectively. If a group contains only one object, the database server simply issues one external request to retrieve the route information from the user to that object. Otherwise, our algorithm performs two steps to estimate the travel time for each object and select the best route (if appropriate) as follows. Step 1: Travel time calculation. Based on whether a candidate object is on the last road segment of a retrieved route, we can distinguish two cases. Case 1: An object is on the last road segment. In this case, we assume that the speed of the last road segment is constant. For example, given the information of the route from U to I 9 (i.e., Route(U I 9 )) retrieved from the Web mapping service, object R 2 is on the last road segment of the route, i.e., I 5 I 9. Hence, the travel time from U to R 2 is calculated as: T ime(u R 2 ) = T ime(u I 9 ) T ime(i 5 I 9 ) Dist(R 2 I 9 ) Dist(I 5 I 9). Case 2: An object is NOT on the last road segment. In this case, a candidate object is not on a retrieved route, i.e., the object is on a road segment S connected to the last road segment S of the route. We assume that the travel speed of S is the same as that of S. For example, given the retrieved information of the route from U to I 9 (i.e., Route(U I 9 )), object R 1 is not on the last road segment of the route, i.e., I 5 I 9. Hence, the travel time from U to R 1 is calculated as: T ime(u R 1 ) = T ime(u I 9 ) + T ime(i 5 I 9 ) Dist(I9 R1) Dist(I 5 I 9 ). Step 2: Route selection. The greedy object grouping algorithm may assign a candidate object to two object groups. In this case, the database server can find the routes from U to the object via each of the intersections. This step selects the route with the shortest travel time. For example, if both the intersections of edge I 9 I 10, i.e., I 9 and I 10, are selected (Figure 7), the travel time from user U to R 1 is selected as T ime(u R 1 ) = min(t ime(u I 9 R 1 ), T ime(u I 10 R 1 )) Algorithm Algorithm 4 depicts the pseudo code of SMashQ with the object grouping optimization. It first executes Algorithm 2 to find a candidate object set R, a current answer A, the maximum travel time T max of the objects in A, and the maximum travel distance dist max for a querying user U moving at its maximum possible speed (i.e., U.max speed) for time period T max (Line 2 in Algorithm 4). Then, it executes the greedy object grouping algorithm (i.e., Algorithm 3) to find an intersection set I. The intersections in I are sorted based

16 16 Detian Zhang et al. Algorithm 4 SMashQ with the object grouping optimization 1: input: G = (V, E), R, Q = (λ, k), U.max speed 2: Execute Algorithm 2 to find a set of candidate objects R, a current answer A, T max and dist max 3: Execute the Greedy Object Grouping Algorithm to get a set of intersections I (Algorithm 3) 4: I is sorted according to the intersection s network distance from U in non-decreasing order 5: while I is not empty do 6: I i the intersection in I with the shortest network distance from U 7: Issue a Web mapping request to retrieve Route(U I i ) and T ime(u I i ) 8: for Each object R j in the object group of I i do 9: Estimate the time from U to R j, i.e., T ime(u R j ) 10: if T ime(u R j ) < T max then 11: Replace the object with T max in A with R j 12: T max max{t ime(u R l ) R l A} 13: dist max U.max speed T max 14: for Each object R k R with the shortest network distance from U > dist max do 15: Remove R k from R and its associated object group(s) 16: end for 17: end if 18: end for 19: Remove I i and the intersection with an empty object group from I 20: end while 21: return A on their shortest network distance from U in non-decreasing order (Line 4 in Algorithm 4). The algorithm repeatedly processes the intersection with the shortest network distance I i I from U because I i has a higher chance to find objects that would be part of a query answer and prune more objects in R. For each I i, the algorithm issues an external Web mapping request to retrieve the route and travel time information from U to I i, denoted as Route(U I i ) and T ime(u I i ), respectively (Line 7 in Algorithm 4). For each object R j in the object group of I i, the travel time from U to R j is estimated. If T ime(u R j ) < T max, the object with T max in A is replaced with R j. T max and dist max are then updated accordingly. After that, for each object R k R with the shortest network distance from U larger than dist max (i.e., dist max = U.max speed T max ), R k is pruned from R and its associated object group(s) (Lines 14 to 16 in Algorithm 4). I i and the intersection with an empty object group are removed from I. This pruning process is repeated until I becomes empty (Lines 5 to 20 in Algorithm 4). In the running example, the exact algorithm has to issue five external Web mapping requests, i.e., U R 1, U R 2, U R 4, U R 5, and U R 9.. With the object grouping optimization, SMashQ only needs to issue four requests, i.e., U R 4, U R 5, U I 9, and U I 11.

17 Title Suppressed Due to Excessive Length Direction Sharing As described in Section 5.1, the object grouping optimization can effectively reduce the number of external Web mapping requests. To further reduce the number of requests, we design the direction sharing optimization to further boost shared execution in SMashQ. This section is organized as follows. We present the main idea and challenges of the direction sharing optimization (Section 5.2.1), its technical details (Section 5.2.2), and the algorithm of SMashQ with the object grouping and direction sharing optimizations (Section 5.2.3) Main Idea and Challenges Most Web mapping service providers will return the route Route(A B) with the shortest travel time, the turn-by-turn route and travel time information for a request from location A to location B. Let X be an intermediate intersection in Route(A B), i.e., T = Route(A... X... B). We observe that every prefix in T, i.e., Route(A X), is also the route with the shortest travel time from A to X. The correctness of this observation is shown in the following theorem. Theorem 1 Given a route T = Route(A... X... B), where X is an intermediate intersection in T, with the shortest travel time from location A to location B, every prefix in T (i.e., Route(A X)) is also the route with the shortest travel time from location A to intersection X. Proof A simple cut-and-paste argument is used to prove this theorem [5]. Suppose that there is a route T = Route(A... Y... X), where Y / T, with a travel time shorter than the prefix Route(A X) in T, then we could cut Route(A X) out of T and paste in T, resulting in a route with a shorter travel time than T, thus contradicting the optimality of T. Based on Theorem 1, given a set of intersections I computed by the greedy object grouping algorithm for a k-nn query issued from a user U, we should first process the intersection I i I such that the direction information from U to I i can be shared with the largest subset I of other intersections in I (i.e., I i has the highest sharing ability). The sharing ability of an intersection I i I is measured as Set(Route(U I i)) I I, where Set(Route(U I i )) is a set of intersections in Route(U I i ). The direction information from U to I i will be used to determine the route and travel time from U to each intersection in I without issuing additional external Web mapping requests. Consider an example that the greedy object grouping algorithm finds I = {I 5, I 6, I 8, I 9, I 11, I 15 } which are represented by solid squares (Figure 9). If we know that Route(U I 9 ) passes through I 1, I 2, I 5, I 6, and I 9, its direction information can be shared with intersections I 5, I 6, and I 9, and thus the sharing ability of I 9 is {I5,I6,I9} 6 = 3 6 = 0.5. In this example, only three external Web mapping requests are needed, i.e., U I 8, U I 9, and U I 15, because their direction information can be shared with all other intersections in I.

18 18 Detian Zhang et al. I 1 I 2 I 3 I 4 Route(U I 8 ) Route(U I 9 ) I 7 I 8 I 5 U I 6 Route(U I 11 ) I 9 I 10 I 11 I 12 I 13 I 14 I 15 Intersection of an object group User Intersection Fig. 9 Direction sharing optimization. Since the traffic condition would be very dynamic, as discussed in Section 1, we could not accurately predict the route between two locations. We will describe a histogram approach (Section 5.2.2) that is employed to estimate the sharing ability for an intersection based on the historical route information retrieved from the Web mapping service Histogram Approach In this section, we describe a histogram approach that is used to maintain historical route information and provide a heuristic measure of sharing ability for an intersection. Since querying users are mobile users, it is impossible to store an entry from every possible location to an intersection. Coupled with the fact that the sharing ability measure is not based on the querying user s location, the histogram approach only maintains the historical route information from one intersection to another one. Given a querying user U who is moving towards an intersection I u, I u is considered as a starting location for the sharing ability measure. The histogram approach basically maintains a two-dimensional matrix H in which an entry H[I i, I j ] (1 i, j V, where V is the number of intersections in the graph G of the road network model) stores the number of intersections from intersection I i to intersection I j, inclusive of I i and I j, based on the latest route information from I i to I j. The rationale behind the histogram approach is that H[I i, I j ] is used to estimate the sharing ability from I i to I j because a route with more intersections has a higher chance to be shared with other intersections. Initially, if I i = I j, H[I i, I j ] = 1; otherwise, H[I i, I j ] = 1. Given a route from I u to I d, we can distinguish two cases for the sharing ability measure based on the value of H[I u, I d ]: Case 1: H[I u, I d ] > 0. The sharing ability of I d is simply equal to H[I u, I d ].

19 Title Suppressed Due to Excessive Length 19 Algorithm 5 SMashQ with the object grouping and direction sharing optimizations 1: input: G = (V, E), R, Q = (λ, k), U.max speed, H 2: Execute Algorithm 2 to find a set of candidate objects R, a current answer A, T max and dist max 3: Execute the Greedy Object Grouping Algorithm to get a set of intersections I (Algorithm 3) 4: I is sorted according to the intersection s sharing ability in non-decreasing order 5: while I is not empty do 6: I h the intersection in I with the highest sharing ability 7: Issue a Web mapping request to retrieve Route(U I h ) and T ime(u I h ) 8: I the intersections included in Set(Route(U I h )) I 9: for Each intersection I i I do 10: for Each object R j in the object group of I i do 11: Estimate the time from U to R j, i.e., T ime(u R j ) 12: if T ime(u R j ) < T max then 13: Replace the object with T max in A with R j 14: T max max{t ime(u R l ) R l A} 15: dist max U.max speed T max 16: for Each object R k R with the shortest network distance from U > dist max do 17: Remove R k from R and its associated object group(s) 18: end for 19: end if 20: end for 21: end for 22: Remove I i, the intersection with an empty object group, and I from I 23: Update H based on Route(U I h ) accordingly 24: end while 25: return A Case 2: H[I u, I d ] = 1. Since the histogram has no historical route information from I u to I d, we employ a shortest path algorithm [5] (e.g., Dijkstra s algorithm and A algorithm) to find the shortest network path from I u and I d. Then, H[I u, I d ] and the sharing ability of I d are set to the number of intersections in the shortest network path, inclusive of I u and I d. After SMashQ receives the route and travel time information from U to an selected intersection I d (i.e., Route(U I d ) and T ime(u I d )), it updates the entry in H corresponds to the route from the first intersection to I d in Route(U I d ) and every entry in H corresponds to each prefix route in Route(U I d ). For example, the database server receives Route(U I 9 ) = { Dist(U I 6 ), T ime(u I 6 ), Dist(I 6 I 2 ), T ime(i 6 I 2 ), Dist(I 2 I 1 ), T ime(i 2 I 1 ), Dist(I 1 I 5 ), T ime(i 1 I 5 ), Dist(I 5 I 9 ), T ime(i 5 I 9 ) } from the Web mapping service provider, it will update H[I 6, I 2 ] = 2, H[I 6, I 1 ] = 3, H[I 6, I 5 ] = 4, and H[I 6, I 9 ] = Algorithm Algorithm 5 depicts the pseudo code of SMashQ with the object grouping and direction sharing optimizations. Similar to Algorithm 4, Algorithm 5

20 20 Detian Zhang et al. executes Algorithm 2 to find a set R of candidate objects, a current answer A, the maximum travel time T max of the objects in A, and the maximum travel distance dist max for a querying user U moving at its maximum possible speed (i.e., U.max speed) for time period T max (Line 2 in Algorithm 5). Then, Algorithm 3 is executed to find a set of intersections I. There are three modifications in the pruning process for the direction sharing optimization. The first modification is that the intersections in I are sorted by their sharing ability in non-decreasing order, instead of their shortest network distance from U in Algorithm 4 (Line 4 in Algorithm 5), and the algorithm repeatedly processes the intersection I h with the highest sharing ability (Line 6 in Algorithm 5). Since the route and travel time information retrieved for a request from U to I h can also be shared with the prefix intersections included in I, the second modification is to use such information to process the object groups of a set intersections I included in both the set of intersection in Route(U I h ) and I (i.e., I = Set(Route(U I h )) I) (Lines 8 to 22 in Algorithm 5). The third modification is to update the histogram H based on the retrieved route information, i.e., Route(U I h ), accordingly. 5.3 User Grouping To further reduce the number of external Web mapping requests for a database server with a high workload, e.g., a large number of users or continuous queries, we design another shared execution for grouping users. Similar to object grouping, we consider intersections in the road network as representative points. The database server only needs to issue one external request from the intersection of a user group G u to the intersection of an object group G o to estimate the travel time from each user U i in G u to each object R j in G o. The key difference between user grouping optimization and object grouping optimization is that users are moving. Grouping users has to consider their movement direction, so a user is grouped to the nearest intersection to which the user is moving. Figure 10 depicts an example, where user A on edge I 9 I 10 is moving towards I 10, so A is grouped to the user group of I 10. Similarly, users D and E are also grouped to the user group of I 10. Users C and B on edges I 1 I 2 and I 6 I 2, respectively, are both moving to I 2, so they are grouped to the user group of I 2. After having grouped users, we process queries in terms of user groups instead of users. The database server finds an overall query answer A for all users in a user group, and then extract the query answer from A for each user in the group. For example, given a user group G u and the intersection I u of G u, the database server first search the k max nearest objects to I u, where k max is the largest value of k of the k-nn queries issued by the users in G u. Then, k max and I u are used to replace k and U in the algorithm of SMashQ with the object grouping optimization (i.e., Algorithm 4) or the algorithm of SMashQ with the object grouping and direction sharing optimizations (Algorithm 5) to find an overall query answer A for all the users in G u. For the k i -NN query of

21 Title Suppressed Due to Excessive Length 21 each user U i in G u, the database server takes the k i objects with the shortest travel time in A as U i s query answer A i, and estimates the travel time from the user s location to I u as in Section to calculate the travel time from U i to each object in A i. Therefore, each query in G u can be effectively processed. 5.4 Performance Analysis We now evaluate the performance of our algorithms. Since accessing travel time and direction information from an external Web mapping service is more expensive than local data processing [18], we compare the performance of our algorithms in terms of the number of external Web mapping requests. Let M and N be the number of querying users and the average number of candidate objects per query, respectively. The exact algorithm needs to issue N M external Web mapping queries to answer all M queries; hence, its cost is Cost Exact = N M By using the object grouping optimization, SMashQ can reduce the average number of candidate objects from N to N through grouping objects to intersections, where N N; hence, Cost O = N M. If SMashQ employs the object grouping and direction sharing optimization techniques, it can further reduce N to N by sharing the direction information among intersections, where N N ; hence, Cost OD = N M. When SMashQ employs all the optimization techniques (i.e., object grouping, direction sharing, and user grouping), it can reduce M to M by grouping users to intersections, where M M; hence, Cost ODU = N M. I 3 I 1 I 2 I 4 C B R 4 R 5 R 6 I 7 R 7 I 8 I 5 I 6 R 8 R 9 R 2 D R 10 R 3 I 9 A R 1 E I 10 I 11 I 12 I 13 I 14 I 15 Intersection of a user group User Object Intersection Fig. 10 User grouping optimization.

22 22 Detian Zhang et al. In general, N N N and M M, so Cost Exact Cost O Cost OD Cost ODU. We will verify our performance analysis through the experiments in Section 6. 6 Performance Evaluation In this section, we first give our evaluation model in Section 6.1, including a basic approach, performance metrics, and default experiment settings. Then, we analyze the experiment result in Section Evaluation Model Basic approach. To our best knowledge, there is no other existing algorithms about k-nn queries using spatial mashups in time-dependent road networks. We design a basic approach (denoted as BasicExact), as proposed in our previous work [33]. The basic idea of BasicExact is to issue external Web mapping requests for the k-nearest objects to find an initial answer, calculate dist max based on the largest travel time in the initial answer and the user s maximum movement speed, and find other objects within the network distance dist max from the user as a candidate object set R for a query answer. Then, BasicExact issues one external Web mapping request from the querying user to each candidate object in R to get an exact k-nn query answer. We also consider the exact algorithm as another basic approach (denoted as Exact) to evaluate the performance of our SMashQ with only the object grouping and direction sharing optimizations (denoted as ) and our SMashQ with all the three shared execution optimizations (i.e., object grouping, direction sharing, and user grouping) (denoted as ). Performance metrics. We evaluate the performance of the basic approaches and our SMashQ algorithms in terms of four metrics. (1) The average number of external Web mapping requests per query and (2) the average response time per query are metrics evaluating the effectiveness of our algorithms. The response time of a query is defined as the sum of the local CPU processing time and the response time from Microsoft Bing Maps (including the communication cost and the CPU processing time at Microsoft Bing Maps). It is calculated as the sum of its local CPU query processing time and the multiplication of the number of external requests and the average response time per external request. From our experimental results, the average response time per external request from Microsoft Bing Maps is 14.3 milliseconds. (3) The accuracy of estimated travel time (denoted as Acc T ime) is a metric computed by: Acc T ime = 1 min ( ) T T, 1 T, (1)

23 Title Suppressed Due to Excessive Length 23 (a) The restaurant data (b) The hotel data Fig. 11 Two real data sets. where T and Tb are the actual travel time (retrieved by the exact algorithm) and the estimated one, respectively, and 0 Acc T ime 1. (4) The accuracy of k-nn query answers (denoted as Acc Ans) evaluates the accuracy returned by our SMashQ algorithms is calculated by: Acc Ans = b A A, A (2) b is a where A is an exact query answer returned by the exact algorithm and A query answer returned by our SMashQ algorithms and 0 Acc Ans 1. Experiment settings. We evaluated the performance of the basic approaches (i.e., BasicExact and Exact) and the SMashQ algorithms (i.e., and ) using C++ and Microsoft Bing Maps [19] with a real road network of Hennepin County, MN, USA [29]. We select an area of 8 8 km2 that contains 6,109 road segments and 3,593 intersections, and the latitude and longitude of its left-bottom and right-top corners are ( , ) and ( , ), respectively. We retrieved two real data sets within the selected area from Google Maps, i.e., a restaurant data set of 1,835 restaurants and a hotel data set of 363 hotels, as depicted in Figures 11a and 11c, respectively. Unless mentioned otherwise, we used the real restaurant data set and uniformly generated 5,000 querying users in the road network for all the experiments, and the default requested number of nearest objects of k-nn queries (i.e., the value of k) is 10. The maximum allowed speed is 90 km per hour.