Multi-vehicle Convoy Analysis Based on ANPR data

Similar documents
HATRIS JTDB Query Tool

CAB TRAVEL TIME PREDICTI - BASED ON HISTORICAL TRIP OBSERVATION

GEO-VISUALIZATION SUPPORT FOR MULTIDIMENSIONAL CLUSTERING

The Scientific Data Mining Process

CHAPTER 3 AVI TRAVEL TIME DATA COLLECTION

Concepts of digital forensics

Procedure for Marine Traffic Simulation with AIS Data

Normal and Binomial. Distributions

Flexible mobility management strategy in cellular networks

ACCELERATION OF HEAVY TRUCKS Woodrow M. Poplin, P.E.

FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM

An Overview of Knowledge Discovery Database and Data mining Techniques

Why is Internal Audit so Hard?

SuperViz: An Interactive Visualization of Super-Peer P2P Network

Understanding Web personalization with Web Usage Mining and its Application: Recommender System

BlipTrack. Case Studies. ANPR vs. Bluetooth. This case describes how data from ANPR cameras match corresponding Bluetooth sensor data.

Position Classification Standard for Management and Program Clerical and Assistance Series, GS-0344

Effective User Navigation in Dynamic Website

Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information

Object Recognition and Template Matching

CCTV - Video Analytics for Traffic Management

Analecta Vol. 8, No. 2 ISSN

Introduction. A. Bellaachia Page: 1

Open Access Research on Application of Neural Network in Computer Network Security Evaluation. Shujuan Jin *

Collection and Use of MAC Address Data for the Travel Demand Model

Visualization methods for patent data

KEITH LEHNERT AND ERIC FRIEDRICH

Business Process Discovery

White Paper April 2006

Data Mining for Manufacturing: Preventive Maintenance, Failure Prediction, Quality Control

Applying Deep Learning to Car Data Logging (CDL) and Driver Assessor (DA) October 22-Oct-15

THE SECURITY EXPOSURE

TRANSPORTATION MODELLING IN CALGARY

Railway Crossing Information System

Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10

Introduction. Introduction. Spatial Data Mining: Definition WHAT S THE DIFFERENCE?

siemens.com/mobility Traffic data analysis in Sitraffic Scala/Concert The expert system for visualization, quality management and statistics

OUTLIER ANALYSIS. Data Mining 1

An Introduction to Data Mining

Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges

Limitations of Human Vision. What is computer vision? What is computer vision (cont d)?

SURVEYING WITH GPS. GPS has become a standard surveying technique in most surveying practices

Linear Programming for Optimization. Mark A. Schulze, Ph.D. Perceptive Scientific Instruments, Inc.

Visual Structure Analysis of Flow Charts in Patent Images

COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction

Machine Learning using MapReduce

Performance Measurement for the better utilisation of New Zealand s existing Transport Network

ARTIFICIAL NEURAL NETWORKS FOR ADAPTIVE MANAGEMENT TRAFFIC LIGHT OBJECTS AT THE INTERSECTION

International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November ISSN

Model, Analyze and Optimize the Supply Chain

Image Compression through DCT and Huffman Coding Technique

Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data

A HYBRID RULE BASED FUZZY-NEURAL EXPERT SYSTEM FOR PASSIVE NETWORK MONITORING

Daily Traffic Control Log

Reliability Reliability Models. Chapter 26 Page 1

Automatic Calibration of an In-vehicle Gaze Tracking System Using Driver s Typical Gaze Behavior

A PHOTOGRAMMETRIC APPRAOCH FOR AUTOMATIC TRAFFIC ASSESSMENT USING CONVENTIONAL CCTV CAMERA

Seagull Intersection Layout. Island Point Road - A Case Study. Authors: John Harper, Wal Smart, Michael de Roos

Application of Data Mining Techniques in Intrusion Detection

Solving Geometric Problems with the Rotating Calipers *

SIPAC. Signals and Data Identification, Processing, Analysis, and Classification

International Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: Vol. 1, Issue 6, October Big Data and Hadoop

Big Data and Scripting map/reduce in Hadoop

Privacy Aspects in Big Data Integration: Challenges and Opportunities

Fight fire with fire when protecting sensitive data

Evaluation of Integrated Monitoring System for Application Service Managers

Nav-Cam 7000 TripMaster Software User Guide

Urban KiwiRAP: Road Safety Assessment Programme

EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

Using Data Mining for Mobile Communication Clustering and Characterization

Traffic Monitoring Systems. Technology and sensors

ACL Based Dynamic Network Reachability in Cross Domain

Understanding traffic flow

Qi Liu Rutgers Business School ISACA New York 2013

Siemens Intelligence Platform. Non contractual; Commercial in confidence; Subject to change without notice

FCD in the real world system capabilities and applications

BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES

ELEVATING FORENSIC INVESTIGATION SYSTEM FOR FILE CLUSTERING

Credit Card Fraud Detection Using Self Organised Map

Operation Count; Numerical Linear Algebra

Automatic Detection of Emergency Vehicles for Hearing Impaired Drivers

NakeDB: Database Schema Visualization

CLUSTERING FOR FORENSIC ANALYSIS

VEHICLE TRACKING SYSTEM USING GPS. 1 Student, ME (IT) Pursuing, SCOE, Vadgaon, Pune. 2 Asst. Professor, SCOE, Vadgaon, Pune

A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING

WYNYARD ADVANCED CRIME ANALYTICS POWERFUL SOFTWARE TO PREVENT AND SOLVE CRIME

How To Use Neural Networks In Data Mining

Extend Table Lens for High-Dimensional Data Visualization and Classification Mining

Neural Network based Vehicle Classification for Intelligent Traffic Control

Document Image Retrieval using Signatures as Queries

ARTIFICIAL INTELLIGENCE METHODS IN EARLY MANUFACTURING TIME ESTIMATION

CRIME MAPPING FROM AN INTELLIGENCE PERSPECTIVE. James Saywell and Peter Bawden Intelligence Analysts State Intelligence Branch South Australia Police

Using LSI for Implementing Document Management Systems Turning unstructured data from a liability to an asset.

Setting the Standard for Safe City Projects in the United States

The University of the State of New York REGENTS HIGH SCHOOL EXAMINATION MATHEMATICS A. Monday, January 26, :15 to 4:15 p.m.

Product Guide. Sawmill Analytics, Swindon SN4 9LZ UK tel:

More Local Structure Information for Make-Model Recognition

Motion & The Global Positioning System (GPS)

Transcription:

Multi-vehicle Convoy Analysis Based on ANPR data A. Homayounfar*, A. T. S. Ho*, N. Zhu*, G. Head, P. Palmer *Department of Computing, School of Engineering and Physical Science, University of Surrey, Guildford, Surrey, UK Surrey Police, Mount Brown, Guildford, UK Keywords: Data Mining, ANPR, Benford s Law, Encryption, Convoy Analysis. Abstract This paper focuses on the development and novel application of data mining techniques for convoy analysis of vehicles based on the automatic number plate recognition (ANPR) system. The amount of ANPR data captured daily by traffic cameras in the road networks is very substantial. Data mining techniques are commonly used to extract relevant information and to reduce the amount of data processing and storage. In this paper, we apply data clustering techniques to extract relevant traffic patterns from the ANPR data to detect and identify unusual patterns and irregular behaviour of multi-vehicle convoy activities. 1 Introduction Convoy Analysis of vehicles is a relatively new research area. There currently exist a few publications in this area because of availability and sensitivity of vehicle data. Gopalan and Narayanaswamy first proposed the use of convoy analysis to determine the routes from origin to destinations of two vehicles [1]. Chardair et al presented some results for a number of instances of the convoy movement problem [2]. Position functions and mathematical modelling were proposed for closely coupled convoys [3, 4]; however these functions and models were similar to work based on GPS tracking of vehicles. Software for convoy intelligence has been developed based on data visualisation and analysis [5]. In this paper, we present a number of data mining and detection techniques to filter and extract the relevant ANPR data for convoy analysis of vehicles. We further investigate different traffic models and scenarios for detecting unusual patterns and irregular behaviour of multi-vehicle convoys. In Section 2, we present data filtering and detection algorithms for convoy analysis based on ANPR data captured from two or more cameras installed in different road network locations. ANPR cameras are typically set at fixed zoom and angle for number plate capturing The huge amount of ANPR data involved for road network has also a direct impact on limiting the number of ANPR cameras used. By estimating speed limits on the road this paper presents an analytical model that can be used to connect vehicles even if the convoy of vehicles near cameras locations does not occur. Our proposed techniques for convoy analysis are based on real ANPR data provided by Surrey Police from four different camera locations around the major road network in Surrey. To protect data privacy in this paper, all ANPR data are encrypted using a simple substitution method and the true camera locations on the map are also coarsely referenced without revealing their exact location details. We identify a number of potential convoy scenarios arising unusual behaviours and patterns of vehicles. These scenarios may correspond to movements and activities that may assist police investigations. 2 Model analysis and detection algorithm In this paper, data sorting, filtering and detection algorithms have been proposed and developed for convoy analysis. Initial results are encouraging, confirming the functionality and accuracy of the proposed algorithms. The scalability of algorithms is an important consideration during the design and development stage as the algorithm must be able to support more traffic cameras leading to a significant further increase in data volume as well

as support for any specific range of time. To achieve a scalable algorithm, two threshold values are introduced to detect single occurrence in different camera locations and two-vector distance (or time difference) among multi-vehicle convoys as illustrated as a flow diagram in Figure lane A* road in the county of Surrey, the number of images could be in the region of 50,000. Dual carriageway and motorways would be handling traffic images in the order of two to three times more data volume. Figure 2 shows a reference satellite map with roadwork graphic overlays for our convoy analysis experiments. Due to the huge data volume, it is paramount that the Police are able to compress and more importantly, to extract the relevant ANPR data that could be submitted for further analysis of unusual patterns and activities related to multi-vehicle convoys. Figure 2: Reference map for analysis in Tables 1-5 of three single lane A* roads and motorway. 1. Figure 1: Flow chart of the algorithm. The vector distance among multi-convoys is used for sorting and clustering data in different groups. Even though the number of traffic cameras installed in each road network is relatively small the amount of ANPR images captured per day is quite substantial. For example, on a typical single Table 1 shows convoy scenarios for two cars or couple pair travelling in close proximity of each other. The first two characters of the number plates of the couple pair are encrypted as shown in the Table. Based on empirical results, a specific detection threshold and a proximity time are used in Table 1 for the convoy analysis experiments. A proximity time of approximately 30 seconds or less is considered to be a reasonable proximity for cars during a typical journey. The two cars identified as a convoy pair in Table 1 have similar number plates, may potentially belong to the same organization, for example, two tour coaches or delivery lorries. We will present more detailed analysis and scenarios of unusual patterns with irregular time of journey and dependency based on static and dynamic analysis of ANPR data in Sections 3 to 5.

NP Camera Time Location Direction 18/**/20**-10:07 18/**/20**-10:07 18/**/20**-10:19 M25 East 18/**/20**-10:19 M25 East 18/**/20**-11:32 M25 West 18/**/20**-11:32 M25 West 18/**/20**-11:45 18/**/20**-11:45 Table 1: Couple convoys travelling very close to one another and according to their car number plates may belong to the same organization. To perform real-time convoy analysis, it is important to ensure a good performance trade-off between detection accuracy and processing speed, given the sheer amount of ANPR data involved on a daily basis. Data mining and extraction techniques of unusual traffic patterns can be efficiently achieved via supervised classification. The hidden patterns among tens of thousands of ANPR data need to be fully understood and classified according to their irregular or unusual behaviours as compared to ordinary or normal behaviour vehicles. Once classified, irregular patterns of ANPR data should be stored and continuously updated in a database for further analysis in future as stored multi-vehicle convoy data may help to predict some linkage and hierarchy among those detected cars. This will be discussed in more detail in Section 5. In this paper, the first data mining method proposed involves two main processes: firstly, the ANPR data are first filtered to remove regular behaviour or activities between vehicles, this is an important process as it helps to reduce the number of daily data from approximately 200,000 by 98 per cent; secondly, a sorting algorithm is applied to the remaining data to detect the multi-vehicle convoys. Unusual patterns obtained via this process are then stored for future analysis if these patterns are repeated on a regular basis. Conventional filtering algorithms such as counting all ANPR data for each location and iterating if two or more convoy NPs are close together in one location then checking whether this happens for other locations, is not appropriate as the number of ANPR data processed is in the hundreds of thousands per day which makes the processing time too slow. Instead, by hashing the data and converting them from strings to integers significantly improves the processing speed and time. The proposed data mining method is developed and processed using MATLAB R2011a and Map Container Objects [6]. Processing of more than 200,000 daily ANPR data from four locations by this algorithm results in the hash output in less than a minute by checking the location and number of appearances of duplicate data. The sorted output data can be recorded daily for future analysis. The scalability of the proposed algorithm can be adapted to determine the closeness of data and classify them into group clusters for future analysis. In addition, by clustering data in different groups daily and comparing them in a routine period weekly, it is possible to enhance the data-mining techniques to predict the behaviour of multi-convoyed drivers, or vehicles with an unusual journey path. Convoy analysis of multi-vehicles can be further improved if the highest number of appearances of detected vehicles can be determined over a specified time period for each camera location or unusual journey daily path. This can be achieved by using the Benford s Law [7-8] on ANPR data collected on a daily, weekly or monthly basis. The use of Benford s law has recently attracted a significant amount of attention particularly for data forensic applications [8-9]. A graph based on Benford s law, is illustrated in Figure 3, which can be used to highlight anomalies of convoy patterns. Figure 3: Benford s graph for five continuous working days for frequency of appearance >19 (average more than 4 times a day detected by A2* camera). All characters in each NP are encrypted.

Figure 2 illustrates the Benford s law curve which can be described by a logarithmic distribution as P(d) = log 10 (1+1/d), where P is the probability of d. Normal patterns such as public transport or delivery business would appear frequently in different locations at similar times of the day or week but unusual patterns such as criminal activities might occur infrequently. By keeping records of a weekly Benford s graph, it is possible to analyse which vehicles make significant changes to their journey behaviour by comparing the Benford s graphs. Certain deviations to the normal Benford s curves would require further analysis of those detected vehicles. The operation of proposed convoy analysis method is as follows: before importing data to the algorithm, the data are first sorted in a standard time format based on the minimum and maximum periods of peak traffic. After comparing the output of filtered ANPR data to input (Monday-Friday but discarding bank holidays and weekends), it is possible to analyse or track unusual vehicle number plates. The number of convoyed scenarios will increase by having more camera locations; as an example using three road network locations there will be seven probabilities (2 3-1) of convoy cars. Two or more vehicles can appear in all three locations as M25A217; or two locations as M25, M25A217 and A217; or appear more than once as, A27 and M25. The number of convoy scenarios can be calculated as follows: Number of Scenarios for N Camera Locations = (2 N -1). (1) For example, based on four locations of cameras installed in road networks around Surrey will result in 15 probabilities of convoys which will be discussed in the following sections. 3 Static analysis of exact camera locations ANPR cameras are generally fixed at certain points without geometrical movements in rotation or zoom to detect vehicle number plates at various locations. In other words, the viewing domain is limited at a certain location close to the camera for number plate detection. In this section the scenarios are only based on multi-vehicle convoys detected at those fixed points. The speed limit [10] V SL for many roads and motorways are typically between 30 and 90 mph, Consider a car travelling from east towards west via the M25 then south towards north via the, the minimum time, t min, if the driver does not exceed the speed limit, can be calculated from t min = [D M25 / (V SL-M25 ) + D / (V SL- )]. It is also interesting to point out that if a car appears from one ANPR camera to the next detection time with the time difference less than t min the driver is definitely exceeding the speed limit. In general the t min from a car travelling via N camera (from camera 1 to camera N) is calculated by N D i t min =. (2) i V SLi By measuring the distance (D) of the camera locations from the M25 to the, t min 8 minutes. Based on available online traffic information data, the daily delay during rush hour is estimated at 10-20 minutes [10]. This gives an average time value, t AVG, from one camera location to the next one. t AVG does not exceed more than twice of t min ; hence t max at the worst case for the above four specific locations are 3t min < t max. Therefore, we can only take into account of vehicles that have appeared as convoy more than the maximum time (unusual patterns). It is evident that a vehicle cannot exceed t max, unless it breaks its journey but when two vehicles both exceed t max and detected by traffic cameras several times in close proximity to each other, then it would clearly highlight a strong linkage between these vehicles. 3.1 Convoy analysis of unusual journey times and locations As discussed in previous sections, we attempted to analyse only the most probable cases of irregular convoys. This also protects against invasion of privacy for normal vehicles with regular patterns and behaviours. For example, during a daily routine it might not be considered unusual for people who might work in the same organization who happened accidentally to travel close together in the same path around 8-10 am and 5-6 pm. However, a suspected scenario is shown in Table 2, where there were two vehicles detected at three different time periods by traffic cameras outside the

normal peak periods by the convoy analysis algorithm. NP Camera Time Location Direction 02/**/20** 11:24 M25 West 02/**/20** 11:25 M25 West 02/**/20** 11:35 02/**/20** 11:36 02/**/20** 14:58 02/**/20** 14:58 S*2*MO H*71N*W S*2*MO H*71N*W S*2*MO H*71N*W Table 2: Convoy time of two vehicles measured at three camera locations. The journey time between traffic cameras and is approximately three and a half hours for both S*2*MO and H*71N*W with t AVG 10-15 minutes. Their close proximity appearing at around the same time periods and locations would certainly indicate a potential relationship between them. This scenario is not so easy to analyse for a short distance and t t AVG to determine whether or not the two vehicles are convoyed. However, when exceeding time is more than the threshold value t max, and if there is also an unusual journey path that the two vehicles are seen at certain locations repeatedly, even with a couple of minutes delay, then this may be considered for further analysis as it might be more serious than a normal convoy. Table 3 highlights a convoy scenario of two vehicles that could not have happened accidentally for three camera locations, which are not in the same path (for camera locations, the path of the and are parallel roads and the M25 path intersects these two). The maximum time distance between the and, t max, does not exceed 30 minutes, which is less than seven times for the journey of B*51*VJ and J*71*ZY. Hence for J*71*ZY not following B*51*VJ is therefore considered highly improbable. NP Camera Time Location Direction B*51*VJ J*71*ZY B*51*VJ J*71*ZY B*51*VJ J*71*ZY 04/**/20** 11:28 04/**/20** 11:30 04/**/20** 14:03 04/**/20** 14:04 04/**/20** 14:14 04/**/20** 14:18 M25 M25 East East Table 3: Example of a car following another at three different times and locations and exceeding t AVG between camera locations at and. 4 Dynamic analysis in addition of camera bases As given in Tables 1-3, the time of analysis is based on static locations of cameras, where the convoy is based on the exact time detected by the ANPR cameras. It is possible to extend the static analysis to dynamic analysis by an approximate time measured for the cars when the convoy is not exactly at the cameras location. Table 4.1 contains convoy results similar to results presented in Tables 1-3, which give the time of the two vehicles as and E*82*ZU passing three traffic cameras on 04/**/20**. From this information it is possible to find out the approximate time and location of the two vehicles when they are a convoy pair within a few miles range of the ANPR cameras. The main difference between this scenario and previous scenarios presented in Sections 2 and 3 is that this convoy also occurs in other locations where there is no existence of any cameras. NP Camera Time Location Direction E*82*ZU E*82*ZU E*82*ZU 10:10 10:17 10:20 10:27 10:32 16:46 16:47 16:59 A217 A217 A217 Table 4.1: Convoy pair in three locations By knowing the speed limit on the A217 (40 mph) and (30 mph), it is possible to make an approximation of the convoy pair vehicles within a few miles in the proximity of the traffic cameras. Convoy Time t 10:29 - rendezvous 16:46<t<16:47-convoy Convoy Location d<2 miles north of A217 camera Table 4.2:. In addition to the convoy, this provides a potential rendezvous scenario with a domain up to two miles north of the camera location on the A217 road. 5 Hierarchies and clustering among ANRPs In addition to convoy analysis, the proposed algorithm can also detect some irregular linkage and relationship between the vehicles; this can be considered as a kind of clustering approach to

classify ANPR data into different relationship groups. From the experiments and analysis of 200,000 daily ANPR data provided by Surrey Police, only a few convoys fall in this category. To allow easy search of ANPR data records, a SQL database has been created with each vehicle number plate assigned an ID field that can be used as a primary key to access the data records efficiently in a hierarchical manner. This offers a significant advantage in performing clustering and data mining of multi-vehicle convoys and in predicting unusual or irregular behaviours of suspect vehicles ID Encoded NP Encoded Date -Time Direction A J*27*KA 2/**/20**-21:04 M25-East B N*73*SW 2/**/20**-21:28 C U*20*FI 2/**/20**-21:36 M25-East A C B D A C D B J*27*KA U*20*FI N*73*SW G*79*FH J*27*KA U*20*FI G*79*FH N*73*SW 3/**/20**-03:38 3/**/20**-03:40 3/**/20**-03:41 3/**/20**-03:49 3/**/20**-03:50 3/**/20**-03:53 3/**/20**-05:52 3/**/20**-05:53 M25-East - - M25-East Table 5: Connections among four vehicles in three groups in two continuous days. From Table 5, it is possible to identify a connection between different vehicles that appeared together as convoy pairs in different days to associate unusual activities via a hierarchy relationship. 6 Conclusions and Future work This paper presented initial research and results of convoy analysis of ANPR data collected from traffic cameras installed at various road network across Surrey. The work was in collaboration with the Surrey Police who kindly provided the data for the research. The huge amount of ANPR data was first reduced via filtering followed by sorting and identifying relevant vehicle data into convoy sets. A number of suspected convoy scenarios were highlighted that could be considered for further analysis by the Police. In future work, we will investigate advanced pattern recognition techniques for convoy analysis, as well as continuing the initial work based on the Benford s law for data forensics of accurate detection of multi-vehicle convoys. References [1] R. Gopalan and C. D. Narayanaswamy, Analysis of algorithm for an online version of the convoy movement problem, Journal of the Operational Research Society, 60, 1230-1236, (2009). [2] P. Chardair, G. P. McKeown, S. A. Verity- Harrison, and S. B. Richardson, solving a time-space network formulation for the convoy movement problem, Operations Research, 53 (2), 219-230, (2005). [3] W. Travis and D. M. Bevly, Trajectory duplication using relative position information for automated ground vehicle convoys, in Proceedings of IEEE/ION Position, Location, and Navigation Symposium, 1022-1032, (May 2008). [4] W. Travis and D. M. Bevly, Performance Analysis of a Closely Coupled GPS/INS Relative Positioning Architecture for Automated Ground Vehicle Convoys, Proceedings of The national Technical Meeting -Institute of navigation, VOL 2 999-1008 (2009) [5] Visual Analysis. Internet: http://www.visualanalysis.com, July 2011 [6] Map Container Objects, MATLAB R2011a Documentation [On-line]. Available: http://www.mathworks.com/help/techdoc/ref/ containers_map.html [7] F. Benford, The law of anomalous numbers, Proc. Amer. Phil. Soc., 78, 551 572, (1938). [8] M. Nigrini. Forensic Analytics: Methods and Techniques for Forensic Accounting Investigations: Wiley Corporate F&A, 2011, 43-44 and 85-98. [9] X. Zhao, A.T.S. Ho, Y. Q. Shi, Image Forensics using Generalized Benford's Law for Accurate Detection of Unknown JPEG Compression in Watermarked Images, 16th International Conference on Digital Signal Processing, 5-7 July 2009, Santorini, Greece [10] Traffic Information: Internet: http://www.highways.gov.uk/traffic/