VEHICLE DETECTION AND CLASSIFICATION FROM HIGH RESOLUTION SATELLITE IMAGES

Lizy Abraham a, *, M. Sasikumar b

a Dept. of Electronics & Communication Engineering, Lal Bahadur Shastri Institute of Technology for Women, Trivandrum, Kerala, India - lizytvm@yahoo.com
b Lal Bahadur Shastri Centre for Science and Technology, Trivandrum, Kerala, India - drmsasikumar@yahoo.com

Commission VI, WG VI/4

KEY WORDS: Region of Interest, Satellite Imaging Corporation, Bright vehicles, Otsu's threshold, SPOT-5, Connected Component Labeling, Cars, Trucks.

ABSTRACT:

In the past decades satellite imagery has been used successfully for weather forecasting and for geographical and geological applications. Low resolution satellite images are sufficient for these sorts of applications. But technological developments in the field of satellite imaging have produced high resolution sensors, which expand its field of application. High Resolution Satellite Imagery (HRSI) has thus proved to be a suitable alternative to aerial photogrammetric data as a new data source for object detection. Since traffic rates in developing countries are increasing enormously, vehicle detection from satellite data is a good choice for automating traffic monitoring systems. In this work, a novel technique for vehicle detection from images obtained from high resolution sensors is proposed. Even in high resolution images, vehicles are seen only as tiny spots that are difficult to distinguish from the background. Nevertheless, we obtain a detection rate of not less than 0.9. Thereafter we classify the detected vehicles into cars and trucks and count them.

1. INTRODUCTION

Efforts to extract information from imagery have been in place ever since the first photographic images were acquired. Low resolution satellite images such as those obtained from LANDSAT, MODIS and AVHRR sensors provide only a vague idea about the scenes and do not provide any information about the objects found in the scene.
Images obtained from these types of sensors can be used only for weather forecasting or meteorological applications. For feature extraction and object detection problems, high resolution images are a basic requirement. High resolution satellites like CARTOSAT, IKONOS, QuickBird, SPOT and WorldView provide information about objects that is as detailed as that of aerial images. The panchromatic band of QuickBird images reaches up to 60 cm resolution, which is as good as aerial images. The recently launched (August 13, 2014) WorldView-3 provides commercially available panchromatic imagery of 0.31 m resolution.

Though the availability of high resolution satellite images accelerates the process of object detection and helps automate such applications, vehicle detection from satellite images is still a challenging task. This is because, even in high spatial resolution imagery, vehicles are seen as minute spots that are difficult to identify as foreground regions. Classifying the detected vehicles is even harder, as it is sometimes impossible to distinguish large and small vehicles with the naked eye, even in high resolution images. The reference images used in our work make this problem clear.

Much earlier research has been performed on vehicle detection in aerial imagery (Hinz, S., 2005; Schlosser, C., Reitberger, J. & Hinz, S., 2003; Zhao, T. & Nevatia, R., 2001), and the work was later extended to satellite imagery. But the methods used for aerial imagery cannot be directly applied to satellite images, since vehicles are more vivid in aerial images. Fig.1 shows an aerial image of 0.15 m resolution. Figures 2, 3 and 4 are satellite images of resolutions 0.8 m, 0.61 m and 0.46 m respectively, obtained from different sensors. It is clear that even in high resolution satellite images, vehicles are difficult to distinguish from the scene.

The most distinguished works in this field use morphological transformations for the classification of pixels into vehicle and non-vehicle targets. The work done by Jin, X. and Davis, C.H. (2007) uses a morphological shared-weight neural network (MSNN) to learn an implicit vehicle model and classify pixels into vehicles and non-vehicles. A vehicle image base library was built by collecting a number of cars manually from test images. The process is time consuming because of the neural networks, and there is the extra burden of creating an image library. Zheng, H. and Li, L. (2007) suggested another morphology based algorithm using 0.6 meter resolution QuickBird panchromatic images to detect vehicles. Zheng, H., Pan, L. and Li, L. (2006) used an approach similar to that of Jin, X. and Davis, C.H., but with increased accuracy. Recent works in this area include an adaptive boosting classification technique (Leitloff, J., Hinz, S. & Stilla, U., 2010) for updating the weights, and an area correlation method (Liu, W., Yamazaki, F. & Vu, T. T., 2011) to detect vehicles from satellite images, in which very good accuracy is achieved but the classification stage is not included. The most recent work in this field is by Zheng, Z. et al. (2013), where the quality percentage reaches 92%, but with aerial images of very high resolution (0.15 m). The method uses top-hat and bot-hat transformations successively with Otsu's thresholding for better accuracy.

* Corresponding author.
doi:10.5194/isprsannals-ii-1-1-2014
ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume II-1, 2014

The specialty of our work is that vehicles are detected even from satellite images of 2.5 m resolution with acceptable accuracy. Both bright and dark vehicles are detected using the proposed method and are then classified as cars and trucks. The results presented in this work show that the detection percentage reaches 90% despite the poor visibility of vehicles in satellite images.

Figure 1: Highway image of 0.15 m resolution [courtesy: Zheng, Z. et al. (2013)]
Figure 2: IKONOS 0.8 m Natural Color Image of Nakuru, Kenya (taken in February 2008)
Figure 3: QuickBird (0.61 m) RGB Image of Dubai, UAE (taken in February 2010)
Figure 4: WorldView-2 RGB Image of Eiffel Tower, Paris, France (0.46 m)

2. METHODOLOGY

The method for vehicle detection consists of five steps. Image pre-processing is the first stage, where a satellite image is converted into a greyscale image. Secondly, the region of interest containing roadways alone is extracted from the road area image. Next, for the segmentation process, multiple thresholding that depends on the statistical properties of the image is used for finding bright vehicles. Detecting dark vehicles from the extracted segment is the fourth stage. Classification of vehicles and finding their count is the final stage of the automated approach. The algorithms developed were implemented, tested and analyzed using MATLAB software. The test images are obtained from Satellite Imaging Corporation, Texas, USA, which is the official Value Added Reseller (VAR) of imaging and geospatial data products. The simplified block diagram of the system is shown in fig. 5.

Figure 5: Overall Flow of the System

2.1 Pre-processing the Input Image

Satellite images are obtained as panchromatic (greyscale), natural colour (RGB) and multispectral bands. No pre-processing stage is needed for panchromatic images. To apply the vehicle detection algorithm to natural colour satellite images, they are first converted to greyscale. Multispectral images have 4 bands, where the fourth band is Near Infra-Red (NIR). This band is not used in our algorithm and is discarded; the remaining bands are then converted to a greyscale image.
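The pre-processing step above can be sketched in a few lines of Python. This is only an illustration, not the authors' MATLAB implementation: the function name `to_greyscale`, the nested-list image format, the assumption that the first three bands are ordered red, green, blue, and the use of the common BT.601 luma weights are all assumptions, since the paper does not state which greyscale conversion was used.

```python
def to_greyscale(image, weights=(0.299, 0.587, 0.114)):
    """Convert an RGB or 4-band (RGB + NIR) image to greyscale.

    `image` is a list of rows; each row is a list of pixels; each pixel is
    a sequence of 3 or 4 band values in 0..255 (hypothetical format).
    The weights are the common BT.601 luma weights (an assumption).
    """
    grey = []
    for row in image:
        grey_row = []
        for pixel in row:
            # Keep only the first three bands; a fourth (NIR) band is discarded.
            r, g, b = pixel[:3]
            value = weights[0] * r + weights[1] * g + weights[2] * b
            grey_row.append(int(round(value)))
        grey.append(grey_row)
    return grey


# Illustrative 1x3 image: a white 4-band pixel, a pure red and a pure green pixel.
sample = [[(255, 255, 255, 40), (255, 0, 0), (0, 255, 0)]]
grey_sample = to_greyscale(sample)
```

A panchromatic image would skip this step entirely, as described in the text.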
2.2 Region of Interest (ROI) Extraction

Most of the road segments in a given satellite image may not be straight. Therefore, before segmenting the region of interest, the image is rotated so that the road segment is at 0° (fig.18) with respect to the horizontal plane. After that, the desired region of interest is selected. The procedure is as follows:

(i) Rotating the image
(a) Enter the angle of rotation. For rotation in the clockwise direction the angle is negative, and for the anti-clockwise direction it is positive.
(b) Convert the value from cell string to string.
(c) Convert the value from string to numeric.
(d) Display the rotated image.
(ii) Selecting the co-ordinates of the ROI
(a) Select the (x,y) coordinates of the upper left corner point of the ROI.
(b) Select the (x,y) coordinates of the lower right corner point of the ROI.
(c) Merge the arrays x and y.
(d) Convert the values from numeric to string.
(e) Display the selected ROI co-ordinates.
(iii) Region of interest Extraction
(a) Subtract the x co-ordinates to find the width.
(b) Subtract the y co-ordinates to find the height.
(c) Crop the image using the obtained width and height.
(d) Display the segmented region, which is the ROI.

The figure shown below (fig.6) is a SPOT-5 panchromatic image of 2.5 m resolution showing a highway in Oklahoma City.

Figure 6: SPOT-5 Panchromatic Image (2.5m)

As the road segment seen in the image is parallel to the horizontal plane, the angle of rotation is taken as 0°. The upper left and lower right corners of the selected ROI are shown in fig.7. The ROIs are selected by mouse clicking using MATLAB. The (x,y) co-ordinates of the selected region are given in fig.8. The width and height of the region are calculated by subtracting the corresponding co-ordinates. After obtaining the width and height of the region, the ROI is displayed in fig. 9.

Figure 7: Selecting the Coordinates of the Image
Figure 8: Displaying the Coordinates of the Image
Figure 9: Region of Interest for Vehicle Detection

This process extracts the road segment alone, reducing the time required for the vehicle detection process. But if the entire process has to be fully automated, the road segments must be extracted without any human intervention. Many methods have been proposed for road detection from satellite images (Mokhtarzade, M. & Zoej, M. J. V., 2007; Movaghati, S., Moghaddamjoo, A. & Tavakoli, A., 2010). The authors have also worked on fully automatic road network extraction using a fuzzy inference system (L. Abraham and M. Sasikumar, 2011).

2.3 Automatic Road Detection using FIS

After evaluating a number of satellite images and their mean and standard deviation, 11 rules are formulated for decision making in order to develop a fuzzy inference system (FIS):

a. If (mean is low) and (stddev is low) and (hough is not line) then (output is not road)
b. If (mean is low) and (stddev is low) and (hough is line) then (output is road unlikely)
c. If (mean is low) and (stddev is high) and (hough is not line) then (output is not road)
d. If (mean is low) and (stddev is high) and (hough is line) then (output is not road)
e. If (mean is average) and (stddev is low) and (hough is not line) then (output is not road)
f. If (mean is average) and (stddev is low) and (hough is line) then (output is road)
g. If (mean is average) and (stddev is high) and (hough is not line) then (output is not road)
h. If (mean is average) and (stddev is high) and (hough is line) then (output is road unlikely)
i. If (mean is high) and (stddev is low) and (hough is not line) then (output is not road)
j. If (mean is high) and (stddev is low) and (hough is line) then (output is road unlikely)
k. If (mean is high) and (stddev is high) and (hough is line) then (output is road unlikely)

The linguistic variables generated using MATLAB are shown below:

Figure 10: Input linguistic variable - mean
Figure 11: Input linguistic variable - standard deviation
Figure 12: Input linguistic variable - hough

The FIS output for fig.6 is given below:

Figure 13: Road Detected Image

2.4 Multiple Thresholding for finding Bright Vehicles

In most cases, the intensity values of bright vehicles are greater than the intensities of the background. Using this concept we can apply a fixed threshold, and pixels higher than this threshold correspond to bright vehicles. But some objects or regions on roads, such as lane markers and road dividers, may have intensity values similar to those of bright vehicles. Also, bright vehicles may not all have the same range of intensity, because images are taken at different times with different sun elevation and azimuth angles and sensor angles. So it is better to use more than a single threshold. But as the number of threshold values increases we get as many binary images, increasing the time taken for the vehicle detection process. Therefore, to identify only the vehicles and to avoid the detection of irrelevant objects, three different thresholds T1, T2 and T3 are used in this work. The third threshold T3 is fixed as a combination of the first two thresholds T1 and T2 for getting more accurate results.

In order to find the three threshold values, consider the two dimensional matrix of image intensities M1. A one dimensional matrix M2 is constructed using the maximum intensity pixel from each row of M1. The threshold T1 is the mean of this 1D matrix M2. T2 is the minimum value in the matrix M2. To increase the percentage of accuracy, a third threshold T3 is calculated as the mean of the first two thresholds T1 and T2. The procedure is summarized in figure 14.

Figure 14: Finding Threshold Values

The concept is based on the fact that on a highway, bright vehicles have higher intensity levels than any other objects. Therefore we consider the maximum intensity in each row of M1 to calculate M2 and thereby the three thresholds T1, T2 and T3. Thresholds T1, T2 and T3 are used to convert the test image to three different binary images Image-1, Image-2 and Image-3. Fig.15 shows the three thresholded images for calculated threshold values of 200, 149 and 175 for T1, T2 and T3 respectively.

Figure 15: Thresholded Images: (a) Image-1 (T1 = 200), (b) Image-2 (T2 = 149), (c) Image-3 (T3 = 175)

From fig.15, it is understood that many irrelevant objects are included in the thresholded images. Also, some vehicles are common to the resultant images. In order to remove the irrelevant objects and to extract the common objects, the logical AND operation is performed among the binary images as given in eqns. (1), (2) and (3). The new binary images are shown in fig.16.

New Image-1 = bitwise AND [Image-1, Image-2]   (1)
New Image-2 = bitwise AND [Image-1, Image-3]   (2)
New Image-3 = bitwise AND [Image-2, Image-3]   (3)

Figure 16: New Segmented Images: (a) New Image-1, (b) New Image-2, (c) New Image-3

The final bright vehicle detected image (fig.17) is obtained by a logical OR operation performed among the new images (eqn.4), since the three intermediate images contain different vehicles because of the three thresholds. As seen from the images, some of the vehicles are common, and these are merged in this operation.

Final Bright Vehicle Image = bitwise OR [New Image-1, New Image-2, New Image-3]   (4)

Figure 17: Bright Vehicle Detected Image

2.5 Otsu's Thresholding for finding Dark Vehicles

For the detection of dark vehicles, Otsu's threshold (Otsu, N., 1979) is used. Before applying Otsu's threshold, a sliding neighborhood operation is applied to the test image (Aurdal, L., Eikvil, L., Koren, H., Hanssen, J.U., Johansen, K. & Holden, M., 2007). In this method a 3-by-3 neighborhood of each and every pixel is selected. Each pixel is replaced by the minimum intensity value of its 3x3 window. The result is a darker pixel compared to the earlier one. This operation is followed by Otsu's thresholding to get the resultant dark vehicle detected image.

Otsu's thresholding method involves iterating through all the possible threshold values and calculating a measure of spread for the pixel levels on each side of the threshold, i.e. the pixels that fall in either the foreground or the background. The aim is to find the threshold value where the sum of foreground and background spreads is at its minimum. This can be achieved by finding a threshold with the maximum between class variance and minimum within class variance. For this, the method checks all pixel values in the image using equations 5 and 6 to find out which one best classifies foreground and background regions, so that foreground regions are clearly distinguished from the scene depending on the quality of the image. The within class variance is simply the sum of the two variances multiplied by their associated weights. The between class variance is the difference between the total variance of the image and the within class variance.

Within Class Variance:
$$\sigma_w^2 = W_b\,\sigma_b^2 + W_f\,\sigma_f^2 \tag{5}$$

Between Class Variance:
$$\sigma_B^2 = \sigma^2 - \sigma_w^2 = W_b(\mu_b - \mu)^2 + W_f(\mu_f - \mu)^2 = W_b W_f (\mu_b - \mu_f)^2, \quad \text{where } \mu = W_b\,\mu_b + W_f\,\mu_f \tag{6}$$

where the weight, mean and variance of the image are represented by W, µ and σ². The foreground and background regions are represented by the subscripts f and b. The detected dark vehicles are shown in fig.18.

Figure 18: Dark Vehicle Detected Image

To avoid considering shadows of vehicles as dark vehicles, the bright and dark vehicle detected images are combined using a logical OR operation (eqn. 7). This merges bright vehicles and their shadows together. The final vehicle detected image is shown in fig. 19.

Vehicle Detected Image = bitwise OR [Bright Vehicle Detected Image, Otsu's Thresholded Image]   (7)

Figure 19: Resultant Vehicle Detected Image

2.6 Vehicle Classification & Count

Before moving to the vehicle classification stage, a morphological dilation operation is performed, as some vehicles may get split into parts after the segmentation operation. Dilation combines these parts into a single vehicle, which increases the detection percentage of the vehicle detection algorithm. The structuring element used for the process is defined as eqn. (8):
$$SE = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \tag{8}$$

Vehicles are commonly rectangular in shape; irregularly shaped or circular vehicles are not expected. Therefore the above structuring element is sufficient for our detection algorithm.

From the detected vehicles, parameters that can classify them as cars and trucks are calculated. In our work, the width, height and area of the detected vehicles are considered for the classification stage. For that, connected component labeling is performed on the dilated image. First, taking into account all the connected components in the reference image, the average of each of these parameters is computed. Then, the three parameters of each detected vehicle are compared with the average values. If the width, height and area of the vehicle are all greater than the average values, it is considered a truck; otherwise it is a car. The algorithm is given below:

1) Dilate the Vehicle Detected Image using structuring element SE.
2) Perform Connected Component Labeling using a 4-connected neighborhood.
3) Compute area, major axis (width) and minor axis (height) of the labeled regions.
4) Obtain the mean of these parameters.
5) Check whether area is greater than mean area and major axis length is greater than mean major axis length and minor axis length is greater than mean minor axis length.
   Yes: Increment truck count by 1
   No: Increment car count by 1
6) Convert the car and truck counts from numeric to string.
7) Display the count in the message box.

The following figure (fig.20) shows the car and truck counts for the vehicle detected image.

Figure 20: Car & Truck Counts of the ROI in the Reference Image

3. EXPERIMENTAL RESULTS

The results obtained for two other panchromatic images, IKONOS (1 m) and SPOT-5 (2.5 m), of highways in San Jose, CA and Barcelona, Spain are given below. The ROI for the IKONOS image is taken with an angle of rotation of +13° and for the SPOT-5 image with an angle of rotation of -10°.

Figure 21: IKONOS Panchromatic Image (1m) & ROI
Figure 22: Fig.17 with an Angle of Rotation +13°
Figure 23: Car & Truck Counts of the ROI
Figure 24: SPOT-5 Panchromatic Image (2.5m) & ROI
Figure 25: Car & Truck Counts of the ROI

The experimental method is also verified with the very high resolution aerial and satellite natural colour (RGB) images given in section 1. The aerial image is taken as a whole as the ROI, since only the road segment is shown in the image. The angle of rotation of the image is taken as 0°. Part of the road segment is taken as the ROI for the other three satellite images, with angles of rotation of -36°, +30° and +44° respectively.

Figure 26: Car & Truck Counts of Fig.1 (0.15m)
Figure 27: IKONOS Grayscale Image (0.8m) & ROI
Figure 28: Car & Truck Counts of the ROI
Figure 29: QuickBird Greyscale Image (0.61m) & ROI
Figure 30: Car & Truck Counts of the ROI
Figure 31: WorldView-2 Grayscale Image (0.46m) & ROI

To measure the performance of the algorithm, the results obtained are compared with the manual count of the vehicles, that is, a count made by visually inspecting the region under study. The inferred results are given in table 1. For all the reference images, although the cars and trucks are not vivid even in the actual image, the results show that the detection rate is more than 90%. It is noted that for the very high resolution aerial image, since the vehicles are clearly seen, even with the entire image taken as the ROI, the method is able to detect and classify vehicles with a detection rate of 0.94.
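The car and truck counts compared with the manual count come from the mean-based rule of Section 2.6 (steps 4 and 5 of the algorithm). A minimal Python sketch of that rule is given here for illustration; it is not the authors' MATLAB code. The function name `classify_regions`, the tuple input format and the example measurements are hypothetical, and the dilation and connected component labeling steps are assumed to have already produced the per-region measurements.

```python
from statistics import mean


def classify_regions(regions):
    """Count cars and trucks from per-region measurements.

    `regions` is a list of (area, major_axis, minor_axis) tuples, one per
    connected component (hypothetical input format). A region is counted
    as a truck only if all three values exceed the respective means.
    """
    counts = {"cars": 0, "trucks": 0}
    if not regions:
        return counts
    mean_area = mean(r[0] for r in regions)
    mean_major = mean(r[1] for r in regions)
    mean_minor = mean(r[2] for r in regions)
    for area, major, minor in regions:
        if area > mean_area and major > mean_major and minor > mean_minor:
            counts["trucks"] += 1
        else:
            counts["cars"] += 1
    return counts


# Illustrative values only, not measurements from the paper.
example_regions = [(12, 5, 3), (10, 4, 3), (30, 9, 4), (11, 5, 2)]
example_counts = classify_regions(example_regions)
```

Because the means are computed over the regions of the same image, the rule is relative: a scene containing only similarly sized vehicles would yield few or no trucks under this sketch.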
Figure 32: Car & Truck Counts of the ROI

Table 1. Performance Evaluation

4. CONCLUSION

In this paper, a multistep algorithm is designed for detecting vehicles from satellite images of different resolutions. The method also classifies and counts the number of cars and trucks in the image. The proposed method is able to count cars and trucks even in satellite images with resolution coarser than 1 m, in which vehicles appear as noisy white spots. But for roads with high density traffic, the error percentage may increase, since vehicles are very close together in those cases. Also, in this work vehicles are classified only as cars and trucks. More classes can be included, such as cars, small trucks, large trucks and buses. The availability of very high resolution hyperspectral data will bring major changes in the field of feature extraction and can be used as a source to rectify the above mentioned problems. Further research should focus on these areas and experiment with the maximum resources available to extract even minute details from satellite images.

REFERENCES

Hinz, S., Detection of vehicles and vehicle queues for road monitoring using high resolution aerial images, Proceedings of 9th World Multiconference on Systemics, Cybernetics and Informatics, Orlando, Florida, USA, pp. 1-4, 10-13 July, 2005.

Schlosser, C., Reitberger, J. and Hinz, S., Automatic car detection in high resolution urban scenes based on an adaptive 3D-model, Proceedings of the ISPRS Joint Workshop on Remote Sensing and Data Fusion over Urban Areas, Berlin, Germany, pp. 167-171, 2003.

Zhao, T. and Nevatia, R., Car detection in low resolution aerial images, Proceedings of the IEEE International Conference on Computer Vision, Vancouver, Canada, pp. 710-717, 09-10 July, 2001.

Jin, X. and Davis, C.H., Vehicle detection from high resolution satellite imagery using morphological shared-weight neural networks, International Journal of Image and Vision Computing, vol. 25, no. 9, pp. 1422-1431, 2007.

Zheng, H. and Li, L., An artificial immune approach for vehicle detection from high resolution space imagery, International Journal of Computer Science and Network Security, vol. 7, no. 2, pp. 67-72, 2007.

Zheng, H., Pan, L. and Li, L., A morphological neural network approach for vehicle detection from high resolution satellite imagery, Lecture Notes in Computer Science (I. King, J. Wang, L. Chan, and D.L. Wang, editors), vol. 4233, Springer, pp. 99-106, 2006.

Leitloff, J., Hinz, S. and Stilla, U., Vehicle Detection in Very High Resolution Satellite Images of City Areas, IEEE Trans. on Geoscience and Remote Sensing, vol. 48, no. 7, pp. 2795-2806, 2010.

Liu, W., Yamazaki, F. and Vu, T. T., Automated Vehicle Extraction and Speed Determination From QuickBird Satellite Images, IEEE Journal of Selected Topics in Appl. Earth Observations and Remote Sensing, vol. 4, no. 1, pp. 1-8, 2011.

Zheng, Z., Zhou, G., Wang, Y., Liu, Y., Li, X., Wang, X. and Jiang, L., A Novel Vehicle Detection Method With High Resolution Highway Aerial Image, IEEE Journal of Selected Topics in Appl. Earth Observations and Remote Sensing, vol. 6, no. 6, pp. 2338-2343, 2013.

Mokhtarzade, M. and Zoej, M. J. V., Road detection from high-resolution satellite images using artificial neural networks, International Journal of Applied Earth Observation and Geoinformation, vol. 9, pp. 32-40, 2007.

Movaghati, S., Moghaddamjoo, A. and Tavakoli, A., Road Extraction From Satellite Images Using Particle Filtering and Extended Kalman Filtering, IEEE Trans. Geosci. Remote Sens., vol. 48, no. 7, pp. 2807-2817, July 2010.

Abraham, L. and Sasikumar, M., Urban Area Road Network Extraction from Satellite Images using Fuzzy Inference System, Proc. International Conference on Computational Intelligence and Computing Research (ICCIC 2011), Kanyakumari, India, pp. 467-470, December 15-18, 2011.

Otsu, N., A threshold selection method from gray-level histograms, IEEE Transactions on Sys. Man Cyber., vol. 9, no. 1, pp. 62-66, 1979.

Aurdal, L., Eikvil, L., Koren, H., Hanssen, J.U., Johansen, K. and Holden, M., Road Traffic Snapshot, Report -1015, Norwegian Computing Center, Oslo, Norway, 2007.