Find Me: An Android App for Measurement and Localization

Size: px
Start display at page:

Download "Find Me: An Android App for Measurement and Localization"

Transcription

1 Find Me: An Android App for Measurement and Localization Grace Tsai and Sairam Sundaresan Department of Electrical Engineering and Computer Science University of Michigan, Ann Arbor Abstract We present a novel method to compute the 3D position of any point in a scene up to scale, measure objects within the scene and also localize the user in a scene using an android phone. Our method uses simple geometry and loose assumptions to compute the 3D configuration of the scene. We do not rely on Manhattan world assumptions or any other similar constraints about the structure of the scene. Using this application, the user can measure objects in the scene, or measure the scene itself. Experiments and results show that our method has very high accuracy and speed, and is ideal for the mobile platform. I. INTRODUCTION There has been a lot of work recently in the area of mobile computer vision. Various computer vision algorithms have been implemented on portable devices such as mobile phones to perform complex tasks such as object recognition, object tracking, image matching, scene understanding, etc. These algorithms have been used for barcode scanning, mobile shopping, location recognition and so on. There has however not been a lot of work done towards measuring a room or a scene using a mobile phone. More precisely, image understanding has not been extended towards measuring objects in the scene or the dimensions of the scene itself. We believe that there is a lot of untapped potential here, particularly because of the numerous applications in this area. Our method can be used by physically challenged people for navigation in a wheelchair. Physically challenged people would like to have the ability to navigate and move seamlessly while on a wheelchair. Modern electronic wheelchairs have a lot of controls on them, hence making them cumbersome for the people who use them. Instead if the person was able to select where he or she wanted to go on the phone, the phone could then return the coordinates of that location, making it easier to provide input to the wheelchair. Augmented reality has gained a lot of popularity recently. We believe that our app can also be used in such applications. For example, for playing games like squash, ping-pong etc between two android phones. When parking space is hard to find, under such cramped conditions, it becomes even more hard to judge whether your car can fit between two cars. A person can use this app to determine whether there is sufficient space to park between two cars in a parking lot. This can also be used to reconstruct a 3D model of the environment for applications such as virtual tours etc. This algorithm can also be used to measure the room and furniture inside it. For example, if a person has just moved into a house or wants to get some new furniture, he/she may not be sure of how much space is available. Using this app, he/she can measure the room up to scale, and then at the store measure the furniture up to the same scale and decide if it will fit nicely or not. The main motivation behind this project was to be able to develop a simple method which could provide meaningful information to the user from a single image and help him/her make informed decisions using the information. A. Previous Work II. RELATED WORK There are three main areas which are related to our work. These areas can be classified as single image methods, Visual SLAM and Structure from motion. Both Stucture from Motion and Visual SLAM techniques make use of a stream of images to produce a model of the scene in the form of a dense 3D point cloud. These methods produce very high quality models of the scene, but there are two issues which arise when such techniques need to be implemented on a mobile platform. The first issue is speed. These methods are computationally intensive which makes them difficult to implement on a phone. Even in the case of a client server model, where the server does the main computation, these methods would still suffer from a severe bottleneck in terms of computation time. The second issue is bandwidth available, i.e., that of sending a video to the server. Both the methods mentioned above require a large number of frames for building a good model of the scene. This in turn would mean that the length of the video recorded has to be large enough. Sending large files over to the server and back would have a crippling effect on the efficiency of the application in terms of time. The other popular technique used is that of Single image based reconstruction [6], [8], [3], [9], [12], [5], [10], [14]. Single image methods have the advantage that only a single image needs to be sent from the client to server. This allows for large savings in time, and also in terms of bandwidth. Saxena et.al used Markov Random Fields to identify the orientation and 3D positions of super pixels in the image and then reconstructed a 3D model from it. Hebert et.al have also done a lot of work in this area. In [7], they generate a coarse geometric model of the scene which divides the scene into three regions, namely ground, sky and surfaces. In [4], they develop a qualitative model of the scene based on blocks.

2 Hoiem et.al [5], generated a box layout of indoor scenes which had a significant amount of clutter in them. The common theme in single image approaches is that of learning a mapping from image features to depth, or surface orientation, etc. As a result, most of these approaches require extensive training and are also computationally expensive. In the case of mobile platforms, these algorithms would not be able to run in near real time speed, and would also impose severe bandwidth constraints. B. Contributions We propose a simple method that exploits geometric properties of the scene and does not require training. Also, we wish to encourage user interaction with the algorithm, which we believe will enhance user experience, and will also provide solutions which are user friendly. Furthermore, our method requires no prior training and this enables our technique to generalize well over different environments. We also do not impose hard constraints on the structure of the scene such as the Manhattan World Assumptions. We only require that the walls be perpendicular to the ground plane. Our method borrows assumptions from the paper by Delage et.al [3], [2]. The rest of the paper is organized as follows, section III deals with the technical details of our algorithm, in section IV, we present some experimental results and section V contains the conclusions. A. Overview III. TECHNICAL DETAILS The whole system is divided into client and server subsystems. The majority of the computation is done on the server. The system diagram shown in Figure 1 and 2 explains the split up of processing as well as the overall process flow. There are two modes of operation, namely single image mode and two image mode. In the single image mode, the user can find out the 3D coordinates of any point in the scene from a single image. For this, the user takes an image, and sends the image to the server. The server then generates various ground wall boundary hypotheses for the image received. These hypotheses are then superimposed on the image and sent back to the user. The user can then choose which hypothesis is the best and send his/her choice to the server. The server then sends back the data pertaining to that hypothesis to the client. Finally, 3D computation is done interactively where the user can specify points on the screen and then have the client compute the 3D coordinates of those points. In the two image mode, a similar procedure is followed where the user takes the first image, sends it to the server. The server sends back the various ground wall boundary hypotheses to the client. The user picks the best hypothesis and the choice is sent to the server. A similar procedure is adopted once the second image is taken by the user. Once both images have been processed, the server computes the motion between the two images using SURF feature correspondences. The computed motion results are then sent to the client. Fig. 1. Process Flow for Single Image Mode. The user takes an image, and sends the image to the server. The server generates various ground wall boundary hypotheses for the image received and send back an selection image with all the hypotheses superimposed on the original image. The user choosees which hypothesis is the best and send the choice to the server and the server sends back the data pertaining to that hypothesis to the client. Finally, 3D computation is done interactively where the user can specify points on the screen and then have the client compute the 3D coordinates of those points. Fig. 2. Process Flow for Two Image Mode. A similar procedure as the single image mode is followed where the user takes the two image, sends it to the server. The server sends back the various ground wall boundary hypotheses to the client. The user picks the best hypothesis and the choice is sent to the server. Once both images have been processed, the server computes the motion between the two images using SURF feature correspondences. The computed motion results are then sent to the client. The client server split up is as follows: The client captures the images and computes the 3D positions of points in the scene in the Single Image mode. The Server generates the hypotheses, and also computes the motion in the Two Image mode. Figures 1 and 2 explain the process in more detail. B. Ground-Wall Boundary Hypotheses A ground-wall boundary hypothesis is defined as a polyline extending from the left to the right boundaries of the image (Figure 3(a)), similar to [3]. The area enclosed by the polyline defines the ground plane (Figure 3(b)) and a line segment defines a wall plane in the physical world (Figure 3(c)). A ground-wall boundary hypothesis can be generated by connecting line segments to bound a region in the image space. Thus, we start by extracting line segments using hough transform and merging them into long straight lines as shown in Figure 3(d). A hypothesis can be any combination of these

3 (a) Example of a Ground-Wall Boundary Hypothesis (b) Ground Plane (a) 3D Coordinates of Ground-Plane Points (c) Wall Plane (d) Line Extraction (e) Occluded Edges (f) Identifying Vertical Lines (b) 3D Coordinates of Wall-Plane Points Fig. 4. (a) The 3D coordinate of a ground plane point can be determined by the intersection of the camera ray and the ground plane with normal vector n g and known camera height h.(b) A wall plane equation is determined by its corresponding ground-wall boundary line segment. The intersection of the camera ray and the wall equation determines the 3D coordinate of a wall plane point. (g) Lines Below the Camera Center (h) Cluster the Lines into Three Groups Fig. 3. Hypothesis generation. (a) A ground-wall boundary hypothesis is defined by a polyline extending from the left to the right boundaries of the image. (b) The ground plane is defined by the enclosed area of the polyline. (c) A line segment in the hypothesis defines a wall in 3D. (d) Line extraction using hough transform. (e) Vertical lines (pink lines) in the polyline implies occluded edges. (f) Vertical lines are identified based on their slopes. (g) The green point is the camera center and the candidate lines are the lines that are below the camera center. (h) The candidate lines are clustered into three groups (i.e. left, right, front) based on their slopes. lines. To simplify the problem, we model the indoor environment using only three walls (i.e. left, right, and front walls) and the ground plane with no occluded edges. Therefore hypotheses with vertical lines are not considered since vertical lines imply occluded edges (Figure 3(e)). Vertical lines are identified based on their slopes as shown in Figure 3(f). Since the camera is always placed above the ground plane, all the lines in a feasible hypothesis are below the camera center (Figure 3(g)). Thus, lines that are above the camera center are removed. We, then, cluster the lines into three groups, (i.e., left, right, front) based on their slopes in the image as shown in Figure 3(h). A hypothesis can be formed by selecting one line from each group and connecting them. We define the edge support of a hypothesis to be the fraction of the length of the boundary that consists of edge pixels which can be obtained by any edge detector. Hypotheses with edge support below a threshold are removed. C. 3D Coordinates of Ground-Plane and Wall-Plane Points In the camera space, the 3D location P i = (x i, y i, z i ) T of an image point p i = (u i, v i, 1) T is related by P i = z i p i for some z i. If the point P i lies on the ground plane with normal vector n g, the exact 3D location can be determined by the intersection of the line and the ground plane (Figure 4(a)): d g = n g P i = z i n g p i (1) where d g is the distance of the optical center of the camera to the ground. By setting the normal vector of the ground plane to n g = (0, 1, 0) T and the distance d g as the camera height, h, 3D location of any point on the ground plane with image location p i can be determined by x i u i /v i P i = y i = h z i 1 1/v i. (2) The camera height h is given by the user or can be set to h = 1 without loss of generality. A wall plane equation is determined by its corresponding ground-wall boundary line segment in a given ground-wall boundary hypothesis as shown in Figure 4(b). Points on the ground plane, we determine the normal vector of a wall plane n w based on the ground-wall boundary line segment, n w = n g v b (3) where v b is the 3D direction vector of the boundary line segment. If a point lies on the wall plane with position p j in the image space, its 3D location can be determined by d w = λ j n w p j (4) where d w can be obtained by any point on the boundary line.

4 D. Camera Motion Estimation To estimate the camera motion, we detect a set of point features and track them across frames. This feature tracking can be obtained by many methods, such as SIFT[11], SURF[1] and KLT[13] feature tracking or matching. In this project, we use SURF feature matching because it is more efficient than SIFT feature matching and it is more accurate than KLT features for large baselines. Since the 3D location of the feature points are static in the world coordinate, camera motion between two frames can be estimated by observing 2D tracked points. This camera motion estimation is equivalent to estimating the rigid body transformation of the 3D point correspondences from a still camera under the camera coordinate system. In this project, we assume that the camera is moving parallel to the ground plane so the estimated rigid body transformation of the points contains three degrees of freedom, ( x, z, θ), where x and z are the translation and θ is the rotation around y- axis. Based on the point correspondences in the image space, we reconstruct the 3D locations of the point features for both frames individually under the ground-wall boundary hypothesis selected by the user. Given the two corresponding 3D point sets in the camera space, {P i = (X P i, YP i, ZP i )T } and {Q i = (X Q i, YQ i, ZQ i )T }, i = 1...N, in two frames, the rigid-body transformation is related by Q i = RP i + T where R is the rotation matrix, T is the 3D translation vector. The rotation matrix R has only one degree of freedom which has the form cos( θ) 0 sin( θ) R = (5) sin( θ) 0 cos( θ) and the translation vector T = (t x, 0, t z ) T has two degrees of freedom. In order to estimate the three degrees of transformation, the point correspondences are projected onto the ground plane which are denoted as {P i = (X P i, h, ZP i )T } and {Q i = (X Q i, h, ZQ i )T }. The rotation matrix can be estimated by the angular difference between two corresponding vectors, P ip j and Q iq j. Our estimated θ is thus the weighted average of the angular differences, cos( θ) = 1 ωij where ω ij is defined as i j ω ij P ip j Q iq j P ip j Q iq j ω ij = (1/ZP i + 1/Z Q i ) (1/Z P j + 1/ZQ j ) (7) 2 2 where intuitively distant 3D points are given lower weights since the reconstructed 3D positions of distant points are less accurate. The translation vector T is then estimated by the weighted average of the differences between RP i and Q i, T = 1 ωi N i=1 (6) ω i (RP i Q i) (8) (a) (b) Fig. 5. Examples of hypotheses generation using MATLAB and images taken by an regular hand-held camera. where ω i = (1/Z P i + 1/Z Q i )/2. IV. EXPERIMENTS AND RESULTS We first tested our idea by simulating the algorithm in MATLAB. Once we validated the simulation results, we then implemented the algorithm on the android platform by using the Eclipse and OpenCV 2.2. The client used was a Motorola Droid with a 5 Megapixel camera and an Arm Cortex A8 processor of speed 550 MHz. We used a 2.4GHz Intel Core 2 Duo Macbook Pro as the server. For evaluating the quality of our method, we performed some experiments to compare the performance of the android system against a MATLAB implementation. For the MATLAB implementation, we captured images using a high quality hand held camera. Further we also tested the quality of feature matching, and the run time of our method. Finally, we estimated the accuracy of our technique for computing 3D positions as well as for estimating motion. Figure 8 shows some snapshots of our algorithm on the droid phone. A. Generating Wall-Boundary Hypotheses Figure 5 shows the results from the MATLAB simulation of the ground-wall hypothesis generation step. Figure 6 shows the results of the same process from our android implementation. It can be seen that in both cases, the number of hypotheses

5 Fig. 6. Examples of hypotheses generation using Android. generated depends on factors such as features available in the environment, illumination conditions, etc. However, it must be noted that despite the fact that the droid phone does not have a very high quality camera, our algorithm is still able to identify a large set of plausible hypotheses. If the images taken from the android phone are blurred, the algorithm does not perform as well as the MATLAB simulation which uses the hand held camera. In some of such cases, the true groundwall boundaries are not extracted by the edge detector or the line detector. B. Feature Matching For feature extraction and matching, we used SURF matching provided by OpenCV 2.2. We considered several feature descriptors such as SIFT, Fern, KLT, but found that the SURF features are not only efficient to compute and match, but also are reliable over various indoor environments. We found that the matching process worked very well even when there was large motion between the two images. However, it should be noted that the matched features highly depend on the motion between the images. If the motion is large, it becomes less likely that the two images share the same feature points. The results of SURF matching on a variety of environments are shown in Figure 7 C. Timing Analysis Table I shows the split up of time taken for the various stages in the single Image mode. The hypothesis generation step is of variable timing since the number of hypotheses generated is not constant but depends on the environment. The rest of the stages take roughly the same time over each environment. Table II shows the split up of time taken for the various stages in the two Image mode. Here, the time taken is roughly a second more than the single image mode because an additional image needs to be sent to the server. Apart from TABLE I SPLIT UP OF TIME FOR THE SINGLE IMAGE MODE Split up-single Image Mode Process Time Taken Sending and Receiving Data sec Hypothesis Generation (Variable) 1 sec (average) Computing 3D 5 msec Total Time sec that, the rest of the processing stages take the same amount of time compared to the single image mode. TABLE II SPLIT UP OF TIME FOR THE TWO IMAGE MODE Split up-single Image Mode Process Time Taken Sending and Receiving Data sec Hypothesis Generation (Variable) 1 sec (average) Motion Estimation 2 msec Total Time sec The real bottleneck in both modes is the data transfer step. We believe that the speed can further be improved by using some kind of compression strategy to send and receive data. However, in such a case, lossless compression would be preferred because the hypotheses generated depend on the quality of the image. D. Accuracy of the Estimations We find that our method has excellent accuracy given the real world situations in which the phone was tested. The error in the 3D estimation, given the user selected boundary is approximately 0.5 m. In order to compute this, we measured the actual 3D coordinates of a few points in the scene and compared our result with it. It must be taken into account that there would be some discrepancy in the comparison because of human error. The accuracy depends on a number of factors

6 (a) (b) (c) Fig. 7. Examples of SURF Feature Matching. (a) is the matching results from images taken from a regular hand-held camera. (b) and (c) is the matching results using Android phone. such as the tilt involved, the hypothesis selected, etc. The precision tends to reduce in cases where the user selects a bad ground-wall boundary. The error of our motion estimation given the selected boundaries is approximately 0.3m for translation and 5 for rotation.the Rotation estimation is less accurate because we estimated it based on a linear approximation even though the rotation matrix is a non-linear transformation. The translation is more accurate because it is a linear transformation and the optimized solution is the averaged distance between two corresponding 3D point sets as discussed before. V. CONCLUSIONS We presented a novel method to compute the 3D position of any point in a scene up to scale, measure objects within the scene and also localize the user in a scene using an android phone. Our method uses simple geometry and loose assumptions to compute the 3D configuration of the scene. Our current approach depends a lot on image quality. The speed of the process varies according to the number of hypotheses

7 (a) Main Menu (b) Camera Menu (c) Two Image Mode Menu (d) Select Hypotheses (e) Measure 3D Location (Single Image Mode) (f) Motion Estimation (Two Image Mode) Fig. 8. Snap Shot of the Android Phone. The main menu (a) lets the user to choose between the modes, single image mode or two image mode. The single image mode directly lead the user to the camera menu (b). Once the user taken the image and send it to the server, the server sends an selection image back for user to choose the best hypothesis as in (d). The server receive the selection and send details of the selected hypothesis along and begin the 3D location measurements (e). Note that all the 3D measurements are done on the phone. If the user select the two image mode, the two image mode menu (c) will pop up. The user then takes the images and select the best hypothesis using the same process as in the single image mode. Once the images are taken and the motion estimation button is hit, the server estimates the motion and send the details to the phone as in (f). generated. Future work includes developing a mechanism for intelligent hypothesis generation and making the method invariant to environmental changes and image quality. Our current results shows that our method has very high accuracy and speed, and is ideal for the mobile platform. [13] J. Shi and C. Tomasi. Good features to track. In IEEE Conference on Computer Vision and Pattern Recognition, pages , [14] H. Wang, S. Gould, and D. Koller. Discriminative learning with latent variables for cluttered indoor scene understanding. In European Conference on Computer Vision, REFERENCES [1] H. Bay, A. Ess, T. Tuytelaars, and L. V. Gool. Surf: Speeded up robust features. In Computer Vision and Image Understanding (CVIU), volume 110, pages , [2] E. Delage, H. Lee, and A. Y. Ng. Automatic single-image 3d reconstructions of indoor Manhattan world scenes [3] E. Delage, H. Lee, and A. Y. Ng. A dynamic Bayesian network model for autonomous 3d reconstruction from a single indoor image. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR, pages , [4] A. Gupta, A. Efros, and M. Hebert. Blocks world revisited: Image understanding using qualitative geometry and mechanics. In European Conference on Computer Vision, [5] V. Hedau, D. Hoiem, and D. Forsyth. Recovering the spatial layout of cluttered rooms. In International Conference on Computer Vision, [6] D. Hoiem, A. Efros, and M. Hebert. Geometric context from a single image. In International Conference on Computer Vision, [7] D. Hoiem, A. Efros, and M. Hebert. Recovering surface layout from an image. In International Journal of Computer Vision, volume 75, October [8] D. Hoiem, A. A. Efros, and M. Hebert. Putting objects in perspective. In Proc. IEEE Computer Vision and Pattern Recognition (CVPR), volume 2, pages , [9] D. C. Lee, M. Hebert, and T. Kanade. Geometric reasoning for single image structure recovery. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), [10] B. Liu, S. Gould, and D. Koller. Single image depth estimation from predicted semantic labels. In Conference on Computer Vision and Pattern Recognition, [11] D. G. Lowe. Distinctive image features from scale-invariant keypoints. In International Journal of Computer Vision, [12] A. Saxena, M. Sun, and A. Ng. Make3d: learning 3d scene structure from a single still image. IEEE transactions on pattern analysis and machine intelligence, 30: , 2009.

Build Panoramas on Android Phones

Build Panoramas on Android Phones Build Panoramas on Android Phones Tao Chu, Bowen Meng, Zixuan Wang Stanford University, Stanford CA Abstract The purpose of this work is to implement panorama stitching from a sequence of photos taken

More information

3D Scanner using Line Laser. 1. Introduction. 2. Theory

3D Scanner using Line Laser. 1. Introduction. 2. Theory . Introduction 3D Scanner using Line Laser Di Lu Electrical, Computer, and Systems Engineering Rensselaer Polytechnic Institute The goal of 3D reconstruction is to recover the 3D properties of a geometric

More information

Semantic Visual Understanding of Indoor Environments: from Structures to Opportunities for Action

Semantic Visual Understanding of Indoor Environments: from Structures to Opportunities for Action Semantic Visual Understanding of Indoor Environments: from Structures to Opportunities for Action Grace Tsai, Collin Johnson, and Benjamin Kuipers Dept. of Electrical Engineering and Computer Science,

More information

siftservice.com - Turning a Computer Vision algorithm into a World Wide Web Service

siftservice.com - Turning a Computer Vision algorithm into a World Wide Web Service siftservice.com - Turning a Computer Vision algorithm into a World Wide Web Service Ahmad Pahlavan Tafti 1, Hamid Hassannia 2, and Zeyun Yu 1 1 Department of Computer Science, University of Wisconsin -Milwaukee,

More information

CS231M Project Report - Automated Real-Time Face Tracking and Blending

CS231M Project Report - Automated Real-Time Face Tracking and Blending CS231M Project Report - Automated Real-Time Face Tracking and Blending Steven Lee, slee2010@stanford.edu June 6, 2015 1 Introduction Summary statement: The goal of this project is to create an Android

More information

Feature Tracking and Optical Flow

Feature Tracking and Optical Flow 02/09/12 Feature Tracking and Optical Flow Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem Many slides adapted from Lana Lazebnik, Silvio Saverse, who in turn adapted slides from Steve

More information

3D Model based Object Class Detection in An Arbitrary View

3D Model based Object Class Detection in An Arbitrary View 3D Model based Object Class Detection in An Arbitrary View Pingkun Yan, Saad M. Khan, Mubarak Shah School of Electrical Engineering and Computer Science University of Central Florida http://www.eecs.ucf.edu/

More information

The Scientific Data Mining Process

The Scientific Data Mining Process Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In

More information

MusicGuide: Album Reviews on the Go Serdar Sali

MusicGuide: Album Reviews on the Go Serdar Sali MusicGuide: Album Reviews on the Go Serdar Sali Abstract The cameras on mobile phones have untapped potential as input devices. In this paper, we present MusicGuide, an application that can be used to

More information

A PHOTOGRAMMETRIC APPRAOCH FOR AUTOMATIC TRAFFIC ASSESSMENT USING CONVENTIONAL CCTV CAMERA

A PHOTOGRAMMETRIC APPRAOCH FOR AUTOMATIC TRAFFIC ASSESSMENT USING CONVENTIONAL CCTV CAMERA A PHOTOGRAMMETRIC APPRAOCH FOR AUTOMATIC TRAFFIC ASSESSMENT USING CONVENTIONAL CCTV CAMERA N. Zarrinpanjeh a, F. Dadrassjavan b, H. Fattahi c * a Islamic Azad University of Qazvin - nzarrin@qiau.ac.ir

More information

Static Environment Recognition Using Omni-camera from a Moving Vehicle

Static Environment Recognition Using Omni-camera from a Moving Vehicle Static Environment Recognition Using Omni-camera from a Moving Vehicle Teruko Yata, Chuck Thorpe Frank Dellaert The Robotics Institute Carnegie Mellon University Pittsburgh, PA 15213 USA College of Computing

More information

Epipolar Geometry. Readings: See Sections 10.1 and 15.6 of Forsyth and Ponce. Right Image. Left Image. e(p ) Epipolar Lines. e(q ) q R.

Epipolar Geometry. Readings: See Sections 10.1 and 15.6 of Forsyth and Ponce. Right Image. Left Image. e(p ) Epipolar Lines. e(q ) q R. Epipolar Geometry We consider two perspective images of a scene as taken from a stereo pair of cameras (or equivalently, assume the scene is rigid and imaged with a single camera from two different locations).

More information

Object Recognition. Selim Aksoy. Bilkent University saksoy@cs.bilkent.edu.tr

Object Recognition. Selim Aksoy. Bilkent University saksoy@cs.bilkent.edu.tr Image Classification and Object Recognition Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr Image classification Image (scene) classification is a fundamental

More information

CS 534: Computer Vision 3D Model-based recognition

CS 534: Computer Vision 3D Model-based recognition CS 534: Computer Vision 3D Model-based recognition Ahmed Elgammal Dept of Computer Science CS 534 3D Model-based Vision - 1 High Level Vision Object Recognition: What it means? Two main recognition tasks:!

More information

Tracking in flussi video 3D. Ing. Samuele Salti

Tracking in flussi video 3D. Ing. Samuele Salti Seminari XXIII ciclo Tracking in flussi video 3D Ing. Tutors: Prof. Tullio Salmon Cinotti Prof. Luigi Di Stefano The Tracking problem Detection Object model, Track initiation, Track termination, Tracking

More information

Colorado School of Mines Computer Vision Professor William Hoff

Colorado School of Mines Computer Vision Professor William Hoff Professor William Hoff Dept of Electrical Engineering &Computer Science http://inside.mines.edu/~whoff/ 1 Introduction to 2 What is? A process that produces from images of the external world a description

More information

Geometric Camera Parameters

Geometric Camera Parameters Geometric Camera Parameters What assumptions have we made so far? -All equations we have derived for far are written in the camera reference frames. -These equations are valid only when: () all distances

More information

More Local Structure Information for Make-Model Recognition

More Local Structure Information for Make-Model Recognition More Local Structure Information for Make-Model Recognition David Anthony Torres Dept. of Computer Science The University of California at San Diego La Jolla, CA 9093 Abstract An object classification

More information

Monash University Clayton s School of Information Technology CSE3313 Computer Graphics Sample Exam Questions 2007

Monash University Clayton s School of Information Technology CSE3313 Computer Graphics Sample Exam Questions 2007 Monash University Clayton s School of Information Technology CSE3313 Computer Graphics Questions 2007 INSTRUCTIONS: Answer all questions. Spend approximately 1 minute per mark. Question 1 30 Marks Total

More information

Tracking of Small Unmanned Aerial Vehicles

Tracking of Small Unmanned Aerial Vehicles Tracking of Small Unmanned Aerial Vehicles Steven Krukowski Adrien Perkins Aeronautics and Astronautics Stanford University Stanford, CA 94305 Email: spk170@stanford.edu Aeronautics and Astronautics Stanford

More information

A Study on SURF Algorithm and Real-Time Tracking Objects Using Optical Flow

A Study on SURF Algorithm and Real-Time Tracking Objects Using Optical Flow , pp.233-237 http://dx.doi.org/10.14257/astl.2014.51.53 A Study on SURF Algorithm and Real-Time Tracking Objects Using Optical Flow Giwoo Kim 1, Hye-Youn Lim 1 and Dae-Seong Kang 1, 1 Department of electronices

More information

Automatic Labeling of Lane Markings for Autonomous Vehicles

Automatic Labeling of Lane Markings for Autonomous Vehicles Automatic Labeling of Lane Markings for Autonomous Vehicles Jeffrey Kiske Stanford University 450 Serra Mall, Stanford, CA 94305 jkiske@stanford.edu 1. Introduction As autonomous vehicles become more popular,

More information

Mean-Shift Tracking with Random Sampling

Mean-Shift Tracking with Random Sampling 1 Mean-Shift Tracking with Random Sampling Alex Po Leung, Shaogang Gong Department of Computer Science Queen Mary, University of London, London, E1 4NS Abstract In this work, boosting the efficiency of

More information

Intuitive Navigation in an Enormous Virtual Environment

Intuitive Navigation in an Enormous Virtual Environment / International Conference on Artificial Reality and Tele-Existence 98 Intuitive Navigation in an Enormous Virtual Environment Yoshifumi Kitamura Shinji Fukatsu Toshihiro Masaki Fumio Kishino Graduate

More information

Edge tracking for motion segmentation and depth ordering

Edge tracking for motion segmentation and depth ordering Edge tracking for motion segmentation and depth ordering P. Smith, T. Drummond and R. Cipolla Department of Engineering University of Cambridge Cambridge CB2 1PZ,UK {pas1001 twd20 cipolla}@eng.cam.ac.uk

More information

INTRODUCTION TO RENDERING TECHNIQUES

INTRODUCTION TO RENDERING TECHNIQUES INTRODUCTION TO RENDERING TECHNIQUES 22 Mar. 212 Yanir Kleiman What is 3D Graphics? Why 3D? Draw one frame at a time Model only once X 24 frames per second Color / texture only once 15, frames for a feature

More information

Arrangements And Duality

Arrangements And Duality Arrangements And Duality 3.1 Introduction 3 Point configurations are tbe most basic structure we study in computational geometry. But what about configurations of more complicated shapes? For example,

More information

Speed Performance Improvement of Vehicle Blob Tracking System

Speed Performance Improvement of Vehicle Blob Tracking System Speed Performance Improvement of Vehicle Blob Tracking System Sung Chun Lee and Ram Nevatia University of Southern California, Los Angeles, CA 90089, USA sungchun@usc.edu, nevatia@usc.edu Abstract. A speed

More information

Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report

Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 69 Class Project Report Junhua Mao and Lunbo Xu University of California, Los Angeles mjhustc@ucla.edu and lunbo

More information

Segmentation of building models from dense 3D point-clouds

Segmentation of building models from dense 3D point-clouds Segmentation of building models from dense 3D point-clouds Joachim Bauer, Konrad Karner, Konrad Schindler, Andreas Klaus, Christopher Zach VRVis Research Center for Virtual Reality and Visualization, Institute

More information

PARAMETRIC MODELING. David Rosen. December 1997. By carefully laying-out datums and geometry, then constraining them with dimensions and constraints,

PARAMETRIC MODELING. David Rosen. December 1997. By carefully laying-out datums and geometry, then constraining them with dimensions and constraints, 1 of 5 11/18/2004 6:24 PM PARAMETRIC MODELING David Rosen December 1997 The term parametric modeling denotes the use of parameters to control the dimensions and shape of CAD models. Think of a rubber CAD

More information

Fast Matching of Binary Features

Fast Matching of Binary Features Fast Matching of Binary Features Marius Muja and David G. Lowe Laboratory for Computational Intelligence University of British Columbia, Vancouver, Canada {mariusm,lowe}@cs.ubc.ca Abstract There has been

More information

IMPLICIT SHAPE MODELS FOR OBJECT DETECTION IN 3D POINT CLOUDS

IMPLICIT SHAPE MODELS FOR OBJECT DETECTION IN 3D POINT CLOUDS IMPLICIT SHAPE MODELS FOR OBJECT DETECTION IN 3D POINT CLOUDS Alexander Velizhev 1 (presenter) Roman Shapovalov 2 Konrad Schindler 3 1 Hexagon Technology Center, Heerbrugg, Switzerland 2 Graphics & Media

More information

Colour Image Segmentation Technique for Screen Printing

Colour Image Segmentation Technique for Screen Printing 60 R.U. Hewage and D.U.J. Sonnadara Department of Physics, University of Colombo, Sri Lanka ABSTRACT Screen-printing is an industry with a large number of applications ranging from printing mobile phone

More information

Circle Object Recognition Based on Monocular Vision for Home Security Robot

Circle Object Recognition Based on Monocular Vision for Home Security Robot Journal of Applied Science and Engineering, Vol. 16, No. 3, pp. 261 268 (2013) DOI: 10.6180/jase.2013.16.3.05 Circle Object Recognition Based on Monocular Vision for Home Security Robot Shih-An Li, Ching-Chang

More information

How To Analyze Ball Blur On A Ball Image

How To Analyze Ball Blur On A Ball Image Single Image 3D Reconstruction of Ball Motion and Spin From Motion Blur An Experiment in Motion from Blur Giacomo Boracchi, Vincenzo Caglioti, Alessandro Giusti Objective From a single image, reconstruct:

More information

MetropoGIS: A City Modeling System DI Dr. Konrad KARNER, DI Andreas KLAUS, DI Joachim BAUER, DI Christopher ZACH

MetropoGIS: A City Modeling System DI Dr. Konrad KARNER, DI Andreas KLAUS, DI Joachim BAUER, DI Christopher ZACH MetropoGIS: A City Modeling System DI Dr. Konrad KARNER, DI Andreas KLAUS, DI Joachim BAUER, DI Christopher ZACH VRVis Research Center for Virtual Reality and Visualization, Virtual Habitat, Inffeldgasse

More information

An Energy-Based Vehicle Tracking System using Principal Component Analysis and Unsupervised ART Network

An Energy-Based Vehicle Tracking System using Principal Component Analysis and Unsupervised ART Network Proceedings of the 8th WSEAS Int. Conf. on ARTIFICIAL INTELLIGENCE, KNOWLEDGE ENGINEERING & DATA BASES (AIKED '9) ISSN: 179-519 435 ISBN: 978-96-474-51-2 An Energy-Based Vehicle Tracking System using Principal

More information

TouchPaper - An Augmented Reality Application with Cloud-Based Image Recognition Service

TouchPaper - An Augmented Reality Application with Cloud-Based Image Recognition Service TouchPaper - An Augmented Reality Application with Cloud-Based Image Recognition Service Feng Tang, Daniel R. Tretter, Qian Lin HP Laboratories HPL-2012-131R1 Keyword(s): image recognition; cloud service;

More information

VEHICLE LOCALISATION AND CLASSIFICATION IN URBAN CCTV STREAMS

VEHICLE LOCALISATION AND CLASSIFICATION IN URBAN CCTV STREAMS VEHICLE LOCALISATION AND CLASSIFICATION IN URBAN CCTV STREAMS Norbert Buch 1, Mark Cracknell 2, James Orwell 1 and Sergio A. Velastin 1 1. Kingston University, Penrhyn Road, Kingston upon Thames, KT1 2EE,

More information

Practical Tour of Visual tracking. David Fleet and Allan Jepson January, 2006

Practical Tour of Visual tracking. David Fleet and Allan Jepson January, 2006 Practical Tour of Visual tracking David Fleet and Allan Jepson January, 2006 Designing a Visual Tracker: What is the state? pose and motion (position, velocity, acceleration, ) shape (size, deformation,

More information

Automatic Detection of PCB Defects

Automatic Detection of PCB Defects IJIRST International Journal for Innovative Research in Science & Technology Volume 1 Issue 6 November 2014 ISSN (online): 2349-6010 Automatic Detection of PCB Defects Ashish Singh PG Student Vimal H.

More information

Cloud tracking with optical flow for short-term solar forecasting

Cloud tracking with optical flow for short-term solar forecasting Cloud tracking with optical flow for short-term solar forecasting Philip Wood-Bradley, José Zapata, John Pye Solar Thermal Group, Australian National University, Canberra, Australia Corresponding author:

More information

Spatial location in 360 of reference points over an object by using stereo vision

Spatial location in 360 of reference points over an object by using stereo vision EDUCATION Revista Mexicana de Física E 59 (2013) 23 27 JANUARY JUNE 2013 Spatial location in 360 of reference points over an object by using stereo vision V. H. Flores a, A. Martínez a, J. A. Rayas a,

More information

Categorical Data Visualization and Clustering Using Subjective Factors

Categorical Data Visualization and Clustering Using Subjective Factors Categorical Data Visualization and Clustering Using Subjective Factors Chia-Hui Chang and Zhi-Kai Ding Department of Computer Science and Information Engineering, National Central University, Chung-Li,

More information

HAND GESTURE BASEDOPERATINGSYSTEM CONTROL

HAND GESTURE BASEDOPERATINGSYSTEM CONTROL HAND GESTURE BASEDOPERATINGSYSTEM CONTROL Garkal Bramhraj 1, palve Atul 2, Ghule Supriya 3, Misal sonali 4 1 Garkal Bramhraj mahadeo, 2 Palve Atule Vasant, 3 Ghule Supriya Shivram, 4 Misal Sonali Babasaheb,

More information

Face detection is a process of localizing and extracting the face region from the

Face detection is a process of localizing and extracting the face region from the Chapter 4 FACE NORMALIZATION 4.1 INTRODUCTION Face detection is a process of localizing and extracting the face region from the background. The detected face varies in rotation, brightness, size, etc.

More information

Interactive Segmentation, Tracking, and Kinematic Modeling of Unknown 3D Articulated Objects

Interactive Segmentation, Tracking, and Kinematic Modeling of Unknown 3D Articulated Objects Interactive Segmentation, Tracking, and Kinematic Modeling of Unknown 3D Articulated Objects Dov Katz, Moslem Kazemi, J. Andrew Bagnell and Anthony Stentz 1 Abstract We present an interactive perceptual

More information

EFFICIENT VEHICLE TRACKING AND CLASSIFICATION FOR AN AUTOMATED TRAFFIC SURVEILLANCE SYSTEM

EFFICIENT VEHICLE TRACKING AND CLASSIFICATION FOR AN AUTOMATED TRAFFIC SURVEILLANCE SYSTEM EFFICIENT VEHICLE TRACKING AND CLASSIFICATION FOR AN AUTOMATED TRAFFIC SURVEILLANCE SYSTEM Amol Ambardekar, Mircea Nicolescu, and George Bebis Department of Computer Science and Engineering University

More information

ROBUST VEHICLE TRACKING IN VIDEO IMAGES BEING TAKEN FROM A HELICOPTER

ROBUST VEHICLE TRACKING IN VIDEO IMAGES BEING TAKEN FROM A HELICOPTER ROBUST VEHICLE TRACKING IN VIDEO IMAGES BEING TAKEN FROM A HELICOPTER Fatemeh Karimi Nejadasl, Ben G.H. Gorte, and Serge P. Hoogendoorn Institute of Earth Observation and Space System, Delft University

More information

Object Recognition and Template Matching

Object Recognition and Template Matching Object Recognition and Template Matching Template Matching A template is a small image (sub-image) The goal is to find occurrences of this template in a larger image That is, you want to find matches of

More information

ACCURACY ASSESSMENT OF BUILDING POINT CLOUDS AUTOMATICALLY GENERATED FROM IPHONE IMAGES

ACCURACY ASSESSMENT OF BUILDING POINT CLOUDS AUTOMATICALLY GENERATED FROM IPHONE IMAGES ACCURACY ASSESSMENT OF BUILDING POINT CLOUDS AUTOMATICALLY GENERATED FROM IPHONE IMAGES B. Sirmacek, R. Lindenbergh Delft University of Technology, Department of Geoscience and Remote Sensing, Stevinweg

More information

Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite

Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite Philip Lenz 1 Andreas Geiger 2 Christoph Stiller 1 Raquel Urtasun 3 1 KARLSRUHE INSTITUTE OF TECHNOLOGY 2 MAX-PLANCK-INSTITUTE IS 3

More information

Least-Squares Intersection of Lines

Least-Squares Intersection of Lines Least-Squares Intersection of Lines Johannes Traa - UIUC 2013 This write-up derives the least-squares solution for the intersection of lines. In the general case, a set of lines will not intersect at a

More information

Cabri Geometry Application User Guide

Cabri Geometry Application User Guide Cabri Geometry Application User Guide Preview of Geometry... 2 Learning the Basics... 3 Managing File Operations... 12 Setting Application Preferences... 14 Selecting and Moving Objects... 17 Deleting

More information

Perception-based Design for Tele-presence

Perception-based Design for Tele-presence Perception-based Design for Tele-presence Santanu Chaudhury 1, Shantanu Ghosh 1,2, Amrita Basu 3, Brejesh Lall 1, Sumantra Dutta Roy 1, Lopamudra Choudhury 3, Prashanth R 1, Ashish Singh 1, and Amit Maniyar

More information

Projection Center Calibration for a Co-located Projector Camera System

Projection Center Calibration for a Co-located Projector Camera System Projection Center Calibration for a Co-located Camera System Toshiyuki Amano Department of Computer and Communication Science Faculty of Systems Engineering, Wakayama University Sakaedani 930, Wakayama,

More information

Human behavior analysis from videos using optical flow

Human behavior analysis from videos using optical flow L a b o r a t o i r e I n f o r m a t i q u e F o n d a m e n t a l e d e L i l l e Human behavior analysis from videos using optical flow Yassine Benabbas Directeur de thèse : Chabane Djeraba Multitel

More information

How To Fuse A Point Cloud With A Laser And Image Data From A Pointcloud

How To Fuse A Point Cloud With A Laser And Image Data From A Pointcloud REAL TIME 3D FUSION OF IMAGERY AND MOBILE LIDAR Paul Mrstik, Vice President Technology Kresimir Kusevic, R&D Engineer Terrapoint Inc. 140-1 Antares Dr. Ottawa, Ontario K2E 8C4 Canada paul.mrstik@terrapoint.com

More information

Analecta Vol. 8, No. 2 ISSN 2064-7964

Analecta Vol. 8, No. 2 ISSN 2064-7964 EXPERIMENTAL APPLICATIONS OF ARTIFICIAL NEURAL NETWORKS IN ENGINEERING PROCESSING SYSTEM S. Dadvandipour Institute of Information Engineering, University of Miskolc, Egyetemváros, 3515, Miskolc, Hungary,

More information

Augmented Reality Tic-Tac-Toe

Augmented Reality Tic-Tac-Toe Augmented Reality Tic-Tac-Toe Joe Maguire, David Saltzman Department of Electrical Engineering jmaguire@stanford.edu, dsaltz@stanford.edu Abstract: This project implements an augmented reality version

More information

FACE RECOGNITION BASED ATTENDANCE MARKING SYSTEM

FACE RECOGNITION BASED ATTENDANCE MARKING SYSTEM Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 2, February 2014,

More information

A Learning Based Method for Super-Resolution of Low Resolution Images

A Learning Based Method for Super-Resolution of Low Resolution Images A Learning Based Method for Super-Resolution of Low Resolution Images Emre Ugur June 1, 2004 emre.ugur@ceng.metu.edu.tr Abstract The main objective of this project is the study of a learning based method

More information

A Study of Automatic License Plate Recognition Algorithms and Techniques

A Study of Automatic License Plate Recognition Algorithms and Techniques A Study of Automatic License Plate Recognition Algorithms and Techniques Nima Asadi Intelligent Embedded Systems Mälardalen University Västerås, Sweden nai10001@student.mdh.se ABSTRACT One of the most

More information

Vision based Vehicle Tracking using a high angle camera

Vision based Vehicle Tracking using a high angle camera Vision based Vehicle Tracking using a high angle camera Raúl Ignacio Ramos García Dule Shu gramos@clemson.edu dshu@clemson.edu Abstract A vehicle tracking and grouping algorithm is presented in this work

More information

Optical Flow. Shenlong Wang CSC2541 Course Presentation Feb 2, 2016

Optical Flow. Shenlong Wang CSC2541 Course Presentation Feb 2, 2016 Optical Flow Shenlong Wang CSC2541 Course Presentation Feb 2, 2016 Outline Introduction Variation Models Feature Matching Methods End-to-end Learning based Methods Discussion Optical Flow Goal: Pixel motion

More information

A Comparative Study between SIFT- Particle and SURF-Particle Video Tracking Algorithms

A Comparative Study between SIFT- Particle and SURF-Particle Video Tracking Algorithms A Comparative Study between SIFT- Particle and SURF-Particle Video Tracking Algorithms H. Kandil and A. Atwan Information Technology Department, Faculty of Computer and Information Sciences, Mansoura University,El-Gomhoria

More information

Classifying Manipulation Primitives from Visual Data

Classifying Manipulation Primitives from Visual Data Classifying Manipulation Primitives from Visual Data Sandy Huang and Dylan Hadfield-Menell Abstract One approach to learning from demonstrations in robotics is to make use of a classifier to predict if

More information

Automatic Single-Image 3d Reconstructions of Indoor Manhattan World Scenes

Automatic Single-Image 3d Reconstructions of Indoor Manhattan World Scenes Automatic Single-Image 3d Reconstructions of Indoor Manhattan World Scenes Erick Delage, Honglak Lee, and Andrew Y. Ng Stanford University, Stanford, CA 94305 {edelage,hllee,ang}@cs.stanford.edu Summary.

More information

Real time vehicle detection and tracking on multiple lanes

Real time vehicle detection and tracking on multiple lanes Real time vehicle detection and tracking on multiple lanes Kristian Kovačić Edouard Ivanjko Hrvoje Gold Department of Intelligent Transportation Systems Faculty of Transport and Traffic Sciences University

More information

High Level Describable Attributes for Predicting Aesthetics and Interestingness

High Level Describable Attributes for Predicting Aesthetics and Interestingness High Level Describable Attributes for Predicting Aesthetics and Interestingness Sagnik Dhar Vicente Ordonez Tamara L Berg Stony Brook University Stony Brook, NY 11794, USA tlberg@cs.stonybrook.edu Abstract

More information

Tracking Groups of Pedestrians in Video Sequences

Tracking Groups of Pedestrians in Video Sequences Tracking Groups of Pedestrians in Video Sequences Jorge S. Marques Pedro M. Jorge Arnaldo J. Abrantes J. M. Lemos IST / ISR ISEL / IST ISEL INESC-ID / IST Lisbon, Portugal Lisbon, Portugal Lisbon, Portugal

More information

Neural Network based Vehicle Classification for Intelligent Traffic Control

Neural Network based Vehicle Classification for Intelligent Traffic Control Neural Network based Vehicle Classification for Intelligent Traffic Control Saeid Fazli 1, Shahram Mohammadi 2, Morteza Rahmani 3 1,2,3 Electrical Engineering Department, Zanjan University, Zanjan, IRAN

More information

Research on the Anti-perspective Correction Algorithm of QR Barcode

Research on the Anti-perspective Correction Algorithm of QR Barcode Researc on te Anti-perspective Correction Algoritm of QR Barcode Jianua Li, Yi-Wen Wang, YiJun Wang,Yi Cen, Guoceng Wang Key Laboratory of Electronic Tin Films and Integrated Devices University of Electronic

More information

ECE 533 Project Report Ashish Dhawan Aditi R. Ganesan

ECE 533 Project Report Ashish Dhawan Aditi R. Ganesan Handwritten Signature Verification ECE 533 Project Report by Ashish Dhawan Aditi R. Ganesan Contents 1. Abstract 3. 2. Introduction 4. 3. Approach 6. 4. Pre-processing 8. 5. Feature Extraction 9. 6. Verification

More information

Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall

Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall Automatic Photo Quality Assessment Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall Estimating i the photorealism of images: Distinguishing i i paintings from photographs h Florin

More information

Tracking Moving Objects In Video Sequences Yiwei Wang, Robert E. Van Dyck, and John F. Doherty Department of Electrical Engineering The Pennsylvania State University University Park, PA16802 Abstract{Object

More information

Probabilistic Latent Semantic Analysis (plsa)

Probabilistic Latent Semantic Analysis (plsa) Probabilistic Latent Semantic Analysis (plsa) SS 2008 Bayesian Networks Multimedia Computing, Universität Augsburg Rainer.Lienhart@informatik.uni-augsburg.de www.multimedia-computing.{de,org} References

More information

Compact Representations and Approximations for Compuation in Games

Compact Representations and Approximations for Compuation in Games Compact Representations and Approximations for Compuation in Games Kevin Swersky April 23, 2008 Abstract Compact representations have recently been developed as a way of both encoding the strategic interactions

More information

The Olympus stereology system. The Computer Assisted Stereological Toolbox

The Olympus stereology system. The Computer Assisted Stereological Toolbox The Olympus stereology system The Computer Assisted Stereological Toolbox CAST is a Computer Assisted Stereological Toolbox for PCs running Microsoft Windows TM. CAST is an interactive, user-friendly,

More information

Taking Inverse Graphics Seriously

Taking Inverse Graphics Seriously CSC2535: 2013 Advanced Machine Learning Taking Inverse Graphics Seriously Geoffrey Hinton Department of Computer Science University of Toronto The representation used by the neural nets that work best

More information

Solving Big Data Problems in Computer Vision with MATLAB Loren Shure

Solving Big Data Problems in Computer Vision with MATLAB Loren Shure Solving Big Data Problems in Computer Vision with MATLAB Loren Shure 2015 The MathWorks, Inc. 1 Why Are We Talking About Big Data? 100 hours of video uploaded to YouTube per minute 1 Explosive increase

More information

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic

More information

How To Fix Out Of Focus And Blur Images With A Dynamic Template Matching Algorithm

How To Fix Out Of Focus And Blur Images With A Dynamic Template Matching Algorithm IJSTE - International Journal of Science Technology & Engineering Volume 1 Issue 10 April 2015 ISSN (online): 2349-784X Image Estimation Algorithm for Out of Focus and Blur Images to Retrieve the Barcode

More information

A Distributed Render Farm System for Animation Production

A Distributed Render Farm System for Animation Production A Distributed Render Farm System for Animation Production Jiali Yao, Zhigeng Pan *, Hongxin Zhang State Key Lab of CAD&CG, Zhejiang University, Hangzhou, 310058, China {yaojiali, zgpan, zhx}@cad.zju.edu.cn

More information

Cloud-Empowered Multimedia Service: An Automatic Video Storytelling Tool

Cloud-Empowered Multimedia Service: An Automatic Video Storytelling Tool Cloud-Empowered Multimedia Service: An Automatic Video Storytelling Tool Joseph C. Tsai Foundation of Computer Science Lab. The University of Aizu Fukushima-ken, Japan jctsai@u-aizu.ac.jp Abstract Video

More information

Behavior Analysis in Crowded Environments. XiaogangWang Department of Electronic Engineering The Chinese University of Hong Kong June 25, 2011

Behavior Analysis in Crowded Environments. XiaogangWang Department of Electronic Engineering The Chinese University of Hong Kong June 25, 2011 Behavior Analysis in Crowded Environments XiaogangWang Department of Electronic Engineering The Chinese University of Hong Kong June 25, 2011 Behavior Analysis in Sparse Scenes Zelnik-Manor & Irani CVPR

More information

Navigation Aid And Label Reading With Voice Communication For Visually Impaired People

Navigation Aid And Label Reading With Voice Communication For Visually Impaired People Navigation Aid And Label Reading With Voice Communication For Visually Impaired People A.Manikandan 1, R.Madhuranthi 2 1 M.Kumarasamy College of Engineering, mani85a@gmail.com,karur,india 2 M.Kumarasamy

More information

A ROBUST BACKGROUND REMOVAL ALGORTIHMS

A ROBUST BACKGROUND REMOVAL ALGORTIHMS A ROBUST BACKGROUND REMOVAL ALGORTIHMS USING FUZZY C-MEANS CLUSTERING ABSTRACT S.Lakshmi 1 and Dr.V.Sankaranarayanan 2 1 Jeppiaar Engineering College, Chennai lakshmi1503@gmail.com 2 Director, Crescent

More information

Robot Perception Continued

Robot Perception Continued Robot Perception Continued 1 Visual Perception Visual Odometry Reconstruction Recognition CS 685 11 Range Sensing strategies Active range sensors Ultrasound Laser range sensor Slides adopted from Siegwart

More information

Virtual Fitting by Single-shot Body Shape Estimation

Virtual Fitting by Single-shot Body Shape Estimation Virtual Fitting by Single-shot Body Shape Estimation Masahiro Sekine* 1 Kaoru Sugita 1 Frank Perbet 2 Björn Stenger 2 Masashi Nishiyama 1 1 Corporate Research & Development Center, Toshiba Corporation,

More information

Learning 3-D Scene Structure from a Single Still Image

Learning 3-D Scene Structure from a Single Still Image Learning 3-D Scene Structure from a Single Still Image Ashutosh Saxena, Min Sun and Andrew Y. Ng Computer Science Department, Stanford University, Stanford, CA 94305 {asaxena,aliensun,ang}@cs.stanford.edu

More information

To determine vertical angular frequency, we need to express vertical viewing angle in terms of and. 2tan. (degree). (1 pt)

To determine vertical angular frequency, we need to express vertical viewing angle in terms of and. 2tan. (degree). (1 pt) Polytechnic University, Dept. Electrical and Computer Engineering EL6123 --- Video Processing, S12 (Prof. Yao Wang) Solution to Midterm Exam Closed Book, 1 sheet of notes (double sided) allowed 1. (5 pt)

More information

Whitepaper. Image stabilization improving camera usability

Whitepaper. Image stabilization improving camera usability Whitepaper Image stabilization improving camera usability Table of contents 1. Introduction 3 2. Vibration Impact on Video Output 3 3. Image Stabilization Techniques 3 3.1 Optical Image Stabilization 3

More information

Indoor Surveillance System Using Android Platform

Indoor Surveillance System Using Android Platform Indoor Surveillance System Using Android Platform 1 Mandar Bhamare, 2 Sushil Dubey, 3 Praharsh Fulzele, 4 Rupali Deshmukh, 5 Dr. Shashi Dugad 1,2,3,4,5 Department of Computer Engineering, Fr. Conceicao

More information

Solving Geometric Problems with the Rotating Calipers *

Solving Geometric Problems with the Rotating Calipers * Solving Geometric Problems with the Rotating Calipers * Godfried Toussaint School of Computer Science McGill University Montreal, Quebec, Canada ABSTRACT Shamos [1] recently showed that the diameter of

More information

Face Recognition in Low-resolution Images by Using Local Zernike Moments

Face Recognition in Low-resolution Images by Using Local Zernike Moments Proceedings of the International Conference on Machine Vision and Machine Learning Prague, Czech Republic, August14-15, 014 Paper No. 15 Face Recognition in Low-resolution Images by Using Local Zernie

More information

Introduction to Engineering System Dynamics

Introduction to Engineering System Dynamics CHAPTER 0 Introduction to Engineering System Dynamics 0.1 INTRODUCTION The objective of an engineering analysis of a dynamic system is prediction of its behaviour or performance. Real dynamic systems are

More information

Relating Vanishing Points to Catadioptric Camera Calibration

Relating Vanishing Points to Catadioptric Camera Calibration Relating Vanishing Points to Catadioptric Camera Calibration Wenting Duan* a, Hui Zhang b, Nigel M. Allinson a a Laboratory of Vision Engineering, University of Lincoln, Brayford Pool, Lincoln, U.K. LN6

More information

HANDS-FREE PC CONTROL CONTROLLING OF MOUSE CURSOR USING EYE MOVEMENT

HANDS-FREE PC CONTROL CONTROLLING OF MOUSE CURSOR USING EYE MOVEMENT International Journal of Scientific and Research Publications, Volume 2, Issue 4, April 2012 1 HANDS-FREE PC CONTROL CONTROLLING OF MOUSE CURSOR USING EYE MOVEMENT Akhil Gupta, Akash Rathi, Dr. Y. Radhika

More information