Computational Studies of Human Motion: Part 1, Tracking and Motion Synthesis


Foundations and Trends® in Computer Graphics and Vision Vol. 1, No. 2/3 (2005) © 2006 D.A. Forsyth, O. Arikan, L. Ikemoto, J. O'Brien, D. Ramanan. DOI: /

Computational Studies of Human Motion: Part 1, Tracking and Motion Synthesis

David A. Forsyth (University of Illinois Urbana-Champaign), Okan Arikan (University of Texas at Austin), Leslie Ikemoto (University of California, Berkeley), James O'Brien (University of California, Berkeley) and Deva Ramanan (Toyota Technological Institute at Chicago)

Abstract

We review methods for kinematic tracking of the human body in video. The review is part of a projected book that is intended to cross-fertilize ideas about motion representation between the animation and computer vision communities. The review confines itself to the earlier stages of motion, focusing on tracking and motion synthesis; future material will cover activity representation and motion generation. In general, we take the position that tracking does not necessarily involve (as is usually thought) complex multimodal inference problems. Instead, there are two key problems, both easy to state. The first is lifting, where one must infer the configuration of the body in three dimensions from image data. Ambiguities in lifting can result in multimodal inference problems, and we review what little is known about the extent to which a lift is ambiguous. The second is data association, where one must determine which pixels in an image

come from the body. We see a tracking by detection approach as the most productive, and review various human detection methods. Lifting, and a variety of other problems, can be simplified by observing temporal structure in motion, and we review the literature on data-driven human animation to expose what is known about this structure. Accurate generative models of human motion would be extremely useful in both animation and tracking, and we discuss the profound difficulties encountered in building such models. Discriminative methods, which should be able to tell whether an observed motion is human or not, do not work well yet, and we discuss why. There is an extensive discussion of open issues. In particular, we discuss the nature and extent of lifting ambiguities, which appear to be significant at short timescales and insignificant at longer timescales. This discussion suggests that the best tracking strategy is to track a 2D representation, and then lift it. We point out some puzzling phenomena associated with the choice of human motion representation (joint angles vs. joint positions). Finally, we give a quick guide to resources.

1 Tracking: Fundamental Notions

In a tracking problem, one has some measurements that appear at each tick of a (notional) clock, and, from these measurements, one would like to determine the state of the world. There are two important sources of information. First, measurements constrain the possible state of the world. Second, there are dynamical constraints: the state of the world cannot change arbitrarily from time to time. Tracking problems are of great practical importance. There are very good reasons to want to, say, track aircraft using radar returns (good summary histories include [51, 53, 188]; comprehensive reviews of technique in this context include [32, 39, 127]). Not all measurements are informative. For example, if one wishes to track an aircraft (where state might involve pose, velocity and acceleration variables, and measurements might be radar returns giving distance and angle to the aircraft from several radar aerials), some of the radar returns measured might not come from the aircraft. Instead, they might be the result of noise, of other aircraft, of strips of foil dropped to confuse radar apparatus (chaff or window; see [188]), or of other sources. The problem of determining which measurements are informative and which are not is known as data association.

Data association is the dominant difficulty in tracking objects in video. This is because so few of the very many pixels in each frame lie on objects of interest. It can be spectacularly difficult to tell which pixels in an image come from an object of interest and which do not. There are a very wide variety of methods for doing so, the details of which largely depend on the specifics of the application problem. Surprisingly, data association is not usually explicitly discussed in the computer vision tracking literature. However, whether a method is useful rests pretty directly on its success at data association; differences in other areas tend not to matter all that much in practice.

1.1 General observations

The literature on tracking people is immense. Furthermore, the problem has quite different properties depending on precisely what kind of representation one wishes to recover. The most important variable appears to be spatial scale.

At a coarse scale, people are blobs. For example, we might view a plaza from the window of a building or a mall corridor from a camera suspended from the ceiling. Each person occupies only a small block of pixels. While we should be able to tell where a person is, there isn't much prospect of determining where the arms and legs are. At this scale, we can expect to recover representations of occupancy (where people spend time; for example, [424]) or of patterns of activity (how people move from place to place, and at what time; for example, [377]).

At a medium scale, people can be thought of as blobs with attached motion fields. Consider, for example, a television program of a soccer match, where each individual covers relatively few pixels. In this case, one can tell where a person is. Arms and legs are still difficult to localize, because they cover relatively few pixels, and there is motion blur. However, the motion fields around the body yield some information as to how the person is moving.
One could expect to be able to tell where a runner is in the phase of the run from this information: are the legs extended away from the body, or crossing? At a fine scale, the arms and legs cover enough pixels to be detected, and one wants to report the configuration of the body.

We usually refer to this case as kinematic tracking. At a fine spatial scale, one may be able to report such details as whether a person is picking up or handling an object. There are a variety of ways in which one could encode and report configuration, depending on the model adopted (is one to report the configuration of the arms? the legs? the fingers?) and on whether these reports should be represented in 2D or in 3D. We will discuss various representations in greater detail later.

Each scale appears to be useful, but there are no reliable rules of thumb for determining what scale is most useful for what application. For example, one could see ways to tell whether people are picking up objects at a coarse scale. Equally, one could determine patterns of activity from a fine scale. Finally, some quite complex determinations about activity can be made at a surprisingly coarse scale. Tracking tends to be much more difficult at the fine scale, because one must manage more degrees of freedom and because arms and legs can be small, and can move rather fast. In this review, we focus almost entirely on the fine scale; even so, space will not allow detailed discussion of all that has been done. Our choice of scale is dictated by the intuition that good fine-scale tracking will be an essential component of any method that can give general reports on what people are doing in video. There are distinctive features of this problem that make fine-scale tracking difficult:

State dimension: One typically requires a high-dimensional state vector to describe the configuration of the body in a frame. For example, assume we describe a person using a 2D representation. Each of ten body segments (torso, head, upper and lower arms and legs) will be represented by a rectangle of fixed size (that differs from segment to segment).
This representation will use an absolute minimum of 12 state variables (position and orientation for one rectangle, and relative orientation for every other). A more practical version of the representation allows the rectangles to slide with respect to one another, and so needs 27 state variables. Considerably more variables are required for 3D models.
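The minimal count above can be checked mechanically. The ten segments and the 2D root pose (position plus orientation, with one relative orientation per attached segment) come from the text; the function and list names below are purely illustrative.

```python
# Count state variables for the minimal 2D body model described above:
# one root rectangle with full 2D pose (x, y, orientation), and every
# other segment attached by a single relative orientation.
SEGMENTS = ["torso", "head",
            "left_upper_arm", "left_lower_arm",
            "right_upper_arm", "right_lower_arm",
            "left_upper_leg", "left_lower_leg",
            "right_upper_leg", "right_lower_leg"]

ROOT_DOF = 3          # x, y, orientation for the root rectangle
RELATIVE_DOF = 1      # relative orientation for each attached segment

def minimal_state_dimension(segments):
    return ROOT_DOF + RELATIVE_DOF * (len(segments) - 1)

print(minimal_state_dimension(SEGMENTS))  # prints 12
```

The practical 27-variable version, which lets rectangles slide with respect to one another, simply adds slide parameters per joint to the same tally.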

Nasty dynamics: There is good evidence that such motions as walking have predictable, low-dimensional structure [335, 351]. However, the body can move extremely fast, with large accelerations. These large accelerations mean that one can stop moving predictably very quickly (for example, by jumping in the air during a walk). For straightforward mechanical reasons, the body parts that move fastest tend to be small and on one end of a long lever which has big muscles at the other end (forearms, fingers and feet, for example). This means that the body segments that the dynamical model fails to predict are going to be hard to find, because they are small. As a result, accurate tracking of forearms can be very difficult.

Complex appearance phenomena: In most applications one is tracking clothed people. Clothing can change appearance dramatically as it moves, because the forces the body applies to the clothing change, and so the pattern of folds, caused by buckling, changes. There are two important results. First, the pattern of occlusions of texture changes, meaning that the apparent texture of the body segment can change. Second, each fold will have a typical shading pattern attached, and these patterns move in the image as the folds move on the surface. Again, the result is that the apparent texture of the body segment changes. These effects can be seen in Figure 1.4.

Data association: There is usually no distinctive color or texture that identifies a person (which is why people are notoriously difficult to find in static images). One possible cue is that many body segments appear at a distinctive scale as extended regions with rather roughly parallel sides. This isn't too helpful, as there are many other sources of such regions (for example, the spines of books on a shelf). Textured backgrounds are a particularly rich source of false structures in edge maps.
Much of what follows is about methods to handle data association problems for people tracking.

1.2 Tracking by detection

Assume we have some form of template that can detect objects reasonably reliably. A good example might be a face detector. Assume that faces don't move all that fast, and there aren't too many in any given frame. Furthermore, the relationship between our representation of the state of a face and the image is uncomplicated. This occurs, for example, when the faces we view are always frontal or close to frontal. In this case, we can represent the state of the face by what it looks like (which, in principle, doesn't change because the face is frontal) and where it is.

Under these circumstances, we can build a tracker quite simply. We maintain a pool of tracks. We detect all faces in each incoming frame. We match faces to tracks, perhaps using an appearance model built from previous instances and also, at least implicitly, a dynamical model. This is where our assumptions are important; we would like faces to be sufficiently well-spaced, with respect to the kinds of velocities we expect, that there is seldom any ambiguity in this matching procedure. This matching procedure should not require one-to-one matches, meaning that some tracks may not receive a face, and some faces may not be allocated a track. For every face that is not attached to a track, we create a new track. Any track that has not received a face for several frames is declared to have ended (Algorithm 1 breaks out this approach).

This basic recipe for tracking by detection is worth remembering. In many situations, nothing more complex is required, and the recipe is used without comment in a variety of papers. As a simple example, at coarse scales and from the right view, background subtraction and looking for dark blobs of the right size is sufficient to identify human heads. Yan and Forsyth use this observation in a simple track-by-detection scheme, where heads are linked across frames using a greedy algorithm [424].
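A minimal sketch of this recipe follows. The detector is left abstract (detections are just 2D points), and the matching radius and patience threshold are assumed values standing in for application-specific choices; this is an illustration of the recipe, not any particular published system.

```python
import math

# Minimal tracking by detection, as in the recipe above: match each track
# to the nearest detection, spawn tracks for unmatched detections, and
# reap tracks that go unmatched for too long. Thresholds are illustrative.
MATCH_RADIUS = 30.0   # max distance between a track and its detection
PATIENCE = 3          # frames a track may survive without a detection

class Track:
    def __init__(self, pos, tid):
        self.pos, self.tid, self.misses = pos, tid, 0

def step(tracks, detections, next_id):
    """Advance the tracker by one frame; returns (tracks, next_id)."""
    unmatched = list(detections)
    for tr in tracks:
        if unmatched:
            d = min(unmatched, key=lambda p: math.dist(p, tr.pos))
            if math.dist(d, tr.pos) <= MATCH_RADIUS:
                tr.pos, tr.misses = d, 0
                unmatched.remove(d)
                continue
        tr.misses += 1                     # no detection for this track
    tracks = [t for t in tracks if t.misses <= PATIENCE]   # reap
    for d in unmatched:                    # spawn tracks promiscuously
        tracks.append(Track(d, next_id))
        next_id += 1
    return tracks, next_id
```

Driving `step` once per frame with the current frame's detector responses reproduces the create/match/reap cycle of Algorithm 1; a distinguishing feature (colour, size) would simply enter the matching cost.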
The method is effective for obtaining estimates of where people go in public spaces. The method will need some minor improvements and significant technical machinery as the relationship between state and image measurements grows more obscure. However, in this simple form, the

Assumptions: We have a detector which is reasonably reliable for all aspects that matter. Objects move relatively slowly with respect to the spacing of detector responses. As a result, a detector response caused either by another object or by a false positive tends to be far from the next true position of our object.

First frame: Create a track for each detector response.

N-th frame: Link tracks and detector responses. Typically, each track gets the closest detector response if it is not further away than some threshold. If the detector is capable of reporting some distinguishing feature (colour, texture, size, etc.), this can be used too. Spawn a new track for each detector response not allocated to a track. Reap any track that has not received a measurement for some number of frames.

Cleanup: We now have trajectories in space-time. Link any where this is justified (perhaps by a more sophisticated dynamical or appearance model, derived from the candidates for linking).

Algorithm 1: The simplest tracking by detection.

method gives some insight into general tracking problems. The trick of creating tracks promiscuously and then pruning any track that has not received a measurement for some time is quite general and extremely effective. The process of linking measurements to tracks is the aspect of tracking that will cause us the most difficulty (the other aspect, inferring states from measurements, is straightforward though technically involved). This process is made easier if measurements have features that distinctively identify the track from which they come. This can occur because, for example, a face will not change gender from frame to frame, or because tracks are widely spaced with respect

to the largest practical speed (so that allocating a measurement to the closest track is effective). All this is particularly useful for face tracking, because face detection (determining which parts of an image contain human faces, without reference to the individual identity of the faces) is one of the substantial successes of computer vision. Neither space nor energy allow a comprehensive review of this topic here. However, the typical approach is: one searches either rectangular or circular image windows over translation, scale and sometimes rotation; corrects illumination within these windows by methods such as histogram equalization; then presents these windows to a classifier which determines whether a face is present or not. There is then some post-processing on the classifier output to ensure that only one detection occurs at each face. This general picture appears in relatively early papers [299, 331, 332, 382, 383]. Points of variation include: the details of illumination correction; appropriate search mechanisms for rotation (cf. [334] and [339]); appropriate classifiers (cf. [259, 282, 333, 339] and [383]); and building an incremental classification procedure so that many windows are rejected early and so consume little computation (see [186, 187, 407, 408] and the huge derived literature). There are a variety of strategies for detecting faces using parts, an approach that is becoming increasingly common (compare [54, 173, 222, 253, 256] and [412]; faces are becoming a common category in so-called object category recognition, see, for example, [111]).

Background subtraction

The simplest detection procedure is to have a good model of the background. In this case, everything that doesn't look like the background is worth tracking.
The simplest background subtraction algorithm is to take an image of the background and then subtract it from each frame, thresholding the magnitude of the difference (there is a brief introduction to this area in [118]). Changes in illumination will defeat this approach. A natural improvement is to build a moving average estimate of the background, to keep track of illumination changes (e.g. see [343, 417]; gradients can be incorporated [250]). In outdoor scenes,

this approach is defeated by such phenomena as leaves moving in the wind. More sophisticated background models keep track of maximal and minimal values at each pixel [146], or build local statistical models at each pixel [59, 122, 142, 176, 177, 375, 376]. Under some circumstances, background subtraction is sufficient to track people and perform a degree of kinematic inference. Wren et al. describe a system, Pfinder, that uses background subtraction to identify body pixels, then identifies arm, torso and leg pixels by building blobby clusters [417]. Haritaoglu et al. describe a system called W4, which uses background subtraction to segment people from an outdoor view [146]. Foreground regions are then linked in time by applying a second-order dynamic model (velocity and acceleration) to propagate median coordinates (a robust estimate of the centroid) forward in time. Sufficiently close matches trigger a search process that matches the relevant foreground component in the previous frame to that in the current frame. Because people can pass one another or form groups, foreground regions can merge, split or appear. Regions appearing, splitting or merging are dealt with by creating (resp. fusing) tracks. Good new tracks can be distinguished from bad new tracks by looking forward in the sequence: a good track continues over time. Allowing a tracker to create new tracks fairly freely, and then telling good from bad by looking at the future in this way, is a traditional, and highly useful, trick in the radar tracking community (e.g. see the comprehensive book by Blackman and Popoli [39]). The background subtraction scheme is fairly elaborate, using a range of thresholds to obtain a good blob (Figure 1.1). The resulting blobs are sufficiently good that the contour can be parsed to yield a decomposition into body segments.
The method then segments the contours using convexity criteria, and tags the segments using: distance to the head (which is at the top of the contour); distance to the feet (which are at the bottom of the contour); and distance to the median (which is reasonably stable). All this works because, for most configurations of the body, one will encounter body segments in the same order as one walks around the contour (Figure 1.2). Shadows are a perennial nuisance for background subtraction, but this can be dealt with using a stereoscopic reconstruction, as Haritaoglu et al. show ([147]; see also [178]).
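A minimal sketch of the moving-average background model described earlier, with a threshold on the per-pixel difference. The learning rate and threshold values are assumptions; real systems such as W4 and Pfinder use considerably more elaborate statistical models at each pixel.

```python
import numpy as np

# Running-average background subtraction: the background estimate tracks
# slow illumination changes, and pixels far from it are flagged as
# foreground. ALPHA and THRESHOLD are illustrative choices.
ALPHA = 0.05       # background adaptation rate
THRESHOLD = 25.0   # intensity difference that counts as foreground

def update(background, frame):
    """Return (foreground mask, updated background) for one frame."""
    diff = np.abs(frame.astype(np.float64) - background)
    mask = diff > THRESHOLD
    # adapt the background only where the scene looks like background,
    # so foreground objects are not absorbed into the model too quickly
    background = np.where(mask, background,
                          (1 - ALPHA) * background + ALPHA * frame)
    return mask, background
```

Calling `update` once per frame keeps the model current; leaves moving in the wind defeat exactly this sketch, which is what motivates the per-pixel statistical models cited above.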

Fig. 1.1 Background subtraction identifies groups of pixels that differ significantly from a background model. The method is most useful for some cases of surveillance, where one is guaranteed a fixed viewpoint and a static background changing slowly in appearance. On the left, a background model; in the center, a frame; and on the right, the resulting image blobs. The figure is taken from Haritaoglu et al. [146]; in this paper, the authors use an elaborate method involving a combination of thresholds to obtain good blobs. Figure 1.2 illustrates a method due to these authors that obtains a kinematic configuration estimate by parsing the blob. Figure from W4: Real-time surveillance of people and their activities, Haritaoglu et al., IEEE Trans. Pattern Analysis and Machine Intelligence, 2000, © 2000 IEEE.

Fig. 1.2 For a given view of the body, body segments appear in the outline in a predictable manner. An example for a frontal view appears on the left. Haritaoglu et al. identify vertices on the outline of a blob using a form of convexity reasoning (right (b) and right (c)), and then infer labels for these vertices by measuring the distance to head (at the top), feet (at the bottom) and median (below right). These distances give possibly ambiguous labels for each vertex; by applying a set of topological rules obtained using examples of multiple views like that on the left, they obtain an unambiguous labelling. Figure from W4: Real-time surveillance of people and their activities, Haritaoglu et al., IEEE Trans. Pattern Analysis and Machine Intelligence, 2000, © 2000 IEEE.

Deformable templates

Appearance is a flexible term used to refer to aspects of an image that are being encoded and should be matched. Appearance models might encode such matters as: edge position; edge orientation; the distribution of color at some scale (perhaps as a histogram, perhaps as histograms for each of some set of spatially localized buckets); or texture (usually in terms of statistics of filter outputs).

A deformable template or snake is a parametric model of image appearance, usually used to localize structures. For example, one might have a template that models the outline of a squash [191, 192] or the outline of a person [33], place the template on the image in about the right place, and let a fitting procedure figure out the best position, orientation and parameters. We can write this out formally as follows. Assume we have some form of template that specifies image appearance as a function of some parameters. We write this template, which gives (say) image brightness (or color, or texture, and so on) as a function of space x and some parameters θ, as T(x; θ). We score a comparison between the image at frame n, which we write as I(x, t_n), and this template using a scoring function ρ(T(x; θ), I(x, t_n)).

A point template is built as a set of active sites within a model coordinate frame. These sites are to match keypoints identified in the image. We now build a model of acceptable sets of active sites, obtained as shape, location, etc., change. Such models can be built with, for example, the methods of principal component analysis (see, for example, [185]). We can now identify a match by obtaining image keypoints, building a correspondence between image keypoints and active sites on the template, and identifying parameters that minimize the fitting error.

An alternative is a curve template, an idea originating with the snakes of [191, 192].
We choose a parametric family of image curves (for example, a closed B-spline) and build a model of acceptable shapes,

using methods like principal component analysis on the control points. There is an excellent account of methods in the book of Blake and Isard [41]. We can now identify a match by summing values of some image-based potential function over a set of sample points on the curve. A particularly important case occurs when we want the sample points to be close to image points where there is a strong feature response (say, an edge point). It can be inconvenient to find every edge point in the image (a matter of speed), and this class of template allows us to search for edges only along short sections normal to the curve (an example of a gate).

Deformable templates have not been widely used as object detectors, because finding a satisfactory minimum (one that lies on the object of interest, most likely a global minimum) can be hard. The search is hard to initialize because one must identify the feature points that should lie within the gate of the template. However, in tracking problems this difficulty is mitigated if one has a dynamical model of some form. For example, the object might move slowly, meaning that the minimum for frame n will be a good start point for frame n + 1. As another example, the object might move with a large, but near constant, velocity. This means that we can predict a good start point for frame n + 1 given frame n. A significant part of the difficulty is caused by image features that don't lie on the object, meaning that another useful case occurs in the near absence of clutter: perhaps background subtraction, or the imaging conditions, ensure that there are few or no extra features to confuse the fitting process.

Baumberg and Hogg track people with a deformable template built using a B-spline as above, with principal components used to determine the template [33]. They use background subtraction to obtain an outline for the figure, then sample the outline.
For this kind of template, correspondence is generally a nuisance, but in some practical applications, this information can be supplied from quite simple considerations. For example, Baumberg and Hogg work with background subtracted data of pedestrians at fairly coarse scales from fixed views [33]. In this case, sampling the outline at fixed fractions of length, and starting at the lowest point on the principal axis yields perfectly acceptable correspondence information.
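The arc-length sampling and principal component shape model can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the sample and component counts are arbitrary choices, and the SVD stands in for whatever PCA machinery one prefers.

```python
import numpy as np

# Sketch of a Baumberg-and-Hogg-style shape model: sample each training
# outline at fixed fractions of arc length, stack the samples as vectors,
# and use principal component analysis to model acceptable shapes.

def resample_outline(points, n_samples=20):
    """Sample a closed outline at equal fractions of its arc length."""
    points = np.asarray(points, dtype=np.float64)
    closed = np.vstack([points, points[:1]])        # close the curve
    seg = np.linalg.norm(np.diff(closed, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])     # arc length at vertices
    targets = np.linspace(0.0, s[-1], n_samples, endpoint=False)
    x = np.interp(targets, s, closed[:, 0])
    y = np.interp(targets, s, closed[:, 1])
    return np.stack([x, y], axis=1)

def shape_model(outlines, n_components=5):
    """PCA over resampled outlines: mean shape and principal modes."""
    data = np.stack([resample_outline(o).ravel() for o in outlines])
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]
```

Sampling at fixed fractions of length, starting from a fixed reference point, is what supplies the correspondence information discussed in the text.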

Robustness

We have presented scoring a deformable template as a form of least-squares fitting problem. There is a basic difficulty in such problems. Points that are dramatically in error, usually called outliers and traditionally blamed on typist error [153, 330], can be overweighted in determining the fit. Outliers in vision problems tend to be unavoidable, because nature is so generous with visual data that there is usually something seriously misleading in any signal. There are a variety of methods for managing difficulties created by outliers that are used in building deformable template trackers.

An estimator is called robust if the estimate tends to be only weakly affected by outliers. For example, the average of a set of observations is not a robust estimate of the mean of their source (because if one observation is, say, mistyped, the average could be wildly incorrect). The median is a robust estimate, because it will not be much affected by the mistyped observation.

Gating (the scheme of finding edge points by searching out some distance along the normal from a curve) is one strategy to obtain robustness. In this case, one limits the distance searched. Ideally, there is only one edge point in the search window, but if there are more, one takes the closest (or strongest, mutatis mutandis, depending on application details). If there is nothing, one accepts some fixed score, chosen to make the cost continuous. This means that the cost function, while strictly not differentiable, is not dominated by very distant edge points. These are not seen in the gate, and there is an upper bound on the error any one site can contribute.

An alternative is to use an m-estimator. One would like to score the template with a function of squared distance between site and measured point.
This function should behave like the squared distance for small values and be close to some constant for large values (so that large values don't contribute large biases). A natural form is

ρ(u) = u / (u + σ)

so that, for d² small with respect to σ, we have ρ(d²) ≈ d²/σ, and, for d² large with respect to σ, we have ρ(d²) ≈ 1. The advantage of this

approach is that nearby edge points dominate the fit; the disadvantage is that even fitting problems that are originally convex are no longer convex when the strategy is applied. Numerical methods are consequently more complex, and one must use multiple start points. There is little hope of having a convex problem, because different start points correspond to different splits of the data set into important points and outliers; there is usually more than one such split. Again, large errors no longer dominate the estimation process, and the method is almost universally applied for flow templates.

The Hausdorff distance

The Hausdorff distance is a method to measure similarity between binary images (for example, edge maps; the method originates in Minkowski's work in convex analysis, where it takes a somewhat different form). Assume we have two sets of points P and Q; typically, each point is an edge point in an image. We define the Hausdorff distance between the two sets to be

H(P, Q) = max(h(P, Q), h(Q, P))

where

h(P, Q) = max over p in P of (min over q in Q of ‖p − q‖).

The distance is small if there is a point in Q close to each point in P and a point in P close to each point in Q. There is a difficulty with robustness, as the Hausdorff distance is large if there are points with no good matches. In practice, one uses a variant of the Hausdorff distance (the generalized Hausdorff distance) where the distance used is the k-th ranked of the available distances rather than the largest. Define F_k to be the operator that orders the elements of its input largest to smallest, then takes the k-th largest. We now have

H_k(P, Q) = max(h_k(P, Q), h_k(Q, P))

where

h_k(P, Q) = F_k over p in P of (min over q in Q of ‖p − q‖)

(for example, if there are 2n points in P, then h_n(P, Q) will give the median of the minimum distances). The advantage of all this is that some large distances get ignored.

Now we can compare a template P with an image Q by determining some family of transformations T(θ) and then choosing the set of parameters θ̂ that minimizes H_k(T(θ)P, Q). This will involve some form of search over θ. The search is likely to be simplified if, as applies in the case of tracking, we have a fair estimate of θ̂ to hand. Huttenlocher et al. track using the Hausdorff distance [165]. The template, which consists of a set of edge points, is itself allowed to deform. Images are represented by edge points. They identify the instance of the latest template in the next frame by searching over translations θ of the template to obtain the smallest value of H_k(T(θ)P, Q). They then translate the template to that location, and identify all edge points that are within some distance of the current template's edge points. The resulting points form the template for the next frame. This process allows the template to deform to take into account, say, the deformation of the body as a person moves. Performance in heavily textured video must depend on the extent to which the edge detection process suppresses edges, and on the setting of this distance parameter (a large distance and lots of texture is likely to lead to catastrophe).

1.3 Tracking using flow

The difficulty with tracking by detection is that one might not have a deformable template that fully specifies the appearance of an object. It is quite common to have a template that specifies the shape of the domain spanned by the object and the type of its transformation, but not what lies within. Typically, we don't know the pattern, but we do know how it moves. There are several important examples: Human body segments tend to look like a rectangle in any frame, and the motion of this rectangle is likely

to be either Euclidean or affine, depending on imaging circumstances. A face in a webcam tends to fill a blob-like domain and undergo mainly Euclidean transformations. This is useful for those building user interfaces where the camera on the monitor views the user, and there are numerous papers dealing with this. The face is not necessarily frontal (computer users occasionally look away from their monitors) but tends to be large, blobby and centered. Edge templates, particularly those specifying outlines, are usually used because we don't know what the interior of the region looks like. Quite often, as we have seen, we know how the template can deform and move. However, we cannot score the interior of the domain because we don't know (say) the pattern of clothing being worn.

In each of these cases, we cannot use tracking by detection as above because we do not possess an appropriate template. As a matter of experience, objects don't change appearance much from frame to frame (alternatively, we should use the term appearance to apply to properties that don't change much from frame to frame). All this implies that parts of the previous image could serve as a template if we have a motion model and domain model. We could use a correspondence model to link pixels in the domain in frame n with those in the domain in frame n + 1. A good linking should pair pixels that have similar appearances. Such considerations as camera properties, the motion of rigid objects, and computational expense suggest choosing the correspondence model from a small parametric family.

All this gives a formal framework. Write a pixel position in the n-th frame as x_n, the domain in the n-th frame as D_n, and the transformation from the n-th frame to the (n+1)-th frame as T_{n,n+1}(·; θ_n). In this notation, θ_n represents parameters for the transformation from the n-th frame to the (n+1)-th frame, and we have that x_{n+1} = T_{n,n+1}(x_n; θ_n). We assume we know D_n. We can obtain D_{n+1} from D_n as T_{n,n+1}(D_n; θ_n). Now we can score the parameters θ_n representing the

change in state between frames n + 1 and n by comparing D_n with D_{n+1} (which is a function of θ_n). We compute some representation of image information R(x), and, within the domain D_{n+1}, compare R(x_{n+1}) with R(T_{n→n+1}(x_n; θ_n)), where the transformation is applied to the domain D_n.

1.3.1 Optic flow

Generally, a frame-to-frame correspondence should be thought of as a flow field (or an optic flow field): a vector field in the image giving local image motion at each pixel. A flow field is fairly clearly a correspondence, and a correspondence gives rise to a flow field (put the tail of the vector at the pixel position in frame n, and the head at the position in frame n + 1). The notion of optic flow originates with Gibson (see, for example, [128]).

A useful construction in the optic flow literature assumes that image intensity is a continuous function of position and time, I(x, t). We then assume that the intensity of image patches does not change with movement. While this assumption may run into trouble with illumination models, specularities, etc., it is not outrageous for small movements. Furthermore, it underlies our willingness to compare pixel values in frames. Accepting this assumption, we have

    dI/dt = ∇I · dx/dt + I_t = 0

(known as the optic flow equation; see, e.g., [160]). Flow is represented by dx/dt. This is important because, if we confine our attention to an appropriate domain, comparing I(T(x; θ_n), t_{n+1}) with I(x, t_n) involves, in essence, estimating the total derivative. In particular,

    I(T(x; θ_n), t_{n+1}) − I(x, t_n) ≈ dI/dt.

Furthermore, the equivalence between correspondence and flow suggests a simpler form for the transformation of pixel values. We regard T(x; θ_n) as taking x from the tail of a flow arrow to the head. At short timescales, this justifies the view that T(x; θ_n) = x + δx(θ_n).
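The optic flow equation directly yields a least-squares flow estimator over a small patch, in the spirit of Lucas-Kanade. The sketch below is an illustration under the brightness-constancy assumption, not a method from the text: each pixel contributes one linearized constraint I_x u + I_y v + I_t = 0, and we solve for a single translation (u, v):

```python
import numpy as np

def patch_flow(I0, I1):
    """Least-squares flow for one patch under brightness constancy.

    Stacks the per-pixel constraint Ix*u + Iy*v + It = 0 and solves
    for the patch velocity (u, v) in pixels per frame.
    """
    # Spatial gradients (central differences) and temporal derivative.
    Iy, Ix = np.gradient(I0)
    It = I1 - I0
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)   # one row per pixel
    b = -It.ravel()
    v, *_ = np.linalg.lstsq(A, b, rcond=None)
    return v                                         # array [u, v]

# Synthetic check: a smooth patch translated by one pixel in x should
# yield an estimated flow close to (1, 0).
x = np.arange(32.)
y = np.arange(32.)
I0 = np.sin(0.3 * x)[None, :] * np.cos(0.2 * y)[:, None]
I1 = np.sin(0.3 * (x - 1.0))[None, :] * np.cos(0.2 * y)[:, None]
```

The linearization is only valid for small displacements, which is why practical trackers of this kind iterate and work coarse-to-fine.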

1.3.2 Image stabilization

This form of tracking can be used to build boxes around moving objects, a practice known as image stabilization. One has a moving object on a fairly uniform background, and would like to build a domain such that the moving object is centered on the domain. This has the advantage that one can look at relative, rather than absolute, motion cues. For example, one might take a soccer player running around a field, and build a box around the player. If one then fixes the box and its contents in one place, the vast majority of motion cues within the box are cues to how the player's body configuration is changing. As another example, one might stabilize a box around an aerial view of a moving vehicle; now the box contains all visual information about the vehicle's identity.

Efros et al. use a straightforward version of this method, where domains are rectangles and flow is pure translation, to stabilize boxes around people viewed at a medium scale (for example, in a soccer video) [100]. In some circumstances, good results can be obtained by matching a rectangle in frame n with the rectangle in frame n + 1 that has the smallest sum-of-squared differences, which might be found by blank search, assisted perhaps by velocity constraints. This is going to work best if the background is relatively simple (say, the constant green of a soccer field), as then the background isn't a source of noise, so the figure need not be segmented (Figure 1.3). For more complex backgrounds, the approach may still work if one performs background subtraction before stabilization. At a medium scale it is very difficult to localize arms and legs, but they do leave traces in the flow field.
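One stabilization step of this kind reduces to a brute-force SSD search over integer translations of the box. The function below is a hypothetical sketch of that step, not the implementation of [100]:

```python
import numpy as np

def stabilize_step(prev_frame, next_frame, box, radius=8):
    """Slide the box over next_frame and keep the translation with the
    smallest sum-of-squared differences (SSD) against the box contents
    in prev_frame.

    box is (row, col, height, width); the search is exhaustive over
    integer translations within `radius` pixels.
    """
    r, c, h, w = box
    template = prev_frame[r:r + h, c:c + w].astype(float)
    best, best_ssd = (0, 0), np.inf
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            rr, cc = r + dr, c + dc
            if rr < 0 or cc < 0 or rr + h > next_frame.shape[0] or cc + w > next_frame.shape[1]:
                continue  # candidate box falls outside the frame
            candidate = next_frame[rr:rr + h, cc:cc + w].astype(float)
            ssd = np.sum((candidate - template) ** 2)
            if ssd < best_ssd:
                best, best_ssd = (dr, dc), ssd
    return (r + best[0], c + best[1], h, w), best_ssd
```

A velocity constraint would simply center the search window on the predicted translation rather than on zero, shrinking `radius`.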
The stabilization procedure means that the flow information can be computed with respect to a torso coordinate system, resulting in a representation that can be used to match at a kinematic level, without needing an explicit representation of arm and leg configurations (Figure 1.3).

1.3.3 Cardboard people

Flow-based tracking has the advantage that one doesn't need an explicit model of the appearance of the template. Ju et al. build a model of legs in terms of a set of articulated rectangular patches ("cardboard people") [190]. Assume we have a domain D in the n-th image I(x, t_n)

and a flow field δx(θ) parametrized by θ. Now this flow field takes D to some domain in the (n + 1)-th image, and establishes a correspondence between pixels in the n-th and the (n + 1)-th image. Ju et al. score

    Σ_{x∈D} ρ(I_{n+1}(x + δx(θ)) − I_n(x)),

where ρ is some measure of image error, which is small when the two compare well and large when they are different. Notice that this is a very general approach to the tracking problem, with the difficulty that, unless one is careful about the flow model, the problem of finding a minimum might be hard. To our knowledge, the image score is always applied to pixel values, and it seems interesting to wonder what would happen if one scored a difference in texture descriptors. Typically, the score is not minimized directly, but is approximated with the optic flow equation and with a Taylor series. We have

    Σ_{x∈D} ρ(I(x + δx(θ), t_{n+1}) − I(x, t_n)).

Fig. 1.3 Flow based tracking can be useful for medium scale video. Efros et al. stabilize boxes around the torso of players in football video, using a sum of squared differences (SSD) as a cost function and straightforward search to identify the best translation values. As the figure on the left shows, the resulting boxes are stable with respect to the torso. On the top right, larger versions of the boxes for some cases. Note that, because the video is at medium scale, it is difficult to resolve arms and legs, which are severely affected by motion blur. Nonetheless, one can make a useful estimate of what the body is doing by computing an estimate of optic flow (bottom right, F_x, F_y), rectifying this estimate (bottom right, F_x^+, F_x^−, F_y^+, F_y^−) and then smoothing the result (bottom right, Fb_x^+, etc.). The result is a smoothed estimate of where particular velocity directions are distributed with respect to the torso, which can be used to match and label frames. Figure from "Recognizing Action at a Distance", Efros et al., IEEE Int. Conf. Computer Vision 2003, © 2003 IEEE.
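As an illustration of a robust flow score of the kind Ju et al. use, the sketch below substitutes pure translation for δx(θ) and the Geman-McClure function for ρ; both choices are stand-ins (the actual cardboard-people model uses articulated rectangular patches with a richer parametric flow):

```python
import numpy as np

def geman_mcclure(r, sigma=1.0):
    """A common robust error measure rho: quadratic for small residuals,
    saturating for large ones, so outlier pixels do not dominate."""
    return r**2 / (sigma**2 + r**2)

def score_flow(I_n, I_n1, domain, theta):
    """Score a pure-translation flow delta_x(theta) = theta over a domain.

    domain is a boolean mask on I_n; theta = (dr, dc) is an integer
    translation. Returns the sum over the domain of
    rho(I_{n+1}(x + delta_x(theta)) - I_n(x)).
    """
    dr, dc = theta
    rows, cols = np.nonzero(domain)
    # Keep pixels whose flow target falls inside the next frame.
    ok = (rows + dr >= 0) & (rows + dr < I_n1.shape[0]) & \
         (cols + dc >= 0) & (cols + dc < I_n1.shape[1])
    residual = I_n1[rows[ok] + dr, cols[ok] + dc] - I_n[rows[ok], cols[ok]]
    return geman_mcclure(residual).sum()
```

Minimizing this score over θ by gradient descent is exactly where the Taylor-series approximation via the optic flow equation enters: the robust score is differentiable in the residual, and the residual is linearized in θ.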

More information

Visibility optimization for data visualization: A Survey of Issues and Techniques

Visibility optimization for data visualization: A Survey of Issues and Techniques Visibility optimization for data visualization: A Survey of Issues and Techniques Ch Harika, Dr.Supreethi K.P Student, M.Tech, Assistant Professor College of Engineering, Jawaharlal Nehru Technological

More information

Speed Performance Improvement of Vehicle Blob Tracking System

Speed Performance Improvement of Vehicle Blob Tracking System Speed Performance Improvement of Vehicle Blob Tracking System Sung Chun Lee and Ram Nevatia University of Southern California, Los Angeles, CA 90089, USA sungchun@usc.edu, nevatia@usc.edu Abstract. A speed

More information

VECTORAL IMAGING THE NEW DIRECTION IN AUTOMATED OPTICAL INSPECTION

VECTORAL IMAGING THE NEW DIRECTION IN AUTOMATED OPTICAL INSPECTION VECTORAL IMAGING THE NEW DIRECTION IN AUTOMATED OPTICAL INSPECTION Mark J. Norris Vision Inspection Technology, LLC Haverhill, MA mnorris@vitechnology.com ABSTRACT Traditional methods of identifying and

More information

Data Clustering. Dec 2nd, 2013 Kyrylo Bessonov

Data Clustering. Dec 2nd, 2013 Kyrylo Bessonov Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms k-means Hierarchical Main

More information

3D Model based Object Class Detection in An Arbitrary View

3D Model based Object Class Detection in An Arbitrary View 3D Model based Object Class Detection in An Arbitrary View Pingkun Yan, Saad M. Khan, Mubarak Shah School of Electrical Engineering and Computer Science University of Central Florida http://www.eecs.ucf.edu/

More information

OPRE 6201 : 2. Simplex Method

OPRE 6201 : 2. Simplex Method OPRE 6201 : 2. Simplex Method 1 The Graphical Method: An Example Consider the following linear program: Max 4x 1 +3x 2 Subject to: 2x 1 +3x 2 6 (1) 3x 1 +2x 2 3 (2) 2x 2 5 (3) 2x 1 +x 2 4 (4) x 1, x 2

More information

Data Mining Practical Machine Learning Tools and Techniques

Data Mining Practical Machine Learning Tools and Techniques Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea

More information

The Visual Internet of Things System Based on Depth Camera

The Visual Internet of Things System Based on Depth Camera The Visual Internet of Things System Based on Depth Camera Xucong Zhang 1, Xiaoyun Wang and Yingmin Jia Abstract The Visual Internet of Things is an important part of information technology. It is proposed

More information

Image Estimation Algorithm for Out of Focus and Blur Images to Retrieve the Barcode Value

Image Estimation Algorithm for Out of Focus and Blur Images to Retrieve the Barcode Value IJSTE - International Journal of Science Technology & Engineering Volume 1 Issue 10 April 2015 ISSN (online): 2349-784X Image Estimation Algorithm for Out of Focus and Blur Images to Retrieve the Barcode

More information

Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report

Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 69 Class Project Report Junhua Mao and Lunbo Xu University of California, Los Angeles mjhustc@ucla.edu and lunbo

More information

Compact Representations and Approximations for Compuation in Games

Compact Representations and Approximations for Compuation in Games Compact Representations and Approximations for Compuation in Games Kevin Swersky April 23, 2008 Abstract Compact representations have recently been developed as a way of both encoding the strategic interactions

More information

Probabilistic Latent Semantic Analysis (plsa)

Probabilistic Latent Semantic Analysis (plsa) Probabilistic Latent Semantic Analysis (plsa) SS 2008 Bayesian Networks Multimedia Computing, Universität Augsburg Rainer.Lienhart@informatik.uni-augsburg.de www.multimedia-computing.{de,org} References

More information

This is an example of a fairly state of the art computer generated line drawing. It uses suggestive contours, slightly stylized strokes, and some

This is an example of a fairly state of the art computer generated line drawing. It uses suggestive contours, slightly stylized strokes, and some 1 This is an example of a fairly state of the art computer generated line drawing. It uses suggestive contours, slightly stylized strokes, and some visual emphasis effects, and its a pretty nice, effective

More information

Applications to Data Smoothing and Image Processing I

Applications to Data Smoothing and Image Processing I Applications to Data Smoothing and Image Processing I MA 348 Kurt Bryan Signals and Images Let t denote time and consider a signal a(t) on some time interval, say t. We ll assume that the signal a(t) is

More information

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( ) Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

More information

Part 4 fitting with energy loss and multiple scattering non gaussian uncertainties outliers

Part 4 fitting with energy loss and multiple scattering non gaussian uncertainties outliers Part 4 fitting with energy loss and multiple scattering non gaussian uncertainties outliers material intersections to treat material effects in track fit, locate material 'intersections' along particle

More information

Efficient Background Subtraction and Shadow Removal Technique for Multiple Human object Tracking

Efficient Background Subtraction and Shadow Removal Technique for Multiple Human object Tracking ISSN: 2321-7782 (Online) Volume 1, Issue 7, December 2013 International Journal of Advance Research in Computer Science and Management Studies Research Paper Available online at: www.ijarcsms.com Efficient

More information

AP Physics 1 and 2 Lab Investigations

AP Physics 1 and 2 Lab Investigations AP Physics 1 and 2 Lab Investigations Student Guide to Data Analysis New York, NY. College Board, Advanced Placement, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks

More information

Unit 7 Quadratic Relations of the Form y = ax 2 + bx + c

Unit 7 Quadratic Relations of the Form y = ax 2 + bx + c Unit 7 Quadratic Relations of the Form y = ax 2 + bx + c Lesson Outline BIG PICTURE Students will: manipulate algebraic expressions, as needed to understand quadratic relations; identify characteristics

More information

E190Q Lecture 5 Autonomous Robot Navigation

E190Q Lecture 5 Autonomous Robot Navigation E190Q Lecture 5 Autonomous Robot Navigation Instructor: Chris Clark Semester: Spring 2014 1 Figures courtesy of Siegwart & Nourbakhsh Control Structures Planning Based Control Prior Knowledge Operator

More information