Which Edges Matter?

Aayush Bansal, RI, CMU; Andrew Gallagher, Cornell University; Adarsh Kowdle, Microsoft; Devi Parikh, Virginia Tech; Larry Zitnick, Microsoft Research

Abstract

In this paper, we investigate the ability of humans to recognize objects using different types of edges. Edges arise in images because of several different physical phenomena, such as shadow boundaries, changes in material albedo or reflectance, changes to surface normals, and occlusion boundaries. By constructing synthetic photorealistic scenes, we control which edges are visible in a rendered image to investigate the relationship between human visual recognition and each edge type. We evaluate the information conveyed by each edge type through human studies on object recognition tasks. We find that edges related to surface normals and depth are the most informative, while texture and shadow edges can confuse recognition tasks. This work corroborates recent advances in practical vision systems where active sensors capture depth edges (e.g. Microsoft Kinect), as well as in edge detection, where progress is being made towards finding object boundaries instead of just pixel gradients. Further, we evaluate seven standard and state-of-the-art edge detectors based on the types of edges they find, by comparing the detected edges with known informative edges in the synthetic scene. We suggest that this evaluation method could lead to more informed metrics for gauging developments in edge detection, without requiring any human labeling. In summary, this work shows that human proficiency at object recognition relies on surface normal and depth edges, and suggests that future research should focus on explicitly modeling edge types to increase the likelihood of finding informative edges.

1. Introduction

Over the past several decades, object recognition approaches have leveraged edge-based reasoning [30, 43] or gradient information [14].
The progression of the vision community is marked in some ways by fundamental advances related to capturing edges, be it the classical Canny edge detector [6] or the more recent SIFT [24] and HOG descriptors [28] that capture gradient information effectively. The important role of edges and gradients in computer vision does not come as a surprise to a student of psychovisual processes: the importance of using edges is also supported by nature; the visual systems of mammals contain cells with gradient-based Gabor-like responses [11]. Though fundamental to visual processing, edges or gradients correspond to several different physical phenomena (Figure 1). For instance, object boundaries typically result in edges due to texture or albedo differences between objects. But an edge may also correspond to texture or albedo changes occurring within a single object. Lighting variations due to shadows or changes in surface normals also create edges. In fact, these phenomena are often correlated: depth discontinuities, lighting variations and object boundaries can coincide. This diverse nature of edges raises an important question: are some types of edges (i.e. edges resulting from certain phenomena) more useful for object recognition than others? If so, which ones? In this paper, we study this question in the context of the human visual system, and then use these findings to gauge progress in computer edge detection by measuring how well algorithms find the edges that contain the most information for (human) object recognition. Different applications benefit from localizing different types of edges. For instance, image segmentation approaches [1] attempt to find edges corresponding to object boundaries. Edges corresponding to depth discontinuities are used by stereo techniques. Texture classification [29], on the other hand, relies on the internal gradients or edges of objects.
This raises another relevant question: which types of edges are being found by standard automatic edge detection algorithms in computer vision?

Figure 1. When an edge detector is applied to a color image, the resulting edges are a superset of many different edge types. They typically include edges related to occlusion boundaries (red), shape properties such as shading (green), illumination, e.g. cast shadows (blue), and the texture of the various surfaces in the scene (yellow).

In this paper, we explore the question of which physical phenomena result in the most informative edges for object recognition in humans. Further, we investigate which physical phenomena correspond to the edges detected by standard edge detectors in computer vision. Clearly, to answer these questions, we need images for which the ground-truth phenomenon leading to each edge pixel can be determined. Labeling these in natural images would be prohibitively time-consuming. Instead, we leverage the advances made in computer graphics. We synthetically render photorealistic images of different scene models [21, 25, 33]. Since the structure of the scene is known, the physical phenomena that produce each edge in the rendered image can be easily determined. The use of computer graphics also allows us to control the illumination conditions, reflectance properties and albedo variations of objects in the scene. This allows us to control which phenomena, and hence which edge types, manifest themselves in the images. For instance, an object's albedo can be forced to be uniform, creating no albedo (i.e. no texture) edges. Similarly, gradients associated with changes in surface normals can be removed by applying uniform ambient lighting. This paper presents a first investigation of the human visual system's dependence on edges using computer manipulations of scenes. In our study, we find that texture edges confuse object recognition, while edges resulting from surface normal variations and depth discontinuities are most informative. In our analysis of standard edge detection algorithms, we quantify the percentage of informative edges that each captures.
We show that edge detectors find edges resulting from numerous physical phenomena, but the majority of detected edges result from object textures, which are also the least informative. However, we also find indications of solid progress: recent improvements to edge detection are leading to detectors that capture fewer uninformative edges than older detectors. Rather than relying on qualitative intuitions prevalent in the community, this paper explicitly quantifies the usefulness of different edge types for object recognition, using the human visual system as an existence proof, and characterizes different edge detection algorithms. In effect, this paper helps explain the success of recent edge-based methods and the surge in incorporating depth features for object recognition. It also provides inspiration for the design of next-generation features as additional sensing modalities (e.g. direct depth capture along with color) become more widespread. It may be beneficial to explicitly model occlusion, shadows, etc., or to train edge detectors to find more of the informative edges. Developing algorithms that automatically classify each edge by type, so that edges can be selectively fed into descriptors for recognizing objects, could also be beneficial.

2. Related work

This work is related to research in several broad subareas of cognitive science and computer vision, including edge and line analysis, human perception from edges, comparison of different descriptors for computer vision tasks, depth imaging, and the use of synthetic photorealistic scenes for advancing computer vision. Humans, of course, do not acquire depth data directly for visual processing, but understanding size, shape, and distance is clearly an important part of the visual system, as proposed by Marr [26]. The perception of depth and shape are tasks central to the human visual system.
Human depth perception incorporates evidence from stereo cues, motion cues, and monocular cues including texture gradients, lines and edges, occlusion, shading, and defocus [16]. In this work, we are particularly interested in the relationship between edges and object recognition in human perception.

Edge and line analysis: For decades, edge extraction has been a component of many computer vision and image processing systems. In many early works [3, 4, 10, 41], image edges were extracted, and scene reconstruction was attempted based on these edges. However, in practice, these methods faced a fundamental problem: image edges are not necessarily related to structural edges in the scene. Instead, some image edges are related to image noise, or to illumination and albedo changes in the scene. Our work attempts to explicitly quantify the usefulness of these different types of edges for object recognition.

Human perception from edges and lines: It is well known that humans can perceive depth from an image of edges (i.e. from a line drawing). Studies have explored the ability of humans to perceive shape from various types of images [5, 7, 8, 20, 22]. In [8], subjects demonstrated effective surface normal estimation when shown artist renditions of various objects. Lines drawn by a skilled artist can convey shape nearly as well as shaded objects. However, the mechanisms used by artists to produce line drawings are not computational models; hence, unlike our study, they do not give us insights into which types of edges matter. In [40], it is shown that humans exhibit roughly similar fMRI patterns when viewing either line or color images, suggesting that the interpretation of both modalities is similar (and edge-based). However, line drawings do not capture all available scene information: [39] shows that subjects have systematic misinterpretations of shape in line drawings. Algorithmic approaches to render line drawings of 3D meshes that effectively convey visual information to viewers have been proposed [12, 19].

Comparison of image features: Several works have compared the performance of various image features, be it for low-level tasks such as matching image patches [21, 27, 34] or for high-level tasks such as image classification [42]. The goal of this paper is not to compare existing computational image descriptors. Instead, it is to evaluate the impact of the manifestations of different physical phenomena in image content on object recognition performance. Moreover, while we characterize the properties of various edge detection algorithms in computer vision, we study recognition in humans and not machines.

Depth imaging: The past several years have seen an explosion of practical systems for capturing depth either directly (e.g.
the PointGrey BumbleBee camera and Microsoft Kinect), by reasoning about depth in the scene from many images [37], or by inferring depth from single images by extracting mid- and low-level features [18, 32]. For computer vision applications, depth-based features have shown remarkable efficacy. For example, detection of humans and their pose is accomplished with depth-based difference features and no visual features in [36]. Incorporating depth features in addition to RGB features was shown to boost patch-matching performance [21]. With the advent of many new sensing modalities, it is useful to analyze what information we should be attempting to extract from them. For instance, do we need the absolute depth values, or simply the depth discontinuities? Or perhaps the discontinuities in surface normals? Our study attempts to address these questions. By studying human object recognition, our findings are independent of specific algorithmic choices. (It would be more accurate to say that we choose humans as the algorithm to study, since the human visual system is the best known vision system to date.)

Use of synthetic scenes: Synthetic images and video data have been used to advance computer vision in a variety of ways, including evaluating the performance of tracking and surveillance algorithms [38], training classifiers for pedestrian detection [25] or human pose estimation [36], learning where to grasp objects [31], and evaluating image features for matching patches [21]. In this work we use images of synthetic scenes to understand the informativeness of different types of edges for human object recognition.

3. Creating Virtual Worlds

In this work, we consider the following four edge types:

Shadow edges: observed at the boundaries of cast shadows, e.g. [23].

Texture edges: corresponding to a surface texture or a change in the reflectance properties (e.g. albedo) across the smooth surface of one or more materials (e.g. a stripe on a zebra).
Occlusion boundaries: often observed at the boundary of an object, which likely occludes another object having different illumination, surface properties, or surface normals.

Surface normal discontinuities: indicating intersecting surfaces or creases where surfaces have distinct surface normals, and are therefore illuminated differently.

An edge observed in an image can result from a number of different arrangements of light and materials in a scene (Figure 1). We create synthetic scenes and manipulate the renderings so that specific types of edges can be selectively rendered. We model and render our virtual scenes using SketchUp 8 and Trimble 3D Warehouse powered by Google. 3D Warehouse is a vast repository of user-created models of objects, scenes, and buildings, some of which are incredibly detailed. SketchUp allows for the creation of rendering styles that control the appearance of faces and edges in the 3D model, and has a simple user interface for placing objects into scenes. Further, controls are available for illumination sources (ambient and direct illumination), and shadows can be separately toggled on or off. Although more sophisticated rendering packages could be used, SketchUp has user-interface advantages, and we find that our subjects could not reliably distinguish the rendered images from actual photographs.

3.1. Objects and Scenes

We assembled a 3D scene dataset containing 46 unique scenes by searching Google 3D Warehouse for models from the following 7 scene categories: bathroom (6), kitchen (10), living room (8), office (5), house (10), patio (3), and playground (4). Our object dataset contains 54 different objects, including sandal, teddy bear, frying pan, tricycle, chair, faucet, grill, tree, laptop, car, person, books, etc. (Models were obtained from sketchup.google.com/3dwarehouse.) In each scene, 4 to 12 objects are placed manually by the authors in a contextually appropriate position (e.g. a corkscrew is placed on a countertop, not on the floor). The same object may be placed in multiple scenes (e.g. a teddy bear may be on a living room sofa and also on a kitchen floor), but with a different viewpoint.

3.2. Rendering

We render each of the 46 synthetic scenes in 6 different ways. Edges in these images correspond to different combinations of the edge types. In addition, a total of 5 edge images are produced, for a total of 11 images per scene. Table 1 describes the renderings, the edge images, and the edge types contained in each. For each scene, all renderings are from the same camera viewpoint. The edge renderings (7-11) are made by applying a basic edge detector to the corresponding rendered image (more details in Section 3.3). A few points about producing the other renderings in Table 1 with Google SketchUp:

1) RGB: obtained by rendering with textures, shading, and shadows on. This rendering corresponds most closely to what we experience in everyday life.
2) BW: the grayscale rendering, produced simply by converting the RGB rendering to gray.
3) Albedo: to render only the albedo of the scene, the rendering is performed with textures on but uniform lighting.
4) GraySurf: all Lambertian surfaces are replaced with a uniform-gray material having constant albedo. Shading is still visible, but shadows are turned off.
5) GrayShad: identical to GraySurf, but with shadows (on faces in the 3D model and on the ground) turned on.
6) Depth: there is no direct way in SketchUp to export a depth map. We use the following trick: after turning off all textures and shading, the scene is rendered with linear black fog.
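Under this fog trick, a pixel's code value maps linearly back to camera distance. A minimal numpy sketch of the inversion; the near/far fog distances are illustrative, and it assumes un-fogged surfaces render pure white in an 8-bit image:

```python
import numpy as np

def depth_from_fog(gray, near=1.0, far=20.0):
    """Recover per-pixel depth from a linear-black-fog rendering.

    Assumes textures and shading are off so every surface renders white,
    and the fog blends linearly to black between `near` and `far`
    (scene units; the values here are illustrative, not from the paper).
    `gray` is an 8-bit grayscale rendering as a numpy array.
    """
    frac = 1.0 - gray.astype(np.float64) / 255.0  # 0 at near, 1 at far
    return near + frac * (far - near)
```

For example, a pure-white pixel (code 255) maps to the near distance and a pure-black pixel (code 0) to the far distance.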
The rendered image code value is then linearly related to the distance from the camera.

3.3. Extracting edges

We compute edges in five of the six SketchUp renderings described above to produce renderings 7-11 of Table 1. (Most edge detectors convert an RGB image to grayscale before extracting edges, hence edges from BW and RGB are assumed to be identical.) A typical approach to visualizing edges in images is to obtain the gradient profile and perform global normalization. However, that approach retains only gradient edges that are globally dominant, while globally weak edges are eliminated. To obtain a rendering where the user can clearly observe all locally dominant gradients, we use the locally normalized gradients of Zitnick [44], where the gradient profile is normalized with respect to the average gradient magnitude in a small local neighborhood. Examples of all 11 renderings of a scene can be seen in Figure 2. Similar renderings of all 46 scenes, along with a list of the 54 objects in our dataset, can be found in the supplementary material.

4. Experiments and Results

4.1. Calibration

We first wish to assess how realistic our synthetic scenes are. For each of the 7 scene categories in our dataset, we collected 15 real and 15 synthetic images (a subset of which were the 46 images in our dataset). We analyze the distribution of gradients for the synthetic images and compare them to the profile of real images. Figure 5(a) shows the result: the gradient profiles are very similar. We also conducted a study in which 10 subjects were shown each (grayscale) scene for 107 ms (as in [13]) and asked to classify it as real or synthetic. Their average accuracy was 56%, not much higher than chance (50%). We also had subjects recognize the scene category of the image. Average accuracy was 68%, significantly above chance (roughly 15%). This suggests that 107 ms is sufficient for subjects to understand the context of the scene, and yet they cannot reliably tell whether the scene is synthetic or real.
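The locally normalized gradients of [44], used in Section 3.3 to produce the edge renderings, divide each pixel's gradient magnitude by the average magnitude in a small neighborhood. A numpy-only sketch under our reading of that idea; the window radius and epsilon are illustrative, and the box filter is implemented with cumulative sums to avoid extra dependencies:

```python
import numpy as np

def box_mean(a, r):
    """Mean of `a` over a (2r+1) x (2r+1) window, via summed-area tables."""
    p = np.pad(a, r, mode='edge')            # replicate borders
    c = p.cumsum(0).cumsum(1)                # 2-D cumulative sum
    c = np.pad(c, ((1, 0), (1, 0)))          # zero row/col for clean diffs
    k = 2 * r + 1
    s = c[k:, k:] - c[:-k, k:] - c[k:, :-k] + c[:-k, :-k]
    return s / (k * k)

def locally_normalized_gradients(img, r=4, eps=1e-6):
    """Gradient magnitude normalized by its local average, so that
    locally dominant (not just globally dominant) edges stand out."""
    gy, gx = np.gradient(img.astype(np.float64))
    mag = np.hypot(gx, gy)
    return mag / (box_mean(mag, r) + eps)
```

A flat image yields zero response everywhere, while a weak edge in a low-contrast region is boosted relative to global normalization.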
These tests indicate that our synthetic images are realistic looking.

4.2. Human Object Recognition Studies

The goal of our human studies is to determine what types of edges are most informative to humans for recognizing objects. Our experimental setup is shown in Figure 3. We focus the attention of our subjects by flashing a 3-second countdown followed by a prompt to focus on a red dot. The red dot is located on a white field and indicates the location of an object to be recognized in the following scene. The scenes, as described earlier, are rendered according to one of our 11 renderings. After viewing the scene, each subject is asked to identify the indicated object using free-form text; they were instructed to use one word or a short phrase for the object name. The scene is flashed for either 107 ms or 500 ms, following the approach of Fei-Fei et al. [13]. The shorter time frame does not allow the subject to change fixation point, while the longer time frame allows for one or two quick saccades before the scene disappears. We conducted these tests for each combination of object, scene and rendering style as described in Section 3. We manually marked the location of the objects in the scenes to generate the red dots, which resulted in 265 unique object instances for human subjects to recognize in each of the 11 renderings. Each of these 2915 tests was performed by 10 different subjects. All experiments were performed on Amazon Mechanical Turk, a crowd-sourcing service that has been used by several works analyzing the abilities of human subjects at visual tasks, e.g. [45].
Figure 3. The timing sequence used in our human object recognition studies.

Figure 4. The grading interface used to evaluate human object recognition performance.

Human performance on the rendered images (Figure 5(b)) is also interesting. As more time is given, renderings with more information, such as RGB, BW, and GrayShad, obtain a more significant performance boost. This may suggest that a larger duration is needed for information discovery. More interestingly, Figure 5(c) shows human performance on the edge images. At 107 ms viewing time, the presence of more edges in RGB_E and GrayShad_E reduces recognition performance compared with the reduced edges in GraySurf_E. Also, GraySurf_E performs better than the similarly rendered GrayShad_E, which has the addition of shadows, indicating that shadows may prove distracting. We also find that the information in depth edges alone (Depth_E) is not enough for object recognition and results in poor performance; the shape information provided by surface normals (visible in GraySurf_E) is critical. As the viewing time increases to 500 ms, the performance gap between GraySurf_E, Albedo_E, and RGB_E disappears, perhaps as higher-level processes kick in.

4.3. Edge Analysis

We now study the types of edges found by a variety of machine edge detection and segmentation algorithms: thresholding the gradient of a pixel (grad), Canny edge detection [6], normalized cuts [35] (ncut), mean shift [9] (ms), Felzenszwalb and Huttenlocher [15] (fh), ultrametric contour maps [1] (ucm), and Zitnick's approach [44] (ngrad) presented in Section 3.3. The implementation details of the first six approaches can be found in [45]; we used the same parameter settings as in [45], which were chosen to match human edge detection performance. We detected edges with each of these approaches in the RGB image. Since the geometry and lighting of each scene is known, each detected edge can be matched to the physical phenomena associated with it.
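The simplest baseline in this comparison, grad, just thresholds per-pixel gradient magnitude. A minimal sketch; the threshold value here is illustrative, not the setting used in the paper (which follows [45]):

```python
import numpy as np

def grad_edges(gray, thresh=0.1):
    """Binary edge map from thresholded gradient magnitude.

    `gray` is a float image in [0, 1]; `thresh` is an illustrative
    free parameter of this baseline detector.
    """
    gy, gx = np.gradient(gray.astype(np.float64))
    return np.hypot(gx, gy) > thresh
```

On a step image this fires along the step; on a constant image it fires nowhere, which is exactly why such a detector responds to texture and shadow gradients as readily as to occlusions.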
We produce a map for each edge type (texture, occlusion, shadow, surface normal) by analyzing the original six renderings of each scene. For instance, edge detection on the depth map yields occlusion edges. Edges that appear in GrayShad_E but not in GraySurf_E correspond to shadow edges. Edges that appear in Albedo_E are texture edges. Edges in GraySurf_E that are not occlusion edges are surface normal edges. An illustration of our edge classification for an example scene is shown in Figure 6(a). For the visualization, if an edge pixel is associated with multiple edge types, we assign it to the rarer type; the order of precedence is Depth, GraySurf, GrayShad, and Albedo. An analysis of the types of edges found by the different edge detection algorithms is shown in Figures 6(b) and 6(c), where each detected edge pixel can be assigned to multiple source edge types if applicable. Precision is the proportion of the edges retrieved by each edge detector that are relevant edges of the specific type (Figure 6(b)). UCM has relatively high precision for detecting the informative surface normal and depth edges, while a larger proportion of the edges found by other detectors (e.g. ngrad) correspond to the less informative texture edges. Figure 6(c) shows recall, i.e. the proportion of the relevant edges of the specific type retrieved by each edge detector. We consolidate these statistics in Figure 6(d): we treat the occlusion and surface normal edges as the positive class and the remaining pixels in the image as the negative class, compute the proportion of positive pixels detected by an edge detector and the proportion of negative pixels left undetected, and report the average of these two accuracies. We see that the mean-shift segmentation algorithm performs surprisingly well. It is followed by UCM, a recent state-of-the-art edge detector, which outperforms several classical approaches even on this new evaluation.
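The precedence rule used for the Figure 6(a) visualization can be sketched as follows; the label codes, mask names, and function name are illustrative, and the inputs are boolean per-pixel masks derived from the Depth_E, GraySurf_E, GrayShad_E and Albedo_E renderings:

```python
import numpy as np

# Illustrative label codes for the edge-type map.
NONE, TEXTURE, SHADOW, NORMAL, OCCLUSION = 0, 1, 2, 3, 4

def classify_edges(occlusion, normal, shadow, texture):
    """Assign each edge pixel a single type, with rarer types taking
    precedence (occlusion > surface normal > shadow > texture)."""
    labels = np.zeros(occlusion.shape, dtype=np.uint8)
    # Write in increasing precedence so later (rarer) types win.
    labels[texture] = TEXTURE
    labels[shadow] = SHADOW
    labels[normal] = NORMAL
    labels[occlusion] = OCCLUSION
    return labels
```

A pixel that is both a texture and an occlusion edge therefore ends up labeled as occlusion, matching the precedence described above.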
To design improved edge detectors, we suggest that it may be useful to perform this type of detailed evaluation of edge detectors. (This is similar in philosophy to Hoiem et al. [17], who suggest more detailed evaluation of object detectors.) Useful edge detectors are those that, on a large variety of (possibly computer-generated) images, find informative edges associated with surface normals and occlusions, but ignore shadow and texture edges.

Figure 5. (a) We verified that real and our computer-rendered images have similar low-level gradient statistics, and were difficult for subjects to distinguish. Consequently, we are confident that our human studies provide insight into visual processes on natural imagery. (b) and (c) show object recognition performance by humans when viewing (b) various renderings of the scene and (c) edges extracted from these renderings. When viewing edge images (c), humans achieve the best performance when texture and shadow edges are absent (as in GraySurf_E at 107 ms) and the visible edges correspond to occlusion and surface normal edges.

Figure 6. (a) Edges found in the RGB image using the state-of-the-art UCM [1] contour detector, classified by edge type. Yellow: texture edges; blue: shadow edges; green: surface normal edges; red: occlusion edges; magenta: none of the above. The majority of detected edges are texture edges, which lack discriminative information for object recognition. The precision (b) and recall (c) of each edge detector for each of the four edge types are also shown. UCM has relatively high precision for detecting the informative surface normal and depth edges, while other detectors (e.g. ngrad) detect a larger proportion of the less informative texture edges. (d) summarizes the performance of each detector in identifying the informative (occlusion + surface normal) edges. Best viewed in color.
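The Figure 6(d) summary score, which requires no human annotation, can be sketched as follows; the function name is illustrative:

```python
import numpy as np

def informative_edge_score(detected, informative):
    """Balanced accuracy of a detector against ground-truth informative
    (occlusion + surface normal) edges: the average of the fraction of
    informative pixels detected and the fraction of the remaining
    pixels left undetected. Both inputs are boolean masks.
    """
    pos = informative
    neg = ~informative
    tpr = (detected & pos).sum() / max(pos.sum(), 1)
    tnr = (~detected & neg).sum() / max(neg.sum(), 1)
    return 0.5 * (tpr + tnr)
```

A detector that fires exactly on the informative edges scores 1.0, one that detects nothing scores 0.5, so the metric rewards finding informative edges while penalizing texture and shadow responses.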
Notice that this methodology removes the need for manual annotation of images to evaluate edge detectors.

5. Conclusion

We study the usefulness of different edge types for object recognition in humans. We approach this problem in a novel way by constructing scenes in a virtual world, controlling surface properties and illumination when rendering images of the scenes, and then applying edge detectors to the resulting images. This allows us to identify the source of each edge pixel in an image. A battery of human studies was conducted to show that not all edges are equally informative. Edges related to surface normal changes and occlusion were the most informative, while edges associated with shadows and textures make recognition tasks more difficult. We show that edge detectors find edges resulting from numerous physical phenomena, but a larger portion of the detected edges correspond to shadows and albedo (uninformative) than to surface normal and depth changes (informative). We believe that this paper takes steps toward explaining the success of edge-based methods, as well as providing inspiration for the design and evaluation of next-generation features as additional sensing modalities become more widespread. Automatically identifying the sources of the different edges, the first steps towards which are being taken in works like [2], may be beneficial. This is a preliminary exploration. There are several avenues for future
work: more realistic renderings of synthetic scenes, evaluating machine vision algorithms for different edge types, analyzing outdoor scenes, etc.

Acknowledgements: This work was supported in part by NSF IIS /.

References

[1] P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik. Contour detection and hierarchical image segmentation. PAMI.
[2] J. T. Barron and J. Malik. Shape, albedo, and illumination from a single image of an unknown object. In CVPR.
[3] H. G. Barrow and J. M. Tenenbaum. Recovering intrinsic scene characteristics from images. In Computer Vision Systems.
[4] H. G. Barrow and J. M. Tenenbaum. Interpreting line drawings as three-dimensional surfaces. AI.
[5] K. Burns. Mental models of line drawings. Perception.
[6] J. Canny. A computational approach to edge detection. PAMI.
[7] F. Cole, F. Durand, W. Freeman, and E. Adelson. Interpreting line drawings of smooth shapes. Journal of Vision.
[8] F. Cole, K. Sanik, D. DeCarlo, A. Finkelstein, T. Funkhouser, S. Rusinkiewicz, and M. Singh. How well do line drawings depict shape? ACM Trans. Graph.
[9] D. Comaniciu and P. Meer. Mean shift: A robust approach toward feature space analysis. PAMI.
[10] M. C. Cooper. Interpretation of line-drawings of complex objects. Image and Vision Computing.
[11] J. G. Daugman. Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. Journal of the Optical Society of America A.
[12] D. DeCarlo, A. Finkelstein, S. Rusinkiewicz, and A. Santella. Suggestive contours for conveying shape. ACM Transactions on Graphics (Proc. SIGGRAPH), 22(3).
[13] L. Fei-Fei, A. Iyer, C. Koch, and P. Perona. What do we perceive in a glance of a real-world scene? Journal of Vision.
[14] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. PAMI.
[15] P. Felzenszwalb and D. Huttenlocher. Efficient graph-based image segmentation. IJCV.
[16] J. J. Gibson. Perception of the Visual World. Houghton Mifflin.
[17] D. Hoiem, Y. Chodpathumwan, and Q. Dai. Diagnosing error in object detectors. In ECCV.
[18] D. Hoiem, A. A. Efros, and M. Hebert. Automatic photo pop-up. In SIGGRAPH.
[19] T. Judd, F. Durand, and E. H. Adelson. Apparent ridges for line drawing. ACM Trans. Graph., 26(3).
[20] N. Kawabata. Depth perception in simple line drawings. Perceptual and Motor Skills.
[21] B. Kaneva, A. Torralba, and W. Freeman. Evaluation of image features using a photorealistic virtual world. In ICCV.
[22] J. Koenderink, A. van Doorn, A. Kappers, and J. Todd. Ambiguity and the mental eye in pictorial relief. Perception.
[23] J. F. Lalonde, A. A. Efros, and S. G. Narasimhan. Detecting ground shadows in outdoor consumer photographs. In ECCV.
[24] D. Lowe. Distinctive image features from scale-invariant keypoints. IJCV.
[25] J. Marin, D. Vazquez, D. Geronimo, and A. Lopez. Learning appearance in virtual scenarios for pedestrian detection. In CVPR.
[26] D. Marr. Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. Freeman.
[27] K. Mikolajczyk and C. Schmid. A performance evaluation of local descriptors. PAMI.
[28] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR.
[29] T. Randen and J. Husøy. Filtering for texture classification: A comparative study. PAMI.
[30] L. Roberts. Machine perception of 3-D solids. Ph.D. thesis.
[31] A. Saxena, J. Driemeyer, and A. Y. Ng. Robotic grasping of novel objects using vision. IJRR.
[32] A. Saxena, M. Sun, and A. Y. Ng. Make3D: Learning 3D scene structure from a single still image. PAMI.
[33] J. Schels, J. Liebelt, K. Schertler, and R. Lienhart. Building a semantic part-based object class detector from synthetic 3D models. In ICME.
[34] C. Schmid, R. Mohr, and C. Bauckhage. Evaluation of interest point detectors. IJCV.
[35] J. Shi and J. Malik. Normalized cuts and image segmentation. PAMI.
[36] J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman, and A. Blake. Real-time human pose recognition in parts from single depth images. In CVPR.
[37] N. Snavely, S. M. Seitz, and R. Szeliski. Photo tourism: Exploring photo collections in 3D. In SIGGRAPH.
[38] G. R. Taylor, A. J. Chosak, and P. C. Brewer. OVVV: Using virtual worlds to design and evaluate surveillance systems. In CVPR.
[39] J. T. Todd. The visual perception of 3D shape. Trends in Cognitive Sciences.
[40] D. B. Walther, B. Chai, E. Caddigan, D. M. Beck, and L. Fei-Fei. Simple line drawings suffice for functional MRI decoding of natural scene categories. PNAS.
[41] D. Waltz. Generating semantic descriptions from drawings of scenes with shadows. Technical report, MIT.
[42] J. Xiao, J. Hays, K. Ehinger, A. Oliva, and A. Torralba. SUN database: Large-scale scene recognition from abbey to zoo. In CVPR.
[43] L. Zhu, Y. Chen, and A. L. Yuille. Learning a hierarchical deformable template for rapid deformable object parsing. PAMI.
[44] C. L. Zitnick. Binary coherent edge descriptors. In ECCV.
[45] C. L. Zitnick and D. Parikh. The role of image understanding in contour detection. In CVPR, 2012.