Next Generation Artificial Vision Systems




Next Generation Artificial Vision Systems
Reverse Engineering the Human Visual System

Anil Bharath
Maria Petrou
Imperial College London

Artech House, Boston and London
artechhouse.com

Contents

Preface

CHAPTER 1 The Human Visual System: An Engineering Challenge
  1.1 Introduction
  1.2 Overview of the Human Visual System
    1.2.1 The Human Eye
      1.2.1.1 Issues to Be Investigated
    1.2.2 Lateral Geniculate Nucleus (LGN)
    1.2.3 The V1 Region of the Visual Cortex
      1.2.3.1 Issues to Be Investigated
    1.2.4 Motion Analysis and V5
      1.2.4.1 Issues to Be Investigated
  1.3 Conclusions
  References

The Physiology and Psychology of Vision

CHAPTER 2 Retinal Physiology and Neuronal Modeling
  2.1 Introduction
  2.2 Retinal Anatomy
  2.3 Retinal Physiology
  2.4 Mathematical Modeling: Single Cells of the Retina
  2.5 Mathematical Modeling: The Retina and Its Functions
  2.6 A Flexible, Dynamical Model of Retinal Function
    2.6.1 Foveal Structure
    2.6.2 Differential Equations
    2.6.3 Color Mechanisms
    2.6.4 Foveal Image Representation
    2.6.5 Modeling Retinal Motion
  2.7 Numerical Simulation Examples
    2.7.1 Parameters and Visual Stimuli
    2.7.2 Temporal Characteristics
    2.7.3 Spatial Characteristics
    2.7.4 Color Characteristics
  2.8 Conclusions
  References

CHAPTER 3 A Review of V1
  3.1 Introduction
  3.2 Two Aspects of Organization and Functions in V1
    3.2.1 Single-Neuron Responses
    3.2.2 Organization of Individual Cells in V1
      3.2.2.1 Orientation Selectivity
      3.2.2.2 Color Selectivity
      3.2.2.3 Scale Selectivity
      3.2.2.4 Phase Selectivity
  3.3 Computational Understanding of the Feed Forward V1
    3.3.1 V1 Cell Interactions and Global Computation
    3.3.2 Theory and Model of Intracortical Interactions in V1
  3.4 Conclusions
  References

CHAPTER 4 Testing the Hypothesis That V1 Creates a Bottom-Up Saliency Map
  4.1 Introduction
  4.2 Materials and Methods
  4.3 Results
    4.3.1 Interference by Task-Irrelevant Features
    4.3.2 The Color-Orientation Asymmetry in Interference
    4.3.3 Advantage for Color-Orientation Double Feature but Not Orientation-Orientation Double Feature
    4.3.4 Emergent Grouping of Orientation Features by Spatial Configurations
  4.4 Discussion
  4.5 Conclusions
  References

The Mathematics of Vision

CHAPTER 5 V1 Wavelet Models and Visual Inference
  5.1 Introduction
    5.1.1 Wavelets
    5.1.2 Wavelets in Image Analysis and Vision
    5.1.3 Wavelet Choices
    5.1.4 Linear vs Nonlinear Mappings
  5.2 A Polar Separable Complex Wavelet Design

    5.2.1 Design Overview
    5.2.2 Filter Designs: Radial Frequency
    5.2.3 Angular Frequency Response
    5.2.4 Filter Kernels
    5.2.5 Steering and Orientation Estimation
  5.3 The Use of V1-Like Wavelet Models in Computer Vision
    5.3.1 Overview
    5.3.2 Generating Orientation Maps
    5.3.3 Corner Likelihood Response
    5.3.4 Phase Estimation
  5.4 Inference from V1-Like Representations
    5.4.1 Vector Image Fields
    5.4.2 Formulation of Detection
    5.4.3 Sampling of (B, X)
    5.4.4 The Notion of "Expected" Vector Fields
    5.4.5 An Analytic Example: Uniform Intensity Circle
    5.4.6 Vector Model Plausibility and Extension
    5.4.7 Vector Fields: A Variable Contrast Model
    5.4.8 Plausibility by Demonstration
    5.4.9 Plausibility from Real Image Data
    5.4.10 Divisive Normalization
  5.5 Evaluating Shape Detection Algorithms
    5.5.1 Circle-and-Square Discrimination Test
  5.6 Grouping Phase-Invariant Feature Maps
    5.6.1 Keypoint Detection Using DTCWT
  5.7 Summary and Conclusions
  References

CHAPTER 6 Beyond the Representation of Images by Rectangular Grids
  6.1 Introduction
  6.2 Linear Image Processing
    6.2.1 Interpolation of Irregularly Sampled Data
      6.2.1.1 Kriging
      6.2.1.2 Iterative Error Correction
      6.2.1.3 Normalized Convolution
    6.2.2 DFT from Irregularly Sampled Data
  6.3 Nonlinear Image Processing
    6.3.1 V1-Inspired Edge Detection
    6.3.2 Beyond the Conventional Data Representations and Object Descriptors
      6.3.2.1 The Trace Transform
      6.3.2.2 Features from the Trace Transform

  6.4 Reverse Engineering Some Aspect of the Human Visual System
  6.5 Conclusions
  References

CHAPTER 7 Reverse Engineering of Human Vision: Hyperacuity and Super-Resolution
  7.1 Introduction
  7.2 Hyperacuity and Super-Resolution
  7.3 Super-Resolution Image Reconstruction Methods
    7.3.1 Constrained Least Squares Approach
    7.3.2 Projection onto Convex Sets
    7.3.3 Maximum A Posteriori Formulation
    7.3.4 Markov Random Field Prior
    7.3.5 Comparison of the Super-Resolution Methods
    7.3.6 Image Registration
  7.4 Applications of Super-Resolution
    7.4.1 Application in Minimally Invasive Surgery
    7.4.2 Other Applications
  7.5 Conclusions and Further Challenges
  References

CHAPTER 8 Eye Tracking and Depth from Vergence
  8.1 Introduction
  8.2 Eye-Tracking Techniques
  8.3 Applications of Eye Tracking
    8.3.1 Psychology/Psychiatry and Cognitive Sciences
    8.3.2 Behavior Analysis
    8.3.3 Medicine
    8.3.4 Human-Computer Interaction
  8.4 Gaze-Contingent Control for Robotic Surgery
    8.4.1 Ocular Vergence for Depth Recovery
    8.4.2 Binocular Eye-Tracking Calibration
    8.4.3 Depth Recovery and Motion Stabilization
  8.5 Discussion and Conclusions
  References

CHAPTER 9 Motion Detection and Tracking by Mimicking Neurological Dorsal/Ventral Pathways
  9.1 Introduction
  9.2 Motion Processing in the Human Visual System
  9.3 Motion Detection

    9.3.1 Temporal Edge Detection
    9.3.2 Wavelet Decomposition
    9.3.3 The Spatiotemporal Haar Wavelet
    9.3.4 Computational Cost
  9.4 Dual-Channel Tracking Paradigm
    9.4.1 Appearance Model
    9.4.2 Early Approaches to Prediction
    9.4.3 Tracking by Blob Sorting
  9.5 Behavior Recognition and Understanding
  9.6 A Theory of Tracking
  9.7 Concluding Remarks
  References

Hardware Technologies for Vision

CHAPTER 10 Organic and Inorganic Semiconductor Photoreceptors Mimicking the Human Rods and Cones
  10.1 Introduction
  10.2 Phototransduction in the Human Eye
    10.2.1 The Physiology of the Eye
    10.2.2 Phototransduction Cascade
      10.2.2.1 Light Activation of the Cascade
      10.2.2.2 Deactivation of the Cascade
    10.2.3 Light Adaptation of Photoreceptors: Weber-Fechner's Law
    10.2.4 Some Engineering Aspects of Photoreceptor Cells
  10.3 Phototransduction in Silicon
    10.3.1 CCD Photodetector Arrays
    10.3.2 CMOS Photodetector Arrays
    10.3.3 Color Filtering
    10.3.4 Scaling Considerations
  10.4 Phototransduction with Organic Semiconductor Devices
    10.4.1 Principles of Organic Semiconductors
    10.4.2 Organic Photodetection
    10.4.3 Organic Photodiode Structure
    10.4.4 Organic Photodiode Electronic Characteristics
      10.4.4.1 Photocurrent and Efficiency
      10.4.4.2 The Equivalent Circuit and Shunt Resistance
      10.4.4.3 Spectral Response Characteristics
    10.4.5 Fabrication
      10.4.5.1 Contact Printing
      10.4.5.2 Printing on CMOS
  10.5 Conclusions
  References

CHAPTER 11 Analog Retinomorphic Circuitry to Perform Retinal and Retinal-Inspired Processing
  11.1 Introduction
  11.2 Principles of Analog Processing
    11.2.1 The Metal Oxide Semiconductor Field Effect Transistor
      11.2.1.1 Transistor Operation
      11.2.1.2 nMOS and pMOS Devices
      11.2.1.3 Transconductance Characteristics
      11.2.1.4 Inversion Characteristics
      11.2.1.5 MOSFET Weak Inversion and Biological Gap Junctions
    11.2.2 Analog vs Digital Methodologies
  11.3 Photo Electric Transduction
    11.3.1 Logarithmic Sensors
    11.3.2 Feedback Buffers
    11.3.3 Integration-Based Photodetection Circuits
    11.3.4 Photocurrent Current-Mode Readout
  11.4 Retinomorphic Circuit Processing
    11.4.1 Voltage Mode Resistive Networks
      11.4.1.1 Limitations with This Approach
    11.4.2 Current Mode Approaches to Receptive Field Convolution
      11.4.2.1 Improved Horizontal Cell Circuitry
      11.4.2.2 Novel Bipolar Circuitry
      11.4.2.3 Bidirectional Current Mode Processing
      11.4.2.4 Dealing with Multiple High Impedance Processing Channels
      11.4.2.5 The Current Comparator
    11.4.3 Reconfigurable Fields
    11.4.4 Intelligent Ganglion Cells
      11.4.4.1 ON-OFF Ganglion Cells
      11.4.4.2 Pulse Width Encoding
  11.5 Address Event Representation
    11.5.1 The Arbitration Tree
    11.5.2 Collisions
    11.5.3 Sparse Coding
    11.5.4 Collision Reduction
  11.6 Adaptive Foveation
    11.6.1 System Algorithm
    11.6.2 Circuit Implementation
    11.6.3 The Future
  11.7 Conclusions
  References

CHAPTER 12 Analog V1 Platforms
  12.1 Analog Processing: Obsolete?
  12.2 The Cellular Neural Network
  12.3 The Linear CNN
  12.4 CNNs and Mixed Domain Spatiotemporal Transfer Functions
  12.5 Networks with Temporal Derivative Diffusion
    12.5.1 Stability
  12.6 A Signal Flow Graph-Based Implementation
    12.6.1 Continuous Time Signal Flow Graphs
    12.6.2 On SFG Relations with the MLCNN
  12.7 Examples
    12.7.1 A Spatiotemporal Cone Filter
    12.7.2 Visual Cortical Receptive Field Modelling
  12.8 Modeling of Complex Cell Receptive Fields
  12.9 Summary and Conclusions
  References

CHAPTER 13 From Algorithms to Hardware Implementation
  13.1 Introduction
  13.2 Field Programmable Gate Arrays
    13.2.1 Circuit Design
    13.2.2 Design Process
  13.3 Mapping Two-Dimensional Filters onto FPGAs
  13.4 Implementation of Complex Wavelet Pyramid on FPGA
    13.4.1 FPGA Design
    13.4.2 Host Control
    13.4.3 Implementation Analysis
    13.4.4 Performance Analysis
      13.4.4.1 Corner Detection
    13.4.5 Conclusions
  13.5 Hardware Implementation of the Trace Transform
    13.5.1 Introduction to the Trace Transform
    13.5.2 Computational Complexity
    13.5.3 Full Trace Transform System
      13.5.3.1 Acceleration Methods
      13.5.3.2 Target Board
      13.5.3.3 System Overview
      13.5.3.4 Top-Level Control
      13.5.3.5 Rotation Block
      13.5.3.6 Functional Blocks
      13.5.3.7 Initialization
    13.5.4 Flexible Functionals for Exploration
      13.5.4.1 Type A Functional Block

      13.5.4.2 Type B Functional Block
      13.5.4.3 Type C Functional Block
    13.5.5 Functional Coverage
    13.5.6 Performance and Area Results
    13.5.7 Conclusions
  13.6 Summary
  References

CHAPTER 14 Real-Time Spatiotemporal Saliency
  14.1 Introduction
  14.2 The Framework Overview
  14.3 Realization of the Framework
    14.3.1 Two-Dimensional Feature Detection
    14.3.2 Feature Tracker
    14.3.3 Prediction
    14.3.4 Distribution Distance
    14.3.5 Suppression
  14.4 Performance Evaluation
    14.4.1 Adaptive Saliency Responses
    14.4.2 Complex Scene Saliency Analysis
  14.5 Conclusions
  References

Acronyms and Abbreviations
About the Editors
List of Contributors
Index