Forensic Image Processing www.martinojerian.com
Forensic Image Processing Lesson 1 An introduction to digital images
Purpose of the course What is a digital image? What use can images have in investigative applications? What are the most common issues? How can we solve them? What kind of information can I obtain from an image? What precautions must I take?
Image enhancement and restoration Get more information from the analysis of an image
Information classification Pills Footprints, tires... Bullets... Classification of visual information according to some criteria
Biometrics Comparison and recognition of physiological or behavioral features of a given subject. Fingerprints Hand palms Faces Iris Ear shape...
Photogrammetry Estimation of sizes in the scene by proportion with known lengths
3d reconstruction Dynamic analysis (events) Static analysis (objects, places, faces...)
Contents of the course 1) An introduction to digital images 2) Main issues 3) Removing noise 4) Improving details 5) Advanced techniques 6) Video analysis and enhancement 7) 3d reconstruction 8) Biometrics
Outline What is an image Difference between analog and digital Digital images Color representation Image formats and compression
What is an image? A two-dimensional function f(x,y) of the light intensity perceived by the human eye
Analog and digital Analog: continuous variation. Digital: variation in discrete steps.
What is a digital image? Both coordinates and intensity values are discrete quantities (image: www.imageprocessingplace.com)
Discrete coordinates The minimum indivisible element is called pixel (picture element) (image: www.imageprocessingplace.com)
Discrete values Every pixel may assume only a finite number of different values (image: www.imageprocessingplace.com)
Implications It is not possible to increase the detail of a digital image by enlarging it.
Image representation A digital image is a matrix of numbers, one per pixel (the figure shows a 20-column excerpt of 8-bit gray values, e.g. 137 137 137 ... 131 127).
Image characteristics Resolution: columns x rows of the matrix (e.g. 1024x768) Bit depth: number of different values that every pixel may assume (e.g. 8 bits = 256 values) 0 = black (minimum intensity) 255 = white (maximum intensity)
What about colors? What is color? Color is the way the human visual system measures the visible part of the electromagnetic spectrum (image: www.imageprocessingplace.com)
How do we perceive color? The human eye has two kinds of photoreceptors: Rods: high sensitivity to light, no sensitivity to color. Cones: lower sensitivity to light, sensitive to color; they come in three different types, one for each primary color: red, green and blue.
Color in digital images 3 different matrices corresponding to the three RGB components
Color spaces A color space is a representation used to specify colors, i.e. to describe quantitatively the human perception of the visible part of the electromagnetic spectrum. RGB is a color space. HSL: hue, saturation, lightness CMYK: used by printers YUV, YIQ, YCrCb: TV signals, which split brightness (Y) and color CIE XYZ, CIELuv, CIELab: for particular purposes...
Digital image formats Digital images can be stored in many different formats The format defines what kind of information is used to represent the image The matrix representing the image would need a large amount of memory: for example: 1024 (width) x 768 (height) x 8 (bits per pixel) x 3 (color components) = 2.3 MB! Almost always compression techniques are used, which may be: Lossless Lossy
Types of compression Lossless compression Reduces used memory Allows exact reconstruction of the original data Not very high compression factor (about 2 or 3) For example: TIF, PNG, BMP Lossy compression (e.g. JPEG) Very high compression factor Some information of the original image is lost, thus it can't be reconstructed exactly Loss of detail Artifacts
Lossy compression: example BMP: 188 KB JPEG: 7 KB
Conclusions Digital images have many different uses in the forensic and investigative fields. Digital images have very different characteristics from analog ones, such as old film photographs. It's important to understand the meaning of light intensity and color. For practical reasons images are very often compressed, with a consequent loss of quality.
References Digital image processing: Jain, Fundamentals of Digital Image Processing, Prentice Hall. Gonzalez and Woods, Digital Image Processing, http://www.imageprocessingplace.com Forensic image processing: Peter Kovesi's home page, http://www.csse.uwa.edu.au/~pk/
Forensic Image Processing Lesson 2 Main issues of forensic image processing
Outline Main problems related to image quality Techniques to solve them and their implications Images coming mainly from CCTV systems Many of the problems are present also in more general cases
Capture devices Typical CCTV chain: cameras -> multiplexer -> VCR -> monitor
Analog or digital? Very often data is still stored on VHS It must be digitized before it can be processed: VCR -> frame grabber -> computer
Digital characteristics Not subject to wear and tear But... How do I connect the device? What is the format (very often proprietary)? How do I replace it (service interruption)?
Steps of the process Different kinds of disturbances are introduced at different stages: information acquisition (camera) analog storage (VCR) digital conversion and storage (DVR or frame grabbers)
Acquisition Most of the disturbances are usually introduced in this step and are due to: features of the capturing camera features of the captured scene features of the captured subjects
Blur Wrong focus Limited depth of field
Motion blur Moving subject Camera shutter open for too long
Noise Camera shutter open for too long Quality of the components (sensor)
Geometric distortions Device optics (in particular wide-angle lenses)
Contrast, colors, brightness Scene features Device settings Component quality
Standards used Resolution Frame rate Interlacing
Analog storage Very noticeable problems Mainly caused by wear of devices (VCRs) and of storage media (VHS tapes)
Scratches Wear of the VHS tape
Line shifts Misalignment caused by wrong timing of the VCR heads.
Electromagnetic interference Quality of components Shielding
Digital conversion and storage Practical limits, such as the need for data compression, usually introduce disturbances at this step too.
Lossy compression Artifacts Loss of detail
Level compression (quantization) An infinite number of intensities must be represented by a limited number of values, causing loss of information.
Image enhancement/restoration Different disturbances / features of the image How to improve quality? image enhancement image restoration
Image enhancement Process used to improve the visual appearance of an image, enhancing or reducing some features brightness contrast colors crop denoising sharpening...
Image restoration Describe a known disturbance that corrupts the image by a mathematical model and try to invert the process. deblurring geometric functions filtering...
Legal implications Digital data is very susceptible to manipulation: easy cheap everyone can do it which is the original? which is the processed version? What's the objective difference between enhancement and manipulation of an image? Which kinds of processing are acceptable and which aren't?
The problem The image has been manipulated HOW??
Requirements 1. Preserve the original image 2. Document all the details of all the steps of the processing 3. The output image must be exactly replicable by applying the documented process to the original image OK!
Conclusions Many different problems affect the images different results different causes many different ways to face them Sometimes even from very low quality images we can obtain useful information. We can use many techniques to enhance an image, but they must be handled with care, especially if we want to use the result as evidence.
References Digital images and legal implications: Recommendations and Guidelines for the Use of Digital Image Processing in the Criminal Justice System / Best Practices for Documenting Image Enhancement, Scientific Working Group on Imaging Technology (SWGIT), http://www.theiai.org/swgit/guidelines/ Digital Images as Evidence, House of Lords, Science and Technology Fifth Report, http://www.parliament.the-stationery-office.co.uk/pa/ld199798/ldselect/ldsctech/064v/st0501.htm
Forensic Image Processing Lesson 3 Noise smoothing
Outline What is noise in an image? How does it appear? How can we reduce it?
What is noise? Random noise: random variation in pixel values, present also in traditional analog photography Other types: long exposure times (low light conditions) periodic noise compression artifacts...
Random noise Noise in different types of digital cameras (www.dpreview.com)
Noise smoothing techniques Spatial image enhancement Works on image pixels Frequency image enhancement Works on the Fourier transform of the image The same theory can be applied also to other processing techniques that we will see in the next sections. Multi-image enhancement Puts together information coming from different images See lesson 6
Spatial image enhancement The value of the pixel at position (x,y) of the output image g is the result of an operation T on a certain window of pixels around position (x,y) in the original image f. For practical reasons the neighborhood is often a square around the point to evaluate, but it could have any shape.
Averaging filter Every point of the output image is obtained as the average of the pixels around (x,y) in the original image. This kind of process is called image smoothing. It reduces noise and artifacts... but also detail!
Example It is difficult to obtain useful results from plain averaging, because it usually causes a big loss of detail. It's still important, being the basis of other techniques. Increasing the neighborhood size makes its effect stronger.
Implementation (I) The averaging filter is an example of a spatial filter. Spatial filtering can be implemented in software by a mathematical operation called convolution. An averaging filter with a 3x3 neighborhood is the convolution of the image with the mask:
1/9 1/9 1/9
1/9 1/9 1/9
1/9 1/9 1/9
The mask is moved across the whole image until all pixels are filtered.
Implementation (II) Convolution consists in overlapping the mask over a 3x3 part of the image, multiplying the pixels by the mask values and summing the results to get the value that will substitute the central pixel.
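As an illustration of the convolution just described, here is a minimal sketch (not from the original slides); it assumes the image is an 8-bit grayscale NumPy array and replicates border pixels for padding:

```python
import numpy as np

def average_filter(img, size=3):
    """Smooth a grayscale image by convolving it with a size x size
    averaging mask (all weights equal to 1/size^2)."""
    pad = size // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    # Sum all the shifted windows, then divide: equivalent to moving the
    # mask across the image and averaging each neighborhood.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return (out / size ** 2).astype(img.dtype)
```

Increasing `size` reproduces the effect described above: stronger smoothing, bigger loss of detail.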
Gaussian smoothing If we want to emphasize the contribution of the pixels closer to the center of the window, we can use a bell-shaped distribution, called Gaussian. A 5x5 Gaussian mask:
0.00 0.01 0.02 0.01 0.00
0.01 0.06 0.10 0.06 0.01
0.02 0.10 0.16 0.10 0.02
0.01 0.06 0.10 0.06 0.01
0.00 0.01 0.02 0.01 0.00
Results original / averaging (5x5) / Gaussian (5x5)
Linear and non-linear filtering The filters seen so far, which can be applied by convolution, are called linear filters, since the output image is a simple linear combination of input image values. The process is the same for every pixel value. There are many other types of filters, where the method used to calculate the output depends on the values of the pixels in the neighborhood. This kind of processing is called non-linear filtering. Non-linear filters cannot be implemented by a simple convolution.
Median filtering Median filtering is one of the simplest non-linear filters. The median of a sequence of numbers is the value such that half of the numbers are greater and half are lower. For example, the median of 1 1 3 6 6 7 8 is 6. The median filter substitutes the central pixel of the considered neighborhood with the median value of the neighborhood. Median filtering is very good at removing impulsive noise (often called "salt and pepper") while preserving details.
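A minimal median-filter sketch in the same style as the averaging filter above (not from the slides; assumes a 2D grayscale NumPy array):

```python
import numpy as np

def median_filter(img, size=3):
    """Replace each pixel with the median of its size x size neighborhood;
    removes salt-and-pepper noise while preserving edges."""
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    # Stack every shifted window, then take the median across the stack.
    windows = [padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
               for dy in range(size) for dx in range(size)]
    return np.median(np.stack(windows), axis=0).astype(img.dtype)
```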
Used windows Sometimes, to better preserve details, we can use windows different from the square one, for example a "+" or "x" shaped window:
square: a b c / d e f / g h i
"+" shape: b / d e f / h
"x" shape: a . c / . e . / g . i
Results original / corrupted / median "+" / median 3x3 square
Other types of noise Sometimes the noise in an image is not random. We can identify features that help us understand the causes of the noise, in order to apply a more successful filtering. Some examples are periodic noise, or noise caused by compression, which is often accompanied by artifacts. Different techniques, sometimes quite complex, must be used for each case, including frequency filtering techniques.
Compression noise original (JPEG) / filtered
Fourier transform Pixel matrices are not the only way to represent images. It is possible to represent an image by its Fourier transform: instead of specifying the values of single pixels, it considers the image as a sum of waveforms of different frequencies. The Fourier transform represents an image by a matrix of amplitude and phase for each frequency in the image. It is difficult to interpret directly, but low frequencies can be thought of as the uniform zones of the image and high frequencies as the details.
Visualization Generally the logarithm of the absolute value of the amplitude can be easily visualized as an image.
Frequency filtering An algorithm called the Fast Fourier Transform (FFT) is used to efficiently compute the Fourier transform of an image. The Fourier transform is invertible, so we can use the inverse (IFFT) to recover the original image from its frequency representation. It is thus possible to filter the Fourier transform of the image instead of its representation in the spatial domain: original -> FFT -> original FFT -> filter -> filtered FFT -> IFFT -> filtered image
Low pass filtering It is possible to use frequency filtering to amplify or attenuate some components of the image. Low pass filtering consists in letting low frequencies pass and removing high frequencies. We obtain an image similar to the original, but less sharp. We decrease the noise but also the detail!
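A sketch of the FFT -> filter -> IFFT pipeline above, here with an ideal circular low-pass mask (not from the slides; the `cutoff` radius is an arbitrary illustrative value and the input is assumed to be a 2D grayscale NumPy array):

```python
import numpy as np

def fft_low_pass(img, cutoff=30):
    """Keep only the frequencies within `cutoff` samples of the center
    of the shifted spectrum, then transform back to the spatial domain."""
    F = np.fft.fftshift(np.fft.fft2(img))   # spectrum with DC in the center
    h, w = img.shape
    y, x = np.ogrid[:h, :w]
    mask = (y - h / 2) ** 2 + (x - w / 2) ** 2 <= cutoff ** 2
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))
```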
Example
Periodic noise Sometimes images are corrupted by periodic noise. The spectra of natural images have a well recognizable general shape, like a cross with higher values towards the center. Being periodic, the noise often has a well defined frequency, thus it is clearly recognizable in some particular areas of the spectrum. By filtering those areas out of the spectrum, it is possible to reconstruct the image without the disturbance. We can work in the same way to remove other periodic signals, such as a banknote watermark, which is not noise proper.
Example Original image / Original spectrum (the two white dots represent the disturbance) / Filtered spectrum (the dots have been removed) / Filtered image (inverse transform of the filtered spectrum)
Convolution theorem A convolution in the spatial domain corresponds to a multiplication in the frequency domain. Given two images f(x) and g(x), their respective Fourier transforms F(n) and G(n), and calling FT the Fourier transform: FT( f(x)*g(x) ) = F(n)G(n), where * denotes convolution; or, equivalently, FT( f(x)g(x) ) = F(n)*G(n). If we want to filter an image in the frequency domain, we can obtain the same result in the spatial domain with a convolution.
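The theorem can be checked numerically in a few lines (not from the slides; note that the DFT implements *circular* convolution, so the direct computation below uses wrapped shifts):

```python
import numpy as np
from numpy.fft import fft2, ifft2

f = np.random.rand(64, 64)                 # a random test "image"
g = np.zeros((64, 64)); g[:3, :3] = 1 / 9  # 3x3 averaging mask, zero-padded

# Convolution via multiplication in the frequency domain...
via_fft = np.real(ifft2(fft2(f) * fft2(g)))
# ...versus direct circular convolution with the same mask.
direct = sum(np.roll(np.roll(f, dy, 0), dx, 1) / 9
             for dy in range(3) for dx in range(3))

print(np.max(np.abs(via_fft - direct)))    # ~1e-15: the two results match
```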
Conclusions Noise is a random signal over the image Different techniques to reduce it: spatial filtering (pixels) frequency filtering (Fourier) These are basic techniques, but there are more advanced ones, for example: adaptive filters frame integration
Forensic Image Processing Lesson 4 Enhancing details
Outline Sharpening: local contrast enhancement to improve some details in the image. Edge detection: operators for the extraction of borders and details in the image. Interpolation: increasing the resolution of an image by calculating the values of new pixels with mathematical techniques.
Sharpening Sharpening an image corresponds to locally increasing the contrast in some parts of the image. It is the inverse process of smoothing, which is used for noise reduction. The basic idea is to identify and amplify the value of pixels which differ most from the neighboring ones, thus emphasizing edges and fine details. It is like adding to the original image another image representing the edges, the details, or the high frequencies. Problem: how to distinguish details from noise?
Unsharp mask Increase the detail by adding to the original image another image representing the high frequencies. The high frequencies can be obtained by subtracting a low-pass filtered version of the image (see lesson 3) from the original one.
Laplacian One of the most common implementations of unsharp masking is to add to the original image the opposite of its Laplacian (a well known mathematical operator). It may be implemented with different masks, depending on the desired effect and neighborhood size, for example (negative-Laplacian masks):
 0 -1  0     -1 -1 -1
-1  4 -1     -1  8 -1
 0 -1  0     -1 -1 -1
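A minimal sketch of Laplacian-based sharpening with the first mask above (not from the slides; assumes an 8-bit grayscale NumPy array, and `strength` is an illustrative tuning parameter):

```python
import numpy as np

LAPLACIAN = np.array([[ 0, -1,  0],
                      [-1,  4, -1],
                      [ 0, -1,  0]], dtype=float)  # negative-Laplacian mask

def sharpen_laplacian(img, strength=1.0):
    """Add the (negative) Laplacian response to the image, emphasizing
    pixels that differ from their neighbors."""
    h, w = img.shape
    padded = np.pad(img.astype(float), 1, mode="edge")
    response = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            response += LAPLACIAN[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return np.clip(img + strength * response, 0, 255).astype(np.uint8)
```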
Example original / sharpened
High pass filter Frequency filtering (see lesson 3) can also be applied to obtain an image representing the borders and details of the original. High pass filtering an image consists in letting high frequencies pass and stopping low frequencies. Adding this image to the original, we obtain a more detailed one. We increase the detail, but also the noise!
Example
Other filters The literature proposes many different filters, some rather complicated, to enhance image detail or to extract edges, which generally perform better than the basic ones presented here. To get good results filters must be adaptive, i.e. they must modify their characteristics according to the local image values. In this way it is possible to increase the detail without amplifying the noise too much.
Example original / adaptive sharpening / unsharp masking
Edge detection Sometimes it can be useful to extract image edges, not just to improve the detail, but to analyze images by enhancing some features.
Example
(image credit: Sid Wallace)
How does it work To detect edges we need a filter that emphasizes the points where the differences between pixels are most noticeable. A simple filter to detect vertical edges is the mask [-1 0 1]. When it is convolved with an image containing a strong vertical edge (e.g. rows of 0 0 0 1 1 1), the output emphasizes the vertical border and ignores uniform areas.
Example original image -> filter -> output image
Sobel One of the most common and simple edge detectors is the Sobel operator. It has two masks, one for vertical edges and one for horizontal edges:
-1 0 1      1  2  1
-2 0 2      0  0  0
-1 0 1     -1 -2 -1
The two results are squared and summed, and the result is thresholded; the threshold determines the sensitivity of the overall detector.
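A sketch of the Sobel gradient magnitude plus threshold described above (not from the slides; assumes a 2D grayscale NumPy array, and the threshold value is purely illustrative):

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T  # the horizontal-edge mask is the transpose

def sobel_edges(img, threshold=100.0):
    """Compute the gradient magnitude from the two Sobel masks,
    then binarize it with a threshold to get an edge map."""
    h, w = img.shape
    padded = np.pad(img.astype(float), 1, mode="edge")
    gx = np.zeros((h, w)); gy = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            win = padded[dy:dy + h, dx:dx + w]
            gx += SOBEL_X[dy, dx] * win
            gy += SOBEL_Y[dy, dx] * win
    return np.sqrt(gx ** 2 + gy ** 2) > threshold
```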
Example vertical edges (squared) + horizontal edges (squared) -> sum -> threshold
Other edge detectors Laplacian (seen in unsharp masking) Prewitt Canny... original / Laplacian
Low resolution (I) Sometimes we'd like to zoom into some detail of the image
Low resolution (II) In digital images it is not possible to increase the detail by zooming into the image.
Interpolation Mathematical process used to estimate unknown pixel values from the values of known neighboring pixels.
Interpolation algorithms Nearest neighbor: copies the value of the closest pixel. Bilinear: weights the neighboring pixels according to their distance, with a linear kernel. Bicubic: weights the neighboring pixels according to their distance, with a cubic kernel. There are many more complex algorithms, but in real world applications these three are almost always used; the bicubic is the one that usually gives the best results (see the sketch below).
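A quick way to compare the three classic kernels, assuming OpenCV is available (not from the slides; `factor` is an illustrative zoom factor):

```python
import cv2  # assumes the OpenCV Python bindings are installed

def enlarge(img, factor=8):
    """Enlarge an image with the three classic interpolation kernels."""
    size = (img.shape[1] * factor, img.shape[0] * factor)  # (width, height)
    return {
        "nearest": cv2.resize(img, size, interpolation=cv2.INTER_NEAREST),
        "bilinear": cv2.resize(img, size, interpolation=cv2.INTER_LINEAR),
        "bicubic": cv2.resize(img, size, interpolation=cv2.INTER_CUBIC),
    }
```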
One-dimensional interpolation nearest / linear / cubic kernels (figure)
Two-dimensional interpolation It's more complex, but with the traditional algorithms we can simply interpolate one-dimensionally, first the rows and then the columns (or vice versa), to obtain the same result.
Comparison nearest neighbor / bilinear / bicubic (8x zoom)
Advanced techniques 8x bicubic / 8x lowadi
Conclusions Detail enhancement is complex and closely related to the noise in the images. Different filtering techniques, both in space and frequency, are available. We can use similar techniques for sharpening and edge detection. To increase the resolution we must interpolate images... ...but we must remember that the new details are not real: they are artificially created by mathematical methods to make the images more appealing.
Forensic Image Processing Lesson 5 Advanced techniques
Outline Intensity transformations Image histogram Homomorphic filtering Deblurring Motion deblurring Color image processing
Intensity transformations (I) The real world luminance intensity range is much wider than the one that can be captured and displayed in a photograph. Intensity transformations modify values pixel by pixel through a defined curve, with the purpose of increasing or decreasing the contrast between different intensity values.
Intensity transformations (II) A particular spatial transform where the neighborhood coincides with the single pixel to calculate: g(x,y) = T( f(x,y) ).
Increasing brightness
Decreasing brightness
Non-linear transformations (gamma)
Gamma correction (image: www.imageprocessingplace.com)
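A minimal gamma-correction sketch as a per-pixel intensity transformation (not from the slides; assumes an 8-bit grayscale NumPy array):

```python
import numpy as np

def gamma_correct(img, gamma=0.5):
    """Apply s = 255 * (r / 255) ** gamma via a look-up table.
    gamma < 1 brightens dark regions; gamma > 1 darkens them."""
    table = (255.0 * (np.arange(256) / 255.0) ** gamma).astype(np.uint8)
    return table[img]  # the table is applied pixel by pixel
```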
Other transformations I can remove, enhance or attenuate desired intensity ranges. (image: www.imageprocessingplace.com)
Image histogram The luminance distribution of an image is called its histogram. The histogram represents the number of pixels (vertical axis) having a certain value (horizontal axis).
Histogram equalization Histogram equalization is a process aimed at making the image histogram more uniform, thus improving the image contrast.
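The classic equalization recipe, mapping each gray level through the normalized cumulative histogram (a sketch, not from the slides; assumes an 8-bit grayscale NumPy array):

```python
import numpy as np

def equalize(img):
    """Histogram equalization of an 8-bit grayscale image."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum().astype(float)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())  # normalize to [0, 1]
    return (255 * cdf).astype(np.uint8)[img]           # remap every pixel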
Illumination and reflectance Images are created by the light reflected by objects, and are composed of: the quantity of light incident on the scene (illumination) the quantity of light reflected by the objects in the scene (reflectance) Illumination and reflectance are merged by multiplication: f(x) = i(x) r(x) Reflectance gives information about the color and shape of objects, while illumination variations may cause confusion.
Homomorphic filters Working in the frequency domain with the Fourier transform, it is possible to separate reflectance from illumination and reduce the effect of the latter. These filters are called homomorphic filters, and are very useful to correct non-uniform lighting conditions. Generally: low frequencies: illumination (slow variations) high frequencies: reflectance (details)
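A homomorphic-filtering sketch along the lines just described: take the logarithm to turn the product i(x)r(x) into a sum, then attenuate low frequencies and boost high ones in the Fourier domain (not from the slides; the gains and cutoff are illustrative, untuned values):

```python
import numpy as np

def homomorphic(img, cutoff=20, low_gain=0.5, high_gain=1.5):
    """log -> FFT -> attenuate low frequencies (illumination) and boost
    high frequencies (reflectance) -> IFFT -> exp."""
    log_img = np.log1p(img.astype(float))
    F = np.fft.fftshift(np.fft.fft2(log_img))
    h, w = img.shape
    y, x = np.ogrid[:h, :w]
    d2 = (y - h / 2) ** 2 + (x - w / 2) ** 2
    # Gaussian-shaped transfer function rising from low_gain to high_gain.
    H = low_gain + (high_gain - low_gain) * (1 - np.exp(-d2 / (2 * cutoff ** 2)))
    out = np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))
    return np.expm1(out)
```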
Example
Optical blur It is not always possible to get all the details of a scene in focus; some may be out of focus ("blurred").
Point spread function A defocused image can be considered the result of the convolution of an ideal sharp image with a function called point spread function (PSF), representing the features of the blur. The PSF is the matrix of pixels that corresponds, in the ideal ("sharp") case, to a single pixel of maximum intensity. Optical blur can be approximated by a two-dimensional Gaussian PSF.
Gaussian blur sharp image * PSF = blurred image (* denotes convolution)
Deconvolution In practical cases I would like to reconstruct a sharp image from a blurred one. Under some hypotheses I can invert the process (deconvolution) to approximately reconstruct the original image.
Motion blur (I) If an object moves too fast with respect to the camera, motion blur results. It is very frequent in night footage, where low light conditions require a longer shutter exposure time.
Motion blur (II) Motion blur can be modeled as the average of several translated copies of the ideal sharp image. This observation, too, may be modeled by a PSF. The image can be restored by deconvolution, but better results are obtained with other techniques, such as Wiener filtering.
Wiener filter The results obtainable by simple deconvolution are very sensitive to the noise in the image. If: u(m,n) is the image without blur (which we don't have in real cases), v(m,n) is the blurred image, and u'(m,n) is the image obtained with the restoration process, the purpose of the Wiener filter is to obtain, starting from v(m,n), an image u'(m,n) as similar as possible to u(m,n). This is done by minimizing the mean square error (MSE), i.e. the average of the squared differences between corresponding pixels of u'(m,n) and u(m,n).
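A frequency-domain Wiener deconvolution sketch (not from the slides; the constant `k` stands in for the noise-to-signal power ratio, which in the full formulation would be an estimated spectrum, and is hand-tuned here):

```python
import numpy as np

def wiener_deblur(blurred, psf, k=0.01):
    """Wiener filter: U' = conj(H) * V / (|H|^2 + k), where H is the FFT
    of the PSF and V the FFT of the blurred image."""
    H = np.fft.fft2(psf, s=blurred.shape)  # zero-pad the PSF to image size
    V = np.fft.fft2(blurred)
    U = np.conj(H) * V / (np.abs(H) ** 2 + k)
    return np.real(np.fft.ifft2(U))
```

Setting k = 0 reduces this to plain inverse filtering, which is exactly the noise-sensitive case described above.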
Motion deblurring blurred image -> deconvolution -> restored image
PSF Estimating the PSF correctly is not easy. A wrong PSF corrupts the image further.
Color image processing (I) The vast majority of image processing techniques have been studied for grayscale ("black and white") images. Color images are represented as the union of 3 grayscale images (the RGB components), so the same techniques are usually applied independently to the three components. This is what is usually done... ...but it is not actually very correct!
Color image processing (II) From the theoretical point of view it is not acceptable to treat the three components separately, since they are strongly correlated, and the processing results should be too. In practical applications the visual appearance is almost always acceptable... ...but looking closely at the details it is sometimes possible to notice artifacts. A definitive and formally correct solution hasn't been found yet, and the problem of color is still completely open, since it is based on human perception, which is difficult to measure objectively. Some kinds of problems are studied specifically for color images.
Luminance processing An approach that is sometimes used is to process the image in one of the color spaces that separate color and brightness information (see lesson 1). It is possible to process the luminance component as if it were a grayscale image and leave the color components untouched. This is justified by the fact that the human eye is much more sensitive to variations in intensity than in chromatic value. In this way we also reduce the computational cost, processing only one matrix instead of three!
Conclusions We saw some image processing techniques that allow us to obtain impressive results by means of mathematical models. All the presented problems are still open, and better processing techniques are still being developed. Color image processing issues are often underestimated, but they are of primary importance and difficult to solve.
Forensic Image Processing Lesson 6 Processing video sequences
Outline Video formats Deinterlacing Frame integration Registration Demultiplexing Motion detection
What is a video? A video is a set of images (frames) that, played in fast sequence, give the viewer the illusion of movement. Our eye does not perceive flicker between the frames thanks to visual persistence: the last projected image remains impressed on the retina for a certain time (a fraction of a second) even after its source has been removed. The other effect that contributes to perceiving an image sequence as continuous motion is the beta effect, a perceptive illusion by which the brain connects different frames through a perception of time and causality.
Analog and digital As in the case of images, analog and digital video signals are very different. An analog signal is, for example, that of a television, and it can be stored on an analog device such as a VCR. An example of a digital signal is the one used by PCs or DVD players; for practical reasons it always needs complex compression techniques to be stored. An analog signal may be converted to digital (and vice versa) by proper converters; these are actually used by the vast majority of display devices (such as monitors).
Analog video TV signal transmission follows three principal standards: PAL, SECAM and NTSC. We'll mainly refer to the PAL standard, since it is the one used in most of Europe, but very similar considerations hold for the other formats.
World standards (map: www.wikipedia.org)
PAL format The PAL signal consists of the transmission of 625 lines at a frequency of 50 fields per second. The actual resolution used for each frame is 576 (height) by 720 (width). The color space used is YUV: Y represents the luminance, U and V the color information. This choice was made in the passage of TV from black and white to color, so that the same signal could be used: black and white TVs consider only the Y component, ignoring U and V.
Interlacing In the PAL signal images are displayed at a 50 Hz frequency. Actually only half that many full frames, 25, are transmitted. In every transmitted frame the even and the odd lines belong to two different images (fields) to be displayed. Thanks to visual persistence, we perceive the images as if they were actually projected at a 50 Hz frequency. With this technique: the transmission bandwidth is divided by two... ...but so is the vertical resolution!
Deinterlacing In the TV set the missing lines of each field are interpolated. The format where all lines are drawn on the screen is called progressive. The process of converting from interlaced to progressive is called deinterlacing. Odd field / Even field
Interlaced image
Deinterlacing odd field image / even field image
Deinterlacing techniques Basic techniques are similar to those used for image enlargement, but the interpolation is applied only in the vertical direction (see the sketch below). Bad deinterlacing can lead to artifacts, particularly on small details and diagonal edges, which can become jagged. More advanced techniques allow better results, for example adaptive algorithms which evaluate the shape of edges.
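A minimal intra-field linear deinterlacing sketch (not from the slides; assumes a 2D grayscale NumPy frame in which the kept field occupies every other row):

```python
import numpy as np

def deinterlace_linear(frame, keep="even_rows"):
    """Keep one field and rebuild the missing rows as the average of the
    neighboring kept rows (vertical linear interpolation only)."""
    out = frame.astype(float).copy()
    start = 1 if keep == "even_rows" else 0  # rows to reconstruct
    for y in range(start, frame.shape[0], 2):
        above = out[y - 1] if y > 0 else out[y + 1]
        below = out[y + 1] if y + 1 < frame.shape[0] else out[y - 1]
        out[y] = (above + below) / 2.0
    return out.astype(frame.dtype)
```

Averaging only vertically is what produces the jagged diagonals mentioned above; adaptive methods interpolate along the local edge direction instead.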
Linear deinterlacing
Adaptive deinterlacing linear deinterlacing / an example of adaptive deinterlacing
Multi-frame deinterlacing It is possible to perform a better deinterlacing by putting together information coming from different frames. If the scene is static, I can copy the missing lines from the previous or the next frame. If there is movement, we can estimate and compensate it to better reconstruct the missing lines. These are advanced techniques, very often used in modern devices.
Digital formats Suppose we store a PAL video digitally: the needed storage for every second would be: 576 x 720 (resolution) x 24 bits (8 bits per channel) x 50 Hz = about 60,750 kilobytes (roughly 59 MB) per second!!! This means that a CD could store little more than 10 seconds of video. To digitally store video sequences, very advanced compression techniques must be used.
Video codecs A video codec is a software component (or a hardware device) which can encode (compress for storage) and decode (decompress for display) a video stream. Compression techniques have improved dramatically in recent years. Generally codecs are lossy, so the right balance must be found between compression, loss of quality and computational cost. Video formats differ not only in the codec used, but also in the type of file used for storage, called the container. For example DIVX is a codec, AVI is the container. Codecs are identified by a tag, called FOURCC (four character code), saved inside the stream; formats are identified (partially) by the file extension.
Most popular codecs H.261, first good compression standard, used for videoconferencing and videotelephony. MPEG-1, used in the VideoCD format (VCD). MPEG-2, used in DVDs and SVCDs. H.263, current standard for videotelephony, videoconferencing and content streaming over the Internet. MPEG-4 (and its H.264 variant), state of the art in video compression, extremely widely adopted (DivX, XviD, 3ivx, WMV). Sorenson 3, used by Apple QuickTime; it may be considered the precursor of H.264. RealVideo, very popular some years ago, not much used anymore.
Video processing techniques Instead of considering every frame as a separate image, frames can be processed taking the temporal information into account as well. By putting together information coming from different frames it is possible to obtain much better results than from single images.
Frame integration If we have a sequence of images of the same scene, disturbed by zero-mean random noise, we can average the corresponding pixels across the different frames. With an infinite number of frames to average, the noise would tend to zero, allowing the clean image to be reconstructed. We will never have an infinite set of images... ...but even with a small number the results are noticeable!
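Frame integration itself is a one-liner once the frames are aligned (a sketch, not from the slides; assumes a list of same-sized grayscale NumPy frames):

```python
import numpy as np

def integrate_frames(frames):
    """Average a list of aligned frames; zero-mean random noise shrinks
    roughly as 1/sqrt(N) while the static scene is preserved."""
    stack = np.stack([f.astype(float) for f in frames])
    return stack.mean(axis=0).astype(frames[0].dtype)
```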
Example Average of 10 images with random Gaussian noise
Image registration (I) To apply frame integration the images must represent exactly the same scene. This hypothesis is not always verified: if the camera is not fixed, or there is some moving subject, the scene actually changes in every frame. The process used to align two images (or some details of them) is called registration. In general, registration is the process of aligning two or more images representing the same scene, acquired at different moments, with different sensors or from different points of view. It is a necessary step before most image processing applications that combine different frames together.
Image registration (II) The simplest type of image registration is the alignment of images differing by a simple translation, for example to stabilize a shaking scene. 10-frame sequence / average without registration / average with registration
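One common way to register frames that differ by pure translation is phase correlation; here is a sketch (not from the slides; it estimates integer shifts only and assumes two same-sized grayscale NumPy frames):

```python
import numpy as np

def estimate_shift(ref, moving):
    """Estimate the (dy, dx) translation between two frames from the peak
    of the normalized cross-power spectrum (phase correlation)."""
    F1 = np.fft.fft2(ref)
    F2 = np.fft.fft2(moving)
    cross = np.conj(F1) * F2
    corr = np.real(np.fft.ifft2(cross / (np.abs(cross) + 1e-12)))
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = ref.shape  # unwrap shifts larger than half the frame size
    return (dy - h if dy > h // 2 else dy,
            dx - w if dx > w // 2 else dx)

# np.roll(moving, (-dy, -dx), axis=(0, 1)) then realigns `moving` with
# `ref`, after which the frames can be integrated as above.
```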
Image registration (III) More advanced techniques allow correcting other kinds of transformations too, such as the perspective effect. 6-image sequence / without registration / with registration
Multiplexing Often the information coming from several cameras is stored together on the same support, interleaving frames coming from different cameras at different locations: cameras -> multiplexer -> VCR -> monitor
Demultiplexing When analyzing a video it is very inconvenient to search every time for the next frame of a location of interest, since the acquisition sequence usually is not regular and it is not possible to know where the next desired frame is located. The operation which divides a multiplexed stream into separate video sequences is called demultiplexing. Demultiplexing can be done automatically: by the system saving the actual sequence log at the multiplexing stage (very uncommon); by later separation using frame similarity criteria (difference, correlation...), as in the sketch below.
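A toy version of similarity-based demultiplexing (not from the slides; it assumes the first frames come from distinct cameras, which real footage does not guarantee, and uses the mean absolute difference as the similarity criterion):

```python
import numpy as np

def demultiplex(frames, n_cameras):
    """Route each frame to the camera whose most recent frame it most
    resembles, updating that camera's reference as we go."""
    refs = [f.astype(float) for f in frames[:n_cameras]]
    streams = [[f] for f in frames[:n_cameras]]
    for f in frames[n_cameras:]:
        diffs = [np.mean(np.abs(f.astype(float) - r)) for r in refs]
        cam = int(np.argmin(diffs))
        streams[cam].append(f)
        refs[cam] = f.astype(float)
    return streams
```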
Demultiplexing Sometimes I also need to deinterlace the image...
Motion detection Surveillance systems often have automatic alarms to alert the responsible personnel when motion is detected in locations where nothing should happen, for example a forbidden place. The technique used to accomplish this is called motion detection. The most basic techniques consist in calculating how much the current frame differs from the previous one or from a reference one (sometimes called the background); if this difference, which can be calculated in several ways, is bigger than a certain threshold, the system alerts the operator.
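A minimal version of the threshold test just described (not from the slides; both parameters are illustrative and assume 8-bit grayscale frames):

```python
import numpy as np

def motion_detected(frame, background, threshold=25, min_fraction=0.01):
    """Flag motion when enough pixels differ from the reference image
    by more than `threshold` gray levels."""
    diff = np.abs(frame.astype(float) - background.astype(float))
    changed = np.mean(diff > threshold)  # fraction of changed pixels
    return changed > min_fraction
```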
Motion detection reference image / image with no motion -> difference image / image with motion -> difference image
Conclusions Working on video rather than single images gives us more powerful techniques to restore the footage. In order to efficiently store and transmit video, it must be encoded in some compressed format.
Forensic Image Processing Lesson 7 3d reconstruction
Outline Stereoscopic vision 3d reconstruction and photogrammetry Perspective correction and rectification Geometric distortion correction
Stereoscopic vision To get three-dimensional information from images (which are two-dimensional), stereoscopic vision is necessary. The real world image is projected in a different way onto the two human eyes, and thanks to this difference we are able to evaluate relative distances between objects. Closer objects are projected farther apart on our retinas, while distant objects are projected closer together.
Pin-hole camera model In a camera the image is projected onto a plane (while the retina is curved). The fundamental model for image formation is called the pin-hole camera model. It assumes that every point of the image is generated as the direct projection of the real point through an optical center.
Photogrammetry The techniques employed to calculate distances in 3d space from their 2d representations (images) are called photogrammetry. Photogrammetry is used in many different fields: forensics; architecture; geology; archeology; topography;...
Projection matrix In order to use a camera to take 3d measurements it must be calibrated: this means finding the mathematical relationship between three-dimensional points in the real world and where they appear in the image. This relationship is called the projection matrix:

$$\begin{pmatrix} su \\ sv \\ s \end{pmatrix} = \begin{pmatrix} q_{11} & q_{12} & q_{13} & q_{14} \\ q_{21} & q_{22} & q_{23} & q_{24} \\ q_{31} & q_{32} & q_{33} & q_{34} \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}$$

(su, sv, s) are the scaled image coordinates (they must be divided by s to obtain the real image coordinates); (X, Y, Z) are the 3d coordinates; the matrix of q values is the projection matrix.
Calibration Calibration needs a test object on which some accurately measured 3d positions have been located. These 3d positions are related, through the projection matrix, to the image positions where the points appear.
3d reconstruction It is possible to reconstruct, from the relation between two or more photos, the 3d features (and thus the positions) of a scene.
Single view metrology Humans are able to get a lot of 3d information from a single photograph. Starting from this consideration, other measurement techniques have been developed.
Perspective Perspective is the effect that allows us to get 3d information about a scene from its two-dimensional representation. Two parallel lines viewed in perspective will converge to a point. Two sets of parallel lines in different directions in a plane determine two vanishing points. Two vanishing points determine the vanishing line of the plane. All lines lying on planes parallel to this one will converge to points belonging to the vanishing line. Finding vanishing points and lines is needed to reconstruct 3d information from perspective images.
Vanishing lines and vanishing points vertical vanishing line / vanishing points / vanishing line
Cross ratio If we have four aligned points X1, X2, X3, X4, the ratio of ratios of the lengths of the segments they determine remains constant under a perspective transformation:

$$\frac{(X_3 - X_1)/(X_3 - X_2)}{(X_4 - X_1)/(X_4 - X_2)} = \text{constant}$$

To understand... On the image, calling the four points A, B, C, D, with pixel distances AC = 488, BC = 173, AD = 596, BD = 282: cross ratio = (488/173) / (596/282) = 1.334
In practice... The cross ratio measured on the image is equal to the same cross ratio measured in the real world. If we can measure some lengths on reference objects in the real world, we can calculate any unknown length from the image. With known real-world lengths BC = 1, BD = 2, AD = 3 and unknown length AC = x: (x/1) / (3/2) = 1.334, hence x = 2 (see the sketch below).
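The whole computation fits in a few lines (a sketch, not from the slides, using the pixel distances and real-world lengths of the example above):

```python
def cross_ratio(ac, bc, ad, bd):
    """Cross ratio (AC/BC) / (AD/BD) of four collinear points."""
    return (ac / bc) / (ad / bd)

# Image side, with the measured pixel distances:
cr_image = cross_ratio(488, 173, 596, 282)   # ~1.334

# Real-world side: BC = 1, BD = 2, AD = 3 are known, AC = x is unknown.
# (x / 1) / (3 / 2) = cr_image  =>  x = 1.5 * cr_image
x = 1.5 * cr_image                           # ~2.0
print(cr_image, x)
```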
Very useful!
Rectification Rectification removes the perspective effect from a plane, making it parallel to the image plane. Useful: to measure and calculate two-dimensional ratios to improve the visualization of a scene
Rectification rectified sidewalk / rectified fence
Rectification original picture / as seen from the top
This is difficult... Noticeable result, but far from perfect because of the geometric distortions and the strong perspective to correct.
Geometric distortion The pin-hole camera model is not accurate in most real cases. The camera actually introduces geometric distortions that may modify the appearance of an object in the scene. Most of the distortions are caused by the optics and depend on the lens characteristics. The most common effect is to transform straight lines into curves; this effect is most noticeable with wide-angle optics.
Distortion correction (I)
Distortion correction (II)
Why is it important? It's a preparatory step before other transformations (for example perspective correction). Distortion can heavily modify the characteristics of the scene or of the subjects, especially at the borders of the image. Distortion must be corrected before measuring positions and distances, otherwise the calculations will be biased.
Geometric transformations (I) Perspective modification and distortion correction are geometric transformations. Many other geometric transformations exist, for example enlargement (which we've seen with interpolation) and rotation. The procedure for all geometric transformations consists in calculating, by mathematical operations, the new position of each point starting from its position in the original image.
Geometric transformations (II) original image grid / transformed image grid
Interpolation again At the positions calculated by a geometric transform I have to estimate the values of the corresponding pixels. In general I don't know the pixel values at these positions, since they do not coincide with the original pixel grid. I must calculate the output pixel values from the original ones using the interpolation techniques seen for image enlargement.
Conclusions Obtaining information about the 3d world from a 2d image is not easy, but can be very important. I can reconstruct the real appearance of a scene. I can (partially) modify the point of view of a scene. I can measure things in a scene.
Further reading Photogrammetry: Criminisi, Reid, Zisserman, "Single View Metrology", International Journal of Computer Vision, 2000. http://www.robots.ox.ac.uk/~vgg/publications/papers/criminisi00a.pdf http://www.robots.ox.ac.uk/~vgg/presentations/spie98/criminis/p3.html
Forensic Image Processing Lesson 8 Biometrics
Biometric recognition Biometric recognition employs physiological or behavioral characteristics of a person to determine his/her identity.
Necessary features (I) What kind of measures can be considered biometric? Every characteristic that is: universal: everyone should have it; distinguishable: for any two different persons, the characteristic must be sufficiently different; permanent: it must be sufficiently invariant over a certain time period; measurable: it must be quantitatively measurable.
Practical features (II) In real identification systems other aspects must be taken into consideration: Performance: accuracy and speed of the recognition necessary resources external factors influencing the characteristics Acceptability: willingness of people to accept a certain identification method in their everyday life Cheating possibility: how easily the system can be cheated
Biometric identifiers Physiological: DNA, ear shape, face, facial thermogram, fingerprint, hand geometry, hand vein, iris, odor, handprint, retina... Behavioral: handwriting, voice, gait, posture, keyboard typing...
Characteristics Every method has different features. The table compares the features of different biometrics according to the perception of the author of the article "An introduction to biometric recognition" (see Further reading).
Biometric systems A biometric system is composed of four basic components: 1. sensor module: acquires the biometric data; 2. feature extraction module: processes the acquired data to obtain numeric parameters from it; 3. matching module: the parameters are compared with reference ones; 4. decision-making module: establishes the subject's identity, or confirms or rejects the declared identity.
Identification and verification Biometric systems can be used for two different purposes: identification: who is this person? verification: is this person really who they declare to be? Identification is much more complex than verification. When we speak about recognition we must clearly distinguish between identification and verification. For many practical applications verification is sufficient.
Procedure example
Performance and errors (I) Two different acquisitions of the same biometric feature of the same person are never exactly the same because of: imperfections in the sensor and differences in the acquisition process; changes in the characteristics of the subject; environmental conditions; interaction of the subject with the sensor. Generally the output of a biometric system is a number indicating the matching score between the acquired data and one element of the database.
Performance and errors (II) Decisions of the system are regulated by a threshold t: pairs of biometric data with a matching score bigger than t are considered to belong to the same person. A biometric system can commit two types of error: false match: attributing to the same person two measures actually taken from two different persons; false non-match: attributing to different persons two measures actually taken from the same person. We must find the right compromise between the false match rate (FMR) and the false non-match rate (FNMR) in every biometric system.
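Given collections of genuine (same-person) and impostor (different-person) matching scores, the two error rates at a threshold t can be estimated as follows (a sketch, not from the slides):

```python
import numpy as np

def error_rates(genuine_scores, impostor_scores, t):
    """FMR and FNMR at threshold t, from empirical matching scores."""
    fmr = np.mean(np.asarray(impostor_scores) >= t)   # impostors accepted
    fnmr = np.mean(np.asarray(genuine_scores) < t)    # genuine rejected
    return fmr, fnmr
```

Sweeping t trades one error for the other: raising it lowers the FMR but raises the FNMR, which is exactly the compromise described above.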
Performance and errors (III)
Some biometric techniques A brief introduction to some biometric techniques: face recognition; iris recognition; fingerprint recognition.
Face recognition Two main different approaches: position and shape of some features (nose, lips, eyes...) global analysis of the face as a weighted average of some standard faces. Very sensitive to changing light conditions and different angles of view. Unobtrusive; it may be very useful in some situations, less effective in others.
Basic idea Database composed of faces normalized in size and position. average face
Basic idea A certain statistical analysis allows calculating some reference faces ("eigenfaces"). Every face can be described as the average face plus a weighted sum of the reference faces. A face can thus be encoded by the weights of the reference faces needed to represent it. Eigenface 1 / Eigenface 2 / Eigenface 3 / Eigenface 4 / Eigenface 5
Reconstruction Face reconstruction starting from the average face, progressively adding reference faces. original face / reconstructed face
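The statistical analysis behind eigenfaces is principal component analysis; a minimal sketch (not from the slides; assumes `faces` is an (N, H*W) NumPy array of size- and position-normalized face images):

```python
import numpy as np

def eigenfaces(faces, k=5):
    """Return the average face, the first k eigenfaces, and the weights
    encoding each input face in that basis (PCA via SVD)."""
    mean = faces.mean(axis=0)
    centered = faces - mean
    # Rows of Vt are the principal directions: the eigenfaces.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    basis = Vt[:k]
    weights = centered @ basis.T   # k numbers encode each face
    return mean, basis, weights

# A face is then approximately reconstructed as:
#   mean + weights[i] @ basis
```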
Iris recognition Encoding the iris features allows a very fast and precise recognition: the encodings of two different eyes differ on average in 50% of their values; the probability that they differ by less than 30% is almost zero. The iris is a unique feature of every person, even in the case of twins. It's very difficult to cheat the system. But: it's very obtrusive and needs user interaction; the systems are rather expensive.
Basic idea (I) The iris image is unrolled
Basic idea (II) The iris lines are encoded by their pattern.
Fingerprints (I)
Fingerprints (II) A fingerprint is characterized by ridges and valleys. Fingerprints differ from person to person (even in twins). Recognition systems are quite cheap and reasonably precise. Recognition methods (especially for identification) are computationally heavy. For some categories of subjects or environmental conditions the method can be inaccurate (manual workers with cuts on the skin, sweating hands...).
Features (I) Some particular patterns are more frequent. (image: Federal Bureau of Investigation Educational Internet Publication)
Features (II)
Features (III) Fingerprints are characterized by: patterns (loop, whorl, arch...) ridges between pairs of minutiae type, direction, and position of minutiae position of the pores A minutia is a bifurcation or a termination; a clear distinction between them is generally ignored, since one is the negative of the other and a small modification of the print can transform a bifurcation into a termination and vice versa.
Minutiae (I) positions of the minutiae of a fingerprint / positions and directions of the minutiae of a fingerprint
Minutiae (II) The comparison of all the possible pairs of minutiae is a complicated and computationally heavy process.
Pattern recognition (I) Biometric techniques belong to the wider field called pattern recognition. Pattern recognition techniques aim at extracting a numeric description from any kind of information (in our case, visual information) for comparison and recognition against some reference features. Application examples are the classification of footprints, tire marks, logos on drug pills, bullets...
Pattern recognition (II) pills database / shoeprint database
Conclusions The usage of biometric recognition is expanding widely. One of the main problems is acceptance by the subjects involved in the recognition process. The employed methods must be very reliable and unobtrusive. Similar techniques are also used to treat non-biometric data.
Further reading Biometrics: Jain, A.K.; Ross, A.; Prabhakar, S., "An introduction to biometric recognition", IEEE Transactions on Circuits and Systems for Video Technology, Volume 14, Issue 1, Jan. 2004, pages 4-20. DOI 10.1109/TCSVT.2003.818349 Pattern recognition: Zeno Geradts and Jurrien Bijhold (Netherlands Forensic Institute), "Overview of pattern recognition and image processing in forensic science", http://www.geradts.com/anil/ij/vol_001_no_002/paper005.html