
APPLICATIONS OF MULTIMEDIA FORENSICS

DISSERTATION

Submitted in Partial Fulfillment of the Requirements for the Degree of DOCTOR OF PHILOSOPHY (Electrical Engineering) at the POLYTECHNIC INSTITUTE OF NEW YORK UNIVERSITY

Sevinc Bayram

January 2012


Microfilm or other copies of this dissertation are obtainable from UMI Dissertation Publishing, ProQuest CSA, 789 E. Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI

Vita

Sevinç Bayram was born in Bursa, the lovely green city of Turkey. She received her B.Sc. and M.Sc. degrees in Electronics Engineering from Uludag University, Bursa, Turkey, in 2002 and 2005, respectively. In her B.Sc. studies, she researched the fingerprint classification problem. During her M.Sc. studies, she was introduced to the world of Multimedia Forensics, for which she developed a passion. In 2006, Sevinç Bayram started working towards her Ph.D. degree at Polytechnic Institute of NYU. In 2010, she was a summer intern at Dolby Labs, where she worked on audio forensics techniques. Her research interests include all aspects of Multimedia Forensics, including Tamper Detection, Source Device/Model Identification, Computer Generated Image Identification, Efficient Techniques in Multimedia Forensics for Large Databases, and applications of Multimedia Forensics Techniques.

To my dear parents, Mustafa and Seviye Bayram

Acknowledgements

First and foremost, I would like to thank my supervisor Prof. Nasir Memon for welcoming me into his research group, for his continuous help and guidance, and for the many insightful discussions he provided. I would also like to thank him for his unlimited patience and tolerance and for his great personality, not only guiding me in research but also guiding and helping me in other aspects of life. It was, and still is, an honor to know and work with him. I wish to express my sincere thanks to Professor Hüsrev Taha Sencar for always being there when needed, for sharing his valuable ideas with me, and for being a very good advisor, a very good friend and a very good example. I cannot emphasize enough how much I benefited from his knowledge, wisdom and personality. I would also like to thank Professors Yao Wang and Ivan Selesnick for serving on my thesis committee. I was very fortunate to have them as my professors on very important subjects which I use daily in my research. I would like to especially thank them for being an inspiration and for saving me with their lecture notes (which I check quite often) whenever I am stuck.

I was also very fortunate to have Ismail Avcibas as my Master's thesis advisor, who introduced me to multimedia forensics problems. His supervision and support are truly appreciated. I would like to take this opportunity to also thank all the members of ISIS LAB. I have learned a lot from each of them. My life in this country would not be livable without my dear friends Anagha Mudigonda, Kagan Bakanoglu, Naren Venkatraman, Cagdas Dogan, Senem Acet Coskun, Baris Coskun, Ozgu Alay, Yagiz Sutcu, Kurt Rosenfeld, Napa Sae-Bae, and Apuroop Gadde, and I am grateful to each of them for being there for me all the time. Special thanks to my better half Dervis Salih for his love and support; for being my best friend, listening to all my complaints and, no matter what, finding a way to make me happy. I would like to especially thank him for his efforts to make me a better researcher, for motivating me to work harder and for setting a great example himself. Last but not least, I would like to express my gratitude to my beloved family for their never ending love and support; to my parents, who raised me with a love towards education and science and supported me in all my pursuits; to my sister and brother for their continuous encouragement; and to my two beautiful nieces, Esra and Serra, for adding joy to my life.

AN ABSTRACT

APPLICATIONS OF MULTIMEDIA FORENSICS

by Sevinç Bayram

Nasir Memon, Advisor

Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy (Electrical Engineering)

January 2011

In recent years, the problem of multimedia source verification has received rapidly growing attention. To determine the source of a multimedia object (image/video), several techniques have been developed that can identify characteristics that relate to the physical processes and algorithms used in its generation. In particular, it has been shown that noise-like variations in images and videos, due to the different light sensitivity of pixels, can be accurately measured and used as a fingerprint of an imaging sensor. The presence of a sensor fingerprint in a multimedia object provides evidence that the given multimedia object was captured by that exact sensor. Motivated by this, in this thesis, we investigate the different

application potentials of the sensor fingerprint matching technique. For this purpose, we first focus on using sensor fingerprints in a source identification scenario, where the aim is to find, in a timely fashion, the multimedia objects captured by a given device in a large collection of multimedia objects. The associated fingerprint matching method can be computationally expensive, especially for applications that involve large-scale databases (e.g., YouTube or Flickr). To overcome these limitations, we propose two different approaches. In the first approach, we propose to represent sensor fingerprints in binary-quantized form. While a significant improvement in efficiency is achieved with this approach, it is shown through both analytical study and simulations that the reduction in matching accuracy due to quantization is insignificant compared to conventional approaches. Experiments on actual sensor fingerprint data are conducted to confirm that there is only a slight increase in the probability of error and to demonstrate the computational efficacy of the approach. In our second approach, we present a binary search tree (BST) data structure based on group testing to enable fast identification. Our results on real-world and simulation data show that a major improvement in search time can be achieved with the proposed scheme. The limitations of the BST are also shown analytically. Furthermore, we demonstrate how to use device characteristics for the conventional content-based video copy detection task. We show the viability of our scheme by both analyzing its robustness against common video processing operations and evaluating its performance on real-world data, including controlled video

sequences as well as videos downloaded from YouTube. Our results show that the proposed scheme is very effective and suitable for the video copy detection application.

Publications

Publications Related to Thesis

S. Bayram, H. T. Sencar, N. Memon, Video Copy Detection Based on Source Device Characteristics: A Complementary Approach to Content-Based Methods, ACM International Conference on Multimedia Information Retrieval, October 2008, Vancouver, CA. (5% oral presentation acceptance rate)

S. Bayram, H. T. Sencar, N. Memon, Efficient Techniques For Sensor Fingerprint Matching In Large Image & Video Databases, SPIE Electronic Imaging, January 2010, San Jose, CA.

S. Bayram, H. T. Sencar, N. Memon, Efficient Sensor Fingerprint Matching Through Fingerprint Quantization, manuscript accepted to appear in IEEE Transactions on Information Forensics and Security.

S. Bayram, H. T. Sencar, N. Memon, Efficient Video Copy Detection Based on Source Device Characteristics, manuscript in preparation to be submitted to IEEE Transactions on Multimedia.

S. Bayram, H. T. Sencar, N. Memon, Group Testing Based Sensor Fingerprint Identification in Large Databases, manuscript in preparation to be submitted to IEEE Transactions on Information Forensics and Security.

Other Publications During PhD Studies

Journal

S. Bayram, H. T. Sencar, N. Memon, Classification of digital camera-models based on demosaicing artifacts, Journal of Digital Investigation, Volume 5, Issues 1-2, September 2008. (Editorial in New Scientist Magazine, Issue 2682, Page 30, 14 November 2008)

S. Bayram, J. Ma, P. Tao, V. Svetnik, High-throughput Ocular Artifact Reduction in Multichannel Electroencephalography (EEG) Using Component Subspace Projection, Journal of Neuroscience Methods, March 15;196(1).

S. Bayram, J. Ma, P. Tao, V. Svetnik, Muscle Artifacts in MultiChannel EEG: Characteristics and Reduction, accepted to Clinical Neurophysiology.

S. Bayram, H. T. Sencar, N. Memon, Ensemble Systems for Steganalysis, manuscript submitted to IEEE Transactions on Information Forensics and Security.

Conference

S. Bayram, H. T. Sencar, N. Memon, and I. Avcibas, Improvements on source camera-model identification based on CFA interpolation, Proc. of WG 11.9 International Conference on Digital Forensics, 2006, Florida.

Y. Sutcu, S. Bayram, H. T. Sencar, and N. Memon, Improvements on sensor noise based source camera identification, IEEE International Conference on Multimedia and Expo (ICME), 2007, Beijing, China.

A. E. Dirik, S. Bayram, H. T. Sencar, and N. Memon, New Features to Identify Computer Generated Images, IEEE International Conference on Image Processing (ICIP), 2007, San Antonio, TX.

S. Bayram, H. T. Sencar, N. Memon, A Survey of Copy-Move Forgery Detection Techniques, IEEE Western New York Image Processing Workshop, September 2008, NY. (Best student paper award)

S. Bayram, H. T. Sencar, N. Memon, An Efficient and Robust Method For Detecting Copy-Move Forgery, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2009, Taipei, Taiwan.

S. Bayram, A. E. Dirik, H. T. Sencar and N. Memon, An Ensemble of Classifiers Approach to Steganalysis, ICPR, 2010.

Contents

Vita
Acknowledgements
Publications
1 Introduction
  1.1 Contributions of Thesis
  1.2 Organization of the Thesis
2 Background
  2.1 Imaging sensor output model
  2.2 PRNU noise estimation
  2.3 Verification with PRNU
    Robustness of PRNU and Anti-Forensics
3 Efficient Sensor Fingerprint Identification In Large Image & Video Databases

  3.1 Through Sensor Fingerprint Quantization
    Effect on Matching Performance
    Computation and Storage Aspects
    Experimental Results
  3.2 Through Binary Search Tree Structure Based on Group Testing
    Binary Search Tree
    Building The Tree by Hierarchical Clustering
    Retrieving Multiple Objects
    Experimental Results
4 Video Copy Detection Based on Source Device Characteristics: A Complementary Approach to Content-Based Methods
    Camcorder Identification based on PRNU
    Obtaining Source Characteristics Based Video Signatures
    Robustness Properties of Video Signatures
      Contrast Adjustment
      Brightness Adjustment
      Blurring
      AWGN Addition
      Compression
      Random Frame Dropping
    Performance Evaluation

5 Conclusions and Future Directions

List of Figures

1.1 High Level Description of Proposed Multimedia Source Device Identification System
2.1 A simplified depiction of an imaging pipeline within a camera and the different components involved
2.2 PRNU extraction and verification process
3.1 The change in correlation due to binary-quantization of sensor fingerprints. The solid line shows the theoretical correlation between the two real-valued fingerprints, the dashed line shows the correlation between a real and a binary fingerprint, and the dotted line shows the correlation between two binary fingerprints. Circles show the correlation between two real-valued fingerprints obtained through numerical simulation. Similarly, stars show the correlation between a real- and a binary-valued fingerprint and rectangles show the correlation between two binary-valued simulated fingerprints

3.2 The change in PCE due to binarization of fingerprints. The solid line shows the PCE between two real simulated fingerprints, the dashed line shows the PCE between a real and a binary simulated fingerprint, and the dotted line shows the PCE between two binary simulated fingerprints
3.3 ROC curves comparing performance when fingerprint matching involves (a) only real-valued fingerprints, (b) real- and binary-valued fingerprints, and (c) only binary-valued fingerprints
3.4 The distribution of intra-camera correlations for (a) Camera A, (b) Camera B, and (c) Camera C, obtained between the camera fingerprints and fingerprints of images captured by the same camera under the three settings
3.5 The distribution of inter-camera correlations for (a) Camera A, (b) Camera B, and (c) Camera C, obtained between the camera fingerprints and fingerprints of images captured by other cameras under the three settings

3.6 The distribution of correlations. (a), (b), and (c) correspond to Camera A, (d), (e), and (f) to Camera B, and (g), (h), and (i) to Camera C. The first column is the correlation between real-valued fingerprints, the second column is the correlation between real-valued and binary-valued fingerprints, and the last column is the correlation between two binary-valued fingerprints
3.7 The accuracies for varying k values in each case for Camera C. The solid line shows the real-real case, the dashed line shows the real-binary case and the dotted line shows the binary-binary case
3.8 Distribution of correlation between a query fingerprint and a composite when the composite does not contain a matching fingerprint. The fingerprints are n = 10^7 in size and the composite contains N = . The red line shows the analytical findings and the blue lines show the results on simulation data
3.9 Distribution of correlation between a query fingerprint and a composite when the composite contains a matching fingerprint. The fingerprints are n = 10^7 in size and the composite contains N = . The red line shows the analytical findings and the blue lines show the results on simulation data
3.10 ROC curves comparing performance when a fingerprint is correlated with composite fingerprints, and n =

3.11 Binary search tree
3.12 The distribution of correlation values using the fingerprint of Sony Cybershot P72 and the fingerprint estimates from Sony Cybershot S90 (green) and Sony Cybershot P72 (purple)
3.13 (a) Precision-Recall diagrams for 3 cameras when the group based approach is used and for Sony S90 when the tree is built by random splitting. (b) Precision-Recall diagrams for Sony S90 with different quality fingerprints
4.1 (a) A video and its contrast enhanced duplicate. (b) A video and its advertisement overlaid version. (c) Two videos taken at slightly different angles. (d) Similar but not duplicate videos
4.2 The distribution of inter- and intra-correlation values. Distributions in blue indicate the correlation values of fingerprints associated with the videos shot by the reference camcorder. Distributions in red indicate the correlation values of fingerprints between the reference camcorder and other camcorders. The bit-rate of videos and the number of frames in each segment are (a) 1 Mbps and 1000 frames, (b) 1 Mbps and 1500 frames, (c) 2 Mbps and 1000 frames, and (d) 2 Mbps and 1500 frames
4.3 Distribution of correlation values obtained by correlating signatures of 50 video clips with each other

4.4 The distribution of correlation values. The blue distribution is obtained by correlating the unmodified video clips and their modified versions, red is obtained by correlating video clips with the ones coming from the same camcorder, and green is obtained by cross correlation of different videos, for different types of manipulations. (a) Decreased contrast. (b) Increased contrast. (c) Decreased brightness. (d) Increased brightness. (e) Blurring. (f) AWGN addition. (g) Compression. (h) Random frame dropping
4.5 The change in mean of correlation values as a function of the strength of (a) contrast increase and (b) contrast decrease
4.6 The change in mean of correlation values as a function of the strength of (a) brightness decrease and (b) brightness increase
4.7 The change in mean of correlation values as a function of the strength of (a) blurring and (b) AWGN addition
4.8 The change in mean of correlation values as a function of the strength of (a) compression and (b) frame dropping
4.9 (a) Cross-correlation of extracted video signatures. (b) ROC curve for detection results on the videos downloaded from YouTube
4.10 Video copies for which the extracted signatures are dissimilar
4.11 Video copies with similar signatures
4.12 Different videos with similar signatures. corr(a,b) =

4.13 The frames of four example videos from a commercial series
4.14 The distribution of correlation values obtained from the commercial series. The red distribution is obtained by pair-wise correlations of each video and the blue distribution is obtained by correlation of composite videos

List of Tables

3.1 Comparison of resource requirements of different methods proposed for fingerprint matching
3.2 Change in True Positive Rates due to Binarization Using Camera-Dependent Thresholds
3.3 Change in Probability of Error Due to Binary-Quantization
3.4 Performance Results with Binary Search Tree

Chapter 1

Introduction

Recent research in digital image and video forensics [1] has shown that media data has certain characteristics that relate to the physical mechanisms and algorithms used in its generation. These characteristics, although imperceptible to the human eye, get embedded within multimedia data and are essentially a combination of two interrelated factors: first, the class properties that are common among all devices of a brand and/or model; and second, the individual properties that set a device apart from others in its class. For example, for image data, many approaches have been demonstrated to identify the class of the device that created the image (for example, was the picture taken by a Sony camera or an iPhone camera) based on the properties of the components used in the imaging pipeline, such as the type of the color filter array used, the type of lens, compression parameters, or the specifics of the demosaicing (interpolation)

technique [2, 3, 4, 5, 6, 7, 8, 9, 10]. It has also been shown that certain unique low-level characteristics of the image, such as the noise-like characteristics of the imaging sensor [11, 12, 13] and traces of sensor dust [14], can be successfully used in determining the specific source device used to capture the image (for example, was this picture taken with this specific camera, even though both are Sony?). While the existence of multimedia forensics techniques is essential in determining the origin, veracity and nature of media data, these techniques have a potential for much wider applications. Motivated by this, in this thesis, we show how source device characteristics can be used for several other multimedia applications. We particularly focus on the applications of one of the most successful unique source camera verification techniques, namely, the Photo Response Non-Uniformity (PRNU) noise-based sensor fingerprint matching method [11, 15]. It is well established now that any sensor leaves a unique fingerprint in every image/frame captured with the sensor, much like how every gun leaves unique scratch marks on every bullet that passes through its barrel. Furthermore, this unique fingerprint is hard to remove or forge and survives a multitude of operations performed on the image, such as blurring, scaling, compression, and even printing and scanning. The sensor fingerprint matching technique has been shown to be very successful when a one-to-one comparison of two cameras/camcorders is conducted. Therefore, it can be very useful in cases where legal entities want to verify whether an image/video with illegal content has been captured by a suspect's device.

Figure 1.1: High Level Description of Proposed Multimedia Source Device Identification System

Although existing capabilities today can reliably verify the source of an image or video, there are many cases that require solutions far beyond current capabilities. Consider the following scenario where a legal entity gets hold of some illegal (e.g., child pornographic, terror related) content, but there is no suspect or any other evidence immediately available. Now, it is highly conceivable that the owner of the device that captured this content also has publicly available multimedia data, such as in a Flickr or Facebook account. In this practical scenario, the question arises whether it is possible to search a large collection of content like Flickr or Facebook (or a database accessible only to the legal entity) to link the illegal content at hand with other content in the available multimedia collection. Considering these facts, the most straightforward application of the sensor fingerprint matching technique would be source device (camera/camcorder) identification, where the aim is to find the multimedia object(s) captured by a given device in a large database of objects. While the identification problem can be solved by performing multiple one-to-one matchings, this would require a number of comparisons linear in the size of the database; since the fingerprint size is very large, this is clearly not feasible. There are two obvious choices to increase the matching speed in large databases. In the first approach, the fingerprint size can be decreased by obtaining a compact representation so that even linear comparisons can be performed. Another standard approach, as depicted in Figure 1.1, would be to organize and index the database such that one is able to quickly and efficiently search only a small part of the database, and find the objects whose

sensor fingerprints match the query object within a certain tolerance threshold. Moreover, we explore other application possibilities of sensor fingerprints besides forensics. In Chapter 4, we show how these fingerprints can be deployed to achieve the goals of conventional content-based video copy detection techniques. Video copy detection is defined as the automated analysis procedure used to identify duplicate and modified copies of a video among a large number of videos. This procedure can then be used for efficient indexing, copyright management and accurate retrieval of videos, as well as detection and removal of repeated videos to reduce storage costs. In this thesis, we develop a technique that adapts the sensor fingerprint matching technique for video copy detection purposes.

1.1 Contributions of Thesis

In this thesis, we explore the different applications of the sensor fingerprint matching method. The first application we consider is efficient source device identification in large databases. The straightforward approaches to achieve efficiency would be either to compress the fingerprints or to index the database so that it can be queried faster. While indexing and searching large collections of multimedia objects has been a well-studied problem with some remarkable engineering successes, indexing and querying device fingerprints poses some unique challenges, including the large dimensionality and randomness of device fingerprints, and a complicated matching procedure that is required to map the query to the nearest fingerprint. In order to

overcome these challenges, we propose two different approaches. The first approach shows that sensor fingerprint matching can be done significantly faster by binary quantization of the real-valued sensor fingerprint data. We show that the use of binary fingerprints can significantly decrease the storage space needed, the I/O time, and the matching time. Furthermore, we show through both numerical and experimental analysis that the reduction in matching performance due to the information loss from binarization is minimal. In the second approach, the main idea is to lower the computational complexity by reducing the number of matchings to be performed. This is realized by a group testing based approach, where the query fingerprint is matched with composite fingerprints instead of each fingerprint in the database. The method proposes to construct a binary search tree to index the fingerprints. The advantages of the approach are demonstrated on fingerprints created synthetically and also on fingerprints extracted from real-world images. The limitations of the method are shown analytically. Another application we propose to use sensor fingerprints for is video copy detection. In our approach, we use the fact that a video may be a combination of segments captured by several different camcorders. Hence, we propose to use the weighted combination of the fingerprints of the camcorders (i.e., imaging sensors) involved in the generation of a video as our video signature. We investigate the robustness of our video fingerprint under different manipulations. We also explore the viability of our scheme on videos downloaded from YouTube.

In summary, the contributions of this thesis are:

- We propose to compress sensor fingerprints to enable fast identification of multimedia objects in large databases captured by a given device. For this purpose, we propose to quantize each element of the fingerprint into a binary number.
- For the same application, we propose to index the sensor fingerprints by building a binary search tree based database. While the structure of the tree is studied in detail, approaches for building and updating the tree are also proposed.
- We adapt the sensor fingerprint matching technique to the video copy detection problem. We show how to get a reliable and robust video fingerprint using the sensor fingerprints of the camcorders used in the capturing process of the video.

1.2 Organization of the Thesis

In the next chapter, an overview of the sensor fingerprint matching method will be provided and the verification process will be described. Sensor fingerprint quantization and binary search tree methodologies for efficient sensor fingerprint identification in large databases will be detailed in Chapter 3. In Chapter 4, our video copy detection technique will be described. Finally, in Chapter 5, we will conclude the thesis and discuss future directions.

Chapter 2

Background

Advancements in sensor technology have led to many new and novel imaging sensors. However, images captured by these sophisticated sensors still suffer from systematic noise components such as photo response non-uniformity (PRNU) noise. The PRNU noise signal is caused mainly by impurities in silicon wafers. These imperfections affect the light sensitivity of each individual pixel and form a fixed noise pattern. Since every image captured by the same sensor exhibits the same pattern, PRNU noise can be used as a fingerprint of the sensor. In the rest of this chapter we briefly summarize the basic sensor output model, the PRNU estimation process, and the matching methodology.

Figure 2.1: A simplified depiction of an imaging pipeline within a camera and the different components involved.

2.1 Imaging sensor output model

In a digital imaging device, the light entering the camera through its lens is first filtered and focused onto sensor (e.g., CCD) elements which capture the individual pixels that comprise the image. The sensor is the main and most expensive component of a digital imaging device. Each light sensing element of the sensor array integrates the incident light and obtains a digital signal representation of the scenery. Generally, the sensor elements are monochromatic; therefore, for each pixel one color value is captured, typically red, green, or blue (RGB). Later, a demosaicing operation takes place to calculate the missing color values. This is followed by white balancing, colorimetric interpretation, and gamma correction. After these, noise reduction, anti-aliasing and sharpening are performed to avoid color artifacts. At the end, the image/video is compressed and saved in the device's memory [16]. A simplified version of an imaging pipeline is shown in Figure 2.1. For every color channel, let us denote the digital signal representation before demosaicing as I[i], and the incident light intensity as Y[i], where i = 1, ..., n specifies

a specific pixel. Below, all matrices shown in bold font are in vector form and all operations are element-wise. A simplified model of the sensor output can then be written as:

I = g^γ · [(1 + K)Y + Λ]^γ + Θ_q        (2.1.1)

In the above equation, g denotes the color channel gain and γ is the gamma correction factor. The zero-mean noise-like signal responsible for PRNU is denoted by K. This signal is called the sensor fingerprint. Moreover, Λ denotes the combination of all other additive noise sources such as dark current noise, shot noise and read-out noise. Finally, Θ_q denotes the quantization noise. To factor out the most dominant component, the light intensity Y, from this equation, the Taylor expansion (1 + x)^γ = 1 + γx + O(x²) can be used, yielding

I = (gY)^γ · [1 + K + Λ/Y]^γ + Θ_q = (gY)^γ · (1 + γK + γΛ/Y) + Θ_q        (2.1.2)

Finally, to simplify the notation and to reduce the number of symbols, γ can be absorbed into the PRNU factor K and the sensor output model can be written as:

I = I^(0) + I^(0)K + Θ        (2.1.3)

where I^(0) = (gY)^γ is the sensor output in the absence of noise, I^(0)K is the PRNU noise term, and Θ = γI^(0)Λ/Y + Θ_q is the composite of the independent random noise components.
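The following short sketch illustrates the simplified model of Eq. 2.1.3 numerically. The array size, noise levels and PRNU amplitude are arbitrary illustrative values chosen here (they are not taken from the thesis), and a real sensor output would of course depend on an actual scene Y.

```python
# Minimal numerical sketch of the simplified sensor output model of Eq. 2.1.3.
# All parameter values below are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000                            # number of pixels (in vectorized form)
K = 0.01 * rng.standard_normal(n)     # zero-mean PRNU factor (sensor fingerprint)
I0 = rng.uniform(50.0, 200.0, n)      # noise-free output I^(0) = (gY)^gamma
Theta = rng.standard_normal(n)        # composite independent random noise

I = I0 + I0 * K + Theta               # Eq. 2.1.3: I = I^(0) + I^(0)K + Theta
```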

Figure 2.2: PRNU extraction and verification process

2.2 PRNU noise estimation

To get a reliable estimate of the sensor fingerprint K, the camera in question or a collection of images taken by the camera/sensor is needed. Let us assume that L images are available from the camera. A denoised version of the image I is obtained using a denoising filter F, Î^(0) = F(I) [17]. To find a reliable estimate of the PRNU, Î^(0) should be subtracted from both sides of Equation 2.1.3, so that host signal rejection and noise suppression can be done to improve the signal-to-noise ratio between I^(0)K and I:

W = I − Î^(0) = IK + I^(0) − Î^(0) + (I^(0) − I)K + Θ = IK + Ξ        (2.2.1)

The noise term Ξ is a combination of Θ and the two terms introduced by the denoising filter. This noise can be non-stationary in textured areas; therefore, images with smooth regions help to obtain better PRNU estimates. The estimator for the sensor fingerprint K from L images I_1, I_2, ..., I_L, along with the Gaussian noise terms Ξ_1, Ξ_2, ..., Ξ_L of variance σ², can then be written as:

W_k / I_k = K + Ξ_k / I_k,   W_k = I_k − Î_k^(0),   Î_k^(0) = F(I_k)        (2.2.2)

where k = 1, 2, ..., L. Finally, the maximum likelihood estimate of the sensor fingerprint K̂ can then be written as:

K̂ = ( Σ_{k=1}^{L} W_k I_k ) / ( Σ_{k=1}^{L} (I_k)² )        (2.2.3)

The top row of Figure 2.2 shows the PRNU noise estimation process. This estimate is referred to as the fingerprint of the camera. Every image taken by a specific camera will have this PRNU as part of the image, which uniquely identifies the camera.
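A compact sketch of the estimator in Eqs. 2.2.2-2.2.3 is given below. The denoising filter F used in the thesis is the filter of [17]; here a simple Gaussian blur from SciPy stands in for it, so the code only illustrates the structure of the maximum likelihood estimate, not the exact filter used.

```python
# Sketch of the maximum likelihood fingerprint estimate of Eq. 2.2.3.
# A Gaussian blur stands in for the denoising filter F of [17]; it is only a
# placeholder, not the filter actually used in the thesis.
import numpy as np
from scipy.ndimage import gaussian_filter

def estimate_fingerprint(images):
    """images: list of L grayscale frames (2-D arrays) taken by one camera."""
    num = np.zeros_like(images[0], dtype=np.float64)
    den = np.zeros_like(images[0], dtype=np.float64)
    for frame in images:
        I = frame.astype(np.float64)
        W = I - gaussian_filter(I, sigma=1.0)   # residual W_k = I_k - F(I_k)
        num += W * I                            # accumulate sum_k W_k I_k
        den += I * I                            # accumulate sum_k I_k^2
    return num / den                            # K_hat = sum W_k I_k / sum I_k^2
```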

2.3 Verification with PRNU

In the previous section, we described how to estimate a camera's PRNU noise, or in other words its sensor fingerprint K̂. Now, given a camera with fingerprint K and a query image, the presence of K in the query would indicate that the image was captured by the given camera. To determine if K is present in image I, we denoise the image with the same denoising filter F. The PRNU estimate of the image therefore is W = I − F(I). The detection problem can be formalized as a binary hypothesis test, where H_0: W = Ξ and H_1: W = I·K̂ + Ξ. To decide which hypothesis to accept, a detection statistic is computed, either the correlation, ρ = corr(W, I·K̂), or the peak to correlation energy, P = pce(W, I·K̂), between the query fingerprint and the database fingerprint. If we are considering correlation as our matching metric, H_0 will be accepted when ρ < τ_ρ; otherwise, if PCE is used to match the fingerprints, H_0 will be accepted when P < τ_P. Here, τ_ρ and τ_P are two predetermined threshold values used for correlation and PCE comparisons, respectively. Let us assume two fingerprints X and Y. Then, the correlation is defined as

ρ(X, Y) = ( Σ_{j=1}^{n} X_j Y_j ) / ( √(Σ_{j=1}^{n} X_j²) · √(Σ_{j=1}^{n} Y_j²) )        (2.3.1)

Likewise, PCE is defined as

P(X, Y) = c²(0) / E[c²(k)],   k = 0, 1, ..., n        (2.3.2)

where c(k) is the circular convolution between the two fingerprints, computed as

c(k) = (1/n) Σ_{j=1}^{n} X_j Y_{j+k}        (2.3.3)
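The two detection statistics can be sketched as follows; the circular term c(k) is computed with the FFT, and, as a simplification, the energy term of the PCE is taken as the mean over all non-zero lags (implementations often also exclude a small neighbourhood around the peak, which is omitted here).

```python
# Sketch of the detection statistics of Eqs. 2.3.1-2.3.3. c(k) is obtained via
# the FFT; excluding a small region around the peak from the energy estimate,
# as is often done for the PCE, is omitted for simplicity.
import numpy as np

def corr(x, y):                                   # Eq. 2.3.1
    return np.sum(x * y) / (np.sqrt(np.sum(x * x)) * np.sqrt(np.sum(y * y)))

def pce(x, y):                                    # Eqs. 2.3.2-2.3.3
    n = x.size
    c = np.fft.ifft(np.conj(np.fft.fft(x)) * np.fft.fft(y)).real / n
    return c[0] ** 2 / np.mean(c[1:] ** 2)        # peak energy / average energy
```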

Robustness of PRNU and Anti-Forensics

PRNU noise is caused by manufacturing defects that are impossible to prevent. Therefore, all multimedia objects produced by a sensor will exhibit a PRNU noise. In addition, the probability of two sensors exhibiting the same PRNU is very low due to the large size (typically larger than 10^6 elements) and random nature of the PRNU noise. The direct correction of PRNU requires an operation called flat-fielding, which in essence can only be realized by creating a perfectly lit scene within the device. However, obtaining uniform sensor illumination in camera is not trivial; therefore, PRNU noise cannot be fixed easily. All these facts make PRNU a perfect candidate to be used as a sensor fingerprint. Of course, one can always wonder how robust the PRNU (sensor fingerprint) is and whether it can survive common image processing operations and/or attacks. In [18], the authors have studied the correction and robustness properties of PRNU. This study showed that PRNU is highly robust against denoising, JPEG compression and out-of-camera demosaicing. In [19, 20, 21], the authors further removed the effects of common image processing operations of the demosaicing, JPEG compression and gamma correction type by a series of post-processing operations on PRNU estimates. In another study, Goljan et al. [22] showed that even after scaling and cropping PRNU can be detected, given that the right alignment can be found by a brute force search, and in [23], PRNU detection on printed images has been investigated. It has been shown that PRNU can survive a high quality printing and scanning cycle. However, PRNU fails against the so-called fingerprint-copy attack. This attack is essentially realized by estimating camera A's fingerprint and superimposing it onto an image captured by camera B. In [24, 25], it has been shown that such an attack cannot

be detected and the fake fingerprint cannot be distinguished from the genuine one. Such an attack, for example, could potentially be used to frame an innocent person. Recently, Goljan et al. [26] proposed a countermeasure for the fingerprint-copy attack. In this approach, it is assumed that the forensics analyst has access to images captured by camera A that were not available to the attacker. Such images can be used to estimate a new fingerprint. The idea behind the countermeasure is that the fingerprints of the images captured by camera A should be consistent with each other, but the fingerprint of the image that attempts to mimic the original fingerprint should be different.

Chapter 3

Efficient Sensor Fingerprint Identification In Large Image & Video Databases

As mentioned before, the sensor fingerprint matching technique can achieve very high accuracy, with false positive and false negative rates in the order of 10^-6 or less [20]. However, when a large database is concerned, the sensor fingerprint matching method presents its own unique set of challenges. These challenges revolve around two main issues. The first issue relates to the large dimensionality and high precision representation of sensor fingerprints. As a result, main memory operations like loading of fingerprint data take a considerable amount of time. At the same time, each sensor fingerprint needs a fairly large amount of space for storage. Further, since sensor finger-

print data looks more or less random, compression is not very effective. Typically, the fingerprint extracted from a 10 megapixel image may take up to 50 MB of space even after compression. The second issue is the computational complexity of the matching algorithm. The matching process involves vector operations which, when combined with the high dimensionality of the data, become a critical concern. In this chapter, we describe in detail the different approaches we take to make the identification process possible. Before we do that, let us fix some notation and terminology. Let us assume that we have a database D, which is comprised of N fingerprints. The fingerprints are modeled as normally distributed random sequences p_i = [p_i^1, p_i^2, ..., p_i^n] such that p_i^j = X_i^j + µ_i^j, with µ_i^j ∼ N(0, σ²) and X_i^j ∼ N(0, 1 − σ²), for i = 1, ..., N and j = 1, ..., n. Furthermore, the distribution of the correlation between non-matching fingerprints will be

ρ(p_k, p_l) ∼ N(0, 1/n)        (3.0.1)

In this equation, since the fingerprints are non-matching, X_k and X_l are independent from each other. On the other hand, when X_k = X_l, meaning the fingerprints are matching, the distribution of the correlation would be

ρ(p_k, p_l) ∼ N(1 − σ², (2σ² − σ⁴)/n)        (3.0.2)
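The two models above are easy to check by simulation; the sketch below uses a small n, a single illustrative σ² and a modest number of trials (all values are chosen here, not taken from the thesis), and compares the empirical moments with Eqs. 3.0.1 and 3.0.2.

```python
# Monte Carlo check of the correlation models of Eqs. 3.0.1 and 3.0.2.
# n, sigma^2 and the number of trials are illustrative values only.
import numpy as np

rng = np.random.default_rng(1)
n, sigma2, trials = 100_000, 0.9, 200

def corr(x, y):
    return np.sum(x * y) / np.sqrt(np.sum(x * x) * np.sum(y * y))

matching, nonmatching = [], []
for _ in range(trials):
    X = rng.normal(0.0, np.sqrt(1.0 - sigma2), n)      # shared true fingerprint
    p_k = X + rng.normal(0.0, np.sqrt(sigma2), n)      # two noisy estimates of
    p_l = X + rng.normal(0.0, np.sqrt(sigma2), n)      # the same fingerprint
    q = rng.standard_normal(n)                         # unrelated fingerprint
    matching.append(corr(p_k, p_l))
    nonmatching.append(corr(p_k, q))

print(np.mean(matching), 1.0 - sigma2)                 # mean close to 1 - sigma^2
print(np.var(nonmatching), 1.0 / n)                    # variance close to 1/n
```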

The sensor fingerprint verification process was discussed in the previous chapter. The naïve way of using this verification technique for identification purposes would be simply comparing all fingerprints in the database with the query sensor fingerprint. As can be deduced from Eq. 2.3.1, computing the correlation between two fingerprints of length n requires 3n multiplications. When calculating the PCE, on the other hand, n² multiplications need to be performed. However, since the PCE is built from the circular convolution between the two vectors, it can be obtained through the fast Fourier transform, which involves n log n + n complex multiplications. Considering typical values of n, where n > 10^6 even for a basic camera, and noting that each element of the fingerprint is a double precision number, it can be seen that the correlation and PCE operations are quite involved, yet ultimately negligible when only the match between two fingerprints is considered. However, in our case, the database contains a very large number of fingerprints, which increases the computational requirements immensely. In this case, the query fingerprint has to be compared with all the fingerprints in the database, which means that N fingerprints of size n must be loaded into main memory, and N correlation or PCE values must be calculated. Since N could range from hundreds up to many millions depending on the database, the computational load may become simply overwhelming. In [27], the authors proposed to use only the k elements of the camera fingerprints with the highest energy values, where k ≪ n. For each fingerprint in the database, only these high energy coefficients and their locations are stored, which results in a considerable storage gain. In this system, when a fingerprint is queried, a matching metric is calculated between the query fingerprint and the fingerprints in the database using only the elements in the previously stored locations.
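A sketch of this digest idea, as described above, is shown below: only the k largest-magnitude elements of a camera fingerprint and their locations are kept, and the correlation is computed over those positions only. The value of k and the function names are illustrative.

```python
# Sketch of the fingerprint-digest idea of [27] described above: store only the
# k highest-energy elements of a camera fingerprint and their locations, and
# correlate over those positions. The value of k is illustrative.
import numpy as np

def make_digest(camera_fingerprint, k=50_000):
    idx = np.argpartition(np.abs(camera_fingerprint), -k)[-k:]  # k largest |.|
    return idx, camera_fingerprint[idx]

def digest_corr(digest, database_fingerprint):
    idx, x = digest
    y = database_fingerprint[idx]                 # same locations in the target
    return np.sum(x * y) / np.sqrt(np.sum(x * x) * np.sum(y * y))
```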

This would ideally result in an n/k-times speed-up both in memory load time and in computation time. Obviously, one needs to choose k carefully, as the probability of error increases with decreasing k. One limitation of the above approach is the assumption that the database only stores fingerprints of known sources, i.e., camera fingerprints. In the case where the database stores fingerprints obtained from individual images, this method cannot be used directly, because determining the k most significant fingerprint elements requires access to the corresponding camera fingerprint. In those cases, when the query fingerprint is a camera fingerprint, its k highest energy elements can be used for matching while still keeping the sensor fingerprints in full length. Although this would eliminate all the advantages due to storage and memory operations, the computation of the matching metric will be fast. In the rest of this chapter, we will introduce two approaches aiming to increase the efficiency of the sensor identification process. In Section 3.1, we will show how to quantize the fingerprints and discuss the effects of quantization. In Section 3.2, we will present a group testing based binary search tree procedure.

3.1 Through Sensor Fingerprint Quantization

Efficient similarity matching in large databases has been studied in many different research areas in multimedia, including biometrics and video copy detection. Different approaches have been proposed on how to index and store the database so that when an object is queried, the entities in the database can be accessed easily and matched efficiently. These approaches all try to obtain a more compact representation for the data. For this purpose, both the structure (like minutiae points in biometric fingerprints) and inherent dependencies (like eigenfaces for face images) in the data are exploited. Although the problem setting in sensor fingerprint matching is much like these problems, the proposed solutions cannot be trivially extended here, as sensor fingerprints do not have the same structural properties and do not exhibit systematic dependencies. To increase the efficiency of sensor fingerprint matching in large databases, in this section, we propose to apply an information reduction operation and represent sensor fingerprints in a quantized form. Ideally, we would like to obtain a representation as compact as possible. Therefore, we particularly focus on binary quantization and, essentially, use only each element's sign information, disregarding magnitude information completely. Hence, given two real-valued fingerprints X and Y to be matched, their quantized versions X̂ and Ŷ are obtained by the following relation:

X̂_j = −1 if X_j < 0, +1 if X_j ≥ 0;   Ŷ_j = −1 if Y_j < 0, +1 if Y_j ≥ 0        (3.1.1)
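In code, the quantization of Eq. 3.1.1 is a one-line sign operation; a minimal sketch:

```python
# Binary quantization of Eq. 3.1.1: keep only the sign of each element,
# mapping negative values to -1 and non-negative values to +1.
import numpy as np

def binarize(fingerprint):
    return np.where(fingerprint < 0, -1, 1).astype(np.int8)
```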

In what follows, we investigate how binarization of sensor fingerprints affects matching performance and describe the advantages gained from the use of binary fingerprints.

Effect on Matching Performance

The main question that arises is what the loss in matching performance is when using a binarized version of a fingerprint. In this sub-section, we show by analysis as well as simulation that the loss is small. The information loss due to quantization of fingerprints will obviously cause a degradation in the detection statistic, i.e., the initially computed correlation and PCE values. Given the definitions of the two statistics in Eqs. 2.3.1 and 2.3.2, and defining X̂ and Ŷ as the binary-quantized versions of the two fingerprints X and Y, respectively, the corresponding correlation, ρ̂, and PCE, P̂, values can be computed in terms of ρ and P as follows:

ρ̂ = corr(X̂, Ŷ) = 4Q(0, 0; ρ) − 1        (3.1.2)

P̂ = pce(X̂, Ŷ) = (4Q(0, 0; c(0)) − 1)² / E[(4Q(0, 0; c(k)) − 1)²],   k = 1, ..., n        (3.1.3)

where Q(0, 0; ρ) is the two-dimensional Q-function defined as

Q(x, y; ρ) = 1/(2π√(1 − ρ²)) ∫_x^∞ ∫_y^∞ exp( −(x₁² + y₁² − 2ρx₁y₁) / (2(1 − ρ²)) ) dx₁ dy₁        (3.1.4)

The details of the derivation can be found in Appendix A.
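For a zero-mean, unit-variance bivariate normal pair, the orthant probability gives the closed form 4Q(0, 0; ρ) − 1 = (2/π) arcsin(ρ); this identity is not stated in the thesis but is a standard result, and it is consistent with the linear fit ρ ≈ 1.57ρ̂ reported later (1.57 ≈ π/2 for small ρ). The sketch below checks Eq. 3.1.2 against it by Monte Carlo; the ρ values and sample size are illustrative.

```python
# Monte Carlo check of Eq. 3.1.2. For a zero-mean bivariate normal pair, the
# orthant probability gives 4*Q(0,0;rho) - 1 = (2/pi)*arcsin(rho), which is
# used here instead of evaluating the double integral of Eq. 3.1.4.
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
for rho in (0.05, 0.1, 0.3):
    x = rng.standard_normal(n)
    y = rho * x + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(n)
    rho_hat = np.mean(np.sign(x) * np.sign(y))          # corr of +/-1 sequences
    print(rho, rho_hat, 2.0 / np.pi * np.arcsin(rho))   # empirical vs analytical
```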

An alternative setting for fingerprint matching can also be considered by assuming the case where one of the fingerprints is binary-valued and the other is real-valued, i.e., between X and Ŷ or vice versa. This would correspond to a scenario where the sensor fingerprints in the database are kept in binary form, for reasons of efficiency, and the query fingerprint is real-valued. In this case, the correlation ρ̌ between the real-valued query X and the binary-valued database fingerprint Ŷ can be calculated by numerically solving the integral given in Eq. 3.1.4. Figure 3.1 demonstrates through analytical computation how ρ̂ and ρ̌ change with respect to ρ. It is no surprise that ρ > ρ̌ > ρ̂ over the entire range of values, except when ρ gets closer to one. In this regime, ρ̌ is lower than ρ̂. This is expected because correlation yields high values only if the correlated sequences are extremely related to each other in terms of the magnitude values of the coefficients. Therefore, when two fingerprints are related to each other, their binary versions will also be related. For example, consider the case where X = Y. In this case, ρ = ρ̂ = 1. However, the relation between real-valued and binary-quantized versions will remain limited, even when the two fingerprints are the same prior to quantization, i.e., ρ̌ < 1. Regardless of this issue, it can be noticed that there is a significant drop in ρ̂ when ρ takes values in the range of ρ = 0.4 to ρ = 0.9. This essentially implies that the use of binary fingerprints will incur a significant performance penalty; however, as earlier studies demonstrate [11, 28], the correlation between two sensor fingerprints can rarely reach as high as 0.3. Therefore, in practice, we are

only interested in the ρ < 0.3 regime, where the gap between ρ̂ and ρ is relatively small. Numerical simulations were also performed to demonstrate the accuracy of the analytical results. For this purpose, we created a database of synthetically generated fingerprints. We start with a sequence, X = [X^1, ..., X^n], of independent samples from a normal distribution, where X^j ∼ N(0, 1 − σ²) and j = 1, ..., n. This sequence is then mixed with zero-mean white Gaussian noise sequences at varying power levels to generate the fingerprints, p_i^j = X^j + µ_i^j, with µ_i^j ∼ N(0, σ²) and i = 1, ..., 100. It should be noted that each fingerprint is a sequence of normally distributed, zero-mean, unit-variance, independent random variables, i.e., p_i^j ∼ N(0, 1). Also note that the designated correlation between two such sequences will be ρ = 1 − σ². To obtain distributions of correlation coefficients, for each σ and therefore each ρ value, 100 fingerprints were generated and correlated with each other. The mean correlation values with respect to different ρ values are plotted with circle markers in Figure 3.1. These synthetic fingerprints were then quantized into binary values as described in Eq. 3.1.1. Mean correlation values were also obtained by correlating binary-valued fingerprints with each other and by correlating binary-valued and real-valued fingerprints in the two sets. The corresponding results are plotted using square and star markers, respectively, in Figure 3.1. It can be seen that the simulation results fit perfectly with the analytical findings. Likewise, a simulation analysis was performed to determine the change in the PCE metric due to quantization.

Figure 3.1: The change in correlation due to binary-quantization of sensor fingerprints. The solid line shows the theoretical correlation between the two real-valued fingerprints, the dashed line shows the correlation between a real and a binary fingerprint, and the dotted line shows the correlation between two binary fingerprints. Circles show the correlation between two real-valued fingerprints obtained through numerical simulation. Similarly, stars show the correlation between a real- and a binary-valued fingerprint and rectangles show the correlation between two binary-valued simulated fingerprints.

The corresponding results are given in Figure 3.2. This figure also shows that the reduction in PCE is considerably small in the region of interest, i.e., when ρ < 0.3. Now that we have determined how the correlation changes after quantization, we can estimate the probability of error (POE) approximately. For non-quantized full fingerprints, Goljan et al. [29] calculated the POE by assuming that the distribution of correlation coefficients computed among non-matching fingerprints (i.e., fingerprints associated with different image sensors) follows Equation 3.0.1, ρ ∼ N(0, 1/n), and

the distribution between matching fingerprints (i.e., fingerprints obtained from images taken by the same image sensor) follows Equation 3.0.2, ρ ∼ N(1 − σ², (2σ² − σ⁴)/n).

Figure 3.2: The change in PCE due to binarization of fingerprints. The solid line shows the PCE between two real simulated fingerprints, the dashed line shows the PCE between a real and a binary simulated fingerprint, and the dotted line shows the PCE between two binary simulated fingerprints.

Correspondingly, for a database consisting of N − 1 non-matching fingerprints and one matching fingerprint, the probability of detection and false alarm rates are calculated as:

P_FA = 1 − (1 − Q(τ_ρ √n))^N,  and  P_D = (1 − Q(τ_ρ √n))^(N−1) · Q( √n (τ_ρ − 1 + σ²) / √(2σ² − σ⁴) )        (3.1.5)

These results can be extended to compute the error rates after quantization of fingerprints as well. Note that the distribution of the correlation between two non-matching binary fingerprints will be approximately the same as that of real-valued fingerprints, while that between matching binary finger-

50 prints. The distribution of correlation between matching binary-valued fingerprints will change though. In Figure3.1, it can be observed that in the region of interest, there s almost a linear relation between ρ, ˇρ and ˆρ, which can be approximated as ρ 1.57ˆρ and ρ 1.25ˇρ by simple linear curve fitting. Using this approximation,the distribution of correlation between matching binary fingerprints can be determined as ˆρ N ( 1 σ2, 2σ2 σ n and false alarm rates can be obtained as: ). Correspondingly, the probability of detection ˆP F A = 1 (1 Q(1.57τˆρ n)) N, and (3.1.6) ˆP D = (1 Q(1.57τˆρ n)) N 1 Q( 1.57 n(τˆρ 1 σ ) 2σ2 σ 4 ). Similarly, for the real-valued query fingerprint and the binary-valued database fingerprint case, ˇρ N ( 1 σ2 rates would be:, 2σ2 σ n ) and probability of detection and false alarm ˆP F A = 1 (1 Q(1.25τˇρ n)) N, and (3.1.7) ˆP D = (1 Q(1.25τˇρ n)) N 1 Q( 1.25 n(τˇρ 1 σ ) 2σ2 σ 4 ). To compare the performance, ROC curves are plotted for a few selected ρ values. Figure 3.3 displays the ROC curves corresponding to use of real-valued and binary-valued fingerprints during matching. ROC curves are obtained for different ρ values and N is set to It can be seen in all these figures that as ρ increases ROC curves get closer to each other. It should be noted that, in practice, when performing fingerprint matching ρ will take values in the ρ > 0.02 range for matching 26

Table 3.1: Comparison of resource requirements of different methods proposed for fingerprint matching

Method                  | # of comparisons | Storage need (bits) | Data to be loaded (bits) | # multiplications (ρ) | Complexity of mult.
Conventional            | N                | 64Nn                | 64Nn                     | 3n                    | d²
Tree based search       | (N/t) log₂ t     | 128Nn               | 64 (N/t) log₂ t · n      | 3n                    | d²
Short digest (database) | N                | 64Nk                | 64Nk                     | 3k                    | d²
Short digest (query)    | N                | 64Nn                | 64Nn                     | 3k                    | d²
Binary-quantization     | N                | Nn                  | Nn                       | n                     | 1

Computation and Storage Aspects

From an application standpoint, the most important implication of quantization is the reduction in computation and storage as compared to the conventional approach. Considering a database of sensor fingerprints and a query fingerprint to be matched against this database, Table 3.1 provides the computational and storage requirements of the conventional method for fingerprint matching and of the schemes proposed in [27, 30] in comparison to the proposed approach. It should be noted that in this table, N represents the number of fingerprints in the database, n is the fixed length of each fingerprint, t is the number of elements in a tree as defined in [30], k ≪ n is the length of the short digest as defined in [27], and d represents the number of bits needed to store each fingerprint element.

Figure 3.3: ROC curves comparing performance when fingerprint matching involves (a) only real-valued fingerprints, (b) real- and binary-valued fingerprints, and (c) only binary-valued fingerprints.

As mentioned before, for a fair comparison, we consider two versions of the digest based approach described in [27]. In the first one, the database only stores real-valued digests obtained from the camera fingerprints. In the second one, however, only the digest for the query fingerprint is available and the database stores full-length fingerprints. It can be seen in the table that the least number of matching operations can be achieved only with the tree based structure. All other approaches will require a linear number of matching operations to be conducted. From the storage point of view, binarization reduces the storage requirement by a factor of 64, assuming the other techniques use a 64-bit floating point number to store fingerprint elements. Correspondingly, I/O operations that involve transfer of fingerprints from and to disk storage will be much faster as compared to other approaches. This aspect is very important, as I/O operations are the main bottleneck for sensor fingerprint matching in a large database. With the use of binary-valued fingerprints, the computational complexity of correlation decreases as well. For binary fingerprints, only n multiplications are needed instead of 3n, since √(Σ_{j=1}^{n} (X̂_j)²) · √(Σ_{j=1}^{n} (Ŷ_j)²) = n. Further, the correlation between two binary-valued fingerprints effectively reduces to the computation of the Hamming distance between two sequences, which can be implemented much faster. Let d_H = Hamming(X̂, Ŷ) represent the Hamming distance between two binary fingerprints X̂ and Ŷ. The correlation ρ̂ can be expressed in terms of d_H as

ρ̂ = (n − 2 d_H) / n        (3.1.8)
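A sketch of this Hamming-distance formulation with bit-packed fingerprints is given below; the packing convention (one sign bit per element via numpy's packbits) is a choice made here for illustration.

```python
# Sketch of correlation via Hamming distance (Eq. 3.1.8). Fingerprints are
# stored packed, one sign bit per element; the Hamming distance is the number
# of set bits in the XOR of the two packed arrays.
import numpy as np

def pack_signs(fingerprint):
    return np.packbits(fingerprint >= 0)          # 1 bit per element, 8 per byte

def corr_from_hamming(packed_x, packed_y, n):
    d_h = int(np.unpackbits(np.bitwise_xor(packed_x, packed_y)).sum())
    return (n - 2 * d_h) / n                      # Eq. 3.1.8

# Example with two weakly correlated synthetic fingerprints of length n
rng = np.random.default_rng(3)
n = 10 ** 6
x = rng.standard_normal(n)
y = 0.1 * x + rng.standard_normal(n)
print(corr_from_hamming(pack_signs(x), pack_signs(y), n))
```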

On the contrary, with floating-point numbers, the complexity of multiplication depends on the number of bits in the floating-point representation.

Experimental Results

To demonstrate the performance and efficiency of the proposed approach, 300 images from each of three different digital cameras, a Canon Powershot A80, a Sony Cybershot S90, and a Canon Powershot S1 IS, were collected at their native resolutions. We refer to these cameras as Camera A, B and C, respectively, for the sake of simplicity. All images were then cropped to a fixed resolution. Sensor fingerprints were extracted from each image. For each camera, 100 images were averaged together to obtain the camera fingerprint, while the rest of the sensor fingerprints from single images were saved in a database of 600 fingerprints. All the fingerprints, including the camera fingerprints, were binarized and saved in another database.

Performance Results

In the experiments, three different settings were considered. The first setting corresponds to the conventional fingerprint matching procedure where both the query camera fingerprint and the database fingerprints are real-valued.

Figure 3.4: The distribution of intra-camera correlations for (a) Camera A, (b) Camera B, and (c) Camera C, obtained between the camera fingerprints and fingerprints of images captured by the same camera under the three settings.

The second one refers to the case where the query camera fingerprint is real-valued and all the database fingerprints are binary-valued. Finally, the third setting represents the case where both query and database fingerprints are binary-valued. In our experiments, since we assume a camera fingerprint is available and used as the query to the database, a camera-dependent threshold is used rather than a fixed threshold for all tests. Therefore, distinct thresholds for both the correlation and PCE statistics were determined for each camera under the three settings. For all settings, first, the correlation and PCE between the camera fingerprints and the fingerprints of images captured by the same camera were calculated. Later, the same metrics between the camera fingerprints and the fingerprints of images captured by other cameras were calculated. The threshold for each metric was set in such a way that no false positives would occur. The true positive rates are shown in Table 3.2. From this table, it can be seen that only for Camera C did quantization cause a slight increase of nearly 0.3% in the error rate for both the correlation and PCE metrics. To get a closer look at how the correlation changes, Gaussian curves were fit to the histograms of correlation values obtained in the previous experiment. Figure 3.4 shows the distributions of intra-camera correlations, where camera fingerprints are correlated with fingerprints from images of the same camera under the three settings. Similarly, Figure 3.5 provides the distributions of inter-camera correlations obtained between camera fingerprints and the fingerprints of images from other cameras for all settings.

Figure 3.5: The distribution of inter-camera correlations for (a) Camera A, (b) Camera B, and (c) Camera C, obtained between the camera fingerprints and fingerprints of images captured by other cameras under the three settings.

Table 3.2: Change in True Positive Rates due to Binarization Using Camera-Dependent Thresholds

Camera ID | Metric | Real-Real | Real-Binary | Binary-Binary
Camera A  | ρ      | 100%      | 100%        | 100%
Camera A  | P      | 100%      | 100%        | 100%
Camera B  | ρ      | 100%      | 100%        | 100%
Camera B  | P      | 100%      | 100%        | 100%
Camera C  | ρ      | 100%      | 99.82%      | 99.64%
Camera C  | P      | 100%      | 100%        | 99.82%

In these figures, straight lines correspond to the conventional setting for fingerprint matching, dashed lines to the second setting where only the database fingerprints are binary, and dotted lines to the third setting where all fingerprints are binary-valued. (In all the figures, the x-axis is fixed.) As expected, the mean correlation value drops with the use of binary fingerprints. However, since the variance also decreases, it compensates for the errors due to the reduction in the mean. It can also be seen that the mean correlation values for Camera C are very low even in the case where both fingerprints are real-valued. Alternatively, for each setting and for each camera, we plotted histograms of the correlation of camera fingerprints with the fingerprints of the images from the same camera and from other cameras on the same figure, as shown in Figure 3.6. Using the sample distributions in these figures, we can analytically calculate the change in probability of error due to quantization. Table 3.3 presents the false reject rates

Table 3.3: Change in Probability of Error Due to Binary Quantization

Camera ID   Real-Real   Real-Binary   Binary-Binary
Camera A    3.59%       4.39%         4.47%
Camera B    2.51%       3.42%         3.92%
Camera C    3.05%       5.59%         10.27%

for each camera at a fixed false positive rate. These results show that the increase in probability of error for Cameras A and B is minimal after binarization; however, for Camera C an increase of 7% is observed. This was expected, since the correlations of fingerprints from Camera C with each other were relatively low compared to the other cameras in the first place. We also investigated the possibility of combining quantization with the idea of using short fingerprint digests for matching, as described in [27]. For this purpose we conducted an experiment by selecting the k highest-energy coefficients and converting them to binary as before. In Figure 3.7, the matching accuracy for Camera C is given under the three settings for varying k. It can be seen that when k > 25000, the drop in accuracy for both the real-binary and binary-binary settings is very small. This shows that the two methods can be combined for faster computation of correlation.
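To make the combination concrete, the following is a minimal C sketch of how a binarized digest of the k highest-energy coefficients might be formed. It is illustrative only: the index selection by absolute value and the sign-based quantization rule are assumptions made here, since the exact digest construction of [27] and the binarization equation are given elsewhere in the thesis.

```c
#include <stdlib.h>
#include <math.h>
#include <stdint.h>

/* Fingerprint being digested; file-scope so the qsort comparator can see it. */
static const float *g_fp;

static int cmp_energy_desc(const void *a, const void *b)
{
    float ea = fabsf(g_fp[*(const int *)a]);
    float eb = fabsf(g_fp[*(const int *)b]);
    return (ea < eb) - (ea > eb);            /* sort descending by |value| */
}

/*
 * Build a binary digest from the k highest-energy coefficients of a
 * real-valued fingerprint fp of length n.  idx[] receives the chosen
 * coefficient positions and bits[] their signs (+1 / -1).
 */
void binary_digest(const float *fp, int n, int k, int *idx, int8_t *bits)
{
    int *order = malloc((size_t)n * sizeof *order);   /* index array to sort */
    for (int i = 0; i < n; i++) order[i] = i;

    g_fp = fp;
    qsort(order, (size_t)n, sizeof *order, cmp_energy_desc);

    for (int j = 0; j < k; j++) {
        idx[j]  = order[j];
        bits[j] = (fp[order[j]] >= 0.0f) ? 1 : -1;    /* assumed sign quantizer */
    }
    free(order);
}
```

The digest keeps both the retained positions and their signs, so a later binary comparison can be restricted to those k coefficients only.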

Figure 3.6: The distribution of correlations. (a), (b), and (c) correspond to Camera A, (d), (e), and (f) to Camera B, and (g), (h), and (i) to Camera C. The first column is the correlation between real-valued fingerprints, the second column is the correlation between real-valued and binary-valued fingerprints, and the last column is the correlation between two binary-valued fingerprints.

Figure 3.7: The accuracies for varying k values in each case for Camera C. The solid line shows the real-real case, the dashed line the real-binary case, and the dotted line the binary-binary case.

Efficiency Results

Having established the performance of fingerprint matching with binary fingerprints, we now examine the gain obtained in terms of storage and computational requirements. To test the efficacy of the method, 1000 fingerprints were extracted from random images. These fingerprints were then cropped to a fixed size and vectorized. The same fingerprints were also quantized into binary fingerprints as before. In our experiments, both Matlab and C implementations were used for the evaluations. It must be noted, however, that none of these implementations were optimized for best performance, and better results could be achieved. Our intention is to demonstrate the relative improvements that can be obtained, not to set performance bounds.

To quantify the storage gain, Matlab was first used to save the fingerprints in its proprietary .mat format. Noting that Matlab applies its own compression, each real-valued fingerprint required 5750 KB of storage space. For the corresponding binary-valued fingerprints, however, the storage space was reduced to 127 KB, yielding a storage gain of around 45 times. The real-valued fingerprints were also saved in an uncompressed file format using a C implementation. This required 6145 KB of storage space for a real-valued fingerprint and 97 KB for a binary fingerprint, which corresponds to a 64 times reduction in file size. To observe the speed-up in I/O loading times, both the real-valued and binary-valued fingerprints were loaded into main memory one by one using the Matlab and C implementations. Since each element of a real-valued fingerprint is stored in double-precision floating-point format, loading involves reading a sequence of doubles. Noting that the smallest addressable storage unit is a character, which uses one byte, each element of a binary-valued fingerprint would have to be stored in a byte. Therefore, the improvement in loading time will not reflect the reduction in storage size. To overcome this limitation, we performed an 8-bit encoding in which the elements of the binary-valued fingerprints are grouped into blocks of eight bits and stored in byte format. Experiments show that with this naive scheme, loading time in Matlab improves eight times, whereas our C implementation became 21 times faster compared to the loading time of real-valued fingerprints.
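As a rough illustration of the 8-bit encoding described above, the following C sketch packs a ±1 binary fingerprint into bytes and unpacks it again; the ±1 representation and the bit ordering are assumptions made here for illustration, not necessarily those of the actual implementation.

```c
#include <stdint.h>
#include <stddef.h>

/* Pack a fingerprint of +1/-1 values into bytes, 8 elements per byte
 * (a +1 becomes a 1-bit, a -1 becomes a 0-bit).  n need not be a
 * multiple of 8; the unused trailing bits are left as zero. */
void pack_fingerprint(const int8_t *fp, size_t n, uint8_t *packed)
{
    for (size_t i = 0; i < (n + 7) / 8; i++)
        packed[i] = 0;
    for (size_t i = 0; i < n; i++)
        if (fp[i] > 0)
            packed[i / 8] |= (uint8_t)(1u << (i % 8));
}

/* Recover the +1/-1 fingerprint from its packed form. */
void unpack_fingerprint(const uint8_t *packed, size_t n, int8_t *fp)
{
    for (size_t i = 0; i < n; i++)
        fp[i] = ((packed[i / 8] >> (i % 8)) & 1u) ? 1 : -1;
}
```

Stored this way, a fingerprint of n elements occupies roughly n/8 bytes, which is where the 64 times reduction relative to 8-byte doubles comes from.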

The other important concern is the speed-up in the computation of the decision statistics. To measure this, we picked a fingerprint from the database and correlated it with all of the 1000 fingerprints. In the Matlab implementation, for real-valued fingerprints we use the corr2 command, and for binary-valued fingerprints we compute the dot product of the fingerprints and divide by the fingerprint length. These operations were also implemented in C. In both implementations, we achieved around a four times speed-up in the correlation computation. It must be noted, however, that although the simple encoding approach described above is suitable for fast loading of binary fingerprints into memory, it nevertheless requires decoding the data stored in memory to obtain the fingerprints, which incurs additional computation and time. To avoid decoding, we considered two approaches that operate directly on the packed byte values to compute the Hamming distance between two binary fingerprints. One approach is based on XORing bytes and counting the number of one bits, and the other uses a precomputed look-up table over the 8-bit byte values (a sketch of this byte-wise computation is given at the end of this subsection). These methods yielded a nearly 9 times faster computation of the correlation between binary-valued fingerprints compared to real-valued fingerprints. Overall, it can be stated that binary quantization of fingerprints, compared to the use of real-valued fingerprints, enables a 64 times gain in storage space, at least a 21 times speed-up in loading time, and a 9 times speed-up in the computation of correlation, while providing almost the same matching accuracy. Obviously, these

improvements will be further enhanced when binarization is combined with the short digest method, which by itself improves correlation time by more than 20 times, as demonstrated in the previous subsection.
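To make the byte-wise computation concrete, the following sketch combines the two ideas mentioned above: bytes are XORed and the number of differing bits is read from a precomputed table, after which the Hamming distance is converted to a correlation through Eq. 3.1.8. It is an illustrative version only, assuming the 8-bit packing shown earlier rather than the exact implementation used in the experiments.

```c
#include <stdint.h>
#include <stddef.h>

static uint8_t popcnt8[256];   /* number of 1-bits in each possible byte value */

static void init_popcnt8(void)
{
    for (int v = 0; v < 256; v++) {
        int c = 0;
        for (int b = 0; b < 8; b++)
            c += (v >> b) & 1;
        popcnt8[v] = (uint8_t)c;
    }
}

/* Correlation between two packed binary fingerprints of n (+1/-1) elements,
 * computed from their Hamming distance d_H via Eq. (3.1.8):
 *     rho = (n - 2 * d_H) / n                                              */
double packed_correlation(const uint8_t *a, const uint8_t *b, size_t n)
{
    size_t nbytes = (n + 7) / 8;
    size_t d_h = 0;
    for (size_t i = 0; i < nbytes; i++)
        d_h += popcnt8[a[i] ^ b[i]];        /* differing bits in this byte */
    return ((double)n - 2.0 * (double)d_h) / (double)n;
}
```

Here init_popcnt8() would be called once before the first comparison; after that, each correlation needs only one XOR and one table look-up per byte of the packed fingerprints.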

3.2 Through Binary Search Tree Structure Based on Group Testing

In this section we will use ideas inspired by group testing approaches to design a source device identification system that can potentially be used with a large collection of multimedia objects. A group test is a simultaneous test on an arbitrary group of items that can give one of two outcomes, positive or negative. The outcome is negative if and only if all the items in the group test negative. Group testing has been used in many applications to efficiently identify rare events in a large population [31, 32]. It has also been used in the design of a variety of multi-access communication protocols where multiple users are simultaneously polled to detect their activity [33]. Consider the classic fake coin problem as a simple example. Let us assume that we are given eight coins and we would like to know which one is fake (positive), given that the fake coin weighs less than a genuine one. If we divide the coins into two groups and weigh each, the group containing the fake coin should weigh less. Therefore, we can determine right away which group contains the fake coin. We can further divide this group into two groups and continue weighing until we find the fake coin, with only a logarithmic number of weighings.
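The halving strategy can be written down in a few lines. The C sketch below is purely illustrative: it simulates a weighing by comparing the average weight per coin in each half of the remaining candidates.

```c
#include <stdio.h>

/* Find the index of the single light (fake) coin by repeatedly "weighing"
 * the two halves of the remaining candidate range against each other.
 * Comparing averages rather than sums handles unequal half sizes. */
static int find_fake(const double *w, int n)
{
    int lo = 0, hi = n;                        /* candidates are in [lo, hi) */
    while (hi - lo > 1) {
        int mid = lo + (hi - lo) / 2;
        double left = 0.0, right = 0.0;
        for (int i = lo; i < mid; i++) left  += w[i];
        for (int i = mid; i < hi; i++) right += w[i];
        if (left / (mid - lo) < right / (hi - mid))
            hi = mid;                          /* lighter group: keep left half */
        else
            lo = mid;                          /* otherwise keep right half */
    }
    return lo;
}

int main(void)
{
    double coins[8] = {10, 10, 10, 9.5, 10, 10, 10, 10};   /* coin 3 is fake */
    printf("fake coin found at index %d\n", find_fake(coins, 8));
    return 0;
}
```

With eight coins this terminates after three weighings, matching the logarithmic count mentioned above.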

3.2.1 Binary Search Tree

Similar to the approach in the fake coin problem, given a query sensor fingerprint, rather than checking for a match against each object's fingerprint in the database, we can perform a match against fingerprint estimates that are combined together into a composite fingerprint. To be more clear, let us assume we have a database D = {p_i} of sensor fingerprint estimates extracted from N multimedia objects (i = 1, 2, ..., N). Each element of a fingerprint estimate is modeled as p_i^j = X_i^j + μ_i^j, as defined in the beginning of this chapter (X_i^j ~ N(0, 1 − σ²), μ_i^j ~ N(0, σ²), p_i^j ~ N(0, 1), and j = 1, ..., n). Let us also assume that we have a query fingerprint p_q = X_q + μ_q with the same properties. We define the composite fingerprint as the normalized summation of all N fingerprints in the database,

C = (1/√N) Σ_{i=1}^N p_i.    (3.2.1)

The factor 1/√N here makes the composite have unit variance. Our problem then turns into a hypothesis test:

H0: There is no matching fingerprint in the database D; D = {p_i : p_i = X_i + μ_i, X_i and X_q are independent}.

H1: There are one or more matching fingerprints in D; D = {p_i : p_i = X_i + μ_i, X_i = X_q for some i}.

For the null hypothesis, the distribution of the correlation between the query and the composite fingerprint will be the same as the distribution of the correlation between two

single non-matching fingerprints, ρ_N = ⟨C, p_q⟩ ~ N(0, 1/n), since the composite fingerprint will still be independent of p_q. For H1, let us assume the worst-case scenario where there is only one matching fingerprint in the composite. In this case, the distribution of the correlation can be calculated as follows:

ρ_M = ⟨C, p_q⟩ = (1/√N) Σ_{i=1}^N ⟨p_i, p_q⟩ = (1/√N) ( ⟨p_j, p_q⟩ + Σ_{i=1, i≠j}^N ⟨p_i, p_q⟩ )
    ~ (1/√N) [ N(1 − σ², (2σ² − σ⁴)/n) + N(0, (N − 1)/n) ].    (3.2.2)

Note that the two distributions in Equation 3.2.2 are highly dependent on each other, so we cannot simply assume that their variances add. However, for large N and n, we can assume that the contribution of the first term to the variance is negligible. Therefore, the distribution of the correlation of a query fingerprint with a composite that contains only one matching fingerprint can be written as

ρ_M = ⟨C, p_q⟩ ~ N( (1 − σ²)/√N, 1/n ).    (3.2.3)

To verify the correctness of this model, we generated a sample simulation database of N = 4096 normal fingerprints with zero mean and unit variance, {ps_i}, such that i = 1, ..., 4096 and ps_i^j ~ N(0, 1), j = 1, ..., 10^7. We also generated a query database of fingerprints from the first set, such that corr(ps_i, qs_i) = 0.1, by adding white Gaussian noise to the entries in the first database. The query fingerprints were also normalized to have zero mean and unit variance. In the first experiment, our aim was to measure the distribution of the correlation where the

composite fingerprint does not include a fingerprint similar to the query. For each query fingerprint (m = 1, ..., 4096) we measured the following correlation:

ρ_N^s = (1/√N) ⟨ qs_m, Σ_{k=1, k≠m}^N ps_k ⟩.    (3.2.4)

The distribution of ρ_N^s is shown in Figure 3.8 in blue. We also drew our analytical finding for the corresponding correlation distribution, ρ_N, on the same figure in red. Furthermore, we measured the correlation between the query fingerprints and a composite containing a matching fingerprint:

ρ_M^s = (1/√N) ⟨ Σ_{k=1}^N ps_k, qs_m ⟩.    (3.2.5)

Similarly, the distribution of ρ_M^s is shown in Figure 3.9 in blue, and the corresponding analytical finding, ρ_M, is drawn on the same figure in red. From these two figures, we can say that our model approximates the real situation very well. Now let us calculate the probability of false alarm and the probability of detection for a fixed threshold τ_N:

P_FA = Q(τ_N √n), and    (3.2.6)

P_D = Q( (τ_N − (1 − σ²)/√N) √n ).    (3.2.7)

Using these expressions, Figure 3.10 shows the corresponding ROC curves when n = 10^7 and the correlation between matching fingerprints is 0.1. This figure shows that the probability of error increases with the number of fingerprints in

Figure 3.8: Distribution of the correlation between a query fingerprint and a composite when the composite does not contain a matching fingerprint. The fingerprints are n = 10^7 in size and the composite contains N = 4096 fingerprints. The red line shows the analytical findings and the blue lines show the results on simulation data.

the composite. With the given parameters, N = 4096 seems to give a good trade-off between performance and efficiency. Therefore, if the composite has no matching fingerprint, we can determine this with only one correlation, which results in a 4096 times improvement. On the other hand, if there is a matching fingerprint in the composite, one should search the composite further. This can be realized with a binary search tree structure in which the database is split into two groups. Figure 3.11 illustrates the binary search tree created in this way. In this example, we assume that the database contains 8 fingerprint estimates extracted from 8 images. The leaves of the tree represent the fingerprint estimates p_i (i = 1, 2, ..., 8) and the

Figure 3.9: Distribution of the correlation between a query fingerprint and a composite when the composite contains a matching fingerprint. The fingerprints are n = 10^7 in size and the composite contains N = 4096 fingerprints. The red line shows the analytical findings and the blue lines show the results on simulation data.

parent nodes represent the normalized sum of their children. We also assume we have a query fingerprint f_A, and that one of the fingerprint estimates in the database exhibits the same fingerprint as the query (p_3 = f_A + η). The matching fingerprint can be identified by marching along the branches of the search tree, from top to bottom, performing the hypothesis test at each level with a different threshold fixed for that level. In the figure, red arrows depict the route the algorithm follows before identifying p_3 at a leaf of the tree. A binary search tree constructed in this way can potentially yield a logarithmic reduction in identification complexity. However, as mentioned before, the probability of error increases with the size of

Figure 3.10: ROC curves comparing performance when a fingerprint is correlated with composite fingerprints, for n = 10^7.

the tree. On the other hand, it is always possible to build higher trees; all we need to do is essentially perform a hypothesis test at each node and, based on the probability that we obtain, decide whether or not to go down. In the worst-case scenario, a match will be detected at every level. In that case, the complexity of the binary search tree would be O((Nn/h) log(h)), where h is the number of fingerprints in the tree. For example, when N = 4096 and there is a single matching fingerprint, the improvement will be 4096/12 = 341 times. In addition, when a database contains more than one fingerprint from the same device, random splitting will further affect the performance of the method. Therefore, the BST should be constructed in such a way that the fingerprints of media objects

Figure 3.11: Binary search tree.

captured by the same device are placed close together in the branches of the tree. In the next section, a hierarchical clustering based tree building scheme that addresses this problem will be explained. Another problem the binary search tree introduces is storage space. Note that the leaves of the tree represent the fingerprints in the database, and the internal nodes are their composites. Since we need to keep all the nodes as well, the storage space doubles. Because of the size of sensor fingerprints, this might create a big problem. However, we believe that in the applications we consider, such as legal applications, more resources can be devoted to storage in order to gain efficiency.
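The descent described above can be sketched in C as follows. This is a simplified illustration rather than the exact implementation used in the experiments: the node layout, the recursive descent into every child whose composite passes the level threshold, and the per-level thresholds passed in as an array are all assumptions made for the sketch.

```c
#include <stddef.h>

/* A node of the binary search tree.  Leaves hold a single fingerprint
 * estimate; internal nodes hold the normalized sum of their children. */
typedef struct node {
    const float *composite;   /* composite (or leaf) fingerprint, length n */
    struct node *left, *right;
    int object_id;            /* valid only at leaves */
} node_t;

/* Normalized correlation between two zero-mean, unit-variance fingerprints. */
static double correlate(const float *a, const float *b, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += (double)a[i] * (double)b[i];
    return s / (double)n;
}

/* Descend the tree, testing the query against each composite and going
 * down only into subtrees whose correlation exceeds the threshold fixed
 * for that level (e.g., chosen from Eq. 3.2.6 for a target false-alarm
 * rate at the composite size of that level).  Matching leaf ids are
 * appended to out[], assumed large enough; the match count is returned. */
static int search(const node_t *nd, const float *query, size_t n,
                  const double *level_thresh, int level,
                  int *out, int nout)
{
    if (nd == NULL || correlate(nd->composite, query, n) <= level_thresh[level])
        return nout;                       /* H0 accepted: prune this subtree */
    if (nd->left == NULL && nd->right == NULL) {
        out[nout++] = nd->object_id;       /* leaf: report a match */
        return nout;
    }
    nout = search(nd->left,  query, n, level_thresh, level + 1, out, nout);
    nout = search(nd->right, query, n, level_thresh, level + 1, out, nout);
    return nout;
}
```

Retrieving further objects from the same device would then amount to subtracting the matched estimate from the composites along the reported path and repeating the search, as described in Section 3.2.3.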

3.2.2 Building the Tree by Hierarchical Clustering

In this section, we describe a method to build the BST in a way that ensures that media objects captured by the same device are located close to each other in the tree. An obvious way to ensure this is to correlate each fingerprint estimate with the rest of the database and sort the estimates according to the correlation results. However, building the tree like this would take O(nN²) operations and therefore would not be feasible. In this thesis, we study a more efficient method to build the tree, based on hierarchical divisive clustering. The divisive clustering process starts at the top level with all the entities in one cluster. This top-level cluster is split using a flat clustering algorithm, and the procedure is applied recursively until each entity is in its own singleton cluster [34, 35]. The root of the tree contains a composite fingerprint obtained by summing all the fingerprint estimates in the database, C = Σ_{i=1}^N p_i. Each individual estimate is then correlated with this composite. The estimates are sorted and divided into two equal-sized clusters based on their correlation values. The basic idea here is that if there is more than one media object from the same device, then the correlation values of the corresponding fingerprint estimates with the composite fingerprint should be close to one another. Let us assume that we have m objects from device A in our database, {p_j = f_A + μ_j : j = 1, 2, ..., m}. The correlation between the composite fingerprint C and a single

fingerprint estimate of camera A can be written as

ρ(C, p_q) = ⟨ f_A + μ_q, Σ_{j=1}^m (f_A + μ_j) ⟩ = m ‖f_A‖² + Σ_{j=1}^m ⟨ μ_q, μ_j ⟩.    (3.2.8)

Using this fact, the fingerprint estimates in the database are sorted according to their correlation results. It is expected that the fingerprint estimates of images from the same device will be listed in succession after the sorting operation. The database is then split into two subsets, and this process is repeated at every level within each subset, which makes the complexity of the tree building method O(nN log N).

3.2.3 Retrieving Multiple Objects

The search procedure also needs to accommodate operations like retrieving several media objects captured by the same imaging device. To be able to retrieve all the media objects captured by the same device, the tree needs to be updated after every search. For this, the most recently matched fingerprint estimate is subtracted from the composite fingerprints of all its parent nodes. (In Figure 3.11, this is equivalent to subtracting the fingerprint estimate associated with p_3 from all the nodes in the path depicted with red arrows.) The tree can be restored to its initial form when the search for one device has ended. In a forensic setting, where it is important to limit false positives, the update and search operations can be repeated consecutively until the search yields a fingerprint whose correlation with the query fingerprint is lower than the preset threshold. On the other hand, this method can also be used in a retrieval setting. In this case, rather than setting a

threshold for the correlation value, the number of searches can be predetermined. In this way, one can retrieve as many objects as desired and eliminate false positives according to a threshold later on.

3.3 Experimental Results

To demonstrate the performance and efficiency of the proposed approach, we provide results corresponding to different experimental scenarios. For this, we collected 300 images from each of 5 different digital cameras, a Canon Powershot A80, Sony Cybershot S90, Sony Cybershot P72, Canon Powershot S1 IS, and Panasonic DMC FZ20, at their native resolutions. We also downloaded 17,000 images from the Internet. All images were then cropped to a fixed size. As a part of the offline step, fingerprint estimates are extracted from all the images and saved in a database to be used later. In the following experiments, the performance of the proposed approach is presented under different settings.

I. The first experiment is designed to evaluate the efficiency of the proposed method when the fingerprint of the sensor in question is present and there is only one image in the database exhibiting this fingerprint. For this purpose, we built a database of 1024 images containing only one image from each camera and 1019 images from the Internet. We obtained the fingerprints of the devices from 200 PRNU noise profiles associated with each device. With the conventional approach, where the device fingerprint is correlated with the

fingerprint estimate of each image, it took 7 minutes and 11 seconds to find an image captured by one of the devices, versus 9.61 seconds using the proposed approach, with no errors in either case.

Table 3.4: Performance Results with Binary Search Tree

Camera                   Number of Searches   Time              Performance
Panasonic DMC FZ20                            min., 34 sec.     50/50
Canon Powershot A80                           min., 11 sec.     50/50
Canon Powershot S1 IS    27                   6 min., 47 sec.   48/50
Sony Cybershot S90                            min., 47 sec.     47/50
Sony Cybershot P72                            min., 42 sec.     46/50

II. In this experiment, we built a database of images by mixing 50 images from each camera with the images from the Internet. Our goal here was to measure the performance and efficiency of our approach in identifying as many images as possible while minimizing false matches (i.e., false positives). For this purpose, we set a threshold on the correlation value. Sensor fingerprints of the devices were again obtained using 200 PRNU noise profiles from each camera. The search is repeated until a fingerprint estimate with a correlation lower than the threshold is found. Table 3.4 shows the number of searches, the time, and the number of images we were able to detect with our method.

Figure 3.12: The distribution of correlation values using the fingerprint of the Sony Cybershot P72 and the fingerprint estimates from the Sony Cybershot S90 (green) and the Sony Cybershot P72 (purple).

In Table 3.4, it can be seen that the errors primarily involve the Sony cameras. This is a result of the high correlation between the sensor fingerprints of the two Sony camera models. To further examine this phenomenon, in Figure 3.12 we provide the distribution of the correlation values between the fingerprint of the Sony P72 and the fingerprint estimates of itself and of the Sony S90. The results show that to minimize the error one needs to further increase the threshold value. This observation is also in line with the results of [36], where it is shown that fingerprints of cameras from the same manufacturer correlate better with each other due to the use of similar demosaicing algorithms, which reveals itself as a systematic artifact in the extracted PRNU noise profiles. Further improving these results requires removal of such demosaicing artifacts from the PRNU noise profiles during the offline step.

Table 3.4 also shows that to identify 50 images one does not need to perform 50 searches, as the fingerprints of the same device were mostly placed in neighboring leaves of the tree. In this case, the average time to detect 50 images was around 6 minutes and 40 seconds with our method. On the other hand, a linear search would take more than 2 hours.

III. In order to show how the proposed method can be used for a retrieval task as described in Section 3.2.3, we use the same setting as in the previous experiment. In this case, we repeatedly update the tree until we identify all the fingerprint estimates associated with the given cameras. Figure 3.13-a shows the precision-recall diagrams for all the cameras for which errors occurred during matching, to show how many searches have to be performed. The figure indicates that the worst precision is about 0.5, which means that we need to search at most 100 times to find all the relevant fingerprints in the database. In this experiment, we also demonstrated a case where the group-based approach is not used and the tree is built by splitting the fingerprint estimates randomly. The red line in Figure 3.13-a shows the precision-recall diagram after 100 searches where the fingerprint of the Sony S90 is used to search over the database. As expected, the search accuracy with random splitting is inferior to the structured case. This is primarily because when fingerprint estimates are distributed over the nodes randomly, the nodes will not be far enough apart. As a result, fingerprint estimates associated with a given node

are more likely to be close to other node descriptors.

IV. In this experiment, we investigate the impact of the device fingerprint's quality on the search results. For this purpose, a tree is constructed from 4096 fingerprint estimates, 50 of which came from the Sony Cybershot S90. Then 4 sensor fingerprints are generated for the Sony Cybershot S90 by averaging 50, 100, 150, and 200 fingerprint estimates coming from the same camera. In addition, to test the limits, we also used a single fingerprint estimate as the device fingerprint during the search. The precision-recall diagrams corresponding to the 5 different device fingerprints are presented in Figure 3.13-b. The results show that performance does not change much with the number of fingerprints used when generating the device fingerprint. It must be noted that even though we used a single fingerprint estimate as the device fingerprint, this promising result was achieved because related fingerprints were located in neighboring leaves of the tree, which caused the nodes to act almost as device fingerprints. Although we present results for one camera, we observed that performance was very similar for the other cameras as well.

V. We conducted a final experiment to test how the fingerprints associated with images from a given camera are placed in the search tree. For this purpose, we constructed trees using 64, 128, 256, 512, 1024, 2048, and 4096 fingerprint estimates. In each case, roughly 5% of the images came from the same camera (Sony S90). The experiments showed that for tree sizes of 64, 128,

Figure 3.13: (a) Precision-recall diagrams for 3 cameras when the group-based approach is used, and for the Sony S90 when the tree is built by random splitting. (b) Precision-recall diagrams for the Sony S90 with different quality fingerprints.

and 256, all the fingerprints of the Sony S90 were placed successively. When the size was increased to 512 and 1024, we observed that in both cases only two fingerprints from the Internet images were placed in between. For sizes of 2048 and 4096, this number increased to three. These results show that our approach can successfully cluster fingerprints associated with a given camera when building the binary search tree.

Finally, we address the issue of using a threshold when deciding the validity of a match returned by the search operation. We noted earlier that to decide whether a database contains an image associated with a query sensor fingerprint, one needs to rely on a preset threshold below which a match is considered invalid. Obviously, setting such a threshold might lead to missed matches. Hence, the choice of threshold poses a trade-off between early termination of the search (i.e., avoiding false positives during matching) and an increased number of missed matches. Since our matching criterion is based on [11], the threshold values given there and the corresponding false-positive rates can also be used here when selecting a value for the threshold.

Chapter 4

Video Copy Detection Based on Source Device Characteristics: A Complementary Approach to Content-Based Methods

Video copy detection techniques are automated analysis procedures to identify duplicate and modified copies of a video among a large number of videos so that their use can be managed by content owners and distributors. These techniques are required to accomplish various tasks involved in identifying, searching, and retrieving videos from a database. Furthermore, due to the increase in the scale of video databases, the ability to accurately and rapidly perform these tasks becomes increasingly

crucial. For example, it is reported that the number of videos in the video sharing site YouTube's database had reached 73.8 million by March 2008 and that every day more than 150 thousand videos are uploaded to its servers (according to the Kansas State University Digital Ethnography Group's YouTube statistics report). In such systems, copy detection techniques are needed for efficient indexing, copyright management, and accurate retrieval of videos, as well as for the detection and removal of repeated videos to reduce storage costs. The development of monitoring systems that can track commercials and media content (e.g., songs, movies, etc.) over various broadcast channels is another application where copy detection techniques are needed. In any case, realizing the above tasks requires techniques that are capable of providing distinguishing characteristics of videos which are also robust to various types of modification. The most prominent approach in video copy detection has been to extract unique features from the audiovisual content [37, 38], and many content-based features have been proposed. These include color features like layouts [39], histograms [40, 41, 42], and coherence [43]; spatial features like edge maps [44] and texture properties [45, 46]; temporal features like shot length [47]; and spatiotemporal features like 3D-DCT coefficient properties [48] and differential luminance characteristics [49]. A video signature is generated either by organizing the computed features into suitable representations or by cryptographically hashing them to obtain more succinct representations. The resulting signatures are expected

to be unique and robust under common processing operations. These signatures are stored in a database for later verifying whether a given video matches. The biggest challenge in video copy detection is to retrieve duplicate or modified versions of a video while being able to discriminate it from other similar videos. Since a video can be modified in many different ways, including common video processing operations, overlaying graphical objects onto video frames, and insertion or deletion of video content, obtaining video signatures that are robust to all these types of modifications is a challenging task. Figure 4.1-a displays frames from a video and its contrast-enhanced version, which are expected to yield the same signature. Similarly, Figure 4.1-b displays a copy of a video with an overlaid advertisement. While robustness of the extracted video signatures is crucial for the success of video copy detection techniques, such a requirement, at the same time, makes it very difficult to differentiate between videos that are very similar in content. Figures 4.1-c and 4.1-d show frames from videos that are visually very similar but essentially different. Therefore, in the presence of many content-wise similar videos, detecting modified copies of a given video becomes a very challenging task, and the rapidly increasing size of video databases significantly exacerbates the problem. In the context of these difficulties, the main insight of this work is that the use of source device characteristics provides a new level of information that can help alleviate the above problems. The fact that source characteristics are not primarily content dependent makes them potentially very effective against problems arising due to similarity

Figure 4.1: (a) A video and its contrast-enhanced duplicate. (b) A video and its advertisement-overlaid version. (c) Two videos taken at slightly different angles. (d) Similar but not duplicate videos.

of content. Moreover, since source device characteristics are not equally subject to the constraints of the audiovisual content, they are not affected by common video processing operations in the same way, which makes them robust against certain modifications. Hence, incorporating source device characteristics with content-based features will improve the overall accuracy of video copy detection techniques. In this chapter of the thesis, we propose a new video copy detection scheme that utilizes unique characteristics of the imaging sensors used in cameras and camcorders. The underlying idea of the proposed scheme is that a video signature can be defined as a weighted combination of the fingerprints of the camcorders (i.e., their imaging sensors) involved in the generation of a video. The resulting signature essentially depends on various factors that include the duration of the video, the number of involved camcorders, the contribution of each camcorder, and partly the content of the video. We demonstrate

the viability of the idea on videos taken by several different camcorders and on several copies of duplicate and near-duplicate videos downloaded from YouTube. Our results show that signatures extracted from a set of videos downloaded from YouTube do not yield a false positive in detecting near-duplicate videos and that the signatures are robust to both temporal changes and various common processing operations.

4.1 Camcorder Identification Based on PRNU

Chen et al. [15] extended the approach in [11], which was described in Section 2.2, to videos in order to identify the source camcorder. Although digital cameras and camcorders are very similar in their operation, obtaining an estimate of the sensor fingerprint from a video is a more challenging task. As a comparison, for Internet-quality videos of size 264x352 at a 150 kb/sec bit-rate, the video duration needed to obtain a reliable fingerprint is around 10 minutes [15], whereas a few hundred images are typically sufficient to obtain the fingerprint of a digital camera. There are several reasons for this: (i) the frame sizes of typical videos are smaller, which decreases the information available for reliable detection; (ii) successive frames are very much alike, hence averaging successive instances of PRNU noise patterns does not effectively eliminate content dependency; and (iii) because of motion compensation, sensor fingerprints might be lost in some parts of the frames. Essentially, the accuracy of the fingerprint estimate depends on the quality (compression and resolution) and

the duration of the video (i.e., the number of frames). In Figure 4.2, we show the impact of the quality and length of the video on fingerprint estimates obtained from videos taken by 5 different camcorders. Each video is encoded at 1 Mbps and 2 Mbps bit-rates and divided into segments of 1000 frames and 1500 frames, and a PRNU noise pattern is extracted from each segment to obtain a fingerprint of the camcorder. By designating one of the camcorders as the reference, inter- and intra-correlations of the obtained fingerprints are computed with respect to the reference camcorder. It can be seen that for increasing quality and longer segments, the fingerprint estimates yield better differentiation of videos taken by the reference camcorder from videos taken by other camcorders.

4.2 Obtaining Source Characteristics Based Video Signatures

Since a video can be generated by a single camcorder or by combining multiple video segments captured by several camcorders, we define a video signature to be the weighted combination of the fingerprints of the involved camcorders. We therefore utilize a procedure similar to the one described in [15] for extracting the PRNU noise pattern from a video frame. We denoise each video frame with a wavelet-based denoising filter and extract the noise residues, which are then averaged together. The resulting pattern is the combination of the camcorder fingerprints, and it is treated as

Figure 4.2: The distribution of inter- and intra-correlation values. Distributions in blue indicate the correlation values of fingerprints associated with the videos shot by the reference camcorder. Distributions in red indicate the correlation values of fingerprints between the reference camcorder and the other camcorders. The bit-rate of the videos and the number of frames in each segment are (a) 1 Mbps and 1000 frames, (b) 1 Mbps and 1500 frames, (c) 2 Mbps and 1000 frames, and (d) 2 Mbps and 1500 frames.

the signature of the video. If a video is shot by, for example, two camcorders, the extracted signature will be the weighted average of the fingerprints of these two camcorders. The weighting will depend on the length of the video shot by each camcorder. To detect whether two videos are copies of each other, we assess the correlation between the two video signatures. Since the PRNU noise pattern is intrinsic to an imaging sensor, one issue that needs

to be addressed is how to identify videos taken by the same camcorder (or a fixed set of camcorders), as they are expected to yield the same signature. Essentially, due to the inability to extract an accurate estimate of the underlying PRNU noise, fingerprints extracted from a video also have contributions from the content itself. That is, the extracted video signature will not only depend on the imaging sensor fingerprints but will also exhibit some degree of content dependency. In Figure 4.2, it can be seen that the fingerprints extracted from videos captured by the reference camcorder correlate more strongly; however, even in the best case the correlation value is still modest. For unmodified or slightly modified videos, the correlation would take values close to one. On the other hand, for near-duplicate videos, no matter how similar they are, as long as the source camcorders are different, the correlation values will not be high. These points will be explored further in the following sections. Another challenge in video copy detection is the robustness of the extracted video signature when the video is subjected to common processing. The proposed video signature extraction scheme is expected to be robust to linear operations, as they will not degrade the PRNU noise. The scheme should also be robust to temporal changes, like random frame droppings and time desynchronizations, as long as the number of frames in a video is not reduced dramatically. Since modifications like blurring, noise addition, and compression are expected to degrade the fingerprint estimates, the proposed scheme would be robust to this type of modification only up to a certain extent. One critical type of modification that will impact the performance

negatively is frame cropping or scaling. This would require establishing synchronization between the sensor fingerprints from the original video and its scaled or cropped version prior to comparing the video signatures. Although Goljan et al. [50] showed that sensor fingerprints can be detected under image cropping or re-scaling through a search over the relevant (cropping and scaling) parameters, this would nevertheless increase the computational complexity. To evaluate our video copy detection scheme, we performed two sets of experiments. In the first set, we provide results demonstrating the robustness of the video signatures against various common processing operations. In the second set, we apply the proposed scheme to videos downloaded from YouTube and show how the scheme performs on real-life test data, where no information is available on the source camcorders. In the following sections, we provide an evaluation of the proposed scheme.

4.3 Robustness Properties of Video Signatures

To test the robustness of the extracted signatures, we used videos captured by five different camcorders in Mini-DV format with a frame resolution of 0.68 megapixels. The videos are initially encoded at an average bit-rate of 2 Mbps and at 30 frames per second. The videos depict various sceneries that include indoor and outdoor scenes, fast-moving objects, and still scenes, shot at varying optical zoom levels and also using camcorder panning. The videos captured with each camera are divided into 10

clips of 1000 frames, and the signatures of the resulting 50 video clips are extracted. Figure 4.3 shows that the correlation values computed between different video clips range up to 0.2. The results demonstrate that each video clip yields a different signature even when they are shot by the same camcorder. Next, we assess the robustness properties of the extracted video signatures by subjecting the video clips to various types of modifications at varying strengths. We extracted signatures from video clips that have undergone manipulation and correlated these signatures with the signatures of the original (unmodified) video clips. For each manipulation we provide distributions of how the original signatures correlate with (a) signatures extracted from their modified versions (blue distribution), (b) signatures extracted from other videos taken by the same camera (red distribution), and (c) signatures extracted from videos taken by other camcorders (green distribution). The distributions of these correlation values are shown in Figure 4.4. As can be seen in this figure, when the content is different, the correlation of the signatures is less than 0.2 for all types of modifications. Therefore, if the correlation between the signatures from the original and the modified version of a video is above 0.2, video copies can be reliably detected. In our experiments, we set the threshold for identification to 0.2 so that none of the different videos, whether taken by the same camcorder or not, would be identified as copies. As a performance measure we consider the true positive rate (TPR), which is the rate of correctly detected copies of a video. For each manipulation, we also provide a figure showing the change in the mean of the

Figure 4.3: Distribution of correlation values obtained by correlating the signatures of the 50 video clips with each other.

signature correlations between each video and its modified version as a function of the manipulation strength.

4.3.1 Contrast Adjustment

The contrast adjustment operation modifies the range of pixel values without changing their mutual dynamic relationship. Contrast enhancement (increase) maps the luminance values in the interval [v_l, v_h] to the interval [0, 255]; luminance values below v_l and above v_h are saturated to 0 and 255, respectively. In the same manner, when contrast is decreased, the luminance values in [0, 255] are mapped to the range [v_l, v_h]. Since the PRNU noise is largely preserved under contrast adjustment, the resulting video signatures will not be modified significantly. In the experiments, we tried various [v_l, v_h] values ranging from [25, 230] to [115, 140]. As can be seen in Figure 4.5-a, video signatures are robust up to 90%

Figure 4.4: The distribution of correlation values. The blue distribution is obtained by correlating the unmodified video clips and their modified versions, red by correlating video clips with others from the same camcorder, and green by cross-correlating different videos, for different types of manipulations. (a) Decreased contrast. (b) Increased contrast. (c) Decreased brightness. (d) Increased brightness. (e) Blurring. (f) AWGN addition. (g) Compression. (h) Random frame dropping.

contrast increase, which corresponds to the [102, 153] range. For the enhancement values [114, 140], the mean correlation value was around 0.18 and in fact all correlation values were lower than 0.2. However, it must be noted that this is a very extreme case in which most of the luminance values of the frames are saturated to 0 and 255. On the other hand, even in the most extreme case of contrast decrease, where the luminance values in the range [0, 255] are mapped to the [114, 140] range, we were able to detect all the copies of the video clips (Figure 4.5-b). The distributions of correlation values between the signatures of the original video clips and their contrast-increased versions can be seen in Figure 4.4-a, and their contrast-decreased versions in Figure 4.4-b. These results show that the extracted signatures are very robust to contrast manipulations.

Figure 4.5: The change in the mean of correlation values as a function of the strength of (a) contrast increase and (b) contrast decrease.
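As a rough illustration of the contrast adjustment used in this subsection, the C sketch below maps 8-bit luminance values with the interval [v_l, v_h] and saturation as described above; the uint8 frame layout and integer rounding are assumptions made for the sketch.

```c
#include <stdint.h>
#include <stddef.h>

/* Contrast increase: map luminance values in [v_lo, v_hi] onto [0, 255],
 * saturating values below v_lo to 0 and above v_hi to 255. */
void contrast_increase(uint8_t *frame, size_t npixels, int v_lo, int v_hi)
{
    for (size_t i = 0; i < npixels; i++) {
        int v = frame[i];
        if (v <= v_lo)      frame[i] = 0;
        else if (v >= v_hi) frame[i] = 255;
        else                frame[i] = (uint8_t)((v - v_lo) * 255 / (v_hi - v_lo));
    }
}

/* Contrast decrease: the reverse mapping, from [0, 255] into [v_lo, v_hi]. */
void contrast_decrease(uint8_t *frame, size_t npixels, int v_lo, int v_hi)
{
    for (size_t i = 0; i < npixels; i++)
        frame[i] = (uint8_t)(v_lo + frame[i] * (v_hi - v_lo) / 255);
}
```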

4.3.2 Brightness Adjustment

Brightness adjustment is performed by either adding or subtracting p percent of the frame's mean luminance value to or from each pixel in the frame, where p is a user-defined parameter. Since this operation only offsets the pixel values, the PRNU noise will be almost fully preserved and the video signature will not change much. During the experiments we varied the value of p between 10% and 190%, where 10%-99% indicates a brightness increase and 101%-190% indicates a brightness decrease. The correlations of the signatures after adjusting brightness are given in Figures 4.4-c and 4.4-d. Also, the average change in the correlation values with respect to the change in brightness level is given in Figures 4.6-a and 4.6-b. As can be seen in these figures, detection fails only when the brightness increase is at an extreme level. For the other instances, the video signatures are observed to be robust; therefore, we were able to detect all the copied videos without any false positives.

Figure 4.6: The change in the mean of correlation values as a function of the strength of (a) brightness decrease and (b) brightness increase.
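A minimal sketch of this brightness offset, assuming 8-bit luminance and treating p as a signed percentage (rather than the 10%-190% encoding used in the experiments), is given below.

```c
#include <stdint.h>
#include <stddef.h>

/* Shift every pixel by p percent of the frame's mean luminance.
 * A positive p brightens the frame, a negative p darkens it;
 * results are clipped to the valid 8-bit range. */
void adjust_brightness(uint8_t *frame, size_t npixels, double p)
{
    double sum = 0.0;
    for (size_t i = 0; i < npixels; i++)
        sum += frame[i];
    double offset = (p / 100.0) * (sum / (double)npixels);

    for (size_t i = 0; i < npixels; i++) {
        double v = frame[i] + offset;
        frame[i] = (uint8_t)(v < 0.0 ? 0.0 : (v > 255.0 ? 255.0 : v));
    }
}
```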

4.3.3 Blurring

Blurring is performed by filtering each frame with a standard Gaussian filter with parameter σ (i.e., the standard deviation). Since blurring removes much of the medium- to high-frequency content, the PRNU noise may be largely removed (depending on the choice of σ), making the extracted signatures unreliable. In the experiments, we considered the values σ = 2, 3, 5, 7. Figure 4.7-a shows the mean value of the resulting correlations as the filter size changes. The results indicate that the signatures are robust to blurring only if the Gaussian filter width σ is less than 3. The distribution of correlation values can be seen in Figure 4.4-e.

4.3.4 AWGN Addition

Noise addition will degrade the accuracy of the PRNU noise estimates, and it is to be expected that with increasing noise power, reliable detection of the PRNU noise will get more and more difficult. When the noise is additive and frame-wise independent, its impact can be reduced by averaging over a large number of frames; however, this will be effective only for very long videos. In the experiments, we added additive white Gaussian noise (AWGN), with varying standard deviation σ, to each video frame. The considered range of noise levels is σ = 2, 3, 5, 10, 20, 30. The results in Figure 4.7-b show that performance is not satisfactory when σ > 5. For σ = 20, 30 our scheme did not work at all; for σ = 5 we achieved an 80% true positive rate (TPR), and for σ = 10 the TPR was 30%, in both cases with no false

positives. Figure 4.4-f provides the distribution of correlation values after AWGN addition.

Figure 4.7: The change in the mean of correlation values as a function of the strength of (a) blurring and (b) AWGN addition.

4.3.5 Compression

To show the impact of compression, we re-encoded all videos at bit-rates ranging from 0.8 Mbps to 2 Mbps, while preserving the frame resolution. (Since compression beyond 0.8 Mbps caused a decrease in frame resolution, we did not consider lower bit-rate values.) We observed that accuracy does not vary with the bit-rate, as can be seen in Figure 4.8-a and Figure 4.4. Therefore, we can conclude that the signatures are very robust to bit-rate changes. The distribution of correlation values, given in Figure 4.4-g, shows a similar trend.

Figure 4.8: The change in the mean of correlation values as a function of the strength of (a) compression and (b) frame dropping.

4.3.6 Random Frame Dropping

To illustrate the impact of a lossy channel, we randomly removed frames from each video clip before extracting the signature. The drop rate varied between 50% and 90%. As Figure 4.8-b and Figure 4.4-h indicate, the extracted signatures were reliable for all frame drop rates. The correlations of the signatures after random frame dropping can also be found in Figure 4.4-h.

4.4 Performance Evaluation

To test the performance of the proposed video copy detection scheme, we used videos from the video sharing site YouTube. For this purpose, we downloaded more than 400 videos found under 44 distinct search names without imposing any other constraints (e.g., resolution, compression level, synchronization in time). Each distinct video had between 2 and 39 copies. These videos include TV commercials, movie trailers,

and music clips, and the duration of each video varies from 20 seconds to 10 minutes at a resolution of 240x320 pixels. The signatures extracted from the 400 videos are then cross-correlated. The distributions of the resulting correlation values are given in Figure 4.9-a. In this figure, the blue distribution curve indicates the correlation of signatures associated with the same videos, and the red one the correlation of signatures associated with different videos. From these distributions, it can be immediately seen that for the same videos the correlation values are in general greater than 0.5 and mostly close to 1. For different videos, on the other hand, the correlation values are centered around 0 with a maximum of less than 0.5. To evaluate the performance of the scheme in detecting video copies at a given decision threshold, we counted the number of decision errors in which the copy of a video is deemed to be a different video (false rejection) and in which a different video is detected as a copy (false acceptance). (This is realized by comparing the correlation values associated with all pairs of videos against a preset threshold; a small counting sketch is given after the discussion of misidentification causes below.) Figure 4.9-b displays the receiver operating characteristic (ROC) curve, which shows the change in the false-acceptance rate against the false-rejection rate as the decision threshold is varied across all values. The ROC curve shows that the misidentification rate is very low. Note that in the best case, the accuracy is 99.30%. To see why some of the videos did not correlate with their copies, we examined the videos more closely. We found several reasons for not obtaining similar signatures from the available copies. The most common reason for a misidentification is (slight)

Figure 4.9: (a) Cross-correlation of extracted video signatures. (b) ROC curve for detection results on the videos downloaded from YouTube.

scaling of the videos. Since after scaling the extracted signatures do not align, those videos yielded very low correlation values. Figure 4.10-a shows an example of a video and its scaled copy. As expected, another reason for observing low correlation values is compression. Figure 4.10-b provides an example where the copied version of the video is compressed by a factor of 0.75, which yields a correlation value just below the threshold. Another factor contributing to misidentifications is extra content (like advertisements) inserted into the videos. We observed that if the added content is around 10% of the length of the original video, the signatures still yield correlation values of more than 0.5. When the added content is more than 30% of the original duration, the resulting signatures become substantially dissimilar. Another reason for low correlation values is video summarization. It is observed that even if a video is shortened by more than 30% of its original length, the signatures yield satisfactory correlation. However, in general, when videos are shortened by more than 40%, our scheme was not able to correctly detect the copies.
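The error counting behind the ROC curve in Figure 4.9-b can be sketched as follows; this is only an illustrative C sketch, assuming the pairwise correlations have already been computed and separated into copy pairs and different-video pairs.

```c
#include <stddef.h>

/* Count false rejections (copy pairs falling below the threshold) and
 * false acceptances (different-video pairs exceeding it) at a single
 * decision threshold; sweeping the threshold traces out the ROC curve. */
void count_errors(const double *copy_corr, size_t n_copy,
                  const double *diff_corr, size_t n_diff,
                  double threshold, double *frr, double *far)
{
    size_t fr = 0, fa = 0;
    for (size_t i = 0; i < n_copy; i++)
        if (copy_corr[i] < threshold) fr++;     /* missed copy */
    for (size_t i = 0; i < n_diff; i++)
        if (diff_corr[i] >= threshold) fa++;    /* wrongly accepted */
    *frr = (double)fr / (double)n_copy;
    *far = (double)fa / (double)n_diff;
}
```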

Figure 4.10: Video copies for which the extracted signatures are dissimilar. (a) Scaled versions. (b) Highly compressed version.

We notice that our signatures are quite robust in the presence of on-screen graphic objects, like subtitles and small advertisements, that overlay the video content. Figure 4.11-a gives one such example where detection can be successfully achieved. In addition, small shifts in time did not affect the signature much. In Figure 4.11-b one can see the 300th frames of a video and its copy. In this example, the second video started with a blank screen lasting around 2 seconds, which yielded a shift in time. We also examined the videos that were falsely detected as copies with high correlation values. In our observation, the most dominant factor in those cases is the continuous presence of a logo or advertisement in different videos, as exemplified in Figure 4.12.

Figure 4.11: Video copies with similar signatures. (a) Added subtitles. (b) Shifted in time.

To determine the impact of the content and the imaging sensor fingerprint on the resulting video signatures, we performed another experiment on YouTube videos. For this purpose, we downloaded 36 distinct videos of a commercial series (the now

Figure 4.12: Different videos with similar signatures. corr(a,b) = 0.45.

famous PC vs. Mac commercial) hypothesizing that they were captured using the same set of source device(s). The videos were short, typically 27 seconds, with an average of 700 frames per video at a resolution of 240x320 pixels per frame, and content-wise they are quite similar. Figure 4.13 shows representative frames extracted from four of the videos. We extracted signatures from each of the videos and computed pairwise correlations among all signatures. In Figure 4.14, the red distribution shows the values obtained by the pairwise correlations. As can be seen, most of the resulting values are very close to 0, implying no relation between the videos. These results indicate that if the content of the videos is not the same, but only very similar, our scheme does not detect them as copies, even if they might have been captured by the same set of source devices. On the other hand, these results do not allow us to conclude whether or not the commercials were captured using the same source device(s), as videos captured by the same device would be expected to yield higher correlation values. Since extracting a reliable sensor fingerprint from Internet-quality videos requires longer videos, we performed another experiment to see if the source devices for the videos match. For this purpose, we first randomly chose 10 videos and combined them together to generate a composite video. Then

Figure 4.13: The frames of four example videos from a commercial series.

we generated another composite video by choosing 10 different videos from the remaining ones and correlated the resulting signatures of the two composite videos. (Note that the two composite videos have no overlapping content.) We repeated the same experiment 250 times, drawing different combinations of the 36 videos each time. The distribution of the resulting correlation values is shown in Figure 4.14 in blue. These results strongly imply that at least some of the videos were taken by the same set of cameras or camcorders. Overall, the experiments on this commercial series showed that when the same camcorders are used in the capturing process of two videos, if their contents are not the same, even though they may be similar, the resulting signatures will be significantly different.

Figure 4.14: The distribution of correlation values obtained from the commercial series. The red distribution is obtained by pairwise correlations of the individual videos and the blue distribution by correlations of the composite videos.


More information

JPEG compression of monochrome 2D-barcode images using DCT coefficient distributions

JPEG compression of monochrome 2D-barcode images using DCT coefficient distributions Edith Cowan University Research Online ECU Publications Pre. JPEG compression of monochrome D-barcode images using DCT coefficient distributions Keng Teong Tan Hong Kong Baptist University Douglas Chai

More information

Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall

Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall Automatic Photo Quality Assessment Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall Estimating i the photorealism of images: Distinguishing i i paintings from photographs h Florin

More information

Image Authentication Scheme using Digital Signature and Digital Watermarking

Image Authentication Scheme using Digital Signature and Digital Watermarking www..org 59 Image Authentication Scheme using Digital Signature and Digital Watermarking Seyed Mohammad Mousavi Industrial Management Institute, Tehran, Iran Abstract Usual digital signature schemes for

More information

Semantic Video Annotation by Mining Association Patterns from Visual and Speech Features

Semantic Video Annotation by Mining Association Patterns from Visual and Speech Features Semantic Video Annotation by Mining Association Patterns from and Speech Features Vincent. S. Tseng, Ja-Hwung Su, Jhih-Hong Huang and Chih-Jen Chen Department of Computer Science and Information Engineering

More information

ABSTRACT VIDEO INDEXING AND SUMMARIZATION USING MOTION ACTIVITY. by Kadir Askin Peker

ABSTRACT VIDEO INDEXING AND SUMMARIZATION USING MOTION ACTIVITY. by Kadir Askin Peker ABSTRACT VIDEO INDEXING AND SUMMARIZATION USING MOTION ACTIVITY by Kadir Askin Peker In this dissertation, video-indexing techniques using low-level motion activity characteristics and their application

More information

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic

More information

WHITE PAPER. Are More Pixels Better? www.basler-ipcam.com. Resolution Does it Really Matter?

WHITE PAPER. Are More Pixels Better? www.basler-ipcam.com. Resolution Does it Really Matter? WHITE PAPER www.basler-ipcam.com Are More Pixels Better? The most frequently asked question when buying a new digital security camera is, What resolution does the camera provide? The resolution is indeed

More information

International Journal of Advanced Information in Arts, Science & Management Vol.2, No.2, December 2014

International Journal of Advanced Information in Arts, Science & Management Vol.2, No.2, December 2014 Efficient Attendance Management System Using Face Detection and Recognition Arun.A.V, Bhatath.S, Chethan.N, Manmohan.C.M, Hamsaveni M Department of Computer Science and Engineering, Vidya Vardhaka College

More information

Image Compression through DCT and Huffman Coding Technique

Image Compression through DCT and Huffman Coding Technique International Journal of Current Engineering and Technology E-ISSN 2277 4106, P-ISSN 2347 5161 2015 INPRESSCO, All Rights Reserved Available at http://inpressco.com/category/ijcet Research Article Rahul

More information

WATERMARKING FOR IMAGE AUTHENTICATION

WATERMARKING FOR IMAGE AUTHENTICATION WATERMARKING FOR IMAGE AUTHENTICATION Min Wu Bede Liu Department of Electrical Engineering Princeton University, Princeton, NJ 08544, USA Fax: +1-609-258-3745 {minwu, liu}@ee.princeton.edu ABSTRACT A data

More information

Parametric Comparison of H.264 with Existing Video Standards

Parametric Comparison of H.264 with Existing Video Standards Parametric Comparison of H.264 with Existing Video Standards Sumit Bhardwaj Department of Electronics and Communication Engineering Amity School of Engineering, Noida, Uttar Pradesh,INDIA Jyoti Bhardwaj

More information

Scanners and How to Use Them

Scanners and How to Use Them Written by Jonathan Sachs Copyright 1996-1999 Digital Light & Color Introduction A scanner is a device that converts images to a digital file you can use with your computer. There are many different types

More information

Building an Advanced Invariant Real-Time Human Tracking System

Building an Advanced Invariant Real-Time Human Tracking System UDC 004.41 Building an Advanced Invariant Real-Time Human Tracking System Fayez Idris 1, Mazen Abu_Zaher 2, Rashad J. Rasras 3, and Ibrahiem M. M. El Emary 4 1 School of Informatics and Computing, German-Jordanian

More information

Video Affective Content Recognition Based on Genetic Algorithm Combined HMM

Video Affective Content Recognition Based on Genetic Algorithm Combined HMM Video Affective Content Recognition Based on Genetic Algorithm Combined HMM Kai Sun and Junqing Yu Computer College of Science & Technology, Huazhong University of Science & Technology, Wuhan 430074, China

More information

Data Storage. Chapter 3. Objectives. 3-1 Data Types. Data Inside the Computer. After studying this chapter, students should be able to:

Data Storage. Chapter 3. Objectives. 3-1 Data Types. Data Inside the Computer. After studying this chapter, students should be able to: Chapter 3 Data Storage Objectives After studying this chapter, students should be able to: List five different data types used in a computer. Describe how integers are stored in a computer. Describe how

More information

Data Storage 3.1. Foundations of Computer Science Cengage Learning

Data Storage 3.1. Foundations of Computer Science Cengage Learning 3 Data Storage 3.1 Foundations of Computer Science Cengage Learning Objectives After studying this chapter, the student should be able to: List five different data types used in a computer. Describe how

More information

A Proposal for OpenEXR Color Management

A Proposal for OpenEXR Color Management A Proposal for OpenEXR Color Management Florian Kainz, Industrial Light & Magic Revision 5, 08/05/2004 Abstract We propose a practical color management scheme for the OpenEXR image file format as used

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

Predict the Popularity of YouTube Videos Using Early View Data

Predict the Popularity of YouTube Videos Using Early View Data 000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050

More information

Multimedia Document Authentication using On-line Signatures as Watermarks

Multimedia Document Authentication using On-line Signatures as Watermarks Multimedia Document Authentication using On-line Signatures as Watermarks Anoop M Namboodiri and Anil K Jain Department of Computer Science and Engineering Michigan State University East Lansing, MI 48824

More information

Big Data: Rethinking Text Visualization

Big Data: Rethinking Text Visualization Big Data: Rethinking Text Visualization Dr. Anton Heijs anton.heijs@treparel.com Treparel April 8, 2013 Abstract In this white paper we discuss text visualization approaches and how these are important

More information

Multimodal Biometric Recognition Security System

Multimodal Biometric Recognition Security System Multimodal Biometric Recognition Security System Anju.M.I, G.Sheeba, G.Sivakami, Monica.J, Savithri.M Department of ECE, New Prince Shri Bhavani College of Engg. & Tech., Chennai, India ABSTRACT: Security

More information

DYNAMIC RANGE IMPROVEMENT THROUGH MULTIPLE EXPOSURES. Mark A. Robertson, Sean Borman, and Robert L. Stevenson

DYNAMIC RANGE IMPROVEMENT THROUGH MULTIPLE EXPOSURES. Mark A. Robertson, Sean Borman, and Robert L. Stevenson c 1999 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or

More information

A Spectral Clustering Approach to Validating Sensors via Their Peers in Distributed Sensor Networks

A Spectral Clustering Approach to Validating Sensors via Their Peers in Distributed Sensor Networks A Spectral Clustering Approach to Validating Sensors via Their Peers in Distributed Sensor Networks H. T. Kung Dario Vlah {htk, dario}@eecs.harvard.edu Harvard School of Engineering and Applied Sciences

More information

Florida International University - University of Miami TRECVID 2014

Florida International University - University of Miami TRECVID 2014 Florida International University - University of Miami TRECVID 2014 Miguel Gavidia 3, Tarek Sayed 1, Yilin Yan 1, Quisha Zhu 1, Mei-Ling Shyu 1, Shu-Ching Chen 2, Hsin-Yu Ha 2, Ming Ma 1, Winnie Chen 4,

More information

How To Filter Spam Image From A Picture By Color Or Color

How To Filter Spam Image From A Picture By Color Or Color Image Content-Based Email Spam Image Filtering Jianyi Wang and Kazuki Katagishi Abstract With the population of Internet around the world, email has become one of the main methods of communication among

More information

Image Normalization for Illumination Compensation in Facial Images

Image Normalization for Illumination Compensation in Facial Images Image Normalization for Illumination Compensation in Facial Images by Martin D. Levine, Maulin R. Gandhi, Jisnu Bhattacharyya Department of Electrical & Computer Engineering & Center for Intelligent Machines

More information

High Quality Image Magnification using Cross-Scale Self-Similarity

High Quality Image Magnification using Cross-Scale Self-Similarity High Quality Image Magnification using Cross-Scale Self-Similarity André Gooßen 1, Arne Ehlers 1, Thomas Pralow 2, Rolf-Rainer Grigat 1 1 Vision Systems, Hamburg University of Technology, D-21079 Hamburg

More information

An Energy-Based Vehicle Tracking System using Principal Component Analysis and Unsupervised ART Network

An Energy-Based Vehicle Tracking System using Principal Component Analysis and Unsupervised ART Network Proceedings of the 8th WSEAS Int. Conf. on ARTIFICIAL INTELLIGENCE, KNOWLEDGE ENGINEERING & DATA BASES (AIKED '9) ISSN: 179-519 435 ISBN: 978-96-474-51-2 An Energy-Based Vehicle Tracking System using Principal

More information

ECE 533 Project Report Ashish Dhawan Aditi R. Ganesan

ECE 533 Project Report Ashish Dhawan Aditi R. Ganesan Handwritten Signature Verification ECE 533 Project Report by Ashish Dhawan Aditi R. Ganesan Contents 1. Abstract 3. 2. Introduction 4. 3. Approach 6. 4. Pre-processing 8. 5. Feature Extraction 9. 6. Verification

More information

How To Fix Out Of Focus And Blur Images With A Dynamic Template Matching Algorithm

How To Fix Out Of Focus And Blur Images With A Dynamic Template Matching Algorithm IJSTE - International Journal of Science Technology & Engineering Volume 1 Issue 10 April 2015 ISSN (online): 2349-784X Image Estimation Algorithm for Out of Focus and Blur Images to Retrieve the Barcode

More information

Environmental Remote Sensing GEOG 2021

Environmental Remote Sensing GEOG 2021 Environmental Remote Sensing GEOG 2021 Lecture 4 Image classification 2 Purpose categorising data data abstraction / simplification data interpretation mapping for land cover mapping use land cover class

More information

Canny Edge Detection

Canny Edge Detection Canny Edge Detection 09gr820 March 23, 2009 1 Introduction The purpose of edge detection in general is to significantly reduce the amount of data in an image, while preserving the structural properties

More information

Classification of Fingerprints. Sarat C. Dass Department of Statistics & Probability

Classification of Fingerprints. Sarat C. Dass Department of Statistics & Probability Classification of Fingerprints Sarat C. Dass Department of Statistics & Probability Fingerprint Classification Fingerprint classification is a coarse level partitioning of a fingerprint database into smaller

More information

TVL - The True Measurement of Video Quality

TVL - The True Measurement of Video Quality ACTi Knowledge Base Category: Educational Note Sub-category: Video Quality, Hardware Model: N/A Firmware: N/A Software: N/A Author: Ando.Meritee Published: 2010/10/25 Reviewed: 2010/10/27 TVL - The True

More information

HSI BASED COLOUR IMAGE EQUALIZATION USING ITERATIVE n th ROOT AND n th POWER

HSI BASED COLOUR IMAGE EQUALIZATION USING ITERATIVE n th ROOT AND n th POWER HSI BASED COLOUR IMAGE EQUALIZATION USING ITERATIVE n th ROOT AND n th POWER Gholamreza Anbarjafari icv Group, IMS Lab, Institute of Technology, University of Tartu, Tartu 50411, Estonia sjafari@ut.ee

More information

Detection. Perspective. Network Anomaly. Bhattacharyya. Jugal. A Machine Learning »C) Dhruba Kumar. Kumar KaKta. CRC Press J Taylor & Francis Croup

Detection. Perspective. Network Anomaly. Bhattacharyya. Jugal. A Machine Learning »C) Dhruba Kumar. Kumar KaKta. CRC Press J Taylor & Francis Croup Network Anomaly Detection A Machine Learning Perspective Dhruba Kumar Bhattacharyya Jugal Kumar KaKta»C) CRC Press J Taylor & Francis Croup Boca Raton London New York CRC Press is an imprint of the Taylor

More information

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives

More information

Circle Object Recognition Based on Monocular Vision for Home Security Robot

Circle Object Recognition Based on Monocular Vision for Home Security Robot Journal of Applied Science and Engineering, Vol. 16, No. 3, pp. 261 268 (2013) DOI: 10.6180/jase.2013.16.3.05 Circle Object Recognition Based on Monocular Vision for Home Security Robot Shih-An Li, Ching-Chang

More information

A Method of Caption Detection in News Video

A Method of Caption Detection in News Video 3rd International Conference on Multimedia Technology(ICMT 3) A Method of Caption Detection in News Video He HUANG, Ping SHI Abstract. News video is one of the most important media for people to get information.

More information

ROBUST COLOR JOINT MULTI-FRAME DEMOSAICING AND SUPER- RESOLUTION ALGORITHM

ROBUST COLOR JOINT MULTI-FRAME DEMOSAICING AND SUPER- RESOLUTION ALGORITHM ROBUST COLOR JOINT MULTI-FRAME DEMOSAICING AND SUPER- RESOLUTION ALGORITHM Theodor Heinze Hasso-Plattner-Institute for Software Systems Engineering Prof.-Dr.-Helmert-Str. 2-3, 14482 Potsdam, Germany theodor.heinze@hpi.uni-potsdam.de

More information

Low-resolution Character Recognition by Video-based Super-resolution

Low-resolution Character Recognition by Video-based Super-resolution 2009 10th International Conference on Document Analysis and Recognition Low-resolution Character Recognition by Video-based Super-resolution Ataru Ohkura 1, Daisuke Deguchi 1, Tomokazu Takahashi 2, Ichiro

More information

Introduction to Pattern Recognition

Introduction to Pattern Recognition Introduction to Pattern Recognition Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2009 CS 551, Spring 2009 c 2009, Selim Aksoy (Bilkent University)

More information

Choosing a digital camera for your microscope John C. Russ, Materials Science and Engineering Dept., North Carolina State Univ.

Choosing a digital camera for your microscope John C. Russ, Materials Science and Engineering Dept., North Carolina State Univ. Choosing a digital camera for your microscope John C. Russ, Materials Science and Engineering Dept., North Carolina State Univ., Raleigh, NC One vital step is to choose a transfer lens matched to your

More information

The Delicate Art of Flower Classification

The Delicate Art of Flower Classification The Delicate Art of Flower Classification Paul Vicol Simon Fraser University University Burnaby, BC pvicol@sfu.ca Note: The following is my contribution to a group project for a graduate machine learning

More information

An introduction to OBJECTIVE ASSESSMENT OF IMAGE QUALITY. Harrison H. Barrett University of Arizona Tucson, AZ

An introduction to OBJECTIVE ASSESSMENT OF IMAGE QUALITY. Harrison H. Barrett University of Arizona Tucson, AZ An introduction to OBJECTIVE ASSESSMENT OF IMAGE QUALITY Harrison H. Barrett University of Arizona Tucson, AZ Outline! Approaches to image quality! Why not fidelity?! Basic premises of the task-based approach!

More information

Galaxy Morphological Classification

Galaxy Morphological Classification Galaxy Morphological Classification Jordan Duprey and James Kolano Abstract To solve the issue of galaxy morphological classification according to a classification scheme modelled off of the Hubble Sequence,

More information

Palmprint Recognition. By Sree Rama Murthy kora Praveen Verma Yashwant Kashyap

Palmprint Recognition. By Sree Rama Murthy kora Praveen Verma Yashwant Kashyap Palmprint Recognition By Sree Rama Murthy kora Praveen Verma Yashwant Kashyap Palm print Palm Patterns are utilized in many applications: 1. To correlate palm patterns with medical disorders, e.g. genetic

More information

Image Compression and Decompression using Adaptive Interpolation

Image Compression and Decompression using Adaptive Interpolation Image Compression and Decompression using Adaptive Interpolation SUNILBHOOSHAN 1,SHIPRASHARMA 2 Jaypee University of Information Technology 1 Electronicsand Communication EngineeringDepartment 2 ComputerScience

More information

Blind Deconvolution of Barcodes via Dictionary Analysis and Wiener Filter of Barcode Subsections

Blind Deconvolution of Barcodes via Dictionary Analysis and Wiener Filter of Barcode Subsections Blind Deconvolution of Barcodes via Dictionary Analysis and Wiener Filter of Barcode Subsections Maximilian Hung, Bohyun B. Kim, Xiling Zhang August 17, 2013 Abstract While current systems already provide

More information

NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE. Venu Govindaraju

NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE. Venu Govindaraju NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE Venu Govindaraju BIOMETRICS DOCUMENT ANALYSIS PATTERN RECOGNITION 8/24/2015 ICDAR- 2015 2 Towards a Globally Optimal Approach for Learning Deep Unsupervised

More information

Potential of face area data for predicting sharpness of natural images

Potential of face area data for predicting sharpness of natural images Potential of face area data for predicting sharpness of natural images Mikko Nuutinen a, Olli Orenius b, Timo Säämänen b, Pirkko Oittinen a a Dept. of Media Technology, Aalto University School of Science

More information

SoMA. Automated testing system of camera algorithms. Sofica Ltd

SoMA. Automated testing system of camera algorithms. Sofica Ltd SoMA Automated testing system of camera algorithms Sofica Ltd February 2012 2 Table of Contents Automated Testing for Camera Algorithms 3 Camera Algorithms 3 Automated Test 4 Testing 6 API Testing 6 Functional

More information

MASCOT Search Results Interpretation

MASCOT Search Results Interpretation The Mascot protein identification program (Matrix Science, Ltd.) uses statistical methods to assess the validity of a match. MS/MS data is not ideal. That is, there are unassignable peaks (noise) and usually

More information

Simultaneous Gamma Correction and Registration in the Frequency Domain

Simultaneous Gamma Correction and Registration in the Frequency Domain Simultaneous Gamma Correction and Registration in the Frequency Domain Alexander Wong a28wong@uwaterloo.ca William Bishop wdbishop@uwaterloo.ca Department of Electrical and Computer Engineering University

More information

Biometric Authentication using Online Signatures

Biometric Authentication using Online Signatures Biometric Authentication using Online Signatures Alisher Kholmatov and Berrin Yanikoglu alisher@su.sabanciuniv.edu, berrin@sabanciuniv.edu http://fens.sabanciuniv.edu Sabanci University, Tuzla, Istanbul,

More information

AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.

AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S. AUTOMATION OF ENERGY DEMAND FORECASTING by Sanzad Siddique, B.S. A Thesis submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment of the Requirements for the Degree

More information

A comprehensive survey on various ETC techniques for secure Data transmission

A comprehensive survey on various ETC techniques for secure Data transmission A comprehensive survey on various ETC techniques for secure Data transmission Shaikh Nasreen 1, Prof. Suchita Wankhade 2 1, 2 Department of Computer Engineering 1, 2 Trinity College of Engineering and

More information

Similarity Search in a Very Large Scale Using Hadoop and HBase

Similarity Search in a Very Large Scale Using Hadoop and HBase Similarity Search in a Very Large Scale Using Hadoop and HBase Stanislav Barton, Vlastislav Dohnal, Philippe Rigaux LAMSADE - Universite Paris Dauphine, France Internet Memory Foundation, Paris, France

More information

LIST OF CONTENTS CHAPTER CONTENT PAGE DECLARATION DEDICATION ACKNOWLEDGEMENTS ABSTRACT ABSTRAK

LIST OF CONTENTS CHAPTER CONTENT PAGE DECLARATION DEDICATION ACKNOWLEDGEMENTS ABSTRACT ABSTRAK vii LIST OF CONTENTS CHAPTER CONTENT PAGE DECLARATION DEDICATION ACKNOWLEDGEMENTS ABSTRACT ABSTRAK LIST OF CONTENTS LIST OF TABLES LIST OF FIGURES LIST OF NOTATIONS LIST OF ABBREVIATIONS LIST OF APPENDICES

More information

How To Evaluate The Performance Of The Process Industry Supply Chain

How To Evaluate The Performance Of The Process Industry Supply Chain Performance Evaluation of the Process Industry Supply r Chain: Case of the Petroleum Industry in India :.2A By Siddharth Varma Submitted in fulfillment of requirements of the degree of DOCTOR OF PHILOSOPHY

More information

Azure Machine Learning, SQL Data Mining and R

Azure Machine Learning, SQL Data Mining and R Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

More information

Face Recognition in Low-resolution Images by Using Local Zernike Moments

Face Recognition in Low-resolution Images by Using Local Zernike Moments Proceedings of the International Conference on Machine Vision and Machine Learning Prague, Czech Republic, August14-15, 014 Paper No. 15 Face Recognition in Low-resolution Images by Using Local Zernie

More information

Build Panoramas on Android Phones

Build Panoramas on Android Phones Build Panoramas on Android Phones Tao Chu, Bowen Meng, Zixuan Wang Stanford University, Stanford CA Abstract The purpose of this work is to implement panorama stitching from a sequence of photos taken

More information

How To Cluster

How To Cluster Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms k-means Hierarchical Main

More information

Prediction of DDoS Attack Scheme

Prediction of DDoS Attack Scheme Chapter 5 Prediction of DDoS Attack Scheme Distributed denial of service attack can be launched by malicious nodes participating in the attack, exploit the lack of entry point in a wireless network, and

More information

Signature verification using Kolmogorov-Smirnov. statistic

Signature verification using Kolmogorov-Smirnov. statistic Signature verification using Kolmogorov-Smirnov statistic Harish Srinivasan, Sargur N.Srihari and Matthew J Beal University at Buffalo, the State University of New York, Buffalo USA {srihari,hs32}@cedar.buffalo.edu,mbeal@cse.buffalo.edu

More information

Multivariate Analysis of Ecological Data

Multivariate Analysis of Ecological Data Multivariate Analysis of Ecological Data MICHAEL GREENACRE Professor of Statistics at the Pompeu Fabra University in Barcelona, Spain RAUL PRIMICERIO Associate Professor of Ecology, Evolutionary Biology

More information

Implementation of Canny Edge Detector of color images on CELL/B.E. Architecture.

Implementation of Canny Edge Detector of color images on CELL/B.E. Architecture. Implementation of Canny Edge Detector of color images on CELL/B.E. Architecture. Chirag Gupta,Sumod Mohan K cgupta@clemson.edu, sumodm@clemson.edu Abstract In this project we propose a method to improve

More information

Principal components analysis

Principal components analysis CS229 Lecture notes Andrew Ng Part XI Principal components analysis In our discussion of factor analysis, we gave a way to model data x R n as approximately lying in some k-dimension subspace, where k

More information

2695 P a g e. IV Semester M.Tech (DCN) SJCIT Chickballapur Karnataka India

2695 P a g e. IV Semester M.Tech (DCN) SJCIT Chickballapur Karnataka India Integrity Preservation and Privacy Protection for Digital Medical Images M.Krishna Rani Dr.S.Bhargavi IV Semester M.Tech (DCN) SJCIT Chickballapur Karnataka India Abstract- In medical treatments, the integrity

More information

Enhancing the SNR of the Fiber Optic Rotation Sensor using the LMS Algorithm

Enhancing the SNR of the Fiber Optic Rotation Sensor using the LMS Algorithm 1 Enhancing the SNR of the Fiber Optic Rotation Sensor using the LMS Algorithm Hani Mehrpouyan, Student Member, IEEE, Department of Electrical and Computer Engineering Queen s University, Kingston, Ontario,

More information

VISUALIZATION. Improving the Computer Forensic Analysis Process through

VISUALIZATION. Improving the Computer Forensic Analysis Process through By SHELDON TEERLINK and ROBERT F. ERBACHER Improving the Computer Forensic Analysis Process through VISUALIZATION The ability to display mountains of data in a graphical manner significantly enhances the

More information

Broadband Networks. Prof. Dr. Abhay Karandikar. Electrical Engineering Department. Indian Institute of Technology, Bombay. Lecture - 29.

Broadband Networks. Prof. Dr. Abhay Karandikar. Electrical Engineering Department. Indian Institute of Technology, Bombay. Lecture - 29. Broadband Networks Prof. Dr. Abhay Karandikar Electrical Engineering Department Indian Institute of Technology, Bombay Lecture - 29 Voice over IP So, today we will discuss about voice over IP and internet

More information

Graphic Design. Background: The part of an artwork that appears to be farthest from the viewer, or in the distance of the scene.

Graphic Design. Background: The part of an artwork that appears to be farthest from the viewer, or in the distance of the scene. Graphic Design Active Layer- When you create multi layers for your images the active layer, or the only one that will be affected by your actions, is the one with a blue background in your layers palette.

More information

IN current film media, the increase in areal density has

IN current film media, the increase in areal density has IEEE TRANSACTIONS ON MAGNETICS, VOL. 44, NO. 1, JANUARY 2008 193 A New Read Channel Model for Patterned Media Storage Seyhan Karakulak, Paul H. Siegel, Fellow, IEEE, Jack K. Wolf, Life Fellow, IEEE, and

More information

Numerical Algorithms Group. Embedded Analytics. A cure for the common code. www.nag.com. Results Matter. Trust NAG.

Numerical Algorithms Group. Embedded Analytics. A cure for the common code. www.nag.com. Results Matter. Trust NAG. Embedded Analytics A cure for the common code www.nag.com Results Matter. Trust NAG. Executive Summary How much information is there in your data? How much is hidden from you, because you don t have access

More information

Performance Analysis and Comparison of JM 15.1 and Intel IPP H.264 Encoder and Decoder

Performance Analysis and Comparison of JM 15.1 and Intel IPP H.264 Encoder and Decoder Performance Analysis and Comparison of 15.1 and H.264 Encoder and Decoder K.V.Suchethan Swaroop and K.R.Rao, IEEE Fellow Department of Electrical Engineering, University of Texas at Arlington Arlington,

More information

Predict Influencers in the Social Network

Predict Influencers in the Social Network Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons

More information

JPEG Image Compression by Using DCT

JPEG Image Compression by Using DCT International Journal of Computer Sciences and Engineering Open Access Research Paper Volume-4, Issue-4 E-ISSN: 2347-2693 JPEG Image Compression by Using DCT Sarika P. Bagal 1* and Vishal B. Raskar 2 1*

More information

Automatic Detection of Emergency Vehicles for Hearing Impaired Drivers

Automatic Detection of Emergency Vehicles for Hearing Impaired Drivers Automatic Detection of Emergency Vehicles for Hearing Impaired Drivers Sung-won ark and Jose Trevino Texas A&M University-Kingsville, EE/CS Department, MSC 92, Kingsville, TX 78363 TEL (36) 593-2638, FAX

More information

Survey of Scanner and Printer Forensics at Purdue University

Survey of Scanner and Printer Forensics at Purdue University Survey of Scanner and Printer Forensics at Purdue University Nitin Khanna, Aravind K. Mikkilineni, George T.-C. Chiu, Jan P. Allebach, and Edward J. Delp Purdue University, West Lafayette IN 47907, USA

More information

6 EXTENDING ALGEBRA. 6.0 Introduction. 6.1 The cubic equation. Objectives

6 EXTENDING ALGEBRA. 6.0 Introduction. 6.1 The cubic equation. Objectives 6 EXTENDING ALGEBRA Chapter 6 Extending Algebra Objectives After studying this chapter you should understand techniques whereby equations of cubic degree and higher can be solved; be able to factorise

More information

CHAPTER VII CONCLUSIONS

CHAPTER VII CONCLUSIONS CHAPTER VII CONCLUSIONS To do successful research, you don t need to know everything, you just need to know of one thing that isn t known. -Arthur Schawlow In this chapter, we provide the summery of the

More information