Chapter 8 Hash functions in digital forensics


In this chapter we describe the role of hash functions in digital forensics. Hash functions serve two main purposes. First, applying cryptographic hash functions ensures the authenticity and integrity of digital traces. Second, hash functions identify known objects (e.g., illicit files). Before we give details on their applications in IT forensics, we introduce the foundations of hash functions in Section 8.1. Section 8.2 then describes the use case authenticity and integrity of digital traces. Finally, Section 8.3 explains the use case data reduction by identification of known digital objects.

8.1 Cryptographic hash functions and approximate matching

In this section we first introduce the general idea of a hash function and then turn to two different concepts. First, we discuss cryptographic hash functions, which originate from cryptography and are used in the context of the security goals authenticity, integrity, and non-repudiation. Cryptographic hash functions are useful to uniquely identify an input bit string by its hash value. The second concept is rather new: it deals with the identification of similar input bit strings and is called approximate matching. We turn to approximate matching later in this section.

A general hash function is simply a function that takes an arbitrarily large bit string as input and outputs a bit string of fixed size. If $n \in \mathbb{N}$ denotes the bit length of the output and if we denote, as usual, by $\{0,1\}^*$ the set of all bit strings, then a hash function $h$ is a mapping

$$h : \{0,1\}^* \to \{0,1\}^n. \qquad (8.1)$$

Typically the computation of a hash value is efficient, that is, fast in practice. These two properties are characteristic of a hash function and are thus used for its definition (see e.g. [16]).

Definition 8.1: Hash function
Let $n \in \mathbb{N}$ be given. A hash function is a function that satisfies the following two properties:
1. Compression: $h : \{0,1\}^* \to \{0,1\}^n$.
2. Ease of computation: For all input bit strings $bs \in \{0,1\}^*$ the computation of $h(bs)$ is fast in practice.
The output $h(bs)$ is referred to as a hash value, fingerprint, signature, or digest.

Example 8.1
We look at two simple hash functions.
1. We set $n = 1$. For $bs \in \{0,1\}^*$ we define $h(bs)$ as the least significant bit of $bs$, with the additional definition $h(\emptyset) := 0$ for the empty bit string $\emptyset$. For instance, we have $h(10101) = h(11) = h(1) = 1$ and $h(1000) = h(10) = h(0) = 0$. Clearly this function satisfies both requirements from Definition 8.1.
2. We set $n = 2$. For $bs \in \{0,1\}^*$ we define $h(bs)$ as $bs \bmod 4$, where $bs$ is interpreted as a non-negative binary integer. Again, we set $h(\emptyset) := 0$ for the empty bit string $\emptyset$. For instance, we have $h(10110) = h(10) = 10 = 2$ and $h(1000) = h(0) = 0$. Again this function satisfies both requirements from Definition 8.1.

Cryptographic hash functions

Hash functions are well established in computer science for different purposes. Sample security applications of hash functions comprise the storage of passwords (e.g., on Linux systems), electronic signatures (both MACs and asymmetric signatures), and whitelists/blacklists in digital forensics.

Depending on the application, we have to impose further requirements. For instance, in cryptography a hash value serves as a unique identifier for its input, e.g., in the context of a digital signature, where the hash value uniquely represents the input data. In theory each hash value possesses infinitely many preimages, that is, input bit strings that map to the given hash value. In practice, however, it is not possible to compute such a preimage, because the run time of the most efficient known algorithm to find a preimage is too long. This property is called preimage resistance. Besides preimage resistance, a cryptographic hash function satisfies two additional security requirements, which we list in Definition 8.2.

Definition 8.2: Cryptographic hash function
Let $h : \{0,1\}^* \to \{0,1\}^n$ be a hash function. $h$ is called a cryptographic hash function if it additionally satisfies the following security requirements:
1. Preimage resistance: Let a hash value $H \in \{0,1\}^n$ be given. Then it is infeasible in practice to find an input (i.e., a bit string $bs$) with $H = h(bs)$.
2. Second preimage resistance: Let a bit string $bs_1 \in \{0,1\}^*$ be given. Then it is infeasible in practice to find a second bit string $bs_2$ with $bs_1 \neq bs_2$ and $h(bs_1) = h(bs_2)$.
3. Collision resistance: It is infeasible in practice to find any two bit strings $bs_1, bs_2 \in \{0,1\}^*$ with $bs_1 \neq bs_2$ and $h(bs_1) = h(bs_2)$.

Clearly neither hash function from Example 8.1 is a cryptographic hash function. For instance, consider the first function $h$ from Example 8.1 (with $n = 1$). It is not preimage resistant: given $b \in \{0,1\}$, we simply take $b$ itself as a preimage and have $h(b) = b$, that is, finding preimages is trivial. Second preimages and collisions are just as easy to find, for example $h(10101) = h(11)$.

As we will see in this chapter, the IT forensic community adopted cryptographic hash functions for two main purposes: ensuring authenticity and integrity of a digital trace, and automatic file identification. In both cases preimage resistance and, in particular, second preimage resistance are crucial, because the hash value of the input serves as a unique identifier for its preimage. If such an identifier is given and we are able to find a preimage that differs from the actual input, both IT forensic use cases are corrupted.
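The two toy hash functions from Example 8.1 are small enough to write down directly, which makes it easy to see why they fail the requirements of Definition 8.2. The following Python sketch is only an illustration; the function names lsb_hash and mod4_hash and the representation of bit strings as text made of the characters 0 and 1 are assumptions, not part of the original text.

```python
def _check_bit_string(bs: str) -> None:
    """Reject anything that is not a (possibly empty) string of 0s and 1s."""
    if any(ch not in "01" for ch in bs):
        raise ValueError("expected a bit string such as '10101'")


def lsb_hash(bs: str) -> str:
    """Toy hash with n = 1: the least significant bit, and 0 for the empty string."""
    _check_bit_string(bs)
    return bs[-1] if bs else "0"


def mod4_hash(bs: str) -> str:
    """Toy hash with n = 2: bs mod 4, returned as a 2-bit string."""
    _check_bit_string(bs)
    value = int(bs, 2) % 4 if bs else 0
    return format(value, "02b")


if __name__ == "__main__":
    # Values from Example 8.1.
    print(lsb_hash("10101"), lsb_hash("1000"))    # 1 0
    print(mod4_hash("10110"), mod4_hash("1000"))  # 10 00
    # Preimages are trivial: every 1-bit value is its own preimage.
    print(lsb_hash("1") == "1")                   # True
    # Collisions are trivial as well.
    print(lsb_hash("10101") == lsb_hash("11"))    # True
```

Both functions compress and are fast, so they are hash functions in the sense of Definition 8.1, but any digest can be inverted by inspection.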

If $h$ is a hash function, then a necessary condition for $h$ to be a cryptographic hash function is that the bit length $n$ of its digest is sufficiently large. For preimage resistance and second preimage resistance we have to impose $n \geq 100$; for collision resistance $h$ has to satisfy $n \geq 200$. We therefore recommend adopting the stronger requirement and only applying hash functions with $n \geq 200$. Sample cryptographic hash functions used in digital forensics are MD5 ($n = 128$), SHA-1 ($n = 160$), and hash functions from the SHA-2 family (e.g., SHA-256 with $n = 256$, [21]). For further details we refer to Table 8.1.

Table 8.1: Sample cryptographic hash functions.
Name    MD5    SHA-1    SHA-256    SHA-512    RIPEMD-160
n       128    160      256        512        160

One important implication of the security properties of a cryptographic hash function is the avalanche effect. If we change the input bit string, then every bit of the output is expected to change its value with probability 50%, i.e., we have no control over the output if the input changes. According to the avalanche effect, if only a single bit of the original input bit string $bs$ is changed to obtain a tampered one $bs'$, the two outputs $h(bs)$ and $h(bs')$ look very different. We demonstrate the avalanche effect on the basis of similar ASCII strings in Example 8.2.

Example 8.2: Avalanche effect
We demonstrate the avalanche effect by applying SHA-256 to a simple ASCII string: in the first string, Wolfgang claims to give Angela 1 million EUR, while the amount changes slightly to 1 billion EUR in the second string. However, the respective SHA-256 hash values look very different.

    $ echo "Dear Angela, I give you 1 million EUR. Wolfgang" | sha256sum
    cb10cfd3b6d47af94cd48c096c606ec8d2d836e80c7f87701ff450267efb
    $ echo "Dear Angela, I give you 1 billion EUR. Wolfgang" | sha256sum
    8dc377ef008781d dc7235aff7ac06e39a523eb7fda9ad547f6c4e -

The Linux command echo prints the given string (including a trailing newline character) to standard output. The Linux implementation of SHA-256, sha256sum, takes this string as input. The number of output characters of sha256sum is $256/4 = 64$, because each group of 4 bits of the hash value is printed as one hexadecimal digit.

The avalanche effect is desirable in the context of unique identifiers or the integrity of a trace, because it is easy to distinguish different input bit strings by comparing their respective hash values. However, the avalanche effect prevents the detection of similar objects. It is important to keep this property in mind for the two use cases of cryptographic hash functions in IT forensics.

Bloom filter

This section introduces Bloom filters, which are an important concept for approximate matching. Bloom filters are commonly used to represent the elements of a finite set $S$. A Bloom filter is an array of $m$ bits, initially all set to zero. In order to insert an element $s \in S$ into the filter, $k$ independent hash functions $h_0, h_1, \ldots, h_{k-1}$ are needed, where each hash function outputs a value between $0$ and $m - 1$. The element $s$ is hashed by all $k$ hash functions, and the bits of the Bloom filter at the positions $h_0(s), h_1(s), \ldots, h_{k-1}(s)$ are set to one.
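A minimal Python sketch of such a Bloom filter follows. It covers the insertion step just described and, in addition, the membership test and the false positive estimate that the text turns to next. The class name BloomFilter, the method names, and the way the $k$ hash functions are derived (SHA-256 over an index prefix, reduced modulo $m$) are assumptions made for illustration; practical implementations choose their hash functions differently.

```python
import hashlib


class BloomFilter:
    """Bit array of length m with k hash functions (illustrative sketch)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = [0] * m  # initially all bits are zero

    def _positions(self, element: bytes):
        # Assumption: the i-th hash function is SHA-256 over (i || element),
        # reduced modulo m so that every position lies in 0 .. m-1.
        for i in range(self.k):
            digest = hashlib.sha256(i.to_bytes(4, "big") + element).digest()
            yield int.from_bytes(digest, "big") % self.m

    def insert(self, element: bytes) -> None:
        for pos in self._positions(element):
            self.bits[pos] = 1

    def might_contain(self, element: bytes) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] == 1 for pos in self._positions(element))


def false_positive_estimate(m: int, k: int, n: int) -> float:
    """Estimate (1 - (1 - 1/m)^(k*n))^k after inserting n elements."""
    return (1.0 - (1.0 - 1.0 / m) ** (k * n)) ** k


if __name__ == "__main__":
    bf = BloomFilter(m=1024, k=3)
    bf.insert(b"known-file-chunk")
    print(bf.might_contain(b"known-file-chunk"))  # True: no false negatives
    print(false_positive_estimate(m=1024, k=3, n=100))
```

Because insertion only ever sets bits, an inserted element is always reported as present; only the opposite error can occur.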

To answer the question whether an element $s'$ is in $S$, we compute $h_0(s'), h_1(s'), \ldots, h_{k-1}(s')$ and check whether the bits at the corresponding positions in the Bloom filter are set to one. If so, $s'$ is assumed to be in $S$. However, we may be wrong, as the bits may have been set to one by different elements from $S$. Hence, Bloom filters suffer from a non-trivial false positive rate. Otherwise, if at least one bit is zero, we know that $s' \notin S$. The false negative rate is therefore equal to zero.

In the case of uniformly distributed data, the probability that a certain bit is set to 1 by a single hash function during the insertion of an element is $1/m$, i.e., the probability that the bit is still 0 is $1 - 1/m$. After inserting $n$ elements into the Bloom filter (here $n$ denotes the number of inserted elements), the probability that a given bit position is one is $1 - (1 - 1/m)^{k \cdot n}$. For a false positive, all $k$ array positions need to be set to one. Hence, the probability $p$ of a false positive is

$$p = \left(1 - \left(1 - \frac{1}{m}\right)^{k \cdot n}\right)^{k} \approx \left(1 - e^{-kn/m}\right)^{k}. \qquad (8.2)$$

Approximate matching: the concept

It is often useful in computer science to identify similar digital objects. Prominent use cases are spam detection, malware analysis, network-based anomaly detection, biometrics, and digital forensics.

We first remark that although similarity has a natural meaning for us, a formal definition is still missing. The corresponding NIST special publication draft [22] only describes approximate matching in terms of use cases, terminology, and requirements. We therefore skip a definition, too.

The basic aim of approximate matching is to extend the yes/no outcome of a cryptographic hash function to a continuous one in the scope of automatic detection of a digital object.
As explained above, a cryptographic hash function yields a binary decision identical/differing for a comparison of two input bit strings: identical is encoded, for instance, as the integer 1, differing as the integer 0. The output of an approximate matching comparison, on the other hand, is a match score in the interval $[0,1]$, where 1 means a high level of similarity and 0 a low level.

The NIST draft [22] mentions two use case classes of similarity with two challenges each. First, approximate matching aims at finding the resemblance of two objects. The two challenges within this class are object similarity detection (e.g., different versions of a document) and cross correlation, i.e. finding digital artefacts that share a common object (e.g., two files sharing an identical picture). Second, approximate matching should detect containment. Here [22] lists the two challenges fragment detection (e.g., identifying a cluster of a deleted blacklisted file or an IP packet transferring a fragment of a classified document) and embedded object detection, i.e. finding an indexed trace within a digital artefact (e.g., a picture within an email).

The concept of approximate matching comprises two core functions: a similarity digest generation function and a similarity comparison function. In the terminology of [22] the first one is called the feature extraction function and the latter the similarity function. We prefer our notation because it describes the goal of the respective function more clearly.

Given an input object, the similarity digest generation function identifies characteristic patterns within the object. As usual, these patterns are called features. The specification of an approximate matching algorithm therefore describes how to extract features from the given input. The set of all features is the output of the similarity digest generation function and is called the similarity digest.

The similarity comparison function takes two similarity digests as input and outputs a match score in $[0,1]$. The closer the match score is to 1, the more similar the two corresponding inputs of the similarity digest generation function are considered to be.

As usual with noisy input, the user of approximate matching has to define a threshold to decide about similarity. As a consequence, approximate matching suffers from the well-known error rates: the false match rate (FMR) is the proportion of dissimilar objects falsely declared to match the compared object, and the false non-match rate (FNMR) is the proportion of similar objects falsely declared not to match the compared object.

Similarity may be considered on different layers of abstraction. The NIST draft [22] distinguishes three layers:
1. First, bytewise approximate matching takes a bit string as input for the similarity digest generation function without any high-layer interpretation of the string, that is, the features are extracted directly from the input bit string. Bytewise approximate matching is therefore a general approach and may be applied to any bit string. However, it assumes that similar artefacts of interest to the digital forensic investigator are represented by similar bit strings; otherwise it fails within this use case. Bytewise approximate matching is often referred to as fuzzy hashing or similarity hashing.
2. Second, semantic approximate matching takes the interpretation of the application data into account and simulates the human similarity perception procedure. For instance, semantic approximate matching in the scope of pictures extracts the features from the visual perception of the picture rather than from its low-layer representation. Semantic approximate matching is often referred to as perceptual hashing or robust hashing.
3. Third, syntactic approximate matching is based on standardised internal structures of an artefact. For instance, within network packets a syntactic approximate matching algorithm may work on fields like source/destination MAC/IP addresses, ports, and protocols.

As bytewise and semantic approximate matching are useful for data reduction, we give more insight into these approaches in the subsequent sections. Breitinger et al. [5] provide an in-depth overview, and we summarise and extend their key aspects in what follows.

Bytewise approximate matching

According to Breitinger et al. [5] there are seven bytewise approximate matching algorithms published by the digital forensic community. In this section we review the three main approaches to feature extraction, which seem to be the most promising ones.

The first feature extraction approach is used by the well-known bytewise approximate matching algorithms ssdeep (due to Kornblum [14]) and mrsh-v2 (due to Breitinger and Baier [3]). The similarity digest generation function subdivides the input byte stream (denoted as $m$) into chunks $m_1, m_2, \ldots$ as depicted in Figure 8.1. The basic idea is that two digital artefacts are similar if they share a sufficient number of chunks.
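To make the chunk idea concrete, the following Python sketch splits a byte stream at content-defined positions and compares two inputs by the chunks they share. A chunk ends wherever a value computed from the last few bytes hits a predefined value, which is the trigger criterion detailed in the next paragraph. The sketch is a simplified illustration and not the ssdeep or mrsh-v2 algorithm: the plain byte sum over the window, the modulus 64, MD5 as the chunk hash, and the names split_into_chunks, chunk_features, and shared_chunk_ratio are all assumptions.

```python
import hashlib

WINDOW = 7  # number of trailing bytes that decide whether a chunk ends


def split_into_chunks(data: bytes, modulus: int = 64) -> list[bytes]:
    """Cut data wherever the sum of the last WINDOW bytes hits the trigger value."""
    chunks = []
    start = 0
    for end in range(1, len(data) + 1):
        window = data[max(0, end - WINDOW):end]
        # Toy trigger condition; real tools use a proper rolling hash here.
        if sum(window) % modulus == modulus - 1:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])  # remaining bytes form the final chunk
    return chunks


def chunk_features(data: bytes, modulus: int = 64) -> set[str]:
    """Represent every chunk by the hash of its content."""
    return {hashlib.md5(chunk).hexdigest() for chunk in split_into_chunks(data, modulus)}


def shared_chunk_ratio(a: bytes, b: bytes, modulus: int = 64) -> float:
    """Fraction of chunk hashes the two inputs have in common (0.0 to 1.0)."""
    fa, fb = chunk_features(a, modulus), chunk_features(b, modulus)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / min(len(fa), len(fb))


if __name__ == "__main__":
    original = bytes(range(256)) * 8
    modified = b"XYZ" + original  # three bytes inserted at the front
    print(shared_chunk_ratio(original, original))  # 1.0
    print(shared_chunk_ratio(original, modified))
```

Because the chunk boundaries depend only on the last few bytes, an insertion near the start shifts the first chunk but leaves the later chunks unchanged, which is the property the chunk-based algorithms rely on.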

Figure 8.1: Feature extraction of ssdeep and mrsh-v2

The end of a chunk $m_i$ (and thus the beginning of the subsequent chunk $m_{i+1}$) is called a trigger point. A trigger point is found if the final $r$ bytes before it meet a certain condition (typically $r = 7$, and these $r$ bytes determine an integer value, which has to match a predefined value for triggering). Each chunk represents a feature of the input, and the feature set is the sequence of chunks, i.e. the input byte stream is fully covered by the feature set. To represent a feature, it is hashed by a hash function $h$ (e.g., $h$ is FNV, the Fowler/Noll/Vo hash, for ssdeep and MD5 for mrsh-v2), and its hash value is represented either by a Base64 character (ssdeep) or by a Bloom filter (mrsh-v2). In the case of ssdeep the similarity digest is a sequence of Base64 characters; in the case of mrsh-v2 it is a sequence of Bloom filters. In Example 8.3 we compute the ssdeep similarity digest of the photo given in Figure 8.2.

Figure 8.2: Sample input hacker-siedlung.jpg of ssdeep

Example 8.3: Similarity digest computation of ssdeep
We compute the ssdeep similarity digest of the photo given in Figure 8.2.

    $ ls -l hacker-siedlung.jpg
    -rw baier baier :16 hacker-siedlung.jpg
    $ ssdeep -l hacker-siedlung.jpg
    ssdeep,2.13--blocksize:hash:hash,filename
    1536:ZfICsORJt2PazD7Z2xqHmqL36uuXtrHTXkkknIKB+W2pDHviF4eYySb:\
    ZfICNRf2CD7YwGqL36FXVTXQnIWgDvi2,"hacker-siedlung.jpg"

We first look at the file size in bytes, as reported by ls -l. Then we invoke ssdeep; its flag -l suppresses the full path in the output. The output lists the block size and the two parts of the similarity digest, separated by colons, followed by the file name.

The block size determines when a trigger point is found. It aims at splitting the input byte stream into approximately 64 chunks. It is always of the form $3 \cdot 2^k$, where $k$ is the smallest value with $3 \cdot 2^k \cdot 64 \geq$ file size. In our example the file size divided by $3 \cdot 64$ is 410.6, thus $k = 9$ and the block size is $1536 = 3 \cdot 2^9$.

After the first colon we get the first part of the ssdeep similarity digest, which corresponds to the block size 1536. It consists of Base64 characters, where the character Z represents the hash value $h(m_1)$, f the hash value $h(m_2)$, and b the hash value of the final chunk $h(m_{55})$. After the second colon we see the second part of the ssdeep similarity digest, which corresponds to the doubled block size $3072 = 2 \cdot 1536$. Here we expect approximately half as many chunks.

The second feature selection strategy is to extract statistically improbable features. This strategy is implemented by sdhash of Roussev [24]. The basic idea is that uncommon patterns serve as the baseline for similarity. A statistically improbable feature within sdhash is a sequence of 64 bytes with a high Shannon entropy, that is, a sufficiently large number of different bytes. The feature set of sdhash is the sequence of the statistically improbable features, which are represented by Bloom filters. A parallelised version is available for use in large-scale investigations [25].

The third feature selection strategy is based on a majority vote of bit appearance with a subsequent run length encoding. This approach is used by mvhash-b due to Breitinger et al. [4]. The majority vote step replaces each byte of the input byte string by either a 0x00 byte or a 0xFF byte.
The mapping depends on the neighbourhood of the respective byte: if 0 bits predominate in its neighbourhood, the byte is mapped to 0x00, otherwise it is mapped to 0xFF. Then run length encoding is used, where each sequence of identical bytes is replaced by its length. The basic idea of similarity is that predominating regions of a certain bit are characteristic of digital objects. The integers of the run length encoding are then inserted into Bloom filters. The similarity digest of mvhash-b is therefore a sequence of Bloom filters.

Semantic approximate matching

As semantic approximate matching extracts perceptual features, it is bound to a certain area of application, for instance images, audio streams, or videos. Again Breitinger et al. [5] present an overview of semantic approximate matching algorithms in the context of pictures. This branch dates back to the early 1990s, when content-based image retrieval was an emerging research topic.

Different feature classes are used for image approximate matching. Breitinger et al. [5] mention histograms, low-frequency coefficients (e.g., from the discrete cosine transform), block bitmaps, and projection-based features. To get an idea of image approximate matching, we briefly explain a block bitmap approach used by the robust hashing algorithm rhash due to Steinebach [29].

The similarity digest generation process of rhash is depicted in Figure 8.3. The bit length of the rhash value is fixed in advance. As usual we denote it by $n$. In a first step, the input image is converted to greyscale and normalised (e.g., to a preset size and with respect to orientation).

Figure 8.3: Similarity digest generation of rhash [29]

Then the normalised greyscale picture is subdivided into $n$ disjoint blocks, which cover the image. For instance, if $n$ is a square number, then rhash subdivides the image into $\sqrt{n}$ equally sized rows and $\sqrt{n}$ equally sized columns. The sample in Figure 8.3 uses $n = 256 = 16^2$, that is, the input picture comprises 16 rows and 16 columns. Next, for each block $i$ with $0 \leq i \leq n - 1$, rhash computes the mean of its pixel values. We denote the mean of the $i$-th block by $M_i$ and the median of the sequence $(M_i)_{0 \leq i \leq n-1}$ by $Md$. Finally, block $i$ contributes the bit $h_i$ to the rhash similarity digest, where $h_i = 0$ if and only if $M_i < Md$. A sample rhash similarity digest is given on the right in Figure 8.3.

8.2 Authenticity and integrity of digital traces

In this section we look at the first use case of hash functions in digital forensics: ensuring authenticity and integrity of digital traces during the IT forensic process (e.g., during data acquisition). Remember that authenticity means that the origin of a digital trace is validated, while integrity describes the property that a digital trace did not change.

The use case authenticity and integrity of digital traces is relevant for both dead and live analysis. We focus on dead analysis in what follows (i.e., the digital forensic expert makes use of his own software), but we keep in mind that traces acquired from a live system (e.g., main memory) must be protected by hash values, too.

From Section 8.1 we know that cryptographic hash functions ensure integrity and authenticity by design, due to their preimage resistance and second preimage resistance (see Definition 8.2). For this reason the use case authenticity and integrity of digital traces assumes the usage of cryptographic hash functions.
An important issue is that we have to protect the hash values against tampering. There are two alternatives to achieve this goal. First, the classical analogue approach is to write down the hash values by hand in the narrative minutes (e.g., in the investigation notebook). The hash values are then protected by the assumption that it is impossible to forge the handwriting of the investigator. Second, the digital approach is to compute a digital signature over the hash values. This requires a private cryptographic key, which is related to the investigator. In this case the hash values are protected by the assumption that it is impossible to forge a digital signature.

We now discuss the use case authenticity and integrity of digital traces by looking at the classical data acquisition process of a dead system. In summary, the paradigm is to first generate a master copy from the original device (because the original device must be touched as little as possible). Then the master copy is copied bitwise to obtain the working copy. Even if we only perform read-only commands on the working copy, we must later prove that the working copy did not change during the investigation (and hence that any trace is directly extractable from the original device). The steps are as follows:
1. Compute hash value $h_1$ over the whole original volume.
2. Write hash value $h_1$ down in the physical logbook.
3. Make a 1-to-1 copy of the volume using dd. This is the master copy of the original device.
4. Compute hash value $h_2$ over the master copy.
5. Write hash value $h_2$ down in the physical logbook.
6. Compare $h_1$ and $h_2$: if both hash values match, the master copy is identical to the original device. Otherwise, we have to go back to step 3.
7. Generate a 1-to-1 copy of the master copy using dd. This is the working copy.
8. Compute hash value $h_3$ over the working copy.
9. Write hash value $h_3$ down in the physical logbook.
10. Compare $h_2$ and $h_3$: if both hash values match, the working copy is identical to the master copy and thus to the original device, too. Otherwise, we have to go back to step 7.
11. Perform the investigation read-only on the working copy and extract digital traces.
12. To finish the investigation and to prove the integrity of the working copy, compute the hash value $h_4$ of the working copy after the investigation and check whether $h_1 = h_4$ holds. If yes, any digital trace is directly related to the original device; otherwise the investigator has to identify the step where he changed the working copy.

We show how to apply this process on the basis of the well-known cryptographic library openssl in Example 8.4.

Example 8.4: Acquire first partition of an HDD
In Linux, storage media are typically identified by a device (that is, a file in the directory /dev) starting with the two letters sd (historically for SCSI device) and a subsequent character to distinguish different devices. For instance, the first HDD is referred to as /dev/sda, an attached USB stick is then mapped to /dev/sdb, an external SSD is identified as /dev/sdc, and so on.
In our example we assume that our HDD is the device /dev/sda.
Then its first partition is identified by a digit following the device name (e.g., /dev/sda1); an extended partition may be the device /dev/sda5. We apply the general acquisition process and compute the SHA-256 hash value of this partition. In this example we make use of the openssl tool, because openssl is a widely used implementation of cryptographic algorithms like hash functions, encryption, or digital signatures. After invoking openssl we have to tell the tool which class of cryptographic algorithms we want to use. Cryptographic hash functions are selected by the digest command dgst. The remaining arguments are the chosen hash function (the flag -sha256) and the input bit string of the hash function (in our example the first partition of the HDD, /dev/sda1).
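The digest computation and the comparison steps of the acquisition process can also be scripted. The following is a minimal sketch, assuming Python's standard hashlib module in place of openssl; the function names sha256_of_file and copies_match are hypothetical, and the paths in the usage note are simply those of this example. Reading a block device such as /dev/sda1 normally requires root privileges.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 digest of a file or image as a hexadecimal string.

    The input is read in blocks so that large images do not have to fit
    into main memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def copies_match(reference_path: str, copy_path: str) -> bool:
    """Comparison used in steps 6, 10 and 12: do both inputs hash to the same value?"""
    return sha256_of_file(reference_path) == sha256_of_file(copy_path)
```

A call such as copies_match("/dev/sda1", "mastercopy-sda1.dd") would correspond to step 6 of the process; the hash values still have to be recorded in the logbook or signed as described above.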

    # openssl dgst -sha256 /dev/sda1
    SHA256(/dev/sda1)= b9c028c604b5a1dfaf8acf0098e7f26de32fd47\
    38c581d9b6cbc84c98b28f39b
    # dd if=/dev/sda1 of=mastercopy-sda1.dd
    # openssl dgst -sha256 mastercopy-sda1.dd
    SHA256(mastercopy-sda1.dd)= b9c028c604b5a1dfaf8acf0098e7f26de32fd47\
    38c581d9b6cbc84c98b28f39b

As both hash values match, we generate the working copy and check the respective hash values.

    # dd if=mastercopy-sda1.dd of=workingcopy-sda1.dd
    $ openssl dgst -sha256 workingcopy-sda1.dd
    SHA256(workingcopy-sda1.dd)= b9c028c604b5a1dfaf8acf0098e7f26de32fd47\
    38c581d9b6cbc84c98b28f39b

Again both hash values match, that is, the working copy is bitwise identical to the first partition of our HDD. Next we investigate the working copy read-only. In the last step we check that the working copy did not change during the processing, which we prove by applying SHA-256 to the image of the working copy after our investigation.

    $ openssl dgst -sha256 workingcopy-sda1.dd
    SHA256(workingcopy-sda1.dd)= b9c028c604b5a1dfaf8acf0098e7f26de32fd47\
    38c581d9b6cbc84c98b28f39b

The hash value of the working copy after the investigation matches the hash value of /dev/sda1, and thus any digital trace from the working copy is extractable from the partition, too. If for some reason the final hash value does not match, the investigator has to carefully analyse his narrative minutes to find the step where he modified the working copy. An example of destroyed integrity is given in what follows:

    $ openssl dgst -sha256 workingcopy-sda1.dd
    SHA256(workingcopy-sda1.dd)= df69b585b1a1af40b1c71d4fe9792fd1e843f8a\
    2fe0c5c3a39aa205e652aabe4

8.3 Identification of known digital objects

An important issue in contemporary investigations of computer crime is handling the huge amount of data. The reason is that today information is stored and distributed digitally rather than in an analogue way.
Low costs of storage devices and cheap, unlimited access to the Internet support our ubiquitous use of digital devices. As a consequence, a digital forensic investigation typically confronts the IT forensic experts with terabytes of data stored on different sorts of physical or virtual devices: a classical personal computer, a laptop, a tablet PC, a smartphone, a mail provider, or a cloud service provider, to name only a few.

The terabytes of data can be seen as a big haystack in which the actual evidence of some megabytes has to be found, that is, the investigator's task is to find the needle in the haystack. In this section we present concepts that automatically preprocess the terabytes of input data to support the investigator in proving or refuting a hypothesis. If we use the metaphor of finding the needle in the haystack, two concepts are obvious:
1. First, decreasing the haystack means scaling down the actual data that has to be inspected by the digital forensic expert. This concept is known as whitelisting or filtering out. Any object from the suspect's drive that is indexed by the whitelist is not considered for further inspection. We discuss whitelisting below.
2. Second, increasing the needle means finding hints to suspicious data structures that actually support a certain hypothesis. These hints have to be confirmed manually by the investigator. This concept is known as blacklisting or filtering in.

For both concepts we need databases of irrelevant data (i.e. a whitelist) or incriminated files (i.e. a blacklist), respectively. The most common whitelist is the Reference Data Set (RDS) from the US-NIST National Software Reference Library (NSRL) [23]. The blacklist is case dependent (e.g., pictures of child abuse, classified documents).

The most common basic technology for indexing files is hash functions. The procedure is quite simple: for each object of the seized device (e.g., a file), calculate the corresponding digest and compare this fingerprint against a whitelist or blacklist, respectively. As of today cryptographic hash functions (e.g., SHA-1, SHA-256 [21]) are used. Cryptographic hash functions are very efficient and effective in detecting bitwise identical duplicates, but they fail to reveal similar objects.
However, investigators are typically interested in the automatic identification of similar objects, for instance to detect the correlation between a blacklisted picture of child abuse and its thumbnail, which was discovered on a seized device.

Whitelisting

A whitelist is an index of known-to-be-good objects, that is, of non-suspicious patterns. The concept of whitelisting is quite simple: any object from the suspect's drive (typically an object is simply a file) that is indexed by the whitelist is not considered for further inspection. Whitelisting is therefore also referred to as filtering out.

In order to keep the memory footprint of a whitelist manageable, a compressed representation of each whitelisted object is used. Additionally, as whitelisted objects are not considered for further investigation, the false match rate (FMR) must be 0. Otherwise it would be possible for an attacker to filter out relevant digital traces. Therefore whitelists are based on cryptographic hash functions.

The most common whitelist is the Reference Data Set (RDS) from the US-NIST National Software Reference Library (NSRL) [23]. Its website describes the RDS as follows:

The RDS is a collection of digital signatures of known, traceable software applications. There are application hash values in the hash set which may be considered malicious, i.e. steganography tools and hacking scripts. There are no hash values of illicit data, i.e. child abuse images.
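The filtering procedure itself is short enough to sketch in code. The following Python fragment is an illustration under stated assumptions, not a description of any particular forensic tool: the whitelist and blacklist are modelled as in-memory sets of upper-case hexadecimal SHA-1 digests, and the function names sha1_of_file, filter_out, and filter_in are hypothetical.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-1 digest of a file as an upper-case hexadecimal string."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest().upper()


def filter_out(paths: list[Path], whitelist: set[str]) -> list[Path]:
    """Whitelisting: keep only the files whose digest is NOT in the whitelist."""
    return [p for p in paths if sha1_of_file(p) not in whitelist]


def filter_in(paths: list[Path], blacklist: set[str]) -> list[Path]:
    """Blacklisting: keep only the files whose digest IS in the blacklist."""
    return [p for p in paths if sha1_of_file(p) in blacklist]
```

Both functions only detect bitwise identical files: a single changed bit yields a different digest, so a modified copy of a listed file passes unnoticed through either filter.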

Example 8.5

We enumerate sample entries of the NSRL Reference Data Set.

    $ less NSRLFile.txt
    "SHA-1","MD5","CRC32","FileName","FileSize","ProductCode","OpSystemCode","SpecialCode"
    " EDD92C4E3D2E F849","392126E756571EBF112CB1C1CDEDF926","EBD105A0",\
    "I05002T2.PFB",98865,3095,"WIN",""
    " DA6391F7F5D2F7FCCF36CEBDA60C6EA02","0E53C14A3E48D94FF596A B492","AA6A7B16",\
    "00br2026.gif",2226,228,"WIN",""
    "000000A9E47BD385A0A3685AA12C2DB6FD727A20","176308F27DD52890F013A3FD80F92E51","D749B562",\
    "femvo523.wav",42748,4887,"macosx",""
    " AFA836117B1B572FAE4713F200567","9B3702B0E788C6D FE3C9786A","05E566DF",\
    "J JPG",32768,18266,"358",""
    " AFA836117B1B572FAE4713F200567","9B3702B0E788C6D FE3C9786A","05E566DF",\
    "J JPG",32768,2322,"WIN",""
    " AFA836117B1B572FAE4713F200567","9B3702B0E788C6D FE3C9786A","05E566DF",\
    "J JPG",32768,2575,"WIN",""
    " AFA836117B1B572FAE4713F200567","9B3702B0E788C6D FE3C9786A","05E566DF",\
    "J JPG",32768,2583,"WIN",""
    " AFA836117B1B572FAE4713F200567","9B3702B0E788C6D FE3C9786A","05E566DF",\
    "J JPG",32768,3271,"WIN",""
    " AFA836117B1B572FAE4713F200567","9B3702B0E788C6D FE3C9786A","05E566DF",\
    "J JPG",32768,3282,"UNK",""

We see that the image J JPG has a file size of 32768 bytes. It is listed six times, because the product code or the operating system code differs.

The RDS is updated four times a year. As of May 2015, the current release is RDS 2.48, which contains about 21 million unique files. Its size is about 6 GiB. As listed in Example 8.5, each entry of the RDS lists the SHA-1, MD5 and CRC32 checksums together with the file name and file size of the indexed file. The entries are ordered with respect to the numerical value of the SHA-1 hashes. Hence it is easy to decide whether an input file is indexed by the RDS.

Although filtering out using the RDS is widespread, only few results are available about its effectiveness.
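Because the entries are ordered by SHA-1 value, a lookup reduces to a binary search. The following sketch illustrates this under several assumptions: the helper names `build_sample_csv`, `load_sorted_sha1` and `is_indexed` are invented for this example, the sample entries are generated from toy byte strings rather than taken from the RDS, and each entry is assumed to occupy one line with the column layout shown in Example 8.5.

```python
from __future__ import annotations

import bisect
import csv
import hashlib
import io
import zlib

HEADER = ["SHA-1", "MD5", "CRC32", "FileName", "FileSize",
          "ProductCode", "OpSystemCode", "SpecialCode"]


def build_sample_csv(files: dict[str, bytes]) -> str:
    """Build a toy hash list in the column layout of Example 8.5.

    Product code 0 and operating system code "WIN" are placeholders.
    """
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_NONNUMERIC)
    writer.writerow(HEADER)
    for name, content in files.items():
        writer.writerow([
            hashlib.sha1(content).hexdigest().upper(),
            hashlib.md5(content).hexdigest().upper(),
            f"{zlib.crc32(content):08X}",
            name, len(content), 0, "WIN", "",
        ])
    return out.getvalue()


def load_sorted_sha1(csv_text: str) -> list[str]:
    """Extract the SHA-1 column and sort it.

    For upper-case hex strings of equal length, lexicographic order
    coincides with the numerical order of the hash values.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return sorted(row["SHA-1"].upper() for row in reader)


def is_indexed(sorted_hashes: list[str], data: bytes) -> bool:
    """Binary search for the SHA-1 digest of data in the sorted list."""
    needle = hashlib.sha1(data).hexdigest().upper()
    pos = bisect.bisect_left(sorted_hashes, needle)
    return pos < len(sorted_hashes) and sorted_hashes[pos] == needle
```

With about 21 million entries, such a search needs roughly 25 comparisons per file, since 2^25 is about 33.5 million. A hash set would serve equally well in memory; the sorted list is used here only because it mirrors the ordering of the RDS.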
Back in 2008, Douglas White from NIST claimed in a presentation at the American Academy of Forensic Sciences (AAFS) that file-based data reduction leaves an average of 30% of disk space for human investigation. However, the RDS only indexes application hash values; it does not take any personal files into account. Therefore Baier and Dichtelmüller [2] performed a study on data reduction for different user profiles. The baseline of their research is the data reduction in terms of the number of files rather than disk space (because an investigator has to look at a file rather than at a certain amount of memory).

The methodology of Baier and Dichtelmüller [2] is to model different user behaviour and the corresponding file generation characteristics. Their data reduction rates for different profiles are given in Table 8.2. $M_G$ denotes the number of generated files in the file system of the respective user profile and $M_{RDS}$ the number of those files that are also indexed by the RDS. The data reduction rate is the ratio of the number of indexed files to all files, that is $R = M_{RDS} / M_G$. To be effective, R should be as close as possible to 1. For instance, the first row in Table 8.2 shows the result for a Windows XP operating system installation only, that is, there are no user files. Even so, only 52.45% of the files in the file system are indexed by the RDS. The reduction rate decreases if we add user files. For example, if we model a user who mainly uses the computer for playing games (i.e. the profile gamer), the data reduction rate is below 15%. In this case the investigator has to inspect the remaining 85% of the files manually.

| Profile | Nr. of files: $M_G$ | Indexed by RDS: $M_{RDS}$ | Data reduction rate: R |
|---|---|---|---|
| XP, OS only | 10,467 | 5, | 52.45% |
| XP, standard software | 22,801 | 9, | % |
| XP gamer | 126,684 | 18, | % |
| W7, OS only | 56,233 | 18, | % |
| W7 standard software | 77,601 | 23, | % |
| W7 universal | 322,128 | 42, | % |
| Ubuntu | ,789 | 26, | % |

Table 8.2: Data reduction rates for different user profiles using RDS [2]

Practitioners informally confirm that data reduction is low: they are surprised by how high the data reduction rates of Baier and Dichtelmüller [2] are and mention an expected data reduction rate of 5% for their cases. To sum up, the haystack does not decrease significantly using the RDS. As the preprocessing step of applying the whitelist takes a lot of effort, our overall assessment is that whitelisting is not effective for automatically preprocessing bulk data.

Blacklisting

In contrast to a whitelist, a blacklist indexes known-to-be-bad objects, that is, suspicious patterns. If an object from the suspect's drive matches an element of the blacklist, the investigator gets a hint to a digital trace, which then has to be inspected manually. Thus blacklisting is also called filtering in. Again, to keep the memory footprint manageable, a blacklist makes use of a compressed representation of each of its elements.

In this section we assess different aspects of cryptographic hash functions and approximate matching in the scope of blacklisting. The aspects and our assessment are summarised in Table 8.3. To illustrate our rating, we assign categories in descending order, starting with + for the best rating.
| Property | Cryptographic hash function | Bytewise approximate matching | Semantic approximate matching |
|---|---|---|---|
| Run-time efficiency | very fast + | fast - medium | slow |
| Run-time efficiency (relative factor) | 1 | 1.5 to 6 | 20 to 500 |
| Compression | short +, 256 bits | 1% to 3% of input length | short, 256 to 600 bits |
| Object similarity detection | No | Yes + | Yes + |
| Cross correlation | No | Yes + | No |
| Fragment detection | No | Yes + | No |
| Embedded object detection | No | Yes + | No |
| Domain specific | No + | No + | Yes (e.g., only images) |
| Encoding dependency | Yes | Yes | No + |
| FMR / FNMR | 0% + | Dependent | Dependent |
| Indexing | Yes + | Inefficient | Inefficient |

Table 8.3: Assessment of hash functions with respect to blacklisting

We first turn to the aspect efficiency, that is, run-time and memory efficiency. Our rating of run-time efficiency is based on the experiments of Breitinger et al. [5].

We assign a relative speed of 1 to cryptographic hash functions. Bytewise approximate matching is slower by a factor of 1.5 or considerably more: mrsh-v2, for example, has a speed comparable to SHA-1, while sdhash is much slower. However, bytewise approximate matching is typically much faster than semantic approximate matching, because the latter requires more complex computational steps.

With respect to compression, both cryptographic hash functions and semantic approximate matching perform well. The hash value is of fixed small size. Bytewise approximate matching, on the other hand, outputs similarity digests of variable length that is proportional to the input size (with the exception of ssdeep). For instance, a 1 TiB input requires 10 GiB to 30 GiB for its bytewise approximate matching blacklist. This constitutes a key drawback of bytewise approximate matching.

We next assess the aspect resemblance (see Section 8.1.3). Both bytewise and semantic approximate matching are able to decide about object similarity, which is not the case for cryptographic hash functions. With regard to cross correlation (i.e. finding digital artefacts that share a common object), only bytewise approximate matching is able to conduct it successfully. The same holds for the aspect containment, i.e. fragment detection and embedded object detection: only bytewise approximate matching copes with containment.

The next aspect is dependency with respect to application area and representation, respectively. Both cryptographic hash functions and bytewise approximate matching consider the bytestream of an object, hence they are not bound to a specific domain of applications (e.g., image similarity, audio similarity). However, as semantic approximate matching extracts features to simulate human perception, it is bound to a certain domain of applications.
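The difference between exact matching and the resemblance and containment aspects discussed above can be made concrete with a deliberately simplified toy. The sketch below hashes fixed-size blocks and compares the resulting digest sets. It is not the algorithm of ssdeep, sdhash or mrsh-v2; the function names `block_digests` and `containment` and the block size of 64 bytes are assumptions made for illustration. Fixed-size blocks also lose alignment after an insertion or deletion, a weakness that real bytewise approximate matching tools are designed to avoid.

```python
import hashlib
import random


def block_digests(data: bytes, block_size: int = 64) -> set:
    """Hash every fixed-size block of data and collect the digests."""
    return {
        hashlib.sha256(data[offset:offset + block_size]).digest()
        for offset in range(0, len(data), block_size)
    }


def containment(a: bytes, b: bytes, block_size: int = 64) -> float:
    """Share of the smaller digest set that also occurs in the other one."""
    blocks_a = block_digests(a, block_size)
    blocks_b = block_digests(b, block_size)
    if not blocks_a or not blocks_b:
        return 0.0
    return len(blocks_a & blocks_b) / min(len(blocks_a), len(blocks_b))


# Toy data: 1024 pseudo-random bytes stand in for a blacklisted file.
rng = random.Random(42)
original = bytes(rng.getrandbits(8) for _ in range(1024))

# A near-duplicate: only the last byte is changed.
edited = original[:-1] + bytes([original[-1] ^ 0xFF])

# A fragment: six consecutive blocks cut out at a block boundary.
fragment = original[128:512]

# Exact matching with a cryptographic hash misses the near-duplicate.
same_hash = hashlib.sha256(original).digest() == hashlib.sha256(edited).digest()
print("identical SHA-256:", same_hash)
print("score for edited copy:", containment(original, edited))
print("score for fragment:", containment(original, fragment))
```

The edited copy still shares 15 of its 16 blocks with the original, and every block of the fragment occurs in the original, whereas the SHA-256 values of the two full files differ. The toy also shows the compression drawback: the digest set grows linearly with the input instead of staying at a fixed size.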
If we examine encoding dependency, the situation is reversed: the byte-level algorithms depend on the actual encoding (e.g., an image encoded as JPG is considered to be different from the same image encoded as PNG by both cryptographic hash functions and bytewise approximate matching). Semantic approximate matching, on the other hand, considers the perceptual level and therefore does not depend on the encoding.

With respect to error rates, both the false non-match rate (FNMR) and the false match rate (FMR) are of interest. For convenience the FMR should be small, otherwise the investigator wastes time manually checking erroneous hits. The FNMR, on the other hand, must be as close as possible to 0. Otherwise the blacklist fails to point to potential evidence, and the trace must be found in a different way. Owing to their security requirements from the cryptographic domain (e.g., preimage resistance, collision resistance), cryptographic hash functions do not suffer from error rates. However, as approximate matching processes noisy input, it suffers from both a non-trivial FMR and a non-trivial FNMR. It is therefore the operator's responsibility to prioritise the error rates.

Our final aspect concerns indexing, that is, a sorting algorithm for digests. As explained in the discussion of the RDS above, the RDS sorts cryptographic hash values with respect to their numerical value. Hence indexing is easily possible for blacklists based on cryptographic hash functions. With respect to approximate matching, first approaches towards indexing are available. As they suffer from run-time or memory inefficiency, we rate approximate matching rather negatively with respect to indexing.

8.4 Summary

In this chapter we described the two main use cases of hash functions in digital forensics. The use cases are authenticity and integrity of digital traces (ensured by applying cryptographic hash functions) and identification of known objects (e.g., illicit files). In the latter case we showed how whitelisting and blacklisting work and how these concepts aim to perform data reduction. Our conclusion is that whitelisting is not effective and that blacklisting may be performed by cryptographic hash functions or approximate matching.


John Mathieson US Air Force (WR ALC) Systems & Software Technology Conference Salt Lake City, Utah 19 May 2011 John Mathieson US Air Force (WR ALC) Systems & Software Technology Conference Salt Lake City, Utah 19 May 2011 Report Documentation Page Form Approved OMB No. 0704-0188 Public reporting burden for the

More information

MSc Computer Security and Forensics. Examinations for 2009-2010 / Semester 1

MSc Computer Security and Forensics. Examinations for 2009-2010 / Semester 1 MSc Computer Security and Forensics Cohort: MCSF/09B/PT Examinations for 2009-2010 / Semester 1 MODULE: COMPUTER FORENSICS & CYBERCRIME MODULE CODE: SECU5101 Duration: 2 Hours Instructions to Candidates:

More information

Cryptography Lecture 8. Digital signatures, hash functions

Cryptography Lecture 8. Digital signatures, hash functions Cryptography Lecture 8 Digital signatures, hash functions A Message Authentication Code is what you get from symmetric cryptography A MAC is used to prevent Eve from creating a new message and inserting

More information

A-level COMPUTER SCIENCE

A-level COMPUTER SCIENCE A-level COMPUTER SCIENCE Paper 2 TBC am/pm 2 hours 30 minutes Materials There are no additional materials required for this paper. Instructions Use black ink or black ball-point pen. Fill in the boxes

More information

CHAPTER 2 DATABASE MANAGEMENT SYSTEM AND SECURITY

CHAPTER 2 DATABASE MANAGEMENT SYSTEM AND SECURITY CHAPTER 2 DATABASE MANAGEMENT SYSTEM AND SECURITY 2.1 Introduction In this chapter, I am going to introduce Database Management Systems (DBMS) and the Structured Query Language (SQL), its syntax and usage.

More information

Dr. Jinyuan (Stella) Sun Dept. of Electrical Engineering and Computer Science University of Tennessee Fall 2010

Dr. Jinyuan (Stella) Sun Dept. of Electrical Engineering and Computer Science University of Tennessee Fall 2010 CS 494/594 Computer and Network Security Dr. Jinyuan (Stella) Sun Dept. of Electrical Engineering and Computer Science University of Tennessee Fall 2010 1 Introduction to Cryptography What is cryptography?

More information

Criteria for web application security check. Version 2015.1

Criteria for web application security check. Version 2015.1 Criteria for web application security check Version 2015.1 i Content Introduction... iii ISC- P- 001 ISC- P- 001.1 ISC- P- 001.2 ISC- P- 001.3 ISC- P- 001.4 ISC- P- 001.5 ISC- P- 001.6 ISC- P- 001.7 ISC-

More information

Digital Signatures. (Note that authentication of sender is also achieved by MACs.) Scan your handwritten signature and append it to the document?

Digital Signatures. (Note that authentication of sender is also achieved by MACs.) Scan your handwritten signature and append it to the document? Cryptography Digital Signatures Professor: Marius Zimand Digital signatures are meant to realize authentication of the sender nonrepudiation (Note that authentication of sender is also achieved by MACs.)

More information

Integrating NLTK with the Hadoop Map Reduce Framework 433-460 Human Language Technology Project

Integrating NLTK with the Hadoop Map Reduce Framework 433-460 Human Language Technology Project Integrating NLTK with the Hadoop Map Reduce Framework 433-460 Human Language Technology Project Paul Bone pbone@csse.unimelb.edu.au June 2008 Contents 1 Introduction 1 2 Method 2 2.1 Hadoop and Python.........................

More information

Design and Analysis of Methods for Signing Electronic Documents Using Mobile Phones

Design and Analysis of Methods for Signing Electronic Documents Using Mobile Phones Design and Analysis of Methods for Signing Electronic Documents Using Mobile Phones Pramote Kuacharoen School of Applied Statistics National Institute of Development Administration 118 Serithai Rd. Bangkapi,

More information

Whitepaper on identity solutions for mobile devices

Whitepaper on identity solutions for mobile devices Whitepaper on identity solutions for mobile devices How software and hardware features of modern mobile devices can improve the security and user experience of your software Author: Jonas Lindstrøm The

More information

Executable Integrity Verification

Executable Integrity Verification Executable Integrity Verification Abstract Background Determining if a given executable has been trojaned is a tedious task. It is beyond the capabilities of the average end user and even many network

More information

Reference Guide WindSpring Data Management Technology (DMT) Solving Today s Storage Optimization Challenges

Reference Guide WindSpring Data Management Technology (DMT) Solving Today s Storage Optimization Challenges Reference Guide WindSpring Data Management Technology (DMT) Solving Today s Storage Optimization Challenges September 2011 Table of Contents The Enterprise and Mobile Storage Landscapes... 3 Increased

More information

WRITING PROOFS. Christopher Heil Georgia Institute of Technology

WRITING PROOFS. Christopher Heil Georgia Institute of Technology WRITING PROOFS Christopher Heil Georgia Institute of Technology A theorem is just a statement of fact A proof of the theorem is a logical explanation of why the theorem is true Many theorems have this

More information

2695 P a g e. IV Semester M.Tech (DCN) SJCIT Chickballapur Karnataka India

2695 P a g e. IV Semester M.Tech (DCN) SJCIT Chickballapur Karnataka India Integrity Preservation and Privacy Protection for Digital Medical Images M.Krishna Rani Dr.S.Bhargavi IV Semester M.Tech (DCN) SJCIT Chickballapur Karnataka India Abstract- In medical treatments, the integrity

More information

Key Management Interoperability Protocol (KMIP)

Key Management Interoperability Protocol (KMIP) (KMIP) Addressing the Need for Standardization in Enterprise Key Management Version 1.0, May 20, 2009 Copyright 2009 by the Organization for the Advancement of Structured Information Standards (OASIS).

More information

What is Web Security? Motivation

What is Web Security? Motivation brucker@inf.ethz.ch http://www.brucker.ch/ Information Security ETH Zürich Zürich, Switzerland Information Security Fundamentals March 23, 2004 The End Users View The Server Providers View What is Web

More information

Practice Questions. CS161 Computer Security, Fall 2008

Practice Questions. CS161 Computer Security, Fall 2008 Practice Questions CS161 Computer Security, Fall 2008 Name Email address Score % / 100 % Please do not forget to fill up your name, email in the box in the midterm exam you can skip this here. These practice

More information

File System Forensics FAT and NTFS. Copyright Priscilla Oppenheimer 1

File System Forensics FAT and NTFS. Copyright Priscilla Oppenheimer 1 File System Forensics FAT and NTFS 1 FAT File Systems 2 File Allocation Table (FAT) File Systems Simple and common Primary file system for DOS and Windows 9x Can be used with Windows NT, 2000, and XP New

More information

Digital Forensics at the National Institute of Standards and Technology

Digital Forensics at the National Institute of Standards and Technology NISTIR 7490 Digital Forensics at the National Institute of Standards and Technology James R. Lyle Douglas R. White Richard P. Ayers NISTIR 7490 Digital Forensics at the National Institute of Standards

More information

The basic groups of components are described below. Fig X- 1 shows the relationship between components on a network.

The basic groups of components are described below. Fig X- 1 shows the relationship between components on a network. Elements of Email Email Components There are a number of software components used to produce, send and transfer email. These components can be broken down as clients or servers, although some components

More information

Computers. Hardware. The Central Processing Unit (CPU) CMPT 125: Lecture 1: Understanding the Computer

Computers. Hardware. The Central Processing Unit (CPU) CMPT 125: Lecture 1: Understanding the Computer Computers CMPT 125: Lecture 1: Understanding the Computer Tamara Smyth, tamaras@cs.sfu.ca School of Computing Science, Simon Fraser University January 3, 2009 A computer performs 2 basic functions: 1.

More information

Test Automation Architectures: Planning for Test Automation

Test Automation Architectures: Planning for Test Automation Test Automation Architectures: Planning for Test Automation Douglas Hoffman Software Quality Methods, LLC. 24646 Heather Heights Place Saratoga, California 95070-9710 Phone 408-741-4830 Fax 408-867-4550

More information

AQA GCSE in Computer Science Computer Science Microsoft IT Academy Mapping

AQA GCSE in Computer Science Computer Science Microsoft IT Academy Mapping AQA GCSE in Computer Science Computer Science Microsoft IT Academy Mapping 3.1.1 Constants, variables and data types Understand what is mean by terms data and information Be able to describe the difference

More information

How to Send Stealth Text From Your Cell Phone

How to Send Stealth Text From Your Cell Phone anonymous secure decentralized SMS stealthtext transactions WHITEPAPER STATE OF THE ART 2/8 WHAT IS STEALTHTEXT? stealthtext is a way to send stealthcoin privately and securely using SMS texting. stealthtext

More information

Waspmote Encryption Libraries. Programming guide

Waspmote Encryption Libraries. Programming guide Waspmote Encryption Libraries Programming guide Index Document version: v4.3-01/2015 Libelium Comunicaciones Distribuidas S.L. INDEX 1. General Concepts... 4 2. Integrity... 7 2.1. Waspmote Libraries...7

More information

SNARE Agent for Windows v 4.2.3 - Release Notes

SNARE Agent for Windows v 4.2.3 - Release Notes SNARE Agent for Windows v 4.2.3 - Release Notes Snare is a program that facilitates the central collection and processing of the Windows Event Log information. All three primary event logs (Application,

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

Digital Forensics Tutorials Acquiring an Image with Kali dcfldd

Digital Forensics Tutorials Acquiring an Image with Kali dcfldd Digital Forensics Tutorials Acquiring an Image with Kali dcfldd Explanation Section Disk Imaging Definition Disk images are used to transfer a hard drive s contents for various reasons. A disk image can

More information

COMP 250 Fall 2012 lecture 2 binary representations Sept. 11, 2012

COMP 250 Fall 2012 lecture 2 binary representations Sept. 11, 2012 Binary numbers The reason humans represent numbers using decimal (the ten digits from 0,1,... 9) is that we have ten fingers. There is no other reason than that. There is nothing special otherwise about

More information

DIGITAL FORENSIC INVESTIGATION, COLLECTION AND PRESERVATION OF DIGITAL EVIDENCE. Vahidin Đaltur, Kemal Hajdarević,

DIGITAL FORENSIC INVESTIGATION, COLLECTION AND PRESERVATION OF DIGITAL EVIDENCE. Vahidin Đaltur, Kemal Hajdarević, DIGITAL FORENSIC INVESTIGATION, COLLECTION AND PRESERVATION OF DIGITAL EVIDENCE Vahidin Đaltur, Kemal Hajdarević, Internacional Burch University, Faculty of Information Technlogy 71000 Sarajevo, Bosnia

More information

Web Application Hacking (Penetration Testing) 5-day Hands-On Course

Web Application Hacking (Penetration Testing) 5-day Hands-On Course Web Application Hacking (Penetration Testing) 5-day Hands-On Course Web Application Hacking (Penetration Testing) 5-day Hands-On Course Course Description Our web sites are under attack on a daily basis

More information

Bitrix Site Manager 4.0. Quick Start Guide to Newsletters and Subscriptions

Bitrix Site Manager 4.0. Quick Start Guide to Newsletters and Subscriptions Bitrix Site Manager 4.0 Quick Start Guide to Newsletters and Subscriptions Contents PREFACE...3 CONFIGURING THE MODULE...4 SETTING UP FOR MANUAL SENDING E-MAIL MESSAGES...6 Creating a newsletter...6 Providing

More information

Radware s Behavioral Server Cracking Protection

Radware s Behavioral Server Cracking Protection Radware s Behavioral Server Cracking Protection A DefensePro Whitepaper By Renaud Bidou Senior Security Specialist,Radware October 2007 www.radware.com Page - 2 - Table of Contents Abstract...3 Information

More information