Storage Optimization in Cloud Environment using Compression Algorithm


K. Govinda 1, Yuvaraj Kumar 2
1 School of Computing Science and Engineering, VIT University, Vellore, India, kgovinda@vit.ac.in
2 School of Computing Science and Engineering, VIT University, Vellore, India, yuva.murak@gmail.com

Abstract: Cloud storage provides users with storage space and with user-friendly, timely access to data, which is the foundation of all kinds of cloud applications. However, there is a lack of in-depth research on how to optimize cloud storage to improve data access performance over the cloud. In this environment, consumers are billed for what they use, a model generally called pay-as-you-go: if you use a service for an hour, you pay for that hour. Billing is per service, and each service has its own cost. Currently there are various cloud computing service providers, such as Amazon, Google and IBM. Many companies are shifting to cloud computing architectures because of space, speed and resource availability. Because consumers pay only for the services they consume, reducing the storage they consume reduces the operating expenditure (OPEX) of the cloud consumer. Optimization is a challenging task. In this paper we propose a storage optimization mechanism to reduce the storage space used over the cloud.

Keywords: Optimization, Storage, OPEX, LZW and LZ

1 INTRODUCTION
Cloud computing is a new form of distributed computing that follows grid computing and pervasive computing. Its aim is to build a virtual infrastructure that provides users with remote computing and storage capacity [1-3]. Since 2006, several successful cloud facilities have appeared, such as Amazon's Elastic Compute Cloud [4], IBM's Blue Cloud [5], Nimbus [6], OpenNebula [7] and Google's App Engine [9].

Cloud storage is a kind of cloud computing. It provides space for data storage and a user-friendly, timely way for users to access data; examples are Amazon's Simple Storage Service (S3) and the Google File System [8]. The greatest advantage of cloud storage is that it enables users to access data at any time. In a cloud system, the storage management system automatically analyses the user's requirements and locates and transforms data, which greatly facilitates use. However, this places high demands on the cloud management system itself. For example, a service failure occurred in the Simple Storage Service (S3) in July 2008 and lasted for eight hours, causing great losses for online companies relying on S3. The system failed because S3 could not effectively route users' requests to the appropriate physical storage server. Therefore, cloud storage must be optimized to ensure efficient data storage and access.

Figure 1 Cloud Storage Scenario

The rest of the paper is organized as follows. Section 2 describes different data compression techniques, Section 3 describes the proposed LZW method, and Section 4 describes the implementation, followed by the conclusion.

2 LITERATURE REVIEW

2.1 Huffman Coding
Huffman coding [11] is an entropy encoding algorithm used for lossless data compression. It uses a specific method for choosing the representation of each symbol, resulting in a prefix-free code that expresses the most common symbols using shorter strings of bits than are used for less common symbols. Huffman coding is optimal when the probability of each input symbol is a negative power of two. Prefix-free codes tend to be slightly inefficient on small alphabets, where probabilities often fall between these optimal points. "Blocking", or expanding the alphabet size by coalescing multiple symbols into "words" of fixed or variable length before Huffman coding, usually helps, especially when adjacent symbols are correlated.

Prediction by Partial Matching (PPM) [12, 13] is an adaptive statistical data compression technique based on context modeling and prediction. In general, PPM predicts the probability of a given character based on a given number of characters that immediately precede it. Predictions are usually reduced to symbol rankings. The number of previous symbols, n, determines the order of the PPM model, which is denoted PPM(n). Unbounded variants, where the context has no length limitation, also exist and are denoted PPM*. If no prediction can be made based on all n context symbols, a prediction is attempted with just n-1 symbols. This process is repeated until a match is found or no more symbols remain in the context. At that point a fixed prediction is made.

PPM is conceptually simple, but often computationally expensive. Much of the work in optimizing a PPM model lies in handling inputs that have not already occurred in the input stream [13]. The obvious way to handle them is to create a "never-seen" symbol which triggers the escape sequence. But what probability should be assigned to a symbol that has never been seen? This is called the zero-frequency problem.
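The optimality statement for Huffman coding made earlier in this section can be checked numerically. The following is a minimal Python sketch, not part of the paper's implementation: the function names (`huffman_code_lengths`, `average_length`, `entropy`) and the two example distributions are illustrative assumptions. It builds Huffman code lengths with a priority queue and compares the average code length with the entropy of the source.

```python
import heapq
import math


def huffman_code_lengths(probabilities):
    """Return {symbol: code length in bits} for a Huffman code (illustrative helper)."""
    if len(probabilities) == 1:
        # A single symbol still needs one bit in practice.
        return {next(iter(probabilities)): 1}
    # Heap items: (probability, tie breaker, symbols contained in this subtree).
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in probabilities}
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees; every symbol inside them
        # moves one level deeper, so its code grows by one bit.
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, syms1 + syms2))
        counter += 1
    return lengths


def average_length(probabilities, lengths):
    """Expected number of bits per symbol."""
    return sum(p * lengths[s] for s, p in probabilities.items())


def entropy(probabilities):
    """Shannon entropy in bits per symbol."""
    return -sum(p * math.log2(p) for p in probabilities.values() if p > 0)


# Probabilities that are all negative powers of two: Huffman meets the entropy.
dyadic = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
# Probabilities between those points: Huffman needs slightly more than the entropy.
skewed = {"a": 0.4, "b": 0.3, "c": 0.3}

for dist in (dyadic, skewed):
    lengths = huffman_code_lengths(dist)
    print(lengths, average_length(dist, lengths), entropy(dist))
```

For the first distribution the average length and the entropy coincide; for the second the average length is slightly larger, which is the small-alphabet inefficiency mentioned above.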
PPM compression implementations vary greatly in other details. The actual symbol selection is usually recorded using arithmetic coding, though it is also possible to use Huffman encoding or even some type of dictionary coding technique. The underlying model used in most PPM algorithms can also be extended to predict multiple symbols. The symbol size is usually static, typically a single byte, which makes generic handling of any file format easy.

2.2 LZ77
LZ77 algorithms achieve compression by replacing repeated occurrences of data with references to a single copy of that data existing earlier in the input (uncompressed) data stream. A match is encoded by a pair of numbers called a length-distance pair, which is equivalent to the statement "each of the next length characters is equal to the character exactly distance characters behind it in the uncompressed stream". To spot matches, the encoder must keep track of some amount of the most recent data, such as the last 2 KB, 4 KB, or 32 KB. The structure in which this data is held is called a sliding window, which is why LZ77 is sometimes called sliding window compression. The encoder needs to keep this data to look for matches, and the decoder needs to keep this data to interpret the matches the encoder refers to. The larger the sliding window, the further back the encoder may search when creating references.

It is not only acceptable but frequently useful to allow length-distance pairs to specify a length that actually exceeds the distance. As a copy command, this is puzzling: "Go back four characters and copy ten characters from that position into the current position" [10]. How can ten characters be copied over when only four of them are actually in the buffer? Tackling one byte at a time, there is no problem serving this request, because as a byte is copied over, it may be fed again as input to the copy command.
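The byte-at-a-time copy just described can be sketched in a few lines of Python. This is an illustration only: `lz77_copy` and its parameter names are hypothetical and do not come from the paper or from any library. The example reproduces the "go back four, copy ten" command from the text.

```python
def lz77_copy(output, distance, length):
    """Append `length` bytes to `output`, each taken `distance` bytes back.

    Copying one byte at a time lets `length` exceed `distance`: bytes that
    were just appended become valid sources for the bytes that follow.
    """
    if distance <= 0 or distance > len(output):
        raise ValueError("distance must point inside the already decoded data")
    for _ in range(length):
        output.append(output[-distance])
    return output


# Only four bytes are in the buffer, yet ten bytes can be copied.
buffer = bytearray(b"abcd")
lz77_copy(buffer, distance=4, length=10)
print(buffer)  # bytearray(b'abcdabcdabcdab')
```

A block copy that first read all ten source bytes would fail here, which is why decoders for this scheme copy sequentially.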
When the copy-from position reaches the initial destination position, it is consequently fed data that was pasted from the beginning of the copy-from position. The operation is thus equivalent to the statement "copy the data you were given and repetitively paste it until it fits".

2.3 LZ78
LZ78 algorithms achieve compression by replacing repeated occurrences of data with references to a dictionary that is built from the input data stream. Each dictionary entry is of the form dictionary[...] = {index, character}, where index is the index of a previous dictionary entry, and character is appended to the string represented by dictionary[index]. For example, "abc" would be stored (in reverse order) as follows: dictionary[k] = {j, 'c'}, dictionary[j] = {i, 'b'}, dictionary[i] = {0, 'a'}, where an index of 0 marks the end of a string.

The algorithm initializes last matching index = 0 and next available index = 1. For each character of the input stream, the dictionary is searched for a match: {last matching index, character}. If a match is found, last matching index is set to the index of the matching entry, and nothing is output. If a match is not found, a new dictionary entry is created: dictionary[next available index] = {last matching index, character}; the algorithm then outputs last matching index followed by character, resets last matching index = 0 and increments next available index. Once the dictionary is full, no more entries are added. When the end of the input stream is reached, the algorithm outputs last matching index. Note that the strings in the dictionary are stored in reverse order [14-16].

LZW is an LZ78-based algorithm that uses a dictionary pre-initialized with all possible symbols.
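Both dictionary schemes can be sketched in Python. The code below is a minimal illustration and not the Java implementation reported in this paper. All function names are hypothetical. Two assumptions are made: the input contains only characters with code points below 256, and codes are kept as plain integers rather than packed into fixed-width bit fields. `lz78_encode` follows the index/character procedure described above. `lzw_compress` and `lzw_decompress` mirror the LZW_COMPRESS and LZW_DECOMPRESS procedures of Section 3; the decoder additionally handles a code that is not yet in its table, a case that standard LZW decoders must cover.

```python
def lz78_encode(text):
    """LZ78: return (last matching index, character) pairs; index 0 is the empty prefix."""
    dictionary = {}      # (prefix index, character) -> entry index
    next_index = 1
    last = 0
    pairs = []
    for ch in text:
        key = (last, ch)
        if key in dictionary:
            last = dictionary[key]          # match found: extend it, output nothing
        else:
            dictionary[key] = next_index    # new entry, then emit the pair
            next_index += 1
            pairs.append((last, ch))
            last = 0
    if last:
        pairs.append((last, ""))            # end of input: flush the pending match
    return pairs


def lzw_compress(text):
    """LZW: return a list of integer codes; the table starts with all 256 single characters."""
    table = {chr(i): i for i in range(256)}
    next_code = 256
    string = ""
    codes = []
    for ch in text:
        if string + ch in table:
            string += ch
        else:
            codes.append(table[string])
            table[string + ch] = next_code
            next_code += 1
            string = ch
    if string:
        codes.append(table[string])
    return codes


def lzw_decompress(codes):
    """Rebuild the text from LZW codes, recreating the table on the fly."""
    if not codes:
        return ""
    table = {i: chr(i) for i in range(256)}
    next_code = 256
    old = codes[0]
    pieces = [table[old]]
    for new in codes[1:]:
        if new in table:
            entry = table[new]
        elif new == next_code:
            # The code was created by the encoder in the step that emitted it.
            entry = table[old] + table[old][0]
        else:
            raise ValueError("invalid LZW code: %d" % new)
        pieces.append(entry)
        table[next_code] = table[old] + entry[0]
        next_code += 1
        old = new
    return "".join(pieces)


sample = "/BAT/BE/BAR/BATS"
codes = lzw_compress(sample)
print(codes)
print(lzw_decompress(codes) == sample)
print(lz78_encode("abab"))
```

For the sample string of Table 1, the 16 input characters are encoded as 12 codes, and no dictionary is transmitted: the decoder rebuilds the same table from the codes alone.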
The main improvement of LZW is that when a match is not found, the current input character is assumed to be the first character of an existing string in the dictionary (since the dictionary is initialized with all possible characters), so only the last matching index is output (which may be the pre-initialized dictionary index corresponding to the previous symbol).

3 PROPOSED METHOD
LZW compression replaces strings of characters with single codes. It does not analyse the meaning of the input text. Instead, LZW constructs a table, known as the string translation table, from the text being compressed. The string translation table maps strings to fixed-length codes and is initialized with all single-character strings. As the input is read, the longest previously encountered string that matches the input is identified, and that string concatenated with the next input character (the extension character) is stored in the table under a new code [17]. The code for the longest previously encountered string is output, and the extension character becomes the beginning of the next string. Compression occurs because the translation table lets a single code be output in place of a string of characters, as shown in Table 1. Although LZW is often explained in the context of compressing text files, it can be used on any type of file. However, it generally performs best on files with repeated substrings, such as text files.

Table 1 The string table after the compression phase over the string /BAT/BE/BAR/BATS.
Input String = /BAT/BE/BAR/BATS

Character Input    Code Output    New Code Value    New String
/B                 /              256               /B
A                  B              257               BA
T                  A              258               AT
/                  T              259               T/
BE                 256            260               /BE
/                  E              261               E/
BA                 256            262               /BA
R                  A              263               AR
/                  R              264               R/
BAT                262            265               /BAT
S                  T              266               TS
EOF                S

3.1 Algorithm - LZW_COMPRESS
1. STRING = get input character
2. WHILE there are still input characters DO
3.     CHAR = get input character
4.     IF STRING+CHAR is in the string table THEN
5.         STRING = STRING+CHAR
6.     ELSE
7.         output the code for STRING
8.         add STRING+CHAR to the string table
9.         STRING = CHAR
10.    END of IF
11. END of WHILE
12. output the code for STRING

3.2 Algorithm - LZW_DECOMPRESS
1. Read O_CODE
2. output the translation of O_CODE
3. CHAR = translation of O_CODE
4. WHILE there are still input codes DO
5.     Read N_CODE
6.     IF N_CODE is not in the translation table THEN
7.         STRING = translation of O_CODE + CHAR
8.     ELSE
9.         STRING = translation of N_CODE
10.    END of IF
11.    output STRING
12.    CHAR = first character in STRING
13.    add translation of O_CODE + CHAR to the translation table
14.    O_CODE = N_CODE
15. END of WHILE

3.3 Data Flow Diagram (DFD)

Figure 2 Data Flow Diagram

4 IMPLEMENTATION
We implemented the LZW algorithm in Java and achieved around 50% compression, as shown in Fig. 2,

Fig. 3 and Fig. 4. Fig. 3 shows the size of the file before compression, Fig. 4 shows the size of the file after compression, and Fig. 5 shows the size of the file after decompression.

Figure 3 before compression
Figure 4 after compression
Figure 5 after uncompression

5 CONCLUSION
We conclude that LZW can be used in a cloud environment to compress data before storage, so that both data transfer time and storage space are reduced. The speed of compression and decompression depends on the compiler and the processor, and it keeps increasing as newer and faster processors become available. From the implementation we can also conclude that LZW has no overhead of sending a key (the dictionary) to the decompressor, because the decompressor starts from the same table, predefined with the ASCII values. LZW can therefore be successfully integrated with the cloud.

References
[1] Weiss. Computing in the Clouds [J]. netWorker, 2007, 11(4).
[2] Rajkumar Buyya, Chee Shin Yeo, Srikumar Venugopal, et al. Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility [J]. Future Generation Computer Systems, 2009, 25.
[3] Twenty experts define cloud computing [URL]. http://cloudcomputing.sys.con.com/read/612375_p.htm
[4] Amazon Inc. Amazon Web Services EC2 site [URL]. http://aws.amazon.com/ec2
[5] IBM Blue Cloud project [URL]. /press/us/en/pressrelease/22613.wss/, access on June
[6] Nimbus Project [URL]. /clouds/nimbus.html/
[7] OpenNebula Project [URL]. access on Apr
[8] S. Ghemawat, H. Gobioff, and S. Leung. The Google file system [C]. In Proceedings of the 19th ACM

Symposium on Operating Systems Principles, pages 29-43, 2003.
[9] GoogleApp [URL]. access on June
[10] C:\Documents and Settings\DELL\Local Settings\temp\IM\LZSS (LZ77) Discussion and Implementation.mht.
[11] D.A. Huffman, "A Method for the Construction of Minimum Redundancy Codes", Proceedings of the I.R.E., September 1952.
[12] T. Bell, J. Cleary, and I. Witten, Data compression using adaptive coding and partial string matching, IEEE Transactions on Communications, Vol. 32 (4).
[13] A. Moffat, Implementing the PPM data compression scheme, IEEE Transactions on Communications, Vol. 38 (11), November.
[14] Ziv, J., & Lempel, A. A Universal Algorithm for Sequential Data Compression, IEEE Transactions on Information Theory, 23(3), May.
[15] Ziv, J., & Lempel, A. Compression of individual sequences via variable-rate coding, IEEE Transactions on Information Theory, 24(5), September.
[16] M. Burrows and D. J. Wheeler, A Block-sorting Lossless Data Compression Algorithm, Digital Systems Research Center Research Report 124, May.
[17] Welch, T.A. A technique for high performance data compression, IEEE Computer, 17(6), pp. 8-19.

Author's Profile
Mr. K. Govinda, Ph.D. Scholar and A.P (SG) in the School of Computing Science and Engineering of VIT University, Vellore, Tamil Nadu. He has more than X years of teaching experience and his areas of interest are Database, Distributed Database, Data Warehousing & Mining and Cloud Computing.
Yuvaraj Kumar received the M.Sc. degree in Computer Science from VIT University in

A* Algorithm Based Optimization for Cloud Storage

A* Algorithm Based Optimization for Cloud Storage International Journal of Digital Content Technology and its Applications Volume 4, Number 8, November 21 A* Algorithm Based Optimization for Cloud Storage 1 Ren Xun-Yi, 2 Ma Xiao-Dong 1* College of Computer

More information

Multimedia Systems WS 2010/2011

Multimedia Systems WS 2010/2011 Multimedia Systems WS 2010/2011 31.01.2011 M. Rahamatullah Khondoker (Room # 36/410 ) University of Kaiserslautern Department of Computer Science Integrated Communication Systems ICSY http://www.icsy.de

More information

Analysis of Compression Algorithms for Program Data

Analysis of Compression Algorithms for Program Data Analysis of Compression Algorithms for Program Data Matthew Simpson, Clemson University with Dr. Rajeev Barua and Surupa Biswas, University of Maryland 12 August 3 Abstract Insufficient available memory

More information

Image Compression through DCT and Huffman Coding Technique

Image Compression through DCT and Huffman Coding Technique International Journal of Current Engineering and Technology E-ISSN 2277 4106, P-ISSN 2347 5161 2015 INPRESSCO, All Rights Reserved Available at http://inpressco.com/category/ijcet Research Article Rahul

More information

On the Use of Compression Algorithms for Network Traffic Classification

On the Use of Compression Algorithms for Network Traffic Classification On the Use of for Network Traffic Classification Christian CALLEGARI Department of Information Ingeneering University of Pisa 23 September 2008 COST-TMA Meeting Samos, Greece Outline Outline 1 Introduction

More information

Compression techniques

Compression techniques Compression techniques David Bařina February 22, 2013 David Bařina Compression techniques February 22, 2013 1 / 37 Contents 1 Terminology 2 Simple techniques 3 Entropy coding 4 Dictionary methods 5 Conclusion

More information

IJESRT. Scientific Journal Impact Factor: 3.449 (ISRA), Impact Factor: 2.114

IJESRT. Scientific Journal Impact Factor: 3.449 (ISRA), Impact Factor: 2.114 IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY Optimized Storage Approaches in Cloud Environment Sri M.Tanooj kumar, A.Radhika Department of Computer Science and Engineering,

More information

Lossless Data Compression Standard Applications and the MapReduce Web Computing Framework

Lossless Data Compression Standard Applications and the MapReduce Web Computing Framework Lossless Data Compression Standard Applications and the MapReduce Web Computing Framework Sergio De Agostino Computer Science Department Sapienza University of Rome Internet as a Distributed System Modern

More information

LZ77. Example 2.10: Let T = badadadabaab and assume d max and l max are large. phrase b a d adadab aa b

LZ77. Example 2.10: Let T = badadadabaab and assume d max and l max are large. phrase b a d adadab aa b LZ77 The original LZ77 algorithm works as follows: A phrase T j starting at a position i is encoded as a triple of the form distance, length, symbol. A triple d, l, s means that: T j = T [i...i + l] =

More information

Wan Accelerators: Optimizing Network Traffic with Compression. Bartosz Agas, Marvin Germar & Christopher Tran

Wan Accelerators: Optimizing Network Traffic with Compression. Bartosz Agas, Marvin Germar & Christopher Tran Wan Accelerators: Optimizing Network Traffic with Compression Bartosz Agas, Marvin Germar & Christopher Tran Introduction A WAN accelerator is an appliance that can maximize the services of a point-to-point(ptp)

More information

Arithmetic Coding: Introduction

Arithmetic Coding: Introduction Data Compression Arithmetic coding Arithmetic Coding: Introduction Allows using fractional parts of bits!! Used in PPM, JPEG/MPEG (as option), Bzip More time costly than Huffman, but integer implementation

More information

HIGH DENSITY DATA STORAGE IN DNA USING AN EFFICIENT MESSAGE ENCODING SCHEME Rahul Vishwakarma 1 and Newsha Amiri 2

HIGH DENSITY DATA STORAGE IN DNA USING AN EFFICIENT MESSAGE ENCODING SCHEME Rahul Vishwakarma 1 and Newsha Amiri 2 HIGH DENSITY DATA STORAGE IN DNA USING AN EFFICIENT MESSAGE ENCODING SCHEME Rahul Vishwakarma 1 and Newsha Amiri 2 1 Tata Consultancy Services, India derahul@ieee.org 2 Bangalore University, India ABSTRACT

More information

Streaming Lossless Data Compression Algorithm (SLDC)

Streaming Lossless Data Compression Algorithm (SLDC) Standard ECMA-321 June 2001 Standardizing Information and Communication Systems Streaming Lossless Data Compression Algorithm (SLDC) Phone: +41 22 849.60.00 - Fax: +41 22 849.60.01 - URL: http://www.ecma.ch

More information

Lempel-Ziv Coding Adaptive Dictionary Compression Algorithm

Lempel-Ziv Coding Adaptive Dictionary Compression Algorithm Lempel-Ziv Coding Adaptive Dictionary Compression Algorithm 1. LZ77:Sliding Window Lempel-Ziv Algorithm [gzip, pkzip] Encode a string by finding the longest match anywhere within a window of past symbols

More information

Information, Entropy, and Coding

Information, Entropy, and Coding Chapter 8 Information, Entropy, and Coding 8. The Need for Data Compression To motivate the material in this chapter, we first consider various data sources and some estimates for the amount of data associated

More information

ANALYSIS OF THE EFFECTIVENESS IN IMAGE COMPRESSION FOR CLOUD STORAGE FOR VARIOUS IMAGE FORMATS

ANALYSIS OF THE EFFECTIVENESS IN IMAGE COMPRESSION FOR CLOUD STORAGE FOR VARIOUS IMAGE FORMATS ANALYSIS OF THE EFFECTIVENESS IN IMAGE COMPRESSION FOR CLOUD STORAGE FOR VARIOUS IMAGE FORMATS Dasaradha Ramaiah K. 1 and T. Venugopal 2 1 IT Department, BVRIT, Hyderabad, India 2 CSE Department, JNTUH,

More information

A Perfect CRIME? TIME Will Tell. Tal Be ery, Web research TL

A Perfect CRIME? TIME Will Tell. Tal Be ery, Web research TL A Perfect CRIME? TIME Will Tell Tal Be ery, Web research TL Agenda BEAST + Modes of operation CRIME + Gzip compression + Compression + encryption leak data TIME + Timing + compression leak data Attacking

More information

A Survey on Cloud Computing

A Survey on Cloud Computing A Survey on Cloud Computing Poulami dalapati* Department of Computer Science Birla Institute of Technology, Mesra Ranchi, India dalapati89@gmail.com G. Sahoo Department of Information Technology Birla

More information

A Proficient scheme for Backup and Restore Data in Android for Mobile Devices M S. Shriwas

A Proficient scheme for Backup and Restore Data in Android for Mobile Devices M S. Shriwas A Proficient scheme for Backup and Restore Data in Android for Mobile Devices M S. Shriwas Abstract: Today are smart phones world. Phones are not just for contact to people but it plays vital role in the

More information

Data Mining Un-Compressed Images from cloud with Clustering Compression technique using Lempel-Ziv-Welch

Data Mining Un-Compressed Images from cloud with Clustering Compression technique using Lempel-Ziv-Welch Data Mining Un-Compressed Images from cloud with Clustering Compression technique using Lempel-Ziv-Welch 1 C. Parthasarathy 2 K.Srinivasan and 3 R.Saravanan Assistant Professor, 1,2,3 Dept. of I.T, SCSVMV

More information

Data Reduction: Deduplication and Compression. Danny Harnik IBM Haifa Research Labs

Data Reduction: Deduplication and Compression. Danny Harnik IBM Haifa Research Labs Data Reduction: Deduplication and Compression Danny Harnik IBM Haifa Research Labs Motivation Reducing the amount of data is a desirable goal Data reduction: an attempt to compress the huge amounts of

More information

Extended Application of Suffix Trees to Data Compression

Extended Application of Suffix Trees to Data Compression Extended Application of Suffix Trees to Data Compression N. Jesper Larsson A practical scheme for maintaining an index for a sliding window in optimal time and space, by use of a suffix tree, is presented.

More information

SRC Research. d i g i t a l. A Block-sorting Lossless Data Compression Algorithm. Report 124. M. Burrows and D.J. Wheeler.

SRC Research. d i g i t a l. A Block-sorting Lossless Data Compression Algorithm. Report 124. M. Burrows and D.J. Wheeler. May 10, 1994 SRC Research Report 124 A Block-sorting Lossless Data Compression Algorithm M. Burrows and D.J. Wheeler d i g i t a l Systems Research Center 130 Lytton Avenue Palo Alto, California 94301

More information

An Efficient Checkpointing Scheme Using Price History of Spot Instances in Cloud Computing Environment

An Efficient Checkpointing Scheme Using Price History of Spot Instances in Cloud Computing Environment An Efficient Checkpointing Scheme Using Price History of Spot Instances in Cloud Computing Environment Daeyong Jung 1, SungHo Chin 1, KwangSik Chung 2, HeonChang Yu 1, JoonMin Gil 3 * 1 Dept. of Computer

More information

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 17 Shannon-Fano-Elias Coding and Introduction to Arithmetic Coding

More information

FEDERATED CLOUD: A DEVELOPMENT IN CLOUD COMPUTING AND A SOLUTION TO EDUCATIONAL NEEDS

FEDERATED CLOUD: A DEVELOPMENT IN CLOUD COMPUTING AND A SOLUTION TO EDUCATIONAL NEEDS International Journal of Computer Engineering and Applications, Volume VIII, Issue II, November 14 FEDERATED CLOUD: A DEVELOPMENT IN CLOUD COMPUTING AND A SOLUTION TO EDUCATIONAL NEEDS Saju Mathew 1, Dr.

More information

Keywords: Cloudsim, MIPS, Gridlet, Virtual machine, Data center, Simulation, SaaS, PaaS, IaaS, VM. Introduction

Keywords: Cloudsim, MIPS, Gridlet, Virtual machine, Data center, Simulation, SaaS, PaaS, IaaS, VM. Introduction Vol. 3 Issue 1, January-2014, pp: (1-5), Impact Factor: 1.252, Available online at: www.erpublications.com Performance evaluation of cloud application with constant data center configuration and variable

More information

Transformation of LOG file using LIPT technique

Transformation of LOG file using LIPT technique Research Article International Journal of Advanced Computer Research, Vol 6(23) ISSN (Print): 2249-7277 ISSN (Online): 2277-7970 http://dx.doi.org/ 10.19101/IJACR.2016.623015 Transformation of LOG file

More information

How to Send Video Images Through Internet

How to Send Video Images Through Internet Transmitting Video Images in XML Web Service Francisco Prieto, Antonio J. Sierra, María Carrión García Departamento de Ingeniería de Sistemas y Automática Área de Ingeniería Telemática Escuela Superior

More information

Searching BWT compressed text with the Boyer-Moore algorithm and binary search

Searching BWT compressed text with the Boyer-Moore algorithm and binary search Searching BWT compressed text with the Boyer-Moore algorithm and binary search Tim Bell 1 Matt Powell 1 Amar Mukherjee 2 Don Adjeroh 3 November 2001 Abstract: This paper explores two techniques for on-line

More information

A Load Balancing Model Based on Cloud Partitioning for the Public Cloud

A Load Balancing Model Based on Cloud Partitioning for the Public Cloud International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 16 (2014), pp. 1605-1610 International Research Publications House http://www. irphouse.com A Load Balancing

More information

How To Understand Cloud Computing

How To Understand Cloud Computing Cloud Computing: a Perspective Study Lizhe WANG, Gregor von LASZEWSKI, Younge ANDREW, Xi HE Service Oriented Cyberinfrastruture Lab, Rochester Inst. of Tech. Abstract The Cloud computing emerges as a new

More information

Optimal Service Pricing for a Cloud Cache

Optimal Service Pricing for a Cloud Cache Optimal Service Pricing for a Cloud Cache K.SRAVANTHI Department of Computer Science & Engineering (M.Tech.) Sindura College of Engineering and Technology Ramagundam,Telangana G.LAKSHMI Asst. Professor,

More information

Data Compression Using Long Common Strings

Data Compression Using Long Common Strings Data Compression Using Long Common Strings Jon Bentley Bell Labs, Room 2C-514 600 Mountain Avenue Murray Hill, NJ 07974 jlb@research.bell-labs.com Douglas McIlroy Department of Computer Science Dartmouth

More information

Chapter 4: Computer Codes

Chapter 4: Computer Codes Slide 1/30 Learning Objectives In this chapter you will learn about: Computer data Computer codes: representation of data in binary Most commonly used computer codes Collating sequence 36 Slide 2/30 Data

More information

CHAPTER 5. Obfuscation is a process of converting original data into unintelligible data. It

CHAPTER 5. Obfuscation is a process of converting original data into unintelligible data. It CHAPTER 5 5.1. Introduction Obfuscation is a process of converting original data into unintelligible data. It is similar to encryption but it uses mathematical calculations or programming logics. Encryption

More information

THE SECURITY AND PRIVACY ISSUES OF RFID SYSTEM

THE SECURITY AND PRIVACY ISSUES OF RFID SYSTEM THE SECURITY AND PRIVACY ISSUES OF RFID SYSTEM Iuon Chang Lin Department of Management Information Systems, National Chung Hsing University, Taiwan, Department of Photonics and Communication Engineering,

More information

CHAPTER 2 LITERATURE REVIEW

CHAPTER 2 LITERATURE REVIEW 11 CHAPTER 2 LITERATURE REVIEW 2.1 INTRODUCTION Image compression is mainly used to reduce storage space, transmission time and bandwidth requirements. In the subsequent sections of this chapter, general

More information

Lossless Grey-scale Image Compression using Source Symbols Reduction and Huffman Coding

Lossless Grey-scale Image Compression using Source Symbols Reduction and Huffman Coding Lossless Grey-scale Image Compression using Source Symbols Reduction and Huffman Coding C. SARAVANAN cs@cc.nitdgp.ac.in Assistant Professor, Computer Centre, National Institute of Technology, Durgapur,WestBengal,

More information

Secured Storage of Outsourced Data in Cloud Computing

Secured Storage of Outsourced Data in Cloud Computing Secured Storage of Outsourced Data in Cloud Computing Chiranjeevi Kasukurthy 1, Ch. Ramesh Kumar 2 1 M.Tech(CSE), Nalanda Institute of Engineering & Technology,Siddharth Nagar, Sattenapalli, Guntur Affiliated

More information

Content-aware Partial Compression for Big Textual Data Analysis Acceleration

Content-aware Partial Compression for Big Textual Data Analysis Acceleration Content-aware Partial Compression for Big Textual Data Analysis Acceleration Dapeng Dong and John Herbert Mobile and Internet Systems Laboratory and Insight Centre for Data Analytic National University

More information

Third Southern African Regional ACM Collegiate Programming Competition. Sponsored by IBM. Problem Set

Third Southern African Regional ACM Collegiate Programming Competition. Sponsored by IBM. Problem Set Problem Set Problem 1 Red Balloon Stockbroker Grapevine Stockbrokers are known to overreact to rumours. You have been contracted to develop a method of spreading disinformation amongst the stockbrokers

More information

A comprehensive survey on various ETC techniques for secure Data transmission

A comprehensive survey on various ETC techniques for secure Data transmission A comprehensive survey on various ETC techniques for secure Data transmission Shaikh Nasreen 1, Prof. Suchita Wankhade 2 1, 2 Department of Computer Engineering 1, 2 Trinity College of Engineering and

More information

PRIVACY PRESERVATION ALGORITHM USING EFFECTIVE DATA LOOKUP ORGANIZATION FOR STORAGE CLOUDS

PRIVACY PRESERVATION ALGORITHM USING EFFECTIVE DATA LOOKUP ORGANIZATION FOR STORAGE CLOUDS PRIVACY PRESERVATION ALGORITHM USING EFFECTIVE DATA LOOKUP ORGANIZATION FOR STORAGE CLOUDS Amar More 1 and Sarang Joshi 2 1 Department of Computer Engineering, Pune Institute of Computer Technology, Maharashtra,

More information

zdelta: An Efficient Delta Compression Tool

zdelta: An Efficient Delta Compression Tool zdelta: An Efficient Delta Compression Tool Dimitre Trendafilov Nasir Memon Torsten Suel Department of Computer and Information Science Technical Report TR-CIS-2002-02 6/26/2002 zdelta: An Efficient Delta

More information

International Journal of Advanced Research in Computer Science and Software Engineering

International Journal of Advanced Research in Computer Science and Software Engineering Volume 3, Issue 7, July 23 ISSN: 2277 28X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Greedy Algorithm:

More information

Ranked Keyword Search Using RSE over Outsourced Cloud Data

Ranked Keyword Search Using RSE over Outsourced Cloud Data Ranked Keyword Search Using RSE over Outsourced Cloud Data Payal Akriti 1, Ms. Preetha Mary Ann 2, D.Sarvanan 3 1 Final Year MCA, Sathyabama University, Tamilnadu, India 2&3 Assistant Professor, Sathyabama

More information

Network (Tree) Topology Inference Based on Prüfer Sequence

C. Vanniarajan and Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology Madras Chennai 600036 vanniarajanc@hcl.in,

From Grid Computing to Cloud Computing & Security Issues in Cloud Computing

Rajendra Kumar Dwivedi Department of CSE, M.M.M. Engineering College, Gorakhpur (UP), India 273010 rajendra_bhilai@yahoo.com

Secured Lossless Medical Image Compression Based On Adaptive Binary Optimization

IOSR Journal of Computer Engineering (IOSR-JCE) e-ISSN: 2278-0661, p-ISSN: 2278-8727, Volume 16, Issue 2, Ver. IV (Mar-Apr. 2014), PP 43-47

Introduction to Hadoop

1 What is Hadoop? the big data revolution extracting value from data cloud computing 2 Understanding MapReduce the word count problem more examples MCS 572 Lecture 24

A SURVEY ON MAPREDUCE IN CLOUD COMPUTING

Dr.M.Newlin Rajkumar 1, S.Balachandar 2, Dr.V.Venkatesakumar 3, T.Mahadevan 4 1 Asst. Prof, Dept. of CSE, Anna University Regional Centre, Coimbatore, newlin_rajkumar@yahoo.co.in

AN INTELLIGENT TEXT DATA ENCRYPTION AND COMPRESSION FOR HIGH SPEED AND SECURE DATA TRANSMISSION OVER INTERNET

Dr. V.K. Govindan 1 B.S. Shajee mohan 2 1. Prof. & Head CSED, NIT Calicut, Kerala 2. Assistant

Lempel-Ziv Factorization: LZ77 without Window

Enno Ohlebusch May 13, 2016 1 Suffix arrays To construct the suffix array of a string S boils down to sorting all suffixes of S in lexicographic order (also known as

From Grid Computing to Cloud Computing & Security Issues in Cloud Computing

Rajendra Kumar Dwivedi Assistant Professor (Department of CSE), M.M.M. Engineering College, Gorakhpur (UP), India E-mail: rajendra_bhilai@yahoo.com

So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)

Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2

Performance Evaluation of Round Robin Algorithm in Cloud Environment

Asha M L 1 Neethu Myshri R 2 Sowmyashree C.S 3 1,3 AP, Dept. of CSE, SVCE, Bangalore. 2 M.E(dept. of CSE) Student, UVCE, Bangalore.

Search Query and Matching Approach of Information Retrieval in Cloud Computing

International Journal of Advances in Electrical and Electronics Engineering 99 Available online at www.ijaeee.com & www.sestindia.org ISSN: 2319-1112

Fast Arithmetic Coding (FastAC) Implementations

Amir Said 1 Introduction This document describes our fast implementations of arithmetic coding, which achieve optimal compression and higher throughput by

Jeffrey D. Ullman slides. MapReduce for data intensive computing

Single-node architecture CPU Machine Learning, Statistics Memory Classical Data Mining Disk Commodity Clusters Web data sets can be very

Cloud Computing: Computing as a Service. Prof. Daivashala Deshmukh Maharashtra Institute of Technology, Aurangabad

Abstract: Computing as a utility is a dream that dates from the beginning of the computer

Optimization of Search Results with Duplicate Page Elimination using Usage Data A. K. Sharma 1, Neelam Duhan 2 1, 2

Department of Computer Engineering, YMCA University of Science & Technology, Faridabad,

FREE computing using Amazon EC2

Seong-Hwan Jun 1 1 Department of Statistics Univ of British Columbia Nov 1st, 2012 / Student seminar Outline Basics of servers Amazon EC2 Setup R on an EC2 instance Stat

Secure Collaborative Privacy In Cloud Data With Advanced Symmetric Key Block Algorithm

Twinkle Graf.F 1, Mrs.Prema.P 2 1 (M.E- CSE, Dhanalakshmi College of Engineering, Chennai, India) 2 (Asst. Professor

EFFECTIVE DATA RECOVERY FOR CONSTRUCTIVE CLOUD PLATFORM

INTERNATIONAL JOURNAL OF REVIEWS ON RECENT ELECTRONICS AND COMPUTER SCIENCE Macha Arun 1, B.Ravi Kumar 2 1 M.Tech Student, Dept of CSE, Holy Mary

Signature Amortization Technique for Authenticating Delay Sensitive Stream

M Bruntha 1, Dr J. Premalatha Ph.D. 2 1 M.E., 2 Professor, Department of Information Technology, Kongu Engineering College, Perundurai,

VIRTUAL LABORATORY: MULTI-STYLE CODE EDITOR

Andrey V.Lyamin, State University of IT, Mechanics and Optics St. Petersburg, Russia Oleg E.Vashenkov, State University of IT, Mechanics and Optics, St.Petersburg,

Participatory Cloud Computing and the Privacy and Security of Medical Information Applied to A Wireless Smart Board Network

Lutando Ngqakaza ngqlut003@myuct.ac.za UCT Department of Computer Science Abstract:

Compressing Medical Records for Storage on a Low-End Mobile Phone

Honours Project Report Paul Brittan pbrittan@cs.uct.ac.za Supervised By: Sonia Berman, Gary Marsden & Anne Kayem Category Min Max Chosen

Variables, Constants, and Data Types

Primitive Data Types Variables, Initialization, and Assignment Constants Characters Strings Reading for this class: L&L, 2.1-2.3, App C 1 Primitive Data There are eight

A NEW LOSSLESS METHOD OF IMAGE COMPRESSION AND DECOMPRESSION USING HUFFMAN CODING TECHNIQUES

1 JAGADISH H. PUJAR, 2 LOHIT M. KADLASKAR 1 Faculty, Department of EEE, B V B College of Engg. & Tech., Hubli,

Conceptual Framework Strategies for Image Compression: A Review

International Journal of Computer Sciences and Engineering Open Access Review Paper Volume-4, Special Issue-1 E-ISSN: 2347-2693 Sumanta Lal

TCP/IP Networking, Part 2: Web-Based Control

Microchip TCP/IP Stack HTTP2 Module 2007 Microchip Technology Incorporated. All Rights Reserved. Building Embedded Web Applications Slide 1 Welcome to the next

Comparison of different image compression formats. ECE 533 Project Report Paula Aguilera

Introduction: Images are very important documents nowadays; to work with them in some applications they need to be

Privacy Preserving Outsourcing for Frequent Itemset Mining

M. Arunadevi 1, R. Anuradha 2 PG Scholar, Department of Software Engineering, Sri Ramakrishna Engineering College, Coimbatore, India 1 Assistant

Multilevel Communication Aware Approach for Load Balancing

1 Dipti Patel, 2 Ashil Patel Department of Information Technology, L.D. College of Engineering, Gujarat Technological University, Ahmedabad 1

ANALYSIS AND EFFICIENCY OF ERROR FREE COMPRESSION ALGORITHM FOR MEDICAL IMAGE

1 J HEMAMALINI, 2 D KAAVYA 1 Asstt Prof., Department of Information Technology, Sathyabama University, Chennai, Tamil Nadu

Key Components of WAN Optimization Controller Functionality

Introduction and Goals One of the key challenges facing IT organizations relative to application and service delivery is ensuring that the applications

Binary Trees and Huffman Encoding Binary Search Trees

Computer Science E119 Harvard Extension School Fall 2012 David G. Sullivan, Ph.D. Motivation: Maintaining a Sorted Collection of Data A data dictionary

Dissertation Title: SOCKS5-based Firewall Support For UDP-based Application. Author: Fung, King Pong

MSc in Information Technology The Hong Kong Polytechnic University June 1999 i Abstract Abstract of dissertation

RANKING OF CLOUD SERVICE PROVIDERS IN CLOUD

C.S. RAJARAJESWARI, M. ARAMUDHAN Research Scholar, Bharathiyar University, Coimbatore, Tamil Nadu, India. Assoc. Professor, Department of IT, PKIET, Karaikal,

Big Data Technology Map-Reduce Motivation: Indexing in Search Engines

Edward Bortnikov & Ronny Lempel Yahoo Labs, Haifa Indexing in Search Engines Information Retrieval's two main stages: Indexing process

Introduction to Hadoop

1 What is Hadoop? We are living in an era where large volumes of data are available and the problem is to extract meaning from the data avalanche. The goal of the software tools

High performance computing network for cloud environment using simulators

Ajith Singh. N 1 and M. Hemalatha 2 1 Ph.D, Research Scholar (CS), Karpagam University, Coimbatore, India 2 Prof & Head, Department

Structures for Data Compression Responsible persons: Claudia Dolci, Dante Salvini, Michael Schrattner, Robert Weibel

Geographic Information Technology Training Alliance (GITTA) presents: Content 1.... 2 1.1. General Compression Concepts...3

A PPM-like, tag-based branch predictor

Journal of Instruction-Level Parallelism 7 (25) 1-1 Submitted 1/5; published 4/5 Pierre Michaud IRISA/INRIA Campus de Beaulieu, Rennes 35, France pmichaud@irisa.fr

Storage Management for Files of Dynamic Records

Justin Zobel Department of Computer Science, RMIT, GPO Box 2476V, Melbourne 3001, Australia. jz@cs.rmit.edu.au Alistair Moffat Department of Computer Science

Indexing and Compression of Text

Compressing the Digital Library Timothy C. Bell 1, Alistair Moffat 2, and Ian H. Witten 3 1 Department of Computer Science, University of Canterbury, New Zealand, tim@cosc.canterbury.ac.nz 2 Department

Parallel Compression and Decompression of DNA Sequence Reads in FASTQ Format

pp.91-100 http://dx.doi.org/10.14257/ijhit.2014.7.4.09 Jingjing Zheng 1,* and Ting Wang 1, 2 1,* Parallel Software and Computational

Department of Computer Science. The University of Arizona. Tucson Arizona

A TEXT COMPRESSION SCHEME THAT ALLOWS FAST SEARCHING DIRECTLY IN THE COMPRESSED FILE Udi Manber Technical report #93-07 (March 1993)

Tape Drive Data Compression Q & A

Question What is data compression and how does compression work? Data compression permits increased storage capacities by using a mathematical algorithm that reduces redundant

Data Storage Security in Cloud Computing for Ensuring Effective and Flexible Distributed System

1 K.Valli Madhavi A.P vallimb@yahoo.com Mobile: 9866034900 2 R.Tamilkodi A.P tamil_kodiin@yahoo.co.in Mobile:

Lossless Medical Image Compression using Predictive Coding and Integer Wavelet Transform based on Minimum Entropy Criteria

1 Komal Gupta, Ram Lautan Verma, 3 Md. Sanawer Alam 1 M.Tech Scholar, Deptt. Of

A Data De-duplication Access Framework for Solid State Drives

JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 28, 941-954 (2012) Department of Electronic Engineering National Taiwan University of Science

encoding compression encryption

ASCII utf-8 utf-16 zip mpeg jpeg AES RSA diffie-hellman Expressing characters... ASCII and Unicode, conventions of how characters are expressed in bits. ASCII (7 bits) -

Optimization of ETL Work Flow in Data Warehouse

Kommineni Sivaganesh M.Tech Student, CSE Department, Anil Neerukonda Institute of Technology & Science Visakhapatnam, India. Sivaganesh07@gmail.com P Srinivasu

The enhancement of the operating speed of the algorithm of adaptive compression of binary bitmap images

Borusyak A.V. Research Institute of Applied Mathematics and Cybernetics Lobachevsky Nizhni Novgorod

Today's topics. Digital Computers. More on binary. Binary Digits (Bits)

Today's topics: Binary Numbers; Brookshear.-.; Slides from Prof. Marti Hearst of UC Berkeley SIMS. Upcoming: Networks; Interactive Introduction to Graph Theory http://www.utm.edu/cgi-bin/caldwell/tutor/departments/math/graph/intro