Searchable encryption




RESEARCH MASTER'S DEGREE IN COMPUTER SCIENCE

Searchable encryption

BIBLIOGRAPHICAL STUDY

26 January 2012

Tarik Moataz
INTERNSHIP at Alcatel-Lucent Bell Labs

Supervisors:
Cuppens Frédéric, SFIIS LabSTICC
Cuppens Nora, SFIIS LabSTICC
Shikfa Abdullatif, Bell Labs

Summary

Abstract
1. Towards a safer storage
1.1. Context
1.2. The issue
1.3. Resulting solution
2. Searchable Encryption
2.1. Definition
2.2. Search approaches on encrypted data
2.2.1. Symmetric searchable encryption
2.2.2. Asymmetric searchable encryption
2.2.3. PIR: Private Information Retrieval
3. Searchable Encryption models
3.1. Symmetric models
3.1.1. Goh's scheme
3.1.2. Reza Curtmola et al. scheme
3.2. Asymmetric model: Boneh et al.
3.3. Security of searchable encryption solutions
3.3.1. Security solutions
4. Comparison of schemes
Conclusion
Bibliography

Abstract: Searchable encryption is a recent concept for preserving privacy when data is stored with an untrusted third party. It applies to many settings, such as mail servers, file systems and database management systems. Searchable encryption consists of storing encrypted data (documents, mails, OS files...) and retrieving it without leaking information, so that the user stays anonymous and the data stays confidential.

In the first part of this study, we present the context of searchable encryption. We then point out why security matters for a full-scale deployment of outsourced data storage, and we introduce searchable encryption as a way to secure the data. In the second part, we define searchable encryption and its current approaches. To deepen our understanding, we explain several searchable encryption schemes, their security models and the privacy they achieve. Last but not least, we compare some of the important schemes, and finally we link the subject to my internship and future contribution at Bell Labs.

1. Towards a safer storage

1.1. Context

Modern life requires ever more electronic and computing resources to carry out tasks that terminal equipment cannot fulfil alone. These tasks include storage services, software computing tasks, financial services and multi-party data access. Users need access to their data everywhere, with unlimited storage capacity and permanent availability. Local data storage does not provide all these features. To get around this limitation, users have to trust a third party to provide these services. In fact, users are already used to storing their messages on mail servers; they now want to store more data on outsourced servers to take advantage of the large space available.

1.2. The issue

Privacy is becoming more and more important on outsourced servers. Users store data on servers without any encryption; this information is therefore exposed and vulnerable to malicious attacks, as shown in figure 1. Users fear unauthorized access and the loss of data integrity and confidentiality. A main goal for outsourced storage is therefore to provide privacy and to keep the data stored on the servers confidential. Furthermore, enterprises and large firms tend to avoid outsourced storage for fear of disclosing their data.

Figure 1: Privacy threat

1.3. Resulting solution

We can imagine that a user encrypts his data before uploading it to the server. The question is then how the user can query the server to retrieve his information without leaking any of it. The naïve idea consists of downloading the whole encrypted data set whenever the user wants to perform a query, see figure 2.

Figure 2: Naïve solution

This solution preserves the user's privacy, but it is infeasible in practice because of the huge amount of data, the limited computational resources on the client side and the limited bandwidth between the server and the client. Several attempts were made to solve this problem, but none was successful until the year 2000, when D. Song et al. [3] were the first to propose a workable solution for storing and retrieving data without any leak of information. Searchable encryption was thus initiated, and a number of later works improved on Song's solution. Searchable encryption consists of storing encrypted data and retrieving it by sending encrypted queries, so that neither the queries nor the stored data are revealed to the server, as shown in figure 3.

2. Searchable Encryption

Figure 3: Searchable encryption transaction (legend: keyword searched for; encrypted document containing the keyword)

2.1. Definition

Searchable encryption is a recent concept that allows searches on encrypted data without any leak of information. The main idea is to run an encrypted query without having to download the whole encrypted data set. Searchable encryption is composed of two steps, see figure 4:
- Store phase: store a special encryption of the data with the untrusted third party;
- Search phase: send an encrypted search query to retrieve the desired information.

Searchable encryption can be used, for example, to retrieve all mails containing the word "urgent" from an untrusted mail server, or to let a network gateway prioritise messages containing the word "urgent" without learning anything else about their contents. Searchable encryption also has many applications in encrypted search over databases (PostgreSQL [5], MySQL [6], SQLite [7]), mail servers (IMAPS [4], POP...) and file systems (FUSE [8], Samba [9]).

We illustrate the problem of searching on encrypted data with this example: Alice has a set of documents and stores them in encrypted form on an untrusted server. Alice wants to find the word "amour" among her encrypted documents. She sends the server a hidden query in which the word is encrypted, so that the server learns nothing about the content of the query. The server then sends Alice all the encrypted documents containing the word "amour". This example deals with a kind of searchable encryption called symmetric searchable encryption. There are other approaches, which we deal with in the next section.
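The Alice example above can be sketched in a few lines of Python. This is only an illustration of the store and search phases, not one of the schemes studied in this report: the helper names (`keyword_token`, `store_phase`, `search_phase`), the use of HMAC-SHA256 as the keyed function and the sample documents are all assumptions, and the encryption of the document bodies is left out.

```python
import hashlib
import hmac

def keyword_token(key: bytes, word: str) -> bytes:
    """Hidden query for a word: a keyed hash only the key holder can compute."""
    return hmac.new(key, word.encode("utf-8"), hashlib.sha256).digest()

def store_phase(key: bytes, documents: dict) -> dict:
    """Client side: build what is uploaded, i.e. doc id -> set of keyword tokens.
    The encrypted document bodies are omitted from this sketch."""
    return {doc_id: {keyword_token(key, w) for w in text.split()}
            for doc_id, text in documents.items()}

def search_phase(server_index: dict, token: bytes) -> list:
    """Server side: match an opaque token without learning the word behind it."""
    return sorted(doc_id for doc_id, tokens in server_index.items()
                  if token in tokens)

alice_key = b"alice-demo-secret-key"          # hypothetical secret key
alice_docs = {
    "d1": "mon amour pour toujours",
    "d2": "rapport annuel",
    "d3": "lettre amour",
}
server_index = store_phase(alice_key, alice_docs)
matches = search_phase(server_index, keyword_token(alice_key, "amour"))
print(matches)  # ['d1', 'd3']
```

The server only ever sees tokens, but it still learns which documents match a given token; the schemes discussed later refine this basic idea.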

Figure 4: Searchable encryption steps

2.2. Search approaches on encrypted data

All the approaches detailed below share the same setting: a server stores the data, and users whose encrypted data is stored on the server want to search it.

2.2.1. Symmetric searchable encryption

The main feature of this model of private search is that the user who encrypts the data is the only one who can perform a search or an update. This is because the scheme relies on a private key that belongs to that user alone and is not shared with others. We will see later that this model can be extended to multi-user symmetric search based on broadcast encryption. The symmetric setting was introduced by Goldreich and Ostrovsky [13]: the user encrypts his data with a secret key, stores it on the untrusted server, and can later retrieve the encrypted data and decrypt it with the same key. Several symmetric searchable schemes have been proposed ([3], [10], [11], [12], [13]). The secure indexes of [10] and the look-up tables of [11] are among the most successful schemes so far. Recent works not only reduced complexity but also added search features such as conjunctive search [15].

2.2.2. Asymmetric searchable encryption

Some applications need more than symmetric searchable encryption. Consider a person who wants to search a public database for an encrypted document that she did not store herself. With a symmetric scheme this would be impossible unless she shared a secret with the person who encrypted the data. The asymmetric scheme introduced by Boneh et al. [14], by analogy with public-key cryptography, allows any user who has the public key to store data on the untrusted server, but only the holder of the secret key can perform a search to test for the occurrence of a word.
This scheme has since evolved both in complexity [16] and in search features [17], [18], which allow searching for a set of keywords at once.

2.2.3. PIR [19]: Private Information Retrieval

This approach differs from the previous schemes because the data stored on the outsourced servers is unencrypted. The focus is solely on letting the user retrieve documents without revealing the access pattern. Typical uses are following the news, accessing Google Alerts or getting updates without revealing the user's interests, and consequently preserving anonymity.

In the following, we focus on the first two approaches (symmetric and asymmetric) and give more details about the most successful schemes so far.

3. Searchable Encryption models

In this section we study the models that have marked searchable encryption. On the one hand, we illustrate two symmetric models: Goh's secure indexes [10] and the look-up table of Reza Curtmola et al. [11]; on the other hand, one asymmetric model: Boneh et al. [14]. We detail the steps used to encrypt, search and decrypt data, and then we discuss the degree of security of searchable encryption in general and of these schemes in particular, through a series of games between an attacker and a challenger.

3.1. Symmetric models

3.1.1. Goh's scheme

Goh introduces a new way to retrieve encrypted data, based on secure indexes. His algorithm relies largely on a probabilistic data structure called a Bloom filter. In this section, we first detail the Bloom filter structure, then we describe the steps of the algorithm, from the encryption phase to the search phase.

Bloom filter

Bloom filters are probabilistic data structures used to test whether an element belongs to a set. A Bloom filter has the following features:
- an element can be added but cannot be deleted;
- the more elements it holds, the more false positives it returns.

Let $A = \{a_1, \dots, a_n\}$ be a set of $n$ elements and $h_1, \dots, h_r$ be $r$ independent pseudorandom functions with values in $\{1, \dots, m\}$. We initialize all entries of a vector of size $m$ to zero; then, for each $a \in A$, we compute $h_i(a)$ for $1 \le i \le r$ and set the bit at each resulting position to 1. To test whether an element $b$ is a member of $A$, we only have to compute $h_1(b), \dots, h_r(b)$. If all the bits at these positions are equal to 1, then $b \in A$ (maybe); if at least one of them is equal to zero, then $b \notin A$. We say that $b$ may belong to $A$ because false positives exist: the hash functions can give the same positions for different elements.

The following case illustrates false positives: we insert a and b in the Bloom filter but not c. When we test whether c is in the structure, we compute the $h_i(c)$, and in this example every corresponding bit is equal to 1, which would normally imply that c is in the structure although we never inserted it.
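The Bloom filter described above can be sketched as follows. This is a minimal illustration under stated assumptions: the class name `BloomFilter` and the way the $r$ hash functions are derived (SHA-256 over the function number and the element, reduced modulo $m$, with positions numbered from 0) are choices made for this example, not part of Goh's paper.

```python
import hashlib

class BloomFilter:
    """Bit vector of size m with r hash functions; supports add and test only."""

    def __init__(self, m: int, r: int):
        self.m = m
        self.r = r
        self.bits = [0] * m          # all entries start at zero

    def _positions(self, element: str):
        # h_i(element) for i = 0..r-1: SHA-256 of "i:element", reduced modulo m
        for i in range(self.r):
            digest = hashlib.sha256(f"{i}:{element}".encode("utf-8")).digest()
            yield int.from_bytes(digest, "big") % self.m

    def add(self, element: str) -> None:
        for pos in self._positions(element):
            self.bits[pos] = 1       # bits are only ever set, never cleared

    def might_contain(self, element: str) -> bool:
        # True means "maybe present"; False means "definitely absent"
        return all(self.bits[pos] for pos in self._positions(element))

bf = BloomFilter(m=64, r=3)
bf.add("a")
bf.add("b")
print(bf.might_contain("a"), bf.might_contain("b"))  # True True

# A tiny filter filled with many elements answers "maybe" for nearly everything,
# which is the false-positive behaviour described above.
small = BloomFilter(m=8, r=2)
for i in range(200):
    small.add(f"word{i}")
print(sum(small.bits), "of", small.m, "bits set")
```

There are never false negatives: an inserted element always tests as "maybe present", because its positions were all set at insertion time.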

Encryption

Before explaining Goh's encryption, we must introduce two concepts used throughout this algorithm:
- Keygen: takes a private secret as input (e.g. a password) and generates a private key $K_{priv}$. We decompose this key into $r$ sub-keys such that $K_{priv} = (k_1, \dots, k_r)$.
- Trapdoor: let $f$ be a keyed pseudorandom function whose outputs select positions in a Bloom filter of size $m$. The trapdoor transforms a word $w$ into a sequence of hash values using the output of Keygen: $T_w = (f(w, k_1), \dots, f(w, k_r))$.

Let $D$ be a document with identifier $D_{id}$ and keywords $w_1, \dots, w_t$, which we want to encrypt according to Goh's algorithm. We follow these steps for each word $w_i$:
1. We generate the private key $K_{priv}$ using the Keygen algorithm.
2. We compute the trapdoor of the word: $T_{w_i} = (x_1, \dots, x_r) = (f(w_i, k_1), \dots, f(w_i, k_r))$.
3. We recompute the trapdoor a second time, using the identifier of the document rather than the private key: $(y_1, \dots, y_r) = (f(D_{id}, x_1), \dots, f(D_{id}, x_r))$. Using the identifier removes the correlation between two documents that have the same keywords.
4. We insert $y_1, \dots, y_r$ into the Bloom filter, i.e. we set the bits at these positions to 1.

Goh does not stop here: he improves secrecy by making the Bloom filter noisy. The result contains more false positives, but this blinding technique is used to mislead the attacker. It consists of setting additional bits to 1 at random positions in the structure. After this blinding, we send the following triplet to the server: $\langle D_{id}, BF, E(D) \rangle$, where $BF$ is the blinded Bloom filter and $E(D)$ is the encryption of the whole document.

Search phase

The search phase is simple: the user only has to rerun Keygen, compute the trapdoor associated with the word $w$, and send it to the server: $T_w = (f(w, k_1), \dots, f(w, k_r))$. For every document identifier in the database, the server computes the hashed values $f(D_{id}, x_i)$ of the trapdoor and tests whether all the corresponding bits are set in the Bloom filter associated with that document. The server then sends back all the documents for which it finds a match.

Goh's scheme is one of the best-performing schemes for retrieving encrypted data. Thanks to the Bloom filter structure, the search is pre-processed, so the search on the server side is asymptotically linear in the number of documents stored on the outsourced server.

3.1.2. Reza Curtmola et al. scheme

Reza Curtmola et al. [11] gave improved schemes for searchable symmetric encryption. The most significant improvement over previous schemes is the lower computation cost, which is asymptotically constant per returned document. In the following, we first present the searchable symmetric encryption approach (SSE) and the two phases used to encrypt and to search over the encrypted data. We then explain multi-user searchable symmetric encryption (M-SSE), in which an owner whose documents are stored according to SSE on an outsourced server wants to safely give other users access to these documents. These users can only perform search queries.

Encryption phase

Goh's scheme [10] links each document to its own index, whereas the scheme of Curtmola et al. links the whole set of documents to a single index. We have seen that Goh's scheme achieves a search cost linear in the number of documents thanks to the Bloom filter. In this scheme, the encryption phase is based on a combination of look-up tables, linked lists and arrays. We explain later why this scheme performs the search phase in asymptotically constant time.

The user wants to store his documents on the server. Let $D = \{D_1, \dots, D_n\}$ be the collection of documents and $\Delta = \{w_1, \dots, w_d\}$ be the set of unique words over $D$. The output of this phase is the encrypted collection of documents and the index that goes with this collection. The index is a combination of data structures (a look-up table and an array).

On the one hand, we build the identifier set $D(w_i)$, which contains the identifiers of all the documents that contain the keyword $w_i$; the identifier collection thus contains the identifier set of each keyword in $\Delta$. On the other hand, let $A$ be an array and $L_i$ a linked list that contains all the elements of the identifier set $D(w_i)$: each node of $L_i$ holds one document identifier, the key used to decrypt the next node and the address of the next node in $A$, and the encrypted nodes are stored in $A$. Furthermore, the look-up table $T$ contains a sequence of pairs; each pair contains an encrypted keyword and the pointer to the first identifier of $D(w_i)$ in the array $A$, concatenated with the key used to decrypt the next pointer in $A$. The encrypted keyword acts as a virtual address that locates the entry in $T$. These addresses are stored in an FKS dictionary [20]; as a consequence, the look-up time for an entry is constant. The encryption scheme is represented in the following figure:

[Figure: encryption scheme, from the input documents to the look-up table]

Search phase

To search for a word $w$, the user sends the server the trapdoor of $w$, which lets the server retrieve all the documents containing this word. The search phase works as follows:
- find the value associated with the trapdoor in the look-up table;
- retrieve the first identifier in the array, then decrypt the second part of the node using the key to find the pointer to, and the key for, the second node, and so on until all the identifiers of the documents containing the word are retrieved.
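The look-up table and encrypted linked lists described above can be sketched in Python. This is a simplified illustration in the spirit of the scheme, not the construction of [11]: HMAC-SHA256 stands in for the pseudorandom functions, nodes are masked with a one-time keystream instead of a real cipher, a Python dict replaces the FKS dictionary, the padding of the array and table is omitted, and the names `build_index`, `trapdoor`, `search` and the keys `y`, `z` are assumptions of this example.

```python
import hashlib
import hmac
import os
import random

END = 0xFFFFFFFF          # sentinel address marking the end of a list
NODE_LEN = 4 + 16 + 4     # document id | key of next node | address of next node

def prf(key: bytes, data: bytes) -> bytes:
    """Keyed pseudorandom function (HMAC-SHA256 is an assumption of this sketch)."""
    return hmac.new(key, data, hashlib.sha256).digest()

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def build_index(docs: dict, y: bytes, z: bytes):
    """docs maps a document id (int) to its set of keywords."""
    postings = {}                             # keyword -> list of document ids
    for doc_id, words in docs.items():
        for w in words:
            postings.setdefault(w, []).append(doc_id)
    total = sum(len(ids) for ids in postings.values())
    free = list(range(total))
    random.shuffle(free)                      # scatter the nodes of all lists
    array, table = [None] * total, {}
    for w, ids in postings.items():
        addrs = [free.pop() for _ in ids]
        keys = [os.urandom(16) for _ in ids]  # keys[j] masks node j, used once
        for j, doc_id in enumerate(ids):
            last = j == len(ids) - 1
            next_key = bytes(16) if last else keys[j + 1]
            next_addr = END if last else addrs[j + 1]
            node = doc_id.to_bytes(4, "big") + next_key + next_addr.to_bytes(4, "big")
            array[addrs[j]] = xor_bytes(node, prf(keys[j], b"node")[:NODE_LEN])
        # Table entry: (address of first node | its key), masked with a keyword PRF
        head = addrs[0].to_bytes(4, "big") + keys[0]
        table[prf(z, w.encode())] = xor_bytes(head, prf(y, w.encode())[:20])
    return array, table

def trapdoor(word: str, y: bytes, z: bytes):
    """Client side: table label and unmasking value for one keyword."""
    return prf(z, word.encode()), prf(y, word.encode())

def search(array, table, td):
    """Server side: follow the encrypted list, one node per matching document."""
    label, mask = td
    entry = table.get(label)
    if entry is None:
        return []
    head = xor_bytes(entry, mask[:20])
    addr, key = int.from_bytes(head[:4], "big"), head[4:]
    found = []
    while addr != END:
        node = xor_bytes(array[addr], prf(key, b"node")[:NODE_LEN])
        found.append(int.from_bytes(node[:4], "big"))
        key, addr = node[4:20], int.from_bytes(node[20:], "big")
    return found

y, z = os.urandom(32), os.urandom(32)
docs = {1: {"urgent", "meeting"}, 2: {"urgent"}, 3: {"holiday"}}
array, table = build_index(docs, y, z)
print(sorted(search(array, table, trapdoor("urgent", y, z))))  # [1, 2]
print(search(array, table, trapdoor("absent", y, z)))          # []
```

The server does one table look-up and then one decryption per matching document, which is why the work per returned document is constant. It also shows why updates are costly: adding a document touches the lists of all its keywords.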

Multi-user searchable symmetric encryption

So far, none of the symmetric searchable encryption schemes we have seen provides multi-user search: only the user who stored the encrypted data can search it and retrieve documents. Multi-user search allows the owner of the encrypted data to share with other users the right to search over it.

[Figure: the data owner stores the index on the server; User 1: permission granted; User 2: permission revoked]

In M-SSE, the data owner can add a user to, or revoke a user from, the group of privileged users. To add a user, the owner gives him a key. The group of revoked users changes dynamically, so even if a user holds a key, he cannot retrieve documents with his trapdoor once he has been revoked. M-SSE combines the single-user SSE-1 scheme with broadcast encryption [21]. The (honest) server enforces revocation by checking, for each search query, whether the user still holds the privilege. Each time the owner adds or removes a user, he generates a new value and sends it to the server, which uses it to verify the user's permission. The broadcast encryption used in this scheme is explained in detail in [21].

3.2. Asymmetric model: Boneh et al.

D. Boneh et al. [14] consider situations in which a third party should learn nothing more than what it is supposed to know, i.e. the trapdoor, and yet has to test the encrypted data for a keyword without any other leak of information. Boneh's article uses a mail server to illustrate this situation: Bob sends Alice an email encrypted with Alice's public key; once the gateway receives the encrypted email from Bob, it routes it according to its flag (e.g. flag = "urgent"). For instance, if Alice wants to receive only the urgent mails on her smartphone, she sends a trapdoor computed from her private key and the word "urgent" to the gateway, to allow it to search for this keyword.
This lets the gateway check whether the flag matches the trapdoor, and forward the email if it does. We first describe the steps of this scheme, then we detail the first construction given by Boneh et al. for the encryption phase on the sender's side, and finally the search phase.

Encryption phase on the sender's side

Boneh presents two different constructions based on the same algorithms but with different performance implications. In the following, we present the general scheme of Boneh's encryption, then we detail his first, efficient construction, which uses bilinear maps.

The sender wants to send an encrypted message $M$ with keywords $W_1, \dots, W_k$. The scheme relies on these algorithms:
- Keygen($s$): takes a secret $s$ known by the user (e.g. a password) and generates a public/private key pair that we note $(A_{pub}, A_{priv})$;
- PEKS($A_{pub}$, $W$): a special encryption of the word $W$ using the public key $A_{pub}$.

The sender then sends the following message: $\langle E_{A_{pub}}(M), PEKS(A_{pub}, W_1), \dots, PEKS(A_{pub}, W_k) \rangle$, where the set $\{W_1, \dots, W_k\}$ contains only the keywords chosen for the message.

Construction using bilinear maps

This construction relies on properties of bilinear maps, and its security rests on the Bilinear Diffie-Hellman problem. Let $G_1$ and $G_2$ be two groups of prime order $p$ and $e : G_1 \times G_1 \to G_2$ a bilinear map that satisfies a vital property: for any integers $x, y$ and any $g \in G_1$, we have $e(g^x, g^y) = e(g, g)^{xy}$. For the encryption, the user needs two hash functions, $H_1 : \{0,1\}^* \to G_1$ and $H_2 : G_2 \to \{0,1\}^{\log p}$. These functions are known by the sender, the gateway and the receiver.

Keygen works as follows: the parameter $s$ determines the order $p$ of the groups, then the algorithm picks a random number $\alpha \in \mathbb{Z}_p^*$ and a generator $g$ of $G_1$, such that $A_{pub} = (g, h = g^\alpha)$ and $A_{priv} = \alpha$. Once Keygen has run, Bob knows the public key $A_{pub}$ and can generate a PEKS for each keyword $W$: $PEKS(A_{pub}, W) = (g^r, H_2(e(H_1(W), h^r)))$, with $r \in \mathbb{Z}_p^*$ a random number. Bob then sends the whole message $A = \langle E_{A_{pub}}(M), PEKS(A_{pub}, W_1), \dots, PEKS(A_{pub}, W_k) \rangle$.

The message $A$ is received by the gateway, which needs information to route the message according to its importance. Alice therefore sends the gateway a trapdoor $T_W$ containing encrypted information that lets the gateway know whether the keyword $W$ occurs in the encrypted message $A$. The gateway then routes the mail if it finds this keyword in $A$. We give more details about this communication round in the following section.

The search phase

Alice sends a trapdoor to the gateway for the reason given above. The trapdoor is nothing more than the hash of the searched word raised to the power $\alpha$: $T_W = H_1(W)^\alpha$. The gateway then tests each PEKS in the encrypted message $A$ against the trapdoor. Let $PEKS(A_{pub}, W') = (B, C)$; the gateway checks whether $H_2(e(T_W, B)) = C$. If $W' = W$, bilinearity gives $e(T_W, B) = e(H_1(W)^\alpha, g^r) = e(H_1(W), g)^{\alpha r} = e(H_1(W), h^r)$, so the equality holds. Using this crucial feature of the bilinear map, the gateway can verify the equality without any leak of information, and finally sends the email to Alice.

3.3. Security of searchable encryption solutions

3.3.1. Security solutions

So far, we have not justified the security achieved by the solutions presented.
In other words, even if a model provides encryption security, we should verify whether it also provides searchable encryption security. D. Song [3] gave three main properties that any scheme should satisfy:
- query isolation;
- controlled searching;
- hidden queries.

Reza Curtmola et al. [11] reformulate these requirements. They argue that these definitions only achieve non-adaptive security, i.e. the security models of the schemes seen earlier do not take the outcome of previous queries into account. R. Curtmola et al. showed in their article that, for a higher degree of security, the security model should be adaptive: in addition to the three properties above, the model should account for an adversary who uses the outcome of previous queries (the search pattern).

We should point out that all the schemes seen so far leak the access pattern but nothing else: the server does not learn the searched word, but it can tell which documents contain it. These schemes reveal nothing beyond the encrypted search result. R. Curtmola et al. therefore define a slightly stronger notion of security for symmetric searchable encryption: the scheme should not leak any information beyond the outcome and the pattern of the searches. Since the same word always generates the same trapdoor (trapdoors are deterministic in all the schemes seen before), the whole search history has to be taken into account.

The only work that presents a model hiding both the access pattern and the search pattern from the untrusted third party is that of Goldreich and Ostrovsky on software protection based on oblivious RAMs [13]. It hides everything from the server, but at the cost of poly-logarithmic overhead in all parameters (number of rounds, server storage and computation), which makes it unworkable in practice.

Each model comes with a proof that its construction is secure; Goh and Boneh et al. each define a game between a challenger and an attacker. In most cases the attacker is the server, which is assumed to be honest but curious (it cannot modify the result of the search).
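The remark above about deterministic trapdoors can be made concrete with a small Python illustration. It is a sketch under assumptions: HMAC-SHA256 stands in for the trapdoor function of any of the schemes, and the key and query list are made up for the example.

```python
import hashlib
import hmac
from collections import Counter

def trapdoor(key: bytes, word: str) -> str:
    """Deterministic trapdoor: the same word always gives the same value."""
    return hmac.new(key, word.encode("utf-8"), hashlib.sha256).hexdigest()

client_key = b"demo-key"                     # hypothetical client secret
queries = ["urgent", "invoice", "urgent", "holiday", "urgent"]

# What an honest-but-curious server records: opaque trapdoors, in order.
server_log = [trapdoor(client_key, w) for w in queries]

# Without learning any word, the server can still group identical queries.
repeats = Counter(server_log)
most_common_count = repeats.most_common(1)[0][1]
print(len(repeats), most_common_count)       # 3 3
```

The server sees three distinct trapdoors and knows that one of them was used three times: this is the search pattern that an adaptive security model has to take into account.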
In all the modeled game below, we give the attacker some power, we see if he can override the security of the approach. Goh security model: Goh proposed in his article a Semeantic Security against adaptive Chosen Keyword Attack (IND-CKA): Adversary cannot deduce a document s contents from its index or from what is already known from previous query results The game works as follows: Attacker Challenger Create a number Create a set of q words of subset from Create an index for each subset in Setup phase Until here we suppose that the adversary has knowledge of all the indexes but without knowing the association between the index and his subset in, that s why in the next step, the adversary will ask for trapdoors to create the links between index and the elements in the subsets. Choose Generate a Trapdoor Determine such that time Query phase 12

Here the adversary stops querying and decides to begin the challenge Up to now, the adversary asked for a Trapdoor for the elements in except those in a set picked such that generating at random from and and Pick randomly create an index for Challenge has to guess runs for time and decides if or. The A s advantage of winning is: is a negligible function By reduction, breaking the security of Goh s scheme implies breaking the security of a pseudorandom function. In this game we have shown that given the adversary the following power: Access to all document in plaintext and all the indexes associated Querying for any Trapdoors before the challenge. The adversary cannot deduce which document is encoded in the index without having the trapdoors associated. D.Boneh security model: Boneh proposed in his article two models of adversaries associated to the two constitutions. In the following, we will expose the game for the first constitution based on bilinear maps. The goal of the games is: alone without the associated Trapdoor doesn t reveal any information about the word. The game works as follows: Attacker Challenger Generate Choose any word Generate the Trapdoor Determine the PEKS( ) associated to Here the adversary stops querying and decides to begin the challenge q time Up to now, the adversary asked for a Trapdoor for a number of words. picks tow words such that never asked for their trapdoors. has to guess Pick randomly create 13

The adversary wins the game if $b' = b$. As shown above, the advantage of $A$ is $Adv_A = |\Pr[b = b'] - 1/2| \le \epsilon$, where $\epsilon$ is a negligible function. We say that PEKS is semantically secure against an adaptive chosen keyword attack. Indeed, in the game the attacker was given the following powers:
access to all keywords in plaintext and to their PEKS,
querying for any trapdoors before the challenge.
We remark that the role played by the index in Goh's model is similar to that played by the PEKS in Boneh's model. In both, we should notice that the trapdoors do not leak any information about the word searched for.

4. Comparison of schemes

We add to this comparison the schemes of [3], [12] and [13], symmetric schemes that we have not studied in detail, in order to give a global picture of the work done up to now.

Scheme                        Type        Access pattern hidden  Search type  Adaptive adversary  Update needs recalculation
Song [3]                      Symmetric   No                     Linear       No                  No
Goh                           Symmetric   No                     Pre. index   No                  No
Boneh [14]                    Asymmetric  No                     Linear       No                  No
SSE-1                         Symmetric   No                     Pre. index   No                  Yes
SSE-2                         Symmetric   No                     Pre. index   Yes                 Yes
Chang & Mitzenmacher [12]     Symmetric   No                     Pre. index   No                  Yes
Ostrovsky & Goldreich [13]    Symmetric   Yes                    Linear       Yes                 No
Ostrovsky & Goldreich-Light   Symmetric   Yes                    Linear       Yes                 No

Search complexity, server storage, number of rounds and query size are measured in terms of: the number of encrypted documents; the number of indexes; the number of documents containing the searched word; the number of unique words (keywords); and a constant c.

We emphasize that none of these schemes natively supports substring matching and that, in addition, they are case-sensitive by nature. Some models can nevertheless be adapted to be more flexible. Indeed, Goh, SSE and Boneh can support these search options by adding all word combinations to the Bloom filter, to the look-up table or to the encrypted mail, respectively, but the storage on the server side will increase linearly.
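The storage cost of adding word combinations to a Bloom filter can be illustrated with a small sketch. This is not Goh's construction (there, the client derives per-document codewords from a trapdoor); it is a simplified keyed Bloom filter, and the names `KeyedBloomIndex`, `variants`, `build_index` and `search`, the filter size and the minimum substring length are all assumptions made for illustration.

```python
import hashlib
import hmac


class KeyedBloomIndex:
    """Toy per-document index: a Bloom filter whose bit positions come from HMAC-SHA256."""

    def __init__(self, key, size_bits=1024, num_hashes=4):
        self.key = key
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, entry):
        # One keyed hash per "hash function", separated by a one-byte counter.
        for i in range(self.num_hashes):
            msg = bytes([i]) + entry.encode("utf-8")
            digest = hmac.new(self.key, msg, hashlib.sha256).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, entry):
        for pos in self._positions(entry):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, entry):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(entry))


def variants(word, min_len=3):
    """Lower-cased word plus all of its substrings of at least min_len characters."""
    w = word.lower()
    out = {w}
    for i in range(len(w)):
        for j in range(i + min_len, len(w) + 1):
            out.add(w[i:j])
    return out


def build_index(key, words, flexible):
    """Return (index, number of distinct entries inserted)."""
    entries = set()
    for word in words:
        entries |= variants(word) if flexible else {word}
    index = KeyedBloomIndex(key)
    for entry in entries:
        index.add(entry)
    return index, len(entries)


def search(index, query, flexible):
    """Case-insensitive substring lookup when flexible, exact lookup otherwise."""
    return index.might_contain(query.lower() if flexible else query)
```

For the two words "Cloud" and "Storage", the exact index stores 2 entries while the flexible one stores 21, so the filter has to be sized for many more entries to keep the same false-positive rate.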
We also point out that updating one document or a collection of documents on the server under the SSE scheme is very difficult: the look-up table and the whole index that goes with it have to be rebuilt. Finally, on the one hand, the SSE-1/2 schemes are the winners in terms of search computation, but their big drawback is the updating phase. On the other hand, the search options (proximity, natural search, substring matching, case-insensitivity) can be improved with later works such as Conjunctive, Subset, and Range Queries on Encrypted Data [17].

Conclusion

Searchable encryption represents a new concept for improving outsourced storage servers such as Cloud Computing infrastructures. Indeed, Cloud Computing [1] has become a new way to solve problems of this kind by providing computing services on infrastructures via Internet access. One of the main current uses of Cloud Computing is data storage (documents, mails, etc.) delivered by many firms such as Amazon, Google and Microsoft. On the one hand, Cloud Computing storage has several advantageous features, such as total access to data from everywhere, at any time, with scalability and reliability. On the other hand, Cloud Computing users do not need to know either the location of the physical infrastructures that perform the task or the way their data are stored or computed. What matters most is the outcome of their search queries. As a consequence, many of these users restrict their usage for fear of losing their privacy; most of these services should therefore provide secure access and computation to reassure their clients.

Much work has been done in the searchable encryption field, and there are a number of efficient schemes that are feasible in practice. We have seen that the SSE scheme of Reza Curtmola et al. is one of the most computationally feasible approaches that satisfy the searchable security rules. We have also seen the possibility of extending the single-user scheme to a multi-user scheme. Throughout this bibliographical study we have emphasized the importance of this concept for securing our data on outsourced servers [2] and for preserving our anonymity.

During my internship, I will have to improve the existing solutions and propose an implementable solution for secure data storage in the Cloud. Then I will analyze the possibility of combining a number of schemes according to different storage scenarios, and finally I will study the feasibility and scalability of the chosen solution.

Bibliography

[1] Brian Hayes. Cloud computing. Communications of the ACM, (7):9-11, July 2008.
[2] Isaac Agudo, David Nuñez, Gabriele Giammatteo, Panagiotis Rizomiliotis and Costas Lambrinoudakis. Cryptography Goes to the Cloud. In Secure and Trust Computing, Data Management, and Applications, STA 2011 Workshops: IWCS 2011 and STAVE 2011, Loutraki, Greece, June 28-30, 2011, Proceedings.
[3] D. Song, D. Wagner and A. Perrig. Practical Techniques for Searches on Encrypted Data. IEEE Symposium on Security and Privacy (S&P), 2000, pp. 44-55.
[4] M. Crispin. Internet Message Access Protocol, Version 4. RFC 1730, December 1994.
[5] PostgreSQL DBMS. http://www.postgresql.org/.
[6] MySQL DBMS. http://www.mysql.com/.
[7] SQLite DBMS. http://www.sqlite.org/.
[8] FUSE. http://fuse.sourceforge.net/.
[9] Samba Homepage. http://www.samba.org/.
[10] Eu-Jin Goh. Secure indexes. Cryptology ePrint Archive, Report 2003/216, March 2004.
[11] Reza Curtmola, Juan Garay, Seny Kamara and Rafail Ostrovsky. Searchable Symmetric Encryption: Improved Definitions and Efficient Constructions, 2006.
[12] Y. C. Chang and M. Mitzenmacher. Privacy preserving keyword searches on remote encrypted data. In Applied Cryptography and Network Security Conference (ACNS), 2005.
[13] O. Goldreich and R. Ostrovsky. Software protection and simulation on Oblivious RAMs. Journal of the ACM, 43(3):431-473, May 1996.
[14] Dan Boneh, Giovanni Di Crescenzo, Rafail Ostrovsky and Giuseppe Persiano. Public Key Encryption with Keyword Search. In Proceedings of Eurocrypt 2004, 2004.
[15] P. Golle, J. Staddon and B. Waters. Secure conjunctive keyword search over encrypted data. In: M. Jakobsson, M. Yung, J. Zhou (eds.), ACNS 2004. LNCS, vol. 3089, pp. 31-45. Springer, Heidelberg (2004).
[16] M. Abdalla, M. Bellare, D. Catalano, E. Kiltz, T. Kohno, T. Lange, J. Malone-Lee, G. Neven, P. Paillier and H. Shi. Searchable encryption revisited: Consistency properties, relation to anonymous IBE, and extensions. In: V. Shoup (ed.), CRYPTO 2005. LNCS, vol. 3621, pp. 205-222. Springer, Heidelberg (2005).
[17] Dan Boneh and Brent Waters. Conjunctive, subset, and range queries on encrypted data. Theory of Cryptography, Springer, 2007.
[18] D. Park, K. Kim and P. Lee. Public key encryption with conjunctive field keyword search. In: C.H. Lim, M. Yung (eds.), WISA 2004. LNCS, vol. 3325, pp. 73-86. Springer, Heidelberg (2005).
[19] Christian Cachin, Silvio Micali and Markus Stadler. Computationally private information retrieval with polylogarithmic communication. In EUROCRYPT 99, pages 402-414, 1999.
[20] M.L. Fredman, J. Komlós and E. Szemerédi. Storing a sparse table with O(1) worst case access time. J. ACM, 31(3):538-544, 1984.
[21] A. Fiat and M. Naor. Broadcast encryption. In Douglas R. Stinson, editor, Proc. CRYPTO 93, volume 773 of Lecture Notes in Computer Science, pages 480-491. Springer-Verlag, 1994.