Efficient Similarity Search over Encrypted Data




UT Dallas, Erik Jonsson School of Engineering & Computer Science. Efficient Similarity Search over Encrypted Data. Mehmet Kuzu, Saiful Islam, Murat Kantarcioglu.

Introduction: A client issues similarity searches over encrypted data held by an untrusted server, which returns the selected encrypted items. This requires efficient and secure similarity searchable encryption protocols.

Problem Formulation: BuildIndex(K, D) extracts the feature set of each data item in D and forms a secure index I with key K. Trapdoor(K, f) generates a trapdoor T for a specific feature f with key K. Search(I, T) performs a search on I with the trapdoor T of feature f and outputs an encrypted collection C.

Locality Sensitive Hashing: A family of functions H is said to be (r_1, r_2, p_1, p_2)-sensitive if, for any x, y ∈ F and for any h ∈ H, Pr[h(x) = h(y)] ≥ p_1 whenever dist(x, y) ≤ r_1, and Pr[h(x) = h(y)] ≤ p_2 whenever dist(x, y) ≥ r_2. A composite function g: (g_1, ..., g_λ) can be formed to push p_1 closer to 1 and p_2 closer to 0 by adjusting the LSH parameters (k, λ).
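The effect of the parameters (k, λ) can be illustrated numerically. The sketch below assumes the usual AND-then-OR amplification, in which each g_i concatenates k base hashes and two items are considered a match if any of the λ composites collides; the transcript does not spell out the construction, so this is an assumption. The function name is hypothetical, and the numeric values are the ones listed in the experimental setup of this talk.

```python
def amplified_collision_probability(p: float, k: int, lam: int) -> float:
    """Probability that at least one of `lam` composite functions collides,
    where each composite is an AND of `k` base hashes that each collide
    with probability `p` (assumed AND-then-OR construction)."""
    return 1.0 - (1.0 - p ** k) ** lam


# Parameters taken from the experimental setup in this talk.
k, lam = 5, 37
p1, p2 = 0.85, 0.01

near = amplified_collision_probability(p1, k, lam)  # items within r_1
far = amplified_collision_probability(p2, k, lam)   # items beyond r_2

print(f"near pairs match with probability at least {near:.6f}")
print(f"far pairs match with probability at most  {far:.2e}")
```

Under these assumptions the near-pair probability moves very close to 1 while the far-pair probability stays close to 0, which is the amplification the slide describes.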

Security Goals: The access pattern (A_p) is the set of identifiers of the data items that are in the result set of a specific query. The similarity pattern (S_p) is the relative similarity among distinct queries.

Secure LSH Index: The content of any bucket B_k is a bit vector V_{B_k}, stored in the index as the pair [Enc_id(B_k), Enc_payload(V_{B_k})] ∈ I.

Secure Search Scheme (shared information): K_coll is the secret key for data collection encryption; K_id and K_payload are the secret keys for index construction; ρ is the metric space translation function; g is the locality sensitive function.
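How these pieces might fit together can be sketched in a few lines. This is a toy under loud assumptions, not the scheme from the talk: `toy_g` is a hypothetical stand-in for the locality sensitive function g (it only buckets by prefix and suffix), HMAC-SHA256 plays the role of Enc_id, and a hash-derived XOR keystream stands in for Enc_payload. Document identifiers are assumed to be 0..n-1, and all function names are illustrative.

```python
import hashlib
import hmac


def prf(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()


def xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (stand-in for a real cipher); XOR is its own inverse."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += prf(key, nonce + counter.to_bytes(4, "big"))
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def toy_g(feature: str) -> list[str]:
    """Hypothetical stand-in for g: one prefix bucket and one suffix bucket."""
    return [f"pre:{feature[:3]}", f"suf:{feature[-3:]}"]


def build_index(k_id: bytes, k_payload: bytes, docs: dict[int, list[str]]) -> dict[bytes, bytes]:
    """Map Enc_id(bucket) -> Enc_payload(bit vector of documents in the bucket)."""
    buckets: dict[str, int] = {}
    for doc_id, features in docs.items():  # doc ids assumed to be 0..n-1
        for feature in features:
            for label in toy_g(feature):
                buckets[label] = buckets.get(label, 0) | (1 << doc_id)
    width = (len(docs) + 7) // 8
    index = {}
    for label, bits in buckets.items():
        enc_id = prf(k_id, label.encode())
        index[enc_id] = xor_stream(k_payload, enc_id, bits.to_bytes(width, "big"))
    return index


def trapdoor(k_id: bytes, feature: str) -> list[bytes]:
    return [prf(k_id, label.encode()) for label in toy_g(feature)]


def search(index: dict[bytes, bytes], t: list[bytes]) -> list[tuple[bytes, bytes]]:
    """Server side: return the encrypted payloads of the requested buckets."""
    return [(enc_id, index[enc_id]) for enc_id in t if enc_id in index]


def rank(k_payload: bytes, hits: list[tuple[bytes, bytes]], n_docs: int) -> list[int]:
    """Client side: decrypt bit vectors and rank documents by bucket matches."""
    scores = [0] * n_docs
    for enc_id, payload in hits:
        bits = int.from_bytes(xor_stream(k_payload, enc_id, payload), "big")
        for d in range(n_docs):
            scores[d] += (bits >> d) & 1
    return sorted((d for d in range(n_docs) if scores[d]), key=lambda d: -scores[d])


k_id, k_payload = b"demo-id-key", b"demo-payload-key"
docs = {0: ["secure", "index"], 1: ["search", "cloud"], 2: ["secret"]}
index = build_index(k_id, k_payload, docs)
result = rank(k_payload, search(index, trapdoor(k_id, "secure")), len(docs))
print(result)
```

In this sketch the server only ever sees keyed bucket identifiers and encrypted bit vectors; the client decrypts the returned vectors and ranks documents by how many buckets they share with the query.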

Secure Search Scheme: Trapdoor construction for feature f_i:

Multi-Server Setting: The basic search scheme reveals similarity and access patterns. It is desirable to separate the leaked information to mitigate potential attacks. A multi-server setting also enables lighter clients.

One Round Search Scheme: This scheme is built on Paillier encryption, which is semantically secure and additively homomorphic.

One Round Search Scheme: Bob performs homomorphic addition on the payloads of the trapdoor components.
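The additive property that this step relies on can be shown with a textbook Paillier toy. The sketch below is not the protocol from the talk: it uses tiny hard-coded primes (insecure, for illustration only), the common simplification g = n + 1, and hypothetical function names. It encrypts two bit vectors entry by entry and adds them under encryption, so that decryption yields per-document match counts.

```python
import math
import random


def keygen(p: int, q: int):
    """Textbook Paillier key generation with g = n + 1 (toy primes only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse, Python 3.8+
    return n, (lam, mu)


def encrypt(n: int, m: int, rng: random.Random) -> int:
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, sk, c: int) -> int:
    lam, mu = sk
    n2 = n * n
    return (((pow(c, lam, n2) - 1) // n) * mu) % n


def add(n: int, c1: int, c2: int) -> int:
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % (n * n)


rng = random.Random(7)
n, sk = keygen(1009, 1013)

# Two hypothetical bucket payloads (bit vectors over three documents).
v1, v2 = [1, 0, 1], [1, 1, 0]
c1 = [encrypt(n, b, rng) for b in v1]
c2 = [encrypt(n, b, rng) for b in v2]

summed = [add(n, a, b) for a, b in zip(c1, c2)]
scores = [decrypt(n, sk, c) for c in summed]
print(scores)
```

The party performing the addition never decrypts anything; only the holder of the secret key learns the summed scores.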

Error Aware Keyword Search: Typographical errors are common in both queries and data sources. In this context, the data items are documents, the features are the words in a document, and the query feature is a keyword. Bloom filter encoding enables efficient space translation for approximate string matching.
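A minimal sketch of such an encoding follows. It assumes that each word is split into padded character bigrams and that every bigram is hashed into the filter; the transcript does not state how words are decomposed, so the bigram choice, the SHA-256 based hashing, and the function names are all assumptions. The filter size and number of hash functions are the ones listed in the experimental setup of this talk.

```python
import hashlib

M_BITS, N_HASHES = 500, 15  # sizes from the experimental setup in this talk


def bigrams(word: str) -> set[str]:
    """Character bigrams of the word, padded so the first and last letters count."""
    padded = f"_{word}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(word: str) -> set[int]:
    """Set of bit positions that the word switches on in the Bloom filter."""
    positions = set()
    for gram in bigrams(word):
        for seed in range(N_HASHES):
            digest = hashlib.sha256(f"{seed}:{gram}".encode()).digest()
            positions.add(int.from_bytes(digest[:8], "big") % M_BITS)
    return positions


def jaccard_distance(a: set[int], b: set[int]) -> float:
    return 1.0 - len(a & b) / len(a | b)


original = bloom_encode("similarity")
typo = bloom_encode("similarty")      # one dropped letter
unrelated = bloom_encode("encryption")

d_typo = jaccard_distance(original, typo)
d_unrelated = jaccard_distance(original, unrelated)
print(f"typo: {d_typo:.2f}, unrelated: {d_unrelated:.2f}")
```

Because a typo changes only a few bigrams, most set bits are shared and the encoded words stay close in Jaccard distance, while unrelated words overlap only by chance.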

Error Aware Keyword Search: An elegant locality sensitive family, MinHash, has been designed for Jaccard distance; it is (r_1, r_2, 1 - r_1, 1 - r_2)-sensitive.
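The sensitivity parameters follow from the fact that two sets share a MinHash value with probability equal to their Jaccard similarity, that is, 1 minus their Jaccard distance. The sketch below illustrates this empirically; it simulates random permutations with seeded SHA-256 hashes, which is an assumption made for convenience, and the function names are hypothetical.

```python
import hashlib


def minhash_signature(items: set[int], num_hashes: int) -> list[int]:
    """One MinHash value per simulated permutation (seeded SHA-256 ordering)."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(hashlib.sha256(f"{seed}:{x}".encode()).digest()[:8], "big")
            for x in items
        ))
    return signature


def estimated_similarity(sig_a: list[int], sig_b: list[int]) -> float:
    """Fraction of agreeing components, an estimate of Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


# Two sets of bit positions with exact Jaccard similarity 80 / 120 = 2/3.
a = set(range(0, 100))
b = set(range(20, 120))
exact = len(a & b) / len(a | b)

sig_a = minhash_signature(a, 200)
sig_b = minhash_signature(b, 200)
estimate = estimated_similarity(sig_a, sig_b)
print(f"exact similarity {exact:.3f}, MinHash estimate {estimate:.3f}")
```

With more hash functions the estimate concentrates around the exact similarity, which is why a single MinHash collides with probability at least 1 - r_1 for sets within distance r_1 and at most 1 - r_2 for sets beyond distance r_2.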

Experimental Setup: A sample corpus of 5000 e-mails is constructed from the publicly available Enron e-mail dataset. Words in the e-mails are embedded into a 500-bit Bloom filter with 15 hash functions. A (0.45, 0.8, 0.85, 0.01)-sensitive family is formed from MinHash to tolerate typos. Common typos are introduced into the queries 25% of the time. Default parameters: number of documents: 5000, number of features: 5000, k: 5, λ: 37.

Retrieval Evaluation: Ranking limits the retrieval of irrelevant items.

Performance Evaluation (Single Server): An increase in k and a decrease in λ have similar effects. A decrease in λ leads to smaller trapdoors.

Performance Evaluation (Single Server): As n_d increases, the number of matching documents and the size of the transferred bit vectors both grow.

Performance Evaluation (Multi-Server): The transfer of homomorphic addition results between servers is the main bottleneck.

Conclusion: We proposed an LSH-based secure index and search scheme to enable fast similarity search over encrypted data. We provided a rigorous security definition and proved the security of the scheme to ensure the confidentiality of the sensitive data. The efficiency of the proposed scheme was verified through empirical analysis.

Thanks! Questions?