Verifiable Delegation of Computation over Large Datasets




Verifiable Delegation of Computation over Large Datasets
Siavosh Benabbas (University of Toronto), Rosario Gennaro (IBM Research), Yevgeniy Vahlis (AT&T)

Cloud Computing
The client sends data D and code F to the cloud and receives Y, which should equal F(D).
The cloud could be malicious or arbitrarily buggy (which is the same as malicious)!
Goal: efficiently verify that Y = F(D).

Cloud Computing: What is efficient verification?
Option 1: the algorithm F and the data D are small, but F(D) takes many steps.
For example: D = N = pq, and F tries all prime factors until p, q are found.
Efficient verification can be linear in the size of F and D.
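Option 1 can be illustrated with the factoring example: finding p and q is slow, but checking a claimed answer takes one multiplication. The sketch below is only an illustration; the function names are made up here, and trial division stands in for the expensive computation F.

```python
def slow_factor(n: int) -> tuple[int, int]:
    """The expensive computation F: trial division until a factor is found."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p, n // p
        p += 1
    raise ValueError("n has no nontrivial factor")


def verify_factoring(n: int, p: int, q: int) -> bool:
    """Cheap check: accept iff p and q are nontrivial factors of n."""
    return 1 < p < n and 1 < q < n and p * q == n


if __name__ == "__main__":
    n = 101 * 103
    p, q = slow_factor(n)          # many steps on the server
    print(verify_factoring(n, p, q))  # one multiplication on the client
```

The check does not test that p and q are prime; it only confirms that the returned pair multiplies to N.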

Cloud Computing: What is efficient verification?
Option 2: D is very big, and computing F(D) is almost linear in D.
Plenty of examples: mining medical records; looking up records (PIR); making predictions based on trained machine learning models.
Linear verification is not good enough: verification needs to be (very) sublinear in D.

[GGP, CKV, AIK]: Any function can be verifiably delegated in the sense of Option 2, assuming Fully Homomorphic Encryption.
1. FHE will become practical any moment. In the meantime, can we do VC without it?
2. [GGP, CKV, AIK] require that a malicious server does not learn whether it was successful in cheating, which is a significant restriction in practice.

Our Results
A new verifiable delegation scheme for polynomials
Delegate functions of the form p(x) = c_0 + c_1 x + c_2 x^2 + ... + c_d x^d
The degree d is arbitrarily large
Extends* to multivariate polynomials
Adaptive security: the server learns if he was successful
Constant communication overhead and client work (strict poly-time)
Constant size assumption
Non-crypto applications: keyword search, proofs of retrievability
Verifiable databases
A client can outsource dictionaries (i_1, v_1), ..., (i_n, v_n)
Make verifiable retrieval queries: Get(i)
Update queries: Add(i, v), Remove(i), Update(i, v)
In the line of work on authenticated data structures and memory checkers

Prior Work
A long series of works relates to this problem.
Interactive Proofs [B, GMR]
Probabilistically Checkable Proofs: a computation can be associated with a (potentially very long) proof of correctness. Verifying an NP statement can take time independent of the size of the statement. The verifier queries bits of the proof, assuming the prover honestly provides them.
Efficient Arguments/CS Proofs [K, M]: the prover commits to the PCP proof; the verifier queries bits and verifies. The statement F(x) = y must be short, so this does not deal well with large data.
All schemes above are interactive, except for Micali's CS proofs, which are made non-interactive in the random oracle model.
Memory checkers [BlumEvansGemmellKannanNaor91, Ajtai02, GemmellNaor03, NaorRothblum05, DworkNaorRothVaik09, ...]: a different model, where the server can only retrieve array values and the goal is to minimize the number of queries. Our solution is not a good memory checker (because the server works hard), but is much more efficient in communication and client work.

VERIFIABLE DELEGATION OF POLYNOMIALS

Delegating a polynomial
What does it mean to delegate a polynomial p(x) = a_0 + a_1 x + ... + a_d x^d?
There is a public key and a short secret key SK, with SK << d.
We only want verification.
On input x, a compiled query is sent; the response consists of a value Y and a certificate C.
Goal: be convinced that Y = p(x), or output reject.

Our main tool
Algebraic PRFs with trapdoor-efficient algebraic operations
A pseudorandom function F is a family of functions where F_K(·) is indistinguishable from a random function R(·).
Algebraic PRF: the range of F_K(·) forms an abelian group with a public generator g.
F is not a homomorphism! But given F_K(x), F_K(y), one can compute F_K(x)·F_K(y). (This is trivial.)

Trapdoor Efficiency
Given a range (0, ..., n) and values (x, x^2, ..., x^n), one can compute [formula] using the algebraic property.
Trapdoor efficiency: given (K, x), it is easy to compute Y (in time sublinear in n).
More generally: other functions of F_K(0), ..., F_K(n).

Back to VC
Given coefficients a_0, ..., a_d, we want to delegate p(x) = a_0 + a_1 x + ... + a_d x^d.
Secrecy of a_0, ..., a_d can be achieved using (singly) homomorphic encryption.
Construction:
Choose random c and compute masking coefficients [formula].
Upload [formula] and [formula].
To answer query x, the server computes [formula] and returns (C, P(x)).

Verification
Verifier's key: PRF key K, masking coefficient c.
Recall that the server is given [formula].
The server has (in the exponent) coefficients of [formula].
An honest server sends [formula] and Y = P(x).
Verifier checks: [formula]
If R was random, this breaks a secure MAC.
To cheat, the adversary has to find W, Y [formula].

Efficiency
If R were random, the client would have to remember r_0, ..., r_d.
Easy to solve using any PRF (in fact, we already did that): now the client only remembers the PRF key.
Even if a PRF is used, the verifier needs to check efficiently: [formula]
Trapdoor efficiency allows exactly that! Given (K, x), one can compute R(x) in time sublinear in d.

How? From Strong-DDH
Strong-DDH: [formula] is indistinguishable from random.
The PRF is: [formula]
Efficiency: need only one exponentiation because: [formula]
Multivariate: generalizes Naor-Reingold.
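The "one exponentiation" claim can be illustrated with a geometric series. The sketch below assumes a PRF of the form F_K(i) = g^(k0·k1^i) with key K = (k0, k1); this form is an assumption made for illustration. Under it, ∏ F_K(i)^(x^i) = g^(k0·Σ (k1·x)^i), and the sum has a closed form, so the key holder needs a single exponentiation of g. The toy group parameters are also assumptions.

```python
# Toy parameters (assumption): G generates the subgroup of prime order Q mod P.
P, Q, G = 2039, 1019, 4


def prf(k0, k1, i):
    """Assumed PRF form: F_K(i) = g^(k0 * k1^i)."""
    return pow(G, k0 * pow(k1, i, Q) % Q, P)


def slow_product(k0, k1, x, d):
    """Without the trapdoor: d+1 group exponentiations, one per PRF value."""
    out = 1
    for i in range(d + 1):
        out = out * pow(prf(k0, k1, i), pow(x, i, Q), P) % P
    return out


def closed_form(k0, k1, x, d):
    """With the key: sum the geometric series in Z_Q, then exponentiate g once."""
    z = k1 * x % Q
    if z == 1:
        s = (d + 1) % Q
    else:
        s = (pow(z, d + 1, Q) - 1) * pow((z - 1) % Q, -1, Q) % Q
    return pow(G, k0 * s % Q, P)


if __name__ == "__main__":
    print(slow_product(123, 456, 789, 50) == closed_form(123, 456, 789, 50))
```

The modular inverse via `pow(..., -1, Q)` requires Python 3.8 or later.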

How? From DDH
Local state size is log(d).
We use the Naor-Reingold PRF.
Efficiency: [formula]
In the paper: polynomials with a logarithmic number of variables (tradeoff between degree and number of variables).

To summarize
Based on DDH/Strong-DDH we obtain an adaptively secure scheme for delegating high-degree polynomials.
Can be used for keyword search: to outsource a set of keywords {w_1, ..., w_n}, outsource the polynomial p(x) = (x - w_1)(x - w_2)...(x - w_n).
Proofs of retrievability: we want to make sure that the server keeps a large file F. Break F into blocks F_0, ..., F_n and outsource the polynomial P(x) = F_0 + F_1 x + ... + F_n x^n. Audit check: verifiably evaluate P(r) for random r.
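Both applications reduce to building a coefficient list and evaluating it at a point. The sketch below shows only that encoding, with no verification layer: it assumes keywords and file blocks are already encoded as integers modulo a prime, and the modulus 2^61 - 1 and the function names are illustrative choices.

```python
Q = 2**61 - 1  # a Mersenne prime, used here as the field size (assumption)


def keyword_polynomial(roots):
    """Coefficients (low order first) of p(x) = prod (x - w) over Z_Q."""
    coeffs = [1]
    for w in roots:
        nxt = [0] * (len(coeffs) + 1)
        for i, a in enumerate(coeffs):
            nxt[i] = (nxt[i] - w * a) % Q      # contribution of -w * a * x^i
            nxt[i + 1] = (nxt[i + 1] + a) % Q  # contribution of a * x^(i+1)
        coeffs = nxt
    return coeffs


def evaluate(coeffs, x):
    """Horner evaluation of the polynomial at x, modulo Q."""
    acc = 0
    for a in reversed(coeffs):
        acc = (acc * x + a) % Q
    return acc


if __name__ == "__main__":
    poly = keyword_polynomial([3, 10, 25])
    print(evaluate(poly, 10) == 0)   # keyword present: p(w) = 0
    print(evaluate(poly, 11) != 0)   # keyword absent: p(w) != 0
    blocks = [17, 4, 99]             # file blocks F_0, F_1, F_2
    print(evaluate(blocks, 2))       # audit value P(r) at r = 2
```

A keyword is in the set exactly when the polynomial evaluates to zero at it, and an audit of a stored file is one evaluation of the block polynomial at a random point.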

Open directions
Adaptive security for general functions
Other efficient constructions for restricted classes of functions
Better support for multivariate polynomials
Thank you!

VERIFIABLE DATABASES!

Verifiable databases?
Retrieve location i
Write to location j
Insert to location k
Delete from location l
Think: SVN with an untrusted repository

Very abridged history
Merkle trees: data is stored as the leaves of a tree, and the client keeps a hash of the root. Queries/updates are relatively easy (log n operations each). Insertion/deletion is not good (based on amortization). Too slow over a network for large storages.
Memory checkers: a different model, where the server is a RAM and efficiency is counted in the number of RAM queries. We allow the server to work hard.
Authenticated data structures: a different model, where a trusted party has a large secret.
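The Merkle-tree baseline can be sketched as follows: the client keeps only the root hash, and each query is answered with the leaf plus log n sibling hashes. This is a generic textbook sketch, not taken from the talk; it assumes a power-of-two number of leaves and uses SHA-256 with one-byte domain separation, both illustrative choices.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return all tree levels, leaf hashes first; len(leaves) must be a power of two."""
    level = [h(b"\x00" + leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Server: collect the sibling hash at every level below the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])
        index //= 2
    return path


def verify(root, index, leaf, path):
    """Client: recompute the root from the leaf and the sibling path."""
    node = h(b"\x00" + leaf)
    for sibling in path:
        if index % 2 == 0:
            node = h(b"\x01" + node + sibling)
        else:
            node = h(b"\x01" + sibling + node)
        index //= 2
    return node == root


if __name__ == "__main__":
    levels = build_levels([b"a", b"b", b"c", b"d"])
    root = levels[-1][0]
    print(verify(root, 2, b"c", prove(levels, 2)))
```

Each query costs a logarithmic number of hashes in both communication and client work.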

Folklore solution without updates
For every populated location i, give the server MAC(i, data[i]).
For all other locations j, upload a MAC of the shortest prefix w of j that does not extend to a populated i.
But updates are hard to do: can't revoke!
[Figure: tree with a root, unpopulated branches marked "?", and leaves (i1, d1), (i2, d2)]
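The folklore solution can be sketched with locations written as fixed-length bit strings. The encoding of MAC inputs, the "present"/"absent" labels, and all function names are assumptions for illustration, and HMAC-SHA256 stands in for the MAC. The sketch assumes at least one populated location.

```python
import hashlib
import hmac


def mac(key: bytes, label: bytes, payload: bytes) -> bytes:
    return hmac.new(key, label + b"|" + payload, hashlib.sha256).digest()


def empty_prefixes(populated):
    """Shortest prefixes that no populated index extends."""
    out = set()
    for idx in populated:
        for k in range(len(idx)):
            sibling = idx[:k] + ("1" if idx[k] == "0" else "0")
            if not any(p.startswith(sibling) for p in populated):
                out.add(sibling)
    return out


def upload(key, data):
    """Client: MAC every populated (i, data[i]) and every minimal empty prefix."""
    present = {i: (v, mac(key, b"present", i.encode() + b"=" + v)) for i, v in data.items()}
    absent = {w: mac(key, b"absent", w.encode()) for w in empty_prefixes(data)}
    return present, absent


def answer(store, j):
    """Server: return the value, or the empty prefix covering j, with its MAC."""
    present, absent = store
    if j in present:
        return ("present",) + present[j]
    w = next(w for w in absent if j.startswith(w))
    return ("absent", w, absent[w])


def check(key, j, reply):
    """Client: verify the MAC, and for absence also that the prefix covers j."""
    kind, body, tag = reply
    if kind == "present":
        return hmac.compare_digest(tag, mac(key, b"present", j.encode() + b"=" + body))
    return j.startswith(body) and hmac.compare_digest(tag, mac(key, b"absent", body.encode()))


if __name__ == "__main__":
    key = b"k" * 32
    store = upload(key, {"0101": b"x", "1100": b"y"})
    print(check(key, "0101", answer(store, "0101")), check(key, "0111", answer(store, "0111")))
```

The server keeps every MAC it was ever given, which is why an old (i, data[i]) pair cannot be revoked after an update.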

Simple Construction
Upload [formula] to authenticate (i, v_i). This is a MAC.
Can update (insecurely): to change the value to u_i, send [formula]. Now the server can find [formula].
Insertion is easy.
Efficient deletion is not possible: the server always has a certificate for (i, v_i).
Can we fix it? Need to tie all the elements together without growing client state.

Composite Order Bilinear Groups
G = G_1 × G_2, with |G_1| = p and |G_2| = q.
Subgroup membership assumption: given g in G and g_2 in G_2, it is hard to distinguish a random element of G from a random element of G_2:
(random from G) ≈_c (random from G_2)

Back to verifiable DB
Instead of uploading [formula], the client sends [formula] for a random w_i.
The key is a, b, k, and [formula].
The server now sends* [formula].
To update location i to value u_i, the client sends [formula] and updates w.
Proof of security: the update token is indistinguishable from [formula]. (Actually, there are CCA issues.)

Back to verifiable DB
But the server can't compute [formula]! All he has is [formula].
Upload additional hints h_1 in G, h_0 in G_2.
To respond to query i, the server sends back: [formula]
The client performs the check in the target group of the pairing.

Open directions
Adaptive security for general functions is still open
Support higher-degree polynomials
Obtain constructions based on lattice assumptions
Make verifiable DB publicly checkable
Extend VDB to support a wider range of queries
Thank you!