Security and Privacy in Big Data, Blessing or Curse?
  2nd National Cryptography Days, 9-11 April 2015
  Dr. Zeki Erkin
  Cyber Security Section, Department of Intelligent Systems
  Delft University of Technology
About me
  BSc and MSc @ ITU, Istanbul, 2002, 2005
  PhD @ TU Delft, 2010
  PostDoc @ TU Delft, 2010-2014
  Assist. Prof. @ TU Delft, Cyber Security Group
  FET Signal Processing in the Encrypted Domain
  STW Kindred Spirits
  Dutch/COMMIT Trusted Healthcare and Extreme Wireless Sensor Networks
  3TU Big Software on the Run
  Secure Signal Processing, Privacy Enhancing Technologies
  MPC, Homomorphic Encryption
  PCs, TCs: JoPETS, PETs, IEEE TIFS, WIFS, ICIP, ICASSP
  Bochum, Aarhus, UC Irvine, IBM Zurich
Outline
  Security and Privacy in Big Data
  Motivation
  Secure Signal Processing
  Face Recognition
  Recommender Systems
  Research Challenges and Opportunities
Privacy concerns
  Data, data, and more data
Problem statement
  Sensitive data; commercially valuable algorithm
  1. Service provider trustworthy: bankruptcy, loss or theft of data, insiders
  2. Service provider untrustworthy: malicious acts, selling or transferring data to third parties
  Cloud computing: outsourcing computation and storage
    Where, when, by whom? Laws? Privacy? Espionage?
  Can we protect privacy while processing data without hampering services?
Players
  Government: regulation, legislation, protecting privacy, providing security and safety (critical infrastructures), creating new business fields
  Citizens: demanding security and privacy; economic benefits, job opportunities
  Business: increasing profit, reducing costs, reaching out to more customers, new business ideas
  Academia: solutions for societal problems
Secure Data Processing
  Computational privacy
  Privacy Enhancing Technologies
  Privacy by Design
  Applied cryptography: homomorphic encryption, garbled circuits, secret sharing, MPC techniques
  Do not reveal sensitive data in plaintext!
Face Recognition
  Alice asks Bob: "Is he a criminal?"
  Bob processes the query against his database and answers Yes with an ID, or No.
with Privacy
  Alice asks Bob: "Is he a criminal?"
  Bob processes the query against his database; the answer stays encrypted: [Yes], [ID] / [No].
  Z. Erkin, M. Franz, J. Guajardo, S. Katzenbeisser, R. L. Lagendijk and T. Toft, Privacy-Preserving Face Recognition, 9th International Symposium on Privacy Enhancing Technologies, LNCS 5672, pp. 235-253, August 2009.
Eigenface Algorithm
Secure Face Recognition
Homomorphic Encryption
  A number of encryption schemes preserve structure after encryption.
  Additive homomorphism (Paillier 99)
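To make the additive homomorphism concrete, here is a minimal Python sketch of a Paillier-style scheme. It assumes the common simplification g = n + 1 and uses two small, well-known Mersenne primes as toy parameters, so it is far too weak for real use. The helper names (`encrypt`, `decrypt`, `add`, `scalar_mul`) are illustrative and do not come from the slides. The sketch shows that multiplying ciphertexts adds the plaintexts, and that raising a ciphertext to a known constant multiplies the plaintext by that constant.

```python
import math
import secrets

# Toy parameters (assumption): two known Mersenne primes, NOT secure sizes.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p-1, q-1)
MU = pow(LAM, -1, N)  # valid because the generator is g = n + 1


def encrypt(m):
    """[m] = g^m * r^n mod n^2; with g = n + 1, g^m = 1 + m*n (mod n^2)."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def decrypt(c):
    """m = L(c^lambda mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    return (pow(c, LAM, N2) - 1) // N * MU % N


def add(c1, c2):
    """[m1] * [m2] = [m1 + m2]: ciphertext product adds plaintexts."""
    return c1 * c2 % N2


def scalar_mul(c, k):
    """[m]^k = [k * m]: exponentiation by a public constant k."""
    return pow(c, k % N, N2)


a, b = encrypt(20), encrypt(22)
print(decrypt(add(a, b)))         # 42
print(decrypt(scalar_mul(a, 3)))  # 60
```

Note that encryption is randomized: two encryptions of the same value give different ciphertexts, which is why ciphertexts cannot simply be compared.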
Projection in the encrypted domain
  Alice (sk): input image, sent as encrypted pixel values
  Bob (pk): feature vectors in a database
  Apply the projection in the encrypted domain to obtain the feature vector of the input image.
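The projection step can be sketched as follows, assuming the projection is a dot product between the mean-subtracted image and each basis vector, and that Bob's basis vectors and mean have already been scaled and rounded to integers. Both are assumptions for illustration; the toy Paillier parameters and all function names are hypothetical. Bob only ever touches ciphertexts: he computes each encrypted feature as a product of ciphertexts raised to his plaintext coefficients.

```python
import math
import secrets

# Toy Paillier with g = n + 1 (assumption); primes are NOT secure sizes.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
MU = pow(LAM, -1, N)


def encrypt(m):
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def decrypt(c):
    return (pow(c, LAM, N2) - 1) // N * MU % N


def to_signed(x):
    """Map a residue mod n back to a signed integer."""
    return x - N if x > N // 2 else x


def encrypted_projection(enc_pixels, mean, basis):
    """Bob's side: [w_i] = prod_j [x_j - mean_j]^(u_ij), using only pk."""
    features = []
    for u in basis:
        acc = encrypt(0)
        for c, psi, u_j in zip(enc_pixels, mean, u):
            centered = c * encrypt(-psi) % N2            # [x_j - psi_j]
            acc = acc * pow(centered, u_j % N, N2) % N2  # add u_ij * (x_j - psi_j)
        features.append(acc)
    return features


# Alice encrypts her (tiny, made-up) image pixel by pixel.
pixels = [10, 20, 30, 40]
enc_pixels = [encrypt(x) for x in pixels]

# Bob's plaintext data: mean image and integer basis vectors (made up).
mean = [12, 18, 33, 35]
basis = [[1, -2, 0, 3], [2, 1, -1, 0]]
enc_features = encrypted_projection(enc_pixels, mean, basis)

# Decryption here is only a correctness check with Alice's key.
print([to_signed(decrypt(c)) for c in enc_features])  # [9, 1]
```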
Euclidean Distance
  Alice (sk), Bob (pk)
  $F_y = (f_{y,1}, f_{y,2}, \ldots, f_{y,k})$
  Needs the secure multiplication protocol and the homomorphism.
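The split between the secure multiplication protocol and the homomorphism can be seen by expanding the squared distance. The derivation below is a sketch under the assumption that $F_x = (f_{x,1}, \ldots, f_{x,k})$ is the feature vector of Alice's image, available to Bob only in encrypted form, and that $F_y$ is a database vector Bob knows in plaintext.

```latex
\begin{align*}
D^2(F_x, F_y) &= \sum_{i=1}^{k} \left(f_{x,i} - f_{y,i}\right)^2
  = \sum_{i=1}^{k} f_{x,i}^2 \;-\; 2\sum_{i=1}^{k} f_{x,i}\, f_{y,i} \;+\; \sum_{i=1}^{k} f_{y,i}^2 \\[4pt]
[D^2(F_x, F_y)] &= \Big[\sum_{i=1}^{k} f_{x,i}^2\Big] \cdot
  \prod_{i=1}^{k} [f_{x,i}]^{-2 f_{y,i}} \cdot
  \Big[\sum_{i=1}^{k} f_{y,i}^2\Big]
\end{align*}
```

Under these assumptions, the third factor is computed by Bob in plaintext and then encrypted, and the middle factor follows directly from the additive homomorphism because the exponents $-2 f_{y,i}$ are known to Bob. The first factor multiplies two encrypted values, which an additively homomorphic scheme cannot do on its own, so it requires an interactive secure multiplication protocol.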
Secure Multiplication Protocol (between Bob and Alice)
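One common way to build such a protocol is additive blinding, sketched below. This is an illustration under stated assumptions, not necessarily the exact protocol on the slide: Bob holds [a] and [b] under Alice's public key, blinds both with random values, lets Alice decrypt and multiply the blinded values, and then removes the blinding terms homomorphically. The toy Paillier parameters and the function names `bob_blind`, `alice_multiply`, and `bob_unblind` are hypothetical.

```python
import math
import secrets

# Toy Paillier with g = n + 1 (assumption); primes are NOT secure sizes.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
MU = pow(LAM, -1, N)


def encrypt(m):
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + (m % N) * N) * pow(r, N, N2) % N2


def decrypt(c):  # only Alice can run this: it needs LAM and MU (the secret key)
    return (pow(c, LAM, N2) - 1) // N * MU % N


def bob_blind(ca, cb):
    """Bob hides a and b behind uniformly random r_a, r_b."""
    ra, rb = secrets.randbelow(N), secrets.randbelow(N)
    blinded = (ca * encrypt(ra) % N2, cb * encrypt(rb) % N2)  # [a+ra], [b+rb]
    return blinded, (ra, rb)


def alice_multiply(blinded):
    """Alice decrypts the blinded values, multiplies, and re-encrypts."""
    x, y = (decrypt(c) for c in blinded)
    return encrypt(x * y % N)  # [(a+ra)(b+rb)]


def bob_unblind(c_prod, ca, cb, ra, rb):
    """(a+ra)(b+rb) - a*rb - b*ra - ra*rb = a*b, all done on ciphertexts."""
    c = c_prod
    c = c * pow(ca, (-rb) % N, N2) % N2  # subtract a * rb
    c = c * pow(cb, (-ra) % N, N2) % N2  # subtract b * ra
    c = c * encrypt(-ra * rb) % N2       # subtract ra * rb
    return c


def secure_multiply(ca, cb):
    blinded, (ra, rb) = bob_blind(ca, cb)  # Bob -> Alice
    c_prod = alice_multiply(blinded)       # Alice -> Bob
    return bob_unblind(c_prod, ca, cb, ra, rb)


print(decrypt(secure_multiply(encrypt(6), encrypt(7))))  # 42
```

The protocol costs one round of interaction per multiplication, which is one reason run-time and bandwidth matter so much in this setting.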
Finding the minimum
  Alice (sk), Bob (pk)
  $[D^2(F_x, F_y)], [D^2(F_x, F_w)], \ldots, [D^2(F_x, F_z)]$
  Find the minimum squared distance!
  But
  $[D^2(F_x, F_y)] = g^{D^2(F_x, F_y)} \, r_1^{n} \bmod n^2 = 154894318447855\ldots4848948974897$
  $[D^2(F_x, F_w)] = g^{D^2(F_x, F_w)} \, r_2^{n} \bmod n^2 = 956814894149\ldots123484987163$
Finding the Minimum: Concept
Interactive Game (between Alice and Bob)
Comparison
  $[e_i] = [1] \cdot [c_i] \cdot [r_i]^{-1} \cdot \prod_{j=i+1}^{\ell-1} [c_j]\,[r_j]\,[c_j]^{-2 r_j}$
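The logic behind a bitwise comparison of this kind can be checked in plaintext. The sketch below assumes the usual pattern e_i = 1 + c_i - r_i + sum over j > i of (c_j XOR r_j), where c_i and r_i are the bits of two ℓ-bit numbers; this reading of the slide's expression is an assumption, and the function names are hypothetical. Under that assumption, some e_i equals zero exactly when c < r. In an encrypted version, the sums would become ciphertext products and each XOR with a known bit r_j would be computed as [c_j] [r_j] [c_j]^(-2 r_j).

```python
def bits(x, length):
    """Little-endian bit list of x, padded to `length` bits."""
    return [(x >> i) & 1 for i in range(length)]


def comparison_terms(c, r, length):
    """Plaintext e_i = 1 + c_i - r_i + sum_{j > i} (c_j XOR r_j)."""
    cb, rb = bits(c, length), bits(r, length)
    terms = []
    for i in range(length):
        higher_diff = sum(cb[j] ^ rb[j] for j in range(i + 1, length))
        terms.append(1 + cb[i] - rb[i] + higher_diff)
    return terms


def less_than(c, r, length):
    """c < r exactly when some e_i is zero: bit i has c_i = 0, r_i = 1,
    and all more significant bits agree."""
    return 0 in comparison_terms(c, r, length)


print(less_than(5, 9, 4))  # True
print(less_than(9, 5, 4))  # False
```

In an actual protocol the e_i values would typically be blinded and shuffled before anyone decrypts them, so that only the existence of a zero is revealed; that step is omitted here.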
Secure Face Recognition
Performance
  Implementation in 2009: integer arithmetic, 400 images (112x92), 18 seconds
  Implementation in 2009 (hybrid approach): garbled circuits, 1000 images, 13 seconds
Recommender Systems
  Problem: privacy
    Likes/dislikes: identification and tracking
    Medical data cannot be stored and processed
  Solution: Privacy Enhancing Technologies
Ideal System
3-Party Setting
  Erkin, Z., Veugen, T., Toft, T., Lagendijk, R.: Generating Private Recommendations Efficiently Using Homomorphic Encryption and Data Packing. IEEE Transactions on Information Forensics and Security 7 (06/2012) 1053-1066
  Beye, M., Erkin, Z., Lagendijk, R.: Efficient privacy preserving K-means clustering in a three-party setting. In: Information Forensics and Security (WIFS), 2011 IEEE International Workshop on. (29 Nov. - 2 Dec. 2011) 1-6
  Canny, J.: Collaborative filtering with privacy. In: Proceedings IEEE Symposium on Security and Privacy, IEEE (2002) 45-57
Dynamic Execution Problem
  Kononchuk, D., Z. Erkin, J. C. A. van der Lubbe, and R. L. Lagendijk, "Privacy-Preserving User Data Oriented Services For Groups With Dynamic Participation", ESORICS, Egham, UK, 09/2013.
Case Study: Ahold
  [Figure labels: E(ID), Data, Profiles, Suggestions]
  320M visitors in NL per year
  This is BIG DATA
Curse or Blessing
  Curse
    Awareness - society
    Legislation - governments
    Limitations - industry
  Blessing
    Research questions!
    Privacy by design wins!
Research Challenges
  Efficiency: run-time, bandwidth, storage
  Security model: semi-honest, covert, malicious
  Cryptographic tools: FHE, SHE, HE, GC, SS (additive, strong ramp), MPC techniques
  Application setting: 2-party, 3-party, N-party; static and dynamic
  Application domain
    Cloud computing: confidentiality (privacy), integrity (computation and storage)
    Smart grids: billing, data aggregation, verification, prediction
    Automotive, social networks, supply chains
    Data mining (finance), data fusion, real time, data mitigation, etc.
Opportunities
  Multi-disciplinary: cryptography, signal processing, pattern recognition, machine learning, social sciences; socio-technical solutions (H2020)
  Wide application domain: biometrics, smart grids, cloud computing, finance, defence, etc.
  H2020: Digital societies: Trust, Privacy; ICT calls
  Thank you for your attention!