Sharemind - the Oracle of secure computing systems
Dan Bogdanov, PhD, Sharemind product manager, dan@cyber.ee





About Sharemind
Sharemind helps you analyse data you could not access before. Sharemind resolves trust issues by removing centralised control and unwanted data access points.

Architecture of an IT service
Layers: web/mobile apps or other services (client); business logic incl. data analysis (service); data storage and query engine (storage).
[Figure: the same data values shown at the client, service and storage layers, with data access points for internal and external parties.]

Levels of data encryption
[Figure: client, service and storage layers compared for three approaches: industry standard, SSE/OPE/SDE and Sharemind. Other labels: Database, Queries.]

Sharemind platform
Encrypted computing (link, sort, correlate): MPC/HE, Intel SGX
Policy enforcement: multi-party consensus, disclosure control
Audit and verification: online verification, offline audit

Programming toolchain
SecreC language (C-like, no cryptographic knowledge necessary)
Standard library with array and matrix operations, oblivious access, statistical testing, sorting, linking, regression modelling, etc.
15 000 lines of reusable SecreC code

Application Server paradigm
Interfaces: Java/JavaScript/C/C++/Haskell
Clients: mobile apps, web apps, desktop apps, SQL queries, Rmind statistics package
[Figure: application servers and database backends deployed on Host 1, Host 2, ..., Host n.]

Sharemind Analytics Engine Rmind


Case study: Government data analytics

IT training has a failure rate
[Figure: bar chart of the number of students per year, 2006-2012, in two series: new IT students, and those who quit studies before November 2012.]
By 2012, a total of 43% of students enrolled in the four largest IT higher learning institutions in Estonia during 2006-2012 had quit their studies. Source: Estonian Ministry of Education and Research, CentAR.

Legal breakthroughs
January 2014: the Estonian Data Protection Agency declared that Sharemind technology and processes protect data so well that the Personal Data Protection Act doesn't apply.
January 2015: after a code audit, the internal oversight at the Tax Board agreed to upload actual income tax records into the Sharemind-based analysis system.
February 2015: the Tax Board, Ministry of Education, Information Systems Authority, Ministry of Finance IT Center and Cybernetica signed the world's first secure multi-party data analysis agreement.

Step 1: Import data
Data owners: Estonian Education Information System (Ministry of Education and Research); Register of taxable persons (Estonian Tax and Customs Board).
Sharemind hosts: Estonian Information System Authority, Ministry of Finance IT Center, Cybernetica.
Data owners uploaded data with the Sharemind importer. Each value was encrypted at the source, so private data never left the data owner.
Over 600 000 study records (100 MB) used. Over 10 million tax records (1 GB) used.
Largest MPC application on real-world data.
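As an illustration of what "encrypted at the source" can mean, here is a minimal sketch of additive secret sharing among three hosts. It assumes a 32-bit ring and uses hypothetical names (`share`, `reconstruct`, `MODULUS`); it is not the Sharemind importer and omits everything a real protocol needs beyond the splitting step.

```python
import secrets

MODULUS = 2**32  # assumed ring size for this sketch


def share(value, parties=3):
    """Split `value` into additive shares that sum to it modulo MODULUS."""
    # All shares but the last are uniformly random.
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    # The last share is chosen so that the total equals the value.
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recover the value; this needs every share."""
    return sum(shares) % MODULUS


# Hypothetical tax record value, split at the data owner before upload.
income_shares = share(1250)
recovered = reconstruct(income_shares)
```

Each host would receive one element of `income_shares`. Any two of the three shares are jointly uniform random values, so the plain value is only visible once all three are combined.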

Step 2: Run the analysis
Sharemind hosts: Estonian Information System Authority, Ministry of Finance IT Center, Cybernetica.
Other parties shown: statistician (CentAR), universities, companies, policymakers.
Statisticians used Rmind to post queries. Sharemind ensured that only queries in the study plan were actually executed. Additional microdata protection controls were enforced.
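The idea that only queries in the study plan are executed can be pictured as an allow-list check in front of the query engine. The sketch below is an assumption-laden illustration: the query names, `STUDY_PLAN`, `HANDLERS` and `execute` are all hypothetical and do not describe how Sharemind or Rmind implement policy enforcement.

```python
# Hypothetical study plan: the only query names the parties agreed to run.
STUDY_PLAN = {"graduation_rate_by_year", "employment_rate_by_curriculum"}

# Hypothetical query implementations, including one that is not in the plan.
HANDLERS = {
    "graduation_rate_by_year": lambda: "aggregate result",
    "employment_rate_by_curriculum": lambda: "aggregate result",
    "dump_all_records": lambda: "raw microdata",
}


def execute(query_name):
    """Run a query only if the agreed study plan lists it."""
    if query_name not in STUDY_PLAN:
        raise PermissionError(f"query {query_name!r} is not in the study plan")
    return HANDLERS[query_name]()


allowed_result = execute("graduation_rate_by_year")
```

A request such as `execute("dump_all_records")` raises `PermissionError` even though a handler for it exists, because the check is made against the plan and not against what the engine is able to run.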

Operations performed
[Figure: data flow diagram.]
Tax and Customs Board: extract data (employment tax payments), secret share and upload.
Ministry of Education and Science: extract data (higher study events), secret share and upload.
Employment tax payments: aggregate by month (monthly income), aggregate by year, aggregate by person (average yearly income), giving the employment record of a person.
Higher study events: expand by years and aggregate by person, giving the university career of a person.
Merge by person's ID into the complete record of a person, then compute additional attributes and align tax payments to form the analysis table.
Data is stored with secret sharing and processed with secure multi-party computation. The statistical analyst recovers the analysis results from shares.
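The aggregation and merge steps named on this slide can be sketched on plain (unencrypted) data to show the shape of the pipeline. The records, field layout and function names below are invented for illustration; in the real system the same steps run on secret-shared values.

```python
from collections import defaultdict

# Hypothetical inputs: (person_id, year, month, amount) and (person_id, year, event).
tax_payments = [
    ("p1", 2010, 1, 500), ("p1", 2010, 1, 200), ("p1", 2010, 2, 300),
    ("p1", 2011, 1, 1000), ("p2", 2010, 5, 400),
]
study_events = [("p1", 2009, "enrolled"), ("p1", 2011, "quit"), ("p2", 2010, "enrolled")]


def aggregate_by_month(payments):
    """Sum payments into monthly income per person."""
    totals = defaultdict(int)
    for pid, year, month, amount in payments:
        totals[(pid, year, month)] += amount
    return totals


def aggregate_by_year(monthly):
    """Sum monthly income into yearly income per person."""
    totals = defaultdict(int)
    for (pid, year, _month), amount in monthly.items():
        totals[(pid, year)] += amount
    return totals


def average_yearly_income(yearly):
    """Average the yearly totals for each person."""
    per_person = defaultdict(list)
    for (pid, _year), amount in yearly.items():
        per_person[pid].append(amount)
    return {pid: sum(v) / len(v) for pid, v in per_person.items()}


def merge_by_person(avg_income, events):
    """Join the employment record with the university career on person ID."""
    career = defaultdict(list)
    for pid, year, event in events:
        career[pid].append((year, event))
    return {
        pid: {"avg_yearly_income": avg_income.get(pid), "career": sorted(career[pid])}
        for pid in career
    }


yearly = aggregate_by_year(aggregate_by_month(tax_payments))
records = merge_by_person(average_yearly_income(yearly), study_events)
```

Here `records["p1"]` combines an average yearly income of 1000.0 with the events `(2009, "enrolled")` and `(2011, "quit")`, which corresponds to the "complete record of a person" in the diagram.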

IT is harder to graduate
2. Results
The results confirm that the share of students graduating within the nominal study period is low among students in general and among ICT students in particular. Among ICT students, the share of nominal-time graduates in bachelor's studies varies around 20 percent, which is lower than the corresponding figure for students of other curricula (see Figure 1). The same tendency appears in professional higher education. In master's studies the share of nominal-time graduates is somewhat higher, varying between 30% and 40% depending on the year, but here too it is lower among ICT students than among others.
Figure 1. Share of nominal-time graduates by year of matriculation, ICT and non-ICT curricula, bachelor's studies.
Male students are less likely than female students to graduate within the nominal period (see Figure 2, Figure 3). The lower probability of graduating in nominal time among ICT students appears for both sexes.

All students are working
Looking at employment during the nominal study period, it turns out, surprisingly, that ICT students do not work during their studies more than students in non-ICT curricula. In bachelor's studies, across all study years, the employment rate in most years is higher among non-ICT students than among ICT students (see Figure 4). The same conclusion holds for students in professional higher education. In master's studies, where employment rates exceed 80%, the result is the opposite: the employment rate is higher among ICT students than among non-ICT students.
Figure 4. Share of students who worked during the nominal period among all students, by year, ICT and non-ICT curricula, bachelor's studies.
The employment rate of female students is somewhat higher than that of male students, among both ICT and non-ICT students (Figure 5, Figure 6). Gender differences in employment rates vary by year: in some years the difference is considerable, in others there is no significant difference.

Case study: Tax fraud detection

VAT evasion is a problem
[Figure: chart in MEUR by tax type: VAT, social tax, income tax, alcohol excise, tobacco excise, fuel excise, packaging excise.]

The story of the €1000 law

Secure implementation
[Figure: the Tax Office server, the Taxpayer's association's server and the Watchdog NGO server jointly run a secure multi-party computation system with a database. Taxpayers submit transactions; the Tax Office submits risk queries and receives risk scores.]
Benefits:
Analyze, combine and build reports without decrypting data.
Encryption is applied on the data directly at the source.
Confidentiality is guaranteed against all servers and against malicious hackers.
Values are only decrypted when all hosts agree to do so.
The data is cryptographically protected during processing.
No need to unconditionally trust a single organization.
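Two of these benefits, computing without decrypting and opening a value only when all hosts agree, can be illustrated with additive secret sharing. This is a simplified sketch under assumptions: the ring size, the function names and the boolean `approvals` list are hypothetical stand-ins for a real protocol, and only addition is shown.

```python
import secrets

MODULUS = 2**64  # assumed ring size for this sketch


def share(value, parties=3):
    """Split `value` into additive shares modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    return shares + [(value - sum(shares)) % MODULUS]


def add_shared(a, b):
    """Each host adds its own two shares locally; no host sees a plain value."""
    return [(x + y) % MODULUS for x, y in zip(a, b)]


def open_value(shares, approvals):
    """Reconstruct the result only if every host has approved the disclosure."""
    if len(approvals) != len(shares) or not all(approvals):
        raise PermissionError("not all hosts agreed to open the value")
    return sum(shares) % MODULUS


# Hypothetical invoice totals declared by two companies.
total_shares = add_shared(share(100), share(250))
total = open_value(total_shares, approvals=[True, True, True])
```

If any host withholds approval, for example `approvals=[True, True, False]`, `open_value` raises `PermissionError` and the sum stays hidden.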

First performance results
[Figure: total running time of aggregation (s) against number of companies processed (100 to 2000), for 1, 2, 4 and 8 aggregators.]
We estimated that it would have taken 10 days to process one month of data (50M invoices, 80 000 companies). Matching is a hard problem.

Cloud deployment on AWS

Much larger data sizes
The source data for 100 000 000 transactions had a total size of 35 GB in XML format (about 1 GB in the secret-shared database).

4 Benchmark results
We used three input data sets with different size in our benchmarks (see Table 3). The largest data set corresponds to the estimates of Estonia's Tax and Customs Board on the number of taxable persons and performed business transactions in one month in Estonia. Each company's tax declaration is an XML-file consisting of a number of sales and purchase transactions with different business partners.

Table 3. Descriptions of the three data sets used in the experiments.
No. of companies    No. of transaction partner pairs    Total no. of transactions
20 000              200 000                             25 000 000
40 000              400 000                             50 000 000
80 000              800 000                             100 000 000

In the upload phase, declarations were uploaded to the 80 Sharemind processes, each process receiving a single declaration at a time. After aggregating the data, the results were moved into a single process running on three instances, and the remaining instances were closed. Note that each party only moves data shares between instances that it controls. The single process then merged the data and performed the risk analysis computations. We used Amazon CloudWatch to monitor the CPU, network and memory usage of the instances. The running times of all computations are presented on Figure 4. The performance of the prototype has significantly improved compared to the earlier version and is well within practical limits as the analysis only needs to be performed once in a single tax period (each month). As can be expected, in multi-region deployments the computations are slower due to the
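The figures in Table 3 and the 35 GB note imply some simple per-company and per-transaction ratios. The short calculation below works them out; it assumes 1 GB means 10^9 bytes, which the source does not state, and the variable names are only for illustration.

```python
# Rows of Table 3: (companies, transaction partner pairs, total transactions).
DATASETS = [
    (20_000, 200_000, 25_000_000),
    (40_000, 400_000, 50_000_000),
    (80_000, 800_000, 100_000_000),
]

# Ratios per company for each data set.
ratios = [
    (pairs // companies, transactions // companies)
    for companies, pairs, transactions in DATASETS
]

# Average XML size per transaction in the largest data set,
# assuming 1 GB = 10**9 bytes (an assumption, not stated in the source).
xml_bytes_per_transaction = 35 * 10**9 / 100_000_000

for (companies, _pairs, _tx), (pairs_each, tx_each) in zip(DATASETS, ratios):
    print(f"{companies} companies: {pairs_each} partner pairs and "
          f"{tx_each} transactions per company")
print(f"about {xml_bytes_per_transaction:.0f} bytes of XML per transaction")
```

All three data sets keep the same shape: 10 partner pairs and 1250 transactions per company, so the benchmarks scale the number of companies while holding the per-company load constant. Under the stated assumption, the XML source averages about 350 bytes per transaction.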

Impressive running times
[Figure: stacked bar chart of computation time (0 to 9 hours) against number of companies (20k, 40k, 80k) for three deployments; deployment labels as extracted: "us 2 eu 2 us,1 eu". Computation phases: upload, aggregation, risk analysis. Bar labels range from 38:44 to 08:53:00.]

We build applications
Learn about Sharemind: http://sharemind.cyber.ee/
Open source prototyping tools: http://sharemind-sdk.github.io/
Contact us for more information and collaborations
E-mail: sharemind@cyber.ee
Twitter: @sharemind