Arnab Roy, Fujitsu Laboratories of America and CSA Big Data WG
The Big Data Working Group (BDWG) will identify scalable techniques for data-centric security and privacy problems. BDWG's investigation is expected to: crystallize best practices for security and privacy in big data; help industry and government adopt those best practices; establish liaisons with standards development organizations (SDOs) to influence big data security and privacy standards; and accelerate the adoption of novel research aimed at addressing security and privacy issues.
Big Data Working Group (70+ members)
1: Data analytics for security
2: Privacy preserving/enhancing technologies
3: Big data-scale crypto
4: Big data infrastructures' attack surface analysis and reduction
5: Policy and governance
6: Legal issues
7: Top 10
8: Framework and taxonomy
https://basecamp.com/1825565/projects/511355-big-data-working
1) Secure computations in distributed programming frameworks
2) Security best practices for non-relational data stores
3) Secure data storage and transaction logs
4) End-point input validation/filtering
5) Real time security monitoring
6) Scalable and composable privacy-preserving data mining and analytics
7) Cryptographically enforced access control and secure communication
8) Granular access control
9) Granular audits
10) Data provenance
Infrastructure Security: Secure Computations in Distributed Programming Frameworks; Security Best Practices for Non-Relational Data Stores
Data Privacy: Privacy Preserving Data Mining and Analytics; Cryptographically Enforced Data Centric Security; Granular Access Control
Data Management: Secure Data Storage and Transaction Logs; Granular Audits; Data Provenance
Integrity and Reactive Security: End-point Validation and Filtering; Real Time Security Monitoring
Application Computation Infrastructure
Malfunctioning compute worker nodes: trust establishment (initiation, periodic trust update)
Access to sensitive data: mandatory access control
Privacy of output information: privacy preserving transformations
Data from Diverse Appliances and Sensors
Lack of stringent authentication and authorization mechanisms: enforcement through middleware layer; passwords should never be held in clear; encrypted data at rest
Lack of secure communication between compute nodes: protect communication using SSL/TLS
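To illustrate "passwords should never be held in clear", here is a minimal sketch that stores only a random salt and a derived key. It uses PBKDF2 from the Python standard library. The iteration count, the salt length, and the function names `hash_password` and `verify_password` are assumptions made for this example, not recommendations from the working group.

```python
import hashlib
import hmac
import os

# Assumed parameters for this sketch; tune them for a real deployment.
PBKDF2_ITERATIONS = 200_000
SALT_BYTES = 16


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key); only these are stored, never the password."""
    salt = os.urandom(SALT_BYTES)
    key = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key from the candidate password and compare in constant time."""
    key = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return hmac.compare_digest(key, expected_key)
```

A data store that keeps only `(salt, derived_key)` cannot reveal the original password through a simple dump of its contents, although weak passwords remain open to offline guessing.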
Consumer Data Archive
Data confidentiality and integrity: encryption and signatures
Availability: proof of data possession
Consistency: periodic audit and hash chains
Collusion: policy based encryption
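The "periodic audit and hash chains" idea can be sketched as a transaction log in which every entry commits to the hash of the previous one, so an auditor can detect edits to earlier records. This is an illustrative sketch only: the entry layout, the `GENESIS` placeholder, and the function names are assumptions for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the record."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append(chain: list, record: dict) -> None:
    """Append a record, linking it to the hash of the current last entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "hash": entry_hash(prev_hash, record)})


def verify(chain: list) -> bool:
    """Periodic audit: recompute every link and report whether all of them match."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```

This sketch detects modification of existing entries but not truncation of the tail; an auditor would also need to hold the most recent hash out of band.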
Data Poisoning
Adversary may tamper with device or software: tamper-proof software
Adversary may clone fake devices: trust certificate and trusted devices
Adversary may directly control source of data: analytics to detect outliers
Adversary may compromise data in transmission: cryptographic protocols
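As one possible reading of "analytics to detect outliers", the sketch below flags sensor readings whose modified z-score, based on the median absolute deviation (MAD), exceeds a threshold. The choice of this statistic, the threshold of 3.5, and the function name `mad_outliers` are assumptions for illustration; the slides do not prescribe a particular method.

```python
import statistics


def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    median = statistics.median(values)
    deviations = [abs(v - median) for v in values]
    mad = statistics.median(deviations)
    if mad == 0:
        return []  # no spread to measure against; this sketch flags nothing
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    return [v for v in values if 0.6745 * abs(v - median) / mad > threshold]


readings = [20.1, 19.8, 20.3, 20.0, 19.9, 85.0, 20.2]
suspicious = mad_outliers(readings)
```

A median-based score is used here because a poisoned reading shifts the median far less than it shifts the mean.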
Fraud Detection
Security of the infrastructure: discussed before
Security of the monitoring code itself: secure coding practices
Security of the input sources: discussed before
Adversary may cause data poisoning: analytics to detect outliers
Consumer Data Privacy
Exploiting vulnerability at host: encryption of data at rest, access control and authorization mechanisms
Insider threat: separation of duty principles, clear policy for logging access to datasets
Outsourcing analytics to untrusted partners; unintended leakage through sharing of data: awareness of re-identification issues, differential privacy
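To make the mention of differential privacy concrete, here is a minimal sketch of the Laplace mechanism applied to a counting query. A count changes by at most 1 when one record is added or removed, so noise with scale 1/epsilon is added. The function name `private_count`, its parameters, and the sample data are hypothetical.

```python
import random


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching predicate (Laplace mechanism)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for r in records if predicate(r))
    scale = 1.0 / epsilon  # a counting query has sensitivity 1
    # The difference of two i.i.d. exponential samples is Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


ages = [23, 35, 41, 29, 52, 67, 38]
noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5)
```

Smaller values of `epsilon` add more noise and give stronger privacy. Repeated queries consume privacy budget, which this sketch does not track.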
Data Integrity and Privacy
Enforcing access control: identity and attribute-based encryption
Search and filter: encryption techniques supporting search and filter
Outsourcing of computation: fully homomorphic encryption
Integrity of data and preservation of anonymity: group signatures with trusted third parties
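Full searchable encryption schemes are beyond a short example, but the idea behind "encryption techniques supporting search and filter" can be hinted at with a keyed keyword index. In this simplified sketch the server stores only HMAC tokens of keywords and matches a query token without seeing the keyword itself. This is a stand-in chosen for illustration, not a scheme named in the slides. Deterministic tokens reveal which documents share a keyword, and encryption of the documents themselves is omitted. All function names are hypothetical.

```python
import hashlib
import hmac


def search_token(key: bytes, keyword: str) -> str:
    """Derive a keyed token for a keyword; without the key it cannot be recomputed."""
    normalized = keyword.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


def build_index(key: bytes, documents: dict) -> dict:
    """Client side: map each keyword token to the ids of documents containing it."""
    index = {}
    for doc_id, text in documents.items():
        for word in set(text.split()):
            index.setdefault(search_token(key, word), set()).add(doc_id)
    return index


def search(index: dict, token: str) -> set:
    """Server side: look up a token without learning the underlying keyword."""
    return index.get(token, set())
```

The data owner would call `build_index` before uploading, then send `search_token(key, "payment")` as the query.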
Data Privacy
Keeping track of secrecy requirements of individual data elements: pick right level of granularity (row level, column level, cell level)
Maintaining access labels across analytical transformations: at the minimum, conform to lattice of access restrictions. More sophisticated data transforms are being considered in active research
Keeping track of roles and authorities of users: authentication, authorization, mandatory access control
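"Conform to lattice of access restrictions" can be read as follows: data derived from several inputs carries the least upper bound (join) of their labels, and a user may read it only if the user's clearance dominates that label. The sketch below assumes labels made of a numeric level plus a set of compartments; this label model and all names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    level: int  # higher means more restricted
    compartments: frozenset = frozenset()

    def dominates(self, other: "Label") -> bool:
        """True if this label is at least as restrictive as the other."""
        return self.level >= other.level and self.compartments >= other.compartments

    def join(self, other: "Label") -> "Label":
        """Least upper bound of two labels in the lattice."""
        return Label(max(self.level, other.level),
                     self.compartments | other.compartments)


def derived_label(input_labels) -> Label:
    """Label for the output of a transformation over the given inputs."""
    result = Label(0)
    for label in input_labels:
        result = result.join(label)
    return result


def can_read(clearance: Label, data: Label) -> bool:
    """A user may read data only if their clearance dominates its label."""
    return clearance.dominates(data)
```

For example, joining a cell labelled level 1 with compartment `sales` and a cell labelled level 2 with compartment `health` yields a level 2 result that requires both compartments.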
Audit of usage, pricing, billing
Completeness of audit information; timely access to audit information; integrity of audit information; authorized access to audit information: infrastructure solutions as discussed before, scaling of SIEM tools
Keeping track of ownership of data: pricing, audit
Secure collection of data: authentication techniques
Consistency of data and metadata: message digests
Insider threats: access control through systems and cryptography
Thank You