NoSQL Database Systems and their Security Challenges




Morteza Amini, amini@sharif.edu
Data & Network Security Lab (DNSL), Department of Computer Engineering, Sharif University of Technology
September 2015, 12th ISC International Conference on Information Security and Cryptology (ISCISC 15)

Talk Outline
- Introduction
- NoSQL vs. Relational Databases
- Types of NoSQL Databases
- NoSQL Security Challenges

Introduction

Current Trends
The new generation of applications, such as cloud or Grid apps, Business Intelligence, Web 2.0, and social networking, requires storing and processing terabytes and even petabytes of data.

Today we have:
- More users
- More data
- Interactive apps

Today
- The requirements of database storage systems have changed.
- Relational databases are not suitable.
- Distributed storage and processing is needed.
- NoSQL = Not Only SQL

NoSQL vs. Relational Databases

Why are relational databases not suitable?
A relational database is a data structure that allows you to link information from different tables.

Why are relational databases not suitable? Pros:
- Well developed to meet confidentiality, availability, and integrity requirements
- Work best with structured data
- Use a standard query language
- ACID
- Very good for small datasets

Why are relational databases not suitable? Cons:
- Scaling: they rely on scaling up rather than scaling out
- Large feature set
- Non-linear query execution time
- Static schema

Reasons for Distributed Storage and Processing
- Take advantage of multiple systems as well as multi-core CPU architectures.
- Servers have to be globally distributed for low latency and failover.

Characteristics of NoSQL Databases
- NoSQL databases were designed to address the Big Data problem by using distributed, collaborating hosts to achieve satisfactory performance in data storage and retrieval.
- Mostly non-relational: no joins, unstructured data.
- Provide great performance, availability, scalability, and flexibility.
- Distribution, replication, failover.

NoSQL Trend
- The term NoSQL was first used in 1998 as the name of a file-based database that omitted the use of SQL.
- The term was picked up again in 2009.
- It became a serious competitor to the term RDB.
[Figure: Google Trends chart comparing "Relational databases" and "NoSQL databases"]

Characteristics of NoSQL Databases (continued)
- Provide a BASE (Basically Available, Soft state, Eventually consistent) system rather than ACID as in a relational database management system.
- Schema-free.
- Easy replication support; run well on clusters.
- Simple API.

CAP Theorem
Any shared-data system can have at most two of these properties:
- Consistency: all nodes always have the same view of the data.
- Availability: every request (read and write) receives a response.
- Partition tolerance: the system works well despite physical network partitions.
Examples:
- AP: Voldemort (key-value), CouchDB (document), Riak (key-value)
- CA: relational databases, Vertica (column-oriented), Greenplum (relational)
- CP: BigTable (column-oriented), MongoDB (document)

Types of NoSQL Databases

NoSQL Data Models
There are more than 5 NoSQL databases

Major Companies using NoSQL Databases [Fidelis Cybersecurity 2014]

NoSQL Data Models
- Key-value
- Column-wide
- Document
- Graph

Key-value Stores
Work by matching keys with values, similar to a dictionary.
Pros:
- Very fast
- Very scalable
- Simple model
- Able to distribute horizontally
Cons:
- Many data structures (objects) cannot easily be modeled as key-value pairs.

Key-value Stores (figure source): http://www.thoughtworks.com/insights/blog/nosql-databases-overview
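The dictionary analogy and the horizontal distribution mentioned above can be illustrated with a small sketch. It is not taken from the slides: the class name `ShardedKeyValueStore` and the hash-modulo placement are assumptions chosen for brevity, and real systems typically use more elaborate schemes such as consistent hashing.

```python
import hashlib


class ShardedKeyValueStore:
    """Toy key-value store that spreads keys over several in-memory shards.

    Hypothetical illustration only: each shard stands in for one node.
    """

    def __init__(self, shard_count=4):
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # Hash the key so that placement is deterministic for a given key.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, value):
        self.shards[self._shard_for(key)][key] = value

    def get(self, key, default=None):
        return self.shards[self._shard_for(key)].get(key, default)


store = ShardedKeyValueStore()
store.put("user:42", {"name": "Alice"})
print(store.get("user:42"))
```

Lookups only ever touch one shard, which is why this model scales out easily; the price is that any query other than "get by key" has no natural home in the model.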


Column-Wide Stores
- Resemble relational databases but are not the same.
- Each key (row key) is associated with one or more columns.
- Columns store their data in a way that can be aggregated efficiently, reducing I/O activity.
- Suitable for data mining and analytic systems.
Figure source: http://www.thoughtworks.com/insights/blog/nosql-databases-overview
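As a rough sketch of why column-oriented layout helps aggregation, the toy store below keeps all values of one column together, so summing a column never touches other columns. The class `ColumnFamilyStore` and its methods are hypothetical and do not correspond to the API of BigTable, Cassandra, or HBase.

```python
from collections import defaultdict


class ColumnFamilyStore:
    """Toy column-wide store: column family -> column -> row key -> value."""

    def __init__(self):
        self.data = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        self.data[family][column][row_key] = value

    def get_row(self, row_key):
        # Reassemble one row by visiting every column that holds the key.
        row = {}
        for family, columns in self.data.items():
            for column, cells in columns.items():
                if row_key in cells:
                    row[f"{family}:{column}"] = cells[row_key]
        return row

    def sum_column(self, family, column):
        # Aggregation reads a single column only, not whole rows.
        return sum(self.data[family][column].values())


events = ColumnFamilyStore()
events.put("row1", "metrics", "clicks", 3)
events.put("row2", "metrics", "clicks", 4)
events.put("row1", "profile", "name", "Ann")
print(events.sum_column("metrics", "clicks"))
```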


Document Stores
- The data is stored in the form of documents in a standard format (XML, PDF, JSON, etc.).
- More flexible because of their lack of schema.
- Documents may contain only the filled, relevant fields and leave out empty and null ones, which saves some storage space.
Figure source: http://www.thoughtworks.com/insights/blog/nosql-databases-overview
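The point about leaving out empty fields can be shown with plain JSON. This is a minimal sketch under the assumption that documents are Python dictionaries serialized with the standard `json` module; the helper `compact` and the sample records are made up for illustration.

```python
import json


def compact(document):
    """Return a copy of a document without dictionary fields set to None."""
    if isinstance(document, dict):
        return {k: compact(v) for k, v in document.items() if v is not None}
    if isinstance(document, list):
        return [compact(item) for item in document]
    return document


# Two documents in the same collection with different sets of fields.
records = [
    {"title": "Book A", "year": 1995, "isbn": None},
    {"title": "Film B", "director": "Someone", "year": None},
]
stored = [json.dumps(compact(doc), sort_keys=True) for doc in records]
for line in stored:
    print(line)
```

Because no schema forces every document to carry every field, the two stored documents simply have different keys.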


Graph Stores
- Save the data in the form of a graph.
- The nodes are objects and the edges are relations between the objects.
- The main goal of a graph store is to know the relations between nodes and how the nodes are interconnected.
Figure source: http://scraping.pro/where-nosql-practically-used/
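A small sketch of the node/edge model follows, with a breadth-first search standing in for the kind of "how are these nodes connected" question a graph store answers. The class `GraphStore`, the directed-edge assumption, and the sample relation names are illustrative only and are not tied to any particular graph database.

```python
from collections import deque


class GraphStore:
    """Toy directed graph: node -> list of (relation, neighbour)."""

    def __init__(self):
        self.edges = {}

    def add_edge(self, source, relation, target):
        self.edges.setdefault(source, []).append((relation, target))
        self.edges.setdefault(target, [])

    def neighbours(self, node, relation=None):
        return [t for r, t in self.edges.get(node, [])
                if relation is None or r == relation]

    def shortest_path(self, start, goal):
        # Breadth-first search returns a path with the fewest edges, or None.
        if start not in self.edges or goal not in self.edges:
            return None
        queue = deque([[start]])
        visited = {start}
        while queue:
            path = queue.popleft()
            if path[-1] == goal:
                return path
            for nxt in self.neighbours(path[-1]):
                if nxt not in visited:
                    visited.add(nxt)
                    queue.append(path + [nxt])
        return None


g = GraphStore()
g.add_edge("alice", "follows", "bob")
g.add_edge("bob", "follows", "carol")
print(g.shortest_path("alice", "carol"))
```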

Which one is the best?
It depends on the application requirements [Fidelis Cybersecurity 2014]:
- Size of data
- Complexity
- CAP theory
- Format of data

NoSQL Security Challenges

NoSQL Security
- Most NoSQL databases do not provide any security features embedded in the database itself.
- Developers need to impose security in the middleware.
- NoSQL databases inherit the security issues that affected RDBMSs and add new ones introduced by their new features.

NoSQL Security
Security may be difficult because of:
- The unstructured (dynamic) nature of the data stored in these databases
- The distributed environment
- The cost of security versus performance
- The lack of strong consistency

NoSQL Major Security Challenges
- Threats Posed By Distributed Environments
- Authorization and Access Control
- Safeguarding Integrity
- Protection of Data at Rest
- User Data Privacy


Threats Posed By Distributed Environments
- Distributed environments increase the attack surface across several distributed nodes.
- Compromised clients: malicious data gets propagated from a single compromised location.
- Protecting nodes, name servers, and clients becomes difficult, especially when there is no central security management point.
- Vulnerabilities of the gossip-based membership protocol in Cassandra and Dynamo [Aniello, et al. 2013]: zombie nodes and ghost nodes.


Authorization and Access Control
Two important challenges:
- Possibility: how to define security policies for schema-less or dynamic-schema databases?
- Performance: availability vs. access control overhead. How to manage the cost of access control?

Authorization and Access Control
- Fine-grained (row- or column-level) access control: heterogeneous data is stored together in one database, as opposed to relational models, which conform to defined schemas and tables that store only related data.
- The schema-less nature of NoSQL databases does not allow fine-grained access control. We need forward-looking security.
- Most of them allow column-family-level authorization.

NoSQL DBMS   Granularity            Explanation
BigTable     Column family          Using ACL
Cassandra    Column family          Using IAuthorizer API
HBase        Column family / cell   Group-based authorization
Accumulo     Cell                   Using Visibility field

Authorization and Access Control (continued)
- Using a top-level ontology for access policy specification.
- Example security labels on data items: Secret, Confidential, Unclassified.

Authorization and Access Control (continued)
- Grouping data with the same security level (node level).
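Cell-level labelling of the kind listed for Accumulo can be sketched as follows. This is a simplified illustration, not Accumulo's visibility-expression syntax or API: it assumes a strict ordering Unclassified < Confidential < Secret (the labels shown on the slides) and a hypothetical helper `readable_cells` that filters a row by the reader's clearance.

```python
# Assumed total order of the labels that appear on the slides.
LEVELS = {"Unclassified": 0, "Confidential": 1, "Secret": 2}


def readable_cells(row, clearance):
    """Return only the cells whose label does not exceed the clearance.

    `row` maps a column name to a (value, label) pair.
    """
    limit = LEVELS[clearance]
    return {
        column: value
        for column, (value, label) in row.items()
        if LEVELS[label] <= limit
    }


patient = {
    "name": ("Ann", "Unclassified"),
    "diagnosis": ("flu", "Confidential"),
    "genome": ("ACGT", "Secret"),
}
print(readable_cells(patient, "Confidential"))
```

Attaching the label to each cell rather than to a table is what makes the check possible without a fixed schema, at the cost of one comparison per cell read.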

Authorization and Access Control
- Inference control: access control on aggregated data, especially in column-wide databases and time-series databases.
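One classic inference-control measure for aggregate queries is query-set-size restriction: an aggregate is refused when too few records contribute to it. The sketch below illustrates that idea only; the function `guarded_average` and the threshold of 5 are assumptions, and this check alone does not stop more elaborate attacks that combine several permitted queries.

```python
def guarded_average(records, field, predicate, min_group_size=5):
    """Average `field` over matching records, refusing very small groups."""
    matching = [r[field] for r in records if predicate(r)]
    if len(matching) < min_group_size:
        # A tiny group would reveal (almost) individual values.
        raise PermissionError("query set too small; result withheld")
    return sum(matching) / len(matching)


staff = [{"dept": "a", "salary": 10} for _ in range(5)]
staff.append({"dept": "b", "salary": 99})
print(guarded_average(staff, "salary", lambda r: r["dept"] == "a"))
```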

Authorization and Access Control
Administration / access control management: how and where to grant database accesses.
- Local vs. global access policies and their possible conflicts.
- Centralized approach: single point of failure, availability issues.
- Distributed approach: consistency of distributed access rules.
- Semi-distributed approach.

Authorization and Access Control
- Cassandra: by default, there is no authorization. Privileged admins can grant privileges on resources to a selected user.
- MongoDB: by default, there is no authorization. It provides authorization on a per-database level by using a role-based approach.


Safeguarding Integrity
- Enforcing integrity constraints is much harder in a NoSQL database system.
- Consistency conflicts with availability and performance.
- Transactional integrity conflicts with the soft-state nature of these systems.
- How can integrity constraints be defined, given the schema-less nature? Which types of integrity constraints can be defined?
- How can they be enforced, given the absence of central control and the performance and availability issues?
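Since the database itself enforces no schema, one common workaround is to check constraints in the application or middleware before a write. The sketch below shows that approach under stated assumptions: the constraint format (field name mapped to an expected type and a required flag) and the function `validate` are invented for illustration and are not a feature of any specific NoSQL product.

```python
def validate(document, constraints):
    """Check a document against simple type and presence constraints.

    `constraints` maps a field name to (expected_type, required).
    Returns a list of error messages; an empty list means the write may go on.
    """
    errors = []
    for field, (expected_type, required) in constraints.items():
        if field not in document:
            if required:
                errors.append(f"missing required field: {field}")
        elif not isinstance(document[field], expected_type):
            errors.append(f"wrong type for field: {field}")
    return errors


USER_CONSTRAINTS = {"email": (str, True), "age": (int, False)}
print(validate({"email": "a@example.org", "age": "ten"}, USER_CONSTRAINTS))
```

Such checks run per client, so they do not solve the central-control problem raised above: every writer has to apply the same rules.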


Protection of Data at Rest
- Encryption is widely regarded as the de facto standard for safeguarding data in storage.
- Most industry solutions offering encryption services lack the horizontal scaling and transparency required in the NoSQL environment.
- Only a few categories of NoSQL databases provide mechanisms to protect data at rest by employing encryption techniques.
- We need lightweight cryptography!

Protection of Data at Rest
- Cassandra: use Transparent Data Encryption (TDE) to protect data that is written to disk. The commit log is not encrypted at all.
- MongoDB: data files are never encrypted.


User Data Privacy
- Privacy is the main challenge of Web 2.0 and virtual social networks.
- NoSQL databases hold large amounts of user-related sensitive information.
- Which kinds of methods are applicable in practice for NoSQL databases? Access control, encryption, anonymization, ...
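As one concrete example from the family of anonymization methods, identifiers can be replaced by keyed hashes before documents are stored. The sketch below uses HMAC-SHA256 from the Python standard library; the function name `pseudonymize` and the sample key are assumptions. Strictly speaking this is pseudonymization rather than full anonymization, because whoever holds the key can recompute the mapping.

```python
import hashlib
import hmac


def pseudonymize(value, secret_key):
    """Replace an identifier with a keyed, deterministic pseudonym."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


KEY = b"example-key-kept-outside-the-database"  # placeholder, not a real secret
record = {"user": pseudonymize("alice@example.org", KEY), "likes": 12}
print(record["user"][:16])
```

The same input always maps to the same pseudonym, so records for one user can still be linked for analytics without storing the raw identifier.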

NoSQL Minor Security Challenges
- Authentication (users and clients)
- Audit and logging
- Protection of data in motion
- API security

Authentication
- Some NoSQL databases enforce an authentication mechanism at the local node level, but fail to enforce authentication across all commodity servers.
- Cassandra: by default, there is no authentication. It has PasswordAuthenticator and can further provide Kerberos authentication.
- MongoDB: by default, there is no authentication. It supports authentication on a per-database level.

Audit and Logging
- NoSQL databases have poor logging and log analysis methods.
- Cassandra: auditing is available in Enterprise Cassandra. Filters are available for logging.
- MongoDB: far behind in implementing the desired security logging and monitoring.

Protection of Data in Motion
- Communication between clients and nodes (traditional issue)
- Communication between nodes: RPC over TCP/IP

Protection of Data in Motion
Communications should be encrypted.
- Cassandra, client-node communications: not encrypted by default. SSL can be configured.
- Cassandra, inter-node communications: not encrypted by default. SSL can be configured.
- MongoDB, client-node communications: SSL requires recompiling MongoDB with the "--ssl" option.
- MongoDB, inter-node communications: encryption is not supported.
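On the client side, encrypting client-node traffic usually comes down to handing the database driver a properly configured TLS context. The sketch below builds one with Python's standard `ssl` module; the function name `make_client_tls_context` and the TLS 1.2 floor are assumptions, and how the context is passed to a particular driver is deliberately left out because it differs per product.

```python
import ssl


def make_client_tls_context(ca_file=None):
    """Build a client-side TLS context that verifies the server certificate.

    `ca_file` is an optional path to a CA bundle; when omitted, the
    system default trust store is used.
    """
    context = ssl.create_default_context(cafile=ca_file)
    # Refuse protocol versions older than TLS 1.2 (an assumed policy).
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

`ssl.create_default_context` already enables certificate verification and hostname checking, which matters as much as encryption itself: an unverified channel can still be intercepted by a node impersonating the server.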

API Security
- APIs can be subjected to several attacks, such as code injection, buffer overflows, and command injection, as they access the NoSQL databases.
- Server-Side JavaScript Injection (SSJS)
- Schema injection / query injection / JSON injection
- Vulnerable example in PHP, where the request parameter is concatenated into a JavaScript $where function:
    $query = 'function() {var search_year = \'' . $_GET['year'] . '\';' .
        'return this.publicationyear == search_year ||' .
        ' this.filmingyear == search_year ||' .
        ' this.recordingyear == search_year;}';
    $cursor = $collection->find(array('$where' => $query));
- DoS attack: http://server/app.php?year=1995';while(1);var%20foo='bar

API Security
- Vulnerable to injection.
- MongoDB: the $where operator can be used for injection.
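A common way to avoid this class of injection is to validate the input strictly and to express the query as structured data instead of building JavaScript text for `$where`. The sketch below only constructs the query document and does not call any driver; the function `build_year_query` is hypothetical, the field names are the ones from the PHP example, and storing years as integers is an assumption.

```python
import re

YEAR_PATTERN = re.compile(r"[0-9]{4}")


def build_year_query(raw_year):
    """Turn an untrusted request parameter into a structured query document."""
    if not isinstance(raw_year, str) or not YEAR_PATTERN.fullmatch(raw_year):
        raise ValueError("year must be exactly four digits")
    year = int(raw_year)
    # Plain field comparisons: no user-supplied text is ever evaluated as code.
    return {
        "$or": [
            {"publicationyear": year},
            {"filmingyear": year},
            {"recordingyear": year},
        ]
    }


print(build_year_query("1995"))
```

With this shape, the DoS payload from the slide is rejected at validation time, and even a value that passed validation would only ever be compared as data.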

Summary
- NoSQL database systems are for unstructured and big data.
- Main features: performance, availability, scalability.
- NoSQL security challenges:
  - Threats posed by their distributed nature
  - Fine-grained authorization and inference control
  - Integrity constraint definition and control
  - Lightweight, transparent encryption of data at rest
  - User privacy
  - ...

Some References
[Aniello, et al. 2013] L. Aniello, S. Bonomi, M. Breno, R. Baldoni, Assessing Data Availability of Cassandra in the Presence of non-accurate Membership, The 2nd International Workshop on Dependability Issues in Cloud Computing, 2013.
[Kadebu, et al. 2014] P. Kadebu, I. Mapanga, A Security Requirements Perspective towards a Secured NOSQL Database Environment, International Conference of Advance Research and Innovation, 2014.
[Noiumkar, et al. 2014] P. Noiumkar, T. Chomsiri, A Comparison the Level of Security on Top 5 Open Source NoSQL Databases, The 9th International Conference on Information Technology and Applications, 2014.
[Fidelis Cybersecurity 2014] Current Data Security Issues of NoSQL Databases, Fidelis Cybersecurity, 2014.
[Okman, et al. 2011] L. Okman, N. Gal-Oz, Y. Gonen, E. Gudes, J. Abramov, Security Issues in NoSQL Databases, International Joint Conference of IEEE TrustCom-11/IEEE ICESS-11/FCST-11, 2011.
[Shermin 2013] M. Shermin, An Access Control Model for NoSQL Databases, M.Sc. thesis, The University of Western Ontario, 2013.
[Ron, et al. 2015] A. Ron, A. Shulman-Peleg, E. Bronshtein, No SQL, No Injection? Examining NoSQL Security, The 9th Workshop on Web 2.0 Security and Privacy, 2015.
[Rong, et al. 2013] C. Rong, Z. Quan, A. Chakravorty, On Access Control Schemes for Hadoop Data Storage, International Conference on Cloud Computing and Big Data, 2013.

Thanks for your attention. Any questions?
amini@sharif.edu
Thanks to Ms. Dolatnezhad for helping to prepare this presentation.