BBM467 Data Intensive Applications
1 Hacettepe Üniversitesi Bilgisayar Mühendisliği Bölümü (Hacettepe University, Department of Computer Engineering) BBM467 Data Intensive Applications Dr. Fuat Akal
2 Foundations of Data[base] Clusters: Database Clusters, Hardware Architectures, Data Design Schemes, Replication Schemes, Query Parallelism, Logical Cluster Organization, Replication Management
3 Database Clusters A cluster of computers can be thought of as a single computing resource. It utilizes multiple machines to provide a more powerful computing environment through a single system image. There are two types of clusters: high availability (HA) clusters and high performance computing (HPC) clusters.
4 Hardware Architectures: Shared Memory All processors have access to the main memory and the disks. The processors are tightly coupled inside the same box and interconnected with a special switch. Interprocess communication is done through shared memory. The shared-memory approach is simple and allows for load balancing, and inter-query parallelism comes for free. However, it is expensive, since it requires a special interconnect among the processors, and its performance and scalability are limited by the available memory and communication bandwidth. [Figure: processors (P) sharing memory (M) and disks (D)]
5 Hardware Architectures: Shared Disk In the shared-disk approach, all processors have their own memory, but they share the disks. Interprocess communication occurs over a common high-speed bus. The approach provides high availability: all data is still accessible even when a node fails. Since each node has its own data cache, cache coherency must be maintained, e.g. by means of a lock manager, which reduces performance. Shared-disk systems have limited scalability due to the bandwidth of the high-speed bus and potential bottlenecks in the shared hardware. [Figure: processors (P) with private memory (M) sharing disks (D)]
6 Hardware Architectures: Shared Nothing In a shared-nothing architecture, each node is a complete stand-alone computer with its own memory and disk. The nodes are connected via a switch or LAN, but they do not share anything. The main advantages of such systems are very good scalability and high availability. However, data management is complicated and programming with this model is harder, because data partitioning and allocation become important. [Figure: nodes, each with its own processor (P), memory (M) and disk (D)]
7 Partitioning Schemes
Vertical Partitioning: Vertical partitioning divides the columns of a table into separate tables. It makes projections and joins easier and helps optimize cache access by reducing the size of the tuples. However, access to the whole table may be required anyway when executing queries.
Horizontal Partitioning: Horizontal partitioning divides a table along its tuples. Its basic advantage is that it allows parallel scans or projections.
Hash partitioning is based on a hash function that distributes the tuples according to a hashing key. It is useful for parallel exact-match queries and hash-join operations, but it is not appropriate for range queries or for operations on attributes other than the partitioning key.
Range partitioning is based on value intervals of the partitioning key. It suits the evaluation of range queries, and its performance depends on the interval size.
Round-robin partitioning distributes the tuples across the partitions in turn. This approach is also called striping. A number of logically consecutive tuples forms a striping unit, and the relative size of the striping unit directly affects performance. Small striping units result in more I/O parallelism for scans and long range queries. Larger striping units, on the other hand, may add latency to complete scans.
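The three horizontal schemes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the row shape `(key, payload)`, the function names, and the use of `key % n` as a stand-in for a real hash function are all hypothetical and do not come from the slides.

```python
import bisect
from typing import List, Sequence, Tuple

Row = Tuple[int, str]  # (partitioning key, payload); hypothetical row shape


def hash_partition(rows: Sequence[Row], n: int) -> List[List[Row]]:
    """Place each row in the partition selected by a hash of its key."""
    parts: List[List[Row]] = [[] for _ in range(n)]
    for row in rows:
        # Assumption: key % n stands in for a real hash function.
        parts[row[0] % n].append(row)
    return parts


def range_partition(rows: Sequence[Row], bounds: Sequence[int]) -> List[List[Row]]:
    """Place each row by key interval.

    `bounds` must be sorted; partition i holds keys below bounds[i]
    (and at or above bounds[i - 1]); the last partition holds the rest.
    """
    parts: List[List[Row]] = [[] for _ in range(len(bounds) + 1)]
    for row in rows:
        parts[bisect.bisect_right(bounds, row[0])].append(row)
    return parts


def round_robin_partition(rows: Sequence[Row], n: int, unit: int = 1) -> List[List[Row]]:
    """Deal rows out in turn, `unit` consecutive rows (one striping unit) at a time."""
    parts: List[List[Row]] = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[(i // unit) % n].append(row)
    return parts
```

With hash partitioning an exact-match lookup touches a single partition, while a key range is scattered over all of them. With range partitioning a key range maps to a few adjacent partitions. The `unit` argument of the round-robin variant corresponds to the striping unit size discussed above.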
8 Partitioning Schemes [Figure: an original table with columns A, B, C and its a) vertical partitioning, b) hash partitioning, c) range partitioning, d) round-robin partitioning]
9 Virtual Partitioning Virtual partitioning, also called query partitioning, assumes that all tables are fully replicated on each cluster node. In this approach, a query is decomposed into subqueries by appending range predicates to its WHERE clause, so that each subquery accesses only a small part of the data.
10 Virtual Partitioning (Example)
original query
SELECT Sum(L_ExtendedPrice*L_Discount) AS Revenue
FROM LineItem
WHERE L_Discount BETWEEN 0.03 AND 0.05
subquery1
SELECT Sum(L_ExtendedPrice*L_Discount) AS Revenue
FROM LineItem
WHERE L_Discount BETWEEN 0.03 AND 0.05
  AND L_OrderKey BETWEEN 0 AND ...
subquery2
SELECT Sum(L_ExtendedPrice*L_Discount) AS Revenue
FROM LineItem
WHERE L_Discount BETWEEN 0.03 AND 0.05
  AND L_OrderKey BETWEEN ... AND ...
[Figure: LineItem on node A, LineItem on node B]
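The decomposition step can be sketched as a small query rewriter. This is a minimal illustration, not the method used by any particular system: the function name `virtual_partition` is hypothetical, the key bounds passed in are made up (the transcript does not preserve the actual L_OrderKey bounds), and the query is assumed to already end in a WHERE clause.

```python
from typing import List


def virtual_partition(query: str, key: str, lo: int, hi: int, n: int) -> List[str]:
    """Split the inclusive key range [lo, hi] into n contiguous intervals and
    append one interval to the query's WHERE clause per subquery.

    Assumptions: `query` already ends in a WHERE clause, and n is not larger
    than the number of key values in [lo, hi].
    """
    total = hi - lo + 1
    subqueries = []
    start = lo
    for i in range(n):
        # Spread any remainder over the first intervals.
        size = total // n + (1 if i < total % n else 0)
        end = start + size - 1
        subqueries.append(f"{query} AND {key} BETWEEN {start} AND {end}")
        start = end + 1
    return subqueries


base = (
    "SELECT Sum(L_ExtendedPrice*L_Discount) AS Revenue "
    "FROM LineItem WHERE L_Discount BETWEEN 0.03 AND 0.05"
)
# Hypothetical key range 0..99, split for two nodes.
for sub in virtual_partition(base, "L_OrderKey", 0, 99, 2):
    print(sub)
```

Because SUM can be computed piecewise, the partial Revenue values returned by the subqueries would then be added together to obtain the result of the original query.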
11 Replication Schemes
Full Replication: Tables are duplicated on each cluster node. That is, each node holds an exact copy of the original database.
Partial Replication: Only parts of the original database are replicated on the different cluster nodes.
Mixed Replication: Full and partial replication are used at the same time.
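As a rough illustration of the difference between full and partial replication, the sketch below computes which tables each node would hold. The function names, the `copies` parameter, and the rotating placement rule are assumptions made for this example only; the slides do not prescribe a placement strategy.

```python
from typing import Dict, List, Set


def full_replication(tables: List[str], nodes: List[str]) -> Dict[str, Set[str]]:
    """Every node holds a copy of every table."""
    return {node: set(tables) for node in nodes}


def partial_replication(tables: List[str], nodes: List[str], copies: int) -> Dict[str, Set[str]]:
    """Each table is placed on `copies` nodes, chosen by rotating over the nodes.

    Assumption: copies <= len(nodes); the rotation is only one possible rule.
    """
    placement: Dict[str, Set[str]] = {node: set() for node in nodes}
    for i, table in enumerate(tables):
        for j in range(copies):
            placement[nodes[(i + j) % len(nodes)]].add(table)
    return placement
```

A mixed scheme could apply `full_replication` to some tables and `partial_replication` to the others, then merge the two placements per node.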
12 Replication Schemes [Figure: an original database and its a) full replication, b) partial replication, c) mixed replication]
13 Mixed Data Design Organize the cluster as node groups (NG) and design every NG freely. [Figure: a database cluster with a global database scheme and co-existing design schemes; Nodes 1 to 5 organized into Node Group 1, NG 2 and NG 3]
14 Query Parallelism in a Cluster
Inter-query parallelism: The capability of the database management system to accept queries from multiple users simultaneously. Each query is executed independently of the others.
Intra-query parallelism: Achieved by decomposing queries into subqueries and evaluating them simultaneously. It comes in inter-partition, intra-partition and hybrid forms.
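Intra-query, inter-partition parallelism can be sketched with Python's standard `concurrent.futures` module: one aggregate query is evaluated as one partial aggregate per partition, and the partial results are combined. The row shape `(price, discount)`, the function names, and the use of threads as stand-ins for cluster nodes are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

Row = Tuple[float, float]  # (price, discount); hypothetical row shape


def partial_revenue(partition: Sequence[Row]) -> float:
    """Evaluate the aggregate over a single partition (one subquery)."""
    return sum(p * d for p, d in partition if 0.03 <= d <= 0.05)


def parallel_revenue(partitions: List[Sequence[Row]]) -> float:
    """Run one subquery per partition concurrently and add up the partial sums."""
    # Threads only stand in for cluster nodes in this sketch.
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        return sum(pool.map(partial_revenue, partitions))
```

Inter-query parallelism, by contrast, would submit several independent queries to the pool, each evaluated on its own without any combining step.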
15 [Figure: queries Q1 to Q5 running over database partitions: a) inter-query, b) intra-query & intra-partition, c) intra-query & inter-partition, d) intra-query & intra-partition & inter-partition]
16 Logical Cluster Organization
Flat Cluster Architecture: Any cluster node is accessible by clients. The cluster forms a federated database of distinct databases running on independent servers, connected by a LAN and sharing no resources such as disks. It provides high availability and a simple design, but replication is difficult to implement with this model.
Middleware-Based Cluster Architecture: A client can only interact with the cluster through a coordination middleware, which is responsible for scheduling and routing the clients' requests. The middleware has knowledge of the underlying cluster, so it can be used to ensure correct execution of concurrent updates and reads. It can also improve overall throughput by choosing better components, e.g. those with less load, to perform client requests. However, it is a single point of failure: if the middleware fails, the cluster becomes useless. The middleware must be decentralized to improve scalability.
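The routing role of the middleware can be illustrated with a least-loaded policy. This is a minimal sketch under stated assumptions: the class name `CoordinationMiddleware`, its methods, and the load metric (number of requests in flight) are hypothetical, and a real middleware would also handle scheduling, consistency and failures.

```python
from typing import Dict, Iterable


class CoordinationMiddleware:
    """Routes each client request to the node with the fewest requests in flight."""

    def __init__(self, nodes: Iterable[str]) -> None:
        # Assumption: load is measured as the number of outstanding requests.
        self.load: Dict[str, int] = {node: 0 for node in nodes}

    def route(self, request: str) -> str:
        """Pick the least-loaded node for `request` and record the assignment."""
        node = min(self.load, key=self.load.get)  # ties go to the first node
        self.load[node] += 1
        return node

    def complete(self, node: str) -> None:
        """Record that a request on `node` has finished."""
        self.load[node] -= 1
```

All routing state lives in one object here, which mirrors the single point of failure noted above: if this one coordinator is lost, no client can reach the cluster.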
17 Logical Cluster Organization [Figure: clients accessing a database cluster in a) a flat architecture and b) a middleware-based architecture with a coordination middleware]
18 Replication Management Replication is an essential technique to improve availability and scalability by fully or partially duplicating data objects among the nodes of a distributed system. Replication management is responsible for maintaining the replicas and ensures the consistency of multiple copies of the same data object residing on different nodes. That is, replication management is more than simply copying data objects onto different nodes of a distributed system.
19 Synchronization of Updates There are two possibilities for the location of updates: updates can either be centralized on one primary copy, or be distributed over (a subset of) all replicas (update everywhere). Synchronization of updates can be done in two ways: eager and lazy. [Figure: a) primary copy, b) update everywhere; legend: update, propagation, updatable object, read-only object]
20 Synchronization of Updates
Eager (or synchronous) replication: All copies of an object are synchronized within the same database transaction. This allows early detection of conflicts and presents a simple solution to provide consistency. Its drawback is performance, due to the high communication overhead among the replicas and the high probability of deadlocks.
Lazy (or asynchronous) replication: Replica maintenance is decoupled from the original database transaction. The transactions that keep the replicas up to date and consistent run as separate and independent database transactions after the original transaction has committed. Compared to eager approaches, lazy approaches require additional effort to guarantee serializable executions.
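The difference between eager and lazy propagation can be sketched for the primary-copy case with plain dictionaries standing in for databases. The class name `PrimaryCopy`, its methods, and the `pending` queue are hypothetical; the sketch deliberately ignores transactions, conflicts, deadlocks and failures.

```python
from typing import Dict, List, Tuple


class PrimaryCopy:
    """One updatable primary and several read-only replicas."""

    def __init__(self, n_replicas: int, eager: bool) -> None:
        self.primary: Dict[str, int] = {}
        self.replicas: List[Dict[str, int]] = [{} for _ in range(n_replicas)]
        self.eager = eager
        # Updates applied on the primary but not yet propagated (lazy mode only).
        self.pending: List[Tuple[str, int]] = []

    def write(self, key: str, value: int) -> None:
        """Apply an update on the primary copy."""
        self.primary[key] = value
        if self.eager:
            # Eager: replicas are updated as part of the same operation.
            for replica in self.replicas:
                replica[key] = value
        else:
            # Lazy: propagation is decoupled and happens later.
            self.pending.append((key, value))

    def propagate(self) -> None:
        """Bring the replicas up to date, in the original update order."""
        for key, value in self.pending:
            for replica in self.replicas:
                replica[key] = value
        self.pending.clear()
```

In lazy mode a read from a replica between `write` and `propagate` returns stale data, which is the price paid for keeping replica maintenance out of the original update.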
21 Eager Primary Copy Replication
22 Eager Update Everywhere Replication
23 Lazy Primary Copy Replication with Immediate Updates
24 Lazy Primary Copy Replication with Deferred Updates
25 Lazy Update Everywhere Replication
Distributed Databases. Concepts. Why distributed databases? Distributed Databases Basic Concepts
Distributed Databases Basic Concepts Distributed Databases Concepts. Advantages and disadvantages of distributed databases. Functions and architecture for a DDBMS. Distributed database design. Levels of
More informationDistributed Databases
Distributed Databases Chapter 1: Introduction Johann Gamper Syllabus Data Independence and Distributed Data Processing Definition of Distributed databases Promises of Distributed Databases Technical Problems
More informationDistributed Systems LEEC (2005/06 2º Sem.)
Distributed Systems LEEC (2005/06 2º Sem.) Introduction João Paulo Carvalho Universidade Técnica de Lisboa / Instituto Superior Técnico Outline Definition of a Distributed System Goals Connecting Users
More informationBBM467 Data Intensive ApplicaAons
Hace7epe Üniversitesi Bilgisayar Mühendisliği Bölümü BBM467 Data Intensive ApplicaAons Dr. Fuat Akal akal@hace7epe.edu.tr Overview What is Cloud CompuAng? VirtualizaAon Service Oriented CompuAng What is
More informationClient/Server Computing Distributed Processing, Client/Server, and Clusters
Client/Server Computing Distributed Processing, Client/Server, and Clusters Chapter 13 Client machines are generally single-user PCs or workstations that provide a highly userfriendly interface to the
More informationCentralized Systems. A Centralized Computer System. Chapter 18: Database System Architectures
Chapter 18: Database System Architectures Centralized Systems! Centralized Systems! Client--Server Systems! Parallel Systems! Distributed Systems! Network Types! Run on a single computer system and do
More informationDISTRIBUTED AND PARALLELL DATABASE
DISTRIBUTED AND PARALLELL DATABASE SYSTEMS Tore Risch Uppsala Database Laboratory Department of Information Technology Uppsala University Sweden http://user.it.uu.se/~torer PAGE 1 What is a Distributed
More informationchapater 7 : Distributed Database Management Systems
chapater 7 : Distributed Database Management Systems Distributed Database Management System When an organization is geographically dispersed, it may choose to store its databases on a central database
More informationCluster Computing. ! Fault tolerance. ! Stateless. ! Throughput. ! Stateful. ! Response time. Architectures. Stateless vs. Stateful.
Architectures Cluster Computing Job Parallelism Request Parallelism 2 2010 VMware Inc. All rights reserved Replication Stateless vs. Stateful! Fault tolerance High availability despite failures If one
More informationParallel Databases. Parallel Architectures. Parallelism Terminology 1/4/2015. Increase performance by performing operations in parallel
Parallel Databases Increase performance by performing operations in parallel Parallel Architectures Shared memory Shared disk Shared nothing closely coupled loosely coupled Parallelism Terminology Speedup:
More informationDistributed Data Management
Introduction Distributed Data Management Involves the distribution of data and work among more than one machine in the network. Distributed computing is more broad than canonical client/server, in that
More informationChapter 18: Database System Architectures. Centralized Systems
Chapter 18: Database System Architectures! Centralized Systems! Client--Server Systems! Parallel Systems! Distributed Systems! Network Types 18.1 Centralized Systems! Run on a single computer system and
More informationWrite a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical
Identify a problem Review approaches to the problem Propose a novel approach to the problem Define, design, prototype an implementation to evaluate your approach Could be a real system, simulation and/or
More informationDWMiner : A tool for mining frequent item sets efficiently in data warehouses
DWMiner : A tool for mining frequent item sets efficiently in data warehouses Bruno Kinder Almentero, Alexandre Gonçalves Evsukoff and Marta Mattoso COPPE/Federal University of Rio de Janeiro, P.O.Box
More informationA Shared-nothing cluster system: Postgres-XC
Welcome A Shared-nothing cluster system: Postgres-XC - Amit Khandekar Agenda Postgres-XC Configuration Shared-nothing architecture applied to Postgres-XC Supported functionalities: Present and Future Configuration
More informationClient/Server and Distributed Computing
Adapted from:operating Systems: Internals and Design Principles, 6/E William Stallings CS571 Fall 2010 Client/Server and Distributed Computing Dave Bremer Otago Polytechnic, N.Z. 2008, Prentice Hall Traditional
More informationBBM467 Data Intensive ApplicaAons
Hace7epe Üniversitesi Bilgisayar Mühendisliği Bölümü BBM467 Data Intensive ApplicaAons Dr. Fuat Akal akal@hace7epe.edu.tr Problem How do you scale up applicaaons? Run jobs processing 100 s of terabytes
More informationHadoop MapReduce over Lustre* High Performance Data Division Omkar Kulkarni April 16, 2013
Hadoop MapReduce over Lustre* High Performance Data Division Omkar Kulkarni April 16, 2013 * Other names and brands may be claimed as the property of others. Agenda Hadoop Intro Why run Hadoop on Lustre?
More informationMeeting Your Scalability Needs with IBM DB2 Universal Database Enterprise - Extended Edition for Windows NT
IBM White Paper: IBM DB2 Universal Database on Windows NT Clusters Meeting Your Scalability Needs with IBM DB2 Universal Database Enterprise Extended Edition for Windows NT Is your decision support system
More informationDistributed Systems. REK s adaptation of Prof. Claypool s adaptation of Tanenbaum s Distributed Systems Chapter 1
Distributed Systems REK s adaptation of Prof. Claypool s adaptation of Tanenbaum s Distributed Systems Chapter 1 1 The Rise of Distributed Systems! Computer hardware prices are falling and power increasing.!
More informationPrinciples and characteristics of distributed systems and environments
Principles and characteristics of distributed systems and environments Definition of a distributed system Distributed system is a collection of independent computers that appears to its users as a single
More informationModule 14: Scalability and High Availability
Module 14: Scalability and High Availability Overview Key high availability features available in Oracle and SQL Server Key scalability features available in Oracle and SQL Server High Availability High
More informationDatabase Replication with Oracle 11g and MS SQL Server 2008
Database Replication with Oracle 11g and MS SQL Server 2008 Flavio Bolfing Software and Systems University of Applied Sciences Chur, Switzerland www.hsr.ch/mse Abstract Database replication is used widely
More informationTECHNIQUES FOR DATA REPLICATION ON DISTRIBUTED DATABASES
Constantin Brâncuşi University of Târgu Jiu ENGINEERING FACULTY SCIENTIFIC CONFERENCE 13 th edition with international participation November 07-08, 2008 Târgu Jiu TECHNIQUES FOR DATA REPLICATION ON DISTRIBUTED
More informationCloud Computing at Google. Architecture
Cloud Computing at Google Google File System Web Systems and Algorithms Google Chris Brooks Department of Computer Science University of San Francisco Google has developed a layered system to handle webscale
More informationSurvey on Comparative Analysis of Database Replication Techniques
72 Survey on Comparative Analysis of Database Replication Techniques Suchit Sapate, Student, Computer Science and Engineering, St. Vincent Pallotti College, Nagpur, India Minakshi Ramteke, Student, Computer
More informationMobile and Heterogeneous databases Database System Architecture. A.R. Hurson Computer Science Missouri Science & Technology
Mobile and Heterogeneous databases Database System Architecture A.R. Hurson Computer Science Missouri Science & Technology 1 Note, this unit will be covered in four lectures. In case you finish it earlier,
More informationDistributed Architectures. Distributed Databases. Distributed Databases. Distributed Databases
Distributed Architectures Distributed Databases Simplest: client-server Distributed databases: two or more database servers connected to a network that can perform transactions independently and together
More informationAn Overview of Distributed Databases
International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 4, Number 2 (2014), pp. 207-214 International Research Publications House http://www. irphouse.com /ijict.htm An Overview
More informationRCFile: A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems CLOUD COMPUTING GROUP - LITAO DENG
1 RCFile: A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems CLOUD COMPUTING GROUP - LITAO DENG Background 2 Hive is a data warehouse system for Hadoop that facilitates
More informationData Management in the Cloud
Data Management in the Cloud Ryan Stern stern@cs.colostate.edu : Advanced Topics in Distributed Systems Department of Computer Science Colorado State University Outline Today Microsoft Cloud SQL Server
More informationPrinciples of Distributed Database Systems
M. Tamer Özsu Patrick Valduriez Principles of Distributed Database Systems Third Edition
More informationHow To Understand The Concept Of A Distributed System
Distributed Operating Systems Introduction Ewa Niewiadomska-Szynkiewicz and Adam Kozakiewicz ens@ia.pw.edu.pl, akozakie@ia.pw.edu.pl Institute of Control and Computation Engineering Warsaw University of
More informationDistributed Operating Systems
Distributed Operating Systems Prashant Shenoy UMass Computer Science http://lass.cs.umass.edu/~shenoy/courses/677 Lecture 1, page 1 Course Syllabus CMPSCI 677: Distributed Operating Systems Instructor:
More informationApuama: Combining Intra-query and Inter-query Parallelism in a Database Cluster
Apuama: Combining Intra-query and Inter-query Parallelism in a Database Cluster Bernardo Miranda 1, Alexandre A. B. Lima 1,3, Patrick Valduriez 2, and Marta Mattoso 1 1 Computer Science Department, COPPE,
More informationSystem Aware Cyber Security Architecture
System Aware Cyber Security Architecture Rick A. Jones October, 2011 Research Topic DescripAon System Aware Cyber Security Architecture Addresses supply chain and insider threats Embedded into the system
More informationScalability of web applications. CSCI 470: Web Science Keith Vertanen
Scalability of web applications CSCI 470: Web Science Keith Vertanen Scalability questions Overview What's important in order to build scalable web sites? High availability vs. load balancing Approaches
More informationTier Architectures. Kathleen Durant CS 3200
Tier Architectures Kathleen Durant CS 3200 1 Supporting Architectures for DBMS Over the years there have been many different hardware configurations to support database systems Some are outdated others
More informationThe Oracle Universal Server Buffer Manager
The Oracle Universal Server Buffer Manager W. Bridge, A. Joshi, M. Keihl, T. Lahiri, J. Loaiza, N. Macnaughton Oracle Corporation, 500 Oracle Parkway, Box 4OP13, Redwood Shores, CA 94065 { wbridge, ajoshi,
More informationDatabase replication for commodity database services
Database replication for commodity database services Gustavo Alonso Department of Computer Science ETH Zürich alonso@inf.ethz.ch http://www.iks.ethz.ch Replication as a problem Gustavo Alonso. ETH Zürich.
More informationGeoGrid Project and Experiences with Hadoop
GeoGrid Project and Experiences with Hadoop Gong Zhang and Ling Liu Distributed Data Intensive Systems Lab (DiSL) Center for Experimental Computer Systems Research (CERCS) Georgia Institute of Technology
More informationDeveloping Scalable Java Applications with Cacheonix
Developing Scalable Java Applications with Cacheonix Introduction Presenter: Slava Imeshev Founder and main committer, Cacheonix Frequent speaker on scalability simeshev@cacheonix.com www.cacheonix.com/blog/
More informationF1: A Distributed SQL Database That Scales. Presentation by: Alex Degtiar (adegtiar@cmu.edu) 15-799 10/21/2013
F1: A Distributed SQL Database That Scales Presentation by: Alex Degtiar (adegtiar@cmu.edu) 15-799 10/21/2013 What is F1? Distributed relational database Built to replace sharded MySQL back-end of AdWords
More informationAchieving Mainframe-Class Performance on Intel Servers Using InfiniBand Building Blocks. An Oracle White Paper April 2003
Achieving Mainframe-Class Performance on Intel Servers Using InfiniBand Building Blocks An Oracle White Paper April 2003 Achieving Mainframe-Class Performance on Intel Servers Using InfiniBand Building
More informationAgenda. Enterprise Application Performance Factors. Current form of Enterprise Applications. Factors to Application Performance.
Agenda Enterprise Performance Factors Overall Enterprise Performance Factors Best Practice for generic Enterprise Best Practice for 3-tiers Enterprise Hardware Load Balancer Basic Unix Tuning Performance
More informationIn Memory Accelerator for MongoDB
In Memory Accelerator for MongoDB Yakov Zhdanov, Director R&D GridGain Systems GridGain: In Memory Computing Leader 5 years in production 100s of customers & users Starts every 10 secs worldwide Over 15,000,000
More informationHow To Virtualize A Storage Area Network (San) With Virtualization
A New Method of SAN Storage Virtualization Table of Contents 1 - ABSTRACT 2 - THE NEED FOR STORAGE VIRTUALIZATION 3 - EXISTING STORAGE VIRTUALIZATION METHODS 4 - A NEW METHOD OF VIRTUALIZATION: Storage
More informationDatabase Scalability {Patterns} / Robert Treat
Database Scalability {Patterns} / Robert Treat robert treat omniti postgres oracle - mysql mssql - sqlite - nosql What are Database Scalability Patterns? Part Design Patterns Part Application Life-Cycle
More informationDirect NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle
Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle Agenda Introduction Database Architecture Direct NFS Client NFS Server
More informationScality RING High performance Storage So7ware for Email pla:orms, StaaS and Cloud ApplicaAons
Scality RING High performance Storage So7ware for Email pla:orms, StaaS and Cloud ApplicaAons Friday, March 18, 2011 MARKET ExponenAal Storage Demand The Digital Universe: Growing by a factor of 44 in
More informationAzure Scalability Prescriptive Architecture using the Enzo Multitenant Framework
Azure Scalability Prescriptive Architecture using the Enzo Multitenant Framework Many corporations and Independent Software Vendors considering cloud computing adoption face a similar challenge: how should
More informationOptimizing Performance. Training Division New Delhi
Optimizing Performance Training Division New Delhi Performance tuning : Goals Minimize the response time for each query Maximize the throughput of the entire database server by minimizing network traffic,
More informationHow To Create A Multi Disk Raid
Click on the diagram to see RAID 0 in action RAID Level 0 requires a minimum of 2 drives to implement RAID 0 implements a striped disk array, the data is broken down into blocks and each block is written
More informationSCALABILITY AND AVAILABILITY
SCALABILITY AND AVAILABILITY Real Systems must be Scalable fast enough to handle the expected load and grow easily when the load grows Available available enough of the time Scalable Scale-up increase
More informationLecture on Storage Systems
Lecture on Storage Systems Network File Systems André Brinkmann Network File Systems Distributed File Systems NFS AFS Network A
More information<Insert Picture Here> Oracle In-Memory Database Cache Overview
Oracle In-Memory Database Cache Overview Simon Law Product Manager The following is intended to outline our general product direction. It is intended for information purposes only,
More information2.1 What are distributed systems? What are systems? Different kind of systems How to distribute systems? 2.2 Communication concepts
Chapter 2 Introduction to Distributed systems 1 Chapter 2 2.1 What are distributed systems? What are systems? Different kind of systems How to distribute systems? 2.2 Communication concepts Client-Server
More informationSymmetric Multiprocessing
Multicore Computing A multi-core processor is a processing system composed of two or more independent cores. One can describe it as an integrated circuit to which two or more individual processors (called
More informationFragmentation and Data Allocation in the Distributed Environments
Annals of the University of Craiova, Mathematics and Computer Science Series Volume 38(3), 2011, Pages 76 83 ISSN: 1223-6934, Online 2246-9958 Fragmentation and Data Allocation in the Distributed Environments
More informationStudy of Load Balancing of Resource Namespace Service
Study of Load Balancing of Resource Namespace Service Masahiro Nakamura, Osamu Tatebe University of Tsukuba Background Resource Namespace Service (RNS) is published as GDF.101 by OGF RNS is intended to
More informationPhysical Database Design and Tuning
Chapter 20 Physical Database Design and Tuning Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley 1. Physical Database Design in Relational Databases (1) Factors that Influence
More informationCHAPTER 1: OPERATING SYSTEM FUNDAMENTALS
CHAPTER 1: OPERATING SYSTEM FUNDAMENTALS What is an operating? A collection of software modules to assist programmers in enhancing efficiency, flexibility, and robustness An Extended Machine from the users
More informationDesign Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational Databases aka Just Enough Distributed Systems To Be Dangerous (in 40 minutes) Todd Lipcon (@tlipcon) Cloudera June 11, 2009 Introduction Common Underlying
More informationHighly Available Service Environments Introduction
Highly Available Service Environments Introduction This paper gives a very brief overview of the common issues that occur at the network, hardware, and application layers, as well as possible solutions,
More informationBig Data & Scripting storage networks and distributed file systems
Big Data & Scripting storage networks and distributed file systems 1, 2, adaptivity: Cut-and-Paste 1 distribute blocks to [0, 1] using hash function start with n nodes: n equal parts of [0, 1] [0, 1] N
More informationHadoop Cluster Applications
Hadoop Overview Data analytics has become a key element of the business decision process over the last decade. Classic reporting on a dataset stored in a database was sufficient until recently, but yesterday
More information1. Physical Database Design in Relational Databases (1)
Chapter 20 Physical Database Design and Tuning Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley 1. Physical Database Design in Relational Databases (1) Factors that Influence
More informationA Brief Analysis on Architecture and Reliability of Cloud Based Data Storage
Volume 2, No.4, July August 2013 International Journal of Information Systems and Computer Sciences ISSN 2319 7595 Tejaswini S L Jayanthy et al., Available International Online Journal at http://warse.org/pdfs/ijiscs03242013.pdf
More informationThis chapter introduces you to Microso2 Office Access 2013. The chapter focuses on what a database is, the components of a database, what a database
This chapter introduces you to Microso2 Office Access 2013. The chapter focuses on what a database is, the components of a database, what a database can do and how to create a database. 1 The objecaves
More informationDependable Systems. 9. Redundant arrays of. Prof. Dr. Miroslaw Malek. Wintersemester 2004/05 www.informatik.hu-berlin.de/rok/zs
Dependable Systems 9. Redundant arrays of inexpensive disks (RAID) Prof. Dr. Miroslaw Malek Wintersemester 2004/05 www.informatik.hu-berlin.de/rok/zs Redundant Arrays of Inexpensive Disks (RAID) RAID is
More informationProactive, Resource-Aware, Tunable Real-time Fault-tolerant Middleware
Proactive, Resource-Aware, Tunable Real-time Fault-tolerant Middleware Priya Narasimhan T. Dumitraş, A. Paulos, S. Pertet, C. Reverte, J. Slember, D. Srivastava Carnegie Mellon University Problem Description
More informationWeb Server Architectures
Web Server Architectures CS 4244: Internet Programming Dr. Eli Tilevich Based on Flash: An Efficient and Portable Web Server, Vivek S. Pai, Peter Druschel, and Willy Zwaenepoel, 1999 Annual Usenix Technical
More informationI N T E R S Y S T E M S W H I T E P A P E R INTERSYSTEMS CACHÉ AS AN ALTERNATIVE TO IN-MEMORY DATABASES. David Kaaret InterSystems Corporation
INTERSYSTEMS CACHÉ AS AN ALTERNATIVE TO IN-MEMORY DATABASES David Kaaret InterSystems Corporation INTERSYSTEMS CACHÉ AS AN ALTERNATIVE TO IN-MEMORY DATABASES Introduction To overcome the performance limitations
More informationParGRES: a middleware for executing OLAP queries in parallel
ParGRES: a middleware for executing OLAP queries in parallel Marta Mattoso 1, Geraldo Zimbrão 1,3, Alexandre A. B. Lima 1, Fernanda Baião 1,2, Vanessa P. Braganholo 1, Albino Aveleda 1, Bernardo Miranda
More informationCapacity Planning Process Estimating the load Initial configuration
Capacity Planning Any data warehouse solution will grow over time, sometimes quite dramatically. It is essential that the components of the solution (hardware, software, and database) are capable of supporting