A Novel Cloud Computing Data Fragmentation Service Design for Distributed Systems
Ismail Hababeh
School of Computer Engineering and Information Technology, German-Jordanian University, Amman, Jordan

Abstract- As many distributed database applications contain online information that changes continuously and expands incrementally, comprehensive cloud Application Programming Interfaces (APIs) are required to monitor and control the accuracy of the information and data proliferation. This software can be viewed as a set of integrated cloud computing services: data fragmentation, clustering of network sites, and fragment allocation, which together support transactional database applications. In this paper, we describe our data Fragmentation as a Service (FaaS) in the construction of a cloud computing software system. Specifically, we design a novel data fragmentation service to facilitate the processing of very large data volumes, and introduce functional enhancements to data distribution that improve cloud system performance. This research presents our attempt to implement a data fragmentation service in a cloud computing system, with large-scale data mining as the targeted application.

Keywords: SaaS, FaaS, CaaS, AaaS, DFA, API.

1 Introduction

Cloud computing is web-based system development in which highly scalable computing services are provided to users over the Internet. Cloud computing systems include web communications, Software as a Service (SaaS), and emerging tools, and have attracted considerable attention from researchers in different technology areas. Many cloud computing providers spread their data centers worldwide to maintain data availability, which is typically achieved by replication. Amazon's cloud Simple Storage Service [1] replicates data across different geographical regions so that data and applications remain available even if one location fails.
This is likely to help in running applications on data warehouses, but not transactional data management systems [2]. Yahoo [3] and Amazon [4] both implement data replication, through the PNUTS and SimpleDB cloud data services respectively, over distributed network sites. These services are designed to run analytical applications on data warehouses, but not transactional data applications. Similarly, Google [5] implements a replicated database, but it does not offer a complete relational Application Programming Interface and it weakens data atomicity. The cloud API is written as a series of XML-based messages and executed on the cloud servers to make use of remote web-based applications and to reduce the number of calls between the client and the distributed servers [6]. The Microsoft SQL Server [7] cloud data service is implemented over distributed network sites. However, because it does not apply commit protocols, the distributed system lacks data consistency. The researchers in [8] designed the H-Store project to minimize the number of transactions that access data from multiple locations. However, the project is still in the theoretical phase, and its feasibility on real-world distributed database systems has not been verified.

In distributed relational database systems, the transactions of the applications usually operate on subsets of relations (fragments), so using these fragments and distributing them over the network sites increases system throughput by means of parallel execution. Therefore, an efficient cloud API fragmentation web service is presented to access and manage data relationships and to enhance both the speed and the simplicity of distributed database functionality. This web service is used by external programs, such as Java applications, to retrieve raw data from the cloud data centers. Moreover, it helps to reduce the cost of accessing data over distributed network sites and increases distributed system performance through data allocation processes.

The remainder of the paper is organized as follows: related work is discussed in Section 2; Section 3 describes the data fragmentation architecture; the data fragmentation design is presented in Section 4; Section 5 presents the performance evaluation and experimental results; and finally Section 6 draws conclusions and outlines future work.

2 Related Work

Various strategies have been used to partition data across distributed systems. Several approaches distinguish three main partitioning categories: vertical, horizontal, and hybrid [9, 10]. Some have defined vertical fragmentation as a process of generating data record fragments [11, 12, 13, 14]. Other researchers have addressed the necessity of horizontal fragmentation [15, 16, 17], which makes data backup and restore much easier. Mixed or hybrid fragmentation, in which vertical fragmentation is followed by horizontal fragmentation or vice versa, has been covered by few studies [12, 18] because this type of fragmentation is intractable in relational distributed database systems.

The studies in [16] and [17] are by far the closest to our fragmentation method. The method in [16] considers each record of a relation as a fragment, so a large number of database fragments is generated and more communication cost is required to process them. In contrast, the approach in [17] uses the whole relation as a fragment; not all records of the fragment have to be retrieved or updated, and a selectivity matrix that indicates the percentage of a fragment accessed by a transaction is considered. However, this produces more redundant fragments, and the generated fragments overlap.

A key difference between our cloud API fragmentation method and the others is that it produces the minimum number of disjoint fragments that can be generated for each relation according to the query requirements. This fragmentation service is designed for cloud computing systems in order to reduce the communication cost over the cloud sites and increase distributed system throughput. Moreover, the generated disjoint fragments are then allocated to the cloud servers, which saves further communication cost.

3 Data Fragmentation Architecture

In distributed relational database systems, the complete database is not a suitable unit for distribution because it is too big, especially when considering information relevant to different cloud data centers. Therefore, it is appropriate to develop a web application programming interface service, specifically FaaS, that extracts the minimum number of disjoint sets of data records to be allocated to the cloud servers. The architecture of FaaS is characterized by the domain knowledge and three main processes: eliminating data redundancy, defining transactions, and fragmenting data records. Figure 1 describes the Data Fragmentation Architecture (DFA) service that will be used for generating data fragments, supporting the use of knowledge extraction, and helping to achieve the effective use of small fragments.

The domain knowledge in DFA describes and categorizes the essential and representative elements of distributed database systems, specifically for database fragmentation. The purpose of the DFA domain knowledge is to ensure that all data elements are available and consistent for the database fragmentation process. In addition, it is used to prepare data elements that are valid from one transaction to another, from one application to another, and from one database to another in distributed database systems. The details of this DFA are described and illustrated in the following section.

Figure 1. Data Fragmentation Architecture

4 Data Fragmentation Design

The requested data in DFA are identified by means of transactions triggered as queries, which determine the specific information that should be extracted from the cloud database servers. The executed transactions return redundant data records, because two or more different queries may require the same data records. The redundant data are eliminated, and the remaining data records are then partitioned so as to generate the minimum number of fragments, which are neither replicated nor intersected (disjoint).

4.1 Eliminating Data Redundancy

Different cloud APIs are developed to remove data redundancy from the distributed cloud servers. The following algorithm, based on the Primary Code Number (PCN), is designed to prevent replicated data from being entered into a database application that runs over cloud servers.

Eliminating Data Redundancy Algorithm:
Step 1: Build a map that stores the unique list of leads being inserted/updated, using the Primary Code Number as a key.
Step 2: Check each lead whose key is inserted/updated.
Step 3: If the key is a replication of another lead in this group, then do steps 4 and 5; else, do step 6.
Step 4: Create a single database query, using the lead map, to find all the leads in the database that have the same key.
Step 5: Issue a non-validation message.
Step 6: If the key is not replicated, then add this lead to the new lead map.
Step 7: End

In this cloud API, all transactions are processed and the redundant data records are eliminated. Thus, database applications become faster and more efficient, because they hold only the required data records to be accessed, processed, and allocated to the distributed cloud servers.

4.2 Defining Database Transactions

The data records requested by the clients determine the specific information extracted by the database queries. Database queries are executed as transactions from the applications at the distributed database system sites. The results of the transactions are sets of data records that may be fully intersected, partially intersected, or not intersected. Each data set consists of complete records.

Figure 2a. Defining Transactions

Figure 2b. Data Records Transactions

Figures 2a and 2b illustrate an example of defining and generating different sets of data records according to the definition of each transaction. In Figure 2b, there is a full intersection between data sets (4,10) over relation 3, and a partial intersection between data sets (1,5), (3,9), and (6,7) over relations 4, 2, and 1 respectively. On the other hand, there is no intersection between the data sets over relation 5.
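
The three cases named above (full, partial, and no intersection) can be illustrated with a minimal Python sketch. It assumes that each transaction result is a set of record identifiers and that "full intersection" means one set is entirely contained in the other; the paper does not define the term formally, so both the interpretation and the names `Overlap` and `classify_overlap` are assumptions made for illustration.

```python
from enum import Enum


class Overlap(Enum):
    FULL = "full"        # assumed meaning: one set is contained in the other
    PARTIAL = "partial"  # some, but not all, records are shared
    NONE = "none"        # no shared records


def classify_overlap(a: set, b: set) -> Overlap:
    """Classify how two transaction result sets over one relation overlap."""
    common = a & b
    if not common:
        return Overlap.NONE
    if common == a or common == b:
        return Overlap.FULL
    return Overlap.PARTIAL


# Hypothetical record identifiers, not taken from Figure 2b.
print(classify_overlap({1, 2, 3}, {1, 2, 3}).value)
print(classify_overlap({1, 2, 3}, {3, 4}).value)
print(classify_overlap({1, 2}, {3, 4}).value)
```

Only the fully and partially intersected pairs need further processing; sets with no intersection can be kept as fragments as they are.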
A cloud API fragmentation method is therefore developed to partition the database records and generate the minimum number of disjoint fragments, which will be allocated to the distributed cloud servers. The details of this approach are illustrated in the following section.

4.3 Fragmenting Data Records

The fragmentation process starts by looking for any two data sets over the same relation that have records in common. From any two intersected data sets, three disjoint fragments are generated: the intersection fragment, which holds the records common to both sets; the fragment that holds the records in the first set but not in the second; and the fragment that holds the records in the second set but not in the first. The two intersected sets are then deleted from the list of data sets. This process continues until no intersections remain between the data sets. The following fragmentation algorithm describes how disjoint fragments are generated from the intersected data records for each relation in the distributed database system (DDBS).

Intersected Data Fragmentation Algorithm:
k ← number of the last fragment in the database (0 at the beginning)
Repeat for all relations in the database
  Repeat for all data records S_i, S_j in each relation, where i ≠ j
    If S_i ∩ S_j ≠ Ø
      k ← k + 1
      F_k ← S_i ∩ S_j
      F_k+1 ← S_i - F_k
      F_k+2 ← S_j - F_k
      Delete S_i, S_j
    End if
  Until all intersected data records in each relation have been processed
Until all relations in the database have been processed
Rename the final fragments sequentially

The data sets over each relation that do not intersect with any other set are renamed as fragments and considered in the further fragmentation process. The following algorithm collects the non-intersected data records and adds them to the list of relation fragments.

Non-intersected Data Fragmentation Algorithm:
k ← number of the last fragment
Repeat for all relations in the database
  k ← k + 1
  F_k ← R - ∪ F_i (for all fragments F_i in relation R)
  If F_k ≠ Ø Then
    Add F_k to the collected fragments of relation R
  End if
Until all relations in the database have been processed

The same fragmentation process is applied to any two intersected fragments over the same relation. This re-fragmentation continues until no intersection between data fragments remains and the fragments are totally disjoint.

Figure 3 shows an example of generating disjoint fragments from the data sets over relation 2 in a DDBS. In this figure, data sets 1 and 2 over relation 2 share the data records 2, 3, 4, which are considered redundant data. The fragmentation process isolates the shared data records from both data sets and generates the following disjoint fragments: F1, which contains the shared data records 2, 3, 4, and F2, which contains the records 1, 5, 6, 7 that are in data set 1 but not in data set 2. Because data set 2 holds only the records already placed in F1, no third fragment is created, and data sets 1 and 2 are deleted.
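
The two algorithms above can be combined into one short Python sketch. It is an illustration under stated assumptions, not the paper's implementation: records are represented by integer identifiers, data sets are Python sets, and the function name `fragment_relation` is hypothetical. Intersected pairs are split repeatedly until all remaining sets are disjoint, and the records of the relation that no data set covers are then added as one more fragment.

```python
from typing import FrozenSet, Iterable, List


def fragment_relation(relation: Iterable[int],
                      data_sets: Iterable[Iterable[int]]) -> List[FrozenSet[int]]:
    """Return pairwise disjoint fragments covering every record of `relation`."""
    pending = [frozenset(s) for s in data_sets]
    pending = [s for s in pending if s]  # ignore empty query results

    changed = True
    while changed:
        changed = False
        for i in range(len(pending)):
            for j in range(i + 1, len(pending)):
                common = pending[i] & pending[j]
                if not common:
                    continue
                first, second = pending[i], pending[j]
                # Delete the two intersected sets from the list ...
                pending = [s for k, s in enumerate(pending) if k not in (i, j)]
                # ... and replace them with up to three disjoint pieces.
                pieces = (common, first - common, second - common)
                pending.extend(p for p in pieces if p)
                changed = True
                break
            if changed:
                break

    # Non-intersected step: records of the relation not covered so far.
    covered = frozenset().union(*pending)
    rest = frozenset(relation) - covered
    if rest:
        pending.append(rest)
    return pending


# Data sets from the Figure 3 example; the relation's full record
# list (1..9) is a hypothetical choice for illustration.
fragments = fragment_relation(range(1, 10), [{1, 2, 3, 4, 5, 6, 7}, {2, 3, 4}])
print(sorted(sorted(f) for f in fragments))
# [[1, 5, 6, 7], [2, 3, 4], [8, 9]]
```

Each split lowers the number of sets that share a record, so the loop terminates. Empty pieces are skipped, which matches the Figure 3 case in which no third fragment is created.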
The performance evaluation of the fragmentation method is presented in the following section.

Figure 3. Data Sets Fragmentation

5 Performance Evaluation and Experimental Results

The database fragmentation performance is computed by dividing the reduced storage size of each relation by the size of the relation's query records. The reduced storage size is the difference between the size of the query records and the size of the generated fragments of each relation in the DDBS. The experimental results demonstrate the usability and efficiency of the cloud API fragmentation method. When this method is tested with 45 queries over the 5 database relations that make up the whole database, 27 disjoint fragments are generated. Therefore, 18 data sets are omitted from the database, which eliminates data redundancy, saves data storage, minimizes the data transferred and processed, and thereby increases overall system performance. The fragmentation methods in [16] and [17] are implemented alongside our fragmentation method, and their performance results are compared with those of our cloud API fragmentation method in Figure 4.
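
The evaluation measure defined in this section can be written as a small Python function. This is a sketch of the stated formula only; the function name `storage_reduction_ratio` and the sizes used in the example are hypothetical and are not results from the paper.

```python
def storage_reduction_ratio(query_records_size: float,
                            fragments_size: float) -> float:
    """Reduced storage size divided by the size of the query records.

    Both sizes refer to one relation and must use the same unit
    (for example, number of records or bytes).
    """
    if query_records_size <= 0:
        raise ValueError("query_records_size must be positive")
    reduced_storage_size = query_records_size - fragments_size
    return reduced_storage_size / query_records_size


# Hypothetical relation: the queries request 1200 records in total,
# while the disjoint fragments store only 900 records.
print(storage_reduction_ratio(1200, 900))  # 0.25
```

A ratio of 0 means that fragmentation removed no redundancy; values closer to 1 mean that a larger share of the requested records was redundant.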

Figure 4. Fragmentation Performance Evaluation

The figure shows that our approach outperforms the methods in comparison, which will help reduce the communication costs in the data allocation phase.

6 Conclusions

Data fragmentation is one of the primary techniques used in partitioning and developing cloud computing services for distributed database systems. This research discussed the efficiency, usefulness, and performance improvement achieved by the API fragmentation service in a cloud computing system. The experimental results demonstrated the ability of this fragmentation method to minimize the data processed and transferred between the network sites of a distributed database system, reduce the storage size by eliminating data redundancy, and deliver significant performance improvements that increase distributed database system throughput. Moreover, this cloud API fragmentation approach satisfies the properties of an optimal data fragmentation in a distributed database system: the relation fragments include all relation records, the union of all relation fragments reconstructs the original relation, and the relation fragments are disjoint. In addition, it generates the minimum number of fragments for each relation according to the query requirements. This reduces the communication cost over distributed network sites and increases distributed system performance.

In future work, the research will focus on the resulting disjoint fragments as objects for distribution over the network sites. They should be distributed in a way that satisfies the requirements of database queries and enhances system performance.
Therefore, developing web API services for cloud computing distributed systems, such as Clustering distributed network sites as a Service (CaaS) and fragments Allocation as a Service (AaaS), has an important impact on transactional database applications and improves distributed system throughput.

References

[1] Amazon Simple Storage Service (Amazon S3) [Accessed 29th April, 2011].
[2] Daniel J. Abadi. Data Management in the Cloud: Limitations and Opportunities. Data Engineering, IEEE Computer Society. March 2009, Vol. 32, No. 1, pp.
[3] B. Cooper, R. Ramakrishnan, U. Srivastava, A. Silberstein, P. Bohannon, H. Jacobsen, N. Puz, D. Weaver, and R. Yerneni. PNUTS: Yahoo!'s hosted data serving platform. Proceedings of VLDB.
[4] Amazon SimpleDB. [Accessed 19th February, 2011].
[5] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: a distributed storage system for structured data. Proceedings of OSDI.
[6] A. Velte, T. Velte & R. Elsenpeter. Cloud Computing: A Practical Approach. McGraw-Hill.
[7] Microsoft SQL Server for Cloud Servers. ing-sql-server-licenses-for-cloud-servers. [Accessed 7th March, 2011].
[8] M. Stonebraker, S. R. Madden, D. J. Abadi, S. Harizopoulos, N. Hachem, and P. Helland. The end of an architectural era (it's time for a complete rewrite). VLDB, Vienna, Austria, 2007.
[9] Ozsu, M. & Valduriez, P. Principles of Distributed Database Systems. 2nd ed. Englewood Cliffs NJ, Prentice-Hall.
[10] Khalil, N., Eid, D. & Khair, M. Availability and Reliability Issues in Distributed Databases Using Optimal Horizontal Fragmentation. Trevor J. M. Bench-Capon, Giovanni Soda & A. Min Tjoa, ed., DEXA '99, LNCS 1677, Springer, pp.
[11] Son, J. & Kim, M. An Adaptable Vertical Partitioning Method in Distributed Systems. The Journal of Systems and Software. 73(3), 2004, pp.
[12] Agrawal, S., Narasayya, V. & Yang, B. Integrating Vertical and Horizontal Partitioning into Automated Physical Database Design. SIGMOD 2004, Paris, France, ACM 2004, pp.
[13] Tamhankar, A. & Ram, S. Database Fragmentation and Allocation: An Integrated Methodology and Case Study. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans. 28(3), 1998, pp.
[14] Lim, S. & Ng, Y. Vertical Fragmentation and Allocation in Distributed Deductive Database Systems. The Journal of Information Systems. 22(1), 1997, pp.
[15] Costa, R. & Lifschitz, S. Database Allocation Strategies for Parallel BLAST Evaluation on Clusters. Distributed and Parallel Databases. 13, 2003, pp.
[16] Ma, H., Schewe, K. & Wang, Q. Distribution design for higher-order data models. Data and Knowledge Engineering. 60, 2007, pp.
[17] Huang, Y. & Chen, J. Fragment Allocation in Distributed Database Design. Journal of Information Science and Engineering. 17, 2001, pp.
[18] Navathe, S., Karlapalem, K. & Minyoung, R. A mixed fragmentation methodology for initial distributed database design. Journal of Computer and Software Engineering. 3(4), 1995, pp.
Course 103402 MIS. Foundations of Business Intelligence
Oman College of Management and Technology Course 103402 MIS Topic 5 Foundations of Business Intelligence CS/MIS Department Organizing Data in a Traditional File Environment File organization concepts Database:
Lifetime Management of Cache Memory using Hadoop Snehal Deshmukh 1 Computer, PGMCOE, Wagholi, Pune, India
Volume 3, Issue 1, January 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at: www.ijarcsms.com ISSN:
Big Objects: Amazon S3
Cloud Platforms 1 Big Objects: Amazon S3 S3 = Simple Storage System Stores large objects (=values) that may have access permissions Used in cloud backup services Used to distribute software packages Used
5.5 Copyright 2011 Pearson Education, Inc. publishing as Prentice Hall. Figure 5-2
Class Announcements TIM 50 - Business Information Systems Lecture 15 Database Assignment 2 posted Due Tuesday 5/26 UC Santa Cruz May 19, 2015 Database: Collection of related files containing records on
bigdata Managing Scale in Ontological Systems
Managing Scale in Ontological Systems 1 This presentation offers a brief look scale in ontological (semantic) systems, tradeoffs in expressivity and data scale, and both information and systems architectural
How To Write A Database Program
SQL, NoSQL, and Next Generation DBMSs Shahram Ghandeharizadeh Director of the USC Database Lab Outline A brief history of DBMSs. OSs SQL NoSQL 1960/70 1980+ 2000+ Before Computers Database DBMS/Data Store
Big Data Analysis using Hadoop components like Flume, MapReduce, Pig and Hive
Big Data Analysis using Hadoop components like Flume, MapReduce, Pig and Hive E. Laxmi Lydia 1,Dr. M.Ben Swarup 2 1 Associate Professor, Department of Computer Science and Engineering, Vignan's Institute
Horizontal Aggregations In SQL To Generate Data Sets For Data Mining Analysis In An Optimized Manner
24 Horizontal Aggregations In SQL To Generate Data Sets For Data Mining Analysis In An Optimized Manner Rekha S. Nyaykhor M. Tech, Dept. Of CSE, Priyadarshini Bhagwati College of Engineering, Nagpur, India
PERFORMANCE ANALYSIS OF PaaS CLOUD COMPUTING SYSTEM
PERFORMANCE ANALYSIS OF PaaS CLOUD COMPUTING SYSTEM Akmal Basha 1 Krishna Sagar 2 1 PG Student,Department of Computer Science and Engineering, Madanapalle Institute of Technology & Science, India. 2 Associate
Benchmarking and Analysis of NoSQL Technologies
Benchmarking and Analysis of NoSQL Technologies Suman Kashyap 1, Shruti Zamwar 2, Tanvi Bhavsar 3, Snigdha Singh 4 1,2,3,4 Cummins College of Engineering for Women, Karvenagar, Pune 411052 Abstract The
Research on Clustering Analysis of Big Data Yuan Yuanming 1, 2, a, Wu Chanle 1, 2
Advanced Engineering Forum Vols. 6-7 (2012) pp 82-87 Online: 2012-09-26 (2012) Trans Tech Publications, Switzerland doi:10.4028/www.scientific.net/aef.6-7.82 Research on Clustering Analysis of Big Data
BIG DATA WEB ORGINATED TECHNOLOGY MEETS TELEVISION BHAVAN GHANDI, ADVANCED RESEARCH ENGINEER SANJEEV MISHRA, DISTINGUISHED ADVANCED RESEARCH ENGINEER
BIG DATA WEB ORGINATED TECHNOLOGY MEETS TELEVISION BHAVAN GHANDI, ADVANCED RESEARCH ENGINEER SANJEEV MISHRA, DISTINGUISHED ADVANCED RESEARCH ENGINEER TABLE OF CONTENTS INTRODUCTION WHAT IS BIG DATA?...
Figure 1 Cloud Computing. 1.What is Cloud: Clouds are of specific commercial interest not just on the acquiring tendency to outsource IT
An Overview Of Future Impact Of Cloud Computing Shiva Chaudhry COMPUTER SCIENCE DEPARTMENT IFTM UNIVERSITY MORADABAD Abstraction: The concept of cloud computing has broadcast quickly by the information
Evaluation of NoSQL and Array Databases for Scientific Applications
Evaluation of NoSQL and Array Databases for Scientific Applications Lavanya Ramakrishnan, Pradeep K. Mantha, Yushu Yao, Richard S. Canon Lawrence Berkeley National Lab Berkeley, CA 9472 [lramakrishnan,pkmantha,yyao,scanon]@lbl.gov
NewSQL: Towards Next-Generation Scalable RDBMS for Online Transaction Processing (OLTP) for Big Data Management
NewSQL: Towards Next-Generation Scalable RDBMS for Online Transaction Processing (OLTP) for Big Data Management A B M Moniruzzaman Department of Computer Science and Engineering, Daffodil International
Reallocation and Allocation of Virtual Machines in Cloud Computing Manan D. Shah a, *, Harshad B. Prajapati b
Proceedings of International Conference on Emerging Research in Computing, Information, Communication and Applications (ERCICA-14) Reallocation and Allocation of Virtual Machines in Cloud Computing Manan
Horizontal Partitioning by Predicate Abstraction and its Application to Data Warehouse Design
Horizontal Partitioning by Predicate Abstraction and its Application to Data Warehouse Design Aleksandar Dimovski 1, Goran Velinov 2, and Dragan Sahpaski 2 1 Faculty of Information-Communication Technologies,
Cloud-dew architecture: realizing the potential of distributed database systems in unreliable networks
Int'l Conf. Par. and Dist. Proc. Tech. and Appl. PDPTA'15 85 Cloud-dew architecture: realizing the potential of distributed database systems in unreliable networks Yingwei Wang 1 and Yi Pan 2 1 Department
A Grid Architecture for Manufacturing Database System
Database Systems Journal vol. II, no. 2/2011 23 A Grid Architecture for Manufacturing Database System Laurentiu CIOVICĂ, Constantin Daniel AVRAM Economic Informatics Department, Academy of Economic Studies
References. Introduction to Database Systems CSE 444. Motivation. Basic Features. Outline: Database in the Cloud. Outline
References Introduction to Database Systems CSE 444 Lecture 24: Databases as a Service YongChul Kwon Amazon SimpleDB Website Part of the Amazon Web services Google App Engine Datastore Website Part of
Introduction to Database Systems CSE 444
Introduction to Database Systems CSE 444 Lecture 24: Databases as a Service YongChul Kwon References Amazon SimpleDB Website Part of the Amazon Web services Google App Engine Datastore Website Part of
This paper defines as "Classical"
Principles of Transactional Approach in the Classical Web-based Systems and the Cloud Computing Systems - Comparative Analysis Vanya Lazarova * Summary: This article presents a comparative analysis of
MINING THE DATA FROM DISTRIBUTED DATABASE USING AN IMPROVED MINING ALGORITHM
MINING THE DATA FROM DISTRIBUTED DATABASE USING AN IMPROVED MINING ALGORITHM J. Arokia Renjit Asst. Professor/ CSE Department, Jeppiaar Engineering College, Chennai, TamilNadu,India 600119. Dr.K.L.Shunmuganathan
Future Prospects of Scalable Cloud Computing
Future Prospects of Scalable Cloud Computing Keijo Heljanko Department of Information and Computer Science School of Science Aalto University [email protected] 7.3-2012 1/17 Future Cloud Topics Beyond
Horizontal Fragmentation Technique in Distributed Database
International Journal of Scientific and esearch Publications, Volume, Issue 5, May 0 Horizontal Fragmentation Technique in istributed atabase Ms P Bhuyar ME I st Year (CSE) Sipna College of Engineering
Turkish Journal of Engineering, Science and Technology
Turkish Journal of Engineering, Science and Technology 03 (2014) 106-110 Turkish Journal of Engineering, Science and Technology journal homepage: www.tujest.com Integrating Data Warehouse with OLAP Server
UPS battery remote monitoring system in cloud computing
, pp.11-15 http://dx.doi.org/10.14257/astl.2014.53.03 UPS battery remote monitoring system in cloud computing Shiwei Li, Haiying Wang, Qi Fan School of Automation, Harbin University of Science and Technology
