Collaborative Query Coordination in Community-Driven Data Grids
Transcription
1 HPDC '09: Collaborative Query Coordination in Community-Driven Data Grids
Tobias Scholl, Angelika Reiser, and Alfons Kemper
Department of Computer Science, Technische Universität München, Germany
2 Community-Driven Data Grids (HiSbase)
3 The AstroGrid-D Project
- German Astronomy Community Grid
- Funded by the German Ministry of Education and Research
- Part of D-Grid
HPDC 2009 Collaborative Query Processing
4 Upcoming Data-Intensive Applications
- Alex Szalay, Jim Gray (Nature, 2006): Science in an exponential world
- Data rates: terabytes a day/night, petabytes a year
- LHC, LSST, LOFAR, Pan-STARRS
5 The Multiwavelength Milky Way
6 Research Challenges
- Directly deal with terabyte/petabyte-scale data sets
- Integrate with existing community infrastructures
- High throughput for growing user communities
7 Current Sharing in Data Grids
- Data autonomy
- Policies allow partners to access data
- Each institution ensures
  - Availability (replication)
  - Scalability
- Various organizational structures [Venugopal et al. 2006]:
  - Centralized
  - Hierarchical
  - Federated
  - Hybrid
8 Community-Driven Data Grids (HiSbase)
9 Community-Driven Data Grids (HiSbase)
10 Distribute by Region not by Archive!
11 Distribute by Region not by Archive!
12 Distribute by Region not by Archive!
13 Distribute by Region not by Archive!
14 Mapping Data to Nodes
15 Submission Characteristics
- Portal-based submission
  - Browser in every researcher's "tool box"
  - Scalability depends on portal
- Institution-based submission
  - All data nodes accept queries
  - Submission via local data node
16 Coordinator Selection Strategies
- The node submitting the query
  - SelfStrategy (SS)
- A node containing relevant data (region-based strategies)
  - FirstRegionStrategy (FRS)
  - SelfOrFirstRegionStrategy (SOFRS)
  - CenterOfGravityStrategy (COGS)
  - RandomRegionStrategy (RRS)
17 SelfStrategy (SS)
18 FirstRegionStrategy (FRS)
19 SelfOrFirstRegionStrategy (SOFRS)
- Combination of SelfStrategy and FirstRegionStrategy
- Submit node is coordinator if it covers data
- Avoids unnecessary data transport
- With many partitions and many nodes, basically the same as FirstRegionStrategy (as the probability of the Self-case decreases)
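The three strategies named so far can be sketched in a few lines. This is a minimal illustration, not the HiSbase implementation: the function names, the `owner` mapping from region id to node name, and the reading of "first region" as the first entry in the query's list of relevant regions are all assumptions made for the example.

```python
from typing import Dict, List


def self_strategy(submit_node: str, relevant_regions: List[int],
                  owner: Dict[int, str]) -> str:
    """SS: the node that received the query coordinates it."""
    return submit_node


def first_region_strategy(submit_node: str, relevant_regions: List[int],
                          owner: Dict[int, str]) -> str:
    """FRS: the node covering the first relevant region coordinates.

    Assumption: "first" means the first entry of relevant_regions.
    """
    return owner[relevant_regions[0]]


def self_or_first_region_strategy(submit_node: str, relevant_regions: List[int],
                                  owner: Dict[int, str]) -> str:
    """SOFRS: stay on the submit node if it covers relevant data, else use FRS."""
    if any(owner[region] == submit_node for region in relevant_regions):
        return submit_node
    return first_region_strategy(submit_node, relevant_regions, owner)


# Hypothetical layout: four regions spread over three nodes.
owner = {1: "a", 2: "a", 3: "b", 4: "c"}
relevant = [2, 3]  # regions touched by one example query

print(self_strategy("c", relevant, owner))
print(first_region_strategy("c", relevant, owner))
print(self_or_first_region_strategy("b", relevant, owner))
print(self_or_first_region_strategy("c", relevant, owner))
```

In this layout, node b submits a query that touches its own region 3, so SOFRS keeps coordination on b. Node c holds none of the relevant regions, so SOFRS falls back to FRS and selects node a.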
20 CenterOfGravityStrategy (COGS)
- Further reduce amount of data shipping
- "Perfect spot" for minimizing data transfer
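The slide does not say how the "perfect spot" is computed, so the following is only a sketch of the stated goal (minimizing data transfer), not of COGS itself. It assumes each node can estimate how much relevant data it holds, and it picks the node for which the data shipped from all other nodes is smallest. The names `shipped_volume`, `min_transfer_coordinator`, and the volume figures are hypothetical.

```python
from typing import Dict


def shipped_volume(coordinator: str, volume_by_node: Dict[str, int]) -> int:
    """Data that must travel to the coordinator from every other node."""
    return sum(volume for node, volume in volume_by_node.items()
               if node != coordinator)


def min_transfer_coordinator(volume_by_node: Dict[str, int]) -> str:
    """Pick the node that minimizes shipped data.

    Candidates are visited in sorted order, so ties go to the
    alphabetically first node name.
    """
    return min(sorted(volume_by_node),
               key=lambda node: shipped_volume(node, volume_by_node))


# Hypothetical estimates of relevant data per node (arbitrary units).
volumes = {"a": 200, "b": 500, "c": 200}

print(min_transfer_coordinator(volumes))
print(shipped_volume("b", volumes))
```

Under this cost model the choice reduces to the node that already holds the most relevant data, because everything it holds stays local.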
21 RandomRegionStrategy (RRS)
- Select random relevant region
- Tradeoff between balancing coordination load and reducing data shipping
- Probability(a) = 2/9, Probability(b) = 5/9, Probability(c) = 2/
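One way to read the probabilities on this slide is that a relevant region is drawn uniformly at random and its owner becomes coordinator, so a node's chance is proportional to the number of relevant regions it covers. The sketch below follows that reading. The nine-region layout, in which node a covers two regions, node b five, and node c the remaining two, is an assumption chosen to reproduce the values given for a and b; the function names are hypothetical.

```python
import random
from collections import Counter
from fractions import Fraction
from typing import Dict, List, Optional


def random_region_strategy(relevant_regions: List[int], owner: Dict[int, str],
                           rng: Optional[random.Random] = None) -> str:
    """RRS: draw one relevant region uniformly; its owner coordinates."""
    chooser = rng if rng is not None else random
    return owner[chooser.choice(relevant_regions)]


def coordinator_probabilities(relevant_regions: List[int],
                              owner: Dict[int, str]) -> Dict[str, Fraction]:
    """Exact selection probability per node under uniform region choice."""
    counts = Counter(owner[region] for region in relevant_regions)
    total = len(relevant_regions)
    return {node: Fraction(count, total) for node, count in counts.items()}


# Assumed layout: nine relevant regions over three nodes.
owner = {1: "a", 2: "a", 3: "b", 4: "b", 5: "b", 6: "b", 7: "b", 8: "c", 9: "c"}
relevant = list(range(1, 10))

for node, probability in sorted(coordinator_probabilities(relevant, owner).items()):
    print(node, probability)

print(random_region_strategy(relevant, owner, random.Random(0)))
```

Nodes that cover more of the query area are chosen more often, which keeps data shipping low on average, while the randomness spreads coordination load across all nodes holding relevant data.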
22 Evaluation
- Coordination strategies: SS, FRS, SOFRS, COGS, RRS
- Submission strategies: portal-based, institution-based
- Observational data sets
- Two workloads
  - SDSS query log (Q_obs)
  - Synthetic (Q_scaled)
- Network size P_obs
- Network traffic measurements
  - Number of routed messages
  - Coordination load balancing
- Throughput measurements
23 Query Workloads
24 Routed Messages per Query (Q_obs)
25 Routed Messages per Query (Q_scaled)
26 Portal-based Coordination Load
27 Institution-based Coordination Load
28 Throughput (Q_obs, Q_scaled)
- Throughput dependent on query complexity
- No clear winner in terms of throughput
29 Workload-Aware Data Partitioning
- Query skew (hot spots) triggered by increased interest in particular subsets of the data
- Two well-known query load balancing techniques:
  - Data partitioning
  - Data replication
- Finding trade-offs between both (see EDBT '09 paper)
30 Load Balancing During Runtime
- Complement workload-aware partitioning with runtime load balancing
- Short-term peaks
  - Master-slave approach
  - Load monitoring
- Long-term trends
  - Based on load monitoring
  - Histogram evolution
31 Related Work
- On-line load balancing
  - Hundreds of thousands to millions of nodes
  - Reacting fast
  - Treating objects individually
- HiSbase
32 Who Is the Query Coordinator?
- Many challenges and opportunities in e-science for distributed computing and database research
  - High-throughput data management
  - Correlation of distributed data sources
- Collaborative Query Coordination
  - Region-based strategies reduce number of messages
  - Load balancing independent of submission characteristic
33 Special Thanks To
- Ella Qiu, University of British Columbia
  - DAAD RISE Internship
  - Support during implementation
  - Initial measurements
34 Get in Touch
- Database systems group, TU München
- Web site:
- The HiSbase project
- Thank You for Your Attention