Addressing Crosscutting Compute/Data Challenges
|
|
- Merry Goodman
- 8 years ago
- Views:
Transcription
1 Rutgers Discovery Informatics Institute (RDI 2 ) New Jersey s Center for Advanced Computation Addressing Crosscutting Compute/Data Challenges Manish Parashar parashar@rutgers.edu Manish Parashar, Rutgers University CDSE - Many crosscutting challenges Computing Multicore; large and increasing core counts, deep memory hierarchies (TH-2: 54.9 PF, 3.12 M Cores, 1.4 PB, 25 MW) New prgm. model, concerns (fault tolerance, energy, etc) New models & technologies: Clouds, grids, hybrid manycore, accelerators, deep storage hierarchies, Data Generating more data than in all of human history: preserve, mine, share? Software Complex applications on coupled compute-data-networked environments, tools needed Modern apps: lines, many groups contribute, take decades to develop, very long lifetimes People Multidisciplinary expertise essential! Appropriate academic program, career tracks 2 1
2 Complex Coupled Application Workflows Tight coupling Coupled simulations/models run concurrently and exchange data frequently E.g., Coupled fusion simulations, multi-block subsurface modeling simulations Loose coupling Coupling and data exchange are less frequent, and often asynchronous and possibly opportunistic E.g., Combustion simulations DNS + LES coupling, material science simulations Data-flow (data-driven) coupling Workflows expressed as dataflow based DAG where data produced by a simulation flows through one or more data processing pipelines E.g., End-to-end simulations workflows with data analytics Ensemble workflows Multiple fan-out stages composed of large numbers of loosely coupled or independent simulations E.g., Uncertainty quantification, history matching, data assimilation Example: Asynchronous Replica Exchange 2
3 Scientific Simulations: A Big Data Challenge Scientific simulations running on high-end computing systems generate huge amounts of data! If a single core produces 2MB/minute on average, one of these machines could generate simulation data between ~170TB per hour -> ~700PB per day -> ~1.4EB per year Successful scientific discovery depends on a comprehensive understanding of this enormous simulation data How we enable the computation scientists to efficiently manage and explore extreme scale data: find the needles in haystack?? Challenges Faced by Traditional HPC Data Pipelines Data analysis challenge Can current data mining, manipulation and visualization algorithms still work effectively on extreme scale machine? I/O challenge Increasing performance gap: disks are outpaced by computing speed Figure. Traditional data analysis pipeline Data movement challenge Lots of data movement between simulation and analysis machines, between coupled mutli-physics simulation components -> longer latencies Improving data locality is critical: do work where the data resides! Energy challenge Future extreme systems are designed to have low-power chips however, much greater power consumption will be due to memory and data movement! The costs of data movement are increasing and dominating! 3
4 Coupling Cycles of Innovation Interoperability/Standards? 7 Addressing Complexity Programming System programming model, languages/ abstraction syntax + semantics abstract machine execution context and assumptions infrastructure, middleware & runtime Hide v/s expose? virtual homogeneity ease of programming robustness? cross-layer adaptation & optimization the inverted stack Applications Programming System (Programming Model + Abstract Machine) Abstraction (Support abstract behaviors/assumptions) Grid Middleware Infrastructure Virtualization (Create and manage VOs and VMs) Virtual Machine Virtual Machine Virtual Machine Virtual Organization Organization Organization Organization Conceptual Implementation Abstractions? Conceptual and Implementation Models for the Grid, M. Parashar and J.C. Browne, Proceedings of the IEEE, Special Issue on Grid Computing, IEEE Press, Vol. 19, No. 3, March 2005.! 4
5 Software Engineering and CDS&E Many sources of complexity complexity from the problem (complex physics) complexity of the end-to-end application workflow complexity from coordination & coupling across codes and research teams complexity from the codes and how they are developed and implemented complexity from the underlying disruptive infrastructure Challenge: Manage complexity while maintaining performance/scalability Coupling cycles of innovation Software Engineering in the Large v/s Software Engineering in the Small? Software management of large systems v/s smaller but highly complex systems Well defined v/s rapidly evolving requirements/execution environments Primary complexity from size v/s application requirements Relative importance of performance Design Principles Separation of Concerns Abstractions Interoperability and standardization Complexity suggests a SOA approach Service Oriented Architecture (SOA): Software as a composition of services Used to deal with system/application complexity, rapidly changing requirements, evolving target platforms, and diverse teams Service: a well-defined, self-contained, and independently developed software element that does not depend on the context or state of other services. Abstraction & separation Computations from compositions and coordination Interface from implementations Existing and proven concept - widely accepted/used by the enterprise computing community For example, extreme-scale data analysis services routinely run at Yahoo & Google, for near real-time web data analysis, for rapid deployment of new web services, etc. 5
6 Examples of some relevant services I/O services: Efficient and adaptable I/O support I/O services should be componentized so that it allows codes to switch between and tune different methods easily. Example: a user may easily switch between file-based and memorybased I/O or switch formats without changing any code, all the while maintaining the same data model when switching I/O components. Code-coupling services: Code-coupling services includes codes that can run on the same or different machines, codes that are tightly coupled (memory-to-memory), as well as codes that are loosely coupled (exchange of data through files). Automatic online (in-transit) data processing services: Efficient data movement services for distributing and archiving data Example: generating summary statistics, graphs, images, movies, etc. Monitoring Services: dynamic real-time monitoring of large-scale simulations (during execution) Automated provenance collections services: Data, system workflow, application, etc. ADIOS: Adaptable I/O System Provides portable, fast, scalable, easy- to- use, metadata rich output Simple API Layered so;ware architecture: Allows plug- ins for different I/O implementacons Abstracts the API from the method used for I/O Beyond I/O coupling, coordinacon, in- situ/in- transit, etc. Open source: hgp:// support/ center- projects/adios/ Research methods from many groups: Georgia Tech, Rutgers, NCSU, Emory, Auburn, Sandia, LBL, PPPL Interface)to)apps)for)descrip/on)of)data)(ADIOS,)etc.)) Feedback) Mul/Bresolu/on) methods) Provenance) Data)Management)Services) Buffering) Data)Compression) methods) Plugins)to)the)hybrid)staging)area) )Schedule) Data)Indexing) (FastBit))methods) Workflow))Engine) Run/me)engine) Data)movement) Analysis)Plugins) Parallel)and)Distributed)File)System) Visualiza/on)Plugins) AdiosBbp) IDX) HDF5) pnetcdf) raw )data) Image)data) Viz.)Client) I/O performance of the Combustion S3D code (96K cores), the SCEC PCML3D (30K cores), and the Fine/Turbo (4K cores) codes. 35 GB/s MPI-IO OR PHDF5 ADIOS S3D SEC Fine/Turbo Simulations with and without ADIOS 6
7 ADIOS Application Partners Over 158 Publications from Application Code Contact Astrophysics+ Chimera+ T.+Mezzacappa+ Combus6on+ S3D+ J.+Chen+ AMR+ Chombo+ B.+Van+Straalen+ CFD+ Fine/Turbo+ M.+Gon6er+ FusionCEdge+ XGC1+ C.+S.+Chang+ FusionCEdge+ XGC0+ S.+H.+Ku+ Geoscience+ AWPCODC+ Y.+Cui+ Materials+ LAMMPS+ A.+Frachioni+ Weather+ GRAPES+ W.+Xue+ Nuclear+ HFODD+ H.+Nam+ Rela6vity+ Maya+ P.+Laguna+ Sub+surface+ Materials+ M.+Wheeler+ M.+Parashar+ Fusion+ GTCCP+ S.+Ethier+ Application Code Materials+ M.+Parashar+ Application Fusion+ Code GTCCP+ Contact S.+Ethier+ Fusion+ GTS+ W.+Wang+ Fusion+ M3DCC1+ S.+Jardin+ Fusion+ M3DCK+ G.+Y.+Fu+ Fusion+ GTC+ Z.+Lin+ Fusion+ GEM+ S.+Parker+ Quantum+ QLG2Q+ M.+Soe+ Image+Analysis+ T.+Kurc+ AMR+ Boxlib+ J.+Bell+ Rela6vity+ Cactus+ G.+Allen+ Weather+ NASA+ T.+Clune+ Materials+ QMC+ J.+Kim+ Climate+ Contact Astrophysics+ Chimera+ T.+Mezzacappa+ Combus6on+ S3D+ J.+Chen+ AMR+ Chombo+ B.+Van+Straalen+ CFD+ Fine/Turbo+ M.+Gon6er+ FusionCEdge+ XGC1+ C.+S.+Chang+ FusionCEdge+ XGC0+ S.+H.+Ku+ Geoscience+ AWPCODC+ Y.+Cui+ Materials+ LAMMPS+ A.+Frachioni+ Weather+ GRAPES+ W.+Xue+ Nuclear+ HFODD+ H.+Nam+ Rela6vity+ Maya+ P.+Laguna+ Sub+surface+ M.+Wheeler+ J.+Fu+ Climate+ CAM+ K.+Evans+ Application Code Contact Fusion+ GTS+ W.+Wang+ Fusion+ M3DCC1+ S.+Jardin+ Fusion+ M3DCK+ G.+Y.+Fu+ Fusion+ GTC+ Z.+Lin+ Fusion+ GEM+ S.+Parker+ Quantum+ QLG2Q+ M.+Soe+ Image+Analysis+ T.+Kurc+ AMR+ Boxlib+ J.+Bell+ Rela6vity+ Cactus+ G.+Allen+ Weather+ NASA+ T.+Clune+ Materials+ QMC+ J.+Kim+ Climate+ J.+Fu+ Climate+ CAM+ K.+Evans+ The ADIOS Approach The overarching design philosophy of our framework is based on the Service-Oriented Architecture Used to deal with system/application complexity, rapidly changing requirements, evolving target platforms, and diverse teams Applications constructed by assembling services based on a universal view of their functionality using a welldefined API Service implementations can be changed easily Integrated simulation can be assembled using these services 7
8 ADIOS SOA: I/O Componentization End users should be able to select the most efficient I/O method for their code, with minimal effort in terms of code alternations/updates. Performance-driven choices should not prevent data from being stored in the desired file format, since this is crucial for later data analysis. Have efficient ways of identifying and selecting certain data for analysis, to help end users cope with the flood of data being produced. Make it easy to introduce new research I/O methods, without changing your code. allow the best CDS&E researchers to contribute in application science projects. Make it easy to allow I/O to do more than just I/O à code coupling, in situ visualization. Rethinking the Data Management Pipeline Hybrid Staging + In-Situ & In-Transit Execution Exploit multi-levels in-memory Hybrid Data Staging to: Decrease the gap between CPU and IO speeds Dynamically deploy and execute data analytical or pre-processing operations either in-situ or in-transit Improve IO write performance 8
9 In-situ/In-transit Data Management & Analytics l l l l Virtual shared-space programming abstraction l Simple API for coordination, interaction and messaging Distributed, associative, in-memory object store l Online data indexing, flexible querying Adaptive cross-layer runtime management l Hybrid in-situ/in-transit execution Efficient, high-throughput/low-latency asynchronous data transport Overview of Research RU Programming abstractions / system DataSpaces/XpressSpaces: Interaction, coordination and messaging abstraction for coupled scientific workflow [HPDC10, CCGrid10,11, HiPC12, CCPE13] ActiveSpaces: Dynamic code deployment for in-staging data processing [IPDPS11] Runtime mechanisms Data-centric Task Mapping: Reduce data movement and increase intranode data sharing [IPDPS12, DISC12] In-situ & In-transit Data Analytics: Simulation-time analysis of large volume data by combining in-situ and in-transit execution [SC 12] Cross-layer Adaptation: adaptive cross-layer approach for dynamic data management in large scale simulation-analysis workflow High-throughput/low-latency asynchronous data transport DART: Network independent transport library for high speed asynchronous data extraction and transfer [HPDC08] 9
10 DataSpaces: A Shared Space Abstraction for the Hybrid Staging Area [HPDC 10] Semantically-specialized virtual shared space abstraction in the staging area Shared (distributed) data objects Simple put/get/query API Supports coupled app. workflows Provide a global-view programming abstraction consistent with the PGAS model (UPC, GA) DataSpaces (HPDC10) Constructed on-the-fly on hybrid staging nodes Indexes data for quick access and retrieval Provides asynchronous coordination and interaction support Complements existing interaction/ coordination mechanisms Figure. Graphical view of the shared space model (derived from Tuple space) for coordination/ interaction Code coupling using DataSpaces Transparent data redistribution Complex geometry-based queries In-space (online) data transformation and manipulations The Replica Exchange Algorithm A powerful sampling algorithm that preserves canonical distributions Allows for efficient crossing of high energy barriers that separate thermodynamically stable states Can significantly reduce sampling times as compared to formulations based on constant temperatures Algorithm overview Several copies, or replicas, of the system are simulated in parallel at different temperatures using walkers Walkers occasionally swap temperatures to allow them to bypass enthalpic barriers by moving to a high temperature Exchanges governed by a probability condition to ensure detailed balance Replica exchange can significantly impact the fields of structural biology and drug design structure based drug design associated with protein misfolding, for example, structure based drug design and binding affinity optimization molecular basis of human diseases associated with protein misfolding 10
11 Parallel Replica Exchange General formulation requires dynamic and complex coordination and communication patterns between the walkers Pair-wise and asynchronous Dependent on the current state (temperature, energy) of replicas, which are changing dynamically Implementation based on commonly used parallel programming frameworks is challenging Message passing frameworks, e.g. MPI, require matching sends and receives to be explicitly defined for each interaction Implementations use restrictive formulations using synchronous exchanges Exchanges between neighboring temperatures only Asynchronous Replica Exchange using DataSpaces DataSpaces provides a semantically specialized virtual shared space abstraction to support scalable asynchronous replica exchange for molecular dynamics applications Enables general non-nearest neighboring temperature exchanges Exchanges are decoupled and asynchronously and dynamically determined Communications are decentralized and peer-to-peer, and occur in parallel Operation Walkers post temperature range of interest to the shared space using post Walkers discover partners and negotiate exchange Walkers execute exchange (handshake) 11
12 Example: Salsa Operation Summary Unprecedented opportunities for dramatic insights through computation! Challenge: Manage complexity while maintaining performance/scalability. complexity from the problem (complex physics) complexity from the codes and how they are developed and implemented complexity of underlying infrastructure (disruptive hardware trends) complexity from coordination across codes and research teams Overarching philosophy Abstraction & Separation through Service Orientation Independence in development, deployment, execution Computations from compositions and coordination; Interface from implementations Existing and proven concepts - widely accepted/used by the enterprise computing community ADIOS Reducing barriers from a scientists perspective Ease-of-use, simple code integration and maintainability Minimizing performance impact Addressing unique requirements of high-end CDS&E 12
13 Thank You!! EPSi Edge Physics Simulation Manish Parashar, Ph.D. Prof., Dept. of Electrical & Computer Engr. Rutgers Discovery Informatics Institute (RDI 2 ) Cloud & Autonomic Computing Center (CAC) Rutgers, The State University of New Jersey parashar@rutgers.edu WWW: rdi2.rutgers.edu 13
Addressing Big Data Challenges in Simulation-based Science
Addressing Big Data Challenges in Simulation-based Science Manish Parashar* Rutgers Discovery Informatics Institute (RDI2) NSF Center for Cloud and Autonomic Computing (CAC) Department of Electrical and
More informationBig Data Challenges in Simulation-based Science
Big Data Challenges in Simulation-based Science Manish Parashar* Rutgers Discovery Informatics Institute (RDI 2 ) Department of Computer Science *Hoang Bui, Tong Jin, Qian Sun, Fan Zhang, ADIOS Team, and
More informationBig Data Challenges in Simulation-based Science
Big Data Challenges in Simulation-based Science Manish Parashar* Rutgers Discovery Informatics Institute (RDI2) NSF Center for Cloud and Autonomic Computing (CAC) Department of Electrical and Computer
More informationBig Data Challenges in Simulation-based Science
Big Data Challenges in Simulation-based Science Manish Parashar Rutgers Discovery Informatics Institute (RDI 2 ) NSF Center for Cloud and Autonomic Computing (CAC) http://parashar.rutgers.edu *Hoang Bui,
More informationNew Jersey Big Data Alliance
Rutgers Discovery Informatics Institute (RDI 2 ) New Jersey s Center for Advanced Computation New Jersey Big Data Alliance Manish Parashar Director, Rutgers Discovery Informatics Institute (RDI 2 ) Professor,
More informationBuilding Platform as a Service for Scientific Applications
Building Platform as a Service for Scientific Applications Moustafa AbdelBaky moustafa@cac.rutgers.edu Rutgers Discovery Informa=cs Ins=tute (RDI 2 ) The NSF Cloud and Autonomic Compu=ng Center Department
More informationExploring Software Defined Federated Infrastructures for Science
Exploring Software Defined Federated Infrastructures for Science Manish Parashar NSF Cloud and Autonomic Computing Center (CAC) Rutgers Discovery Informatics Institute (RDI 2 ) Rutgers, The State University
More informationBig Data Mining Services and Knowledge Discovery Applications on Clouds
Big Data Mining Services and Knowledge Discovery Applications on Clouds Domenico Talia DIMES, Università della Calabria & DtoK Lab Italy talia@dimes.unical.it Data Availability or Data Deluge? Some decades
More informationBSC vision on Big Data and extreme scale computing
BSC vision on Big Data and extreme scale computing Jesus Labarta, Eduard Ayguade,, Fabrizio Gagliardi, Rosa M. Badia, Toni Cortes, Jordi Torres, Adrian Cristal, Osman Unsal, David Carrera, Yolanda Becerra,
More informationBeyond Embarrassingly Parallel Big Data. William Gropp www.cs.illinois.edu/~wgropp
Beyond Embarrassingly Parallel Big Data William Gropp www.cs.illinois.edu/~wgropp Messages Big is big Data driven is an important area, but not all data driven problems are big data (despite current hype).
More informationData Centric Systems (DCS)
Data Centric Systems (DCS) Architecture and Solutions for High Performance Computing, Big Data and High Performance Analytics High Performance Computing with Data Centric Systems 1 Data Centric Systems
More informationCluster, Grid, Cloud Concepts
Cluster, Grid, Cloud Concepts Kalaiselvan.K Contents Section 1: Cluster Section 2: Grid Section 3: Cloud Cluster An Overview Need for a Cluster Cluster categorizations A computer cluster is a group of
More informationBig Data Management in the Clouds and HPC Systems
Big Data Management in the Clouds and HPC Systems Hemera Final Evaluation Paris 17 th December 2014 Shadi Ibrahim Shadi.ibrahim@inria.fr Era of Big Data! Source: CNRS Magazine 2013 2 Era of Big Data! Source:
More informationThe Fusion of Supercomputing and Big Data. Peter Ungaro President & CEO
The Fusion of Supercomputing and Big Data Peter Ungaro President & CEO The Supercomputing Company Supercomputing Big Data Because some great things never change One other thing that hasn t changed. Cray
More informationData Semantics Aware Cloud for High Performance Analytics
Data Semantics Aware Cloud for High Performance Analytics Microsoft Future Cloud Workshop 2011 June 2nd 2011, Prof. Jun Wang, Computer Architecture and Storage System Laboratory (CASS) Acknowledgement
More informationData-intensive HPC: opportunities and challenges. Patrick Valduriez
Data-intensive HPC: opportunities and challenges Patrick Valduriez Big Data Landscape Multi-$billion market! Big data = Hadoop = MapReduce? No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard,
More informationSo#ware Tools and Techniques for HPC, Clouds, and Server- Class SoCs Ron Brightwell
So#ware Tools and Techniques for HPC, Clouds, and Server- Class SoCs Ron Brightwell R&D Manager, Scalable System So#ware Department Sandia National Laboratories is a multi-program laboratory managed and
More informationBIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON
BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON Overview * Introduction * Multiple faces of Big Data * Challenges of Big Data * Cloud Computing
More informationVisualization and Data Analysis
Working Group Outbrief Visualization and Data Analysis James Ahrens, David Rogers, Becky Springmeyer Eric Brugger, Cyrus Harrison, Laura Monroe, Dino Pavlakos Scott Klasky, Kwan-Liu Ma, Hank Childs LLNL-PRES-481881
More informationHPC in the age of (Big)Data
HPC in the age of (Big)Data The fusion of supercomputing and Big, Fast Data Analytics Ugo Varetto (uvaretto@cray.com) CRAY EMEA Research Lab Agenda Outlook Requirements Implementation 2 Outlook Cray s
More informationCray: Enabling Real-Time Discovery in Big Data
Cray: Enabling Real-Time Discovery in Big Data Discovery is the process of gaining valuable insights into the world around us by recognizing previously unknown relationships between occurrences, objects
More informationThe Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets
The Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets!! Large data collections appear in many scientific domains like climate studies.!! Users and
More information3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2016) March 10-11, 2016 VIT University, Chennai, India
3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2016) March 10-11, 2016 VIT University, Chennai, India Call for Papers Cloud computing has emerged as a de facto computing
More informationHow In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time
SCALEOUT SOFTWARE How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time by Dr. William Bain and Dr. Mikhail Sobolev, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T wenty-first
More informationScientific Computing's Productivity Gridlock and How Software Engineering Can Help
Scientific Computing's Productivity Gridlock and How Software Engineering Can Help Stuart Faulk, Ph.D. Computer and Information Science University of Oregon Stuart Faulk CSESSP 2015 Outline Challenges
More information1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India
1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India Call for Papers Colossal Data Analysis and Networking has emerged as a de facto
More informationWrite a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical
Identify a problem Review approaches to the problem Propose a novel approach to the problem Define, design, prototype an implementation to evaluate your approach Could be a real system, simulation and/or
More informationScientific versus Business Workflows
2 Scientific versus Business Workflows Roger Barga and Dennis Gannon The formal concept of a workflow has existed in the business world for a long time. An entire industry of tools and technology devoted
More informationHPC technology and future architecture
HPC technology and future architecture Visual Analysis for Extremely Large-Scale Scientific Computing KGT2 Internal Meeting INRIA France Benoit Lange benoit.lange@inria.fr Toàn Nguyên toan.nguyen@inria.fr
More informationBlobSeer: Towards efficient data storage management on large-scale, distributed systems
: Towards efficient data storage management on large-scale, distributed systems Bogdan Nicolae University of Rennes 1, France KerData Team, INRIA Rennes Bretagne-Atlantique PhD Advisors: Gabriel Antoniu
More informationBusiness Transformation for Application Providers
E SB DE CIS IO N GUID E Business Transformation for Application Providers 10 Questions to Ask Before Selecting an Enterprise Service Bus 10 Questions to Ask Before Selecting an Enterprise Service Bus InterSystems
More informationDr. Raju Namburu Computational Sciences Campaign U.S. Army Research Laboratory. The Nation s Premier Laboratory for Land Forces UNCLASSIFIED
Dr. Raju Namburu Computational Sciences Campaign U.S. Army Research Laboratory 21 st Century Research Continuum Theory Theory embodied in computation Hypotheses tested through experiment SCIENTIFIC METHODS
More informationHPC Programming Framework Research Team
HPC Programming Framework Research Team 1. Team Members Naoya Maruyama (Team Leader) Motohiko Matsuda (Research Scientist) Soichiro Suzuki (Technical Staff) Mohamed Wahib (Postdoctoral Researcher) Shinichiro
More informationTowards Scalable Visualization Plugins for Data Staging Workflows
Towards Scalable Visualization Plugins for Data Staging Workflows David Pugmire Oak Ridge National Laboratory Oak Ridge, Tennessee James Kress University of Oregon Eugene, Oregon Jeremy Meredith, Norbert
More informationPlatform Autonomous Custom Scalable Service using Service Oriented Cloud Computing Architecture
Platform Autonomous Custom Scalable Service using Service Oriented Cloud Computing Architecture 1 B. Kamala 2 B. Priya 3 J. M. Nandhini 1 2 3 ABSTRACT The global economic recession and the shrinking budget
More informationCORBA and object oriented middleware. Introduction
CORBA and object oriented middleware Introduction General info Web page http://www.dis.uniroma1.it/~beraldi/elective Exam Project (application), plus oral discussion 3 credits Roadmap Distributed applications
More informationUsing In-Memory Computing to Simplify Big Data Analytics
SCALEOUT SOFTWARE Using In-Memory Computing to Simplify Big Data Analytics by Dr. William Bain, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T he big data revolution is upon us, fed
More informationAccelerating Hadoop MapReduce Using an In-Memory Data Grid
Accelerating Hadoop MapReduce Using an In-Memory Data Grid By David L. Brinker and William L. Bain, ScaleOut Software, Inc. 2013 ScaleOut Software, Inc. 12/27/2012 H adoop has been widely embraced for
More informationA Brief Introduction to Apache Tez
A Brief Introduction to Apache Tez Introduction It is a fact that data is basically the new currency of the modern business world. Companies that effectively maximize the value of their data (extract value
More informationUsing an In-Memory Data Grid for Near Real-Time Data Analysis
SCALEOUT SOFTWARE Using an In-Memory Data Grid for Near Real-Time Data Analysis by Dr. William Bain, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 IN today s competitive world, businesses
More informationGrid Computing Vs. Cloud Computing
International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 6 (2013), pp. 577-582 International Research Publications House http://www. irphouse.com /ijict.htm Grid
More informationDistribution transparency. Degree of transparency. Openness of distributed systems
Distributed Systems Principles and Paradigms Maarten van Steen VU Amsterdam, Dept. Computer Science steen@cs.vu.nl Chapter 01: Version: August 27, 2012 1 / 28 Distributed System: Definition A distributed
More informationArchitectures for Big Data Analytics A database perspective
Architectures for Big Data Analytics A database perspective Fernando Velez Director of Product Management Enterprise Information Management, SAP June 2013 Outline Big Data Analytics Requirements Spectrum
More informationService Oriented Architecture
Service Oriented Architecture Charlie Abela Department of Artificial Intelligence charlie.abela@um.edu.mt Last Lecture Web Ontology Language Problems? CSA 3210 Service Oriented Architecture 2 Lecture Outline
More informationBig Data Challenges in Bioinformatics
Big Data Challenges in Bioinformatics BARCELONA SUPERCOMPUTING CENTER COMPUTER SCIENCE DEPARTMENT Autonomic Systems and ebusiness Pla?orms Jordi Torres Jordi.Torres@bsc.es Talk outline! We talk about Petabyte?
More informationBig Data Analytics. Chances and Challenges. Volker Markl
Volker Markl Professor and Chair Database Systems and Information Management (DIMA), Technische Universität Berlin www.dima.tu-berlin.de Big Data Analytics Chances and Challenges Volker Markl DIMA BDOD
More informationHank Childs, University of Oregon
Exascale Analysis & Visualization: Get Ready For a Whole New World Sept. 16, 2015 Hank Childs, University of Oregon Before I forget VisIt: visualization and analysis for very big data DOE Workshop for
More informationA REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM
A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM Sneha D.Borkar 1, Prof.Chaitali S.Surtakar 2 Student of B.E., Information Technology, J.D.I.E.T, sborkar95@gmail.com Assistant Professor, Information
More informationAutonomics, Cyberinfrastructure Federation, and Software-Defined Environments for Science
Autonomics, Cyberinfrastructure Federation, and Software-Defined Environments for Science Manish Parashar NSF Cloud and Autonomic Computing Center (CAC) Rutgers Discovery Informatics Institute (RDI 2 )
More informationAppendix 1 ExaRD Detailed Technical Descriptions
Appendix 1 ExaRD Detailed Technical Descriptions Contents 1 Application Foundations... 3 1.1 Co- Design... 3 1.2 Applied Mathematics... 5 1.3 Data Analytics and Visualization... 8 2 User Experiences...
More informationDISTRIBUTED SYSTEMS AND CLOUD COMPUTING. A Comparative Study
DISTRIBUTED SYSTEMS AND CLOUD COMPUTING A Comparative Study Geographically distributed resources, such as storage devices, data sources, and computing power, are interconnected as a single, unified resource
More informationNASA's Strategy and Activities in Server Side Analytics
NASA's Strategy and Activities in Server Side Analytics Tsengdar Lee, Ph.D. High-end Computing Program Manager NASA Headquarters Presented at the ESGF/UVCDAT Conference Lawrence Livermore National Laboratory
More informationManjrasoft Market Oriented Cloud Computing Platform
Manjrasoft Market Oriented Cloud Computing Platform Aneka Aneka is a market oriented Cloud development and management platform with rapid application development and workload distribution capabilities.
More informationManjrasoft Market Oriented Cloud Computing Platform
Manjrasoft Market Oriented Cloud Computing Platform Innovative Solutions for 3D Rendering Aneka is a market oriented Cloud development and management platform with rapid application development and workload
More informationData Requirements from NERSC Requirements Reviews
Data Requirements from NERSC Requirements Reviews Richard Gerber and Katherine Yelick Lawrence Berkeley National Laboratory Summary Department of Energy Scientists represented by the NERSC user community
More informationThe Role of the Operating System in Cloud Environments
The Role of the Operating System in Cloud Environments Judith Hurwitz, President Marcia Kaufman, COO Sponsored by Red Hat Cloud computing is a technology deployment approach that has the potential to help
More informationXeon+FPGA Platform for the Data Center
Xeon+FPGA Platform for the Data Center ISCA/CARL 2015 PK Gupta, Director of Cloud Platform Technology, DCG/CPG Overview Data Center and Workloads Xeon+FPGA Accelerator Platform Applications and Eco-system
More informationService-Oriented Architectures
Architectures Computing & 2009-11-06 Architectures Computing & SERVICE-ORIENTED COMPUTING (SOC) A new computing paradigm revolving around the concept of software as a service Assumes that entire systems
More informationNASA s Big Data Challenges in Climate Science
NASA s Big Data Challenges in Climate Science Tsengdar Lee, Ph.D. High-end Computing Program Manager NASA Headquarters Presented at IEEE Big Data 2014 Workshop October 29, 2014 1 2 7-km GEOS-5 Nature Run
More informationThe PHI solution. Fujitsu Industry Ready Intel XEON-PHI based solution. SC2013 - Denver
1 The PHI solution Fujitsu Industry Ready Intel XEON-PHI based solution SC2013 - Denver Industrial Application Challenges Most of existing scientific and technical applications Are written for legacy execution
More information2 (18) - SOFTWARE ARCHITECTURE Service Oriented Architecture - Sven Arne Andreasson - Computer Science and Engineering.
Service Oriented Architecture Definition (1) Definitions Services Organizational Impact SOA principles Web services A service-oriented architecture is essentially a collection of services. These services
More informationPerformance Monitoring of Parallel Scientific Applications
Performance Monitoring of Parallel Scientific Applications Abstract. David Skinner National Energy Research Scientific Computing Center Lawrence Berkeley National Laboratory This paper introduces an infrastructure
More informationRelational Databases in the Cloud
Contact Information: February 2011 zimory scale White Paper Relational Databases in the Cloud Target audience CIO/CTOs/Architects with medium to large IT installations looking to reduce IT costs by creating
More informationJOURNAL OF OBJECT TECHNOLOGY
JOURNAL OF OBJECT TECHNOLOGY Online at www.jot.fm. Published by ETH Zurich, Chair of Software Engineering JOT, 2008 Vol. 7, No. 8, November-December 2008 What s Your Information Agenda? Mahesh H. Dodani,
More informationIn situ data analysis and I/O acceleration of FLASH astrophysics simulation on leadership-class system using GLEAN
In situ data analysis and I/O acceleration of FLASH astrophysics simulation on leadership-class system using GLEAN Venkatram Vishwanath 1, Mark Hereld 1, Michael E. Papka 1, Randy Hudson 2, G. Cal Jordan
More informationPercipient StorAGe for Exascale Data Centric Computing
Percipient StorAGe for Exascale Data Centric Computing per cip i ent (pr-sp-nt) adj. Having the power of perceiving, especially perceiving keenly and readily. n. One that perceives. Introducing: Seagate
More informationHardware- and Network-Enhanced Software Systems for Cloud Computing
The HARNESS Project: Hardware- and Network-Enhanced Software Systems for Cloud Computing Prof. Alexander Wolf Imperial College London (Project Coordinator) Cloud Market Strata SaaS Software as a Service
More informationIO Informatics The Sentient Suite
IO Informatics The Sentient Suite Our software, The Sentient Suite, allows a user to assemble, view, analyze and search very disparate information in a common environment. The disparate data can be numeric
More informationAugmented Search for Web Applications. New frontier in big log data analysis and application intelligence
Augmented Search for Web Applications New frontier in big log data analysis and application intelligence Business white paper May 2015 Web applications are the most common business applications today.
More informationHow To Handle Big Data With A Data Scientist
III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution
More informationCloud computing insights from 110 implementation projects
IBM Academy of Technology Thought Leadership White Paper October 2010 Cloud computing insights from 110 implementation projects IBM Academy of Technology Survey 2 Cloud computing insights from 110 implementation
More informationConverged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities
Technology Insight Paper Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities By John Webster February 2015 Enabling you to make the best technology decisions Enabling
More informationA Study on Service Oriented Network Virtualization convergence of Cloud Computing
A Study on Service Oriented Network Virtualization convergence of Cloud Computing 1 Kajjam Vinay Kumar, 2 SANTHOSH BODDUPALLI 1 Scholar(M.Tech),Department of Computer Science Engineering, Brilliant Institute
More informationBig Data in HPC Applications and Programming Abstractions. Saba Sehrish Oct 3, 2012
Big Data in HPC Applications and Programming Abstractions Saba Sehrish Oct 3, 2012 Big Data in Computational Science - Size Data requirements for select 2012 INCITE applications at ALCF (BG/P) On-line
More informationPACE Predictive Analytics Center of Excellence @ San Diego Supercomputer Center, UCSD. Natasha Balac, Ph.D.
PACE Predictive Analytics Center of Excellence @ San Diego Supercomputer Center, UCSD Natasha Balac, Ph.D. Brief History of SDSC 1985-1997: NSF national supercomputer center; managed by General Atomics
More informationRecent and Future Activities in HPC and Scientific Data Management Siegfried Benkner
Recent and Future Activities in HPC and Scientific Data Management Siegfried Benkner Research Group Scientific Computing Faculty of Computer Science University of Vienna AUSTRIA http://www.par.univie.ac.at
More informationHow To Understand The Concept Of A Distributed System
Distributed Operating Systems Introduction Ewa Niewiadomska-Szynkiewicz and Adam Kozakiewicz ens@ia.pw.edu.pl, akozakie@ia.pw.edu.pl Institute of Control and Computation Engineering Warsaw University of
More informationSeptember 25, 2007. Maya Gokhale Georgia Institute of Technology
NAND Flash Storage for High Performance Computing Craig Ulmer cdulmer@sandia.gov September 25, 2007 Craig Ulmer Maya Gokhale Greg Diamos Michael Rewak SNL/CA, LLNL Georgia Institute of Technology University
More informationPlatform Autonomous Custom Scalable Service using Service Oriented Cloud Computing Architecture
Platform Autonomous Custom Scalable Service using Service Oriented Cloud Computing Architecture 1 B. Kamala 2 B. Priya 3 J. M. Nandhini - 1 AP-II, MCA Dept, Sri Sai Ram Engineering College, Chennai, kamala.mca@sairam.edu.in
More informationA standards-based approach to application integration
A standards-based approach to application integration An introduction to IBM s WebSphere ESB product Jim MacNair Senior Consulting IT Specialist Macnair@us.ibm.com Copyright IBM Corporation 2005. All rights
More informationEHR Data Preservation
Emerging Computational Approaches to Interoperability the Key to Long Term Preservation of EHR Data William W. Stead, M.D. Associate Vice Chancellor for Health Affairs Chief Strategy & Information Officer
More informationDigital libraries of the future and the role of libraries
Digital libraries of the future and the role of libraries Donatella Castelli ISTI-CNR, Pisa, Italy Abstract Purpose: To introduce the digital libraries of the future, their enabling technologies and their
More informationUnisys ClearPath Forward Fabric Based Platform to Power the Weather Enterprise
Unisys ClearPath Forward Fabric Based Platform to Power the Weather Enterprise Introducing Unisys All in One software based weather platform designed to reduce server space, streamline operations, consolidate
More informationWelcome to the Force.com Developer Day
Welcome to the Force.com Developer Day Sign up for a Developer Edition account at: http://developer.force.com/join Nicola Lalla nlalla@saleforce.com n_lalla nlalla26 Safe Harbor Safe harbor statement under
More informationPrinciples and characteristics of distributed systems and environments
Principles and characteristics of distributed systems and environments Definition of a distributed system Distributed system is a collection of independent computers that appears to its users as a single
More informationVStore++: Virtual Storage Services for Mobile Devices
VStore++: Virtual Storage Services for Mobile Devices Sudarsun Kannan, Karishma Babu, Ada Gavrilovska, and Karsten Schwan Center for Experimental Research in Computer Systems Georgia Institute of Technology
More informationFigure 1: Illustration of service management conceptual framework
Dagstuhl Seminar on Service-Oriented Computing Session Summary Service Management Asit Dan, IBM Participants of the Core Group Luciano Baresi, Politecnico di Milano Asit Dan, IBM (Session Lead) Martin
More informationTowards the Magic Green Broker Jean-Louis Pazat IRISA 1/29. Jean-Louis Pazat. IRISA/INSA Rennes, FRANCE MYRIADS Project Team
Towards the Magic Green Broker Jean-Louis Pazat IRISA 1/29 Jean-Louis Pazat IRISA/INSA Rennes, FRANCE MYRIADS Project Team Towards the Magic Green Broker Jean-Louis Pazat IRISA 2/29 OUTLINE Clouds and
More informationData Mining Governance for Service Oriented Architecture
Data Mining Governance for Service Oriented Architecture Ali Beklen Software Group IBM Turkey Istanbul, TURKEY alibek@tr.ibm.com Turgay Tugay Bilgin Dept. of Computer Engineering Maltepe University Istanbul,
More informationNSF Workshop: High Priority Research Areas on Integrated Sensor, Control and Platform Modeling for Smart Manufacturing
NSF Workshop: High Priority Research Areas on Integrated Sensor, Control and Platform Modeling for Smart Manufacturing Purpose of the Workshop In October 2014, the President s Council of Advisors on Science
More informationData Management in an International Data Grid Project. Timur Chabuk 04/09/2007
Data Management in an International Data Grid Project Timur Chabuk 04/09/2007 Intro LHC opened in 2005 several Petabytes of data per year data created at CERN distributed to Regional Centers all over the
More informationSERVICE ORIENTED ARCHITECTURE
SERVICE ORIENTED ARCHITECTURE Introduction SOA provides an enterprise architecture that supports building connected enterprise applications to provide solutions to business problems. SOA facilitates the
More informationWhat can DDS do for You? Learn how dynamic publish-subscribe messaging can improve the flexibility and scalability of your applications.
What can DDS do for You? Learn how dynamic publish-subscribe messaging can improve the flexibility and scalability of your applications. 2 Contents: Abstract 3 What does DDS do 3 The Strengths of DDS 4
More informationfor my computation? Stefano Cozzini Which infrastructure Which infrastructure Democrito and SISSA/eLAB - Trieste
Which infrastructure Which infrastructure for my computation? Stefano Cozzini Democrito and SISSA/eLAB - Trieste Agenda Introduction:! E-infrastructure and computing infrastructures! What is available
More informationIBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud
IBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud February 25, 2014 1 Agenda v Mapping clients needs to cloud technologies v Addressing your pain
More informationExploiting peer group concept for adaptive and highly available services
Exploiting peer group concept for adaptive and highly available services Muhammad Asif Jan Centre for European Nuclear Research (CERN) Switzerland Fahd Ali Zahid, Mohammad Moazam Fraz Foundation University,
More informationOracle Big Data SQL Technical Update
Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical
More informationScientific Computing Programming with Parallel Objects
Scientific Computing Programming with Parallel Objects Esteban Meneses, PhD School of Computing, Costa Rica Institute of Technology Parallel Architectures Galore Personal Computing Embedded Computing Moore
More informationHPCOR 2014. Data Management Policies Co-Chairs: Bill Allcock, Julia White. HPCOR 2014, June 18-19, Oakland, CA 1
HPCOR 2014 Data Management Policies Co-Chairs: Bill Allcock, Julia White HPCOR 2014, June 18-19, Oakland, CA 1 Contributors Bill Allcock, ANL Sasha Ames, LLNL Ashley Barker, ORNL Laura Biven, DOE Jeff
More informationHPC Wales Skills Academy Course Catalogue 2015
HPC Wales Skills Academy Course Catalogue 2015 Overview The HPC Wales Skills Academy provides a variety of courses and workshops aimed at building skills in High Performance Computing (HPC). Our courses
More information