Data analytics workflows for climate
- Antony Booker
1 Data analytics workflows for climate. Dr. Sandro Fiore, Leader, Scientific data management research group, Scientific Computing, CMCC. Prof. Giovanni Aloisio, Director, Scientific Computing, CMCC. ISENES2 Workshop on Workflow Solutions in Earth System Modelling, Hamburg, June 03-05, 2014
2 Data analytics requirements
A set of requirements has been discussed with our scientists to identify the key data analytics needs. Preliminary requirements and needs focus on:
- Time series analysis
- Data reduction (e.g. by aggregation)
- Data subsetting
- Model intercomparison
- Multimodel means
- Data transformation (through array-based primitives)
- Parameter sweep experiments (same task applied on a set of data)
- Comparison between historical data and future scenarios
- Maps generation
- Ensemble analysis
- Data analytics workflow support
But also: performance, re-usability, extensibility
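Three of the requirements above (multimodel means, data reduction by aggregation, and subsetting) can be illustrated with a minimal NumPy sketch. This is not Ophidia code: the array shape, the synthetic values, and the variable names are assumptions made only for illustration.

```python
import numpy as np

# Hypothetical toy data: 3 models x 24 monthly steps x 2 lat x 2 lon.
rng = np.random.default_rng(0)
models = rng.normal(loc=288.0, scale=2.0, size=(3, 24, 2, 2))

# Multimodel mean: average across the model axis.
multimodel_mean = models.mean(axis=0)  # shape (24, 2, 2)

# Data reduction by aggregation: monthly values -> yearly means.
yearly = multimodel_mean.reshape(2, 12, 2, 2).mean(axis=1)  # shape (2, 2, 2)

# Subsetting: first year of the time series at the first grid point.
subset = multimodel_mean[:12, 0, 0]  # shape (12,)

print(multimodel_mean.shape, yearly.shape, subset.shape)
```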
3 Defining some workflows: some real examples
Some workflows have been designed with CMCC climate scientists to collect their needs and requirements about more complex processing chains:
- Extreme Precipitation Tendencies analysis
- Climate change signal
- Massive data reduction
[Figure: Ophidia Workflow Modeling Language (Data Analytics Workflow Modelling Language), shown as a generic workflow and a detailed workflow. Element labels: IMPORT, SUBSETTING, INTERCUBE operation, REDUCTION, APPLY, EXPORT; 0-D, 1-D, 2-D, 3-D and 3-D+ tasks; generic task; supervised task; task transition; I/O operation; model/simulation output; model repository; exported files. Annotations: [dim1, dim2, ..., dimn] (ranges); [dim1, ..., dimn] operation (aggregation set); [dim] operation; (comments). Example file annotation: Model CMCC-CM, Scenario RCP8.5, Atmos, Variable TAS, (Frequency).]
Feedback:
- Positive comments from users regarding the data analytics workflow modelling language to describe their use cases
- Sharing & re-usability of workflows across CMCC divisions
4 The Ophidia Project
Ophidia is a research effort carried out at the Euro-Mediterranean Centre on Climate Change (CMCC) to address big data challenges, issues and requirements for climate change data analytics.
S. Fiore, A. D'Anca, C. Palazzo, I. Foster, D. N. Williams, G. Aloisio, Ophidia: toward big data analytics for eScience, ICCS 2013 Conference, Procedia Elsevier, Barcelona, June 5-7, 2013
5 Architecture 2.0: Ophidia Server & Workflow support
Each task is associated with an Ophidia operator.
[Figure: server architecture. Components: workflow tasks; front-end and security (REST, WS-I, OGC, GSI/VOMS); request parser; request validator; OphidiaDB manager; OphidiaDB; workflow queue; task selector; task starter; multiple cube management; single cube submission; workflow engine; resource manager; notification manager (from the framework); notification manager (towards the framework).]
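How a workflow engine might pick and start tasks in dependency order can be sketched as follows. This is a minimal illustration under stated assumptions, not the Ophidia server implementation: the dictionary layout, the `depends_on` key, and the `run_workflow` helper are hypothetical; only the operator names are taken from the slides.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow description: each task names one operator
# and lists the tasks it depends on.
workflow = {
    "import": {"operator": "OPH_IMPORT_NC", "depends_on": []},
    "subset": {"operator": "OPH_SUBSET", "depends_on": ["import"]},
    "apply": {"operator": "OPH_APPLY", "depends_on": ["subset"]},
    "export": {"operator": "OPH_EXPORT_NC", "depends_on": ["apply"]},
}


def run_workflow(tasks, run_task):
    """Run every task after all of its dependencies; return the order used."""
    graph = {name: set(task["depends_on"]) for name, task in tasks.items()}
    order = []
    for name in TopologicalSorter(graph).static_order():
        run_task(name, tasks[name]["operator"])  # the "task starter" step
        order.append(name)
    return order


executed = run_workflow(workflow, lambda name, op: print(f"{name}: {op}"))
```

A real engine would also queue requests, validate them, and react to notifications from the framework; the sketch covers only the ordering step.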
6 The analytics framework: datacube operators (about 50)
Operator classes:
- Data access (sequential and parallel operators)
- Metadata management (sequential and parallel operators)
- Data processing (parallel operators, MPI & OpenMP based)
- Import/Export (parallel operators)
Operators, data processing (domain-agnostic):
- OPH_APPLY(datacube_in, datacube_out, array_based_primitive): creates datacube_out by applying the array-based primitive to datacube_in
- OPH_DUPLICATE(datacube_in, datacube_out): creates a copy of datacube_in in datacube_out
- OPH_SUBSET(datacube_in, subset_string, datacube_out): creates datacube_out by sub-setting datacube_in according to the subset_string
- OPH_MERGE(datacube_in, merge_param, datacube_out): creates datacube_out by merging groups of merge_param fragments from datacube_in
- OPH_SPLIT(datacube_in, split_param, datacube_out): creates datacube_out by splitting each fragment of datacube_in into groups of split_param fragments
- OPH_INTERCOMPARISON(datacube_in1, datacube_in2, datacube_out): creates datacube_out, which is the element-wise difference between datacube_in1 and datacube_in2
- OPH_DELETE(datacube_in): removes datacube_in
Operators, data processing (domain-oriented):
- OPH_EXPORT_NC(datacube_in, file_out): exports the datacube_in data into the file_out NetCDF file
- OPH_IMPORT_NC(file_in, datacube_out): imports the data stored in the file_in NetCDF file into the new datacube_out datacube
Operators, data access:
- OPH_INSPECT_FRAG(datacube_in, fragment_in): inspects the data stored in the fragment_in from datacube_in
- OPH_PUBLISH(datacube_in): publishes the datacube_in fragments into HTML pages
Operators, metadata:
- OPH_CUBE_ELEMENTS(datacube_in): provides the total number of elements in datacube_in
- OPH_CUBE_SIZE(datacube_in): provides the disk space occupied by datacube_in
- OPH_LIST(void): provides the list of available datacubes
- OPH_CUBEIO(datacube_in): provides the provenance information related to datacube_in
- OPH_FIND(search_param): provides the list of datacubes matching the search_param criteria
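The semantics of a few of these operators can be mimicked on in-memory arrays. The sketch below is a toy analogue, assuming a datacube is just a NumPy array with time on the first axis; the function names `duplicate`, `subset`, and `intercomparison` are hypothetical stand-ins, not the Ophidia API.

```python
import numpy as np


def duplicate(cube):
    """Toy analogue of OPH_DUPLICATE: return an independent copy."""
    return cube.copy()


def subset(cube, start, stop):
    """Toy analogue of OPH_SUBSET: keep rows start..stop-1 of the first axis."""
    return cube[start:stop].copy()


def intercomparison(cube1, cube2):
    """Toy analogue of OPH_INTERCOMPARISON: element-wise difference."""
    if cube1.shape != cube2.shape:
        raise ValueError("datacubes must have the same shape")
    return cube1 - cube2


# Made-up values: 3 time steps x 2 grid points.
historical = np.array([[14.0, 15.0], [14.5, 15.5], [15.0, 16.0]])
scenario = duplicate(historical) + 2.0

diff = intercomparison(scenario, historical)
first_two = subset(diff, 0, 2)
print(first_two)
```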
7 Architecture 2.0: Ophidia Server & Workflow support
[Figure: server architecture. Components: workflow tasks; front-end and security (REST, WS-I, OGC, GSI/VOMS); request parser; request validator; OphidiaDB manager; OphidiaDB; workflow queue; task selector; task starter; multiple cube management; single cube submission; workflow engine; resource manager; notification manager (from the framework); notification manager (towards the framework).]
8 Single Cube Task
Execution of a data/metadata operator through a single declarative statement.
[Figure: Single Cube Task. OperatorA (data); OperatorB (metadata), returning metadata info.]
Examples:
oph_cubeio cube=doi;branch=children
oph_list level=3
oph_explorecube cube=doi
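A request parser for statements of this shape could look like the sketch below. The grammar (operator name, a space, then semicolon-separated key=value pairs) is inferred only from the three examples on the slide, and `parse_statement` is a hypothetical helper, not part of Ophidia.

```python
def parse_statement(statement):
    """Split 'operator key=value;key=value' into (operator, params)."""
    operator, _, arg_string = statement.strip().partition(" ")
    params = {}
    for pair in arg_string.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing ';' or an empty argument list
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        params[key.strip()] = value.strip()
    return operator, params


print(parse_statement("oph_cubeio cube=doi;branch=children"))
print(parse_statement("oph_list level=3"))
```

All values are kept as strings; converting `level` to an integer would be the job of a later validation step.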
9 Single Cube Task (oph_cubeio & provenance) Execution of a data/metadata operator through a single declarative statement
10 Multiple cubes tasks
Execution of the same data operator over a group of data cubes through a single declarative statement.
Multiple cubes tasks can be used:
- to import a large number of datasets into the Ophidia platform
- to process a large set of data cubes with similar properties
Data cubes to be processed by a multiple cubes operator can be selected by applying a set of filters, e.g. Filter [ file=*.nc; ].
[Figure: Multiple Cubes Task. A fork runs OperatorA on each selected cube, followed by a join.]
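The filter, fork, and join pattern can be sketched with the Python standard library. This is an illustration under assumptions, not how Ophidia schedules its operators: `run_on_many`, the thread pool, and the file names are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from fnmatch import fnmatch


def run_on_many(operator, names, pattern, max_workers=4):
    """Apply one operator to every name matching the filter pattern."""
    selected = [name for name in names if fnmatch(name, pattern)]  # filter
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(operator, selected))  # fork, then join
    return dict(zip(selected, results))


# Hypothetical input files; the lambda stands in for an import operator.
files = ["tas_2001.nc", "tas_2002.nc", "notes.txt"]
imported = run_on_many(lambda name: f"cube:{name}", files, "*.nc")
print(imported)
```

`pool.map` returns results in input order, so each result can be paired with the name that produced it.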
11 Array based primitives
- Ophidia provides a wide set of array-based primitives (about 100) to perform data reduction (by aggregation), sub-setting, predicates evaluation, statistical analysis, compression, and so forth.
- They are provided both for byte-oriented and bit-oriented arrays.
- Primitives can be nested to get more complex functionalities.
- Compression is a primitive too!
- New primitives can be easily integrated as additional plugins by the end users.
- Primitives come as plugins and are applied on a single datacube chunk (fragment).
12 Array based primitives: OPH_MATH (SIGN)
oph_math(measure, OPH_SIGN, "OPH_DOUBLE")
[Figure: single chunk or fragment (input) mapped to single chunk or fragment (output).]
13 Array based primitives: OPH_BOXPLOT
oph_boxplot(measure, "OPH_DOUBLE")
[Figure: single chunk or fragment (input) mapped to single chunk or fragment (output).]
14 Array based primitives: nesting feature
oph_boxplot(oph_subarray(oph_uncompress(measure), 1,18), "OPH_DOUBLE")
[Figure: single chunk or fragment (input) mapped to single chunk or fragment (output); intermediate step labelled subarray(measure, 1,18).]
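The nesting idea (uncompress, then take a subarray, then summarise) can be imitated on a single in-memory fragment. The sketch below is a toy analogue and makes several assumptions: the fragment is a zlib-compressed buffer of float64 values, subarray bounds are 1-based and inclusive, and the boxplot primitive returns a five-number summary. None of these details are confirmed by the slides.

```python
import zlib

import numpy as np


def compress(values):
    """Pack float64 values into a zlib-compressed byte string."""
    return zlib.compress(np.asarray(values, dtype=np.float64).tobytes())


def uncompress(blob):
    """Inverse of compress: recover the float64 array."""
    return np.frombuffer(zlib.decompress(blob), dtype=np.float64)


def subarray(values, start, end):
    """Keep elements start..end (assumed 1-based, inclusive bounds)."""
    return values[start - 1:end]


def boxplot(values):
    """Five-number summary: min, lower quartile, median, upper quartile, max."""
    return np.percentile(values, [0, 25, 50, 75, 100])


measure = compress(np.arange(1.0, 25.0))  # one fragment holding 1.0 .. 24.0
summary = boxplot(subarray(uncompress(measure), 1, 18))
print(summary)
```

Each function takes and returns one fragment, which is what lets the calls compose in the same inside-out order as the nested expression on the slide.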
15 Ophidia primitives (about 100)
Mathematical primitives: oph_math, oph_mul_scalar, oph_sum_array, oph_sum_scalar, oph_sum_array_r, oph_gsl_sd, oph_gsl_complex_get_abs, oph_gsl_complex_get_arg, oph_gsl_complex_get_imag, oph_gsl_complex_get_real, oph_gsl_complex_to_polar, oph_gsl_complex_to_rect, oph_gsl_dwt, oph_gsl_fft, oph_gsl_idwt, oph_gsl_ifft, oph_gsl_quantile, oph_gsl_sort, oph_gsl_stats, oph_gsl_histogram, oph_gsl_boxplot, oph_petsc_vec_norm, oph_petsc_vec, oph_petsc_vec_r, oph_sum_scalar2, oph_mul_scalar2, oph_compare
Data transformations: oph_to_bin, oph_to_bit, oph_permute, oph_reverse, oph_dump, oph_convert_d, oph_shift, oph_rotate, oph_concat, oph_predicate, oph_roll_up, oph_get_subarray, oph_get_subarray2, oph_get_subarray3, oph_mask, oph_cast, oph_drill_down (stored procedure), oph_reduce, oph_reduce2, oph_reduce3, oph_aggregate_operator, oph_operator, oph_aggregate_stats, oph_extract
Compression: oph_compress, oph_uncompress, oph_compress2, oph_uncompress2, oph_id_to_index, oph_size_array, oph_count_array, oph_find, oph_find2
Bit measures processing: oph_bit_aggregate, oph_bit_import, oph_bit_export, oph_bit_size, oph_bit_count, oph_bit_dump, oph_bit_operator, oph_bit_not, oph_bit_shift, oph_bit_rotate, oph_bit_reverse, oph_bit_subarray, oph_bit_subarray2, oph_bit_concat, oph_bit_find, oph_bit_reduce
Integration of numerical libraries: math, GNU GSL, PETSc.
Integration of some CDO operators.
16 Integrating some CDO operators as new primitives
CDO module, CDO operator: Ophidia primitive
FLDSTAT:
- fldmax: oph_cdo_reduce(meas, 'OPH_MAX')
- fldmin: oph_cdo_reduce(meas, 'OPH_MIN')
- fldsum: oph_cdo_reduce(meas, 'OPH_SUM')
TIMSTAT:
- timmax: oph_cdo_aggregate_operator(meas, 'OPH_MAX')
- timmin: oph_cdo_aggregate_operator(meas, 'OPH_MIN')
- timsum: oph_cdo_aggregate_operator(meas, 'OPH_SUM')
CONDC:
- ifthenc,c: oph_cdo_condc(meas,'oph_ifthenc',const)
- ifnotthenc,c: oph_cdo_condc(meas,'oph_ifnotthenc',const)
COMPC:
- ltc,c: oph_cdo_compc(meas,'oph_ltc',const)
- gtc,c: oph_cdo_compc(meas,'oph_gtc',const)
SMOOTH:
- smooth9: oph_cdo_smooth9(meas, 'OPH_SMOOTH9', nlon)
REGRESSION:
- regres: oph_cdo_regres(meas, 'OPH_REGRES')
- detrend: oph_cdo_regres(meas, 'OPH_DETREND')
- trend: oph_cdo_regres(meas, 'OPH_TREND')
- subtrend (trend ofiles): oph_cdo_subtrend(measure, oph_cdo_regres(measure,'oph_trend'),'oph_subtrend')
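For readers unfamiliar with the CDO operators in this mapping, three of them can be expressed in NumPy as a reference for what the corresponding primitives are expected to compute. This is only a semantic illustration: the (time, lat, lon) layout and the sample values are assumptions, and missing-value handling is ignored.

```python
import numpy as np

# Made-up field with axes (time, lat, lon): 3 time steps on a 2 x 2 grid.
data = np.arange(12.0).reshape(3, 2, 2)

# fldmax: maximum over the horizontal field, one value per time step.
fldmax = data.max(axis=(1, 2))

# timmax: maximum over time, one value per grid point.
timmax = data.max(axis=0)

# gtc,c: 1 where the value is greater than the constant c, 0 otherwise.
c = 5.0
gtc = (data > c).astype(np.float64)

print(fldmax, timmax, gtc.sum())
```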
17 Ophidia Architecture 2.0: features summary
- New server front-end and workflow management support
- Data analytics workflow modelling language
- Ophidia workflow JSON schema
- New native I/O server and query engine
- Support for different storage devices including memory
All of these aspects are strongly connected to each other to fully address (near) real-time parallel data analytics workflows for climate.
P.S.: Some of the slides related to the Ophidia 2.0 architecture have been removed due to copyright issues (papers under publication). Please send an e-mail to the contact points in the last slide in case you are interested in getting more information on this research effort.
18 References and contacts
[1] G. Aloisio, S. Fiore, I. Foster, D. N. Williams, Scientific big data analytics challenges at large scale, Big Data and Extreme-scale Computing (BDEC), April 30 to May 01, 2013, Charleston, USA (position paper).
[2] S. Fiore, A. D'Anca, C. Palazzo, I. Foster, D. N. Williams, G. Aloisio, Ophidia: Toward Big Data Analytics for eScience, ICCS 2013, June 5-7, 2013, Barcelona, Spain, Procedia Computer Science, Elsevier, pp
[3] S. Fiore, C. Palazzo, A. D'Anca, I. Foster, D. N. Williams, G. Aloisio, A big data analytics framework for scientific data management, Workshop on Big Data and Science: Infrastructure and Services, IEEE International Conference on BigData 2013, October 6-9, 2013, Santa Clara, USA, pp
[4] S. Fiore, A. D'Anca, D. Elia, C. Palazzo, I. Foster, D. N. Williams, G. Aloisio, Ophidia: a full software stack for scientific data analytics, Workshop on Big Data Principles, Architectures & Applications, HPCS2014, Bologna, Italy, July 21-25,
Contacts: Sandro Fiore, Giovanni Aloisio, CMCC and University of Salento, sandro.fiore@cmcc.it, giovanni.aloisio@cmcc.it
Team: S. Fiore, A. D'Anca, D. Elia, C. Palazzo, A. Mariello, P. Nassisi, G. Aloisio
Acknowledgments: Dean Williams, Lawrence Livermore National Lab; Ian Foster, Argonne National Laboratory & University of Chicago
19 Questions?!
The Ophidia framework: toward big data analy7cs for climate change
The Ophidia framework: toward big data analy7cs for climate change Dr. Sandro Fiore Leader, Scientific data management research group Scientific Computing Division @ CMCC Prof. Giovanni Aloisio Director,
More informationThe Ophidia framework: toward cloud- based big data analy;cs for escience Sandro Fiore, Giovanni Aloisio, Ian Foster, Dean Williams
The Ophidia framework: toward cloud- based big data analy;cs for escience Sandro Fiore, Giovanni Aloisio, Ian Foster, Dean Williams Sandro Fiore, Ph.D. CMCC Scientific Computing and Operations Division
More informationA big data analytics framework for scientific data management
2013 IEEE International Conference on Big Data A big data analytics framework for scientific data management Sandro Fiore 1,2*, Cosimo Palazzo 1,2, Alessandro D Anca 1, Ian Foster 3, Dean N. Williams 4,
More informationThe ORIENTGATE data platform
Seminar on Proposed and Revised set of indicators June 4-5, 2014 - Belgrade (Serbia) The ORIENTGATE data platform WP2, Action 2.4 Alessandra Nuzzo, Sandro Fiore, Giovanni Aloisio Scientific Computing and
More informationIS-ENES WP3. D3.8 - Report on Training Sessions
IS-ENES WP3 D3.8 - Report on Training Sessions Abstract: The deliverable D3.8 describes the organization and the outcomes of the tutorial meetings on the Grid Prototype designed within task4 of the NA2
More informationA Service for Data-Intensive Computations on Virtual Clusters
A Service for Data-Intensive Computations on Virtual Clusters Executing Preservation Strategies at Scale Rainer Schmidt, Christian Sadilek, and Ross King rainer.schmidt@arcs.ac.at Planets Project Permanent
More informationExArch: Climate analytics on distributed exascale data archives Martin Juckes, V. Balaji, B.N. Lawrence, M. Lautenschlager, S. Denvil, G. Aloisio, P.
ExArch: Climate analytics on distributed exascale data archives Martin Juckes, V. Balaji, B.N. Lawrence, M. Lautenschlager, S. Denvil, G. Aloisio, P. Kushner, D. Waliser, S. Pascoe, A. Stephens, P. Kershaw,
More informationHow To Build An Elastic Distributed System Over Big Data. http://esgf.org
How To Build An Elastic Distributed System Over Big Data Overview! The Earth System Grid Federation (ESGF.org) is indeed a spontaneous, unfunded, community-driven, open-source, collaborative effort to
More informationEfficient and Scalable Climate Metadata Management with the GRelC DAIS
Efficient and Scalable Climate Metadata Management with the GRelC DAIS G. Aloisio, S. Fiore CMCC Scientific Computing and Operations Division University of Salento, Lecce Context : countdown of the Intergovernmental
More informationMicroStrategy Course Catalog
MicroStrategy Course Catalog 1 microstrategy.com/education 3 MicroStrategy course matrix 4 MicroStrategy 9 8 MicroStrategy 10 table of contents MicroStrategy course matrix MICROSTRATEGY 9 MICROSTRATEGY
More informationFederated Big Data for resource aggregation and load balancing with DIRAC
Procedia Computer Science Volume 51, 2015, Pages 2769 2773 ICCS 2015 International Conference On Computational Science Federated Big Data for resource aggregation and load balancing with DIRAC Víctor Fernández
More informationData Management and Analysis in Support of DOE Climate Science
Data Management and Analysis in Support of DOE Climate Science August 7 th, 2013 Dean Williams, Galen Shipman Presented to: Processing and Analysis of Very Large Data Sets Workshop The Climate Data Challenge
More informationNASA's Strategy and Activities in Server Side Analytics
NASA's Strategy and Activities in Server Side Analytics Tsengdar Lee, Ph.D. High-end Computing Program Manager NASA Headquarters Presented at the ESGF/UVCDAT Conference Lawrence Livermore National Laboratory
More informationData Semantics Aware Cloud for High Performance Analytics
Data Semantics Aware Cloud for High Performance Analytics Microsoft Future Cloud Workshop 2011 June 2nd 2011, Prof. Jun Wang, Computer Architecture and Storage System Laboratory (CASS) Acknowledgement
More informationScalable Developments for Big Data Analytics in Remote Sensing
Scalable Developments for Big Data Analytics in Remote Sensing Federated Systems and Data Division Research Group High Productivity Data Processing Dr.-Ing. Morris Riedel et al. Research Group Leader,
More informationUsing the Grid for the interactive workflow management in biomedicine. Andrea Schenone BIOLAB DIST University of Genova
Using the Grid for the interactive workflow management in biomedicine Andrea Schenone BIOLAB DIST University of Genova overview background requirements solution case study results background A multilevel
More informationThe ORIENTGATE data platform
Research Papers Issue RP0195 December 2013 The ORIENTGATE data platform SCO Scientific Computing and Operations Division By Alessandra Nuzzo University of Salento and Scientific Computing and Operations
More informationBig Data Mining Services and Knowledge Discovery Applications on Clouds
Big Data Mining Services and Knowledge Discovery Applications on Clouds Domenico Talia DIMES, Università della Calabria & DtoK Lab Italy talia@dimes.unical.it Data Availability or Data Deluge? Some decades
More informationHadoop Job Oriented Training Agenda
1 Hadoop Job Oriented Training Agenda Kapil CK hdpguru@gmail.com Module 1 M o d u l e 1 Understanding Hadoop This module covers an overview of big data, Hadoop, and the Hortonworks Data Platform. 1.1 Module
More informationData processing goes big
Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,
More informationScalable Data Analysis in R. Lee E. Edlefsen Chief Scientist UserR! 2011
Scalable Data Analysis in R Lee E. Edlefsen Chief Scientist UserR! 2011 1 Introduction Our ability to collect and store data has rapidly been outpacing our ability to analyze it We need scalable data analysis
More informationSupercomputing on Windows. Microsoft (Thailand) Limited
Supercomputing on Windows Microsoft (Thailand) Limited W hat D efines S upercom puting A lso called High Performance Computing (HPC) Technical Computing Cutting edge problems in science, engineering and
More informationIT Service Management with System Center Service Manager
Course 10965B: IT Service Management with System Center Service Manager Course Details Course Outline Module 1: Service Management Overview Effective IT Service Management includes process driven methodologies
More informationOptimizing Mass Storage Organization and Access for Multi-Dimensional Scientific Data
Optimizing Mass Storage Organization and Access for Multi-Dimensional Scientific Data Robert Drach, Susan W. Hyer, Steven Louis, Gerald Potter, George Richmond Lawrence Livermore National Laboratory Livermore,
More informationCreating Connection with Hive
Creating Connection with Hive Intellicus Enterprise Reporting and BI Platform Intellicus Technologies info@intellicus.com www.intellicus.com Creating Connection with Hive Copyright 2010 Intellicus Technologies
More informationData-Intensive Science and Scientific Data Infrastructure
Data-Intensive Science and Scientific Data Infrastructure Russ Rew, UCAR Unidata ICTP Advanced School on High Performance and Grid Computing 13 April 2011 Overview Data-intensive science Publishing scientific
More informationPart I Courses Syllabus
Part I Courses Syllabus This document provides detailed information about the basic courses of the MHPC first part activities. The list of courses is the following 1.1 Scientific Programming Environment
More informationClimate-Weather Modeling Studies Using a Prototype Global Cloud-System Resolving Model
ANL/ALCF/ESP-13/1 Climate-Weather Modeling Studies Using a Prototype Global Cloud-System Resolving Model ALCF-2 Early Science Program Technical Report Argonne Leadership Computing Facility About Argonne
More informationSQL Server 2012 Optimization, Performance Tuning and Troubleshooting
1 SQL Server 2012 Optimization, Performance Tuning and Troubleshooting 5 Days (SQ-OPT2012-301-EN) Description During this five-day intensive course, students will learn the internal architecture of SQL
More informationSecurity Analytics Topology
Security Analytics Topology CEP = Stream Analytics Hadoop = Batch Analytics Months to years LOGS PKTS Correlation with Live in Real Time Meta, logs, select payload Decoder Long-term, intensive analysis
More informationCloud-based Linked Data Geoprocessing: Implementing Kriging as WPS on the Cloud
Cloud-based Linked Data Geoprocessing: Implementing Kriging as WPS on the Cloud Elias Grinias Department of Civil Engineering, Surveying and Geomatics, TEI of Central Macedonia and Dimitris Kotzinos ETIS
More informationDavid Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems
David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems About me David Rioja Redondo Telecommunication Engineer - Universidad de Alcalá >2 years building and managing clusters UPM
More informationIT Service Management with System Center Service Manager
Course 10965B: IT Service Management with System Center Service Manager Page 1 of 9 IT Service Management with System Center Service Manager Course 10965B: 3 days; Instructor-Led Introduction This Three-day
More informationData Warehousing. Yeow Wei Choong Anne Laurent
Data Warehousing Yeow Wei Choong Anne Laurent Databases Databases are developed on the IDEA that DATA is one of the cri>cal materials of the Informa>on Age Informa>on, which is created by data, becomes
More informationMS SQL Server 2014 New Features and Database Administration
MS SQL Server 2014 New Features and Database Administration MS SQL Server 2014 Architecture Database Files and Transaction Log SQL Native Client System Databases Schemas Synonyms Dynamic Management Objects
More informationA Novel Cloud Based Elastic Framework for Big Data Preprocessing
School of Systems Engineering A Novel Cloud Based Elastic Framework for Big Data Preprocessing Omer Dawelbeit and Rachel McCrindle October 21, 2014 University of Reading 2008 www.reading.ac.uk Overview
More informationIDL. Get the answers you need from your data. IDL
Get the answers you need from your data. IDL is the preferred computing environment for understanding complex data through interactive visualization and analysis. IDL Powerful visualization. Interactive
More informationBIGDATA GREENPLUM DBA INTRODUCTION COURSE OBJECTIVES COURSE SUMMARY HIGHLIGHTS OF GREENPLUM DBA AT IQ TECH
BIGDATA GREENPLUM DBA Meta-data: Outrun your competition with advanced knowledge in the area of BigData with IQ Technology s online training course on Greenplum DBA. A state-of-the-art course that is delivered
More informationIs a Data Scientist the New Quant? Stuart Kozola MathWorks
Is a Data Scientist the New Quant? Stuart Kozola MathWorks 2015 The MathWorks, Inc. 1 Facts or information used usually to calculate, analyze, or plan something Information that is produced or stored by
More informationThe Top Six Advantages of CUDA-Ready Clusters. Ian Lumb Bright Evangelist
The Top Six Advantages of CUDA-Ready Clusters Ian Lumb Bright Evangelist GTC Express Webinar January 21, 2015 We scientists are time-constrained, said Dr. Yamanaka. Our priority is our research, not managing
More informationConstructing a Data Lake: Hadoop and Oracle Database United!
Constructing a Data Lake: Hadoop and Oracle Database United! Sharon Sophia Stephen Big Data PreSales Consultant February 21, 2015 Safe Harbor The following is intended to outline our general product direction.
More informationBig Data: Using ArcGIS with Apache Hadoop. Erik Hoel and Mike Park
Big Data: Using ArcGIS with Apache Hadoop Erik Hoel and Mike Park Outline Overview of Hadoop Adding GIS capabilities to Hadoop Integrating Hadoop with ArcGIS Apache Hadoop What is Hadoop? Hadoop is a scalable
More informationLustre * Filesystem for Cloud and Hadoop *
OpenFabrics Software User Group Workshop Lustre * Filesystem for Cloud and Hadoop * Robert Read, Intel Lustre * for Cloud and Hadoop * Brief Lustre History and Overview Using Lustre with Hadoop Intel Cloud
More informationInteroperability between Sun Grid Engine and the Windows Compute Cluster
Interoperability between Sun Grid Engine and the Windows Compute Cluster Steven Newhouse Program Manager, Windows HPC Team steven.newhouse@microsoft.com 1 Computer Cluster Roadmap Mainstream HPC Mainstream
More informationLog Mining Based on Hadoop s Map and Reduce Technique
Log Mining Based on Hadoop s Map and Reduce Technique ABSTRACT: Anuja Pandit Department of Computer Science, anujapandit25@gmail.com Amruta Deshpande Department of Computer Science, amrutadeshpande1991@gmail.com
More informationOracle BI 11g R1: Build Repositories
Oracle University Contact Us: 1.800.529.0165 Oracle BI 11g R1: Build Repositories Duration: 5 Days What you will learn This Oracle BI 11g R1: Build Repositories training is based on OBI EE release 11.1.1.7.
More informationEffective Team Development Using Microsoft Visual Studio Team System
Effective Team Development Using Microsoft Visual Studio Team System Course 6214A: Three days; Instructor-Led Introduction This three-day instructor-led course provides students with the knowledge and
More informationIn-Situ Bitmaps Generation and Efficient Data Analysis based on Bitmaps. Yu Su, Yi Wang, Gagan Agrawal The Ohio State University
In-Situ Bitmaps Generation and Efficient Data Analysis based on Bitmaps Yu Su, Yi Wang, Gagan Agrawal The Ohio State University Motivation HPC Trends Huge performance gap CPU: extremely fast for generating
More informationPerformance Comparison of SQL based Big Data Analytics with Lustre and HDFS file systems
Performance Comparison of SQL based Big Data Analytics with Lustre and HDFS file systems Rekha Singhal and Gabriele Pacciucci * Other names and brands may be claimed as the property of others. Lustre File
More informationIT Service Management with System Center Service Manager
3 Riverchase Office Plaza Hoover, Alabama 35244 Phone: 205.989.4944 Fax: 855.317.2187 E-Mail: rwhitney@discoveritt.com Web: www.discoveritt.com IT Service Management with System Center Service Manager
More informationSAP Data Services 4.X. An Enterprise Information management Solution
SAP Data Services 4.X An Enterprise Information management Solution Table of Contents I. SAP Data Services 4.X... 3 Highlights Training Objectives Audience Pre Requisites Keys to Success Certification
More informationNASA s Big Data Challenges in Climate Science
NASA s Big Data Challenges in Climate Science Tsengdar Lee, Ph.D. High-end Computing Program Manager NASA Headquarters Presented at IEEE Big Data 2014 Workshop October 29, 2014 1 2 7-km GEOS-5 Nature Run
More informationExploration of adaptive network transfer for 100 Gbps networks Climate100: Scaling the Earth System Grid to 100Gbps Network
Exploration of adaptive network transfer for 100 Gbps networks Climate100: Scaling the Earth System Grid to 100Gbps Network February 1, 2012 Project period of April 1, 2011 through December 31, 2011 Principal
More informationBuild Transactions Into Your Apps and Mobilize Your Enterprise with MicroStrategy 10
Build Transactions Into Your Apps and Mobilize Your Enterprise with MicroStrategy 10 Agenda Introduction What is a workflow app? Transaction Services Component Objects Transaction Input Controls Transaction
More informationAssociate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2
Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue
More informationIBM Netezza High Capacity Appliance
IBM Netezza High Capacity Appliance Petascale Data Archival, Analysis and Disaster Recovery Solutions IBM Netezza High Capacity Appliance Highlights: Allows querying and analysis of deep archival data
More informationData Integration and ETL with Oracle Warehouse Builder NEW
Oracle University Appelez-nous: +33 (0) 1 57 60 20 81 Data Integration and ETL with Oracle Warehouse Builder NEW Durée: 5 Jours Description In this 5-day hands-on course, students explore the concepts,
More informationThepurposeofahospitalinformationsystem(HIS)istomanagetheinformationthathealth
FederatedDatabaseSystemsforReplicatingInformationin UniversityofDortmund,DepartmentofComputerScience,Informatik10 ExtendingtheSchemaArchitectureof E-mail:willi@ls10.informatik.uni-dortmund.de HospitalInformationSystems
More informationA Technology for BigData Analysis Task Description using Domain-Specific Languages
A Technology for Big Analysis Task Description using Domain-Specific Languages Sergey V. Kovalchuk 1, Artem V. Zakharchuk 1, Jiaqi Liao 1, Sergey V. Ivanov 1, Alexander V. Boukhanovsky 1,2 1 ITMO University,
More informationCritical Strategies for Improving the Code Quality and Cross-Disciplinary Impact of the Computational Earth Sciences
Critical Strategies for Improving the Code Quality and Cross-Disciplinary Impact of the Computational Earth Sciences Johnny Wei-Bing Lin (Physics Department, North Park University) Tyler A. Erickson (MTRI
More informationM2074 - Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000 5 Day Course
Module 1: Introduction to Data Warehousing and OLAP Introducing Data Warehousing Defining OLAP Solutions Understanding Data Warehouse Design Understanding OLAP Models Applying OLAP Cubes At the end of
More informationThe THREDDS Data Repository: for Long Term Data Storage and Access
8B.7 The THREDDS Data Repository: for Long Term Data Storage and Access Anne Wilson, Thomas Baltzer, John Caron Unidata Program Center, UCAR, Boulder, CO 1 INTRODUCTION In order to better manage ever increasing
More informationClinical Knowledge Manager. Product Description 2012 MAKING HEALTH COMPUTE
Clinical Knowledge Manager Product Description 2012 MAKING HEALTH COMPUTE Cofounder and major sponsor Member and official submitter for HL7/OMG HSSP RLUS, EIS 'openehr' is a registered trademark of the
More informationSisense. Product Highlights. www.sisense.com
Sisense Product Highlights Introduction Sisense is a business intelligence solution that simplifies analytics for complex data by offering an end-to-end platform that lets users easily prepare and analyze
More informationSEIZE THE DATA. 2015 SEIZE THE DATA. 2015
1 Copyright 2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. BIG DATA CONFERENCE 2015 Boston August 10-13 Predicting and reducing deforestation
More information#MMTM15 #INFOARCHIVE #EMCWORLD 1
#MMTM15 #INFOARCHIVE #EMCWORLD 1 1 INFOARCHIVE A TECHNICAL OVERVIEW DAVID HUMBY SOFTWARE ARCHITECT #MMTM15 2 TWEET LIVE DURING THE SESSION! Connect with us: Sign up for a Hands On Lab 6 th May, 1.30 PM,
More informationFunctional Requirements for Digital Asset Management Project version 3.0 11/30/2006
/30/2006 2 3 4 5 6 7 8 9 0 2 3 4 5 6 7 8 9 20 2 22 23 24 25 26 27 28 29 30 3 32 33 34 35 36 37 38 39 = required; 2 = optional; 3 = not required functional requirements Discovery tools available to end-users:
More informationHDF5-iRODS Project. August 20, 2008
P A G E 1 HDF5-iRODS Project Final report Peter Cao The HDF Group 1901 S. First Street, Suite C-2 Champaign, IL 61820 xcao@hdfgroup.org Mike Wan San Diego Supercomputer Center University of California
More informationSQL Server 2012 Business Intelligence Boot Camp
SQL Server 2012 Business Intelligence Boot Camp Length: 5 Days Technology: Microsoft SQL Server 2012 Delivery Method: Instructor-led (classroom) About this Course Data warehousing is a solution organizations
More informationAutomating Big Data Benchmarking for Different Architectures with ALOJA
www.bsc.es Jan 2016 Automating Big Data Benchmarking for Different Architectures with ALOJA Nicolas Poggi, Postdoc Researcher Agenda 1. Intro on Hadoop performance 1. Current scenario and problematic 2.
More informationDBMS / Business Intelligence, SQL Server
DBMS / Business Intelligence, SQL Server Orsys, with 30 years of experience, is providing high quality, independant State of the Art seminars and hands-on courses corresponding to the needs of IT professionals.
Ibis: Scaling Python Analytics on Hadoop and Impala
Ibis: Scaling Python Analytics on Hadoop and Impala Wes McKinney, Budapest BI Forum 2015-10-14 @wesmckinn 1 Me R&D at Cloudera Serial creator of structured data tools / user interfaces Mathematician MIT
FP-Hadoop: Efficient Execution of Parallel Jobs Over Skewed Data
FP-Hadoop: Efficient Execution of Parallel Jobs Over Skewed Data Miguel Liroz-Gistau, Reza Akbarinia, Patrick Valduriez To cite this version: Miguel Liroz-Gistau, Reza Akbarinia, Patrick Valduriez. FP-Hadoop:
AV-005: Administering and Implementing a Data Warehouse with SQL Server 2014
AV-005: Administering and Implementing a Data Warehouse with SQL Server 2014 Career Details Duration 105 hours Prerequisites This career requires that you meet the following prerequisites: Working knowledge
Data Management in the Cloud: Limitations and Opportunities. Annies Ductan
Data Management in the Cloud: Limitations and Opportunities Annies Ductan Discussion Outline: Introduction Overview Vision of Cloud Computing Managing Data in The Cloud Cloud Characteristics Data Management
Designing and Building Applications for Extreme Scale Systems CS598 William Gropp www.cs.illinois.edu/~wgropp
Designing and Building Applications for Extreme Scale Systems CS598 William Gropp www.cs.illinois.edu/~wgropp Welcome! Who am I? William (Bill) Gropp Professor of Computer Science One of the Creators of
Parallel I/O on JUQUEEN
Parallel I/O on JUQUEEN 3. February 2015 3rd JUQUEEN Porting and Tuning Workshop Sebastian Lührs, Kay Thust s.luehrs@fz-juelich.de, k.thust@fz-juelich.de Jülich Supercomputing Centre Overview Blue Gene/Q
Workflow Tools at NERSC. Debbie Bard djbard@lbl.gov NERSC Data and Analytics Services
Workflow Tools at NERSC Debbie Bard djbard@lbl.gov NERSC Data and Analytics Services NERSC User Meeting August 13th, 2015 What Does Workflow Software Do? Automate connection of applications Chain together
6231B: Maintaining a Microsoft SQL Server 2008 R2 Database
6231B: Maintaining a Microsoft SQL Server 2008 R2 Database Course Overview This instructor-led course provides students with the knowledge and skills to maintain a Microsoft SQL Server 2008 R2 database.
Applying Apache Hadoop to NASA's Big Climate Data
National Aeronautics and Space Administration Applying Apache Hadoop to NASA's Big Climate Data Use Cases and Lessons Learned Glenn Tamkin (NASA/CSC) Team: John Schnase (NASA/PI), Dan Duffy (NASA/CO),
Release 2.1 of SAS Add-In for Microsoft Office Bringing Microsoft PowerPoint into the Mix ABSTRACT INTRODUCTION Data Access
Release 2.1 of SAS Add-In for Microsoft Office Bringing Microsoft PowerPoint into the Mix Jennifer Clegg, SAS Institute Inc., Cary, NC Eric Hill, SAS Institute Inc., Cary, NC ABSTRACT Release 2.1 of SAS
MDM for the Enterprise: Complementing and extending your Active Data Warehousing strategy. Satish Krishnaswamy VP MDM Solutions - Teradata
MDM for the Enterprise: Complementing and extending your Active Data Warehousing strategy Satish Krishnaswamy VP MDM Solutions - Teradata 2 Agenda MDM and its importance Linking to the Active Data Warehousing
CloudCenter Full Lifecycle Management. An application-defined approach to deploying and managing applications in any datacenter or cloud environment
CloudCenter Full Lifecycle Management An application-defined approach to deploying and managing applications in any datacenter or cloud environment CloudCenter Full Lifecycle Management Page 2 Table of
What's New in SAS Data Management
Paper SAS034-2014 What's New in SAS Data Management Nancy Rausch, SAS Institute Inc., Cary, NC; Mike Frost, SAS Institute Inc., Cary, NC, Mike Ames, SAS Institute Inc., Cary ABSTRACT The latest releases
Oracle9i Data Warehouse Review. Robert F. Edwards Dulcian, Inc.
Oracle9i Data Warehouse Review Robert F. Edwards Dulcian, Inc. Agenda Oracle9i Server OLAP Server Analytical SQL Data Mining ETL Warehouse Builder 3i Oracle 9i Server Overview 9i Server = Data Warehouse
Zhenping Liu *, Yao Liang * Virginia Polytechnic Institute and State University. Xu Liang ** University of California, Berkeley
P1.1 AN INTEGRATED DATA MANAGEMENT, RETRIEVAL AND VISUALIZATION SYSTEM FOR EARTH SCIENCE DATASETS Zhenping Liu *, Yao Liang * Virginia Polytechnic Institute and State University Xu Liang ** University
SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications. Jürgen Primsch, SAP AG July 2011
SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications Jürgen Primsch, SAP AG July 2011 Why In-Memory? Information at the Speed of Thought Imagine access to business data,
Current Order Tool Experiences Complaints
Current Order Tool Experiences Complaints Log in unadvertised case sensitivity for email address that is used as login id CERES Dataset Info pages are too crowded!! On the Data Products Catalog page, remove
Build Your Knowledge!
About this Course This 3-day Instructor led course Explore several advanced topics of working with SharePoint 2013 sites. Topics include SharePoint Server site definitions (Business Intelligence, Document
Big Data Visualization with JReport
Big Data Visualization with JReport Dean Yao Director of Marketing Greg Harris Systems Engineer Next Generation BI Visualization JReport is an advanced BI visualization platform: Faster, scalable reports,
3rd Annual Earth System Grid Federation and Ultrascale Visualization Climate Data Analysis Tools Face-to-Face Meeting Report December 2013
3rd Annual Earth System Grid Federation and Ultrascale Visualization Climate Data Analysis Tools Face-to-Face Meeting Report December 2013 A global consortium of government agencies, educational institutions,
Ad Hoc Analysis of Big Data Visualization
Ad Hoc Analysis of Big Data Visualization Dean Yao Director of Marketing Greg Harris Systems Engineer Follow us @Jinfonet #BigDataWebinar JReport Highlights Advanced, Embedded Data Visualization Platform:
Bringing Big Data Modelling into the Hands of Domain Experts
Bringing Big Data Modelling into the Hands of Domain Experts David Willingham Senior Application Engineer MathWorks david.willingham@mathworks.com.au 2015 The MathWorks, Inc. 1 Data is the sword of the
SQL Server 2012 End-to-End Business Intelligence Workshop
USA Operations 11921 Freedom Drive Two Fountain Square Suite 550 Reston, VA 20190 solidq.com 800.757.6543 Office 206.203.6112 Fax info@solidq.com SQL Server 2012 End-to-End Business Intelligence Workshop
White Paper. How Streaming Data Analytics Enables Real-Time Decisions
White Paper How Streaming Data Analytics Enables Real-Time Decisions Contents Introduction... 1 What Is Streaming Analytics?... 1 How Does SAS Event Stream Processing Work?... 2 Overview...2 Event Stream
Enabling the Big Data Commons through indexing of data and their interactions
biomedical and healthcare Data Discovery Index Ecosystem Enabling the Big Data Commons through indexing of data and their interactions 2nd BD2K all-hands meeting Bethesda 11/12/15 Aims 1. Help users find accessible
A Capability Maturity Model for Scientific Data Management
A Capability Maturity Model for Scientific Data Management Kevin Crowston & Jian Qin School of Information Studies, Syracuse University July
System Requirements for Microsoft Dynamics NAV 2009
System Requirements for Microsoft Dynamics NAV 2009 RoleTailored client Microsoft Windows XP Professional SP3 or later (X86 or Microsoft Windows Vista (Business, Enterprise, or Ultimate) SP1 or later (X86
High Performance Computing OpenStack Options. September 22, 2015
High Performance Computing OpenStack Options September 22, 2015 Today's Presenters Glyn Bowden, SNIA Cloud Storage Initiative Board HP Helion Professional Services Alex McDonald,
A Case Study - Scaling Legacy Code on Next Generation Platforms
Available online at www.sciencedirect.com ScienceDirect Procedia Engineering 00 (2015) 000 000 www.elsevier.com/locate/procedia 24th International Meshing Roundtable (IMR24) A Case Study - Scaling Legacy