1 Reagan Moore, PI; Mary Whitton, Project Manager
National Science Foundation Cooperative Agreement: OCI 0940841

2 DFC to Support Hydrologic Modeling
Jon Goodall and Bakinam Essawy, University of Virginia
DFC October 2014 NSF Review

3 Motivation: Reproducibility
- Automate the data gathering, transformation, and ingestion steps required for modeling.
- Create reusable, reproducible, and sharable data analyses and visualizations from large data collections (e.g., model outputs).
- Use virtual machine images so that results can be reproduced.

4 Why Data Grids?
Problem: hydrology datasets are
- distributed
- held by multiple owners/curators
- heterogeneous in structure and semantics
Vision: co-locate data transformations with data sets
- Server-side data processing to support science
- Reduce the burden on scientists of completing repetitive data transformation steps
- Improve performance by reducing network traffic

5 Why Cloud Computing?
- We use Amazon Web Services (AWS) for our computing resources, specifically Elastic Compute Cloud (EC2) instances.
- This allows us to easily scale our computing up or down based on project demands.
- We can easily image instances and reproduce machines later, if needed.

6 Using Cloud Computing with iRODS
- An EC2 instance was launched and the iRODS server was installed on it.
- The server is connected to the iCAT located at RENCI.
- This iRODS server is the University of Virginia (UVA) resource.
- Another EC2 instance was launched at UVA. It acts as the client, as only the icommands were installed.

7 Launched Instances on AWS
[Screenshots: using PuTTY to connect to an instance on AWS; the instance on AWS]

8 Using the FileZilla client to transfer files into and out of the EC2 instance on AWS
[Screenshot: files copied to the AWS instance]

9 Integrated Rule-Oriented Data System (iRODS)
Source:

10 How AWS Interacts with iRODS
[Diagram labels: iRODS collection (/hydrology/home/bakinam); ./iget or ./iput command; EC2 instance, client, IP ( ); EC2 instance for iRODS server, IP ( ); personal computer, IP ( ); SSH (rules); SSH (workflow)]
1. The rule is on the client but is executed on the iRODS server.
2. Rules are located in the bin directory on the client.
3. Workflow scripts are in the iRODS server directory (/home/jlg7h/irods/server/bin/cmd).

11 Last Demo: Data Processing
VIC model input (pre-processing stage)
Source: Billah et al. (2014)

12 Demo Objectives
- Use the Terra Populus (TerraPop) Integrated Data on Population and Environment along with hydrology model outputs to better understand drought impacts.
- Automate the creation of visualizations from the TerraPop data and model output files using the DFC and iRODS Workflow Structured Objects (WSO).
- Share the visualization output, and the WSO used to create it, through the SEAD HydroShare Project Space.

13 The VIC Model
- VIC = Variable Infiltration Capacity, a regional-scale land surface hydrology model
- Developed at the University of Washington and Princeton; applied worldwide
- Spatial resolution: 1/8-degree grid cell
- Three layers of soil: top layer (Layer 0, 0-10 cm), mid layer (Layer 1, 10-30 cm), lower layer (Layer 2, cm)
Source: Gao et al. (2009)

14 VIC Output Data Set
Variable Infiltration Capacity (VIC) Macro-scale Hydrologic Model
VIC data set: show/45
[Screenshot: VIC output data set on the iRODS server]
Example of a flux file: Fluxes_x_y, where x = latitude and y = longitude.
Flux files contain information about moisture and energy fluxes at each time step for the three layers of soil (top, middle, and deep).
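
The Fluxes_x_y naming convention above can be read back into coordinates. The following is a minimal Python sketch, assuming the name is the word "fluxes" followed by a decimal latitude and longitude separated by underscores. The function name, the GridCell type, and the sample file name are illustrative and do not come from the VIC distribution.

```python
from typing import NamedTuple


class GridCell(NamedTuple):
    latitude: float
    longitude: float


def parse_flux_filename(name: str) -> GridCell:
    """Split a name of the form <prefix>_<latitude>_<longitude> into coordinates."""
    # rsplit keeps a negative longitude intact because "-" is not a separator.
    prefix, lat_text, lon_text = name.rsplit("_", 2)
    if prefix.lower() != "fluxes":
        raise ValueError(f"not a flux file name: {name!r}")
    return GridCell(float(lat_text), float(lon_text))


# Hypothetical file name for one 1/8-degree grid cell.
cell = parse_flux_filename("fluxes_35.0625_-80.9375")
print(cell.latitude, cell.longitude)
```

Keeping the coordinates in a named tuple makes it easy to use a grid cell as a dictionary key when many flux files are indexed together.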

15 Executing the WSO to Visualize the VIC Model Output
[Diagram labels: iRODS client; iRODS server; iRODS collection; vicsoilmoisture.rundir]
1. The VIC model output datasets and scripts are located on the iRODS server.
   - VIC output dataset
   - VIC processing scripts: run_psp_vic_soilmoisture.scr, spatiotempdatabase.py, vic_calc_mnth_mc.py, vic_monthly_soilmoisture.py, vic_soil_moisture.py
2. From the iRODS client, the workflow file (vicsoilmoisture1.mss) and the parameter file (vicsoilmoisture.mpf) are put into an iRODS collection.
3. iRODS generates a *.run file (vicsoilmoisture.run).
4. The *.run file is accessed by the iRODS client, and the workflow is executed on the iRODS server using the specific parameter file. Execution of the workflow generates a new directory, and the output is stored there.
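
The four steps above can be summarized in a small data structure. This is only a sketch in Python, not the iRODS WSO implementation: the WorkflowRun class is hypothetical, and it assumes the *.run file shares the parameter file's stem, as the names vicsoilmoisture.mpf and vicsoilmoisture.run on the slide suggest.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import List


@dataclass(frozen=True)
class WorkflowRun:
    workflow_file: str   # *.mss, put into the collection by the client
    parameter_file: str  # *.mpf, put into the collection by the client

    @property
    def run_file(self) -> str:
        # Assumption: the generated *.run file reuses the parameter file's stem.
        return PurePosixPath(self.parameter_file).stem + ".run"

    def steps(self) -> List[str]:
        """Describe the client-side sequence in order."""
        return [
            f"put {self.workflow_file} and {self.parameter_file} into the iRODS collection",
            f"wait for iRODS to generate {self.run_file}",
            f"access {self.run_file} from the client so the workflow executes on the server",
        ]


run = WorkflowRun("vicsoilmoisture1.mss", "vicsoilmoisture.mpf")
for number, step in enumerate(run.steps(), start=1):
    print(number, step)
```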

16 County-Level Population Data from TerraPop
From TerraPop we obtained:
1. A shapefile for all the counties in the US, with a unique identifier called GEOID.
2. A .csv file that includes the GEOID, county name, total population, and cropland area.
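
Because GEOID appears in both the shapefile and the .csv file, it can serve as the join key between county attributes and per-county model output. The Python sketch below illustrates that idea under stated assumptions: the column names (GEOID, NAME, TOTAL_POP, CROPLAND_AREA) and all sample values are invented for illustration and are not taken from the TerraPop extract.

```python
import csv
import io
from typing import Dict

# Invented sample rows; real TerraPop column names and values may differ.
SAMPLE_CSV = """GEOID,NAME,TOTAL_POP,CROPLAND_AREA
00001,Example County A,1000,250.0
00002,Example County B,4000,100.0
"""


def load_county_table(text: str) -> Dict[str, dict]:
    """Index county records by GEOID, kept as a string to preserve leading zeros."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row["GEOID"]: {
            "name": row["NAME"],
            "population": int(row["TOTAL_POP"]),
            "cropland_area": float(row["CROPLAND_AREA"]),
        }
        for row in reader
    }


def join_by_geoid(counties: Dict[str, dict], soil_moisture: Dict[str, float]) -> Dict[str, dict]:
    """Attach a per-county model value to every county record that has one."""
    return {
        geoid: {**record, "soil_moisture": soil_moisture[geoid]}
        for geoid, record in counties.items()
        if geoid in soil_moisture
    }


counties = load_county_table(SAMPLE_CSV)
joined = join_by_geoid(counties, {"00002": 0.31})
print(joined)
```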

17 Automating TerraPop Data Retrieval Using the DFC Grid
1. An extract is assembled in the TerraPop data access system and sent to the TerraPop collection on the iRODS dfcmain grid (Dfcmain/home/TerraPop): boundaries_us_slad_2010.shp, data_us_slad_2010.csv.
2. iget: from the Terra Populus collection on the iRODS dfcmain grid to the iRODS client local machine.
3. iput: from the iRODS client local machine to the iRODS collection on the hydrology grid (/hydrology/home/bakinam/terrapopdata): boundaries_us_slad_2010.shp, data_us_slad_2010.csv.
4. iget: from the iRODS collection on the hydrology grid to the iRODS resource executable directory, the iRODS server bin directory (/var/lib/irods/irods/server/bin/): boundaries_us_slad_2010.shp, data_us_slad_2010.csv.
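
The three transfer hops above can be written down as an ordered list of icommands invocations. The Python sketch below only builds and prints the argument lists; it does not run them. The zone path /dfcmain/home/TerraPop and the client scratch directory /tmp/terrapop are assumptions, and in practice the last hop would be issued on the iRODS server rather than on the client.

```python
import shlex
from typing import List

FILES = ["boundaries_us_slad_2010.shp", "data_us_slad_2010.csv"]


def iget(collection: str, name: str, local_dir: str) -> List[str]:
    """argv that copies one data object from an iRODS collection to a local directory."""
    return ["iget", f"{collection}/{name}", f"{local_dir}/{name}"]


def iput(local_dir: str, name: str, collection: str) -> List[str]:
    """argv that copies one local file into an iRODS collection."""
    return ["iput", f"{local_dir}/{name}", f"{collection}/{name}"]


def staging_plan(source: str, target: str, client_dir: str, server_bin: str) -> List[List[str]]:
    plan = []
    for name in FILES:
        plan.append(iget(source, name, client_dir))  # dfcmain grid -> client machine
        plan.append(iput(client_dir, name, target))  # client machine -> hydrology grid
        plan.append(iget(target, name, server_bin))  # hydrology grid -> server bin directory
    return plan


plan = staging_plan(
    source="/dfcmain/home/TerraPop",  # assumed zone path
    target="/hydrology/home/bakinam/terrapopdata",
    client_dir="/tmp/terrapop",  # hypothetical scratch directory
    server_bin="/var/lib/irods/irods/server/bin",
)
for argv in plan:
    print(" ".join(shlex.quote(part) for part in argv))
```

Building the commands as lists rather than as one shell string keeps file names with unusual characters safe if the plan is later passed to a process launcher.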

18 Impact on Science
- The ability for scientific communities to collaborate around repeatable, reusable, and sharable data visualization workflows that operate on large data collections in DFC
- The ability to include datasets from other communities as resources within the data visualizations (e.g., TerraPop)
- The ability to easily share visualizations with metadata through SEAD

19 Who is DFC?

20 DataNet Federation Consortium
[Map labels: Drexel Engineering/MRC; TDLC/Scripps/DICE; iPlant Collaborative; RENCI/Arizona State; U Virginia; UNC-CH/Odum/SILS/DICE]

23 Science Observatory Network (SciON)

24 SciON Goals
High-level goals:
- SciON: Science Observatory Network, a seismic, marine, and environmental science network
- SciON and DFC are federated to provide persistence for sensor data
Immediate goals: enable ingestion of sensor data into DFC
- Ingest streaming real-time seismic data from the NSF USArray Transportable Array
- Ingest streaming real-time atmospheric pressure data from the NSF USArray Transportable Array
- Ingest streaming real-time seismic data from the Central and Eastern US Network
- Continue to provide data collected by the OOI as available

25 Importance of Real-Time Sensor Data
- Provides a real-time view of the sensing environment
- Enables early warning systems for earthquakes, tsunamis, wildfires, severe weather, and ocean waves
- Provides oceanographic data in near real time for oceanographic analysis and modeling

26 Antelope Environmental Monitoring Software (commercial software)
Data Collection and Analysis: Antelope is an integrated collection of programs for data collection and seismic data analysis, and typically runs at the central processing site. It has been in development for over a decade and is deployed around the world.
Near-Real-Time Processing: The Antelope Real Time System is built around a large, flexible, non-volatile ring buffer. Data acquisition modules communicate with data loggers and leave data on the ring buffer. The ring buffer protocol provides a convenient method for directly importing data from other sites, as well as exporting data. Real-time processing typically occurs on the ring buffer: programs take input from the ring buffer and write results to the ring buffer. For instance, the detector reads data from the ring buffer and writes detections back to the ring buffer (the orb). The grid associator reads the detections and quickly provides preliminary event locations. This architecture facilitates running multiple detectors or associators, and other refinements.
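
The read-from-buffer, write-to-buffer pattern described above can be illustrated with a toy example. The Python sketch below is not Antelope code and uses none of its interfaces: the RingBuffer class is an in-memory stand-in (Antelope's buffer is non-volatile), and the amplitude-threshold detector is a deliberately simple placeholder for a real detection algorithm.

```python
from collections import deque
from typing import Any, List


class RingBuffer:
    """Fixed-capacity buffer that overwrites the oldest packet when full."""

    def __init__(self, capacity: int) -> None:
        self._packets = deque(maxlen=capacity)

    def write(self, kind: str, payload: Any) -> None:
        self._packets.append((kind, payload))

    def read(self, kind: str) -> List[Any]:
        """Return a snapshot of the payloads of one kind, oldest first."""
        return [payload for packet_kind, payload in self._packets if packet_kind == kind]


def run_detector(orb: RingBuffer, threshold: float) -> None:
    """Read waveform samples from the buffer and write detections back to it."""
    for index, sample in orb.read("waveform"):
        if abs(sample) >= threshold:
            orb.write("detection", index)


orb = RingBuffer(capacity=16)
for index, sample in enumerate([0.1, -0.2, 3.5, 0.0, -4.1]):
    orb.write("waveform", (index, sample))  # acquisition side leaves data on the buffer
run_detector(orb, threshold=3.0)
print(orb.read("detection"))
```

Because producers and consumers share only the buffer, a second detector or an associator can be added by writing another function that reads one packet kind and writes another.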

27 More on Antelope
Open Architecture: Antelope runs in a UNIX environment on Linux x86, Linux XScale, and Macintosh OS X. Antelope has an open architecture, with extensive documentation of internal interfaces. There is already support for many common dataloggers and sensors, but other dataloggers, sensors, and novel devices may be integrated by the end user.
Integrated Database: A relational database underlies Antelope real-time processing. Waveforms, detections, events, and other data are saved from the ring buffer into the database. The analyst reviews data in the database, and there are database versions of programs corresponding to the real-time processing software.
Development Environment: Antelope is a development environment. There is extensive documentation, allowing development of specialized software for site-specific applications. Languages supported include C, Fortran, Perl, Tcl/Tk, and Python. The Antelope Toolbox for MATLAB is also included for locations that already have MATLAB installed. The Antelope Users' Group has collected source code generated by the Antelope user community.

28 Antelope Usage
United States: USArray; Plate Boundary Observatory; Alaska Earthquake Center; University of Nevada, Reno; US Air Force
Elsewhere: Australia, Austria, Canada, Chile, Italy, Saudi Arabia, Oman, Korea, Malaysia, Taiwan, Antarctica

29 USArray Data Flow

30 Science Impact
Seamless integration of data from different science disciplines

31 Years 4 and 5
- Make the interface to Antelope more robust
- Access Antelope waveform data from different types of sensors
- Access other types of streaming data available in Antelope instances
- Create and deposit Archival Sensor Packages into NetCDF resources (e.g., a THREDDS server)
- Promote applications using asynchronous federation (subscription, notification, playback)
- Provide oceanographic data as available

33 DFC to Support Social Science Data Preservation and Access
Jonathan Crabtree, Odum Institute, University of North Carolina at Chapel Hill

34 Odum's Multidisciplinary Mission
- Promote diverse social science research
- Culture of multidisciplinary collaboration
- Methodological training programs
- Leveraging data science
- Focus on data reuse and research transparency
- History of data stewardship
- Dataverse Network partner
- Founding member of Data-PASS

35 Social Science Archives in Collaboration for Preservation
- Strategic partnership agreements
- Coordinated operations
- Joint best practices
- Shared federated catalog
- Shared tools and technologies

36 Model for Sustainability Strategies
- Diverse funding sources
- Strategic partnerships
- Multiple business models within the partnership
- Diverse holdings that appeal to a wide array of disciplines
- Data-transfer agreements that ensure sustainability

37 Odum's Overall Demo Goals
- Design curation workflow integration
- Connect the research environment with the archive
- Connect the archive with the national architecture
- Open-source focused
- As pluggable as possible

38 Bringing Great Tools Together
[Architecture diagram labels: iRODS; Dataverse; ModeShape; Databook; Apache ServiceMix; iRODS rule integration; indexing engine]

39 Dataverse Storage Abstraction
- The current production environment is tied to a UNIX-based file system
- In this prototype we used ModeShape
- Expands the storage options for Dataverse
- Abstracts the storage layer
- Allows a standardized interface to iRODS
- Allows future use of rules-based policy management and ties to Dataverse Data Tags

40 Leveraging DFC Infrastructure
- Databook infrastructure
- Rules-based policy management
- Distributed preservation environment
- Secure distributed storage based on policy rules

41 Odum Demo

42 Advantages for Social Scientists
- Simplifies the archive's connection to the active research environment
- Adds the possibility of rules-based policy management, reducing data sharing burdens
- Enhances secure data sharing possibilities
- Diversifies Dataverse preservation options
- Allows curators in the archive to assist researchers with data management within individual workflows
- Gives researchers access to diverse data from many disciplines

43 Impact on Science
This DFC infrastructure will act as a catalyst for innovative combinations of digital data. Seamless discovery of data from diverse disciplines could be a fundamental shift in how researchers gather and use digital data.
Use case: A hydrologist uses the Dataverse multidisciplinary search.
- Topic: impacts of drought
- Result: finds researchers down the watershed engaged in similar issues (e.g., coastal estuary damage and its effect on fisheries, changes in oceanographic currents and their effect on weather patterns)

47 DFC to Support the Science of Learning
Andrea A. Chiba, Science Director, Temporal Dynamics of Learning Center

49 Our Goal: Understand the role of time and timing in learning
From the neuronal and millisecond scale to the brain and multi-year scale

50 Temporal Dynamics of Learning Center: Overview

51 The Network of Networks
Institutions: UC San Diego, Brown University, Carnegie Mellon University, Imperial College, Rutgers University, San Diego State University, The Salk Institute, UC Berkeley, U. at Buffalo, University of Colorado, U. Pennsylvania, U. Pittsburgh, U. Queensland, U. of Victoria, U. of Washington, Vanderbilt University, Virginia State University
Networks: SensoriMotor Network, Social Interaction Network, Interacting Memory Systems, Perceptual Expertise Network
Names shown: Mercado, Jernigan, Serpell, Schultz
Disciplines: Cognitive Science, Cognitive Psychology, Computational Neuroscience, Computer Science, Developmental Psychology, Education, Learning Theory, Linguistics, Machine Learning, Mathematics, Neuropsychology, Neuroscience, Physics, Robotics

52 Actual Collaborations
[Figure labels: SIN, PEN, SMN, IMSN]

53 TDLC requires support for geographically distributed collaborations that share large datasets across tasks and species, coordinate joint analyses of data, share novel stimulus sets, and support computational modeling, while ensuring that IRB, HIPAA, and IACUC data restrictions are maintained.

54 Distributed Challenges

56 Bridging Time and Space with iRODS
- iRODS enables geographically distributed collaborative science, bringing laboratories together through a venue for large-scale, real-time data sharing.
- Sharing data in accordance with human and animal subjects restrictions requires the implementation of rules for sharing.
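
What a "rule for sharing" has to decide can be shown with a minimal check. The Python sketch below is purely illustrative: it is not written in the iRODS rule language, and the Dataset and User types, the restriction labels, and the sample names are all hypothetical. It assumes access is granted only when a user holds every approval a dataset requires.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Dataset:
    name: str
    restrictions: FrozenSet[str] = frozenset()  # e.g. {"IRB"} or {"IACUC"}


@dataclass(frozen=True)
class User:
    name: str
    approvals: FrozenSet[str] = frozenset()


def may_access(user: User, dataset: Dataset) -> bool:
    """Allow access only if the user holds every approval the dataset requires."""
    return dataset.restrictions <= user.approvals


video = Dataset("session_video", frozenset({"IRB"}))
collaborator = User("collaborator", frozenset({"IRB", "IACUC"}))
visitor = User("visitor")

print(may_access(collaborator, video))
print(may_access(visitor, video))
```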

57 Team Science

58 Distributed Challenges

59 Video experimental data uploaded using iDrop Web
Video data downloaded for use in Australia

60 Local neural data bulk-uploaded using iDrop Web
Processed collections of data received and uploaded from Boston
Collections accessed from Australia for use in models

61 Technological Sustainability
[Diagram labels: remote iRODS clients; local iRODS clients; MMAP'd high-speed direct I/O; iRODS application; virtualized I/O; DataDirect SFA grid-enabled storage; KVM driver; DDN RAID stack; Linux kernel; SSD, SAS, SATA]

62 Impact on Science
- The ability for scientific communities to collaborate on large data collections, even repurposing data for robotic models
- The ability to continue to collaborate with other science of learning centers and to share and publish data collections as desired
- The ability to adhere to institutional review board standards for sharing and protection of data while also using a virtual platform

63 Future Directions
- Establish standards for different collections from different scientific domains.
- Enable data analysis on the grid, using each scientific domain's tools.
- Create an easy interface for the Rules Engine (for control of each dataset under IRB and IACUC restrictions).
- Develop portals for sharing data training sets for student education (interface with MOOCs).

67 iPlant: Data to Discovery
CI for data-intensive life sciences
Nirav Merchant, iPlant/University of Arizona

68 The iPlant Collaborative: Vision
Enable life science researchers and educators to use and extend cyberinfrastructure

69 iPlant Architectural Motivation
- We strive to be the CI Lego blocks
- Lego: Danish 'leg godt', 'play well'; also translates as 'I put together' in Latin
- If a solution is not available, you can craft your own using iPlant CI components

70 iPlant Products

71 The Community
- Overall registered users: upwards of 20,000
- Each product has approx unique users per week
- Each iPlant product has unique data access and use patterns
- The central nature of data makes the iPlant Data Store an essential component, along with the Auth service that powers all iPlant platforms

72 Data Challenges for iPlant
- It is common for each user to have 100 GB to 2 TB of data
- Team science and virtual organizations (VOs) are very prevalent, often global and highly distributed
- Consortiums have TB data to share
- Data movement, sharing, and leveraging compute resources are a necessity: connect data with compute
- Compute resources are provided by multiple partners
- Support for modern ways to share and collaborate with data-driven teams (web, API, cloud, etc.)
- Establish the iPlant Data Commons; improve discoverability

73 Milestones and Accomplishments, Year 3
- The iPlant test grid has been integrated with DFC
- Integration of the iPlant data grid with the DFC data grid is in progress
- The policy requirements were assembled
- Several microservices are used for integrity and end-to-end encryption (developed by the Technology Group at the request of iPlant)
- Tuning and optimization of the metadata catalog to improve performance (performed by the Technology Group at the request of iPlant)
- Generalization of the Discovery Environment is in progress
- Exposing iPlant's HPC support to the DFC community is in progress
- Generalized add-ins that let popular desktop analysis applications (NIH ImageJ and TASSEL) access iRODS were developed and will be made available to the DFC community

74 iPlant: Life Sciences/Biology Demo
[Diagram labels, grouped as they appear on the slide]
Community and iPlant data providers:
- NGS researcher (next-generation sequencing): iDrop, icommands; 30 GB+ files
- Image analysis (NIH ImageJ) and TASSEL (genetics): Jargon plugins; 5 GB+ files
- iPlant Discovery Environment: metadata driven
- Large knowledge repositories (genome browsers), UCSC and ENSEMBL UK: request data by range; create interactive visualizations for web and desktop systems
- Ecologist (environmental data): bounding box + date range; iRODS FUSE; iPlant Atmosphere
Center: iPlant Data Store (iRODS)
New: NASA MERRA A/S (1970 to date) runs MapReduce and returns NetCDF for further analysis in Atmosphere
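
The ecologist's "bounding box + date range" request amounts to a spatial and temporal filter. The Python sketch below shows that filter on a few invented records; it is not the MERRA A/S interface, the SubsetRequest type and all sample values are hypothetical, and it assumes the box does not cross the antimeridian.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Tuple

Record = Tuple[float, float, date, float]  # latitude, longitude, day, value


@dataclass(frozen=True)
class SubsetRequest:
    south: float
    north: float
    west: float
    east: float
    start: date
    end: date

    def matches(self, lat: float, lon: float, day: date) -> bool:
        """True when the point lies inside the box and the day inside the range (inclusive)."""
        return (
            self.south <= lat <= self.north
            and self.west <= lon <= self.east
            and self.start <= day <= self.end
        )


def subset(records: Iterable[Record], request: SubsetRequest) -> List[Record]:
    return [r for r in records if request.matches(r[0], r[1], r[2])]


# Invented sample records for illustration only.
records = [
    (35.0, -80.0, date(2012, 7, 1), 0.21),
    (35.5, -79.5, date(2012, 8, 1), 0.18),
    (48.0, -120.0, date(2012, 7, 1), 0.30),
    (35.2, -80.2, date(1999, 7, 1), 0.25),
]
request = SubsetRequest(34.0, 36.0, -81.0, -79.0, date(2012, 6, 1), date(2012, 9, 30))
print(subset(records, request))
```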

75 Years Forward
1. iPlant federation with other sites (in the US and UK)
2. HIVE/metadata integration for our Data Commons effort
3. Metrics and an understanding of how our data store is being utilized, to improve data management capabilities and planning for growth
4. Scalability for iRODS FUSE; data distribution via content delivery systems (read and write)
5. Data Carpentry, especially for next-generation sequencing and imaging users
6. Enhanced support for spatial data and LIDAR, and use of container technology (Docker) for processing data directly on resource servers

76 Impact of DFC
- Ability to provide tools and capabilities to break the 2 GB (web-protocol-imposed) barrier for data transfer
- Allows users to manage their data with ease, connecting compute and data and facilitating analysis at the scales required by their domain
- Wide range of interfaces developed
- Avenue to share best practices and metadata management
- New collaborations, e.g., NASA MERRA A/S

78 Milestones for Years 4-5

81 Demo Takeaways

82 Demos: Takeaways
- These are not point solutions; this is about cyberinfrastructure that applications can tie together.
- A base architecture has been developed that allows pluggable features, and the demonstrated features are examples of plug-ins.
- The DFC architecture and plug-ins support preservation, sharing, and discovery by virtue of storing your data on the DFC grid, and those capabilities are available via any client.
- Plug-ins are compact and allow easy extension to accommodate use cases in different domains.

83 Goals, Challenges, Milestones, Impact
Goals: index/data science; representation/ontologies; preservation/discovery
Challenges (continue as the DFC grows): diversity of formats, domains, and terminologies; impediments to discoverability, usability, etc.
Milestones and impact:
- Interoperable metadata/data infrastructure
- Data that is sustainable, discoverable, searchable, accessible, usable; can be repurposed
- Support, even expedite, science
- Support better science long term
[Diagram labels: Format Registry; HIVE; Databook]

84 Motivations
- Preserve data in neutral formats
- Maintain metadata about significant properties and provenance
- Share data by federation
- Control data by policies
- Maximize data value by facilitating curation and discovery


More information

Data Driven Discovery In the Social, Behavioral, and Economic Sciences

Data Driven Discovery In the Social, Behavioral, and Economic Sciences Data Driven Discovery In the Social, Behavioral, and Economic Sciences Simon Appleford, Marshall Scott Poole, Kevin Franklin, Peter Bajcsy, Alan B. Craig, Institute for Computing in the Humanities, Arts,

More information

Data Intensive Cyber Environments Center University of North Carolina at Chapel Hill

Data Intensive Cyber Environments Center University of North Carolina at Chapel Hill Data Intensive Cyber Environments Center University of North Carolina at Chapel Hill DataNet Federation Consortium NSF Award OCI-0940841 Period of Performance: 09/01/2011 8/31/2016 Project funding $7,999,998:

More information

Boulder Creek Critical Zone Observatory Data Management Plan

Boulder Creek Critical Zone Observatory Data Management Plan Boulder Creek Critical Zone Observatory Data Management Plan Types of data The Boulder Creek Critical Zone Observatory (CZO) focuses on research in the Boulder Creek watershed. This encompasses Green Lakes

More information

DataGrids 2.0 irods - A Second Generation Data Cyberinfrastructure. Arcot (RAJA) Rajasekar DICE/SDSC/UCSD

DataGrids 2.0 irods - A Second Generation Data Cyberinfrastructure. Arcot (RAJA) Rajasekar DICE/SDSC/UCSD DataGrids 2.0 irods - A Second Generation Data Cyberinfrastructure Arcot (RAJA) Rajasekar DICE/SDSC/UCSD What is SRB? First Generation Data Grid middleware developed at the San Diego Supercomputer Center

More information

The THREDDS Data Repository: for Long Term Data Storage and Access

The THREDDS Data Repository: for Long Term Data Storage and Access 8B.7 The THREDDS Data Repository: for Long Term Data Storage and Access Anne Wilson, Thomas Baltzer, John Caron Unidata Program Center, UCAR, Boulder, CO 1 INTRODUCTION In order to better manage ever increasing

More information

OpenNebula Open Souce Solution for DC Virtualization. C12G Labs. Online Webinar

OpenNebula Open Souce Solution for DC Virtualization. C12G Labs. Online Webinar OpenNebula Open Souce Solution for DC Virtualization C12G Labs Online Webinar What is OpenNebula? Multi-tenancy, Elasticity and Automatic Provision on Virtualized Environments I m using virtualization/cloud,

More information

OpenNebula Open Souce Solution for DC Virtualization

OpenNebula Open Souce Solution for DC Virtualization 13 th LSM 2012 7 th -12 th July, Geneva OpenNebula Open Souce Solution for DC Virtualization Constantino Vázquez Blanco OpenNebula.org What is OpenNebula? Multi-tenancy, Elasticity and Automatic Provision

More information

iplant + irods: Enabling data driven collaborations Nirav Merchant iplant Collaborative/Univ. of Arizona nirav@email.arizona.edu VAMP 2012 Utrecht

iplant + irods: Enabling data driven collaborations Nirav Merchant iplant Collaborative/Univ. of Arizona nirav@email.arizona.edu VAMP 2012 Utrecht iplant + irods: Enabling data driven collaborations Nirav Merchant iplant Collaborative/Univ. of Arizona nirav@email.arizona.edu VAMP 2012 Utrecht Topic Coverage About iplant 4 th Paradigm Technology challenges

More information

Building Platform as a Service for Scientific Applications

Building Platform as a Service for Scientific Applications Building Platform as a Service for Scientific Applications Moustafa AbdelBaky moustafa@cac.rutgers.edu Rutgers Discovery Informa=cs Ins=tute (RDI 2 ) The NSF Cloud and Autonomic Compu=ng Center Department

More information

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences WP11 Data Storage and Analysis Task 11.1 Coordination Deliverable 11.2 Community Needs of

More information

Canadian National Research Data Repository Service. CC and CARL Partnership for a national platform for Research Data Management

Canadian National Research Data Repository Service. CC and CARL Partnership for a national platform for Research Data Management Research Data Management Canadian National Research Data Repository Service Progress Report, June 2016 As their digital datasets grow, researchers across all fields of inquiry are struggling to manage

More information

OpenNebula Open Souce Solution for DC Virtualization

OpenNebula Open Souce Solution for DC Virtualization OSDC 2012 25 th April, Nürnberg OpenNebula Open Souce Solution for DC Virtualization Constantino Vázquez Blanco OpenNebula.org What is OpenNebula? Multi-tenancy, Elasticity and Automatic Provision on Virtualized

More information

UNINETT Sigma2 AS: architecture and functionality of the future national data infrastructure

UNINETT Sigma2 AS: architecture and functionality of the future national data infrastructure UNINETT Sigma2 AS: architecture and functionality of the future national data infrastructure Authors: A O Jaunsen, G S Dahiya, H A Eide, E Midttun Date: Dec 15, 2015 Summary Uninett Sigma2 provides High

More information

irods Technologies at UNC

irods Technologies at UNC irods Technologies at UNC E-iRODS: Enterprise irods at RENCI Presenter: Leesa Brieger leesa@renci.org SC12 irods Informational Reception 1! UNC Chapel Hill Investment in irods DICE and RENCI: research

More information

WOS Cloud. ddn.com. Personal Storage for the Enterprise. DDN Solution Brief

WOS Cloud. ddn.com. Personal Storage for the Enterprise. DDN Solution Brief DDN Solution Brief Personal Storage for the Enterprise WOS Cloud Secure, Shared Drop-in File Access for Enterprise Users, Anytime and Anywhere 2011 DataDirect Networks. All Rights Reserved DDN WOS Cloud

More information

Hadoop in the Hybrid Cloud

Hadoop in the Hybrid Cloud Presented by Hortonworks and Microsoft Introduction An increasing number of enterprises are either currently using or are planning to use cloud deployment models to expand their IT infrastructure. Big

More information

How To Write A Blog Post On Globus

How To Write A Blog Post On Globus Globus Software as a Service data publication and discovery Kyle Chard, University of Chicago Computation Institute, chard@uchicago.edu Jim Pruyne, University of Chicago Computation Institute, pruyne@uchicago.edu

More information

Solution for private cloud computing

Solution for private cloud computing The CC1 system Solution for private cloud computing 1 Outline What is CC1? Features Technical details Use cases By scientist By HEP experiment System requirements and installation How to get it? 2 What

More information

Note: Hands On workshops are Bring Your Own Laptop (BYOL), unless otherwise noted. Some workshops are Bring Your Own Mobile Device(BYOD).

Note: Hands On workshops are Bring Your Own Laptop (BYOL), unless otherwise noted. Some workshops are Bring Your Own Mobile Device(BYOD). 2015 MN GIS/LIS Consortium Pre Conference Workshops The Minnesota GIS/LIS Consortium is pleased to offer a diverse list of workshops on Wednesday, October 7th, 2015 at the DECC, Duluth, Minnesota Charting

More information

Databases & Data Infrastructure. Kerstin Lehnert

Databases & Data Infrastructure. Kerstin Lehnert + Databases & Data Infrastructure Kerstin Lehnert + Access to Data is Needed 2 to allow verification of research results to allow re-use of data + The road to reuse is perilous (1) 3 Accessibility Discovery,

More information

Introduction to Arvados. A Curoverse White Paper

Introduction to Arvados. A Curoverse White Paper Introduction to Arvados A Curoverse White Paper Contents Arvados in a Nutshell... 4 Why Teams Choose Arvados... 4 The Technical Architecture... 6 System Capabilities... 7 Commitment to Open Source... 12

More information

Environmental Data Management:

Environmental Data Management: Environmental Data Management: Challenges & Opportunities Mohan Ramamurthy, Unidata University Corporation for Atmospheric Research Boulder CO 25 May 2010 Environmental Data Management Workshop Silver

More information

Archiving, Indexing and Accessing Web Materials: Solutions for large amounts of data

Archiving, Indexing and Accessing Web Materials: Solutions for large amounts of data Archiving, Indexing and Accessing Web Materials: Solutions for large amounts of data David Minor 1, Reagan Moore 2, Bing Zhu, Charles Cowart 4 1. (88)4-104 minor@sdsc.edu San Diego Supercomputer Center

More information

XSEDE Overview John Towns

XSEDE Overview John Towns April 15, 2011 XSEDE Overview John Towns XD Solicitation/XD Program extreme Digital Resources for Science and Engineering (NSF 08 571) Extremely Complicated High Performance Computing and Storage Services

More information

With DDN Big Data Storage

With DDN Big Data Storage DDN Solution Brief Accelerate > ISR With DDN Big Data Storage The Way to Capture and Analyze the Growing Amount of Data Created by New Technologies 2012 DataDirect Networks. All Rights Reserved. The Big

More information

NASA s Big Data Challenges in Climate Science

NASA s Big Data Challenges in Climate Science NASA s Big Data Challenges in Climate Science Tsengdar Lee, Ph.D. High-end Computing Program Manager NASA Headquarters Presented at IEEE Big Data 2014 Workshop October 29, 2014 1 2 7-km GEOS-5 Nature Run

More information

globus online Cloud-based services for (reproducible) science Ian Foster Computation Institute University of Chicago and Argonne National Laboratory

globus online Cloud-based services for (reproducible) science Ian Foster Computation Institute University of Chicago and Argonne National Laboratory globus online Cloud-based services for (reproducible) science Ian Foster Computation Institute University of Chicago and Argonne National Laboratory Computation Institute (CI) Apply to challenging problems

More information

13.2 THE INTEGRATED DATA VIEWER A WEB-ENABLED APPLICATION FOR SCIENTIFIC ANALYSIS AND VISUALIZATION

13.2 THE INTEGRATED DATA VIEWER A WEB-ENABLED APPLICATION FOR SCIENTIFIC ANALYSIS AND VISUALIZATION 13.2 THE INTEGRATED DATA VIEWER A WEB-ENABLED APPLICATION FOR SCIENTIFIC ANALYSIS AND VISUALIZATION Don Murray*, Jeff McWhirter, Stuart Wier, Steve Emmerson Unidata Program Center, Boulder, Colorado 1.

More information

A Service for Data-Intensive Computations on Virtual Clusters

A Service for Data-Intensive Computations on Virtual Clusters A Service for Data-Intensive Computations on Virtual Clusters Executing Preservation Strategies at Scale Rainer Schmidt, Christian Sadilek, and Ross King rainer.schmidt@arcs.ac.at Planets Project Permanent

More information

Data Management. Facility Access Challenges: Rudi Eigenmann NEES Operations Headquarters NEEScomm Center Purdue University

Data Management. Facility Access Challenges: Rudi Eigenmann NEES Operations Headquarters NEEScomm Center Purdue University George E. Brown Jr. Network for Earthquake Engineering Simulation Facility Access Challenges: Data Management Rudi Eigenmann NEES Operations Headquarters NEEScomm Center Purdue University Why Data Management?

More information

Science Gateways in the US. Nancy Wilkins-Diehr wilkinsn@sdsc.edu

Science Gateways in the US. Nancy Wilkins-Diehr wilkinsn@sdsc.edu Science Gateways in the US Nancy Wilkins-Diehr wilkinsn@sdsc.edu NSF vision for cyberinfrastructure in the 21st century Software is critical to today s scientific advances Science is all about connections

More information

Environment Canada Data Management Program. Paul Paciorek Corporate Services Branch May 7, 2014

Environment Canada Data Management Program. Paul Paciorek Corporate Services Branch May 7, 2014 Environment Canada Data Management Program Paul Paciorek Corporate Services Branch May 7, 2014 EC Data Management Program (ECDMP) consists of 5 foundational, incremental projects which will implement

More information

NERC Data Policy Guidance Notes

NERC Data Policy Guidance Notes NERC Data Policy Guidance Notes Author: Mark Thorley NERC Data Management Coordinator Contents 1. Data covered by the NERC Data Policy 2. Definition of terms a. Environmental data b. Information products

More information

Orbiter Series Service Oriented Architecture Applications

Orbiter Series Service Oriented Architecture Applications Workshop on Science Agency Uses of Clouds and Grids Orbiter Series Service Oriented Architecture Applications Orbiter Project Overview Mark L. Green mlgreen@txcorp.com Tech-X Corporation, Buffalo Office

More information

SCOOP Data Management: A Standards-based Distributed System for Coastal Data and Modeling

SCOOP Data Management: A Standards-based Distributed System for Coastal Data and Modeling SCOOP Data Management: A Standards-based Distributed System for Coastal Data and Modeling Helen Conover, Bruce Beaumont, Marilyn Drewry, Sara Graves, Ken Keiser, Manil Maskey, Matt Smith The University

More information

Exploring the roles and responsibilities of data centres and institutions in curating research data a preliminary briefing.

Exploring the roles and responsibilities of data centres and institutions in curating research data a preliminary briefing. Exploring the roles and responsibilities of data centres and institutions in curating research data a preliminary briefing. Dr Liz Lyon, UKOLN, University of Bath Introduction and Objectives UKOLN is undertaking

More information

Deploying a distributed data storage system on the UK National Grid Service using federated SRB

Deploying a distributed data storage system on the UK National Grid Service using federated SRB Deploying a distributed data storage system on the UK National Grid Service using federated SRB Manandhar A.S., Kleese K., Berrisford P., Brown G.D. CCLRC e-science Center Abstract As Grid enabled applications

More information

Integrating a Big Data Platform into Government:

Integrating a Big Data Platform into Government: Integrating a Big Data Platform into Government: Drive Better Decisions for Policy and Program Outcomes John Haddad, Senior Director Product Marketing, Informatica Digital Government Institute s Government

More information

Information and Communications Technology Strategy 2014-2017

Information and Communications Technology Strategy 2014-2017 Contents 1 Background ICT in Geoscience Australia... 2 1.1 Introduction... 2 1.2 Purpose... 2 1.3 Geoscience Australia and the Role of ICT... 2 1.4 Stakeholders... 4 2 Strategic drivers, vision and principles...

More information

Data grid storage for digital libraries and archives using irods

Data grid storage for digital libraries and archives using irods Data grid storage for digital libraries and archives using irods Mark Hedges, Centre for e-research, King s College London eresearch Australasia, Melbourne, 30 th Sept. 2008 Background: Project History

More information

FREE computing using Amazon EC2

FREE computing using Amazon EC2 FREE computing using Amazon EC2 Seong-Hwan Jun 1 1 Department of Statistics Univ of British Columbia Nov 1st, 2012 / Student seminar Outline Basics of servers Amazon EC2 Setup R on an EC2 instance Stat

More information

Data Lab Operations Concepts

Data Lab Operations Concepts Data Lab Operations Concepts 1 Introduction This talk will provide an overview of Data Lab components to be implemented Core infrastructure User applications Science Capabilities User Interfaces The scope

More information

Enhanced Research Data Management and Publication with Globus

Enhanced Research Data Management and Publication with Globus Enhanced Research Data Management and Publication with Globus Vas Vasiliadis Jim Pruyne Presented at OR2015 June 8, 2015 Presentations and other useful information available at globus.org/events/or2015/tutorial

More information

Storage Switzerland White Paper Storage Infrastructures for Big Data Workflows

Storage Switzerland White Paper Storage Infrastructures for Big Data Workflows Storage Switzerland White Paper Storage Infrastructures for Big Data Workflows Sponsored by: Prepared by: Eric Slack, Sr. Analyst May 2012 Storage Infrastructures for Big Data Workflows Introduction Big

More information

Assignment # 1 (Cloud Computing Security)

Assignment # 1 (Cloud Computing Security) Assignment # 1 (Cloud Computing Security) Group Members: Abdullah Abid Zeeshan Qaiser M. Umar Hayat Table of Contents Windows Azure Introduction... 4 Windows Azure Services... 4 1. Compute... 4 a) Virtual

More information

Connecting Researchers, Data & HPC

Connecting Researchers, Data & HPC Connecting Researchers, Data & HPC Nick Nystrom Director, Strategic Applications & Bridges PI nystrom@psc.edu July 1, 2015 2015 Pittsburgh Supercomputing Center The Shift to Big Data New Emphases Pan-STARRS

More information

Cornell University Center for Advanced Computing A Sustainable Business Model for Advanced Research Computing

Cornell University Center for Advanced Computing A Sustainable Business Model for Advanced Research Computing Cornell University Center for Advanced Computing A Sustainable Business Model for Advanced Research Computing David A. Lifka lifka@cac.cornell.edu 4/20/13 www.cac.cornell.edu 1 My Background 2007 Cornell

More information

Taking Big Data to the Cloud. Enabling cloud computing & storage for big data applications with on-demand, high-speed transport WHITE PAPER

Taking Big Data to the Cloud. Enabling cloud computing & storage for big data applications with on-demand, high-speed transport WHITE PAPER Taking Big Data to the Cloud WHITE PAPER TABLE OF CONTENTS Introduction 2 The Cloud Promise 3 The Big Data Challenge 3 Aspera Solution 4 Delivering on the Promise 4 HIGHLIGHTS Challenges Transporting large

More information

Databricks. A Primer

Databricks. A Primer Databricks A Primer Who is Databricks? Databricks vision is to empower anyone to easily build and deploy advanced analytics solutions. The company was founded by the team who created Apache Spark, a powerful

More information

How To Understand The Nature Of Big Data

How To Understand The Nature Of Big Data Big Data is Coming for You W. Christopher Lenhardt RENCI DAARWG, Chair Outline A few words about RENCI Introduction: On the Nature of BIG Big Challenges Big Science Questions Big Data Other Big Trends

More information

Luc Declerck AUL, Technology Services Declan Fleming Director, Information Technology Department

Luc Declerck AUL, Technology Services Declan Fleming Director, Information Technology Department Luc Declerck AUL, Technology Services Declan Fleming Director, Information Technology Department What is cyberinfrastructure? Outline Examples of cyberinfrastructure t Why is this relevant to Libraries?

More information

Bringing Big Data Modelling into the Hands of Domain Experts

Bringing Big Data Modelling into the Hands of Domain Experts Bringing Big Data Modelling into the Hands of Domain Experts David Willingham Senior Application Engineer MathWorks david.willingham@mathworks.com.au 2015 The MathWorks, Inc. 1 Data is the sword of the

More information

NetCDF and HDF Data in ArcGIS

NetCDF and HDF Data in ArcGIS 2013 Esri International User Conference July 8 12, 2013 San Diego, California Technical Workshop NetCDF and HDF Data in ArcGIS Nawajish Noman Kevin Butler Esri UC2013. Technical Workshop. Outline NetCDF

More information

Intro to Data Management. Chris Jordan Data Management and Collections Group Texas Advanced Computing Center

Intro to Data Management. Chris Jordan Data Management and Collections Group Texas Advanced Computing Center Intro to Data Management Chris Jordan Data Management and Collections Group Texas Advanced Computing Center Why Data Management? Digital research, above all, creates files Lots of files Without a plan,

More information

Concepts in Distributed Data Management or History of the DICE Group

Concepts in Distributed Data Management or History of the DICE Group Concepts in Distributed Data Management or History of the DICE Group Reagan W. Moore 1, Arcot Rajasekar 1, Michael Wan 3, Wayne Schroeder 2, Antoine de Torcy 1, Sheau- Yen Chen 2, Mike Conway 1, Hao Xu

More information

Object storage in Cloud Computing and Embedded Processing

Object storage in Cloud Computing and Embedded Processing Object storage in Cloud Computing and Embedded Processing Jan Jitze Krol Systems Engineer DDN We Accelerate Information Insight DDN is a Leader in Massively Scalable Platforms and Solutions for Big Data

More information

CE 504 Computational Hydrology Computational Environments and Tools Fritz R. Fiedler

CE 504 Computational Hydrology Computational Environments and Tools Fritz R. Fiedler CE 504 Computational Hydrology Computational Environments and Tools Fritz R. Fiedler 1) Operating systems a) Windows b) Unix and Linux c) Macintosh 2) Data manipulation tools a) Text Editors b) Spreadsheets

More information

Part V Applications. What is cloud computing? SaaS has been around for awhile. Cloud Computing: General concepts

Part V Applications. What is cloud computing? SaaS has been around for awhile. Cloud Computing: General concepts Part V Applications Cloud Computing: General concepts Copyright K.Goseva 2010 CS 736 Software Performance Engineering Slide 1 What is cloud computing? SaaS: Software as a Service Cloud: Datacenters hardware

More information

A Hurwitz white paper. Inventing the Future. Judith Hurwitz President and CEO. Sponsored by Hitachi

A Hurwitz white paper. Inventing the Future. Judith Hurwitz President and CEO. Sponsored by Hitachi Judith Hurwitz President and CEO Sponsored by Hitachi Introduction Only a few years ago, the greatest concern for businesses was being able to link traditional IT with the requirements of business units.

More information

Big Data in OpenTopography

Big Data in OpenTopography Big Data in OpenTopography Vishu Nandigam San Diego Supercomputer Center NSF Big Data in Educa

More information

A standards-based open source processing chain for ocean modeling in the GEOSS Architecture Implementation Pilot Phase 8 (AIP-8)

A standards-based open source processing chain for ocean modeling in the GEOSS Architecture Implementation Pilot Phase 8 (AIP-8) NATO Science & Technology Organization Centre for Maritime Research and Experimentation (STO-CMRE) Viale San Bartolomeo, 400 19126 La Spezia, Italy A standards-based open source processing chain for ocean

More information

LabArchives Electronic Lab Notebook:

LabArchives Electronic Lab Notebook: Electronic Lab Notebook: Cloud platform to manage research workflow & data Support Data Management Plans Annotate and prove discovery Secure compliance Improve compliance with your data management plans,

More information

ENABLING DATA TRANSFER MANAGEMENT AND SHARING IN THE ERA OF GENOMIC MEDICINE. October 2013

ENABLING DATA TRANSFER MANAGEMENT AND SHARING IN THE ERA OF GENOMIC MEDICINE. October 2013 ENABLING DATA TRANSFER MANAGEMENT AND SHARING IN THE ERA OF GENOMIC MEDICINE October 2013 Introduction As sequencing technologies continue to evolve and genomic data makes its way into clinical use and

More information

irods Overview Intro to Data Grids and Policy-Driven Data Management!!Leesa Brieger, RENCI! Reagan Moore, DICE & RENCI!

irods Overview Intro to Data Grids and Policy-Driven Data Management!!Leesa Brieger, RENCI! Reagan Moore, DICE & RENCI! irods Overview Intro to Data Grids and Policy-Driven Data Management!!Leesa Brieger, RENCI! Reagan Moore, DICE & RENCI! Renaissance Computing Institute (RENCI) A research unit of UNC Chapel Hill Current

More information

Research Data Alliance: Current Activities and Expected Impact. SGBD Workshop, May 2014 Herman Stehouwer

Research Data Alliance: Current Activities and Expected Impact. SGBD Workshop, May 2014 Herman Stehouwer Research Data Alliance: Current Activities and Expected Impact SGBD Workshop, May 2014 Herman Stehouwer The Vision 2 Researchers and innovators openly share data across technologies, disciplines, and countries

More information

Make the Most of Big Data to Drive Innovation Through Reseach

Make the Most of Big Data to Drive Innovation Through Reseach White Paper Make the Most of Big Data to Drive Innovation Through Reseach Bob Burwell, NetApp November 2012 WP-7172 Abstract Monumental data growth is a fact of life in research universities. The ability

More information

Data Management and Real-time Distribution in the HF-Radar National Network

Data Management and Real-time Distribution in the HF-Radar National Network Data Management and Real-time Distribution in the HF-Radar National Network E. Terrill 1, M. Otero 1, L. Hazard 1, D. Conlee 2, J. Harlan 3, J. Kohut 4, P. Reuter 1, T. Cook 1, T. Harris 1, K. Lindquist

More information

Introducing the Open Source CUAHSI Hydrologic Information System Desktop Application (HIS Desktop)

Introducing the Open Source CUAHSI Hydrologic Information System Desktop Application (HIS Desktop) 18 th World IMACS / MODSIM Congress, Cairns, Australia 13-17 July 2009 http://mssanz.org.au/modsim09 Introducing the Open Source CUAHSI Hydrologic Information System Desktop Application (HIS Desktop) Ames,

More information

Object Storage: A Growing Opportunity for Service Providers. White Paper. Prepared for: 2012 Neovise, LLC. All Rights Reserved.

Object Storage: A Growing Opportunity for Service Providers. White Paper. Prepared for: 2012 Neovise, LLC. All Rights Reserved. Object Storage: A Growing Opportunity for Service Providers Prepared for: White Paper 2012 Neovise, LLC. All Rights Reserved. Introduction For service providers, the rise of cloud computing is both a threat

More information

Applications of Deep Learning to the GEOINT mission. June 2015

Applications of Deep Learning to the GEOINT mission. June 2015 Applications of Deep Learning to the GEOINT mission June 2015 Overview Motivation Deep Learning Recap GEOINT applications: Imagery exploitation OSINT exploitation Geospatial and activity based analytics

More information

White Paper on CLOUD COMPUTING

White Paper on CLOUD COMPUTING White Paper on CLOUD COMPUTING INDEX 1. Introduction 2. Features of Cloud Computing 3. Benefits of Cloud computing 4. Service models of Cloud Computing 5. Deployment models of Cloud Computing 6. Examples

More information

Data Lab System Architecture

Data Lab System Architecture Data Lab System Architecture Data Lab Context Data Lab Architecture Astronomer s Desktop Web Page Cmdline Tools Legacy Apps User Code User Mgmt Data Lab Ops Monitoring Presentation Layer Authentication

More information