Data Management Plan (DMP) for Particle Physics Experiments prepared for the 2015 Consolidated Grants Round. Detailed Version
The Particle Physics Experiment Consolidated Grant proposals now being submitted are all for the support of experiments which will produce very significant amounts of data. STFC therefore requires a Data Management Plan (DMP) to be provided. This document provides the necessary references to the DMPs of particle physics experiments, together with responses to the questions set out in the STFC guidelines.

1. The LHC Experiments (ATLAS, CMS, LHCb)

1.1 Computing Models and Open Data Access Policies

The data management and data processing processes of the LHC experiments are part of the Computing Models of each of the experiments. These have been developed over the last decade, operated successfully throughout the first LHC running period (Run 1), and will be used in the forthcoming Run 2. The Computing Models ensure that (i) multiple copies of all raw data are stored at distinct sites around the world, (ii) resilient metadata catalogues are maintained, (iii) experimental conditions databases are maintained, and (iv) software versions are stored and indexed. Since all data can in principle be regenerated from the raw data, these models meet the fundamental requirements of resilient data preservation. In addition, multiple copies of derived data, as well as copies of simulated data, are stored to facilitate data analysis.

The computing models of the LHC experiments were recently updated by the experiments and the WLCG. The updated models are described in the report "Update of the Computing Models of the WLCG and the LHC Experiments".

Each experiment has also produced policies on open data preservation and access. These are the result of agreement between the full set of international partners of each experiment and can be found in the experiments' published data access policies. Data preservation and open data access issues have also been developed through pan-experimental initiatives at CERN, including open data access activities and the DPHEP community study group.

In the rest of this document we provide a short summary of the referenced information by addressing the specific questions set out in the STFC DMP guidelines.
1.2 Types of data

Broadly speaking, the LHC experiments produce four levels of data.

Level-4: Raw data. These are the raw data produced by the experiments at the LHC interaction points after selection by the online triggers.

Level-3: Reconstructed data. These data are derived from the raw data by applying calibrations and pattern-finding algorithms. Typical content includes hits, tracks, clusters and particle candidates. It is these data that are used by physicists for research.

Level-2: Data to be used for outreach and education. Several activities have been developed whereby subsets of the derived data are made available for outreach and education.

Level-1: Published analysis results. These are the final results of the research and are generally published in journals and conference proceedings.

In addition, the experiments produce simulated Monte Carlo data (referred to as MC data) at the same four levels. MC data undergoes the equivalent reconstruction processes as real data.

1.3 Data to be preserved

Level-4 data is fundamental and must be preserved, as all other data may in principle be derived from it by re-running the reconstruction. Some Level-3 data is also preserved. This is done for efficiency and economy, since re-deriving it may take significant computing resources, and in order to facilitate re-analysis, re-use and verification of results. Level-2 data has no unique preservation requirement. Level-1 data is preserved in the journals, and additional data is made available through recognised repositories such as CERN CDS and HEPData. MC data can in principle always be regenerated provided the software and the associated transforms have been preserved (see later); however, out of prudence some MC data is preserved along with the associated real data.
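Purely as an illustrative sketch (not taken from any experiment's software), the level taxonomy and the preservation expectations above could be captured in a small data structure; all names and wording in the example are hypothetical.

```python
from enum import Enum


class DataLevel(Enum):
    """Data levels as described in section 1.2."""
    PUBLISHED = 1      # Level-1: published analysis results
    OUTREACH = 2       # Level-2: education and outreach samples
    RECONSTRUCTED = 3  # Level-3: calibrated, reconstructed data
    RAW = 4            # Level-4: raw data after online trigger selection


# Hypothetical summary of the preservation expectations in section 1.3.
PRESERVATION_POLICY = {
    DataLevel.RAW: "must be preserved: all other data can be re-derived from it",
    DataLevel.RECONSTRUCTED: "partially preserved: re-derivation is expensive",
    DataLevel.OUTREACH: "no unique preservation requirement",
    DataLevel.PUBLISHED: "preserved in journals and recognised repositories",
}

for level in sorted(DataLevel, key=lambda l: l.value, reverse=True):
    print(f"Level-{level.value} ({level.name.lower()}): {PRESERVATION_POLICY[level]}")
```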
1.4 Data Preservation Processes

The preservation of Level-3 and Level-4 data is guaranteed by the data management processes of the LHC experiments, which use the Worldwide LHC Computing Grid (WLCG) to implement those processes. The exact details differ between experiments, but broadly speaking the process is as follows.

The raw data is passed from the experimental areas in near real time to the CERN Tier-0 data centre, where it is immediately stored on tape. CERN has a remote Tier-0 centre in Hungary (the Wigner Centre) which provides resilience. At least a second tape copy of the raw data is made shortly afterwards. This second copy is stored at sites remote from CERN, typically the Tier-1 data centres. The details and number of copies depend upon the computing model of each experiment, but the result is resilient copies of the raw data spread around the world. The CERN and remote data centres have custodial obligations for the raw data and guarantee to manage them indefinitely, including migration to new technologies.

Level-3 data is derived by running reconstruction programs and is split into separate streams optimised for different physics research areas. These data are mostly kept on nearline disk, replicated to several remote sites according to experiment replication policies which take account of popularity. One or more copies of this derived data are also stored on tape.

In summary, several copies of the raw data are maintained in physically remote locations, at sites with custodial responsibilities. This ensures that the physical data preservation requirements of the STFC policy are met.

CERN has developed an analysis preservation system. This allows completed analyses to be uploaded and made available for future reference, including files, notes, ntuple-type data sets, and software extracted from SVN or Git. It is already being used by the experiments to deposit completed analyses; as it pertains to internal analysis preservation, a valid credential is required to view it.

The wider HEP community has been collaborating to develop data preservation methods under the DPHEP (Data Preservation for HEP) study group. The objectives of the group include (i) to review and document the physics objectives of data persistency in HEP, (ii) to exchange information concerning the analysis model (abstraction, software, documentation etc.) and identify coherence points, (iii) to address the hardware and software persistency status, (iv) to review possible funding programmes and other related international initiatives, and (v) to converge on a common set of specifications in a document that will constitute the basis for future collaborations.

Data centres

The data recorded by the LHC is stored within the WLCG, the UK part of which is provided by GridPP. The WLCG comprises the CERN Tier-0 sites plus 12 Tier-1 and over 80 Tier-2 sites throughout Europe, the US and Asia. These are federated together and provide common authentication, authorisation and accounting systems, and common workload management and data management systems. They have a well-developed management and communications process, development and deployment planning, fault reporting and ticketing systems, and a security incident response and escalation process. In 2014 the WLCG provides 180 PB of disk storage and 120 PB of tape storage; this is expected to grow by a factor of ~2. As noted earlier, all data is stored with multiple copies at physically separated sites, thus guaranteeing resilience against any foreseeable disaster.
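To make the replication requirement concrete, the following is a minimal, hypothetical sketch (not any experiment's actual tooling) of the kind of check a data-management system could run to confirm that every raw dataset has at least two tape copies at distinct sites. The dataset names, site names and policy threshold are assumptions for the example.

```python
from collections import defaultdict

# Hypothetical replica catalogue entries: (dataset, site, medium).
# In a real system these would come from the experiment's data-management
# catalogue; here they are hard-coded for illustration only.
replicas = [
    ("raw/run2015A/stream1", "CERN-T0", "tape"),
    ("raw/run2015A/stream1", "RAL-T1", "tape"),
    ("raw/run2015A/stream2", "CERN-T0", "tape"),
    ("reco/run2015A/stream1", "RAL-T1", "disk"),
]

MIN_TAPE_COPIES = 2  # policy: at least two tape copies at distinct sites


def underreplicated_raw(replicas, min_copies=MIN_TAPE_COPIES):
    """Return raw datasets that do not yet meet the tape-copy policy."""
    tape_sites = defaultdict(set)
    for dataset, site, medium in replicas:
        if dataset.startswith("raw/") and medium == "tape":
            tape_sites[dataset].add(site)
    return {d: sites for d, sites in tape_sites.items() if len(sites) < min_copies}


if __name__ == "__main__":
    for dataset, sites in underreplicated_raw(replicas).items():
        print(f"{dataset}: only {len(sites)} tape copy(ies), at {sorted(sites)}")
```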
The UK component provides, in 2014, 10 PB of disk storage and 13 PB of tape storage at the Tier-1 and 20 PB of disk at the Tier-2 centres. As is the case for all Tier-1 systems, the UK Tier-1 at RAL is a high-quality, fully resilient data centre, comprising a 10,000-slot Oracle SL8500 tape robot, a 10,000-core batch worker farm and a 400-node commodity disk server cluster, all coupled together by a high-performance, non-blocking 40 Gb/s per link mesh internal network. The Tier-1 is linked to CERN via a resilient 10 Gb/s optical private network (OPN) and is attached to JANET by resilient 40 Gb/s links. Data is managed and accessed through the CASTOR (CERN Advanced STORage manager) hierarchical storage management system.

Data retention, integrity and long-term custody are at the heart of the Tier-1's design and operation. Data retention is addressed through hardware resilience within the storage nodes (e.g. RAID) and by real-time replication and offline backup of crucial CASTOR metadata catalogues. Routine bad-block scrubbing ensures that hardware faults are identified early and weeded out of the storage system. End-to-end checksums between the experiments' data distribution systems and CASTOR ensure that data corruption can be identified at the point of access, and continuous, regular dipstick tests ensure that silent data corruption is promptly identified. Long-term data custody requires that hardware (both tape and disk) purchase and phase-out form a routine part of the normal operation of the service: data must be moved off old disk servers to new ones, and old tape media types need to be phased out in favour of newer, higher-density media as they become available. Migration between disk and tape media generations typically takes many months and must be planned for within the overall capacity management plan. Long-term capacity planning must typically look 4-5 years ahead in order to meet both financial and physical constraints.
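The end-to-end checksum idea can be illustrated with a short sketch: the checksum recorded when a file enters the distribution system is recomputed against the stored copy before that copy is trusted. This is illustrative only; the choice of Adler-32 and the file handling below are assumptions for the example, not a description of the CASTOR implementation.

```python
import zlib


def adler32_of(path, chunk_size=1 << 20):
    """Compute an Adler-32 checksum of a file in streaming fashion."""
    value = 1  # Adler-32 initial value
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            value = zlib.adler32(chunk, value)
    return value & 0xFFFFFFFF


def verify_copy(local_path, catalogue_checksum):
    """Compare the checksum recorded in the catalogue with the stored copy.

    Returns True only if the recomputed checksum matches; otherwise the
    transfer should be retried or the copy flagged as corrupt.
    """
    return adler32_of(local_path) == catalogue_checksum
```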
1.5 Software and metadata implications

The data are naturally divided into data sets (either by physics theme or by run period). These are catalogued in the experiments' cataloguing systems running across the WLCG. These systems index physical file locations, logical file names, and the associated metadata which describes each data set.

In the ATLAS case, Rucio handles replicating files to multiple sites; specifically for the UK, one of the two Tier-2 hosted replicas will be within our borders. The web front end consists of multiple stateless servers. Behind that, resilience and redundancy are provided by Oracle with the usual RAC configuration and a Data Guard copy to another set of machines.

The CMS Dataset Bookkeeping Service (DBS) holds details of data sets, including metadata, and includes mappings from physics abstractions to file blocks. Multiple instances of DBS are hosted at CERN. The transfer of data files amongst the many CMS sites is managed by PhEDEx (Physics Experiment Data EXport), which has its own Transfer Management Database (TMDB), hosted on resilient Oracle instances at CERN. Logical filenames are translated to physical filenames at individual sites by the Trivial File Catalogue (TFC).

In LHCb, the Bookkeeping provides a hierarchical catalogue of logical file names and metadata. The logical file names are mapped directly onto local filesystem directory and file names within the space allocated to LHCb on local storage systems. The file catalogue (currently the LFC, soon to be replaced by the DIRAC File Catalogue) is used to map logical file names to physical names managed by sites.
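The logical-to-physical name translation performed by such catalogues can be pictured as a simple per-site rule lookup. The sketch below is illustrative only: the rule syntax, site names and hostnames are hypothetical, not the experiments' actual TFC or catalogue configuration.

```python
# Hypothetical per-site rules mapping a logical file name (LFN), which is the
# same everywhere, to the physical file name (PFN) used by that site's storage.
SITE_RULES = {
    "UK-RAL": "root://castor.example.ac.uk//castor/prod{lfn}",
    "CERN":   "root://eos.example.ch//eos/prod{lfn}",
}


def lfn_to_pfn(site, lfn):
    """Translate an experiment-wide LFN into a site-specific PFN."""
    try:
        return SITE_RULES[site].format(lfn=lfn)
    except KeyError:
        raise ValueError(f"no translation rule configured for site {site!r}")


# Example: the same logical name resolves to different physical paths per site.
print(lfn_to_pfn("UK-RAL", "/data/run2015A/stream1/file001.root"))
print(lfn_to_pfn("CERN", "/data/run2015A/stream1/file001.root"))
```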
There is a separate Conditions Database, implemented using Oracle, that records additional information about the parameters in effect when the data were taken.

Software is equally important in the LHC context. The knowledge needed to read and reconstruct raw data, and subsequently to read and analyse the derived data, is embedded in large software suites and in databases which record conditions and calibration constants. All such software and databases are versioned and stored in the relevant version management systems; currently SVN and Git are used. All experiments store the information required to link specific software versions to specific analyses. All software required to read and interpret open data will be made available upon request, according to the policies of the experiments.

1.6 Data preservation period

There is currently no upper time limit foreseen for the retention of LHC data. Naturally the ability to retain it depends upon the continuation of CERN and the remote data centres, and of the available expertise and resources.

1.7 Open data access

CERN and the experiments have taken open data access very seriously, and all of the experiments have developed policies, which are referenced above and should be read for exact details. These specify:

Which data are valuable to others: In general, raw (Level-4) data could not be interpreted by third parties without a very detailed knowledge of the experimental detectors and reconstruction software (such data is rarely used directly by physicists within the collaborations). Derived (Level-3) data may be more easily usable by third parties. Level-1 data is already openly available.

The proprietary period: Experiments specify a fraction of the data that will be made available after a given reserved period. This period ranges up to several years, reflecting the very large amount of effort expended by scientists in the construction and operation of the experiments over many decades, and in part following the running cycle that defines large coherent blocks of data. For details of the periods and amounts of data released, please see the individual experiment policies.

How the data will be shared: Data will be made available in a format specified by the normal experiment operations. This may be exactly the same format in which the data are made available to members of the collaborations themselves. The software required to read the data is also available on a similar basis, along with appropriate documentation.

CERN, in collaboration with the experiments, has developed an Open Data Portal. This allows experiments to publish data for open access for research and for education. The portal offers several high-level tools, such as an interactive event display and histogram plotting. The CERN Open Data platform also preserves the software tools used to analyse the data: it offers the download of virtual machine images and preserves examples of user analysis code. CMS is already about to use this for data release according to its policy.
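To illustrate the kind of record that ties a preserved analysis to the specific data sets, software release and conditions it used (as required in section 1.5), a minimal and entirely hypothetical sketch is shown below; none of the identifiers or tags are taken from a real experiment.

```python
import json

# Hypothetical provenance record of the sort an analysis-preservation system
# could store alongside the deposited analysis code and ntuples.
analysis_record = {
    "analysis_id": "EXAMPLE-2015-001",
    "datasets": [
        "reco/run2015A/stream1",
        "mc/run2015A/signal_sample_v2",
    ],
    "software_release": "ExperimentSW-20.1.4",  # version tag held in SVN/Git
    "conditions_tag": "COND-RUN2015A-v3",       # conditions database snapshot
    "repository": "git",                        # where the analysis code lives
}

# Serialising the record makes it easy to deposit with the preserved analysis.
print(json.dumps(analysis_record, indent=2))
```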
In some cases individual experiments have also taken the initiative to develop, or engage with, value-added open data services using resources obtained in participating countries. In ATLAS, Recast and RIVET are the recommended means for reinterpretation of the data by third parties. ATLAS has also developed light-weight packages for exploring self-describing data formats, intended mainly for education and outreach.

1.8 Resources required

All experiments carry out their data preservation activities as a natural result of scientific good practice. This leads to marginal extra staff costs over and above those of operating the WLCG, and some additional storage costs. Naturally these activities rely upon the continuation of CERN and the remote sites. Experiments in general do not have any specific resources for carrying out active open data access activities over and above those described above. The additional cost of storage for data preservation is starting to be specified in the annual resource projections of each experiment, which go to the CERN RRB and are scrutinised and subsequently approved.

2. Non-LHC experiments

2.1 NA62

The NA62 experiment will use the same Grid infrastructure as the LHC experiments, so all of the above description of physical resources and processes for ensuring data preservation applies. The data management processes are currently being developed and will follow the normal scheme of making secure copies of raw data at remote sites and keeping the metadata in a resilient catalogue. The international NA62 collaboration is currently drafting a Data Management Policy that will document data preservation and open access policies. Broadly speaking, the policy follows closely the plans developed by the LHC experiments. Overall the collaboration expects to follow the guidelines being developed jointly by CERN and the LHC experiments on these matters, after appropriate approval by the NA62 Collaboration Board, and within the limits of the effort and resources available. The policy will eventually be made available online.

2.2 FNAL experiments: g-2, CHIPS, LAr1-ND, LBNF, MicroBooNE, MINOS+, NOvA

All the FNAL experiments adhere to the FNAL policy on data management practices and policies. This policy is implemented across all the experiments through dedicated members of the collaborations who are also members of FNAL's Computing Division, which oversees the implementation of the policy. The data retention policy of the experiments is expected to follow the template established by the Tevatron experiments, which is underpinned by CernVM-FS and is detailed in documents from the "ICFA Study Group on Data Preservation and Long Term Analysis in High Energy Physics",
as part of the wider DASPOS group.

2.3 LUX and LUX-Zeplin

Data generated by the LUX experiment is stored at two mirrored sites, and LUX-Zeplin (LZ) will adopt the same strategy, with one site in the US and one in the UK. An additional independent database archive is maintained in the US. The typical data volume to be generated will be ~1 PB, stored on RAID arrays at the mirrored sites and subsequently committed for permanent storage. Version control and traceability of software developed for processing and analysis are provided through a centralised repository at LBNL. The two data centres control and monitor the raw and processed data, which are freely distributed within the collaboration. Results intended for publication are internally verified by the collaboration, and the relevant data and information are made available to the public once published. Raw data is regarded as proprietary for the duration of the active experiment and for a further period beyond the project end to allow for validation. As with LUX, LZ also adheres to the data policies of the DOE and NSF. The DMP for LZ was recently submitted to the PPRP.

2.4 SuperNEMO

SuperNEMO is a small experiment by HEP standards, but the data types, processing and management issues are essentially the same as those of the LHC experiments described in section 1 above. The main difference from the LHC experiments is that CC-IN2P3, Lyon, will act as the hub for SuperNEMO rather than CERN. CC-IN2P3 is one of the main WLCG Tier-1 sites, providing extensive data storage and processing capabilities via Grid and non-Grid interfaces (it was used for the NEMO-3 experiment). The data from the experiment at the LSM (Laboratoire Souterrain de Modane) will be transferred to CC-IN2P3. Resources and infrastructure provided by the collaborating institutes and countries of the SuperNEMO collaboration, including GridPP in the UK, will be used to meet the data, metadata and software preservation requirements. A data management plan consistent with the requirements of open data access is being developed; it is planned to make use of the data management processes and tools developed by the WLCG, DPHEP etc.

2.5 T2K and Hyper-Kamiokande

The T2K experiment also uses the WLCG Grid infrastructure and therefore benefits in the same way as the other experiments described above. T2K has been running for five years and so has a well-exercised data distribution network which features multiple archives of the data, metadata and essential software, preserving them in multiple copies on three continents and making any serious loss of data extremely unlikely. It is planned that T2K will transition fairly seamlessly into Hyper-Kamiokande, and thus the T2K data will continue to be analysed for many years, which will guarantee that it is preserved and ported to new storage technology for the foreseeable future. T2K has been considering what type of open access is appropriate for the T2K data; as a first step, the public web pages of T2K already offer access to selected samples of the key reduced data appearing in our publications (see experiment.org/results/), which enables other researchers to use the results of our research directly (for instance, our measured neutrino cross sections as a function of energy can be downloaded). The Hyper-Kamiokande experiment will also use the WLCG and adopt similar procedures. It will develop its open data access policy on a timescale commensurate with its data taking, which falls outside the period of this CG. Its DMP has recently been submitted to the PPRP.
2.6 COMET

The COMET experiment will use the WLCG (in addition to non-Grid infrastructures), so all of the above description of physical resources and processes for ensuring data preservation applies. The data management processes are currently being developed and will follow the normal scheme of making secure copies of raw data at remote sites (Grid and otherwise) and keeping the metadata in a resilient catalogue. The international COMET collaboration is currently drafting a Data Management Policy that will document data preservation and open access policies, which are expected to be similar to those of other experiments at J-PARC/KEK such as the T2K experiment.

2.7 SNO+

SNO+ data is managed centrally by the experiment, with strong UK leadership through the data flow group. Raw data (the direct output of the DAQ) and processed data (after reconstruction has been applied) are both stored on Grid-enabled storage elements located in both Canada and the UK, using the same infrastructure as the LHC experiments. This plan provides redundancy in data storage. Raw data and selected processed data will be maintained on these sites indefinitely. These data are accessed via Grid certificate with membership of the snoplus.ca VO, which is managed by collaboration members. As physics results are released by the collaboration, the relevant data will be released online alongside collaboration papers. The international SNO+ collaboration is currently drafting a Data Management Policy to document data preservation and open access policies in line with the requirements of the SNO+ funding agencies. The Data Management Policy will be included in the collaboration statutes.

2.8 IceCube/PINGU

Data from the IceCube/PINGU project is stored at the University of Wisconsin, Madison. The Madison cluster also provides the primary resources used to process the data and to generate Monte Carlo. At the University of Manchester, a GPU cluster is being used to generate PINGU Monte Carlo. During this Consolidated Grant round we also intend to begin using GridPP resources for the reconstruction of both IceCube data and IceCube/PINGU Monte Carlo.