A Physics Approach to Big Data
Adam Kocoloski, PhD, CTO, Cloudant


3 Solenoidal Tracker at RHIC (STAR)

4 The life of LHC data
- Detected by experiment
- Online filtering (hardware and software)
- Transferred to CERN main campus, archived & reconstructed
- Transferred to T1 sites, archived, reconstructed & skimmed
- Transferred to T2 sites, reconstructed, skimmed, filtered & analysed
- Written into locally analyzable files, put on laptops
- Turned into a plot in a paper

5 Dig big tunnels

6 Chain up series of atom smashers

7 Put sensitive cameras in awkward places

8 Record events

10 CMS Data Flow We have a big digital camera

11 It takes photos of this

12 Which come out looking like this

16 CMS Data Flow
We have a big digital camera
~200 GB/s
Which goes into lots of computers (HLT)
~2 GB/s
And then into lots of disk (Storage Manager)

17 Throw away data ASAP
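The idea of discarding data as early as possible can be sketched as a streaming filter that decides on each event before anything is stored. This is a minimal illustration, not the CMS trigger: the event source, the `energy` field and the threshold value are all hypothetical, chosen so that roughly 1% of events survive, which matches the ~200 GB/s to ~2 GB/s reduction quoted for the HLT (High-Level Trigger) stage.

```python
import random
from typing import Iterable, Iterator


def detector_events(n: int, seed: int = 1) -> Iterator[dict]:
    """Hypothetical event source: each event carries one summary quantity."""
    rng = random.Random(seed)
    for i in range(n):
        yield {"id": i, "energy": rng.expovariate(1.0)}


def online_filter(events: Iterable[dict], threshold: float) -> Iterator[dict]:
    """Yield only events above the threshold; rejected events are never stored."""
    for event in events:
        if event["energy"] > threshold:
            yield event


n_total = 100_000
threshold = 4.6  # assumed value; exp(-4.6) is about 0.01, so ~1% of events pass

# Only the survivors are ever materialised in memory or on disk.
kept = list(online_filter(detector_events(n_total), threshold))
kept_fraction = len(kept) / n_total
print(f"kept {len(kept)} of {n_total} events ({kept_fraction:.2%})")
```

Because both stages are generators, the full event stream never has to exist in one place at one time; the cost is that a rejected event is gone for good, so the selection has to be right the first time.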

19 10 PB / year
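A back-of-the-envelope check ties the quoted figures together: ~200 GB/s off the detector, ~2 GB/s to storage, and 10 PB per year. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 PB = 10^15 bytes) and, as a simplification the slides do not state, that all three numbers describe the same data stream.

```python
GB = 1e9   # bytes, decimal units assumed
PB = 1e15  # bytes, decimal units assumed

detector_rate = 200 * GB  # bytes/s out of the detector (slide: ~200 GB/s)
storage_rate = 2 * GB     # bytes/s written to storage (slide: ~2 GB/s)
yearly_volume = 10 * PB   # bytes per year (slide: 10 PB / year)

# How much the online filtering shrinks the stream.
reduction_factor = detector_rate / storage_rate

# How long one would have to write at the storage rate to reach the yearly volume.
seconds_of_writing = yearly_volume / storage_rate
days_of_writing = seconds_of_writing / 86_400

# What the same live time would cost without any filtering.
unfiltered_volume_pb = detector_rate * seconds_of_writing / PB

print(f"reduction factor:       {reduction_factor:.0f}x")
print(f"live time at full rate: {days_of_writing:.1f} days")
print(f"unfiltered volume:      {unfiltered_volume_pb:.0f} PB")
```

Under these assumptions the yearly volume corresponds to about two months of continuous writing at the storage rate, and keeping everything over that same period would take a hundred times more space.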

21 Tiers
1 PB/week
Maybe a bit different
Many (>100) sites with 100s of TB of storage, s worker nodes
Why so many? Politics, power budget, cost

23 Analysis Workflow
- Each analysis is ~unique
- Query language is C++
- Runs on distributed and local resources
- Myriad cut selections to identify interesting events
- Data in final plot substantially reduced from the original dataset
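The cut-selection step can be illustrated with a small cut flow: apply the selections one after another and record how many events survive each. The slide says real analyses are written in C++; Python is used here only as a compact stand-in. The event fields (`pt`, `eta`, `n_tracks`), their distributions and the cut values are invented for the example.

```python
import random

rng = random.Random(42)

# Hypothetical toy events; the fields and distributions are assumptions.
events = [
    {
        "pt": rng.expovariate(1 / 20.0),
        "eta": rng.uniform(-5.0, 5.0),
        "n_tracks": rng.randint(0, 30),
    }
    for _ in range(10_000)
]

# Each cut is a (label, predicate) pair, applied in order.
cuts = [
    ("pt > 30", lambda e: e["pt"] > 30.0),
    ("|eta| < 2.5", lambda e: abs(e["eta"]) < 2.5),
    ("n_tracks >= 5", lambda e: e["n_tracks"] >= 5),
]


def cut_flow(events, cuts):
    """Apply cuts sequentially; return the survivors and a per-cut event count."""
    rows = [("all events", len(events))]
    selected = events
    for name, passes in cuts:
        selected = [e for e in selected if passes(e)]
        rows.append((name, len(selected)))
    return selected, rows


selected, rows = cut_flow(events, cuts)
for name, count in rows:
    print(f"{name:<15} {count:>6}")
```

The printed table is the usual way to see where the dataset shrinks: each row can only be smaller than or equal to the one above it, and the last row is what ends up in the plot.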

26 Multivariate analyses extract signals that are impossible to find via hand-drawn cuts
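A toy case shows why combining variables can succeed where single-variable cuts fail. In the sketch below, which is entirely an assumption made for illustration, background fills the unit square uniformly while signal lies along the diagonal y = x. Both samples look nearly identical in x alone and in y alone, so a threshold on either variable keeps signal and background in equal proportion, but the combination y - x separates them cleanly. Real multivariate analyses typically use tools such as boosted decision trees or neural networks over many more variables; this linear combination is only the simplest possible stand-in.

```python
import random

rng = random.Random(7)
N = 5_000

# Toy data (assumed): uniform background, signal concentrated on the diagonal.
background = [(rng.random(), rng.random()) for _ in range(N)]
signal = []
for _ in range(N):
    x = rng.random()
    signal.append((x, x + rng.gauss(0.0, 0.03)))


def pass_fraction(sample, selection):
    """Fraction of (x, y) points in the sample accepted by the selection."""
    return sum(1 for point in sample if selection(*point)) / len(sample)


def single_variable_cut(x, y):
    """A threshold on one variable only."""
    return x > 0.5


def combined_cut(x, y):
    """A threshold on a combination of both variables."""
    return abs(y - x) < 0.1


for name, cut in [("x > 0.5", single_variable_cut), ("|y - x| < 0.1", combined_cut)]:
    sig = pass_fraction(signal, cut)
    bkg = pass_fraction(background, cut)
    print(f"{name:<15} signal kept {sig:.1%}, background kept {bkg:.1%}")
```

In two dimensions a person could still draw the diagonal band by hand on a scatter plot. The same structure hidden in dozens of correlated variables cannot be seen or drawn, and that is where an algorithm that learns the combination earns its keep.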

27 Humans find signals even when none exist
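One way to quantify this is the chance that pure noise produces at least one apparently significant excess when many places are examined. Assuming Gaussian fluctuations and fully independent looks (both simplifications), the probability of at least one excess of n sigma or more among N looks is 1 - (1 - p)^N, where p is the one-sided tail probability for n sigma.

```python
import math


def one_sided_p_value(n_sigma: float) -> float:
    """Probability that a standard Gaussian fluctuates above n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


def chance_of_false_signal(n_sigma: float, n_independent_looks: int) -> float:
    """Probability of at least one fluctuation >= n_sigma among independent looks."""
    p = one_sided_p_value(n_sigma)
    return 1.0 - (1.0 - p) ** n_independent_looks


for looks in (1, 100, 10_000):
    chance = chance_of_false_signal(3.0, looks)
    print(f"{looks:>6} looks: chance of a fake 3-sigma excess = {chance:.3f}")
```

A 3-sigma bump is rare in any single histogram bin but becomes likely, and eventually near certain, as the number of bins, variables and selections inspected grows.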

29 Workflow Ladder
Axis: number of users
- Large datasets (>100 TB), complex computation
- Large datasets (>100 TB), simple computation
- Shared datasets (>500 GB), complex computation
- Shared datasets ( GB), complex computation
- Shared datasets ( GB), simple computation
- Shared datasets ( GB), simple computation
- Private datasets ( GB), simple computation
Where the work runs:
- Use Grid compute and storage exclusively
- Work on departmental resources, store resulting datasets to Grid storage
- Work on laptop/desktop machine, store resulting datasets to Grid storage

30 Collaborative skimming remains valuable
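A skim can be pictured as a reduced copy of a dataset that keeps only the events passing a loose selection and only the fields that downstream analyses need, so that many users can share one small file instead of each re-reading the full data. The sketch below is a hypothetical in-memory version: the field names, the selection and the use of JSON to estimate size are all assumptions made for illustration.

```python
import json
import random

rng = random.Random(3)

# Hypothetical full dataset: a few summary fields plus a bulky payload per event.
full_dataset = [
    {
        "id": i,
        "pt": rng.expovariate(1 / 20.0),
        "eta": rng.uniform(-5.0, 5.0),
        "raw_hits": [rng.random() for _ in range(50)],
    }
    for i in range(2_000)
]


def skim(events, keep_event, keep_fields):
    """Return a reduced copy: only the selected events, only the named fields."""
    return [{f: e[f] for f in keep_fields} for e in events if keep_event(e)]


shared_skim = skim(full_dataset, lambda e: e["pt"] > 20.0, ("id", "pt", "eta"))

# Serialised size is used here as a rough proxy for storage cost.
full_bytes = len(json.dumps(full_dataset))
skim_bytes = len(json.dumps(shared_skim))
print(f"events: {len(full_dataset)} -> {len(shared_skim)}")
print(f"bytes:  {full_bytes} -> {skim_bytes}")
```

The selection is deliberately loose: a shared skim has to serve several analyses, so it removes what nobody needs and leaves the tight, analysis-specific cuts to each user.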

31-38 Let's Review
- Particle physics is special
- But not as special as it used to be vis-à-vis data operations
- And that's a good thing
- Keep only the data you can analyze
- Remember that we're biased to find patterns, and lots of data means lots of patterns
- Multivariate analyses win in the end
- Skim datasets for fun and profit


CMS: Challenges in Advanced Computing Techniques (Big Data, Data reduction, Data Analytics)

CMS: Challenges in Advanced Computing Techniques (Big Data, Data reduction, Data Analytics) CMS: Challenges in Advanced Computing Techniques (Big Data, Data reduction, Data Analytics) With input from: Daniele Bonacorsi, Ian Fisk, Valentin Kuznetsov, David Lange Oliver Gutsche CERN openlab technical

More information

The CMS Tier0 goes Cloud and Grid for LHC Run 2. Dirk Hufnagel (FNAL) for CMS Computing

The CMS Tier0 goes Cloud and Grid for LHC Run 2. Dirk Hufnagel (FNAL) for CMS Computing The CMS Tier0 goes Cloud and Grid for LHC Run 2 Dirk Hufnagel (FNAL) for CMS Computing CHEP, 13.04.2015 Overview Changes for the Tier0 between Run 1 and Run 2 CERN Agile Infrastructure (in GlideInWMS)

More information

How To Write A Data Mining Program On A Large Data Set

How To Write A Data Mining Program On A Large Data Set On the Role of Indexing for Big Data in Scientific Domains Arie Shoshani Lawrence Berkeley National Lab BIGDATA and EXTREME-SCALE COMPUTING April 3-May, 23 Outline q Examples of indexing needs in scientific

More information

Data analysis in Par,cle Physics

Data analysis in Par,cle Physics Data analysis in Par,cle Physics From data taking to discovery Tuesday, 13 August 2013 Lukasz Kreczko - Bristol IT MegaMeet 1 $ whoami Lukasz (Luke) Kreczko Par,cle Physicist Graduated in Physics from

More information

Integrating a heterogeneous and shared Linux cluster into grids

Integrating a heterogeneous and shared Linux cluster into grids Integrating a heterogeneous and shared Linux cluster into grids 1,2 1 1,2 1 V. Büge, U. Felzmann, C. Jung, U. Kerzel, 1 1 1 M. Kreps, G. Quast, A. Vest 1 2 DPG Frühjahrstagung March 28 31, 2006 Dortmund

More information

Status of Grid Activities in Pakistan. FAWAD SAEED National Centre For Physics, Pakistan

Status of Grid Activities in Pakistan. FAWAD SAEED National Centre For Physics, Pakistan Status of Grid Activities in Pakistan FAWAD SAEED National Centre For Physics, Pakistan 1 Introduction of NCP-LCG2 q NCP-LCG2 is the only Tier-2 centre in Pakistan for Worldwide LHC computing Grid (WLCG).

More information

IT of SPIM Data Storage and Compression. EMBO Course - August 27th! Jeff Oegema, Peter Steinbach, Oscar Gonzalez

IT of SPIM Data Storage and Compression. EMBO Course - August 27th! Jeff Oegema, Peter Steinbach, Oscar Gonzalez IT of SPIM Data Storage and Compression EMBO Course - August 27th Jeff Oegema, Peter Steinbach, Oscar Gonzalez 1 Talk Outline Introduction and the IT Team SPIM Data Flow Capture, Compression, and the Data

More information

RO-11-NIPNE, evolution, user support, site and software development. IFIN-HH, DFCTI, LHCb Romanian Team

RO-11-NIPNE, evolution, user support, site and software development. IFIN-HH, DFCTI, LHCb Romanian Team IFIN-HH, DFCTI, LHCb Romanian Team Short overview: The old RO-11-NIPNE site New requirements from the LHCb team User support ( solution offered). Data reprocessing 2012 facts Future plans The old RO-11-NIPNE

More information

NT1: An example for future EISCAT_3D data centre and archiving?

NT1: An example for future EISCAT_3D data centre and archiving? March 10, 2015 1 NT1: An example for future EISCAT_3D data centre and archiving? John White NeIC xx March 10, 2015 2 Introduction High Energy Physics and Computing Worldwide LHC Computing Grid Nordic Tier

More information

Virtualization of a Cluster Batch System

Virtualization of a Cluster Batch System Virtualization of a Cluster Batch System Christian Baun, Volker Büge, Benjamin Klein, Jens Mielke, Oliver Oberst and Armin Scheurer Die Kooperation von Cluster Batch System Batch system accepts computational

More information

What is the real cost of Commercial Cloud provisioning? Thursday, 20 June 13 Lukasz Kreczko - DICE 1

What is the real cost of Commercial Cloud provisioning? Thursday, 20 June 13 Lukasz Kreczko - DICE 1 What is the real cost of Commercial Cloud provisioning? Thursday, 20 June 13 Lukasz Kreczko - DICE 1 SouthGrid in numbers CPU [cores] RAM [TB] Disk [TB] Manpower [FTE] Power [kw] 5100 10.2 3000 7 1.5 x

More information

CMS Experience Provisioning Cloud Resources with GlideinWMS. Anthony Tiradani HTCondor Week 2015 20 May 2015

CMS Experience Provisioning Cloud Resources with GlideinWMS. Anthony Tiradani HTCondor Week 2015 20 May 2015 CMS Experience Provisioning Cloud Resources with GlideinWMS Anthony Tiradani Week 2015 20 May 2015 glideinwms Quick Facts glideinwms is an open- source Fermilab CompuJng Sector product driven by CMS Heavy

More information

Tier0 plans and security and backup policy proposals

Tier0 plans and security and backup policy proposals Tier0 plans and security and backup policy proposals, CERN IT-PSS CERN - IT Outline Service operational aspects Hardware set-up in 2007 Replication set-up Test plan Backup and security policies CERN Oracle

More information

Testing the In-Memory Column Store for in-database physics analysis. Dr. Maaike Limper

Testing the In-Memory Column Store for in-database physics analysis. Dr. Maaike Limper Testing the In-Memory Column Store for in-database physics analysis Dr. Maaike Limper About CERN CERN - European Laboratory for Particle Physics Support the research activities of 10 000 scientists from

More information

Online data handling with Lustre at the CMS experiment

Online data handling with Lustre at the CMS experiment Online data handling with Lustre at the CMS experiment Lavinia Darlea, on behalf of CMS DAQ Group MIT/DAQ CMS September 17, 2015 1 / 29 CERN 2 / 29 CERN CERN was founded 1954: 12 European States Science

More information

Betriebssystem-Virtualisierung auf einem Rechencluster am SCC mit heterogenem Anwendungsprofil

Betriebssystem-Virtualisierung auf einem Rechencluster am SCC mit heterogenem Anwendungsprofil Betriebssystem-Virtualisierung auf einem Rechencluster am SCC mit heterogenem Anwendungsprofil Volker Büge 1, Marcel Kunze 2, OIiver Oberst 1,2, Günter Quast 1, Armin Scheurer 1 1) Institut für Experimentelle

More information

The Data Quality Monitoring Software for the CMS experiment at the LHC

The Data Quality Monitoring Software for the CMS experiment at the LHC The Data Quality Monitoring Software for the CMS experiment at the LHC On behalf of the CMS Collaboration Marco Rovere, CERN CHEP 2015 Evolution of Software and Computing for Experiments Okinawa, Japan,

More information

Establishing Applicability of SSDs to LHC Tier-2 Hardware Configuration

Establishing Applicability of SSDs to LHC Tier-2 Hardware Configuration Establishing Applicability of SSDs to LHC Tier-2 Hardware Configuration A CHEP 2010 presentation by: Sam Skipsey and The GridPP Storage Group With particular acknowledgments to: Wahid Bhimji (go see his

More information

<Insert Picture Here> Cloud Archive Trends and Challenges PASIG Winter 2012

<Insert Picture Here> Cloud Archive Trends and Challenges PASIG Winter 2012 Cloud Archive Trends and Challenges PASIG Winter 2012 Raymond A. Clarke Enterprise Storage Consultant, Oracle Enterprise Solutions Group How Is PASIG Pronounced? Is it PASIG? Is it

More information

Database Monitoring Requirements. Salvatore Di Guida (CERN) On behalf of the CMS DB group

Database Monitoring Requirements. Salvatore Di Guida (CERN) On behalf of the CMS DB group Database Monitoring Requirements Salvatore Di Guida (CERN) On behalf of the CMS DB group Outline CMS Database infrastructure and data flow. Data access patterns. Requirements coming from the hardware and

More information

CMS Computing Model: Notes for a discussion with Super-B

CMS Computing Model: Notes for a discussion with Super-B CMS Computing Model: Notes for a discussion with Super-B Claudio Grandi [ CMS Tier-1 sites coordinator - INFN-Bologna ] Daniele Bonacorsi [ CMS Facilities Ops coordinator - University of Bologna ] 1 Outline

More information

(Possible) HEP Use Case for NDN. Phil DeMar; Wenji Wu NDNComm (UCLA) Sept. 28, 2015

(Possible) HEP Use Case for NDN. Phil DeMar; Wenji Wu NDNComm (UCLA) Sept. 28, 2015 (Possible) HEP Use Case for NDN Phil DeMar; Wenji Wu NDNComm (UCLA) Sept. 28, 2015 Outline LHC Experiments LHC Computing Models CMS Data Federation & AAA Evolving Computing Models & NDN Summary Phil DeMar:

More information

Scaling Out With Apache Spark. DTL Meeting 17-04-2015 Slides based on https://www.sics.se/~amir/files/download/dic/spark.pdf

Scaling Out With Apache Spark. DTL Meeting 17-04-2015 Slides based on https://www.sics.se/~amir/files/download/dic/spark.pdf Scaling Out With Apache Spark DTL Meeting 17-04-2015 Slides based on https://www.sics.se/~amir/files/download/dic/spark.pdf Your hosts Mathijs Kattenberg Technical consultant Jeroen Schot Technical consultant

More information

Data sharing and Big Data in the physical sciences. 2 October 2015

Data sharing and Big Data in the physical sciences. 2 October 2015 Data sharing and Big Data in the physical sciences 2 October 2015 Content Digital curation: Data and metadata Why consider the physical sciences? Astronomy: Video Physics: LHC for example. Video The Research

More information

Addressing research data challenges at the. University of Colorado Boulder

Addressing research data challenges at the. University of Colorado Boulder Addressing research data challenges at the University of Colorado Boulder Thomas Hauser Director Research Computing University of Colorado Boulder thomas.hauser@colorado.edu Research Data Challenges Research

More information

SUPPORT FOR CMS EXPERIMENT AT TIER1 CENTER IN GERMANY

SUPPORT FOR CMS EXPERIMENT AT TIER1 CENTER IN GERMANY The 5th InternaEonal Conference Distributed CompuEng and Grid technologies in Science and EducaEon SUPPORT FOR CMS EXPERIMENT AT TIER1 CENTER IN GERMANY N. Ratnikova, J. Berger, C. Böser, O. Oberst, G.

More information

Big Data and Storage Management at the Large Hadron Collider

Big Data and Storage Management at the Large Hadron Collider Big Data and Storage Management at the Large Hadron Collider Dirk Duellmann CERN IT, Data & Storage Services Accelerating Science and Innovation CERN was founded 1954: 12 European States Science for Peace!

More information

Computing at the HL-LHC

Computing at the HL-LHC Computing at the HL-LHC Predrag Buncic on behalf of the Trigger/DAQ/Offline/Computing Preparatory Group ALICE: Pierre Vande Vyvre, Thorsten Kollegger, Predrag Buncic; ATLAS: David Rousseau, Benedetto Gorini,

More information

No file left behind - monitoring transfer latencies in PhEDEx

No file left behind - monitoring transfer latencies in PhEDEx FERMILAB-CONF-12-825-CD International Conference on Computing in High Energy and Nuclear Physics 2012 (CHEP2012) IOP Publishing No file left behind - monitoring transfer latencies in PhEDEx T Chwalek a,

More information

US NSF s Scientific Software Innovation Institutes

US NSF s Scientific Software Innovation Institutes US NSF s Scientific Software Innovation Institutes S 2 I 2 awards invest in long-term projects which will realize sustained software infrastructure that is integral to doing transformative science. (Can

More information

NERSC File Systems and How to Use Them

NERSC File Systems and How to Use Them NERSC File Systems and How to Use Them David Turner! NERSC User Services Group! Joint Facilities User Forum on Data- Intensive Computing! June 18, 2014 The compute and storage systems 2014 Hopper: 1.3PF,

More information

CDFII Computing Status

CDFII Computing Status CDFII Computing Status OUTLINE: New CDF-Italy computing group organization Usage status at FNAL and CNAF Towards GRID: where we are Plans and requests 22/04/2005 Donatella Lucchesi 1 CDFII Computing group

More information

Data Movement and Storage. Drew Dolgert and previous contributors

Data Movement and Storage. Drew Dolgert and previous contributors Data Movement and Storage Drew Dolgert and previous contributors Data Intensive Computing Location Viewing Manipulation Storage Movement Sharing Interpretation $HOME $WORK $SCRATCH 72 is a Lot, Right?

More information

Data Storage and Data Transfer. Dan Hitchcock Acting Associate Director, Advanced Scientific Computing Research

Data Storage and Data Transfer. Dan Hitchcock Acting Associate Director, Advanced Scientific Computing Research Data Storage and Data Transfer Dan Hitchcock Acting Associate Director, Advanced Scientific Computing Research Office of Science Organization 2 Data to Enable Scientific Discovery DOE SC Investments in

More information

Open access to data and analysis tools from the CMS experiment at the LHC

Open access to data and analysis tools from the CMS experiment at the LHC Open access to data and analysis tools from the CMS experiment at the LHC Thomas McCauley (for the CMS Collaboration and QuarkNet) University of Notre Dame, USA thomas.mccauley@cern.ch! 5 Feb 2015 Outline

More information

Data Management Plan (DMP) for Particle Physics Experiments prepared for the 2015 Consolidated Grants Round. Detailed Version

Data Management Plan (DMP) for Particle Physics Experiments prepared for the 2015 Consolidated Grants Round. Detailed Version Data Management Plan (DMP) for Particle Physics Experiments prepared for the 2015 Consolidated Grants Round. Detailed Version The Particle Physics Experiment Consolidated Grant proposals now being submitted

More information

Online CMS Web-Based Monitoring. Zongru Wan Kansas State University & Fermilab (On behalf of the CMS Collaboration)

Online CMS Web-Based Monitoring. Zongru Wan Kansas State University & Fermilab (On behalf of the CMS Collaboration) Online CMS Web-Based Monitoring Kansas State University & Fermilab (On behalf of the CMS Collaboration) Technology and Instrumentation in Particle Physics June 13, 2011 Chicago, USA CMS One of the high

More information

Forschungszentrum Karlsruhe in der Helmholtz - Gemeinschaft. Holger Marten. Holger. Marten at iwr. fzk. de www.gridka.de

Forschungszentrum Karlsruhe in der Helmholtz - Gemeinschaft. Holger Marten. Holger. Marten at iwr. fzk. de www.gridka.de Tier-2 cloud Holger Marten Holger. Marten at iwr. fzk. de www.gridka.de 1 GridKa associated Tier-2 sites spread over 3 EGEE regions. (4 LHC Experiments, 5 (soon: 6) countries, >20 T2 sites) 2 region DECH

More information

The ALICE Online Data Storage System Roberto Divià (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva (CERN), Pierre Vande Vyvre (CERN) Valerio Altini (CERN), Franco Carena (CERN), Wisla Carena (CERN), Sylvain

More information

The supercomputer for particle physics at the ULB-VUB computing center

The supercomputer for particle physics at the ULB-VUB computing center The supercomputer for particle physics at the ULB-VUB computing center P. Vanlaer Université Libre de Bruxelles Interuniversity Institute for High Energies (ULB-VUB) Tier-2 cluster inauguration ULB, May

More information

Accelerating Experimental Elementary Particle Physics with the Gordon Supercomputer. Frank Würthwein Rick Wagner August 5th, 2013

Accelerating Experimental Elementary Particle Physics with the Gordon Supercomputer. Frank Würthwein Rick Wagner August 5th, 2013 Accelerating Experimental Elementary Particle Physics with the Gordon Supercomputer Frank Würthwein Rick Wagner August 5th, 2013 The Universe is a strange place! 67% of energy is dark energy We got no

More information

Realization of Inventory Databases and Object-Relational Mapping for the Common Information Model

Realization of Inventory Databases and Object-Relational Mapping for the Common Information Model Realization of Inventory Databases and Object-Relational Mapping for the Common Information Model Department of Physics and Technology, University of Bergen. November 8, 2011 Systems and Virtualization

More information

Q1. What are the differences between Data Backup, System Restore, Disk Image, System Recovery Disc and System Repair Disk?

Q1. What are the differences between Data Backup, System Restore, Disk Image, System Recovery Disc and System Repair Disk? Windows Q & A April 2012 By Mary Phillips, Secretary, Icon Users Group, MO April 2011 issue, The ICON-Newsletter of the Interactive Computer Owners Network www.iconusersgroup.com mary@iconusersgroup.org

More information

irods in complying with Public Research Policy

irods in complying with Public Research Policy irods User Group 2015 irods in complying with Public Research Policy Vic Cornell Senior Storage Consultant Overview Compliance overview UK examples Imperial College MedBio Requirements Architecture irods

More information

The CMS analysis chain in a distributed environment

The CMS analysis chain in a distributed environment The CMS analysis chain in a distributed environment on behalf of the CMS collaboration DESY, Zeuthen,, Germany 22 nd 27 th May, 2005 1 The CMS experiment 2 The CMS Computing Model (1) The CMS collaboration

More information

HIP Computing Resources for LHC-startup

HIP Computing Resources for LHC-startup HIP Computing Resources for LHC-startup Tomas Lindén Finnish CMS meeting in Kumpula 03.10. 2007 Kumpula, Helsinki October 3, 2007 1 Tomas Lindén Contents 1. Finnish Tier-1/2 computing in 2007 and 2008

More information

irods at CC-IN2P3: managing petabytes of data

irods at CC-IN2P3: managing petabytes of data Centre de Calcul de l Institut National de Physique Nucléaire et de Physique des Particules irods at CC-IN2P3: managing petabytes of data Jean-Yves Nief Pascal Calvat Yonny Cardenas Quentin Le Boulc h

More information

Is a Data Scientist the New Quant? Stuart Kozola MathWorks

Is a Data Scientist the New Quant? Stuart Kozola MathWorks Is a Data Scientist the New Quant? Stuart Kozola MathWorks 2015 The MathWorks, Inc. 1 Facts or information used usually to calculate, analyze, or plan something Information that is produced or stored by

More information

Using S3 cloud storage with ROOT and CernVMFS. Maria Arsuaga-Rios Seppo Heikkila Dirk Duellmann Rene Meusel Jakob Blomer Ben Couturier

Using S3 cloud storage with ROOT and CernVMFS. Maria Arsuaga-Rios Seppo Heikkila Dirk Duellmann Rene Meusel Jakob Blomer Ben Couturier Using S3 cloud storage with ROOT and CernVMFS Maria Arsuaga-Rios Seppo Heikkila Dirk Duellmann Rene Meusel Jakob Blomer Ben Couturier INDEX Huawei cloud storages at CERN Old vs. new Huawei UDS comparative

More information

An Integrated CyberSecurity Approach for HEP Grids. Workshop Report. http://hpcrd.lbl.gov/hepcybersecurity/

An Integrated CyberSecurity Approach for HEP Grids. Workshop Report. http://hpcrd.lbl.gov/hepcybersecurity/ An Integrated CyberSecurity Approach for HEP Grids Workshop Report http://hpcrd.lbl.gov/hepcybersecurity/ 1. Introduction The CMS and ATLAS experiments at the Large Hadron Collider (LHC) being built at

More information

KIT Site Report. Andreas Petzold. www.kit.edu STEINBUCH CENTRE FOR COMPUTING - SCC

KIT Site Report. Andreas Petzold. www.kit.edu STEINBUCH CENTRE FOR COMPUTING - SCC KIT Site Report Andreas Petzold STEINBUCH CENTRE FOR COMPUTING - SCC KIT University of the State of Baden-Württemberg and National Laboratory of the Helmholtz Association www.kit.edu GridKa Tier 1 - Batch

More information

Cloud Computing. What is it? Presented by Prof. Dr.Prabhas CHONGSTITVATANA Asst. Prof. Dr.Chaiyachet SAIVICHIT. Source : Montana State Library Archive

Cloud Computing. What is it? Presented by Prof. Dr.Prabhas CHONGSTITVATANA Asst. Prof. Dr.Chaiyachet SAIVICHIT. Source : Montana State Library Archive Cloud Computing What is it? Presented by Prof. Dr.Prabhas CHONGSTITVATANA Asst. Prof. Dr.Chaiyachet SAIVICHIT Source : Montana State Library Archive Source: http://www.free-pictures-photos.com/ How much

More information

Grid Computing in Aachen

Grid Computing in Aachen GEFÖRDERT VOM Grid Computing in Aachen III. Physikalisches Institut B Berichtswoche des Graduiertenkollegs Bad Honnef, 05.09.2008 Concept of Grid Computing Computing Grid; like the power grid, but for

More information

DSS. High performance storage pools for LHC. Data & Storage Services. Łukasz Janyst. on behalf of the CERN IT-DSS group

DSS. High performance storage pools for LHC. Data & Storage Services. Łukasz Janyst. on behalf of the CERN IT-DSS group DSS High performance storage pools for LHC Łukasz Janyst on behalf of the CERN IT-DSS group CERN IT Department CH-1211 Genève 23 Switzerland www.cern.ch/it Introduction The goal of EOS is to provide a

More information

CMS Tier-3 cluster at NISER. Dr. Tania Moulik

CMS Tier-3 cluster at NISER. Dr. Tania Moulik CMS Tier-3 cluster at NISER Dr. Tania Moulik What and why? Grid computing is a term referring to the combination of computer resources from multiple administrative domains to reach common goal. Grids tend

More information

Flexible Scalable Hardware independent. Solutions for Long Term Archiving

Flexible Scalable Hardware independent. Solutions for Long Term Archiving Flexible Scalable Hardware independent Solutions for Long Term Archiving More than 20 years of experience in archival storage 2 OA HPA 2010 1992 2000 2004 2007 Mainframe Tape Libraries Open System Tape

More information

HTCondor at the RAL Tier-1

HTCondor at the RAL Tier-1 HTCondor at the RAL Tier-1 Andrew Lahiff, Alastair Dewhurst, John Kelly, Ian Collier, James Adams STFC Rutherford Appleton Laboratory HTCondor Week 2014 Outline Overview of HTCondor at RAL Monitoring Multi-core

More information

Big Science and Big Data Dirk Duellmann, CERN Apache Big Data Europe 28 Sep 2015, Budapest, Hungary

Big Science and Big Data Dirk Duellmann, CERN Apache Big Data Europe 28 Sep 2015, Budapest, Hungary Big Science and Big Data Dirk Duellmann, CERN Apache Big Data Europe 28 Sep 2015, Budapest, Hungary 16/02/2015 Real-Time Analytics: Making better and faster business decisions 8 The ATLAS experiment

More information

Regulatory Compliance

Regulatory Compliance Regulatory Compliance Security: From a security standpoint, disk, tape and DVD, none of them meet the regulatory requirements of Sarbanes Oxley or HIPAA simply because they are generally not encrypted.

More information

OIS. Update on Windows 7 at CERN & Remote Desktop Gateway. Operating Systems & Information Services CERN IT-OIS

OIS. Update on Windows 7 at CERN & Remote Desktop Gateway. Operating Systems & Information Services CERN IT-OIS Operating Systems & Information Services Update on Windows 7 at CERN & Remote Desktop Gateway CERN IT-OIS Tim Bell, Michal Kwiatek, Michal Budzowski, Andreas Wagner HEPiX Fall 2010 Workshop 4th November

More information

Workload Characterization and Analysis of Storage and Bandwidth Needs of LEAD Workspace

Workload Characterization and Analysis of Storage and Bandwidth Needs of LEAD Workspace Workload Characterization and Analysis of Storage and Bandwidth Needs of LEAD Workspace Beth Plale Indiana University plale@cs.indiana.edu LEAD TR 001, V3.0 V3.0 dated January 24, 2007 V2.0 dated August

More information

THE ARCHIVAL SECTOR IN DW2.0 By W H Inmon

THE ARCHIVAL SECTOR IN DW2.0 By W H Inmon The fourth sector of the DW2.0 environment is the archival sector. Fig arch.1 shows the architectural positioning of the archival sector. Fig arch.1 The archival sector All data that flows into the archival

More information

NUIT Tech Talk: Trends in Research Data Mobility

NUIT Tech Talk: Trends in Research Data Mobility NUIT Tech Talk: Trends in Research Data Mobility Pascal Paschos NUIT Academic & Research Technologies, Research Computing Services Matt Wilson NUIT Cyberinfrastructure, Telecommunication and Network Services

More information

Research Data Management Plan

Research Data Management Plan Please consult the guidance notes for examples of answers to help you to complete this template Overview Researchers Name: Title of Research Project: Length of Project: e.g. 6 months, 3 years, etc A brief

More information

Running the scientific data archive

Running the scientific data archive Running the scientific data archive Costs, technologies, challenges Jos van Wezel STEINBUCH CENTRE FOR COMPUTING - SCC KIT University of the State of Baden-Württemberg and National Laboratory of the Helmholtz

More information

Big Data and Cloud Computing for GHRSST

Big Data and Cloud Computing for GHRSST Big Data and Cloud Computing for GHRSST Jean-Francois Piollé (jfpiolle@ifremer.fr) Frédéric Paul, Olivier Archer CERSAT / Institut Français de Recherche pour l Exploitation de la Mer Facing data deluge

More information

Data Requirements from NERSC Requirements Reviews

Data Requirements from NERSC Requirements Reviews Data Requirements from NERSC Requirements Reviews Richard Gerber and Katherine Yelick Lawrence Berkeley National Laboratory Summary Department of Energy Scientists represented by the NERSC user community

More information

Big Data and Analytics: Getting Started with ArcGIS. Mike Park Erik Hoel
