A Physics Approach to Big Data. Adam Kocoloski, PhD CTO Cloudant
1 A Physics Approach to Big Data Adam Kocoloski, PhD CTO Cloudant
2
3 Solenoidal Tracker at RHIC (STAR)
4 The life of LHC data
- Detected by experiment
- Online filtering (hardware and software)
- Transferred to CERN main campus, archived & reconstructed
- Transferred to T1 sites, archived, reconstructed & skimmed
- Transferred to T2 sites, reconstructed, skimmed, filtered & analysed
- Written into locally analyzable files, put on laptops
- Turned into a plot in a paper
5 Dig big tunnels
6 Chain up series of atom smashers
7 Put sensitive cameras in awkward places
8 Record events
10 CMS Data Flow We have a big digital camera
11 It takes photos of this
12 Which come out looking like this
16 CMS Data Flow
- We have a big digital camera
- ~200 GB/s
- Which goes into lots of computers (HLT)
- ~2 GB/s
- And then into lots of disk (Storage Manager)
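As a rough illustration of the two rates on this slide, the arithmetic below treats ~200 GB/s as the readout rate flowing into the HLT and ~2 GB/s as the rate written to the Storage Manager. That mapping is an assumption based on where the figures sit on the slide, and all names in the sketch are illustrative.

```python
# Approximate rates quoted on the slide; mapping them to pipeline
# stages is an assumption, not something the slide states outright.
CAMERA_RATE_GB_S = 200.0   # assumed: detector readout into the HLT farm
STORAGE_RATE_GB_S = 2.0    # assumed: HLT output written to the Storage Manager


def reduction_factor(rate_in: float, rate_out: float) -> float:
    """Return how many times smaller the output stream is than the input."""
    if rate_out <= 0:
        raise ValueError("rate_out must be positive")
    return rate_in / rate_out


def fraction_discarded(rate_in: float, rate_out: float) -> float:
    """Return the fraction of the input data volume that is not kept."""
    if rate_in <= 0:
        raise ValueError("rate_in must be positive")
    return 1.0 - rate_out / rate_in


factor = reduction_factor(CAMERA_RATE_GB_S, STORAGE_RATE_GB_S)
dropped = fraction_discarded(CAMERA_RATE_GB_S, STORAGE_RATE_GB_S)
print(f"reduction factor: {factor:.0f}x, discarded: {dropped:.0%}")
```

Under that reading, the online filter keeps roughly 1% of the data volume and drops the rest before it ever reaches disk.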
17 Throw away data ASAP
19 10 PB / year
21 Tiers
- 1PB/week
- Maybe a bit different
- Many (>100) sites with 100s TB storage, s worker nodes
- Why so many? Politics, power budget, cost
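To put the volumes quoted in the talk on a common scale, the sketch below converts 1 PB/week and 10 PB/year into average sustained rates. It assumes decimal units (1 PB = 10^15 bytes) and a continuous, evenly spread flow. Neither assumption comes from the slides, and real transfers are bursty.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600
SECONDS_PER_YEAR = 365 * 24 * 3600
BYTES_PER_PB = 10**15   # decimal petabyte (assumption; the slides do not say)
BYTES_PER_GB = 10**9    # decimal gigabyte (assumption)


def average_rate_gb_s(petabytes: float, seconds: float) -> float:
    """Average rate in GB/s needed to move `petabytes` in `seconds`."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return petabytes * BYTES_PER_PB / BYTES_PER_GB / seconds


weekly = average_rate_gb_s(1, SECONDS_PER_WEEK)    # the "1PB/week" figure
yearly = average_rate_gb_s(10, SECONDS_PER_YEAR)   # the "10 PB / year" figure

print(f"1 PB/week  ~ {weekly:.2f} GB/s ({weekly * 8:.1f} Gbit/s)")
print(f"10 PB/year ~ {yearly:.2f} GB/s ({yearly * 8:.1f} Gbit/s)")
```

Averaged this way, 1 PB/week works out to a bit under 2 GB/s sustained, which helps explain why the data is spread over many sites rather than funnelled through a single link.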
23 Analysis Workflow
- Each analysis is ~unique
- Query language is C++
- Runs on distributed and local resources
- Myriad cut selections to identify interesting events
- Data in final plot substantially reduced from the original dataset
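A minimal sketch of what a sequence of cut selections looks like in practice. The slide notes that real analyses are written in C++; Python is used here only for brevity. The event fields (`pt`, `eta`, `n_jets`), the thresholds, and the toy events are all invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, float]
Cut = Tuple[str, Callable[[Event], bool]]


def cut_flow(events: List[Event], cuts: List[Cut]) -> Tuple[List[Event], List[Tuple[str, int]]]:
    """Apply cuts in order; return the survivors and the count after each cut."""
    counts = [("all events", len(events))]
    surviving = list(events)
    for name, passes in cuts:
        surviving = [e for e in surviving if passes(e)]
        counts.append((name, len(surviving)))
    return surviving, counts


# Toy events with hypothetical fields; real events carry far more information.
events = [
    {"pt": 45.0, "eta": 0.3, "n_jets": 2},
    {"pt": 12.0, "eta": 1.1, "n_jets": 4},
    {"pt": 60.0, "eta": 2.9, "n_jets": 3},
    {"pt": 33.0, "eta": -0.8, "n_jets": 1},
    {"pt": 80.0, "eta": 1.5, "n_jets": 5},
]

# Hypothetical selection: each cut is a name plus a predicate on one event.
cuts = [
    ("pt > 30", lambda e: e["pt"] > 30.0),
    ("|eta| < 2.4", lambda e: abs(e["eta"]) < 2.4),
    ("n_jets >= 2", lambda e: e["n_jets"] >= 2),
]

selected, counts = cut_flow(events, cuts)
for name, n in counts:
    print(f"{name:12s} {n}")
```

Recording the count after every cut shows how quickly the sample shrinks, which is the point of the last bullet: the data in the final plot is a small fraction of the original dataset.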
24
25
26 Multivariate analyses extract signals that are impossible to find via hand-drawn cuts
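A toy illustration of the idea, using invented data. Here the signal differs from the background only in the combination `y - x`, so no threshold on `x` alone or `y` alone separates them well. The linear combination stands in for a real multivariate method, which would learn such a combination from data rather than have it written in by hand.

```python
from typing import List, Tuple

Point = Tuple[float, float]

# Invented toy data: background lies on y = x, signal on y = x + 1.
background: List[Point] = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
signal: List[Point] = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 4.0), (4.0, 5.0)]


def best_cut_accuracy(sig: List[float], bkg: List[float]) -> float:
    """Best classification accuracy achievable with one threshold on one variable."""
    values = sorted(set(sig) | set(bkg))
    # Candidate thresholds: below all values, between neighbours, above all values.
    thresholds = [values[0] - 1.0]
    thresholds += [(a + b) / 2.0 for a, b in zip(values, values[1:])]
    thresholds.append(values[-1] + 1.0)
    total = len(sig) + len(bkg)
    best = 0.0
    for t in thresholds:
        correct_above = sum(v > t for v in sig) + sum(v <= t for v in bkg)
        # The mirrored cut (signal below t) gets exactly the complement right.
        best = max(best, correct_above / total, (total - correct_above) / total)
    return best


acc_x = best_cut_accuracy([x for x, _ in signal], [x for x, _ in background])
acc_y = best_cut_accuracy([y for _, y in signal], [y for _, y in background])
acc_combined = best_cut_accuracy(
    [y - x for x, y in signal], [y - x for x, y in background]
)

print(f"cut on x only:  {acc_x:.1f}")
print(f"cut on y only:  {acc_y:.1f}")
print(f"cut on y - x:   {acc_combined:.1f}")
```

In this contrived case the single-variable cuts do little better than guessing, while the combined variable separates the two samples completely. Real data is never this clean, but the mechanism is the same.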
27 Humans find signals even when none exist
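One way to quantify this is the chance that pure background produces at least one apparently significant excess when many places are examined. The sketch assumes independent places and Gaussian fluctuations. The 3-sigma threshold and the counts are illustrative choices, not figures from the talk.

```python
import math


def one_sided_p_value(n_sigma: float) -> float:
    """Probability that a Gaussian fluctuates upward by at least n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


def prob_at_least_one_fake(n_sigma: float, n_places: int) -> float:
    """Chance of one or more background-only excesses among n_places independent looks."""
    if n_places < 0:
        raise ValueError("n_places must be non-negative")
    p = one_sided_p_value(n_sigma)
    return 1.0 - (1.0 - p) ** n_places


for n in (1, 10, 100, 1000):
    print(f"{n:5d} places examined: P(fake 3-sigma excess) = {prob_at_least_one_fake(3.0, n):.3f}")
```

A single 3-sigma excess is rare in one place, but it becomes more likely than not once enough histograms, bins, or selections have been scanned.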
28
29 Workflow Ladder
Axis: Number of users
Rungs, in the order shown on the slide:
- Large datasets (>100 TB), complex computation
- Large datasets (>100 TB), simple computation
- Shared datasets (>500 GB), complex computation
- Shared datasets ( GB), complex computation
- Shared datasets ( GB), simple computation
- Shared datasets ( GB), simple computation
- Private datasets ( GB), simple computation
Bracketed groups of rungs are labelled:
- Use Grid compute and storage exclusively
- Work on departmental resources, store resulting datasets to Grid storage
- Work on laptop/desktop machine, store resulting datasets to Grid storage
30 Collaborative skimming remains valuable
38 Let's Review
- Particle physics is special
- But not as special as it used to be vis-à-vis data operations
- And that's a good thing
- Keep only the data you can analyze
- Remember that we're biased to find patterns, and lots of data means lots of patterns
- Multivariate analyses win in the end
- Skim datasets for fun and profit
39