Open Data & Data Management: the INFN experience



Similar documents
The History Italian Structural Funds Cloud computing technologies for smart government Cloud computing technologies for Smart Government

Trials community. Yannick Legré. EGI InSPIRE RI

Curriculum Vitae Cristina Lancioni

Curriculum Vitae et Studiorum. Giovanni Losurdo

Open Access and Open Research Data in Horizon 2020

The Open Archive at the University of Verona. Maria Gabaldo May 26, 2011

Digital performance of Italy

The Italian Grid Infrastructure (IGI) CGW08 Cracow. Mirco Mazzucato Italian Grid Infrastructure Coordinator INFN CNAF Director

data infrastructures framework for action for H2020

Big Data Challenges for e-science Infrastructure

Workprogramme

Invenio: A Modern Digital Library for Grey Literature

Italian Scientific Big Data Initiative

Scientific Cloud Computing Infrastructure for Europe Strategic Plan. Bob Jones,

MINUTES OF THE ADMINISTRATIVE COMMITTEE: EVALUATION OF THE REQUIRED INTEGRATIONS

DRIVER: Building a sustainable infrastructure of European Scientifc Repositories

SPARC Europe in the Open Access Movement: A Master Plan to tackle the barriers?

Open Access to Publications. From FP7 Pilot to Horizon 2020 Mandate

Implementing & measuring the FP7 OA pilots

EGI services for distribution and federation of data and computing

Open Access to publications and research data in Horizon 2020

How To Promote Agricultural Productivity And Sustainability

OPEN ACCESSAND DATA MANAGEMENT SUPPORTAT THE UNIVERSITY OF HELSINKI

How To Understand The History Of Centro Servizi Calza

Scientific Cloud Computing Infrastructure for Europe. Bob Jones,

ISTITUTO NAZIONALE DI FISICA NUCLEARE CONSIGLIO DIRETTIVO DELIBERAZIONE N

INTERNATIONAL WORKSHOP ON BIKE SHARING A SINGLE SMART CARD FOR DIFFERENT SERVICES. 15/09/10 International workshop on bike sharing

The PON RECAS Infrastructure: present and future of grid/cloud computing and connectivity Guido Russo Università di Napoli Federico II & INFN Napoli

Data at NIST: A View from the Office of Data and Informatics

Big Data Needs High Energy Physics especially the LHC. Richard P Mount SLAC National Accelerator Laboratory June 27, 2013

Jochen Schirrwagen, Najko Jahn. Bielefeld University Library, Germany. Research in Context

Exploitation of ISS scientific data

How To Build An Open Source Data Infrastructure

Scholarly Communication A Matter of Public Policy

Research Infrastructures in Horizon 2020

ACCADEMIA ITALIANA DI LINGUA

D4.2 - Analysis of Data Infrastructure and Data repositories

Biometrics and soft biometrics for people identification

Michele Lucente Curriculum Vitæ

Scientific Data Infrastructure: activities in the Capacities Programme of FP7

Attachment(s) - Curriculum Vitae - Photo. Davide Merlitti. Nationality: Italy. 1 / 7

e-science and technology infrastructure for biodiversity research

Distributed Computing Services on top of a Research and Education Network: GARR. Federico Ruggieri Ubuntunet Connect 2013 Kigali, Rwanda

Virtual digital libraries: The DILIGENT Project. Donatella Castelli ISTI-CNR, Italy

A public-private partnership building a multidisciplinary cloud platform for data intensive science

w w w. e u r o t e c h. c o m w e b. i n f n. i t / a u r o r a s c i e n c e

Volunteer Crowd Computing and Federated Cloud developments. David Wallom

ADVANCEMENTS IN BIG DATA PROCESSING IN THE ATLAS AND CMS EXPERIMENTS 1. A.V. Vaniachine on behalf of the ATLAS and CMS Collaborations

How To Teach Physics At The Lhc

e-infrastructures in Horizon 2020 Vision, approach, drivers, policy background, challenges, WP structure INFODAY France Paris, 25 mars 2014

OpenAIRE Research Data Management Briefing paper

Round Table Italy-Russia at Dubna

Open Access to scientific data. SwissCore Annual Event Brussels, 14 May 2014

EUDAT. Towards a pan-european Collaborative Data Infrastructure

Interoperating Cloud-based Virtual Farms

Shibbolized irods (and why it matters)

Ricerca scientifica e sviluppo umano: il caso del CERN. Fabiola Gianotti CERN, Physics Department

Digital Communication and Interoperability - A Case Study

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences

GENESI-DEC: a federative e-infrastructure for Earth Science data discovery, access, and on-demand processing

SERVIZIO FORMAZIONE ALLA RICERCA PHD COURSES A.Y. 2015/16 31 CYCLE

Curriculum Vitae Et Studiorum

The challenges of digital preservation to support research in the digital age

Building Cloud Applications to Support Research: the Microsoft Azure Research Engagement Project

Leveraging OpenStack Private Clouds

Agenzia Nazionale LLP ERASMUS Intensive Programme A.A. 2012/2013 ATTRIBUZIONI

How To Harvest For The Agnic Portal

Image quality issues in digitization projects of historical documents

Agenda. NRENs, GARR and GEANT in a nutshell SDN Activities Conclusion. Mauro Campanella Internet Festival, Pisa 9 Oct

Francesco Tortorelli

The EGI Federated Cloud e-infrastructure

Politecnico di Torino Research Information System overview on project path and goals

Research Data Alliance: Current Activities and Expected Impact. SGBD Workshop, May 2014 Herman Stehouwer

e-irg workshop Dublin May 2013 Track 1: Coordination of e-infrastructures

Distributed analysis for the ATLAS Experiment in the S.Co.P.E Project

IL NUCLEARE IN ITALIA: SI RIPARTE?

The Resilient Smart Grid Workshop Network-based Data Service

THAMUS PRESENTATION LIRICS Industrial Advisory Group meeting Barcelona 20/21 June Un Consorzio

Procurement Innovation for Cloud Services in Europe

The challenge of managing research data. Axel Berg

Automating Big Data Management, by DISIT Lab Distributed [Systems and Internet, Data Intelligence] Technologies Lab Prof. Ph.D. Eng.

twitter.com/liberologico Pisa Pula (CA) Campochiaro (CB)

Polo Tecnologico Servizi PAVIA

(Possible) HEP Use Case for NDN. Phil DeMar; Wenji Wu NDNComm (UCLA) Sept. 28, 2015

The B2G electronic invoicing in Italy 1 year later

Data Intensive Research Initiative for South Africa (DIRISA)

Summary of the FabriQ 2 Call for Proposals to promote and support entrepreneurial projects in the field of social innovation

Company Presentation and Main Activities

EU H2020 funding opportunities. Mauro Morandin INFN PD

Global Research Data Infrastructure: Path Forward for Progress. Dr. Chris Greer Senior Executive for Cyber Physical Systems

Helix Nebula, the Science Cloud

Social Innovation beyond the State

dati.culturaitalia.it a Pilot Project of CulturaItalia dedicated to Linked Open Data

Enabling multi-cloud resources at CERN within the Helix Nebula project. D. Giordano (CERN IT-SDC) HEPiX Spring 2014 Workshop 23 May 2014

Overview of HEP. in Spain

I musei virtuali europei e la rete di eccellenza v-must.net European virtual museum and the network of excellence v-must.net

How To Host A Cnitfot Net Seminar On Festafotnet

Web and Mobile access to storage with glibrary

Executive summary. Prepared by Bob Jones (IT department) on behalf of CERN 17 March 2015

Transcription:

Open Data & Data Management: the INFN experience Marcello Maggi INFN Senior Researcher Istituto Nazionale Fisica Nucleare Bari-Italy A Chaotic view from a scientist point of view

The HEP Scientist The Standard Model Of Elementary Par>cles Fermions t γ d s b Z Leptons νe νμ ντ W e μ τ The Just Discovered Piece FROM MICROCOSM Force carriers Quarks u c Bosons g h Dark Matter Matter/Anti-matter Asymmetry Super Symmetric Particles TO MACROCOSM

In Big Communities In International Labs (CERN) Past Century collaboration ~500 Scientists From all around the word Today collaboration ~4000 Scientists

Data Sharing & Data Management Fundamental Issue Birth of Web @ CERN

INFN Community of researcher in physics and applied physics Based on 4 national laboratories and 20 divisions spread across Italy Big impact on Italian Society

The Italian e-infrastructure Today Picture: 50 data centers 40,000 cores ~60 PB Growing through approved projects CPU: +25%/year Disk: +20%/year

The (Big) DATA 10 7 sensors produce 5 PByte/sec Complexity reduced by a Data Model Analytics in real time filters to 0.1 1 Gbyte/sec (Trigger) Data + Replica move with a Data Management Policy Analytics produce Publication Data that are Shared Finally the Publications Is all that Open? We Start from here

Open Science Open Access Data Preservation SCOAP3 Innovative Business Molel for OAP Knowledge Base & Semantic Searches Common Practices

INFN & Open Access Budapest Open Access Initiative 2001 Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities 2003 INFN in SCOAP3 2007 INFN signs Berlin Declaration 2008 INFN signs Granada Declaration 2010 INFN signs the MedOAnet position paper 2013

Since Ever HEP community publish and distribute preprints

Worldwide consortium funding HEP publications and enforce OA, through the re-routing of subscription funds, and the transition to a system of commercial competition Past: The HEP agencies subscribing through libraries funded peer-reviews, allowing its users to read the articles. There was no form of commercial competition between journals. Present: The HEP agencies and libraries, together, contribute to the consortium SCOAP3 that, after selecting the journals, pay centrally the peer review for each published article. The articles are OA.

Open Data Event Display of Higgs boson decay

Publication Data Analytic step 1 Pre Selected Data Analytic step 2 Final Data Samples Analytic step 3 Analytic step 4

Raw data The tip of the iceberg

Levels of Open Data? Discussion on going Data&Harmonization&Guidelines Common%tacit%points%of%agreement%between%LHC%experiments: level$1$data:$all%experiments%already%make%data%from%papers%and%supporting% information%available%through%hepdata/inspire,%support%open%access% journals%etc..! level$2$data:$all%experiments%already%support%limited%access%of%samples%in% simple%formats%for%outreach%and%teaching. level$3$data:$full%reconstruction%outputs%for%analysis%(aod,%dpd/ntuples)% might%be%made%available%after%an%embargo%period% %but%suggested%durations% range%from%3%to%10%years,%and%there%is%a%question%of%usefullness.%the%resource% implications%to%make%this%useful%are%high. level$4$data:$general%agreement%raw%data%is%preserved%for%the%experiment% and%future% %open%data%access%is%not%usually%possible%even%to%the% collaboration%members.%(in%atlas%access%to%raw%data%on%tape%is%restricted). Tools$like$Rivet,$HEPDATA$&$Recast$may$make$data$(information)$usefully$ available,$bridging$level$3$and$level$1. 4

PILOT SCREENSHOT opendata.ct.infn.it INFN Open Data

Italian Research DB Resulting from a discussion between the CERN and INFN responsible persons for Open Access

Research DB pilot: opendata.ct.infn.it INVENIO-NEXT & ZENODO Open Data Discovery CINECA VQR OpenAIRE arxiv SINGLE MANDATORY DEPOSIT Data Service Bibl. CNR DSPACE INFN Multi media INFN Grey Lit Scoap3 OA papers Happy INFN scientists Knowledge Service

INFN is Active in Knowledge Base & Semantic Search

A Global OA Repository 2,500 repos >33 M docs

Global Data Repository 600 repos Lots of data!

Data & Knowledge Infrastructure Linked-data search engine Semantic-web enrichment Harvester (running on grid/cloud) Harvester (running on grid/cloud) OA Reps OAI-PMH End-points OAI-PMH Data Reps

European Research e-infrastructures New Trend in Europe: Secure computing resources funding from FA: ELIXIR (Life science) identified nodes in the consortium LifeWatch (Earth science) has IT research center CLARIN (Arts, humanities and social science) has certified centers Virtual hubs federating major computing centers to offer resources and services

Eu-T0 Federate major computing and data process centers of Particle, Nuclear, Astro-Particle Physics, Cosmology and Astrophysics into a integrated distributed infrastructure: a virtual European Tier0 data and computing center around which all other national centers revolve and from which all concerned national e-infrastructures radiate IN2P3-Fr INFN-It STFC-UK DESY-DE KIT-DE IFAE-ES CIEMAT-ES CERN signed the position paper NeIC (Nordic e-infrastructure Collaboration) asked to join

INFN is exporting/importing experience Multidisciplinary and/or extra Europe - Chain-Reds (Coordination and harmonisation of e-infrastructure for research and data sharing - aginfra (data infrastructure for agriculture) - DCH-RP (Digital Cultural Heritage Roadmap for Preservation) - BioVel (Biodiversity Virtual E-Laboratory) National collaborations on - Computational Chemistry: Uni. Pg, Uni. To, CNR-ISOF - Environmental Science: EMSO (European Multidisciplinary Seafloor Observatory) (ESFRI), DRHIM (Distributed Research Infrastructure for Hydro-Meteorology) (FP7 proj.) CIMA (Centro Monitoraggio Ambiente) - Bioinformatics: CNR-ITB, Uni. Bo Partecipazione JRU - Elixir (European life science infrastructure for Biological Information) - Life Watch : earth science in progress

From Global to Local Projects Core Business Projects DHTCS-it ReCaS Prin-Stoa Multidisciplinary Projects (smart cities) Prisma (PiattafoRme cloud Interoperabili per SMArt-government) OCP (Open City Platform) Cagliari 2020

Open Cloud Platform -1 Partners 1. Almaviva the Italian Innovation Company S.P.A 2. Maggioli SpA 3. Santer Reply S.P.A. 4. Pluservice s.r.l capofila della ATI Marche (E-LINKING ONLINE SYSTEMS S.R.L., ETT S.p.A., FILIPPETTI S.P.A., APRA PROGETTI S.R.L., HALLEY INFORMATICA S.R.L., ESALAB S.R.L., SEDA S.p.A. - Gruppo KGS, ITALSOFT S.R.L., JEF S.R.L.) 5. LASCAUX s.r.l. capofila della ATI Toscana-ER (SISTEMI TERRITORIALI S.R.L., SINED S.R.L., PHOOPS S.R.L., AGENZIA ESPRESSI S.A.S., 3D INFORMATICA S.R.L.) 7. INFN - Istituto Nazionale di Fisica Nucleare 8. UniCam - Università degli Studi di Camerino

Open Cloud Platform -2 PA involved ITALIAN REGIONS 1. REGIONE MARCHE 2. REGIONE TOSCANA 3. REGIONE EMILIA ROMAGNA COMUNI/UNIONI 1. Comune di Macerata 2. Comune di San Severino 3. Comune di Camerino 4. Comune di Matelica 5. Comune di Castelraimondo 6. Comune di Tolentino 7. Comune di San Benedetto 8. Comune di Ancona 9. Comune di Pesaro 10. Comune di Senigallia 11. Comune di Civitanova 12. Comune di Fabriano 13. Comunità Montana Alto e Medio Metauro 14. Comune di Ascoli 15. Comune di Rosignano Marittimo 16. Comune di Livorno 17. Comune di Lucca 18. Comune di Massa 19. Unione dei Comuni dell Amiata Grossetana 20. Comune di Cesena 21. Unione dei Comuni della Bassa Romagna

Open Cloud Platform -3 Open Data & Open Service Engine

Open Cloud Platform -4

Conclusions INFN e-infrastructure spreads in the entire territory Part of an International Collaborative e-infrastructure Open Access & Data naturally Rich exchange with other disciplines (federation and/or interoperability) Capable to study, develop & deploy solutions to demands from Global (Macro) to Local (Micro)