Distributed Database Access in the LHC Computing Grid with CORAL




Dirk Duellmann, CERN IT
on behalf of the CORAL team (R. Chytracek, D. Duellmann, G. Govi, I. Papadopoulos, Z. Xie)
http://pool.cern.ch & http://pool.cern.ch/coral
CHEP 07, September 6th, Victoria, Canada

Other connected contributions at CHEP 07

171 - Production Experience with Distributed Deployment of Databases for the LHC Computing Grid
172 - CERN Database Services for the LHC Computing Grid
122 - Relational databases for conditions data and event selection in ATLAS
161 - Building a Scalable Event-Level Metadata System for ATLAS
186 - Development, Deployment and Operations of ATLAS Databases
319 - Alignment data streams for the ATLAS Inner Detector
333 - Large Scale Access Tests and Online Interfaces to ATLAS Conditions Databases
90 - ATLAS Conditions Database Experience with the COOL Conditions Database Project
236 - Control and monitoring of alignment data for the ATLAS endcap Muon Spectrometer at the LHC
294 - An Inconvenient Truth: file-level metadata and in-file metadata caching in the (file-agnostic) ATLAS distributed event store
462 - Database architecture for the calibration of ATLAS Monitored Drift Tube Chambers
358 - Distributed Interactive Access to Large Amount of Relational Data
430 - The ATLAS METADATA INTERFACE
110 - Experience and Lessons learnt from running high availability databases on Network Attached Storage
109 - Oracle RAC (Real Application Cluster) application scalability, experience with PVSS and methodology
265 - LHCb Distributed Conditions Database
213 - LHCb experience with LFC database replication
89 - LHCb Online Interface to a Conditions Database
322 - CMS Conditions Data Access using FroNTier
325 - The CMS Dataset Bookkeeping Service
182 - Distributed Database Access in the LHC Computing Grid with CORAL
181 - Development Status and Plans for the Common Database Access Layer (CORAL)
204 - COOL Software Development and Service Deployment Status
205 - COOL Performance Tests and Optimization
350 - Nightly builds and software distribution in the / AA / SPI project
447 - Implementing a Modular Framework in a Conditions Database Explorer for ATLAS
292 - Explicit state representation and the ATLAS event data model: theory and practice
Dirk.Duellmann@cern.ch Distributed Database Access with CORAL - 2

CORAL - A Database Foundation Layer

- Started as a component of POOL, now packaged independently
  - POOL and COOL use CORAL
  - CORAL does not require either (online use)
- CORAL goals
  - database foundation for physics applications
    - local area and grid
    - all AA platforms and database back-ends
  - high-level interface
    - C++ and Python API
    - abstraction from the SQL dialects of DB vendors and from connection technologies, which avoids the risk of vendor binding
  - high-level services
    - authentication, authorisation, DB service look-up, retry and fail-over (see talk #182 for details)

Between Applications and Services

- Physics Applications: common or experiment specific
- COOL: Validity Intervals, Conditions Data
- POOL: Object Storage, Meta Data and Collections, File Catalogs
- CORAL: RDBMS Access, Indirection, Authentication/Authorisation
- Computing Services: Files, Databases, Grid Services

Applications Using CORAL

- Persistency framework components: POOL and COOL have fully replaced direct database dependencies with CORAL
  - POOL file catalogs
  - POOL event collections
  - COOL conditions database
- CORAL is also used directly
  - ATLAS: detector geometry, several online
  - CMS: conditions data, POPCON framework
  - LHCb: online applications
- Astrophysics: the LSST project is evaluating CORAL for storage of large-volume image data

Database Back-ends

- Oracle - 10g R2
  - based on the C-level interface (OCI): performance and flexibility in compiler choice
  - all API concepts / optimisations supported natively: bind variables, bulk operations, session pooling
- MySQL - V5 (and 4)
  - small to medium sized services
  - standalone development set-ups
- SQLite - V3.4
  - server-less, file-based access
  - popular transport/replication medium for read-only data
- FroNTier/SQUID - caching layer
  - read-only access to Oracle via HTTP
  - multi-level caching between server and client to avoid network and server latencies for repeated queries

SQL-free API

Example 1: Table creation

    coral::ISchema& schema = session.nominalSchema();
    coral::TableDescription tableDescription;
    tableDescription.setName( "T_t" );
    tableDescription.insertColumn( "I", "long long" );
    tableDescription.insertColumn( "X", "double" );
    schema.createTable( tableDescription );

Resulting SQL per back-end:

    Oracle:  CREATE TABLE T_t ( I NUMBER(20), X BINARY_DOUBLE )
    MySQL:   CREATE TABLE T_t ( I BIGINT, X DOUBLE PRECISION )

SQL-free API cont.

Example 2: Issuing a query

    coral::ITable& table = schema.tableHandle( "T_t" );
    coral::IQuery* query = table.newQuery();
    query->addToOutputList( "X" );
    query->addToOrderList( "I" );
    query->limitReturnedRows( 5 );
    coral::ICursor& cursor = query->execute();

Resulting SQL per back-end:

    Oracle:  SELECT * FROM ( SELECT X FROM T_t ORDER BY I ) WHERE ROWNUM < 6
    MySQL:   SELECT X FROM T_t ORDER BY I LIMIT 5
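The two SQL strings in Example 2 suggest how one technology-neutral query description can be rendered per back-end. The following is only a minimal, self-contained sketch of that idea and is not CORAL code: `QuerySpec`, `Dialect` and `renderSelect` are hypothetical names introduced for illustration, and the only dialect rules covered are the two shown on the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a back-end neutral query description.
struct QuerySpec {
    std::string table;
    std::vector<std::string> output;   // columns to return
    std::vector<std::string> orderBy;  // columns to sort by
    int rowLimit;                      // 0 or negative means "no limit"
};

enum class Dialect { Oracle, MySQL };

// Join column names with ", ".
static std::string joinNames(const std::vector<std::string>& items) {
    std::string out;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i > 0) out += ", ";
        out += items[i];
    }
    return out;
}

// Render the same query description in the SQL dialect of the chosen back-end.
std::string renderSelect(const QuerySpec& q, Dialect d) {
    assert(!q.output.empty());  // at least one output column is required
    std::string sql = "SELECT " + joinNames(q.output) + " FROM " + q.table;
    if (!q.orderBy.empty()) sql += " ORDER BY " + joinNames(q.orderBy);
    if (q.rowLimit <= 0) return sql;
    if (d == Dialect::Oracle) {
        // Oracle form from the slide: wrap the ordered query and filter on ROWNUM.
        return "SELECT * FROM ( " + sql + " ) WHERE ROWNUM < " +
               std::to_string(q.rowLimit + 1);
    }
    // MySQL form from the slide: a LIMIT clause.
    return sql + " LIMIT " + std::to_string(q.rowLimit);
}
```

Calling `renderSelect` with table `T_t`, output column `X`, order column `I` and a limit of 5 reproduces the Oracle and MySQL statements listed in Example 2.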

CORAL Components

- Client Software
- CORAL Interfaces (C++ abstract classes, user-level API)
- CORAL C++ types (Row buffers, BLOBs, Date, TimeStamp, ...)
- Common Implementation (developer-level interfaces)
- Plug-in libraries, loaded at run-time, interacting only through the interfaces
  - Relational Service implementation
  - RDBMS Implementation: oracle, mysql, sqlite, frontier
  - Authentication Service: environment, xml, lfc
  - Lookup Service: xml, lfc
  - Connection Service: Pools, Retry, Failover
  - Monitoring Service implementation

Database Service Lookup & Connection Service

- Problem: DB clients need to
  - obtain physical DB connection parameters (aka "connect string"); this is more than a host name, as technology-specific optimisation and retrial parameters are specified
  - handle DB replica selection
  - handle connection retry and fail-over in case of problems
  - avoid hard-coding of connection parameters or user credentials
- Several CORAL plug-ins can be selected at runtime to handle the above in different computing environments
  - via shell environment (assuming you control machine access)
  - via XML files (assuming you can control file access)
  - via LFC catalog (certificates to protect DB access credentials)
- CORAL maintains a client-side pool of connections
  - a network connection is shared by several logical DB sessions
  - authentication overhead is minimised
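The look-up step described above can be pictured as a mapping from a logical service name to an ordered list of physical replicas. The sketch below is an illustration under stated assumptions, not the CORAL look-up interface: `LookupTable`, `Replica`, `AccessMode` and the connect strings used with it are hypothetical, and a real plug-in would read the entries from the shell environment, an XML file or the LFC.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

enum class AccessMode { ReadOnly, Update };

// One physical replica as a look-up service might describe it (hypothetical fields).
struct Replica {
    std::string connectString;  // technology-specific physical connection string
    AccessMode mode;            // what this replica may be used for
};

// Hypothetical in-memory look-up table:
// logical service name -> replicas in order of preference.
class LookupTable {
public:
    void add(const std::string& logicalName, const Replica& replica) {
        assert(!logicalName.empty());
        entries_[logicalName].push_back(replica);
    }

    // Return the connect strings usable for the requested access mode,
    // keeping the order of preference. Read-only requests may use any
    // replica; update requests need a replica that accepts updates.
    std::vector<std::string> resolve(const std::string& logicalName,
                                     AccessMode wanted) const {
        std::vector<std::string> result;
        auto it = entries_.find(logicalName);
        if (it == entries_.end()) return result;  // unknown service: no candidates
        for (const Replica& r : it->second) {
            if (wanted == AccessMode::ReadOnly || r.mode == AccessMode::Update) {
                result.push_back(r.connectString);
            }
        }
        return result;
    }

private:
    std::map<std::string, std::vector<Replica>> entries_;
};
```

Because the application only ever names the logical service, replicas can be added, reordered or removed in the look-up source without touching application code.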

Database Authentication

Different requirements in different areas of physics computing:

- Online databases: controlled environment, few applications, users and roles
  - User/password pairs are still manageable
  - Integration between online and offline accounts is an issue
- Offline data production: reconstruction/simulation
  - More users (and roles); the application environment is shared with the grid
  - DB passwords are still widely used, but with security issues
- Grid jobs: reprocessing, end-user analysis
  - Large user community, spanning many organisational boundaries
  - User/password pairs cannot be shipped with a job to a worker node, nor be maintained for a large virtual organisation
  - Certificates (X.509 proxy certificates) are used by scientific grids; support from commercial vendors is still missing
- CORAL supports the above via a set of plug-ins, coupling to grid services as required

Database Authorisation

- After successful authentication, CORAL matches the application's read/write or update requests against the roles available to the user
- CORAL enables the appropriate database privileges for a database table or schema based on the VOMS roles in the user certificate
- This allows general applications to be developed while ensuring proper data protection according to the VO access model
- Administration of database roles can be based on the existing VO administration tools (VOMS)
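The matching of a request against the user's roles can be sketched as a simple policy check. This is a hedged illustration only: `RolePolicy`, `Request` and the role names are hypothetical, and how CORAL actually maps VOMS roles to database privileges is not described in these slides.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

enum class Request { Read, Update };

// Hypothetical policy for one schema: role name -> requests that role may perform.
class RolePolicy {
public:
    void grant(const std::string& role, Request request) {
        assert(!role.empty());
        grants_[role].insert(request);
    }

    // True if at least one of the user's roles permits the request.
    bool permits(const std::set<std::string>& userRoles, Request request) const {
        for (const std::string& role : userRoles) {
            auto it = grants_.find(role);
            if (it != grants_.end() && it->second.count(request) > 0) return true;
        }
        return false;  // no role of this user grants the request
    }

private:
    std::map<std::string, std::set<Request>> grants_;
};
```

In this sketch the set of role names would come from the user's certificate, and the grants would be maintained with the VO administration tools rather than in application code.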

Building Reliable Applications and Services with CORAL

- The reliability of a composite service depends crucially on its core services (e.g. the DB server), but can easily exceed theirs if valid retry and fail-over mechanisms are used
  - Unfortunately, this is often added only late in application development, or only incompletely
  - Many different retry and fail-over strategies complicate the deployment of DB applications
- CORAL implements a consistent retry and fail-over strategy
  - which can be parametrised according to the application's needs
  - parameters are kept in one place (e.g. service look-up file or DB catalog)
- This does not completely eliminate the need for specialised handling of error conditions during data manipulation
  - but it covers the most frequent problems in obtaining a database connection (e.g. on an overloaded or temporarily unavailable DB server) without losing the execution state of an application/service
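A retry and fail-over strategy of the kind described here can be sketched as a loop over replicas with a bounded number of attempts on each. The code below is a minimal illustration under assumptions, not CORAL's implementation: `RetryPolicy`, `connectWithFailover` and the `tryConnect` callback are hypothetical names, and the real parameters and their defaults are not given in the slides.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Retry parameters kept in one place (hypothetical names).
struct RetryPolicy {
    int attemptsPerReplica;               // tries on one replica before failing over
    std::chrono::milliseconds retryWait;  // pause between tries on the same replica
};

// Try each replica in order of preference; retry a replica a bounded number
// of times before moving on to the next one. `tryConnect` stands in for a
// real connection attempt and returns true on success.
// Returns the connect string that worked, or an empty string if all failed.
std::string connectWithFailover(
    const std::vector<std::string>& replicas,
    const RetryPolicy& policy,
    const std::function<bool(const std::string&)>& tryConnect) {
    assert(policy.attemptsPerReplica > 0);
    for (const std::string& replica : replicas) {
        for (int attempt = 1; attempt <= policy.attemptsPerReplica; ++attempt) {
            if (tryConnect(replica)) return replica;
            // Wait before retrying the same replica, but not before failing over.
            if (attempt < policy.attemptsPerReplica) {
                std::this_thread::sleep_for(policy.retryWait);
            }
        }
    }
    return std::string();
}
```

The caller keeps its execution state throughout: it only sees either a usable replica or an empty result, which is the point of handling retry and fail-over below the application.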

Summary

- Since the last CHEP, CORAL has passed through a phase of active development
  - The user API has been very stable, but many DB deployment improvements have been implemented
  - Client-side connection pool, LFC-based service look-up and authentication, Python binding and DB copy tools
- The package is widely used by the Persistency Framework and experiment code, is well integrated with the database and grid services, and provides a complete foundation for database applications
- Next steps and future improvements
  - Remove the dependency on SEAL from the Persistency Framework
  - Investigate a multi-threaded CORAL server to further improve scalability and security for large-scale LHC database deployments
  - Existing CORAL applications are expected to profit with minimal or no code changes