Objectivity Data Migration
"after C5" presentation, 18th Oct 2002
Marcin Nowak, CERN Database Group, http://cern.ch/db
Overview
  - Objectivity/DB current usage
  - Data that needs to be preserved
  - Replacement storage solution proposal
  - Resource requirements for the migration
      - Software/hardware/manpower
  - Software development
  - Hardware setup for the migration
  - Summary
"after C5" 18.10.2002, Marcin Nowak, CERN DB group
Objectivity/DB at CERN
  - Objectivity/DB maintenance contract for CERN ended in 2001
      - Result of the change of persistency baseline of LHC experiments
  - Objectivity can be used indefinitely, but:
      - No bug fixes
      - No new releases
      - No support for new versions of the operating system
      - No support for new compilers
  - Increasingly difficult to support Objectivity applications
  - DB group plans to discontinue the support in 1H 2003
Objectivity Databases in CERN experiments
  - LHC experiments
      - So far using test event data: no need to preserve it
      - Willing to wait for a new solution provided by LCG (POOL)
      - Measurement data (e.g. ECAL)
          - Migration to relational databases (JDBC, MySQL)
  - COMPASS
      - 300 TB of event data (+10% reconstructed events)
      - Will collect data in 2003-2004
      - 0.5-1 PB in total
  - HARP
      - 30 TB event data
Migration Project Scope
  - Scope:
      - Migrate HARP and COMPASS physics data
      - Migrate to a new persistency technology
      - Migrate to new tape media
  - COMPASS migration seen as more urgent
      - It has to be finished before 2003 data-taking starts
      - HARP migration will follow afterwards
  - Migration task stages:
      - Developing a new storage system
      - Adapting the experiments' software frameworks
      - Migrating data
Data Characteristics
  - The databases of both experiments share basic design principles:
      - Raw events as BLOBs in original online format (DATE)
          - Database files are stored in Castor
      - Conditions data is kept in ConditionsDB (CDB)
      - Limited metadata (RUN information, event headers)
  - COMPASS stores reconstructed events (DST) as persistent objects
  - Physics analysis data is stored outside Objectivity
Proposed New Data Storage System
  - Hybrid solution based on a relational database and flat files
  - Preserving essential features of the current system
      - Navigational access to events and reconstructed events
  - Why not something else?
      - LCG POOL
          - The same hybrid store principle, but not ready yet
      - ROOT
          - Does not provide navigation
          - No benefit in using ROOT for large DATE BLOBs
          - A database is still needed for metadata
Data Storage - Details
  - Raw events
      - Original DATE format, flat files kept in Castor
  - Metadata
      - Mainly event headers and navigational information for raw and reconstructed events, in a relational database
      - 0.25% of the raw data volume
      - At CERN stored in Oracle (but without Oracle-specific features)
      - Possibility of another database in the outside institutes
  - Conditions data
      - Migrated to the new CDB implementation based on Oracle
      - No interface change (abstract interface)
  - DST
      - Object streaming to files
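The hybrid layout described on this slide can be illustrated with a minimal sketch. It is not the project's actual schema: SQLite stands in for Oracle, a local temporary file stands in for a Castor flat file, and the table name `event_catalog`, its columns, and the function names are all hypothetical. The point is only that the relational side holds small navigational records (run, event, file, offset, size), while the raw BLOBs stay untouched in a flat file.

```python
import os
import sqlite3
import tempfile


def create_catalog(conn):
    # Hypothetical metadata table: one small row per raw event.
    conn.execute(
        "CREATE TABLE event_catalog ("
        " run INTEGER, event INTEGER, file_name TEXT,"
        " byte_offset INTEGER, byte_size INTEGER,"
        " PRIMARY KEY (run, event))"
    )


def store_events(conn, file_name, run, blobs):
    # Append each raw BLOB to the flat file and record where it landed.
    with open(file_name, "ab") as f:
        f.seek(0, os.SEEK_END)
        for event, blob in enumerate(blobs):
            offset = f.tell()
            f.write(blob)
            conn.execute(
                "INSERT INTO event_catalog VALUES (?, ?, ?, ?, ?)",
                (run, event, file_name, offset, len(blob)),
            )
    conn.commit()


def read_event(conn, run, event):
    # Navigational access: look up the location, then read only that slice.
    row = conn.execute(
        "SELECT file_name, byte_offset, byte_size FROM event_catalog"
        " WHERE run = ? AND event = ?",
        (run, event),
    ).fetchone()
    if row is None:
        raise KeyError((run, event))
    file_name, offset, size = row
    with open(file_name, "rb") as f:
        f.seek(offset)
        return f.read(size)


def demo():
    conn = sqlite3.connect(":memory:")
    create_catalog(conn)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "run1.raw")
        store_events(conn, path, 1, [b"first", b"second-event", b"third"])
        return read_event(conn, 1, 1)
```

Because the sketch uses only plain SQL, the catalog could in principle live in a different relational database, which is in the spirit of the "without Oracle-specific features" bullet.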
Software Checklist
  - Database servers
      - Installation and configuration
      - Implementation of relational data models
  - Migration application
      - Reading from Objectivity
          - Based on the current experiments' software framework
      - Writing to Oracle and Castor
          - Reusable in the new framework
  - Experiments' software framework
      - Adaptations to the new storage system
          - Using the same interfaces: transparent to the user
      - Validation of the new system
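As a rough illustration of "using the same interfaces", the sketch below puts an old-style and a new-style store behind one abstract interface, so that a migration loop and a validation check can be written once against that interface. Everything here is an assumption made for illustration: the class names, the method names, and the in-memory backends, which merely stand in for Objectivity and for the Oracle plus Castor combination.

```python
from abc import ABC, abstractmethod


class EventStore(ABC):
    """Hypothetical interface that user code sees, whatever the backend."""

    @abstractmethod
    def put_event(self, run, event, payload): ...

    @abstractmethod
    def get_event(self, run, event): ...

    @abstractmethod
    def keys(self): ...


class LegacyStore(EventStore):
    # Stand-in for the old object database: one stored object per event.
    def __init__(self):
        self._objects = {}

    def put_event(self, run, event, payload):
        self._objects[(run, event)] = bytes(payload)

    def get_event(self, run, event):
        return self._objects[(run, event)]

    def keys(self):
        return sorted(self._objects)


class HybridStore(EventStore):
    # Stand-in for the new store: an index plus one flat byte buffer.
    def __init__(self):
        self._index = {}
        self._flat = bytearray()

    def put_event(self, run, event, payload):
        self._index[(run, event)] = (len(self._flat), len(payload))
        self._flat += payload

    def get_event(self, run, event):
        offset, size = self._index[(run, event)]
        return bytes(self._flat[offset:offset + size])

    def keys(self):
        return sorted(self._index)


def migrate(source, target):
    # Read every event through the old interface, write through the new one.
    for run, event in source.keys():
        target.put_event(run, event, source.get_event(run, event))


def validate(source, target):
    # The migration counts as valid only if every event reads back identically.
    return source.keys() == target.keys() and all(
        source.get_event(*key) == target.get_event(*key) for key in source.keys()
    )


old = LegacyStore()
old.put_event(1, 0, b"raw-a")
old.put_event(1, 1, b"raw-bb")
new = HybridStore()
migrate(old, new)
```

Code that only calls `get_event` does not need to know which backend it was handed, which is one way to read "transparent to the user".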
Software Development Status
  - Existing components:
      - A prototype of the COMPASS migration software
      - Migration of a fragment of 2001 COMPASS data on a reference hardware setup
          - 0.5 TB of data migrated
          - Used as performance evaluation of the system components
          - Serving as prototype of the new storage system
  - Expected COMPASS migration software delivery date
      - Middle of November
  - Planned start on HARP migration software
      - January 2003
Manpower
  - IT/DB is involved in the migration project to a large extent
      - Planning activities started early this year
      - Several people (IT/DB/PDM section) working on the project (COMPASS, HARP)
      - Coordinating with the experiments and with other IT groups
          - Servers: ADC group
          - Castor: DS group
  - IT/DB is working closely with COMPASS experts (M. Lamanna, B. Gobbo, V. Frolov)
  - Expecting an expert as well!
Data Processing Diagram
  [Figure: data processing diagram; legible labels: LOG, STAGE, RFIO, OBJY, ORACLE DB]
Hardware Requirements
  - Resource estimation for migrating 350 TB in 50 days, assuming 100% efficiency
      - 25 Castor disk servers (10 input servers, also used as CPU servers for the migration application, and 15 output servers), 0.5 TB disk space each
      - 9 dedicated input tape drives (9940)
      - 5 dedicated output tape drives (9940B)
      - Network bandwidth: 300 MB/s total throughput
      - 350 TB borrowed storage space in Castor
          - Space can be obtained by purchasing the COMPASS 2003 tape contingent in advance
      - 5 Oracle9i servers for metadata
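The headline numbers on this slide can be cross-checked with a small back-of-the-envelope calculation. The sketch assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and an even split of the load across drives; the function names are hypothetical. It does not attempt to derive the 300 MB/s network figure, which is an aggregate quoted on the slide.

```python
TB = 10**12  # decimal terabyte, an assumption
MB = 10**6   # decimal megabyte, an assumption
SECONDS_PER_DAY = 86_400


def sustained_rate_mb_s(volume_tb, days, efficiency=1.0):
    """Average rate in MB/s needed to move volume_tb within the given days."""
    useful_seconds = days * SECONDS_PER_DAY * efficiency
    return volume_tb * TB / useful_seconds / MB


def days_needed(volume_tb, rate_mb_s, efficiency=1.0):
    """Days needed to move volume_tb at rate_mb_s with the given efficiency."""
    return volume_tb * TB / (rate_mb_s * MB * efficiency) / SECONDS_PER_DAY


rate = sustained_rate_mb_s(350, 50)  # roughly 81 MB/s read from tape
per_input_drive = rate / 9           # share of each of the 9 input drives
per_output_drive = rate / 5          # share of each of the 5 output drives

print(f"sustained rate:   {rate:.1f} MB/s")
print(f"per input drive:  {per_input_drive:.1f} MB/s")
print(f"per output drive: {per_output_drive:.1f} MB/s")
```

Calling `days_needed(350, rate, efficiency=0.7)` shows how quickly the schedule stretches once the 100% efficiency assumption is relaxed.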
Time Outline for the Migration
  - COMPASS migration has to be finished before 2003 data taking starts
  - But it can only start when the new tape drives are available
      - The drives should arrive at the beginning of November
      - The ALICE data challenge will use all new drives for 1 month
      - Start in December 2002?
      - End in February/March 2003?
  - HARP migration will follow immediately, on the same setup
      - ~1 week
Summary
  - Preparations for the migration are progressing well
      - COMPASS prototype software is working
      - Resource allocation has been agreed upon
      - We should be able to start as soon as the new tape drives are available
  - The migration of 350 TB will take a couple of months
      - Successful conclusion only after experiments validate the new store
  - The tapes would have to be copied anyhow!
      - To change to a new tape technology
  - A working document describing the migration plan in more detail is available at:
    http://wwwinfo.cern.ch/db/objectivity/compassmigration/CompassMigration.pdf
Questions