Data & Storage Services
CERNBox + EOS: Cloud Storage for Science
CERN IT Department, CH-1211 Geneva 23, Switzerland, www.cern.ch/it
Presenter: Luca Mascetti. Thanks to: Jakub T. Mościcki, Andreas J. Peters, Hugo G. Labrador, Massimo Lamanna
CERN/IT-DSS
Content
- What we have done
- What we do
- What we will do
The origins of the CERNBox project
- Missing link? 4500 distinct IPs in DNS lookups from cern.ch to *.dropbox.com (daily...)
- What we are missing:
  - easy-access cloud storage for end users: files go automatically to the cloud and are available always and everywhere (broken laptop, no data lost)
  - offline access to data: work on the plane and rsync when back online
  - keep files in sync across devices, with access on mobile clients
  - (easy) sharing of files with colleagues: still surprisingly difficult
- Can we have this?
  - for documents (small files, often ppts, text, ...)
  - for science data (integrated into data processing workflows and existing infrastructure)
Original architecture (CERNBox beta service)
[Architecture diagram: USER with sync client (webdav) and web access (https) goes through an HTTPS load balancer to owncloud application servers (AS OC); data and metadata flows shown separately. Image courtesy of www.phdcomics.com]
- Setup: 100% RH6 on standard hardware, based on owncloud, guaranteed failover (redundant nodes)
- Front end: HTTPS load balancer (Apache, PHP 5.4 from SCL 1.0, mod_proxy_balancer), 64 cores, 64GB RAM
- DB: MySQL server, 48GB RAM; keeps track of the sync state for every file in the system; SQL overheads (Hz metadata ops)
- STORAGE: NFS servers with a POSIX filesystem, async, SW RAID 1; files not exposed directly to the user; initial space: 20 TB
Usage of the beta service (CERNBox Beta 2014)

          March    April   May     June    October
  users   190 (*)  285     361     429     720
  files   191K     907K    1.6M    2.7M    6.4M
  size    480GB    1TB     1.5TB   1.9TB   3.4TB

(*) users inherited from the initial prototype deployment

Size per user (avg ~5GB): 84% below 10GB, 15% above 10GB, 1% up to 100GB
Files per user (avg ~10K files): 94% below 5K, 5% between 5K and 20K, 1% up to 100K
File access patterns
- GET/PUT ratio: 2/1
- File type distribution: 1200 different file extensions!
  - 30% .c .h .C
  - 30% .jpg .png
  - 15% no extension (UNIX world!)
  - 25% other: .pdf, .txt, .ppt, .docx, .root, .py, .eps, .tex
- ~100 URL shares, ~40 synced shares
- UNICODE filenames: Greek, Russian, Thai(?)
Pilot limitations
- Move: on the origin client a move is propagated to the server; on the other clients it is propagated as COPY/DELETE (suboptimal)
- Symlinks are not supported
- Ignored files: names containing characters such as : ? * < > (see the sketch below)
- We currently recommend a single sync folder setup: ~/cernbox
- High per-file overhead: expect 2-5 Hz PUT, ~10 Hz GET
- Transfer rates: expect 10-30 MB/s download, 5-10 MB/s upload
- Larger files: a 400MB file on a standard desktop reaches ~25 MB/s https upload and ~60 MB/s https download
- For wireless devices, laptops and phones, do we care about transfer rates?
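As an illustration only (not the actual sync-client code), a minimal sketch of a check that skips file names containing the ignored characters listed above:

```python
# Characters the pilot refuses to sync, per the list above (illustrative subset)
IGNORED_CHARS = set(':?*<>')

def is_syncable(filename: str) -> bool:
    """Return True if the file name contains none of the ignored characters
    and would therefore be picked up by the sync client."""
    return not (set(filename) & IGNORED_CHARS)

# Example: is_syncable("report.pdf") -> True, is_syncable("draft?.txt") -> False
```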
Towards large-scale data sync and share
- The currently deployed CERNBox beta works OK so far for the classical Dropbox use case: low-frequency document sync and share
- But can we bring this system to the next level? Our core business and large-scale workloads need to:
  - expose PBs of existing data from day 1
  - integrate into the physics data processing eco-system: central services, batch, interactive data analysis applications
  - sync higher data volumes at higher rates
- Can we still keep the simplicity of cloud storage access?
Massive scaling at reduced cost?
- No need to keep track of all files and directories in the database: this avoids explosive growth of your DB infrastructure
- Our file number estimate? With 10K users we have 2.5 billion files in AFS already! What is your number for 100K users?
- Before we start throwing hardware at the problem, consider the cost of running the service:
  - Fixed: hardware purchase, service deployment, infrastructure
  - Scaling: hardware incidents, user support, backup, integrity checks, upgrades
  - Infrastructure: space, electricity and cooling in the data center
- For massive scaling we need to keep TCO under control: profit from existing large-scale operations and support of our storage services, and exploit economies of scale
Integration (started in May 2014)
- Functionality: enable sync and share for existing data in EOS, without exporting data to another storage; direct access to data with efficient sync behind it
- Operations: the NFS/async backend server is a temporary solution; EOS offers virtually unlimited cloud storage for end users; fold the operation cost into EOS
- But: integrate as transparently as possible (most users don't care about the storage backend) and deliver a fully working solution compatible with owncloud clients (we don't want to end up with a half-working CERN-specific solution)
EOS Integration Details
- Understanding the sync protocol and its underlying semantics
- Adding a few consistency features to EOS (e.g. atomic upload; see the sketch after this list)
- Adding a few new features to EOS or lifting restrictions (e.g. UTF8 support)
- Beefing up the webdav endpoint to allow owncloud clients to talk directly to it
- Integrating web access and sharing functionality
  - Web frontend: develop new plugins
  - Nice integration of trashbin, versions and sharing: a fusion between the owncloud model and the EOS model (Hugo G. Labrador)
- Making the less stressed parts of EOS (http/webdav) more robust
- Lots, lots of testing.
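As an illustration of the atomic-upload idea, here is a minimal sketch, not the EOS implementation itself, of how an upload can be made atomic over plain WebDAV: write to a temporary name, then MOVE it into place so readers never see a partial file. The endpoint, credentials and temp-name convention are assumptions.

```python
import uuid
import requests

def atomic_put(session, base_url, remote_path, local_path):
    """Upload local_path so that remote_path only appears once the content
    is complete: PUT to a temporary name, then MOVE it into place."""
    tmp_path = remote_path + ".upload-" + uuid.uuid4().hex  # hypothetical temp-name convention
    with open(local_path, "rb") as f:
        session.put(base_url + tmp_path, data=f).raise_for_status()
    # WebDAV MOVE renames the completed temporary object onto the final name
    session.request(
        "MOVE", base_url + tmp_path,
        headers={"Destination": base_url + remote_path, "Overwrite": "T"},
    ).raise_for_status()

# Example with hypothetical endpoint and credentials:
# s = requests.Session(); s.auth = ("username", "password")
# atomic_put(s, "https://cernbox.example.cern.ch/remote.php/webdav",
#            "/Documents/report.pdf", "report.pdf")
```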
CERNBox 2.0 Architecture
[Architecture diagram: USER with sync client (webdav) and web access (https) goes through HTTPS load balancers; data flows directly between the user and the storage, metadata flows through the owncloud front ends]
- Data directly accessible by the user: http (public data), https (private data), http (internal)
- KHz metadata ops; all sync state kept as metadata in the storage
- OC front ends access the storage via fuse
- STORAGE (EOS): files written with USER credentials; namespace node redirects IO to disk servers (1000s)
Prototype deployment on EOSPPS
- /eos/user/<u>/<username>: this is the default sync- and web-enabled folder (see the path sketch below)
- as an advanced user you may add an arbitrary folder from EOS
- very easy to implement a folder shared by an e-group
- we can also allow transparent access to different instances
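A minimal sketch of the folder convention above, assuming <u> stands for the first letter of the username:

```python
def default_cernbox_folder(username: str) -> str:
    """Return the default sync- and web-enabled EOS folder for a user,
    following the /eos/user/<u>/<username> layout (with <u> assumed to be
    the first letter of the username)."""
    if not username:
        raise ValueError("empty username")
    return f"/eos/user/{username[0]}/{username}"

# Example: default_cernbox_folder("jdoe") -> "/eos/user/j/jdoe"
```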
First performance numbers
- User-perceived performance (client)
- Metadata operations (pycurl with SSL sessions; see the sketch below):
  - PROPFIND with 1 entry: 90 Hz
  - PROPFIND with 1K entries: 8.5 KHz
  - PROPFIND with 10K entries: 10 KHz
- Nice speed: e.g. a kernel source tree upload (50K files, 500MB) takes ~1h from a laptop on wifi at home, download ~20 min
[Bar chart: ops/s for small files (10KB), Download/Upload/Delete, comparing sequential pycurl, pycurl with parallelism P=10 and P=50, and the owncloud sync client]
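For reference, a minimal sketch of the kind of PROPFIND timing probe described above: pycurl with one reused handle, so the TCP connection and SSL session are kept alive across requests. The endpoint URL, credentials and request depth are assumptions.

```python
import io
import time
import pycurl

URL = "https://cernbox.example.cern.ch/remote.php/webdav/"  # hypothetical endpoint
USERPWD = "username:password"                               # hypothetical credentials

PROPFIND_BODY = ('<?xml version="1.0"?>'
                 '<d:propfind xmlns:d="DAV:"><d:allprop/></d:propfind>')

def propfind_rate(curl, n_requests=100):
    """Issue n_requests PROPFIND calls on one reused pycurl handle (keeping
    the TCP connection and SSL session alive) and return requests per second."""
    start = time.time()
    for _ in range(n_requests):
        buf = io.BytesIO()
        curl.setopt(pycurl.URL, URL)
        curl.setopt(pycurl.USERPWD, USERPWD)
        curl.setopt(pycurl.CUSTOMREQUEST, "PROPFIND")
        curl.setopt(pycurl.HTTPHEADER, ["Depth: 1", "Content-Type: application/xml"])
        curl.setopt(pycurl.POSTFIELDS, PROPFIND_BODY)
        curl.setopt(pycurl.WRITEDATA, buf)
        curl.perform()
    return n_requests / (time.time() - start)

c = pycurl.Curl()
print("PROPFIND rate: %.1f Hz" % propfind_rate(c))
c.close()
```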
Summary
- Working and usable beta service: useful for getting experience, user feedback, and understanding what we want / don't want in the final production system based on EOS
- Advanced integration of CERNBox into EOS will open up new possibilities, but there is no free lunch: we will have to adapt to evolving owncloud clients, etc.
- Heading towards a large sync and share layer for science research:
  - all our data exposed from day 1
  - massive scalability, high performance
  - integrated into existing workflows - new capabilities!
  - small overhead on top of our existing operations and development (TCO control)
  - and still as easy to use as Dropbox.com
Integrated storage ecosystem for scientific research
[Diagram: CERNBox 2.0 for the USER, analysis clusters and central services, all on top of LARGE-SCALE STORAGE]
- sync / share / offline access: CERNBox 2.0 (webdav & https://)
- online file-system access: fuse
- high-performance application access: xrootd://
- batch access: xrdcopy
(A small access-protocol sketch follows below.)
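To make the multi-protocol picture concrete, here is a small hedged sketch that fetches the same file over https/WebDAV (as sync and web clients do) and copies it with the xrootd command-line copy tool (as a batch job might); all host names, paths and credentials here are assumptions.

```python
import subprocess
import requests

# Hypothetical names, for illustration only
EOS_PATH   = "/eos/user/j/jdoe/data/run001.root"
HTTPS_BASE = "https://cernbox.example.cern.ch/remote.php/webdav"
XROOTD_URL = "root://eosuser.example.cern.ch/" + EOS_PATH

# 1) Interactive / sync-style access over https (WebDAV GET)
resp = requests.get(HTTPS_BASE + EOS_PATH, auth=("username", "password"))
resp.raise_for_status()
with open("run001_https.root", "wb") as out:
    out.write(resp.content)

# 2) Batch-style access with the xrootd copy tool
subprocess.run(["xrdcp", XROOTD_URL, "run001_xrootd.root"], check=True)
```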
Agenda full, ~35 participants
- Tracks: Keynote (B. Pierce), Technology, Users, Site reports, Vendor talks
- Vendor talks: IBM, Powerfolder, SeaFile, PyDio, Owncloud
CERNBox 2.0: some numbers
- Advanced prototype stage: adapted the existing webdav interface in EOS to be compatible with owncloud sync clients
- Test environment (EOSPPS), standard hardware:
  - namespace node with Xeon 2.2GHz, 16 cores, 24GB RAM
  - 50 disk servers: cheap JBODs (1000 disks), total 800TB usable space
  - storage layout: 2 replicas in RAIN mode, so every file PUT = 2 copies of the file on two independent storage nodes (with adler32 checksums of the content; see the sketch below)
  - event-based http(s) load balancer (nginx)
- Underlying storage scalability (EOS Prod):
  - max observed IO: ~40GB/s on a single instance (eosatlas)
  - max observed file stats: 10s of KHz
  - thousands of connected clients
- The server should never be the bottleneck for CERNBox
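As a reference for the checksum mentioned above, a minimal sketch of computing an adler32 checksum of a file's content in streaming fashion (how a given EOS instance formats and stores the value is not specified here and would be an assumption):

```python
import zlib

def adler32_of_file(path, chunk_size=1 << 20):
    """Compute the adler32 checksum of a file by streaming it in chunks,
    so arbitrarily large files can be checksummed in constant memory."""
    checksum = 1  # adler32 starts from 1
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            checksum = zlib.adler32(chunk, checksum)
    return format(checksum & 0xFFFFFFFF, "08x")

# Example: print(adler32_of_file("run001.root"))
```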