Technical Case Study

CERN, the European Organization for Nuclear Research

How CERN helps physicists unlock the secrets of the universe by running critical operations on a foundation of Oracle databases on NetApp storage.

Following the Data to Knowledge and Discovery

The physicists at CERN strive to expand humankind's general understanding of our world, pushing beyond the boundaries of knowledge to fathom the secrets of the universe. Driven by curiosity and a quest for pure knowledge, CERN's scientific community conducts fundamental research that follows the data wherever it leads in the search for clues and discoveries about how the universe works.

But that does not mean that CERN research is without practical and often revolutionary application in everyday life. In 1989, for example, Tim Berners-Lee, a scientist at CERN, invented the World Wide Web, conceived and developed to meet the demand for automatic information sharing among the global high-energy physics community. CERN also served as the incubator for capacitive touch screens, invented in 1973 by Bent Stumpe and colleagues and originally put to use in the control room of the CERN SPS accelerator. Those innovations (applied research spin-offs, if you will) have transformed modern communications.

The work at CERN

In addition to seeking answers to questions about the universe, the CERN community works to:

- Encourage global collaboration, bringing nations together through science.
- Educate, providing advanced worker training and building enthusiasm for physics among the next generation of scientists.
- Advance the frontiers of technology, cooperating with industry to bring forward new technologies.
Researching the building blocks of the universe

CERN provides some of the world's most technologically advanced facilities for researching the basic building blocks of the universe. Facilities include particle accelerators and specialized machines to help prove the existence of exotic forms of matter. Research at CERN facilities falls into three major areas of study:

- The origin of mass. Research in this area includes searching for the Higgs particle, a hypothetical elementary particle predicted by the Standard Model (SM) of particle physics. The Higgs particle belongs to a class of particles known as bosons and is considered the key to explaining why particles have mass.
- Dark matter. Galaxies behave as if they have more mass than can be observed. Theories suggest that there is a partner to every existing particle in the SM. Called supersymmetric particles, these particles could be the unseen dark matter.
- The Big Bang. What happened just after the beginning of the universe? Theorizing that the universe contained a hot, dense mixture of quarks and gluons (called quark-gluon plasma), scientists want to recreate similar conditions to analyze the properties of that mixture.

About the Large Hadron Collider

The CERN complex hosts a succession of particle accelerators, each able to reach increasingly higher energies. The latest addition to the complex is the Large Hadron Collider (LHC), the world's largest and most powerful particle accelerator. The CERN Control Centre near Geneva, Switzerland, houses all of the controls for the accelerator, its services, and technical infrastructure.

The LHC, launched in 2008 and installed about 100 meters underground, forms a 27-kilometer circle that spans the border between France and Switzerland. The ring consists of superconducting magnets with a number of accelerating structures that boost the energy of particles.
Traveling in opposite directions in separate pipes, beams inside the LHC are guided around the accelerator by a magnetic field achieved with superconducting magnets. The magnets are pre-cooled with liquid nitrogen and then filled with liquid helium to bring them to a colder-than-outer-space temperature of about -271°C. Beams are directed to collide around the ring at points coinciding with the location of LHC particle detectors. International collaborations currently run four distinct big experiments, each characterized by its unique particle detector, to study LHC collisions and the properties of matter produced in those collisions.

The LHC creates 600 million collisions per second, producing raw data at the rate of 1 million gigabytes per second. Software converts that raw data to readable data objects for later event analysis. Current experiments produce more than 20PB of new data annually, helping CERN scientists push knowledge forward and answer questions about the fundamental laws of nature.

Information Technology department role

CERN's Information Technology department manages the IT support infrastructure for a staff of about 2,500 and a global research community of more than 10,000 scientists and students representing 608 universities and 113 nationalities. Responsibilities of the CERN scientific and technical staff include designing, building, and ensuring the smooth operation of particle accelerators as well as preparing, running, analyzing, and interpreting data gathered from scientific experiments.
The department provides access to a broad array of IT services and data to a demanding scientific community that comprises nearly half of the world's particle physicists. "They will turn the knob until it breaks," remarks Frédéric Hemmer, head of CERN's IT department. "But addressing the challenges our users present is part of what makes life here at CERN so enjoyable. We're constantly adapting IT, even on a weekly basis, to facilitate collaboration and communication and to handle the increasing rate and scale of incoming experimental data."

Balancing Demands for Performance, Scalability, and Reliability with Cost Constraints

The big science being done at CERN introduces equivalently big data management challenges. IT has to anticipate the needs of inventive users conducting experiments with often-unpredictable requirements. To keep pace, Hemmer and his team must be innovators themselves, rapidly and efficiently delivering IT solutions that empower the CERN research community. CERN IT delivers this functionality while facing the universal challenge of providing more services with limited funding and the same or decreasing data center and administrative resources.

In choosing foundational elements of the IT infrastructure technology stack, CERN continually balances technical demands for performance, reliability, and scalability against constant financial constraints. Within the IT team, Database Services is responsible for both the foundational database and the associated storage technologies.

CERN has used Oracle databases and tools for many years. Today, Oracle technology is used throughout the organization and plays a critical role in accelerator control systems, engineering and administrative applications, and LHC experiments. Oracle technology delivers requisite functionality, including high availability, scalability, and performance, with comprehensive tools for data distribution, protection, and manageability.
On the data storage side, essential requirements include manageability, availability, and scalability to respond to fast-changing or unexpected requirements. For example, heavy lead ions cause especially complicated collisions that can make estimating data rates an inexact science. In one case, incoming data rates were five times higher than predicted.

Hemmer further quantifies: "Data can come into our computer center at rates up to 6GB per second; that's equivalent to the contents of two DVDs every three seconds. Our job is to ensure that that data is readable and permanently available to our community of physicists. Data is our existence. Our biggest challenge is handling the volume and rate of data growth."

Delivering an agile data infrastructure that is intelligent, immortal, and infinite

Hemmer's team must build an agile data infrastructure that can: 1) deliver rapid impact through intelligent data management; 2) deliver immortal data availability, including nondisruptive upgrades to leverage technology advances without introducing downtime to the CERN instruments and scientific activities that run 24/7/365; and 3) provide nearly infinite scaling that will enable storage performance and capacity to grow in lock-step with CERN's research requirements and databases.

In 2007, after a public tender process, CERN selected NetApp technology for the LHC logging database built on an Oracle database with Real Application Clusters (RAC) technology. Since that time, CERN has unified its entire Oracle infrastructure on NetApp and today stores 99% of all Oracle data on NetApp solutions. NetApp's affordable cost of entry with linearly scalable performance and capacity has enabled CERN to grow its storage footprint at the pace of research demand.
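As a rough sanity check on the peak ingest figure Hemmer quotes, the arithmetic can be sketched in a few lines of Python. The 6GB per second rate comes from the text; the disc capacities (4.7GB single-layer, 8.5GB dual-layer) are nominal values assumed here, and the function names are purely illustrative.

```python
# Back-of-the-envelope check of the peak ingest rate quoted in the text.
# The 6 GB/s figure is from the case study; the DVD capacities below are
# nominal values assumed for illustration only.

PEAK_INGEST_GB_PER_S = 6.0   # from the text
DVD_SINGLE_LAYER_GB = 4.7    # assumption: nominal single-layer DVD
DVD_DUAL_LAYER_GB = 8.5      # assumption: nominal dual-layer DVD


def volume_gb(rate_gb_per_s: float, seconds: float) -> float:
    """Data volume accumulated at a constant rate over a time window."""
    return rate_gb_per_s * seconds


def discs_filled(total_gb: float, disc_capacity_gb: float) -> float:
    """Number of discs of the given capacity that total_gb would fill."""
    return total_gb / disc_capacity_gb


if __name__ == "__main__":
    window_gb = volume_gb(PEAK_INGEST_GB_PER_S, 3)
    print(f"Data in 3 s at peak rate: {window_gb:.0f} GB")
    print(f"Dual-layer DVDs:   {discs_filled(window_gb, DVD_DUAL_LAYER_GB):.2f}")
    print(f"Single-layer DVDs: {discs_filled(window_gb, DVD_SINGLE_LAYER_GB):.2f}")
```

Under these assumptions, three seconds at the peak rate is 18GB, which is close to two dual-layer discs; the comparison in the quote holds if dual-layer capacity is what was meant.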
Eric Grancher, database services architect within CERN IT, says that NetApp delivers enabling functionality to the Oracle environment: "NetApp's certification with Oracle RAC over NFS is an asset. NetApp also offers distinct functionality, including 10-Gigabit Ethernet [10GbE] support, low-impact snapshot and cloning, the ability to deliver required performance and capacity at an affordable price point [utilizing NetApp Flash Cache intelligent caching with high-capacity SATA disk drives], support for large files [up to 16TB], and, most recently, Data ONTAP operating in Cluster-Mode for more efficient data mobility. We welcome Data ONTAP Cluster-Mode that lets us move data for load-balancing, for moving less-used or inactive data to lower-cost drives, or for technology updates without having to stop the application."

Oracle on NetApp across the organization

Today, nearly 100 Oracle databases run on NetApp storage. The CERN IT department provides Oracle services for:

- LHC control and logging operations
- Online experiments
- Offline experiments
- Administration, including payroll services
- Engineering services

Grancher emphasizes the critical nature of CERN's Oracle databases running on NetApp: "Our Oracle on NetApp infrastructure underpins both physics and business operations at CERN. CERN relies on Oracle databases to keep the LHC online and to maintain availability of our administrative databases; if those systems go down, it impacts the work of hundreds of people. One of the key decisions we made in building a highly reliable infrastructure was to deploy storage that we trust, that is simple to manage, and then layer on top of that. We take care of our storage and count on it to provide a stable service on which to build database and application services."
Figure 1) CERN's LHC and experiment operations. [Diagram labels: LHC Experiments; Raw Data; CASTOR (CERN Advanced STORage Manager) Data; Experiment Online Databases; Experiment Offline Databases; Streams; Tier-1 Centers; Middleware; LHC Operations; Accelerators (ACC); Administrative, IT, and Engineering Databases; IT/DB Group.]
Meeting the Technical Challenges of a Superscale Environment

CERN IT infrastructure services must be continuously available and must be superscalable to keep pace with prodigious data growth.

CERN science never sleeps: keeping the LHC online

Any problem receiving or managing data can bring the system down, stopping the particle beam in the LHC. The powerful tools that monitor and control the LHC are built on Oracle databases running on a NetApp data infrastructure.

Controlling database (ACCCON). This database stores accelerator settings and controls. CERN operators monitor the accelerator 24/7, inputting required configuration changes into the database via control-room screens. Should this database become unavailable for even a few minutes, operators would be unable to control the accelerator and would have to dump the beam (that is, extract the beam into huge graphite blocks to diffuse the beam's energy) to protect the multi-billion-dollar LHC. Out-of-range temperatures, for example, could damage magnets that cost upwards of $1,000,000 each, and complicated repairs could take operations offline for weeks or even months.

Logging database (ACCLOG). This database records input from thousands of sensors in the LHC, maintaining long-term logs of the status of thousands of magnets and all moving parts, including collimators that protect the beams by scraping off-track particles. This largest and fastest growing of the CERN Oracle databases currently contains 4.1 trillion rows of data (126TB) and, because it contains calibration data, is also essential to keeping the LHC online.

Finding a needle in 20 million haystacks

Another key challenge in providing access to CERN's massive stores of experimental data is delivering sufficient performance to Oracle index databases.
Oracle databases running on NetApp manage the metadata that tracks and enables access to raw research data stored in flat files on the CERN Advanced STORage manager (CASTOR) hierarchical storage management system. CASTOR commodity disk farms and tape silos today provide 40PB of capacity. Over each year of the LHC's operation, the four giant detectors observing trillions of elementary particle collisions will accumulate more than 10 million gigabytes of data, equivalent to the contents of about 20 million CD-ROMs. At current recording rates, the CERN physics experiments will generate more than 20PB of new data annually that must be managed by the Oracle databases.

CERN's advances in big data analytics help researchers derive maximum and rapid value from these enormous datasets and will ultimately find application in industry, helping to enhance business outcomes through predictive analyses.
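The CD-ROM comparison above can be checked with simple arithmetic. The sketch below takes the two figures from the text (10 million gigabytes, about 20 million CD-ROMs) and, as an assumption made here for illustration, a nominal CD-ROM capacity of 0.7GB; the names and decimal unit conversions are likewise assumptions.

```python
import math

# Figures from the text.
ANNUAL_DATA_GB = 10_000_000     # "more than 10 million gigabytes"
CDROMS_QUOTED = 20_000_000      # "about 20 million CD-ROMs"

# Assumption for illustration: nominal CD-ROM capacity in GB.
CDROM_NOMINAL_GB = 0.7


def implied_capacity_gb(total_gb: float, disc_count: int) -> float:
    """Per-disc capacity implied by a total volume and a disc count."""
    return total_gb / disc_count


def discs_needed(total_gb: float, disc_capacity_gb: float) -> int:
    """Whole discs needed to hold total_gb at the given capacity."""
    return math.ceil(total_gb / disc_capacity_gb)


def gb_to_pb(gigabytes: float) -> float:
    """Convert gigabytes to petabytes using decimal units (1 PB = 10^6 GB)."""
    return gigabytes / 1_000_000


if __name__ == "__main__":
    print(f"Annual volume: {gb_to_pb(ANNUAL_DATA_GB):.0f} PB")
    print(f"Implied GB per disc: {implied_capacity_gb(ANNUAL_DATA_GB, CDROMS_QUOTED)}")
    print(f"Discs at 0.7 GB each: {discs_needed(ANNUAL_DATA_GB, CDROM_NOMINAL_GB):,}")
```

The quoted figures imply 0.5GB per disc, somewhat below the nominal capacity assumed here, so the comparison is best read as an order-of-magnitude illustration rather than an exact count.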
Keeping up

CERN's IT department also must enable database and storage systems to keep up with staggering data growth. Across CERN today, NetApp provides 901TB of capacity to Oracle databases, and CERN database staff expects capacity requirements to grow rapidly. Accelerator databases are expected to grow by 50TB each year. Such rapid growth demands unprecedented scalability and efficiencies in the CERN database and storage technology stack.

Key enabling technologies to achieve balance for Oracle environments

Grancher says deploying Oracle databases on NetApp enables the Database Services team to balance requirements for efficiency with necessary stability, performance, and scalability. He cites vital functionality:

10GbE offers a known growth path to greater bandwidth plus the cost efficiencies of a widely adopted mainstream technology. Leveraging 10GbE also allows CERN to use the same switches and networking that serve the rest of the lab. That means CERN IT can reduce costs by handing off administration to the networking team that is already staffed to provide 24/7 management and support.

Oracle Direct NFS (dNFS) enables multiple paths to storage. This technology contributes to scalability and, because it bypasses the server operating system, typically doubles the performance of traditional NFS. But just as importantly, dNFS takes Oracle over NFS from simple to extremely simple: the CERN IT staff does not have to worry about how to configure NFS because Oracle generates NFS requests directly from the database.

SATA plus NetApp Flash Cache software makes it possible to achieve performance comparable to FC drives at a much lower price point. An FC solution would have been price-prohibitive at CERN's performance requirements and growth rates.

NetApp FlexClone software enables efficient creation of temporary, writable copies.
CERN required space-efficient Snapshot copies and writable copies of large databases, but also needed to make sure that replication processes did not impact performance. The CERN tender actually specified the maximum impact that creating a specific number of Snapshot copies could have on given workloads.

NetApp Data ONTAP 8 operating in Cluster-Mode makes it possible to maintain peak application performance and storage efficiency by adding storage and moving data without disrupting ongoing operations. In CERN's environment, no application can be stopped, so the infrastructure must deliver continuous availability with nondisruptive upgrade and other administrative operations. Grancher says that Cluster-Mode works particularly well with Oracle over NFS to give CERN needed agility.

How NetApp Participated in Furthering CERN's Mission for Research

Hemmer suggests that the most successful technology deployments occur in the presence of a strong partnership: "We count on our providers to be innovative and proactive, helping to increase our cost effectiveness and use of resources."

Grancher offers an example: "With the rapid growth of the LHC logging database (it expands at 50TB annually), we needed an alternative to our costly FC solution. Moving to SATA would have solved our capacity and cost issues, but we expected performance problems. NetApp's recommendation to put Flash Cache in front let us keep performance at parity."

Figure 2) CERN's NAS-based storage infrastructure. [Diagram labels: Oracle RAC Databases; Storage Interconnect; NetApp FAS Storage Systems; NetApp Disk Shelves.]

Making Oracle Database 11g better

A member since 2005 of the Oak Table network for Oracle scientists, Grancher understands and emphasizes the importance of implementing a storage foundation that enhances database environments. NetApp delivers a single, integrated platform for an agile data infrastructure that is:

Intelligent. Management simplicity helps the CERN IT team more quickly deliver infrastructure to facilitate research. For example, CERN utilizes NetApp FlexVol virtual volumes to simplify provisioning and achieve efficiencies with thin-provisioned volumes. NetApp OnCommand management software also enables automation that reduces human errors.
Says Grancher: "The Oracle over NFS to NetApp storage has simplified how we access and manage data. With the time our database team saves, we're able to offer more services to more users. NetApp has smart tools, and we make good use of them."
Grancher says that Oracle VM server virtualization on NFS is simple, extensible, and stable. In collaboration with Oracle, NetApp developed a Storage Connect plug-in for Oracle VM 3.0. The plug-in simplifies and centralizes management of Oracle Database and application environments by integrating advanced NetApp storage functionality (deduplication and thin-provisioning capabilities, for example) with Oracle VM 3.0.

NetApp technology also enables more efficient data protection and recoverability. Specifically, NetApp lets CERN protect data while avoiding data duplication, provide multiuse datasets without copying, and eliminate duplicate copies of data. "Without NetApp SnapRestore technology," Grancher states, "we'd need weeks to recover just one multiterabyte Oracle Database. NetApp also makes the size of the database irrelevant: we can copy a 1- or 10-terabyte database in seconds and restore it in minutes or hours. It used to take 28 days to restore a 100TB Oracle Database; now it takes 15 minutes. Used in conjunction with Oracle Real Application Testing, SnapRestore technology also lets us quickly replay a workload for testing."

Immortal. Stability of storage is a big asset to the stability of CERN database workloads on top. NetApp RAID-DP technology, redundant components and high-availability-pair controller configurations, and the latest Data ONTAP Cluster-Mode functionality contribute to CERN's ability to build a no-downtime, no-data-loss storage foundation. Grancher points out that NetApp technology has let CERN evolve its Oracle Database solutions with zero downtime: "CERN has never had a downtime outage of SATA drives. Moving from FC SAN to SATA NAS, we've maintained exactly the same level of reliability."
"Since first deploying NetApp storage in 2007, CERN has not lost a single data block on NetApp. We can't overemphasize the importance of this: if CERN databases don't run, the accelerator doesn't run, and physics doesn't function."

Infinite. NetApp has also allowed CERN IT to deliver affordable performance. When the capacity requirements of large-scale Oracle databases made FC-based storage no longer a viable option financially, CERN was able to combine more affordable SATA drives with NetApp Flash Cache to deliver needed capacity without performance sacrifice. Grancher says, "Using Flash Cache with SATA, we're achieving 35,000 IOPS over Ethernet; that's the equivalent performance of 250 disks. If a big part of your workload fits into the cache, response time can drop into the millisecond range versus the milliseconds that would be the standard for SATA alone. We also have flexibility to specify what to cache (for example, we don't cache archive redo logs), and the cache automatically adapts to workloads. That saves time and minimizes errors."

With the pace and scope of data growth at CERN, scalable storage capacity and performance are fundamental. States Grancher, "CERN is much like any other organization managing an OLTP or big data environment. Our IT infrastructure has to be adaptable, reliable, scalable, and efficient, and our staff has to be proactive in integrating technologies and making effective use of limited resources even as we deal with massive data growth. From affordable cost of entry to just-in-time storage expansion, NetApp has allowed us to grow our storage infrastructure in step with our ever-expanding data and research requirements."

Storage-efficiency technologies also help CERN achieve its "keep forever" data strategy. Hemmer says, "When data comes in our computer center, it must be stored permanently. Researchers may want to access data years after it was collected, so we never, never throw away data."
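The performance and recovery figures Grancher quotes lend themselves to a quick worked calculation. The sketch below uses only numbers from the text (35,000 IOPS, 250 disks, 28 days versus 15 minutes, 100TB); the function names and the decimal unit conversion are assumptions made here for illustration.

```python
# Worked arithmetic on figures quoted in the text. Function names and the
# decimal conversion (1 TB = 10^6 MB) are illustrative assumptions.


def iops_per_disk(total_iops: float, disk_count: int) -> float:
    """Per-disk IOPS implied by an aggregate IOPS figure."""
    return total_iops / disk_count


def restore_speedup(old_minutes: float, new_minutes: float) -> float:
    """How many times faster the new restore time is than the old one."""
    return old_minutes / new_minutes


def avg_throughput_mb_per_s(size_tb: float, seconds: float) -> float:
    """Average throughput needed to move size_tb in the given time."""
    return size_tb * 1_000_000 / seconds


if __name__ == "__main__":
    # 35,000 IOPS described as equivalent to 250 disks.
    print(f"Implied IOPS per disk: {iops_per_disk(35_000, 250):.0f}")

    # 28 days versus 15 minutes for a 100TB restore.
    old_minutes = 28 * 24 * 60
    print(f"Restore speedup: {restore_speedup(old_minutes, 15):.0f}x")

    # Average rate implied by copying 100TB over 28 days.
    print(f"Old restore rate: {avg_throughput_mb_per_s(100, 28 * 86_400):.1f} MB/s")
```

The implied figure of 140 IOPS per disk gives a sense of what "equivalent performance of 250 disks" means. The restore speedup should be read with care: a snapshot-based restore presumably reverts to a retained point-in-time image rather than copying 100TB, so the two times measure different kinds of operation.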
A Reliable and Extensible Foundation for Research

Grancher comments on the larger impact of the Oracle on NetApp infrastructure: "Most rewarding for our Database Services team is being able to build something stable, an architecture that's satisfying in terms of results and that's not a one-time solution, but rather a flexible foundation for growth. Our customers (CERN's global community of physicists, students, and staff) can rely on this infrastructure to deliver dependable data access, enable seamless collaboration, and ensure responsive services."

IT footprint: 2X less space, power, cooling (SATA vs. SAS)

Hemmer adds, "We've received a number of spontaneous plaudits from scientists for the way in which our computing infrastructure has contributed to the delivery of physics results. By giving them the tools and data access they need for research, we're helping physicists find those breakthrough clues and make the big discoveries that will have an impact far beyond the bounds of our organization."
About CERN

CERN, the European Organization for Nuclear Research, is one of the world's largest and most respected centers for scientific research. Its business is fundamental physics, finding out what the universe is made of and how it works. At CERN, the world's largest and most complex scientific instruments are used to study the basic constituents of matter: the fundamental particles. By studying what happens when these particles collide, physicists learn about the laws of nature. Founded in 1954, the CERN Laboratory sits astride the Franco-Swiss border near Geneva, Switzerland. It was one of Europe's first joint ventures and now has 20 Member States.

About NetApp

NetApp creates innovative storage and data management solutions that deliver outstanding cost efficiency and accelerate business breakthroughs. Discover our passion for helping companies around the world go further, faster.

Key Products and Technologies

NetApp
- FAS storage systems
- DS4243 disk shelves
- 3TB SATA, 2TB SATA
- 512GB Flash Cache
- Data ONTAP 8
- FlexVol
- FlexClone
- Snapshot technology
- SnapRestore
- OnCommand software
- Thin provisioning
- Large aggregates
- NVRAM
- NFS/CIFS

Oracle
- Oracle Database 11g Enterprise Edition with Real Application Clusters technology and partitioning options
- Oracle Direct NFS
- Oracle Streams
- Oracle VM

Other
- HP ProCurve 10Gb/s Ethernet switches
- IBM Tivoli TSM tape system and TDPO library
- Servers from multiple vendors, all equipped with 10Gb Ethernet

© 2012 NetApp, Inc. All rights reserved. No portions of this document may be reproduced without prior written consent of NetApp, Inc. Specifications are subject to change without notice. NetApp, the NetApp logo, Go further, faster, Data ONTAP, FlexClone, FlexVol, OnCommand, RAID-DP, SnapRestore, and Snapshot are trademarks or registered trademarks of NetApp, Inc. in the United States and/or other countries.
All other brands or products are trademarks or registered trademarks of their respective holders and should be treated as such.