WHITEPAPER The Virtualized Data Center Kris Domich Principal Consultant Dimension Data This document discusses Dimension Data's view of the primary drivers of today's data center design methodology. The areas of virtualization, consolidation, lifecycle management, and security make up the bulk of the business drivers behind data center design and will continue to influence this process for the foreseeable future. This document will discuss those paradigms and how they may be leveraged to create operational efficiencies in the modern-day (and future) data center. November 2005
Contents
1. Background
Macro Trend 1: Consolidation and Virtualization
Business Impacts and Drivers of Consolidation and Virtualization
Macro Trend 2: Information Lifecycle Management
Business Impacts and Drivers of Information Lifecycle Management
Macro Trend 3: Information Security - Business Impacts and Drivers
Dimension Data's Virtualized Architecture
Virtualizing from the Ground Up
Virtual Server Deployment
Virtual Data Center in a Box
Virtual Storage and Transparent ILM
RLM: Policy-Based Provisioning and Decommissioning
Converged Networks: Putting the SAN in the LAN & WAN
Conclusion
List of Figures
Figure 1 - Modular Storage Chassis Allocation
Figure 2 - Data Center Architecture
Figure 3 - Virtual Server Management Framework
Figure 4 - Data Center in a Box
Figure 5 - Storage Virtualization and ILM
Figure 6 - Resource Lifecycle Management
Figure 7 - Convergence with FCIP
Figure 8 - Dimension Data's Transatlantic FCIP over SDH Design
Figure 9 - Convergence with iSCSI
1. Background
The design considerations for the data center have changed considerably in recent years. The dawn of distributed computing, coupled with the need for increased storage and computing capacity, has driven demand for resources that are expensive to procure and costly to maintain. As a result, companies are looking for ways to consolidate infrastructure at many levels - geographic, physical, data, and application. Adding a virtualization layer to abstract these consolidated resources into logical pools adds efficiencies not possible in a highly distributed environment. With increasing amounts of data and equipment moving in and out of data centers, multiple lifecycles need to be managed - Information Lifecycle Management and Resource Lifecycle Management. In recent years the one pervasive influence on all of IT has been security; its importance in stature and its influence on architecture continue to grow. Threats to the data network have generated the most attention in recent years. With network security maturing, the focus is turning to the security of the information asset itself. Given the business drivers for change in data center design, deployment, and operations, Dimension Data believes the following macro trends will highly affect the IT decision-making process:
- Consolidation and Virtualization
- Resource / Information Lifecycle Management
- Information Security
We will address each of these trends in detail.
1.1 Macro Trend 1: Consolidation and Virtualization
Consolidation and virtualization of computing and storage resources have become the most common strategies for data center optimization, and they complement each other well. The act of virtualizing a consolidated environment results in the efficient use of the consolidated resources and can provide a means for containing the potentially recurring problem of resource sprawl.
Consolidated infrastructures are more efficient and typically require less human capital to manage, and the return on the investment made in planning and executing a consolidation strategy is often realized in a minimal amount of time.
Host Virtualization
The concept of virtualization has been around since the 1960s, when IBM virtualized the entire mainframe architecture. The basic concept is simple: present storage and computing resources to consumers (users and applications) as required, and in such a way that each consumer appears to exist in its own discrete environment. This is done by treating storage and computing resources as an aggregate pool and drawing from that pool on an as-needed basis. The virtualization engine is responsible for managing the physical resources and presenting a portion of those resources in a way that appears to be an isolated, physically separate environment. The application of business logic and policy-driven resource allocation ensures that each application is given the appropriate priority in this shared environment. Advanced technologies are even capable of adding or removing capacity from the pool as required. For the better part of four decades, virtualization was used primarily on large systems and not commonly applied to commodity servers. Early implementations on large, open systems required the assignment of specific physical resources to each logical server instance. This is commonly known as partitioning and is not truly a virtualized architecture. Around the mid-to-late 1990s, true virtualization started making an appearance on the common desktop and matured quickly into departmental and enterprise-grade architectures. Virtualization is rapidly becoming the preferred technology for the consolidation of Intel-based systems, and consolidation ratios of 5:1 to over 15:1 are not uncommon.
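The pooling model described above can be reduced to a simple sketch: physical hosts contribute capacity to an aggregate pool, and virtual servers draw from it on demand, each appearing to own discrete resources. This is an illustrative sketch only; the class and method names are invented for the example, and real virtualization engines add scheduling, overcommit, and migration logic on top of this basic idea.

```python
# Illustrative sketch of pooled resource allocation for virtual servers.
# All names here are hypothetical, not any vendor's API.

class ResourcePool:
    def __init__(self):
        self.total_cpu = 0      # aggregate CPU cores contributed by hosts
        self.total_mem_gb = 0   # aggregate memory in GB
        self.allocations = {}   # vm_name -> (cpu_cores, mem_gb)

    def add_host(self, cpu_cores, mem_gb):
        """A physical host contributes its capacity to the pool."""
        self.total_cpu += cpu_cores
        self.total_mem_gb += mem_gb

    def free_cpu(self):
        return self.total_cpu - sum(c for c, _ in self.allocations.values())

    def free_mem(self):
        return self.total_mem_gb - sum(m for _, m in self.allocations.values())

    def provision_vm(self, name, cpu_cores, mem_gb):
        """Draw from the pool; each VM appears to be a discrete server."""
        if cpu_cores > self.free_cpu() or mem_gb > self.free_mem():
            raise RuntimeError("insufficient pooled capacity")
        self.allocations[name] = (cpu_cores, mem_gb)

    def decommission_vm(self, name):
        """Return a VM's resources to the pool for reuse."""
        del self.allocations[name]

# Consolidating four small hosts into one pool:
pool = ResourcePool()
for _ in range(4):
    pool.add_host(cpu_cores=8, mem_gb=32)

pool.provision_vm("web01", cpu_cores=4, mem_gb=8)
pool.provision_vm("db01", cpu_cores=8, mem_gb=32)
print(pool.free_cpu(), pool.free_mem())  # 20 88
```

The point of the sketch is the aggregate view: no consumer is tied to a specific physical host, which is what distinguishes true virtualization from the partitioning approach described above.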
Storage Virtualization
Almost all centralized storage systems make use of virtualization. Large storage systems are made up of many individual disks, arranged in arrays of various sizes, data-protection levels, and performance levels. These arrays are further carved into logical units that are presented to servers as physical disks. These logical units can be moved to a different physical location within the frame, or even replicated to a remote frame, transparently to the servers. Large-scale storage systems depend on the ability to virtualize physical disks, and thus this functionality has been inherent to their design from the beginning. Unlike Ethernet, storage hardware, software, and protocols have not reached a state of common interoperability. This can present challenges for a company that has invested significantly in storage in the past but is faced with having to buy new equipment from another vendor due to business or technical requirements. Companies are further challenged with tying multiple disparate platforms together while avoiding the need to forklift-upgrade their entire storage infrastructure due to interoperability issues. One answer to this problem is to virtualize the storage resources. The most logical places to apply virtualization are the host, the switch/director, or the storage array, with the most common being the host-based variety. Dimension Data believes that as converged networking matures, storage virtualization will move from host-based to network-based. A well-designed storage infrastructure is selected by looking at storage needs through the eyes of the business. Oftentimes, not all data is of equal business value, and this value will most likely change during its lifecycle. Storage virtualization enables businesses to more accurately satisfy the business and technical requirements for storage based on the data's true business value.
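The transparency described above - moving or replicating a logical unit without the attached servers noticing - comes from the fact that a logical volume is only a mapping onto physical storage. A minimal conceptual sketch, with invented names and none of the extent-copying machinery a real virtualization engine performs:

```python
# Conceptual sketch: a logical volume is a mapping onto a physical
# backing array, so the mapping can change while the host-visible
# identity of the volume stays the same. Names are illustrative only.

class LogicalVolume:
    def __init__(self, name, backing_array, size_gb):
        self.name = name                    # identity the host sees
        self.backing_array = backing_array  # physical array behind it
        self.size_gb = size_gb

    def migrate(self, new_array):
        """Re-point the volume at a different physical array. A real
        virtualization engine would copy the data and switch the
        mapping atomically; here we only swap the pointer to show
        that the host-facing name and size never change."""
        self.backing_array = new_array

lun = LogicalVolume("oracle_data", backing_array="vendorA_tier1", size_gb=500)
lun.migrate("vendorB_tier2")   # e.g. data aged out of premium storage
print(lun.name, lun.backing_array)  # oracle_data vendorB_tier2
```

Because the host addresses `oracle_data` rather than a physical array, the migration is invisible to it - which is also what lets a virtualization layer bridge arrays from different vendors.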
Consider storage virtualization as a way to aggregate all storage resources in a manner that enables applications to receive storage service levels based on business value, application criticality, and recoverability. This enables the selection of best-of-breed components regardless of vendor. Because the virtualization layer is capable of interoperating with all of the storage components, it provides a centralized point of management and instrumentation, bringing increased visibility into storage and the ability to manage it proactively. Hardware manufacturers are investing tremendous amounts of money in R&D around the virtualization of storage. Virtualization will bridge storage islands, enhancing data protection capabilities. Virtualization will enable redeployment of existing resources, increasing the ROI on those resources.
Storage Network Consolidation & Virtualization
These trends of consolidating infrastructure and then virtualizing it slowly work their way into any IT infrastructure component that demonstrates the virtues of proliferation. One such area is the storage network based on Fibre Channel (FC), created in the last decade as a network separate from the data network to overcome perceived bandwidth limitations and to take advantage of optimized protocols. However, each consolidated storage infrastructure put in place demanded a separate storage network, so storage networks themselves proliferated, and for the most part, until recently, they were not interoperable enough to leverage combined resources. Today, concerted effort is being put into consolidating these separate storage networks and then virtualizing the whole to create virtual storage networks for ease of administration.
At the same time, the data network, primarily based on Ethernet and TCP/IP, has increased tenfold in throughput capability with the availability of 1Gb and 10Gb Ethernet. This increased bandwidth has allowed Ethernet and TCP/IP to overcome many of their original shortfalls as a storage network. Consequently, the data network is now being seen as potentially the storage network of the future. New transport protocols such as FCIP and iSCSI are furthering this convergence of the storage and data networks, allowing storage extension and replication (FCIP) over a much cheaper medium than ever before, while server connectivity is now possible through standard Ethernet cards via iSCSI, dropping the price of server connectivity to the shared storage pool significantly.
Consolidation Strategies
Consolidation strategies are usually planned around a set of business goals such as reducing operational costs, disaster recovery and/or business continuity, mergers/acquisitions, or building/facilities requirements. Most consolidations will be characterized in one or more of the following ways:
- Geographical consolidation - combining several data centers within a small geographical radius
- Physical data center consolidation - when too many data centers exist. This is common in companies that have acquired other businesses over time
- Physical server & storage consolidation - reducing the volume of special-purpose servers and storage
- Data consolidation - reducing the cost of storing large amounts of unstructured data. Data center managers often realize that they are storing the same data multiple times throughout the enterprise, amounting to multiple terabytes of primary storage, which increases when considering added capacity for backup and replication
Among the most common consolidation strategies, server and storage consolidation is an imminent need for most organizations. Often server and storage requirements are driven by business units that require new functionality to support the business goals. This new functionality comes in the form of additional applications or upgrades and enhancements to existing applications. These requirements are in turn fed by software vendors who make recommendations for servers and storage, sometimes to the point of specifying the exact vendor and model of the hardware. It is important to note that while certain applications may in fact only run in very specific server/storage configurations from a software support perspective, most can run on any platform that offers adequate CPU and memory capacity coupled with the proper storage performance.
In short, software vendors often prescribe server and storage configurations that further exacerbate the problem of proliferation.
Business Impacts and Drivers of Consolidation and Virtualization
The age of distributed computing has resulted in data centers filled with hundreds or thousands of special-purpose servers. These servers often run at minimal efficiencies because in many cases, they simply do not have much work to do. When a server does become overloaded, it is difficult to manage or move the load to another server. This typically results in the purchase of more servers to compensate for the busy ones. Hardware and software purchases are generally unpredictable, and often, additional funding for equipment is not easily justified until an application becomes unavailable or, worse yet, a server fails and the ability to provide one or more services is lost. Inefficiencies aside, the highly distributed computing paradigm introduces several other negative impacts to the management and operations of the data center, such as:
- Instability due to the massively spread-out architecture
- Growth spikes in limited space
- Decentralized management of data center resources
- Increased software licensing costs
- Decreased overall visibility into the data center due to instrumentation strain
- Increased costs to operate the network, provide power, dissipate heat, and provide adequate staff
These massive server and storage environments often consist of multiple vendors and multiple technologies, all at multiple states of maturity. This causes additional burdens in the areas of operations and management, as a wide variety of tools and applications may be required to manage all of the disparate equipment types. This necessitates additional staff or, worse yet, strains the existing staff, which could lead to employee dissatisfaction and attrition.
Data center managers realize that they are left to manage a large number of storage and computing resources, each with its own management suite and none of the products capable of cross-platform management or resource sharing/pooling across different vendors or technologies. This results in pockets of high, low, and medium resource utilization. Each of these utilization profiles carries advantages and disadvantages:
- High Utilization - A good position from the perspective that those resources are being utilized to their full potential. These systems are at a high state of efficiency and do not waste space or processing cycles. These resources would, however, not be capable of sustaining operations under performance spikes and may be prone to outages or other service-affecting conditions.
- Low Utilization - These systems are ready to take on significantly increased workload, which may be necessary in unpredictable environments. These systems operate at low efficiency, and storage and computing cycles are essentially wasted, as is the money spent on them.
- Medium Utilization - These systems are neither wasting resources nor at risk of performance degradation due to workload spikes. Of course, medium must be subject to the environment-specific definition of the term, as many systems are sized appropriately for what would be considered acceptable headroom.
Ideally, data center operators should be able to shift workloads across physical resources to eliminate hot spots and remove the risks of downtime associated with overloaded systems or storage pools. Operators could shift these workloads in real time, with no data or session loss, and with complete transparency to the user or customer using the resource at that time. The distributed model initially offered the promise of computing and storage resources based on small, inexpensive servers and workstations as opposed to the large, complex, and expensive traditional mainframes.
This proves true to a certain point; however, even distributed architectures can grow to a point of diminishing returns when compared to a high-powered, centralized approach. In a highly available environment, one of anything is unacceptable, which means that in order to deploy systems in such an environment, each server requires multiple power, network, and storage connections to multiple grids, switches, and directors. The cost of planning, building, and supporting these infrastructures adds up rapidly and tends to grow annually. Consolidation and virtualization offer a viable solution to the problem of server and storage proliferation and warrant a serious look by any organization that currently has many special-purpose servers.
1.2 Macro Trend 2: Information Lifecycle Management
Lifecycle management of information assets has become increasingly important for a variety of reasons. Regulatory compliance with respect to information availability and privacy is driving much of this. ILM requires that the right information is stored in the right place at the right time, and because of this, ILM may also be considered a methodology to reduce overall storage costs. This is realistic because varying levels of performance, redundancy, and longevity will be required for each classification of data throughout its lifecycle. To simplify: the most expensive SAN-based storage is not the only place you will ever need or want to store your data. Modular storage has quickly become a popular technology for ILM implementations. Modular storage refers to a single set of storage processors capable of reading and writing to multiple disk types (FC, ATA-100, SATA, etc.) and multiple disk speeds such as 10k or 15k RPM. This flexibility allows for the mixing and matching of disks in the same physical frame, which allows the storage pools to be customized for the data they store.
For example, for production data where the business requirements leave little tolerance for latency and data unavailability, 15k Fibre Channel drives in a RAID 0+1 (striping with mirroring) configuration might be the most appropriate. At the other end of the spectrum, data that has been earmarked for long-term archival and infrequent access may be more appropriately served by slower, higher-density disks such as large SATA drives.
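The matching of business value to disk type and RAID level described above is essentially a policy table. A minimal sketch, with illustrative class names and configurations drawn from the examples in the text:

```python
# Hypothetical policy table mapping data classifications to storage
# configurations, following the examples above: latency-sensitive
# production data on fast mirrored disks, archives on slow, dense drives.

STORAGE_POLICY = {
    "production": {"disk": "15k Fibre Channel", "raid": "RAID 0+1", "tier": 1},
    "secondary":  {"disk": "10k Fibre Channel", "raid": "RAID 5",   "tier": 2},
    "archive":    {"disk": "7.2k SATA",         "raid": "RAID 5",   "tier": 3},
}

def select_storage(data_class):
    """Return the storage configuration for a given data classification."""
    try:
        return STORAGE_POLICY[data_class]
    except KeyError:
        raise ValueError(f"no policy defined for class {data_class!r}")

print(select_storage("archive")["disk"])  # 7.2k SATA
```

In practice the table would be driven by the business-value assessment discussed later in this section, not hard-coded, but the shape of the decision is the same.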
Many organizations have made investments in large-scale, enterprise-class storage systems that are not modular or do not permit the same flexibility in terms of mixed storage types. These systems are designed for transaction-based applications and other uses requiring large amounts of storage bandwidth, such as very large SANs with hundreds or thousands of attached hosts. In most cases, data stored on these storage systems can be migrated to another pool due to dormancy or a downgrade of immediate business value. When that is the case, customers tend to look at lower-cost storage options to do so. Modular storage would be one of these options, as would tape or some form of optical media.
ILM Enabler: Storage Resource Management (SRM)
One of the first exercises one must go through to begin implementing ILM is to understand what data currently exists and classify it (data classification). Storage Resource Management (SRM) tools are an excellent source of information pertaining to how companies are currently using their storage resources. Sophisticated SRM tools are able to discover storage throughout your visible network and return vital information such as location, size, file type, access rights, last access time, duplicate data, application association, and storage type (SAN/NAS/DASD/internal). While this is not the complete list of attributes today's SRM tools can report on, it does represent some of the most critical ones needed to understand how ILM can be applied. SRM is a way to gauge and measure the storage environment from a usage perspective, providing a means of instrumentation to understand usage over time. SRM cannot complete the picture on its own, as it is only a representation of the physical aspects of storage use. The other piece, and probably the most critical one, is the business aspect: why is the storage used the way it is?
Information creation is driven by business processes, performed by people, who use applications that run on servers that connect to storage. This is to say that storage, when viewed through the eyes of the business, should be appropriately matched to the level of importance and criticality represented by the information. Organizations have quickly realized that determining these levels is not something that IT should do in a vacuum and likewise not something that the user community can determine alone.
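A minimal version of the SRM discovery described above - walking a file tree and recording the attributes used for classification - can be sketched with nothing but the standard library. Real SRM suites also capture ownership, duplicates, and application associations across SAN and NAS; this sketch only shows the shape of the inventory they build:

```python
import os
import tempfile

def scan_storage(root):
    """Walk a directory tree and collect per-file attributes of the kind
    an SRM tool reports: path, size, and last access time."""
    inventory = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            inventory.append({
                "path": path,
                "size_bytes": st.st_size,
                "last_access": st.st_atime,   # seconds since the epoch
            })
    return inventory

# Build a tiny sample tree so the sketch is self-contained.
root = tempfile.mkdtemp()
with open(os.path.join(root, "report.txt"), "w") as f:
    f.write("quarterly numbers")

report = scan_storage(root)
print(len(report), report[0]["size_bytes"])  # 1 17
```

An inventory like this answers the physical half of the question (what exists, how big, how recently touched); the business half - why it is used that way - still has to come from the classification exercise described above.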
The Tiered Storage Approach
In the quest to implement a multi-modal, service-oriented storage architecture, companies have embraced the tiered storage approach. Tiered storage, by definition, is a storage architecture that consists of an integrated set of storage pools with varying degrees of performance and reliability associated with them. These pools may consist of various media types (disk, tape, optical), but the trend is to see them made up entirely of disk. A tiered storage architecture must contain at least two different storage service levels; however, most organizations will implement 3-4, with subsets of one or more levels. A holistic tiered storage model may consist of the following:
- Tier 1 SAN - A high-performance SAN utilizing fast disks (15k RPM), moderate disk sizes (36-146GB) to minimize rebuild times, and a RAID level that maximizes performance and availability, such as RAID 0+1. All servers attached to this tier would have dual Fibre Channel paths and would be designated for mission-critical applications.
- Tier 2 SAN - A highly available SAN for applications that require a high degree of data availability but not necessarily the performance of a Tier 1 SAN. This tier may use slower Fibre Channel drives (10k RPM) and larger disks, such as 300+GB. RAID levels may also be RAID 0+1; however, due to the diminished requirement for performance at this tier, other RAID levels such as RAID 5 may be used. Some companies use this tier for backup-to-disk-to-tape (B2D2T) staging and find that RAID 3 offers adequate protection while keeping disk costs minimized.
- Tier 2 NAS - Another use of Tier 2 is for Network Attached Storage (NAS). NAS storage is file-level and served to users and applications over Ethernet. NAS consists of a file-serving device, which may be an appliance or a server system that is connected to Fibre Channel storage.
Shares or exports are created by the file-serving device and presented to the users as if they were directories on their local systems. NAS is a cost-effective way to provide central file storage to many users simultaneously. NAS is commonly deployed on RAID 5 arrays using large-capacity 10k Fibre Channel disks; however, large user bases located on the LAN or the business criticality of the NAS data may necessitate faster disks and more resilient, faster-performing RAID levels.
- Tier 3 SAN - An IP-based SAN using iSCSI would be characterized as a Tier 3 SAN. While still a SAN in that it uses encapsulated SCSI commands to connect a centralized storage device with multiple server systems, this SAN utilizes Ethernet and the LAN for transport as opposed to a dedicated Fibre Channel infrastructure. This offers a low-cost SAN at the cost of performance. Tier 3 SANs are commonly deployed on lower-speed, higher-density Fibre Channel disks or SATA. Tier 3 SANs are becoming popular in test/development environments.
- Backup Tier - A disk-based tier that is intended to replace tape as the media used to store traditional backup and recovery data. Most enterprise backup and recovery packages recognize disk-based storage as a valid backup target. Because of this, backup tiers are now being implemented in a hierarchical model in conjunction with a Tier 2 SAN as described above. The Tier 2 SAN serves as the initial backup target because of its speed and ability to facilitate backups within minimized windows. The Tier 2 SAN subsequently purges to the backup tier as a separate process after all hosts are successfully backed up. The backup tier is made up of low-speed, high-density disks and is sized appropriately to handle the number of concurrent copies of data that a business requires. Disk libraries may also be used for this tier.
Disk libraries are special-purpose appliances made up of high-density disks with controllers that allow the disks to emulate tape cartridges of various formats and libraries of various vendors.
- Archive Tier - A long-term storage pool made up of low-speed, high-density disks, often in RAID 5 configurations. The archive tier can accept incoming data from any other storage tier. Data commonly enters the archive tier based on policies and procedures (which can be automated) that determine when data has aged or gone unaccessed long enough to warrant removing it from a more expensive storage tier while there is a continued need to preserve it.
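The age-based movement into the archive tier described above can be automated with a policy of the following shape. The threshold and file names are purely illustrative; a real policy engine would act on the SRM inventory and actually move the data rather than just naming candidates:

```python
from datetime import datetime, timedelta

ARCHIVE_AFTER = timedelta(days=365)  # illustrative retention threshold

def files_to_archive(catalog, now):
    """Given a catalog of (name, last_access_datetime) records, return
    the names that have aged past the archive threshold and are
    candidates to move off more expensive storage."""
    return [name for name, last_access in catalog
            if now - last_access > ARCHIVE_AFTER]

catalog = [
    ("q1_2004_results.xls", datetime(2004, 4, 1)),
    ("current_forecast.xls", datetime(2005, 10, 1)),
]
print(files_to_archive(catalog, now=datetime(2005, 11, 1)))
# ['q1_2004_results.xls']
```

Only the dormant file is selected; the recently accessed one stays on its current tier, which is exactly the behavior the policies and procedures above are meant to encode.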
Figure 1 - Modular Storage Chassis Allocation:
- Tier 1 (15k Fibre Channel) - Primary storage for high-bandwidth requirements: databases, batch systems
- Tier 2 (10/15k Fibre Channel) - High-speed, secondary storage requirements such as disk staging for B2D operations
- Tier 2 (10k Fibre Channel) - NAS tier; can also be ATA drives if file sharing is not widely used (home/group directories, profiles, PSTs)
- Tier 2/3, Backup, Archive (ATA) - Purely a high-density, low-speed tier. Used for database dumps, iSCSI SANs, final staging for B2D, archiving
Decline of Tape and Optical Media
Once thought of as the industry standard for long-term archival, tape and optical media are beginning to decline in popularity for this use. Again driven by regulatory compliance, the importance of archived data is seldom realized until the archived information can no longer be accessed or read. The need to ensure the integrity of the data, coupled with the need to ensure that it has not been tampered with, has made an increasingly strong case for disk-based archive systems. Tape and optical storage are dependent upon two critical factors to ensure longevity of data storage and retrieval: zero degradation of the physical media and availability/supportability of the devices used to read data from the media. Both of these factors must be resilient to the natural passing of time, as time can have the most impact. First, media, even when stored under optimal environmental conditions, carries a probability of intermittent failure resulting in corruption. The fact is that most media isn't stored properly to begin with, and even when it is, it is often found stored in an area that may be affected by a disaster or other catastrophic event such as flooding, or subject to physical trauma (falling, being crushed, etc.). If you manage to avert these dangers, there is the second factor of product availability and support.
For example, to retrieve data that was archived 10 years ago:
- Are you in possession of a device that can read the media?
- If so, is the device still under a support contract?
- Does the manufacturer still support the equipment?
- If the equipment fails while you are restoring your data, where can you get help?
Companies are quickly realizing that these are the critical questions that must have answers before committing data to an archive solution. Because of this, disk-based solutions have become more attractive. The fundamental design of a disk drive has not changed in three decades. In a world that moves at the speed of IT, this is a future-proof design. Today's disk-based archiving solutions make use of low-speed, high-density disks, which are ideal for this purpose. Configured in RAID or RAIN arrays, these low-cost storage solutions are providing a solid foundation for future-proof archiving, packing multiple terabytes of storage into a few rack units of space, and requiring a negligible amount of power to do so.
1.2.1 Business Impacts and Drivers of Information Lifecycle Management
Over the last few years, SAN and NAS have been positioned as the silver bullet for data consolidation. In fact, one or both of these two technologies is an appropriate means to consolidate data for most organizations. The evolution of product development has taught us that even within these two technologies, there must be several levels of performance and availability to select from to achieve ILM. This is typically accomplished by adjusting disk speeds or RAID levels: disk speed has a direct impact on performance, while RAID levels can affect both performance and availability. Over time, many companies have migrated their storage infrastructures to these centralized technologies and then simply added capacity if and when required. This model cannot be sustained indefinitely in a cost-efficient manner. Regulatory compliance may require companies to sustain the ability to store information intact and make it readily available for between 7 and 10 years, or more. Continuously adding capacity to an enterprise-class SAN to accomplish this is not a realistic approach. Companies are typically doubling their storage requirements every 18 months. It has become almost accepted that funding will be required at least that often for additional storage. In many cases, this means expanding the same Tier 1 SAN each time, which is an expensive proposition. ILM has taught us that not all data is of equal business value and not all data needs to be accessible with near-zero latency. Because of this, not all storage must be of the same class. Businesses can now select the type of storage most appropriate for the business value of their data, which decreases overall storage costs. Information creation does not slow down; at least the same amount of additional storage will be required year over year, and probably more and more frequently as time goes on.
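The growth rate cited above compounds quickly: doubling every 18 months works out to roughly a tenfold increase over five years, as a quick calculation shows. The starting figure of 10 TB is an arbitrary example:

```python
def projected_capacity(initial_tb, months, doubling_period_months=18):
    """Capacity needed after `months`, assuming demand doubles every
    `doubling_period_months` (18 months per the figure cited above)."""
    return initial_tb * 2 ** (months / doubling_period_months)

# A hypothetical 10 TB estate projected over 5 years (60 months):
print(round(projected_capacity(10, 60), 1))  # roughly 100 TB
```

This is why continuously expanding a single Tier 1 SAN is not realistic: at this rate, most of that tenfold growth would land on the most expensive storage class unless ILM diverts aging data to cheaper tiers.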
By applying ILM, this information is managed more intelligently, and the high-end storage devices do not have to be upgraded or replaced as often. Implementing ILM does require a heightened sense of awareness of the information landscape, which requires that intelligent tools and controls are in place to effectively implement and manage the information lifecycle. A successful implementation of ILM must start with the identification and classification of the data. More and more, companies are relying on 3rd-party consultants to assist with this daunting task. The assignment of business value to data is more than determining whether you can live without it; it is determining how long you can live without it and what the consequences associated with data unavailability or loss are. As with most long-term investments, an initial commitment of resources (human and financial) must be made when implementing ILM. It is not a decision that can be executed overnight. Careful planning in the initial stages must take place to ensure that additional downstream problems are not created, such as drastically over- or under-sizing a storage pool, which could result in a lack of space for critical information and/or diminished ROI or a lengthy time to recover the investment. ILM also introduces an additional process to measure and manage the automatic flow of information from pool to pool. This is accomplished through the combination of intelligent storage management tools and a comprehensive storage policy definition exercise. The storage policy drives the information lifecycle and should be finite in definition but flexible enough to accommodate new requirements, such as modifications to the regulations that govern information availability and retention. Currently, no single tool exists to manage this process end to end; however, the primary storage product vendors have built into their equipment support for 3rd-party management and monitoring, which can exploit technologies like virtualization.
Once the initial policies have been defined and the management tools are in place, the ILM process can operate quite seamlessly across the storage infrastructure.
1.3 Macro Trend 3: Information Security - Business Impacts and Drivers
Given the recent increase in personal data theft, today's information security regulations are only the beginning of tougher regulations to come. For example, the United States Senate has introduced a bill that would set national standards for database security, require businesses and government entities to notify individuals if they even suspect an attacker has obtained unencrypted personal data, and empower the Federal Trade Commission to impose fines of $5,000 per violation or up to $25,000 per day. In fact, not securing data at rest could end up costing companies more than a fleet of expensive encryption appliances. By the end of 2006, failure to encrypt credit card numbers stored in databases will be considered negligence in civil cases arising from unauthorized disclosures, according to Gartner. Already, ChoicePoint Inc.'s shareholders are suing after the company's shares plummeted on news that consumers' personal data had been stolen. Identity theft costs U.S. businesses and consumers $50 billion to $60 billion a year, according to the Federal Trade Commission. It's not a stretch to expect more consumers to sue the companies that let their data be compromised. What's more, the California Database Security Breach Act, which applies to any business with customers in that state, requires disclosure even if a break-in is only suspected. Gartner believes that enterprise-class companies nervous about landing on the front pages of newspapers like others with recently publicized data losses have been driving the storage security market over the past six to nine months. The market will be spurred on further with the backing of a large vendor like Network Appliance and its recent acquisition of start-up Decru Inc., one of the leading providers of encryption appliances for storage security.
Business Challenges
The benefits of using encryption for stored data are clear.
They include prevention of unauthorized access to data and protection against data loss. However, using encryption to protect stored data is not easy. Overall, there are no real surprises in the cost of data security. Encrypting data costs you in terms of bandwidth and latency, although some vendors believe their solutions lower these costs to an acceptable level. Access control and authentication will cost companies in terms of the management hours required to set up and maintain access-control lists. However, many of the challenges lie not within the technology but within the business and organizational environment. Implementing an encryption solution can involve substantial changes to data storage processes, access control, and backup procedures. Large-scale encryption can also change how applications interact with one another. And the management and administration of encryption keys can be another complex issue. Some of the most common hurdles faced when attempting to implement enterprise storage security are:

- Lack of communication/understanding between Security and SAN groups/teams
- Inadequate executive visibility and involvement in storage security
- Ongoing maintenance expenses
- Isolated management of different networks (IP / FC)

Measured against the market drivers above, these organizational challenges will simply have to be overcome as the market's visibility increases and more public scrutiny is placed on organizations to protect sensitive consumer information.
Categories of Functionality

Storage security products can be broken up into various categories of functionality: Access Control, File-Level Encryption, and Database Encryption.

Access Control

Enterprises need to understand their need for role-based access control. It is estimated that access control to data on the SAN is implemented by 60% of enterprises, with authentication and management access control close behind. Role-based access control does make the management of rights much easier, but companies can incur more overhead. Access should be straightforward, and when it comes to simply managing the storage network, it can be. The issue becomes more complex as companies try to control who sees what data on the storage infrastructure, because the storage infrastructure overlaps with groups and users defined in other IT operations. Some storage security products leave the issue to the database, in which you create users and grant them access to the data. Other products have an add-on at the client level which controls access by user. Additionally, products in this space can define which applications have access to specific files and folders of data. Access control in that sense does not exist within FC switches. Switches are simply infrastructure products; they have no visibility into, or ability to determine, which users can see the data that passes through their infrastructure. Enterprises would not be able, at the switch level, to define all users having access to the switches and also define which network resources they can see and modify; that is more of an operating-system-level function.

File and Disk Encryption

Different products encrypt/decrypt at different locations, with most of them passing data unencrypted from the switch to the host. Some enterprises don't use encryption, believing that it is not worth the bandwidth penalty that is incurred; others prefer to encrypt on the host.
Encryption on the host is CPU-intensive unless you have dedicated encryption processor cards in each host. Encryption on the SAN, be it on the switch or on an appliance, puts unencrypted data on the SAN when it is passed to and from the encryption engine. A recent Network Computing poll of several hundred storage professionals indicated that 32% would put encryption on the host and 21% on the switch. The issue most prevalent in recent news is the loss or theft of storage disks and/or tapes. Several products encrypt the data on the tape and store the key, itself encrypted, on the same tape. The master key may be stored on a smart card at a remote site to ensure companies can restore backups if a catastrophic disaster is sustained in the data center. This creates another step, and another cost factor, in restoring lost data or responding to litigation by retrieving relevant data.

Database Encryption

Data stored within databases, such as user profiles and credit card information, is subject to exploitation by internal and external hackers. Products exist to provide database encryption. These tools allow for encryption of tables and/or columns in the database, and then grant decryption rights to certain users. With database encryption, compromising one machine on the network does not necessarily allow an attacker to gain access to core data. This solution, of course, only protects data that fits within the database, and not all critical or sensitive data does. Products have matured to the point that encryption features are transparent to the applications on the network and require almost no changes to the application itself. Views and decryption stored procedures working behind the scenes allow seamless interaction between the database and the application.
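To make the column-encryption pattern concrete, here is a minimal sketch. The XOR "cipher" and in-process key are deliberate toy stand-ins for the vetted AES machinery and external key management a real product would use, and the function names are invented for illustration.

```python
import secrets

KEY = secrets.token_bytes(32)  # in practice: held by a key-management system

def toy_cipher(data: bytes, key: bytes) -> bytes:
    """XOR stream stand-in for a real block cipher; symmetric by design."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def store_card(card_number: str) -> bytes:
    """Ciphertext value actually written to the database column."""
    return toy_cipher(card_number.encode(), KEY)

def read_card(ciphertext: bytes) -> str:
    """Decryption a view or stored procedure performs behind the scenes."""
    return toy_cipher(ciphertext, KEY).decode()

row = store_card("4111111111111111")
assert row != b"4111111111111111"           # at-rest value is not plaintext
assert read_card(row) == "4111111111111111"
```

Because the application only ever calls the store/read pair (or, equivalently, selects from a decrypting view), compromising a machine that lacks the key yields only ciphertext.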
2 Dimension Data's Virtualized Architecture

2.1 Virtualizing from the Ground Up

A true virtualized data center makes use of virtualization across the storage, compute, and network layers. The high-level architecture in the following graphic shows the primary building blocks of a typical data center and how each of the commonly found components contributes to the overall virtual infrastructure. In the model, production server pools are divided by operating platform (Windows or Unix/Linux). A separate test and development pool provides servers of all platforms and is tied to the shared storage components in order to take advantage of the centralized storage resources and for easy access to production data replicas for accurate testing. A common management framework is in place (shown in red) which provides data center management services and data center automation.

[Figure 2 - Data Center Architecture: edge routers and core switches providing network virtualization; meshed firewalls and load balancing; Windows and Linux/Unix server pools; shared SAN tiers (production SAN and NAS); backup, recovery, and archive; core infrastructure systems; a test/development server pool; and the management centre with its management systems]
2.2 Virtual Server Deployment

Servers in a virtual environment look and behave exactly like their discrete counterparts. A virtual server is a server instance that is created by a process and a hardware abstraction layer from a pool of available computing resources. By deploying these pools of resources, data center managers can be assured that their customers are receiving the precise amount of computing bandwidth required by their applications. This keeps most of the extra cycles available for other applications.

[Figure 3 - Virtual Server Management Framework: management clients (NOC, remote, VPN tunnels) and directory services (Active Directory, LDAP) connect to high-availability management servers with Ethernet access to the management processor VLAN; managed hosts run management agents and are grouped into Windows, Linux/Unix, Mainframe, and test/development pools; an enterprise-class database such as MSSQL or Oracle holds the template repository, with SAN-based storage volumes storing server images and templates]

In the above image, the managed hosts to the right make up the available computing resources and are divided by operating platform (Windows, Linux, Unix, and Mainframe). A test and development pool is also represented, made up of a sampling of each operating environment. This pool is represented separately from the other pools because its security context will differ from the production pools and the hardware typically used for test/dev is not of the same caliber as the other systems. By grouping computing resources logically as shown, applications are able to run in the most optimized environment for the given application, and security contexts may be maintained. At the heart of this architecture is a common management framework. This consists of a mixture of people, processes, and technical components and is responsible for rendering the virtual servers according to a pre-defined set of rules, which are kept in the template repository.
This repository is SAN-attached to a server that has access to business logic defining how a server will be configured, as well as a queuing mechanism that manages requests for the creation of a particular server. A Directory Services component is also shown. This component is the primary means of determining the security context of the management clients and the run-state of the virtual servers, and it serves as a source for user accounts and other resource access control for virtual servers and applications as they are built.
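A minimal sketch of this build pipeline follows. The template contents, queue, and platform names are hypothetical simplifications of the business logic and SAN-attached repository described above.

```python
from queue import Queue

# Hypothetical template repository keyed by platform; a real repository
# would hold full server images on SAN-based storage volumes.
TEMPLATES = {
    "windows": {"os": "Windows Server 2003", "cpu": 2, "ram_gb": 4},
    "linux":   {"os": "Red Hat Enterprise Linux", "cpu": 2, "ram_gb": 2},
}

build_queue = Queue()  # queuing mechanism that manages server requests

def request_server(platform: str, owner: str) -> None:
    """Enqueue a creation request for a server of the given platform."""
    build_queue.put({"template": TEMPLATES[platform], "owner": owner})

def build_next() -> dict:
    """Render one virtual server from its template and mark it running."""
    req = build_queue.get()
    server = dict(req["template"])
    server["owner"] = req["owner"]  # access context from directory services
    server["state"] = "running"
    return server                   # a real framework would notify the owner

request_server("linux", "app-team")
server = build_next()
print(server["os"], server["state"])  # -> Red Hat Enterprise Linux running
```

The queue decouples requests from builds, which is what lets the framework throttle server creation against the capacity actually available in the pools.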
2.2.1 Virtual Data Center in a Box

[Figure 4 - Data Center in a Box: a single physical host with physical NICs, HBAs, and VLANs (DMZ, APP, DB, Management) virtualizing firewalls A and B, web servers and MTAs in the DMZ, application servers APPSRV_A/B, and mail database servers MAILDB_A/B, with SAN-attached storage presented via SCSI emulation]

An abstract paradigm that is enabled by the virtual server concept is the Data Center in a Box. Because a large physical server is capable of hosting many guest systems across multiple operating environments, each function of a typical data center can be represented if enough CPU, RAM, and storage resources are made available to the server. In the example shown in the previous graphic, a single physical host is virtualizing a firewall, a set of DMZ services, an application tier, and a database tier which is only accessible by specific server entities within the physical machine. The physical server is thus capable of mimicking the exact functionality that this architecture would represent in the physical world. This architecture proves particularly useful in test/development environments as well as use-cases where infrastructure must be highly portable. Put into perspective, the architecture shown in the graphic is a highly secure data center that could run in as little as 4U of rack space, is internally clustered for enhanced availability, could be made more robust, and could be deployed in almost any environment, including the battlefield, covert settings, or anywhere a high degree of portability is required. The design shown would require a total of seven physical connections (for power, network, and storage) to provide power and connectivity to 12 servers. This could be reduced to five if internal high-speed disk were used in place of a SAN.
2.3 Virtual Storage and Transparent ILM

[Figure 5 - Storage Virtualization and ILM: incoming data passes through the ILM / storage virtualization layer and is directed to Tier 1 or Tier 2 storage, backup & recovery, archive, or delete/destroy/shred]
ILM is not only the classification of data based on business value and retrievability; it is also the automated migration of data from one class to another. Users and applications should never care where the information is stored, only that it can be retrieved in an adequate amount of time and intact (data integrity). As a result, we architect the ILM strategy with transparency in mind and base it on the number of major classifications that information will pass through during its lifecycle. The above image depicts a constant stream of high-volume data that is bound for a storage device. On the front side of the ILM/storage virtualization boundary, data is not concerned with where it is stored; it only needs to land in a place where it can be retrieved and perhaps modified. Beyond the ILM/storage virtualization boundary resides a multitude of storage destinations ranging from top-tier (high-speed, high-availability) to bottom-tier (low-speed, long-retention). As previously discussed, this is the multi-tiered storage approach. While we strive to balance cost and performance on the right side of the ILM/storage virtualization boundary, our users and applications care not where, or on what hardware, their data is stored. As architects, it is incumbent upon us to ensure that we can identify the data we are ingesting, determine its business value, and store it in the most cost-effective manner that satisfies the business need for that data. Another crucial component of the architecture is the ability to move data between tiers. During the information lifecycle, business value will change and often fluctuate, meaning that a given piece of information could decrease and then increase again in business value. An example of this might be an insurance claim or documents pertaining to a court case; upon inception, the data is highly valued and must be readily retrievable.
Once the matter is closed, this information may enter a period of dormancy, and after a certain period of time (e.g., a set number of days), the information may become eligible for archive status and thus should be automatically moved to the appropriate tier. If for some reason the information must be retrieved again, the time-based lifecycle must start over: the information is moved back to the original tier upon re-retrieval and the time-based lifecycle repeats. The Dimension Data architecture allows for this and removes the manual steps involved in achieving this business requirement. Another absolute stage in the lifecycle of information is termination, or permanent deletion. While not often considered in a "save everything forever" world, the possession of certain data beyond a certain point introduces liabilities and possible infractions of local and/or international law. Because of this, there must exist a way to bring such information to that final state in its lifecycle: destruction. A properly architected ILM strategy will provide a means for information to be permanently removed from all storage systems within the realm of the ILM management hierarchy. Dimension Data's architecture provides this facility and makes it accessible from any storage tier, making it possible for information to be aged out of the system from any point in the traditional lifecycle. This, in conjunction with the business logic that drives a given ILM strategy, removes the need for information to flow serially through all tiers of the ILM implementation before it can be permanently removed, and it provides an increased level of ILM efficiency.

2.4 RLM: Policy-Based Provisioning and Decommissioning

Unlike ILM, Resource Lifecycle Management (RLM) has not received the same amount of media attention. In a virtualized architecture, storage and compute resources go through a similar lifecycle.
When storage and computing resources are viewed as pools of available resources, the natural pattern is that resources are drawn from the pool to fill a capacity deficiency and later become unneeded when that deficiency turns into a surplus. Dimension Data believes that an RLM architecture must maintain an awareness of resource requirements (deficiencies and surpluses) and be able to intelligently move resources back into the pool when they are no longer needed. These resources then become available for other applications or to increase the power/space allocated to an existing application.
[Figure 6 - Resource Lifecycle Management: virtual servers and storage are provisioned on-the-fly (manually or automatically) when a capacity request is submitted against the computing and storage pools; decommission requests return capacity that is no longer required]

In the graphic, two pools of physical resources are defined: computing and storage (server and disk). The architecture includes a layer of intelligence that is capable of making requests of the pools, either as a manual process or fed by proactive monitoring tools that use predefined business logic to determine when additional storage and compute capacity is required. Either way, as requests for resources are submitted, the appropriate amounts of CPU, RAM, and disk space are allocated to the server instance, and the system is imaged with the right operating system, applications, and access control lists. Upon completion of these build tasks, notification is sent to the application owner that the server is ready and may begin processing service requests. Again, using manual or automated processes, a reduction in the need for storage and compute resources can trigger a decommissioning event, which will commit all pending transactions on a given virtual server, wipe its storage volumes, and return the allocation of CPU, RAM, and disk resources to the pools, making them available to other resource requests. In this way, resources are used in a just-in-time fashion and are not subject to waste due to underutilization. The capacity requirements of the data center as a whole become the measurement used to determine whether additional hardware and software are needed, removing the business unit/application-centric forecasting model that leaves most data centers riddled with pockets of special-purpose and underutilized resources.
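The pool accounting at the heart of this cycle can be sketched as follows. The pool sizes and resource names are illustrative, and the transaction-commit and volume-wipe steps that precede a real decommission are elided.

```python
# Shared pools of physical capacity; figures are illustrative only.
POOL = {"cpu": 64, "ram_gb": 256, "disk_tb": 40}

def provision(cpu: int, ram_gb: int, disk_tb: int) -> dict:
    """Draw capacity from the pools for a new virtual server."""
    request = {"cpu": cpu, "ram_gb": ram_gb, "disk_tb": disk_tb}
    if any(POOL[k] < v for k, v in request.items()):
        raise RuntimeError("pool exhausted: time to order real hardware")
    for k, v in request.items():
        POOL[k] -= v
    return request  # allocation record, kept for later decommissioning

def decommission(allocation: dict) -> None:
    """Return capacity to the pools (transactions committed, volumes wiped)."""
    for k, v in allocation.items():
        POOL[k] += v

server = provision(cpu=4, ram_gb=16, disk_tb=2)
decommission(server)
assert POOL == {"cpu": 64, "ram_gb": 256, "disk_tb": 40}  # nothing stranded
```

Because provisioning fails only when the whole pool is exhausted, rather than when one application's private forecast runs out, capacity planning moves to the data center level as described above.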
2.5 Converged Networks: Putting the SAN in the LAN & WAN

Fibre Channel SANs provide the network industry's finest in terms of performance but tend to fall short over great distances. For this reason, the concept of SAN extension has gained in popularity. While repeaters and amplifiers provide some relief, transcontinental SAN connectivity, which continues to be a popular business requirement that must be satisfied, would be out of the question using those means alone. TCP/IP networks provide the missing link in that continuity chain: they are everywhere, spanning all continents, including the inhabited portion of Antarctica, en masse. Encapsulating the Fibre Channel Protocol (FCP) within TCP/IP packets is logical; after all, Fibre Channel is itself merely a transport for SCSI commands. Adding another layer does introduce an element of overhead; however, the benefits gained by the immense, practically unlimited distances TCP/IP can travel offset this tradeoff in an asynchronous environment.

[Figure 7 - Convergence with FCIP: replication between the Site A and Site B SANs over a Fibre Channel tunnel, with FC-to-IP encapsulation on one side, transit across the wide area network within TCP/IP, and IP-to-FC decapsulation on the other]
Dimension Data has achieved significant success in the area of remote replication using FCIP. Regardless of the data origin (file, block, tape, etc.), FCIP provides a solid foundation for SAN extension using native, already-in-place networks and a cost-effective, widely deployed protocol (TCP/IP). This means that SAN replication can occur over public infrastructure. Dimension Data deploys multi-service Fibre Channel switches that are capable of providing FC, FCIP, and iSCSI connectivity from a single chassis. These switches encapsulate FC within TCP/IP and send the FC frames over long distances. Once the frames arrive at the replication site, an identical switch decapsulates them from within the TCP/IP packets and routes those I/Os to the Fibre Channel-attached disks. With an acceptable change ratio, multiple terabytes of data can be replicated over several thousand kilometers. Dimension Data has extensive experience deploying converged networks. One example of a large-scale, complex converged network that Dimension Data designed and deployed involved a transatlantic link using FCIP over SDH (Synchronous Digital Hierarchy). This deployment was carried out for a major global bank by Dimension Data, Cisco, EMC, and Hibernia Atlantic to demonstrate that FCIP and the combined technology could provide a flexible SAN solution across existing telecom links between Europe and North America. Cisco is able to offer two alternatives for transporting FC over long-haul distances: either at Layer 3, as demonstrated here using FCIP, or at Layer 2, utilizing Fibre Channel over SDH. In this case, however, FCIP was chosen since it offers the most flexibility. Full details on this architecture may be found in the Dimension Data whitepaper, Transatlantic Replication utilizing Fibre Channel over IP (FCIP) Protocol. A conceptual view of the architecture Dimension Data designed can be seen in the following graphic.
[Figure 8 - Dimension Data's Transatlantic FCIP over SDH Design: Cisco MDS 9216 and MDS 9509 switches feed GE/FC/IP traffic into Cisco ONS platforms at Boston, Southport, and Dublin, carrying asynchronous FCIP-over-SONET/SDH replication between EMC DMX arrays across the Hibernia SDH network]
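The framing step at the core of FCIP can be sketched in a few lines. This is a deliberate simplification: each Fibre Channel frame is prefixed with a length word so the receiving switch can re-frame the TCP byte stream, whereas the real FCIP encapsulation (RFC 3643/3821) carries additional header fields.

```python
import struct

def encapsulate(fc_frame: bytes) -> bytes:
    """Prefix an FC frame with its length before writing to the TCP stream."""
    return struct.pack("!I", len(fc_frame)) + fc_frame

def decapsulate(stream: bytes) -> list:
    """Recover whole FC frames from the byte stream at the far-end switch."""
    frames, offset = [], 0
    while offset < len(stream):
        (length,) = struct.unpack_from("!I", stream, offset)
        offset += 4
        frames.append(stream[offset:offset + length])
        offset += length
    return frames

# Two replication I/Os survive the round trip over the simulated WAN intact.
wire = encapsulate(b"WRITE-IO-1") + encapsulate(b"WRITE-IO-2")
assert decapsulate(wire) == [b"WRITE-IO-1", b"WRITE-IO-2"]
```

Because TCP delivers an ordered byte stream rather than discrete frames, this explicit re-framing is what allows the remote multi-service switch to route each recovered I/O to its Fibre Channel-attached disks.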
Like most new technologies, iSCSI has been quite popular amongst data center test/development teams; unlike production systems, these have little to lose if things don't go as planned. Once dubbed the poor man's SAN, iSCSI has quickly taken hold as a cost-effective means of providing storage consolidation to mid-tier environments and is acting as a solid provider of Tier 2, Tier 3, and archive storage in many large enterprises. The concept is simple: block-level data access over the existing Ethernet infrastructure. Ethernet is here, and it isn't going away anytime soon. Its usefulness has been proven time and time again: first with data, then voice and video, and now information.

[Figure 9 - Convergence with iSCSI: Windows, Linux, Unix, Novell, and mainframe hosts reach a Fibre Channel disk array, mainframe storage, and a storage appliance through a multi-protocol SAN switch bridging Fibre Channel and TCP/IP]

While Dimension Data has observed the bulk of iSCSI implementations in test/dev environments, this form of SAN extension has gained popularity in mission-critical production environments as well. A SAN presence does not dictate that all servers in its proximity have dual-path access to it. Today, with concepts like ILM and RLM, the value of the data is in question, which means that the value and reliability of the server and storage resources serving it must also be in question.

2.6 Conclusion

The three macro trends presented in this whitepaper, Consolidation and Virtualization, Information Lifecycle Management, and Information Security, represent IT architectures that take into consideration both current business requirements and the future vision of these trends; importantly, they are not single products to be purchased via a part number.
For many organizations, understanding these macro trends is easier than deciding when to embrace and implement them as IT architectures, as the next generation of products and services supporting these trends holds the eternal promise of being an improvement over yesterday's. However, the question is not "if I hold off, will I get something better than what exists today?" but rather "can I gain business value from the architecture today, with a future roadmap to take me to tomorrow?"

At Dimension Data, we believe that for most organizations, there is considerable business value to be gained by embracing the architecture of a Virtualized Data Center today.