Best practices for enabling Microsoft SharePoint with HP LeftHand SAN

White paper

Contents

Introduction
Executive summary
SharePoint 2007 storage architecture
  Database servers
    Content databases
    Configuration, shared service provider (SSP), and administration content databases
    Search database
    Content index database
    SQL Server tempdb database
    SQL Server logs
  Web front-end and application servers
    Web front-end servers
    Application servers
SharePoint 2007 on iSCSI
HP LeftHand integration with SharePoint 2007
  Storage clustering
  Network RAID
  Thin provisioning
  Snapshots
  Remote copy
  HP LeftHand Solution Pack
    HP LeftHand VSS Provider
    HP LeftHand DSM for MPIO
SAN best practices for SharePoint 2007
  Performance sizing recommendations
    User definitions
    Sizing the SAN
  Volume configuration, partition alignment, and formatting
    Volume configuration
    Partition alignment
    Aligning a partition with diskpart for Windows Server 2003
    Volume formatting
  Backup and restore configuration and operation
    Versioning and recycle bin
    SharePoint backups
    SQL Server backups
    Continuous data protection products
    General backup and restoration recommendations
Conclusion
For more information
  Additional resources
Appendix: SQL examples
  Moving tempdb
  Adding and sizing tempdb data files
  Splitting large database files
Introduction

Microsoft Windows SharePoint Services 3.0 and Microsoft Office SharePoint Server 2007 (collectively called SharePoint) together form one of the fastest-growing infrastructure environments in the world. In 2007, the number of deployments of SharePoint grew by 150 percent over all prior years put together. SharePoint provides a flexible, secure, and scalable environment for collaboration and content management and, once deployed, usually grows organically throughout an organization.

From an IT perspective, a SharePoint deployment can be a small, initial environment for a single project, or it can be rolled out as a full IT infrastructure for thousands of users and dozens of applications. Moreover, its organic nature means SharePoint can grow in ways that defy planning. Thus, the storage subsystem underlying SharePoint should be expandable, scalable, and flexible enough to adapt to SharePoint's changing requirements, while providing a wide variety of protection and availability options. Finally, the best storage subsystem for use with SharePoint will be one that is straightforward and easy to administer, while minimizing any effect on the SharePoint server itself.

Executive summary

HP LeftHand SANs are scalable data storage systems that simplify management, reduce customer costs, and optimize virtual environments. Easy to deploy and maintain, HP LeftHand SANs keep crucial business data available. Their innovative approach to storage provides unique double-fault protection across the entire SAN, reducing vulnerability without driving up costs as traditional SANs can. A pay-as-you-grow, all-inclusive pricing model and intuitive storage management software built into every system make HP LeftHand SANs perfect for small to midsized companies that must contend with sophisticated storage requirements, but are limited by tight budgets and minimal storage management expertise.
HP LeftHand SANs are equally suited to new server virtualization projects and Microsoft database applications, delivering integrated replication capabilities along with the ability to scale SAN performance non-disruptively.

The following HP LeftHand SAN features are especially important in a SharePoint environment:

- HP LeftHand SANs scale extremely well for SharePoint. The storage can easily scale from both capacity and performance perspectives as more users or more demand is added to the system. This storage scaling is accomplished without planning sessions, complex procedures, or downtime.
- The storage system is highly automated and is thus easy to administer. IT departments can manage an HP LeftHand SAN with existing Windows and networking administrative expertise; most of the complex management tasks associated with other SANs simply don't exist with HP LeftHand SANs.
- HP LeftHand SANs provide a wide range of management, data protection, and availability features that give administrators everything they need, no matter how large the deployment and how critical the application. All of these features are included in the base product; there are no future add-ons or additional products to buy.
- HP LeftHand SANs are highly optimized for the random small-block I/O found in conjunction with SharePoint. This means that their price/performance is exceptional, with linear system performance scaling as capacity is added.
SharePoint, however, can be tricky to implement from a storage perspective. This white paper outlines the deployment of SharePoint from a storage standpoint, details the storage-related issues that may be encountered, and discusses the available options. Because there are many data protection choices, one section explains all the main options and provides HP's recommendation for a fully protected environment. The paper also covers key HP LeftHand SAN features relating to SharePoint in some detail, and it provides a collection of best practices specific to HP LeftHand SANs. Finally, a performance benchmark section indicates how to determine SAN sizing and offers best practices for a well-performing system.

SharePoint 2007 storage architecture

One of the challenges of determining a storage strategy for SharePoint is managing all of the server components. This section briefly touches on the various components that must be planned for. This information is then applied to the discussion of the data storage options below. A complete storage strategy for SharePoint must address all of these areas.

Database servers

The content for SharePoint is contained primarily in Microsoft SQL Server databases. When customers consider data storage, these databases get much of the focus. Using these guidelines, Figure 1 shows the volumes needed for a typical to large SharePoint deployment. It assumes separate servers and includes capacity suggestions, where known. Please refer to the Microsoft document Physical Database Storage Design for complete details on specific server roles and requirements.

Figure 1. SharePoint farm and iSCSI volumes
Content databases

There is usually one content database per site collection. It contains all documents and other objects used within the sites, including master pages and page layouts. To estimate the size of the content databases, multiply the expected initial information size by 125 percent to accommodate content versioning (increasing this percentage if versioning will be heavily used). To increase performance and provide flexibility in maintaining individual site collections, dedicate one volume to each content database. Break large content databases into separate, equally sized SQL Server database (.mdf) files to match the number of physical CPUs available, at a ratio of 0.25 data files per CPU (counting dual- or quad-core CPUs as two or four CPUs, respectively). These database volumes are ideal candidates for thin provisioning, as detailed in the HP SAN/iQ thin provisioning section, because their initial size and growth rates are difficult to predict accurately.

Configuration, shared service provider (SSP), and administration content databases

These databases contain descriptions of all objects and object relationships used in the SharePoint site farms, Web applications, and site collection metadata. Their use is primarily administrative. The database files can be placed on a single 15 GB volume and the corresponding log files kept on a separate volume.

Search database

The search database contains indexes used by the various search engines within SharePoint. Dedicate one thin-provisioned volume for the search database. The average file size of the content collection greatly influences the size required for the search database. For search purposes, each file stored in the collection requires a unique record that includes the history and search logs, anchor links, and crawl statistics. For collections of moderately sized files (approximately 10 KB), size the search database at 50 percent of the size of the entire data collection.
For collections with smaller files (1 KB or less), size it at 400 percent of the size of the entire collection. Because this recommendation spans such a wide range, thin provisioning of this volume greatly improves storage efficiency.

Content index database

A content index is a full-text index of content stored on a SharePoint site. These search indexes are separate from the Microsoft SQL Server Full Text Index engine. The content index database is not a SQL Server database and does not require a log volume. It needs to be 30 to 40 percent of the size of the aggregate content databases. It should not be thin provisioned.

SQL Server tempdb database

The SQL Server tempdb database is used to hold temporary user objects such as tables, stored procedures, variables, and cursors. For better performance, tempdb data files can be equally split to match the number of physical CPUs present on the database server (counting dual- and quad-core CPUs as two and four CPUs, respectively), and each file should be stored in its own SAN volume. Pre-size the combined tempdb files to 25 percent of the size of the largest database, with auto growth enabled in SQL Server. Pre-sizing saves SQL Server the overhead of dynamically extending data files. tempdb files are reset to their original size on server restart and should be permanently resized if they grow beyond the initial size. tempdb volumes should not be thin provisioned.

SQL Server logs

The SQL Server logs should be kept on their own volume or volumes, depending on the total number of databases, log settings, and backup policies. The log volume should not be thin provisioned, because the nature of log files will cause the entire volume to be written to quickly. Size these volumes based on the guidelines in the Microsoft document Physical Database Storage Design.
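The sizing rules of thumb given above for the database volumes can be collected into a short calculation. The following Python sketch is illustrative only: the function name, its parameters, and the handling of average file sizes between 1 KB and 10 KB (where it falls back to the larger 400 percent figure) are assumptions made for this example and are not part of the Microsoft or HP guidance.

```python
def size_sharepoint_volumes(initial_content_gb, avg_file_kb,
                            versioning_factor=1.25, index_fraction=0.40,
                            largest_db_gb=None):
    """Estimate volume sizes in GB from the rules of thumb in this section."""
    # Content database: initial content size times 125 percent for versioning.
    content_db = initial_content_gb * versioning_factor

    # Search database: 50 percent of the collection for files of about 10 KB,
    # 400 percent for files of 1 KB or less. Sizes in between are not covered
    # by the guidance; this sketch conservatively uses 400 percent (assumption).
    search_factor = 0.5 if avg_file_kb >= 10 else 4.0
    search_db = initial_content_gb * search_factor

    # Content index: 30 to 40 percent of the aggregate content databases.
    content_index = content_db * index_fraction

    # tempdb: 25 percent of the largest database (the content database here
    # unless the caller supplies a larger one).
    if largest_db_gb is None:
        largest_db_gb = content_db
    tempdb = largest_db_gb * 0.25

    return {
        "content_db": content_db,
        "search_db": search_db,
        "content_index": content_index,
        "tempdb": tempdb,
    }


if __name__ == "__main__":
    for volume, size_gb in size_sharepoint_volumes(200, avg_file_kb=10).items():
        print(f"{volume}: {size_gb:.1f} GB")
```

The content and search database results are natural starting sizes for thin-provisioned volumes, while the content index and tempdb results apply to fully provisioned volumes, as recommended above.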
Web front-end and application servers

Web front-end servers

Web front-end servers are responsible for accepting HTTP requests from Web clients and for serving them HTTP responses and optional data contents, such as attached documents or images. These servers store little data in a SharePoint environment, but are the primary means of data access for clients. Their storage requirements are focused primarily on the host operating system and IIS Web server functionality. Local storage is usually adequate for these servers.

Application servers

Application servers handle most of the business logic and data access of SharePoint, with extensive use of server-side dynamic content and integration with the various databases. Application servers are usually combined with the Web front-end servers, depending on the resource requirements of any third-party applications. SAN volumes can be created for these applications, following the storage requirements of the application vendor.

SharePoint 2007 on iSCSI

SharePoint scalability and performance are exercises in resource balancing. By following the recommendations of the SharePoint Capacity Planning tool provided by Microsoft, you can determine the server topology needed to support specified user requirements. However, this tool explicitly states that the storage subsystem is not considered in the results. While the server infrastructure may be in place for a properly architected farm, it is easy for the storage subsystem to become overwhelmed by those servers, resulting in unexpectedly poor performance.

A strong, expandable storage subsystem is the key to current and future performance of midsized to large-scale Microsoft SharePoint implementations. SharePoint's flexibility and customization options can play havoc with well-planned system architectures. Administrators need a storage subsystem that allows for organic growth without requiring an overabundance of administrative work.
iSCSI storage solutions are a clear leader in SharePoint implementations because they offer the flexibility to spread I/O across many spindles and to expand existing systems.

HP LeftHand SAN integration with SharePoint 2007

Storage clustering

HP LeftHand SANs provide superior data availability, scalable performance, and a best-in-class storage management feature set at an affordable price point because, unlike legacy storage systems, HP LeftHand SANs are built upon a distributed clustered architecture. The essential building block of an HP LeftHand SAN is the storage node: a compact, high-availability server that contains its own processing power, RAM, cache, disk drives, and bandwidth. An HP LeftHand SAN clusters storage nodes together, aggregating all of these resources into a single, larger storage system.
Figure 2. An HP LeftHand SAN uses clustering for high availability, performance, and ease of management.

Each HP LeftHand SAN cluster responds to a single IP address, and every storage node in that cluster participates equally in sharing both the workload and the capacity of the whole cluster, aggregating the resources of a pool of enterprise-class, x86-architecture servers into a single iSCSI storage cluster. The cluster accepts and responds to iSCSI requests as a single unit, and it delivers all of its processing and storage resources to the application servers that use it. This means that, as you scale a storage cluster, you scale capacity and performance at the same time. The cluster consists of a single pool of storage from which you configure virtual volumes, each of whose data blocks are striped and replicated across the cluster using network RAID.

Network RAID

HP LeftHand SAN network RAID provides synchronous replication of your data, keeping your data available when a drive, a storage server, or even a whole site fails. Network RAID stripes and synchronously replicates data across a storage cluster for high availability and performance. You can choose the network RAID level that your data needs on a per-volume basis, and change it at any time. Your application server accesses the same volume regardless of any underlying SAN hardware failure.

Network RAID is a built-in feature that automatically and transparently makes multiple copies of your data on a per-volume basis. Network RAID level 0 offers simple striping without replication; levels 2, 3, and 4 replicate every block in your volume the associated number of times, distributing the copies across storage nodes. You can also set up multi-site SANs with network RAID, providing continuous data availability even through a site failure.
Automatic failover, failback, and resynchronization happen behind the scenes because they are all done within the logical volume itself. You can migrate volumes between storage clusters without taking data offline.
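Because each network RAID level stores the associated number of copies of every block, the raw capacity a volume consumes follows directly from the level chosen. The Python sketch below illustrates that arithmetic under the assumption that capacity scales exactly with the number of copies; the function name is hypothetical, and real volumes also carry disk-level RAID and snapshot overhead that is not modeled here.

```python
def network_raid_footprint(data_gb, level):
    """Return (copies, raw GB consumed, capacity overhead in percent)."""
    # Levels described in this paper: 0 (striping only) and 2, 3, 4 (replicas).
    if level not in (0, 2, 3, 4):
        raise ValueError("network RAID level must be 0, 2, 3, or 4")
    copies = 1 if level == 0 else level
    consumed_gb = data_gb * copies
    overhead_pct = (copies - 1) * 100
    return copies, consumed_gb, overhead_pct


if __name__ == "__main__":
    for level in (0, 2, 3, 4):
        copies, consumed, overhead = network_raid_footprint(250, level)
        print(f"level {level}: {copies} copies, {consumed} GB, {overhead}% overhead")
```

For a 250 GB volume at network RAID level 2, this gives a 100 percent capacity overhead, which is the figure used in the high-availability sizing example later in this paper.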
Thin provisioning

When you deploy a traditional SAN, you are asked to estimate how much storage you will need months or years later. The likelihood of making an accurate guess that far in advance is low, so it's not surprising that administrators often over-allocate and over-spend to make sure they won't run out of space. If you discover later that you have grossly over-allocated your storage, you cannot reclaim the unused space without reconfiguring your SAN.

In an HP LeftHand SAN, volume sizing with thin provisioning works differently. You can give your application servers the largest volumes they will ever need without the penalty of over-allocating your storage. HP LeftHand SANs allocate storage space only as data is written to a volume. Easy-to-read charts and alerts keep you apprised of volume utilization levels, letting you know when you are approaching storage limits. You can then purchase storage capacity when you need it.

Included in all HP LeftHand SANs, thin provisioning:

- Makes your SAN extremely space-efficient
- Perpetually optimizes storage capacity in your HP LeftHand SAN
- Never over-allocates storage, enabling you to buy only what you need
- Raises overall storage utilization and efficiency, helping to maximize your ROI

Figure 3 details utilization improvements gained through thin provisioning. The actual capacity used by a volume grows ahead of the allocated capacity; the provisioned capacity presented to the SharePoint instance remains constant.

Figure 3. Allocating additional physical storage
Snapshots

Volume-level snapshots have a number of significant advantages over disk- and tape-based backups. Snapshots effectively eliminate the backup window, as they usually take less than a second to complete. They can be restored very quickly and can be mounted as volumes on secondary servers for recovery of logical objects. Snapshots are space-efficient; only the blocks containing changed data actually take up space. Thus, a daily snapshot of a SQL Server database may consume only a small percentage of the SAN space taken up by the original database.

The largest challenge associated with snapshots is that they are inherently inconsistent because, at any one instant, many blocks of data are in flight. Recovery can thus be very difficult, depending on what was going on at the instant of the snapshot. Shutting down applications before taking a snapshot is one solution; however, this approach is impractical for 24x7 applications and databases.

Microsoft has solved the problem of live snapshots by supplying a mechanism within Windows called the Volume Shadow Copy Service (VSS). VSS coordinates a requestor (usually a backup application), a writer (the application to be backed up), and a provider (the snapshot itself). VSS provides a snapshot that is a consistent image of a volume's data and that can be much more easily restored. SharePoint and SQL Server databases can be backed up via VSS snapshots, resulting in a set of snapshots for each volume on the SAN that contains SharePoint data.

Restoration of a snapshot is fast; however, determining which volumes to recover and getting the whole system re-synchronized is difficult. Thus, the major disadvantage of snapshot backups is the expertise required to perform the recovery. Ideally, a good requestor (i.e., backup application) could manage this, but to date no vendor has stepped up and provided a product for SharePoint that utilizes vendor snapshots.
Snapshots also do not protect against failure of the storage subsystem; the snapshots are lost if the parent volume is destroyed. Because of this restriction and their manageability issues, SAN-based SharePoint snapshots are usually treated as a last line of defense instead of taking a primary backup role.
Figure 4. HP LeftHand SAN snapshots mounted to additional servers as writable volumes

Remote copy

Remote copy is an HP LeftHand SAN-specific offering that allows volume snapshots to be shipped to another HP LeftHand SAN at a remote location. Like shipping tapes to an offsite vault, it provides protection against total site disasters. The underlying mechanism is efficient and, like snapshots, does not require backup windows or put a load on the application servers. Remote copy can work over any type of network link, and it sends only data that has changed to the remote site. VSS or some other synchronizing software should be used, or else SharePoint should be shut down at the time of the remote copy snapshot on the local SAN. If a restore is needed, some expertise is required to determine which volumes need to be restored to what locations in order to get SharePoint back up and consistent across all of its databases.
Figure 5. Remote copy of SharePoint volumes between two locations

1. HP SAN/iQ Software creates a snapshot of the volume.
2. The snapshot is copied to the remote cluster either physically or via the network. Watermarks prevent confusion between local and remote volumes.
3. Asynchronous replication schedules send only the changed blocks to the remote site. Different retention policies enable you to save recent copies or a history of copies for recovery.
4. Remote volumes can be promoted for disaster recovery or simple backup.

HP LeftHand Solution Pack

The HP LeftHand Solution Pack for Microsoft Windows Server dramatically improves data storage integration, simplifies data protection, enhances performance, and reduces restore time across the SharePoint farm. The Solution Pack includes the HP LeftHand VSS Provider and HP LeftHand DSM for MPIO, making the HP LeftHand SAN a powerful choice for Windows administrators looking to simplify the move to a storage area network.

HP LeftHand VSS Provider

In Microsoft Windows Server 2003 and 2008 environments, Microsoft delivers a software Volume Shadow Copy Service (VSS) framework that facilitates communication between applications and storage, allowing for consistent point-in-time copies of data (shadow copies) for archival and restoration purposes. The framework consists of requestors, which initiate and manage the backup; writers, which prepare an application or file system for shadow copy creation; and providers. The HP LeftHand VSS Hardware Provider interfaces between the VSS framework and the storage, executing the snapshot within the storage hardware without OS intervention.

The HP LeftHand VSS Provider simplifies the process of using snapshots with SharePoint and third-party backup software. This technology allows the HP LeftHand SAN to create consistent point-in-time copies of critical application data that can easily be used to recover any portion of the SharePoint server farm.
These snapshots:

- Are highly reliable, because they are created after the application has been quiesced
- Are space-efficient, because they use HP LeftHand SAN thin provisioning functionality
- Have less impact on the SharePoint services than traditional snapshots, because they are created on the back-end SAN rather than on SharePoint servers

The Microsoft VSS Developer's Kit contains the vshadow utility, which acts as a minimal requestor (backup application) to initiate the creation of a shadow copy. Snapshots may remain local (non-transportable) or may be transported to another server where the actual backup is performed. The utility can be used as part of a full backup procedure; as a transportable, one-off backup creator; or for testing and verification purposes, as shown here.

1. Download and install the Volume Shadow Copy Service SDK 7.2.
2. Install the SAN/iQ VSS Provider and Authentication Console from the HP LeftHand Solution Pack for Microsoft Windows Server installation package.
3. Configure the SAN volumes and Microsoft iSCSI initiator connections as normal.
4. Configure the HP LeftHand VSS Hardware Provider in the Authentication Console as detailed in the HP LeftHand Solution Pack for Microsoft Windows user guide.
5. Create the VSS snapshot of the volumes (represented as the e: and f: drives in this example) from a command prompt with the following:
   vshadow -p e: f:
6. Verify that the snapshots are created within the HP LeftHand centralized management console (CMC).

HP LeftHand DSM for MPIO

The HP LeftHand Device-Specific Module (DSM) for the Windows Server Multipath I/O (MPIO) infrastructure provides superior path failover and performance capabilities. The HP LeftHand SAN has unique distributed system characteristics, which give the end user superior fault tolerance. The HP LeftHand DSM for MPIO further leverages distributed system technologies and brings them to the Windows iSCSI driver. The HP LeftHand DSM provides enhanced MPIO functionality, as follows:

- An I/O path is built to each storage module in the cluster on which the volume resides.
  The HP LeftHand DSM handles all path creation for the administrator automatically, unlike other native MPIO solutions that require manual path creation.
- A superior performance architecture over the native (default) Windows MPIO solutions:
  - Read I/Os are always serviced by a module that holds a copy of the data being requested.
  - Write I/Os are always serviced by a module that holds a copy of the data. Remaining copies, or replicas, of the data in question are forwarded to the appropriate storage module(s) based on the volume replication algorithm: 0-way, 2-way, 3-way, or 4-way replication.
Because an I/O path is built to every storage module in the cluster, the resulting fault-tolerant solution is superior to standard MPIO architectures, which are typically dual-path only. For example, if there are five storage nodes in the cluster, DSM-connected volumes will have five iSCSI MPIO connections to the SAN (plus one or two administrative sessions). Four of the five connections could actually go down and I/O would still be serviced.

Note: The HP LeftHand DSM for MPIO is compatible with the Microsoft iSCSI initiator only. It is not compatible with iSCSI host bus adapters (HBAs).

The HP LeftHand DSM for MPIO is a server-side plug-in that connects to the Microsoft iSCSI MPIO driver framework. The DSM understands the data map of the volume(s) on the storage cluster, using the patented HP LeftHand replication algorithm to read and write to exactly the correct storage node. Figure 6 demonstrates how the DSM works with the Microsoft iSCSI driver to build MPIO connections to the SAN.

Figure 6. HP LeftHand DSM for MPIO

SAN best practices for SharePoint 2007

Performance sizing recommendations

To size a SAN for performance, application requirements must be translated to either throughput (measured as MB/second) or I/O operations per second (IOPS). For SharePoint, IOPS are the proper metric; most of the I/O originates from the underlying SQL Server databases. SQL Server database I/O is composed of 8 KB blocks with a random I/O pattern and a mix of reads and writes. Log I/O is larger: 128 KB blocks with sequential writes. Taken together, the I/O profile is small-block, random I/O with a 60:40 write/read ratio. Obviously, different SharePoint usage models will cause this pattern to vary, but these figures provide a baseline definition.
User definitions

Microsoft has proposed a common set of definition examples for users in the Microsoft Office SharePoint Server 2007 documentation, describing the frequency of use and the ratio of supported users to requests per second (RPS) for each user definition. The user definitions (light, typical, heavy, and extreme) are shown in Table 1 below:

Table 1: SharePoint Server user definitions

User type    Accesses/hour    Active users/RPS
Light        20               180
Typical      36               100
Heavy        60               60
Extreme      120              30

Sizing the SAN

Assuming that all SharePoint volumes are on a single SAN, a single SharePoint request generates between 60 and 100 I/O operations. I/Os may range from 100 to 200 per request for documents larger than 2 MB, such as MP3s, videos, and high-resolution graphics. The above user profile definitions translate to IOPS per document size, as shown in Table 2 below:

Table 2: IOPS per document type

User type    IOPS/standard document (<2 MB)    IOPS/large document (>2 MB)
Light        0.4                               0.8
Typical      0.75                              1.5
Heavy        1.25                              2.5
Extreme      2.5                               5

Various drive types have different IOPS ratings. Table 3 shows industry-standard ratings for 8 KB random IOPS for 7,200 rpm SATA and 15k rpm SAS drives, assuming RAID 5 overhead and caching.

Table 3: Standard drive IOPS

Drive type    8 KB random IOPS (RAID 5)
SATA          90
SAS           135

The user-generated IOPS calculations are straightforward. Determine how many of each type of user will be using the system concurrently, the total IOPS requirement, and the type and number of disk drives needed to deliver that number of IOPS. For example, a server farm with 300 typical and 200 heavy concurrent users in a Microsoft Office collaboration environment (i.e., normal documents) would generate approximately 475 IOPS (based on [300 typical users x 0.75 IOPS] + [200 heavy users x 1.25 IOPS]); it would require four 15k rpm SAS drives to meet that sustained rate.

The last task is to add in the IOPS and capacity requirements for the high-availability (HA) overhead.
This includes the disk-level RAID, the network RAID features of the HP LeftHand SAN, and the thinly provisioned snapshots. Table 4 details examples of the overhead required for these features based upon conservative, performance-focused preferences.
Table 4: HA overhead example

Configuration/Utility                  % IOPS overhead    IOPS        % capacity overhead    Capacity (GB)
SharePoint (collaboration)             N/A                475         N/A                    250
RAID 5 for all disk drives             N/A                N/A         17%                    42.5
Network RAID level 2 for SharePoint    50%                238         100%                   250
Snapshots (5 days) for SharePoint      25%                119         15%                    37.5
TOTAL                                                     832 IOPS                           580 GB

In this example, 832 IOPS are required for optimal performance during peak times. This requirement can be met with seven 300 GB 15k rpm SAS drives; with the HP LeftHand Network Storage Module (NSM) 2060, that means two storage nodes (six drives each). This configuration provides 1.29 TB of usable capacity, far more than the 580 GB of raw capacity required, so space will not be an issue.

It is important to note that these IOPS calculations are designed to deliver optimal performance to a SharePoint user community. This is defined as a system that delivers less than one second response time for most normal requests. Storage is only one component of user performance, and these calculations all assume there are no other bottlenecks in the system. It is important to follow the Microsoft guidelines for sizing servers and loading server farms.

Since response time requirements are subjective and vary from site to site, the IOPS calculations should be used as a sanity check against the capacity requirements. For example, if in the above example the total capacity requirements of the system can be met with two drives (2 x 300 GB = 600 GB), but the IOPS calculations say seven drives for optimal performance, then the right answer is somewhere between two and seven drives, depending on how important performance is, how much budget sensitivity there is, and so on. With HP LeftHand SANs, the granularity of the decision boils down to how many nodes to purchase. In the case above, the decision is between one and two nodes, and because two nodes is the minimum configuration for all the high-availability features, this decision is a simple one.
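The sizing arithmetic in this section can be reproduced with a few lines of Python. The sketch below uses the per-user figures from Table 2, the drive ratings from Table 3, and the overhead percentages from Table 4. The function names are illustrative, rounding each overhead up to a whole IOPS is a choice made to match the table, and the 75 I/Os per request used in iops_per_user is an assumed midpoint of the stated 60 to 100 range; it appears to reproduce the standard-document column of Table 2, but the paper does not state how those figures were derived.

```python
import math

# Table 2 (standard documents) and Table 3 values from this paper.
IOPS_PER_USER = {"light": 0.4, "typical": 0.75, "heavy": 1.25, "extreme": 2.5}
DRIVE_IOPS = {"SATA": 90, "SAS": 135}


def iops_per_user(accesses_per_hour, ios_per_request=75):
    """Per-user IOPS; 75 I/Os per request is an assumed midpoint, not a stated value."""
    return accesses_per_hour * ios_per_request / 3600


def user_iops(concurrent_users):
    """Sum the Table 2 IOPS over a mapping of user type to concurrent user count."""
    return sum(IOPS_PER_USER[kind] * count for kind, count in concurrent_users.items())


def total_iops(base_iops, overhead_fractions=(0.50, 0.25)):
    """Add the Table 4 IOPS overheads (network RAID 2, snapshots), each rounded up."""
    return math.ceil(base_iops) + sum(math.ceil(base_iops * f) for f in overhead_fractions)


def total_capacity_gb(data_gb, overhead_fractions=(0.17, 1.00, 0.15)):
    """Add the Table 4 capacity overheads (RAID 5, network RAID 2, snapshots)."""
    return data_gb + sum(data_gb * f for f in overhead_fractions)


def drives_needed(iops, drive_type="SAS"):
    """Smallest whole number of drives whose combined rating covers the IOPS."""
    return math.ceil(iops / DRIVE_IOPS[drive_type])


base = user_iops({"typical": 300, "heavy": 200})
peak = total_iops(base)
print(base, drives_needed(base))   # user-generated IOPS and drives for that load
print(peak, drives_needed(peak))   # IOPS with HA overhead and drives for that load
print(total_capacity_gb(250))      # capacity in GB with HA overhead
```

Changing the user mix, the drive type, or the overhead fractions shows how quickly the drive count moves, which is useful when weighing performance sizing against capacity sizing as described above.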
In larger configurations, the difference between performance sizing and capacity sizing can be as many as two or three nodes.

Volume configuration, partition alignment, and formatting

Volume configuration

SharePoint requires the creation of a number of iSCSI volumes. It would be possible, in a single-server environment, to specify a single iSCSI volume for everything in SharePoint; however, in production environments, it makes sense to create multiple volumes for the various databases. Here are some general guidelines:

- Keep all the main database files in their own volumes. This will make it much easier to expand your server farm by adding more servers.
- Put database logs in a separate volume from the database files. This enables various recovery scenarios, increases database performance, and allows the use of thin provisioning for the database volumes.
- Use thin-provisioned volumes for the database volumes.
- Use fully provisioned volumes for the log volumes.
Partition alignment

This section describes how to configure Windows Server 2003 disk partitions to be aligned optimally for HP LeftHand storage. By default, Windows Server 2003 does not align a new partition to the physical disk on which the partition resides. Correct partition alignment helps reduce latency when the partition is written to, because it eliminates the unnecessary disk writes and reads that occur when partitions are not aligned.

Windows partitions should be aligned at 64 KB for best results. With a physical disk that maintains 64 sectors per track, Microsoft Windows always creates the partition starting at the 64th sector, which misaligns it with the underlying physical disk. To be certain of disk alignment, use Diskpart.exe, a disk partition tool. Provided by Microsoft in the Windows Server 2003 Service Pack 1 support tools, Diskpart.exe can explicitly set the starting offset in the master boot record (MBR). Setting the starting offset correctly will align database I/O with storage track boundaries and improve disk performance. SQL Server database I/O is composed of 8 KB blocks, so make sure that the starting offset is a multiple of 8 KB. Failure to do so may cause a single I/O operation to span two tracks, causing performance degradation.

Note: This can only be done when creating a new partition, before formatting. It is not possible to align a partition that already has data on it without losing that data.

Aligning a partition with diskpart for Windows Server 2003

Open a command prompt and type:
   diskpart.exe
Enter:
   > list disk
Note the number of the disk on which you want to create the partition.
Enter:
   > select disk (disk number)
Enter:
   > create partition primary align=64
Enter:
   > assign letter=(the letter you want the drive to have)
Or enter:
   > assign mount=(the path of an empty directory to mount the drive to)
Enter:
   > exit (to exit diskpart)
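The effect of the align=64 setting can be checked with simple modular arithmetic. The Python sketch below is illustrative only: it assumes 512-byte sectors, which this paper does not state, and the is_aligned helper is a hypothetical name. It compares the default starting offset (the 64th sector, which sits 63 sectors into the disk) with the 64 KB offset requested in the diskpart steps above.

```python
SECTOR_BYTES = 512  # assumed sector size; not stated in this paper
KB = 1024


def is_aligned(offset_bytes, boundary_kb):
    """True when the partition offset is a whole multiple of the boundary."""
    return offset_bytes % (boundary_kb * KB) == 0


# The default partition begins at the 64th sector, so 63 sectors precede it.
default_offset = 63 * SECTOR_BYTES
# Offset requested with "create partition primary align=64".
aligned_offset = 64 * KB

for name, offset in (("default", default_offset), ("align=64", aligned_offset)):
    print(name, offset, is_aligned(offset, 64), is_aligned(offset, 8))
```

Under these assumptions, the default offset is a multiple of neither 64 KB nor 8 KB, so an 8 KB database I/O can straddle a boundary, whereas the 64 KB offset satisfies both conditions.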
Volume formatting

HP LeftHand SAN thin provisioning allows volumes to be created in the SAN without pre-allocating storage space. This feature greatly increases overall utilization of the SAN and removes the challenge of predicting future storage requirements. Storage space is allocated as data is written to the volume. Easy-to-read charts and alerts keep you apprised of volume utilization levels, letting you know when you are approaching storage limits. You can then purchase storage capacity when you need it.

When performing a full format, Windows Server 2008 writes to the entire disk, whereas previous versions of Windows performed a read. Microsoft recommends performing a quick format when using on-demand allocation modes, such as HP LeftHand SAN thin provisioning. This avoids the initial write to the entire disk, maintains on-demand utilization, and completes in seconds rather than the minutes to hours a full format takes on large disks.

Backup and restore configuration and operation

This section addresses a very common complaint about planning a data protection strategy for SharePoint: there are too many overlapping choices. Many vendors have data protection features built into their products; other vendors engineer specific solutions in this space. This section lists the common data protection technologies and briefly discusses the pros and cons of each. At the end of this section, a recommendation is included for customers who are deploying SharePoint on an HP LeftHand SAN.

Versioning and recycle bin

Using document versioning and SharePoint recycle bins, end users can recover their own documents or go back to earlier versions of existing documents. Versioning offers good protection against overwrites, and the recycle bin protects documents against accidental deletion. There are two recycle bins: a first-stage bin, for end users, that retains deleted documents for a certain number of days; and a second-stage bin for administrators.
Both versioning and recycle bins require additional capacity and add overhead, so account for them when determining how much additional storage to plan for. Virtually all SharePoint deployments use these capabilities to some degree to address the majority of end-user issues related to the recovery of individual documents. SharePoint also offers a utility to protect against deleted sites. This is not part of the distributed product, but can be downloaded from the Microsoft TechNet site. (A link is provided below in the Additional resources section.)

SharePoint backups

Microsoft Office SharePoint Server 2007 includes a backup and restore tool, which is available from the central administration site or can be accessed using the STSADM command-line tool. This utility backs up most files and databases to another disk location. A third-party tool can then be used to back the files up to tape. The backup and restore tool has the advantage of being integrated into the SharePoint components. However, it does not have a scheduling facility, is not very fast, consumes a large amount of space, does not release old backups, does not back up customizations, and can have a negative impact on system performance while the backups are being captured. More importantly, this method applies only to SharePoint. All other applications, databases, and files must be backed up using a different method. Because of some of these issues, Microsoft recommends SharePoint backups only for small to medium SharePoint deployments. See Data Protection and Recovery for Office SharePoint Server in Small to Medium Deployments for more details on restrictions.
SQL Server backups

The embedded SQL Server backup mechanism can also be used. The benefit here is that the procedure is consistent with the way other SQL Server databases are administered. However, the administrator must know which databases to back up and, more importantly, how to restore them. SharePoint or file backup must still be used for other components, such as the search index and the non-SQL Server pieces. As with SharePoint backups, SQL Server backups support only disk backup and are designed to be used in conjunction with a tape backup product.

Continuous data protection products

Continuous data protection (CDP) products monitor all I/O operations for a particular application and then copy these I/Os to a CDP server, where they are applied to a duplicate information store on that server. For databases like SQL Server, these products generally use VSS to synchronize the database periodically, providing consistent recovery points that administrators can restore to. The benefits of CDP products are that they lose very little data in the event of a restore, offer granular restore of individual items, and provide point-in-time recovery of data in case of corruption. For SharePoint, these products take into account the relationships between dispersed data components and coordinate getting all of them back together in the proper place, thus eliminating the expertise required to do restores from snapshots. Most CDP products also have a remote capability to ship data to another CDP server at a remote location. The downside of these products is that they generate large amounts of local area network (LAN) traffic, because all backup data is transferred live over the LAN to the CDP server. The resulting capacity requirements can also be quite large, ranging from three to seven times the size of the production data.
Even though the backups are continuous and therefore do not require a backup window on the application servers, restores are done over the LAN via data copying and can therefore be time-consuming if the amount of data being recovered is large. CDP products are a good way to gain the ability to do a logical restore of individual items, but they are not as good a mechanism for full restores of entire systems.

General backup and restoration recommendations

• Use network RAID level 2 for all volumes containing SharePoint databases or files. This protects against double drive failures, any hardware failure within a storage node, and any SAN network failures.
• Use a CDP product that supports SharePoint as the primary backup. Microsoft System Center Data Protection Manager provides item-level restore capability and full point-in-time recovery of the SharePoint environment in the event of a total failure of the production SAN or corruption of any of the primary databases.
• Use the HP LeftHand SAN multi-site configuration to spread your SAN across two local data centers for protection against local site or data center disasters. This is included in HP LeftHand SAN and does not require any additional administrative expertise or hardware.
• Use the remote copy facility of your CDP product for protection against metro-wide disasters. This provides a long-distance remote copy and takes advantage of the CDP product's recovery capabilities. HP LeftHand SAN remote copy can also be used here, but its recovery management capabilities are not as complete.
• Take nightly VSS snapshots of the database and log volumes of the SQL Server databases for a second tier of protection.
• Use a tape-based backup solution for a third tier of protection or long-term archive.

Network RAID level 2, used in combination with a CDP product such as Microsoft System Center Data Protection Manager, provides a complete solution for protecting your SharePoint environment from all but large-scale disasters. It also enables easy recovery of logical items such as sites, site collections, and documents.
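Because a CDP product is recommended here as the primary backup, it can help to turn the capacity range quoted for CDP products (three to seven times the size of the production data) into a rough planning figure. The following is a minimal sketch of that arithmetic only; the function name and the 500 GB example are hypothetical, and actual requirements depend on the product and its retention settings.

```python
def cdp_capacity_range_gb(production_gb: float,
                          low_factor: float = 3.0,
                          high_factor: float = 7.0) -> tuple:
    """Estimate the low and high CDP storage needs for a given production data size."""
    if production_gb < 0:
        raise ValueError("production_gb must be non-negative")
    return (production_gb * low_factor, production_gb * high_factor)


# Hypothetical example: 500 GB of production SharePoint data.
low_gb, high_gb = cdp_capacity_range_gb(500)
print(f"Plan for roughly {low_gb:.0f} GB to {high_gb:.0f} GB of CDP storage")
```

The multipliers are exposed as parameters so that a figure supplied by the CDP vendor can replace the general range.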
Conclusion

HP LeftHand iSCSI SANs, based on HP SAN/iQ Software, are an ideal choice for Windows SharePoint deployments. With high-performance HP server hardware for LeftHand SANs, the resulting storage cluster can scale to higher performance levels and reliability than many purpose-built storage devices.

HP LeftHand SANs scale extremely well for SharePoint. The storage can easily scale from both capacity and performance perspectives when more users and/or demand are added to the system. Because it is highly automated, the HP LeftHand SAN is extremely easy to administer using existing Windows and networking administrative expertise. HP SAN/iQ Software includes a wide range of management, data protection, and availability tools that provide administrators with the functionality they need, regardless of deployment size and application criticality. Finally, because system performance can scale in a linear fashion as more capacity is added, HP LeftHand SANs offer exceptional price/performance.

For more information

For more information on HP LeftHand P4000 SAN Solutions based on HP SAN/iQ Software, visit www.hp.com/go/p4000.
Additional resources

Physical database storage design: www.microsoft.com/technet/prodtechnol/sql/2005/physdbstor.mspx
SharePoint capacity planning tool: http://technet.microsoft.com/en-us/library/bb961988.aspx
White paper: Performance recommendations for storage planning and monitoring: http://technet2.microsoft.com/office/en-us/library/8dd52916-f77d-4444-b593-1f7d6f330e5f1033.mspx?mfr=true
Plan for performance and capacity (Office SharePoint Server): http://technet2.microsoft.com/office/en-us/library/8dd52916-f77d-4444-b593-1f7d6f330e5f1033.mspx?mfr=true
Estimate performance and capacity requirements for Windows SharePoint Services collaboration environments (Office SharePoint Server): http://technet2.microsoft.com/office/en-us/library/8dd52916-f77d-4444-b593-1f7d6f330e5f1033.mspx?mfr=true
"Planning and Monitoring SQL Server Storage for SharePoint: Performance Recommendations and Best Practices": http://office.microsoft.com/download/afile.aspx?assetid=am102509151033
VShadow Tool and Sample documentation: http://msdn.microsoft.com/en-us/library/bb530725.aspx
Windows Server backup utility: http://technet.microsoft.com/en-us/library/cc754572.aspx
White paper: Data protection and recovery for Office SharePoint Server: http://technet.microsoft.com/en-us/library/cc262129.aspx
Plan for availability (Office SharePoint Server): http://technet.microsoft.com/en-us/library/cc263044(technet.10).aspx#section4
Appendix: SQL examples

Moving tempdb

This script can be modified with the directory path to move the tempdb after the initial installation of SQL Server. SQL Server must be restarted to complete this change.

    USE master
    GO
    ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = '[New data folder path]\tempdb.mdf')
    GO
    ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = '[New log folder path]\templog.ldf')
    GO

Adding and sizing tempdb data files

This script can be modified with the directory path to size and split the tempdb. SQL Server must be restarted to complete this change.

    USE master
    GO
    ALTER DATABASE [tempdb] MODIFY FILE (NAME = N'tempdev', SIZE = [Calculated] KB)
    GO
    ALTER DATABASE [tempdb] ADD FILE (NAME = N'tempdev_02', FILENAME = N'[Second data file path]\tempdev_02.ndf', SIZE = [Calculated] KB, FILEGROWTH = 10%)
    GO
    ALTER DATABASE [tempdb] ADD FILE (NAME = N'tempdev_03', FILENAME = N'[Third data file path]\tempdev_03.ndf', SIZE = [Calculated] KB, FILEGROWTH = 10%)
    GO
    ALTER DATABASE [tempdb] ADD FILE (NAME = N'tempdev_0N', FILENAME = N'[Nth data file path]\tempdev_0n.ndf', SIZE = [Calculated] KB, FILEGROWTH = 10%)
    GO
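The scripts above leave the [Calculated] size to the administrator. One possible way to arrive at it, shown here only as a sketch, is to split a planned total tempdb size equally across the data files. The equal split is an assumption of this example, not something the script requires, and the function name and the 8 GB, four-file figures are hypothetical.

```python
def tempdb_file_size_kb(total_tempdb_gb: float, data_file_count: int) -> int:
    """Split a planned total tempdb size equally across data files; result is in KB."""
    if data_file_count < 1:
        raise ValueError("data_file_count must be at least 1")
    total_kb = int(total_tempdb_gb * 1024 * 1024)  # GB -> KB
    return total_kb // data_file_count


# Hypothetical example: 8 GB of tempdb spread over four data files.
size_kb = tempdb_file_size_kb(8, 4)
print(f"SIZE = {size_kb} KB per data file")
```

The resulting number would replace each [Calculated] placeholder in the ALTER DATABASE statements.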
Splitting large database files

Often, SharePoint solutions are implemented without proper planning for storage requirements, which leads to disk I/O issues. To move from the out-of-the-box configuration to one that is more highly scalable, it may be necessary to redistribute content from a single content database data file to multiple data files. This procedure causes all of the content from the original (.mdf) file to be distributed equally among the other (new .ndf) files in the file group. The script may take several hours to complete if the source database file is extremely large. Make sure that a validated backup of the database has been taken recently. Add additional data files to the primary file group of the database, following the Microsoft recommendation of 0.25 data files per CPU, and then run the script.

    USE [Content_DB_Name]
    GO
    DBCC SHRINKFILE (N'Content_DB_Primary_File_Logical_Name', EMPTYFILE)
    GO

Copyright 2009 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein. Microsoft and Windows are U.S. registered trademarks of Microsoft Corporation. Intel and Xeon are trademarks of Intel Corporation in the United States and other countries. 4AA2-5499ENW, April 2009