White Paper

SHAREPOINT 2010 REMOTE BLOB STORES WITH EMC ISILON NAS AND METALOGIX STORAGEPOINT

Abstract

This white paper describes how to externalize Microsoft SharePoint Server 2010 BLOB stores to EMC Isilon scale-out NAS by using Metalogix StoragePoint.

January 2013
Copyright 2013 EMC Corporation. All Rights Reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

The information in this publication is provided as is. EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.

EMC2, EMC, the EMC logo, Isilon, OneFS, AutoBalance, and SmartConnect are registered trademarks or trademarks of EMC Corporation in the United States and other countries. VMware is a registered trademark or trademark of VMware, Inc. in the United States and/or other jurisdictions. All other trademarks used herein are the property of their respective owners.

Part Number h11371
Table of Contents

Executive summary
Solutions overview
Key results/recommendations
Introduction
Purpose
Scope
Audience
Terminology
Fundamentals of SharePoint storage
Technology overview
Advantages of EMC Isilon scale-out storage
Scalability
Simplicity
Automatic balancing for performance and efficiency
Data protection
Reduced costs through greater utilization
Metalogix StoragePoint
File Share Librarian
Architecture
Hardware and software resources
Physical deployment
Storage configuration
Disk pool configuration
Baseline performance results
Test Scenario 1: adding files to SharePoint using File Share Librarian
Configuration
Results
Scenario 2: externalizing BLOBs with StoragePoint
Configuration
Results
Scenario 3: performance during externalization
Results
Validation and testing
Test methodology
Notes on performance test results
Results summary
Storage performance results
Conclusion
References
About EMC Isilon
Contact EMC Isilon
Executive summary

The Microsoft SharePoint market is growing steadily, and more businesses are relying on SharePoint as a business-critical application. SharePoint provides a wide range of functionality that encompasses enterprise content management, collaboration, social networking, and document management. Nearly 75 percent of businesses surveyed cite document collaboration as the most often used feature within SharePoint. Customers are interested in migrating content to SharePoint and using the rich functionality available when data is managed in a structured framework.

One significant challenge that all SharePoint administrators face is that all data in the content database is stored within SQL Server. Because of performance requirements, SQL Server databases are typically stored on the highest-performing set of disks possible. This means that file-based content such as PDFs, Microsoft Office documents, media files, and other large files is stored in SQL Server databases residing on the highest-performing tier of storage. An alternative is to use a lower tier of storage to house these files (also known as Binary Large Objects, or BLOBs) while keeping SQL database content on the highest-performing set of disks.

Other major inhibitors are the migration time and effort required to move content from file shares to SharePoint, and the performance and cost challenges created by storing unstructured content in SharePoint's SQL Server content database.

This white paper describes how to externalize Microsoft SharePoint Server 2010 BLOB stores to EMC Isilon scale-out NAS by using Metalogix StoragePoint software. StoragePoint can rapidly migrate file-based content into SharePoint and transparently externalize this unstructured content using tiered storage to keep the Microsoft SQL Server database lean and responsive. This offers users and administrators all the benefits of SharePoint's content management features and access controls.
Solutions overview

The solution combines Metalogix StoragePoint software and EMC Isilon scale-out NAS to provide a cost-effective SharePoint content storage solution that reproduces a standard customer environment that includes both structured and unstructured data. StoragePoint is used to catalog file share data into SharePoint and to move and manage the BLOBs outside of SQL Server. StoragePoint is also used to enable remote BLOB storage for SharePoint 2010 on EMC Isilon scale-out NAS. This solution uses Isilon storage as the platform for virtualizing Microsoft SQL Server and SharePoint 2010 servers, as well as providing storage for remote BLOBs.

Key results/recommendations

As described in this paper, the integration of StoragePoint with EMC Isilon is a highly efficient method for moving file share content into SharePoint, and for externalizing unstructured SharePoint content into remote BLOB storage. With this solution you can:

- Use StoragePoint to improve your organization's overall storage usage and reduce the physical footprint while still meeting SLAs.
- Use StoragePoint to effectively manage the externalization of SharePoint BLOBs out of Microsoft SQL Server content databases.
- Create SharePoint metadata that points to content located in the file share.
- Use EMC Isilon as a highly efficient and resilient remote BLOB repository for SharePoint data.

Introduction

Purpose

This white paper provides guidance and recommendations for the use of remote BLOB storage for SharePoint using Metalogix StoragePoint with EMC Isilon NAS storage.

Scope

This white paper presents design guidelines, best practices, and validated test results that were identified during the configuration and test phases of the solution.

Audience

This white paper is intended for:

- SharePoint architects
- EMC Partners
- EMC employees and field personnel

Terminology

Table I lists key terminology referenced in this white paper.

Table I: Solution terminology

Term / Definition

Binary large object (BLOB)
    A collection of binary data stored as a single entity in a database management system. BLOBs are any files stored in the database, that is, documents, spreadsheets, media files, etc.

BLOB store
    External storage location for BLOBs in a SharePoint farm or environment, for example, an SMB share.

Content database
    A Microsoft SQL Server database that resides in the SharePoint farm and contains the content users save in SharePoint. Content can be separated into multiple content databases at the site-collection level. A content database can include one or more site collections, but a single site cannot span multiple databases. Backing up and restoring sites takes place at the content database level.
Externalization
    Externalization dissects the content stored in a SQL Server database into two parts (metadata and BLOB). The BLOB can then be routed to a more cost-effective tier of storage, which reduces the load on SharePoint and improves efficiency.

Remote BLOB Storage (RBS)
    A SQL Server API that lets database administrators store binary large objects in commodity storage instead of directly on the main database server. An RBS provider, such as Metalogix StoragePoint, is required to perform the externalization and related ongoing BLOB management tasks.

Fundamentals of SharePoint storage

Before looking at the architecture of the solution, it is important to understand how SharePoint stores documents. Most SharePoint deployments host many different types of files, including Word, Excel, PowerPoint, PDF, and even media and image files. By default these files are stored in the SQL Server relational content database. The content database contains metadata about the file, while the file itself is stored in an unstructured data format called a BLOB.

Over time, as the content stored in SharePoint increases, the content database grows considerably. Meanwhile, the frequency with which older data is accessed decreases.

Earlier versions of SharePoint stored all data, both metadata and documents, in SQL databases. Because SQL Server databases are typically stored on Tier-1 storage, this often meant that even older, less-valuable SharePoint data was still being stored on the most expensive disk space.

Beginning with SharePoint 2007 Service Pack 1, Microsoft introduced an API that makes it possible to store BLOBs outside of SQL Server while keeping metadata, and pointers to the associated documents, inside the content database. This externalization requires additional software that uses the database API to remove the BLOBs from the SQL database and to manage them on an ongoing basis in the external EMC Isilon location.
In the solution described in this document, Metalogix StoragePoint is the RBS provider that performs the externalization and manages the external storage location. EMC Isilon scale-out NAS provides the external storage location.

As shown in Figure 1, BLOBs that were originally stored in the content databases are moved to file-based storage using StoragePoint. Externalizing BLOBs means that the BLOB data stream itself is moved to or stored in a file share, and only metadata is stored within the SQL Server content database.

It is important to note that externalizing BLOBs has no effect on SharePoint functionality. The externalization process is handled outside of SharePoint, which means that SharePoint permissions, versioning, and applied workflows are not affected by the process of externalization. Using StoragePoint and Isilon to remotely store BLOBs is 100% transparent to SharePoint users.
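The split between metadata and BLOB described above can be illustrated with a minimal Python sketch. It is a conceptual model only and does not use or represent the StoragePoint or SQL Server RBS APIs; the ContentRow type, the externalize and read_blob functions, and the hash-based folder layout are all hypothetical names and choices made for illustration.

```python
import hashlib
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ContentRow:
    """Hypothetical stand-in for a content database row: metadata only."""
    doc_id: str
    file_name: str
    size_bytes: int
    blob_ref: str  # pointer to the externalized BLOB, relative to the BLOB store


def externalize(doc_id: str, file_name: str, payload: bytes, blob_store: Path) -> ContentRow:
    """Write the BLOB to the external store and return the row that points to it."""
    digest = hashlib.sha256(payload).hexdigest()
    # Illustrative layout (an assumption): spread BLOBs over subfolders by hash prefix.
    target = blob_store / digest[:2] / digest
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)
    return ContentRow(doc_id, file_name, len(payload),
                      target.relative_to(blob_store).as_posix())


def read_blob(row: ContentRow, blob_store: Path) -> bytes:
    """Resolve the pointer held in the metadata row and read the BLOB back."""
    return (blob_store / row.blob_ref).read_bytes()
```

In this model the database row stays small regardless of document size, and a reader only needs the row plus the location of the BLOB store (for example, an SMB share path) to retrieve the document.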
Figure 1: Solution architecture before and after externalizing BLOB stores

Technology overview

This section provides an overview of the primary technologies used in this solution:

- EMC Isilon scale-out NAS storage
- Metalogix StoragePoint
- Microsoft SharePoint Server

Advantages of EMC Isilon scale-out storage

EMC Isilon scale-out NAS is a highly scalable storage platform with a modular storage architecture that can grow easily with your business. Isilon storage is powered by the EMC Isilon OneFS operating system, an intelligent, distributed file system that is behind all Isilon scale-out storage solutions. Isilon storage solutions can help you accelerate processes and workflows while scaling easily to handle massive growth and providing the highest levels of data protection available. Specific advantages of EMC Isilon storage are described in this section.
Scalability

EMC Isilon scale-out NAS provides massive room for growth. Isilon storage solutions scale from a minimum 3-node cluster configuration that provides 18TB of capacity to a 144-node cluster with 20PB of capacity, all in a single-file-system, single-volume architecture that is remarkably easy to manage. As additional nodes are added to a cluster, all components of the cluster scale symmetrically and predictably. A new storage node can be added to an Isilon cluster in as little as 60 seconds, eliminating the need to over-provision storage for future requirements.

An Isilon storage system can scale capacity and performance simultaneously or independently, resulting in an agile storage environment that can adapt quickly to changing application workloads. For example, if a given workload is constrained by inadequate disk I/O throughput, more platform nodes can be dynamically added to the cluster. This improves overall performance and increases capacity almost immediately.

Additionally, the Isilon storage architecture simplifies the hardware-refresh cycle by allowing the easy addition and retirement of nodes while keeping the cluster online, and the data available, throughout the refresh process.

Simplicity

Isilon scale-out storage solutions are designed for enterprises that want to manage their data, not their storage. Our storage systems are powerful yet simple to install, manage, and scale, at virtually any size. And, unlike traditional enterprise storage, Isilon solutions stay simple no matter how much storage capacity is added, how much performance is required, or how business needs change in the future.
Automatic balancing for performance and efficiency

When an additional node is added to an Isilon cluster, the Isilon OneFS AutoBalance™ feature immediately begins migrating data from the existing storage nodes to the newly added node, quickly re-balancing all of the data across all available nodes in the cluster. This automatic rebalancing ensures the new storage node will not become a bottleneck for new data, and that existing data will gain the benefits of the increased resources now available to the storage system. AutoBalance is completely transparent to both end users and storage clients, and can be adjusted to run at off-peak times to limit its impact on high-performance workloads. This is all done automatically and requires no management time from the administrator.

Data protection

Isilon storage solutions are highly resilient and provide the highest levels of reliability, availability, and serviceability in the industry. An Isilon cluster is designed to tolerate one or more simultaneous component failures without any interruption of service or loss of data. Data protection is applied at the file level, rather than relying on either a single global setting or specific RAID groups as found in other storage platforms. In the event of a component failure, the file system is able to focus on recovering only the impacted data rather than having to repair and rebuild the entire data set.
Because all data, metadata, and parity information is distributed across all nodes in the cluster, dedicated parity nodes or disks are not needed. A properly configured EMC Isilon cluster eliminates any single points of failure, since all nodes share equally in the workload and function as peers in a completely symmetrical storage environment.

Also, unlike other storage platforms, data-protection settings for files and directories can be changed at any time without the need to take either the cluster or the impacted files offline. Service outages, which are often required in traditional storage environments when re-balancing workloads, are also eliminated with Isilon scale-out NAS.

EMC Isilon's globally coherent cache, and SmartConnect™ client load-balancing and failover features, provide high performance with industry-leading levels of data protection, high availability, and system resiliency.

Reduced costs through greater utilization

EMC Isilon storage systems provide highly efficient data storage at scale. For clusters larger than five nodes, storage utilization rates of over 80% are typical. You can gain further efficiencies with EMC Isilon SmartPools software for automated storage tiering, which continually optimizes your Isilon NAS for performance and economy. With SmartPools, you can set policies to automatically move inactive data to more cost-effective storage, streamlining workflows for your most current data while remaining completely transparent to users and applications.

In addition, because Isilon is so easy to manage, many complex, labor-intensive storage management tasks required for traditional storage systems, including troubleshooting disk or network bottlenecks or migrating VMs between datastores to rebalance capacity, are eliminated with EMC Isilon storage.

Metalogix StoragePoint

Metalogix StoragePoint software is used to offload BLOB storage from SharePoint.
StoragePoint improves SharePoint performance and scalability by offloading content from the underlying Microsoft SQL Server database to alternate tiers of storage. StoragePoint installs quickly and easily into SharePoint Central Administration and enables you to manage SharePoint BLOBs transparently from a single SharePoint interface. StoragePoint does not require any additional hardware or software to be implemented.

File Share Librarian

StoragePoint's File Share Librarian enables you to delegate control of file-share content to SharePoint. The file shares are used like any other storage endpoint. The file-share metadata, associated folder hierarchy, and supporting information are catalogued, replicated, and placed under the control of SharePoint, which enables users to search, manage, and access file-share content just like any other document within SharePoint.
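The idea of cataloging a file share (recording metadata and folder hierarchy while leaving the files where they are) can be sketched in Python as follows. This is an illustrative model, not the File Share Librarian implementation; the catalog_share function and the fields it records are assumptions chosen for the example.

```python
import os
import tempfile
from pathlib import Path


def catalog_share(share_root: Path) -> list[dict]:
    """Walk a file share and record metadata for each file without moving it."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(share_root):
        for name in sorted(filenames):
            path = Path(dirpath) / name
            stat = path.stat()
            entries.append({
                # Folder hierarchy relative to the share root ("." for the root itself).
                "folder": Path(dirpath).relative_to(share_root).as_posix(),
                "name": name,
                "size_bytes": stat.st_size,
                "modified": stat.st_mtime,
                # Pointer back to the file, which stays in place on the share.
                "location": str(path),
            })
    return entries
```

Only the returned entries would need to be stored on the SharePoint side in such a model; the file data itself is read from its original location when a user opens the document.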
Architecture

This section provides details on the hardware and software components used in the tested solution. The solution was configured as follows:

- All servers were virtualized using a VMware vSphere 5.0 host cluster.
- A 10-node EMC Isilon storage cluster, consisting of five Isilon X200 and five Isilon 72NL nodes, was used as the underlying storage platform for virtual machine and file-share data in this solution.
- An SMB-based shared folder was created in the Isilon 72NL disk pool. This share was populated with documents. Document sizes in the file set ranged from 200K to 12MB.
- All virtual machine VMDKs were stored on a shared NFS-mounted datastore from the Isilon X200 disk pool.
- All content, search, temp, and configuration databases, the Metalogix StoragePoint database, and their associated log files were placed on VMDKs housed within the NFS datastore and presented to the SQL Server virtual machine.
- Each Content DB had a corresponding BLOB store created on the EMC Isilon cluster and shared using SMB. Content externalized from the Content DB was moved to these locations.
- Existing file share content was also placed on the Isilon cluster and presented in a separate SMB share.

Hardware and software resources

Table II lists the hardware used to validate this solution.

Table II: Solution hardware

Equipment                Used for                       Quantity  Configuration
ESXi Servers             VMware SharePoint              2
Management Server        VMware vCenter Server          1         8GB of RAM, 4-Core
Mixed Isilon X200/72NL   VMware virtual machines;       1
10-node cluster          SharePoint content databases
                         and logs; SMB shares
                         containing BLOBs
Active Directory Server  Domain Controller              1         8GB of RAM, 4-Core,
                                                                  Windows 2008R2

Table III lists the software used in this solution.
Table III: Solution software

Software                     Version
Microsoft Windows Server     2008R2 SP1 Enterprise Edition
Microsoft SharePoint Server  2010 SP1 Enterprise Edition
Microsoft SQL Server         2008R2 SP1 Enterprise Edition
Metalogix StoragePoint       3.4
VMware vCenter               5.0
VMware ESXi Server           5.0

Table IV lists the allocation of virtual hardware resources used in this solution.

Table IV: Allocation of virtual resources

Virtual machine role  Quantity  Configuration
Web front end (WFE)   2         8 CPU, 4GB RAM
Application           1         8 CPU, 8GB RAM
Crawl Server          1         4 CPU, 8GB RAM
Microsoft SQL         1         8 CPU, 32GB RAM
Domain Controller     1         2 CPU, 8GB RAM

Environment profile

This solution was validated with the environment profile listed in Table V.

Table V: Environment profiles

Requirements                      Quantity/Type/Size
SharePoint total user count       10,000 with two WFEs
SharePoint Content DB sizes       150GB and 250GB
SharePoint number of Content DBs  2

Physical deployment

This section describes the test scenario and test progression of a generic small-to-medium-sized SharePoint farm. Test scenarios include cataloging an existing file share and externalizing an existing SQL Content DB using Metalogix StoragePoint as the RBS provider.
The SharePoint farm was designed as a small-to-medium farm with 2 WFEs, 1 crawl server, 1 SQL Server, and 1 application server. The entire farm, including boot volumes for the virtual machines and content databases, was stored in an NFS-connected datastore on the Isilon X200. The farm consists of two content DBs (150GB and 250GB). A separate SMB share containing file-based data was also used for cataloging data into SharePoint. Figure 2 shows the physical architecture of the tested solution.

Figure 2: Physical architecture

Storage configuration

In a completely virtualized solution, virtual machine disks (VMDKs), the external file share, and associated BLOB data are all stored within the EMC Isilon file system. Data was logically separated by placing virtual machine data and file-based data in separate disk pools. Virtual machine data was placed in the Isilon X200 disk pool. File share data and shared folders containing externalized BLOBs were placed in the Isilon 72NL disk pool. These directories were shared via SMB and added to StoragePoint as storage endpoints.

Disk pool configuration

Disk pools were used to control data placement between the different node types. Each disk pool is a homogeneous group of nodes within an Isilon storage cluster. For example, Isilon S-Series nodes with 300 GB SAS drives and one 200 GB SSD per node would be in one pool, whereas Isilon NL-Series nodes with 3 TB SATA drives would be in another. A single Isilon storage cluster may consist of multiple disk pools, since OneFS enables the grouping of multiple node types into a single file system. Each node type is
optimized for a different capacity-to-performance ratio, so it is common for organizations to combine these node types in one cluster to ensure an optimum match between their different data sets and the storage nodes that they reside on. Disk pools define the resources in the pool, such as the node type and disk type, SSD use, and associated data protection settings.

For more information on planning and implementing SmartPools file-pool policies, consult the Next Generation Storage Tiering with EMC Isilon SmartPools white paper, which is listed in the References section of this paper.

To ensure that the shared folder used for the external BLOB data is stored on the Isilon 72NL nodes, the folder's disk pool setting was adjusted in the OneFS WebUI file system explorer, as shown in Figure 3.

Figure 3: Disk-pool assignment for BLOB folder

Best Practice: When virtualizing SharePoint environments and remote BLOB locations on EMC Isilon clusters containing multiple node types, separate remote BLOB data and virtual machine data into separate disk pools.

SmartConnect

EMC Isilon SmartConnect software was used to load balance ESXi host connections as well as server connections to the storage. Isilon SmartConnect software optimizes network-throughput performance and availability by enabling intelligent client-connection load-balancing and failover capabilities. Through a single host name, SmartConnect enables client-connection load balancing, as well as dynamic NFS failover and failback of client connections across storage nodes, to provide optimal utilization of the cluster's available network connections. By leveraging an organization's existing DNS infrastructure, SmartConnect provides universal compatibility with all client types, eliminating the need for complicated connection management on the client side. With SmartConnect, in the event of a
node or path failure, file-system stability and availability are maintained for NFS clients that support automatic path failover. To a client system, the cluster appears as a single network element. SmartConnect automatically balances incoming client connections across all available interfaces on the Isilon storage cluster, improving performance on the cluster by distributing the workload evenly across multiple network paths and multiple nodes.

Best Practice: EMC recommends using SmartConnect in virtualized SharePoint environments in order to balance network throughput. For additional information on SmartConnect options and configuration, refer to the SmartConnect Best Practice Guide.

Baseline performance results

Prior to testing BLOB externalization with StoragePoint, a set of baseline performance tests was run to determine the number of users that could be supported with acceptable response times. Baseline test results determined that the SharePoint farm consisting of two WFEs can support 12,000 users. Table VI summarizes the results of the baseline tests.

Table VI: Baseline test performance results

Test Name  Scenario  Avg. Test Time (sec)
Upload     Mixed     1.23
Modify     Mixed     1
Search     Mixed     0.3
Download   Mixed     0.3
Browse     Mixed     0.2

An acceptable response time for a heavy SharePoint user profile is under 3 seconds for each of the tests. The limiting factor during the test was web front end CPU utilization, which averaged 60% on the two web front end servers.

Test Scenario 1: adding files to SharePoint using File Share Librarian

In this test, the Metalogix File Share Librarian was used to catalog file share data from an existing file share into a SharePoint Content DB. The file share in this test consisted of approximately 100GB of file-based data. Data types include Microsoft Office documents, PDF files, and media files. Individual file sizes range from 100KB to 12MB. This scenario would be used for a customer wanting to add file share content to SharePoint without physically moving the files.
The File Share Librarian
catalogs the file share metadata without moving the actual files. This option allows users to keep connecting through the existing file share, or restricts users to connecting only through SharePoint.

Best Practice: Before cataloging file shares, it is strongly recommended that a full backup be taken and kept for historical purposes.

Best Practice: After data has been cataloged, it is recommended that access from the file share be disabled for security and to ensure that users cannot modify contents without accessing the data through SharePoint.

Configuration

The Metalogix File Share Librarian was set to catalog the file share and present it as a new Content Database within SharePoint. The File Share Librarian adds links in the Content DB to the file share data based on where the data resides in SharePoint. For purposes of this test, a site within an existing site collection was chosen. If new content is added to a file share after the File Share Librarian has cataloged it, shared system cache must be enabled.

The basic process of cataloging file share data into SharePoint is as follows:

1. Enter a name for the File Share Librarian job.
2. For Access Mode:
   a. If the file share will no longer be available to users after the content is catalogued, select SharePoint Only.
   b. If users will continue to access and modify files directly from the file share, select SharePoint and File Share.
3. Enter the UNC path of the file share to be cataloged in the File Share to Catalog field and click Validate.
4. Select the Promote folder permissions to SharePoint containers checkbox.
5. Click Change to select the SharePoint destination container.
6. Enter the settings to schedule the job.
7. Click Save to save the settings or click Catalog File Share Now.

For complete process details, see pages 112-113 of the Metalogix StoragePoint Installation and Administration Guide.

Results

For this test, one server in the SharePoint farm was used for cataloging, with a thread count of 10.
The test cataloged 100GB of data in 6 minutes. The existing file share remained accessible while the SharePoint content was updating. The performance impact during the catalog process was minimal; the largest impact was on the CPU of the server used for the catalog process.
Scenario 2: externalizing BLOBs with StoragePoint

This test scenario simulates an existing 150 GB SharePoint Content DB being externalized with RBS to an SMB file share on EMC Isilon storage. A practical application of this scenario would be an existing SharePoint environment in which BLOBs need to be externalized from the SQL databases onto an alternative tier of storage. The process of externalization uses Metalogix StoragePoint.

Configuration

The process for externalizing BLOBs with StoragePoint is as follows:

1. Install Metalogix StoragePoint.
2. Create storage endpoints for your BLOB stores.
3. Create storage profiles that dictate how BLOBs are externalized.
4. Create storage profile timer jobs.
5. Click Analyze and Estimate to display an estimate of the space savings that will result from running the externalize job.
6. Click Save to save the configured job schedule or choose to run the job immediately.

For complete setup and process instructions, refer to page 76 of the Metalogix StoragePoint Installation and Administration Guide.

Results

Externalization of the 150GB content database took 1.5 hours. The time required to externalize a content database depends on the size of the content database and the size of the files being externalized.

Scenario 3: performance during externalization

After running the basic externalization test scenario, the following test scenario was executed to show the effects of BLOB externalization on existing user load. StoragePoint provides the ability to control the number of servers used during externalization. Any number of servers in the SharePoint farm can be used to start the externalization process. Externalization jobs on any server in the SharePoint farm can be disabled without interrupting the running job. The number of servers used for externalization and the size of the content database affect the time taken for the job to complete.
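The timings reported for Scenarios 1 and 2 (100GB cataloged in 6 minutes, and a 150GB content database externalized in 1.5 hours) can be converted into effective rates with a short calculation. The Python sketch below assumes 1 GB = 1024 MB. The estimate_hours helper is a hypothetical linear extrapolation added for illustration; the 250 GB estimate it prints is not a measured result from this paper, and real job times also depend on file sizes, thread counts, and the number of servers used.

```python
def effective_rate_mb_per_s(size_gb: float, duration_s: float) -> float:
    """Effective rate in MB/s, treating 1 GB as 1024 MB."""
    return size_gb * 1024 / duration_s


def estimate_hours(size_gb: float, measured_gb: float, measured_hours: float) -> float:
    """Linear extrapolation from a single measured run (an assumption, not a guarantee)."""
    return size_gb * measured_hours / measured_gb


# Scenario 1: 100 GB of file share data cataloged in 6 minutes.
catalog_rate = effective_rate_mb_per_s(100, 6 * 60)
# Scenario 2: 150 GB content database externalized in 1.5 hours.
externalize_rate = effective_rate_mb_per_s(150, 1.5 * 3600)

print(f"Cataloging: {catalog_rate:.1f} MB/s of file data cataloged")
print(f"Externalization: {externalize_rate:.1f} MB/s")
print(f"Hypothetical estimate for 250 GB: {estimate_hours(250, 150, 1.5):.1f} hours")
```

Note that the cataloging rate describes how much file data was brought under SharePoint control per second, not data transferred, because cataloging records metadata without moving the files.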
Best Practice: While not strictly necessary, EMC Isilon strongly recommends that externalization jobs be performed during periods when users and other bulk processes are not heavily utilizing the system resources used in the externalization job. Doing so will minimize the impact on user performance.

Results

During the externalization test, CPU utilization on the web front end servers and the SQL Server was impacted the most. CPU utilization on the web front end servers increased approximately 20%. While SQL Server CPU and disk latency metrics remained within acceptable limits, there was an increase in SQL Server transactions per second.
Reducing the number of threads for each server in the externalization job decreases the effect on CPU utilization but increases the overall time taken for the externalization job to run. For recommendations on job settings, refer to the Metalogix StoragePoint Installation and Administration Guide.

Validation and testing

The results in this section show testing from the baseline to a fully externalized SharePoint farm with a migrated file share on an alternate tier of storage. Testing of this solution validates the functionality of EMC Isilon as the alternate tier of BLOB storage and of StoragePoint's File Share Librarian.

Test methodology

During performance tests, Visual Studio Team System (VSTS) was used to simulate load on the SharePoint farm. During the tests, 12,000 users were simulated according to the Microsoft heavy-user profile, which specifies 60 requests per hour. A think time of 0% was applied to all tests. Zero percent think time eliminates the typical user decision-making time when browsing, searching, or modifying data using SharePoint. Every user request is completed without pause, generating a continuous workload on the system. The test mix and recommended response times for each test type are detailed in Table VII.

Table VII: Response times

Test Type  Action                      Percentage  Recommended response time
Browse     User browse                 60          Less than 3 seconds
Search     Unique value search         10          Less than 3 seconds
Modify     Metadata modify and upload  10          Less than 3 seconds
Download   Download a document         20          Less than 3 seconds

Full performance tests were run with two WFEs. Performance tests were run to obtain performance metrics under the following conditions:

- 12,000 users accessing a normally operating SharePoint environment
- User load while cataloging external file share data
- User load while externalizing an existing content database

Notes on performance test results

Performance test results are highly dependent on workload, specific application requirements, and system design and implementation.
System performance will vary as a result of these and other factors. Performance results from this test should not be used as a substitute for application benchmarking specific to the customer environment.
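As a worked example of the load described under Test methodology, the aggregate request rate can be derived from the stated user count, the heavy-user profile, and the test mix in Table VII. The Python sketch below takes those figures at face value; the variable names are illustrative, and the calculation is an interpretation of the stated parameters rather than a measurement reported in this paper.

```python
USERS = 12_000
REQUESTS_PER_USER_PER_HOUR = 60  # Microsoft heavy-user profile, as stated above
TEST_MIX = {"Browse": 60, "Search": 10, "Modify": 10, "Download": 20}  # percent, Table VII

# Aggregate requests per second across all simulated users.
total_rps = USERS * REQUESTS_PER_USER_PER_HOUR / 3600

# Split the aggregate rate across the test types according to the mix.
per_test_rps = {name: total_rps * pct / 100 for name, pct in TEST_MIX.items()}

print(f"Total: {total_rps:.0f} requests/s")
for name, rps in per_test_rps.items():
    print(f"{name}: {rps:.0f} requests/s")
```

Under these assumptions the farm handles about 200 requests per second in total, of which roughly 120 are browse requests.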
All performance data contained in this report was obtained in a rigorously controlled environment. Results obtained in other operating environments may vary significantly. For additional information on sizing SharePoint environments, refer to the following documentation:

http://technet.microsoft.com/en-us/library/ff758647.aspx#fundamentals

When sizing Isilon storage, always work with your sales representative for specific sizing guidance.

Results summary

This section shows the movement of data by showing the amount of space used for each scenario. After externalization, the Content DB space is still used by Microsoft SQL Server until the storage reclaim job is run and the database is shrunk. Cataloging and migrating the file share-based data increases the size of the Content DB used for that share. Table VIII shows the reduction in size of the SQL content databases after externalizing BLOB data.

Table VIII: Content database size

Size Before Shrink  Size After Shrink  Reduction %
105GB               1.7GB              98%
220GB               2.5GB              88%

Storage performance results

After externalization completed, storage metrics were collected during performance runs to determine the impact of running externalized BLOBs on the EMC Isilon storage. The following performance metrics were collected to gauge the performance impact from the virtual SharePoint servers as well as SMB-related traffic due to SharePoint access of externalized BLOBs.

Disk impact from the virtual machines was measured by analyzing NFS throughput and total disk I/O metrics against the relevant disk pools on the EMC Isilon cluster. System utilization across the cluster was also measured to gauge CPU headroom while under load. Figure 4 shows total disk throughput with respect to the NFS datastore.
Figure 4: NFS throughput (VM activity) during load testing (vertical axis: NFS MB/s, 0 to 40)

To measure the performance impact placed on the storage by SharePoint users accessing documents, the total SMB throughput against the external BLOB store was analyzed. During load testing, the average total SMB throughput was 18.64 MB/s. Figure 5 shows the total SMB throughput during SharePoint performance testing.

Figure 5: SMB throughput (BLOB activity) during load testing (vertical axis: SMB MB/s, 0 to 25)

Figure 6 shows the total disk I/O load placed on the cluster from SharePoint virtual servers and external BLOB stores.
Figure 6: Disk I/O activity during load testing (vertical axis: total disk throughput, 0 to 250)

Conclusion

Metalogix can be used with EMC Isilon Scale-out NAS to externalize SharePoint BLOBs and migrate existing file share data into SharePoint. The solution testing outlined in this paper demonstrates that this integration is a highly efficient way to move SharePoint data to BLOB storage, freeing space that the data would otherwise occupy in a Microsoft SQL Server database. Externalization provides reliable performance while reducing the need for higher-cost storage tiers, offering the ability to scale capacity while providing an easy-to-manage storage solution. At the same time, Isilon scale-out NAS provides a highly scalable, resilient, and efficient storage platform for the externalized SharePoint data.

References

Product Documentation

For additional information, see the documents listed below.

- Isilon OneFS Configuration Guide
- SmartConnect Best Practices
- SmartPools
- Metalogix Installation and Administration Guide
- Capacity planning and sizing overview for SharePoint 2010
About EMC Isilon

EMC Isilon is the global leader in scale-out NAS. We provide powerful yet simple solutions for enterprises that want to manage their data, not their storage. Isilon products are simple to install, manage and scale, at any size and, unlike traditional enterprise storage, Isilon stays simple no matter how much storage is added, how much performance is required, or how business needs change in the future. We're challenging enterprises to think differently about their storage, because when they do, they'll recognize there's a better, simpler way. Learn what we mean at www.emc.com/isilon.

U.S. Patent Numbers 7,146,524; 7,346,720; 7,386,675. Other patents pending.

Contact EMC Isilon

www.emc.com/isilon
505 1st Avenue South, Seattle, WA 98104
Toll-free: 877-2-ISILON
Phone: +1-206-315-7602
Fax: +1-206-315-7501
Email: sales@isilon.com