The Dell Email and File Archive Solution with CommVault Simpana Software

A Dell CommVault Technical White Paper

Dave Jaffe, Solution Architect, Dell Solution Centers
Darin Camp, Sr. Technical Alliance Manager, CommVault Inc.
Executive Summary

Dell's Email and File Archive solution helps customers manage their organization's information. Dell's end-to-end solution capabilities can help customers address storage optimization and compliance requirements while alleviating burdens related to design, implementation, and ongoing management, through tight integration of archiving software from multiple ISVs with Dell's comprehensive array of massively scalable storage offerings.

The global Dell Solution Centers have been formed to test and document reference architectures around integrated solutions, facilitating their deployment into the customer's environment. In addition, these Solution Centers provide labs where customers can come and test these solutions and learn how they can be used to help solve their IT challenges.

In this white paper, the business need for archiving is laid out, followed by a high-level description of the CommVault Simpana software for data protection. Finally, the three reference architectures comprising the Dell Email and File Archive Solution with CommVault Simpana software are described in detail, including all hardware and software components. Solutions with three key Dell storage arrays (the Dell PowerVault DL Backup to Disk Appliance Powered by CommVault, the Dell EqualLogic line of iSCSI storage, and the Dell DX6000 Object Storage Platform) are shown. Different features of email, file and SharePoint archiving are presented, including content indexing, deduplication and high availability. Sizing examples are provided as a starting point, though the final sizing parameters will be the result of a consultative design session between Dell and the customer.

THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND.

© 2011 Dell Inc. All rights reserved.
Reproduction of this material in any manner whatsoever without the express written permission of Dell Inc. is strictly forbidden. For more information, contact Dell. Dell, the DELL logo, and the DELL badge are trademarks of Dell Inc. Intel and Xeon are registered trademarks of Intel Corp. Windows, Exchange and SharePoint are registered trademarks of Microsoft, Inc. Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products. Dell Inc. disclaims any proprietary interest in trademarks and trade names other than its own. CommVault, CommVault and logo, the CV logo, CommVault Systems, Solving Forward, Simpana, CommNet, GridStor, CommServe, CommCell, and SnapProtect, are trademarks or registered trademarks of CommVault Systems, Inc.

July 2011
Contents

Executive Summary
Introduction
The Dell Email and File Archive Solution with CommVault Simpana software
CommVault Simpana software
  Retention Lifecycle Management for storage consolidation
  Retention Lifecycle Management for compliance
Solution Software Components
Solution Hardware Components
  Dell PowerVault DL Backup to Disk Appliance Powered by CommVault
  Dell EqualLogic iSCSI Storage
  Dell DX Object Storage Platform
CommVault Simpana software Reference Architecture with Internal Storage
CommVault Simpana software Reference Architecture with the Dell EqualLogic iSCSI Storage Platform
Highly Available CommVault Simpana software Reference Architecture with the Dell DX Object Storage Platform
Email and File Archive Sizing
Acknowledgements

Tables

Table 1. CommVault Simpana software Components - CommCell
Table 2. CommVault Simpana software Components - Additional Components
Table 3. Reference Architecture with Internal Storage Functionality
Table 4. EqualLogic Storage Reference Architecture Functionality
Table 5. High Availability Reference Architecture Functionality
Table 6. CommVault Simpana software sizing, non-journaled case
Table 7. CommVault Simpana software sizing, journaled case

Figures

Figure 1. CommVault Simpana Software Platform
Figure 2. The Dell PowerVault DL Backup to Disk Appliance Powered by CommVault
Figure 3. The Dell EqualLogic PS6000XV
Figure 4. The Dell DX Object Store DX6000 Cluster Services Node
Figure 5. The Dell DX Object Store DX6012S Storage Node
Figure 6. Reference Architecture with Internal Storage
Figure 7. Reference Architecture with the Dell EqualLogic iSCSI Storage Platform
Figure 8. Highly Available CommVault Simpana software Reference Architecture with the Dell DX Object Storage Platform
Figure 9. CommVault Simpana software CommCell Console
Introduction

To help customers address information technology issues such as managing the explosion of data, optimizing data center resources, and providing secure, cost-effective end-user computing, Dell has introduced comprehensive end-to-end solutions. These solutions combine high-performance, cost-effective Dell hardware (servers, storage and networking) with best-of-class software to provide open, affordable and capable solutions.

The Dell Email and File Archive Solution is a cornerstone of Dell's approach to intelligent data management. This approach enables customers to control the explosion of data through compression, deduplication and archiving, to optimize the use of data through the establishment of data management and retention policies, and to acquire a strategic advantage through the use of data for business intelligence and compliance.

The Dell Email and File Archive Solution focuses on the use of Dell's comprehensive array of massively scalable storage offerings in archiving email, file and SharePoint data. This paper addresses the solution developed with CommVault Simpana software. To maintain customer choice, there is also a Dell Email and File Archive Solution based upon Symantec Enterprise Vault, as well as a hosted offering through Dell's Email Management Service. Regardless of the configuration, all ongoing maintenance and support for the entire solution are provided by Dell, including hardware and software (ISVs included).

Located worldwide, Dell Solution Centers provide a place where customers can visit, work with experts in these solutions, and have access to a common global lab infrastructure and capability to integrate, test and validate Dell's integrated solutions.
In conjunction with Dell product/solution marketing and development, the Dell Solution Centers create Dell reference architectures for each solution, identifying the solution requirements, describing the use cases the solution addresses, and documenting the hardware, software and networking components employed by the solution. These reference architectures ease solution design while allowing for needed customization based on customer-specific requirements.

The Dell Email and File Archive Solution with CommVault Simpana software currently includes three reference architectures, each representing one storage option: the internal storage of the Dell PowerVault DL Backup to Disk Appliance, Dell EqualLogic iSCSI storage, and the Dell DX6000 Object Storage Platform. While the three reference architectures represent tested and validated example configurations, customers will most likely select from and extend the three configurations to create the email and file archive solutions best suited to their own environment.

Dell and CommVault have worked together for eight years to provide Dell customers with high-performance, low-cost data and information management capabilities that are easy to use and can scale with increasing data protection requirements. Dell makes it easy to purchase, deploy and manage CommVault solutions with dedicated appliances, special bundling, streamlined licensing, 3-year upgrade protection and Dell services. In addition to the Dell-CommVault OEM relationship, Dell is an authorized CommVault Reseller, which gives Dell customers access to the complete lineup of Simpana software through the Dell Software and Peripherals program.
The Dell Email and File Archive Solution with CommVault Simpana software

Organizations today face the challenges of managing ever-increasing data growth and the pressures of complying with regulatory requirements and mandates. IT departments are often tasked with managing disparate heterogeneous environments using a multitude of compute and storage platforms. Dell's Email and File Archive Solution with CommVault Simpana software can enable centralized management of backup and archive data using policy-based Retention Lifecycle Management (RLM) through the single architecture and single console of CommVault Simpana software.

Dell's Email and File Archive Solution can help improve the economics of long-term data retention while helping to reduce risks associated with corporate, legal or regulatory governance. This intelligent data management solution addresses common capacity management problems often associated with the archiving of data, such as user quotas, PST file proliferation, and database growth, while providing a common platform for Content Indexing and search to meet compliance requirements.

The Dell Email and File Archive Solution with CommVault Simpana software leverages sophisticated, policy-based data management automation processes to consolidate and reduce content and storage utilization. The corresponding solution architecture is based on three Dell storage hardware platforms delivering excellent total cost of ownership and performance for on-premise archiving.

The Dell Email and File Archive Solution with CommVault is implemented in three reference architectures featuring, respectively, the internal storage of the Dell PowerVault DL Appliance, Dell EqualLogic iSCSI storage, and the DX6000 Object Storage. These reference architectures serve as tested examples of Dell/CommVault solutions; users can mix and match as required.
CommVault Simpana software is discussed in detail in the next two sections, followed by details of the Dell storage platforms used in the solution. Finally, the three reference architectures are shown including hardware architecture, software functionality employed, and sizing recommendations.
CommVault Simpana software

CommVault Simpana software with Dell's comprehensive line of storage devices, including the Dell PowerVault DL Appliance, the Dell EqualLogic line of iSCSI storage, and the Dell DX Object Storage Platform, is designed to serve end users, IT teams, and legal and compliance teams, and to deliver high-speed performance in the face of increasing data volumes.

Retention Lifecycle Management for storage consolidation

CommVault Simpana Archive software together with Dell storage helps address the challenges associated with the growth of expensive primary storage housing expanding data stores and vast amounts of old and inactive data. By providing a complete data lifecycle management and retention solution that is optimized across tiers of storage, CommVault software offers an integrated content archiving solution that reclaims valuable primary storage space by automatically moving stale data to more cost-effective Dell storage.

The archiving of inactive file and email data frees up physical storage space in the enterprise's production environment, reducing the size of, and the time required for, backup and recovery operations. Once archived, the data on the primary storage is replaced with self-contained stub files. The secondary data can be protected with encryption and further reduced with deduplication and compression, thus allowing retention of more recovery copies to meet the organization's Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO).

The embedded global deduplication option provides a single deduplication database that supports multiple storage policies and allows for both source-side and target-based deduplication. This gives flexibility and control over data retention with reduced network impact and greater data reduction. CommVault Simpana storage policies enable organizations to manage data from a single interface and across multiple tiers, which can decrease overall total cost of ownership (TCO).
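The idea of identifying stale data as a candidate for archiving can be illustrated with a short, generic sketch. This is not Simpana's selection logic or API; the function name `find_stale_files`, the use of last-modified time as the staleness test, and the 365-day threshold are all assumptions chosen for illustration.

```python
import os
import tempfile
import time
from pathlib import Path

SECONDS_PER_DAY = 86_400


def find_stale_files(root, max_idle_days, now=None):
    """Return files under `root` whose last-modified time is older than
    `max_idle_days`. Illustrative only: a real archiving policy may also
    consider access time, size, file type or ownership."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    stale = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.stat().st_mtime < cutoff:
            stale.append(path)
    return sorted(stale)


if __name__ == "__main__":
    # Small demonstration in a temporary directory.
    with tempfile.TemporaryDirectory() as demo_dir:
        old_file = Path(demo_dir) / "old_report.txt"
        new_file = Path(demo_dir) / "new_report.txt"
        old_file.write_text("inactive data")
        new_file.write_text("active data")
        two_years_ago = time.time() - 2 * 365 * SECONDS_PER_DAY
        os.utime(old_file, (two_years_ago, two_years_ago))
        for candidate in find_stale_files(demo_dir, max_idle_days=365):
            print("archive candidate:", candidate.name)
```

In an archiving product the selected files would then be copied to secondary storage and replaced by stubs; that step is deliberately left out of this sketch.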
Transparent file access to archived data is provided to end users, who can see and recall archived files simply by selecting the stub files from the file system. And with today's eDiscovery requirements increasingly including file discovery, CommVault Simpana software extends email archiving and search by adding file archiving and search capabilities, all from a single interface. CommVault policies work inside file system directory structures, archiving data directly from its original location rather than moving it first to media staging folders. CommVault Simpana software enables organizations to classify file data by application, plan its lifecycle over time, and assign the most appropriate and cost-effective resources.

Retention Lifecycle Management for compliance

The singular architecture and single-pane-of-glass management of CommVault Simpana Archive software provide transparent access to the content on Dell storage, helping improve end-user productivity and collaboration. Administrators set global thresholds and policies centrally to govern when and how information is retained. These archive thresholds and policies align with the organization's business priorities. With full Content Indexing (CI), all information can be searched and retrieved in context. All copies of indexed data (protection, migration and archive data) are combined into a single, searchable repository.
As part of its singular architecture, the seamless integration of CommVault Simpana Backup/Recovery and Archive software provides a single pool of data with a common interface to centrally set and manage data management policies and schedules, and to track and report on operations across tiered Dell storage. Policy and job management are consolidated, while reporting and alerting are improved for administrative and operational efficiency. As a result, training costs and infrastructure resources can be reduced.

Figure 1. CommVault Simpana Software Platform
Solution Software Components

The CommVault Simpana software components used in the Dell Email and File Archive Solution are shown in Table 1 and Table 2.

Table 1. CommVault Simpana software Components - CommCell

Component: CommCell
Role: The CommCell component consists of Common Technology Engine (CTE) components including one CommServe, one or more MediaAgents, and one or more Client Agents. The CommCell group is managed by a CommCell Console which includes tabs for Security, Storage Resources (including Libraries), Policies (including Replication, Schedule and Storage Policies) and Reporting.

Component: CommServe
Role: The CommServe Server ties the CommCell components together; it is the coordinator and administrator of the CommCell component. The CommServe Server communicates with all agents in the CommCell to initiate data protection, management and recovery operations. Similarly, it communicates with MediaAgents when the media subsystem requires management. The CommServe Server maintains a database, the CommServe Database Engine, containing all the information relating to the CommCell configuration. There is one CommServe in a Simpana installation, often running on the same server as the MediaAgent, but the CommServe functionality can also exist in a Windows Failover Cluster for high availability.

Component: MediaAgent(s)
Role: The MediaAgent transfers data between the client computer(s) and the storage media. Each MediaAgent communicates locally or remotely with one or more storage devices. MediaAgents may run on the same server as the CommServe and/or be located on separate servers. They may be clustered together using the GridStor feature, providing redundant data paths to storage for high availability.

Component: iDataAgents
Role: iDataAgents are software modules that are used for backing up and restoring data. The system provides a variety of iDataAgents, each one designed to handle a different kind of data. If a given computer has two or more types of data, it requires one iDataAgent for each data type.

Component: Archive Agents
Role: Migration Archiver Agents are software modules that are responsible for periodically moving unused or infrequently used data on their host computers to secondary storage, thereby reducing the size of data on the primary storage. The system provides several Agents, each one designed to handle a different kind of data. The solution consists of agents for archiving Exchange, SharePoint, and Windows, Unix or Linux file servers and file systems.

Table 2. CommVault Simpana software Components - Additional Components

Component: Libraries
Role: A disk library is a virtual library associated with one or more mount paths. The disk library does not represent a specific hardware entity; it is a software entity that contains a list of mount paths through which data can be sent to disk media.

Component: Storage Policies
Role: Storage policies act as the primary channels through which data is included in data protection and data recovery operations. A storage policy forms the primary logical entity through which a subclient or instance is backed up. Its chief function is to map data from its original location to physical media. A secondary copy of a storage policy provides a means of making an additional copy of backed-up data and is used in auxiliary copy operations, or in data protection operations that create inline copies.

Component: Content Indexing and Search
Role: Content Indexing and Search provides the ability to content index and search both file server/desktop data and protected/archived data for data discovery and other purposes. This product allows Compliance Officers, Administrators and End-Users to search and restore file system and application data.

Component: Deduplication
Role: Deduplication is supported by both backup and data archival products in order to optimize storage when the data to be backed up contains redundant data. Data deduplication can be performed at the Client side, before the data is transferred over the network, or at the MediaAgent side, when the data is received at the MediaAgent.

Component: Silo Storage
Role: Silo Storage enables the storage of deduplicated backups on secondary storage devices. The ability to store deduplicated data on secondary storage reduces the storage requirements and facilitates longer retention periods.

Component: Outlook Add-in
Role: The DataArchiver Outlook Add-In is a multi-purpose tool that provides Outlook users with a convenient way to browse, search, restore, recover and/or erase e-mail messages that were backed up or archived from the user's mailbox. It can also be used to perform stub recalls.
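The deduplication role described in Table 2 can be pictured with a generic content-hash sketch: data is split into blocks, each block is identified by a digest, and a block already present in the shared index is not stored again. This is a textbook illustration under stated assumptions (fixed-size blocks, SHA-256 digests, an in-memory dictionary standing in for a deduplication database); it is not a description of how Simpana implements deduplication, and the class name `DedupStore` is hypothetical.

```python
import hashlib


class DedupStore:
    """Toy block store that keeps each unique block only once."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}  # digest -> block bytes, shared by all stored items

    def put(self, data):
        """Store `data` and return its recipe: the ordered list of digests."""
        recipe = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            # Only blocks not seen before consume additional storage.
            self.blocks.setdefault(digest, block)
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Rebuild the original data from its recipe."""
        return b"".join(self.blocks[digest] for digest in recipe)

    def stored_bytes(self):
        """Total bytes actually held after deduplication."""
        return sum(len(block) for block in self.blocks.values())


if __name__ == "__main__":
    store = DedupStore(block_size=4)
    first = store.put(b"AAAABBBBAAAA")   # the AAAA block repeats
    second = store.put(b"BBBBCCCC")      # BBBB is already stored
    print("logical bytes:", 12 + 8, "stored bytes:", store.stored_bytes())
    print("round trip ok:", store.get(first) == b"AAAABBBBAAAA")
```

In this picture, computing the digests on the client before sending data corresponds to source-side deduplication, while computing them where the data is received corresponds to the MediaAgent-side case.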
Solution Hardware Components

The Dell Email and File Archive Solution takes advantage of three different Dell data storage products, which use different connection technologies and provide different functionality to the solution. They are the Dell PowerVault DL Appliance, Dell EqualLogic iSCSI storage, and the Dell DX Object Storage Platform.

Dell PowerVault DL Backup to Disk Appliance Powered by CommVault

The Dell PowerVault DL Appliance is a powerful storage server containing up to 12 internal Near Line SAS drives of up to 2 TB. Available as a standalone backup appliance (the Dell PowerVault DL Backup to Disk Appliance powered by CommVault), the DL Appliance is used in the Internal Storage and EqualLogic Storage Reference Architectures as the Simpana CommCell (CommServe plus MediaAgent) as well as a primary storage library. The High Availability Reference Architecture uses a pair of DL Appliances as MediaAgents and storage libraries clustered with Simpana GridStor technology.

Figure 2. The Dell PowerVault DL Backup to Disk Appliance Powered by CommVault

Dell EqualLogic iSCSI Storage

Dell EqualLogic storage arrays are virtualized iSCSI SANs ideal for demanding, high-I/O applications such as OLTP databases, email and virtual servers. They make excellent primary or secondary storage arrays for Simpana archiving. The EqualLogic Reference Architecture employs PS6000XV arrays in both capacities.

Figure 3. The Dell EqualLogic PS6000XV

Dell DX Object Storage Platform

The Dell DX Object Storage Platform is a complete, integrated hardware and software solution designed to handle storage of files and accompanying metadata on disk-based storage nodes. The platform scales to handle billions of objects through the use of unique file identifiers created from one enormous, flat, non-hierarchical address space.
The DX platform consists of one or more Dell DX6000 Cluster Services Nodes and one or more Dell DX6012 or DX6004 Storage Nodes. For email, file and SharePoint archiving they make excellent long-term secondary storage. A cluster consisting of primary and secondary DX6000 Cluster Services Nodes with two DX6012 storage nodes is employed as secondary storage in the High Availability/DX6000 Reference Architecture.

Figure 4. The Dell DX Object Store DX6000 Cluster Services Node

Figure 5. The Dell DX Object Store DX6012S Storage Node
CommVault Simpana software Reference Architecture with Internal Storage

The Dell Email and File Archive Solution with CommVault Simpana software is implemented in three reference architectures featuring, respectively, the internal storage of the Dell PowerVault DL Appliance (up to 1,000 email users), Dell EqualLogic iSCSI storage (up to 5,000 email users), and high availability with DX6000 storage (up to 5,000 email users). Each reference architecture is designed to archive email, file systems and file servers, and SharePoint data. While each reference architecture highlights different features, these are not the only recommended ways of configuring CommVault Simpana software on a Dell archive solution. For example, the DX6000 may be used without high availability, and deduplication may be used with the internal storage configuration. However, these architectures have been tested in the Dell Solution Centers, and each is a known configuration that may be implemented as-is or serve as a basis for deployment.

The Reference Architecture with Internal Storage is for enterprises with up to 1,000 users that prefer the simplicity of a single-system implementation with the DL Appliance and don't require high availability or content indexing. The hardware components are shown in Figure 6. The DL Appliance may be expanded with direct-attached PowerVault MD-series disk arrays.

Figure 6. Reference Architecture with Internal Storage

The functionality present in the Internal Storage Reference Architecture is summarized in Table 3. Simpana iDataAgents (backup) and Archiver clients have been installed on three client systems: an Exchange server, a Windows File System server, and a CIFS network file share server. Exchange database backups are performed with the Exchange Database iDataAgent. Mailbox archives are performed with the Exchange Mailbox Archiver agent. Content on the Windows File System and CIFS servers is archived with versions of the File Archiver agent.
The file systems of all three servers are protected with File System iDataAgents.

The Internal Storage Reference Architecture contains a single Simpana Library, utilizing the DL Appliance internal storage, and a single MediaAgent, the DL Appliance (IDM2-DL1) itself. Two storage policies are defined. The first is the default CommServeDR policy, which provides protection for the DL Appliance CommServe server. The second, through its primary copy on the DL Appliance library, provides the storage for all archive and backup clients.

Table 3. Reference Architecture with Internal Storage Functionality

Clients
  Exchange: Exchange Database iDataAgent; Exchange Mailbox Archiver; File System iDataAgent
  Windows File System: Local File System File Archiver; File System iDataAgent
  CIFS Server: Network File Share File Archiver

Libraries
  Dell Disk Array1: DL Appliance local storage

Storage Policies / MediaAgents
  CommServeDR / IDM2-DL1: Disaster recovery policy for CommServe
  Dell Disk Array1 / IDM2-DL1: Primary copy uses Dell Disk Array1
CommVault Simpana software Reference Architecture with the Dell EqualLogic iSCSI Storage Platform

This reference architecture is for enterprises with up to 5,000 users that prefer Dell EqualLogic iSCSI storage and require Content Indexing but don't require high availability. The EqualLogic storage may be treated as a second Primary storage target along with the DL Appliance or as Secondary storage for older archives. In addition, deduplication is available to further optimize storage capacity. The hardware components are shown in Figure 7.

Figure 7. Reference Architecture with the Dell EqualLogic iSCSI Storage Platform

As seen in Table 4, the EqualLogic Storage Reference Architecture extends the functionality of the Internal Storage Reference Architecture in several key ways:

- A SharePoint client is added, with backup iDataAgents for both the SharePoint database and the server file system as well as an Archiver for the SharePoint content.
- The EqualLogic storage, attached through a new Library, is accessed in one of two ways: it can be set up as a primary copy through its own storage policy or attached as a secondary copy to the primary copy on the DL Appliance disks for auxiliary copies of data at specified retention times.
- Deduplication is enabled through a new storage policy using the DL Appliance library (the EqualLogic library can also serve as a deduplication target). Deduplication is selected on a client-by-client basis by associating the client or subclient with the appropriate storage policy.
- Content Indexing is added through the addition of a Content Index engine on server idm2-ci1. This node performs the content indexing administration, serves as the location of the search index, and can host the search web server for end-user and administrative search.

Table 4. EqualLogic Storage Reference Architecture Functionality

Clients
  Exchange: Exchange Database iDataAgent; Exchange Mailbox Archiver; File System iDataAgent
  Windows File System: Local File System File Archiver; File System iDataAgent
  CIFS Server: Network File Share File Archiver
  SharePoint: SharePoint Server iDataAgent; MS SharePoint Archiver; File System iDataAgent

Libraries
  Dell Disk Array1: DL Appliance local storage
  EQL3: EqualLogic PS6000 Volume

Storage Policies / MediaAgents
  CommServeDR / IDM2-DL1: Disaster recovery policy for CommServe
  Dell Disk Array1 / IDM2-DL1: Primary copy uses Dell Disk Array1 library; Secondary copy uses EQL3 library
  EqualLogic3 / IDM2-DL1: Primary copy uses EQL3 library
  Dell Disk Array w/dedupe / IDM2-DL1: Primary copy uses Dell Disk Array1 library w/ client-side and MediaAgent deduplication

Content Indexing Engines
  Default CIEngine: Admin/Search Index on idm2-ci1
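The relationship between subclients, storage policies and libraries in Table 4 can be modeled as a small lookup structure. The sketch below is only a way of reading the table, not Simpana configuration syntax or an API. The policy, MediaAgent and library names come from Table 4; the class `StoragePolicy`, the function `libraries_for`, and in particular the association of each subclient with a policy are hypothetical, since the paper does not state which client uses which policy.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class StoragePolicy:
    name: str
    media_agent: str
    primary_library: str
    secondary_library: Optional[str] = None
    deduplicated: bool = False


# Data policies as listed in Table 4 (CommServeDR omitted for brevity).
POLICIES: Dict[str, StoragePolicy] = {
    policy.name: policy
    for policy in [
        StoragePolicy("Dell Disk Array1", "IDM2-DL1", "Dell Disk Array1",
                      secondary_library="EQL3"),
        StoragePolicy("EqualLogic3", "IDM2-DL1", "EQL3"),
        StoragePolicy("Dell Disk Array w/dedupe", "IDM2-DL1",
                      "Dell Disk Array1", deduplicated=True),
    ]
}

# HYPOTHETICAL associations, chosen only to exercise each policy.
SUBCLIENT_POLICY: Dict[str, str] = {
    "Exchange Mailbox Archiver": "Dell Disk Array w/dedupe",
    "Local File System File Archiver": "Dell Disk Array1",
    "MS SharePoint Archiver": "EqualLogic3",
}


def libraries_for(subclient: str) -> List[str]:
    """Return the libraries that hold copies of a subclient's data,
    primary copy first, followed by the secondary copy if one exists."""
    policy = POLICIES[SUBCLIENT_POLICY[subclient]]
    libraries = [policy.primary_library]
    if policy.secondary_library is not None:
        libraries.append(policy.secondary_library)
    return libraries


if __name__ == "__main__":
    for subclient in SUBCLIENT_POLICY:
        policy = POLICIES[SUBCLIENT_POLICY[subclient]]
        print(subclient, "->", libraries_for(subclient),
              "(dedupe)" if policy.deduplicated else "")
```

The model mirrors the point made in the text: enabling deduplication or a secondary copy for a client is a matter of changing which storage policy the client or subclient is associated with.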
Highly Available CommVault Simpana software Reference Architecture with the Dell DX Object Storage Platform

This architecture is for enterprises with up to 5,000 users that need a highly available archive solution employing the self-managing DX6000 Object Storage, plus high-performance Content Indexing running on multiple servers. High availability is provided by two Simpana CommServe servers running on two PowerEdge R610 servers configured as a Microsoft Failover Cluster, as well as two Simpana MediaAgents running on DL Appliance storage servers in a GridStor cluster. Further high availability is achieved by dual DX6000 Cluster Services Nodes and two or more DX6012S Storage Nodes. Shared storage for the Microsoft Failover Cluster is provided by a Dell EqualLogic PS6000XV iSCSI storage array.

Figure 8. Highly Available CommVault Simpana software Reference Architecture with the Dell DX Object Storage Platform
The DL Appliance internal storage serves as primary storage, with the DX6000 as a secondary target for material to be archived for longer periods. Since the interface to the DX storage is through the HTTP protocol, a Cloud Connector for the DX has been added to Simpana software. In addition to employing the DX6000 storage, the High Availability Reference Architecture extends the functionality of the EqualLogic Reference Architecture in two key ways (see Table 5):

- Content Indexing is deployed across multiple nodes. The first one, on server idm2-ci2-1, performs the content indexing administration, serves as the location of a search index, and can host the search web server for end-user and administrative search. The second node, on server idm2-ci2-2, and any subsequent nodes that may be added, serve exclusively as search index nodes.
- Copying deduplicated data to a secondary DX6000 requires the use of a SILO copy on the deduplication storage policy.

Table 5. High Availability Reference Architecture Functionality

Clients
  Exchange: Exchange Database iDataAgent; Exchange Mailbox Archiver; File System iDataAgent
  Windows File System: Local File System File Archiver; File System iDataAgent
  CIFS Server: Network File Share File Archiver
  SharePoint: SharePoint Server iDataAgent; MS SharePoint Archiver; File System iDataAgent

Libraries
  Dell Disk Array1: DL Appliance-1 local storage
  Dell Disk Array2: DL Appliance-2 local storage
  DX6000: DX6000 cluster storage

Storage Policies / MediaAgents
  CommServeDR / IDM2-DL2-1: Disaster recovery policy for CommServe
  Dell Disk Array1 / IDM2-DL2-1: Primary copy uses Dell Disk Array1 library; Secondary copy uses DX6000 library
  Dell Disk Array2 / IDM2-DL2-2: Primary copy uses Dell Disk Array2 library; Secondary copy uses DX6000 library
  Dell Disk Array w/dedupe / IDM2-DL2-1: Primary copy uses Dell Disk Array1 library w/ client-side and MediaAgent deduplication; SILO copies to DX6000 library

Content Indexing Engines
  CIEngine Cluster: Admin/Search Index on idm2-ci2-1; Search Index on idm2-ci2-2

The CommVault Simpana software CommCell console is shown in Figure 9. The Client Computers, Libraries, Storage Policies and Content Indexing Engines are shown along with the subclients of the Exchange Mailbox Archiver defaultarchiveset, the Job Controller and the Event Viewer.

Figure 9. CommVault Simpana software CommCell Console
Email and File Archive Sizing

The sizing of CommVault Simpana file and email archive solutions with Dell storage is extremely user-dependent and should be addressed by a Dell Services planning engagement. Two example use cases illustrate the variables involved in sizing. In the first case it is assumed that the average user sends or receives 20 messages a day that will be retained and will need to be archived and content indexed. In the second case it is assumed that a journaling Exchange server is set up to capture all email traffic, and therefore the number of email messages per user per day rises to 100. In both cases there is a specified amount of new file data to be archived per week, as well as initial file and email data that is present at the time archiving starts. Both use cases were run for 1,000 and 5,000 users.

In these examples the following parameters were used:

- Average email size without attachments: 15 KB
- Average percentage of emails with attachments: 20%
- Average number of attachments when there are attachments: 1
- Average size of attachment: 250 KB
- Initial PST file and email backlog: 200 MB per user
- New file data per week: 5 MB per user
- Initial file data backlog: 100 MB per user
- Reduction due to filtering of non-searchable files (JPGs, etc.): 33%
- Working days per year: 260 (52 weeks x 5 working days per week)
- Retention period: 5 years for archives, 3 years for content index data
- Data growth: 20% per year

The non-journaled case, with 20 messages to archive per user per day, was used to design the three reference architectures:

Table 6. CommVault Simpana software sizing, non-journaled case

          1,000 Users                         5,000 Users
Year      Archive (GB)   Content Index (GB)   Archive (GB)   Content Index (GB)
Year 1    690            233                  3448           1164
Year 2    827            279                  4137           1396
Year 3    993            335                  4965           1676
Year 4    1192                                5958
Year 5    1430                                7149
Total     5.0 TB         0.8 TB               25.1 TB        4.1 TB

The DL Appliance in the Internal Storage Reference Architecture (1,000 users) provides about 8 TB of usable archive storage, which allows for the predicted 5 TB required over a five-year period plus extra for backups, etc. A single content indexer can hold about 1 TB of indexes, sufficient for the 3 years of index data shown in the model.
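As a worked illustration of how the per-message parameters combine, the sketch below computes the average message size and the raw yearly email volume before any filtering, deduplication or compression. It does not attempt to reproduce the Table 6 figures, because the paper does not spell out how the parameters, backlogs and reductions were combined. The function names and the binary conversion (1 GB = 1024 x 1024 KB) are assumptions.

```python
KB_PER_GB = 1024 * 1024  # assumption: binary units


def average_message_kb(body_kb=15, attach_fraction=0.20,
                       attachments_per_message=1, attachment_kb=250):
    """Expected size of one message: body plus the expected attachment size."""
    return body_kb + attach_fraction * attachments_per_message * attachment_kb


def raw_email_gb_per_year(users, messages_per_user_per_day, working_days=260):
    """Raw yearly email volume in GB, before any reduction is applied."""
    messages = users * messages_per_user_per_day * working_days
    return messages * average_message_kb() / KB_PER_GB


if __name__ == "__main__":
    print("average message size (KB):", average_message_kb())
    print("non-journaled, 1,000 users (GB/year):",
          round(raw_email_gb_per_year(1000, 20)))
    print("journaled, 1,000 users (GB/year):",
          round(raw_email_gb_per_year(1000, 100)))
```

With the stated parameters the average message works out to 15 KB + 0.20 x 250 KB = 65 KB, which shows how strongly the attachment assumptions dominate the email volume.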
The two 5,000-user reference architectures provide about 30 TB (EqualLogic) and 28 TB (DX6000) respectively, both sufficient for the predicted 25.1 TB. To index every email from every user would require about 4 content index servers. More than likely, however, an enterprise will need to search only content from specified departments and time periods.

The case where all emails are journaled and archived is shown next. As seen, this dramatically increases the required storage, further underscoring the need to fully understand the user's environment before designing a Simpana implementation.

Table 7. CommVault Simpana software sizing, journaled case

          1,000 Users                         5,000 Users
Year      Archive (GB)   Content Index (GB)   Archive (GB)   Content Index (GB)
Year 1    1552           524                  7761           2619
Year 2    1863           629                  9313           3143
Year 3    2235           754                  11175          3772
Year 4    2682                                13410
Year 5    3218                                16092
Total     11.3 TB        1.9 TB               56.4 TB        9.3 TB
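The year-over-year figures in Tables 6 and 7 follow the stated 20% annual data growth, and the totals can be approximated from the Year 1 values alone. The sketch below starts from the published Year 1 numbers, compounds them at 20% per year, and sums over the retention period (5 years for archives, 3 years for content index data). The function names are hypothetical, and the conversion of 1 TB = 1024 GB is an assumption inferred from the published totals. Because it starts from rounded Year 1 values, individual years can differ from the tables by about 1 GB.

```python
GB_PER_TB = 1024  # assumption inferred from the published totals


def project_yearly_gb(year1_gb, years, growth=0.20):
    """Yearly new data in GB, compounding `growth` from the Year 1 value."""
    return [year1_gb * (1 + growth) ** n for n in range(years)]


def total_tb(year1_gb, years, growth=0.20):
    """Total data retained over `years`, in TB."""
    return sum(project_yearly_gb(year1_gb, years, growth)) / GB_PER_TB


if __name__ == "__main__":
    cases = {
        "non-journaled, 1,000 users": (690, 233),
        "non-journaled, 5,000 users": (3448, 1164),
        "journaled, 1,000 users": (1552, 524),
        "journaled, 5,000 users": (7761, 2619),
    }
    for label, (archive_year1, index_year1) in cases.items():
        archive = round(total_tb(archive_year1, years=5), 1)
        index = round(total_tb(index_year1, years=3), 1)
        print(f"{label}: archive {archive} TB, content index {index} TB")
```

A projection like this is useful mainly for sensitivity checks, for example seeing how the five-year total moves if the growth rate or retention period changes; the final sizing should still come from the consultative design session described above.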
Acknowledgements

Dave Jaffe would like to thank his colleagues in the Dell Solution Centers (John Tian, Intelligent Data Management Global Team Lead at the time this work was done, for his excellent leadership as well as technical savvy; Trenton Potgieter for his work on SharePoint, the clustered configuration and networking; Gary Pannell and Thomas Heine for lab support; and Tim Platson for valuable discussions); Darin Camp, Juan Garcia, Cam Vitale and the rest of the CommVault crew for their excellent support; and Scott Reichmanis and Sri Nandigam of Dell Product Group for help setting up the DL Appliance and DX6000.