SharePoint Data Management and Scalability on Microsoft SQL Server
  Kevin Kline, Technical Strategy Manager for SQL Server, Quest Software
  http://www.kevinekline.com
  @KEKline
  Contributions: Joel Oleson, Mike Watson, Todd Klindt
  © 2009 Quest Software, Inc. ALL RIGHTS RESERVED
Welcome & Thanks!
Kevin Kline
  SQL Server Expert for Quest Software
  Former developer, DBA, and enterprise architect since '86
  Former president of PASS (www.sqlpass.org)
  Microsoft MVP since '04
  Author of SQL in a Nutshell and 9 other books
  Twitter: @kekline
  Blog: http://kevinekline.com
Audience Poll
  New to SharePoint?
  SQL Admins?
  SharePoint Admins?
  Large-scale Implementation (+1 TB) experience?
  How many SQL Admins are freaking out because of the number of SharePoint databases?
Session Objectives and Takeaways
  Session Objective(s):
    Understand the SQL and storage factors that affect a large-scale SharePoint deployment.
    Learn SharePoint SQL and storage best practices.
  Key Takeaway:
    Proper SQL and storage design is critical to overall SharePoint health!
SharePoint Containment Hierarchy
  Farm
  Servers: Web Front End, APP, SQL
  Web Applications: Central Admin, SSP Admin, Content
  Databases: Content, Config, SSP, Search
  Site Collections: Internet, Intranet Portal, Wikis, Blogs, Team, Doc, Mtg
  Sites: Wikis, Blogs, Team, Doc, Mtg
  Lists: Doc Lib, Pages, Events, Discussions, Surveys, etc.
  Items: Files, calendar items, contacts, customers, images, custom
Understanding SharePoint Databases
  Farm
    Config: Servers, Web Apps, Solutions, Global Config
  Web App
    Content 1..2: Site Collections, Sites, Lists, Pages, Documents, DWPs
  SSP
    Search: Properties
    SSP: My site host config, Profiles, BDC config, Excel Calc
Understanding Configuration DB
  Config Database: Sites (Spsites), Servers, Webapps (Vservers)
Understanding Content DB
  Content Database: Sites (spsite), Webs (spweb), Doc Stream
Understanding SSP DB - Search
  Search Database: Search Properties
Understanding SSP DB
  SSP Database: MySite Host Config, Profiles, BDC Config, Excel Calc
Why is SQL that important?
  SQL Health = SharePoint Health!
  Sub-optimal SQL performance will radiate to other components in the farm.
  Slow response from SQL Server will result in queued app requests.
  As the app slows down, so does SQL.
  Slow App <-> Slow SQL
Database Disk I/O Demand
  Demand levels: Most Demand, Medium Demand, Low Demand
  Databases: Search, Config, *Content, Temp, +SSP, Model, Tlogs, Master, Logging Usage (New In 2010)
  * Except during backup, indexing, and during Profile Import
Top Performance Killers
  Indexing/Crawling
  Backup (SQL & Tape)
  Profile Import
  Misc Timer Jobs
  User Sync for large #s of Users
  Poor Storage Configuration
  STSADM Backup/Restore
  Large List Operations
  Heavy User Operation List Import/Write
  Network
  Inefficient Queries
Scaling SQL: SCALE OUT
  (Diagram labels: 2.5TB, 2.5TB, 2.5TB)
Scaling SQL - Out
  More SQL servers = more flexibility
  There aren't really any physical barriers
  SharePoint won't prevent you from placing 100 databases on 100 different SQL instances
  The real barriers are manageability and cost.
  More servers = more money
  More servers = more management
  $$ + more management = $$$$
Scaling SQL: SCALE UP
  (Diagram labels: 2.5TB, 2.5TB, 2.5TB)
Scaling SQL - Up
  Design is paramount! Consider the following:
    Overall SQL throughput (transactions/sec)
    Disk throughput (IOPS)
    Network throughput (MB/sec)
    Disk backup throughput (MB/sec)
    Network-based backup throughput (MB/sec)
    Length of maintenance windows (hours -> minutes)
    SharePoint upgrade throughput
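To make the throughput bullets concrete, here is a rough sketch of how backup throughput turns into a maintenance window. The function name and the 200 MB/sec figure are illustrative assumptions, not numbers from this deck; the 2.5 TB size is borrowed from the scaling diagrams.

```python
def backup_hours(db_size_tb: float, throughput_mb_per_sec: float) -> float:
    """Estimate the hours needed to back up db_size_tb at a sustained throughput."""
    if throughput_mb_per_sec <= 0:
        raise ValueError("throughput must be positive")
    size_mb = db_size_tb * 1024 * 1024  # TB -> MB, binary units
    seconds = size_mb / throughput_mb_per_sec
    return seconds / 3600


if __name__ == "__main__":
    # Assumed sustained backup rate of 200 MB/sec for a 2.5 TB instance.
    print(f"{backup_hours(2.5, 200):.1f} hours")
```

If the result does not fit the maintenance window, that is a signal the instance has been scaled up past what the backup path can carry.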
SQL: Scale Out vs. Scale Up
                 Scale Out             Scale Up
  Advantages     Better Performance    Easier to Manage
                 Better Flexibility    Cheaper
  Disadvantages  More Expensive        System Design is Critical
                 Harder to Manage      Single Point of Failure
Walkthrough: Scale Up vs. Out
  Deployment: How to Design a 5TB SharePoint SQL
  (Diagram labels: ten 1TB units)
Consider the Organization
  Will the SharePoint SQL Servers be self-managed?
  What experience does the team managing SQL have?
  Do they have:
    Monitoring?
    Standard Maintenance Procedures?
    Standard Maintenance Windows?
    Standard SQL Builds?
  What are the break/fix and standard SLAs?
Scaling SQL - The Bottom Line
  Don't scale SQL instances beyond comfort zones!
  Do measure system throughput.
  Know all of your bottlenecks!
  Scaling out is more flexible, but scaling up is more cost-effective.
  Find a balance between scaling up and out and stick to it (1-5TB per instance, for example).
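A small sketch of what "find a balance and stick to it" looks like as arithmetic: pick a per-instance cap and see how many SQL instances the farm needs. The function name and the 7.5 TB total (three 2.5 TB units) are assumptions for illustration only.

```python
import math


def instances_needed(total_tb: float, max_tb_per_instance: float) -> int:
    """Number of SQL instances required to hold total_tb under a per-instance cap."""
    if total_tb < 0 or max_tb_per_instance <= 0:
        raise ValueError("sizes must be non-negative and the cap must be positive")
    return math.ceil(total_tb / max_tb_per_instance)


if __name__ == "__main__":
    total = 7.5  # assumed total content, in TB
    for cap in (1, 2.5, 5):
        print(f"cap {cap} TB per instance -> {instances_needed(total, cap)} instance(s)")
```

A lower cap means more servers to buy and manage; a higher cap means fewer servers but longer backups and upgrades on each one.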
Highly Available Deployment?
  Redundant Switches
  Redundant Web/Application Servers
  Active/Passive SQL w/ Redundant HBAs
  Redundant SAN Fabric
  RAID 1 Storage
  Redundant Power Supplies
Mirroring Within a Farm
  SQL High Availability or High Protection (sync) mirroring replaces or augments clustering as the SQL HA solution.
  Farm components can span closely located datacenters*
    Must have LAN-like connectivity (1Gbps)
    Must have less than 1ms of latency (2ms RTT)
  Can be Active/Active or Active/Passive
  Use DNS or load balancing to direct traffic between front ends.
Mirroring Within Farm
High Availability Between Farms
  Can use a variety of methods to ship content between farms/data centers:
    Log shipping
    Mirroring
    Storage replication
  Longer distances supported*
  The greater the latency, the harder it is to replicate content
  No way to keep configuration or search in sync
The Two Basic HA/DR Scenarios
  Mirroring Within Farm
    Pros:
      Great combo HA/DR solution
      Cheaper to implement
      Easier to manage
    Cons:
      Requires closely located datacenters
      Requires excellent network conditions
      Not flexible
      Content corruption is replicated immediately.
  Mirroring/Log Shipping Between Farms
    Pros:
      Allows long-distance separation
      Can protect against logical corruption
      Very flexible!
    Cons:
      More expensive
      Harder to set up and manage
      Failover is a big decision
Combining Solutions
SQL 2008 - Do you have Enterprise?
  SQL Enterprise
    Asynchronous Mirroring with compression
    Transparent Data Encryption
    Backup Compression
    Resource Governor
  SQL Standard
    Synchronous Mirroring
    2 Node Clustering
    Log Shipping with compression
    Restore Compressed Backup
  SQL Express
    FREE!
    Both SharePoint Foundation 2010 and SharePoint 2010 use it
    Up to 4 GB Max Storage
    Use as a Witness in Mirroring
Content DB Size Limitation: 100GB?
  Exceeding 100GB? Keep in mind:
    Backup/restore/maintenance will be harder
    Use differential backup
    All sites share the same tables.
    Isolate large sites.
    Use multiple data files
    Defrag regularly
  YMMV: H/W and usage profile dependent.
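One way to act on "isolate large sites" while keeping databases near the 100GB mark is a simple packing pass over the site collections. This is a minimal sketch using first-fit decreasing; the site names and sizes are made up for illustration, and the function is not a SharePoint API.

```python
from typing import Dict, List


def plan_content_dbs(site_sizes_gb: Dict[str, float],
                     target_gb: float = 100.0) -> List[List[str]]:
    """Group site collections into content databases (first-fit decreasing).

    A site at or above target_gb is isolated in a database of its own.
    """
    dbs: List[List[str]] = []
    free: List[float] = []  # remaining room per database, in GB
    ordered = sorted(site_sizes_gb.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in ordered:
        if size >= target_gb:
            dbs.append([name])
            free.append(float("-inf"))  # closed: nothing else may join it
            continue
        for i, room in enumerate(free):
            if size <= room:
                dbs[i].append(name)
                free[i] -= size
                break
        else:
            # No existing database has room, so open a new one.
            dbs.append([name])
            free.append(target_gb - size)
    return dbs


if __name__ == "__main__":
    # Hypothetical site collections and sizes in GB.
    sites = {"intranet": 140, "teams": 60, "wikis": 30, "blogs": 25, "projects": 45}
    for number, members in enumerate(plan_content_dbs(sites), start=1):
        print(f"content DB {number}: {members}")
```

Real placement also depends on growth rate and usage profile, which this sketch ignores.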
Large Lists: 2000 Items?
  SharePoint supports large lists, but you must carefully plan how users view the lists to prevent performance impacts.
  For best performance, do not exceed 2,000 items per folder or view.
  Define limits on views.
  Use indexed columns.
  Take it easy on column and field counts.
SQL Memory: 4GB Enough?
  4 GB is the minimum required memory, 8 GB is recommended for medium-size deployments, and 16 GB and above is recommended for large deployments.
  What influences the amount of RAM?
    Number and size of content databases.
    Number of concurrent requests to SQL.
    Size and width of commonly used lists.
  Remember: Minimum is where we start
SQL Data Files
  Best Practices:
    TempDB:
      Create multiple data files, up to ½ or even ¼ of the number of CPU cores
      Allocate TempDB on RAID 1 (or R1 variants)
      RAID5 sucks at writes!
    Separate data and logs on different LUNs
    Spread databases across multiple spindles
    Pre-allocate files and use autogrow only as a safety net
    SharePoint 2010 supports file groups for content databases!
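The TempDB guideline above reduces to a small calculation. A sketch, assuming whole files and a floor of one file; the function name is hypothetical.

```python
from typing import Tuple


def tempdb_file_range(cpu_cores: int) -> Tuple[int, int]:
    """Return (quarter, half) of the core count as TempDB data file counts.

    Both values are floored to whole files and never drop below one.
    """
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return max(1, cpu_cores // 4), max(1, cpu_cores // 2)


if __name__ == "__main__":
    for cores in (4, 8, 16):
        low, high = tempdb_file_range(cores)
        print(f"{cores} cores -> {low} to {high} TempDB data files")
```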
Identifying Disk Bottlenecks
  Perfmon
    Monitor Disk Transfers/sec for throughput trends.
    Monitor Avg. Disk sec/Read and Avg. Disk sec/Write for bottlenecks.
    Monitor disk queue length for bottlenecks.
  SQL
    SELECT * FROM sys.dm_io_virtual_file_stats(NULL, NULL)
  Solution
    http://bit.ly/9on4ch
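The raw output of sys.dm_io_virtual_file_stats is a set of cumulative counters, so it usually needs one more step to become a latency figure. The sketch below divides stall time by operation count per file. The column names match the DMV; the sample rows and the 20 ms threshold are assumptions for illustration, not values from this deck.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FileIoStats:
    """Subset of the columns returned by sys.dm_io_virtual_file_stats."""
    database_id: int
    file_id: int
    num_of_reads: int
    io_stall_read_ms: int
    num_of_writes: int
    io_stall_write_ms: int


def avg_ms(stall_ms: int, operations: int) -> float:
    """Average stall per operation in milliseconds; 0.0 when there were no operations."""
    return stall_ms / operations if operations else 0.0


def slow_files(rows: List[FileIoStats],
               threshold_ms: float = 20.0) -> List[Tuple[int, int, float, float]]:
    """Return (database_id, file_id, read_ms, write_ms) for files over the threshold."""
    flagged = []
    for r in rows:
        read_ms = avg_ms(r.io_stall_read_ms, r.num_of_reads)
        write_ms = avg_ms(r.io_stall_write_ms, r.num_of_writes)
        if max(read_ms, write_ms) > threshold_ms:
            flagged.append((r.database_id, r.file_id, read_ms, write_ms))
    return flagged


# Hypothetical rows, as if copied from the DMV output.
SAMPLE = [
    FileIoStats(2, 1, 1000, 5000, 2000, 100000),
    FileIoStats(5, 1, 500, 2000, 0, 0),
]

if __name__ == "__main__":
    for db, file_id, read_ms, write_ms in slow_files(SAMPLE):
        print(f"db {db} file {file_id}: read {read_ms:.1f} ms, write {write_ms:.1f} ms")
```

Because the counters accumulate over time, comparing two snapshots taken some minutes apart gives a better picture of current load than a single reading.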
Lots of New SharePoint 2010 Databases
  Farm
    Config
    Admin_Content
  Services
    Application Registry Service
    StateService
    Web Analytics Web Service
    WSS_Usage
    Reporting_DB
    Staging DB
    Search_Service_Application
    Crawlstore
    SearchDB
    PropertyStore
    WSS_Search
    SocialDB
    ProfileDB (was SSP db)
    SyncDB
    BDC_Service_DB
    Word Conversion Service Application
    Performance_Point
    ManagedMetadata
    Secure_Store_Service
  Content
    WSS_Content_GUID
    WSS_Content_GUID1
    WSS_Content_GUID2
Large List Throttling
  You control when and how much!
  Configurable list throttling and thresholds
  List throttling controls force end users to create more efficient views with fewer than x items.
Web Part Performance Dashboards
Best Practices Analyzer Health Rules
  Runs on a Timer Job.
  Create your own!
  Repair Automagically!
  Search TechNet for: Plan for software boundaries
Logs & Reporting to the DB
  Extensibility for reporting: the possibilities are limitless
Applying the Newest Learnings
  Add more processors to the back end: 4 cores to 8 cores
  Add more RAM: 16GB to 32GB
  Run profile sync on our terms!
  Run the jobs as little as possible: once a week or once a month.
  Separate the SSP SQL instance from the Search SQL instance.
Summary
  SQL is extremely important to SharePoint health and performance
  Put SQL on 64-bit (required for SharePoint 2010)
  SQL 2008 Enterprise: scale, HA, compliance, and security features
  Think IOPS when designing disk arrays.
  Always separate workloads with the following priority: temp, log, search, content.
  SQL scales up and out. Don't push the limits upward, but keep manageability and costs in mind when scaling out.
  Design enterprise services with great care. Separate SSP and Search when possible.
  SharePoint 2010 brings more databases, so strategically plan for 20-50 DBs minimum
Search Disk Performance
  Drive             IOPs Read (max)  IOPs Write (max)  Ratio Read/Write  Latency Read (sec)  Latency Write (sec)
  Search DB Logs              14.67          1,777.29              0.01              0.3060               0.8550
  Temp DB                  1,110.98          1,492.01              0.74              1.6870               3.5660
  Query file group         3,507.26          1,631.96              2.15              3.4360               3.2140
  Crawl file group         3,043.93            371.65              8.19             15.0840              15.8720
  Reference: http://blogs.msdn.com/enterprisesearch/archive/2008/05/19/sqlmonitoring-and-i-o.aspx
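The Ratio Read/Write column appears to be read IOPs divided by write IOPs, rounded to two places. A quick sketch that recomputes it from the figures in the table; the dictionary and function names are illustrative.

```python
# (max read IOPs, max write IOPs) per drive, copied from the table above.
SEARCH_DISK_IOPS = {
    "Search DB Logs": (14.67, 1777.29),
    "Temp DB": (1110.98, 1492.01),
    "Query file group": (3507.26, 1631.96),
    "Crawl file group": (3043.93, 371.65),
}


def read_write_ratio(read_iops: float, write_iops: float) -> float:
    """Read IOPs divided by write IOPs, rounded to two decimal places."""
    return round(read_iops / write_iops, 2)


if __name__ == "__main__":
    for drive, (reads, writes) in SEARCH_DISK_IOPS.items():
        print(f"{drive}: {read_write_ratio(reads, writes)}")
```

A ratio well below 1 (the log drive) points to a write-heavy volume, while a ratio well above 1 (the crawl file group) points to a read-heavy one, which matters when choosing RAID levels for each.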