Geospatial Server Performance
Colin Bertram
UK User Group Meeting, 23-Sep-2014
Topics
- Auditing a Geospatial Server Solution
- Web Server Strategies and Configuration
- Database Server Strategy and Configuration
- Network Infrastructure
- Capacity Planning & Analysis
- Web Application Security
- Virtualisation

25-Sep-14 2014 Intergraph Corporation
AUDITING A GEOSPATIAL SOLUTION
Performance Audit Methodology
A Good Technology Audit Requires
- Current user experience
- Web server hardware size, capabilities, software, and configuration
- Database server size, capabilities, software, and configuration
- Network configuration and bandwidth
- Database configuration / tuning
- Data contents, format, indexing, metadata, use
- Web application design / review
- 1, 2, and 3-year outlook for the system
- Performance testing (baseline)
Tuning
Once the audit is complete
- Using all available instrumentation, have a user perform end-to-end use of the system while collecting metrics. This is the baseline against which all optimisations will be measured
- Document the metrics, identifying the best opportunities for improvement
- Make improvements one at a time, testing after each to quantify the effect of the change
- Perform a final baseline analysis, identifying the remaining system bottlenecks
- Document new user perceptions of the system
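The comparison of a baseline run against a run made after a single change can be sketched as follows. This is a minimal illustration, not the audit tooling itself: the function names, the choice of mean and 95th percentile as metrics, and the timing values are all assumptions.

```python
from statistics import mean

def summarise(samples):
    """Return mean and 95th-percentile response time (seconds) for one test run."""
    ordered = sorted(samples)
    # Nearest-rank percentile: ceil(0.95 * n), computed with integer arithmetic.
    rank = max(1, -(-95 * len(ordered) // 100))
    return {"mean": mean(ordered), "p95": ordered[rank - 1]}

def compare(baseline, candidate):
    """Percentage change of each metric relative to the baseline (negative = faster)."""
    b, c = summarise(baseline), summarise(candidate)
    return {name: 100.0 * (c[name] - b[name]) / b[name] for name in b}

# Hypothetical end-to-end timings in seconds, before and after one change.
baseline_run = [2.0, 2.5, 3.0, 2.5, 4.0]
after_one_change = [1.0, 1.5, 2.0, 1.5, 3.0]

change = compare(baseline_run, after_one_change)
for metric, percent in change.items():
    print(f"{metric}: {percent:+.1f}%")
```

Running the same comparison after every individual change keeps each improvement attributable to one cause.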
WEB SERVER STRATEGIES AND CONFIGURATION
Server Configuration Guidelines
CPU
- Avoid consumer-grade processors
- Faster is better
- 64-bit is better but may require extra configuration for backwards compatibility
- Fast system bus is better
- 2 processors or processor cores can improve load but not speed
- Recommend at least 4 CPU cores for production environments
Memory
- More is better
- Allocate 1GB for Windows 2008/2012 Server and related overhead
- In the case of GeoMedia WebMap:
  - Allocate 32MB per concurrent user request for overhead
  - Allocate 32MB to 128MB per concurrent user request to cache map data (actual amount depends on richness of map content)
- Recommend at least 4GB (for smaller implementations)
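The memory guidance above amounts to simple arithmetic, sketched below. The function name and parameter names are hypothetical, and treating 1GB as 1024MB and the 4GB recommendation as a hard floor are assumptions made for the example.

```python
def estimate_memory_mb(concurrent_requests, cache_mb_per_request=128,
                       os_overhead_mb=1024, request_overhead_mb=32,
                       floor_mb=4096):
    """Estimate server RAM in MB from the per-request allowances on the slide."""
    if not 32 <= cache_mb_per_request <= 128:
        raise ValueError("slide guidance gives a cache range of 32MB to 128MB")
    needed = os_overhead_mb + concurrent_requests * (
        request_overhead_mb + cache_mb_per_request
    )
    # Never go below the recommended minimum for smaller implementations.
    return max(needed, floor_mb)

# 40 concurrent requests with rich map content (128MB cache each).
print(estimate_memory_mb(40, cache_mb_per_request=128))
# 10 concurrent requests: the calculated need falls below the 4GB minimum.
print(estimate_memory_mb(10))
```

The cache figure dominates the result, so measuring actual map richness matters more than refining the other terms.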
Server Configuration Guidelines
Disk
- A fast disk interface is better
- Recommend at least 10K RPM
- Recommend 5.4 ms access time or better
- Large I/O buffer is better on NT-supported drives
- Recommend SCSI technologies over ATA technologies
- RAID is good when applied correctly; avoid slow RAID 5
- Recommend RAID 10 (1+0) for production environments
Network
- 100 Mb/s is okay for most developer applications
- 1 Gb/s or 10 Gb/s is better for production and enterprise applications
- Upgrade hubs to switches at key connection points
- Consider teaming multiple network cards to improve bandwidth
- Minimize the path between critical devices
- Verify all pathways are Full Duplex
DATABASE SERVER STRATEGIES AND CONFIGURATION
Generic SQL Database Tuning
SQL Database Indexing - DO
- Index tables that are frequently queried for less than 2% to 4% of the table's rows
- Index all frequently queried tables with more than 100 records
- Index all queried tables with more than 1000 records
- Index key columns such as ID, MSLINK, etc., when indexing is required
- Index frequently queried database columns
- Index database columns that join to other tables
- Index uniquely if possible
- Cluster critical data if your database supports it
Generic SQL Database Tuning
SQL Database Indexing - DO NOT
- Index every column in a table unless you have query criteria for every column
- Index a table before loading or translating a large quantity of data into it
- Index a table when others are using the database
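A few of the indexing guidelines can be illustrated with a small sketch. SQLite (via Python's standard library) is used here only as a convenient stand-in for the databases discussed in this deck; the parcels table, its columns, and the index name are hypothetical. The data is bulk-loaded first and indexed afterwards, and the key column is indexed uniquely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (mslink INTEGER, owner TEXT, area REAL)")

# Bulk-load first; creating the index before a large load slows every insert.
rows = [(i, f"owner{i % 50}", float(i)) for i in range(5000)]
conn.executemany("INSERT INTO parcels VALUES (?, ?, ?)", rows)

# mslink is a key column with unique values, so index it uniquely.
conn.execute("CREATE UNIQUE INDEX idx_parcels_mslink ON parcels (mslink)")
conn.commit()

# Ask the query planner how it would run a selective lookup on the key column.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM parcels WHERE mslink = ?", (1234,)
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
print(plan_text)
```

The printed plan should name the index rather than report a full table scan; each database product has its own equivalent of this check.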
Generic SQL Database Tuning
SQL Normalisation
A good database design is essential. To optimise a database, you should consider various levels of normalisation. These include:
- Eliminate repeating groups of columns. Place this data into a join table.
- Eliminate repeating text strings. Place this data into a join table.
- Eliminate unnecessary data in your primary tables. Place this data into a join table.
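Eliminating repeating text strings might look like the sketch below. SQLite is again only a stand-in, and the table and column names (parcels_flat, land_use, parcels) are hypothetical. Each distinct string is stored once in a separate table and referenced by an integer key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Before: the land-use description is repeated as text in every row.
    CREATE TABLE parcels_flat (mslink INTEGER PRIMARY KEY, land_use TEXT);
    INSERT INTO parcels_flat VALUES (1, 'Residential'), (2, 'Commercial'),
                                    (3, 'Residential'), (4, 'Residential');

    -- After: each distinct string is stored once in a join table.
    CREATE TABLE land_use (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    INSERT INTO land_use (name) SELECT DISTINCT land_use FROM parcels_flat;

    CREATE TABLE parcels (mslink INTEGER PRIMARY KEY,
                          land_use_id INTEGER REFERENCES land_use(id));
    INSERT INTO parcels
        SELECT p.mslink, l.id
        FROM parcels_flat p JOIN land_use l ON l.name = p.land_use;
""")

# The original view of the data is recovered with a join.
rows = conn.execute("""
    SELECT p.mslink, l.name
    FROM parcels p JOIN land_use l ON l.id = p.land_use_id
    ORDER BY p.mslink
""").fetchall()
print(rows)
```

The primary table now carries a small integer per row instead of a repeated string, which reduces I/O on the table that is queried most often.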
Microsoft Access Data Server
Pros
- Cheap, portable, and widely used
- Full-featured SQL interface
- Spatial indexing available with _sk Spatial Keys
- Memory mapping results in a small footprint
Cons
- Data Access Objects (DAO) interface is slow
- Single-user read/write database
- Not respected in the industry
- File size limited to 2GB
- Windows-imposed limit: 128 connections per server
- Windows-imposed limit: 2048 tables across all connections per server
Intergraph GeoMedia SmartStore Data Server
Pros
- Very fast access to spatially filtered data
- Compressed to eliminate unnecessary I/O
- Memory maps the CDT and DDC files. Windows NT treats the files as if they were an extension of the swap file
- Depends on the Windows NT kernel to keep a reasonable portion of the data files in RAM (all of them if RAM is available)
Cons
- Weak SQL interface compared to other databases
- Attribute data is flat and supports no indexing
Microsoft SQL Server Data Server
Pros
- Fast
- Good price / performance ratio
- Easy user interface
- Scales easily (with fail over)
Cons
- Market perception of Microsoft
- SQL*Express edition limited to 1 CPU
Oracle Spatial Data Server
Pros
- Respected as an industry-standard enterprise database
- Fault tolerant, scalable, reliable
- The fastest GIS data store on the market when correctly configured and indexed
Cons
- Database configuration and indexing are required, and can be very difficult
- Memory hungry
- Very expensive
Which Database Meets Your Needs?
Database Suitability Matrix

                                Access  ArcView  SmartStore  SQL*Server  Oracle  Scale
Software Cost                   4       5        5           2           1       1 = Expensive
Software Maintenance            4       5        4           2           1       1 = Expensive
General data query performance  3       1        1           5           5       1 = Slow
Spatial data query performance  3       1        5           4           5       1 = Slow
Scalable                        1       1        1           4           5       1 = Not Scalable
Ease to maintain                5       5        4           3           1       1 = Difficult
Technical Size Limitation       2GB     2GB      2GB         -           -
Practical Size Limitation       100MB   100MB    2GB         -           -
Read/Write                      Yes     No       No          Yes         Yes
Read/Write with Transactions    No      No       No          Yes*        Yes
Enterprise data store           No      No       No          Yes         Yes
Industry-recognized             No      No       No          Yes         Yes
Long-term Cost                  Low     Low      Low         Medium      High

Please refer to vendor recommendations when sizing SQL*Server or Oracle hardware
NETWORK INFRASTRUCTURE
Network Usage
What to Look For
- Network I/O is a potential bottleneck for distributed systems
- Be prepared to monitor total network traffic as a percentage of capacity
- For each process, be prepared to monitor network traffic by type and destination
- Look at bandwidth requirements to/from file server(s), database server(s), and web clients
- Look for extraneous network traffic not targeted at the server under observation
- Monitor hardware interrupts and Deferred Procedure Calls (DPCs)
  - Network traffic is the primary source of interrupts
  - For every packet received, the Network Interface Card will interrupt the server so that the Windows kernel can receive the data
- A badly programmed switch can cause excessive network traffic, resulting in slowed operations
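Expressing traffic as a percentage of capacity, and breaking it down by destination, can be sketched as below. The function names, the sample figures, and the server names are hypothetical; real byte counts would come from whatever monitoring tool is in use.

```python
from collections import defaultdict

def utilisation_percent(bytes_transferred, interval_seconds, link_mbps):
    """Traffic over an interval as a percentage of the link's rated capacity."""
    bits_per_second = bytes_transferred * 8 / interval_seconds
    return 100.0 * bits_per_second / (link_mbps * 1_000_000)

def traffic_by_destination(samples):
    """Sum byte counts per destination from (destination, byte_count) samples."""
    totals = defaultdict(int)
    for destination, byte_count in samples:
        totals[destination] += byte_count
    return dict(totals)

# Hypothetical 10-second capture on a 100 Mb/s link.
capture = [("db-server", 50_000_000), ("file-server", 20_000_000),
           ("db-server", 5_000_000)]
totals = traffic_by_destination(capture)
total_bytes = sum(totals.values())
print(totals)
print(f"{utilisation_percent(total_bytes, 10, 100):.0f}% of link capacity")
```

A breakdown like this makes it easier to tell whether the database server, the file server, or unrelated traffic is consuming the link.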
CAPACITY PLANNING & ANALYSIS
Why Load Balancing?
- Provides resilience
  - All servers operate independently and in parallel. All but 1 server can fail
- Provides scalability
  - Multiple servers distribute CPU load
  - Multiple servers distribute I/O load
- Provides rapid response to unforeseen events
  - Quickly scale up by adding servers to the cluster when a site is CPU or bandwidth constrained
- Strategy must include scale-up of database resources for data-driven applications
Before you Begin
How will your applications balance?
- Load balancing can be accomplished through both hardware and software
  - Hardware methodology is most common, via VIP (virtual IP) round-robin allocation of requests
  - Software methodology uses Windows Network Load Balancing (NLB) or others
  - Virtual Switch software load balancing is available in VMware and others
  - Software load balancing with NLB is cheap, but may carry a performance hit of up to 5%
- Load balancing has two models
  - No Affinity: stateless; balance requests among all servers (by default: no session variables, no temporary files)
  - Class C (single affinity): allocate users to servers ("sticky sessions")
- No Affinity provides the greatest scalability
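The difference between the two models can be sketched in a few lines. This is a toy illustration, not the algorithm used by Windows NLB or by any hardware balancer: the class names are hypothetical, and mapping a client to a server by taking the IP address modulo the server count is an assumption made for the example.

```python
import ipaddress
from itertools import cycle

class RoundRobinBalancer:
    """No affinity: each request goes to the next server, whoever sent it."""
    def __init__(self, servers):
        self._next_server = cycle(list(servers))

    def route(self, client_ip):
        return next(self._next_server)

class AffinityBalancer:
    """Sticky sessions: a given client address always maps to the same server."""
    def __init__(self, servers):
        self._servers = list(servers)

    def route(self, client_ip):
        key = int(ipaddress.ip_address(client_ip))
        return self._servers[key % len(self._servers)]

servers = ["web1", "web2"]
round_robin = RoundRobinBalancer(servers)
sticky = AffinityBalancer(servers)

no_affinity_routes = [round_robin.route("10.0.0.5") for _ in range(4)]
sticky_routes = [sticky.route("10.0.0.5") for _ in range(4)]
print(no_affinity_routes)
print(sticky_routes)
```

With no affinity the same client is spread across all servers, which is why the application must not rely on session variables or temporary files held on one machine.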
Before you Begin
Are you confident the database and application are completely stable and ready to scale up?
- Debugging load-balanced applications is extremely difficult. The application must be thoroughly analysed prior to implementing the solution.
- Performance analysis must be performed before capacity planning is valid. Without it, you may spend money in the wrong places.
- Unforeseen application load can severely impact production resources.
- Plan your hardware and application resources wisely
A General Guide to Estimating User Load
Statistical Analysis of Active Users: A Guideline
- At 2 standard deviations (97% of the time), 50 regular Web mapping users have the potential to generate 2 or fewer concurrent 5-second map requests
- At 3 standard deviations (99.7% of the time), 50 regular Web mapping users have the potential to generate 5.4 or fewer concurrent 5-second map requests
- These statistics are based on customer Web log analysis. Your experience may differ.
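The slide does not state how its figures were derived from the Web logs. One plausible way to produce this kind of estimate from your own logs is sketched below: count the requests in flight at each second, then take the mean plus a chosen number of standard deviations. The function names and the sample request log are hypothetical.

```python
from statistics import mean, pstdev

def concurrency_samples(requests, horizon_seconds):
    """Count requests in flight at each whole second in [0, horizon_seconds).

    requests is a list of (start_second, duration_seconds) pairs.
    """
    return [
        sum(1 for start, duration in requests if start <= t < start + duration)
        for t in range(horizon_seconds)
    ]

def load_at_sigma(samples, k):
    """Concurrency level at k standard deviations above the mean."""
    return mean(samples) + k * pstdev(samples)

# Hypothetical log extract: three 5-second map requests in a 30-second window.
log = [(0, 5), (2, 5), (20, 5)]
samples = concurrency_samples(log, 30)

print("peak observed:", max(samples))
print("2 sigma:", round(load_at_sigma(samples, 2), 2))
print("3 sigma:", round(load_at_sigma(samples, 3), 2))
```

A real analysis would use a much longer log window and would check whether the concurrency distribution is close enough to normal for the standard-deviation reading to hold.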
Web Server Capacity Analysis / Planning
Customer Case Study
[Chart: Web Server Capacity Planning; axis: Response time (seconds)]
Note: values above will vary greatly depending on selected network, database, Web application and map content
The example above was created assuming all data is served from a separate, well-tuned Oracle 10g database.
WEB APPLICATION SECURITY
Web Application Security
Best Practice Configurations
#1. DO NOT MAKE YOURSELF A TARGET
- Choose META tags wisely
  - Do not insert META tags in your Web pages that attract unwanted attention
  - Do not insert META tags that identify server capabilities
- Use separate Web & database servers for internal & external apps.
- Use a separate development server for testing and debugging apps.
- Review servers, completely removing TEMP and DEBUG files
- Trap and disable all error reporting that reveals server capabilities
- Audit Web site activity and store logs in a secure location
- Educate developers on sound security coding practices
- Use application scanners to thwart malformed URLs
- Practice Security through Obscurity
  - Implement robots.txt to block indexing by search engines: http://www.robotstxt.org/robotstxt.html
  - Disable IIS identifying information
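A robots.txt that asks all crawlers to stay away from the whole site can be checked with Python's standard urllib.robotparser module, as sketched below. The crawler name and the maps.example.com URL are hypothetical. Note that robots.txt is only honoured by well-behaved crawlers; it keeps a site out of search indexes but does not restrict access.

```python
from urllib.robotparser import RobotFileParser

# Contents of a robots.txt that disallows every path for every crawler.
robots_txt = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Ask whether a (hypothetical) crawler may fetch a (hypothetical) map page.
allowed = parser.can_fetch(
    "ExampleBot", "http://maps.example.com/webmap/default.aspx"
)
print("crawler allowed:", allowed)
```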
Approaches to Security
Two Technology Approaches
1. Comprehensive Protection
  - OS, Network, Virus, URL, Application scanning
  - Intrusion Detection Systems
  - Highly restrictive, sometimes too restrictive
2. Rapid Recovery
  - Very fast backup / restore
  - VMware Checkpoint / Rollback
Both approaches require rigidly securing your servers
  - Implement a solid patch strategy
  - Limit applications & services
  - Regularly review all Web server logs
Ref. SANS Top 20 Internet Security Problems, Threats & Risks
Main Types of Threats
- SQL Injection can be used to completely compromise a data-driven Web site by compromising the database server
  - Unauthorised access to data, potentially with full DBA privileges
- OS Command Injection is typically accomplished via SQL Injection combined with unchecked SQL configurations
  - Unauthorised access to the OS, potentially with Administrator privileges
- Cross-Site Scripting (XSS) can be used to compromise the computers of everyone who visits your Web site
  - Allows the Web server to proxy attacks on other Web sites
  - Allows Web page spoofing to collect sensitive user information
- All vulnerabilities are exploited via parameter spoofing
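How a spoofed parameter leads to SQL Injection, and how a parameterised query blocks it, can be shown with a small sketch. SQLite is used only as a stand-in database, and the users table and the input string are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "admin"), ("bob", "viewer")])

# A spoofed request parameter that smuggles SQL into the query text.
malicious = "nobody' OR '1'='1"

# Vulnerable: the parameter is pasted into the SQL string, so the OR clause
# becomes part of the statement and every row is returned.
unsafe_sql = f"SELECT name FROM users WHERE name = '{malicious}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: the parameter is bound as data and is only ever compared as a string.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

print("string concatenation returned", len(unsafe_rows), "rows")
print("bound parameter returned", len(safe_rows), "rows")
```

Bound parameters are available in the client libraries of the enterprise databases discussed earlier, although the placeholder syntax differs between them.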
VIRTUALISATION
Virtualisation Pros & Cons
Pros
- Low hardware costs
- Rapid scaling up / down
- High versatility
- Simple, centralised server management
- Virtual Switch for fast server-to-server communications
- Resilience
- Centralised Backup
- Rapid Recovery
- Redundancy
Cons
- High license costs
- Diminished performance
- Greater complexity
- Performance impact by other virtual servers
- Limited Virtual Switch capabilities
- High risk point of failure
- Hot Backup consistency
Virtual Machines & Security
Gartner Survey says 60% of Virtualized servers will be less secure than the physical servers they replace...
http://www.darkreading.com/security/storage/showarticle.jhtml?articleid=223900133&subsection=storage+security
Best Practices don't change just because a server is virtual
- You still must patch it at the OS and application levels
- You still must patch its host at the OS and application levels
- You still must apply network and security measures
Failure to apply best practices and industry standards will result in security shortcomings and system vulnerability.
Colin Bertram
Senior Application Engineer
Geospatial
Security, Government & Infrastructure
Tel. +44 (0) 179 349 2780
colin.bertram@intergraph.com
Intergraph (UK) Ltd.
Delta Business Park
Swindon, SN5 7XP