Coveo Platform 7.0. Hardware and Software Recommendations




Notice

The content in this document represents the current view of Coveo as of the date of publication. Because Coveo continually responds to changing market conditions, information in this document is subject to change without notice. For the latest documentation, visit our website at www.coveo.com.

Copyright 2013, Coveo Solutions Inc. All rights reserved.

Coveo is a trademark of Coveo Solutions Inc. This document is protected by copyright and other intellectual property law and is subject to the confidentiality and other restrictions specified in the Coveo License Agreement.

Document part number: PM-130521-EN
Publication date: 10/17/2015

Table of Contents

1. Coveo Platform Hardware and Software Requirements
   1.1 Index of up to 5 Million Documents (Minimum Requirements)
   1.2 Index From 5 to 20 Million Documents
   1.3 Index From 20 to 40 Million Documents
   1.4 Index From 40 to 80 Million Documents
   1.5 Operating System Compatibility
   1.6 Index Size
   1.7 Non Index Files
   1.8 RAID Configuration
   1.9 Near Real-Time Indexing Disk
   1.10 Third-Party Software Requirement
   1.11 Relation Between CES Features and Hardware Resources
   1.12 RAID Type Comparison and Recommendations
   1.13 Virtualized Server Guidelines
      1.13.1 Guidelines
2. Coveo Scalability Model
   2.1 One Server Configurations
   2.2 Larger Number of Documents: Add a Slice
   2.3 More Queries: Add Mirror and Front-End Servers
   2.4 Heavy Document Conversion: Add Remote Converter Servers
   2.5 Indexes in Various Locations: Set Up a Geographically Distributed Index (GDI)
   2.6 About Index Slices
   2.7 About Mirror Servers
   2.8 About Geographically Distributed Indexing
      2.8.1 Setting up Geographically Distributed Indexing
      2.8.2 Setting up Geographically Distributed Indexing Using a Mirror
3. Planning Repositories to Index

1. Coveo Platform Hardware and Software Requirements

This topic presents the hardware, software, and operating system specifications for the server on which you install Coveo Enterprise Search (CES), for the various index size ranges that one Coveo instance can manage. The system specifications apply to back-end Coveo Master and Mirror servers.

Notes:
- Operate CES on a dedicated server. When other processes are running in parallel or when the query activity reaches peaks and becomes mission-critical, a server meeting the specified requirements may not be sufficient.
- Coveo products work best on physical machines, but also support virtual environments such as VMware (ESX), Microsoft Hyper-V, Amazon Web Services (AWS), and Microsoft Azure.
- Consider distributing the index over more than one Coveo instance when the number of documents to index exceeds the maximum index size presented in this topic (see "Coveo Scalability Model").
- Contact Coveo Support for assistance in selecting the best Coveo configuration for your environment.

1.1 Index of up to 5 Million Documents (Minimum Requirements)

Component                                   Minimum Requirement
Operating System                            Windows Server
CPU - Processors                            4 Core (1 x 4), 2.0 GHz or higher
RAM - Memory                                16 GB
Disk - OS and Program Files                 1 x 150 GB SATA 7.2/10 K RPM
Disk - Index (1 slice)                      1 x 300 GB SATA 7.2/10 K RPM
Disk - Near Real-Time Indexing (optional)   1 x 150 GB SSD or SATA 7.2/10 K RPM

Important: Ensure that your environment meets the above minimum requirements and follows the recommendations below before contacting Coveo Support to get help for a performance issue.

1.2 Index From 5 to 20 Million Documents

Component                                   Recommendation
Operating System                            Windows Server
CPU - Processors                            8 Core (1 x 8), 2.0 GHz or higher
RAM - Memory                                32 GB
Disks - OS and Program Files                2 x 150 GB SAS 10/15 K RPM, RAID 1
Disks - Index (1 slice)                     2 x 600 GB SAS 10/15 K RPM, RAID 1
Disks - Other CES Files                     2 x 300 GB SAS 10/15 K RPM, RAID 1
Disk - Near Real-Time Indexing (optional)   2 x 300 GB SSD or SATA 10 K RPM, RAID 1

1.3 Index From 20 to 40 Million Documents

Component                                   Recommendation
Operating System                            Windows Server
CPU - Processors                            16 Core (2 x 8) to 24 Core (2 x 12), 2.4 GHz or higher
RAM - Memory                                64 GB
Disks - OS and Program Files                2 x 150 GB SAS 10/15 K RPM, RAID 1
Disks - Index (1 slice)                     4 x 600 GB SAS 10/15 K RPM, RAID 10
Disks - Other CES Files                     2 x 600 GB SAS 10/15 K RPM, RAID 1
Disk - Near Real-Time Indexing (optional)   2 x 300 GB SSD or SATA 10 K RPM, RAID 1

1.4 Index From 40 to 80 Million Documents

Component                                   Recommendation
Operating System                            Windows Server
CPU - Processors                            24 Core (2 x 12) to 32 Core (4 x 8), 2.4 GHz or higher
RAM - Memory                                128 GB
Disks - OS and Program Files                2 x 150 GB SAS 10/15 K RPM, RAID 1
Disks - Index (slice 1)                     4 x 600 GB SAS 10/15 K RPM, RAID 10
Disks - Index (slice 2)                     4 x 600 GB SAS 10/15 K RPM, RAID 10
Disks - Other CES Files                     2 x 600 GB SAS 10/15 K RPM, RAID 1
Disk - Near Real-Time Indexing (optional)   2 x 300 GB SSD or SATA 10 K RPM, RAID 1
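
The four size ranges in sections 1.1 to 1.4 can be read as a lookup table. The Python sketch below is only an illustration of that lookup, not a Coveo tool: the names Tier, TIERS, and recommended_tier are hypothetical, and treating each upper bound as inclusive is an assumption, since the guide does not say which range a boundary value such as exactly 5 million documents belongs to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """One size range from sections 1.1 to 1.4 (hypothetical helper type)."""
    max_documents: int   # upper bound of the range (assumed inclusive)
    cpu: str             # CPU recommendation as worded in the guide
    ram_gb: int          # recommended physical memory
    index_slices: int    # number of index slices in the disk layout


# Values copied from the tables in sections 1.1 to 1.4.
TIERS = (
    Tier(5_000_000, "4 Core (1 x 4), 2.0 GHz or higher", 16, 1),
    Tier(20_000_000, "8 Core (1 x 8), 2.0 GHz or higher", 32, 1),
    Tier(40_000_000, "16 Core (2 x 8) to 24 Core (2 x 12), 2.4 GHz or higher", 64, 1),
    Tier(80_000_000, "24 Core (2 x 12) to 32 Core (4 x 8), 2.4 GHz or higher", 128, 2),
)


def recommended_tier(documents: int) -> Tier:
    """Return the first tier whose range covers the given document count."""
    if documents < 0:
        raise ValueError("document count cannot be negative")
    for tier in TIERS:
        if documents <= tier.max_documents:
            return tier
    # Beyond 80 million documents the guide points to the scalability model.
    raise ValueError("index too large for one Coveo instance; see the Coveo Scalability Model")


if __name__ == "__main__":
    tier = recommended_tier(12_000_000)
    print(tier.cpu, "/", tier.ram_gb, "GB RAM /", tier.index_slices, "slice(s)")
```

A lookup like this only selects a starting point. The notes at the top of this topic still apply, for example when other processes share the server.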

1.5 Operating System Compatibility

The servers on which CES or Coveo .NET Front-End run must use one of the following operating systems:
- Windows Server 2012 R2 (with IIS 8) x64
  - CES 7.0.6196+ (November 2013)
  - Coveo .NET Front-End 12.0.446+ (November 2013)
- Windows Server 2012 (with IIS 8) x64
  - CES 7.0.4897+ (December 2012)
  - Coveo .NET Front-End 12.0.61+ (December 2012)
- Windows Server 2008 R2 (with IIS 7) x64
- Windows Server 2008 (with IIS 7)

Important: The x64 Windows OS version is required for indexes with more than 500 K documents.

Notes:
- CES can operate on non-server versions of Windows operating systems. However, for production purposes, Coveo only supports and recommends Windows Server operating systems to prevent performance, stability, and scalability issues.
- For evaluation purposes only, CES can run under:
  - Windows 8.1 Pro or Enterprise (with IIS 8)
    - CES 7.0.6196+ (November 2013)
    - Coveo .NET Front-End 12.0.446+ (November 2013)
  - Windows 8 Pro or Enterprise (with IIS 8)
    - CES 7.0.4897+ (December 2012)
    - Coveo .NET Front-End 12.0.61+ (December 2012)
  - Windows 7 Professional, Enterprise, or Ultimate (with IIS 7)

1.6 Index Size

The index typically occupies 30% to 50% of the total size of the original documents.

Example: You index documents that occupy 1 TB in various repositories. With your mix of content types, the index size ends up at 42% of the original documents size (420 GB). The size of the dedicated index hard disk should be at least 500 GB.

- Dedicate a disk or disk set to the index files.
- Use local disks, direct-attached storage (DAS), or a storage area network (SAN). Network-attached storage (NAS) as well as server message block (SMB) and other file shares are not supported.
- When a second index slice is required, you must install each slice on separate disk sets.
- RAID 0 and RAID 5 are not recommended.

Note: The index automatically switches to read-only mode to prevent errors when the index disk free space reaches a minimum of 5 GB.
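
The sizing rule in section 1.6 is simple arithmetic, and a short Python sketch can make it reusable. The function names below are hypothetical, sizes are taken in decimal gigabytes (1 TB = 1000 GB, as in the example above), and the free-space check only mirrors the 5 GB read-only threshold from the note; it is not how CES itself measures disk space.

```python
READ_ONLY_THRESHOLD_GB = 5  # free space at which the index switches to read-only


def index_size_gb(source_size_gb: float, index_percent: float) -> float:
    """Index size for a given ratio of index size to original document size."""
    if not 0 < index_percent <= 100:
        raise ValueError("index_percent must be in (0, 100]")
    return source_size_gb * index_percent / 100


def typical_index_range_gb(source_size_gb: float) -> tuple[float, float]:
    """Typical 30% to 50% range quoted in section 1.6."""
    return index_size_gb(source_size_gb, 30), index_size_gb(source_size_gb, 50)


def stays_writable(index_gb: float, disk_gb: float) -> bool:
    """True when the remaining free space is above the read-only threshold."""
    return disk_gb - index_gb > READ_ONLY_THRESHOLD_GB


if __name__ == "__main__":
    size = index_size_gb(1000, 42)           # the 1 TB example at 42%
    print(size, typical_index_range_gb(1000), stays_writable(size, 500))
```

Sizing against the upper end of the 30% to 50% range, rather than a measured ratio, leaves room for growth when the content mix is not yet known.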

1.7 Non Index Files

For index sizes with more than 5 million documents, it is recommended to store non index CES files on a dedicated set of disks to separate inputs/outputs for these files from those for index slices. The non index files are:
- Log files
- Default slice files
- Converter files
- Connector files
- Cache files
- Configuration files
- Certificate store files

Note: You will specify the path for these files following the CES installation when you create the index.

1.8 RAID Configuration

It is recommended to create a RAID volume (a single accessible storage area) for each of the three categories of files:
- OS and program files
- Index slice files
- Other Coveo Enterprise Search (CES) files

One RAID controller can be used for all logical volumes as long as it supports the number of disks in your server.
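
To see what the three-volume layout yields in practice, the Python sketch below computes usable capacity for the disks recommended in section 1.3 (20 to 40 million documents). It relies only on the standard properties of RAID 1 (all disks hold the same data) and RAID 10 (a stripe of two-disk mirrors); the function name and the VOLUMES list are hypothetical, and real usable space is somewhat lower once formatting overhead is counted.

```python
def usable_capacity_gb(raid_level: str, disk_count: int, disk_size_gb: int) -> int:
    """Usable capacity of a RAID 1 or RAID 10 volume built from identical disks."""
    if raid_level == "RAID 1":
        if disk_count < 2:
            raise ValueError("RAID 1 needs at least two disks")
        return disk_size_gb  # every disk stores a full copy
    if raid_level == "RAID 10":
        if disk_count < 4 or disk_count % 2 != 0:
            raise ValueError("RAID 10 needs an even number of disks, four or more")
        return (disk_count // 2) * disk_size_gb  # half the disks hold mirror copies
    raise ValueError(f"unsupported RAID level: {raid_level}")


# Disk sets copied from the table in section 1.3: (volume, level, disks, size in GB).
VOLUMES = [
    ("OS and program files", "RAID 1", 2, 150),
    ("Index (1 slice)", "RAID 10", 4, 600),
    ("Other CES files", "RAID 1", 2, 600),
]

if __name__ == "__main__":
    for name, level, count, size in VOLUMES:
        usable = usable_capacity_gb(level, count, size)
        print(f"{name}: {count} x {size} GB in {level} gives {usable} GB usable")
    print("Total disks:", sum(count for _, _, count, _ in VOLUMES))
```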

Example: A typical Coveo server hard disk RAID configuration for a 20-40 M document index is made of 8 disks organized in three RAID volumes (C:, D:, and E: in this example).

1.9 Near Real-Time Indexing Disk

The Near Real-Time Indexing (NRTI) feature makes new documents searchable significantly faster for indexes with two million documents or more. Because NRTI is I/O intensive, to take full advantage of this feature it is recommended to add an NRTI dedicated disk to your Coveo Master and Mirror servers that are serving queries and to configure NRTI to use the dedicated disk. The recommended disk specifications depend on the size of your index, as shown in the first sub-sections of this topic.

1.10 Third-Party Software Requirement

The CES installer adds the following required software elements when not already installed:
- Microsoft .NET Framework 3.5 SP1 and 4.5.2 (side by side)
- Internet Information Services (IIS) 8 or 7
- Microsoft Chart Controls for Dotnet Framework 3.5 SP1
- Microsoft Visual C++ 2010 Redistributable Package (x64)
  Note: Microsoft Visual C++ 2012 Redistributable Package (x64) is installed through the CES installer, which requires Microsoft Visual C++ 2010.
- MSXML 6

1.11 Relation Between CES Features and Hardware Resources

CES involves several processes that are running concurrently on one or more servers. Depending on the size of your Coveo installation, the activated features, and the phases of operation, these processes consume various levels of hardware resources. The following list describes Coveo processes, components, or features that affect the hardware resources.

CPU
- Document conversion is entirely done in parallel. The greater the number of CPU cores, the better.
- Querying requires several CPU cores to perform as many steps as possible in parallel.
- Queries with numerous terms, exact match operators, the NEAR operator, or wildcard characters can take a significant amount of CPU resources.
Note: You can configure the relative priority of the main, indexing, and crawling processes as well as specify the number of query threads.

Physical memory (RAM)
- Indexing uses a lot of physical memory to pre-compute mappings from terms to identifiers. More memory is better.
- Querying requires a good amount of physical memory for caches.
- Document conversion typically loads documents in physical memory.
- Numerous numerical fields often need to be kept in physical memory to achieve good query performance.
- Facet fields require an amount of physical memory directly proportional to the number of facet values to be cached. Not having enough memory to cache facets is not an option, as query performance would degrade significantly.
- String or numerical sorting fields have to be set up to be loaded in physical memory. For string sorting fields, the number of field items (cardinality) dictates how much memory is needed: the higher it is, the more memory it takes. For numerical sorting fields, cardinality doesn't matter; only the number of documents in the index does.

Hard disk
- Indexing is disk I/O intensive. Upgrading the disk subsystem has the most impact for better performance.
- Querying is a process requiring a fast disk subsystem.
- Adding many string fields affects the disk subsystem, because it adds a lot of new terms to the index.
- Facet fields are easier to index when the number of different facet values (cardinality) is low. The higher the cardinality, the higher the stress on the disk subsystem.
- String sorting fields put more stress on the disk subsystem than numerical fields.
- Document summarization produces a concept list and summary sentences that are added to the index.
- Document conversion accesses the hard disk only for very large documents.

1.12 RAID Type Comparison and Recommendations

A redundant array of independent disks (RAID) is a technology that can be used to provide increased storage functions and reliability through redundancy. The following table provides a brief description of the common available RAID types and a usage recommendation for CES.

RAID 0
  Brief description: Data striping without redundancy. Single disk failure destroys the array. High performance.
  Recommendation: Not recommended since it is not fault tolerant

RAID 1
  Brief description: Mirroring. Provides better read performance than a single disk. Is fault tolerant.
  Recommendation: Recommended for a single slice

RAID 10
  Brief description: Hybrid (or nested) RAID that is a stripe of mirrors (a RAID 0 of RAID 1). Performance is high for reads and writes. Is fault tolerant, as long as a mirror does not lose all its disks.
  Recommendation: Recommended for two index slices

RAID 0+1
  Brief description: Offers slightly better performance but is slightly less fault tolerant than RAID 10.
  Recommendation: Recommended

RAID 5
  Brief description: Data striping with block level parity. Requires all drives but one to operate. Drive failure requires replacement. Read performance is adequate, but write performance is too low to be used in an indexing context.
  Recommendation: Not recommended

RAID 1 x n
  Brief description: A set of mirrors, one for each slice, when two slices are needed. Write and read performance is better because of the way CES evenly splits I/O operations between the RAID arrays. Basically, CES does the load balancing instead of the controller. More complex configuration.
  Recommendation: Recommended to achieve better performance with the same number of drives

1.13 Virtualized Server Guidelines

Coveo products can operate on virtual machines as they do on real hardware. As for real hardware, the key to a successful Coveo deployment on virtualized hardware is to respect the Coveo Platform requirements for the size of your index (see "Coveo Platform Hardware and Software Requirements"). Virtualized environments can vary greatly from one implementation to another, so it is not appropriate to state specific virtual hardware requirements.

Virtualized environments optimize the usage of real hardware resources by sharing them among several virtual servers, which goes against the ideal setup where required resources are dedicated to a server. Like many processes, Coveo processes such as indexing documents or serving queries have varying server resource loads over time (see "Relation Between CES Features and Hardware Resources").

You or some of your colleagues are experts on your hypervisor and virtual environment. You have the responsibility to ensure that your Coveo virtual server implementation maximizes the chances that required resources will be available when they are needed.

1.13.1 Guidelines

- Dedicate a virtual machine (VM) respecting the Coveo Platform requirements for your index size for each Coveo server (see "Coveo Platform Hardware and Software Requirements").
- Minimize overcommitment of CPU and memory resources on a host where a virtual Coveo server is running.
- For large indexes, when your virtual environment does not allow you to create a virtual server that respects the requirements (see "Index From 40 to 80 Million Documents"), consider the following options:
  - Use geographically distributed indexing (GDI) to split the index in two or more CES instances, each on its dedicated virtual server (see "About Geographically Distributed Indexing").
  - Commission a dedicated hardware Coveo server meeting the requirements.

Disk management

Many Coveo processes are disk I/O intensive. The Coveo server requirements specify using separate dedicated disks for specific Coveo server process categories (operating system and programs, index, other Coveo files, and near real-time indexing) to optimize disk I/O performance and minimize interference. In a virtual environment, the pool of available storage resources is not only shared among various processes of one server, but also with processes from several other servers. Performance issues with virtual Coveo servers are often linked to poor disk performance.

Example: The Coveo server VM shares a disk resource with other host VMs, including a VM on which a large repository is hosted. The disk resource is able to respond to the average traffic. However, when the Coveo server indexes the large repository, the disk resource throughput quickly reaches its limit because both the Coveo and the repository servers make significantly more disk I/Os, respectively to index the content and to respond to the Coveo crawler. The performance of both systems (and any other host VMs sharing the same disk resource) drops significantly while indexing takes place.

You are the expert with your hypervisor and virtual environment:
- Avoid sharing the same storage resources between a Coveo server and a repository that is indexed by Coveo.
- Preferably use disk resources from a low latency storage area network (SAN).
- Attach available virtual disk resources (such as logical unit number [LUN] storage) that best match the requirements for your index size.

Distribute Coveo intensive process schedules in time when shared resources are most available

- When you index more than one repository, avoid starting all source refreshes at the same time. Define source schedules that distribute source refreshes in time during off-peak hours.

  Example: The default Every day source schedule starts every day at midnight. If all your daily refreshed sources use this schedule, they all start at midnight, potentially overloading your shared resources. Instead, create source specific (like Repository1 daily, Repository2 daily,...) or time specific (like Daily at 2:00 AM, Weekdays at 3:00 AM, Saturdays at 9:00 PM...) source schedules that you can assign to your sources to distribute the processes over the off-peak period.
- Avoid scheduling Coveo intensive processes at the same time as other intensive processes (such as backups) from other systems.
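
The scheduling advice above amounts to spreading start times across an off-peak window. The Python sketch below shows one way to work out such start times. It is a planning aid only: it does not create schedules in CES, the function name is hypothetical, and the 10:00 PM to 4:00 AM window and the repository names are assumptions chosen for illustration.

```python
MINUTES_PER_DAY = 24 * 60


def staggered_start_times(sources: list[str], window_start_hour: int,
                          window_hours: int) -> dict[str, str]:
    """Spread one start time per source evenly across an off-peak window.

    The window may cross midnight; times are returned as 24-hour "HH:MM" strings.
    """
    if not sources:
        return {}
    step = window_hours * 60 // len(sources)  # minutes between two starts
    times = {}
    for position, name in enumerate(sources):
        minutes = (window_start_hour * 60 + position * step) % MINUTES_PER_DAY
        times[name] = f"{minutes // 60:02d}:{minutes % 60:02d}"
    return times


if __name__ == "__main__":
    # Assumed off-peak window: six hours starting at 22:00.
    plan = staggered_start_times(["Repository1", "Repository2", "Repository3"], 22, 6)
    for source, start in plan.items():
        print(f"{source} daily: starts at {start}")
```

Even spacing is the simplest choice. If one repository takes much longer to refresh than the others, giving it a larger share of the window would be a reasonable refinement.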

2. Coveo Scalability Model

The Coveo Platform implementation can be easily scaled to serve the search needs of various enterprise sizes, from a single department to a large global international organization. The Coveo scalability model lets you operate with a single Coveo server, with one Coveo instance composed of two or more Coveo servers, or with two or more inter-connected Coveo instances.

Note: Additional licensing is required for Coveo instances configured with more than one server.

2.1 One Server Configurations

In its simplest form, a Coveo instance is entirely hosted on one server that performs all the processes (crawling repositories, converting documents, hosting the index, hosting the search interface website, handling query requests, and returning query results).

[Figure legend: Coveo Master server. The default slice is always included in the Master server.]

2.2 Larger Number of Documents: Add a Slice

When the number of documents to index increases, hosting the index on a single hard disk leads to size and performance limitations. On one Coveo server, the index can be divided in up to two slices, each on a separate disk (see "About Index Slices").

[Figure legend: Coveo Master server. The default slice is always included in the Master server. Extra slice.]

When the index size exceeds the capacity of one server, you can create other Coveo instances and federate search results (see "About Geographically Distributed Indexing").

2.3 More Queries: Add Mirror and Front-End Servers

When one Coveo server handles all the CES processes and the rate of queries increases, the users may eventually feel that the system is slower, typically when the results are no longer returned within a second. Adding one or more Coveo Mirror servers allows supporting significantly more queries while maintaining sub-second response time 90% of the time. A Mirror server holds a copy of the master index and continuously receives updates from the Master server (see "About Mirror Servers").

When IIS on the Master server is overloaded and cannot adequately serve search interfaces, one or more Coveo Front-End servers can also be added to distribute the website hosting of search interfaces and the handling of search queries.

The Master server and the Mirror servers are typically set up in a network load-balancing (NLB) configuration to provide optimized service availability and failover capability. For the same reasons, Front-End servers can also be set up in a separate NLB cluster.

[Figure legend: Two Coveo Front-End servers in a network load-balanced cluster. The Coveo Master server with two Mirror servers in another network load-balanced cluster. Coveo Master server with up to two slices. First Coveo Mirror server with copies of the Master server slices. Second Coveo Mirror server with copies of the Master server slices.]

2.4 Heavy Document Conversion: Add Remote Converter Servers

When document conversion requires significant server resources, the document conversion process can be distributed to one or more Coveo Remote Converter servers to free the Coveo Master server from this task. This is useful, for example, when converting numerous documents involving the CPU and physical memory intensive optical character recognition (OCR) module.

[Figure legend: Coveo Master server. First Coveo Remote Converter server for normal documents. Second Coveo Remote Converter server for OCR conversion.]

2.5 Indexes in Various Locations: Set Up a Geographically Distributed Index (GDI)

Within an organization, separate Coveo instances may be distributed in different departments, buildings, cities, or even continents. Multiple Coveo instances can be set up to form a geographically distributed index (GDI). Queries entered in a search interface of one Coveo instance return results gathered and ranked from two or more Coveo instances (see "About Geographically Distributed Indexing").

[Figure legend: Five server Coveo instance of department A in the American offices. Three server Coveo instance for department B in the American offices. Single server Coveo instance for the Europe offices.]

2.6 About Index Slices

An index slice is a separate physical storage location for a part of the master index. The purpose of a slice is to distribute the master index content to increase the available space and to speed up the indexing process.

The Master server is always created with one slice named default. Adding a slice is necessary when the size of the disk limits the number of documents that can be indexed or when its performance affects the query response time. A slice must be added on separate disks on the Master server.

Note: When all slices on your Master server are getting full, consider federating search on two or more Coveo instances (see "About Geographically Distributed Indexing").

Index slices facts
- A slice can typically contain up to 40 million documents. One Coveo server can typically host up to two slices and contain up to 80 million documents. These numbers can however vary depending on a number of factors.
- Each slice should be created on a separate dedicated disk set. Slice files can also be hosted on a storage area network (SAN).
- Slices are filled in a distributed fashion so they grow with the same number of documents.
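
One way to picture slices that "grow with the same number of documents" is a least-filled-slice-first rule. The Python sketch below models that rule. It is an illustrative assumption, not a description of the actual CES distribution algorithm, and the function and slice names are hypothetical.

```python
def fill_slices(slices: dict[str, int], new_documents: int) -> dict[str, int]:
    """Assign each new document to the slice that currently holds the fewest.

    Returns a new mapping of slice name to document count; the input is not modified.
    """
    counts = dict(slices)
    for _ in range(new_documents):
        emptiest = min(counts, key=counts.get)  # ties go to the first slice listed
        counts[emptiest] += 1
    return counts


if __name__ == "__main__":
    # Toy numbers: a default slice holding 10 documents and a freshly added empty slice.
    print(fill_slices({"default": 10, "slice2": 0}, 10))
    print(fill_slices({"default": 10, "slice2": 0}, 14))
```

Under this rule a newly added slice receives every new document until it catches up with the existing one, after which both grow at the same pace.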

Note: When you add one slice because the default slice is getting close to the limit, new documents will be added only to the new slice until it reaches the number of documents contained in the default slice. The documents are then distributed evenly in the two slices.

2.7 About Mirror Servers

A Coveo Mirror server hosts a copy of the master index located on the Coveo Master server. A Mirror server allows distributing queries between servers to speed up the querying process and to provide failover capability.

The process to add a Mirror server to a Coveo instance involves the following steps:
1. Installing the Coveo Mirror components on a dedicated server.
2. Configuring the Coveo Master server to recognize and synchronize its index with the remote Mirror server.
3. Configuring one or more Front-End servers to send queries to the new Mirror server and to other available Back-End servers.

Mirror server facts
- The Coveo Master server hosts the master index. In the Administration Tool, the Master index is named the Default mirror.
- A Mirror server contains a copy of the master index, and therefore also duplicates the slice configuration of the Master server.
- A Master server sends index changes to the Mirror servers following its update schedules. The Coveo administrator can also schedule Mirror server updates, for example to off-peak hours.
- One Mirror server exclusively serving queries can typically respond to 25 queries per second (QPS), which can be adequate for several thousand concurrent users.
- Mirror servers provide a failover capability when the Master server and one or more Mirrors are configured in a network load-balancing (NLB) cluster. As long as one server is up and running, the system can return results to respond to incoming queries.

Consider adding one or more Mirror servers:
- When the query response time increases notably (above one second) during peak usage because the Master server is busy with the multiple Coveo processes.
- When you want to have query serving failover capability.
- When you want to completely free the Master server from the query serving task.

Tip: You can add two or more Mirror servers in an NLB cluster, excluding the Master server from the cluster, and send all queries from the Front-End servers to the NLB cluster address.
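
The figure of 25 queries per second per dedicated Mirror server gives a rough way to estimate how many Mirror servers a target query rate calls for. The Python sketch below does that arithmetic. The function name is hypothetical, the optional spare server for failover is an assumption rather than a Coveo rule, and 25 QPS is a typical value that can differ in a given environment.

```python
import math

TYPICAL_QPS_PER_MIRROR = 25  # typical rate for a Mirror server exclusively serving queries


def mirrors_needed(peak_qps: float, qps_per_mirror: float = TYPICAL_QPS_PER_MIRROR,
                   spare_servers: int = 0) -> int:
    """Estimate the number of Mirror servers for a peak query rate.

    spare_servers adds headroom so the cluster still meets the target
    rate when that many servers are down (an illustrative assumption).
    """
    if peak_qps < 0 or qps_per_mirror <= 0 or spare_servers < 0:
        raise ValueError("rates must be positive and spare_servers non-negative")
    return math.ceil(peak_qps / qps_per_mirror) + spare_servers


if __name__ == "__main__":
    print(mirrors_needed(60))                    # peak of 60 QPS, no spare
    print(mirrors_needed(60, spare_servers=1))   # same peak, one spare for failover
```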

[Figure legend: A network load-balanced (NLB) cluster including the Master server and two Mirror servers. The Coveo Master server. Up to two index slices of the Master server. A first Coveo Mirror server with its duplicate Master server index slices. A second Coveo Mirror server with its duplicate Master server index slices.]

2.8 About Geographically Distributed Indexing

Geographically Distributed Indexing (GDI) is a federated search feature which enables the communication between different Coveo instances. Its purpose is to deliver high search performance and availability, connecting unified indexes from offices distributed in different departments, sites, cities, or even countries around the world.

With GDI, a user makes a single query request in the search interface of the local Coveo instance. The query is distributed to the remote indexes participating in the federation. The local Coveo instance receives, merges, and ranks results from local and remote indexes before returning them to the user.

The Coveo scalability model supports two GDI configurations that are transparent to end-users.

Query federated to the remote index

In this configuration, the local Coveo instance sends federated queries to the remote indexes over the WAN (see "Setting up Geographically Distributed Indexing").

This configuration is very simple to set up and has negligible impact on the WAN bandwidth. The query performance may however be affected by the time required for the round trip to the remote Coveo instance.

Example: Users working in the Palo Alto and Chicago offices must be able to search content from the Coveo instance in the other office. In the Palo Alto office, you configure the Coveo instance to accept queries from the Coveo instance in the Chicago office. In the Chicago office, you configure the Coveo instance to accept queries from the Coveo instance in the Palo Alto office.

Query federated to a local mirror of a remote index

In this configuration, you locally install a Coveo Mirror server of the remote index. You configure the remote Coveo instance to synchronize its index with this Mirror server over the WAN. The local Coveo instance sends federated queries to the local mirror of the remote index (see "Setting up Geographically Distributed Indexing Using a Mirror").

Example: In the Palo Alto office, you deploy a Coveo Mirror server of the Chicago office index. You configure the Chicago Coveo instance to synchronize its index with the Palo Alto Mirror server. You configure the Palo Alto Coveo instance to use the Mirror server as the Chicago remote index. In the Chicago office, you can do the same thing for the Palo Alto office.
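
The "receives, merges, and ranks" step described in this section can be pictured as merging several already-ranked result lists into one. The Python sketch below shows that idea with the standard library. It is not the CES implementation: the Result type, the score field, and the sample data are hypothetical, and it assumes that scores from different instances are directly comparable, which the guide does not state.

```python
import heapq
import itertools
from typing import NamedTuple


class Result(NamedTuple):
    """A ranked search result (hypothetical shape used for illustration)."""
    score: float   # higher means more relevant
    title: str
    instance: str  # Coveo instance that returned the result


def merge_ranked(*result_lists: list[Result], limit: int = 10) -> list[Result]:
    """Merge result lists that are each sorted by descending score.

    heapq.merge streams the inputs, so only the first `limit` results are produced.
    """
    merged = heapq.merge(*result_lists, key=lambda result: result.score, reverse=True)
    return list(itertools.islice(merged, limit))


if __name__ == "__main__":
    local = [Result(0.9, "Design spec", "Palo Alto"), Result(0.4, "Meeting notes", "Palo Alto")]
    remote = [Result(0.7, "Sales report", "Chicago"), Result(0.2, "Old memo", "Chicago")]
    for result in merge_ranked(local, remote, limit=3):
        print(f"{result.score:.1f}  {result.title}  ({result.instance})")
```

In the first GDI configuration the remote list would arrive over the WAN, and in the second it would come from the local mirror of the remote index. The merge step is the same in both cases.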

2.8.1 Setting up Gegraphically Distributed Indexing Yu can easily create a ne way intercnnectin between a lcal and a remte Cve instance t create a gegraphically distributed index (GDI) using the CES remte index features (see "Abut Gegraphically Distributed Indexing" n page xviii). The cnfiguratin cnsists in enabling the remte Cve instance t accept remte queries, add the remte index t the lcal Cve instance, and cnfigure a lcal search interface t include results frm the remte index. Users frm the lcal Cve instance can then search cntent in the remte index. Nte: Yu can create a tw-way intercnnectin between Cve instances by repeating the fllwing prcedure fr the ther directin. T set up a gegraphically distributed index 1. Ensure that the Cve instances that yu want t intercnnect meet the fllwing requirements: All Cve instances f the gegraphically distributed indexing setup must reside n the same dmain. The CES search applicatin pl fr the lcal Cve instance must run under a dmain accunt. It is a best practice t create a dedicated accunt fr this purpse with a strng passwrd that never changes. 2. On the Cve Master server f the lcal Cve instance: a. Add the remte index. b. Cnfigure a scpe that includes the remte index. 3. On the Cve Frnt-End server f the lcal Cve instance, assign the scpe that includes the remte index t the desired.net search interface. 4. Using the.net search interface fr which yu mdified the scpe, perfrm queries t validate that cntent frm the remte index is returned. Tip: If needed, n the Cve Master server f the remte Cve instance, pen the CES Cnsle t validate that the remte index receives the queries sent frm the lcal.net search interface. 2.8.2 Setting up Gegraphically Distributed Indexing Using a Mirrr Yu can set up gegraphically distributed indexing using a lcal mirrr f a remte index (see "Abut Gegraphically Distributed Indexing" n page xviii). 
Note: You can create a two-way interconnection between Coveo instances by repeating the following procedure for the other direction.

To set up a geographically distributed index using a mirror

1. Ensure that the Coveo instances that you want to interconnect meet the following requirements:
   - All Coveo instances of the geographically distributed indexing setup must reside on the same domain.
   - The CES search application pool for the local Coveo instance must run under a domain account. It is a best practice to create a dedicated account for this purpose with a strong password that never changes.
2. Physically install a local server and install the Coveo Mirror software components without the web interfaces.
3. For the remote Coveo instance, configure the Master server to recognize and synchronize its index with your new local Mirror server.
4. On the Coveo Master server of the local Coveo instance:
   a. Add the remote index.
   b. Configure a scope that includes the remote index.
5. On the Coveo Front-End server of the local Coveo instance, assign the scope that includes the remote index to the desired .NET search interface.
6. Using the .NET search interface for which you modified the scope, perform queries to validate that content from the remote index is returned.

Tip: If needed, on the Coveo Master server of the remote Coveo instance, open the CES Console to validate that the remote index receives the queries sent from the local .NET search interface.
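The requirements in step 1 can be checked against your own server inventory before you start. The following is a minimal sketch, not a Coveo tool: the `CoveoInstance` record, the `check_gdi_prerequisites` function, the domain names, and the account names are all hypothetical, and the test for a domain account is a simple naming heuristic.

```python
from dataclasses import dataclass


@dataclass
class CoveoInstance:
    """Hypothetical inventory record; not part of any Coveo API."""
    name: str
    domain: str
    app_pool_account: str  # account running the CES search application pool


def check_gdi_prerequisites(local, remotes):
    """Return a list of problems found against the step 1 requirements."""
    problems = []
    for remote in remotes:
        if remote.domain.lower() != local.domain.lower():
            problems.append(
                f"{remote.name} is in domain {remote.domain}, "
                f"but {local.name} is in {local.domain}"
            )
    # Assumption: domain accounts are recorded as DOMAIN\user, so an entry
    # without a backslash is treated as a non-domain (built-in) identity.
    if "\\" not in local.app_pool_account:
        problems.append(
            f"{local.name}: application pool account "
            f"{local.app_pool_account!r} does not look like a domain account"
        )
    return problems


palo_alto = CoveoInstance("PaloAlto", "corp.example.com", "CORP\\svc-coveo-search")
chicago = CoveoInstance("Chicago", "corp.example.com", "CORP\\svc-coveo-search")

for problem in check_gdi_prerequisites(palo_alto, [chicago]):
    print(problem)
```

An empty result only means that the recorded inventory is consistent with the two requirements; it does not verify the actual domain membership or application pool identity on the servers.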

3. Planning Repositories to Index

Bringing structured and unstructured data from multiple repositories into one unified index is the great benefit of the Coveo Platform. When a unified index is available, the Coveo solutions allow you to search, consolidate, correlate, and analyze information from emails, knowledge base documents, customer relationship management (CRM) systems, database entries, people information, etc.

One of the basic tasks when planning a Coveo installation is to identify all the repositories that you want to index within your organization. Coveo Enterprise Search (CES) can index many types of repositories and supports many specific systems.

Analyze the content of each repository:

- Estimate the number of documents in the repository.
- Estimate the total size of the original documents.
- List the main types of repository documents.
  Example: Microsoft Office, PDF, text, HTML, email, database records
- Identify if some content requires special conversion tools.
  Example: Text extraction using optical character recognition (OCR) in images.
- For email repositories and desktops, estimate the number of users.
- Estimate the yearly growth for:
  - Number of documents
  - Number of users

Note the values for each repository in a table similar to the following one and contact Coveo Support to help you plan your Coveo installation.

Repository                                      Documents                                    Users
Type                Example                     Number  Total size (MB)  Annual growth (%)   Number  Annual growth (%)
Email               Microsoft Exchange
                    Lotus Notes
                    Enterprise Vault
Web pages           Website
                    Extranet
File share          Network drives
Local files         Desktops
                    Laptops
People information  Microsoft Active Directory
Database            Microsoft SQL Server
CMS                 Microsoft SharePoint
                    Sitecore
CRM                 Salesforce
Wiki                Confluence
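Once the table is filled in, the annual growth percentages can be turned into projected totals. The sketch below assumes that growth compounds yearly, which is a modeling choice and not something the planning table prescribes; the `project` function and the sample figures are hypothetical.

```python
def project(current_value, annual_growth_pct, years):
    """Project a value forward, assuming growth compounds once per year."""
    return current_value * (1 + annual_growth_pct / 100) ** years


# Hypothetical file share row: 1,000,000 documents totaling 500,000 MB,
# both growing 20% per year, projected over 3 years.
documents = project(1_000_000, 20, 3)
size_mb = project(500_000, 20, 3)

print(f"Projected documents:  {round(documents):,}")
print(f"Projected size (MB):  {round(size_mb):,}")
```

Running the projection for each row gives the document counts, sizes, and user numbers to plan for over the chosen horizon, rather than only the current values.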
