Technical best practices for Xythos deployment
Last Updated: June 27, 2005
Copyright 2005 Xythos Software
TABLE OF CONTENTS

1 INTRODUCTION
2 TECHNICAL ARCHITECTURE
2.1 DOCUMENT STORES / FILE STORAGE
2.2 GLOBAL SCHEMA
2.3 WEB SERVER
2.4 LOAD BALANCER
2.5 LDAP INTEGRATION
    User Account Creation
    LDAP Groups / File and Directory Permissions
    LDAP Fail-Over
2.6 FILE CONTENT SEARCHING
2.7 SCALING
    Application Tier Scaling
    Storage Tier Scaling
    Global Schema Scaling
2.8 BACKUP AND RECOVERY
2.9 CLUSTERING, FAIL-OVER AND HIGH AVAILABILITY
3 PLANNING XYTHOS DEPLOYMENT
3.1 SYSTEM SIZING
    External Storage
    File Content Search Index
    Database Size
3.2 HARDWARE REQUIREMENTS
    Storage Server
    Load Balancer
    Web Server
3.3 XYTHOS ADMINISTRATORS
    Super Users
3.4 USER AND GROUP REPOSITORY
    Regular User Model
    LDAP User Model
3.5 PLANNING BACKUP AND RECOVERY, HIGH AVAILABILITY, FAIL-OVER
4 INSTALLATION
    Multiple Server Install
    Database Setup
4.1 INSTALLXYTHOS
4.2 XYTHOS ADMIN CONFIGURATION
    LDAP Configuration
    New User Templates
    Language and Timezones
    Security
    Storage / Document Stores
    Logging
    Virtual Servers
4.3 APPLICATION SERVER CONFIGURATION
    Running as Service
    SSL
    Xythos Admin Security
    Heap size
4.4 CLUSTERING THE XYTHOS SERVER
    Global Schema DB
    Intra-Server Communication
    Load Balancers and Xythos Sessions
    App Server Configuration
    Install Instructions for Second Xythos Server
5 CONFIGURE CONTENT REPOSITORY
    Directory Structure
    Document Classifications
6 DATA IMPORT
6.1 CUSTOM MIGRATION TOOLS
6.2 IMPORTING HOME DIRECTORIES
7 MAINTENANCE AND SUPPORT
7.1 CHANGING FILE STORAGE
1 Introduction

This white paper provides guidance on best practices for deploying the Xythos Document Manager, Digital Locker, and Web File Server in a wide variety of environments. (For the sake of simplicity, this document refers to all three products collectively as Xythos or the Xythos server.) This guide supplements the technical documentation provided with these products. In general, the topics are arranged along a timeline that matches your own deployment considerations. The starting point, naturally, is the system's architecture.

2 Technical Architecture

The Xythos server consists of several loosely coupled, distributed components. This design provides a great deal of flexibility, which is vital to scalability, reliability, and performance. Equally important, the architecture is designed to fit into your existing IT "ecosystem" as easily as possible. Xythos consists of the following components:

Application Server - The application logic runs in any Java Servlet 2.3 compatible container. Xythos is bundled with Jakarta Tomcat, but you are free to use any application server. The Xythos server is also J2EE-compatible, so it can be deployed using J2EE 1.3 compatible application servers and some J2EE 1.2 compatible application servers. The Xythos server can create an EAR (Enterprise Archive) file, a JAR file that contains a J2EE application. A system administrator can also create several WAR (Web Archive) files, packages of Web modules that collectively perform as a J2EE application. In other words, the administrator can deploy either the EAR file or the WAR files with a J2EE-compatible application server.

Database - Xythos uses a relational database to store all metadata, such as permissions, versions, properties, comments and logging information. Supported databases include: IBM DB2, Oracle, Postgres (open source), and MS SQL Server.
Xythos is bundled with Postgres, so you do not have to do a separate database installation to bring up Xythos for the first time.

File System - Xythos stores all file content directly onto a storage device in any native file system, including Network Attached Storage (NAS), a Storage Area Network (SAN), and Direct Attached Storage. Alternatively, you can store the content in a database.

[Figure: System Architecture. Load Balancer / Web Servers; App Server / Servlet Container running Xythos WFS (scale 1...N); Global Schema DB; Document Store Schema DB (scale 1...N); Storage Server (NAS, SAN, RAID; scale 1...N). Each App Server must have a mount/share to the storage server(s).]

2.1 Document Stores / File Storage

The way Xythos stores files again demonstrates the critical importance of flexibility. Users never see this part of the system: an "obfuscated" document store exists behind a database, which presents the picture of an enhanced file system. The database provides the map to the familiar world of files and folders, along with additional
features like versions, custom metadata, and workflows. It also provides the gateway to the actual storage behind the scenes, either the database or a file storage device (NAS, SAN, etc.).

Xythos maintains its own proprietary directory structure on external storage. For example, a file called /users/jdoe/test.doc may be physically stored at /xythosdata/2005/2005-01/2005-01-12/4901/4201.204. The first part of the file path, /xythosdata, is an External Storage Location: a mounted directory, on each app server running the Xythos server, that points to the storage device. Xythos writes files to directories corresponding to the date they were added. The server never overwrites files on external storage. Instead, it deletes the old file and creates a new one in a directory corresponding to the edit date. So a directory for a given date, /2005/2005-01/2005-01-12/, represents all the new content, from uploads and edits, created on that date.

Because Xythos maintains a pointer to the file content in the DB, the system maintains only one copy of each unique file, even if many users have it. As soon as the content (bytes) changes, the system branches and maintains another copy. This single content source feature reduces storage requirements.

2.2 Global Schema

Each Xythos instance needs one database for the Global Schema, which stores configuration parameters that tie together the whole system. This information includes user and group definitions, such as contact lists, preferences, global groups, and user sessions. There can be only one Global Schema per Xythos instance, and it must run on a single database; however, that database can be clustered.

2.3 Web Server

For Xythos, the bundled Tomcat servlet engine can act as both the web server and application server.
If you want to run a separate web server, it must meet the following requirements:
- Works with a Java servlet container.
- Allows communication through the WebDAV extensions to the HTTP specification (including HTTPS).
- Allows communication through all other headers and methods supported under HTTP 1.1 and WebDAV.

2.4 Load Balancer

Frequently, system administrators deploy multiple application servers running Xythos, with a load balancer as the front end to all of them. To ensure high performance in this configuration, the Xythos server makes extensive use of caches. We recommend that you configure the load balancer to use sticky sessions, so that it sends requests to the same application server for the duration of a user's session. However, this option is not a requirement. Each application server is stateless, and sessions are maintained in the Global Schema DB. It doesn't matter if the load balancer bounces a single user from one application server to another, since that user will remain logged in.

2.5 LDAP Integration

The Xythos server uses an LDAP service for user authentication. Xythos then associates the users and groups in that repository with the Xythos content. The Xythos server integrates with any directory server that conforms to the LDAP specification v2 or v3 (using Java's JNDI libraries to communicate with LDAP servers). Xythos has successfully integrated with Microsoft Active Directory, Novell NDS or eDirectory, Netscape Directory, and OpenLDAP. Xythos only reads from the LDAP server; it doesn't write or update information.

User Account Creation

When integrated with LDAP, the Xythos server doesn't require any action to create a user account. Home directories are created when users log in for the first time. You can configure the system to grant access to any LDAP-authenticated user, or to a restricted subset designated by LDAP group membership or some other LDAP-maintained attribute.
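
The three access options just described (any authenticated user, members of a given group, or users carrying a given attribute) can be pictured with a small sketch. This is only an illustration of the decision, not Xythos code: the `LdapUser` and `AccessPolicy` types, their field names, and the `has_access` function are all hypothetical, and the attribute name `xythosaccess` is simply the example used later in this document.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LdapUser:
    """Hypothetical record for a user who has already authenticated against LDAP."""
    dn: str
    groups: set = field(default_factory=set)        # DNs of groups the user belongs to
    attributes: dict = field(default_factory=dict)  # other LDAP attributes on the user


@dataclass
class AccessPolicy:
    """Hypothetical policy; leave both fields as None to admit every authenticated user."""
    required_group: Optional[str] = None        # group DN the user must belong to
    required_attribute: Optional[tuple] = None  # (attribute name, required value)


def has_access(user: LdapUser, policy: AccessPolicy) -> bool:
    """Decide whether an authenticated LDAP user may use the server."""
    if policy.required_group is not None:
        return policy.required_group in user.groups
    if policy.required_attribute is not None:
        name, value = policy.required_attribute
        return user.attributes.get(name) == value
    return True  # no restriction configured
```

In a real deployment this choice is made in the Xythos Admin rather than in code; the sketch only shows that the group and attribute restrictions are checks layered on top of a successful LDAP authentication.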
LDAP Groups / File and Directory Permissions

The Xythos server has very granular control over file and directory permissions. Each file or directory can have permissions (read, write, delete, administer, inherit read, inherit write, inherit delete, inherit administer) set for any user or group that exists in LDAP (there are also groups that can be defined only in the Xythos server). At login, the server calculates each user's group membership. Any permission set for a group applies to all the users
that are members of it. This provides a powerful model where Xythos maintains permissions for LDAP groups, while LDAP administrators maintain the group membership.

LDAP Fail-Over

Xythos administrators can configure the system to integrate with redundant LDAP servers. If Xythos is unable to get a connection to a given LDAP server, it can fail over to another.

2.6 File Content Searching

The File Content Search feature is composed of two components: filters and an index. Filters are responsible for extracting text content from the various file types. The content is stored in a search index, which is written to disk. If there are multiple application servers, the search index must be mounted to each server (like external storage).

2.7 Scaling

The Xythos server was designed with scalability and performance in mind. A system administrator can scale each component (the application, the database, or storage) as needed to handle increasing demands on the system. Each component can scale independently: there can be 1 to N application servers, database servers, and storage devices. Note that the Global Schema must run on a single database, which can be clustered if needed.

[Figure: Scalability. Application server scaling (application tier: App Servers running Xythos WFS, 1...n); DB scaling (DB 1...n); storage scaling (storage tier: External Storage on NAS, SAN, RAID, 1...n).]

Application Tier Scaling

A single Xythos instance can support numerous application servers at the application tier. If you have a load-balancing switch/router, you can implement both load-balancing and fail-over capabilities. Each application server is stateless, processing user requests to the database and the storage devices behind it. If your application server is getting saturated with these requests, simply add another application server.

Storage Tier Scaling

The storage tier consists of one or more document stores.
Each document store consists of a DB to store file metadata, and one or more External Storage Locations to store file content. The storage tier can scale in two dimensions:

Additional document stores can be added to the system to provide load balancing for the database(s). Each top-level directory (root directory) in the Xythos server is associated with a particular document store, and all files stored within that directory are stored in the database and file system(s) configured for the document store. This allows administrators to distribute the management of all Xythos content across multiple databases and storage devices. For example, data within the /users/ directory can be managed by one database and storage server, while all shared data in /company/ can be managed by a different DB and storage server. Top-level directories can also be moved from one document store to another.
Additional storage (External Storage Locations) can be added in order to provide more storage capacity. Once disk space starts to run low on a particular External Storage Location, administrators can add another External Storage Location to the document store. The system will continue to read and write to the old storage locations for existing files, but new content will be written to the new storage location. External Storage Locations can be moved and renamed as computing resources change.

Global Schema Scaling

There must be one, and only one, Global Schema per Xythos instance, and it has to run in one database. This DB can be clustered for performance reasons, although this is probably not necessary since there is not a lot of activity using this DB (most of the system parameters are read at startup and cached from that point on). The heaviest usage of the Global Schema is managing user sessions, which is not very database intensive.

2.8 Backup and Recovery

Back up the DB and file system the way you normally back up these systems today. There are two ways of running the server: with or without point-in-time recovery. We recommend running with the point-in-time recovery feature on. In this mode, new files are stored in two locations: a primary storage location and a temporary storage location (the database can be used for temp storage). The primary file system should be backed up at some interval. Files are kept in the temporary storage location for two of your backup intervals. If there is a disk crash on the primary storage location, you can recover primary storage to the point in time of the last backup via your backup system. In this case, the temporary storage location contains all the files created between the last primary backup and the crash of primary storage (this assumes that the temp storage is on a machine isolated from the primary crash).
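
As a rough illustration of the bookkeeping involved, the sketch below picks out the temp-storage files that would have to be copied back after primary storage is restored from its last backup, and computes the retention window of two backup intervals. It is a minimal sketch under stated assumptions, not the Recovery Manager's actual logic: the function names and the representation of a file as a `(path, written_at)` pair are hypothetical.

```python
from datetime import datetime, timedelta


def files_to_replay(temp_files, last_backup):
    """Return paths of temp-storage files written after the last primary backup.

    temp_files is an iterable of (path, written_at) pairs; this shape is an
    assumption made for the sketch. Paths are returned oldest first.
    """
    newer = [(written_at, path) for path, written_at in temp_files
             if written_at > last_backup]
    return [path for _, path in sorted(newer)]


def temp_retention(backup_interval, intervals_kept=2):
    """Files stay in temp storage for two backup intervals (per the text above)."""
    return backup_interval * intervals_kept


# Example: nightly backups at 02:00, crash some time after 09:00 on 2005-01-12.
last_backup = datetime(2005, 1, 12, 2, 0)
temp_files = [
    ("a", datetime(2005, 1, 11, 23, 0)),  # already covered by the backup
    ("c", datetime(2005, 1, 12, 9, 0)),
    ("b", datetime(2005, 1, 12, 3, 0)),
]
print(files_to_replay(temp_files, last_backup))
print(temp_retention(timedelta(days=1)))
```

Keeping files for two intervals rather than one means a file written just before a backup is still available if that backup itself turns out to be unusable.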
The Xythos Recovery Manager (an option in the admin) can be used to replay (or copy) the files that were written since the backup from the temporary space back to the primary storage location. Thus, the temporary storage location functions similarly to a database redo or transaction log. In this way the system recovers from file system failures.

[Figure: Backup and Recovery. Timeline showing the backup interval, the last backup before the crash, the primary disk crash (which does not affect temp storage), recovery from the backup system to primary storage, and the Xythos Recovery Manager replaying files from temp storage.]

For single file recovery, there are a few mechanisms in place. The first is the trashcan. Each directory has a designated spot where files are moved before they are deleted completely from the system. The system can also track deletions, so that file names and storage locations in the backups are marked when a file is deleted. If a user calls the help desk to retrieve a file that they deleted (even a few months back), the admin can go into the Xythos Delete History feature and go directly to the backup where the file last was. This beats searching through backups at length because the person can't remember when the file was deleted, or "exactly" where it was.

The databases (Oracle, SQL Server, Postgres, DB2) already have mechanisms in place for hardware failure, and we rely on those mechanisms to recover them. Therefore each part of the system can do point-in-time recovery and works with the backup strategies you already have in place.

2.9 Clustering, Fail-Over and High Availability

Application server fail-over can be accomplished simply by deploying the solution with multiple application servers in a load-balanced configuration.
Xythos relies on existing DB clustering technologies to provide fail-over/clustering capabilities. For example, both Oracle and SQL Server provide application clustering capabilities. You would need to configure such a solution at the database tier, then configure the Xythos server to talk to the clustered environment.

For the storage tier, fail-over is provided by the storage technology you choose to deploy with the solution. This may be accomplished by deploying SAN or NAS solutions (e.g. a Network Appliance box), or by simply using RAID technology to achieve disk mirroring.

The Xythos server supports fail-over across multiple LDAP servers; if it cannot obtain a connection to a given server, it will fail over to the next one.

3 Planning Xythos Deployment

Before deploying the Xythos server, it is important to consider how the system will be used in order to select the correct hardware and system configuration. A system that has a very large number of files but not many users may require a different configuration than a system with a lot of users. Some questions to consider when planning are:
- How many users will use the system?
- What storage quota will users be allowed?
- How will users use the system and what types of content will they use?
- Which types of documents are of highest value for sharing across your organization (training/research materials, financial reports/budgets, policies and procedures, subject matter expert reports, project documents, etc.)?
- For those that purchased the Enterprise edition, which users will be responsible for developing the document classifications and/or workflows? Are these the same users that will use the classifications and workflows?
- Will the Xythos file content search feature be used?

3.1 System Sizing

Before selecting an appropriate hardware configuration for the Xythos system, you must understand how much storage capacity will be needed.
External Storage

The system will need enough disk space on external storage to store all the content that Xythos will manage. If you intend to use the temporary storage feature (to ensure point-in-time recovery), you will need to allocate space for that as well.

File Content Search Index

If the File Content Searching feature is enabled, you must allocate space to store the search index. Ideally this should be placed on the same storage device as the file content. The amount of space needed to store the index depends on the file types that are stored in Xythos. For example, if all the files are ASCII (HTML, text files, etc.), there will be a large amount of data in the search index. On the other hand, if all the files are images, which do not have text content, the search index will be small. In general, assuming a variety of file types will be stored in Xythos, the search index will be about 10% of the total file content.

Database Size

Regarding the number of document stores, database sizing depends on how large the database (schema) associated with the document store will grow. A system administrator may want to keep a single database, or (if comfortable managing this configuration) split this tier across multiple databases. Note that you can add document stores "on the fly" with the Xythos server. If you start with one and then decide you need additional document stores, all you need to do is set up the corresponding DB (schema) and run the Xythos installation utility to set up the DB objects.
The document store schema size depends on the number of files and the amount of metadata per file. To estimate the database size, the following assumptions were made: each file will average 3 file versions, each file will average 3 custom properties, and each file will average 10 log entries. Given these assumptions, each file will require approximately 5 KB to store its metadata. This number needs to be adjusted accordingly if the estimates in the assumptions are increased or decreased.

3.2 Hardware Requirements

The various components of Xythos can be run on different machines or on a single machine. The arrangement of those components is the Xythos topology. Deciding which topology to use depends upon performance needs, available equipment, and the amount of information being stored in Xythos. The table below provides recommendations for system sizing based on the number of users.

Number of Users    Topology

0 - 2K             Xythos / App Servers: 1 server, 2 processors, 2 GHz, 2 GB RAM
                   Database: 1 server, 2 processors, 2 GHz, 4 GB RAM
                   Storage Server

2K - 5K            Xythos / App Servers: 2 servers, 2 processors, 2 GHz, 2 GB RAM
                   Database: 1 server, 2 processors, 2 GHz, 4 GB RAM
                   Storage Server
                   Load Balancer

5K - 10K           Xythos / App Servers: 3 servers, 2 processors, 2 GHz, 2 GB RAM
                   Database: 1 server, 2-4 processors, 2 GHz, 4 GB RAM
                   Storage Server
                   Load Balancer

10K - 20K          Xythos / App Servers: 4 servers, 2 processors, 2 GHz, 2 GB RAM
                   Database: 1 server, 4 processors, 2 GHz, 8 GB RAM
                   Storage Server
                   Load Balancer

20K - 40K          Xythos / App Servers: 5 servers, 2 processors, 2 GHz, 2 GB RAM
                   Database: 1 server, 4 processors, 2 GHz, 8 GB RAM
                   Storage Server
                   Load Balancer

40K+               Xythos / App Servers: 6+ servers, 2 processors, 2 GHz, 2 GB RAM
                   Database: 1-2 servers, 4 processors, 2 GHz, 8 GB RAM
                   Storage Server
                   Load Balancer

Storage Server

A dedicated storage node must present itself to each application server as a mounted drive or network share.
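
The sizing rules of thumb from section 3.1 and the user ranges in the table above lend themselves to a quick back-of-the-envelope calculation. The sketch below is only that: the function and constant names are hypothetical, the 5 KB figure is treated as 5,000 bytes, and the table's shared range boundaries (for example 2K) are assumed to belong to the lower range.

```python
# Rules of thumb taken from sections 3.1 and 3.2; adjust them if your averages
# for versions, custom properties, or log entries per file differ.
METADATA_BYTES_PER_FILE = 5_000  # approx. 5 KB of database space per file
SEARCH_INDEX_PERCENT = 10        # index is about 10% of content for mixed file types

# (upper bound of the user range, recommended number of app servers)
APP_SERVERS_BY_USERS = [
    (2_000, "1"),
    (5_000, "2"),
    (10_000, "3"),
    (20_000, "4"),
    (40_000, "5"),
]


def estimate_sizes(total_content_bytes, file_count, content_search=True):
    """Estimate space for file content, the search index, and document store metadata."""
    index_bytes = total_content_bytes * SEARCH_INDEX_PERCENT // 100 if content_search else 0
    return {
        "content_bytes": total_content_bytes,
        "search_index_bytes": index_bytes,
        "database_bytes": file_count * METADATA_BYTES_PER_FILE,
    }


def app_servers_for(users):
    """Look up the recommended app server count for a given number of users."""
    for upper_bound, servers in APP_SERVERS_BY_USERS:
        if users <= upper_bound:  # assumption: a boundary value falls in the lower range
            return servers
    return "6+"


# Example: 1 TB of content in 2 million files, 7,500 users.
print(estimate_sizes(1_000_000_000_000, 2_000_000))
print(app_servers_for(7_500))
```

The estimate deliberately leaves out temporary storage for point-in-time recovery, since the space it needs depends on how much content is written per backup interval.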
Load Balancer

If there is more than one application server, a load balancer should be placed in front. This is not required: since the Xythos app servers are stateless, DNS "round robin" can work, but it may not detect device failures the way a dedicated load balancer will (Big IP, MS Load Balancing Service, Cisco Local Director, pen (open source)).

Web Server

The Xythos server does not require a dedicated web server, although one can be used. Xythos is bundled with the Tomcat application server, which can perform the duties of a web server. Unless you have a specific use in mind for a web server, we don't recommend using one. Some reasons why you might want to have a web server in front of Xythos include:
- A web server is needed to serve up Xythos as well as other applications.
- Web servers provide a convenient way for administrators to post messages when the Xythos server is down for maintenance.
- You may need a web server in which to install your authentication module (ex: Netegrity Siteminder, Oblix, NetPoint).

3.3 Xythos Administrators

There are two major types of administration tasks needed to maintain the Xythos server. There is system administration to maintain the web, application, storage, and database tiers, and to configure Xythos via the admin application. There is also administration of the content, document classifications and workflows within the Xythos server. The system admin and content admin can be the same person or different people. Almost all system and content administration can be accomplished with the Xythos Admin application. There are a lot of features in the Xythos Admin that a content administrator may not care about.

Super Users

Super-users have full permissions to do anything in the system, as far as content is concerned. They have implicit read, write, delete, and administer permission for every file or directory in Xythos. These users log into Xythos like any other user, but they have full access to the file system. In some situations, a super-user may be a better choice than the Xythos Admin, because it allows for use of the Xythos Drive or WebDAV clients. Super-users are purely optional, but we recommend creating them if you have a "trusted" user responsible for supervising the content in the system.

3.4 User and Group Repository

The content in Xythos is associated with users and groups, which are stored in a user repository. It is important to decide early on which user repository you are going to use, because you cannot switch user repositories at a later time. The Xythos server has the capability to integrate with any user repository (with custom coding), and supports two out of the box:

Regular User Model

Users and groups are created, stored, and managed in Xythos' global schema.
With this user model you have the option to allow users to self-register, or user account creation can be done with the Xythos Admin, API, or bulk loader utility.

LDAP User Model

Users and groups are created, stored, and managed in an LDAP server. The Xythos content is associated with the LDAP users or groups via a principal ID, a string that uniquely represents a user or group in LDAP and will not change over time. Principal IDs for users and groups are defined as follows:

User Principal ID

For users, the principal ID can be either of the following:
[1] The user's distinguished name (DN), which represents the full path to the user in LDAP (e.g. CN=someUser,OU=groups,DC=xythos,DC=com). Since DNs can change, Xythos does not recommend using the DN to identify users.
[2] The value of a user attribute in LDAP. If there is an attribute on each user that is guaranteed to be unique and will never change, it should be used as the principal ID for the users. Active Directory provides an attribute like this named ObjectGUID. For other systems there may be something suitable such as employeenum, ssn, or username. Note that where usernames are not allowed to change, username is a better choice than the user's DN, which can change if any part of its containing tree does.

Group Principal ID

Group principal IDs are the group's distinguished name (DN), which represents the full path to the group in LDAP (e.g. CN=someGroup,OU=groups,DC=xythos,DC=com).

Principal ID Change

Since all metadata in the Xythos system is associated with principal IDs for users and groups, changes in these IDs should be avoided. If a user's principal ID were to change, they would become a completely new and different user to the Xythos system, and they would no longer have any of their content.
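
To see why an unchanging attribute is the safer principal ID, consider a user who is moved from one organizational unit to another. The sketch below is a simplified illustration, not how Xythos actually encodes principal IDs: the `principal_id` function, the dictionary representation of an LDAP entry, and the sample DNs and employee number are all made up for the example.

```python
def principal_id(entry, unique_attribute=None):
    """Derive a principal ID from an LDAP entry (hypothetical helper).

    entry is a dict holding the user's 'dn' plus any other attributes.
    With unique_attribute set, that attribute's value is the ID (option [2]);
    otherwise the DN is used (option [1]).
    """
    if unique_attribute is not None:
        return str(entry[unique_attribute])
    return entry["dn"]


# The same person before and after being moved from the sales OU to marketing.
before = {"dn": "CN=someUser,OU=sales,DC=xythos,DC=com", "employeenum": "10442"}
after = {"dn": "CN=someUser,OU=marketing,DC=xythos,DC=com", "employeenum": "10442"}

# DN-based ID: the move produces a different ID, so content would be orphaned.
print(principal_id(before) == principal_id(after))

# Attribute-based ID: unchanged by the move.
print(principal_id(before, "employeenum") == principal_id(after, "employeenum"))
```

Because the choice cannot be revisited without changing every user's principal ID, it is worth confirming with the LDAP administrators that the chosen attribute really is immutable.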
Defining Xythos Access

You may not want every user in the LDAP server to have access to the Xythos server. If so, you can define the users with access to be (a) the users that are in a given group or (b) users with a specific attribute (ex: xythosaccess=true). If you go with either of these configurations you will have to modify your LDAP server accordingly.

Group Membership

The WFS allows for assigning file permissions and properties to groups. Doing so applies the permission to all the users that are members of the group. Groups can be created and maintained in Xythos (Xythos Global Groups), but the preferred approach is to use LDAP. The WFS has two ways of calculating the LDAP group membership for a user: [1] using the user's "memberof" attribute, or [2] using the group "members" attribute. [1] is preferred because it takes much less time to process this attribute than to search for all the groups that contain the current user, and all the groups that contain those groups, and so on. The WFS expects the memberof attribute to be a list of distinguished names for groups, not a list of group names (ex: memberof=cn=group1,ou=groups,o=world; CN=group2, OU=groups,O=world). Ideally the memberof attribute will contain all the nested groups a user belongs to as well. If a user is a member of the sales group, which itself is a member of the company group, then memberof may or may not contain the company (nested) group. If it doesn't contain the nested groups, you may need to configure the WFS to search recursively for groups using the group members attribute, which will find all nested groups but may adversely impact login time.

3.5 Planning Backup and Recovery, High Availability, Fail-Over

Administrators should plan to back up all of the data that makes up the Xythos system. Additionally, if there are requirements for high availability and fail-over, all Xythos data may need to be continually duplicated to another system (via DB transaction logs, rsync, mirroring, etc.).
The relevant data is stored in several places:
- External Storage: If the external storage feature is enabled (recommended approach), file content will be written to disk.
- Databases: The global schema and document store databases.
- File Content Search Index: If Xythos file content search is enabled, an index is used to store file content information.
- LDAP Server: User and group information.

To simplify backup procedures, most data can be stored on a storage device such as a SAN. In this configuration, the DB can write its data files to the SAN, which also stores file content and the search index.

4 Installation

While the standard Xythos documentation already walks you through the steps needed to install the server, this section provides some important tips and tricks. To expedite the deployment process you may wish to contact Xythos Professional Services for onsite installation, configuration and training.

Before installation, the system administrator should make sure every item in this list is checked off:
- All hardware is deployed, with OS settings completed.
- There is network connectivity between the web servers, app servers, database, and external storage servers.
- An external storage device, sufficient to store all the desired content, is installed and can be shared to each of the app servers.
- The database software is installed and the instance is created.
- The necessary JDBC drivers are installed on each application server.
- A Java Development Kit (JDK, not JRE) is installed on each of the application servers.
- SSL certificates have been requested.
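
A couple of the checklist items can be verified with a short script run on each application server before starting the installer. The sketch below is a hypothetical pre-flight helper, not part of the Xythos distribution. It relies on one well-known fact, that a JDK ships the `javac` compiler in its `bin` directory while a JRE does not, and on the assumption that the external storage mount must be a writable directory for the account that will run the server.

```python
import os


def looks_like_jdk(java_home):
    """A JDK contains bin/javac (javac.exe on Windows); a JRE does not."""
    bin_dir = os.path.join(java_home, "bin")
    return any(os.path.isfile(os.path.join(bin_dir, name))
               for name in ("javac", "javac.exe"))


def storage_is_writable(mount_point):
    """The external storage mount should exist and be writable by this account."""
    return os.path.isdir(mount_point) and os.access(mount_point, os.W_OK)


def preflight(java_home, storage_mount):
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    if not looks_like_jdk(java_home):
        problems.append("No javac under %s/bin: this looks like a JRE, not a JDK" % java_home)
    if not storage_is_writable(storage_mount):
        problems.append("External storage %s is missing or not writable" % storage_mount)
    return problems
```

Typical use would be something like `preflight(os.environ.get("JAVA_HOME", ""), "/xythosdata")`, where `/xythosdata` is the example mount point used earlier in this document; network connectivity, JDBC drivers, and certificates still need to be checked separately.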
Multiple Server Install

If you are going to install the Xythos server on more than one application server, we recommend that one of the servers be set up and configured completely. You can then use this instance as a template for other servers. There is more information on clustering the Xythos server below. For now, concentrate on just one of the application servers.

Database Setup

Please refer to the product documentation to create the necessary databases and users. Here are some ways to avoid common pitfalls:
- For MSSQL, the DBs must be created after logging into the Enterprise Manager as sa.
- For MSSQL, the DBs must be created with the correct collation name.
- For all DBs, ensure the correct JDBC drivers are available.
- For Oracle, there are requirements for the DB instance (which will hold the Xythos "databases" or users) that must be configured when the instance is created. Please refer to these requirements in the installation documentation before proceeding.

4.1 installxythos

The installxythos application is responsible for installing, upgrading and maintaining the Xythos server. Here are some things to keep in mind when running the installation utility:
- You will need a valid license key in order to install the server. The Xythos production license keys have an expiration date; the server must be installed before that date. Once installed, the key will not expire (it is a perpetual license).
- When specifying the location of java home, you must use the home of the JDK, not the JRE.
- installxythos will ask you for the location of your JDBC drivers. It will copy the drivers (jar files) from the location you have specified into <xythos home>/custom/lib so that they are in the classpath (if you are using another application server you will need to configure the classpath). Make sure that the JDBC drivers are NOT in <xythos home>/custom/lib before you start, since that will cause an error when the system tries to copy them on top of themselves.
- When you have completed the installation of the DB tables, make sure to configure the bundled servlet container (even if you are going to use something else).
- Enable LDAP integration, document classification or workflow if applicable.

4.2 Xythos Admin Configuration

Once the system is installed, bring up the Xythos Admin application to configure your system. The following configuration changes should be made before users are allowed to log in.

LDAP Configuration
(Xythos Admin -> Server Admin -> User Model)

This is the first thing that should be done after installation, since many other configurations depend on it. After completing the wizard you should verify all the settings and make changes if necessary.

User Unique ID

Perhaps the most important setting is the user's unique ID on the User and Group Attributes page. This should be set to an attribute that will not change.

Domains

Domains are another important aspect of the LDAP integration. A domain is composed of a name and a DN which represents a location in the LDAP tree (ex: OU=users,DC=xythos,DC=com). The Xythos server executes all searches against LDAP starting from the DN for one of the domains. In other words, the domains define all the LDAP data that Xythos can see. Domain names appear in the Xythos interface in a couple of places: the login page has a drop-down menu listing the user domains, and when searching for other users and groups, to assign
permissions, users must select the domain (location) to search from. If more than one domain is defined, they must not contain each other. For example, you should not set up one domain with DN ou=groups,dc=xythos,dc=com and another with DN ou=vendors,ou=groups,dc=xythos,dc=com. To avoid possible user confusion regarding domains, Xythos recommends defining one domain (or, if not using Active Directory, one default user domain and one group domain) that points to the root of the relevant LDAP directory structure. Here are examples of the recommended configuration:
- Active Directory: If the AD server is in the xythos.com domain, there should be one Active Directory Forest Root domain with name xythos.com and DN DC=xythos,DC=com.
- Other LDAP Servers: Find a spot in the LDAP tree that is a root of all users and groups and point the user and group domains to it. For example, create one Default User Domain with name company.com and DN ou=company,o=world. Also create one Group Domain with name groups.company.com and DN ou=company,o=world.
However, you may need to set up multiple domains in order to segregate users and groups.
The domain DNs are very important because they are used to create an ID (principal ID) that uniquely identifies a group. Domain DNs will also become part of a user's principal ID if there is no suitable Unique User ID attribute. So the domain DN string will become part of the IDs associated with file and directory permissions. For example, when assigning a permission to an LDAP group, a user will search for and select the desired group, then configure the permissions for that group. The system will then associate the group's principal ID with this permission, and this principal ID must match exactly (it is case sensitive) the group principal ID the system will calculate for users that are members of this group. In most cases, the system looks at the value of a user's memberof attribute in LDAP to determine the principal IDs of the groups that this user is a member of.
In this case, the group domain DNs must exactly match the format that LDAP is returning for a user's memberof attribute. If a user's memberof attribute (which can be examined using the LDAP search utility in the Xythos Admin) shows groups like CN=group1,OU=groups,DC=xythos,DC=com, then the group domain DN must be contained in this string, i.e. DC=xythos,DC=com, NOT dc=xythos,dc=com. If the domain DNs do not exactly match what LDAP is returning, permissions to LDAP groups will not work.

Group Membership
MemberOf should be set to inclusive if possible.

Supporting Usernames that contain @
Xythos has special logic to parse usernames of the form username@domainname. If you must support usernames that contain the @ character, then you should change the default delimiter to some character that will not show up in the username field, such as #. (Xythos Admin -> Server Admin -> User Model -> Directory Service -> Domain Delimeter).

New User Templates (Xythos Admin -> Users and Groups -> New User Templates)
New User Templates provide a mechanism to configure the directory structure, permissions, and properties for new home directories. When Xythos is integrated with LDAP, home directories are created when users log in for the first time. New User Templates can be associated with users by an LDAP attribute, group, or custom Java class. We recommend that home directories be placed in some top level directory such as /users. Create a users directory (File System -> Add Top Level Directory) with no quota, owner, or bandwidth settings. You probably do not want versioning or logging on by default for this directory (allow users to toggle these settings as needed). Once the /users directory is set up, define a New User Template with Home Directory Parent Path set to /users and Directory Template set to Create New Directory Template. Save the template, and then configure the properties and directory tree for this template.
Note that each directory (the user's home directory and all subdirectories) can be configured to have a certain set of properties (trash, quota, bandwidth, logging, versioning) and permissions. Once the templates are set up they must be associated with users (Server Admin -> User Model -> User & Group Attributes -> Required User Attributes and Settings -> User Template). If you only have one template, it should be specified as the default (in the New User Template section) and you should set User Template to Always use default New User Template. Otherwise you must define a mapping between LDAP user attributes or group membership and the templates.
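Before moving on, the two domain rules from the LDAP Configuration section above (domains must not contain each other, and the group domain DN must match the memberof value exactly, including case) can be checked with a small script. The following Python sketch is illustrative only and is not part of Xythos: the function names are hypothetical, DNs are split on plain commas (escaped commas are not handled), and "contained" is interpreted here as "appears as a suffix of the memberof value". The nesting check treats two identical DNs as overlapping, so it is meant to be applied among domains of the same kind, since the recommended configuration above uses the same DN for the user domain and the group domain.

```python
def domains_nested(dn_a: str, dn_b: str) -> bool:
    """Return True if one DN is equal to, or an ancestor of, the other.

    Simplification: splits on plain commas and compares case-insensitively,
    so escaped commas inside a DN are not supported.
    """
    a = [part.strip().lower() for part in dn_a.split(",")]
    b = [part.strip().lower() for part in dn_b.split(",")]
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    # The shorter DN contains the longer one if it matches its trailing components.
    return longer[len(longer) - len(shorter):] == shorter


def group_domain_matches(member_of_value: str, group_domain_dn: str) -> bool:
    """Case-sensitive check that a memberof value ends with the group domain DN."""
    return (member_of_value == group_domain_dn
            or member_of_value.endswith("," + group_domain_dn))


if __name__ == "__main__":
    member_of = "CN=group1,OU=groups,DC=xythos,DC=com"
    print(group_domain_matches(member_of, "DC=xythos,DC=com"))
    print(group_domain_matches(member_of, "dc=xythos,dc=com"))
    print(domains_nested("ou=groups,dc=xythos,dc=com",
                         "ou=vendors,ou=groups,dc=xythos,dc=com"))
```

The memberof value to test against can be copied from the LDAP search utility in the Xythos Admin.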
Language and Timezones (Xythos Admin -> Server Admin -> Internationalization)
If you would like to display Xythos UIs in a language other than English, add your language code to the list of available languages. Select the default language and timezone.

Security (Xythos Admin -> Server Admin -> Security)

Superuser Group, Classification Administrators Group, Classification Users Group
These groups can be set to Xythos global groups or LDAP groups (recommended).

How owner of new file is determined
This setting determines who the owner of a new file will be. For example, if a directory /foo is owned by usera, but userb has write access to it and uploads a file test.doc, should /foo/test.doc be owned by usera or userb? We recommend that this be set to Owner is the creator of the resource or programmatic, which would cause userb to own the file in this example.

Make session cookies persistent
Persistent session cookies are written to disk and can be shared by multiple applications. They are also not deleted when the browser is closed. Since this represents a security risk, we recommend that session cookies not be persistent.

Users are allowed to search for other users
This parameter controls whether there will be a search from field when users search for users or groups, to assign permissions, in the Xythos server. If you have only one LDAP domain defined (Active Directory), or if all your domains point to the same place (iPlanet), then you should configure this to only on the user's location. This is the simplest configuration and will hide the search from drop-down from Xythos users. Since the user's location (domain) points to the root of the LDAP tree, searching from here is always going to find what is needed.

Performance & Quotas (Xythos Admin -> Server Admin -> Performance & Quotas)

Default quota for new directories
This applies to all new directories created in the system.
Note that user home directory quotas can and should be configured via the New User Template feature, so this should define the default quota for any other directories.

Check bandwidth limits
This setting can enable or disable bandwidth limits at the system level. Note that when it is enabled you still must configure a bandwidth limit for a given directory for this to apply. Bandwidth limiting does have a performance impact, and it should not be used unless necessary. Unless you know for sure you intend to turn on bandwidth limits for a particular directory, leave this off.

(Xythos Admin -> Server Admin -> Server Group Specific)

File cache size
The file cache is used for file reads (gets). If many users are downloading the same file, it will be kept in cache to improve performance. This should be set to 1/10th of your available memory. You may need to come back to this setting after configuring the max Java heap size in a later step.

Storage / Document Stores (Xythos Admin -> Server Admin -> Storage)
Define external storage locations for your primary and temp storage. Temp storage is used for backup and recovery purposes as discussed in the Architecture section. Then configure the document store to use the primary and temp locations. You may also wish to rename the document store to something other than the default.
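As a worked example of the file cache guidance above, combined with the heap guidance in the Heap size section (max memory at 75% of RAM on a dedicated machine), the arithmetic can be sketched as follows. This is a rough sizing aid, not a Xythos utility: the function name is hypothetical, and it assumes that "available memory" for the file cache means the maximum Java heap, which the text suggests but does not state outright.

```python
def suggest_memory_settings(ram_mb: int) -> dict:
    """Suggest a max heap and file cache size (in MB) from installed RAM.

    Assumptions (not confirmed by the product documentation):
    - the machine runs nothing but the Xythos app server;
    - "available memory" for the file cache means the max Java heap.
    """
    max_heap_mb = ram_mb * 75 // 100   # 75% of RAM, per the Heap size guidance
    file_cache_mb = max_heap_mb // 10  # 1/10th of the memory available to the JVM
    return {"max_heap_mb": max_heap_mb, "file_cache_mb": file_cache_mb}


if __name__ == "__main__":
    # Example: a dedicated app server with 8 GB of RAM.
    print(suggest_memory_settings(8192))
```

If other applications share the machine, both figures should be reduced accordingly.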
If any production files or directories were added to the Xythos server before external storage was enabled, they must be migrated out to storage from the DB via the migrate data now link on the document store page.

Logging (Xythos Admin -> Server Admin -> Logging)
Set up a log file and set log to file to yes. You should leave all logging off unless troubleshooting a specific problem. When the system is not logging to a file the messages will be sent to standard out.

Virtual Servers (Xythos Admin -> Server Admin -> Virtual Servers)
Please refer to the product documentation for a detailed description of Virtual Servers. You must configure at least one network address for your default Virtual Server to be the domain name and port which will be used to access the WFS.

4.3 Application Server Configuration
Once the Xythos server is installed and the basic system configuration is complete, there are some configuration changes which should be made to the app server(s).

Running as Service
The app server running Xythos should be configured to run as a service which starts automatically when the system boots up. Please note that the user that runs the app server process will need read, write, and delete permissions on all of the external storage locations to manage Xythos content.

SSL
SSL should be configured; otherwise the quick launch WebFolder icon in the WebUI will not appear (this is due to behavior in MS Internet Explorer). Refer to the product documentation for more details on how to set up Tomcat with SSL.

Xythos Admin Security
The Xythos Admin application needs some security around it because it allows full control of Xythos content. If you are using the bundled Tomcat engine, the admin is protected for you and can be managed via the installxythos program. The security provided is a username/password maintained in Tomcat's configuration files (tomcat-users.xml). If you are not using Tomcat you will need to set up some security to protect the /xythosadmin URLs.
Note that there is no true admin user modeled in the Xythos system. When you log into the admin, that user has nothing to do with the Xythos user model; it is simply a set of auth credentials needed to get past the security of the app server.

Heap size
The available memory for the app server running the Xythos server should be set to the maximum possible value based on available RAM. Generally, assuming there are no other applications running on the Xythos machine(s), the max memory should be set to 75% of the RAM. If you are using the bundled Tomcat, the maximum Java heap size can be set via installxythos (choose configure server, then set java heap).

4.4 Clustering the Xythos Server
There can be one or more Xythos application instances running in parallel in the application tier. The following diagram illustrates a deployment with 2 app servers running the Xythos server:
Global Schema DB
In a deployment with more than one application server, the Xythos Global schema (xythos DB) is what ties everything together. Each Xythos instance has a <xythos home>/xythos.properties file which contains the JDBC connection URL pointing to the Global schema. Once connected, each Xythos server retrieves all of the system configuration parameters, including knowledge of the other servers, from the Global DB.

Intra-Server Communication
The Xythos servers communicate with each other. When an administrator reconfigures Xythos on one application instance, the server on which the change took place sends a message to all the other Xythos servers telling them to retrieve the updated parameters from the DB. The Xythos.ServerIPAddress setting in xythos.properties is used for this intra-server communication. It should be set to a valid IP address or machine name for the app server. The /xythosremoteadmin web application is what listens for updates from other Xythos servers (default is port 2223).

Load Balancers and Xythos Sessions
Each Xythos server is stateless. All persistent data such as users, sessions, files, and directories is stored in other systems (DB, LDAP, file storage). Session IDs are stored in a cookie on the client machine and in the Global DB. Because of this, it doesn't matter which app server a user's request is sent to; they will still be logged in. The Xythos server does make use of caches, so for maximum performance the load balancer should be configured for sticky sessions, which means that a user is directed to the same app server for the duration of their session. This is not a requirement, and DNS round-robin load balancing will work as well.

App Server Configuration
Each Xythos server can be thought of as identical (unless you are using Server Groups), and therefore each needs to have the same ability to interact with the other devices in the Xythos system.
Specifically, each Xythos application server must have the following:
- A mounted drive, or network share, to the storage device(s). This mount must be identical on all the Xythos servers, and it must point to the same place on the storage device. For example, a Xythos file called /users/jdoe/test.doc is physically stored at /xythosdata/2005/2005-01/2005-01-12/4901/4201.204. In this case, /xythosdata is a mounted directory, on each app server running the Xythos server, that points to the storage device. In the Xythos server, /xythosdata is called an External Storage Location. Each External Storage Location defines the root of a directory structure containing Xythos file content.
- The ability to communicate with each of the other Xythos servers via their registered IP address (xythos.properties -> Xythos.ServerIPAddress) and the specified port for intra-server communication (2223 by default).
- The ability to communicate with the DBs, LDAP server(s), SMTP server, and any other systems integrated into the Xythos server.
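The second requirement above (each server can reach its peers on the intra-server port) can be spot-checked from each app server before it is added to the cluster. The Python sketch below is a generic TCP reachability test, not a Xythos tool: the function names and the peer list are hypothetical, and a successful connection only shows that something is listening on the port, not that the /xythosremoteadmin application itself is responding.

```python
import socket


def can_reach(host: str, port: int = 2223, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def unreachable_peers(peers, port: int = 2223, timeout: float = 3.0):
    """Return the peers (IP addresses or machine names) that could not be reached."""
    return [host for host in peers if not can_reach(host, port, timeout)]


if __name__ == "__main__":
    # Hypothetical values: use each peer's Xythos.ServerIPAddress from xythos.properties.
    peers = ["10.0.0.11", "10.0.0.12"]
    print("Unreachable peers:", unreachable_peers(peers))
```

Running the check from every server in turn covers both directions of the intra-server communication.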
Install Instructions for Second Xythos Server
After installing and configuring Xythos on the first server, the database tables are already set up and the license key has been verified, so you don't need to repeat the same installation process on app server 2. The following tasks should be performed to configure Xythos on a second server:
- Extract the Xythos code into <xythos home> on server 2.
- Copy xythos.properties from server 1 to <xythos home> on server 2.
- Edit xythos.properties on server 2 and set Xythos.ServerIPAddress to the IP address or machine name of server 2.
- Configure the app server:
  - If using the bundled Tomcat, run java -cp . installxythos from <xythos home>/wfs-x.y.zz and choose the option to update the bundled Servlet container. This will configure the bundled Tomcat, which is located in <xythos home>/server-x.y.zz.
  - If you are deploying using an EAR file, make sure to rebuild the EAR from server 2 after the xythos.properties file has been edited to have the correct ServerIPAddress. This could also be changed afterwards by editing the properties file in the deployment directory for your application server.
- Make sure the /xythosremoteadmin web app is installed.

5 Configure Content Repository
It is important to carefully consider what the directory structure should look like, and what the inheritable properties should be, before the system is heavily used. Inheritable properties are applied to new files/directories added (uploaded or copied) to a directory.

Directory Structure
When designing the directory structure, consider the following:
- Only Xythos Administrators can create Top Level Directories.
- Only top level directories can be moved between document stores (to scale across multiple DBs).
- Inheritable properties for a given directory are applied to new objects (copied, moved) at the time they are added to the directory.
- Will there be different content administrators responsible for different directories in the system?
If so, you may want to configure the permissions so that content administrators have full control of certain areas of the tree.
- Quotas apply to a directory and all its children, so shared areas should have a very high, or unlimited, quota.
- In order for users to easily navigate to a directory, they must have read access on the directory in question as well as all of its parents.
- Trash cans are defined per directory. Not every directory is going to have a trash, so when a file is deleted, it will be placed in the nearest trash can that is defined, working back up the directory tree. So if a file is deleted from /a/b/c/d, and there is only a trash defined for /a, it will be placed in /a/trash/somefile. Note that by default, when a directory (not a home dir) is created in the admin, there will not be a trash specified.
We recommend that all user home directories be separated from the areas with shared directories, although that is not required. Here is an example directory structure:
/Users/ - Contains all user home directories. New User Template should be configured to place home dirs here.
/Company/ - Root for company shared information. Permissions should be set so that all Authenticated Users have read access. You may want to enable inherit read for all Authenticated Users too. A trash can, /Company/trash, should be configured for this and all other top level directories. Someone will need to be responsible for cleaning out the trash.
/Company/Sales - Contains shared information for the sales group. Everyone in the company (all Authenticated Users) has read access, but not write, delete, or administer. The sales group has all permissions, including all inheritable permissions.
/Company/Sales/Proposals - Since this directory contains sensitive information, it should have logging and versioning on, and these settings should be applied to its children.
Shared data should be placed in directories which users can easily navigate to from the root (/). The root should be configured as viewable in the XDM. The only time to disable viewing the root is when there are thousands of top level directories (WFS 4.2 and earlier versions created many top level home directories), because clicking on the root in this case would cause a performance problem as the WFS figures out permissions.

Document Classifications
Document classifications and properties can also be inherited from parent directories, so they should be in place before content is added. You can assign one or more document classes to a directory such that when a file is uploaded to that directory, the document can be classified by choosing a class from the list of document classes assigned to the directory. If only one class is listed for a directory, that class is automatically used for classification on every new file uploaded to the directory. The Xythos server also supports mandatory properties, which are required to have a value. A default value should be configured for these properties; otherwise WebDAV uploads will not work because they cannot set the property that needs a value.

6 Data Import
There are four ways to add content to the WFS: the XDM interface, WebDAV, the file import tool, and the API. The file import tool is the most effective way to import large amounts of data.
It can import any directory structure that is mounted/shared to the application server into a Xythos target directory. The tool doesn't capture permissions, ownership, or other properties, so configuring the inheritable permissions and default document classifications on the target directory before the data import is essential.

6.1 Custom Migration Tools
If there is a need to retain metadata and permissions on the imported data, beyond what is supported by inheriting from the target directories, then custom migration tools will need to be developed.

6.2 Importing Home Directories
If home directories are imported, ownership and permissions must be set afterwards with the Admin, bulkloader, or API. Because of this, we recommend that shared and common content be imported via the import tool (or WebDAV), but that users add content to their own space. In cases where home directories must be imported en masse, one recommended approach is to fix the ownership and permissions with a customization to the WFS login process. The home directories are then configured for each user as they log in to the system for the first time. With this customization, the home dir import works like this:
- Mount the parent of all the home directories to the app server.
- Create a /Users directory in the WFS. Don't set up any permissions.
- Import the parent directory of all the home dirs into the /Users dir. This process will add all the existing home directory content to the WFS but without any owners or permissions.
- Configure a custom session manager to apply the correct permissions and ownership to the imported home directories as users log in. This custom session manager is Java code that will need to be developed beforehand (contact our Pro Services group for assistance); it changes the home directory creation logic during login to look for the imported home directory and configure that as the logging-in user's home dir.
This approach assumes that there is a way to map each user to their imported home directory; most often the directory is named after the username.
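The username-to-directory mapping that this approach depends on can be illustrated with a few lines of Python. This sketch shows only the lookup logic under the common assumption that each imported directory is named after the username. It is not the custom session manager itself, which is Java code written against the Xythos API and is not shown here; the function name and the example directory list are hypothetical.

```python
import posixpath
from typing import Iterable, Optional


def imported_home_for(username: str,
                      imported_dirs: Iterable[str],
                      parent: str = "/Users") -> Optional[str]:
    """Return the imported home directory for a user, or None if there is none.

    Assumes the imported directory is named exactly after the username.
    """
    candidate = posixpath.join(parent, username)
    return candidate if candidate in set(imported_dirs) else None


if __name__ == "__main__":
    imported = ["/Users/jdoe", "/Users/asmith"]
    print(imported_home_for("jdoe", imported))
    print(imported_home_for("newhire", imported))
```

A user with no imported directory would fall through to the normal New User Template logic, while a match is the point at which ownership and permissions would be applied.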
7 Maintenance and Support
Care and feeding of the Xythos server consists of:
- Performing periodic upgrades.
- Maintaining the web, app, DB, storage and LDAP servers.
- Managing a backup process for all the data Xythos persists, including: global schema, document store schema(s), files on external storage, and Xythos search engine index files.
- Recovering deleted files from backup.
- Helping users understand the feature set, directories, classifications and workflows.
- Creating and maintaining directories, classifications and workflows.

7.1 Changing File Storage
There may come a time when you wish to change the storage device used to store Xythos file content. Many systems start out in a testing configuration where file content is stored on the DB server; the content must then be migrated to a SAN or NAS. The full path to a file stored on disk is based on the external storage location (Xythos Admin -> Server Admin -> Storage -> External Storage Locations) and the relative path, which is date-based. For example, a Xythos file /test/foo.doc may be stored at /xythosmount/2005/2005-05/2005-05-25/1501/2080.1. In this case /xythosmount is the external storage location and everything else is the relative path to the file on that storage location. If you need to switch storage devices, you would migrate (move or copy) all of the content on the current storage location to the new storage device, then change the external storage location path in the admin. So in this example you would move all the content from /xythosmount to /newxythosmount, then change the external storage location to /newxythosmount (Xythos Admin -> Server Admin -> Storage -> External Storage Locations, then change the value and click on modify).
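The path relationship described above can be expressed in a short Python sketch, which may help when verifying a migration. It only models the rule that the full path is the external storage location joined with the relative path, so a file keeps its relative path when the location changes. The function names are hypothetical, and the sketch does not move any files or touch the Xythos configuration.

```python
import posixpath


def physical_path(storage_location: str, relative_path: str) -> str:
    """Join an external storage location with a file's date-based relative path."""
    return posixpath.join(storage_location, relative_path)


def relocate(path: str, old_location: str, new_location: str) -> str:
    """Map a physical path under the old storage location to the new location."""
    relative = posixpath.relpath(path, old_location)
    if relative.startswith(".."):
        raise ValueError(f"{path} is not under {old_location}")
    return posixpath.join(new_location, relative)


if __name__ == "__main__":
    old = physical_path("/xythosmount", "2005/2005-05/2005-05-25/1501/2080.1")
    print(old)
    print(relocate(old, "/xythosmount", "/newxythosmount"))
```

After copying content to the new device, comparing the relocated paths against what is actually on disk is one way to confirm nothing was missed before changing the external storage location in the admin.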